
The difference between SESSION wait_timeout and GLOBAL wait_timeout

Preface

Recently, while developing a service, I ran into an issue with go-sql-driver/mysql. The driver logged the error “packets.go:122: closing bad idle connection: EOF”.

Issue description

First of all, this is not a bug in the driver: the error means that the ORM framework tried to reuse a MySQL connection that had already been closed on the server side, and the reuse failed.

Issue reproduction

I am not a MySQL expert, but to investigate I connected to AWS Aurora using the interactive mysql client.

To see the connections opened by the service, I ran “SHOW PROCESSLIST”.

mysql> SHOW PROCESSLIST;
+------+------------------------+--------------------+--------------+---------+-------+------------------------+------------------+
| Id   | User                   | Host               | db           | Command | Time  | State                  | Info             |
+------+------------------------+--------------------+--------------+---------+-------+------------------------+------------------+
|    5 | event_scheduler        | localhost          | NULL         | Daemon  | 27593 | Waiting on empty queue | NULL             |
|   11 | rdsadmin               | localhost          | NULL         | Sleep   |     1 |                        | NULL             |
|   12 | rdsadmin               | localhost          | NULL         | Sleep   |     1 |                        | NULL             |
|   14 | servic                 | 172.36.2.100:35138 | service      | Sleep   |   299 |                        | NULL             |
|   21 | rdsadmin               | localhost          | NULL         | Sleep   |   200 |                        | NULL             |
|   24 | rdsadmin               | localhost          | NULL         | Sleep   |     2 |                        | NULL             |
| 2378 | bastion                | 172.38.4.50:46910  | service      | Query   |     0 | init                   | SHOW PROCESSLIST |
+------+------------------------+--------------------+--------------+---------+-------+------------------------+------------------+
7 rows in set (0.00 sec)

Shortly afterwards, the service’s TCP connection to MySQL was dropped, so I resent my request to the service.

As I expected, the issue reoccurred, and the service’s connection (Id 14, which had been idle for 299 seconds above) no longer appeared in the process list.

mysql> SHOW PROCESSLIST;
+------+------------------------+--------------------+--------------+---------+-------+------------------------+------------------+
| Id   | User                   | Host               | db           | Command | Time  | State                  | Info             |
+------+------------------------+--------------------+--------------+---------+-------+------------------------+------------------+
|    5 | event_scheduler        | localhost          | NULL         | Daemon  | 27595 | Waiting on empty queue | NULL             |
|   11 | rdsadmin               | localhost          | NULL         | Sleep   |     3 |                        | NULL             |
|   12 | rdsadmin               | localhost          | NULL         | Sleep   |     3 |                        | NULL             |
|   21 | rdsadmin               | localhost          | NULL         | Sleep   |   202 |                        | NULL             |
|   24 | rdsadmin               | localhost          | NULL         | Sleep   |     4 |                        | NULL             |
| 2378 | bastion                | 172.38.4.50:46910  | service      | Query   |     0 | init                   | SHOW PROCESSLIST |
+------+------------------------+--------------------+--------------+---------+-------+------------------------+------------------+
6 rows in set (0.00 sec)

Locating and resolving the problem

With the ORM framework’s default settings, idle TCP connections in the pool are not closed by the client.

Therefore, I suspected that MySQL itself was closing these idle connections.

mysql> SHOW VARIABLES LIKE '%timeout%';
+----------------------------------+----------+
| Variable_name                    | Value    |
+----------------------------------+----------+
| aurora_fwd_master_idle_timeout   | 60       |
| aurora_globaldb_rpo_wait_timeout | 60       |
| connect_timeout                  | 10       |
| delayed_insert_timeout           | 300      |
| have_statement_timeout           | YES      |
| innodb_flush_log_at_timeout      | 1        |
| innodb_lock_wait_timeout         | 50       |
| innodb_rollback_on_timeout       | OFF      |
| interactive_timeout              | 28800    |
| lock_wait_timeout                | 31536000 |
| net_read_timeout                 | 30       |
| net_write_timeout                | 60       |
| rpl_stop_slave_timeout           | 31536000 |
| slave_net_timeout                | 60       |
| wait_timeout                     | 28800    |
+----------------------------------+----------+
15 rows in set (0.01 sec)

Due to my lack of MySQL operations experience, I did not realize that the command above only outputs session-scoped variables, so everything looked fine to me.

So I suspected the issue was caused by AWS networking components and opened a ticket with AWS support.

Thankfully, with the help of the support engineer, we discovered that the global Aurora “wait_timeout” was set to 300 seconds.

For historical reasons, former colleagues had set the wait_timeout of the production Aurora cluster to 300 seconds.

However, the current infrastructure team created the new Aurora cluster through CloudFormation (CFn) based on that existing configuration, which carried the setting over and caused this problem.
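Independent of fixing the server-side parameter, the error can also be avoided on the client side by letting the connection pool recycle connections before the server’s wait_timeout expires. Below is a minimal sketch assuming the service uses Go’s database/sql with go-sql-driver/mysql; the DSN, host, and pool limits are placeholders, not the service’s real configuration.

package main

import (
	"database/sql"
	"time"

	_ "github.com/go-sql-driver/mysql" // registers the "mysql" driver
)

func openDB() (*sql.DB, error) {
	// Placeholder DSN; replace with the service's real credentials and host.
	db, err := sql.Open("mysql", "user:password@tcp(aurora-host:3306)/service")
	if err != nil {
		return nil, err
	}

	// The server closes idle connections after wait_timeout (300 seconds in
	// this incident), so recycle pooled connections well before that to avoid
	// handing out a connection the server has already dropped.
	db.SetConnMaxLifetime(4 * time.Minute)
	db.SetConnMaxIdleTime(1 * time.Minute) // requires Go 1.15+
	db.SetMaxIdleConns(5)

	return db, nil
}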

The rest of this article covers the main topic: what wait_timeout is and how its SESSION and GLOBAL values differ.

What is wait_timeout

There are two important timeout variables in MySQL:

  1. interactive_timeout
  2. wait_timeout

The “interactive_timeout” parameter applies to interactive connections, such as those made through the mysql CLI.

The “wait_timeout” parameter, on the other hand, applies to non-interactive connections, such as those opened by application drivers: it is the number of seconds the server waits for activity on an idle connection before closing it.

mysql> SHOW GLOBAL VARIABLES LIKE '%timeout%';
+----------------------------------+----------+
| Variable_name                    | Value    |
+----------------------------------+----------+
| aurora_fwd_master_idle_timeout   | 60       |
| aurora_globaldb_rpo_wait_timeout | 60       |
| connect_timeout                  | 10       |
| delayed_insert_timeout           | 300      |
| have_statement_timeout           | YES      |
| innodb_flush_log_at_timeout      | 1        |
| innodb_lock_wait_timeout         | 50       |
| innodb_rollback_on_timeout       | OFF      |
| interactive_timeout              | 28800    |
| lock_wait_timeout                | 31536000 |
| net_read_timeout                 | 30       |
| net_write_timeout                | 60       |
| rpl_stop_slave_timeout           | 31536000 |
| slave_net_timeout                | 60       |
| wait_timeout                     | 300      |
+----------------------------------+----------+
15 rows in set (0.01 sec)
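The effective value can also be checked from the service’s side: connections made by an application driver are non-interactive (go-sql-driver/mysql does not request the interactive client flag, to my knowledge), so their session wait_timeout is initialized from the GLOBAL wait_timeout of 300 seconds rather than from interactive_timeout. A small sketch, again with a placeholder DSN:

package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// Placeholder DSN; replace with the service's real connection string.
	db, err := sql.Open("mysql", "user:password@tcp(aurora-host:3306)/service")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	var sessionWait, globalWait int
	// The session value is what the server enforces for this connection;
	// for a non-interactive client it is initialized from the global value.
	err = db.QueryRow("SELECT @@SESSION.wait_timeout, @@GLOBAL.wait_timeout").
		Scan(&sessionWait, &globalWait)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("session wait_timeout=%d, global wait_timeout=%d\n", sessionWait, globalWait)
}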

The difference between SESSION and GLOBAL variables

Two things about SESSION and GLOBAL variables are important here.

When a new session starts, its SESSION wait_timeout is initialized from the GLOBAL wait_timeout for non-interactive clients, or from the GLOBAL interactive_timeout for interactive clients (as determined by the CLIENT_INTERACTIVE connect option).

This is why SHOW VARIABLES in my interactive mysql session reported wait_timeout = 28800 (inherited from the global interactive_timeout), while the service’s non-interactive connections were actually governed by the global wait_timeout of 300 seconds.

To change the interactive_timeout or wait_timeout that a session sees, there are two methods (a Go illustration of the scope difference follows this list):

  1. Use “SET SESSION interactive_timeout = 300” (or “SET SESSION wait_timeout = 300”), which only affects the current session.

  2. Use “SET GLOBAL interactive_timeout = 300” (or “SET GLOBAL wait_timeout = 300”), which affects sessions created afterwards; for new interactive connections, both the session interactive_timeout and the session wait_timeout are initialized from the GLOBAL interactive_timeout.
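For a pooled application, the scope difference is easy to observe: SET SESSION changes only the one connection it is executed on, so with database/sql it has to be pinned via db.Conn, while other pooled connections keep whatever value their sessions were initialized with. A sketch under the same placeholder DSN assumptions as above:

package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// Placeholder DSN, as in the earlier sketches.
	db, err := sql.Open("mysql", "user:password@tcp(aurora-host:3306)/service")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	ctx := context.Background()

	// Pin one pooled connection so both statements run in the same MySQL session.
	conn, err := db.Conn(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// SET SESSION changes only this one session.
	if _, err := conn.ExecContext(ctx, "SET SESSION wait_timeout = 300"); err != nil {
		log.Fatal(err)
	}

	var pinned, other int
	if err := conn.QueryRowContext(ctx, "SELECT @@SESSION.wait_timeout").Scan(&pinned); err != nil {
		log.Fatal(err)
	}
	// A different pooled connection keeps the value its session started with.
	if err := db.QueryRow("SELECT @@SESSION.wait_timeout").Scan(&other); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("pinned session: %d, another session: %d\n", pinned, other)
}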
