{"id":25719,"date":"2021-06-23T07:29:00","date_gmt":"2021-06-23T00:29:00","guid":{"rendered":"https:\/\/tom.ji42.com\/?p=25719"},"modified":"2021-06-23T07:29:00","modified_gmt":"2021-06-23T00:29:00","slug":"mysql-got-an-error-reading-communication-packet-errors","status":"publish","type":"post","link":"https:\/\/tom.tomwork.net\/?p=25719","title":{"rendered":"MySQL &#8220;Got an Error Reading Communication Packet&#8221; Errors"},"content":{"rendered":"<p>In Percona\u2019s managed services, we often receive customer questions on communication failure errors\u2014where customers are faced with intermittent \u201cGot an error reading communication packets\u201d messages. I thought this topic deserved blog post so we can discuss possible reasons for this error, and how to remedy this problem. I hope this will help readers on how to investigate and resolve this problem.<!--more--><\/p>\n<p>First of all, whenever a communication error occurs it increments the status counter for either\u00a0<a href=\"http:\/\/dev.mysql.com\/doc\/refman\/5.6\/en\/server-status-variables.html#statvar_Aborted_clients\" target=\"_blank\" rel=\"nofollow noopener\">Aborted_clients<\/a>\u00a0or\u00a0<a href=\"http:\/\/dev.mysql.com\/doc\/refman\/5.6\/en\/server-status-variables.html#statvar_Aborted_connects\" target=\"_blank\" rel=\"nofollow noopener\">Aborted_connects<\/a>, which describe the\u00a0number of connections that were aborted because the client died without closing the connection properly and the number of failed attempts to connect to MySQL server (respectively). The possible reasons for both errors are numerous (see the\u00a0<strong>Aborted_clients increments<\/strong>\u00a0or\u00a0<strong>Aborted_connects increments<\/strong>\u00a0sections in the MySQL\u00a0<a href=\"http:\/\/dev.mysql.com\/doc\/refman\/5.6\/en\/communication-errors.html\" target=\"_blank\" rel=\"nofollow noopener\">manual<\/a>).<\/p>\n<p>In the case of\u00a0<a href=\"http:\/\/dev.mysql.com\/doc\/refman\/5.6\/en\/server-system-variables.html#sysvar_log_warnings\" target=\"_blank\" rel=\"nofollow noopener\">log_warnings<\/a>&gt;1, MySQL also writes this information to the error log (shown below):<\/p>\n<div>\n<div class=\"CodeMirror cm-s-default\">\n<div class=\"CodeMirror-hscrollbar\">\n<div><\/div>\n<\/div>\n<div class=\"CodeMirror-scroll\" tabindex=\"-1\">\n<div class=\"CodeMirror-sizer\">\n<div>\n<div class=\"CodeMirror-lines\" role=\"presentation\">\n<div role=\"presentation\">\n<div class=\"CodeMirror-measure\"><\/div>\n<div class=\"CodeMirror-measure\"><\/div>\n<div><\/div>\n<div class=\"CodeMirror-cursors\"><\/div>\n<div class=\"CodeMirror-code\" tabindex=\"0\" role=\"presentation\" contenteditable=\"false\" spellcheck=\"false\">\n<div>\n<div class=\"CodeMirror-gutter-wrapper\" contenteditable=\"false\">\n<div class=\"CodeMirror-linenumber CodeMirror-gutter-elt\">1<\/div>\n<\/div>\n<pre class=\" CodeMirror-line \" role=\"presentation\"><span role=\"presentation\">[<span class=\"cm-variable\">Warning<\/span>] <span class=\"cm-variable\">Aborted<\/span> <span class=\"cm-variable\">connection<\/span> <span class=\"cm-number\">305628<\/span> <span class=\"cm-variable\">to<\/span> <span class=\"cm-variable\">db<\/span>: <span class=\"cm-string\">'db'<\/span> <span class=\"cm-variable\">user<\/span>: <span class=\"cm-string\">'dbuser'<\/span> <span class=\"cm-variable\">host<\/span>: <span class=\"cm-string\">'hostname'<\/span> (<span class=\"cm-variable\">Got<\/span> <span class=\"cm-variable\">an<\/span> <span class=\"cm-variable\">error<\/span> <span class=\"cm-variable\">reading<\/span> <span class=\"cm-variable\">communication<\/span> <span class=\"cm-variable\">packets<\/span>)<\/span><\/pre>\n<\/div>\n<div>\n<div class=\"CodeMirror-gutter-wrapper\" contenteditable=\"false\">\n<div class=\"CodeMirror-linenumber CodeMirror-gutter-elt\">2<\/div>\n<\/div>\n<pre class=\" CodeMirror-line \" role=\"presentation\"><span role=\"presentation\">[<span class=\"cm-variable\">Warning<\/span>] <span class=\"cm-variable\">Aborted<\/span> <span class=\"cm-variable\">connection<\/span> <span class=\"cm-number\">305627<\/span> <span class=\"cm-variable\">to<\/span> <span class=\"cm-variable\">db<\/span>: <span class=\"cm-string\">'db'<\/span> <span class=\"cm-variable\">user<\/span>: <span class=\"cm-string\">'dbuser'<\/span> <span class=\"cm-variable\">host<\/span>: <span class=\"cm-string\">'hostname'<\/span> (<span class=\"cm-variable\">Got<\/span> <span class=\"cm-variable\">an<\/span> <span class=\"cm-variable\">error<\/span> <span class=\"cm-variable\">reading<\/span> <span class=\"cm-variable\">communication<\/span> <span class=\"cm-variable\">packets<\/span>)<\/span><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div><\/div>\n<div class=\"CodeMirror-gutters\">\n<div class=\"CodeMirror-gutter CodeMirror-linenumbers\"><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>In this case, MySQL increments the status counter for\u00a0<code>Aborted_clients<\/code>, which could mean:<\/p>\n<ul>\n<li>The client connected successfully but terminated improperly (and may relate to not closing the connection properly)<\/li>\n<li>The client slept for longer than the defined\u00a0<a href=\"http:\/\/dev.mysql.com\/doc\/refman\/5.6\/en\/server-system-variables.html#sysvar_wait_timeout\" target=\"_blank\" rel=\"nofollow noopener\">wait_timeout<\/a>\u00a0or\u00a0<a href=\"http:\/\/dev.mysql.com\/doc\/refman\/5.6\/en\/server-system-variables.html#sysvar_interactive_timeout\" target=\"_blank\" rel=\"nofollow noopener\">interactive_timeout<\/a>\u00a0seconds (which ends up causing the connection to sleep for wait_timeout seconds and then the connection gets forcibly closed by the MySQL server)<\/li>\n<li>The client terminated abnormally or exceeded the\u00a0<a href=\"https:\/\/dev.mysql.com\/doc\/refman\/5.6\/en\/server-system-variables.html#sysvar_max_allowed_packet\" target=\"_blank\" rel=\"nofollow noopener\">max_allowed_packet<\/a>\u00a0for queries<\/li>\n<\/ul>\n<p>The above is not an all-inclusive list.Now, how to identify what causing this problem and how to remedy this problem.<\/p>\n<h2>How Do We Identify What Caused This Problem, and How Do We Fix It?<\/h2>\n<p>To be honest, aborted connection errors are not easy to diagnose. But in my experience, it\u2019s related to network\/firewall issues most of the time. We usually investigate those issues with the help of Percona toolkit scripts, i.e.\u00a0<a href=\"https:\/\/www.percona.com\/doc\/percona-toolkit\/2.2\/pt-summary.html\" target=\"_blank\" rel=\"nofollow noopener\">pt-summary<\/a>\u00a0\/\u00a0<a href=\"https:\/\/www.percona.com\/doc\/percona-toolkit\/2.2\/pt-mysql-summary.html\" target=\"_blank\" rel=\"nofollow noopener\">pt-mysql-summary<\/a>\u00a0\/\u00a0<a href=\"https:\/\/www.percona.com\/doc\/percona-toolkit\/2.2\/pt-stalk.html\" target=\"_blank\" rel=\"nofollow noopener\">pt-stalk<\/a>. The outputs from those scripts can be very helpful.<\/p>\n<p>Some of the reasons can be:<\/p>\n<ul>\n<li>A high rate of connections sleeping inside MySQL for hundred of seconds is one of the symptoms that applications aren\u2019t closing connections after doing work, and instead relying on the\u00a0<code>wait_timeout<\/code>\u00a0to close them. I strongly recommend changing the application logic to properly close connections at the end of an operation.<\/li>\n<li>Check to make sure the value of\u00a0<code>max_allowed_packet<\/code>\u00a0is high enough, and that your clients are not receiving a \u201cpacket too large\u201d message. This situation aborts the connection without properly closing it.<\/li>\n<li>Another possibility is\u00a0<code>TIME_WAIT<\/code>. I\u2019ve noticed many\u00a0<code>TIME_WAIT<\/code>\u00a0notifications from the netstat, so I would recommend confirming the connections are well managed to close on the application side.<\/li>\n<li>Make sure the transactions are committed (begin and commit) properly, so that once the application is \u201cdone\u201d with the connection it is left in a clean state.<\/li>\n<li>You should ensure that client applications do not abort connections. For example, if PHP has option max_execution_time\u00a0set to 5 seconds, increasing\u00a0<a href=\"https:\/\/dev.mysql.com\/doc\/refman\/5.6\/en\/server-system-variables.html#sysvar_connect_timeout\" target=\"_blank\" rel=\"nofollow noopener\">connect_timeout<\/a>\u00a0would not help because PHP will kill the script. Other programming languages and environments can have similar safety options.<\/li>\n<li>Another cause for delay in connections is DNS problems. Check if you have\u00a0<a href=\"http:\/\/dev.mysql.com\/doc\/refman\/5.6\/en\/server-options.html#option_mysqld_skip-name-resolve\" target=\"_blank\" rel=\"nofollow noopener\">skip-name-resolve<\/a>\u00a0enabled, and if hosts are authenticated against their IP address instead their hostname.<\/li>\n<li>One way to find out where your application is misbehaving is to add some logging to your code that will save the application actions along with the MySQL connection ID. With that, you can correlate it to the connection number from the error lines. Enable the Audit log plugin, which logs connections and query activity, and check the\u00a0<a href=\"https:\/\/www.percona.com\/doc\/percona-server\/5.6\/management\/audit_log_plugin.html\" target=\"_blank\" rel=\"noopener\">Percona Audit Log Plugin<\/a>\u00a0as soon as you hit a connection abort error. You can check for the audit log to identify which query is the culprit. If you can\u2019t use the Audit plugin for some reason, you can consider using the MySQL general log \u2013 however, this can be risky on a loaded server. You should enable the\u00a0<a href=\"http:\/\/dev.mysql.com\/doc\/refman\/5.6\/en\/query-log.html\" target=\"_blank\" rel=\"nofollow noopener\">general log<\/a>\u00a0for at least a few minutes. While it puts a heavy burden on the server, the errors tend to happen fairly often, so you should be able to collect the needed data before the log grows too large. I recommend enabling the general log with an -f tail, then disable the general log when you see the next warning in the log. Once you find the query from the aborted connection, identify which piece of your application issues that query and co-relate\u00a0the queries with portions of your application.<\/li>\n<li>Try increasing the\u00a0<a href=\"https:\/\/dev.mysql.com\/doc\/refman\/5.6\/en\/server-system-variables.html#sysvar_net_read_timeout\" target=\"_blank\" rel=\"nofollow noopener\">net_read_timeout<\/a>\u00a0and\u00a0<a href=\"https:\/\/dev.mysql.com\/doc\/refman\/5.6\/en\/server-system-variables.html#sysvar_net_write_timeout\" target=\"_blank\" rel=\"nofollow noopener\">net_write_timeout<\/a>\u00a0values for MySQL and see if that reduces the number of errors.\u00a0<code>net_read_timeout<\/code>\u00a0is rarely the problem unless you have an\u00a0extremely poor network. Try tweaking those values, however, because in most cases a query is generated and sent as a\u00a0single packet to the server, and applications can\u2019t switch to doing something else while leaving the server with a\u00a0partially received query. There is a very detailed\u00a0<a href=\"https:\/\/www.percona.com\/blog\/2007\/07\/08\/mysql-net_write_timeout-vs-wait_timeout-and-protocol-notes\/\" target=\"_blank\" rel=\"nofollow noopener\">blog post<\/a>\u00a0on this topic from our CEO, Peter Zaitsev.<\/li>\n<\/ul>\n<p>Aborted connections happen because a connection was not closed properly. The server can\u2019t cause aborted connections unless there is a networking problem\u00a0between the server and the client (like the\u00a0server is half duplex, and the client is full duplex)\u2014but that is the network causing the problem, not the server. In any case, such problems should show up as errors on the networking interface. To be extra sure, check the\u00a0<code>ifconfig -a<\/code>\u00a0output on the MySQL server to check if there are errors.<\/p>\n<p>Another way to troubleshoot this problem is via\u00a0<code>tcpdump<\/code>. You can refer to this\u00a0<a href=\"https:\/\/www.percona.com\/blog\/2008\/08\/23\/how-to-track-down-the-source-of-aborted_connects\/\" target=\"_blank\" rel=\"nofollow noopener\">blog post<\/a>\u00a0on how to track down the\u00a0source of aborted connections. Look\u00a0for potential network issues, timeouts and resource issues with MySQL.<\/p>\n<p>I found this\u00a0<a href=\"https:\/\/www.percona.com\/blog\/2011\/04\/18\/how-to-use-tcpdump-on-very-busy-hosts\/\" target=\"_blank\" rel=\"nofollow noopener\">blog post<\/a>\u00a0useful in explaining how to use\u00a0<code>tcpdump<\/code>\u00a0on busy hosts. It provides help for tracking down the TCP exchange sequence that led to the aborted connection, which can help you figure out why the connection broke.<\/p>\n<p>For network issues, use a ping to calculate the round trip time (RTT) between machine where mysqld is located and the machine from where the application makes requests. Send a large file (1GB or more) to and from client and server machines, watch the\u00a0process\u00a0using\u00a0<code>tcpdump<\/code>, then check if an error occurred during transfer. Repeat this test few times. I also found this from my colleague Marco Tusa useful:\u00a0<a href=\"http:\/\/www.tusacentral.net\/joomla\/index.php\/mysql-blogs\/164-effective-way-to-check-the-network-connection-when-in-need-of-a-geographic-distribution-replication-.html\" target=\"_blank\" rel=\"nofollow noopener\">Effective way to check network connection<\/a>.<\/p>\n<p>One other idea I can think of is to capture the\u00a0<code>netstat -s<\/code>\u00a0output along with a timestamp after every N seconds (e.g., 10 seconds so you can relate\u00a0\u00a0<code>netstat -s<\/code>\u00a0output of BEFORE and AFTER an aborted connection error from the MySQL error log). With the aborted connection error timestamp, you can co-relate it with the\u00a0<code>netstat<\/code>\u00a0sample captured as per a timestamp of\u00a0<code>netstat<\/code>, and watch which error counters increased under the TcpExt section of\u00a0<code>netstat -s<\/code>.<\/p>\n<p>Along with that, you should also check the network infrastructure sitting between the client and the server for proxies, load balancers, and\u00a0firewalls that could be causing a problem.<\/p>\n<h2><strong>Conclusion<\/strong><\/h2>\n<p>I\u2019ve tried to cover communication failure errors, and how to identify and fix the possible aborted connections. Take into account,\u00a0faulty ethernets, hubs, switches, cables, and so forth can cause this issue as well. You must replace the hardware itself to properly diagnose these issues.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In Percona\u2019s managed services, we often receive customer questions on communication failure errors\u2014where customers are faced with intermittent \u201cGot an error reading communication packets\u201d messages. I thought this topic deserved blog post so we can discuss possible reasons for this error, and how to remedy this problem. I hope this will help readers on how [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[13],"tags":[],"class_list":["post-25719","post","type-post","status-publish","format-standard","hentry","category-13"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p6cOVM-6GP","_links":{"self":[{"href":"https:\/\/tom.tomwork.net\/index.php?rest_route=\/wp\/v2\/posts\/25719","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tom.tomwork.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tom.tomwork.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tom.tomwork.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/tom.tomwork.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=25719"}],"version-history":[{"count":1,"href":"https:\/\/tom.tomwork.net\/index.php?rest_route=\/wp\/v2\/posts\/25719\/revisions"}],"predecessor-version":[{"id":25720,"href":"https:\/\/tom.tomwork.net\/index.php?rest_route=\/wp\/v2\/posts\/25719\/revisions\/25720"}],"wp:attachment":[{"href":"https:\/\/tom.tomwork.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=25719"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tom.tomwork.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=25719"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tom.tomwork.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=25719"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}