Re: Intermittent "Can't connect to MySQL server on '' (4)" (2003)" after 20+ days uptime

Sreekanth CHAVA Mon, 18 Aug 2008 12:22:31 -0700

Hi Pieter,

I have a suggestion, though it might not be very helpful.

Try reconfiguring the connections between the client and the MySQL
server where the problem exists, and then keep an eye on the uptime
and logs of the server.
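
To make the outage windows easier to line up against the server logs, a
small probe run from one of the web boxes could record every failed
connection attempt with a timestamp. The following is only a rough
sketch: the host name you pass in, the log file name, the 10-second
interval, and the helper names (probe, format_line, run) are all
assumptions for illustration, and it only checks that the TCP port
accepts a connection, not that MySQL authenticates or answers queries.

```python
import socket
import time
from datetime import datetime, timezone


def probe(host, port=3306, timeout=5.0):
    """Attempt one TCP connect; return (ok, elapsed_seconds, error_text)."""
    start = time.monotonic()
    try:
        # Only tests that the port accepts a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)


def format_line(ok, elapsed, error, now=None):
    """Build one log line with a UTC timestamp for later correlation."""
    now = now or datetime.now(timezone.utc)
    status = "OK" if ok else "FAIL"
    return f"{now.isoformat()} {status} {elapsed:.3f}s {error}".rstrip()


def run(host, port=3306, interval=10.0, logfile="mysql_probe.log",
        max_probes=None):
    """Probe repeatedly and append results to logfile (hypothetical name)."""
    count = 0
    while max_probes is None or count < max_probes:
        ok, elapsed, error = probe(host, port)
        with open(logfile, "a") as fh:
            fh.write(format_line(ok, elapsed, error) + "\n")
        count += 1
        if max_probes is None or count < max_probes:
            time.sleep(interval)
```

Calling run() with the master's host name and leaving it running until
the next outage should give a list of FAIL lines whose timestamps can
be compared with the MySQL error log and the system logs on the master.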

CHAVA

On Mon, Aug 18, 2008 at 12:00 PM, Pieter de Zwart <[EMAIL PROTECTED]> wrote:

> Greetings to all,
>
> I am having a weird issue with MySQL that I can't solve. We are getting
> intermittent client connection errors (code 2003) to the database server,
> lasting about 10 minutes at a time, seemingly at random and after 20+ days
> of uptime. Unfortunately, I have not been able to correlate these
> connection problems with any other queries, jobs, etc., so I was hoping
> someone here might be able to help me out.
>
> The problem is as follows. Seemingly at random, the master suddenly stops
> accepting connections, and the clients return connection error 2003,
> indicating the master did not respond in a timely manner. This goes on for
> about 10 minutes, at which point the master starts accepting connections
> again, without any human input. This happened at 4am on Sunday morning for
> example, so it healed itself before I could get myself out of bed and
> comprehend the situation, let alone connect somewhere and try and fix it.
> We have seen this happen about 4 or 5 times a week for the last 2 weeks,
> and there seems to be no pattern as to the time or date. Sometimes it
> happens twice in one day, and then disappears for 4 days. There was no
> spike in activity as far as we can tell, and the CPU and network usage
> were stable at about 2% and 4% of capacity respectively. Also, we have the
> slow query log turned on and set to 1 second, and there are no queries
> logged anywhere near the gaps in connection.
>
> We are running MySQL 5.0.44 on a single master on its own hardware, with a
> replication slave on a different machine. We have a write-through memcached
> setup in front of MySQL, which handles the majority of the requests, so
> MySQL is seeing about 20 to 30 ops (selects, inserts, updates) per second
> on average. All of this is running on dedicated Amazon EC2 instances (we
> are running the 64-bit Large Instance, which is supposed to be a dedicated
> virtual box with 2 CPUs, 2 cores apiece, and 8G of RAM, with 1.5/2G free).
> We then have two other machines that run the front-end web servers
> running PHP 5.1.6 and load balancers, which connect to the database when
> the cache doesn't have the required information. I did not post this to
> the PHP section since it seems like a more general issue with the server
> as opposed to the clients.
>
> After the second time it happened, we switched out our AWS hardware in
> hopes that it was a hardware fluke, but to no avail. The problem reared its
> ugly head 3 days later. We doubt it is the internal Amazon network, since
> the external monitoring of the box continues to work and spit out
> information, and no other box is showing similar connection symptoms. Also,
> all of our boxes are in the same Amazon Zone, which implies that they are
> in the same colo. This makes me think that a combination of our
> configuration and queries is causing the trouble.
>
> I checked the archives, but it seems that the people who encountered this
> error saw it during setup/configuration, and not randomly after 30 days of
> uptime. I doubt anyone has the answer, so I was hoping someone could help
> me understand the best way to debug this problem in order to find the
> reason for these random outages.
>
> Thanks in advance for any and all help!
>
> Pieter de Zwart
>



-- 
Sreekanth CHAVA
