Regardless of who implemented the network and what the provided monitoring tools report, this has all the look and feel of an intermittent network issue. I would run an independent network scan (maybe nmap?) from one of the affected clients against the affected host, and I bet you will find that the same fluctuations occur on other ports.
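If nmap is not handy on the client boxes, a small probe loop gives roughly the same information. The following is only a sketch: it assumes plain TCP connect attempts are a fair stand-in for what the PHP clients do, and the function names (probe, watch), the timeout, and the interval are all made up for illustration. It tries each port in turn and logs a timestamp, the connect time, and any error, so an outage window on 3306 can be compared against other ports on the same host.

```python
import socket
import time
from datetime import datetime, timezone


def probe(host, port, timeout=3.0):
    """Attempt one TCP connect; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False, time.monotonic() - start, str(exc)


def watch(host, ports, interval=5.0, rounds=None, timeout=3.0):
    """Probe every port once per round and print one log line per attempt.

    rounds=None means run until interrupted (hypothetical default).
    """
    done = 0
    while rounds is None or done < rounds:
        for port in ports:
            ok, elapsed, err = probe(host, port, timeout)
            stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
            status = "ok" if ok else "FAIL " + err
            print(f"{stamp} {host}:{port} {elapsed * 1000:.1f}ms {status}")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Run from one of the affected web servers, something like watch("your-db-host", [3306, 22]) would probe the MySQL port and ssh every five seconds (the host name is a placeholder, and 22 is just an example of a second open port). If both ports fail during the same ten-minute window, the network or the instance is the likelier suspect; if only 3306 fails, look harder at mysqld itself.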
On Mon, Aug 18, 2008 at 3:22 PM, Sreekanth CHAVA <[EMAIL PROTECTED]> wrote:
> Hi Pieter
>
> I have a suggestion. This might not be very helpful.
>
> Try to reconfigure the connections between the client and the MySQL
> server where the problem exists, and then watch the uptime and logs
> of the server.
>
> CHAVA
>
> On Mon, Aug 18, 2008 at 12:00 PM, Pieter de Zwart <[EMAIL PROTECTED]> wrote:
>
>> Greetings to all,
>>
>> I am having a weird issue with MySQL that I can't solve. We are getting
>> intermittent client connection errors (code 2003) to the database server
>> for 10 minutes, seemingly at random, and after 20+ days of uptime.
>> Unfortunately, I have not been able to correlate these connection
>> problems with any other queries, jobs, etc., so I was hoping someone
>> here might be able to help me out.
>>
>> The problem is as follows. Seemingly at random, the master suddenly
>> stops accepting connections, and the clients return connection error
>> 2003, indicating the master did not respond in a timely manner. This
>> goes on for about 10 minutes, at which point the master starts accepting
>> connections again, without any human input. This happened at 4am on
>> Sunday morning, for example, so it healed itself before I could get
>> myself out of bed and comprehend the situation, let alone connect
>> somewhere and try to fix it. We have been seeing this happen about 4 or
>> 5 times a week for the last 2 weeks, and there seems to be no pattern as
>> to the time or date. Sometimes it happens twice in one day, and then
>> disappears for 4 days. There was no spike in activity as far as we can
>> tell, and the CPU and network usage were stable at about 2% and 4% of
>> capacity respectively. Also, we have the slow query log turned on and
>> set to 1 sec, and there are no queries logged anywhere near the gaps in
>> connection.
>>
>> We are running MySQL 5.0.44 on a single master on its own hardware, with
>> a replication slave on a different machine. We have a write-through
>> memcached setup in front of MySQL, which handles the majority of the
>> requests, so MySQL is seeing about 20 to 30 ops (selects, inserts,
>> updates) per second on average. All of this is running on Amazon EC2
>> instances, and we have dedicated boxes (we are running the 64-bit Large
>> Instance, which is supposed to be a dedicated virtual box with 2 CPUs,
>> 2 cores apiece, and 8G of RAM, with 1.5/2G free). We then have two other
>> machines that run the front-end web servers running PHP 5.1.6 and load
>> balancers, which connect to the database when the cache doesn't have the
>> required information. I did not post this to the PHP section since it
>> seems like a more general issue with the server as opposed to the
>> clients.
>>
>> After the second time it happened, we switched out our AWS hardware in
>> hopes that it was a hardware fluke, but to no avail. The problem reared
>> its ugly head 3 days later. We doubt it is the internal Amazon network,
>> since the external monitoring of the box continues to work and spit out
>> information, and no other box is showing similar connection symptoms.
>> Also, all of our boxes are in the same Amazon Zone, which implies that
>> they are in the same colo. This makes me think that a combination of our
>> configuration and queries is causing the trouble.
>>
>> I checked the archives, but it seems that the people who encountered
>> this error saw it during setup/configuration, and not randomly after 30
>> days of uptime. I doubt anyone has the answer, so I was hoping someone
>> could help me understand the best way to debug this problem in order to
>> find the reason for these random outages.
>>
>> Thanks in advance for any and all help!
>>
>> Pieter de Zwart
>>
>
> --
> Sreekanth CHAVA
>

--
 - michael dykman
 - [EMAIL PROTECTED]

 - All models are wrong. Some models are useful.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]