At first, I thought it was a low-memory problem. I reduced my buffers in my.cnf and ran the test again. The master still hung, and this is a screenshot of the "top" command just before it died:
http://choonkeng.hopto.org/temp/replication-hang.gif

There is still plenty of free memory and no disk swapping when the master goes to 99% CPU. The error and slow logs don't show any useful information.

I am totally clueless now, please share your opinion. Thanks.

<< Original text follows >>

Hello everyone,

The master is moderately busy, with a load average of 2-3. It worked very well for a long time before I started replication (a few months). When replication is started, both master and slave appear to work fine, with no significant increase in load on either side. But after a few hours (the interval is random), the master mysqld process starts to use 99% CPU and the load goes to 3, 4, 6, 15, 24, 56, 312... in a matter of seconds! The master then stops responding and needs a reboot.

My replication setup is as follows:

    Master                        Slave
    DB A  <==================>    DB A
    DB B

The master is set up with binlog-do-db=A and the slave with replicate-do-db=A. Queries to A are mostly SELECT, while queries to B are mostly INSERT, UPDATE & DELETE.

The queries that cause 99% CPU, as stated in slow.log, are not really CPU intensive (slow.log indicates these queries took HOURS to run, where they would normally finish in 1 or 2 seconds). These are queries we run every minute and every second, and mysqld normally handles them with no problem. Why does mysqld suddenly eat all the CPU?

Can anyone shed some light? Thanks in advance.

Regards,
CK

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]