Hi, I'm working for a client that has an auction application using MySQL. We have a program to simulate the load when N users have an auction page open. The auction page pings the server for updates every few seconds, and that ping results in two UPDATEs and 64 SELECTs.
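As a rough sense of scale, the per-ping statement count above can be turned into an average statement rate. This is only a back-of-the-envelope sketch: the 3-second ping interval and the 100-user count are assumed values for illustration (the page pings "every few seconds"), and `statements_per_second` is a hypothetical helper, not part of the test program.

```python
def statements_per_second(users, statements_per_ping, ping_interval_s):
    """Average statement rate if every user pings once per interval."""
    return users * statements_per_ping / ping_interval_s


# Two UPDATEs and 64 SELECTs per ping, as described above.
STATEMENTS_PER_PING = 2 + 64

# Assumed values: 100 users, one ping every 3 seconds.
rate = statements_per_second(100, STATEMENTS_PER_PING, 3.0)
print(rate)  # 2200.0
```

The real rate depends on the actual ping interval and on how evenly the pings are spread out, so this is an average, not a peak.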
I have cut the statements out into a separate program so that they can be run on the machine (through a Perl script) with no other processes (e.g. the webserver) using resources. I'm trying to figure out the maximum number of users we can have on at once under those conditions. I have a script that forks off N children, each of which then sits in a loop running that set of queries every few seconds.

If I set N to about 160 or so, the machine seems to handle everything fine (I have a two-second delay between forks, so the load comes up gradually). However, at some point, sometimes, the machine totally locks up. Up to that point the top processes in top are the forked instances of the Perl script running the test. Then, suddenly, the mysql daemon occupies all the top spots and the CPUs are all at 100%. I have had it happen with N=160 after all 160 processes had been forked and had been running for a few minutes. Does this suggest anything to anyone?

I have tried everything in the manual regarding tuning the server parameters, and nothing seems to change the failure significantly. The only thing that differs from the defaults in my.cnf right now is max_connections. I know that we are only using 10-11 percent of the machine's memory and that the key cache efficiency is 100%. I don't think any individual query is too laborious (I have set up indexes and used EXPLAIN to make sure they are being used), and I even eliminated the UPDATE statements to make sure it wasn't just a line forming to get write locks. The same "signature" of failure happened.

I have sar -A output, graphs of the mysql SHOW STATUS variables over time, and the output of SHOW VARIABLES at http://www.fulcrum.org/gsg/with_sar/db01p/testdata/3.20.21.34/

The sar output during the transition from working to frozen is in http://www.fulcrum.org/gsg/with_sar/db01p/testdata/3.20.21.34/sar6.out where you can see that the transition happens between 09:41:35 and 09:41:40.
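For anyone who wants to reproduce the setup, the load generator described above has roughly the following shape. This is a minimal sketch in Python, not the actual Perl script: `run_ping` is a hypothetical placeholder standing in for the two UPDATEs and 64 SELECTs, the other function names and parameters are assumptions for illustration, and it stops after a fixed number of pings so that it terminates.

```python
import os
import time


def run_ping(child_id):
    """Hypothetical placeholder for one page ping (2 UPDATEs + 64 SELECTs)."""
    # A real test would send the statements to the MySQL server here.
    pass


def child_loop(child_id, ping_interval, pings):
    """Run `pings` pings, sleeping `ping_interval` seconds after each one."""
    for _ in range(pings):
        run_ping(child_id)
        time.sleep(ping_interval)


def run_load(n_children, fork_delay, ping_interval, pings):
    """Fork children one at a time, wait for all, return their exit codes."""
    pids = []
    for child_id in range(n_children):
        pid = os.fork()
        if pid == 0:
            # Child process: do the work, then exit without returning.
            status = 0
            try:
                child_loop(child_id, ping_interval, pings)
            except Exception:
                status = 1
            finally:
                os._exit(status)
        pids.append(pid)
        time.sleep(fork_delay)  # pause between forks so load ramps up gradually

    exit_codes = []
    for pid in pids:
        _, wait_status = os.waitpid(pid, 0)
        if os.WIFEXITED(wait_status):
            exit_codes.append(os.WEXITSTATUS(wait_status))
        else:
            exit_codes.append(-1)  # child was killed by a signal
    return exit_codes
```

With these assumed names, the test described above would correspond to something like `run_load(160, 2.0, 3.0, pings)`, where the 3-second ping interval is a guess at "every few seconds".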
I have one observation that might not be related, but I am pretty sure I have seen it in every test that tanked: a one-time drop in Open_files: http://www.fulcrum.org/gsg/with_sar/db01p/testdata/3.20.21.34/graphs/mysql_single/Open_files.png

What would cause that? I assume that mysql is freeing up resources for something, but what?

Thanks for anything you can tell me. The SHOW VARIABLES output is at http://www.fulcrum.org/gsg/with_sar/db01p/testdata/3.20.21.34/show_variables

mike south