Hi,

I'm working for a client that has an auction application using MySQL.
We have a program to simulate the load when N users have an auction
page open.  The auction page pings the server for updates every 
few seconds, and that ping results in two UPDATEs and 64 SELECTs.

I have pulled those statements out into a separate perl script so that
they can be run on the machine with no other processes (e.g. the
webserver) competing for resources.  I'm trying to figure out the
maximum number of users we can have on at once under those conditions.

I have a script that forks off N children, each of which sits in a loop
running that set of queries every few seconds (a rough sketch of the
harness is below).  With N set to about 160 or so, the machine seems to
handle everything fine (I have a two-second delay between forks, so the
load comes up gradually).
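
For reference, the harness is shaped roughly like this.  This is a
simplified sketch, not the actual script; the DSN, credentials, table
names, and the two statements standing in for the real
2-UPDATE/64-SELECT set are all placeholders:

    #!/usr/bin/perl
    use strict;
    use DBI;

    my $n = shift || 160;             # number of simulated users
    my @pids;

    for my $i (1 .. $n) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # child: one simulated auction-page user, runs until killed
            my $dbh = DBI->connect("DBI:mysql:database=auction;host=db01p",
                                   "testuser", "testpass",
                                   { RaiseError => 1 });
            while (1) {
                run_ping($dbh);       # the 2 UPDATEs + 64 SELECTs go here
                sleep 5;              # "every few seconds"
            }
        }
        push @pids, $pid;
        sleep 2;                      # two-second delay between forks
    }
    waitpid($_, 0) for @pids;

    sub run_ping {
        my ($dbh) = @_;
        # placeholders for the real statements captured from the app
        $dbh->do("UPDATE user_session SET last_ping = NOW() WHERE id = $$");
        $dbh->selectall_arrayref(
            "SELECT * FROM auction_item WHERE auction_id = 1");
    }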

However, at some point, sometimes, the machine totally locks up.  For
most of the run the top processes in top are the forked instances of
the perl script running the test.  Then, suddenly, the mysql daemon
occupies all the top spots and the CPUs are all at 100%.

I have had it happen with N=160: all 160 processes fork and run for a
few minutes, and then the machine suddenly locks up.

Does this suggest anything to anyone?  I have tried everything in the
manual regarding tuning the server parameters, and nothing changes the
failure significantly.  The only setting I have changed from the default
in my.cnf right now is max_connections.
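
So the [mysqld] section is essentially stock; it amounts to something
like this (the number below is illustrative, not the exact value I'm
running with, just raised above the default of 100 so all the test
children can connect):

    [mysqld]
    set-variable = max_connections=500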

I know that we are only using 10-11 percent of the machine's memory,
and the key cache efficiency is 100%.  I don't think any individual
query is too laborious (I have set up indexes and used EXPLAIN to make
sure they are being used).  I even eliminated the UPDATE statements to
make sure it wasn't just a queue forming while connections waited for
write locks; the same "signature" of failure happened.
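
By "used EXPLAIN" I mean running each of the 64 SELECTs through EXPLAIN
and confirming that the type column shows an index lookup (const, ref,
range) rather than ALL.  The query here is a made-up example of roughly
the right shape, not one of the real statements:

    EXPLAIN SELECT current_bid, num_bids
            FROM   auction_item
            WHERE  item_id = 1234;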

I have sar -A output, graphs of the mysql SHOW STATUS variables over
time, and the output of SHOW VARIABLES at

http://www.fulcrum.org/gsg/with_sar/db01p/testdata/3.20.21.34/

The sar output during the transition from working to frozen is 
in 

http://www.fulcrum.org/gsg/with_sar/db01p/testdata/3.20.21.34/sar6.out

You can see that the transition happens between 09:41:35 and 09:41:40.

I have one observation that might not be related, but I am pretty sure
I have seen it in every test that tanked.  It's a one-time drop in
Open_files:

http://www.fulcrum.org/gsg/with_sar/db01p/testdata/3.20.21.34/graphs/mysql_single/Open_files.png

What would cause that?  I assume that mysql is freeing up resources for 
something, but what?

Thanks for anything you can tell me.  The SHOW VARIABLES output is at

http://www.fulcrum.org/gsg/with_sar/db01p/testdata/3.20.21.34/show_variables

mike south
