How To Duplicate Number of Hits from Prod Sever to NEW QA server?

2010-08-27 Thread Nunzio Daveri
Hello, I have been asked to replay the traffic load we have on one of our 
5.1.4X servers against a new 5.5 test server we are getting ready to put into 
production as a stand alone.  My question is that I have 6GB of mysqld.log BUT 
how to I figure out how MUCH to play back at a given time and how fast / number 
of clients etc...?  The log only tells me this is all the data that has come to 
the production server but it doesn't tell me how fast, how many connections, 
how 
many users at a time, how many inserts, updates or delets per second etc...

How do I replay the log against the 5.5 server so as to duplicate real time 
traffic and not just replay the logs?  Is there a tool or a shell script?  I 
know there are built in benchmarking tools but I am trying to tell mgmt that 
5.1.4x was lets say 60% percentage busy (cpu/mem/io) with traffic hitting it on 
Monday, the SAME amount of traffic on 5.5 is only 48% busy.

Any help or advise is greatly appreciated please.

Thanks in advance.

Nunzio


  

RE: How To Duplicate Number of Hits from Prod Sever to NEW QA server?

2010-08-27 Thread Jerry Schwartz
-Original Message-
From: Nunzio Daveri [mailto:nunziodav...@yahoo.com]
Sent: Friday, August 27, 2010 10:19 AM
To: mysql@lists.mysql.com
Subject: How To Duplicate Number of Hits from Prod Sever to NEW QA server?

Hello, I have been asked to replay the traffic load we have on one of our
5.1.4X servers against a new 5.5 test server we are getting ready to put into
production as a stand alone.  My question is that I have 6GB of mysqld.log 
BUT
how to I figure out how MUCH to play back at a given time and how fast / 
number
of clients etc...?  The log only tells me this is all the data that has come 
to
the production server but it doesn't tell me how fast, how many connections,
how
many users at a time, how many inserts, updates or delets per second etc...

How do I replay the log against the 5.5 server so as to duplicate real 
time
traffic and not just replay the logs?  Is there a tool or a shell script?  I
know there are built in benchmarking tools but I am trying to tell mgmt 
that
5.1.4x was lets say 60% percentage busy (cpu/mem/io) with traffic hitting it 
on
Monday, the SAME amount of traffic on 5.5 is only 48% busy.

Any help or advise is greatly appreciated please.

[JS] Unfortunately, the short answer is that benchmarking an interactive 
system is difficult.

Would your management be satisfied if you just used the query log to beat the 
living daylights out of the two systems? That way you could say that you 
eliminated all of the confounding factors and truly measured the database 
servers' actual throughput.

If they won't swallow that, then you'll have to read the rest of my reply.

- IF THE APPLICATION IS WEB BASED, you should be able to log the web server's 
traffic and use that to determine all of the information you need. Just replay 
the server log, using the times in there to determine the transaction arrival 
rates. If that won't do it, for some reason, then read on.

- Ideally, you should have a second system (a PC would have the horsepower) to 
drive the transactions into your MySQL server. That is the easy part.

- As you said, you need to simulate the transaction arrival rate. Since you 
don't really know when the individual transactions arrive, you need to use 
some random delays; but what to use?

- Where are the transactions coming from? An end-user program? A web server? 
Those two cases could have somewhat different characteristics.

- A web server handles one transaction at a time per thread, I'd guess; so 
from the perspective of MySQL the number of threads would be equivalent to the 
number of users.

- An end-user program that talks directly to MySQL would present one 
transaction at a time per user.

- How clustered are the transactions for an individual user (human or web 
server thread)? If the human at the input side (who ultimately initiates the 
transactions) is processing an order, there might be a customer lookup; a 
product lookup; and an actual order entry. Then there might be a time lapse 
between orders, depending upon the workflow.

- Depending upon the application, one transaction might request multiple 
queries. If the transactions are different, you'd have to know a rough 
proportion within the mixture. That probably isn't hard to guesstimate.

- Because the humans are presumably not in lock-step, your best choice for 
simulating the transaction arrival rate might be a Poisson distribution. While 
the transactions (and consequent queries) executed by an individual human 
might not fit a Poisson distribution, when you combine all of the users it 
should be a fair approximation.

- If the web server is a bottleneck then the arrival rate of the queries is 
the maximum rate at which the web server can service transactions, at least at 
times which things are busy.

- One last tidbit: in any network system, there is always a bottleneck.

As an alternative to a physical benchmark there are simulation tools; but you 
need to address these issues either way.

Now, some practical pointers about presenting your results:

- Use lots of graphs. They'll draw attention away from the tables of numbers.

- Use 3D graphs. They look whizzy, and at the same time they make it harder to 
eyeball exact numbers.

- If you wind up sitting across the table from someone who brought colored 
pencils and a ruler, prepare for the death of 1000 cuts.

Now you know why I got out of the performance consulting business.

Good luck.

Regards,

Jerry Schwartz
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032

860.674.8796 / FAX: 860.674.8341
E-mail: je...@gii.co.jp
Web site: www.the-infoshop.com




Thanks in advance.

Nunzio







-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org