Hi everyone,
I'm 100% sure that I'm not the first one to consider benchmarking a
whole infrastructure, yet I haven't found any relevant information
on how to approach this challenge.
As I guess we all know what we are talking about, I'll try to describe
the scenario as briefly as possible.
I've been working in the hosting business for years and we've always
done some kind of testing for new stuff. Back in the Wonder Years, we
ran "ab", "dd", "bonnie" and the like to test disk, CPU and so on.
Then we grew up and needed to benchmark our commercial website or
some big customer's website. We tried a lot of tools and ended up
with things like JMeter, which was a great help for quite a while.
Recently, we've been using Locust, which is a great tool: not so easy
to set up, but very powerful.
But now we can say we are "mature": we sell cloud services and much
more, so there is a lot more at stake than a bunch of websites on a
web server.
As a cloud engineer you design a storage solution able to host
thousands of VMs (the same scenario and needs apply to other things,
like big database clusters, for example). You spend days doing your
maths and propose a number of available VMs for the budget you are
assigned. We all know what comes next... someone shows up and says...
no way... we have to fit at least twice as many in there, we have to
make money!!
So... you know there is no possible way to fit twice the number of
VMs you calculated, but you have to open a negotiation and provide
enough information to the managers to reach an agreement somewhere
in between.
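
To give an idea of what those maths look like, here is a
back-of-envelope sketch in Python; every number in it is a made-up
placeholder, not a real figure from our platform:

# Back-of-envelope capacity estimate. All numbers are placeholders
# (assumptions), to be replaced with your own measured figures.
TOTAL_USABLE_TB = 200          # usable capacity after RAID/replication (assumption)
TOTAL_BACKEND_IOPS = 120_000   # sustained IOPS the array can deliver (assumption)
TOTAL_THROUGHPUT_MBS = 6_000   # sustained MB/s (assumption)

VM_DISK_GB = 50                # average disk size per VM (assumption)
VM_AVG_IOPS = 40               # average sustained IOPS per VM (assumption)
VM_AVG_MBS = 2                 # average MB/s per VM (assumption)

by_capacity = (TOTAL_USABLE_TB * 1024) // VM_DISK_GB
by_iops = TOTAL_BACKEND_IOPS // VM_AVG_IOPS
by_throughput = TOTAL_THROUGHPUT_MBS // VM_AVG_MBS

# The real ceiling is whichever resource runs out first.
limit = min(by_capacity, by_iops, by_throughput)
print(f"capacity-bound:   {by_capacity} VMs")
print(f"IOPS-bound:       {by_iops} VMs")
print(f"throughput-bound: {by_throughput} VMs")
print(f"proposed ceiling: {limit} VMs")

Numbers like these at least show the managers which resource is the
bottleneck, even before any real test is run.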
And here's where the problem lies... how am I supposed to test a whole
infrastructure that can host thousands of VMs?
We've already worked with distributed load testing tools such as
JMeter or Locust. They are great, but they have one big issue: they
were designed to test a single IP address, not thousands of VMs.
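
One partial workaround would be to make each request pick its target
from an inventory file, roughly like this minimal Locust sketch (the
targets.txt file, one hostname or IP per line, and the placeholder
host are just illustrative, not part of any standard setup):

# locustfile.py -- spread load over many target VMs instead of one host.
import random
from locust import HttpUser, task, between

# "targets.txt" is an assumption: one hostname or IP per line,
# generated from your own inventory.
with open("targets.txt") as f:
    TARGETS = [line.strip() for line in f if line.strip()]

class MultiHostUser(HttpUser):
    # Locust still wants a nominal host, but every request below
    # overrides it with an absolute URL picked from the target list.
    host = "http://placeholder.invalid"
    wait_time = between(1, 5)

    @task
    def front_page(self):
        target = random.choice(TARGETS)
        # name= groups all hosts under one stats line so the report stays readable
        self.client.get(f"http://{target}/", name="GET / (any vm)")

It still only exercises the HTTP path, of course, not the whole
platform, which is exactly my problem.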
So... I guess many of you have come across this situation only to
realise that there is no way to test it effectively. However, I'm sure
you have at some point found a way to test an infrastructure like this
in a more realistic way than running the old-style tests. I would
appreciate any ideas you can give me.
Obviously you need a proper architecture and setup, nice hardware,
daily maintenance, and much more. Everything we can possibly do to
keep our systems clean and updated is already being done, but... at
what point should we stop putting data in?
What we are doing right now when preparing a new system is:
-set up a Nagios/Munin system to monitor the main metrics: network,
disk latencies, etc.
-create hundreds/thousands of VMs, depending on the TBs available.
-launch all or most of those VMs (some are only used to occupy space).
-SSH into most of them and run, all at once or intermittently, some
type of disk test like dd, bonnie or iozone (see the first sketch
after this list).
-start browsing "manually" some websites hosted on those VMs and
decide whether they are slow. Obviously this is very subjective.
Despite that, we can say that most people feel "happy" if the page
loads in less than a second (the second sketch below tries to turn
this into numbers).
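
About the SSH step: logging into each VM by hand doesn't scale, so the
kind of fan-out I mean is roughly this (the vm_list.txt inventory and
the dd flags are just placeholders; fio would probably give more
useful numbers than dd):

# Fan a disk test out over many VMs in parallel instead of ssh-ing by hand.
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Example command only; dd writes its summary to stderr, hence the redirect.
DD_CMD = "dd if=/dev/zero of=/tmp/benchfile bs=1M count=1024 oflag=direct 2>&1 | tail -1"

def run_disk_test(host):
    # BatchMode avoids hanging on a password prompt; assumes key-based auth.
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, DD_CMD],
        capture_output=True, text=True, timeout=600,
    )
    return host, result.returncode, result.stdout.strip()

# "vm_list.txt" is an assumption: one reachable hostname/IP per line.
with open("vm_list.txt") as f:
    hosts = [line.strip() for line in f if line.strip()]

with ThreadPoolExecutor(max_workers=50) as pool:
    for host, rc, output in pool.map(run_disk_test, hosts):
        print(f"{host}: rc={rc} {output}")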
Sometimes, just by looking at the Munin graphs, you can spot some
possible bottlenecks, but we've had service degradations with far
fewer active VMs than the warning threshold we managed to identify
during the tests.
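
One way to make the "is it slow?" check less subjective, and maybe to
catch degradations before the graphs do, would be to time real HTTP
requests against the test VMs and look at percentiles. A minimal
sketch, where targets.txt and the one-second budget are again just
placeholders:

# Time a GET against every VM and report percentiles instead of gut feeling.
import statistics
import requests

BUDGET_SECONDS = 1.0  # "happy" threshold; adjust to what customers tolerate

with open("targets.txt") as f:
    targets = [line.strip() for line in f if line.strip()]

timings = []
for host in targets:
    try:
        r = requests.get(f"http://{host}/", timeout=10)
        # r.elapsed is the raw HTTP response time, not a full browser page load
        timings.append(r.elapsed.total_seconds())
    except requests.RequestException:
        print(f"{host}: failed")

if timings:
    timings.sort()
    p50 = statistics.median(timings)
    p95 = timings[int(0.95 * (len(timings) - 1))]
    over = sum(t > BUDGET_SECONDS for t in timings)
    print(f"p50={p50:.3f}s p95={p95:.3f}s over-budget={over}/{len(timings)}")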
So, to sum up, I know that if someone had come up with a solution to
this issue, it would be very easy to find on the first page of Google,
but let's pool our strategies and see if we can properly benchmark at
least some small parts of the system.
Regards,
Jordi.
--
Jordi Moles Blanco
IaaS Engineer Cdmon.com
___________________________
Tlf: 902 36 41 38
Tlf: 93 567 75 77
mailto: [email protected]
http://www.cdmon.com
http://es.linkedin.com/in/jordimolesblanco