Ceph Benchmark HowTo

2012-07-24 Thread Mehdi Abaakouk
Hi all,

I am currently doing some tests on Ceph, more precisely on the RBD and RADOSGW
parts.
My goal is to gather performance metrics for different hardware and Ceph
setups.

To do so, I am preparing a benchmark how-to to help people compare their
metrics.

I have started the how-to here: http://ceph.com/w/index.php?title=Benchmark
I have linked it in the misc section of the main page.

So, first question: is it all right if I continue publishing this procedure
on your wiki?

The how-to is not finished yet; this is only a first draft.
My test platform is not ready yet either, so the benchmark results can't be
used yet.

The next thing I will add to the how-to is some explanation of how to
interpret the benchmark results.

So, if you have comments, ideas for benchmarks, or anything else that could
help me improve the how-to and/or compare future results, I would be glad to
read them.

And thanks a lot for your work on Ceph; it is a great storage system :)

Best Regards,
-- 
Mehdi Abaakouk for eNovance
mail: sil...@sileht.net
irc: sileht




Re: Ceph Benchmark HowTo

2012-07-25 Thread Mehdi Abaakouk
On Tue, Jul 24, 2012 at 10:55:37AM -0500, Mark Nelson wrote:
> On 07/24/2012 09:43 AM, Mehdi Abaakouk wrote:
> 
> Thanks for taking the time to put all of your benchmarking
> procedures into writing!  Having this kind of community
>
> ...
>

Thanks for your comments and these tools; they will help me for sure.


-- 
Mehdi Abaakouk
mail: sil...@sileht.net
irc: sileht




Re: Ceph Benchmark HowTo

2012-07-25 Thread Mehdi Abaakouk
Hi Florian,

On Wed, Jul 25, 2012 at 10:06:04PM +0200, Florian Haas wrote:
> Hi Mehdi,
> For the OSD tests, which OSD filesystem are you testing on? Are you
> using a separate journal device? If yes, what type?

Currently I use XFS, and the journal is on the same disk in another
partition. After reading the documentation, it seems that using a dedicated
disk is better and that an SSD is a good choice.
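For a next run I would move the journal to a dedicated SSD; a minimal
ceph.conf sketch of what that would look like (host name and partition are
hypothetical, only to illustrate the layout):

  [osd.0]
          # hypothetical host and device names
          host = node1
          # journal on a partition of a dedicated SSD
          # instead of a partition on the data disk
          osd journal = /dev/sdb1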

> seekwatcher -t rbd-latency-write.trace -o rbd-latency-write.png -p 'dd
> if=/dev/zero of=/dev/rbd0 bs=4M count=1000 oflag=direct' -d /dev/rbd0
> 
> Just making sure: are you getting the same numbers just with dd,
> rather than dd invoked by seekwatcher?

Yes, I get the same numbers with plain dd.
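For reference, the bare commands without the seekwatcher wrapper (the write
one is the same as above; the read counterpart with iflag=direct is the
analogous check in the other direction):

  # 4M direct-I/O write latency, same command as inside seekwatcher
  dd if=/dev/zero of=/dev/rbd0 bs=4M count=1000 oflag=direct
  # 4M direct-I/O read latency
  dd if=/dev/rbd0 of=/dev/null bs=4M count=1000 iflag=direct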

> 
> Also, for your dd latency test of 4M direct I/O reads writes, you seem
> to be getting 39 and 300 ms average latency, yet further down it says
> "RBD latency read/write: 28ms and 114.5ms". Any explanation for the
> write latency being cut in half on what was apparently a different
> test run?

Yes, this is a different run; the one at the bottom used fewer servers but
better hardware.

> 
> Also, were read and write caches cleared between tests? (echo 3 >
> /proc/sys/vm/drop_caches)

No, I will add that step to the procedure.
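Probably something like this on every node before each run (the host list
is a placeholder for my servers):

  # flush dirty pages, then drop page/dentry/inode caches on each node
  for host in node1 node2 node3; do
      ssh root@"$host" 'sync && echo 3 > /proc/sys/vm/drop_caches'
  done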

> Cheers,
> Florian

I know that my setup is not really optimal.
Writing these tests helps me understand how Ceph works, and
I'm sure that with your advice I will build a better cluster :)

Thanks for your help.

Cheers,
-- 
Mehdi Abaakouk
mail: sil...@sileht.net
irc: sileht




Re: Ceph Benchmark HowTo

2012-07-31 Thread Mehdi Abaakouk
Hi all,

I have updated the how-to here:
http://ceph.com/wiki/Benchmark

And published the results of my latest tests:
http://ceph.com/wiki/Benchmark#First_Example

All results are good; my benchmark is clearly limited by my network
connection at ~110 MB/s, which is about what a single gigabit link can
deliver (1 Gbit/s ≈ 125 MB/s before protocol overhead).

The exception is the rest-api bench, whose result seems really low.

I have configured radosgw following:
http://ceph.com/docs/master/radosgw/config/
I drop the disk caches on all servers before the bench and run rest-bench
for 900 seconds with the default values, roughly as sketched below.
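A sketch of the invocation, assuming the usual RGW credentials; the flag
names are from memory, so please check them against rest-bench --help:

  # 900-second write bench against the gateway, otherwise default values
  # (gateway-host, ACCESSKEY and SECRETKEY are placeholders)
  rest-bench --api-host=gateway-host \
             --access-key=ACCESSKEY --secret=SECRETKEY \
             --seconds=900 write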

Is my rest-bench result normal? Have I missed something?

Don't hesitate to ask if you need more information about my setup.

I also have another question: how is the standard deviation calculated by
rados bench and rest-bench? Is it computed from the per-second values
printed by the benchmark client? If so, when latency is too high the
reported bandwidth is sometimes zero, so does the calculated bandwidth
StdDev still make sense? (A toy example of what I mean is below.)
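A toy calculation of what I assume the client does, over per-second
bandwidth samples (MB/s) with the zero seconds included:

  # population stddev over per-second samples; the zeros pull the
  # mean down and push the stddev up, which is what worries me
  printf '%s\n' 110 0 105 0 98 112 | awk '
      { n++; sum += $1; sumsq += $1 * $1 }
      END { mean = sum / n;
            printf "mean=%.1f stddev=%.1f\n", mean, sqrt(sumsq / n - mean * mean) }'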


Cheers,
-- 
Mehdi Abaakouk for eNovance
mail: sil...@sileht.net
irc: sileht




About teuthology

2012-07-31 Thread Mehdi Abaakouk
Hi,

I have taken a look at teuthology; the automation of all these tests is
nice, but is there any way to run it against an already installed Ceph
cluster?

Thanks in advance.

Cheers,

-- 
Mehdi Abaakouk for eNovance
mail: sil...@sileht.net
irc: sileht




Re: About teuthology

2012-07-31 Thread Mehdi Abaakouk
On Tue, Jul 31, 2012 at 09:27:54AM -0500, Mark Nelson wrote:
> On 7/31/12 8:59 AM, Mehdi Abaakouk wrote:
> Hi Mehdi,
> 
> I think a number of the test related tasks should run fine without
> strictly requiring the ceph task.  You may have to change binary
> locations for things like rados, but those should be pretty minor.
> 
> Best way to find out is to give it a try!

Thanks for your quick answer :)

I have already tried, but the code refers heavily to files in
/tmp/cephtest/; it seems to me that changing the path of the binaries isn't
enough, since some of them are built by the ceph task.

Perhaps a quicker (if a bit dirty) way is to create a new task, 'cephdist',
that prepares the required files in /tmp/cephtest, i.e.:
- link the distribution binaries into /tmp/cephtest/binary/usr/local/bin/...
- link /etc/ceph/ceph.conf to /tmp/cephtest/ceph.conf
- ship the cephtest tools into /tmp/cephtest (like the ceph task does)
- make a dummy script for coverage (because the packaged ceph doesn't seem
  to ship ceph-coverage)
A rough sketch of what I have in mind is after this list.
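The sketch, to be run on each node (the paths follow what I have seen the
ceph task use; the exact layout and the way ceph-coverage is called are
assumptions to verify):

  # link the packaged binaries where teuthology expects them
  mkdir -p /tmp/cephtest/binary/usr/local/bin
  for bin in ceph rados rbd ceph-osd ceph-mon ceph-mds; do
      path=$(command -v "$bin") &&
          ln -sf "$path" /tmp/cephtest/binary/usr/local/bin/"$bin"
  done
  # point the tasks at the existing cluster configuration
  ln -sf /etc/ceph/ceph.conf /tmp/cephtest/ceph.conf
  # (the cephtest helper tools would also be copied here, like the ceph task does)
  # dummy ceph-coverage: assuming it is called as
  #   ceph-coverage <coverage dir> <command...>
  # the stub drops the first argument and runs the command
  printf '#!/bin/sh\nshift\nexec "$@"\n' \
      > /tmp/cephtest/binary/usr/local/bin/ceph-coverage
  chmod +x /tmp/cephtest/binary/usr/local/bin/ceph-coverage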

What do you think about it?

Cheers

-- 
Mehdi Abaakouk for eNovance
mail: sil...@sileht.net
irc: sileht




Re: Ceph Benchmark HowTo

2012-08-02 Thread Mehdi Abaakouk
On Wed, Aug 01, 2012 at 09:06:44AM -0500, Mark Nelson wrote:
> I haven't actually used bonnie++ myself, but I've read some rather
> bad reports from various other people in the industry.  Not sure how
> much it's changed since then...
> 
> https://blogs.oracle.com/roch/entry/decoding_bonnie
> http://www.quora.com/What-are-some-file-system-benchmarks
> http://scalability.org/?p=1685
> http://scalability.org/?p=1688
> 
> I'd say to just take extra care to make sure that it's behaving
> the way you intended it to (probably good advice no matter which
> benchmark you use!)

Thanks for these good links :). I have also started trying fio for its
flexibility.
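For example, something like this for a 4M sequential direct write against
the RBD device, comparable to the dd test (the parameters are only
illustrative, not a final methodology):

  # 4M sequential direct writes through libaio with a queue depth of 16,
  # time-based so the run lasts 300 seconds
  fio --name=rbd-seq-write --filename=/dev/rbd0 \
      --rw=write --bs=4M --direct=1 \
      --ioengine=libaio --iodepth=16 \
      --runtime=300 --time_based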

> >All results are good, my benchmark is clearly limited by my network
> >connection ~ 110MB/s.
> 
> Gigabit Ethernet is definitely going to be a limitation with large
> block sequential IO for most modern disks.  I'm concerned with your
> 6 client numbers though.  I assume those numbers are per client?
> Even so, with 10 OSDs that performance is pretty bad!  Are you
> getting a good distribution of writes across all OSDs?  Consistent
> throughput over time on each?

This is a network issue too; the 6-client tests are not really
representative, since all the clients share the same 1-gigabit link. I will
acquire more hardware soon to be more realistic (and will replace these
results).

Some clarifications have been added to the benchmark page.

> >In exception of the rest-api bench, the value seems really low.
> > ...
> >Is my rest-bench result normal ? Have I missed something ?
> 
> You may want to try increasing the number of concurrent rest-bench
> operations.  Also I'd explicitly specify the number of PGs for the
> pool you create to make sure that you are getting a good
> distribution.

During my test the number of PGs is 640 for 10 OSDs. I have tried with more
concurrent operations (32 and 64), but the result is almost the same, just
with more latency (roughly the commands sketched below).
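For reference, roughly how I create a pool with an explicit PG count and
drive higher concurrency at the RADOS level (the pool name is a placeholder;
rest-bench itself goes through the gateway, so this is only the lower-level
comparison point):

  # dedicated pool with 640 PGs (and 640 PGPs) for the 10 OSDs
  ceph osd pool create benchpool 640 640
  # 900-second write bench with 64 concurrent operations
  rados bench -p benchpool 900 write -t 64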


Cheers,
-- 
Mehdi Abaakouk for eNovance
mail: sil...@sileht.net
irc: sileht

