Thanks Jamie,
I have not tried bonnie++.  I was trying to keep it to sequential IO for
comparison, since that is all rados bench can do.  I did do a full IO test
in a Windows VM using SQLIO; I have read/write and sequential/random results
for 4K/8K/64K blocks from that test.  I also have access to a Dell EqualLogic,
so I was using that as a high-end benchmark with the same SQLIO tests.  Same
goes for a single Intel SSD 320 my partner has.  I can attach the results if
you want to look at them.
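
For reference, the kind of sequential rados bench run I mean looks roughly
like this (the pool name and run length are just placeholders, and exact
options depend on the rados version; --no-cleanup keeps the written objects
around so the seq read test has something to read back):

  rados bench -p testpool 60 write -b 8192 -t 16 --no-cleanup
  rados bench -p testpool 60 seq -t 16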
In those tests the random IO gap between the EqualLogic and Ceph was not too
bad, but the sequential gap was (possibly because of the huge cache in the
EqualLogic).  But I am still bothered by the zero difference in performance
between using 1 SSD disk vs 2 vs 3 disks.  I would think there would be an
increase in reads at least.
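
One thing I still need to do is watch per-disk activity while the read tests
run, to see whether the IO actually spreads across all of the SSDs or just
hits one of them, e.g. something as simple as:

  iostat -x 1

alongside the rados bench / SQLIO runs.  If only one device ever shows
activity, that would go a long way toward explaining why 1 vs 3 SSDs makes no
difference.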



On Fri, Sep 20, 2013 at 7:44 PM, Jamie Alquiza <j...@grey-boundary.com> wrote:

> The iflag addition should help with at least having more accurate reads
> via dd, but in terms of actually testing performance, have you tried
> sysbench or bonnie++?
>
> I'd be curious how things change with multiple IO threads, as dd isn't
> necessarily a good performance investigation tool (you're rather testing
> "dd performance" as opposed to "using dd to test performance") if the
> concern is what to expect for your multi-tenant VM block store.
>
> Personally, I get more bugged out over many-thread random read throughput
> or synchronous write latency.
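>
> A many-thread random read run with sysbench, for instance, would look
> something like the following (file size and thread count are just examples,
> and the option names vary a bit between sysbench versions):
>
>   sysbench --test=fileio --file-total-size=8G prepare
>   sysbench --test=fileio --file-total-size=8G --file-test-mode=rndrd \
>     --num-threads=8 --max-time=60 --max-requests=0 run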
>
> On Friday, September 20, 2013, Jason Villalta wrote:
>
>> Thanks Jamie,
>>
>> I tried that too, but got similar results.  The issue looks like it may be
>> latency, but everything is running on one server, so logically I would
>> think there would be almost none.  According to this, though, there may be
>> something causing the slow results; see Co-Residency:
>> http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
>>
>> I have not found a way to prove this other than testing many different
>> configurations of OSDs and drives.  At one point I had 3 OSDs all running on
>> one SSD drive, and the performance was the same as when three OSDs were
>> running on 3 separate SSD drives.  It seems like there is something else
>> going on here.
>>
>> Also, I ran iotop while running rados bench and the SQLIO test in the
>> virtual machine.  Writes max out at 200-300 MB/s for the duration of the
>> test; reads never hit a sustained rate anywhere near that speed.
>>
>>
>>
>> On Fri, Sep 20, 2013 at 7:18 PM, Jamie Alquiza <j...@grey-boundary.com> wrote:
>>
>> I thought I'd just throw this in there, as I've been following this
>> thread: dd also has an 'iflag' directive just like the 'oflag'.
>>
>> I don't have a deep, offhand recollection of the caching mechanisms at
>> play here, but assuming you want a solid synchronous / non-cached read, you
>> should probably specify 'iflag=direct'.
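>>
>> For example, reusing the test file from your earlier dd runs:
>>
>>   dd if=ddbenchfile of=/dev/null bs=8K iflag=direct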
>>
>> On Friday, September 20, 2013, Jason Villalta wrote:
>>
>> Mike,
>> So I do have to ask: where would the extra latency be coming from if all
>> my OSDs are on the same machine that my test VM is running on?  I have
>> tried every SSD tweak in the book.  The primary concern I see is with read
>> performance of sequential IOs in the 4-8K range.  I would expect those to
>> pull from three SSD disks on a local machine at least as fast as one
>> native SSD test.  But I don't see that; it's actually slower.
>>
>>
>> On Wed, Sep 18, 2013 at 4:02 PM, Jason Villalta <ja...@rubixnet.com> wrote:
>>
>> Thanks Mike,
>> High hopes, right ;)
>>
>> I guess we are not doing too badly compared to your numbers then.  Just
>> wish the gap between native and Ceph per OSD were a little smaller.
>>
>> C:\Program Files (x86)\SQLIO>sqlio -kW -t8 -s30 -o8 -fsequential -b1024
>> -BH -LS
>> c:\TestFile.dat
>> sqlio v1.5.SG
>> using system counter for latency timings, 100000000 counts per second
>> 8 threads writing for 30 secs to file c:\TestFile.dat
>>         using 1024KB sequential IOs
>>         enabling multiple I/Os per thread with 8 outstanding
>>         buffering set to use hardware disk cache (but not file cache)
>> using current size: 10240 MB for file: c:\TestFile.dat
>> initialization done
>> CUMULATIVE DATA:
>> throughput metrics:
>> IOs/sec:   180.20
>> MBs/sec:   180.20
>> latency metrics:
>> Min_Latency(ms): 39
>> Avg_Latency(ms): 352
>> Max_Latency(ms): 692
>> histogram:
>> ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
>> 23 24+
>> %:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
>>  0 100
>>
>>
>>
>> On Wed, Sep 18, 2013 at 3:55 PM, Mike Lowe <j.michael.l...@gmail.com> wrote:
>>
>> Well, in a word, yes.  You really expect a network-replicated storage
>> system in user space to be comparable to direct-attached SSD storage?  For
>> what it's worth, I've got a pile of regular spinning rust; this is what my
>> cluster will do inside a VM with RBD writeback caching on.  As you can see,
>> latency is everything.
>>
>> dd if=/dev/zero of=1g bs=1M count=1024
>> 1024+0 records in
>> 1024+0 records out
>> 1073741824 bytes (1.1 GB) copied, 6.26289 s, 171 MB/s
>> dd if=/dev/zero of=1g bs=1M count=1024 oflag=dsync
>> 1024+0 records in
>> 1024+0 records out
>> 1073741824 bytes (1.1 GB) copied, 37.4144 s, 28.7 MB/s
>>
>> As you can see, latency is a killer.
>>
>> On Sep 18, 2013, at 3:23 PM, Jason Villalta <ja...@rubixnet.com> wrote:
>>
>> Any other thoughts on this thread, guys?  Am I just crazy to want
>> near-native SSD performance on a small SSD cluster?
>>
>>
>> On Wed, Sep 18, 2013 at 8:21 AM, Jason Villalta <ja...@rubixnet.com> wrote:
>>
>> That dd gives me this:
>>
>> dd if=ddbenchfile of=- bs=8K | dd if=- of=/dev/null bs=8K
>> 8192000000 bytes (8.2 GB) copied, 31.1807 s, 263 MB/s
>>
>> Which makes sense, because the SSD is running as SATA 2, which gives 3 Gbps
>> on the wire, or roughly 300 MB/s after 8b/10b encoding.
>>
>> I am still trying to better understand the speed difference between the
>> small block speeds seen with dd vs the same small object size with rados.
>>  It is not a difference
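>>
>> To make that comparison closer to what dd is doing, I may also try pinning
>> rados bench to a single in-flight op at the same size, something like
>> (pool name is just a placeholder):
>>
>>   rados bench -p testpool 30 write -b 8192 -t 1 --no-cleanup
>>   rados bench -p testpool 30 seq -t 1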
>>
>>
>
>
> --
> -ja. Sent via mobile.
>



-- 
-- 
*Jason Villalta*
Co-founder
800.799.4407x1230 | www.RubixTechnology.com


