Re: [ceph-users] Ceph performance with 8K blocks.

Jason Villalta Fri, 20 Sep 2013 16:28:16 -0700

Thanks Jamie,

I tried that too, with similar results.  The issue may be latency.
Everything is running on one server, so logically I would expect little or
no latency, but according to the docs there may be something else causing
the slow results.  See the Co-Residency section:
http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/


I have not found a way to prove this other than testing many different
configurations of OSDs and drives.  At one point I had 3 OSDs all running on
one SSD drive.  The performance was the same as when the three OSDs were
running on 3 separate SSD drives.  It seems like there is something else
going on here.

Also, I ran iotop while running rados bench and sqlio in the virtual machine.
Writes max out at 200-300MBps for the duration of the test.  Reads never
hit a sustained rate anywhere near that speed.
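
As a rough sanity check on the latency argument in this thread, here is a
minimal Python sketch (not part of the original exchange) relating latency to
throughput.  It assumes Little's law applies, i.e. sustained IOPS is roughly
the number of I/Os in flight divided by the average latency, and plugs in the
numbers quoted below: 8 threads x 8 outstanding I/Os at 352 ms average from
the sqlio run, and 1024 synchronous 1 MiB writes in 37.4144 s from the
dd oflag=dsync run.  The function names are made up for illustration, and the
1 ms latency in the last case is a hypothetical value, not a measurement.

```python
def iops_from_latency(outstanding_ios, avg_latency_s):
    """Little's law: sustained IOPS ~= I/Os in flight / average latency."""
    return outstanding_ios / avg_latency_s


def throughput_mb_s(iops, block_bytes):
    """Throughput in MB/s (10**6 bytes, the unit dd reports)."""
    return iops * block_bytes / 1e6


# sqlio run quoted below: -t8 -o8 gives 8 * 8 = 64 I/Os in flight,
# with an average latency of 352 ms.
sqlio_iops = iops_from_latency(8 * 8, 0.352)

# dd oflag=dsync run quoted below: 1024 writes of 1 MiB, issued one at a
# time, finishing in 37.4144 s.
dsync_latency_s = 37.4144 / 1024
dsync_mb_s = throughput_mb_s(iops_from_latency(1, dsync_latency_s), 1024 * 1024)

# Hypothetical case: 8 KiB reads issued one at a time, assuming 1 ms per
# request (an assumed figure, not something measured in this thread).
small_read_mb_s = throughput_mb_s(iops_from_latency(1, 0.001), 8 * 1024)

print(f"sqlio predicted IOPS: {sqlio_iops:.1f}")      # sqlio reported 180.20
print(f"dd dsync latency: {dsync_latency_s * 1000:.1f} ms per write")
print(f"dd dsync throughput: {dsync_mb_s:.1f} MB/s")  # dd reported 28.7 MB/s
print(f"8 KiB at 1 ms, queue depth 1: {small_read_mb_s:.1f} MB/s")
```

If this model is roughly right, the sqlio prediction lands within about 1% of
the reported 180.20 IOs/sec, and an 8K stream issued one request at a time is
bounded by per-request latency rather than by SSD bandwidth.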



On Fri, Sep 20, 2013 at 7:18 PM, Jamie Alquiza <j...@grey-boundary.com> wrote:

> I thought I'd just throw this in there, as I've been following this
> thread: dd also has an 'iflag' directive just like the 'oflag'.
>
> I don't have a deep, offhand recollection of the caching mechanisms at
> play here, but assuming you want a solid synchronous / non-cached read, you
> should probably specify 'iflag=direct'.
>
> On Friday, September 20, 2013, Jason Villalta wrote:
>
>> Mike,
>> So I do have to ask, where would the extra latency be coming from if all
>> my OSDs are on the same machine that my test VM is running on?  I have
>> tried every SSD tweak in the book.  The primary concerning issue I see is
>> with Read performance of sequential IOs in the 4-8K range.  I would expect
>> those to pull from three SSD disks on a local machine at least as fast as one
>> native SSD test.  But I don't see that; it's actually slower.
>>
>>
>> On Wed, Sep 18, 2013 at 4:02 PM, Jason Villalta <ja...@rubixnet.com> wrote:
>>
>> Thanks Mike,
>> High hopes right ;)
>>
>> I guess we are not doing too bad compared to your numbers then.  Just wish
>> the gap was a little closer between native and ceph per osd.
>>
>> C:\Program Files (x86)\SQLIO>sqlio -kW -t8 -s30 -o8 -fsequential -b1024 -BH -LS c:\TestFile.dat
>> sqlio v1.5.SG
>> using system counter for latency timings, 100000000 counts per second
>> 8 threads writing for 30 secs to file c:\TestFile.dat
>>         using 1024KB sequential IOs
>>         enabling multiple I/Os per thread with 8 outstanding
>>         buffering set to use hardware disk cache (but not file cache)
>> using current size: 10240 MB for file: c:\TestFile.dat
>> initialization done
>> CUMULATIVE DATA:
>> throughput metrics:
>> IOs/sec:   180.20
>> MBs/sec:   180.20
>> latency metrics:
>> Min_Latency(ms): 39
>> Avg_Latency(ms): 352
>> Max_Latency(ms): 692
>> histogram:
>> ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
>> %:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 100
>>
>>
>>
>> On Wed, Sep 18, 2013 at 3:55 PM, Mike Lowe <j.michael.l...@gmail.com> wrote:
>>
>> Well, in a word, yes. You really expect a network replicated storage
>> system in user space to be comparable to direct attached ssd storage?  For
>> what it's worth, I've got a pile of regular spinning rust, this is what my
>> cluster will do inside a vm with rbd writeback caching on.  As you can see,
>> latency is everything.
>>
>> dd if=/dev/zero of=1g bs=1M count=1024
>> 1024+0 records in
>> 1024+0 records out
>> 1073741824 bytes (1.1 GB) copied, 6.26289 s, 171 MB/s
>> dd if=/dev/zero of=1g bs=1M count=1024 oflag=dsync
>> 1024+0 records in
>> 1024+0 records out
>> 1073741824 bytes (1.1 GB) copied, 37.4144 s, 28.7 MB/s
>>
>> As you can see, latency is a killer.
>>
>> On Sep 18, 2013, at 3:23 PM, Jason Villalta <ja...@rubixnet.com> wrote:
>>
>> Any other thoughts on this thread, guys?  Am I just crazy to want near
>> native SSD performance on a small SSD cluster?
>>
>>
>> On Wed, Sep 18, 2013 at 8:21 AM, Jason Villalta <ja...@rubixnet.com> wrote:
>>
>> That dd gives me this.
>>
>> dd if=ddbenchfile of=- bs=8K | dd if=- of=/dev/null bs=8K
>> 8192000000 bytes (8.2 GB) copied, 31.1807 s, 263 MB/s
>>
>> Which makes sense because the SSD is running as SATA 2 which should give
>> 3Gbps or ~300MBps
>>
>> I am still trying to better understand the speed difference between the
>> small block speeds seen with dd vs the same small object size with rados.
>>  It is not a difference of a few MB per sec.  It seems to nearly be a
>> factor of 10.  I just want to know if this is a hard limit in Ceph or a
>> factor of the underlying disk speed.  Meaning if I use spindles to read
>> data would the speed be the same or would the read speed be a factor of 10
>> less than the speed of the underlying disk?
>>
>>
>> On Wed, Sep 18, 2013 at 4:27 AM, Alex Bligh <a...@alex.org.uk> wrote:
>>
>>
>> On 17 Sep 2013, at 21:47, Jason Villalta wrote:
>>
>> > dd if=ddbenchfile of=/dev/null bs=8K
>> >
>>
>>
>
> --
> -ja. Sent via mobile.
>



-- 
*Jason Villalta*
Co-founder
800.799.4407x1230 | www.RubixTechnology.com<http://www.rubixtechnology.com/>


