> On 25 Feb 2016, at 14:39, Nick Fisk <n...@fisk.me.uk> wrote:
> 
> 
> 
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Huan Zhang
>> Sent: 25 February 2016 11:11
>> To: josh.dur...@inktank.com
>> Cc: ceph-users <ceph-us...@ceph.com>
>> Subject: [ceph-users] Guest sync write iops so poor.
>> 
>> Hi,
>>   We are testing sync IOPS with fio sync=1 for database workloads in a VM;
>> the backend is librbd and Ceph (all-SSD setup).
>>   The result is disappointing: we only get ~400 IOPS of sync randwrite, from
>> iodepth=1 up to iodepth=32.
>>   But testing on a physical machine with fio ioengine=rbd and sync=1, we can
>> reach ~35K IOPS, so it seems the QEMU RBD layer is the bottleneck.
>>   The QEMU version is 2.1.2 with the rbd_aio_flush patch applied.
>>   RBD cache is off, QEMU cache=none.
>> 
>> So what's wrong with it? Is that normal? Could you give me some help?
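(For reference, the two tests described would look roughly like the fio jobs
below; the device path, pool and image names are placeholders, not taken from
the report above:)

    ; inside the guest, against the virtio disk
    [guest-syncwrite]
    ioengine=libaio
    filename=/dev/vdb
    direct=1
    sync=1
    rw=randwrite
    bs=4k
    iodepth=1
    runtime=60

    ; on a physical host, directly via librbd
    [host-rbd]
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=test
    sync=1
    rw=randwrite
    bs=4k
    iodepth=1
    runtime=60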
> 
> Yes, this is normal at QD=1. Since each write needs to be acknowledged by both 
> replica OSDs across a network connection, the round-trip latency severely 
> limits you compared to travelling along a 30 cm SATA cable.
> 
> The two biggest contributors to latency are the network and the speed at which 
> the CPU can process the Ceph code. To improve performance, look at these two 
> areas first. An easy win is to disable debug logging in Ceph.
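(A typical set of overrides for that looks roughly like the snippet below; a
sketch only, the list of debug subsystems is not exhaustive and which ones
matter most depends on your release:)

    [global]
        debug ms = 0/0
        debug auth = 0/0
        debug osd = 0/0
        debug filestore = 0/0
        debug journal = 0/0
        debug monc = 0/0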
> 
> However, this number should scale as you increase the QD, so something is not 
> right if you are seeing the same performance at QD=32 as at QD=1.

Are you sure? Unless something (the I/O elevator) coalesces the writes, they 
should be serialized and blocking, so a higher QD doesn't necessarily help there. 
Either way, if you reach higher IOPS with QD>1 then you're benchmarking the 
elevator and not RBD, IMO.

35K IOPS with ioengine=rbd sounds like the "sync=1" option doesn't actually do 
anything there. Or it's not touching the same object (though I wonder whether 
write ordering is preserved at that rate?).
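(One way to check: re-run the host-side job with an explicit flush after every
write, e.g. fsync=1; assuming fio's rbd engine actually turns that into an RBD
flush, the number should collapse toward the in-guest figure if sync=1 was
being ignored. Pool and image names are placeholders:)

    [host-rbd-flush]
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=test
    rw=randwrite
    bs=4k
    iodepth=1
    fsync=1        ; flush after every single write
    runtime=60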

400 IOPS is sadly about the same figure I can reach on a raw device... testing 
through a filesystem you can easily drop below 200 IOPS (because of the journal, 
metadata... but then again you're benchmarking filesystem journal and I/O 
elevator efficiency, not RBD itself).
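(If you want to separate those effects, run the same job once against the raw
device and once against a file on the filesystem; paths are placeholders, and
note that the raw-device run overwrites the device:)

    fio --name=raw --filename=/dev/vdb --ioengine=libaio --direct=1 --sync=1 \
        --rw=randwrite --bs=4k --iodepth=1 --runtime=60

    fio --name=fs --filename=/mnt/test/fio.dat --size=1G --ioengine=libaio \
        --direct=1 --sync=1 --rw=randwrite --bs=4k --iodepth=1 --runtime=60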

Jan


> 
>> Thanks very much.
> 
