On Sat, Sep 6, 2014 at 5:22 PM, Sage Weil <[email protected]> wrote:
> On Thu, 4 Sep 2014, Ilya Dryomov wrote:
>> On Thu, Sep 4, 2014 at 3:39 PM, Chaitanya Huilgol
>> <[email protected]> wrote:
>> > Hi,
>> >
>> > In our benchmarking tests we observed that the ms_tcp_nodelay option in
>> > ceph.conf does not affect the kernel rbd and, as expected, we see poor
>> > latency numbers for lower queue depths and 4K rand reads. There is a
>> > significant increase in latency from qd=2 to 24, and it starts tapering
>> > down for higher queue depths.
>> > We did not find a relevant kernel_setsockopt with TCP_NODELAY in the kernel
>> > RBD/libceph (messenger.c) source. Unless we are missing something, it looks
>> > like the kernel RBD is currently not setting this, and this is affecting
>> > latency numbers at lower queue depths.
>> >
>> > I have tested with userspace fio (rbd engine) and rados bench and we see
>> > similar latency behavior when ms_tcp_nodelay is set to false. However,
>> > setting this to true gives consistently low latency numbers for all queue
>> > depths.
>> >
>> > Any ideas/thoughts on this?
>> >
>> > OS Ubuntu 14.04
>> > Kernel: 3.13.0-24-generic #46-Ubuntu SMP
>> > Ceph: Latest Master
>>
>> No, we don't set TCP_NODELAY in the kernel client, but I think we can
>> add it as a rbd map/mount option. Sage?
>
> We definitely can, and I think more importantly it should be on by
> default, as it is in userspace. I'm surprised we missed that. :( IIRC we
> are carefully setting the MORE (or CORK?) flag on all but the last write
> for a message, but I take it there is a socket-level option we missed?

Yeah, but also see http://tracker.ceph.com/issues/9345.

Thanks,

                Ilya
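
For anyone following the thread, here is a minimal userspace sketch of the socket-level option being discussed. It is not the kernel client code and not Ceph's messenger. The helper names make_low_latency_socket and nodelay_enabled are hypothetical and only illustrate how TCP_NODELAY is set and read back on an ordinary TCP socket.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY is a per-socket option at the IPPROTO_TCP level.
    # A non-zero value asks the stack to send small segments immediately
    # instead of waiting to coalesce them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nodelay_enabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is currently set on the socket (hypothetical helper)."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    tuned = make_low_latency_socket()
    plain = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("tuned socket nodelay:", nodelay_enabled(tuned))
    print("plain socket nodelay:", nodelay_enabled(plain))
    tuned.close()
    plain.close()
```

The option is off unless it is explicitly requested, which is consistent with the report above that a client which never sets it sees the higher small-request latency.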
