On Thu, 4 Sep 2014, Ilya Dryomov wrote:
> On Thu, Sep 4, 2014 at 3:39 PM, Chaitanya Huilgol
> <[email protected]> wrote:
> > Hi,
> >
> > In our benchmarking tests we observed that the ms_tcp_nodelay option in
> > ceph.conf does not affect the kernel rbd client, and, as expected, we see
> > poor latency numbers at lower queue depths for 4K random reads. There is a
> > significant increase in latency from qd=2 to 24, which starts tapering off
> > at higher queue depths.
> >
> > We did not find a relevant kernel_setsockopt() call with TCP_NODELAY in
> > the kernel RBD/libceph (messenger.c) source. Unless we are missing
> > something, it looks like the kernel RBD client currently does not set
> > this, and that is hurting latency numbers at lower queue depths.
> >
> > I have tested with userspace fio (rbd engine) and rados bench, and we see
> > similar latency behavior when ms_tcp_nodelay is set to false. Setting it
> > to true, however, gives consistently low latency numbers at all queue
> > depths.
> >
> > Any ideas/thoughts on this?
> >
> > OS: Ubuntu 14.04
> > Kernel: 3.13.0-24-generic #46-Ubuntu SMP
> > Ceph: latest master
>
> No, we don't set TCP_NODELAY in the kernel client, but I think we can
> add it as an rbd map/mount option. Sage?
We definitely can, and I think more importantly it should be on by default, as it is in userspace. I'm surprised we missed that. :(

IIRC we are carefully setting the MSG_MORE (or CORK?) flag on all but the last write for a message, but I take it there is a socket-level option we missed?

sage
