On Fri, Mar 6, 2015 at 10:18 AM, Nick Fisk <n...@fisk.me.uk> wrote:

> On Fri, Mar 6, 2015 at 9:04 AM, Nick Fisk <n...@fisk.me.uk> wrote:
>
>
>
>
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Jake Young
> Sent: 06 March 2015 12:52
> To: Nick Fisk
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] tgt and krbd
>
>
>
> On Thursday, March 5, 2015, Nick Fisk <n...@fisk.me.uk> wrote:
> Hi All,
>
> Just a heads up after a day’s experimentation.
>
> I believe tgt with its default settings has a small write cache when
> exporting a kernel-mapped RBD. Doing some write tests, I saw 4 times the
> write throughput when using tgt aio + krbd compared to tgt with the
> built-in librbd.
>
> After running the following command against the LUN, which apparently
> disables the write cache, performance dropped back to what I am seeing
> with tgt+librbd, and also the same as fio.
>
> tgtadm --op update --mode logicalunit --tid 2 --lun 3 -P
> mode_page=8:0:18:0x10:0:0xff:0xff:0:0:0xff:0xff:0xff:0xff:0x80:0x14:0:0:0:0:0:0
>
> From that I can only deduce that using tgt + krbd in its default state is
> not 100% safe to use, especially in an HA environment.
>
> Nick
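For reference, page 0x08 in the command above is the SCSI Caching mode page, and the write-cache-enable flag (WCE) is bit 2 of the page's first body byte. A small sketch decoding the string above (assuming tgtadm's `page:subpage:length:bytes` layout; the helper names are mine, not tgt's):

```python
# Illustrative decoder for the tgtadm mode_page string above. Assumes the
# byte layout of the SCSI Caching mode page (page code 0x08): the fields
# after page:subpage:length are the page body following the 2-byte header.

def parse_mode_page(spec: str) -> dict:
    """Parse a tgtadm-style mode_page string into page code and body bytes."""
    fields = [int(f, 0) for f in spec.split(":")]
    return {"page": fields[0], "subpage": fields[1],
            "length": fields[2], "body": fields[3:]}

def write_cache_enabled(spec: str) -> bool:
    """WCE is bit 2 of the first body byte of the Caching mode page (0x08)."""
    mp = parse_mode_page(spec)
    assert mp["page"] == 0x08, "not the Caching mode page"
    return bool(mp["body"][0] & 0x04)

spec = ("8:0:18:0x10:0:0xff:0xff:0:0:0xff:0xff:0xff:0xff:"
        "0x80:0x14:0:0:0:0:0:0")
print(write_cache_enabled(spec))  # False: WCE clear, write cache reported off
```

With 0x10 in that first body byte, WCE is clear, which matches Nick's report that the command tells initiators the write cache is off.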
>
>
>
>
> Hey Nick,
>
> tgt actually does not have any caches. No read, no write.  tgt's design is
> to pass all commands through to the backend as efficiently as possible.
>
>
> http://lists.wpkg.org/pipermail/stgt/2013-May/005788.html
>
> The configuration parameters just inform the initiators whether the
> backend storage has a cache. Clearly this makes a big difference for you.
> What initiator are you using with this test?
>
> Maybe the kernel is doing the caching.  What tuning parameters do you have
> on the krbd disk?
>
> It could be that using aio is much more efficient. Maybe the built-in
> librbd backend isn't doing aio?
>
> Jake
>
>
> Hi Jake,
>
> Hmm, that’s interesting; it’s definitely affecting write behaviour though.
>
> I was running iometer doing single io depth writes in a windows VM on ESXi
> using its software initiator, which as far as I’m aware should be sending
> sync writes for each request.
>
> I saw in iostat on the tgt server that my 128 KB writes were being
> coalesced into ~1024 KB writes, which would explain the performance
> increase. So something somewhere is doing caching, albeit on a small scale.
>
>  The krbd disk was using all default settings. I know tgt's RBD support
> uses synchronous librbd writes, which I suppose might explain the
> difference in the defaults, but that should be the expected behaviour.
>
> Nick
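The coalescing Nick observed would by itself buy a large throughput gain for synchronous writes, since the fixed per-request round trip is paid once per merged write instead of once per 128 KB. A back-of-the-envelope sketch (the 2 ms latency and 200 MB/s bandwidth figures are assumptions for illustration, not measurements from this thread):

```python
# Rough model: time per synchronous request = fixed round-trip latency
# plus transfer time. Numbers below are hypothetical, for illustration.
latency_s = 0.002          # assumed per-request round-trip latency (2 ms)
link_mb_s = 200.0          # assumed backend streaming bandwidth

def throughput_mb_s(request_kb: float) -> float:
    """Effective MB/s for one-at-a-time synchronous writes of a given size."""
    request_mb = request_kb / 1024.0
    time_per_req = latency_s + request_mb / link_mb_s
    return request_mb / time_per_req

uncoalesced = throughput_mb_s(128)    # eight separate 128 KB writes
coalesced = throughput_mb_s(1024)     # one merged ~1 MB write
print(f"{uncoalesced:.0f} MB/s vs {coalesced:.0f} MB/s "
      f"({coalesced / uncoalesced:.1f}x)")
```

With these assumed numbers, merging eight 128 KB writes into one 1 MB write roughly triples effective bandwidth, in the same ballpark as the 4x Nick measured.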
>
>
>
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Jake Young
> Sent: 06 March 2015 15:07
>
> To: Nick Fisk
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] tgt and krbd
>
>
>
> My initiator is also VMware software iSCSI.  I had my tgt iscsi targets'
> write-cache setting off.
>
> I turned write and read cache on in the middle of creating a large
> eager-zeroed disk (tgt has no VAAI support, so this is all regular
> synchronous IO) and it did give me a clear performance boost.
>
> Not orders of magnitude, but maybe 15% faster.
>
> If the image makes it to the list, the yellow line is write KB/s.  It went
> from about 85 MB/s to about 100 MB/s.  What was more noticeable was that
> the latency (grey line) went from around 250 ms to 130 ms.
>
> [image: Inline image 1]
>
> I'm pretty sure this IO (zeroing) is always 1MB writes, so I don't think
> this caused my write size to change.  Maybe it did something to the iSCSI
> packets?
>
>
>
> Jake
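Jake's figures above can be cross-checked with Little's law (in-flight IO = IOPS × latency). Assuming the zeroing really is 1 MB writes, the implied number of outstanding requests actually dropped when the cache flags went on; a rough sketch using the numbers read off the graph:

```python
# Little's law: in-flight requests = arrival rate (IOPS) x latency.
# Throughput and latency are taken from the graph described above;
# the 1 MB write size is an assumption.

def implied_queue_depth(mb_s: float, write_mb: float, latency_s: float) -> float:
    """Outstanding requests implied by throughput and latency."""
    iops = mb_s / write_mb
    return iops * latency_s

before = implied_queue_depth(85, 1.0, 0.250)   # cache flags off
after = implied_queue_depth(100, 1.0, 0.130)   # cache flags on
print(round(before), round(after))  # 21 13
```

With these numbers the implied concurrency falls from about 21 to about 13 in-flight writes, so the setting seems to change how requests are queued rather than just raw device speed.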
>
>
>
>
> Hi Jake,
>
>
>
> Good to see it’s not just me.
>
>
>
> I’m guessing that the fact you are doing 1 MB writes means the latency
> difference has a less noticeable impact on the overall write bandwidth.
> What I have been discovering with Ceph + iSCSI is that, due to all the
> extra hops (client -> iSCSI proxy -> primary OSD -> secondary OSD), you get
> a lot of latency serialisation, which dramatically impacts single-threaded
> IOPS at small IO sizes.
>

That makes sense.  I don't really understand how latency goes down if
tgt isn't actually doing any caching, though.
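Nick's serialisation point can be made concrete: at queue depth 1, each write must complete the whole round trip before the next can start, so IOPS is simply 1/latency and every hop adds its delay in series. The per-hop latencies below are made-up illustrative values, not measurements:

```python
# QD=1 synchronous writes: each IO traverses every hop before the next
# can start, so per-IO latency is the sum of the hops and IOPS = 1/latency.
# Per-hop latencies are illustrative assumptions only.
hops_ms = {
    "client -> iscsi proxy": 0.3,
    "proxy -> primary OSD": 0.5,
    "primary -> secondary OSD (replication)": 0.5,
    "commit + acks back": 0.7,
}

total_ms = sum(hops_ms.values())          # ~2.0 ms round trip
iops = 1000.0 / total_ms                  # ~500 IOPS at queue depth 1
for size_kb in (4, 128):
    mb_s = iops * size_kb / 1024.0
    print(f"{size_kb} KB writes: {iops:.0f} IOPS -> {mb_s:.1f} MB/s")
```

At small IO sizes the bandwidth collapses (here ~2 MB/s at 4 KB) even though the same latency supports healthy bandwidth at 1 MB writes, which is exactly the effect Nick describes.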


>
>
> A few days back I tested adding a tiny SSD write cache on the iSCSI proxy,
> and this had a dramatic effect in “hiding” the latency from the
> client.
>
>
>
> Nick
>

After seeing your results, I've been considering experimenting with that.
Currently, my iSCSI proxy nodes are VMs.

I would like to build a few dedicated servers with fast SSDs or fusion-io
devices.  It depends on my budget; it's hard to justify getting a card that
costs 10x the rest of the server.  I would run all my tgt instances in
containers pointing to the rbd disk+cache device.  A fusion-io device could
support many tgt containers.

I don't really want to go back to krbd.  I have a few RBDs that are format
2 with striping; there aren't any stable kernels that support that (or any
kernels at all yet for "fancy striping").  I wish there was a way to
incorporate a local cache device into tgt with librbd backends.

Jake
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
