> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Maged Mokhtar
> Sent: 06 April 2017 12:21
> To: Brady Deetz <bde...@gmail.com>; ceph-users <ceph-us...@ceph.com>
> Subject: Re: [ceph-users] rbd iscsi gateway question
> 
> The io hang (it is actually a pause, not a hang) is done by Ceph only in
> the case of a simultaneous failure of 2 hosts or 2 OSDs on separate
> hosts. A single host/OSD being out will not cause this. In the PetaSAN
> project www.petasan.org we use LIO/krbd. We have done a lot of tests on
> VMware: in case of an IO failure, the IO will block for approx 30s on
> the VMware ESX host (the default timeout, but it can be configured) and
> then resume on the other MPIO path.
> 
> We are using a custom LIO kernel based on SLE 12, as used in SUSE's
> enterprise storage offering; it supports a direct rbd backstore. I
> believe there was a request to include it in the mainline kernel, but it
> did not happen, probably because everyone is waiting for the TCMU
> solution, which will be a better/cleaner design.
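
On the pause described above: Ceph blocks IO to a placement group once it
has fewer than min_size replicas available, which matches the
two-simultaneous-failure case Maged describes (assuming the usual size=3 /
min_size=2 and a per-host failure domain). A quick way to check the
relevant pool settings; the pool name "rbd" below is just an example:

  # Replicas kept per object, and the minimum needed to keep serving IO.
  # With size=3 / min_size=2, losing two OSDs on separate hosts leaves the
  # affected PGs below min_size and IO pauses until recovery catches up.
  ceph osd pool get rbd size
  ceph osd pool get rbd min_size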

Yes, I should have mentioned this: if you are using the SUSE kernel, they
have a fix for this spiral-of-death problem. Any other distribution or
vanilla kernel will hang if a Ceph IO takes longer than about 5-10s. It's
the path-failure handling which is the problem: LIO tries to abort the
outstanding IO, but RBD doesn't support aborts yet.
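
To make the two setups concrete, here is a rough sketch of what they look
like in targetcli (pool/image names are examples, and the exact backstore
type name on the SUSE kernel may differ):

  # Kernel with the rbd backstore (the SLE 12 based one mentioned above):
  # LIO talks to the cluster directly, so the backstore should show up here:
  targetcli /backstores ls

  # Vanilla kernel fallback: map the image with krbd and export the block
  # device through the generic block backstore (iblock on older targetcli).
  # This is the combination that hits the abort problem described above.
  rbd map rbd/disk01
  targetcli /backstores/block create name=disk01 dev=/dev/rbd/rbd/disk01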

> 
> Cheers /maged
> 

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
