> On Jul 28, 2015, at 7:57 PM, Ilya Dryomov <idryo...@gmail.com> wrote:
> 
> On Tue, Jul 28, 2015 at 2:46 PM, van <chaofa...@owtware.com> wrote:
>> Hi, Ilya,
>> 
>>  In the dmesg, there are also a lot of libceph socket errors, which I
>> think may be caused by my stopping the ceph service without unmapping
>> the rbd device.
> 
> Well, sure enough, if you kill all the OSDs, the filesystem mounted on
> top of the rbd device will get stuck.

Sure, it will get stuck if the OSDs are stopped. And since RADOS requests
have a retry policy, the stuck requests will recover after I start the
daemons again.
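
For what it's worth, the retry queue is visible from the kernel side: the
kernel client exposes its in-flight requests through debugfs. Below is a
minimal sketch, assuming debugfs is mounted at /sys/kernel/debug and at
least one rbd device is mapped (the exact layout of the "osdc" file
varies by kernel version, so treat the line count as approximate):

#!/usr/bin/env python
# Sketch: count the kernel client's in-flight OSD requests via debugfs.
# Needs root; assumes debugfs at /sys/kernel/debug. Each non-empty line
# of "osdc" is one outstanding request -- while the OSDs are down, this
# list is the retry queue that drains once the daemons come back.
import glob
import os

for osdc in glob.glob('/sys/kernel/debug/ceph/*/osdc'):
    client = os.path.basename(os.path.dirname(osdc))
    try:
        with open(osdc) as f:
            requests = [l for l in f.read().splitlines() if l.strip()]
    except IOError:  # not root, or debugfs not mounted
        continue
    print('%s: %d in-flight requests' % (client, len(requests)))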

But in my case, the OSDs are running normally and the librbd API can read
and write without problems. Meanwhile, a heavy fio test against the
filesystem mounted on top of the rbd device gets stuck.
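
This is roughly how I exercise the user-space path, using the python-rbd
bindings (pool "rbd" and image "test" are placeholders for my test
setup):

#!/usr/bin/env python
# Sketch: exercise the user-space (librbd) path independently of the
# kernel client. Requires the python-rados / python-rbd bindings; the
# pool and image names below are placeholders.
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    ioctx = cluster.open_ioctx('rbd')
    try:
        image = rbd.Image(ioctx, 'test')
        try:
            payload = b'x' * 4096
            image.write(payload, 0)               # 4 KiB at offset 0
            assert image.read(0, 4096) == payload
            print('librbd read/write OK')
        finally:
            image.close()
    finally:
        ioctx.close()
finally:
    cluster.shutdown()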

I wonder if this phenomenon is triggered by running the rbd kernel client
on machines that also run ceph daemons, i.e. the annoying loopback mount
deadlock issue.

In my opinion, if it were the loopback mount deadlock, the OSDs would
become unresponsive, no matter whether the requests come from user space
(like the librbd API) or from the kernel client.
Am I right?
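
One way to check this, I think, is to see which tasks sit in
uninterruptible sleep while fio hangs. A rough sketch: if the loopback
deadlock theory holds, ceph-osd threads should show up in D state
together with the stuck fio and kernel-client tasks; if only the client
side is blocked, the daemons are still healthy:

#!/usr/bin/env python
# Sketch: list tasks in uninterruptible (D) sleep by scanning
# /proc/<pid>/stat. The comm field is parenthesised and may itself
# contain spaces, so parse around the closing parenthesis.
import glob

for path in glob.glob('/proc/[0-9]*/stat'):
    try:
        with open(path) as f:
            stat = f.read()
    except IOError:  # process exited while we were scanning
        continue
    comm = stat[stat.index('(') + 1:stat.rindex(')')]
    state = stat[stat.rindex(')') + 2]
    if state == 'D':
        print('blocked: pid=%s comm=%s' % (stat.split()[0], comm))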

If so, my case seems to be triggered by another bug.

Anyway, it seems that I should at least separate the clients from the
daemons.

Thanks.

> 
> Thanks,
> 
>                Ilya
