On 04/24/17 22:23, Phil Lacroute wrote:
> Jason,
>
> Thanks for the suggestion.  That seems to show it is not the OSD that
> got stuck:
>
> ceph7:~$ sudo rbd -c debug/ceph.conf info app/image1
> …
> 2017-04-24 13:13:49.761076 7f739aefc700  1 --
> 192.168.206.17:0/1250293899 --> 192.168.206.13:6804/22934 --
> osd_op(client.4384.0:3 1.af6f1e38 rbd_header.1058238e1f29 [call
> rbd.get_size,call rbd.get_object_prefix] snapc 0=[]
> ack+read+known_if_redirected e27) v7 -- ?+0 0x7f737c0077f0 con
> 0x7f737c0064e0
> …
> 2017-04-24 13:14:04.756328 7f73a2880700  1 --
> 192.168.206.17:0/1250293899 --> 192.168.206.13:6804/22934 -- ping
> magic: 0 v1 -- ?+0 0x7f7374000fc0 con 0x7f737c0064e0
>
> ceph0:~$ sudo ceph pg map 1.af6f1e38
> osdmap e27 pg 1.af6f1e38 (1.38) -> up [11,16,2] acting [11,16,2]
>
> ceph3:~$ sudo ceph daemon osd.11 ops
> {
>     "ops": [],
>     "num_ops": 0
> }
>
> I repeated this a few times and it’s always the same command and same
> placement group that hangs, but OSD11 has no ops (and neither do OSD16
> and OSD2, although I think that’s expected).
>
> Is there other tracing I should do on the OSD or something more to
> look at on the client?
>
> Thanks,
> Phil
Does it still happen if you disable exclusive-lock, or, separately, fast-diff
and object-map?
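
In case it's useful, a rough sketch of how I'd toggle them on the image from
your output (app/image1, taken from the quoted command; adjust to taste).
fast-diff depends on object-map, and object-map on exclusive-lock, so they
have to be disabled in that order:

  # see which features are currently enabled
  rbd info app/image1

  # disable fast-diff and object-map first, then exclusive-lock
  rbd feature disable app/image1 fast-diff object-map
  rbd feature disable app/image1 exclusive-lock

  # re-enable later in the reverse order if you want them back
  rbd feature enable app/image1 exclusive-lock object-map fast-diff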

I have a similar problem: VMs with those three features enabled hang and need
kill -9, and with them disabled they never hang.