I have set the logfile to be opened with SYNC, and that seems to be giving me 
more consistent output.
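
For reference, a minimal sketch of the kind of logging I mean, assuming "SYNC" 
corresponds to the O_SYNC flag to open(2). The helper names open_sync_log and 
log_line are made up for illustration and are not the actual tapdisk logging 
code. The idea is that each line goes out in a single write() that does not 
return until the data is on disk, so the last line in the file should be the 
last one logged before the crash.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open a log file for appending with O_SYNC, so each
 * write() returns only after the data has been pushed to the device. */
int open_sync_log(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND | O_SYNC, 0644);
}

/* Hypothetical helper: format one line and pass it to the kernel in a single
 * write(), bypassing stdio buffering. Returns bytes written, or -1 on error. */
ssize_t log_line(int fd, const char *fmt, ...)
{
    char buf[512];
    va_list ap;
    int len;

    assert(fd >= 0 && fmt != NULL);

    va_start(ap, fmt);
    len = vsnprintf(buf, sizeof(buf) - 1, fmt, ap);
    va_end(ap);

    if (len < 0)
        return -1;
    if ((size_t)len > sizeof(buf) - 2)
        len = sizeof(buf) - 2;          /* truncate overly long messages */

    buf[len++] = '\n';                  /* terminate the log line */
    return write(fd, buf, (size_t)len);
}
```

Usage would be along the lines of fd = open_sync_log("/tmp/debug.log") 
followed by log_line(fd, "write_seq %d", seq) at each point of interest; the 
path here is only an example. This is slow, so it is only suitable while 
chasing the crash.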

I see the crash is mostly happening around queue_aio_write. Most of the time 
the last thing I see is this entry "librados: queue_aio_write 0x7f0928004390 
completion 0x1ea65d0 write_seq 147". I've never seen it happen in a read 
operation, and it always happens when there are lots of writes queued.

From adding further debug statements, I can see that the exact point of the 
crash is very soon after that but not at a constant point.

I'm thinking that the crash is actually happening somewhere in the callback 
chain, although definitely before librbd callbacks are invoked, as debug prints 
in the rados end of the callbacks show nothing.

Where is the 'top' of the callback chain on a write? I can see that librados 
calls librbd (which then calls the callback in tapdisk), but what calls 
librados?

Thanks

James
