Guys,
I'm running a three-node cluster (version 0.53). After running for a
while under constant write load generated by two daemons, I am seeing
one request reported as blocked:
[WRN] 1 slow requests, 1 included below; oldest blocked for 7550.891933 secs
2012-10-29 10:33:54.689563 osd.0
Interesting, I don't think the request is stalled. I think we
completed the request, but leaked a reference to the request
structure. Do you see IO from the clients stall? What is the output
of ceph -s? What version are you running (ceph-osd --version)?
-Sam
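
For illustration only, here is a toy sketch of the kind of leak described above. It is not Ceph's code: the ToyOpTracker class, its methods, and the 30-second threshold are all hypothetical. The idea is that a tracker warns about any op still registered after a threshold, so an op that completed but was never unregistered keeps being reported as "blocked" while client IO carries on normally.

```python
class ToyOpTracker:
    """Hypothetical in-flight op tracker; not Ceph's implementation."""

    def __init__(self, warn_after_secs):
        self.warn_after_secs = warn_after_secs
        self.in_flight = {}  # op id -> time the op was registered

    def start(self, op_id, now):
        # Register the op when it arrives.
        self.in_flight[op_id] = now

    def finish(self, op_id):
        # Unregister the op; skipping this call is the "leak".
        self.in_flight.pop(op_id, None)

    def slow_ops(self, now):
        # Report every op still registered past the threshold.
        return {
            op_id: now - started
            for op_id, started in self.in_flight.items()
            if now - started >= self.warn_after_secs
        }


tracker = ToyOpTracker(warn_after_secs=30.0)
tracker.start("write-1", now=0.0)
tracker.start("write-2", now=0.0)

# Both writes were answered, but only write-1 was unregistered.
tracker.finish("write-1")

slow = tracker.slow_ops(now=7550.0)
print(slow)
```

In this toy model the warning for write-2 never clears and its reported age keeps growing, even though nothing is actually waiting on it.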
On Mon, Oct 29, 2012 at 10:53 AM,
The clients' IO held up fine, and I don't see any signs of them
blocking. The writes are done inside an aio_operate() rados call.
I don't see any record of a failed write in the client logs either.
ceph -s
health HEALTH_OK
monmap e1: 1 mons at {a=10.25.36.11:6789/0}, election epoch 2,