Hi Stefan,

On Thu, 23 Sep 2010, Stefan Majer wrote:

> Hi,
> 
> we saw the following assert on one of our OSDs (16 in total).
> 
> osd/OSD.cc: In function 'void OSD::start_recovery_op(PG*, const sobject_t&)':
> osd/OSD.cc:4250: FAILED assert(recovery_oids.count(soid) == 0)
>  1: (PG::start_recovery_op(sobject_t const&)+0x127) [0x525627]
>  2: (ReplicatedPG::recover_object_replicas(sobject_t const&)+0x191) [0x482881]
>  3: (ReplicatedPG::recover_replicas(int)+0x2db) [0x482ddb]
>  4: (ReplicatedPG::start_recovery_ops(int)+0x92) [0x4832f2]
>  5: (OSD::do_recovery(PG*)+0x1e3) [0x4b8cf3]
>  6: (ThreadPool::worker()+0x291) [0x5ac5d1]
>  7: (ThreadPool::WorkThread::entry()+0xd) [0x4ec93d]
>  8: (Thread::_entry_func(void*)+0x7) [0x46eee7]
>  9: (()+0x77e1) [0x7f952e6917e1]
>  10: (clone()+0x6d) [0x7f952d8b551d]
> 
> Any hints on how to further nail down this problem?

Without logs, it's hard to tell what caused it.  Has it only happened the 
one time?  Did the OSD behave when it was restarted?

Generally speaking, 'debug osd = 20' and 'debug ms = 1' would capture the 
context needed to identify the problem, but it's a lot of logging and 
will slow things down somewhat.
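
For reference, a minimal sketch of how those two settings might look in 
ceph.conf. Putting them in the [osd] section is an assumption here (it 
applies them to every OSD on hosts using that file); adjust to match your 
own layout, and restart the affected OSD so they take effect.

```ini
[osd]
        ; verbose OSD logging, including recovery activity (high volume)
        debug osd = 20
        ; log each message sent and received by the messenger
        debug ms = 1
```

Since the log volume is high, it is probably worth removing these again 
once the assert has been reproduced.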

sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html