Public bug reported:

After an unfortunate incident with dhcpd going away, we lost 3 of the 6
nodes in our ceph cluster and had to remotely power cycle them to get
them back.  Now that everything is back up, the ceph cluster has mostly
recovered, but a couple of PGs were stuck in an inconsistent state,
so I ran 'ceph osd repair' on one of the OSDs involved in the
inconsistent PGs.  It ran for a while and fixed some things, and then
crashed with this assertion failure:

2013-09-18 18:52:24.116439 7fdf4e2d9700 -1 osd/ReplicatedPG.cc: In function 'void ReplicatedPG::recover_got(hobject_t, eversion_t)' thread 7fdf4e2d9700 time 2013-09-18 18:52:24.035055
osd/ReplicatedPG.cc: 5351: FAILED assert(missing.num_missing() == 0)

 ceph version 0.48.3argonaut (commit:920f82e805efec2cae05b79c155c07df0f3ed5dd)
 1: (ReplicatedPG::recover_got(hobject_t, eversion_t)+0x4d4) [0x7fdf60c29794]
 2: (ReplicatedPG::submit_push_complete(ObjectRecoveryInfo&, ObjectStore::Transaction*)+0x490) [0x7fdf60c2c950]
 3: (ReplicatedPG::handle_pull_response(std::tr1::shared_ptr<OpRequest>)+0x4c6) [0x7fdf60c4ac26]
 4: (ReplicatedPG::sub_op_push(std::tr1::shared_ptr<OpRequest>)+0x96) [0x7fdf60c4ba66]
 5: (ReplicatedPG::do_sub_op(std::tr1::shared_ptr<OpRequest>)+0x3f7) [0x7fdf60c4bf17]
 6: (PG::do_request(std::tr1::shared_ptr<OpRequest>)+0xa7) [0x7fdf60d03a07]
 7: (OSD::dequeue_op(PG*)+0x23a) [0x7fdf60cc156a]
 8: (ThreadPool::worker()+0x4c4) [0x7fdf60e86dd4]
 9: (ThreadPool::WorkThread::entry()+0xd) [0x7fdf60cdab2d]
 10: (()+0x7e9a) [0x7fdf604aee9a]
 11: (clone()+0x6d) [0x7fdf5e9baccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

This was followed by about 10K more lines of log output about what it
was doing.  This is ceph 0.48.3-0ubuntu1~cloud0 from the Folsom pocket
of the Ubuntu Cloud Archive, and the machine is running Ubuntu 12.04 LTS.

** Affects: ceph (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1227327

Title:
  ceph osd repair fails with assert(missing.num_missing() == 0)
