On 4/12/16 12:01 AM, Gregory Farnum wrote:
On Mon, Apr 11, 2016 at 3:45 PM, Eric Hall <eric.h...@vanderbilt.edu> wrote:
Power failure in data center has left 3 mons unable to start with
mon/OSDMonitor.cc: 125: FAILED assert(version >= osdmap.epoch)

Have found similar problem discussed at
http://irclogs.ceph.widodh.nl/index.php?date=2015-05-29, but am unsure how
to proceed.

If I read
ceph-kvstore-tool /var/lib/ceph/mon/ceph-cephsecurestore1/store.db list
correctly, they believe osdmap is 1, but they also have osdmap:full_38456
and osdmap:38630 in the store.

Exactly what are you reading that's giving you those values?
The "real" OSDMap epoch is going to be at least 38630...if you're very
lucky it will be exactly 38630. But since it reset itself to 1 in the
monitor's store, I doubt you'll be lucky.

I'm getting this from ceph-kvstore-tool list.
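
For anyone reading the same listing, here is a minimal Python sketch that scans the output of the list command above for the highest osdmap epochs the monitor store still holds. It assumes each key appears on its own line in the prefix:key form quoted in this thread (osdmap:38630, osdmap:full_38456); other Ceph versions may print the listing differently, so the pattern may need adjusting. The function name osdmap_epochs and the script name used below are made up for illustration. It only reports what one monitor's store contains, which is not necessarily the newest epoch in the cluster.

```python
import re
import sys

# Assumed line shape, based on the keys quoted in this thread:
#   osdmap:38630        (incremental map, epoch 38630)
#   osdmap:full_38456   (full map, epoch 38456)
# A tab or space separator is also accepted, in case the tool prints
# the prefix and key in separate columns.
KEY_RE = re.compile(r"^osdmap[:\t ](full_)?(\d+)$")


def osdmap_epochs(listing):
    """Return (max_incremental, max_full) osdmap epochs found in the text.

    Either value is None when no matching key is present.
    """
    max_inc = None
    max_full = None
    for line in listing.splitlines():
        match = KEY_RE.match(line.strip())
        if match is None:
            continue  # other prefixes, or osdmap keys without a numeric epoch
        epoch = int(match.group(2))
        if match.group(1):
            max_full = epoch if max_full is None else max(max_full, epoch)
        else:
            max_inc = epoch if max_inc is None else max(max_inc, epoch)
    return max_inc, max_full


if __name__ == "__main__":
    # Reads the key listing from standard input.
    inc, full = osdmap_epochs(sys.stdin.read())
    print("highest incremental osdmap epoch:", inc)
    print("highest full osdmap epoch:", full)
```

Usage, with the script saved under the hypothetical name max_osdmap_epoch.py:

ceph-kvstore-tool /var/lib/ceph/mon/ceph-cephsecurestore1/store.db list | python3 max_osdmap_epoch.py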

So in order to get your cluster back up, you need to find the largest
osdmap version in your cluster. You can do that, very tediously, by
looking at the OSDMap stores. Or you may have debug logs indicating it
more easily on the monitors.

I don't see info like this in any logs.  How/where do I inspect this?

Thank you,
--
Eric
