Sorry to take so long in replying....

I ended up evacuating the data and rebuilding using Luminous with BlueStore
OSDs.  I still need to do my usual drive/host failure testing before going
live.  Of course other things are burning right now and have my attention.
Hopefully I can finish that work in the next few days.
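For anyone who hits the same corruption later, the recovery sequence described in the thread below can be sketched roughly as follows. This is a minimal sketch, not a verified procedure: the py-leveldb `RepairDB` call and the store paths in the comments are assumptions, so only the safe `.ldb` → `.sst` rename step is actually demonstrated here, on a throwaway directory.

```python
# Sketch of the repair sequence from the thread below.  Assumptions:
# the py-leveldb bindings expose leveldb.RepairDB(), and the paths in
# the comments are typical examples, not taken from a real cluster.
import os
import tempfile


def rename_ldb_to_sst(store_dir):
    """After a leveldb repair, rename any recovered .ldb files back to
    the .sst names the daemon expects.  Returns the new names."""
    renamed = []
    for name in sorted(os.listdir(store_dir)):
        if name.endswith(".ldb"):
            base = name[: -len(".ldb")]
            os.rename(
                os.path.join(store_dir, name),
                os.path.join(store_dir, base + ".sst"),
            )
            renamed.append(base + ".sst")
    return renamed


# The actual repair would look something like this (NOT run here,
# and the path is a hypothetical example):
#
#     import leveldb
#     leveldb.RepairDB("/var/lib/ceph/osd/ceph-0/current/omap")
#
# ...followed by the rename above, a `chown -R ceph:ceph` on the
# store directory, and a restart of the daemon.

# Demo on a scratch directory so this is safe to run anywhere:
d = tempfile.mkdtemp()
open(os.path.join(d, "000123.ldb"), "w").close()
print(rename_ldb_to_sst(d))  # -> ['000123.sst']
```

As in the thread, the rename alone is not enough: ownership has to be restored to ceph:ceph before the mons/OSDs will start cleanly.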

On Wed, Jun 28, 2017 at 10:11 PM, Mazzystr <mazzy...@gmail.com> wrote:

> I should be able to try that tomorrow.
>
> I'll report back afterward.
>
> On Wed, Jun 28, 2017 at 10:09 PM, Brad Hubbard <bhubb...@redhat.com>
> wrote:
>
>> On Thu, Jun 29, 2017 at 11:58 AM, Mazzystr <mazzy...@gmail.com> wrote:
>> > just one MON
>>
>> Try just replacing that MON then?
>>
>> >
>> > On Wed, Jun 28, 2017 at 8:05 PM, Brad Hubbard <bhubb...@redhat.com>
>> wrote:
>> >>
>> >> On Wed, Jun 28, 2017 at 10:18 PM, Mazzystr <mazzy...@gmail.com> wrote:
>> >> > The corruption is back in mons logs...
>> >> >
>> >> > 2017-06-28 08:16:53.078495 7f1a0b9da700  1 leveldb: Compaction error:
>> >> > Corruption: bad entry in block
>> >> > 2017-06-28 08:16:53.078499 7f1a0b9da700  1 leveldb: Waiting after
>> >> > background
>> >> > compaction error: Corruption: bad entry in block
>> >>
>> >> Is this just one MON, or is it in the logs of all of your MONs?
>> >>
>> >> >
>> >> >
>> >> > On Tue, Jun 27, 2017 at 10:42 PM, Mazzystr <mazzy...@gmail.com>
>> wrote:
>> >> >>
>> >> >> 22:16 ccallegar: good grief... talk about a handful of sand in your
>> >> >> eye!  I've been chasing down a "leveldb: Compaction error:
>> >> >> Corruption: bad entry in block" in the mon logs...
>> >> >> 22:17 ccallegar: I ran a python leveldb.repair() and restarted the
>> >> >> OSDs and mons, and my cluster crashed and burned
>> >> >> 22:18 ccallegar: a couple of files ended up in the leveldb lost
>> >> >> dirs.  The path is different for mons and OSDs
>> >> >> 22:19 ccallegar: for the mons, logs showed a MANIFEST file missing.
>> >> >> I moved the file that landed in lost back to its normal position,
>> >> >> chown'd it ceph:ceph, restarted the mons, and they came back online!
>> >> >> 22:21 ccallegar: OSD logs showed an .sst file missing.  Looks like
>> >> >> leveldb.repair() does the needful but names the new file .ldb.  I
>> >> >> renamed the file, chown'd it ceph:ceph, restarted the OSDs, and they
>> >> >> came back online!
>> >> >>
>> >> >> The leveldb corruption log entries have gone away and my cluster is
>> >> >> recovering its way to happiness.
>> >> >>
>> >> >> Hopefully this helps someone else out
>> >> >>
>> >> >> Thanks,
>> >> >> /Chris
>> >> >>
>> >> >>
>> >> >> On Tue, Jun 27, 2017 at 6:39 PM, Mazzystr <mazzy...@gmail.com>
>> wrote:
>> >> >>>
>> >> >>> Hi Ceph Users,
>> >> >>> I've been chasing down some levelDB corruption messages in my mon
>> >> >>> logs.  I ran a python leveldb repair on the mon and osd leveldbs.
>> >> >>> The job caused some files to disappear and a log file to appear in
>> >> >>> the lost directory.  The mon and OSDs now refuse to boot.
>> >> >>>
>> >> >>> Ceph version is kraken 11.02.
>> >> >>>
>> >> >>> There's not a whole lot of info on the internet regarding this.
>> >> >>> Does anyone have any ideas on how to recover from this mess?
>> >> >>>
>> >> >>> Thanks,
>> >> >>> /Chris C
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > ceph-users mailing list
>> >> > ceph-users@lists.ceph.com
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Cheers,
>> >> Brad
>> >
>> >
>>
>>
>>
>> --
>> Cheers,
>> Brad
>>
>
>
