The disk that the monitor is on. There isn't anything for you to configure about a monitor WAL, though, so I'm not sure how that enters into it?
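In case it helps narrow that down, here is a rough way to check which device actually backs the mon store and whether that disk is reporting errors. This is only a sketch: the data path assumes the default cluster name "ceph" and a mon ID equal to the short hostname, and /dev/sdX is a placeholder for whatever device you find.

# Find the device backing the mon data directory (default layout assumed)
df -h /var/lib/ceph/mon/ceph-$(hostname -s)
lsblk -o NAME,TYPE,SIZE,MOUNTPOINT

# Check SMART health on that device (needs smartmontools; /dev/sdX is a placeholder)
smartctl -H /dev/sdX
smartctl -a /dev/sdX | grep -iE 'reallocated|pending|uncorrect'

# Kernel-level I/O errors are also worth a look
dmesg -T | grep -iE 'i/o error|ata|blk'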
On Fri, Feb 16, 2018 at 12:46 PM Behnam Loghmani <behnam.loghm...@gmail.com> wrote:

> Thanks for your reply.
>
> Do you mean that's a problem with the disk I use for WAL and DB?
>
> On Fri, Feb 16, 2018 at 11:33 PM, Gregory Farnum <gfar...@redhat.com> wrote:
>
>> On Fri, Feb 16, 2018 at 7:37 AM Behnam Loghmani <behnam.loghm...@gmail.com> wrote:
>>
>>> Hi there,
>>>
>>> I have a Ceph cluster version 12.2.2 on CentOS 7.
>>>
>>> It is a testing cluster and I set it up 2 weeks ago.
>>> After a few days, I noticed that one of the three mons had stopped (out of quorum), and I can't start it anymore.
>>> I checked the mon service log and the output shows this error:
>>>
>>> """
>>> mon.XXXXXX@-1(probing) e4 preinit clean up potentially inconsistent store state
>>> rocksdb: submit_transaction_sync error: Corruption: block checksum mismatch
>>
>> This bit is the important one. Your disk is bad and it's feeding back corrupted data.
>>
>>> code = 2 Rocksdb transaction:
>>> 0> 2018-02-16 17:37:07.041812 7f45a1e52e40 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/mon/MonitorDBStore.h: In function 'void MonitorDBStore::clear(std::set<std::basic_string<char> >&)' thread 7f45a1e52e40 time 2018-02-16 17:37:07.040846
>>> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/mon/MonitorDBStore.h: 581: FAILED assert(r >= 0)
>>> """
>>>
>>> The only solution I found was to remove this mon from the quorum, remove all of its mon data, and re-add it to the quorum again.
>>> After that, Ceph goes back to a healthy status.
>>>
>>> But now, after some days, this mon has stopped again and I face the same problem.
>>>
>>> My cluster setup is:
>>> 4 osd hosts
>>> total 8 osds
>>> 3 mons
>>> 1 rgw
>>>
>>> This cluster was set up with ceph-volume lvm and wal/db separation on logical volumes.
>>>
>>> Best regards,
>>> Behnam Loghmani
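For anyone searching the archives later: the remove-and-re-add cycle described above is roughly the standard manual monitor re-creation, something like the sketch below. MON_ID, the temp file paths, and the data directory are placeholders for your environment, and this only rebuilds the mon store; it does not address a failing disk underneath, so the checksum errors will come back if the hardware keeps corrupting data.

MON_ID=XXXXXX                                  # placeholder for the broken mon's ID
ceph mon remove $MON_ID                        # drop it from the monmap/quorum

rm -rf /var/lib/ceph/mon/ceph-$MON_ID          # wipe the corrupted store
mkdir -p /var/lib/ceph/mon/ceph-$MON_ID

ceph auth get mon. -o /tmp/mon.keyring         # fetch the mon keyring
ceph mon getmap -o /tmp/monmap                 # fetch the current monmap
ceph-mon -i $MON_ID --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
chown -R ceph:ceph /var/lib/ceph/mon/ceph-$MON_ID

# Depending on your ceph.conf, you may also need to add it back to the monmap
# explicitly:  ceph mon add $MON_ID <mon-ip>[:<port>]
systemctl start ceph-mon@$MON_ID               # start it so it rejoins the quorum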