The disk that the monitor is on. There isn't anything for you to configure about a monitor WAL, though, so I'm not sure how that enters into it?
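In case it helps narrow that down, here is a rough way to check which device actually backs the mon store and whether that disk is reporting errors. This is only a sketch: the data path assumes the default cluster name "ceph" and a mon ID equal to the short hostname, and /dev/sdX is a placeholder for whatever device you find.

# Find the device backing the mon data directory (default layout assumed)
df -h /var/lib/ceph/mon/ceph-$(hostname -s)
lsblk -o NAME,TYPE,SIZE,MOUNTPOINT

# Check SMART health on that device (needs smartmontools; /dev/sdX is a placeholder)
smartctl -H /dev/sdX
smartctl -a /dev/sdX | grep -iE 'reallocated|pending|uncorrect'

# Kernel-level I/O errors are also worth a look
dmesg -T | grep -iE 'i/o error|ata|blk'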
On Fri, Feb 16, 2018 at 12:46 PM Behnam Loghmani <behnam.loghm...@gmail.com> wrote:

> Thanks for your reply.
>
> Do you mean that's a problem with the disk I use for WAL and DB?
>
> On Fri, Feb 16, 2018 at 11:33 PM, Gregory Farnum <gfar...@redhat.com> wrote:
>
>> On Fri, Feb 16, 2018 at 7:37 AM Behnam Loghmani <behnam.loghm...@gmail.com> wrote:
>>
>>> Hi there,
>>>
>>> I have a Ceph cluster version 12.2.2 on CentOS 7.
>>>
>>> It is a testing cluster and I set it up 2 weeks ago.
>>> After a few days, I noticed that one of the three mons had stopped (out of quorum), and I can't start it anymore.
>>> I checked the mon service log and the output shows this error:
>>>
>>> """
>>> mon.XXXXXX@-1(probing) e4 preinit clean up potentially inconsistent store state
>>> rocksdb: submit_transaction_sync error: Corruption: block checksum mismatch
>>
>> This bit is the important one. Your disk is bad and it's feeding back corrupted data.
>>
>>> code = 2 Rocksdb transaction:
>>> 0> 2018-02-16 17:37:07.041812 7f45a1e52e40 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/mon/MonitorDBStore.h: In function 'void MonitorDBStore::clear(std::set<std::basic_string<char> >&)' thread 7f45a1e52e40 time 2018-02-16 17:37:07.040846
>>> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/mon/MonitorDBStore.h: 581: FAILED assert(r >= 0)
>>> """
>>>
>>> The only solution I found was to remove this mon from the quorum, remove all of its mon data, and re-add it to the quorum again.
>>> After that, Ceph goes back to a healthy status.
>>>
>>> But now, after some days, this mon has stopped again and I face the same problem.
>>>
>>> My cluster setup is:
>>> 4 osd hosts
>>> total 8 osds
>>> 3 mons
>>> 1 rgw
>>>
>>> This cluster was set up with ceph-volume lvm and wal/db separation on logical volumes.
>>>
>>> Best regards,
>>> Behnam Loghmani
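For anyone searching the archives later: the remove-and-re-add cycle described above is roughly the standard manual monitor re-creation, something like the sketch below. MON_ID, the temp file paths, and the data directory are placeholders for your environment, and this only rebuilds the mon store; it does not address a failing disk underneath, so the checksum errors will come back if the hardware keeps corrupting data.

MON_ID=XXXXXX                                  # placeholder for the broken mon's ID
ceph mon remove $MON_ID                        # drop it from the monmap/quorum

rm -rf /var/lib/ceph/mon/ceph-$MON_ID          # wipe the corrupted store
mkdir -p /var/lib/ceph/mon/ceph-$MON_ID

ceph auth get mon. -o /tmp/mon.keyring         # fetch the mon keyring
ceph mon getmap -o /tmp/monmap                 # fetch the current monmap
ceph-mon -i $MON_ID --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
chown -R ceph:ceph /var/lib/ceph/mon/ceph-$MON_ID

# Depending on your ceph.conf, you may also need to add it back to the monmap
# explicitly:  ceph mon add $MON_ID <mon-ip>[:<port>]
systemctl start ceph-mon@$MON_ID               # start it so it rejoins the quorum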