Hi there,

I changed the SATA port and the cable of the SSD disk, updated Ceph to version 12.2.3, and rebuilt the OSDs, but when recovery starts the OSDs fail with this error:
2018-02-21 21:12:18.037974 7f3479fe2d00 -1 bluestore(/var/lib/ceph/osd/ceph-7) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x84c097b0, expected 0xaf1040a2, device location [0x10000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
2018-02-21 21:12:18.038002 7f3479fe2d00 -1 osd.7 0 OSD::init() : unable to read osd superblock
2018-02-21 21:12:18.038009 7f3479fe2d00 1 bluestore(/var/lib/ceph/osd/ceph-7) umount
2018-02-21 21:12:18.038282 7f3479fe2d00 1 stupidalloc 0x0x55e99236c620 shutdown
2018-02-21 21:12:18.038308 7f3479fe2d00 1 freelist shutdown
2018-02-21 21:12:18.038336 7f3479fe2d00 4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.3/rpm/el7/BUILD/ceph-12.2.3/src/rocksdb/db/db_impl.cc:217] Shutdown: canceling all background work
2018-02-21 21:12:18.041561 7f3465561700 4 rocksdb: (Original Log Time 2018/02/21-21:12:18.041514) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.3/rpm/el7/BUILD/ceph-12.2.3/src/rocksdb/db/compaction_job.cc:621] [default] compacted to: base level 1 max bytes base 268435456 files[5 0 0 0 0 0 0] max score 0.00, MB/sec: 2495.2 rd, 10.1 wr, level 1, files in(5, 0) out(1) MB in(213.6, 0.0) out(0.9), read-write-amplify(1.0) write-amplify(0.0) Shutdown in progress: Database shutdown or Column
2018-02-21 21:12:18.041569 7f3465561700 4 rocksdb: (Original Log Time 2018/02/21-21:12:18.041545) EVENT_LOG_v1 {"time_micros": 1519234938041530, "job": 3, "event": "compaction_finished", "compaction_time_micros": 89747, "output_level": 1, "num_output_files": 1, "total_output_size": 902552, "num_input_records": 4470, "num_output_records": 4377, "num_subcompactions": 1, "num_single_delete_mismatches": 0, "num_single_delete_fallthrough": 44, "lsm_state": [5, 0, 0, 0, 0, 0, 0]}
2018-02-21 21:12:18.041663 7f3479fe2d00 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1519234938041657, "job": 4, "event": "table_file_deletion", "file_number": 249}
2018-02-21 21:12:18.042144 7f3479fe2d00 4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.3/rpm/el7/BUILD/ceph-12.2.3/src/rocksdb/db/db_impl.cc:343] Shutdown complete
2018-02-21 21:12:18.043474 7f3479fe2d00 1 bluefs umount
2018-02-21 21:12:18.043775 7f3479fe2d00 1 stupidalloc 0x0x55e991f05d40 shutdown
2018-02-21 21:12:18.043784 7f3479fe2d00 1 stupidalloc 0x0x55e991f05db0 shutdown
2018-02-21 21:12:18.043786 7f3479fe2d00 1 stupidalloc 0x0x55e991f05e20 shutdown
2018-02-21 21:12:18.043826 7f3479fe2d00 1 bdev(0x55e992254600 /dev/vg0/wal-b) close
2018-02-21 21:12:18.301531 7f3479fe2d00 1 bdev(0x55e992255800 /dev/vg0/db-b) close
2018-02-21 21:12:18.545488 7f3479fe2d00 1 bdev(0x55e992254400 /var/lib/ceph/osd/ceph-7/block) close
2018-02-21 21:12:18.650473 7f3479fe2d00 1 bdev(0x55e992254000 /var/lib/ceph/osd/ceph-7/block) close
2018-02-21 21:12:18.900003 7f3479fe2d00 -1 ** ERROR: osd init failed: (22) Invalid argument
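(For anyone following along: the checksum failure can be checked offline with the same deep fsck I used earlier in this thread; the path below is for osd.7 on this node.)

# ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-7 --deep 1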
On Wed, Feb 21, 2018 at 5:06 PM, Behnam Loghmani <behnam.loghm...@gmail.com> wrote:

> but the disks pass all the tests with smartctl and badblocks, and there
> aren't any errors on the disks.
> because the SSD contains the WAL/DB of the OSDs, it's difficult to test
> it on the other cluster nodes.
>
> On Wed, Feb 21, 2018 at 4:58 PM, <kna...@gmail.com> wrote:
>
>> Could the problem be related to some faulty hardware (RAID controller,
>> port, cable) but not the disk? Does the "faulty" disk work OK on
>> another server?
>>
>> Behnam Loghmani wrote on 21/02/18 16:09:
>>
>>> Hi there,
>>>
>>> I changed the SSD on the problematic node for a new one and
>>> reconfigured the OSDs and the MON service on it,
>>> but the problem occurred again with:
>>>
>>> "rocksdb: submit_transaction error: Corruption: block checksum mismatch
>>> code = 2"
>>>
>>> I'm fully confused now.
>>>
>>> On Tue, Feb 20, 2018 at 5:16 PM, Behnam Loghmani <behnam.loghm...@gmail.com> wrote:
>>>
>>> Hi Caspar,
>>>
>>> I checked the filesystem and there aren't any errors on it.
>>> The disk is an SSD, it doesn't have any attribute related to wear
>>> level in smartctl, and the filesystem is mounted with default options
>>> and no discard.
>>>
>>> My Ceph structure on this node is like this:
>>>
>>> it has osd, mon and rgw services
>>> 1 SSD for the OS and WAL/DB
>>> 2 HDDs
>>>
>>> The OSDs were created by ceph-volume lvm.
>>>
>>> The whole SSD is in one VG:
>>> the OS is on the root LV
>>> OSD.1's DB is on db-a
>>> OSD.1's WAL is on wal-a
>>> OSD.2's DB is on db-b
>>> OSD.2's WAL is on wal-b
>>>
>>> output of lvs:
>>>
>>> data-a  data-a  -wi-a-----
>>> data-b  data-b  -wi-a-----
>>> db-a    vg0     -wi-a-----
>>> db-b    vg0     -wi-a-----
>>> root    vg0     -wi-ao----
>>> wal-a   vg0     -wi-a-----
>>> wal-b   vg0     -wi-a-----
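>>> (For clarity: I don't have the exact invocations at hand, but the
>>> OSDs were created with ceph-volume roughly like the line below,
>>> reusing the VG/LV names from the lvs output above; osd.2 used
>>> data-b/db-b/wal-b accordingly:)
>>>
>>> # ceph-volume lvm create --bluestore --data data-a/data-a --block.db vg0/db-a --block.wal vg0/wal-a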
>>> after making a heavy write on the radosgw, OSD.1 and OSD.2 stopped
>>> with the "block checksum mismatch" error.
>>> Now the MON and OSD services on this node have stopped working with
>>> this error.
>>>
>>> I think my issue is related to this bug:
>>> http://tracker.ceph.com/issues/22102
>>>
>>> I ran
>>> # ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-1 --deep 1
>>> but it returns the same error:
>>>
>>> *** Caught signal (Aborted) **
>>> in thread 7fbf6c923d00 thread_name:ceph-bluestore-
>>> 2018-02-20 16:44:30.128787 7fbf6c923d00 -1 abort: Corruption: block
>>> checksum mismatch
>>> ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
>>> 1: (()+0x3eb0b1) [0x55f779e6e0b1]
>>> 2: (()+0xf5e0) [0x7fbf61ae15e0]
>>> 3: (gsignal()+0x37) [0x7fbf604d31f7]
>>> 4: (abort()+0x148) [0x7fbf604d48e8]
>>> 5: (RocksDBStore::get(std::string const&, char const*, unsigned long, ceph::buffer::list*)+0x1ce) [0x55f779d2b5ce]
>>> 6: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x545) [0x55f779cd8f75]
>>> 7: (BlueStore::_fsck(bool, bool)+0x1bb5) [0x55f779cf1a75]
>>> 8: (main()+0xde0) [0x55f779baab90]
>>> 9: (__libc_start_main()+0xf5) [0x7fbf604bfc05]
>>> 10: (()+0x1bc59f) [0x55f779c3f59f]
>>> 2018-02-20 16:44:30.131334 7fbf6c923d00 -1 *** Caught signal (Aborted) **
>>> in thread 7fbf6c923d00 thread_name:ceph-bluestore-
>>>
>>> ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
>>> 1: (()+0x3eb0b1) [0x55f779e6e0b1]
>>> 2: (()+0xf5e0) [0x7fbf61ae15e0]
>>> 3: (gsignal()+0x37) [0x7fbf604d31f7]
>>> 4: (abort()+0x148) [0x7fbf604d48e8]
>>> 5: (RocksDBStore::get(std::string const&, char const*, unsigned long, ceph::buffer::list*)+0x1ce) [0x55f779d2b5ce]
>>> 6: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x545) [0x55f779cd8f75]
>>> 7: (BlueStore::_fsck(bool, bool)+0x1bb5) [0x55f779cf1a75]
>>> 8: (main()+0xde0) [0x55f779baab90]
>>> 9: (__libc_start_main()+0xf5) [0x7fbf604bfc05]
>>> 10: (()+0x1bc59f) [0x55f779c3f59f]
>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>> needed to interpret this.
>>>
>>> -1> 2018-02-20 16:44:30.128787 7fbf6c923d00 -1 abort: Corruption: block checksum mismatch
>>> 0> 2018-02-20 16:44:30.131334 7fbf6c923d00 -1 *** Caught signal (Aborted) **
>>> in thread 7fbf6c923d00 thread_name:ceph-bluestore-
>>>
>>> ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
>>> 1: (()+0x3eb0b1) [0x55f779e6e0b1]
>>> 2: (()+0xf5e0) [0x7fbf61ae15e0]
>>> 3: (gsignal()+0x37) [0x7fbf604d31f7]
>>> 4: (abort()+0x148) [0x7fbf604d48e8]
>>> 5: (RocksDBStore::get(std::string const&, char const*, unsigned long, ceph::buffer::list*)+0x1ce) [0x55f779d2b5ce]
>>> 6: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x545) [0x55f779cd8f75]
>>> 7: (BlueStore::_fsck(bool, bool)+0x1bb5) [0x55f779cf1a75]
>>> 8: (main()+0xde0) [0x55f779baab90]
>>> 9: (__libc_start_main()+0xf5) [0x7fbf604bfc05]
>>> 10: (()+0x1bc59f) [0x55f779c3f59f]
>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>> needed to interpret this.
>>>
>>> Could you please help me recover this node, or find a way to prove
>>> that the SSD disk is the problem?
>>>
>>> Best regards,
>>> Behnam Loghmani
>>>
>>> On Mon, Feb 19, 2018 at 1:35 PM, Caspar Smit <caspars...@supernas.eu> wrote:
>>>
>>> Hi Behnam,
>>>
>>> I would first recommend running a filesystem check on the monitor
>>> disk to see if there are any inconsistencies.
>>>
>>> Is the disk the monitor is running on a spinning disk or an SSD?
>>>
>>> If it's an SSD, you should check the wear-level stats through
>>> smartctl. Maybe trim (discard) is enabled on the filesystem mount?
>>> (discard can cause problems/corruption in combination with certain
>>> SSD firmware)
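>>> (A wear-level check could look like the line below; /dev/sda is just
>>> an example and the exact attribute names differ per SSD vendor:)
>>>
>>> # smartctl -A /dev/sda | grep -i -e wear -e lifetime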
>>> Caspar
>>>
>>> 2018-02-16 23:03 GMT+01:00 Behnam Loghmani <behnam.loghm...@gmail.com>:
>>>
>>> I checked the disk the monitor is on with smartctl; it didn't return
>>> any errors and it doesn't have any Current_Pending_Sector.
>>> Do you recommend any disk checks to make sure this disk really has a
>>> problem, so that I can send the report to the provider to replace the
>>> disk?
>>>
>>> On Sat, Feb 17, 2018 at 1:09 AM, Gregory Farnum <gfar...@redhat.com> wrote:
>>>
>>> The disk that the monitor is on... there isn't anything for you to
>>> configure about a monitor WAL, though, so I'm not sure how that
>>> enters into it?
>>>
>>> On Fri, Feb 16, 2018 at 12:46 PM Behnam Loghmani <behnam.loghm...@gmail.com> wrote:
>>>
>>> Thanks for your reply.
>>>
>>> Do you mean that's the problem with the disk I use for WAL and DB?
>>>
>>> On Fri, Feb 16, 2018 at 11:33 PM, Gregory Farnum <gfar...@redhat.com> wrote:
>>>
>>> On Fri, Feb 16, 2018 at 7:37 AM Behnam Loghmani <behnam.loghm...@gmail.com> wrote:
>>>
>>> Hi there,
>>>
>>> I have a Ceph cluster version 12.2.2 on CentOS 7.
>>>
>>> It is a testing cluster and I set it up 2 weeks ago.
>>> After some days, I saw that one of the three mons had stopped (out of
>>> quorum) and I can't start it anymore.
>>> I checked the mon service log and the output shows this error:
>>>
>>> """
>>> mon.XXXXXX@-1(probing) e4 preinit clean up potentially inconsistent
>>> store state
>>> rocksdb: submit_transaction_sync error: Corruption: block checksum mismatch
>>>
>>> This bit is the important one. Your disk is bad and it's feeding back
>>> corrupted data.
>>>
>>> code = 2 Rocksdb transaction:
>>> 0> 2018-02-16 17:37:07.041812 7f45a1e52e40 -1
>>> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/mon/MonitorDBStore.h:
>>> In function 'void MonitorDBStore::clear(std::set<std::basic_string<char> >&)'
>>> thread 7f45a1e52e40 time 2018-02-16 17:37:07.040846
>>> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/mon/MonitorDBStore.h:
>>> 581: FAILED assert(r >= 0)
>>> """
>>>
>>> the only solution I found was to remove this mon from the quorum,
>>> remove all the mon data, and re-add the mon to the quorum again,
>>> and Ceph goes back to healthy status.
>>>
>>> but now, after some days, this mon has stopped and I face the same
>>> problem again.
>>>
>>> My cluster setup is:
>>> 4 OSD hosts
>>> 8 OSDs in total
>>> 3 mons
>>> 1 rgw
>>>
>>> This cluster was set up with ceph-volume lvm, with WAL/DB separation
>>> on logical volumes.
>>>
>>> Best regards,
>>> Behnam Loghmani
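P.S. To spell out the mon workaround quoted above: the remove/re-add
cycle I used was roughly the standard procedure below (mon id XXXXXX and
the paths are placeholders for my real values):

# ceph mon remove XXXXXX
# rm -rf /var/lib/ceph/mon/ceph-XXXXXX
# ceph mon getmap -o /tmp/monmap
# ceph auth get mon. -o /tmp/keyring
# ceph-mon -i XXXXXX --mkfs --monmap /tmp/monmap --keyring /tmp/keyring
# chown -R ceph:ceph /var/lib/ceph/mon/ceph-XXXXXX
# systemctl start ceph-mon@XXXXXX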
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com