Hi there,

I changed the SATA port and the cable of the SSD disk, updated Ceph to version 12.2.3, and rebuilt the OSDs, but when recovery starts the OSDs fail with this error:
2018-02-21 21:12:18.037974 7f3479fe2d00 -1 bluestore(/var/lib/ceph/osd/ceph-7) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x84c097b0, expected 0xaf1040a2, device location [0x10000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
2018-02-21 21:12:18.038002 7f3479fe2d00 -1 osd.7 0 OSD::init() : unable to read osd superblock
2018-02-21 21:12:18.038009 7f3479fe2d00 1 bluestore(/var/lib/ceph/osd/ceph-7) umount
2018-02-21 21:12:18.038282 7f3479fe2d00 1 stupidalloc 0x0x55e99236c620 shutdown
2018-02-21 21:12:18.038308 7f3479fe2d00 1 freelist shutdown
2018-02-21 21:12:18.038336 7f3479fe2d00 4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.3/rpm/el7/BUILD/ceph-12.2.3/src/rocksdb/db/db_impl.cc:217] Shutdown: canceling all background work
2018-02-21 21:12:18.041561 7f3465561700 4 rocksdb: (Original Log Time 2018/02/21-21:12:18.041514) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.3/rpm/el7/BUILD/ceph-12.2.3/src/rocksdb/db/compaction_job.cc:621] [default] compacted to: base level 1 max bytes base 268435456 files[5 0 0 0 0 0 0] max score 0.00, MB/sec: 2495.2 rd, 10.1 wr, level 1, files in(5, 0) out(1) MB in(213.6, 0.0) out(0.9), read-write-amplify(1.0) write-amplify(0.0) Shutdown in progress: Database shutdown or Column
2018-02-21 21:12:18.041569 7f3465561700 4 rocksdb: (Original Log Time 2018/02/21-21:12:18.041545) EVENT_LOG_v1 {"time_micros": 1519234938041530, "job": 3, "event": "compaction_finished", "compaction_time_micros": 89747, "output_level": 1, "num_output_files": 1, "total_output_size": 902552, "num_input_records": 4470, "num_output_records": 4377, "num_subcompactions": 1, "num_single_delete_mismatches": 0, "num_single_delete_fallthrough": 44, "lsm_state": [5, 0, 0, 0, 0, 0, 0]}
2018-02-21 21:12:18.041663 7f3479fe2d00 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1519234938041657, "job": 4, "event": "table_file_deletion", "file_number": 249}
2018-02-21 21:12:18.042144 7f3479fe2d00 4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.3/rpm/el7/BUILD/ceph-12.2.3/src/rocksdb/db/db_impl.cc:343] Shutdown complete
2018-02-21 21:12:18.043474 7f3479fe2d00 1 bluefs umount
2018-02-21 21:12:18.043775 7f3479fe2d00 1 stupidalloc 0x0x55e991f05d40 shutdown
2018-02-21 21:12:18.043784 7f3479fe2d00 1 stupidalloc 0x0x55e991f05db0 shutdown
2018-02-21 21:12:18.043786 7f3479fe2d00 1 stupidalloc 0x0x55e991f05e20 shutdown
2018-02-21 21:12:18.043826 7f3479fe2d00 1 bdev(0x55e992254600 /dev/vg0/wal-b) close
2018-02-21 21:12:18.301531 7f3479fe2d00 1 bdev(0x55e992255800 /dev/vg0/db-b) close
2018-02-21 21:12:18.545488 7f3479fe2d00 1 bdev(0x55e992254400 /var/lib/ceph/osd/ceph-7/block) close
2018-02-21 21:12:18.650473 7f3479fe2d00 1 bdev(0x55e992254000 /var/lib/ceph/osd/ceph-7/block) close
2018-02-21 21:12:18.900003 7f3479fe2d00 -1 ** ERROR: osd init failed: (22) Invalid argument
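(For anyone following along: the checksum failure can be checked offline with the same deep fsck I used earlier in this thread; the path below is for osd.7 on this node.)

# ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-7 --deep 1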
On Wed, Feb 21, 2018 at 5:06 PM, Behnam Loghmani <behnam.loghm...@gmail.com> wrote:

> but the disks pass all the tests with smartctl and badblocks, and there
> aren't any errors on the disks.
> because the SSD contains the WAL/DB of the OSDs, it's difficult to test
> it on the other cluster nodes.
>
> On Wed, Feb 21, 2018 at 4:58 PM, <kna...@gmail.com> wrote:
>
>> Could the problem be related to some faulty hardware (RAID controller,
>> port, cable) but not the disk? Does the "faulty" disk work OK on
>> another server?
>>
>> Behnam Loghmani wrote on 21/02/18 16:09:
>>
>>> Hi there,
>>>
>>> I changed the SSD on the problematic node for a new one and
>>> reconfigured the OSDs and the MON service on it,
>>> but the problem occurred again with:
>>>
>>> "rocksdb: submit_transaction error: Corruption: block checksum mismatch
>>> code = 2"
>>>
>>> I'm fully confused now.
>>>
>>> On Tue, Feb 20, 2018 at 5:16 PM, Behnam Loghmani <behnam.loghm...@gmail.com> wrote:
>>>
>>> Hi Caspar,
>>>
>>> I checked the filesystem and there aren't any errors on it.
>>> The disk is an SSD, it doesn't have any attribute related to wear
>>> level in smartctl, and the filesystem is mounted with default options
>>> and no discard.
>>>
>>> My Ceph structure on this node is like this:
>>>
>>> it has osd, mon and rgw services
>>> 1 SSD for the OS and WAL/DB
>>> 2 HDDs
>>>
>>> The OSDs were created by ceph-volume lvm.
>>>
>>> The whole SSD is in one VG:
>>> the OS is on the root LV
>>> OSD.1's DB is on db-a
>>> OSD.1's WAL is on wal-a
>>> OSD.2's DB is on db-b
>>> OSD.2's WAL is on wal-b
>>>
>>> output of lvs:
>>>
>>> data-a  data-a  -wi-a-----
>>> data-b  data-b  -wi-a-----
>>> db-a    vg0     -wi-a-----
>>> db-b    vg0     -wi-a-----
>>> root    vg0     -wi-ao----
>>> wal-a   vg0     -wi-a-----
>>> wal-b   vg0     -wi-a-----
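>>> (For clarity: I don't have the exact invocations at hand, but the
>>> OSDs were created with ceph-volume roughly like the line below,
>>> reusing the VG/LV names from the lvs output above; osd.2 used
>>> data-b/db-b/wal-b accordingly:)
>>>
>>> # ceph-volume lvm create --bluestore --data data-a/data-a --block.db vg0/db-a --block.wal vg0/wal-a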
>>> after making a heavy write on the radosgw, OSD.1 and OSD.2 stopped
>>> with the "block checksum mismatch" error.
>>> Now the MON and OSD services on this node have stopped working with
>>> this error.
>>>
>>> I think my issue is related to this bug:
>>> http://tracker.ceph.com/issues/22102
>>>
>>> I ran
>>> # ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-1 --deep 1
>>> but it returns the same error:
>>>
>>> *** Caught signal (Aborted) **
>>> in thread 7fbf6c923d00 thread_name:ceph-bluestore-
>>> 2018-02-20 16:44:30.128787 7fbf6c923d00 -1 abort: Corruption: block
>>> checksum mismatch
>>> ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
>>> 1: (()+0x3eb0b1) [0x55f779e6e0b1]
>>> 2: (()+0xf5e0) [0x7fbf61ae15e0]
>>> 3: (gsignal()+0x37) [0x7fbf604d31f7]
>>> 4: (abort()+0x148) [0x7fbf604d48e8]
>>> 5: (RocksDBStore::get(std::string const&, char const*, unsigned long, ceph::buffer::list*)+0x1ce) [0x55f779d2b5ce]
>>> 6: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x545) [0x55f779cd8f75]
>>> 7: (BlueStore::_fsck(bool, bool)+0x1bb5) [0x55f779cf1a75]
>>> 8: (main()+0xde0) [0x55f779baab90]
>>> 9: (__libc_start_main()+0xf5) [0x7fbf604bfc05]
>>> 10: (()+0x1bc59f) [0x55f779c3f59f]
>>> 2018-02-20 16:44:30.131334 7fbf6c923d00 -1 *** Caught signal (Aborted) **
>>> in thread 7fbf6c923d00 thread_name:ceph-bluestore-
>>>
>>> ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
>>> 1: (()+0x3eb0b1) [0x55f779e6e0b1]
>>> 2: (()+0xf5e0) [0x7fbf61ae15e0]
>>> 3: (gsignal()+0x37) [0x7fbf604d31f7]
>>> 4: (abort()+0x148) [0x7fbf604d48e8]
>>> 5: (RocksDBStore::get(std::string const&, char const*, unsigned long, ceph::buffer::list*)+0x1ce) [0x55f779d2b5ce]
>>> 6: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x545) [0x55f779cd8f75]
>>> 7: (BlueStore::_fsck(bool, bool)+0x1bb5) [0x55f779cf1a75]
>>> 8: (main()+0xde0) [0x55f779baab90]
>>> 9: (__libc_start_main()+0xf5) [0x7fbf604bfc05]
>>> 10: (()+0x1bc59f) [0x55f779c3f59f]
>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>> needed to interpret this.
>>>
>>> -1> 2018-02-20 16:44:30.128787 7fbf6c923d00 -1 abort: Corruption: block checksum mismatch
>>> 0> 2018-02-20 16:44:30.131334 7fbf6c923d00 -1 *** Caught signal (Aborted) **
>>> in thread 7fbf6c923d00 thread_name:ceph-bluestore-
>>>
>>> ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
>>> 1: (()+0x3eb0b1) [0x55f779e6e0b1]
>>> 2: (()+0xf5e0) [0x7fbf61ae15e0]
>>> 3: (gsignal()+0x37) [0x7fbf604d31f7]
>>> 4: (abort()+0x148) [0x7fbf604d48e8]
>>> 5: (RocksDBStore::get(std::string const&, char const*, unsigned long, ceph::buffer::list*)+0x1ce) [0x55f779d2b5ce]
>>> 6: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x545) [0x55f779cd8f75]
>>> 7: (BlueStore::_fsck(bool, bool)+0x1bb5) [0x55f779cf1a75]
>>> 8: (main()+0xde0) [0x55f779baab90]
>>> 9: (__libc_start_main()+0xf5) [0x7fbf604bfc05]
>>> 10: (()+0x1bc59f) [0x55f779c3f59f]
>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>> needed to interpret this.
>>>
>>> Could you please help me recover this node, or find a way to prove
>>> that the SSD disk is the problem?
>>>
>>> Best regards,
>>> Behnam Loghmani
>>>
>>> On Mon, Feb 19, 2018 at 1:35 PM, Caspar Smit <caspars...@supernas.eu> wrote:
>>>
>>> Hi Behnam,
>>>
>>> I would first recommend running a filesystem check on the monitor
>>> disk to see if there are any inconsistencies.
>>>
>>> Is the disk the monitor is running on a spinning disk or an SSD?
>>>
>>> If it's an SSD, you should check the wear-level stats through
>>> smartctl. Maybe trim (discard) is enabled on the filesystem mount?
>>> (discard can cause problems/corruption in combination with certain
>>> SSD firmware)
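>>> (A wear-level check could look like the line below; /dev/sda is just
>>> an example and the exact attribute names differ per SSD vendor:)
>>>
>>> # smartctl -A /dev/sda | grep -i -e wear -e lifetime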
>>> Caspar
>>>
>>> 2018-02-16 23:03 GMT+01:00 Behnam Loghmani <behnam.loghm...@gmail.com>:
>>>
>>> I checked the disk the monitor is on with smartctl; it didn't return
>>> any errors and it doesn't have any Current_Pending_Sector.
>>> Do you recommend any disk checks to make sure this disk really has a
>>> problem, so that I can send the report to the provider to replace the
>>> disk?
>>>
>>> On Sat, Feb 17, 2018 at 1:09 AM, Gregory Farnum <gfar...@redhat.com> wrote:
>>>
>>> The disk that the monitor is on... there isn't anything for you to
>>> configure about a monitor WAL, though, so I'm not sure how that
>>> enters into it?
>>>
>>> On Fri, Feb 16, 2018 at 12:46 PM Behnam Loghmani <behnam.loghm...@gmail.com> wrote:
>>>
>>> Thanks for your reply.
>>>
>>> Do you mean that's the problem with the disk I use for WAL and DB?
>>>
>>> On Fri, Feb 16, 2018 at 11:33 PM, Gregory Farnum <gfar...@redhat.com> wrote:
>>>
>>> On Fri, Feb 16, 2018 at 7:37 AM Behnam Loghmani <behnam.loghm...@gmail.com> wrote:
>>>
>>> Hi there,
>>>
>>> I have a Ceph cluster version 12.2.2 on CentOS 7.
>>>
>>> It is a testing cluster and I set it up 2 weeks ago.
>>> After some days, I saw that one of the three mons had stopped (out of
>>> quorum) and I can't start it anymore.
>>> I checked the mon service log and the output shows this error:
>>>
>>> """
>>> mon.XXXXXX@-1(probing) e4 preinit clean up potentially inconsistent
>>> store state
>>> rocksdb: submit_transaction_sync error: Corruption: block checksum mismatch
>>>
>>> This bit is the important one. Your disk is bad and it's feeding back
>>> corrupted data.
>>>
>>> code = 2 Rocksdb transaction:
>>> 0> 2018-02-16 17:37:07.041812 7f45a1e52e40 -1
>>> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/mon/MonitorDBStore.h:
>>> In function 'void MonitorDBStore::clear(std::set<std::basic_string<char> >&)'
>>> thread 7f45a1e52e40 time 2018-02-16 17:37:07.040846
>>> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/mon/MonitorDBStore.h:
>>> 581: FAILED assert(r >= 0)
>>> """
>>>
>>> the only solution I found was to remove this mon from the quorum,
>>> remove all the mon data, and re-add the mon to the quorum again,
>>> and Ceph goes back to healthy status.
>>>
>>> but now, after some days, this mon has stopped and I face the same
>>> problem again.
>>>
>>> My cluster setup is:
>>> 4 OSD hosts
>>> 8 OSDs in total
>>> 3 mons
>>> 1 rgw
>>>
>>> This cluster was set up with ceph-volume lvm, with WAL/DB separation
>>> on logical volumes.
>>>
>>> Best regards,
>>> Behnam Loghmani
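P.S. To spell out the mon workaround quoted above: the remove/re-add
cycle I used was roughly the standard procedure below (mon id XXXXXX and
the paths are placeholders for my real values):

# ceph mon remove XXXXXX
# rm -rf /var/lib/ceph/mon/ceph-XXXXXX
# ceph mon getmap -o /tmp/monmap
# ceph auth get mon. -o /tmp/keyring
# ceph-mon -i XXXXXX --mkfs --monmap /tmp/monmap --keyring /tmp/keyring
# chown -R ceph:ceph /var/lib/ceph/mon/ceph-XXXXXX
# systemctl start ceph-mon@XXXXXX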
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com