Thanks for your reply!

Yes, it's an NVMe. Each node has two NVMes used as DB/WAL devices: one serving 
the SSD OSDs (0-2) and the other the HDD OSDs (3-6).
Unfortunately I have no spare to try.
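For anyone following along, this is roughly how I check which OSDs share a given 
NVMe as their DB/WAL on a node (just a sketch; nvme0n1 is simply the device name 
on my box):

  # list each OSD with its data/db/wal devices (run on the affected node)
  ceph-volume lvm list

  # see which LVs sit on the NVMe itself
  lsblk -o NAME,TYPE,SIZE /dev/nvme0n1

  # Ceph's own device-to-daemon mapping
  ceph device ls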
It's very strange: the load was not very high at that time, and both the SSDs and 
the NVMe seem healthy.
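For reference, "seem healthy" is based on checks like these (assuming nvme-cli is 
installed; nvme0/nvme0n1 are just the names on this node):

  # SMART data and controller error log for the NVMe
  nvme smart-log /dev/nvme0n1
  nvme error-log /dev/nvme0

  # kernel messages mentioning the controller
  dmesg -T | grep -i nvme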

If it cannot be fixed, I am afraid I will need to set up more nodes and then mark 
out and remove the OSDs that use this NVMe (roughly as sketched below)?
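Something like the following is what I have in mind, assuming osd.0, osd.8 and 
osd.16 are the OSDs whose block.db/block.wal live on this NVMe (that is what the 
logs below suggest); please correct me if there is a safer way:

  # mark the OSDs out and let the cluster rebalance off them
  ceph osd out 0 8 16

  # once data has drained and it is safe, stop and purge each one
  ceph osd safe-to-destroy osd.0        # repeat for osd.8 and osd.16
  systemctl stop ceph-osd@0
  ceph osd purge 0 --yes-i-really-mean-it

  # wipe the LVs on the bad NVMe before replacing it (may fail if the drive is dead)
  ceph-volume lvm zap --destroy /dev/nvme0n1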

Thanks,
zx


> On Feb 22, 2021, at 10:07 AM, Mark Lehrer <leh...@gmail.com> wrote:
> 
>> One NVMe suddenly crashed again. Could anyone please help shed some light here?
> 
> It looks like a flaky NVMe drive.  Do you have a spare to try?
> 
> 
> On Mon, Feb 22, 2021 at 1:56 AM zxcs <zhuxion...@163.com> wrote:
>> 
>> One NVMe suddenly crashed again. Could anyone please help shed some light here? 
>> Thanks a ton!!!
>> Below are the syslog and the Ceph OSD log.
>> 
>> From  /var/log/syslog
>> Feb 21 19:38:33 ip kernel: [232562.847916] nvme 0000:03:00.0: I/O 943 QID 7 
>> timeout, aborting
>> Feb 21 19:38:34 ip kernel: [232563.847946] nvme 0000:03:00.0: I/O 911 QID 18 
>> timeout, aborting
>> Feb 21 19:38:34 ip kernel: [232563.847964] nvme 0000:03:00.0: I/O 776 QID 28 
>> timeout, aborting
>> Feb 21 19:38:36 ip ceph-osd[3241]: 2021-02-21 19:38:36.218 7f023b58f700 -1 
>> osd.16 7868 get_health_metrics reporting 1 slow ops, oldest is 
>> osd_op(mds.1.51327:1034954064 2.80 2:018429c8:::2002458b0ca.00000000:head 
>> [create,setxattr parent (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> Feb 21 19:38:36 ip kernel: [232565.851961] nvme 0000:03:00.0: I/O 442 QID 2 
>> timeout, aborting
>> Feb 21 19:38:36 ip kernel: [232565.851982] nvme 0000:03:00.0: I/O 912 QID 18 
>> timeout, aborting
>> Feb 21 19:38:37 ip ceph-osd[3241]: 2021-02-21 19:38:37.254 7f023b58f700 -1 
>> osd.16 7868 get_health_metrics reporting 1 slow ops, oldest is 
>> osd_op(mds.1.51327:1034954064 2.80 2:018429c8:::2002458b0ca.00000000:head 
>> [create,setxattr parent (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> Feb 21 19:38:38 ip ceph-osd[3241]: 2021-02-21 19:38:38.286 7f023b58f700 -1 
>> osd.16 7868 get_health_metrics reporting 1 slow ops, oldest is 
>> osd_op(mds.1.51327:1034954064 2.80 2:018429c8:::2002458b0ca.00000000:head 
>> [create,setxattr parent (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> Feb 21 19:38:39 ip ceph-osd[3241]: 2021-02-21 19:38:39.334 7f023b58f700 -1 
>> osd.16 7868 get_health_metrics reporting 1 slow ops, oldest is 
>> osd_op(mds.1.51327:1034954064 2.80 2:018429c8:::2002458b0ca.00000000:head 
>> [create,setxattr parent (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> Feb 21 19:38:40 ip ceph-osd[3241]: 2021-02-21 19:38:40.322 7f023b58f700 -1 
>> osd.16 7868 get_health_metrics reporting 1 slow ops, oldest is 
>> osd_op(mds.1.51327:1034954064 2.80 2:018429c8:::2002458b0ca.00000000:head 
>> [create,setxattr parent (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> Feb 21 19:38:41 ip ceph-osd[3241]: 2021-02-21 19:38:41.326 7f023b58f700 -1 
>> osd.16 7868 get_health_metrics reporting 1 slow ops, oldest is 
>> osd_op(mds.1.51327:1034954064 2.80 2:018429c8:::2002458b0ca.00000000:head 
>> [create,setxattr parent (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> Feb 21 19:38:41 ip kernel: [232570.852035] nvme 0000:03:00.0: I/O 860 QID 9 
>> timeout, aborting
>> Feb 21 19:38:42 ip ceph-osd[3241]: 2021-02-21 19:38:42.298 7f023b58f700 -1 
>> osd.16 7868 get_health_metrics reporting 1 slow ops, oldest is 
>> osd_op(mds.1.51327:1034954064 2.80 2:018429c8:::2002458b0ca.00000000:head 
>> [create,setxattr parent (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> Feb 21 19:38:43 ip ceph-osd[3241]: 2021-02-21 19:38:43.258 7f023b58f700 -1 
>> osd.16 7868 get_health_metrics reporting 1 slow ops, oldest is 
>> osd_op(mds.1.51327:1034954064 2.80 2:018429c8:::2002458b0ca.00000000:head 
>> [create,setxattr parent (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> Feb 21 19:38:44 ip ceph-osd[3241]: 2021-02-21 19:38:44.258 7f023b58f700 -1 
>> osd.16 7868 get_health_metrics reporting 2 slow ops, oldest is 
>> osd_op(mds.1.51327:1034954064 2.80 2:018429c8:::2002458b0ca.00000000:head 
>> [create,setxattr parent (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> Feb 21 19:38:45 ip ntpd[3480]: Soliciting pool server 84.16.67.12
>> Feb 21 19:38:45 ip ceph-osd[3241]: 2021-02-21 19:38:45.286 7f023b58f700 -1 
>> osd.16 7868 get_health_metrics reporting 2 slow ops, oldest is 
>> osd_op(mds.1.51327:1034954064 2.80 2:018429c8:::2002458b0ca.00000000:head 
>> [create,setxattr parent (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> Feb 21 19:38:46 ip ceph-osd[3241]: 2021-02-21 19:38:46.254 7f023b58f700 -1 
>> osd.16 7868 get_health_metrics reporting 2 slow ops, oldest is 
>> osd_op(mds.1.51327:1034954064 2.80 2:018429c8:::2002458b0ca.00000000:head 
>> [create,setxattr parent (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> Feb 21 19:38:47 ip ceph-osd[3241]: 2021-02-21 19:38:47.226 7f023b58f700 -1 
>> osd.16 7868 get_health_metrics reporting 2 slow ops, oldest is 
>> osd_op(mds.1.51327:1034954064 2.80 2:018429c8:::2002458b0ca.00000000:head 
>> [create,setxattr parent (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> Feb 21 19:39:04 ip kernel: [232593.860464] nvme 0000:03:00.0: I/O 943 QID 7 
>> timeout, reset controller
>> Feb 21 19:39:33 ip kernel: [232622.868975] nvme 0000:03:00.0: I/O 0 QID 0 
>> timeout, reset controller
>> Feb 21 19:40:35 ip ceph-osd[3241]: 2021-02-21 19:40:35.211 7f0236585700 -1 
>> bdev(0xc3f0a80 /var/lib/ceph/osd/ceph-16/block.wal) _aio_thread got r=-5 
>> ((5) Input/output error)
>> Feb 21 19:40:35 ip ceph-osd[3241]: 2021-02-21 19:40:35.211 7f021a54d700 -1 
>> bdev(0xc3f0700 /var/lib/ceph/osd/ceph-16/block.db) direct_read_unaligned 
>> stalled read  0x611527d7~fe6 since 232570.311299s, timeout is 5.000000s
>> Feb 21 19:40:35 ip ceph-osd[3241]: 2021-02-21 19:40:35.211 7f021a54d700 -1 
>> bdev(0xc3f0700 /var/lib/ceph/osd/ceph-16/block.db) direct_read_unaligned 
>> 0x611527d7~fe6 error: (5) Input/output error
>> Feb 21 19:40:35 ip kernel: [232684.442054] nvme 0000:03:00.0: Device not 
>> ready; aborting reset
>> Feb 21 19:40:35 ip kernel: [232684.442694] nvme 0000:03:00.0: Cancelling I/O 
>> 776 QID 28
>> Feb 21 19:40:35 ip kernel: [232684.442700] nvme 0000:03:00.0: completing 
>> aborted command with status: 0007
>> Feb 21 19:40:35 ip kernel: [232684.442705] blk_update_request: I/O error, 
>> dev nvme0n1, sector 1344455888
>> Feb 21 19:40:35 ip kernel: [232684.442937] nvme 0000:03:00.0: Cancelling I/O 
>> 911 QID 18
>> Feb 21 19:40:35 ip kernel: [232684.442941] nvme 0000:03:00.0: completing 
>> aborted command with status: 0007
>> Feb 21 19:40:35 ip ceph-osd[3241]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: In function 'int 
>> BlueFS::_read_random(BlueFS::FileReader*, uint64_t, size_t, char*)' thread 
>> 7f021a54d700 time 2021-02-21 19:40:35.216141
>> Feb 21 19:40:35 ip kernel: [232684.442944] blk_update_request: I/O error, 
>> dev nvme0n1, sector 673580352
>> Feb 21 19:40:35 ip kernel: [232684.443163] nvme 0000:03:00.0: Cancelling I/O 
>> 912 QID 18
>> Feb 21 19:40:35 ip kernel: [232684.443168] nvme 0000:03:00.0: completing 
>> aborted command with status: 0007
>> Feb 21 19:40:35 ip kernel: [232684.443171] blk_update_request: I/O error, 
>> dev nvme0n1, sector 429184832
>> Feb 21 19:40:35 ip kernel: [232684.443384] nvme 0000:03:00.0: Cancelling I/O 
>> 913 QID 18
>> Feb 21 19:40:35 ip kernel: [232684.443388] blk_update_request: I/O error, 
>> dev nvme0n1, sector 1680904848
>> Feb 21 19:40:35 ip kernel: [232684.443607] nvme 0000:03:00.0: Cancelling I/O 
>> 914 QID 18
>> Feb 21 19:40:35 ip kernel: [232684.443610] blk_update_request: I/O error, 
>> dev nvme0n1, sector 349779760
>> Feb 21 19:40:35 ip kernel: [232684.443823] nvme 0000:03:00.0: Cancelling I/O 
>> 915 QID 18
>> Feb 21 19:40:35 ip kernel: [232684.443826] blk_update_request: I/O error, 
>> dev nvme0n1, sector 349798704
>> Feb 21 19:40:35 ip kernel: [232684.444043] nvme 0000:03:00.0: Cancelling I/O 
>> 860 QID 9
>> Feb 21 19:40:35 ip kernel: [232684.444047] nvme 0000:03:00.0: completing 
>> aborted command with status: 0007
>> Feb 21 19:40:35 ip kernel: [232684.444050] blk_update_request: I/O error, 
>> dev nvme0n1, sector 1051102760
>> Feb 21 19:40:35 ip kernel: [232684.444273] nvme 0000:03:00.0: Cancelling I/O 
>> 943 QID 7
>> Feb 21 19:40:35 ip kernel: [232684.444281] nvme 0000:03:00.0: Cancelling I/O 
>> 442 QID 2
>> Feb 21 19:40:35 ip kernel: [232684.444287] nvme 0000:03:00.0: completing 
>> aborted command with status: 0007
>> Feb 21 19:40:35 ip kernel: [232684.444291] blk_update_request: I/O error, 
>> dev nvme0n1, sector 340275160
>> Feb 21 19:40:35 ip kernel: [232684.444514] nvme 0000:03:00.0: Cancelling I/O 
>> 0 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444520] nvme 0000:03:00.0: Cancelling I/O 
>> 1 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444528] nvme nvme0: Abort status: 0x7
>> Feb 21 19:40:35 ip kernel: [232684.444534] nvme 0000:03:00.0: Cancelling I/O 
>> 2 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444540] nvme nvme0: Abort status: 0x7
>> Feb 21 19:40:35 ip kernel: [232684.444546] nvme 0000:03:00.0: Cancelling I/O 
>> 3 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444552] nvme nvme0: Abort status: 0x7
>> Feb 21 19:40:35 ip kernel: [232684.444558] nvme 0000:03:00.0: Cancelling I/O 
>> 4 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444563] nvme nvme0: Abort status: 0x7
>> Feb 21 19:40:35 ip kernel: [232684.444570] nvme 0000:03:00.0: Cancelling I/O 
>> 5 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444575] nvme nvme0: Abort status: 0x7
>> Feb 21 19:40:35 ip kernel: [232684.444581] nvme 0000:03:00.0: Cancelling I/O 
>> 6 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444587] nvme 0000:03:00.0: Cancelling I/O 
>> 7 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444594] nvme 0000:03:00.0: Cancelling I/O 
>> 8 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444600] nvme 0000:03:00.0: Cancelling I/O 
>> 9 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444606] nvme 0000:03:00.0: Cancelling I/O 
>> 10 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444613] nvme 0000:03:00.0: Cancelling I/O 
>> 11 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444619] nvme 0000:03:00.0: Cancelling I/O 
>> 12 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444625] nvme 0000:03:00.0: Cancelling I/O 
>> 13 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444631] nvme 0000:03:00.0: Cancelling I/O 
>> 14 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444636] nvme 0000:03:00.0: Cancelling I/O 
>> 15 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444640] nvme 0000:03:00.0: Cancelling I/O 
>> 16 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444644] nvme 0000:03:00.0: Cancelling I/O 
>> 17 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444648] nvme 0000:03:00.0: Cancelling I/O 
>> 18 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444653] nvme 0000:03:00.0: Cancelling I/O 
>> 19 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444657] nvme 0000:03:00.0: Cancelling I/O 
>> 20 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444661] nvme 0000:03:00.0: Cancelling I/O 
>> 21 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444665] nvme 0000:03:00.0: Cancelling I/O 
>> 22 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444669] nvme 0000:03:00.0: Cancelling I/O 
>> 23 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444673] nvme 0000:03:00.0: Cancelling I/O 
>> 24 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444677] nvme 0000:03:00.0: Cancelling I/O 
>> 25 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444680] nvme 0000:03:00.0: Cancelling I/O 
>> 26 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444684] nvme 0000:03:00.0: Cancelling I/O 
>> 27 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444688] nvme 0000:03:00.0: Cancelling I/O 
>> 28 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444692] nvme 0000:03:00.0: Cancelling I/O 
>> 29 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444696] nvme 0000:03:00.0: Cancelling I/O 
>> 30 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444700] nvme 0000:03:00.0: Cancelling I/O 
>> 31 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444704] nvme 0000:03:00.0: Cancelling I/O 
>> 32 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444708] nvme 0000:03:00.0: Cancelling I/O 
>> 33 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444712] nvme 0000:03:00.0: Cancelling I/O 
>> 34 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444716] nvme 0000:03:00.0: Cancelling I/O 
>> 35 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444720] nvme 0000:03:00.0: Cancelling I/O 
>> 36 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444724] nvme 0000:03:00.0: Cancelling I/O 
>> 37 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444753] nvme 0000:03:00.0: Cancelling I/O 
>> 943 QID 7
>> Feb 21 19:40:35 ip kernel: [232684.444758] nvme 0000:03:00.0: Cancelling I/O 
>> 0 QID 0
>> Feb 21 19:40:35 ip kernel: [232684.444762] nvme nvme0: Abort status: 0xfffc
>> Feb 21 19:40:35 ip kernel: [232684.444784] nvme 0000:03:00.0: completing 
>> aborted command with status: fffffffc
>> Feb 21 19:40:35 ip kernel: [232684.444792] blk_update_request: I/O error, 
>> dev nvme0n1, sector 694864
>> Feb 21 19:40:35 ip ceph-osd[3241]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: 1555: FAILED ceph_assert(r 
>> == 0)
>> Feb 21 19:40:35 ip ceph-osd[3228]: 2021-02-21 19:40:35.211 7f6fa71de700 -1 
>> bdev(0xbb4ea80 /var/lib/ceph/osd/ceph-8/block.wal) _aio_thread got r=-5 ((5) 
>> Input/output error)
>> Feb 21 19:40:35 ip ceph-osd[3228]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: In function 'int 
>> BlueFS::_read_random(BlueFS::FileReader*, uint64_t, size_t, char*)' thread 
>> 7f6f899a3700 time 2021-02-21 19:40:35.216629
>> Feb 21 19:40:35 ip ceph-osd[3228]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: 1555: FAILED ceph_assert(r 
>> == 0)
>> Feb 21 19:40:35 ip ceph-osd[3228]: 2021-02-21 19:40:35.211 7f6f899a3700 -1 
>> bdev(0xbb4e700 /var/lib/ceph/osd/ceph-8/block.db) direct_read_unaligned 
>> stalled read  0x54d0c509d~11b5 since 232536.575882s, timeout is 5.000000s
>> Feb 21 19:40:35 ip ceph-osd[3228]: 2021-02-21 19:40:35.211 7f6f899a3700 -1 
>> bdev(0xbb4e700 /var/lib/ceph/osd/ceph-8/block.db) direct_read_unaligned 
>> 0x54d0c509d~11b5 error: (5) Input/output error
>> Feb 21 19:40:35 ip ceph-osd[3228]: 
>> /build/ceph-14.2.10/src/os/bluestore/KernelDevice.cc: In function 'void 
>> KernelDevice::_aio_thread()' thread 7f6fa71de700 time 2021-02-21 
>> 19:40:35.215716
>> Feb 21 19:40:35 ip ceph-osd[3228]: 
>> /build/ceph-14.2.10/src/os/bluestore/KernelDevice.cc: 534: 
>> ceph_abort_msg("Unexpected IO error. This may suggest a hardware issue. 
>> Please check your kernel log!")
>> Feb 21 19:40:35 ip ceph-osd[3228]:  ceph version 14.2.10 
>> (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)
>> Feb 21 19:40:35 ip ceph-osd[3228]:  1: (ceph::__ceph_abort(char const*, int, 
>> char const*, std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > const&)+0xdf) [0x83eb8c]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  2: (KernelDevice::_aio_thread()+0xbc5) 
>> [0xe8d5b5]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  3: 
>> (KernelDevice::AioCompletionThread::entry()+0xd) [0xe8f3cd]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  4: (()+0x76ba) [0x7f6fb567c6ba]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  5: (clone()+0x6d) [0x7f6fb4c834dd]
>> Feb 21 19:40:35 ip ceph-osd[3228]: 2021-02-21 19:40:35.215 7f6fa71de700 -1 
>> /build/ceph-14.2.10/src/os/bluestore/KernelDevice.cc: In function 'void 
>> KernelDevice::_aio_thread()' thread 7f6fa71de700 time 2021-02-21 
>> 19:40:35.215716
>> Feb 21 19:40:35 ip ceph-osd[3228]: 
>> /build/ceph-14.2.10/src/os/bluestore/KernelDevice.cc: 534: 
>> ceph_abort_msg("Unexpected IO error. This may suggest a hardware issue. 
>> Please check your kernel log!")
>> Feb 21 19:40:35 ip ceph-osd[3228]:  ceph version 14.2.10 
>> (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)
>> Feb 21 19:40:35 ip ceph-osd[3201]: 2021-02-21 19:40:35.211 7fcf34f84700 -1 
>> bdev(0xd4b2700 /var/lib/ceph/osd/ceph-0/block.db) direct_read_unaligned 
>> stalled read  0xb299e8f61~1182 since 232531.455725s, timeout is 5.000000s
>> Feb 21 19:40:35 ip ceph-osd[3201]: 2021-02-21 19:40:35.211 7fcf34f84700 -1 
>> bdev(0xd4b2700 /var/lib/ceph/osd/ceph-0/block.db) direct_read_unaligned 
>> 0xb299e8f61~1182 error: (5) Input/output error
>> Feb 21 19:40:35 ip ceph-osd[3201]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: In function 'int 
>> BlueFS::_read_random(BlueFS::FileReader*, uint64_t, size_t, char*)' thread 
>> 7fcf34f84700 time 2021-02-21 19:40:35.215957
>> Feb 21 19:40:35 ip ceph-osd[3201]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: 1555: FAILED ceph_assert(r 
>> == 0)
>> Feb 21 19:40:35 ip ceph-osd[3201]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: In function 'int 
>> BlueFS::_read_random(BlueFS::FileReader*, uint64_t, size_t, char*)' thread 
>> 7fcf35f86700 time 2021-02-21 19:40:35.216371
>> Feb 21 19:40:35 ip ceph-osd[3201]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: 1555: FAILED ceph_assert(r 
>> == 0)
>> Feb 21 19:40:35 ip ceph-osd[3201]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: In function 'int 
>> BlueFS::_read_random(BlueFS::FileReader*, uint64_t, size_t, char*)' thread 
>> 7fcf35f86700 time 2021-02-21 19:40:35.216371
>> Feb 21 19:40:35 ip ceph-osd[3201]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: 1555: FAILED ceph_assert(r 
>> == 0)
>> Feb 21 19:40:35 ip ceph-osd[3201]: 2021-02-21 19:40:35.211 7fcf31f7e700 -1 
>> bdev(0xd4b2700 /var/lib/ceph/osd/ceph-0/block.db) direct_read_unaligned 
>> stalled read  0x1b25e6102~ff3 since 232572.688747s, timeout is 5.000000s
>> Feb 21 19:40:35 ip ceph-osd[3201]: 2021-02-21 19:40:35.211 7fcf35f86700 -1 
>> bdev(0xd4b2700 /var/lib/ceph/osd/ceph-0/block.db) direct_read_unaligned 
>> stalled read  0x1b2f26bd6~1015 since 232573.917283s, timeout is 5.000000s
>> Feb 21 19:40:35 ip ceph-osd[3201]: 2021-02-21 19:40:35.211 7fcf31f7e700 -1 
>> bdev(0xd4b2700 /var/lib/ceph/osd/ceph-0/block.db) direct_read_unaligned 
>> 0x1b25e6102~ff3 error: (5) Input/output error
>> Feb 21 19:40:35 ip ceph-osd[3201]: 2021-02-21 19:40:35.211 7fcf35f86700 -1 
>> bdev(0xd4b2700 /var/lib/ceph/osd/ceph-0/block.db) direct_read_unaligned 
>> 0x1b2f26bd6~1015 error: (5) Input/output error
>> Feb 21 19:40:35 ip ceph-osd[3201]: 2021-02-21 19:40:35.211 7fcf3077b700 -1 
>> bdev(0xd4b2700 /var/lib/ceph/osd/ceph-0/block.db) direct_read_unaligned 
>> stalled read  0x904fbb6a~101d since 232531.225342s, timeout is 5.000000s
>> Feb 21 19:40:35 ip ceph-osd[3241]: 
>> /build/ceph-14.2.10/src/os/bluestore/KernelDevice.cc: In function 'void 
>> KernelDevice::_aio_thread()' thread 7f0236585700 time 2021-02-21 
>> 19:40:35.215482
>> Feb 21 19:40:35 ip ceph-osd[3241]: 
>> /build/ceph-14.2.10/src/os/bluestore/KernelDevice.cc: 534: 
>> ceph_abort_msg("Unexpected IO error. This may suggest a hardware issue. 
>> Please check your kernel log!")
>> Feb 21 19:40:35 ip ceph-osd[3241]:  ceph version 14.2.10 
>> (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)
>> Feb 21 19:40:35 ip ceph-osd[3241]:  1: (ceph::__ceph_abort(char const*, int, 
>> char const*, std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > const&)+0xdf) [0x83eb8c]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  2: (KernelDevice::_aio_thread()+0xbc5) 
>> [0xe8d5b5]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  3: 
>> (KernelDevice::AioCompletionThread::entry()+0xd) [0xe8f3cd]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  4: (()+0x76ba) [0x7f0244a236ba]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  5: (clone()+0x6d) [0x7f024402a4dd]
>> Feb 21 19:40:35 ip ceph-osd[3241]: *** Caught signal (Aborted) **
>> Feb 21 19:40:35 ip ceph-osd[3241]:  in thread 7f0236585700 
>> thread_name:bstore_aio
>> Feb 21 19:40:35 ip ceph-osd[3241]: 2021-02-21 19:40:35.215 7f0236585700 -1 
>> /build/ceph-14.2.10/src/os/bluestore/KernelDevice.cc: In function 'void 
>> KernelDevice::_aio_thread()' thread 7f0236585700 time 2021-02-21 
>> 19:40:35.215482
>> Feb 21 19:40:35 ip ceph-osd[3241]: 
>> /build/ceph-14.2.10/src/os/bluestore/KernelDevice.cc: 534: 
>> ceph_abort_msg("Unexpected IO error. This may suggest a hardware issue. 
>> Please check your kernel log!")
>> Feb 21 19:40:35 ip ceph-osd[3241]:  ceph version 14.2.10 
>> (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)
>> Feb 21 19:40:35 ip ceph-osd[3241]:  1: (ceph::__ceph_abort(char const*, int, 
>> char const*, std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > const&)+0xdf) [0x83eb8c]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  2: (KernelDevice::_aio_thread()+0xbc5) 
>> [0xe8d5b5]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  3: 
>> (KernelDevice::AioCompletionThread::entry()+0xd) [0xe8f3cd]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  4: (()+0x76ba) [0x7f0244a236ba]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  5: (clone()+0x6d) [0x7f024402a4dd]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  1: (ceph::__ceph_abort(char const*, int, 
>> char const*, std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > const&)+0xdf) [0x83eb8c]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  2: (KernelDevice::_aio_thread()+0xbc5) 
>> [0xe8d5b5]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  3: 
>> (KernelDevice::AioCompletionThread::entry()+0xd) [0xe8f3cd]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  4: (()+0x76ba) [0x7f6fb567c6ba]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  5: (clone()+0x6d) [0x7f6fb4c834dd]
>> Feb 21 19:40:35 ip ceph-osd[3228]: *** Caught signal (Aborted) **
>> Feb 21 19:40:35 ip ceph-osd[3228]:  in thread 7f6fa71de700 
>> thread_name:bstore_aio
>> Feb 21 19:40:35 ip ceph-osd[3201]: 2021-02-21 19:40:35.211 7fcf3077b700 -1 
>> bdev(0xd4b2700 /var/lib/ceph/osd/ceph-0/block.db) direct_read_unaligned 
>> 0x904fbb6a~101d error: (5) Input/output error
>> Feb 21 19:40:35 ip ceph-osd[3201]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: In function 'int 
>> BlueFS::_read_random(BlueFS::FileReader*, uint64_t, size_t, char*)' thread 
>> 7fcf3077b700 time 2021-02-21 19:40:35.216680
>> Feb 21 19:40:35 ip ceph-osd[3201]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: 1555: FAILED ceph_assert(r 
>> == 0)
>> Feb 21 19:40:35 ip ceph-osd[3201]:  ceph version 14.2.10 
>> (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)
>> Feb 21 19:40:35 ip ceph-osd[3201]:  1: (ceph::__ceph_assert_fail(char 
>> const*, char const*, int, char const*)+0x152) [0x83e4e2]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  2: (ceph::__ceph_assertf_fail(char 
>> const*, char const*, int, char const*, char const*, ...)+0) [0x83e6bd]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  3: 
>> (BlueFS::_read_random(BlueFS::FileReader*, unsigned long, unsigned long, 
>> char*)+0xc9b) [0xe3eb8b]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  4: 
>> (BlueRocksRandomAccessFile::Read(unsigned long, unsigned long, 
>> rocksdb::Slice*, char*) const+0x20) [0xe6cd60]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  5: 
>> (rocksdb::RandomAccessFileReader::Read(unsigned long, unsigned long, 
>> rocksdb::Slice*, char*) const+0x476) [0x14c6966]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  6: 
>> (rocksdb::BlockFetcher::ReadBlockContents()+0x4d3) [0x1485723]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  7: 
>> (rocksdb::BlockBasedTable::MaybeReadBlockAndLoadToCache(rocksdb::FilePrefetchBuffer*,
>>  rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, 
>> rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, 
>> rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, bool, 
>> rocksdb::GetContext*)+0x4a3) [0x1474143]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  8: (rocksdb::DataBlockIter* 
>> rocksdb::BlockBasedTable::NewDataBlockIterator<rocksdb::DataBlockIter>(rocksdb::BlockBasedTable::Rep*,
>>  rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, 
>> rocksdb::DataBlockIter*, bool, bool, bool, rocksdb::GetContext*, 
>> rocksdb::Status, rocksdb::FilePrefetchBuffer*)+0x303) [0x1481853]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  9: 
>> (rocksdb::BlockBasedTable::Get(rocksdb::ReadOptions const&, rocksdb::Slice 
>> const&, rocksdb::GetContext*, rocksdb::SliceTransform const*, bool)+0x4c2) 
>> [0x147d342]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  10: 
>> (rocksdb::TableCache::Get(rocksdb::ReadOptions const&, 
>> rocksdb::InternalKeyComparator const&, rocksdb::FileMetaData const&, 
>> rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::SliceTransform const*, 
>> rocksdb::HistogramImpl*, bool, int)+0x187) [0x13ea3d7]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  11: 
>> (rocksdb::Version::Get(rocksdb::ReadOptions const&, rocksdb::LookupKey 
>> const&, rocksdb::PinnableSlice*, rocksdb::Status*, rocksdb::MergeContext*, 
>> unsigned long*, bool*, bool*, unsigned long*, rocksdb::ReadCallback*, 
>> bool*)+0x3f4) [0x1405214]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  12: 
>> (rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&, 
>> rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, 
>> rocksdb::PinnableSlice*, bool*, rocksdb::ReadCallback*, bool*)+0x5e1) 
>> [0x130e771]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  13: 
>> (rocksdb::DBImpl::Get(rocksdb::ReadOptions const&, 
>> rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, 
>> rocksdb::PinnableSlice*)+0x26) [0x130ef16]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  14: 
>> (rocksdb::DB::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, 
>> rocksdb::Slice const&, std::__cxx11::basic_string<char, 
>> std::char_traits<char>, std::allocator<char> >*)+0xae) [0x1327eae]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  15: 
>> (RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > const&, char const*, unsigned long, 
>> ceph::buffer::v14_2_0::list*)+0x31c) [0x12e623c]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  16: 
>> (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x1ed) [0xd3337d]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  17: 
>> (BlueStore::_txc_add_transaction(BlueStore::TransContext*, 
>> ObjectStore::Transaction*)+0x1cad) [0xd9cd5d]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  18: 
>> (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&,
>>  std::vector<ObjectStore::Transaction, 
>> std::allocator<ObjectStore::Transaction> >&, 
>> boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2aa) [0xda1b9a]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  19: (non-virtual thunk to 
>> PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, 
>> std::allocator<ObjectStore::Transaction> >&, 
>> boost::intrusive_ptr<OpRequest>)+0x54) [0xadfd54]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  20: 
>> (ReplicatedBackend::do_repop(boost::intrusive_ptr<OpRequest>)+0xb09) 
>> [0xbe0809]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  21: 
>> (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x1a7) 
>> [0xbeeb67]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  22: 
>> (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x97) [0xaf3487]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  23: 
>> (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, 
>> ThreadPool::TPHandle&)+0x705) [0xaa2105]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  24: 
>> (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, 
>> ThreadPool::TPHandle&)+0x1bf) [0x8c12af]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  25: (PGOpItem::run(OSD*, OSDShard*, 
>> boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x62) [0xb7a152]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  26: (OSD::ShardedOpWQ::_process(unsigned 
>> int, ceph::heartbeat_handle_d*)+0xbed) [0x8df07d]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  27: 
>> (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0xee700c]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  28: 
>> (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xeea1d0]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  29: (()+0x76ba) [0x7fcf6045c6ba]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  30: (clone()+0x6d) [0x7fcf5fa634dd]
>> Feb 21 19:40:35 ip ceph-osd[3201]: *** Caught signal (Aborted) **
>> Feb 21 19:40:35 ip ceph-osd[3201]:  in thread 7fcf35f86700 
>> thread_name:tp_osd_tp
>> Feb 21 19:40:35 ip ceph-osd[3201]: 2021-02-21 19:40:35.235 7fcf35f86700 -1 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: In function 'int 
>> BlueFS::_read_random(BlueFS::FileReader*, uint64_t, size_t, char*)' thread 
>> 7fcf3077b700 time 2021-02-21 19:40:35.216680
>> Feb 21 19:40:35 ip ceph-osd[3201]: 
>> /build/ceph-14.2.10/src/os/bluestore/BlueFS.cc: 1555: FAILED ceph_assert(r 
>> == 0)
>> Feb 21 19:40:35 ip ceph-osd[3201]:  ceph version 14.2.10 
>> (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)
>> Feb 21 19:40:35 ip ceph-osd[3201]:  1: (ceph::__ceph_assert_fail(char 
>> const*, char const*, int, char const*)+0x152) [0x83e4e2]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  2: (ceph::__ceph_assertf_fail(char 
>> const*, char const*, int, char const*, char const*, ...)+0) [0x83e6bd]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  3: 
>> (BlueFS::_read_random(BlueFS::FileReader*, unsigned long, unsigned long, 
>> char*)+0xc9b) [0xe3eb8b]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  4: 
>> (BlueRocksRandomAccessFile::Read(unsigned long, unsigned long, 
>> rocksdb::Slice*, char*) const+0x20) [0xe6cd60]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  5: 
>> (rocksdb::RandomAccessFileReader::Read(unsigned long, unsigned long, 
>> rocksdb::Slice*, char*) const+0x476) [0x14c6966]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  6: 
>> (rocksdb::BlockFetcher::ReadBlockContents()+0x4d3) [0x1485723]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  7: 
>> (rocksdb::BlockBasedTable::MaybeReadBlockAndLoadToCache(rocksdb::FilePrefetchBuffer*,
>>  rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, 
>> rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, 
>> rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, bool, 
>> rocksdb::GetContext*)+0x4a3) [0x1474143]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  8: (rocksdb::DataBlockIter* 
>> rocksdb::BlockBasedTable::NewDataBlockIterator<rocksdb::DataBlockIter>(rocksdb::BlockBasedTable::Rep*,
>>  rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, 
>> rocksdb::DataBlockIter*, bool, bool, bool, rocksdb::GetContext*, 
>> rocksdb::Status, rocksdb::FilePrefetchBuffer*)+0x303) [0x1481853]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  ceph version 14.2.10 
>> (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)
>> Feb 21 19:40:35 ip ceph-osd[3228]:  1: (ceph::__ceph_assert_fail(char 
>> const*, char const*, int, char const*)+0x152) [0x83e4e2]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  2: (ceph::__ceph_assertf_fail(char 
>> const*, char const*, int, char const*, char const*, ...)+0) [0x83e6bd]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  3: 
>> (BlueFS::_read_random(BlueFS::FileReader*, unsigned long, unsigned long, 
>> char*)+0xc9b) [0xe3eb8b]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  4: 
>> (BlueRocksRandomAccessFile::Read(unsigned long, unsigned long, 
>> rocksdb::Slice*, char*) const+0x20) [0xe6cd60]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  5: 
>> (rocksdb::RandomAccessFileReader::Read(unsigned long, unsigned long, 
>> rocksdb::Slice*, char*) const+0x476) [0x14c6966]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  6: 
>> (rocksdb::BlockFetcher::ReadBlockContents()+0x4d3) [0x1485723]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  7: 
>> (rocksdb::BlockBasedTable::MaybeReadBlockAndLoadToCache(rocksdb::FilePrefetchBuffer*,
>>  rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, 
>> rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, 
>> rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, bool, 
>> rocksdb::GetContext*)+0x4a3) [0x1474143]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  8: (rocksdb::DataBlockIter* 
>> rocksdb::BlockBasedTable::NewDataBlockIterator<rocksdb::DataBlockIter>(rocksdb::BlockBasedTable::Rep*,
>>  rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, 
>> rocksdb::DataBlockIter*, bool, bool, bool, rocksdb::GetContext*, 
>> rocksdb::Status, rocksdb::FilePrefetchBuffer*)+0x303) [0x1481853]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  9: 
>> (rocksdb::BlockBasedTable::Get(rocksdb::ReadOptions const&, rocksdb::Slice 
>> const&, rocksdb::GetContext*, rocksdb::SliceTransform const*, bool)+0x4c2) 
>> [0x147d342]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  10: 
>> (rocksdb::TableCache::Get(rocksdb::ReadOptions const&, 
>> rocksdb::InternalKeyComparator const&, rocksdb::FileMetaData const&, 
>> rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::SliceTransform const*, 
>> rocksdb::HistogramImpl*, bool, int)+0x187) [0x13ea3d7]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  11: 
>> (rocksdb::Version::Get(rocksdb::ReadOptions const&, rocksdb::LookupKey 
>> const&, rocksdb::PinnableSlice*, rocksdb::Status*, rocksdb::MergeContext*, 
>> unsigned long*, bool*, bool*, unsigned long*, rocksdb::ReadCallback*, 
>> bool*)+0x3f4) [0x1405214]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  ceph version 14.2.10 
>> (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)
>> Feb 21 19:40:35 ip ceph-osd[3241]:  1: (ceph::__ceph_assert_fail(char 
>> const*, char const*, int, char const*)+0x152) [0x83e4e2]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  2: (ceph::__ceph_assertf_fail(char 
>> const*, char const*, int, char const*, char const*, ...)+0) [0x83e6bd]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  3: 
>> (BlueFS::_read_random(BlueFS::FileReader*, unsigned long, unsigned long, 
>> char*)+0xc9b) [0xe3eb8b]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  4: 
>> (BlueRocksRandomAccessFile::Read(unsigned long, unsigned long, 
>> rocksdb::Slice*, char*) const+0x20) [0xe6cd60]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  5: 
>> (rocksdb::RandomAccessFileReader::Read(unsigned long, unsigned long, 
>> rocksdb::Slice*, char*) const+0x476) [0x14c6966]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  6: 
>> (rocksdb::BlockFetcher::ReadBlockContents()+0x4d3) [0x1485723]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  7: 
>> (rocksdb::BlockBasedTable::MaybeReadBlockAndLoadToCache(rocksdb::FilePrefetchBuffer*,
>>  rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, 
>> rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, 
>> rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, bool, 
>> rocksdb::GetContext*)+0x4a3) [0x1474143]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  8: (rocksdb::DataBlockIter* 
>> rocksdb::BlockBasedTable::NewDataBlockIterator<rocksdb::DataBlockIter>(rocksdb::BlockBasedTable::Rep*,
>>  rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, 
>> rocksdb::DataBlockIter*, bool, bool, bool, rocksdb::GetContext*, 
>> rocksdb::Status, rocksdb::FilePrefetchBuffer*)+0x303) [0x1481853]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  9: 
>> (rocksdb::BlockBasedTable::Get(rocksdb::ReadOptions const&, rocksdb::Slice 
>> const&, rocksdb::GetContext*, rocksdb::SliceTransform const*, bool)+0x4c2) 
>> [0x147d342]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  10: 
>> (rocksdb::TableCache::Get(rocksdb::ReadOptions const&, 
>> rocksdb::InternalKeyComparator const&, rocksdb::FileMetaData const&, 
>> rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::SliceTransform const*, 
>> rocksdb::HistogramImpl*, bool, int)+0x187) [0x13ea3d7]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  11: 
>> (rocksdb::Version::Get(rocksdb::ReadOptions const&, rocksdb::LookupKey 
>> const&, rocksdb::PinnableSlice*, rocksdb::Status*, rocksdb::MergeContext*, 
>> unsigned long*, bool*, bool*, unsigned long*, rocksdb::ReadCallback*, 
>> bool*)+0x3f4) [0x1405214]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  9: 
>> (rocksdb::BlockBasedTable::Get(rocksdb::ReadOptions const&, rocksdb::Slice 
>> const&, rocksdb::GetContext*, rocksdb::SliceTransform const*, bool)+0x4c2) 
>> [0x147d342]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  10: 
>> (rocksdb::TableCache::Get(rocksdb::ReadOptions const&, 
>> rocksdb::InternalKeyComparator const&, rocksdb::FileMetaData const&, 
>> rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::SliceTransform const*, 
>> rocksdb::HistogramImpl*, bool, int)+0x187) [0x13ea3d7]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  11: 
>> (rocksdb::Version::Get(rocksdb::ReadOptions const&, rocksdb::LookupKey 
>> const&, rocksdb::PinnableSlice*, rocksdb::Status*, rocksdb::MergeContext*, 
>> unsigned long*, bool*, bool*, unsigned long*, rocksdb::ReadCallback*, 
>> bool*)+0x3f4) [0x1405214]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  12: 
>> (rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&, 
>> rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, 
>> rocksdb::PinnableSlice*, bool*, rocksdb::ReadCallback*, bool*)+0x5e1) 
>> [0x130e771]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  13: 
>> (rocksdb::DBImpl::Get(rocksdb::ReadOptions const&, 
>> rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, 
>> rocksdb::PinnableSlice*)+0x26) [0x130ef16]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  14: 
>> (rocksdb::DB::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, 
>> rocksdb::Slice const&, std::__cxx11::basic_string<char, 
>> std::char_traits<char>, std::allocator<char> >*)+0xae) [0x1327eae]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  15: 
>> (RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > const&, char const*, unsigned long, 
>> ceph::buffer::v14_2_0::list*)+0x31c) [0x12e623c]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  16: 
>> (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x1ed) [0xd3337d]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  17: 
>> (BlueStore::_txc_add_transaction(BlueStore::TransContext*, 
>> ObjectStore::Transaction*)+0x1cad) [0xd9cd5d]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  18: 
>> (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&,
>>  std::vector<ObjectStore::Transaction, 
>> std::allocator<ObjectStore::Transaction> >&, 
>> boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2aa) [0xda1b9a]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  19: (non-virtual thunk to 
>> PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, 
>> std::allocator<ObjectStore::Transaction> >&, 
>> boost::intrusive_ptr<OpRequest>)+0x54) [0xadfd54]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  12: 
>> (rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&, 
>> rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, 
>> rocksdb::PinnableSlice*, bool*, rocksdb::ReadCallback*, bool*)+0x5e1) 
>> [0x130e771]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  13: 
>> (rocksdb::DBImpl::Get(rocksdb::ReadOptions const&, 
>> rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, 
>> rocksdb::PinnableSlice*)+0x26) [0x130ef16]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  14: 
>> (rocksdb::DB::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, 
>> rocksdb::Slice const&, std::__cxx11::basic_string<char, 
>> std::char_traits<char>, std::allocator<char> >*)+0xae) [0x1327eae]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  15: 
>> (RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > const&, char const*, unsigned long, 
>> ceph::buffer::v14_2_0::list*)+0x31c) [0x12e623c]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  16: 
>> (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x1ed) [0xd3337d]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  17: 
>> (BlueStore::getattr(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, 
>> ghobject_t const&, char const*, ceph::buffer::v14_2_0::ptr&)+0xe3) [0xd4f803]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  18: 
>> (PGBackend::objects_get_attr(hobject_t const&, 
>> std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > const&, ceph::buffer::v14_2_0::list*)+0xc7) [0xaebc17]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  19: 
>> (PrimaryLogPG::get_object_context(hobject_t const&, bool, 
>> std::map<std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> >, ceph::buffer::v14_2_0::list, 
>> std::less<std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > >, 
>> std::allocator<std::pair<std::__cxx11::basic_string<char, 
>> std::char_traits<char>, std::allocator<char> > const, 
>> ceph::buffer::v14_2_0::list> > > const*)+0x64f) [0xa409bf]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  20: 
>> (PrimaryLogPG::find_object_context(hobject_t const&, 
>> std::shared_ptr<ObjectContext>*, bool, bool, hobject_t*)+0x250) [0xa452d0]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  21: 
>> (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2c71) [0xa9fbb1]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  22: 
>> (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, 
>> ThreadPool::TPHandle&)+0xd08) [0xaa2708]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  12: 
>> (rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&, 
>> rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, 
>> rocksdb::PinnableSlice*, bool*, rocksdb::ReadCallback*, bool*)+0x5e1) 
>> [0x130e771]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  13: 
>> (rocksdb::DBImpl::Get(rocksdb::ReadOptions const&, 
>> rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, 
>> rocksdb::PinnableSlice*)+0x26) [0x130ef16]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  14: 
>> (rocksdb::DB::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, 
>> rocksdb::Slice const&, std::__cxx11::basic_string<char, 
>> std::char_traits<char>, std::allocator<char> >*)+0xae) [0x1327eae]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  15: 
>> (RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > const&, char const*, unsigned long, 
>> ceph::buffer::v14_2_0::list*)+0x31c) [0x12e623c]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  16: 
>> (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x1ed) [0xd3337d]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  17: 
>> (BlueStore::getattr(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, 
>> ghobject_t const&, char const*, ceph::buffer::v14_2_0::ptr&)+0xe3) [0xd4f803]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  18: 
>> (PGBackend::objects_get_attr(hobject_t const&, 
>> std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > const&, ceph::buffer::v14_2_0::list*)+0xc7) [0xaebc17]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  19: 
>> (PrimaryLogPG::get_object_context(hobject_t const&, bool, 
>> std::map<std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> >, ceph::buffer::v14_2_0::list, 
>> std::less<std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > >, 
>> std::allocator<std::pair<std::__cxx11::basic_string<char, 
>> std::char_traits<char>, std::allocator<char> > const, 
>> ceph::buffer::v14_2_0::list> > > const*)+0x64f) [0xa409bf]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  20: 
>> (PrimaryLogPG::find_object_context(hobject_t const&, 
>> std::shared_ptr<ObjectContext>*, bool, bool, hobject_t*)+0x250) [0xa452d0]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  21: 
>> (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2c71) [0xa9fbb1]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  22: 
>> (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, 
>> ThreadPool::TPHandle&)+0xd08) [0xaa2708]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  20: 
>> (ReplicatedBackend::do_repop(boost::intrusive_ptr<OpRequest>)+0xb09) 
>> [0xbe0809]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  21: 
>> (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x1a7) 
>> [0xbeeb67]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  22: 
>> (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x97) [0xaf3487]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  23: 
>> (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, 
>> ThreadPool::TPHandle&)+0x705) [0xaa2105]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  24: 
>> (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, 
>> ThreadPool::TPHandle&)+0x1bf) [0x8c12af]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  25: (PGOpItem::run(OSD*, OSDShard*, 
>> boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x62) [0xb7a152]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  26: (OSD::ShardedOpWQ::_process(unsigned 
>> int, ceph::heartbeat_handle_d*)+0xbed) [0x8df07d]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  27: 
>> (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0xee700c]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  28: 
>> (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xeea1d0]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  29: (()+0x76ba) [0x7fcf6045c6ba]
>> Feb 21 19:40:35 ip ceph-osd[3201]:  30: (clone()+0x6d) [0x7fcf5fa634dd]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  23: 
>> (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, 
>> ThreadPool::TPHandle&)+0x1bf) [0x8c12af]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  24: (PGOpItem::run(OSD*, OSDShard*, 
>> boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x62) [0xb7a152]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  25: (OSD::ShardedOpWQ::_process(unsigned 
>> int, ceph::heartbeat_handle_d*)+0xbed) [0x8df07d]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  26: 
>> (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0xee700c]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  27: 
>> (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xeea1d0]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  28: (()+0x76ba) [0x7f6fb567c6ba]
>> Feb 21 19:40:35 ip ceph-osd[3228]:  29: (clone()+0x6d) [0x7f6fb4c834dd]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  23: 
>> (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, 
>> ThreadPool::TPHandle&)+0x1bf) [0x8c12af]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  24: (PGOpItem::run(OSD*, OSDShard*, 
>> boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x62) [0xb7a152]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  25: (OSD::ShardedOpWQ::_process(unsigned 
>> int, ceph::heartbeat_handle_d*)+0xbed) [0x8df07d]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  26: 
>> (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0xee700c]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  27: 
>> (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xeea1d0]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  28: (()+0x76ba) [0x7f0244a236ba]
>> Feb 21 19:40:35 ip ceph-osd[3241]:  29: (clone()+0x6d) [0x7f024402a4dd]
>> 
>> 
>> From /var/log/ceph/ceph-osd-16.log
>> ** File Read Latency Histogram By Level [default] **
>> 
>> 2021-02-21 19:38:36.218 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868) initiated 2021-02-21 
>> 19:38:05.293724 currently waiting for sub ops
>> 2021-02-21 19:38:36.218 7f023b58f700 -1 osd.16 7868 get_health_metrics 
>> reporting 1 slow ops, oldest is osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> 2021-02-21 19:38:37.254 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868) initiated 2021-02-21 
>> 19:38:05.293724 currently waiting for sub ops
>> 2021-02-21 19:38:37.254 7f023b58f700 -1 osd.16 7868 get_health_metrics 
>> reporting 1 slow ops, oldest is osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> 2021-02-21 19:38:38.286 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868) initiated 2021-02-21 
>> 19:38:05.293724 currently waiting for sub ops
>> 2021-02-21 19:38:38.286 7f023b58f700 -1 osd.16 7868 get_health_metrics 
>> reporting 1 slow ops, oldest is osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> 2021-02-21 19:38:39.334 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868) initiated 2021-02-21 
>> 19:38:05.293724 currently waiting for sub ops
>> 2021-02-21 19:38:39.334 7f023b58f700 -1 osd.16 7868 get_health_metrics 
>> reporting 1 slow ops, oldest is osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> 2021-02-21 19:38:40.322 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868) initiated 2021-02-21 
>> 19:38:05.293724 currently waiting for sub ops
>> 2021-02-21 19:38:40.322 7f023b58f700 -1 osd.16 7868 get_health_metrics 
>> reporting 1 slow ops, oldest is osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> 2021-02-21 19:38:41.326 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868) initiated 2021-02-21 
>> 19:38:05.293724 currently waiting for sub ops
>> 2021-02-21 19:38:41.326 7f023b58f700 -1 osd.16 7868 get_health_metrics 
>> reporting 1 slow ops, oldest is osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> 2021-02-21 19:38:42.298 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868) initiated 2021-02-21 
>> 19:38:05.293724 currently waiting for sub ops
>> 2021-02-21 19:38:42.298 7f023b58f700 -1 osd.16 7868 get_health_metrics 
>> reporting 1 slow ops, oldest is osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> 2021-02-21 19:38:43.258 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868) initiated 2021-02-21 
>> 19:38:05.293724 currently waiting for sub ops
>> 2021-02-21 19:38:43.258 7f023b58f700 -1 osd.16 7868 get_health_metrics 
>> reporting 1 slow ops, oldest is osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> 2021-02-21 19:38:44.258 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868) initiated 2021-02-21 
>> 19:38:05.293724 currently waiting for sub ops
>> 2021-02-21 19:38:44.258 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_repop(mds.3.49311:354864012 2.be e7868/7857 
>> 2:7d8b5a59:::203.00123c76:head v 7868'5766635) initiated 2021-02-21 
>> 19:38:13.743464 currently started
>> 2021-02-21 19:38:44.258 7f023b58f700 -1 osd.16 7868 get_health_metrics 
>> reporting 2 slow ops, oldest is osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> 2021-02-21 19:38:45.286 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868) initiated 2021-02-21 
>> 19:38:05.293724 currently waiting for sub ops
>> 2021-02-21 19:38:45.286 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_repop(mds.3.49311:354864012 2.be e7868/7857 
>> 2:7d8b5a59:::203.00123c76:head v 7868'5766635) initiated 2021-02-21 
>> 19:38:13.743464 currently started
>> 2021-02-21 19:38:45.286 7f023b58f700 -1 osd.16 7868 get_health_metrics 
>> reporting 2 slow ops, oldest is osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> 2021-02-21 19:38:46.254 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868) initiated 2021-02-21 
>> 19:38:05.293724 currently waiting for sub ops
>> 2021-02-21 19:38:46.254 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_repop(mds.3.49311:354864012 2.be e7868/7857 
>> 2:7d8b5a59:::203.00123c76:head v 7868'5766635) initiated 2021-02-21 
>> 19:38:13.743464 currently started
>> 2021-02-21 19:38:46.254 7f023b58f700 -1 osd.16 7868 get_health_metrics 
>> reporting 2 slow ops, oldest is osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> 2021-02-21 19:38:47.226 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868) initiated 2021-02-21 
>> 19:38:05.293724 currently waiting for sub ops
>> 2021-02-21 19:38:47.226 7f023b58f700  0 log_channel(cluster) log [WRN] : 
>> slow request osd_repop(mds.3.49311:354864012 2.be e7868/7857 
>> 2:7d8b5a59:::203.00123c76:head v 7868'5766635) initiated 2021-02-21 
>> 19:38:13.743464 currently started
>> 2021-02-21 19:38:47.226 7f023b58f700 -1 osd.16 7868 get_health_metrics 
>> reporting 2 slow ops, oldest is osd_op(mds.1.51327:1034954064 2.80 
>> 2:018429c8:::2002458b0ca.00000000:head [create,setxattr parent 
>> (357),setxattr layout (30)] snapc 0=[] 
>> ondisk+write+known_if_redirected+full_force e7868)
>> 2021-02-21 19:39:01.150 7f0241749700  1 heartbeat_map is_healthy 
>> 'OSD::osd_op_tp thread 0x7f021a54d700' had timed out after 15
>> 2021-02-21 19:39:01.150 7f0241f4a700  1 heartbeat_map is_healthy 
>> 'OSD::osd_op_tp thread 0x7f021a54d700' had timed out after 15
>> 2021-02-21 19:39:01.330 7f0241f4a700  1 heartbeat_map is_healthy 
>> 'OSD::osd_op_tp thread 0x7f021a54d700' had timed out after 15
>> 2021-02-21 19:39:01.330 7f0240f48700  1 heartbeat_map is_healthy 
>> 'OSD::osd_op_tp thread 0x7f021a54d700' had timed out after 15
>> ...
>> 2021-02-21 19:40:35.099 7f0240f48700  1 heartbeat_map is_healthy 
>> 'OSD::osd_op_tp thread 0x7f0216d46700' had timed out after 15
>> 2021-02-21 19:40:35.099 7f0240f48700  1 heartbeat_map is_healthy 
>> 'OSD::osd_op_tp thread 0x7f021a54d700' had timed out after 15
>> 2021-02-21 19:40:35.211 7f0236585700 -1 bdev(0xc3f0a80 
>> /var/lib/ceph/osd/ceph-16/block.wal) _aio_thread got r=-5 ((5) Input/output 
>> error)
>> 2021-02-21 19:40:35.211 7f021a54d700 -1 bdev(0xc3f0700 
>> /var/lib/ceph/osd/ceph-16/block.db) direct_read_unaligned stalled read  
>> 0x611527d7~fe6 since 232570.311299s, timeout is 5.000000s
>> 2021-02-21 19:40:35.211 7f021a54d700 -1 bdev(0xc3f0700 
>> /var/lib/ceph/osd/ceph-16/block.db) direct_read_unaligned 0x611527d7~fe6 
>> error: (5) Input/output error
>> 2021-02-21 19:40:35.215 7f0236585700 -1 
>> /build/ceph-14.2.10/src/os/bluestore/KernelDevice.cc: In function 'void 
>> KernelDevice::_aio_thread()' thread 7f0236585700 time 2021-02-21 
>> 19:40:35.215482
>> /build/ceph-14.2.10/src/os/bluestore/KernelDevice.cc: 534: 
>> ceph_abort_msg("Unexpected IO error. This may suggest a hardware issue. 
>> Please check your kernel log!")
>> 
>> ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus 
>> (stable)
>> 1: (ceph::__ceph_abort(char const*, int, char const*, 
>> std::__cxx11::basic_string<char, std::char_traits<char>, 
>> std::allocator<char> > const&)+0xdf) [0x83eb8c]
>> 2: (KernelDevice::_aio_thread()+0xbc5) [0xe8d5b5]
>> 3: (KernelDevice::AioCompletionThread::entry()+0xd) [0xe8f3cd]
>> 4: (()+0x76ba) [0x7f0244a236ba]
>> 5: (clone()+0x6d) [0x7f024402a4dd]
>> 
>> Thanks,
>> Xiong
>> 
>>> On Feb 19, 2021, at 10:34 PM, zxcs <zhuxion...@163.com> wrote:
>>> 
>>> You mean the OS? It's Ubuntu 16.04, and the NVMe is a Samsung 970 PRO 1TB.
>>> 
>>> Thanks,
>>> zx
>>> 
>>>> On Feb 19, 2021, at 6:56 PM, Konstantin Shalygin <k0...@k0ste.ru> wrote:
>>>> 
>>>> Looks good, what is your hardware? Server model & NVMes?
>>>> 
>>>> 
>>>> 
>>>> k
>>>> 
>>>>> On 19 Feb 2021, at 13:22, zxcs <zhuxion...@163.com> wrote:
>>>>> 
>>>>> BTW, I actually have two nodes with the same issue; the other failing node's 
>>>>> NVMe output is below:
>>>>> 
>>>>> Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
>>>>> critical_warning                    : 0
>>>>> temperature                         : 29 C
>>>>> available_spare                     : 100%
>>>>> available_spare_threshold           : 10%
>>>>> percentage_used                     : 1%
>>>>> data_units_read                     : 592,340,175
>>>>> data_units_written                  : 26,443,352
>>>>> host_read_commands                  : 5,341,278,662
>>>>> host_write_commands                 : 515,730,885
>>>>> controller_busy_time                : 14,052
>>>>> power_cycles                        : 8
>>>>> power_on_hours                      : 4,294
>>>>> unsafe_shutdowns                    : 6
>>>>> media_errors                        : 0
>>>>> num_err_log_entries                 : 0
>>>>> Warning Temperature Time            : 0
>>>>> Critical Composite Temperature Time : 0
>>>>> Temperature Sensor 1                : 29 C
>>>>> Temperature Sensor 2                : 46 C
>>>>> Temperature Sensor 3                : 0 C
>>>>> Temperature Sensor 4                : 0 C
>>>>> Temperature Sensor 5                : 0 C
>>>>> Temperature Sensor 6                : 0 C
>>>>> Temperature Sensor 7                : 0 C
>>>>> Temperature Sensor 8                : 0 C
>>>>> 
>>>>> 
>>>>> For compare, i get one healthy node’s nvme output as below:
>>>>> 
>>>>> Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
>>>>> critical_warning                    : 0
>>>>> temperature                         : 27 C
>>>>> available_spare                     : 100%
>>>>> available_spare_threshold           : 10%
>>>>> percentage_used                     : 1%
>>>>> data_units_read                     : 579,829,652
>>>>> data_units_written                  : 28,271,336
>>>>> host_read_commands                  : 5,237,750,233
>>>>> host_write_commands                 : 518,979,861
>>>>> controller_busy_time                : 14,166
>>>>> power_cycles                        : 3
>>>>> power_on_hours                      : 4,252
>>>>> unsafe_shutdowns                    : 1
>>>>> media_errors                        : 0
>>>>> num_err_log_entries                 : 0
>>>>> Warning Temperature Time            : 0
>>>>> Critical Composite Temperature Time : 0
>>>>> Temperature Sensor 1                : 27 C
>>>>> Temperature Sensor 2                : 39 C
>>>>> Temperature Sensor 3                : 0 C
>>>>> Temperature Sensor 4                : 0 C
>>>>> Temperature Sensor 5                : 0 C
>>>>> Temperature Sensor 6                : 0 C
>>>>> Temperature Sensor 7                : 0 C
>>>>> Temperature Sensor 8                : 0 C
>>>> 
>>> 

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
