Re: [ceph-users] Slow requests from bluestore osds / crashing rbd-nbd
Got it. Thanks for the explanation, Jason!

Kind regards,

Charles Alva
Sent from Gmail Mobile

On Tue, May 21, 2019 at 5:16 PM Jason Dillaman wrote:
> On Tue, May 21, 2019 at 12:03 PM Charles Alva wrote:
> >
> > Hi Jason,
> >
> > Should we disable fstrim services inside VMs that run on top of RBD?
>
> It has the potential to be a thundering-herd issue if you have lots of
> VMs all issuing discards at the same time and your RBD images do not
> have object-map enabled. With object-map enabled, the discards will
> just get ignored if the backing objects do not already exist. You could
> hit a similar issue if you have hundreds or thousands of VMs all
> running a scheduled IO-heavy task at the same time (e.g. a yum/apt
> update every week at midnight), so it's not really tied to discard (or
> even Ceph/RBD) but more of a peak IOPS capacity issue.
>
> [snip]
Re: [ceph-users] Slow requests from bluestore osds / crashing rbd-nbd
On Tue, May 21, 2019 at 12:03 PM Charles Alva wrote:
>
> Hi Jason,
>
> Should we disable fstrim services inside VMs that run on top of RBD?

It has the potential to be a thundering-herd issue if you have lots of
VMs all issuing discards at the same time and your RBD images do not have
object-map enabled. With object-map enabled, the discards will just get
ignored if the backing objects do not already exist. You could hit a
similar issue if you have hundreds or thousands of VMs all running a
scheduled IO-heavy task at the same time (e.g. a yum/apt update every
week at midnight), so it's not really tied to discard (or even Ceph/RBD)
but more of a peak IOPS capacity issue.

> I recall Ubuntu has a weekly fstrim cron job enabled by default, while
> we have to enable the fstrim service manually on Debian and CentOS.
>
> Kind regards,
>
> Charles Alva
> Sent from Gmail Mobile
>
> [snip]
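As a hedged sketch of the object-map suggestion above: on an existing
image, the feature can usually be enabled online with the rbd CLI.
`mypool/vm-disk-1` is a placeholder image name, and exact feature
dependencies should be verified against your Ceph release:

```shell
# object-map requires exclusive-lock; enable both (plus fast-diff),
# then rebuild the map so it records which backing objects exist.
rbd feature enable mypool/vm-disk-1 exclusive-lock object-map fast-diff
rbd object-map rebuild mypool/vm-disk-1
rbd info mypool/vm-disk-1 | grep features   # confirm the new feature bits
```

With the object map in place, discards against never-written regions can
be answered without sending remove/truncate ops to the OSDs.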
Re: [ceph-users] Slow requests from bluestore osds / crashing rbd-nbd
Hi Jason,

Should we disable fstrim services inside VMs that run on top of RBD?

I recall Ubuntu has a weekly fstrim cron job enabled by default, while we
have to enable the fstrim service manually on Debian and CentOS.

Kind regards,

Charles Alva
Sent from Gmail Mobile

On Tue, May 21, 2019, 4:49 AM Jason Dillaman wrote:
> [snip]
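For reference, whether a given guest actually schedules weekly trims can
be checked with standard systemd commands (output varies by distro;
Debian/CentOS may ship the unit disabled, as noted above):

```shell
systemctl is-enabled fstrim.timer          # "enabled" on stock Ubuntu
systemctl list-timers fstrim.timer --all   # next/last scheduled run
systemctl cat fstrim.timer                 # stock unit uses OnCalendar=weekly
```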
Re: [ceph-users] Slow requests from bluestore osds / crashing rbd-nbd
On Tue, May 21, 2019 at 11:28 AM Marc Schöchlin wrote:
>
> Hello Jason,
>
> [snip]
>
> From my point of view, the error behavior is currently reproducible
> with good probability.
> Do you have suggestions how to find the root cause of this problem?

Can you provide any logs/backtraces/core dumps from the rbd-nbd process?

> Regards
> Marc

--
Jason
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
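One low-effort way to capture what Jason asks for is to raise the
client-side debug levels for the rbd-nbd process before the next Monday
window, e.g. via a ceph.conf fragment on the affected client. The levels
and log path below are illustrative, not a recommendation for permanent
use (level 20 logging is very verbose):

```ini
[client]
    log file = /var/log/ceph/$name.$pid.log
    debug rbd = 20
    debug rados = 20
    debug objectcacher = 20
```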
Re: [ceph-users] Slow requests from bluestore osds / crashing rbd-nbd
Hello Jason,

On 20.05.19 at 23:49, Jason Dillaman wrote:
> That's probably not a good practice if you have lots of VMs doing this
> at the same time *and* you are not using object-map. The reason is
> that "fstrim" could discard huge extents that result in around a
> thousand concurrent remove/truncate/zero ops per image being thrown at
> your cluster.

Sure, currently we do not have lots of VMs that are capable of running
fstrim on RBD volumes. But the RBD images already involved are
multi-terabyte images with a high write/deletion rate. Therefore I am
already in the process of spreading the fstrims out by adding random
delays.

> If the cluster was being DDoSed by the fstrims, the VM OSes might have
> timed out, thinking it was a controller failure.

Yes and no :-) Probably my problem is related to the kernel release,
kernel settings or the operating system release. Why?

* we run ~800 RBD images on that Ceph cluster with rbd-nbd 12.2.5 in our
  Xen cluster as dom0 storage repository devices, without any timeout
  problems (kernel 4.4.0+10, CentOS 7)
* we run some 35 TB kRBD images with multiples of the write/read/deletion
  load of the crashed rbd-nbd, without any timeout problems
* the timeout problem appears on two VMs (Ubuntu 18.04, Ubuntu 16.04)
  which use the described settings

From my point of view, the error behavior is currently reproducible with
good probability. Do you have suggestions how to find the root cause of
this problem?

Regards
Marc
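The random-delay idea can also be expressed natively in systemd instead
of patching scripts: timer units honor RandomizedDelaySec, so a drop-in
for fstrim.timer spreads the Monday 00:00 starts across a window. A
sketch, with an arbitrary 6-hour window:

```shell
# Create a drop-in that delays each guest's fstrim start by a uniformly
# random amount up to 6 hours, decorrelating the VMs from each other.
mkdir -p /etc/systemd/system/fstrim.timer.d
cat > /etc/systemd/system/fstrim.timer.d/spread.conf <<'EOF'
[Timer]
RandomizedDelaySec=6h
EOF
# systemctl daemon-reload   # then reload units on the real host
```

Each timer picks a fresh random delay per activation, so the herd stays
spread out week after week.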
Re: [ceph-users] Slow requests from bluestore osds / crashing rbd-nbd
On Mon, May 20, 2019 at 2:17 PM Marc Schöchlin wrote:
>
> Hello cephers,
>
> we have a few systems which utilize an rbd-nbd map/mount to get access
> to an RBD volume.
> (This problem seems to be related to "[ceph-users] Slow requests from
> bluestore osds" (the original thread))
>
> Unfortunately the rbd-nbd device of a system has crashed three Mondays
> in a row at ~00:00, when the systemd fstrim timer executes "fstrim -av"
> (which runs in parallel to deep scrub operations).

That's probably not a good practice if you have lots of VMs doing this
at the same time *and* you are not using object-map. The reason is that
"fstrim" could discard huge extents that result in around a thousand
concurrent remove/truncate/zero ops per image being thrown at your
cluster.

> After that the device constantly reports I/O errors every time an
> access to the filesystem happens.
> Unmounting, remapping and mounting helped to get the filesystem/device
> back into business :-)

If the cluster was being DDoSed by the fstrims, the VM OSes might have
timed out, thinking it was a controller failure.

> [snip]
Re: [ceph-users] Slow requests from bluestore osds / crashing rbd-nbd
Hello cephers,

we have a few systems which utilize an rbd-nbd map/mount to get access to
an RBD volume.
(This problem seems to be related to "[ceph-users] Slow requests from
bluestore osds" (the original thread))

Unfortunately the rbd-nbd device of a system has crashed three Mondays in
a row at ~00:00, when the systemd fstrim timer executes "fstrim -av"
(which runs in parallel to deep scrub operations).

After that the device constantly reports I/O errors every time an access
to the filesystem happens. Unmounting, remapping and mounting helped to
get the filesystem/device back into business :-)

Manual 30-minute stress tests using the following fio command did not
produce any problems on the client side (the Ceph storage reported some
slow requests while testing):

fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 \
    --name=test --filename=test --bs=4k --iodepth=64 --size=4G \
    --readwrite=randrw --rwmixread=50 --numjobs=50 --loops=10

It seems that others have also experienced this problem:
https://ceph-users.ceph.narkive.com/2FIfyx1U/rbd-nbd-timeout-and-crash
The change for setting device timeouts does not seem to have been merged
into luminous. Experiments setting the timeout manually after mapping,
using
https://github.com/OnApp/nbd-kernel_mod/blob/master/nbd_set_timeout.c,
haven't changed the situation.

Do you have suggestions how to analyze/solve the situation?

Regards
Marc

--

The client kernel throws messages like this:

May 19 23:59:01 int-nfs-001 CRON[836295]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 60 2)
May 20 00:00:30 int-nfs-001 systemd[1]: Starting Discard unused blocks...
May 20 00:01:02 int-nfs-001 kernel: [1077851.623582] block nbd0: Connection timed out
May 20 00:01:02 int-nfs-001 kernel: [1077851.623613] block nbd0: shutting down sockets
May 20 00:01:02 int-nfs-001 kernel: [1077851.623617] print_req_error: I/O error, dev nbd0, sector 84082280
May 20 00:01:02 int-nfs-001 kernel: [1077851.623632] block nbd0: Connection timed out
May 20 00:01:02 int-nfs-001 kernel: [1077851.623636] print_req_error: I/O error, dev nbd0, sector 92470887
May 20 00:01:02 int-nfs-001 kernel: [1077851.623642] block nbd0: Connection timed out

Ceph throws messages like this:

2019-05-20 00:00:00.000124 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173572 : cluster [INF] overall HEALTH_OK
2019-05-20 00:00:54.249998 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173586 : cluster [WRN] Health check failed: 644 slow requests are blocked > 32 sec. Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:00.330566 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173587 : cluster [WRN] Health check update: 594 slow requests are blocked > 32 sec. Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:09.768476 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173591 : cluster [WRN] Health check update: 505 slow requests are blocked > 32 sec. Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:14.768769 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173592 : cluster [WRN] Health check update: 497 slow requests are blocked > 32 sec. Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:20.610398 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173593 : cluster [WRN] Health check update: 509 slow requests are blocked > 32 sec. Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:28.721891 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173594 : cluster [WRN] Health check update: 501 slow requests are blocked > 32 sec. Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:34.909842 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173596 : cluster [WRN] Health check update: 494 slow requests are blocked > 32 sec. Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:44.770330 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173597 : cluster [WRN] Health check update: 500 slow requests are blocked > 32 sec. Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:49.770625 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173599 : cluster [WRN] Health check update: 608 slow requests are blocked > 32 sec. Implicated osds 51 (REQUEST_SLOW)
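On the timeout front: rbd-nbd releases after luminous grew a map-time
option for this, which matches Marc's observation that the change was not
merged into luminous. The flag name changed across releases (earlier
builds used `--timeout`, later ones `--io-timeout`), so treat the exact
spelling as an assumption to verify with `rbd-nbd --help` on your build;
`mypool/vm-disk-1` is a placeholder:

```shell
# Map with a 120 s kernel-side I/O timeout instead of the default, so a
# burst of slow requests does not immediately shut down the nbd sockets.
rbd-nbd map --timeout 120 mypool/vm-disk-1   # flag name: check your build
```

On luminous itself, setting the timeout after mapping with an external
helper (such as the nbd_set_timeout tool linked above) remains the
workaround.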