Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery
On Thu, Oct 17, 2019 at 12:35 PM huxia...@horebdata.cn wrote:
>
> hello, Robert
>
> thanks for the quick reply. I did test with osd op queue = wpq and
> osd op queue cut off = high, together with:
>
> osd_recovery_op_priority = 1
> osd recovery delay start = 20
> osd recovery max active = 1
> osd recovery max chunk = 1048576
> osd recovery sleep = 1
> osd recovery sleep hdd = 1
> osd recovery sleep ssd = 1
> osd recovery sleep hybrid = 1
> osd recovery priority = 1
> osd max backfills = 1
> osd backfill scan max = 16
> osd backfill scan min = 4
> osd_op_thread_suicide_timeout = 300
>
> But the cluster still showed extremely heavy recovery activity at the
> beginning of the recovery, and only after about 5-10 minutes did the
> recovery gradually come under control. I guess this is quite similar to
> what you encountered in Nov. 2015.
>
> It is really annoying. What else can I do to mitigate this weird
> initial-recovery issue? Any suggestions are much appreciated.

Hmm, on our Luminous cluster we run the defaults apart from the op queue and
cut off settings, and bringing in a node has nearly zero impact on client
traffic. Those two options need to be set on all OSDs to be completely
effective. Maybe go back to the defaults for everything else?

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
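For what it's worth, a minimal sketch of what "going back to the defaults"
could look like on a Luminous cluster. The exact default values should be
confirmed with "ceph daemon osd.<id> config show" for your release; osd.* and
the option list below are only examples, not a complete or recommended set:

    # revert the runtime-changeable recovery throttles on all OSDs
    ceph tell osd.* injectargs '--osd_recovery_sleep 0 --osd_recovery_delay_start 0'

    # osd op queue / osd op queue cut off stay in ceph.conf and generally
    # need an OSD restart to take effect, e.g. one host at a time:
    systemctl restart ceph-osd.target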
Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery
hello, Robert

thanks for the quick reply. I did test with osd op queue = wpq and
osd op queue cut off = high, together with:

osd_recovery_op_priority = 1
osd recovery delay start = 20
osd recovery max active = 1
osd recovery max chunk = 1048576
osd recovery sleep = 1
osd recovery sleep hdd = 1
osd recovery sleep ssd = 1
osd recovery sleep hybrid = 1
osd recovery priority = 1
osd max backfills = 1
osd backfill scan max = 16
osd backfill scan min = 4
osd_op_thread_suicide_timeout = 300

But the cluster still showed extremely heavy recovery activity at the
beginning of the recovery, and only after about 5-10 minutes did the recovery
gradually come under control. I guess this is quite similar to what you
encountered in Nov. 2015.

It is really annoying. What else can I do to mitigate this weird
initial-recovery issue? Any suggestions are much appreciated.

thanks again,

samuel

huxia...@horebdata.cn

From: Robert LeBlanc
Date: 2019-10-17 21:23
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

On Thu, Oct 17, 2019 at 12:08 PM huxia...@horebdata.cn wrote:
>
> I happened to find a note that you wrote in Nov 2015:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-November/006173.html
> and I believe I just hit exactly the same behavior: a host going down takes
> client performance down to about 1/10 (with a 200 MB/s recovery workload),
> and it then takes ten minutes to get good control of OSD recovery.
>
> Could you please share how you eventually solved that issue? By setting a
> fairly large OSD recovery delay start, or some other parameter?

Wow! Dusting off the cobwebs here. I think this is what led me to dig into
the code and write the WPQ scheduler. I can't remember doing anything
specific. I'm sorry I'm not much help in this regard.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
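One way to confirm whether the wpq / cut off change really reached every
daemon is to read the values an OSD is actually running with from its admin
socket; a short sketch, where osd.0 is only a placeholder id:

    ceph daemon osd.0 config get osd_op_queue
    ceph daemon osd.0 config get osd_op_queue_cut_off
    # list every option that differs from the built-in default
    ceph daemon osd.0 config diff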
Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery
On Thu, Oct 17, 2019 at 12:08 PM huxia...@horebdata.cn wrote:
>
> I happened to find a note that you wrote in Nov 2015:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-November/006173.html
> and I believe I just hit exactly the same behavior: a host going down takes
> client performance down to about 1/10 (with a 200 MB/s recovery workload),
> and it then takes ten minutes to get good control of OSD recovery.
>
> Could you please share how you eventually solved that issue? By setting a
> fairly large OSD recovery delay start, or some other parameter?

Wow! Dusting off the cobwebs here. I think this is what led me to dig into
the code and write the WPQ scheduler. I can't remember doing anything
specific. I'm sorry I'm not much help in this regard.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery
Hello, Robert,

I happened to find a note that you wrote in Nov 2015:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-November/006173.html
and I believe I just hit exactly the same behavior: a host going down takes
client performance down to about 1/10 (with a 200 MB/s recovery workload),
and it then takes ten minutes to get good control of OSD recovery.

Could you please share how you eventually solved that issue? By setting a
fairly large OSD recovery delay start, or some other parameter?

best regards,

samuel

huxia...@horebdata.cn

From: Robert LeBlanc
Date: 2019-10-16 21:46
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

On Wed, Oct 16, 2019 at 11:53 AM huxia...@horebdata.cn wrote:
>
> My Ceph version is Luminous 12.2.12. Do you think I should upgrade to
> Nautilus, or will Nautilus have better control of recovery/backfilling?

We have a Jewel cluster and a Luminous cluster that we have changed these
settings on, and it really helped both of them.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery
On Wed, Oct 16, 2019 at 11:53 AM huxia...@horebdata.cn wrote:
>
> My Ceph version is Luminous 12.2.12. Do you think I should upgrade to
> Nautilus, or will Nautilus have better control of recovery/backfilling?

We have a Jewel cluster and a Luminous cluster that we have changed these
settings on, and it really helped both of them.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery
My Ceph version is Luminous 12.2.12. Do you think I should upgrade to
Nautilus, or will Nautilus have better control of recovery/backfilling?

best regards,

Samuel

huxia...@horebdata.cn

From: Robert LeBlanc
Date: 2019-10-14 16:27
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

On Thu, Oct 10, 2019 at 2:23 PM huxia...@horebdata.cn wrote:
>
> Hi, folks,
>
> I have a middle-size Ceph cluster as cinder backup for OpenStack (Queens).
> During testing, one Ceph node went down unexpectedly and powered up again
> about 10 minutes later, and the Ceph cluster started PG recovery. To my
> surprise, VM IOPS dropped dramatically during Ceph recovery, from about
> 13K IOPS to about 400, a factor of 1/30, even though I had put stringent
> throttling on backfill and recovery with the following ceph parameters:
>
> osd_max_backfills = 1
> osd_recovery_max_active = 1
> osd_client_op_priority = 63
> osd_recovery_op_priority = 1
> osd_recovery_sleep = 0.5
>
> The weirdest thing is:
> 1) When there is no IO activity from any VM (all VMs are quiet except the
> recovery IO), the recovery bandwidth is about 10 MiB/s, 2 objects/s. The
> recovery throttle settings seem to be working properly.
> 2) When running FIO inside a VM, the recovery bandwidth quickly climbs
> above 200 MiB/s, 60 objects/s, while FIO inside the VM only reaches about
> 400 IOPS (8 KiB block size), around 3 MiB/s. Clearly the recovery
> throttling DOES NOT work properly here.
> 3) If I stop the FIO test in the VM, the recovery bandwidth drops back to
> 10 MiB/s, 2 objects/s, strangely enough.
>
> How can this weird behavior happen? Is there a way to configure the
> recovery bandwidth to a specific value, or the number of recovery objects
> per second? That would give better control of backfilling/recovery than
> the faulty logic of relative osd_client_op_priority vs
> osd_recovery_op_priority.
>
> Any ideas or suggestions to keep the recovery under control?
>
> best regards,
>
> Samuel

Not sure which version of Ceph you are on, but add these to your
/etc/ceph/ceph.conf on all your OSDs and restart them:

osd op queue = wpq
osd op queue cut off = high

That should really help and make backfills and recovery non-impactful. This
will be the default in Octopus.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery
On Thu, Oct 10, 2019 at 2:23 PM huxia...@horebdata.cn wrote:
>
> Hi, folks,
>
> I have a middle-size Ceph cluster as cinder backup for OpenStack (Queens).
> During testing, one Ceph node went down unexpectedly and powered up again
> about 10 minutes later, and the Ceph cluster started PG recovery. To my
> surprise, VM IOPS dropped dramatically during Ceph recovery, from about
> 13K IOPS to about 400, a factor of 1/30, even though I had put stringent
> throttling on backfill and recovery with the following ceph parameters:
>
> osd_max_backfills = 1
> osd_recovery_max_active = 1
> osd_client_op_priority = 63
> osd_recovery_op_priority = 1
> osd_recovery_sleep = 0.5
>
> The weirdest thing is:
> 1) When there is no IO activity from any VM (all VMs are quiet except the
> recovery IO), the recovery bandwidth is about 10 MiB/s, 2 objects/s. The
> recovery throttle settings seem to be working properly.
> 2) When running FIO inside a VM, the recovery bandwidth quickly climbs
> above 200 MiB/s, 60 objects/s, while FIO inside the VM only reaches about
> 400 IOPS (8 KiB block size), around 3 MiB/s. Clearly the recovery
> throttling DOES NOT work properly here.
> 3) If I stop the FIO test in the VM, the recovery bandwidth drops back to
> 10 MiB/s, 2 objects/s, strangely enough.
>
> How can this weird behavior happen? Is there a way to configure the
> recovery bandwidth to a specific value, or the number of recovery objects
> per second? That would give better control of backfilling/recovery than
> the faulty logic of relative osd_client_op_priority vs
> osd_recovery_op_priority.
>
> Any ideas or suggestions to keep the recovery under control?
>
> best regards,
>
> Samuel

Not sure which version of Ceph you are on, but add these to your
/etc/ceph/ceph.conf on all your OSDs and restart them:

osd op queue = wpq
osd op queue cut off = high

That should really help and make backfills and recovery non-impactful. This
will be the default in Octopus.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
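For readers following along, a minimal sketch of that ceph.conf change;
placing the options in the [osd] section and restarting via systemd are
assumptions about a typical deployment, not something specified in the mail:

    # /etc/ceph/ceph.conf on every OSD host
    [osd]
    osd op queue = wpq
    osd op queue cut off = high

    # then restart the OSDs, e.g. one host at a time
    systemctl restart ceph-osd.target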
[ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery
Hi, folks,

I have a middle-size Ceph cluster as cinder backup for OpenStack (Queens).
During testing, one Ceph node went down unexpectedly and powered up again
about 10 minutes later, and the Ceph cluster started PG recovery. To my
surprise, VM IOPS dropped dramatically during Ceph recovery, from about 13K
IOPS to about 400, a factor of 1/30, even though I had put stringent
throttling on backfill and recovery with the following ceph parameters:

osd_max_backfills = 1
osd_recovery_max_active = 1
osd_client_op_priority = 63
osd_recovery_op_priority = 1
osd_recovery_sleep = 0.5

The weirdest thing is:
1) When there is no IO activity from any VM (all VMs are quiet except the
recovery IO), the recovery bandwidth is about 10 MiB/s, 2 objects/s. The
recovery throttle settings seem to be working properly.
2) When running FIO inside a VM, the recovery bandwidth quickly climbs above
200 MiB/s, 60 objects/s, while FIO inside the VM only reaches about 400 IOPS
(8 KiB block size), around 3 MiB/s. Clearly the recovery throttling DOES NOT
work properly here.
3) If I stop the FIO test in the VM, the recovery bandwidth drops back to
10 MiB/s, 2 objects/s, strangely enough.

How can this weird behavior happen? Is there a way to configure the recovery
bandwidth to a specific value, or the number of recovery objects per second?
That would give better control of backfilling/recovery than the faulty logic
of relative osd_client_op_priority vs osd_recovery_op_priority.

Any ideas or suggestions to keep the recovery under control?

best regards,

Samuel

huxia...@horebdata.cn
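As far as I know there is no direct MiB/s or objects/s target for recovery in
Luminous; throttling works indirectly through concurrency and sleep options.
A rough sketch of how those can be adjusted at runtime during an incident
(the values are placeholders, not tuning advice):

    # tighten recovery concurrency and add a per-op sleep on all OSDs
    ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_sleep 0.5'

    # or, as a blunt emergency brake, pause recovery/backfill entirely
    ceph osd set norecover
    ceph osd set nobackfill
    # re-enable later with
    ceph osd unset norecover
    ceph osd unset nobackfill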