Re: still recovery issues with cuttlefish
On 22.08.2013 05:34, Samuel Just wrote:
> It's not really possible at this time to control that limit because changing the primary is actually fairly expensive, and doing it unnecessarily would probably make the situation much worse (it's mostly necessary for backfilling, which is expensive anyway).

I'm sorry, but remapping or backfilling is far less expensive on all of my machines than recovering. While backfilling I see around 8-10% I/O wait; under recovery I see 40-50%.

> It seems like forwarding IO on an object which needs to be recovered to a replica with the object would be the next step. Certainly something to consider for the future.

Yes, this would be the solution.

Stefan
Re: still recovery issues with cuttlefish
Have you tried setting osd_recovery_clone_overlap to false? That seemed to help with Stefan's issue.

-Sam

On Wed, Aug 21, 2013 at 8:28 AM, Mike Dawson mike.daw...@cloudapt.com wrote:
> Sam/Josh,
>
> We upgraded from 0.61.7 to 0.67.1 during a maintenance window this morning, hoping it would improve this situation, but there was no appreciable change. One node in our cluster fsck'ed after a reboot and got a bit behind. Our instances backed by RBD volumes were OK at that point, but once the node booted fully and the OSDs started, all Windows instances with rbd volumes experienced very choppy performance and were unable to ingest video surveillance traffic and commit it to disk. Once the cluster got back to HEALTH_OK, they resumed normal operation.
>
> I tried for a time with conservative recovery settings (osd max backfills = 1, osd recovery op priority = 1, and osd recovery max active = 1). No improvement for the guests. So I went to more aggressive settings to get things moving faster. That decreased the duration of the outage.
>
> During the entire period of recovery/backfill, the network looked fine, nowhere close to saturation. iowait on all drives looked fine as well. Any ideas?
>
> Thanks, Mike Dawson
>
> On 8/14/2013 3:04 AM, Stefan Priebe - Profihost AG wrote:
>> The same problem still occurs. Will need to check when I've time to gather logs again.
>>
>> On 14.08.2013 01:11, Samuel Just wrote:
>>> I'm not sure, but your logs did show that you had 16 recovery ops in flight, so it's worth a try. If it doesn't help, you should collect the same set of logs and I'll look again. Also, there are a few other patches between 0.61.7 and current cuttlefish which may help.
>>>
>>> -Sam
>>>
>>> On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote:
>>>> On 13.08.2013 at 22:43, Samuel Just sam.j...@inktank.com wrote:
>>>>> I just backported a couple of patches from next to fix a bug where we weren't respecting the osd_recovery_max_active config in some cases (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e). You can either try the current cuttlefish branch or wait for a 0.61.8 release.
>>>>
>>>> Thanks! Are you sure that this is the issue? I don't believe that, but I'll give it a try. I already tested a branch from Sage where he fixed a race regarding max active some weeks ago, so active recovery was max 1, but the issue didn't go away.
>>>>
>>>> Stefan
>>>>
>>>>> On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just sam.j...@inktank.com wrote:
>>>>> I got swamped today. I should be able to look tomorrow. Sorry! -Sam
>>>>>
>>>>> On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote:
>>>>> Did you take a look? Stefan
>>>>>
>>>>> On 11.08.2013 at 05:50, Samuel Just sam.j...@inktank.com wrote:
>>>>> Great! I'll take a look on Monday. -Sam
>>>>>
>>>>> On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe s.pri...@profihost.ag wrote:
>>>>> Hi Samuel, done - you'll find the logs in the cephdrop folder: slow_requests_recovering_cuttlefish. osd.52 was the one recovering. Thanks! Greets, Stefan
>>>>>
>>>>> On 09.08.2013 23:44, Samuel Just wrote:
>>>>> I think Stefan's problem is probably distinct from Mike's. Stefan: Can you reproduce the problem with debug osd = 20, debug filestore = 20, debug ms = 1, and debug optracker = 20 on a few osds (including the restarted osd), and upload those osd logs along with the ceph.log from before killing the osd until after the cluster becomes clean again?

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
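The conservative recovery settings Mike lists, and the debug levels Sam asks for earlier in the thread, can be collected into a ceph.conf fragment. This is only a sketch of the relevant [osd] keys using the values quoted in the thread; the debug lines are commented out because they produce very large logs and are meant for diagnosis only.

```
[osd]
    ; throttle the impact of recovery/backfill on client I/O
    osd max backfills = 1
    osd recovery op priority = 1
    osd recovery max active = 1

    ; verbose logging Sam requested for diagnosing slow recovery
    ;debug osd = 20
    ;debug filestore = 20
    ;debug ms = 1
    ;debug optracker = 20
```

The same values can also be applied at runtime with `ceph tell osd.* injectargs`, as done elsewhere in this thread, which avoids restarting the OSDs.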
RE: still recovery issues with cuttlefish
It's osd recover clone overlap (see http://tracker.ceph.com/issues/5401).

-----Original Message-----
From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel Just
Sent: Wednesday, 21 August 2013 17:33
To: Mike Dawson
Cc: Stefan Priebe - Profihost AG; josh.dur...@inktank.com; ceph-devel@vger.kernel.org
Subject: Re: still recovery issues with cuttlefish

> Have you tried setting osd_recovery_clone_overlap to false? That seemed to help with Stefan's issue.
>
> -Sam
Re: still recovery issues with cuttlefish
Sam,

Tried it. Injected with 'ceph tell osd.* injectargs -- --no_osd_recover_clone_overlap', then stopped one OSD for ~1 minute. Upon restart, all my Windows VMs had issues until HEALTH_OK. The recovery was taking an abnormally long time, so I reverted away from --no_osd_recover_clone_overlap after about 10 minutes to get back to HEALTH_OK.

Interestingly, a Raring guest running a different video surveillance package proceeded without any issue whatsoever.

Here is an image of the traffic to some of these Windows guests: http://www.gammacode.com/upload/rbd-hang-with-clone-overlap.jpg Ceph is outside of HEALTH_OK between ~12:55 and 13:10. Most of these instances rebooted due to an app error caused by the I/O hang shortly after 13:10.

These Windows instances are booted as COW clones from a Glance image using Cinder. They also have a second RBD volume for bulk storage. I'm using qemu 1.5.2.

Thanks, Mike

On 8/21/2013 1:12 PM, Samuel Just wrote:
> Ah, thanks for the correction.
>
> -Sam
>
> On Wed, Aug 21, 2013 at 9:25 AM, Yann ROBIN yann.ro...@youscribe.com wrote:
>> It's osd recover clone overlap (see http://tracker.ceph.com/issues/5401)
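Mike's experiment boils down to a runtime toggle followed by a revert. The injectargs invocation below is taken verbatim from his message; the persistent ceph.conf form and the revert syntax are assumptions on my part and may differ between Ceph versions, so treat this as a sketch rather than a recipe.

```
# Runtime toggle on all OSDs (verbatim from Mike's message):
ceph tell osd.* injectargs -- --no_osd_recover_clone_overlap

# Assumed equivalent persistent setting in ceph.conf, [osd] section:
#   osd recover clone overlap = false
```

As Mike found, reverting is simply a matter of re-injecting the option with its default value (or restarting the OSDs without the override).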
Re: still recovery issues with cuttlefish
If the Raring guest was fine, I suspect that the issue is not on the OSDs.

-Sam

On Wed, Aug 21, 2013 at 10:55 AM, Mike Dawson mike.daw...@cloudapt.com wrote:
> Sam,
>
> Tried it. Injected with 'ceph tell osd.* injectargs -- --no_osd_recover_clone_overlap', then stopped one OSD for ~1 minute. Upon restart, all my Windows VMs had issues until HEALTH_OK. The recovery was taking an abnormally long time, so I reverted away from --no_osd_recover_clone_overlap after about 10 minutes to get back to HEALTH_OK.
>
> Interestingly, a Raring guest running a different video surveillance package proceeded without any issue whatsoever.
Re: still recovery issues with cuttlefish
On 21.08.2013 17:32, Samuel Just wrote:
> Have you tried setting osd_recovery_clone_overlap to false? That seemed to help with Stefan's issue.

This might sound a bit harsh, maybe due to my limited English skills ;-) but I still think that Ceph's recovery system is broken by design.

If an OSD comes back after being offline, all write requests for PGs where this OSD is primary are targeted at it immediately. If it is not up to date for a PG, it tries to recover the affected objects immediately, which costs 4 MB per block. If you have a lot of small writes spread over all your OSDs and PGs, you're stuck, as your OSD has to recover ALL of its PGs immediately, or at least lots of them, which can't work. This is totally crazy.

I think the right way would be:
1.) if an OSD goes down, the replicas become primaries, or
2.) an OSD which does not have an up-to-date PG should redirect to the OSD holding the second or third replica.

Both would allow a really smooth and slow recovery without any stress, even under heavy 4K workloads like RBD-backed VMs.

Thanks for reading!

Greets,
Stefan
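Stefan's proposal 2.) can be sketched as a toy read path. This is purely illustrative Python, not Ceph code, and every name in it is invented: it just shows a request for a not-yet-recovered object being served by an up-to-date replica instead of stalling behind an immediate object recovery on the primary.

```python
# Toy model of Stefan's proposal: redirect I/O for objects the primary
# has not yet recovered to a replica holding a current copy, so recovery
# can proceed slowly in the background. NOT Ceph code; names are invented.

class OSD:
    def __init__(self, name, objects, missing=()):
        self.name = name
        self.objects = dict(objects)   # object id -> data
        self.missing = set(missing)    # objects not yet recovered on this OSD

    def has_current(self, obj):
        return obj in self.objects and obj not in self.missing


def read(obj, primary, replicas):
    """Serve from the primary when it is current for this object;
    otherwise redirect to any replica with a current copy rather than
    blocking the client on recovery."""
    if primary.has_current(obj):
        return primary.name
    for r in replicas:
        if r.has_current(obj):
            return r.name
    raise IOError("no current copy available; must wait for recovery")


data = {"obj1": b"a", "obj2": b"b"}
primary = OSD("osd.0", data, missing={"obj2"})  # just rebooted, behind on obj2
replica = OSD("osd.1", data)

assert read("obj1", primary, [replica]) == "osd.0"  # primary is current: no redirect
assert read("obj2", primary, [replica]) == "osd.1"  # redirected: no recovery stall
```

The point of the sketch is the branch order in `read`: the behind primary is consulted first but never forced to recover on the hot path.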
Re: still recovery issues with cuttlefish
As long as the request is for an object which is up to date on the primary, the request will be served without waiting for recovery. A request only waits on recovery if the particular object being read or written must be recovered. Your issue was that recovering the particular object being requested was unreasonably slow due to silliness in the recovery code which you disabled by disabling osd_recover_clone_overlap. In cases where the primary osd is significantly behind, we do make one of the other osds primary during recovery in order to expedite requests (pgs in this state are shown as remapped). -Sam On Wed, Aug 21, 2013 at 11:21 AM, Stefan Priebe s.pri...@profihost.ag wrote: Am 21.08.2013 17:32, schrieb Samuel Just: Have you tried setting osd_recovery_clone_overlap to false? That seemed to help with Stefan's issue. This might sound a bug harsh but maybe due to my limited english skills ;-) I still think that Cephs recovery system is broken by design. If an OSD comes back (was offline) all write requests regarding PGs where this one is primary are targeted immediatly to this OSD. If this one is not up2date for an PG it tries to recover that one immediatly which costs 4MB / block. If you have a lot of small write all over your OSDs and PGs you're sucked as your OSD has to recover ALL it's PGs immediatly or at least lots of them WHICH can't work. This is totally crazy. I think the right way would be: 1.) if an OSD goes down the replicas got primaries or 2.) an OSD which does not have an up2date PG should redirect to the OSD holding the secondary or third replica. Both results in being able to have a really smooth and slow recovery without any stress even under heavy 4K workloads like rbd backed VMs. Thanks for reading! 
Greets,
Stefan

-Sam

On Wed, Aug 21, 2013 at 8:28 AM, Mike Dawson mike.daw...@cloudapt.com wrote:

Sam/Josh,

We upgraded from 0.61.7 to 0.67.1 during a maintenance window this morning, hoping it would improve this situation, but there was no appreciable change. One node in our cluster fsck'ed after a reboot and got a bit behind. Our instances backed by RBD volumes were OK at that point, but once the node booted fully and the OSDs started, all Windows instances with RBD volumes experienced very choppy performance and were unable to ingest video surveillance traffic and commit it to disk. Once the cluster got back to HEALTH_OK, they resumed normal operation.

I tried for a time with conservative recovery settings (osd max backfills = 1, osd recovery op priority = 1, and osd recovery max active = 1). No improvement for the guests. So I went to more aggressive settings to get things moving faster; that decreased the duration of the outage. During the entire period of recovery/backfill the network looked fine, nowhere close to saturation, and iowait on all drives looked fine as well. Any ideas?

Thanks, Mike Dawson

On 8/14/2013 3:04 AM, Stefan Priebe - Profihost AG wrote: [...]
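For reference, the conservative settings Mike lists map onto ceph.conf like this (a sketch using the values from his message; his "more aggressive" run simply raises these numbers):

```ini
[osd]
    ; conservative recovery throttling (values from Mike's message)
    osd max backfills = 1
    osd recovery op priority = 1
    osd recovery max active = 1
```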
Re: still recovery issues with cuttlefish
Hi Sam,

On 21.08.2013 21:13, Samuel Just wrote:
> As long as the request is for an object which is up to date on the primary, the request will be served without waiting for recovery.

Sure, but remember that with a random 4K VM workload a lot of objects go out of date pretty soon.

> A request only waits on recovery if the particular object being read or written must be recovered.

Yes, but under 4K load that can be a lot of objects.

> Your issue was that recovering the particular object being requested was unreasonably slow due to silliness in the recovery code, which you disabled by disabling osd_recover_clone_overlap.

Yes and no. It's better now, but far from good or perfect. My VMs no longer crash, but I still get a bunch of slow requests (around 10 messages) and still a VERY high I/O load on the disks during recovery.

> In cases where the primary osd is significantly behind, we do make one of the other osds primary during recovery in order to expedite requests (pgs in this state are shown as remapped).

I've never seen that, but at least in my case even 60s is a very long timeframe, and the OSD is very stressed during recovery. Is it possible for me to set this value?

Stefan

-Sam

On Wed, Aug 21, 2013 at 11:21 AM, Stefan Priebe s.pri...@profihost.ag wrote:

On 21.08.2013 17:32, Samuel Just wrote:
> Have you tried setting osd_recover_clone_overlap to false? That seemed to help with Stefan's issue.

This might sound a bit harsh, maybe due to my limited English skills ;-) but I still think that Ceph's recovery system is broken by design. If an OSD comes back after being offline, all write requests for PGs where it is primary are directed to it immediately. If it is not up to date for a PG, it tries to recover the affected objects immediately, which costs 4 MB per block. If you have a lot of small writes spread all over your OSDs and PGs, you're stuck: the OSD has to recover ALL of its PGs immediately, or at least a lot of them, which can't work.

I think the right way would be: 1.) if an OSD goes down, the replicas become primaries, or 2.) an OSD which does not have an up-to-date copy of a PG should redirect requests to the OSD holding the second or third replica. Both would allow a really smooth and slow recovery without any stress, even under heavy 4K workloads like RBD-backed VMs.

Thanks for reading!

Greets,
Stefan
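A back-of-envelope sketch of the amplification Stefan is describing: with the default 4 MB RADOS object size, a single 4 KB guest write that lands on a not-yet-recovered object first triggers recovery of the whole object.

```python
# Write amplification during recovery, as described in the thread:
# a 4 KB client write to a degraded object first recovers the whole
# 4 MB RADOS object ("costs 4 MB per block").
CLIENT_WRITE_BYTES = 4 * 1024          # random 4K VM write
RADOS_OBJECT_BYTES = 4 * 1024 * 1024   # default RBD object size

amplification = RADOS_OBJECT_BYTES // CLIENT_WRITE_BYTES
print(amplification)  # 1024: up to 1024x recovery I/O per client write
```

This is why random 4K VM workloads hurt so much more than sequential ones: nearly every small write can pull in a full-object recovery.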
Re: still recovery issues with cuttlefish
It's not really possible at this time to control that limit, because changing the primary is actually fairly expensive, and doing it unnecessarily would probably make the situation much worse (it's mostly necessary for backfilling, which is expensive anyway). It seems like forwarding IO on an object which needs to be recovered to a replica that has the object would be the next step. Certainly something to consider for the future.

-Sam

On Wed, Aug 21, 2013 at 12:37 PM, Stefan Priebe s.pri...@profihost.ag wrote: [...]
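The "forward IO to a replica that has the object" idea Sam mentions can be sketched as a routing rule. This is hypothetical pseudocode, not actual Ceph internals; all names are invented for illustration:

```python
def choose_serving_osd(obj, primary, replicas, missing_on):
    """Return the OSD that should serve a request for obj.

    missing_on maps osd -> set of objects that osd still needs to recover.
    Today the request blocks on the primary until obj is recovered; the
    proposed behaviour forwards it to a replica that already has obj.
    """
    if obj not in missing_on.get(primary, set()):
        return primary  # fast path: primary is up to date for this object
    for osd in replicas:
        if obj not in missing_on.get(osd, set()):
            return osd  # forward to a replica that already has the object
    return primary      # no copy is current: must wait for recovery

# Example: primary osd.5 is missing object "rb.0.1" after a restart
serving = choose_serving_osd("rb.0.1", "osd.5", ["osd.12", "osd.30"],
                             {"osd.5": {"rb.0.1"}})
print(serving)  # osd.12
```

The trade-off Sam points out still applies: the routing decision itself is cheap, but keeping replica reads consistent with in-flight writes is what makes the real implementation hard.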
Re: still recovery issues with cuttlefish
The same problem still occurs. I will need to check when I have time to gather the logs again.

On 14.08.2013 01:11, Samuel Just wrote: [...]

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: still recovery issues with cuttlefish
I just backported a couple of patches from next to fix a bug where we weren't respecting the osd_recovery_max_active config in some cases (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e). You can either try the current cuttlefish branch or wait for a 0.61.8 release.

-Sam

On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just sam.j...@inktank.com wrote: [...]
Re: still recovery issues with cuttlefish
On 13.08.2013 at 22:43, Samuel Just sam.j...@inktank.com wrote: [...]

Thanks! Are you sure that this is the issue? I don't believe it, but I'll give it a try. I already tested a branch from Sage where he fixed a race regarding max active some weeks ago, so active recoveries were capped at 1, but the issue didn't go away.

Stefan
Re: still recovery issues with cuttlefish
I'm not sure, but your logs did show that you had 16 recovery ops in flight, so it's worth a try. If it doesn't help, you should collect the same set of logs and I'll look again. Also, there are a few other patches between 0.61.7 and current cuttlefish which may help.

-Sam

On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: [...]
Re: still recovery issues with cuttlefish
Did you take a look?

Stefan

On 11.08.2013 at 05:50, Samuel Just sam.j...@inktank.com wrote: [...]
Re: still recovery issues with cuttlefish
I got swamped today. I should be able to look tomorrow. Sorry!

-Sam

On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: [...]
Re: still recovery issues with cuttlefish
Hi Samuel,

On 09.08.2013 23:44, Samuel Just wrote:
> I think Stefan's problem is probably distinct from Mike's. Stefan: Can you reproduce the problem with debug osd = 20, debug filestore = 20, debug ms = 1, and debug optracker = 20 on a few osds (including the restarted osd), and upload those osd logs along with the ceph.log from before killing the osd until after the cluster becomes clean again?

Done - you'll find the logs in the cephdrop folder: slow_requests_recovering_cuttlefish. osd.52 was the one recovering. Thanks!

Greets,
Stefan
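The debug levels Sam asks for, written out as a ceph.conf fragment for the affected OSDs (a sketch; they can also be changed at runtime on a running osd via `ceph tell osd.N injectargs`):

```ini
[osd]
    debug osd = 20
    debug filestore = 20
    debug ms = 1
    debug optracker = 20
```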
Re: still recovery issues with cuttlefish
[...] rgw_socket_path: , rgw_host: , rgw_port: , rgw_dns_name: , rgw_script_uri: , rgw_request_uri: , rgw_swift_url: , rgw_swift_url_prefix: swift, rgw_swift_auth_url: , rgw_swift_auth_entry: auth, rgw_keystone_url: , rgw_keystone_admin_token: , rgw_keystone_accepted_roles: Member, admin, rgw_keystone_token_cache_size: 1, rgw_keystone_revocation_interval: 900, rgw_admin_entry: admin, rgw_enforce_swift_acls: true, rgw_swift_token_expiration: 86400, rgw_print_continue: true, rgw_remote_addr_param: REMOTE_ADDR, rgw_op_thread_timeout: 600, rgw_op_thread_suicide_timeout: 0, rgw_thread_pool_size: 100, rgw_num_control_oids: 8, rgw_zone_root_pool: .rgw.root, rgw_log_nonexistent_bucket: false, rgw_log_object_name: %Y-%m-%d-%H-%i-%n, rgw_log_object_name_utc: false, rgw_usage_max_shards: 32, rgw_usage_max_user_shards: 1, rgw_enable_ops_log: false, rgw_enable_usage_log: false, rgw_ops_log_rados: true, rgw_ops_log_socket_path: , rgw_ops_log_data_backlog: 5242880, rgw_usage_log_flush_threshold: 1024, rgw_usage_log_tick_interval: 30, rgw_intent_log_object_name: %Y-%m-%d-%i-%n, rgw_intent_log_object_name_utc: false, rgw_init_timeout: 300, rgw_mime_types_file: /etc/mime.types, rgw_gc_max_objs: 32, rgw_gc_obj_min_wait: 7200, rgw_gc_processor_max_time: 3600, rgw_gc_processor_period: 3600, rgw_s3_success_create_obj_status: 0, rgw_resolve_cname: false, rgw_obj_stripe_size: 4194304, rgw_extended_http_attrs: , rgw_exit_timeout_secs: 120, rgw_get_obj_window_size: 16777216, rgw_get_obj_max_req_size: 4194304, rgw_relaxed_s3_bucket_names: false, rgw_list_buckets_max_chunk: 1000, mutex_perf_counter: false, internal_safe_to_start_threads: true}

Stefan

-Sam

On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe s.pri...@profihost.ag wrote:

Mike, we already have the async patch running. Yes, it helps, but it only helps; it does not solve the problem. It just hides the issue...

On 01.08.2013 20:54, Mike Dawson wrote:

I am also seeing recovery issues with 0.61.7. Here's the process:

- ceph osd set noout
- Reboot one of the nodes hosting OSDs
- VMs mounted from RBD volumes work properly
- I see the OSDs' boot messages as they re-join the cluster
- Start seeing active+recovery_wait, peering, and active+recovering
- VMs mounted from RBD volumes become unresponsive
- Recovery completes
- VMs mounted from RBD volumes regain responsiveness
- ceph osd unset noout

Would joshd's async patch for qemu help here, or is there something else going on? Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY

Thanks, Mike Dawson
Co-Founder, Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170, Indianapolis, IN 46250

On 8/1/2013 2:34 PM, Samuel Just wrote:
> Can you reproduce and attach the ceph.log from before you stop the osd until after you have started the osd and it has recovered?

-Sam

On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote:

Hi,

I still have recovery issues with cuttlefish. After the OSD comes back, it seems to hang for around 2-4 minutes, and then recovery seems to start (pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I get a lot of slow request messages and hanging VMs.

What I noticed today is that if I leave the OSD off long enough that Ceph starts to backfill, the recovery and re-backfilling go absolutely smoothly, without any issues and no slow request messages at all. Does anybody have an idea why?

Greets,
Stefan
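Mike's reboot procedure condensed to the commands involved, to make the noout bracket explicit (run from an admin node; the reboot step is manual):

```
ceph osd set noout      # keep the down OSDs from being marked out during the reboot
# ... reboot the OSD node; wait for the OSDs to rejoin and the cluster
# ... to return to HEALTH_OK (watch progress with: ceph -w)
ceph osd unset noout
```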
Re: still recovery issues with cuttlefish
: \/etc\/mime.types, rgw_gc_max_objs: 32, rgw_gc_obj_min_wait: 7200, rgw_gc_processor_max_time: 3600, rgw_gc_processor_period: 3600, rgw_s3_success_create_obj_status: 0, rgw_resolve_cname: false, rgw_obj_stripe_size: 4194304, rgw_extended_http_attrs: , rgw_exit_timeout_secs: 120, rgw_get_obj_window_size: 16777216, rgw_get_obj_max_req_size: 4194304, rgw_relaxed_s3_bucket_names: false, rgw_list_buckets_max_chunk: 1000, mutex_perf_counter: false, internal_safe_to_start_threads: true} Stefan -Sam On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe s.pri...@profihost.ag wrote: Mike we already have the async patch running. Yes it helps but only helps it does not solve. It just hides the issue ... Am 01.08.2013 20:54, schrieb Mike Dawson: I am also seeing recovery issues with 0.61.7. Here's the process: - ceph osd set noout - Reboot one of the nodes hosting OSDs - VMs mounted from RBD volumes work properly - I see the OSD's boot messages as they re-join the cluster - Start seeing active+recovery_wait, peering, and active+recovering - VMs mounted from RBD volumes become unresponsive. - Recovery completes - VMs mounted from RBD volumes regain responsiveness - ceph osd unset noout Would joshd's async patch for qemu help here, or is there something else going on? Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY Thanks, Mike Dawson Co-Founder Director of Cloud Architecture Cloudapt LLC 6330 East 75th Street, Suite 170 Indianapolis, IN 46250 On 8/1/2013 2:34 PM, Samuel Just wrote: Can you reproduce and attach the ceph.log from before you stop the osd until after you have started the osd and it has recovered? -Sam On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: Hi, i still have recovery issues with cuttlefish. After the OSD comes back it seem to hang for around 2-4 minutes and then recovery seems to start (pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I get a lot of slow request messages an hanging VMs. 
What I noticed today is that if I leave the OSD off until ceph starts to backfill, the recovery and re-backfilling go absolutely smoothly, without any issues and with no slow request messages at all. Does anybody have an idea why?

Greets, Stefan
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org. More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: still recovery issues with cuttlefish
Created #5844.

On Thu, Aug 1, 2013 at 10:38 PM, Samuel Just sam.j...@inktank.com wrote:
Is there a bug open for this? I suspect we don't sufficiently throttle the snapshot removal work. -Sam

On Thu, Aug 1, 2013 at 7:50 AM, Andrey Korolyov and...@xdel.ru wrote:
Second this. Also, regarding the long-standing snapshot problem and related performance issues: cuttlefish improved things greatly, but creating or deleting a large snapshot (hundreds of gigabytes of committed data) can still bring the cluster down for minutes, despite the use of every possible optimization.
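Sam's suspicion above is that the snapshot-removal work is not throttled against client I/O. As a purely illustrative sketch (not Ceph's actual code, which lives in the OSD's C++ internals; the function names and knobs here are made up for the example), throttling batch deletion means capping how many operations are issued per tick and yielding in between:

```shell
# Illustrative throttle: issue at most MAX_OPS deletes, then sleep TICK seconds.
# 'delete_object' is a hypothetical stand-in for the real removal operation.
remove_snapshot_objects() {   # usage: remove_snapshot_objects MAX_OPS TICK obj...
    max_ops=$1; tick=$2; shift 2
    n=0
    for obj in "$@"; do
        delete_object "$obj"          # issue one removal op
        n=$((n + 1))
        if [ "$n" -ge "$max_ops" ]; then
            sleep "$tick"             # yield so client I/O can interleave
            n=0
        fi
    done
}
```

With TICK set to 0 the loop degenerates into exactly the unthrottled burst behavior the thread complains about.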
still recovery issues with cuttlefish
Hi, I still have recovery issues with cuttlefish. After the OSD comes back, it seems to hang for around 2-4 minutes before recovery starts (PGs in recovery_wait start to decrement). This is with ceph 0.61.7. I get a lot of slow request messages and hanging VMs.

What I noticed today is that if I leave the OSD off until ceph starts to backfill, the recovery and re-backfilling go absolutely smoothly, without any issues and with no slow request messages at all. Does anybody have an idea why?

Greets, Stefan
Re: still recovery issues with cuttlefish
Second this. Also, regarding the long-standing snapshot problem and related performance issues: cuttlefish improved things greatly, but creating or deleting a large snapshot (hundreds of gigabytes of committed data) can still bring the cluster down for minutes, despite the use of every possible optimization.

On Thu, Aug 1, 2013 at 12:22 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote:
Hi, I still have recovery issues with cuttlefish. After the OSD comes back, it seems to hang for around 2-4 minutes before recovery starts (PGs in recovery_wait start to decrement). This is with ceph 0.61.7. I get a lot of slow request messages and hanging VMs. What I noticed today is that if I leave the OSD off until ceph starts to backfill, the recovery and re-backfilling go absolutely smoothly, without any issues and with no slow request messages at all. Does anybody have an idea why? Greets, Stefan
Re: still recovery issues with cuttlefish
Can you reproduce and attach the ceph.log from before you stop the OSD until after you have started the OSD and it has recovered? -Sam

On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote:
Hi, I still have recovery issues with cuttlefish. After the OSD comes back, it seems to hang for around 2-4 minutes before recovery starts (PGs in recovery_wait start to decrement). This is with ceph 0.61.7. I get a lot of slow request messages and hanging VMs. What I noticed today is that if I leave the OSD off until ceph starts to backfill, the recovery and re-backfilling go absolutely smoothly, without any issues and with no slow request messages at all. Does anybody have an idea why? Greets, Stefan
Re: still recovery issues with cuttlefish
On 01.08.2013 20:34, Samuel Just wrote:
Can you reproduce and attach the ceph.log from before you stop the OSD until after you have started the OSD and it has recovered? -Sam

Sure, which log levels?
Re: still recovery issues with cuttlefish
For now, just the main ceph.log. -Sam

On Thu, Aug 1, 2013 at 11:34 AM, Stefan Priebe s.pri...@profihost.ag wrote:
On 01.08.2013 20:34, Samuel Just wrote: Can you reproduce and attach the ceph.log from before you stop the OSD until after you have started the OSD and it has recovered? -Sam
Sure, which log levels?
Re: still recovery issues with cuttlefish
It doesn't have log levels; it should be in /var/log/ceph/ceph.log. -Sam

On Thu, Aug 1, 2013 at 11:36 AM, Samuel Just sam.j...@inktank.com wrote:
For now, just the main ceph.log. -Sam

On Thu, Aug 1, 2013 at 11:34 AM, Stefan Priebe s.pri...@profihost.ag wrote:
Sure, which log levels?
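To attach exactly the window Sam asked for (from stopping the OSD until it has recovered), the cluster log can be sliced by timestamp. A minimal sketch, assuming the default log location /var/log/ceph/ceph.log mentioned above and that cluster-log lines begin with a "YYYY-MM-DD HH:MM:SS" timestamp; the example timestamps are placeholders, not values from this incident:

```shell
# Print cluster-log lines whose leading timestamp falls inside [from, to].
# Plain string comparison works because the timestamp format is fixed-width.
extract_window() {   # usage: extract_window FROM TO LOGFILE
    awk -v from="$1" -v to="$2" '{ ts = $1 " " $2 } ts >= from && ts <= to' "$3"
}
# e.g.: extract_window "2013-08-01 10:00:00" "2013-08-01 10:30:00" /var/log/ceph/ceph.log
```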
Re: still recovery issues with cuttlefish
I am also seeing recovery issues with 0.61.7. Here's the process:

- ceph osd set noout
- Reboot one of the nodes hosting OSDs
- VMs mounted from RBD volumes work properly
- I see the OSDs' boot messages as they re-join the cluster
- Start seeing active+recovery_wait, peering, and active+recovering
- VMs mounted from RBD volumes become unresponsive
- Recovery completes
- VMs mounted from RBD volumes regain responsiveness
- ceph osd unset noout

Would joshd's async patch for qemu help here, or is there something else going on? Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY

Thanks, Mike Dawson
Co-Founder / Director of Cloud Architecture, Cloudapt LLC
6330 East 75th Street, Suite 170, Indianapolis, IN 46250

On 8/1/2013 2:34 PM, Samuel Just wrote: Can you reproduce and attach the ceph.log from before you stop the OSD until after you have started the OSD and it has recovered? -Sam

On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: Hi, I still have recovery issues with cuttlefish. After the OSD comes back it seems to hang for around 2-4 minutes, and then recovery seems to start (pgs in recovery_wait start to decrement). Greets, Stefan
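The pattern Mike describes (recovery_wait building up, then draining as recovery completes) can be tracked by tallying pg states out of the `ceph -w` status lines he pasted. A rough sketch, assuming a pgmap summary of the shape "N pgs: N state, N state, ..." (the sample line is made up for illustration, not taken from the pastebin):

```python
import re

def pg_state_counts(pgmap_line):
    """Tally the 'N state' pairs that follow 'pgs:' in a status line."""
    summary = pgmap_line.split("pgs:", 1)[-1]
    counts = {}
    for n, state in re.findall(r"(\d+)\s+([a-z_+]+)", summary):
        counts[state] = counts.get(state, 0) + int(n)
    return counts

# Illustrative pgmap summary line in the cuttlefish-era `ceph -w` shape.
line = ("pgmap v1234: 125 pgs: 100 active+clean, "
        "20 active+recovery_wait, 5 active+recovering")
counts = pg_state_counts(line)
print(counts["active+recovery_wait"])  # 20
```

Feeding successive `ceph -w` lines through this makes it easy to see whether the recovery_wait count is actually decrementing while the VMs are unresponsive.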
Re: still recovery issues with cuttlefish
Mike, we already have the async patch running. Yes, it helps, but it only helps; it does not solve the problem. It just hides the issue ...

On 01.08.2013 20:54, Mike Dawson wrote: I am also seeing recovery issues with 0.61.7. Would joshd's async patch for qemu help here, or is there something else going on? Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY Thanks, Mike Dawson

On 8/1/2013 2:34 PM, Samuel Just wrote: Can you reproduce and attach the ceph.log from before you stop the OSD until after you have started the OSD and it has recovered? -Sam
Re: still recovery issues with cuttlefish
Can you dump your osd settings?

sudo ceph --admin-daemon ceph-osd.osdid.asok config show

-Sam

On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe s.pri...@profihost.ag wrote: Mike, we already have the async patch running. Yes, it helps, but it only helps; it does not solve the problem. It just hides the issue ...
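The `config show` dump Sam asks for is one large JSON object, most of which is irrelevant here; the recovery- and backfill-related knobs can be filtered out by key. A sketch over a hand-made subset (the keys are real OSD options, but the values shown are illustrative defaults, not Stefan's actual settings):

```python
import json

# Illustrative subset of what `ceph --admin-daemon ... config show` emits.
dump = json.loads("""{
  "osd_max_backfills": "10",
  "osd_recovery_max_active": "5",
  "osd_recovery_op_priority": "10",
  "osd_client_op_priority": "63",
  "debug_osd": "0/0"
}""")

# Keep only the options that govern recovery/backfill behavior.
recovery = {k: v for k, v in dump.items()
            if "recovery" in k or "backfill" in k}
for key in sorted(recovery):
    print(key, "=", recovery[key])
```

Posting just that filtered slice of the dump is usually enough for this kind of tuning discussion.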