Re: still recovery issues with cuttlefish

2013-08-22 Thread Stefan Priebe - Profihost AG
On 22.08.2013 05:34, Samuel Just wrote:
 It's not really possible at this time to control that limit because
 changing the primary is actually fairly expensive and doing it
 unnecessarily would probably make the situation much worse

I'm sorry, but remapping or backfilling is far less expensive on all of
my machines than recovering.

While backfilling I see around 8-10% I/O wait, while under recovery I
see 40-50%.


 (it's
 mostly necessary for backfilling, which is expensive anyway).  It
 seems like forwarding IO on an object which needs to be recovered to a
 replica with the object would be the next step.  Certainly something
 to consider for the future.

Yes, this would be the solution.

Stefan
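
A rough sketch of how such I/O wait numbers can be collected per OSD node
while recovery or backfill is running (this assumes the sysstat package is
installed; the 5-second interval is arbitrary):

  # watch the cluster log and PG state summary (backfilling vs. recovering)
  ceph -w

  # on an OSD node, in a second terminal: extended per-device statistics,
  # including await and %util, sampled every 5 seconds
  iostat -x 5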

 -Sam
 
 On Wed, Aug 21, 2013 at 12:37 PM, Stefan Priebe s.pri...@profihost.ag wrote:
 Hi Sam,
 Am 21.08.2013 21:13, schrieb Samuel Just:

 As long as the request is for an object which is up to date on the
 primary, the request will be served without waiting for recovery.


 Sure, but remember that with a random 4K VM workload a lot of objects go
 out of date pretty quickly.


 A request only waits on recovery if the particular object being read or

 written must be recovered.


 Yes, but under a 4K load this can be a lot of objects.


 Your issue was that recovering the
 particular object being requested was unreasonably slow due to
 silliness in the recovery code which you disabled by disabling
 osd_recover_clone_overlap.


 Yes and no. It's better now, but far from good or perfect. My VMs no
 longer crash, but I still get a bunch of slow requests (around 10
 messages) and still a VERY high I/O load on the disks during recovery.


 In cases where the primary osd is significantly behind, we do make one
 of the other osds primary during recovery in order to expedite
 requests (pgs in this state are shown as remapped).


 Oh, I've never seen that, but at least in my case even 60 s is a very
 long timeframe, and the OSD is very stressed during recovery. Is it
 possible for me to set this value?


 Stefan

 -Sam

 On Wed, Aug 21, 2013 at 11:21 AM, Stefan Priebe s.pri...@profihost.ag
 wrote:

 Am 21.08.2013 17:32, schrieb Samuel Just:

 Have you tried setting osd_recovery_clone_overlap to false?  That
 seemed to help with Stefan's issue.



 This might sound a bit harsh, but maybe that's due to my limited English
 skills ;-)

 I still think that Ceph's recovery system is broken by design. If an OSD
 comes back (was offline), all write requests for PGs where this OSD is
 primary are targeted at it immediately. If it is not up to date for a PG,
 it tries to recover it immediately, which costs 4 MB per block. If you
 have a lot of small writes spread across your OSDs and PGs, you're stuck,
 as your OSD has to recover ALL of its PGs immediately, or at least lots
 of them, WHICH can't work. This is totally crazy.

 I think the right way would be:
 1.) if an OSD goes down, the replicas become primaries

 or

 2.) an OSD which does not have an up-to-date PG should redirect to the OSD
 holding the second or third replica.

 Both would allow a really smooth and slow recovery without any stress,
 even under heavy 4K workloads like rbd-backed VMs.

 Thanks for reading!

 Greets Stefan



 -Sam

 On Wed, Aug 21, 2013 at 8:28 AM, Mike Dawson mike.daw...@cloudapt.com
 wrote:


 Sam/Josh,

 We upgraded from 0.61.7 to 0.67.1 during a maintenance window this
 morning,
 hoping it would improve this situation, but there was no appreciable
 change.

 One node in our cluster fsck'ed after a reboot and got a bit behind.
 Our
 instances backed by RBD volumes were OK at that point, but once the
 node
 booted fully and the OSDs started, all Windows instances with rbd
 volumes
 experienced very choppy performance and were unable to ingest video
 surveillance traffic and commit it to disk. Once the cluster got back
 to
 HEALTH_OK, they resumed normal operation.

 I tried for a time with conservative recovery settings (osd max
 backfills
 =
 1, osd recovery op priority = 1, and osd recovery max active = 1). No
 improvement for the guests. So I went to more aggressive settings to
 get
 things moving faster. That decreased the duration of the outage.

 During the entire period of recovery/backfill, the network looked
 fine, nowhere close to saturation. iowait on all drives looks fine as
 well.

 Any ideas?

 Thanks,
 Mike Dawson



 On 8/14/2013 3:04 AM, Stefan Priebe - Profihost AG wrote:



 The same problem still occurs. I will need to check when I have time to
 gather logs again.

 Am 14.08.2013 01:11, schrieb Samuel Just:



 I'm not sure, but your logs did show that you had 16 recovery ops in
 flight, so it's worth a try.  If it doesn't help, you should collect
 the same set of logs and I'll look again.  Also, there are a few other
 patches between 61.7 and current cuttlefish which may help.
 -Sam

 On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:




 Am 13.08.2013 um 22:43 schrieb Samuel Just 

Re: still recovery issues with cuttlefish

2013-08-21 Thread Samuel Just
Have you tried setting osd_recovery_clone_overlap to false?  That
seemed to help with Stefan's issue.
-Sam
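
Note that the option is actually spelled osd_recover_clone_overlap (without
the extra "y"), as Yann ROBIN points out below and as tracked in
http://tracker.ceph.com/issues/5401. A minimal sketch of the two ways it can
be disabled, assuming cuttlefish-era defaults:

  # persistently, in ceph.conf on the OSD hosts (applied on OSD restart)
  [osd]
      osd recover clone overlap = false

  # at runtime, for all OSDs, using the form Mike uses later in this thread
  # (the osd.* glob may need quoting depending on your shell)
  ceph tell osd.* injectargs -- --no_osd_recover_clone_overlap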

On Wed, Aug 21, 2013 at 8:28 AM, Mike Dawson mike.daw...@cloudapt.com wrote:
 Sam/Josh,

 We upgraded from 0.61.7 to 0.67.1 during a maintenance window this morning,
 hoping it would improve this situation, but there was no appreciable change.

 One node in our cluster fsck'ed after a reboot and got a bit behind. Our
 instances backed by RBD volumes were OK at that point, but once the node
 booted fully and the OSDs started, all Windows instances with rbd volumes
 experienced very choppy performance and were unable to ingest video
 surveillance traffic and commit it to disk. Once the cluster got back to
 HEALTH_OK, they resumed normal operation.

 I tried for a time with conservative recovery settings (osd max backfills =
 1, osd recovery op priority = 1, and osd recovery max active = 1). No
 improvement for the guests. So I went to more aggressive settings to get
 things moving faster. That decreased the duration of the outage.

 During the entire period of recovery/backfill, the network looked fine,
 nowhere close to saturation. iowait on all drives looks fine as well.

 Any ideas?

 Thanks,
 Mike Dawson



 On 8/14/2013 3:04 AM, Stefan Priebe - Profihost AG wrote:

 The same problem still occurs. I will need to check when I have time to
 gather logs again.

 Am 14.08.2013 01:11, schrieb Samuel Just:

 I'm not sure, but your logs did show that you had 16 recovery ops in
 flight, so it's worth a try.  If it doesn't help, you should collect
 the same set of logs I'll look again.  Also, there are a few other
 patches between 61.7 and current cuttlefish which may help.
 -Sam

 On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:


 Am 13.08.2013 um 22:43 schrieb Samuel Just sam.j...@inktank.com:

 I just backported a couple of patches from next to fix a bug where we
 weren't respecting the osd_recovery_max_active config in some cases
 (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e).  You can either try the
 current cuttlefish branch or wait for a 61.8 release.


 Thanks! Are you sure that this is the issue? I don't believe that but
 i'll give it a try. I already tested a branch from sage where he fixed a
 race regarding max active some weeks ago. So active recovering was max 1 
 but
 the issue didn't went away.

 Stefan

 -Sam

 On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just sam.j...@inktank.com
 wrote:

 I got swamped today.  I should be able to look tomorrow.  Sorry!
 -Sam

 On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:

 Did you take a look?

 Stefan

 Am 11.08.2013 um 05:50 schrieb Samuel Just sam.j...@inktank.com:

 Great!  I'll take a look on Monday.
 -Sam

 On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe
 s.pri...@profihost.ag wrote:

 Hi Samual,

 Am 09.08.2013 23:44, schrieb Samuel Just:

 I think Stefan's problem is probably distinct from Mike's.

 Stefan: Can you reproduce the problem with

 debug osd = 20
 debug filestore = 20
 debug ms = 1
 debug optracker = 20

 on a few osds (including the restarted osd), and upload those osd
 logs
 along with the ceph.log from before killing the osd until after
 the
 cluster becomes clean again?



 done - you'll find the logs at cephdrop folder:
 slow_requests_recovering_cuttlefish

 osd.52 was the one recovering

 Thanks!

 Greets,
 Stefan



RE: still recovery issues with cuttlefish

2013-08-21 Thread Yann ROBIN
It's osd recover clone overlap (see http://tracker.ceph.com/issues/5401)

-Original Message-
From: ceph-devel-ow...@vger.kernel.org 
[mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel Just
Sent: mercredi 21 août 2013 17:33
To: Mike Dawson
Cc: Stefan Priebe - Profihost AG; josh.dur...@inktank.com; 
ceph-devel@vger.kernel.org
Subject: Re: still recovery issues with cuttlefish

Have you tried setting osd_recovery_clone_overlap to false?  That seemed to 
help with Stefan's issue.
-Sam

On Wed, Aug 21, 2013 at 8:28 AM, Mike Dawson mike.daw...@cloudapt.com wrote:
 Sam/Josh,

 We upgraded from 0.61.7 to 0.67.1 during a maintenance window this 
 morning, hoping it would improve this situation, but there was no appreciable 
 change.

 One node in our cluster fsck'ed after a reboot and got a bit behind. 
 Our instances backed by RBD volumes were OK at that point, but once 
 the node booted fully and the OSDs started, all Windows instances with 
 rbd volumes experienced very choppy performance and were unable to 
 ingest video surveillance traffic and commit it to disk. Once the 
 cluster got back to HEALTH_OK, they resumed normal operation.

 I tried for a time with conservative recovery settings (osd max 
 backfills = 1, osd recovery op priority = 1, and osd recovery max 
 active = 1). No improvement for the guests. So I went to more 
 aggressive settings to get things moving faster. That decreased the duration 
 of the outage.

 During the entire period of recovery/backfill, the network looked 
 fine...no where close to saturation. iowait on all drives look fine as well.

 Any ideas?

 Thanks,
 Mike Dawson



 On 8/14/2013 3:04 AM, Stefan Priebe - Profihost AG wrote:

 the same problem still occours. Will need to check when i've time to 
 gather logs again.

 Am 14.08.2013 01:11, schrieb Samuel Just:

 I'm not sure, but your logs did show that you had 16 recovery ops 
 in flight, so it's worth a try.  If it doesn't help, you should 
 collect the same set of logs I'll look again.  Also, there are a few 
 other patches between 61.7 and current cuttlefish which may help.
 -Sam

 On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG 
 s.pri...@profihost.ag wrote:


 Am 13.08.2013 um 22:43 schrieb Samuel Just sam.j...@inktank.com:

 I just backported a couple of patches from next to fix a bug where 
 we weren't respecting the osd_recovery_max_active config in some 
 cases (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e).  You can either 
 try the current cuttlefish branch or wait for a 61.8 release.


 Thanks! Are you sure that this is the issue? I don't believe that 
 but i'll give it a try. I already tested a branch from sage where 
 he fixed a race regarding max active some weeks ago. So active 
 recovering was max 1 but the issue didn't went away.

 Stefan

 -Sam

 On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just 
 sam.j...@inktank.com
 wrote:

 I got swamped today.  I should be able to look tomorrow.  Sorry!
 -Sam

 On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG 
 s.pri...@profihost.ag wrote:

 Did you take a look?

 Stefan

 Am 11.08.2013 um 05:50 schrieb Samuel Just sam.j...@inktank.com:

 Great!  I'll take a look on Monday.
 -Sam

 On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe 
 s.pri...@profihost.ag wrote:

 Hi Samual,

 Am 09.08.2013 23:44, schrieb Samuel Just:

 I think Stefan's problem is probably distinct from Mike's.

 Stefan: Can you reproduce the problem with

 debug osd = 20
 debug filestore = 20
 debug ms = 1
 debug optracker = 20

 on a few osds (including the restarted osd), and upload those 
 osd logs along with the ceph.log from before killing the osd 
 until after the cluster becomes clean again?



 done - you'll find the logs at cephdrop folder:
 slow_requests_recovering_cuttlefish

 osd.52 was the one recovering

 Thanks!

 Greets,
 Stefan



Re: still recovery issues with cuttlefish

2013-08-21 Thread Mike Dawson

Sam,

Tried it. Injected with 'ceph tell osd.* injectargs -- 
--no_osd_recover_clone_overlap', then stopped one OSD for ~1 minute. 
Upon restart, all my Windows VMs have issues until HEALTH_OK.


The recovery was taking an abnormally long time, so I reverted away from 
--no_osd_recover_clone_overlap after about 10mins, to get back to HEALTH_OK.


Interestingly, a Raring guest running a different video surveillance 
package proceeded without any issue whatsoever.


Here is an image of the traffic to some of these Windows guests:

http://www.gammacode.com/upload/rbd-hang-with-clone-overlap.jpg

Ceph is outside of HEALTH_OK between ~12:55 and 13:10. Most of these 
instances rebooted due to an app error caused by the i/o hang shortly 
after 13:10.


These Windows instances are booted as COW clones from a Glance image 
using Cinder. They also have a second RBD volume for bulk storage. I'm 
using qemu 1.5.2.


Thanks,
Mike
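
In case it helps anyone retracing this, the effective value can be checked
per OSD through the admin socket; osd.0 and the default socket path below
are only examples:

  # dump the running configuration of osd.0 and filter for the option
  ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep recover_clone_overlap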


On 8/21/2013 1:12 PM, Samuel Just wrote:

Ah, thanks for the correction.
-Sam

On Wed, Aug 21, 2013 at 9:25 AM, Yann ROBIN yann.ro...@youscribe.com wrote:

It's osd recover clone overlap (see http://tracker.ceph.com/issues/5401)

-Original Message-
From: ceph-devel-ow...@vger.kernel.org 
[mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel Just
Sent: mercredi 21 août 2013 17:33
To: Mike Dawson
Cc: Stefan Priebe - Profihost AG; josh.dur...@inktank.com; 
ceph-devel@vger.kernel.org
Subject: Re: still recovery issues with cuttlefish

Have you tried setting osd_recovery_clone_overlap to false?  That seemed to 
help with Stefan's issue.
-Sam

On Wed, Aug 21, 2013 at 8:28 AM, Mike Dawson mike.daw...@cloudapt.com wrote:

Sam/Josh,

We upgraded from 0.61.7 to 0.67.1 during a maintenance window this
morning, hoping it would improve this situation, but there was no appreciable 
change.

One node in our cluster fsck'ed after a reboot and got a bit behind.
Our instances backed by RBD volumes were OK at that point, but once
the node booted fully and the OSDs started, all Windows instances with
rbd volumes experienced very choppy performance and were unable to
ingest video surveillance traffic and commit it to disk. Once the
cluster got back to HEALTH_OK, they resumed normal operation.

I tried for a time with conservative recovery settings (osd max
backfills = 1, osd recovery op priority = 1, and osd recovery max
active = 1). No improvement for the guests. So I went to more
aggressive settings to get things moving faster. That decreased the duration of 
the outage.

During the entire period of recovery/backfill, the network looked
fine...no where close to saturation. iowait on all drives look fine as well.

Any ideas?

Thanks,
Mike Dawson



On 8/14/2013 3:04 AM, Stefan Priebe - Profihost AG wrote:


the same problem still occours. Will need to check when i've time to
gather logs again.

Am 14.08.2013 01:11, schrieb Samuel Just:


I'm not sure, but your logs did show that you had 16 recovery ops
in flight, so it's worth a try.  If it doesn't help, you should
collect the same set of logs I'll look again.  Also, there are a few
other patches between 61.7 and current cuttlefish which may help.
-Sam

On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:



Am 13.08.2013 um 22:43 schrieb Samuel Just sam.j...@inktank.com:


I just backported a couple of patches from next to fix a bug where
we weren't respecting the osd_recovery_max_active config in some
cases (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e).  You can either
try the current cuttlefish branch or wait for a 61.8 release.



Thanks! Are you sure that this is the issue? I don't believe that
but i'll give it a try. I already tested a branch from sage where
he fixed a race regarding max active some weeks ago. So active
recovering was max 1 but the issue didn't went away.

Stefan


-Sam

On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just
sam.j...@inktank.com
wrote:


I got swamped today.  I should be able to look tomorrow.  Sorry!
-Sam

On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:


Did you take a look?

Stefan

Am 11.08.2013 um 05:50 schrieb Samuel Just sam.j...@inktank.com:


Great!  I'll take a look on Monday.
-Sam

On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe
s.pri...@profihost.ag wrote:


Hi Samual,

Am 09.08.2013 23:44, schrieb Samuel Just:


I think Stefan's problem is probably distinct from Mike's.

Stefan: Can you reproduce the problem with

debug osd = 20
debug filestore = 20
debug ms = 1
debug optracker = 20

on a few osds (including the restarted osd), and upload those
osd logs along with the ceph.log from before killing the osd
until after the cluster becomes clean again?




done - you'll find the logs at cephdrop folder:
slow_requests_recovering_cuttlefish

osd.52 was the one recovering

Thanks!

Greets,
Stefan



Re: still recovery issues with cuttlefish

2013-08-21 Thread Samuel Just
If the Raring guest was fine, I suspect that the issue is not on the OSDs.
-Sam

On Wed, Aug 21, 2013 at 10:55 AM, Mike Dawson mike.daw...@cloudapt.com wrote:
 Sam,

 Tried it. Injected with 'ceph tell osd.* injectargs --
 --no_osd_recover_clone_overlap', then stopped one OSD for ~1 minute. Upon
 restart, all my Windows VMs have issues until HEALTH_OK.

 The recovery was taking an abnormally long time, so I reverted away from
 --no_osd_recover_clone_overlap after about 10mins, to get back to HEALTH_OK.

 Interestingly, a Raring guest running a different video surveillance package
 proceeded without any issue whatsoever.

 Here is an image of the traffic to some of these Windows guests:

 http://www.gammacode.com/upload/rbd-hang-with-clone-overlap.jpg

 Ceph is outside of HEALTH_OK between ~12:55 and 13:10. Most of these
 instances rebooted due to an app error caused by the i/o hang shortly after
 13:10.

 These Windows instances are booted as COW clones from a Glance image using
 Cinder. They also have a second RBD volume for bulk storage. I'm using qemu
 1.5.2.

 Thanks,
 Mike



 On 8/21/2013 1:12 PM, Samuel Just wrote:

 Ah, thanks for the correction.
 -Sam

 On Wed, Aug 21, 2013 at 9:25 AM, Yann ROBIN yann.ro...@youscribe.com
 wrote:

 It's osd recover clone overlap (see http://tracker.ceph.com/issues/5401)

 -Original Message-
 From: ceph-devel-ow...@vger.kernel.org
 [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel Just
 Sent: mercredi 21 août 2013 17:33
 To: Mike Dawson
 Cc: Stefan Priebe - Profihost AG; josh.dur...@inktank.com;
 ceph-devel@vger.kernel.org
 Subject: Re: still recovery issues with cuttlefish

 Have you tried setting osd_recovery_clone_overlap to false?  That seemed
 to help with Stefan's issue.
 -Sam

 On Wed, Aug 21, 2013 at 8:28 AM, Mike Dawson mike.daw...@cloudapt.com
 wrote:

 Sam/Josh,

 We upgraded from 0.61.7 to 0.67.1 during a maintenance window this
 morning, hoping it would improve this situation, but there was no
 appreciable change.

 One node in our cluster fsck'ed after a reboot and got a bit behind.
 Our instances backed by RBD volumes were OK at that point, but once
 the node booted fully and the OSDs started, all Windows instances with
 rbd volumes experienced very choppy performance and were unable to
 ingest video surveillance traffic and commit it to disk. Once the
 cluster got back to HEALTH_OK, they resumed normal operation.

 I tried for a time with conservative recovery settings (osd max
 backfills = 1, osd recovery op priority = 1, and osd recovery max
 active = 1). No improvement for the guests. So I went to more
 aggressive settings to get things moving faster. That decreased the
 duration of the outage.

 During the entire period of recovery/backfill, the network looked
 fine...no where close to saturation. iowait on all drives look fine as
 well.

 Any ideas?

 Thanks,
 Mike Dawson



 On 8/14/2013 3:04 AM, Stefan Priebe - Profihost AG wrote:


 the same problem still occours. Will need to check when i've time to
 gather logs again.

 Am 14.08.2013 01:11, schrieb Samuel Just:


 I'm not sure, but your logs did show that you had 16 recovery ops
 in flight, so it's worth a try.  If it doesn't help, you should
 collect the same set of logs I'll look again.  Also, there are a few
 other patches between 61.7 and current cuttlefish which may help.
 -Sam

 On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:



 Am 13.08.2013 um 22:43 schrieb Samuel Just sam.j...@inktank.com:

 I just backported a couple of patches from next to fix a bug where
 we weren't respecting the osd_recovery_max_active config in some
 cases (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e).  You can either
 try the current cuttlefish branch or wait for a 61.8 release.



 Thanks! Are you sure that this is the issue? I don't believe that
 but i'll give it a try. I already tested a branch from sage where
 he fixed a race regarding max active some weeks ago. So active
 recovering was max 1 but the issue didn't went away.

 Stefan

 -Sam

 On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just
 sam.j...@inktank.com
 wrote:


 I got swamped today.  I should be able to look tomorrow.  Sorry!
 -Sam

 On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:


 Did you take a look?

 Stefan

 Am 11.08.2013 um 05:50 schrieb Samuel Just sam.j...@inktank.com:

 Great!  I'll take a look on Monday.
 -Sam

 On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe
 s.pri...@profihost.ag wrote:


 Hi Samual,

 Am 09.08.2013 23:44, schrieb Samuel Just:

 I think Stefan's problem is probably distinct from Mike's.

 Stefan: Can you reproduce the problem with

 debug osd = 20
 debug filestore = 20
 debug ms = 1
 debug optracker = 20

 on a few osds (including the restarted osd), and upload those
 osd logs along with the ceph.log from before killing the osd
 until after the cluster becomes clean again?




 done - you'll find the logs

Re: still recovery issues with cuttlefish

2013-08-21 Thread Stefan Priebe

On 21.08.2013 17:32, Samuel Just wrote:

Have you tried setting osd_recovery_clone_overlap to false?  That
seemed to help with Stefan's issue.


This might sound a bit harsh, but maybe that's due to my limited English 
skills ;-)

I still think that Ceph's recovery system is broken by design. If an OSD 
comes back (was offline), all write requests for PGs where this OSD is 
primary are targeted at it immediately. If it is not up to date for a PG, 
it tries to recover it immediately, which costs 4 MB per block. If you 
have a lot of small writes spread across your OSDs and PGs, you're stuck, 
as your OSD has to recover ALL of its PGs immediately, or at least lots 
of them, WHICH can't work. This is totally crazy.


I think the right way would be:
1.) if an OSD goes down, the replicas become primaries

or

2.) an OSD which does not have an up-to-date PG should redirect to the OSD 
holding the second or third replica.


Both would allow a really smooth and slow recovery without any stress, 
even under heavy 4K workloads like rbd-backed VMs.


Thanks for reading!

Greets Stefan
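
For reference, the primary-targeting behaviour described above can be
inspected from the command line; the pool name, object name and PG id below
are just examples:

  # map an object to its PG and the OSD set serving it (first OSD = primary)
  ceph osd map rbd some-object-name

  # or look up one PG directly, e.g. PG 2.1f
  ceph pg map 2.1f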



-Sam

On Wed, Aug 21, 2013 at 8:28 AM, Mike Dawson mike.daw...@cloudapt.com wrote:

Sam/Josh,

We upgraded from 0.61.7 to 0.67.1 during a maintenance window this morning,
hoping it would improve this situation, but there was no appreciable change.

One node in our cluster fsck'ed after a reboot and got a bit behind. Our
instances backed by RBD volumes were OK at that point, but once the node
booted fully and the OSDs started, all Windows instances with rbd volumes
experienced very choppy performance and were unable to ingest video
surveillance traffic and commit it to disk. Once the cluster got back to
HEALTH_OK, they resumed normal operation.

I tried for a time with conservative recovery settings (osd max backfills =
1, osd recovery op priority = 1, and osd recovery max active = 1). No
improvement for the guests. So I went to more aggressive settings to get
things moving faster. That decreased the duration of the outage.

During the entire period of recovery/backfill, the network looked fine...no
where close to saturation. iowait on all drives look fine as well.

Any ideas?

Thanks,
Mike Dawson



On 8/14/2013 3:04 AM, Stefan Priebe - Profihost AG wrote:


the same problem still occours. Will need to check when i've time to
gather logs again.

Am 14.08.2013 01:11, schrieb Samuel Just:


I'm not sure, but your logs did show that you had 16 recovery ops in
flight, so it's worth a try.  If it doesn't help, you should collect
the same set of logs I'll look again.  Also, there are a few other
patches between 61.7 and current cuttlefish which may help.
-Sam

On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:



Am 13.08.2013 um 22:43 schrieb Samuel Just sam.j...@inktank.com:


I just backported a couple of patches from next to fix a bug where we
weren't respecting the osd_recovery_max_active config in some cases
(1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e).  You can either try the
current cuttlefish branch or wait for a 61.8 release.



Thanks! Are you sure that this is the issue? I don't believe that but
i'll give it a try. I already tested a branch from sage where he fixed a
race regarding max active some weeks ago. So active recovering was max 1 but
the issue didn't went away.

Stefan


-Sam

On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just sam.j...@inktank.com
wrote:


I got swamped today.  I should be able to look tomorrow.  Sorry!
-Sam

On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:


Did you take a look?

Stefan

Am 11.08.2013 um 05:50 schrieb Samuel Just sam.j...@inktank.com:


Great!  I'll take a look on Monday.
-Sam

On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe
s.pri...@profihost.ag wrote:


Hi Samual,

Am 09.08.2013 23:44, schrieb Samuel Just:


I think Stefan's problem is probably distinct from Mike's.

Stefan: Can you reproduce the problem with

debug osd = 20
debug filestore = 20
debug ms = 1
debug optracker = 20

on a few osds (including the restarted osd), and upload those osd
logs
along with the ceph.log from before killing the osd until after
the
cluster becomes clean again?




done - you'll find the logs at cephdrop folder:
slow_requests_recovering_cuttlefish

osd.52 was the one recovering

Thanks!

Greets,
Stefan



Re: still recovery issues with cuttlefish

2013-08-21 Thread Samuel Just
As long as the request is for an object which is up to date on the
primary, the request will be served without waiting for recovery.  A
request only waits on recovery if the particular object being read or
written must be recovered.  Your issue was that recovering the
particular object being requested was unreasonably slow due to
silliness in the recovery code which you disabled by disabling
osd_recover_clone_overlap.

In cases where the primary osd is significantly behind, we do make one
of the other osds primary during recovery in order to expedite
requests (pgs in this state are shown as remapped).
-Sam
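
A rough way to see when that is happening is to look at the PG states;
exact output varies a little between releases:

  # cluster summary, including counts of degraded and remapped PGs
  ceph -s

  # list only the PGs currently remapped or recovering
  ceph pg dump | egrep 'remapped|recovering'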

On Wed, Aug 21, 2013 at 11:21 AM, Stefan Priebe s.pri...@profihost.ag wrote:
 Am 21.08.2013 17:32, schrieb Samuel Just:

 Have you tried setting osd_recovery_clone_overlap to false?  That
 seemed to help with Stefan's issue.


 This might sound a bug harsh but maybe due to my limited english skills ;-)

 I still think that Cephs recovery system is broken by design. If an OSD
 comes back (was offline) all write requests regarding PGs where this one is
 primary are targeted immediatly to this OSD. If this one is not up2date for
 an PG it tries to recover that one immediatly which costs 4MB / block. If
 you have a lot of small write all over your OSDs and PGs you're sucked as
 your OSD has to recover ALL it's PGs immediatly or at least lots of them
 WHICH can't work. This is totally crazy.

 I think the right way would be:
 1.) if an OSD goes down the replicas got primaries

 or

 2.) an OSD which does not have an up2date PG should redirect to the OSD
 holding the secondary or third replica.

 Both results in being able to have a really smooth and slow recovery without
 any stress even under heavy 4K workloads like rbd backed VMs.

 Thanks for reading!

 Greets Stefan



 -Sam

 On Wed, Aug 21, 2013 at 8:28 AM, Mike Dawson mike.daw...@cloudapt.com
 wrote:

 Sam/Josh,

 We upgraded from 0.61.7 to 0.67.1 during a maintenance window this
 morning,
 hoping it would improve this situation, but there was no appreciable
 change.

 One node in our cluster fsck'ed after a reboot and got a bit behind. Our
 instances backed by RBD volumes were OK at that point, but once the node
 booted fully and the OSDs started, all Windows instances with rbd volumes
 experienced very choppy performance and were unable to ingest video
 surveillance traffic and commit it to disk. Once the cluster got back to
 HEALTH_OK, they resumed normal operation.

 I tried for a time with conservative recovery settings (osd max backfills
 =
 1, osd recovery op priority = 1, and osd recovery max active = 1). No
 improvement for the guests. So I went to more aggressive settings to get
 things moving faster. That decreased the duration of the outage.

 During the entire period of recovery/backfill, the network looked
 fine...no
 where close to saturation. iowait on all drives look fine as well.

 Any ideas?

 Thanks,
 Mike Dawson



 On 8/14/2013 3:04 AM, Stefan Priebe - Profihost AG wrote:


 the same problem still occours. Will need to check when i've time to
 gather logs again.

 Am 14.08.2013 01:11, schrieb Samuel Just:


 I'm not sure, but your logs did show that you had 16 recovery ops in
 flight, so it's worth a try.  If it doesn't help, you should collect
 the same set of logs I'll look again.  Also, there are a few other
 patches between 61.7 and current cuttlefish which may help.
 -Sam

 On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:



 Am 13.08.2013 um 22:43 schrieb Samuel Just sam.j...@inktank.com:

 I just backported a couple of patches from next to fix a bug where we
 weren't respecting the osd_recovery_max_active config in some cases
 (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e).  You can either try the
 current cuttlefish branch or wait for a 61.8 release.



 Thanks! Are you sure that this is the issue? I don't believe that but
 i'll give it a try. I already tested a branch from sage where he fixed
 a
 race regarding max active some weeks ago. So active recovering was max
 1 but
 the issue didn't went away.

 Stefan

 -Sam

 On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just sam.j...@inktank.com
 wrote:


 I got swamped today.  I should be able to look tomorrow.  Sorry!
 -Sam

 On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:


 Did you take a look?

 Stefan

 Am 11.08.2013 um 05:50 schrieb Samuel Just sam.j...@inktank.com:

 Great!  I'll take a look on Monday.
 -Sam

 On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe
 s.pri...@profihost.ag wrote:


 Hi Samual,

 Am 09.08.2013 23:44, schrieb Samuel Just:

 I think Stefan's problem is probably distinct from Mike's.

 Stefan: Can you reproduce the problem with

 debug osd = 20
 debug filestore = 20
 debug ms = 1
 debug optracker = 20

 on a few osds (including the restarted osd), and upload those
 osd
 logs
 along with the ceph.log from before killing the osd until after
 the
 cluster becomes clean 

Re: still recovery issues with cuttlefish

2013-08-21 Thread Stefan Priebe

Hi Sam,
On 21.08.2013 21:13, Samuel Just wrote:

As long as the request is for an object which is up to date on the
primary, the request will be served without waiting for recovery.


Sure, but remember that with a random 4K VM workload a lot of objects go 
out of date pretty quickly.


 A request only waits on recovery if the particular object being read or

written must be recovered.


Yes, but under a 4K load this can be a lot of objects.


Your issue was that recovering the
particular object being requested was unreasonably slow due to
silliness in the recovery code which you disabled by disabling
osd_recover_clone_overlap.


Yes and no. It's better now, but far from good or perfect. My VMs no 
longer crash, but I still get a bunch of slow requests (around 10 
messages) and still a VERY high I/O load on the disks during recovery.



In cases where the primary osd is significantly behind, we do make one
of the other osds primary during recovery in order to expedite
requests (pgs in this state are shown as remapped).


Oh, I've never seen that, but at least in my case even 60 s is a very long 
timeframe, and the OSD is very stressed during recovery. Is it possible 
for me to set this value?


Stefan
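
There is no dedicated knob for the primary switch itself (see Sam's reply
elsewhere in this thread), but the recovery throttles mentioned in the
thread can be tuned. A conservative example for ceph.conf, using the values
Mike reports trying (quoted further down); the same options can also be
injected at runtime with ceph tell osd.* injectargs:

  [osd]
      # throttle the impact of recovery/backfill on client I/O
      osd max backfills = 1
      osd recovery op priority = 1
      osd recovery max active = 1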


-Sam

On Wed, Aug 21, 2013 at 11:21 AM, Stefan Priebe s.pri...@profihost.ag wrote:

Am 21.08.2013 17:32, schrieb Samuel Just:


Have you tried setting osd_recovery_clone_overlap to false?  That
seemed to help with Stefan's issue.



This might sound a bug harsh but maybe due to my limited english skills ;-)

I still think that Cephs recovery system is broken by design. If an OSD
comes back (was offline) all write requests regarding PGs where this one is
primary are targeted immediatly to this OSD. If this one is not up2date for
an PG it tries to recover that one immediatly which costs 4MB / block. If
you have a lot of small write all over your OSDs and PGs you're sucked as
your OSD has to recover ALL it's PGs immediatly or at least lots of them
WHICH can't work. This is totally crazy.

I think the right way would be:
1.) if an OSD goes down the replicas got primaries

or

2.) an OSD which does not have an up2date PG should redirect to the OSD
holding the secondary or third replica.

Both results in being able to have a really smooth and slow recovery without
any stress even under heavy 4K workloads like rbd backed VMs.

Thanks for reading!

Greets Stefan




-Sam

On Wed, Aug 21, 2013 at 8:28 AM, Mike Dawson mike.daw...@cloudapt.com
wrote:


Sam/Josh,

We upgraded from 0.61.7 to 0.67.1 during a maintenance window this
morning,
hoping it would improve this situation, but there was no appreciable
change.

One node in our cluster fsck'ed after a reboot and got a bit behind. Our
instances backed by RBD volumes were OK at that point, but once the node
booted fully and the OSDs started, all Windows instances with rbd volumes
experienced very choppy performance and were unable to ingest video
surveillance traffic and commit it to disk. Once the cluster got back to
HEALTH_OK, they resumed normal operation.

I tried for a time with conservative recovery settings (osd max backfills
=
1, osd recovery op priority = 1, and osd recovery max active = 1). No
improvement for the guests. So I went to more aggressive settings to get
things moving faster. That decreased the duration of the outage.

During the entire period of recovery/backfill, the network looked
fine...no
where close to saturation. iowait on all drives look fine as well.

Any ideas?

Thanks,
Mike Dawson



On 8/14/2013 3:04 AM, Stefan Priebe - Profihost AG wrote:



the same problem still occours. Will need to check when i've time to
gather logs again.

Am 14.08.2013 01:11, schrieb Samuel Just:



I'm not sure, but your logs did show that you had 16 recovery ops in
flight, so it's worth a try.  If it doesn't help, you should collect
the same set of logs I'll look again.  Also, there are a few other
patches between 61.7 and current cuttlefish which may help.
-Sam

On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:




Am 13.08.2013 um 22:43 schrieb Samuel Just sam.j...@inktank.com:


I just backported a couple of patches from next to fix a bug where we
weren't respecting the osd_recovery_max_active config in some cases
(1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e).  You can either try the
current cuttlefish branch or wait for a 61.8 release.




Thanks! Are you sure that this is the issue? I don't believe that but
i'll give it a try. I already tested a branch from sage where he fixed
a
race regarding max active some weeks ago. So active recovering was max
1 but
the issue didn't went away.

Stefan


-Sam

On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just sam.j...@inktank.com
wrote:



I got swamped today.  I should be able to look tomorrow.  Sorry!
-Sam

On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:



Did you take a look?

Stefan

Am 11.08.2013 um 05:50 schrieb Samuel Just 

Re: still recovery issues with cuttlefish

2013-08-21 Thread Samuel Just
It's not really possible at this time to control that limit because
changing the primary is actually fairly expensive and doing it
unnecessarily would probably make the situation much worse (it's
mostly necessary for backfilling, which is expensive anyway).  It
seems like forwarding IO on an object which needs to be recovered to a
replica with the object would be the next step.  Certainly something
to consider for the future.
-Sam

On Wed, Aug 21, 2013 at 12:37 PM, Stefan Priebe s.pri...@profihost.ag wrote:
 Hi Sam,
 Am 21.08.2013 21:13, schrieb Samuel Just:

 As long as the request is for an object which is up to date on the
 primary, the request will be served without waiting for recovery.


 Sure but remember if you have VM random 4K workload a lot of objects go out
 of date pretty soon.


 A request only waits on recovery if the particular object being read or

 written must be recovered.


 Yes but on 4k load this can be a lot.


 Your issue was that recovering the
 particular object being requested was unreasonably slow due to
 silliness in the recovery code which you disabled by disabling
 osd_recover_clone_overlap.


 Yes and no. It's better now but far away from being good or perfect. My VMs
 do not crash anymore but i still have a bunch of slow requests (just around
 10 messages) and still a VERY high I/O load on the disks during recovery.


 In cases where the primary osd is significantly behind, we do make one
 of the other osds primary during recovery in order to expedite
 requests (pgs in this state are shown as remapped).


 oh never seen that but at least in my case even 60s are a very long
 timeframe and the OSD is very stressed during recovery. Is it possible for
 me to set this value?


 Stefan

 -Sam

 On Wed, Aug 21, 2013 at 11:21 AM, Stefan Priebe s.pri...@profihost.ag
 wrote:

 Am 21.08.2013 17:32, schrieb Samuel Just:

 Have you tried setting osd_recovery_clone_overlap to false?  That
 seemed to help with Stefan's issue.



 This might sound a bug harsh but maybe due to my limited english skills
 ;-)

 I still think that Cephs recovery system is broken by design. If an OSD
 comes back (was offline) all write requests regarding PGs where this one
 is
 primary are targeted immediatly to this OSD. If this one is not up2date
 for
 an PG it tries to recover that one immediatly which costs 4MB / block. If
 you have a lot of small write all over your OSDs and PGs you're sucked as
 your OSD has to recover ALL it's PGs immediatly or at least lots of them
 WHICH can't work. This is totally crazy.

 I think the right way would be:
 1.) if an OSD goes down the replicas got primaries

 or

 2.) an OSD which does not have an up2date PG should redirect to the OSD
 holding the secondary or third replica.

 Both results in being able to have a really smooth and slow recovery
 without
 any stress even under heavy 4K workloads like rbd backed VMs.

 Thanks for reading!

 Greets Stefan



 -Sam

 On Wed, Aug 21, 2013 at 8:28 AM, Mike Dawson mike.daw...@cloudapt.com
 wrote:


 Sam/Josh,

 We upgraded from 0.61.7 to 0.67.1 during a maintenance window this
 morning,
 hoping it would improve this situation, but there was no appreciable
 change.

 One node in our cluster fsck'ed after a reboot and got a bit behind.
 Our
 instances backed by RBD volumes were OK at that point, but once the
 node
 booted fully and the OSDs started, all Windows instances with rbd
 volumes
 experienced very choppy performance and were unable to ingest video
 surveillance traffic and commit it to disk. Once the cluster got back
 to
 HEALTH_OK, they resumed normal operation.

 I tried for a time with conservative recovery settings (osd max
 backfills
 =
 1, osd recovery op priority = 1, and osd recovery max active = 1). No
 improvement for the guests. So I went to more aggressive settings to
 get
 things moving faster. That decreased the duration of the outage.

 During the entire period of recovery/backfill, the network looked
 fine...no
 where close to saturation. iowait on all drives look fine as well.

 Any ideas?

 Thanks,
 Mike Dawson



 On 8/14/2013 3:04 AM, Stefan Priebe - Profihost AG wrote:



 the same problem still occours. Will need to check when i've time to
 gather logs again.

 Am 14.08.2013 01:11, schrieb Samuel Just:



 I'm not sure, but your logs did show that you had 16 recovery ops in
 flight, so it's worth a try.  If it doesn't help, you should collect
 the same set of logs I'll look again.  Also, there are a few other
 patches between 61.7 and current cuttlefish which may help.
 -Sam

 On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:




 Am 13.08.2013 um 22:43 schrieb Samuel Just sam.j...@inktank.com:

 I just backported a couple of patches from next to fix a bug where
 we
 weren't respecting the osd_recovery_max_active config in some cases
 (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e).  You can either try the
 current cuttlefish branch or wait for a 61.8 release.

Re: still recovery issues with cuttlefish

2013-08-14 Thread Stefan Priebe - Profihost AG
The same problem still occurs. I will need to check when I have time to
gather logs again.

On 14.08.2013 01:11, Samuel Just wrote:
 I'm not sure, but your logs did show that you had 16 recovery ops in
 flight, so it's worth a try.  If it doesn't help, you should collect
 the same set of logs I'll look again.  Also, there are a few other
 patches between 61.7 and current cuttlefish which may help.
 -Sam
 
 On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:

 Am 13.08.2013 um 22:43 schrieb Samuel Just sam.j...@inktank.com:

 I just backported a couple of patches from next to fix a bug where we
 weren't respecting the osd_recovery_max_active config in some cases
 (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e).  You can either try the
 current cuttlefish branch or wait for a 61.8 release.

 Thanks! Are you sure that this is the issue? I don't believe that but i'll 
 give it a try. I already tested a branch from sage where he fixed a race 
 regarding max active some weeks ago. So active recovering was max 1 but the 
 issue didn't went away.

 Stefan

 -Sam

 On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just sam.j...@inktank.com wrote:
 I got swamped today.  I should be able to look tomorrow.  Sorry!
 -Sam

 On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:
 Did you take a look?

 Stefan

 Am 11.08.2013 um 05:50 schrieb Samuel Just sam.j...@inktank.com:

 Great!  I'll take a look on Monday.
 -Sam

 On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe s.pri...@profihost.ag 
 wrote:
 Hi Samual,

 Am 09.08.2013 23:44, schrieb Samuel Just:

 I think Stefan's problem is probably distinct from Mike's.

 Stefan: Can you reproduce the problem with

 debug osd = 20
 debug filestore = 20
 debug ms = 1
 debug optracker = 20

 on a few osds (including the restarted osd), and upload those osd logs
 along with the ceph.log from before killing the osd until after the
 cluster becomes clean again?


 done - you'll find the logs at cephdrop folder:
 slow_requests_recovering_cuttlefish

 osd.52 was the one recovering

 Thanks!

 Greets,
 Stefan


Re: still recovery issues with cuttlefish

2013-08-13 Thread Samuel Just
I just backported a couple of patches from next to fix a bug where we
weren't respecting the osd_recovery_max_active config in some cases
(1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e).  You can either try the
current cuttlefish branch or wait for a 61.8 release.
-Sam
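
For anyone wanting to check whether a particular build already carries that
backport, the commit id above can be matched against a local clone of the
ceph source tree; branch names depend on your remote:

  # inside a clone of the ceph repository
  git fetch origin
  # list remote branches that already contain the backported commit
  git branch -r --contains 1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e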

On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just sam.j...@inktank.com wrote:
 I got swamped today.  I should be able to look tomorrow.  Sorry!
 -Sam

 On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:
 Did you take a look?

 Stefan

 Am 11.08.2013 um 05:50 schrieb Samuel Just sam.j...@inktank.com:

 Great!  I'll take a look on Monday.
 -Sam

 On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe s.pri...@profihost.ag 
 wrote:
 Hi Samual,

 Am 09.08.2013 23:44, schrieb Samuel Just:

 I think Stefan's problem is probably distinct from Mike's.

 Stefan: Can you reproduce the problem with

 debug osd = 20
 debug filestore = 20
 debug ms = 1
 debug optracker = 20

 on a few osds (including the restarted osd), and upload those osd logs
 along with the ceph.log from before killing the osd until after the
 cluster becomes clean again?


 done - you'll find the logs at cephdrop folder:
 slow_requests_recovering_cuttlefish

 osd.52 was the one recovering

 Thanks!

 Greets,
 Stefan


Re: still recovery issues with cuttlefish

2013-08-13 Thread Stefan Priebe - Profihost AG

On 13.08.2013 at 22:43, Samuel Just sam.j...@inktank.com wrote:

 I just backported a couple of patches from next to fix a bug where we
 weren't respecting the osd_recovery_max_active config in some cases
 (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e).  You can either try the
 current cuttlefish branch or wait for a 61.8 release.

Thanks! Are you sure that this is the issue? I don't believe it, but I'll give 
it a try. I already tested a branch from Sage where he fixed a race regarding 
max active some weeks ago. Active recovering was capped at 1, but the issue 
didn't go away.

Stefan
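
A quick way to sanity-check how much recovery is actually in flight while
testing such a branch (a rough sketch; state names as printed by cuttlefish):

  # count PGs currently reported in a recovering state
  ceph pg dump | grep -c recovering

  # and watch the recovery throughput reported in the cluster log
  ceph -w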

 -Sam
 
 On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just sam.j...@inktank.com wrote:
 I got swamped today.  I should be able to look tomorrow.  Sorry!
 -Sam
 
 On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:
 Did you take a look?
 
 Stefan
 
 Am 11.08.2013 um 05:50 schrieb Samuel Just sam.j...@inktank.com:
 
 Great!  I'll take a look on Monday.
 -Sam
 
 On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe s.pri...@profihost.ag 
 wrote:
 Hi Samual,
 
 Am 09.08.2013 23:44, schrieb Samuel Just:
 
 I think Stefan's problem is probably distinct from Mike's.
 
 Stefan: Can you reproduce the problem with
 
 debug osd = 20
 debug filestore = 20
 debug ms = 1
 debug optracker = 20
 
 on a few osds (including the restarted osd), and upload those osd logs
 along with the ceph.log from before killing the osd until after the
 cluster becomes clean again?
 
 
 done - you'll find the logs at cephdrop folder:
 slow_requests_recovering_cuttlefish
 
 osd.52 was the one recovering
 
 Thanks!
 
 Greets,
 Stefan


Re: still recovery issues with cuttlefish

2013-08-13 Thread Samuel Just
I'm not sure, but your logs did show that you had 16 recovery ops in
flight, so it's worth a try.  If it doesn't help, you should collect
the same set of logs and I'll look again.  Also, there are a few other
patches between 61.7 and current cuttlefish which may help.
-Sam
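
Once running a build with that fix, the limit can also be adjusted at
runtime instead of via ceph.conf; a sketch, with the value 1 chosen only as
an example:

  # cap concurrent recovery ops per OSD without restarting the daemons
  # (the osd.* glob may need quoting depending on your shell)
  ceph tell osd.* injectargs '--osd_recovery_max_active 1'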

On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:

 Am 13.08.2013 um 22:43 schrieb Samuel Just sam.j...@inktank.com:

 I just backported a couple of patches from next to fix a bug where we
 weren't respecting the osd_recovery_max_active config in some cases
 (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e).  You can either try the
 current cuttlefish branch or wait for a 61.8 release.

 Thanks! Are you sure that this is the issue? I don't believe that but i'll 
 give it a try. I already tested a branch from sage where he fixed a race 
 regarding max active some weeks ago. So active recovering was max 1 but the 
 issue didn't went away.

 Stefan

 -Sam

 On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just sam.j...@inktank.com wrote:
 I got swamped today.  I should be able to look tomorrow.  Sorry!
 -Sam

 On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:
 Did you take a look?

 Stefan

 Am 11.08.2013 um 05:50 schrieb Samuel Just sam.j...@inktank.com:

 Great!  I'll take a look on Monday.
 -Sam

 On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe s.pri...@profihost.ag 
 wrote:
 Hi Samual,

 Am 09.08.2013 23:44, schrieb Samuel Just:

 I think Stefan's problem is probably distinct from Mike's.

 Stefan: Can you reproduce the problem with

 debug osd = 20
 debug filestore = 20
 debug ms = 1
 debug optracker = 20

 on a few osds (including the restarted osd), and upload those osd logs
 along with the ceph.log from before killing the osd until after the
 cluster becomes clean again?


 done - you'll find the logs at cephdrop folder:
 slow_requests_recovering_cuttlefish

 osd.52 was the one recovering

 Thanks!

 Greets,
 Stefan


Re: still recovery issues with cuttlefish

2013-08-12 Thread Stefan Priebe - Profihost AG
Did you take a look?

Stefan

On 11.08.2013 at 05:50, Samuel Just sam.j...@inktank.com wrote:

 Great!  I'll take a look on Monday.
 -Sam
 
 On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe s.pri...@profihost.ag wrote:
 Hi Samual,
 
 Am 09.08.2013 23:44, schrieb Samuel Just:
 
 I think Stefan's problem is probably distinct from Mike's.
 
 Stefan: Can you reproduce the problem with
 
 debug osd = 20
 debug filestore = 20
 debug ms = 1
 debug optracker = 20
 
 on a few osds (including the restarted osd), and upload those osd logs
 along with the ceph.log from before killing the osd until after the
 cluster becomes clean again?
 
 
 done - you'll find the logs at cephdrop folder:
 slow_requests_recovering_cuttlefish
 
 osd.52 was the one recovering
 
 Thanks!
 
 Greets,
 Stefan


Re: still recovery issues with cuttlefish

2013-08-12 Thread Samuel Just
I got swamped today.  I should be able to look tomorrow.  Sorry!
-Sam

On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:
 Did you take a look?

 Stefan

 Am 11.08.2013 um 05:50 schrieb Samuel Just sam.j...@inktank.com:

 Great!  I'll take a look on Monday.
 -Sam

 On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe s.pri...@profihost.ag 
 wrote:
 Hi Samual,

 Am 09.08.2013 23:44, schrieb Samuel Just:

 I think Stefan's problem is probably distinct from Mike's.

 Stefan: Can you reproduce the problem with

 debug osd = 20
 debug filestore = 20
 debug ms = 1
 debug optracker = 20

 on a few osds (including the restarted osd), and upload those osd logs
 along with the ceph.log from before killing the osd until after the
 cluster becomes clean again?


 done - you'll find the logs at cephdrop folder:
 slow_requests_recovering_cuttlefish

 osd.52 was the one recovering

 Thanks!

 Greets,
 Stefan


Re: still recovery issues with cuttlefish

2013-08-10 Thread Stefan Priebe

Hi Samuel,

On 09.08.2013 23:44, Samuel Just wrote:

I think Stefan's problem is probably distinct from Mike's.

Stefan: Can you reproduce the problem with

debug osd = 20
debug filestore = 20
debug ms = 1
debug optracker = 20

on a few osds (including the restarted osd), and upload those osd logs
along with the ceph.log from before killing the osd until after the
cluster becomes clean again?


done - you'll find the logs in the cephdrop folder:
slow_requests_recovering_cuttlefish


osd.52 was the one recovering

Thanks!

Greets,
Stefan
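
(A quick way to see what osd.52 was still recovering while the slow requests piled up - a sketch using stock commands and assuming a default deployment, not something taken from the uploaded logs:

    ceph health detail
    ceph pg dump | grep -E 'recovering|recovery_wait|backfill'

The second command just filters the PG listing down to the placement groups that are still in a recovery or backfill state.)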


Re: still recovery issues with cuttlefish

2013-08-09 Thread Samuel Just
, admin,
 rgw_keystone_token_cache_size: 1,
 rgw_keystone_revocation_interval: 900,
 rgw_admin_entry: admin,
 rgw_enforce_swift_acls: true,
 rgw_swift_token_expiration: 86400,
 rgw_print_continue: true,
 rgw_remote_addr_param: REMOTE_ADDR,
 rgw_op_thread_timeout: 600,
 rgw_op_thread_suicide_timeout: 0,
 rgw_thread_pool_size: 100,
 rgw_num_control_oids: 8,
 rgw_zone_root_pool: .rgw.root,
 rgw_log_nonexistent_bucket: false,
 rgw_log_object_name: %Y-%m-%d-%H-%i-%n,
 rgw_log_object_name_utc: false,
 rgw_usage_max_shards: 32,
 rgw_usage_max_user_shards: 1,
 rgw_enable_ops_log: false,
 rgw_enable_usage_log: false,
 rgw_ops_log_rados: true,
 rgw_ops_log_socket_path: ,
 rgw_ops_log_data_backlog: 5242880,
 rgw_usage_log_flush_threshold: 1024,
 rgw_usage_log_tick_interval: 30,
 rgw_intent_log_object_name: %Y-%m-%d-%i-%n,
 rgw_intent_log_object_name_utc: false,
 rgw_init_timeout: 300,
 rgw_mime_types_file: \/etc\/mime.types,
 rgw_gc_max_objs: 32,
 rgw_gc_obj_min_wait: 7200,
 rgw_gc_processor_max_time: 3600,
 rgw_gc_processor_period: 3600,
 rgw_s3_success_create_obj_status: 0,
 rgw_resolve_cname: false,
 rgw_obj_stripe_size: 4194304,
 rgw_extended_http_attrs: ,
 rgw_exit_timeout_secs: 120,
 rgw_get_obj_window_size: 16777216,
 rgw_get_obj_max_req_size: 4194304,
 rgw_relaxed_s3_bucket_names: false,
 rgw_list_buckets_max_chunk: 1000,
 mutex_perf_counter: false,
 internal_safe_to_start_threads: true}



 Stefan


 -Sam

 On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe
 s.pri...@profihost.ag
 wrote:



 Mike we already have the async patch running. Yes it helps but only
 helps
 it
 does not solve. It just hides the issue ...
 Am 01.08.2013 20:54, schrieb Mike Dawson:

 I am also seeing recovery issues with 0.61.7. Here's the process:

 - ceph osd set noout

 - Reboot one of the nodes hosting OSDs
 - VMs mounted from RBD volumes work properly

 - I see the OSD's boot messages as they re-join the cluster

 - Start seeing active+recovery_wait, peering, and
 active+recovering
 - VMs mounted from RBD volumes become unresponsive.

 - Recovery completes
 - VMs mounted from RBD volumes regain responsiveness

 - ceph osd unset noout

 Would joshd's async patch for qemu help here, or is there
 something
 else
 going on?

 Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY

 Thanks,

 Mike Dawson
 Co-Founder  Director of Cloud Architecture
 Cloudapt LLC
 6330 East 75th Street, Suite 170
 Indianapolis, IN 46250

 On 8/1/2013 2:34 PM, Samuel Just wrote:




 Can you reproduce and attach the ceph.log from before you stop
 the
 osd
 until after you have started the osd and it has recovered?
 -Sam

 On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:




 Hi,

 i still have recovery issues with cuttlefish. After the OSD
 comes
 back
 it seem to hang for around 2-4 minutes and then recovery
 seems to
 start
 (pgs in recovery_wait start to decrement). This is with ceph
 0.61.7.
 I
 get a lot of slow request messages an hanging VMs.

 What i noticed today is that if i leave the OSD off as long as
 ceph
 starts to backfill - the recovery and re backfilling wents
 absolutely
 smooth without any issues and no slow request messages at all.

 Does anybody have an idea why?

 Greets,
 Stefan


Re: still recovery issues with cuttlefish

2013-08-08 Thread Mike Dawson
,
rgw_gc_processor_period: 3600,
rgw_s3_success_create_obj_status: 0,
rgw_resolve_cname: false,
rgw_obj_stripe_size: 4194304,
rgw_extended_http_attrs: ,
rgw_exit_timeout_secs: 120,
rgw_get_obj_window_size: 16777216,
rgw_get_obj_max_req_size: 4194304,
rgw_relaxed_s3_bucket_names: false,
rgw_list_buckets_max_chunk: 1000,
mutex_perf_counter: false,
internal_safe_to_start_threads: true}



Stefan



-Sam

On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe
s.pri...@profihost.ag
wrote:



Mike we already have the async patch running. Yes it helps but only
helps
it
does not solve. It just hides the issue ...
Am 01.08.2013 20:54, schrieb Mike Dawson:


I am also seeing recovery issues with 0.61.7. Here's the process:

- ceph osd set noout

- Reboot one of the nodes hosting OSDs
- VMs mounted from RBD volumes work properly

- I see the OSD's boot messages as they re-join the cluster

- Start seeing active+recovery_wait, peering, and active+recovering
- VMs mounted from RBD volumes become unresponsive.

- Recovery completes
- VMs mounted from RBD volumes regain responsiveness

- ceph osd unset noout

Would joshd's async patch for qemu help here, or is there something
else
going on?

Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY

Thanks,

Mike Dawson
Co-Founder  Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250

On 8/1/2013 2:34 PM, Samuel Just wrote:




Can you reproduce and attach the ceph.log from before you stop the
osd
until after you have started the osd and it has recovered?
-Sam

On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:




Hi,

i still have recovery issues with cuttlefish. After the OSD comes
back
it seem to hang for around 2-4 minutes and then recovery seems to
start
(pgs in recovery_wait start to decrement). This is with ceph
0.61.7.
I
get a lot of slow request messages an hanging VMs.

What i noticed today is that if i leave the OSD off as long as
ceph
starts to backfill - the recovery and re backfilling wents
absolutely
smooth without any issues and no slow request messages at all.

Does anybody have an idea why?

Greets,
Stefan


Re: still recovery issues with cuttlefish

2013-08-08 Thread Stefan Priebe
: \/etc\/mime.types,
rgw_gc_max_objs: 32,
rgw_gc_obj_min_wait: 7200,
rgw_gc_processor_max_time: 3600,
rgw_gc_processor_period: 3600,
rgw_s3_success_create_obj_status: 0,
rgw_resolve_cname: false,
rgw_obj_stripe_size: 4194304,
rgw_extended_http_attrs: ,
rgw_exit_timeout_secs: 120,
rgw_get_obj_window_size: 16777216,
rgw_get_obj_max_req_size: 4194304,
rgw_relaxed_s3_bucket_names: false,
rgw_list_buckets_max_chunk: 1000,
mutex_perf_counter: false,
internal_safe_to_start_threads: true}



Stefan



-Sam

On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe
s.pri...@profihost.ag
wrote:



Mike we already have the async patch running. Yes it helps but only
helps
it
does not solve. It just hides the issue ...
Am 01.08.2013 20:54, schrieb Mike Dawson:


I am also seeing recovery issues with 0.61.7. Here's the process:

- ceph osd set noout

- Reboot one of the nodes hosting OSDs
- VMs mounted from RBD volumes work properly

- I see the OSD's boot messages as they re-join the cluster

- Start seeing active+recovery_wait, peering, and
active+recovering
- VMs mounted from RBD volumes become unresponsive.

- Recovery completes
- VMs mounted from RBD volumes regain responsiveness

- ceph osd unset noout

Would joshd's async patch for qemu help here, or is there
something
else
going on?

Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY

Thanks,

Mike Dawson
Co-Founder  Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250

On 8/1/2013 2:34 PM, Samuel Just wrote:




Can you reproduce and attach the ceph.log from before you stop
the
osd
until after you have started the osd and it has recovered?
-Sam

On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:




Hi,

i still have recovery issues with cuttlefish. After the OSD
comes
back
it seem to hang for around 2-4 minutes and then recovery
seems to
start
(pgs in recovery_wait start to decrement). This is with ceph
0.61.7.
I
get a lot of slow request messages an hanging VMs.

What i noticed today is that if i leave the OSD off as long as
ceph
starts to backfill - the recovery and re backfilling wents
absolutely
smooth without any issues and no slow request messages at all.

Does anybody have an idea why?

Greets,
Stefan


Re: still recovery issues with cuttlefish

2013-08-02 Thread Stefan Priebe
,
  rgw_socket_path: ,
  rgw_host: ,
  rgw_port: ,
  rgw_dns_name: ,
  rgw_script_uri: ,
  rgw_request_uri: ,
  rgw_swift_url: ,
  rgw_swift_url_prefix: swift,
  rgw_swift_auth_url: ,
  rgw_swift_auth_entry: auth,
  rgw_keystone_url: ,
  rgw_keystone_admin_token: ,
  rgw_keystone_accepted_roles: Member, admin,
  rgw_keystone_token_cache_size: 1,
  rgw_keystone_revocation_interval: 900,
  rgw_admin_entry: admin,
  rgw_enforce_swift_acls: true,
  rgw_swift_token_expiration: 86400,
  rgw_print_continue: true,
  rgw_remote_addr_param: REMOTE_ADDR,
  rgw_op_thread_timeout: 600,
  rgw_op_thread_suicide_timeout: 0,
  rgw_thread_pool_size: 100,
  rgw_num_control_oids: 8,
  rgw_zone_root_pool: .rgw.root,
  rgw_log_nonexistent_bucket: false,
  rgw_log_object_name: %Y-%m-%d-%H-%i-%n,
  rgw_log_object_name_utc: false,
  rgw_usage_max_shards: 32,
  rgw_usage_max_user_shards: 1,
  rgw_enable_ops_log: false,
  rgw_enable_usage_log: false,
  rgw_ops_log_rados: true,
  rgw_ops_log_socket_path: ,
  rgw_ops_log_data_backlog: 5242880,
  rgw_usage_log_flush_threshold: 1024,
  rgw_usage_log_tick_interval: 30,
  rgw_intent_log_object_name: %Y-%m-%d-%i-%n,
  rgw_intent_log_object_name_utc: false,
  rgw_init_timeout: 300,
  rgw_mime_types_file: \/etc\/mime.types,
  rgw_gc_max_objs: 32,
  rgw_gc_obj_min_wait: 7200,
  rgw_gc_processor_max_time: 3600,
  rgw_gc_processor_period: 3600,
  rgw_s3_success_create_obj_status: 0,
  rgw_resolve_cname: false,
  rgw_obj_stripe_size: 4194304,
  rgw_extended_http_attrs: ,
  rgw_exit_timeout_secs: 120,
  rgw_get_obj_window_size: 16777216,
  rgw_get_obj_max_req_size: 4194304,
  rgw_relaxed_s3_bucket_names: false,
  rgw_list_buckets_max_chunk: 1000,
  mutex_perf_counter: false,
  internal_safe_to_start_threads: true}



Stefan


-Sam

On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe s.pri...@profihost.ag wrote:

Mike we already have the async patch running. Yes it helps but only helps it
does not solve. It just hides the issue ...
Am 01.08.2013 20:54, schrieb Mike Dawson:


I am also seeing recovery issues with 0.61.7. Here's the process:

- ceph osd set noout

- Reboot one of the nodes hosting OSDs
  - VMs mounted from RBD volumes work properly

- I see the OSD's boot messages as they re-join the cluster

- Start seeing active+recovery_wait, peering, and active+recovering
  - VMs mounted from RBD volumes become unresponsive.

- Recovery completes
  - VMs mounted from RBD volumes regain responsiveness

- ceph osd unset noout

Would joshd's async patch for qemu help here, or is there something else
going on?

Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY

Thanks,

Mike Dawson
Co-Founder  Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250

On 8/1/2013 2:34 PM, Samuel Just wrote:


Can you reproduce and attach the ceph.log from before you stop the osd
until after you have started the osd and it has recovered?
-Sam

On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:


Hi,

i still have recovery issues with cuttlefish. After the OSD comes back
it seem to hang for around 2-4 minutes and then recovery seems to start
(pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
get a lot of slow request messages an hanging VMs.

What i noticed today is that if i leave the OSD off as long as ceph
starts to backfill - the recovery and re backfilling wents absolutely
smooth without any issues and no slow request messages at all.

Does anybody have an idea why?

Greets,
Stefan


Re: still recovery issues with cuttlefish

2013-08-02 Thread Samuel Just
,
   journal_ignore_corruption: false,
   rbd_cache: false,
   rbd_cache_writethrough_until_flush: false,
   rbd_cache_size: 33554432,
   rbd_cache_max_dirty: 25165824,
   rbd_cache_target_dirty: 16777216,
   rbd_cache_max_dirty_age: 1,
   rbd_cache_block_writes_upfront: false,
   rbd_concurrent_management_ops: 10,
   rbd_default_format: 1,
   rbd_default_order: 22,
   rbd_default_stripe_count: 1,
   rbd_default_stripe_unit: 4194304,
   rbd_default_features: 3,
   nss_db_path: ,
   rgw_data: \/var\/lib\/ceph\/radosgw\/ceph-0,
   rgw_enable_apis: s3, swift, swift_auth, admin,
   rgw_cache_enabled: true,
   rgw_cache_lru_size: 1,
   rgw_socket_path: ,
   rgw_host: ,
   rgw_port: ,
   rgw_dns_name: ,
   rgw_script_uri: ,
   rgw_request_uri: ,
   rgw_swift_url: ,
   rgw_swift_url_prefix: swift,
   rgw_swift_auth_url: ,
   rgw_swift_auth_entry: auth,
   rgw_keystone_url: ,
   rgw_keystone_admin_token: ,
   rgw_keystone_accepted_roles: Member, admin,
   rgw_keystone_token_cache_size: 1,
   rgw_keystone_revocation_interval: 900,
   rgw_admin_entry: admin,
   rgw_enforce_swift_acls: true,
   rgw_swift_token_expiration: 86400,
   rgw_print_continue: true,
   rgw_remote_addr_param: REMOTE_ADDR,
   rgw_op_thread_timeout: 600,
   rgw_op_thread_suicide_timeout: 0,
   rgw_thread_pool_size: 100,
   rgw_num_control_oids: 8,
   rgw_zone_root_pool: .rgw.root,
   rgw_log_nonexistent_bucket: false,
   rgw_log_object_name: %Y-%m-%d-%H-%i-%n,
   rgw_log_object_name_utc: false,
   rgw_usage_max_shards: 32,
   rgw_usage_max_user_shards: 1,
   rgw_enable_ops_log: false,
   rgw_enable_usage_log: false,
   rgw_ops_log_rados: true,
   rgw_ops_log_socket_path: ,
   rgw_ops_log_data_backlog: 5242880,
   rgw_usage_log_flush_threshold: 1024,
   rgw_usage_log_tick_interval: 30,
   rgw_intent_log_object_name: %Y-%m-%d-%i-%n,
   rgw_intent_log_object_name_utc: false,
   rgw_init_timeout: 300,
   rgw_mime_types_file: \/etc\/mime.types,
   rgw_gc_max_objs: 32,
   rgw_gc_obj_min_wait: 7200,
   rgw_gc_processor_max_time: 3600,
   rgw_gc_processor_period: 3600,
   rgw_s3_success_create_obj_status: 0,
   rgw_resolve_cname: false,
   rgw_obj_stripe_size: 4194304,
   rgw_extended_http_attrs: ,
   rgw_exit_timeout_secs: 120,
   rgw_get_obj_window_size: 16777216,
   rgw_get_obj_max_req_size: 4194304,
   rgw_relaxed_s3_bucket_names: false,
   rgw_list_buckets_max_chunk: 1000,
   mutex_perf_counter: false,
   internal_safe_to_start_threads: true}



 Stefan


 -Sam

 On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe s.pri...@profihost.ag
 wrote:

 Mike we already have the async patch running. Yes it helps but only helps
 it
 does not solve. It just hides the issue ...
 Am 01.08.2013 20:54, schrieb Mike Dawson:

 I am also seeing recovery issues with 0.61.7. Here's the process:

 - ceph osd set noout

 - Reboot one of the nodes hosting OSDs
   - VMs mounted from RBD volumes work properly

 - I see the OSD's boot messages as they re-join the cluster

 - Start seeing active+recovery_wait, peering, and active+recovering
   - VMs mounted from RBD volumes become unresponsive.

 - Recovery completes
   - VMs mounted from RBD volumes regain responsiveness

 - ceph osd unset noout

 Would joshd's async patch for qemu help here, or is there something else
 going on?

 Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY

 Thanks,

 Mike Dawson
 Co-Founder  Director of Cloud Architecture
 Cloudapt LLC
 6330 East 75th Street, Suite 170
 Indianapolis, IN 46250

 On 8/1/2013 2:34 PM, Samuel Just wrote:


 Can you reproduce and attach the ceph.log from before you stop the osd
 until after you have started the osd and it has recovered?
 -Sam

 On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:


 Hi,

 i still have recovery issues with cuttlefish. After the OSD comes back
 it seem to hang for around 2-4 minutes and then recovery seems to
 start
 (pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
 get a lot of slow request messages an hanging VMs.

 What i noticed today is that if i leave the OSD off as long as ceph
 starts to backfill - the recovery and re backfilling wents
 absolutely
 smooth without any issues and no slow request messages at all.

 Does anybody have an idea why?

 Greets,
 Stefan

Re: still recovery issues with cuttlefish

2013-08-02 Thread Andrey Korolyov
Created #5844.

On Thu, Aug 1, 2013 at 10:38 PM, Samuel Just sam.j...@inktank.com wrote:
 Is there a bug open for this?  I suspect we don't sufficiently
 throttle the snapshot removal work.
 -Sam

 On Thu, Aug 1, 2013 at 7:50 AM, Andrey Korolyov and...@xdel.ru wrote:
 Second this. Also for long-lasting snapshot problem and related
 performance issues I may say that cuttlefish improved things greatly,
 but creation/deletion of large snapshot (hundreds of gigabytes of
 commited data) still can bring down cluster for a minutes, despite
 usage of every possible optimization.

 On Thu, Aug 1, 2013 at 12:22 PM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:
 Hi,

 i still have recovery issues with cuttlefish. After the OSD comes back
 it seem to hang for around 2-4 minutes and then recovery seems to start
 (pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
 get a lot of slow request messages an hanging VMs.

 What i noticed today is that if i leave the OSD off as long as ceph
 starts to backfill - the recovery and re backfilling wents absolutely
 smooth without any issues and no slow request messages at all.

 Does anybody have an idea why?

 Greets,
 Stefan


Re: still recovery issues with cuttlefish

2013-08-02 Thread Stefan Priebe
,
   journal_queue_max_bytes: 33554432,
   journal_align_min_size: 65536,
   journal_replay_from: 0,
   journal_zero_on_create: false,
   journal_ignore_corruption: false,
   rbd_cache: false,
   rbd_cache_writethrough_until_flush: false,
   rbd_cache_size: 33554432,
   rbd_cache_max_dirty: 25165824,
   rbd_cache_target_dirty: 16777216,
   rbd_cache_max_dirty_age: 1,
   rbd_cache_block_writes_upfront: false,
   rbd_concurrent_management_ops: 10,
   rbd_default_format: 1,
   rbd_default_order: 22,
   rbd_default_stripe_count: 1,
   rbd_default_stripe_unit: 4194304,
   rbd_default_features: 3,
   nss_db_path: ,
   rgw_data: \/var\/lib\/ceph\/radosgw\/ceph-0,
   rgw_enable_apis: s3, swift, swift_auth, admin,
   rgw_cache_enabled: true,
   rgw_cache_lru_size: 1,
   rgw_socket_path: ,
   rgw_host: ,
   rgw_port: ,
   rgw_dns_name: ,
   rgw_script_uri: ,
   rgw_request_uri: ,
   rgw_swift_url: ,
   rgw_swift_url_prefix: swift,
   rgw_swift_auth_url: ,
   rgw_swift_auth_entry: auth,
   rgw_keystone_url: ,
   rgw_keystone_admin_token: ,
   rgw_keystone_accepted_roles: Member, admin,
   rgw_keystone_token_cache_size: 1,
   rgw_keystone_revocation_interval: 900,
   rgw_admin_entry: admin,
   rgw_enforce_swift_acls: true,
   rgw_swift_token_expiration: 86400,
   rgw_print_continue: true,
   rgw_remote_addr_param: REMOTE_ADDR,
   rgw_op_thread_timeout: 600,
   rgw_op_thread_suicide_timeout: 0,
   rgw_thread_pool_size: 100,
   rgw_num_control_oids: 8,
   rgw_zone_root_pool: .rgw.root,
   rgw_log_nonexistent_bucket: false,
   rgw_log_object_name: %Y-%m-%d-%H-%i-%n,
   rgw_log_object_name_utc: false,
   rgw_usage_max_shards: 32,
   rgw_usage_max_user_shards: 1,
   rgw_enable_ops_log: false,
   rgw_enable_usage_log: false,
   rgw_ops_log_rados: true,
   rgw_ops_log_socket_path: ,
   rgw_ops_log_data_backlog: 5242880,
   rgw_usage_log_flush_threshold: 1024,
   rgw_usage_log_tick_interval: 30,
   rgw_intent_log_object_name: %Y-%m-%d-%i-%n,
   rgw_intent_log_object_name_utc: false,
   rgw_init_timeout: 300,
   rgw_mime_types_file: \/etc\/mime.types,
   rgw_gc_max_objs: 32,
   rgw_gc_obj_min_wait: 7200,
   rgw_gc_processor_max_time: 3600,
   rgw_gc_processor_period: 3600,
   rgw_s3_success_create_obj_status: 0,
   rgw_resolve_cname: false,
   rgw_obj_stripe_size: 4194304,
   rgw_extended_http_attrs: ,
   rgw_exit_timeout_secs: 120,
   rgw_get_obj_window_size: 16777216,
   rgw_get_obj_max_req_size: 4194304,
   rgw_relaxed_s3_bucket_names: false,
   rgw_list_buckets_max_chunk: 1000,
   mutex_perf_counter: false,
   internal_safe_to_start_threads: true}



Stefan



-Sam

On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe s.pri...@profihost.ag
wrote:


Mike we already have the async patch running. Yes it helps but only helps
it
does not solve. It just hides the issue ...
Am 01.08.2013 20:54, schrieb Mike Dawson:


I am also seeing recovery issues with 0.61.7. Here's the process:

- ceph osd set noout

- Reboot one of the nodes hosting OSDs
   - VMs mounted from RBD volumes work properly

- I see the OSD's boot messages as they re-join the cluster

- Start seeing active+recovery_wait, peering, and active+recovering
   - VMs mounted from RBD volumes become unresponsive.

- Recovery completes
   - VMs mounted from RBD volumes regain responsiveness

- ceph osd unset noout

Would joshd's async patch for qemu help here, or is there something else
going on?

Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY

Thanks,

Mike Dawson
Co-Founder  Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250

On 8/1/2013 2:34 PM, Samuel Just wrote:



Can you reproduce and attach the ceph.log from before you stop the osd
until after you have started the osd and it has recovered?
-Sam

On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:



Hi,

i still have recovery issues with cuttlefish. After the OSD comes back
it seem to hang for around 2-4 minutes and then recovery seems to
start
(pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
get a lot of slow request messages an hanging VMs.

What i noticed today is that if i leave the OSD off as long as ceph
starts to backfill - the recovery and re backfilling wents
absolutely
smooth without any issues and no slow request messages at all.

Does anybody have an idea why?

Greets,
Stefan

Re: still recovery issues with cuttlefish

2013-08-02 Thread Samuel Just
,
filestore_op_thread_timeout: 60,
filestore_op_thread_suicide_timeout: 180,
filestore_commit_timeout: 600,
filestore_fiemap_threshold: 4096,
filestore_merge_threshold: 10,
filestore_split_multiple: 2,
filestore_update_to: 1000,
filestore_blackhole: false,
filestore_dump_file: ,
filestore_kill_at: 0,
filestore_inject_stall: 0,
filestore_fail_eio: true,
filestore_replica_fadvise: true,
filestore_debug_verify_split: false,
journal_dio: true,
journal_aio: true,
journal_force_aio: false,
journal_max_corrupt_search: 10485760,
journal_block_align: true,
journal_write_header_frequency: 0,
journal_max_write_bytes: 10485760,
journal_max_write_entries: 100,
journal_queue_max_ops: 300,
journal_queue_max_bytes: 33554432,
journal_align_min_size: 65536,
journal_replay_from: 0,
journal_zero_on_create: false,
journal_ignore_corruption: false,
rbd_cache: false,
rbd_cache_writethrough_until_flush: false,
rbd_cache_size: 33554432,
rbd_cache_max_dirty: 25165824,
rbd_cache_target_dirty: 16777216,
rbd_cache_max_dirty_age: 1,
rbd_cache_block_writes_upfront: false,
rbd_concurrent_management_ops: 10,
rbd_default_format: 1,
rbd_default_order: 22,
rbd_default_stripe_count: 1,
rbd_default_stripe_unit: 4194304,
rbd_default_features: 3,
nss_db_path: ,
rgw_data: \/var\/lib\/ceph\/radosgw\/ceph-0,
rgw_enable_apis: s3, swift, swift_auth, admin,
rgw_cache_enabled: true,
rgw_cache_lru_size: 1,
rgw_socket_path: ,
rgw_host: ,
rgw_port: ,
rgw_dns_name: ,
rgw_script_uri: ,
rgw_request_uri: ,
rgw_swift_url: ,
rgw_swift_url_prefix: swift,
rgw_swift_auth_url: ,
rgw_swift_auth_entry: auth,
rgw_keystone_url: ,
rgw_keystone_admin_token: ,
rgw_keystone_accepted_roles: Member, admin,
rgw_keystone_token_cache_size: 1,
rgw_keystone_revocation_interval: 900,
rgw_admin_entry: admin,
rgw_enforce_swift_acls: true,
rgw_swift_token_expiration: 86400,
rgw_print_continue: true,
rgw_remote_addr_param: REMOTE_ADDR,
rgw_op_thread_timeout: 600,
rgw_op_thread_suicide_timeout: 0,
rgw_thread_pool_size: 100,
rgw_num_control_oids: 8,
rgw_zone_root_pool: .rgw.root,
rgw_log_nonexistent_bucket: false,
rgw_log_object_name: %Y-%m-%d-%H-%i-%n,
rgw_log_object_name_utc: false,
rgw_usage_max_shards: 32,
rgw_usage_max_user_shards: 1,
rgw_enable_ops_log: false,
rgw_enable_usage_log: false,
rgw_ops_log_rados: true,
rgw_ops_log_socket_path: ,
rgw_ops_log_data_backlog: 5242880,
rgw_usage_log_flush_threshold: 1024,
rgw_usage_log_tick_interval: 30,
rgw_intent_log_object_name: %Y-%m-%d-%i-%n,
rgw_intent_log_object_name_utc: false,
rgw_init_timeout: 300,
rgw_mime_types_file: \/etc\/mime.types,
rgw_gc_max_objs: 32,
rgw_gc_obj_min_wait: 7200,
rgw_gc_processor_max_time: 3600,
rgw_gc_processor_period: 3600,
rgw_s3_success_create_obj_status: 0,
rgw_resolve_cname: false,
rgw_obj_stripe_size: 4194304,
rgw_extended_http_attrs: ,
rgw_exit_timeout_secs: 120,
rgw_get_obj_window_size: 16777216,
rgw_get_obj_max_req_size: 4194304,
rgw_relaxed_s3_bucket_names: false,
rgw_list_buckets_max_chunk: 1000,
mutex_perf_counter: false,
internal_safe_to_start_threads: true}



 Stefan


 -Sam

 On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe s.pri...@profihost.ag
 wrote:


 Mike we already have the async patch running. Yes it helps but only
 helps
 it
 does not solve. It just hides the issue ...
 Am 01.08.2013 20:54, schrieb Mike Dawson:

 I am also seeing recovery issues with 0.61.7. Here's the process:

 - ceph osd set noout

 - Reboot one of the nodes hosting OSDs
- VMs mounted from RBD volumes work properly

 - I see the OSD's boot messages as they re-join the cluster

 - Start seeing active+recovery_wait, peering, and active+recovering
- VMs mounted from RBD volumes become unresponsive.

 - Recovery completes
- VMs mounted from RBD volumes regain responsiveness

 - ceph osd unset noout

 Would joshd's async patch for qemu help here, or is there something
 else
 going on?

 Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY

 Thanks,

 Mike Dawson
 Co-Founder  Director of Cloud Architecture
 Cloudapt LLC
 6330 East 75th Street, Suite 170
 Indianapolis, IN 46250

 On 8/1/2013 2:34 PM, Samuel Just wrote:



 Can you reproduce and attach the ceph.log from before you stop the
 osd
 until after you have started the osd and it has recovered?
 -Sam

 On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:



 Hi,

 i still have recovery issues with cuttlefish. After the OSD comes
 back
 it seem to hang for around 2-4 minutes and then recovery seems to
 start
 (pgs in recovery_wait start to decrement). This is with ceph 0.61.7.
 I
 get a lot of slow

Re: still recovery issues with cuttlefish

2013-08-02 Thread Stefan Priebe
,
filestore_queue_max_bytes: 104857600,
filestore_queue_committing_max_ops: 5000,
filestore_queue_committing_max_bytes: 104857600,
filestore_op_threads: 2,
filestore_op_thread_timeout: 60,
filestore_op_thread_suicide_timeout: 180,
filestore_commit_timeout: 600,
filestore_fiemap_threshold: 4096,
filestore_merge_threshold: 10,
filestore_split_multiple: 2,
filestore_update_to: 1000,
filestore_blackhole: false,
filestore_dump_file: ,
filestore_kill_at: 0,
filestore_inject_stall: 0,
filestore_fail_eio: true,
filestore_replica_fadvise: true,
filestore_debug_verify_split: false,
journal_dio: true,
journal_aio: true,
journal_force_aio: false,
journal_max_corrupt_search: 10485760,
journal_block_align: true,
journal_write_header_frequency: 0,
journal_max_write_bytes: 10485760,
journal_max_write_entries: 100,
journal_queue_max_ops: 300,
journal_queue_max_bytes: 33554432,
journal_align_min_size: 65536,
journal_replay_from: 0,
journal_zero_on_create: false,
journal_ignore_corruption: false,
rbd_cache: false,
rbd_cache_writethrough_until_flush: false,
rbd_cache_size: 33554432,
rbd_cache_max_dirty: 25165824,
rbd_cache_target_dirty: 16777216,
rbd_cache_max_dirty_age: 1,
rbd_cache_block_writes_upfront: false,
rbd_concurrent_management_ops: 10,
rbd_default_format: 1,
rbd_default_order: 22,
rbd_default_stripe_count: 1,
rbd_default_stripe_unit: 4194304,
rbd_default_features: 3,
nss_db_path: ,
rgw_data: \/var\/lib\/ceph\/radosgw\/ceph-0,
rgw_enable_apis: s3, swift, swift_auth, admin,
rgw_cache_enabled: true,
rgw_cache_lru_size: 1,
rgw_socket_path: ,
rgw_host: ,
rgw_port: ,
rgw_dns_name: ,
rgw_script_uri: ,
rgw_request_uri: ,
rgw_swift_url: ,
rgw_swift_url_prefix: swift,
rgw_swift_auth_url: ,
rgw_swift_auth_entry: auth,
rgw_keystone_url: ,
rgw_keystone_admin_token: ,
rgw_keystone_accepted_roles: Member, admin,
rgw_keystone_token_cache_size: 1,
rgw_keystone_revocation_interval: 900,
rgw_admin_entry: admin,
rgw_enforce_swift_acls: true,
rgw_swift_token_expiration: 86400,
rgw_print_continue: true,
rgw_remote_addr_param: REMOTE_ADDR,
rgw_op_thread_timeout: 600,
rgw_op_thread_suicide_timeout: 0,
rgw_thread_pool_size: 100,
rgw_num_control_oids: 8,
rgw_zone_root_pool: .rgw.root,
rgw_log_nonexistent_bucket: false,
rgw_log_object_name: %Y-%m-%d-%H-%i-%n,
rgw_log_object_name_utc: false,
rgw_usage_max_shards: 32,
rgw_usage_max_user_shards: 1,
rgw_enable_ops_log: false,
rgw_enable_usage_log: false,
rgw_ops_log_rados: true,
rgw_ops_log_socket_path: ,
rgw_ops_log_data_backlog: 5242880,
rgw_usage_log_flush_threshold: 1024,
rgw_usage_log_tick_interval: 30,
rgw_intent_log_object_name: %Y-%m-%d-%i-%n,
rgw_intent_log_object_name_utc: false,
rgw_init_timeout: 300,
rgw_mime_types_file: \/etc\/mime.types,
rgw_gc_max_objs: 32,
rgw_gc_obj_min_wait: 7200,
rgw_gc_processor_max_time: 3600,
rgw_gc_processor_period: 3600,
rgw_s3_success_create_obj_status: 0,
rgw_resolve_cname: false,
rgw_obj_stripe_size: 4194304,
rgw_extended_http_attrs: ,
rgw_exit_timeout_secs: 120,
rgw_get_obj_window_size: 16777216,
rgw_get_obj_max_req_size: 4194304,
rgw_relaxed_s3_bucket_names: false,
rgw_list_buckets_max_chunk: 1000,
mutex_perf_counter: false,
internal_safe_to_start_threads: true}



Stefan



-Sam

On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe s.pri...@profihost.ag
wrote:



Mike we already have the async patch running. Yes it helps but only
helps
it
does not solve. It just hides the issue ...
Am 01.08.2013 20:54, schrieb Mike Dawson:


I am also seeing recovery issues with 0.61.7. Here's the process:

- ceph osd set noout

- Reboot one of the nodes hosting OSDs
- VMs mounted from RBD volumes work properly

- I see the OSD's boot messages as they re-join the cluster

- Start seeing active+recovery_wait, peering, and active+recovering
- VMs mounted from RBD volumes become unresponsive.

- Recovery completes
- VMs mounted from RBD volumes regain responsiveness

- ceph osd unset noout

Would joshd's async patch for qemu help here, or is there something
else
going on?

Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY

Thanks,

Mike Dawson
Co-Founder  Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250

On 8/1/2013 2:34 PM, Samuel Just wrote:




Can you reproduce and attach the ceph.log from before you stop the
osd
until after you have started the osd and it has recovered?
-Sam

On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:




Hi,

i still have recovery issues with cuttlefish. After the OSD comes
back
it seem to hang

still recovery issues with cuttlefish

2013-08-01 Thread Stefan Priebe - Profihost AG
Hi,

I still have recovery issues with cuttlefish. After the OSD comes back,
it seems to hang for around 2-4 minutes and then recovery seems to start
(pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
get a lot of slow request messages and hanging VMs.

What I noticed today is that if I leave the OSD off until ceph starts
to backfill, the recovery and re-backfilling go absolutely smoothly,
without any issues and no slow request messages at all.

Does anybody have an idea why?

Greets,
Stefan
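
(One knob that is commonly turned down when recovery competes this badly with client I/O is the recovery/backfill concurrency. A sketch, assuming the option names present in cuttlefish-era builds - defaults and availability may differ:

    ceph tell osd.52 injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'

    # or in the [osd] section of ceph.conf:
    #   osd max backfills = 1
    #   osd recovery max active = 1
    #   osd recovery op priority = 1

This only limits how many recovery/backfill operations run in parallel per OSD; it does not change what has to be recovered.)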


Re: still recovery issues with cuttlefish

2013-08-01 Thread Andrey Korolyov
Second this. Also, regarding the long-standing snapshot problem and related
performance issues, I can say that cuttlefish improved things greatly,
but creation/deletion of a large snapshot (hundreds of gigabytes of
committed data) can still bring the cluster down for minutes, despite
the use of every possible optimization.

On Thu, Aug 1, 2013 at 12:22 PM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:
 Hi,

 i still have recovery issues with cuttlefish. After the OSD comes back
 it seem to hang for around 2-4 minutes and then recovery seems to start
 (pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
 get a lot of slow request messages an hanging VMs.

 What i noticed today is that if i leave the OSD off as long as ceph
 starts to backfill - the recovery and re backfilling wents absolutely
 smooth without any issues and no slow request messages at all.

 Does anybody have an idea why?

 Greets,
 Stefan


Re: still recovery issues with cuttlefish

2013-08-01 Thread Samuel Just
Can you reproduce and attach the ceph.log from before you stop the osd
until after you have started the osd and it has recovered?
-Sam

On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:
 Hi,

 i still have recovery issues with cuttlefish. After the OSD comes back
 it seem to hang for around 2-4 minutes and then recovery seems to start
 (pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
 get a lot of slow request messages an hanging VMs.

 What i noticed today is that if i leave the OSD off as long as ceph
 starts to backfill - the recovery and re backfilling wents absolutely
 smooth without any issues and no slow request messages at all.

 Does anybody have an idea why?

 Greets,
 Stefan


Re: still recovery issues with cuttlefish

2013-08-01 Thread Stefan Priebe

On 01.08.2013 20:34, Samuel Just wrote:

Can you reproduce and attach the ceph.log from before you stop the osd
until after you have started the osd and it has recovered?
-Sam


Sure, which log levels?


On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:

Hi,

i still have recovery issues with cuttlefish. After the OSD comes back
it seem to hang for around 2-4 minutes and then recovery seems to start
(pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
get a lot of slow request messages an hanging VMs.

What i noticed today is that if i leave the OSD off as long as ceph
starts to backfill - the recovery and re backfilling wents absolutely
smooth without any issues and no slow request messages at all.

Does anybody have an idea why?

Greets,
Stefan


Re: still recovery issues with cuttlefish

2013-08-01 Thread Samuel Just
For now, just the main ceph.log.
-Sam

On Thu, Aug 1, 2013 at 11:34 AM, Stefan Priebe s.pri...@profihost.ag wrote:
 On 01.08.2013 20:34, Samuel Just wrote:

 Can you reproduce and attach the ceph.log from before you stop the osd
 until after you have started the osd and it has recovered?
 -Sam


 Sure which log levels?


 On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:

 Hi,

 i still have recovery issues with cuttlefish. After the OSD comes back
 it seem to hang for around 2-4 minutes and then recovery seems to start
 (pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
 get a lot of slow request messages an hanging VMs.

 What i noticed today is that if i leave the OSD off as long as ceph
 starts to backfill - the recovery and re backfilling wents absolutely
 smooth without any issues and no slow request messages at all.

 Does anybody have an idea why?

 Greets,
 Stefan


Re: still recovery issues with cuttlefish

2013-08-01 Thread Samuel Just
It doesn't have log levels; it should be in /var/log/ceph/ceph.log.
-Sam
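
(The slow request warnings land in that cluster log as WRN lines, so a rough way to pull them out - assuming the default log location - is:

    grep -c 'slow request' /var/log/ceph/ceph.log
    grep 'slow request' /var/log/ceph/ceph.log | tail -n 50

The first command counts them; the second shows the most recent ones along with the OSDs reporting them.)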

On Thu, Aug 1, 2013 at 11:36 AM, Samuel Just sam.j...@inktank.com wrote:
 For now, just the main ceph.log.
 -Sam

 On Thu, Aug 1, 2013 at 11:34 AM, Stefan Priebe s.pri...@profihost.ag wrote:
 On 01.08.2013 20:34, Samuel Just wrote:

 Can you reproduce and attach the ceph.log from before you stop the osd
 until after you have started the osd and it has recovered?
 -Sam


 Sure which log levels?


 On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:

 Hi,

 i still have recovery issues with cuttlefish. After the OSD comes back
 it seem to hang for around 2-4 minutes and then recovery seems to start
 (pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
 get a lot of slow request messages an hanging VMs.

 What i noticed today is that if i leave the OSD off as long as ceph
 starts to backfill - the recovery and re backfilling wents absolutely
 smooth without any issues and no slow request messages at all.

 Does anybody have an idea why?

 Greets,
 Stefan


Re: still recovery issues with cuttlefish

2013-08-01 Thread Mike Dawson

I am also seeing recovery issues with 0.61.7. Here's the process:

- ceph osd set noout

- Reboot one of the nodes hosting OSDs
- VMs mounted from RBD volumes work properly

- I see the OSD's boot messages as they re-join the cluster

- Start seeing active+recovery_wait, peering, and active+recovering
- VMs mounted from RBD volumes become unresponsive.

- Recovery completes
- VMs mounted from RBD volumes regain responsiveness

- ceph osd unset noout

Would joshd's async patch for qemu help here, or is there something else 
going on?


Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY
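
(The maintenance cycle above boils down to something like the following - the hostname is just a placeholder:

    ceph osd set noout        # keep the down OSDs from being marked out while the node reboots
    ssh osd-node-1 reboot     # reboot the OSD host (placeholder hostname)
    ceph -w                   # watch peering/recovery until all PGs are active+clean again
    ceph osd unset noout      # restore normal out-marking behaviour
)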

Thanks,

Mike Dawson
Co-Founder & Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250

On 8/1/2013 2:34 PM, Samuel Just wrote:

Can you reproduce and attach the ceph.log from before you stop the osd
until after you have started the osd and it has recovered?
-Sam

On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:

Hi,

i still have recovery issues with cuttlefish. After the OSD comes back
it seem to hang for around 2-4 minutes and then recovery seems to start
(pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
get a lot of slow request messages an hanging VMs.

What i noticed today is that if i leave the OSD off as long as ceph
starts to backfill - the recovery and re backfilling wents absolutely
smooth without any issues and no slow request messages at all.

Does anybody have an idea why?

Greets,
Stefan


Re: still recovery issues with cuttlefish

2013-08-01 Thread Stefan Priebe
Mike, we already have the async patch running. Yes, it helps, but it only
helps; it does not solve the problem. It just hides the issue ...

On 01.08.2013 20:54, Mike Dawson wrote:

I am also seeing recovery issues with 0.61.7. Here's the process:

- ceph osd set noout

- Reboot one of the nodes hosting OSDs
 - VMs mounted from RBD volumes work properly

- I see the OSD's boot messages as they re-join the cluster

- Start seeing active+recovery_wait, peering, and active+recovering
 - VMs mounted from RBD volumes become unresponsive.

- Recovery completes
 - VMs mounted from RBD volumes regain responsiveness

- ceph osd unset noout

Would joshd's async patch for qemu help here, or is there something else
going on?

Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY

Thanks,

Mike Dawson
Co-Founder  Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250

On 8/1/2013 2:34 PM, Samuel Just wrote:

Can you reproduce and attach the ceph.log from before you stop the osd
until after you have started the osd and it has recovered?
-Sam

On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:

Hi,

i still have recovery issues with cuttlefish. After the OSD comes back
it seem to hang for around 2-4 minutes and then recovery seems to start
(pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
get a lot of slow request messages an hanging VMs.

What i noticed today is that if i leave the OSD off as long as ceph
starts to backfill - the recovery and re backfilling wents absolutely
smooth without any issues and no slow request messages at all.

Does anybody have an idea why?

Greets,
Stefan


Re: still recovery issues with cuttlefish

2013-08-01 Thread Samuel Just
Can you dump your osd settings?
sudo ceph --admin-daemon ceph-osd.osdid.asok config show
-Sam
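
(With the default admin socket layout that would be, for example:

    sudo ceph --admin-daemon /var/run/ceph/ceph-osd.52.asok config show \
        | grep -E 'recovery|backfill|op_priority'

The grep is optional; it just trims the full dump down to the recovery/backfill related values. osd.52 and the /var/run/ceph path are assumptions based on a stock install.)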

On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe s.pri...@profihost.ag wrote:
 Mike we already have the async patch running. Yes it helps but only helps it
 does not solve. It just hides the issue ...
 Am 01.08.2013 20:54, schrieb Mike Dawson:

 I am also seeing recovery issues with 0.61.7. Here's the process:

 - ceph osd set noout

 - Reboot one of the nodes hosting OSDs
  - VMs mounted from RBD volumes work properly

 - I see the OSD's boot messages as they re-join the cluster

 - Start seeing active+recovery_wait, peering, and active+recovering
  - VMs mounted from RBD volumes become unresponsive.

 - Recovery completes
  - VMs mounted from RBD volumes regain responsiveness

 - ceph osd unset noout

 Would joshd's async patch for qemu help here, or is there something else
 going on?

 Output of ceph -w at: http://pastebin.com/raw.php?i=JLcZYFzY

 Thanks,

 Mike Dawson
 Co-Founder  Director of Cloud Architecture
 Cloudapt LLC
 6330 East 75th Street, Suite 170
 Indianapolis, IN 46250

 On 8/1/2013 2:34 PM, Samuel Just wrote:

 Can you reproduce and attach the ceph.log from before you stop the osd
 until after you have started the osd and it has recovered?
 -Sam

 On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG
 s.pri...@profihost.ag wrote:

 Hi,

 i still have recovery issues with cuttlefish. After the OSD comes back
 it seem to hang for around 2-4 minutes and then recovery seems to start
 (pgs in recovery_wait start to decrement). This is with ceph 0.61.7. I
 get a lot of slow request messages an hanging VMs.

 What i noticed today is that if i leave the OSD off as long as ceph
 starts to backfill - the recovery and re backfilling wents absolutely
 smooth without any issues and no slow request messages at all.

 Does anybody have an idea why?

 Greets,
 Stefan