Re: maintanance on osd host
I have it documented here:

http://ceph.com/docs/master/rados/operations/troubleshooting-osd/#stopping-w-out-rebalancing

Let me know if this works for you.

On Thu, Feb 28, 2013 at 8:14 AM, Gregory Farnum g...@inktank.com wrote:
> On Tue, Feb 26, 2013 at 11:37 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote:
>> Sorry, that's it: I misread down / out.
>>
>> Wouldn't it make sense to mark the osd down automatically when shutting
>> down via the init script? It doesn't seem to make sense to rely on
>> automatic detection when somebody uses the init script.
>
> Yes, yes it would. http://tracker.ceph.com/issues/4267 :)
> -Greg

--
John Wilkins
Senior Technical Writer
Inktank
john.wilk...@inktank.com
(415) 425-9599
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: maintanance on osd host
Hi,

> I have it documented here:
> http://ceph.com/docs/master/rados/operations/troubleshooting-osd/#stopping-w-out-rebalancing

That looks wrong to me. AFAIU it should be 'noout'. You want it marked down ASAP.

Cheers,
Sylvain
Re: maintanance on osd host
On Tue, Feb 26, 2013 at 11:37 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote:
> Sorry, that's it: I misread down / out.
>
> Wouldn't it make sense to mark the osd down automatically when shutting
> down via the init script? It doesn't seem to make sense to rely on
> automatic detection when somebody uses the init script.

Yes, yes it would. http://tracker.ceph.com/issues/4267 :)
-Greg
maintanance on osd host
Hi list,

how can I do short maintenance, like a kernel upgrade, on an osd host? Right now ceph starts to backfill immediately if I say:

    ceph osd out 41
    ...

Without the ceph osd out command, all clients hang for as long as ceph does not know that the host was rebooted.

I tried ceph osd set nodown and ceph osd set noout, but this doesn't make any difference.

Stefan
Re: maintanance on osd host
On Tue, Feb 26, 2013 at 6:56 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote:
> how can I do short maintenance, like a kernel upgrade, on an osd host?
> Right now ceph starts to backfill immediately if I say:
>
>     ceph osd out 41
>     ...
>
> Without the ceph osd out command, all clients hang for as long as ceph
> does not know that the host was rebooted.
>
> I tried ceph osd set nodown and ceph osd set noout, but this doesn't
> make any difference.

Hi Stefan,

in my experience nodown will certainly freeze all I/O until the OSD returns. Killing the osd process and setting ``mon osd down out interval'' large enough will do the trick: you'll get only two small freezes during peering, one at the start and one at the end.

It is also very strange that your clients hang for a long time. I have set non-optimal values on purpose and was not able to observe a re-peering process longer than a minute.
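The ``mon osd down out interval'' setting mentioned above could be raised in ceph.conf for the duration of the maintenance. A minimal sketch, assuming the option is placed in the [mon] section; the value of 900 seconds is purely illustrative and does not come from this thread.

```ini
[mon]
    ; Seconds a down OSD may stay "in" before the monitors mark it out.
    ; 900 is an assumed example value, not a recommendation from the thread.
    mon osd down out interval = 900
```

Whatever value is chosen has to be longer than the expected reboot time; otherwise the OSDs are marked out and backfill starts anyway.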
Re: maintanance on osd host
On Tue, 26 Feb 2013, Stefan Priebe - Profihost AG wrote:
> how can I do short maintenance, like a kernel upgrade, on an osd host?
> Right now ceph starts to backfill immediately if I say:
>
>     ceph osd out 41
>     ...
>
> Without the ceph osd out command, all clients hang for as long as ceph
> does not know that the host was rebooted.
>
> I tried ceph osd set nodown and ceph osd set noout, but this doesn't
> make any difference.

For a temporary event like this, you want the osd to be down (so that io can continue with the remaining replicas) but NOT marked out (so that data doesn't get rebalanced). The simplest way to do that is

    ceph osd set noout
    killall ceph-osd
    .. reboot ..

Just remember to do

    ceph osd unset noout

when you are done so that future osds that fail will get marked out on their own after the 5 minute (default) interval.

sage
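The noout procedure above can be wrapped in two small shell functions, one to run before the reboot and one after. This is only a sketch: the function names, the `run` helper and the `DRY_RUN` variable are hypothetical, and the script defaults to a dry run that prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the noout maintenance flow from the message above.
# DRY_RUN, run, begin_maintenance and end_maintenance are illustrative names.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Before the reboot: keep down OSDs from being marked out, then stop the daemons.
begin_maintenance() {
    run ceph osd set noout
    run killall ceph-osd
}

# After the host is back and its OSDs have rejoined: restore normal behaviour.
end_maintenance() {
    run ceph osd unset noout
}
```

With `DRY_RUN=0` set, `begin_maintenance` would be called on the OSD host before the reboot and `end_maintenance` once the OSDs are up again, so that the noout flag is not left set by accident.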
Re: maintanance on osd host
But that results in a 1-3s hiccup for all KVM VMs. This is not what I want.

Stefan

On 26.02.2013 at 18:06, Sage Weil s...@inktank.com wrote:
> For a temporary event like this, you want the osd to be down (so that io
> can continue with the remaining replicas) but NOT marked out (so that
> data doesn't get rebalanced). The simplest way to do that is
>
>     ceph osd set noout
>     killall ceph-osd
>     .. reboot ..
>
> Just remember to do
>
>     ceph osd unset noout
>
> when you are done so that future osds that fail will get marked out on
> their own after the 5 minute (default) interval.
>
> sage
Re: maintanance on osd host
On Tue, 26 Feb 2013, Stefan Priebe - Profihost AG wrote:
> But that results in a 1-3s hiccup for all KVM VMs. This is not what I want.

You can do

    kill $pid
    ceph osd down $osdid

(or even reverse the order, if the sequence is quick enough) to avoid waiting for the failure detection delay. But if the OSDs are going down, then the peering has to happen one way or another.

sage
Re: maintanance on osd host
Hi Sage,

On 26.02.2013 at 18:24, Sage Weil wrote:
> You can do
>
>     kill $pid
>     ceph osd down $osdid
>
> (or even reverse the order, if the sequence is quick enough) to avoid
> waiting for the failure detection delay. But if the OSDs are going down,
> then the peering has to happen one way or another.

But exactly this results in backfill starting immediately. My idea was to first mark the osd down so that the mon knows about it and no I/O is stalled, and then reboot the whole host. But this does not work as expected, because backfilling starts immediately after setting the osd to down ;-(

Greets,
Stefan
Re: maintanance on osd host
On Tue, 26 Feb 2013, Stefan Priebe wrote:
> But exactly this results in backfill starting immediately. My idea was to
> first mark the osd down so that the mon knows about it and no I/O is
> stalled, and then reboot the whole host. But this does not work as
> expected, because backfilling starts immediately after setting the osd
> to down ;-(

Backfilling should not happen on down, unless you have reconfigured 'mon osd down out interval = 0' or something along those lines. Setting the 'noout' flag will also prevent the osds from being marked out.

As for limiting the IO stall: you could also do 'ceph osd set noup', then mark them down, then kill the daemon, and you won't have to worry about racing with the daemon marking itself back up (as it normally does).

sage
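The noup / down / kill sequence described above could be scripted roughly as follows. This is a sketch under assumptions: the function names, the `run` helper and `DRY_RUN` are hypothetical, the OSD id and daemon PID are supplied by the caller, and setting noout first is carried over from the earlier advice in this thread. The script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the noup / down / kill sequence; dry run by default.
# DRY_RUN, run, stop_osd_fast and finish_osd_maintenance are illustrative names.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# stop_osd_fast OSD_ID PID: mark one OSD down right away, then stop its daemon.
stop_osd_fast() {
    osd_id="$1"
    pid="$2"
    run ceph osd set noout        # no rebalancing while the OSD is down
    run ceph osd set noup         # the daemon cannot mark itself back up
    run ceph osd down "$osd_id"   # tell the monitors immediately
    run kill "$pid"               # stop the daemon
}

# After maintenance: clear both flags so OSDs can rejoin and failures are handled normally.
finish_osd_maintenance() {
    run ceph osd unset noup
    run ceph osd unset noout
}
```

Note that noup has to be cleared before the restarted OSDs can be marked up again, so `finish_osd_maintenance` (or at least the `unset noup` step) belongs right after the reboot, not at some later cleanup time.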
Re: maintanance on osd host
On Tue, Feb 26, 2013 at 11:44 AM, Stefan Priebe s.pri...@profihost.ag wrote:
> But exactly this results in backfill starting immediately. My idea was to
> first mark the osd down so that the mon knows about it and no I/O is
> stalled, and then reboot the whole host. But this does not work as
> expected, because backfilling starts immediately after setting the osd
> to down ;-(

out and down are quite different. Are you sure you tried down and not out? (You reference out in your first email, rather than down.)
-Greg
Re: maintanance on osd host
Hi Greg, Hi Sage,

On 26.02.2013 at 21:27, Gregory Farnum wrote:
> out and down are quite different. Are you sure you tried down and not
> out? (You reference out in your first email, rather than down.)
> -Greg

Sorry, that's it: I misread down / out.

Wouldn't it make sense to mark the osd down automatically when shutting down via the init script? It doesn't seem to make sense to rely on automatic detection when somebody uses the init script.

Stefan