Re: [ceph-users] rebooting nodes in a ceph cluster

2013-12-21 Thread Mike Dawson
I think my wording was a bit misleading in my last message. Instead of 
"no re-balancing will happen", I should have said that no OSDs will be 
marked out of the cluster with the noout flag set.
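
To confirm the flag before starting work, a minimal sketch follows. It assumes `ceph osd dump` prints a line beginning with `flags` followed by a comma-separated list of cluster flags. `noout_is_set` is a hypothetical helper name, not part of Ceph.

```shell
# Reads `ceph osd dump` style output on stdin; succeeds only if noout is listed.
noout_is_set() {
    # Find the line whose first field is "flags", split its value on commas,
    # and look for an exact "noout" entry.
    awk '$1 == "flags" {
             n = split($2, f, ",")
             for (i = 1; i <= n; i++) if (f[i] == "noout") found = 1
         }
         END { exit !found }'
}
```

Possible usage on a host with admin access: `ceph osd dump | noout_is_set && echo "noout is set"`.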


- Mike

On 12/21/2013 2:06 PM, Mike Dawson wrote:

It is also useful to mention that you can set the noout flag when doing
maintenance of any length that needs to exceed the 'mon osd down out
interval'.

$ ceph osd set noout
** no re-balancing will happen **

$ ceph osd unset noout
** normal re-balancing rules will resume **


- Mike Dawson


On 12/19/2013 7:51 PM, Sage Weil wrote:

On Thu, 19 Dec 2013, John-Paul Robinson wrote:

What impact does rebooting nodes in a ceph cluster have on the health of
the ceph cluster?  Can it trigger rebalancing activities that then have
to be undone once the node comes back up?

I have a 4 node ceph cluster each node has 11 osds.  There is a single
pool with redundant storage.

If it takes 15 minutes for one of my servers to reboot is there a risk
that some sort of needless automatic processing will begin?


By default, we start rebalancing data after 5 minutes.  You can adjust
this (to, say, 15 minutes) with

  mon osd down out interval = 900

in ceph.conf.

sage



I'm assuming that the ceph cluster can go into a "not ok" state but that
in this particular configuration all the data is protected against the
single node failure and there is no place for the data to migrate to, so
nothing "bad" will happen.

Thanks for any feedback.

~jpr
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





Re: [ceph-users] rebooting nodes in a ceph cluster

2013-12-21 Thread Mike Dawson
It is also useful to mention that you can set the noout flag when doing
maintenance of any length that needs to exceed the 'mon osd down out
interval'.


$ ceph osd set noout
** no re-balancing will happen **

$ ceph osd unset noout
** normal re-balancing rules will resume **
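
The two commands can be wrapped around a maintenance step so the flag is always cleared afterwards. This is only a sketch: `with_noout` is a hypothetical helper name, and it assumes the wrapped command blocks until the node is back in service.

```shell
# Run a maintenance command with noout set, then clear the flag again.
with_noout() {
    # Set the flag first; give up if the cluster refuses it.
    ceph osd set noout || return 1
    # Run whatever maintenance step was passed in as arguments.
    "$@"
    status=$?
    # Clear the flag even if the maintenance step failed, so that
    # down OSDs can be marked out again as usual.
    ceph osd unset noout || return 1
    return $status
}
```

For example, `with_noout ./reboot-and-wait.sh node3`, where `reboot-and-wait.sh` is a placeholder for your own script.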


- Mike Dawson


On 12/19/2013 7:51 PM, Sage Weil wrote:

On Thu, 19 Dec 2013, John-Paul Robinson wrote:

What impact does rebooting nodes in a ceph cluster have on the health of
the ceph cluster?  Can it trigger rebalancing activities that then have
to be undone once the node comes back up?

I have a 4 node ceph cluster each node has 11 osds.  There is a single
pool with redundant storage.

If it takes 15 minutes for one of my servers to reboot is there a risk
that some sort of needless automatic processing will begin?


By default, we start rebalancing data after 5 minutes.  You can adjust
this (to, say, 15 minutes) with

  mon osd down out interval = 900

in ceph.conf.

sage



I'm assuming that the ceph cluster can go into a "not ok" state but that
in this particular configuration all the data is protected against the
single node failure and there is no place for the data to migrate to, so
nothing "bad" will happen.

Thanks for any feedback.

~jpr


Re: [ceph-users] rebooting nodes in a ceph cluster

2013-12-20 Thread Sage Weil
On Fri, 20 Dec 2013, Derek Yarnell wrote:
> On 12/19/13, 7:51 PM, Sage Weil wrote:
> >> If it takes 15 minutes for one of my servers to reboot is there a risk
> >> that some sort of needless automatic processing will begin?
> > 
> > By default, we start rebalancing data after 5 minutes.  You can adjust 
> > this (to, say, 15 minutes) with
> > 
> >  mon osd down out interval = 900
> > 
> > in ceph.conf.
> > 
> 
> Will Ceph detect if the OSDs come back while it is re-balancing and stop?

Yep!
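
To wait for the cluster to settle after the OSDs return, one option is to poll `ceph health`. This is a sketch, not an official procedure: `wait_for_health_ok` is a hypothetical helper name, and the default attempt count and delay are arbitrary assumptions.

```shell
# Poll `ceph health` until it reports HEALTH_OK or the attempts run out.
wait_for_health_ok() {
    attempts="${1:-60}"   # assumed default: 60 polls
    delay="${2:-10}"      # assumed default: 10 seconds between polls
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # The first word of the output is the overall health status.
        case "$(ceph health 2>/dev/null)" in
            HEALTH_OK*) return 0 ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    # Never reached HEALTH_OK within the allowed attempts.
    return 1
}
```

A cluster with flags such as noout still set typically reports HEALTH_WARN, so this check fits best after any such flag has been cleared.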

sage


Re: [ceph-users] rebooting nodes in a ceph cluster

2013-12-20 Thread Derek Yarnell
On 12/19/13, 7:51 PM, Sage Weil wrote:
>> If it takes 15 minutes for one of my servers to reboot is there a risk
>> that some sort of needless automatic processing will begin?
> 
> By default, we start rebalancing data after 5 minutes.  You can adjust 
> this (to, say, 15 minutes) with
> 
>  mon osd down out interval = 900
> 
> in ceph.conf.
> 

Will Ceph detect if the OSDs come back while it is re-balancing and stop?

-- 
Derek T. Yarnell
University of Maryland
Institute for Advanced Computer Studies


Re: [ceph-users] rebooting nodes in a ceph cluster

2013-12-20 Thread Simon Leinen
David Clarke writes:
> Not directly related to Ceph, but you may want to investigate kexec[0]
> ('kexec-tools' package in Debian derived distributions) in order to
> get your machines rebooting quicker.  It essentially re-loads the
> kernel as the last step of the shutdown procedure, skipping over the
> lengthy BIOS/UEFI/controller firmware etc boot stages.

> [0]: http://en.wikipedia.org/wiki/Kexec

I'd like to second that recommendation - I only discovered this
recently, and on systems with long BIOS initialization, this cuts down
the time to reboot *dramatically*, like from >5 to <1 minute.
-- 
Simon.



Re: [ceph-users] rebooting nodes in a ceph cluster

2013-12-19 Thread John-Paul Robinson (Campus)
So is it recommended to adjust the rebalance timeout to align with the time to
reboot individual nodes?

I didn't see this in my pass through the ops manual but maybe I'm not looking
in the right place.

Thanks,

~jpr

> On Dec 19, 2013, at 6:51 PM, "Sage Weil" wrote:
> 
>> On Thu, 19 Dec 2013, John-Paul Robinson wrote:
>> What impact does rebooting nodes in a ceph cluster have on the health of
>> the ceph cluster?  Can it trigger rebalancing activities that then have
>> to be undone once the node comes back up?
>> 
>> I have a 4 node ceph cluster each node has 11 osds.  There is a single
>> pool with redundant storage.
>> 
>> If it takes 15 minutes for one of my servers to reboot is there a risk
>> that some sort of needless automatic processing will begin?
> 
> By default, we start rebalancing data after 5 minutes.  You can adjust 
> this (to, say, 15 minutes) with
> 
> mon osd down out interval = 900
> 
> in ceph.conf.
> 
> sage
> 
>> 
>> I'm assuming that the ceph cluster can go into a "not ok" state but that
>> in this particular configuration all the data is protected against the
>> single node failure and there is no place for the data to migrate to, so
>> nothing "bad" will happen.
>> 
>> Thanks for any feedback.
>> 
>> ~jpr


Re: [ceph-users] rebooting nodes in a ceph cluster

2013-12-19 Thread David Clarke
On 20/12/13 13:51, Sage Weil wrote:
> On Thu, 19 Dec 2013, John-Paul Robinson wrote:
>> What impact does rebooting nodes in a ceph cluster have on the health of
>> the ceph cluster?  Can it trigger rebalancing activities that then have
>> to be undone once the node comes back up?
>>
>> I have a 4 node ceph cluster each node has 11 osds.  There is a single
>> pool with redundant storage.
>>
>> If it takes 15 minutes for one of my servers to reboot is there a risk
>> that some sort of needless automatic processing will begin?
> 
> By default, we start rebalancing data after 5 minutes.  You can adjust 
> this (to, say, 15 minutes) with
> 
>  mon osd down out interval = 900
> 
> in ceph.conf.
> 
> sage
> 
>>
>> I'm assuming that the ceph cluster can go into a "not ok" state but that
>> in this particular configuration all the data is protected against the
>> single node failure and there is no place for the data to migrate to, so
>> nothing "bad" will happen.
>>
>> Thanks for any feedback.

Not directly related to Ceph, but you may want to investigate kexec[0]
('kexec-tools' package in Debian derived distributions) in order to get
your machines rebooting quicker.  It essentially re-loads the kernel as
the last step of the shutdown procedure, skipping over the lengthy
BIOS/UEFI/controller firmware etc boot stages.
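
As a rough illustration of staging a kernel by hand, the sketch below prints a `kexec -l` command without running it. `kexec_load_command` is a hypothetical helper name, and the Debian-style `/boot` paths are an assumption; other distributions name these files differently.

```shell
# Print (but do not run) the command that stages a kernel for kexec.
kexec_load_command() {
    # Default to the running kernel's version if none is given.
    kver="${1:-$(uname -r)}"
    # Assumed Debian-style file names under /boot.
    kernel="/boot/vmlinuz-$kver"
    initrd="/boot/initrd.img-$kver"
    # --reuse-cmdline passes the running kernel's command line to the new one.
    printf 'kexec -l %s --initrd=%s --reuse-cmdline\n' "$kernel" "$initrd"
}
```

The jump into the staged kernel normally happens at the end of the regular shutdown sequence, so services still stop cleanly. `kexec -e` jumps immediately and skips that shutdown.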

[0]: http://en.wikipedia.org/wiki/Kexec


-- 
David Clarke
Systems Architect
Catalyst IT


Re: [ceph-users] rebooting nodes in a ceph cluster

2013-12-19 Thread Sage Weil
On Thu, 19 Dec 2013, John-Paul Robinson wrote:
> What impact does rebooting nodes in a ceph cluster have on the health of
> the ceph cluster?  Can it trigger rebalancing activities that then have
> to be undone once the node comes back up?
> 
> I have a 4 node ceph cluster each node has 11 osds.  There is a single
> pool with redundant storage.
> 
> If it takes 15 minutes for one of my servers to reboot is there a risk
> that some sort of needless automatic processing will begin?

By default, we start rebalancing data after 5 minutes.  You can adjust 
this (to, say, 15 minutes) with

 mon osd down out interval = 900

in ceph.conf.
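
The value is given in seconds, so 15 minutes is 15 * 60 = 900. The sketch below prints a matching stanza for a timeout given in minutes. `down_out_stanza` is a hypothetical helper name, and placing the option under `[mon]` is an assumption; the advice above only says it goes in ceph.conf.

```shell
# Print a ceph.conf stanza setting the down-out interval from a value in minutes.
down_out_stanza() {
    minutes="$1"
    # Convert minutes to the seconds the option expects.
    seconds=$((minutes * 60))
    printf '[mon]\n    mon osd down out interval = %s\n' "$seconds"
}
```

For example, `down_out_stanza 15` prints a `[mon]` header followed by `mon osd down out interval = 900`.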

sage

> 
> I'm assuming that the ceph cluster can go into a "not ok" state but that
> in this particular configuration all the data is protected against the
> single node failure and there is no place for the data to migrate to, so
> nothing "bad" will happen.
> 
> Thanks for any feedback.
> 
> ~jpr


[ceph-users] rebooting nodes in a ceph cluster

2013-12-19 Thread John-Paul Robinson
What impact does rebooting nodes in a ceph cluster have on the health of
the ceph cluster?  Can it trigger rebalancing activities that then have
to be undone once the node comes back up?

I have a 4-node ceph cluster; each node has 11 OSDs.  There is a single
pool with redundant storage.

If it takes 15 minutes for one of my servers to reboot, is there a risk
that some sort of needless automatic processing will begin?

I'm assuming that the ceph cluster can go into a "not ok" state but that
in this particular configuration all the data is protected against the
single node failure and there is no place for the data to migrate to, so
nothing "bad" will happen.

Thanks for any feedback.

~jpr