Re: Adding a delay when restarting all OSDs on a host
Default stack size shouldn't matter. At least it's not an issue on a kernel
with over-commit turned on (the default). Most threads/apps never use that
many stack frames (in fact they use a fraction of them), so the kernel
doesn't bother allocating the pages. My bet is on some other resource.

On 7/23/14, 3:22 PM, Vit Yenukas wrote:
> Just a fun fact pertaining to resource consumption during the startup
> sequence: we ran out of memory on a 72-disk server with 256GB RAM during
> startup. ceph-osd dies with 'can not fork' and dumps core. There were in
> excess of 40 thousand threads when this began to happen. With the default
> thread stack size being 8MB, no wonder :)
>
> Note that this was in an experimental setup with just one node, so all
> OSD peering happens on the same host.
>
> Just for the heck of it, I reduced the number of OSDs by a factor of two
> (to 36 OSDs) by setting up a soft RAID-0 for each disk pair. This worked
> after some tweaking of the udev rules (to ignore 'md' block devs). I'm
> not sure if we're going to see the same problem with the real cluster
> (18 such 72-disk nodes), with EC 9-3. Also, I'm not sure if reducing the
> user process stack to 4MB would be a good idea.
>
> On 07/22/2014 08:08 PM, Gregory Farnum wrote:
>
>> On Tue, Jul 22, 2014 at 6:19 AM, Wido den Hollander wrote:
>>> Hi,
>>>
>>> Currently on Ubuntu with Upstart when you invoke a restart like this:
>>>
>>> $ sudo restart ceph-osd-all
>>>
>>> It will restart all OSDs at once, which can increase the load on the
>>> system quite a bit.
>>>
>>> It's better to restart the OSDs one by one:
>>>
>>> $ sudo restart ceph-osd id=X
>>>
>>> But you then have to figure out all the IDs by doing a find in
>>> /var/lib/ceph/osd, and that's more manual work.
>>>
>>> I'm thinking of patching the init scripts to allow something like this:
>>>
>>> $ sudo restart ceph-osd-all delay=180
>>>
>>> It then waits 180 seconds between each OSD restart, making the process
>>> even smoother.
>>>
>>> I know there are currently sysvinit, upstart and systemd scripts, so it
>>> has to be implemented in various places, but how does the general idea
>>> sound?
>> That sounds like a good idea to me. I presume you mean to actually delay
>> the restarts, not just turn them on, so that the daemons all remain
>> alive (that's what it sounds like to me here, just wanted to clarify).
>> -Greg

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
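For context, the back-of-envelope arithmetic behind the "no wonder" above can be sketched as follows. This is a rough estimate only; the thread counts and the 8MB default are the numbers reported in the thread, not measurements of any particular system.

```shell
#!/bin/sh
# Back-of-envelope check on the thread-stack numbers from this thread:
# ~40,000 threads, each with the common 8 MB default stack size
# (`ulimit -s` usually reports 8192 KB). With over-commit enabled this
# is only *reserved* virtual address space; pages are faulted in on use.
THREADS=40000
STACK_MB=8
RESERVED_GB=$((THREADS * STACK_MB / 1024))
echo "reserved: ${RESERVED_GB} GB of virtual stack space"
# -> reserved: 312 GB of virtual stack space
```

So the reservation alone (~312 GB) only exceeds the 256 GB of RAM if over-commit is restricted (e.g. vm.overcommit_memory=2) or the stacks are actually touched, which is consistent with the "some other resource" hypothesis above.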
Re: Adding a delay when restarting all OSDs on a host
Just a fun fact pertaining to resource consumption during the startup
sequence: we ran out of memory on a 72-disk server with 256GB RAM during
startup. ceph-osd dies with 'can not fork' and dumps core. There were in
excess of 40 thousand threads when this began to happen. With the default
thread stack size being 8MB, no wonder :)

Note that this was in an experimental setup with just one node, so all OSD
peering happens on the same host.

Just for the heck of it, I reduced the number of OSDs by a factor of two
(to 36 OSDs) by setting up a soft RAID-0 for each disk pair. This worked
after some tweaking of the udev rules (to ignore 'md' block devs). I'm not
sure if we're going to see the same problem with the real cluster (18 such
72-disk nodes), with EC 9-3. Also, I'm not sure if reducing the user
process stack to 4MB would be a good idea.

On 07/22/2014 08:08 PM, Gregory Farnum wrote:
> On Tue, Jul 22, 2014 at 6:19 AM, Wido den Hollander wrote:
>> Hi,
>>
>> Currently on Ubuntu with Upstart when you invoke a restart like this:
>>
>> $ sudo restart ceph-osd-all
>>
>> It will restart all OSDs at once, which can increase the load on the
>> system quite a bit.
>>
>> It's better to restart the OSDs one by one:
>>
>> $ sudo restart ceph-osd id=X
>>
>> But you then have to figure out all the IDs by doing a find in
>> /var/lib/ceph/osd, and that's more manual work.
>>
>> I'm thinking of patching the init scripts to allow something like this:
>>
>> $ sudo restart ceph-osd-all delay=180
>>
>> It then waits 180 seconds between each OSD restart, making the process
>> even smoother.
>>
>> I know there are currently sysvinit, upstart and systemd scripts, so it
>> has to be implemented in various places, but how does the general idea
>> sound?
>
> That sounds like a good idea to me. I presume you mean to actually delay
> the restarts, not just turn them on, so that the daemons all remain alive
> (that's what it sounds like to me here, just wanted to clarify).
> -Greg
Re: Adding a delay when restarting all OSDs on a host
On Tue, Jul 22, 2014 at 6:19 AM, Wido den Hollander wrote:
> Hi,
>
> Currently on Ubuntu with Upstart when you invoke a restart like this:
>
> $ sudo restart ceph-osd-all
>
> It will restart all OSDs at once, which can increase the load on the
> system quite a bit.
>
> It's better to restart the OSDs one by one:
>
> $ sudo restart ceph-osd id=X
>
> But you then have to figure out all the IDs by doing a find in
> /var/lib/ceph/osd, and that's more manual work.
>
> I'm thinking of patching the init scripts to allow something like this:
>
> $ sudo restart ceph-osd-all delay=180
>
> It then waits 180 seconds between each OSD restart, making the process
> even smoother.
>
> I know there are currently sysvinit, upstart and systemd scripts, so it
> has to be implemented in various places, but how does the general idea
> sound?

That sounds like a good idea to me. I presume you mean to actually delay
the restarts, not just turn them on, so that the daemons all remain alive
(that's what it sounds like to me here, just wanted to clarify).
-Greg
Re: Adding a delay when restarting all OSDs on a host
On Tue, Jul 22, 2014 at 6:28 PM, Wido den Hollander wrote:
> On 07/22/2014 03:48 PM, Andrey Korolyov wrote:
>> On Tue, Jul 22, 2014 at 5:19 PM, Wido den Hollander wrote:
>>> Hi,
>>>
>>> Currently on Ubuntu with Upstart when you invoke a restart like this:
>>>
>>> $ sudo restart ceph-osd-all
>>>
>>> It will restart all OSDs at once, which can increase the load on the
>>> system quite a bit.
>>>
>>> It's better to restart the OSDs one by one:
>>>
>>> $ sudo restart ceph-osd id=X
>>>
>>> But you then have to figure out all the IDs by doing a find in
>>> /var/lib/ceph/osd, and that's more manual work.
>>>
>>> I'm thinking of patching the init scripts to allow something like this:
>>>
>>> $ sudo restart ceph-osd-all delay=180
>>>
>>> It then waits 180 seconds between each OSD restart, making the process
>>> even smoother.
>>>
>>> I know there are currently sysvinit, upstart and systemd scripts, so it
>>> has to be implemented in various places, but how does the general idea
>>> sound?
>>>
>>> --
>>> Wido den Hollander
>>> Ceph consultant and trainer
>>> 42on B.V.
>>>
>>> Phone: +31 (0)20 700 9902
>>> Skype: contact42on
>>
>> Hi,
>>
>> this behaviour obviously has the downside of increased overall peering
>> time and a larger integral amount of out-of-SLA delay. I'd vote for
>> warming up the necessary files, most likely the collections, just
>> before restart. If there isn't enough room to hold all of them at once,
>> we can probably combine both methods to achieve a lower impact on
>> restart, although adding a simple delay sounds much more
>> straightforward than pulling the file cache into RAM.
>
> In the case I'm talking about there are 23 OSDs running on a single
> machine, and restarting all the OSDs causes a lot of peering and reading
> of PG logs.
>
> A warm-up mechanism might work, but that would be a lot of work.
>
> When upgrading your cluster you simply want to do this:
>
> $ dsh -g ceph-osd "sudo restart ceph-osd-all delay=180"
>
> That might take hours to complete, but if it's just an upgrade that
> doesn't matter. You want as little impact on service as possible.

I'd suggest measuring the impact with vmtouch [0]; it greatly decreased OSD
startup time in my tests, but I hit the same resource exhaustion as before
once the OSD marked itself up (primarily the IOPS ceiling).

0. http://hoytech.com/vmtouch/
Re: Adding a delay when restarting all OSDs on a host
On 07/22/2014 03:48 PM, Andrey Korolyov wrote:
> On Tue, Jul 22, 2014 at 5:19 PM, Wido den Hollander wrote:
>> Hi,
>>
>> Currently on Ubuntu with Upstart when you invoke a restart like this:
>>
>> $ sudo restart ceph-osd-all
>>
>> It will restart all OSDs at once, which can increase the load on the
>> system quite a bit.
>>
>> It's better to restart the OSDs one by one:
>>
>> $ sudo restart ceph-osd id=X
>>
>> But you then have to figure out all the IDs by doing a find in
>> /var/lib/ceph/osd, and that's more manual work.
>>
>> I'm thinking of patching the init scripts to allow something like this:
>>
>> $ sudo restart ceph-osd-all delay=180
>>
>> It then waits 180 seconds between each OSD restart, making the process
>> even smoother.
>>
>> I know there are currently sysvinit, upstart and systemd scripts, so it
>> has to be implemented in various places, but how does the general idea
>> sound?
>>
>> --
>> Wido den Hollander
>> Ceph consultant and trainer
>> 42on B.V.
>>
>> Phone: +31 (0)20 700 9902
>> Skype: contact42on
>
> Hi,
>
> this behaviour obviously has the downside of increased overall peering
> time and a larger integral amount of out-of-SLA delay. I'd vote for
> warming up the necessary files, most likely the collections, just before
> restart. If there isn't enough room to hold all of them at once, we can
> probably combine both methods to achieve a lower impact on restart,
> although adding a simple delay sounds much more straightforward than
> pulling the file cache into RAM.

In the case I'm talking about there are 23 OSDs running on a single
machine, and restarting all the OSDs causes a lot of peering and reading of
PG logs.

A warm-up mechanism might work, but that would be a lot of work.

When upgrading your cluster you simply want to do this:

$ dsh -g ceph-osd "sudo restart ceph-osd-all delay=180"

That might take hours to complete, but if it's just an upgrade that doesn't
matter. You want as little impact on service as possible.

--
Wido den Hollander
Ceph consultant and trainer
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
Re: Adding a delay when restarting all OSDs on a host
On Tue, Jul 22, 2014 at 5:19 PM, Wido den Hollander wrote:
> Hi,
>
> Currently on Ubuntu with Upstart when you invoke a restart like this:
>
> $ sudo restart ceph-osd-all
>
> It will restart all OSDs at once, which can increase the load on the
> system quite a bit.
>
> It's better to restart the OSDs one by one:
>
> $ sudo restart ceph-osd id=X
>
> But you then have to figure out all the IDs by doing a find in
> /var/lib/ceph/osd, and that's more manual work.
>
> I'm thinking of patching the init scripts to allow something like this:
>
> $ sudo restart ceph-osd-all delay=180
>
> It then waits 180 seconds between each OSD restart, making the process
> even smoother.
>
> I know there are currently sysvinit, upstart and systemd scripts, so it
> has to be implemented in various places, but how does the general idea
> sound?
>
> --
> Wido den Hollander
> Ceph consultant and trainer
> 42on B.V.
>
> Phone: +31 (0)20 700 9902
> Skype: contact42on

Hi,

this behaviour obviously has the downside of increased overall peering time
and a larger integral amount of out-of-SLA delay. I'd vote for warming up
the necessary files, most likely the collections, just before restart. If
there isn't enough room to hold all of them at once, we can probably
combine both methods to achieve a lower impact on restart, although adding
a simple delay sounds much more straightforward than pulling the file cache
into RAM.
Adding a delay when restarting all OSDs on a host
Hi,

Currently on Ubuntu with Upstart when you invoke a restart like this:

$ sudo restart ceph-osd-all

It will restart all OSDs at once, which can increase the load on the system
quite a bit.

It's better to restart the OSDs one by one:

$ sudo restart ceph-osd id=X

But you then have to figure out all the IDs by doing a find in
/var/lib/ceph/osd, and that's more manual work.

I'm thinking of patching the init scripts to allow something like this:

$ sudo restart ceph-osd-all delay=180

It then waits 180 seconds between each OSD restart, making the process even
smoother.

I know there are currently sysvinit, upstart and systemd scripts, so it has
to be implemented in various places, but how does the general idea sound?

--
Wido den Hollander
Ceph consultant and trainer
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
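Pending patches to the actual init scripts, the proposed behaviour can be approximated with a small wrapper that discovers the OSD IDs under /var/lib/ceph/osd and restarts them one at a time. This is only a sketch of the idea, not the eventual sysvinit/upstart/systemd implementation:

```shell
#!/bin/sh
# Restart every OSD on this host one by one, sleeping DELAY seconds
# between restarts (the behaviour proposed for 'restart ceph-osd-all
# delay=N'). Assumes the usual /var/lib/ceph/osd/ceph-<id> layout and
# the per-instance Upstart job 'ceph-osd id=<id>'.
DELAY=${1:-180}
for dir in /var/lib/ceph/osd/ceph-*; do
    [ -d "$dir" ] || continue
    id=${dir##*-}                 # e.g. .../ceph-12 -> 12
    sudo restart ceph-osd id="$id"
    sleep "$DELAY"
done
```

Doing the same patch in the init scripts themselves keeps the one-command interface, but the loop body is essentially what all three implementations would share.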