[ceph-users] Performance and CPU load on HP servers running ceph (DL380 G6, should apply to others too)

2015-05-25 Thread Tuomas Juntunen
Hi,

I wanted to share my findings from running Ceph on HP servers.

We had a lot of problems with CPU load, which sometimes went as high as
800. We were trying to figure out why this happened even when the cluster
was not doing anything special.

Our OSD nodes are DL380 G6 servers with dual quad-core CPUs and 32 GB of
memory.

The solution that worked for us was to set the following in the BIOS:

HP Power Profile Mode: Maximum Performance
Power Regulator Mode: Static High Performance
Intel® Turbo Boost Technology: Disabled

With these settings our load never goes over 20, and there are no "hangs"
in writes or reads at any time.

If anyone else has experience with these settings, I would appreciate
hearing about your findings. I assume Turbo Boost is the biggest factor
here: when the CPU frequency is adjusted, the CPUs "hang" for a while to
make the adjustment, and when that happens often, it creates this high
load.

Br,
Tuomas

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance and CPU load on HP servers running ceph (DL380 G6, should apply to others too)

2015-05-26 Thread Jan Schermer
Turbo Boost will not hurt performance. Unless you have 100% load on all cores,
it will actually improve performance (vastly so for bursty workloads).
The issue you have could be related to CPU cores going into sleep states.

Put "intel_idle.max_cstate=3" on the kernel command line (I ran with =2, but I
think that disables Turbo Boost) and see what happens. You can find the
C-state counters in /sys.
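
To look at those counters, something like the following Python sketch could be used. It assumes the standard Linux cpuidle layout, /sys/devices/system/cpu/cpuN/cpuidle/stateM/ with the files name, usage and time; the function name read_cstate_counters is made up for illustration, and the directory is a parameter so the function can be pointed at a copy of the tree.

```python
from pathlib import Path


def read_cstate_counters(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu: {state_name: (usage, time_us)}} from the cpuidle sysfs tree.

    usage is how many times the idle state was entered, time_us is the
    total time spent in it, in microseconds.
    """
    counters = {}
    # "cpu[0-9]*" matches cpu0, cpu1, ... but not the cpuidle/cpufreq dirs.
    for cpu_dir in sorted(Path(cpu_root).glob("cpu[0-9]*")):
        states = {}
        for state_dir in sorted(cpu_dir.glob("cpuidle/state[0-9]*")):
            name = (state_dir / "name").read_text().strip()
            usage = int((state_dir / "usage").read_text())
            time_us = int((state_dir / "time").read_text())
            states[name] = (usage, time_us)
        if states:
            counters[cpu_dir.name] = states
    return counters
```

Calling read_cstate_counters() twice, a few seconds apart, and comparing the usage values shows which states the cores actually enter under load. After changing max_cstate and rebooting, the deeper states should no longer appear in the result.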

BIOS settings are usually worthless here: even when the BIOS says “max
performance”, the OS is still free to take control to some extent. The kernel
usually does the right thing with the right settings if it can see everything,
while the BIOS couldn’t care less what happens in your OS…

Jan




Re: [ceph-users] Performance and CPU load on HP servers running ceph (DL380 G6, should apply to others too)

2015-05-26 Thread Lionel Bouton
On 05/26/15 10:06, Jan Schermer wrote:
> Turbo Boost will not hurt performance. Unless you have 100% load on
> all cores it will actually improve performance (vastly, in terms of
> bursty workloads).
> The issue you have could be related to CPU cores going to sleep mode.

Another possibility is that the system is overheating when Turbo Boost
is enabled. In this case it protects itself by throttling the core
frequencies back to a very low value (it may use other means too, like
lowering the system bus frequencies, halting the cores periodically,
...). This would explain the high loads.
If the system switches back and forth between normal loads and huge
loads and you can link that to CPU package temperature (and/or very low
CPU core frequencies), this is probably the cause. If the ambient
temperature isn't a problem (below 25°C any system should be fine and
most can tolerate 30°C or more), then you have an internal cooling problem.
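
One way to check for this is to compare the kernel's thermal throttle counters with the current core frequencies. The Python sketch below assumes an Intel x86 Linux system that exposes thermal_throttle/core_throttle_count, thermal_throttle/package_throttle_count and cpufreq/scaling_cur_freq under /sys/devices/system/cpu/cpuN/; not every kernel or platform provides all three, so missing files are reported as None. The function names are invented for this example.

```python
from pathlib import Path


def _read_int(path):
    """Return the integer stored in a sysfs file, or None if unavailable."""
    try:
        return int(path.read_text())
    except (OSError, ValueError):
        return None


def read_throttle_status(cpu_root="/sys/devices/system/cpu"):
    """Snapshot throttle counters and current frequency (kHz) for every CPU."""
    status = {}
    for cpu_dir in Path(cpu_root).glob("cpu[0-9]*"):
        throttle = cpu_dir / "thermal_throttle"
        status[cpu_dir.name] = {
            "core_throttle_count": _read_int(throttle / "core_throttle_count"),
            "package_throttle_count": _read_int(throttle / "package_throttle_count"),
            "cur_freq_khz": _read_int(cpu_dir / "cpufreq" / "scaling_cur_freq"),
        }
    return status


def throttled_cpus(before, after):
    """Names of CPUs whose core throttle counter rose between two snapshots."""
    rose = []
    for cpu, now in after.items():
        old = before.get(cpu, {}).get("core_throttle_count")
        new = now.get("core_throttle_count")
        if old is not None and new is not None and new > old:
            rose.append(cpu)
    return sorted(rose)
```

Taking one snapshot while the load is normal and another during a load spike, then passing both to throttled_cpus(), shows whether the spike coincides with thermal throttling. Very low cur_freq_khz values in the second snapshot point in the same direction.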

Best regards,

Lionel


Re: [ceph-users] Performance and CPU load on HP servers running ceph (DL380 G6, should apply to others too)

2015-05-26 Thread Robert LeBlanc
In my experience with HP hardware, it was set to Econo mode in the
BIOS, which is just plain junk. It will halt cores without regard to
workload to save energy.

We found that by setting the power mode to "OS controlled" we got
almost the same performance as with the "Max performance" setting,
while consuming about the same power as in Econo mode. The kernel is
much better at putting cores to sleep, making sure there is adequate
reserve capacity, and making sure that cores are woken up faster when
load is increasing. By selecting the "Econo" or "Max performance"
profile, the kernel loses control of the sleep states. My
recommendation is, before tweaking any other setting, to change the
performance profile to "OS controlled" and then go from there.
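
After switching the profile, a quick check from the OS side is to see whether the kernel has a frequency-scaling driver bound to each CPU. The Python sketch below reads the standard cpufreq files scaling_driver and scaling_governor. It rests on an assumption that is not from this thread: that with a firmware-controlled profile the kernel may expose no cpufreq directory at all, which the sketch reports as None. The function name cpufreq_summary is hypothetical.

```python
from pathlib import Path


def cpufreq_summary(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU to (scaling_driver, scaling_governor).

    A CPU maps to None when its cpufreq files cannot be read, which is
    assumed here to mean the OS is not managing its frequency.
    """
    summary = {}
    for cpu_dir in Path(cpu_root).glob("cpu[0-9]*"):
        cpufreq = cpu_dir / "cpufreq"
        try:
            driver = (cpufreq / "scaling_driver").read_text().strip()
            governor = (cpufreq / "scaling_governor").read_text().strip()
        except OSError:
            summary[cpu_dir.name] = None
        else:
            summary[cpu_dir.name] = (driver, governor)
    return summary
```

If every CPU shows a driver and governor after the BIOS change, the kernel is at least in charge of frequency scaling; whether it also controls the sleep states can be checked separately through the cpuidle entries in /sys.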

Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1




Re: [ceph-users] Performance and CPU load on HP servers running ceph (DL380 G6, should apply to others too)

2015-05-26 Thread Jan Schermer
It should be noted that not all power saving is bad: you can save a lot of
power by enabling some sleep states, throttling down, idling, or enabling
low-voltage mode on memory, with zero performance impact. In the end you can
end up with more performance because of a higher Turbo Boost TDP reserve and
less thermal throttling (which you should never see in a well-cooled
datacentre, but it pops up as an issue from time to time).
Enabling something like “Performance mode” in the BIOS usually just means “act
predictably”, which doesn’t imply it’s performing optimally at all. This is
where you’ll find the most difference between vendors and how they tune their
default settings. Lots of corporates don’t care and just switch to
“Performance” everywhere and trust the vendor to do the right thing, which is
seldom the case when you’re on a budget :)

Jan




Re: [ceph-users] Performance and CPU load on HP servers running ceph (DL380 G6, should apply to others too)

2015-05-27 Thread Tuomas Juntunen
Hi

Thanks for your comments.

I'll indeed switch to the "OS controlled" setting when we get our replacement
CPUs, and try what you described here.

If there isn't any guide for this yet, should there be?

Br,
Tuomas
