Re: [SLUG] High System CPU usage and finding culprit

2012-05-08 Thread Jason Ball
My solution appears to have fixed the problems I was seeing; it has held for
a little over a week now. As such, I have reported it to CentOS for fixing:

http://bugs.centos.org/view.php?id=5716

Cheers
J.


On Wed, May 9, 2012 at 10:28 AM, Jason Ball  wrote:

> I've seen a similar problem that took several weeks to identify.
>
> There is an issue with transparent hugepage support (aka memory defrag)
> where it causes processes on a server to stall, along with a number of other
> weird symptoms. I actually suspected dodgy drivers for one of my RAID
> controllers before I found the cause.
>
> The solution (in my case) was to disable this facility:
>
>
> echo no > /sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag
> echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
>
> It's worth noting that this setting appears to be disabled by default on
> RHEL 6 installations, but enabled on CentOS 6.
>
>
> Cheers
> Jason.
>
>
>
> On Wed, May 9, 2012 at 10:15 AM, Michael Fox  wrote:
>
>> In which case, enable sysstat on CentOS if it isn't already. It will help
>> moving forward.
>>
>> On Wed, May 9, 2012 at 10:11 AM, Grant Street  wrote:
>>
>> > 16 core, 12-24G memory running centos 6.1
>> >
>> >
>> > On 09/05/12 10:08, David Lyon wrote:
>> >
>> >> Are they dual core ?
>> >>
>> >> Do they have a sheetload of memory ?
>> >>
>> >> I found ubuntu got slower and slower till I got in
>> >> newer hardware.
>> >>


-- 
--
Teach your kids Science, or somebody else will :/

ja...@ball.net
vk2...@google.com 
callsign: vk2vjb
-- 
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html


Re: [SLUG] High System CPU usage and finding culprit

2012-05-08 Thread Jason Ball
I've seen a similar problem that took several weeks to identify.

There is an issue with transparent hugepage support (aka memory defrag)
where it causes processes on a server to stall, along with a number of other
weird symptoms. I actually suspected dodgy drivers for one of my RAID
controllers before I found the cause.

The solution (in my case) was to disable this facility:

echo no > /sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

It's worth noting that this setting appears to be disabled by default on
RHEL 6 installations, but enabled on CentOS 6.
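
Before changing anything it can help to see what the settings currently are.
The following is a minimal sketch and not part of the original report:
thp_dir and thp_status are hypothetical helper names, and the second
directory is the upstream kernel path, assumed here for systems that do not
use the Red Hat specific one.

```shell
#!/bin/sh
# Print the current transparent hugepage (THP) settings.
# Assumption: the sysfs directory is either the RHEL/CentOS 6 path quoted in
# the message above or the upstream kernel path.

thp_dir() {
    # Echo the first THP sysfs directory that exists; fail if none does.
    for d in /sys/kernel/mm/redhat_transparent_hugepage \
             /sys/kernel/mm/transparent_hugepage; do
        if [ -d "$d" ]; then
            echo "$d"
            return 0
        fi
    done
    return 1
}

thp_status() {
    # Usage: thp_status [DIR]
    # DIR overrides auto-detection (handy for trying this on a copy).
    dir=$1
    if [ -z "$dir" ]; then
        dir=$(thp_dir) || {
            echo "no THP sysfs directory found" >&2
            return 1
        }
    fi
    # Print each setting file that is present and readable.
    for f in enabled defrag khugepaged/defrag; do
        if [ -r "$dir/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$dir/$f")"
        fi
    done
}
```

Running thp_status with no argument prints one line per setting file it
finds. Values written with echo are lost at reboot, so the two echo commands
would need to be reapplied at boot (for example from an init script) if the
change is meant to stay.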


Cheers
Jason.


On Wed, May 9, 2012 at 10:15 AM, Michael Fox  wrote:

> In which case, enable sysstat on CentOS if it isn't already. It will help
> moving forward.
>
> On Wed, May 9, 2012 at 10:11 AM, Grant Street  wrote:
>
> > 16 core, 12-24G memory running centos 6.1
> >
> >
> > On 09/05/12 10:08, David Lyon wrote:
> >
> >> Are they dual core ?
> >>
> >> Do they have a sheetload of memory ?
> >>
> >> I found ubuntu got slower and slower till I got in
> >> newer hardware.
> >>



-- 
--
Teach your kids Science, or somebody else will :/

ja...@ball.net
vk2...@google.com 
callsign: vk2vjb


Re: [SLUG] High System CPU usage and finding culprit

2012-05-08 Thread John Ferlito
On Wed, May 09, 2012 at 10:11:43AM +1000, Grant Street wrote:
> 16 core, 12-24G memory running centos 6.1

I can highly recommend collectd for collecting system stats. It
collects them at 10-second intervals and it knows about an amazing
range of stuff. It's really easy to go back in time and work out what
was going on with the whole system.

I could probably be convinced to give a talk on this at SLUG.

Cheers,
John

-- 
John
Blog http://www.inodes.org
LCA2012  http://lcaunderthestars.org.au


Re: [SLUG] High System CPU usage and finding culprit

2012-05-08 Thread Michael Fox
In which case, enable sysstat on CentOS if it isn't already. It will help
moving forward.

On Wed, May 9, 2012 at 10:11 AM, Grant Street  wrote:

> 16 core, 12-24G memory running centos 6.1
>
>
> On 09/05/12 10:08, David Lyon wrote:
>
>> Are they dual core ?
>>
>> Do they have a sheetload of memory ?
>>
>> I found ubuntu got slower and slower till I got in
>> newer hardware.
>>


Re: [SLUG] High System CPU usage and finding culprit

2012-05-08 Thread Grant Street

16 cores, 12-24 GB of memory, running CentOS 6.1

On 09/05/12 10:08, David Lyon wrote:

> Are they dual core ?
>
> Do they have a sheetload of memory ?
>
> I found ubuntu got slower and slower till I got in
> newer hardware.




Re: [SLUG] High System CPU usage and finding culprit

2012-05-08 Thread Michael Fox
Hello Grant,

Distribution? Release?

I'd start by reviewing the sysstat output (aka sar reports) for the period
of the issue, looking at the I/O stats and anything else useful, to see if
something stands out. It will also let you review the swap usage during
those times.

If the package is not installed, get it installed for the next occurrences.
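
As a rough sketch of that review: the helpers below pull a few sar reports
for a time window out of one day's data file. The names sa_file and
sar_window are hypothetical, the /var/log/sa/saDD location is the usual one
on RHEL/CentOS but is an assumption here, and the day and times in the usage
note are placeholders.

```shell
#!/bin/sh
# Review sysstat data around an incident.
# Assumption: sysstat keeps one binary data file per day of the month
# under /var/log/sa, named saDD.

sa_file() {
    # Usage: sa_file DAY  (day of month, with or without a leading zero)
    # Strip one leading zero so printf does not read "08" or "09" as octal.
    printf '/var/log/sa/sa%02d\n' "${1#0}"
}

sar_window() {
    # Usage: sar_window DAY START END   (times as HH:MM:SS)
    sa_path=$(sa_file "$1")
    # -u CPU utilisation (includes %system), -q run queue and load average,
    # -W swapping, -B paging, -b I/O and transfer rates
    for flag in -u -q -W -B -b; do
        sar "$flag" -f "$sa_path" -s "$2" -e "$3"
    done
}
```

For example, sar_window 9 09:30:00 10:00:00 would print the five reports for
that half hour on the 9th, provided sysstat was already collecting then.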

Thanks

On Wed, May 9, 2012 at 9:49 AM, Grant Street  wrote:

> Hello
>
> I have some desktop Linux machines that have periods of extremely high
> system time (30-90%) with no obvious cause. The users see it as a hang or a
> freeze, to the point of 10 seconds for a key press to register. It comes and
> goes seemingly at random, but lasts at most about 1-2 minutes.
>
> What I'm after is any hints on how to track down what's causing this.
>
> Symptoms
> - load average is low (sub 1)
> - No process is consuming a lot of cpu (from top)
> - No swapping is occurring at the time
> - plenty of free memory
> - no/low IOwait%
> - IRQ% is 0
> - Still has Idle available
> - nothing in dmesg
>
> Any help or investigation tips would be appreciated
>
> Thanks


Re: [SLUG] High System CPU usage and finding culprit

2012-05-08 Thread David Lyon
Are they dual core?

Do they have a sheetload of memory?

I found Ubuntu got slower and slower until I got
newer hardware.


[SLUG] High System CPU usage and finding culprit

2012-05-08 Thread Grant Street

Hello

I have some desktop Linux machines that have periods of extremely high
system time (30-90%) with no obvious cause. The users see it as a hang
or a freeze, to the point of 10 seconds for a key press to register. It comes
and goes seemingly at random, but lasts at most about 1-2 minutes.

What I'm after is any hints on how to track down what's causing this.

Symptoms
- load average is low (sub 1)
- No process is consuming a lot of cpu (from top)
- No swapping is occurring at the time
- plenty of free memory
- no/low IOwait%
- IRQ% is 0
- Still has Idle available
- nothing in dmesg

Any help or investigation tips would be appreciated.

Thanks
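
One way to pin down when the spikes happen is to log system time from
/proc/stat at short intervals and line the timestamps up with other logs.
This is only a sketch: sys_pct and sample_sys_pct are hypothetical helper
names, and it assumes the aggregate "cpu" line is the first line of
/proc/stat with the fields user, nice, system, idle, iowait, irq, softirq
and steal in that order. It shows when system time is high, not what is
causing it.

```shell
#!/bin/sh
# Measure the share of CPU time spent in the kernel ("system" time).

sys_pct() {
    # Usage: sys_pct LINE1 LINE2
    # LINE1 and LINE2 are two copies of the aggregate "cpu" line from
    # /proc/stat, taken some seconds apart. Prints system time as a
    # percentage of all CPU time that elapsed between the two samples.
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Sum user..steal only (fields 2-9); guest time is already
            # counted inside user, so later fields are left out.
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            sys = $4
        }
        NR == 1 { t0 = total; s0 = sys }
        NR == 2 {
            if (total > t0)
                printf "%.1f\n", 100 * (sys - s0) / (total - t0)
            else
                print "0.0"
        }'
}

sample_sys_pct() {
    # Usage: sample_sys_pct [SECONDS]   (default 5; Linux only)
    first=$(head -n 1 /proc/stat)
    sleep "${1:-5}"
    second=$(head -n 1 /proc/stat)
    sys_pct "$first" "$second"
}
```

Calling sample_sys_pct 5 in a loop and prefixing each result with the output
of date would give a simple timeline of the spikes.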