Re: On SMP, getting: Message from watchdog: The system will be rebooted because of error -3!

2003-09-08 Thread Russell Coker
On Mon, 8 Sep 2003 19:57, Dave Watkins wrote:
> This is probably due to the HT nature of the CPU. For example if one
> thread is using a part of the CPU that isn't duplicated, and then a
> second thread on the "other" logical CPU also wants those resources, you
> could effectively have 150% CPU load until the first thread finishes with
> those resources as the second thread will be waiting.

That will not affect the load average.  A process that is ready to run and one 
that is actually running count the same; the average is computed from the 
number of processes that are runnable or blocked on IO.

Even IF hyper-threading doubled the reported loadavg due to a kernel bug, it 
still wouldn't explain a load average of 159. That's much more than double 
what you want to see on a typical Linux system!
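
A rough Python sketch of the point above, in case it helps. It assumes the 
usual Linux task state letters (R = running or runnable, D = uninterruptible 
sleep, normally IO wait, S = ordinary sleep) and approximates the kernel's 
damped average in floating point with a 5 second sample interval. The names 
active_tasks and next_load are made up for illustration; this is not the 
kernel's code.

```python
import math

# Task states as shown in /proc/<pid>/stat: R = running or runnable,
# D = uninterruptible sleep (usually waiting on IO), S = interruptible sleep.
COUNTED_STATES = {"R", "D"}


def active_tasks(states):
    """Count the tasks that contribute to the load average."""
    return sum(1 for s in states if s in COUNTED_STATES)


def next_load(prev, active, interval=5.0, window=60.0):
    """One step of an exponentially damped average (float approximation)."""
    decay = math.exp(-interval / window)
    return prev * decay + active * (1.0 - decay)
```

A task waiting for a CPU and a task holding one both show up as R, so a 
thread stalled on a shared execution unit adds nothing extra to the count.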

-- 
http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/  My home page


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Re: On SMP, getting: Message from watchdog: The system will be rebooted because of error -3!

2003-09-08 Thread Dave Watkins
Russell Coker wrote:
> On Mon, 8 Sep 2003 17:46, Jason Lim wrote:
> > I had set the loadavg to such an absurd number, I never thought it could
> > be that. It NEVER peaks that high on a single CPU (well... without HT SMP
> > on). Is this normal? Do SMP systems tend to spike a lot higher than
> > regular single CPU ones?
> >
> > Strange thing is... the previous 2GHz CPU... never went that high... and
> > now with a 2.8GHz HyperThreading processor, the load average actually
> > increases (or at least the spiking load average). Is this a trait of SMP?
>
> Adding more CPUs will not affect the load average if it's IO related.  If it's
> CPU usage related then more CPU power should decrease the load average.
>
> Of course kernel bugs could be triggered by a different hardware
> configuration, but that's not a likely possibility at this time.
>
> I suggest checking for any cron jobs etc that might have caused high load.

This is probably due to the HT nature of the CPU. For example, if one 
thread is using a part of the CPU that isn't duplicated, and then a 
second thread on the "other" logical CPU also wants those resources, you 
could effectively have 150% CPU load until the first thread finishes with 
those resources, as the second thread will be waiting.

Dave





Re: On SMP, getting: Message from watchdog: The system will be rebooted because of error -3!

2003-09-08 Thread Russell Coker
On Mon, 8 Sep 2003 17:46, Jason Lim wrote:
> I had set the loadavg to such an absurd number, I never thought it could
> be that. It NEVER peaks that high on a single CPU (well... without HT SMP
> on). Is this normal? Do SMP systems tend to spike a lot higher than
> regular single CPU ones?
>
> Strange thing is... the previous 2GHz CPU... never went that high... and
> now with a 2.8GHz HyperThreading processor, the load average actually
> increases (or at least the spiking load average). Is this a trait of SMP?

Adding more CPUs will not affect the load average if it's IO related.  If it's 
CPU usage related then more CPU power should decrease the load average.

Of course kernel bugs could be triggered by a different hardware 
configuration, but that's not a likely possibility at this time.

I suggest checking for any cron jobs etc that might have caused high load.




Re: On SMP, getting: Message from watchdog: The system will be rebooted because of error -3!

2003-09-08 Thread Jason Lim
From: "Russell Coker" <[EMAIL PROTECTED]>
To: "Jason Lim" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Monday, 08 September, 2003 1:26 PM
Subject: Re: On SMP, getting: Message from watchdog: The system will be rebooted because of error -3!


> On Mon, 8 Sep 2003 15:09, Jason Lim wrote:
> > Recently got SMP working, but now keep getting:
> > Message from watchdog: The system will be rebooted because of error -3!
>
> Check /var/log/daemon.log for the real reason, a transient load spike is a
> likely cause.
>

Right... you were right.

Sep  8 12:31:18 beta watchdog[243]: loadavg 159 63 24 is higher than the given threshold 150 140 130!

Sep  8 12:31:28 beta watchdog[243]: shutting down the system because of error -3
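
For anyone reading these lines later, here is a small Python sketch that 
pulls the numbers out of the loadavg message and reports which averaging 
window went over its limit. The pattern is modelled only on the one line 
quoted above, so other watchdog versions may word it differently, and the 
function name exceeded_windows is made up for illustration.

```python
import re

# Pattern modelled on the single log line quoted in this thread; other
# watchdog versions may word the message differently.
LOADAVG_RE = re.compile(
    r"loadavg (\d+) (\d+) (\d+) is higher than the given threshold (\d+) (\d+) (\d+)"
)


def exceeded_windows(line):
    """Return the averaging windows (in minutes) whose load exceeds its limit."""
    m = LOADAVG_RE.search(line)
    if m is None:
        return []
    values = [int(g) for g in m.groups()]
    loads, limits = values[:3], values[3:]
    return [w for w, load, limit in zip((1, 5, 15), loads, limits) if load > limit]


line = ("Sep  8 12:31:18 beta watchdog[243]: loadavg 159 63 24 is higher "
        "than the given threshold 150 140 130!")
print(exceeded_windows(line))  # [1]
```

Only the 1-minute figure (159 against 150) is over; the 5 and 15 minute 
figures are well under their limits, which fits a short spike.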

I had set the loadavg threshold to such an absurd number that I never
thought it could be that. It NEVER peaks that high on a single CPU (well...
without HT SMP on). Is this normal? Do SMP systems tend to spike a lot
higher than regular single CPU ones?

Strange thing is... the previous 2GHz CPU... never went that high... and
now with a 2.8GHz HyperThreading processor, the load average actually
increases (or at least the spiking load average). Is this a trait of SMP?

I've never worked with SMP like this before... are such strange
characteristics normal?

Thanks in advance.

Jas





Re: On SMP, getting: Message from watchdog: The system will be rebooted because of error -3!

2003-09-07 Thread Russell Coker
On Mon, 8 Sep 2003 15:09, Jason Lim wrote:
> Recently got SMP working, but now keep getting:
> Message from watchdog: The system will be rebooted because of error -3!

Check /var/log/daemon.log for the real reason, a transient load spike is a 
likely cause.
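
A minimal Python sketch for picking the watchdog entries out of that log. It 
assumes syslog tags them as watchdog[PID], which matches the log excerpt 
quoted elsewhere in this thread; the function name watchdog_lines is made up 
for illustration.

```python
def watchdog_lines(lines):
    """Yield log lines written by the watchdog daemon."""
    for line in lines:
        # syslog puts the program name and PID before the message text
        if " watchdog[" in line:
            yield line.rstrip("\n")


# Typical use (path as given above; reading it may need root):
# with open("/var/log/daemon.log", errors="replace") as log:
#     for entry in watchdog_lines(log):
#         print(entry)
```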




On SMP, getting: Message from watchdog: The system will be rebooted because of error -3!

2003-09-07 Thread Jason Lim
Hi all,

Recently got SMP working, but now keep getting:
Message from watchdog: The system will be rebooted because of error -3!

(Note this isn't really SMP, it's Intel Hyper-Threading...)

The system auto-reboots because of this. Not sure why... it doesn't appear to
be the load or anything (no conditions met from /etc/watchdog.conf).

Any idea what this might be?


Sincerely,
Jas

