On Thu, Jun 05, 2003 at 01:47:00PM -0400, David Nolan wrote:

> I disagree.  Problems with perl should *at worst* cause perl to segfault.
> 
> Any time any application can cause an entire system to freeze I feel the 
> cause is one of two things.  Either an old buggy kernel, or a hardware 
> problem that the particular application just happens to trigger.  And in my 
> 7 years of experience as a programmer and sysadmin, 99% of the time it's 
> hardware.

I have had the same experience.  Whenever I can recall having
similar problems, they turned out to be related to some kind of
hardware problem.  In this case, however, I've tried mon on two
entirely separate systems with virtually nothing in common, both of
which are 100% reliable when mon is not running.  Both will run for
months on end with no problem of any kind, but when I start mon
they freeze like this within a couple of days.
> 
> My first guess would be you may have a disk problem.  For example, it might 
> be in your swap partition, and it only causes problems when you're 
> exercising the machine enough to be swapping pretty hard.  But any number 
> of other problems could be the cause.
> 
> Use some standard system debugging procedures, and a little of the 
> Scientific Method:
> Contemplate other ways in which the machine is using its hardware more 
> aggressively when a failure occurs.  Try running other programs that 
> exercise that hardware.  Run 'dmesg' and look for kernel errors that may be 
> early warning signs.  Listen for strange noises (disk problems).  Open up 
> the machine and disconnect and reconnect everything, including reseating 
> memory, PCI cards, etc.  Install and use utilities for monitoring the 
> system temperature.  Try replacing hardware components with hot-spares. 
> (You do have spare hardware, right?)
> 
> You get the idea...

Yeah, I've checked dmesg and gone through all the log files looking
for anything amiss, and found nothing that I recognize as an issue.
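
For anyone repeating that check, a log scan of this kind could be
sketched roughly as below.  This is only an illustration, not
something mon provides: the pattern list and the function name
find_suspect_lines are assumptions made up for the example, and the
patterns are merely strings that commonly appear in Linux kernel
error messages, so an empty result does not prove the hardware is
healthy.

```python
import re
import subprocess

# Hypothetical keyword list: common fragments of Linux kernel error
# messages. Illustrative only, not exhaustive.
SUSPECT_PATTERNS = [
    r"I/O error",
    r"Machine Check",
    r"\bOops\b",
    r"Out of memory",
    r"segfault",
]


def find_suspect_lines(lines, patterns=SUSPECT_PATTERNS):
    """Return (line_number, text) pairs for lines matching any pattern."""
    combined = re.compile("|".join(patterns), re.IGNORECASE)
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if combined.search(line)
    ]


if __name__ == "__main__":
    # Read the kernel ring buffer; this may need root on some systems.
    output = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=False
    ).stdout
    for number, text in find_suspect_lines(output.splitlines()):
        print(f"{number}: {text}")
```

The same function can be pointed at any saved log by passing it the
lines of that file instead of the dmesg output.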

As for replacing hardware, I sort of thought using two separate,
unrelated systems took care of that.  I may try a third system when
I have one available.  In fact, I just thought of a new server I am
building that isn't in service yet, and I may try mon on that one
to see what happens.
> 
> -David Nolan
>  Network Software Developer
>  Computing Services
>  Carnegie Mellon University
> 
> 
Don MacDougall

_______________________________________________
mon mailing list
[EMAIL PROTECTED]
http://linux.kernel.org/mailman/listinfo/mon
