When some of our users run leaky code on our cluster, it triggers the
Linux kernel's out-of-memory (OOM) killer.  Usually this works well and
kills the offending process with no other problems, but sometimes gmond
appears to die at the same time.  The kernel does not log that it
killed any gmond processes, and usually only a few of the 8 threads die
or become zombies, so I don't think the OOM killer is what is killing
gmond.  Has anyone else noticed a problem like this?  Could something
internal to gmond be failing to handle low-memory conditions?  It
appears that only the threads that read /proc and send the multicast
data die, while the threads that receive the data and respond to XML
requests continue to run.
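
For anyone who wants to check the same thing on their own nodes, here is
a rough sketch of how the per-thread state could be inspected.  It is
not part of gmond or Ganglia; the function names (parse_stat_line,
thread_states, summarize) are made up for this example.  It assumes a
kernel that exposes threads under /proc/<pid>/task (2.6 and later); on
older LinuxThreads systems each thread shows up as its own process, so
only the name-matching part would apply there.

```python
import os
import sys
from collections import Counter


def parse_stat_line(line):
    """Return (id, name, state) parsed from one /proc 'stat' line."""
    # The name is wrapped in parentheses and may itself contain spaces or
    # parentheses, so locate the last ')' instead of splitting on spaces.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    ident = int(line[:open_paren])
    name = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return ident, name, state


def thread_states(pid, proc_root="/proc"):
    """List (tid, name, state) for every thread of the given process."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    results = []
    for tid in sorted(os.listdir(task_dir), key=int):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                results.append(parse_stat_line(f.read()))
        except OSError:
            # The thread exited between listdir() and open(); skip it.
            continue
    return results


def summarize(states):
    """Count threads per state letter (for example S = sleeping, Z = zombie)."""
    return dict(Counter(state for _, _, state in states))


if __name__ == "__main__":
    # Process name to look for; defaults to gmond.
    target = sys.argv[1] if len(sys.argv) > 1 else "gmond"
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join("/proc", entry, "stat")) as f:
                _, name, _ = parse_stat_line(f.read())
            if name != target:
                continue
            states = thread_states(int(entry))
        except OSError:
            # The process went away while we were looking at it.
            continue
        print(entry, summarize(states))
        for tid, _, state in states:
            print("  tid", tid, "state", state)
```

Run from cron on each node, the output would show how many threads each
gmond still has and whether any are in state Z, which could then be
compared against the timestamps of the OOM killer messages in the
kernel log.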

~Jason


-- 
/------------------------------------------------------------------\
|  Jason A. Smith                          Email:  [EMAIL PROTECTED] |
|  Atlas Computing Facility, Bldg. 510M    Phone:  (631)344-4226   |
|  Brookhaven National Lab, P.O. Box 5000  Fax:    (631)344-7616   |
|  Upton, NY 11973-5000                                            |
\------------------------------------------------------------------/

