From the top output sent earlier, it looks like the administrators might have configured the system with no swap:

r1i1n2

Swap:        0M total,        0M used,        0M free,    10563M cached

r1i1n3

Swap:        0M total,        0M used,        0M free,    23089M cached

Keep in mind that having swap can mean the difference between degraded performance and a hard crash when memory runs low [ http://unix.stackexchange.com/questions/190398/do-i-need-swap-space-if-i-have-more-than-enough-amount-of-ram ].
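
If it would help to check this on each node without reading through top output, here is a minimal sketch, assuming Linux nodes that expose /proc/meminfo with the usual SwapTotal, SwapFree and Cached fields. The function names (parse_meminfo, summarize) are hypothetical helpers made up for illustration, not part of WIEN2k or any system tool.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def summarize(info):
    """Return swap and page-cache figures in MiB (hypothetical helper)."""
    swap_total = info.get("SwapTotal", 0)
    swap_free = info.get("SwapFree", 0)
    return {
        "swap_total_mib": swap_total // 1024,
        "swap_used_mib": (swap_total - swap_free) // 1024,
        "cached_mib": info.get("Cached", 0) // 1024,
    }


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    # Assumption: run directly on the compute node of interest.
    with open("/proc/meminfo") as f:
        print(summarize(parse_meminfo(f.read())))
```

A swap_total_mib of 0 would confirm that the node has no swap configured, and cached_mib should roughly correspond to the "cached" figure top reports.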

On 9/29/2015 5:57 AM, Laurence Marks wrote:

If it happens again, one thing to ask them to check is swap usage and how much memory is cached. On some of my nodes I have noticed that they do not always release cached memory, and can start swapping. If this happens the job will get very slow. The commands to use to clear the cache can be found at http://www.tecmint.com/clear-ram-memory-cache-buffer-and-swap-space-on-linux/ or similar. (Needs root access.) Top can also show memory use.

While there should be no need to do this, I have noticed that I need to do it every 3hrs on 4 nodes - the other 20 don't need it. It is an issue mainly for big calculations.

Alternatively it was something else, a zombie, big log files or other things. Rebooting gets rid of a lot of system caches and helps -- even on my Android tablet every week or two. It's murky waters.

---
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
http://www.numis.northwestern.edu
Corrosion in 4D: http://MURI4D.numis.northwestern.edu
Co-Editor, Acta Cryst A
"Research is to see what everybody else has seen, and to think what nobody else has thought"
Albert Szent-Gyorgi

_______________________________________________
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
