Hi all, Thanks all. After lots of testing, I isolated the problem to one of the memory modules.
Thought it might have been a kernel problem as I thought memtest should be exhaustive enough considering I ran it for so long, but apparently not... Even now, the bad module still does not show any errors in memtest... Thanks, Yucheng Ray Lee wrote: > On 9/19/07, Low Yucheng <[EMAIL PROTECTED]> wrote: > >> [1.] Summary >> System Freeze on Particular workload with kernel 2.6.22.6 >> >> [2.] Description >> System freezes on repeated application of the following command >> for f in *png ; do convert -quality 100 $f `basename $f png`jpg; done >> >> Problem is consistent and repeatable. >> Problem persists when running on a different drive, and also in pure console >> (no X). >> >> One time, the following error logged in syslog: >> Sep 19 04:22:11 mossnew kernel: [ 301.883919] VM: killing process convert >> Sep 19 04:22:11 mossnew kernel: [ 301.884382] swap_free: Unused swap offset >> entry 0000ff00 >> Sep 19 04:22:11 mossnew kernel: [ 301.884421] swap_free: Unused swap offset >> entry 00000300 >> Sep 19 04:22:11 mossnew kernel: [ 301.884456] swap_free: Unused swap offset >> entry 00000200 >> Sep 19 04:22:11 mossnew kernel: [ 301.884491] swap_free: Unused swap offset >> entry 0000ff00 >> Sep 19 04:22:11 mossnew kernel: [ 301.884527] swap_free: Unused swap offset >> entry 0000ff00 >> Sep 19 04:22:11 mossnew kernel: [ 301.884562] swap_free: Unused swap offset >> entry 00000100 >> >> Should not be a RAM problem. RAM has survived 12 hrs of Memtest with no >> errors. >> Should not be a CPU problem either. I have been running CPU intensive tasks >> for days. >> > > The "Unused swap offset entry" is almost always a sign of bad memory, > if google can be trusted. Your workload is *extremely* CPU and memory > intensive (and even hits the disk!), so this looks like bad RAM, bad > cooling, or a marginal power supply that is failing under load. > > memtest86+ doesn't stress the CPU nearly as much, so it often doesn't > show all the problems. > > Take your RAM down to one stick and try again (looks like you have 2G > installed?). If that still fails, try different RAM. If that still > fails, then swap out the power supply for another if you can, and try > again. > > Ray > > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/