On Wed, 2010-07-28 at 09:42 +0200, ulysse31 wrote:
[...]
> kernel:[63659.784275] invalid opcode: 0000 [#1] SMP
[...]
> kernel:[63659.784275]  [<c017b969>] kmem_cache_alloc+0x47/0x87
[...]
> kernel:[63659.784275] Code: 8b 75 00 39 ee 75 15 8b 75 10 8d 45 10 c7 45 34
> 01 00 00 00 39 c6 0f 84 a5 00 00 00 8b 4c 24 0c 8b 81 98 00 00 00 39
[...]

Although the code dump is incomplete, I was able to find a match. This
byte sequence appears only in cache_alloc_refill(). There is a ud2a
instruction a few bytes further on which appears to correspond to this
assertion:

        /*
         * The slab was either on partial or free list so
         * there must be at least one object available for
         * allocation.
         */
        BUG_ON(slabp->inuse < 0 || slabp->inuse >= cachep->num);

This means that the free list has been corrupted in some way.

I don't see any references to a bug in the NFS client that might do
that - though, just because the corruption is found when the NFS client
is active, doesn't mean it caused the corruption.

Before we investigate the software any further, please check the RAM on
the affected machine, e.g. using memtest86+.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.