Christoph Rohland <[EMAIL PROTECTED]> writes:
> 
> This is the first report of such corruption. If it's real it is _not_
> fixed between test5 and test11. There is probably no way to reproduce
> it since you ask if it's fixed in test11, right?

I know no way to reproduce it.  I've been using "test5" reliably since
just after its release, and I've triggered this bug only the one time.

I was running Mozilla, one of the few programs I run that uses shared
memory to communicate with the X server.  If I recall correctly, the
machine had been idle for a few minutes when my ISP suddenly hung up
on me.  Then, I discovered the machine had locked: CPU1 running "pppd"
got stuck waiting for the kernel lock in "sock_ioctl".  I believe it
was the innocent victim.  CPU0 (running "XF86_SVGA") had grabbed the
kernel lock and gotten stuck spinning on the invalid swap device
spinlock, as mentioned in my previous message.

I use a SysRq patch to do an oops-style dump instead of the usual
"showPc" function, so I was able to copy down a stack dump.

From the stack dump, I can be 100% positive that, in shm_nopage_core,
"shp" was 0xc218b240 on entry and "idx" was 0, but the line

        pte = SHM_ENTRY(shp,idx);

calculated a value of 0xc218b268, the memory location of
"shp->shm_dir".  That is, I had shp->shm_dir == **shp->shm_dir, so I
*suspect* that shp->shm_dir == *shp->shm_dir.

In any event, the "shp" was corrupt (hadn't been initialized or had
been freed and reused).
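
To make the address arithmetic above easier to follow, here is a small
userspace sketch.  It is not the kernel code: "struct shm_sketch", its
padding, "PTRS_PER_PTE_SKETCH", the "pte_t" typedef and all function
names are made up for illustration, and the two-level lookup is only my
assumption of how SHM_ENTRY indexes shm_dir.  It shows that the two
addresses are 0x28 bytes apart, and that a directory pointer which
points back at its own storage makes the entry for idx 0 resolve to the
location of shm_dir itself.  That is one scenario consistent with the
dump, not a diagnosis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef unsigned long pte_t;          /* assumed stand-in for the kernel type */
#define PTRS_PER_PTE_SKETCH 1024      /* assumed entries per second-level table */

/* Hypothetical layout: only the position of shm_dir matters here. */
struct shm_sketch {
    char pad[0x28];                   /* assumed filler so shm_dir sits at 0x28 */
    pte_t **shm_dir;                  /* first-level directory of entry tables */
};

/* Byte distance between the start of the structure and one of its fields. */
static uintptr_t shm_dir_offset(uintptr_t shp, uintptr_t field)
{
    return field - shp;
}

/*
 * Address of the entry for page "idx", assuming the two-level lookup
 * shm_dir[idx / N][idx % N].  The first-level slot is read with memcpy
 * so the sketch stays well-defined even when the directory is bogus.
 */
static uintptr_t entry_address(const struct shm_sketch *shp, size_t idx)
{
    pte_t *table;

    memcpy(&table, &shp->shm_dir[idx / PTRS_PER_PTE_SKETCH], sizeof table);
    return (uintptr_t)(table + idx % PTRS_PER_PTE_SKETCH);
}

/* Returns 1 if a self-referential shm_dir reproduces the observed pattern. */
static int self_reference_check(void)
{
    struct shm_sketch s;

    memset(&s, 0, sizeof s);
    s.shm_dir = (pte_t **)&s.shm_dir; /* directory points at its own storage */

    /* The idx 0 entry lands on the storage of shm_dir itself, and the
       first-level slot holds the same value as shm_dir. */
    return entry_address(&s, 0) == (uintptr_t)&s.shm_dir
        && entry_address(&s, 0) == (uintptr_t)s.shm_dir;
}
```

Calling shm_dir_offset(0xc218b240UL, 0xc218b268UL) gives 0x28, which
matches the padding assumed in the sketch; the real structure layout may
well differ.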

I'll fiddle around a bit more and see if I can find a way to reproduce
it reliably.

Thanks.

Kevin <[EMAIL PROTECTED]>
