Emmanuel Dreyfus <m...@netbsd.org> wrote:

> I discovered another scenario where force unmount could not work: an
> unresponsive PUFFS filesystem. The filesystem got out of order during an
> operation where the filesystem root vnode is locked.

I tried LOCKDEBUG to get some hints, and now I get a nice panic (see
below). gdb suggests this happens in LOCKDEBUG_BARRIER(NULL, 0) in
mi_userret().

Can anyone knowledgeable tell me what this means, so that I get some idea
of what needs to be fixed? I have the feeling this detects that I am
leaving the kernel while a lock has not been released; is that the problem?

The panic:

Reader / writer lock error: lockdebug_barrier: sleep lock held

lock address : 0x00000000c2965dc0 type     :     sleep/adaptive
initialized  : 0x00000000c0400b1b
shared holds :                  0 exclusive:                  1
shares wanted:                  0 exclusive:                  0
current cpu  :                  0 last held:                  0
current lwp  : 0x00000000c2a1bd20 last held: 0x00000000c2a1bd20
last locked* : 0x00000000c018b217 unlocked : 0x00000000c018b2dd
owner/count  : 0x00000000c2a1bd20 flags    : 0x0000000000000004

Turnstile chain at 0xc049ef00.
=> No active turnstile for this lock.

panic: LOCKDEBUG: Reader / writer lock error: lockdebug_barrier: sleep lock held
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c012fcd4 cs 9 eflags 282 cr2 b9d1997f ilevel 0 esp da8b3ecc
curlwp 0xc2a1bd20 pid 4216 lid 1 lowest kstack 0xda8b22c0
Stopped in pid 4216.1 (ln) at   netbsd:breakpoint+0x4:  popl    %ebp
breakpoint(c04803c9,c04e7fe0,c04803cb,da8b3ef8,0,c048048d,da8b3eec,c28c0280,0,c048048d) at netbsd:breakpoint+0x4
vpanic(c04803cb,da8b3ef8,207,c048048d,0,c28c0280,da8b3f2c,c035847e,c04803cb,c04751d2) at netbsd:vpanic+0x117
panic(c04803cb,c04751d2,c045cacc,c048048d,c2a1bd20,ffffff9c,17ffc77,c045cacc,bf7ffc8d,400) at netbsd:panic+0x18
lockdebug_abort1(c048048d,1,da8b3f7c,da8b3fa0,c02ebc23,da8b3f54,9,106,bf7ffc77,bf7ffc8d) at netbsd:lockdebug_abort1+0xce
syscall() at netbsd:syscall+0xea
--- syscall (number 0) ---
bb69d1d7:

ds          da8b0011
es          c0480011    copyright+0x1b131
fs          31
gs          da8b0011
edi         da8b3ef8
esi         c04803cb    copyright+0x1b4eb
ebp         da8b3e9c
ebx         104
edx         0
ecx         0
eax         1
eip         c012fcd4    breakpoint+0x4
cs          9
eflags      282
esp         da8b3e9c
ss          11
netbsd:breakpoint+0x4:  popl    %ebp

(gdb) list *(syscall+0xea)
0xc037d4ca is in syscall (../../../../arch/x86/x86/syscall.c:185).
180             }
181         }
182
183         SYSCALL_TIME_SYS_EXIT(l);
184         userret(l);
185     }
186
187     void
188     syscall_intern(struct proc *p)
189     {

The i386 userret() is just a call to mi_userret(), which ends with

    LOCKDEBUG_BARRIER(NULL, 0);

If LOCKDEBUG is enabled, LOCKDEBUG_BARRIER is defined as

    #define LOCKDEBUG_BARRIER(lock, slp)    lockdebug_barrier(lock, slp)

And:

    void
    lockdebug_barrier(volatile void *spinlock, int slplocks)
    {
        struct lwp *l = curlwp;
        lockdebug_t *ld;
        int s;

        if (panicstr != NULL || ld_panic)
            return;

        s = splhigh();
        if ((l->l_pflag & LP_INTR) == 0) {
            TAILQ_FOREACH(ld, &curcpu()->ci_data.cpu_ld_locks, ld_chain) {
                if (ld->ld_lock == spinlock) {
                    continue;
                }
                __cpu_simple_lock(&ld->ld_spinlock);
                lockdebug_abort1(ld, s, __func__, "spin lock held", true);
                return;
            }
        }
        if (slplocks) {
            splx(s);
            return;
        }
        if ((ld = TAILQ_FIRST(&l->l_ld_locks)) != NULL) {
            __cpu_simple_lock(&ld->ld_spinlock);
            lockdebug_abort1(ld, s, __func__, "sleep lock held", true);
            return;
        }
        splx(s);
        if (l->l_shlocks != 0) {
            panic("lockdebug_barrier: holding %d shared locks", l->l_shlocks);
        }
    }

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
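
For illustration, the "sleep lock held" branch of lockdebug_barrier() quoted above can be modelled in userland. This is only a minimal sketch under stated assumptions, not kernel code: struct model_lwp, model_lock(), model_unlock(), model_barrier() and model_syscall() are all hypothetical names, and a plain counter stands in for the per-LWP l_ld_locks list. It shows how a code path that returns early without unlocking would trip a check made on the way back to user mode.

```c
#include <stddef.h>

/* Hypothetical userland model; none of these names exist in NetBSD. */
struct model_lwp {
    int held_sleep_locks;   /* stands in for the per-LWP l_ld_locks list */
};

/* Record that the modelled LWP took or released one sleep lock. */
static void model_lock(struct model_lwp *l)   { l->held_sleep_locks++; }
static void model_unlock(struct model_lwp *l) { l->held_sleep_locks--; }

/*
 * Model of the check made when returning to user mode: NULL means no
 * sleep lock is held; otherwise a message like the one in the panic.
 */
static const char *model_barrier(const struct model_lwp *l)
{
    return l->held_sleep_locks != 0 ? "sleep lock held" : NULL;
}

/* A syscall-like path that forgets to unlock on its error return. */
static int model_syscall(struct model_lwp *l, int fail)
{
    model_lock(l);
    if (fail)
        return -1;          /* bug: returns with the lock still held */
    model_unlock(l);
    return 0;
}
```

In this model, calling model_syscall(&l, 1) on a zero-initialized struct model_lwp and then model_barrier(&l) yields the "sleep lock held" message, while the non-failing path leaves model_barrier() returning NULL.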