Paolo Giarrusso wrote: > On Mon, May 10, 2010 at 19:43, Jan Kiszka <jan.kis...@web.de> wrote: >> Lukas Czerner wrote: >>> Hi, >>> >>> I have got an error when booting uml with Linked list manipulation >>> debugging turned on. The crash occurs in the exact moment when the >>> consoles are showing up - it just blinks and crush. Here is the >>> backtrace: >>> >>> >>> #0 0x0000003cb8233e14 in abort () from /lib64/libc.so.6 >>> #1 0x00000000600220b2 in os_dump_core () at arch/um/os-Linux/util.c:119 >>> #2 0x00000000600147c1 in panic_exit (self=<value optimized out>, >>> unused1=<value optimized out>, >>> unused2=<value optimized out>) at arch/um/kernel/um_arch.c:233 >>> #3 0x0000000060045996 in notifier_call_chain (nl=<value optimized out>, >>> val=0, v=0x602ffb40, >>> nr_to_call=-2, nr_calls=<value optimized out>) at kernel/notifier.c:93 >>> #4 0x00000000600459e4 in __atomic_notifier_call_chain (nh=<value optimized >>> out>, >>> val=<value optimized out>, v=<value optimized out>) at kernel/notifier.c:182 >>> #5 atomic_notifier_call_chain (nh=<value optimized out>, val=<value >>> optimized out>, >>> v=<value optimized out>) at kernel/notifier.c:191 >>> #6 0x00000000601a56d0 in panic (fmt=0x601f3050 "Kernel mode fault at addr >>> 0x%lx, ip 0x%lx") >>> at kernel/panic.c:115 >>> #7 0x00000000600145df in segv (fi=..., ip=1611852984, is_user=0, >>> regs=0x6024fa10) >>> at arch/um/kernel/trap.c:201 >>> #8 0x000000006001463e in segv_handler (sig=<value optimized out>, >>> regs=<value optimized out>) >>> at arch/um/kernel/trap.c:147 >>> #9 0x00000000600211a4 in sig_handler_common (sig=11, sc=0x6024fbe8) at >>> arch/um/os-Linux/signal.c:49 >>> #10 0x00000000600212ea in sig_handler (sig=<value optimized out>, sc=<value >>> optimized out>) >>> at arch/um/os-Linux/signal.c:226 >>> #11 0x000000006002151c in handle_signal (sig=<value optimized out>, >>> sc=0x6024fbe8) >>> at arch/um/os-Linux/signal.c:158 >>> #12 0x0000000060022e9c in hard_handler (sig=<value optimized out>) >>> at arch/um/os-Linux/sys-x86_64/signal.c:15 >>> #13 <signal handler called> >>> #14 0x000000006012ecb8 in list_del (entry=0x807bcf00) at lib/list_debug.c:46 >>> #15 0x0000000060016724 in free_winch (winch=0x807bcf00, free_irq_ok=0) at >>> arch/um/drivers/line.c:731 >>> #16 0x0000000060016ad7 in winch_interrupt (irq=<value optimized out>, >>> data=0x807bcf00) >>> at arch/um/drivers/line.c:760 >>> #17 0x0000000060055b45 in __free_irq (irq=10, dev_id=0x807bcf00) at >>> kernel/irq/manage.c:937 >>> #18 0x0000000060055bc0 in free_irq (irq=10, dev_id=0x807bcf00) at >>> kernel/irq/manage.c:986 >>> #19 0x0000000060016765 in free_winch (winch=0x807bcf00, free_irq_ok=1) at >>> arch/um/drivers/line.c:740 >>> #20 0x0000000060017211 in unregister_winch (tty=0x8072a800, filp=<value >>> optimized out>) >>> at arch/um/drivers/line.c:836 >>> #21 line_close (tty=0x8072a800, filp=<value optimized out>) at >>> arch/um/drivers/line.c:477 >>> #22 0x00000000601335d9 in tty_release (inode=0x7eff4190, filp=0x7fc9abc0) >>> at drivers/char/tty_io.c:1580 >>> #23 0x000000006007e6a8 in __fput (file=0x7fc9abc0) at fs/file_table.c:254 >>> #24 0x000000006007e79c in fput (file=<value optimized out>) at >>> fs/file_table.c:200 >>> #25 0x000000006007b995 in filp_close (filp=0x7fc9abc0, id=0x7fc720c0) at >>> fs/open.c:1124 >>> #26 0x000000006007ba4a in sys_close (fd=3) at fs/open.c:1153 >>> #27 0x0000000060014c7c in handle_syscall (r=0x7e1ba418) at >>> arch/um/kernel/skas/syscall.c:35 >>> #28 0x0000000060023e7f in handle_trap (regs=0x7e1ba418) at >>> arch/um/os-Linux/skas/process.c:201 >>> #29 userspace (regs=0x7e1ba418) at arch/um/os-Linux/skas/process.c:417 >>> #30 0x000000006001265b in fork_handler () at arch/um/kernel/process.c:181 >>> >>> >>> Apparently, the winch is not in the list when the free_winch is >>> invoked. Kernel version 2.6.34-rc7 >>> >> You might be lucky: Does the patch below make a difference? It may be >> untested as it's part on my half-finished uml cleanup/review queue. > > Hi, I'm not sure it may help, since the spurious call is from inside > the IRQ handler. In the worst case, the above call may cause a > deadlock. I suggest a better fix idea below. > > Here's the help from DEBUG_SHIRQ: > Enable this to generate a spurious interrupt as soon as a shared > interrupt handler is registered, and just before one is deregistered. > Drivers ought to be able to handle interrupts coming in at those > points; some don't and need to be caught. > > The problem is that free_winch calls free_irq at the end, after > destroying everything, which is wrong: first you disable the > interrupt, then you destroy resources needed to handle it. SHIRQ is > catching a real bug here. In fact, the IRQ handler is noticing that it > can't read from the FD (because it was just closed) and trying to free > again the IRQ (which is bad). You must move the call to free_irq at > the beginning. There's a number of apparent "race" in UML which is > really solved by state machines, like this: when closing, you are in a > state when other routines (like IRQs) must not wait (what you get by > locking), but must be disabled. > Actually, the cleanup seems to start at line_close which already > destroyed some resources, but winch_interrupt seems safe against that > (it tests if winch->tty is NULL). Not really maybe, because the > testing is not atomic (so there might be races, but it seems not so > likely). But moving the call to unregister_winch to the beginning of > line_close might be a good idea (after all, you don't really need to > listen for SIGWINCH when you start closing a window). The calls before > line_close are from the generic kernel, so they can't be changed: if > they free any resources, the interrupt handler must be ready to > operate without them, whatever that means. But it's just calling > tty_fasync(on=0), I think to disable async notifications.
Right, there was more broken. Besides the locking fix, this should do the trick by avoiding the recursive free_winch call: diff --git a/arch/um/drivers/line.c b/arch/um/drivers/line.c index d85fcb9..fa66eb4 100644 --- a/arch/um/drivers/line.c +++ b/arch/um/drivers/line.c @@ -731,8 +731,10 @@ static void free_winch(struct winch *winch, int free_irq_ok) if (winch->pid != -1) os_kill_process(winch->pid, 1); - if (winch->fd != -1) + if (winch->fd != -1) { os_close_file(winch->fd); + winch->fd = -1; + } if (winch->stack != 0) free_stack(winch->stack, 0); if (free_irq_ok) (note: untested) Jan
signature.asc
Description: OpenPGP digital signature
------------------------------------------------------------------------------
_______________________________________________ User-mode-linux-devel mailing list User-mode-linux-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel