I think the problem is the call to uart_write_wakeup() at the end 
of uart_console_write() in serial_core.c.  The patch makes the console 
and "normal" users share the same output buffer, so I'm pretty sure the 
call is there to kick off transmission when console output is mixed 
with other output.

I think the right solution is to add "if (port->info->flags & 
UIF_INITIALIZED)" before the call to uart_write_wakeup() in that code.  
Can you try that and see if it fixes the problem?
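
Here is a rough userspace sketch of what I mean, not the actual 
serial_core.c code.  The structs are mocks carrying only the fields 
discussed here, the UIF_INITIALIZED value is a placeholder, and the 
helper names (console_write_tail, mock_close, run_pending_wakeup, 
simulate_reboot, simulate_open_write) are made up for illustration.  It 
also assumes that closing the port clears UIF_INITIALIZED before tty is 
set to NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder value; the real definition lives in the kernel headers. */
#define UIF_INITIALIZED (1u << 31)

/* Mock types: only the fields this discussion touches. */
struct tty_struct { int wakeups; };

struct uart_info {
    struct tty_struct *tty;
    unsigned int flags;
    bool wakeup_pending;            /* stands in for the scheduled tasklet */
};

struct uart_port { struct uart_info *info; };

/* Mock of uart_write_wakeup(): only records that a wakeup was requested. */
static void uart_write_wakeup(struct uart_port *port)
{
    assert(port != NULL && port->info != NULL);
    port->info->wakeup_pending = true;
}

/* Mock of the tasklet body.  Returns false where the real code would
 * hand a NULL tty to tty_wakeup() and oops. */
static bool run_pending_wakeup(struct uart_port *port)
{
    struct uart_info *info = port->info;

    if (!info->wakeup_pending)
        return true;
    info->wakeup_pending = false;
    if (info->tty == NULL)
        return false;
    info->tty->wakeups++;
    return true;
}

/* Mock of the end of uart_console_write(), with the proposed guard
 * switchable so both behaviours can be compared. */
static void console_write_tail(struct uart_port *port, bool guarded)
{
    if (!guarded || (port->info->flags & UIF_INITIALIZED))
        uart_write_wakeup(port);
}

/* Assumed close behaviour: clear the flag, then drop the tty pointer. */
static void mock_close(struct uart_port *port)
{
    port->info->flags &= ~UIF_INITIALIZED;
    port->info->tty = NULL;
}

/* Close the port, then print to the console as kernel_restart() does.
 * Returns true if the pending wakeup (if any) ran without a NULL tty. */
static bool simulate_reboot(bool guarded)
{
    struct tty_struct tty = { 0 };
    struct uart_info info = { &tty, UIF_INITIALIZED, false };
    struct uart_port port = { &info };

    mock_close(&port);
    console_write_tail(&port, guarded);
    return run_pending_wakeup(&port);
}

/* Console write while the port is still open; returns wakeups delivered. */
static int simulate_open_write(bool guarded)
{
    struct tty_struct tty = { 0 };
    struct uart_info info = { &tty, UIF_INITIALIZED, false };
    struct uart_port port = { &info };

    console_write_tail(&port, guarded);
    run_pending_wakeup(&port);
    return tty.wakeups;
}
```

In this model, simulate_reboot(false) reports the NULL tty case, while 
simulate_reboot(true) skips the wakeup once the port is closed, and 
simulate_open_write(true) still delivers the wakeup to an open port.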

-corey


David Jenkins wrote:
> Corey,
>
> I applied the 2.6.23 IPMI UART system interface patches to my 2.6.23 
> 8641 kernel (serial_core.c & 8250.c).
>
> Now when I do a 'reboot' from the command line I get the following 
> kernel oops:
>
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=2 kxx8641
> Modules linked in:
> NIP: c01580a4 LR: c016c77c CTR: c016c764
> REGS: c1a23a50 TRAP: 0300   Not tainted  (2.6.23-ksi8641-rel_1_0-rc2)
> MSR: 00009032 <EE,ME,IR,DR>  CR: 42000242  XER: 20000000
> DAR: 000000b8, DSISR: 40000000
> TASK = effdac70[281] 'init' THREAD: c1a22000 CPU: 0
> GPR00: c016c77c c1a23b00 effdac70 00000000 00009032 fffba466 00000100 c04a0000
> GPR08: c1a23b18 c04ca000 0005b000 c016c764 22000244 1008c5ac c04b0000 00000001
> GPR16: ffffffff 00000000 c1a23c50 00000000 393d66b8 c04a0000 00000000 c04bd960
> GPR24: c04bd960 c04b33a0 c048710c c1a22000 00000000 00000002 c04ca018 00000000
> NIP [c01580a4] tty_wakeup+0x14/0x9c
> LR [c016c77c] uart_tasklet_action+0x18/0x28
> Call Trace:
> [c1a23b00] [c016861c] ipmi_serial_timeout+0x0/0x78 (unreliable)
> [c1a23b10] [c016c77c] uart_tasklet_action+0x18/0x28
> [c1a23b20] [c0029508] tasklet_action+0xbc/0x194
> [c1a23b50] [c0029814] __do_softirq+0xa0/0x13c
> [c1a23b90] [c00065b4] do_softirq+0x64/0x68
> [c1a23ba0] [c0029220] irq_exit+0x54/0x64
> [c1a23bb0] [c000e260] timer_interrupt+0x2e4/0x6cc
> [c1a23c40] [c0011374] ret_from_except+0x0/0x14
> --- Exception: 901 at vprintk+0x25c/0x434
>     LR = vprintk+0x2d4/0x434
> [c1a23d90] [c0023f7c] printk+0x50/0x60
> [c1a23e10] [c0032e14] kernel_restart+0x7c/0x98
> [c1a23e20] [c00354ac] sys_reboot+0x170/0x200
> [c1a23f40] [c0010cc8] ret_from_syscall+0x0/0x38
> --- Exception: c01 at 0xff252fc
>     LR = 0x1001c938
> Instruction dump:
> 4becbeb9 80010034 387f000c bbc10028 38210030 7c0803a6 4e800020 7c0802a6
> 9421fff0 bfc10008 7c7f1b78 90010014 <800300b8> 70090020 4082002c 387f0128
> Kernel panic - not syncing: Fatal exception in interrupt
> Rebooting in 180 seconds..
>
> The problem is that state->info->tty is NULL, so 
> uart_tasklet_action() passes NULL to tty_wakeup(), which then 
> dereferences it.
>
> I can make the problem go away in three ways:
>
> a) Remove the printk's in sys.c:kernel_restart()
> b) Add an if (state->info->tty) check above the call to tty_wakeup() in
>    serial_core.c:uart_tasklet_action()
> c) Remove the state->info->tty = NULL in serial_core.c:uart_close()
>
> 'c' stops the kernel oops but something is still wrong such that the 
> board locks up instead of oopsing. 
>
> There appears to be a timing issue between the tty_wakeup (I think 
> caused by the printk's in kernel_restart) and setting state->info->tty 
> = NULL as a part of the kernel shutdown.  However, I wasn't getting 
> very far tracking this down any further, so I thought I would let you 
> know about it.
>
> David Jenkins

