On 2015-02-20 19:53, Gilles Chanteperdrix wrote: > On Fri, Feb 20, 2015 at 07:51:19PM +0100, Jan Kiszka wrote: >> On 2015-02-20 19:38, Gilles Chanteperdrix wrote: >>> On Fri, Feb 20, 2015 at 07:03:14PM +0100, Jan Kiszka wrote: >>>> Hi Gilles, >>>> >>>> analyzing a lockdep warning on 3.16 with I-pipe enabled, I dug deeper >>>> into the hard and virtual interrupt state management during exception >>>> handling on ARM. I think there are several issues: >>>> >>>> - ipipe_fault_entry should not fiddle with the root irq state if run >>>> over head, only when invoked over root. >>>> - ipipe_fault_exit must not change the root state unless we entered over >>>> head and are about to leave over root - see x86. The current code may >>>> keep root incorrectly stalled after an exception, though this will >>>> probably be fixed up again in practice quickly. >>>> - do_sect_fault is only called by do_DataAbort and do_PrefetchAbort, >>>> in both cases already wrapped in ipipe_fault_entry/exit, thus it >>>> shouldn't invoke them once again. >>>> >>>> Room for optimization: >>>> - ipipe_fault_entry is always called with hard IRQs off from >>>> do_page_fault and do_translation_fault. I suspect this applies to the >>>> remaining callers (do_DataAbort and do_PrefetchAbort ) as well. Thus >>>> the hard IRQ state is actually known at compile time, right? >> >> To follow up on this: do_DataAbort and do_PrefetchAbort are always >> invoked with hard IRQs disable when a regular exception takes us there. >> Only the ghost syscall cmpxchg simulates do_DataAbort without adjusting >> hardware interrupt. It's probably easier to adjust that than to account >> for hw irqs being potentially on an fault entry. >> >>>> >>>> I can hack up patches, but I'd like to confirm first that I'm not >>>> missing anything subtle or ARM-specific here. >>> >>> Just to explain the original hack. >>> >>> Some time ago, the faults handlers were executed irqs on ARM. The >>> irqs were enabled in entry.S before executing the handlers. >>> >>> At some point, this was removed in entry.S and fault handlers >>> started to be executed irqs off. On ARM, all faults relax to be >>> handled in secondary mode, actually there is an exception, the FPU, >>> but it goes through a completely different path which had always >>> been executed irqs off until recently where the irqs are reenabled >>> when accessing user-space to be able to handle faults without >>> lockups. >>> >>> My concern was that the code thus executed could have assertion >>> about the root domain being stalled which would be fail, so I added >>> code which stalled root and enabled hardware irqs on fault entry and >>> unstalled root and disabled hardware irqs on fault exit (which >>> always happen on root domain). This should have worked even if a fault >>> had happened to be handled in head domain, because then the >>> operation would have been a nop (simply stall/then unstall). >>> >>> But Philippe found this dumb approach to fail when working on LPAE, >>> IIRC. IIRC, namely, if the root domain happens to be stalled when >>> entering a fault over head domain, it would end up unstalled after >>> the operation. So, I believe the code he added saves the stall state >>> on fault entry and restores it on fault exit. I have checked >>> Philippe's code details at the time and did not find anything wrong. >> >> I suspect the LPAE scenario takes the do_page_fault path? Then it should >> rather be solved by providing the right information to or preventing >> the execution of >> >> /* Enable interrupts if they were enabled in the parent context. */ >> if (interrupts_enabled(regs)) >> local_irq_enable(); >> >> Now we unconditionally restore to the root state on entry, overwriting >> what may happen to it during the handler execution - specifically via >> the snippet above. > > This code is part of the mainline kernel.
Correct. But we can adjust it to take interrupt virtualization and domain migration into account. I just also followed hook_ifault_code to find out if there is something unaddressed /wrt I-pipe. There is: hardware break/watchpoints. If triggered over head, they will go to hw_breakpoint_pending directly without prior domain migration. We probably want another __ipipe_report_trap here. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux _______________________________________________ Xenomai mailing list [email protected] http://www.xenomai.org/mailman/listinfo/xenomai
