On Mon, Feb 23, 2015 at 05:50:13PM +0100, Jan Kiszka wrote: > On 2015-02-23 17:37, Gilles Chanteperdrix wrote: > > On Mon, Feb 23, 2015 at 05:32:56PM +0100, Philippe Gerum wrote: > >> On 02/23/2015 05:06 PM, Jan Kiszka wrote: > >>> On 2015-02-20 20:52, Philippe Gerum wrote: > >>>> On 02/20/2015 08:47 PM, Philippe Gerum wrote: > >>>>> On 02/20/2015 08:44 PM, Philippe Gerum wrote: > >>>>>> On 02/20/2015 07:17 PM, Jan Kiszka wrote: > >>>>>>> On 2015-02-20 19:03, Jan Kiszka wrote: > >>>>>>>> Hi Gilles, > >>>>>>>> > >>>>>>>> analyzing a lockdep warning on 3.16 with I-pipe enabled, I dug deeper > >>>>>>>> into the hard and virtual interrupt state management during exception > >>>>>>>> handling on ARM. I think there are several issues: > >>>>>>>> > >>>>>>>> - ipipe_fault_entry should not fiddle with the root irq state if run > >>>>>>>> over head, only when invoked over root. > >>>>>>>> - ipipe_fault_exit must not change the root state unless we entered > >>>>>>>> over > >>>>>>>> head and are about to leave over root - see x86. The current code > >>>>>>>> may > >>>>>>>> keep root incorrectly stalled after an exception, though this will > >>>>>>>> probably be fixed up again in practice quickly. > >>>>>>> > >>>>>>> And the adjustment of the root irq state after migration has to happen > >>>>>>> before Linux starts to handle the event. It would basically be a late > >>>>>>> ipipe_fault_entry. > >>>>>>> > >>>>>>>> - do_sect_fault is only called by do_DataAbort and do_PrefetchAbort, > >>>>>>>> in both cases already wrapped in ipipe_fault_entry/exit, thus it > >>>>>>>> shouldn't invoke them once again. > >>>>>>> > >>>>>>> Sorry, this was a misinterpretation - do_sect_fault is invoked before > >>>>>>> ipipe_fault_entry. > >>>>>>> > >>>>>>> What I need to add, though: > >>>>>>> > >>>>>>> - do_DataAbort and do_PrefetchAbort call __ipipe_report_trap after > >>>>>>> ipipe_fault_entry, thus with hard IRQs on. > >>>>>> > >>>>>> This would break LPAE with the Xenomai nucleus as a module on 2.6.x, by > >>>>>> treading over a non-linear kernel mapping before the page table could > >>>>>> be > >>>>>> fixed up. do_translation_fault() must run via the fsr handler > >>>>>> indirection before any non-linear access. > >>>>>> > >>>>> > >>>>> Sorry, if you do that _after_ the fault entry notification, then it's ok > >>>>> in theory. However, I don't understand why we would need to notify when > >>>>> only a minor fixup is required, that does not entail a mode migration. > >>>>> > >>>> > >>>> To be clearer, do you intend to report the minor fault upon > >>>> do_translation_fault() returning zero, or are you referring to a > >>>> different context? > >>> > >>> No, I'm just talking about this potential change: > >>> > >>> diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c > >>> index 38834c6..b42632a 100644 > >>> --- a/arch/arm/mm/fault.c > >>> +++ b/arch/arm/mm/fault.c > >>> @@ -629,10 +629,10 @@ do_DataAbort(unsigned long addr, unsigned int fsr, > >>> struct pt_regs *regs) > >>> if (!inf->fn(addr, fsr & ~FSR_LNX_PF, regs)) > >>> return; > >>> > >>> - irqflags = ipipe_fault_entry(); > >>> - > >>> if (__ipipe_report_trap(IPIPE_TRAP_UNKNOWN, regs)) > >>> - goto out; > >>> + return; > >>> + > >>> + irqflags = ipipe_fault_entry(); > >>> > >>> printk(KERN_ALERT "Unhandled fault: %s (0x%03x) at 0x%08lx\n", > >>> inf->name, fsr, addr); > >>> @@ -642,7 +642,7 @@ do_DataAbort(unsigned long addr, unsigned int fsr, > >>> struct pt_regs *regs) > >>> info.si_code = inf->code; > >>> info.si_addr = (void __user *)addr; > >>> arm_notify_die("", regs, &info, fsr, 0); > >>> -out: > >>> + > >>> ipipe_fault_exit(irqflags); > >>> } > >>> > >>> @@ -669,10 +669,10 @@ do_PrefetchAbort(unsigned long addr, unsigned int > >>> ifsr, struct pt_regs *regs) > >>> if (!inf->fn(addr, ifsr | FSR_LNX_PF, regs)) > >>> return; > >>> > >>> - irqflags = ipipe_fault_entry(); > >>> - > >>> if (__ipipe_report_trap(IPIPE_TRAP_UNKNOWN, regs)) > >>> - goto out; > >>> + return; > >>> + > >>> + irqflags = ipipe_fault_entry(); > >>> > >>> printk(KERN_ALERT "Unhandled prefetch abort: %s (0x%03x) at 0x%08lx\n", > >>> inf->name, ifsr, addr); > >>> @@ -682,7 +682,7 @@ do_PrefetchAbort(unsigned long addr, unsigned int > >>> ifsr, struct pt_regs *regs) > >>> info.si_code = inf->code; > >>> info.si_addr = (void __user *)addr; > >>> arm_notify_die("", regs, &info, ifsr, 0); > >>> -out: > >>> + > >>> ipipe_fault_exit(irqflags); > >>> } > >>> > >>> > >>> This seems more consistent - if not more correct - as it now does the > >>> reporting with hard irqs off, like in the other cases. > >>> > >> > >> Ack, definitely. The pattern is to cause any migration first if need be, > >> _then_ flip the virtual IRQ state, so that ipipe_fault_restore() always > >> reinstates the interrupt state in effect after the caller has migrated > >> to the root domain. > > > > Is it even useful ? After a relax, the state of the root thread > > stall bit and irq flags are well known... > > We still need to disable IRQs for root. HW IRQs are likely already on, > right?
Everything is on, hw and sw irqs. > > And, again, we should refrain from restoring any root irq state on > return - it belongs to Linux (once we migrated and synchronized the state). I think restoring the root irq state makes sense when the fault handler was entered over root domain. -- Gilles. _______________________________________________ Xenomai mailing list Xenomai@xenomai.org http://www.xenomai.org/mailman/listinfo/xenomai