Re: [Xenomai] ipipe: issues with ARM exception handling

Gilles Chanteperdrix Mon, 23 Feb 2015 08:55:06 -0800

On Mon, Feb 23, 2015 at 05:50:13PM +0100, Jan Kiszka wrote:
> On 2015-02-23 17:37, Gilles Chanteperdrix wrote:
> > On Mon, Feb 23, 2015 at 05:32:56PM +0100, Philippe Gerum wrote:
> >> On 02/23/2015 05:06 PM, Jan Kiszka wrote:
> >>> On 2015-02-20 20:52, Philippe Gerum wrote:
> >>>> On 02/20/2015 08:47 PM, Philippe Gerum wrote:
> >>>>> On 02/20/2015 08:44 PM, Philippe Gerum wrote:
> >>>>>> On 02/20/2015 07:17 PM, Jan Kiszka wrote:
> >>>>>>> On 2015-02-20 19:03, Jan Kiszka wrote:
> >>>>>>>> Hi Gilles,
> >>>>>>>>
> >>>>>>>> analyzing a lockdep warning on 3.16 with I-pipe enabled, I dug deeper
> >>>>>>>> into the hard and virtual interrupt state management during exception
> >>>>>>>> handling on ARM. I think there are several issues:
> >>>>>>>>
> >>>>>>>> - ipipe_fault_entry should not fiddle with the root irq state if run
> >>>>>>>>   over head, only when invoked over root.
> >>>>>>>> - ipipe_fault_exit must not change the root state unless we entered 
> >>>>>>>> over
> >>>>>>>>   head and are about to leave over root - see x86. The current code 
> >>>>>>>> may
> >>>>>>>>   keep root incorrectly stalled after an exception, though this will
> >>>>>>>>   probably be fixed up again in practice quickly.
> >>>>>>>
> >>>>>>> And the adjustment of the root irq state after migration has to happen
> >>>>>>> before Linux starts to handle the event. It would basically be a late
> >>>>>>> ipipe_fault_entry.
> >>>>>>>
> >>>>>>>> - do_sect_fault is only called by do_DataAbort and do_PrefetchAbort,
> >>>>>>>>   in both cases already wrapped in ipipe_fault_entry/exit, thus it
> >>>>>>>>   shouldn't invoke them once again.
> >>>>>>>
> >>>>>>> Sorry, this was a misinterpretation - do_sect_fault is invoked before
> >>>>>>> ipipe_fault_entry.
> >>>>>>>
> >>>>>>> What I need to add, though:
> >>>>>>>
> >>>>>>> - do_DataAbort and do_PrefetchAbort call __ipipe_report_trap after
> >>>>>>>   ipipe_fault_entry, thus with hard IRQs on.
> >>>>>>
> >>>>>> This would break LPAE with the Xenomai nucleus as a module on 2.6.x, by
> >>>>>> treading over a non-linear kernel mapping before the page table could 
> >>>>>> be
> >>>>>> fixed up. do_translation_fault() must run via the fsr handler
> >>>>>> indirection before any non-linear access.
> >>>>>>
> >>>>>
> >>>>> Sorry, if you do that _after_ the fault entry notification, then it's ok
> >>>>> in theory. However, I don't understand why we would need to notify when
> >>>>> only a minor fixup is required, that does not entail a mode migration.
> >>>>>
> >>>>
> >>>> To be clearer, do you intend to report the minor fault upon
> >>>> do_translation_fault() returning zero, or are you referring to a
> >>>> different context?
> >>>
> >>> No, I'm just talking about this potential change:
> >>>
> >>> diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
> >>> index 38834c6..b42632a 100644
> >>> --- a/arch/arm/mm/fault.c
> >>> +++ b/arch/arm/mm/fault.c
> >>> @@ -629,10 +629,10 @@ do_DataAbort(unsigned long addr, unsigned int fsr, 
> >>> struct pt_regs *regs)
> >>>   if (!inf->fn(addr, fsr & ~FSR_LNX_PF, regs))
> >>>           return;
> >>>  
> >>> - irqflags = ipipe_fault_entry();
> >>> -
> >>>   if (__ipipe_report_trap(IPIPE_TRAP_UNKNOWN, regs))
> >>> -         goto out;
> >>> +         return;
> >>> +
> >>> + irqflags = ipipe_fault_entry();
> >>>  
> >>>   printk(KERN_ALERT "Unhandled fault: %s (0x%03x) at 0x%08lx\n",
> >>>           inf->name, fsr, addr);
> >>> @@ -642,7 +642,7 @@ do_DataAbort(unsigned long addr, unsigned int fsr, 
> >>> struct pt_regs *regs)
> >>>   info.si_code  = inf->code;
> >>>   info.si_addr  = (void __user *)addr;
> >>>   arm_notify_die("", regs, &info, fsr, 0);
> >>> -out:
> >>> +
> >>>   ipipe_fault_exit(irqflags);
> >>>  }
> >>>  
> >>> @@ -669,10 +669,10 @@ do_PrefetchAbort(unsigned long addr, unsigned int 
> >>> ifsr, struct pt_regs *regs)
> >>>   if (!inf->fn(addr, ifsr | FSR_LNX_PF, regs))
> >>>           return;
> >>>  
> >>> - irqflags = ipipe_fault_entry();
> >>> -
> >>>   if (__ipipe_report_trap(IPIPE_TRAP_UNKNOWN, regs))
> >>> -         goto out;
> >>> +         return;
> >>> +
> >>> + irqflags = ipipe_fault_entry();
> >>>  
> >>>   printk(KERN_ALERT "Unhandled prefetch abort: %s (0x%03x) at 0x%08lx\n",
> >>>           inf->name, ifsr, addr);
> >>> @@ -682,7 +682,7 @@ do_PrefetchAbort(unsigned long addr, unsigned int 
> >>> ifsr, struct pt_regs *regs)
> >>>   info.si_code  = inf->code;
> >>>   info.si_addr  = (void __user *)addr;
> >>>   arm_notify_die("", regs, &info, ifsr, 0);
> >>> -out:
> >>> +
> >>>   ipipe_fault_exit(irqflags);
> >>>  }
> >>>  
> >>>
> >>> This seems more consistent - if not more correct - as it now does the
> >>> reporting with hard irqs off, like in the other cases.
> >>>
> >>
> >> Ack, definitely. The pattern is to cause any migration first if need be,
> >> _then_ flip the virtual IRQ state, so that ipipe_fault_restore() always
> >> reinstates the interrupt state in effect after the caller has migrated
> >> to the root domain.
> > 
> > Is it even useful ? After a relax, the state of the root thread
> > stall bit and irq flags are well known...
> 
> We still need to disable IRQs for root. HW IRQs are likely already on,
> right?


Everything is on, hw and sw irqs.

> 
> And, again, we should refrain from restoring any root irq state on
> return - it belongs to Linux (once we migrated and synchronized the state).

I think restoring the root irq state makes sense when the fault
handler was entered over root domain.

-- 
                                            Gilles.

_______________________________________________
Xenomai mailing list
Xenomai@xenomai.org
http://www.xenomai.org/mailman/listinfo/xenomai

Re: [Xenomai] ipipe: issues with ARM exception handling

Reply via email to