Re: [Xenomai] "inconsistent lock state" on boot-up

Gilles Chanteperdrix Wed, 12 Nov 2014 09:33:05 -0800

On Mon, Nov 10, 2014 at 10:58:46PM +0100, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 09:55:12PM +0100, Gilles Chanteperdrix wrote:
> > On Mon, Nov 10, 2014 at 09:42:22PM +0100, Jan Kiszka wrote:
> > > On 2014-11-10 21:37, Gilles Chanteperdrix wrote:
> > > > On Mon, Nov 10, 2014 at 09:28:50PM +0100, Jan Kiszka wrote:
> > > >> On 2014-11-10 21:23, Gilles Chanteperdrix wrote:
> > > >>> On Mon, Nov 10, 2014 at 09:17:18PM +0100, Jan Kiszka wrote:
> > > >>>> On 2014-11-10 21:14, Gilles Chanteperdrix wrote:
> > > >>>>> On Mon, Nov 10, 2014 at 09:10:31PM +0100, Jan Kiszka wrote:
> > > >>>>>> On 2014-11-10 21:06, Gilles Chanteperdrix wrote:
> > > >>>>>>> On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
> > > >>>>>>>> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
> > > >>>>>>>>> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
> > > >>>>>>>>>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
> > > >>>>>>>>>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
> > > >>>>>>>>>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> > > >>>>>>>>>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> > > >>>>>>>>>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> > > >>>>>>>>>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Hi Gilles,
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Do you have the same message with exactly the same 
> > > >>>>>>>>>>>>>>>>> kernel
> > > >>>>>>>>>>>>>>>>> configuration, only with CONFIG_XENOMAI and 
> > > >>>>>>>>>>>>>>>>> CONFIG_IPIPE disabled?
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the 
> > > >>>>>>>>>>>>>>>> message does not 
> > > >>>>>>>>>>>>>>>> appear on boot-up.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling 
> > > >>>>>>>>>>>>>>>>> it? same
> > > >>>>>>>>>>>>>>>>> with unlocked context switch.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> FCSE is already completely disabled.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Do you have an idea how to overcome the problem?
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> I am not sure the lockdep message really is a problem. 
> > > >>>>>>>>>>>>>>> lockdep could
> > > >>>>>>>>>>>>>>> be confused by the fact that the hardware interrupts are 
> > > >>>>>>>>>>>>>>> not off
> > > >>>>>>>>>>>>>>> when running the I-pipe, or because we are missing some 
> > > >>>>>>>>>>>>>>> bit in the
> > > >>>>>>>>>>>>>>> I-pipe arm specific code to get it looking at the virtual 
> > > >>>>>>>>>>>>>>> mask
> > > >>>>>>>>>>>>>>> instead of the hardware mask.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> As for the scheduling while atomic and random 
> > > >>>>>>>>>>>>>>> segmentation fault,
> > > >>>>>>>>>>>>>>> you should use the I-pipe tracer, configure it with 
> > > >>>>>>>>>>>>>>> enough back
> > > >>>>>>>>>>>>>>> trace points, something like 1000 or 10000, and trigger a 
> > > >>>>>>>>>>>>>>> trace
> > > >>>>>>>>>>>>>>> freeze in the kernel code when the problem happens.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Also, for the "scheduling while atomic", it may happen if 
> > > >>>>>>>>>>>>>>> you call
> > > >>>>>>>>>>>>>>> some Linux service which reschedules from primary mode, 
> > > >>>>>>>>>>>>>>> you can try
> > > >>>>>>>>>>>>>>> enabling I-pipe debugging, and in fact all Xenomai 
> > > >>>>>>>>>>>>>>> debugging, to try
> > > >>>>>>>>>>>>>>> and catch such mistakes. This is especially important if 
> > > >>>>>>>>>>>>>>> you are
> > > >>>>>>>>>>>>>>> running a custom skin.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> "Scheduling while atomic" may have the same reason why 
> > > >>>>>>>>>>>>>> lockdep stumbles:
> > > >>>>>>>>>>>>>> some changes of I-pipe mess up the IRQ state tracing of 
> > > >>>>>>>>>>>>>> Linux. I just
> > > >>>>>>>>>>>>>> started to look into this issue again. We tried earlier 
> > > >>>>>>>>>>>>>> but got distracted.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> I doubt that very much. Though I never run with lockdep, I 
> > > >>>>>>>>>>>>> sometimes
> > > >>>>>>>>>>>>> run with CONFIG_PREEMPT, and never saw this message. From 
> > > >>>>>>>>>>>>> what I can
> > > >>>>>>>>>>>>> see, the "scheduling while atomic" message is based on the
> > > >>>>>>>>>>>>> preempt_count only and does not use irqs_disabled() (which 
> > > >>>>>>>>>>>>> by the
> > > >>>>>>>>>>>>> way is known to work with I-pipe on ARM as well, so, if 
> > > >>>>>>>>>>>>> something is
> > > >>>>>>>>>>>>> broken, that should be something more obscure).
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Let's see. I think I've identified one wrong path:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> > > >>>>>>>>>>>> index d32f8bd..ab911f8 100644
> > > >>>>>>>>>>>> --- a/arch/arm/kernel/entry-header.S
> > > >>>>>>>>>>>> +++ b/arch/arm/kernel/entry-header.S
> > > >>>>>>>>>>>> @@ -198,7 +198,10 @@
> > > >>>>>>>>>>>>  #ifdef CONFIG_TRACE_IRQFLAGS
> > > >>>>>>>>>>>>      @ The parent context IRQs must have been enabled to get here in
> > > >>>>>>>>>>>>      @ the first place, so there's no point checking the PSR I bit.
> > > >>>>>>>>>>>> -    bl      trace_hardirqs_on
> > > >>>>>>>>>>>> +    tst     \rpsr, #PSR_I_BIT
> > > >>>>>>>>>>>> +    bleq    trace_hardirqs_off
> > > >>>>>>>>>>>> +    tst     \rpsr, #PSR_I_BIT
> > > >>>>>>>>>>>> +    blne    trace_hardirqs_on
> > > >>>>>>>>>>>>  #endif
> > > >>>>>>>>>>>>      .else
> > > >>>>>>>>>>>>      @ IRQs off again before pulling preserved data off the stack
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> This is probably not a fix, but with that change applied, the
> > > >>>>>>>>>>>> warning is gone. Now the question is what to really test for when
> > > >>>>>>>>>>>> returning here. I suppose we want the pipeline state of root here -
> > > >>>>>>>>>>>> should I use __ipipe_check_root_interruptible?
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> This does not make sense, read the comment above that change:
> > > >>>>>>>>>>> an interrupt cannot be taken, and so svc_entry cannot be entered,
> > > >>>>>>>>>>> with interrupts off. Besides, this is mainline code, so it would
> > > >>>>>>>>>>> be a problem for mainline too. We are necessarily returning to a
> > > >>>>>>>>>>> place where hardware irqs were on.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Did you also look at the trace I posted?
> > > >>>>>>>>>
> > > >>>>>>>>> Yes, but I did not see what I am supposed to see. The only 
> > > >>>>>>>>> thing I
> > > >>>>>>>>> see is that these trace functions should never have been called 
> > > >>>>>>>>> from
> > > >>>>>>>>> rt domain in the first place.
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> There is no RT domain in the trace, only an inconsistent Linux 
> > > >>>>>>>> trace
> > > >>>>>>>> state after return from IRQ.
> > > >>>>>>>
> > > >>>>>>> What can I say, when returning from IRQ, you are necessarily
> > > >>>>>>> returning to a point where irqs are ON, as the comment says, and 
> > > >>>>>>> it
> > > >>>>>>> makes perfect sense. So your "fix" should be a nop. So, something
> > > >>>>>>> else is broken.
> > > >>>>>>
> > > >>>>>> The test for selecting trace_hardirqs_off/on is wrong, that's 
> > > >>>>>> why I
> > > >>>>>> was asking for a better check. Also, if that path can be taken by 
> > > >>>>>> RT
> > > >>>>>> domains as well, calling trace_hardirqs_off/on was always wrong, 
> > > >>>>>> and we
> > > >>>>>> additionally need to check for the caller's domain.
> > > >>>>>>
> > > >>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>> Note that the reason this trace_irqs stuff is not working well
> > > >>>>>>>>> may be that some of these calls are disabled when CONFIG_IPIPE
> > > >>>>>>>>> is set (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
> > > >>>>>>>>
> > > >>>>>>>> No, that doesn't solve all issues. Even with my hack (which may 
> > > >>>>>>>> not
> > > >>>>>>>> address all cases properly) plus the reversion of that commit, 
> > > >>>>>>>> there are
> > > >>>>>>>> still inconsistencies.
> > > >>>>>>>
> > > >>>>>>> You cannot revert that commit, otherwise you will end up calling
> > > >>>>>>> trace_hardirqs_on/trace_hardirqs_off from RT domain, which, I
> > > >>>>>>> repeat, cannot work.
> > > >>>>>>
> > > >>>>>> I can help to understand if that is sufficient to resolve the 
> > > >>>>>> tracing
> > > >>>>>> breakage - it isn't, there are more paths missing or wrongly 
> > > >>>>>> instrumented.
> > > >>>>>
> > > >>>>> My idea of all this is that CONFIG_TRACE_IRQFLAGS should depend on
> > > >>>>> !IPIPE, since the I-pipe tracer provides the same functionality and
> > > >>>>> is not broken.
> > > >>>>
> > > >>>> No, the I-pipe trace does not provide a Linux lock dependency 
> > > >>>> checker,
> > > >>>> nor does it support might_sleep and such. If you have Linux drivers
> > > >>>> which depend on Xenomai directly or indirectly, you cannot validate 
> > > >>>> them
> > > >>>> anymore. That's why we support this on x86.
> > > >>>
> > > >>> Since the I-pipe is already keeping track of irq state with
> > > >>> CONFIG_IPIPE_TRACE_IRQSOFF, can we not use that information instead
> > > >>> of trying to use this trace_hardirqs stuff which looks
> > > >>> irremediably broken to me?
> > > >>
> > > >> The former reflects the hw state, the latter traces the Linux state -
> > > >> from Linux POV.
> > > > 
> > > > The I-pipe tracer keeps track of the root domain stall bit as well.
> > > > 
> > > >>
> > > >> This is fixable. We just need to call the tracing functions where Linux
> > > >> would call it or where we replaced some Linux call with an I-pipe
> > > >> specific path and avoid calling it when the domain != root. Identifying
> > > >> those spots is tricky.
> > > > 
> > > > > If we take the example of an irq, we probably do not want to call
> > > > trace_hardirqs_on/trace_hardirqs_off anywhere, and just rely on the
> > > > root domain stall bit.
> > > 
> > > Linux tracks the IRQ state separately from the (now virtualized) real
> > > state - to validate the consistency independently of some spurious hard
> > > irq enable/disable. And it tracks per task, not per CPU. It will be more
> > > messy to fake this than to fix it, I'm quite sure.
> > 
> > If we take the example of irq_svc (the example you patched), we have
> > 4 cases:
> > 
> > 1- entry over root, exit over root
> > 2- entry over root, exit over non root
> > 3- entry over non root, exit over non root
> > 4- entry over non root, exit over root
> 
> Sorry, it does not work like that. Only cases 1 and 3 make sense.
> Case 3 is easy, we do not need to call the trace_hardirqs functions.
> For case 1, I guess the trace_hardirqs_on at the end must be
> replaced with a test of the root domain stall bit, calling
> trace_hardirqs_on only if we return to a non-stalled root.


We do not need trace_hardirqs_on and trace_hardirqs_off for the
particular case of IRQs: they are already handled by
__ipipe_do_sync_stage. 
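
For reference, the rule from the quoted text (trace "IRQs on" only when
returning to a non-stalled root, never from a non-root domain) can be
written down as a small user-space C model. This is only a sketch of the
decision, not I-pipe code: the enum, the struct and both function names
are hypothetical and exist nowhere in the I-pipe or Linux sources, and
the root_stalled field merely stands in for the root domain stall bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are I-pipe APIs. */
enum domain { DOMAIN_ROOT, DOMAIN_NON_ROOT };

struct irq_exit_ctx {
    enum domain entry_domain;  /* domain that was interrupted */
    enum domain exit_domain;   /* domain the exit path returns to */
    bool root_stalled;         /* stand-in for the root domain stall bit */
};

/* Only cases 1 and 3 from the thread are treated as valid:
 * the exit domain must be the same as the entry domain. */
static bool ctx_is_valid(const struct irq_exit_ctx *c)
{
    return c->entry_domain == c->exit_domain;
}

/* Returns true when the exit path should report "hard IRQs on"
 * to the Linux IRQ-flags tracing. */
static bool should_trace_hardirqs_on(const struct irq_exit_ctx *c)
{
    assert(ctx_is_valid(c));
    if (c->exit_domain != DOMAIN_ROOT)
        return false;          /* case 3: no tracing from non-root */
    return !c->root_stalled;   /* case 1: only for a non-stalled root */
}
```

In this model, a root-to-root return with the stall bit clear is the
only combination that yields true; a stalled root and a non-root return
both yield false.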

-- 
                                            Gilles.
