Re: [Xenomai] "inconsistent lock state" on boot-up

Gilles Chanteperdrix Mon, 10 Nov 2014 14:06:55 -0800

On Mon, Nov 10, 2014 at 09:55:12PM +0100, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 09:42:22PM +0100, Jan Kiszka wrote:
> > On 2014-11-10 21:37, Gilles Chanteperdrix wrote:
> > > On Mon, Nov 10, 2014 at 09:28:50PM +0100, Jan Kiszka wrote:
> > >> On 2014-11-10 21:23, Gilles Chanteperdrix wrote:
> > >>> On Mon, Nov 10, 2014 at 09:17:18PM +0100, Jan Kiszka wrote:
> > >>>> On 2014-11-10 21:14, Gilles Chanteperdrix wrote:
> > >>>>> On Mon, Nov 10, 2014 at 09:10:31PM +0100, Jan Kiszka wrote:
> > >>>>>> On 2014-11-10 21:06, Gilles Chanteperdrix wrote:
> > >>>>>>> On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
> > >>>>>>>> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
> > >>>>>>>>> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
> > >>>>>>>>>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
> > >>>>>>>>>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
> > >>>>>>>>>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> > >>>>>>>>>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> > >>>>>>>>>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> > >>>>>>>>>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, 
> > >>>>>>>>>>>>>>> Christoph wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Hi Gilles,
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Do you have the same message with exactly the same kernel
> > >>>>>>>>>>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE 
> > >>>>>>>>>>>>>>>>> disabled?
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the 
> > >>>>>>>>>>>>>>>> message does not 
> > >>>>>>>>>>>>>>>> appear on boot-up.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling 
> > >>>>>>>>>>>>>>>>> it? same
> > >>>>>>>>>>>>>>>>> with unlocked context switch.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> FCSE is already disabled.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Do you have an idea how to overcome the problem?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I am not sure the lockdep message really is a problem. 
> > >>>>>>>>>>>>>>> lockdep could
> > >>>>>>>>>>>>>>> be confused by the fact that the hardware interrupts are 
> > >>>>>>>>>>>>>>> not off
> > >>>>>>>>>>>>>>> when running the I-pipe, or because we are missing some bit 
> > >>>>>>>>>>>>>>> in the
> > >>>>>>>>>>>>>>> I-pipe arm specific code to get it looking at the virtual 
> > >>>>>>>>>>>>>>> mask
> > >>>>>>>>>>>>>>> instead of the hardware mask.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> As for the scheduling while atomic and random segmentation 
> > >>>>>>>>>>>>>>> fault,
> > >>>>>>>>>>>>>>> you should use the I-pipe tracer, configure it with enough 
> > >>>>>>>>>>>>>>> back
> > >>>>>>>>>>>>>>> trace points, something like 1000 or 10000, and trigger a 
> > >>>>>>>>>>>>>>> trace
> > >>>>>>>>>>>>>>> freeze in the kernel code when the problem happens.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Also, for the "scheduling while atomic", it may happen if 
> > >>>>>>>>>>>>>>> you call
> > >>>>>>>>>>>>>>> some Linux service which reschedules from primary mode, you 
> > >>>>>>>>>>>>>>> can try
> > >>>>>>>>>>>>>>> enabling I-pipe debugging, and in fact all Xenomai 
> > >>>>>>>>>>>>>>> debugging, to try
> > >>>>>>>>>>>>>>> and catch such mistakes. This is especially important if 
> > >>>>>>>>>>>>>>> you are
> > >>>>>>>>>>>>>>> running a custom skin.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> "Scheduling while atomic" may have the same reason why 
> > >>>>>>>>>>>>>> lockdep stumbles:
> > >>>>>>>>>>>>>> some changes of I-pipe mess up the IRQ state tracing of 
> > >>>>>>>>>>>>>> Linux. I just
> > >>>>>>>>>>>>>> started to look into this issue again. We tried earlier but 
> > >>>>>>>>>>>>>> got distracted.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I doubt that very much. Though I never run with lockdep, I 
> > >>>>>>>>>>>>> sometimes
> > >>>>>>>>>>>>> run with CONFIG_PREEMPT, and never saw this message. From 
> > >>>>>>>>>>>>> what I can
> > >>>>>>>>>>>>> see, the "scheduling while atomic" message is based on the
> > >>>>>>>>>>>>> preempt_count only and does not use irqs_disabled() (which by 
> > >>>>>>>>>>>>> the
> > >>>>>>>>>>>>> way is known to work with I-pipe on ARM as well, so, if 
> > >>>>>>>>>>>>> something is
> > >>>>>>>>>>>>> broken, that should be something more obscure).
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Let's see. I think I've identified one wrong path:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> > >>>>>>>>>>>> index d32f8bd..ab911f8 100644
> > >>>>>>>>>>>> --- a/arch/arm/kernel/entry-header.S
> > >>>>>>>>>>>> +++ b/arch/arm/kernel/entry-header.S
> > >>>>>>>>>>>> @@ -198,7 +198,10 @@
> > >>>>>>>>>>>>  #ifdef CONFIG_TRACE_IRQFLAGS
> > >>>>>>>>>>>>        @ The parent context IRQs must have been enabled to get here in
> > >>>>>>>>>>>>        @ the first place, so there's no point checking the PSR I bit.
> > >>>>>>>>>>>> -      bl      trace_hardirqs_on
> > >>>>>>>>>>>> +      tst     \rpsr, #PSR_I_BIT
> > >>>>>>>>>>>> +      bleq    trace_hardirqs_off
> > >>>>>>>>>>>> +      tst     \rpsr, #PSR_I_BIT
> > >>>>>>>>>>>> +      blne    trace_hardirqs_on
> > >>>>>>>>>>>>  #endif
> > >>>>>>>>>>>>        .else
> > >>>>>>>>>>>>        @ IRQs off again before pulling preserved data off the stack
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> This is probably no fix, but with that change applied, the 
> > >>>>>>>>>>>> warning is
> > >>>>>>>>>>>> gone. Now the question is what to really test for when 
> > >>>>>>>>>>>> returning here. I
> > >>>>>>>>>>>> suppose we want the pipeline state of root here - should I
> > >>>>>>>>>>>> __ipipe_check_root_interruptible?
> > >>>>>>>>>>>
> > >>>>>>>>>>> This does not make sense, read the comment above that change: 
> > >>>>>>>>>>> there
> > >>>>>>>>>>> is no way an interrupt can be taken, and so entering svc_entry, 
> > >>>>>>>>>>> with
> > >>>>>>>>>>> interrupts off. Besides this is mainline code, so it would be a
> > >>>>>>>>>>> problem for mainline too. We are necessarily returning to a 
> > >>>>>>>>>>> place
> > >>>>>>>>>>> where hardware irqs were on.
> > >>>>>>>>>>
> > >>>>>>>>>> Did you also look at the trace I posted?
> > >>>>>>>>>
> > >>>>>>>>> Yes, but I did not see what I am supposed to see. The only thing I
> > >>>>>>>>> see is that these trace functions should never have been called 
> > >>>>>>>>> from
> > >>>>>>>>> rt domain in the first place.
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>> There is no RT domain in the trace, only an inconsistent Linux 
> > >>>>>>>> trace
> > >>>>>>>> state after return from IRQ.
> > >>>>>>>
> > >>>>>>> What can I say, when returning from IRQ, you are necessarily
> > >>>>>>> returning to a point where irqs are ON, as the comment says, and it
> > >>>>>>> makes perfect sense. So your "fix" should be a nop. So, something
> > >>>>>>> else is broken.
> > >>>>>>
> > >>>>>> The test for selecting trace_hardirqs_off/on is wrong, that's why 
> > >>>>>> I
> > >>>>>> was asking for a better check. Also, if that path can be taken by RT
> > >>>>>> domains as well, calling trace_hardirqs_off/on was always wrong, and 
> > >>>>>> we
> > >>>>>> additionally need to check for the caller's domain.
> > >>>>>>
> > >>>>>>>
> > >>>>>>>>
> > >>>>>>>>> Note that the fact that this trace_irqs stuff is not working well
> > >>>>>>>>> may be due to the fact that part of them are commented out under
> > >>>>>>>>> CONFIG_IPIPE (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
> > >>>>>>>>
> > >>>>>>>> No, that doesn't solve all issues. Even with my hack (which may not
> > >>>>>>>> address all cases properly) plus the reversion of that commit, 
> > >>>>>>>> there are
> > >>>>>>>> still inconsistencies.
> > >>>>>>>
> > >>>>>>> You cannot revert that commit, otherwise you will end up calling
> > >>>>>>> trace_hardirqs_on/trace_hardirqs_off from RT domain, which, I
> > >>>>>>> repeat, cannot work.
> > >>>>>>
> > >>>>>> I can help to understand if that is sufficient to resolve the tracing
> > >>>>>> breakage - it isn't, there are more paths missing or wrongly 
> > >>>>>> instrumented.
> > >>>>>
> > >>>>> My idea of all this is that CONFIG_TRACE_IRQFLAGS should depend on
> > >>>>> !IPIPE, since the I-pipe tracer provides the same functionality. And
> > >>>>> is not broken.
> > >>>>
> > >>>> No, the I-pipe trace does not provide a Linux lock dependency checker,
> > >>>> nor does it support might_sleep and such. If you have Linux drivers
> > >>>> which depend on Xenomai directly or indirectly, you cannot validate 
> > >>>> them
> > >>>> anymore. That's why we support this on x86.
> > >>>
> > >>> Since the I-pipe is already keeping track of irq state with
> > >>> CONFIG_IPIPE_TRACE_IRQSOFF, can we not use that information instead
> > >>> of trying and using this trace_hardirqs stuff which looks
> > >>> irremediably broken to me?
> > >>
> > >> The former reflects the hw state, the latter traces the Linux state -
> > >> from Linux POV.
> > > 
> > > The I-pipe tracer keeps track of the root domain stall bit as well.
> > > 
> > >>
> > >> This is fixable. We just need to call the tracing functions where Linux
> > >> would call it or where we replaced some Linux call with an I-pipe
> > >> specific path and avoid calling it when the domain != root. Identifying
> > >> those spots is tricky.
> > > 
> > > If we take the example of an irq, we probably want not to call
> > > trace_hardirqs_on/trace_hardirqs_off anywhere, and just rely on the
> > > root domain stall bit.
> > 
> > Linux tracks the IRQ state separately from the (now virtualized) real
> > state - to validate the consistency independently of some spurious hard
> > irq enable/disable. And it tracks per task, not per CPU. It will be more
> > messy to fake this than to fix it, I'm quite sure.
> 
> If we take the example of irq_svc (the example you patched). We have
> 4 cases:
> 
> 1- entry over root, exit over root
> 2- entry over root, exit over non root
> 3- entry over non root, exit over non root
> 4- entry over non root, exit over root


Sorry, it does not work like that. Only cases 1 and 3 make sense.
Case 3 is easy: we do not need to call the trace_hardirqs functions.
For case 1, I guess the trace_hardirqs_on at the end must be
replaced with a test of the root domain stall bit, calling
trace_hardirqs_on only if we return to a non-stalled root.
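
To make the rule above concrete, here is a rough user-space sketch in C.
It is only a model of the decision, not I-pipe or kernel code: the names
irq_exit_ctx, irq_exit_trace and trace_hardirqs_on_stub are made up for
illustration, and the real change would have to live in the ARM entry
assembly and read the actual root domain stall bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the domain the IRQ preempted and returns to. */
enum domain { DOMAIN_ROOT, DOMAIN_NON_ROOT };

struct irq_exit_ctx {
	enum domain dom;    /* domain at entry and exit (cases 1 and 3 only) */
	bool root_stalled;  /* virtual IRQ mask (stall bit) of the root domain */
};

/* Stand-in for trace_hardirqs_on(); it only counts invocations. */
static int trace_on_calls;

static void trace_hardirqs_on_stub(void)
{
	trace_on_calls++;
}

/* Decide on IRQ exit whether Linux IRQ tracing should be told "on".
 * Returns true if the tracer was notified. */
static bool irq_exit_trace(const struct irq_exit_ctx *ctx)
{
	if (ctx->dom != DOMAIN_ROOT)
		return false;   /* case 3: non-root, never touch Linux tracing */
	if (ctx->root_stalled)
		return false;   /* case 1, root virtually masked: IRQs stay "off" */
	trace_hardirqs_on_stub();
	return true;            /* case 1, returning to a non-stalled root */
}

/* Small self-check covering the three situations discussed above. */
static void irq_exit_trace_selfcheck(void)
{
	struct irq_exit_ctx root_open    = { DOMAIN_ROOT, false };
	struct irq_exit_ctx root_stalled = { DOMAIN_ROOT, true };
	struct irq_exit_ctx non_root     = { DOMAIN_NON_ROOT, false };

	assert(irq_exit_trace(&root_open));
	assert(!irq_exit_trace(&root_stalled));
	assert(!irq_exit_trace(&non_root));
	assert(trace_on_calls == 1);
}
```

The point of the sketch is only that the hardware PSR I bit does not
appear in the decision at all: the domain and the virtual mask decide.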

-- 
                                            Gilles.

_______________________________________________
Xenomai mailing list
[email protected]
http://www.xenomai.org/mailman/listinfo/xenomai
