On Fri, Dec 04, 2015 at 05:24:28PM +0100, [email protected] wrote:
> Hello Gilles,
>
> sorry for bothering you again.
>
> > >
> > > Using the I-pipe tracer, you may be able to understand what happens.
> > Should I patch my Linux kernel so that xntrace_user_freeze is called
> > when the Linux scheduler prints the "RT throttling activated" message?
>
> I have now been able to capture an I-pipe frozen file when the
> "RT throttling" condition occurs.
> I have done this by adding the line
> ipipe_trace_freeze(1);
> right before
> printk_sched("sched: RT Throttling activated\n");
> inside "kernel/sched/rt.c".
> I have set back_trace_points to 10000 to get about 1 second of
> backtrace history.
>
> Here is a fragment of my huge frozen file which contains
> the last 1.6 seconds before the freeze.
>
> //================== frozen file start =======================
> +----- Hard IRQs ('|': locked)
> |+-- Xenomai
> ||+- Linux ('*': domain stalled, '+': current, '#': current+stalled)
> ||| +---------- Delay flag ('+': > 1 us, '!': > 10 us)
> ||| | +- NMI noise ('N')
> ||| | |
> Type User Val. Time Delay Function (Parent)
> :| # [ 4936] -<?>- 0 -1613869 0.852 xnpod_resume_thread+0x4e
> (xnthread_timeout_handler+0x1e)
> :| # event tick@-1611814 -1613868+ 3.538 xntimer_next_local_shot+0xb4
> (xntimer_tick_aperiodic+0x1a0)
> :| # [ 5066] prc6hmi -1 -1613864+ 1.964 __xnpod_schedule+0x77
> (xnintr_clock_handler+0x105)
> :| # [ 4936] -<?>- 0 -1613862+ 3.127 __xnpod_schedule+0x45d
> (xnpod_suspend_thread+0x197)
> :| # [ 4936] -<?>- 0 -1613859+ 1.192 __xnpod_schedule+0x77
> (xnpod_suspend_thread+0x171)
> :| # [ 5066] prc6hmi -1 -1613858! 27.286 __xnpod_schedule+0x45d
> (xnintr_clock_handler+0x105)
> :| *+[ 4936] -<?>- 0 -1613831+ 1.824 xnpod_resume_thread+0x4e
> (gatekeeper_thread+0xf1)
> :| # [ 456] gatekee -1 -1613829+ 4.200 __xnpod_schedule+0x77
> (xnpod_schedule_handler+0x29)
> :| # [ 4936] -<?>- 0 -1613825 0.952 __xnpod_schedule+0x77
> (xnpod_suspend_thread+0x171)
> :| # [ 456] gatekee -1 -1613824! 13.001 __xnpod_schedule+0x45d
> (xnpod_schedule_handler+0x29)
> :| *+[ 4936] -<?>- 0 -1613811 0.882 xnpod_resume_thread+0x4e
> (gatekeeper_thread+0xf1)
> :| # [ 456] gatekee -1 -1613810+ 2.506 __xnpod_schedule+0x77
> (xnpod_schedule_handler+0x29)
> :| # [ 4936] -<?>- 0 -1613807 0.952 __xnpod_schedule+0x77
> (xnpod_suspend_thread+0x171)
> :| # [ 456] gatekee -1 -1613806! 11.588 __xnpod_schedule+0x45d
> (xnpod_schedule_handler+0x29)
> :| *+[ 4936] -<?>- 0 -1613795 0.721 xnpod_resume_thread+0x4e
> (gatekeeper_thread+0xf1)
> :| # [ 456] gatekee -1 -1613794+ 6.395 __xnpod_schedule+0x77
> (xnpod_schedule_handler+0x29)
> :| # [ 4936] -<?>- 0 -1613788 0.922 __xnpod_schedule+0x77
> (xnpod_suspend_thread+0x171)
> :| # [ 456] gatekee -1 -1613787! 10.124 __xnpod_schedule+0x45d
> (xnpod_schedule_handler+0x29)
> :| *+[ 4936] -<?>- 0 -1613777 0.711 xnpod_resume_thread+0x4e
> (gatekeeper_thread+0xf1)
> :| # [ 456] gatekee -1 -1613776+ 2.686 __xnpod_schedule+0x77
> (xnpod_schedule_handler+0x29)
> :| # [ 4936] -<?>- 0 -1613773 0.892 __xnpod_schedule+0x77
> (xnpod_suspend_thread+0x197)
> :| # [ 456] gatekee -1 -1613772! 1977.417 __xnpod_schedule+0x45d
> (xnpod_schedule_handler+0x29)
> :| # [ 4956] -<?>- 50 -1611795+ 1.794 xnpod_resume_thread+0x4e
> (xnthread_timeout_handler+0x1e)
> ...
> ...
> ...
> :| # event tick@-5129 -5382+ 4.771 xntimer_next_local_shot+0xb4
> (xntimer_tick_aperiodic+0x1a0)
> :| # [ 4961] -<?>- -1 -5377+ 3.448 __xnpod_schedule+0x77
> (xnintr_clock_handler+0x105)
> :| # [ 4936] -<?>- 0 -5374+ 7.027 __xnpod_schedule+0x45d
> (xnpod_suspend_thread+0x197)
> :| # [ 4936] -<?>- 0 -5367+ 1.172 __xnpod_schedule+0x77
> (xnpod_suspend_thread+0x171)
> :| # [ 4961] -<?>- -1 -5366! 253.408 __xnpod_schedule+0x45d
> (xnintr_clock_handler+0x105)
> :| # [ 4957] -<?>- 50 -5112+ 1.884 xnpod_resume_thread+0x4e
> (xnthread_periodic_handler+0x28)
> :| # event tick@-4525 -5110+ 3.548 xntimer_next_local_shot+0xb4
> (xntimer_tick_aperiodic+0x1a0)
> :| # [ 4961] -<?>- -1 -5107+ 2.476 __xnpod_schedule+0x77
> (xnintr_clock_handler+0x105)
> :| # [ 4957] -<?>- 50 -5104! 59.785 __xnpod_schedule+0x45d
> (xnpod_suspend_thread+0x197)
> :| # [ 4957] -<?>- 50 -5045+ 1.794 __xnpod_schedule+0x77
> (xnpod_suspend_thread+0x197)
> :| # [ 4961] -<?>- -1 -5043! 538.795 __xnpod_schedule+0x45d
> (xnintr_clock_handler+0x105)
> :| # [ 5012] -<?>- 99 -4504+ 2.195 xnpod_resume_thread+0x4e
> (xnthread_periodic_handler+0x28)
> :| # event tick@-129 -4502+ 4.180 xntimer_next_local_shot+0xb4
> (xntimer_tick_aperiodic+0x1a0)
> :| # [ 4961] -<?>- -1 -4498+ 2.596 __xnpod_schedule+0x77
> (xnintr_clock_handler+0x105)
> :| # [ 5012] -<?>- 99 -4495! 677.824 __xnpod_schedule+0x45d
> (xnpod_suspend_thread+0x197)
> :| # [ 5012] -<?>- 99 -3817+ 5.844 __xnpod_schedule+0x77
> (xnpod_suspend_thread+0x197)
> :| # [ 4961] -<?>- -1 -3811! 3709.015 __xnpod_schedule+0x45d
> (xnintr_clock_handler+0x105)
> :| # [ 4957] -<?>- 50 -102+ 3.147 xnpod_resume_thread+0x4e
> (xnthread_periodic_handler+0x28)
> :| # event tick@474 -99+ 4.210 xntimer_next_local_shot+0xb4
> (xntimer_tick_aperiodic+0x1a0)
> :| # [ 4961] -<?>- -1 -95+ 3.388 __xnpod_schedule+0x77
> (xnintr_clock_handler+0x105)
> :| # [ 4957] -<?>- 50 -92! 55.645 __xnpod_schedule+0x45d
> (xnpod_suspend_thread+0x197)
> :| # [ 4957] -<?>- 50 -36+ 2.065 __xnpod_schedule+0x77
> (xnpod_suspend_thread+0x197)
> :| # [ 4961] -<?>- -1 -34! 34.474 __xnpod_schedule+0x45d
> (xnintr_clock_handler+0x105)
> < #freeze 0x00000001 0 489.264 update_curr_rt+0x10f
> (task_tick_rt+0x15)
> | # [ 5012] -<?>- 99 489 1.724 xnpod_resume_thread+0x4e
> (xnthread_periodic_handler+0x28)
> | # event tick@4870 490 3.147 xntimer_next_local_shot+0xb4
> (xntimer_tick_aperiodic+0x1a0)
> | # [ 0] -<?>- -1 494 2.866 __xnpod_schedule+0x77
> (xnintr_clock_handler+0x105)
> | # [ 5012] -<?>- 99 497 662.948 __xnpod_schedule+0x45d
> (xnpod_suspend_thread+0x197)
> | # [ 5012] -<?>- 99 1159 3.608 __xnpod_schedule+0x77
> (xnpod_suspend_thread+0x197)
> | # [ 0] -<?>- -1 1163 3720.243 __xnpod_schedule+0x45d
> (xnintr_clock_handler+0x105)
> | # [ 4957] -<?>- 50 4883 2.085 xnpod_resume_thread+0x4e
> (xnthread_periodic_handler+0x28)
> | # event tick@5474 4885 3.368 xntimer_next_local_shot+0xb4
> (xntimer_tick_aperiodic+0x1a0)
> | # [ 0] -<?>- -1 4889 1.233 __xnpod_schedule+0x77
> (xnintr_clock_handler+0x105)
> | # [ 4957] -<?>- 50 4890 0.000 __xnpod_schedule+0x45d
> (xnpod_suspend_thread+0x197)
> //================== frozen file end =======================
>
> There is no single process utilizing the CPU for > 950000 us, which
> leads me to the conclusion that the problem is not caused by an
> infinite loop.
The problem is caused by processes scheduled by the Linux scheduler
utilizing the CPU without ever releasing it. You are right that these
may be different processes.
> However I observe delay values of up to 3900 us multiple times which
> makes me think that there is a high CPU utilization which in the
> worst case exceeds the limit of 950000 us per 1 second interval.
> Probably I would have to sum up all delay values of processes in
> secondary mode within 1 second time interval and see if it exceeds
> 950000 us.
>
> Now I have the following questions:
> 1. What is meant by "domain stalled", "current" and "current+stalled"?
> 2. What is the process with ID "[ 0]"?
> 3. How can I tell processes running in secondary mode from others?
>    (just by looking at the priority value?)
>
> Any hint on interpreting my frozen file is appreciated.
This trace is worthless. To get a meaningful trace, enable tracing
of function entries (the default when you enable the I-pipe tracer).
--
Gilles.
https://click-hack.org
_______________________________________________
Xenomai mailing list
[email protected]
http://xenomai.org/mailman/listinfo/xenomai