Hello Gilles,
sorry for bothering you again.
> >
> > Using the I-pipe tracer, you may be able to understand what happens.
> Should I patch my Linux kernel so that xntrace_user_freeze is called,
> when Linux scheduler sends "RT throttling activated" message?
I have now been able to capture an I-pipe frozen file at the moment the
"RT throttling" condition occurs.
I have done this by adding the line
ipipe_trace_freeze(1);
right before
printk_sched("sched: RT Throttling activated\n");
inside "kernel/sched/rt.c".
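In patch form, the change looks roughly like this; note that in mainline the
message is printed from sched_rt_runtime_exceeded(), but the exact surrounding
lines depend on the kernel version:

```diff
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
+			ipipe_trace_freeze(1);
 			printk_sched("sched: RT Throttling activated\n");
```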
I have set back_trace_points to 10000 to get about 1 second of
backtrace history.
Here is a fragment of my huge frozen file, covering the last 1.6 seconds
before the freeze.
//================== frozen file start =======================
+----- Hard IRQs ('|': locked)
|+-- Xenomai
||+- Linux ('*': domain stalled, '+': current, '#': current+stalled)
||| +---------- Delay flag ('+': > 1 us, '!': > 10 us)
||| | +- NMI noise ('N')
||| | |
Type User Val. Time Delay Function (Parent)
:| # [ 4936] -<?>- 0 -1613869 0.852 xnpod_resume_thread+0x4e (xnthread_timeout_handler+0x1e)
:| # event tick@-1611814 -1613868+ 3.538 xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
:| # [ 5066] prc6hmi -1 -1613864+ 1.964 __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
:| # [ 4936] -<?>- 0 -1613862+ 3.127 __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
:| # [ 4936] -<?>- 0 -1613859+ 1.192 __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
:| # [ 5066] prc6hmi -1 -1613858! 27.286 __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
:| *+[ 4936] -<?>- 0 -1613831+ 1.824 xnpod_resume_thread+0x4e (gatekeeper_thread+0xf1)
:| # [ 456] gatekee -1 -1613829+ 4.200 __xnpod_schedule+0x77 (xnpod_schedule_handler+0x29)
:| # [ 4936] -<?>- 0 -1613825 0.952 __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
:| # [ 456] gatekee -1 -1613824! 13.001 __xnpod_schedule+0x45d (xnpod_schedule_handler+0x29)
:| *+[ 4936] -<?>- 0 -1613811 0.882 xnpod_resume_thread+0x4e (gatekeeper_thread+0xf1)
:| # [ 456] gatekee -1 -1613810+ 2.506 __xnpod_schedule+0x77 (xnpod_schedule_handler+0x29)
:| # [ 4936] -<?>- 0 -1613807 0.952 __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
:| # [ 456] gatekee -1 -1613806! 11.588 __xnpod_schedule+0x45d (xnpod_schedule_handler+0x29)
:| *+[ 4936] -<?>- 0 -1613795 0.721 xnpod_resume_thread+0x4e (gatekeeper_thread+0xf1)
:| # [ 456] gatekee -1 -1613794+ 6.395 __xnpod_schedule+0x77 (xnpod_schedule_handler+0x29)
:| # [ 4936] -<?>- 0 -1613788 0.922 __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
:| # [ 456] gatekee -1 -1613787! 10.124 __xnpod_schedule+0x45d (xnpod_schedule_handler+0x29)
:| *+[ 4936] -<?>- 0 -1613777 0.711 xnpod_resume_thread+0x4e (gatekeeper_thread+0xf1)
:| # [ 456] gatekee -1 -1613776+ 2.686 __xnpod_schedule+0x77 (xnpod_schedule_handler+0x29)
:| # [ 4936] -<?>- 0 -1613773 0.892 __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
:| # [ 456] gatekee -1 -1613772! 1977.417 __xnpod_schedule+0x45d (xnpod_schedule_handler+0x29)
:| # [ 4956] -<?>- 50 -1611795+ 1.794 xnpod_resume_thread+0x4e (xnthread_timeout_handler+0x1e)
...
...
...
:| # event tick@-5129 -5382+ 4.771 xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
:| # [ 4961] -<?>- -1 -5377+ 3.448 __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
:| # [ 4936] -<?>- 0 -5374+ 7.027 __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
:| # [ 4936] -<?>- 0 -5367+ 1.172 __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
:| # [ 4961] -<?>- -1 -5366! 253.408 __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
:| # [ 4957] -<?>- 50 -5112+ 1.884 xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
:| # event tick@-4525 -5110+ 3.548 xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
:| # [ 4961] -<?>- -1 -5107+ 2.476 __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
:| # [ 4957] -<?>- 50 -5104! 59.785 __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
:| # [ 4957] -<?>- 50 -5045+ 1.794 __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
:| # [ 4961] -<?>- -1 -5043! 538.795 __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
:| # [ 5012] -<?>- 99 -4504+ 2.195 xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
:| # event tick@-129 -4502+ 4.180 xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
:| # [ 4961] -<?>- -1 -4498+ 2.596 __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
:| # [ 5012] -<?>- 99 -4495! 677.824 __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
:| # [ 5012] -<?>- 99 -3817+ 5.844 __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
:| # [ 4961] -<?>- -1 -3811! 3709.015 __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
:| # [ 4957] -<?>- 50 -102+ 3.147 xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
:| # event tick@474 -99+ 4.210 xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
:| # [ 4961] -<?>- -1 -95+ 3.388 __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
:| # [ 4957] -<?>- 50 -92! 55.645 __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
:| # [ 4957] -<?>- 50 -36+ 2.065 __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
:| # [ 4961] -<?>- -1 -34! 34.474 __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
< #freeze 0x00000001 0 489.264 update_curr_rt+0x10f (task_tick_rt+0x15)
| # [ 5012] -<?>- 99 489 1.724 xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
| # event tick@4870 490 3.147 xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
| # [ 0] -<?>- -1 494 2.866 __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
| # [ 5012] -<?>- 99 497 662.948 __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
| # [ 5012] -<?>- 99 1159 3.608 __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
| # [ 0] -<?>- -1 1163 3720.243 __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
| # [ 4957] -<?>- 50 4883 2.085 xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
| # event tick@5474 4885 3.368 xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
| # [ 0] -<?>- -1 4889 1.233 __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
| # [ 4957] -<?>- 50 4890 0.000 __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
//================== frozen file end =======================
No single process utilizes the CPU for > 950000 us, which leads me to
the conclusion that the problem is not caused by an infinite loop.
However, I observe delay values of up to 3900 us multiple times, which
makes me think that the overall CPU utilization is high and, in the
worst case, exceeds the limit of 950000 us per 1-second interval.
I would probably have to sum up the delay values of all processes
running in secondary mode within a 1-second interval and check whether
the total exceeds 950000 us.
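As a starting point for that summation, here is a rough sketch (my own
throwaway script, not part of the I-pipe tools; it assumes the column layout
shown above, only matches the "[pid]" lines, and sums all Delay values per
1-second slice of the relative timestamp, without yet distinguishing
secondary-mode processes):

```python
import re

# Matches the "[ pid] name val time delay" columns of a trace line, e.g.
#   :| # [ 456] gatekee -1 -1613806! 11.588 __xnpod_schedule+0x45d (...)
# Group 1 is the relative timestamp in us (the trailing '+'/'!' delay
# flag is skipped), group 2 is the Delay column in us.
LINE_RE = re.compile(r"\[\s*\d+\]\s+\S+\s+-?\d+\s+(-?\d+)[+!]?\s+(\d+\.\d+)")

def delay_sums_per_window(lines, window_us=1000000):
    """Sum the Delay column over consecutive windows of window_us."""
    samples = []
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            samples.append((int(m.group(1)), float(m.group(2))))
    if not samples:
        return []
    start = samples[0][0]
    sums = {}
    for t, d in samples:
        win = (t - start) // window_us
        sums[win] = sums.get(win, 0.0) + d
    return [sums[w] for w in sorted(sums)]

# Demo on two lines lifted from the trace above:
demo = [
    ":| # [ 5066] prc6hmi -1 -1613864+ 1.964 __xnpod_schedule+0x77",
    ":| # [ 456] gatekee -1 -1613806! 11.588 __xnpod_schedule+0x45d",
]
for i, total in enumerate(delay_sums_per_window(demo)):
    print("window %d: %.3f us" % (i, total))  # -> window 0: 13.552 us
```

Feeding the whole frozen file through delay_sums_per_window() would show
whether any window approaches 950000 us; restricting it to secondary-mode
threads would of course first need an answer to my question 3.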
Now I have the following questions:
1. What is meant by "domain stalled", "current" and "current+stalled"?
2. What is the process with ID "[ 0]"?
3. How can I tell processes running in secondary mode apart from the
   others? (Just by looking at the priority value?)
Any hints on interpreting my frozen file are appreciated.

Thanks,
Jochen
_______________________________________________
Xenomai mailing list
[email protected]
http://xenomai.org/mailman/listinfo/xenomai