Hello Gilles,

Sorry for bothering you again.

> >
> > Using the I-pipe tracer, you may be able to understand what happens.
> Should I patch my Linux kernel so that xntrace_user_freeze is called,
> when Linux scheduler sends "RT throttling activated" message?

I have now been able to capture an I-pipe frozen file when the
"RT throttling" condition occurs.
I did this by adding the line
  ipipe_trace_freeze(1);
right before
  printk_sched("sched: RT Throttling activated\n");
in "kernel/sched/rt.c".
I have set back_trace_points to 10000 to get about 1 second of
backtrace history.

Here is a fragment of my (huge) frozen file, containing the
last 1.6 seconds before the freeze.

//================== frozen file start =======================
 +----- Hard IRQs ('|': locked)
 |+-- Xenomai
 ||+- Linux ('*': domain stalled, '+': current, '#': current+stalled)
 |||              +---------- Delay flag ('+': > 1 us, '!': > 10 us)
 |||              |       +- NMI noise ('N')
 |||              |       |
      Type      User Val.   Time    Delay  Function (Parent)
:|  # [ 4936] -<?>-    0 -1613869      0.852  xnpod_resume_thread+0x4e (xnthread_timeout_handler+0x1e)
:|  # event   tick@-1611814-1613868+   3.538  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
:|  # [ 5066] prc6hmi -1 -1613864+   1.964  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
:|  # [ 4936] -<?>-    0 -1613862+   3.127  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
:|  # [ 4936] -<?>-    0 -1613859+   1.192  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
:|  # [ 5066] prc6hmi -1 -1613858!  27.286  __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
:|  *+[ 4936] -<?>-    0 -1613831+   1.824  xnpod_resume_thread+0x4e (gatekeeper_thread+0xf1)
:|  # [  456] gatekee -1 -1613829+   4.200  __xnpod_schedule+0x77 (xnpod_schedule_handler+0x29)
:|  # [ 4936] -<?>-    0 -1613825      0.952  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
:|  # [  456] gatekee -1 -1613824!  13.001  __xnpod_schedule+0x45d (xnpod_schedule_handler+0x29)
:|  *+[ 4936] -<?>-    0 -1613811      0.882  xnpod_resume_thread+0x4e (gatekeeper_thread+0xf1)
:|  # [  456] gatekee -1 -1613810+   2.506  __xnpod_schedule+0x77 (xnpod_schedule_handler+0x29)
:|  # [ 4936] -<?>-    0 -1613807      0.952  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
:|  # [  456] gatekee -1 -1613806!  11.588  __xnpod_schedule+0x45d (xnpod_schedule_handler+0x29)
:|  *+[ 4936] -<?>-    0 -1613795      0.721  xnpod_resume_thread+0x4e (gatekeeper_thread+0xf1)
:|  # [  456] gatekee -1 -1613794+   6.395  __xnpod_schedule+0x77 (xnpod_schedule_handler+0x29)
:|  # [ 4936] -<?>-    0 -1613788      0.922  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
:|  # [  456] gatekee -1 -1613787!  10.124  __xnpod_schedule+0x45d (xnpod_schedule_handler+0x29)
:|  *+[ 4936] -<?>-    0 -1613777      0.711  xnpod_resume_thread+0x4e (gatekeeper_thread+0xf1)
:|  # [  456] gatekee -1 -1613776+   2.686  __xnpod_schedule+0x77 (xnpod_schedule_handler+0x29)
:|  # [ 4936] -<?>-    0 -1613773      0.892  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
:|  # [  456] gatekee -1 -1613772! 1977.417  __xnpod_schedule+0x45d (xnpod_schedule_handler+0x29)
:|  # [ 4956] -<?>-   50 -1611795+   1.794  xnpod_resume_thread+0x4e (xnthread_timeout_handler+0x1e)
...
...
...
:|  # event   tick@-5129 -5382+   4.771  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
:|  # [ 4961] -<?>-   -1 -5377+   3.448  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
:|  # [ 4936] -<?>-    0 -5374+   7.027  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
:|  # [ 4936] -<?>-    0 -5367+   1.172  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
:|  # [ 4961] -<?>-   -1 -5366! 253.408  __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
:|  # [ 4957] -<?>-   50 -5112+   1.884  xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
:|  # event   tick@-4525 -5110+   3.548  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
:|  # [ 4961] -<?>-   -1 -5107+   2.476  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
:|  # [ 4957] -<?>-   50 -5104!  59.785  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
:|  # [ 4957] -<?>-   50 -5045+   1.794  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
:|  # [ 4961] -<?>-   -1 -5043! 538.795  __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
:|  # [ 5012] -<?>-   99 -4504+   2.195  xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
:|  # event   tick@-129  -4502+   4.180  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
:|  # [ 4961] -<?>-   -1 -4498+   2.596  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
:|  # [ 5012] -<?>-   99 -4495! 677.824  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
:|  # [ 5012] -<?>-   99 -3817+   5.844  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
:|  # [ 4961] -<?>-   -1 -3811! 3709.015  __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
:|  # [ 4957] -<?>-   50  -102+   3.147  xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
:|  # event   tick@474     -99+   4.210  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
:|  # [ 4961] -<?>-   -1   -95+   3.388  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
:|  # [ 4957] -<?>-   50   -92!  55.645  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
:|  # [ 4957] -<?>-   50   -36+   2.065  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
:|  # [ 4961] -<?>-   -1   -34!  34.474  __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
<    #freeze  0x00000001     0    489.264  update_curr_rt+0x10f (task_tick_rt+0x15)
 |  # [ 5012] -<?>-   99   489      1.724  xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
 |  # event   tick@4870    490      3.147  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
 |  # [    0] -<?>-   -1   494      2.866  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
 |  # [ 5012] -<?>-   99   497    662.948  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
 |  # [ 5012] -<?>-   99  1159      3.608  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
 |  # [    0] -<?>-   -1  1163    3720.243  __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
 |  # [ 4957] -<?>-   50  4883      2.085  xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
 |  # event   tick@5474   4885      3.368  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
 |  # [    0] -<?>-   -1  4889      1.233  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
 |  # [ 4957] -<?>-   50  4890      0.000  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
//================== frozen file end =======================

There is no process utilizing the CPU for more than 950000 us at a
stretch, which leads me to the conclusion that the problem is not
caused by an infinite loop.
However, I observe delay values of up to 3900 us multiple times,
which makes me think that there is high CPU utilization which, in
the worst case, exceeds the limit of 950000 us per 1-second interval.
I would probably have to sum up all delay values of processes in
secondary mode within a 1-second window and check whether the total
exceeds 950000 us.
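
A rough Python sketch of that post-processing step (my own, not part
of the I-pipe tools; the regex guesses the field positions from the
trace header above, so it may need adjusting for other tracer
versions, and it does not yet filter for secondary-mode processes):

```python
import re

# One trace entry per line: flags, type/PID/name/prio columns,
# relative timestamp (us), delay (us), function (parent).
LINE_RE = re.compile(
    r"""^\S*\s+\#?\s*          # IRQ/domain state flags
        .*?                    # type / PID / name / priority columns
        (-?\d+)[+!]?\s+        # relative timestamp in us
        (\d+\.\d+)\s+          # delay in us
        \S+\s+\(\S+\)\s*$      # function (parent)
    """,
    re.VERBOSE,
)

def max_window_delay(lines, window_us=1_000_000):
    """Maximum summed delay over any sliding window of window_us."""
    samples = []
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            samples.append((int(m.group(1)), float(m.group(2))))
    samples.sort()
    best = total = 0.0
    start = 0
    for end, (t, d) in enumerate(samples):
        total += d
        # Drop samples that fell out of the 1-second window.
        while t - samples[start][0] > window_us:
            total -= samples[start][1]
            start += 1
        best = max(best, total)
    return best

if __name__ == "__main__":
    with open("frozen") as f:
        print(max_window_delay(f))
```

If the returned maximum approaches 950000, that would support the
over-utilization theory.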

Now I have the following questions:
1. What do "domain stalled", "current" and "current+stalled" mean?
2. What is the process with ID "[    0]"?
3. How can I tell processes running in secondary mode from others?
   (just by looking at the priority value?)

Any hint on interpreting my frozen file is appreciated.

Thanks
Jochen

_______________________________________________
Xenomai mailing list
[email protected]
http://xenomai.org/mailman/listinfo/xenomai
