On 16/07/21 11:47, Jan Kiszka wrote:
On 16.07.21 11:45, Mauro S. via Xenomai wrote:
Hi,

I'm using Xenomai 3 (master branch, commit
bca41678742be80c3a0d5a01935c671c385a95a1) on an x86_64 Intel Atom
x5-E8000 with 2GB RAM, using the kernel from the Xenomai repos, in the Cobalt
configuration. The SMI workaround is enabled and all latency tests are good.

I'm facing a very weird problem in my application. I have some
tasks with priority < 90 that call rt_task_suspend() on themselves.
Then, I have a task with priority 99 that resumes all the other tasks with
rt_task_resume() once they are suspended.

Sometimes a task does not get resumed.

/proc/xenomai/sched/stat shows this status for my suspended task:

CPU  PID    MSW        CSW        XSC        PF    STAT       %CPU  NAME
   1  620    3          8          13         0     00048041    0.0  t12

Analyzing the scenario by attaching gdb to the application, I observe that
the task that is not resumed has this backtrace:

#0  0x00007f0ec14c9b38 in __cobalt_kill (pid=620, sig=65) at signal.c:100
100    signal.c: No such file or directory.
(gdb) bt
#0  0x00007f0ec14c9b38 in __cobalt_kill (pid=620, sig=65) at signal.c:100
#1  0x00007f0ec14eb379 in threadobj_suspend (thobj=thobj@entry=0x7f0ec06bfcd0) at threadobj.c:335
#2  0x00007f0ec15013dc in rt_task_suspend (task=task@entry=0x0) at task.c:1154

That looks OK to me. If I understand correctly, the task is blocked in its
SIGSUSP handler, which calls sigsuspend() and waits for SIGRESM to "restart".
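
For readers unfamiliar with the pattern described above (a suspend-signal handler that parks the thread in sigsuspend() until a resume signal arrives), here is a minimal plain-POSIX sketch. It is not Xenomai code and makes no claim about how libcobalt or copperplate actually implement suspension: SIGUSR1 and SIGUSR2 are stand-ins for SIGSUSP and SIGRESM, and the names SIG_SUSP, SIG_RESM, susp_handler, resm_handler, worker and suspend_resume_demo are all hypothetical, chosen only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

/* Assumption: ordinary user signals stand in for Xenomai's SIGSUSP/SIGRESM. */
#define SIG_SUSP SIGUSR1
#define SIG_RESM SIGUSR2

static atomic_int suspended; /* 1 while the worker is parked in its handler */
static atomic_int resumed;   /* 1 once the worker ran past its suspend call */

/* The resume handler has nothing to do: running it is what ends sigsuspend(). */
static void resm_handler(int sig) { (void)sig; }

/* The suspend handler parks the calling thread until SIG_RESM is delivered. */
static void susp_handler(int sig)
{
    sigset_t mask;

    (void)sig;
    pthread_sigmask(SIG_BLOCK, NULL, &mask); /* read the current mask */
    sigdelset(&mask, SIG_RESM);              /* unblock only the resume signal */
    atomic_store(&suspended, 1);
    sigsuspend(&mask);                       /* returns after resm_handler ran */
    atomic_store(&suspended, 0);
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_kill(pthread_self(), SIG_SUSP);  /* suspend ourselves */
    atomic_store(&resumed, 1);               /* reached only after the resume */
    return NULL;
}

/* Returns 1 if the worker stayed parked until resumed, 0 if not, -1 on error. */
int suspend_resume_demo(void)
{
    struct timespec tick = { 0, 1000000 };   /* 1 ms polling interval */
    struct sigaction sa;
    sigset_t block;
    pthread_t tid;
    int ran_early;

    /* Keep SIG_RESM blocked everywhere except inside sigsuspend(), so a
     * resume sent slightly too early stays pending instead of being lost. */
    sigemptyset(&block);
    sigaddset(&block, SIG_RESM);
    pthread_sigmask(SIG_BLOCK, &block, NULL);

    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = susp_handler;
    sigaction(SIG_SUSP, &sa, NULL);
    sa.sa_handler = resm_handler;
    sigaction(SIG_RESM, &sa, NULL);

    atomic_store(&suspended, 0);
    atomic_store(&resumed, 0);
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    while (!atomic_load(&suspended))         /* wait for the worker to park */
        nanosleep(&tick, NULL);
    ran_early = atomic_load(&resumed);       /* should still be 0 here */

    pthread_kill(tid, SIG_RESM);             /* resume the worker */
    pthread_join(tid, NULL);
    return !ran_early && atomic_load(&resumed);
}
```

To try it, call suspend_resume_demo() from a main() and build with something like `cc -pthread`. In this sketch the resume signal is kept blocked outside sigsuspend() so that it cannot be consumed before the thread is actually parked; whether a comparable window exists in the real Xenomai code path is not something this example can show.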

Then, I placed some breakpoints where rt_task_resume() is called, and
in rt_task_resume() itself. I set tcb->suspends=1 with GDB and followed
the subsequent call to threadobj_resume(). Then, I placed some
breakpoints in threadobj_resume(), forced the __THREAD_S_SUSPENDED bit in
thobj->status, and observed that __RT(kill(thobj->pid, SIGRESM)) was
called and returned 0. thobj->pid has the right value.

But the suspended task does not get resumed.

Any ideas or suggestions?


Can you derive a simple, shareable test case from this? That will ensure
we are not misinterpreting what your code actually does.

Jan



I will try to reduce the application code to a simple test, but I'm not sure I will succeed.
I will get back to you either way.

Thanks.

--
Mauro
