>> >> > Ok, maybe we have some hope with the tracer though. What should
>> >> > trigger a trace is the fact that a relax request has been sent, but
>> >> > that the next linux scheduling point does not wake up the said task.
>> >> > This is all debug code, so it does not need to be clean. You can
>> >> > define a per-cpu variable (if running on SMP systems, otherwise a
>> >> > global variable will do) with the last request posted. And when a
>> >> > linux scheduling happens, test that the newly scheduled task is the
>> >> > task that was passed to the relax, if that is not the case, trigger
>> >> > a trace freeze. The point where the nucleus is informed of a Linux
>> >> > task switch is do_schedule_event(). The trick is that if you have
>> >> > some tasks with a higher priority than the relaxed task, it is
>> >> > normal that the relaxed task is not scheduled immediately, so if you
>> >> > want the condition to hold, you need to run the xenomai tasks which
>> >> > relax with the highest priority. Also, obviously, after the test,
>> >> > the pointer should be reset to NULL, because there are several tasks
>> >> > queued, and with a global variable you have no way of knowing which
>> >> > is next.
>> >>
>> >> I will do so. Do I have to change the gatekeeper's priority behind the
>> >> relaxing task to ensure that the gatekeeper would be scheduled before?
>> >
>> > The gatekeeper is not involved.
>>
>> Sorry, I wanted to say: "that the gatekeeper would NOT be scheduled
>> before?". Otherwise we might see the gatekeeper instead of the relaxed
>> task? Or am I wrong?
>
> I meant to say the gatekeeper is not involved in the primary to
> secondary mode switches, only in the secondary to primary mode
> switches. But yes, since it runs with the highest priority, it may
> be the one scheduled when back to primary mode. Though I would say
> it is probably very unlikely, since the event activating the
> gatekeeper is a secondary mode event, which by definition could not
> have happened while another task was in primary mode. So what could
> happen is that one task running in secondary mode tries to switch to
> primary mode, at which point, before the gatekeeper is even
> activated, another task running in primary mode is activated
> (perhaps because it is waiting for an interrupt, be it a timer).

I have added debug code as described by you. Unfortunately I am not sure
whether I catch the correct point in time to trigger the trace. The problem
is that besides the gatekeeper there are also some kernel interrupt threads
with a higher priority.

Because of that, and to understand what is happening here, I have added
some more debug code. My goal was to verify that all requests posted by
schedule_linux_call() are processed by lostage_handler(). This is
implemented as follows:

1) In schedule_linux_call() I check whether the request is for my highest
priority task. If so, I store the task's pointer in a global variable.

2) In lostage_handler(), before the wake_up_process() call, I check whether
the global pointer equals the task of the current request. If it does, this
is the wakeup for my high priority task, and I record that fact by resetting
the global pointer to NULL.

3) In do_schedule_event() I print a printk() message whenever the global
pointer is not NULL.

What I would expect is to see no message from do_schedule_event(), since
lostage_handler() should always run before do_schedule_event(). However,
after some run time I do see the printk() message from do_schedule_event().
After that, all tasks are frozen.

Is my assumption above wrong, or is the printed printk() message an
indication that the problem has occurred?

For better understanding find my debug-changes below:

*** tmp/xenomai-2.6.3/ksrc/nucleus/shadow.c     2013-08-20 13:14:38.000000000 +0200
--- xenomai-2.6.3/ksrc/nucleus/shadow.c 2014-12-11 20:03:22.885493756 +0100
***************
*** 752,757 ****
--- 752,758 ----
        }
  }
  
+ static volatile struct task_struct *__last_requested_task = NULL;
  static void lostage_handler(void *cookie)
  {
        int cpu, reqnum, type, arg, sig, sigarg;
***************
*** 795,800 ****
--- 796,803 ----
  
                        /* fall through */
                case LO_START_REQ:
+                       if (__last_requested_task && p == __last_requested_task)
+                               __last_requested_task = NULL;
                        wake_up_process(p);
                        break;
  
***************
*** 843,848 ****
--- 846,854 ----
        rq->req[reqnum].task = p;
        rq->req[reqnum].arg = arg;
  
+     if (strcmp(p->comm, "LApp") == 0)
+               __last_requested_task = p;
+ 
        __rthal_apc_schedule(lostage_apc);
  
        splexit(s);
***************
*** 2606,2611 ****
--- 2612,2623 ----
        if (!xnpod_active_p())
                return;
  
+       if (__last_requested_task)
+       {
+               printk("===================== lostage_handler() was not called before!\n");
+               __last_requested_task = NULL;
+       }
+ 
        prev_task = current;
        next = xnshadow_thread(next_task);
        set_switch_lock_owner(prev_task);


Regards,
Christoph
_______________________________________________
Xenomai mailing list
[email protected]
http://www.xenomai.org/mailman/listinfo/xenomai
