Re: [Xenomai] Avoid stopping all timers during debug sessions

Jan Kiszka Wed, 02 Aug 2017 08:43:28 -0700

On 2017-08-02 15:05, Philippe Gerum wrote:
> On 08/02/2017 02:54 PM, Jan Kiszka wrote:
>> On 2017-08-02 14:47, Philippe Gerum wrote:
>>> On 08/02/2017 01:03 PM, Jan Kiszka wrote:
>>>> On 2017-08-02 10:12, Philippe Gerum wrote:
>>>>> On 07/27/2017 08:03 PM, Jan Kiszka wrote:
>>>>>> Hi,
>>>>>>
>>>>>> currently, when gdb interrupts a Xenomai application, all timers stop
>>>>>> firing (except for those marked with XNTIMER_NOBLCK - core timers). This
>>>>>> means, you can't debug one RT application without affecting others on
>>>>>> the same system. I'm now wondering how to resolve this. Options:
>>>>>>
>>>>>>  - convert timer stopping to a per-process feature - implies the need to
>>>>>>    establish some timer-process relationship concept, and it might be
>>>>>>    tricky for drivers and other central timer creators
>>>>>>
>>>>>>  - let the application deal with timer overflows during debug:
>>>>>>     - make timer stop optional (in case of single applications that
>>>>>>       can't handle this yet)
>>>>>>     - do not stop timers at all
>>>>>>
>>>>>> Any opinions?
>>>>>>
>>>>>
>>>>> That stuff is a relic from the Dark Ages, which not only sits in a hot
>>>>> path, but also crashes an ARM board here when ptracing. At the very
>>>>> least, the implementation is overkill and needs fixing.
>>>>>
>>>>> I'm working on a much simpler and less intrusive approach, I'll follow
>>>>> up today on this.
>>>>
>>>> OK, looking forward.
>>>
>>> To answer your original question, I would either let the application
>>> deal with overruns, or provide an implementation that only cares for
>>> hiding overruns that may have occurred due to ptracing, instead of
>>> trying to somehow freeze time.
>>>
>>> So I pushed an implementation doing exactly that, fixing the breakage
>>> issue in the same move. There is no such thing as blocked timers
>>> anymore, we just pretend that no overrun took place when asked via
>>> xntimer_get_overrun() after a ptraced state has existed, until an
>>> originally relaxed caller eventually leaves the kernel in primary mode,
>>> at which point normal accounting is restarted.
>>>
>>> That should exactly cover the case we are interested in, i.e. a once
>>> ptraced - therefore relaxed - thread, returning from any service which
>>> may collect the overrun count, and has to do so from primary mode (e.g.
>>> sigwait(), timerfd.read(), rtdm_task_wait_period()).
>>>
>>> The code is in wip/sstep for now.
>>>
>>
>> Will have a look.
>>


The approach looks good, but please split up refactorings from the
functional change - reviewing and also cherry-picking the patch is hard
now (I will need to keep some structuring for the rtdebug patches below).
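
To check that I read the idea correctly, here is a minimal user-space
model of the accounting as I understand it from the description above.
It is only a sketch, not the code in wip/sstep: every model_* name and
the ptraced_seen flag are hypothetical, and time is passed in by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not Xenomai code: all names are illustrative. */
struct model_thread {
	bool ptraced_seen;	/* a ptraced state has existed */
};

struct model_timer {
	uint64_t period_ns;	/* timer period */
	uint64_t next_ns;	/* next expected expiry date */
};

/* The debugger stopped the thread: remember it, but leave timers alone. */
static void model_ptrace_stop(struct model_thread *th)
{
	th->ptraced_seen = true;
}

/* The once relaxed caller leaves the kernel in primary mode. */
static void model_exit_primary(struct model_thread *th)
{
	th->ptraced_seen = false;	/* normal accounting restarts */
}

/* Count the periods missed at now_ns and re-arm past now_ns. */
static uint64_t model_collect(struct model_timer *t, uint64_t now_ns)
{
	uint64_t missed = 0;

	assert(t->period_ns > 0);
	if (now_ns >= t->next_ns) {
		missed = (now_ns - t->next_ns) / t->period_ns;
		t->next_ns += (missed + 1) * t->period_ns;
	}
	return missed;
}

/* What the caller gets to see: no overruns while the flag is set. */
static uint64_t model_get_overrun(struct model_thread *th,
				  struct model_timer *t, uint64_t now_ns)
{
	uint64_t missed = model_collect(t, now_ns);

	return th->ptraced_seen ? 0 : missed;
}
```

The point of the model is that the timer keeps running and is re-armed
as usual; only the count reported to the caller is masked until
model_exit_primary() has been called.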

>>>>
>>>>>
>>>>>> BTW, I also have a pre/post-debug hook concept for applications out of
>>>>>> tree that can help to make applications debug-aware. In our use cases,
>>>>>> the pre-hook brings hardware into a state that permits application stop,
>>>>>> and the post-hook ramps everything up again.
>>>>>>
>>>>>
>>>>> Hooks sitting in kernel space?
>>>>>
>>>>
>>>> No, in the application. They are running in a "privileged" carrier
>>>> thread (privileged means that it is stopped only after the callback
>>>> finished). Goes along with changes to enable synchronous stopping of all
>>>> RT threads (except that one) once a debug-stop comes in.
>>>>
>>>
>>> Maybe that could be generalized by defining a special signal some
>>> dedicated thread within an application could sigwait() for. A thread
>>> synchronously waiting for such signal would inherit the required
>>> privileges by doing so.
>>
>> The interface currently consists of two new syscalls: one is registering
>> the calling thread as a "ptracer helper", the other is allowing it to
>> wait for related events in a loop. Maybe we can also do implicit
>> registration on the first wait syscall, though. But signals are limited,
>> and we already stole a couple from the application.
>>
> 
> Cobalt can also deal with pseudo-signal numbers which do not belong to
> the regular signal namespace, i.e. SIGSUSP .. SIGDEMT. It may not be
> necessary to handle the ptrace event strictly as an actual signal whose
> occurrence has to be logged in a bitmap, the thread issuing sigwait()
> on that event could just branch to a routine having its own
> synchronization method with the ptrace management stuff.

OK, whatever suits the use case and is easier to hook into. A separate
syscall seemed cleaner to me.
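
For comparison, the sigwait()-based variant could look roughly like
this from the application side. This is a plain POSIX sketch under
stated assumptions: SIGUSR1 and SIGUSR2 merely stand in for whatever
(pseudo-)signal Cobalt would deliver on debug stop and resume, and the
dbg_* and sample_* names are made up for illustration. Neither the
queued patches nor Cobalt define them.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Stand-ins only: the real events would not be regular signals. */
#define DBG_STOP_SIG	SIGUSR1
#define DBG_RESUME_SIG	SIGUSR2

typedef void (*dbg_hook_t)(void);

struct dbg_helper {
	sigset_t set;		/* events the helper thread waits for */
	dbg_hook_t pre_hook;	/* runs before the application stops */
	dbg_hook_t post_hook;	/* runs after the application resumes */
};

/* Example hooks: a real pre-hook would bring the hardware to a safe state. */
static int hw_quiesced;
static void sample_pre_hook(void)  { hw_quiesced = 1; }
static void sample_post_hook(void) { hw_quiesced = 0; }

/* Block both events in the calling thread so sigwait() can collect them. */
static int dbg_helper_init(struct dbg_helper *h, dbg_hook_t pre, dbg_hook_t post)
{
	assert(h != NULL);
	h->pre_hook = pre;
	h->post_hook = post;
	sigemptyset(&h->set);
	sigaddset(&h->set, DBG_STOP_SIG);
	sigaddset(&h->set, DBG_RESUME_SIG);
	return pthread_sigmask(SIG_BLOCK, &h->set, NULL);
}

/* Wait for one event, run the matching hook, return the event or -1. */
static int dbg_helper_wait_once(const struct dbg_helper *h)
{
	int sig;

	assert(h != NULL);
	if (sigwait(&h->set, &sig) != 0)
		return -1;
	if (sig == DBG_STOP_SIG && h->pre_hook)
		h->pre_hook();
	else if (sig == DBG_RESUME_SIG && h->post_hook)
		h->post_hook();
	return sig;
}
```

The helper thread would call dbg_helper_init() once and then loop on
dbg_helper_wait_once(). With the two-syscall interface the loop looks
the same, only the registration and the wait are explicit syscalls.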

> 
>> I can factor those patches out and put them in a branch for preview and
>> further discussion.
>>
> 
> Please do, when time allows. Anything that makes debugging easier is
> worth consideration.
> 
Pushed to queues/rtdebug in my repo [1]. I didn't test the rebased queue
yet, just cherry-picked and folded some stuff together.

There are a few changes required to ipipe as well, see
queues/rtdebug-4.4 [2]. That's x86-only so far, but should be portable
to other archs.

Looking forward to hearing your opinion!

Jan

[1] http://git.xenomai.org/xenomai-jki.git/log/?h=queues/rtdebug
[2] http://git.xenomai.org/ipipe-jki.git/log/?h=queues/rtdebug-4.4

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux

_______________________________________________
Xenomai mailing list
[email protected]
https://xenomai.org/mailman/listinfo/xenomai
