> -----Original Message-----
> From: Philippe Gerum [mailto:r...@xenomai.org]
> Sent: Tuesday, October 22, 2013 9:53 AM
> To: Gilles Chanteperdrix
> Cc: Daniel Merrill; xenomai@xenomai.org
> Subject: Re: [Xenomai] t_suspend and XNBREAK
>
> On 10/22/2013 04:49 PM, Gilles Chanteperdrix wrote:
> > On 10/22/2013 03:46 PM, Philippe Gerum wrote:
> >> On 10/21/2013 08:21 PM, Daniel Merrill wrote:
> >>
> >>> More follow up on this, we went ahead and put some logging in
> >>> shadow.c which from what we could find is where the signal is
> >>> "kicking" the thread.
> >>> From the logging it looks like the only signals we get (while
> >>> attached to
> >>> GDB) are SIGSTOP, SIGTRAP, SIGRT32 and SIGKILL(upon exiting the
> >>> debugger).
> >>> I'm assuming the SIGSTOP, SIGTRAP and SIGKILL are normal from the
> >>> debugger. It looks like shadow.c looks for SIGTRAP and SIGSTOP and
> >>> sets an XNDEBUG state on the thread which I assume allows it to
> >>> restart the suspend?
> >>
> >> XNDEBUG marks a thread which is ptraced, this has implications when
> >> managing the system timer while the app is single-stepped/stopped by
> >> a debugger.
> >>
> >>> SIGRT32 I believe comes from our calls to t_delete. I'm guessing
> >>> this is what's causing the suspends to fail? Anyway, I appreciate
> >>> any additional insight anyone can offer. Thanks again for all the 
> >>> help.
> >>>
> >>
> >> t_delete() will cause t_suspend() to unblock if sent to the suspended
> >> task, due to receiving SIGRT32/SIGCANCEL from the linux side, which
> >> is how the NPTL deals with async cancellation internally (t_delete()
> >> -> pthread_cancel() -> t(g)kill(SIGCANCEL)).
> >>
> >> Internally, XNBREAK will be raised for that task, causing -EINTR to
> >> be propagated back. However, since there is SIGCANCEL pending for the
> >> task, the NPTL handler should run on the way back to the call site in
> >> t_suspend(), and the task should never return from this handler.
> >>
> >> In short, receiving EINTR from t_suspend() is unexpected,
> >> particularly when unblocked by SIGCANCEL. I could not reproduce this
> >> issue based on the simple test, running over GDB (7.5.1).
> >>
> >> A few questions more:
> >>
> >> - regardless of t_delete(), is the problem about one or multiple
> >> threads unblocking unexpectedly from t_suspend(0), when
> >> single-stepping a distinct thread over GDB?
> >>
> >> - I'm testing with Xenomai 2.6.3. Which version have you been using,
> >> on which cpu/platform, using which I-pipe release in the kernel
> >> (check /proc/xenomai/{version, hal})?
> >>
> >> - Also could you write a simple test code illustrating the issue so
> >> that I could try reproducing it? Typically, would this be
> >> reproducible on your setup with a single task running t_suspend(0),
> >> while ptracing the main routine in parallel?
> >
> > Maybe cancellation is disabled with pthread_setcancelstate?
> >
>
> AFAIU the NPTL code, then SIGCANCEL should not be sent.
>
> --
> Philippe.

OK, if I didn't do anything stupid (don't hold it against me if I did), the
following code seems to reproduce the issue on my system:

#include <stdlib.h>
#include <stdio.h>
#include <psos+/psos.h>
#include <sys/mman.h>

#define CONTROLLER_PRIORITY     5
#define SUB_TASK1_PRIORITY      2
#define SUB_TASK2_PRIORITY      3
#define SUB_TASK3_PRIORITY      4

void subTask3()
{
        u_long retValue = 0;
        int count = 0;
        retValue = t_suspend(0);
        printf("subTask3, suspend returned %lu\n", retValue);
        /* count to 1000000 */
        while(count < 1000000)
                count++;
        retValue = t_suspend(0);
        printf("subTask3, suspend returned %lu\n", retValue);
        /*should never get here, we should have either suspended or
         * been deleted*/
        while(1);
}

void subTask2()
{
        u_long retValue = 0;
        int count = 0;
        retValue = t_suspend(0);
        printf("subTask2, suspend returned %lu\n", retValue);
        /* count to 100000 */
        while(count < 100000)
                count++;
        retValue = t_suspend(0);
        printf("subTask2, suspend returned %lu\n", retValue);
        /*should never get here, we should have either suspended or
         * been deleted*/
        while(1);
}

void subTask1()
{
        u_long retValue = 0;
        int count = 0;
        retValue = t_suspend(0);
        printf("subTask1, suspend returned %lu\n", retValue);
        /* count to 10000 */
        while(count < 10000)
                count++;
        retValue = t_suspend(0);
        printf("subTask1, suspend returned %lu\n", retValue);
        /*should never get here, we should have either suspended or
         * been deleted*/
        while(1);
}

void controllerTask()
{
        u_long tid;

        t_ident("SUB1", 0, &tid);
        t_resume(tid);
        tm_wkafter(5);
        t_delete(tid);

        t_ident("SUB2", 0, &tid);
        t_resume(tid);
        tm_wkafter(5);
        t_delete(tid);

        t_ident("SUB3", 0, &tid);
        t_resume(tid);
        tm_wkafter(5);
        t_delete(tid);

        t_ident("MAIN", 0, &tid);
        ev_send(tid, 0x00000001);
        t_suspend(0);
}


int main(int argc, char *argv[])
{
        u_long contId, sub1Id, sub2Id, sub3Id;
        u_long eventsReceived;

        mlockall(MCL_CURRENT | MCL_FUTURE);

        t_create("CONT", CONTROLLER_PRIORITY, 1000, 1000,
                        T_FPU | T_LOCAL, &contId);
        t_create("SUB1", SUB_TASK1_PRIORITY, 1000, 1000,
                        T_FPU | T_LOCAL, &sub1Id);
        t_create("SUB2", SUB_TASK2_PRIORITY, 1000, 1000,
                        T_FPU | T_LOCAL, &sub2Id);
        t_create("SUB3", SUB_TASK3_PRIORITY, 1000, 1000,
                        T_FPU | T_LOCAL, &sub3Id);
        t_start(sub1Id, T_PREEMPT | T_SUPV | T_NOASR, subTask1, NULL);
        t_start(sub2Id, T_PREEMPT | T_SUPV | T_NOASR, subTask2, NULL);
        t_start(sub3Id, T_PREEMPT | T_SUPV | T_NOASR, subTask3, NULL);
        tm_wkafter(5);
        t_start(contId, T_PREEMPT | T_SUPV | T_NOASR, controllerTask, NULL);

        ev_receive(0x00000001, EV_WAIT | EV_ALL, 0, &eventsReceived);

}

This was compiled with the following command:

gcc -g -I/usr/include/xenomai -D_GNU_SOURCE -D_REENTRANT -D__XENO__ \
    -I/usr/include/xenomai/psos+ \
    test.c -lpsos -L/usr/lib -lxenomai -lpthread -lrt

It was then run in gdb using the following gdb script:

break 20
commands
next
continue
end
b 32
commands
next
continue
end
b 52
commands
next
continue
end
run

Please let me know what you think. If I made a mistake in the test I'm more 
than happy to try again, just let me know what I did wrong. Thanks for 
sticking with me through all this. I appreciate all the advice and help 
that's been given.

Dan Merrill

_______________________________________________
Xenomai mailing list
Xenomai@xenomai.org
http://www.xenomai.org/mailman/listinfo/xenomai
