On 19.08.21 18:54, Lange Norbert wrote:
>
>> -----Original Message-----
>> From: Jan Kiszka <jan.kis...@siemens.com>
>> Sent: Thursday, 19 August 2021 17:42
>> To: Lange Norbert <norbert.la...@andritz.com>; Xenomai
>> (xenomai@xenomai.org) <xenomai@xenomai.org>
>> Subject: Re: cobalt_assert_nrt should use __cobalt_pthread_kill
>>
>> On 19.08.21 17:24, Lange Norbert wrote:
>>>
>>>> -----Original Message-----
>>>> From: Jan Kiszka <jan.kis...@siemens.com>
>>>> Sent: Thursday, 19 August 2021 12:54
>>>> To: Lange Norbert <norbert.la...@andritz.com>; Xenomai
>>>> (xenomai@xenomai.org) <xenomai@xenomai.org>
>>>> Subject: Re: cobalt_assert_nrt should use __cobalt_pthread_kill
>>>>
>>>> On 19.08.21 11:56, Lange Norbert via Xenomai wrote:
>>>>> Hello,
>>>>>
>>>>> I have a small issue with the cobalt_assert_nrt function:
>>>>> in case a violation is detected the thread should get a signal, but
>>>>> the implementation will implicitly get a signal during the execution
>>>>> of pthread_kill, see:
>>>>>
>>>>> #0 getpid () at ../sysdeps/unix/syscall-template.S:60
>>>>> #1 0x00007fc1dc4fa0d6 in __pthread_kill (threadid=<optimized out>,
>>>>>    signo=24) at ../sysdeps/unix/sysv/linux/pthread_kill.c:53
>>>>> #2 0x00007fc1dc8b2470 in callAssertFunction () at
>>>>>    /home/lano/git/preload_checkers/src/pchecker.h:199
>>>>> #3 malloc () at
>>>>>    /home/lano/git/preload_checkers/src/pchecker_heap_glibc.c:220
>>>>> #4 <actual instrumented function>
>>>>>
>>>>> You see, the signal should happen with the pc of #2, not from the
>>>>> implementation of glibc (or whatever C library).
>>>>> So the function should be changed to:
>>>>>
>>>>> void cobalt_assert_nrt(void)
>>>>> {
>>>>>         if (cobalt_should_warn())
>>>>>                 __cobalt_pthread_kill(pthread_self(), SIGDEBUG);
>>>>> }
>>>>>
>>>>> (or even replaced with the raw syscall?)
>>>>>
>>>>
>>>> Hmm, that's similar to an assert causing a lengthy trace, not failing
>>>> directly at the place where the assert was raised:
>>>>
>>>> #0 0x00007ffff7a3918b in raise () from /lib64/libc.so.6
>>>> #1 0x00007ffff7a3a585 in abort () from /lib64/libc.so.6
>>>> #2 0x00007ffff7a3185a in __assert_fail_base () from /lib64/libc.so.6
>>>> #3 0x00007ffff7a318d2 in __assert_fail () from /lib64/libc.so.6
>>>> #4 0x0000000000400524 in main () at assert.c:5
>>>>
>>>> What is your practical problem with the current implementation? Do
>>>> you expect a specific SIGDEBUG reason?
>>>
>>> A better stacktrace. (I actually cut the trace in the signal handler
>>> in case of hitting an __assert_fail)
>>
>> The backtrace should still point to the right function that caused the
>> migration.
>> I miss cobalt_assert_nrt() in your backtrace though, but that should have
>> nothing to do with how it is implemented. Are you actually using
>> cobalt_assert_nrt() from libcobalt?
>
> Yes, but I dlsym it.
> I would prefer if the cobalt_assert_nrt would be the start of the trace.
>

That it always does under normal constraints - please check your local
setup, this is not a generic problem. It's your pchecker.h:199 which
issues the syscall directly, rather than calling cobalt_assert_nrt().
Maybe that's because of lazy symbol resolution?

>>
>>> BTW, __cobalt_pthread_kill(pthread_self(), SIGDEBUG) doesn’t seem to do a
>>> thing, doesn’t handle SIGDEBUG?
>>>
>>
>> It only triggers the signal (in one way or another...). Handling is up to the
>> application. If you don't handle that, the application is terminated,
>> obviously.
>
> The application continues running. But I did not try with
> __cobalt_pthread_kill(pthread_self(), SIGDEBUG)
> but XENOMAI_SYSCALL2(sc_cobalt_thread_kill, thread, sig).
> Means the cobalt syscall is not handling the signal.

A syscall does not handle signals. By calling the cobalt version of
pthread_kill, you queue the signal for synchronous RT processing (sigwait).

>
> So, to satisfy my OCD, toggling the modeswitch signals off/on would be
> correct I guess
>
> pthread_setmode_np(PTHREAD_WARNSW, 0, NULL);
> pthread_kill(pthread_self(), SIGDEBUG);
> pthread_setmode_np(0, PTHREAD_WARNSW, NULL);
>
> or even just using a Linux syscall:
>
> getpid();

A syscall will remain the source of the signal, no change on the origin
of the backtrace.

>
> Point being that right now you trap at least twice

That is a different point. So far, you were complaining about getting a
wrong backtrace, which is not caused by triggering a SIGDEBUG twice. If
you want to prevent a duplicate event, triggering a syscall only or
disabling the warning for the syscall itself can be options. But I
consider this really a minor issue.

Jan

-- 
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux
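
To illustrate the point that the signal's source stays inside the C
library's pthread_kill, here is a minimal sketch in plain POSIX/glibc C,
without libcobalt. Assumptions: SIGXCPU stands in for SIGDEBUG (it matches
signo=24 in the trace above), and all demo_* names are hypothetical, not
Xenomai API. The handler records si_code and dumps a backtrace with the
glibc-specific backtrace() helpers, so you can see the libc frames sitting
above the caller.

```c
#define _GNU_SOURCE
#include <execinfo.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>

/* Assumption: stand-in for Xenomai's SIGDEBUG so this builds without libcobalt. */
#define DEMO_SIGDEBUG SIGXCPU

static volatile sig_atomic_t demo_hits;     /* number of handler invocations */
static volatile sig_atomic_t demo_si_code;  /* si_code of the last signal */
static volatile sig_atomic_t demo_depth;    /* frames seen by the handler */

/* Hypothetical handler: records where the signal came from and dumps the
 * stack to stderr. backtrace() is a glibc extension and not formally
 * async-signal-safe; acceptable for a debugging sketch only. */
static void demo_sigdebug_handler(int sig, siginfo_t *si, void *ctx)
{
    void *frames[32];
    int n;

    (void)sig;
    (void)ctx;
    demo_si_code = si->si_code;
    n = backtrace(frames, 32);
    demo_depth = n;
    backtrace_symbols_fd(frames, n, 2);
    demo_hits++;
}

/* Install the handler with SA_SIGINFO so siginfo_t is passed in. */
static int demo_install_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = demo_sigdebug_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(DEMO_SIGDEBUG, &sa, NULL);
}

/* Hypothetical stand-in for the assert: signal the calling thread. */
static int demo_assert_nrt(void)
{
    return pthread_kill(pthread_self(), DEMO_SIGDEBUG);
}
```

Call demo_install_handler() once, then demo_assert_nrt(). On Linux the
handler runs before pthread_kill() returns, and the dumped trace starts
inside libc rather than in demo_assert_nrt(), regardless of which wrapper
issued the call.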