Good morning!

I'm chasing the origin of a random segfault that shows up while porting
Beremiz to Xenomai 3.

The Beremiz PLC runtime loads PLC logic as a shared object. Loading is
performed with a dlopen call from the Python interpreter. Each time the
PLC programmer tries a new program, the previous shared object is
dlclosed and the new program is dlopened.

Of course, there are in-depth checks to ensure that all
dlopen/dlclose/dlsym operations are done from the main thread only, and
that all real-time tasks and resources have been closed before dlclose.
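
To make the main-thread rule concrete, here is a minimal sketch of that
kind of guard. It is not the actual Beremiz code; the helper name
assert_main_thread is made up for illustration.

```python
import threading


def assert_main_thread(operation: str) -> None:
    """Raise if a dl* operation is attempted outside the main thread.

    Hypothetical helper, shown only to illustrate the check.
    """
    if threading.current_thread() is not threading.main_thread():
        raise RuntimeError(f"{operation} must be called from the main thread")
```

The runtime would call something like assert_main_thread("dlclose")
right before each dlopen/dlsym/dlclose.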

Also, I checked that the implicit call to xenomai_init_dso() really
happens when linking the shared object with bootstrap-pic.o. I also
tried an explicit call to xenomai_init() (either once at first load or
after every dlopen), with no change.

I tried the latest commit on this topic: "boilerplate/setup: introduce
destructors for __setup_call"
(5511e76040444af875ae1bb099c13a25b16336fc). Unfortunately it didn't
help, although it did remove the Xenomai "Bad syscall" warning that
sometimes appeared after dlclose.

The segfault never happens at the first reload, i.e.
dlopen/dlclose/dlopen never fails. You have to extend the sequence to at
least dlopen/dlclose/dlopen/dlclose/dlopen to see the crash. In other
words, the smokey/dlopen test doesn't try hard enough to catch the
problem. I have to reload about 6 times to get a crash. Also, the crash
seems more likely to occur if no symbol from the shared object was
called between dlopen and dlclose (although dlsym was called).
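
For reference, the reload sequence can be sketched from Python roughly
as below. This is only an illustration under assumptions, not the
Beremiz loader: reload_cycles is a made-up name, the path and symbol are
placeholders to be replaced by the real PLC shared object and one of its
entry points, and _ctypes.dlclose is a CPython-internal function
(POSIX only) rather than a public API.

```python
import ctypes
import _ctypes  # CPython-internal module; provides dlclose() on POSIX


def reload_cycles(path, symbol, cycles, call=False):
    """Run dlopen/dlsym/dlclose on `path` `cycles` times.

    Returns the number of completed cycles. With call=False the symbol
    is only looked up, never called, which is the case that seems to
    crash more often. Hypothetical helper, for illustration only.
    """
    done = 0
    for _ in range(cycles):
        lib = ctypes.CDLL(path)        # dlopen()
        func = getattr(lib, symbol)    # dlsym()
        if call:
            func()                     # assumes a no-argument function
        _ctypes.dlclose(lib._handle)   # dlclose()
        done += 1
    return done
```

Something like reload_cycles("./plc.so", "some_entry_point", 8), with
both names being placeholders, covers more reloads than the 6 or so I
need to see the crash.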

Enabling full Xenomai debug didn't reveal any more detail about the
crash. Post-mortem debugging (gdb -c core) works, but gdb can't give me
a usable backtrace:

(gdb) bt
#0  0x00007fb8 in ?? ()
#1  0xb520c098 in ?? ()

Is there a way to get gdb to tell a bit more about what happens in
boilerplate/copperplate? How can I find where it crashes?
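
The only thing I can think of on the Python side is the standard
faulthandler module, sketched below. It would only show which Python
call was active when the signal arrived, not the native frames inside
boilerplate/copperplate, so it is a partial aid at best. The helper name
enable_crash_traces and the log path are made up for illustration.

```python
import faulthandler


def enable_crash_traces(log_path):
    """Dump the Python traceback of all threads on a fatal signal.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS
    and SIGILL. The returned file must stay open for as long as the
    handler is enabled. Hypothetical helper, for illustration only.
    """
    log = open(log_path, "w")
    faulthandler.enable(file=log, all_threads=True)
    return log
```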

Cheers,

Edouard



