Hi,

I recently looked into dlopen()/dlclose() again, because the test case I
wrote did not really work.

What I found is that dlclose() currently cannot work because we lack
destructor code for the core and skins. Removing the setup_descriptors
from the list is only a small fraction of what would actually be needed.

In fact we would need the reverse of
"struct setup_descriptor->init()" for every file in lib/*/init.c.
On dlclose() one would need to call those and remove the descriptor
from the list. But since there are no destructors, we leak
half-initialized state, because dlclose() might not actually free all
of its memory/state.

So for now, dlclose() is not supported; restart your
application if you need another "plugin".
Enabling dlclose() is certainly possible, but not trivial.
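
To illustrate what "the reverse of init()" would mean, here is a minimal,
self-contained sketch. It is not Xenomai code: struct demo_descriptor, the
fini() hook and all demo_* names are hypothetical, and the real
struct setup_descriptor has no such destructor today. The idea is only
that unregistering runs the teardown first and unlinks the entry second.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor; fini() is the assumed reverse of init(). */
struct demo_descriptor {
	const char *name;
	int (*init)(void);
	void (*fini)(void);
	int done;                       /* set once init() has succeeded */
	struct demo_descriptor *next;
};

struct demo_descriptor *demo_list;      /* newest entry first */

/* On load: run init(), and link the entry in only if it succeeded. */
int demo_register(struct demo_descriptor *d)
{
	int ret;

	assert(d != NULL);
	ret = d->init ? d->init() : 0;
	if (ret)
		return ret;             /* nothing stays registered */
	d->done = 1;
	d->next = demo_list;
	demo_list = d;
	return 0;
}

/* Before unload: undo init() first, then unlink the entry. */
void demo_unregister(struct demo_descriptor *d)
{
	struct demo_descriptor **p;

	for (p = &demo_list; *p; p = &(*p)->next) {
		if (*p != d)
			continue;
		if (d->done && d->fini)
			d->fini();
		d->done = 0;
		*p = d->next;
		d->next = NULL;
		return;
	}
}

/* Number of descriptors still on the list. */
size_t demo_count(void)
{
	const struct demo_descriptor *d;
	size_t n = 0;

	for (d = demo_list; d; d = d->next)
		n++;
	return n;
}

/* Example "core" with state that must not survive an unload. */
int demo_core_ready;
static int demo_core_init(void) { demo_core_ready = 1; return 0; }
static void demo_core_fini(void) { demo_core_ready = 0; }

struct demo_descriptor demo_core = {
	.name = "demo-core",
	.init = demo_core_init,
	.fini = demo_core_fini,
};
```

In a shared object, demo_register() and demo_unregister() would typically
be driven from functions marked __attribute__((constructor)) and
__attribute__((destructor)), so that dlclose() triggers the teardown.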

Could you please have a look at
http://www.xenomai.org/pipermail/xenomai/2018-April/038821.html

That will actually "prevent" dlclose.
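
For reference, the effect of "nodelete" can be observed from the
application side too. The sketch below does not reproduce the patch (which
marks the libraries at link time); it uses the run-time counterpart, the
RTLD_NODELETE flag of dlopen(), and then probes with RTLD_NOLOAD whether
the object is still resident after dlclose(). The helper name
resident_after_dlclose() is made up for this example.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/*
 * Returns 1 if the object is still loaded after dlclose(), 0 if it
 * was unloaded, -1 if it could not be loaded in the first place.
 */
int resident_after_dlclose(const char *path, int extra_flags)
{
	void *h, *probe;

	assert(path != NULL);
	h = dlopen(path, RTLD_NOW | extra_flags);
	if (h == NULL)
		return -1;
	dlclose(h);

	/* RTLD_NOLOAD never loads anything: it only hands back a
	 * handle if the object is already resident. */
	probe = dlopen(path, RTLD_NOW | RTLD_NOLOAD);
	if (probe == NULL)
		return 0;
	dlclose(probe);
	return 1;
}
```

Called as resident_after_dlclose("libm.so.6", RTLD_NODELETE), this should
report the library as still resident, i.e. the dlclose() became a no-op as
far as unloading is concerned. Older glibc versions need -ldl at link time.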

Henning

On Thu, 26 Apr 2018 11:24:19 +0200,
Edouard Tisserant <edouard.tisser...@gmail.com> wrote:

> One more question. Sorry for flooding the list.
> 
> As a workaround to avoid leaking memory, I would like to try this :
> 
>  - xeno_stub.so : stub library, linked with bootstrap-pic.o
>  - 1.so, 2.so, ... n.so : libraries calling alchemy/posix realtime
> resources, NOT linked with bootstrap-pic.o
> 
> Process life-cycle would be :
> 
> - process start
> 
> dlopen(xeno_stub.so)
> 
> dlopen(1.so)
> dlsym + call 1.so
> dlclose(1.so)
> ...
> dlopen(n.so)
> dlsym + call n.so
> dlclose(n.so)
> 
> dlclose(xeno_stub.so)
> 
> - process end
> 
> Is it correct to assume that this way, pointers set up while calling
> xenomai_init() as a side effect of the first dlopen() would stay valid
> while other non-bootstrap-pic libraries are loaded and unloaded ?
> 
> 
> On 26/04/2018 09:39, Edouard Tisserant wrote:
> > @Henning Schild :
> >
> > I just saw your message from Tuesday : '[PATCH 3/3] build: link
> > dlopen libs with "nodelete"'
> >
> > Is "nodelete" the only way to make it stable ? Does it apply only to
> > Xenomai libs (i.e. alchemy, copperplate) or should it also apply to
> > the final shared object that uses Xenomai libs ? 
> >
> > Edouard
> >
> >
> > On 26/04/2018 09:02, Edouard Tisserant wrote:  
> >> Good Morning !
> >>
> >> I'm chasing the origins of a random segfault when porting Beremiz
> >> to Xenomai 3.
> >>
> >> Beremiz PLC runtime loads PLC logic as a shared object. Loading is
> >> performed as a dlopen call from the python interpreter. Each time the
> >> PLC programmer tries a new program, the previous shared object is
> >> dlclosed and the new program is dlopened.
> >>
> >> Of course, there are in-depth checks to ensure that all
> >> dlopen/dlclose/dlsym operations are done from the main thread only,
> >> and it is ensured that all real time tasks and resources have been
> >> closed before dlclose.
> >>
> >> Also, I did check that implicit call to xenomai_init_dso() really
> >> happens, when linking shared object with bootstrap-pic.o . I also
> >> tried explicit call to xenomai_init (once at first load or after
> >> every dlopen), no change.
> >>
> >> I tried last commit about this topic : "boilerplate/setup:
> >> introduce destructors for __setup_call"
> >> (5511e76040444af875ae1bb099c13a25b16336fc). It didn't help,
> >> unfortunately, but did remove Xenomai "Bad syscall" warning
> >> sometimes after dlclose.
> >>
> >> The segfault never happens at the first reload, i.e.
> >> dlopen/dlclose/dlopen never fails. You have to at least extend the
> >> sequence to dlopen/dlclose/dlopen/dlclose/dlopen to see the crash.
> >> In other words, the smokey/dlopen test doesn't try hard enough to
> >> catch the problem. I have to reload about 6 times to get a crash.
> >> Also, it seems that the crash has a higher probability to occur if
> >> no symbol was called from the shared object in between dlopen and
> >> dlclose (dlsym was called).
> >>
> >> Enabling full Xenomai debug didn't display more details on the
> >> crash. Post-mortem debug (gdb -c core) works, but gdb can't give
> >> me any backtrace :
> >>
> >> (gdb) bt
> >> #0  0x00007fb8 in ?? ()
> >> #1  0xb520c098 in ?? ()
> >>
> >> Is there a way to make gdb tell a bit more about what happens in
> >> boilerplate/copperplate ? How can I find where it crashes ?
> >>
> >> Cheers,
> >>
> >> Edouard
> >>
> >>  
> >  
> 
> 
> 
> _______________________________________________
> Xenomai mailing list
> Xenomai@xenomai.org
> https://xenomai.org/mailman/listinfo/xenomai

