On 9/6/19 6:32 PM, Jan Kiszka wrote:
> On 06.09.19 18:26, Philippe Gerum wrote:
>> On 9/6/19 6:19 PM, Philippe Gerum via Xenomai wrote:
>>> On 9/6/19 6:14 PM, Jan Kiszka wrote:
>>>> On 06.09.19 17:56, Philippe Gerum wrote:
>>>>> On 9/6/19 5:43 PM, Jan Kiszka wrote:
>>>>>> On 06.09.19 16:52, Philippe Gerum via Xenomai wrote:
>>>>>>> On 9/4/19 4:44 PM, Quirin Gylstorff via Xenomai wrote:
>>>>>>>>
>>>>>>>> I tested xenomai-next with xeno-test on 4.19.66+ amd64.
>>>>>>>> This patch triggers a general protection fault during the
>>>>>>>> execution of xeno-test. The log is attached.
>>>>>>>>
>>>>>>> Ok, I'll have a look. Thanks.
>>>>>>>
>>>>>>
>>>>>> I've started to dig into this a bit already but got distracted
>>>>>> multiple times. One thing I already found out: It's not a good
>>>>>> idea to close the fd on unregister without also ensuring that
>>>>>> it's no longer pending in rtdm_fd_cleanup_queue.
>>>>>
>>>>> Can you elaborate on the execution scenario that would allow this?
>>>>
>>>> Both the fd_cleanup_thread as well as rmmod are Linux tasks that can
>>>> run concurrently on separate cores. It might be unlikely, but it is
>>>> possible, nothing prevents this from happening.
>>>>
>>>
>>> I mean, given the maintenance logic between openfd_list and the
>>> cleanup_fd_list, which condition can make this possible?
>>>
>>
>> The answer may involve __put_fd() serializing on fdtree_lock, whilst
>> rtdm_dev_unregister() serializes on the ugly big nklock.
>>
>
> Indeed, unregister will not pick up anything that is in the cleanup
> queue. But what ensures that unregisters waits for fd_cleanup_thread to
> finish first?
>

__rtdm_dev_close() is interposed on ->close handling for every non-core
file descriptor (i.e. anything but timerfd and mqs), and that one holds
the last reference in dev->refcount. So until the cleanup thread
eventually calls __rtdm_dev_close() via ops->close(),
rtdm_dev_unregister() should sleep on putwq.

--
Philippe.
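
PS: to make the ordering argument above concrete, here is a minimal
user-space sketch of it. This is not the Xenomai code: every model_*
name is hypothetical, and a pthread mutex plus condition variable stands
in for the kernel wait queue. It only assumes what is stated above,
i.e. that the close handler drops the last device reference and that
unregister sleeps until the reference count reaches zero.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an RTDM device, NOT the real structure. */
struct model_dev {
	pthread_mutex_t lock;
	pthread_cond_t putwq;	/* models the putwq wait queue */
	int refcount;		/* models dev->refcount */
	bool close_ran;		/* set once the close handler has run */
};

/* Models the interposed close path: close, drop the ref, wake waiters. */
static void model_dev_close(struct model_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->close_ran = true;
	if (--dev->refcount == 0)
		pthread_cond_broadcast(&dev->putwq);
	pthread_mutex_unlock(&dev->lock);
}

/* Models the cleanup thread eventually running the close handler. */
static void *model_cleanup_thread(void *arg)
{
	model_dev_close(arg);
	return NULL;
}

/*
 * Models unregister: sleep until no reference is left, then report
 * whether the close handler had already run at that point.
 */
static bool model_dev_unregister(struct model_dev *dev)
{
	bool closed;

	pthread_mutex_lock(&dev->lock);
	while (dev->refcount > 0)
		pthread_cond_wait(&dev->putwq, &dev->lock);
	closed = dev->close_ran;
	pthread_mutex_unlock(&dev->lock);

	return closed;
}

/* One open fd holds one reference; cleanup and unregister race. */
bool model_run(void)
{
	struct model_dev dev = { .refcount = 1, .close_ran = false };
	pthread_t tid;
	bool closed;

	pthread_mutex_init(&dev.lock, NULL);
	pthread_cond_init(&dev.putwq, NULL);

	if (pthread_create(&tid, NULL, model_cleanup_thread, &dev) != 0)
		return false;

	closed = model_dev_unregister(&dev);
	pthread_join(tid, NULL);

	pthread_cond_destroy(&dev.putwq);
	pthread_mutex_destroy(&dev.lock);

	return closed;
}
```

Whichever of the two threads gets there first, model_run() can only
return true: unregister cannot get past the wait before the close
handler has dropped the reference. Whether the real code paths uphold
the same invariant in every case is exactly what needs checking.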
