On 11.01.21 14:14, Philippe Gerum wrote: > > Jan Kiszka <[email protected]> writes: > >> On 09.01.21 18:01, Philippe Gerum wrote: >>> >>> Jan Kiszka <[email protected]> writes: >>> >>>> On 23.12.20 11:40, Philippe Gerum wrote: >>>>> >>>>> Jan Kiszka <[email protected]> writes: >>>>> >>>>>> On 18.12.20 15:19, Philippe Gerum via Xenomai wrote: >>>>>>> >>>>>>> This wiki page [1] contains a draft proposal about specifying which >>>>>>> services from the current RTDM interface should be part of the Common >>>>>>> Xenomai Platform. Some proposals for deprecation stand out: >>>>>>> >>>>>>> - I suspect that only very few RTDM drivers are actually handling >>>>>>> requests from other kernel-based drivers in real world applications, >>>>>>> at least not enough to justify RTDM codifying these rare cases into a >>>>>>> common interface (rtdm_open, rtdm_read, rtdm_write etc). >>>>>>> >>>>>>> In other words, although I would agree that a few particular drivers >>>>>>> might want to export a couple of services to kernel-based clients in >>>>>>> order to provide them some sort of backchannel, it seems wrong to >>>>>>> require RTDM drivers to provide a kernel interface which would match >>>>>>> their user interface in the same terms. For these specific cases, ad >>>>>>> hoc code in these few drivers should be enough. >>>>>>> >>>>>>> Besides, I believe that most kernel->kernel request paths implemented >>>>>>> by in-tree RTDM drivers have never been tested for years, if ever. >>>>>>> Meanwhile, this kernel->kernel API introduces a basic exception case >>>>>>> into many RTDM and driver code paths, e.g. for differentiating kernel >>>>>>> vs user buffers, for only very few use cases. >>>>>>> >>>>>>> For these reasons, I would suggest to deprecate the kernel->kernel API >>>>>>> from RTDM starting from 3.3, excluding it from the CXP in the same >>>>>>> move. >>>>>> >>>>>> That's fine with me. The idea was once that something like bus drivers >>>>>> would appear, but that never happened. >>>>>> >>>>>>> >>>>>>> - RTDM_EXECUTE_ATOMICALLY() and related calls relying on the Cobalt big >>>>>>> lock must go. For SMP scalability reasons, this big lock was >>>>>>> eliminated from the EVL core, which means that all the attached >>>>>>> semantics will not hold there. Serializing access to shared resources >>>>>>> should be guaranteed by resource-specific locking, not by a giant >>>>>>> traffic light like the big lock implements. >>>>>> >>>>>> This is more complicated: RTDM_EXECUTE_ATOMICALLY was in fact deprecated >>>>>> long ago, but users were migrated to cobalt_atomic_enter/leave which may >>>>>> now make it look like we no longer need this. Maybe this is already the >>>>>> case when using rtdm_waitqueue*, but let's convert everyone first. >>>>> >>>>> Alternatively, In-tree v3 drivers could also keep relying >>>>> RTDM_EXECUTE_ATOMICALLY, the v4 implementation would be different for >>>>> them. Bottom line is to exclude from the CXP the whole idea that we may >>>>> schedule while holding a lock to protect against missed wake ups, in >>>>> general the very existence of any superlock which would cover everything >>>>> from top to bottom when serializing. I agree that having v3 converge >>>>> towards the CXP would be better though. >>>>> >>>> >>>> I'm fine with migrating to a new pattern first, drop that old RTDM >>>> pattern and declare the new one as migration path. Same for below. >>>> >>>>>> >>>>>>> >>>>>>> - rtdm_mutex_timedlock() has dubious semantics. Hitting a timeout >>>>>>> condition on grabbing a mutex either means that: >>>>>>> >>>>>> >>>>>> I think you are missing the use cases: >>>>>> >>>>>> mutex-lock-timed >>>>>> ... >>>>>> wait-event-timed >>>>>> ... >>>>>> mutex-unlock >>>>>> (which goes long with timeout sequences) >>>>>> >>>>> >>>>> There is a couple of issues with such use case: first we should never >>>>> ever sleep with a mutex held, this would trigger SIGDEBUG if done from >>>>> user ( a [binary] semaphore would at least prevent this problem), but >>>>> more importantly, how would this pattern allow the event to be signaled >>>>> given the waiter holds the lock the sender would need to acquire first? >>>> >>>> Just look at the existing drivers for the use cases (which obviously >>>> imply signalling without holding the mutex). >>>> >>> >>> Excluding RTDM_EXECUTE_ATOMICALLY() which has no in-tree user, what >>> remains is solving the issue for users of the cobalt_atomic_{enter, >>> leave} pattern, i.e.: >>> >>> kernel/drivers/can/rtcan_raw.c >>> kernel/drivers/can/rtcan_socket.c >>> kernel/drivers/ipc/bufp.c >>> kernel/drivers/ipc/iddp.c >>> kernel/drivers/ipc/rtipc.c >>> kernel/drivers/ipc/xddp.c >>> kernel/drivers/net/stack/rtmac/tdma/tdma_dev.c >>> kernel/drivers/testing/timerbench.c >>> kernel/drivers/udd/udd.c >>> >>> For the call sites listed about, AFAICS we'd need to: >>> >>> 1. move any blocking call out of the locking scope, by rewriting these >>> as wait loops rechecking the condition under lock if/when required. Only >>> a few would need the latter in fact, as in many cases >>> cobalt_atomic_leave() immediately follows the blocking call in the code >>> flow. >>> >>> 2. provide _nosched variants for signaling calls >>> (e.g. rtdm_event_pulse_nosched()) and use them, invoking xnsched_run() >>> out of lock as appropriate. >>> >>> However, I cannot find any code exhibiting the issue with mutexes in >>> these matches. Do you have an in-tree example of the problem you see to >>> point me at? >>> >> >> All serial drivers use mutexes with timeout in order to make write >> operations atomic and permit waiting for free buffers inside that atomic >> section. > > Ok, but none of these driver sleep with the superlock held, which is the > pattern I'm looking for at the moment. Sleeping with a mutex is a > different issue which also requires some fixing, but neither involves > the RTDM_EXECUTE_ATOMICALLY() or cobalt_atomic_enter/leave() constructs. >
It /could/ require it if it was not sleeping with mutexes. Jan -- Siemens AG, T RDA IOT Corporate Competence Center Embedded Linux
