Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Philippe Gerum wrote:
>>>> Jan Kiszka wrote:
>>>>> Philippe Gerum wrote:
>>>>>> Jan Kiszka wrote:
>>>>>>> Philippe Gerum wrote:
>>>>>>>> Jan Kiszka wrote:
>>>>>>>>> commit 728fc8970e2032b3280971788f1223f3ad82d80d
>>>>>>>>> Author: Jan Kiszka <jan.kis...@siemens.com>
>>>>>>>>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>>>>>>>>
>>>>>>>>>     xnpipe: Fix racy callback handlers
>>>>>>>>>     
>>>>>>>>>     Invocation of the input, output and alloc handlers must take
>>>>>>>>>     place under nklock to properly synchronize with
>>>>>>>>>     xnpipe_disconnect. Change all callers to comply with this policy.
>>>>>>>>>
>>>>>>>> That one is under investigation. I agree on the bug report (thanks
>>>>>>>> btw), but I disagree on the fix. Basically, we can't run all hooks
>>>>>>>> under nklock. For instance, the alloc_handler may issue kmalloc()
>>>>>>>> calls when invoked from the Linux write endpoint.
>>>>>>> You mean it /could/? Because no in-tree user (i.e. native) calls
>>>>>>> rt-unsafe services from its alloc_handler.
>>>>>>>
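For illustration, the kind of out-of-tree handler this concern is about could
look like the sketch below. The handler name and exact signature are made up
for the example (they are not taken from the tree); the only point is that a
GFP_KERNEL allocation may sleep and therefore must not run under nklock with
interrupts off:

    #include <linux/slab.h>  /* kmalloc() */

    /* Hypothetical skin-level buffer allocator installed as the xnpipe
     * alloc handler, reached from the Linux write() endpoint. */
    static void *demo_pipe_alloc_handler(size_t size, void *xstate)
    {
            (void)xstate;
            /* GFP_KERNEL allocations may sleep, which is invalid while
             * nklock (a raw spinlock) is held with interrupts off. */
            return kmalloc(size, GFP_KERNEL);
    }

As noted above, the in-tree native skin only calls rt-safe services from its
alloc handler, so only such hypothetical users would be affected by running
the hooks under nklock.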
>>>>>> When you export a public interface, it is better not to make it
>>>>>> incompatible unless there is no other way to fix a situation. Doing
>>>>>> so is a last resort for me.
>>>>> OTOH, there is nothing documented yet about those callback handlers
>>>>> or xnpipe_connect. So we could only break _assumptions_ about this
>>>>> interface.
>>>> Actually, we would break existing implementations, but I understand
>>>> your point.
>>>>
>>>>> But, of course, I would be happy if we could continue to keep
>>>>> the critical section length short. I just don't see how to achieve this
>>>>> without significant restrictions on the callback handlers and their use
>>>>> cases.
>>>>>
>>>> That is because the semantics of those callouts are not that smart. If
>>>> we have to break the API to solve the locking issue, I want to get the
>>>> semantics fixed in the same move (which may help a lot in solving the
>>>> locking issue as well), so we don't end up with two subsequent
>>>> breakages.
>>>>
>>> As suspected, the callback management in the message pipe code had very
>>> serious issues, including the xnpipe_disconnect / xnpipe_release race.
>>> At least:
>>>
>>> - the latency would simply skyrocket when the input queue (i.e. userland
>>> to kernel) is flushed from xnpipe_cleanup_user_conn, because the latter
>>> holds the nklock with interrupts off while doing so. I've tested this
>>> case on v2.4.x; the results of a latency test run while the pipe test
>>> code executes in parallel are not pretty (i.e. > 220 us latency spots).
>>>
>>> - message buffers still waiting in the output queue when the latter is
>>> flushed would NOT be returned to the memory pool when an input handler
>>> is present, which is the case with the native API since it provides one.
>>> All the confusion came from the output notification and free-buffer
>>> logic being combined into the output handler. This went unnoticed
>>> probably because messages sent via the write() endpoint are immediately
>>> consumed by the receiving side in an RT task that preempts, but still,
>>> the bug is there.
>>>
>>> The disconnect vs release race can be fixed in a (nearly) lockless way
>>> using a lingering close sequence, which prevents xnpipe_disconnect()
>>> from wiping out the extra state (e.g. the RT_PIPE struct) until
>>> xnpipe_release() no longer needs it. In short, if the userland side is
>>> connected, xnpipe_disconnect() will defer the actual release of that
>>> extra state to xnpipe_release(); otherwise, it will discard it
>>> immediately. A new callback, namely .release, has been added for that
>>> purpose.
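Spelled out as a simplified sketch (this is not the verbatim tree code, and it
assumes the status bit and hook touched by the patch further down), the
kernel-side disconnect path boils down to the following, with nklock held on
entry and s, state and minor in scope:

    /* Inside xnpipe_disconnect(), nklock held, irqs off. */
    if (testbits(state->status, XNPIPE_USER_CONN)) {
            /* Userland still holds the device open: flag a lingering
             * close; xnpipe_release() will run the hook later on. */
            __setbits(state->status, XNPIPE_KERN_LCLOSE);
            xnlock_put_irqrestore(&nklock, s);
    } else {
            /* No userland side: drop nklock, then release the extra
             * state (e.g. the RT_PIPE struct) and free the minor. */
            xnlock_put_irqrestore(&nklock, s);
            state->ops.release(state->xstate);
            xnpipe_minor_free(minor);
    }

The macro hunk in the patch below performs the mirrored sequence on the
userland close path once it finds XNPIPE_KERN_LCLOSE set.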
>> Just had a short glance so far. Looks good - except that there is still
>> a use after release: the cookie (now xstate) from xnpipe_state is still
>> used after xnpipe_minor_free. Can't we push the minor_free to after the
>> call to the release handler?
>>
> 
> We should indeed postpone this, just in case the upper layer indexes the
> extra state on the minor value. We can also simplify a few things by
> doing so.
> 
> --- ksrc/nucleus/pipe.c       (revision 4565)
> +++ ksrc/nucleus/pipe.c       (working copy)
> @@ -77,11 +77,9 @@
> 
>  static inline void xnpipe_minor_free(int minor)
>  {
> -     if (minor < 0 || minor >= XNPIPE_NDEVS)
> -             return;
> -
> -     __clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
> -               1UL << (minor % BITS_PER_LONG));

xnarch_write_memory_barrier()? Just for the paranoiacs.

> +     /* May be called with nklock free. */
> +     clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
> +             1UL << (minor % BITS_PER_LONG));
>  }
> 
>  static inline void xnpipe_enqueue_wait(struct xnpipe_state *state, int mask)
> @@ -413,9 +411,9 @@
>               __setbits(state->status, XNPIPE_KERN_LCLOSE);
>               xnlock_put_irqrestore(&nklock, s);
>       } else {
> -             xnpipe_minor_free(minor);
>               xnlock_put_irqrestore(&nklock, s);
>               state->ops.release(state->xstate);
> +             xnpipe_minor_free(minor);
>       }
> 
>       if (need_sched)
> @@ -621,9 +619,9 @@
>               __clrbits((__state)->status, XNPIPE_USER_CONN);         \
>               if (testbits((__state)->status, XNPIPE_KERN_LCLOSE)) {  \
>                       clrbits((__state)->status, XNPIPE_KERN_LCLOSE); \
> -                     xnpipe_minor_free(xnminor_from_state(__state)); \
>                       xnlock_put_irqrestore(&nklock, (__s));          \
>                       (__state)->ops.release((__state)->xstate);      \
> +                     xnpipe_minor_free(xnminor_from_state(__state)); \
>                       xnlock_get_irqsave(&nklock, (__s));             \
>               }                                                       \
>       } while(0)
> 

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
