32: Add AT_SYSINFO cancellation helpers

Rich Felker Thu, 10 Mar 2016 16:49:49 -0800

On Fri, Mar 11, 2016 at 01:18:54AM +0100, Szabolcs Nagy wrote:
> * Rich Felker <dal...@libc.org> [2016-03-10 18:28:20 -0500]:
> > On Thu, Mar 10, 2016 at 07:03:31PM +0100, Ingo Molnar wrote:
> > > 
> > > * Rich Felker <dal...@libc.org> wrote:
> > > 
> > > > > So instead of a sticky cancellation flag, we could introduce a sticky 
> > > > > cancellation signal.
> > > > > 
> > > > > A 'sticky signal' is not cleared from signal_pending() when the 
> > > > > signal handler 
> > > > > executes, but it's automatically blocked so no signal handler 
> > > > > recursion 
> > > > > occurs. (A sticky signal could still be cleared via a separate 
> > > > > mechanism, by 
> > > > > the cancellation cleanup code.)
> > > > > 
> > > > > Such a 'sticky cancellation signal' would, in the racy situation, 
> > > > > cause new 
> > > > > blocking system calls to immediately return with -EINTR. Non-blocking 
> > > > > syscalls 
> > > > > could still be used. (So the cancellation signal handler itself would 
> > > > > still 
> > > > > have access to various fundamental system calls.)
> > > > > 
> > > > > I think this would avoid messy coupling between the kernel's 
> > > > > increasingly more 
> > > > > varied system call entry code and C libraries.
> > > > > 
> > > > > Sticky signals could be requested via a new SA_ flag.
> > > > > 
> > > > > What do you think?
> > > > 
> > > > This still doesn't address the issue that the code making the syscall 
> > > > needs to 
> > > > be able to control whether it's cancellable or not. Not only do some 
> > > > syscalls 
> > > > whose public functions are cancellation points need to be used 
> > > > internally in 
> > > > non-cancellable ways; there's also the pthread_setcancelstate interface 
> > > > that 
> > > > allows deferring cancellation so that it's possible to call functions 
> > > > which are 
> > > > cancellation points without invoking cancellation.
> > > 
> > > I don't think there's a problem - but I might be wrong:
> > > 
> > > One way I think it would work is the following: a sticky signal is not 
> > > the 
> > > cancellation flag - it's a helper construct to implement the flag in 
> > > user-space in 
> > > a race-free way.
> > > 
> > > Say you have RT signal-32 as the cancellation signal, and it's a sticky 
> > > signal.
> > > 
> > > When pthread_cancel() wants to cancel another thread, it first 
> > > (atomically) sets 
> > > the desired cancel state of the target thread. If that state signals that 
> > > the 
> > > thread is cancellable right now, and that we initiated its cancellation, 
> > > then we 
> > > send signal-32. I.e. the signal only ever gets sent if the thread is in a 
> > > cancellable state.
> > > 
> > > libc internal functions and the pthread_setcancelstate() API can 
> > > temporarily 
> > > change the cancel state of a thread to non-cancellable - but 
> > > pthread_cancel() 
> > > observes those state transitions.
> > > 
> > > The 'sticky' nature of signal-32 will make a difference in the following 
> > > race 
> > > condition, if the cancellation flag is checked before a system call by 
> > > the C 
> > > library, and signal-32 arrives before the system call is executed. In 
> > > that case 
> > > the 'sticky' nature of the signal makes sure that all subsequent system 
> > > calls 
> > > return immediately.
> > > 
> > > The sticky signal is only ever sent when the thread is in cancellable 
> > > state - and 
> > > if the target thread notices the cancellation request before the signal 
> > > arrives, 
> > > it first waits for its arrival before executing any new system calls (as 
> > > part of 
> > > the teardown, etc.).
> > > 
> > > So the C library never has to do complex work with a sticky signal 
> > > pending.
> > > 
> > > Does that make more sense to you?
> > 
> > No, it doesn't work. Cancellability of the target thread at the time
> > of the cancellation request (when you would decide whether or not to
> > send the signal) has no relation to cancellability at the time of
> > calling the cancellation point. Consider 2 threads A and B and the
> > following sequence of events:
> > 
> > 1. A has cancellation enabled
> > 2. B calls pthread_cancel(A) and sets sticky pending signal
> > 3. A disables cancellation
> > 4. A calls cancellation point and syscall wrongly gets interrupted
> > 
> > This can be solved with more synchronization in pthread_cancel and
> > pthread_setcancelstate, but it seems costly. pthread_setcancelstate
> > would have to clear pending sticky cancellation signals, and any
> > internal non-cancellable syscalls would have to be made using the same
> > mechanism (effectively calling pthread_setcancelstate). A naive
> > implementation of such clearing would involve a syscall itself,
> 
> i think a syscall in setcancelstate in case of pending sticky signal
> is not that bad given that cancellation is very rarely used.


I agree, but it's not clear to me whether you could eliminate syscalls
in the case where it's not pending, since AS-safe lock machinery is
hard to get right. I don't see a way it can be done with just atomics
because the syscall that sends the signal cannot be atomic with the
memory operating setting a flag, which suggests a lock is needed, and
then there are all sorts of issues to deal with.

> however maintaining two completely different cancellation designs
> is expensive and only the current one works on old kernels.

Indeed. I think it would be hard to justify supporting a new one in
musl unless there's some easy way to isolate the complexity of having
both, being that vdso syscall is of marginal value to begin with
anyway...

Rich

Re: [musl] Re: [RFC PATCH] x86/vdso/32: Add AT_SYSINFO cancellation helpers

Reply via email to