I am not quite sure I can fully assess if this patch is correct but it 
seems to be. I am guessing the alarm() API is not thread-safe, meaning 
handle multiple threads setting alarm and expecting to be notified 
accordingly. So you patch does not need to handle it, right?

Also the underlying assumption is that that the original thread will 
eventually call cancel() so that _owner_thread is set back to null, right?

On Sunday, January 26, 2020 at 8:38:57 AM UTC-5, Nadav Har'El wrote:
>
> It is quite common for applications to set an alarm() and when it expires, 
> set a new alarm() from the signal handler. For example, this is what the 
> ping application does, and before this patch it often crashes after a 
> random number of times doing this. 
>
> The problem starts with alarm() wanting to know which thread ran it. 
> It needs this information so when an alarm happens it can interrupt a 
> system call sleeping in this specific thread. But unfortunately, when 
> alarm() runs from a signal handler (as described above), in OSv this 
> signal handler is a separate transient thread. Before this patch, we 
> remember *this* thread, and when the alarm next happens we try to wake 
> that thread, which no longer exists at that time. The result is of course 
> undefined, and depending on what else happened since, may result in a 
> crash. 
>
> The solution in this patch is for alarm() not to change the old owner 
> thread if called from a signal handler thread. We *hope* that the previous 
> one is still relevant, and it will be true for most applications such as 
> ping. This workaround is kind of ugly, but it does fix the ping crash 
> described in issue 1073. 
>
> Fixes #1073. 
>
> Signed-off-by: Nadav Har'El <n...@scylladb.com <javascript:>> 
> --- 
>  libc/signal.cc | 21 +++++++++++++++++++-- 
>  1 file changed, 19 insertions(+), 2 deletions(-) 
>
> diff --git a/libc/signal.cc b/libc/signal.cc 
> index ac105568..57cad35a 100644 
> --- a/libc/signal.cc 
> +++ b/libc/signal.cc 
> @@ -496,11 +496,26 @@ void itimer::work() 
>      } 
>  } 
>   
> +// alarm() wants to know which thread ran it, so when the alarm happens 
> it 
> +// can interrupt a system call sleeping on this thread. But there's a 
> special 
> +// case: if alarm() is called in a signal handler (which currently in OSv 
> is 
> +// a separate thread), this probably means the alarm was re-set after the 
> +// previous alarm expired. In that case we obviously don't want to 
> remember 
> +// the signal handler thread (which will go away almost immediately). 
> What 
> +// we'll do in the is_signal_handler() case is to just keep remembering 
> the 
> +// old owner thread, hoping it is still relevant... 
> +static bool is_signal_handler(){ 
> +    // must be the same name used in kill() above 
> +    return sched::thread::current()->name() == "signal_handler"; 
> +} 
> + 
>  void itimer::cancel() 
>  { 
>      _due = _no_alarm; 
>      _interval = decltype(_interval)::zero(); 
> -    _owner_thread = nullptr; 
> +    if (!is_signal_handler()) { 
> +        _owner_thread = nullptr; 
> +    } 
>      _cond.wake_one(); 
>  } 
>   
> @@ -513,7 +528,9 @@ void itimer::set_value(const struct timeval *tv) 
>          _started = true; 
>      } 
>      _due = now + tv->tv_sec * 1_s + tv->tv_usec * 1_us; 
> -    _owner_thread = sched::thread::current(); 
> +    if (!is_signal_handler()) { 
> +        _owner_thread = sched::thread::current(); 
> +    } 
>      _cond.wake_one(); 
>  } 
>   
> -- 
> 2.21.1 
>
>

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to osv-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/osv-dev/56ce391a-7ce7-4441-99c5-a35942efd2d6%40googlegroups.com.

Reply via email to