It is quite common for applications to set an alarm() and when it expires,
set a new alarm() from the signal handler. For example, this is what the
ping application does, and before this patch it often crashes after a
random number of times doing this.

The problem starts with alarm() wanting to know which thread ran it.
It needs this information so when an alarm happens it can interrupt a
system call sleeping in this specific thread. But unfortunately, when
alarm() runs from a signal handler (as described above), in OSv this
signal handler is a separate transient thread. Before this patch, we
remember *this* thread, and when the alarm next happens we try to wake
that thread, which no longer exists at that time. The result is of course
undefined, and depending on what else happened since, may result in a crash.

The solution in this patch is for alarm() not to change the old owner
thread if called from a signal handler thread. We *hope* that the previous
one is still relevant, and it will be true for most applications such as
ping. This workaround is kind of ugly, but it does fix the ping crash
described in issue 1073.

Fixes #1073.

Signed-off-by: Nadav Har'El <n...@scylladb.com>
---
 libc/signal.cc | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/libc/signal.cc b/libc/signal.cc
index ac105568..57cad35a 100644
--- a/libc/signal.cc
+++ b/libc/signal.cc
@@ -496,11 +496,26 @@ void itimer::work()
     }
 }
 
+// alarm() wants to know which thread ran it, so when the alarm happens it
+// can interrupt a system call sleeping on this thread. But there's a special
+// case: if alarm() is called in a signal handler (which currently in OSv is
+// a separate thread), this probably means the alarm was re-set after the
+// previous alarm expired. In that case we obviously don't want to remember
+// the signal handler thread (which will go away almost immediately). What
+// we'll do in the is_signal_handler() case is to just keep remembering the
+// old owner thread, hoping it is still relevant...
+static bool is_signal_handler(){
+    // must be the same name used in kill() above
+    return sched::thread::current()->name() == "signal_handler";
+}
+
 void itimer::cancel()
 {
     _due = _no_alarm;
     _interval = decltype(_interval)::zero();
-    _owner_thread = nullptr;
+    if (!is_signal_handler()) {
+        _owner_thread = nullptr;
+    }
     _cond.wake_one();
 }
 
@@ -513,7 +528,9 @@ void itimer::set_value(const struct timeval *tv)
         _started = true;
     }
     _due = now + tv->tv_sec * 1_s + tv->tv_usec * 1_us;
-    _owner_thread = sched::thread::current();
+    if (!is_signal_handler()) {
+        _owner_thread = sched::thread::current();
+    }
     _cond.wake_one();
 }
 
-- 
2.21.1

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to osv-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/osv-dev/20200126133853.1351-1-nyh%40scylladb.com.

Reply via email to