> 2024年2月22日 上午1:55,Ilya Maximets <i.maxim...@ovn.org> 写道:
> 
> On 2/21/24 15:27, Aaron Conole wrote:
>> Daniel Ding <danieldin...@gmail.com> writes:
>> 
>>> After running ovs-tcpdump and inputs multiple CTRL+C, the program will
>>> raise the following exception.
>>> 
>>> Error in atexit._run_exitfuncs:
>>> Traceback (most recent call last):
>>>  File "/usr/bin/ovs-tcpdump", line 421, in cleanup_mirror
>>>    ovsdb = OVSDB(db_sock)
>>>  File "/usr/bin/ovs-tcpdump", line 168, in __init__
>>>    OVSDB.wait_for_db_change(self._idl_conn)  # Initial Sync with DB
>>>  File "/usr/bin/ovs-tcpdump", line 155, in wait_for_db_change
>>>    while idl.change_seqno == seq and not idl.run():
>>> 
>>> Signed-off-by: Daniel Ding <zhihui.d...@easystack.cn>
>>> ---
>> 
>> LGTM for the linux side - maybe Alin might check the windows side.
> 
> The code is copied from the fatal-signla library, so it is likely fine.
> However, I don;t think this issue should be fixed on the application
> level (ovs-tcpdump).   There seems to be adifference in how C and python
> fatal-signla libraries behave.  The C version calls the hooks in the run()
> function and the signal handler only updates the signal number variable.
> So, if the same signal arrives multiple times it will be handled only once
> and the run() function will not exit until all the hooks are done, unless
> it is a segfault.
> 
> With python implementation we're calling hooks right from the signal
> handler (it's not a real signal handler, so we're allowed to use not
> really singal-safe functions there).  So, if another signal arrives while
> we're executing the hooks, the second hook execution will be skipped
> due to 'recursion' check, but the signal will be re-raised immediately,
> killing the process.

As your suggestions, I rewrite recurse as a threading.Lock. So that protect 
calling hooks
applications registered.

> 
> There is likley a way to fix that by using a mutex instead of recursion
> check and blocking it while executing the hooks.  The signal number
> will need to be stored in the signal handler and the signal will need
> to be re-raised in the call_hooks() instead of signal handler.
> We can use mutex.acquire(blocking=False) as a way to prevent recursion.
> 

To avoid breaking calling hooks, I set the signal restored in the signal 
handler to SIG_IGN in 
call_hooks. And move re-raised the signal in the call_hooks with locked.

> Does that make sense?
> 
> Best regards, Ilya Maximets.

diff --git a/python/ovs/fatal_signal.py b/python/ovs/fatal_signal.py
index cb2e99e87..3aa56a166 100644
--- a/python/ovs/fatal_signal.py
+++ b/python/ovs/fatal_signal.py
@@ -16,6 +16,7 @@ import atexit
 import os
 import signal
 import sys
+import threading

 import ovs.vlog

@@ -112,28 +113,34 @@ def _unlink(file_):
 def _signal_handler(signr, _):
     _call_hooks(signr)

-    # Re-raise the signal with the default handling so that the program
-    # termination status reflects that we were killed by this signal.
-    signal.signal(signr, signal.SIG_DFL)
-    os.kill(os.getpid(), signr)
-

 def _atexit_handler():
     _call_hooks(0)


-recurse = False
+recurse = threading.Lock()


 def _call_hooks(signr):
     global recurse
-    if recurse:
+    if recurse.locked():
         return
-    recurse = True

-    for hook, cancel, run_at_exit in _hooks:
-        if signr != 0 or run_at_exit:
-            hook()
+    with recurse:
+        if signr != 0:
+            signal.signal(signr, signal.SIG_IGN)
+        else:
+            signal.signal(signal.SIGINT, signal.SIG_IGN)
+
+        for hook, cancel, run_at_exit in _hooks:
+            if signr != 0 or run_at_exit:
+                hook()
+
+        if signr != 0:
+           # Re-raise the signal with the default handling so that the program
+           # termination status reflects that we were killed by this signal.
+           signal.signal(signr, signal.SIG_DFL)
+           os.kill(os.getpid(), signr)


 _inited = False
@@ -165,7 +172,6 @@ def signal_alarm(timeout):

     if sys.platform == "win32":
         import time
-        import threading

         class Alarm (threading.Thread):
             def __init__(self, timeout):
--
2.43.0

Thanks Ilya Maximets, please help me review again.

Best regards, Daniel Ding.

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to