Pavel Kostyuchenko <shpr...@gmail.com> added the comment:

I was able to reproduce the error with version 
f13c5c8b9401a9dc19e95d8b420ee100ac022208 on FreeBSD 12.0 VM. The error seems to 
be caused not by those changes, but by lack of synchronization in the 
multiprocessing.managers.Server.
The failure happens when running the 
"test_shared_memory_SharedMemoryManager_basics" with high CPU load and frequent 
interrupts e.g. moving some window during test. Mostly I used the "python -m 
test --fail-env-changed test_multiprocessing_spawn -m 'WithProcessesTestS[hu]*' 
-F" command to reproduce the crash.
By analyzing core dumps I deduced that the crash happens during this call from 
the parent test process:

class BaseManager(object):
    def _finalize_manager(process, address, authkey, state, _Client):
        ...
            try:
                conn = _Client(address, authkey=authkey)
                try:
                    dispatch(conn, None, 'shutdown')
                finally:
                    conn.close()
            except Exception:
                pass

Main thread in the multiprocessing child:

class Server(object):
    def serve_forever(self):
        ...
        try:
            accepter = threading.Thread(target=self.accepter)
            accepter.daemon = True
            accepter.start()
            try:
                while not self.stop_event.is_set():
                    self.stop_event.wait(1)
            except (KeyboardInterrupt, SystemExit):
                pass
        finally:
            ...
            sys.exit(0)  << main thread have finished and destroyed the 
interpreter

Worker thread in the multiprocessing child.
Locals:
File "/usr/home/user/cpython/Lib/multiprocessing/managers.py", line 214, in 
handle_request
    c.send(msg)
        self = <SharedMemoryServer(....)>
        funcname = 'shutdown'
        result = None
        request = (None, 'shutdown', (), {})
        ignore = None
        args = ()
        kwds = {}
        msg = ('#RETURN', None)

Listing:
class Server(object):
    def handle_request(self, c):
        ...
            try:
                result = func(c, *args, **kwds)  << calls Server.shutdown method
            except Exception:
                msg = ('#TRACEBACK', format_exc())
            else:
                msg = ('#RETURN', result)
        try:
            c.send(msg)  << crashes with SIGBUS in _send_bytes -> write -> 
take_gil -> SET_GIL_DROP_REQUEST(tstate->interp)
        except Exception as e:
            try:
                c.send(('#TRACEBACK', format_exc()))
            except Exception:
                pass
    ...
    def shutdown(self, c):
        ...
        try:
            util.debug('manager received shutdown message')
            c.send(('#RETURN', None))
        except:
            import traceback
            traceback.print_exc()
        finally:
            self.stop_event.set()

Worker thread is daemonic and is not terminated during the interpreter 
finalization, thus it might still be running and is terminated silently when 
the process exits. The connection (c) has different implementations on several 
platforms, so we cannot be sure whether the connection is closed during 
shutdown or not, whether the last "c.send(msg)" blocks until the end of the 
process, returns instantly, or fails inconsistently.
The error was there for a long time, but for two reasons it didn't cause much 
trouble:
- the race condition is hard to trigger;
- SET_GIL_DROP_REQUEST used to ignore the errorneous state of interpreter, but 
introduction of tstate->interp argument by Eric manifested SIGBUS on FreeBSD.

I haven't managed to find a nice clean test to reproduce the bug automatically. 
I suggest the changes for the multiprocessing/managers.py in the attachment.

----------
nosy: +shprotx
Added file: https://bugs.python.org/file48333/managers.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33608>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to