[ https://issues.apache.org/jira/browse/BEAM-8651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Valentyn Tymofieiev reassigned BEAM-8651:
-----------------------------------------

    Assignee: Valentyn Tymofieiev

> Python 3 portable pipelines sometimes fail with errors in StockUnpickler.find_class()
> -------------------------------------------------------------------------------------
>
>                 Key: BEAM-8651
>                 URL: https://issues.apache.org/jira/browse/BEAM-8651
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Valentyn Tymofieiev
>            Assignee: Valentyn Tymofieiev
>            Priority: Major
>
> Several Beam users [1,2] reported an error which happens on Python 3 in
> StockUnpickler.find_class.
> So far I've seen reports of the error on Python 3.5, 3.6, and 3.7.1, on Flink
> and Dataflow runners. On Dataflow runner so far I have seen this in streaming
> pipelines only, which use portable SDK worker.
> Typical stack trace:
> {noformat}
>   File "python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1148, in _create_pardo_operation
>     dofn_data = pickler.loads(serialized_fn)
>   File "python3.5/site-packages/apache_beam/internal/pickler.py", line 265, in loads
>     return dill.loads(s)
>   File "python3.5/site-packages/dill/_dill.py", line 317, in loads
>     return load(file, ignore)
>   File "python3.5/site-packages/dill/_dill.py", line 305, in load
>     obj = pik.load()
>   File "python3.5/site-packages/dill/_dill.py", line 474, in find_class
>     return StockUnpickler.find_class(self, module, name)
> AttributeError: Can't get attribute 'ClassName' on <module 'ModuleName' from 'python3.5/site-packages/filename.py'>
> {noformat}
> According to Guenther from [1]:
> {quote}
> This looks exactly like a race condition that we've encountered on Python
> 3.7.1: There's a bug in some older 3.7.x releases that breaks the
> thread-safety of the unpickler, as concurrent unpickle threads can access a
> module before it has been fully imported. See
> https://bugs.python.org/issue34572 for more information.
> The traceback shows a Python 3.6 venv so this could be a different issue
> (the unpickle bug was introduced in version 3.7). If it's the same bug then
> upgrading to Python 3.7.3 or higher should fix that issue. One potential
> workaround is to ensure that all of the modules get imported during the
> initialization of the sdk_worker, as this bug only affects imports done by
> the unpickler.
> {quote}
> Opening this for visibility. Current open questions are:
> 1. Find a minimal example to reproduce this issue.
> 2. Figure out whether users are still affected by this issue on Python 3.7.3.
> 3. Communicate a workaround for 3.5, 3.6 users affected by this.
> [1] https://lists.apache.org/thread.html/5581ddfcf6d2ae10d25b834b8a61ebee265ffbcf650c6ec8d1e69408@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
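
The workaround quoted above (importing every needed module before any unpickling starts) might be sketched roughly as follows. This is only an illustration, not Beam's own mechanism: the helper name preload_modules and the list MODULES_TO_PRELOAD are hypothetical, the listed standard-library modules are placeholders for whatever modules the pickled DoFns actually reference, and where to hook the call into sdk_worker initialization is not specified in this issue.

```python
import importlib
import sys

# Hypothetical placeholder list: replace with the modules that the pickled
# DoFns reference (the 'ModuleName' in the AttributeError above).
MODULES_TO_PRELOAD = ["json", "decimal", "collections"]


def preload_modules(module_names):
    """Eagerly import each module; return the names that could not be imported."""
    failed = []
    for name in module_names:
        try:
            # After this call the fully initialized module is in sys.modules,
            # so a later unpickler lookup does not need to trigger the import.
            importlib.import_module(name)
        except ImportError:
            failed.append(name)
    return failed


# Assumption: this runs once, on a single thread, before worker threads
# start unpickling.
failed_imports = preload_modules(MODULES_TO_PRELOAD)
if failed_imports:
    print("Could not preload: %s" % ", ".join(failed_imports), file=sys.stderr)
```

The idea is simply that the import completes on one thread before concurrent unpickling begins. Whether this avoids the failure on 3.5 and 3.6 is one of the open questions above.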