[ https://issues.apache.org/jira/browse/BEAM-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919945#comment-16919945 ]
Valentyn Tymofieiev commented on BEAM-6158:
-------------------------------------------

The error happens when the main pipeline module has class methods that refer to superclass methods using super(). A reference to super in the method code creates a cyclical reference inside the object, which dill currently handles by pickling objects by reference. This approach does not work for restoring a pickled main session, since object classes need to be defined at the moment of unpickling.

This issue will be addressed after https://github.com/uqfoundation/dill/issues/300 is fixed, or once we start using CloudPickle as a pickler, which is being investigated in BEAM-8123.

In the meantime, the following workarounds are available:
- Don't use super() in the main module.
- Refer to superclass methods via SuperClassName.method(self, ...). This is NOT an equivalent replacement, but may work in simple class hierarchies.

> Using --save_main_session fails on Python 3 when main module has invocations of superclass method using 'super'.
> -----------------------------------------------------------------------------------------------------------------
>
> Key: BEAM-6158
> URL: https://issues.apache.org/jira/browse/BEAM-6158
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-harness
> Reporter: Mark Liu
> Assignee: Valentyn Tymofieiev
> Priority: Major
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
> A typical manifestation of this failure, which can be observed on several Beam examples:
> {noformat}
> Traceback (most recent call last):
>   File "/usr/lib/python3.5/runpy.py", line 193, in _run_module_as_main
>     "__main__", mod_spec)
>   File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
>     exec(code, run_globals)
>   File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py", line 164, in <module>
>     run()
>   File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py", line 158, in run
>     | 'WriteUserScoreSums' >> beam.io.WriteToText(args.output))
>   File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/pipeline.py", line 426, in __exit__
>     self.run().wait_until_finish()
>   File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1338, in wait_until_finish
>     (self.state, getattr(self._runner, 'last_error_msg', None)), self)
> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
>
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 773, in run
>     self._load_main_session(self.local_staging_directory)
>   File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
>     pickler.load_session(session_file)
>   File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", line 280, in load_session
>     return dill.load_session(file_path)
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 410, in load_session
>     module = unpickler.load()
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 474, in find_class
>     return StockUnpickler.find_class(self, module, name)
> AttributeError: Can't get attribute 'ParseGameEventFn' on <module 'dataflow_worker.start' from '/usr/local/lib/python3.5/site-packages/dataflow_worker/start.py'>
> {noformat}
>
> Note that the example has the following code [1]:
> {code:python}
> class ParseGameEventFn(beam.DoFn):
>   def __init__(self):
>     super(ParseGameEventFn, self).__init__()
> {code}
> https://github.com/apache/beam/blob/0325c360bef17a6673e2d43051e59174b8e5ccc9/sdks/python/apache_beam/examples/complete/game/user_score.py#L81
> +cc: [~tvalentyn] [~robertwb] [~altay]

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
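
As an illustration of the second workaround in the comment (SuperClassName.method(self, ...)) and of why it is not an equivalent replacement for super(), here is a minimal sketch in plain Python. It does not use Beam or dill; every class name below is hypothetical and chosen only for the example. It assumes a diamond-shaped hierarchy, which is the kind of case where the two styles diverge.

```python
# Hypothetical classes for illustration only; none of this is Beam code.
class Base(object):
    def setup(self, log):
        log.append("Base")


# Variant 1: cooperative calls through super(), which follow the MRO.
class SuperLeft(Base):
    def setup(self, log):
        log.append("Left")
        super(SuperLeft, self).setup(log)


class SuperRight(Base):
    def setup(self, log):
        log.append("Right")
        super(SuperRight, self).setup(log)


class SuperBoth(SuperLeft, SuperRight):
    def setup(self, log):
        log.append("Both")
        super(SuperBoth, self).setup(log)


# Variant 2: the workaround, naming the superclass explicitly.
class ExplicitLeft(Base):
    def setup(self, log):
        log.append("Left")
        Base.setup(self, log)


class ExplicitRight(Base):
    def setup(self, log):
        log.append("Right")
        Base.setup(self, log)


class ExplicitBoth(ExplicitLeft, ExplicitRight):
    def setup(self, log):
        log.append("Both")
        ExplicitLeft.setup(self, log)


super_log, explicit_log = [], []
SuperBoth().setup(super_log)
ExplicitBoth().setup(explicit_log)
print(super_log)     # ['Both', 'Left', 'Right', 'Base']
print(explicit_log)  # ['Both', 'Left', 'Base']
```

With explicit superclass calls, ExplicitRight.setup is never reached, because the call chain no longer follows the method resolution order. In a single-inheritance class such as the ParseGameEventFn example quoted above, both styles end up calling the same parent method, which is why the workaround may be acceptable in simple hierarchies.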