[
https://issues.apache.org/jira/browse/BEAM-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Valentyn Tymofieiev updated BEAM-6158:
--------------------------------------
Description:
A typical manifestation of this failure, which can be observed in several Beam
examples:
{noformat}
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py", line 164, in <module>
    run()
  File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py", line 158, in run
    | 'WriteUserScoreSums' >> beam.io.WriteToText(args.output))
  File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/pipeline.py", line 426, in __exit__
    self.run().wait_until_finish()
  File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1338, in wait_until_finish
    (self.state, getattr(self._runner, 'last_error_msg', None)), self)
apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 773, in run
    self._load_main_session(self.local_staging_directory)
  File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
    pickler.load_session(session_file)
  File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", line 280, in load_session
    return dill.load_session(file_path)
  File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 410, in load_session
    module = unpickler.load()
  File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 474, in find_class
    return StockUnpickler.find_class(self, module, name)
AttributeError: Can't get attribute 'ParseGameEventFn' on <module 'dataflow_worker.start' from '/usr/local/lib/python3.5/site-packages/dataflow_worker/start.py'>
{noformat}
Note that the example has the following code [1]:
{code:python}
class ParseGameEventFn(beam.DoFn):
  def __init__(self):
    super(ParseGameEventFn, self).__init__()
{code}
[1] https://github.com/apache/beam/blob/0325c360bef17a6673e2d43051e59174b8e5ccc9/sdks/python/apache_beam/examples/complete/game/user_score.py#L81
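For readers unfamiliar with this class of error, here is a minimal sketch of how a by-reference pickle of a class produces an AttributeError of this shape when the module it is resolved against does not define that class. It uses only the standard-library pickle module, so it does not reproduce the dill session-loading bug itself and makes no claim about its root cause. The DoFn stand-in, the fake_worker_main module name, and the simulate_worker_load helper are all hypothetical names introduced for illustration.

```python
import pickle
import sys
import types


class DoFn(object):
    """Hypothetical stand-in for beam.DoFn, so the sketch needs no Beam install."""


class ParseGameEventFn(DoFn):
    def __init__(self):
        super(ParseGameEventFn, self).__init__()


def simulate_worker_load():
    """Return the unpickling error message, or None if loading succeeds."""
    # Hypothetical module playing the role of the worker's entry-point module.
    worker_main = types.ModuleType('fake_worker_main')
    sys.modules['fake_worker_main'] = worker_main
    original_module = ParseGameEventFn.__module__
    try:
        # Pretend the class was defined in the entry-point module at pickling
        # time, as a class defined in the main module would be on the
        # submitting side.
        ParseGameEventFn.__module__ = 'fake_worker_main'
        worker_main.ParseGameEventFn = ParseGameEventFn

        # Classes are pickled by reference: only module and name are stored.
        payload = pickle.dumps(ParseGameEventFn)

        # On the "worker", the entry-point module does not define the class.
        del worker_main.ParseGameEventFn
        try:
            pickle.loads(payload)
        except AttributeError as e:
            return str(e)
        return None
    finally:
        # Undo the temporary changes so the sketch has no lasting side effects.
        ParseGameEventFn.__module__ = original_module
        del sys.modules['fake_worker_main']


if __name__ == '__main__':
    print(simulate_worker_load())
```

The point of the sketch is only that the unpickler's find_class step looks the name up on a module by attribute access, so the error above suggests the session pickle referred to ParseGameEventFn by name rather than carrying its definition.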
+cc: [~tvalentyn] [~robertwb] [~altay]
was:
This happened when I ran the wordcount example with the portable Dataflow runner
on Python 3.5. The failure shows up in the worker log (unfortunately
unformatted) of
[this job|https://pantheon.corp.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-11-29_11_47_38-6731484595556255542?project=google.com:clouddfe]:
{code:java}
Could not load main session: Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 125, in main
    _load_main_session(semi_persistent_directory)
  File "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 201, in _load_main_session
    pickler.load_session(session_file)
  File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", line 269, in load_session
    return dill.load_session(file_path)
  File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 402, in load_session
    module = unpickler.load()
  File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 465, in find_class
    return StockUnpickler.find_class(self, module, name)
AttributeError: Can't get attribute 'WordExtractingDoFn' on <module 'apache_beam.runners.worker.sdk_worker_main' from '/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py'>
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 125, in main
    _load_main_session(semi_persistent_directory)
  File "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 201, in _load_main_session
    pickler.load_session(session_file)
  File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", line 269, in load_session
    return dill.load_session(file_path)
  File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 402, in load_session
    module = unpickler.load()
  File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 465, in find_class
    return StockUnpickler.find_class(self, module, name)
AttributeError: Can't get attribute 'WordExtractingDoFn' on <module 'apache_beam.runners.worker.sdk_worker_main' from '/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py'>
{code}
It looks like loading the saved main session does not work properly on Python 3.
+cc: [~tvalentyn] [~robertwb] [~altay]
> Using --save_main_session fails on Python 3 when main module has superclass
> constructor calls.
> ----------------------------------------------------------------------------------------------
>
> Key: BEAM-6158
> URL: https://issues.apache.org/jira/browse/BEAM-6158
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-harness
> Reporter: Mark Liu
> Assignee: Valentyn Tymofieiev
> Priority: Major
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)