[ 
https://issues.apache.org/jira/browse/BEAM-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146572#comment-17146572
 ] 

Valentyn Tymofieiev commented on BEAM-6158:
-------------------------------------------

Dill maintainers consider https://github.com/uqfoundation/dill/issues/300 a 
high-priority issue, as per 
https://github.com/uqfoundation/dill/issues/300#issuecomment-644486904. 
Fixing that issue would also fix this bug. 

I also looked into a potential switch to CloudPickle, and some issues I 
encountered previously no longer occur. We plan to make CloudPickle an 
available pickler for Beam, which will also address this bug. 

In the meantime, Beam users should follow the workarounds outlined in 
https://issues.apache.org/jira/browse/BEAM-6158?focusedCommentId=16919945&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16919945.
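
The linked comment is not reproduced in this message. As a rough sketch of one 
possible workaround (an assumption here, not a summary of that comment), the 
base class initializer can be called explicitly instead of going through 
'super'. The DoFn class below is a hypothetical stand-in for beam.DoFn, used 
only so the snippet runs without Beam installed; whether this avoids the 
failure for a given pipeline should be verified on the target runner.

```python
class DoFn:
  """Hypothetical stand-in for beam.DoFn so this sketch runs without Beam."""

  def __init__(self):
    self.base_initialized = True


class ParseGameEventFn(DoFn):
  def __init__(self):
    # Instead of super(ParseGameEventFn, self).__init__(), call the base
    # class initializer directly so the method does not go through 'super'.
    DoFn.__init__(self)


fn = ParseGameEventFn()
```

In a real pipeline the base class would be beam.DoFn, and the call would read 
beam.DoFn.__init__(self).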

> Using --save_main_session fails on Python 3 when main module has invocations 
> of superclass method using 'super'.
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-6158
>                 URL: https://issues.apache.org/jira/browse/BEAM-6158
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-harness
>            Reporter: Mark Liu
>            Assignee: Valentyn Tymofieiev
>            Priority: P2
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> A typical manifestation of this failure, which can be observed on several 
> Beam examples:
> {noformat}
> Traceback (most recent call last):
>   File "/usr/lib/python3.5/runpy.py", line 193, in _run_module_as_main
>     "__main__", mod_spec)
>   File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
>     exec(code, run_globals)
>   File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py", line 164, in <module>
>     run()
>   File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py", line 158, in run
>     | 'WriteUserScoreSums' >> beam.io.WriteToText(args.output))
>   File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/pipeline.py", line 426, in __exit__
>     self.run().wait_until_finish()
>   File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1338, in wait_until_finish
>     (self.state, getattr(self._runner, 'last_error_msg', None)), self)
> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 773, in run
>     self._load_main_session(self.local_staging_directory)
>   File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
>     pickler.load_session(session_file)
>   File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", line 280, in load_session
>     return dill.load_session(file_path)
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 410, in load_session
>     module = unpickler.load()
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 474, in find_class
>     return StockUnpickler.find_class(self, module, name)
> AttributeError: Can't get attribute 'ParseGameEventFn' on <module 'dataflow_worker.start' from '/usr/local/lib/python3.5/site-packages/dataflow_worker/start.py'>
> {noformat}
>  
> Note that the example has the following code [1]:
> {code:python}
> class ParseGameEventFn(beam.DoFn):
>   def __init__(self):
>     super(ParseGameEventFn, self).__init__()
> {code}
> [1] https://github.com/apache/beam/blob/0325c360bef17a6673e2d43051e59174b8e5ccc9/sdks/python/apache_beam/examples/complete/game/user_score.py#L81
> +cc: [~tvalentyn] [~robertwb] [~altay]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
