[ 
https://issues.apache.org/jira/browse/BEAM-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863580#comment-16863580
 ] 

Valentyn Tymofieiev commented on BEAM-7540:
-------------------------------------------

Thanks a lot [~joar] for the detailed report. We have encountered at least 3 
other Python 3-related issues in dill and are evaluating replacing it with 
cloudpickle. We will likely have an update on this and other pickling issues 
a couple of releases down the road.

> deadlock using save_main_session and logging
> --------------------------------------------
>
>                 Key: BEAM-7540
>                 URL: https://issues.apache.org/jira/browse/BEAM-7540
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>         Environment: Python 3.5
> Linux
> apache-beam 2.12.0 & 2.13.0
> dill 0.2.9
>            Reporter: Joar Wandborg
>            Assignee: Valentyn Tymofieiev
>            Priority: Major
>
> If you set {{save_main_session = True}} and have a logging.Logger instance in 
> your __main__ module, then calling a logger method *after* Pipeline.run has 
> been called makes the process hang and never exit.
> A Python 3 pipeline that reproduces the error (code also available at 
> [https://gist.github.com/joar/f021db55eca4fa9e9fd7dfd67cc011b9]):
> {code:python}
> import logging
> import apache_beam as beam
> from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions
> _log = logging.getLogger(__name__)
> def main(argv=None):
>     logging.basicConfig(level=logging.INFO)
>     pipeline_options = PipelineOptions(argv)
>     setup_options = pipeline_options.view_as(SetupOptions)  # type: SetupOptions
>     setup_options.save_main_session = True
>     _log.info("Running pipeline")
>     with beam.Pipeline(runner="DirectRunner", options=pipeline_options) as p:
>         p | beam.Create(["hello", "world"]) | beam.Map(lambda x: print(x))
>     print("""
>     Call to _log.info will now deadlock, since the logging handler's
>     threading.RLock() has been passed through dill.
>     
>     When you press Ctrl-C, the traceback should confirm that the process is 
>     stuck at:
>     
>       File "/usr/lib/python3.5/logging/__init__.py", line 810, in acquire
>         self.lock.acquire()
>     """)
>     _log.info("Pipeline done")
>     print("Launching nukes")
> if __name__ == '__main__':
>     main()
> {code}
>  I have opened an issue with {{dill}} as well: 
> [https://github.com/uqfoundation/dill/issues/321]
> This issue does (sadly) not happen on Python 2.
> Just to be clear: A workaround is to not use {{save_main_session = True}}.
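
One way to check for the stuck handler lock described above, without waiting for the process to hang, is to try acquiring each handler's lock with a timeout. The sketch below uses only the standard library. The helper names find_blocked_handlers and reset_handler_locks are hypothetical and are not part of Beam or dill. Replacing the lock via Handler.createLock() is an assumption about what might unblock logging; it has not been verified against the pipeline in this report.

```python
import logging
import threading


def find_blocked_handlers(logger, timeout=0.5):
    """Return the handlers of `logger` whose lock cannot be acquired
    from the current thread within `timeout` seconds."""
    blocked = []
    for handler in logger.handlers:
        lock = handler.lock
        if lock is None:
            # Some handlers (e.g. logging.NullHandler) have no lock.
            continue
        if lock.acquire(timeout=timeout):
            lock.release()
        else:
            blocked.append(handler)
    return blocked


def reset_handler_locks(handlers):
    """Give each handler a fresh lock (assumption: the old lock is unusable)."""
    for handler in handlers:
        handler.createLock()


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    root = logging.getLogger()
    # In the scenario from the report, this would run after the pipeline exits.
    stuck = find_blocked_handlers(root)
    if stuck:
        print("Handlers with an unavailable lock:", stuck)
        reset_handler_locks(stuck)
    root.info("Logging still works")
```

The timeout keeps the check from hanging the way a plain logger call would. Only the handlers attached directly to the given logger are inspected, so pass the root logger when logging.basicConfig was used.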



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
