Joar Wandborg created BEAM-7540:
-----------------------------------

             Summary: deadlock using save_main_session and logging
                 Key: BEAM-7540
                 URL: https://issues.apache.org/jira/browse/BEAM-7540
             Project: Beam
          Issue Type: Bug
          Components: sdk-py-core
         Environment: Python 3.5, Linux, apache-beam 2.12.0
            Reporter: Joar Wandborg


If you set {{save_main_session = True}} and have a {{logging.Logger}} instance in 
your {{__main__}} module, calling any logger method *after* {{Pipeline.run}} has 
been called causes the process to hang and never exit.

Python 3 pipeline that reproduces the hang:
{code}
import logging

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions

_log = logging.getLogger(__name__)


def main(argv=None):
    logging.basicConfig(level=logging.INFO)

    pipeline_options = PipelineOptions(argv)

    setup_options = pipeline_options.view_as(SetupOptions)  # type: SetupOptions
    setup_options.save_main_session = True

    _log.info("Running pipeline")

    with beam.Pipeline(runner="DirectRunner", options=pipeline_options) as p:
        p | beam.Create(["hello", "world"]) | beam.Map(lambda x: print(x))

    print("""
    Call to _log.info will now deadlock, since the logging handler's
    threading.RLock() has been passed through dill.
    
    When you press Ctrl-C, the traceback should confirm that the process is 
    stuck at:
    
      File "/usr/lib/python3.5/logging/__init__.py", line 810, in acquire
        self.lock.acquire()
    """)
    _log.info("Pipeline done")
    print("Launching nukes")


if __name__ == '__main__':
    main()
{code}
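To capture the same traceback without pressing Ctrl-C, something like the following could be used. This is only a sketch built on the standard library's {{faulthandler}} module; the helper names {{arm_hang_watchdog}} and {{disarm_hang_watchdog}} are made up for illustration and are not part of Beam.

```python
import faulthandler
import sys


def arm_hang_watchdog(timeout_s=30.0, file=None):
    """Dump all thread tracebacks if the process is still alive after timeout_s.

    Hypothetical helper: `file` must expose a real file descriptor and
    defaults to sys.stderr. The process is not terminated (exit=False).
    """
    target = file if file is not None else sys.stderr
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=target)


def disarm_hang_watchdog():
    """Cancel a pending dump once the suspicious call has returned."""
    faulthandler.cancel_dump_traceback_later()
```

Calling {{arm_hang_watchdog()}} just before the final {{_log.info}} call in the reproduction should print the stack of the stuck thread once the timeout expires, assuming the process hangs as described.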
I have also opened an issue with {{dill}}:
https://github.com/uqfoundation/dill/issues/321



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
