Fabian created BEAM-7266:
----------------------------

             Summary: Pipeline run does not terminate because the Dataflow runner cannot close the file system writer
                 Key: BEAM-7266
                 URL: https://issues.apache.org/jira/browse/BEAM-7266
             Project: Beam
          Issue Type: Bug
          Components: io-python-gcp, runner-dataflow
    Affects Versions: 2.11.0
            Reporter: Fabian
We are using Apache Beam version 2.11.0 (Python SDK) with the Dataflow runner on the Google Cloud Platform. Two pipeline runs did not terminate: after multiple days (instead of a few minutes) they were still running. The only error that was logged is that it fails to close a writer:

{code:java}
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 649, in do_work
    work_executor.execute()
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 178, in execute
    op.finish()
  File "dataflow_worker/native_operations.py", line 93, in dataflow_worker.native_operations.NativeWriteOperation.finish
    def finish(self):
  File "dataflow_worker/native_operations.py", line 94, in dataflow_worker.native_operations.NativeWriteOperation.finish
    with self.scoped_finish_state:
  File "dataflow_worker/native_operations.py", line 95, in dataflow_worker.native_operations.NativeWriteOperation.finish
    self.writer.__exit__(None, None, None)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/nativeavroio.py", line 277, in __exit__
    self._data_file_writer.close()
  File "/usr/local/lib/python2.7/dist-packages/avro/datafile.py", line 220, in close
    self.writer.close()
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/filesystemio.py", line 202, in close
    self._uploader.finish()
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/gcsio.py", line 606, in finish
    raise self._upload_thread.last_error  # pylint: disable=raising-bad-type
NotImplementedError
{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
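For context on the last frame of the traceback: the {{NotImplementedError}} is not raised on the main thread itself, but recorded by a background upload thread and re-raised when the writer's {{finish()}} is called. The following is a minimal, hypothetical sketch of that error-propagation pattern; the class and function names are illustrative and are not Beam's actual implementation.

{code:java}
import threading


class SketchUploader(object):
    """Illustrative stand-in for a GCS uploader that runs work on a thread."""

    def __init__(self, work):
        self.last_error = None  # the background thread stores its failure here
        self._thread = threading.Thread(target=self._run, args=(work,))
        self._thread.start()

    def _run(self, work):
        try:
            work()
        except Exception as e:  # remember the failure for the main thread
            self.last_error = e

    def finish(self):
        # Called from the writer's close(); mirrors the pattern in the
        # traceback: raise self._upload_thread.last_error
        self._thread.join()
        if self.last_error is not None:
            raise self.last_error


def failing_upload():
    # Stand-in for whatever raised NotImplementedError inside the worker.
    raise NotImplementedError()


uploader = SketchUploader(failing_upload)
try:
    uploader.finish()
    propagated = None
except NotImplementedError as e:
    propagated = e

assert isinstance(propagated, NotImplementedError)
{code}

If the error surfaced this way is then swallowed or the work item is retried indefinitely, the job can keep running for days, which would match the observed behavior.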