[ https://issues.apache.org/jira/browse/BEAM-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ahmet Altay updated BEAM-391:
-----------------------------
    Summary: Exceptions in gcsio upload thread causes pipeline to stall  (was: Invalid GCS bucket name causes pipeline to stall)

> Exceptions in gcsio upload thread causes pipeline to stall
> ----------------------------------------------------------
>
>                 Key: BEAM-391
>                 URL: https://issues.apache.org/jira/browse/BEAM-391
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py
>            Reporter: Ahmet Altay
>
> gcsio gets stuck when given an invalid bucket name.
> GcsBufferedWriter._start_upload (gcsio.py) raises an exception if the bucket
> does not exist. This causes the upload thread to fail silently. It logs the
> exception, but this neither stops the pipeline nor closes the receiving end
> of the multiprocessing.Pipe(). A later call to write() then blocks at
> self.conn.send_bytes(). Note that send may block when the buffer is full,
> and with the upload thread gone nothing drains it.
> The upload thread should have a finally clause to close the socket
> connection or, better, propagate the exception to its parent. The same
> applies to other types of exceptions.
> Another small issue is in GcsBufferedWriter.close(): it does not set
> self.closed to True.
> Reproduction:
>     python -m apache_beam.examples.wordcount --output gs://no-such-thing/
> This prints the exception but runs forever. Ctrl + C interrupts the main
> thread and shows where it got stuck.
> Similarly reproducible on the service.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
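
The fix suggested in the issue description (a finally clause that closes the receiving end of the pipe, plus propagating the exception to the parent) might look roughly like the sketch below. This is not the actual gcsio.py code: SketchWriter, upload_fn, upload_error and _raise_if_upload_failed are hypothetical names used only for illustration, and the sketch assumes POSIX pipe behavior, where sending to a pipe whose reader has been closed raises an OSError instead of blocking.

```python
import multiprocessing
import threading


class SketchWriter(object):
    """Hypothetical stand-in for a buffered writer fed through a pipe."""

    def __init__(self, upload_fn):
        # With duplex=False the first connection only receives and the
        # second only sends.
        self.recv_conn, self.send_conn = multiprocessing.Pipe(duplex=False)
        self.upload_error = None
        self.closed = False
        self.upload_thread = threading.Thread(
            target=self._run_upload, args=(upload_fn,))
        self.upload_thread.daemon = True
        self.upload_thread.start()

    def _run_upload(self, upload_fn):
        try:
            upload_fn(self.recv_conn)
        except Exception as exc:
            # Remember the failure so the parent thread can re-raise it.
            self.upload_error = exc
        finally:
            # Always close the receiving end so a writer cannot block
            # forever on a pipe that nobody is reading.
            self.recv_conn.close()

    def _raise_if_upload_failed(self):
        if self.upload_error is not None:
            raise IOError('upload thread failed: %r' % (self.upload_error,))

    def write(self, data):
        self._raise_if_upload_failed()
        try:
            self.send_conn.send_bytes(data)
        except OSError:
            # The reader went away mid-write; prefer the original cause.
            self._raise_if_upload_failed()
            raise

    def close(self):
        # Mark the writer closed, signal end of data, wait for the upload.
        self.closed = True
        self.send_conn.close()
        self.upload_thread.join()
        self._raise_if_upload_failed()
```

With this shape, a failing upload surfaces as an IOError from write() or close() in the calling thread rather than as a log message followed by a hang.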