Sourabh Bajaj created BEAM-1892:
-----------------------------------

    Summary: Log process during size estimation in filebasedsource
    Key: BEAM-1892
    URL: https://issues.apache.org/jira/browse/BEAM-1892
    Project: Beam
    Issue Type: Improvement
    Components: sdk-py
    Reporter: Sourabh Bajaj
    Assignee: Sourabh Bajaj

http://stackoverflow.com/questions/43095445/how-to-iterate-all-files-in-google-cloud-storage-to-be-used-as-dataflow-input

The user reported that there was no output and a long delay when submitting the pipeline.

File size estimation can be slow for very large datasets, and it currently reports no progress to the end user. We should log progress during the pre-submission size estimation and apply a threshold to it as well.
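
To make the proposal concrete, here is a minimal sketch of progress logging combined with a threshold on how many files are inspected. It is not the FileBasedSource implementation: estimate_total_size, get_size, log_every and max_files are hypothetical names chosen for illustration, and extrapolating from the inspected files is only one possible reading of "thresholding".

```python
import logging


def estimate_total_size(paths, get_size, log_every=10000, max_files=None,
                        logger=None):
    """Estimate the combined size of `paths`, logging progress as it goes.

    `get_size` is a hypothetical callable returning the size in bytes of one
    path. If `max_files` is set and there are more paths than that, only the
    first `max_files` are inspected and the total is extrapolated from them.
    """
    logger = logger or logging.getLogger(__name__)
    total = len(paths)
    if total == 0:
        return 0

    # Threshold: never inspect more than max_files files.
    limit = total if max_files is None else min(total, max_files)

    size_so_far = 0
    for i, path in enumerate(paths[:limit], start=1):
        size_so_far += get_size(path)
        # Report progress periodically, and once more at the end.
        if i % log_every == 0 or i == limit:
            logger.info("Size estimation: inspected %d of %d files", i, total)

    if limit < total:
        # Assumption: the inspected files are representative of the rest.
        logger.info("Stopped after %d files; extrapolating to %d files",
                    limit, total)
        return size_so_far * total // limit
    return size_so_far
```

With max_files left as None the function returns the exact sum; setting it trades accuracy for a bounded delay before submission.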