[ https://issues.apache.org/jira/browse/BEAM-9078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017453#comment-17017453 ]
Boyuan Zhang commented on BEAM-9078:
------------------------------------
Hey, the 2.19.0 release branch has been cut. Any updates on this issue?

> Large Tarball Artifacts Should Use GCS Resumable Upload
> -------------------------------------------------------
>
>                 Key: BEAM-9078
>                 URL: https://issues.apache.org/jira/browse/BEAM-9078
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>    Affects Versions: 2.17.0
>            Reporter: Brad West
>            Assignee: Brad West
>            Priority: Major
>             Fix For: 2.19.0
>
>   Original Estimate: 1h
>          Time Spent: 40m
>  Remaining Estimate: 20m
>
> The tarball uploaded to GCS can be quite large, for example when a user
> vendors multiple dependencies in the tarball to produce a more stable
> deployable artifact.
> Before this change, the GCS upload API call executed a multipart upload.
> Google's
> [documentation](https://cloud.google.com/storage/docs/json_api/v1/how-tos/upload)
> states that multipart uploads should only be used for files small enough to
> upload again in full if the connection fails. For large tarballs, we hit
> 60-second socket timeouts before the multipart upload completes. By passing
> `total_size`, apitools first checks whether the size exceeds the resumable
> upload threshold, and if so executes the more robust resumable upload rather
> than a multipart upload, avoiding the socket timeouts.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
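The strategy selection described in the issue can be sketched roughly as follows. This is a hypothetical illustration, not Beam's or apitools' actual code: the function name, the string return values, and the 5 MiB threshold are assumptions chosen for the example; apitools' real threshold and decision logic live in its `transfer` module.

```python
# Hypothetical sketch of the decision the issue describes: when the
# caller supplies total_size and it exceeds a threshold, use a
# resumable upload instead of a multipart upload. All names and the
# threshold value are assumptions for illustration.

RESUMABLE_UPLOAD_THRESHOLD = 5 * 1024 * 1024  # assumed 5 MiB threshold


def choose_upload_strategy(total_size=None):
    """Pick an upload strategy given the (optional) known file size.

    With no total_size the client falls back to a multipart upload,
    which can hit socket timeouts on large tarballs; with total_size
    known, sizes over the threshold use the more robust resumable
    upload, which can recover from a dropped connection.
    """
    if total_size is not None and total_size > RESUMABLE_UPLOAD_THRESHOLD:
        return "resumable"
    return "multipart"
```

In practice the caller would obtain `total_size` cheaply before starting the upload, e.g. via `os.path.getsize(tarball_path)`, so large artifacts take the resumable path.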