[jira] [Commented] (BEAM-9078) Large Tarball Artifacts Should Use GCS Resumable Upload

2020-01-16 Thread Boyuan Zhang (Jira)


[ https://issues.apache.org/jira/browse/BEAM-9078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017481#comment-17017481 ]

Boyuan Zhang commented on BEAM-9078:


Thanks, Brad! Unfortunately, I don't think the issue will be marked as resolved 
automatically. If you think this issue has been addressed, please close it 
manually.
Thanks for your contribution!

> Large Tarball Artifacts Should Use GCS Resumable Upload
> ---
>
> Key: BEAM-9078
> URL: https://issues.apache.org/jira/browse/BEAM-9078
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.17.0
>Reporter: Brad West
>Assignee: Brad West
>Priority: Major
> Fix For: 2.19.0
>
>   Original Estimate: 1h
>  Time Spent: 40m
>  Remaining Estimate: 20m
>
> It's possible for the tarball uploaded to GCS to be quite large. For 
> example, a user may vendor multiple dependencies in their tarball to 
> achieve a more stable deployable artifact.
> Before this change, the GCS upload API call executed a multipart upload, 
> which Google's 
> [documentation](https://cloud.google.com/storage/docs/json_api/v1/how-tos/upload) 
> states should be used only when the file is small enough to upload again 
> in its entirety if the connection fails. For large tarballs, we hit 
> 60-second socket timeouts before the multipart upload completes. By 
> passing `total_size`, apitools first checks whether the size exceeds the 
> resumable upload threshold and, if so, executes the more robust resumable 
> upload rather than a multipart upload, avoiding the socket timeouts.
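
A minimal sketch of the mechanism, for reference (the helper name, arguments, and paths below are illustrative, not Beam's actual code; apitools' default resumable-upload threshold is 5 MiB):

```python
# Sketch: passing total_size lets apitools choose the resumable upload
# strategy for large files. upload_tarball and its arguments are
# illustrative, not the actual Beam helper.
import os

from apitools.base.py import transfer
from apache_beam.io.gcp.internal.clients import storage


def upload_tarball(storage_client, local_path, bucket, object_name):
    """Uploads local_path to gs://<bucket>/<object_name>."""
    total_size = os.path.getsize(local_path)
    with open(local_path, 'rb') as stream:
        # With total_size known, apitools compares it against its
        # resumable-upload threshold and picks a resumable upload for
        # anything larger, instead of a single multipart request that
        # must restart from scratch if the connection drops.
        upload = transfer.Upload(
            stream, 'application/octet-stream', total_size=total_size)
        request = storage.StorageObjectsInsertRequest(
            bucket=bucket, name=object_name)
        return storage_client.objects.Insert(request, upload=upload)
```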





[jira] [Commented] (BEAM-9078) Large Tarball Artifacts Should Use GCS Resumable Upload

2020-01-16 Thread Brad West (Jira)


[ https://issues.apache.org/jira/browse/BEAM-9078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017479#comment-17017479 ]

Brad West commented on BEAM-9078:

Hmm, I assumed the GitHub/Jira integration (is there not one?) would 
automatically update this ticket. The PR was merged and the fix is included 
in the release-2.19.0 branch. Do I mark it as resolved, or wait for 
automation to take care of this ticket? First-time contributor, so please 
advise. Thanks!






[jira] [Commented] (BEAM-9078) Large Tarball Artifacts Should Use GCS Resumable Upload

2020-01-16 Thread Boyuan Zhang (Jira)


[ https://issues.apache.org/jira/browse/BEAM-9078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017453#comment-17017453 ]

Boyuan Zhang commented on BEAM-9078:


Hey, the 2.19.0 release branch has been cut. Any updates on this issue?




--
This message was sent by Atlassian Jira
(v8.3.4#803005)