[ 
https://issues.apache.org/jira/browse/BEAM-10395?focusedWorklogId=456253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456253
 ]

ASF GitHub Bot logged work on BEAM-10395:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Jul/20 17:23
            Start Date: 08/Jul/20 17:23
    Worklog Time Spent: 10m 
      Work Description: pabloem commented on pull request #12144:
URL: https://github.com/apache/beam/pull/12144#issuecomment-655651862


   taking a look today


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 456253)
    Time Spent: 20m  (was: 10m)

> Dataflow runner should deduplicate files to stage by destination 
> -----------------------------------------------------------------
>
>                 Key: BEAM-10395
>                 URL: https://issues.apache.org/jira/browse/BEAM-10395
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Steve Niemitz
>            Assignee: Steve Niemitz
>            Priority: P2
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> If a pipeline contains multiple files with the same destination path, the 
> Dataflow runner will try to stage all of them in parallel, which usually 
> causes the upload to fail because the uploads conflict.
> The runner should upload only one file per destination, and ideally also 
> check that the sources are identical.
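
A minimal sketch of the deduplication described above, written in Python for brevity. It is not the runner's actual code: `FileToStage` and `dedupe_by_destination` are hypothetical names, file contents are held in memory only to keep the example self-contained, and raising an error on mismatched contents is an assumption, since the issue does not say what should happen in that case.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class FileToStage:
    # Hypothetical record for illustration; not a Beam class.
    source: str       # local path of the file
    destination: str  # name the file would have in the staging location
    content: bytes    # file contents, inlined to keep the sketch self-contained

    @property
    def digest(self) -> str:
        # Content hash used to compare two sources for the same destination.
        return hashlib.sha256(self.content).hexdigest()


def dedupe_by_destination(files: Iterable[FileToStage]) -> List[FileToStage]:
    """Keep the first file seen for each destination, in input order.

    Raises ValueError if two files share a destination but differ in
    content (an assumed policy; a warning would be another option).
    """
    by_destination: Dict[str, FileToStage] = {}
    for f in files:
        kept = by_destination.get(f.destination)
        if kept is None:
            by_destination[f.destination] = f
        elif kept.digest != f.digest:
            raise ValueError(
                f"Conflicting sources for {f.destination}: "
                f"{kept.source} and {f.source}"
            )
    return list(by_destination.values())
```

With this approach each destination is uploaded at most once, so parallel staging never sees two writers for the same path. Comparing content hashes covers the "sources are the same" check without relying on the local paths being equal.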



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
