[ 
https://issues.apache.org/jira/browse/BEAM-10395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-10395:
-----------------------------------
    Status: Open  (was: Triage Needed)

> Dataflow runner should deduplicate files to stage by destination 
> -----------------------------------------------------------------
>
>                 Key: BEAM-10395
>                 URL: https://issues.apache.org/jira/browse/BEAM-10395
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Steve Niemitz
>            Assignee: Steve Niemitz
>            Priority: P2
>
> If a pipeline contains multiple files with the same destination path, the 
> Dataflow runner tries to stage all of them in parallel, and the upload 
> usually fails because the uploads conflict.
> The runner should upload only one file per destination, and ideally also 
> check that the sources are identical.
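
The deduplication requested above could be sketched roughly as follows. This is a minimal illustration and not the Dataflow runner's actual staging code: the function names `file_digest` and `dedupe_by_destination`, the `(source, destination)` pair representation, and the choice of SHA-256 for comparing sources are all assumptions made for the example.

```python
import hashlib
from collections import OrderedDict


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents (hypothetical helper)."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large artifacts are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def dedupe_by_destination(files):
    """Keep one (source, destination) pair per destination.

    `files` is an iterable of (source_path, destination) pairs. The first
    source seen for a destination wins. If a later source with the same
    destination has different contents, a ValueError is raised instead of
    silently picking one.
    """
    chosen = OrderedDict()  # destination -> (source_path, digest)
    for source, destination in files:
        digest = file_digest(source)
        if destination not in chosen:
            chosen[destination] = (source, digest)
        elif chosen[destination][1] != digest:
            raise ValueError(
                "conflicting sources for destination %r: %r and %r"
                % (destination, chosen[destination][0], source)
            )
    return [(source, destination) for destination, (source, _) in chosen.items()]
```

With a list like this, only the deduplicated pairs would be handed to the parallel upload step, so no two uploads target the same destination. Raising on mismatched contents is one possible policy; logging a warning and keeping the first source would be another.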



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
