[ 
https://issues.apache.org/jira/browse/TEZ-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-1731:
--------------------------------
    Attachment: TEZ-1731.2.txt

Thanks Prakash and Rajesh for taking a look at the patch. 

bq. Shouldn't both use the getLocalPathForWrite with the size parameter?
Yes, they should. Fixed.

Fixed the unit test to use the correct merger / files. Also tested that it 
still fails without the rest of the patch applied.

> OnDiskMerger can end up clobbering files across tasks with LocalDiskFetch
> -------------------------------------------------------------------------
>
>                 Key: TEZ-1731
>                 URL: https://issues.apache.org/jira/browse/TEZ-1731
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>            Priority: Critical
>         Attachments: TEZ-1731.1.txt, TEZ-1731.2.txt
>
>
> When an on-disk fetch starts with LOCAL files (optimize.local.fetch), the 
> filename used by the merger is based on the source file name. This name can 
> be the same for all tasks reading the same input on the node, so files can 
> be overwritten between tasks, depending on the order in which events are 
> processed and the dir allocated by the local dir-allocator.
> This leads to ChecksumExceptions and FileNotFoundExceptions during the merge.
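
To illustrate the collision described above, here is a minimal sketch in plain Java. It is not the Tez code and not necessarily what the attached patch does: the class, the method names, the ".merged" suffix and the task identifier are all hypothetical, chosen only to show why a name derived solely from the source file can clash across tasks, and how adding a task-unique component would avoid that.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch; none of these names are Tez classes or methods.
final class MergeOutputNameSketch {

    // Problematic scheme: the output name depends only on the source file
    // name, so every task reading the same local input derives the same path
    // whenever the allocator hands out the same directory.
    static Path nameFromSourceOnly(Path localDir, Path sourceFile) {
        return localDir.resolve(sourceFile.getFileName() + ".merged");
    }

    // One possible remedy (an assumption for illustration): prefix the name
    // with an identifier that is unique to the consuming task.
    static Path nameWithTaskId(Path localDir, Path sourceFile, String taskUniqueId) {
        return localDir.resolve(taskUniqueId + "_" + sourceFile.getFileName() + ".merged");
    }

    public static void main(String[] args) {
        Path dir = Paths.get("local-dir-0");
        Path source = Paths.get("shuffle", "file.out");

        // Two tasks, same input, same allocated dir: the paths are identical.
        System.out.println(nameFromSourceOnly(dir, source));
        System.out.println(nameFromSourceOnly(dir, source));

        // With a task-unique component the paths differ.
        System.out.println(nameWithTaskId(dir, source, "task_1"));
        System.out.println(nameWithTaskId(dir, source, "task_2"));
    }
}
```

With the first scheme, whichever task writes last wins, which would be consistent with the checksum and missing-file failures reported during the merge.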



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
