[ https://issues.apache.org/jira/browse/TEZ-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-1731:
--------------------------------
    Attachment: TEZ-1731.2.txt

Thanks Prakash and Rajesh for taking a look at the patch.

bq. Shouldn't both use the getLocalPathForWrite with the size parameter?
Yes, it should. Fixed.

Fixed the unit test to use the correct merger / files. Also tested that it still fails without the rest of the patch applied.

> OnDiskMerger can end up clobbering files across tasks with LocalDiskFetch
> -------------------------------------------------------------------------
>
>                 Key: TEZ-1731
>                 URL: https://issues.apache.org/jira/browse/TEZ-1731
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>            Priority: Critical
>         Attachments: TEZ-1731.1.txt, TEZ-1731.2.txt
>
>
> When an on disk fetch starts with LOCAL files (optimize.local.fetch), the
> filename used by the merger is based on the source file name. This name can
> be the same for all tasks reading the same input on the node - and can result
> in files being overwritten between tasks, depending on the order in which
> events are processed, and the dir allocated by the local dir-allocator.
> Leads to ChecksumExceptions, and FileNotFoundExceptions during the merge.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
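
The collision described in the issue can be illustrated with a minimal, standalone sketch. It is not taken from the attached patch and does not use any Tez or Hadoop API: the class name, both helper methods, the ".merged" suffix, the task identifiers, and all paths are hypothetical. It only shows that an output name derived solely from the source file name is identical for every task reading that input, while qualifying the name with something unique to the task keeps the names apart.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class MergeOutputNames {

    // Hypothetical naming scheme: the output name depends only on the source
    // file name, so every task reading the same input gets the same name.
    static Path sourceBasedName(Path localDir, Path sourceFile) {
        return localDir.resolve(sourceFile.getFileName().toString() + ".merged");
    }

    // Hypothetical alternative: prefix the name with an identifier that is
    // unique to the reading task, so two tasks never share an output name.
    static Path taskQualifiedName(Path localDir, Path sourceFile, String taskId) {
        return localDir.resolve(taskId + "_" + sourceFile.getFileName().toString() + ".merged");
    }

    public static void main(String[] args) {
        // Assumed example values; both tasks were allocated the same local dir.
        Path localDir = Paths.get("local-dir-0");
        Path source = Paths.get("shuffle-output", "file.out");

        Path task1Shared = sourceBasedName(localDir, source);
        Path task2Shared = sourceBasedName(localDir, source);
        System.out.println("source-based names collide: " + task1Shared.equals(task2Shared));

        Path task1Unique = taskQualifiedName(localDir, source, "task_1");
        Path task2Unique = taskQualifiedName(localDir, source, "task_2");
        System.out.println("task-qualified names collide: " + task1Unique.equals(task2Unique));
    }
}
```

Whether two tasks actually overwrite each other also depends on the directory each one is allocated, which is why the issue describes the failure as order- and allocator-dependent.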