[ https://issues.apache.org/jira/browse/MAPREDUCE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472526#comment-13472526 ]
Robert Joseph Evans commented on MAPREDUCE-4568:
------------------------------------------------

I spoke with Virag about this before he filed the JIRA. The main goal here is to give Oozie a way to maintain more of a semblance of backwards compatibility even after MAPREDUCE-4549 goes in. They essentially want to de-dupe the entries in the dist cache that would cause an error. We originally decided on having an exception thrown because it would allow other errors/checks that may show up in the future to be added as well.

I don't think there would be a problem with adding a new API that throws an exception, provided that API was also added to the 1.x line, where it perhaps would not throw anything because the same limitations do not exist there.

I realize that adding new APIs, especially since we already have 3 classes that have these types of APIs in them, is not ideal, but it is the only way to maintain backwards compatibility and evolve the API.

> Throw "early" exception when duplicate files or archives are found in distributed cache
> ---------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4568
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4568
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Mohammad Kamrul Islam
>            Assignee: Arun C Murthy
>
> According to #MAPREDUCE-4549, Hadoop 2.x throws an exception if duplicates are found in cacheFiles or cacheArchives. The exception is thrown during job submission.
> This JIRA is to throw the exception ==early==, when the entry is first added to the distributed cache through addCacheFile or addFileToClassPath.
> It will help the client decide whether to fail fast or continue without the duplicated entries.
> Alternatively, Hadoop could provide a knob that lets the user choose whether to throw an error (the upcoming behavior) or silently ignore duplicates (the old behavior).
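
For readers following the thread, the two behaviors under discussion (fail fast when a duplicate is added, or silently drop it) can be illustrated with a small standalone sketch. This is not Hadoop code and not the API proposed in this JIRA: `CacheEntries`, `DuplicateCacheEntryException`, and the `failOnDuplicate` flag are hypothetical names, and a "duplicate" is assumed here to mean an identical URI, which may be simpler than the check MAPREDUCE-4549 actually performs.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistCacheDedupSketch {

    /** Hypothetical exception type; it does not exist in Hadoop. */
    static class DuplicateCacheEntryException extends Exception {
        DuplicateCacheEntryException(URI uri) {
            super("Duplicate distributed cache entry: " + uri);
        }
    }

    /** Hypothetical stand-in for the cache entries attached to one job. */
    static class CacheEntries {
        // LinkedHashSet keeps insertion order and rejects equal URIs.
        private final Set<URI> entries = new LinkedHashSet<>();
        private final boolean failOnDuplicate;

        CacheEntries(boolean failOnDuplicate) {
            this.failOnDuplicate = failOnDuplicate;
        }

        /**
         * Adds an entry. Returns true if it was added and false if it was a
         * duplicate that was ignored. In strict mode a duplicate throws at
         * the moment it is added rather than at job submission.
         */
        boolean add(URI uri) throws DuplicateCacheEntryException {
            if (entries.add(uri)) {
                return true;
            }
            if (failOnDuplicate) {
                throw new DuplicateCacheEntryException(uri);
            }
            return false;
        }

        List<URI> asList() {
            return new ArrayList<>(entries);
        }
    }

    public static void main(String[] args) throws Exception {
        // Example path only; assumed for illustration.
        URI jar = URI.create("hdfs://nn/libs/a.jar");

        // "Old behavior": the second add is ignored.
        CacheEntries lenient = new CacheEntries(false);
        lenient.add(jar);
        lenient.add(jar);
        System.out.println("lenient entries: " + lenient.asList().size());

        // "Early exception": the second add fails immediately.
        CacheEntries strict = new CacheEntries(true);
        strict.add(jar);
        try {
            strict.add(jar);
        } catch (DuplicateCacheEntryException e) {
            System.out.println("strict: " + e.getMessage());
        }
    }
}
```

A client such as Oozie could, under these assumptions, catch the early exception and choose to continue without the duplicate, which is the decision point the issue description asks for.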