In SortValidator, the check for whether a file belongs to sort-input or
sort-output dir is weak
-----------------------------------------------------------------------------------------------
Key: HADOOP-1090
URL: https://issues.apache.org/jira/browse/HADOOP-1090
Project: Hadoop
Issue Type: Bug
Components: mapred
Reporter: Devaraj Das
Assigned To: Arun C Murthy
In SortValidator, each map calls deduceInputFile from its configure method.
deduceInputFile is supposed to determine whether the input file belongs to the
sort-input directory or the sort-output directory. However, the check it
performs,
inputFile.toString().startsWith(inputPaths[0].toString()),
is a plain string-prefix test and is not correct in general. It returns true
for files in both directories whenever one directory name is a string prefix
of the other, for example /user/foo/smallInput/<filenames> and
/user/foo/smallInput-sorted/<filenames>. Files from the two directories are
then classified the same way, which finally causes SortValidator to declare
the sort output incorrect.
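
To illustrate the problem, here is a minimal standalone sketch. It does not use
the Hadoop classes: the class name PrefixCheckSketch, both helper methods, and
the file name part-00000 are hypothetical, and java.nio.file.Path stands in for
Hadoop's Path purely for demonstration. It contrasts the string-prefix check
quoted above with a comparison on whole path components, which is one possible
way to tighten the check and not necessarily the fix that was committed.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class PrefixCheckSketch {

    // Mirrors the check quoted in the report: a raw string-prefix comparison.
    static boolean naiveBelongsTo(String inputFile, String dir) {
        return inputFile.startsWith(dir);
    }

    // Hypothetical alternative: compare whole path components, so that
    // "smallInput" does not match "smallInput-sorted".
    static boolean componentBelongsTo(String inputFile, String dir) {
        Path file = Paths.get(inputFile).normalize();
        Path base = Paths.get(dir).normalize();
        // java.nio.file.Path.startsWith matches on name elements, not characters.
        return file.startsWith(base);
    }

    public static void main(String[] args) {
        String sortInputDir = "/user/foo/smallInput";
        String inputFile = "/user/foo/smallInput/part-00000";
        String outputFile = "/user/foo/smallInput-sorted/part-00000";

        // The string-prefix check accepts both files as sort input.
        System.out.println(naiveBelongsTo(inputFile, sortInputDir));      // true
        System.out.println(naiveBelongsTo(outputFile, sortInputDir));     // true

        // The component-wise check accepts only the real sort-input file.
        System.out.println(componentBelongsTo(inputFile, sortInputDir));  // true
        System.out.println(componentBelongsTo(outputFile, sortInputDir)); // false
    }
}
```

Comparing the file's parent directory against each configured input path would
be another way to avoid the false match; which approach fits SortValidator
best depends on how its input paths are laid out.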