[
https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jothi Padmanabhan updated HADOOP-5698:
--------------------------------------
Status: Open (was: Patch Available)
I think it would be better not to bundle the fixes to the existing
mapred.CombineInputFormat into this Jira. Those should be addressed in a
separate Jira, and the patch for this issue should be built on top of that.
Some other points:
# MultiFileWordCount -- I do not think we should use the
MultiFileLineRecordReader to read from a CombineSplit. It is guaranteed to work
only if the start offset is 0, which is not necessarily true. Instead, the
CombineFileRecordReader should be used.
# Minor -- Why is there a return 2 in run (instead of return 1 as in the
existing code)?
# CombineFileInputFormat.createRecordReader -- should this just return null, or
should it call super.createRecordReader?
# Minor -- CombineFileRecordReader -- Remove unused imports
# Minor -- Wherever possible, keep the code/comments restricted to 80 columns
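
To illustrate the start-offset concern in the first point, here is a minimal
sketch in plain Java. It does not use any Hadoop classes: Chunk,
readHonoringOffset and readAssumingZeroOffset are hypothetical names, and the
rule "a line belongs to the chunk in which it starts" is a simplifying
assumption, not necessarily what the actual record readers do. It only shows
that a reader which assumes offset 0 returns the wrong lines for a chunk that
starts mid-file.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ChunkOffsetSketch {

  /** Hypothetical stand-in for one (file, offset, length) entry of a split. */
  static final class Chunk {
    final byte[] file;
    final int offset;
    final int length;

    Chunk(byte[] file, int offset, int length) {
      this.file = file;
      this.offset = offset;
      this.length = length;
    }
  }

  /** Returns every line whose first byte lies in [from, to). */
  static List<String> linesStartingIn(byte[] file, int from, int to) {
    List<String> out = new ArrayList<>();
    int lineStart = 0;
    for (int i = 0; i < file.length; i++) {
      if (file[i] == '\n') {
        if (lineStart >= from && lineStart < to) {
          out.add(new String(file, lineStart, i - lineStart,
              StandardCharsets.UTF_8));
        }
        lineStart = i + 1;
      }
    }
    // Last line when the file does not end with a newline.
    if (lineStart < file.length && lineStart >= from && lineStart < to) {
      out.add(new String(file, lineStart, file.length - lineStart,
          StandardCharsets.UTF_8));
    }
    return out;
  }

  /** Reads the chunk using its real start offset. */
  static List<String> readHonoringOffset(Chunk c) {
    return linesStartingIn(c.file, c.offset, c.offset + c.length);
  }

  /** Reads the chunk as if it always started at byte 0 (the risky case). */
  static List<String> readAssumingZeroOffset(Chunk c) {
    return linesStartingIn(c.file, 0, c.length);
  }

  public static void main(String[] args) {
    byte[] file = "a\nbb\nccc\ndddd\n".getBytes(StandardCharsets.UTF_8);
    Chunk second = new Chunk(file, 5, 9); // covers bytes 5..13

    System.out.println(readHonoringOffset(second));     // [ccc, dddd]
    System.out.println(readAssumingZeroOffset(second)); // [a, bb, ccc]
  }
}
```

In this toy example the offset-0 reader re-reads lines that belong to an
earlier chunk and never reaches "dddd", which is the kind of failure a nonzero
start offset could cause.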
> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
> Key: HADOOP-5698
> URL: https://issues.apache.org/jira/browse/HADOOP-5698
> Project: Hadoop Core
> Issue Type: Sub-task
> Reporter: Amareshwari Sriramadasu
> Assignee: Amareshwari Sriramadasu
> Attachments: patch-5698.txt
>
>