[jira] Updated: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Jothi Padmanabhan (JIRA) Wed, 29 Apr 2009 23:54:56 -0700

     [ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Jothi Padmanabhan updated HADOOP-5698:
--------------------------------------

    Status: Open  (was: Patch Available)

I think it would be better not to fold the fixes to the existing 
mapred.CombineInputFormat into this Jira. Those fixes should be addressed in a 
separate Jira, and the patch for this issue should be built on top of that one.

Some other points:
# MultiFileWordCount -- I do not think we should use the 
MultiFileLineRecordReader to read from a CombineSplit. It is guaranteed to work 
only if the start offset is 0, which is not necessarily true. Instead, the 
CombineFileRecordReader should be used.
# Minor -- Why is there a return 2 in run (instead of return 1 as in the 
existing code)?
# CombineFileInputFormat.createRecordReader -- Should this just return null, or 
should it call super.createRecordReader?
# Minor -- CombineFileRecordReader -- Remove unused imports.
# Minor -- Wherever possible, keep the code/comments restricted to 80 columns.
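
To illustrate the first point, here is a minimal sketch of why a line reader 
that assumes a start offset of 0 can misbehave on a chunk that begins in the 
middle of a file. It is not the code of MultiFileLineRecordReader or 
CombineFileRecordReader, and it does not use any Hadoop API. The class 
SplitLineReader and the method readLines are hypothetical names, and the 
ownership rule (a line belongs to the byte range that contains its first byte) 
is an assumption chosen for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: reads the lines "owned" by the
// byte range [start, start + length) of an in-memory file.
final class SplitLineReader {

    static List<String> readLines(byte[] data, int start, int length) {
        int end = Math.min(start + length, data.length);
        int pos = start;

        // A non-zero start may fall in the middle of a line. Back up one
        // byte and skip through the next newline, so that a line is kept
        // only if its first byte lies inside this range.
        if (start != 0) {
            pos = start - 1;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step over the newline itself
        }

        List<String> lines = new ArrayList<>();
        // Emit every line that starts before `end`; the last such line may
        // run past `end`, and is read to completion.
        while (pos < end) {
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            lines.add(new String(data, pos, lineEnd - pos,
                                 StandardCharsets.UTF_8));
            pos = lineEnd + 1;
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8);
        // Two adjacent ranges that cut the line "beta" in half.
        System.out.println(readLines(data, 0, 8));
        System.out.println(readLines(data, 8, 9));
    }
}
```

With the two ranges above, every line is returned exactly once even though the 
boundary at byte 8 falls inside "beta". A reader that simply started reading 
at byte 8 would instead emit the fragment "ta" as if it were a full line.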

> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5698
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>         Attachments: patch-5698.txt
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
