[ https://issues.apache.org/jira/browse/HADOOP-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696818#action_12696818 ]
Chris Douglas commented on HADOOP-5494:
---------------------------------------
Just a couple small nits:
* The comment in IFile::nextRawKey should read "Position for the value" instead
of "Position for next record"
* The field dataIn in InMemoryReader hides the dataIn field in its superclass.
While it looks like its use in the current patch is OK, this could cause some
confusing behavior for future modifications.
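
To illustrate the second nit, here is a minimal sketch of Java field hiding. The classes BaseReader and HidingReader are hypothetical stand-ins, not the actual IFile.Reader and InMemoryReader sources; only the field name dataIn is taken from the patch. The point is that a subclass field with the same name does not override the inherited one, so superclass methods and subclass methods can silently end up reading different fields.

```java
// Hypothetical classes that only mirror the naming in the review comment;
// they are not the real IFile.Reader / InMemoryReader.
class BaseReader {
    protected String dataIn = "superclass stream";

    // Declared in the superclass, so it always reads BaseReader.dataIn.
    String describeInput() {
        return dataIn;
    }
}

class HidingReader extends BaseReader {
    // Same name as the inherited field: this declares a second, independent
    // field that hides the first one. Fields are never overridden.
    protected String dataIn = "subclass stream";

    String describeOwnInput() {
        return dataIn; // resolves to HidingReader.dataIn
    }
}

class FieldHidingDemo {
    public static void main(String[] args) {
        HidingReader reader = new HidingReader();
        System.out.println(reader.describeInput());    // superclass stream
        System.out.println(reader.describeOwnInput()); // subclass stream

        // Field access is resolved by the static type of the reference.
        BaseReader asBase = reader;
        System.out.println(asBase.dataIn);             // superclass stream
    }
}
```

A later change that assigns dataIn in the subclass would leave the superclass copy untouched, which is the kind of confusing behavior the nit warns about. Renaming the subclass field, or reusing the inherited one, avoids the problem.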
> IFile.Reader should have a nextRawKey/nextRawValue
> --------------------------------------------------
>
> Key: HADOOP-5494
> URL: https://issues.apache.org/jira/browse/HADOOP-5494
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.18.0
> Reporter: Devaraj Das
> Assignee: Devaraj Das
> Fix For: 0.21.0
>
> Attachments: 5494-1.patch, 5494-2.patch, 5494-3.patch
>
>
> Merger.Segment defines only the next() method, which internally calls
> next(key, value) on the underlying IFile stream and therefore reads both the
> key and the value bytes. It would be good to have Merger.Segment.nextRawKey(),
> which would read only the key and delay reading the value until it is needed
> (in Merger.MergeQueue.next()) via a new method, Merger.Segment.nextRawValue().
> We would then load only one value's bytes at a time, which could reduce the
> memory footprint considerably, depending on how large the values are.
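
The key-first, value-on-demand idea in the description can be sketched roughly as follows. This is not the code from the attached patches: LazyRecordReader, its encode helper, and the record layout ([int keyLength][int valueLength][key bytes][value bytes]) are all assumptions made for illustration, and the real IFile format differs. Only the method names nextRawKey and nextRawValue come from the issue.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical reader, not Hadoop's IFile.Reader. Assumed record layout:
// [int keyLength][int valueLength][key bytes][value bytes].
class LazyRecordReader {
    private final ByteBuffer in;
    private int pendingValueLength = -1; // -1 means no value is waiting to be read

    LazyRecordReader(byte[] data) {
        this.in = ByteBuffer.wrap(data);
    }

    // Reads the record header and the key only; returns null at end of input.
    byte[] nextRawKey() {
        if (pendingValueLength >= 0) {
            // The caller never asked for the previous value: skip over it
            // without copying its bytes.
            in.position(in.position() + pendingValueLength);
            pendingValueLength = -1;
        }
        if (!in.hasRemaining()) {
            return null;
        }
        int keyLength = in.getInt();
        pendingValueLength = in.getInt();
        byte[] key = new byte[keyLength];
        in.get(key);
        return key; // the buffer is now positioned for the value
    }

    // Materializes the value that belongs to the most recently returned key.
    byte[] nextRawValue() {
        if (pendingValueLength < 0) {
            throw new IllegalStateException("call nextRawKey() first");
        }
        byte[] value = new byte[pendingValueLength];
        in.get(value);
        pendingValueLength = -1;
        return value;
    }

    // Test helper: encodes alternating key/value strings in the assumed layout.
    static byte[] encode(String... keysAndValues) {
        if (keysAndValues.length % 2 != 0) {
            throw new IllegalArgumentException("expected key/value pairs");
        }
        byte[][] parts = new byte[keysAndValues.length][];
        int size = 4 * keysAndValues.length; // one 4-byte length per key or value
        for (int i = 0; i < parts.length; i++) {
            parts[i] = keysAndValues[i].getBytes(StandardCharsets.UTF_8);
            size += parts[i].length;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (int i = 0; i < parts.length; i += 2) {
            out.putInt(parts[i].length).putInt(parts[i + 1].length);
            out.put(parts[i]).put(parts[i + 1]);
        }
        return out.array();
    }
}

class LazyRecordDemo {
    public static void main(String[] args) {
        LazyRecordReader reader = new LazyRecordReader(
                LazyRecordReader.encode("apple", "1", "banana", "22"));
        byte[] rawKey;
        while ((rawKey = reader.nextRawKey()) != null) {
            String key = new String(rawKey, StandardCharsets.UTF_8);
            if (key.equals("banana")) {
                // Only this record's value bytes are ever copied.
                String value = new String(reader.nextRawValue(), StandardCharsets.UTF_8);
                System.out.println(key + " = " + value);
            } else {
                System.out.println(key + ": value skipped without being read");
            }
        }
    }
}
```

In a merge, a caller in the role of Merger.MergeQueue.next() could compare keys from many segments this way and fetch a value only for the segment whose key wins, so at most one value is held in memory at a time.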