[jira] Commented: (HADOOP-4322) Input/Output Format for TFile

Vinod K V (JIRA) Fri, 09 Apr 2010 10:32:15 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855483#action_12855483 ]


Vinod K V commented on HADOOP-4322:
-----------------------------------


When this patch was tested internally, we found a problem: the job reports a 
negative "input bytes" count when using this outputformat. I don't have any 
context on this, so I am just updating this issue with the analysis by Rahul.

{code}

ObjectFileRecordReader implements getPos(), but this method returns 
incorrect values for the offset.

The code flow in the framework is as follows:
=====
      beforePos = getPos();
      // call to the user's record reader next() method
      afterPos = getPos();

      // then, for the counter, we do the following
      // (this is the counter in question):
      inputByteCounter.increment(afterPos - beforePos);
=====

With ObjectFileRecordReader's getPos() method, afterPos < beforePos, which 
results in a negative increment to the counter.
{code}

So this patch shouldn't be committed as is; it needs another look first.
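
To illustrate the counter arithmetic in the analysis above, here is a minimal, self-contained sketch. It does not use the real Hadoop classes: `PosReader`, `readerOver` and `countInputBytes` are hypothetical names made up for this example, and the decreasing positions are only an assumed way for getPos() to misbehave, since the actual cause in ObjectFileRecordReader is not known here.

```java
public class InputByteCounterSketch {

    /** Hypothetical stand-in for a record reader; only getPos() and next() matter here. */
    public interface PosReader {
        long getPos();
        boolean next();
    }

    /** Mimics the described flow: add (afterPos - beforePos) around every call to next(). */
    public static long countInputBytes(PosReader reader) {
        long counter = 0;
        boolean more = true;
        while (more) {
            long beforePos = reader.getPos();
            more = reader.next();
            long afterPos = reader.getPos();
            counter += afterPos - beforePos;
        }
        return counter;
    }

    /** Hypothetical reader whose getPos() steps through the given positions, one per record. */
    public static PosReader readerOver(final long[] positions) {
        return new PosReader() {
            private int i = 0;

            public long getPos() {
                return positions[i];
            }

            public boolean next() {
                if (i + 1 < positions.length) {
                    i++;
                    return true;
                }
                return false; // no more records; position stays where it is
            }
        };
    }

    public static void main(String[] args) {
        // Well-behaved reader: positions only grow, so the counter is positive.
        System.out.println(countInputBytes(readerOver(new long[] {0, 100, 250, 400})));  // 400
        // Misbehaving reader: afterPos < beforePos on each call, so the counter goes negative.
        System.out.println(countInputBytes(readerOver(new long[] {400, 300, 100, 0})));  // -400
    }
}
```

Because the increments telescope, the total is just the last reported position minus the first, so any getPos() that ends below where it started yields a negative "input bytes" count.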

> Input/Output Format for TFile
> -----------------------------
>
>                 Key: HADOOP-4322
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4322
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Amir Youssefi
>            Assignee: Amir Youssefi
>         Attachments: ObjectFileInputOutputFormat_1.patch
>
>
> Input/Output Format for TFile

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
