[ https://issues.apache.org/jira/browse/HADOOP-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855483#action_12855483 ]
Vinod K V commented on HADOOP-4322:
-----------------------------------

When this patch was tested internally, we found a problem: the job reports a negative "input bytes" count when using this outputformat. I don't have any context on this, but I am updating this bug with the analysis by Rahul:

{code}
ObjectFileRecordReader has a getPos() method implementation; this method is giving incorrect values for the offset.
Code flow in the framework is like below.
=====
beforePos = getPos();
//call to user's record reader's next() method.
afterPos = getPos();
//then for the counter we do the following:
inputByteCounter.increment(afterPos - beforePos); //this is the counter which is in question
=====
With ObjectFileRecordReader's getPos() method, afterPos < beforePos; this is resulting in the -ve increment to the counter.
{code}

So this patch shouldn't be committed as is without a relook.

> Input/Output Format for TFile
> -----------------------------
>
>                 Key: HADOOP-4322
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4322
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Amir Youssefi
>            Assignee: Amir Youssefi
>         Attachments: ObjectFileInputOutputFormat_1.patch
>
>
> Input/Output Format for TFile

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
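
For illustration only, the counter arithmetic described in the comment can be reproduced with a small standalone Java sketch. It does not use the real Hadoop classes: `PositionedReader`, `readerReporting`, and `countInputBytes` are hypothetical stand-ins, and the position values are made up. The sketch only assumes what the analysis states, namely that the framework adds `afterPos - beforePos` to the counter around each `next()` call, so a reader whose `getPos()` goes backwards drives the total negative.

```java
class InputByteCounterSketch {

    /** Hypothetical stand-in for a record reader; not the Hadoop interface. */
    interface PositionedReader {
        boolean next();   // advance to the next record, false when exhausted
        long getPos();    // offset the reader claims to be at
    }

    /** Builds a reader that reports the given (made-up) offsets in order. */
    static PositionedReader readerReporting(long... positions) {
        return new PositionedReader() {
            private int i = 0;

            @Override
            public boolean next() {
                if (i + 1 >= positions.length) {
                    return false;
                }
                i++;
                return true;
            }

            @Override
            public long getPos() {
                return positions[i];
            }
        };
    }

    /** Mirrors the quoted flow: counter += afterPos - beforePos per next(). */
    static long countInputBytes(PositionedReader reader) {
        long counter = 0;
        boolean more = true;
        while (more) {
            long beforePos = reader.getPos();
            more = reader.next();
            long afterPos = reader.getPos();
            counter += afterPos - beforePos;
        }
        return counter;
    }

    public static void main(String[] args) {
        // Offsets that only grow: the counter ends up at 400.
        System.out.println(countInputBytes(readerReporting(0, 100, 250, 400)));
        // Offsets that shrink (afterPos < beforePos): the counter ends up at -400.
        System.out.println(countInputBytes(readerReporting(400, 250, 100, 0)));
    }
}
```

Because the increments telescope, the total is simply the last reported offset minus the first, so any `getPos()` that ends below where it started yields a negative "input bytes" value regardless of how many records were read.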