[ https://issues.apache.org/jira/browse/PIG-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609725#comment-16609725 ]
Koji Noguchi commented on PIG-5355:
-----------------------------------

Thanks for the fix. Looks good to me. +1

Outside of this jira, I still don't like the logic of {{HBaseTableInputFormat.getProgress()}}
{code:java}
if (bigLastRow.compareTo(bigEnd_) > 0) {
    return progressSoFar_;
}
{code}
which means that when records have a longer key length than {{max(startRow_.length, endRow_.length)}}, progress stays the same.

> Negative progress report by HBaseTableRecordReader
> --------------------------------------------------
>
>                 Key: PIG-5355
>                 URL: https://issues.apache.org/jira/browse/PIG-5355
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Satish Subhashrao Saley
>            Assignee: Satish Subhashrao Saley
>            Priority: Major
>         Attachments: PIG-5355-1.patch, PIG-5355-2.patch, PIG-5355-3.patch
>
>
> The logic for padding the current row does not consider the updated padded row during the comparison. It ends up with a different length than expected. This results in a negative value for {{processed}}.
> {code}
> byte[] lastPadded = currRow_;
> if (currRow_.length < endRow_.length) {
>     lastPadded = Bytes.padTail(currRow_, endRow_.length - currRow_.length);
> }
> if (currRow_.length < startRow_.length) {
>     lastPadded = Bytes.padTail(currRow_, startRow_.length - currRow_.length);
> }
> byte [] prependHeader = {1, 0};
> BigInteger bigLastRow = new BigInteger(Bytes.add(prependHeader, lastPadded));
> if (bigLastRow.compareTo(bigEnd_) > 0) {
>     return progressSoFar_;
> }
> BigDecimal processed = new BigDecimal(bigLastRow.subtract(bigStart_));
> {code}
> The fix is to use {{lastPadded}} in the second {{if}} comparison and in the {{Bytes.padTail}} call inside that {{if}}.
> PIG-4700 added progress reporting. This enabled ProgressHelper in Tez.
> It calls {{getProgress}} [here|https://github.com/apache/tez/blob/master/tez-api/src/main/java/org/apache/tez/common/ProgressHelper.java#L50] on {{PigRecordReader}} (https://github.com/apache/pig/blob/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigRecordReader.java#L159). Since Pig is reporting negative progress, the job is getting killed by the AM.
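
For illustration, here is a minimal, self-contained sketch of the padding problem described in the issue and of the fix as it is worded there. It is not the committed patch. The class name {{PaddingSketch}}, the helpers {{padAsReported}}, {{padAsFixed}} and {{toBig}}, and the sample row keys are all hypothetical. {{padTail}} is a local stand-in for HBase's {{Bytes.padTail}}, so the sketch runs without HBase on the classpath. The sketch also assumes that {{bigStart_}} is built from the start row padded to the longer of the two boundary lengths, with the same {1, 0} header.

```java
import java.math.BigInteger;
import java.util.Arrays;

public class PaddingSketch {
    // Local stand-in for HBase's Bytes.padTail: append `extra` zero bytes.
    static byte[] padTail(byte[] a, int extra) {
        return Arrays.copyOf(a, a.length + extra);
    }

    // Padding as quoted in the issue: the second `if` tests and pads the
    // unpadded row, so it can overwrite the longer result of the first `if`.
    static byte[] padAsReported(byte[] currRow, byte[] startRow, byte[] endRow) {
        byte[] lastPadded = currRow;
        if (currRow.length < endRow.length) {
            lastPadded = padTail(currRow, endRow.length - currRow.length);
        }
        if (currRow.length < startRow.length) {
            lastPadded = padTail(currRow, startRow.length - currRow.length);
        }
        return lastPadded;
    }

    // Padding as the issue describes the fix: compare and pad `lastPadded`,
    // so the result is at least as long as both boundary rows.
    static byte[] padAsFixed(byte[] currRow, byte[] startRow, byte[] endRow) {
        byte[] lastPadded = currRow;
        if (lastPadded.length < endRow.length) {
            lastPadded = padTail(lastPadded, endRow.length - lastPadded.length);
        }
        if (lastPadded.length < startRow.length) {
            lastPadded = padTail(lastPadded, startRow.length - lastPadded.length);
        }
        return lastPadded;
    }

    // Prepend the {1, 0} header from the issue so the value is positive
    // and leading zero bytes of the row are not dropped.
    static BigInteger toBig(byte[] row) {
        byte[] withHeader = new byte[row.length + 2];
        withHeader[0] = 1;
        System.arraycopy(row, 0, withHeader, 2, row.length);
        return new BigInteger(withHeader);
    }

    public static void main(String[] args) {
        // Hypothetical keys: current row shorter than start, start shorter than end.
        byte[] startRow = {1, 0};
        byte[] endRow = {9, 0, 0, 0};
        byte[] currRow = {5};

        // Assumption: the start boundary is padded to the longer boundary length.
        BigInteger bigStart = toBig(padTail(startRow, endRow.length - startRow.length));

        byte[] reported = padAsReported(currRow, startRow, endRow);
        byte[] fixed = padAsFixed(currRow, startRow, endRow);

        System.out.println("reported length = " + reported.length
                + ", sign of processed = " + toBig(reported).subtract(bigStart).signum());
        System.out.println("fixed length    = " + fixed.length
                + ", sign of processed = " + toBig(fixed).subtract(bigStart).signum());
    }
}
```

With these sample keys, the reported logic leaves the row two bytes long while the boundaries are four bytes long, so the subtraction goes negative. The fixed logic keeps the four-byte padding and the difference stays non-negative.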