[ https://issues.apache.org/jira/browse/HBASE-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243208#comment-15243208 ]
Josh Elser commented on HBASE-15660:
------------------------------------

The change is dirt simple, but I need to rerun with it just to make sure that it's working as I expect it should.

> Printing extra refs in ITBLL.Verify reducer can cause reducers to be killed
> due to lack of progress
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-15660
>                 URL: https://issues.apache.org/jira/browse/HBASE-15660
>             Project: HBase
>          Issue Type: Bug
>          Components: integration tests
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Minor
>             Fix For: 2.0.0, 1.1.5, 1.2.2
>
>
> In debugging an ITBLL job which has numerous failures, I saw that instead of
> the Verify job completing and reporting that there were a large number of
> UNDEF nodes, the reducers in the Verify were failing due to lack of progress.
> The reducer's syslog file was filled with information from the
> {{dumpExtraInfoOnRefs()}} method. I believe that when a reducer is repeatedly
> doing these lookups, the MR framework doesn't realize that any progress is
> being made (nothing is being written to the context) and eventually kills the
> reducer task. This ultimately causes the entire Verify job to fail because
> the reducer fails in the same manner each time.
> We should make sure to invoke {{context.progress()}} when we do these lookups
> to let the framework know that we're still doing "our thing".
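
For illustration only, here is a minimal sketch of the idea described in the issue: call progress() after each lookup so the framework sees the task is still alive. This is not the actual patch. The Progressable interface below is a local stand-in for Hadoop's org.apache.hadoop.util.Progressable (which the reducer context implements) so that the sketch compiles without Hadoop on the classpath, and the method signature and the lookup function are hypothetical, not the real ITBLL code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class ProgressSketch {

    /** Local stand-in for org.apache.hadoop.util.Progressable (assumption: used only so this compiles standalone). */
    public interface Progressable {
        void progress();
    }

    /**
     * Hypothetical shape of a per-ref lookup loop. Each lookup may be slow and
     * writes nothing to the context, so we report progress explicitly after
     * every ref instead of relying on output to signal liveness.
     */
    public static List<String> dumpExtraInfoOnRefs(
            List<String> refs,
            Function<String, String> lookup,   // hypothetical: fetches extra info for one ref
            Progressable context) {
        List<String> info = new ArrayList<>();
        for (String ref : refs) {
            info.add(lookup.apply(ref));
            // Tell the framework we are still working, even though no records were emitted.
            context.progress();
        }
        return info;
    }
}
```

In a real reducer the third argument would simply be the reducer's context, and the collected strings would go to the log rather than being returned.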