[ 
https://issues.apache.org/jira/browse/HBASE-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-15660.
--------------------------------
       Resolution: Invalid
         Assignee:     (was: Josh Elser)
    Fix Version/s:     (was: 1.2.2)
                       (was: 1.1.5)
                       (was: 2.0.0)

Nope, I'm wrong. The message came from the general reducer timeout, not from 
a lack of progress. Fine as is.

> Printing extra refs in ITBLL.Verify reducer can cause reducers to be killed 
> due to lack of progress
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-15660
>                 URL: https://issues.apache.org/jira/browse/HBASE-15660
>             Project: HBase
>          Issue Type: Bug
>          Components: integration tests
>            Reporter: Josh Elser
>            Priority: Minor
>
> In debugging an ITBLL job which has numerous failures, I saw that instead of 
> the Verify job completing and reporting that there were a large number of 
> UNDEF nodes, the reducers in the Verify were failing due to lack of progress.
> The reducer's syslog file was filled with information from the 
> {{dumpExtraInfoOnRefs()}} method. I believe that when a reducer is repeatedly 
> doing these lookups, the MR framework doesn't realize that any progress is 
> being made (nothing is being written to the context) and eventually kills the 
> reducer task. This ultimately causes the entire Verify job to fail because 
> the reducer fails in the same manner each time.
> We should make sure to invoke {{context.progress()}} when we do these lookups 
> to let the framework know that we're still doing "our thing".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
