[ https://issues.apache.org/jira/browse/HBASE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lars Hofhansl updated HBASE-9286:
---------------------------------
    Fix Version/s:     (was: 0.96.1)
                       (was: 0.98.0)
          Summary: [0.94] ageOfLastShippedOp replication metric doesn't update if the slave regionserver is stalled  (was: ageOfLastShippedOp replication metric doesn't update if the slave regionserver is stalled)

Cool. Making it 0.94 only, then.
Going to commit tomorrow, unless I hear objections.

> [0.94] ageOfLastShippedOp replication metric doesn't update if the slave regionserver is stalled
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9286
>                 URL: https://issues.apache.org/jira/browse/HBASE-9286
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Alex Newman
>            Assignee: Alex Newman
>             Fix For: 0.94.12
>
>         Attachments: 0001-HBASE-9286.-ageOfLastShippedOp-replication-metric-do.patch
>
>
> In replicationmanager
>         HRegionInterface rrs = getRS();
>         rrs.replicateLogEntries(Arrays.copyOf(this.entriesArray, currentNbEntries));
>         ....
>         this.metrics.setAgeOfLastShippedOp(
>             this.entriesArray[currentNbEntries-1].getKey().getWriteTime());
>         break;
> which makes sense, but is wrong. The problem is that rrs.replicateLogEntries will block for a very long time if the slave server is suspended or unavailable but not down, so the metric is never updated while the call is stuck.
> However, this is easy to fix. We just need to call
>         refreshAgeOfLastShippedOp();
> on a regular basis, in a different thread. I've attached a patch which fixed this for cdh4. I can make one for trunk and the like as well if you need me to, but it's a small change.
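
For illustration only, the approach described in the issue (refreshing the metric periodically from a separate thread so that a blocked replicateLogEntries call cannot freeze it) could be sketched roughly as below. This is not the attached patch. The class AgeOfLastShippedOpSketch, the startRefresher method, and the refresh period are hypothetical stand-ins; only the method names setAgeOfLastShippedOp and refreshAgeOfLastShippedOp are taken from the issue text, and their bodies here are assumptions.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the replication metrics object; not the HBase class. */
class AgeOfLastShippedOpSketch {
    // Write time (epoch millis) of the last edit that was shipped successfully.
    private final AtomicLong lastShippedWriteTime = new AtomicLong();
    // The published metric: how old the last shipped edit is, in millis.
    private final AtomicLong ageOfLastShippedOp = new AtomicLong();

    /** Called by the shipping thread after a batch has been replicated. */
    void setAgeOfLastShippedOp(long writeTimeMillis) {
        lastShippedWriteTime.set(writeTimeMillis);
        refreshAgeOfLastShippedOp();
    }

    /** Recomputes the age from the last recorded write time; callable from any thread. */
    void refreshAgeOfLastShippedOp() {
        long last = lastShippedWriteTime.get();
        if (last > 0) {
            ageOfLastShippedOp.set(System.currentTimeMillis() - last);
        }
    }

    long getAgeOfLastShippedOp() {
        return ageOfLastShippedOp.get();
    }

    /** Starts a daemon thread that refreshes the metric at a fixed period. */
    ScheduledExecutorService startRefresher(long periodMillis) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "age-of-last-shipped-op-refresher");
            t.setDaemon(true); // must not keep the JVM alive on shutdown
            return t;
        });
        ses.scheduleAtFixedRate(this::refreshAgeOfLastShippedOp,
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return ses;
    }
}
```

With something like this in place, the reported age keeps growing while the shipping thread is stuck in the remote call, which is the behaviour the summary says is missing when the slave regionserver is stalled.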