[ https://issues.apache.org/jira/browse/HBASE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748210#comment-13748210 ]
Alex Newman commented on HBASE-9286:
------------------------------------

Sorry for the delay; apparently I'm not getting JIRA email updates. JD, are you looking for an image? But yes, I did try it and it works. Basically, with this patch the graph looks like you'd expect: a rising line while the replicant is suspended, which then drops close to 0 when I unsuspend the replicant.

> ageOfLastShippedOp replication metric doesn't update if the slave regionserver is stalled
> -----------------------------------------------------------------------------------------
>
>                 Key: HBASE-9286
>                 URL: https://issues.apache.org/jira/browse/HBASE-9286
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Alex Newman
>            Assignee: Alex Newman
>         Attachments: 0001-HBASE-9286.-ageOfLastShippedOp-replication-metric-do.patch
>
>
> In replicationmanager
>     HRegionInterface rrs = getRS();
>     rrs.replicateLogEntries(Arrays.copyOf(this.entriesArray, currentNbEntries));
>     ....
>     this.metrics.setAgeOfLastShippedOp(
>         this.entriesArray[currentNbEntries-1].getKey().getWriteTime());
>     break;
> which makes sense, but is wrong. The problem is that rrs.replicateLogEntries will block for a very long time if the slave server is suspended or unavailable but not down, so the metric is not updated in the meantime.
> However, this is easy to fix. We just need to call
>     refreshAgeOfLastShippedOp();
> on a regular basis, in a different thread. I've attached a patch which fixed this for cdh4. I can make one for trunk and the like as well if you need me to do so, but it's a small change.
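
For anyone skimming the thread, the idea in the description can be sketched roughly as below. This is a minimal, self-contained illustration and not the attached patch: the class name ShippedOpAgeMetric, the injected clock, and startRefresher are hypothetical, and the bodies of setAgeOfLastShippedOp and refreshAgeOfLastShippedOp are assumed rather than copied from HBase. The point is only that a second thread keeps recomputing the age while the shipping thread may be blocked in replicateLogEntries.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the replication metrics object; not the HBase class. */
class ShippedOpAgeMetric {
    private final LongSupplier clock;   // injectable clock (assumption, for testability)
    private long lastWriteTime;         // write time of the last edit that was shipped
    private long ageOfLastShippedOp;    // value a metrics system would read

    ShippedOpAgeMetric(LongSupplier clock) {
        this.clock = clock;
        this.lastWriteTime = clock.getAsLong();
    }

    /** Called by the shipping thread after a batch has been replicated. */
    synchronized void setAgeOfLastShippedOp(long writeTime) {
        lastWriteTime = writeTime;
        ageOfLastShippedOp = clock.getAsLong() - writeTime;
    }

    /** Recomputes the age from the last known write time; callable from any thread. */
    synchronized void refreshAgeOfLastShippedOp() {
        ageOfLastShippedOp = clock.getAsLong() - lastWriteTime;
    }

    synchronized long getAgeOfLastShippedOp() {
        return ageOfLastShippedOp;
    }

    /** Starts a daemon thread that refreshes the metric at a fixed period. */
    ScheduledExecutorService startRefresher(long periodMillis) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "age-of-last-shipped-op-refresher");
            t.setDaemon(true); // do not keep the JVM alive just for the metric
            return t;
        });
        ses.scheduleAtFixedRate(this::refreshAgeOfLastShippedOp,
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return ses;
    }
}
```

Because the shipping thread does not hold the metric's lock while it waits on the remote call, the refresher can keep updating the value, which would give the rising line described above while the slave is suspended.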