[ https://issues.apache.org/jira/browse/HBASE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15624243#comment-15624243 ]
stack commented on HBASE-16960:
-------------------------------

This makes sense:

    // SyncFuture reuse by thread, if TimeoutIOException happens, ringbuffer
    // still refer to it, so if this thread use it next time may get a wrong
    // result.
    this.syncFuturesByHandler.remove(Thread.currentThread());
    ...

Must have taken a while to figure out. Patch looks good to me, [~aoxiang].

Does the test reproduce the scenario you've run into? And when you do reproduce the lockup, does freeing the SyncFutures unblock us?

I'll attach a patch I've been working on. I am missing a final ingredient because it is not locking up yet. I was going to work on it this evening, but if your patch does the job, I'll give up on it.

Thanks [~aoxiang] for fixing this stuff.

> RegionServer hang when aborting
> -------------------------------
>
>                 Key: HBASE-16960
>                 URL: https://issues.apache.org/jira/browse/HBASE-16960
>             Project: HBase
>          Issue Type: Bug
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: HBASE-16960.patch, HBASE-16960_master_v2.patch, HBASE-16960_master_v3.patch, RingBufferEventHandler.png, RingBufferEventHandler_exception.png, SyncFuture.png, SyncFuture_exception.png, rs1081.jstack
>
>
> We see regionserver hang when aborting several times and cause all regions on
> this regionserver out of service and then all affected applications stop
> works.