[ 
https://issues.apache.org/jira/browse/HBASE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15624243#comment-15624243
 ] 

stack commented on HBASE-16960:
-------------------------------

This makes sense:

          // SyncFuture reuse by thread, if TimeoutIOException happens, ringbuffer
          // still refer to it, so if this thread use it next time may get a wrong
          // result.
          this.syncFuturesByHandler.remove(Thread.currentThread());

... Must have taken a while to figure out.
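
To spell out my reading of why the remove matters, here is a minimal standalone sketch. It is not the FSHLog code: SketchSyncFuture, SyncFutureCacheSketch, getSyncFuture and blockOnSync are stand-in names I made up for illustration, and I am assuming the cache is a map keyed by handler thread, as the syncFuturesByHandler name suggests. The idea is that a future that timed out may still be referenced by the ringbuffer, so the thread should not reset and reuse that same instance on its next sync.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a per-sync future; NOT the real HBase SyncFuture.
final class SketchSyncFuture {
    private long txid = -1;
    private boolean done;

    // Re-arm this instance for a new sync request.
    synchronized SketchSyncFuture reset(long newTxid) {
        this.txid = newTxid;
        this.done = false;
        return this;
    }

    // Called by the consumer side once the sync has completed.
    synchronized void markDone() {
        done = true;
        notifyAll();
    }

    // Returns true if completed within timeoutMs, false on timeout.
    synchronized boolean await(long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!done) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            wait(remaining);
        }
        return true;
    }

    synchronized long getTxid() {
        return txid;
    }
}

// Hypothetical cache that hands each handler thread its own reusable future.
final class SyncFutureCacheSketch {
    private final Map<Thread, SketchSyncFuture> syncFuturesByHandler =
        new ConcurrentHashMap<>();

    // Reuse the calling thread's future if one is cached, else create one.
    SketchSyncFuture getSyncFuture(long txid) {
        return syncFuturesByHandler
            .computeIfAbsent(Thread.currentThread(), t -> new SketchSyncFuture())
            .reset(txid);
    }

    // Wait for completion; on timeout or interrupt, evict the cached instance.
    boolean blockOnSync(SketchSyncFuture future, long timeoutMs) {
        boolean completed;
        try {
            completed = future.await(timeoutMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            completed = false;
        }
        if (!completed) {
            // Something else may still hold this instance and complete it
            // later, so drop it; the next call gets a fresh future.
            syncFuturesByHandler.remove(Thread.currentThread());
        }
        return completed;
    }

    public static void main(String[] args) {
        SyncFutureCacheSketch cache = new SyncFutureCacheSketch();
        SketchSyncFuture stale = cache.getSyncFuture(1L);
        boolean completed = cache.blockOnSync(stale, 10); // nobody completes it
        SketchSyncFuture fresh = cache.getSyncFuture(2L);
        System.out.println("completed=" + completed
            + " reusedStaleInstance=" + (fresh == stale));
    }
}
```

Without the eviction, the second getSyncFuture call would reset the very instance that the late consumer could still mark done, which is the wrong result the patch comment warns about.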

Patch looks good to me, [~aoxiang]. Does the test reproduce the scenario you've 
run into? And when you do reproduce the lockup, does freeing the SyncFutures 
unblock us?

I'll attach a patch I've been working on. I am missing a final ingredient 
because it is not locking up yet. I was going to work on it this evening, but if 
your patch does the job, I'll give up on it.

Thanks [~aoxiang] for fixing this stuff.



> RegionServer hang when aborting
> -------------------------------
>
>                 Key: HBASE-16960
>                 URL: https://issues.apache.org/jira/browse/HBASE-16960
>             Project: HBase
>          Issue Type: Bug
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: HBASE-16960.patch, HBASE-16960_master_v2.patch, 
> HBASE-16960_master_v3.patch, RingBufferEventHandler.png, 
> RingBufferEventHandler_exception.png, SyncFuture.png, 
> SyncFuture_exception.png, rs1081.jstack
>
>
> Several times we have seen a regionserver hang while aborting. This takes all 
> regions on that regionserver out of service, and all affected applications 
> then stop working.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
