[ https://issues.apache.org/jira/browse/HBASE-17889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16006978#comment-16006978 ]
Enis Soztutar commented on HBASE-17889:
---------------------------------------

Thanks [~huaxiang]. Did you verify that {{hbase.ipc.client.specificThreadForWriting}} has a problem in 2.0?

> ResultBoundedCompletionService's cancel() needs to interrupt the working
> thread and free it to the thread-pool
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-17889
>                 URL: https://issues.apache.org/jira/browse/HBASE-17889
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 2.0.0, 1.4.0, 1.2.6, 1.3.2
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>         Attachments: HBASE-17889-master-001.patch, jstack.txt
>
>
> We ran into a case with read replicas: when the server hosting the primary
> region was shut down, a Get did not go to the replica region and paused for
> about 50 seconds before it resumed.
> More debugging showed that when the server was down, one of the threads was
> stuck in a write while holding the lock at
> https://github.com/apache/hbase/blob/branch-1.3/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientImpl.java#L916.
> Later write threads waited on this lock until every thread in the
> connection's thread pool was stuck on it. At that point no work could be
> done. Once the socket write timed out, all threads were freed and work
> continued.
> When QueueingFuture#cancel() is called, it does not interrupt the working
> thread or return the thread to the pool.
> Attaching the jstack trace.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
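
The effect described in the issue (a cancel that does not interrupt leaves the worker thread occupied) can be reproduced with plain {{java.util.concurrent}}. The following is a minimal sketch, not HBase code: it does not use ResultBoundedCompletionService or QueueingFuture, the class and method names are hypothetical, and the 60 s sleep and 500 ms wait are arbitrary stand-ins for a blocked socket write and a caller's patience.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; illustrates Future.cancel semantics only.
public class CancelInterruptDemo {

    /**
     * Blocks the only worker of a single-thread pool, cancels that task,
     * then reports whether a follow-up task could run within 500 ms.
     */
    public static boolean workerFreedAfterCancel(boolean mayInterrupt) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CountDownLatch started = new CountDownLatch(1);
        try {
            // Stand-in for an RPC stuck in a blocking write.
            Future<?> stuck = pool.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(60_000);
                } catch (InterruptedException e) {
                    // Interrupted: stop working so the worker returns to the pool.
                }
            });
            started.await(); // make sure the task is actually running
            stuck.cancel(mayInterrupt);

            // With the worker still blocked, this task just sits in the queue.
            Future<String> next = pool.submit(() -> "done");
            try {
                next.get(500, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // interrupts the worker if it is still sleeping
        }
    }

    public static void main(String[] args) {
        System.out.println("cancel(false) frees worker: " + workerFreedAfterCancel(false));
        System.out.println("cancel(true) frees worker: " + workerFreedAfterCancel(true));
    }
}
```

With {{cancel(false)}} the future is marked cancelled but the worker keeps sleeping, so the follow-up task times out. With {{cancel(true)}} the worker is interrupted and picks up the next task. Note that an interrupt only helps if the blocked operation responds to interruption, which this sketch assumes.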