[ https://issues.apache.org/jira/browse/HBASE-12028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240486#comment-14240486 ]
Hadoop QA commented on HBASE-12028:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12686159/Hbase-12028-v1.patch
  against master branch at commit 1be63539f1e97a70bcd1eb6cbb48891b00146c51.
  ATTACHMENT ID: 12686159

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 5 new or modified tests.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

    {color:red}-1 checkstyle{color}. The applied patch generated 2097 checkstyle errors (more than the master's current 2094 errors).

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
    + int handlerCount = context.conf.getInt(HConstants.REGION_SERVER_HANDLER_COUNT, HConstants.DEFAULT_REGION_SERVER_HANDLER_COUNT);
    + .newReflectiveBlockingService(new TestRpcServiceProtos.TestProtobufRpcProto.BlockingInterface() {
    + super(null, "testRpcServer", Lists.newArrayList(new BlockingServiceAndInterface(SERVICE, null)), new InetSocketAddress(

    {color:green}+1 site{color}. The mvn site goal succeeds with this patch.

    {color:red}-1 core tests{color}.
The patch failed these unit tests:
    org.apache.hadoop.hbase.ipc.TestRpcHandlerException

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12031//console

This message is automatically generated.

> Abort the RegionServer, when one of it's handler threads die
> ------------------------------------------------------------
>
>                 Key: HBASE-12028
>                 URL: https://issues.apache.org/jira/browse/HBASE-12028
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Sudarshan Kadambi
>            Assignee: Alicia Ying Shu
>         Attachments: Hbase-12028-v1.patch, Hbase-12028.patch
>
>
> Over in HBASE-11813, a user identified an issue in which all the RPC handler
> threads would exit with StackOverflow errors due to an unchecked
> recursion-terminating condition. Our clusters demonstrated the same trace.
> While the patch posted for HBASE-11813 got our clusters to be merry again,
> the breakdown surfaced some larger issues.
> When the RegionServer had all its RPC handler threads dead, it continued to
> have regions assigned to it. Clearly, it wouldn't be able to serve reads and
> writes on those regions. A second issue was that when a user tried to disable
> or drop a table, the master would try to communicate with the regionserver for
> region unassignment. Since the same handler threads seem to be used for
> master <-> RS communication as well, the master ended up hanging on the RS
> indefinitely. Eventually, the master stopped responding to all table
> meta-operations.
> A handler thread should never exit, and if it does, it seems like the more
> prudent thing to do would be for the RS to abort. This way, at least recovery
> can be undertaken and the regions could be reassigned elsewhere. I also think
> that the master <-> RS communication should get its own exclusive threadpool,
> but I'll wait until this issue has been sufficiently discussed before opening
> a ticket for that.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
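
For illustration only, the behaviour proposed in the quoted description (a handler thread that dies should cause the server to abort rather than linger with no handlers) might be sketched roughly as below. This is not the code in Hbase-12028-v1.patch. AbortHook, Handler, HandlerAbortSketch and runDemo are hypothetical names made up for this sketch, and the StackOverflowError is thrown by hand to simulate the runaway recursion described in HBASE-11813.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class HandlerAbortSketch {

  /** Hypothetical stand-in for whatever the server uses to shut itself down. */
  interface AbortHook {
    void abort(String why, Throwable cause);
  }

  /** Drains a call queue; reports its own death instead of exiting silently. */
  static final class Handler extends Thread {
    private final BlockingQueue<Runnable> calls;
    private final AbortHook abortHook;

    Handler(String name, BlockingQueue<Runnable> calls, AbortHook abortHook) {
      super(name);
      setDaemon(true);
      this.calls = calls;
      this.abortHook = abortHook;
    }

    @Override
    public void run() {
      try {
        while (true) {
          calls.take().run();
        }
      } catch (InterruptedException e) {
        // Orderly shutdown: restore the interrupt flag and exit without aborting.
        Thread.currentThread().interrupt();
      } catch (Throwable t) {
        // Anything else, including Errors such as StackOverflowError, ends this
        // handler's loop, so ask the server to abort.
        abortHook.abort("Handler " + getName() + " died", t);
      }
    }
  }

  /** Starts one handler, feeds it a failing call, and returns the cause passed to the hook. */
  static Throwable runDemo() throws InterruptedException {
    BlockingQueue<Runnable> calls = new LinkedBlockingQueue<>();
    AtomicReference<Throwable> abortCause = new AtomicReference<>();
    CountDownLatch aborted = new CountDownLatch(1);

    AbortHook hook = (why, cause) -> {
      abortCause.set(cause);
      aborted.countDown();
    };

    new Handler("handler-0", calls, hook).start();

    // Simulated failure: a real server would hit this deep inside request handling.
    calls.put(() -> {
      throw new StackOverflowError("simulated runaway recursion");
    });

    // Wait (bounded) for the hook to fire; returns null if it never did.
    aborted.await(5, TimeUnit.SECONDS);
    return abortCause.get();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("abort requested because of: " + runDemo());
  }
}
```

In this sketch an interrupt is treated as a normal shutdown and does not trigger the hook. Whether a real server should abort immediately or first try to replace the dead handler is a design question the issue leaves open for discussion.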