[
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258998#comment-13258998
]
stack commented on HBASE-5833:
------------------------------
More digging. The newest test added here,
testShouldCheckMasterFailOverWhenMETAIsInOpenedState, is a little interesting.
It was added by this commit:
{code}
------------------------------------------------------------------------
r1172063 | tedyu | 2011-09-17 13:27:00 -0700 (Sat, 17 Sep 2011) | 3 lines
HBASE-4400 .META. getting stuck if RS hosting it is dead and znode state is in
RS_ZK_REGION_OPENED (Ramkrishna)
{code}
The test is a bunch of copy/paste, confirming stuff it's not using. It then
does a cluster shutdown, but does it explicitly on a cluster object and not via
HBaseTestingUtility, though it subsequently starts a cluster with
HBaseTestingUtility. Not using HTU for both the shutdown and the startup can
leave the HTU state confused about whether there is a master available, so we
just wait forever. This seems to be responsible for the case where the test
would time out after 15 minutes and say no tests run and none failed.
I added a timeout for this test of 3 minutes.
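For anyone wanting to see what a per-test timeout buys us: in JUnit 4 it is
usually spelled @Test(timeout = 180000), though how it was set in the patch is
not shown here. Below is a minimal stdlib-only sketch of the idea: run the body
on a worker thread and give up after a deadline instead of waiting forever.
BoundedRun and finishesWithin are made-up names for illustration, not HBase or
JUnit API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, for illustration only.
final class BoundedRun {
    /** Runs body on a worker thread; returns true if it finished within the limit. */
    static boolean finishesWithin(Runnable body, long timeoutMillis) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Future<?> result = worker.submit(body);
            // Blocks until the body completes or the deadline passes.
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } finally {
            // Interrupt the body so the worker thread does not linger.
            worker.shutdownNow();
        }
    }
}
```

A body that blocks indefinitely then shows up as a plain failure after the
deadline rather than as a hung build.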
Other interesting stuff is that TestMasterFailover starts a cluster per
method, but shutdown leaves some threads around. I dug in some and was able to
clean up an LruBlockCache eviction thread, but others persist and would take a
little more work to undo. They seem harmless but I'll list them anyway:
{code}
TestMasterFailover [JUnit]
  org.eclipse.jdt.internal.junit.runner.RemoteTestRunner at localhost:54811
    Thread [main] (Running)
    Thread [ReaderThread] (Running)
    Thread [Thread-2] (Suspended (breakpoint at line 587 in HBaseTestingUtility))
      HBaseTestingUtility.shutdownMiniCluster() line: 587
      TestMasterFailover.testSimpleMasterFailover() line: 178
      NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
      NativeMethodAccessorImpl.invoke(Object, Object[]) line: 39
      DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 25
      Method.invoke(Object, Object...) line: 597
      FrameworkMethod$1.runReflectiveCall() line: 45
      FrameworkMethod$1(ReflectiveCallable).run() line: 15
      FrameworkMethod.invokeExplosively(Object, Object...) line: 42
      InvokeMethod.evaluate() line: 20
      FailOnTimeout$StatementThread.run() line: 62
    Daemon Thread [Poller SunPKCS11-Darwin] (Running)
    Thread [pool-1-thread-1] (Running)
    Thread [pool-2-thread-1] (Running)
    Thread [pool-3-thread-1] (Running)
    Thread [pool-4-thread-1] (Running)
    Daemon Thread [LeaseChecker] (Running)
    Daemon Thread [RegionServer:2;192.168.1.74,54842,1335066804457.decayingSampleTick.1] (Running)
    Daemon Thread [Master:2;192.168.1.74,54838,1335066803952-SendThread(fe80:0:0:0:0:0:0:1%1:21818)] (Running)
    Daemon Thread [Master:2;192.168.1.74,54838,1335066803952-EventThread] (Running)
    Daemon Thread [Master:1;192.168.1.74,54836,1335066798880-EventThread] (Running)
    Daemon Thread [Master:1;192.168.1.74,54836,1335066798880-SendThread(localhost:21818)] (Running)
  /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/java (Apr 21, 2012 8:53:07 PM)
{code}
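A listing like the one above came out of the debugger, but a rough version can
be had from inside the JVM by snapshotting thread names before cluster startup
and again after shutdown. A minimal sketch follows, using only
Thread.getAllStackTraces(); ThreadLeakCheck and its methods are hypothetical
names, not anything in HBaseTestingUtility.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, for illustration only.
final class ThreadLeakCheck {
    /** Names of all live threads in this JVM right now, daemons marked. */
    static Set<String> liveThreadNames() {
        Set<String> names = new TreeSet<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName() + (t.isDaemon() ? " (daemon)" : ""));
        }
        return names;
    }

    /** Thread names present in 'after' that were not present in 'before'. */
    static Set<String> leaked(Set<String> before, Set<String> after) {
        Set<String> extra = new TreeSet<>(after);
        extra.removeAll(before);
        return extra;
    }
}
```

Comparing by name only is crude (two threads with the same name collapse into
one entry), so this is a first-pass check rather than a precise accounting.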
The thread names are enhanced in v2 of this patch, but things like
decayingSampleTick are set in a static, so they are hard to get rid of in test
setup. The SendThread/EventThread threads are left over from the zk client.
Not sure what the pool-N-thread-1 threads are (I've enhanced the HTable
executor to include htable in its thread names so these are identifiable going
forward, but the executor above does not seem to be HTable's).
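For reference, anonymous names like pool-4-thread-1 are what the default
java.util.concurrent thread factory hands out. Giving an executor its own
ThreadFactory is the usual way to make its threads identifiable. The sketch
below shows the general technique only; NamedThreadFactory and the
"htable-pool" prefix are made up for illustration and are not the actual
change in the patch.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory, for illustration only.
final class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        // Name each thread after its owner so it is recognizable in a thread dump.
        Thread t = new Thread(task, prefix + "-thread-" + counter.getAndIncrement());
        // Daemon threads do not keep the JVM alive if a test forgets to shut down.
        t.setDaemon(true);
        return t;
    }
}
```

It would be passed in where the pool is built, e.g.
Executors.newFixedThreadPool(4, new NamedThreadFactory("htable-pool")).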
> 0.92 build has been failing pretty consistently on TestMasterFailover....
> -------------------------------------------------------------------------
>
> Key: HBASE-5833
> URL: https://issues.apache.org/jira/browse/HBASE-5833
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: stack
> Fix For: 0.92.2
>
> Attachments: 5833.txt, closehregions.txt
>
>
> Trunk seems fine but 0.92 fails on this test pretty regularly. Running it
> local it seems to hang for me.