[ https://issues.apache.org/jira/browse/HBASE-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeffrey Zhong updated HBASE-9318:
---------------------------------
       Resolution: Fixed
    Fix Version/s: 0.96.0
                   0.98.0
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

Thanks [~saint....@gmail.com] & [~jmhsieh] for the reviews! I integrated the patch into the 0.95 and trunk branches.

> Procedure#waitForLatch may not throw error even there is one
> ------------------------------------------------------------
>
>                 Key: HBASE-9318
>                 URL: https://issues.apache.org/jira/browse/HBASE-9318
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jeffrey Zhong
>            Assignee: Jeffrey Zhong
>             Fix For: 0.98.0, 0.96.0
>
>         Attachments: hbase-9318.patch
>
>
> On SUSE, TestProcedureCoordinator#testUnreachableControllerDuringCommit often
> fails with the stack trace pasted at the bottom.
> The failure is due to a race condition: if the current procedure throws an
> error during the last wait, the error is missed because we don't check for
> errors after the while wait loop.
> While looking at the failure, I found that the related code in
> ForeignExceptionDispatcher#receive may have an issue: although we create a new
> exception when e is null, we still pass e (null) to dispatch. [~jmhsieh], do
> you know if this is by design?
> {code}
>     if (e != null) {
>       exception = e;
>     } else {
>       exception = new ForeignException(name, "");
>     }
>     // notify all the listeners
>     dispatch(e);
> {code}
> {code}
> Test case failure stack trace:
> java.io.IOException via some op:java.io.IOException: Failed to reach controller during prepare
>       at org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:85)
>       at org.apache.hadoop.hbase.procedure.Procedure.waitForLatch(Procedure.java:371)
>       at org.apache.hadoop.hbase.procedure.Procedure.waitForCompleted(Procedure.java:343)
>       at org.apache.hadoop.hbase.procedure.TestProcedureCoordinator.testUnreachableControllerDuringCommit(TestProcedureCoordinator.java:171)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: java.io.IOException: Failed to reach controller during prepare
>       at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:212)
>       at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:68)
>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>       at java.lang.Thread.run(Thread.java:662)
> {code}
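
For readers following the race described in the issue, here is a minimal sketch of the pattern, not the attached patch. It assumes a wait loop that checks for a recorded error before each timed wait, and it adds one more check after the loop so that an error recorded during the final wait is still rethrown. ErrorMonitor, LatchWaiter and wakeMillis are hypothetical names used only for illustration; they are not the HBase classes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for an error monitor; NOT ForeignExceptionDispatcher. */
class ErrorMonitor {
    private Exception exception; // guarded by "this"

    /** Records the first reported error and ignores later ones. */
    synchronized void receive(Exception e) {
        if (exception == null) {
            exception = e;
        }
    }

    /** Rethrows the recorded error, if there is one. */
    synchronized void rethrowException() throws Exception {
        if (exception != null) {
            throw exception;
        }
    }
}

/** Hypothetical helper showing the wait loop with a final error check. */
class LatchWaiter {
    static void waitForLatch(CountDownLatch latch, ErrorMonitor monitor, long wakeMillis)
            throws Exception {
        boolean released = false;
        while (!released) {
            // Check for errors before each timed wait.
            monitor.rethrowException();
            released = latch.await(wakeMillis, TimeUnit.MILLISECONDS);
        }
        // An error may have been recorded during the final await, after the
        // in-loop check already ran. Without this last check the caller
        // would return normally even though an error exists.
        monitor.rethrowException();
    }
}
```

With only the in-loop check, a caller such as a test expecting the failure would see waitForLatch return normally whenever the error arrives during the last wait and the latch is released before the next iteration. Whether the actual fix takes exactly this shape should be confirmed against hbase-9318.patch.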