Jens Deppe created GEODE-5080:
---------------------------------
Summary: CI Failure: ClusterConfigLocatorRestartDUnitTest.serverRestartsAfterLocatorReconnects
Key: GEODE-5080
URL: https://issues.apache.org/jira/browse/GEODE-5080
Project: Geode
Issue Type: Bug
Components: gfsh, management
Reporter: Jens Deppe
This test intermittently fails with the following:
{noformat}
org.apache.geode.management.internal.configuration.ClusterConfigLocatorRestartDUnitTest > serverRestartsAfterLocatorReconnects FAILED
org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.test.dunit.rules.ClusterStartupRule$$Lambda$41/761947362.call in VM 3 running on Host b669312074c0 with 5 VMs
    at org.apache.geode.test.dunit.VM.invoke(VM.java:436)
    at org.apache.geode.test.dunit.VM.invoke(VM.java:405)
    at org.apache.geode.test.dunit.VM.invoke(VM.java:371)
    at org.apache.geode.test.dunit.rules.ClusterStartupRule.startServerVM(ClusterStartupRule.java:203)
    at org.apache.geode.test.dunit.rules.ClusterStartupRule.startServerVM(ClusterStartupRule.java:196)
    at org.apache.geode.test.dunit.rules.ClusterStartupRule.startServerVM(ClusterStartupRule.java:182)
    at org.apache.geode.management.internal.configuration.ClusterConfigLocatorRestartDUnitTest.serverRestartsAfterLocatorReconnects(ClusterConfigLocatorRestartDUnitTest.java:65)
Caused by:
org.apache.geode.GemFireConfigException: Unable to join the distributed system. Operation either timed out, was stopped or Locator does not exist.
{noformat}
The detailed test failure shows the following cause:
{noformat}
Caused by: org.apache.geode.GemFireConfigException: Unable to join the distributed system. Operation either timed out, was stopped or Locator does not exist.
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.join(GMSMembershipManager.java:661)
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.joinDistributedSystem(GMSMembershipManager.java:747)
    at org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:191)
    at org.apache.geode.distributed.internal.membership.gms.GMSMemberFactory.newMembershipManager(GMSMemberFactory.java:106)
    at org.apache.geode.distributed.internal.membership.MemberFactory.newMembershipManager(MemberFactory.java:90)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:1027)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:1061)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:554)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:763)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:355)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:343)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:335)
    at org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:211)
    at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:219)
    at org.apache.geode.test.junit.rules.ServerStarterRule.startServer(ServerStarterRule.java:172)
    at org.apache.geode.test.junit.rules.ServerStarterRule.before(ServerStarterRule.java:78)
    at org.apache.geode.test.dunit.rules.ClusterStartupRule.lambda$startServerVM$a2926408$1(ClusterStartupRule.java:212)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at hydra.MethExecutor.executeObject(MethExecutor.java:244)
    at org.apache.geode.test.dunit.standalone.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:70)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:361)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
    ... 3 more
{noformat}
The problem is that after the locator is 'crashed', the test enters a loop to
wait for the ClusterConfigurationService to restart. However, sometimes this
check happens too quickly after the crash, while the cluster configuration
(CC) service still appears to be available, so the wait can finish prematurely.
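
One possible way to close this kind of race, sketched here in plain Java rather than taken from the Geode test code, is a two-phase wait: first poll until the service is observed as unavailable, and only then poll until it is available again. The class RestartWait, its methods, and the serviceAvailable supplier are hypothetical names used for illustration; they are not Geode or DUnit APIs, and this is not necessarily how the test was eventually fixed.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not part of Geode or its test framework.
final class RestartWait {
  private RestartWait() {}

  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition was observed to hold, false on timeout or interrupt.
   */
  static boolean await(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(pollInterval.toMillis());
      } catch (InterruptedException e) {
        // Preserve the interrupt status and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  /**
   * Two-phase wait: the service must first be seen as unavailable,
   * and only afterwards as available again. Each phase gets its own timeout.
   */
  static boolean awaitRestart(BooleanSupplier serviceAvailable, Duration timeout,
      Duration pollInterval) {
    boolean wentDown = await(() -> !serviceAvailable.getAsBoolean(), timeout, pollInterval);
    return wentDown && await(serviceAvailable, timeout, pollInterval);
  }
}
```

A caller would pass a supplier that reports whether the service currently looks available. Note the trade-off: if the service goes down and comes back between two polls, the first phase never sees it as unavailable and the wait times out, so this sketch turns a premature success into a (safer) timeout rather than eliminating timing sensitivity altogether.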