Keith Wall created QPID-7185:
--------------------------------

             Summary: ReplicatedEnvironmentFacadeTest.testReplicationGroupListenerHearsNodeRemoved fails sporadically on Apache CI
                 Key: QPID-7185
                 URL: https://issues.apache.org/jira/browse/QPID-7185
             Project: Qpid
          Issue Type: Bug
          Components: Java Tests
            Reporter: Keith Wall
             Fix For: qpid-java-6.1


The test {{testReplicationGroupListenerHearsNodeRemoved}} failed in the 
following way on the Apache CI host:

{noformat}
org.apache.qpid.server.store.StoreException: Exception on node removal from group
        at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:426)
        at com.sleepycat.je.rep.util.ReplicationGroupAdmin.getException(ReplicationGroupAdmin.java:504)
        at com.sleepycat.je.rep.util.ReplicationGroupAdmin.doMessageExchange(ReplicationGroupAdmin.java:474)
        at com.sleepycat.je.rep.util.ReplicationGroupAdmin.removeMember(ReplicationGroupAdmin.java:245)
        at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.removeNodeFromGroup(ReplicatedEnvironmentFacade.java:1284)
        at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacadeTest.testReplicationGroupListenerHearsNodeRemoved(ReplicatedEnvironmentFacadeTest.java:377)
{noformat}

The underlying exception was as follows:

{noformat}
2016-04-03 23:19:00,667 ERROR [main] o.a.q.s.u.ServerScopedRuntimeException Exception on node removal from group
com.sleepycat.je.EnvironmentFailureException: (JE 5.0.104) (JE 5.0.104) Transaction -20 cannot execute write operations because this node is no longer a master UNEXPECTED_STATE: Unexpected internal state, may have side effects.
        at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:426) ~[je-5.0.104.jar:5.0.104]
        at com.sleepycat.je.rep.util.ReplicationGroupAdmin.getException(ReplicationGroupAdmin.java:504) ~[je-5.0.104.jar:5.0.104]
        at com.sleepycat.je.rep.util.ReplicationGroupAdmin.doMessageExchange(ReplicationGroupAdmin.java:474) ~[je-5.0.104.jar:5.0.104]
        at com.sleepycat.je.rep.util.ReplicationGroupAdmin.removeMember(ReplicationGroupAdmin.java:245) ~[je-5.0.104.jar:5.0.104]
        at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.removeNodeFromGroup(ReplicatedEnvironmentFacade.java:1284) ~[classes/:na]
        at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacadeTest.testReplicationGroupListenerHearsNodeRemoved(ReplicatedEnvironmentFacadeTest.java:377) [test-classes/:na]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.7.0_80]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[na:1.7.0_80]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_80]
        at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_80]
        at junit.framework.TestCase.runTest(TestCase.java:176) [junit-4.11.jar:na]
        at org.apache.qpid.test.utils.QpidTestCase.runTest(QpidTestCase.java:171) [qpid-test-utils-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
        at junit.framework.TestCase.runBare(TestCase.java:141) [junit-4.11.jar:na]
        at junit.framework.TestResult$1.protect(TestResult.java:122) [junit-4.11.jar:na]
        at junit.framework.TestResult.runProtected(TestResult.java:142) [junit-4.11.jar:na]
        at junit.framework.TestResult.run(TestResult.java:125) [junit-4.11.jar:na]
        at junit.framework.TestCase.run(TestCase.java:129) [junit-4.11.jar:na]
        at org.apache.qpid.test.utils.QpidTestCase.run(QpidTestCase.java:156) [qpid-test-utils-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
        at junit.framework.TestSuite.runTest(TestSuite.java:255) [junit-4.11.jar:na]
        at junit.framework.TestSuite.run(TestSuite.java:250) [junit-4.11.jar:na]
        at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84) [junit-4.11.jar:na]
        at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) [surefire-junit4-2.17.jar:2.17]
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) [surefire-junit4-2.17.jar:2.17]
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) [surefire-junit4-2.17.jar:2.17]
        at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200) [surefire-booter-2.17.jar:2.17]
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) [surefire-booter-2.17.jar:2.17]
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) [surefire-booter-2.17.jar:2.17]
{noformat}

The node that was the target of the {{ReplicationGroupAdmin.removeMember}} call 
was being restarted at that moment because the majority had been lost.  This 
seems to have provoked an unexpected exception from within JE.

The test is concerned with ensuring the listener fires correctly in response to 
changes in group membership.  This test can avoid the possibility of a 
mastership loss simply by setting designated primary to true.
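
To illustrate why the designated primary setting sidesteps the problem, here is a minimal toy model of the quorum rule. It is only a sketch: it assumes the test uses a two-node group, and the class and method names ({{QuorumSketch}}, {{hasMajority}}, {{canRemainMaster}}) are hypothetical and are not part of JE or Qpid.

```java
public class QuorumSketch {

    /** Simple majority: more than half of the group's nodes must be reachable. */
    static boolean hasMajority(int groupSize, int reachableNodes) {
        return reachableNodes > groupSize / 2;
    }

    /**
     * Toy rule for whether a master keeps mastership.
     * Assumption: designated primary only has an effect in a two-node group,
     * where it lets the single surviving node carry on without a majority.
     */
    static boolean canRemainMaster(int groupSize, int reachableNodes,
                                   boolean designatedPrimary) {
        if (hasMajority(groupSize, reachableNodes)) {
            return true;
        }
        return designatedPrimary && groupSize == 2 && reachableNodes == 1;
    }

    public static void main(String[] args) {
        // Two-node group, replica unreachable, no designated primary:
        // the master loses its majority and would be restarted.
        System.out.println(canRemainMaster(2, 1, false));
        // Same situation with designated primary set: mastership is retained.
        System.out.println(canRemainMaster(2, 1, true));
    }
}
```

Under this model, a master with designated primary set is never mid-restart when the removal request arrives, which is the window the failing run appears to have hit.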

As changing the consistency of a group whilst a production system is live would 
be an unusual thing to do, the chances of this manifesting in production are 
small.  If it were to happen, a node restart would be required to restore 
service.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
