[jira] [Updated] (GEODE-9204) A not serializable exception can cause a ServerConnection thread to get stuck waiting for a reply from another member
[ https://issues.apache.org/jira/browse/GEODE-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruce J Schuchardt updated GEODE-9204:
    Summary: A not serializable exception can cause a ServerConnection thread to get stuck waiting for a reply from another member (was: A not serializable object can cause a ServerConnection thread to get stuck waiting for a reply from another member)
> A not serializable exception can cause a ServerConnection thread to get stuck waiting for a reply from another member
>
> Key: GEODE-9204
> URL: https://issues.apache.org/jira/browse/GEODE-9204
> Project: Geode
> Issue Type: Bug
> Components: membership, messaging
> Reporter: Bruce J Schuchardt
> Assignee: Bruce J Schuchardt
> Priority: Major
>
> A test case that reproduces it is:
> - a client get request is received in one server and sent to another server
> - the other server uses a CacheLoader to load the value
> - the CacheLoader throws an exception containing a non-serializable object
> - the reply attempts to serialize that exception but fails with NotSerializableException
> - the original server's ServerConnection thread gets stuck waiting for a reply that will never come
> Here is a stack trace showing the NotSerializableException:
> {noformat}
> [severe 2018/03/20 14:30:27.793 PDT elgreco(85544):30177 unshared ordered uid=14 dom #1 port=53923> tid=0x5c] Uncaught exception processing partitioned.GetMessage(prid=2 (name = "/data") processorId=0; posDup=false; key=0; callback arg=null; context=identity(elgreco(client:85552:loner):53907:fce35145:client,connection=2)
> org.apache.geode.InternalGemFireException: java.io.NotSerializableException: java.lang.Object
>     at org.apache.geode.internal.tcp.DirectReplySender.putOutgoing(DirectReplySender.java:76)
>     at org.apache.geode.distributed.internal.ReplyMessage.send(ReplyMessage.java:109)
>     at org.apache.geode.internal.cache.partitioned.PartitionMessage.sendReply(PartitionMessage.java:392)
>     at org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:376)
>     at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:386)
>     at org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:449)
>     at org.apache.geode.distributed.internal.DistributionManager.scheduleIncomingMessage(DistributionManager.java:3872)
>     at org.apache.geode.distributed.internal.DistributionManager.handleIncomingDMsg(DistributionManager.java:3496)
>     at org.apache.geode.distributed.internal.DistributionManager$MyListener.messageReceived(DistributionManager.java:4693)
>     at org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager.processMessage(JGroupMembershipManager.java:2128)
>     at org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager.handleOrDeferMessage(JGroupMembershipManager.java:2037)
>     at org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager$MyDCReceiver.messageReceived(JGroupMembershipManager.java:647)
>     at org.apache.geode.distributed.internal.direct.DirectChannel.receive(DirectChannel.java:804)
>     at org.apache.geode.internal.tcp.TCPConduit.messageReceived(TCPConduit.java:835)
>     at org.apache.geode.internal.tcp.Connection.dispatchMessage(Connection.java:3932)
>     at org.apache.geode.internal.tcp.Connection.processNIOBuffer(Connection.java:3515)
>     at org.apache.geode.internal.tcp.Connection.runNioReader(Connection.java:1827)
>     at org.apache.geode.internal.tcp.Connection.run(Connection.java:1702)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.NotSerializableException: java.lang.Object
>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
>     at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>     at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>     at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>     at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>     at java.io.ObjectOutputStream.defaultWriteObject(ObjectOutputStream.java:441)
>     at java.lang.Throwable.writeObject(Throwable.java:985)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.Delegat
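The serialization failure in the reproduction steps for GEODE-9204 can be shown in isolation with plain Java serialization. This is a minimal sketch, not Geode code: `NonSerializableExceptionDemo`, `LoaderFailure`, and `trySerialize` are hypothetical names chosen for illustration. It only demonstrates why an exception that holds a `java.lang.Object` field cannot be written by `ObjectOutputStream`, which is the `NotSerializableException: java.lang.Object` seen in the trace.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class NonSerializableExceptionDemo {

  /** Hypothetical stand-in for an exception thrown by a CacheLoader. */
  static class LoaderFailure extends RuntimeException {
    // Throwable implements Serializable, but java.lang.Object does not,
    // so default serialization fails when it reaches this field.
    private final Object payload = new Object();

    LoaderFailure(String message) {
      super(message);
    }
  }

  /**
   * Attempts to serialize the throwable. Returns the name of the offending
   * class if serialization fails, or null if it succeeds.
   */
  static String trySerialize(Throwable t) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(t);
      return null;
    } catch (NotSerializableException e) {
      return e.getMessage(); // the message is the non-serializable class name
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(trySerialize(new LoaderFailure("load failed")));
    System.out.println(trySerialize(new IllegalStateException("plain")));
  }
}
```

A sender that detects this condition could, as one possible design, reply with a serializable substitute (for example the exception's class name and message) so that the waiting side still receives an answer. Whether that matches the eventual fix for GEODE-9204 is not stated in the issue text.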
[jira] [Resolved] (GEODE-9139) SSLException in starting up a Locator
[ https://issues.apache.org/jira/browse/GEODE-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruce J Schuchardt resolved GEODE-9139.
    Resolution: Fixed
> SSLException in starting up a Locator
>
> Key: GEODE-9139
> URL: https://issues.apache.org/jira/browse/GEODE-9139
> Project: Geode
> Issue Type: Bug
> Components: membership, messaging
> Reporter: Bruce J Schuchardt
> Assignee: Bruce J Schuchardt
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.15.0
>
> If you start up a locator using its host name, without a domain name, as a bind address you may get an SSLException in the form
> {noformat}
> javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No subject alternative DNS name matching hostname.domainname found
> {noformat}
> The LocatorLauncher and InternalLocator throw away the bind address string and later do a reverse lookup to find the fully qualified hostname to use in endpoint identification matching. If the locator's own TLS certificate doesn't have the fully qualified name in it as a Subject Alternative Name, the connection that the Locator makes to its own location service will fail.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
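The mismatch described in GEODE-9139 comes down to which name is compared against the certificate. The sketch below is a simplified illustration, not the JDK's or Geode's verification logic: `SanMatchDemo` and `matchesAnyDnsName` are hypothetical names, and the check is a plain case-insensitive comparison that ignores wildcards and IP entries.

```java
import java.util.Arrays;
import java.util.List;

public class SanMatchDemo {

  /**
   * Simplified endpoint-identification check: exact, case-insensitive match
   * against the certificate's DNS subject alternative names. Real TLS
   * hostname verification also handles wildcards and IP entries.
   */
  static boolean matchesAnyDnsName(String hostName, List<String> subjectAltDnsNames) {
    for (String san : subjectAltDnsNames) {
      if (san.equalsIgnoreCase(hostName)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Assumed certificate contents: only the simple host name is listed.
    List<String> certNames = Arrays.asList("hostname");
    // The bind address the user supplied matches the certificate.
    System.out.println(matchesAnyDnsName("hostname", certNames));
    // The fully qualified name found by reverse lookup does not.
    System.out.println(matchesAnyDnsName("hostname.domainname", certNames));
  }
}
```

Under these assumptions, keeping the bind address string that the user supplied, instead of replacing it with the reverse-lookup result, would let the first comparison be the one that is performed.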
[jira] [Updated] (GEODE-9204) A not serializable object can cause a ServerConnection thread to get stuck waiting for a reply from another member
[ https://issues.apache.org/jira/browse/GEODE-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruce J Schuchardt updated GEODE-9204:
    Issue Type: Bug (was: Test)
[jira] [Assigned] (GEODE-9204) A not serializable object can cause a ServerConnection thread to get stuck waiting for a reply from another member
[ https://issues.apache.org/jira/browse/GEODE-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruce J Schuchardt reassigned GEODE-9204:
    Assignee: Bruce J Schuchardt
[jira] [Created] (GEODE-9204) A not serializable object can cause a ServerConnection thread to get stuck waiting for a reply from another member
Bruce J Schuchardt created GEODE-9204:
    Summary: A not serializable object can cause a ServerConnection thread to get stuck waiting for a reply from another member
    Key: GEODE-9204
    URL: https://issues.apache.org/jira/browse/GEODE-9204
    Project: Geode
    Issue Type: Test
    Components: membership, messaging
    Reporter: Bruce J Schuchardt
A test case that reproduces it is:
- a client get request is received in one server and sent to another server
- the other server uses a CacheLoader to load the value
- the CacheLoader throws an exception containing a non-serializable object
- the reply attempts to serialize that exception but fails with NotSerializableException
- the original server's ServerConnection thread gets stuck waiting for a reply that will never come
Here is a stack trace showing the NotSerializableException:
{noformat}
[severe 2018/03/20 14:30:27.793 PDT :30177 unshared ordered uid=14 dom #1 port=53923> tid=0x5c] Uncaught exception processing partitioned.GetMessage(prid=2 (name = "/data") processorId=0; posDup=false; key=0; callback arg=null; context=identity(elgreco(client:85552:loner):53907:fce35145:client,connection=2)
org.apache.geode.InternalGemFireException: java.io.NotSerializableException: java.lang.Object
    at org.apache.geode.internal.tcp.DirectReplySender.putOutgoing(DirectReplySender.java:76)
    at org.apache.geode.distributed.internal.ReplyMessage.send(ReplyMessage.java:109)
    at org.apache.geode.internal.cache.partitioned.PartitionMessage.sendReply(PartitionMessage.java:392)
    at org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:376)
    at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:386)
    at org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:449)
    at org.apache.geode.distributed.internal.DistributionManager.scheduleIncomingMessage(DistributionManager.java:3872)
    at org.apache.geode.distributed.internal.DistributionManager.handleIncomingDMsg(DistributionManager.java:3496)
    at org.apache.geode.distributed.internal.DistributionManager$MyListener.messageReceived(DistributionManager.java:4693)
    at org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager.processMessage(JGroupMembershipManager.java:2128)
    at org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager.handleOrDeferMessage(JGroupMembershipManager.java:2037)
    at org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager$MyDCReceiver.messageReceived(JGroupMembershipManager.java:647)
    at org.apache.geode.distributed.internal.direct.DirectChannel.receive(DirectChannel.java:804)
    at org.apache.geode.internal.tcp.TCPConduit.messageReceived(TCPConduit.java:835)
    at org.apache.geode.internal.tcp.Connection.dispatchMessage(Connection.java:3932)
    at org.apache.geode.internal.tcp.Connection.processNIOBuffer(Connection.java:3515)
    at org.apache.geode.internal.tcp.Connection.runNioReader(Connection.java:1827)
    at org.apache.geode.internal.tcp.Connection.run(Connection.java:1702)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.NotSerializableException: java.lang.Object
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.defaultWriteObject(ObjectOutputStream.java:441)
    at java.lang.Throwable.writeObject(Throwable.java:985)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.defaultWriteObject(ObjectOutputStream.java:441)
    at java.lan
[jira] [Updated] (GEODE-9139) SSLException in starting up a Locator
[ https://issues.apache.org/jira/browse/GEODE-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruce J Schuchardt updated GEODE-9139:
    Fix Version/s: 1.15.0
[jira] [Updated] (GEODE-6413) CI Failure: Bind Exception during ClusterCommunicationsDUnitTest.performARollingUpgrade
[ https://issues.apache.org/jira/browse/GEODE-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruce J Schuchardt updated GEODE-6413:
    Component/s: membership
> CI Failure: Bind Exception during ClusterCommunicationsDUnitTest.performARollingUpgrade
>
> Key: GEODE-6413
> URL: https://issues.apache.org/jira/browse/GEODE-6413
> Project: Geode
> Issue Type: Bug
> Components: membership, tests
> Reporter: Benjamin P Ross
> Priority: Major
> Fix For: 1.9.0
>
> Stack Trace:
> {noformat}
> org.apache.geode.ClusterCommunicationsDUnitTest > performARollingUpgrade[SHARED_CONNECTIONS] FAILED
> org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.test.dunit.NamedRunnable.run in VM 0 running on Host c4dd6cb2c206 with 3 VMs
>     at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:579)
>     at org.apache.geode.test.dunit.VM.invoke(VM.java:393)
>     at org.apache.geode.ClusterCommunicationsDUnitTest.performARollingUpgrade(ClusterCommunicationsDUnitTest.java:214)
> Caused by:
> java.net.BindException: Failed to create server socket on c4dd6cb2c206/172.17.0.16[43969]
>     at org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:756)
>     at org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:714)
>     at org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:680)
>     at org.apache.geode.distributed.internal.tcpserver.TcpServer.initializeServerSocket(TcpServer.java:225)
>     at org.apache.geode.distributed.internal.tcpserver.TcpServer.startServerThread(TcpServer.java:215)
>     at org.apache.geode.distributed.internal.tcpserver.TcpServer.start(TcpServer.java:210)
>     at org.apache.geode.distributed.internal.InternalLocator.startTcpServer(InternalLocator.java:501)
>     at org.apache.geode.distributed.internal.InternalLocator.startPeerLocation(InternalLocator.java:557)
>     at org.apache.geode.distributed.internal.InternalLocator.startLocator(InternalLocator.java:340)
>     at org.apache.geode.distributed.Locator.startLocator(Locator.java:252)
>     at org.apache.geode.distributed.Locator.startLocatorAndDS(Locator.java:139)
>     at org.apache.geode.ClusterCommunicationsDUnitTest.lambda$null$1(ClusterCommunicationsDUnitTest.java:220)
> Caused by:
> java.net.BindException: Address already in use (Bind failed)
>     at java.net.PlainSocketImpl.socketBind(Native Method)
>     at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
>     at java.net.ServerSocket.bind(ServerSocket.java:375)
>     at org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:753)
>     ... 11 more
> {noformat}
> This test may be fixed with a longer await() timeout.
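The innermost cause in the GEODE-6413 trace, `java.net.BindException: Address already in use`, is what the JDK reports when a server socket is bound to a port that another socket still holds. The following is a minimal sketch of that condition using only `java.net`; `PortInUseDemo` and `secondBindFails` are hypothetical names, and the example says nothing about why the port was still occupied in the CI run.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

public class PortInUseDemo {

  /** Returns true if a second bind to an already-bound port fails with BindException. */
  static boolean secondBindFails() {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    // Port 0 asks the operating system for any free port.
    try (ServerSocket first = new ServerSocket(0, 50, loopback)) {
      int port = first.getLocalPort();
      // The first socket is still open, so the same address and port are taken.
      try (ServerSocket second = new ServerSocket(port, 50, loopback)) {
        return false; // not expected while the first socket is listening
      } catch (BindException e) {
        return true;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(secondBindFails());
  }
}
```

A test that restarts a locator on a fixed port can hit the same condition if the previous process has not yet released the port, which is consistent with the issue's suggestion that waiting longer may help.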
[jira] [Assigned] (GEODE-9141) Hang while shutting down a cache server due to corrupted message
[ https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruce J Schuchardt reassigned GEODE-9141:
    Assignee: Bill Burcham (was: Bruce J Schuchardt)
> Hang while shutting down a cache server due to corrupted message
>
> Key: GEODE-9141
> URL: https://issues.apache.org/jira/browse/GEODE-9141
> Project: Geode
> Issue Type: Bug
> Components: membership, messaging
> Affects Versions: 1.13.2, 1.14.0, 1.15.0
> Reporter: Bruce J Schuchardt
> Assignee: Bill Burcham
> Priority: Major
> Labels: blocks-1.14.0, blocks-1.15.0, pull-request-available
>
> We have a test that fails once in 5000 runs with a corrupted DestroyRegionMessage. It is always during CacheServer teardown when destroying a HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0xf4f654f8> (a java.util.concurrent.CountDownLatch$Sync)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>     at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>     at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>     at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>     at org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>     at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>     at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>     at org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>     at org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>     at org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>     at org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>     - locked <0xf8022800> (a org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>     at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>     - locked <0xf5f7b888> (a java.lang.Object)
>     at org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>     - locked <0xf7ef2980> (a org.apache.geode.internal.cache.CacheServerImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>     - locked <0xf5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>     - locked <0xf
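The thread dump for GEODE-9141 shows the shutdown thread in `TIMED_WAITING` inside `CountDownLatch.await`, reached from `ReplyProcessor21.basicWait`. The sketch below illustrates that waiting pattern with the JDK latch only. It is not Geode's `ReplyProcessor21`: `ReplyWaitDemo` and `waitForReply` are hypothetical names, and the bounded number of wait slices is an assumption made so the example terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ReplyWaitDemo {

  /**
   * Waits for a reply by polling the latch in timed slices.
   * Returns true if the reply arrived, false if every slice expired
   * or the thread was interrupted.
   */
  static boolean waitForReply(CountDownLatch replyLatch, long sliceMillis, int maxSlices) {
    try {
      for (int i = 0; i < maxSlices; i++) {
        // await returns true as soon as the latch count reaches zero.
        if (replyLatch.await(sliceMillis, TimeUnit.MILLISECONDS)) {
          return true;
        }
      }
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    }
  }

  public static void main(String[] args) {
    // The reply never arrives, e.g. because the request or reply was lost.
    System.out.println(waitForReply(new CountDownLatch(1), 10, 3));

    // The reply arrives: the receiver counts the latch down.
    CountDownLatch answered = new CountDownLatch(1);
    answered.countDown();
    System.out.println(waitForReply(answered, 10, 3));
  }
}
```

If the wait loop has no upper bound and the peer never answers, the caller stays parked in the timed wait, which is the state the dump captures.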
[jira] [Assigned] (GEODE-7607) Create a concurrent-startup membership test outside of geode-core
[ https://issues.apache.org/jira/browse/GEODE-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruce J Schuchardt reassigned GEODE-7607:
    Assignee: (was: Bruce J Schuchardt)
> Create a concurrent-startup membership test outside of geode-core
>
> Key: GEODE-7607
> URL: https://issues.apache.org/jira/browse/GEODE-7607
> Project: Geode
> Issue Type: Test
> Components: membership
> Reporter: Bruce J Schuchardt
> Priority: Major
>
> There is currently a test in MembershipJUnitTest that spins up a Locator and two Membership services in the same JVM. Use this to springboard a new test that concurrently starts up two Membership services, each hosting a TcpServer peer-to-peer location service.
[jira] [Assigned] (GEODE-6222) CI Failure: GemFireDeadlockDetectorDUnitTest
[ https://issues.apache.org/jira/browse/GEODE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruce J Schuchardt reassigned GEODE-6222:
    Assignee: (was: Bruce J Schuchardt)
> CI Failure: GemFireDeadlockDetectorDUnitTest
>
> Key: GEODE-6222
> URL: https://issues.apache.org/jira/browse/GEODE-6222
> Project: Geode
> Issue Type: Bug
> Components: distributed lock service
> Affects Versions: 1.9.0
> Reporter: Ken Howe
> Priority: Major
> Labels: flaky
>
> Flaky test failure in [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/247]
> {code:java}
> org.apache.geode.distributed.internal.deadlock.GemFireDeadlockDetectorDUnitTest > testDistributedDeadlockWithDLock FAILED
>     java.lang.AssertionError
>         at org.junit.Assert.fail(Assert.java:86)
>         at org.junit.Assert.assertTrue(Assert.java:41)
>         at org.junit.Assert.assertTrue(Assert.java:52)
>         at org.apache.geode.distributed.internal.deadlock.GemFireDeadlockDetectorDUnitTest.testDistributedDeadlockWithDLock(GemFireDeadlockDetectorDUnitTest.java:199)
> {code}
[jira] [Assigned] (GEODE-9135) Remove reverse DNS lookup in Connection.java for accepted connections
[ https://issues.apache.org/jira/browse/GEODE-9135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruce J Schuchardt reassigned GEODE-9135:
    Assignee: (was: Bruce J Schuchardt)
> Remove reverse DNS lookup in Connection.java for accepted connections
>
> Key: GEODE-9135
> URL: https://issues.apache.org/jira/browse/GEODE-9135
> Project: Geode
> Issue Type: Bug
> Components: membership
> Reporter: Bruce J Schuchardt
> Priority: Major
>
> Prior to the introduction of SSLEngine use in the org.apache.geode.internal.tcp package we used SSLSockets. During a handshake we would set the SNIHostName on the client side of the connection and have it validate the hostname returned by the server side of the handshake.
> When we introduced SSLEngine we changed this to set the SNIHostName on both sides. We should revert this so that it only does it on the client side.
> The server side of the connection does not have a hostname for the client side of the connection in this case and it is currently doing a reverse DNS lookup to get the name. That's a potentially expensive operation, and even then we don't know whether to use the fully qualified domain name (FQDN) or a simple host name. This matters because endpoint verification requires that the name we choose be presented in the certificate of the other server. If we choose the FQDN and the cert only has a simple host name the handshake will fail.
> SSLEngine requires a host name when it's constructed but most algorithms don't use it. Documentation mentions Kerberos possibly needing it, so we'd have to have a way for the reverse lookup to be enabled or find some other way to get the host name, like SocketCreator.getHostName()'s reverse-lookup cache.
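The change proposed in GEODE-9135, setting the SNI host name only on the client side, can be sketched with the standard `javax.net.ssl` types. This is an illustration under assumptions, not the Geode code: `SniConfigDemo`, `configure`, and `serverNameCount` are hypothetical names, the host name is a placeholder, and enabling the `"HTTPS"` endpoint identification algorithm on the client is an assumption about how hostname validation would be requested.

```java
import java.util.Collections;
import java.util.List;
import javax.net.ssl.SNIHostName;
import javax.net.ssl.SNIServerName;
import javax.net.ssl.SSLParameters;

public class SniConfigDemo {

  /**
   * Builds SSLParameters for one side of a connection. Only the client side
   * knows the peer's host name without a reverse DNS lookup, so only the
   * client sets SNI and asks for endpoint identification.
   */
  static SSLParameters configure(boolean clientMode, String peerHostName) {
    SSLParameters params = new SSLParameters();
    if (clientMode) {
      List<SNIServerName> names =
          Collections.<SNIServerName>singletonList(new SNIHostName(peerHostName));
      params.setServerNames(names);
      // Assumption: validate the server certificate against the host name.
      params.setEndpointIdentificationAlgorithm("HTTPS");
    }
    return params;
  }

  /** Number of SNI names configured; getServerNames() may return null when none are set. */
  static int serverNameCount(SSLParameters params) {
    List<SNIServerName> names = params.getServerNames();
    return names == null ? 0 : names.size();
  }

  public static void main(String[] args) {
    System.out.println(serverNameCount(configure(true, "server1.example.com")));
    System.out.println(serverNameCount(configure(false, "ignored.example.com")));
  }
}
```

The parameters would then be applied with `SSLEngine.setSSLParameters`. The server-side branch leaves the names unset, which is what removes the need for a reverse lookup on accepted connections.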
[jira] [Updated] (GEODE-9128) Remove host name look-up from JGAddress
[ https://issues.apache.org/jira/browse/GEODE-9128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt updated GEODE-9128:
--------------------------------------
    Fix Version/s: 1.14.0

> Remove host name look-up from JGAddress
> ---------------------------------------
>
>                 Key: GEODE-9128
>                 URL: https://issues.apache.org/jira/browse/GEODE-9128
>             Project: Geode
>          Issue Type: Test
>          Components: membership
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.13.3, 1.14.0, 1.15.0
>
> The method JGAddress.toString() contains a host name lookup that should be removed. It should just log the toString of its ip_addr field, not ip_addr.getHostName(). That method can cause a reverse-DNS lookup, which is needlessly expensive for a toString() operation.
> {code:java}
>   public String toString() {
>     StringBuilder sb = new StringBuilder();
>     if (ip_addr == null)
>       sb.append("<null>");
>     else {
>       sb.append(ip_addr.getHostName());
>     }
>     if (vmViewId >= 0) {
>       sb.append("<v").append(vmViewId).append('>');
>     }
>     if (SHOW_UUIDS) {
>       sb.append("(").append(toStringLong()).append(")");
>     } else if (mostSigBits == 0 && leastSigBits == 0) {
>       sb.append("(no uuid set)");
>     }
>     sb.append(":").append(port);
>     return sb.toString();
>   }
> {code}
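To illustrate the cost difference the ticket describes, here is a minimal sketch (not the JGAddress code itself) that formats an address from its literal IP. It assumes ip_addr is a java.net.InetAddress and uses getHostAddress(), which returns the textual IP without consulting DNS, in place of the ip_addr.toString() the ticket suggests; AddressText, describe, and describeRaw are hypothetical names.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration only; this is not Geode code.
class AddressText {

  /** Builds "ip:port" from the literal address; getHostAddress() performs no name lookup. */
  public static String describe(InetAddress ipAddr, int port) {
    StringBuilder sb = new StringBuilder();
    if (ipAddr == null) {
      sb.append("<null>");
    } else {
      sb.append(ipAddr.getHostAddress());
    }
    sb.append(':').append(port);
    return sb.toString();
  }

  /** Convenience wrapper for raw address bytes; getByAddress() only checks the array length. */
  public static String describeRaw(byte[] rawIp, int port) {
    try {
      return describe(InetAddress.getByAddress(rawIp), port);
    } catch (UnknownHostException e) {
      throw new IllegalArgumentException("address must be 4 or 16 bytes long", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(describeRaw(new byte[] {10, 0, 0, 1}, 7800)); // prints 10.0.0.1:7800
  }
}
```

By contrast, calling getHostName() on an address that was created from raw bytes can trigger a reverse lookup, which is the cost the ticket wants to keep out of toString().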
[jira] [Closed] (GEODE-9145) update CODEOWNERS
[ https://issues.apache.org/jira/browse/GEODE-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt closed GEODE-9145.
-------------------------------------

> update CODEOWNERS
> -----------------
>
>                 Key: GEODE-9145
>                 URL: https://issues.apache.org/jira/browse/GEODE-9145
>             Project: Geode
>          Issue Type: Task
>          Components: membership
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0
>
> remove bschuchardt from CODEOWNERS
[jira] [Resolved] (GEODE-9145) update CODEOWNERS
[ https://issues.apache.org/jira/browse/GEODE-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt resolved GEODE-9145.
---------------------------------------
    Fix Version/s: 1.15.0
       Resolution: Fixed

> update CODEOWNERS
> -----------------
>
>                 Key: GEODE-9145
>                 URL: https://issues.apache.org/jira/browse/GEODE-9145
>             Project: Geode
>          Issue Type: Task
>          Components: membership
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0
>
> remove bschuchardt from CODEOWNERS
[jira] [Updated] (GEODE-9128) Remove host name look-up from JGAddress
[ https://issues.apache.org/jira/browse/GEODE-9128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt updated GEODE-9128:
--------------------------------------
    Fix Version/s: 1.13.3

> Remove host name look-up from JGAddress
> ---------------------------------------
>
>                 Key: GEODE-9128
>                 URL: https://issues.apache.org/jira/browse/GEODE-9128
>             Project: Geode
>          Issue Type: Test
>          Components: membership
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.13.3, 1.15.0
>
> The method JGAddress.toString() contains a host name lookup that should be removed. It should just log the toString of its ip_addr field, not ip_addr.getHostName(). That method can cause a reverse-DNS lookup, which is needlessly expensive for a toString() operation.
> {code:java}
>   public String toString() {
>     StringBuilder sb = new StringBuilder();
>     if (ip_addr == null)
>       sb.append("<null>");
>     else {
>       sb.append(ip_addr.getHostName());
>     }
>     if (vmViewId >= 0) {
>       sb.append("<v").append(vmViewId).append('>');
>     }
>     if (SHOW_UUIDS) {
>       sb.append("(").append(toStringLong()).append(")");
>     } else if (mostSigBits == 0 && leastSigBits == 0) {
>       sb.append("(no uuid set)");
>     }
>     sb.append(":").append(port);
>     return sb.toString();
>   }
> {code}
[jira] [Resolved] (GEODE-9128) Remove host name look-up from JGAddress
[ https://issues.apache.org/jira/browse/GEODE-9128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt resolved GEODE-9128.
---------------------------------------
    Fix Version/s: 1.15.0
       Resolution: Fixed

PR for backport to 1.14 is up so I'm closing this ticket.

> Remove host name look-up from JGAddress
> ---------------------------------------
>
>                 Key: GEODE-9128
>                 URL: https://issues.apache.org/jira/browse/GEODE-9128
>             Project: Geode
>          Issue Type: Test
>          Components: membership
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0
>
> The method JGAddress.toString() contains a host name lookup that should be removed. It should just log the toString of its ip_addr field, not ip_addr.getHostName(). That method can cause a reverse-DNS lookup, which is needlessly expensive for a toString() operation.
> {code:java}
>   public String toString() {
>     StringBuilder sb = new StringBuilder();
>     if (ip_addr == null)
>       sb.append("<null>");
>     else {
>       sb.append(ip_addr.getHostName());
>     }
>     if (vmViewId >= 0) {
>       sb.append("<v").append(vmViewId).append('>');
>     }
>     if (SHOW_UUIDS) {
>       sb.append("(").append(toStringLong()).append(")");
>     } else if (mostSigBits == 0 && leastSigBits == 0) {
>       sb.append("(no uuid set)");
>     }
>     sb.append(":").append(port);
>     return sb.toString();
>   }
> {code}
[jira] [Assigned] (GEODE-9128) Remove host name look-up from JGAddress
[ https://issues.apache.org/jira/browse/GEODE-9128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt reassigned GEODE-9128:
-----------------------------------------
    Assignee: Bruce J Schuchardt

> Remove host name look-up from JGAddress
> ---------------------------------------
>
>                 Key: GEODE-9128
>                 URL: https://issues.apache.org/jira/browse/GEODE-9128
>             Project: Geode
>          Issue Type: Test
>          Components: membership
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>              Labels: pull-request-available
>
> The method JGAddress.toString() contains a host name lookup that should be removed. It should just log the toString of its ip_addr field, not ip_addr.getHostName(). That method can cause a reverse-DNS lookup, which is needlessly expensive for a toString() operation.
> {code:java}
>   public String toString() {
>     StringBuilder sb = new StringBuilder();
>     if (ip_addr == null)
>       sb.append("<null>");
>     else {
>       sb.append(ip_addr.getHostName());
>     }
>     if (vmViewId >= 0) {
>       sb.append("<v").append(vmViewId).append('>');
>     }
>     if (SHOW_UUIDS) {
>       sb.append("(").append(toStringLong()).append(")");
>     } else if (mostSigBits == 0 && leastSigBits == 0) {
>       sb.append("(no uuid set)");
>     }
>     sb.append(":").append(port);
>     return sb.toString();
>   }
> {code}
[jira] [Updated] (GEODE-9145) update CODEOWNERS
[ https://issues.apache.org/jira/browse/GEODE-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt updated GEODE-9145:
--------------------------------------
    Component/s: membership

> update CODEOWNERS
> -----------------
>
>                 Key: GEODE-9145
>                 URL: https://issues.apache.org/jira/browse/GEODE-9145
>             Project: Geode
>          Issue Type: Task
>          Components: membership
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>
> remove bschuchardt from CODEOWNERS
[jira] [Updated] (GEODE-9145) update CODEOWNERS
[ https://issues.apache.org/jira/browse/GEODE-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt updated GEODE-9145:
--------------------------------------
    Issue Type: Task  (was: Test)

> update CODEOWNERS
> -----------------
>
>                 Key: GEODE-9145
>                 URL: https://issues.apache.org/jira/browse/GEODE-9145
>             Project: Geode
>          Issue Type: Task
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>
> remove bschuchardt from CODEOWNERS
[jira] [Created] (GEODE-9145) update CODEOWNERS
Bruce J Schuchardt created GEODE-9145:
-----------------------------------------

             Summary: update CODEOWNERS
                 Key: GEODE-9145
                 URL: https://issues.apache.org/jira/browse/GEODE-9145
             Project: Geode
          Issue Type: Test
            Reporter: Bruce J Schuchardt

remove bschuchardt from CODEOWNERS
[jira] [Assigned] (GEODE-9145) update CODEOWNERS
[ https://issues.apache.org/jira/browse/GEODE-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt reassigned GEODE-9145:
-----------------------------------------
    Assignee: Bruce J Schuchardt

> update CODEOWNERS
> -----------------
>
>                 Key: GEODE-9145
>                 URL: https://issues.apache.org/jira/browse/GEODE-9145
>             Project: Geode
>          Issue Type: Test
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>
> remove bschuchardt from CODEOWNERS
[jira] [Reopened] (GEODE-5940) ServerLauncherRemoteIntegrationTest times out waiting for server to start
[ https://issues.apache.org/jira/browse/GEODE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt reopened GEODE-5940:
---------------------------------------

This problem has returned in this run: https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/WindowsCoreIntegrationTestOpenJDK8/builds/141

> ServerLauncherRemoteIntegrationTest times out waiting for server to start
> -------------------------------------------------------------------------
>
>                 Key: GEODE-5940
>                 URL: https://issues.apache.org/jira/browse/GEODE-5940
>             Project: Geode
>          Issue Type: Test
>            Reporter: Dale Emery
>            Assignee: Kirk Lund
>            Priority: Major
>              Labels: swat
>             Fix For: 1.11.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.8.0-build.50/test-results/integrationTest/1540492907/classes/org.apache.geode.distributed.ServerLauncherRemoteIntegrationTest.html#startOverwritesStalePidFile
> {noformat}
> org.awaitility.core.ConditionTimeoutException: Assertion condition defined as a lambda expression in org.apache.geode.distributed.ServerLauncherRemoteIntegrationTestCase that uses org.apache.geode.distributed.ServerLauncher expected:<[online]> but was:<[not responding]> within 300 seconds.
>     at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:145)
>     at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:122)
>     at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:32)
>     at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:890)
>     at org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:711)
>     at org.apache.geode.distributed.ServerLauncherRemoteIntegrationTestCase.awaitStart(ServerLauncherRemoteIntegrationTestCase.java:200)
>     at org.apache.geode.distributed.ServerLauncherRemoteIntegrationTestCase.awaitStart(ServerLauncherRemoteIntegrationTestCase.java:178)
>     at org.apache.geode.distributed.ServerLauncherRemoteIntegrationTestCase.awaitStart(ServerLauncherRemoteIntegrationTestCase.java:189)
>     at org.apache.geode.distributed.ServerLauncherRemoteIntegrationTestCase.startServer(ServerLauncherRemoteIntegrationTestCase.java:128)
>     at org.apache.geode.distributed.ServerLauncherRemoteIntegrationTestCase.startServer(ServerLauncherRemoteIntegrationTestCase.java:124)
>     at org.apache.geode.distributed.ServerLauncherRemoteIntegrationTest.startOverwritesStalePidFile(ServerLauncherRemoteIntegrationTest.java:91)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.Verifier$1.evaluate(Verifier.java:35)
>     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>     at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>     at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>     at org.gradle.api.intern
[jira] [Updated] (GEODE-9141) Hang while shutting down a cache server due to corrupted message
[ https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt updated GEODE-9141:
--------------------------------------
    Labels: blocks-1.14.0 blocks-1.15.0  (was: blocks-1.15.0)

> Hang while shutting down a cache server due to corrupted message
> ----------------------------------------------------------------
>
>                 Key: GEODE-9141
>                 URL: https://issues.apache.org/jira/browse/GEODE-9141
>             Project: Geode
>          Issue Type: Bug
>          Components: membership, messaging
>    Affects Versions: 1.13.2, 1.14.0, 1.15.0
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>              Labels: blocks-1.14.0, blocks-1.15.0
>
> We have a test that fails once in 5000 runs with a corrupted DestroyRegionMessage. It is always during CacheServer teardown when destroying a HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0xf4f654f8> (a java.util.concurrent.CountDownLatch$Sync)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>     at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>     at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>     at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>     at org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>     at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>     at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>     at org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>     at org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>     at org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>     at org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>     - locked <0xf8022800> (a org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>     at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>     - locked <0xf5f7b888> (a java.lang.Object)
>     at org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>     - locked <0xf7ef2980> (a org.apache.geode.internal.cache.CacheServerImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>     - locked <0xf5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>     - locked <0xf5a21a08> (a ja
[jira] [Updated] (GEODE-9141) Hang while shutting down a cache server due to corrupted message
[ https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt updated GEODE-9141:
--------------------------------------
    Labels: blocks-1.15.0  (was: )

> Hang while shutting down a cache server due to corrupted message
> ----------------------------------------------------------------
>
>                 Key: GEODE-9141
>                 URL: https://issues.apache.org/jira/browse/GEODE-9141
>             Project: Geode
>          Issue Type: Bug
>          Components: membership, messaging
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>              Labels: blocks-1.15.0
>
> We have a test that fails once in 5000 runs with a corrupted DestroyRegionMessage. It is always during CacheServer teardown when destroying a HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0xf4f654f8> (a java.util.concurrent.CountDownLatch$Sync)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>     at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>     at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>     at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>     at org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>     at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>     at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>     at org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>     at org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>     at org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>     at org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>     - locked <0xf8022800> (a org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>     at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>     - locked <0xf5f7b888> (a java.lang.Object)
>     at org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>     - locked <0xf7ef2980> (a org.apache.geode.internal.cache.CacheServerImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>     - locked <0xf5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>     - locked <0xf5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>     at org.ap
[jira] [Updated] (GEODE-9141) Hang while shutting down a cache server due to corrupted message
[ https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt updated GEODE-9141:
--------------------------------------
    Affects Version/s: 1.15.0
                       1.14.0
                       1.13.2

> Hang while shutting down a cache server due to corrupted message
> ----------------------------------------------------------------
>
>                 Key: GEODE-9141
>                 URL: https://issues.apache.org/jira/browse/GEODE-9141
>             Project: Geode
>          Issue Type: Bug
>          Components: membership, messaging
>    Affects Versions: 1.13.2, 1.14.0, 1.15.0
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>              Labels: blocks-1.15.0
>
> We have a test that fails once in 5000 runs with a corrupted DestroyRegionMessage. It is always during CacheServer teardown when destroying a HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0xf4f654f8> (a java.util.concurrent.CountDownLatch$Sync)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>     at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>     at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>     at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>     at org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>     at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>     at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>     at org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>     at org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>     at org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>     at org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>     - locked <0xf8022800> (a org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>     at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>     - locked <0xf5f7b888> (a java.lang.Object)
>     at org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>     - locked <0xf7ef2980> (a org.apache.geode.internal.cache.CacheServerImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>     - locked <0xf5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>     - locked <0xf5a21a
[jira] [Updated] (GEODE-9141) Hang while shutting down a cache server due to corrupted message
[ https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt updated GEODE-9141:
--------------------------------------
    Issue Type: Bug  (was: Test)

> Hang while shutting down a cache server due to corrupted message
> ----------------------------------------------------------------
>
>                 Key: GEODE-9141
>                 URL: https://issues.apache.org/jira/browse/GEODE-9141
>             Project: Geode
>          Issue Type: Bug
>          Components: membership, messaging
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>
> We have a test that fails once in 5000 runs with a corrupted DestroyRegionMessage. It is always during CacheServer teardown when destroying a HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0xf4f654f8> (a java.util.concurrent.CountDownLatch$Sync)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>     at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>     at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>     at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>     at org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>     at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>     at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>     at org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>     at org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>     at org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>     at org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>     - locked <0xf8022800> (a org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>     at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>     - locked <0xf5f7b888> (a java.lang.Object)
>     at org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>     - locked <0xf7ef2980> (a org.apache.geode.internal.cache.CacheServerImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>     - locked <0xf5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>     - locked <0xf5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>     at org.apache.geode.distributed.internal.InternalD
[jira] [Assigned] (GEODE-9141) Hang while shutting down a cache server due to corrupted message
[ https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt reassigned GEODE-9141:
-----------------------------------------
    Assignee: Bruce J Schuchardt

> Hang while shutting down a cache server due to corrupted message
> ----------------------------------------------------------------
>
>                 Key: GEODE-9141
>                 URL: https://issues.apache.org/jira/browse/GEODE-9141
>             Project: Geode
>          Issue Type: Test
>          Components: membership, messaging
>            Reporter: Bruce J Schuchardt
>            Assignee: Bruce J Schuchardt
>            Priority: Major
>
> We have a test that fails once in 5000 runs with a corrupted DestroyRegionMessage. It is always during CacheServer teardown when destroying a HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0xf4f654f8> (a java.util.concurrent.CountDownLatch$Sync)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>     at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>     at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>     at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>     at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>     at org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>     at org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>     at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>     at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>     at org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>     at org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>     at org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>     at org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>     at org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>     - locked <0xf8022800> (a org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>     at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>     - locked <0xf5f7b888> (a java.lang.Object)
>     at org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>     - locked <0xf7ef2980> (a org.apache.geode.internal.cache.CacheServerImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>     - locked <0xf5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>     - locked <0xf5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>     at org.apache.geode.distributed.internal.I
[jira] [Created] (GEODE-9141) Hang while shutting down a cache server due to corrupted message
Bruce J Schuchardt created GEODE-9141: - Summary: Hang while shutting down a cache server due to corrupted message Key: GEODE-9141 URL: https://issues.apache.org/jira/browse/GEODE-9141 Project: Geode Issue Type: Test Components: membership, messaging Reporter: Bruce J Schuchardt We have a test that fails once in 5000 runs with a corrupted DestroyRegionMessage. It is always during CacheServer teardown when destroying a HARegionQueue Region. {noformat} "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xf4f654f8> (a java.util.concurrent.CountDownLatch$Sync) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72) at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723) at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794) at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771) at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857) at org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779) at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676) at 
org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277) at org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318) at org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865) at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844) at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180) at org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331) at org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476) at org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438) at org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272) at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031) at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939) at org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306) - locked <0xf8022800> (a org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier) at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630) - locked <0xf5f7b888> (a java.lang.Object) at org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491) - locked <0xf7ef2980> (a org.apache.geode.internal.cache.CacheServerImpl) at org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672) at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263) - locked <0xf5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl) at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151) at 
org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559) - locked <0xf5a21a08> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl) at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1257) at hydra.RemoteTestModule$2.run(RemoteTestModule.java:388) {noformat} Another server logs this corrupted message. It is almost always the same corruption. When it's not we see the message header messed up, not a bad DSFID. {noformat} [fatal 2021/03/06 09:45:02.796 PST bridgegemfire_1_3
[jira] [Updated] (GEODE-9139) SSLException in starting up a Locator
[ https://issues.apache.org/jira/browse/GEODE-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-9139: -- Issue Type: Bug (was: Test) > SSLException in starting up a Locator > - > > Key: GEODE-9139 > URL: https://issues.apache.org/jira/browse/GEODE-9139 > Project: Geode > Issue Type: Bug > Components: membership, messaging > Reporter: Bruce J Schuchardt > Assignee: Bruce J Schuchardt > Priority: Major > > If you start up a locator using its host name, without a domain name, as a > bind address, you may get an SSLException of the form > {noformat} > javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: > No subject alternative DNS name matching hostname.domainname found > {noformat} > The LocatorLauncher and InternalLocator throw away the bind address string > and later do a reverse lookup to find the fully qualified hostname to use in > endpoint identification matching. If the locator's own TLS certificate > doesn't have the fully qualified name in it as a Subject Alternative Name, the > connection that the Locator makes to its own location service will fail. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (GEODE-9139) SSLException in starting up a Locator
[ https://issues.apache.org/jira/browse/GEODE-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt reassigned GEODE-9139: - Assignee: Bruce J Schuchardt > SSLException in starting up a Locator > - > > Key: GEODE-9139 > URL: https://issues.apache.org/jira/browse/GEODE-9139 > Project: Geode > Issue Type: Test > Components: membership, messaging > Reporter: Bruce J Schuchardt > Assignee: Bruce J Schuchardt > Priority: Major > > If you start up a locator using its host name, without a domain name, as a > bind address, you may get an SSLException of the form > {noformat} > javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: > No subject alternative DNS name matching hostname.domainname found > {noformat} > The LocatorLauncher and InternalLocator throw away the bind address string > and later do a reverse lookup to find the fully qualified hostname to use in > endpoint identification matching. If the locator's own TLS certificate > doesn't have the fully qualified name in it as a Subject Alternative Name, the > connection that the Locator makes to its own location service will fail.
[jira] [Created] (GEODE-9139) SSLException in starting up a Locator
Bruce J Schuchardt created GEODE-9139: - Summary: SSLException in starting up a Locator Key: GEODE-9139 URL: https://issues.apache.org/jira/browse/GEODE-9139 Project: Geode Issue Type: Test Components: membership, messaging Reporter: Bruce J Schuchardt If you start up a locator using its host name, without a domain name, as a bind address, you may get an SSLException of the form {noformat} javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No subject alternative DNS name matching hostname.domainname found {noformat} The LocatorLauncher and InternalLocator throw away the bind address string and later do a reverse lookup to find the fully qualified hostname to use in endpoint identification matching. If the locator's own TLS certificate doesn't have the fully qualified name in it as a Subject Alternative Name, the connection that the Locator makes to its own location service will fail.
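The mismatch described in GEODE-9139 can be illustrated with a deliberately simplified sketch. It is not Geode code and not the JDK's endpoint identification logic (which also handles wildcards and IP entries); the class name SanMatchSketch, the method matchesAnyDnsSan, and the example certificate contents are all hypothetical. It only shows why a certificate listing the short name fails once the name being checked has become the fully qualified one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: exact, case-insensitive matching of a host
// name against a certificate's DNS Subject Alternative Name entries.
class SanMatchSketch {

  /** Returns true if host equals one of the DNS SAN entries (no wildcard support). */
  static boolean matchesAnyDnsSan(String host, List<String> dnsSans) {
    String wanted = host.toLowerCase(Locale.ROOT);
    for (String san : dnsSans) {
      if (san.toLowerCase(Locale.ROOT).equals(wanted)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Assumed certificate contents: only the short host name is listed.
    List<String> dnsSans = Arrays.asList("hostname");

    // The bind address the operator supplied matches the certificate.
    System.out.println(matchesAnyDnsSan("hostname", dnsSans));

    // The name produced by a reverse lookup (fully qualified) does not.
    System.out.println(matchesAnyDnsSan("hostname.domainname", dnsSans));
  }
}
```

Under these assumptions, keeping the originally supplied bind address string for endpoint identification would avoid the failure without requiring a new certificate.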
[jira] [Updated] (GEODE-9135) Remove reverse DNS lookup in Connection.java for accepted connections
[ https://issues.apache.org/jira/browse/GEODE-9135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-9135: -- Issue Type: Bug (was: Test) > Remove reverse DNS lookup in Connection.java for accepted connections > - > > Key: GEODE-9135 > URL: https://issues.apache.org/jira/browse/GEODE-9135 > Project: Geode > Issue Type: Bug > Components: membership >Reporter: Bruce J Schuchardt >Assignee: Bruce J Schuchardt >Priority: Major > > Prior to the introduction of SSLEngine use in the > org.apache.geode.internal.tcp package we used SSLSockets. During a handshake > we would set the SNIHostName on the client side of the connection and have it > validate the hostname returned by the server side of the handshake. > When we introduced SSLEngine we changed this to set the SNIHostName on both > sides. We should revert this so that it only does it on the client side. > The server side of the connection does not have a hostname for the client > side of the connection in this case and it is currently doing a reverse DNS > lookup to get the name. That's a potentially expensive operation, and even > then we don't know whether to use the fully qualified domain name (FQDN) or a > simple host name. This matters because endpoint verification requires that > the name we choose be presented in the certificate of the other server. If > we choose the FQDN and the cert only has a simple host name the handshake > will fail. > SSLEngine requires a host name when it's constructed but most algorithms > don't use it. Documentation mentions Kerberos possibly needing it, so we'd > have to have a way for the reverse lookup to be enabled or find some other > way to get the host name, like SocketCreator.getHostName()'s reverse-lookup > cache.
[jira] [Assigned] (GEODE-9135) Remove reverse DNS lookup in Connection.java for accepted connections
[ https://issues.apache.org/jira/browse/GEODE-9135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt reassigned GEODE-9135: - Assignee: Bruce J Schuchardt > Remove reverse DNS lookup in Connection.java for accepted connections > - > > Key: GEODE-9135 > URL: https://issues.apache.org/jira/browse/GEODE-9135 > Project: Geode > Issue Type: Test > Components: membership >Reporter: Bruce J Schuchardt >Assignee: Bruce J Schuchardt >Priority: Major > > Prior to the introduction of SSLEngine use in the > org.apache.geode.internal.tcp package we used SSLSockets. During a handshake > we would set the SNIHostName on the client side of the connection and have it > validate the hostname returned by the server side of the handshake. > When we introduced SSLEngine we changed this to set the SNIHostName on both > sides. We should revert this so that it only does it on the client side. > The server side of the connection does not have a hostname for the client > side of the connection in this case and it is currently doing a reverse DNS > lookup to get the name. That's a potentially expensive operation, and even > then we don't know whether to use the fully qualified domain name (FQDN) or a > simple host name. This matters because endpoint verification requires that > the name we choose be presented in the certificate of the other server. If > we choose the FQDN and the cert only has a simple host name the handshake > will fail. > SSLEngine requires a host name when it's constructed but most algorithms > don't use it. Documentation mentions Kerberos possibly needing it, so we'd > have to have a way for the reverse lookup to be enabled or find some other > way to get the host name, like SocketCreator.getHostName()'s reverse-lookup > cache.
[jira] [Created] (GEODE-9135) Remove reverse DNS lookup in Connection.java for accepted connections
Bruce J Schuchardt created GEODE-9135: - Summary: Remove reverse DNS lookup in Connection.java for accepted connections Key: GEODE-9135 URL: https://issues.apache.org/jira/browse/GEODE-9135 Project: Geode Issue Type: Test Components: membership Reporter: Bruce J Schuchardt Prior to the introduction of SSLEngine use in the org.apache.geode.internal.tcp package we used SSLSockets. During a handshake we would set the SNIHostName on the client side of the connection and have it validate the hostname returned by the server side of the handshake. When we introduced SSLEngine we changed this to set the SNIHostName on both sides. We should revert this so that it only does it on the client side. The server side of the connection does not have a hostname for the client side of the connection in this case and it is currently doing a reverse DNS lookup to get the name. That's a potentially expensive operation, and even then we don't know whether to use the fully qualified domain name (FQDN) or a simple host name. This matters because endpoint verification requires that the name we choose be presented in the certificate of the other server. If we choose the FQDN and the cert only has a simple host name the handshake will fail. SSLEngine requires a host name when it's constructed but most algorithms don't use it. Documentation mentions Kerberos possibly needing it, so we'd have to have a way for the reverse lookup to be enabled or find some other way to get the host name, like SocketCreator.getHostName()'s reverse-lookup cache.
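A minimal sketch of the "client side only" behaviour proposed in GEODE-9135, using only standard javax.net.ssl types. The class ClientOnlySniSketch and the method applyPeerIdentity are hypothetical names, and the use of the "HTTPS" endpoint identification algorithm is an assumption for illustration, not a statement of what Geode's Connection.java does. The resulting parameters would be handed to an engine with SSLEngine.setSSLParameters.

```java
import java.util.Collections;
import javax.net.ssl.SNIHostName;
import javax.net.ssl.SNIServerName;
import javax.net.ssl.SSLParameters;

// Hypothetical illustration only: set SNI and hostname verification for the
// connecting (client) side, and leave the accepting (server) side untouched so
// that no peer host name, and therefore no reverse DNS lookup, is needed.
class ClientOnlySniSketch {

  static SSLParameters applyPeerIdentity(SSLParameters params, boolean clientMode,
      String peerHostName) {
    if (clientMode) {
      // The connecting side already knows which host it dialled.
      params.setServerNames(
          Collections.<SNIServerName>singletonList(new SNIHostName(peerHostName)));
      // Ask the TLS layer to check the peer certificate against that name.
      params.setEndpointIdentificationAlgorithm("HTTPS");
    }
    // Accepting side: nothing is set, so peerHostName may be null here.
    return params;
  }

  public static void main(String[] args) {
    SSLParameters client =
        applyPeerIdentity(new SSLParameters(), true, "server.example.com");
    SSLParameters server = applyPeerIdentity(new SSLParameters(), false, null);

    System.out.println("client SNI names: " + client.getServerNames());
    System.out.println("server SNI names: " + server.getServerNames());
  }
}
```

The host name "server.example.com" is a placeholder. The point of the design is that only the side that initiated the connection has a trustworthy name for its peer, so only that side needs to present or verify one.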
[jira] [Created] (GEODE-9128) Remove host name look-up from JGAddress
Bruce J Schuchardt created GEODE-9128: - Summary: Remove host name look-up from JGAddress Key: GEODE-9128 URL: https://issues.apache.org/jira/browse/GEODE-9128 Project: Geode Issue Type: Test Components: membership Reporter: Bruce J Schuchardt The method JGAddress.toString() contains a host name lookup that should be removed. It should just log the toString of its ip_addr field, not ip_addr.getHostName(). That method can cause a reverse-DNS lookup, which is needlessly expensive for a toString() operation.
{code:java}
public String toString() {
  StringBuilder sb = new StringBuilder();
  if (ip_addr == null) {
    sb.append("<null>");
  } else {
    // getHostName() may trigger the reverse-DNS lookup
    sb.append(ip_addr.getHostName());
  }
  if (vmViewId >= 0) {
    sb.append("<v").append(vmViewId).append('>');
  }
  if (SHOW_UUIDS) {
    sb.append("(").append(toStringLong()).append(")");
  } else if (mostSigBits == 0 && leastSigBits == 0) {
    sb.append("(no uuid set)");
  }
  sb.append(":").append(port);
  return sb.toString();
}
{code}
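As a rough sketch of the lookup-free alternative GEODE-9128 asks for: InetAddress.getHostAddress() returns the textual IP literal, and InetAddress.toString() does not perform a reverse lookup for an unresolved address, whereas getHostName() may. The class AddressToStringSketch and its methods are hypothetical, the "<null>" placeholder is an assumption, and the choice of getHostAddress() over ip_addr.toString() is made here only to get a cleaner string. This is not the actual JGAddress change.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical illustration only: format an address and port without ever
// calling getHostName(), so no name service is consulted.
class AddressToStringSketch {

  static String describe(InetAddress ipAddr, int port) {
    // getHostAddress() only formats the raw address bytes.
    String host = (ipAddr == null) ? "<null>" : ipAddr.getHostAddress();
    return host + ":" + port;
  }

  /** Builds an IPv4 address from raw octets; no name lookup is involved. */
  static InetAddress literal(int a, int b, int c, int d) {
    try {
      return InetAddress.getByAddress(new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
    } catch (UnknownHostException e) {
      // Only thrown for an array of illegal length, which cannot happen here.
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    InetAddress addr = literal(10, 0, 0, 1);
    System.out.println(describe(addr, 41000)); // 10.0.0.1:41000
    System.out.println(addr);                  // /10.0.0.1 (empty host part, no lookup)
  }
}
```

The address 10.0.0.1 and port 41000 are placeholder values.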
[jira] [Resolved] (GEODE-8997) remove protobuf client server code
[ https://issues.apache.org/jira/browse/GEODE-8997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt resolved GEODE-8997. --- Fix Version/s: 1.15.0 Resolution: Fixed > remove protobuf client server code > -- > > Key: GEODE-8997 > URL: https://issues.apache.org/jira/browse/GEODE-8997 > Project: Geode > Issue Type: Improvement > Components: client/server >Reporter: Darrel Schneider >Assignee: Bruce J Schuchardt >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > The protobuf based client/server project is essentially dead but code for it > is still part of geode. > This complicates the implementation. For example I was working on an > improvement to have the thread monitor detect stuck server connection threads > and found myself trying to figure out how to make this work for > ProtobufServerConnection. > I think it would be best to remove the dead protobuf code. I'm not sure what > all of it is but here is what I have found so far: > ProtobufServerConnection > package org.apache.geode.internal.cache.client.protocol > package org.apache.geode.internal.protocol.protobuf.v1
[jira] [Assigned] (GEODE-8997) remove protobuf client server code
[ https://issues.apache.org/jira/browse/GEODE-8997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt reassigned GEODE-8997: - Assignee: Bruce J Schuchardt (was: Bill Burcham) > remove protobuf client server code > -- > > Key: GEODE-8997 > URL: https://issues.apache.org/jira/browse/GEODE-8997 > Project: Geode > Issue Type: Improvement > Components: client/server >Reporter: Darrel Schneider >Assignee: Bruce J Schuchardt >Priority: Major > > The protobuf based client/server project is essentially dead but code for it > is still part of geode. > This complicates the implementation. For example I was working on an > improvement to have the thread monitor detect stuck server connection threads > and found myself trying to figure out how to make this work for > ProtobufServerConnection. > I think it would be best to remove the dead protobuf code. I'm not sure what > all of it is but here is what I have found so far: > ProtobufServerConnection > package org.apache.geode.internal.cache.client.protocol > package org.apache.geode.internal.protocol.protobuf.v1
[jira] [Updated] (GEODE-9011) (deleted)
[ https://issues.apache.org/jira/browse/GEODE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-9011: -- Priority: Trivial (was: Major) > (deleted) > - > > Key: GEODE-9011 > URL: https://issues.apache.org/jira/browse/GEODE-9011 > Project: Geode > Issue Type: Test >Reporter: Bruce J Schuchardt >Priority: Trivial > > submitted to the wrong JIRA - sorry
[jira] [Updated] (GEODE-9011) (deleted)
[ https://issues.apache.org/jira/browse/GEODE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-9011: -- Component/s: (was: messaging) (was: membership) > (deleted) > - > > Key: GEODE-9011 > URL: https://issues.apache.org/jira/browse/GEODE-9011 > Project: Geode > Issue Type: Test >Reporter: Bruce J Schuchardt >Priority: Major > > submitted to the wrong JIRA - sorry
[jira] [Updated] (GEODE-9011) (deleted)
[ https://issues.apache.org/jira/browse/GEODE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-9011: -- Summary: (deleted) (was: hctKill.conf Error deserializing message causes hang) > (deleted) > - > > Key: GEODE-9011 > URL: https://issues.apache.org/jira/browse/GEODE-9011 > Project: Geode > Issue Type: Test > Components: membership, messaging >Reporter: Bruce J Schuchardt >Priority: Major > > A test was reported hung when it tried to shut down. One server reported > this: > {noformat} > [warn 2021/03/06 09:45:18.783 PST bridgegemfire_1_1_host1_6920 > tid=0x90] 15 seconds have elapsed while > waiting for replies: 66 waiting for 2 replies from > [rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006, > > rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005]> > on > rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007 > whose current membership list is: > [[rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_2_host1_7658:7658):41004, > > rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_2_host1_13486:13486:locator):41003, > > rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006, > > rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005, > > rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007, > > rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_1_host1_13950:13950:locator):41000]] > {noformat} > and was stuck waiting for a reply in thread dumps > {noformat} > "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 > tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xf4f654f8> (a > 
java.util.concurrent.CountDownLatch$Sync) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) > at > org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72) > at > org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723) > at > org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794) > at > org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771) > at > org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857) > at > org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779) > at > org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676) > at > org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277) > at > org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318) > at > org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865) > at > org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844) > at > org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180) > at > org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331) > at > org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476) > at > 
org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438) > at > org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272) > at > org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031) > at > org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939) > at > org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306) > - locked <0xf8022800> (a > org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier) > at > org.apache.geode.internal.cache.tier.sock
[jira] [Resolved] (GEODE-9011) hctKill.conf Error deserializing message causes hang
[ https://issues.apache.org/jira/browse/GEODE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt resolved GEODE-9011. --- Resolution: Invalid > hctKill.conf Error deserializing message causes hang > > > Key: GEODE-9011 > URL: https://issues.apache.org/jira/browse/GEODE-9011 > Project: Geode > Issue Type: Test > Components: membership, messaging >Reporter: Bruce J Schuchardt >Priority: Major > > A test was reported hung when it tried to shut down. One server reported > this: > {noformat} > [warn 2021/03/06 09:45:18.783 PST bridgegemfire_1_1_host1_6920 > tid=0x90] 15 seconds have elapsed while > waiting for replies: 66 waiting for 2 replies from > [rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006, > > rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005]> > on > rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007 > whose current membership list is: > [[rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_2_host1_7658:7658):41004, > > rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_2_host1_13486:13486:locator):41003, > > rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006, > > rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005, > > rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007, > > rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_1_host1_13950:13950:locator):41000]] > {noformat} > and was stuck waiting for a reply in thread dumps > {noformat} > "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 > tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xf4f654f8> (a > 
java.util.concurrent.CountDownLatch$Sync) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) > at > org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72) > at > org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723) > at > org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794) > at > org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771) > at > org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857) > at > org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779) > at > org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676) > at > org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277) > at > org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318) > at > org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865) > at > org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844) > at > org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180) > at > org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331) > at > org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476) > at > 
org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438) > at > org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272) > at > org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031) > at > org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939) > at > org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306) > - locked <0xf8022800> (a > org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier) > at > org.apache.geo
[jira] [Closed] (GEODE-9011) (deleted)
[ https://issues.apache.org/jira/browse/GEODE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt closed GEODE-9011. - > (deleted) > - > > Key: GEODE-9011 > URL: https://issues.apache.org/jira/browse/GEODE-9011 > Project: Geode > Issue Type: Test > Components: membership, messaging >Reporter: Bruce J Schuchardt >Priority: Major > > submitted to the wrong JIRA - sorry
[jira] [Updated] (GEODE-9011) (deleted)
[ https://issues.apache.org/jira/browse/GEODE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-9011: -- Description: submitted to the wrong JIRA - sorry (was: A test was reported hung when it tried to shut down. One server reported this: {noformat} [warn 2021/03/06 09:45:18.783 PST bridgegemfire_1_1_host1_6920 tid=0x90] 15 seconds have elapsed while waiting for replies: :41006, rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005]> on rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007 whose current membership list is: [[rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_2_host1_7658:7658):41004, rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_2_host1_13486:13486:locator):41003, rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006, rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005, rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007, rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_1_host1_13950:13950:locator):41000]] {noformat} and was stuck waiting for a reply in thread dumps {noformat} "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xf4f654f8> (a java.util.concurrent.CountDownLatch$Sync) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) at 
org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72) at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723) at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794) at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771) at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857) at org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779) at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676) at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277) at org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318) at org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865) at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844) at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180) at org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331) at org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476) at org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438) at org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272) at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031) at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939) at org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306) - locked <0xf8022800> (a 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier) at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630) - locked <0xf5f7b888> (a java.lang.Object) at org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491) - locked <0xf7ef2980> (a org.apache.geode.internal.cache.CacheServerImpl) at org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672) at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263) - locked <0xf5a21a08> (a java.lang.Class for org.ap
[jira] [Created] (GEODE-9011) hctKill.conf Error deserializing message causes hang
Bruce J Schuchardt created GEODE-9011: - Summary: hctKill.conf Error deserializing message causes hang Key: GEODE-9011 URL: https://issues.apache.org/jira/browse/GEODE-9011 Project: Geode Issue Type: Test Components: membership, messaging Reporter: Bruce J Schuchardt A test was reported hung when it tried to shut down. One server reported this: {noformat} [warn 2021/03/06 09:45:18.783 PST bridgegemfire_1_1_host1_6920 tid=0x90] 15 seconds have elapsed while waiting for replies: :41006, rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005]> on rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007 whose current membership list is: [[rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_2_host1_7658:7658):41004, rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_2_host1_13486:13486:locator):41003, rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006, rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005, rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007, rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_1_host1_13950:13950:locator):41000]] {noformat} and was stuck waiting for a reply in thread dumps {noformat} "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xf4f654f8> (a java.util.concurrent.CountDownLatch$Sync) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at 
java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72) at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723) at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794) at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771) at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857) at org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779) at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676) at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277) at org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318) at org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865) at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844) at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180) at org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331) at org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476) at org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438) at org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272) at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031) at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939) at 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306) - locked <0xf8022800> (a org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier) at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630) - locked <0xf5f7b888> (a java.lang.Object) at org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491) - locked <0xf7ef2980> (a org.apache.geode.internal.cache.CacheServerImpl) at org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672) at org.apache.geode.i
[jira] [Resolved] (GEODE-8979) CI Failure: SSLSocketHostNameVerificationIntegrationTest
[ https://issues.apache.org/jira/browse/GEODE-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt resolved GEODE-8979. --- Fix Version/s: 1.15.0 Resolution: Fixed > CI Failure: SSLSocketHostNameVerificationIntegrationTest > > > Key: GEODE-8979 > URL: https://issues.apache.org/jira/browse/GEODE-8979 > Project: Geode > Issue Type: Test > Components: membership, messaging >Reporter: Bruce J Schuchardt >Assignee: Bruce J Schuchardt >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > This test failed in a CI IntegrationTest run with this exception: > {noformat} > org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest > > nioHandshakeValidatesHostName[hasSAN=true and doEndPointIdentification=true] > FAILED > org.apache.geode.GemFireIOException: exception closing SSL session > at > org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:409) > at > org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest.lambda$startServerNIO$3(SSLSocketHostNameVerificationIntegrationTest.java:216) > Caused by: > java.io.IOException: Connection reset by peer > at sun.nio.ch.FileDispatcherImpl.write0(Native Method) > at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) > at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) > at sun.nio.ch.IOUtil.write(IOUtil.java:51) > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470) > at > org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:403) > ... 1 more > {noformat} > It looks like the test needs to have a try/catch for IOException when closing > the NioSslEngine. -- This message was sent by Atlassian Jira (v8.3.4#803005)
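The try/catch suggested at the end of this report could look roughly like the sketch below. It is only an illustration: `QuietCloseSketch` and `closeTolerantly` are hypothetical names, and `java.io.Closeable` stands in for the engine because the real `NioSslEngine.close` signature is not shown in this thread. Note that the trace above shows the `IOException` wrapped in a `GemFireIOException`, so the actual test might need to catch that wrapper instead.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper for test teardown; not part of Geode.
class QuietCloseSketch {
  // Closes the resource and returns the IOException if one was thrown,
  // or null if the close succeeded.
  static IOException closeTolerantly(Closeable engine) {
    try {
      engine.close();
      return null;
    } catch (IOException e) {
      // The peer may already have reset the connection. During teardown
      // this is expected, so it is reported to the caller rather than
      // allowed to fail the test.
      return e;
    }
  }
}
```

Returning the exception rather than swallowing it lets the test log it or assert on it if that turns out to be useful.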
[jira] [Commented] (GEODE-9000) NPE During Reconnect After Network Split
[ https://issues.apache.org/jira/browse/GEODE-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295377#comment-17295377 ] Bruce J Schuchardt commented on GEODE-9000: --- The server was reconnecting and emptying out messages queued during quorum checks: {noformat} logsAndStats/gemfire-cluster-server-0-02-01.log: [info 2021/03/04 10:30:28.595 GMT gemfire-cluster-server-0 tid=0x8c] Delivering 22 messages queued by quorum checker logsAndStats/gemfire-cluster-server-0-02-01.log: [info 2021/03/04 10:30:28.596 GMT gemfire-cluster-server-0 tid=0x8c] received suspect message from 10.4.2.34(:locator):41000 for 10.4.3.19(gemfire-cluster-locator-0:1:locator):41000: Member isn't responding to heartbeat requests [fatal 2021/03/04 10:30:28.596 GMT gemfire-cluster-server-0 tid=0x8c] Unexpected exception while booting membership services java.lang.NullPointerException at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459) at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343) at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428) {noformat} The network-partition message was delivered during this time and was likely intended for the previous Membership service. Checking "isJoined" (or checking for a null currentView) and ignoring the message in that case is probably the right way to fix this problem. > NPE During Reconnect After Network Split > > > Key: GEODE-9000 > URL: https://issues.apache.org/jira/browse/GEODE-9000 > Project: Geode > Issue Type: Bug > Components: membership >Affects Versions: 1.14.0 >Reporter: Juan Ramos >Priority: Major > > During a full network split, when all members are shut down by a partition, one > of the servers continually fails to reconnect due to a > {{NullPointerException}}.
When using persistent regions, this also prevents > the remaining members from starting up correctly, as they might be waiting for > the stuck member to recover the latest data. > The issue itself was introduced by the fix for GEODE-8901: the new > implementation of {{GMSJoinLeave.processNetworkPartitionMessage}} doesn't > have a {{currentView}} installed during the reconnect phase ({{getView() == > null}}) and the following is shown in the logs: > {noformat} > [fatal 2021/03/04 03:32:02.744 GMT gemfire-cluster-server-0 > tid=0x8a] Unexpected exception while booting membership services > java.lang.NullPointerException > at > org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459) > at > org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343) > at > org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428) > at > org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:210) > at > org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1782) > at > org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171) > at > org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222) > at > org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:464) > at > org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:497) > at > org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:326) > at > org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:779) > at > 
org.apache.geode.distributed.internal.InternalDistributedSystem.access$200(InternalDistributedSystem.java:135) > at > org.apache.geode.distributed.internal.InternalDistributedSystem$Builder.build(InternalDistributedSystem.java:3034) > at > org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:290) > at > org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2605) > at > org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2424) > at > org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1275) > at > org.apache.geode.distributed.internal.ClusterDistribution
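The guard proposed in the comment above (ignore a network-partition message when the member has not joined or has no current view) might be sketched as follows. This is a minimal, self-contained illustration and not Geode's code: `JoinLeaveSketch`, `installView`, the `forcedDisconnects` counter, and the use of a `List<String>` as the view are all assumptions made for the example; only the names `isJoined`, `currentView`, and `processNetworkPartitionMessage` come from the thread.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a membership service; not Geode's GMSJoinLeave.
class JoinLeaveSketch {
  // Stand-in for the membership view; stays null until the member has joined.
  private volatile List<String> currentView;
  private volatile boolean isJoined;
  // Counts messages that were acted on (illustrative only).
  final AtomicInteger forcedDisconnects = new AtomicInteger();

  void installView(List<String> view) {
    currentView = view;
    isJoined = true;
  }

  // Returns true if the message was acted on, false if it was ignored.
  boolean processNetworkPartitionMessage(String sender) {
    List<String> view = currentView; // read once so the null check stays valid
    if (!isJoined || view == null) {
      // Probably addressed to the previous membership service: ignore it
      // instead of dereferencing a view that does not exist yet.
      return false;
    }
    if (!view.contains(sender)) {
      // Illustrative rule: only senders in the current view are honored.
      return false;
    }
    forcedDisconnects.incrementAndGet();
    return true;
  }
}
```

Reading `currentView` into a local before the check means a concurrent `installView` cannot turn a passed null check into a later null dereference.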
[jira] [Resolved] (GEODE-8999) When max-threads is specified for a cache server its reader threads may be reported as Stuck
[ https://issues.apache.org/jira/browse/GEODE-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt resolved GEODE-8999. --- Resolution: Not A Problem After talking with Darrel I think the thread was stuck. It was reported during a network partition and the client had invalidated the server and was closing connections to it. > When max-threads is specified for a cache server its reader threads may be > reported as Stuck > > > Key: GEODE-8999 > URL: https://issues.apache.org/jira/browse/GEODE-8999 > Project: Geode > Issue Type: Bug > Components: client/server, membership >Affects Versions: 1.14.0 >Reporter: Bruce J Schuchardt >Priority: Major > > We noticed this report of a stuck thread in a test that enabled max-threads > in a cache server: > {noformat} > [warn 2021/03/02 19:54:31.041 PST bridgep2_host2_17822 > tid=0x1b] Thread <104> (0x68) that was executed at <02 Mar 2021 19:53:44 PST> > has been stuck for <46.356 seconds> and number of thread monitor iteration <1> > Thread Name state > Executor Group > Monitored metric > Thread stack: > sun.nio.ch.FileDispatcherImpl.read0(Native Method) > sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > sun.nio.ch.IOUtil.read(IOUtil.java:192) > sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378) > org.apache.geode.internal.cache.tier.sockets.Message.readWrappedHeaders(Message.java:1237) > org.apache.geode.internal.cache.tier.sockets.Message.fetchHeader(Message.java:859) > org.apache.geode.internal.cache.tier.sockets.Message.readHeaderAndBody(Message.java:698) > org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1213) > org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1229) > org.apache.geode.internal.cache.tier.sockets.BaseCommand.readRequest(BaseCommand.java:816) > 
org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:777) > org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:73) > org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1185) > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:710) > org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$$Lambda$351/1357226696.invoke(Unknown > Source) > org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120) > org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$88/1800187767.run(Unknown > Source) > java.lang.Thread.run(Thread.java:748) > {noformat} > The cache server should suspend thread monitoring before reading from a > socket and resume monitoring afterward. An example of this can be found in > org.apache.geode.internal.tcp.Connection.java. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (GEODE-8999) When max-threads is specified for a cache server its reader threads may be reported as Stuck
[ https://issues.apache.org/jira/browse/GEODE-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt closed GEODE-8999. - > When max-threads is specified for a cache server its reader threads may be > reported as Stuck > > > Key: GEODE-8999 > URL: https://issues.apache.org/jira/browse/GEODE-8999 > Project: Geode > Issue Type: Bug > Components: client/server, membership >Affects Versions: 1.14.0 >Reporter: Bruce J Schuchardt >Priority: Major > > We noticed this report of a stuck thread in a test that enabled max-threads > in a cache server: > {noformat} > [warn 2021/03/02 19:54:31.041 PST bridgep2_host2_17822 > tid=0x1b] Thread <104> (0x68) that was executed at <02 Mar 2021 19:53:44 PST> > has been stuck for <46.356 seconds> and number of thread monitor iteration <1> > Thread Name state > Executor Group > Monitored metric > Thread stack: > sun.nio.ch.FileDispatcherImpl.read0(Native Method) > sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > sun.nio.ch.IOUtil.read(IOUtil.java:192) > sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378) > org.apache.geode.internal.cache.tier.sockets.Message.readWrappedHeaders(Message.java:1237) > org.apache.geode.internal.cache.tier.sockets.Message.fetchHeader(Message.java:859) > org.apache.geode.internal.cache.tier.sockets.Message.readHeaderAndBody(Message.java:698) > org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1213) > org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1229) > org.apache.geode.internal.cache.tier.sockets.BaseCommand.readRequest(BaseCommand.java:816) > org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:777) > org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:73) > 
org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1185) > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:710) > org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$$Lambda$351/1357226696.invoke(Unknown > Source) > org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120) > org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$88/1800187767.run(Unknown > Source) > java.lang.Thread.run(Thread.java:748) > {noformat} > The cache server should suspend thread monitoring before reading from a > socket and resume monitoring afterward. An example of this can be found in > org.apache.geode.internal.tcp.Connection.java. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-8999) When max-threads is specified for a cache server its reader threads may be reported as Stuck
[ https://issues.apache.org/jira/browse/GEODE-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8999: -- Component/s: membership > When max-threads is specified for a cache server its reader threads may be > reported as Stuck > > > Key: GEODE-8999 > URL: https://issues.apache.org/jira/browse/GEODE-8999 > Project: Geode > Issue Type: Bug > Components: client/server, membership >Affects Versions: 1.14.0 >Reporter: Bruce J Schuchardt >Priority: Major > > We noticed this report of a stuck thread in a test that enabled max-threads > in a cache server: > {noformat} > [warn 2021/03/02 19:54:31.041 PST bridgep2_host2_17822 > tid=0x1b] Thread <104> (0x68) that was executed at <02 Mar 2021 19:53:44 PST> > has been stuck for <46.356 seconds> and number of thread monitor iteration <1> > Thread Name state > Executor Group > Monitored metric > Thread stack: > sun.nio.ch.FileDispatcherImpl.read0(Native Method) > sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > sun.nio.ch.IOUtil.read(IOUtil.java:192) > sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378) > org.apache.geode.internal.cache.tier.sockets.Message.readWrappedHeaders(Message.java:1237) > org.apache.geode.internal.cache.tier.sockets.Message.fetchHeader(Message.java:859) > org.apache.geode.internal.cache.tier.sockets.Message.readHeaderAndBody(Message.java:698) > org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1213) > org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1229) > org.apache.geode.internal.cache.tier.sockets.BaseCommand.readRequest(BaseCommand.java:816) > org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:777) > org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:73) > 
org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1185) > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:710) > org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$$Lambda$351/1357226696.invoke(Unknown > Source) > org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120) > org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$88/1800187767.run(Unknown > Source) > java.lang.Thread.run(Thread.java:748) > {noformat} > The cache server should suspend thread monitoring before reading from a > socket and resume monitoring afterward. An example of this can be found in > org.apache.geode.internal.tcp.Connection.java. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-8999) When max-threads is specified for a cache server its reader threads may be reported as Stuck
[ https://issues.apache.org/jira/browse/GEODE-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8999: -- Issue Type: Bug (was: Test) > When max-threads is specified for a cache server its reader threads may be > reported as Stuck > > > Key: GEODE-8999 > URL: https://issues.apache.org/jira/browse/GEODE-8999 > Project: Geode > Issue Type: Bug > Components: client/server >Reporter: Bruce J Schuchardt >Priority: Major > > We noticed this report of a stuck thread in a test that enabled max-threads > in a cache server: > {noformat} > [warn 2021/03/02 19:54:31.041 PST bridgep2_host2_17822 > tid=0x1b] Thread <104> (0x68) that was executed at <02 Mar 2021 19:53:44 PST> > has been stuck for <46.356 seconds> and number of thread monitor iteration <1> > Thread Name state > Executor Group > Monitored metric > Thread stack: > sun.nio.ch.FileDispatcherImpl.read0(Native Method) > sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > sun.nio.ch.IOUtil.read(IOUtil.java:192) > sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378) > org.apache.geode.internal.cache.tier.sockets.Message.readWrappedHeaders(Message.java:1237) > org.apache.geode.internal.cache.tier.sockets.Message.fetchHeader(Message.java:859) > org.apache.geode.internal.cache.tier.sockets.Message.readHeaderAndBody(Message.java:698) > org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1213) > org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1229) > org.apache.geode.internal.cache.tier.sockets.BaseCommand.readRequest(BaseCommand.java:816) > org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:777) > org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:73) > 
org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1185) > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:710) > org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$$Lambda$351/1357226696.invoke(Unknown > Source) > org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120) > org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$88/1800187767.run(Unknown > Source) > java.lang.Thread.run(Thread.java:748) > {noformat} > The cache server should suspend thread monitoring before reading from a > socket and resume monitoring afterward. An example of this can be found in > org.apache.geode.internal.tcp.Connection.java. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-8999) When max-threads is specified for a cache server its reader threads may be reported as Stuck
[ https://issues.apache.org/jira/browse/GEODE-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8999: -- Affects Version/s: 1.14.0 > When max-threads is specified for a cache server its reader threads may be > reported as Stuck > > > Key: GEODE-8999 > URL: https://issues.apache.org/jira/browse/GEODE-8999 > Project: Geode > Issue Type: Bug > Components: client/server >Affects Versions: 1.14.0 >Reporter: Bruce J Schuchardt >Priority: Major > > We noticed this report of a stuck thread in a test that enabled max-threads > in a cache server: > {noformat} > [warn 2021/03/02 19:54:31.041 PST bridgep2_host2_17822 > tid=0x1b] Thread <104> (0x68) that was executed at <02 Mar 2021 19:53:44 PST> > has been stuck for <46.356 seconds> and number of thread monitor iteration <1> > Thread Name state > Executor Group > Monitored metric > Thread stack: > sun.nio.ch.FileDispatcherImpl.read0(Native Method) > sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > sun.nio.ch.IOUtil.read(IOUtil.java:192) > sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378) > org.apache.geode.internal.cache.tier.sockets.Message.readWrappedHeaders(Message.java:1237) > org.apache.geode.internal.cache.tier.sockets.Message.fetchHeader(Message.java:859) > org.apache.geode.internal.cache.tier.sockets.Message.readHeaderAndBody(Message.java:698) > org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1213) > org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1229) > org.apache.geode.internal.cache.tier.sockets.BaseCommand.readRequest(BaseCommand.java:816) > org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:777) > org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:73) > 
org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1185) > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:710) > org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$$Lambda$351/1357226696.invoke(Unknown > Source) > org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120) > org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$88/1800187767.run(Unknown > Source) > java.lang.Thread.run(Thread.java:748) > {noformat} > The cache server should suspend thread monitoring before reading from a > socket and resume monitoring afterward. An example of this can be found in > org.apache.geode.internal.tcp.Connection.java. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-8999) When max-threads is specified for a cache server its reader threads may be reported as Stuck
Bruce J Schuchardt created GEODE-8999: - Summary: When max-threads is specified for a cache server its reader threads may be reported as Stuck Key: GEODE-8999 URL: https://issues.apache.org/jira/browse/GEODE-8999 Project: Geode Issue Type: Test Components: client/server Reporter: Bruce J Schuchardt We noticed this report of a stuck thread in a test that enabled max-threads in a cache server: {noformat} [warn 2021/03/02 19:54:31.041 PST bridgep2_host2_17822 tid=0x1b] Thread <104> (0x68) that was executed at <02 Mar 2021 19:53:44 PST> has been stuck for <46.356 seconds> and number of thread monitor iteration <1> Thread Name state Executor Group Monitored metric Thread stack: sun.nio.ch.FileDispatcherImpl.read0(Native Method) sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) sun.nio.ch.IOUtil.read(IOUtil.java:192) sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378) org.apache.geode.internal.cache.tier.sockets.Message.readWrappedHeaders(Message.java:1237) org.apache.geode.internal.cache.tier.sockets.Message.fetchHeader(Message.java:859) org.apache.geode.internal.cache.tier.sockets.Message.readHeaderAndBody(Message.java:698) org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1213) org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1229) org.apache.geode.internal.cache.tier.sockets.BaseCommand.readRequest(BaseCommand.java:816) org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:777) org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:73) org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1185) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:710) org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$$Lambda$351/1357226696.invoke(Unknown Source) org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120) org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$88/1800187767.run(Unknown Source) java.lang.Thread.run(Thread.java:748) {noformat} The cache server should suspend thread monitoring before reading from a socket and resume monitoring afterward. An example of this can be found in org.apache.geode.internal.tcp.Connection.java. -- This message was sent by Atlassian Jira (v8.3.4#803005)
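The suspend-before-read, resume-after-read pattern described at the end of this report can be sketched as below. The names here (`MonitorSketch`, `MonitoredReader`, `RecordingMonitor`, `readHeaderByte`) are hypothetical and chosen for the example; Geode's actual thread-monitoring API, including whatever `org.apache.geode.internal.tcp.Connection` uses, is not shown in this thread and may look different.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical monitor interface; Geode's real thread-monitoring API may differ.
interface MonitorSketch {
  void suspendMonitoring();
  void resumeMonitoring();
}

class MonitoredReader {
  private final MonitorSketch monitor;

  MonitoredReader(MonitorSketch monitor) {
    this.monitor = monitor;
  }

  // Reads one byte while monitoring is suspended, so that an idle client
  // does not cause the reader thread to be reported as stuck.
  int readHeaderByte(InputStream in) throws IOException {
    monitor.suspendMonitoring();
    try {
      return in.read(); // may block for as long as the client stays silent
    } finally {
      // Resume even if the read throws, so later work is monitored again.
      monitor.resumeMonitoring();
    }
  }
}

// Records calls so the ordering can be inspected in a test.
class RecordingMonitor implements MonitorSketch {
  final StringBuilder events = new StringBuilder();

  public void suspendMonitoring() {
    events.append("suspend;");
  }

  public void resumeMonitoring() {
    events.append("resume;");
  }
}
```

Putting the resume call in `finally` is the main design point: a read that fails with an exception must not leave the thread permanently unmonitored.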
[jira] [Commented] (GEODE-8997) remove protobuf client server code
[ https://issues.apache.org/jira/browse/GEODE-8997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294685#comment-17294685 ] Bruce J Schuchardt commented on GEODE-8997: --- also these subprojects: geode-protobuf geode-protobuf-messages geode-experimental-driver > remove protobuf client server code > -- > > Key: GEODE-8997 > URL: https://issues.apache.org/jira/browse/GEODE-8997 > Project: Geode > Issue Type: Improvement > Components: client/server >Reporter: Darrel Schneider >Priority: Major > > The protobuf based client/server project is essentially dead but code for it > is still part of geode. > This complicates the implementation. For example I was working on an > improvement to have the thread monitor detect stuck server connection threads > and found myself trying to figure out how to make this work for > ProtobufServerConnection. > I think it would be best to remove the dead protobuf code. I'm not sure what > all of it is but here is what I have found so far: > ProtobufServerConnection > package org.apache.geode.internal.cache.client.protocol > package org.apache.geode.internal.protocol.protobuf.v1 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (GEODE-8979) CI Failure: SSLSocketHostNameVerificationIntegrationTest
[ https://issues.apache.org/jira/browse/GEODE-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt reassigned GEODE-8979: - Assignee: Bruce J Schuchardt > CI Failure: SSLSocketHostNameVerificationIntegrationTest > > > Key: GEODE-8979 > URL: https://issues.apache.org/jira/browse/GEODE-8979 > Project: Geode > Issue Type: Test > Components: membership, messaging >Reporter: Bruce J Schuchardt >Assignee: Bruce J Schuchardt >Priority: Major > > This test failed in a CI IntegrationTest run with this exception: > {noformat} > org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest > > nioHandshakeValidatesHostName[hasSAN=true and doEndPointIdentification=true] > FAILED > org.apache.geode.GemFireIOException: exception closing SSL session > at > org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:409) > at > org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest.lambda$startServerNIO$3(SSLSocketHostNameVerificationIntegrationTest.java:216) > Caused by: > java.io.IOException: Connection reset by peer > at sun.nio.ch.FileDispatcherImpl.write0(Native Method) > at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) > at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) > at sun.nio.ch.IOUtil.write(IOUtil.java:51) > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470) > at > org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:403) > ... 1 more > {noformat} > It looks like the test needs to have a try/catch for IOException when closing > the NioSslEngine. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (GEODE-8963) separate client/server compatibility from server/server version compatibility
[ https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt resolved GEODE-8963. --- Fix Version/s: 1.15.0 Resolution: Fixed > separate client/server compatibility from server/server version compatibility > - > > Key: GEODE-8963 > URL: https://issues.apache.org/jira/browse/GEODE-8963 > Project: Geode > Issue Type: Improvement > Components: serialization >Reporter: Bruce J Schuchardt >Assignee: Bruce J Schuchardt >Priority: Minor > Labels: pull-request-available > Fix For: 1.15.0 > > > A client's version is used for deserializing data received from the client > and for serializing data sent to the client. It is also used to locate the > map of Commands used to process client requests. Every time we cut a new > release we bump this version in KnownVersions and create a new map of > Commands, even though client/server communications protocols rarely change. > We should have each KnownVersion hold a client/server compatibility number > that is used to identify clients rather than the KnownVersion's ordinal. 
> For instance, > {code:java} > public static final KnownVersion GEODE_1_15_0 = > new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_15_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > > public static final KnownVersion GEODE_1_16_0 = > new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_16_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > public static final KnownVersion GEODE_1_17_0 = > new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_17_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > {code} > In the above KnownVersions the client/server serialization is known to have > not changed since v1.15.0 and so there is no need to use a newer KnownVersion > for clients. > Client handshake code will need to be changed to use the client/server > ordinal when identifying clients and servers. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-8979) CI Failure: SSLSocketHostNameVerificationIntegrationTest
Bruce J Schuchardt created GEODE-8979: - Summary: CI Failure: SSLSocketHostNameVerificationIntegrationTest Key: GEODE-8979 URL: https://issues.apache.org/jira/browse/GEODE-8979 Project: Geode Issue Type: Test Components: membership, messaging Reporter: Bruce J Schuchardt This test failed in a CI IntegrationTest run with this exception: {noformat} org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest > nioHandshakeValidatesHostName[hasSAN=true and doEndPointIdentification=true] FAILED org.apache.geode.GemFireIOException: exception closing SSL session at org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:409) at org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest.lambda$startServerNIO$3(SSLSocketHostNameVerificationIntegrationTest.java:216) Caused by: java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.write0(Native Method) at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) at sun.nio.ch.IOUtil.write(IOUtil.java:51) at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470) at org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:403) ... 1 more {noformat} It looks like the test needs to have a try/catch for IOException when closing the NioSslEngine. -- This message was sent by Atlassian Jira (v8.3.4#803005)
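The try/catch suggested in GEODE-8979 might look something like the sketch below. It is written against java.io.Closeable rather than NioSslEngine, and closeTolerating is a hypothetical helper name; the actual test may need to catch a different exception type (the trace above shows the IOException wrapped in a GemFireIOException).

```java
import java.io.Closeable;
import java.io.IOException;

public class TolerantCloseSketch {
  /**
   * Closes the resource and returns the IOException if one was thrown, or null on a clean
   * close. The caller can log the exception instead of failing the test.
   */
  public static IOException closeTolerating(Closeable resource) {
    try {
      resource.close();
      return null;
    } catch (IOException e) {
      // The peer may already have dropped the connection; that is not a test failure.
      return e;
    }
  }

  public static void main(String[] args) {
    // Simulates the "Connection reset by peer" failure seen in the CI run.
    Closeable resetByPeer = () -> {
      throw new IOException("Connection reset by peer");
    };
    IOException ignored = closeTolerating(resetByPeer);
    System.out.println("close tolerated: " + ignored.getMessage());
  }
}
```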
[jira] [Updated] (GEODE-8978) convert all DataSerializableFixedID classes to stop using InternalDataSerializer's static methods
[ https://issues.apache.org/jira/browse/GEODE-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8978: -- Description: When we introduced the geode-serialization module we created new method signatures for toData and fromData in the form {code:java} void toData(DataOutput out, SerializationContext context) throws IOException; {code} and {code:java} void fromData(DataInput in, DeserializationContext context) throws IOException, ClassNotFoundException; {code} All DataSerializableFixedID classes were modified to use these signatures but many continue to use InternalDataSerializer and/or DataSerializer static methods to perform their work. These should be changed to use the SerializationContext or DeserializationContext parameter along with StaticSerialization whenever possible. was: When we introduced the geode-serialization module we created new method signatures for toData and fromData in the form {code:java} void toData(DataOutput out, SerializationContext context) throws IOException; {code} and {code:java} void fromData(DataInput in, DeserializationContext context) throws IOException, ClassNotFoundException; {code} All DataSerializableFixedID classes were modified to use these signatures but many continue to use InternalDataSerializer and/or DataSerializer static methods to perform their work. These should be changed to use the SerializationContext parameter and StaticSerialization whenever possible. 
> convert all DataSerializableFixedID classes to stop using > InternalDataSerializer's static methods > - > > Key: GEODE-8978 > URL: https://issues.apache.org/jira/browse/GEODE-8978 > Project: Geode > Issue Type: Improvement > Components: membership, serialization >Reporter: Bruce J Schuchardt >Priority: Major > > When we introduced the geode-serialization module we created new method > signatures for toData and fromData in the form > {code:java} > void toData(DataOutput out, SerializationContext context) throws IOException; > {code} > and > {code:java} > void fromData(DataInput in, DeserializationContext context) > throws IOException, ClassNotFoundException; > {code} > All DataSerializableFixedID classes were modified to use these signatures but > many continue to use InternalDataSerializer and/or DataSerializer static > methods to perform their work. These should be changed to use the > SerializationContext or DeserializationContext parameter along with > StaticSerialization whenever possible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-8978) convert all DataSerializableFixedID classes to stop using InternalDataSerializer's static methods
[ https://issues.apache.org/jira/browse/GEODE-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8978: -- Issue Type: Improvement (was: Bug) > convert all DataSerializableFixedID classes to stop using > InternalDataSerializer's static methods > - > > Key: GEODE-8978 > URL: https://issues.apache.org/jira/browse/GEODE-8978 > Project: Geode > Issue Type: Improvement > Components: membership, serialization >Reporter: Bruce J Schuchardt >Priority: Major > > When we introduced the geode-serialization module we created new method > signatures for toData and fromData in the form > {code:java} > void toData(DataOutput out, SerializationContext context) throws IOException; > {code} > and > {code:java} > void fromData(DataInput in, DeserializationContext context) > throws IOException, ClassNotFoundException; > {code} > All DataSerializableFixedID classes were modified to use these signatures but > many continue to use InternalDataSerializer and/or DataSerializer static > methods to perform their work. These should be changed to use the > SerializationContext parameter and StaticSerialization whenever possible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-8978) convert all DataSerializableFixedID classes to stop using InternalDataSerializer's static methods
Bruce J Schuchardt created GEODE-8978: - Summary: convert all DataSerializableFixedID classes to stop using InternalDataSerializer's static methods Key: GEODE-8978 URL: https://issues.apache.org/jira/browse/GEODE-8978 Project: Geode Issue Type: Bug Components: membership, serialization Reporter: Bruce J Schuchardt When we introduced the geode-serialization module we created new method signatures for toData and fromData in the form {code:java} void toData(DataOutput out, SerializationContext context) throws IOException; {code} and {code:java} void fromData(DataInput in, DeserializationContext context) throws IOException, ClassNotFoundException; {code} All DataSerializableFixedID classes were modified to use these signatures but many continue to use InternalDataSerializer and/or DataSerializer static methods to perform their work. These should be changed to use the SerializationContext parameter and StaticSerialization whenever possible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (GEODE-8948) log a locator's coordinates during launch
[ https://issues.apache.org/jira/browse/GEODE-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt resolved GEODE-8948. --- Resolution: Invalid We are already logging the locator's coordinates in its log file, so this ticket isn't needed. [info 2021/02/17 15:43:16.162 PST locatorgemfire_1_1_host1_47196 tid=0x17] Locator started on bruces-a01.fios-router.home[27985] > log a locator's coordinates during launch > - > > Key: GEODE-8948 > URL: https://issues.apache.org/jira/browse/GEODE-8948 > Project: Geode > Issue Type: Improvement > Components: membership >Reporter: Bruce J Schuchardt >Priority: Major > > Looking through a Locator's log file it is difficult, if not impossible, to > tell what the locator's host and port are. This makes it difficult to know > which Locator log files to examine when debugging if a client (or WAN > service) has trouble contacting a Locator because they only log that locator's > host and port number. > If Locators would log something like > Starting location services on \{hostname} and port > \{portnumber} > along with any other info that would be useful in grepping through > artifacts to find a log file of interest it would help a lot in debugging > efforts. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (GEODE-8948) log a locator's coordinates during launch
[ https://issues.apache.org/jira/browse/GEODE-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt closed GEODE-8948. - > log a locator's coordinates during launch > - > > Key: GEODE-8948 > URL: https://issues.apache.org/jira/browse/GEODE-8948 > Project: Geode > Issue Type: Improvement > Components: membership >Reporter: Bruce J Schuchardt >Priority: Major > > Looking through a Locator's log file it is difficult, if not impossible, to > tell what the locator's host and port are. This makes it difficult to know > which Locator log files to examine when debugging if a client (or WAN > service) has trouble contacting a Locator because they only log that locator's > host and port number. > If Locators would log something like > Starting location services on \{hostname} and port > \{portnumber} > along with any other info that would be useful in grepping through > artifacts to find a log file of interest it would help a lot in debugging > efforts. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-8972) remove shunnedMembers collection from GMSMembership
[ https://issues.apache.org/jira/browse/GEODE-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8972: -- Issue Type: Improvement (was: Bug) > remove shunnedMembers collection from GMSMembership > --- > > Key: GEODE-8972 > URL: https://issues.apache.org/jira/browse/GEODE-8972 > Project: Geode > Issue Type: Improvement > Components: membership >Reporter: Bruce J Schuchardt >Priority: Major > > GMSMembership has a _shunnedMembers_ collection that is used to track the IDs > of nodes that are no longer part of the cluster. This collection is no > longer needed since we can tell if a node is old by comparing the view ID in > its identifier to that of the current view (called _latestView_ in that > class). Checks like this are already in place in some parts of the code. > All uses of _shunnedMembers_ should be replaced with this check: > MembershipView view = latestView; > boolean shunned = memberId.getVmViewId() <= view.getViewId() && > !view.contains(memberId); -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-8972) remove shunnedMembers collection from GMSMembership
Bruce J Schuchardt created GEODE-8972: - Summary: remove shunnedMembers collection from GMSMembership Key: GEODE-8972 URL: https://issues.apache.org/jira/browse/GEODE-8972 Project: Geode Issue Type: Bug Components: membership Reporter: Bruce J Schuchardt GMSMembership has a _shunnedMembers_ collection that is used to track the IDs of nodes that are no longer part of the cluster. This collection is no longer needed since we can tell if a node is old by comparing the view ID in its identifier to that of the current view (called _latestView_ in that class). Checks like this are already in place in some parts of the code. All uses of _shunnedMembers_ should be replaced with this check: MembershipView view = latestView; boolean shunned = memberId.getVmViewId() <= view.getViewId() && !view.contains(memberId); -- This message was sent by Atlassian Jira (v8.3.4#803005)
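To make the shunned-member check in GEODE-8972 concrete, here is a small self-contained sketch. MemberId and View are hypothetical stand-ins for Geode's member identifier and MembershipView, holding only the fields the check needs; only the boolean expression itself comes from the ticket.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ShunnedCheckSketch {
  /** Hypothetical stand-in for a member identifier that carries the view ID it joined in. */
  public static final class MemberId {
    final String name;
    final int vmViewId;

    public MemberId(String name, int vmViewId) {
      this.name = name;
      this.vmViewId = vmViewId;
    }
  }

  /** Hypothetical stand-in for a membership view: a view ID plus the current members. */
  public static final class View {
    final int viewId;
    final Set<String> memberNames;

    public View(int viewId, String... memberNames) {
      this.viewId = viewId;
      this.memberNames = new HashSet<>(Arrays.asList(memberNames));
    }

    boolean contains(MemberId id) {
      return memberNames.contains(id.name);
    }
  }

  /** A member is old if it joined in this view or an earlier one but is no longer in it. */
  public static boolean isShunned(MemberId memberId, View latestView) {
    View view = latestView; // read the view once so both comparisons use the same snapshot
    return memberId.vmViewId <= view.viewId && !view.contains(memberId);
  }

  public static void main(String[] args) {
    View latest = new View(5, "a", "b");
    System.out.println(isShunned(new MemberId("c", 3), latest)); // departed member
    System.out.println(isShunned(new MemberId("a", 2), latest)); // current member
    System.out.println(isShunned(new MemberId("d", 7), latest)); // newer than the view
  }
}
```

The third case is why the view ID comparison matters: a member whose view ID is newer than the local view is not yet known here, so absence from the view does not by itself mean it has left.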
[jira] [Commented] (GEODE-8963) separate client/server compatibility from server/server version compatibility
[ https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17289395#comment-17289395 ] Bruce J Schuchardt commented on GEODE-8963: --- bq. If we didn't bump the ordinal at 1.14.0 what would we do if we needed a protocol change in 1.13.2? Would we say "no" to that protocol change? That is exactly the reason we do it. I think we should be bumping it by more than 5. We aren't going to run out of numbers if we bump it by 10 or 20. > separate client/server compatibility from server/server version compatibility > - > > Key: GEODE-8963 > URL: https://issues.apache.org/jira/browse/GEODE-8963 > Project: Geode > Issue Type: Improvement > Components: serialization >Reporter: Bruce J Schuchardt >Assignee: Bruce J Schuchardt >Priority: Minor > Labels: pull-request-available > > A client's version is used for deserializing data received from the client > and for serializing data sent to the client. It is also used to locate the > map of Commands used to process client requests. Every time we cut a new > release we bump this version in KnownVersions and create a new map of > Commands, even though client/server communications protocols rarely change. > We should have each KnownVersion hold a client/server compatibility number > that is used to identify clients rather than the KnownVersion's ordinal. 
> For instance, > {code:java} > public static final KnownVersion GEODE_1_15_0 = > new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_15_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > > public static final KnownVersion GEODE_1_16_0 = > new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_16_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > public static final KnownVersion GEODE_1_17_0 = > new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_17_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > {code} > In the above KnownVersions the client/server serialization is known to have > not changed since v1.15.0 and so there is no need to use a newer KnownVersion > for clients. > Client handshake code will need to be changed to use the client/server > ordinal when identifying clients and servers. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-8963) separate client/server compatibility from server/server version compatibility
[ https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8963: -- Priority: Minor (was: Major) > separate client/server compatibility from server/server version compatibility > - > > Key: GEODE-8963 > URL: https://issues.apache.org/jira/browse/GEODE-8963 > Project: Geode > Issue Type: Improvement > Components: membership, serialization >Reporter: Bruce J Schuchardt >Assignee: Bruce J Schuchardt >Priority: Minor > > A client's version is used for deserializing data received from the client > and for serializing data sent to the client. It is also used to locate the > map of Commands used to process client requests. Every time we cut a new > release we bump this version in KnownVersions and create a new map of > Commands, even though client/server communications protocols rarely change. > We should have each KnownVersion hold a client/server compatibility number > that is used to identify clients rather than the KnownVersion's ordinal. > For instance, > {code:java} > public static final KnownVersion GEODE_1_15_0 = > new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_15_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > > public static final KnownVersion GEODE_1_16_0 = > new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_16_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > public static final KnownVersion GEODE_1_17_0 = > new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_17_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > {code} > In the above KnownVersions the client/server serialization is known to have > not changed since v1.15.0 and so there is no need to use a newer KnownVersion > for clients. 
> Client handshake code will need to be changed to use the client/server > ordinal when identifying clients and servers. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (GEODE-8963) separate client/server compatibility from server/server version compatibility
[ https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt reassigned GEODE-8963: - Assignee: Bruce J Schuchardt > separate client/server compatibility from server/server version compatibility > - > > Key: GEODE-8963 > URL: https://issues.apache.org/jira/browse/GEODE-8963 > Project: Geode > Issue Type: Improvement > Components: membership, serialization >Reporter: Bruce J Schuchardt >Assignee: Bruce J Schuchardt >Priority: Major > > A client's version is used for deserializing data received from the client > and for serializing data sent to the client. It is also used to locate the > map of Commands used to process client requests. Every time we cut a new > release we bump this version in KnownVersions and create a new map of > Commands, even though client/server communications protocols rarely change. > We should have each KnownVersion hold a client/server compatibility number > that is used to identify clients rather than the KnownVersion's ordinal. > For instance, > {code:java} > public static final KnownVersion GEODE_1_15_0 = > new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_15_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > > public static final KnownVersion GEODE_1_16_0 = > new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_16_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > public static final KnownVersion GEODE_1_17_0 = > new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_17_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > {code} > In the above KnownVersions the client/server serialization is known to have > not changed since v1.15.0 and so there is no need to use a newer KnownVersion > for clients. 
> Client handshake code will need to be changed to use the client/server > ordinal when identifying clients and servers. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-8963) separate client/server compatibility from server/server version compatibility
[ https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8963: -- Component/s: (was: membership) > separate client/server compatibility from server/server version compatibility > - > > Key: GEODE-8963 > URL: https://issues.apache.org/jira/browse/GEODE-8963 > Project: Geode > Issue Type: Improvement > Components: serialization >Reporter: Bruce J Schuchardt >Assignee: Bruce J Schuchardt >Priority: Minor > > A client's version is used for deserializing data received from the client > and for serializing data sent to the client. It is also used to locate the > map of Commands used to process client requests. Every time we cut a new > release we bump this version in KnownVersions and create a new map of > Commands, even though client/server communications protocols rarely change. > We should have each KnownVersion hold a client/server compatibility number > that is used to identify clients rather than the KnownVersion's ordinal. > For instance, > {code:java} > public static final KnownVersion GEODE_1_15_0 = > new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_15_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > > public static final KnownVersion GEODE_1_16_0 = > new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_16_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > public static final KnownVersion GEODE_1_17_0 = > new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_17_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > {code} > In the above KnownVersions the client/server serialization is known to have > not changed since v1.15.0 and so there is no need to use a newer KnownVersion > for clients. 
> Client handshake code will need to be changed to use the client/server > ordinal when identifying clients and servers. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-8963) separate client/server compatibility from server/server version compatibility
[ https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8963: -- Component/s: membership > separate client/server compatibility from server/server version compatibility > - > > Key: GEODE-8963 > URL: https://issues.apache.org/jira/browse/GEODE-8963 > Project: Geode > Issue Type: Improvement > Components: membership, serialization >Reporter: Bruce J Schuchardt >Priority: Major > > A client's version is used for deserializing data received from the client > and for serializing data sent to the client. It is also used to locate the > map of Commands used to process client requests. Every time we cut a new > release we bump this version in KnownVersions and create a new map of > Commands, even though client/server communications protocols rarely change. > We should have each KnownVersion hold a client/server compatibility number > that is used to identify clients rather than the KnownVersion's ordinal. > For instance, > {code:java} > public static final KnownVersion GEODE_1_15_0 = > new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_15_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > > public static final KnownVersion GEODE_1_16_0 = > new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_16_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > public static final KnownVersion GEODE_1_17_0 = > new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_17_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > {code} > In the above KnownVersions the client/server serialization is known to have > not changed since v1.15.0 and so there is no need to use a newer KnownVersion > for clients. 
> Client handshake code will need to be changed to use the client/server > ordinal when identifying clients and servers. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-8963) separate client/server compatibility from server/server version compatibility
Bruce J Schuchardt created GEODE-8963: - Summary: separate client/server compatibility from server/server version compatibility Key: GEODE-8963 URL: https://issues.apache.org/jira/browse/GEODE-8963 Project: Geode Issue Type: Bug Components: serialization Reporter: Bruce J Schuchardt A client's version is used for deserializing data received from the client and for serializing data sent to the client. It is also used to locate the map of Commands used to process client requests. Every time we cut a new release we bump this version in KnownVersions and create a new map of Commands, even though client/server communications protocols rarely change. We should have each KnownVersion hold a client/server compatibility number that is used to identify clients rather than the KnownVersion's ordinal. For instance, {code:java} public static final KnownVersion GEODE_1_15_0 = new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, (byte) 0, /*server/server version*/GEODE_1_15_0_ORDINAL, /*client/server version*/GEODE_1_15_0_ORDINAL); public static final KnownVersion GEODE_1_16_0 = new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, (byte) 0, /*server/server version*/GEODE_1_16_0_ORDINAL, /*client/server version*/GEODE_1_15_0_ORDINAL); public static final KnownVersion GEODE_1_17_0 = new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, (byte) 0, /*server/server version*/GEODE_1_17_0_ORDINAL, /*client/server version*/GEODE_1_15_0_ORDINAL); {code} In the above KnownVersions the client/server serialization is known to have not changed since v1.15.0 and so there is no need to use a newer KnownVersion for clients. Client handshake code will need to be changed to use the client/server ordinal when identifying clients and servers. -- This message was sent by Atlassian Jira (v8.3.4#803005)
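The effect of the proposal in GEODE-8963 can be illustrated with a simplified model. Everything in this sketch is an assumption made for illustration: the Version class, the ordinal values 150/160/170 and the string-valued command sets are hypothetical and do not reflect KnownVersion's real constructor or Geode's command maps. It only shows that looking up commands by the client/server ordinal lets several releases share one command map.

```java
import java.util.HashMap;
import java.util.Map;

public class ClientServerOrdinalSketch {
  /** Hypothetical, simplified stand-in for KnownVersion with two separate ordinals. */
  public static final class Version {
    final String name;
    final int ordinal; // server/server version
    final int clientServerOrdinal; // client/server version

    Version(String name, int ordinal, int clientServerOrdinal) {
      this.name = name;
      this.ordinal = ordinal;
      this.clientServerOrdinal = clientServerOrdinal;
    }
  }

  // Ordinal values are made up for this sketch.
  static final int ORDINAL_1_15_0 = 150;
  static final int ORDINAL_1_16_0 = 160;
  static final int ORDINAL_1_17_0 = 170;

  public static final Version GEODE_1_15_0 =
      new Version("1.15.0", ORDINAL_1_15_0, ORDINAL_1_15_0);
  public static final Version GEODE_1_16_0 =
      new Version("1.16.0", ORDINAL_1_16_0, ORDINAL_1_15_0);
  public static final Version GEODE_1_17_0 =
      new Version("1.17.0", ORDINAL_1_17_0, ORDINAL_1_15_0);

  /** Command sets keyed by client/server ordinal; one entry serves all three releases. */
  static final Map<Integer, String> COMMAND_SETS = new HashMap<>();

  static {
    COMMAND_SETS.put(ORDINAL_1_15_0, "commands-1.15.0");
  }

  /** Looks up the command set by the client/server ordinal rather than the release ordinal. */
  public static String commandsFor(Version clientVersion) {
    return COMMAND_SETS.get(clientVersion.clientServerOrdinal);
  }

  public static void main(String[] args) {
    System.out.println(commandsFor(GEODE_1_17_0));
  }
}
```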
[jira] [Updated] (GEODE-8963) separate client/server compatibility from server/server version compatibility
[ https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8963: -- Issue Type: Improvement (was: Bug) > separate client/server compatibility from server/server version compatibility > - > > Key: GEODE-8963 > URL: https://issues.apache.org/jira/browse/GEODE-8963 > Project: Geode > Issue Type: Improvement > Components: serialization >Reporter: Bruce J Schuchardt >Priority: Major > > A client's version is used for deserializing data received from the client > and for serializing data sent to the client. It is also used to locate the > map of Commands used to process client requests. Every time we cut a new > release we bump this version in KnownVersions and create a new map of > Commands, even though client/server communications protocols rarely change. > We should have each KnownVersion hold a client/server compatibility number > that is used to identify clients rather than the KnownVersion's ordinal. > For instance, > {code:java} > public static final KnownVersion GEODE_1_15_0 = > new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_15_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > > public static final KnownVersion GEODE_1_16_0 = > new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_16_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > public static final KnownVersion GEODE_1_17_0 = > new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, > (byte) 0, > /*server/server version*/GEODE_1_17_0_ORDINAL, > /*client/server version*/GEODE_1_15_0_ORDINAL); > {code} > In the above KnownVersions the client/server serialization is known to have > not changed since v1.15.0 and so there is no need to use a newer KnownVersion > for clients. 
> Client handshake code will need to be changed to use the client/server > ordinal when identifying clients and servers. -- This message was sent by Atlassian Jira (v8.3.4#803005)
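The proposal above can be sketched as follows. This is a minimal illustration, not Geode's actual KnownVersion or Command classes: VersionSketch, CommandRegistrySketch, the ordinal values, and the command strings are all hypothetical. It only shows that command lookup keyed by a client/server compatibility ordinal lets several releases share one map of commands.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for KnownVersion: it carries two ordinals instead of one.
final class VersionSketch {
  final String name;
  final short serverOrdinal;        // server/server version
  final short clientServerOrdinal;  // client/server compatibility number

  VersionSketch(String name, short serverOrdinal, short clientServerOrdinal) {
    this.name = name;
    this.serverOrdinal = serverOrdinal;
    this.clientServerOrdinal = clientServerOrdinal;
  }
}

// Hypothetical registry: one command map per client/server ordinal, not per release.
final class CommandRegistrySketch {
  private final Map<Short, Map<Integer, String>> commandsByClientOrdinal = new HashMap<>();

  void register(short clientServerOrdinal, Map<Integer, String> commands) {
    commandsByClientOrdinal.put(clientServerOrdinal, commands);
  }

  // Looks up commands by the client/server ordinal, ignoring the server ordinal.
  Map<Integer, String> commandsFor(VersionSketch version) {
    return commandsByClientOrdinal.get(version.clientServerOrdinal);
  }
}
```

With this shape, a new release that does not change the client/server protocol would only need a new server ordinal and could reuse the existing command map.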
[jira] [Created] (GEODE-8956) LocatorMembershipListenerImpl has unconstrained thread creation that can crash a machine
Bruce J Schuchardt created GEODE-8956: - Summary: LocatorMembershipListenerImpl has unconstrained thread creation that can crash a machine Key: GEODE-8956 URL: https://issues.apache.org/jira/browse/GEODE-8956 Project: Geode Issue Type: Bug Components: wan Reporter: Bruce J Schuchardt In reviewing PR 6013 I found that a simple change meant to resolve a difficult problem led to unrestrained thread growth in a locator, sometimes topping out at over 5000 threads, which often crashed the host machine. The thread growth was due to this method in LocatorMembershipListenerImpl: {code:java} Thread buildLocatorsDistributorThread(DistributionLocatorId localLocatorId, Map<Integer, Set<DistributionLocatorId>> remoteLocators, DistributionLocatorId joiningLocator, int joiningLocatorDistributedSystemId) { Runnable distributeLocatorsRunnable = new DistributeLocatorsRunnable(config.getMemberTimeout(), tcpClient, localLocatorId, remoteLocators, joiningLocator, joiningLocatorDistributedSystemId); ThreadFactory threadFactory = new LoggingThreadFactory(LOCATORS_DISTRIBUTOR_THREAD_NAME, true); return threadFactory.newThread(distributeLocatorsRunnable); } {code} This should probably be performed in an Executor with a reasonable max-threads limit based on the number of local and remote-locators in the DistributionConfig. -- This message was sent by Atlassian Jira (v8.3.4#803005)
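One way the suggested Executor could look is sketched below. It is an assumption-laden illustration using only java.util.concurrent: BoundedDistributorSketch, the thread name, the 60-second keep-alive, and the choice of "local plus remote locator count" as the limit are hypothetical and are not taken from Geode.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: caps the number of distributor threads instead of
// creating one new thread per join notification.
final class BoundedDistributorSketch {
  static ThreadPoolExecutor newBoundedExecutor(int localLocators, int remoteLocators) {
    // Assumed sizing rule: one thread per known locator, never fewer than one.
    int maxThreads = Math.max(1, localLocators + remoteLocators);
    ThreadPoolExecutor executor = new ThreadPoolExecutor(
        maxThreads, maxThreads, 60L, TimeUnit.SECONDS,
        new LinkedBlockingQueue<>(),   // extra work queues up instead of spawning threads
        runnable -> {
          Thread thread = new Thread(runnable, "locators-distributor-sketch");
          thread.setDaemon(true);
          return thread;
        });
    executor.allowCoreThreadTimeOut(true); // idle threads are released after the keep-alive
    return executor;
  }
}
```

Work submitted beyond the limit waits in the queue, so the thread count stays bounded no matter how many locators join at once.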
[jira] [Updated] (GEODE-8955) WAN location service uses DistributedLocatorId.toString() to represent a locator
[ https://issues.apache.org/jira/browse/GEODE-8955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8955: -- Priority: Minor (was: Major) > WAN location service uses DistributedLocatorId.toString() to represent a > locator > > > Key: GEODE-8955 > URL: https://issues.apache.org/jira/browse/GEODE-8955 > Project: Geode > Issue Type: Improvement > Components: wan >Reporter: Bruce J Schuchardt >Priority: Minor > > This code in LocatorHelper, and probably code in other parts of the WAN > location service, uses DistributionLocatorId.toString() to track whether > other locators have the WAN location service available. It should use the > DistributionLocatorId.marshal() method instead. We should never use the > toString() representation of an object in this way as it may change over time. > > {code:java} > private static void addServerLocator(Integer distributedSystemId, > LocatorMembershipListener locatorListener, DistributionLocatorId locator) > { > ConcurrentHashMap<Integer, Set<String>> allServerLocatorsInfo = > (ConcurrentHashMap<Integer, Set<String>>) > locatorListener.getAllServerLocatorsInfo(); > Set<String> locatorsSet = new CopyOnWriteHashSet(); > locatorsSet.add(locator.toString()); > Set<String> existingValue = > allServerLocatorsInfo.putIfAbsent(distributedSystemId, locatorsSet); > if (existingValue != null) { > if (!existingValue.contains(locator.toString())) { > existingValue.add(locator.toString()); > } > } > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-8955) WAN location service uses DistributedLocatorId.toString() to represent a locator
Bruce J Schuchardt created GEODE-8955: - Summary: WAN location service uses DistributedLocatorId.toString() to represent a locator Key: GEODE-8955 URL: https://issues.apache.org/jira/browse/GEODE-8955 Project: Geode Issue Type: Improvement Components: wan Reporter: Bruce J Schuchardt This code in LocatorHelper, and probably code in other parts of the WAN location service, uses DistributionLocatorId.toString() to track whether other locators have the WAN location service available. It should use the DistributionLocatorId.marshal() method instead. We should never use the toString() representation of an object in this way as it may change over time. {code:java} private static void addServerLocator(Integer distributedSystemId, LocatorMembershipListener locatorListener, DistributionLocatorId locator) { ConcurrentHashMap<Integer, Set<String>> allServerLocatorsInfo = (ConcurrentHashMap<Integer, Set<String>>) locatorListener.getAllServerLocatorsInfo(); Set<String> locatorsSet = new CopyOnWriteHashSet(); locatorsSet.add(locator.toString()); Set<String> existingValue = allServerLocatorsInfo.putIfAbsent(distributedSystemId, locatorsSet); if (existingValue != null) { if (!existingValue.contains(locator.toString())) { existingValue.add(locator.toString()); } } } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
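The change requested above might look roughly like this. LocatorIdSketch and ServerLocatorRegistrySketch are hypothetical stand-ins, and the "host[port]" format returned by marshal() here is an assumption made for the example, not a statement about what DistributionLocatorId.marshal() actually produces.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for DistributionLocatorId.
final class LocatorIdSketch {
  private final String host;
  private final int port;

  LocatorIdSketch(String host, int port) {
    this.host = host;
    this.port = port;
  }

  // Stable, purpose-built representation (format assumed for illustration).
  String marshal() {
    return host + "[" + port + "]";
  }

  // Debug representation; free to change without breaking the registry below.
  @Override
  public String toString() {
    return "LocatorIdSketch{host=" + host + ", port=" + port + "}";
  }
}

// Hypothetical registry that tracks locators by their marshalled form.
final class ServerLocatorRegistrySketch {
  private final Map<Integer, Set<String>> allServerLocatorsInfo = new ConcurrentHashMap<>();

  void addServerLocator(int distributedSystemId, LocatorIdSketch locator) {
    allServerLocatorsInfo
        .computeIfAbsent(distributedSystemId, id -> ConcurrentHashMap.newKeySet())
        .add(locator.marshal()); // a Set ignores duplicates, so no contains() check is needed
  }

  Set<String> locatorsFor(int distributedSystemId) {
    return allServerLocatorsInfo.getOrDefault(distributedSystemId, Collections.emptySet());
  }
}
```

Because the key is the marshalled form, changing toString() for logging purposes would not affect which locators the registry considers known.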
[jira] [Resolved] (GEODE-8922) Remove ProductUseLog
[ https://issues.apache.org/jira/browse/GEODE-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt resolved GEODE-8922. --- Fix Version/s: 1.15.0 Resolution: Fixed > Remove ProductUseLog > > > Key: GEODE-8922 > URL: https://issues.apache.org/jira/browse/GEODE-8922 > Project: Geode > Issue Type: Improvement > Components: membership >Reporter: Bruce J Schuchardt >Assignee: Bruce J Schuchardt >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > A Locator logs the number of servers present in the cluster to a file that's > of little use to anyone. The log was added long ago in a weird attempt to > monitor whether users were adhering to their license contract. We should > remove ProductUseLog and its tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-8951) Unnecessary messaging in WAN locator discovery
[ https://issues.apache.org/jira/browse/GEODE-8951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8951: -- Priority: Minor (was: Major) > Unnecessary messaging in WAN locator discovery > -- > > Key: GEODE-8951 > URL: https://issues.apache.org/jira/browse/GEODE-8951 > Project: Geode > Issue Type: Improvement > Components: wan >Affects Versions: 1.15.0 >Reporter: Bruce J Schuchardt >Priority: Minor > > While debugging another issue I noticed that a locator was trying to send a > notice to another locator in its cluster telling it that the recipient had > joined. > > [warn 2021/02/16 15:16:56.195 PST locatorgemfire_4_3_host2_9736 > tid=0x153] Locator Membership listener > permanently failed to exchange locator information > *rs-GEM-3188-VJ1459-1a0i3large-hydra-client-1:27878* with > *rs-GEM-3188-VJ1459-1a0i3large-hydra-client-2:28778* after 3 retry attempts > > This messaging is unnecessary. The locator that this message was being sent > to already knows about itself. This is being done in > _DistributeLocatorsRunnable.run()._ > > {code:java} > for (DistributionLocatorId remoteLocator : entry.getValue()) { > // Notify known remote locator about the advertised locator. > LocatorJoinMessage advertiseNewLocatorMessage = new > LocatorJoinMessage( > joiningLocatorDistributedSystemId, joiningLocator, > localLocatorId, ""); > sendMessage(remoteLocator, advertiseNewLocatorMessage, > failedMessages); > // Notify the advertised locator about remote known locator. > LocatorJoinMessage advertiseKnownLocatorMessage = > new LocatorJoinMessage(entry.getKey(), remoteLocator, > localLocatorId, ""); > sendMessage(joiningLocator, advertiseKnownLocatorMessage, > failedMessages); > } > {code} > It should check to see if the joiningLocator ID is equal to the remoteLocator > ID and, if so, not create messages in that iteration. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-8951) Unnecessary messaging in WAN locator discovery
Bruce J Schuchardt created GEODE-8951: - Summary: Unnecessary messaging in WAN locator discovery Key: GEODE-8951 URL: https://issues.apache.org/jira/browse/GEODE-8951 Project: Geode Issue Type: Improvement Components: wan Affects Versions: 1.15.0 Reporter: Bruce J Schuchardt While debugging another issue I noticed that a locator was trying to send a notice to another locator in its cluster telling it that the recipient had joined. [warn 2021/02/16 15:16:56.195 PST locatorgemfire_4_3_host2_9736 tid=0x153] Locator Membership listener permanently failed to exchange locator information *rs-GEM-3188-VJ1459-1a0i3large-hydra-client-1:27878* with *rs-GEM-3188-VJ1459-1a0i3large-hydra-client-2:28778* after 3 retry attempts This messaging is unnecessary. The locator that this message was being sent to already knows about itself. This is being done in _DistributeLocatorsRunnable.run()._ {code:java} for (DistributionLocatorId remoteLocator : entry.getValue()) { // Notify known remote locator about the advertised locator. LocatorJoinMessage advertiseNewLocatorMessage = new LocatorJoinMessage( joiningLocatorDistributedSystemId, joiningLocator, localLocatorId, ""); sendMessage(remoteLocator, advertiseNewLocatorMessage, failedMessages); // Notify the advertised locator about remote known locator. LocatorJoinMessage advertiseKnownLocatorMessage = new LocatorJoinMessage(entry.getKey(), remoteLocator, localLocatorId, ""); sendMessage(joiningLocator, advertiseKnownLocatorMessage, failedMessages); } {code} It should check to see if the joiningLocator ID is equal to the remoteLocator ID and, if so, not create messages in that iteration. -- This message was sent by Atlassian Jira (v8.3.4#803005)
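The suggested check can be sketched in isolation. In this illustration plain strings stand in for DistributionLocatorId, and JoinNoticeSketch with its "recipient <- subject" output format is hypothetical; it only models which messages the loop would create once the joining locator is skipped.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical model of the loop in DistributeLocatorsRunnable.run().
final class JoinNoticeSketch {
  // Returns planned messages as "recipient <- locator being advertised".
  static List<String> planMessages(String joiningLocator, Collection<String> remoteLocators) {
    List<String> planned = new ArrayList<>();
    for (String remoteLocator : remoteLocators) {
      if (remoteLocator.equals(joiningLocator)) {
        continue; // the joining locator already knows about itself
      }
      // Notify the known remote locator about the advertised locator.
      planned.add(remoteLocator + " <- " + joiningLocator);
      // Notify the advertised locator about the known remote locator.
      planned.add(joiningLocator + " <- " + remoteLocator);
    }
    return planned;
  }
}
```

For a joining locator "b" and known locators "a" and "b", only the two messages between "a" and "b" are planned, and nothing is sent from "b" to itself.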
[jira] [Commented] (GEODE-8030) CI Failure: HARQueueNewImplDUnitTest.testHAEventWrapperDoesNotHoldCUMOnceInsideCMR
[ https://issues.apache.org/jira/browse/GEODE-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285399#comment-17285399 ] Bruce J Schuchardt commented on GEODE-8030: --- Failed in the same way in this DistributedTestOpenJDK8 run: https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/24 > CI Failure: > HARQueueNewImplDUnitTest.testHAEventWrapperDoesNotHoldCUMOnceInsideCMR > -- > > Key: GEODE-8030 > URL: https://issues.apache.org/jira/browse/GEODE-8030 > Project: Geode > Issue Type: Bug > Components: client queues, tests >Reporter: Kirk Lund >Priority: Major > Labels: flaky > > Link: > http://files.apachegeode-ci.info/builds/apache-develop-main/1.13.0-SNAPSHOT.0220/test-results/distributedTest/1587763613/classes/org.apache.geode.internal.cache.ha.HARQueueNewImplDUnitTest.html#testHAEventWrapperDoesNotHoldCUMOnceInsideCMR > Partial stack: > {noformat} > Caused by: org.junit.ComparisonFailure: expected: but > was: bytes;threadID=3;sequenceID=14];shouldConflate=false;versionTag={v1; rv6; > time=1587758285078; remote};hasCqs=false]> > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > org.apache.geode.internal.cache.ha.HARQueueNewImplDUnitTest.verifyNullCUMReference(HARQueueNewImplDUnitTest.java:855) > at > org.apache.geode.internal.cache.ha.HARQueueNewImplDUnitTest.lambda$testHAEventWrapperDoesNotHoldCUMOnceInsideCMR$bb17a952$4(HARQueueNewImplDUnitTest.java:683) > {noformat} > Full stack: > {noformat} > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.internal.cache.ha.HARQueueNewImplDUnitTest$$Lambda$260/1730948285.run > in VM 1 running on Host c8674217ee1c with 4 VMs > at 
org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610) > at org.apache.geode.test.dunit.VM.invoke(VM.java:437) > at > org.apache.geode.internal.cache.ha.HARQueueNewImplDUnitTest.testHAEventWrapperDoesNotHoldCUMOnceInsideCMR(HARQueueNewImplDUnitTest.java:683) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) > at > org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62) > at
[jira] [Created] (GEODE-8948) log a locator's coordinates during launch
Bruce J Schuchardt created GEODE-8948: - Summary: log a locator's coordinates during launch Key: GEODE-8948 URL: https://issues.apache.org/jira/browse/GEODE-8948 Project: Geode Issue Type: Improvement Components: membership Reporter: Bruce J Schuchardt Looking through a Locator's log file it is difficult, if not impossible, to tell what the locator's host and port are. This makes it difficult to know which Locator log files to examine when debugging if a client (or WAN service) has trouble contacting a Locator, because the client only logs that locator's host and port number. If Locators would log something like Starting location services on \{hostname} and port \{portnumber}, along with any other info that would be useful when grepping through artifacts to find a log file of interest, it would help a lot in debugging efforts. -- This message was sent by Atlassian Jira (v8.3.4#803005)
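A small sketch of the proposed launch message follows. LocatorLaunchLogSketch is a hypothetical helper; the wording mirrors the suggestion in the ticket, and how the string would be handed to Geode's logger is left out.

```java
// Hypothetical helper that builds a grep-friendly launch line.
final class LocatorLaunchLogSketch {
  static String launchMessage(String hostname, int port) {
    // Host and port sit in fixed positions so log files can be searched for them.
    return String.format("Starting location services on %s and port %d", hostname, port);
  }
}
```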
[jira] [Assigned] (GEODE-8922) Remove ProductUseLog
[ https://issues.apache.org/jira/browse/GEODE-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt reassigned GEODE-8922: - Assignee: Bruce J Schuchardt > Remove ProductUseLog > > > Key: GEODE-8922 > URL: https://issues.apache.org/jira/browse/GEODE-8922 > Project: Geode > Issue Type: Improvement > Components: membership >Reporter: Bruce J Schuchardt >Assignee: Bruce J Schuchardt >Priority: Major > > A Locator logs the number of servers present in the cluster to a file that's > of little use to anyone. The log was added long ago in a weird attempt to > monitor whether users were adhering to their license contract. We should > remove ProductUseLog and its tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (GEODE-8817) server hangs in cache close with ssl enabled due to active client connection; client side (CacheClientUpdater.close()) is hung in SSLSocketImpl$AppInputStream.deplete()
[ https://issues.apache.org/jira/browse/GEODE-8817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt resolved GEODE-8817. --- Fix Version/s: 1.14.0 Resolution: Fixed Client-side closing of sockets has not been altered but with server-side changes we are no longer seeing hangs. Open a new ticket if this kind of hang is seen again. > server hangs in cache close with ssl enabled due to active client connection; > client side (CacheClientUpdater.close()) is hung in > SSLSocketImpl$AppInputStream.deplete() > > > Key: GEODE-8817 > URL: https://issues.apache.org/jira/browse/GEODE-8817 > Project: Geode > Issue Type: Bug > Components: client/server, security >Affects Versions: 1.14.0 >Reporter: Bill Burcham >Assignee: Bill Burcham >Priority: Major > Labels: blocks-1.14.0, pull-request-available > Fix For: 1.14.0 > > > A proprietary TLS/SSL-enabled application encountered a network partition. A > server hangs in cache close due to active client connection; client side > ({{CacheClientUpdater.close()}}) is hung in > {{SSLSocketImpl$AppInputStream.deplete()}} > The configuration is: > {noformat} > == > losingSide |survivingSide > == > 0 |10627 > 5 |10632 > -- > 11139 |10655 > |10662 > -- > {noformat} > The stuck threads were stuck in sun's SSL code. Geode's client/Server > framework uses old I/O and that was also part of where they were stuck. If > the clients had closed their connections to the server then it would not have > been stuck here. But the server shutdown shouldn't hang because of client > that refuses to disconnect. 
> The Geode client-side of the connection is hung here: > {code:java} > \[warn 2020/11/06 14:56:56.577 PST tid=0x18] Thread <50> > (0x32) that was executed at <06 Nov 2020 14:55:43 PST> has been stuck for > <72.81 seconds> and number of thread monitor iteration <1> > Thread Name state > Waiting on > Owned By 10.32.108.224(bridgep2_host2_10627:10627):41003 port 27636> with ID <43> > Executor Group > Monitored metric > Thread stack: > sun.security.ssl.SSLSocketImpl$AppInputStream.deplete(SSLSocketImpl.java:1016) > sun.security.ssl.SSLSocketImpl$AppInputStream.access$100(SSLSocketImpl.java:816) > sun.security.ssl.SSLSocketImpl.bruteForceCloseInput(SSLSocketImpl.java:702) > sun.security.ssl.SSLSocketImpl.duplexCloseOutput(SSLSocketImpl.java:553) > sun.security.ssl.SSLSocketImpl.close(SSLSocketImpl.java:485) > org.apache.geode.internal.cache.tier.sockets.CacheClientUpdater.close(CacheClientUpdater.java:546) > org.apache.geode.cache.client.internal.QueueConnectionImpl.internalDestroy(QueueConnectionImpl.java:112) > org.apache.geode.cache.client.internal.QueueManagerImpl.endpointCrashed(QueueManagerImpl.java:379) > org.apache.geode.cache.client.internal.QueueManagerImpl.connectionCrashed(QueueManagerImpl.java:357) > org.apache.geode.cache.client.internal.QueueConnectionImpl.destroy(QueueConnectionImpl.java:88) > org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:645) > org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:504) > org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnServer(OpExecutorImpl.java:334) > org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:303) > org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:839) > org.apache.geode.cache.client.internal.PingOp.execute(PingOp.java:38) > org.apache.geode.cache.client.internal.LiveServerPinger$PingTask.run2(LiveServerPinger.java:90) > 
org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1329) > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:279) > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > java.lang.Thread.run(Thread.java:748) > Lock owner thread stack > java.net.SocketInputStream.socketRead0(Native Method) > java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > java.net.SocketInputStream.read(SocketInputStream.java:171) > java.net.Socket
[jira] [Resolved] (GEODE-8195) ConcurrentModificationException from LocatorMembershipListenerImpl$DistributeLocatorsRunnable.run
[ https://issues.apache.org/jira/browse/GEODE-8195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt resolved GEODE-8195. --- I accidentally reopened this ticket > ConcurrentModificationException from > LocatorMembershipListenerImpl$DistributeLocatorsRunnable.run > - > > Key: GEODE-8195 > URL: https://issues.apache.org/jira/browse/GEODE-8195 > Project: Geode > Issue Type: Bug > Components: membership >Reporter: Bill Burcham >Assignee: Bruce J Schuchardt >Priority: Major > Fix For: 1.12.1, 1.14.0, 1.13.0 > > > This WAN code in > {{LocatorMembershipListenerImpl$DistributeLocatorsRunnable.run}}: > {code} > Set<LocatorJoinMessage> joinMessages = entry.getValue(); > for (LocatorJoinMessage locatorJoinMessage : joinMessages) { > if (retryMessage(targetLocator, locatorJoinMessage, attempt)) { > joinMessages.remove(locatorJoinMessage); > } else { > {code} > modifies the {{joinMessages}} set as it is iterating over the set, resulting > in a {{ConcurrentModificationException}}. > This bug will cause (inter-site) notification of locators (of the presence of > a new locator) to fail early if retry is necessary. If we have to retry > notifying any locator, and we succeed, we’ll throw the > {{ConcurrentModificationException}} and stop trying to notify any of the > other locators. See the _Discovery For Multi-Site Systems_ section of the > [Overview of Multi-Site > Caching|https://geode.apache.org/docs/guide/14/topologies_and_comm/topology_concepts/multisite_overview.html] > documentation for an overview of the locator's role in WAN. 
> Here is a scratch file that illustrates the issue, throwing > {{ConcurrentModificationException}}: > {code} > import java.util.HashSet; > import java.util.Set; > class Scratch { > public static void main(String[] args) { > final Set<String> joinMessages = new HashSet<>(); > joinMessages.add("one"); > joinMessages.add("two"); > for( final String entry:joinMessages ) { > if (entry.equals("one")) > joinMessages.remove(entry); > } > } > } > {code} > From looking at the Geode code, {{joinMessages}} is not used outside the loop > so there is no need to modify it at all. I think we can simply remove this > line: > {code} > joinMessages.remove(locatorJoinMessage); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
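The ticket's fix is to drop the remove() call. For cases where removal during iteration really is needed, the usual alternative is to remove through the iterator, as in this sketch. SafeRemovalSketch and its retrySucceeded predicate are hypothetical names; strings stand in for LocatorJoinMessage.

```java
import java.util.Iterator;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical helper showing removal that does not trip the fail-fast check.
final class SafeRemovalSketch {
  static Set<String> removeDelivered(Set<String> joinMessages, Predicate<String> retrySucceeded) {
    Iterator<String> iterator = joinMessages.iterator();
    while (iterator.hasNext()) {
      String message = iterator.next();
      if (retrySucceeded.test(message)) {
        // Iterator.remove() keeps the iterator's modification count in sync,
        // unlike Set.remove() called from inside a for-each loop.
        iterator.remove();
      }
    }
    return joinMessages;
  }
}
```

Applied to the scratch data above, removing "one" this way leaves a set containing only "two" and raises no exception.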
[jira] [Commented] (GEODE-8825) CI failure: GatewayReceiverMBeanDUnitTest > testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy
[ https://issues.apache.org/jira/browse/GEODE-8825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279901#comment-17279901 ] Bruce J Schuchardt commented on GEODE-8825: --- Test failed in this PR run: https://concourse.apachegeode-ci.info/builds/724 > CI failure: GatewayReceiverMBeanDUnitTest > > testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy > > > Key: GEODE-8825 > URL: https://issues.apache.org/jira/browse/GEODE-8825 > Project: Geode > Issue Type: Bug > Components: tests, wan >Reporter: Jianxia Chen >Priority: Major > Labels: flaky > > {code:java} > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest > > testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest$$Lambda$202/0x0001008f0c40.run > in VM 0 running on Host c3e48bdac460 with 4 VMs > at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:623) > at org.apache.geode.test.dunit.VM.invoke(VM.java:447) > at > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest.testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy(GatewayReceiverMBeanDUnitTest.java:76) > Caused by: > java.lang.AssertionError: expected null, but was: GemFire:service=GatewayReceiver,type=Member,member=172.17.0.18(183)-41002> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotNull(Assert.java:756) > at org.junit.Assert.assertNull(Assert.java:738) > at org.junit.Assert.assertNull(Assert.java:748) > at > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest.verifyMBeanProxiesDoesNotExist(GatewayReceiverMBeanDUnitTest.java:106) > at > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest.lambda$testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy$bb17a952$3(GatewayReceiverMBeanDUnitTest.java:76) > {code} > 
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/704 > =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0601/test-results/distributedTest/1610390301/ > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > Test report artifacts from this job are available at: > http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0601/test-artifacts/1610390301/distributedtestfiles-OpenJDK11-1.14.0-build.0601.tgz -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-8825) CI failure: GatewayReceiverMBeanDUnitTest > testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy
[ https://issues.apache.org/jira/browse/GEODE-8825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8825: -- Labels: flaky (was: ) > CI failure: GatewayReceiverMBeanDUnitTest > > testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy > > > Key: GEODE-8825 > URL: https://issues.apache.org/jira/browse/GEODE-8825 > Project: Geode > Issue Type: Bug > Components: tests, wan >Reporter: Jianxia Chen >Priority: Major > Labels: flaky > > {code:java} > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest > > testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest$$Lambda$202/0x0001008f0c40.run > in VM 0 running on Host c3e48bdac460 with 4 VMs > at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:623) > at org.apache.geode.test.dunit.VM.invoke(VM.java:447) > at > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest.testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy(GatewayReceiverMBeanDUnitTest.java:76) > Caused by: > java.lang.AssertionError: expected null, but was: GemFire:service=GatewayReceiver,type=Member,member=172.17.0.18(183)-41002> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotNull(Assert.java:756) > at org.junit.Assert.assertNull(Assert.java:738) > at org.junit.Assert.assertNull(Assert.java:748) > at > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest.verifyMBeanProxiesDoesNotExist(GatewayReceiverMBeanDUnitTest.java:106) > at > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest.lambda$testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy$bb17a952$3(GatewayReceiverMBeanDUnitTest.java:76) > {code} > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/704 > =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI > 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0601/test-results/distributedTest/1610390301/ > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > Test report artifacts from this job are available at: > http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0601/test-artifacts/1610390301/distributedtestfiles-OpenJDK11-1.14.0-build.0601.tgz -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-8825) CI failure: GatewayReceiverMBeanDUnitTest > testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy
[ https://issues.apache.org/jira/browse/GEODE-8825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8825: -- Component/s: wan tests > CI failure: GatewayReceiverMBeanDUnitTest > > testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy > > > Key: GEODE-8825 > URL: https://issues.apache.org/jira/browse/GEODE-8825 > Project: Geode > Issue Type: Bug > Components: tests, wan >Reporter: Jianxia Chen >Priority: Major > > {code:java} > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest > > testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest$$Lambda$202/0x0001008f0c40.run > in VM 0 running on Host c3e48bdac460 with 4 VMs > at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:623) > at org.apache.geode.test.dunit.VM.invoke(VM.java:447) > at > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest.testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy(GatewayReceiverMBeanDUnitTest.java:76) > Caused by: > java.lang.AssertionError: expected null, but was: GemFire:service=GatewayReceiver,type=Member,member=172.17.0.18(183)-41002> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotNull(Assert.java:756) > at org.junit.Assert.assertNull(Assert.java:738) > at org.junit.Assert.assertNull(Assert.java:748) > at > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest.verifyMBeanProxiesDoesNotExist(GatewayReceiverMBeanDUnitTest.java:106) > at > org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest.lambda$testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy$bb17a952$3(GatewayReceiverMBeanDUnitTest.java:76) > {code} > https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/704 > =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI > 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0601/test-results/distributedTest/1610390301/ > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > Test report artifacts from this job are available at: > http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0601/test-artifacts/1610390301/distributedtestfiles-OpenJDK11-1.14.0-build.0601.tgz -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-8922) Remove ProductUseLog
Bruce J Schuchardt created GEODE-8922: - Summary: Remove ProductUseLog Key: GEODE-8922 URL: https://issues.apache.org/jira/browse/GEODE-8922 Project: Geode Issue Type: Improvement Components: membership Reporter: Bruce J Schuchardt A Locator logs the number of servers present in the cluster to a file that's of little use to anyone. The log was added long ago in a weird attempt to monitor whether users were adhering to their license contract. We should remove ProductUseLog and its tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GEODE-8920) Modify debug logging to make it easier to trace a message
[ https://issues.apache.org/jira/browse/GEODE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279106#comment-17279106 ] Bruce J Schuchardt commented on GEODE-8920: --- This could do the trick in DirectChannel.java: {code:java} if (logger.isDebugEnabled()) { StringBuilder sb = new StringBuilder(); if (retry) { sb.append("Retrying send"); } else { sb.append("Sending ").append(msg).append(" to ").append(p_destinations.length) .append(" nodes"); } sb.append(" via these tcp/ip connections: "); for (Connection connection : cons) { sb.append("[").append(connection.getRemoteAddress()).append(", uid=") .append(connection.getUniqueId()).append("] "); } logger.debug(sb.toString()); } {code} > Modify debug logging to make it easier to trace a message > - > > Key: GEODE-8920 > URL: https://issues.apache.org/jira/browse/GEODE-8920 > Project: Geode > Issue Type: Improvement > Components: membership >Reporter: Bruce J Schuchardt >Priority: Major > > Debug logging in DirectChannel lets us know the IDs of receivers of a message > and the toString of the message but it's very difficult to figure out what > thread on the receiving end is supposed to process that message. > Here's an example of what we currently have: > [debug 2021/02/01 16:15:17.492 PST persistgemfire8_host1_8586 > tid=0x4f0] Sending > (DLockRequestProcessor.DLockResponseMessage responding GRANT; > serviceName=__PDX(version 4); objectName=PDX_LOCK; responseCode=0; > keyIfFailed=null; leaseExpireTime=9223372036854775807; processorId=509; > lockId=509) to 1 peers > ([rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire9_host1_8517:8517):41005]) > via tcp/ip > This does not tell you anything about the receiver except its ID. 
On the > receiving side the thread that, in this run, would handle that message is > this: > persistgemfire9_host1_8517 rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire8_host1_8586:8586):41006 > unshared ordered *uid=1036* dom #1 local port=47207 remote port=42068> > tid=0x51 > I've highlighted the *uid* here because that is the _uniqueId_ of the sending > Connection. If you looked through the logs or stack traces of the receiver > and knew the uniqueId of the sending Connection you could easily locate the > thread that should receive this DLockResponseMessage. Currently this is much > harder than it needs to be because the DirectChannel _Sending_ log message > doesn't include the _uniqueId_ of the Connections it is using to send the > message. > Let's change that log message to include the _uniqueId_ of each outgoing > Connection. Maybe something like this: > Sending (message.toString()) to 1 peers (peer ID)*, uid=1036* via tcp/ip > and on the receiving side we could be clearer about what the *uid* in the > thread's name means: > persistgemfire9_host1_8517 rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire8_host1_8586:8586):41006 > unshared ordered *sender uid=1036* dom #1 local port=47207 remote > port=42068> tid=0x51 > or something like that. > Now we can look at the _Sending_ message and know that the receiving thread > will have _uid=1036_ in its name. Knowing this it ought to be possible to > write a program/script to trace a message and its consequences from one node > to another. -- This message was sent by Atlassian Jira (v8.3.4#803005)
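The tracing script mentioned in the description could start out as something like the following. This is only a minimal sketch, assuming the sender's log line carries "uid=NNNN" as proposed and that the receiving thread's name contains the same "uid=NNNN" token; the class name UidTrace, its method names, and the sample log lines are hypothetical and not part of Geode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Geode class: correlates a sender's "Sending" line
// with receiver log lines by the Connection uniqueId.
class UidTrace {
  // Matches tokens such as "uid=1036" (also inside "sender uid=1036").
  private static final Pattern UID = Pattern.compile("\\buid=(\\d+)");

  /** Returns every uid found in a log line, in order of appearance. */
  public static List<Long> extractUids(String logLine) {
    List<Long> uids = new ArrayList<>();
    Matcher matcher = UID.matcher(logLine);
    while (matcher.find()) {
      uids.add(Long.parseLong(matcher.group(1)));
    }
    return uids;
  }

  /** Returns the receiver log lines that mention any uid present in the sender's line. */
  public static List<String> findReceiverLines(String sendingLine, List<String> receiverLog) {
    List<Long> wanted = extractUids(sendingLine);
    List<String> hits = new ArrayList<>();
    for (String line : receiverLog) {
      for (Long uid : extractUids(line)) {
        if (wanted.contains(uid)) {
          hits.add(line);
          break; // one match is enough to report this line
        }
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    // Made-up log lines shaped like the examples in the issue description.
    String sending = "Sending (message.toString()) to 1 peers (peer ID), uid=1036 via tcp/ip";
    List<String> receiverLog = Arrays.asList(
        "unshared ordered sender uid=1036 dom #1 local port=47207 remote port=42068",
        "unshared ordered sender uid=2001 dom #1 local port=47208 remote port=42069");
    for (String hit : findReceiverLines(sending, receiverLog)) {
      System.out.println(hit);
    }
  }
}
```

Matching on the uid alone would probably not be enough in a real cluster, since each sender presumably numbers its own Connections; a fuller version would likely also match on the sender's member ID.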
[jira] [Updated] (GEODE-8920) Modify debug logging to make it easier to trace a message
[ https://issues.apache.org/jira/browse/GEODE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8920: -- Description: Debug logging in DirectChannel lets us know the IDs of receivers of a message and the toString of the message but it's very difficult to figure out what thread on the receiving end is supposed to process that message. Here's an example of what we currently have: [debug 2021/02/01 16:15:17.492 PST persistgemfire8_host1_8586 tid=0x4f0] Sending (DLockRequestProcessor.DLockResponseMessage responding GRANT; serviceName=__PDX(version 4); objectName=PDX_LOCK; responseCode=0; keyIfFailed=null; leaseExpireTime=9223372036854775807; processorId=509; lockId=509) to 1 peers ([rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire9_host1_8517:8517):41005]) via tcp/ip This does not tell you anything about the receiver except its ID. On the receiving side the thread that, in this run, would handle that message is this: persistgemfire9_host1_8517 :41006 unshared ordered *uid=1036* dom #1 local port=47207 remote port=42068> tid=0x51 I've highlighted the *uid* here because that is the _uniqueId_ of the sending Connection. If you looked through the logs or stack traces of the receiver and knew the uniqueId of the sending Connection you could easily locate the thread that should receive this DLockResponseMessage. Currently this is much harder than it needs to be because the DirectChannel _Sending_ log message doesn't include the _uniqueId_ of the Connections it is using to send the message. Let's change that log message to include the _uniqueId_ of each outgoing Connection. Maybe something like this: Sending (message.toString()) to 1 peers (peer ID)*, uid=1036* via tcp/ip and on the receiving side we could be clearer about what the *uid* in the thread's name means: persistgemfire9_host1_8517 :41006 unshared ordered *sender uid=1036* dom #1 local port=47207 remote port=42068> tid=0x51 or something like that. 
Now we can look at the _Sending_ message and know that the receiving thread will have _uid=1036_ in its name. Knowing this it ought to be possible to write a program/script to trace a message and its consequences from one node to another. was: Debug logging in DirectChannel lets us know the IDs of receivers of a message and the toString of the message but it's very difficult to figure out what thread on the receiving end is supposed to process that message. Here's an example of what we currently have: [debug 2021/02/01 16:15:17.492 PST persistgemfire8_host1_8586 tid=0x4f0] Sending (DLockRequestProcessor.DLockResponseMessage responding GRANT; serviceName=__PDX(version 4); objectName=PDX_LOCK; responseCode=0; keyIfFailed=null; leaseExpireTime=9223372036854775807; processorId=509; lockId=509) to 1 peers ([rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire9_host1_8517:8517):41005]) via tcp/ip This does not tell you anything about the receiver except its ID. On the receiving side the thread that, in this run, would handle that message is this: persistgemfire9_host1_8517 :41006 unshared ordered *uid=1036* dom #1 local port=47207 remote port=42068> tid=0x51 I've highlighted the *uid* here because that is the _uniqueId_ of the sending Connection. If you looked through the logs or stack traces of the receiver and knew the uniqueId of the sending Connection you could easily locate the thread that should receive this DLockResponseMessage. Currently this is much harder than it needs to be because the DirectChannel _Sending_ log message doesn't include the _uniqueId_ of the Connections it is using to send the message. Let's change that log message to include the _uniqueId_ of each outgoing Connection. 
Maybe something like this: Sending (message.toString()) to 1 peers (peer ID)to 1 peers (peer ID)*, uid=1036* via tcp/ip and on the receiving side we could be clearer about what the *uid* in the thread's name means: persistgemfire9_host1_8517 :41006 unshared ordered *sender uid=1036* dom #1 local port=47207 remote port=42068> tid=0x51 or something like that. Now we can look at the _Sending_ message and know that the receiving thread will have _uid=1036_ in its name. Knowing this it ought to be possible to write a program/script to trace a message and its consequences from one node to another. > Modify debug logging to make it easier to trace a message > - > > Key: GEODE-8920 > URL: https://issues.apache.org/jira/browse/GEODE-8920 > Project: Geode > Issue Type: Improvement > Components: membership >Reporter: Bruce J Schuchardt >Priority: Major > > Debug logging in DirectChannel lets us know the IDs of receivers of a message > and the toString of the message but it's very difficu
[jira] [Created] (GEODE-8920) Modify debug logging to make it easier to trace a message
Bruce J Schuchardt created GEODE-8920: - Summary: Modify debug logging to make it easier to trace a message Key: GEODE-8920 URL: https://issues.apache.org/jira/browse/GEODE-8920 Project: Geode Issue Type: Improvement Components: membership Reporter: Bruce J Schuchardt Debug logging in DirectChannel lets us know the IDs of receivers of a message and the toString of the message but it's very difficult to figure out what thread on the receiving end is supposed to process that message. Here's an example of what we currently have: [debug 2021/02/01 16:15:17.492 PST persistgemfire8_host1_8586 tid=0x4f0] Sending (DLockRequestProcessor.DLockResponseMessage responding GRANT; serviceName=__PDX(version 4); objectName=PDX_LOCK; responseCode=0; keyIfFailed=null; leaseExpireTime=9223372036854775807; processorId=509; lockId=509) to 1 peers ([rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire9_host1_8517:8517):41005]) via tcp/ip This does not tell you anything about the receiver except its ID. On the receiving side the thread that, in this run, would handle that message is this: persistgemfire9_host1_8517 :41006 unshared ordered *uid=1036* dom #1 local port=47207 remote port=42068> tid=0x51 I've highlighted the *uid* here because that is the _uniqueId_ of the sending Connection. If you looked through the logs or stack traces of the receiver and knew the uniqueId of the sending Connection you could easily locate the thread that should receive this DLockResponseMessage. Currently this is much harder than it needs to be because the DirectChannel _Sending_ log message doesn't include the _uniqueId_ of the Connections it is using to send the message. Let's change that log message to include the _uniqueId_ of each outgoing Connection. 
Maybe something like this: Sending (message.toString()) to 1 peers (peer ID)to 1 peers (peer ID)*, uid=1036* via tcp/ip and on the receiving side we could be clearer about what the *uid* in the thread's name means: persistgemfire9_host1_8517 :41006 unshared ordered *sender uid=1036* dom #1 local port=47207 remote port=42068> tid=0x51 or something like that. Now we can look at the _Sending_ message and know that the receiving thread will have _uid=1036_ in its name. Knowing this it ought to be possible to write a program/script to trace a message and its consequences from one node to another. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-8919) revert renaming of GMS processMessage methods
Bruce J Schuchardt created GEODE-8919: - Summary: revert renaming of GMS processMessage methods Key: GEODE-8919 URL: https://issues.apache.org/jira/browse/GEODE-8919 Project: Geode Issue Type: Improvement Components: membership Reporter: Bruce J Schuchardt
[~upthewaterspout] modified the methods in the membership module that process membership messages so that they are now all named *processMessage*, but this makes it more difficult to read stack traces and know what type of message a thread is processing. Let's make life easier for ourselves and revert that change. Let's name each method after the type of message it processes so that we don't have to look at source code to figure it out. The method at the top of this stack trace, for instance, could be named *processInstallViewMessage* and we would know, without looking at source code, which type of message is being processed.
{noformat}
at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:1053)
at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1330)
at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1269)
at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
at org.jgroups.JChannel.up(JChannel.java:741)
at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
at org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:73)
at org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:72)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1658)
at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876)
at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10)
at org.jgroups.protocols.TP.handleSingleMessage(TP.java:1789)
at org.jgroups.protocols.TP.receive(TP.java:1714)
at org.apache.geode.distributed.internal.membership.gms.messenger.Transport.receive(Transport.java:152)
at org.jgroups.protocols.UDP$PacketReceiver.run(UDP.java:701)
at java.lang.Thread.run(Thread.java:748)
{noformat}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
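To illustrate why the method name matters, here is a small self-contained sketch. It is not Geode code: the class NamedHandlers and the nested message types are hypothetical stand-ins, and the handlers only report the name that would appear in their own top stack frame.

```java
// Hypothetical illustration, not Geode code: one handler per message type,
// each named after the type so that the type is visible in a stack trace.
class NamedHandlers {
  /** Stand-ins for membership message types (assumed names). */
  public interface Message {}

  public static final class InstallViewMessage implements Message {}

  public static final class JoinRequestMessage implements Message {}

  /** Routes a message to the handler named after its type. */
  public static String dispatch(Message message) {
    if (message instanceof InstallViewMessage) {
      return processInstallViewMessage((InstallViewMessage) message);
    }
    if (message instanceof JoinRequestMessage) {
      return processJoinRequestMessage((JoinRequestMessage) message);
    }
    throw new IllegalArgumentException("unknown message type: " + message.getClass());
  }

  static String processInstallViewMessage(InstallViewMessage message) {
    // Frame 0 of a Throwable created here is this method, i.e. the name a
    // thread dump taken inside this handler would show.
    return new Throwable().getStackTrace()[0].getMethodName();
  }

  static String processJoinRequestMessage(JoinRequestMessage message) {
    return new Throwable().getStackTrace()[0].getMethodName();
  }

  public static void main(String[] args) {
    System.out.println(dispatch(new InstallViewMessage()));
    System.out.println(dispatch(new JoinRequestMessage()));
  }
}
```

With a single overloaded *processMessage* name, both handlers would report the same frame name and the message type could only be recovered from the source line number.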
[jira] [Resolved] (GEODE-8767) NullPointerException in TCPConduit.getBufferPool due to conTable being null on Windows
[ https://issues.apache.org/jira/browse/GEODE-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt resolved GEODE-8767. --- Fix Version/s: 1.14.0 Resolution: Fixed > NullPointerException in TCPConduit.getBufferPool due to conTable being null > on Windows > -- > > Key: GEODE-8767 > URL: https://issues.apache.org/jira/browse/GEODE-8767 > Project: Geode > Issue Type: Bug > Components: membership, messaging >Affects Versions: 1.14.0 >Reporter: Donal Evans >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > This failure was seen in the WindowsGfshDistributedTestOpenJDK11 CI pipeline > job: > {noformat} > org.apache.geode.management.MemberMXBeanDistributedTest > classMethod FAILED > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. > --- > Found suspect string in 'dunit_suspect-vm1.log' at line 8350 > [fatal 2020/12/03 20:00:40.000 GMT tid=630] While pushing > message regionName=#testCreateRegion2 ,distTx=false)> to recipients: > <10.0.0.75(server-2:2444):41002, 10.0.0.75(server-3:4716):41003, > 10.0.0.75(server-4:6184):41004> > java.lang.NullPointerException > at > org.apache.geode.internal.tcp.TCPConduit.getBufferPool(TCPConduit.java:949) > at > org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:298) > at > org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:513) > at > org.apache.geode.distributed.internal.DistributionImpl.directChannelSend(DistributionImpl.java:346) > at > org.apache.geode.distributed.internal.DistributionImpl.send(DistributionImpl.java:291) > at > org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2053) > at > org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:1981) > at > 
org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2018) > at > org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1083) > at > org.apache.geode.internal.cache.partitioned.PRSanityCheckMessage$1.run2(PRSanityCheckMessage.java:133) > at > org.apache.geode.internal.SystemTimer$SystemTimerTask.run(SystemTimer.java:334) > at java.base/java.util.TimerThread.mainLoop(Timer.java:556) > at java.base/java.util.TimerThread.run(Timer.java:506) > 3 tests completed, 1 failed > {noformat} > =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > > [http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0532/test-results/distributedTest/1607032539/] > > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > Test report artifacts from this job are available at: > [http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0532/test-artifacts/1607032539/windows-gfshdistributedtest-OpenJDK11-1.14.0-build.0532.tgz] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GEODE-5526) CI Failure: ParallelWANStatsDUnitTest.testParallelPropagationHA fails with AssertionError for Queue Size
[ https://issues.apache.org/jira/browse/GEODE-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17260862#comment-17260862 ] Bruce J Schuchardt commented on GEODE-5526: --- Same issue with a different test method in the same class:
{noformat}
org.apache.geode.internal.cache.wan.parallel.ParallelWANStatsDUnitTest > testParallelPropagationHAWithGroupTransactionEvents FAILED
java.lang.AssertionError: expected:<0> but was:<-20>
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:633)
at org.apache.geode.internal.cache.wan.parallel.ParallelWANStatsDUnitTest.testParallelPropagationHAWithGroupTransactionEvents(ParallelWANStatsDUnitTest.java:823)
{noformat}
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/726
> CI Failure: ParallelWANStatsDUnitTest.testParallelPropagationHA fails with > AssertionError for Queue Size > > > Key: GEODE-5526 > URL: https://issues.apache.org/jira/browse/GEODE-5526 > Project: Geode > Issue Type: Bug >Reporter: Helena Bales >Priority: Major > Labels: swat > > Failed in Geode DistributedTests on August 3rd, 2018 with: > {{org.apache.geode.internal.cache.wan.parallel.ParallelWANStatsDUnitTest > > testParallelPropagationHA FAILED}} > {{java.lang.AssertionError: expected:<0> but was:<-3>}} > {{at org.junit.Assert.fail(Assert.java:88)}} > {{at org.junit.Assert.failNotEquals(Assert.java:834)}} > {{at org.junit.Assert.assertEquals(Assert.java:645)}} > {{at org.junit.Assert.assertEquals(Assert.java:631)}} > {{at > org.apache.geode.internal.cache.wan.parallel.ParallelWANStatsDUnitTest.testParallelPropagationHA(ParallelWANStatsDUnitTest.java:429)}} > On the assertion: > {{assertEquals(0, v5List.get(0) + v6List.get(0) + v7List.get(0)); // queue > size}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GEODE-8816) CI failure: SerialWanPropagationDUnitTest. testReplicatedSerialPropagationWithRemoteRegionDestroy3
[ https://issues.apache.org/jira/browse/GEODE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17260621#comment-17260621 ] Bruce J Schuchardt commented on GEODE-8816: --- https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/723 > CI failure: SerialWanPropagationDUnitTest. > testReplicatedSerialPropagationWithRemoteRegionDestroy3 > -- > > Key: GEODE-8816 > URL: https://issues.apache.org/jira/browse/GEODE-8816 > Project: Geode > Issue Type: Bug > Components: wan >Reporter: Bruce J Schuchardt >Priority: Minor > > This test failed with a suspect string showing a functional problem with the > sender event processor. > {noformat} > org.apache.geode.internal.cache.wan.serial.SerialWANPropagationDUnitTest > > testReplicatedSerialPropagationWithRemoteRegionDestroy3 FAILED > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. 
> --- > Found suspect string in 'dunit_suspect-vm5.log' at line 737 > [error 2021/01/07 01:06:31.894 GMT 172.17.0.18(663):41005 unshared ordered uid=191 dom #1 local port=49013 > remote port=40362> tid=1289] Exception occurred in CacheListener > java.util.concurrent.RejectedExecutionException: Task > org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor$2@3f0ab37b > rejected from java.util.concurrent.ThreadPoolExecutor@1a4d5355[Terminated, > pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2071] > at > java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063) > at > java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379) > at > org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor.handlePrimaryDestroy(SerialGatewaySenderEventProcessor.java:607) > at > org.apache.geode.internal.cache.wan.serial.SerialSecondaryGatewayListener.afterDestroy(SerialSecondaryGatewayListener.java:91) > at > org.apache.geode.internal.cache.EnumListenerEvent$AFTER_DESTROY.dispatchEvent(EnumListenerEvent.java:178) > at > org.apache.geode.internal.cache.LocalRegion.dispatchEvent(LocalRegion.java:8265) > at > org.apache.geode.internal.cache.LocalRegion.dispatchListenerEvent(LocalRegion.java:6974) > at > org.apache.geode.internal.cache.LocalRegion.invokeDestroyCallbacks(LocalRegion.java:6775) > at > org.apache.geode.internal.cache.EntryEventImpl.invokeCallbacks(EntryEventImpl.java:2446) > at > org.apache.geode.internal.cache.entries.AbstractRegionEntry.dispatchListenerEvents(AbstractRegionEntry.java:164) > at > org.apache.geode.internal.cache.LocalRegion.basicDestroyPart2(LocalRegion.java:6716) > at > org.apache.geode.internal.cache.map.RegionMapDestroy.destroyExistingEntry(RegionMapDestroy.java:414) > at > 
org.apache.geode.internal.cache.map.RegionMapDestroy.handleExistingRegionEntry(RegionMapDestroy.java:244) > at > org.apache.geode.internal.cache.map.RegionMapDestroy.destroy(RegionMapDestroy.java:152) > at > org.apache.geode.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:968) > at > org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6505) > at > org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6479) > at > org.apache.geode.internal.cache.LocalRegionDataView.destroyExistingEntry(LocalRegionDataView.java:59) > at > org.apache.geode.internal.cache.LocalRegion.basicDestroy(LocalRegion.java:6430) > at > org.apache.geode.internal.cache.DistributedRegion.basicDestroy(DistributedRegion.java:1730) > at > org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderQueue$SerialGatewaySenderQueueMetaRegion.basicDestroy(SerialGatewaySenderQueue.java:1387) > at > org.apache.geode.internal.cache.LocalRegion.localDestroy(LocalRegion.java:2230) > at > org.apache.geode.internal.cache.DistributedRegion.localDestroy(DistributedRegion.java:967) > at > org.apache.geode.internal.cache.wan.serial.BatchDestroyOperation$DestroyMessage.operateOnRegion(BatchDestroyOperation.java:121) > at > org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1208) > at > org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1110) >
[jira] [Updated] (GEODE-8816) CI failure: SerialWanPropagationDUnitTest. testReplicatedSerialPropagationWithRemoteRegionDestroy3
[ https://issues.apache.org/jira/browse/GEODE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8816: -- Priority: Minor (was: Major) > CI failure: SerialWanPropagationDUnitTest. > testReplicatedSerialPropagationWithRemoteRegionDestroy3 > -- > > Key: GEODE-8816 > URL: https://issues.apache.org/jira/browse/GEODE-8816 > Project: Geode > Issue Type: Bug > Components: wan >Reporter: Bruce J Schuchardt >Priority: Minor > > This test failed with a suspect string showing a functional problem with the > sender event processor. > {noformat} > org.apache.geode.internal.cache.wan.serial.SerialWANPropagationDUnitTest > > testReplicatedSerialPropagationWithRemoteRegionDestroy3 FAILED > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. > --- > Found suspect string in 'dunit_suspect-vm5.log' at line 737 > [error 2021/01/07 01:06:31.894 GMT 172.17.0.18(663):41005 unshared ordered uid=191 dom #1 local port=49013 > remote port=40362> tid=1289] Exception occurred in CacheListener > java.util.concurrent.RejectedExecutionException: Task > org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor$2@3f0ab37b > rejected from java.util.concurrent.ThreadPoolExecutor@1a4d5355[Terminated, > pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2071] > at > java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063) > at > java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379) > at > org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor.handlePrimaryDestroy(SerialGatewaySenderEventProcessor.java:607) > at > 
org.apache.geode.internal.cache.wan.serial.SerialSecondaryGatewayListener.afterDestroy(SerialSecondaryGatewayListener.java:91) > at > org.apache.geode.internal.cache.EnumListenerEvent$AFTER_DESTROY.dispatchEvent(EnumListenerEvent.java:178) > at > org.apache.geode.internal.cache.LocalRegion.dispatchEvent(LocalRegion.java:8265) > at > org.apache.geode.internal.cache.LocalRegion.dispatchListenerEvent(LocalRegion.java:6974) > at > org.apache.geode.internal.cache.LocalRegion.invokeDestroyCallbacks(LocalRegion.java:6775) > at > org.apache.geode.internal.cache.EntryEventImpl.invokeCallbacks(EntryEventImpl.java:2446) > at > org.apache.geode.internal.cache.entries.AbstractRegionEntry.dispatchListenerEvents(AbstractRegionEntry.java:164) > at > org.apache.geode.internal.cache.LocalRegion.basicDestroyPart2(LocalRegion.java:6716) > at > org.apache.geode.internal.cache.map.RegionMapDestroy.destroyExistingEntry(RegionMapDestroy.java:414) > at > org.apache.geode.internal.cache.map.RegionMapDestroy.handleExistingRegionEntry(RegionMapDestroy.java:244) > at > org.apache.geode.internal.cache.map.RegionMapDestroy.destroy(RegionMapDestroy.java:152) > at > org.apache.geode.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:968) > at > org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6505) > at > org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6479) > at > org.apache.geode.internal.cache.LocalRegionDataView.destroyExistingEntry(LocalRegionDataView.java:59) > at > org.apache.geode.internal.cache.LocalRegion.basicDestroy(LocalRegion.java:6430) > at > org.apache.geode.internal.cache.DistributedRegion.basicDestroy(DistributedRegion.java:1730) > at > org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderQueue$SerialGatewaySenderQueueMetaRegion.basicDestroy(SerialGatewaySenderQueue.java:1387) > at > org.apache.geode.internal.cache.LocalRegion.localDestroy(LocalRegion.java:2230) > at > 
org.apache.geode.internal.cache.DistributedRegion.localDestroy(DistributedRegion.java:967) > at > org.apache.geode.internal.cache.wan.serial.BatchDestroyOperation$DestroyMessage.operateOnRegion(BatchDestroyOperation.java:121) > at > org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1208) > at > org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1110) > at > org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376) > at > org.apache.geode.
[jira] [Updated] (GEODE-8816) CI failure: SerialWanPropagationDUnitTest. testReplicatedSerialPropagationWithRemoteRegionDestroy3
[ https://issues.apache.org/jira/browse/GEODE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8816: -- Description: This test failed with a suspect string showing a functional problem with the sender event processor. {noformat} org.apache.geode.internal.cache.wan.serial.SerialWANPropagationDUnitTest > testReplicatedSerialPropagationWithRemoteRegionDestroy3 FAILED java.lang.AssertionError: Suspicious strings were written to the log during this run. Fix the strings or use IgnoredException.addIgnoredException to ignore. --- Found suspect string in 'dunit_suspect-vm5.log' at line 737 [error 2021/01/07 01:06:31.894 GMT :41005 unshared ordered uid=191 dom #1 local port=49013 remote port=40362> tid=1289] Exception occurred in CacheListener java.util.concurrent.RejectedExecutionException: Task org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor$2@3f0ab37b rejected from java.util.concurrent.ThreadPoolExecutor@1a4d5355[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2071] at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063) at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379) at org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor.handlePrimaryDestroy(SerialGatewaySenderEventProcessor.java:607) at org.apache.geode.internal.cache.wan.serial.SerialSecondaryGatewayListener.afterDestroy(SerialSecondaryGatewayListener.java:91) at org.apache.geode.internal.cache.EnumListenerEvent$AFTER_DESTROY.dispatchEvent(EnumListenerEvent.java:178) at org.apache.geode.internal.cache.LocalRegion.dispatchEvent(LocalRegion.java:8265) at org.apache.geode.internal.cache.LocalRegion.dispatchListenerEvent(LocalRegion.java:6974) at 
org.apache.geode.internal.cache.LocalRegion.invokeDestroyCallbacks(LocalRegion.java:6775) at org.apache.geode.internal.cache.EntryEventImpl.invokeCallbacks(EntryEventImpl.java:2446) at org.apache.geode.internal.cache.entries.AbstractRegionEntry.dispatchListenerEvents(AbstractRegionEntry.java:164) at org.apache.geode.internal.cache.LocalRegion.basicDestroyPart2(LocalRegion.java:6716) at org.apache.geode.internal.cache.map.RegionMapDestroy.destroyExistingEntry(RegionMapDestroy.java:414) at org.apache.geode.internal.cache.map.RegionMapDestroy.handleExistingRegionEntry(RegionMapDestroy.java:244) at org.apache.geode.internal.cache.map.RegionMapDestroy.destroy(RegionMapDestroy.java:152) at org.apache.geode.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:968) at org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6505) at org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6479) at org.apache.geode.internal.cache.LocalRegionDataView.destroyExistingEntry(LocalRegionDataView.java:59) at org.apache.geode.internal.cache.LocalRegion.basicDestroy(LocalRegion.java:6430) at org.apache.geode.internal.cache.DistributedRegion.basicDestroy(DistributedRegion.java:1730) at org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderQueue$SerialGatewaySenderQueueMetaRegion.basicDestroy(SerialGatewaySenderQueue.java:1387) at org.apache.geode.internal.cache.LocalRegion.localDestroy(LocalRegion.java:2230) at org.apache.geode.internal.cache.DistributedRegion.localDestroy(DistributedRegion.java:967) at org.apache.geode.internal.cache.wan.serial.BatchDestroyOperation$DestroyMessage.operateOnRegion(BatchDestroyOperation.java:121) at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1208) at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1110) at 
org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376) at org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:432) at org.apache.geode.distributed.internal.ClusterDistributionManager.scheduleIncomingMessage(ClusterDistributionManager.java:2066) at org.apache.geode.distributed.internal.ClusterDistributionManager.handleIncomingDMsg(ClusterDistributionManager.java:1831) at org.apache.geode.distributed.internal.membership.gms.GMSMembership.dispatchMessage(GMSMembership.java:930) at org.apache.geode.distributed.internal.membership.gms.GMSMembership.handleOrDeferMessage(GMSMembership.java:
[jira] [Updated] (GEODE-8816) CI failure: SerialWanPropagationDUnitTest. testReplicatedSerialPropagationWithRemoteRegionDestroy3
[ https://issues.apache.org/jira/browse/GEODE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt updated GEODE-8816:
--
Description:
This test failed with a suspect string showing a functional problem with the sender event processor.

{noformat}
org.apache.geode.internal.cache.wan.serial.SerialWANPropagationDUnitTest > testReplicatedSerialPropagationWithRemoteRegionDestroy3 FAILED
java.lang.AssertionError: Suspicious strings were written to the log during this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in 'dunit_suspect-vm5.log' at line 737

[error 2021/01/07 01:06:31.894 GMT :41005 unshared ordered uid=191 dom #1 local port=49013 remote port=40362> tid=1289] Exception occurred in CacheListener
java.util.concurrent.RejectedExecutionException: Task org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor$2@3f0ab37b rejected from java.util.concurrent.ThreadPoolExecutor@1a4d5355[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2071]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
    at org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor.handlePrimaryDestroy(SerialGatewaySenderEventProcessor.java:607)
    at org.apache.geode.internal.cache.wan.serial.SerialSecondaryGatewayListener.afterDestroy(SerialSecondaryGatewayListener.java:91)
    at org.apache.geode.internal.cache.EnumListenerEvent$AFTER_DESTROY.dispatchEvent(EnumListenerEvent.java:178)
    at org.apache.geode.internal.cache.LocalRegion.dispatchEvent(LocalRegion.java:8265)
    at org.apache.geode.internal.cache.LocalRegion.dispatchListenerEvent(LocalRegion.java:6974)
    at org.apache.geode.internal.cache.LocalRegion.invokeDestroyCallbacks(LocalRegion.java:6775)
    at org.apache.geode.internal.cache.EntryEventImpl.invokeCallbacks(EntryEventImpl.java:2446)
    at org.apache.geode.internal.cache.entries.AbstractRegionEntry.dispatchListenerEvents(AbstractRegionEntry.java:164)
    at org.apache.geode.internal.cache.LocalRegion.basicDestroyPart2(LocalRegion.java:6716)
    at org.apache.geode.internal.cache.map.RegionMapDestroy.destroyExistingEntry(RegionMapDestroy.java:414)
    at org.apache.geode.internal.cache.map.RegionMapDestroy.handleExistingRegionEntry(RegionMapDestroy.java:244)
    at org.apache.geode.internal.cache.map.RegionMapDestroy.destroy(RegionMapDestroy.java:152)
    at org.apache.geode.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:968)
    at org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6505)
    at org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6479)
    at org.apache.geode.internal.cache.LocalRegionDataView.destroyExistingEntry(LocalRegionDataView.java:59)
    at org.apache.geode.internal.cache.LocalRegion.basicDestroy(LocalRegion.java:6430)
    at org.apache.geode.internal.cache.DistributedRegion.basicDestroy(DistributedRegion.java:1730)
    at org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderQueue$SerialGatewaySenderQueueMetaRegion.basicDestroy(SerialGatewaySenderQueue.java:1387)
    at org.apache.geode.internal.cache.LocalRegion.localDestroy(LocalRegion.java:2230)
    at org.apache.geode.internal.cache.DistributedRegion.localDestroy(DistributedRegion.java:967)
    at org.apache.geode.internal.cache.wan.serial.BatchDestroyOperation$DestroyMessage.operateOnRegion(BatchDestroyOperation.java:121)
    at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1208)
    at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1110)
    at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
    at org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:432)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.scheduleIncomingMessage(ClusterDistributionManager.java:2066)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.handleIncomingDMsg(ClusterDistributionManager.java:1831)
    at org.apache.geode.distributed.internal.membership.gms.GMSMembership.dispatchMessage(GMSMembership.java:930)
    at org.apache.geode.distributed.internal.membership.gms.GMSMembership.handleOrDeferMessage(GMSMembership.java:
[jira] [Created] (GEODE-8816) CI failure: SerialWanPropagationDUnitTest. testReplicatedSerialPropagationWithRemoteRegionDestroy3
Bruce J Schuchardt created GEODE-8816:
-
Summary: CI failure: SerialWanPropagationDUnitTest. testReplicatedSerialPropagationWithRemoteRegionDestroy3
Key: GEODE-8816
URL: https://issues.apache.org/jira/browse/GEODE-8816
Project: Geode
Issue Type: Bug
Components: wan
Reporter: Bruce J Schuchardt

This test failed with a suspect string showing a functional problem with the sender queues.

{noformat}
org.apache.geode.internal.cache.wan.serial.SerialWANPropagationDUnitTest > testReplicatedSerialPropagationWithRemoteRegionDestroy3 FAILED
java.lang.AssertionError: Suspicious strings were written to the log during this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in 'dunit_suspect-vm5.log' at line 737

[error 2021/01/07 01:06:31.894 GMT :41005 unshared ordered uid=191 dom #1 local port=49013 remote port=40362> tid=1289] Exception occurred in CacheListener
java.util.concurrent.RejectedExecutionException: Task org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor$2@3f0ab37b rejected from java.util.concurrent.ThreadPoolExecutor@1a4d5355[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2071]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
    at org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor.handlePrimaryDestroy(SerialGatewaySenderEventProcessor.java:607)
    at org.apache.geode.internal.cache.wan.serial.SerialSecondaryGatewayListener.afterDestroy(SerialSecondaryGatewayListener.java:91)
    at org.apache.geode.internal.cache.EnumListenerEvent$AFTER_DESTROY.dispatchEvent(EnumListenerEvent.java:178)
    at org.apache.geode.internal.cache.LocalRegion.dispatchEvent(LocalRegion.java:8265)
    at org.apache.geode.internal.cache.LocalRegion.dispatchListenerEvent(LocalRegion.java:6974)
    at org.apache.geode.internal.cache.LocalRegion.invokeDestroyCallbacks(LocalRegion.java:6775)
    at org.apache.geode.internal.cache.EntryEventImpl.invokeCallbacks(EntryEventImpl.java:2446)
    at org.apache.geode.internal.cache.entries.AbstractRegionEntry.dispatchListenerEvents(AbstractRegionEntry.java:164)
    at org.apache.geode.internal.cache.LocalRegion.basicDestroyPart2(LocalRegion.java:6716)
    at org.apache.geode.internal.cache.map.RegionMapDestroy.destroyExistingEntry(RegionMapDestroy.java:414)
    at org.apache.geode.internal.cache.map.RegionMapDestroy.handleExistingRegionEntry(RegionMapDestroy.java:244)
    at org.apache.geode.internal.cache.map.RegionMapDestroy.destroy(RegionMapDestroy.java:152)
    at org.apache.geode.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:968)
    at org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6505)
    at org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6479)
    at org.apache.geode.internal.cache.LocalRegionDataView.destroyExistingEntry(LocalRegionDataView.java:59)
    at org.apache.geode.internal.cache.LocalRegion.basicDestroy(LocalRegion.java:6430)
    at org.apache.geode.internal.cache.DistributedRegion.basicDestroy(DistributedRegion.java:1730)
    at org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderQueue$SerialGatewaySenderQueueMetaRegion.basicDestroy(SerialGatewaySenderQueue.java:1387)
    at org.apache.geode.internal.cache.LocalRegion.localDestroy(LocalRegion.java:2230)
    at org.apache.geode.internal.cache.DistributedRegion.localDestroy(DistributedRegion.java:967)
    at org.apache.geode.internal.cache.wan.serial.BatchDestroyOperation$DestroyMessage.operateOnRegion(BatchDestroyOperation.java:121)
    at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1208)
    at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1110)
    at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
    at org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:432)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.scheduleIncomingMessage(ClusterDistributionManager.java:2066)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.handleIncomingDMsg(ClusterDistributionManager.java:1831)
    at org.apache.geode.distributed.
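
The GEODE-8816 trace above shows `ThreadPoolExecutor.execute` throwing `RejectedExecutionException` because the executor was already in the `Terminated` state when `handlePrimaryDestroy` submitted a task. The sketch below reproduces only that general JDK behaviour and is not Geode code. The class `RejectedAfterShutdown` and the helper `tryExecute` are hypothetical names introduced for illustration, and the guard shown is one possible way to tolerate a late submission, not the fix applied in Geode.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

final class RejectedAfterShutdown {

  /**
   * Hypothetical helper: submits the task and reports whether it was accepted.
   * Returns false instead of throwing when the executor has been shut down.
   */
  static boolean tryExecute(ExecutorService executor, Runnable task) {
    if (executor.isShutdown()) {
      return false; // cheap early exit; shutdown was already requested
    }
    try {
      executor.execute(task);
      return true;
    } catch (RejectedExecutionException e) {
      // shutdown raced with the isShutdown() check above
      return false;
    }
  }

  public static void main(String[] args) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    executor.shutdown(); // stands in for the stopped sender's executor

    boolean rejected = false;
    try {
      executor.execute(() -> {}); // default AbortPolicy throws here
    } catch (RejectedExecutionException e) {
      rejected = true;
    }
    System.out.println("unguarded execute rejected: " + rejected);
    System.out.println("guarded execute accepted: " + tryExecute(executor, () -> {}));
  }
}
```

Whether silently dropping the task is acceptable depends on what the task does. In a listener callback, as in the trace, an uncaught rejection surfaces as "Exception occurred in CacheListener", which is what the suspect-string check flagged.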
[jira] [Resolved] (GEODE-5922) SerialGatewaySenderQueue concurrency is poorly implemented
[ https://issues.apache.org/jira/browse/GEODE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt resolved GEODE-5922.
---
Resolution: Won't Fix

The fix for this problem has been reverted on develop and all support branches. The change to use a fair lock instead of Java synchronization caused queuing to be about 3x slower under heavy load.

> SerialGatewaySenderQueue concurrency is poorly implemented
> --
>
> Key: GEODE-5922
> URL: https://issues.apache.org/jira/browse/GEODE-5922
> Project: Geode
> Issue Type: Improvement
> Components: wan
> Reporter: Bruce J Schuchardt
> Assignee: Bruce J Schuchardt
> Priority: Major
> Labels: blocks-1.14.0, pull-request-available
> Fix For: 1.8.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> This class uses synchronization on the queue to limit access to one put at a
> time. Synchronization isn't a fair locking mechanism so threads can be
> blocked trying to add events to the queue while other more recent events get
> the lock and insert their events. This causes inconsistent latency which
> I've observed being as long as 30 seconds, causing client connections to be
> shut down by the ClientHealthMonitor.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
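
The trade-off described in GEODE-5922, an unfair `synchronized` block versus a fair lock, can be sketched with plain JDK classes. This is a minimal illustration and not the `SerialGatewaySenderQueue` implementation. The class `QueueLockingSketch`, the `Guard` interface and the `measure` helper are hypothetical, and the queue is an ordinary `ArrayDeque`. A fair `ReentrantLock` grants the lock to the longest-waiting thread, which bounds how long a put can be overtaken, at the cost of more handoffs between threads.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

final class QueueLockingSketch {

  /** Hypothetical abstraction: run a critical section while holding the queue's lock. */
  interface Guard {
    void run(Runnable criticalSection);
  }

  /** Intrinsic monitor: no ordering guarantee among waiting threads. */
  static Guard synchronizedGuard() {
    final Object monitor = new Object();
    return criticalSection -> {
      synchronized (monitor) {
        criticalSection.run();
      }
    };
  }

  /** Fair ReentrantLock: the longest-waiting thread acquires the lock next. */
  static Guard fairLockGuard() {
    final ReentrantLock lock = new ReentrantLock(true);
    return criticalSection -> {
      lock.lock();
      try {
        criticalSection.run();
      } finally {
        lock.unlock();
      }
    };
  }

  /** Runs concurrent producers and returns {queue size, elapsed nanoseconds}. */
  static long[] measure(Guard guard, int threads, int putsPerThread) {
    final Queue<Integer> queue = new ArrayDeque<>();
    final CountDownLatch start = new CountDownLatch(1);
    final Thread[] producers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      producers[t] = new Thread(() -> {
        try {
          start.await(); // release all producers at once to create contention
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        for (int i = 0; i < putsPerThread; i++) {
          final int event = i;
          guard.run(() -> queue.add(event));
        }
      });
      producers[t].start();
    }
    final long begin = System.nanoTime();
    start.countDown();
    for (Thread producer : producers) {
      try {
        producer.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return new long[] {queue.size(), System.nanoTime() - begin};
  }

  public static void main(String[] args) {
    long[] unfair = measure(synchronizedGuard(), 4, 10_000);
    long[] fair = measure(fairLockGuard(), 4, 10_000);
    System.out.println("synchronized: " + unfair[0] + " puts in " + unfair[1] / 1_000_000 + " ms");
    System.out.println("fair lock:    " + fair[0] + " puts in " + fair[1] / 1_000_000 + " ms");
  }
}
```

The timings from a harness like this are only indicative. It has no warm-up and measures a trivial critical section, so it should not be read as confirming or refuting the 3x figure reported above.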
[jira] [Updated] (GEODE-5922) SerialGatewaySenderQueue concurrency is poorly implemented
[ https://issues.apache.org/jira/browse/GEODE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt updated GEODE-5922:
--
Labels: blocks-1.14.0 pull-request-available (was: pull-request-available)

> SerialGatewaySenderQueue concurrency is poorly implemented
> --
>
> Key: GEODE-5922
> URL: https://issues.apache.org/jira/browse/GEODE-5922
> Project: Geode
> Issue Type: Improvement
> Components: wan
> Reporter: Bruce J Schuchardt
> Assignee: Bruce J Schuchardt
> Priority: Major
> Labels: blocks-1.14.0, pull-request-available
> Fix For: 1.8.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> This class uses synchronization on the queue to limit access to one put at a
> time. Synchronization isn't a fair locking mechanism so threads can be
> blocked trying to add events to the queue while other more recent events get
> the lock and insert their events. This causes inconsistent latency which
> I've observed being as long as 30 seconds, causing client connections to be
> shut down by the ClientHealthMonitor.
[jira] [Reopened] (GEODE-5922) SerialGatewaySenderQueue concurrency is poorly implemented
[ https://issues.apache.org/jira/browse/GEODE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt reopened GEODE-5922:
---

The fix for this issue caused a 3x performance degradation in adding new events to the async queue. The fix needs to be reverted and reevaluated. A performance test of the AEQ with heavy load should be created to vet any new fix.

> SerialGatewaySenderQueue concurrency is poorly implemented
> --
>
> Key: GEODE-5922
> URL: https://issues.apache.org/jira/browse/GEODE-5922
> Project: Geode
> Issue Type: Improvement
> Components: wan
> Reporter: Bruce J Schuchardt
> Assignee: Bruce J Schuchardt
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.8.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> This class uses synchronization on the queue to limit access to one put at a
> time. Synchronization isn't a fair locking mechanism so threads can be
> blocked trying to add events to the queue while other more recent events get
> the lock and insert their events. This causes inconsistent latency which
> I've observed being as long as 30 seconds, causing client connections to be
> shut down by the ClientHealthMonitor.
[jira] [Updated] (GEODE-8567) CI Failure: ConcurrentSerialGatewaySenderOperationsDistributedTest > testRestartSerialGatewaySendersWhilePutting
[ https://issues.apache.org/jira/browse/GEODE-8567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce J Schuchardt updated GEODE-8567: -- Labels: no-release-note (was: ) > CI Failure: ConcurrentSerialGatewaySenderOperationsDistributedTest > > testRestartSerialGatewaySendersWhilePutting > > > Key: GEODE-8567 > URL: https://issues.apache.org/jira/browse/GEODE-8567 > Project: Geode > Issue Type: Improvement > Components: wan >Reporter: Owen Nichols >Priority: Major > Labels: no-release-note > > ConcurrentSerialGatewaySenderOperationsDistributedTest > > testRestartSerialGatewaySendersWhilePutting[1: numDispatchers=3] FAILED > seen in > [DistributedTestOpenJDK11|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/490] > #490 > > Task :geode-wan:distributedTest > org.apache.geode.internal.cache.wan.concurrent.ConcurrentSerialGatewaySenderOperationsDistributedTest > > testRestartSerialGatewaySendersWhilePutting[1: numDispatchers=3] FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderOperationsDistributedTest$$Lambda$490/0x00010094b040.run > in VM 5 running on Host 60d64fc07216 with 8 VMs > Caused by: > org.awaitility.core.ConditionTimeoutException: Assertion condition > defined as a lambda expression in > org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderOperationsDistributedTest > that uses org.apache.geode.internal.cache.wan.InternalGatewaySender, > org.apache.geode.internal.cache.wan.InternalGatewaySenderint [Sender > statistics unprocessed event map size] expected:<[0]> but was:<[2]> within 5 > minutes. 
> Caused by: > org.junit.ComparisonFailure: [Sender statistics unprocessed event > map size] expected:<[0]> but was:<[2]> > =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > [*http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0380/test-results/distributedTest/1601531249/*] > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > Test report artifacts from this job are available at: > [*http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0380/test-artifacts/1601531249/distributedtestfiles-OpenJDK11-1.14.0-build.0380.tgz*]