[ https://issues.apache.org/jira/browse/ARTEMIS-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809624#comment-17809624 ]
ASF subversion and git services commented on ARTEMIS-3831:
----------------------------------------------------------

Commit 3dd50f8ff14116234909ad2e275515b2b83971b5 in activemq-artemis's branch refs/heads/main from Justin Bertram
[ https://gitbox.apache.org/repos/asf?p=activemq-artemis.git;h=3dd50f8ff1 ]

ARTEMIS-3831 scale-down w/jgroups fails if using same dg as cluster-connection

If both scale-down and cluster-connection are using the same JGroups
discovery-group then when the cluster-connection stops it will close the
underlying org.jgroups.JChannel and when the scale-down process tries to
use it to find a server it will fail. This commit ensures that the
JGroupsBroadcastEndpoint implementation of BroadcastEndpoint#openClient
initializes the channel if it has been closed.

> Scale-down fails when using same discovery-group used by Broker cluster connection
> ----------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-3831
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3831
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.19.1, 2.31.0
>            Reporter: Apache Dev
>            Assignee: Justin Bertram
>            Priority: Critical
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Using 2 Live brokers in cluster.
> Both having the following HA Policy:
> {code}
> <ha-policy>
>    <live-only>
>       <scale-down>
>          <enabled>true</enabled>
>          <discovery-group-ref discovery-group-name="activemq-discovery-group"/>
>       </scale-down>
>    </live-only>
> </ha-policy>
> {code}
> where "activemq-discovery-group" is using JGroups TCPPING:
> {code}
> <discovery-groups>
>    <discovery-group name="activemq-discovery-group">
>       <jgroups-file>...</jgroups-file>
>       <jgroups-channel>...</jgroups-channel>
>       <refresh-timeout>10000</refresh-timeout>
>    </discovery-group>
> </discovery-groups>
> {code}
> and it is used by the cluster of 2 brokers:
> {code}
> <cluster-connections>
>    <cluster-connection name="activemq-cluster">
>       <connector-ref>netty-connector</connector-ref>
>       <retry-interval>5000</retry-interval>
>       <use-duplicate-detection>true</use-duplicate-detection>
>       <message-load-balancing>OFF</message-load-balancing>
>       <max-hops>1</max-hops>
>       <discovery-group-ref discovery-group-name="activemq-discovery-group"/>
>    </cluster-connection>
> </cluster-connections>
> {code}
> Issue is that when shutdown happens, scale-down fails:
> {code}
> org.apache.activemq.artemis.core.server W AMQ222181: Unable to scaleDown messages
> ActiveMQInternalErrorException[errorType=INTERNAL_ERROR message=AMQ219004: Failed to initialise session factory]
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.initialize(ServerLocatorImpl.java:272)
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:655)
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:554)
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:533)
>     at org.apache.activemq.artemis.core.server.LiveNodeLocator.connectToCluster(LiveNodeLocator.java:85)
>     at org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.connectToScaleDownTarget(LiveOnlyActivation.java:146)
>     at org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.freezeConnections(LiveOnlyActivation.java:114)
>     at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.freezeConnections(ActiveMQServerImpl.java:1468)
>     at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1250)
>     at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1166)
>     at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1150)
>     ...
> Caused by: ActiveMQInternalErrorException[errorType=INTERNAL_ERROR message=channel is closed]
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.startDiscovery(ServerLocatorImpl.java:286)
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.initialize(ServerLocatorImpl.java:268)
>     ... 44 more
> Caused by: java.lang.IllegalStateException: channel is closed
>     at org.jgroups.JChannel.checkClosed(JChannel.java:957)
>     at org.jgroups.JChannel._preConnect(JChannel.java:548)
>     at org.jgroups.JChannel.connect(JChannel.java:288)
>     at org.jgroups.JChannel.connect(JChannel.java:279)
>     at org.apache.activemq.artemis.api.core.jgroups.JChannelWrapper.connect(JChannelWrapper.java:126)
>     at org.apache.activemq.artemis.api.core.JGroupsBroadcastEndpoint.internalOpen(JGroupsBroadcastEndpoint.java:113)
>     at org.apache.activemq.artemis.api.core.JGroupsBroadcastEndpoint.openClient(JGroupsBroadcastEndpoint.java:91)
>     at org.apache.activemq.artemis.core.cluster.DiscoveryGroup.start(DiscoveryGroup.java:111)
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.startDiscovery(ServerLocatorImpl.java:284)
>     ... 45 more
> {code}
> JGroups channel used by scale-down is probably the same used by broker, but
> already being closed during broker shutdown itself.
> As a workaround, it is possible to create a separate discovery-group (with
> its own broadcast-group) so that scale-down uses a new JGroups channel not
> being closed by broker.
> However, this causes duplication of configurations and a new JGroups port for
> the scale-down discovery must be opened.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
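
For readers who want the idea behind the fix in isolation, here is a minimal sketch of the "reinitialize the channel if it has been closed" pattern that the commit message describes. It is not the actual Artemis or JGroups code: DemoChannel and DemoEndpoint are hypothetical stand-ins, and the only behavior taken from the report is that connecting a closed channel throws IllegalStateException("channel is closed").

```java
/**
 * Hypothetical stand-in for a channel that cannot be reused once closed
 * (assumption: it mirrors the "channel is closed" failure in the stack trace).
 */
final class DemoChannel {
    private boolean closed;
    private boolean connected;

    void connect() {
        if (closed) {
            throw new IllegalStateException("channel is closed");
        }
        connected = true;
    }

    void close() {
        connected = false;
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    boolean isConnected() {
        return connected;
    }
}

/**
 * Hypothetical endpoint shared by two users, for example a cluster
 * connection and a scale-down process.
 */
final class DemoEndpoint {
    private DemoChannel channel = new DemoChannel();
    private int channelsCreated = 1;

    /** Opens the endpoint, replacing the channel first if it was closed. */
    synchronized void openClient() {
        if (channel.isClosed()) {
            // Without this step, connect() below would throw
            // IllegalStateException("channel is closed").
            channel = new DemoChannel();
            channelsCreated++;
        }
        channel.connect();
    }

    /** Closes the underlying channel, as the first user does on shutdown. */
    synchronized void close() {
        channel.close();
    }

    synchronized boolean isOpen() {
        return channel.isConnected();
    }

    synchronized int channelsCreated() {
        return channelsCreated;
    }
}
```

In this sketch the second user can call openClient() after the first user has called close(), because the endpoint swaps in a fresh channel instead of reusing the closed one.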