[jira] [Commented] (ARTEMIS-3831) Scale-down fails when using same discovery-group used by Broker cluster connection

ASF subversion and git services (Jira) Mon, 22 Jan 2024 12:03:04 -0800


    [ 
https://issues.apache.org/jira/browse/ARTEMIS-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809624#comment-17809624
 ]


ASF subversion and git services commented on ARTEMIS-3831:
----------------------------------------------------------

Commit 3dd50f8ff14116234909ad2e275515b2b83971b5 in activemq-artemis's branch 
refs/heads/main from Justin Bertram
[ https://gitbox.apache.org/repos/asf?p=activemq-artemis.git;h=3dd50f8ff1 ]

ARTEMIS-3831 scale-down w/jgroups fails if using same dg as cluster-connection

If both scale-down and cluster-connection are using the same JGroups
discovery-group then when the cluster-connection stops it will close the
underlying org.jgroups.JChannel and when the scale-down process tries to
use it to find a server it will fail.

This commit ensures that the JGroupsBroadcastEndpoint implementation of
BroadcastEndpoint#openClient initializes the channel if it has been
closed.


> Scale-down fails when using same discovery-group used by Broker cluster 
> connection
> ----------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-3831
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3831
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.19.1, 2.31.0
>            Reporter: Apache Dev
>            Assignee: Justin Bertram
>            Priority: Critical
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Using 2 Live brokers in cluster.
> Both having the following HA Policy:
> {code}
>         <ha-policy>
>             <live-only>
>                 <scale-down>
>                     <enabled>true</enabled>
>                     <discovery-group-ref 
> discovery-group-name="activemq-discovery-group"/>
>                 </scale-down>
>             </live-only>
>         </ha-policy>
> {code}
> where "activemq-discovery-group" is using JGroups TCPPING:
> {code}
>         <discovery-groups>
>             <discovery-group name="activemq-discovery-group">
>                 <jgroups-file>...</jgroups-file>
>                 <jgroups-channel>...</jgroups-channel>
>                 <refresh-timeout>10000</refresh-timeout>
>             </discovery-group>
>         </discovery-groups>
> {code}
> and it is used by the cluster of 2 brokers:
> {code}
>         <cluster-connections>
>             <cluster-connection name="activemq-cluster">
>                 <connector-ref>netty-connector</connector-ref>
>                 <retry-interval>5000</retry-interval>
>                 <use-duplicate-detection>true</use-duplicate-detection>
>                 <message-load-balancing>OFF</message-load-balancing>
>                 <max-hops>1</max-hops>
>                 <discovery-group-ref 
> discovery-group-name="activemq-discovery-group"/>
>             </cluster-connection>
>         </cluster-connections>
> {code}
> Issue is that when shutdown happens, scale-down fails:
> {code}
> org.apache.activemq.artemis.core.server                      W AMQ222181: 
> Unable to scaleDown messages
>         ActiveMQInternalErrorException[errorType=INTERNAL_ERROR 
> message=AMQ219004: Failed to initialise session factory]
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.initialize(ServerLocatorImpl.java:272)
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:655)
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:554)
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:533)
>         at 
> org.apache.activemq.artemis.core.server.LiveNodeLocator.connectToCluster(LiveNodeLocator.java:85)
>         at 
> org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.connectToScaleDownTarget(LiveOnlyActivation.java:146)
>         at 
> org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.freezeConnections(LiveOnlyActivation.java:114)
>         at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.freezeConnections(ActiveMQServerImpl.java:1468)
>         at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1250)
>         at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1166)
>         at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1150)
>         ...
>         Caused by: ActiveMQInternalErrorException[errorType=INTERNAL_ERROR 
> message=channel is closed]
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.startDiscovery(ServerLocatorImpl.java:286)
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.initialize(ServerLocatorImpl.java:268)
>         ... 44 more
>         Caused by: java.lang.IllegalStateException: channel is closed
>         at org.jgroups.JChannel.checkClosed(JChannel.java:957)
>         at org.jgroups.JChannel._preConnect(JChannel.java:548)
>         at org.jgroups.JChannel.connect(JChannel.java:288)
>         at org.jgroups.JChannel.connect(JChannel.java:279)
>         at 
> org.apache.activemq.artemis.api.core.jgroups.JChannelWrapper.connect(JChannelWrapper.java:126)
>         at 
> org.apache.activemq.artemis.api.core.JGroupsBroadcastEndpoint.internalOpen(JGroupsBroadcastEndpoint.java:113)
>         at 
> org.apache.activemq.artemis.api.core.JGroupsBroadcastEndpoint.openClient(JGroupsBroadcastEndpoint.java:91)
>         at 
> org.apache.activemq.artemis.core.cluster.DiscoveryGroup.start(DiscoveryGroup.java:111)
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.startDiscovery(ServerLocatorImpl.java:284)
>         ... 45 more
> {code}
> JGroups channel used by scale-down is probably the same used by broker, but 
> already being closed during broker shutdown itself.
> As a workaround, it is possible to create a separate discovery-group (with 
> its own broadcast-group) so that scale-down uses a new JGroups channel not 
> being closed by broker.
> However, this causes duplication of configurations and a new JGroups port for 
> the scale-down discovery must be opened.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (ARTEMIS-3831) Scale-down fails when using same discovery-group used by Broker cluster connection

Reply via email to