[ https://issues.apache.org/jira/browse/ARTEMIS-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jelmer Marinus updated ARTEMIS-3277:
------------------------------------
Description:
We are running an Artemis cluster with the following specs:
------
Artemis version: 2.15.0 (runs in a Docker container)
Docker version: 20.10.3, build 48d30b5
Docker image: vromero/activemq-artemis:2.15.0 (JVM: OpenJDK Runtime Environment (build 1.8.0_232-b09))
OS version: CentOS 7.9.2009 (x86_64)
Cluster setup: 3 primary nodes with 3 secondary nodes
------
Sometimes we experience connection issues from the clients connecting to the cluster, which results in messages not being consumed from or produced to the cluster. This seems to be specific to a single (alternating) master node at a time, and a restart of that node seems to solve the problem. On the client we get the error message "AMQ212037: Connection failure to ############################## has been detected: AMQ219015: The connection was disconnected because of server shutdown [code=DISCONNECTED]". The Artemis console, however, is still available, so from that perspective the node seems to be reachable and working correctly.

Server-side (Artemis) we get the following error messages, which may be related:
"Inconsistency during compacting: RollbackRecord ID = 82113064851 for an already rolled back transaction during compacting"
"Error on reading compacting for JournalFileImpl: (activemq-data-######.amq id = #####, recordID = #####)"
"Bridge Failed to ack"
"Cannot find add info ######## on compactor or current records"

We are also experiencing the following issues, which may or may not be related:
1) Initial distribution/redistribution not working correctly
2) Reset of address-settings to default values. We use specific settings for e.g. the DLA, redistribution-delay, etc., but sometimes these settings seem to go back to their default values.

Is this problem reported more often, or does it have something to do with our setup of the cluster? If the latter, is there a way to fix it?
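For reference, the kind of per-address configuration described in point 2 (a DLA plus a redistribution delay) would look roughly like the broker.xml fragment below; the address match and values are illustrative examples, not the actual production configuration:

```xml
<!-- Illustrative broker.xml fragment only: shows the style of
     address-settings (DLA, redistribution-delay) that reportedly
     revert to their defaults. Match pattern and values are examples. -->
<address-settings>
   <address-setting match="example.#">
      <dead-letter-address>DLA</dead-letter-address>
      <max-delivery-attempts>3</max-delivery-attempts>
      <redistribution-delay>5000</redistribution-delay>
   </address-setting>
</address-settings>
```

With a setup like this, a reset to defaults would mean the redistribution delay falling back to -1 (redistribution disabled), which would match the redistribution symptoms in point 1.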
> Client connection to Artemis-cluster node lost
> ----------------------------------------------
>
> Key: ARTEMIS-3277
> URL: https://issues.apache.org/jira/browse/ARTEMIS-3277
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: ActiveMQ-Artemis-Native
> Affects Versions: 2.15.0
> Reporter: Jelmer Marinus
> Assignee: Clebert Suconic
> Priority: Major
>
-- This message was sent by Atlassian Jira (v8.3.4#803005)