[ https://issues.apache.org/jira/browse/ARTEMIS-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17223686#comment-17223686 ]
Robbie Gemmell edited comment on ARTEMIS-2969 at 10/30/20, 2:47 PM: -------------------------------------------------------------------- The latest fix means the broker does notice the connection going down immediately which is good, however it also introduces a new problem behaviour. It seems like each time it disconnects it creates an extra additional new connection on trying to reconnect. E.g in testing, I've got one server-connection configured, but after a number of restarts on the peer I now have traffic logs froming from 4 separate broker connections. As soon as I discconnected it, it logged twice that it was doing 'retry 2 of -1' (i.e it skipped logging retry 1). I stopped and started the router a few more times and ended up with yet more connection logs. I can see the live traffic going for them all on a protocol trace (broker + router) so they really are actually there. E.g on shutting the router down in this state, it logged 4 connections: {noformat} 2020-10-30 14:35:19.403646 +0000 ROUTER_CORE (info) Router Core thread exited 2020-10-30 14:35:19.403807 +0000 ROUTER_CORE (info) Finalizing core module: streaming_link_scruber 2020-10-30 14:35:19.403834 +0000 ROUTER_CORE (info) Finalizing core module: stuck_delivery_detection 2020-10-30 14:35:19.403842 +0000 ROUTER_CORE (info) Finalizing core module: address_lookup_client 2020-10-30 14:35:19.403875 +0000 SERVER (info) [C1] Closing connection on shutdown 2020-10-30 14:35:19.403915 +0000 SERVER (info) [C2] Closing connection on shutdown 2020-10-30 14:35:19.403925 +0000 SERVER (info) [C3] Closing connection on shutdown 2020-10-30 14:35:19.403929 +0000 SERVER (info) [C4] Closing connection on shutdown [0x23cd9b0]: AMQP:FRAME:0 -> @close(24) [error=@error(29) [condition=:"amqp:connection:framing-error", description="connection aborted"]] [0x23cd9b0]: IO:FRAME: <- EOS [0x2396bd0]: AMQP:FRAME:0 -> @close(24) [error=@error(29) [condition=:"amqp:connection:framing-error", description="connection aborted"]] [0x2396bd0]: IO:FRAME: <- EOS [0x231b3b0]: AMQP:FRAME:0 -> @close(24) [error=@error(29) [condition=:"amqp:connection:framing-error", description="connection aborted"]] [0x231b3b0]: IO:FRAME: <- EOS [0x2314620]: AMQP:FRAME:0 -> @close(24) [error=@error(29) [condition=:"amqp:connection:framing-error", description="connection aborted"]] {noformat} While the broker was logging: {noformat} 2020-10-30 14:36:04,423 INFO [org.apache.activemq.artemis.protocol.amqp.logger] AMQ111002: ******************************************************************************************************************************* Retrying Server AMQP Connection my-router on localhost:6000 retry 46 of -1 ******************************************************************************************************************************* 2020-10-30 14:36:04,424 INFO [org.apache.activemq.artemis.protocol.amqp.logger] AMQ111002: ******************************************************************************************************************************* Retrying Server AMQP Connection my-router on localhost:6000 retry 47 of -1 ******************************************************************************************************************************* 2020-10-30 14:36:04,426 INFO [org.apache.activemq.artemis.protocol.amqp.logger] AMQ111002: ******************************************************************************************************************************* Retrying Server AMQP Connection my-router on localhost:6000 retry 48 of -1 ******************************************************************************************************************************* 2020-10-30 14:36:04,427 INFO [org.apache.activemq.artemis.protocol.amqp.logger] AMQ111002: ******************************************************************************************************************************* Retrying Server AMQP Connection my-router on localhost:6000 retry 49 of -1 ******************************************************************************************************************************* 2020-10-30 14:36:04,429 INFO [org.apache.activemq.artemis.protocol.amqp.logger] AMQ111002: ******************************************************************************************************************************* Retrying Server AMQP Connection my-router on localhost:6000 retry 50 of -1 ******************************************************************************************************************************* {noformat} It appears the (now 5) different connections end up sharing the same reconnect counter (which perhaps also explains the initial 'retry 2' thing I noted seeing). was (Author: gemmellr): The latest fix means the broker does notice the connection going down immediately which is good, however it also introduces a new problem behaviour. It seems like each time it disconnects it creates an extra additional new connection on trying to reconnect. E.g in testing, I've got one server-connection configured, but after a number of restarts on the peer I now have traffic logs froming from 4 separate broker connections. As soon as I discconnected it, it logged twice that it was doing 'retry 2 of -1' (i.e it skipped logging retry 1). I stopped and started the router a few more times and ended up with yet more connection logs. I can see the live traffic going for them all on a protocol trace (broker + router) so they really are actually there. E.g on shutting the router down in this state, it logged: {noformat} 2020-10-30 14:35:19.403646 +0000 ROUTER_CORE (info) Router Core thread exited 2020-10-30 14:35:19.403807 +0000 ROUTER_CORE (info) Finalizing core module: streaming_link_scruber 2020-10-30 14:35:19.403834 +0000 ROUTER_CORE (info) Finalizing core module: stuck_delivery_detection 2020-10-30 14:35:19.403842 +0000 ROUTER_CORE (info) Finalizing core module: address_lookup_client 2020-10-30 14:35:19.403875 +0000 SERVER (info) [C1] Closing connection on shutdown 2020-10-30 14:35:19.403915 +0000 SERVER (info) [C2] Closing connection on shutdown 2020-10-30 14:35:19.403925 +0000 SERVER (info) [C3] Closing connection on shutdown 2020-10-30 14:35:19.403929 +0000 SERVER (info) [C4] Closing connection on shutdown [0x23cd9b0]: AMQP:FRAME:0 -> @close(24) [error=@error(29) [condition=:"amqp:connection:framing-error", description="connection aborted"]] [0x23cd9b0]: IO:FRAME: <- EOS [0x2396bd0]: AMQP:FRAME:0 -> @close(24) [error=@error(29) [condition=:"amqp:connection:framing-error", description="connection aborted"]] [0x2396bd0]: IO:FRAME: <- EOS [0x231b3b0]: AMQP:FRAME:0 -> @close(24) [error=@error(29) [condition=:"amqp:connection:framing-error", description="connection aborted"]] [0x231b3b0]: IO:FRAME: <- EOS [0x2314620]: AMQP:FRAME:0 -> @close(24) [error=@error(29) [condition=:"amqp:connection:framing-error", description="connection aborted"]] {noformat} While the broker was logging: {noformat} 2020-10-30 14:36:04,423 INFO [org.apache.activemq.artemis.protocol.amqp.logger] AMQ111002: ******************************************************************************************************************************* Retrying Server AMQP Connection my-router on localhost:6000 retry 46 of -1 ******************************************************************************************************************************* 2020-10-30 14:36:04,424 INFO [org.apache.activemq.artemis.protocol.amqp.logger] AMQ111002: ******************************************************************************************************************************* Retrying Server AMQP Connection my-router on localhost:6000 retry 47 of -1 ******************************************************************************************************************************* 2020-10-30 14:36:04,426 INFO [org.apache.activemq.artemis.protocol.amqp.logger] AMQ111002: ******************************************************************************************************************************* Retrying Server AMQP Connection my-router on localhost:6000 retry 48 of -1 ******************************************************************************************************************************* 2020-10-30 14:36:04,427 INFO [org.apache.activemq.artemis.protocol.amqp.logger] AMQ111002: ******************************************************************************************************************************* Retrying Server AMQP Connection my-router on localhost:6000 retry 49 of -1 ******************************************************************************************************************************* 2020-10-30 14:36:04,429 INFO [org.apache.activemq.artemis.protocol.amqp.logger] AMQ111002: ******************************************************************************************************************************* Retrying Server AMQP Connection my-router on localhost:6000 retry 50 of -1 ******************************************************************************************************************************* {noformat} It appears the (now 5) different connections end up sharing the same reconnect counter (which perhaps also explains the initial 'retry 2' thing I noted seeing). > broker doesnt handle server-connection being severed > ---------------------------------------------------- > > Key: ARTEMIS-2969 > URL: https://issues.apache.org/jira/browse/ARTEMIS-2969 > Project: ActiveMQ Artemis > Issue Type: Bug > Affects Versions: 2.16.0 > Reporter: Robbie Gemmell > Assignee: Clebert Suconic > Priority: Major > Fix For: 2.16.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The broker doesnt notice or handle a server-connection being severed / timing > out. > Start a broker, having instructed it to connect to a router (i.e 'peer' > configuration). Stop the router (CTRL-C). Start the router again. Broker > doesnt seem to notice the server-connection is gone and doesnt reconnect. It > actually sits generating heartbeat frames until its 60sec actual-timeout is > reached...at which point proton generates a close frame indicating the > timeout. The broker still doesnt notice the connection is gone, and still > doesnt reconnect. -- This message was sent by Atlassian Jira (v8.3.4#803005)