[jira] [Updated] (ARTEMIS-2870) CORE connection failure sometimes doesn't cleanup sessions

Markus Meierhofer (Jira) Thu, 05 Nov 2020 07:12:37 -0800


     [ 
https://issues.apache.org/jira/browse/ARTEMIS-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Markus Meierhofer updated ARTEMIS-2870:
---------------------------------------
    Attachment: connection_nonexistent.png

> CORE connection failure sometimes doesn't cleanup sessions
> ----------------------------------------------------------
>
>                 Key: ARTEMIS-2870
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2870
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.10.1, 2.14.0, 2.15.0
>            Reporter: Markus Meierhofer
>            Priority: Blocker
>         Attachments: artemis.log, broker.xml, connection_nonexistent.png, 
> consumer_list_for_one_queue.png, duplicated consumers.png, 
> multiple_consumers_per_queue.png, session_with_connection_id.png, three 
> consumers per queue.png
>
>
> h3. Summary
> Since upgrading our deployed Artemis instances from version 2.6.4 to 
> 2.10.1, we have noticed that a connection failure sometimes does not 
> clean up the sessions attached to it, leaving "zombie" 
> consumers and producers on queues.
>  
> h3. The issue
> Our Artemis clients connect to the broker through the provided JMS 
> abstraction, using the default connection TTL of 60 seconds. We use 
> both JMS Topics and JMS Queues.
> Most of our clients are mobile and on WiFi, so connection losses can occur 
> frequently, depending on the quality of the network. When a client is 
> disconnected for 60 seconds, the broker usually closes the connection and 
> cleans up all the sessions attached to it. The mobile clients then 
> reconnect when they are online again. We have noticed that after many 
> connection failures, messages may be sent twice to the mobile clients. 
> When analyzing the problem in the broker console, we found 
> two consumers connected to each of the queues that one mobile client usually 
> consumes from. One of them belonged to the new connection of the mobile 
> client, which is fine.
> The other consumer belonged to a session whose connection had already failed 
> and been closed by that time. When analyzing the logs, we saw that for these 
> connections, the log contained a "Connection failure to ... has been detected" 
> line, but no subsequent "clearing up resources for session ..." lines.
>  
> h3. Instance of the issue
>  
> The broken session is "7a9292cb-xxx" in the picture. In the logs you can 
> see that the connection failure was detected, but the broker never 
> cleaned up the session (note the timestamps).
> !duplicated consumers.png!
> {code:java}
> [WARN 2020-07-27 14:33:29,794  Thread-13  
> org.apache.activemq.artemis.core.client]: AMQ212037: Connection failure to 
> /10.255.0.2:54812 has been detected: syscall:read(..) failed: Connection 
> reset by peer [code=GENERIC_EXCEPTION]
> [WARN 2020-07-29 09:31:30,828 Thread-20   
> org.apache.activemq.artemis.core.client]: AMQ212037: Connection failure to 
> /10.255.0.2:55994 has been detected: AMQ229014: Did not receive data from 
> /10.255.0.2:55994 within the 60,000ms connection TTL. The connection will now 
> be closed. [code=CONNECTION_TIMEDOUT]
> {code}
>  
> Attached you can find the full [^artemis.log] and our [^broker.xml]
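
The log symptom described above (a "Connection failure to ... has been detected" line with no "clearing up resources for session ..." line near it) can be searched for mechanically. The following is a minimal sketch, not part of the original report. It assumes that each log entry sits on a single line in the attached artemis.log, that entries start with "[LEVEL yyyy-MM-dd HH:mm:ss,SSS" as in the excerpt, and that a cleanup line logged within a few seconds of a failure belongs to that failure. The function name and the 5-second window are hypothetical choices for illustration.

```python
import re
import sys
from datetime import datetime, timedelta

# Entry prefix as seen in the excerpt: "[WARN 2020-07-27 14:33:29,794  Thread-13 ..."
ENTRY = re.compile(r"^\[[A-Z]+\s+(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s")
# Failure line text quoted in the report; group 1 is the remote address.
FAILURE = re.compile(r"Connection failure to (\S+) has been detected")
# Cleanup line text quoted in the report (matched as a plain substring).
CLEANUP = "clearing up resources for session"


def failures_without_cleanup(lines, window=timedelta(seconds=5)):
    """Return (timestamp, remote_address) for every connection failure that
    has no cleanup line within `window` of it.

    Heuristic only: cleanup lines are matched by time, not by connection,
    so failures that happen close together can mask each other.
    """
    failures, cleanups = [], []
    for line in lines:
        entry = ENTRY.match(line)
        if not entry:
            continue  # skip lines that do not start a log entry
        ts = datetime.strptime(entry.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        hit = FAILURE.search(line)
        if hit:
            failures.append((ts, hit.group(1)))
        elif CLEANUP in line:
            cleanups.append(ts)
    return [
        (ts, addr)
        for ts, addr in failures
        if not any(abs(c - ts) <= window for c in cleanups)
    ]


if __name__ == "__main__":
    # Usage: python find_uncleaned.py artemis.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for ts, addr in failures_without_cleanup(log):
            print(f"{ts}  {addr}  no session cleanup logged nearby")
```

Because the match is by time rather than by connection ID, any hit should be cross-checked against the sessions shown in the broker console before it is treated as a zombie session.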



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
