[jira] [Work logged] (ARTEMIS-4476) Connection Failure Race Conditions in AMQP and Core

ASF GitHub Bot (Jira) Thu, 30 Nov 2023 03:25:53 -0800


     [ 
https://issues.apache.org/jira/browse/ARTEMIS-4476?focusedWorklogId=893144&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-893144
 ]


ASF GitHub Bot logged work on ARTEMIS-4476:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Nov/23 11:24
            Start Date: 30/Nov/23 11:24
    Worklog Time Spent: 10m 
      Work Description: gtully commented on code in PR #4694:
URL: https://github.com/apache/activemq-artemis/pull/4694#discussion_r1410534274


##########
artemis-protocols/artemis-openwire-protocol/src/main/java/org/apache/activemq/artemis/core/protocol/openwire/OpenWireConnection.java:
##########
@@ -761,7 +761,11 @@ public void fail(ActiveMQException me, String message) {
 
       final ThresholdActor<Command> localVisibleActor = openWireActor;
       if (localVisibleActor != null) {
-         localVisibleActor.shutdown(() -> doFail(me, message));
+         localVisibleActor.requestShutdown();
+      }
+
+      if (executor != null) {
+         executor.execute(() -> doFail(me, message));

Review Comment:
   I don't follow. The point is to terminate command processing and execute 
doFail as the last/next task. The only call to fail should come from the 
netty socket handler when it sees a socket error, a remote close, etc. It is 
the transport initiating a close on a socket error.
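
The ordering described in the comment above can be sketched in isolation. 
This is a minimal illustration under stated assumptions, not the Artemis 
ThresholdActor: SketchActor, act, shutdown(Runnable), awaitLog and 
ShutdownSketch are hypothetical names, and a single-threaded executor stands 
in for whatever ordering the real actor provides. It does not model the 
separate executor used in the replacement lines of the diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for illustration; NOT the Artemis ThresholdActor.
final class SketchActor {
   // Single thread, so tasks run one at a time in submission order.
   private final ExecutorService executor = Executors.newSingleThreadExecutor();
   private volatile boolean shutdownRequested;

   // Written only by the executor thread; read by callers after awaitLog().
   final List<String> log = new ArrayList<>();

   // Queue a command. Commands that have not started when shutdown is
   // requested are skipped instead of processed.
   void act(String command) {
      executor.execute(() -> {
         if (!shutdownRequested) {
            log.add(command);
         }
      });
   }

   // Stop command processing and queue the callback on the same executor,
   // so it runs after any command that is already in flight.
   void shutdown(Runnable onShutdown) {
      shutdownRequested = true;
      executor.execute(onShutdown);
   }

   // Let the queued tasks finish, then return the log.
   List<String> awaitLog() {
      executor.shutdown();
      try {
         if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("executor did not terminate");
         }
      } catch (InterruptedException e) {
         Thread.currentThread().interrupt();
         throw new IllegalStateException(e);
      }
      return log;
   }
}

final class ShutdownSketch {
   public static void main(String[] args) {
      SketchActor actor = new SketchActor();
      actor.act("command-1");
      actor.act("command-2");
      // Stands in for the doFail(me, message) callback in the diff.
      actor.shutdown(() -> actor.log.add("doFail"));
      actor.act("command-3"); // arrives after shutdown, never processed
      System.out.println(actor.awaitLog());
   }
}
```

How many of the earlier commands appear in the log depends on timing, but in 
this sketch the callback is always the final entry and a command submitted 
after shutdown is never processed.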





Issue Time Tracking
-------------------

    Worklog Id:     (was: 893144)
    Time Spent: 5h 20m  (was: 5h 10m)

> Connection Failure Race Conditions in AMQP and Core
> ---------------------------------------------------
>
>                 Key: ARTEMIS-4476
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-4476
>             Project: ActiveMQ Artemis
>          Issue Type: Task
>            Reporter: Clebert Suconic
>            Assignee: Clebert Suconic
>            Priority: Major
>          Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Failure detection can race with the processing of client packets (or 
> frames, in the case of AMQP).
> This is because Netty detects the failure and removes the connection objects 
> while packets are still being processed. 
> I was not able to reproduce this particular issue, but I have seen a case 
> in a memory dump where a consumer was created after the connection had 
> already been dropped, leaving the consumer isolated without any 
> communication with clients.
> I can see how these races would make that particular case possible.
> I am adding stress tests that exercise connection failure, and with them I 
> was able to reproduce other issues.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
