[ 
https://issues.apache.org/jira/browse/MESOS-7742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334227#comment-16334227
 ] 

Andrei Budnik commented on MESOS-7742:
--------------------------------------

https://reviews.apache.org/r/65261/

I think this patch provides a better solution than retrying to 
[connect|https://github.com/apache/mesos/blob/336e932199643e88c0edbea7c1f08d4b45596389/src/slave/containerizer/mesos/io/switchboard.cpp#L696-L700],
because a retry-based approach would need to:
# Use one more `loop` for the retry logic
# Define a limit on retry attempts and a delay between attempts
# Take care not to retry the connection on errors other than ECONNREFUSED
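
To make the trade-off concrete, here is a rough sketch of what the retry-based alternative would involve. It is not Mesos code: `connectWithRetry`, `RetryResult`, and the `tryConnect` callback (which returns 0 on success or an errno value) are hypothetical names used only for illustration, and the sketch uses a plain blocking loop rather than libprocess's `loop`.

```cpp
#include <cerrno>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical result type: the last error seen and how many attempts ran.
struct RetryResult {
  int error;     // 0 on success, otherwise the last errno value returned.
  int attempts;  // Number of calls made to tryConnect.
};

// Hypothetical helper: retries `tryConnect` only while it fails with
// ECONNREFUSED, up to `maxAttempts` times, sleeping `delay` between attempts.
RetryResult connectWithRetry(
    const std::function<int()>& tryConnect,
    int maxAttempts,
    std::chrono::milliseconds delay)
{
  RetryResult result{ECONNREFUSED, 0};

  while (result.attempts < maxAttempts) {
    result.error = tryConnect();
    ++result.attempts;

    // Stop on success, and also on any error other than ECONNREFUSED:
    // retrying those would only hide a real failure.
    if (result.error != ECONNREFUSED) {
      break;
    }

    // Wait before the next attempt, but not after the final one.
    if (result.attempts < maxAttempts) {
      std::this_thread::sleep_for(delay);
    }
  }

  return result;
}
```

Even in this simplified form, the caller has to pick `maxAttempts` and `delay`, and the error check has to be kept correct, which are the three costs listed above.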

> ContentType/AgentAPIStreamingTest.AttachInputToNestedContainerSession is flaky
> ------------------------------------------------------------------------------
>
>                 Key: MESOS-7742
>                 URL: https://issues.apache.org/jira/browse/MESOS-7742
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent
>    Affects Versions: 1.5.0
>            Reporter: Vinod Kone
>            Assignee: Andrei Budnik
>            Priority: Major
>              Labels: flaky-test, mesosphere-oncall
>             Fix For: 1.6.0
>
>         Attachments: AgentAPITest.LaunchNestedContainerSession-badrun.txt, 
> LaunchNestedContainerSessionDisconnected-badrun.txt
>
>
> Observed this on ASF CI and internal Mesosphere CI. Affected tests:
> {noformat}
> AgentAPIStreamingTest.AttachInputToNestedContainerSession
> AgentAPITest.LaunchNestedContainerSession
> AgentAPITest.AttachContainerInputAuthorization/0
> AgentAPITest.LaunchNestedContainerSessionWithTTY/0
> AgentAPITest.LaunchNestedContainerSessionDisconnected/1
> {noformat}
> This issue comes in at least three different flavours. Take 
> {{AgentAPIStreamingTest.AttachInputToNestedContainerSession}} as an example.
> h5. Flavour 1
> {noformat}
> ../../src/tests/api_tests.cpp:6473
> Value of: (response).get().status
>   Actual: "503 Service Unavailable"
> Expected: http::OK().status
> Which is: "200 OK"
>     Body: ""
> {noformat}
> h5. Flavour 2
> {noformat}
> ../../src/tests/api_tests.cpp:6473
> Value of: (response).get().status
>   Actual: "500 Internal Server Error"
> Expected: http::OK().status
> Which is: "200 OK"
>     Body: "Disconnected"
> {noformat}
> h5. Flavour 3
> {noformat}
> /home/ubuntu/workspace/mesos/Mesos_CI-build/FLAG/CMake/label/mesos-ec2-ubuntu-16.04/mesos/src/tests/api_tests.cpp:6367
> Value of: (sessionResponse).get().status
>   Actual: "500 Internal Server Error"
> Expected: http::OK().status
> Which is: "200 OK"
>     Body: ""
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
