[ https://issues.apache.org/jira/browse/THRIFT-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876057#comment-17876057 ]
Yuxuan Wang commented on THRIFT-5814:
-------------------------------------

So far I have tried a few approaches, none of which actually fixes the flakiness:
* Replace the TCP connection with a Unix domain socket
* After the client establishes the connection and sleeps for a short period, do a connectivity check to make sure the connection is still good, and if not, retry establishing the client connection
* Disable TCP keep-alive
* Change TCP keep-alive to a much shorter interval

> go: Flaky test TestNoHangDuringStopFromClientNoDataSendDuringAcceptLoop
> -----------------------------------------------------------------------
>
>                 Key: THRIFT-5814
>                 URL: https://issues.apache.org/jira/browse/THRIFT-5814
>             Project: Thrift
>          Issue Type: Task
>          Components: Go - Library
>    Affects Versions: 0.20.0
>            Reporter: Yuxuan Wang
>            Priority: Minor
>
> Currently the
> [TestNoHangDuringStopFromClientNoDataSendDuringAcceptLoop|https://github.com/apache/thrift/blob/cb9ceada554f47aa5ebbedfe3984de0983cf0226/lib/go/thrift/simple_server_test.go#L164]
> test in the Go library can be flaky (it fails in roughly 1 in 100 runs).
> What this test does is roughly:
> # Create a local server listening on a random local port (via localhost:0)
> # Create a TCP client that connects to the server (via net.Dial) but does
> nothing after establishing the connection (so from the server's point of view
> this is an idle client)
> # Try to shut down the server
> # Verify that shutting down the server took at least the configured timeout,
> i.e. the time the server waits before forcefully closing idle client connections
> Step 4 can occasionally (rarely) fail because the server shuts down much faster
> than expected. I did some digging; the reason seems to be that the
> client-server TCP connection is broken after being established (killed by the
> OS or something?).
> So, to fix the flakiness of this test, we need to find a way to keep the
> connection alive until the server kills it.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)