mateczagany commented on code in PR #27407:
URL: https://github.com/apache/flink/pull/27407#discussion_r2693277077


##########
flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/AbstractRestHandler.java:
##########
@@ -85,14 +85,15 @@ protected CompletableFuture<Void> respondToRequest(
             response = FutureUtils.completedExceptionally(e);
         }
 
-        return response.thenAccept(
+        return response.thenAcceptAsync(
                 resp ->
                         HandlerUtils.sendResponse(
                                 ctx,
                                 httpRequest,
                                 resp,
                                 messageHeaders.getResponseStatusCode(),
-                                responseHeaders));
+                                responseHeaders),
+                ctx.executor());

Review Comment:
   <del>@snuyanzin the issue was this stuck thread:</del>
   ```
   Jan 14 21:15:14      at 
org.apache.flink.core.testutils.BlockerSync.awaitBlocker(BlockerSync.java:59)
   Jan 14 21:15:14      - locked <0x00000000aa141c48> (a java.lang.Object)
   Jan 14 21:15:14      at 
org.apache.flink.runtime.rest.RestServerEndpointITCase.testShouldWaitForHandlersWhenClosing(RestServerEndpointITCase.java:597)
   ```
   
   <del>Something has changed internally in Netty, and this test consistently 
times out on Azure CI. The issue arises when the response is not written from 
the event loop, and Netty enqueues the write operations on the event loop 
instead, and then we close the REST server itself (which is the point of the 
test), the writes don't always go through properly.</del>
   
   <del>By running the write operations on the event loop, this issue goes 
away. I think it's better to do it this way anyways, so that Netty does not 
have to create 3 write tasks to eventually execute on the event loop 
thread.</del>
   
   
   Nevermind, unfortunately this only solved the issue on my side own computer, 
but the CI still failed at the exact same steps. I'll continue digging...



##########
flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/AbstractRestHandler.java:
##########
@@ -85,14 +85,15 @@ protected CompletableFuture<Void> respondToRequest(
             response = FutureUtils.completedExceptionally(e);
         }
 
-        return response.thenAccept(
+        return response.thenAcceptAsync(
                 resp ->
                         HandlerUtils.sendResponse(
                                 ctx,
                                 httpRequest,
                                 resp,
                                 messageHeaders.getResponseStatusCode(),
-                                responseHeaders));
+                                responseHeaders),
+                ctx.executor());

Review Comment:
   <del>@snuyanzin the issue was this stuck thread:</del>
   ```
   Jan 14 21:15:14      at 
org.apache.flink.core.testutils.BlockerSync.awaitBlocker(BlockerSync.java:59)
   Jan 14 21:15:14      - locked <0x00000000aa141c48> (a java.lang.Object)
   Jan 14 21:15:14      at 
org.apache.flink.runtime.rest.RestServerEndpointITCase.testShouldWaitForHandlersWhenClosing(RestServerEndpointITCase.java:597)
   ```
   
   <del>Something has changed internally in Netty, and this test consistently 
times out on Azure CI. The issue arises when the response is not written from 
the event loop, and Netty enqueues the write operations on the event loop 
instead, and then we close the REST server itself (which is the point of the 
test), the writes don't always go through properly.</del>
   
   <del>By running the write operations on the event loop, this issue goes 
away. I think it's better to do it this way anyways, so that Netty does not 
have to create 3 write tasks to eventually execute on the event loop 
thread.</del>
   
   
   Nevermind, unfortunately this only solved the issue on my own computer, but 
the CI still failed at the exact same steps. I'll continue digging...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to