Re: [PR] Implement per-segment query timeout on data nodes (druid)

via GitHub Wed, 17 Sep 2025 21:48:22 -0700


jtuglu1 commented on code in PR #18148:
URL: https://github.com/apache/druid/pull/18148#discussion_r2357320740



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/GroupByMergingQueryRunner.java:
##########
@@ -392,6 +403,14 @@ private void waitForFutureCompletion(
     }
     catch (ExecutionException e) {
       GuavaUtils.cancelAll(true, future, futures);
+      Throwable cause = e.getCause();
+      if (cause instanceof TimeoutException || cause instanceof 
QueryTimeoutException) {

Review Comment:
   > I wonder if a simpler approach could be to use the same timeout we already 
have, but shorten it in `SpecificSegmentQueryRunner`.
   
   I'm not sure this will work. The reason being is that while this does 
execute per segment, the timeout is not enforced here. The timeout is 
unfortunately enforced in the caller runners which aggregate lists of 
per-segment runners into callable futures (e.g. `ChainedExecutionQueryRunner` 
or `GroupByMergingQueryRunner`). These callers are in lower stack frames 
whereas the SpecificSegmentQueryRunner is near the top of a segment query 
stacktrace. The thread that is running the SpecificSegmentQueryRunner is the 
same thread that we'd be asking to service the timer interrupt (which won't 
work in Java AFAIK since the futures framework is not truly async). These 
callers rely on the aggregate query timeout to make one mega future with that 
value.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Implement per-segment query timeout on data nodes (druid)

Reply via email to