For timeAllowed, the proper design would seem to be to establish a "must
finish by" time at initial query receipt and then pass that to the shards.
The fly in that ointment is clock skew, so one probably also needs to send
along the time at which the sub-request was made, which the recipient can
compare to its own clock. This is similar to discounting, except that
network transit time is implicitly "free" (untimed). The advantages are
that in a cluster where nodes run on machines with trustworthy clocks the
skew check can be skipped entirely, and that the skew-checking code can be
kept independent of all other request-processing code.
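A minimal sketch of the coordinator side, with caveats: the parameter
names (finishBefore, clockSkewCheck, mitigateClockSkew) are the ones
proposed in the worked example below, not existing Solr parameters, and
plain System.currentTimeMillis() stands in for however Solr would actually
read the clock:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: parameter names follow the proposal in this thread and
// are hypothetical; the real Solr shard-request API is not shown.
public class FinishBeforeCalculator {

    /** Builds the timing params the coordinator would attach to a sub-request. */
    public static Map<String, String> subRequestParams(
            long receiptMillis, long timeAllowedMillis, boolean mitigateClockSkew) {
        Map<String, String> params = new HashMap<>();
        // Absolute "must finish by" deadline, anchored at initial query receipt.
        params.put("finishBefore", Long.toString(receiptMillis + timeAllowedMillis));
        if (mitigateClockSkew) {
            // Also send our current clock reading so the recipient can measure
            // skew; time already spent at the coordinator rides along "free".
            params.put("clockSkewCheck", Long.toString(System.currentTimeMillis()));
        }
        return params;
    }

    public static void main(String[] args) {
        long receipt = 1758575850605L; // coordinator clock at request receipt
        Map<String, String> p = subRequestParams(receipt, 10_000L, true);
        System.out.println(p.get("finishBefore")); // 1758575860605
    }
}
```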
Main request: timeAllowed=10000
Coordinator clock at request receipt: 1758575850605
Subrequest: finishBefore=1758575860605

Optionally on the initial request: mitigateClockSkew=true (this could also
default to true if we like), which then causes the subrequest to also
contain: clockSkewCheck=1758575850673 (assuming it took 68 ms to decide to
send the sub-request).

If clockSkewCheck is sent, the receiving node checks the supplied value
against its local clock and applies the (positive or negative) difference
to the finishBefore value. That way all the logic for checking the limit is
simple and consistent (just compare to the current time), systems that have
synchronized clocks can benefit from it, and most importantly, the skew
adjustment code can be a very simple HttpServletFilter that doesn't need to
be worked into any of the existing code paths. Then we could also allow the
user to send the finishBefore value to the coordinator, if the original
requesting system also has a reliably synchronized clock.

On Mon, Sep 22, 2025 at 10:21 AM David Smiley <[email protected]> wrote:

> I think it definitely makes sense to discount timeAllowed for the shard
> requests. I was thinking about this and related matters recently at my
> last job. I recall thinking that a possible solution is to use the NOW
> passed along from coordinator to shard as the epoch to base the time
> allowance on. It's captured early by the coordinator. On the other hand,
> it could be considered a hack / abuse of the intent of NOW. Maybe a
> separate parameter (or HTTP header) should be used to pass this along.
>
> I don't think it's worth bothering doing this for memory or CPU limits.
> But I'm not against it.
>
> On Mon, Sep 22, 2025 at 10:11 AM Andrzej Białecki <[email protected]> wrote:
>
> > Hi devs,
> >
> > While working on SOLR-17869 two things occurred to me:
> >
> > 1. Discounting the `timeAllowed`
> >
> > When `timeAllowed` is > 0, `TopGroupsShardRequestFactory` discounts the
> > time already spent at the query coordinator when creating shard
> > requests. This happens ONLY for the grouping queries; for all other
> > queries `timeAllowed` is passed as-is to shard requests.
> >
> > The net effect is that for other query types the same value of
> > `timeAllowed` is applied to the processing at the coordinator as well
> > as to the per-shard local processing. Since `timeAllowed` is a
> > wall-clock time relative to the time of the local request, not
> > discounting the time already spent increases the likelihood of the
> > query timing out at the coordinator (because even if the shard requests
> > all fit within the `timeAllowed` *as started at each shard*, it still
> > takes some time to return the results and post-process them at the
> > coordinator, where `timeAllowed` was started much earlier).
> >
> > From this POV it seems that `timeAllowed` should always be discounted
> > before sending shard requests, for all types of queries, which is not
> > the case today.
> >
> > Another question comes to mind, too: should we discount other types of
> > query limits as well? For other limit types the same "budget" is
> > applied both at the coordinator and locally per-shard. However, CPU
> > and memory limits are not tied to an absolute point in time, so maybe
> > it doesn't make sense to discount them by the amount spent at the
> > coordinator, unless we want to limit the "end-to-end" impact of a
> > query across both the local and distrib sub-requests. WDYT?
> >
> > 2. QueryCommand.timeAllowed
> >
> > During the upgrade to Lucene 10 a few blocks of code were commented
> > out in SolrIndexSearcher, and one of them was this
> > (SolrIndexSearcher:309):
> >
> > /* final long timeAllowed = cmd.getTimeAllowed();
> >    if (timeAllowed > 0) {
> >      setTimeout(new QueryTimeoutImpl(timeAllowed));
> >    } */
> >
> > I was surprised that none of the tests failed with this change, and
> > started digging. QueryCommand.timeAllowed is initialized from request
> > params. But it looks like this property is no longer needed, because
> > now for each request we initialize the QueryLimits from the same
> > request params, which serve the same purpose.
> >
> > What's more, I think the commented-out section was a bug, because it
> > would call `IndexSearcher.setTimeout` on a searcher instance that was
> > used for multiple queries, and that setting would persist even for
> > other subsequent requests.
> >
> > I propose to remove `QueryCommand.timeAllowed` altogether.
> >
> > —
> >
> > Andrzej Białecki
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> >

--
http://www.needhamsoftware.com (work)
https://a.co/d/b2sZLD9 (my fantasy fiction book)
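As a concrete illustration of the skew adjustment described at the top of
this message, here is a sketch of the arithmetic the proposed
HttpServletFilter would apply on the receiving node. The class and method
names are hypothetical, and the numbers reuse the worked example above
with an assumed local clock 250 ms ahead of the coordinator's:

```java
// Sketch of the receiver-side skew adjustment; not actual Solr code.
public class ClockSkewAdjust {

    /** Shifts the sender's absolute deadline into the local clock's frame. */
    static long adjustedFinishBefore(long finishBefore, long clockSkewCheck, long localNow) {
        // (localNow - clockSkewCheck) is the clock skew plus network transit
        // time; letting transit ride along "free" is the intended behaviour.
        // Downstream limit checks then just compare against the local clock.
        return finishBefore + (localNow - clockSkewCheck);
    }

    public static void main(String[] args) {
        // Sender's deadline and clock sample; local clock reads 250 ms more
        // than clockSkewCheck, so the deadline shifts forward by 250 ms.
        long adjusted = adjustedFinishBefore(1758575860605L, 1758575850673L, 1758575850923L);
        System.out.println(adjusted); // 1758575860855
    }
}
```

A negative difference works the same way: if the local clock is behind the
sender's, the deadline is pulled earlier by the same amount.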
