Re: timeAllowed discounting and QueryLimits

Andrzej Białecki Tue, 23 Sep 2025 03:38:31 -0700

Mark,

You make many excellent points, I’d like to see some of your changes in main. 
Are you planning to work on merging them? If not then I can at least create a 
bunch of tickets so that the diffs are easier to process.


Below I’ll comment mostly on the timeAllowed parts:

> On 23 Sep 2025, at 01:11, Mark Miller <[email protected]> wrote:
> 
> I forgot to post a message about it, but FYI, I pushed a Lucene 10 upgrade
> branch a while back as well.
> 
> I don't claim it is 100% correct, I only spent a day on it, but all tests
> do pass.
> 
> I also tried to identify where it differed from the upgrade that is in main
> (at the time that upgrade work went in). Handling timeouts was one of the
> differences. Here is a list I put together of all of the differences:
> 
> 1) Timeouts: enforced via IndexSearcher.setTimeout + timedOut (Lucene 10
> idiom)
> 
>   - My branch sets an IndexSearcher timeout using a thread local strategy
>   (wired to QueryLimits) for timeAllowed and component searches (e.g.,
>   Grouping/CommandHandler), and checks timedOut() to flag partial results
>   consistently. The Lucene‑10 partition-based search path is also used so
>   QueryTimeout is consulted during scoring.
>   - In main, the setTimeout call in the main search path is commented out,
>   and the Lucene‑10 partition-based override is disabled; it instead relies
>   on ExitableDirectoryReader if a system property is set. Ramification:
>   timeAllowed is frequently inert, many queries won’t be interrupted inside
>   Lucene, and partial results are not consistently flagged unless
>   ExitableDirectoryReader is explicitly enabled.
> 

As of last week :) that’s no longer the case, I reinstated the setTimeout using 
the same trick as before, i.e. creating a new short-lived instance of 
IndexSearcher using the same IndexReader, which is (presumably) inexpensive, 
and using setTimeout on that instance.

As to the other change: using CollectorManager is something we should 
definitely do - in a separate ticket. 

> 
> 2) Partial results handling in field sort values (FSV)
> 
>   - My branch handles FSV timeouts in place: doFieldSortValues catches
>   ExitableDirectoryReader.ExitingReaderException, logs the issue, marks the
>   response partial via rsp.setPartialResults(req), and returns an empty
>   sort_values block while letting the rest of the request finish
>   
> (solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java:569-575).

> 
>   - /home/markmiller/solr-main does not set the flag locally; it rethrows
>   the same exception and relies on SearchHandler.shortCircuitedResults to
>   intercept it later and short-circuit the response
>   
> (external/solr-main/solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java:575-579,
>   then SearchHandler.java:526-529 & 761-779). Ramification: here the
>   timeout is contained to the FSV sub-phase (other components still run) and
>   the partial-results status is flagged exactly where the failure occurs,
>   versus invoking the global short-circuit path and aborting remaining work.
> 


+1 makes sense to change this.

> 
> 3) PathHierarchyTokenizer compatibility preserved
> 
>   - My branch adds targeted handling in TextField.parseFieldQuery for
>   PathHierarchyTokenizer’s Lucene 10 positional change so hierarchical path
>   queries continue to work as before.
>   - main does not include this fix. Ramification: hierarchical search can
>   break (phrase-style matching fails), whereas it works here.
> 
> 
> 4) PreAnalyzedField policy
> 
>   - This repo retains and adapts PreAnalyzedField and
>   PreAnalyzedUpdateProcessor to Lucene 10 (createFields + custom Field
>   subclass).
>   - main removes these features. Ramification: users can no longer use
>   pre‑analyzed input on main; it continue to work in my branch.


As the original author of this field type I was sad to see it go … but it 
seemed that very few people used it anyway.



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: timeAllowed discounting and QueryLimits

Reply via email to