[jira] [Commented] (SOLR-17775) Optimize ValueSourceAugmenter

Yura (Jira) Mon, 02 Jun 2025 23:46:06 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-17775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17955820#comment-17955820
 ]


Yura commented on SOLR-17775:
-----------------------------

[~dsmiley], the improvement was quite significant, and it is especially 
noticeable when you request more than 100 rows. I don’t think it adds much 
memory usage: values are kept only for the retrieved document IDs, and that 
set is usually quite small (<1000).

The main gain is not in saving calls to {{getValues}}, but in retrieving 
values in order. Lucene data structures are optimized for iteration, not random 
seek. Internally, this approach uses {{DocIterator}} jumps from doc[N-1] to 
doc[N], instead of jumping from 0 to doc[N]. This is much cheaper.
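
As a rough illustration of why access order matters, here is a toy model of a 
forward-only cursor. It is not Lucene code: {{ForwardCursor}}, {{advance}} and 
the step counter are hypothetical names invented for this sketch, and real 
Lucene iterators are more sophisticated. The model only assumes that a cursor 
cannot move backwards, so a backward target forces a restart from the 
beginning.

```java
import java.util.Arrays;

/** Toy model of a forward-only per-document cursor (hypothetical, not a Lucene class). */
class ForwardCursor {
    private int position = -1; // -1 means "before the first document"
    long steps = 0;            // total number of documents walked over

    /** Moves to target; a target behind the current position forces a restart. */
    void advance(int target) {
        if (target < position) {
            position = -1;     // cannot go backwards, so start over
        }
        steps += target - position;
        position = target;
    }
}

class SortedAccessDemo {
    /** Total steps needed to visit the given doc IDs in the given order. */
    static long cost(int[] docs) {
        ForwardCursor cursor = new ForwardCursor();
        for (int doc : docs) {
            cursor.advance(doc);
        }
        return cursor.steps;
    }

    public static void main(String[] args) {
        int[] resultOrder = {900, 15, 480, 7, 760}; // e.g. ranked by score
        int[] docIdOrder = resultOrder.clone();
        Arrays.sort(docIdOrder);                    // same docs, ascending doc ID
        System.out.println("result order steps: " + cost(resultOrder));
        System.out.println("doc ID order steps: " + cost(docIdOrder));
    }
}
```

With these five IDs the model counts 2143 steps in result order and 901 in 
doc ID order, because the sorted pass never has to restart.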

> Optimize ValueSourceAugmenter
> -----------------------------
>
>                 Key: SOLR-17775
>                 URL: https://issues.apache.org/jira/browse/SOLR-17775
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Yura
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Problem
> ValueSourceAugmenter currently calculates function values on-demand during 
> transform(), performing expensive binary searches and reader lookups for each 
> document individually.
> h3. Solution
> Pre-calculate function values for all result set documents during 
> setContext() by:
>  * Collecting and sorting document IDs from the DocList
>  * Iterating sequentially through the sorted documents to calculate values 
> once per reader segment
>  * Storing the results in a hash map for O(1) lookup during transform()
>  * Falling back to on-demand calculation for documents outside the 
> pre-calculated set (RTG cases)
> h3. Performance Benefit
> Replaces repeated "find document at position N" operations (binary search per 
> document) with efficient "get next document" iteration (sequential processing 
> within reader segments), significantly reducing lookup overhead.
> h3. Compatibility
> Maintains full backward compatibility through a fallback mechanism for edge 
> cases.
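
The steps in the quoted Solution section could be sketched roughly as follows. 
This is a minimal, self-contained model and not the actual patch: 
{{PrecomputedValues}}, {{segmentStarts}}, {{segmentsOpened}} and {{valueFor}} 
are hypothetical names, and an {{IntToDoubleFunction}} stands in for the real 
per-document function evaluation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntToDoubleFunction;

/** Sketch of "pre-calculate in doc ID order, look up later"; all names are hypothetical. */
class PrecomputedValues {
    private final Map<Integer, Double> cache = new HashMap<>();
    private final IntToDoubleFunction function; // stand-in for the function evaluation
    int segmentsOpened = 0;                     // how many segments the sorted pass entered

    /**
     * @param resultDocIds  doc IDs of the result set, in any order
     * @param segmentStarts first doc ID of each segment, ascending, starting at 0
     */
    PrecomputedValues(int[] resultDocIds, int[] segmentStarts, IntToDoubleFunction function) {
        this.function = function;
        int[] sorted = resultDocIds.clone();
        Arrays.sort(sorted);                    // visit documents in index order
        int segment = -1;
        for (int doc : sorted) {
            // Docs are sorted, so the segment index only ever moves forward.
            int target = Math.max(segment, 0);
            while (target + 1 < segmentStarts.length && segmentStarts[target + 1] <= doc) {
                target++;
            }
            if (target != segment) {
                segment = target;
                segmentsOpened++;               // real code would set up per-segment state here
            }
            cache.put(doc, function.applyAsDouble(doc));
        }
    }

    /** O(1) lookup for pre-calculated docs, on-demand fallback for anything else. */
    double valueFor(int doc) {
        Double cached = cache.get(doc);
        return cached != null ? cached : function.applyAsDouble(doc);
    }
}
```

In this model each segment is entered at most once however many result 
documents it holds, and a document that was not in the original list still 
gets a value through the fallback path.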



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

