[
https://issues.apache.org/jira/browse/SOLR-17775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17955911#comment-17955911
]
David Smiley commented on SOLR-17775:
-------------------------------------
In recommending changes to DocStreamer, I don't mean to imply leaving
ValueSourceAugmenter as-is: it could cache the last getValues call and
related metadata, and reuse them if the next doc to retrieve is in the same
segment.
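
A minimal sketch of that caching idea, in plain Java with no Lucene or Solr dependencies. Everything here is hypothetical: the class LastSegmentCache, the docBases array (the first global doc id of each segment) and the loader function (a stand-in for the per-segment getValues call) are illustrative names, not existing APIs. Real Lucene values may also be forward-only within a segment, in which case a real cache would probably need docs to arrive in increasing order as well.

```java
import java.util.Arrays;
import java.util.function.IntFunction;

/** Hypothetical illustration only; this is not Solr's ValueSourceAugmenter. */
final class LastSegmentCache<V> {
  private final int[] docBases;        // first global doc id of each segment, ascending, starts at 0
  private final int maxDoc;            // total number of docs across all segments
  private final IntFunction<V> loader; // stand-in for the per-segment getValues call
  private int cachedOrd = -1;          // segment ordinal of the cached values, -1 if none
  private V cachedValues;
  int loads = 0;                       // how many times the loader actually ran

  LastSegmentCache(int[] docBases, int maxDoc, IntFunction<V> loader) {
    this.docBases = docBases;
    this.maxDoc = maxDoc;
    this.loader = loader;
  }

  /** Returns the per-segment values for a global doc id, reusing the last segment when possible. */
  V valuesFor(int globalDoc) {
    if (cachedOrd >= 0 && globalDoc >= docBases[cachedOrd] && globalDoc < endOf(cachedOrd)) {
      return cachedValues; // same segment as the previous doc: no search, no reload
    }
    cachedOrd = segmentOf(globalDoc);
    cachedValues = loader.apply(cachedOrd);
    loads++;
    return cachedValues;
  }

  /** Binary search for the segment whose doc range contains globalDoc. */
  int segmentOf(int globalDoc) {
    int idx = Arrays.binarySearch(docBases, globalDoc);
    return idx >= 0 ? idx : -idx - 2; // not found: insertion point minus one
  }

  private int endOf(int ord) {
    return ord + 1 < docBases.length ? docBases[ord + 1] : maxDoc;
  }

  public static void main(String[] args) {
    LastSegmentCache<String> cache =
        new LastSegmentCache<>(new int[] {0, 10, 25}, 40, ord -> "values-of-segment-" + ord);
    for (int doc : new int[] {3, 7, 12, 11, 30, 2}) {
      System.out.println(doc + " -> " + cache.valuesFor(doc));
    }
    System.out.println("loads: " + cache.loads);
  }
}
```

The cache only pays off when consecutive docs fall in the same segment, which is why it pairs naturally with retrieving docs in doc-id order.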
Fetching stored fields in Lucene isn't an array lookup! It would benefit from
the doc-order approach too!
Batching by 1000 docs (configurable) would seem to be a nice balance between
capping memory needs and benefiting from this optimization.
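
To make the batching idea concrete, here is a rough sketch in plain Java, not actual DocStreamer code. The class DocOrderBatcher, the method fetchInBatches and the fetch callback are hypothetical names. It assumes non-negative doc ids that arrive in rank order, visits each batch in ascending doc-id order, and still returns results in the original rank order. Memory is bounded by the batch size rather than by the size of the result set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical illustration only; this is not Solr's DocStreamer. */
final class DocOrderBatcher {

  /**
   * Calls fetch in ascending doc-id order within each batch,
   * but returns the fetched values in the original (rank) order.
   */
  static <V> List<V> fetchInBatches(int[] rankedDocs, int batchSize, IntFunction<V> fetch) {
    List<V> out = new ArrayList<>(rankedDocs.length);
    for (int start = 0; start < rankedDocs.length; start += batchSize) {
      int n = Math.min(batchSize, rankedDocs.length - start);

      // Pack (docId, position in batch) into one long so a primitive sort orders by doc id.
      long[] keyed = new long[n];
      for (int i = 0; i < n; i++) {
        keyed[i] = ((long) rankedDocs[start + i] << 32) | i;
      }
      Arrays.sort(keyed);

      // Fetch in doc-id order, storing each value at its original position in the batch.
      List<V> batch = new ArrayList<>(Collections.<V>nCopies(n, null));
      for (long k : keyed) {
        int doc = (int) (k >>> 32);
        int pos = (int) k; // low 32 bits hold the position
        batch.set(pos, fetch.apply(doc));
      }
      out.addAll(batch);
    }
    return out;
  }

  public static void main(String[] args) {
    int[] ranked = {50, 3, 27, 8, 41};
    List<String> values = fetchInBatches(ranked, 2, doc -> {
      System.out.println("fetching doc " + doc);
      return "v" + doc;
    });
    System.out.println(values);
  }
}
```

A larger batch gives longer sorted runs, and so more same-segment hits, at the cost of holding more values in memory at once.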
> Optimize ValueSourceAugmenter
> -----------------------------
>
> Key: SOLR-17775
> URL: https://issues.apache.org/jira/browse/SOLR-17775
> Project: Solr
> Issue Type: Improvement
> Components: search
> Reporter: Yura
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> h3. Problem
> ValueSourceAugmenter currently calculates function values on-demand during
> transform(), performing expensive binary searches and reader lookups for each
> document individually.
> h3. Solution
> Pre-calculate function values for all result set documents during
> setContext() by:
> * Collecting and sorting document IDs from DocList
> * Iterating sequentially through the sorted documents to calculate values once
> per reader segment
> * Storing results in a hash map for O(1) lookup during transform()
> * Falling back to on-demand calculation for documents outside the
> pre-calculated set (RTG cases)
> h3. Performance Benefit
> Replaces repeated "find document at position N" operations (binary search per
> document) with efficient "get next document" iteration (sequential processing
> within reader segments), significantly reducing lookup overhead.
> h3. Compatibility
> Maintains full backward compatibility through fallback mechanism for edge
> cases.
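
For reference, the pre-calculation described in the quoted issue (sort the doc ids, compute once, look up from a hash map, fall back on a miss) might look roughly like the following. This is a simplified sketch in plain Java, not the code from the pull request: PrecomputedValues, valueFor and the compute callback are hypothetical names, and the per-segment handling is left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntToDoubleFunction;

/** Hypothetical illustration only; not the actual ValueSourceAugmenter change. */
final class PrecomputedValues {
  private final Map<Integer, Double> byDoc = new HashMap<>();
  private final IntToDoubleFunction compute; // stand-in for evaluating the function on one doc
  int fallbacks = 0;                         // lookups that missed the pre-calculated set

  /** Plays the role of setContext(): compute values for all result docs in doc-id order. */
  PrecomputedValues(int[] resultDocs, IntToDoubleFunction compute) {
    this.compute = compute;
    int[] sorted = resultDocs.clone();
    Arrays.sort(sorted);
    for (int doc : sorted) {
      byDoc.put(doc, compute.applyAsDouble(doc));
    }
  }

  /** Plays the role of transform(): hash lookup first, on-demand computation on a miss. */
  double valueFor(int doc) {
    Double cached = byDoc.get(doc);
    if (cached != null) {
      return cached;
    }
    fallbacks++; // e.g. a doc that was not in the result set when values were pre-calculated
    return compute.applyAsDouble(doc);
  }

  public static void main(String[] args) {
    PrecomputedValues values = new PrecomputedValues(new int[] {9, 2, 5}, doc -> doc * 1.5);
    System.out.println(values.valueFor(5));
    System.out.println(values.valueFor(100));
    System.out.println("fallbacks: " + values.fallbacks);
  }
}
```

Unlike batching, this holds one map entry per result doc for the whole request, so memory grows with the size of the result set.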