[
https://issues.apache.org/jira/browse/SOLR-17775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17955566#comment-17955566
]
David Smiley commented on SOLR-17775:
-------------------------------------
In practice, how much of an improvement are you seeing from this optimization?
I would not have guessed it to be considerable. I suppose it saves on
ValueSource.getValues calls, making them per segment instead of per doc. Note
that caching the values will use some memory, albeit maybe not much if the
ValueSource is just loading a number, say.

Perhaps it would make more sense for Solr to process the documents (and thus
all DocTransformers and also field value retrieval) in doc ID order, and only
then sort them into the desired order? See DocsStreamer.

I could see doing this with a chunking strategy to cap the risk of using too
much memory.
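
To make the chunking idea concrete, here is a minimal sketch in plain Java. It
does not use Solr or Lucene classes; ChunkedDocOrder, loadInChunks and the
loader callback are hypothetical names standing in for whatever DocsStreamer
and the DocTransformers would actually do per document. Each chunk of the
ranked result is visited in ascending doc ID order, and the values are then
emitted in the original ranked order, so only one chunk of values is buffered
at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntFunction;

public class ChunkedDocOrder {

    /**
     * Loads one value per doc ID. Within each chunk, docs are visited in
     * ascending doc ID order; results are returned in the original ranked order.
     * The loader is a hypothetical stand-in for per-document work.
     */
    @SuppressWarnings("unchecked")
    public static <T> List<T> loadInChunks(int[] rankedDocIds, int chunkSize,
                                           IntFunction<T> loader) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<T> out = new ArrayList<>(rankedDocIds.length);
        for (int start = 0; start < rankedDocIds.length; start += chunkSize) {
            int end = Math.min(start + chunkSize, rankedDocIds.length);
            int len = end - start;

            // Positions within the ranked list, sorted by the doc ID they point to.
            Integer[] positions = new Integer[len];
            for (int i = 0; i < len; i++) {
                positions[i] = start + i;
            }
            Arrays.sort(positions, Comparator.comparingInt(p -> rankedDocIds[p]));

            // Visit docs in doc ID order, storing each value at its ranked slot.
            Object[] values = new Object[len];
            for (Integer p : positions) {
                values[p - start] = loader.apply(rankedDocIds[p]);
            }

            // Emit the chunk in the original ranked order.
            for (Object v : values) {
                out.add((T) v);
            }
        }
        return out;
    }
}
```

The chunk size bounds the number of buffered values, at the cost of restarting
the doc ID sweep once per chunk.
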
> Optimize ValueSourceAugmenter
> -----------------------------
>
> Key: SOLR-17775
> URL: https://issues.apache.org/jira/browse/SOLR-17775
> Project: Solr
> Issue Type: Improvement
> Components: search
> Reporter: Yura
> Priority: Minor
>
> h3. Problem
> ValueSourceAugmenter currently calculates function values on-demand during
> transform(), performing expensive binary searches and reader lookups for each
> document individually.
> h3. Solution
> Pre-calculate function values for all result set documents during
> setContext() by:
> * Collecting and sorting document IDs from DocList
> * Sequential iteration through sorted documents to calculate values once per
> reader segment
> * Storing results in hash map for O(1) lookup during transform()
> * Fallback to on-demand calculation for documents outside the pre-calculated
> set (RTG cases)
> h3. Performance Benefit
> Replaces repeated "find document at position N" operations (binary search per
> document) with efficient "get next document" iteration (sequential processing
> within reader segments), significantly reducing lookup overhead.
> h3. Compatibility
> Maintains full backward compatibility through fallback mechanism for edge
> cases.
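
The lookup difference described under "Performance Benefit" in the quoted
issue can be sketched without Lucene. In the sketch below, docBases is an
assumed array holding the first global doc ID of each segment (ascending, no
empty segments), and SegmentLookup and its methods are hypothetical names, not
Solr or Lucene API. The first method resolves a segment with a binary search
per document; the second resolves segments for doc IDs that are already
sorted by advancing a single cursor.

```java
import java.util.Arrays;

public class SegmentLookup {

    /**
     * Per-document lookup: binary search over segment start offsets.
     * Assumes docBases is ascending with distinct values and docBases[0] == 0.
     */
    public static int segmentByBinarySearch(int[] docBases, int docId) {
        int idx = Arrays.binarySearch(docBases, docId);
        // On a miss, binarySearch returns -(insertionPoint) - 1; the owning
        // segment is the one just before the insertion point.
        return idx >= 0 ? idx : -idx - 2;
    }

    /**
     * Sequential lookup: sortedDocIds must be ascending, so the segment
     * cursor only ever moves forward.
     */
    public static int[] segmentsSequential(int[] docBases, int[] sortedDocIds) {
        int[] result = new int[sortedDocIds.length];
        int seg = 0;
        for (int i = 0; i < sortedDocIds.length; i++) {
            // Advance while the next segment starts at or before this doc.
            while (seg + 1 < docBases.length && docBases[seg + 1] <= sortedDocIds[i]) {
                seg++;
            }
            result[i] = seg;
        }
        return result;
    }
}
```

Both methods return the same segment indexes; the sequential form costs one
pass over the documents plus one pass over the segments, which is also where
any per-segment setup would be done once rather than once per document.
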