[
https://issues.apache.org/jira/browse/SOLR-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055437#comment-13055437
]
Koji Sekiguchi commented on SOLR-2583:
--------------------------------------
I'd like the feature as I'm using ExternalFileField a lot!
bq. what do you say regarding the suggestion to use HashMap up to ~5.5% and
above that using the float[]?
Looking at your test, I think it is reasonable. But I'd like to use
CompactByteArray: in my test it beat both HashMap and float[] at 5% and
above.
How about introducing compact=yes (the default would be no, in which case
float[] is used) together with sparse=yes/no/auto?
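
To make the sparse=yes/no/auto idea concrete, here is a rough sketch of how
the storage could be picked. Everything in it is hypothetical: the class,
enum and method names are made up for illustration and are not part of Solr
or of the attached patches, and the 5.5% cutoff is simply the figure quoted
above. The compact=yes / CompactByteArray variant is not modeled.

```java
// Hypothetical helper for illustration only; not part of Solr.
final class ScoreStorageChooser {

    enum SparseMode { YES, NO, AUTO }

    enum Storage { DENSE_FLOAT_ARRAY, SPARSE_MAP }

    // Assumed crossover point, taken from the ~5.5% figure in this thread.
    static final double AUTO_THRESHOLD = 0.055;

    /**
     * Picks a storage type from the configured mode and, for AUTO, from the
     * ratio of scoring entries to documents in the index.
     */
    static Storage choose(SparseMode mode, int numEntries, int maxDoc) {
        switch (mode) {
            case YES:
                return Storage.SPARSE_MAP;
            case NO:
                return Storage.DENSE_FLOAT_ARRAY;
            default:
                if (maxDoc <= 0) {
                    // Empty index: nothing to allocate densely.
                    return Storage.SPARSE_MAP;
                }
                double fill = (double) numEntries / maxDoc;
                return fill < AUTO_THRESHOLD
                        ? Storage.SPARSE_MAP
                        : Storage.DENSE_FLOAT_ARRAY;
        }
    }

    public static void main(String[] args) {
        // 1% of docs have a score: below the assumed cutoff.
        System.out.println(choose(SparseMode.AUTO, 10_000, 1_000_000));
        // 10% of docs have a score: above the assumed cutoff.
        System.out.println(choose(SparseMode.AUTO, 100_000, 1_000_000));
    }
}
```

With sparse=auto the decision would only need the entry count and the doc
count, both of which are known once the external file has been parsed.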
> Make external scoring more efficient (ExternalFileField, FileFloatSource)
> -------------------------------------------------------------------------
>
> Key: SOLR-2583
> URL: https://issues.apache.org/jira/browse/SOLR-2583
> Project: Solr
> Issue Type: Improvement
> Components: search
> Reporter: Martin Grotzke
> Priority: Minor
> Attachments: FileFloatSource.java.patch, patch.txt
>
>
> External scoring uses a lot of memory, depending on the number of documents
> in the index. The ExternalFileField (used for external scoring) uses
> FileFloatSource, where one FileFloatSource is created per external scoring
> file. FileFloatSource creates a float array sized to the number of docs
> (this is done even if the file to load is not found). If the scoring file
> has far fewer entries than the index has docs, the big float array wastes
> a lot of memory.
> This could be optimized by using a map of doc -> score, so that the map
> contains as many entries as there are scoring entries in the external file,
> but not more.
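
The doc -> score map described in the issue could look roughly like the
sketch below. It is an assumption-laden illustration, not the code from the
attached patches: SparseScores and its methods are hypothetical names, and
the default value for docs without an entry is passed in by the caller. For
comparison, a dense float[] costs 4 bytes per document no matter how many
entries the file has, so about 4 MB for one million docs, while a HashMap
with boxed keys and values costs considerably more than 4 bytes per entry,
which is why it only pays off when few docs have a score.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sparse score holder for illustration only; not part of Solr.
final class SparseScores {

    // Holds one entry per line of the external file, not one per doc.
    private final Map<Integer, Float> scores = new HashMap<>();

    // Returned for every doc that has no entry in the external file.
    private final float defaultValue;

    SparseScores(float defaultValue) {
        this.defaultValue = defaultValue;
    }

    void put(int doc, float score) {
        scores.put(doc, score);
    }

    float get(int doc) {
        Float score = scores.get(doc);
        return score != null ? score : defaultValue;
    }

    int size() {
        return scores.size();
    }

    public static void main(String[] args) {
        SparseScores sparse = new SparseScores(0.0f);
        sparse.put(42, 1.5f);
        System.out.println(sparse.get(42)); // doc with an entry
        System.out.println(sparse.get(7));  // doc without an entry gets the default
        System.out.println(sparse.size());  // number of stored entries
    }
}
```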