[
https://issues.apache.org/jira/browse/SOLR-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055737#comment-13055737
]
Martin Grotzke commented on SOLR-2583:
--------------------------------------
bq. Looking at your test, I think it is reasonable. But I'd like to use
CompactByteArray. I saw it wins over HashMap and float[] when 5% and above in
my test.
Can you share your test code or something similar? Perhaps you can just fork
https://github.com/magro/lucene-solr/ and add an appropriate test that reflects
your data?
> Make external scoring more efficient (ExternalFileField, FileFloatSource)
> -------------------------------------------------------------------------
>
> Key: SOLR-2583
> URL: https://issues.apache.org/jira/browse/SOLR-2583
> Project: Solr
> Issue Type: Improvement
> Components: search
> Reporter: Martin Grotzke
> Priority: Minor
> Attachments: FileFloatSource.java.patch, patch.txt
>
>
> External scoring uses a lot of memory, depending on the number of documents in
> the index. The ExternalFileField (used for external scoring) uses
> FileFloatSource, where one FileFloatSource is created per external scoring
> file. FileFloatSource creates a float array sized to the number of docs
> (this is also done if the file to load is not found). If the scoring file has
> far fewer entries than the index has docs, the big float array wastes a lot
> of memory.
> This could be optimized by using a map of doc -> score, so that the map
> contains as many entries as there are scoring entries in the external file,
> but not more.
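
To make the proposal in the description concrete, here is a minimal sketch of the two layouts: a dense float[] sized to the number of docs, and a sparse doc -> score map. The class and method names (SparseScoreSketch, dense, sparseLookup) and the default-value parameter are hypothetical and chosen for illustration; this is not the code in FileFloatSource or in the attached patches.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not Solr's FileFloatSource. */
public class SparseScoreSketch {

    /** Dense layout: one float per document, even for docs with no entry in the file. */
    static float[] dense(int maxDoc, Map<Integer, Float> entries, float defVal) {
        float[] vals = new float[maxDoc];
        Arrays.fill(vals, defVal);
        for (Map.Entry<Integer, Float> e : entries.entrySet()) {
            vals[e.getKey()] = e.getValue();
        }
        return vals;
    }

    /** Sparse layout: only docs present in the scoring file are stored. */
    static float sparseLookup(Map<Integer, Float> entries, int doc, float defVal) {
        Float v = entries.get(doc);
        return v == null ? defVal : v;
    }

    public static void main(String[] args) {
        int maxDoc = 1_000_000;                 // assumed index size
        Map<Integer, Float> entries = new HashMap<>();
        entries.put(3, 1.5f);                   // assumed scoring-file entries
        entries.put(42, 0.25f);

        float[] denseVals = dense(maxDoc, entries, 0f);
        System.out.println("dense slots:    " + denseVals.length);
        System.out.println("sparse entries: " + entries.size());
        System.out.println("score(42) = " + sparseLookup(entries, 42, 0f));
        System.out.println("score(7)  = " + sparseLookup(entries, 7, 0f));
    }
}
```

Note that a HashMap with boxed keys and values carries per-entry overhead well above the 4 bytes of an array slot, so the map only saves memory when the scoring file covers a small fraction of the docs.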
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira