[ 
https://issues.apache.org/jira/browse/SOLR-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046674#comment-13046674
 ] 

Martin Grotzke commented on SOLR-2583:
--------------------------------------

Yes, you're right regarding non-sparse fields. The question for the user will 
be when to set sparse to true or false. It might also be the case that files 
differ: some are big, others are small. So I'm thinking about making it 
adaptive: when the number of lines reaches a certain percentage of the 
number of docs, the float array is used; otherwise the doc->score map is used. 
Perhaps it would be good to allow the user to override this, with something 
like sparse=yes/no/auto.
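
A rough sketch of the adaptive selection described above, only to make the 
idea concrete. All names here (AdaptiveScores, Sparse, Scores, AUTO_THRESHOLD) 
are hypothetical and not taken from the attached patch, and the 10% cut-off is 
an assumed placeholder that would need to be measured:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not the code from FileFloatSource.java.patch. */
final class AdaptiveScores {

    /** Hypothetical setting mirroring the proposed sparse=yes/no/auto. */
    enum Sparse { YES, NO, AUTO }

    /** Common read interface so callers do not care which storage is used. */
    interface Scores {
        float get(int doc);
    }

    // Assumed cut-off: below 10% coverage the map is used. Placeholder value.
    static final double AUTO_THRESHOLD = 0.10;

    /** Decides whether the doc->score map should be used instead of the array. */
    static boolean useMap(Sparse mode, int numEntries, int maxDoc) {
        switch (mode) {
            case YES: return true;   // user forces the map
            case NO:  return false;  // user forces the float array
            default:                 // AUTO: compare file entries to number of docs
                return maxDoc > 0 && (double) numEntries / maxDoc < AUTO_THRESHOLD;
        }
    }

    /** Builds the score storage; docs without an entry get defVal. */
    static Scores build(Sparse mode, Map<Integer, Float> entries, int maxDoc, float defVal) {
        if (useMap(mode, entries.size(), maxDoc)) {
            final Map<Integer, Float> map = new HashMap<>(entries);
            return doc -> map.getOrDefault(doc, defVal);
        }
        final float[] arr = new float[maxDoc];
        Arrays.fill(arr, defVal);
        for (Map.Entry<Integer, Float> e : entries.entrySet()) {
            arr[e.getKey()] = e.getValue();
        }
        return doc -> arr[doc];
    }
}
```

Note that a HashMap<Integer, Float> costs considerably more per entry than the 
4 bytes of an array slot, so the threshold would have to sit well below 100% 
for the map to actually save memory.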

What do you think?

> Make external scoring more efficient (ExternalFileField, FileFloatSource)
> -------------------------------------------------------------------------
>
>                 Key: SOLR-2583
>                 URL: https://issues.apache.org/jira/browse/SOLR-2583
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Martin Grotzke
>            Priority: Minor
>         Attachments: FileFloatSource.java.patch
>
>
> External scoring uses a lot of memory, depending on the number of documents 
> in the index. The ExternalFileField (used for external scoring) uses 
> FileFloatSource, where one FileFloatSource is created per external scoring 
> file. FileFloatSource creates a float array whose size is the number of 
> docs (this is also done if the file to load is not found). If there are far 
> fewer entries in the scoring file than there are docs in total, the 
> big float array wastes a lot of memory.
> This could be optimized by using a map of doc -> score, so that the map 
> contains as many entries as there are scoring entries in the external file, 
> but not more.
