[ 
https://issues.apache.org/jira/browse/SOLR-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392469#comment-15392469
 ] 

Joel Bernstein commented on SOLR-6581:
--------------------------------------

Wow, that's a hard case.

The string collapse is done into a fixed array sized to the number of unique 
values in the field, similar to faceting on a string field. So we're talking 
about a huge amount of memory for a single query. I'm still not sure why 
MultiDocValues would outperform the top-level field cache in this scenario. 
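
To make the memory cost concrete, here is a minimal sketch of collapsing into 
arrays indexed by ordinal. It is not the CollapsingQParserPlugin code: the 
class name, the collapse() method, and the ords/scores inputs are hypothetical 
stand-ins for the per-document top-level ordinals and scores. It assumes the 
group head is the highest-scoring document for each value.

```java
import java.util.Arrays;

public class OrdinalCollapseSketch {

    /**
     * ords[doc]   = ordinal of the collapse value for that document (assumed input)
     * scores[doc] = score of that document (assumed input)
     * valueCount  = number of unique values in the collapse field
     * Returns the docID of the group head for each ordinal, or -1 if none.
     */
    static int[] collapse(int[] ords, float[] scores, int valueCount) {
        // Both arrays are sized to the number of unique values, not to the
        // number of matching documents. This is where the memory goes.
        int[] headDoc = new int[valueCount];
        float[] headScore = new float[valueCount];
        Arrays.fill(headDoc, -1);

        for (int doc = 0; doc < ords.length; doc++) {
            int ord = ords[doc];
            // Keep the highest-scoring document seen so far for this value.
            if (headDoc[ord] == -1 || scores[doc] > headScore[ord]) {
                headDoc[ord] = doc;
                headScore[ord] = scores[doc];
            }
        }
        return headDoc;
    }

    public static void main(String[] args) {
        int[] ords = {0, 2, 0, 1, 2};
        float[] scores = {1.0f, 0.5f, 3.0f, 2.0f, 0.7f};
        System.out.println(Arrays.toString(collapse(ords, scores, 3)));
        // prints [2, 3, 4]
    }
}
```

In this sketch each unique value costs one int and one float for every 
query, regardless of how many documents actually match.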

In this scenario sharding would be very useful, but you would have to shard on 
the collapse field.
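
A rough illustration of why sharding on the collapse field helps: if every 
document is routed by its collapse value, all documents sharing a value land 
on the same shard, and each shard only has to size its arrays to its own 
share of the unique values. The sketch below uses String.hashCode() purely as 
a placeholder; it is not Solr's routing hash, and shardFor() and 
uniqueValuesPerShard() are hypothetical helper names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CollapseFieldShardingSketch {

    /** Picks a shard from the collapse value alone (placeholder hash, not Solr's). */
    static int shardFor(String collapseValue, int numShards) {
        return Math.floorMod(collapseValue.hashCode(), numShards);
    }

    /** Groups the distinct collapse values by the shard they would be routed to. */
    static List<Set<String>> uniqueValuesPerShard(List<String> collapseValues, int numShards) {
        List<Set<String>> shards = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            shards.add(new HashSet<>());
        }
        for (String value : collapseValues) {
            shards.get(shardFor(value, numShards)).add(value);
        }
        return shards;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a", "b", "a", "c", "d", "b");
        List<Set<String>> shards = uniqueValuesPerShard(values, 2);
        for (int i = 0; i < shards.size(); i++) {
            // Each shard sees a disjoint subset of the unique values.
            System.out.println("shard " + i + ": " + shards.get(i).size() + " unique values");
        }
    }
}
```

Because routing depends only on the collapse value, no group is split across 
shards, so each shard can collapse locally.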



> Efficient DocValues support and numeric collapse field implementations for 
> Collapse and Expand
> ----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-6581
>                 URL: https://issues.apache.org/jira/browse/SOLR-6581
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Minor
>             Fix For: 5.0, 6.0
>
>         Attachments: SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
> SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
> SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
> SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
> renames.diff
>
>
> The 4x implementation of the CollapsingQParserPlugin and the ExpandComponent 
> are optimized to work with a top level FieldCache. Top level FieldCaches have 
> a very fast docID to top-level ordinal lookup. Fast access to the top-level 
> ordinals allows for very high performance field collapsing on high 
> cardinality fields. 
> LUCENE-5666 unified the DocValues and FieldCache APIs so that the top level 
> FieldCache is no longer in regular use. Instead all top level caches are 
> accessed through MultiDocValues. 
> This ticket does the following:
> 1) Optimizes Collapse and Expand to use MultiDocValues and makes this the 
> default approach when collapsing on String fields
> 2) Provides an option to use a top level FieldCache if the performance of 
> MultiDocValues is a blocker. The mechanism for switching to the FieldCache is 
> a new "hint" parameter. If the hint parameter is set to "top_fc" then the 
> top-level FieldCache would be used for both Collapse and Expand.
> Example syntax:
> {code}
> fq={!collapse field=x hint=TOP_FC}
> {code}
> 3)  Adds numeric collapse field implementations.
> 4) Resolves issue SOLR-6066
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

