[ https://issues.apache.org/jira/browse/SOLR-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392469#comment-15392469 ]
Joel Bernstein commented on SOLR-6581:
--------------------------------------

Wow, that's a hard case. The string collapse is done into a fixed array the size of the number of unique values in the field, similar to faceting on a string field. So we're talking about a huge amount of memory for a single query. Still not sure why MultiDocValues would outperform the top-level FieldCache in this scenario.

In this scenario sharding would be very useful, but you would have to shard on the collapse field.

> Efficient DocValues support and numeric collapse field implementations for Collapse and Expand
> ----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-6581
>                 URL: https://issues.apache.org/jira/browse/SOLR-6581
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Minor
>             Fix For: 5.0, 6.0
>
>         Attachments: SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch,
> SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch,
> SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch,
> SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch,
> renames.diff
>
>
> The 4x implementation of the CollapsingQParserPlugin and the ExpandComponent
> are optimized to work with a top-level FieldCache. Top-level FieldCaches have
> a very fast docID to top-level ordinal lookup. Fast access to the top-level
> ordinals allows for very high performance field collapsing on high
> cardinality fields.
> LUCENE-5666 unified the DocValues and FieldCache APIs so that the top-level
> FieldCache is no longer in regular use. Instead all top-level caches are
> accessed through MultiDocValues.
> This ticket does the following:
> 1) Optimizes Collapse and Expand to use MultiDocValues and makes this the
> default approach when collapsing on String fields.
> 2) Provides an option to use a top-level FieldCache if the performance of
> MultiDocValues is a blocker. The mechanism for switching to the FieldCache is
> a new "hint" parameter. If the hint parameter is set to "top_fc" then the
> top-level FieldCache would be used for both Collapse and Expand.
> Example syntax:
> {code}
> fq={!collapse field=x hint=TOP_FC}
> {code}
> 3) Adds numeric collapse field implementations.
> 4) Resolves issue SOLR-6066.
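
To illustrate why the string collapse discussed above needs memory proportional to the field's cardinality, here is a minimal sketch of collapsing into fixed arrays indexed by ordinal. It is not the CollapsingQParserPlugin code: the class name `OrdinalCollapseSketch`, the method `collapse`, and the plain `int[]`/`float[]` inputs are hypothetical stand-ins for the docID-to-ordinal lookup and the scores, and keeping the highest-scoring document per group is an assumption about the selection rule.

```java
import java.util.Arrays;

// Hypothetical illustration only; not taken from the Solr source.
class OrdinalCollapseSketch {

    /**
     * Collapses documents by ordinal, keeping the highest-scoring doc per group.
     *
     * @param ords       ords[doc] is the ordinal of the collapse value for doc, or -1 if the doc has none
     * @param scores     scores[doc] is the score of doc
     * @param valueCount number of unique values in the collapse field
     * @return array indexed by ordinal holding the winning docID, or -1 for ordinals with no doc
     */
    static int[] collapse(int[] ords, float[] scores, int valueCount) {
        // Both arrays are sized by the number of unique values, not by the number of hits.
        int[] bestDoc = new int[valueCount];
        float[] bestScore = new float[valueCount];
        Arrays.fill(bestDoc, -1);

        for (int doc = 0; doc < ords.length; doc++) {
            int ord = ords[doc];
            if (ord < 0) {
                continue; // doc has no value in the collapse field
            }
            if (bestDoc[ord] == -1 || scores[doc] > bestScore[ord]) {
                bestDoc[ord] = doc;
                bestScore[ord] = scores[doc];
            }
        }
        return bestDoc;
    }

    public static void main(String[] args) {
        int[] ords = {0, 1, 0, 2, 1};
        float[] scores = {1.0f, 0.5f, 2.0f, 0.3f, 0.9f};
        // prints [2, 4, 3]
        System.out.println(Arrays.toString(collapse(ords, scores, 3)));
    }
}
```

In this sketch each unique value costs 8 bytes per query (one `int` plus one `float`), so a field with 100 million unique values would need roughly 800 MB for a single query, regardless of how many documents match. Sharding on the collapse field reduces the number of unique values each shard has to allocate for.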