[ 
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839665#comment-13839665
 ] 

Trey Grainger commented on SOLR-5027:
-------------------------------------

Interesting.  I've been playing around with the Collapsing QParser and, for 
the reason Gabe mentioned, I can think of very few use cases for it in its 
current implementation.  Specifically, because there is no way to break a tie 
between multiple documents with the same value (the way sorting does), a search 
that is sorted by score desc, modifieddt desc (newer documents break the tie) 
is not possible; it just collapses to the first document in the index 
with the duplicate score.  Many of my use cases are even trickier, something 
like sort by displaypriority desc, score desc, modifieddt desc.
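
To make the tie-break problem concrete, here is a small Python sketch. It is 
only a model of the behavior described above, not Solr's actual code; the 
document dicts and both function names are made up for illustration.

```python
# Hypothetical documents: "a" and "b" share a group and have the same score.
DOCS = [
    {"id": "a", "group": "g1", "score": 1.0, "modifieddt": 20130101},
    {"id": "b", "group": "g1", "score": 1.0, "modifieddt": 20131201},
    {"id": "c", "group": "g2", "score": 0.5, "modifieddt": 20130601},
]


def collapse_first_wins(docs):
    """Keep the highest-scoring doc per group; on a tie the earlier doc stays."""
    heads = {}
    for doc in docs:
        head = heads.get(doc["group"])
        if head is None or doc["score"] > head["score"]:
            heads[doc["group"]] = doc
    return heads


def collapse_with_tiebreak(docs):
    """Keep the doc that sorts first under score desc, modifieddt desc."""
    heads = {}
    for doc in docs:
        head = heads.get(doc["group"])
        key = (doc["score"], doc["modifieddt"])
        if head is None or key > (head["score"], head["modifieddt"]):
            heads[doc["group"]] = doc
    return heads
```

With these documents the first function keeps "a" for group g1 simply because 
it comes first, while the second keeps the newer "b".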

Just brainstorming here, but if sorting documents before collapsing is not 
possible (due to where in the code stack the collapsing occurs), then it might 
be possible to just implement a "sort" function (ValueSource) that gives an 
ordinal score to each document based upon the position it would occupy among 
all documents.  If I understand what you mean when you say "group head 
selection based upon the min/max of the function", then this would effectively 
allow collapsing sorted values, because the sort function would return higher 
values for documents which would sort higher.  In that case, the sort function 
(which could read in the current sort parameter from the search request) could 
even be the default used by collapsing, since that is probably what users are 
expecting to happen (this is consistent with how grouping works, for example).
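
A rough Python sketch of the idea, under my own assumptions: sort_ordinals 
and collapse_by_max are hypothetical names, the sort spec is a plain list of 
(field, descending) pairs, and nothing here corresponds to an existing Solr 
ValueSource. It only shows that collapsing on the max of an ordinal 
reproduces a multi-key sort.

```python
def sort_ordinals(docs, sort_spec):
    """Map doc id -> ordinal; a higher ordinal means the doc sorts earlier.

    sort_spec is a list of (field, descending) pairs, most significant first.
    """
    ranked = list(docs)
    # list.sort is stable, so sorting by the least significant key first
    # and the most significant key last yields the multi-key order.
    for field, descending in reversed(sort_spec):
        ranked.sort(key=lambda d: d[field], reverse=descending)
    n = len(ranked)
    return {doc["id"]: n - pos for pos, doc in enumerate(ranked)}


def collapse_by_max(docs, values):
    """Pick one group head per group: the doc with the largest value."""
    heads = {}
    for doc in docs:
        head = heads.get(doc["group"])
        if head is None or values[doc["id"]] > values[head["id"]]:
            heads[doc["group"]] = doc
    return heads


# Hypothetical data for: displaypriority desc, score desc, modifieddt desc
DOCS = [
    {"id": "a", "group": "g1", "displaypriority": 1, "score": 2.0, "modifieddt": 20130101},
    {"id": "b", "group": "g1", "displaypriority": 1, "score": 2.0, "modifieddt": 20131201},
    {"id": "c", "group": "g1", "displaypriority": 0, "score": 9.0, "modifieddt": 20131231},
    {"id": "d", "group": "g2", "displaypriority": 0, "score": 1.0, "modifieddt": 20130601},
]
SPEC = [("displaypriority", True), ("score", True), ("modifieddt", True)]

ordinals = sort_ordinals(DOCS, SPEC)
heads = collapse_by_max(DOCS, ordinals)
```

Here "b" becomes the head of g1: it ties with "a" on priority and score but is 
newer, and it outranks "c" on priority despite the lower score.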

Thoughts?

> Field Collapsing PostFilter
> ---------------------------
>
>                 Key: SOLR-5027
>                 URL: https://issues.apache.org/jira/browse/SOLR-5027
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 5.0
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Minor
>             Fix For: 4.6, 5.0
>
>         Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
> SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
> SOLR-5027.patch, SOLR-5027.patch
>
>
> This ticket introduces the *CollapsingQParserPlugin*.
> The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
> This is a high performance alternative to standard Solr field collapsing 
> (with *ngroups*) when the number of distinct groups in the result set is high.
> For example in one performance test, a search with 10 million full results 
> and 1 million collapsed groups:
> Standard grouping with ngroups : 17 seconds.
> CollapsingQParserPlugin: 300 milliseconds.
> Sample syntax:
> Collapse based on the highest scoring document:
> {code}
> fq={!collapse field=<field_name>}
> {code}
> Collapse based on the min value of a numeric field:
> {code}
> fq={!collapse field=<field_name> min=<field_name>}
> {code}
> Collapse based on the max value of a numeric field:
> {code}
> fq={!collapse field=<field_name> max=<field_name>}
> {code}
> Collapse with a null policy:
> {code}
> fq={!collapse field=<field_name> nullPolicy=<null_policy>}
> {code}
> There are three null policies:
> ignore : removes docs with a null value in the collapse field (default).
> expand : treats each doc with a null value in the collapse field as a 
> separate group.
> collapse : collapses all docs with a null value into a single group using 
> either highest score, or min/max.
> The CollapsingQParserPlugin also fully supports the QueryElevationComponent.
> *Note:*  The July 16 patch also includes an ExpandComponent that expands the 
> collapsed groups for the current search result page. This functionality will 
> be moved to its own ticket.
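
The three null policies in the quoted description can be modeled with a short 
Python sketch. This is an illustration only, assuming group heads are chosen 
by highest score; the collapse function and the document dicts are 
hypothetical and are not the plugin's implementation.

```python
def collapse(docs, field, null_policy="ignore"):
    """Return the sorted ids that survive collapsing on `field`."""
    heads = {}        # collapse value -> best-scoring doc seen so far
    expanded = []     # null-valued docs kept as their own groups ("expand")
    null_head = None  # single head for all null-valued docs ("collapse")
    for doc in docs:
        value = doc.get(field)
        if value is None:
            if null_policy == "ignore":
                continue  # drop docs with no value in the collapse field
            if null_policy == "expand":
                expanded.append(doc)
                continue
            if null_policy == "collapse":
                if null_head is None or doc["score"] > null_head["score"]:
                    null_head = doc
                continue
            raise ValueError("unknown null policy: " + null_policy)
        head = heads.get(value)
        if head is None or doc["score"] > head["score"]:
            heads[value] = doc
    kept = list(heads.values()) + expanded
    if null_head is not None:
        kept.append(null_head)
    return sorted(doc["id"] for doc in kept)


# Hypothetical documents: two share group "x", two have no group value.
DOCS = [
    {"id": "1", "group": "x", "score": 1.0},
    {"id": "2", "group": "x", "score": 3.0},
    {"id": "3", "group": None, "score": 2.0},
    {"id": "4", "group": None, "score": 5.0},
]
```

With these documents, ignore keeps only "2", expand keeps "2", "3" and "4", 
and collapse keeps "2" and "4".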


