[jira] [Commented] (SOLR-2564) Integrating grouping module into Solr 4.0

Yonik Seeley (JIRA) Sun, 05 Jun 2011 07:13:26 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044535#comment-13044535
 ]


Yonik Seeley commented on SOLR-2564:
------------------------------------

Thanks for the update Martijn, it's looking good.

Just a note that the following optimization will no longer be valid once we 
have "post collapse faceting" or whatever we're calling it, or
when we have an option to return the total number of groups.
But hopefully our random testing will catch that in the future.

{code}
    protected Collector createFirstPassCollector() throws IOException {
      // Ok we don't want groups, but do want a total count
      if (actualGroupsToFind <= 0) {
        fallBackCollector = new TotalHitCountCollector();
        return fallBackCollector;
      }
{code}

bq. However I have changed this in the updated patch and basically this check 
isn't done with setting GET_SCORES flag.

Thanks... GET_SCORES does have a different meaning: scores must be returned to 
the caller (which can still be false even if scores are used for sorting).
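
A tiny sketch of that distinction, purely illustrative: the class and method names below are made up and are not taken from the patch. It assumes scores have to be computed whenever they are either returned to the caller or needed by the sort.

```java
// Hypothetical helper, not Solr code: separates "return scores" from "compute scores".
class ScoreFlagsSketch {
    /**
     * @param getScoresFlag  true if the caller asked for scores to be returned (GET_SCORES)
     * @param sortUsesScore  true if the sort references the score
     * @return true if scores must be computed at all
     */
    static boolean mustComputeScores(boolean getScoresFlag, boolean sortUsesScore) {
        // Scores can be needed internally for sorting even when they are not returned.
        return getScoresFlag || sortUsesScore;
    }
}
```

So a sort by score without GET_SCORES still needs scores computed, they just don't need to be handed back.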

bq. I also think that group.groupCount is a better name.

groupCount is now GROUPED or UNGROUPED, and is used to set "Accuracy" (more on 
that later ;-)
Seems like this parameter should be a boolean that says if the total number of 
groups should be returned?
If true, we can add a "ngroups" or "groupCount" element at the same level as 
"matches".  We should probably just name the parameter the same thing as the 
variable that gets returned... i.e. group.ngroups=true would cause "ngroups" to 
be populated (or groupCount if we decide that's a better name).
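
Roughly what I have in mind, as a sketch only: the class and method below are hypothetical and this is not the response-writing code in the patch. It assumes the total number of groups has already been computed and only shows "ngroups" being emitted next to "matches" when the flag is set.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Solr's response code.
class GroupedResponseSketch {
    /**
     * Builds the top-level entries of a grouped result.
     * "ngroups" appears only when the proposed group.ngroups flag is true.
     */
    static Map<String, Object> header(int matches, int totalGroups, boolean ngroupsRequested) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("matches", matches);
        if (ngroupsRequested) {
            // Same level as "matches", and the same name as the request parameter.
            out.put("ngroups", totalGroups);
        }
        return out;
    }
}
```

With the flag off the output has only "matches", so existing clients would see no change.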

bq. Maybe another name is more descriptive. Maybe style or method?

"method" should probably be reserved for the algorithm used for collapsing (as 
we do for faceting).

Background for others: This feature has been called many things like "post 
collapse faceting", etc.  But it's really much more than that.  Normal grouping 
simply groups documents and presents them in a different way, but does *not* 
change what documents match the base query + filters.  The other use-case is 
more like field collapsing and does change what documents match (basically, 
only the first documents in each group, up to limit, "match").
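
To illustrate the second use-case, here is a minimal sketch (hypothetical names, plain collections, nothing from the grouping module). Documents are given as group keys in rank order and identified by their position. With normal grouping every position would still count as a match; with collapsing only the first "limit" documents of each group do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of collapse semantics, not the grouping module's implementation.
class CollapseSketch {
    /**
     * @param groupKeysInRankOrder group key of each document, best-ranked first
     * @param limit                how many documents per group may still "match"
     * @return positions of the documents that match after collapsing
     */
    static List<Integer> collapse(List<String> groupKeysInRankOrder, int limit) {
        Map<String, Integer> seenPerGroup = new HashMap<>();
        List<Integer> kept = new ArrayList<>();
        for (int doc = 0; doc < groupKeysInRankOrder.size(); doc++) {
            String key = groupKeysInRankOrder.get(doc);
            int count = seenPerGroup.merge(key, 1, Integer::sum);
            if (count <= limit) {
                kept.add(doc); // still within this group's quota
            }
        }
        return kept;
    }
}
```

Anything computed from the matching set afterwards (counts, facets) would then see only the kept documents, which is why this is more than a presentation change.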

Maybe just use a word from the original name for this whole feature... 
"group.collapse=true"?

There are some other interesting semantics to work out for group.collapse=true, 
such as whether the collapsing happens before or after filters are applied.  Perhaps 
either could make sense depending on the use case?  Here's one use case I can 
think of: using field collapsing for only showing the latest version of a 
document.  In this case, one would only want collapsing to apply to the base 
query (with filtering happening after that) because you don't want to get into 
the position of having a filter that filters out the most recent version of a 
document and thus shows an older version.

However, for now, if it's easier, we could have group.collapse=true apply to 
the base query and all filters, and handle the use case I mentioned above via a 
qparser in the future.
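
A small sketch of why the order matters for the "latest version" use case. Everything here is hypothetical (the Doc class, its fields, and the method names) and it works on plain lists rather than on anything in Solr.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: collapse-before-filter vs. filter-before-collapse.
class CollapseOrderSketch {
    /** One stored version of a logical document. */
    static final class Doc {
        final String id;
        final int version;
        final String status;
        Doc(String id, int version, String status) {
            this.id = id;
            this.version = version;
            this.status = status;
        }
    }

    /** Keeps only the highest-version Doc for each id. */
    static List<Doc> latestPerId(List<Doc> docs) {
        Map<String, Doc> latest = new LinkedHashMap<>();
        for (Doc d : docs) {
            Doc current = latest.get(d.id);
            if (current == null || d.version > current.version) {
                latest.put(d.id, d);
            }
        }
        return new ArrayList<>(latest.values());
    }

    /** Keeps only the docs accepted by the filter. */
    static List<Doc> filter(List<Doc> docs, Predicate<Doc> accept) {
        List<Doc> out = new ArrayList<>();
        for (Doc d : docs) {
            if (accept.test(d)) {
                out.add(d);
            }
        }
        return out;
    }

    /** Collapse on the base query first, then filter: an old version never leaks through. */
    static List<Doc> collapseThenFilter(List<Doc> docs, Predicate<Doc> accept) {
        return filter(latestPerId(docs), accept);
    }

    /** Filter first, then collapse: an older version can surface if the latest is filtered out. */
    static List<Doc> filterThenCollapse(List<Doc> docs, Predicate<Doc> accept) {
        return latestPerId(filter(docs, accept));
    }
}
```

For example, with version 1 of a document marked "published" and version 2 marked "draft", a filter on "published" gives no result when collapsing happens first, but gives the stale version 1 when filtering happens first.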

> Integrating grouping module into Solr 4.0
> -----------------------------------------
>
>                 Key: SOLR-2564
>                 URL: https://issues.apache.org/jira/browse/SOLR-2564
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Martijn van Groningen
>            Assignee: Martijn van Groningen
>             Fix For: 4.0
>
>         Attachments: LUCENE-2564.patch, SOLR-2564.patch, SOLR-2564.patch, 
> SOLR-2564.patch
>
>
> Since work on the grouping module is going well, I think it is time to wire this 
> up in Solr.
> Besides the current grouping features Solr provides, Solr will then also 
> support second pass caching and total count based on groups.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

