[
https://issues.apache.org/jira/browse/SOLR-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044535#comment-13044535
]
Yonik Seeley commented on SOLR-2564:
------------------------------------
Thanks for the update Martijn, it's looking good.
Just a note that the following optimization will no longer be valid once we
have "post collapse faceting" or whatever we're calling it, or
when we have an option to return the total number of groups.
But hopefully our random testing will catch that in the future.
{code}
protected Collector createFirstPassCollector() throws IOException {
  // Ok we don't want groups, but do want a total count
  if (actualGroupsToFind <= 0) {
    fallBackCollector = new TotalHitCountCollector();
    return fallBackCollector;
  }
  // ... remainder of the method not shown in this excerpt
}
{code}
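To make the caveat concrete, here is a minimal sketch of the extra conditions the shortcut would need once those features exist. Only `actualGroupsToFind` comes from the snippet above; the flags `needTotalGroupCount` and `collapseAffectsMatches` are hypothetical names invented for illustration and are not fields in the patch.

```java
final class FirstPassShortcut {
    private FirstPassShortcut() {}

    /**
     * Returns true when counting hits alone is sufficient, so the grouping
     * first pass can be skipped. Both boolean flags are hypothetical.
     */
    static boolean canUseHitCountOnly(int actualGroupsToFind,
                                      boolean needTotalGroupCount,
                                      boolean collapseAffectsMatches) {
        if (actualGroupsToFind > 0) {
            // Groups were requested, so the real first pass is required.
            return false;
        }
        // A plain hit count cannot supply the number of groups, and it is
        // wrong if collapsing changes which documents count as matches.
        return !needTotalGroupCount && !collapseAffectsMatches;
    }
}
```

With both flags false this reduces to the current check, which is why the existing optimization is only safe until either feature lands.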
bq. However I have changed this in the updated patch and basically this check
isn't done with setting GET_SCORES flag.
Thanks... GET_SCORES does have a different meaning: scores must be returned to
the caller (which can still be false even if scores are used for sorting).
bq. I also think that group.groupCount is a better name.
groupCount is now GROUPED or UNGROUPED, and is used to set "Accuracy" (more on
that later ;-)
Seems like this parameter should be a boolean that says whether the total
number of groups should be returned?
If true, we can add a "ngroups" or "groupCount" element at the same level as
"matches". We should probably just name the parameter the same thing as the
variable that gets returned... i.e. group.ngroups=true would cause "ngroups" to
be populated (or groupCount if we decide that's a better name).
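As a rough illustration of that naming idea, the sketch below builds the top-level part of a grouped result in which "ngroups" appears next to "matches" only when it was requested. The class and method names are hypothetical, and a plain map stands in for whatever response structure Solr actually uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class GroupedResponseSketch {
    private GroupedResponseSketch() {}

    /**
     * Builds the summary entries of a grouped result. "ngroups" is added at
     * the same level as "matches" only when the (proposed) boolean
     * parameter was set to true.
     */
    static Map<String, Object> summary(int matches, int distinctGroups,
                                       boolean ngroupsRequested) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("matches", matches);
        if (ngroupsRequested) {
            out.put("ngroups", distinctGroups);
        }
        return out;
    }
}
```

Keeping the parameter name and the returned element name identical means a client that sends group.ngroups=true knows exactly which key to read back.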
bq. Maybe another name is more descriptive. Maybe style or method?
"method" should probably be reserved for the algorithm used for collapsing (as
we do for faceting).
Background for others: This feature has been called many things like "post
collapse faceting", etc. But it's really much more than that. Normal grouping
simply groups documents and presents them in a different way, but does *not*
change what documents match the base query + filters. The other use-case is
more like field collapsing and does change what documents match (basically,
only the first documents in each group, up to limit, "match").
Maybe just use a word from the original name for this whole feature...
"group.collapse=true"?
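The difference between the two use cases can be sketched in a few lines. This is an illustration only, not the grouping module's code: `Doc`, `matchesWhenGrouping` and `collapse` are hypothetical names, and the input list is assumed to be already sorted by rank.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GroupingVsCollapsing {
    private GroupingVsCollapsing() {}

    /** A matched document: an id plus the value of the grouping field. */
    static final class Doc {
        final String id;
        final String groupValue;

        Doc(String id, String groupValue) {
            this.id = id;
            this.groupValue = groupValue;
        }
    }

    /** Normal grouping only changes presentation: every document still matches. */
    static int matchesWhenGrouping(List<Doc> rankedMatches) {
        return rankedMatches.size();
    }

    /** Collapsing keeps only the first `limit` documents of each group, in rank order. */
    static List<Doc> collapse(List<Doc> rankedMatches, int limit) {
        Map<String, Integer> seenPerGroup = new HashMap<>();
        List<Doc> kept = new ArrayList<>();
        for (Doc d : rankedMatches) {
            int count = seenPerGroup.merge(d.groupValue, 1, Integer::sum);
            if (count <= limit) {
                kept.add(d);
            }
        }
        return kept;
    }
}
```

Anything computed over the match set (counts, facets) sees the full list in the first case and only the collapsed list in the second.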
There are some other interesting semantics to work out for group.collapse=true,
such as if the collapsing happens before or after filters are applied. Perhaps
either could make sense depending on the use case? Here's one use case I can
think of: using field collapsing for only showing the latest version of a
document. In this case, one would only want collapsing to apply to the base
query (with filtering happening after that) because you don't want to get into
the position of having a filter that filters out the most recent version of a
document and thus shows an older version.
However, for now, if it's easier, we could have group.collapse=true apply to
the base query and all filters, and handle the use case I mentioned above via a
qparser in the future.
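The ordering question in the latest-version use case can be shown with a small sketch. Everything here is hypothetical (the `Version` type, its fields, and the method names); it only demonstrates that collapsing before filtering and filtering before collapsing can return different documents.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class CollapseOrderSketch {
    private CollapseOrderSketch() {}

    /** One stored version of a logical document. */
    static final class Version {
        final String docKey;
        final int version;
        final String status;

        Version(String docKey, int version, String status) {
            this.docKey = docKey;
            this.version = version;
            this.status = status;
        }
    }

    /** Collapse step: keep only the highest version for each docKey. */
    static List<Version> latestPerKey(List<Version> in) {
        Map<String, Version> best = new LinkedHashMap<>();
        for (Version v : in) {
            Version current = best.get(v.docKey);
            if (current == null || v.version > current.version) {
                best.put(v.docKey, v);
            }
        }
        return new ArrayList<>(best.values());
    }

    /** Filter step: keep only the entries accepted by the predicate. */
    static List<Version> filter(List<Version> in, Predicate<Version> p) {
        List<Version> out = new ArrayList<>();
        for (Version v : in) {
            if (p.test(v)) {
                out.add(v);
            }
        }
        return out;
    }

    /** Collapse on the base query, then filter: an old version never resurfaces. */
    static List<Version> collapseThenFilter(List<Version> in, Predicate<Version> p) {
        return filter(latestPerKey(in), p);
    }

    /** Filter first, then collapse: an older version can win if the latest is filtered out. */
    static List<Version> filterThenCollapse(List<Version> in, Predicate<Version> p) {
        return latestPerKey(filter(in, p));
    }
}
```

For a document whose version 1 is "published" and version 2 is "draft", a filter on status "published" yields nothing with collapseThenFilter but returns the stale version 1 with filterThenCollapse, which is the situation described above.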
> Integrating grouping module into Solr 4.0
> -----------------------------------------
>
> Key: SOLR-2564
> URL: https://issues.apache.org/jira/browse/SOLR-2564
> Project: Solr
> Issue Type: Improvement
> Reporter: Martijn van Groningen
> Assignee: Martijn van Groningen
> Fix For: 4.0
>
> Attachments: LUCENE-2564.patch, SOLR-2564.patch, SOLR-2564.patch,
> SOLR-2564.patch
>
>
> Since work on the grouping module is going well, I think it is time to wire
> this up in Solr.
> Besides the current grouping features Solr provides, Solr will then also
> support second pass caching and total count based on groups.