[
https://issues.apache.org/jira/browse/SOLR-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005060#comment-16005060
]
Diego Ceccarelli edited comment on SOLR-8776 at 5/10/17 5:40 PM:
-----------------------------------------------------------------
Hi all, I updated the PR (https://github.com/apache/lucene-solr/pull/162),
highlights:
1. [~romseygeek], [~martijn.v.groningen]: the patch now relies on the new grouping
code :) I had to add a new {{protected}} constructor to {{TopGroupsCollector}}
to inject my own {{GroupReducer}}. Could you please take a look and let me know
if it makes sense? Also, in
[SecondPassGroupingCollector#L54|https://github.com/bloomberg/lucene-solr/blob/c22a9017649406c5673c9b72878ad66a20d9b8d2/lucene/grouping/src/java/org/apache/lucene/search/grouping/SecondPassGroupingCollector.java#L54]
{code:title=SecondPassGroupingCollector.java|borderStyle=solid}
  public SecondPassGroupingCollector(GroupSelector<T> groupSelector,
      Collection<SearchGroup<T>> groups, GroupReducer<T, ?> reducer) {
    //System.out.println("SP init");
    //Do we want to check if groups is null here? instead of checking at line 62?
    if (groups.isEmpty()) {
      throw new IllegalArgumentException("no groups to collect (groups is empty)");
    }
    this.groupSelector = Objects.requireNonNull(groupSelector);
    this.groupSelector.setGroups(groups);
    this.groups = Objects.requireNonNull(groups);
{code}
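I would check {{groups != null}} before calling {{groups.isEmpty()}}: as written, a
null {{groups}} already throws at {{isEmpty()}}, so the later
{{Objects.requireNonNull(groups)}} never sees a null value.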
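A minimal sketch of the ordering suggested above. It uses a hypothetical helper ({{GroupArgumentCheck.validate}} is not part of Lucene) instead of the real constructor, so only the order of the two checks is illustrated:

```java
import java.util.Collection;
import java.util.Objects;

// Hypothetical stand-in for the constructor's argument checks; not Lucene code.
final class GroupArgumentCheck {

    // Null check first, emptiness check second, so each failure
    // is reported with the exception type that describes it.
    static <T> Collection<T> validate(Collection<T> groups) {
        Objects.requireNonNull(groups, "groups must not be null");
        if (groups.isEmpty()) {
            throw new IllegalArgumentException("no groups to collect (groups is empty)");
        }
        return groups;
    }
}
```

With this ordering a null argument fails with the explicit message from {{requireNonNull}}, and an empty collection still fails with the {{IllegalArgumentException}} from the original code.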
2. I changed the logic to rerank groups and not only documents. For example, if
a user asks to rerank the top 100 documents: {{q=greetings&rows=10&rq=\{!rerank
reRankQuery=$rqq reRankDocs=100 reRankWeight=3\}&rqq=(hi+hello+hey+hiya)}}:
* the top 100 groups matching {{greetings}} are retrieved;
* the top 100 groups are reranked by {{rqq}};
* finally, the top 10 reranked groups are returned;
* inside each group, documents are reranked as well.
(It's worth noting that, for simplicity, in distributed mode the first pass will
retrieve the top 100 groups from all the shards; the federator will then compute
the top 100 groups and send them to the shards to get the reranking scores, and
finally the federator will return the top 10.)
IMO the patch is now complete and I have working unit tests. Could someone
please review it?
> Support RankQuery in grouping
> -----------------------------
>
> Key: SOLR-8776
> URL: https://issues.apache.org/jira/browse/SOLR-8776
> Project: Solr
> Issue Type: Improvement
> Components: search
> Affects Versions: 6.0
> Reporter: Diego Ceccarelli
> Priority: Minor
> Fix For: 6.0
>
> Attachments: 0001-SOLR-8776-Support-RankQuery-in-grouping.patch,
> 0001-SOLR-8776-Support-RankQuery-in-grouping.patch,
> 0001-SOLR-8776-Support-RankQuery-in-grouping.patch,
> 0001-SOLR-8776-Support-RankQuery-in-grouping.patch,
> 0001-SOLR-8776-Support-RankQuery-in-grouping.patch
>
>
> Currently it is not possible to use RankQuery [1] and Grouping [2] together
> (see also [3]). In some situations Grouping can be replaced by Collapse and
> Expand Results [4] (that supports reranking), but i) collapse cannot
> guarantee that at least a minimum number of groups will be returned for a
> query, and ii) in the Solr Cloud setting you will have constraints on how to
> partition the documents among the shards.
> I'm going to start working on supporting RankQuery in grouping. I'll start by
> attaching a patch with a test that fails because grouping does not support
> the rank query and then I'll try to fix the problem, starting from the
> non-distributed setting (GroupingSearch).
> My feeling is that since grouping is mostly performed by Lucene, RankQuery
> should be refactored and moved (or partially moved) there.
> Any feedback is welcome.
> [1] https://cwiki.apache.org/confluence/display/solr/RankQuery+API
> [2] https://cwiki.apache.org/confluence/display/solr/Result+Grouping
> [3]
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201507.mbox/%3ccahm-lpuvspest-sw63_8a6gt-wor6ds_t_nb2rope93e4+s...@mail.gmail.com%3E
> [4]
> https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results