[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses

Diego Ceccarelli (Jira) Thu, 17 Oct 2019 09:44:17 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953911#comment-16953911
 ]


Diego Ceccarelli commented on LUCENE-8996:
------------------------------------------

Ah! Good point about the delete between the first and second pass! didn't think 
about that, rarely but it might happen! and I agree about fix MIN_VALUE in a 
second PR. 

About the  LUCENE-9010 I would keep fix and tests together - because the tests 
prove that you actually fixed the issue and also they help understanding the 
issue (aside from the "lost of numbers there"). 
( another example here: 
https://github.com/apache/lucene-solr/commit/d1706b36babda2dc17fd82d3165607f5c44bd83e)
 

I'm not sure about the narrative tests, but I agree on too many numbers and I 
can fix the tests by:
  * Adding constants and remove the numbers
 *  I like the colors, I can use them instead of group1 group2
 * I also would like to make the tests shorter (maybe having more tests)

what do you think? 

> maxScore is sometimes missing from distributed grouped responses
> ----------------------------------------------------------------
>
>                 Key: LUCENE-8996
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8996
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 5.3
>            Reporter: Julien Massenet
>            Priority: Minor
>         Attachments: LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, 
> lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This issue occurs when using the grouping feature in distributed mode and 
> sorting by score.
> Each group's {{docList}} in the response is supposed to contain a 
> {{maxScore}} entry that hold the maximum score for that group. Using the 
> current releases, it sometimes happens that this piece of information is not 
> included:
> {code}
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 42,
>     "params": {
>       "sort": "score desc",
>       "fl": "id,score",
>       "q": "_text_:\"72\"",
>       "group.limit": "2",
>       "group.field": "group2",
>       "group.sort": "score desc",
>       "group": "true",
>       "wt": "json",
>       "fq": "group2:72 OR group2:45"
>     }
>   },
>   "grouped": {
>     "group2": {
>       "matches": 567,
>       "groups": [
>         {
>           "groupValue": 72,
>           "doclist": {
>             "numFound": 562,
>             "start": 0,
>             "maxScore": 2.0378063,
>             "docs": [
>               {
>                 "id": "29!26551",
>                 "score": 2.0378063
>               },
>               {
>                 "id": "78!11462",
>                 "score": 2.0298104
>               }
>             ]
>           }
>         },
>         {
>           "groupValue": 45,
>           "doclist": {
>             "numFound": 5,
>             "start": 0,
>             "docs": [
>               {
>                 "id": "72!8569",
>                 "score": 1.8988966
>               },
>               {
>                 "id": "72!14075",
>                 "score": 1.5191172
>               }
>             ]
>           }
>         }
>       ]
>     }
>   }
> }
> {code}
> Looking into the issue, it comes from the fact that if a shard does not 
> contain a document from that group, trying to merge its {{maxScore}} with 
> real {{maxScore}} entries from other shards is invalid (it results in NaN).
> I'm attaching a patch containing a fix.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses

Reply via email to