[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991717#comment-16991717 ] ASF subversion and git services commented on LUCENE-8996: - Commit 49631ace9f1ee110d52a207377e4926baef74929 in lucene-solr's branch refs/heads/gradle-master from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=49631ac ] LUCENE-8996: maxScore was sometimes missing from distributed grouped responses. (Julien Massenet, Diego Ceccarelli, Munendra S N, Christine Poerschke) > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.4 > > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.04.patch, LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991663#comment-16991663 ] ASF subversion and git services commented on LUCENE-8996: - Commit 540f617cbb7d465aff356221dba55ad631c52528 in lucene-solr's branch refs/heads/branch_8x from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=540f617 ] LUCENE-8996: maxScore was sometimes missing from distributed grouped responses. (Julien Massenet, Diego Ceccarelli, Munendra S N, Christine Poerschke) Resolved Conflicts: lucene/grouping/src/java/org/apache/lucene/search/grouping/TopGroups.java > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.04.patch, LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991642#comment-16991642 ] ASF subversion and git services commented on LUCENE-8996: - Commit 49631ace9f1ee110d52a207377e4926baef74929 in lucene-solr's branch refs/heads/master from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=49631ac ] LUCENE-8996: maxScore was sometimes missing from distributed grouped responses. (Julien Massenet, Diego Ceccarelli, Munendra S N, Christine Poerschke) > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.04.patch, LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990097#comment-16990097 ] Christine Poerschke commented on LUCENE-8996: - Returning to this ... if no one else gets to it first and there are no concerns or objections then I'll aim to commit the LUCENE-8996.04.patch from Nov 1st on Monday afternoon (London time) hopefully in plenty of time for the 8.4 release branch cutting in the second half of next week. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.04.patch, LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972630#comment-16972630 ] Diego Ceccarelli commented on LUCENE-8996: -- [~cpoerschke] I realised only after that tests were split and merged in LUCENE-9010 - I opened another patch to refactor the tests LUCENE-9042 > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.04.patch, LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972240#comment-16972240 ] Christine Poerschke commented on LUCENE-8996: - bq. ... can I patch on top of your patch? Sure, please go ahead. I'm sorry I've had only very limited time recently here and elsewhere. I'll unassign the ticket to try and make it clearer that it's fully up-for-grabs. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.04.patch, LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971565#comment-16971565 ] Diego Ceccarelli commented on LUCENE-8996: -- Thanks [~cpoerschke] the new patch looks good - IMHO it's true that is not retrocompatible, I don't think there was a way for the previous implementation to return Float.Min (because it would have been overridden by Nan or by a real value) - so basically we are just fixing the case where there is a mix of NaNs and real values (by returning a real value instead of NaN). I would like to work a bit on the unit tests, can I patch on top of your patch? > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.04.patch, LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16965295#comment-16965295 ] Lucene/Solr QA commented on LUCENE-8996: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s{color} | {color:green} grouping in the patch passed. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 2m 21s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | LUCENE-8996 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984657/LUCENE-8996.04.patch | | Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns | | uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | ant | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh | | git revision | master / 5c6a299eff9 | | ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 | | Default Java | LTS | | Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/220/testReport/ | | modules | C: lucene/grouping U: lucene/grouping | | Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/220/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.04.patch, LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ >
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16965060#comment-16965060 ] Christine Poerschke commented on LUCENE-8996: - Attached LUCENE-8996.04.patch file, it simply combines the Oct 22nd LUCENE-8996.patch and the Oct 23rd LUCENE-8996.02.patch files. Thoughts? > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.04.patch, LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16965059#comment-16965059 ] Christine Poerschke commented on LUCENE-8996: - Thanks [~munendrasn] and [~diegoceccarelli] for your input and insights! Let me try to sum up things so far and what could be next. * The main objective of this bug fix ticket is to correctly return {{maxScore}} for groups that are _not_ represented on all shards. ** Currently {{maxScore}} is missing because {{TopGroups.merge}} does a {{maxScore = Math.max(maxScore, shardGroupDocs.maxScore);}} computation with assumptions. ** The computation assumes that {{shardGroupDocs.maxScore}} is always a number when in fact it can be {{NaN}} if that group is not represented on that shard. ** {{Math.max(a, b)}} returns {{NaN}} if either value is {{NaN}} ** code links: *** [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.3.0/lucene/grouping/src/java/org/apache/lucene/search/grouping/TopGroups.java#L172] *** [https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html#max-float-float-] * We now know that {{shardGroupDocs.maxScore}} could also be absent if the group was represented on the shard but _scores were not requested_. ** In this scenario changing {{TopGroups.merge}} to ignore any {{NaN}} scores had a side effect as illustrated by the {{TestDistributedGrouping}} test failure. ** If all shards return a {{maxScore}} of {{NaN}} then the {{TopGroups.merge}} computation (with {{nonNANmax}} instead of {{Math.max}}) will return as the overall result whatever the initial value of the local {{maxScore}} variable was. ** The local {{maxScore}} variable is currently initialised to {{Float.MIN_VALUE}}. ** If scores were not requested then they should not be returned in the response (but as per SOLR-6612 they currently are returned in distributed mode but not in non-distributed mode). * The {{TestDistributedGrouping}} logic could be adjusted to skip the {{maxScore}} comparison (until SOLR-6612 is fixed) or the {{GroupedEndResultTransformer}} response composition logic could be adjusted to screen out {{Float.MIN_VALUE}} scores but irrespective of that {{TopGroups.merge}} returning a {{MIN_VALUE}} score if given (say) three {{NaN}} scores seems surprising and the {{NaN}} that is currently being returned seems more logical. * If we changed the initialisation of the local {{maxScore}} variable from {{MIN_VALUE}} to {{NaN}} then groups without scores would merge into an overall {{NaN}} i.e. no overall score. * Local initialisation changes pose a backwards compatibility question: are there cases where previously {{MIN_VALUE}} was returned and where now {{NaN}} would be returned instead? ** Code inspection shows that {{MIN_VALUE}} would be returned if the groups and shards for-loops do not run. *** [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.3.0/lucene/grouping/src/java/org/apache/lucene/search/grouping/TopGroups.java#L138-L227] *** The shards for-loop will always run since the no-shards case already resulted in a {{return}} at L105. *** In the Apache Lucene/Solr code base there is only one non-test caller of the {{TopGroups.merge}} method: [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.3.0/solr/core/src/java/org/apache/solr/search/grouping/distributed/responseprocessor/TopGroupsShardResponseProcessor.java#L153-L168] *** The groups for-loop will always run since no-groups results in a {{continue}} at L156. * Per the above the local initialisation changes would pose no backwards compatibility concerns within the Apache Lucene/Solr code base. * Considering any {{TopGroups.merge}} callers outside the Lucene/Solr code base, some {{lucene/CHANGES.txt}} wording like the following could be used to document the change in behaviour. _"TopGroups.merge now returns Float.NaN instead of Float.MIN_VALUE if no groups are passed as input."_ * However, I would suggest that such wording would be unnecessary since merging empty group arrays seems unlikely and exotic? > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.patch, LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16962197#comment-16962197 ] ASF subversion and git services commented on LUCENE-8996: - Commit 788e88f5b105c358caf8bd69136f5e542524e1b5 in lucene-solr's branch refs/heads/jira/SOLR-13101 from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=788e88f ] Revert "LUCENE-8996: maxScore was sometimes missing from distributed grouped responses." This reverts commit 0afe0b6aa1a82589f30444dc5403c07c5be04f11. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.patch, LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16962196#comment-16962196 ] ASF subversion and git services commented on LUCENE-8996: - Commit 0afe0b6aa1a82589f30444dc5403c07c5be04f11 in lucene-solr's branch refs/heads/jira/SOLR-13101 from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=0afe0b6 ] LUCENE-8996: maxScore was sometimes missing from distributed grouped responses. (Julien Massenet, Diego Ceccarelli, Christine Poerschke) Resolved Conflicts: lucene/grouping/src/java/org/apache/lucene/search/grouping/TopGroups.java > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.patch, LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958814#comment-16958814 ] Diego Ceccarelli commented on LUCENE-8996: -- Interesting - I'm curious to dig into why the distribute query returned {{maxScores==Float.MIN}} given that with [~cpoerschke] we thought very rare to happen. It seems that the query that produced the failure: {code:java} ... webapp= path=/select params={q=*:*&shards=http://127.0.0.1:62435/collection1|[ff01::213]:2/|[ff01::114]:2/|[ff01::083]:2/&fl=id,a_i1&group.limit=-1&sort=a_i1+asc,+id+asc&rows=100&wt=javabin&version=2&group.field=a_i1&group=true} status=0 QTime=10 [junit4] 2> 2169480 ERROR (TEST-TestDistributedGrouping.test-seed#[9532D3AC873FA0D8]) [ ] o.a.s.BaseDistributedSearchTestCase Mismatched responses: {code} .. was a MatchAll {{*:*}} + sort - I'm wondering if this particular combination produces some particular effect on the computation of MaxScore, I'm going to play with the unit tests. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.patch, LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958664#comment-16958664 ] Lucene/Solr QA commented on LUCENE-8996: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 19s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 46s{color} | {color:green} grouping in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 87m 5s{color} | {color:green} core in the patch passed. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 97m 36s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | LUCENE-8996 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12983895/LUCENE-8996.03.patch | | Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns | | uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | ant | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh | | git revision | master / f71e4b2 | | ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 | | Default Java | LTS | | Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/213/testReport/ | | modules | C: lucene/grouping solr/core U: . | | Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/213/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.patch, LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, >
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958555#comment-16958555 ] Munendra S N commented on LUCENE-8996: -- [^LUCENE-8996.03.patch] Patch with {{maxScore}} skip. I haven't added {{score}} to each test as we hit encounter SOLR-13823 > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.03.patch, > LUCENE-8996.patch, LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958541#comment-16958541 ] Munendra S N commented on LUCENE-8996: -- I think SOLR-6612 is the cause for failure. {{maxScore}} could be returned in distrib mode even when {{score}} is not requested but in standalone it is returned only if {{score}} is requested. Before this fix, maxScore used to be NAN in these cases so, it was never returned but now, it would be MIN_VALUE will be returned I tried it with adding {{score}} to this {code:java} query("q", "*:*", "rows", 100, "fl", "id," + i1, "group", "true", "group.field", i1, "group.limit", -1, "sort", i1 + " asc, id asc"); {code} The test started passing(but I haven't done beasting) Other idea is to skip {{maxScore}} comparison. This is already being done in tests but this can be moved to top. With this change, tests started passing again needs some beasting {code:java} https://github.com/apache/lucene-solr/blob/f71e4b210ddd8b329813503cd3df9ed343a18b66/solr/core/src/test/org/apache/solr/TestDistributedGrouping.java#L322 {code} > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-8996.02.patch, LUCENE-8996.patch, > LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958041#comment-16958041 ] ASF subversion and git services commented on LUCENE-8996: - Commit 3a3df4784089eb449109b9037b0dde15996f41ce in lucene-solr's branch refs/heads/master from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3a3df47 ] Revert "LUCENE-8996: maxScore was sometimes missing from distributed grouped responses." This reverts commit 5289fce4bf857861af3644f6240a8369aa9f5407. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.3 > > Attachments: LUCENE-8996.02.patch, LUCENE-8996.patch, > LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958039#comment-16958039 ] ASF subversion and git services commented on LUCENE-8996: - Commit acaaf4c2702e6e0138916617fd2780a04c172522 in lucene-solr's branch refs/heads/branch_8x from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=acaaf4c ] Revert "LUCENE-8996: maxScore was sometimes missing from distributed grouped responses." This reverts commit fb37f89af27d881efb0f7b40eff2dc76ef5bc2c4. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.3 > > Attachments: LUCENE-8996.02.patch, LUCENE-8996.patch, > LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958038#comment-16958038 ] ASF subversion and git services commented on LUCENE-8996: - Commit 788e88f5b105c358caf8bd69136f5e542524e1b5 in lucene-solr's branch refs/heads/branch_8_3 from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=788e88f ] Revert "LUCENE-8996: maxScore was sometimes missing from distributed grouped responses." This reverts commit 0afe0b6aa1a82589f30444dc5403c07c5be04f11. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.3 > > Attachments: LUCENE-8996.02.patch, LUCENE-8996.patch, > LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958037#comment-16958037 ] Christine Poerschke commented on LUCENE-8996: - Investigating this further and confidently coming up with an alternative fix would likely hold up the 8.3.0 release process. Any and all input and help appreciated, as always. In the meantime I'll proceed to revert the commits above so as to avoid delaying the release process. Version control is (sometimes at least) meant for that, right? Sorry about the noise. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.3 > > Attachments: LUCENE-8996.02.patch, LUCENE-8996.patch, > LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958035#comment-16958035 ] Christine Poerschke commented on LUCENE-8996: - Well, changing the two local variable initialisations from {{Float.MIN_VALUE}} to {{Float.NaN}} means that Lucene and the Solr tests now all pass for me locally now (and I've learnt to not let only Jenkins run all the tests in scenarios like this) but reading further around the code base found more interesting things: * {{QueryComponent}} instantiates a choice of three {{EndResultTransformer}} objects ** [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java#L655-L668] * {{GroupedEndResultTransformer}} has NaN-logic (and thus any MIN_VALUE now makes it through I would speculate when previously that NaN logic would have screened out the NaN and instead give a missing maxScore) ** [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/search/grouping/endresulttransformer/GroupedEndResultTransformer.java#L92-L112] * {{MainEndResultTransformer}} and {{SimpleEndResultTransformer}} have a mix of NEGATIVE_INFINITY and NaN logic. ** [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/search/grouping/endresulttransformer/MainEndResultTransformer.java#L59-L84] ** [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/search/grouping/endresulttransformer/SimpleEndResultTransformer.java#L55-L84] The difference in Grouped vs. Main-or-Simple end result transformer logic surprises me, as does the apparent absence of a maxScore in the response in the non-distributed response. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.3 > > Attachments: LUCENE-8996.02.patch, LUCENE-8996.patch, > LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957977#comment-16957977 ] Christine Poerschke commented on LUCENE-8996: - bq. ... changing the local variable initialisations ... Just attached LUCENE-8996.02.patch file has that change and includes {{TopGroupsTest}} test expectation adjustments. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.3 > > Attachments: LUCENE-8996.02.patch, LUCENE-8996.patch, > LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957972#comment-16957972 ] Christine Poerschke commented on LUCENE-8996: - This is an interesting one. * As per [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/test-framework/src/java/org/apache/solr/BaseDistributedSearchTestCase.java#L680] the meaning of the {{1.4E-45!=null}} is that the left side is distributed and the right side is non-distributed. * So the expectation here is for a {{null}} result and previously the result would presumably have been {{null}} due to the Math-max-on-NaN calculation bug but now with that calculation fixed a number is returned (would {{1.4E-45}} be {{MIN_FLOAT}} by any chance?) and thus distributed and non-distributed response comparison finds a mismatch. Will dig a little further. From initial investigation it would seem changing the local variable initialisations (as was originally proposed and included as part of the patch and pull request here) provides a solution (though might in some way not maintain backwards compatibility) i.e. {code:java} -float totalMaxScore = Float.MIN_VALUE; +float totalMaxScore = Float.NaN; ... - float maxScore = Float.MIN_VALUE; + float maxScore = Float.NaN; {code} > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.3 > > Attachments: LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957824#comment-16957824 ] Christine Poerschke commented on LUCENE-8996: - bq. ... step 3: LUCENE-9010 further test coverage and refining, including possible removal and replacement of the initial tests PS: I've converted LUCENE-9010 sub-task (of this ticket) to ticket of its own so that its life can extend beyond LUCNE-8996 here. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.3 > > Attachments: LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957818#comment-16957818 ] ASF subversion and git services commented on LUCENE-8996: - Commit 0afe0b6aa1a82589f30444dc5403c07c5be04f11 in lucene-solr's branch refs/heads/branch_8_3 from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=0afe0b6 ] LUCENE-8996: maxScore was sometimes missing from distributed grouped responses. (Julien Massenet, Diego Ceccarelli, Christine Poerschke) Resolved Conflicts: lucene/grouping/src/java/org/apache/lucene/search/grouping/TopGroups.java > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957815#comment-16957815 ] Christine Poerschke commented on LUCENE-8996: - bq. Commit fb37f89af27d881efb0f7b40eff2dc76ef5bc2c4 in lucene-solr's branch refs/heads/branch_8x ... Resolved Conflicts: lucene/grouping/src/java/org/apache/lucene/search/grouping/TopGroups.java This was a trivial cherry-pick conflict due to LUCENE-8857 being on master branch only. branch_8x to branch_8_3 cherry-picking after that was (as expected) without conflicts. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957798#comment-16957798 ] ASF subversion and git services commented on LUCENE-8996: - Commit fb37f89af27d881efb0f7b40eff2dc76ef5bc2c4 in lucene-solr's branch refs/heads/branch_8x from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fb37f89 ] LUCENE-8996: maxScore was sometimes missing from distributed grouped responses. (Julien Massenet, Diego Ceccarelli, Christine Poerschke) Resolved Conflicts: lucene/grouping/src/java/org/apache/lucene/search/grouping/TopGroups.java > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957767#comment-16957767 ] ASF subversion and git services commented on LUCENE-8996: - Commit 5289fce4bf857861af3644f6240a8369aa9f5407 in lucene-solr's branch refs/heads/master from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5289fce ] LUCENE-8996: maxScore was sometimes missing from distributed grouped responses. (Julien Massenet, Diego Ceccarelli, Christine Poerschke) > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957612#comment-16957612 ] Lucene/Solr QA commented on LUCENE-8996: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s{color} | {color:green} grouping in the patch passed. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 2m 52s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | LUCENE-8996 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12983746/LUCENE-8996.patch | | Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns | | uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | ant | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh | | git revision | master / 517bfd0ab75 | | ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 | | Default Java | LTS | | Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/212/testReport/ | | modules | C: lucene/grouping U: lucene/grouping | | Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/212/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { >
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957147#comment-16957147 ] Christine Poerschke commented on LUCENE-8996: - Just attached LUCENE-8996.patch includes the {{@Ignore}} annotation removal stuff. And I'm going to go out on a limb here and: * commit the LUCENE-9010 test addition (totally happy for it to be reverted or replaced later) -- step 1 above * propose the fix for the issue here for the 8.3.0 RC2 respin -- step 2 above > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, LUCENE-8996.patch, > lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, > master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957141#comment-16957141 ] Christine Poerschke commented on LUCENE-8996: - {quote}... About the LUCENE-9010 I would keep fix and tests together - because the tests prove that you actually fixed the issue ... {quote} The LUCENE-9010 patch includes two {{@Ignore}}-ed tests ({{testAllGroupsEmptyInSecondPass}} and {{testSomeGroupsEmptyInSecondPass}}) which demonstrate the issue and the fix here then would remove the {{@Ignore}} annotation and thus prove fixing of the issue. {quote}... I'm not sure about the narrative tests, but I agree on too many numbers and I can fix the tests by: ... {quote} I wonder if the separate LUCENE-9010 could actually help with this: * step 1: LUCENE-9010 three initial tests for existing behaviour including two {{@Ignore}} annotated tests to demonstrate the issue * step 2: LUCENE-8996 fix for the issue including two {{@Ignore}} annotation removals as proof-of-fix * step 3: LUCENE-9010 further test coverage and refining, including possible removal and replacement of the initial tests > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953911#comment-16953911 ] Diego Ceccarelli commented on LUCENE-8996: -- Ah! Good point about the delete between the first and second pass! didn't think about that, rarely but it might happen! and I agree about fix MIN_VALUE in a second PR. About the LUCENE-9010 I would keep fix and tests together - because the tests prove that you actually fixed the issue and also they help understanding the issue (aside from the "lost of numbers there"). ( another example here: https://github.com/apache/lucene-solr/commit/d1706b36babda2dc17fd82d3165607f5c44bd83e) I'm not sure about the narrative tests, but I agree on too many numbers and I can fix the tests by: * Adding constants and remove the numbers * I like the colors, I can use them instead of group1 group2 * I also would like to make the tests shorter (maybe having more tests) what do you think? > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953080#comment-16953080 ] Christine Poerschke commented on LUCENE-8996: - Looking at the {{TopGroupsTest}} portion of both the patch and the pull request for this ticket I had some "there's a lot of numbers here" thoughts and it (subjectively, of course) seemed to me a little tricky to work out what they all are (numbers for shard index, numbers for doc id, numbers for group value, numbers for scores, numbers for hit counts, sometimes NaN not-a-number numbers) and what they mean and why/that the expected test results are correct. The LUCENE-9010 sub-task proposes to split out the addition of test coverage for the existing code from the 'maxScore missing' fix here (and the first proposed patch for it tries to reduce the "amount of numbers" e.g. instead of integer group values 1 and 2 there's string group values "red" and "blue" and a narrative and local variable names (redAntScore, blueDragonflyScore, redSquirrelScore, blueWhaleScore) try to make it easier to work out what the {{expectedMaxScore}} value is. > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953078#comment-16953078 ] Christine Poerschke commented on LUCENE-8996: - {quote}... If you merge two groups with no real maxScores the final result will be MIN_VALUE (NaN would make more sense imo) ... {quote} Yes, MIN_VALUE seems a quirky result for this edge case. Though if one were to change the existing behaviour it might be clearest to do that separately from the 'maxScore missing' fix here: here we are removing an erroneous case of 'maxScore missing' and changing away from MIN_VALUE would add a legitimate case of 'maxScore missing'. {quote}... this *should* never happen in theory because if no segment contains documents about group x it shouldn't be possible that we retrieve documents about group x in first place. ... {quote} I agree, in theory it should never happen though in practice I think there's a timing window of opportunity that could make it happen, though it would seem quite unlikely. The first pass of the distributed search could determine that there are segments with documents about group X but subsequently it could then be 'just so' that by the time the second pass of the search runs a few moments later the document(s) in group X have all been deleted? > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944432#comment-16944432 ] Diego Ceccarelli commented on LUCENE-8996: -- Thanks [~cpoerschke]: good point about the change of initialisation! I integrated your patch into the PR. Now it initialise the values with MIN_VALUE, I updated the tests and the only difference this line: https://github.com/apache/lucene-solr/blob/d5c93123135393a75b6766d0e89bf10d2277f2ad/lucene/grouping/src/test/org/apache/lucene/search/grouping/TopGroupsTest.java#L134 -> If you merge two groups with no real maxScores the final result will be MIN_VALUE (NaN would make more sense imo) but it's worth noting that this *should* never happen in theory because if no segment contains documents about group x it shouldn't be possible that we retrieve documents about group x in first place. What do you think? > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942115#comment-16942115 ] Christine Poerschke commented on LUCENE-8996: - Looking at the {{master-GroupingMaxScore.patch}} patch and the [https://github.com/apache/lucene-solr/pull/906] pull request I'm wondering if backwards compatibility considerations apply w.r.t. the initialisation of the {{maxScore}} and {{totalMaxScore}} local variables? Attached LUCENE-8996.patch tries to illustrate what I mean; it also adds javadocs for the "not-NaN max" helper method (but contains neither the patch's nor the pull request's tests, temporarily to make it (perhaps) easier to focus on the code change). Haven't looked too closely though yet i.e. it could be that whatever value is used to initialise those local variables never makes it "out of the method" or it never makes it into "client visibility" in which case the initial value of the local variable would be just implementation detail (with NaN likely the clearer choice). > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. > Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org