[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500427 ]

Doug Cutting commented on LUCENE-698:
-------------------------------------

> If boost is zero, then sumOfSquaredWeights() returns zero as well, resulting in a queryNorm of Infinity (due to a division by zero if DefaultSimilarity is used). Then it multiplies boost and queryNorm, and 0*Infinity=NaN.

The bug here, to me, seems to be that queryNorm is Infinity. A boost of zero has a reasonable interpretation (don't influence scoring), but I don't see how a queryNorm of Infinity is ever useful. So perhaps we can remove the NaN by modifying the default implementation of queryNorm to return 1.0 instead of Infinity when passed zero. Would that cause any harm?

FilteredQuery ignores boost
---------------------------

Key: LUCENE-698
URL: https://issues.apache.org/jira/browse/LUCENE-698
Project: Lucene - Java
Issue Type: Bug
Components: Search
Affects Versions: 2.0.0
Reporter: Yonik Seeley
Assignee: Michael Busch
Priority: Minor
Fix For: 2.2
Attachments: lucene-698.patch

Filtered query ignores its own boost.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
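The arithmetic Doug Cutting describes, and the guard he proposes, can be reproduced without Lucene. The following is only a minimal sketch: the class and method names are hypothetical, the formula 1/sqrt(sumOfSquaredWeights) is assumed from the thread's mention of a division by zero, and using boost squared as the sum is a simplification that leaves out other weight factors.

```java
// Standalone sketch; it does not use or reproduce Lucene's DefaultSimilarity.
// All names here are hypothetical and chosen for illustration only.
final class QueryNormSketch {

    // Unguarded form assumed from the thread: 1/sqrt(0) evaluates to Infinity.
    static float unguardedQueryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    // Guard proposed in the comment: return 1.0 when passed zero.
    static float guardedQueryNorm(float sumOfSquaredWeights) {
        if (sumOfSquaredWeights == 0.0f) {
            return 1.0f;
        }
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    public static void main(String[] args) {
        float boost = 0.0f;
        // Simplification: a real query weight includes more factors than the boost.
        float sumOfSquaredWeights = boost * boost;

        // 0 * Infinity is NaN under IEEE 754 arithmetic.
        System.out.println(boost * unguardedQueryNorm(sumOfSquaredWeights)); // prints NaN
        // With the guard, the product is an ordinary zero.
        System.out.println(boost * guardedQueryNorm(sumOfSquaredWeights));   // prints 0.0
    }
}
```

With the guard in place, a zero boost yields a score contribution of zero instead of NaN, which matches the "don't influence scoring" interpretation given above. Custom similarity implementations would still need their own guard.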
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500552 ]

Michael Busch commented on LUCENE-698:
--------------------------------------

> So perhaps we can remove the NaN by modifying the default implementation of queryNorm to return 1.0 instead of Infinity when passed zero. Would that cause any harm?

Yes, I believe this should work, too. This would prevent the NaN score when DefaultSimilarity is used. It will then be the responsibility of people who implement their own Similarity to take care of this in a similar way.

I'll open a new issue for fixing the DefaultSimilarity.
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500252 ]

Michael Busch commented on LUCENE-698:
--------------------------------------

> FYI, Nutch uses Query.setBoost(0.0f) to add clauses which affect the set of results but not their ranking. In particular, it uses this to automatically convert query clauses into filters, so that query clauses like lang:en can be implemented as cached filters. Note that not all such clauses are so optimized.

Thanks for the hint, Doug. OK, I understand how you use boost=0.0f in Nutch. Quite a cool and elegant idea, actually! I guess throwing an IllegalArgumentException in case boost=0 would then break this.

The question remains whether we should fix the scorers to never return NaN. Hmm, I'm not completely sure how to do this. Maybe DefaultSimilarity.queryNorm() should return 0 instead of Infinity in case sumOfSquaredWeights is 0. But then with custom Similarity implementations we could still end up getting NaN. A different solution, of course, is to fix it in the scorers themselves, so that they return a score of 0 in case boost is 0. But then we'd have to add checks in the score() and explain() methods, which might be a performance overhead.

So I'm not sure if we should fix this at all, considering these difficulties and the fact that nobody has complained (I think?) about the NaN so far. Anyway, I'll go ahead and commit LUCENE-698, since this NaN problem is a separate issue and does not only happen for FilteredQuery.
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499652 ]

Hoss Man commented on LUCENE-698:
---------------------------------

I think the test class, and test case testFQ7 in particular, are correct in the sense that they try to verify that every conceivable permutation of stock query types has an explanation that matches its score ... the problem may just be in the CheckHits.ExplanationAsserter class ... perhaps it should test whether either the score or the explanation value is NaN before comparing them (and fail if only one is NaN, or if neither is NaN but they are not equal).

(After all: if the score is NaN, then the explanation should be NaN as well.)
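The comparison rule suggested in this comment can be written down as a small predicate. This is a hedged sketch only, not the code of CheckHits.ExplanationAsserter: the class name, the method name, and the `delta` tolerance parameter are all assumptions made for illustration.

```java
// Standalone sketch of a NaN-aware comparison between a score and the
// value reported by its explanation. Names and the tolerance are hypothetical.
final class ScoreComparisonSketch {

    // Returns true when the two values should be treated as matching:
    //  - both NaN: match
    //  - exactly one NaN: mismatch
    //  - neither NaN: match if they are equal within delta
    static boolean scoresAgree(float score, float explained, float delta) {
        boolean scoreIsNaN = Float.isNaN(score);
        boolean explainedIsNaN = Float.isNaN(explained);
        if (scoreIsNaN || explainedIsNaN) {
            return scoreIsNaN && explainedIsNaN;
        }
        // The exact-equality check covers equal infinities, whose
        // difference would itself be NaN.
        return score == explained || Math.abs(score - explained) <= delta;
    }

    public static void main(String[] args) {
        System.out.println(scoresAgree(Float.NaN, Float.NaN, 0.0001f)); // prints true
        System.out.println(scoresAgree(Float.NaN, 0.0f, 0.0001f));      // prints false
        System.out.println(scoresAgree(1.5f, 1.5f, 0.0001f));           // prints true
    }
}
```

A plain `==` or a delta-based assertion alone cannot express this rule, because any comparison involving NaN evaluates to false, including NaN compared with itself.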
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499658 ]

Michael Busch commented on LUCENE-698:
--------------------------------------

> perhaps it should test if either the score or the explanation value are NaN before comparing them, and fail if only one is NaN or if neither is NaN but they are not equal)

Thanks for reviewing, Hoss! You are right, we could do that, and I was actually thinking about it already. The problem is that if I make this fix, then testFQ7 fails for TestSimpleExplanationsOfNonMatches, because it is assumed that all non-matching docs have a score of 0.0. I can easily change that, so that non-matching docs can have a score of either 0.0 or NaN, but I was not sure if we want that, because other scoring bugs resulting in a score of NaN (which we will hopefully never have) would then no longer be noticed.

The reason why I argued that testFQ7 is an invalid test case is that it would fail for any other query with a boost set to 0. Ironically, we have this test only for FilteredQuery, the only query class that ignores the boost, which made it pass in the past.
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499668 ]

Hoss Man commented on LUCENE-698:
---------------------------------

Ah ... yes, looking back at the comments in LUCENE-557 I remember now: I originally thought boosts of 0.0 were legal for all queries, then discovered I was wrong and removed a bunch of tests -- but I clearly missed this one because it wasn't failing.

We should go ahead and remove the test ... but we should probably also fix FilteredQuery so that a boost of 0 produces some result other than just a NaN score (either an exception, or a score of 0) since, as you say, NaN scores are bad.
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499673 ]

Michael Busch commented on LUCENE-698:
--------------------------------------

> but we should probably also fix FilteredQuery so that a boost of 0 produces some other result then just a NaN score (either an exception, or a score of 0) since as you say: NaN scores are bad.

TermQuery actually behaves the same way. If boost is zero, then sumOfSquaredWeights() returns zero as well, resulting in a queryNorm of Infinity (due to a division by zero if DefaultSimilarity is used). Then it multiplies boost and queryNorm, and 0*Infinity=NaN.
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499674 ]

Michael Busch commented on LUCENE-698:
--------------------------------------

Maybe Query.setBoost() should throw an IllegalArgumentException in case the value is zero?
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499675 ]

Hoss Man commented on LUCENE-698:
---------------------------------

Hmmm.. didn't realize that.

I withdraw all previous comments. The patch seems fine to me.
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ https://issues.apache.org/jira/browse/LUCENE-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499676 ]

Hoss Man commented on LUCENE-698:
---------------------------------

Whoops .. comment collision.

I think the patch as it stands is fine for this issue .. but we may want another issue to holistically question NaN as a score.
[jira] Commented: (LUCENE-698) FilteredQuery ignores boost
[ http://issues.apache.org/jira/browse/LUCENE-698?page=comments#action_12444570 ]

Yonik Seeley commented on LUCENE-698:
-------------------------------------

I just committed hashCode() and equals() changes to take boost into account, so that the generic tests in QueryUtils.check(query) can pass.

One should arguably be able to set the boost on any query clause, so I'm leaving this open, since I think scoring probably ignores the boost too.
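For readers unfamiliar with what "taking boost into account" in hashCode() and equals() involves, here is a rough sketch. It is not the committed FilteredQuery change: the class, its single `clause` field, and the default boost of 1.0 are hypothetical stand-ins chosen to keep the example self-contained.

```java
// Standalone sketch of a query-like object whose equality and hash code
// include its boost. This is not Lucene code; all names are hypothetical.
final class BoostedClauseSketch {
    private final String clause;
    private float boost = 1.0f; // assumed default for this sketch

    BoostedClauseSketch(String clause) {
        this.clause = clause;
    }

    void setBoost(float boost) {
        this.boost = boost;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof BoostedClauseSketch)) {
            return false;
        }
        BoostedClauseSketch other = (BoostedClauseSketch) o;
        // Compare boosts by bit pattern so the result stays consistent with hashCode().
        return clause.equals(other.clause)
                && Float.floatToIntBits(boost) == Float.floatToIntBits(other.boost);
    }

    @Override
    public int hashCode() {
        // Mix the boost bits into the hash so equal objects hash equally.
        return clause.hashCode() ^ Float.floatToIntBits(boost);
    }
}
```

Two instances built from the same clause compare equal until one of them is given a different boost, which is the property a generic equality check on queries would exercise.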