[jira] [Comment Edited] (SOLR-13126) Multiplicative boost of isn't applied when one of the summed or multiplied queries doesn't match

2019-05-13 Thread Thomas Zillinger (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838310#comment-16838310
 ] 

Thomas Zillinger edited comment on SOLR-13126 at 5/13/19 6:37 AM:
--

[~janhoy] could this also be backported to Solr 7.3 - 7.5?


was (Author: tomzi):
[~janhoy] could this also be backported to Solr 7.5?

> Multiplicative boost of isn't applied when one of the summed or multiplied 
> queries doesn't match 
> -
>
> Key: SOLR-13126
> URL: https://issues.apache.org/jira/browse/SOLR-13126
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: search
> Affects Versions: 7.3, 7.4, 7.6, 7.7, 7.5.0, 7.7.1
> Environment: Reproduced with macOS 10.14.1; a quick test with Windows 10 
> showed the same result.
> Reporter: Thomas Aglassinger
> Assignee: Alan Woodward
> Priority: Major
> Fix For: 7.7.2, 8.0
>
> Attachments: 
> 0001-use-deprecated-classes-to-fix-regression-introduced-.patch, 
> 0002-SOLR-13126-Added-test-case.patch, 2019-02-14_1715.png, SOLR-13126.patch, 
> SOLR-13126.patch, debugQuery.json, image-2019-02-13-16-17-56-272.png, 
> screenshot-1.png, solr_match_neither_nextteil_nor_sony.json, 
> solr_match_neither_nextteil_nor_sony.txt, solr_match_netzteil_and_sony.json, 
> solr_match_netzteil_and_sony.txt, solr_match_netzteil_only.json, 
> solr_match_netzteil_only.txt
>
>
> Under certain circumstances, queries with multiple multiplicative boosts 
> using the Solr functions {{product()}} and {{query()}} return a score that is 
> inconsistent with the one in the debugQuery information. Only the debug 
> score is correct; the actual search results show a wrong score.
> This seems somewhat similar to the behaviour described in 
> https://issues.apache.org/jira/browse/LUCENE-7132, though that issue was 
> resolved a while ago.
> A little background: we are using Solr as a search platform for the 
> e-commerce framework SAP Hybris. There the shop administrator can create 
> multiplicative boost rules (see below for an example) where a value like 2.0 
> means that an item gets boosted to 200%. This works fine in the demo shop 
> distributed by SAP but breaks in our shop. We encountered the issue when 
> upgrading from Solr 7.2.1 / Hybris 6.7 to Solr 7.5 / Hybris 18.8.3 (which 
> would have been named Hybris 6.8, but the version naming scheme changed).
> We reduced the Solr query generated by Hybris to the relevant parts and could 
> reproduce the issue in the Solr admin without any Hybris connection.
> I attached the JSON result of a test query but here's a description of the 
> parts that seemed most relevant to me.
> The {{responseHeader.params}} reads (slightly rearranged):
> {code:java}
> "q":"{!boost b=$ymb}(+{!lucene v=$yq})",
> "ymb":"product(query({!v=\"name_text_de\\:Netzteil\\^=2.0\"},1),query({!v=\"name_text_de\\:Sony\\^=3.0\"},1))",
> "yq":"*:*",
> "sort":"score desc",
> "debugQuery":"true",
> // Added to keep the output small but probably unrelated to the actual issue
> "fl":"score,id,code_string,name_text_de",
> "fq":"catalogId:\"someProducts\"",
> "rows":"10",
> {code}
> This example boosts the German product name (field {{name_text_de}}) if 
> it contains certain terms:
>  * "Netzteil" (power supply) is boosted to 200%
>  * "Sony" is boosted to 300%
> Consequently, a product containing both terms should be boosted to 600%.
> The query function also specifies 1 as the default value, so that when the 
> name does not contain the respective term the result is a pseudo boost that 
> preserves the score.
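
The expected arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the boost factors one would expect, not how Solr evaluates {{product()}} or {{query()}}; the names TERM_BOOSTS, DEFAULT and expected_boost are hypothetical, and lowercased whitespace splitting is an assumed stand-in for whatever analysis the {{name_text_de}} field really uses.

```python
# Hypothetical illustration of the expected boost arithmetic, not Solr code.

# Constant-score boosts taken from the $ymb parameter of the test query.
TERM_BOOSTS = {"netzteil": 2.0, "sony": 3.0}

# Second argument of query(..., 1): the value used when the term does not match.
DEFAULT = 1.0


def expected_boost(name_text_de: str) -> float:
    """Multiply the boost of every matching term, using DEFAULT otherwise."""
    # Assumption: lowercased whitespace tokens approximate the field analysis.
    tokens = set(name_text_de.lower().split())
    factor = 1.0
    for term, boost in TERM_BOOSTS.items():
        factor *= boost if term in tokens else DEFAULT
    return factor
```

Under these assumptions, a name containing both terms yields a factor of 6.0, a name containing only "Netzteil" yields 2.0, and a name containing neither yields 1.0.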
> According to the debug information the parser used is the LuceneQParser, 
> which translates this to the following parsed query:
> {quote}FunctionScoreQuery(FunctionScoreQuery(+*:*, scored by 
> boost(product(query((ConstantScore(name_text_de:netzteil))^2.0,def=1.0),query((ConstantScore(name_text_de:sony))^3.0,def=1.0)
> {quote}
> And the translated boost is:
> {quote}org.apache.lucene.queries.function.valuesource.ProductFloatFunction:product(query((ConstantScore(name_text_de:netzteil))^2.0,def=1.0),query((ConstantScore(name_text_de:sony))^3.0,def=1.0))
> {quote}
> The search result includes, among others, the following products (see the 
> JSON comments for an analysis of each result):
> {code:javascript}
>  {
> "id":"someProducts/Online/test711",
> "name_text_de":"Original Sony Vaio Netzteil",
> "code_string":"test711",
> // CORRECT, both "Netzteil" and "Sony" are included in the name
> "score":6.0},
>   {
> "id":"someProducts/Online/taxTestingProductThree",
>  

[jira] [Commented] (SOLR-13126) Multiplicative boost of isn't applied when one of the summed or multiplied queries doesn't match

2019-05-13 Thread Thomas Zillinger (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838310#comment-16838310
 ] 

Thomas Zillinger commented on SOLR-13126:
-

[~janhoy] could this also be backported to Solr 7.5?

> Multiplicative boost of isn't applied when one of the summed or multiplied 
> queries doesn't match 
> -
>
> Key: SOLR-13126
> URL: https://issues.apache.org/jira/browse/SOLR-13126
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: search
> Affects Versions: 7.3, 7.4, 7.6, 7.7, 7.5.0, 7.7.1
> Environment: Reproduced with macOS 10.14.1; a quick test with Windows 10 
> showed the same result.
> Reporter: Thomas Aglassinger
> Assignee: Alan Woodward
> Priority: Major
> Fix For: 7.7.2, 8.0
>
> Attachments: 
> 0001-use-deprecated-classes-to-fix-regression-introduced-.patch, 
> 0002-SOLR-13126-Added-test-case.patch, 2019-02-14_1715.png, SOLR-13126.patch, 
> SOLR-13126.patch, debugQuery.json, image-2019-02-13-16-17-56-272.png, 
> screenshot-1.png, solr_match_neither_nextteil_nor_sony.json, 
> solr_match_neither_nextteil_nor_sony.txt, solr_match_netzteil_and_sony.json, 
> solr_match_netzteil_and_sony.txt, solr_match_netzteil_only.json, 
> solr_match_netzteil_only.txt
>
>
> Under certain circumstances, queries with multiple multiplicative boosts 
> using the Solr functions {{product()}} and {{query()}} return a score that is 
> inconsistent with the one in the debugQuery information. Only the debug 
> score is correct; the actual search results show a wrong score.
> This seems somewhat similar to the behaviour described in 
> https://issues.apache.org/jira/browse/LUCENE-7132, though that issue was 
> resolved a while ago.
> A little background: we are using Solr as a search platform for the 
> e-commerce framework SAP Hybris. There the shop administrator can create 
> multiplicative boost rules (see below for an example) where a value like 2.0 
> means that an item gets boosted to 200%. This works fine in the demo shop 
> distributed by SAP but breaks in our shop. We encountered the issue when 
> upgrading from Solr 7.2.1 / Hybris 6.7 to Solr 7.5 / Hybris 18.8.3 (which 
> would have been named Hybris 6.8, but the version naming scheme changed).
> We reduced the Solr query generated by Hybris to the relevant parts and could 
> reproduce the issue in the Solr admin without any Hybris connection.
> I attached the JSON result of a test query but here's a description of the 
> parts that seemed most relevant to me.
> The {{responseHeader.params}} reads (slightly rearranged):
> {code:java}
> "q":"{!boost b=$ymb}(+{!lucene v=$yq})",
> "ymb":"product(query({!v=\"name_text_de\\:Netzteil\\^=2.0\"},1),query({!v=\"name_text_de\\:Sony\\^=3.0\"},1))",
> "yq":"*:*",
> "sort":"score desc",
> "debugQuery":"true",
> // Added to keep the output small but probably unrelated to the actual issue
> "fl":"score,id,code_string,name_text_de",
> "fq":"catalogId:\"someProducts\"",
> "rows":"10",
> {code}
> This example boosts the German product name (field {{name_text_de}}) if 
> it contains certain terms:
>  * "Netzteil" (power supply) is boosted to 200%
>  * "Sony" is boosted to 300%
> Consequently, a product containing both terms should be boosted to 600%.
> The query function also specifies 1 as the default value, so that when the 
> name does not contain the respective term the result is a pseudo boost that 
> preserves the score.
> According to the debug information the parser used is the LuceneQParser, 
> which translates this to the following parsed query:
> {quote}FunctionScoreQuery(FunctionScoreQuery(+*:*, scored by 
> boost(product(query((ConstantScore(name_text_de:netzteil))^2.0,def=1.0),query((ConstantScore(name_text_de:sony))^3.0,def=1.0)
> {quote}
> And the translated boost is:
> {quote}org.apache.lucene.queries.function.valuesource.ProductFloatFunction:product(query((ConstantScore(name_text_de:netzteil))^2.0,def=1.0),query((ConstantScore(name_text_de:sony))^3.0,def=1.0))
> {quote}
> The search result includes, among others, the following products (see the 
> JSON comments for an analysis of each result):
> {code:javascript}
>  {
> "id":"someProducts/Online/test711",
> "name_text_de":"Original Sony Vaio Netzteil",
> "code_string":"test711",
> // CORRECT, both "Netzteil" and "Sony" are included in the name
> "score":6.0},
>   {
> "id":"someProducts/Online/taxTestingProductThree",
> "name_text_de":"Steuertestprodukt Zwei",
> "code_string":"taxTestingProductThree",
> // CORRECT, neither