[ 
https://issues.apache.org/jira/browse/SOLR-8082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-8082:
---------------------------------------
    Attachment: SOLR-8082.patch

Here's a summary of my understanding / observations:
# Floats and doubles need to be converted to longs before writing them to 
NumericDocValues. 
# We have two options: Double.doubleToLongBits() and 
NumericUtils.doubleToSortableLong(). For positive doubles, both methods 
return the same long value; for negative doubles, they return different values.
# Currently, we write with Double.doubleToLongBits(). Hence, a term query against 
such docValues should encode the query value with the same method, but the current 
code uses NumericUtils.doubleToSortableLong(), so term queries against 
negative values fail. Similarly, range queries fail when min is negative.
# I tried changing the initial writing logic to use 
NumericUtils.doubleToSortableLong(). With this change, both term queries and 
range queries work, but sorting fails (when there are negative values). That is 
counterintuitive, since the individual long values themselves are in sorted 
order. Since this is an intrusive change that breaks backcompat, I didn't 
investigate further to understand why this happens.
# To arrive at the least intrusive fix, I tried changing the range query logic to 
split the query into two distinct ranges (negatives and positives) combined in 
a boolean query. I had to do this since the Double.doubleToLongBits() values 
are not monotonically increasing (they decrease from -Double.MAX_VALUE to 
0, but increase from 0 to Double.MAX_VALUE).
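
To make the encoding difference in the points above concrete, here is a minimal, self-contained sketch. It does not depend on Lucene: toSortableLong() is a hypothetical local helper that I am assuming mirrors what NumericUtils.doubleToSortableLong() does (flip the 63 non-sign bits when the sign bit is set). The class name is likewise only for illustration.

```java
public class DoubleBitsDemo {

    // Hypothetical local stand-in for NumericUtils.doubleToSortableLong();
    // assumed (not verified against any particular Lucene version) to match it.
    public static long toSortableLong(double value) {
        long bits = Double.doubleToLongBits(value);
        // (bits >> 63) is all ones for negative inputs and zero otherwise, so this
        // flips the 63 non-sign bits of negative values and leaves the rest untouched.
        return bits ^ ((bits >> 63) & 0x7fffffffffffffffL);
    }

    public static void main(String[] args) {
        double[] samples = {-4.3, -1.0, 1.0, 4.3};
        for (double d : samples) {
            // Print both encodings as 16-digit hex for side-by-side comparison.
            System.out.printf("%5.1f raw=%016x sortable=%016x%n",
                    d, Double.doubleToLongBits(d), toSortableLong(d));
        }
    }
}
```

For the positive samples the two columns agree; for the negative samples they differ, and the raw bits of -4.3 compare greater (as signed longs) than the raw bits of -1.0, which is the inverted ordering described in the last point.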

Attached is the patch for the last point, which I think is the least intrusive way 
to pull things together so that they work. When the range query crosses the 0 
boundary, there are two dv range queries, which is less efficient but better 
than not working at all (which is the case today). The patch passes the tests, 
but it might benefit from some neater refactoring.
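
For reviewers, the splitting idea can be sketched roughly as follows. This is not the code in the patch: it is a standalone illustration with hypothetical names (DocValuesRangeSplit, toLongRanges) that only computes the inclusive long bounds in Double.doubleToLongBits() space. Wrapping those bounds in docValues range queries and a boolean query is left out. It also assumes that -0.0 and +0.0 are treated as distinct encoded values, ordered by sign bit.

```java
import java.util.ArrayList;
import java.util.List;

public class DocValuesRangeSplit {

    /**
     * Translates an inclusive double range [min, max] into inclusive long ranges
     * over Double.doubleToLongBits() values. Each long[] is {lower, upper}.
     */
    public static List<long[]> toLongRanges(double min, double max) {
        List<long[]> ranges = new ArrayList<>();
        // Double.compare orders -0.0 before +0.0, matching the sign-bit logic below.
        if (Double.isNaN(min) || Double.isNaN(max) || Double.compare(min, max) > 0) {
            return ranges; // empty or invalid range
        }
        long minBits = Double.doubleToLongBits(min);
        long maxBits = Double.doubleToLongBits(max);
        boolean minNegative = minBits < 0; // sign bit set (includes -0.0)
        boolean maxNegative = maxBits < 0;

        if (minNegative) {
            // Negative side: raw bits grow as the double shrinks, so the bounds swap.
            // -0.0 encodes as Long.MIN_VALUE, the smallest raw value of any negative double.
            long lower = maxNegative ? maxBits : Long.MIN_VALUE;
            ranges.add(new long[] {lower, minBits});
        }
        if (!maxNegative) {
            // Non-negative side: raw bits follow numeric order; +0.0 encodes as 0L.
            long lower = minNegative ? 0L : minBits;
            ranges.add(new long[] {lower, maxBits});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A range crossing zero yields two long ranges, one per sign.
        for (long[] r : toLongRanges(-4.3, 2.0)) {
            System.out.printf("[%016x TO %016x]%n", r[0], r[1]);
        }
    }
}
```

A range entirely on one side of zero yields a single long range, so the extra cost of a second range only applies when min is negative and max is not.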

[~hossman] Can you please review? Do you think there's a cleaner way to do this?

> can't query against negative float or double values when indexed="false" 
> docValues="true" multiValued="false"
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-8082
>                 URL: https://issues.apache.org/jira/browse/SOLR-8082
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>         Attachments: SOLR-8082.patch, SOLR-8082.patch
>
>
> Haven't dug into this yet, but something is evidently wrong in how the 
> DocValues-based queries get built for single-valued float or double fields 
> when negative numbers are involved.
> Steps to reproduce...
> {noformat}
> $ bin/solr -e schemaless -noprompt
> ...
> $ curl -X POST -H 'Content-type:application/json' --data-binary '{ 
> "add-field":{ "name":"f_dv_multi", "type":"tfloat", "stored":"true", 
> "indexed":"false", "docValues":"true", "multiValued":"true" }, "add-field":{ 
> "name":"f_dv_single", "type":"tfloat", "stored":"true", "indexed":"false", 
> "docValues":"true", "multiValued":"false" } }' 
> http://localhost:8983/solr/gettingstarted/schema
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":84}}
> $ curl -X POST -H 'Content-type:application/json' --data-binary 
> '[{"id":"test", "f_dv_multi":-4.3, "f_dv_single":-4.3}]' 
> 'http://localhost:8983/solr/gettingstarted/update/json/docs?commit=true'
> {"responseHeader":{"status":0,"QTime":57}}
> $ curl 'http://localhost:8983/solr/gettingstarted/query?q=f_dv_multi:"-4.3"'
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":5,
>     "params":{
>       "q":"f_dv_multi:\"-4.3\""}},
>   "response":{"numFound":1,"start":0,"docs":[
>       {
>         "id":"test",
>         "f_dv_multi":[-4.3],
>         "f_dv_single":-4.3,
>         "_version_":1512962117004689408}]
>   }}
> $ curl 'http://localhost:8983/solr/gettingstarted/query?q=f_dv_single:"-4.3"'
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":5,
>     "params":{
>       "q":"f_dv_single:\"-4.3\""}},
>   "response":{"numFound":0,"start":0,"docs":[]
>   }}
> {noformat}
> Explicit range queries (which is how numeric "field" queries are implemented 
> under the covers) are equally problematic...
> {noformat}
> $ curl 
> 'http://localhost:8983/solr/gettingstarted/query?q=f_dv_multi:%5B-4.3+TO+-4.3%5D'
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":0,
>     "params":{
>       "q":"f_dv_multi:[-4.3 TO -4.3]"}},
>   "response":{"numFound":1,"start":0,"docs":[
>       {
>         "id":"test",
>         "f_dv_multi":[-4.3],
>         "f_dv_single":-4.3,
>         "_version_":1512962117004689408}]
>   }}
> $ curl 
> 'http://localhost:8983/solr/gettingstarted/query?q=f_dv_single:%5B-4.3+TO+-4.3%5D'
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":0,
>     "params":{
>       "q":"f_dv_single:[-4.3 TO -4.3]"}},
>   "response":{"numFound":0,"start":0,"docs":[]
>   }}
> {noformat}


