[ 
https://issues.apache.org/jira/browse/CALCITE-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shlok Srivastava updated CALCITE-4292:
--------------------------------------
    Description: 
**Currently elasticsearch adapters ignore records which do not have the field 
specified in NotEqual clause, which is ideally included by elasticsearch.

*Query:*
{code:java}
SELECT* from zips WHERE name <> "NMAX"{code}
The elasticsearch query formed  for above  is this : 
{code:java}
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": {
            "exists": {
              "field": "Name"
            }
          },
          "must_not": {
            "term": {
              "Name": "NMQAX"
            }
          }
        }
      }
    }
  }
}
{code}
*Problem* : The above query ignores document which do not have _Name_ field 
which is ideally included by elasticsearch but ignored due to must exists 
condition. 

*Data:*
{noformat}
{ "_id" : "01701", "name" : "NMAX", "loc" : [ -71.42548600000001, 42.300665 ], 
"pop" : 65046, "state" : "MA" }
{ "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 ], 
"pop" : 57871, "state" : "MA" }
{ "_id" : "02401 , "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498, 
"state" : "MA" }
{noformat}
 

 
 *Expected result:* 
{noformat}
{ "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 ], 
"pop" : 57871, "state" : "MA" }
{ "_id" : "02401", "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498, 
"state" : "MA" }
{noformat}
 
 *Current Result:*
{noformat}
 { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 
], "pop" : 57871, "state" : "MA" }
{noformat}
RelNode for same - 
{code:java}
relB.not(relB.equals(relb.literal("Name"),relb.literal"NMQAX")){code}
 

*Solution* : Remove the exists condition from Not equals Query Expression. 
Elasticsearch doesn't put this condition therefore keeping queries in sync. 
[Code|https://github.com/apache/calcite/blob/1050b36cafbb0c487b7a2ade3efd12850609717e/elasticsearch/src/main/java/org/apache/calcite/adapter/elasticsearch/PredicateAnalyzer.java#L782]

  was:
*Query:*
{code:java}
SELECT* from zips WHERE name <> "NMAX"{code}
The elasticsearch query formed  for above  is this : 
{code:java}
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": {
            "exists": {
              "field": "Name"
            }
          },
          "must_not": {
            "term": {
              "Name": "NMQAX"
            }
          }
        }
      }
    }
  }
}
{code}
*Problem* : The above query ignores document which do not have _Name_ field 
which is ideally included by elasticsearch but ignored due to must exists 
condition. 

*Data:*
{noformat}
{ "_id" : "01701", "name" : "NMAX", "loc" : [ -71.42548600000001, 42.300665 ], 
"pop" : 65046, "state" : "MA" }
{ "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 ], 
"pop" : 57871, "state" : "MA" }
{ "_id" : "02401 , "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498, 
"state" : "MA" }
{noformat}
 

 
 *Expected result:* 
{noformat}
{ "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 ], 
"pop" : 57871, "state" : "MA" }
{ "_id" : "02401", "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498, 
"state" : "MA" }
{noformat}
 
 *Current Result:*
{noformat}
 { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 
], "pop" : 57871, "state" : "MA" }
{noformat}
RelNode for same - 
{code:java}
relB.not(relB.equals(relb.literal("Name"),relb.literal"NMQAX")){code}
 

*Solution* : Remove the exists condition from Not equals Query Expression. 
Elasticsearch doesn't put this condition therefore keeping queries in sync. 
[Code|https://github.com/apache/calcite/blob/1050b36cafbb0c487b7a2ade3efd12850609717e/elasticsearch/src/main/java/org/apache/calcite/adapter/elasticsearch/PredicateAnalyzer.java#L782]


> Wrong results in ElasticSearch when query contains NOT EQUAL
> ------------------------------------------------------------
>
>                 Key: CALCITE-4292
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4292
>             Project: Calcite
>          Issue Type: Bug
>          Components: elasticsearch-adapter
>            Reporter: Shlok Srivastava
>            Assignee: Bill Neil
>            Priority: Major
>              Labels: ElasticSearch, NotEquals, QueryBuilder, calcite, 
> pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> **Currently elasticsearch adapters ignore records which do not have the field 
> specified in NotEqual clause, which is ideally included by elasticsearch.
> *Query:*
> {code:java}
> SELECT* from zips WHERE name <> "NMAX"{code}
> The elasticsearch query formed  for above  is this : 
> {code:java}
> {
>   "query": {
>     "constant_score": {
>       "filter": {
>         "bool": {
>           "must": {
>             "exists": {
>               "field": "Name"
>             }
>           },
>           "must_not": {
>             "term": {
>               "Name": "NMQAX"
>             }
>           }
>         }
>       }
>     }
>   }
> }
> {code}
> *Problem* : The above query ignores document which do not have _Name_ field 
> which is ideally included by elasticsearch but ignored due to must exists 
> condition. 
> *Data:*
> {noformat}
> { "_id" : "01701", "name" : "NMAX", "loc" : [ -71.42548600000001, 42.300665 
> ], "pop" : 65046, "state" : "MA" }
> { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 
> ], "pop" : 57871, "state" : "MA" }
> { "_id" : "02401 , "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498, 
> "state" : "MA" }
> {noformat}
>  
>  
>  *Expected result:* 
> {noformat}
> { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 
> ], "pop" : 57871, "state" : "MA" }
> { "_id" : "02401", "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498, 
> "state" : "MA" }
> {noformat}
>  
>  *Current Result:*
> {noformat}
>  { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 
> ], "pop" : 57871, "state" : "MA" }
> {noformat}
> RelNode for same - 
> {code:java}
> relB.not(relB.equals(relb.literal("Name"),relb.literal"NMQAX")){code}
>  
> *Solution* : Remove the exists condition from Not equals Query Expression. 
> Elasticsearch doesn't put this condition therefore keeping queries in sync. 
> [Code|https://github.com/apache/calcite/blob/1050b36cafbb0c487b7a2ade3efd12850609717e/elasticsearch/src/main/java/org/apache/calcite/adapter/elasticsearch/PredicateAnalyzer.java#L782]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to