[ https://issues.apache.org/jira/browse/CALCITE-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17424535#comment-17424535 ]
Julian Hyde commented on CALCITE-4292: -------------------------------------- I support Stamatis’ position. Our SQL adapter must implement SQL semantics. That is what users of a SQL interface want and expect. If Elasticsearch’s data model has nuances that cannot be captured in SQL, feel free to add extra operators. If name is a multi-valued attribute, then some ideas are IS_EMPTY(name), VALUE_COUNT(name), VALUE_SET(name). > Wrong results in ElasticSearch when query contains NOT EQUAL > ------------------------------------------------------------ > > Key: CALCITE-4292 > URL: https://issues.apache.org/jira/browse/CALCITE-4292 > Project: Calcite > Issue Type: Bug > Components: elasticsearch-adapter > Reporter: Shlok Srivastava > Assignee: Bill Neil > Priority: Major > Labels: ElasticSearch, NotEquals, QueryBuilder, calcite, > pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently elasticsearch adapters ignore records which do not have the field > specified in NotEqual clause, which is ideally included by elasticsearch. > *Query:* > {code:java} > SELECT* from zips WHERE name <> "NMAX"{code} > The elasticsearch query formed for above is this : > {code:java} > { > "query": { > "constant_score": { > "filter": { > "bool": { > "must": { > "exists": { > "field": "Name" > } > }, > "must_not": { > "term": { > "Name": "NMQAX" > } > } > } > } > } > } > } > {code} > *Problem* : The above query ignores document which do not have _Name_ field > which is ideally included by elasticsearch but ignored due to must exists > condition. > *Data:* > {noformat} > { "_id" : "01701", "name" : "NMAX", "loc" : [ -71.42548600000001, 42.300665 > ], "pop" : 65046, "state" : "MA" } > { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 > ], "pop" : 57871, "state" : "MA" } > { "_id" : "02401 , "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498, > "state" : "MA" } > {noformat} > > > *Expected result:* > {noformat} > { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 > ], "pop" : 57871, "state" : "MA" } > { "_id" : "02401", "loc" : [ -71.03434799999999, 42.081571 ], "pop" : 59498, > "state" : "MA" } > {noformat} > > *Current Result:* > {noformat} > { "_id" : "02154", "name" : "NORTH WALTHAM", "loc" : [ -71.236497, 42.382492 > ], "pop" : 57871, "state" : "MA" } > {noformat} > RelNode for same - > {code:java} > relB.not(relB.equals(relb.literal("Name"),relb.literal"NMQAX")){code} > > *Solution* : Remove the exists condition from Not equals Query Expression. > Elasticsearch doesn't put this condition therefore keeping queries in sync. > [Code|https://github.com/apache/calcite/blob/1050b36cafbb0c487b7a2ade3efd12850609717e/elasticsearch/src/main/java/org/apache/calcite/adapter/elasticsearch/PredicateAnalyzer.java#L782] -- This message was sent by Atlassian Jira (v8.3.4#803005)