[ https://issues.apache.org/jira/browse/SPARK-43491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
KuijianLiu updated SPARK-43491:
-------------------------------
    Description: 
Spark SQL 3.1.1 and Hive SQL 3.1.0 return different results for the same SQL. Spark SQL evaluates {{0 in ('00')}} as false, which differs from how it handles the {{=}} operator, whereas Hive evaluates it as true. Hive 3.1.0 treats the {{in}} keyword consistently with {{=}}, but Spark SQL does not.

When all elements of an {{In}} expression have the same data type, it would be better for {{In}} to behave the same as a BinaryComparison such as {{EqualTo}}.

Test SQL:
{code:java}
scala> spark.sql("select 1 as test where 0 = '00'").show
+----+
|test|
+----+
|   1|
+----+

scala> spark.sql("select 1 as test where 0 in ('00')").show
+----+
|test|
+----+
+----+

scala> spark.sql("select 1 as test where 0 = '00'").explain(true)
== Parsed Logical Plan ==
'Project [1 AS test#23]
+- 'Filter (0 = 00)
   +- OneRowRelation

== Analyzed Logical Plan ==
test: int
Project [1 AS test#23]
+- Filter (0 = cast(00 as int))
   +- OneRowRelation

== Optimized Logical Plan ==
Project [1 AS test#23]
+- OneRowRelation

== Physical Plan ==
*(1) Project [1 AS test#23]
+- *(1) Scan OneRowRelation[]

scala> spark.sql("select 1 as test where 0 in ('00')").explain(true)
== Parsed Logical Plan ==
'Project [1 AS test#25]
+- 'Filter 0 IN (00)
   +- OneRowRelation

== Analyzed Logical Plan ==
test: int
Project [1 AS test#25]
+- Filter cast(0 as string) IN (cast(00 as string))
   +- OneRowRelation

== Optimized Logical Plan ==
LocalRelation <empty>, [test#25]

== Physical Plan ==
LocalTableScan <empty>, [test#25]
{code}

!image-2023-05-13-13-14-55-853.png!

> In expression not compatible with EqualTo Expression
> ----------------------------------------------------
>
>                 Key: SPARK-43491
>                 URL: https://issues.apache.org/jira/browse/SPARK-43491
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.1
>            Reporter: KuijianLiu
>            Priority: Minor
>         Attachments: image-2023-05-13-13-14-55-853.png
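
The two analyzed plans in the description differ only in where the cast lands: {{=}} casts the string literal to int, while {{IN}} casts both sides to string. The sketch below restates that difference in plain Scala without any Spark dependency. The names {{CoercionSketch}}, {{equalToStyle}} and {{inStyle}} are hypothetical and are not Spark APIs, and {{toIntOption}} (Scala 2.13+) is only an assumed stand-in for Spark's string-to-int cast, so treat this as an illustration of the reported behaviour rather than of Spark's implementation.

```scala
// Illustration only: plain Scala, no Spark dependency.
// equalToStyle and inStyle are hypothetical names, not Spark APIs.
object CoercionSketch {
  // Mirrors `0 = cast('00' as int)`: the string side is parsed as an int.
  // toIntOption stands in for Spark's cast; unparsable strings never match.
  def equalToStyle(left: Int, right: String): Boolean =
    right.trim.toIntOption.contains(left)

  // Mirrors `cast(0 as string) IN (cast('00' as string))`:
  // the int side is rendered as a string and compared textually.
  def inStyle(left: Int, candidates: Seq[String]): Boolean =
    candidates.contains(left.toString)

  def main(args: Array[String]): Unit = {
    println(equalToStyle(0, "00"))  // numeric comparison: "00" parses to 0
    println(inStyle(0, Seq("00")))  // string comparison: "0" vs "00"
    println(inStyle(0, Seq("0")))   // string comparison: "0" vs "0"
  }
}
```

Under these assumptions the first call matches and the second does not, which is the same asymmetry the {{show}} output in the description reports.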