Yuming Wang created SPARK-35316: ----------------------------------- Summary: UnwrapCastInBinaryComparison support In/InSet predicate Key: SPARK-35316 URL: https://issues.apache.org/jira/browse/SPARK-35316 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: Yuming Wang
In/InSet missing this optimization: {code:scala} spark.range(50).selectExpr("cast(id as int) as id").write.mode("overwrite").parquet("/tmp/parquet/t1") spark.read.parquet("/tmp/parquet/t1").where("id in (1L, 2L, 4L)").explain spark.read.parquet("/tmp/parquet/t1").where("id = 1L or id = 2L or id = 4L").explain {code} {noformat} == Physical Plan == *(1) Filter cast(id#5 as bigint) IN (1,2,4) +- *(1) ColumnarToRow +- FileScan parquet [id#5] Batched: true, DataFilters: [cast(id#5 as bigint) IN (1,2,4)], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/tmp/parquet/t1], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int> == Physical Plan == *(1) Filter (((id#7 = 1) OR (id#7 = 2)) OR (id#7 = 4)) +- *(1) ColumnarToRow +- FileScan parquet [id#7] Batched: true, DataFilters: [(((id#7 = 1) OR (id#7 = 2)) OR (id#7 = 4))], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/tmp/parquet/t1], PartitionFilters: [], PushedFilters: [Or(Or(EqualTo(id,1),EqualTo(id,2)),EqualTo(id,4))], ReadSchema: struct<id:int> {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org