[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

2020-12-12 Thread GitBox
wangyum commented on pull request #30222: URL: https://github.com/apache/spark/pull/30222#issuecomment-743725618 @cloud-fan @dongjoon-hyun We can improve the following case to reduce the `Union` operators: ```sql create table t1 using parquet as select * from range(100); create table t2

[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

2020-11-06 Thread GitBox
wangyum commented on pull request #30222: URL: https://github.com/apache/spark/pull/30222#issuecomment-722957043 Sorry, this change has a logic issue. For example: ```scala spark.sql("CREATE TABLE t using parquet AS SELECT if(id % 2 = 7, null, id) AS a FROM range(7)") spark.sql(

[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

2020-11-05 Thread GitBox
wangyum commented on pull request #30222: URL: https://github.com/apache/spark/pull/30222#issuecomment-722858604 We can reproduce it by: ```scala spark.sql("CREATE TABLE t(a int, b int, c int) using parquet") spark.sql( """ |SELECT * | FROM (SELECT CASE

[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

2020-11-05 Thread GitBox
wangyum commented on pull request #30222: URL: https://github.com/apache/spark/pull/30222#issuecomment-722751973 It seems to be caused by **deterministic**. cc @viirya ``` == Analyzed Logical Plan == label: double, features: vector, fold: int Filter (UDF(fold#14) AND NOT (fold#14

[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

2020-11-03 Thread GitBox
wangyum commented on pull request #30222: URL: https://github.com/apache/spark/pull/30222#issuecomment-721474620 Hive optimized it to `predicate: CASE WHEN ((a = 100)) THEN (false) WHEN ((b > 1000)) THEN (true) WHEN (c is not null) THEN (false) ELSE (null) END (type: boolean)`. But this co
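
To make the Hive predicate quoted above easier to follow, here is a minimal sketch of how it evaluates under SQL three-valued logic. It is plain Scala, not Spark or Hive code: `CaseWhenSketch`, `SqlBool`, `eqTo`, `gt` and `predicate` are hypothetical names introduced only for this illustration, and NULL is modeled as `None`.

```scala
// Illustration only: models the quoted predicate
//   CASE WHEN (a = 100) THEN false WHEN (b > 1000) THEN true
//        WHEN (c is not null) THEN false ELSE null END
// with NULL represented as None. Not taken from Spark or Hive.
object CaseWhenSketch {
  type SqlBool = Option[Boolean]

  // Comparisons against a NULL operand yield NULL (None).
  def eqTo(x: Option[Int], v: Int): SqlBool = x.map(_ == v)
  def gt(x: Option[Int], v: Int): SqlBool = x.map(_ > v)

  // A WHEN branch is taken only if its condition is TRUE;
  // a NULL or FALSE condition falls through to the next branch.
  def predicate(a: Option[Int], b: Option[Int], c: Option[Int]): SqlBool =
    if (eqTo(a, 100).contains(true)) Some(false)
    else if (gt(b, 1000).contains(true)) Some(true)
    else if (c.isDefined) Some(false) // "c is not null" is never NULL itself
    else None                         // ELSE null
}
```

When such an expression is used as a filter condition, a row is kept only if the result is TRUE, so both `Some(false)` and `None` drop the row in this model.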

[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

2020-11-03 Thread GitBox
wangyum commented on pull request #30222: URL: https://github.com/apache/spark/pull/30222#issuecomment-720320498 retest this please.