wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743725618
@cloud-fan @dongjoon-hyun We can improve the following case to reduce the
`Union` operator:
```sql
create table t1 using parquet as select * from range(100);
create table t2
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722957043
Sorry, this change has a logic issue. For example:
```scala
spark.sql("CREATE TABLE t using parquet AS SELECT if(id % 2 = 7, null, id) AS a FROM range(7)")
spark.sql(
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722858604
We can reproduce it by:
```scala
spark.sql("CREATE TABLE t(a int, b int, c int) using parquet")
spark.sql(
"""
|SELECT *
| FROM (SELECT CASE
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722751973
This seems to be caused by **deterministic**. cc @viirya
```
== Analyzed Logical Plan ==
label: double, features: vector, fold: int
Filter (UDF(fold#14) AND NOT (fold#14
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-721474620
Hive optimized it to `predicate: CASE WHEN ((a = 100)) THEN (false)
WHEN ((b > 1000)) THEN (true) WHEN (c is not null) THEN (false) ELSE (null) END
(type: boolean)`. But this co
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-720320498
retest this please.