[ 
https://issues.apache.org/jira/browse/SPARK-35688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359786#comment-17359786
 ] 

todd.chen commented on SPARK-35688:
-----------------------------------

== Physical Plan ==
Project [get_json_object(json#14, $.uid) AS uid#20, name_array#21]
+- Generate explode(explode_name_array(name#13)), [json#14], false, [name_array#21]
   +- Filter ((((isnotnull(name#13) AND NOT (name#13 = ?)) AND (cast(get_json_object(json#14, $.uid) as int) > 0)) AND (size(explode_name_array(name#13), true) > 0)) AND isnotnull(explode_name_array(name#13)))
      +- *(1) ColumnarToRow
         +- FileScan parquet [name#13,json#14] Batched: true, DataFilters: [isnotnull(name#13), NOT (name#13 = ?), (cast(get_json_object(json#14, $.uid) as int) > 0), (size..., Format: Parquet, Location: InMemoryFileIndex[file:/tmp/tb_eliminate_bad_case_data], PartitionFilters: [], PushedFilters: [IsNotNull(name), Not(EqualTo(name,?))], ReadSchema: struct<name:string,json:string>

 

According to this plan, the filter on the invalid value "?" should run before 
the explode. However, because spark.sql.subexpressionElimination.enabled is 
true, Spark calls explode_name_array before the data is filtered.
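
The effect described above can be illustrated with a toy model in plain Python. This is only a sketch and not Spark code: the function names are hypothetical, and it assumes that explode_name_array fails on the invalid value "?", which the plan itself does not show. It contrasts a left-to-right, short-circuiting conjunction with one where the repeated UDF call has been hoisted and computed once before the guards.

```python
# Toy model only: this is NOT Spark code. All function names are hypothetical.


def explode_name_array(name):
    """Stand-in for the UDF in the plan.

    Assumption: the real UDF fails on the invalid value "?".
    """
    if name == "?":
        raise ValueError("explode_name_array called on invalid input")
    return name.split(",")


def predicate_short_circuit(name):
    # Conjuncts are evaluated left to right; `and` stops at the first False,
    # so the UDF is never called for None or "?".
    return (
        name is not None
        and name != "?"
        and len(explode_name_array(name)) > 0
        and explode_name_array(name) is not None
    )


def predicate_with_hoisted_subexpression(name):
    # The UDF call appears twice in the predicate, so it is computed once
    # up front and reused. Hoisting moves the call ahead of the guards
    # that were meant to protect it.
    exploded = explode_name_array(name)
    return (
        name is not None
        and name != "?"
        and len(exploded) > 0
        and exploded is not None
    )
```

In this model, predicate_short_circuit("?") simply returns False, while predicate_with_hoisted_subexpression("?") raises before any guard is checked. Whether Spark's generated predicate behaves exactly this way is an assumption of the sketch.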

> GeneratePredicate eliminate will fail  in some case
> ---------------------------------------------------
>
>                 Key: SPARK-35688
>                 URL: https://issues.apache.org/jira/browse/SPARK-35688
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: todd.chen
>            Priority: Major
>




