[jira] [Commented] (SPARK-37897) Filter with subexpression elimination may cause query failed

2022-01-20 Thread hujiahua (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17479764#comment-17479764
 ] 

hujiahua commented on SPARK-37897:
--

[~viirya] Thank you for your reply. But I'm curious: what is the correct way to use `plusOne`? Like this?
{code:sql}
select t.*
from (select * from table1 where c1 >= 0) t
where plusOne(t.c1) > 1 and plusOne(t.c1) < 3
{code}
By the way, does the ANSI SQL standard impose any constraints on this (the evaluation order of filter predicates), or is it left entirely to the implementation?
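(Illustration only, not from the issue: one way to avoid depending on predicate evaluation order at all is to make the UDF total, e.g. return null for out-of-range input instead of throwing. `plusOneSafe` below is a hypothetical name used just for this sketch.)
{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Hypothetical order-independent variant of the UDF: an Option[Int] result is
// registered as a nullable integer, so a negative input yields null instead of
// an exception.
spark.udf.register("plusOneSafe", (n: Int) => if (n >= 0) Some(n + 1) else None)

val df = Seq(-1, 1, 2).toDF("c1")
// Rows where the UDF returns null simply fail the comparisons, regardless of
// the order in which the predicates are evaluated; only c1 = 1 survives.
df.filter("plusOneSafe(c1) > 1 and plusOneSafe(c1) < 3").show()
{code}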

> Filter with subexpression elimination may cause query failed
> 
>
> Key: SPARK-37897
> URL: https://issues.apache.org/jira/browse/SPARK-37897
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: hujiahua
>Priority: Major
> Attachments: image-2022-01-13-20-22-09-055.png
>
>
>  
> The following test fails; the root cause is that the execution order of the 
> filter predicates changes after subexpression elimination. So I think we 
> should preserve the predicates' execution order after subexpression 
> elimination.
> {code:java}
> test("filter with subexpression elimination may cause query failed.") {
>   withSQLConf((SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key, "false")) {
>     val df = Seq(-1, 1, 2).toDF("c1")
>     // Register the `plusOne` udf; it fails if the input is not a positive number.
>     spark.sqlContext.udf.register("plusOne",
>       (n: Int) => { if (n >= 0) n + 1 else throw new SparkException("Must be positive number.") })
>     val result = df.filter("c1 >= 0 and plusOne(c1) > 1 and plusOne(c1) < 3").collect()
>     assert(result.size === 1)
>   }
> }
>
> Caused by: org.apache.spark.SparkException: Must be positive number.
>     at org.apache.spark.sql.DataFrameSuite.$anonfun$new$3(DataFrameSuite.scala:67)
>     at scala.runtime.java8.JFunction1$mcII$sp.apply(JFunction1$mcII$sp.java:23)
>     ... 20 more
> {code}
>  
> https://github.com/apache/spark/blob/0e186e8a19926f91810f3eaf174611b71e598de6/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GeneratePredicate.scala#L63
> !image-2022-01-13-20-22-09-055.png!
>  
>  






[jira] [Commented] (SPARK-37897) Filter with subexpression elimination may cause query failed

2022-01-15 Thread L. C. Hsieh (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476700#comment-17476700
 ] 

L. C. Hsieh commented on SPARK-37897:
-

SQL is a declarative language; please don't think of it in an imperative style with short-circuit evaluation. How the predicates are evaluated is implementation-dependent.
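
(Illustration only, not from the issue: a rough sketch of the difference this makes for the query above. With subexpression elimination the shared `plusOne(c1)` call is hoisted and evaluated before the individual predicates, so the `c1 >= 0` guard no longer protects it; without it, left-to-right short-circuiting happens to skip the UDF for c1 = -1. The plain Scala functions below only mimic the two evaluation orders and are not the actual generated code.)
{code:scala}
// Sketch only: emulate the two evaluation orders with plain Scala functions.
def plusOne(n: Int): Int =
  if (n >= 0) n + 1 else throw new RuntimeException("Must be positive number.")

// Written order with short-circuiting: plusOne is never called for c1 = -1.
def filterInWrittenOrder(c1: Int): Boolean =
  c1 >= 0 && plusOne(c1) > 1 && plusOne(c1) < 3

// After common-subexpression elimination: the shared plusOne(c1) is evaluated
// once, before any predicate, so it throws for c1 = -1.
def filterWithCse(c1: Int): Boolean = {
  val shared = plusOne(c1)
  c1 >= 0 && shared > 1 && shared < 3
}

Seq(-1, 1, 2).filter(filterInWrittenOrder)   // List(1)
// Seq(-1, 1, 2).filter(filterWithCse)       // throws on the first element
{code}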



