[jira] [Commented] (SPARK-21774) The rule PromoteStrings cast string to a wrong data type
[ https://issues.apache.org/jira/browse/SPARK-21774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177461#comment-17177461 ]

ZhouDaHong commented on SPARK-21774:

Hello, your SQL compares a string-typed column with the literal 0. Because the two sides have different data types (the 0 may be treated as a boolean or as an int), the statement [ select a, b from tb where a = 0 ] does not return the result you expect. I suggest changing it to [ select a, b from tb where a = '0' ].

> The rule PromoteStrings cast string to a wrong data type
>
> Key: SPARK-21774
> URL: https://issues.apache.org/jira/browse/SPARK-21774
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: StanZhai
> Priority: Critical
> Labels: correctness
>
> Data
> {code}
> create temporary view tb as select * from values
> ("0", 1),
> ("-0.1", 2),
> ("1", 3)
> as grouping(a, b)
> {code}
> SQL:
> {code}
> select a, b from tb where a=0
> {code}
> The result which is wrong:
> {code}
> +----+---+
> |   a|  b|
> +----+---+
> |   0|  1|
> |-0.1|  2|
> +----+---+
> {code}
> Logical Plan:
> {code}
> == Parsed Logical Plan ==
> 'Project ['a]
> +- 'Filter ('a = 0)
>    +- 'UnresolvedRelation `src`
> == Analyzed Logical Plan ==
> a: string
> Project [a#8528]
> +- Filter (cast(a#8528 as int) = 0)
>    +- SubqueryAlias src
>       +- Project [_1#8525 AS a#8528, _2#8526 AS b#8529]
>          +- LocalRelation [_1#8525, _2#8526]
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
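
To make the trade-off in the suggestion above concrete, here is a minimal Python sketch (not Spark code) that contrasts comparing the column as a string with comparing it as a double. The three sample rows come from the issue; the extra row ("0.0", 4) and the helper names filter_as_string and filter_as_double are hypothetical, and Python's float() only approximates how a SQL engine would cast a string to double.

```python
# Sample rows from the issue, plus one hypothetical row ("0.0", 4).
rows = [("0", 1), ("-0.1", 2), ("1", 3)]
rows_with_extra = rows + [("0.0", 4)]


def filter_as_string(data, literal):
    """Model of `where a = '0'`: plain text equality, no cast involved."""
    return [(a, b) for a, b in data if a == literal]


def filter_as_double(data, number):
    """Model of `where cast(a as double) = 0`: numeric equality."""
    matches = []
    for a, b in data:
        try:
            value = float(a)
        except ValueError:
            # Assumption: a non-numeric string behaves like NULL, so the row is dropped.
            continue
        if value == number:
            matches.append((a, b))
    return matches


if __name__ == "__main__":
    print(filter_as_string(rows_with_extra, "0"))
    print(filter_as_double(rows_with_extra, 0))
```

On the rows from the issue both filters keep only ("0", 1). They differ on the hypothetical row: the string comparison skips "0.0" because the text is not exactly "0", while the numeric comparison keeps it. Whether a = '0' is an acceptable workaround therefore depends on how the values are written in the column.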
[jira] [Commented] (SPARK-21774) The rule PromoteStrings cast string to a wrong data type
[ https://issues.apache.org/jira/browse/SPARK-21774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631568#comment-16631568 ]

ice bai commented on SPARK-21774:

I met the same problem in Spark 2.3.0. The following are some tests:

```
spark-sql> select ''>0;
true
Time taken: 0.078 seconds, Fetched 1 row(s)
spark-sql> select ''>0;
NULL
Time taken: 0.065 seconds, Fetched 1 row(s)
spark-sql> select '1.0'=1;
true
Time taken: 0.054 seconds, Fetched 1 row(s)
spark-sql> select '1.2'=1;
true
Time taken: 0.07 seconds, Fetched 1 row(s)
```

When I set the log level to trace, I found this:

=== Applying Rule org.apache.spark.sql.catalyst.analysis.TypeCoercion$PromoteStrings ===
!'Project [unresolvedalias((> 0), None)]   'Project [unresolvedalias((cast( as int) > 0), None)]
 +- OneRowRelation                         +- OneRowRelation
[jira] [Commented] (SPARK-21774) The rule PromoteStrings cast string to a wrong data type
[ https://issues.apache.org/jira/browse/SPARK-21774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144029#comment-16144029 ]

Apache Spark commented on SPARK-21774:

User 'stanzhai' has created a pull request for this issue:
https://github.com/apache/spark/pull/18986
[jira] [Commented] (SPARK-21774) The rule PromoteStrings cast string to a wrong data type
[ https://issues.apache.org/jira/browse/SPARK-21774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16131921#comment-16131921 ]

StanZhai commented on SPARK-21774:

I've opened a PR (https://github.com/apache/spark/pull/18986) for this issue. This PR is not automatically associated with JIRA yet.
[jira] [Commented] (SPARK-21774) The rule PromoteStrings cast string to a wrong data type
[ https://issues.apache.org/jira/browse/SPARK-21774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16131908#comment-16131908 ]

Feng Zhu commented on SPARK-21774:

This was introduced by PR-15880 (https://github.com/apache/spark/pull/15880), which addresses SPARK-17913. That PR tries to follow the logic in PG:

"I think it's more reasonable to follow postgres in this case, i.e. cast string to the type of the other side, but return null if the string is not castable to keep hive compatibility."

However, *UTF8String* still returns true in such a case. In the code below, res=true and wrapper.value=0:

{code:java}
import org.apache.spark.unsafe.types.UTF8String
import org.apache.spark.unsafe.types.UTF8String.IntWrapper
// Parse a fractional string as an int; the result is stored in the wrapper.
val x = UTF8String.fromString("0.1")
val wrapper = new IntWrapper
val res = x.toInt(wrapper)
{code}

Shall we check for similar behaviors, or go back to the logic in 2.1, which casts String to DoubleType? [~LI,Xiao][~cloud_fan]
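
The effect described in Feng Zhu's comment can be reproduced with a small Python model. This is only a sketch under stated assumptions, not Spark's implementation: to_int_truncating is a hypothetical helper that mimics the reported behavior (a string such as "0.1" parses successfully and yields 0), and the exact rules it applies to signs, empty input and digits are guesses made for illustration. The rows are the ones from the issue description.

```python
DIGITS = "0123456789"


def _all_digits(text):
    """True if text consists only of ASCII digits (an empty string counts as True)."""
    return all(ch in DIGITS for ch in text)


def to_int_truncating(s):
    """Hypothetical model of a string-to-int parse that discards the fraction.

    Returns (ok, value), mirroring the (res, wrapper.value) pair in the comment.
    """
    if not s:
        return (False, 0)
    sign = -1 if s[0] == "-" else 1
    body = s[1:] if s[0] in "+-" else s
    int_part, _, frac_part = body.partition(".")
    # Assumption: at least one integer digit is required, and the fraction must be digits.
    if not int_part or not _all_digits(int_part) or not _all_digits(frac_part):
        return (False, 0)
    return (True, sign * int(int_part))


rows = [("0", 1), ("-0.1", 2), ("1", 3)]

# Model of `where cast(a as int) = 0`, the filter shown in the analyzed plan.
matches_int = [(a, b) for a, b in rows if to_int_truncating(a) == (True, 0)]

# Model of `where cast(a as double) = 0`, the 2.1 behavior mentioned in the comment.
matches_double = [(a, b) for a, b in rows if float(a) == 0]

if __name__ == "__main__":
    print(to_int_truncating("0.1"))
    print(matches_int)
    print(matches_double)
```

Under this model the int promotion keeps both "0" and "-0.1", which is the wrong result reported in the issue, while the double promotion keeps only "0".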