[jira] [Comment Edited] (SPARK-37344) split function behave differently between spark 2.3 and spark 3.2
[ https://issues.apache.org/jira/browse/SPARK-37344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17444225#comment-17444225 ]

angerszhu edited comment on SPARK-37344 at 11/16/21, 8:06 AM:
--------------------------------------------------------------

For the same SQL:
{code}
explain extended select split('dawdawdawd',';');
{code}
In Hive 1.2:
{code}
OK
ABSTRACT SYNTAX TREE:

TOK_QUERY
   TOK_INSERT
      TOK_DESTINATION
         TOK_DIR
            TOK_TMP_FILE
      TOK_SELECT
         TOK_SELEXPR
            TOK_FUNCTION
               split
               'dawdawdawd'
               '\\\;'
{code}
In Hive 3:
{code}
OK
ABSTRACT SYNTAX TREE:

TOK_QUERY
   TOK_INSERT
      TOK_DESTINATION
         TOK_DIR
            TOK_TMP_FILE
      TOK_SELECT
         TOK_SELEXPR
            TOK_FUNCTION
               split
               'dawdawdawd'
               ';'
{code}
So the difference should be caused by Hive's code.


was (Author: angerszhuuu):
For the same SQL:
{code}
explain extended select split('dawdawdawd',';');
{code}
In Hive 1.2:
{code}
OK
ABSTRACT SYNTAX TREE:

TOK_QUERY
   TOK_INSERT
      TOK_DESTINATION
         TOK_DIR
            TOK_TMP_FILE
      TOK_SELECT
         TOK_SELEXPR
            TOK_FUNCTION
               split
               'dawdawdawd'
               '\\\;'
{code}
In Hive 3:
{code}
OK
ABSTRACT SYNTAX TREE:

TOK_QUERY
   TOK_INSERT
      TOK_DESTINATION
         TOK_DIR
            TOK_TMP_FILE
      TOK_SELECT
         TOK_SELEXPR
            TOK_FUNCTION
               split
               'dawdawdawd'
               ';'
{code}

> split function behave differently between spark 2.3 and spark 3.2
> -----------------------------------------------------------------
>
>                 Key: SPARK-37344
>                 URL: https://issues.apache.org/jira/browse/SPARK-37344
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.1, 3.1.2, 3.2.0
>            Reporter: ocean
>            Priority: Major
>              Labels: incorrect
>
> When using the split function in SQL, it behaves differently between 2.3 and 3.2,
> which causes incorrect results.
> We can use this SQL to reproduce the problem:
>
> create table split_test ( id int,name string)
> insert into split_test values(1,"abc;def")
> explain extended select split(name,';') from split_test
>
> spark3:
> spark-sql> Explain extended select split(name,';') from split_test;
> == Parsed Logical Plan ==
> 'Project [unresolvedalias('split('name, \\;), None)]
> +- 'UnresolvedRelation [split_test], [], false
>
> spark2:
>
> spark-sql> Explain extended select split(name,';') from split_test;
> == Parsed Logical Plan ==
> 'Project [unresolvedalias('split('name, \;), None)]
> +- 'UnresolvedRelation split_test
>
> It looks like escape handling differs between the two versions.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
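
To illustrate why the differing escape in the two parsed plans matters, here is a minimal Java sketch. It assumes the delimiter reaches the regex engine exactly as printed in the plans (`\;` for spark2, `\\;` for spark3); the plan output may be an escaped display form, so this is an assumption and not a confirmed diagnosis. The class `SplitEscapeDemo` and the helper `splitBy` are hypothetical names used only for this example. Spark's `split` treats its second argument as a Java regular expression, so plain `String.split` is used as a stand-in.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of Spark or Hive.
class SplitEscapeDemo {
    // Split on a Java regex with limit -1 (keep trailing empty strings),
    // matching the "-1" limit shown in the analyzed plan.
    static String[] splitBy(String input, String regex) {
        return input.split(regex, -1);
    }

    public static void main(String[] args) {
        String value = "abc;def";          // row inserted in the reproduction

        String plain = ";";                // regex ;   (what the user typed)
        String escapedOnce = "\\;";        // regex \;  (as printed by spark2)
        String escapedTwice = "\\\\;";     // regex \\; (as printed by spark3)

        // ; and \; both match a bare semicolon.
        System.out.println(Arrays.toString(splitBy(value, plain)));
        System.out.println(Arrays.toString(splitBy(value, escapedOnce)));

        // \\; matches a literal backslash followed by a semicolon,
        // which does not occur in "abc;def", so nothing is split.
        System.out.println(Arrays.toString(splitBy(value, escapedTwice)));
    }
}
```

Under that assumption, the first two calls return two elements and the third returns the input unchanged, which would be consistent with the incorrect results described in the report.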
[jira] [Comment Edited] (SPARK-37344) split function behave differently between spark 2.3 and spark 3.2
[ https://issues.apache.org/jira/browse/SPARK-37344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17444225#comment-17444225 ]

angerszhu edited comment on SPARK-37344 at 11/16/21, 8:05 AM:
--------------------------------------------------------------

For the same SQL:
{code}
explain extended select split('dawdawdawd',';');
{code}
In Hive 1.2:
{code}
OK
ABSTRACT SYNTAX TREE:

TOK_QUERY
   TOK_INSERT
      TOK_DESTINATION
         TOK_DIR
            TOK_TMP_FILE
      TOK_SELECT
         TOK_SELEXPR
            TOK_FUNCTION
               split
               'dawdawdawd'
               '\\\;'
{code}
In Hive 3:
{code}
OK
ABSTRACT SYNTAX TREE:

TOK_QUERY
   TOK_INSERT
      TOK_DESTINATION
         TOK_DIR
            TOK_TMP_FILE
      TOK_SELECT
         TOK_SELEXPR
            TOK_FUNCTION
               split
               'dawdawdawd'
               ';'
{code}


was (Author: angerszhuuu):
In latest master branch
{code}
== Parsed Logical Plan ==
'Project [unresolvedalias('split('name, \;), None)]
+- 'UnresolvedRelation [split_test], [], false

== Analyzed Logical Plan ==
split(name, \;, -1): array
Project [split(name#225, \;, -1) AS split(name, \;, -1)#226]
+- SubqueryAlias spark_catalog.default.split_test
   +- Relation default.split_test[id#224,name#225] parquet

== Optimized Logical Plan ==
Project [split(name#225, \;, -1) AS split(name, \;, -1)#226]
+- Relation default.split_test[id#224,name#225] parquet

== Physical Plan ==
*(1) Project [split(name#225, \;, -1) AS split(name, \;, -1)#226]
+- *(1) ColumnarToRow
   +- FileScan parquet default.split_test[name#225] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/Users/yi.zhu/Documents/project/Angerszh/spark/sql/core/spark..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct
{code}
[jira] [Comment Edited] (SPARK-37344) split function behave differently between spark 2.3 and spark 3.2
[ https://issues.apache.org/jira/browse/SPARK-37344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17444225#comment-17444225 ]

angerszhu edited comment on SPARK-37344 at 11/16/21, 2:51 AM:
--------------------------------------------------------------

In latest master branch
{code}
== Parsed Logical Plan ==
'Project [unresolvedalias('split('name, \;), None)]
+- 'UnresolvedRelation [split_test], [], false

== Analyzed Logical Plan ==
split(name, \;, -1): array
Project [split(name#225, \;, -1) AS split(name, \;, -1)#226]
+- SubqueryAlias spark_catalog.default.split_test
   +- Relation default.split_test[id#224,name#225] parquet

== Optimized Logical Plan ==
Project [split(name#225, \;, -1) AS split(name, \;, -1)#226]
+- Relation default.split_test[id#224,name#225] parquet

== Physical Plan ==
*(1) Project [split(name#225, \;, -1) AS split(name, \;, -1)#226]
+- *(1) ColumnarToRow
   +- FileScan parquet default.split_test[name#225] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/Users/yi.zhu/Documents/project/Angerszh/spark/sql/core/spark..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct
{code}


was (Author: angerszhuuu):
Work on this