[jira] [Commented] (SPARK-46428) The stage that retries due to FetchFailed will cause the downstream skewJoin AQE become ineffective
[ https://issues.apache.org/jira/browse/SPARK-46428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17797994#comment-17797994 ]

Tongwei commented on SPARK-46428:
---

cc [~yumwang]

> The stage that retries due to FetchFailed will cause the downstream skewJoin AQE become ineffective
> ---
>
> Key: SPARK-46428
> URL: https://issues.apache.org/jira/browse/SPARK-46428
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.5.0
> Reporter: Tongwei
> Priority: Major
> Attachments: Stage 46.0 skewjoin apply failed.png, Stage 46.0 skewjoin apply success.png
>
>
> In our production environment, a SQL statement containing a UNION ALL was
> executed. Stage 24.0 of the first UNION subquery hit a FetchFailed error,
> and after some stages were rerun, the skew join rule that was supposed to
> apply to Stage 46.0 no longer took effect.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
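For context on what the skew join rule decides: according to the Spark SQL tuning documentation, AQE treats a shuffle partition as skewed when it is larger than a configured factor times the median partition size and also larger than an absolute byte threshold. The sketch below restates that test in plain Python. It is not Spark's code, the partition sizes are made up, the exact median computation inside Spark may differ, and it does not try to explain why the rule stopped applying after the stage retry, which the report does not state.

```python
from statistics import median

MB = 1024 ** 2

# Documented defaults for Spark 3.5; both settings are configurable.
SKEW_FACTOR = 5.0                 # spark.sql.adaptive.skewJoin.skewedPartitionFactor
SKEW_THRESHOLD_BYTES = 256 * MB   # spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes


def skewed_partitions(sizes_in_bytes, factor=SKEW_FACTOR, threshold=SKEW_THRESHOLD_BYTES):
    """Return the indices of partitions that pass both documented skew conditions."""
    if not sizes_in_bytes:
        return []
    med = median(sizes_in_bytes)
    return [
        i
        for i, size in enumerate(sizes_in_bytes)
        if size > factor * med and size > threshold
    ]


# Hypothetical shuffle partition sizes for one side of a join.
sizes = [64 * MB, 70 * MB, 58 * MB, 2048 * MB, 66 * MB]
print(skewed_partitions(sizes))  # prints [3]
```

Because the test depends entirely on per-partition size statistics, comparing those statistics between the successful and failed runs shown in the two attachments would be one way to narrow the problem down.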
[jira] [Updated] (SPARK-46428) The stage that retries due to FetchFailed will cause the downstream skewJoin AQE become ineffective
[ https://issues.apache.org/jira/browse/SPARK-46428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-46428:
---
    Description: In our production environment, a SQL statement containing a UNION ALL was executed. Stage 24.0 of the first UNION subquery hit a FetchFailed error, and after some stages were rerun, the skew join rule that was supposed to apply to Stage 46.0 no longer took effect.

> The stage that retries due to FetchFailed will cause the downstream skewJoin AQE become ineffective
> ---
>
> Key: SPARK-46428
> URL: https://issues.apache.org/jira/browse/SPARK-46428
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.5.0
> Reporter: Tongwei
> Priority: Major
> Attachments: Stage 46.0 skewjoin apply failed.png, Stage 46.0 skewjoin apply success.png
[jira] [Created] (SPARK-44456) ConstantPropagation
Tongwei created SPARK-44456:
---

Summary: ConstantPropagation
Key: SPARK-44456
URL: https://issues.apache.org/jira/browse/SPARK-44456
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.3.1
Reporter: Tongwei
[jira] [Resolved] (SPARK-44007) Unresolved hint cause query failure
[ https://issues.apache.org/jira/browse/SPARK-44007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei resolved SPARK-44007.
---
    Resolution: Won't Fix

> Unresolved hint cause query failure
> ---
>
> Key: SPARK-44007
> URL: https://issues.apache.org/jira/browse/SPARK-44007
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1, 3.4.0
> Reporter: Tongwei
> Priority: Major
>
> After the Resolve Hints Rules are completed, immediately remove unknown Hints
> to *avoid query errors caused by Unresolved Hints.*
> Query error:
> {code:java}
> // create t0,t1
> CREATE TABLE t0(c0 bigint) USING PARQUET
> CREATE TABLE t1(c1 bigint) USING PARQUET
>
> // query with unknown hint
> with w0 as (select * from t0),
> w1 as (select c0 from w0 group by c0),
> w2 as (select /*+ userHint(t1) */ c1 from t1 ),
> w3 as (
>   select w2.c1, w0.c0, w1.c0
>   from w2
>   join w0 on w2.c1 = w0.c0
>   join w1 on w2.c1 = w1.c0
> )
> select * from w3;
>
> 23/06/08 17:25:23 WARN HintErrorLogger: Unrecognized hint: userHint(t1)
> [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `w2`.`c1` cannot be resolved. Did you mean one of the following? [`w2`.`c1`, `w0`.`c0`].; line 8 pos 11;
> 'WithCTE
> :- CTERelationDef 4, false
> :  +- SubqueryAlias w0
> :     +- Project [c0#0L]
> :        +- SubqueryAlias spark_catalog.default.t0
> :           +- Relation spark_catalog.default.t0[c0#0L] parquet
> :- CTERelationDef 5, false
> :  +- SubqueryAlias w1
> :     +- Aggregate [c0#0L], [c0#0L]
> :        +- SubqueryAlias w0
> :           +- CTERelationRef 4, true, [c0#0L]
> :- CTERelationDef 6, false
> :  +- SubqueryAlias w2
> :     +- Project [c1#1L]
> :        +- SubqueryAlias spark_catalog.default.t1
> :           +- Relation spark_catalog.default.t1[c1#1L] parquet
> :- 'CTERelationDef 7, false
> :  +- 'SubqueryAlias w3
> :     +- 'Project ['w2.c1, 'w0.c0, 'w1.c0]
> :        +- 'Join Inner, ('w2.c1 = 'w1.c0)
> :           :- Join Inner, (c1#1L = c0#0L)
> :           :  :- SubqueryAlias w2
> :           :  :  +- CTERelationRef 6, true, [c1#1L]
> :           :  +- SubqueryAlias w0
> :           :     +- CTERelationRef 4, true, [c0#0L]
> :           +- SubqueryAlias w1
> :              +- CTERelationRef 5, true, [c0#0L]
> +- 'Project [*]
>    +- 'SubqueryAlias w3
>       +- 'CTERelationRef 7, false
>
> // query without unknown hint
> with w0 as (select * from t0),
> w1 as (select * from w0 group by c0),
> w2 as (select c1 from t1 ),
> w3 as (
>   select w2.c1, w0.c0, w1.c0
>   from w2
>   join w0 on w2.c1 = w0.c0
>   join w1 on w2.c1 = w1.c0
> )
> select * from w3;
> Time taken: 12.666 seconds
> {code}
[jira] [Updated] (SPARK-44007) Unresolved hint cause query failure
[ https://issues.apache.org/jira/browse/SPARK-44007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-44007:
---
    Attachment: (was: image-2023-06-08-17-26-59-186.png)

> Unresolved hint cause query failure
> ---
>
> Key: SPARK-44007
> URL: https://issues.apache.org/jira/browse/SPARK-44007
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1, 3.4.0
> Reporter: Tongwei
> Priority: Major
[jira] [Updated] (SPARK-44007) Unresolved hint cause query failure
[ https://issues.apache.org/jira/browse/SPARK-44007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-44007:
---
    Description:
After the Resolve Hints Rules are completed, immediately remove unknown Hints to *avoid query errors caused by Unresolved Hints.*

Query error:
{code:java}
// create t0,t1
CREATE TABLE t0(c0 bigint) USING PARQUET
CREATE TABLE t1(c1 bigint) USING PARQUET

// query with unknown hint
with w0 as (select * from t0),
w1 as (select c0 from w0 group by c0),
w2 as (select /*+ userHint(t1) */ c1 from t1 ),
w3 as (
  select w2.c1, w0.c0, w1.c0
  from w2
  join w0 on w2.c1 = w0.c0
  join w1 on w2.c1 = w1.c0
)
select * from w3;

23/06/08 17:25:23 WARN HintErrorLogger: Unrecognized hint: userHint(t1)
[UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `w2`.`c1` cannot be resolved. Did you mean one of the following? [`w2`.`c1`, `w0`.`c0`].; line 8 pos 11;
'WithCTE
:- CTERelationDef 4, false
:  +- SubqueryAlias w0
:     +- Project [c0#0L]
:        +- SubqueryAlias spark_catalog.default.t0
:           +- Relation spark_catalog.default.t0[c0#0L] parquet
:- CTERelationDef 5, false
:  +- SubqueryAlias w1
:     +- Aggregate [c0#0L], [c0#0L]
:        +- SubqueryAlias w0
:           +- CTERelationRef 4, true, [c0#0L]
:- CTERelationDef 6, false
:  +- SubqueryAlias w2
:     +- Project [c1#1L]
:        +- SubqueryAlias spark_catalog.default.t1
:           +- Relation spark_catalog.default.t1[c1#1L] parquet
:- 'CTERelationDef 7, false
:  +- 'SubqueryAlias w3
:     +- 'Project ['w2.c1, 'w0.c0, 'w1.c0]
:        +- 'Join Inner, ('w2.c1 = 'w1.c0)
:           :- Join Inner, (c1#1L = c0#0L)
:           :  :- SubqueryAlias w2
:           :  :  +- CTERelationRef 6, true, [c1#1L]
:           :  +- SubqueryAlias w0
:           :     +- CTERelationRef 4, true, [c0#0L]
:           +- SubqueryAlias w1
:              +- CTERelationRef 5, true, [c0#0L]
+- 'Project [*]
   +- 'SubqueryAlias w3
      +- 'CTERelationRef 7, false

// query without unknown hint
with w0 as (select * from t0),
w1 as (select * from w0 group by c0),
w2 as (select c1 from t1 ),
w3 as (
  select w2.c1, w0.c0, w1.c0
  from w2
  join w0 on w2.c1 = w0.c0
  join w1 on w2.c1 = w1.c0
)
select * from w3;
Time taken: 12.666 seconds
{code}

> Unresolved hint cause query failure
> ---
>
> Key: SPARK-44007
> URL: https://issues.apache.org/jira/browse/SPARK-44007
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1, 3.4.0
> Reporter: Tongwei
> Priority: Major
> Attachments: image-2023-06-08-17-26-59-186.png
[jira] [Updated] (SPARK-44007) Unresolved hint cause query failure
[ https://issues.apache.org/jira/browse/SPARK-44007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-44007:
---
    Description:
After the Resolve Hints Rules are completed, immediately remove unknown Hints to avoid query errors caused by unknown Hints.
{code:java}
CREATE TABLE t0(c0 bigint) USING PARQUET
CREATE TABLE t1(c1 bigint) USING PARQUET
{code}
Query info:
{code:java}
// query with unknown hint
with w0 as (select * from t0),
w1 as (select c0 from w0 group by c0),
w2 as (select /*+ userHint(t1) */ c1 from t1 ),
w3 as (
  select w2.c1, w0.c0, w1.c0
  from w2
  join w0 on w2.c1 = w0.c0
  join w1 on w2.c1 = w1.c0
)
select * from w3;

23/06/08 17:25:23 WARN HintErrorLogger: Unrecognized hint: userHint(t1)
[UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `w2`.`c1` cannot be resolved. Did you mean one of the following? [`w2`.`c1`, `w0`.`c0`].; line 8 pos 11;
'WithCTE
:- CTERelationDef 4, false
:  +- SubqueryAlias w0
:     +- Project [c0#0L]
:        +- SubqueryAlias spark_catalog.default.t0
:           +- Relation spark_catalog.default.t0[c0#0L] parquet
:- CTERelationDef 5, false
:  +- SubqueryAlias w1
:     +- Aggregate [c0#0L], [c0#0L]
:        +- SubqueryAlias w0
:           +- CTERelationRef 4, true, [c0#0L]
:- CTERelationDef 6, false
:  +- SubqueryAlias w2
:     +- Project [c1#1L]
:        +- SubqueryAlias spark_catalog.default.t1
:           +- Relation spark_catalog.default.t1[c1#1L] parquet
:- 'CTERelationDef 7, false
:  +- 'SubqueryAlias w3
:     +- 'Project ['w2.c1, 'w0.c0, 'w1.c0]
:        +- 'Join Inner, ('w2.c1 = 'w1.c0)
:           :- Join Inner, (c1#1L = c0#0L)
:           :  :- SubqueryAlias w2
:           :  :  +- CTERelationRef 6, true, [c1#1L]
:           :  +- SubqueryAlias w0
:           :     +- CTERelationRef 4, true, [c0#0L]
:           +- SubqueryAlias w1
:              +- CTERelationRef 5, true, [c0#0L]
+- 'Project [*]
   +- 'SubqueryAlias w3
      +- 'CTERelationRef 7, false

// query without unknown hint
with w0 as (select * from t0),
w1 as (select * from w0 group by c0),
w2 as (select c1 from t1 ),
w3 as (
  select w2.c1, w0.c0, w1.c0
  from w2
  join w0 on w2.c1 = w0.c0
  join w1 on w2.c1 = w1.c0
)
select * from w3;
Time taken: 12.666 seconds
{code}

> Unresolved hint cause query failure
> ---
>
> Key: SPARK-44007
> URL: https://issues.apache.org/jira/browse/SPARK-44007
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1, 3.4.0
> Reporter: Tongwei
> Priority: Major
> Attachments: image-2023-06-08-17-26-59-186.png
[jira] [Updated] (SPARK-44007) Unresolved hint cause query failure
[ https://issues.apache.org/jira/browse/SPARK-44007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-44007:
---
    Description:
{code:java}
CREATE TABLE t0(c0 bigint) USING PARQUET
CREATE TABLE t1(c1 bigint) USING PARQUET
{code}
Query info:
{code:java}
// query with unknown hint
with w0 as (select * from t0),
w1 as (select c0 from w0 group by c0),
w2 as (select /*+ userHint(t1) */ c1 from t1 ),
w3 as (
  select w2.c1, w0.c0, w1.c0
  from w2
  join w0 on w2.c1 = w0.c0
  join w1 on w2.c1 = w1.c0
)
select * from w3;

23/06/08 17:25:23 WARN HintErrorLogger: Unrecognized hint: userHint(t1)
[UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `w2`.`c1` cannot be resolved. Did you mean one of the following? [`w2`.`c1`, `w0`.`c0`].; line 8 pos 11;
'WithCTE
:- CTERelationDef 4, false
:  +- SubqueryAlias w0
:     +- Project [c0#0L]
:        +- SubqueryAlias spark_catalog.default.t0
:           +- Relation spark_catalog.default.t0[c0#0L] parquet
:- CTERelationDef 5, false
:  +- SubqueryAlias w1
:     +- Aggregate [c0#0L], [c0#0L]
:        +- SubqueryAlias w0
:           +- CTERelationRef 4, true, [c0#0L]
:- CTERelationDef 6, false
:  +- SubqueryAlias w2
:     +- Project [c1#1L]
:        +- SubqueryAlias spark_catalog.default.t1
:           +- Relation spark_catalog.default.t1[c1#1L] parquet
:- 'CTERelationDef 7, false
:  +- 'SubqueryAlias w3
:     +- 'Project ['w2.c1, 'w0.c0, 'w1.c0]
:        +- 'Join Inner, ('w2.c1 = 'w1.c0)
:           :- Join Inner, (c1#1L = c0#0L)
:           :  :- SubqueryAlias w2
:           :  :  +- CTERelationRef 6, true, [c1#1L]
:           :  +- SubqueryAlias w0
:           :     +- CTERelationRef 4, true, [c0#0L]
:           +- SubqueryAlias w1
:              +- CTERelationRef 5, true, [c0#0L]
+- 'Project [*]
   +- 'SubqueryAlias w3
      +- 'CTERelationRef 7, false

// query without unknown hint
with w0 as (select * from t0),
w1 as (select * from w0 group by c0),
w2 as (select c1 from t1 ),
w3 as (
  select w2.c1, w0.c0, w1.c0
  from w2
  join w0 on w2.c1 = w0.c0
  join w1 on w2.c1 = w1.c0
)
select * from w3;
Time taken: 12.666 seconds
{code}

> Unresolved hint cause query failure
> ---
>
> Key: SPARK-44007
> URL: https://issues.apache.org/jira/browse/SPARK-44007
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1, 3.4.0
> Reporter: Tongwei
> Priority: Major
> Attachments: image-2023-06-08-17-26-59-186.png
[jira] [Updated] (SPARK-44007) Unresolved hint cause query failure
[ https://issues.apache.org/jira/browse/SPARK-44007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-44007:
---
    Description:
{code:java}
CREATE TABLE t0(c0 bigint) USING PARQUET
CREATE TABLE t1(c1 bigint) USING PARQUET
{code}
Query info:
{code:java}
// query with unknown hint
with w0 as (select * from t0),
w1 as (select c0 from w0 group by c0),
w2 as (select /*+ userHint(t1) */ c1 from t1 ),
w3 as (
  select w2.c1, w0.c0, w1.c0
  from w2
  join w0 on w2.c1 = w0.c0
  join w1 on w2.c1 = w1.c0
)
select * from w3;

23/06/08 17:25:23 WARN HintErrorLogger: Unrecognized hint: userHint(t1)
[UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `w2`.`c1` cannot be resolved. Did you mean one of the following? [`w2`.`c1`, `w0`.`c0`].; line 8 pos 11;
'WithCTE
:- CTERelationDef 4, false
:  +- SubqueryAlias w0
:     +- Project [c0#0L]
:        +- SubqueryAlias spark_catalog.default.t0
:           +- Relation spark_catalog.default.t0[c0#0L] parquet
:- CTERelationDef 5, false
:  +- SubqueryAlias w1
:     +- Aggregate [c0#0L], [c0#0L]
:        +- SubqueryAlias w0
:           +- CTERelationRef 4, true, [c0#0L]
:- CTERelationDef 6, false
:  +- SubqueryAlias w2
:     +- Project [c1#1L]
:        +- SubqueryAlias spark_catalog.default.t1
:           +- Relation spark_catalog.default.t1[c1#1L] parquet
:- 'CTERelationDef 7, false
:  +- 'SubqueryAlias w3
:     +- 'Project ['w2.c1, 'w0.c0, 'w1.c0]
:        +- 'Join Inner, ('w2.c1 = 'w1.c0)
:           :- Join Inner, (c1#1L = c0#0L)
:           :  :- SubqueryAlias w2
:           :  :  +- CTERelationRef 6, true, [c1#1L]
:           :  +- SubqueryAlias w0
:           :     +- CTERelationRef 4, true, [c0#0L]
:           +- SubqueryAlias w1
:              +- CTERelationRef 5, true, [c0#0L]
+- 'Project [*]
   +- 'SubqueryAlias w3
      +- 'CTERelationRef 7, false
{code}

> Unresolved hint cause query failure
> ---
>
> Key: SPARK-44007
> URL: https://issues.apache.org/jira/browse/SPARK-44007
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1, 3.4.0
> Reporter: Tongwei
> Priority: Major
> Attachments: image-2023-06-08-17-26-59-186.png
[jira] [Updated] (SPARK-44007) Unresolved hint cause query failure
[ https://issues.apache.org/jira/browse/SPARK-44007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-44007:
---
    Attachment: image-2023-06-08-17-26-59-186.png

> Unresolved hint cause query failure
> ---
>
> Key: SPARK-44007
> URL: https://issues.apache.org/jira/browse/SPARK-44007
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1, 3.4.0
> Reporter: Tongwei
> Priority: Major
> Attachments: image-2023-06-08-17-26-59-186.png
[jira] [Updated] (SPARK-44007) Unresolved hint cause query failure
[ https://issues.apache.org/jira/browse/SPARK-44007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-44007:
---
    Description:
{code:java}
CREATE TABLE t0(c0 bigint) USING PARQUET
CREATE TABLE t1(c1 bigint) USING PARQUET
{code}
!image-2023-06-08-17-26-59-186.png!

> Unresolved hint cause query failure
> ---
>
> Key: SPARK-44007
> URL: https://issues.apache.org/jira/browse/SPARK-44007
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1, 3.4.0
> Reporter: Tongwei
> Priority: Major
> Attachments: image-2023-06-08-17-26-59-186.png
[jira] [Updated] (SPARK-44007) Unresolved hint cause query failure
[ https://issues.apache.org/jira/browse/SPARK-44007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-44007:
---
    Affects Version/s: 3.4.0

> Unresolved hint cause query failure
> ---
>
> Key: SPARK-44007
> URL: https://issues.apache.org/jira/browse/SPARK-44007
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1, 3.4.0
> Reporter: Tongwei
> Priority: Major
[jira] [Updated] (SPARK-44007) Unresolved hint cause query failure
[ https://issues.apache.org/jira/browse/SPARK-44007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-44007:
---
    Description:
{code:java}
CREATE TABLE t0(c0 bigint) USING PARQUET
CREATE TABLE t1(c1 bigint) USING PARQUET
{code}

> Unresolved hint cause query failure
> ---
>
> Key: SPARK-44007
> URL: https://issues.apache.org/jira/browse/SPARK-44007
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1
> Reporter: Tongwei
> Priority: Major
[jira] [Created] (SPARK-44007) Unresolved hint cause query failure
Tongwei created SPARK-44007:
---

Summary: Unresolved hint cause query failure
Key: SPARK-44007
URL: https://issues.apache.org/jira/browse/SPARK-44007
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.3.1
Reporter: Tongwei
[jira] [Resolved] (SPARK-42356) Cannot resolve orderby attributes in DISTINCT
[ https://issues.apache.org/jira/browse/SPARK-42356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei resolved SPARK-42356.
---
    Resolution: Won't Fix

> Cannot resolve orderby attributes in DISTINCT
> ---
>
> Key: SPARK-42356
> URL: https://issues.apache.org/jira/browse/SPARK-42356
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1
> Reporter: Tongwei
> Priority: Major
>
> query:
> {code:java}
> CREATE TABLE students (name VARCHAR(64), address VARCHAR(64)) USING PARQUET PARTITIONED BY (student_id INT);
> SELECT DISTINCT trim(name) AS new_name FROM students order by name desc;
> {code}
> error:
> {code:java}
> spark-sql> SELECT DISTINCT trim(name) AS new_name FROM students order by name desc;
> Error in query: Column 'name' does not exist. Did you mean one of the following? [new_name]; line 1 pos 62;
> 'Sort ['name DESC NULLS LAST], true
> +- Distinct
>    +- Project [trim(name#1, None) AS new_name#0]
>       +- SubqueryAlias spark_catalog.default.students
>          +- Relation default.students[name#1,address#2,student_id#3] parquet
> {code}
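The error message above lists `new_name` as the only column visible to the sort once DISTINCT has been applied, which suggests ordering by the select-list alias instead of the underlying column. The sketch below shows that rewrite. It is only an illustration under stated assumptions: Python's built-in `sqlite3` module stands in for Spark, so it says nothing about Spark's own resolution rules; the sample rows are invented; and the `PARTITIONED BY` clause is replaced by an ordinary `student_id` column because SQLite has no equivalent. The ticket does not confirm this as the recommended fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the table in the report; student_id is a plain column here.
conn.execute(
    "CREATE TABLE students (name VARCHAR(64), address VARCHAR(64), student_id INT)"
)
# Invented sample rows: two spellings of the same name that trim() collapses.
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("  alice", "a st", 1), ("alice  ", "b st", 2), ("bob", "c st", 3)],
)

# Sort by the alias produced in the select list rather than by `name`.
rows = conn.execute(
    "SELECT DISTINCT trim(name) AS new_name FROM students ORDER BY new_name DESC"
).fetchall()
conn.close()

print(rows)  # prints [('bob',), ('alice',)]
```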
[jira] [Updated] (SPARK-42356) Cannot resolve orderby attributes in DISTINCT
[ https://issues.apache.org/jira/browse/SPARK-42356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-42356:
---
    Description:
query:
{code:java}
CREATE TABLE students (name VARCHAR(64), address VARCHAR(64)) USING PARQUET PARTITIONED BY (student_id INT);
SELECT DISTINCT trim(name) AS new_name FROM students order by name desc;
{code}
error:
{code:java}
spark-sql> SELECT DISTINCT trim(name) AS new_name FROM students order by name desc;
Error in query: Column 'name' does not exist. Did you mean one of the following? [new_name]; line 1 pos 62;
'Sort ['name DESC NULLS LAST], true
+- Distinct
   +- Project [trim(name#1, None) AS new_name#0]
      +- SubqueryAlias spark_catalog.default.students
         +- Relation default.students[name#1,address#2,student_id#3] parquet
{code}

> Cannot resolve orderby attributes in DISTINCT
> ---
>
> Key: SPARK-42356
> URL: https://issues.apache.org/jira/browse/SPARK-42356
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1
> Reporter: Tongwei
> Priority: Major
[jira] [Updated] (SPARK-42356) Cannot resolve orderby attributes in DISTINCT
[ https://issues.apache.org/jira/browse/SPARK-42356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tongwei updated SPARK-42356: Description: query: {code:java} CREATE TABLE students (name VARCHAR(64), address VARCHAR(64)) USING PARQUET PARTITIONED BY (student_id INT); SELECT DISTINCT trim(name) AS new_name FROM students order by name desc; {code} error: {code:java} spark-sql> SELECT DISTINCT trim(name) AS new_name FROM students order by name desc; Error in query: Column 'name' does not exist. Did you mean one of the following? [new_name]; line 1 pos 62; 'Sort ['name DESC NULLS LAST], true +- Distinct +- Project trim(name#1, None) AS new_name#0 +- SubqueryAlias spark_catalog.default.students +- Relation default.studentsname#1,address#2,student_id#3 parquet {code} was: ``` CREATE TABLE students (name VARCHAR(64), address VARCHAR(64)) USING PARQUET PARTITIONED BY (student_id INT); SELECT DISTINCT trim(name) AS new_name FROM students order by name desc; ``` ERROR: ``` spark-sql> SELECT DISTINCT trim(name) AS new_name FROM students order by name desc; Error in query: Column 'name' does not exist. Did you mean one of the following? 
[new_name]; line 1 pos 62; 'Sort ['name DESC NULLS LAST], true +- Distinct +- Project [trim(name#1, None) AS new_name#0] +- SubqueryAlias spark_catalog.default.students +- Relation default.students[name#1,address#2,student_id#3] parquet ``` > Cannot resolve orderby attributes in DISTINCT > -- > > Key: SPARK-42356 > URL: https://issues.apache.org/jira/browse/SPARK-42356 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.1 >Reporter: Tongwei >Priority: Major > > query: > > {code:java} > CREATE TABLE students (name VARCHAR(64), address VARCHAR(64)) USING PARQUET > PARTITIONED BY (student_id INT); > SELECT DISTINCT trim(name) AS new_name FROM students order by name desc; > {code} > error: > > {code:java} > spark-sql> SELECT DISTINCT trim(name) AS new_name FROM students order by name > desc; > Error in query: Column 'name' does not exist. Did you mean one of the > following? [new_name]; line 1 pos 62; > 'Sort ['name DESC NULLS LAST], true > +- Distinct > +- Project trim(name#1, None) AS new_name#0 > +- SubqueryAlias spark_catalog.default.students > +- Relation default.studentsname#1,address#2,student_id#3 parquet > {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42356) Cannot resolve orderby attributes in DISTINCT
Tongwei created SPARK-42356:
----------------------------

Summary: Cannot resolve orderby attributes in DISTINCT
Key: SPARK-42356
URL: https://issues.apache.org/jira/browse/SPARK-42356
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.3.1
Reporter: Tongwei

```
CREATE TABLE students (name VARCHAR(64), address VARCHAR(64)) USING PARQUET PARTITIONED BY (student_id INT);
SELECT DISTINCT trim(name) AS new_name FROM students order by name desc;
```
ERROR:
```
spark-sql> SELECT DISTINCT trim(name) AS new_name FROM students order by name desc;
Error in query: Column 'name' does not exist. Did you mean one of the following? [new_name]; line 1 pos 62;
'Sort ['name DESC NULLS LAST], true
+- Distinct
   +- Project [trim(name#1, None) AS new_name#0]
      +- SubqueryAlias spark_catalog.default.students
         +- Relation default.students[name#1,address#2,student_id#3] parquet
```
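The plan in the error shows `Sort` sitting above `Distinct` and `Project`, so by the time the sort runs only `new_name` is left, which is what the analyzer hint (`[new_name]`) points at. Ordering by the alias is one way to rewrite the query. The sketch below runs that rewritten query against Python's built-in `sqlite3` as a stand-in engine. This is an assumption made to keep the example self-contained: it is not Spark, and the DDL drops `USING PARQUET` and `PARTITIONED BY`. The helper name `distinct_trimmed_names_desc` and the sample rows are hypothetical.

```python
import sqlite3


def distinct_trimmed_names_desc(rows):
    """Return distinct trimmed names, sorted descending by the select-list alias."""
    conn = sqlite3.connect(":memory:")
    try:
        # Simplified stand-in for the table in the report (no Parquet, no partitioning).
        conn.execute(
            "CREATE TABLE students "
            "(name VARCHAR(64), address VARCHAR(64), student_id INT)"
        )
        conn.executemany("INSERT INTO students VALUES (?, ?, ?)", rows)
        # ORDER BY refers to the alias new_name instead of the source column name.
        cursor = conn.execute(
            "SELECT DISTINCT trim(name) AS new_name "
            "FROM students ORDER BY new_name DESC"
        )
        return [row[0] for row in cursor.fetchall()]
    finally:
        conn.close()


# Hypothetical sample data: two spellings of "bob" collapse after trim().
sample_rows = [
    (" bob ", "Street 1", 1),
    ("alice", "Street 2", 2),
    ("bob", "Street 3", 3),
]
print(distinct_trimmed_names_desc(sample_rows))  # ['bob', 'alice']
```

Sorting by the alias keeps the ORDER BY expression inside the DISTINCT output, so no column that DISTINCT has already discarded is needed.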
[jira] [Updated] (SPARK-38173) Quoted column cannot be recognized correctly when quotedRegexColumnNames is true
[ https://issues.apache.org/jira/browse/SPARK-38173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tongwei updated SPARK-38173: Description: When spark.sql.parser.quotedRegexColumnNames=true {code:java} SELECT `(C3)?+.+`,`C1` * C2 FROM (SELECT 3 AS C1,2 AS C2,1 AS C3) T;{code} The above query will throw an exception {code:java} Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'multiply' at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:370) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:266) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:44) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:266) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:261) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:275) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'multiply' at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis(CheckAnalysis.scala:50) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis$(CheckAnalysis.scala:49) at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:155) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$expandStarExpression$1.applyOrElse(Analyzer.scala:1700) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$expandStarExpression$1.applyOrElse(Analyzer.scala:1671) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$2(TreeNode.scala:342) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:342) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$1(TreeNode.scala:339) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:408) at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:244) at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:406) at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:359) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:339) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.expandStarExpression(Analyzer.scala:1671) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.$anonfun$buildExpandedProjectList$1(Analyzer.scala:1656) {code} It works fine in hive {code:java} 0: jdbc:hive2://hiveserver-inc.> set hive.support.quoted.identifiers=none; No rows affected (0.003 seconds) 0: 
jdbc:hive2://hiveserver-inc.> SELECT `(C3)?+.+`,`C1` * C2 FROM (SELECT 3 AS C1,2 AS C2,1 AS C3) T; 22/02/10 19:01:43 INFO ql.Driver: OK +---+---+--+ | t.c1 | t.c2 | _c1 | +---+---+--+ | 3 | 2 | 6 | +---+---+--+ 1 row selected (0.136 seconds){code} was: When spark.sql.parser.quotedRegexColumnNames=true {code:java} SELECT `(C3)?+.+`,`C1` * C2 FROM (SELECT 3 AS C1,2 AS C2,1 AS C3) T;{code} The above query will throw an exception {code:java} Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'multiply
[jira] [Updated] (SPARK-38173) Quoted column cannot be recognized correctly when quotedRegexColumnNames is true
[ https://issues.apache.org/jira/browse/SPARK-38173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tongwei updated SPARK-38173: Description: When spark.sql.parser.quotedRegexColumnNames=true {code:java} SELECT `(C3)?+.+`,`C1` * C2 FROM (SELECT 3 AS C1,2 AS C2,1 AS C3) T;{code} The above query will throw an exception {code:java} Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'multiply' at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:370) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:266) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:44) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:266) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:261) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:275) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'multiply' at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis(CheckAnalysis.scala:50) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis$(CheckAnalysis.scala:49) at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:155) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$expandStarExpression$1.applyOrElse(Analyzer.scala:1700) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$expandStarExpression$1.applyOrElse(Analyzer.scala:1671) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$2(TreeNode.scala:342) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:342) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$1(TreeNode.scala:339) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:408) at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:244) at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:406) at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:359) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:339) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.expandStarExpression(Analyzer.scala:1671) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.$anonfun$buildExpandedProjectList$1(Analyzer.scala:1656) {code} It works fine in hive {code:java} 0: jdbc:hive2://hiveserver-prd.shizhuang-inc.> set hive.support.quoted.identifiers=none; No rows affected (0.003 
seconds) 0: jdbc:hive2://hiveserver-prd.shizhuang-inc.> SELECT `(C3)?+.+`,`C1` * C2 FROM (SELECT 3 AS C1,2 AS C2,1 AS C3) T; 22/02/10 19:01:43 INFO ql.Driver: OK +---+---+--+ | t.c1 | t.c2 | _c1 | +---+---+--+ | 3 | 2 | 6 | +---+---+--+ 1 row selected (0.136 seconds){code} was: When spark.sql.parser.quotedRegexColumnNames=true {code:java} SELECT `(C3)?+.+`,`C1` * C2 FROM (SELECT 3 AS C1,2 AS C2,1 AS C3) T;{code} The above query will throw an exception {code:java} Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Invalid us
[jira] [Updated] (SPARK-38173) Quoted column cannot be recognized correctly when quotedRegexColumnNames is true
[ https://issues.apache.org/jira/browse/SPARK-38173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tongwei updated SPARK-38173: Description: When spark.sql.parser.quotedRegexColumnNames=true {code:java} SELECT `(C3)?+.+`,`C1` * C2 FROM (SELECT 3 AS C1,2 AS C2,1 AS C3) T;{code} The above query will throw an exception {code:java} Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'multiply' at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:370) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:266) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:44) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:266) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:261) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:275) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'multiply' at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis(CheckAnalysis.scala:50) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis$(CheckAnalysis.scala:49) at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:155) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$expandStarExpression$1.applyOrElse(Analyzer.scala:1700) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$expandStarExpression$1.applyOrElse(Analyzer.scala:1671) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$2(TreeNode.scala:342) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:342) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$1(TreeNode.scala:339) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:408) at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:244) at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:406) at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:359) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:339) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.expandStarExpression(Analyzer.scala:1671) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.$anonfun$buildExpandedProjectList$1(Analyzer.scala:1656) {code} It works fine in hive {code:java} 0: jdbc:hive2://hiveserver-prd.shizhuang-inc.> set hive.support.quoted.identifiers=none; No rows affected (0.003 
seconds) 0: jdbc:hive2://hiveserver-prd.shizhuang-inc.> SELECT `(C3)?+.+`,`C1` * C2 FROM (SELECT 3 AS C1,2 AS C2,1 AS C3) T; 22/02/10 19:01:43 INFO ql.Driver: OK +---+---+--+ | t.c1 | t.c2 | _c1 | +---+---+--+ | 3 | 2 | 6 | +---+---+--+ 1 row selected (0.136 seconds){code} was: When spark.sql.parser.quotedRegexColumnNames=true ``` SELECT `(C3)?+.+`,`C1` * C2 FROM (SELECT 3 AS C1,2 AS C2,1 AS C3) T; ``` > Quoted column cannot be recognized correctly when quotedRegexColumnNames is > true > > >
[jira] [Updated] (SPARK-38173) Quoted column cannot be recognized correctly when quotedRegexColumnNames is true
[ https://issues.apache.org/jira/browse/SPARK-38173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-38173:
----------------------------
Description:
When spark.sql.parser.quotedRegexColumnNames=true
```
SELECT `(C3)?+.+`,`C1` * C2 FROM (SELECT 3 AS C1,2 AS C2,1 AS C3) T;
```

> Quoted column cannot be recognized correctly when quotedRegexColumnNames is true
> ---------------------------------------------------------------------------------
>
> Key: SPARK-38173
> URL: https://issues.apache.org/jira/browse/SPARK-38173
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.1.2, 3.2.0
> Reporter: Tongwei
> Priority: Major
>
> When spark.sql.parser.quotedRegexColumnNames=true
> ```
> SELECT `(C3)?+.+`,`C1` * C2 FROM (SELECT 3 AS C1,2 AS C2,1 AS C3) T;
> ```

--
This message was sent by Atlassian Jira (v8.20.1#820001)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
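With spark.sql.parser.quotedRegexColumnNames=true, a backquoted name in the SELECT list is read as a regular expression over column names. The pattern `(C3)?+.+` uses a possessive quantifier: `(C3)?+` consumes "C3" if it is there and never gives it back, and `.+` then needs at least one more character, so the name "C3" itself cannot match. The sketch below shows which column names the pattern selects. It is only an illustration in Python, not Spark's matching code: the possessive quantifier is emulated with a lookahead plus backreference (an assumption made so the example also runs on Python versions before 3.11, where `re` lacks possessive quantifiers), matching is case-sensitive here, and the helper name `select_columns` is hypothetical. It does not reproduce the exception from the report.

```python
import re

# Java's possessive `(C3)?+` never releases "C3" once consumed. It is emulated
# here as an atomic group: the lookahead captures the optional "C3" and the
# backreference consumes exactly what was captured, with no backtracking.
POSSESSIVE_EXCLUDE_C3 = re.compile(r"(?=((?:C3)?))\1.+")

# Plain greedy variant for contrast: it may backtrack and skip the optional "C3".
GREEDY_VARIANT = re.compile(r"(?:C3)?.+")


def select_columns(pattern, columns):
    """Return the column names that the pattern matches in full."""
    return [name for name in columns if pattern.fullmatch(name)]


columns = ["C1", "C2", "C3"]  # columns of the subquery T in the report
print(select_columns(POSSESSIVE_EXCLUDE_C3, columns))  # ['C1', 'C2']
print(select_columns(GREEDY_VARIANT, columns))         # ['C1', 'C2', 'C3']
```

The possessive form is what turns the pattern into "every column except C3"; the greedy form would select all three columns.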
[jira] [Created] (SPARK-38173) Quoted column cannot be recognized correctly when quotedRegexColumnNames is true
Tongwei created SPARK-38173:
----------------------------

Summary: Quoted column cannot be recognized correctly when quotedRegexColumnNames is true
Key: SPARK-38173
URL: https://issues.apache.org/jira/browse/SPARK-38173
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.2.0, 3.1.2
Reporter: Tongwei
[jira] [Updated] (SPARK-37519) Support Relation With LateralView
[ https://issues.apache.org/jira/browse/SPARK-37519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-37519:
----------------------------
Description:
{code:java}
CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING);
INSERT INTO person VALUES
(100, 'John', 30, 1, 'Street 1'),
(200, 'Mary', NULL, 1, 'Street 2'),
(300, 'Mike', 80, 3, 'Street 3'),
(400, 'Dan', 50, 4, 'Street 4');
SELECT *
FROM person AS P1
LATERAL VIEW EXPLODE(ARRAY(30, 60)) CC1 AS C_AGE1
LEFT JOIN person P2
LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE;
{code}
{code:java}
Error msg:
LEFT JOIN person P2 ^^^ LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE
.
at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'LEFT' expecting {, ';'}(line 4, pos 0)
{code}

was:
{code:java}
CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING);
INSERT INTO person VALUES
(100, 'John', 30, 1, 'Street 1'),
(200, 'Mary', NULL, 1, 'Street 2'),
(300, 'Mike', 80, 3, 'Street 3'),
(400, 'Dan', 50, 4, 'Street 4');
SELECT *
FROM person AS P1
LATERAL VIEW EXPLODE(ARRAY(30, 60)) CC1 AS C_AGE1
LEFT JOIN person P2
LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE;
{code}
Error msg:
LEFT JOIN person P2 ^^^ LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE
.
at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'LEFT' expecting \{, ';'}(line 4, pos 0)

> Support Relation With LateralView
> ---------------------------------
>
> Key: SPARK-37519
> URL: https://issues.apache.org/jira/browse/SPARK-37519
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.1.2
> Reporter: Tongwei
> Priority: Major
>
> {code:java}
> CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING);
> INSERT INTO person VALUES
> (100, 'John', 30, 1, 'Street 1'),
> (200, 'Mary', NULL, 1, 'Street 2'),
> (300, 'Mike', 80, 3, 'Street 3'),
> (400, 'Dan', 50, 4, 'Street 4');
> SELECT *
> FROM person AS P1
> LATERAL VIEW EXPLODE(ARRAY(30, 60)) CC1 AS C_AGE1
> LEFT JOIN person P2
> LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE;
> {code}
> {code:java}
> Error msg:
> LEFT JOIN person P2 ^^^ LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE
> .
> at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'LEFT' expecting {, ';'}(line 4, pos 0)
> {code}
[jira] [Updated] (SPARK-37519) Support Relation With LateralView
[ https://issues.apache.org/jira/browse/SPARK-37519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tongwei updated SPARK-37519: Description: {code:java} CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING); INSERT INTO person VALUES (100, 'John', 30, 1, 'Street 1'), (200, 'Mary', NULL, 1, 'Street 2'), (300, 'Mike', 80, 3, 'Street 3'), (400, 'Dan', 50, 4, 'Street 4'); SELECT * FROM person AS P1 LATERAL VIEW EXPLODE(ARRAY(30, 60)) CC1 AS C_AGE1 LEFT JOIN person P2 LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE; {code} Error msg: LEFT JOIN person P2 ^^^ LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE . at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'LEFT' expecting \{, ';'}(line 4, pos 0) was: ``` CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING); INSERT INTO person VALUES (100, 'John', 30, 1, 'Street 1'), (200, 'Mary', NULL, 1, 'Street 2'), (300, 'Mike', 80, 3, 'Street 3'), (400, 'Dan', 50, 4, 'Street 4'); SELECT * FROM person AS P1 LATERAL VIEW EXPLODE(ARRAY(30, 60)) CC1 AS C_AGE1 LEFT JOIN person P2 LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE; ERROR INFO: LEFT JOIN PERSON P2 ^^^ LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:370) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:266) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:44) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:266) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:261) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:275) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'LEFT' expecting \{, ';'}(line 13, pos 0) ``` > Support Relation With LateralView > - > > Key: SPARK-37519 > URL: https://issues.apache.org/jira/browse/SPARK-37519 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.2 >Reporter: Tongwei >Priority: Major > > {code:java} > CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING); > INSERT INTO person VALUES > (100, 'John', 30, 1, 'Street 1'), > (200, 'Mary', NULL, 1, 'Street 2'), > (300, 'Mike', 80, 3, 'Street 3'), > (400, 'Dan', 50, 4, 'Street 4'); > SELECT * > FROM person AS P1 > LATERAL VIEW EXPLODE(ARRAY(30, 60)) CC1 AS C_AGE1 > LEFT JOIN person P2 > LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND > 
CC1.C_AGE1=P2.AGE; > {code} > Error msg: > LEFT JOIN person P2 ^^^ LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON > P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE > . > at java.lang.Thread.run(Thread.java:748) Caused by: > org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'LEFT' > expecting \{, ';'}(line 4, pos 0) -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37519) Support Relation With LateralView
[ https://issues.apache.org/jira/browse/SPARK-37519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-37519:
----------------------------
    Description: 
```
CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING);
INSERT INTO person VALUES
(100, 'John', 30, 1, 'Street 1'),
(200, 'Mary', NULL, 1, 'Street 2'),
(300, 'Mike', 80, 3, 'Street 3'),
(400, 'Dan', 50, 4, 'Street 4');
SELECT *
FROM person AS P1
LATERAL VIEW EXPLODE(ARRAY(30, 60)) CC1 AS C_AGE1
LEFT JOIN person P2
LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE;
ERROR INFO:
LEFT JOIN PERSON P2
^^^
LATERAL VIEW EXPLODE(ARRAY(50)) CC2 AS C_AGE2 ON P1.ID = P2.ID AND CC1.C_AGE1=P2.AGE
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:370)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:266)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
	at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:44)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:266)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:261)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:275)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'LEFT' expecting \{, ';'}(line 13, pos 0)
```

> Support Relation With LateralView
> ---------------------------------
>
>                 Key: SPARK-37519
>                 URL: https://issues.apache.org/jira/browse/SPARK-37519
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.2
>            Reporter: Tongwei
>            Priority: Major
[jira] [Created] (SPARK-37519) Support Relation With LateralView
Tongwei created SPARK-37519:
-------------------------------

             Summary: Support Relation With LateralView
                 Key: SPARK-37519
                 URL: https://issues.apache.org/jira/browse/SPARK-37519
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.1.2
            Reporter: Tongwei


--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37131) Support use IN/EXISTS with subquery in Project/Aggregate
[ https://issues.apache.org/jira/browse/SPARK-37131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-37131:
----------------------------
    Description: 
{code:java}
CREATE TABLE tbl1 (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl1 SELECT 0,1;
CREATE TABLE tbl2 (c1 INT, c2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl2 SELECT 0,2;
case 1:
   select c1 in (select col1 from tbl1) from tbl2
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Project []
case 2:
   select count(1), case when c1 in (select col1 from tbl1) then "A" else "B" end as tag from tbl2 group by case when c1 in (select col1 from tbl1) then "A" else "B" end
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Aggregate []
{code}

  was:
{code:java}
CREATE TABLE tbl1 (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl1 SELECT 0,1;
CREATE TABLE tbl2 (c1 INT, c2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl2 SELECT 0,2;
case 1:
   select c1 in (select col1 from tbl1) from tbl2
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Project []
case 2:
   select count(1), case when c1 in (select col1 from tbl1) then "A" else "B" end as tag from tbl2 group by case when c1 in (select col1 from tbl1) then "A" else "B" end
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Aggregate []
{code}

> Support use IN/EXISTS with subquery in Project/Aggregate
> --------------------------------------------------------
>
>                 Key: SPARK-37131
>                 URL: https://issues.apache.org/jira/browse/SPARK-37131
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Tongwei
>            Priority: Major


--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37131) Support use IN/EXISTS with subquery in Project/Aggregate
[ https://issues.apache.org/jira/browse/SPARK-37131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-37131:
----------------------------
    Description: 
{code:java}
CREATE TABLE tbl1 (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl1 SELECT 0,1;
CREATE TABLE tbl2 (c1 INT, c2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl2 SELECT 0,2;
case 1:
   select c1 in (select col1 from tbl1) from tbl2
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Project []
case 2:
   select count(1), case when c1 in (select col1 from tbl1) then "A" else "B" end as tag from tbl2 group by case when c1 in (select col1 from tbl1) then "A" else "B" end
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Aggregate []
{code}

  was:
```
CREATE TABLE tbl1 (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl1 SELECT 0,1;
CREATE TABLE tbl2 (c1 INT, c2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl2 SELECT 0,2;
case 1:
   select c1 in (select col1 from tbl1) from tbl2
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Project []
case 2:
   select count(1), case when c1 in (select col1 from tbl1) then "A" else "B" end as tag from tbl2 group by case when c1 in (select col1 from tbl1) then "A" else "B" end
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Aggregate []
```

> Support use IN/EXISTS with subquery in Project/Aggregate
> --------------------------------------------------------
>
>                 Key: SPARK-37131
>                 URL: https://issues.apache.org/jira/browse/SPARK-37131
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Tongwei
>            Priority: Major
[jira] [Updated] (SPARK-37131) Support use IN/EXISTS with subquery in Project/Aggregate
[ https://issues.apache.org/jira/browse/SPARK-37131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-37131:
----------------------------
    Description: 
CREATE TABLE tbl1 (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl1 SELECT 0,1;
CREATE TABLE tbl2 (c1 INT, c2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl2 SELECT 0,2;
case 1:
   select c1 in (select col1 from tbl1) from tbl2
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Project []
case 2:
   select count(1), case when c1 in (select col1 from tbl1) then "A" else "B" end as tag from tbl2 group by case when c1 in (select col1 from tbl1) then "A" else "B" end
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Aggregate []

  was:
CREATE TABLE tbl1 (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl1 SELECT 0,1;
CREATE TABLE tbl2 (c1 INT, c2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl2 SELECT 0,2;
case 1:
   select c1 in (select col1 from tbl1) from tbl2
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Project []
case 2:
   select count(*), case when c1 in (select col1 from tbl1) then "A" else "B" end as tag from tbl2 group by case when c1 in (select col1 from tbl1) then "A" else "B" end
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Aggregate []

> Support use IN/EXISTS with subquery in Project/Aggregate
> --------------------------------------------------------
>
>                 Key: SPARK-37131
>                 URL: https://issues.apache.org/jira/browse/SPARK-37131
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Tongwei
>            Priority: Major
[jira] [Updated] (SPARK-37131) Support use IN/EXISTS with subquery in Project/Aggregate
[ https://issues.apache.org/jira/browse/SPARK-37131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-37131:
----------------------------
    Description: 
```
CREATE TABLE tbl1 (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl1 SELECT 0,1;
CREATE TABLE tbl2 (c1 INT, c2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl2 SELECT 0,2;
case 1:
   select c1 in (select col1 from tbl1) from tbl2
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Project []
case 2:
   select count(1), case when c1 in (select col1 from tbl1) then "A" else "B" end as tag from tbl2 group by case when c1 in (select col1 from tbl1) then "A" else "B" end
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Aggregate []
```

  was:
CREATE TABLE tbl1 (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl1 SELECT 0,1;
CREATE TABLE tbl2 (c1 INT, c2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl2 SELECT 0,2;
case 1:
   select c1 in (select col1 from tbl1) from tbl2
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Project []
case 2:
   select count(1), case when c1 in (select col1 from tbl1) then "A" else "B" end as tag from tbl2 group by case when c1 in (select col1 from tbl1) then "A" else "B" end
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Aggregate []

> Support use IN/EXISTS with subquery in Project/Aggregate
> --------------------------------------------------------
>
>                 Key: SPARK-37131
>                 URL: https://issues.apache.org/jira/browse/SPARK-37131
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Tongwei
>            Priority: Major
[jira] [Created] (SPARK-37131) Support use IN/EXISTS with subquery in Project/Aggregate
Tongwei created SPARK-37131:
-------------------------------

             Summary: Support use IN/EXISTS with subquery in Project/Aggregate
                 Key: SPARK-37131
                 URL: https://issues.apache.org/jira/browse/SPARK-37131
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.2.0
            Reporter: Tongwei


CREATE TABLE tbl1 (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl1 SELECT 0,1;
CREATE TABLE tbl2 (c1 INT, c2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl2 SELECT 0,2;
case 1:
   select c1 in (select col1 from tbl1) from tbl2
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Project []
case 2:
   select count(*), case when c1 in (select col1 from tbl1) then "A" else "B" end as tag from tbl2 group by case when c1 in (select col1 from tbl1) then "A" else "B" end
   Error msg:
   IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Aggregate []
[jira] [Updated] (SPARK-36727) Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
[ https://issues.apache.org/jira/browse/SPARK-36727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-36727:
----------------------------
    Priority: Major  (was: Minor)

> Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36727
>                 URL: https://issues.apache.org/jira/browse/SPARK-36727
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.2
>            Reporter: Tongwei
>            Priority: Major
>
> {code:java}
> -- non-partitioned table overwrite
> CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET;
> INSERT OVERWRITE TABLE tbl SELECT 0,1;
> INSERT OVERWRITE TABLE tbl SELECT * FROM tbl;
> -- partitioned table static overwrite
> CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET PARTITIONED BY (pt1 INT);
> INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT 0 AS col1,1 AS col2;
> INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT col1, col2 FROM tbl WHERE pt1=2021;
> {code}
> When we run the above queries, an error is thrown: "Cannot overwrite a path that is also being read from".
> We need to support this operation when spark.sql.sources.partitionOverwriteMode is dynamic.
[jira] [Updated] (SPARK-36727) Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
[ https://issues.apache.org/jira/browse/SPARK-36727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-36727:
----------------------------
    External issue URL: https://github.com/apache/spark/pull/33986

> Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36727
>                 URL: https://issues.apache.org/jira/browse/SPARK-36727
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.2
>            Reporter: Tongwei
>            Priority: Minor
>
> {code:java}
> -- non-partitioned table overwrite
> CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET;
> INSERT OVERWRITE TABLE tbl SELECT 0,1;
> INSERT OVERWRITE TABLE tbl SELECT * FROM tbl;
> -- partitioned table static overwrite
> CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET PARTITIONED BY (pt1 INT);
> INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT 0 AS col1,1 AS col2;
> INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT col1, col2 FROM tbl WHERE pt1=2021;
> {code}
> When we run the above queries, an error is thrown: "Cannot overwrite a path that is also being read from".
> We need to support this operation when spark.sql.sources.partitionOverwriteMode is dynamic.
[jira] [Updated] (SPARK-36727) Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
[ https://issues.apache.org/jira/browse/SPARK-36727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-36727:
----------------------------
    External issue URL:   (was: https://github.com/apache/spark/pull/33986)

> Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36727
>                 URL: https://issues.apache.org/jira/browse/SPARK-36727
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.2
>            Reporter: Tongwei
>            Priority: Minor
>
> {code:java}
> -- non-partitioned table overwrite
> CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET;
> INSERT OVERWRITE TABLE tbl SELECT 0,1;
> INSERT OVERWRITE TABLE tbl SELECT * FROM tbl;
> -- partitioned table static overwrite
> CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET PARTITIONED BY (pt1 INT);
> INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT 0 AS col1,1 AS col2;
> INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT col1, col2 FROM tbl WHERE pt1=2021;
> {code}
> When we run the above queries, an error is thrown: "Cannot overwrite a path that is also being read from".
> We need to support this operation when spark.sql.sources.partitionOverwriteMode is dynamic.
[jira] [Updated] (SPARK-36727) Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
[ https://issues.apache.org/jira/browse/SPARK-36727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-36727:
----------------------------
    Description: 
{code:java}
-- non-partitioned table overwrite
CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl SELECT 0,1;
INSERT OVERWRITE TABLE tbl SELECT * FROM tbl;
-- partitioned table static overwrite
CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET PARTITIONED BY (pt1 INT);
INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT 0 AS col1,1 AS col2;
INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT col1, col2 FROM tbl WHERE pt1=2021;
{code}
When we run the above queries, an error is thrown: "Cannot overwrite a path that is also being read from".
We need to support this operation when spark.sql.sources.partitionOverwriteMode is dynamic.

  was:
{code:java}
-- non-partitioned table overwrite
CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl SELECT 0,1;
INSERT OVERWRITE TABLE tbl SELECT * FROM tbl;
-- partitioned table static overwrite
CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET PARTITIONED BY (pt1 INT);
INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT 0 AS col1,1 AS col2;
INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT col1, col2 FROM tbl WHERE pt1=2021;
{code}
When we run the above queries, an error is thrown: "Cannot overwrite a path that is also being read from".
We need to support this operation when the weather is good

> Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36727
>                 URL: https://issues.apache.org/jira/browse/SPARK-36727
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.2
>            Reporter: Tongwei
>            Priority: Minor
[jira] [Updated] (SPARK-36727) Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
[ https://issues.apache.org/jira/browse/SPARK-36727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-36727:
----------------------------
    Description: 
{code:java}
-- non-partitioned table overwrite
CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl SELECT 0,1;
INSERT OVERWRITE TABLE tbl SELECT * FROM tbl;
-- partitioned table static overwrite
CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET PARTITIONED BY (pt1 INT);
INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT 0 AS col1,1 AS col2;
INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT col1, col2 FROM tbl WHERE pt1=2021;
{code}
When we run the above queries, an error is thrown: "Cannot overwrite a path that is also being read from".
We need to support this operation when the weather is good

  was:
{code:java}
-- non-partitioned table overwrite
CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl SELECT 0,1;
INSERT OVERWRITE TABLE tbl SELECT * FROM tbl;
-- partitioned table static overwrite
CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET PARTITIONED BY (pt1 INT);
INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT 0 AS col1,1 AS col2;
INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT col1, col2 FROM tbl WHERE pt1=2021;
{code}
When we run the above queries, an error is thrown: "Cannot overwrite a path that is also being read from".

> Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36727
>                 URL: https://issues.apache.org/jira/browse/SPARK-36727
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.2
>            Reporter: Tongwei
>            Priority: Minor
[jira] [Updated] (SPARK-36727) Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
[ https://issues.apache.org/jira/browse/SPARK-36727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-36727:
----------------------------
    Description: 
{code:java}
-- non-partitioned table overwrite
CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET;
INSERT OVERWRITE TABLE tbl SELECT 0,1;
INSERT OVERWRITE TABLE tbl SELECT * FROM tbl;
-- partitioned table static overwrite
CREATE TABLE tbl (col1 INT, col2 STRING) USING PARQUET PARTITIONED BY (pt1 INT);
INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT 0 AS col1,1 AS col2;
INSERT OVERWRITE TABLE tbl PARTITION(pt1=2021) SELECT col1, col2 FROM tbl WHERE pt1=2021;
{code}
When we run the above queries, an error is thrown: "Cannot overwrite a path that is also being read from".

> Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36727
>                 URL: https://issues.apache.org/jira/browse/SPARK-36727
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.2
>            Reporter: Tongwei
>            Priority: Minor
[jira] [Closed] (SPARK-33648) Moving file stage failed cause dulpicated data
[ https://issues.apache.org/jira/browse/SPARK-33648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei closed SPARK-33648.
---------------------------

> Moving file stage failed cause dulpicated data
> ----------------------------------------------
>
>                 Key: SPARK-33648
>                 URL: https://issues.apache.org/jira/browse/SPARK-33648
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.1
>            Reporter: Tongwei
>            Priority: Major
[jira] [Created] (SPARK-36727) Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
Tongwei created SPARK-36727:
-------------------------------

             Summary: Support sql overwrite a path that is also being read from when partitionOverwriteMode is dynamic
                 Key: SPARK-36727
                 URL: https://issues.apache.org/jira/browse/SPARK-36727
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.1.2
            Reporter: Tongwei
[jira] [Updated] (SPARK-33648) Moving file stage failed cause dulpicated data
[ https://issues.apache.org/jira/browse/SPARK-33648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-33648:
----------------------------
    Summary: Moving file stage failed cause dulpicated data  (was: Moving file failed cause dulpicated data )

> Moving file stage failed cause dulpicated data
> ----------------------------------------------
>
>                 Key: SPARK-33648
>                 URL: https://issues.apache.org/jira/browse/SPARK-33648
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.1
>            Reporter: Tongwei
>            Priority: Major
[jira] [Updated] (SPARK-33648) Moving file failed cause dulpicated data
[ https://issues.apache.org/jira/browse/SPARK-33648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei updated SPARK-33648:
----------------------------
    Summary: Moving file failed cause dulpicated data   (was: null)

> Moving file failed cause dulpicated data
> ----------------------------------------
>
>                 Key: SPARK-33648
>                 URL: https://issues.apache.org/jira/browse/SPARK-33648
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.1
>            Reporter: Tongwei
>            Priority: Major
[jira] [Moved] (SPARK-33648) null
[ https://issues.apache.org/jira/browse/SPARK-33648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tongwei moved HIVE-24476 to SPARK-33648:
----------------------------------------
          Component/s: SQL
                  Key: SPARK-33648  (was: HIVE-24476)
    Affects Version/s: 3.0.1
             Workflow: no-reopen-closed  (was: no-reopen-closed, patch-avail)
              Project: Spark  (was: Hive)

> null
> ----
>
>                 Key: SPARK-33648
>                 URL: https://issues.apache.org/jira/browse/SPARK-33648
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.1
>            Reporter: Tongwei
>            Priority: Major