[GitHub] [spark] cloud-fan commented on a diff in pull request #38404: [SPARK-40956] SQL Equivalent for Dataframe overwrite command
cloud-fan commented on code in PR #38404:
URL: https://github.com/apache/spark/pull/38404#discussion_r1022326063

##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##
@@ -261,6 +261,7 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
   * {{{
   *   INSERT OVERWRITE TABLE tableIdentifier [partitionSpec [IF NOT EXISTS]]? [identifierList]
   *   INSERT INTO [TABLE] tableIdentifier [partitionSpec] [identifierList]
+  *   INSERT INTO [TABLE] tableIdentifier REPLACE whereClause identifierList

Review Comment:
```suggestion
   *   INSERT INTO [TABLE] tableIdentifier REPLACE whereClause
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a diff in pull request #38404: [SPARK-40956] SQL Equivalent for Dataframe overwrite command
cloud-fan commented on code in PR #38404:
URL: https://github.com/apache/spark/pull/38404#discussion_r1021061558

##
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4:
##
@@ -319,6 +319,7 @@ query
 insertInto
     : INSERT OVERWRITE TABLE? multipartIdentifier (partitionSpec (IF NOT EXISTS)?)? identifierList?    #insertOverwriteTable
     | INSERT INTO TABLE? multipartIdentifier partitionSpec? (IF NOT EXISTS)? identifierList?    #insertIntoTable
+    | INSERT INTO TABLE? multipartIdentifier REPLACE whereClause? identifierList?    #insertIntoReplaceWhere

Review Comment:
`SELECT * FROM source2` is not an identifier list but the input query. The syntax here covers only the header of INSERT and doesn't include the input query. A full query would look like `INSERT INTO t REPLACE WHERE ... (a, c) SELECT 1, 2`.
[GitHub] [spark] cloud-fan commented on a diff in pull request #38404: [SPARK-40956] SQL Equivalent for Dataframe overwrite command
cloud-fan commented on code in PR #38404:
URL: https://github.com/apache/spark/pull/38404#discussion_r1019781125

##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##
@@ -288,6 +289,11 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
           query,
           overwrite = true,
           ifPartitionNotExists)
+      case ctx: InsertIntoReplaceWhereContext =>

Review Comment:
I think we should fail if `identifierList` is specified. I'll update `OverwriteByExpression` to support a column list later.
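The check suggested above could look roughly like the following self-contained sketch. `ReplaceWhereHeader` and `ParseError` are hypothetical stand-ins invented for illustration; they are not Spark's `InsertIntoReplaceWhereContext` or its parse exception, and the real check would sit in `AstBuilder`.

```scala
// Hypothetical stand-in for the parsed INSERT INTO ... REPLACE WHERE header.
final case class ReplaceWhereHeader(
    table: Seq[String],           // multipart identifier, e.g. Seq("db", "t")
    whereCondition: String,       // text of the WHERE predicate
    columns: Option[Seq[String]]  // optional identifierList, e.g. (a, c)
)

// Hypothetical error type; Spark would raise its own parse error here.
final class ParseError(msg: String) extends RuntimeException(msg)

object ReplaceWhereCheck {
  // Reject any header that carries a column list, because the overwrite
  // plan is assumed not to support one yet.
  def validate(header: ReplaceWhereHeader): ReplaceWhereHeader =
    header.columns match {
      case Some(cols) =>
        throw new ParseError(
          "REPLACE WHERE does not support a column list yet: " +
            cols.mkString("(", ", ", ")"))
      case None => header
    }
}
```

Failing at parse time keeps the restriction visible to users until the column list is supported, rather than silently ignoring the list.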
[GitHub] [spark] cloud-fan commented on a diff in pull request #38404: [SPARK-40956] SQL Equivalent for Dataframe overwrite command
cloud-fan commented on code in PR #38404:
URL: https://github.com/apache/spark/pull/38404#discussion_r1019780097

##
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4:
##
@@ -319,6 +319,7 @@ query
 insertInto
     : INSERT OVERWRITE TABLE? multipartIdentifier (partitionSpec (IF NOT EXISTS)?)? identifierList?    #insertOverwriteTable
     | INSERT INTO TABLE? multipartIdentifier partitionSpec? (IF NOT EXISTS)? identifierList?    #insertIntoTable
+    | INSERT INTO TABLE? multipartIdentifier REPLACE whereClause? identifierList?    #insertIntoReplaceWhere

Review Comment:
I think the WHERE clause should be required?
[GitHub] [spark] cloud-fan commented on a diff in pull request #38404: [SPARK-40956] SQL Equivalent for Dataframe overwrite command
cloud-fan commented on code in PR #38404:
URL: https://github.com/apache/spark/pull/38404#discussion_r1009015127

##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##
@@ -279,14 +282,26 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
           overwrite = false,
           ifPartitionNotExists)
       case table: InsertOverwriteTableContext =>
-        val (relation, cols, partition, ifPartitionNotExists) = visitInsertOverwriteTable(table)
+        val (relation, cols, partition, ifPartitionNotExists, _)
+          = visitInsertOverwriteTable(table)
         InsertIntoStatement(
           relation,
           partition,
           cols,
           query,
           overwrite = true,
           ifPartitionNotExists)
+      case table: InsertIntoReplaceWhereContext =>
+        val (relation, cols, partition, ifPartitionNotExists, replacePredicates)
+          = visitInsertIntoReplaceWhere(table)
+        InsertIntoStatement(

Review Comment:
Can't we return `OverwriteByExpression` directly here? Then we don't need to touch `InsertIntoStatement`. This is the v2 code path only, so we don't need to build an `InsertIntoStatement` and convert it to either the v1 or v2 code path later.
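As a rough illustration of the dispatch proposed above, the sketch below builds the overwrite node directly in the REPLACE WHERE branch and leaves the other two branches on `InsertIntoStatement`. Every type here is a simplified, hypothetical stand-in: the plan nodes borrow the names of Spark's classes but carry far fewer fields, and the header types replace the ANTLR context classes.

```scala
// Simplified, hypothetical plan nodes (not Spark's real classes).
sealed trait Plan
final case class Query(sql: String) extends Plan
final case class InsertIntoStatement(
    table: String, query: Plan, overwrite: Boolean) extends Plan
final case class OverwriteByExpression(
    table: String, deleteExpr: String, query: Plan) extends Plan

// Hypothetical stand-ins for the three parsed INSERT headers.
sealed trait InsertHeader
final case class InsertInto(table: String) extends InsertHeader
final case class InsertOverwrite(table: String) extends InsertHeader
final case class InsertIntoReplaceWhere(
    table: String, where: String) extends InsertHeader

object InsertBuilder {
  def build(header: InsertHeader, query: Plan): Plan = header match {
    // These two may resolve to either v1 or v2 later, so they stay generic.
    case InsertInto(t)      => InsertIntoStatement(t, query, overwrite = false)
    case InsertOverwrite(t) => InsertIntoStatement(t, query, overwrite = true)
    // v2 only: emit the overwrite node directly, leaving
    // InsertIntoStatement untouched.
    case InsertIntoReplaceWhere(t, cond) =>
      OverwriteByExpression(t, cond, query)
  }
}
```

With this shape, the new syntax adds one match arm and no new field to the shared statement node.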