[GitHub] [spark] cloud-fan commented on a diff in pull request #38404: [SPARK-40956] SQL Equivalent for Dataframe overwrite command

2022-11-14 Thread GitBox


cloud-fan commented on code in PR #38404:
URL: https://github.com/apache/spark/pull/38404#discussion_r1022326063


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##
@@ -261,6 +261,7 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
    * {{{
    *   INSERT OVERWRITE TABLE tableIdentifier [partitionSpec [IF NOT EXISTS]]? [identifierList]
    *   INSERT INTO [TABLE] tableIdentifier [partitionSpec]  [identifierList]
+   *   INSERT INTO [TABLE] tableIdentifier REPLACE whereClause identifierList

Review Comment:
   ```suggestion
  *   INSERT INTO [TABLE] tableIdentifier REPLACE whereClause
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a diff in pull request #38404: [SPARK-40956] SQL Equivalent for Dataframe overwrite command

2022-11-13 Thread GitBox


cloud-fan commented on code in PR #38404:
URL: https://github.com/apache/spark/pull/38404#discussion_r1021061558


##
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4:
##
@@ -319,6 +319,7 @@ query
 insertInto
 : INSERT OVERWRITE TABLE? multipartIdentifier (partitionSpec (IF NOT EXISTS)?)? identifierList?    #insertOverwriteTable
 | INSERT INTO TABLE? multipartIdentifier partitionSpec? (IF NOT EXISTS)? identifierList?    #insertIntoTable
+| INSERT INTO TABLE? multipartIdentifier REPLACE whereClause? identifierList?   #insertIntoReplaceWhere

Review Comment:
   `SELECT * FROM source2` is not an identifier list; it is the input query. The syntax here covers only the header of INSERT and does not include the input query.
   
   A full query would look like `INSERT INTO t REPLACE WHERE ... (a, c) SELECT 1, 2`.
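
   To make the header/query split above concrete, here is a minimal Scala sketch. It is not Spark code: `ReplaceWhereHeader`, `render`, and the predicate `a = 1` are hypothetical names and values chosen for illustration (the original example elides the predicate as `...`).

```scala
object InsertHeaderSketch {
  // Hypothetical stand-in for the parsed INSERT header; this is not Spark's AST.
  final case class ReplaceWhereHeader(
      table: String,
      predicate: String,    // the WHERE condition, kept as raw text here
      columns: Seq[String]) // the optional identifierList, e.g. (a, c)

  // Joins the header and the separate input query into one statement string.
  def render(header: ReplaceWhereHeader, inputQuery: String): String = {
    val cols =
      if (header.columns.isEmpty) "" else header.columns.mkString(" (", ", ", ")")
    s"INSERT INTO ${header.table} REPLACE WHERE ${header.predicate}$cols $inputQuery"
  }
}
```

   The point of the sketch is only that the column list belongs to the header, while `SELECT 1, 2` is passed in separately as the input query.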






[GitHub] [spark] cloud-fan commented on a diff in pull request #38404: [SPARK-40956] SQL Equivalent for Dataframe overwrite command

2022-11-10 Thread GitBox


cloud-fan commented on code in PR #38404:
URL: https://github.com/apache/spark/pull/38404#discussion_r1019781125


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##
@@ -288,6 +289,11 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
   query,
   overwrite = true,
   ifPartitionNotExists)
+  case ctx: InsertIntoReplaceWhereContext =>

Review Comment:
   I think we should fail if `identifierList` is specified. I'll update `OverwriteByExpression` to support a column list later.
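
   A minimal sketch of the suggested check, under the assumption that the builder sees the parsed header as a plain value. `ParsedReplaceWhere` and `validate` are hypothetical names, not Spark APIs, and the sketch returns an `Either` where the real builder would presumably raise a parse error.

```scala
object ColumnListCheckSketch {
  // Hypothetical summary of a parsed REPLACE WHERE header; not a Spark class.
  final case class ParsedReplaceWhere(
      table: String,
      predicate: String,
      columns: Seq[String])

  // Rejects a header that carries a column list, on the assumption that the
  // downstream plan cannot honor one yet.
  def validate(parsed: ParsedReplaceWhere): Either[String, ParsedReplaceWhere] =
    if (parsed.columns.nonEmpty) {
      Left("INSERT INTO ... REPLACE WHERE does not support a column list: " +
        parsed.columns.mkString("(", ", ", ")"))
    } else {
      Right(parsed)
    }
}
```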






[GitHub] [spark] cloud-fan commented on a diff in pull request #38404: [SPARK-40956] SQL Equivalent for Dataframe overwrite command

2022-11-10 Thread GitBox


cloud-fan commented on code in PR #38404:
URL: https://github.com/apache/spark/pull/38404#discussion_r1019780097


##
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4:
##
@@ -319,6 +319,7 @@ query
 insertInto
 : INSERT OVERWRITE TABLE? multipartIdentifier (partitionSpec (IF NOT EXISTS)?)? identifierList?    #insertOverwriteTable
 | INSERT INTO TABLE? multipartIdentifier partitionSpec? (IF NOT EXISTS)? identifierList?    #insertIntoTable
+| INSERT INTO TABLE? multipartIdentifier REPLACE whereClause? identifierList?   #insertIntoReplaceWhere

Review Comment:
   I think the WHERE clause should be required?






[GitHub] [spark] cloud-fan commented on a diff in pull request #38404: [SPARK-40956] SQL Equivalent for Dataframe overwrite command

2022-10-30 Thread GitBox


cloud-fan commented on code in PR #38404:
URL: https://github.com/apache/spark/pull/38404#discussion_r1009015127


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##
@@ -279,14 +282,26 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
   overwrite = false,
   ifPartitionNotExists)
   case table: InsertOverwriteTableContext =>
-val (relation, cols, partition, ifPartitionNotExists) = visitInsertOverwriteTable(table)
+val (relation, cols, partition, ifPartitionNotExists, _)
+  = visitInsertOverwriteTable(table)
 InsertIntoStatement(
   relation,
   partition,
   cols,
   query,
   overwrite = true,
   ifPartitionNotExists)
+  case table: InsertIntoReplaceWhereContext =>
+val (relation, cols, partition, ifPartitionNotExists, replacePredicates)
+  = visitInsertIntoReplaceWhere(table)
+InsertIntoStatement(

Review Comment:
   Can't we return `OverwriteByExpression` directly here? Then we don't need to touch `InsertIntoStatement`. This is a v2-only code path, so we don't need to go through `InsertIntoStatement` and convert it to either the v1 or the v2 code path later.
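
   A rough sketch of the dispatch being suggested. The case classes below only borrow the names `InsertIntoStatement` and `OverwriteByExpression`; they are simplified, hypothetical stand-ins with made-up fields, not Spark's actual plan nodes, and `build` is an illustrative helper rather than the real `AstBuilder` method.

```scala
object AstBuilderSketch {
  // Simplified stand-ins for logical plans; fields are assumptions for illustration.
  sealed trait Plan
  final case class InsertIntoStatement(table: String, query: String, overwrite: Boolean)
      extends Plan
  final case class OverwriteByExpression(table: String, deleteExpr: String, query: String)
      extends Plan

  // Hypothetical model of the three INSERT header alternatives in the grammar.
  sealed trait InsertHeader
  final case class InsertInto(table: String) extends InsertHeader
  final case class InsertOverwrite(table: String) extends InsertHeader
  final case class InsertIntoReplaceWhere(table: String, predicate: String)
      extends InsertHeader

  // REPLACE WHERE maps straight to the v2-style node, so the generic
  // statement node needs no extra field for the predicate.
  def build(header: InsertHeader, query: String): Plan = header match {
    case InsertInto(t) => InsertIntoStatement(t, query, overwrite = false)
    case InsertOverwrite(t) => InsertIntoStatement(t, query, overwrite = true)
    case InsertIntoReplaceWhere(t, p) => OverwriteByExpression(t, p, query)
  }
}
```

   In this shape, only the new `case` knows about the predicate, which is the reason given above for leaving `InsertIntoStatement` untouched.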


