This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new f83053c  [SPARK-34599][SQL] Fix the issue that INSERT INTO OVERWRITE 
doesn't support partition columns containing dot for DSv2
f83053c is described below

commit f83053c2e2facd27f89744f73af6d80d05cc0cf9
Author: Shixiong Zhu <zsxw...@gmail.com>
AuthorDate: Thu Mar 4 15:12:53 2021 +0800

    [SPARK-34599][SQL] Fix the issue that INSERT INTO OVERWRITE doesn't support 
partition columns containing dot for DSv2
    
    ### What changes were proposed in this pull request?
    
    `ResolveInsertInto.staticDeleteExpression` should use 
`UnresolvedAttribute.quoted` to create the delete expression so that we will 
treat the entire `attr.name` as a column name.
    
    ### Why are the changes needed?
    
    When users use `dot` in a partition column name, queries like ```INSERT 
OVERWRITE $t1 PARTITION (`a.b` = 'a') (`c.d`) VALUES('b')``` do not work.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Without this fix, the above query will throw
    ```
    [info]   org.apache.spark.sql.AnalysisException: cannot resolve '`a.b`' 
given input columns: [a.b, c.d];
    [info] 'OverwriteByExpression RelationV2[a.b#17, c.d#18] default.tbl, ('a.b 
<=> cast(a as string)), false
    [info] +- Project [a.b#19, ansi_cast(col1#16 as string) AS c.d#20]
    [info]    +- Project [cast(a as string) AS a.b#19, col1#16]
    [info]       +- LocalRelation [col1#16]
    ```
    
    With the fix, the query will run correctly.
    
    ### How was this patch tested?
    
    The newly added test.
    
    Closes #31713 from zsxwing/SPARK-34599.
    
    Authored-by: Shixiong Zhu <zsxw...@gmail.com>
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
    (cherry picked from commit 53e4dba7c489ac5c0ad61f0121c4e247de5b485c)
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
---
 .../org/apache/spark/sql/catalyst/analysis/Analyzer.scala      |  4 +++-
 .../scala/org/apache/spark/sql/connector/InsertIntoTests.scala | 10 ++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index f71139f..771b817 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -1336,7 +1336,9 @@ class Analyzer(override val catalogManager: 
CatalogManager)
               // ResolveOutputRelation runs, using the query's column names 
that will match the
               // table names at that point. because resolution happens after a 
future rule, create
               // an UnresolvedAttribute.
-              EqualNullSafe(UnresolvedAttribute(attr.name), 
Cast(Literal(value), attr.dataType))
+              EqualNullSafe(
+                UnresolvedAttribute.quoted(attr.name),
+                Cast(Literal(value), attr.dataType))
             case None =>
               throw QueryCompilationErrors.unknownStaticPartitionColError(name)
           }
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/connector/InsertIntoTests.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/connector/InsertIntoTests.scala
index 2cc7a1f..ad73037 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/connector/InsertIntoTests.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/connector/InsertIntoTests.scala
@@ -477,5 +477,15 @@ trait InsertIntoSQLOnlyTests
         verifyTable(t1, spark.table(view))
       }
     }
+
+    test("SPARK-34599: InsertInto: overwrite - dot in the partition column 
name - static mode") {
+      import testImplicits._
+      val t1 = "tbl"
+      withTable(t1) {
+        sql(s"CREATE TABLE $t1 (`a.b` string, `c.d` string) USING $v2Format 
PARTITIONED BY (`a.b`)")
+        sql(s"INSERT OVERWRITE $t1 PARTITION (`a.b` = 'a') (`c.d`) 
VALUES('b')")
+        verifyTable(t1, Seq("a" -> "b").toDF("id", "data"))
+      }
+    }
   }
 }


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

Reply via email to