cloud-fan commented on a change in pull request #31421:
URL: https://github.com/apache/spark/pull/31421#discussion_r568379292



##########
File path: docs/sql-migration-guide.md
##########
@@ -41,6 +41,8 @@ license: |
 
   - In Spark 3.2, the auto-generated `Cast` (such as those added by type 
coercion rules) will be stripped when generating column alias names. E.g., 
`sql("SELECT floor(1)").columns` will be `FLOOR(1)` instead of `FLOOR(CAST(1 AS 
DOUBLE))`.
 
+  - In Spark 3.2, a null partition value is parsed as it is. In Spark 3.1 or 
earlier, it is parsed as a string literal of its text representation, e.g., 
string "null". To restore the legacy behavior, you can set 
`spark.sql.legacy.parseNullPartitionSpecAsStringLiteral` as true.

Review comment:
       Let's be more specific:
   ```
   In Spark 3.2, `PARTITION(col=null)` is always parsed as a null literal in 
the partition spec. In Spark 3.1 or earlier,
   it is parsed as a string literal ..., if the partition column is string 
type. To restore the legacy behavior, ...
   ```
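   To make the behavior change concrete, here is a small illustrative sketch (plain Python, not Spark's actual ANTLR-based parser; the function name and flag are hypothetical) of how a partition-spec value resolves under the legacy conf versus the new default:

   ```python
   # Hypothetical model of partition-spec value resolution.
   # Spark 3.2 default: the NULL keyword becomes a real null literal.
   # Legacy conf (Spark 3.1 behavior for string partition columns):
   # it is kept as the string literal "null".
   def parse_partition_value(raw, legacy_null_as_string=False):
       if raw.lower() == "null":
           return "null" if legacy_null_as_string else None
       # Ordinary values: strip surrounding quotes if present.
       return raw.strip("'\"")

   # Spark 3.2: PARTITION(col=null) -> a real null
   assert parse_partition_value("null") is None
   # Legacy conf on: PARTITION(col=null) -> the string "null"
   assert parse_partition_value("null", legacy_null_as_string=True) == "null"
   ```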

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##########
@@ -500,9 +502,11 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with 
SQLConfHelper with Logg
    * main purpose is to prevent slight differences due to back to back 
conversions i.e.:
    * String -> Literal -> String.
    */
-  protected def visitStringConstant(ctx: ConstantContext): String = 
withOrigin(ctx) {
+  protected def visitStringConstant(
+    ctx: ConstantContext,

Review comment:
       nit: use 4-space indentation

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##########
@@ -472,9 +472,11 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with 
SQLConfHelper with Logg
    */
   override def visitPartitionSpec(
       ctx: PartitionSpecContext): Map[String, Option[String]] = 
withOrigin(ctx) {
+    val processNullLiteral =

Review comment:
       this name is confusing; how about `legacyNullAsString`?

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -2472,6 +2472,16 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val LEGACY_PARSE_NULL_PARTITION_SPEC_AS_STRING_LITERAL =
+    buildConf("spark.sql.legacy.parseNullPartitionSpecAsStringLiteral")
+      .internal()
+      .doc("If it is set to true, a null partition value is parsed as a string 
literal of its " +

Review comment:
       ditto, make it more specific.

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -2472,6 +2472,16 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val LEGACY_PARSE_NULL_PARTITION_SPEC_AS_STRING_LITERAL =
+    buildConf("spark.sql.legacy.parseNullPartitionSpecAsStringLiteral")
+      .internal()
+      .doc("If it is set to true, a null partition value is parsed as a string 
literal of its " +
+        "text representation, e.g., string 'null'. Otherwise, null partition 
values are parsed " +
+        "as they are.")
+      .version("3.2.0")

Review comment:
       the version should be 3.1.1, as we are going to backport this.

##########
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/ShowPartitionsSuiteBase.scala
##########
@@ -154,4 +154,15 @@ trait ShowPartitionsSuiteBase extends QueryTest with 
DDLCommandTestUtils {
       assert(partitions.first().getString(0) === "part=a")
     }
   }
+
+  test("SPARK-33591: null as string partition literal value 'null' after 
setting legacy conf") {
+    withSQLConf(SQLConf.LEGACY_PARSE_NULL_PARTITION_SPEC_AS_STRING_LITERAL.key 
-> "true") {
+      withNamespaceAndTable("ns", "tbl") { t =>
+        sql(s"CREATE TABLE $t (col1 INT, p1 STRING) $defaultUsing PARTITIONED 
BY (p1)")
+        sql(s"INSERT INTO TABLE $t PARTITION (p1 = null) SELECT 0")
+        runShowPartitionsSql(s"SHOW PARTITIONS $t", Row("p1=null") :: Nil)
+        runShowPartitionsSql(s"SHOW PARTITIONS $t PARTITION (p1 = null)", 
Row("p1=null") :: Nil)

Review comment:
       how does this show the null string behavior? shall we just test a simple 
table scan operation?
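       To illustrate why a scan distinguishes the two behaviors where `SHOW PARTITIONS` output may not (hypothetical in-memory model, not the actual Spark test): under the legacy conf the row lands in the string partition `'null'`, so only a string-equality predicate finds it, while a real null needs `IS NULL`.

   ```python
   # Rows as written by INSERT ... PARTITION (p1 = null):
   rows_legacy = [{"col1": 0, "p1": "null"}]  # legacy conf on: string "null"
   rows_new    = [{"col1": 0, "p1": None}]    # Spark 3.2 default: real null

   def scan(rows, predicate):
       # Minimal stand-in for a filtered table scan.
       return [r for r in rows if predicate(r)]

   # Legacy behavior: matched by p1 = 'null', not by p1 IS NULL.
   assert scan(rows_legacy, lambda r: r["p1"] == "null") == rows_legacy
   assert scan(rows_legacy, lambda r: r["p1"] is None) == []
   # New behavior: matched by p1 IS NULL.
   assert scan(rows_new, lambda r: r["p1"] is None) == rows_new
   ```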

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -2472,6 +2472,16 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val LEGACY_PARSE_NULL_PARTITION_SPEC_AS_STRING_LITERAL =
+    buildConf("spark.sql.legacy.parseNullPartitionSpecAsStringLiteral")
+      .internal()
+      .doc("If it is set to true, a null partition value is parsed as a string 
literal of its " +
+        "text representation, e.g., string 'null'. Otherwise, null partition 
values are parsed " +
+        "as they are.")
+      .version("3.2.0")

Review comment:
       Ah I see. Then where should we place it in the migration guide? Repeat it three times, in the 3.0.2, 3.1.1, and 3.2.0 sections?



