[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-06-30 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r448066069



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala
##
@@ -115,6 +116,26 @@ case class CreateTableLikeCommand(
   CatalogTableType.EXTERNAL
 }
 
+// We only copy source tbl properties if the format is the same with each 
other
+val needCopyProperties =
+  (provider.isEmpty || provider == sourceTableDesc.provider) &&

Review comment:
   We already have tests for all the case described above?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-06-30 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r448065805



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala
##
@@ -55,7 +55,8 @@ import org.apache.spark.sql.util.SchemaUtils
  * are identical to the ones defined in the source table.
  *
  * The CatalogTable attributes copied from the source table are 
storage(inputFormat, outputFormat,
- * serde, compressed, properties), schema, provider, partitionColumnNames, 
bucketSpec by default.
+ * serde, compressed, properties), schema, provider, partitionColumnNames, 
bucketSpec, tblproperties
+ * by default.
  *

Review comment:
   Plz update the comment, too? `if the formats is the same..`
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-06-30 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r448065211



##
File path: docs/sql-ref-syntax-ddl-create-table-like.md
##
@@ -57,6 +57,8 @@ CREATE TABLE [IF NOT EXISTS] table_identifier LIKE 
source_table_identifier
 * **TBLPROPERTIES**
 
 Table properties that have to be set are specified, such as 
`created.by.user`, `owner`, etc.
+Note that a basic set of table properties defined in a source table is 
copied into a new table if the table format including data source provider and 
storage input format are the same.

Review comment:
   nit: `the table format including data source provider and storage input 
format` -> `the table formats including data source providers and storage input 
formats`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-31 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r432941526



##
File path: docs/sql-ref-syntax-ddl-create-table-like.md
##
@@ -57,6 +57,7 @@ CREATE TABLE [IF NOT EXISTS] table_identifier LIKE 
source_table_identifier
 * **TBLPROPERTIES**
 
 Table properties that have to be set are specified, such as 
`created.by.user`, `owner`, etc.
+And we copied from the source table are storage(inputFormat, outputFormat, 
serde, compressed, properties), schema, provider, partitionColumnNames, 
bucketSpec, tblproperties by default.

Review comment:
   ```
   Note that a basic set of table properties defined in a source table is 
copied into a new table. 
   If a specified property key has already existed in a source table, the old 
property is overwritten with the new one.
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-31 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r432941526



##
File path: docs/sql-ref-syntax-ddl-create-table-like.md
##
@@ -57,6 +57,7 @@ CREATE TABLE [IF NOT EXISTS] table_identifier LIKE 
source_table_identifier
 * **TBLPROPERTIES**
 
 Table properties that have to be set are specified, such as 
`created.by.user`, `owner`, etc.
+And we copied from the source table are storage(inputFormat, outputFormat, 
serde, compressed, properties), schema, provider, partitionColumnNames, 
bucketSpec, tblproperties by default.

Review comment:
   ```
   Note that a basic set of table properties defined in a source table is 
copied into a new table. If a specified property key has already existed in a 
source table, the old property is overwritten with the new one.
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-29 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r432794003



##
File path: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala
##
@@ -2720,4 +2704,26 @@ class HiveDDLSuite
   checkAnswer(sql("SHOW PARTITIONS ta_part"), Row("ts=10") :: Nil)
 }
   }
+
+  test("SPARK-31828: Retain table properties at CreateTableLikeCommand") {
+val catalog = spark.sessionState.catalog
+withTable("t1", "t2", "t3") {
+  sql(s"CREATE TABLE t1(c1 int) TBLPROPERTIES('k1'='v1', 'k2'='v2', 
'totalNumberFiles'='meta')")
+  val t1 = catalog.getTableMetadata(TableIdentifier("t1"))
+  assert(t1.properties("k1") == "v1")
+  assert(t1.properties("k2") == "v2")
+  assert(t1.properties(s"totalNumberFiles") == "meta")
+  sql("CREATE TABLE t2 LIKE t1 TBLPROPERTIES('k2'='v3', 'k4'='v4')")
+  val t2 = catalog.getTableMetadata(TableIdentifier("t2"))
+  assert(t2.properties("k1") == "v1")
+  assert(t2.properties("k2") == "v3")
+  assert(t2.properties("k4") == "v4")
+  assert(t2.properties.get(s"totalNumberFiles").isEmpty)
+  sql("CREATE TABLE t3 LIKE t1")
+  val t3 = catalog.getTableMetadata(TableIdentifier("t3"))
+  assert(t3.properties("k1") == "v1")
+  assert(t3.properties("k2") == "v2")
+  assert(t3.properties.get(s"totalNumberFiles").isEmpty)

Review comment:
   nit: remove `s` in the head. It seems there are the same unnecessary `s` 
in the other places, too.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-29 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r432794744



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala
##
@@ -115,6 +116,15 @@ case class CreateTableLikeCommand(
   CatalogTableType.EXTERNAL
 }
 
+val newProperties = sourceTableDesc.tableType match {
+  case VIEW =>
+// For view, we just use new properties
+properties

Review comment:
   nit format;
   ```
   
   val newProperties = sourceTableDesc.tableType match {
 case MANAGED | EXTERNAL =>
   // Hive only retain the useful properties through serde class 
annotation.
   // For better compatible with Hive, we remove the metastore 
properties.
   sourceTableDesc.properties -- 
DDLUtils.METASTORE_GENERATED_PROPERTIES ++ properties
   
 case VIEW =>
   // For view, we just use new properties
   properties
   }
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-29 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r432794399



##
File path: docs/sql-ref-syntax-ddl-create-table-like.md
##
@@ -22,6 +22,9 @@ license: |
 ### Description
 
 The `CREATE TABLE` statement defines a new table using the definition/metadata 
of an existing table or view.
+And we copied from the source table are storage(inputFormat, outputFormat, 
serde, compressed, properties), schema, 
+provider, partitionColumnNames, bucketSpec, tblproperties
+by default.

Review comment:
   Plz write this note in the `TABLEPROPERTIES` section instead of doing it 
here: 
https://github.com/apache/spark/blame/master/docs/sql-ref-syntax-ddl-create-table-like.md#L59





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-29 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r432794003



##
File path: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala
##
@@ -2720,4 +2704,26 @@ class HiveDDLSuite
   checkAnswer(sql("SHOW PARTITIONS ta_part"), Row("ts=10") :: Nil)
 }
   }
+
+  test("SPARK-31828: Retain table properties at CreateTableLikeCommand") {
+val catalog = spark.sessionState.catalog
+withTable("t1", "t2", "t3") {
+  sql(s"CREATE TABLE t1(c1 int) TBLPROPERTIES('k1'='v1', 'k2'='v2', 
'totalNumberFiles'='meta')")
+  val t1 = catalog.getTableMetadata(TableIdentifier("t1"))
+  assert(t1.properties("k1") == "v1")
+  assert(t1.properties("k2") == "v2")
+  assert(t1.properties(s"totalNumberFiles") == "meta")
+  sql("CREATE TABLE t2 LIKE t1 TBLPROPERTIES('k2'='v3', 'k4'='v4')")
+  val t2 = catalog.getTableMetadata(TableIdentifier("t2"))
+  assert(t2.properties("k1") == "v1")
+  assert(t2.properties("k2") == "v3")
+  assert(t2.properties("k4") == "v4")
+  assert(t2.properties.get(s"totalNumberFiles").isEmpty)
+  sql("CREATE TABLE t3 LIKE t1")
+  val t3 = catalog.getTableMetadata(TableIdentifier("t3"))
+  assert(t3.properties("k1") == "v1")
+  assert(t3.properties("k2") == "v2")
+  assert(t3.properties.get(s"totalNumberFiles").isEmpty)

Review comment:
   nit: remove `s` in the head.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-29 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r432793971



##
File path: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala
##
@@ -2720,4 +2704,26 @@ class HiveDDLSuite
   checkAnswer(sql("SHOW PARTITIONS ta_part"), Row("ts=10") :: Nil)
 }
   }
+
+  test("SPARK-31828: Retain table properties at CreateTableLikeCommand") {
+val catalog = spark.sessionState.catalog
+withTable("t1", "t2", "t3") {
+  sql(s"CREATE TABLE t1(c1 int) TBLPROPERTIES('k1'='v1', 'k2'='v2', 
'totalNumberFiles'='meta')")

Review comment:
   Just in case, could you test all the prpos in 
`METASTORE_GENERATED_PROPERTIES`? Probably, its beetter to split it into two 
test units like add a new test unit `test("SPARK-31828: Filters out Hive 
metastore properties in CreateTableLikeCommand") {`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-28 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r432203235



##
File path: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala
##
@@ -2720,4 +2708,23 @@ class HiveDDLSuite
   checkAnswer(sql("SHOW PARTITIONS ta_part"), Row("ts=10") :: Nil)
 }
   }
+
+  test("SPARK-31828: Retain table properties at CreateTableLikeCommand") {
+val catalog = spark.sessionState.catalog
+withTable("t1", "t2", "t3") {
+  sql("CREATE TABLE t1(c1 int) TBLPROPERTIES('k1'='v1', 'k2'='v2')")
+  val t1 = catalog.getTableMetadata(TableIdentifier("t1"))
+  assert(t1.properties("k1") == "v1")
+  assert(t1.properties("k2") == "v2")
+  sql("CREATE TABLE t2 LIKE t1 TBLPROPERTIES('k2'='v3', 'k4'='v4')")
+  val t2 = catalog.getTableMetadata(TableIdentifier("t2"))
+  assert(t2.properties("k1") == "v1")
+  assert(t2.properties("k2") == "v3")
+  assert(t2.properties("k4") == "v4")
+  sql("CREATE TABLE t3 LIKE t1")
+  val t3 = catalog.getTableMetadata(TableIdentifier("t3"))
+  assert(t3.properties("k1") == "v1")
+  assert(t3.properties("k2") == "v2")

Review comment:
   Could you check if the metastore props are filtered out here?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-28 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r432202932



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala
##
@@ -839,6 +839,19 @@ case class AlterTableSetLocationCommand(
 object DDLUtils {
   val HIVE_PROVIDER = "hive"
 
+  val METASTORE_GENERATED_PROPERTIES: Set[String] = Set(

Review comment:
   Just to check; does the latest hive metastore have the same set of these 
props here?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-28 Thread GitBox


maropu commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r432201354



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala
##
@@ -115,6 +116,8 @@ case class CreateTableLikeCommand(
   CatalogTableType.EXTERNAL
 }
 
+val newProperties =

Review comment:
   plz leave some comments about why the metastore props are remvoed here? 
https://github.com/apache/spark/pull/28647#issuecomment-634373422





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org