[GitHub] [spark] imback82 commented on a change in pull request #35113: [SPARK-37636][SQL] Migrate CREATE NAMESPACE to use V2 command by default

2022-01-05 Thread GitBox


imback82 commented on a change in pull request #35113:
URL: https://github.com/apache/spark/pull/35113#discussion_r779313382



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/CreateNamespaceSuite.scala
##
@@ -36,7 +36,7 @@ import org.apache.spark.sql.execution.command
 trait CreateNamespaceSuiteBase extends command.CreateNamespaceSuiteBase
 with command.TestsV1AndV2Commands {
   override def namespace: String = "db"
-  override def notFoundMsgPrefix: String = "Database"
+  override def notFoundMsgPrefix: String = if (runningV1Command) "Database" else "Namespace"

Review comment:
   We need to update the test now that V1 catalogs are run against both V1 and V2 commands (after adding `if conf.useV1Command`).

##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
##
@@ -214,16 +214,11 @@ class ResolveSessionCatalog(val catalogManager: CatalogManager)
     case DropView(r: ResolvedView, ifExists) =>
       DropTableCommand(r.identifier.asTableIdentifier, ifExists, isView = true, purge = false)

-    case c @ CreateNamespace(ResolvedDBObjectName(catalog, name), _, _)
-        if isSessionCatalog(catalog) =>
-      if (name.length != 1) {
-        throw QueryCompilationErrors.invalidDatabaseNameError(name.quoted)
-      }

Review comment:
   This check is handled in `DatabaseNameInSessionCatalog`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] imback82 commented on a change in pull request #35113: [SPARK-37636][SQL] Migrate CREATE NAMESPACE to use V2 command by default

2022-01-06 Thread GitBox


imback82 commented on a change in pull request #35113:
URL: https://github.com/apache/spark/pull/35113#discussion_r780069534



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/TestsV1AndV2Commands.scala
##
@@ -28,14 +28,21 @@ import org.apache.spark.sql.internal.SQLConf
 trait TestsV1AndV2Commands extends DDLCommandTestUtils {
   private var _version: String = ""
   override def commandVersion: String = _version
+  def runningV1Command: Boolean = commandVersion == "V1"
 
   // Tests using V1 catalogs will run with `spark.sql.legacy.useV1Command` on and off
   // to test both V1 and V2 commands.
   override def test(testName: String, testTags: Tag*)(testFun: => Any)
       (implicit pos: Position): Unit = {
     Seq(true, false).foreach { useV1Command =>
-      _version = if (useV1Command) "V1" else "V2"
+      def setCommandVersion(): Unit = {
+        _version = if (useV1Command) "V1" else "V2"
+      }
+      setCommandVersion()
       super.test(testName, testTags: _*) {
+        // Need to set command version inside this test function so that
+        // the correct command version is available in each test.
+        setCommandVersion()

Review comment:
   This `def test` doesn't run `testFun` when it's invoked; it only registers the test. So by the time `testFun` actually runs, `_version` will always be set to "V2" (`useV1Command == false`). We therefore need to set the version inside the lambda that's passed into `super.test`.
   
   Also note that we need to call `setCommandVersion()` before calling `super.test`, because `super.test` uses `commandVersion` to set the right test name:
   
https://github.com/apache/spark/blob/527e842ee6ed3bb1aabba61d02337ab81a56f87b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLCommandTestUtils.scala#L56-L57
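   The registration-time vs. run-time pitfall described above can be sketched in plain Scala with a toy framework (all names here are hypothetical, not the actual ScalaTest machinery): each registered body must re-set the shared mutable version itself, because by the time any body runs, the loop has already finished.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy "test framework": registration only stores the body; bodies run later.
object ToyFramework {
  private val registered = ArrayBuffer[(String, () => String)]()
  def register(name: String)(body: => String): Unit =
    registered += ((name, () => body))
  def runAll(): Seq[(String, String)] =
    registered.toSeq.map { case (n, b) => (n, b()) }
}

var _version: String = ""

Seq(true, false).foreach { useV1Command =>
  def setCommandVersion(): Unit = _version = if (useV1Command) "V1" else "V2"
  // Must be set before registering, so the registered name is correct.
  setCommandVersion()
  ToyFramework.register(s"test ${_version}") {
    // Without this call, every body would observe the last loop value ("V2"),
    // because all bodies run only after the loop has finished.
    setCommandVersion()
    _version
  }
}

val results = ToyFramework.runAll()
// results: Seq(("test V1", "V1"), ("test V2", "V2"))
```

   Dropping the inner `setCommandVersion()` call makes both bodies report "V2", which is exactly the bug the review comment points out.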







[GitHub] [spark] imback82 commented on a change in pull request #35113: [SPARK-37636][SQL] Migrate CREATE NAMESPACE to use V2 command by default

2022-01-06 Thread GitBox


imback82 commented on a change in pull request #35113:
URL: https://github.com/apache/spark/pull/35113#discussion_r780070663



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
##
@@ -189,8 +189,15 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat

   override def createDatabase(
       dbDefinition: CatalogDatabase,
-      ignoreIfExists: Boolean): Unit = withClient {
-    client.createDatabase(dbDefinition, ignoreIfExists)
+      ignoreIfExists: Boolean): Unit = {
+    try {
+      withClient {
+        client.createDatabase(dbDefinition, ignoreIfExists)
+      }
+    } catch {
+      case e: AnalysisException if e.message.contains("already exists") =>
+        throw new DatabaseAlreadyExistsException(dbDefinition.name)

Review comment:
   @cloud-fan I intercepted the exception here. Does this look a bit hacky?







[GitHub] [spark] imback82 commented on a change in pull request #35113: [SPARK-37636][SQL] Migrate CREATE NAMESPACE to use V2 command by default

2022-01-07 Thread GitBox


imback82 commented on a change in pull request #35113:
URL: https://github.com/apache/spark/pull/35113#discussion_r780418849



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
##
@@ -189,8 +189,15 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat

   override def createDatabase(
       dbDefinition: CatalogDatabase,
-      ignoreIfExists: Boolean): Unit = withClient {
-    client.createDatabase(dbDefinition, ignoreIfExists)
+      ignoreIfExists: Boolean): Unit = {
+    try {
+      withClient {
+        client.createDatabase(dbDefinition, ignoreIfExists)
+      }
+    } catch {
+      case e: AnalysisException if e.message.contains("already exists") =>
+        throw new DatabaseAlreadyExistsException(dbDefinition.name)

Review comment:
   Two concerns if we move the logic to `withClient`:
   1. How can we guarantee 
`org.apache.hadoop.hive.metastore.api.AlreadyExistsException` is thrown by 
`createDatabase` but not for other calls like `createTable`?
   2. Since db name is not available, we need to parse out the db name from the 
message.
   
   Are you OK with these?










[GitHub] [spark] imback82 commented on a change in pull request #35113: [SPARK-37636][SQL] Migrate CREATE NAMESPACE to use V2 command by default

2022-01-07 Thread GitBox


imback82 commented on a change in pull request #35113:
URL: https://github.com/apache/spark/pull/35113#discussion_r780475265



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
##
@@ -189,8 +189,15 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat

   override def createDatabase(
       dbDefinition: CatalogDatabase,
-      ignoreIfExists: Boolean): Unit = withClient {
-    client.createDatabase(dbDefinition, ignoreIfExists)
+      ignoreIfExists: Boolean): Unit = {
+    try {
+      withClient {
+        client.createDatabase(dbDefinition, ignoreIfExists)
+      }
+    } catch {
+      case e: AnalysisException if e.message.contains("already exists") =>
+        throw new DatabaseAlreadyExistsException(dbDefinition.name)

Review comment:
   Good idea! Let me try this approach.








[GitHub] [spark] imback82 commented on a change in pull request #35113: [SPARK-37636][SQL] Migrate CREATE NAMESPACE to use V2 command by default

2022-01-09 Thread GitBox


imback82 commented on a change in pull request #35113:
URL: https://github.com/apache/spark/pull/35113#discussion_r780936585



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
##
@@ -189,8 +189,15 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat

   override def createDatabase(
       dbDefinition: CatalogDatabase,
-      ignoreIfExists: Boolean): Unit = withClient {
-    client.createDatabase(dbDefinition, ignoreIfExists)
+      ignoreIfExists: Boolean): Unit = {
+    try {
+      withClient {
+        client.createDatabase(dbDefinition, ignoreIfExists)
+      }
+    } catch {
+      case e: AnalysisException if e.message.contains("already exists") =>
+        throw new DatabaseAlreadyExistsException(dbDefinition.name)

Review comment:
   Updated as suggested







[GitHub] [spark] imback82 commented on a change in pull request #35113: [SPARK-37636][SQL] Migrate CREATE NAMESPACE to use V2 command by default

2022-01-10 Thread GitBox


imback82 commented on a change in pull request #35113:
URL: https://github.com/apache/spark/pull/35113#discussion_r781643228



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
##
@@ -189,8 +204,17 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat

   override def createDatabase(
       dbDefinition: CatalogDatabase,
-      ignoreIfExists: Boolean): Unit = withClient {
+      ignoreIfExists: Boolean): Unit = withClientWrappingException {
     client.createDatabase(dbDefinition, ignoreIfExists)
+  } { exception =>
+    if (exception.getClass.getName.equals(
+        "org.apache.hadoop.hive.metastore.api.AlreadyExistsException")
+        && exception.getMessage.contains(
+          s"Database ${dbDefinition.name} already exists")) {
+      Some(new DatabaseAlreadyExistsException(dbDefinition.name))

Review comment:
   Updated the test. Thanks!
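   The general shape of such a wrapping helper can be sketched in standalone Scala (the names `withWrappingException` and `DatabaseAlreadyExists` below are illustrative assumptions, not the actual Spark API): run a body, and if it throws, let a caller-supplied translator optionally replace the exception with a more specific one before rethrowing.

```scala
// Hypothetical sketch of an exception-wrapping helper: run `body`, and if it
// throws, give `wrapper` a chance to translate the raw exception into a more
// specific one; rethrow the original if the translator declines.
def withWrappingException[T](body: => T)(wrapper: Throwable => Option[Exception]): T = {
  try {
    body
  } catch {
    case e: Throwable =>
      wrapper(e) match {
        case Some(translated) => throw translated
        case None             => throw e
      }
  }
}

// Illustrative typed exception standing in for DatabaseAlreadyExistsException.
class DatabaseAlreadyExists(db: String) extends Exception(s"Database $db already exists")

// Usage: a raw "already exists" error is translated into the typed exception.
val caught =
  try {
    withWrappingException {
      throw new RuntimeException("Database db1 already exists")
    } { e =>
      if (e.getMessage.contains("already exists")) Some(new DatabaseAlreadyExists("db1"))
      else None
    }
    "no exception"
  } catch {
    case _: DatabaseAlreadyExists => "translated"
    case _: Throwable             => "raw"
  }
```

   Because the translator is supplied per call site, the caller (here, `createDatabase`) decides what counts as a translatable error, which addresses the earlier concern about other client calls like `createTable` throwing the same Hive exception.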







[GitHub] [spark] imback82 commented on a change in pull request #35113: [SPARK-37636][SQL] Migrate CREATE NAMESPACE to use V2 command by default

2022-01-10 Thread GitBox


imback82 commented on a change in pull request #35113:
URL: https://github.com/apache/spark/pull/35113#discussion_r781643367



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/CreateNamespaceSuite.scala
##
@@ -36,7 +36,7 @@ import org.apache.spark.sql.execution.command
 trait CreateNamespaceSuiteBase extends command.CreateNamespaceSuiteBase
 with command.TestsV1AndV2Commands {
   override def namespace: String = "db"
-  override def notFoundMsgPrefix: String = "Database"
+  override def notFoundMsgPrefix: String = if (runningV1Command) "Database" else "Namespace"

Review comment:
   Done



