[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user clockfly commented on a diff in the pull request: https://github.com/apache/spark/pull/12872#discussion_r61987856 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala --- @@ -71,27 +81,69 @@ case class CreateViewCommand( require(tableDesc.schema == Nil || tableDesc.schema.length == analyzedPlan.output.length) val sessionState = sparkSession.sessionState -if (sessionState.catalog.tableExists(tableIdentifier)) { +if (isTemporary) { + createTemporaryView(tableDesc.identifier, sparkSession, analyzedPlan) +} else { + // Adds default database for permanent table if it doesn't exist, so that tableExists() + // only check permanent tables. + val database = tableDesc.identifier.database.getOrElse( +sessionState.catalog.getCurrentDatabase) + val tableIdentifier = tableDesc.identifier.copy(database = Option(database)) + + if (sessionState.catalog.tableExists(tableIdentifier)) { +if (allowExisting) { --- End diff -- Fixed by disallowing "IF NOT EXISTS" syntax to be consistent with "CREATE TEMPORARY TABLE" behavior. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12872#issuecomment-216734829 **[Test build #57710 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57710/consoleFull)** for PR 12872 at commit [`1e20bb0`](https://github.com/apache/spark/commit/1e20bb0df84ffcfe173c234375f8eb0eb7662242). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/12872#discussion_r61987175 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala --- @@ -71,29 +87,59 @@ case class CreateViewCommand( require(tableDesc.schema == Nil || tableDesc.schema.length == analyzedPlan.output.length) val sessionState = sparkSession.sessionState -if (sessionState.catalog.tableExists(tableIdentifier)) { - if (allowExisting) { -// Handles `CREATE VIEW IF NOT EXISTS v0 AS SELECT ...`. Does nothing when the target view -// already exists. - } else if (replace) { -// Handles `CREATE OR REPLACE VIEW v0 AS SELECT ...` -sessionState.catalog.alterTable(prepareTable(sparkSession, analyzedPlan)) +if (isTemporary) { + createTemporaryView(tableDesc.identifier, sparkSession, analyzedPlan) +} else { + // Adds default database for permanent table if it doesn't exist, so that tableExists() + // only check permanent tables. + val database = tableDesc.identifier.database.getOrElse( +sessionState.catalog.getCurrentDatabase) + val tableIdentifier = tableDesc.identifier.copy(database = Option(database)) + + if (sessionState.catalog.tableExists(tableIdentifier)) { +if (allowExisting) { + // Handles `CREATE VIEW IF NOT EXISTS v0 AS SELECT ...`. Does nothing when the target view + // already exists. +} else if (replace) { + // Handles `CREATE OR REPLACE VIEW v0 AS SELECT ...` + sessionState.catalog.alterTable(prepareTable(sparkSession, analyzedPlan)) +} else { + // Handles `CREATE VIEW v0 AS SELECT ...`. Throws exception when the target view already + // exists. + throw new AnalysisException(s"View $tableIdentifier already exists. " + --- End diff -- Nit: Wrapping this line using the following style: ```scala throw new AnalysisException( s"View ...") ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/12872#discussion_r61987105 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala --- @@ -37,13 +37,18 @@ import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project} *already exists, throws analysis exception. * @param replace if true, and if the view already exists, updates it; if false, and if the view *already exists, throws analysis exception. + * @param isTemporary if true, the view is created as a temporary view. Temporary views are dropped + * at the end of current Spark session. Existing permanent relations with the same --- End diff -- Nit: Indentation is off --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/12872#issuecomment-216733049 One thing that we've decided offline is that we should deprecate Dataset.registerTempTable() by Dataset.createTempView() in a follow-up PR. It's becoming confusing that we call the same thing "table" in Dataset API but "view" in SQL DDL. I think the essential difference between a view and a table is that a view is basically a lineage, while a table is always materialized to disk. For example, data of temporary tables in Hive are written to scratch folder of the current session. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12872#issuecomment-216731693 **[Test build #57707 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57707/consoleFull)** for PR 12872 at commit [`a497cf3`](https://github.com/apache/spark/commit/a497cf3894d7ef6d39fc370665a7767e4cc2c76c). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user clockfly commented on a diff in the pull request: https://github.com/apache/spark/pull/12872#discussion_r61986232 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala --- @@ -55,13 +60,18 @@ case class CreateViewCommand( require(tableDesc.tableType == CatalogTableType.VIEW) require(tableDesc.viewText.isDefined) - private val tableIdentifier = tableDesc.identifier - if (allowExisting && replace) { throw new AnalysisException( "It is not allowed to define a view with both IF NOT EXISTS and OR REPLACE.") } + // Temporary view names should NOT contain database prefix like "database.table" + if (isTemporary && tableDesc.identifier.database.isDefined) { +val database = tableDesc.identifier.database.get +throw new AnalysisException( --- End diff -- This is to be consistent with DataSet API. When registering a temp table, it will remove the database prefix, here is the code: https://github.com/apache/spark/blob/6ba17cd147277a20a7fbb244c040e694de486c36/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L533 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user clockfly commented on a diff in the pull request: https://github.com/apache/spark/pull/12872#discussion_r61984365 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala --- @@ -71,27 +81,69 @@ case class CreateViewCommand( require(tableDesc.schema == Nil || tableDesc.schema.length == analyzedPlan.output.length) val sessionState = sparkSession.sessionState -if (sessionState.catalog.tableExists(tableIdentifier)) { +if (isTemporary) { + createTemporaryView(tableDesc.identifier, sparkSession, analyzedPlan) +} else { + // Adds default database for permanent table if it doesn't exist, so that tableExists() + // only check permanent tables. + val database = tableDesc.identifier.database.getOrElse( +sessionState.catalog.getCurrentDatabase) + val tableIdentifier = tableDesc.identifier.copy(database = Option(database)) + + if (sessionState.catalog.tableExists(tableIdentifier)) { +if (allowExisting) { + // Handles `CREATE VIEW IF NOT EXISTS v0 AS SELECT ...`. Does nothing when the target view + // already exists. +} else if (replace) { + // Handles `CREATE OR REPLACE VIEW v0 AS SELECT ...` + sessionState.catalog.alterTable(prepareTable(sparkSession, analyzedPlan)) +} else { + // Handles `CREATE VIEW v0 AS SELECT ...`. Throws exception when the target view already + // exists. + throw new AnalysisException(s"View $tableIdentifier already exists. " + +"If you want to update the view definition, please use ALTER VIEW AS or " + +"CREATE OR REPLACE VIEW AS") +} + } else { +// Create the view if it doesn't exist. +sessionState.catalog.createTable( + prepareTable(sparkSession, analyzedPlan), ignoreIfExists = false) + } +} +Seq.empty[Row] + } + + private def createTemporaryView(table: TableIdentifier, sparkSession: SparkSession, + analyzedPlan: LogicalPlan): Unit = { + +val sessionState = sparkSession.sessionState +val catalog = sessionState.catalog + +// Projects column names to alias names +val logicalPlan = { --- End diff -- Yes, we can project the columns like this: ``` spark.range(1, 10).selectExpr("id", "id id1").write.format("json").saveAsTable("jt") spark.sql("select * from jt").show() spark.sql("CREATE TEMPORARY VIEW testView (c1 COMMENT 'column 1', c2 COMMENT 'colum 2') AS SELECT * FROM jt") spark.sql("select * from testView").show() ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12872#issuecomment-216718016 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57693/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12872#issuecomment-216718015 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12872#issuecomment-216717929 **[Test build #57693 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57693/consoleFull)** for PR 12872 at commit [`8af667e`](https://github.com/apache/spark/commit/8af667eda1f6a33774734b5a42ffae62bca22015). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12872#issuecomment-216704088 **[Test build #57693 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57693/consoleFull)** for PR 12872 at commit [`8af667e`](https://github.com/apache/spark/commit/8af667eda1f6a33774734b5a42ffae62bca22015). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/12872#issuecomment-216703452 add to whitelist --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/12872#issuecomment-216703301 test this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/12872#discussion_r61954025 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala --- @@ -71,27 +81,69 @@ case class CreateViewCommand( require(tableDesc.schema == Nil || tableDesc.schema.length == analyzedPlan.output.length) val sessionState = sparkSession.sessionState -if (sessionState.catalog.tableExists(tableIdentifier)) { +if (isTemporary) { + createTemporaryView(tableDesc.identifier, sparkSession, analyzedPlan) +} else { + // Adds default database for permanent table if it doesn't exist, so that tableExists() + // only check permanent tables. + val database = tableDesc.identifier.database.getOrElse( +sessionState.catalog.getCurrentDatabase) + val tableIdentifier = tableDesc.identifier.copy(database = Option(database)) + + if (sessionState.catalog.tableExists(tableIdentifier)) { +if (allowExisting) { + // Handles `CREATE VIEW IF NOT EXISTS v0 AS SELECT ...`. Does nothing when the target view + // already exists. +} else if (replace) { + // Handles `CREATE OR REPLACE VIEW v0 AS SELECT ...` + sessionState.catalog.alterTable(prepareTable(sparkSession, analyzedPlan)) +} else { + // Handles `CREATE VIEW v0 AS SELECT ...`. Throws exception when the target view already + // exists. + throw new AnalysisException(s"View $tableIdentifier already exists. " + +"If you want to update the view definition, please use ALTER VIEW AS or " + +"CREATE OR REPLACE VIEW AS") +} + } else { +// Create the view if it doesn't exist. +sessionState.catalog.createTable( + prepareTable(sparkSession, analyzedPlan), ignoreIfExists = false) + } +} +Seq.empty[Row] + } + + private def createTemporaryView(table: TableIdentifier, sparkSession: SparkSession, + analyzedPlan: LogicalPlan): Unit = { + +val sessionState = sparkSession.sessionState +val catalog = sessionState.catalog + +// Projects column names to alias names +val logicalPlan = { --- End diff -- Is this possible? To have different columns and query output? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/12872#discussion_r61953261 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala --- @@ -71,27 +81,69 @@ case class CreateViewCommand( require(tableDesc.schema == Nil || tableDesc.schema.length == analyzedPlan.output.length) val sessionState = sparkSession.sessionState -if (sessionState.catalog.tableExists(tableIdentifier)) { +if (isTemporary) { + createTemporaryView(tableDesc.identifier, sparkSession, analyzedPlan) +} else { + // Adds default database for permanent table if it doesn't exist, so that tableExists() + // only check permanent tables. + val database = tableDesc.identifier.database.getOrElse( +sessionState.catalog.getCurrentDatabase) + val tableIdentifier = tableDesc.identifier.copy(database = Option(database)) + + if (sessionState.catalog.tableExists(tableIdentifier)) { +if (allowExisting) { --- End diff -- You can simplify the code here by ommiting this if clause and adding the explicit check `!allowExisting` to the else branch; this also makes is easier to follow the logic. Keep the comment though. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/12872#discussion_r61952975 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala --- @@ -71,27 +81,69 @@ case class CreateViewCommand( require(tableDesc.schema == Nil || tableDesc.schema.length == analyzedPlan.output.length) val sessionState = sparkSession.sessionState -if (sessionState.catalog.tableExists(tableIdentifier)) { +if (isTemporary) { + createTemporaryView(tableDesc.identifier, sparkSession, analyzedPlan) +} else { + // Adds default database for permanent table if it doesn't exist, so that tableExists() + // only check permanent tables. + val database = tableDesc.identifier.database.getOrElse( +sessionState.catalog.getCurrentDatabase) + val tableIdentifier = tableDesc.identifier.copy(database = Option(database)) + + if (sessionState.catalog.tableExists(tableIdentifier)) { +if (allowExisting) { + // Handles `CREATE VIEW IF NOT EXISTS v0 AS SELECT ...`. Does nothing when the target view + // already exists. +} else if (replace) { + // Handles `CREATE OR REPLACE VIEW v0 AS SELECT ...` + sessionState.catalog.alterTable(prepareTable(sparkSession, analyzedPlan)) +} else { + // Handles `CREATE VIEW v0 AS SELECT ...`. Throws exception when the target view already + // exists. + throw new AnalysisException(s"View $tableIdentifier already exists. " + +"If you want to update the view definition, please use ALTER VIEW AS or " + +"CREATE OR REPLACE VIEW AS") +} + } else { +// Create the view if it doesn't exist. +sessionState.catalog.createTable( + prepareTable(sparkSession, analyzedPlan), ignoreIfExists = false) + } +} +Seq.empty[Row] + } + + private def createTemporaryView(table: TableIdentifier, sparkSession: SparkSession, --- End diff -- Style --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6339][SQL][WIP] Supports create CREATE ...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/12872#discussion_r61952907 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala --- @@ -55,13 +60,18 @@ case class CreateViewCommand( require(tableDesc.tableType == CatalogTableType.VIEW) require(tableDesc.viewText.isDefined) - private val tableIdentifier = tableDesc.identifier - if (allowExisting && replace) { throw new AnalysisException( "It is not allowed to define a view with both IF NOT EXISTS and OR REPLACE.") } + // Temporary view names should NOT contain database prefix like "database.table" + if (isTemporary && tableDesc.identifier.database.isDefined) { +val database = tableDesc.identifier.database.get +throw new AnalysisException( --- End diff -- Where does this semantic rule come from? Hive? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org