[GitHub] spark pull request #14148: [SPARK-16482] [SQL] Describe Table Command for Ta...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14148 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14148: [SPARK-16482] [SQL] Describe Table Command for Ta...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14148#discussion_r70578153

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
    @@ -413,38 +413,36 @@ case class DescribeTableCommand(table: TableIdentifier, isExtended: Boolean, isF
         } else {
           val metadata = catalog.getTableMetadata(table)

    +      if (DDLUtils.isDatasourceTable(metadata)) {
    +        DDLUtils.getSchemaFromTableProperties(metadata) match {
    +          case Some(userSpecifiedSchema) => describeSchema(userSpecifiedSchema, result)
    +          case None => describeSchema(catalog.lookupRelation(table).schema, result)
    +        }
    +      } else {
    +        describeSchema(metadata.schema, result)
    +      }
    --- End diff --

    @yhuai I just gave it a try. We have to pass `CatalogTable` to avoid another call to `getTableMetadata`. We also need to pass `SessionCatalog` to call `lookupRelation`. Do you like this function, or should we keep the existing one? Thanks!

    ```Scala
    private def describeSchema(
        tableDesc: CatalogTable,
        catalog: SessionCatalog,
        buffer: ArrayBuffer[Row]): Unit = {
      if (DDLUtils.isDatasourceTable(tableDesc)) {
        DDLUtils.getSchemaFromTableProperties(tableDesc) match {
          case Some(userSpecifiedSchema) => describeSchema(userSpecifiedSchema, buffer)
          case None => describeSchema(catalog.lookupRelation(table).schema, buffer)
        }
      } else {
        describeSchema(tableDesc.schema, buffer)
      }
    }
    ```
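For readers without a Spark checkout at hand, the fallback order in the helper above can be modeled with plain Scala. This is only a sketch under stated assumptions: `Field`, `TableDesc`, `resolveSchema` and `inferAtRuntime` are hypothetical stand-ins invented for illustration, not Spark's `CatalogTable`, `SessionCatalog` or `lookupRelation`, and the real code appends rows to a buffer rather than returning a schema.

```scala
// Spark-free model of the schema lookup order discussed in this thread.
// Every name in this object is a hypothetical stand-in, not a Spark API.
object DescribeSchemaSketch {
  final case class Field(name: String, dataType: String)

  final case class TableDesc(
      name: String,
      isDatasourceTable: Boolean,
      catalogSchema: Seq[Field],               // schema stored directly in the catalog
      schemaInProperties: Option[Seq[Field]])  // schema kept in table properties, if any

  // Chooses the schema that a describe command would print.
  // `inferAtRuntime` plays the role of `catalog.lookupRelation(table).schema`
  // and is only invoked when a data source table has no stored schema.
  def resolveSchema(
      desc: TableDesc,
      inferAtRuntime: TableDesc => Seq[Field]): Seq[Field] = {
    if (desc.isDatasourceTable) {
      desc.schemaInProperties.getOrElse(inferAtRuntime(desc))
    } else {
      desc.catalogSchema
    }
  }
}
```

In this model, the possibly expensive runtime inference is reached only in the one case the PR is about: a data source table created without a user-specified schema.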
[GitHub] spark pull request #14148: [SPARK-16482] [SQL] Describe Table Command for Ta...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14148#discussion_r70573373

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
    @@ -413,38 +413,36 @@ case class DescribeTableCommand(table: TableIdentifier, isExtended: Boolean, isF
         } else {
           val metadata = catalog.getTableMetadata(table)

    +      if (DDLUtils.isDatasourceTable(metadata)) {
    +        DDLUtils.getSchemaFromTableProperties(metadata) match {
    +          case Some(userSpecifiedSchema) => describeSchema(userSpecifiedSchema, result)
    +          case None => describeSchema(catalog.lookupRelation(table).schema, result)
    +        }
    +      } else {
    +        describeSchema(metadata.schema, result)
    +      }
    --- End diff --

    Sure. Let me do it now.

    BTW, previously, `describeExtended` and `describeFormatted` also contained the schema. Both called the original function `describe`.
[GitHub] spark pull request #14148: [SPARK-16482] [SQL] Describe Table Command for Ta...
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14148#discussion_r70571914

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
    @@ -413,38 +413,36 @@ case class DescribeTableCommand(table: TableIdentifier, isExtended: Boolean, isF
         } else {
           val metadata = catalog.getTableMetadata(table)

    +      if (DDLUtils.isDatasourceTable(metadata)) {
    +        DDLUtils.getSchemaFromTableProperties(metadata) match {
    +          case Some(userSpecifiedSchema) => describeSchema(userSpecifiedSchema, result)
    +          case None => describeSchema(catalog.lookupRelation(table).schema, result)
    +        }
    +      } else {
    +        describeSchema(metadata.schema, result)
    +      }
    --- End diff --

    How about we try to put these into `describeSchema`? Or, maybe we can add a `describeSchema(tableName, result)`? It seems weird that `describeExtended` and `describeFormatted` do not contain the code for describing the schema.
[GitHub] spark pull request #14148: [SPARK-16482] [SQL] Describe Table Command for Ta...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14148#discussion_r70570674

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
    @@ -431,7 +431,7 @@ case class DescribeTableCommand(table: TableIdentifier, isExtended: Boolean, isF
         val schema = DDLUtils.getSchemaFromTableProperties(table)

         if (schema.isEmpty) {
    -      append(buffer, "# Schema of this table is inferred at runtime", "", "")
    +      append(buffer, "# Schema of this table in catalog is corrupted", "", "")
    --- End diff --

    Do you like the last patch? https://github.com/apache/spark/pull/14148/commits/d92ebcdfd7e525499e0c8b491eeab416ad12ecfd
[GitHub] spark pull request #14148: [SPARK-16482] [SQL] Describe Table Command for Ta...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14148#discussion_r70570710

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala ---
    @@ -105,7 +105,7 @@ case class CreateDataSourceTableCommand(
         CreateDataSourceTableUtils.createDataSourceTable(
           sparkSession = sparkSession,
           tableIdent = tableIdent,
    -      userSpecifiedSchema = userSpecifiedSchema,
    +      userSpecifiedSchema = Some(dataSource.schema),
    --- End diff --

    Agreed. I will revert it to the last solution.
[GitHub] spark pull request #14148: [SPARK-16482] [SQL] Describe Table Command for Ta...
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14148#discussion_r70570551

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala ---
    @@ -105,7 +105,7 @@ case class CreateDataSourceTableCommand(
         CreateDataSourceTableUtils.createDataSourceTable(
           sparkSession = sparkSession,
           tableIdent = tableIdent,
    -      userSpecifiedSchema = userSpecifiedSchema,
    +      userSpecifiedSchema = Some(dataSource.schema),
    --- End diff --

    I think this change is risky for 2.0, and it changes the behavior.
[GitHub] spark pull request #14148: [SPARK-16482] [SQL] Describe Table Command for Ta...
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14148#discussion_r70570489

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
    @@ -431,7 +431,7 @@ case class DescribeTableCommand(table: TableIdentifier, isExtended: Boolean, isF
         val schema = DDLUtils.getSchemaFromTableProperties(table)

         if (schema.isEmpty) {
    -      append(buffer, "# Schema of this table is inferred at runtime", "", "")
    +      append(buffer, "# Schema of this table in catalog is corrupted", "", "")
    --- End diff --

    Should we just use `catalog.lookupRelation(table).schema` to get the schema?
[GitHub] spark pull request #14148: [SPARK-16482] [SQL] Describe Table Command for Ta...
GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/14148

    [SPARK-16482] [SQL] Describe Table Command for Tables Requiring Runtime Inferred Schema

    What changes were proposed in this pull request?

    If we create a table pointing to parquet/json datasets without specifying the schema, the describe table command does not show the schema at all. It only shows `# Schema of this table is inferred at runtime`. In 1.6, describe table does show the schema of such a table.

    For data source tables, to infer the schema, we need to load the data source tables at runtime. Thus, this PR calls the function `lookupRelation`.

    How was this patch tested?

    Added test cases

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark describeSchema

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14148.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #14148

commit 57893bdf55146c4ecd0a6d72c69ec3d3e85b5207
Author: gatorsmile
Date:   2016-07-11T22:30:11Z

    fix

commit 6f2deb3405b119aff1c88cab19d3953a7ede0408
Author: gatorsmile
Date:   2016-07-11T22:55:18Z

    another fix way

commit d92ebcdfd7e525499e0c8b491eeab416ad12ecfd
Author: gatorsmile
Date:   2016-07-12T04:00:20Z

    another fix way