[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r549936158 ## File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala ## @@ -154,13 +154,54 @@ class DataSourceV2SQLSuite Array("Table Properties", "[bar=baz]", ""))) } - test("Describe column is not supported for v2 catalog") { -withTable("testcat.tbl") { - spark.sql("CREATE TABLE testcat.tbl (id bigint) USING foo") - val ex = intercept[AnalysisException] { -spark.sql("DESCRIBE testcat.tbl id") + test("Describe column for v2 catalog") { +val t = "testcat.tbl" +withTable(t) { + sql(s"CREATE TABLE $t (id bigint, data string COMMENT 'hello') USING foo") + val df1 = sql(s"DESCRIBE $t id") + assert(df1.schema.map(field => (field.name, field.dataType)) +=== Seq(("info_name", StringType), ("info_value", StringType))) + assert(df1.collect === Seq( +Row("col_name", "id"), +Row("data_type", "bigint"), +Row("comment", "NULL"))) + val df2 = sql(s"DESCRIBE $t data") + assert(df2.schema.map(field => (field.name, field.dataType)) +=== Seq(("info_name", StringType), ("info_value", StringType))) + assert(df2.collect === Seq( +Row("col_name", "data"), +Row("data_type", "string"), +Row("comment", "hello"))) + + assertAnalysisError( +s"DESCRIBE $t invalid_col", +"cannot resolve '`invalid_col`' given input columns: [testcat.tbl.data, testcat.tbl.id]") Review comment: I think v2 is better. Let's keep it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r549935749 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ## @@ -764,13 +765,13 @@ case class DescribeColumnCommand( val colName = UnresolvedAttribute(colNameParts).name val field = { relation.resolve(colNameParts, resolver).getOrElse { -throw new AnalysisException(s"Column $colName does not exist") +throw QueryCompilationErrors.columnDoesNotExistError(colName) } } if (!field.isInstanceOf[Attribute]) { // If the field is not an attribute after `resolve`, then it's a nested field. - throw new AnalysisException( -s"DESC TABLE COLUMN command does not support nested data types: $colName") + throw QueryCompilationErrors.commandNotSupportNestedColumnError( +"DESC TABLE COLUMN", toPrettySQL(field, removeAlias = true)) Review comment: We won't hit this branch anymore, as we fail earlier in `ResolveSessionCatalog`. We can simply use `assert(field.isInstanceOf[Attribute])` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r549935311 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -235,8 +237,21 @@ class ResolveSessionCatalog( case DescribeRelation(ResolvedV1TableOrViewIdentifier(ident), partitionSpec, isExtended) => DescribeTableCommand(ident.asTableIdentifier, partitionSpec, isExtended) -case DescribeColumn(ResolvedV1TableOrViewIdentifier(ident), colNameParts, isExtended) => - DescribeColumnCommand(ident.asTableIdentifier, colNameParts, isExtended) +case DescribeColumn(ResolvedViewIdentifier(ident), column: UnresolvedAttribute, isExtended) => + // For views, the column will not be resolved by `ResolveReferences` because + // `ResolvedView` stores only the identifier. + DescribeColumnCommand(ident.asTableIdentifier, column.nameParts, isExtended) + +case DescribeColumn(ResolvedV1TableIdentifier(ident), column, isExtended) => + column match { +case u: UnresolvedAttribute => + throw QueryCompilationErrors.columnDoesNotExistError(u.name) +case a: Attribute => + DescribeColumnCommand(ident.asTableIdentifier, a.qualifier :+ a.name, isExtended) +case nested => Review comment: We can match `Alias` directly here, as other expressions won't be produced when resolving columns. e.g. ``` case Alias(child, _) => throw ... toPrettySQL(child) case other => throw ...("[BUG] unexpected column expression: " + other) ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r549933484 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -235,8 +237,21 @@ class ResolveSessionCatalog( case DescribeRelation(ResolvedV1TableOrViewIdentifier(ident), partitionSpec, isExtended) => DescribeTableCommand(ident.asTableIdentifier, partitionSpec, isExtended) -case DescribeColumn(ResolvedV1TableOrViewIdentifier(ident), colNameParts, isExtended) => - DescribeColumnCommand(ident.asTableIdentifier, colNameParts, isExtended) +case DescribeColumn(ResolvedViewIdentifier(ident), column: UnresolvedAttribute, isExtended) => + // For views, the column will not be resolved by `ResolveReferences` because + // `ResolvedView` stores only the identifier. + DescribeColumnCommand(ident.asTableIdentifier, column.nameParts, isExtended) + +case DescribeColumn(ResolvedV1TableIdentifier(ident), column, isExtended) => + column match { +case u: UnresolvedAttribute => + throw QueryCompilationErrors.columnDoesNotExistError(u.name) +case a: Attribute => + DescribeColumnCommand(ident.asTableIdentifier, a.qualifier :+ a.name, isExtended) Review comment: How about `ident.asMultipartIdentifier :+ a.name`? Then we don't need to add qualifiers to `ResolvedTable`. ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -235,8 +237,21 @@ class ResolveSessionCatalog( case DescribeRelation(ResolvedV1TableOrViewIdentifier(ident), partitionSpec, isExtended) => DescribeTableCommand(ident.asTableIdentifier, partitionSpec, isExtended) -case DescribeColumn(ResolvedV1TableOrViewIdentifier(ident), colNameParts, isExtended) => - DescribeColumnCommand(ident.asTableIdentifier, colNameParts, isExtended) +case DescribeColumn(ResolvedViewIdentifier(ident), column: UnresolvedAttribute, isExtended) => + // For views, the column will not be resolved by `ResolveReferences` because + // `ResolvedView` stores only the identifier. + DescribeColumnCommand(ident.asTableIdentifier, column.nameParts, isExtended) + +case DescribeColumn(ResolvedV1TableIdentifier(ident), column, isExtended) => + column match { +case u: UnresolvedAttribute => + throw QueryCompilationErrors.columnDoesNotExistError(u.name) +case a: Attribute => + DescribeColumnCommand(ident.asTableIdentifier, a.qualifier :+ a.name, isExtended) Review comment: How about `ident.asMultipartIdentifier :+ a.name`? Then we don't need to add qualifiers to `ResolvedTable.output`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r549933194 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -235,8 +237,21 @@ class ResolveSessionCatalog( case DescribeRelation(ResolvedV1TableOrViewIdentifier(ident), partitionSpec, isExtended) => DescribeTableCommand(ident.asTableIdentifier, partitionSpec, isExtended) -case DescribeColumn(ResolvedV1TableOrViewIdentifier(ident), colNameParts, isExtended) => - DescribeColumnCommand(ident.asTableIdentifier, colNameParts, isExtended) +case DescribeColumn(ResolvedViewIdentifier(ident), column: UnresolvedAttribute, isExtended) => + // For views, the column will not be resolved by `ResolveReferences` because + // `ResolvedView` stores only the identifier. + DescribeColumnCommand(ident.asTableIdentifier, column.nameParts, isExtended) + +case DescribeColumn(ResolvedV1TableIdentifier(ident), column, isExtended) => + column match { +case u: UnresolvedAttribute => + throw QueryCompilationErrors.columnDoesNotExistError(u.name) +case a: Attribute => + DescribeColumnCommand(ident.asTableIdentifier, a.qualifier :+ a.name, isExtended) +case nested => + throw QueryCompilationErrors.commandNotSupportNestedColumnError( +"DESC TABLE COLUMN", toPrettySQL(nested, removeAlias = true)) Review comment: Can we manually remove `Alias` here instead of changing the `toPrettySQL` method? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r549602461 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -235,8 +236,17 @@ class ResolveSessionCatalog( case DescribeRelation(ResolvedV1TableOrViewIdentifier(ident), partitionSpec, isExtended) => DescribeTableCommand(ident.asTableIdentifier, partitionSpec, isExtended) -case DescribeColumn(ResolvedV1TableOrViewIdentifier(ident), colNameParts, isExtended) => - DescribeColumnCommand(ident.asTableIdentifier, colNameParts, isExtended) +case DescribeColumn(ResolvedV1TableOrViewIdentifier(ident), column, isExtended) => + column match { +case u: UnresolvedAttribute => + // For views, the column will not be resolved by `ResolveReferences` because + // `ResolvedView` stores only the identifier. + DescribeColumnCommand(ident.asTableIdentifier, u.nameParts, isExtended) +case a: Attribute => + DescribeColumnCommand(ident.asTableIdentifier, a.qualifier :+ a.name, isExtended) +case nested => + throw QueryCompilationErrors.commandNotSupportNestedColumnError("DESC TABLE COLUMN") Review comment: How about we strip the `Alias` and then call `toPrettySQL` which is defined in `org.apache.spark.sql.catalyst.util`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r549592550 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala ## @@ -272,8 +273,13 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat } DescribeTableExec(desc.output, r.table, isExtended) :: Nil -case DescribeColumn(_: ResolvedTable, _, _) => - throw new AnalysisException("Describing columns is not supported for v2 tables.") +case desc @ DescribeColumn(_: ResolvedTable, column, isExtended) => + column match { +case c: Attribute => + DescribeColumnExec(desc.output, c, isExtended) :: Nil +case _ => + throw QueryCompilationErrors.commandNotSupportNestedColumnError("DESC TABLE COLUMN") Review comment: ditto This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r549592416 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -235,8 +236,17 @@ class ResolveSessionCatalog( case DescribeRelation(ResolvedV1TableOrViewIdentifier(ident), partitionSpec, isExtended) => DescribeTableCommand(ident.asTableIdentifier, partitionSpec, isExtended) -case DescribeColumn(ResolvedV1TableOrViewIdentifier(ident), colNameParts, isExtended) => - DescribeColumnCommand(ident.asTableIdentifier, colNameParts, isExtended) +case DescribeColumn(ResolvedV1TableOrViewIdentifier(ident), column, isExtended) => + column match { +case u: UnresolvedAttribute => + // For views, the column will not be resolved by `ResolveReferences` because + // `ResolvedView` stores only the identifier. + DescribeColumnCommand(ident.asTableIdentifier, u.nameParts, isExtended) +case a: Attribute => + DescribeColumnCommand(ident.asTableIdentifier, a.qualifier :+ a.name, isExtended) +case nested => + throw QueryCompilationErrors.commandNotSupportNestedColumnError("DESC TABLE COLUMN") Review comment: > Construct the original name from GetStructField, GetArrayStructFields, etc. Is it simply `nested.sql`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r549591975 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -235,8 +236,17 @@ class ResolveSessionCatalog( case DescribeRelation(ResolvedV1TableOrViewIdentifier(ident), partitionSpec, isExtended) => DescribeTableCommand(ident.asTableIdentifier, partitionSpec, isExtended) -case DescribeColumn(ResolvedV1TableOrViewIdentifier(ident), colNameParts, isExtended) => - DescribeColumnCommand(ident.asTableIdentifier, colNameParts, isExtended) +case DescribeColumn(ResolvedV1TableOrViewIdentifier(ident), column, isExtended) => + column match { +case u: UnresolvedAttribute => + // For views, the column will not be resolved by `ResolveReferences` because + // `ResolvedView` stores only the identifier. + DescribeColumnCommand(ident.asTableIdentifier, u.nameParts, isExtended) Review comment: It's possible when the column name doesn't exist in the table, and we should give a clear error message: `Column $colName does not exist` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r549236685 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -1675,6 +1675,13 @@ class Analyzer(override val catalogManager: CatalogManager) // Skip the having clause here, this will be handled in ResolveAggregateFunctions. case h: UnresolvedHaving => h + case d @ DescribeColumn(rt: ResolvedTable, _, _) => +rt.table match { + // References for v1 tables are resolved in DescribeColumnCommand. + case _: V1Table => d Review comment: We can re-construct the multi-part column name from `Attribute`: `attr.qualifier +: attr.name` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r549236194 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -1675,6 +1675,13 @@ class Analyzer(override val catalogManager: CatalogManager) // Skip the having clause here, this will be handled in ResolveAggregateFunctions. case h: UnresolvedHaving => h + case d @ DescribeColumn(rt: ResolvedTable, _, _) => +rt.table match { + // References for v1 tables are resolved in DescribeColumnCommand. + case _: V1Table => d Review comment: But seems like to resolve the column twice? ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -1675,6 +1675,13 @@ class Analyzer(override val catalogManager: CatalogManager) // Skip the having clause here, this will be handled in ResolveAggregateFunctions. case h: UnresolvedHaving => h + case d @ DescribeColumn(rt: ResolvedTable, _, _) => +rt.table match { + // References for v1 tables are resolved in DescribeColumnCommand. + case _: V1Table => d Review comment: But seems fine to resolve the column twice? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r549236035 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -1675,6 +1675,13 @@ class Analyzer(override val catalogManager: CatalogManager) // Skip the having clause here, this will be handled in ResolveAggregateFunctions. case h: UnresolvedHaving => h + case d @ DescribeColumn(rt: ResolvedTable, _, _) => +rt.table match { + // References for v1 tables are resolved in DescribeColumnCommand. + case _: V1Table => d Review comment: Then we shouldn't change the v1 command. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r548838750 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/v2ResolutionPlans.scala ## @@ -97,9 +98,21 @@ case class ResolvedNamespace(catalog: CatalogPlugin, namespace: Seq[String]) /** * A plan containing resolved table. */ -case class ResolvedTable(catalog: TableCatalog, identifier: Identifier, table: Table) - extends LeafNode { - override def output: Seq[Attribute] = Nil +case class ResolvedTable( +catalog: TableCatalog, +identifier: Identifier, +table: Table, +override val output: Seq[Attribute]) + extends LeafNode + +object ResolvedTable { + def create( + catalog: TableCatalog, + identifier: Identifier, + table: Table): ResolvedTable = { Review comment: This follows `DataSourceV2Relation.create` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r548830290 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -1675,6 +1675,13 @@ class Analyzer(override val catalogManager: CatalogManager) // Skip the having clause here, this will be handled in ResolveAggregateFunctions. case h: UnresolvedHaving => h + case d @ DescribeColumn(rt: ResolvedTable, _, _) => +rt.table match { + // References for v1 tables are resolved in DescribeColumnCommand. + case _: V1Table => d Review comment: Or, we don't care too much about the DDL perf, then I think resolving the column twice is also fine and we don't need this hack either. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r548830073 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -1675,6 +1675,13 @@ class Analyzer(override val catalogManager: CatalogManager) // Skip the having clause here, this will be handled in ResolveAggregateFunctions. case h: UnresolvedHaving => h + case d @ DescribeColumn(rt: ResolvedTable, _, _) => +rt.table match { + // References for v1 tables are resolved in DescribeColumnCommand. + case _: V1Table => d Review comment: Shall we change the v1 command to take resolved `Attribute` directly? Then we don't need the hack here. This also reminds me that, for all v1 commands, we will lookup the table twice: once in the framework, and once inside the v1 command. This is kind of a perf regression: previously we simply lookup the catalog and fallback to v1 command if it's the session catalog, so the table lookup still happens once inside the v1 command. We can also update the v1 command and take the resolved table directly. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r548829115 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala ## @@ -272,8 +272,14 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat } DescribeTableExec(desc.output, r.table, isExtended) :: Nil -case DescribeColumn(_: ResolvedTable, _, _) => - throw new AnalysisException("Describing columns is not supported for v2 tables.") +case desc @ DescribeColumn(_: ResolvedTable, column, isExtended) => + column match { +case c: Attribute => + DescribeColumnExec(desc.output, c, isExtended) :: Nil +case c: GetStructField => + throw new AnalysisException( +"DESC TABLE COLUMN command does not support nested data types.") Review comment: `nested data types` -> `nested fields` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r548399246 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala ## @@ -344,10 +344,11 @@ case class DescribeRelation( */ case class DescribeColumn( relation: LogicalPlan, -colNameParts: Seq[String], +column: NamedExpression, isExtended: Boolean) extends Command { override def children: Seq[LogicalPlan] = Seq(relation) override def output: Seq[Attribute] = DescribeCommandSchema.describeColumnAttributes() + override lazy val references: AttributeSet = AttributeSet.empty Review comment: I agree. About `ResolvedView`, I think eventually it will have output as well, after we have v2 view APII. For now, `ResolvedView` always go to v1 command, so we don't need its output. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r547946465 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala ## @@ -344,10 +344,11 @@ case class DescribeRelation( */ case class DescribeColumn( relation: LogicalPlan, -colNameParts: Seq[String], +column: NamedExpression, isExtended: Boolean) extends Command { override def children: Seq[LogicalPlan] = Seq(relation) override def output: Seq[Attribute] = DescribeCommandSchema.describeColumnAttributes() + override lazy val references: AttributeSet = AttributeSet.empty Review comment: If we need a hack anyway, I'd prefer we override `inputSet` here, so that we can let analyzer to resolve columns. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r547128251 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala ## @@ -344,10 +344,11 @@ case class DescribeRelation( */ case class DescribeColumn( relation: LogicalPlan, -colNameParts: Seq[String], +column: NamedExpression, isExtended: Boolean) extends Command { override def children: Seq[LogicalPlan] = Seq(relation) override def output: Seq[Attribute] = DescribeCommandSchema.describeColumnAttributes() + override lazy val references: AttributeSet = AttributeSet.empty Review comment: What's the problem if `QueryPlan.expressions` pick up this field? Then we can let the analyzer to resolve columns which is even better. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org