GitHub user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14750#discussion_r78106864

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
    @@ -446,29 +449,46 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
           table
         } else {
           getProviderFromTableProperties(table).map { provider =>
    -        assert(provider != "hive", "Hive serde table should not save provider in table properties.")
    -        // SPARK-15269: Persisted data source tables always store the location URI as a storage
    -        // property named "path" instead of standard Hive `dataLocation`, because Hive only
    -        // allows directory paths as location URIs while Spark SQL data source tables also
    -        // allows file paths. So the standard Hive `dataLocation` is meaningless for Spark SQL
    -        // data source tables.
    -        // Spark SQL may also save external data source in Hive compatible format when
    -        // possible, so that these tables can be directly accessed by Hive. For these tables,
    -        // `dataLocation` is still necessary. Here we also check for input format because only
    -        // these Hive compatible tables set this field.
    -        val storage = if (table.tableType == EXTERNAL && table.storage.inputFormat.isEmpty) {
    -          table.storage.copy(locationUri = None)
    +        if (provider == "hive") {
    +          val schemaFromTableProps = getSchemaFromTableProperties(table)
    +          if (DataType.equalsIgnoreCaseAndNullability(schemaFromTableProps, table.schema)) {
    --- End diff --

    The schema includes the partitioning columns, but it does not include the `bucketSpec` info. Based on the previous answer about `bucketSpec`, I am not sure whether we also need to check that `bucketSpec` is the same. If we do not, should we leave a TODO here? Otherwise, we might forget it when we support `bucketSpec` for Hive serde tables.
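
To make the question above concrete, here is a minimal sketch of a consistency check that covers `bucketSpec` as well as the schema. It is not the code in the pull request: `Field`, `BucketSpec`, `TableMeta`, and `SchemaCheckSketch` are hypothetical stand-ins defined here so the example is self-contained, and the schema comparison is a simplified flat version that only mirrors the intent of `DataType.equalsIgnoreCaseAndNullability`.

```scala
// Hypothetical stand-ins for the catalog types; these are NOT the real Spark classes.
case class Field(name: String, dataType: String, nullable: Boolean)
case class BucketSpec(
    numBuckets: Int,
    bucketColumnNames: Seq[String],
    sortColumnNames: Seq[String])
case class TableMeta(schema: Seq[Field], bucketSpec: Option[BucketSpec])

object SchemaCheckSketch {
  // Flat schemas match if field names agree ignoring case and the types agree;
  // nullability is deliberately ignored.
  def sameSchema(a: Seq[Field], b: Seq[Field]): Boolean =
    a.length == b.length && a.zip(b).forall { case (x, y) =>
      x.name.equalsIgnoreCase(y.name) && x.dataType == y.dataType
    }

  // Bucket specs match if both are absent, or both are present with the same
  // bucket count and the same column names (compared case-insensitively).
  def sameBucketSpec(a: Option[BucketSpec], b: Option[BucketSpec]): Boolean =
    (a, b) match {
      case (None, None) => true
      case (Some(x), Some(y)) =>
        x.numBuckets == y.numBuckets &&
          x.bucketColumnNames.map(_.toLowerCase) == y.bucketColumnNames.map(_.toLowerCase) &&
          x.sortColumnNames.map(_.toLowerCase) == y.sortColumnNames.map(_.toLowerCase)
      case _ => false
    }

  // Metadata restored from table properties is trusted only if both parts match.
  def isConsistent(fromProps: TableMeta, fromHive: TableMeta): Boolean =
    sameSchema(fromProps.schema, fromHive.schema) &&
      // TODO: only relevant once bucketSpec is supported for Hive serde tables.
      sameBucketSpec(fromProps.bucketSpec, fromHive.bucketSpec)
}
```

If the bucket check is not needed yet, the same shape works with only `sameSchema` plus the TODO comment, which is the alternative the comment proposes.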