This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new a51dd1820da [SPARK-39203][SQL][FOLLOWUP] Do not qualify view location
a51dd1820da is described below

commit a51dd1820dad8f24f99a979d5998b999ae4a3c25
Author: Wenchen Fan <wenc...@databricks.com>
AuthorDate: Thu Oct 20 15:16:04 2022 -0700

    [SPARK-39203][SQL][FOLLOWUP] Do not qualify view location

    ### What changes were proposed in this pull request?

    This fixes a corner-case regression caused by
    https://github.com/apache/spark/pull/36625. Users may have existing views
    with invalid locations for historical reasons. The location is actually
    useless for a view, but after https://github.com/apache/spark/pull/36625
    reading such a view fails because qualifying the location fails. We should
    simply skip qualifying view locations.

    ### Why are the changes needed?

    To avoid a regression.

    ### Does this PR introduce _any_ user-facing change?

    Spark can read views with an invalid location again.

    ### How was this patch tested?

    Manually tested. A view with an invalid location is essentially "broken"
    and can't be dropped (HMS fails to drop it), so we can't write a unit test
    for it.

    Closes #38321 from cloud-fan/follow.

    Authored-by: Wenchen Fan <wenc...@databricks.com>
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
---
 .../apache/spark/sql/hive/client/HiveClientImpl.scala | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
index f6b06b08cbc..213d930653d 100644
--- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
+++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
@@ -537,12 +537,18 @@ private[hive] class HiveClientImpl(
       storage = CatalogStorageFormat(
         locationUri = shim.getDataLocation(h).map { loc =>
           val tableUri = stringToURI(loc)
-          // Before SPARK-19257, created data source table does not use absolute uri.
-          // This makes Spark can't read these tables across HDFS clusters.
-          // Rewrite table location to absolute uri based on database uri to fix this issue.
-          val absoluteUri = Option(tableUri).filterNot(_.isAbsolute)
-            .map(_ => stringToURI(client.getDatabase(h.getDbName).getLocationUri))
-          HiveExternalCatalog.toAbsoluteURI(tableUri, absoluteUri)
+          if (h.getTableType == HiveTableType.VIRTUAL_VIEW) {
+            // Data location of SQL view is useless. Do not qualify it even if it's present, as
+            // it can be an invalid path.
+            tableUri
+          } else {
+            // Before SPARK-19257, created data source table does not use absolute uri.
+            // This makes Spark can't read these tables across HDFS clusters.
+            // Rewrite table location to absolute uri based on database uri to fix this issue.
+            val absoluteUri = Option(tableUri).filterNot(_.isAbsolute)
+              .map(_ => stringToURI(client.getDatabase(h.getDbName).getLocationUri))
+            HiveExternalCatalog.toAbsoluteURI(tableUri, absoluteUri)
+          }
         },
         // To avoid ClassNotFound exception, we try our best to not get the format class, but get
         // the class name directly. However, for non-native tables, there is no interface to get
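As a side note, here is a minimal, self-contained Scala sketch of the failure mode the patch avoids. The helper `qualifyLocation` and the sample paths are hypothetical stand-ins for `stringToURI` / `HiveExternalCatalog.toAbsoluteURI`, not the actual Spark code; the point is only that qualifying an arbitrary, possibly malformed location string can throw, which is what reading a view with a leftover junk location triggered.

```scala
import java.net.{URI, URISyntaxException}

object QualifyLocationSketch {
  // Hypothetical stand-in for the qualification step: relative locations are
  // resolved against the database location, absolute ones are kept as-is.
  def qualifyLocation(location: String, dbLocation: String): URI = {
    val uri = new URI(location) // throws URISyntaxException on malformed input
    if (uri.isAbsolute) uri else new URI(dbLocation).resolve(uri)
  }

  def main(args: Array[String]): Unit = {
    val dbLocation = "hdfs://nn1:8020/user/hive/warehouse/db1.db/"

    // A regular table with a relative location is qualified against the database URI.
    println(qualifyLocation("tbl1", dbLocation))
    // -> hdfs://nn1:8020/user/hive/warehouse/db1.db/tbl1

    // A view may carry a stale, invalid location; trying to qualify it throws,
    // which is why the patch returns the location untouched for VIRTUAL_VIEW.
    try {
      qualifyLocation("hdfs://bad host/with spaces", dbLocation)
    } catch {
      case e: URISyntaxException => println(s"qualification failed: ${e.getMessage}")
    }
  }
}
```

Skipping this step for VIRTUAL_VIEW entries, as the diff above does, leaves the stored URI alone and sidesteps the failure.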