I think I found the issue: Hive metastore 2.3.6 doesn't have the necessary support. After upgrading to Hive 3.1.2, I was able to run the select query.
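For anyone landing here with the same error, a minimal sketch of the session setup discussed in this thread, combining the two Delta configs with the metastore-version setting Matt pointed to. This is a sketch, not a verified recipe: the `spark.sql.hive.metastore.jars` value and the exact metastore version string are assumptions for a Hive 3.1.2 setup.

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: Delta configs from this thread plus the Hive metastore
// version override. "maven" tells Spark to download the metastore jars;
// a classpath of pre-installed Hive 3.1.2 jars also works.
val spark = SparkSession.builder()
  .appName("QuickstartSQL")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog",
    "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .config("spark.sql.hive.metastore.version", "3.1.2") // assumed value
  .config("spark.sql.hive.metastore.jars", "maven")    // assumed value
  .getOrCreate()
```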
On Sun, 20 Dec 2020 at 12:00, Jay <jayadeep.jayara...@gmail.com> wrote:

> Thanks Matt.
>
> I have set the two configs in my sparkConfig as below:
>
>     val spark = SparkSession.builder()
>       .appName("QuickstartSQL")
>       .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
>       .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
>       .getOrCreate()
>
> I am using a managed Spark service with Delta on Google Cloud, and
> therefore all nodes have delta-core_2.12:0.7.0 in /usr/lib/delta/jars/.
>
> I am using a managed Hive metastore version 2.3.6 which is connected to my
> Delta cluster as well as my Presto cluster. When I use the normal Scala
> API it works without any issues:
>
>     spark.read.format("delta").load("pathtoTable").show()
>     val deltaTable = DeltaTable.forPath("gs://jayadeep-etl-platform/first-delta-table")
>     deltaTable.as("oldData")
>       .merge(merge_df.as("newData"), "oldData.x = newData.x")
>       .whenMatched.update(Map("y" -> col("newData.y")))
>       .whenNotMatched.insert(Map("x" -> col("newData.x")))
>       .execute()
>
> When I issue the following command it works fine:
>
>     scala> spark.sql(s"SELECT * FROM $tableName")
>     res2: org.apache.spark.sql.DataFrame = [col1: int]
>
> But when I try to do .show() it returns an error:
>
>     scala> spark.sql(s"SELECT * FROM $tableName").show()
>     org.apache.spark.sql.AnalysisException: Table does not support reads:
>     default.tblname_3;
>
> -Jay
>
> On Sun, 20 Dec 2020 at 03:51, Matt Proetsch <mattproet...@gmail.com> wrote:
>
>> Hi Jay,
>>
>> Some things to check:
>>
>> Do you have the following set in your Spark SQL config:
>>
>>     spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension
>>     spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog
>>
>> Is the JAR for the package delta-core_2.12:0.7.0 available on both your
>> driver and executor classpaths?
>> (More info:
>> https://docs.delta.io/latest/quick-start.html#set-up-apache-spark-with-delta-lake)
>>
>> Since you are using a non-default metastore version, have you set the
>> config spark.sql.hive.metastore.version?
>> (More info:
>> https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore)
>>
>> Finally, are you able to read/write Delta tables outside of Hive?
>>
>> -Matt
>>
>> On Dec 19, 2020, at 13:03, Jay <jayadeep.jayara...@gmail.com> wrote:
>>
>> Hi All -
>>
>> I have currently set up a Spark 3.0.1 cluster with Delta version 0.7.0
>> which is connected to an external Hive metastore.
>>
>> I run the below set of commands:
>>
>>     val tableName = "tblname_2"
>>     spark.sql(s"CREATE TABLE $tableName(col1 INTEGER) USING delta options(path='GCS_PATH')")
>>
>>     20/12/19 17:30:52 WARN org.apache.spark.sql.hive.HiveExternalCatalog:
>>     Couldn't find corresponding Hive SerDe for data source provider delta.
>>     Persisting data source table `default`.`tblname_2` into Hive metastore in
>>     Spark SQL specific format, which is NOT compatible with Hive.
>>
>>     spark.sql(s"INSERT OVERWRITE $tableName VALUES 5, 6, 7, 8, 9")
>>     res51: org.apache.spark.sql.DataFrame = []
>>
>>     spark.sql(s"SELECT * FROM $tableName").show()
>>
>>     org.apache.spark.sql.AnalysisException: Table does not support reads:
>>     default.tblname_2;
>>
>> I see a warning related to the Hive metastore integration which
>> essentially says that this table cannot be queried via Hive or Presto.
>> That is fine, but when I try to read the data from the same Spark
>> session I get an error. Can someone suggest what the problem might be?
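The configs discussed above can also be passed at launch time rather than in code. A sketch of a spark-shell invocation, using the package coordinates from this thread; the metastore version and jars values are assumptions, not something verified on the poster's cluster:

```shell
spark-shell \
  --packages io.delta:delta-core_2.12:0.7.0 \
  --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" \
  --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog" \
  --conf "spark.sql.hive.metastore.version=3.1.2" \
  --conf "spark.sql.hive.metastore.jars=maven"
```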