Thanks Matt.

I have set the two configs in my Spark config as below:
val spark = SparkSession.builder()
  .appName("QuickstartSQL")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

I am using a managed Spark service with Delta on Google Cloud, so all
nodes have delta-core_2.12:0.7.0 in /usr/lib/delta/jars/.

I am using a managed Hive metastore (version 2.3.6) which is connected to
both my Delta cluster and my Presto cluster. When I use the normal Scala
API it works without any issues:

spark.read.format("delta").load("pathtoTable").show()

val deltaTable = DeltaTable.forPath("gs://jayadeep-etl-platform/first-delta-table")
deltaTable.as("oldData")
  .merge(merge_df.as("newData"), "oldData.x = newData.x")
  .whenMatched.update(Map("y" -> col("newData.y")))
  .whenNotMatched.insert(Map("x" -> col("newData.x")))
  .execute()

When I issue the following command, it works fine:
scala> spark.sql(s"SELECT * FROM $tableName")
res2: org.apache.spark.sql.DataFrame = [col1: int]

But when I call .show() on the result, it returns an error:
scala> spark.sql(s"SELECT * FROM $tableName").show()
org.apache.spark.sql.AnalysisException: Table does not support reads:
default.tblname_3;
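
For completeness, the catalog entry can also be inspected like this (a sketch; the Provider row is the interesting part):

spark.sql(s"DESCRIBE TABLE EXTENDED $tableName").show(50, false)
// the Provider row should read "delta"; if the DeltaCatalog config was
// missing when the table was created, SQL reads can fail like the above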

-Jay


On Sun, 20 Dec 2020 at 03:51, Matt Proetsch <mattproet...@gmail.com> wrote:

> Hi Jay,
>
> Some things to check:
>
> Do you have the following set in your Spark SQL config:
>
> "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension"
>
> "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"
>
>
> Is the JAR for the package delta-core_2.12:0.7.0 available on both your
> driver and executor classpaths?
> (More info
> https://docs.delta.io/latest/quick-start.html#set-up-apache-spark-with-delta-lake
> )
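>
> For a quick driver-side check from the REPL, something like this
> (illustrative; executors would need a separate check, e.g. inside a task):
>
> // throws ClassNotFoundException if delta-core is not on the driver classpath
> Class.forName("io.delta.sql.DeltaSparkSessionExtension")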
>
> Since you are using a non-default metastore version, have you set the config
> for spark.sql.hive.metastore.version?
> (More info
> https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore
> )
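>
> e.g. something like this when building the session (illustrative; the
> version string should match your metastore):
>
> val spark = SparkSession.builder()
>   .config("spark.sql.hive.metastore.version", "2.3.6")
>   .config("spark.sql.hive.metastore.jars", "maven") // or a classpath of matching Hive jars
>   .enableHiveSupport()
>   .getOrCreate()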
>
> Finally, are you able to read/write Delta tables outside of Hive?
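>
> For example, a path-based round trip like this (the bucket path is a
> placeholder):
>
> spark.range(5).write.format("delta").mode("overwrite").save("gs://some-bucket/delta-smoke-test")
> spark.read.format("delta").load("gs://some-bucket/delta-smoke-test").show()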
>
> -Matt
>
> On Dec 19, 2020, at 13:03, Jay <jayadeep.jayara...@gmail.com> wrote:
>
>
> Hi All -
>
> I have currently set up a Spark 3.0.1 cluster with Delta version 0.7.0
> which is connected to an external Hive metastore.
>
> I run the below set of commands:
>
> val tableName = "tblname_2"
> spark.sql(s"CREATE TABLE $tableName(col1 INTEGER) USING delta
> options(path='GCS_PATH')")
>
> *20/12/19 17:30:52 WARN org.apache.spark.sql.hive.HiveExternalCatalog:
> Couldn't find corresponding Hive SerDe for data source provider delta.
> Persisting data source table `default`.`tblname_2` into Hive metastore in
> Spark SQL specific format, which is NOT compatible with Hive.*
>
> spark.sql(s"INSERT OVERWRITE $tableName VALUES 5, 6, 7, 8, 9")
> res51: org.apache.spark.sql.DataFrame = []
>
> spark.sql(s"SELECT * FROM $tableName").show()
>
> *org.apache.spark.sql.AnalysisException: Table does not support reads:
> default.tblname_2;                   *
>
> I see a warning related to the Hive metastore integration which essentially
> says that this table cannot be queried via Hive or Presto. That is fine, but
> when I try to read the data back from the same Spark session I get an error.
> Can someone suggest what the problem could be?
>
>
