I guess it really depends on your configuration. The Hive metastore only provides the metadata/schema for your tables, not the actual data storage; Hive itself runs on top of Hadoop. If you configure Spark to run on the same Hadoop cluster under YARN, your Spark SQL DataFrame can read and write those Hive tables through the metastore directly, without going through HiveServer2.
Hi Team,
I wanted to understand how Spark connects to Hive. Does it connect to the Hive metastore directly, bypassing HiveServer2? Say we are inserting data into a Hive table whose I/O format is Parquet: does Spark create the Parquet files itself from the DataFrame/RDD/Dataset and put them in the table's storage location?