Hi team, I wanted to understand how Spark connects to Hive. Does it connect to the Hive metastore directly, bypassing HiveServer2? Say we are inserting data into a Hive table whose storage format is Parquet. Does Spark create the Parquet files itself from the DataFrame/RDD/Dataset, write them to the table's HDFS location, and then update the metastore about the new files? Or does it simply run the INSERT statement on HiveServer2 (through JDBC or some other means)?
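For context, here is a minimal sketch of the kind of insert we run (the database, table, and column names below are made up for illustration):

```scala
import org.apache.spark.sql.SparkSession

object HiveInsertExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-insert-example")
      .enableHiveSupport() // makes Spark talk to the Hive metastore
      .getOrCreate()

    import spark.implicits._

    // Hypothetical DataFrame standing in for our real data
    val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

    // Insert into an existing Parquet-backed Hive table
    df.write.mode("append").insertInto("mydb.parquet_table")

    spark.stop()
  }
}
```

So the question is whether, for a write like `insertInto` above, Spark produces the Parquet files and registers them with the metastore itself, or delegates the insert to HiveServer2.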
We are using Spark 2.4.3 and Hive 2.1.1 in our cluster. Is there a document that explains this? Please share. Thanks, Venkat