Hi, I want to load a stream of CSV files into a partitioned Hive table called myTable.
I tried using Spark 2 Structured Streaming to do that:

    val spark = SparkSession
      .builder
      .appName("TrueCallLoade")
      .enableHiveSupport()
      .config("hive.exec.dynamic.partition.mode", "nonstrict")
      .config("hive.exec.dynamic.partition", "true")
      .config("hive.exec.max.dynamic.partitions", "2048")
      .config("hive.exec.max.dynamic.partitions.pernode", "256")
      .getOrCreate()

    val df = spark.readStream
      .option("sep", ",")
      .option("header", "true")
      .schema(customSchema)
      .csv(fileDirectory)

The dataframe has two columns, "dt" and "h", by which the Hive table is partitioned. writeStream can't stream directly to a Hive table, so I decided to use

    val query = df.writeStream
      .queryName("LoadedCSVData")
      .outputMode("append")
      .format("memory")
      .start()

and then

    spark.sql("INSERT INTO myTable SELECT * FROM LoadedCSVData")

This doesn't seem to insert anything into the table. Any idea how I can achieve that?

Nimrod

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Saving-Structured-Streaming-DF-to-Hive-Partitioned-table-tp28424.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
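For reference, one alternative worth sketching: instead of the memory sink (which is intended for debugging and only holds data in the driver), Spark 2's streaming file sink can write partitioned Parquet directly under the table's storage location; Hive then just needs to be told about the new partitions. This is only a sketch under assumptions: the output path and checkpoint path are made up, and customSchema and fileDirectory are the same placeholders as above.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("TrueCallLoade")
  .enableHiveSupport()
  .getOrCreate()

// Same CSV source as in the original snippet.
val df = spark.readStream
  .option("sep", ",")
  .option("header", "true")
  .schema(customSchema)
  .csv(fileDirectory)

// Stream out as Parquet, partitioned by the same columns as the Hive table.
// The path is an assumption -- it should be myTable's actual location.
val query = df.writeStream
  .format("parquet")
  .partitionBy("dt", "h")
  .option("path", "/user/hive/warehouse/mytable")      // assumed table location
  .option("checkpointLocation", "/tmp/mytable_ckpt")   // assumed checkpoint dir
  .outputMode("append")
  .start()
```

After new partition directories appear, something like `spark.sql("MSCK REPAIR TABLE myTable")` (or an explicit `ALTER TABLE ... ADD PARTITION`) would register them in the metastore so the table sees the streamed data.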