subject:"Spark sql insert hive table which method has the highest performance"

Re: Spark sql insert hive table which method has the highest performance

2019-05-15 Thread Jelly Young

Hi, The documentation of DataFrameWriter says: Unlike `saveAsTable`, `insertInto` ignores the column names and just uses position-based resolution. For example:
    scala> Seq((1, 2)).toDF("i", "j").write.mode("overwrite").saveAsTable("t1")
    scala> Seq((3, 4)).toDF("j", "i").write.insertInto("t1")
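To make the difference concrete, here is a minimal sketch in plain Python. It is only a toy model of the two resolution strategies, not the Spark API; the helper names `resolve_by_position` and `resolve_by_name` are made up for illustration, and the table schema `i, j` is taken from the quoted example.

```python
# Toy model of column resolution; plain Python, not Spark code.
# TABLE_COLUMNS mirrors the schema of table t1 from the quoted example.
TABLE_COLUMNS = ["i", "j"]


def resolve_by_position(table_columns, df_columns, row):
    """Mimic insertInto: ignore the DataFrame's column names and
    pair the values with the table columns strictly in order."""
    if len(row) != len(table_columns):
        raise ValueError("column count mismatch")
    return dict(zip(table_columns, row))


def resolve_by_name(table_columns, df_columns, row):
    """Mimic name-based matching: look up each table column
    among the DataFrame's column names."""
    named = dict(zip(df_columns, row))
    return {col: named[col] for col in table_columns}


# A row (3, 4) whose DataFrame columns are declared as ("j", "i").
by_position = resolve_by_position(TABLE_COLUMNS, ["j", "i"], (3, 4))
by_name = resolve_by_name(TABLE_COLUMNS, ["j", "i"], (3, 4))

print(by_position)  # {'i': 3, 'j': 4}
print(by_name)      # {'i': 4, 'j': 3}
```

With position-based resolution the value 3 lands in column `i` even though the DataFrame called that column `j`, so the column order of the DataFrame has to match the table before calling `insertInto`.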

Spark sql insert hive table which method has the highest performance

2019-05-14 Thread 车

Hello guys, I use Spark Streaming to receive data from Kafka and need to store the data in Hive. I have seen the following ways to insert data into Hive on the Internet:
1. Use tmp_table
    TmpDF = spark.createDataFrame(RDD, schema)
    TmpDF.createOrReplaceTempView('TmpData')