Hi all, I ran two Spark SQL statements that read the same table and partition but write to different tables. Is there any way to merge them into one SQL statement, so that the read of the source data happens only once?
Suppose there are two SQL statements:

-----------------------------------------------------------------------------------------------------------------
INSERT OVERWRITE TABLE spark_input_test2 PARTITION(dt='20200909')
SELECT name, number, age FROM spark_input_test WHERE dt='20200908'
-----------------------------------------------------------------------------------------------------------------
INSERT OVERWRITE TABLE spark_input_test1 PARTITION(dt='20200909')
SELECT name, number, sex FROM spark_input_test WHERE dt='20200908'
-----------------------------------------------------------------------------------------------------------------

Running these two statements generates two physical plans, so the source table "spark_input_test" is read twice. If spark_input_test were read only once, it would save a full redundant scan of the source data.

Cheers,
Gang Li
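One possible approach (a sketch, not from the original thread): Spark SQL accepts Hive-style multi-table insert syntax, where a single FROM clause feeds multiple INSERT clauses, expressing both writes over one reference to the source. Whether the optimizer actually shares the underlying scan depends on the resulting plan, so it is worth checking with EXPLAIN:

```sql
-- Sketch: Hive-style multi-table insert, assuming the same tables
-- and partitions as above. One FROM clause feeds both INSERT targets.
FROM spark_input_test
INSERT OVERWRITE TABLE spark_input_test2 PARTITION (dt='20200909')
  SELECT name, number, age WHERE dt='20200908'
INSERT OVERWRITE TABLE spark_input_test1 PARTITION (dt='20200909')
  SELECT name, number, sex WHERE dt='20200908';
```

If the plan still shows two scans, another option is the DataFrame API: read the partition once, cache() the DataFrame, and issue the two writes from it.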