[jira] [Created] (SPARK-42595) Support query inserted partitions after insert data into table when hive.exec.dynamic.partition=true
zhang haoyan created SPARK-42595:
------------------------------------

             Summary: Support query inserted partitions after insert data into table when hive.exec.dynamic.partition=true
                 Key: SPARK-42595
                 URL: https://issues.apache.org/jira/browse/SPARK-42595
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 3.5.0
            Reporter: zhang haoyan


When hive.exec.dynamic.partition=true and hive.exec.dynamic.partition.mode=nonstrict, we can insert into a table with SQL like 'insert overwrite table aaa partition(dt) select '. Of course, we can tell which partitions were inserted from the SQL itself, but for general-purpose tooling we need a common way to get the inserted partitions, for example:

    spark.sql("insert overwrite table aaa partition(dt) select ")  // insert into the table
    val partitions = getInsertedPartitions()    // need some way to get the inserted partitions
    monitorInsertedPartitions(partitions)       // do something for common use

Since an insert statement should not return any data, this ticket proposes to introduce:

    spark.hive.exec.dynamic.partition.savePartitions=true (default false)
    spark.hive.exec.dynamic.partition.savePartitions.tableNamePrefix=hive_dynamic_inserted_partitions

When spark.hive.exec.dynamic.partition.savePartitions=true, we save the inserted partitions to the temporary view

    ${spark.hive.exec.dynamic.partition.savePartitions.tableNamePrefix}_${dbName}_${tableName}

This will allow users to do the following:

    scala> spark.conf.set("hive.exec.dynamic.partition", true)
    scala> spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")
    scala> spark.conf.set("spark.hive.exec.dynamic.partition.savePartitions", true)
    scala> spark.sql("insert overwrite table db1.test_partition_table partition (dt) select 1, '2023-02-22'").show(false)
    ++
    ||
    ++
    ++

    scala> spark.sql("select * from hive_dynamic_inserted_partitions_db1_test_partition_table").show(false)
    +----------+
    |dt        |
    +----------+
    |2023-02-22|
    +----------+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
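
The view-naming rule described in the ticket can be illustrated with a small plain-Scala sketch. This is only an illustration of the proposed scheme, not code from Spark: the object name InsertedPartitionsView and both helper functions are hypothetical, and the fallback to a "current database" for unqualified table names is an assumption, since the ticket does not say how such names would be handled.

```scala
// Sketch only: the naming scheme below follows the proposal in SPARK-42595.
// Nothing here exists in a released Spark version; all names are hypothetical.
object InsertedPartitionsView {
  // Default prefix as proposed in the ticket.
  val DefaultPrefix = "hive_dynamic_inserted_partitions"

  /** Builds "<prefix>_<dbName>_<tableName>", the view name described in the proposal. */
  def viewName(dbName: String, tableName: String, prefix: String = DefaultPrefix): String =
    s"${prefix}_${dbName}_${tableName}"

  /**
   * Accepts "db.table" or a bare "table".
   * Assumption: a bare table name resolves against `currentDb`.
   */
  def viewNameFor(qualifiedTable: String, currentDb: String = "default"): String =
    qualifiedTable.split('.') match {
      case Array(db, table) => viewName(db, table)
      case Array(table)     => viewName(currentDb, table)
      case _ =>
        throw new IllegalArgumentException(s"Unexpected table name: $qualifiedTable")
    }
}
```

For the table used in the ticket, InsertedPartitionsView.viewNameFor("db1.test_partition_table") yields hive_dynamic_inserted_partitions_db1_test_partition_table, which matches the view queried in the example session. If the proposal were implemented, a generic monitoring job could presumably derive the view name this way and read it after each insert.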
[jira] [Commented] (SPARK-42595) Support query inserted partitions after insert data into table when hive.exec.dynamic.partition=true
[ https://issues.apache.org/jira/browse/SPARK-42595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702418#comment-17702418 ]

zhang haoyan commented on SPARK-42595:
--------------------------------------

Gently ping [~maxgekk] for suggestions

> Support query inserted partitions after insert data into table when hive.exec.dynamic.partition=true
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-42595
>                 URL: https://issues.apache.org/jira/browse/SPARK-42595
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.5.0
>            Reporter: zhang haoyan
>            Priority: Major