[jira] [Commented] (SPARK-42595) Support query inserted partitions after insert data into table when hive.exec.dynamic.partition=true

2023-03-19 Thread zhang haoyan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702418#comment-17702418
 ] 

zhang haoyan commented on SPARK-42595:
--

Gentle ping to [~maxgekk] for suggestions.

> Support query inserted partitions after insert data into table when 
> hive.exec.dynamic.partition=true
> 
>
>                 Key: SPARK-42595
>                 URL: https://issues.apache.org/jira/browse/SPARK-42595
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.5.0
>            Reporter: zhang haoyan
>            Priority: Major
>
> When hive.exec.dynamic.partition=true and 
> hive.exec.dynamic.partition.mode=nonstrict, we can insert into a table with 
> SQL like 'insert overwrite table aaa partition(dt) select '. The inserted 
> partitions can of course be worked out from the SQL itself, but for 
> general-purpose tooling we need a common way to get them, for example:
>     spark.sql("insert overwrite table aaa partition(dt) select ")  // insert table
>     val partitions = getInsertedPartitions()   // need some way to get inserted partitions
>     monitorInsertedPartitions(partitions)    // do something for common use
> Since an insert statement should not return any data, this ticket proposes 
> to introduce two configurations:
>     spark.hive.exec.dynamic.partition.savePartitions=true (default false)
>     spark.hive.exec.dynamic.partition.savePartitions.tableNamePrefix=hive_dynamic_inserted_partitions
> When spark.hive.exec.dynamic.partition.savePartitions=true, the inserted 
> partitions are saved to the temporary view 
> $spark.hive.exec.dynamic.partition.savePartitions.tableNamePrefix_$dbName_$tableName
> This would allow users to do the following:
> scala> spark.conf.set("hive.exec.dynamic.partition", true)
> scala> spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")
> scala> spark.conf.set("spark.hive.exec.dynamic.partition.savePartitions", true)
> scala> spark.sql("insert overwrite table db1.test_partition_table partition (dt) select 1, '2023-02-22'").show(false)
> ++
> ||
> ++
> ++
> scala> spark.sql("select * from hive_dynamic_inserted_partitions_db1_test_partition_table").show(false)
> +----------+
> |dt        |
> +----------+
> |2023-02-22|
> +----------+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-42595) Support query inserted partitions after insert data into table when hive.exec.dynamic.partition=true

2023-02-26 Thread zhang haoyan (Jira)
zhang haoyan created SPARK-42595:


             Summary: Support query inserted partitions after insert data into table when hive.exec.dynamic.partition=true
                 Key: SPARK-42595
                 URL: https://issues.apache.org/jira/browse/SPARK-42595
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 3.5.0
            Reporter: zhang haoyan


When hive.exec.dynamic.partition=true and 
hive.exec.dynamic.partition.mode=nonstrict, we can insert into a table with 
SQL like 'insert overwrite table aaa partition(dt) select '. The inserted 
partitions can of course be worked out from the SQL itself, but for 
general-purpose tooling we need a common way to get them, for example:

    spark.sql("insert overwrite table aaa partition(dt) select ")  // insert table
    val partitions = getInsertedPartitions()   // need some way to get inserted partitions
    monitorInsertedPartitions(partitions)    // do something for common use

Since an insert statement should not return any data, this ticket proposes 
to introduce two configurations:

    spark.hive.exec.dynamic.partition.savePartitions=true (default false)
    spark.hive.exec.dynamic.partition.savePartitions.tableNamePrefix=hive_dynamic_inserted_partitions

When spark.hive.exec.dynamic.partition.savePartitions=true, the inserted 
partitions are saved to the temporary view 
$spark.hive.exec.dynamic.partition.savePartitions.tableNamePrefix_$dbName_$tableName
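
As a minimal sketch of the naming scheme above, the view name could be 
derived as shown below. The object and method names are hypothetical and 
not part of Spark; only the prefix default and the underscore-joined layout 
come from this ticket.

```scala
// Hypothetical helper: not a Spark API, it only mirrors the naming scheme proposed in this ticket.
object InsertedPartitionsView {
  // Default proposed for spark.hive.exec.dynamic.partition.savePartitions.tableNamePrefix.
  val DefaultPrefix = "hive_dynamic_inserted_partitions"

  // Joins prefix, database name, and table name with underscores.
  // Braces are required: "$prefix_" would be read as an identifier named prefix_.
  def viewName(dbName: String, tableName: String, prefix: String = DefaultPrefix): String =
    s"${prefix}_${dbName}_${tableName}"
}
```

For example, InsertedPartitionsView.viewName("db1", "test_partition_table") 
gives hive_dynamic_inserted_partitions_db1_test_partition_table.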

This would allow users to do the following:

scala> spark.conf.set("hive.exec.dynamic.partition", true)
scala> spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")
scala> spark.conf.set("spark.hive.exec.dynamic.partition.savePartitions", true)
scala> spark.sql("insert overwrite table db1.test_partition_table partition (dt) select 1, '2023-02-22'").show(false)
++
||
++
++
scala> spark.sql("select * from hive_dynamic_inserted_partitions_db1_test_partition_table").show(false)
+----------+
|dt        |
+----------+
|2023-02-22|
+----------+
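
For the "common use" step, a caller might turn the rows of that view into 
Hive-style partition specs before handing them to monitoring code. The 
sketch below is only an illustration: PartitionSpecs and toSpecs are 
hypothetical names, and it assumes the rows have already been collected 
from the view as plain strings, one value per partition column.

```scala
// Hypothetical helper, not a Spark API: formats rows read from the proposed view as partition specs.
object PartitionSpecs {
  // columns: partition column names in order; rows: one sequence of values per inserted partition.
  def toSpecs(columns: Seq[String], rows: Seq[Seq[String]]): Seq[String] =
    rows.map { values =>
      require(values.length == columns.length, "each row needs one value per partition column")
      // Pair each column with its value and join the pairs with "/", e.g. "dt=2023-02-22".
      columns.zip(values).map { case (c, v) => s"$c=$v" }.mkString("/")
    }
}
```

With the output shown above, PartitionSpecs.toSpecs(Seq("dt"), 
Seq(Seq("2023-02-22"))) gives Seq("dt=2023-02-22").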


