[ https://issues.apache.org/jira/browse/SPARK-45908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SPARK-45908:
-----------------------------------
    Labels: pull-request-available  (was: )

> write empty parquet file while using partitioned write
> -------------------------------------------------------
>
>                 Key: SPARK-45908
>                 URL: https://issues.apache.org/jira/browse/SPARK-45908
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.5.0
>            Reporter: Paride Casulli
>            Priority: Minor
>              Labels: pull-request-available
>
> Hi,
> I'm currently using PySpark, and if I try to write an empty DataFrame to
> Parquet in a partitioned way, no file is written to the target folder:
>
> df.write.mode("overwrite").partitionBy("BUSINESS_DATE").parquet("/data_dir/"+stg+"/ISS/exchange/WORK_ISSR_EOD_EXT_SETTLEMENT_CA_"+se)
>
> This creates a problem because I have another job that reads the file,
> cannot infer the schema, and raises an error. I implemented a workaround
> like this:
>
> # implemented to handle empty data as well
> def write_partitioned_df(df, partition_col, partition_val, save_path):
>     df.write.mode("overwrite").partitionBy(partition_col).parquet(save_path)
>     if df.isEmpty():
>         df = df.drop(partition_col)
>         df.write.mode("overwrite").parquet(save_path + "/" + partition_col + "=" + partition_val)
>
> This writes an empty Parquet file to the target folder, but it would be
> great to have an option in the write function that avoids this custom
> implementation. I have seen other users asking for this feature on
> Stack Overflow.
>
> Thank you very much
> Paride



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
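
Editor's note: for readers who want to reproduce the behaviour and the
reporter's workaround end to end, below is a minimal self-contained sketch.
The schema, save path, and partition value are hypothetical placeholders,
not taken from the original report; only write_partitioned_df follows the
reporter's code.

# Minimal sketch of the workaround above. Assumes PySpark >= 3.3
# (for DataFrame.isEmpty); schema and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("empty-partitioned-write").getOrCreate()

# An empty DataFrame with a known schema, standing in for the real data.
schema = StructType([
    StructField("BUSINESS_DATE", StringType()),
    StructField("PRICE", DoubleType()),
])
empty_df = spark.createDataFrame([], schema)

def write_partitioned_df(df, partition_col, partition_val, save_path):
    # Normal partitioned write; when df is empty this leaves no data
    # files under save_path, which is the behaviour the issue reports.
    df.write.mode("overwrite").partitionBy(partition_col).parquet(save_path)
    if df.isEmpty():
        # Fall back: write the empty schema (minus the partition column)
        # into a manually named partition directory, so a later reader of
        # save_path can still infer both the schema and the partition.
        df = df.drop(partition_col)
        df.write.mode("overwrite").parquet(
            save_path + "/" + partition_col + "=" + partition_val
        )

write_partitioned_df(empty_df, "BUSINESS_DATE", "2023-11-13",
                     "/tmp/settlement_ca")

# Reading the folder back now succeeds and recovers BUSINESS_DATE as a
# partition column instead of failing with "Unable to infer schema".
spark.read.parquet("/tmp/settlement_ca").printSchema()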