flashJd commented on PR #9113:
URL: https://github.com/apache/hudi/pull/9113#issuecomment-1623593792

   > > How about following the Spark behavior? We should respect the Spark 
configuration `spark.sql.sources.partitionOverwriteMode`: if it's `static`, 
overwrite the whole table; if it's `dynamic`, overwrite only the changed 
partitions.
   > > It appears this is also how Iceberg works.
   > 
   > Agreed, I will update the logic to respect the Spark configuration.
   
   @boneanxs @KnightChess @danny0405 
   1) I tried to respect the Spark configuration 
`spark.sql.sources.partitionOverwriteMode`, but found it is only supported in 
DataSource V2. Since `HoodieInternalV2Table` only supports the `V1_BATCH_WRITE` 
`TableCapability`, it can't implement the `SupportsDynamicOverwrite` interface.
   2) Iceberg respects the Spark config because it implements the 
`SupportsDynamicOverwrite` interface, but it also provides its own 
configuration to control the static/dynamic overwrite semantics: 
   
https://github.com/apache/iceberg/blob/1f1ec4be478feae79b04bcea3e9a8556d8076054/spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkWriteConf.java#L106
   3) Since the V2 `BATCH_WRITE` capability is not supported yet, we can first 
use the Hudi config 
`hoodie.datasource.write.operation = insert_overwrite_table/insert_overwrite` 
to implement the static/dynamic overwrite semantics, and then respect the 
Spark configuration once the V2 write path is supported.
   What do you think?
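   To make point 3 concrete, the proposed interim behavior boils down to a 
mapping from Spark's `partitionOverwriteMode` values to Hudi write operations. 
A minimal sketch in plain Python (the config keys are real; the helper 
function itself is illustrative and not part of the PR):

```python
# Real config keys from the discussion above.
SPARK_KEY = "spark.sql.sources.partitionOverwriteMode"
HUDI_KEY = "hoodie.datasource.write.operation"

def overwrite_operation(mode: str) -> str:
    """Hypothetical helper: map a partitionOverwriteMode value to the
    Hudi write operation that gives the matching overwrite semantics."""
    mode = mode.lower()
    if mode == "static":
        # Static overwrite: replace the entire table.
        return "insert_overwrite_table"
    if mode == "dynamic":
        # Dynamic overwrite: replace only the partitions present
        # in the incoming data.
        return "insert_overwrite"
    raise ValueError(f"unknown {SPARK_KEY} value: {mode}")
```

   Until the V2 write path lands, users would set `hoodie.datasource.write.operation` 
directly; afterwards this mapping could be applied automatically from the Spark setting.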
   

