[ 
https://issues.apache.org/jira/browse/HUDI-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated HUDI-8143:
-----------------------------
    Description: 
We should advise users to always run each table service from just one job 
when multiple jobs write to the same table.

1. Cleaning: concurrent cleaning is allowed, but there are cases where 
concurrent execution from multiple jobs results in duplicate clean commit 
metadata files. (This is the behavior in both 0.x and 1.x.)
2. Compaction: concurrent scheduling is supported because we have completion 
file filtering for the target logs; concurrent execution may be problematic.
3. Clustering: no concurrency is supported.
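
As a rough illustration of the recommendation above, the sketch below builds 
per-job option maps in which a single "owner" job executes every table service. 
The helper name table_service_options and the job names are hypothetical. The 
option keys are Hudi Flink option names as I understand them and should be 
checked against the release in use. Leaving compaction scheduling enabled on 
every job is one reading of point 2, not a confirmed recommendation.

```python
from typing import Dict, List


def table_service_options(job_names: List[str], owner: str) -> Dict[str, Dict[str, str]]:
    """Return per-job options so that only `owner` executes table services.

    Hypothetical helper for illustration; the option keys are assumed to match
    the Hudi Flink options and must be verified for the Hudi version in use.
    """
    if owner not in job_names:
        raise ValueError(f"owner {owner!r} is not one of the jobs")

    options: Dict[str, Dict[str, str]] = {}
    for name in job_names:
        flag = "true" if name == owner else "false"
        options[name] = {
            # Cleaning: execute from one job to avoid duplicate clean commit metadata.
            "clean.async.enabled": flag,
            # Compaction: scheduling may run concurrently; execution stays with the owner.
            "compaction.schedule.enabled": "true",
            "compaction.async.enabled": flag,
            # Clustering: no concurrency supported, so schedule and execute on the owner only.
            "clustering.schedule.enabled": flag,
            "clustering.async.enabled": flag,
        }
    return options


if __name__ == "__main__":
    for job, opts in table_service_options(["ingest_a", "ingest_b"], "ingest_a").items():
        print(job, opts)
```

Each resulting map could be passed as table options to the corresponding job, 
so that the non-owner jobs only write data and, at most, schedule compaction.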

> NB CC across Flink/Spark for multiple writers and all table services (clean, 
> compaction, clustering, ...)
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-8143
>                 URL: https://issues.apache.org/jira/browse/HUDI-8143
>             Project: Apache Hudi
>          Issue Type: Task
>            Reporter: Ethan Guo
>            Assignee: Danny Chen
>            Priority: Blocker
>             Fix For: 1.0.0
>
>
> We should advise users to always run each table service from just one job 
> when multiple jobs write to the same table.
> 1. Cleaning: concurrent cleaning is allowed, but there are cases where 
> concurrent execution from multiple jobs results in duplicate clean commit 
> metadata files. (This is the behavior in both 0.x and 1.x.)
> 2. Compaction: concurrent scheduling is supported because we have completion 
> file filtering for the target logs; concurrent execution may be problematic.
> 3. Clustering: no concurrency is supported.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
