pengxianzi commented on issue #12589:
URL: https://github.com/apache/hudi/issues/12589#issuecomment-2577071674
> Problem Description
>
> We encountered the following issues while using Apache Hudi for data migration and real-time writing:
>
> Scenario 1:
>
> Migrating data from Kudu to a Hudi MOR bucketed table, then writing data
from MySQL via Kafka to the Hudi MOR bucketed table works fine.
>
> Scenario 2:
>
> Migrating data from Kudu to a Hudi COW bucketed table, then writing data
from MySQL via Kafka to the Hudi COW bucketed table fails to generate commits,
and the checkpoint fails.
>
> Error Log
>
> Here is the error log from the failed task:
>
> org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable []. bucket_write default_databases.table_cow ardemmewlcos part of checkpoint 1 could not be completed.
>
> java.util.concurrent.CancellationException: null
>
> org.apache.flink.runtime.checkpoint.CheckpointException: Could not complete snapshot 2 for operator bucket_write: default_database.table_cow (3/4). Failure reason: Checkpoint was declined.
>
> Caused by: org.apache.hudi.exception.HoodieException: Timeout(1201000ms) while waiting for instant initialize
>
> switched from RUNNING to FAILED with failure cause: java.io.IOException: Could not perform checkpoint 2 for operator bucket_write: default_database.table_cow (3/4)#0.
>
> Configuration Parameters
>
> Here is our Hudi table configuration:
>
> options.put("hoodie.upsert.shuffle.parallelism", "20");
> options.put("hoodie.insert.shuffle.parallelism", "20");
> options.put("write.operation", "upsert");
> options.put(FlinkOptions.TABLE_TYPE.key(), name);
> options.put(FlinkOptions.PRECOMBINE_FIELD.key(), precombing);
> options.put(FlinkOptions.PRE_COMBINE.key(), "true");
> options.put("hoodie.clean.automatic", "true");
> options.put("hoodie.cleaner.policy", "KEEP_LATEST_COMMITS");
> options.put("hoodie.cleaner.commits.retained", "5");
> options.put("hoodie.clean.async", "true");
> options.put("hoodie.archive.min.commits", "20");
> options.put("hoodie.archive.max.commits", "30");
> options.put("hoodie.clean.parallelism", "20");
> options.put("hoodie.archive.parallelism", "20");
> options.put("hoodie.write.concurrency.mode", "optimistic_concurrency_control");
> options.put("write.tasks", "20");
> options.put("index.type", "BUCKET");
> options.put("hoodie.bucket.index.num.buckets", "80");
> options.put("hoodie.index.bucket.engine", "SIMPLE");
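One thing worth checking in the options above: `hoodie.write.concurrency.mode` is set to `optimistic_concurrency_control`, and in Hudi, OCC generally requires an external lock provider to be configured alongside it. A minimal sketch of the extra options, assuming a ZooKeeper-based lock provider (the quorum address, port, and paths below are hypothetical placeholders, not values from this issue):

```java
import java.util.HashMap;
import java.util.Map;

public class LockProviderOptions {
    // Sketch only: with hoodie.write.concurrency.mode=optimistic_concurrency_control,
    // Hudi expects a lock provider; the ZooKeeper settings below are placeholders.
    public static Map<String, String> lockOptions() {
        Map<String, String> options = new HashMap<>();
        options.put("hoodie.write.lock.provider",
                "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider");
        options.put("hoodie.write.lock.zookeeper.url", "zk-host");        // placeholder quorum
        options.put("hoodie.write.lock.zookeeper.port", "2181");
        options.put("hoodie.write.lock.zookeeper.lock_key", "table_cow"); // one key per table
        options.put("hoodie.write.lock.zookeeper.base_path", "/hudi/locks");
        return options;
    }

    public static void main(String[] args) {
        System.out.println(lockOptions().get("hoodie.write.lock.provider"));
    }
}
```

If only a single Flink writer ever touches the table, it may also be worth testing with the default concurrency mode instead of OCC, to rule the lock path out entirely.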
>
> Checkpoint Configuration
>
> We tested various checkpoint timeout and interval configurations, but the issue persists:
>
> env.getCheckpointConfig().setCheckpointTimeout(5_60_1000L);
> env.getCheckpointConfig().setMinPauseBetweenCheckpoints(60 * 1000L);
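A side note on the first literal: in Java, underscores in numeric literals are purely visual separators, so `5_60_1000L` is 5601000 ms (about 93 minutes), not `5 * 60 * 1000` = 300000 ms. If a five- or ten-minute timeout was intended, the literal does not express it. A quick check (variable names are ours, for illustration):

```java
public class TimeoutLiteralCheck {
    public static void main(String[] args) {
        // Underscores in Java numeric literals are ignored by the compiler.
        long underscored = 5_60_1000L;      // reads as 5601000 ms, roughly 93 minutes
        long fiveMinutes = 5 * 60 * 1000L;  // 300000 ms
        System.out.println(underscored);
        System.out.println(fiveMinutes);
    }
}
```

Either way, the checkpoint timeout here is well above the 1201000 ms Hudi timeout in the log, so the instant-initialize wait is hit before the Flink checkpoint itself times out.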
>
> Steps to Reproduce
>
> 1. Migrate data from Kudu to a Hudi COW bucketed table.
>
> 2. Write data from MySQL via Kafka to the Hudi COW bucketed table.
>
> 3. The task fails with the error Timeout while waiting for instant initialize.
>
> Expected Behavior
>
> The task should generate commits normally, and the checkpoint should succeed.
>
> Actual Behavior
>
> The task fails, no commits are generated, and the checkpoint fails with an error.
>
> Hudi version : 0.14.0
>
> Spark version : 2.4.7
>
> Hive version : 3.1.3
>
> Hadoop version : 3.1.1
>
> Further Questions
>
> Checkpoint Timeout Issue: The error log mentions Timeout while waiting for instant initialize. Is this related to the initialization mechanism of the Hudi COW table? Are there ways to reduce the initialization time?
>
> COW Table Write Performance: Is the write performance of COW tables lower than that of MOR tables? Are there optimization suggestions for COW tables?
>
> Impact of Bucketing: Does bucketing have a specific impact on the write performance of COW tables? Are there recommended configurations for bucketed tables?
>
> Checkpoint Configuration: We tried various checkpoint timeout and interval
configurations, but the issue persists. Are there recommended checkpoint
configurations?
>
> Summary
>
> We would like to know:
>
> Why does the Hudi COW bucketed table encounter checkpoint timeout issues
during writes?
>
> Are there optimization suggestions for COW table write performance?
>
> Does bucketing have a specific impact on COW table write performance?
>
> Are there recommended checkpoint configurations?
>
> Thank you for your help!