tongyp created FLINK-37975:
------------------------------
Summary: sync a lot sharding table Post Transform failed
Key: FLINK-37975
URL: https://issues.apache.org/jira/browse/FLINK-37975
Project: Flink
Issue Type: Bug
Components: Flink CDC
Affects Versions: cdc-3.4.0
Reporter: tongyp
env: flinkcdc 3.4 pipeline + flink 1.20.1
i sync about 10000 sharding tables to doris,the task will failed。the jobmanager
log is
```python
2025-06-17 17:41:25,211 INFO
org.apache.flink.cdc.connectors.mysql.source.enumerator.MySqlSourceEnumerator
[] - The enumerator assigns split MySqlSnapshotSplit\{tableId=xxx,
splitId='xx', splitKeyType=[`id` BIGINT NOT NULL], splitStart=null,
splitEnd=null, highWatermark=null} to subtask 0
2025-06-17 17:41:26,070 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph [] -
Transform:Data -> SchemaOperator -> PrePartition (1/1)
(219052134362b9a800a3f2246605fbac_90bea66de1c231edf33913ecd54406c1_0_0)
switched from RUNNING to FAILED on localhost:44643-f3a1f2 @ localhost
(dataPort=40666).
org.apache.flink.util.SerializedThrowable:
org.apache.flink.cdc.runtime.operators.transform.exceptions.TransformException:
Failed to post-transform with
CreateTableEvent\{xxx}
for table
xx
from schema
columns={xxx)
to schema
columns={xxx).
at
org.apache.flink.cdc.runtime.operators.transform.PostTransformOperator.processElement(PostTransformOperator.java:146)
~[?:?]
at
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:238)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:157)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:114)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:638)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:973)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:917)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:970)
~[flink-dist-1.20.1.jar:1.20.1]
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:949)
~[flink-dist-1.20.1.jar:1.20.1]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:763)
~[flink-dist-1.20.1.jar:1.20.1]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
~[flink-dist-1.20.1.jar:1.20.1]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
Caused by: org.apache.flink.util.SerializedThrowable:
org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException:
Could not forward element to next operator
at
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:92)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:50)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:29)
~[flink-dist-1.20.1.jar:1.20.1]
at java.util.Optional.ifPresent(Optional.java:183) ~[?:?]
at
org.apache.flink.cdc.runtime.operators.transform.PostTransformOperator.processElementInternal(PostTransformOperator.java:175)
~[?:?]
at
org.apache.flink.cdc.runtime.operators.transform.PostTransformOperator.processElement(PostTransformOperator.java:130)
~[?:?]
... 13 more
Caused by: org.apache.flink.util.SerializedThrowable:
java.lang.IllegalStateException: Failed to send request to coordinator:
SchemaChangeRequest\{xxx}
at
org.apache.flink.cdc.runtime.operators.schema.regular.SchemaOperator.sendRequestToCoordinator(SchemaOperator.java:241)
~[?:?]
at
org.apache.flink.cdc.runtime.operators.schema.regular.SchemaOperator.requestSchemaChange(SchemaOperator.java:227)
~[?:?]
at
org.apache.flink.cdc.runtime.operators.schema.regular.SchemaOperator.handleSchemaChangeEvent(SchemaOperator.java:173)
~[?:?]
at
org.apache.flink.cdc.runtime.operators.schema.regular.SchemaOperator.processElement(SchemaOperator.java:148)
~[?:?]
at
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:75)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:50)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:29)
~[flink-dist-1.20.1.jar:1.20.1]
at java.util.Optional.ifPresent(Optional.java:183) ~[?:?]
at
org.apache.flink.cdc.runtime.operators.transform.PostTransformOperator.processElementInternal(PostTransformOperator.java:175)
~[?:?]
at
org.apache.flink.cdc.runtime.operators.transform.PostTransformOperator.processElement(PostTransformOperator.java:130)
~[?:?]
... 13 more
Caused by: org.apache.flink.util.SerializedThrowable:
java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException:
Invocation of
[RemoteRpcInvocation(JobMasterOperatorEventGateway.sendRequestToCoordinator(OperatorID,
SerializedValue))] at recipient
[pekko.tcp://flink@localhost:6123/user/rpc/jobmanager_2] timed out. This is
usually caused by: 1) Pekko failed sending the message silently, due to
problems like oversized payload or serialization failures. In that case, you
should find detailed error information in the logs. 2) The recipient needs more
time for responding, due to problems like slow machines or network jitters. In
that case, you can try to increase pekko.ask.timeout.
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
~[?:?]
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2028)
~[?:?]
at
org.apache.flink.cdc.runtime.operators.schema.regular.SchemaOperator.sendRequestToCoordinator(SchemaOperator.java:238)
~[?:?]
at
org.apache.flink.cdc.runtime.operators.schema.regular.SchemaOperator.requestSchemaChange(SchemaOperator.java:227)
~[?:?]
at
org.apache.flink.cdc.runtime.operators.schema.regular.SchemaOperator.handleSchemaChangeEvent(SchemaOperator.java:173)
~[?:?]
at
org.apache.flink.cdc.runtime.operators.schema.regular.SchemaOperator.processElement(SchemaOperator.java:148)
~[?:?]
at
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:75)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:50)
~[flink-dist-1.20.1.jar:1.20.1]
at
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:29)
~[flink-dist-1.20.1.jar:1.20.1]
at java.util.Optional.ifPresent(Optional.java:183) ~[?:?]
at
org.apache.flink.cdc.runtime.operators.transform.PostTransformOperator.processElementInternal(PostTransformOperator.java:175)
~[?:?]
at
org.apache.flink.cdc.runtime.operators.transform.PostTransformOperator.processElement(PostTransformOperator.java:130)
~[?:?]
... 13 more
Caused by: org.apache.flink.util.SerializedThrowable:
java.util.concurrent.TimeoutException: Invocation of
[RemoteRpcInvocation(JobMasterOperatorEventGateway.sendRequestToCoordinator(OperatorID,
SerializedValue))] at recipient
[pekko.tcp://flink@localhost:6123/user/rpc/jobmanager_2] timed out. This is
usually caused by: 1) Pekko failed sending the message silently, due to
problems like oversized payload or serialization failures. In that case, you
should find detailed error information in the logs. 2) The recipient needs more
time for responding, due to problems like slow machines or network jitters. In
that case, you can try to increase pekko.ask.timeout.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)