[ https://issues.apache.org/jira/browse/FLINK-27608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536840#comment-17536840 ]
Zhilong Hong commented on FLINK-27608:
--------------------------------------

When a {{PartitionNotFoundException}} is thrown, it is handled by the logic located at {{org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler:298}}. The {{Task}} will try to {{requestPartitionProducerState}} from the JobManager. If the upstream task is not ready (i.e. in the DEPLOYING or INITIALIZING state), the {{SingleInputGate}} will retrigger the partition request until the partition is consumable.

> Flink may throw PartitionNotFound Exception if the downstream task reached
> Running state earlier than its upstream task
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-27608
>                 URL: https://issues.apache.org/jira/browse/FLINK-27608
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Network
>    Affects Versions: 1.14.2
>            Reporter: zlzhang0122
>            Priority: Major
>             Fix For: 1.16.0
>
>
> Flink streaming job deployment may throw PartitionNotFound Exception if the
> downstream task reached Running state earlier than its upstream task and
> after the maximum backoff for partition requests has passed. But the value of
> taskmanager.network.request-backoff.max is not easy to decide. Can we use a
> loop that waits until the upstream task's partition is ready?
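
To make the trade-off around {{taskmanager.network.request-backoff.max}} concrete, here is a minimal sketch of a bounded, doubling backoff for partition requests. It is not Flink's actual code: the class {{PartitionRequestBackoff}}, the method {{requestWithBackoff}}, and the {{partitionReady}} and {{sleeper}} parameters are hypothetical names introduced for illustration, and the values 100 and 10000 are example settings only.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongConsumer;

/** Hypothetical sketch of a bounded, doubling backoff; not a Flink class. */
public class PartitionRequestBackoff {
    private final int initialBackoffMs;
    private final int maxBackoffMs;
    private int currentBackoffMs = 0; // 0 means no retry has happened yet

    public PartitionRequestBackoff(int initialBackoffMs, int maxBackoffMs) {
        if (initialBackoffMs <= 0 || maxBackoffMs < initialBackoffMs) {
            throw new IllegalArgumentException("need 0 < initial <= max");
        }
        this.initialBackoffMs = initialBackoffMs;
        this.maxBackoffMs = maxBackoffMs;
    }

    /** Moves to the next backoff value; returns false once the maximum was already used. */
    public boolean increaseBackoff() {
        if (currentBackoffMs == 0) {
            currentBackoffMs = initialBackoffMs;
            return true;
        }
        if (currentBackoffMs < maxBackoffMs) {
            // Double the wait, capped at the configured maximum.
            currentBackoffMs = (int) Math.min(2L * currentBackoffMs, maxBackoffMs);
            return true;
        }
        return false;
    }

    public int getCurrentBackoffMs() {
        return currentBackoffMs;
    }

    /**
     * Re-requests the partition until it is ready or the backoff budget is spent.
     * Returns the number of attempts made, or -1 when giving up (the point at
     * which a PartitionNotFoundException would reach the user).
     */
    public static int requestWithBackoff(
            BooleanSupplier partitionReady, int initialMs, int maxMs, LongConsumer sleeper) {
        PartitionRequestBackoff backoff = new PartitionRequestBackoff(initialMs, maxMs);
        int attempts = 0;
        while (true) {
            attempts++;
            if (partitionReady.getAsBoolean()) {
                return attempts;
            }
            if (!backoff.increaseBackoff()) {
                return -1;
            }
            sleeper.accept(backoff.getCurrentBackoffMs());
        }
    }
}
```

In this sketch, an initial backoff of 100 ms and a maximum of 10000 ms give waits of 100, 200, 400, 800, 1600, 3200, 6400 and 10000 ms, so the request gives up after 22700 ms in total if the upstream partition never becomes ready. That fixed budget is why a single maximum value is hard to choose for every deployment, and why the issue asks for a loop that waits on the upstream task's state instead.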