[ 
https://issues.apache.org/jira/browse/FLINK-27608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536840#comment-17536840
 ] 

Zhilong Hong commented on FLINK-27608:
--------------------------------------

When a {{PartitionNotFoundException}} is thrown, it is handled by the logic at 
{{org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler:298}}.
 The {{Task}} then tries to {{requestPartitionProducerState}} from the 
JobManager. If the upstream task is not ready (i.e. in the DEPLOYING or 
INITIALIZING state), the {{SingleInputGate}} retriggers the partition 
request, and keeps doing so until the partition is consumable.
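
To make the flow above concrete, here is a minimal, self-contained sketch of the decision it describes. This is not Flink's actual code: the {{ProducerState}} enum, {{shouldRetrigger}} and {{countRetriggers}} are hypothetical names made up for illustration, and treating RUNNING as "partition is consumable" is a simplifying assumption.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionRetrySketch {

    // Hypothetical, simplified producer states; NOT Flink's real ExecutionState enum.
    enum ProducerState { DEPLOYING, INITIALIZING, RUNNING, FAILED }

    // The consumer retriggers its partition request only while the
    // upstream (producer) task is still starting up.
    static boolean shouldRetrigger(ProducerState state) {
        return state == ProducerState.DEPLOYING || state == ProducerState.INITIALIZING;
    }

    // Walks through the producer states that a hypothetical JobManager reports,
    // one per failed partition request, and counts how many times the request
    // is retriggered before the producer has left the start-up states.
    static int countRetriggers(List<ProducerState> reportedStates) {
        int retriggers = 0;
        for (ProducerState state : reportedStates) {
            if (!shouldRetrigger(state)) {
                break; // assumption: producer is ready (or gone), stop retriggering
            }
            retriggers++;
        }
        return retriggers;
    }

    public static void main(String[] args) {
        List<ProducerState> reported = Arrays.asList(
                ProducerState.DEPLOYING,
                ProducerState.INITIALIZING,
                ProducerState.RUNNING);
        // Two retriggers: one for DEPLOYING, one for INITIALIZING.
        System.out.println(countRetriggers(reported)); // prints 2
    }
}
```

In this sketch the retry is driven by the reported producer state, not by a fixed number of attempts.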

> Flink may throw PartitionNotFound Exception if the downstream task reached 
> Running state earlier than its upstream task
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-27608
>                 URL: https://issues.apache.org/jira/browse/FLINK-27608
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Network
>    Affects Versions: 1.14.2
>            Reporter: zlzhang0122
>            Priority: Major
>             Fix For: 1.16.0
>
>
> Flink streaming job deployment may throw PartitionNotFound Exception if the 
> downstream task reached Running state earlier than its upstream task and 
> after maximum backoff for partition requests passed. But the config of 
> taskmanager.network.request-backoff.max is not easy to decide. Can we use a 
> loop awaiting the upstream task partition to be ready?
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
