[jira] [Commented] (FLINK-29923) Hybrid Shuffle may face deadlock when running a task need to execute big size data

Weijie Guo (Jira) Mon, 07 Nov 2022 20:23:06 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-29923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630171#comment-17630171
 ]


Weijie Guo commented on FLINK-29923:
------------------------------------

[~AlexXXX] Thanks for the feedback. If I'm not mistaken, the failure is caused 
by insufficient network memory or batch read memory, which is expected 
behavior: pipelined execution requires more resources than all-blocking 
execution. What we need to solve now is the task thread getting stuck. Could 
you provide more details, such as a thread dump of the stuck subtask? Also, if 
the problem is hard to describe clearly here, you can contact me offline via 
WeChat (a644813550) or any other channel you prefer.

> Hybrid Shuffle may face deadlock when running a task need to execute big size 
> data
> ----------------------------------------------------------------------------------
>
>                 Key: FLINK-29923
>                 URL: https://issues.apache.org/jira/browse/FLINK-29923
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Network
>    Affects Versions: 1.16.0
>            Reporter: AlexHu
>            Priority: Major
>         Attachments: 性能差距.png, 死锁2-select.png, 死锁检测.png
>
>
> Flink 1.16 offers hybrid shuffle, which combines the advantages of blocking 
> shuffle and pipelined shuffle. But when I tested this new feature, I found 
> that it may deadlock while running. 
> The job runs well at the beginning. However, once it has processed a certain 
> amount of data, it may fail because of the buffer size, and if I set a bigger 
> size, it may keep running without processing any data, as shown in the 
> attached picture. So I would like to know the cause of this problem and a 
> solution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
