[ 
https://issues.apache.org/jira/browse/YARN-10058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890288#comment-17890288
 ] 

ASF GitHub Bot commented on YARN-10058:
---------------------------------------

TaoYang526 commented on PR #1781:
URL: https://github.com/apache/hadoop/pull/1781#issuecomment-2418328890

   @tuyu Thanks for the contribution. It is truly harmful when the scheduling 
thread exits silently without triggering any state transition. I think it's 
reasonable to let the RM transition to standby when HA is enabled, or otherwise 
just shut down, so that the scheduling thread failure gets noticed instead of 
going undetected while the system stops working. 
   +1 for the patch. Would you mind adding a unit test (UT) for this?
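
For illustration only, here is a minimal sketch of the behavior discussed above: a worker thread that dies with an uncaught exception triggers an explicit reaction instead of exiting silently. It is not the code in the attached patch and does not use any Hadoop classes; `SchedulingThreadGuard`, `Action`, `onFatal`, and `runAndObserve` are hypothetical names, and the real RM transition or shutdown logic is reduced to an enum value.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class SchedulingThreadGuard {
    /** Hypothetical reactions to a fatal scheduling-thread failure. */
    enum Action { TRANSITION_TO_STANDBY, SHUTDOWN }

    /** Standby when HA is enabled, otherwise shut down (assumed policy). */
    static Action onFatal(boolean haEnabled) {
        return haEnabled ? Action.TRANSITION_TO_STANDBY : Action.SHUTDOWN;
    }

    /** Starts a worker that crashes and returns the action its handler chose. */
    static Action runAndObserve(boolean haEnabled) {
        AtomicReference<Action> observed = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);

        // Stand-in for the async scheduling thread; it fails immediately.
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("simulated scheduling failure");
        }, "async-schedule-sketch");

        // Without a handler the thread would just die; here the crash is
        // turned into an explicit, observable decision.
        worker.setUncaughtExceptionHandler((t, e) -> {
            observed.set(onFatal(haEnabled));
            done.countDown();
        });

        worker.start();
        try {
            // Wait (bounded) until the handler has run.
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return observed.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndObserve(true));
        System.out.println(runAndObserve(false));
    }
}
```

In a real ResourceManager the handler would have to hand the failure to whatever component owns the HA state and shutdown handling; the sketch only shows that the crash becomes a visible event rather than a silent thread exit.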




> Capacity Scheduler dispatcher hang when async thread crash
> ----------------------------------------------------------
>
>                 Key: YARN-10058
>                 URL: https://issues.apache.org/jira/browse/YARN-10058
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 3.2.0, 3.2.1
>            Reporter: tuyu
>            Priority: Major
>             Fix For: 3.2.1
>
>         Attachments: 0001-global-scheduling-standby-hang.patch
>
>
> When the capacity scheduler has global scheduling enabled and the global 
> scheduler's AsyncScheduleThread crashes, the capacity scheduler dispatcher 
> hangs for a long time. This behavior is unreasonable. 
> If this happens in HA mode, the current RM should transition to standby.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

