[jira] [Updated] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10557: --- Affects Version/s: 3.2.1 > Application may be leaked in state store when resourcemanager failover. >

[jira] [Created] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
zhengchenyu created YARN-10557: -- Summary: Application may be leaked in state store when resourcemanager failover. Key: YARN-10557 URL: https://issues.apache.org/jira/browse/YARN-10557 Project: Hadoop YAR

[jira] [Updated] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10557: --- Fix Version/s: 3.3.1 > Application may be leaked in state store when resourcemanager failover. >

[jira] [Updated] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10557: --- Component/s: RM > Application may be leaked in state store when resourcemanager failover. > -

[jira] [Assigned] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu reassigned YARN-10557: -- Assignee: zhengchenyu > Application may be leaked in state store when resourcemanager failover

[jira] [Updated] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10557: --- Labels: resourcemanager (was: ) > Application may be leaked in state store when resourcemanager fail

[jira] [Updated] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10557: --- Component/s: (was: RM) resourcemanager > Application may be leaked in state stor

[jira] [Updated] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10557: --- Description: In resourceManager log, I found amount of log like below: {code} 2020-12-30 19:18:48,12

[jira] [Resolved] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu resolved YARN-10557. Release Note: YARN-9848 Resolution: Duplicate I think it is a duplicate of YARN-9848. > Applica

[jira] [Created] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
zhengchenyu created YARN-10642: -- Summary: ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop Key: YARN-10642 URL: https://issues.apache.org/jira/browse/YA

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Description: In our cluster, ResourceManager got stuck twice within twenty days. Yarn client can't submit

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Description: In our cluster, ResourceManager got stuck twice within twenty days. Yarn client can't submit

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: put.png > ResourceManager may keep stuck, because AsyncDispatcher's > printEventQueueDet

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: take.png > ResourceManager may keep stuck, because AsyncDispatcher's > printEventQueueDe

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: debugfornode.png > ResourceManager may keep stuck, because AsyncDispatcher's > printEven

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: deadloop.png > ResourceManager may keep stuck, because AsyncDispatcher's > printEventQue

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: MockForDeadLoop.java > ResourceManager may keep stuck, because AsyncDispatcher's > print

[jira] [Commented] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287685#comment-17287685 ] zhengchenyu commented on YARN-10642: If you feel description is too long, you only ne

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642.001.patch > ResourceManager may keep stuck, because AsyncDispatcher's > print

[jira] [Commented] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287713#comment-17287713 ] zhengchenyu commented on YARN-10642: YARN-10221 is the same problem, but no real reas

[jira] [Updated] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10643: --- Attachment: YARN-10643.001.patch > Fix the race condition introduced by YARN-8995. >

[jira] [Comment Edited] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287714#comment-17287714 ] zhengchenyu edited comment on YARN-10643 at 2/20/21, 3:27 PM: -

[jira] [Commented] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287714#comment-17287714 ] zhengchenyu commented on YARN-10643: Just use Iterator() could solve this problem. Yo

[jira] [Comment Edited] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287713#comment-17287713 ] zhengchenyu edited comment on YARN-10642 at 2/20/21, 3:28 PM: -

[jira] [Assigned] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu reassigned YARN-10643: -- Assignee: zhengchenyu (was: Qi Zhu) > Fix the race condition introduced by YARN-8995. > -

[jira] [Commented] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287723#comment-17287723 ] zhengchenyu commented on YARN-10643: I think you need to know why it is stuck. Please discu

[jira] [Commented] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287728#comment-17287728 ] zhengchenyu commented on YARN-10642: [~zhuqi] That's OK. I also found, but sorry for

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642.002.patch > ResourceManager may keep stuck, because AsyncDispatcher's > print

[jira] [Updated] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-02-21 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Summary: AsyncDispatcher will stuck introduced by YARN-8995. (was: ResourceManager may keep stuck, b

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-21 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642.003.patch > ResourceManager may keep stuck, because AsyncDispatcher's > print

[jira] [Commented] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-02-21 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288223#comment-17288223 ] zhengchenyu commented on YARN-10642: I added a unit test in YARN-10642.003.patch which r

[jira] [Comment Edited] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-02-21 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288223#comment-17288223 ] zhengchenyu edited comment on YARN-10642 at 2/22/21, 7:38 AM: -

[jira] [Comment Edited] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-02-21 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288223#comment-17288223 ] zhengchenyu edited comment on YARN-10642 at 2/22/21, 7:40 AM: -

[jira] [Updated] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-02-21 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Description: In our cluster, ResourceManager got stuck twice within twenty days. Yarn client can't submit

[jira] [Updated] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-03-03 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642.004.patch > AsyncDispatcher will stuck introduced by YARN-8995. >

[jira] [Commented] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-03-03 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294982#comment-17294982 ] zhengchenyu commented on YARN-10642: Okay, I submit YARN-10642.004.patch which repair

[jira] [Comment Edited] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-03-03 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294982#comment-17294982 ] zhengchenyu edited comment on YARN-10642 at 3/4/21, 4:15 AM: -

[jira] [Commented] (YARN-10221) Nodemanager lockups on printEventQueueDetails

2021-03-03 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294988#comment-17294988 ] zhengchenyu commented on YARN-10221: Please follow YARN-10642 which explain the reaso

[jira] [Updated] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-03-04 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642.005.patch > AsyncDispatcher will stuck introduced by YARN-8995. >

[jira] [Commented] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-03-04 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295286#comment-17295286 ] zhengchenyu commented on YARN-10642: [~pbacsko] Okay, work done in YARN-10642.005.pat

[jira] [Commented] (YARN-10642) Race condition: AsyncDispatcher can get stuck by the changes introduced in YARN-8995

2021-03-05 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296075#comment-17296075 ] zhengchenyu commented on YARN-10642: [~pbacsko] Yes, I think it's Java's bug. Then I

[jira] [Commented] (YARN-10642) Race condition: AsyncDispatcher can get stuck by the changes introduced in YARN-8995

2021-03-05 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296077#comment-17296077 ] zhengchenyu commented on YARN-10642: [~pbacsko] Okay, I will submit patches for branch

[jira] [Updated] (YARN-10642) Race condition: AsyncDispatcher can get stuck by the changes introduced in YARN-8995

2021-03-05 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642-branch-3.2.001.patch > Race condition: AsyncDispatcher can get stuck by the ch

[jira] [Updated] (YARN-10642) Race condition: AsyncDispatcher can get stuck by the changes introduced in YARN-8995

2021-03-05 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642-branch-3.3.001.patch > Race condition: AsyncDispatcher can get stuck by the ch

[jira] [Comment Edited] (YARN-10642) Race condition: AsyncDispatcher can get stuck by the changes introduced in YARN-8995

2021-03-05 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296075#comment-17296075 ] zhengchenyu edited comment on YARN-10642 at 3/5/21, 4:01 PM: -

[jira] [Commented] (YARN-6202) Configuration item Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY is disregarded

2021-03-29 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310564#comment-17310564 ] zhengchenyu commented on YARN-6202: --- [~yufeigu] I agree that exitOnDispatchException sho

[jira] [Commented] (YARN-11183) Federation: Remove outdated ApplicationHomeSubCluster in federation state store.

2022-11-08 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630300#comment-17630300 ] zhengchenyu commented on YARN-11183: [~goiri]  Hi, can you please review this PR? the

[jira] [Created] (YARN-11549) Add MiniRouterYarnCluster for test

2023-08-12 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11549: -- Summary: Add MiniRouterYarnCluster for test Key: YARN-11549 URL: https://issues.apache.org/jira/browse/YARN-11549 Project: Hadoop YARN Issue Type: Improvement

[jira] [Updated] (YARN-11153) Make proxy server support YARN federation.

2023-08-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11153: --- Parent Issue: YARN-5597 (was: YARN-10775) > Make proxy server support YARN federation. > ---

[jira] [Updated] (YARN-11154) Make router support proxy server.

2023-08-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11154: --- Parent Issue: YARN-5597 (was: YARN-10775) > Make router support proxy server. >

[jira] [Updated] (YARN-11153) Make proxy server support YARN federation.

2023-08-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11153: --- Description: I setup a yarn federation cluster, I can't connect the running app web, but the complet

[jira] [Updated] (YARN-11153) Make proxy server support YARN federation.

2023-08-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11153: --- Attachment: YARN-10775-design-doc.001.pdf > Make proxy server support YARN federation. >

[jira] [Updated] (YARN-11153) Make proxy server support YARN federation.

2023-08-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11153: --- Description: I setup a yarn federation cluster, I can't connect the running app web, but the complet

[jira] [Assigned] (YARN-8980) Mapreduce application container start fail after AM restart.

2023-08-22 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-8980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu reassigned YARN-8980: - Assignee: zhengchenyu (was: Shilun Fan) > Mapreduce application container start fail after AM r

[jira] [Created] (YARN-11564) The default value of sub cluster cleaner interval was converted unexpectedly

2023-09-07 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11564: -- Summary: The default value of sub cluster cleaner interval was converted unexpectedly Key: YARN-11564 URL: https://issues.apache.org/jira/browse/YARN-11564 Project: Hadoo

[jira] [Updated] (YARN-11564) Fix wrong config in yarn-default.xml

2023-09-07 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11564: --- Description: yarn.router.subcluster.cleaner.interval.time is duplicated in yarn-default.xml (was: So

[jira] [Updated] (YARN-11564) Fix wrong config in yarn-default.xml

2023-09-07 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11564: --- Summary: Fix wrong config in yarn-default.xml (was: The default value of sub cluster cleaner interva

[jira] [Created] (YARN-11565) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.

2023-09-08 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11565: -- Summary: Container logs are missing when yarn.app.container.log.filesize is set to default value 0. Key: YARN-11565 URL: https://issues.apache.org/jira/browse/YARN-11565

[jira] [Updated] (YARN-11565) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.

2023-09-08 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11565: --- Description: Since HADOOP-18649, log4j.appender.\{APPENDER}.MaxFileSize is set to ${yarn.app.contain

[jira] [Updated] (YARN-11565) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.

2023-09-08 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11565: --- Description: Since HADOOP-18649, log4j.appender.\{APPENDER}.MaxFileSize is set to ${yarn.app.contain

[jira] [Updated] (YARN-11565) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.

2023-09-08 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11565: --- Description: Since HADOOP-18649, in container-log4j.properties, log4j.appender.\{APPENDER}.MaxFileSi

[jira] [Updated] (YARN-11565) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.

2023-09-08 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11565: --- Description: Since HADOOP-18649, in container-log4j.properties, log4j.appender.\{APPENDER}.MaxFileSi

[jira] [Updated] (YARN-11565) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.

2023-09-08 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11565: --- Description: Since HADOOP-18649, in container-log4j.properties, log4j.appender.\{APPENDER}.MaxFileSi

[jira] [Updated] (YARN-11565) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.

2023-09-08 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11565: --- Description: Since HADOOP-18649, in container-log4j.properties, log4j.appender.\{APPENDER}.MaxFileSi

[jira] [Resolved] (YARN-11565) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.

2023-09-08 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu resolved YARN-11565. Resolution: Duplicate This Jira should be in mapreduce module. > Container logs are missing when y

[jira] [Updated] (YARN-11565) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.

2023-09-08 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11565: --- Labels: (was: pull-request-available) > Container logs are missing when yarn.app.container.log.file

[jira] [Updated] (YARN-11565) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.

2023-09-08 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11565: --- Target Version/s: (was: 3.4.0) > Container logs are missing when yarn.app.container.log.filesize is

[jira] [Comment Edited] (YARN-11565) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.

2023-09-08 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17763087#comment-17763087 ] zhengchenyu edited comment on YARN-11565 at 9/8/23 1:34 PM: T

[jira] [Created] (YARN-11566) Yarn app kill command can not kill the application in secondary sub cluster.

2023-09-11 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11566: -- Summary: Yarn app kill command can not kill the application in secondary sub cluster. Key: YARN-11566 URL: https://issues.apache.org/jira/browse/YARN-11566 Project: Hadoo

[jira] [Updated] (YARN-11566) Yarn app kill command can not kill the application in secondary sub cluster.

2023-09-11 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11566: --- Description: When AMRMProxy is enabled, the application may allocate container among multi sub cluster

[jira] [Updated] (YARN-11566) Yarn app kill command can not kill the application in secondary sub cluster.

2023-09-11 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11566: --- Description: When AMRMProxy is enabled, the application may allocate container among multi sub cluster

[jira] [Commented] (YARN-11566) Yarn app kill command can not kill the application in secondary sub cluster.

2023-09-12 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764465#comment-17764465 ] zhengchenyu commented on YARN-11566: There are two ways to solve this problem: (1) C

[jira] [Comment Edited] (YARN-11566) Yarn app kill command can not kill the application in secondary sub cluster.

2023-09-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764465#comment-17764465 ] zhengchenyu edited comment on YARN-11566 at 9/13/23 10:00 AM: -

[jira] [Updated] (YARN-11566) Yarn app kill command can not kill the application in secondary sub cluster.

2023-09-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11566: --- Description: When AMRMProxy is enabled, the application may allocate container among multi sub cluste

[jira] [Comment Edited] (YARN-11566) Yarn app kill command can not kill the application in secondary sub cluster.

2023-09-14 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764465#comment-17764465 ] zhengchenyu edited comment on YARN-11566 at 9/14/23 10:54 AM: -

[jira] [Updated] (YARN-11566) Yarn app kill command can not kill the application in secondary sub cluster.

2023-09-15 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11566: --- Issue Type: Bug (was: Improvement) > Yarn app kill command can not kill the application in secondary

[jira] [Commented] (YARN-10174) Add colored policies to enable manual load balancing across sub clusters

2023-09-17 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766228#comment-17766228 ] zhengchenyu commented on YARN-10174: I think [~youchen] means that we can get weights

[jira] [Commented] (YARN-10174) Add colored policies to enable manual load balancing across sub clusters

2023-09-26 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769401#comment-17769401 ] zhengchenyu commented on YARN-10174: [~slfan1989] Thanks for your reply. Adjusting pa

[jira] [Created] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2021-05-18 Thread zhengchenyu (Jira)
zhengchenyu created YARN-10775: -- Summary: Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address. Key: YARN-10775 URL: https://issues.apache.org/jira/brows

[jira] [Assigned] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2021-05-18 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu reassigned YARN-10775: -- Assignee: zhengchenyu > Federation: Yarn running app web can't be unable to connect, because

[jira] [Created] (YARN-10776) Make ConfiguredRMFailoverProxyProvider select ResourceManager or Router randomly

2021-05-18 Thread zhengchenyu (Jira)
zhengchenyu created YARN-10776: -- Summary: Make ConfiguredRMFailoverProxyProvider select ResourceManager or Router randomly Key: YARN-10776 URL: https://issues.apache.org/jira/browse/YARN-10776 Project: H

[jira] [Assigned] (YARN-10776) Make ConfiguredRMFailoverProxyProvider select ResourceManager or Router randomly

2021-05-18 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu reassigned YARN-10776: -- Assignee: zhengchenyu > Make ConfiguredRMFailoverProxyProvider select ResourceManager or Route

[jira] [Commented] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2021-05-31 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17354290#comment-17354290 ] zhengchenyu commented on YARN-10775: I think maybe we need to construct a proxy serve

[jira] [Commented] (YARN-10786) Federation:We can't access the AM page while using federation

2021-05-31 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17354292#comment-17354292 ] zhengchenyu commented on YARN-10786: I don't think it's a good way to solve this prob

[jira] [Issue Comment Deleted] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2021-05-31 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10775: --- Comment: was deleted (was: I think maybe we need to construct a proxy server in nm to proxy am's web

[jira] [Created] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11127: -- Summary: Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention. Key: YARN-11127 URL: https://issues.apache.org/jira/b

[jira] [Updated] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11127: --- Description: I found rm deadlock in our cluster. It's a low probability event. some critical jstack

[jira] [Updated] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11127: --- Description: I found rm deadlock in our cluster. It's a low probability event. some critical jstack

[jira] [Updated] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11127: --- Description: I found rm deadlock in our cluster. It's a low probability event. some critical jstack

[jira] [Commented] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532824#comment-17532824 ] zhengchenyu commented on YARN-11127: aggregateLogReport introduce by YARN-1376 then t

[jira] [Comment Edited] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532824#comment-17532824 ] zhengchenyu edited comment on YARN-11127 at 5/6/22 12:07 PM: -

[jira] [Commented] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532828#comment-17532828 ] zhengchenyu commented on YARN-11127: [~vinodkv] [~bteke] [~pbacsko] [~bilwa_st] [~z

[jira] [Created] (YARN-11132) RM failover may fail when Dispatcher stuck.

2022-05-06 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11132: -- Summary: RM failover may fail when Dispatcher stuck. Key: YARN-11132 URL: https://issues.apache.org/jira/browse/YARN-11132 Project: Hadoop YARN Issue Type: Impro

[jira] [Commented] (YARN-11132) RM failover may fail when Dispatcher stuck.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533203#comment-17533203 ] zhengchenyu commented on YARN-11132: I think we could watch the head element of event

[jira] [Commented] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533204#comment-17533204 ] zhengchenyu commented on YARN-11127: Another problem is that When dispatcher thread s

[jira] [Updated] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-07 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11127: --- Description: I found rm deadlock in our cluster. It's a low probability event. some critical jstack

[jira] [Updated] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10775: --- Description: I setup a yarn federation cluster, I can't connect the running app web, but the complet

[jira] [Created] (YARN-11148) In federation and security mode, nm recover may fail.

2022-05-13 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11148: -- Summary: In federation and security mode, nm recover may fail. Key: YARN-11148 URL: https://issues.apache.org/jira/browse/YARN-11148 Project: Hadoop YARN Issue T

[jira] [Updated] (YARN-11148) In federation and security mode, nm recover may fail.

2022-05-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11148: --- Description: Exception stack {code:java} 2022-05-08 00:44:11,536 WARN org.apache.hadoop.ipc.Client: E

[jira] [Updated] (YARN-11148) In federation and security mode, nm recover may fail.

2022-05-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11148: --- Description: Exception stack {code:java} 2022-05-08 00:44:11,536 WARN org.apache.hadoop.ipc.Client: E
