[jira] [Updated] (YARN-8464) Async scheduling thread could be interrupted when there are no NodeManagers in cluster
[ https://issues.apache.org/jira/browse/YARN-8464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-8464:
-----------------------------
    Fix Version/s: 3.2.0

> Async scheduling thread could be interrupted when there are no NodeManagers in cluster
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-8464
>                 URL: https://issues.apache.org/jira/browse/YARN-8464
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Charan Hebri
>            Assignee: Sunil Govindan
>            Priority: Blocker
>             Fix For: 3.2.0, 3.1.1
>
>         Attachments: YARN-8464.001.patch, YARN-8464.002.patch
>
>
> Test scenario:
> 1. Make either yarn.nodemanager.log-dirs or yarn.nodemanager.local-dirs read-only.
> 2. Restart the NMs via Ambari; as expected, none of them show up in the RM UI.
> 3. Revert the read-only dirs and restart the NMs.
> 4. Include a non-existent dir in either yarn.nodemanager.log-dirs or
>    yarn.nodemanager.local-dirs (one existing dir plus one non-existent dir).
> 5. Restart the NMs via Ambari; as expected, all NMs show as RUNNING with a
>    Health Report message.
> 6. Submit a MapReduce sleep job; the job goes into the ACCEPTED state.
> 7. The job stays in the ACCEPTED state indefinitely, even though all NMs are
>    running and have available memory.
>
> Credits to [~charanh] who found this issue.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8464) Async scheduling thread could be interrupted when there are no NodeManagers in cluster
[ https://issues.apache.org/jira/browse/YARN-8464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-8464:
-----------------------------
    Priority: Blocker  (was: Critical)

> Async scheduling thread could be interrupted when there are no NodeManagers in cluster
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-8464
>                 URL: https://issues.apache.org/jira/browse/YARN-8464
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Charan Hebri
>            Assignee: Sunil Govindan
>            Priority: Blocker
>         Attachments: YARN-8464.001.patch, YARN-8464.002.patch
[jira] [Updated] (YARN-8464) Async scheduling thread could be interrupted when there are no NodeManagers in cluster
[ https://issues.apache.org/jira/browse/YARN-8464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil Govindan updated YARN-8464:
---------------------------------
    Attachment: YARN-8464.002.patch

> Async scheduling thread could be interrupted when there are no NodeManagers in cluster
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-8464
>                 URL: https://issues.apache.org/jira/browse/YARN-8464
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Charan Hebri
>            Assignee: Sunil Govindan
>            Priority: Critical
>         Attachments: YARN-8464.001.patch, YARN-8464.002.patch
[jira] [Updated] (YARN-8464) Async scheduling thread could be interrupted when there are no NodeManagers in cluster
[ https://issues.apache.org/jira/browse/YARN-8464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil Govindan updated YARN-8464:
---------------------------------
    Summary: Async scheduling thread could be interrupted when there are no NodeManagers in cluster  (was: Application does not get to Running state even with available resources on node managers when async scheduling is enabled)

> Async scheduling thread could be interrupted when there are no NodeManagers in cluster
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-8464
>                 URL: https://issues.apache.org/jira/browse/YARN-8464
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Charan Hebri
>            Assignee: Sunil Govindan
>            Priority: Critical
>         Attachments: YARN-8464.001.patch
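
The old and new summaries describe the same symptom from two sides: a scheduling thread that stops after an interrupt leaves later applications stuck in ACCEPTED. The following is a minimal Java sketch of that general failure mode only. It is not CapacityScheduler code and says nothing about what the attached patches change; the class `AsyncSchedulerSketch`, the two queues, and the `exitOnInterrupt` flag are all hypothetical names introduced for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; none of these names come from YARN.
public class AsyncSchedulerSketch {

    // Starts a background "scheduler" that moves apps from pending to running.
    static Thread startScheduler(BlockingQueue<String> pending,
                                 BlockingQueue<String> running,
                                 boolean exitOnInterrupt) {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    String app = pending.poll(50, TimeUnit.MILLISECONDS);
                    if (app != null) {
                        running.add(app);
                    }
                } catch (InterruptedException e) {
                    if (exitOnInterrupt) {
                        return; // the loop ends and nothing restarts it
                    }
                    // otherwise ignore the interrupt and keep scheduling
                }
            }
        });
        t.setDaemon(true); // do not keep the JVM alive for the sketch
        t.start();
        return t;
    }

    // Interrupts the scheduler while nothing is pending, then submits one app
    // and reports whether that app was ever moved to the running queue.
    static boolean scheduledAfterInterrupt(boolean exitOnInterrupt) {
        BlockingQueue<String> pending = new LinkedBlockingQueue<>();
        BlockingQueue<String> running = new LinkedBlockingQueue<>();
        Thread scheduler = startScheduler(pending, running, exitOnInterrupt);
        try {
            scheduler.interrupt();
            if (exitOnInterrupt) {
                scheduler.join(2000); // wait for the thread to finish exiting
            }
            pending.add("sleep-job");
            return running.poll(2, TimeUnit.SECONDS) != null;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(scheduledAfterInterrupt(true));  // false
        System.out.println(scheduledAfterInterrupt(false)); // true
    }
}
```

In the first call the submitted job is never picked up, which mirrors step 7 of the test scenario: resources exist, but no thread is left to assign them. Ignoring interrupts, as the second call does, is used here only to make the contrast visible and is not a recommendation.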