[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15704462#comment-15704462 ] Tao Jie commented on YARN-4997: --- Updated the patch to address [~templedf]'s comments. The test failure is unrelated. > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch, YARN-4997-010.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
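For background on what the "pluggable" provider looks like: since YARN-3100 the authorizer is loaded reflectively from configuration, and a scheduler delegates its ACL checks to it. The sketch below shows that loading pattern; the configuration key matches YARN-3100, but the interface shape and the default class are simplified illustrations, not the exact Hadoop API.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ReflectionUtils;

// Simplified stand-in for the provider contract introduced by YARN-3100.
interface AuthorizationProvider {
  boolean checkPermission(String user, String queue, String accessType);
}

// Hypothetical permissive default, present only to keep this sketch self-contained.
class AllowAllAuthorizationProvider implements AuthorizationProvider {
  public boolean checkPermission(String user, String queue, String accessType) {
    return true;
  }
}

final class AuthorizationProviderFactory {
  static AuthorizationProvider create(Configuration conf) {
    // "yarn.authorization-provider" is the key YARN-3100 introduced; a
    // scheduler using it picks up whatever implementation is configured.
    Class<? extends AuthorizationProvider> clazz = conf.getClass(
        "yarn.authorization-provider",
        AllowAllAuthorizationProvider.class,
        AuthorizationProvider.class);
    return ReflectionUtils.newInstance(clazz, conf);
  }
}
{code}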
[jira] [Commented] (YARN-5435) [Regression] QueueCapacities not being updated for dynamic ReservationQueue
[ https://issues.apache.org/jira/browse/YARN-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15704282#comment-15704282 ] Hadoop QA commented on YARN-5435: - | (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 47s | trunk passed |
| +1 | compile | 0m 33s | trunk passed |
| +1 | checkstyle | 0m 20s | trunk passed |
| +1 | mvnsite | 0m 38s | trunk passed |
| +1 | mvneclipse | 0m 17s | trunk passed |
| +1 | findbugs | 1m 0s | trunk passed |
| +1 | javadoc | 0m 22s | trunk passed |
| +1 | mvninstall | 0m 31s | the patch passed |
| +1 | compile | 0m 30s | the patch passed |
| +1 | javac | 0m 30s | the patch passed |
| +1 | checkstyle | 0m 18s | the patch passed |
| +1 | mvnsite | 0m 36s | the patch passed |
| +1 | mvneclipse | 0m 14s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 3s | the patch passed |
| +1 | javadoc | 0m 18s | the patch passed |
| +1 | unit | 42m 20s | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 57m 37s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5435 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12840801/YARN-5435.v005.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux ec613db42fea 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 47ca9e2 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/14094/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/14094/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated. > [Regression] QueueCapacities not being updated for dynamic ReservationQueue > --- > > Key: YARN-5435 > URL: https://issues.apache.org/jira/browse/YARN-5435 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, resourcemanager >Affects Versions: 2.8.0 >Reporter: Sean Po >Assignee: Sean Po > Labels: oct16-easy,
[jira] [Updated] (YARN-5927) BaseContainerManagerTest::waitForNMContainerState timeout accounting is not accurate
[ https://issues.apache.org/jira/browse/YARN-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated YARN-5927: - Attachment: YARN-5917.02.patch > BaseContainerManagerTest::waitForNMContainerState timeout accounting is not > accurate > > > Key: YARN-5927 > URL: https://issues.apache.org/jira/browse/YARN-5927 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Kai Sasaki >Priority: Trivial > Labels: newbie > Attachments: YARN-5917.01.patch, YARN-5917.02.patch > > > See below that timeoutSecs is increased twice. We also do a sleep right away > before even checking the observed value.
> {code}
> do {
>   Thread.sleep(2000);
>   ...
>   timeoutSecs += 2;
> } while (!finalStates.contains(currentState)
>     && timeoutSecs++ < timeOutMax);
> {code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
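The report points at two defects in that wait loop: timeoutSecs is advanced both inside the body and again by the postfix ++ in the loop condition, and the loop sleeps before it ever inspects the state. A minimal corrected sketch, assuming the surrounding fields (finalStates, timeOutMax) behave as in BaseContainerManagerTest; the state accessor name is an illustration, not the real method:

{code}
// Check first, then wait; count each two-second sleep exactly once.
int timeoutSecs = 0;
ContainerState currentState = getContainerState(containerId); // hypothetical accessor
while (!finalStates.contains(currentState) && timeoutSecs < timeOutMax) {
  Thread.sleep(2000);
  timeoutSecs += 2; // the only place the counter is advanced
  currentState = getContainerState(containerId);
}
{code}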
[jira] [Commented] (YARN-5559) Analyse 2.8.0/3.0.0 jdiff reports and fix any issues
[ https://issues.apache.org/jira/browse/YARN-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15704159#comment-15704159 ] Akira Ajisaka commented on YARN-5559: - +1, thanks [~jianhe] and [~djp]. > Analyse 2.8.0/3.0.0 jdiff reports and fix any issues > > > Key: YARN-5559 > URL: https://issues.apache.org/jira/browse/YARN-5559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Labels: oct16-easy > Attachments: YARN-5559.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5435) [Regression] QueueCapacities not being updated for dynamic ReservationQueue
[ https://issues.apache.org/jira/browse/YARN-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Po updated YARN-5435: -- Attachment: YARN-5435.v005.patch YARN-5435.v005.patch modifies the code so that all tests pass, and only max QueueCapacities are inherited from the parent queue. > [Regression] QueueCapacities not being updated for dynamic ReservationQueue > --- > > Key: YARN-5435 > URL: https://issues.apache.org/jira/browse/YARN-5435 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, resourcemanager >Affects Versions: 2.8.0 >Reporter: Sean Po >Assignee: Sean Po > Labels: oct16-easy, regression > Attachments: YARN-5435.v003.patch, YARN-5435.v004.patch, > YARN-5435.v005.patch, YARN-5435.v1.patch, YARN-5435.v2.patch > > > YARN-1707 added dynamic queues (ReservationQueue) to CapacityScheduler. The > QueueCapacities data structure was added subsequently but is not being > updated correctly for ReservationQueue. This JIRA tracks the changes required > to update QueueCapacities of ReservationQueue correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
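The fix described above implies an invariant: a dynamic ReservationQueue must refresh its capacity bookkeeping whenever its entitlement changes, rather than keeping the values captured at creation, and (per the v005 patch) inherit only the maximum from its parent. A toy sketch of that bookkeeping; all names here are illustrative, not the actual CapacityScheduler classes:

{code}
// Toy model of the QueueCapacities bookkeeping a dynamic queue needs.
final class DynamicQueueCapacities {
  private float capacity;        // guaranteed fraction of the parent
  private float maximumCapacity; // cap on growth, inherited from the parent

  // Called on every setEntitlement(), not just at queue creation.
  void onEntitlementChanged(float newCapacity, float parentMaxCapacity) {
    this.capacity = newCapacity;
    this.maximumCapacity = parentMaxCapacity; // only the max is inherited
  }

  float getCapacity() { return capacity; }
  float getMaximumCapacity() { return maximumCapacity; }
}
{code}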
[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15704153#comment-15704153 ] Hadoop QA commented on YARN-4997: - | (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 9s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 55s | trunk passed |
| +1 | compile | 5m 23s | trunk passed |
| +1 | checkstyle | 0m 49s | trunk passed |
| +1 | mvnsite | 1m 25s | trunk passed |
| +1 | mvneclipse | 0m 43s | trunk passed |
| +1 | findbugs | 2m 6s | trunk passed |
| +1 | javadoc | 1m 0s | trunk passed |
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 58s | the patch passed |
| +1 | compile | 4m 40s | the patch passed |
| +1 | javac | 4m 40s | the patch passed |
| -0 | checkstyle | 0m 45s | hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 289 unchanged - 4 fixed = 291 total (was 293) |
| +1 | mvnsite | 1m 18s | the patch passed |
| +1 | mvneclipse | 0m 40s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 2m 19s | the patch passed |
| +1 | javadoc | 0m 59s | the patch passed |
| +1 | unit | 2m 24s | hadoop-yarn-common in the patch passed. |
| -1 | unit | 42m 28s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 84m 6s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-4997 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12840787/YARN-4997-010.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux b4e143324bcb 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 47ca9e2 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/14093/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/14093/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results |
[jira] [Updated] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-4997: -- Attachment: YARN-4997-010.patch > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch, YARN-4997-010.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5917) [YARN-3368] Make navigation link active when selecting child components in "Applications" and "Nodes"
[ https://issues.apache.org/jira/browse/YARN-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703875#comment-15703875 ] Sunil G commented on YARN-5917: --- Sure. I'll take a look shortly. > [YARN-3368] Make navigation link active when selecting child components in > "Applications" and "Nodes" > - > > Key: YARN-5917 > URL: https://issues.apache.org/jira/browse/YARN-5917 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-ui-v2 >Affects Versions: 3.0.0-alpha2 >Reporter: Kai Sasaki >Assignee: Kai Sasaki >Priority: Minor > Attachments: Screen Shot 2016-11-20 at 20.37.53.png, Screen Shot > 2016-11-20 at 20.38.01.png, YARN-5917.01.patch > > > When we select "Long Running Services" under "Applications" and "Nodes > Heatmap Chart" under "Nodes", navigation links become inactive. > They should always be active when child components are selected. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5725) Test uncaught exception in TestContainersMonitorResourceChange.testContainersResourceChange when setting IP and host
[ https://issues.apache.org/jira/browse/YARN-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703873#comment-15703873 ] Miklos Szegedi commented on YARN-5725: -- We need to backport 9449519a2503c55d9eac8fd7519df28aa0760059 (YARN-5776) first. Do we really need this in v2? > Test uncaught exception in > TestContainersMonitorResourceChange.testContainersResourceChange when setting > IP and host > > > Key: YARN-5725 > URL: https://issues.apache.org/jira/browse/YARN-5725 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Labels: oct16-easy > Fix For: 3.0.0-alpha2 > > Attachments: YARN-5725.000.patch, YARN-5725.001.patch, > YARN-5725.002.patch, YARN-5725.003.patch, YARN-5725.004.patch, > YARN-5725.005.patch, YARN-5725.006.patch, YARN-5725.007.patch, > YARN-5725.008.patch, YARN-5725.009.patch, YARN-5725.010.patch, > YARN-5725.011.patch > > Original Estimate: 2h > Remaining Estimate: 2h > > The issue is a warning, but it prevents the container monitor from continuing:
> 2016-10-12 14:38:23,280 WARN [Container Monitor] monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(594)) - Uncaught exception in ContainersMonitorImpl while monitoring resource of container_123456_0001_01_01
> java.lang.NullPointerException
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:455)
> 2016-10-12 14:38:23,281 WARN [Container Monitor] monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(613)) - org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
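The general hardening for this failure mode is to keep the monitoring thread alive across per-container errors instead of letting one NPE terminate it. A minimal sketch of the loop shape; the names (stopped, trackedContainers, monitorResourceUsage, monitoringInterval, LOG) are illustrative assumptions, not the actual ContainersMonitorImpl fields:

{code}
while (!stopped && !Thread.currentThread().isInterrupted()) {
  for (ContainerId id : trackedContainers) {
    try {
      monitorResourceUsage(id); // per-container check
    } catch (Exception e) {
      // One bad container (e.g. an NPE while its IP/host is being set)
      // must not kill the whole monitor thread: log and keep going.
      LOG.warn("Uncaught exception while monitoring " + id, e);
    }
  }
  try {
    Thread.sleep(monitoringInterval);
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt(); // preserve the interrupt and exit
  }
}
{code}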
[jira] [Updated] (YARN-5725) Test uncaught exception in TestContainersMonitorResourceChange.testContainersResourceChange when setting IP and host
[ https://issues.apache.org/jira/browse/YARN-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated YARN-5725: -- Fix Version/s: 3.0.0-alpha2 > Test uncaught exception in > TestContainersMonitorResourceChange.testContainersResourceChange when setting > IP and host > > > Key: YARN-5725 > URL: https://issues.apache.org/jira/browse/YARN-5725 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Labels: oct16-easy > Fix For: 3.0.0-alpha2 > > Attachments: YARN-5725.000.patch, YARN-5725.001.patch, > YARN-5725.002.patch, YARN-5725.003.patch, YARN-5725.004.patch, > YARN-5725.005.patch, YARN-5725.006.patch, YARN-5725.007.patch, > YARN-5725.008.patch, YARN-5725.009.patch, YARN-5725.010.patch, > YARN-5725.011.patch > > Original Estimate: 2h > Remaining Estimate: 2h > > The issue is a warning, but it prevents the container monitor from continuing:
> 2016-10-12 14:38:23,280 WARN [Container Monitor] monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(594)) - Uncaught exception in ContainersMonitorImpl while monitoring resource of container_123456_0001_01_01
> java.lang.NullPointerException
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:455)
> 2016-10-12 14:38:23,281 WARN [Container Monitor] monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(613)) - org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4330) MiniYARNCluster is showing multiple Failed to instantiate default resource calculator warning messages.
[ https://issues.apache.org/jira/browse/YARN-4330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703871#comment-15703871 ] Andrew Wang commented on YARN-4330: --- I set the fix versions based on the current git state. Please update the 2.x fix version if this goes back to 2.8. > MiniYARNCluster is showing multiple Failed to instantiate default resource > calculator warning messages. > > > Key: YARN-4330 > URL: https://issues.apache.org/jira/browse/YARN-4330 > Project: Hadoop YARN > Issue Type: Bug > Components: test, yarn >Affects Versions: 2.8.0 > Environment: OSX, JUnit >Reporter: Steve Loughran >Assignee: Varun Saxena >Priority: Blocker > Labels: oct16-hard > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-4330.002.patch, YARN-4330.003.patch, > YARN-4330.004.patch, YARN-4330.01.patch > > > Whenever I try to start a MiniYARNCluster on Branch-2 (commit #0b61cca), I > see multiple stack traces warning me that a resource calculator plugin could > not be created
> {code}
> (ResourceCalculatorPlugin.java:getResourceCalculatorPlugin(184)) - java.lang.UnsupportedOperationException: Could not determine OS: Failed to instantiate default resource calculator.
> java.lang.UnsupportedOperationException: Could not determine OS
> {code}
> This is a minicluster. It doesn't need resource calculation. It certainly > doesn't need test logs being cluttered with even more stack traces that will > only generate false alarms about tests failing. > There needs to be a way to turn this off, and the minicluster should have it > that way by default. > Being ruthless and marking as a blocker, because it's a fairly major > regression for anyone testing with the minicluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4330) MiniYARNCluster is showing multiple Failed to instantiate default resource calculator warning messages.
[ https://issues.apache.org/jira/browse/YARN-4330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated YARN-4330: -- Fix Version/s: 3.0.0-alpha2 2.9.0 > MiniYARNCluster is showing multiple Failed to instantiate default resource > calculator warning messages. > > > Key: YARN-4330 > URL: https://issues.apache.org/jira/browse/YARN-4330 > Project: Hadoop YARN > Issue Type: Bug > Components: test, yarn >Affects Versions: 2.8.0 > Environment: OSX, JUnit >Reporter: Steve Loughran >Assignee: Varun Saxena >Priority: Blocker > Labels: oct16-hard > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-4330.002.patch, YARN-4330.003.patch, > YARN-4330.004.patch, YARN-4330.01.patch > > > Whenever I try to start a MiniYARNCluster on Branch-2 (commit #0b61cca), I > see multiple stack traces warning me that a resource calculator plugin could > not be created
> {code}
> (ResourceCalculatorPlugin.java:getResourceCalculatorPlugin(184)) - java.lang.UnsupportedOperationException: Could not determine OS: Failed to instantiate default resource calculator.
> java.lang.UnsupportedOperationException: Could not determine OS
> {code}
> This is a minicluster. It doesn't need resource calculation. It certainly > doesn't need test logs being cluttered with even more stack traces that will > only generate false alarms about tests failing. > There needs to be a way to turn this off, and the minicluster should have it > that way by default. > Being ruthless and marking as a blocker, because it's a fairly major > regression for anyone testing with the minicluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
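One way to get the silence the report asks for is to guard plugin creation behind a configuration switch and degrade quietly on unsupported platforms. This is a sketch under stated assumptions: the config key is invented for illustration, LOG is assumed to be the class logger, and the fix actually committed under YARN-4330 differs in detail; ResourceCalculatorPlugin and ReflectionUtils are the real Hadoop classes.

{code}
// Sketch: let tests (e.g. MiniYARNCluster) opt out of resource calculation
// instead of emitting a WARN-level stack trace per NodeManager.
// "yarn.nodemanager.resource-calculator.enabled" is a hypothetical key.
public static ResourceCalculatorPlugin getResourceCalculatorPlugin(
    Class<? extends ResourceCalculatorPlugin> clazz, Configuration conf) {
  if (!conf.getBoolean("yarn.nodemanager.resource-calculator.enabled", true)) {
    return null; // callers already tolerate a null plugin
  }
  try {
    return clazz == null
        ? new ResourceCalculatorPlugin()       // platform-specific default
        : ReflectionUtils.newInstance(clazz, conf);
  } catch (UnsupportedOperationException e) {
    // Unknown OS: log quietly rather than cluttering test logs.
    LOG.debug("Could not instantiate a resource calculator", e);
    return null;
  }
}
{code}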
[jira] [Commented] (YARN-5839) ClusterApplication API does not include a ReservationID
[ https://issues.apache.org/jira/browse/YARN-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703867#comment-15703867 ] Hadoop QA commented on YARN-5839: - | (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 4s | YARN-5839 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-5839 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12840773/YARN-5839.v001.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/14092/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated. > ClusterApplication API does not include a ReservationID > --- > > Key: YARN-5839 > URL: https://issues.apache.org/jira/browse/YARN-5839 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sean Po >Assignee: Sean Po > Attachments: YARN-5839.v001.patch > > > Currently, the ClusterApplication, and ClusterApplications API does not allow > users to find the reservation queue that an application is running in. > YARN-5839 proposes to add ReservationId to the ClusterApplication and > ClusterApplications API. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5839) ClusterApplication API does not include a ReservationID
[ https://issues.apache.org/jira/browse/YARN-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Po updated YARN-5839: -- Description: Currently, the ClusterApplication, and ClusterApplications API does not allow users to find the reservation queue that an application is running in. YARN-5839 proposes to add ReservationId to the ClusterApplication and ClusterApplications API. (was: Currently, the ClusterApplication, and ClusterApplications API does not allow users to find the reservation queue that an application is running in. YARN-5435 proposes to add ReservationId to the ClusterApplication and ClusterApplications API.) > ClusterApplication API does not include a ReservationID > --- > > Key: YARN-5839 > URL: https://issues.apache.org/jira/browse/YARN-5839 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sean Po >Assignee: Sean Po > Attachments: YARN-5839.v001.patch > > > Currently, the ClusterApplication, and ClusterApplications API does not allow > users to find the reservation queue that an application is running in. > YARN-5839 proposes to add ReservationId to the ClusterApplication and > ClusterApplications API. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4752) [Umbrella] FairScheduler: Improve preemption
[ https://issues.apache.org/jira/browse/YARN-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703863#comment-15703863 ] Andrew Wang commented on YARN-4752: --- [~templedf] Looks like this umbrella was committed to trunk; should we resolve / set the fix version? > [Umbrella] FairScheduler: Improve preemption > > > Key: YARN-4752 > URL: https://issues.apache.org/jira/browse/YARN-4752 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla > Attachments: YARN-4752.FairSchedulerPreemptionOverhaul.pdf, > yarn-4752-1.patch, yarn-4752.2.patch, yarn-4752.3.patch, yarn-4752.4.patch, > yarn-4752.4.patch > > > A number of issues have been reported with respect to preemption in > FairScheduler along the lines of:
> # FairScheduler preempts resources from nodes even if the resultant free resources cannot fit the incoming request.
> # Preemption doesn't preempt from sibling queues
> # Preemption doesn't preempt from sibling apps under the same queue that is over its fairshare
> # ...
> Filing this umbrella JIRA to group all the issues together and think of a comprehensive solution.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
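The first item in that list suggests a concrete guard: before counting a node toward preemption for a starved request, check that the node's free-plus-reclaimable resources could actually fit the request. A sketch under assumed names; Resources.fitsIn and Resources.add are real Hadoop helpers, while the node accessor and the reclaimable-resource helper are illustrative, not the eventual YARN-4752 design:

{code}
// Only preempt on a node if doing so can actually satisfy the request.
boolean worthPreemptingOn(FSSchedulerNode node, Resource starvedRequest) {
  Resource free = node.getUnallocatedResource();        // available right now
  Resource reclaimable = getPreemptableResources(node); // hypothetical helper
  return Resources.fitsIn(starvedRequest, Resources.add(free, reclaimable));
}
{code}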
[jira] [Comment Edited] (YARN-5839) ClusterApplication API does not include a ReservationID
[ https://issues.apache.org/jira/browse/YARN-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703855#comment-15703855 ] Sean Po edited comment on YARN-5839 at 11/29/16 1:53 AM: - YARN-5839.v001.patch includes the changes to the REST and RPC API to add a ReservationId to the ApplicationReport. was (Author: seanpo03): YARN-5389.v001.patch includes the changes to the REST and RPC API to add a ReservationId to the ApplicationReport. > ClusterApplication API does not include a ReservationID > --- > > Key: YARN-5839 > URL: https://issues.apache.org/jira/browse/YARN-5839 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sean Po >Assignee: Sean Po > Attachments: YARN-5839.v001.patch > > > Currently, the ClusterApplication, and ClusterApplications API does not allow > users to find the reservation queue that an application is running in. > YARN-5435 proposes to add ReservationId to the ClusterApplication and > ClusterApplications API. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5839) ClusterApplication API does not include a ReservationID
[ https://issues.apache.org/jira/browse/YARN-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Po updated YARN-5839: -- Attachment: YARN-5839.v001.patch > ClusterApplication API does not include a ReservationID > --- > > Key: YARN-5839 > URL: https://issues.apache.org/jira/browse/YARN-5839 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sean Po >Assignee: Sean Po > Attachments: YARN-5839.v001.patch > > > Currently, the ClusterApplication, and ClusterApplications API does not allow > users to find the reservation queue that an application is running in. > YARN-5435 proposes to add ReservationId to the ClusterApplication and > ClusterApplications API. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5938) Minor refactoring to OpportunisticContainerAllocatorAMService
[ https://issues.apache.org/jira/browse/YARN-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703811#comment-15703811 ] Hadoop QA commented on YARN-5938: - | (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| 0 | mvndep | 0m 54s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 25s | trunk passed |
| +1 | compile | 1m 36s | trunk passed |
| +1 | checkstyle | 0m 32s | trunk passed |
| +1 | mvnsite | 1m 29s | trunk passed |
| +1 | mvneclipse | 0m 45s | trunk passed |
| +1 | findbugs | 2m 25s | trunk passed |
| +1 | javadoc | 0m 56s | trunk passed |
| 0 | mvndep | 0m 9s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 15s | the patch passed |
| +1 | compile | 1m 32s | the patch passed |
| +1 | javac | 1m 32s | the patch passed |
| -0 | checkstyle | 0m 29s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 11 new + 33 unchanged - 9 fixed = 44 total (was 42) |
| +1 | mvnsite | 1m 27s | the patch passed |
| +1 | mvneclipse | 0m 41s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 2m 43s | the patch passed |
| +1 | javadoc | 0m 53s | the patch passed |
| +1 | unit | 0m 28s | hadoop-yarn-server-common in the patch passed. |
| +1 | unit | 13m 47s | hadoop-yarn-server-nodemanager in the patch passed. |
| -1 | unit | 40m 3s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 19s | The patch does not generate ASF License warnings. |
| | | 85m 29s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
| | hadoop.yarn.server.resourcemanager.TestApplicationMasterService |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing |
| | hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler |
| | hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5938 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12840757/YARN-5938.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 3f7c8a7b49aa 3.13.0-95-generic #142-Ubuntu SMP Fri
[jira] [Commented] (YARN-5890) FairScheduler should log information about AM-resource-usage and max-AM-share for queues
[ https://issues.apache.org/jira/browse/YARN-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703785#comment-15703785 ] Hadoop QA commented on YARN-5890: - | (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 26s | trunk passed |
| +1 | compile | 0m 40s | trunk passed |
| +1 | checkstyle | 0m 25s | trunk passed |
| +1 | mvnsite | 0m 46s | trunk passed |
| +1 | mvneclipse | 0m 18s | trunk passed |
| +1 | findbugs | 1m 12s | trunk passed |
| +1 | javadoc | 0m 23s | trunk passed |
| +1 | mvninstall | 0m 40s | the patch passed |
| +1 | compile | 0m 37s | the patch passed |
| +1 | javac | 0m 37s | the patch passed |
| -0 | checkstyle | 0m 24s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 6 new + 238 unchanged - 0 fixed = 244 total (was 238) |
| +1 | mvnsite | 0m 44s | the patch passed |
| +1 | mvneclipse | 0m 17s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 23s | the patch passed |
| +1 | javadoc | 0m 22s | the patch passed |
| +1 | unit | 39m 35s | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 58m 2s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5890 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12840759/YARN-5890.004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux f45daf25e0ff 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / be88d57 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/14090/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/14090/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/14090/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated. > FairScheduler should log information about AM-resource-usage and max-AM-share > for queues >
[jira] [Commented] (YARN-5774) MR Job stuck in ACCEPTED status without any progress in Fair Scheduler if set yarn.scheduler.minimum-allocation-mb to 0.
[ https://issues.apache.org/jira/browse/YARN-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703774#comment-15703774 ] Yufei Gu commented on YARN-5774: Thanks [~templedf]. The reason to use code instead of link is to avoid a cyclic dependency. {{AbstractResourceRequest}} is in project hadoop-yarn-api, and {{ResourceManager}} is in project hadoop-yarn-server, which depends on project hadoop-yarn-api. If we used link, project hadoop-yarn-api would need to depend on project hadoop-yarn-server, introducing a cyclic dependency. > MR Job stuck in ACCEPTED status without any progress in Fair Scheduler if set > yarn.scheduler.minimum-allocation-mb to 0. > > > Key: YARN-5774 > URL: https://issues.apache.org/jira/browse/YARN-5774 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Yufei Gu >Assignee: Yufei Gu > Labels: oct16-easy > Attachments: YARN-5774.001.patch, YARN-5774.002.patch, > YARN-5774.003.patch, YARN-5774.004.patch, YARN-5774.005.patch, > YARN-5774.006.patch, YARN-5774.007.patch > > > MR Job stuck in ACCEPTED status without any progress in Fair Scheduler > because there is no resource request for the AM. This happens when you > configure {{yarn.scheduler.minimum-allocation-mb}} to zero. > The problem is in the code used by both Capacity Scheduler and Fair > Scheduler. {{scheduler.increment-allocation-mb}} is a concept in FS, but not > CS. So the common code in class RMAppManager passes > {{yarn.scheduler.minimum-allocation-mb}} as the increment, because there is > no increment for CS, when it tries to normalize the resource requests.
> {code}
> SchedulerUtils.normalizeRequest(amReq, scheduler.getResourceCalculator(),
>     scheduler.getClusterResource(),
>     scheduler.getMinimumResourceCapability(),
>     scheduler.getMaximumResourceCapability(),
>     scheduler.getMinimumResourceCapability()); --> incrementResource should be passed here.
> {code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
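To see why the wrong increment matters, here is the normalization arithmetic in a simplified, self-contained sketch (the real logic lives in SchedulerUtils and the ResourceCalculator classes; only memory is shown). When the minimum, 0 in the reported setup, is also passed as the increment, the round-up step degenerates and the request is never brought up to a schedulable size:

{code}
// Simplified memory normalization: round the ask up to a multiple of the
// increment, then clamp to [minimum, maximum].
static long normalizeMemory(long requested, long minimum, long maximum,
    long increment) {
  long normalized = Math.max(requested, minimum);
  if (increment > 0) {
    normalized = ((normalized + increment - 1) / increment) * increment;
  }
  // Note: in the real code, a zero step effectively rounds the request to 0,
  // which is consistent with the AM never being scheduled (stuck in ACCEPTED).
  return Math.min(normalized, maximum);
}
{code}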
[jira] [Updated] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application
[ https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-5739: Attachment: YARN-5739-YARN-5355.005.patch Refactored EntityTypeReader and TimelineEntityReader. EntityTypeReader has been separated from EntityReaders after this refactoring. > Provide timeline reader API to list available timeline entity types for one > application > --- > > Key: YARN-5739 > URL: https://issues.apache.org/jira/browse/YARN-5739 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelinereader >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-5739-YARN-5355.001.patch, > YARN-5739-YARN-5355.002.patch, YARN-5739-YARN-5355.003.patch, > YARN-5739-YARN-5355.004.patch, YARN-5739-YARN-5355.005.patch > > > Right now we only show part of the available timeline entity data in the new > YARN UI. However, some data (especially library-specific data) cannot be > queried through the web UI. It would be appealing for the UI to provide an > "entity browser" for each YARN application. Actually, simply dumping out > available timeline entities (with proper pagination, of course) would be > pretty helpful for UI users. > On the timeline side, we're not far away from this goal. Right now I believe > the only thing missing is to list all available entity types within one > application. The challenge here is that we're not storing this data for each > application, but given that this kind of call is relatively rare (compared > to writes and updates) we can perform some scanning at read time. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
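The read-time scan it proposes amounts to collecting the distinct entity-type components under one application's row prefix. A rough HBase-flavored sketch; the table name, row-key layout, and decoder below are simplified stand-ins for the real timeline schema classes, not the patch itself:

{code}
import java.io.IOException;
import java.util.Set;
import java.util.TreeSet;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.util.Bytes;

final class EntityTypeLister {
  static Set<String> listEntityTypes(Connection conn, byte[] appRowPrefix)
      throws IOException {
    Set<String> types = new TreeSet<>();
    Scan scan = new Scan().setRowPrefixFilter(appRowPrefix);
    scan.setFilter(new FirstKeyOnlyFilter()); // row keys only; skip cell values
    try (Table table = conn.getTable(TableName.valueOf("timelineservice.entity"));
         ResultScanner rs = table.getScanner(scan)) {
      for (Result r : rs) {
        types.add(entityType(r.getRow()));
      }
    }
    return types;
  }

  // Hypothetical decoder: assumes the entity type is a '!'-separated
  // component of the row key following the application-id prefix.
  private static String entityType(byte[] rowKey) {
    String[] parts = Bytes.toString(rowKey).split("!");
    return parts.length > 1 ? parts[parts.length - 2] : parts[0];
  }
}
{code}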
[jira] [Updated] (YARN-5938) Minor refactoring to OpportunisticContainerAllocatorAMService
[ https://issues.apache.org/jira/browse/YARN-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5938: -- Issue Type: Sub-task (was: Improvement) Parent: YARN-5085 > Minor refactoring to OpportunisticContainerAllocatorAMService > - > > Key: YARN-5938 > URL: https://issues.apache.org/jira/browse/YARN-5938 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5938.001.patch > > > Minor code re-organization to do the following: > # The OpportunisticContainerAllocatorAMService currently allocates outside > the ApplicationAttempt lock maintained by the ApplicationMasterService. This > should happen inside the lock. > # Refactored out some code to simplify the allocate() method. > # Removed some unused fields inside the OpportunisticContainerAllocator. > # Re-organized some of the code in the > OpportunisticContainerAllocatorAMService::allocate method to make it a bit > more readable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5938) Minor refactoring to OpportunisticContainerAllocatorAMService
[ https://issues.apache.org/jira/browse/YARN-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703670#comment-15703670 ] Arun Suresh commented on YARN-5938: --- Like I mentioned, this is mostly just a refactoring change. [~kasha], [~varun_saxena] (since you helped review YARN-5918, you might have some context). > Minor refactoring to OpportunisticContainerAllocatorAMService > - > > Key: YARN-5938 > URL: https://issues.apache.org/jira/browse/YARN-5938 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5938.001.patch > > > Minor code re-organization to do the following: > # The OpportunisticContainerAllocatorAMService currently allocates outside > the ApplicationAttempt lock maintained by the ApplicationMasterService. This > should happen inside the lock. > # Refactored out some code to simplify the allocate() method. > # Removed some unused fields inside the OpportunisticContainerAllocator. > # Re-organized some of the code in the > OpportunisticContainerAllocatorAMService::allocate method to make it a bit > more readable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5559) Analyse 2.8.0/3.0.0 jdiff reports and fix any issues
[ https://issues.apache.org/jira/browse/YARN-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703666#comment-15703666 ] Junping Du commented on YARN-5559: -- bq. I think we can use this single jira and upload two patches one for branch-2/trunk and one for branch-2.6? +1 on this proposal. [~ajisakaa], what do you think? > Analyse 2.8.0/3.0.0 jdiff reports and fix any issues > > > Key: YARN-5559 > URL: https://issues.apache.org/jira/browse/YARN-5559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Labels: oct16-easy > Attachments: YARN-5559.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5872) Add AlwayReject policies for router and amrmproxy.
[ https://issues.apache.org/jira/browse/YARN-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703648#comment-15703648 ] Carlo Curino commented on YARN-5872: Thanks [~subru]! > Add AlwayReject policies for router and amrmproxy. > -- > > Key: YARN-5872 > URL: https://issues.apache.org/jira/browse/YARN-5872 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Carlo Curino >Assignee: Carlo Curino > Fix For: YARN-2915 > > Attachments: YARN-5872-YARN-2915.01.patch, > YARN-5872-YARN-2915.02.patch, YARN-5872-YARN-2915.03.patch > > > This could be relevant as a safe fallback; for example, to disable access to > the entire federation for a queue (without updating each RM in the > federation), we could set these policies and prevent access. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5676) Add a HashBasedRouterPolicy, and small policies and test refactoring.
[ https://issues.apache.org/jira/browse/YARN-5676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703649#comment-15703649 ] Carlo Curino commented on YARN-5676: Thanks [~subru]! > Add a HashBasedRouterPolicy, and small policies and test refactoring. > - > > Key: YARN-5676 > URL: https://issues.apache.org/jira/browse/YARN-5676 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Affects Versions: YARN-2915 >Reporter: Carlo Curino >Assignee: Carlo Curino > Fix For: YARN-2915 > > Attachments: YARN-5676-YARN-2915.01.patch, > YARN-5676-YARN-2915.02.patch, YARN-5676-YARN-2915.03.patch, > YARN-5676-YARN-2915.04.patch, YARN-5676-YARN-2915.05.patch, > YARN-5676-YARN-2915.06.patch, YARN-5676-YARN-2915.07.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5890) FairScheduler should log information about AM-resource-usage and max-AM-share for queues
[ https://issues.apache.org/jira/browse/YARN-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-5890: --- Attachment: YARN-5890.004.patch Did the rebase in patch 004. > FairScheduler should log information about AM-resource-usage and max-AM-share > for queues > > > Key: YARN-5890 > URL: https://issues.apache.org/jira/browse/YARN-5890 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-5890.001.patch, YARN-5890.002.patch, > YARN-5890.003.patch, YARN-5890.004.patch > > > There are several cases where jobs in a queue are stuck, likely because of > maxAMShare. It is hard to debug these issues without any information. > At the very least, we need to log both AM-resource-usage and max-AM-share for > queues. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
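A minimal sketch of the kind of diagnostic the issue asks for, emitted at the point where the scheduler declines to launch an AM because of the share limit. The accessor names (canRunAppAM, getAmResourceUsage, getMaxAMShare) are assumptions in the spirit of FSLeafQueue, not the committed patch:

{code}
// Surface AM-resource usage against the queue's max-AM-share when an
// application master cannot be launched, so stuck jobs are debuggable.
if (!queue.canRunAppAM(amResourceRequest)) {
  LOG.info("Cannot launch AM for " + appId + " in queue " + queue.getName()
      + ": amResourceUsage=" + queue.getAmResourceUsage()
      + ", maxAMShare=" + queue.getMaxAMShare()
      + ", amRequest=" + amResourceRequest);
}
{code}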
[jira] [Updated] (YARN-5938) Minor refactoring to OpportunisticContainerAllocatorAMService
[ https://issues.apache.org/jira/browse/YARN-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5938: -- Issue Type: Improvement (was: Sub-task) Parent: (was: YARN-5085) > Minor refactoring to OpportunisticContainerAllocatorAMService > - > > Key: YARN-5938 > URL: https://issues.apache.org/jira/browse/YARN-5938 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Arun Suresh > > Minor code re-organization to do the following: > # The OpportunisticContainerAllocatorAMService currently allocates outside > the ApplicationAttempt lock maintained by the ApplicationMasterService. This > should happen inside the lock. > # Refactored out some code to simplify the allocate() method. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5938) Minor refactoring to OpportunisticContainerAllocatorAMService
[ https://issues.apache.org/jira/browse/YARN-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5938: -- Attachment: YARN-5938.001.patch Uploading patch > Minor refactoring to OpportunisticContainerAllocatorAMService > - > > Key: YARN-5938 > URL: https://issues.apache.org/jira/browse/YARN-5938 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5938.001.patch > > > Minor code re-organization to do the following: > # The OpportunisticContainerAllocatorAMService currently allocates outside > the ApplicationAttempt lock maintained by the ApplicationMasterService. This > should happen inside the lock. > # Refactored out some code to simplify the allocate() method. > # Removed some unused fields inside the OpportunisticContainerAllocator. > # Re-organized some of the code in the > OpportunisticContainerAllocatorAMService::allocate method to make it a bit > more readable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5938) Minor refactoring to OpportunisticContainerAllocatorAMService
[ https://issues.apache.org/jira/browse/YARN-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5938: -- Description: Minor code re-organization to do the following: # The OpportunisticContainerAllocatorAMService currently allocates outside the ApplicationAttempt lock maintained by the ApplicationMasterService. This should happen inside the lock. # Refactored out some code to simplify the allocate() method. # Removed some unused fields inside the OpportunisticContainerAllocator. # Re-organized some of the code in the OpportunisticContainerAllocatorAMService::allocate method to make it a bit more readable. was: Minor code re-organization to do the following: # The OpportunisticContainerAllocatorAMService currently allocates outside the ApplicationAttempt lock maintained by the ApplicationMasterService. This should happen inside the lock. # Refactored out some code to simplify the allocate() method. > Minor refactoring to OpportunisticContainerAllocatorAMService > - > > Key: YARN-5938 > URL: https://issues.apache.org/jira/browse/YARN-5938 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Arun Suresh > > Minor code re-organization to do the following: > # The OpportunisticContainerAllocatorAMService currently allocates outside > the ApplicationAttempt lock maintained by the ApplicationMasterService. This > should happen inside the lock. > # Refactored out some code to simplify the allocate() method. > # Removed some unused fields inside the OpportunisticContainerAllocator. > # Re-organized some of the code in the > OpportunisticContainerAllocatorAMService::allocate method to make it a bit > more readable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5559) Analyse 2.8.0/3.0.0 jdiff reports and fix any issues
[ https://issues.apache.org/jira/browse/YARN-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703623#comment-15703623 ] Jian He commented on YARN-5559: --- [~ajisakaa], Wangda is off for a while. Would you like to update the patch according to your comments? I can commit it. bq. I'm thinking we should separate the patch into (3) and the rest. (3) should be fixed in branch-2.6 and above. I think we can use this single JIRA and upload two patches, one for branch-2/trunk and one for branch-2.6? > Analyse 2.8.0/3.0.0 jdiff reports and fix any issues > > > Key: YARN-5559 > URL: https://issues.apache.org/jira/browse/YARN-5559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Labels: oct16-easy > Attachments: YARN-5559.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5938) Minor refactoring to OpportunisticContainerAllocatorAMService
Arun Suresh created YARN-5938: - Summary: Minor refactoring to OpportunisticContainerAllocatorAMService Key: YARN-5938 URL: https://issues.apache.org/jira/browse/YARN-5938 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh Assignee: Arun Suresh Minor code re-organization to do the following: # The OpportunisticContainerAllocatorAMService currently allocates outside the ApplicationAttempt lock maintained by the ApplicationMasterService. This should happen inside the lock. # Refactored out some code to simplify the allocate() method. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703613#comment-15703613 ] Xuan Gong commented on YARN-5525: - bq. then I assume they will handle the downstream log usage, and it is okay for yarn log cli and RM UI not to be able to find it? I do not think that we should assume the customer can handle everything. In my opinion, we should handle how to aggregate the logs and how to fetch the logs at the same time. If we open a window that allows customers to customize the log aggregation service, we (or they) have to provide a way to fetch the logs as well. > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Labels: oct16-medium > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch, YARN-5525.v4.patch, YARN-5525.v5.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors, like an app-specific log aggregation > directory or log aggregation format, can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
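To make the proposal concrete, here is a minimal sketch of how a pluggable log aggregation class could be resolved from configuration. The property name and base type below are assumptions for illustration only, not names from the patch:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ReflectionUtils;

// Hypothetical base type standing in for the log aggregation service API.
abstract class PluggableLogAggregator {
  abstract void aggregate();
}

public class LogAggregatorFactory {
  // Hypothetical property name, not from the patch.
  static final String AGGREGATOR_CLASS_KEY =
      "yarn.nodemanager.log-aggregation.class";

  public static PluggableLogAggregator create(Configuration conf,
      Class<? extends PluggableLogAggregator> defaultImpl) {
    // Resolve the configured implementation, falling back to the default.
    Class<? extends PluggableLogAggregator> clazz = conf.getClass(
        AGGREGATOR_CLASS_KEY, defaultImpl, PluggableLogAggregator.class);
    // ReflectionUtils injects the Configuration into Configurable instances.
    return ReflectionUtils.newInstance(clazz, conf);
  }
}
{code}
This is the same Configuration.getClass()/ReflectionUtils.newInstance() pattern Hadoop already uses for other pluggable components; the open question in the comment above, how the fetch path learns about the custom format, is not addressed by the instantiation mechanism itself.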
[jira] [Commented] (YARN-5774) MR Job stuck in ACCEPTED status without any progress in Fair Scheduler if set yarn.scheduler.minimum-allocation-mb to 0.
[ https://issues.apache.org/jira/browse/YARN-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703599#comment-15703599 ] Daniel Templeton commented on YARN-5774: Thanks for the update, [~yufeigu]! I only see one tiny nit: in the {{AbstractResourceRequest}} javadocs, {code}{@code ResourceManager}{code} should be {code}{@link ResourceManager}{code} > MR Job stuck in ACCEPTED status without any progress in Fair Scheduler if set > yarn.scheduler.minimum-allocation-mb to 0. > > > Key: YARN-5774 > URL: https://issues.apache.org/jira/browse/YARN-5774 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Yufei Gu >Assignee: Yufei Gu > Labels: oct16-easy > Attachments: YARN-5774.001.patch, YARN-5774.002.patch, > YARN-5774.003.patch, YARN-5774.004.patch, YARN-5774.005.patch, > YARN-5774.006.patch, YARN-5774.007.patch > > > MR Job stuck in ACCEPTED status without any progress in Fair Scheduler > because there is no resource request for the AM. This happens when you > configure {{yarn.scheduler.minimum-allocation-mb}} to zero. > The problem is in the code used by both Capacity Scheduler and Fair > Scheduler. {{scheduler.increment-allocation-mb}} is a concept in FS, but not > CS. So the common code in class RMAppManager passes > {{yarn.scheduler.minimum-allocation-mb}} as the increment, because there is > no increment for CS, when it tries to normalize the resource requests. > {code} > SchedulerUtils.normalizeRequest(amReq, scheduler.getResourceCalculator(), > scheduler.getClusterResource(), > scheduler.getMinimumResourceCapability(), > scheduler.getMaximumResourceCapability(), > scheduler.getMinimumResourceCapability()); --> incrementResource > should be passed here. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
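For clarity, the fix implied by the quoted description would make the last argument a real increment rather than reusing the minimum. A sketch only; the accessor name {{getIncrementResourceCapability()}} is an assumption here, not necessarily what the committed patch uses:
{code}
// Sketch of the corrected call: pass the scheduler's increment resource
// (where one exists) instead of the minimum as the last argument.
SchedulerUtils.normalizeRequest(amReq, scheduler.getResourceCalculator(),
    scheduler.getClusterResource(),
    scheduler.getMinimumResourceCapability(),
    scheduler.getMaximumResourceCapability(),
    scheduler.getIncrementResourceCapability()); // increment, not minimum
{code}
With a zero minimum and a zero "increment", normalization rounds the AM request down to zero, which is why the AM never gets a container and the job sits in ACCEPTED.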
[jira] [Commented] (YARN-5890) FairScheduler should log information about AM-resource-usage and max-AM-share for queues
[ https://issues.apache.org/jira/browse/YARN-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703597#comment-15703597 ] Hadoop QA commented on YARN-5890: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} YARN-5890 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-5890 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12840754/YARN-5890.003.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/14088/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > FairScheduler should log information about AM-resource-usage and max-AM-share > for queues > > > Key: YARN-5890 > URL: https://issues.apache.org/jira/browse/YARN-5890 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-5890.001.patch, YARN-5890.002.patch, > YARN-5890.003.patch > > > There are several cases where jobs in a queue are likely stuck because of > maxAMShare. It is hard to debug these issues without any information. > At the very least, we need to log both AM-resource-usage and max-AM-share for > queues. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5890) FairScheduler should log information about AM-resource-usage and max-AM-share for queues
[ https://issues.apache.org/jira/browse/YARN-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-5890: --- Attachment: YARN-5890.003.patch > FairScheduler should log information about AM-resource-usage and max-AM-share > for queues > > > Key: YARN-5890 > URL: https://issues.apache.org/jira/browse/YARN-5890 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-5890.001.patch, YARN-5890.002.patch, > YARN-5890.003.patch > > > There are several cases where jobs in a queue are likely stuck because of > maxAMShare. It is hard to debug these issues without any information. > At the very least, we need to log both AM-resource-usage and max-AM-share for > queues. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5890) FairScheduler should log information about AM-resource-usage and max-AM-share for queues
[ https://issues.apache.org/jira/browse/YARN-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703592#comment-15703592 ] Yufei Gu commented on YARN-5890: Thanks [~templedf]. I uploaded patch 003 addressing all your comments. I set the max AM share explicitly instead of adding documentation, which I think is more readable. > FairScheduler should log information about AM-resource-usage and max-AM-share > for queues > > > Key: YARN-5890 > URL: https://issues.apache.org/jira/browse/YARN-5890 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-5890.001.patch, YARN-5890.002.patch > > > There are several cases where jobs in a queue are likely stuck because of > maxAMShare. It is hard to debug these issues without any information. > At the very least, we need to log both AM-resource-usage and max-AM-share for > queues. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5890) FairScheduler should log information about AM-resource-usage and max-AM-share for queues
[ https://issues.apache.org/jira/browse/YARN-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-5890: --- Attachment: (was: YARN-5890.003.patch) > FairScheduler should log information about AM-resource-usage and max-AM-share > for queues > > > Key: YARN-5890 > URL: https://issues.apache.org/jira/browse/YARN-5890 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-5890.001.patch, YARN-5890.002.patch > > > There are several cases where jobs in a queue are likely stuck because of > maxAMShare. It is hard to debug these issues without any information. > At the very least, we need to log both AM-resource-usage and max-AM-share for > queues. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5890) FairScheduler should log information about AM-resource-usage and max-AM-share for queues
[ https://issues.apache.org/jira/browse/YARN-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-5890: --- Attachment: YARN-5890.003.patch > FairScheduler should log information about AM-resource-usage and max-AM-share > for queues > > > Key: YARN-5890 > URL: https://issues.apache.org/jira/browse/YARN-5890 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-5890.001.patch, YARN-5890.002.patch, > YARN-5890.003.patch > > > There are several cases where jobs in a queue are likely stuck because of > maxAMShare. It is hard to debug these issues without any information. > At the very least, we need to log both AM-resource-usage and max-AM-share for > queues. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703578#comment-15703578 ] Daniel Templeton commented on YARN-4997: Thanks for updating the patch. Since [~kasha] is out for a little while, I'm jumping back in. Looks like you (pl.) decided to drop the {{synchronized}} and screw the checkstyle complaint. In the interest of not going in circles, I can live with that. Other minor nits: * Can we have {{AllocationConfiguration.getQueueAcls()}} wrap the {{Map}} in a {{Collections.unmodifiableMap()}}? It makes me a little nervous to expose mutable data structures in getters. * The javadoc for {{AllocationFileLoaderService.getDefaultPermissions()}} should start with a summary sentence that ends with a period. Aside from not being a good summary, the current first sentence is missing the period. * In {{FairScheduler}}, you messed up the indentation of the first line of {{applyChildDefaults()'s}} javadoc. > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch, YARN-4997-002.patch, > YARN-4997-003.patch, YARN-4997-004.patch, YARN-4997-005.patch, > YARN-4997-006.patch, YARN-4997-007.patch, YARN-4997-008.patch, > YARN-4997-009.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
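The first nit above is a standard defensive-getter pattern. A minimal sketch, with the field shape assumed from the comment rather than taken from the patch:
{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.security.authorize.AccessControlList;

class AllocationAclsExample {
  // Assumed field shape based on the review comment; the patch may differ.
  private final Map<String, AccessControlList> queueAcls = new HashMap<>();

  public Map<String, AccessControlList> getQueueAcls() {
    // Callers get a read-only view; attempts to mutate it throw
    // UnsupportedOperationException instead of silently corrupting state.
    return Collections.unmodifiableMap(queueAcls);
  }
}
{code}
Note that the wrapper is a view, not a copy: later changes made through the owning class remain visible to callers, which is usually the desired behavior for configuration snapshots like this.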
[jira] [Commented] (YARN-5725) Test uncaught exception in TestContainersMonitorResourceChange.testContainersResourceChange when setting IP and host
[ https://issues.apache.org/jira/browse/YARN-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703580#comment-15703580 ] Hudson commented on YARN-5725: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10900 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10900/]) YARN-5725. Test uncaught exception in (templedf: rev 62b42ef5dd04d516d33bf0890ac5cd49f8184a73) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitorResourceChange.java > Test uncaught exception in > TestContainersMonitorResourceChange.testContainersResourceChange when setting > IP and host > > > Key: YARN-5725 > URL: https://issues.apache.org/jira/browse/YARN-5725 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Labels: oct16-easy > Attachments: YARN-5725.000.patch, YARN-5725.001.patch, > YARN-5725.002.patch, YARN-5725.003.patch, YARN-5725.004.patch, > YARN-5725.005.patch, YARN-5725.006.patch, YARN-5725.007.patch, > YARN-5725.008.patch, YARN-5725.009.patch, YARN-5725.010.patch, > YARN-5725.011.patch > > Original Estimate: 2h > Remaining Estimate: 2h > > The issue is a warning, but it prevents the container monitor from continuing > 2016-10-12 14:38:23,280 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(594)) - > Uncaught exception in ContainersMonitorImpl while monitoring resource of > container_123456_0001_01_01 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:455) > 2016-10-12 14:38:23,281 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(613)) - > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl > is interrupted. Exiting. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
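The robustness pattern at stake here, sketched minimally (names are illustrative, not the committed change): an unexpected exception for one container must be contained so the monitor thread survives and moves on.
{code}
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the pattern, not ContainersMonitorImpl itself.
public class MonitorLoopSketch implements Runnable {
  @Override
  public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      for (String containerId : trackedContainers()) {
        try {
          monitorContainer(containerId);
        } catch (RuntimeException e) {
          // Log and continue; one bad container must not stop monitoring.
          System.err.println("Uncaught exception monitoring " + containerId
              + ": " + e);
        }
      }
      try {
        Thread.sleep(3000); // monitoring interval
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // exit the loop cleanly
      }
    }
  }

  private List<String> trackedContainers() {
    return Arrays.asList("container_123456_0001_01_000001");
  }

  private void monitorContainer(String id) {
    // Placeholder for per-container resource checks.
  }
}
{code}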
[jira] [Commented] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703489#comment-15703489 ] Hadoop QA commented on YARN-5761: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 30s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 3 new + 877 unchanged - 17 fixed = 880 total (was 894) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 32s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 58m 14s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-5761 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12840733/YARN-5761.8.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 6dbcca3d7703 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a2b1ff0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/14084/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/14084/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/14084/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/14084/console
[jira] [Commented] (YARN-5725) Test uncaught exception in TestContainersMonitorResourceChange.testContainersResourceChange when setting IP and host
[ https://issues.apache.org/jira/browse/YARN-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703475#comment-15703475 ] Daniel Templeton commented on YARN-5725: Committed to trunk. Can you please supply a patch for branch-2? > Test uncaught exception in > TestContainersMonitorResourceChange.testContainersResourceChange when setting > IP and host > > > Key: YARN-5725 > URL: https://issues.apache.org/jira/browse/YARN-5725 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Labels: oct16-easy > Attachments: YARN-5725.000.patch, YARN-5725.001.patch, > YARN-5725.002.patch, YARN-5725.003.patch, YARN-5725.004.patch, > YARN-5725.005.patch, YARN-5725.006.patch, YARN-5725.007.patch, > YARN-5725.008.patch, YARN-5725.009.patch, YARN-5725.010.patch, > YARN-5725.011.patch > > Original Estimate: 2h > Remaining Estimate: 2h > > The issue is a warning, but it prevents the container monitor from continuing > 2016-10-12 14:38:23,280 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(594)) - > Uncaught exception in ContainersMonitorImpl while monitoring resource of > container_123456_0001_01_01 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:455) > 2016-10-12 14:38:23,281 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(613)) - > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl > is interrupted. Exiting. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application
[ https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703452#comment-15703452 ] Li Lu commented on YARN-5739: - Sure. Let me try with some refactoring... > Provide timeline reader API to list available timeline entity types for one > application > --- > > Key: YARN-5739 > URL: https://issues.apache.org/jira/browse/YARN-5739 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelinereader >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-5739-YARN-5355.001.patch, > YARN-5739-YARN-5355.002.patch, YARN-5739-YARN-5355.003.patch, > YARN-5739-YARN-5355.004.patch > > > Right now we only show a part of available timeline entity data in the new > YARN UI. However, some data (especially library-specific data) cannot be > queried through the web UI. It would be appealing for the UI to > provide an "entity browser" for each YARN application. Actually, simply > dumping out available timeline entities (with proper pagination, of course) > would be pretty helpful for UI users. > On the timeline side, we're not far away from this goal. Right now I believe the > only thing missing is to list all available entity types within one > application. The challenge here is that we're not storing this data for each > application, but given this kind of call is relatively rare (compared to > writes and updates) we can perform some scanning at read time. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5536) Multiple format support (JSON, etc.) for exclude node file in NM graceful decommission with timeout
[ https://issues.apache.org/jira/browse/YARN-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703441#comment-15703441 ] Ming Ma commented on YARN-5536: --- I don't have any immediate plan to work on it yet. HDFS-9005 could be a useful reference in terms of the work required. > Multiple format support (JSON, etc.) for exclude node file in NM graceful > decommission with timeout > --- > > Key: YARN-5536 > URL: https://issues.apache.org/jira/browse/YARN-5536 > Project: Hadoop YARN > Issue Type: Sub-task > Components: graceful >Reporter: Junping Du >Priority: Blocker > > Per discussion in YARN-4676, we agree that multiple formats (other than XML) > should be supported to decommission nodes with timeout values. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
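As a purely illustrative sketch of what a JSON exclude file with per-node timeouts might look like (the actual schema here is undecided; HDFS-9005 introduced a similar JSON shape for datanode host files, from which the field names below are loosely borrowed):
{code}
[
  {"hostName": "nm-host-1.example.com", "timeout": 3600},
  {"hostName": "nm-host-2.example.com"}
]
{code}
A structured format like this lets each node carry its own decommission timeout, which the flat one-host-per-line exclude file cannot express.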
[jira] [Commented] (YARN-5725) Test uncaught exception in TestContainersMonitorResourceChange.testContainersResourceChange when setting IP and host
[ https://issues.apache.org/jira/browse/YARN-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703434#comment-15703434 ] Daniel Templeton commented on YARN-5725: +1 on the latest patch. > Test uncaught exception in > TestContainersMonitorResourceChange.testContainersResourceChange when setting > IP and host > > > Key: YARN-5725 > URL: https://issues.apache.org/jira/browse/YARN-5725 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Labels: oct16-easy > Attachments: YARN-5725.000.patch, YARN-5725.001.patch, > YARN-5725.002.patch, YARN-5725.003.patch, YARN-5725.004.patch, > YARN-5725.005.patch, YARN-5725.006.patch, YARN-5725.007.patch, > YARN-5725.008.patch, YARN-5725.009.patch, YARN-5725.010.patch, > YARN-5725.011.patch > > Original Estimate: 2h > Remaining Estimate: 2h > > The issue is a warning, but it prevents the container monitor from continuing > 2016-10-12 14:38:23,280 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(594)) - > Uncaught exception in ContainersMonitorImpl while monitoring resource of > container_123456_0001_01_01 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:455) > 2016-10-12 14:38:23,281 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(613)) - > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl > is interrupted. Exiting. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5849) Automatically create YARN control group for pre-mounted cgroups
[ https://issues.apache.org/jira/browse/YARN-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5849: - Attachment: YARN-5849.005.patch Fixing whitespace issue > Automatically create YARN control group for pre-mounted cgroups > --- > > Key: YARN-5849 > URL: https://issues.apache.org/jira/browse/YARN-5849 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Attachments: YARN-5849.000.patch, YARN-5849.001.patch, > YARN-5849.002.patch, YARN-5849.003.patch, YARN-5849.004.patch, > YARN-5849.005.patch > > > YARN can be launched with linux-container-executor.cgroups.mount set to > false. It will search for the cgroup mount paths set up by the administrator > by parsing the /etc/mtab file. You can also specify > resource.percentage-physical-cpu-limit to limit the CPU resources assigned to > containers. > linux-container-executor.cgroups.hierarchy is the root of the settings of all > YARN containers. If this is specified but not created, YARN will fail at > startup: > Caused by: java.io.FileNotFoundException: > /cgroups/cpu/hadoop-yarn/cpu.cfs_period_us (Permission denied) > org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler.updateCgroup(CgroupsLCEResourcesHandler.java:263) > This JIRA is about automatically creating the YARN control group in the case > above. It reduces the cost of administration. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
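The gist of the change, as a minimal standalone sketch (paths and names are illustrative, not NodeManager code): create the configured hierarchy under the pre-mounted controller before writing limits, instead of failing on the missing directory.
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch: ensure the YARN hierarchy exists under a
// pre-mounted cgroup controller before limits are written into it.
public class CgroupHierarchySketch {
  public static Path ensureHierarchy(String controllerMount, String hierarchy)
      throws IOException {
    Path yarnRoot = Paths.get(controllerMount, hierarchy);
    if (!Files.exists(yarnRoot)) {
      // Requires the NM user to have write access to the controller mount.
      Files.createDirectories(yarnRoot);
    }
    return yarnRoot;
  }

  public static void main(String[] args) throws IOException {
    // e.g. /cgroups/cpu + hadoop-yarn, matching the failure in the report.
    System.out.println(ensureHierarchy("/cgroups/cpu", "hadoop-yarn"));
  }
}
{code}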
[jira] [Commented] (YARN-4126) RM should not issue delegation tokens in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703346#comment-15703346 ] Jian He commented on YARN-4126: --- Below is the original code; the logic is also bizarre: what's the point of having this condition guard on whether security is enabled, given that the operation is allowed in an insecure cluster anyway? Doesn't this look like a bug? {code} private boolean isAllowedDelegationTokenOp() throws IOException { if (UserGroupInformation.isSecurityEnabled()) { return EnumSet.of(AuthenticationMethod.KERBEROS, AuthenticationMethod.KERBEROS_SSL, AuthenticationMethod.CERTIFICATE) .contains(UserGroupInformation.getCurrentUser() .getRealAuthenticationMethod()); } else { return true; } } {code} [~venkatnrangan] mentioned there was a JIRA to fix this in Oozie itself; do you have the JIRA number? > RM should not issue delegation tokens in unsecure mode > -- > > Key: YARN-4126 > URL: https://issues.apache.org/jira/browse/YARN-4126 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Bibin A Chundatt > Fix For: 3.0.0-alpha1 > > Attachments: 0001-YARN-4126.patch, 0002-YARN-4126.patch, > 0003-YARN-4126.patch, 0004-YARN-4126.patch, 0005-YARN-4126.patch, > 0006-YARN-4126.patch > > > ClientRMService#getDelegationToken is currently returning a delegation token > in insecure mode. We should not return the token if it's in insecure mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
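For contrast, the behavior this issue's summary calls for can be sketched as follows (illustrative only, not the committed patch): reject delegation token operations outright when security is off, instead of unconditionally returning true.
{code}
import java.io.IOException;
import java.util.EnumSet;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.UserGroupInformation.AuthenticationMethod;

// Illustrative sketch of the stricter check, not the committed patch.
class DelegationTokenOpCheck {
  static boolean isAllowedDelegationTokenOp() throws IOException {
    if (!UserGroupInformation.isSecurityEnabled()) {
      return false; // no delegation tokens in an insecure cluster
    }
    return EnumSet.of(AuthenticationMethod.KERBEROS,
        AuthenticationMethod.KERBEROS_SSL, AuthenticationMethod.CERTIFICATE)
        .contains(UserGroupInformation.getCurrentUser()
            .getRealAuthenticationMethod());
  }
}
{code}
The compatibility debate in this thread is precisely about this flip: clients such as Oozie historically relied on the insecure-mode path returning a (meaningless) token.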
[jira] [Commented] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703319#comment-15703319 ] Xuan Gong commented on YARN-5761: - Rebased and fixed the javadoc > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL: https://issues.apache.org/jira/browse/YARN-5761 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Xuan Gong >Assignee: Xuan Gong > Labels: oct16-medium > Attachments: YARN-5761.1.patch, YARN-5761.1.rebase.patch, > YARN-5761.2.patch, YARN-5761.3.patch, YARN-5761.4.patch, YARN-5761.5.patch, > YARN-5761.6.patch, YARN-5761.7.patch, YARN-5761.7.patch, YARN-5761.8.patch > > > Currently, in the scheduler code, we are doing queue management and scheduling work. > We'd better separate the queue manager out of the scheduler logic. In that case, > it would be much easier and safer to extend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-5761: Attachment: YARN-5761.8.patch > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL: https://issues.apache.org/jira/browse/YARN-5761 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Xuan Gong >Assignee: Xuan Gong > Labels: oct16-medium > Attachments: YARN-5761.1.patch, YARN-5761.1.rebase.patch, > YARN-5761.2.patch, YARN-5761.3.patch, YARN-5761.4.patch, YARN-5761.5.patch, > YARN-5761.6.patch, YARN-5761.7.patch, YARN-5761.7.patch, YARN-5761.8.patch > > > Currently, in the scheduler code, we are doing queue management and scheduling work. > We'd better separate the queue manager out of the scheduler logic. In that case, > it would be much easier and safer to extend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5933) ATS stale entries in active directory causes ApplicationNotFoundException in RM
[ https://issues.apache.org/jira/browse/YARN-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703312#comment-15703312 ] Li Lu commented on YARN-5933: - After putting some thought into this issue, I am hesitant to directly remove the active directory when we see an unknown application exception. That the RM does not recognize the application ID does not mean the application is not running. It certainly does not mean there is no concurrent writer to this active directory, although in this reported case this is true. Therefore, simply removing the active directory may not work for the cases where some "hidden" applications are actually writing to the directory even though the RM does not recognize the app. > ATS stale entries in active directory causes ApplicationNotFoundException in > RM > --- > > Key: YARN-5933 > URL: https://issues.apache.org/jira/browse/YARN-5933 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph > > On a secure cluster where ATS is down, a submitted Tez job will fail while > getting the TIMELINE_DELEGATION_TOKEN with the below exception > {code} > 0: jdbc:hive2://kerberos-2.openstacklocal:100> select csmallint from > alltypesorc group by csmallint; > INFO : Session is already open > INFO : Dag name: select csmallint from alltypesor...csmallint(Stage-1) > INFO : Tez session was closed. Reopening... > ERROR : Failed to execute tez graph. > java.lang.RuntimeException: Failed to connect to timeline server. Connection > retries limit exceeded. The posted timeline event may be missing > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:266) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:590) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:506) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250) > at > org.apache.tez.client.TezYarnClient.submitApplication(TezYarnClient.java:72) > at org.apache.tez.client.TezClient.start(TezClient.java:409) > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:196) > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeAndOpen(TezSessionPoolManager.java:311) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:453) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:180) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1728) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1485) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1262) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1126) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1121) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154) > at > org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71) > at > org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206) > at 
java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) > at > org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} > Tez YarnClient has received an application ID from the RM. On restarting ATS now, > ATS tries to get the application report from the RM, so the RM will throw > ApplicationNotFoundException. ATS will keep on requesting, which floods the RM. > {code} > RM logs: > 2016-11-23 13:53:57,345 INFO > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new > applicationId: 5 >
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703305#comment-15703305 ] Jian He commented on YARN-5910: --- To summarize: ideally, the token should be self-sufficient for discovering the renewer address. But this is not the case if Hdfs is in HA mode, which uses a logical URI for the token service name; the RM has to rely on the local hdfs config to discover the renewer address. To let the RM not depend on the local hdfs config, below are the possible approaches I can think of: - 1) Change the way the hdfs token is constructed in HA to be self-sufficient: instead of using a logical URI, probably use a comma-separated list of real addresses, and change the DFS client HA implementation all the way down to not rely on configuration. I guess this is too big a change for hdfs to be accepted. - 2) Push the token renewal responsibility to the AM itself. That is, we distribute the kerberos keytab along with the AM and let the AM itself renew the token periodically, instead of the RM doing the renewal. We would probably write a library for this to avoid each AM writing its own (see the sketch after this message). - 3) Have ApplicationSubmissionContext carry an app configuration object; the RM uses this configuration object for token renewal instead of the local config. [~jlowe], would you mind sharing some thoughts on this? > Support for multi-cluster delegation tokens > --- > > Key: YARN-5910 > URL: https://issues.apache.org/jira/browse/YARN-5910 > Project: Hadoop YARN > Issue Type: New Feature > Components: security >Reporter: Clay B. >Priority: Minor > > As an administrator running many secure (kerberized) clusters, some which > have peer clusters managed by other teams, I am looking for a way to run jobs > which may require services running on other clusters. Particular cases where > this rears itself are running something as core as a distcp between two > kerberized clusters (e.g. {{hadoop --config /home/user292/conf/ distcp > hdfs://LOCALCLUSTER/user/user292/test.out > hdfs://REMOTECLUSTER/user/user292/test.out.result}}). > Thanks to YARN-3021, one can run for a while, but if the delegation token for > the remote cluster needs renewal the job will fail [1]. One can pre-configure > their {{hdfs-site.xml}} loaded by the YARN RM to know of all possible HDFSes > available but that requires coordination that is not always feasible, > especially as a cluster's peers grow into the tens of clusters or across > management teams. Ideally, one could have core systems configured this way > but jobs could also specify their own handling of tokens and management when > needed? > [1]: Example stack trace when the RM is unaware of a remote service: > > {code} > 2016-03-23 14:59:50,528 INFO > org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: > application_1458441356031_3317 found existing hdfs token Kind: > HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: > (HDFS_DELEGATION_TOKEN token > 10927 for user292) > 2016-03-23 14:59:50,557 WARN > org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: > Unable to add the application to the delegation token renewer. 
> java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, > Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for > user292) > at > org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427) > at > org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78) > at > org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781) > at > org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: java.io.IOException: Unable to map logical nameservice URI > 'hdfs://REMOTECLUSTER' to a NameNode. Local configuration does not have a > failover proxy provider configured. > at org.apache.hadoop.hdfs.DFSClient$Renewer.getNNProxy(DFSClient.java:1164) > at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:1128) > at org.apache.hadoop.security.token.Token.renew(Token.java:377) > at > org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:516) > at >
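Approach (2) above, sketched very roughly; all names are illustrative, and the library design is exactly what is being discussed, so none of this is an existing YARN API: the AM logs in from a keytab shipped with it and periodically re-logs in, taking the RM out of the renewal path entirely.
{code}
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.security.UserGroupInformation;

// Rough sketch of approach (2); not an existing YARN library.
public class AmSelfRenewalSketch {
  public static void main(String[] args) throws IOException {
    // Keytab shipped with the AM, e.g. via the distributed cache
    // (principal and path are illustrative).
    UserGroupInformation.loginUserFromKeytab(
        "user292@EXAMPLE.COM", "am.keytab");

    ScheduledExecutorService renewer =
        Executors.newSingleThreadScheduledExecutor();
    renewer.scheduleAtFixedRate(() -> {
      try {
        // Re-login before the TGT expires; fresh service tokens can then
        // be re-acquired by the AM itself instead of renewed by the RM.
        UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
      } catch (IOException e) {
        System.err.println("Relogin failed: " + e);
      }
    }, 1, 1, TimeUnit.HOURS);
  }
}
{code}
The trade-off is operational: this removes the RM's dependence on local hdfs config for remote clusters, but it requires distributing keytabs with applications, which many deployments consider a larger security exposure than the configuration coordination it avoids.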
[jira] [Commented] (YARN-5849) Automatically create YARN control group for pre-mounted cgroups
[ https://issues.apache.org/jira/browse/YARN-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703304#comment-15703304 ] Hadoop QA commented on YARN-5849: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 26 unchanged - 23 fixed = 26 total (was 49) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 230 unchanged - 5 fixed = 230 total (was 235) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 26s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 41s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 14s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings.
[jira] [Commented] (YARN-5536) Multiple format support (JSON, etc.) for exclude node file in NM graceful decommission with timeout
[ https://issues.apache.org/jira/browse/YARN-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703297#comment-15703297 ] Junping Du commented on YARN-5536: -- Not yet from my side. This JIRA comes from YARN-4676 to address [~mingma]'s comments. [~mingma] and [~rkanter], do you have plans to work on it? > Multiple format support (JSON, etc.) for exclude node file in NM graceful > decommission with timeout > --- > > Key: YARN-5536 > URL: https://issues.apache.org/jira/browse/YARN-5536 > Project: Hadoop YARN > Issue Type: Sub-task > Components: graceful >Reporter: Junping Du >Priority: Blocker > > Per discussion in YARN-4676, we agree that multiple formats (other than XML) > should be supported to decommission nodes with timeout values. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4126) RM should not issue delegation tokens in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703265#comment-15703265 ] Andrew Wang commented on YARN-4126: --- Ping on this JIRA. Should we revert while discussion continues? I'd prefer to stick with the 2.x behavior if we're not in agreement, and Daryn's previous comment seemed like a strong -0 if not a -1. > RM should not issue delegation tokens in unsecure mode > -- > > Key: YARN-4126 > URL: https://issues.apache.org/jira/browse/YARN-4126 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Bibin A Chundatt > Fix For: 3.0.0-alpha1 > > Attachments: 0001-YARN-4126.patch, 0002-YARN-4126.patch, > 0003-YARN-4126.patch, 0004-YARN-4126.patch, 0005-YARN-4126.patch, > 0006-YARN-4126.patch > > > ClientRMService#getDelegationToken is currently returning a delegation token > in insecure mode. We should not return the token if it's in insecure mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5536) Multiple format support (JSON, etc.) for exclude node file in NM graceful decommission with timeout
[ https://issues.apache.org/jira/browse/YARN-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703227#comment-15703227 ] Andrew Wang commented on YARN-5536: --- This is marked as a blocker; is there an update? > Multiple format support (JSON, etc.) for exclude node file in NM graceful > decommission with timeout > --- > > Key: YARN-5536 > URL: https://issues.apache.org/jira/browse/YARN-5536 > Project: Hadoop YARN > Issue Type: Sub-task > Components: graceful >Reporter: Junping Du >Priority: Blocker > > Per discussion in YARN-4676, we agree that multiple formats (other than XML) > should be supported to decommission nodes with timeout values. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5909) Remove the agent code in slider and upgrade framework jetty version
[ https://issues.apache.org/jira/browse/YARN-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703205#comment-15703205 ] Jian He commented on YARN-5909: --- Thanks, Billie, for the review! - Fixed the typo in WebAppApi - Removed the jetty dependencies from slider core pom.xml - Removed the jackson-jaxrs-json-provider dependency added in hadoop-project/pom.xml bq. Let's open a follow-up patch to remove AgentKeys. Will do > Remove the agent code in slider and upgrade framework jetty version > > > Key: YARN-5909 > URL: https://issues.apache.org/jira/browse/YARN-5909 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He > Attachments: YARN-5909-yarn-native-services.01.patch, > YARN-5909-yarn-native-services.02.patch > > > Hadoop core has upgraded jetty version to 9 , > The framework (slider AM) also needs to upgrade the jetty version > Problem is that some legacy agent code uses classes which only exist in old > jetty. Probably it's the time to remove all the agent related code ? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-5909) Remove the agent code in slider and upgrade framework jetty version
[ https://issues.apache.org/jira/browse/YARN-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He reassigned YARN-5909: - Assignee: Jian He > Remove the agent code in slider and upgrade framework jetty version > > > Key: YARN-5909 > URL: https://issues.apache.org/jira/browse/YARN-5909 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-5909-yarn-native-services.01.patch, > YARN-5909-yarn-native-services.02.patch > > > Hadoop core has upgraded jetty version to 9 , > The framework (slider AM) also needs to upgrade the jetty version > Problem is that some legacy agent code uses classes which only exist in old > jetty. Probably it's the time to remove all the agent related code ? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5909) Remove the agent code in slider and upgrade framework jetty version
[ https://issues.apache.org/jira/browse/YARN-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-5909: -- Attachment: YARN-5909-yarn-native-services.02.patch > Remove the agent code in slider and upgrade framework jetty version > > > Key: YARN-5909 > URL: https://issues.apache.org/jira/browse/YARN-5909 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He > Attachments: YARN-5909-yarn-native-services.01.patch, > YARN-5909-yarn-native-services.02.patch > > > Hadoop core has upgraded jetty version to 9 , > The framework (slider AM) also needs to upgrade the jetty version > Problem is that some legacy agent code uses classes which only exist in old > jetty. Probably it's the time to remove all the agent related code ? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5933) ATS stale entries in active directory causes ApplicationNotFoundException in RM
[ https://issues.apache.org/jira/browse/YARN-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703184#comment-15703184 ] Li Lu commented on YARN-5933: - bq. ATS will keep on requesting, which floods the RM. [~Prabhu Joseph] by saying "flood" do you mean the ATS issued requests to the RM at a frequency higher than expected? > ATS stale entries in active directory causes ApplicationNotFoundException in > RM > --- > > Key: YARN-5933 > URL: https://issues.apache.org/jira/browse/YARN-5933 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph > > On a secure cluster where ATS is down, a submitted Tez job will fail while > getting the TIMELINE_DELEGATION_TOKEN with the below exception > {code} > 0: jdbc:hive2://kerberos-2.openstacklocal:100> select csmallint from > alltypesorc group by csmallint; > INFO : Session is already open > INFO : Dag name: select csmallint from alltypesor...csmallint(Stage-1) > INFO : Tez session was closed. Reopening... > ERROR : Failed to execute tez graph. > java.lang.RuntimeException: Failed to connect to timeline server. Connection > retries limit exceeded. The posted timeline event may be missing > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:266) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:590) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:506) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250) > at > org.apache.tez.client.TezYarnClient.submitApplication(TezYarnClient.java:72) > at org.apache.tez.client.TezClient.start(TezClient.java:409) > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:196) > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeAndOpen(TezSessionPoolManager.java:311) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:453) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:180) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1728) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1485) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1262) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1126) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1121) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154) > at > org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71) > at > org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) > at > org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} > The Tez YarnClient has received an application ID from the RM. When the ATS is restarted, > it tries to get the application report from the RM, so the RM throws > ApplicationNotFoundException. The ATS keeps on requesting, which floods the RM. > {code} > RM logs: > 2016-11-23 13:53:57,345 INFO > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new > applicationId: 5 > 2016-11-23 14:05:04,936 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 9 on 8050, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from 172.26.71.120:37699 Call#26 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1479897867169_0005' doesn't exist in RM. > at >
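The flooding described above comes from a tight retry loop around getApplicationReport: a stale entry makes the ATS retry immediately and indefinitely. A minimal sketch of a gentler loop, assuming a hypothetical helper class (names and constants here are invented, not taken from any attached patch):

{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;

// Hypothetical sketch, not the ATS code: poll the RM with capped
// exponential backoff so a stale entry cannot flood it with requests.
public final class BackoffReportFetcher {
  public static ApplicationReport fetch(YarnClient client, ApplicationId appId)
      throws Exception {
    long backoffMs = 1_000;            // first retry after one second
    final long maxBackoffMs = 60_000;  // never wait longer than a minute
    while (true) {
      try {
        return client.getApplicationReport(appId);
      } catch (ApplicationNotFoundException e) {
        Thread.sleep(backoffMs);       // back off before the next attempt
        backoffMs = Math.min(backoffMs * 2, maxBackoffMs);
      }
    }
  }
}
{code}

A real fix would presumably also give up after some bound and clean the stale entry out of the active directory rather than retrying forever.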
[jira] [Updated] (YARN-5909) Remove the agent code in slider and upgrade framework jetty version
[ https://issues.apache.org/jira/browse/YARN-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-5909: -- Attachment: (was: YARN-5909-yarn-native-services.02.patch) > Remove the agent code in slider and upgrade framework jetty version > > > Key: YARN-5909 > URL: https://issues.apache.org/jira/browse/YARN-5909 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He > Attachments: YARN-5909-yarn-native-services.01.patch > > > Hadoop core has upgraded its jetty version to 9, > so the framework (slider AM) also needs to upgrade its jetty version. > The problem is that some legacy agent code uses classes which only exist in the old > jetty. Probably it's time to remove all the agent-related code? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5909) Remove the agent code in slider and upgrade framework jetty version
[ https://issues.apache.org/jira/browse/YARN-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-5909: -- Attachment: YARN-5909-yarn-native-services.02.patch > Remove the agent code in slider and upgrade framework jetty version > > > Key: YARN-5909 > URL: https://issues.apache.org/jira/browse/YARN-5909 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He > Attachments: YARN-5909-yarn-native-services.01.patch, > YARN-5909-yarn-native-services.02.patch > > > Hadoop core has upgraded its jetty version to 9, > so the framework (slider AM) also needs to upgrade its jetty version. > The problem is that some legacy agent code uses classes which only exist in the old > jetty. Probably it's time to remove all the agent-related code? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5915) ATS 1.5 FileSystemTimelineWriter causes flush() to be called after every event write
[ https://issues.apache.org/jira/browse/YARN-5915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated YARN-5915: -- Target Version/s: 3.0.0-alpha2 (was: 3.0.0-alpha1) > ATS 1.5 FileSystemTimelineWriter causes flush() to be called after every > event write > > > Key: YARN-5915 > URL: https://issues.apache.org/jira/browse/YARN-5915 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 3.0.0-alpha1 >Reporter: Atul Sikaria >Assignee: Atul Sikaria > Attachments: YARN-5915.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5899) A small fix for printing debug info inside function canAssignToThisQueue()
[ https://issues.apache.org/jira/browse/YARN-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated YARN-5899: -- Target Version/s: 3.0.0-alpha2 (was: 3.0.0-alpha1) > A small fix for printing debug info inside function canAssignToThisQueue() > -- > > Key: YARN-5899 > URL: https://issues.apache.org/jira/browse/YARN-5899 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.0.0-alpha1 >Reporter: Ying Zhang >Assignee: Ying Zhang >Priority: Trivial > Attachments: YARN-5899.001.patch > > > A small fix inside function canAssignToThisQueue() for printing DEBUG info. > Please see patch attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5849) Automatically create YARN control group for pre-mounted cgroups
[ https://issues.apache.org/jira/browse/YARN-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5849: - Attachment: YARN-5849.004.patch > Automatically create YARN control group for pre-mounted cgroups > --- > > Key: YARN-5849 > URL: https://issues.apache.org/jira/browse/YARN-5849 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Attachments: YARN-5849.000.patch, YARN-5849.001.patch, > YARN-5849.002.patch, YARN-5849.003.patch, YARN-5849.004.patch > > > YARN can be launched with linux-container-executor.cgroups.mount set to > false. It will then search for the cgroup mount paths set up by the administrator, > by parsing the /etc/mtab file. You can also specify > resource.percentage-physical-cpu-limit to limit the CPU resources assigned to > containers. > linux-container-executor.cgroups.hierarchy is the root of the settings of all > YARN containers. If this is specified but not created, YARN will fail at > startup: > Caused by: java.io.FileNotFoundException: > /cgroups/cpu/hadoop-yarn/cpu.cfs_period_us (Permission denied) > org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler.updateCgroup(CgroupsLCEResourcesHandler.java:263) > This JIRA is about automatically creating the YARN control group in the case > above, which reduces the cost of administration. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5849) Automatically create YARN control group for pre-mounted cgroups
[ https://issues.apache.org/jira/browse/YARN-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703125#comment-15703125 ] Miklos Szegedi commented on YARN-5849: -- Thank you for the review, [~templedf]! I addressed all the issues that you raised except the one below, and I will send the update soon. "In getErrorWithDetails() you may want to use UserGroupInformation.getCurrentUser() instead of the system property." I need to show the system user that will access the cgroup folder. I think System.getProperty("user.name") is the right one here, based on my research. UserGroupInformation.getCurrentUser().getRealUser() returns null when running from the tests. Since there are other users of System.getProperty("user.name") in the Hadoop code base, I suggest using it here too. > Automatically create YARN control group for pre-mounted cgroups > --- > > Key: YARN-5849 > URL: https://issues.apache.org/jira/browse/YARN-5849 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Attachments: YARN-5849.000.patch, YARN-5849.001.patch, > YARN-5849.002.patch, YARN-5849.003.patch > > > YARN can be launched with linux-container-executor.cgroups.mount set to > false. It will then search for the cgroup mount paths set up by the administrator, > by parsing the /etc/mtab file. You can also specify > resource.percentage-physical-cpu-limit to limit the CPU resources assigned to > containers. > linux-container-executor.cgroups.hierarchy is the root of the settings of all > YARN containers. If this is specified but not created, YARN will fail at > startup: > Caused by: java.io.FileNotFoundException: > /cgroups/cpu/hadoop-yarn/cpu.cfs_period_us (Permission denied) > org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler.updateCgroup(CgroupsLCEResourcesHandler.java:263) > This JIRA is about automatically creating the YARN control group in the case > above, which reduces the cost of administration. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
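To make the contrast in the comment above concrete, here is a small illustrative snippet (not from the patch) showing the two user lookups side by side:

{code}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

// Illustration only: compare the JVM system property with the
// Hadoop security view of the current user.
public final class WhoAmI {
  public static void main(String[] args) throws IOException {
    // The OS-level user of the JVM; always set, also in unit tests.
    String systemUser = System.getProperty("user.name");
    // Hadoop's security view; getRealUser() can be null outside a
    // real proxy-user context, e.g. when running from tests.
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    System.out.println("system user  = " + systemUser);
    System.out.println("UGI short    = " + ugi.getShortUserName());
    System.out.println("UGI realUser = " + ugi.getRealUser()); // may be null
  }
}
{code}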
[jira] [Comment Edited] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703107#comment-15703107 ] Botong Huang edited comment on YARN-5525 at 11/28/16 9:04 PM: -- Thanks [~xgong], yes, this is an issue. My thinking is: if customers need to customize the location of the logs, and choose not to upload them into the RemoteRootDir configured and assumed by Yarn, then I assume they will handle the downstream log usage themselves; is it then okay for the yarn logs CLI and the RM UI not to be able to find the logs? Or, in their private implementation, they can copy the logs to both places: one for Yarn to understand, and one for their customized use. was (Author: botong): Thanks [~xgong], yes this is an issue. I am thinking if the customer needs to customize the location of the logs, and choose not to upload the logs into the configured RemoteRootDir assumed by Yarn, then I assume they will handle the downstream log usage, and it is okay for yarn log cli and RM UI not to be able to find it? > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Labels: oct16-medium > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch, YARN-5525.v4.patch, YARN-5525.v5.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors, such as an app-specific log aggregation > directory or log aggregation format, can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703107#comment-15703107 ] Botong Huang commented on YARN-5525: Thanks [~xgong], yes, this is an issue. My thinking is: if customers need to customize the location of the logs, and choose not to upload them into the RemoteRootDir configured and assumed by Yarn, then I assume they will handle the downstream log usage themselves; is it then okay for the yarn logs CLI and the RM UI not to be able to find the logs? > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Labels: oct16-medium > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch, YARN-5525.v4.patch, YARN-5525.v5.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors, such as an app-specific log aggregation > directory or log aggregation format, can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
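For context, the standard Hadoop idiom for this kind of pluggability is a class-valued configuration key resolved through ReflectionUtils. A sketch under invented names (the key, the interface, and the default class below are all hypothetical, not the patch's actual API):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ReflectionUtils;

// Hypothetical base type for a pluggable aggregator.
interface PluggableLogAggregator {
  void aggregate();
}

// Hypothetical default with a no-arg constructor for ReflectionUtils.
class DefaultLogAggregator implements PluggableLogAggregator {
  @Override
  public void aggregate() { /* upload to the configured RemoteRootDir */ }
}

public final class LogAggregatorFactory {
  // Invented key; a real one would be defined in YarnConfiguration.
  static final String KEY = "yarn.log-aggregation.service.class";

  public static PluggableLogAggregator create(Configuration conf) {
    Class<? extends PluggableLogAggregator> clazz = conf.getClass(
        KEY, DefaultLogAggregator.class, PluggableLogAggregator.class);
    // ReflectionUtils injects conf into Configurable implementations.
    return ReflectionUtils.newInstance(clazz, conf);
  }
}
{code}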
[jira] [Commented] (YARN-5725) Test uncaught exception in TestContainersMonitorResourceChange.testContainersResourceChange when setting IP and host
[ https://issues.apache.org/jira/browse/YARN-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703005#comment-15703005 ] Hadoop QA commented on YARN-5725: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 34s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 23s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-5725 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12840705/YARN-5725.011.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 9dcac9a8aca9 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 5d5614f | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/14081/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/14081/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Test uncaught exception in > TestContainersMonitorResourceChange.testContainersResourceChange when setting > IP and host > > > Key: YARN-5725 > URL: https://issues.apache.org/jira/browse/YARN-5725 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >
[jira] [Updated] (YARN-5725) Test uncaught exception in TestContainersMonitorResourceChange.testContainersResourceChange when setting IP and host
[ https://issues.apache.org/jira/browse/YARN-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5725: - Attachment: YARN-5725.011.patch Thank you for the review [~templedf]. I updated the patch as requested. > Test uncaught exception in > TestContainersMonitorResourceChange.testContainersResourceChange when setting > IP and host > > > Key: YARN-5725 > URL: https://issues.apache.org/jira/browse/YARN-5725 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Labels: oct16-easy > Attachments: YARN-5725.000.patch, YARN-5725.001.patch, > YARN-5725.002.patch, YARN-5725.003.patch, YARN-5725.004.patch, > YARN-5725.005.patch, YARN-5725.006.patch, YARN-5725.007.patch, > YARN-5725.008.patch, YARN-5725.009.patch, YARN-5725.010.patch, > YARN-5725.011.patch > > Original Estimate: 2h > Remaining Estimate: 2h > > The issue is only a warning, but it prevents the container monitor from continuing > 2016-10-12 14:38:23,280 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(594)) - > Uncaught exception in ContainersMonitorImpl while monitoring resource of > container_123456_0001_01_01 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:455) > 2016-10-12 14:38:23,281 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(613)) - > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl > is interrupted. Exiting. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
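The NPE above terminates the monitoring pass for that container. As a rough illustration of the defensive pattern involved (invented stand-in types, not the committed fix), the monitor can simply skip entries whose tracking state is not initialized yet:

{code}
import java.util.Map;

// Invented stand-ins for the NM's internal monitoring types.
interface ProcessTree { }
class ProcessTreeInfo {
  ProcessTree tree;
  ProcessTree getProcessTree() { return tree; }
}

public final class MonitorLoopSketch {
  // Skip entries that are not ready instead of letting a null
  // dereference terminate the monitor thread.
  static void monitorOnce(Map<String, ProcessTreeInfo> tracking) {
    for (Map.Entry<String, ProcessTreeInfo> e : tracking.entrySet()) {
      ProcessTreeInfo info = e.getValue();
      if (info == null || info.getProcessTree() == null) {
        continue; // not initialized yet; revisit on the next pass
      }
      // resource accounting for e.getKey() would go here
    }
  }
}
{code}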
[jira] [Commented] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702756#comment-15702756 ] Daniel Templeton commented on YARN-5761: I would really rather that we add the description for the throws clause than introduce more javadoc errors. A useful message for the throws clause is a good thing. Embrace the javadoc; don't fight it. :) > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL: https://issues.apache.org/jira/browse/YARN-5761 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Xuan Gong >Assignee: Xuan Gong > Labels: oct16-medium > Attachments: YARN-5761.1.patch, YARN-5761.1.rebase.patch, > YARN-5761.2.patch, YARN-5761.3.patch, YARN-5761.4.patch, YARN-5761.5.patch, > YARN-5761.6.patch, YARN-5761.7.patch, YARN-5761.7.patch > > > Currently, the scheduler code does both queue management and scheduling work. > We'd better separate the queue manager out of the scheduler logic. That way, > it would be much easier and safer to extend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
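What is being asked for here is just a described throws clause. A trivial illustration (the method and wording are invented, not from the patch):

{code}
/**
 * Reloads the queue hierarchy from the scheduler configuration.
 *
 * @param conf the scheduler configuration to load queues from
 * @throws IOException if the queue configuration cannot be read or
 *         describes an invalid queue hierarchy
 */
void reinitializeQueues(Configuration conf) throws IOException;
{code}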
[jira] [Commented] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702740#comment-15702740 ] Xuan Gong commented on YARN-5761: - bq. I think there are some java doc issues. Could u pls help to update the same if you have bandwidth Those are missing descriptions for @throws IOException. I think we could ignore them. > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL: https://issues.apache.org/jira/browse/YARN-5761 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Xuan Gong >Assignee: Xuan Gong > Labels: oct16-medium > Attachments: YARN-5761.1.patch, YARN-5761.1.rebase.patch, > YARN-5761.2.patch, YARN-5761.3.patch, YARN-5761.4.patch, YARN-5761.5.patch, > YARN-5761.6.patch, YARN-5761.7.patch, YARN-5761.7.patch > > > Currently, the scheduler code does both queue management and scheduling work. > We'd better separate the queue manager out of the scheduler logic. That way, > it would be much easier and safer to extend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5762) Summarize ApplicationNotFoundException in the RM log
[ https://issues.apache.org/jira/browse/YARN-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702699#comment-15702699 ] Xuan Gong commented on YARN-5762: - [~raviprak] I prefer to solve the problem from the caller's side. It looks like the AggregatedLogDeletionService is only used in the JHS right now; can we find a way to extend AggregatedLogDeletionService in the JHS so that it checks the app status from the JHS instead of the RM? > Summarize ApplicationNotFoundException in the RM log > > > Key: YARN-5762 > URL: https://issues.apache.org/jira/browse/YARN-5762 > Project: Hadoop YARN > Issue Type: Task >Affects Versions: 2.7.2 >Reporter: Ravi Prakash >Assignee: Ravi Prakash >Priority: Minor > Attachments: YARN-5762.01.patch > > > We found a lot of {{ApplicationNotFoundException}} in the RM logs. These were > most likely caused by the {{AggregatedLogDeletionService}} [which > checks|https://github.com/apache/hadoop/blob/262827cf75bf9c48cd95335eb04fd8ff1d64c538/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L156] > that the application is not running anymore, e.g. > {code}2016-10-17 15:25:26,542 INFO org.apache.hadoop.ipc.Server: IPC Server > handler 20 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35401 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1451' doesn't exist in RM. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > 2016-10-17 15:25:26,633 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 47 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35404 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1452' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
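One common way to "summarize" such repeated exceptions, sketched here purely as an illustration (invented class, not the attached patch, which may take a different approach such as lowering the log level), is to count occurrences and emit a single periodic summary line:

{code}
import java.util.concurrent.atomic.AtomicLong;

// Invented sketch of a summarizing logger: log one line per interval
// with the count of suppressed occurrences (not strictly thread-safe;
// good enough to show the idea).
public final class SummarizingLogger {
  private final AtomicLong suppressed = new AtomicLong();
  private volatile long lastLogMillis;
  private static final long INTERVAL_MS = 60_000; // one summary per minute

  public void onApplicationNotFound(String appId) {
    long now = System.currentTimeMillis();
    if (now - lastLogMillis >= INTERVAL_MS) {
      long n = suppressed.getAndSet(0);
      lastLogMillis = now;
      System.out.println("ApplicationNotFoundException for " + appId
          + " (" + n + " similar events suppressed in the last interval)");
    } else {
      suppressed.incrementAndGet();
    }
  }
}
{code}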
[jira] [Commented] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702688#comment-15702688 ] Xuan Gong commented on YARN-5525: - [~botong] [~jianhe] So, if customers can customize the log aggregation service, such as by using a different remote dir or a different file format, how can we fetch the logs? The yarn logs CLI and the log links in the RM UI / AHS UI will break. If we really want to customize the log aggregation service, this jira is not enough; we should have a detailed plan for how to investigate and solve all the related problems. > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Labels: oct16-medium > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch, YARN-5525.v4.patch, YARN-5525.v5.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors, such as an app-specific log aggregation > directory or log aggregation format, can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5924) Resource Manager fails to load state with InvalidProtocolBufferException
[ https://issues.apache.org/jira/browse/YARN-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702652#comment-15702652 ] ASF GitHub Bot commented on YARN-5924: -- Github user ameks94 commented on the issue: https://github.com/apache/hadoop/pull/164 Update PR to fix the checkstyle and whitespace test failures. > Resource Manager fails to load state with InvalidProtocolBufferException > > > Key: YARN-5924 > URL: https://issues.apache.org/jira/browse/YARN-5924 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Oleksii Dymytrov >Assignee: Oleksii Dymytrov > > InvalidProtocolBufferException is thrown during recovery of an > application's state if the application's data has an invalid format (or is broken) > under the FSRMStateRoot/RMAppRoot/application_1477986176766_0134/ directory in > HDFS: > {noformat} > com.google.protobuf.InvalidProtocolBufferException: Protocol message > end-group tag did not match expected tag. > at > com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94) > at > com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124) > at > com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:143) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:176) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:188) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:193) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) > at > org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$ApplicationStateDataProto.parseFrom(YarnServerResourceManagerRecoveryProtos.java:1028) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore$RMAppStateFileProcessor.processChildNode(FileSystemRMStateStore.java:966) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.processDirectoriesOfFiles(FileSystemRMStateStore.java:317) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMAppState(FileSystemRMStateStore.java:281) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:232) > {noformat} > The solution can be to catch "InvalidProtocolBufferException", show a warning, > and remove the application's folder that contains the invalid data, to prevent the RM > restart from failing. > Additionally, I've added a catch for other exceptions that can appear during > recovery of a specific application, to avoid RM failure even if only > one application's state can't be loaded. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5924) Resource Manager fails to load state with InvalidProtocolBufferException
[ https://issues.apache.org/jira/browse/YARN-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702655#comment-15702655 ] Oleksii Dymytrov commented on YARN-5924: Kindly review the PR. > Resource Manager fails to load state with InvalidProtocolBufferException > > > Key: YARN-5924 > URL: https://issues.apache.org/jira/browse/YARN-5924 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Oleksii Dymytrov >Assignee: Oleksii Dymytrov > > InvalidProtocolBufferException is thrown during recovery of an > application's state if the application's data has an invalid format (or is broken) > under the FSRMStateRoot/RMAppRoot/application_1477986176766_0134/ directory in > HDFS: > {noformat} > com.google.protobuf.InvalidProtocolBufferException: Protocol message > end-group tag did not match expected tag. > at > com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94) > at > com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124) > at > com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:143) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:176) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:188) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:193) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) > at > org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$ApplicationStateDataProto.parseFrom(YarnServerResourceManagerRecoveryProtos.java:1028) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore$RMAppStateFileProcessor.processChildNode(FileSystemRMStateStore.java:966) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.processDirectoriesOfFiles(FileSystemRMStateStore.java:317) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMAppState(FileSystemRMStateStore.java:281) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:232) > {noformat} > The solution can be to catch "InvalidProtocolBufferException", show a warning, > and remove the application's folder that contains the invalid data, to prevent the RM > restart from failing. > Additionally, I've added a catch for other exceptions that can appear during > recovery of a specific application, to avoid RM failure even if only > one application's state can't be loaded. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
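The proposed behavior boils down to a per-application try/catch in the state-store recovery loop. A schematic sketch with simplified stand-in names (not the actual pull request):

{code}
import com.google.protobuf.InvalidProtocolBufferException;
import java.util.List;

// All names below are simplified stand-ins for the state-store internals;
// this is a sketch of the proposal, not the actual pull request.
abstract class RecoverySketch {
  abstract List<String> listApplicationDirs();
  abstract void recoverApplication(String dir)
      throws InvalidProtocolBufferException;
  abstract void removeApplicationDir(String dir);

  // Per-application try/catch: one corrupt entry no longer fails the
  // whole RM recovery; it is logged and removed instead.
  void loadRMAppState() {
    for (String dir : listApplicationDirs()) {
      try {
        recoverApplication(dir);
      } catch (InvalidProtocolBufferException e) {
        System.err.println("Skipping corrupt app state " + dir + ": " + e);
        removeApplicationDir(dir); // drop the broken entry, as proposed
      }
    }
  }
}
{code}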
[jira] [Commented] (YARN-5927) BaseContainerManagerTest::waitForNMContainerState timeout accounting is not accurate
[ https://issues.apache.org/jira/browse/YARN-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702603#comment-15702603 ] Miklos Szegedi commented on YARN-5927: -- Thank you for the patch [~lewuathe]! I also suggest reducing {code}Thread.sleep(2000);{code} to 10 milliseconds. That is the usual timescale on which these container events occur, I think. > BaseContainerManagerTest::waitForNMContainerState timeout accounting is not > accurate > > > Key: YARN-5927 > URL: https://issues.apache.org/jira/browse/YARN-5927 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Kai Sasaki >Priority: Trivial > Labels: newbie > Attachments: YARN-5917.01.patch > > > See below that timeoutSecs is increased twice per iteration. We also sleep right away, > before even checking the observed value. > {code} > do { > Thread.sleep(2000); > ... > timeoutSecs += 2; > } while (!finalStates.contains(currentState) > && timeoutSecs++ < timeOutMax); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
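Spelled out, a fixed loop checks the state before sleeping and advances the timeout exactly once per wait. A sketch of the corrected accounting (a fragment; the test's surrounding types are elided, and getCurrentState() is a hypothetical accessor):

{code}
// Observe first, sleep only if needed, count each wait exactly once.
// timeOutMax is taken to be in seconds, converted to milliseconds here.
long waitedMs = 0;
final long pollMs = 10;                  // short poll, per the suggestion above
final long timeoutMs = timeOutMax * 1000L;
ContainerState currentState = getCurrentState(); // hypothetical accessor
while (!finalStates.contains(currentState) && waitedMs < timeoutMs) {
  Thread.sleep(pollMs);
  waitedMs += pollMs;                    // single, consistent increment
  currentState = getCurrentState();
}
{code}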
[jira] [Updated] (YARN-5937) stop-yarn.sh is not able to gracefully stop node managers
[ https://issues.apache.org/jira/browse/YARN-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-5937: -- Description: stop-yarn.sh always gives the following output {code} ./sbin/stop-yarn.sh Stopping resourcemanager Stopping nodemanagers : WARNING: nodemanager did not stop gracefully after 5 seconds: Trying to kill with kill -9 : ERROR: Unable to kill 18097 {code} This is because the resource manager is stopped before the node managers: when the shutdown hook manager tries to gracefully stop the NM services, the NM needs to unregister with the RM, and it times out because the NM cannot connect to the RM (already stopped). See the log (stop the RM, then run kill ) {code} 16/11/28 08:26:43 ERROR nodemanager.NodeManager: RECEIVED SIGNAL 15: SIGTERM ... 16/11/28 08:26:53 WARN util.ShutdownHookManager: ShutdownHook 'CompositeServiceShutdownHook' timeout, java.util.concurrent.TimeoutException java.util.concurrent.TimeoutException at java.util.concurrent.FutureTask.get(FutureTask.java:205) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67) ... at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.unRegisterNM(NodeStatusUpdaterImpl.java:291) ... 16/11/28 08:27:13 ERROR util.ShutdownHookManager: ShutdownHookManger shutdown forcefully. {code} The shutdown hook manager has a default timeout of 10s, so if the RM is stopped before the NMs, they always take more than 10s to stop (in the Java code). However, stop-yarn.sh only gives a 5s timeout, so the NM is always killed instead of stopped. It would make sense to stop the NMs before the RM in this script, in a graceful way. was: stop-yarn.sh always gives following output {code} ./sbin/stop-yarn.sh Stopping resourcemanager Stopping nodemanagers : WARNING: nodemanager did not stop gracefully after 5 seconds: Trying to kill with kill -9 oracle1.fyre.ibm.com: ERROR: Unable to kill 18097 {code} this was because resource manager is stopped before node managers, when the shutdown hook manager tries to gracefully stop NM services, NM needs to unregister with RM, and it gets timeout as NM could not connect to RM (already stopped). See log (stop RM then run kill ) {code} 16/11/28 08:26:43 ERROR nodemanager.NodeManager: RECEIVED SIGNAL 15: SIGTERM ... 16/11/28 08:26:53 WARN util.ShutdownHookManager: ShutdownHook 'CompositeServiceShutdownHook' timeout, java.util.concurrent.TimeoutException java.util.concurrent.TimeoutException at java.util.concurrent.FutureTask.get(FutureTask.java:205) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67) ... at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.unRegisterNM(NodeStatusUpdaterImpl.java:291) ... 16/11/28 08:27:13 ERROR util.ShutdownHookManager: ShutdownHookManger shutdown forcefully. {code} the shutdown hooker has a default of 10s timeout, so if RM is stopped before NMs, they always took more than 10s to stop (in java code). However stop-yarn.sh only gives 5s timeout, so NM is always killed instead of stopped. It would make sense to stop NMs before RMs in this script, in a graceful way. 
> stop-yarn.sh is not able to gracefully stop node managers > - > > Key: YARN-5937 > URL: https://issues.apache.org/jira/browse/YARN-5937 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Weiwei Yang >Assignee: Weiwei Yang > Attachments: YARN-5937.01.patch, nm_shutdown.log > > > stop-yarn.sh always gives the following output > {code} > ./sbin/stop-yarn.sh > Stopping resourcemanager > Stopping nodemanagers > : WARNING: nodemanager did not stop gracefully after 5 seconds: > Trying to kill with kill -9 > : ERROR: Unable to kill 18097 > {code} > This is because the resource manager is stopped before the node managers: when the > shutdown hook manager tries to gracefully stop the NM services, the NM needs to > unregister with the RM, and it times out because the NM cannot connect to the RM > (already stopped). See the log (stop the RM, then run kill ) > {code} > 16/11/28 08:26:43 ERROR nodemanager.NodeManager: RECEIVED SIGNAL 15: SIGTERM > ... > 16/11/28 08:26:53 WARN util.ShutdownHookManager: ShutdownHook > 'CompositeServiceShutdownHook' timeout, java.util.concurrent.TimeoutException > java.util.concurrent.TimeoutException > at java.util.concurrent.FutureTask.get(FutureTask.java:205) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67) > ... > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.unRegisterNM(NodeStatusUpdaterImpl.java:291) > ... > 16/11/28 08:27:13 ERROR util.ShutdownHookManager: ShutdownHookManger shutdown > forcefully. > {code} > The shutdown hook manager has a default timeout of 10s, so if the RM is stopped before > the NMs, they always take more than 10s to stop (in the Java code). However >
[jira] [Commented] (YARN-5909) Remove the agent code in slider and upgrade framework jetty version
[ https://issues.apache.org/jira/browse/YARN-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702479#comment-15702479 ] Billie Rinaldi commented on YARN-5909: -- [~jianhe], thanks for taking this on. The patch looks good so far. The following change in WebAppApi appears to be a typo: {noformat} - * @return the (singleton) instance + * @return the (singleton) instancepro {noformat} I think we can remove the three jetty dependencies from the slider core pom.xml. I don't see any jetty imports except in the removed classes. It looks like we also need to remove the jackson-jaxrs-json-provider dependency added in hadoop-project/pom.xml for the services API, as this has already been added in YARN-5713. Let's open a follow-up patch to remove AgentKeys. Any constants from there that are still being used can be moved to SliderKeys or one of the other Keys classes. > Remove the agent code in slider and upgrade framework jetty version > > > Key: YARN-5909 > URL: https://issues.apache.org/jira/browse/YARN-5909 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He > Attachments: YARN-5909-yarn-native-services.01.patch > > > Hadoop core has upgraded its jetty version to 9, > so the framework (slider AM) also needs to upgrade its jetty version. > The problem is that some legacy agent code uses classes which only exist in the old > jetty. Probably it's time to remove all the agent-related code? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5937) stop-yarn.sh is not able to gracefully stop node managers
[ https://issues.apache.org/jira/browse/YARN-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-5937: -- Attachment: YARN-5937.01.patch > stop-yarn.sh is not able to gracefully stop node managers > - > > Key: YARN-5937 > URL: https://issues.apache.org/jira/browse/YARN-5937 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Weiwei Yang >Assignee: Weiwei Yang > Attachments: YARN-5937.01.patch, nm_shutdown.log > > > stop-yarn.sh always gives the following output > {code} > ./sbin/stop-yarn.sh > Stopping resourcemanager > Stopping nodemanagers > : WARNING: nodemanager did not stop gracefully after 5 seconds: > Trying to kill with kill -9 > oracle1.fyre.ibm.com: ERROR: Unable to kill 18097 > {code} > This is because the resource manager is stopped before the node managers: when the > shutdown hook manager tries to gracefully stop the NM services, the NM needs to > unregister with the RM, and it times out because the NM cannot connect to the RM > (already stopped). See the log (stop the RM, then run kill ) > {code} > 16/11/28 08:26:43 ERROR nodemanager.NodeManager: RECEIVED SIGNAL 15: SIGTERM > ... > 16/11/28 08:26:53 WARN util.ShutdownHookManager: ShutdownHook > 'CompositeServiceShutdownHook' timeout, java.util.concurrent.TimeoutException > java.util.concurrent.TimeoutException > at java.util.concurrent.FutureTask.get(FutureTask.java:205) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67) > ... > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.unRegisterNM(NodeStatusUpdaterImpl.java:291) > ... > 16/11/28 08:27:13 ERROR util.ShutdownHookManager: ShutdownHookManger shutdown > forcefully. > {code} > The shutdown hook manager has a default timeout of 10s, so if the RM is stopped before > the NMs, they always take more than 10s to stop (in the Java code). However > stop-yarn.sh only gives a 5s timeout, so the NM is always killed instead of stopped. > It would make sense to stop the NMs before the RM in this script, in a graceful way. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5937) stop-yarn.sh is not able to gracefully stop node managers
[ https://issues.apache.org/jira/browse/YARN-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-5937: -- Attachment: nm_shutdown.log > stop-yarn.sh is not able to gracefully stop node managers > - > > Key: YARN-5937 > URL: https://issues.apache.org/jira/browse/YARN-5937 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Weiwei Yang >Assignee: Weiwei Yang > Attachments: nm_shutdown.log > > > stop-yarn.sh always gives the following output > {code} > ./sbin/stop-yarn.sh > Stopping resourcemanager > Stopping nodemanagers > : WARNING: nodemanager did not stop gracefully after 5 seconds: > Trying to kill with kill -9 > oracle1.fyre.ibm.com: ERROR: Unable to kill 18097 > {code} > This is because the resource manager is stopped before the node managers: when the > shutdown hook manager tries to gracefully stop the NM services, the NM needs to > unregister with the RM, and it times out because the NM cannot connect to the RM > (already stopped). See the log (stop the RM, then run kill ) > {code} > 16/11/28 08:26:43 ERROR nodemanager.NodeManager: RECEIVED SIGNAL 15: SIGTERM > ... > 16/11/28 08:26:53 WARN util.ShutdownHookManager: ShutdownHook > 'CompositeServiceShutdownHook' timeout, java.util.concurrent.TimeoutException > java.util.concurrent.TimeoutException > at java.util.concurrent.FutureTask.get(FutureTask.java:205) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67) > ... > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.unRegisterNM(NodeStatusUpdaterImpl.java:291) > ... > 16/11/28 08:27:13 ERROR util.ShutdownHookManager: ShutdownHookManger shutdown > forcefully. > {code} > The shutdown hook manager has a default timeout of 10s, so if the RM is stopped before > the NMs, they always take more than 10s to stop (in the Java code). However > stop-yarn.sh only gives a 5s timeout, so the NM is always killed instead of stopped. > It would make sense to stop the NMs before the RM in this script, in a graceful way. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5937) stop-yarn.sh is not able to gracefully stop node managers
Weiwei Yang created YARN-5937: - Summary: stop-yarn.sh is not able to gracefully stop node managers Key: YARN-5937 URL: https://issues.apache.org/jira/browse/YARN-5937 Project: Hadoop YARN Issue Type: Bug Reporter: Weiwei Yang Assignee: Weiwei Yang stop-yarn.sh always gives the following output {code} ./sbin/stop-yarn.sh Stopping resourcemanager Stopping nodemanagers : WARNING: nodemanager did not stop gracefully after 5 seconds: Trying to kill with kill -9 oracle1.fyre.ibm.com: ERROR: Unable to kill 18097 {code} This is because the resource manager is stopped before the node managers: when the shutdown hook manager tries to gracefully stop the NM services, the NM needs to unregister with the RM, and it times out because the NM cannot connect to the RM (already stopped). See the log (stop the RM, then run kill ) {code} 16/11/28 08:26:43 ERROR nodemanager.NodeManager: RECEIVED SIGNAL 15: SIGTERM ... 16/11/28 08:26:53 WARN util.ShutdownHookManager: ShutdownHook 'CompositeServiceShutdownHook' timeout, java.util.concurrent.TimeoutException java.util.concurrent.TimeoutException at java.util.concurrent.FutureTask.get(FutureTask.java:205) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67) ... at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.unRegisterNM(NodeStatusUpdaterImpl.java:291) ... 16/11/28 08:27:13 ERROR util.ShutdownHookManager: ShutdownHookManger shutdown forcefully. {code} The shutdown hook manager has a default timeout of 10s, so if the RM is stopped before the NMs, they always take more than 10s to stop (in the Java code). However, stop-yarn.sh only gives a 5s timeout, so the NM is always killed instead of stopped. It would make sense to stop the NMs before the RM in this script, in a graceful way. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
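On the Java side, the 10-second budget mentioned above is enforced per hook by Hadoop's ShutdownHookManager. As best I recall, recent Hadoop versions let a hook register with an explicit timeout; treat the exact signature below as an assumption rather than a verified API reference:

{code}
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.util.ShutdownHookManager;

public final class GracefulStopSketch {
  public static void register(Runnable stopAllNmServices) {
    // Sketch: give the graceful-stop hook more headroom than the default,
    // so the NM's unregister path can finish before a forced shutdown.
    ShutdownHookManager.get().addShutdownHook(
        stopAllNmServices,
        30,                    // priority
        30, TimeUnit.SECONDS); // per-hook timeout (assumed overload)
  }
}
{code}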
[jira] [Commented] (YARN-5932) Retrospect moveApplicationToQueue in align with YARN-5611
[ https://issues.apache.org/jira/browse/YARN-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15701652#comment-15701652 ] Rohith Sharma K S commented on YARN-5932: - The approach looks fine to me; I will have a detailed look at the patch. > Retrospect moveApplicationToQueue in align with YARN-5611 > - > > Key: YARN-5932 > URL: https://issues.apache.org/jira/browse/YARN-5932 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, resourcemanager >Reporter: Sunil G >Assignee: Sunil G > Attachments: YARN-5932.v0.patch, YARN-5932.v1.patch > > > All dynamic APIs that change an application's state could follow a general > design approach. Currently, priority and app timeouts follow this > approach in all corner cases. > *Steps* > - Do a pre-validation check to ensure that the changes are fine. > - Update this information to the state store. > - Perform the real move operation and update the in-memory data structures. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
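The three steps listed in the description above map naturally onto a validate / persist / apply sequence. A schematic sketch (every name below is invented for illustration, not taken from the patch):

{code}
// Schematic of the three-step update pattern from the description;
// all names are invented for illustration.
abstract class MoveSketch {
  abstract void preValidateMove(String appId, String queue) throws Exception;
  abstract void persistQueue(String appId, String queue) throws Exception;
  abstract void applyMoveInMemory(String appId, String queue);

  void moveApplicationToQueue(String appId, String queue) throws Exception {
    preValidateMove(appId, queue);   // 1. fail fast before any side effect
    persistQueue(appId, queue);      // 2. state store first, so recovery
                                     //    replays a validated, consistent state
    applyMoveInMemory(appId, queue); // 3. then the scheduler's structures
  }
}
{code}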
[jira] [Commented] (YARN-5931) Document timeout interfaces CLI and REST APIs
[ https://issues.apache.org/jira/browse/YARN-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15701517#comment-15701517 ] Hadoop QA commented on YARN-5931: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 12 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 9m 56s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-5931 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12840622/YARN-5931.0.patch | | Optional Tests | asflicense mvnsite | | uname | Linux 59f7900f06c0 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 5d5614f | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/14080/artifact/patchprocess/whitespace-eol.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/14080/artifact/patchprocess/whitespace-tabs.txt | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/14080/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Document timeout interfaces CLI and REST APIs > - > > Key: YARN-5931 > URL: https://issues.apache.org/jira/browse/YARN-5931 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: ResourceManagerRest.html, YARN-5931.0.patch, > YarnCommands.html > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5931) Document timeout interfaces CLI and REST APIs
[ https://issues.apache.org/jira/browse/YARN-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-5931: Attachment: YARN-5931.0.patch Attaching the patch; kindly review it. > Document timeout interfaces CLI and REST APIs > - > > Key: YARN-5931 > URL: https://issues.apache.org/jira/browse/YARN-5931 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: ResourceManagerRest.html, YARN-5931.0.patch, > YarnCommands.html > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5931) Document timeout interfaces CLI and REST APIs
[ https://issues.apache.org/jira/browse/YARN-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-5931: Attachment: YarnCommands.html ResourceManagerRest.html Updating the HTML pages for the patch. > Document timeout interfaces CLI and REST APIs > - > > Key: YARN-5931 > URL: https://issues.apache.org/jira/browse/YARN-5931 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Rohith Sharma K S > Attachments: ResourceManagerRest.html, YarnCommands.html > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org