[jira] [Commented] (YARN-6759) TestRMRestart.testRMRestartWaitForPreviousAMToFinish is failing in trunk
[ https://issues.apache.org/jira/browse/YARN-6759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074448#comment-16074448 ] Feng Yuan commented on YARN-6759: - {code} final int maxRetry = 10; final RMApp rmAppForCheck = rmApp; GenericTestUtils.waitFor( new Supplier() { @Override public Boolean get() { return new Boolean(rmAppForCheck.getAppAttempts().size() == 4); } }, 100, maxRetry); {code} Maybe it should be *maxRetry \* 100*, so the total wait time is greater than the 100 ms check interval. > TestRMRestart.testRMRestartWaitForPreviousAMToFinish is failing in trunk > > > Key: YARN-6759 > URL: https://issues.apache.org/jira/browse/YARN-6759 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Naganarasimha G R > > {code} > java.lang.IllegalArgumentException: Total wait time should be greater than > check interval time > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at > org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:273) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:618) > {code} > refer > https://builds.apache.org/job/PreCommit-YARN-Build/16229/testReport/org.apache.hadoop.yarn.server.resourcemanager/TestRMRestart/testRMRestartWaitForPreviousAMToFinish/ > which ran for YARN-2919 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6759) TestRMRestart.testRMRestartWaitForPreviousAMToFinish is failing in trunk
[ https://issues.apache.org/jira/browse/YARN-6759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074502#comment-16074502 ] Bibin A Chundatt commented on YARN-6759: {code} public static void waitFor(Supplier check, int checkEveryMillis, int waitForMillis) throws TimeoutException, InterruptedException { Preconditions.checkNotNull(check, ERROR_MISSING_ARGUMENT); Preconditions.checkArgument(waitForMillis > checkEveryMillis, ERROR_INVALID_ARGUMENT); {code} Yes, *"maxRetry * 100"* should solve the problem > TestRMRestart.testRMRestartWaitForPreviousAMToFinish is failing in trunk > > > Key: YARN-6759 > URL: https://issues.apache.org/jira/browse/YARN-6759 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Naganarasimha G R > > {code} > java.lang.IllegalArgumentException: Total wait time should be greater than > check interval time > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at > org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:273) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:618) > {code} > refer > https://builds.apache.org/job/PreCommit-YARN-Build/16229/testReport/org.apache.hadoop.yarn.server.resourcemanager/TestRMRestart/testRMRestartWaitForPreviousAMToFinish/ > which ran for YARN-2919 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
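For illustration, a minimal sketch of the suggested fix to the test call, using the surrounding variables ({{rmApp}}, the expected attempt count) from the snippet quoted above; this is a sketch of the idea, not the committed patch:
{code}
final int maxRetry = 10;
final RMApp rmAppForCheck = rmApp;
GenericTestUtils.waitFor(
    new Supplier<Boolean>() {
      @Override
      public Boolean get() {
        return rmAppForCheck.getAppAttempts().size() == 4;
      }
    },
    100,             // checkEveryMillis
    maxRetry * 100); // waitForMillis: total wait time, now greater than the check interval
{code}
Passing maxRetry * 100 as the third argument satisfies the waitForMillis > checkEveryMillis precondition shown in the comment above.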
[jira] [Commented] (YARN-6727) Improve getQueueUserAcls API to query for specific queue and user
[ https://issues.apache.org/jira/browse/YARN-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074516#comment-16074516 ] Sunil G commented on YARN-6727: --- Let me put my thoughts more clearly. # Invoking {{YarnScheduler#checkAccess}} is safer for CS, as we lock with a readLock only in YarnAuthorizationProvider, whereas FS still has some locks in the scheduler. So we should try to limit checkAccess calls as much as possible. # Whichever user is passed to checkAccess at any point of time (from the cli/api side, at app submission time, etc.) could be cached. I prefer this to be stored outside the scheduler. # So any further {{getQueueUserAcls}} call could look into the cache first. Given a cache miss, we can do {{YarnScheduler#checkAccess}} and update the cache. # The cache could be invalidated in cases where a config refresh happened for queues/acls, or in similar conditions. If the changes are minimal, we can do this in one ticket. But if it's more, I am in favor of splitting it into 2 JIRAs for the API and scheduler changes. > Improve getQueueUserAcls API to query for specific queue and user > -- > > Key: YARN-6727 > URL: https://issues.apache.org/jira/browse/YARN-6727 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-6727.WIP.patch > > > Currently {{ApplicationClientProtocol#getQueueUserAcls}} return data for all > the queues available in scheduler for user. > User wants to know whether he has rights of a particular queue only. For > systems with 5K queues returning all queues list is not efficient. > Suggested change: support additional parameters *userName and queueName* as > optional. Admin user should be able to query other users ACL for a particular > queueName. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
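For illustration only, a rough sketch of the cache-then-{{checkAccess}} idea from the comment above. The class shape, key format, and invalidation hook below are assumptions made for the example, not a proposed patch:
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.api.records.QueueACL;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.YarnScheduler;

// Cache of previously computed ACL decisions, kept outside the scheduler.
private final ConcurrentMap<String, Boolean> aclCache = new ConcurrentHashMap<>();

boolean hasAccess(UserGroupInformation user, QueueACL acl, String queueName,
    YarnScheduler scheduler) {
  String key = user.getShortUserName() + ":" + queueName + ":" + acl;
  Boolean cached = aclCache.get(key);
  if (cached != null) {
    return cached;                      // cache hit: no scheduler locking needed
  }
  boolean allowed = scheduler.checkAccess(user, acl, queueName);
  aclCache.putIfAbsent(key, allowed);   // cache miss: consult the scheduler, then remember
  return allowed;
}

// Called when queue/ACL configuration or the user-to-group mapping is refreshed.
void invalidateAclCache() {
  aclCache.clear();
}
{code}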
[jira] [Commented] (YARN-6708) Nodemanager container crash after ext3 folder limit
[ https://issues.apache.org/jira/browse/YARN-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074546#comment-16074546 ] Naganarasimha G R commented on YARN-6708: - [~jlowe], the Findbugs warning is not related to the patch and is tracked in YARN-6515. Apart from that, I feel the work in the latest patch is fine. Shall I go ahead and merge it if you are held up with other work? > Nodemanager container crash after ext3 folder limit > --- > > Key: YARN-6708 > URL: https://issues.apache.org/jira/browse/YARN-6708 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: YARN-6708.001.patch, YARN-6708.002.patch, > YARN-6708.003.patch, YARN-6708.004.patch, YARN-6708.005.patch, > YARN-6708.006.patch > > > Configure umask as *027* for nodemanager service user > and {{yarn.nodemanager.local-cache.max-files-per-directory}} as {{40}}. After > 4 *private* dir localization next directory will be *0/14* > Local Directory cache manager > {code} > vm2:/opt/hadoop/release/data/nmlocal/usercache/mapred/filecache # l > total 28 > drwx--x--- 7 mapred hadoop 4096 Jun 10 14:35 ./ > drwxr-s--- 4 mapred hadoop 4096 Jun 10 12:07 ../ > drwxr-x--- 3 mapred users 4096 Jun 10 14:36 0/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:15 10/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:22 11/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:27 12/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:31 13/ > {code} > *drwxr-x---* 3 mapred users 4096 Jun 10 14:36 0/ is only *750* > Nodemanager user will not be able check for localization path exists or not. > {{LocalResourcesTrackerImpl}} > {code} > case REQUEST: > if (rsrc != null && (!isResourcePresent(rsrc))) { > LOG.info("Resource " + rsrc.getLocalPath() > + " is missing, localizing it again"); > removeResource(req); > rsrc = null; > } > if (null == rsrc) { > rsrc = new LocalizedResource(req, dispatcher); > localrsrc.put(req, rsrc); > } > break; > {code} > *isResourcePresent* will always return false and same resource will be > localized to {{0}} to next unique number -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-6727) Improve getQueueUserAcls API to query for specific queue and user
[ https://issues.apache.org/jira/browse/YARN-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074561#comment-16074561 ] Bibin A Chundatt edited comment on YARN-6727 at 7/5/17 10:40 AM: - Thank you [~sunilg] for the explanation. {quote} Is safer for CS as we lock with a readLock only in YarnAuthorizationProvider {quote} + a queue-level readLock {quote} Whichever user is passed with checkAccess at one point of time (from cli/api side or app submission time etc) could be cached. {quote} At submission time we could cache the QUEUE_SUBMIT right, but we need all of them... am I missing something? {quote} Cache could be invalidated in cases here a config refresh happened for queues/acls or in similar conditions. {quote} The ACL mapping also depends on the user-to-group mapping, which gets refreshed on a time interval. IIUC the refresh interval is about 5/10 min. We don't have a direct update or notifier as of now. was (Author: bibinchundatt): Thank you [~sunilg] for the explanation. {quote} Is safer for CS as we lock with a readLock only in YarnAuthorizationProvider {quote} + a queue-level readLock {quote} Whichever user is passed with checkAccess at one point of time (from cli/api side or app submission time etc) could be cached. {quote} At submission time we could cache the QUEUE_SUBMIT right, but we need all of them... am I missing something? {quote} Cache could be invalidated in cases here a config refresh happened for queues/acls or in similar conditions. {quote} The ACL mapping also depends on the user-to-group mapping, which gets refreshed on a time interval. IIUC the refresh interval is about 5/10 min. I don't think we have a direct update or notifier as of now. > Improve getQueueUserAcls API to query for specific queue and user > -- > > Key: YARN-6727 > URL: https://issues.apache.org/jira/browse/YARN-6727 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-6727.WIP.patch > > > Currently {{ApplicationClientProtocol#getQueueUserAcls}} return data for all > the queues available in scheduler for user. > User wants to know whether he has rights of a particular queue only. For > systems with 5K queues returning all queues list is not efficient. > Suggested change: support additional parameters *userName and queueName* as > optional. Admin user should be able to query other users ACL for a particular > queueName. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6727) Improve getQueueUserAcls API to query for specific queue and user
[ https://issues.apache.org/jira/browse/YARN-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074561#comment-16074561 ] Bibin A Chundatt commented on YARN-6727: Thank you [~sunilg] for the explanation. {quote} Is safer for CS as we lock with a readLock only in YarnAuthorizationProvider {quote} + a queue-level readLock {quote} Whichever user is passed with checkAccess at one point of time (from cli/api side or app submission time etc) could be cached. {quote} At submission time we could cache the QUEUE_SUBMIT right, but we need all of them... am I missing something? {quote} Cache could be invalidated in cases here a config refresh happened for queues/acls or in similar conditions. {quote} The ACL mapping also depends on the user-to-group mapping, which gets refreshed on a time interval. IIUC the refresh interval is about 5/10 min. I don't think we have a direct update or notifier as of now. > Improve getQueueUserAcls API to query for specific queue and user > -- > > Key: YARN-6727 > URL: https://issues.apache.org/jira/browse/YARN-6727 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-6727.WIP.patch > > > Currently {{ApplicationClientProtocol#getQueueUserAcls}} return data for all > the queues available in scheduler for user. > User wants to know whether he has rights of a particular queue only. For > systems with 5K queues returning all queues list is not efficient. > Suggested change: support additional parameters *userName and queueName* as > optional. Admin user should be able to query other users ACL for a particular > queueName. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6763) TestProcfsBasedProcessTree#testProcessTree fails in trunk
Bibin A Chundatt created YARN-6763: -- Summary: TestProcfsBasedProcessTree#testProcessTree fails in trunk Key: YARN-6763 URL: https://issues.apache.org/jira/browse/YARN-6763 Project: Hadoop YARN Issue Type: Test Reporter: Bibin A Chundatt Priority: Minor {code} Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.949 sec <<< FAILURE! - in org.apache.hadoop.yarn.util.TestProcfsBasedProcessTree testProcessTree(org.apache.hadoop.yarn.util.TestProcfsBasedProcessTree) Time elapsed: 7.119 sec <<< FAILURE! java.lang.AssertionError: Child process owned by init escaped process tree. at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.hadoop.yarn.util.TestProcfsBasedProcessTree.testProcessTree(TestProcfsBasedProcessTree.java:184) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue
[ https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated YARN-2113: - Attachment: (was: YARN-2113.branch-2.0020.patch) > Add cross-user preemption within CapacityScheduler's leaf-queue > --- > > Key: YARN-2113 > URL: https://issues.apache.org/jira/browse/YARN-2113 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Vinod Kumar Vavilapalli >Assignee: Sunil G > Fix For: 3.0.0-alpha4 > > Attachments: IntraQueue Preemption-Impact Analysis.pdf, > TestNoIntraQueuePreemptionIfBelowUserLimitAndDifferentPrioritiesWithExtraUsers.txt, > YARN-2113.0001.patch, YARN-2113.0002.patch, YARN-2113.0003.patch, > YARN-2113.0004.patch, YARN-2113.0005.patch, YARN-2113.0006.patch, > YARN-2113.0007.patch, YARN-2113.0008.patch, YARN-2113.0009.patch, > YARN-2113.0010.patch, YARN-2113.0011.patch, YARN-2113.0012.patch, > YARN-2113.0013.patch, YARN-2113.0014.patch, YARN-2113.0015.patch, > YARN-2113.0016.patch, YARN-2113.0017.patch, YARN-2113.0018.patch, > YARN-2113.0019.patch, YARN-2113.apply.onto.0012.ericp.patch, > YARN-2113.branch-2.0019.patch, YARN-2113.branch-2.8.0019.patch, > YARN-2113.branch-2.8.0020.patch, YARN-2113 Intra-QueuePreemption > Behavior.pdf, YARN-2113.v0.patch > > > Preemption today only works across queues and moves around resources across > queues per demand and usage. We should also have user-level preemption within > a queue, to balance capacity across users in a predictable manner. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue
[ https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated YARN-2113: - Attachment: YARN-2113.branch-2.0020.patch > Add cross-user preemption within CapacityScheduler's leaf-queue > --- > > Key: YARN-2113 > URL: https://issues.apache.org/jira/browse/YARN-2113 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Vinod Kumar Vavilapalli >Assignee: Sunil G > Fix For: 3.0.0-alpha4 > > Attachments: IntraQueue Preemption-Impact Analysis.pdf, > TestNoIntraQueuePreemptionIfBelowUserLimitAndDifferentPrioritiesWithExtraUsers.txt, > YARN-2113.0001.patch, YARN-2113.0002.patch, YARN-2113.0003.patch, > YARN-2113.0004.patch, YARN-2113.0005.patch, YARN-2113.0006.patch, > YARN-2113.0007.patch, YARN-2113.0008.patch, YARN-2113.0009.patch, > YARN-2113.0010.patch, YARN-2113.0011.patch, YARN-2113.0012.patch, > YARN-2113.0013.patch, YARN-2113.0014.patch, YARN-2113.0015.patch, > YARN-2113.0016.patch, YARN-2113.0017.patch, YARN-2113.0018.patch, > YARN-2113.0019.patch, YARN-2113.apply.onto.0012.ericp.patch, > YARN-2113.branch-2.0019.patch, YARN-2113.branch-2.0020.patch, > YARN-2113.branch-2.8.0019.patch, YARN-2113.branch-2.8.0020.patch, YARN-2113 > Intra-QueuePreemption Behavior.pdf, YARN-2113.v0.patch > > > Preemption today only works across queues and moves around resources across > queues per demand and usage. We should also have user-level preemption within > a queue, to balance capacity across users in a predictable manner. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue
[ https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074780#comment-16074780 ] Sunil G commented on YARN-2113: --- Hi [~eepayne] I think branch-2.8 patch is fine. I guess i made a mistake while mentioning patch name :) I will wait for jenkins for branch-2 patch. > Add cross-user preemption within CapacityScheduler's leaf-queue > --- > > Key: YARN-2113 > URL: https://issues.apache.org/jira/browse/YARN-2113 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Vinod Kumar Vavilapalli >Assignee: Sunil G > Fix For: 3.0.0-alpha4 > > Attachments: IntraQueue Preemption-Impact Analysis.pdf, > TestNoIntraQueuePreemptionIfBelowUserLimitAndDifferentPrioritiesWithExtraUsers.txt, > YARN-2113.0001.patch, YARN-2113.0002.patch, YARN-2113.0003.patch, > YARN-2113.0004.patch, YARN-2113.0005.patch, YARN-2113.0006.patch, > YARN-2113.0007.patch, YARN-2113.0008.patch, YARN-2113.0009.patch, > YARN-2113.0010.patch, YARN-2113.0011.patch, YARN-2113.0012.patch, > YARN-2113.0013.patch, YARN-2113.0014.patch, YARN-2113.0015.patch, > YARN-2113.0016.patch, YARN-2113.0017.patch, YARN-2113.0018.patch, > YARN-2113.0019.patch, YARN-2113.apply.onto.0012.ericp.patch, > YARN-2113.branch-2.0019.patch, YARN-2113.branch-2.0020.patch, > YARN-2113.branch-2.8.0019.patch, YARN-2113.branch-2.8.0020.patch, YARN-2113 > Intra-QueuePreemption Behavior.pdf, YARN-2113.v0.patch > > > Preemption today only works across queues and moves around resources across > queues per demand and usage. We should also have user-level preemption within > a queue, to balance capacity across users in a predictable manner. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue
[ https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074977#comment-16074977 ] Hadoop QA commented on YARN-2113: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 42s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 8s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 20s{color} | {color:green} branch-2 passed with JDK v1.8.0_131 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 14s{color} | {color:green} branch-2 passed with JDK v1.7.0_131 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 22s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} branch-2 passed with JDK v1.8.0_131 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} branch-2 passed with JDK v1.7.0_131 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 49s{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 49s{color} | {color:red} hadoop-yarn-project_hadoop-yarn-jdk1.8.0_131 with JDK v1.8.0_131 generated 1 new + 134 unchanged - 1 fixed = 135 total (was 135) {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 13s{color} | {color:green} the patch passed with JDK v1.7.0_131 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 13s{color} | {color:red} hadoop-yarn-project_hadoop-yarn-jdk1.7.0_131 with JDK v1.7.0_131 generated 1 new + 143 unchanged - 1 fixed = 144 total (was 144) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 41s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 24 new + 195 unchanged - 1 fixed = 219 total (was 196) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} the patch passed with JDK v1.7.0_131 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 22s{color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_131. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 44m 2s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_131. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}131m 39s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_131 Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation | | JDK v1.7.0_131 Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRe
[jira] [Commented] (YARN-6742) Minor mistakes in "The YARN Service Registry" docs
[ https://issues.apache.org/jira/browse/YARN-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075104#comment-16075104 ] Shane Kumpf commented on YARN-6742: --- Thanks for taking the time to review the entire document, [~Cyl]! One minor comment. The following still reads a bit strange to me. I think removing "which" would help. Could you take a quick look? {code}A service instance is running only if the component instances which for the service are running.{code} > Minor mistakes in "The YARN Service Registry" docs > -- > > Key: YARN-6742 > URL: https://issues.apache.org/jira/browse/YARN-6742 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 3.0.0-alpha3 >Reporter: Yeliang Cang >Assignee: Yeliang Cang >Priority: Trivial > Attachments: YARN-6742-001.patch, YARN-6742-002.patch > > > There are minor mistakes in The YARN Service Registry docs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6708) Nodemanager container crash after ext3 folder limit
[ https://issues.apache.org/jira/browse/YARN-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075202#comment-16075202 ] Jason Lowe commented on YARN-6708: -- Sorry for the delay, was out of the office for a bit. Please hold off on committing this, as I plan on reviewing the latest patch later today. > Nodemanager container crash after ext3 folder limit > --- > > Key: YARN-6708 > URL: https://issues.apache.org/jira/browse/YARN-6708 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: YARN-6708.001.patch, YARN-6708.002.patch, > YARN-6708.003.patch, YARN-6708.004.patch, YARN-6708.005.patch, > YARN-6708.006.patch > > > Configure umask as *027* for nodemanager service user > and {{yarn.nodemanager.local-cache.max-files-per-directory}} as {{40}}. After > 4 *private* dir localization next directory will be *0/14* > Local Directory cache manager > {code} > vm2:/opt/hadoop/release/data/nmlocal/usercache/mapred/filecache # l > total 28 > drwx--x--- 7 mapred hadoop 4096 Jun 10 14:35 ./ > drwxr-s--- 4 mapred hadoop 4096 Jun 10 12:07 ../ > drwxr-x--- 3 mapred users 4096 Jun 10 14:36 0/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:15 10/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:22 11/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:27 12/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:31 13/ > {code} > *drwxr-x---* 3 mapred users 4096 Jun 10 14:36 0/ is only *750* > Nodemanager user will not be able check for localization path exists or not. > {{LocalResourcesTrackerImpl}} > {code} > case REQUEST: > if (rsrc != null && (!isResourcePresent(rsrc))) { > LOG.info("Resource " + rsrc.getLocalPath() > + " is missing, localizing it again"); > removeResource(req); > rsrc = null; > } > if (null == rsrc) { > rsrc = new LocalizedResource(req, dispatcher); > localrsrc.put(req, rsrc); > } > break; > {code} > *isResourcePresent* will always return false and same resource will be > localized to {{0}} to next unique number -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6764) Simplify the logic in FairScheduler#attemptScheduling
Yufei Gu created YARN-6764: -- Summary: Simplify the logic in FairScheduler#attemptScheduling Key: YARN-6764 URL: https://issues.apache.org/jira/browse/YARN-6764 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Affects Versions: 3.0.0-alpha3, 2.8.1 Reporter: Yufei Gu Assignee: Yufei Gu -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6764) Simplify the logic in FairScheduler#attemptScheduling
[ https://issues.apache.org/jira/browse/YARN-6764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-6764: --- Priority: Trivial (was: Major) > Simplify the logic in FairScheduler#attemptScheduling > - > > Key: YARN-6764 > URL: https://issues.apache.org/jira/browse/YARN-6764 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.1, 3.0.0-alpha3 >Reporter: Yufei Gu >Assignee: Yufei Gu >Priority: Trivial > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6764) Simplify the logic in FairScheduler#attemptScheduling
[ https://issues.apache.org/jira/browse/YARN-6764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-6764: --- Attachment: YARN-6764.001.patch > Simplify the logic in FairScheduler#attemptScheduling > - > > Key: YARN-6764 > URL: https://issues.apache.org/jira/browse/YARN-6764 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.1, 3.0.0-alpha3 >Reporter: Yufei Gu >Assignee: Yufei Gu >Priority: Trivial > Attachments: YARN-6764.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6759) TestRMRestart.testRMRestartWaitForPreviousAMToFinish is failing in trunk
[ https://issues.apache.org/jira/browse/YARN-6759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075279#comment-16075279 ] Eric Payne commented on YARN-6759: -- Please note that this is not just failing in trunk. It fails for me in branch-2 as well: {noformat} java.lang.IllegalArgumentException: Total wait time should be greater than check interval time at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) at org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:311) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:613) {noformat} > TestRMRestart.testRMRestartWaitForPreviousAMToFinish is failing in trunk > > > Key: YARN-6759 > URL: https://issues.apache.org/jira/browse/YARN-6759 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Naganarasimha G R > > {code} > java.lang.IllegalArgumentException: Total wait time should be greater than > check interval time > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at > org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:273) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:618) > {code} > refer > https://builds.apache.org/job/PreCommit-YARN-Build/16229/testReport/org.apache.hadoop.yarn.server.resourcemanager/TestRMRestart/testRMRestartWaitForPreviousAMToFinish/ > which ran for YARN-2919 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6752) Display reserved resources in web UI per application
[ https://issues.apache.org/jira/browse/YARN-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075299#comment-16075299 ] Daniel Templeton commented on YARN-6752: Thanks for the patch, [~ayousufi]. Looks like you have some indentation issues in the {{FairSchedulerAppsBlock}} constructor. Otherwise, looks good to me. > Display reserved resources in web UI per application > > > Key: YARN-6752 > URL: https://issues.apache.org/jira/browse/YARN-6752 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Attachments: YARN-6752.001.patch, YARN-6752.002.patch > > > Show the number of reserved memory and vcores for each application -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6757) Refactor the usage of yarn.nodemanager.linux-container-executor.cgroups.mount-path
[ https://issues.apache.org/jira/browse/YARN-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-6757: --- Description: We should add the ability to specify a custom cgroup path. This is how the documentation of {{linux-container-executor.cgroups.mount-path}} would look like: {noformat} Requested cgroup mount path. Yarn has built in functionality to discover the system cgroup mount paths, so use this setting only, if the discovery does not work. This path must exist before the NodeManager is launched. The location can vary depending on the Linux distribution in use. Common locations include /sys/fs/cgroup and /cgroup. If cgroups are not mounted, set yarn.nodemanager.linux-container-executor.cgroups.mount to true. In this case it specifies, where the LCE should attempt to mount cgroups if not found. If cgroups is accessible through lxcfs or some other file system, then set this path and yarn.nodemanager.linux-container-executor.cgroups.mount to false. Yarn tries to use this path first, before any cgroup mount point discovery. If it cannot find this directory, it falls back to searching for cgroup mount points in the system. Only used when the LCE resources handler is set to the CgroupsLCEResourcesHandler {noformat} was: We should add the ability to specify a custom cgroup path. This is how the documentation of {{linux-container-executor.cgroups.mount-path}} would look like: {code} Requested cgroup mount path. Yarn has built in functionality to discover the system cgroup mount paths, so use this setting only, if the discovery does not work. This path must exist before the NodeManager is launched. The location can vary depending on the Linux distribution in use. Common locations include /sys/fs/cgroup and /cgroup. If cgroups are not mounted, set yarn.nodemanager.linux-container-executor.cgroups.mount to true. In this case it specifies, where the LCE should attempt to mount cgroups if not found. If cgroups is accessible through lxcfs or some other file system, then set this path and yarn.nodemanager.linux-container-executor.cgroups.mount to false. Yarn tries to use this path first, before any cgroup mount point discovery. If it cannot find this directory, it falls back to searching for cgroup mount points in the system. Only used when the LCE resources handler is set to the CgroupsLCEResourcesHandler {code} > Refactor the usage of > yarn.nodemanager.linux-container-executor.cgroups.mount-path > -- > > Key: YARN-6757 > URL: https://issues.apache.org/jira/browse/YARN-6757 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0-alpha4 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Attachments: YARN-6757.000.patch > > > We should add the ability to specify a custom cgroup path. This is how the > documentation of {{linux-container-executor.cgroups.mount-path}} would look > like: > {noformat} > Requested cgroup mount path. Yarn has built in functionality to discover > the system cgroup mount paths, so use this setting only, if the discovery > does not work. > This path must exist before the NodeManager is launched. > The location can vary depending on the Linux distribution in use. > Common locations include /sys/fs/cgroup and /cgroup. > If cgroups are not mounted, set > yarn.nodemanager.linux-container-executor.cgroups.mount > to true. In this case it specifies, where the LCE should attempt to mount > cgroups if not found. 
> If cgroups is accessible through lxcfs or some other file system, > then set this path and > yarn.nodemanager.linux-container-executor.cgroups.mount to false. > Yarn tries to use this path first, before any cgroup mount point > discovery. > If it cannot find this directory, it falls back to searching for cgroup > mount points in the system. > Only used when the LCE resources handler is set to the > CgroupsLCEResourcesHandler > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
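For illustration, a minimal yarn-site.xml fragment matching the description above, for the case where cgroups are already mounted (for example via lxcfs) and the LCE should not attempt to mount them; the path shown is an example, not a default:
{code}
<!-- Example only: point YARN at a pre-mounted cgroup tree and disable LCE mounting. -->
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.mount-path</name>
  <value>/sys/fs/cgroup</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.mount</name>
  <value>false</value>
</property>
{code}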
[jira] [Commented] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue
[ https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075317#comment-16075317 ] Eric Payne commented on YARN-2113: -- The TestRMRestart test failure looks like it's the same as YARN-6759. The other test failures don't happen for me locally. > Add cross-user preemption within CapacityScheduler's leaf-queue > --- > > Key: YARN-2113 > URL: https://issues.apache.org/jira/browse/YARN-2113 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Vinod Kumar Vavilapalli >Assignee: Sunil G > Fix For: 3.0.0-alpha4 > > Attachments: IntraQueue Preemption-Impact Analysis.pdf, > TestNoIntraQueuePreemptionIfBelowUserLimitAndDifferentPrioritiesWithExtraUsers.txt, > YARN-2113.0001.patch, YARN-2113.0002.patch, YARN-2113.0003.patch, > YARN-2113.0004.patch, YARN-2113.0005.patch, YARN-2113.0006.patch, > YARN-2113.0007.patch, YARN-2113.0008.patch, YARN-2113.0009.patch, > YARN-2113.0010.patch, YARN-2113.0011.patch, YARN-2113.0012.patch, > YARN-2113.0013.patch, YARN-2113.0014.patch, YARN-2113.0015.patch, > YARN-2113.0016.patch, YARN-2113.0017.patch, YARN-2113.0018.patch, > YARN-2113.0019.patch, YARN-2113.apply.onto.0012.ericp.patch, > YARN-2113.branch-2.0019.patch, YARN-2113.branch-2.0020.patch, > YARN-2113.branch-2.8.0019.patch, YARN-2113.branch-2.8.0020.patch, YARN-2113 > Intra-QueuePreemption Behavior.pdf, YARN-2113.v0.patch > > > Preemption today only works across queues and moves around resources across > queues per demand and usage. We should also have user-level preemption within > a queue, to balance capacity across users in a predictable manner. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6757) Refactor the usage of yarn.nodemanager.linux-container-executor.cgroups.mount-path
[ https://issues.apache.org/jira/browse/YARN-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-6757: --- Description: We should add the ability to specify a custom cgroup path. This is how the documentation of {{linux-container-executor.cgroups.mount-path}} would look like: {code} Requested cgroup mount path. Yarn has built in functionality to discover the system cgroup mount paths, so use this setting only, if the discovery does not work. This path must exist before the NodeManager is launched. The location can vary depending on the Linux distribution in use. Common locations include /sys/fs/cgroup and /cgroup. If cgroups are not mounted, set yarn.nodemanager.linux-container-executor.cgroups.mount to true. In this case it specifies, where the LCE should attempt to mount cgroups if not found. If cgroups is accessible through lxcfs or some other file system, then set this path and yarn.nodemanager.linux-container-executor.cgroups.mount to false. Yarn tries to use this path first, before any cgroup mount point discovery. If it cannot find this directory, it falls back to searching for cgroup mount points in the system. Only used when the LCE resources handler is set to the CgroupsLCEResourcesHandler {code} was: We should add the ability to specify a custom cgroup path. This is how the documentation of inux-container-executor.cgroups.mount-path would look like: {code} Requested cgroup mount path. Yarn has built in functionality to discover the system cgroup mount paths, so use this setting only, if the discovery does not work. This path must exist before the NodeManager is launched. The location can vary depending on the Linux distribution in use. Common locations include /sys/fs/cgroup and /cgroup. If cgroups are not mounted, set yarn.nodemanager.linux-container-executor.cgroups.mount to true. In this case it specifies, where the LCE should attempt to mount cgroups if not found. If cgroups is accessible through lxcfs or some other file system, then set this path and yarn.nodemanager.linux-container-executor.cgroups.mount to false. Yarn tries to use this path first, before any cgroup mount point discovery. If it cannot find this directory, it falls back to searching for cgroup mount points in the system. Only used when the LCE resources handler is set to the CgroupsLCEResourcesHandler {code} > Refactor the usage of > yarn.nodemanager.linux-container-executor.cgroups.mount-path > -- > > Key: YARN-6757 > URL: https://issues.apache.org/jira/browse/YARN-6757 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0-alpha4 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Attachments: YARN-6757.000.patch > > > We should add the ability to specify a custom cgroup path. This is how the > documentation of {{linux-container-executor.cgroups.mount-path}} would look > like: > {code} > Requested cgroup mount path. Yarn has built in functionality to discover > the system cgroup mount paths, so use this setting only, if the discovery > does not work. > This path must exist before the NodeManager is launched. > The location can vary depending on the Linux distribution in use. > Common locations include /sys/fs/cgroup and /cgroup. > If cgroups are not mounted, set > yarn.nodemanager.linux-container-executor.cgroups.mount > to true. In this case it specifies, where the LCE should attempt to mount > cgroups if not found. 
> If cgroups is accessible through lxcfs or some other file system, > then set this path and > yarn.nodemanager.linux-container-executor.cgroups.mount to false. > Yarn tries to use this path first, before any cgroup mount point > discovery. > If it cannot find this directory, it falls back to searching for cgroup > mount points in the system. > Only used when the LCE resources handler is set to the > CgroupsLCEResourcesHandler > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6752) Display reserved resources in web UI per application
[ https://issues.apache.org/jira/browse/YARN-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated YARN-6752: --- Attachment: YARN-6752.003.patch > Display reserved resources in web UI per application > > > Key: YARN-6752 > URL: https://issues.apache.org/jira/browse/YARN-6752 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Attachments: YARN-6752.001.patch, YARN-6752.002.patch, > YARN-6752.003.patch > > > Show the number of reserved memory and vcores for each application -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6761) Fix build for YARN-3926 branch
[ https://issues.apache.org/jira/browse/YARN-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075418#comment-16075418 ] Wangda Tan commented on YARN-6761: -- Thanks Varun for the patch. I think we should keep using {{SimpleResource}} for {{#resource-type}} > 2 as well. Currently the patch initializes a regular Resource object when {{#resource-type}} > 2. {code} if (tmpResource.getResources().size() > 2) { Resource ret = Records.newRecord(Resource.class); ret.setMemorySize(memory); ret.setVirtualCores(vCores); return ret; } return new SimpleResource(memory, vCores); {code} To achieve this, I suggest removing: {code} private long memory; private long vcores; {code} from {{SimpleResource}}, and adding: {code} private ResourceInformation memoryResInfo; private ResourceInformation vcoresResInfo; {code} Then initialize a map when creating the resource (don't do lazy initialization in {{getResources}}). In addition, instead of initializing a HashMap, I think we can use an ImmutableMap for better memory efficiency, and only create a HashMap when a 3rd resource is added. > Fix build for YARN-3926 branch > -- > > Key: YARN-6761 > URL: https://issues.apache.org/jira/browse/YARN-6761 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-6761-YARN-3926.001.patch > > > After rebasing to trunk, due to the addition of YARN-6679, compilation of the > YARN-3926 branch is broken. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
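A hedged sketch of the direction suggested in the comment above, only to make the shape concrete; the {{ResourceInformation}} factory call, the resource-name keys, and the assumption that {{getResources()}} exposes a name-keyed map on this branch are illustrative, not the actual YARN-3926 branch code:
{code}
import java.util.Map;
import com.google.common.collect.ImmutableMap;

// Inside SimpleResource: keep ResourceInformation references instead of raw longs.
private final ResourceInformation memoryResInfo;
private final ResourceInformation vcoresResInfo;
private Map<String, ResourceInformation> resources;

SimpleResource(long memory, long vcores) {
  // Factory method and key names below are assumptions for illustration only.
  this.memoryResInfo = ResourceInformation.newInstance("memory-mb", memory);
  this.vcoresResInfo = ResourceInformation.newInstance("vcores", vcores);
  // Build the map eagerly (no lazy initialization in getResources()) and keep it
  // immutable while only the two mandatory resources are present; switch to a
  // HashMap only when a third resource type is added.
  this.resources = ImmutableMap.of(
      "memory-mb", memoryResInfo,
      "vcores", vcoresResInfo);
}
{code}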
[jira] [Commented] (YARN-6752) Display reserved resources in web UI per application
[ https://issues.apache.org/jira/browse/YARN-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075420#comment-16075420 ] Daniel Templeton commented on YARN-6752: Thanks. Latest patch looks good. I just tested the patch out, and I can see that the table width is starting to get a little silly. Probably overkill for this patch, but any clever thoughts on how to add the info without invoking the dreaded horizontal scroll bar? > Display reserved resources in web UI per application > > > Key: YARN-6752 > URL: https://issues.apache.org/jira/browse/YARN-6752 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Attachments: YARN-6752.001.patch, YARN-6752.002.patch, > YARN-6752.003.patch > > > Show the number of reserved memory and vcores for each application -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6726) Fix issues with docker commands in container-executor
[ https://issues.apache.org/jira/browse/YARN-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Kumpf updated YARN-6726: -- Summary: Fix issues with docker commands in container-executor (was: Support additional docker commands in container-executor) > Fix issues with docker commands in container-executor > - > > Key: YARN-6726 > URL: https://issues.apache.org/jira/browse/YARN-6726 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Shane Kumpf >Assignee: Shane Kumpf > > docker inspect, rm, stop, etc are issued through container-executor. Commands > other than docker run are not functioning properly. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6726) Fix issues with docker commands executed by container-executor
[ https://issues.apache.org/jira/browse/YARN-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Kumpf updated YARN-6726: -- Summary: Fix issues with docker commands executed by container-executor (was: Fix issues with docker commands in container-executor) > Fix issues with docker commands executed by container-executor > -- > > Key: YARN-6726 > URL: https://issues.apache.org/jira/browse/YARN-6726 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Shane Kumpf >Assignee: Shane Kumpf > > docker inspect, rm, stop, etc are issued through container-executor. Commands > other than docker run are not functioning properly. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6752) Display reserved resources in web UI per application
[ https://issues.apache.org/jira/browse/YARN-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075438#comment-16075438 ] Abdullah Yousufi commented on YARN-6752: Definitely, perhaps we could combine the vcores and memory into one column, like resources are shown in other places > Display reserved resources in web UI per application > > > Key: YARN-6752 > URL: https://issues.apache.org/jira/browse/YARN-6752 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Attachments: YARN-6752.001.patch, YARN-6752.002.patch, > YARN-6752.003.patch > > > Show the number of reserved memory and vcores for each application -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6720) Support updating FPGA related constraint node label after FPGA device re-configuration
[ https://issues.apache.org/jira/browse/YARN-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075443#comment-16075443 ] Wangda Tan commented on YARN-6720: -- [~tangzhankun]/[~zyluo], bq. YARN-3409 Wouldn't be a blocker since this JIRA is a improvement of YARN-6507. I'm not sure how to support device meta info in global (RM) scheduler without YARN-3409, I couldn't find the answer from attached design doc. Could you explain what is the solution in your mind? Anyway I'm in favor of using a general approach which can be utilized by other features instead of customize RM scheduler to support FPGA requirements. GPU support is more sensitive to GPU type instead of firmware, but I can see docker support can be improved a lot if we can schedule containers to a node which already has localized docker image. > Support updating FPGA related constraint node label after FPGA device > re-configuration > -- > > Key: YARN-6720 > URL: https://issues.apache.org/jira/browse/YARN-6720 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Zhankun Tang > Attachments: > Storing-and-Updating-extra-FPGA-resource-attributes-in-hdfs_v1.pdf > > > In order to provide a global optimal scheduling for mutable FPGA resource, it > seems an easy and direct way to utilize constraint node labels(YARN-3409) > instead of extending the global scheduler(YARN-3926) to match both resource > count and attributes. > The rough idea is that the AM sets the constraint node label expression to > request containers on the nodes whose FPGA devices has the matching IP, and > then NM resource handler update the node constraint label if there's FPGA > device re-configuration. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6708) Nodemanager container crash after ext3 folder limit
[ https://issues.apache.org/jira/browse/YARN-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075457#comment-16075457 ] Jason Lowe commented on YARN-6708: -- Thanks for updating the patch! We're almost there, just some cleanup needed in the unit test. A prior review comment was missed in the patch update: bq. There be an After method that deletes basedir so we don't leave cruft on the filesystem if a unit test fails. On a related note, the unit test should be using {{basedir}} rather than making up its own path under {{target}} to benefit from that cleanup. Otherwise the unit test is leaving cruft around on the filesystem after it runs. Also the unit test is passing for me even without the code change. It will only fail if the umask of the user running the test is more restrictive than 022 which is a typical default. One way to work around that is to explicitly create one of the parent directories with the wrong permissions first, e.g.: filecache/0 with permissions 0700. Then we can call the localizer and verify the permissions were fixed afterwards. > Nodemanager container crash after ext3 folder limit > --- > > Key: YARN-6708 > URL: https://issues.apache.org/jira/browse/YARN-6708 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: YARN-6708.001.patch, YARN-6708.002.patch, > YARN-6708.003.patch, YARN-6708.004.patch, YARN-6708.005.patch, > YARN-6708.006.patch > > > Configure umask as *027* for nodemanager service user > and {{yarn.nodemanager.local-cache.max-files-per-directory}} as {{40}}. After > 4 *private* dir localization next directory will be *0/14* > Local Directory cache manager > {code} > vm2:/opt/hadoop/release/data/nmlocal/usercache/mapred/filecache # l > total 28 > drwx--x--- 7 mapred hadoop 4096 Jun 10 14:35 ./ > drwxr-s--- 4 mapred hadoop 4096 Jun 10 12:07 ../ > drwxr-x--- 3 mapred users 4096 Jun 10 14:36 0/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:15 10/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:22 11/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:27 12/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:31 13/ > {code} > *drwxr-x---* 3 mapred users 4096 Jun 10 14:36 0/ is only *750* > Nodemanager user will not be able check for localization path exists or not. > {{LocalResourcesTrackerImpl}} > {code} > case REQUEST: > if (rsrc != null && (!isResourcePresent(rsrc))) { > LOG.info("Resource " + rsrc.getLocalPath() > + " is missing, localizing it again"); > removeResource(req); > rsrc = null; > } > if (null == rsrc) { > rsrc = new LocalizedResource(req, dispatcher); > localrsrc.put(req, rsrc); > } > break; > {code} > *isResourcePresent* will always return false and same resource will be > localized to {{0}} to next unique number -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
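For illustration, a rough sketch of the suggested test setup; {{basedir}}, the directory name, and the localizer invocation are placeholders taken from the review comment above, not the actual test code:
{code}
import java.io.File;
import java.nio.file.Files;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;
import org.junit.Assert;

// Pre-create a parent directory with permissions a restrictive umask would produce.
File filecacheZero = new File(basedir, "filecache/0");
Assert.assertTrue(filecacheZero.mkdirs());
Files.setPosixFilePermissions(filecacheZero.toPath(),
    PosixFilePermissions.fromString("rwx------"));   // 0700

// ... run the localizer against this local directory here ...

// Afterwards the parent directory should have been made traversable again.
Set<PosixFilePermission> perms =
    Files.getPosixFilePermissions(filecacheZero.toPath());
Assert.assertTrue("parent dir should be group/other executable after localization",
    perms.contains(PosixFilePermission.GROUP_EXECUTE)
        && perms.contains(PosixFilePermission.OTHERS_EXECUTE));
{code}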
[jira] [Commented] (YARN-6757) Refactor the usage of yarn.nodemanager.linux-container-executor.cgroups.mount-path
[ https://issues.apache.org/jira/browse/YARN-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075459#comment-16075459 ] Daniel Templeton commented on YARN-6757: Thanks for the patch, [~miklos.szeg...@cloudera.com]. A few comments: # The javadoc description for {{CGroupsHandler.getValidCGroups()}} should end with a period. # {{CGroupsHandler.getValidCGroups()}} seems altogether unnecessary. It's really just doing {{new HashSet<>(Arrays.asList(CGroupsController.values()))}} long hand. Maybe even better would be to use an {{EnumSet}} instead of a {{HashSet}}. # In {{checkConfiguredCGroupPath()}}, {{cGroupMountPathSpecified}} is a little long-winded. {{cGroupMountPath}} is plenty. # {{checkConfiguredCGroupPath()}} isn't a great name. Maybe {{loadConfiguredCGroupPath()}} or {{parseConfiguredCGroupPath()}}? # Why does {{CgroupsLCEResourcesHandler}} have an identical copy of {{checkConfiguredCGroupPath()}}? Seems like there should be some code sharing... # There's now a dead import in {{CgroupsLCEResourcesHandler}}. # What's with the mount path containing directories that contain files whose names are comma-separated lists of cgroups? Is that a normal cgroups thing? Sounds weird to me... # I love that the text in {{yarn-defaults.xml}} is detailed and prescriptive, but it needs some work. There are grammar issues, but first let's tackle the clarity issues. I don't know what it's trying to say. I get that the property sets a path that we'll use to resolve cgroups under certain circumstances, but it's not clear what those circumstances are and what will happen when this property is set. It also says nothing about the file names that are comma-separated lists of cgroups. If it helps, I'm happy to have an offline conversation and help craft the docs. > Refactor the usage of > yarn.nodemanager.linux-container-executor.cgroups.mount-path > -- > > Key: YARN-6757 > URL: https://issues.apache.org/jira/browse/YARN-6757 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0-alpha4 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Attachments: YARN-6757.000.patch > > > We should add the ability to specify a custom cgroup path. This is how the > documentation of {{linux-container-executor.cgroups.mount-path}} would look > like: > {noformat} > Requested cgroup mount path. Yarn has built in functionality to discover > the system cgroup mount paths, so use this setting only, if the discovery > does not work. > This path must exist before the NodeManager is launched. > The location can vary depending on the Linux distribution in use. > Common locations include /sys/fs/cgroup and /cgroup. > If cgroups are not mounted, set > yarn.nodemanager.linux-container-executor.cgroups.mount > to true. In this case it specifies, where the LCE should attempt to mount > cgroups if not found. > If cgroups is accessible through lxcfs or some other file system, > then set this path and > yarn.nodemanager.linux-container-executor.cgroups.mount to false. > Yarn tries to use this path first, before any cgroup mount point > discovery. > If it cannot find this directory, it falls back to searching for cgroup > mount points in the system. > Only used when the LCE resources handler is set to the > CgroupsLCEResourcesHandler > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
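On the {{EnumSet}} point above, the long-hand form and the suggested equivalent look roughly like this (enum and variable names as referenced in the review, not necessarily the exact declarations in the patch):
{code}
// Long-hand form:
Set<CGroupsController> validCGroups =
    new HashSet<>(Arrays.asList(CGroupsController.values()));

// Equivalent using EnumSet, as suggested:
Set<CGroupsController> validCGroups = EnumSet.allOf(CGroupsController.class);
{code}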
[jira] [Created] (YARN-6765) CGroupsHandlerImpl.initializeControllerPaths() should include cause when chaining exceptions
Daniel Templeton created YARN-6765: -- Summary: CGroupsHandlerImpl.initializeControllerPaths() should include cause when chaining exceptions Key: YARN-6765 URL: https://issues.apache.org/jira/browse/YARN-6765 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0-alpha3, 2.8.1 Reporter: Daniel Templeton Priority: Minor This: {code} throw new ResourceHandlerException( "Failed to initialize controller paths!");{code} should be this: {code} throw new ResourceHandlerException( "Failed to initialize controller paths!", e);{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2919) Potential race between renew and cancel in DelegationTokenRenwer
[ https://issues.apache.org/jira/browse/YARN-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075512#comment-16075512 ] Junping Du commented on YARN-2919: -- Latest patch LGTM. +1. Will commit it tomorrow if no further comments from others. > Potential race between renew and cancel in DelegationTokenRenwer > - > > Key: YARN-2919 > URL: https://issues.apache.org/jira/browse/YARN-2919 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 2.6.0 >Reporter: Karthik Kambatla >Assignee: Naganarasimha G R >Priority: Critical > Attachments: YARN-2919.002.patch, YARN-2919.003.patch, > YARN-2919.004.patch, YARN-2919.005.patch, YARN-2919.20141209-1.patch > > > YARN-2874 fixes a deadlock in DelegationTokenRenewer, but there is still a > race because of which a renewal in flight isn't interrupted by a cancel. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6764) Simplify the logic in FairScheduler#attemptScheduling
[ https://issues.apache.org/jira/browse/YARN-6764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075550#comment-16075550 ] Hadoop QA commented on YARN-6764: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 1s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 43m 23s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 66m 29s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6764 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12875806/YARN-6764.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux ac2eff1f5597 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 946dd25 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/16303/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/16303/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16303/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Simplify the logic in FairScheduler#attemptScheduling > - > > Key: YARN-6764 > URL: https://issues.apache.org/jira/browse/YARN-6764 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8
[jira] [Commented] (YARN-6757) Refactor the usage of yarn.nodemanager.linux-container-executor.cgroups.mount-path
[ https://issues.apache.org/jira/browse/YARN-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075708#comment-16075708 ] Yufei Gu commented on YARN-6757: Thanks for the review, [~templedf]. I will take this up in an offline discussion with [~miklos.szeg...@cloudera.com]. Will post the new patch soon. > Refactor the usage of > yarn.nodemanager.linux-container-executor.cgroups.mount-path > -- > > Key: YARN-6757 > URL: https://issues.apache.org/jira/browse/YARN-6757 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0-alpha4 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Attachments: YARN-6757.000.patch > > > We should add the ability to specify a custom cgroup path. This is how the > documentation of {{linux-container-executor.cgroups.mount-path}} would look > like: > {noformat} > Requested cgroup mount path. Yarn has built in functionality to discover > the system cgroup mount paths, so use this setting only, if the discovery > does not work. > This path must exist before the NodeManager is launched. > The location can vary depending on the Linux distribution in use. > Common locations include /sys/fs/cgroup and /cgroup. > If cgroups are not mounted, set > yarn.nodemanager.linux-container-executor.cgroups.mount > to true. In this case it specifies, where the LCE should attempt to mount > cgroups if not found. > If cgroups is accessible through lxcfs or some other file system, > then set this path and > yarn.nodemanager.linux-container-executor.cgroups.mount to false. > Yarn tries to use this path first, before any cgroup mount point > discovery. > If it cannot find this directory, it falls back to searching for cgroup > mount points in the system. > Only used when the LCE resources handler is set to the > CgroupsLCEResourcesHandler > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6355) Preprocessor framework for AM and Client interactions with the RM
[ https://issues.apache.org/jira/browse/YARN-6355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6355: -- Attachment: YARN-6355.003.patch Updated patch based on suggestions from [~subru] > Preprocessor framework for AM and Client interactions with the RM > - > > Key: YARN-6355 > URL: https://issues.apache.org/jira/browse/YARN-6355 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Arun Suresh > Labels: amrmproxy, resourcemanager > Attachments: YARN-6355.001.patch, YARN-6355.002.patch, > YARN-6355.003.patch, YARN-6355-one-pager.pdf > > > Currently on the NM, we have the {{AMRMProxy}} framework to intercept the AM > <-> RM communication and enforce policies. This is used both by YARN > federation (YARN-2915) as well as Distributed Scheduling (YARN-2877). > This JIRA proposes to introduce a similar framework on the RM side, so > that pluggable policies can be enforced on ApplicationMasterService centrally > as well. > This would be similar in spirit to a Java Servlet Filter Chain, where the > order of the interceptors can be declared externally. > One possible use case would be: > the {{OpportunisticContainerAllocatorAMService}} is implemented as a wrapper > over the {{ApplicationMasterService}}. It would probably be better to > implement it as an Interceptor. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
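To make the servlet-filter analogy in the description above concrete, here is a rough, self-contained sketch of an interceptor chain. The interface and method names are invented for illustration only and are not the API proposed in the attached patches:
{code}
import java.util.Arrays;
import java.util.List;

public class InterceptorChainSketch {
  // Each interceptor sees the request, may enforce a policy, then delegates onward.
  interface AMSInterceptor {
    String allocate(String request, Chain chain);
  }

  // The chain hands the request to the next interceptor, or to the terminal handler.
  static class Chain {
    private final List<AMSInterceptor> interceptors;
    private final int index;

    Chain(List<AMSInterceptor> interceptors, int index) {
      this.interceptors = interceptors;
      this.index = index;
    }

    String proceed(String request) {
      if (index == interceptors.size()) {
        return "allocated(" + request + ")";  // stand-in for the real scheduler call
      }
      return interceptors.get(index).allocate(request, new Chain(interceptors, index + 1));
    }
  }

  public static void main(String[] args) {
    AMSInterceptor logging = (req, chain) -> {
      System.out.println("before: " + req);   // policy hook before delegation
      return chain.proceed(req);
    };
    AMSInterceptor tagging = (req, chain) -> chain.proceed(req + "+tag");
    // The order of interceptors is declared here, mirroring external configuration.
    Chain chain = new Chain(Arrays.asList(logging, tagging), 0);
    System.out.println(chain.proceed("allocateRequest"));
  }
}
{code}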
[jira] [Commented] (YARN-6742) Minor mistakes in "The YARN Service Registry" docs
[ https://issues.apache.org/jira/browse/YARN-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075753#comment-16075753 ] Yeliang Cang commented on YARN-6742: Yes, I think you are right! Thanks, [~shaneku...@gmail.com]; I have submitted a new patch! > Minor mistakes in "The YARN Service Registry" docs > -- > > Key: YARN-6742 > URL: https://issues.apache.org/jira/browse/YARN-6742 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 3.0.0-alpha3 >Reporter: Yeliang Cang >Assignee: Yeliang Cang >Priority: Trivial > Attachments: YARN-6742-001.patch, YARN-6742-002.patch > > > There are minor mistakes in The YARN Service Registry docs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6742) Minor mistakes in "The YARN Service Registry" docs
[ https://issues.apache.org/jira/browse/YARN-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yeliang Cang updated YARN-6742: --- Attachment: YARN-6742-003.patch > Minor mistakes in "The YARN Service Registry" docs > -- > > Key: YARN-6742 > URL: https://issues.apache.org/jira/browse/YARN-6742 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 3.0.0-alpha3 >Reporter: Yeliang Cang >Assignee: Yeliang Cang >Priority: Trivial > Attachments: YARN-6742-001.patch, YARN-6742-002.patch, > YARN-6742-003.patch > > > There are minor mistakes in The YARN Service Registry docs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6742) Minor mistakes in "The YARN Service Registry" docs
[ https://issues.apache.org/jira/browse/YARN-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075782#comment-16075782 ] Hadoop QA commented on YARN-6742: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 21m 15s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6742 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12875841/YARN-6742-003.patch | | Optional Tests | asflicense mvnsite | | uname | Linux 877efeac25b2 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 946dd25 | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16306/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Minor mistakes in "The YARN Service Registry" docs > -- > > Key: YARN-6742 > URL: https://issues.apache.org/jira/browse/YARN-6742 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 3.0.0-alpha3 >Reporter: Yeliang Cang >Assignee: Yeliang Cang >Priority: Trivial > Attachments: YARN-6742-001.patch, YARN-6742-002.patch, > YARN-6742-003.patch > > > There are minor mistakes in The YARN Service Registry docs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6355) Preprocessor framework for AM and Client interactions with the RM
[ https://issues.apache.org/jira/browse/YARN-6355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075807#comment-16075807 ] Hadoop QA commented on YARN-6355: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 12s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 55s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 53s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 20 new + 237 unchanged - 15 fixed = 257 total (was 252) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 15s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 34s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 4s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 37s{color} | {color:red} The patch generated 2 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 69m 25s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | | Should org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService$AMSProcessorChain be a _static_ inner class? At ApplicationMasterService.java:inner class? At ApplicationMasterService.java:[lines 84-113] | | | Non-virtual method call in new org.apache.hadoop.yarn.server.resourcemanager.FinalAMSProcessor() passes null for non-null parameter of new AMSProcessor(ApplicationMasterServiceInterceptor) At FinalAMSProcessor.java:org.apache.hadoop.yarn.server.resourcemanager.FinalAMSProcessor() passes null for non-null parameter of new AMSProcessor(ApplicationMasterServiceInterceptor) At FinalAMSProcessor.java:[line 95] | | | Dead store to appAttempt in org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService$OCAInterceptor.afterAllocate(ApplicationAttemptId, AllocateRequest, AllocateResponse, Object) At OpportunisticContainerAllocatorAMService.java:org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService$OCAInterceptor.afterAllocate(ApplicationAttemptId, AllocateRequest, AllocateResponse, Object) At OpportunisticContai
[jira] [Commented] (YARN-2919) Potential race between renew and cancel in DelegationTokenRenwer
[ https://issues.apache.org/jira/browse/YARN-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075901#comment-16075901 ] Jian He commented on YARN-2919: --- Got a question, [~Naganarasimha]: could you elaborate on what the race condition is, based on the code? The jira description is a bit vague. The patch uses a local variable in the token itself to indicate cancelling; I'm unsure if this is a good way. A caller can just make a copy of the token and do all sorts of operations, and this flag becomes moot. Also, there are some behavior changes: the return value of the renew method was supposed to be the expiration time, but now '-1' is returned as an error code, which old programs do not understand - old programs were expecting an exception if renew fails. And it's possible for an old program to wrongly interpret the '-1' as the expiration time. > Potential race between renew and cancel in DelegationTokenRenwer > - > > Key: YARN-2919 > URL: https://issues.apache.org/jira/browse/YARN-2919 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 2.6.0 >Reporter: Karthik Kambatla >Assignee: Naganarasimha G R >Priority: Critical > Attachments: YARN-2919.002.patch, YARN-2919.003.patch, > YARN-2919.004.patch, YARN-2919.005.patch, YARN-2919.20141209-1.patch > > > YARN-2874 fixes a deadlock in DelegationTokenRenewer, but there is still a > race because of which a renewal in flight isn't interrupted by a cancel. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
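A purely hypothetical illustration of the concern about the '-1' return value in the comment above (not code from the patch; the {{Renewer}} interface is invented). A caller written against the old contract, where renew returns the next expiration time and signals failure with an exception, has no reason to check for a sentinel:
{code}
public class OldStyleCallerSketch {
  // Invented stand-in for the old renew contract: return expiration, throw on failure.
  interface Renewer {
    long renew(String token) throws Exception;
  }

  static void scheduleNextRenewal(Renewer renewer, String token) throws Exception {
    long expiration = renewer.renew(token);
    // If renew() starts returning -1 as an error code instead of throwing,
    // nothing here notices: the computed delay silently goes negative and the
    // renewal logic misbehaves instead of failing loudly.
    long delay = expiration - System.currentTimeMillis();
    System.out.println("scheduling next renewal in " + delay + " ms");
  }

  public static void main(String[] args) throws Exception {
    scheduleNextRenewal(t -> -1L, "token");  // simulated failed renewal under the new behavior
  }
}
{code}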
[jira] [Commented] (YARN-6102) On failover RM can crash due to unregistered event to AsyncDispatcher
[ https://issues.apache.org/jira/browse/YARN-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075932#comment-16075932 ] Rohith Sharma K S commented on YARN-6102: - Thanks [~ajithshetty] and [~Naganarasimha] for the explanation. bq. Stopping of service will not stop the rpc thread handling the current event/call Ah, I see. Then, as a solution, we need to ensure RPC threads are stopped before returning from the service stop method. [~ajithshetty], would you change your patch to ensure RPC threads are stopped? This would be a common solution for all such race-condition issues. > On failover RM can crash due to unregistered event to AsyncDispatcher > - > > Key: YARN-6102 > URL: https://issues.apache.org/jira/browse/YARN-6102 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.0, 2.7.3 >Reporter: Ajith S >Assignee: Ajith S >Priority: Critical > Attachments: eventOrder.JPG > > > {code}2017-01-17 16:42:17,911 FATAL [AsyncDispatcher event handler] > event.AsyncDispatcher (AsyncDispatcher.java:dispatch(200)) - Error in > dispatcher thread > java.lang.Exception: No handler for registered for class > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEventType > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:196) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:120) > at java.lang.Thread.run(Thread.java:745) > 2017-01-17 16:42:17,914 INFO [AsyncDispatcher ShutDown handler] > event.AsyncDispatcher (AsyncDispatcher.java:run(303)) - Exiting, bbye..{code} > The same stack trace was also noticed when {{TestResourceTrackerOnHA}} exits > abnormally; after some analysis, I was able to reproduce it. > Once the nodeHeartBeat is sent to the RM, inside > {{org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest)}}, > before it is sent to the dispatcher through > {{this.rmContext.getDispatcher().getEventHandler().handle(nodeStatusEvent);}}, > if RM failover is called, the dispatcher is reset. > The new dispatcher, however, is first started and only then are the events > registered, at > {{org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(boolean)}}. > So the event order will look like: > 1. Send the node heartbeat to {{ResourceTrackerService}}. > 2. In {{ResourceTrackerService.nodeHeartbeat}}, before passing the event to the dispatcher, > call RM failover. > 3. In RM failover, the current active RM resets the dispatcher in reinitialize, i.e. ( > {{resetDispatcher();}} + {{createAndInitActiveServices();}} ). > Now, between {{resetDispatcher();}} and {{createAndInitActiveServices();}}, > {{ResourceTrackerService.nodeHeartbeat}} invokes the dispatcher. > This causes the above error because, at the point in time when the {{STATUS_UPDATE}} > event is given to the dispatcher in {{ResourceTrackerService}}, the new > dispatcher (from the failover) may be started but not yet registered for events. > Using the same steps (pausing the JVM in the debugger), I was able to reproduce this in a > production cluster as well, for the {{STATUS_UPDATE}} active service event, when the > service has yet to forward the event to the RM dispatcher but a failover is called > and the dispatcher reset is between {{resetDispatcher();}} and > {{createAndInitActiveServices();}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
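A minimal sketch of the direction suggested in the comment above, assuming the service keeps a handle to its RPC server and that the server exposes a way to wait for in-flight handler threads. The {{RpcServer}} interface and names are stand-ins for illustration, not the actual ResourceTrackerService or Hadoop IPC API:
{code}
// Sketch only: serviceStop() must not return while an RPC handler thread may
// still be about to hand an event to the dispatcher that failover is resetting.
public abstract class RpcBackedServiceSketch {

  interface RpcServer {
    void stop();                                           // stop accepting new calls
    void awaitTermination() throws InterruptedException;   // wait for in-flight handlers to drain
  }

  private RpcServer rpcServer;

  protected void serviceStop() throws Exception {
    if (rpcServer != null) {
      rpcServer.stop();
      // Block here until the handler threads have finished the calls they are
      // currently processing, so no STATUS_UPDATE event can reach a dispatcher
      // that has been reset but not yet re-registered for events.
      rpcServer.awaitTermination();
      rpcServer = null;
    }
  }
}
{code}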