[jira] [Commented] (YARN-1279) Expose a client API to allow clients to figure if log aggregation is complete
[ https://issues.apache.org/jira/browse/YARN-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815738#comment-13815738 ] Jian He commented on YARN-1279:
---
- Does it make sense to send an event to RMApp to process the app log status, instead of explicitly creating an update API on RMApp?
- Why is it possible for a single node to first get log aggregation succeeded and then failed?
{code} currentState == LogAggregationState.COMPLETED && status.getLogAggregationState() == LogAggregationState.FAILED {code}
- I think we can have separate maps: one for nodes whose aggregation succeeded, and one for nodes whose aggregation failed. Then we don't need the two extra counters for succeeded/failed nodes, or their increment/decrement logic.
- It would be good to append the failed-aggregation node info, as well as the diagnostics coming with ApplicationLogStatus, to the diagnostics of the app.
- Do we need a separate Timeout state? Would it be enough to append the timeout diagnostics and return the state as Failed?
- This logic can be simplified to: if the timeout period is exceeded, return FAILED or TIME_OUT; otherwise return IN_PROGRESS. That way we can remove the logAggregationTimeOutDisabled boolean.
{code}
if (this.logAggregationTimeOutDisabled) {
  return LogAggregationState.IN_PROGRESS;
} else {
  if (System.currentTimeMillis() - this.finishTime <= this.logAggregationTimeOut) {
    return LogAggregationState.IN_PROGRESS;
  }
  return LogAggregationState.TIME_OUT;
}
{code}
- containerLogAggregationFail doesn't need to be an AtomicBoolean.
- ApplicationLogStatus would be better named ApplicationLogAggregationStatus.
- [~vinodkv] For the time being, do we want to keep both the application-level log status and the per-container log status in ApplicationLogStatus.java, which is sent from the NM to the RM?
> Expose a client API to allow clients to figure if log aggregation is complete > - > > Key: YARN-1279 > URL: https://issues.apache.org/jira/browse/YARN-1279 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Arun C Murthy >Assignee: Xuan Gong > Attachments: YARN-1279.1.patch, YARN-1279.2.patch, YARN-1279.2.patch, > YARN-1279.3.patch, YARN-1279.3.patch, YARN-1279.4.patch, YARN-1279.4.patch, > YARN-1279.5.patch, YARN-1279.6.patch, YARN-1279.7.patch > > > Expose a client API to allow clients to figure if log aggregation is complete -- This message was sent by Atlassian JIRA (v6.1#6144)
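The timeout simplification suggested in the review above can be sketched in plain Java; the class, method, and parameter names below are hypothetical stand-ins for those in the patch, using a non-positive timeout value to mean "timeout disabled" instead of a separate boolean:

```java
// A minimal sketch of the simplified check: no logAggregationTimeOutDisabled
// boolean; a non-positive timeout means timeouts are disabled.
// All names here are illustrative, not the actual patch code.
public class LogAggregationSketch {
    public enum LogAggregationState { IN_PROGRESS, COMPLETED, FAILED, TIME_OUT }

    public static LogAggregationState currentState(long nowMillis,
                                                   long finishTimeMillis,
                                                   long timeoutMillis) {
        // Exceeded the timeout period: report TIME_OUT (or FAILED, per the
        // open question above about whether a separate Timeout state is needed).
        if (timeoutMillis > 0 && nowMillis - finishTimeMillis > timeoutMillis) {
            return LogAggregationState.TIME_OUT;
        }
        return LogAggregationState.IN_PROGRESS;
    }
}
```

With this shape the "disabled" case falls out of the single comparison rather than needing its own branch.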
[jira] [Updated] (YARN-1366) ApplicationMasterService should Resync with the AM upon allocate call after restart
[ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-1366: Attachment: YARN-1366.patch Correct me if I am wrong: I have prepared an initial patch and attached it. The RM should differentiate between the Resync and Shutdown commands. Please review whether this fulfills the expectations mentioned in the JIRA. > ApplicationMasterService should Resync with the AM upon allocate call after > restart > --- > > Key: YARN-1366 > URL: https://issues.apache.org/jira/browse/YARN-1366 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Bikas Saha > Attachments: YARN-1366.patch > > > The ApplicationMasterService currently sends a resync response to which the > AM responds by shutting down. The AM behavior is expected to change to > resyncing with the RM. Resync means resetting the allocate RPC > sequence number to 0 and the AM should send its entire outstanding request to > the RM. Note that if the AM is making its first allocate call to the RM then > things should proceed like normal without needing a resync. The RM will > return all containers that have completed since the RM last synced with the > AM. Some container completions may be reported more than once. -- This message was sent by Atlassian JIRA (v6.1#6144)
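The resync behavior described in the issue — reset the allocate RPC sequence number to 0 and resend the entire outstanding request set instead of shutting down — can be modeled with a small sketch; the class, field, and method names are illustrative, not the real AM/RM protocol types:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the AM-side behavior: on RESYNC, reset the sequence number
// and resend everything still outstanding. Names are hypothetical.
public class ResyncSketch {
    public enum AMCommand { RESYNC, SHUTDOWN }

    public int responseId = 0;                          // allocate RPC sequence number
    public final List<String> outstanding = new ArrayList<>(); // pending asks

    /** Returns the asks to send in the next allocate call. */
    public List<String> onAllocateResponse(AMCommand cmd, List<String> newAsks) {
        if (cmd == AMCommand.RESYNC) {
            // Resync: reset sequence number, resend all outstanding requests
            // plus the new ones, rather than shutting down.
            responseId = 0;
            List<String> resend = new ArrayList<>(outstanding);
            resend.addAll(newAsks);
            outstanding.addAll(newAsks);
            return resend;
        }
        responseId++;                 // normal path: just advance the sequence
        outstanding.addAll(newAsks);
        return new ArrayList<>(newAsks);
    }
}
```

A first allocate call (no command received yet) proceeds normally, matching the note in the issue that a first call needs no resync.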
[jira] [Commented] (YARN-1390) Add applicationSource to ApplicationSubmissionContext and RMApp
[ https://issues.apache.org/jira/browse/YARN-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815657#comment-13815657 ] Karthik Kambatla commented on YARN-1390: [~zjshen], good idea. Allowing multiple applicationTypes for an application gives even more flexibility. The proto change of making applicationType repeated instead of optional is also compatible. [~vinodkv], [~tucu00] - do you agree this is a reasonable approach? > Add applicationSource to ApplicationSubmissionContext and RMApp > --- > > Key: YARN-1390 > URL: https://issues.apache.org/jira/browse/YARN-1390 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.2.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > > In addition to other fields like application-type (added in YARN-563), it is > useful to have an applicationSource field to track the source of an > application. The application source can be useful in (1) fetching only those > applications a user is interested in, (2) potentially adding source-specific > optimizations in the future. > Examples of sources are: User-defined project names, Pig, Hive, Oozie, Sqoop > etc. -- This message was sent by Atlassian JIRA (v6.1#6144)
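As a toy illustration of the first use case above (fetching only the applications a user is interested in) once applicationType is a repeated field: an app can carry several types and matches a query if any of them is requested. The names here are illustrative, not the YARN API:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical filtering sketch: each app maps to a set of types
// (e.g. "Pig" plus a user-defined project name), and a query matches
// if any of the app's types is in the requested set.
public class AppTypeFilterSketch {
    public static List<String> filter(Map<String, Set<String>> appToTypes,
                                      Set<String> requested) {
        return appToTypes.entrySet().stream()
            .filter(e -> e.getValue().stream().anyMatch(requested::contains))
            .map(Map.Entry::getKey)
            .sorted() // deterministic output for display
            .collect(Collectors.toList());
    }
}
```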
[jira] [Commented] (YARN-1307) Rethink znode structure for RM HA
[ https://issues.apache.org/jira/browse/YARN-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815605#comment-13815605 ] Tsuyoshi OZAWA commented on YARN-1307: -- I'm now analysing the reason why Jenkins fails to compile with my latest patch. > Rethink znode structure for RM HA > - > > Key: YARN-1307 > URL: https://issues.apache.org/jira/browse/YARN-1307 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Tsuyoshi OZAWA >Assignee: Tsuyoshi OZAWA > Attachments: YARN-1307.1.patch, YARN-1307.2.patch, YARN-1307.3.patch, > YARN-1307.4-2.patch, YARN-1307.4.patch > > > Rethinking the znode structure for RM HA has been proposed in some JIRAs (YARN-659, > YARN-1222). The motivation of this JIRA is quoted from Bikas' comment in > YARN-1222: > {quote} > We should move to creating a node hierarchy for apps such that all znodes for > an app are stored under an app znode instead of the app root znode. This will > help in removeApplication and also in scaling better on ZK. The earlier code > was written this way to ensure create/delete happens under a root znode for > fencing. But given that we have moved to multi-operations globally, this isn't > required anymore. > {quote} -- This message was sent by Atlassian JIRA (v6.1#6144)
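The per-app hierarchy quoted above can be illustrated with plain path construction: all znodes for an app live under one app znode, so removeApplication becomes a recursive delete of a single subtree. The root path below is an assumption for illustration, not the store's actual layout:

```java
// Illustrative znode layout for the proposed hierarchy. Paths are
// hypothetical; the real store's root and id formats may differ.
public class ZnodeLayoutSketch {
    static final String APP_ROOT = "/rmstore/ZKRMStateRoot/RMAppRoot";

    public static String appPath(String appId) {
        // One znode per app, directly under the app root.
        return APP_ROOT + "/" + appId;
    }

    public static String attemptPath(String appId, String attemptId) {
        // Attempt znodes nest under their app znode instead of the app root,
        // so deleting the app znode subtree removes all of its state.
        return appPath(appId) + "/" + attemptId;
    }
}
```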
[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815592#comment-13815592 ] Wei Yan commented on YARN-1021: --- [~curino], thanks for your comment. Currently we put the instructions in the .pdf document and the generated site document. We already have a simple rumen trace (hadoop-sls/src/main/data/2job2min-rumen-jh.json) and example configurations (hadoop-sls/src/main/sample-conf). I'll make this clear in the README. YARN-1393 is created for this issue. > Yarn Scheduler Load Simulator > - > > Key: YARN-1021 > URL: https://issues.apache.org/jira/browse/YARN-1021 > Project: Hadoop YARN > Issue Type: New Feature > Components: scheduler >Reporter: Wei Yan >Assignee: Wei Yan > Fix For: 2.3.0 > > Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf > > > The Yarn Scheduler is a fertile area of interest with different > implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, > several optimizations are also made to improve scheduler performance for > different scenarios and workload. Each scheduler algorithm has its own set of > features, and drives scheduling decisions by many factors, such as fairness, > capacity guarantee, resource availability, etc. It is very important to > evaluate a scheduler algorithm very well before we deploy it in a production > cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling > algorithm. Evaluating in a real cluster is always time- and cost-consuming, > and it is also very hard to find a large-enough cluster. Hence, a simulator > which can predict how well a scheduler algorithm performs for some specific workload > would be quite useful. 
> We want to build a Scheduler Load Simulator to simulate large-scale Yarn > clusters and application loads on a single machine. This would be invaluable > in furthering Yarn by providing a tool for researchers and developers to > prototype new scheduler features and predict their behavior and performance > with a reasonable amount of confidence, thereby aiding rapid innovation. > The simulator will exercise the real Yarn ResourceManager, removing the > network factor, by simulating NodeManagers and ApplicationMasters via handling > and dispatching NM/AM heartbeat events from within the same JVM. > To keep track of scheduler behavior and performance, a scheduler wrapper > will wrap the real scheduler. > The simulator will produce real-time metrics while executing, including: > * Resource usage for the whole cluster and each queue, which can be utilized to > configure the cluster's and each queue's capacity. > * The detailed application execution trace (recorded in relation to simulated > time), which can be analyzed to understand/validate the scheduler behavior > (individual jobs' turnaround time, throughput, fairness, capacity guarantee, > etc.). > * Several key metrics of the scheduler algorithm, such as the time cost of each > scheduler operation (allocate, handle, etc.), which can be utilized by Hadoop > developers to find hot spots and scalability limits. > The simulator will provide real-time charts showing the behavior of the > scheduler and its performance. > A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing > how to use the simulator to simulate the Fair Scheduler and Capacity Scheduler. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1393) Add how-to-use instruction in README for Yarn Scheduler Load Simulator
Wei Yan created YARN-1393: - Summary: Add how-to-use instruction in README for Yarn Scheduler Load Simulator Key: YARN-1393 URL: https://issues.apache.org/jira/browse/YARN-1393 Project: Hadoop YARN Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan The instructions are currently in the .pdf document and the site page. The README needs to include simple instructions so users can quickly pick it up. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815573#comment-13815573 ] Carlo Curino commented on YARN-1021: Hi Wei, it would be nice to add in the README how to use the simulator (e.g., by having a super simple rumen trace, and configs one can use right away, including pointers to the nice visualizations you have). I was looking at it today, and while I am sure I can figure it out digging around more, quick instructions will make it more likely that people pick it up. > Yarn Scheduler Load Simulator > - > > Key: YARN-1021 > URL: https://issues.apache.org/jira/browse/YARN-1021 > Project: Hadoop YARN > Issue Type: New Feature > Components: scheduler >Reporter: Wei Yan >Assignee: Wei Yan > Fix For: 2.3.0 > > Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf > > > The Yarn Scheduler is a fertile area of interest with different > implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, > several optimizations are also made to improve scheduler performance for > different scenarios and workload. Each scheduler algorithm has its own set of > features, and drives scheduling decisions by many factors, such as fairness, > capacity guarantee, resource availability, etc. It is very important to > evaluate a scheduler algorithm very well before we deploy it in a production > cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling > algorithm. Evaluating in a real cluster is always time- and cost-consuming, > and it is also very hard to find a large-enough cluster. 
Hence, a simulator > which can predict how well a scheduler algorithm performs for some specific workload > would be quite useful. > We want to build a Scheduler Load Simulator to simulate large-scale Yarn > clusters and application loads on a single machine. This would be invaluable > in furthering Yarn by providing a tool for researchers and developers to > prototype new scheduler features and predict their behavior and performance > with a reasonable amount of confidence, thereby aiding rapid innovation. > The simulator will exercise the real Yarn ResourceManager, removing the > network factor, by simulating NodeManagers and ApplicationMasters via handling > and dispatching NM/AM heartbeat events from within the same JVM. > To keep track of scheduler behavior and performance, a scheduler wrapper > will wrap the real scheduler. > The simulator will produce real-time metrics while executing, including: > * Resource usage for the whole cluster and each queue, which can be utilized to > configure the cluster's and each queue's capacity. > * The detailed application execution trace (recorded in relation to simulated > time), which can be analyzed to understand/validate the scheduler behavior > (individual jobs' turnaround time, throughput, fairness, capacity guarantee, > etc.). > * Several key metrics of the scheduler algorithm, such as the time cost of each > scheduler operation (allocate, handle, etc.), which can be utilized by Hadoop > developers to find hot spots and scalability limits. > The simulator will provide real-time charts showing the behavior of the > scheduler and its performance. > A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing > how to use the simulator to simulate the Fair Scheduler and Capacity Scheduler. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1242) AHS start as independent process
[ https://issues.apache.org/jira/browse/YARN-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815534#comment-13815534 ] Hadoop QA commented on YARN-1242: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612490/YARN-1242-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2389//console This message is automatically generated. > AHS start as independent process > > > Key: YARN-1242 > URL: https://issues.apache.org/jira/browse/YARN-1242 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Mayank Bansal > Attachments: YARN-1242-1.patch > > > Maybe we should include AHS classes as well (for developer usage) in yarn and > yarn.cmd -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-955) [YARN-321] Implementation of ApplicationHistoryProtocol
[ https://issues.apache.org/jira/browse/YARN-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815532#comment-13815532 ] Hadoop QA commented on YARN-955: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612493/YARN-955-3.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2388//console This message is automatically generated. > [YARN-321] Implementation of ApplicationHistoryProtocol > --- > > Key: YARN-955 > URL: https://issues.apache.org/jira/browse/YARN-955 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Mayank Bansal > Attachments: YARN-955-1.patch, YARN-955-2.patch, YARN-955-3.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-955) [YARN-321] Implementation of ApplicationHistoryProtocol
[ https://issues.apache.org/jira/browse/YARN-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-955: --- Attachment: YARN-955-3.patch Attaching latest patch Thanks, Mayank > [YARN-321] Implementation of ApplicationHistoryProtocol > --- > > Key: YARN-955 > URL: https://issues.apache.org/jira/browse/YARN-955 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Mayank Bansal > Attachments: YARN-955-1.patch, YARN-955-2.patch, YARN-955-3.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-955) [YARN-321] Implementation of ApplicationHistoryProtocol
[ https://issues.apache.org/jira/browse/YARN-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815501#comment-13815501 ] Mayank Bansal commented on YARN-955:
bq. Not necessary. The default value can be read from yarn-default.xml.
The problem is that you cannot specify the prefix variables like that in the xml file. This default URI will be a relative path based on the current directory. Done
bq. Maybe just call it AHS_ADDRESS
Done
bq. The nested class is not necessary. ApplicationHistoryClientService can implement ApplicationHistoryProtocol directly.
I think it's cleaner than having the service implement the protocol; this way you can pass the protocol handler around, and the service can handle multiple protocols in the future.
bq. Not necessary wrap-up. Please place the simple statement directly in the callers. Same for getApplications.
Done
bq. Personally, I think returning empty collections is fine to indicate no results. Otherwise, the caller always needs to check for null first.
Done
bq. Why do you want two references pointing to the same object?
Done
bq. 7. In the original design, we said we're going to make AHS a service of RM, though it should be independent enough. In this patch, I can see AHS is going to be a completely independent process. So far, it should be OK, because AHS needs nothing from RM. However, I'm expecting some more security work to do if AHS is a separate process, as AHS and RM will not share the common context, and may be launched by different users. Vinod Kumar Vavilapalli, do you have any opinion about service or process?
I think we are having the same discussion in YARN-1266; let's continue there.
bq. Anyway, if we decide to make AHS a process now, this patch should also include the shell script to launch AHS.
YARN-1242 caters to that. 
Thanks, Mayank > [YARN-321] Implementation of ApplicationHistoryProtocol > --- > > Key: YARN-955 > URL: https://issues.apache.org/jira/browse/YARN-955 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Mayank Bansal > Attachments: YARN-955-1.patch, YARN-955-2.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1242) AHS start as independent process
[ https://issues.apache.org/jira/browse/YARN-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-1242: Attachment: YARN-1242-1.patch Attaching initial patch Thanks, Mayank > AHS start as independent process > > > Key: YARN-1242 > URL: https://issues.apache.org/jira/browse/YARN-1242 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Mayank Bansal > Attachments: YARN-1242-1.patch > > > Maybe we should include AHS classes as well (for developer usage) in yarn and > yarn.cmd -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1242) AHS start as independent process
[ https://issues.apache.org/jira/browse/YARN-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-1242: Summary: AHS start as independent process (was: AHS's resource needs be added to RM's classpath) > AHS start as independent process > > > Key: YARN-1242 > URL: https://issues.apache.org/jira/browse/YARN-1242 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Mayank Bansal > > Maybe we should include AHS classes as well (for developer usage) in yarn and > yarn.cmd -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1279) Expose a client API to allow clients to figure if log aggregation is complete
[ https://issues.apache.org/jira/browse/YARN-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815488#comment-13815488 ] Hadoop QA commented on YARN-1279: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612450/YARN-1279.7.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.mapred.TestJobCleanup The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.mapreduce.v2.TestUberAM org.apache.hadoop.mapred.TestMultiFileInputFormat {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2386//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2386//console This message is automatically generated. > Expose a client API to allow clients to figure if log aggregation is complete > - > > Key: YARN-1279 > URL: https://issues.apache.org/jira/browse/YARN-1279 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Arun C Murthy >Assignee: Xuan Gong > Attachments: YARN-1279.1.patch, YARN-1279.2.patch, YARN-1279.2.patch, > YARN-1279.3.patch, YARN-1279.3.patch, YARN-1279.4.patch, YARN-1279.4.patch, > YARN-1279.5.patch, YARN-1279.6.patch, YARN-1279.7.patch > > > Expose a client API to allow clients to figure if log aggregation is complete -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-987) Adding History Service to use Store and converting Historydata to Report
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815417#comment-13815417 ] Zhijie Shen commented on YARN-987: -- +1 ApplicationHistoryManagerImpl may be changed accordingly when cache is ready, but it's a separate thing. > Adding History Service to use Store and converting Historydata to Report > > > Key: YARN-987 > URL: https://issues.apache.org/jira/browse/YARN-987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, > YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch, YARN-987-7.patch, > YARN-987-8.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-987) Adding History Service to use Store and converting Historydata to Report
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815397#comment-13815397 ] Hadoop QA commented on YARN-987: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612458/YARN-987-8.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2387//console This message is automatically generated. > Adding History Service to use Store and converting Historydata to Report > > > Key: YARN-987 > URL: https://issues.apache.org/jira/browse/YARN-987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, > YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch, YARN-987-7.patch, > YARN-987-8.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-987) Adding History Service to use Store and converting Historydata to Report
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-987: --- Attachment: YARN-987-8.patch Attaching the latest patch. Thanks, Mayank > Adding History Service to use Store and converting Historydata to Report > > > Key: YARN-987 > URL: https://issues.apache.org/jira/browse/YARN-987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, > YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch, YARN-987-7.patch, > YARN-987-8.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-987) Adding History Service to use Store and converting Historydata to Report
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815389#comment-13815389 ] Mayank Bansal commented on YARN-987: Done. Thanks, Mayank > Adding History Service to use Store and converting Historydata to Report > > > Key: YARN-987 > URL: https://issues.apache.org/jira/browse/YARN-987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, > YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch, YARN-987-7.patch, > YARN-987-8.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-987) Adding History Service to use Store and converting Historydata to Report
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815351#comment-13815351 ] Zhijie Shen commented on YARN-987:
There is no need for this change, which breaks TestMemoryApplicationHistoryStore and TestFileSystemApplicationHistoryStore:
{code}
-appAttemptId, appAttemptId.toString(), 0,
+appAttemptId, "localhost", 0,
{code}
Why not simply change
{code}
+Assert.assertEquals("localhost", appReport.getHost());
{code}
to
{code}
+Assert.assertEquals(appAttemptId.toString(), appReport.getHost());
{code}
> Adding History Service to use Store and converting Historydata to Report > > > Key: YARN-987 > URL: https://issues.apache.org/jira/browse/YARN-987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, > YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch, YARN-987-7.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1279) Expose a client API to allow clients to figure if log aggregation is complete
[ https://issues.apache.org/jira/browse/YARN-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1279: Attachment: YARN-1279.7.patch Fixed the -1 javadoc warning. Fixed the TestYarnCLI test case failure. > Expose a client API to allow clients to figure if log aggregation is complete > - > > Key: YARN-1279 > URL: https://issues.apache.org/jira/browse/YARN-1279 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Arun C Murthy >Assignee: Xuan Gong > Attachments: YARN-1279.1.patch, YARN-1279.2.patch, YARN-1279.2.patch, > YARN-1279.3.patch, YARN-1279.3.patch, YARN-1279.4.patch, YARN-1279.4.patch, > YARN-1279.5.patch, YARN-1279.6.patch, YARN-1279.7.patch > > > Expose a client API to allow clients to figure if log aggregation is complete -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1279) Expose a client API to allow clients to figure if log aggregation is complete
[ https://issues.apache.org/jira/browse/YARN-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815306#comment-13815306 ] Hadoop QA commented on YARN-1279: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612422/YARN-1279.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.cli.TestYarnCLI The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.mapreduce.v2.TestUberAM {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2384//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2384//console This message is automatically generated. > Expose a client API to allow clients to figure if log aggregation is complete > - > > Key: YARN-1279 > URL: https://issues.apache.org/jira/browse/YARN-1279 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Arun C Murthy >Assignee: Xuan Gong > Attachments: YARN-1279.1.patch, YARN-1279.2.patch, YARN-1279.2.patch, > YARN-1279.3.patch, YARN-1279.3.patch, YARN-1279.4.patch, YARN-1279.4.patch, > YARN-1279.5.patch, YARN-1279.6.patch > > > Expose a client API to allow clients to figure if log aggregation is complete -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1222) Make improvements in ZKRMStateStore for fencing
[ https://issues.apache.org/jira/browse/YARN-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815280#comment-13815280 ] Bikas Saha commented on YARN-1222: -- bq. Post YARN-1318, I think RMStateStore constructor should take RMContext. Then, we should be able to replace the RPC approach with rmContext.getHAService.transitionToStandby() Great, let's track that and put a comment in. Doing a self-RPC is good to avoid. bq. A completely different approach might be to keep handleStoreFencedException() in ResourceManager and have the store implementation call it when it realizes it got fenced. Thoughts? That's what I was suggesting. The store reports this exception/error to the RM, and then the RM does the right thing (in this case, transitionToStandby). notifyDoneStoringApplicationAttempt() etc. should not be sent when there is a fenced exception. Extending that, we should probably only send the notifyDone* callbacks upon success. That way the callees need to be bothered only with the normal/success code path. Any exception should be reported to the RM. The RM can examine the exception to see if it is a fenced exception and then transitionToStandby(). If it is some other exception, then die (like we currently do in multiple different places; we would now do it in one place). > Make improvements in ZKRMStateStore for fencing > --- > > Key: YARN-1222 > URL: https://issues.apache.org/jira/browse/YARN-1222 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bikas Saha >Assignee: Karthik Kambatla > Attachments: yarn-1222-1.patch, yarn-1222-2.patch, yarn-1222-3.patch, > yarn-1222-4.patch, yarn-1222-5.patch > > > Using multi-operations for every ZK interaction. > In every operation, automatically creating/deleting a lock znode that is the > child of the root znode. This is to achieve fencing by modifying the > create/delete permissions on the root znode. -- This message was sent by Atlassian JIRA (v6.1#6144)
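The failure-reporting flow described above could be sketched roughly as follows. All class and method names here (RMStateStoreListener, notifyStoreOperationFailed, ResourceManagerSketch) are illustrative assumptions, not the actual RMStateStore API:

```java
// Sketch: the store reports failures to the RM instead of handling them
// itself; notifyDone* fires only on success, so callees see only the
// normal path. All names here are illustrative, not real YARN classes.
class StoreFencedException extends Exception {}

interface RMStateStoreListener {
  void notifyDoneStoringApplicationAttempt(String attemptId);
  void notifyStoreOperationFailed(Exception cause);
}

class ResourceManagerSketch implements RMStateStoreListener {
  boolean standby = false;

  @Override
  public void notifyDoneStoringApplicationAttempt(String attemptId) {
    // success path only: callers never see store errors here
  }

  @Override
  public void notifyStoreOperationFailed(Exception cause) {
    if (cause instanceof StoreFencedException) {
      transitionToStandby();               // fenced: another RM is active
    } else {
      throw new RuntimeException(cause);   // any other failure: die, in one place
    }
  }

  void transitionToStandby() { standby = true; }
}
```

The point of the shape is that the exception-examination and die-or-standby decision lives in exactly one place instead of being repeated at each store call site.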
[jira] [Commented] (YARN-987) Adding History Service to use Store and converting Historydata to Report
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815228#comment-13815228 ] Hadoop QA commented on YARN-987: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612429/YARN-987-7.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2385//console This message is automatically generated. > Adding History Service to use Store and converting Historydata to Report > > > Key: YARN-987 > URL: https://issues.apache.org/jira/browse/YARN-987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, > YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch, YARN-987-7.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-987) Adding History Service to use Store and converting Historydata to Report
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-987: --- Attachment: YARN-987-7.patch Attaching latest patch Thanks, Mayank > Adding History Service to use Store and converting Historydata to Report > > > Key: YARN-987 > URL: https://issues.apache.org/jira/browse/YARN-987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, > YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch, YARN-987-7.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1266) Adding ApplicationHistoryProtocolPBService
[ https://issues.apache.org/jira/browse/YARN-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815216#comment-13815216 ] Mayank Bansal commented on YARN-1266: - bq. Correct my previous comment. It seems that we can have a single protocol, but AHS can have separate server-side implementation of it. YarnClient will then be modified to query the completed applications/attempts/containers from the AHS's implementation instead. It's similar to MRClientProtocol, which has two implementations in MR and in JHS. If we want to do two implementations anyway, the better choice would be a root interface from which both interfaces derive, each with its own implementation. > Adding ApplicationHistoryProtocolPBService > -- > > Key: YARN-1266 > URL: https://issues.apache.org/jira/browse/YARN-1266 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-1266-1.patch, YARN-1266-2.patch > > > Adding ApplicationHistoryProtocolPBService to make web apps work and > changing yarn to run AHS as a separate process -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-987) Adding History Service to use Store and converting Historydata to Report
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815211#comment-13815211 ] Mayank Bansal commented on YARN-987: bq. 1. Would you please fix the capitalization? Done bq. It seems not necessary that convertToReport is public as well. Done bq. It's better to use the constant in YarnConfiguration and config.getClass/getClassByName. Done bq. No need to convert to string to assert equivalence. You can do that directly with the ID instances. In addition, please assert some fields from the application attempt, such as "host" Done bq. You can refer to RMAppImpl#createAndGetApplicationReport to decide what the fields should be when the attempt is null. Done Thanks, Mayank > Adding History Service to use Store and converting Historydata to Report > > > Key: YARN-987 > URL: https://issues.apache.org/jira/browse/YARN-987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, > YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1279) Expose a client API to allow clients to figure if log aggregation is complete
[ https://issues.apache.org/jira/browse/YARN-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815190#comment-13815190 ] Xuan Gong commented on YARN-1279: - Uploaded an initial patch on YARN-1376 for the NM-side changes. You may want to take a look to get the whole picture. > Expose a client API to allow clients to figure if log aggregation is complete > - > > Key: YARN-1279 > URL: https://issues.apache.org/jira/browse/YARN-1279 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Arun C Murthy >Assignee: Xuan Gong > Attachments: YARN-1279.1.patch, YARN-1279.2.patch, YARN-1279.2.patch, > YARN-1279.3.patch, YARN-1279.3.patch, YARN-1279.4.patch, YARN-1279.4.patch, > YARN-1279.5.patch, YARN-1279.6.patch > > > Expose a client API to allow clients to figure if log aggregation is complete -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1376) NM need to notify the log aggregation status to RM through Node heartbeat
[ https://issues.apache.org/jira/browse/YARN-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1376: Attachment: YARN-1376.1.patch The patch is based on YARN-1279, and is for NM side changes > NM need to notify the log aggregation status to RM through Node heartbeat > - > > Key: YARN-1376 > URL: https://issues.apache.org/jira/browse/YARN-1376 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-1376.1.patch > > > Expose a client API to allow clients to figure if log aggregation is > complete. The ticket is used to track the changes on NM side -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1279) Expose a client API to allow clients to figure if log aggregation is complete
[ https://issues.apache.org/jira/browse/YARN-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1279: Attachment: YARN-1279.6.patch The RMNode will receive one applicationLogStatus per application (which includes applicationId, applicationLogAggregationState, Map) through the NM heartbeat. Also added a TIME_OUT state to LogAggregationState. If some NMs are shut down, it is very possible that we will miss the applicationLogStatus from those NMs. Instead of continuing to show the IN_PROGRESS state, we can return TIME_OUT based on how long we have waited. > Expose a client API to allow clients to figure if log aggregation is complete > - > > Key: YARN-1279 > URL: https://issues.apache.org/jira/browse/YARN-1279 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Arun C Murthy >Assignee: Xuan Gong > Attachments: YARN-1279.1.patch, YARN-1279.2.patch, YARN-1279.2.patch, > YARN-1279.3.patch, YARN-1279.3.patch, YARN-1279.4.patch, YARN-1279.4.patch, > YARN-1279.5.patch, YARN-1279.6.patch > > > Expose a client API to allow clients to figure if log aggregation is complete -- This message was sent by Atlassian JIRA (v6.1#6144)
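The state decision described above (fall back to TIME_OUT once the wait after the app finishes exceeds a limit, instead of showing IN_PROGRESS forever) could be sketched like this. The enum values follow the discussion, but the method and its signature are simplified assumptions, not the patch itself:

```java
// Simplified decision table for a node's log aggregation state: a received
// report from the NM wins; otherwise stay IN_PROGRESS until a timeout
// elapses, then report TIME_OUT (e.g. the NM was shut down and will never
// report). Illustrative only, not the actual patch code.
enum LogAggregationState { IN_PROGRESS, COMPLETED, FAILED, TIME_OUT }

class LogAggregationTracker {
  static LogAggregationState stateAt(long nowMs, long appFinishMs, long timeoutMs,
                                     boolean reportReceived, boolean succeeded) {
    if (reportReceived) {
      return succeeded ? LogAggregationState.COMPLETED : LogAggregationState.FAILED;
    }
    // No report yet: stop waiting once timeoutMs has passed since app finish.
    return (nowMs - appFinishMs > timeoutMs)
        ? LogAggregationState.TIME_OUT
        : LogAggregationState.IN_PROGRESS;
  }
}
```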
[jira] [Created] (YARN-1392) Allow sophisticated app-to-queue placement policies in the Fair Scheduler
Sandy Ryza created YARN-1392: Summary: Allow sophisticated app-to-queue placement policies in the Fair Scheduler Key: YARN-1392 URL: https://issues.apache.org/jira/browse/YARN-1392 Project: Hadoop YARN Issue Type: New Feature Components: scheduler Affects Versions: 2.2.0 Reporter: Sandy Ryza Assignee: Sandy Ryza Currently the Fair Scheduler supports app-to-queue placement by username. It would be beneficial to allow more sophisticated policies that rely on primary and secondary groups and fallbacks. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1266) Adding ApplicationHistoryProtocolPBService
[ https://issues.apache.org/jira/browse/YARN-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815171#comment-13815171 ] Zhijie Shen commented on YARN-1266: --- bq. Let's say if we want AHS to be a separate process like JHS in the future (or maybe now, see my comments in YARN-955), when RM is stopped, AHS can not be accessed via RPC interface. Correct my previous comment. It seems that we can have a single protocol, but AHS can have a separate server-side implementation of it. YarnClient will then be modified to query the completed applications/attempts/containers from the AHS's implementation instead. It's similar to MRClientProtocol, which has two implementations in MR and in JHS. bq. May be in client protocol and then history protocol can derive from that. In JHS, it is this case: HSClientProtocol derives from MRClientProtocol without adding more methods. Hence, the derivation seems to be unnecessary. > Adding ApplicationHistoryProtocolPBService > -- > > Key: YARN-1266 > URL: https://issues.apache.org/jira/browse/YARN-1266 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-1266-1.patch, YARN-1266-2.patch > > > Adding ApplicationHistoryProtocolPBService to make web apps work and > changing yarn to run AHS as a separate process -- This message was sent by Atlassian JIRA (v6.1#6144)
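The single-protocol, two-implementations arrangement compared above to MRClientProtocol can be sketched as follows; the interface and class names are placeholders, not the real YARN protocol types:

```java
// One protocol, two server-side implementations: the RM answers for
// running applications, the AHS for completed ones, and the client picks
// the endpoint. Names are placeholders for illustration only.
interface ApplicationBaseProtocolSketch {
  String getApplicationReport(String appId);
}

class RMClientServiceSketch implements ApplicationBaseProtocolSketch {
  @Override
  public String getApplicationReport(String appId) {
    return "running:" + appId;     // served from live RM state
  }
}

class ApplicationHistoryServiceSketch implements ApplicationBaseProtocolSketch {
  @Override
  public String getApplicationReport(String appId) {
    return "completed:" + appId;   // served from the history store
  }
}
```

Because both servers implement the same interface, a client can fall back from the RM to the AHS for finished apps without any protocol change, which is the JHS/MRClientProtocol pattern.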
[jira] [Commented] (YARN-1390) Add applicationSource to ApplicationSubmissionContext and RMApp
[ https://issues.apache.org/jira/browse/YARN-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815147#comment-13815147 ] Zhijie Shen commented on YARN-1390: --- It sounds to me like another way to categorize applications, but IMHO, it can still make use of the "applicationType" field. Users can define the types based on the computation framework: mapreduce, tez, storm, mpi, etc., and also the ones based on the *source*: pig, hive, oozie, sqoop, etc. It's just a different opinion on the definition of "applicationType". Sometimes, users want both pieces of information, the computation framework and the source. That sounds more like a requirement for sub-types:
||1st level||2nd level||
|hive|mapreduce|
|hive|tez|
|pig|mapreduce|
|pig|tez|
> Add applicationSource to ApplicationSubmissionContext and RMApp > --- > > Key: YARN-1390 > URL: https://issues.apache.org/jira/browse/YARN-1390 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.2.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > > In addition to other fields like application-type (added in YARN-563), it is > useful to have an applicationSource field to track the source of an > application. The application source can be useful in (1) fetching only those > applications a user is interested in, (2) potentially adding source-specific > optimizations in the future. > Examples of sources are: User-defined project names, Pig, Hive, Oozie, Sqoop > etc. -- This message was sent by Atlassian JIRA (v6.1#6144)
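One way to carry both dimensions in the existing applicationType field, along the lines of the sub-type table above, is a "source/framework" encoding that is split on read. This is purely an illustration of the idea, not an API proposal:

```java
// Encode "source/framework" (e.g. "hive/tez") in a single type string and
// split it when reading. Hypothetical convention for illustration only.
class AppTypeSketch {
  final String source;     // 1st level, e.g. "hive", "pig"
  final String framework;  // 2nd level, e.g. "mapreduce", "tez"

  AppTypeSketch(String encoded) {
    String[] parts = encoded.split("/", 2);
    this.source = parts[0];
    // A bare framework-only type (e.g. "mapreduce") leaves framework empty.
    this.framework = parts.length > 1 ? parts[1] : "";
  }
}
```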
[jira] [Commented] (YARN-987) Adding History Service to use Store and converting Historydata to Report
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815080#comment-13815080 ] Zhijie Shen commented on YARN-987: -- 1. Would you please fix the capitalization? {code} + /** AHS STORAGE CLASS */ {code} {code} +Store Class Name for History Store, defaulting to file + system store {code} 2. It seems not necessary that convertToReport is public as well. 3. It's better to use the constant in YarnConfiguration and config.getClass/getClassByName. {code} +config +.set( +"yarn.ahs.store.class", + "org.apache.hadoop.yarn.server.applicationhistoryservice.MemoryApplicationHistoryStore"); {code} 4. No need to convert to string to assert equivalence. You can do that directly with the ID instances. In addition, please assert some fields from the application attempt, such as "host" {code} +Assert.assertEquals("application_0_0001", appReport.getApplicationId() +.toString()); +Assert.assertEquals("appattempt_0_0001_01", appReport +.getCurrentApplicationAttemptId().toString()); {code} 5. You can refer to RMAppImpl#createAndGetApplicationReport to decide what the fields should be when the attempt is null. 
{code} +if (lastAttempt == null) { + return ApplicationReport.newInstance(appHistory.getApplicationId(), null, + appHistory.getUser(), appHistory.getQueue(), appHistory + .getApplicationName(), "", 0, null, null, "", "", appHistory + .getStartTime(), appHistory.getFinishTime(), appHistory + .getFinalApplicationStatus(), null, "", 100, appHistory + .getApplicationType(), null); +} {code} > Adding History Service to use Store and converting Historydata to Report > > > Key: YARN-987 > URL: https://issues.apache.org/jira/browse/YARN-987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, > YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
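Review point 4 above in practice: build the expected ID object and compare the instances directly via equals(), rather than comparing toString() forms. SimpleAppId below is a self-contained stand-in for the real ApplicationId/ApplicationAttemptId records, for illustration only:

```java
// Compare ID objects via equals() on their underlying fields instead of
// comparing their string renderings. Stand-in class, not the YARN type.
class SimpleAppId {
  final long clusterTimestamp;
  final int id;

  SimpleAppId(long clusterTimestamp, int id) {
    this.clusterTimestamp = clusterTimestamp;
    this.id = id;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof SimpleAppId)) return false;
    SimpleAppId other = (SimpleAppId) o;
    return other.clusterTimestamp == clusterTimestamp && other.id == id;
  }

  @Override
  public int hashCode() { return Long.hashCode(clusterTimestamp) * 31 + id; }
}
```

Asserting on the instances keeps the test independent of the ID's string format, which is the point of the review comment.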
[jira] [Commented] (YARN-1374) Resource Manager fails to start due to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815041#comment-13815041 ] Karthik Kambatla commented on YARN-1374: I filed HADOOP-10085 to follow this up. IMO, being able to add services to CompositeService while initing is more a wish than a requirement. > Resource Manager fails to start due to ConcurrentModificationException > -- > > Key: YARN-1374 > URL: https://issues.apache.org/jira/browse/YARN-1374 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Devaraj K >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.3.0 > > Attachments: yarn-1374-1.patch, yarn-1374-1.patch > > > Resource Manager is failing to start with the below > ConcurrentModificationException. > {code:xml} > 2013-10-30 20:22:42,371 INFO org.apache.hadoop.util.HostsFileReader: > Refreshing hosts (include/exclude) list > 2013-10-30 20:22:42,376 INFO org.apache.hadoop.service.AbstractService: > Service ResourceManager failed in state INITED; cause: > java.util.ConcurrentModificationException > java.util.ConcurrentModificationException > at > java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) > at java.util.AbstractList$Itr.next(AbstractList.java:343) > at > java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1010) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:944) > 2013-10-30 20:22:42,378 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService: > Transitioning to standby > 2013-10-30 20:22:42,378 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService: > 
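The root cause being worked around here can be reproduced in miniature: CompositeService.serviceInit iterates the service list while a child service adds to the same list, tripping ArrayList's fail-fast iterator. Iterating over a snapshot copy avoids it. The classes below are a minimal stand-in, not the real CompositeService:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal reproduction: iterating a list while an element's init mutates it
// throws ConcurrentModificationException; iterating a snapshot copy does not.
class CompositeSketch {
  final List<Runnable> services = new ArrayList<>();

  void initUnsafe() {
    for (Runnable s : services) {   // fail-fast iterator over the live list
      s.run();                      // CME if run() adds a service
    }
  }

  void initSafe() {
    for (Runnable s : new ArrayList<>(services)) {  // snapshot copy
      s.run();                      // services added during init run later
    }
  }
}
```

The trade-off HADOOP-10085 raises is whether services added during init should be picked up in the same pass; the snapshot approach defers them, which matches the "more a wish than a requirement" position.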
Transitioned to standby > 2013-10-30 20:22:42,378 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > java.util.ConcurrentModificationException > at > java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) > at java.util.AbstractList$Itr.next(AbstractList.java:343) > at > java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1010) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:944) > 2013-10-30 20:22:42,379 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG: > / > SHUTDOWN_MSG: Shutting down ResourceManager at HOST-10-18-40-24/10.18.40.24 > / > {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-311) Dynamic node resource configuration: core scheduler changes
[ https://issues.apache.org/jira/browse/YARN-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814878#comment-13814878 ] Hudson commented on YARN-311: - SUCCESS: Integrated in Hadoop-Hdfs-trunk #1575 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1575/]) YARN-311. RM/scheduler support for dynamic resource configuration. (Junping Du via llu) (llu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1539134) * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ResourceOption.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ResourceOptionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerNode.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java > Dynamic node resource configuration: core scheduler changes > --- > > Key: YARN-311 > URL: https://issues.apache.org/jira/browse/YARN-311 > Project: Hadoop YARN > Issue Type: 
Sub-task > Components: resourcemanager, scheduler >Reporter: Junping Du >Assignee: Junping Du > Fix For: 2.3.0 > > Attachments: YARN-311-v1.patch, YARN-311-v10.patch, > YARN-311-v11.patch, YARN-311-v12.patch, YARN-311-v12b.patch, > YARN-311-v13.patch, YARN-311-v2.patch, YARN-311-v3.patch, YARN-311-v4.patch, > YARN-311-v4.patch, YARN-311-v5.patch, YARN-311-v6.1.patch, > YARN-311-v6.2.patch, YARN-311-v6.patch, YARN-311-v7.patch, YARN-311-v8.patch, > YARN-311-v9.patch > > > As the first step, we go for resource change on RM side and expose admin APIs > (admin protocol, CLI, REST and JMX API) later. In this jira, we will only > contain changes in scheduler. > The flow to update node's resource and awareness in resource scheduling is: > 1. Resource update is through admin API to RM and take effect on RMNodeImpl. > 2. When next
[jira] [Commented] (YARN-1374) Resource Manager fails to start due to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814877#comment-13814877 ] Hudson commented on YARN-1374: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1575 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1575/]) YARN-1374. Changed ResourceManager to start the preemption policy monitors as active services. Contributed by Karthik Kambatla. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1539089) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/TestSchedulingMonitor.java > Resource Manager fails to start due to ConcurrentModificationException > -- > > Key: YARN-1374 > URL: https://issues.apache.org/jira/browse/YARN-1374 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Devaraj K >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.3.0 > > Attachments: yarn-1374-1.patch, yarn-1374-1.patch > > > Resource Manager is failing to start with the below > ConcurrentModificationException. 
> {code:xml} > 2013-10-30 20:22:42,371 INFO org.apache.hadoop.util.HostsFileReader: > Refreshing hosts (include/exclude) list > 2013-10-30 20:22:42,376 INFO org.apache.hadoop.service.AbstractService: > Service ResourceManager failed in state INITED; cause: > java.util.ConcurrentModificationException > java.util.ConcurrentModificationException > at > java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) > at java.util.AbstractList$Itr.next(AbstractList.java:343) > at > java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1010) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:944) > 2013-10-30 20:22:42,378 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService: > Transitioning to standby > 2013-10-30 20:22:42,378 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService: > Transitioned to standby > 2013-10-30 20:22:42,378 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > java.util.ConcurrentModificationException > at > java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) > at java.util.AbstractList$Itr.next(AbstractList.java:343) > at > java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1010) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:944) > 2013-10-30 20:22:42,379 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG: > / > SHUTDOWN_MSG: Shutting down ResourceManager at HOST-10-18-40-24/10.18.40.24 > / > {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1374) Resource Manager fails to start due to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814858#comment-13814858 ] Hudson commented on YARN-1374: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1601 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1601/]) YARN-1374. Changed ResourceManager to start the preemption policy monitors as active services. Contributed by Karthik Kambatla. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1539089) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/TestSchedulingMonitor.java > Resource Manager fails to start due to ConcurrentModificationException > -- > > Key: YARN-1374 > URL: https://issues.apache.org/jira/browse/YARN-1374 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Devaraj K >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.3.0 > > Attachments: yarn-1374-1.patch, yarn-1374-1.patch > > > Resource Manager is failing to start with the below > ConcurrentModificationException. 
> {code:xml} > 2013-10-30 20:22:42,371 INFO org.apache.hadoop.util.HostsFileReader: > Refreshing hosts (include/exclude) list > 2013-10-30 20:22:42,376 INFO org.apache.hadoop.service.AbstractService: > Service ResourceManager failed in state INITED; cause: > java.util.ConcurrentModificationException > java.util.ConcurrentModificationException > at > java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) > at java.util.AbstractList$Itr.next(AbstractList.java:343) > at > java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1010) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:944) > 2013-10-30 20:22:42,378 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService: > Transitioning to standby > 2013-10-30 20:22:42,378 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService: > Transitioned to standby > 2013-10-30 20:22:42,378 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > java.util.ConcurrentModificationException > at > java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) > at java.util.AbstractList$Itr.next(AbstractList.java:343) > at > java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1010) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:944) > 2013-10-30 20:22:42,379 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG: > / > SHUTDOWN_MSG: Shutting down ResourceManager at HOST-10-18-40-24/10.18.40.24 > / > {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-311) Dynamic node resource configuration: core scheduler changes
[ https://issues.apache.org/jira/browse/YARN-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814859#comment-13814859 ] Hudson commented on YARN-311: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1601 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1601/]) YARN-311. RM/scheduler support for dynamic resource configuration. (Junping Du via llu) (llu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1539134) * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ResourceOption.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ResourceOptionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerNode.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java > Dynamic node resource configuration: core scheduler changes > --- > > Key: YARN-311 > URL: https://issues.apache.org/jira/browse/YARN-311 > Project: Hadoop YARN > Issue Type: 
Sub-task > Components: resourcemanager, scheduler >Reporter: Junping Du >Assignee: Junping Du > Fix For: 2.3.0 > > Attachments: YARN-311-v1.patch, YARN-311-v10.patch, > YARN-311-v11.patch, YARN-311-v12.patch, YARN-311-v12b.patch, > YARN-311-v13.patch, YARN-311-v2.patch, YARN-311-v3.patch, YARN-311-v4.patch, > YARN-311-v4.patch, YARN-311-v5.patch, YARN-311-v6.1.patch, > YARN-311-v6.2.patch, YARN-311-v6.patch, YARN-311-v7.patch, YARN-311-v8.patch, > YARN-311-v9.patch > > > As the first step, we go for resource change on RM side and expose admin APIs > (admin protocol, CLI, REST and JMX API) later. In this jira, we will only > contain changes in scheduler. > The flow to update node's resource and awareness in resource scheduling is: > 1. Resource update is through admin API to RM and take effect on RMNodeImpl. > 2.
[jira] [Commented] (YARN-1145) Potential file handle leak in aggregated logs web ui
[ https://issues.apache.org/jira/browse/YARN-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814847#comment-13814847 ] Hadoop QA commented on YARN-1145: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612346/YARN-1145.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2383//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2383//console This message is automatically generated. 
> Potential file handle leak in aggregated logs web ui > > > Key: YARN-1145 > URL: https://issues.apache.org/jira/browse/YARN-1145 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.0.5-alpha, 0.23.9, 2.1.1-beta >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: MAPREDUCE-5486.patch, YARN-1145.1.patch, > YARN-1145.2.patch, YARN-1145.3.patch, YARN-1145.4.patch, YARN-1145.patch > > > If any problem occurs while fetching aggregated logs for rendering on the web UI, the LogReader is not closed. > Because the reader is not closed, many connections are left in CLOSE_WAIT state.
> hadoopuser@hadoopuser:> jps
> *27909* JobHistoryServer
> The DataNode port is 50010. Grepping for the DataNode port shows many connections from the JHS in CLOSE_WAIT:
> hadoopuser@hadoopuser:> netstat -tanlp | grep 50010
> tcp  0  0  10.18.40.48:50010  0.0.0.0:*           LISTEN      21453/java
> tcp  1  0  10.18.40.48:20596  10.18.40.48:50010   CLOSE_WAIT  *27909*/java
> tcp  1  0  10.18.40.48:19667  10.18.40.152:50010  CLOSE_WAIT  *27909*/java
> tcp  1  0  10.18.40.48:20593  10.18.40.48:50010   CLOSE_WAIT  *27909*/java
> tcp  1  0  10.18.40.48:12290  10.18.40.48:50010   CLOSE_WAIT  *27909*/java
> tcp  1  0  10.18.40.48:19662  10.18.40.152:50010  CLOSE_WAIT  *27909*/java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-311) Dynamic node resource configuration: core scheduler changes
[ https://issues.apache.org/jira/browse/YARN-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814787#comment-13814787 ] Hudson commented on YARN-311: - SUCCESS: Integrated in Hadoop-Yarn-trunk #384 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/384/]) YARN-311. RM/scheduler support for dynamic resource configuration. (Junping Du via llu) (llu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1539134) * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ResourceOption.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ResourceOptionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerNode.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java > Dynamic node resource configuration: core scheduler changes > --- > > Key: YARN-311 > URL: https://issues.apache.org/jira/browse/YARN-311 > Project: Hadoop YARN > Issue Type: 
Sub-task > Components: resourcemanager, scheduler >Reporter: Junping Du >Assignee: Junping Du > Fix For: 2.3.0 > > Attachments: YARN-311-v1.patch, YARN-311-v10.patch, > YARN-311-v11.patch, YARN-311-v12.patch, YARN-311-v12b.patch, > YARN-311-v13.patch, YARN-311-v2.patch, YARN-311-v3.patch, YARN-311-v4.patch, > YARN-311-v4.patch, YARN-311-v5.patch, YARN-311-v6.1.patch, > YARN-311-v6.2.patch, YARN-311-v6.patch, YARN-311-v7.patch, YARN-311-v8.patch, > YARN-311-v9.patch > > > As the first step, we go for resource change on RM side and expose admin APIs > (admin protocol, CLI, REST and JMX API) later. In this jira, we will only > contain changes in scheduler. > The flow to update node's resource and awareness in resource scheduling is: > 1. Resource update is through admin API to RM and take effect on RMNodeImpl. > 2. When next NM
[jira] [Commented] (YARN-1374) Resource Manager fails to start due to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814786#comment-13814786 ] Hudson commented on YARN-1374: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #384 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/384/]) YARN-1374. Changed ResourceManager to start the preemption policy monitors as active services. Contributed by Karthik Kambatla. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1539089) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/TestSchedulingMonitor.java > Resource Manager fails to start due to ConcurrentModificationException > -- > > Key: YARN-1374 > URL: https://issues.apache.org/jira/browse/YARN-1374 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Devaraj K >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.3.0 > > Attachments: yarn-1374-1.patch, yarn-1374-1.patch > > > Resource Manager is failing to start with the below > ConcurrentModificationException. 
> {code:xml}
> 2013-10-30 20:22:42,371 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
> 2013-10-30 20:22:42,376 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state INITED; cause: java.util.ConcurrentModificationException
> java.util.ConcurrentModificationException
>     at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
>     at java.util.AbstractList$Itr.next(AbstractList.java:343)
>     at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1010)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:944)
> 2013-10-30 20:22:42,378 INFO org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService: Transitioning to standby
> 2013-10-30 20:22:42,378 INFO org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService: Transitioned to standby
> 2013-10-30 20:22:42,378 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
> java.util.ConcurrentModificationException
>     at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
>     at java.util.AbstractList$Itr.next(AbstractList.java:343)
>     at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1010)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:944)
> 2013-10-30 20:22:42,379 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG:
> /
> SHUTDOWN_MSG: Shutting down ResourceManager at HOST-10-18-40-24/10.18.40.24
> /
> {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
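The stack trace points at CompositeService.serviceInit iterating its child-service list while a service is added mid-iteration (the commit message above says the fix is to start the preemption monitors as active services instead). A self-contained sketch of that failure mode, with illustrative service names rather than the real YARN classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

public class CmeSketch {
    // Returns true if mutating the list while iterating it throws CME,
    // mirroring serviceInit walking the child services while a monitor
    // registers itself as a new child service.
    static boolean modifyWhileIterating() {
        List<String> services = new ArrayList<>(
            Arrays.asList("dispatcher", "scheduler", "monitor"));
        try {
            for (String s : services) {
                // structural modification invalidates the live iterator
                services.add(s + "-extra");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(modifyWhileIterating()); // prints: true
    }
}
```

Wrapping the list in Collections.unmodifiableList (as CompositeService's getter does) does not prevent this: the exception comes from the fail-fast iterator of the backing ArrayList, which is exactly what the trace's Collections$UnmodifiableCollection$1.next frame shows.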
[jira] [Commented] (YARN-1307) Rethink znode structure for RM HA
[ https://issues.apache.org/jira/browse/YARN-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814755#comment-13814755 ] Hadoop QA commented on YARN-1307: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612336/YARN-1307.4-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2382//console This message is automatically generated. > Rethink znode structure for RM HA > - > > Key: YARN-1307 > URL: https://issues.apache.org/jira/browse/YARN-1307 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Tsuyoshi OZAWA >Assignee: Tsuyoshi OZAWA > Attachments: YARN-1307.1.patch, YARN-1307.2.patch, YARN-1307.3.patch, > YARN-1307.4-2.patch, YARN-1307.4.patch > > > Rethink for znode structure for RM HA is proposed in some JIRAs(YARN-659, > YARN-1222). The motivation of this JIRA is quoted from Bikas' comment in > YARN-1222: > {quote} > We should move to creating a node hierarchy for apps such that all znodes for > an app are stored under an app znode instead of the app root znode. This will > help in removeApplication and also in scaling better on ZK. The earlier code > was written this way to ensure create/delete happens under a root znode for > fencing. But given that we have moved to multi-operations globally, this isnt > required anymore. > {quote} -- This message was sent by Atlassian JIRA (v6.1#6144)
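Bikas' quoted proposal amounts to changing the store layout from flat znodes under the app root to one subtree per application, so removeApplication becomes a single recursive delete and the app root fans out less on ZK. A hypothetical sketch of the two layouts (path segments are illustrative, not the actual ZKRMStateStore constants):

```java
public class ZnodeLayoutSketch {
    // Flat layout: every attempt znode sits directly under the app root,
    // so removing an app means locating each of its attempt znodes.
    static String flatAttemptPath(String root, String app, String attempt) {
        return root + "/" + app + "_" + attempt;
    }

    // Hierarchical layout: one znode per app, attempts nested beneath it,
    // so removeApplication is one recursive delete of the app subtree.
    static String appPath(String root, String app) {
        return root + "/" + app;
    }

    static String nestedAttemptPath(String root, String app, String attempt) {
        return appPath(root, app) + "/" + attempt;
    }

    public static void main(String[] args) {
        System.out.println(nestedAttemptPath("/rmstore", "application_1", "appattempt_1"));
    }
}
```

As the quote notes, the flat layout existed so create/delete under one root could serve as fencing; with ZK multi-operations used globally, that constraint no longer forces the flat shape.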
[jira] [Commented] (YARN-987) Adding History Service to use Store and converting Historydata to Report
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814757#comment-13814757 ] Hadoop QA commented on YARN-987: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612348/YARN-987-6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2381//console This message is automatically generated. > Adding History Service to use Store and converting Historydata to Report > > > Key: YARN-987 > URL: https://issues.apache.org/jira/browse/YARN-987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, > YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-987) Adding History Service to use Store and converting Historydata to Report
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-987: --- Attachment: YARN-987-6.patch Attaching latest patch. Thanks, Mayank > Adding History Service to use Store and converting Historydata to Report > > > Key: YARN-987 > URL: https://issues.apache.org/jira/browse/YARN-987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, > YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-987) Adding History Service to use Store and converting Historydata to Report
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814743#comment-13814743 ] Mayank Bansal commented on YARN-987: Thanks [~vinodkv] and [~zjshen] for the review. bq. The unnecessary type casting is still there. Done bq. lastAttempt can be null. Should do null check. Otherwise, NPE may be expected. Done bq. Is it good to write a test case for this one? Done bq. reduce the scope of methods like getLastAttempt, they don't need to be public. Done bq. ApplicationHistoryContext -> ApplicationHistoryManager and ApplicationHistory -> ApplicationHistoryManagerImpl. They aren't just context objects. Done Thanks, Mayank > Adding History Service to use Store and converting Historydata to Report > > > Key: YARN-987 > URL: https://issues.apache.org/jira/browse/YARN-987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, > YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
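The "lastAttempt can be null" review point above is the usual empty-collection guard: an application that has not launched any attempt yet must not NPE the history report path. A minimal illustration with hypothetical shapes (the real types live in the YARN-987 patch, not shown here):

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class LastAttemptSketch {
    // Picks the diagnostics of the newest attempt, keyed by attempt id.
    // lastEntry() returns null on an empty map, so check before dereferencing.
    static String lastAttemptDiagnostics(NavigableMap<Integer, String> attempts) {
        Map.Entry<Integer, String> last = attempts.lastEntry();
        if (last == null) {
            return "N/A"; // no attempts yet; avoid the NPE the review flagged
        }
        return last.getValue();
    }

    public static void main(String[] args) {
        System.out.println(lastAttemptDiagnostics(new TreeMap<>()));
    }
}
```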
[jira] [Updated] (YARN-1145) Potential file handle leak in aggregated logs web ui
[ https://issues.apache.org/jira/browse/YARN-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-1145: Attachment: YARN-1145.4.patch Apologies for the delayed response, and thank you Vinod for reviewing the patch :-) Attaching a patch addressing all of Vinod's comments. For the 5th comment, I added try{}finally{} around the whole render method in AggregatedLogsBlock.java. Even though the patch shows a large diff (since the try/finally wraps the whole render method), the modified code is:
{noformat}
 protected void render(Block html) {
+  AggregatedLogFormat.LogReader reader = null;
+  try {
     // render block : NO CHANGE
     Path remoteRootLogDir = new Path(conf.get(
         YarnConfiguration.NM_REMOTE_APP_LOG_DIR,
         YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR));
-    AggregatedLogFormat.LogReader reader = null;
     // render block : NO CHANGE
+  } finally {
+    if (reader != null) {
+      reader.close();
+    }
+  }
 }
{noformat}
> Potential file handle leak in aggregated logs web ui > > > Key: YARN-1145 > URL: https://issues.apache.org/jira/browse/YARN-1145 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.0.5-alpha, 0.23.9, 2.1.1-beta >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: MAPREDUCE-5486.patch, YARN-1145.1.patch, > YARN-1145.2.patch, YARN-1145.3.patch, YARN-1145.4.patch, YARN-1145.patch > > > If any problem occurs while fetching aggregated logs for rendering on the web UI, the LogReader is not closed. > Because the reader is not closed, many connections are left in CLOSE_WAIT state.
> hadoopuser@hadoopuser:> jps
> *27909* JobHistoryServer
> The DataNode port is 50010. Grepping for the DataNode port shows many connections from the JHS in CLOSE_WAIT:
> hadoopuser@hadoopuser:> netstat -tanlp | grep 50010
> tcp  0  0  10.18.40.48:50010  0.0.0.0:*           LISTEN      21453/java
> tcp  1  0  10.18.40.48:20596  10.18.40.48:50010   CLOSE_WAIT  *27909*/java
> tcp  1  0  10.18.40.48:19667  10.18.40.152:50010  CLOSE_WAIT  *27909*/java
> tcp  1  0  10.18.40.48:20593  10.18.40.48:50010   CLOSE_WAIT  *27909*/java
> tcp  1  0  10.18.40.48:12290  10.18.40.48:50010   CLOSE_WAIT  *27909*/java
> tcp  1  0  10.18.40.48:19662  10.18.40.152:50010  CLOSE_WAIT  *27909*/java -- This message was sent by Atlassian JIRA (v6.1#6144)
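The shape of the fix in the patch snippet — declare the reader before the try, close it in a null-guarded finally — can be exercised in isolation. FakeLogReader below is a stand-in for AggregatedLogFormat.LogReader, not the real class:

```java
import java.io.Closeable;
import java.io.IOException;

public class ReaderCloseSketch {
    // Stand-in for AggregatedLogFormat.LogReader: records whether close() ran.
    static class FakeLogReader implements Closeable {
        boolean closed = false;
        @Override public void close() { closed = true; }
        void read() throws IOException { throw new IOException("fetch failed"); }
    }

    // Mirrors the patched render(): any exit path, including the error path
    // that used to leak the handle, now closes the reader.
    static FakeLogReader renderOnce() {
        FakeLogReader reader = null;
        try {
            reader = new FakeLogReader();
            reader.read(); // a failure here previously returned without closing
        } catch (IOException e) {
            // error path: fall through; finally still runs
        } finally {
            if (reader != null) { // guard: reader may never have been constructed
                reader.close();
            }
        }
        return reader;
    }

    public static void main(String[] args) {
        System.out.println(renderOnce().closed); // prints: true
    }
}
```

Leaving the reader open on the error path is what produced the CLOSE_WAIT connections in the netstat output above: the DataNode side closed, but the JHS side never released its half of the socket.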
[jira] [Commented] (YARN-1266) Adding ApplicationHistoryProtocolPBService
[ https://issues.apache.org/jira/browse/YARN-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814690#comment-13814690 ] Mayank Bansal commented on YARN-1266: - Thanks [~vinodkv] and [~zjshen] for the review. bq. This is not enough, you need a client wrapper too. Yes, you are correct; that's part of YARN-967, as all the client-side (CLI) changes are part of that JIRA. bq. If so, I think it makes sense, and probably simplifies the problem. However, I still have one concern about the independence of AHS. Let's say we want AHS to be a separate process like JHS in the future (or maybe now, see my comments in YARN-955); when RM is stopped, AHS cannot be accessed via its RPC interface. I think AHS should be a separate process; that's how it can scale. What I think is that we should still have two separate protocols, giving separate RPC entry points for RM and AHS, but with the same methods in both protocols. Maybe the methods can live in a client protocol, and the history protocol can derive from it. > Adding ApplicationHistoryProtocolPBService > -- > > Key: YARN-1266 > URL: https://issues.apache.org/jira/browse/YARN-1266 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-1266-1.patch, YARN-1266-2.patch > > > Adding ApplicationHistoryProtocolPBService to make web apps work and to change yarn to run AHS as a separate process -- This message was sent by Atlassian JIRA (v6.1#6144)
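The "same methods in both protocols, derived from a common one" idea in the comment above could be sketched as the following interface hierarchy; all names here are hypothetical, since the real protocols were still under discussion at this point:

```java
public class ProtocolSketch {
    // Common method set, so RM and AHS expose identical calls
    // while remaining separate RPC entry points.
    interface ApplicationBaseProtocol {
        String getApplicationReport(String appId);
    }

    // RM keeps its own protocol for live applications...
    interface ApplicationClientProtocol extends ApplicationBaseProtocol { }

    // ...and the standalone history server exposes a second protocol
    // with the same inherited methods for finished applications.
    interface ApplicationHistoryProtocol extends ApplicationBaseProtocol { }

    // Toy implementation standing in for the AHS-side RPC service.
    static class HistoryServerStub implements ApplicationHistoryProtocol {
        @Override
        public String getApplicationReport(String appId) {
            return "history-report:" + appId;
        }
    }

    public static void main(String[] args) {
        System.out.println(new HistoryServerStub().getApplicationReport("app_1"));
    }
}
```

With this split, stopping the RM leaves the AHS endpoint reachable, which addresses the independence concern quoted in the comment, while clients can program against the shared base protocol.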