[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636358#comment-14636358 ]
Joep Rottinghuis commented on YARN-451:
Just for the record, at Twitter we've been running with YARN-2417 in production and are finding it very useful in clusters of many thousands of nodes with tens of thousands of jobs in a day.
Add more metrics to RM page
Key: YARN-451
URL: https://issues.apache.org/jira/browse/YARN-451
Project: Hadoop YARN
Issue Type: Improvement
Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch
The ResourceManager web UI shows a list of RUNNING applications, but it does not tell which applications are requesting more resources than others. With a cluster running hundreds of applications at once, it would be useful to have some kind of metric to distinguish high-resource-usage applications from low-resource-usage ones. At a minimum, showing the number of containers is a good option.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802398#comment-13802398 ]
Sangjin Lee commented on YARN-451:
{quote} Quick question on the estimate -- is it a calculation of the total app weight at the start of the app or do the values decrease as containers are granted? The former is useful as a gauge of how big an app is/was overall, while the latter is more useful for identifying upcoming demands if the application has been running for some time. {quote}
What we envisioned and implemented in that patch is the total app weight (values do not decrease as containers are granted). The rationale is to size a job quickly (and to compare the sizes of jobs). The current patch piggybacks on the allocate request so that apps may set the initial estimate and also update it if the forecast changes and they want to communicate that to the RM.
-- This message was sent by Atlassian JIRA (v6.1#6144)
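The behavior described in the comment above (a total estimate that rides along with the allocate request, can be revised, and does not shrink as containers are granted) might look roughly like the following. This is a minimal sketch, not the code in the attached patch: `AppSizeTracker`, its fields, and its method names are hypothetical stand-ins, and memory in MB is assumed as the only unit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AppSizeTracker:
    """Hypothetical RM-side record of an app's self-reported total size."""
    estimated_total_mb: Optional[int] = None  # None until the AM reports one
    granted_mb: int = 0

    def on_allocate_request(self, new_estimate_mb: Optional[int] = None) -> None:
        # The estimate piggybacks on the allocate call; a request without
        # one leaves the previous estimate untouched.
        if new_estimate_mb is not None:
            self.estimated_total_mb = new_estimate_mb

    def on_container_granted(self, container_mb: int) -> None:
        # A grant changes current usage only; the total estimate is
        # deliberately NOT decremented.
        self.granted_mb += container_mb


tracker = AppSizeTracker()
tracker.on_allocate_request(new_estimate_mb=30_000 * 1024)  # initial estimate
tracker.on_container_granted(1024)
tracker.on_allocate_request()                               # plain heartbeat
estimate_after_grant = tracker.estimated_total_mb
tracker.on_allocate_request(new_estimate_mb=10_000 * 1024)  # forecast revised
```

Keeping the estimate independent of grants is what makes it usable as a stable sort key for comparing job sizes, at the cost of not showing remaining demand.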
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13795266#comment-13795266 ]
Jason Lowe commented on YARN-451:
bq. current allocation can be seen from the scheduler page.
I took a look at the scheduler page, and all I see for current allocation is per-user-per-queue and not per app. Where are you seeing the current assignment for each app on the scheduler page? As for the instance you recently encountered, showing the current ask would have quickly isolated the issue, as all 30K maps would have been asked for once the app launched.
My main concern with a current-plus-estimated-future approach is that it's optional for AMs to implement and requires an API change. I see showing the current usage and/or ask as more robust across different app frameworks (it doesn't require AMs to implement anything), easier to implement, and likely to solve most of the problems with identifying where the bottlenecks currently are in scheduling containers. Doing so doesn't preclude adding a total estimate metric at some point.
Quick question on the estimate -- is it a calculation of the total app weight at the start of the app or do the values decrease as containers are granted? The former is useful as a gauge of how big an app is/was overall, while the latter is more useful for identifying upcoming demands if the application has been running for some time.
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794737#comment-13794737 ]
Joep Rottinghuis commented on YARN-451:
[~jlowe] current allocation can be seen from the scheduler page. While finding out which app is the most starved is also an interesting use case, I think that showing only what the current assignment is (and not what the estimated total ask will be) does not solve the use case we're trying to tackle.
Today we again had an issue where a (newbie) user launched 20 jobs with 30K mappers each. While each job got only a few resources, the issue was not identifying that his jobs were all starved, but rather that his jobs were not going to finish within any reasonable time and would demand an unreasonable amount of resources over time.
On a Hadoop 1.0 equivalent cluster one can quickly sort by # mappers / # reducers. We need the equivalent on YARN, where we can click on the estimated #MB that this app will be asking for in total and/or the estimated total # cores.
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792818#comment-13792818 ]
Sangjin Lee commented on YARN-451:
[~jlowe] Although I like the idea of showing both types of metrics (current consumption and total), I am concerned about overcrowding that UI. Thoughts? Maybe one can emit both types of data in the REST output but show only one type in the HTML?
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792908#comment-13792908 ]
Jason Lowe commented on YARN-451:
I wasn't proposing current consumption and total, but rather current consumption and ask (or current requests), although one can back-calculate ask from total given current. The current consumption lets us quickly find the biggest apps in the cluster, while the current ask lets us quickly find the apps that are starving for more resources (beyond the ones that are UNASSIGNED). If we can only pick one, I'd say go with current consumption. Usually users know when their jobs are getting starved, and they want to know which jobs are filling up the queue.
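To make the distinction in the comment above concrete, here is a minimal sketch of how current consumption, total, and ask relate, and how each answers a different question. `AppSnapshot` and its fields are hypothetical names rather than YARN types, and the figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class AppSnapshot:
    name: str
    current_mb: int  # memory the app holds right now
    total_mb: int    # estimated total size (hypothetical field)

    @property
    def ask_mb(self) -> int:
        # Back-calculate the outstanding ask from total and current;
        # clamp at zero in case the estimate lags behind actual usage.
        return max(self.total_mb - self.current_mb, 0)


apps = [
    AppSnapshot("etl", current_mb=8_192, total_mb=10_240),
    AppSnapshot("adhoc", current_mb=1_024, total_mb=61_440),
    AppSnapshot("report", current_mb=4_096, total_mb=4_096),
]

# "Which app is filling up the queue?" -> sort by current consumption.
biggest_now = max(apps, key=lambda a: a.current_mb)
# "Which app is starving for more?" -> sort by outstanding ask.
most_starved = max(apps, key=lambda a: a.ask_mb)
```

With these numbers, the app holding the most memory and the app waiting on the most memory are different applications, which is why a single column cannot answer both questions.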
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790895#comment-13790895 ]
Sangjin Lee commented on YARN-451:
I agree. Even for mappers and reducers, container resource asks may vary, and the memory and the cores contain more information than the number of containers.
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788226#comment-13788226 ]
Jason Lowe commented on YARN-451:
I've found #containers to not be a very useful metric, as it doesn't necessarily map closely to the amount of cluster resources. Sometimes apps run with lots of tiny containers while others run with huge containers. I think showing the resource utilization in terms of memory and CPU for both current and ask would be useful. That would show which apps are big right now and which are trying to be much bigger.
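A small worked example of the point above, that container count and resource footprint can rank applications differently. The `AppUsage` type and the numbers are assumptions chosen for illustration, not measurements from any cluster.

```python
from dataclasses import dataclass


@dataclass
class AppUsage:
    name: str
    containers: int
    mb_per_container: int
    vcores_per_container: int

    @property
    def memory_mb(self) -> int:
        return self.containers * self.mb_per_container

    @property
    def vcores(self) -> int:
        return self.containers * self.vcores_per_container


apps = [
    AppUsage("many-tiny", containers=200, mb_per_container=512, vcores_per_container=1),
    AppUsage("few-huge", containers=20, mb_per_container=16_384, vcores_per_container=8),
]

by_containers = sorted(apps, key=lambda a: a.containers, reverse=True)
by_memory = sorted(apps, key=lambda a: a.memory_mb, reverse=True)
by_vcores = sorted(apps, key=lambda a: a.vcores, reverse=True)
```

Here the app with ten times as many containers holds less than a third of the memory of the other, so sorting by #containers would point an operator at the wrong application.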
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787396#comment-13787396 ]
Arun C Murthy commented on YARN-451:
Thinking aloud... we could track past allocations (#containers) per application in RM and also track future requests (sum numContainers for *) per application. Would showing those two help?
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13775700#comment-13775700 ]
Jason Lowe commented on YARN-451:
Is knowing how big an application might get in the future important? Knowing how big an application is right now, both in terms of what it's using and what it's asking for, seems more relevant for understanding why a queue is overloaded or jobs aren't getting scheduled as quickly as expected. The ApplicationResourceUsageReport already contains this information, and it should be straightforward to report as part of the ApplicationResourceUsageReport for display via the web UI, CLI, or REST services. Note that YARN-415 is already attempting to do this for historical resource usage so it will be easy to see which jobs have taken large amounts of resources and could have slowed other jobs in the past.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771073#comment-13771073 ]
Hadoop QA commented on YARN-451:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12603493/in_progress_2x.png against trunk revision .
{color:red}-1 patch{color}. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1959//console
This message is automatically generated.
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771163#comment-13771163 ]
Arun C Murthy commented on YARN-451:
I'm not clear if this is the right approach... How about something simpler:
# RM UI displays current usage of resources (memory, CPU, etc.)
# Application can pass in a string (along with progress) whereby we can annotate with something app-specific like: 100 maps total, 5 finished, 5 running
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770060#comment-13770060 ]
Joep Rottinghuis commented on YARN-451:
Marking as blocker. Without a sense of size, running a cluster at scale is incredibly difficult. When there is resource contention, we need to be able to sort to see the difference between a user running 10 apps that are tiny (requiring 1-2 small containers each) and a user running 5 apps that require tens of thousands of large containers.
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768989#comment-13768989 ]
Sangjin Lee commented on YARN-451:
I am going to upload a single patch to get your feedback. I wanted to get your feedback first to validate the approach before I break out sub-tasks and post separate patches. This single patch is solely for review, and is not to be merged.
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766759#comment-13766759 ]
Sangjin Lee commented on YARN-451:
I am pretty close to getting a patch ready for review on this. A quick question before that however: the proposed change contains changes in YARN (changes in message definition to carry this extra info, and subsequent UI changes) and mapreduce (mapreduce application providing this information). Should I create two sub-tasks (one for YARN and one for MAPREDUCE) and provide separate patches for them?
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739801#comment-13739801 ]
Sangjin Lee commented on YARN-451:
I agree that Hadoop 1 was different, as the notion of mappers and reducers was explicit from the overview and the RM works in a different way in terms of resource allocation. I am pointing out that from a user perspective there is a feature gap where one cannot quickly get a sense of the relative sizes of apps/jobs. I also agree that the solution should be done in a way that doesn't crowd the UI and also conforms well with the current RM architecture. Thanks!
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739213#comment-13739213 ]
Sangjin Lee commented on YARN-451:
I think showing this information on the app list page is actually more valuable than on the per-app page. If this information is present on the app list page, one can quickly scan the list and get a sense of which job/app is bigger than others in terms of resource consumption. Also, it makes sorting possible. One could in theory visit individual per-app pages one by one to get the same information, but it's so much more useful to have it ready on the overview page so one can get that information quickly. In Hadoop 1.0, one could get the same information by looking at the number of total mappers and reducers. That way, we got a very good idea of which ones are big jobs (and thus need to be monitored more closely) without drilling into any of the apps.
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739225#comment-13739225 ]
Vinod Kumar Vavilapalli commented on YARN-451:
Agreed about having it on the listing page, but that page is already dense. We have to do some basic UI design. Again, like I mentioned, Hadoop-1 was different, as the number of maps and reduces doesn't change after a job starts. In Hadoop-2, by contrast, the memory/cores allocated slowly increase over time, so it may or may not be of much use. I am ambivalent about adding it.
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737497#comment-13737497 ]
Sangjin Lee commented on YARN-451:
I think some information about the app size can be obtained through ApplicationResourceUsageReport. Specifically it has getNeededResources() and getNumUsedContainers() and getNumReservedContainers(). This could be a useful candidate for sizing the app in terms of resource demand. If I am not mistaken, AppInfo is the class that exposes the data for the app UI, and we could add a string or a couple of numbers individually to expose this information. Thoughts?
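The idea in the comment above, deriving "a string or a couple of numbers" for the app list from the usage report, could be sketched as follows. The Python classes here are hypothetical stand-ins that only mirror the three getters named in the comment; they are not the Hadoop `ApplicationResourceUsageReport` or `AppInfo` classes, and the output field names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    memory_mb: int
    vcores: int


@dataclass(frozen=True)
class UsageReport:
    """Stand-in mirroring getNeededResources(), getNumUsedContainers()
    and getNumReservedContainers() from the comment."""
    needed_resources: Resource
    num_used_containers: int
    num_reserved_containers: int


def size_columns(report: UsageReport) -> dict:
    # Plain numbers rather than one formatted string, so that a UI table
    # or REST client can sort on them.
    return {
        "usedContainers": report.num_used_containers,
        "reservedContainers": report.num_reserved_containers,
        "neededMB": report.needed_resources.memory_mb,
        "neededVCores": report.needed_resources.vcores,
    }


report = UsageReport(Resource(memory_mb=40_960, vcores=20), 12, 3)
columns = size_columns(report)
label = "{usedContainers} used / {reservedContainers} reserved".format(**columns)
```

Exposing numbers and building any display string from them keeps both options from the comment open.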
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737518#comment-13737518 ]
Sangjin Lee commented on YARN-451:
For sorting, it would be good to expose a number. How about the number of containers (used + reserved = needed)? I think the resource info (memory, vCores, etc.) may be a bit redundant, at least UI-wise, if the number of containers is displayed.
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737641#comment-13737641 ]
Vinod Kumar Vavilapalli commented on YARN-451:
In YARN, resource usage by applications is dynamic. So at any point in time, the RM CANNOT tell you what the max is for any given application. What we can do is show how many containers an application has been allocated overall up to this point (which would be getNumUsedContainers()). Again, this is dynamic, and with MR it slowly increases over time. It may or may not be of much use.
We can and will definitely add more information on the per-app page. But is showing it on the app-list page a big requirement, or is showing it on the per-app page enough?
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709420#comment-13709420 ] Joep Rottinghuis commented on YARN-451: --- It would certainly be very useful to be able to see application size/weight (and order by it) when many applications run. If it were to be added, various YARN applications would have their own specific implementation. At the moment only memory is tracked, so #slot Gigabytes would be a possible number that would be more generic than simply #mappers+#reducers. Either would be more useful than having no data at all. Being able to see the size of applications is really helpful for understanding what is going on in one view: is somebody running many small applications, a few large ones, many large ones, etc.? Add more metrics to RM page --- Key: YARN-451 URL: https://issues.apache.org/jira/browse/YARN-451 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.0.3-alpha Reporter: Lohit Vijayarenu Priority: Minor ResourceManager webUI shows list of RUNNING applications, but it does not tell which applications are requesting more resource compared to others. With cluster running hundreds of applications at once it would be useful to have some kind of metric to show high-resource usage applications vs low-resource usage ones. At the minimum showing number of containers is good option. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
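One possible reading of the "#slot Gigabytes" idea in the comment above is the sum of memory across all of an application's containers, expressed in gigabytes. The Python sketch below compares that with a plain container count. The function names and the container sizes are assumptions made for illustration; nothing here is taken from YARN or from the patch.

```python
def container_count(container_sizes_mb):
    """Weight as a plain count, e.g. #mappers + #reducers."""
    return len(container_sizes_mb)


def memory_weight_gb(container_sizes_mb):
    """Weight as total container memory in gigabytes (one reading of '#slot Gigabytes')."""
    return sum(container_sizes_mb) / 1024


# Hypothetical MapReduce job: 100 mappers at 1024 MB and 10 reducers at 2048 MB.
job = [1024] * 100 + [2048] * 10

print(container_count(job))   # number of containers
print(memory_weight_gb(job))  # total memory in GB
```

The memory-based number is more generic because it does not depend on MapReduce notions such as mappers and reducers, and it distinguishes a job with a few large containers from one with many small ones.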
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627203#comment-13627203 ] Lohit Vijayarenu commented on YARN-451: --- I tried to see whether adding the total number of containers was a trivial change, but it looks like there is no notion of an application's maximum resource available to the ResourceManager. This might be the reason why the RM page does not have the information. Looking into RMAppImpl shows that this kind of information is not passed from the Client/AM to the RM during application initialization either. The closest thing to a notion of job weight I could see was resource demand, but that seems to change based on how an application requests containers. For example, FairScheduler seems to recalculate fair share based on how much resource demand is passed by applications. One option I can think of is to add an additional field in the protobuf that specifies the total number of containers/resources an application might use. This would be an optional field that can be used only by MapReduce for now, and the Client can set this value based on the number of mappers/reducers. I am not sure if this is the right approach; are there any other, simpler ideas people can suggest? Add more metrics to RM page --- Key: YARN-451 URL: https://issues.apache.org/jira/browse/YARN-451 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.0.3-alpha Reporter: Lohit Vijayarenu Priority: Minor ResourceManager webUI shows list of RUNNING applications, but it does not tell which applications are requesting more resource compared to others. With cluster running hundreds of applications at once it would be useful to have some kind of metric to show high-resource usage applications vs low-resource usage ones. At the minimum showing number of containers is good option. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
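The optional-field idea in the comment above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the YARN protobuf or the eventual patch: `SubmissionRequest`, `estimated_total_containers`, `mapreduce_submission`, and `display_estimate` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubmissionRequest:
    """Hypothetical stand-in for the submission message (not the YARN protobuf)."""
    app_name: str
    # Optional field: None means the application supplied no estimate.
    estimated_total_containers: Optional[int] = None


def mapreduce_submission(app_name, num_mappers, num_reducers):
    """A MapReduce client could set the estimate from its task counts."""
    return SubmissionRequest(
        app_name, estimated_total_containers=num_mappers + num_reducers
    )


def display_estimate(request):
    """What a web UI column might show for the estimate."""
    if request.estimated_total_containers is None:
        return "N/A"
    return str(request.estimated_total_containers)


mr_job = mapreduce_submission("wordcount", num_mappers=200, num_reducers=20)
other_job = SubmissionRequest("custom-app")  # framework that sets no estimate

print(display_estimate(mr_job))
print(display_estimate(other_job))
```

Keeping the field optional means frameworks other than MapReduce are unaffected until they choose to supply an estimate.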