[jira] [Commented] (YARN-451) Add more metrics to RM page

2015-07-22 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636358#comment-14636358
 ] 

Joep Rottinghuis commented on YARN-451:
---

Just for the record, at Twitter we've been running with YARN-2417 in production 
and are finding it very useful in clusters of many thousands of nodes with tens 
of thousands of jobs in a day.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-10-22 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802398#comment-13802398
 ] 

Sangjin Lee commented on YARN-451:
--

{quote}
Quick question on the estimate – is it a calculation of the total app weight at 
the start of the app or do the values decrease as containers are granted? The 
former is useful as a gauge of how big an app is/was overall, while the latter 
is more useful for identifying upcoming demands if the application has been 
running for some time.
{quote}

What we envision and implemented in that patch is the total app weight (values 
do not decrease as containers are granted). The rationale is to size a job 
quickly (and compare sizes of jobs). The current patch piggybacks on the 
allocate request so that apps may set the initial estimate but also update it 
if the forecast changes and want to communicate that to the RM.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-10-15 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13795266#comment-13795266
 ] 

Jason Lowe commented on YARN-451:
-

bq. current allocation can be seen from the scheduler page.

I took a look at the scheduler page, and all I see for current allocation is 
per-user-per-queue and not per app.  Where are you seeing the current 
assignment for each app on the scheduler page?

As for the instance you recently encountered, showing the current ask would 
have quickly isolated the issue as all 30K maps would have been asked for once 
the app launched.

My main concern with a current-plus-estimated-future approach is that it's 
optional for AMs to implement and requires an API change.  I see showing the 
current and/or ask as more robust across different app frameworks (doesn't 
require AMs to implement anything), easier to implement, and should solve most 
of the problems with identifying where the bottlenecks currently are in 
scheduling containers.  Doing so doesn't preclude adding a total estimate 
metric at some point.

Quick question on the estimate -- is it a calculation of the total app weight 
at the start of the app or do the values decrease as containers are granted?  
The former is useful as a gauge of how big an app is/was overall, while the 
latter is more useful for identifying upcoming demands if the application has 
been running for some time.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-10-14 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13794737#comment-13794737
 ] 

Joep Rottinghuis commented on YARN-451:
---

[~jlowe] current allocation can be seen from the scheduler page.
While solving the which app is the most starved app is also an interesting 
use-case I think that showing only what the current assignement is (and not 
what the estimated total ask will be) does not solve the use case we're trying 
to tackle.
Today we again had an issue where a (newby) user launched 20 30K mapper jobs. 
While each job got only few resources, the issue was not identifying that his 
jobs were all starved, but more that his jobs were going to not finish within 
any reasonable time and would demand over time an unreasonable amount of 
resources.

On a Hadoop 1.0 equivalent cluster one can quickly sort by # mappers /. # 
reducers.
We need the equivalent on Yarn where we can click on the estimated #MB that 
this app will be asking for in total and/or the estimated total # cores.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-10-11 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13792818#comment-13792818
 ] 

Sangjin Lee commented on YARN-451:
--

[~jlowe] Although I like the idea of showing both types of metrics (current 
consumption and total), I am concerned about over-crowding that UI. Thoughts? 
Maybe one can emit both types of data in the REST output but show only one type 
in html?

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-10-11 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13792908#comment-13792908
 ] 

Jason Lowe commented on YARN-451:
-

I wasn't proposing current consumption and total, rather current consumption 
and ask (or current requests), although one can back-calculate ask from total 
given current.  The current consumption lets us quickly find the biggest apps 
in the cluster, while the current ask lets us quickly find the apps that are 
starving for more resources (beyond the ones that are UNASSIGNED).

If we only can pick one, I'd say go with current consumption.  Usually users 
know when their jobs are getting starved, and they want to know which jobs are 
filling up the queue.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-10-09 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790895#comment-13790895
 ] 

Sangjin Lee commented on YARN-451:
--

I agree. Even for mappers and reducers container resource asks may vary, and 
the memory and the cores contain more information than the number of containers.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-10-07 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788226#comment-13788226
 ] 

Jason Lowe commented on YARN-451:
-

I've found #containers to not be a very useful metric, as it doesn't 
necessarily map closely to the amount of cluster resources.  Sometimes apps run 
with lots of tiny containers while others run with huge containers.  I think 
showing the resource utilization in terms of memory and CPU for both current 
and ask would be useful.  That would show which apps are big right now and 
which are trying to be much bigger.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-10-05 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13787396#comment-13787396
 ] 

Arun C Murthy commented on YARN-451:


Thinking aloud... we could track past allocations (#containers) per application 
in RM and also track future requests (sum numContainers for *) per application. 
Would showing those two help? 

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-09-23 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775700#comment-13775700
 ] 

Jason Lowe commented on YARN-451:
-

Is knowing how big an application might get in the future important?  Knowing 
how big an application is right now, both in terms of what it's using and what 
it's asking for, seems more relevant for understanding why a queue is 
overloaded or jobs aren't getting scheduled as quickly as expected.  The 
ApplicationResourceUsageReport already contains this information, and it should 
be straightforward to report as part of the ApplicationResourceUsageReport for 
display via the web UI, CLI, or REST services.

Note that YARN-415 is already attempting to do this for historical resource 
usage so it will be easy to see which jobs have taken large amounts of 
resources and could have slowed other jobs in the past.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771073#comment-13771073
 ] 

Hadoop QA commented on YARN-451:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12603493/in_progress_2x.png
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1959//console

This message is automatically generated.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-09-18 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771163#comment-13771163
 ] 

Arun C Murthy commented on YARN-451:


I'm not clear if this is the right approach... 

How about something simpler:
# RM UI displays current usage of resources (memory, cpu etc.)
# Application can pass in a string (along with progress) where-by we can 
annotate with something app-specific like: 100 maps total, 5 finished, 5 
running

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-09-17 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13770060#comment-13770060
 ] 

Joep Rottinghuis commented on YARN-451:
---

Marking as blocker. W/o a sense of size running a cluster at scale is 
incredibly difficult.
When there is resource contention, we need to be able to sort so see the 
difference between a user running 10 apps that are tiny (require 1-2 small 
containers each), or they run 5 apps that require tens of thousands of large 
containers.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-09-16 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13768989#comment-13768989
 ] 

Sangjin Lee commented on YARN-451:
--

I am going to upload a single patch to get your feedback. I wanted to get your 
feedback first to validate the approach before I break out sub-tasks and post 
separate patches. This single patch is solely for review, and is not to be 
merged.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Priority: Minor

 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-09-13 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13766759#comment-13766759
 ] 

Sangjin Lee commented on YARN-451:
--

I am pretty close to getting a patch ready for review on this. A quick question 
before that however: the proposed change contains changes in YARN (changes in 
message definition to carry this extra info, and subsequent UI changes) and 
mapreduce (mapreduce application providing this information).

Should I create two sub-tasks (one for YARN and one for MAPREDUCE) and provide 
separate patches for them?

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Priority: Minor

 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-08-14 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739801#comment-13739801
 ] 

Sangjin Lee commented on YARN-451:
--

I agree that hadoop 1 was different as the notion of mappers and reducers was 
explicit from the overview and the RM works in a different way in terms of 
resource allocation.

I am pointing out that from a user perspective there is a feature gap where one 
cannot quickly get a sense of relative sizes of apps/jobs. I also agree that 
the solution should be done in a way such that it doesn't crowd the UI and also 
conforms well with the current RM architecture. Thanks!

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Priority: Minor

 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-08-13 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739213#comment-13739213
 ] 

Sangjin Lee commented on YARN-451:
--

I think showing this information on the app list page is actually more valuable 
than the per-app page. If this information is present in the app list page, one 
can quickly scan the list and get a sense of which job/app is bigger than 
others in terms of resource consumption. Also, it makes sorting possible.

One could in theory visit individual per-app pages one by one to get the same 
information, but it's so much more useful to have it ready at the overview page 
so one can get that information quickly.

In hadoop 1.0, one could get the same information by looking at the number of 
total mappers and reducers. That way, we got a very good idea on which ones are 
big jobs (and thus need to be monitored more closely) without drilling into any 
of the apps.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Priority: Minor

 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-08-13 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739225#comment-13739225
 ] 

Vinod Kumar Vavilapalli commented on YARN-451:
--

Agreed about having it on the listing page, but that page is already dense. 
Have to do some basic UI design.

Again, like I mentioned, Hadoop-1 was different as number of maps, reduces 
doesn't change after job starts. Whereas in Hadoop-2, memory/cores allocated 
slowly increases over time , so it may or may not be of much use. I am 
ambivalent about adding it.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Priority: Minor

 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-08-12 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13737497#comment-13737497
 ] 

Sangjin Lee commented on YARN-451:
--

I think some information about the app size can be obtained through 
ApplicationResourceUsageReport. Specifically it has getNeededResources() and 
getNumUsedContainers() and getNumReservedContainers(). This could be a useful 
candidate for sizing the app in terms of resource demand.

If I am not mistaken, AppInfo is the class that exposes the data for the app 
UI, and we could add a string or a couple of numbers individually to expose 
this information. Thoughts?

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Priority: Minor

 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-08-12 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13737518#comment-13737518
 ] 

Sangjin Lee commented on YARN-451:
--

For sorting, it would be good to expose a number. How about the number of 
containers (used + reserved = needed)? I think the resource info (memory, 
vCores, etc.) may be bit redundant at least UI-wise if the number of containers 
is displayed.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Priority: Minor

 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-08-12 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13737641#comment-13737641
 ] 

Vinod Kumar Vavilapalli commented on YARN-451:
--

In YARN, resource usage by applications is dynamic. So at any point of time, RM 
CANNOT tell you what the max is for any given application.

What we can do is show how many containers has any application has allocated 
till this point of time overall(which would be getNumUsedContainers()). Again 
this is dynamic and with MR it slowly increases over time. May or may not be 
much of a use.

We can and will definitely add more information on the per-app page. But is 
showing on the app-list page a big requirement or showing it on the per-app 
page enough?

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Priority: Minor

 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-07-15 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709420#comment-13709420
 ] 

Joep Rottinghuis commented on YARN-451:
---

It would certainly be very useful to be able to see application size/weight 
(and order by this) when many applications run.
If it were to be added, various Yarn applications would have their own specific 
implementation.
At the moment only memory is tracked, so #slot Gigabytes would be a possible 
number that would be more generic then simply #mappers+#reducers.

Either would be more useful that having no data at all.

Being able to see the size of applications is really helpful to understand what 
is going on in one view. Is somebody running many small applications, a few 
large ones, many large ones ? etc.

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Priority: Minor

 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-04-09 Thread Lohit Vijayarenu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627203#comment-13627203
 ] 

Lohit Vijayarenu commented on YARN-451:
---

Tried to see if adding total number of Containers was trivial change, but looks 
like there is no notion of application max resource available to 
resourcemangaer. This might be the reason why RM page did not have the 
information. Looking into RMAppImpl shows that this kind of information is not 
passed either from Client/AM to RM during application initialization. 

Something close to notion of job weight I could see was Resource demand, but 
that seems to be change based on how an application request containers. For 
example FairScheduler seem to recalculate fairshare based on how much resource 
demand is passed by applications. 

One option I can think of is to add an additional field in protobuf which 
specifies what is total number of containers/resource an application might use. 
This would be optional field which can be used only by MapReduce for now and 
Client can set this value based on number of mappers/reducers. I am not sure if 
this is the right approach, any other simpler ideas people can suggest?

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Priority: Minor

 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira