[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520480#comment-14520480 ]

Ray Chiang commented on YARN-2868:
----------------------------------

I'll answer these in reverse order:

2) The first AM container is the "easy" one to measure.  Subsequent 
measurements are trickier, since each "request" time would need to be recorded 
somewhere until that request is actually fulfilled.  Tracking all the requests 
and their corresponding fulfillments would be a lot more work and may call for 
more sophisticated measurements.  I haven't filed a JIRA for measuring the 
later containers.
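
As a rough illustration of the bookkeeping described in 2), a minimal Java 
sketch might look like the following.  The class and method names 
(AllocationLatencyTracker, recordRequest, recordFulfillment) and the string 
request ids are hypothetical; none of this is YARN API or taken from any patch 
on this JIRA.  It only shows the idea of holding each request time until the 
matching fulfillment arrives.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical helper for illustration only; not part of YARN. */
final class AllocationLatencyTracker {
    // Request id -> time the request was made, in milliseconds.
    // An entry stays here until the request is fulfilled.
    private final Map<String, Long> pendingRequestTimes = new HashMap<>();

    /** Remember when a container request was made. */
    void recordRequest(String requestId, long requestTimeMillis) {
        // Keep the earliest timestamp if the same request is reported twice.
        pendingRequestTimes.putIfAbsent(requestId, requestTimeMillis);
    }

    /**
     * Mark a request as fulfilled and return its latency in milliseconds,
     * or an empty result if the request was never recorded.
     */
    OptionalLong recordFulfillment(String requestId, long fulfilledTimeMillis) {
        Long requestedAt = pendingRequestTimes.remove(requestId);
        if (requestedAt == null) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(fulfilledTimeMillis - requestedAt);
    }

    /** Number of requests still waiting to be fulfilled. */
    int pendingCount() {
        return pendingRequestTimes.size();
    }
}
```

Even in this simplified form, every outstanding request needs its own entry 
until it is fulfilled, which is the extra state that the first-container 
measurement avoids.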

1) Breaking this answer into several parts.  I'm not going to remember all the 
iterations I went through, but I'll answer as best I can.

1A) YARN-3105 covers the enhancements to StateMachine to record state 
transitions generically for metrics.  [~jianhe] made the original suggestion.

1B) There were several factors for this.  I think it was a combination of 
wanting queue-specific metrics, wanting to separate first allocation from later 
allocations, working with managed and unmanaged AMs, and a desire to get a more 
exact measurement with less overhead.  I've deleted all my earliest attempts at 
this (i.e. those prior to the first patch on this JIRA), so I can't provide 
more specific information offhand.

Let me know if that satisfactorily answers your questions.

> FairScheduler: Metric for latency to allocate first container for an 
> application
> --------------------------------------------------------------------------------
>
>                 Key: YARN-2868
>                 URL: https://issues.apache.org/jira/browse/YARN-2868
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>              Labels: metrics, supportability
>             Fix For: 2.8.0
>
>         Attachments: YARN-2868-01.patch, YARN-2868.002.patch, 
> YARN-2868.003.patch, YARN-2868.004.patch, YARN-2868.005.patch, 
> YARN-2868.006.patch, YARN-2868.007.patch, YARN-2868.008.patch, 
> YARN-2868.009.patch, YARN-2868.010.patch, YARN-2868.011.patch, 
> YARN-2868.012.patch
>
>
> Add a metric to measure the latency between "starting container allocation" 
> and "first container actually allocated".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
