[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520480#comment-14520480 ]
Ray Chiang commented on YARN-2868:
----------------------------------

I'll answer these in reverse order:

2) The first AM container is the "easy" one to measure. Subsequent measurements can be tricky since the "request" time will need to be recorded somewhere until the request is actually fulfilled. Tracking all the requests and corresponding fulfillments would be a lot more work and may want more sophisticated measurements. I haven't filed a JIRA for doing the later containers.

1) Breaking this answer into several parts. I'm not going to remember all the iterations I went through, but I'll answer as best as I can.

1A) YARN-3105 covers the enhancements to StateMachine to record state transitions generically for metrics. [~jianhe] made the original suggestion.

1B) There were several factors for this. I think it was a combination of wanting queue-specific metrics, wanting to separate first allocation from later allocations, working with managed and unmanaged AMs, and a desire to get a more exact measurement with less overhead. I've deleted all my earliest attempts at this (i.e. those prior to the first patch on this JIRA), so I can't provide more specific information offhand.

Let me know if that satisfactorily answers your questions.
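To illustrate point 2, here is a minimal sketch of the bookkeeping involved: the request time is held until the first container is allocated, and only then is a latency produced. This is not the code from the attached patches. The class `FirstAllocationLatencyTracker`, its method names, and the use of a plain application ID string are all hypothetical, chosen only for illustration.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration; not part of YARN or the YARN-2868 patches.
final class FirstAllocationLatencyTracker {
    // Request timestamps (ms) for applications still waiting on their first container.
    private final Map<String, Long> pendingRequestTimes = new ConcurrentHashMap<>();
    // Applications whose first-allocation latency has already been reported.
    private final Set<String> measured = ConcurrentHashMap.newKeySet();

    // Remember when allocation was first requested; repeat requests are ignored.
    void recordRequest(String appId, long nowMs) {
        if (!measured.contains(appId)) {
            pendingRequestTimes.putIfAbsent(appId, nowMs);
        }
    }

    // Return the latency for the first allocation only; empty for later ones
    // or for applications with no recorded request.
    OptionalLong recordAllocation(String appId, long nowMs) {
        Long requestedAt = pendingRequestTimes.remove(appId);
        if (requestedAt == null) {
            return OptionalLong.empty();
        }
        measured.add(appId);
        return OptionalLong.of(nowMs - requestedAt);
    }
}
```

Measuring later containers would need one stored timestamp per outstanding request rather than one per application, which is the extra tracking work described above.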
> FairScheduler: Metric for latency to allocate first container for an application
> --------------------------------------------------------------------------------
>
>                 Key: YARN-2868
>                 URL: https://issues.apache.org/jira/browse/YARN-2868
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>              Labels: metrics, supportability
>             Fix For: 2.8.0
>
>         Attachments: YARN-2868-01.patch, YARN-2868.002.patch, YARN-2868.003.patch, YARN-2868.004.patch, YARN-2868.005.patch, YARN-2868.006.patch, YARN-2868.007.patch, YARN-2868.008.patch, YARN-2868.009.patch, YARN-2868.010.patch, YARN-2868.011.patch, YARN-2868.012.patch
>
>
> Add a metric to measure the latency between "starting container allocation" and "first container actually allocated".

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)