[ 
https://issues.apache.org/jira/browse/YARN-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15449912#comment-15449912
 ] 

Li Lu commented on YARN-3981:
-----------------------------

Thanks [~rohithsharma]! 

bq. As part of the NM daemon, start a new service similar to TimeLineWriterWebService. 
The idea is that the NM reports all these collector addresses to the RM. Introduce a new API in 
clientRMservice to get a collector address. The address is chosen by the RM at random (this 
can be decided later). This address is used by the timeline client. TimeLineClient 
exposes a new constructor with a flowName, so system properties can be written 
at the flow level.

Actually, this looks similar to the current collector discovery 
mechanism, where the NM reports app-level collector information to the RM, and the RM 
distributes that information to all containers. 
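
To make the proposed flow concrete, here is a minimal sketch of the RM-side bookkeeping, assuming each NM reports one collector address and the RM hands one out at random. CollectorRegistry, its method names, and the node IDs and addresses are hypothetical; none of them are existing YARN classes or APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical RM-side registry of collector addresses; not an existing YARN class. */
class CollectorRegistry {
    // NM id -> address of the collector that NM hosts (assumption: one per NM).
    private final Map<String, String> collectorsByNode = new ConcurrentHashMap<>();
    private final Random random;

    CollectorRegistry(Random random) {
        this.random = random;
    }

    /** Called when an NM reports the collector it runs, e.g. with a heartbeat. */
    void report(String nodeId, String collectorAddress) {
        collectorsByNode.put(nodeId, collectorAddress);
    }

    /** Called when an NM is lost or stops its collector. */
    void remove(String nodeId) {
        collectorsByNode.remove(nodeId);
    }

    /** Picks one registered collector address at random, or empty if none is known. */
    Optional<String> pickCollector() {
        List<String> addresses = new ArrayList<>(collectorsByNode.values());
        if (addresses.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(addresses.get(random.nextInt(addresses.size())));
    }
}
```

A client-facing RM call would then return the result of pickCollector(). The random choice is only a placeholder for whatever selection policy is decided later.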

The difference is that we need to decide explicitly where and when to launch the 
collectors. The RM can decide where to launch them, but as of now, all 
collectors are tied to the life cycle of some concrete application 
(launched as aux-services). Could we launch collectors as separate processes for 
this use case? 

One concern is that this will increase the load on the RM again. I'm not sure whether this 
will be a problem on busy clusters with a lot of client connections. However, 
this is definitely better than launching a central server daemon to handle all 
client requests (which falls back to the old ATS v1 architecture). 

For storing those entities posted from clients, can we put them in the entity 
table, but just leave some unknown fields empty? Will that be a concern for the 
storage API's semantics? 
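
As a rough illustration of "leave unknown fields empty", the sketch below builds a key prefix in which missing context fields become empty segments. FlowLevelContext, the field order, and the '!' separator are assumptions made for this example and should not be read as the actual entity table layout or storage API.

```java
import java.util.Objects;

/** Hypothetical writer context for a client outside any application. */
final class FlowLevelContext {
    final String clusterId;
    final String userId;
    final String flowName;
    final Long flowRunId; // null when unknown
    final String appId;   // null when the write is not tied to an application

    FlowLevelContext(String clusterId, String userId, String flowName,
                     Long flowRunId, String appId) {
        this.clusterId = Objects.requireNonNull(clusterId);
        this.userId = Objects.requireNonNull(userId);
        this.flowName = Objects.requireNonNull(flowName);
        this.flowRunId = flowRunId;
        this.appId = appId;
    }

    /** Joins the fields with '!' and leaves unknown ones as empty segments. */
    String rowKeyPrefix() {
        return String.join("!",
            userId,
            clusterId,
            flowName,
            flowRunId == null ? "" : flowRunId.toString(),
            appId == null ? "" : appId);
    }
}
```

With this layout, a flow-level write from user alice on cluster c1 for flow etl gets the prefix alice!c1!etl!!. Any reader that assumes a non-empty application segment would have to handle that case, which is the semantic question raised above.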

> support timeline clients not associated with an application
> -----------------------------------------------------------
>
>                 Key: YARN-3981
>                 URL: https://issues.apache.org/jira/browse/YARN-3981
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Rohith Sharma K S
>              Labels: YARN-5355
>
> In the current v.2 design, all timeline writes must belong in a 
> flow/application context (cluster + user + flow + flow run + application).
> But there are use cases that require writing data outside the context of an 
> application. One such example is a higher level client (e.g. tez client or 
> hive/oozie/cascading client) writing flow-level data that spans multiple 
> applications. We need to find a way to support them.



