[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

Zhijie Shen (JIRA) Thu, 30 Oct 2014 14:11:12 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190849#comment-14190849
 ]


Zhijie Shen commented on SPARK-1537:
------------------------------------

[~vanzin], thanks for bringing the YARN timeline server to Spark. Let me briefly 
summarize the current status of the timeline server and address some of the 
concerns raised here. Spark folks who are interested in this monitoring service 
offered by YARN can go to YARN-1530 to read the design doc and follow the 
latest progress.

1. The essential functions of the timeline service have been available since 
Hadoop 2.4. Basically, the user can organize the app's history or metrics 
according to the timeline data model and post them to the timeline server. 
Later on, the user or an admin can come back and query this information to 
analyze how the app behaved. The essential APIs remain unchanged from 2.4 to 
the upcoming 2.6, so there should *NOT* be any incompatible API changes that 
will block this work. Moreover, we always keep compatibility in mind when 
adding new features in subsequent Hadoop releases.

2. It is *NOT* accurate to say that the timeline server is not 
production-ready. In fact, Apache Tez has already integrated with the timeline 
server for logging its history information. In the upcoming Hadoop 2.6, 
MapReduce will also be able to publish its history information to the timeline 
server. Moreover, within YARN itself, a built-in generic history service on top 
of the timeline service lets YARN users view all kinds of apps. Hence, with 
several successful pioneers, Spark should be confident enough to take advantage 
of this new YARN feature.

3. The YARN community is progressing quickly to improve the timeline server in 
terms of security (coming in 2.6), high availability, scalability, better 
client libs and so on. This should not disturb Spark's initial attempt to adopt 
the timeline server; rather, it will offer a better experience once Spark is 
running on it.
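
To make the workflow in point 1 more concrete, here is a minimal sketch of 
organizing an app's history as a timeline entity and serializing it for 
posting. It assumes the timeline server's REST endpoint is reachable at 
/ws/v1/timeline on port 8188 and that the JSON field names (entitytype, entity, 
starttime, events, eventtype, timestamp, eventinfo) match your Hadoop version; 
please check both against the docs. The entity type SPARK_APPLICATION, the id 
app_0001, the event type APP_STARTED and the helper names are hypothetical, 
chosen only for illustration.

```python
import json
import time
import urllib.request

# Assumed endpoint; host and port are placeholders for your cluster.
TIMELINE_URL = "http://localhost:8188/ws/v1/timeline"


def build_entity(entity_type, entity_id, events, start_ms=None):
    """Build one timeline entity as a plain dict.

    `events` is a list of (event_type, timestamp_ms, info_dict) tuples.
    Field names follow the timeline data model as I understand it.
    """
    if start_ms is None:
        start_ms = int(time.time() * 1000)
    return {
        "entitytype": entity_type,
        "entity": entity_id,
        "starttime": start_ms,
        "events": [
            {"eventtype": etype, "timestamp": ts, "eventinfo": info}
            for (etype, ts, info) in events
        ],
    }


def build_payload(entities):
    """Wrap entities in the envelope the server is assumed to expect."""
    return json.dumps({"entities": list(entities)})


def post_entities(payload, url=TIMELINE_URL):
    """POST the JSON payload; returns the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Hypothetical Spark application with a single "started" event.
entity = build_entity(
    "SPARK_APPLICATION",
    "app_0001",
    [("APP_STARTED", 1414703472000, {"user": "alice"})],
    start_ms=1414703472000,
)
payload = build_payload([entity])
```

Calling post_entities(payload) against a running timeline server would publish 
the entity; it is left uncalled here so the sketch runs without a cluster. A 
real integration would more likely use the Java TimelineClient shipped with 
YARN than raw HTTP.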

If you have other high-priority issues to work on, I think [~zhazhan] will be 
able to help with this integration. Thanks!

> Add integration with Yarn's Application Timeline Server
> -------------------------------------------------------
>
>                 Key: SPARK-1537
>                 URL: https://issues.apache.org/jira/browse/SPARK-1537
>             Project: Spark
>          Issue Type: New Feature
>          Components: YARN
>            Reporter: Marcelo Vanzin
>            Assignee: Marcelo Vanzin
>
> It would be nice to have Spark integrate with Yarn's Application Timeline 
> Server (see YARN-321, YARN-1530). This would allow users running Spark on 
> Yarn to have a single place to go for all their history needs, and avoid 
> having to manage a separate service (Spark's built-in server).
> At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
> although there is still some ongoing work. But the basics are there, and I 
> wouldn't expect them to change (much) at this point.



