[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329460#comment-14329460 ]
Marcelo Vanzin commented on SPARK-1537:
---------------------------------------

Hi [~zzhan], thanks for uploading the document. Reading through it, I don't see anything that is really that much different from my initial proof-of-concept. The points I'd like to highlight are:

- It still depends on YARN-2423, or at least on some effort to write a REST client that does not depend on internal Yarn classes.
- What about overhead of the read code? Large jobs with lots of tasks, or really long jobs such as Spark Streaming jobs, will have a really large amount of events. Fetching them all in one batch would require a lot of memory for serializing the data on both sides (ATS and History Server).
- Any security considerations? I haven't really kept up-to-date with the security changes in the ATS after I ran into issues with my p.o.c.; but mainly, does the Spark job need any special tokens to talk to the ATS when security is enabled? Does the ATS guarantee that only the job itself (or someone with the right credentials) can add events to its timeline? Or is that all handled transparently, somehow, by the client library?
- Does YARN-2928 affect the design in any way? I took a quick look at the data model, so hopefully they'll keep things backwards compatible. But it would kinda suck to add support for an API with a limited shelf life if that's not the case.

> Add integration with Yarn's Application Timeline Server
> -------------------------------------------------------
>
>                 Key: SPARK-1537
>                 URL: https://issues.apache.org/jira/browse/SPARK-1537
>             Project: Spark
>          Issue Type: New Feature
>          Components: YARN
>            Reporter: Marcelo Vanzin
>            Assignee: Marcelo Vanzin
>         Attachments: SPARK-1537.txt, spark-1573.patch
>
>
> It would be nice to have Spark integrate with Yarn's Application Timeline
> Server (see YARN-321, YARN-1530). This would allow users running Spark on
> Yarn to have a single place to go for all their history needs, and avoid
> having to manage a separate service (Spark's built-in server).
> At the moment, there's a working version of the ATS in the Hadoop 2.4 branch,
> although there is still some ongoing work. But the basics are there, and I
> wouldn't expect them to change (much) at this point.
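
To make the read-overhead point concrete, here is a minimal sketch of reading events in bounded pages instead of one batch, so that only one page is held in memory at a time. Everything in it is an assumption for illustration: `fetch_page`, its cursor argument, the page size and the in-memory `make_fake_store` stand-in are hypothetical names, not the ATS client API or anything in the attached design document.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Event = dict
# Hypothetical page fetcher: (cursor, limit) -> (events, next_cursor or None).
PageFetcher = Callable[[Optional[int], int], Tuple[List[Event], Optional[int]]]


def iter_events(fetch_page: PageFetcher, page_size: int = 100) -> Iterator[Event]:
    """Yield events one at a time, holding at most one page in memory."""
    cursor: Optional[int] = None
    while True:
        events, cursor = fetch_page(cursor, page_size)
        yield from events
        if cursor is None:  # the store reported that no pages remain
            return


def make_fake_store(total: int) -> PageFetcher:
    """In-memory stand-in for a timeline store, for illustration only."""
    def fetch_page(cursor: Optional[int], limit: int):
        start = cursor or 0
        end = min(start + limit, total)
        events = [{"id": i} for i in range(start, end)]
        # Return a cursor only while more events are left to read.
        return events, (end if end < total else None)
    return fetch_page
```

A reader built this way would replay each event as it is yielded rather than deserializing the whole history first. Whether the ATS can serve reads in this shape is exactly the open question raised above.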