[jira] [Commented] (YARN-1530) [Umbrella] Store, manage and serve per-framework application-timeline data

Zhijie Shen (JIRA) Mon, 22 Sep 2014 11:35:12 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143594#comment-14143594
 ]


Zhijie Shen commented on YARN-1530:
-----------------------------------

Hi, [~bcwalrus]. Thanks for your further comments.

bq. You seem to agree with the premise that ATS write path should not slow down 
apps.

Definitely. The arguable point is whether the current timeline client is going
to slow down the app, given that we have a scalable and reliable timeline server.

bq. If we can drop the high uptime + low write latency requirement from the ATS 
service, we can avoid tons of effort.

I'm not sure such fundamental requirements can be dropped from the timeline
service. Looking ahead, scalable and highly available timeline servers
have multiple benefits and enable different use cases. For example,

1. We can use it to serve realtime or near-realtime data, so that we can go to
the timeline server to see what is happening to an application. This is
particularly useful for long-running services, which never shut down.

2. We can build checkpoints on the timeline server for the app to do recovery
once it crashes. It's pretty much like what we've done for MR jobs.

I bundled "scalable" and "reliable" together because a multiple-instance
solution will improve the timeline server in both dimensions.

Moreover, no matter how scalable and reliable the channel is, we
eventually want to get the timeline data into the timeline server,
right? Otherwise, it is not going to be accessible to users (of course, tricks
can be played to fetch it directly from HDFS, but that's a completely different
story from the timeline server). If the apps are publishing 10GB of data per
hour while the server can only process 1GB per hour, the 9GB of outstanding
data per hour that resides in some temp location of HDFS amounts to useless
writes.
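
To make the arithmetic above concrete, here is a minimal sketch. The function name is hypothetical, and it assumes constant publish and process rates and that nothing is ever dropped from the temp location:

```python
def outstanding_backlog_gb(publish_gb_per_hour, process_gb_per_hour, hours):
    """Return the data (in GB) published but not yet processed after `hours`.

    Assumes constant rates; both the name and the model are illustrative only.
    """
    # Only the surplus of publishing over processing accumulates.
    surplus_per_hour = max(publish_gb_per_hour - process_gb_per_hour, 0)
    return surplus_per_hour * hours


if __name__ == "__main__":
    # Rates from the example: apps publish 10GB/hour, server processes 1GB/hour.
    for hours in (1, 24):
        print(hours, outstanding_backlog_gb(10, 1, hours))
```

Under these assumptions the backlog grows without bound for as long as the publish rate exceeds the processing rate, which is why a more reliable channel alone doesn't help.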

We have narrowed the discussion down to the reliability of the write path, but
if we look at the big picture, *the timeline server is not just a place to
store data, but also serves it to users* (e.g., YARN-2513). In terms of use
cases, users may want to monitor completed apps as well as running apps and
the cluster. If the timeline server doesn't have the capacity to serve the data
for a particular use case, the cost of aggregating it is wasted. IMHO,
a scalable and reliable timeline server is going to be *the eventual
solution to satisfy multiple stakeholders*, regardless of whether the use case
is read-intensive, write-intensive, or both. That's why I think improving the
timeline server could be high-margin work. It may be hard work, but
we should definitely pick it up.


> [Umbrella] Store, manage and serve per-framework application-timeline data
> --------------------------------------------------------------------------
>
>                 Key: YARN-1530
>                 URL: https://issues.apache.org/jira/browse/YARN-1530
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>         Attachments: ATS-Write-Pipeline-Design-Proposal.pdf, 
> ATS-meet-up-8-28-2014-notes.pdf, application timeline design-20140108.pdf, 
> application timeline design-20140116.pdf, application timeline 
> design-20140130.pdf, application timeline design-20140210.pdf
>
>
> This is a sibling JIRA for YARN-321.
> Today, each application/framework has to store and serve per-framework 
> data all by itself as YARN doesn't have a common solution. This JIRA attempts 
> to solve the storage, management and serving of per-framework data from 
> various applications, both running and finished. The aim is to change YARN to 
> collect and store data in a generic manner with plugin points for frameworks 
> to do their own thing w.r.t interpretation and serving.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
