[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608719#comment-15608719 ]
Thomas Graves commented on SPARK-18085:
---------------------------------------

A few of the details in the doc are unclear to me, but overall this really sounds very much like Hadoop ATS v1.5. Have you looked at that? I'm not necessarily saying we need to use the Hadoop ATS, since it was mostly created for Tez and it's an external component. It's fairly easy to set up since it's just one daemon, but it's still external.

You have a section "SHS Application Listing Persistence"; I assume this is a separate LevelDB that stores just the metadata needed to do the simple listing on startup?

A quick overall picture of what I think is being proposed, without all the incremental steps and leaving out the UI parts:

- Spark apps write event log history to HDFS (no change).
- The Spark history server periodically parses these and creates a LevelDB for each application (using LRU eviction so we don't run out of disk space); worst case, it has to reparse the original event log.
- I assume that during parsing (or in a separate parser?) it also stores application metadata in a separate LevelDB for the listing? What is the cleanup policy on that?
- A stretch goal would be for the SHS to look at in-progress history files and read from where it left off. This gives you a history server showing data for running applications as well as finished ones, and perhaps makes an application show up in the UI faster and load its full details sooner, since there is less left to parse when the application finishes.

How is this solving the issue of quickly listing new apps? If it's reading the files before the application has finished, that can help, but it depends on the details of the incremental parsing and how big a log can get between reads. It still seems to me we would be better off having the application save the metadata somewhere else. For instance, have each Spark app write a separate "summary" file. This could list basic things like start time, end time, user, and ACLs. You might also put important milestones in here, like perhaps job start/end times.
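As a rough illustration of the "summary" file idea: the app could emit a tiny key/value document next to its event log, so the history server can build its listing without parsing the full log. This is only a sketch under my own assumptions; the field names, format, and app id below are hypothetical, not any existing Spark API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical per-application "summary" file: just the metadata needed
// for the listing (start/end time, user, ACLs), one key=value per line.
public class AppSummary {
    public static String render(Map<String, String> fields) {
        // Render like a properties file so it stays trivially parseable.
        return fields.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        Map<String, String> summary = new LinkedHashMap<>();
        summary.put("appId", "application_1477000000000_0001"); // hypothetical id
        summary.put("user", "tgraves");
        summary.put("startTime", "1477000000000");
        summary.put("endTime", "1477000360000");
        summary.put("viewAcls", "tgraves,admins");
        System.out.println(render(summary));
    }
}
```

Milestones (e.g. job start/end) could be appended to the same file as extra keys as the app runs.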
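The "LevelDB per application, using LRU so we don't run out of disk space" part of the picture above could be sketched in miniature with an access-ordered map; the class and the eviction hook here are my own illustrative stand-in, not the proposed implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: keep at most maxEntries per-application stores open;
// the least-recently-used one is evicted when the cap is exceeded. In the
// real proposal the evicted app's LevelDB would be deleted from disk and,
// worst case, rebuilt by reparsing its event log on the next access.
public class AppStoreCache {
    private final Map<String, String> stores; // appId -> store path (stand-in for a LevelDB handle)

    public AppStoreCache(int maxEntries) {
        // accessOrder=true keeps iteration order least- to most-recently used.
        this.stores = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // A real implementation would also delete eldest's LevelDB directory here.
                return size() > maxEntries;
            }
        };
    }

    public void put(String appId, String path) { stores.put(appId, path); }
    public String get(String appId) { return stores.get(appId); }
    public int size() { return stores.size(); }
}
```

For example, with a cap of 2, adding a third app evicts whichever of the first two was touched least recently.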
That depends on what the UI is going to show. For streaming data, I'm not sure if streaming stores history at this point? Is it one big file? One thing we might consider is breaking up the giant event log. For streaming applications or very large applications, would it make sense to have, say, an event log per job within an application? This would allow incremental reading of the event log without having to open such huge files. You could do more fine-grained management of the files, and in the streaming case perhaps get rid of old ones.

> Scalability enhancements for the History Server
> -----------------------------------------------
>
>                 Key: SPARK-18085
>                 URL: https://issues.apache.org/jira/browse/SPARK-18085
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Spark Core, Web UI
>    Affects Versions: 2.0.0
>            Reporter: Marcelo Vanzin
>         Attachments: spark_hs_next_gen.pdf
>
> It's a known fact that the History Server currently has some annoying issues
> when serving lots of applications, and when serving large applications.
> I'm filing this umbrella to track work related to addressing those issues.
> I'll be attaching a document shortly describing the issues and suggesting a
> path to how to solve them.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)