[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608719#comment-15608719 ]
Thomas Graves commented on SPARK-18085:
---------------------------------------

A few of the details in the doc are unclear to me, but overall this really sounds very much like Hadoop ATS v1.5. Have you looked at that? I'm not necessarily saying we need to use the Hadoop ATS, since it was mostly created for Tez and it's an external component. It's fairly easy to set up since it's just one daemon, but it's still external.

You have a section "SHS Application Listing Persistence"; I assume this is a separate LevelDB that stores just the metadata needed to do the simple listing on startup?

A quick overall picture of what I think is being proposed, without all the incremental steps and leaving out the UI parts:

- Spark apps write event log history to HDFS (no change).
- The Spark history server periodically parses these and creates a LevelDB for each application (using LRU eviction so we don't run out of disk space); worst case, it has to reparse the original event log.
- I assume that during parsing (or in a separate parser?) it also stores application metadata in a separate LevelDB for the listing? What is the cleanup policy on that?
- A stretch goal would be for the SHS to look at in-progress history files and read from where it left off. This gives you a history server showing data for running applications as well as finished ones, and perhaps makes an application show up in the UI faster and load its full details sooner, since there is less left to parse when the application finishes.

How is this solving the issue of quickly listing new apps? If it's reading the files before the application has finished, that can help, but it depends on the details of the incremental parsing and how big a log can get between reads. It still seems to me we would be better off having the application save the metadata somewhere else. For instance, have each Spark app write a separate "summary" file. This could list basic things like start time, end time, user, and ACLs. You might also put important milestones in here, like perhaps job start/end times.
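As a rough illustration of the "summary" file idea: the app could emit a tiny key/value document next to its event log, so the history server can build its listing without parsing the full log. This is only a sketch under my own assumptions; the field names, format, and app id below are hypothetical, not any existing Spark API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical per-application "summary" file: just the metadata needed
// for the listing (start/end time, user, ACLs), one key=value per line.
public class AppSummary {
    public static String render(Map<String, String> fields) {
        // Render like a properties file so it stays trivially parseable.
        return fields.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        Map<String, String> summary = new LinkedHashMap<>();
        summary.put("appId", "application_1477000000000_0001"); // hypothetical id
        summary.put("user", "tgraves");
        summary.put("startTime", "1477000000000");
        summary.put("endTime", "1477000360000");
        summary.put("viewAcls", "tgraves,admins");
        System.out.println(render(summary));
    }
}
```

Milestones (e.g. job start/end) could be appended to the same file as extra keys as the app runs.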
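The "LevelDB per application, using LRU so we don't run out of disk space" part of the picture above could be sketched in miniature with an access-ordered map; the class and the eviction hook here are my own illustrative stand-in, not the proposed implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: keep at most maxEntries per-application stores open;
// the least-recently-used one is evicted when the cap is exceeded. In the
// real proposal the evicted app's LevelDB would be deleted from disk and,
// worst case, rebuilt by reparsing its event log on the next access.
public class AppStoreCache {
    private final Map<String, String> stores; // appId -> store path (stand-in for a LevelDB handle)

    public AppStoreCache(int maxEntries) {
        // accessOrder=true keeps iteration order least- to most-recently used.
        this.stores = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // A real implementation would also delete eldest's LevelDB directory here.
                return size() > maxEntries;
            }
        };
    }

    public void put(String appId, String path) { stores.put(appId, path); }
    public String get(String appId) { return stores.get(appId); }
    public int size() { return stores.size(); }
}
```

For example, with a cap of 2, adding a third app evicts whichever of the first two was touched least recently.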
That depends on what the UI is going to show. For streaming data, I'm not sure if streaming stores history at this point? Is it one big file? One thing we might consider is breaking up the giant event log. For streaming applications or very large applications, would it make sense to have, say, an event log per job within an application? This would allow incremental reading of the event log without having to open such huge files. You could do more fine-grained management of the files, and in the streaming case perhaps get rid of old ones.

> Scalability enhancements for the History Server
> -----------------------------------------------
>
>                 Key: SPARK-18085
>                 URL: https://issues.apache.org/jira/browse/SPARK-18085
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Spark Core, Web UI
>    Affects Versions: 2.0.0
>            Reporter: Marcelo Vanzin
>         Attachments: spark_hs_next_gen.pdf
>
> It's a known fact that the History Server currently has some annoying issues
> when serving lots of applications, and when serving large applications.
> I'm filing this umbrella to track work related to addressing those issues.
> I'll be attaching a document shortly describing the issues and suggesting a
> path to how to solve them.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)