[ 
https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971594#comment-15971594
 ] 

Marcelo Vanzin commented on SPARK-18085:
----------------------------------------

I'm getting close to a point where I think the code can start to trickle in. I 
want to wait until 2.2's branch gets going before sending PRs, though. In the 
meantime, I'm keeping "private PRs" in my fork for each milestone, so it's easy 
for anybody interested in getting familiar with the code to provide 
comments:

https://github.com/vanzin/spark/pulls

All the UI data that the Spark History Server (SHS) shows is now kept in a 
disk store (that's core + SQL, but not streaming). Since streaming is not shown 
in the SHS, I'm not planning to touch it (aside from the small changes I made 
that were required by internal API changes in core).

What's left, as I see it:
- managing disk space in the SHS so that a large number of apps doesn't cause 
the SHS to fill local disks
- limiting the number of jobs / stages / tasks / etc kept in the store (similar 
to existing settings, which the code doesn't yet honor)
- an in-memory implementation of the store (in case someone wants lower latency 
or can't / does not want to use the disk store)
- more tests, and more testing
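
To illustrate the second and third items (this is not the code in the fork), 
here is a minimal sketch of an in-memory store that caps how many entries it 
retains and evicts the oldest one once the cap is exceeded. The class name, its 
methods, and the oldest-first eviction policy are all hypothetical, chosen only 
for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical in-memory store that keeps at most maxEntries items.
 * Assumption for illustration only: the entry inserted first is evicted first.
 */
class BoundedInMemoryStore<K, V> {
    private final Map<K, V> entries;

    BoundedInMemoryStore(final int maxEntries) {
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        // LinkedHashMap keeps insertion order and calls removeEldestEntry
        // after every put, which lets us drop the oldest entry when the
        // map grows past the limit.
        this.entries = new LinkedHashMap<K, V>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    synchronized void write(K key, V value) {
        entries.put(key, value);
    }

    /** Returns the stored value, or null if the key is absent or was evicted. */
    synchronized V read(K key) {
        return entries.get(key);
    }

    synchronized int count() {
        return entries.size();
    }
}
```

With a limit of 2, writing three entries leaves only the two most recent ones 
readable. A real implementation would likely need separate limits per entity 
type (jobs, stages, tasks) rather than a single cap.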


> Better History Server scalability for many / large applications
> ---------------------------------------------------------------
>
>                 Key: SPARK-18085
>                 URL: https://issues.apache.org/jira/browse/SPARK-18085
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Spark Core, Web UI
>    Affects Versions: 2.0.0
>            Reporter: Marcelo Vanzin
>         Attachments: spark_hs_next_gen.pdf
>
>
> It's a known fact that the History Server currently has some annoying issues 
> when serving lots of applications, and when serving large applications.
> I'm filing this umbrella to track work related to addressing those issues. 
> I'll be attaching a document shortly describing the issues and suggesting a 
> path to how to solve them.



