[ 
https://issues.apache.org/jira/browse/SPARK-6270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357627#comment-14357627
 ] 

Josh Rosen commented on SPARK-6270:
-----------------------------------

In the long run, my preference is to remove HistoryServer-like responsibilities 
from the Master: the standalone Master is typically configured with a small 
amount of memory and risks OOMing when loading UIs, even if the UI loading is 
done asynchronously (right now it blocks the main event processing thread).

We might consider trying to add lazy loading as an intermediate stepping-stone 
to properly fixing this issue, but I'd like to argue against that approach: 
lazy loading inside of the Master is going to require mechanisms similar to 
what we have in the HistoryServer's loaderServlet, so we're either going to 
have to duplicate a bunch of code or change the HistoryServer code to be more 
modular so that we can reuse its components inside of the Master.

Another consideration is firewall / port issues: currently, the master web UI 
and the Spark web UIs that it loads are served on the same port.  If we set up a 
new Jetty server for the UIs, whether in the same Master JVM or in a separate 
HistoryServer process, then the Spark UIs will be served at some different 
port, potentially breaking those links in environments where only the master 
web UI port is exposed.  I think it's going to be really painful to avoid this, 
though, and I don't think we should resort to solutions where we proxy the 
Spark UI through the master UI, since the responses could be huge and lead to 
OOMs in the proxy.

As a short-term fix, I think we should introduce a new configuration option 
which completely disables the master's Spark UI serving feature, backport this 
to all maintenance branches, and mention this feature in the release notes.
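
As a rough sketch of what such a switch could look like (written in Python for 
brevity rather than the Master's Scala): the configuration key and both 
function names below are hypothetical and are not existing Spark settings or 
APIs. The default keeps today's behaviour, so a backport would change nothing 
for clusters that do not set the key.

```python
# Hypothetical key; not an existing Spark configuration property.
REBUILD_UI_KEY = "spark.master.ui.rebuildCompletedAppUIs"


def should_rebuild_ui(conf: dict) -> bool:
    """Return True unless the operator has explicitly disabled UI rebuilding."""
    # Defaulting to "true" preserves the current behaviour when the key is unset.
    return conf.get(REBUILD_UI_KEY, "true").strip().lower() == "true"


def on_application_finished(conf: dict, event_log_dir: str, rebuild) -> bool:
    """Call rebuild(event_log_dir) only when the feature is enabled.

    Returns True if a rebuild was attempted, False if it was skipped.
    """
    if not should_rebuild_ui(conf):
        return False  # Skip replaying the event log entirely.
    rebuild(event_log_dir)
    return True
```

With the key set to "false", the Master would never replay the event log of a 
completed application, so a huge streaming log could no longer stall it.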

For Spark 1.4, I think we should completely remove the web UI serving from the 
Master and provide the ability to configure the master with a HistoryServer 
address which will be used to generate links to UIs.  This runs into its own 
set of problems, though: the current HistoryServer FsHistoryProvider assumes 
that all applications' event logs are located in the same directory, whereas 
the Master can load event logs from any directory which is specified in the 
application description.  This means that we'll need a way to instruct the 
HistoryServer to load logs from an arbitrary path.  Therefore, maybe we should 
extend the HistoryServer's HTTP interface to allow requests to specify the 
event log location (falling back to the history server's default event log 
directory if no alternate log location was specified).  This could have 
security implications, though; we'd have to be careful to ensure that this 
doesn't allow arbitrary file reads.
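
To make the fallback and the file-read concern concrete, here is a minimal 
sketch (again in Python for brevity). It assumes local POSIX paths only; real 
event logs are often on HDFS, which would need an equivalent check on the 
filesystem URI. The function names, the allowed_roots allowlist, and the 
logDir query parameter are all hypothetical, and the /history/<app id> URL 
layout is an assumption about how the HistoryServer addresses applications.

```python
import os
from typing import Iterable, Optional
from urllib.parse import urlencode


def resolve_event_log_dir(requested: Optional[str],
                          default_dir: str,
                          allowed_roots: Iterable[str]) -> str:
    """Pick the directory to read event logs from.

    Falls back to default_dir when no location is requested, and rejects any
    requested location that does not sit under one of allowed_roots.
    """
    if not requested:
        return os.path.realpath(default_dir)
    # realpath collapses ".." segments and follows symlinks, so that
    # "/logs/../etc" cannot masquerade as a path under "/logs".
    candidate = os.path.realpath(requested)
    for root in allowed_roots:
        real_root = os.path.realpath(root)
        # commonpath compares whole path components, so "/logs-evil" is not
        # treated as being inside "/logs".
        if os.path.commonpath([candidate, real_root]) == real_root:
            return candidate
    raise PermissionError(f"event log location not allowed: {requested}")


def history_link(history_server: str, app_id: str,
                 log_dir: Optional[str] = None) -> str:
    """Build the link the Master would show for a completed application."""
    url = f"{history_server.rstrip('/')}/history/{app_id}"
    if log_dir:
        # Hypothetical parameter carrying the non-default event log location.
        url += "?" + urlencode({"logDir": log_dir})
    return url
```

The Master would only generate links via something like history_link, while 
the HistoryServer would run the equivalent of resolve_event_log_dir on every 
request, since the query parameter is attacker-controlled input.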

> Standalone Master hangs when streaming job completes
> ----------------------------------------------------
>
>                 Key: SPARK-6270
>                 URL: https://issues.apache.org/jira/browse/SPARK-6270
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy, Streaming
>    Affects Versions: 1.2.0, 1.3.0, 1.2.1
>            Reporter: Tathagata Das
>            Priority: Critical
>
> If the event logging is enabled, the Spark Standalone Master tries to 
> recreate the web UI of a completed Spark application from its event logs. 
> However if this event log is huge (e.g. for a Spark Streaming application), 
> then the master hangs in its attempt to read and recreate the web ui. This 
> hang causes the whole standalone cluster to be unusable. 
> Workaround is to disable the event logging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

