[ https://issues.apache.org/jira/browse/SPARK-6270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053810#comment-15053810 ]
Josh Rosen commented on SPARK-6270:
-----------------------------------

While I think that we should have this discussion about UI reconstruction of long-running applications, I think this is orthogonal to the right solution for this issue (SPARK-6270).

The root problem here, related to the master / cluster manager dying, seems to be caused by a design flaw: why is the master responsible for serving historical UIs? The standalone history server process should have that responsibility, since UI serving might need a lot of memory.

I think the right fix here is to just remove the Master's embedded history server; I just don't think it makes sense to assign history server responsibilities to the master when it's designed to be a very low-resource-use, high-stability, high-resiliency service.

> Standalone Master hangs when streaming job completes and event logging is enabled
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-6270
>                 URL: https://issues.apache.org/jira/browse/SPARK-6270
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy, Streaming
>    Affects Versions: 1.2.0, 1.2.1, 1.3.0, 1.5.1
>            Reporter: Tathagata Das
>            Priority: Critical
>
> If event logging is enabled, the Spark Standalone Master tries to
> recreate the web UI of a completed Spark application from its event logs.
> However, if this event log is huge (e.g. for a Spark Streaming application),
> then the master hangs in its attempt to read and recreate the web UI. This
> hang makes the whole standalone cluster unusable.
> The workaround is to disable event logging.
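
For reference, the workaround quoted above (disabling event logging) can be expressed as a configuration fragment. This is only a sketch: it assumes the application reads its settings from conf/spark-defaults.conf, and that nothing else (for example a --conf flag on spark-submit or a SparkConf set in application code) overrides the value. spark.eventLog.enabled is the standard Spark property and already defaults to false, so in practice this means removing or reverting any place where it was set to true.

```
# conf/spark-defaults.conf
# Disable event logging so the Standalone Master has no event log
# to replay when the application completes.
spark.eventLog.enabled    false
```

Note that with event logging off, the completed application's UI cannot be reconstructed later, by the Master or by a separate history server.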