[ 
https://issues.apache.org/jira/browse/SPARK-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-4679.
---------------------------------
    Resolution: Incomplete

> Race condition in querying the Spark UI JSON endpoint when Jetty context 
> handlers are added and removed
> -------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-4679
>                 URL: https://issues.apache.org/jira/browse/SPARK-4679
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 1.0.2
>            Reporter: Matt Cheah
>            Priority: Major
>              Labels: bulk-closed
>
> We started seeing some strange behavior when we were querying the Spark UI 
> JSON endpoint for job metadata.
> When the Spark cluster was under heavy load from a large number of 
> short-lived Spark contexts being created and stopped, querying the JSON 
> endpoint (e.g. http://localhost:8080/json) returned the HTML webpage instead. 
> Our own server relies on this JSON data for information about running jobs, 
> so the result was a JSON parse exception.
> I dug into the code and realized that this is caused by a race condition 
> between how we add and remove Jetty context handlers on the Akka message 
> queue thread and how the context handler is looked up on a different thread 
> when the HTTP request is fired. Whenever an application starts or 
> completes, we invoke ContextHandlerCollection.setHandlers() to add or 
> remove a Jetty handler in the collection. However, setHandlers() first 
> sets its internal collection to null before configuring the newly passed-in 
> collection. If an HTTP request is made and the Jetty context handler is 
> looked up AFTER the collection's internal map is set to null, but BEFORE 
> the new collection has been configured, the default handler is selected 
> instead, and it returns HTML.
> tl;dr we're using Jetty's ContextHandlerCollection in a way that is not 
> thread-safe. The issue we found is only one possible ramification of this; 
> I'm not sure what other consequences a non-thread-safe usage of Jetty may 
> have. I could only reproduce this by manually stepping through Spark's code 
> with a debugger to force the race condition described above. In production, 
> however, it manifested itself repeatedly and reliably and caused some pain.
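
The window described in the report can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is plain Python, not Jetty or Spark code, and every name in it (RacyCollection, SwapCollection, request_during_update, the between_steps hook) is hypothetical. The hook stands in for an HTTP request arriving on another thread between the two steps of the update, so the interleaving is forced deterministically, much like stepping through with a debugger.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[], str]


def html_page() -> str:
    # Stands in for the default handler that serves the HTML UI.
    return "<html>Spark Master</html>"


def json_page() -> str:
    # Stands in for the handler registered for the JSON endpoint.
    return '{"status": "ALIVE"}'


class RacyCollection:
    """Toy model of the reported behavior: the lookup map is dropped
    before the replacement map is installed."""

    def __init__(self) -> None:
        self._map: Optional[Dict[str, Handler]] = None

    def set_handlers(self, handlers: Dict[str, Handler],
                     between_steps: Optional[Callable[[], None]] = None) -> None:
        self._map = None                 # step 1: old map is gone
        if between_steps is not None:
            between_steps()              # a request sneaks in here
        self._map = dict(handlers)       # step 2: new map is installed

    def handle(self, path: str) -> str:
        current = self._map
        if current is None or path not in current:
            return html_page()           # fall back to the default handler
        return current[path]()


class SwapCollection(RacyCollection):
    """One possible mitigation: build the new map completely, then
    publish it with a single reference assignment."""

    def set_handlers(self, handlers: Dict[str, Handler],
                     between_steps: Optional[Callable[[], None]] = None) -> None:
        new_map = dict(handlers)         # readers still see the old map
        if between_steps is not None:
            between_steps()
        self._map = new_map              # readers now see the new map


def request_during_update(collection: RacyCollection) -> str:
    """Registers /json, then queries /json in the middle of a second update."""
    handlers: Dict[str, Handler] = {"/json": json_page}
    collection.set_handlers(handlers)
    seen = []
    collection.set_handlers(
        {**handlers, "/app-1": html_page},
        between_steps=lambda: seen.append(collection.handle("/json")),
    )
    return seen[0]


if __name__ == "__main__":
    print("racy:", request_during_update(RacyCollection()))
    print("swap:", request_during_update(SwapCollection()))
```

In this model the racy variant answers the mid-update request with the HTML page, while the swap variant keeps serving JSON because readers only ever see a fully built map. In Java the equivalent published reference would also need to be volatile or held in an AtomicReference for other threads to see it reliably; whether that is a suitable fix for Spark's actual usage is not established here.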
