[ 
https://issues.apache.org/jira/browse/SPARK-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114436#comment-16114436
 ] 

Trevor McKay commented on SPARK-20853:
--------------------------------------

@Josh Bacon,

  Hi Josh, we tracked this down independently and someone else found the same 
thing and put up a fix, referenced in the "cloned by" SPARK-21176 above.

  The issue is actually how the Jetty thread pool associated with the reverse 
proxy handlers is being built.  The more processors your system has, the 
quicker your threadpool will be exhausted, in a nutshell. I ran on a 40 core 
system and so 40 / 2 * (number of nodes) exceeded the 200 count pool at 10 
nodes.  If you run the master  on a single core, you can have 200 workers 
before you blow up I believe (that is spark standalone).

Cheers!

> spark.ui.reverseProxy=true leads to hanging communication to master
> -------------------------------------------------------------------
>
>                 Key: SPARK-20853
>                 URL: https://issues.apache.org/jira/browse/SPARK-20853
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 2.1.0
>         Environment: ppc64le GNU/Linux, POWER8, only master node is reachable 
> externally other nodes are in an internal network
>            Reporter: Benno Staebler
>              Labels: network, web-ui
>
> When *reverse proxy is enabled*
> {quote}
> spark.ui.reverseProxy=true
> spark.ui.reverseProxyUrl=/
> {quote}
>  first of all any invocation of the spark master Web UI hangs forever locally 
> (e.g. http://192.168.10.16:25001) and via external URL without any data 
> received. 
> One, sometimes two spark applications succeed without error and than workers 
> start throwing exceptions:
> {quote}
> Caused by: java.io.IOException: Failed to connect to /192.168.10.16:25050
> {quote}
> The application dies during creation of SparkContext:
> {quote}
> 2017-05-22 16:11:23 INFO  StandaloneAppClient$ClientEndpoint:54 - Connecting 
> to master spark://node0101:25000...
> 2017-05-22 16:11:23 INFO  TransportClientFactory:254 - Successfully created 
> connection to node0101/192.168.10.16:25000 after 169 ms (132 ms spent in 
> bootstraps)
> 2017-05-22 16:11:43 INFO  StandaloneAppClient$ClientEndpoint:54 - Connecting 
> to master spark://node0101:25000...
> 2017-05-22 16:12:03 INFO  StandaloneAppClient$ClientEndpoint:54 - Connecting 
> to master spark://node0101:25000...
> 2017-05-22 16:12:23 ERROR StandaloneSchedulerBackend:70 - Application has 
> been killed. Reason: All masters are unresponsive! Giving up.
> 2017-05-22 16:12:23 WARN  StandaloneSchedulerBackend:66 - Application ID is 
> not initialized yet.
> 2017-05-22 16:12:23 INFO  Utils:54 - Successfully started service 
> 'org.apache.spark.network.netty.NettyBlockTransferService' on port 25056.
> .....
> Caused by: java.lang.IllegalArgumentException: requirement failed: Can only 
> call getServletHandlers on a running MetricsSystem
> {quote}
> *This definitively does not happen without reverse proxy enabled!*



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to