Re: Is there a way to get the final web URL from an active Spark context

2020-01-22 Thread Jeff Evans
To answer my own question, it turns out what I was after is the YARN
ResourceManager URL for the Spark application.  As alluded to in
SPARK-20458, it's possible to use the YARN API client to get this value.
Here is a gist that shows how it can be done (given an instance of the
Hadoop Configuration object):
https://gist.github.com/jeff303/8dab0e52dc227741b6605f576a317798
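
For readers who cannot open the gist, the general idea can be sketched
roughly as follows.  This is not a copy of the gist, only an illustration
under some assumptions: the application is running on YARN (so that
sparkContext.applicationId is a YARN application ID), Hadoop 2.8 or later
is on the classpath (for ApplicationId.fromString), and the Hadoop
Configuration held by the context can reach the ResourceManager.  The
object and method names (TrackingUrl, yarnTrackingUrl) are made up for
the example.

```scala
import org.apache.hadoop.yarn.api.records.ApplicationId
import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration
import org.apache.spark.SparkContext

object TrackingUrl {
  // Hypothetical helper: asks the YARN ResourceManager for the tracking URL
  // of the application behind the given SparkContext.
  def yarnTrackingUrl(sc: SparkContext): Option[String] = {
    // Wrap the context's Hadoop configuration so yarn-site.xml settings apply.
    val conf = new YarnConfiguration(sc.hadoopConfiguration)
    val client = YarnClient.createYarnClient()
    client.init(conf)
    client.start()
    try {
      // Assumption: running on YARN, so the ID looks like application_<ts>_<seq>.
      // In local mode fromString would throw an IllegalArgumentException.
      val appId = ApplicationId.fromString(sc.applicationId)
      // getTrackingUrl may return null, hence the Option wrapper.
      Option(client.getApplicationReport(appId).getTrackingUrl)
    } finally {
      client.stop()
    }
  }
}
```

With an active session this would be called as
TrackingUrl.yarnTrackingUrl(sparkSession.sparkContext).  The exact form of
the returned string (for example, whether it includes the scheme) may
depend on the Hadoop version and cluster setup, so it is worth checking the
value on your own cluster before storing it as a persistent link.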


On Fri, Jan 17, 2020 at 4:09 PM Jeff Evans wrote:

> Given a session/context, we can get the UI web URL like this:
>
> sparkSession.sparkContext.uiWebUrl
>
> This gives me something like http://node-name.cluster-name:4040.  If
> opening this from outside the cluster (ex: my laptop), this redirects
> via HTTP 302 to something like
>
> http://node-name.cluster-name:8088/proxy/redirect/application_1579210019853_0023/
> .
> For discussion purposes, call the latter one the "final web URL".
> Critically, this final URL is active even after the application
> terminates.  The original uiWebUrl
> (http://node-name.cluster-name:4040) is not available after the
> application terminates, so one has to have captured the redirect in
> time, if they want to provide a persistent link to that history server
> UI entry (ex: for debugging purposes).
>
> Is there a way, other than using some HTTP client, to detect what this
> final URL will be directly from the SparkContext?
>


Is there a way to get the final web URL from an active Spark context

2020-01-17 Thread Jeff Evans
Given a session/context, we can get the UI web URL like this:

sparkSession.sparkContext.uiWebUrl

This gives me something like http://node-name.cluster-name:4040.  If
opening this from outside the cluster (ex: my laptop), this redirects
via HTTP 302 to something like
http://node-name.cluster-name:8088/proxy/redirect/application_1579210019853_0023/.
For discussion purposes, call the latter one the "final web URL".
Critically, this final URL is active even after the application
terminates.  The original uiWebUrl
(http://node-name.cluster-name:4040) is not available after the
application terminates, so one has to have captured the redirect in
time, if they want to provide a persistent link to that history server
UI entry (ex: for debugging purposes).

Is there a way, other than using some HTTP client, to detect what this
final URL will be directly from the SparkContext?
