[ 
https://issues.apache.org/jira/browse/SPARK-45468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nobuaki Sukegawa updated SPARK-45468:
-------------------------------------
    Description: 
Currently, proxies can be made transparent for hyperlinks in Spark web UIs with 
spark.ui.proxyRoot or the X-Forwarded-Context header alone. However, HTTP 
redirects (such as those after a job/stage kill) additionally require an 
explicit spark.ui.proxyRedirectUri to work behind a proxy. This is not ideal, 
as the proxy hostname may not be known at the time the Spark app is configured.

This can be mitigated by 1) always prepending spark.ui.proxyRoot to the 
redirect path, for proxies that intelligently rewrite Location headers, and 2) 
using a path without a hostname (/jobs/, not [https://example.com/jobs/]), for 
proxies that do not rewrite Location headers. Redirects would then behave 
basically the same way as other hyperlinks.
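
To make the two points above concrete, here is a minimal sketch in Python (not 
Spark's actual Scala code). The function name redirect_location and its 
parameters are hypothetical and only illustrate the idea: build a Location 
value that has no scheme or hostname and is prefixed with the proxy root when 
one is configured.

```python
from typing import Optional


def redirect_location(path: str, proxy_root: Optional[str] = None) -> str:
    """Return a host-less Location value, prefixed with the proxy root if set.

    Hypothetical helper for illustration; `proxy_root` stands in for the value
    of spark.ui.proxyRoot (or the X-Forwarded-Context header).
    """
    # Always emit an absolute path, never a scheme or hostname.
    if not path.startswith("/"):
        path = "/" + path
    if not proxy_root:
        return path
    # Normalise so "/sparkui", "/sparkui/" and "sparkui" behave the same.
    root = "/" + proxy_root.strip("/")
    if root == "/":
        return path
    return root + path
```

With a proxy root of /sparkui, redirect_location("/jobs/", "/sparkui") yields 
/sparkui/jobs/; with no proxy root it yields /jobs/.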
h2. Example

Let's say the proxy URL is [https://example.org/sparkui/]... forwarding to 
[http://drv.svc/]...
and spark.ui.proxyRoot is configured to be /sparkui
h3. Existing behavior (without spark.ui.proxyRedirectUri)

job/stage kill links redirect to [http://drv.svc/jobs/] - likely a 404
(other hyperlinks point to paths with the prefix, e.g., /sparkui/executors, 
and work fine)
h3. After the change 2)

job/stage kill links redirect to /sparkui/jobs/ - works fine
this is also consistent with other hyperlinks

NOTE: while a hostname in the Location header was originally required by RFC 
2616 (1999), RFC 7231 (2014) formally allows it to be omitted, since most 
browsers already supported this (it is rather hard to find a browser that 
doesn't).
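
As a rough illustration of why a host-less Location works, the sketch below 
uses Python's urllib.parse.urljoin to approximate how a client resolves a 
Location value against the URL it requested. The helper name resolve_location 
and the kill request URL are assumptions made for the example, not taken from 
Spark.

```python
from urllib.parse import urljoin


def resolve_location(request_url: str, location: str) -> str:
    """Resolve a Location header value against the URL that was requested."""
    return urljoin(request_url, location)


# Illustrative request URL as seen by the browser through the proxy.
REQUEST_URL = "https://example.org/sparkui/jobs/job/kill/?id=1"

# A host-less Location stays on the proxy host.
via_proxy = resolve_location(REQUEST_URL, "/sparkui/jobs/")

# An absolute Location pointing at the driver bypasses the proxy.
direct = resolve_location(REQUEST_URL, "http://drv.svc/jobs/")
```

Here via_proxy resolves to https://example.org/sparkui/jobs/, while direct 
stays http://drv.svc/jobs/, which the browser typically cannot reach.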

  was:
Currently, proxies can be made transparent for hyperlinks in Spark web UIs with 
spark.ui.proxyRoot or X-Forwarded-Context header alone. However, HTTP redirects 
(such as job/stage kill) currently requires explicit spark.ui.proxyRedirectUri 
as well for handling proxy. This is not ideal as proxy hostname may not be 
known at the time configuring Spark apps.

This can be mitigated by 1) always prepending spark.ui.proxyRoot to redirect 
path for those proxies that intelligently rewrite Location headers and 2) by 
using path without hostname (/jobs/, not https://example.com/jobs/) for those 
proxies without Location header rewrites. Then redirects behavior would be 
basically the same way as other hyperlinks.

Regarding 2), while hostname was originally required in RFC 2616 in 1999, since 
RFC 7231 in 2014 hostname can be formally omitted as most browsers already 
supported it (it is rather hard to find any browser that doesn't support it).


> More transparent proxy handling for HTTP redirects
> --------------------------------------------------
>
>                 Key: SPARK-45468
>                 URL: https://issues.apache.org/jira/browse/SPARK-45468
>             Project: Spark
>          Issue Type: Improvement
>          Components: Web UI
>    Affects Versions: 3.5.0
>            Reporter: Nobuaki Sukegawa
>            Priority: Major
>              Labels: pull-request-available


