[ https://issues.apache.org/jira/browse/SPARK-45468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nobuaki Sukegawa updated SPARK-45468:
-------------------------------------
    Description:
Currently, proxies can be made transparent for hyperlinks in Spark web UIs with spark.ui.proxyRoot or the X-Forwarded-Context header alone. However, HTTP redirects (such as those for job/stage kill) currently also require an explicit spark.ui.proxyRedirectUri to work behind a proxy. This is not ideal, as the proxy hostname may not be known at the time Spark apps are configured.

This can be mitigated by 1) always prepending spark.ui.proxyRoot to the redirect path, for proxies that intelligently rewrite Location headers, and 2) using a path without a hostname (/jobs/, not [https://example.com/jobs/]), for proxies that do not rewrite Location headers. Redirects would then behave basically the same way as other hyperlinks.

h2. Example

Let's say the proxy URL is [https://example.org/sparkui/]... forwarding to [http://drv.svc/]... and spark.ui.proxyRoot is configured to be /sparkui

h3. Existing behavior (without spark.ui.proxyRedirectUri)

job/stage kill links redirect to [http://drv.svc/jobs/] - likely 404
(other hyperlinks point to paths with the prefix, e.g., /sparkui/executors - these work fine)

h3. After the change 2)

links redirect to /sparkui/jobs/ - works fine
this is also consistent with other hyperlinks

NOTE: while a hostname was originally required by RFC 2616 in 1999, since RFC 7231 in 2014 the hostname can formally be omitted, as most browsers already supported this (it is rather hard to find any browser that doesn't support it).

  was:
Currently, proxies can be made transparent for hyperlinks in Spark web UIs with spark.ui.proxyRoot or the X-Forwarded-Context header alone. However, HTTP redirects (such as those for job/stage kill) currently also require an explicit spark.ui.proxyRedirectUri to work behind a proxy. This is not ideal, as the proxy hostname may not be known at the time Spark apps are configured.
This can be mitigated by 1) always prepending spark.ui.proxyRoot to the redirect path, for proxies that intelligently rewrite Location headers, and 2) using a path without a hostname (/jobs/, not https://example.com/jobs/), for proxies that do not rewrite Location headers. Redirects would then behave basically the same way as other hyperlinks.

Regarding 2), while a hostname was originally required by RFC 2616 in 1999, since RFC 7231 in 2014 the hostname can formally be omitted, as most browsers already supported this (it is rather hard to find any browser that doesn't support it).


> More transparent proxy handling for HTTP redirects
> --------------------------------------------------
>
>                 Key: SPARK-45468
>                 URL: https://issues.apache.org/jira/browse/SPARK-45468
>             Project: Spark
>          Issue Type: Improvement
>          Components: Web UI
>    Affects Versions: 3.5.0
>            Reporter: Nobuaki Sukegawa
>            Priority: Major
>              Labels: pull-request-available
>
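
The redirect handling proposed in the description can be illustrated with a small sketch. This is not Spark's implementation: `redirect_location` is a hypothetical helper, the kill-request URL is made up for illustration, and Python is used only to keep the example short. It builds a host-less Location value prefixed with the proxy root, then uses `urljoin` to show how a client would resolve it against the URL it requested through the proxy.

```python
from urllib.parse import urljoin


def redirect_location(path: str, proxy_root: str = "") -> str:
    """Build a Location value without a hostname (hypothetical helper).

    proxy_root plays the role of spark.ui.proxyRoot, e.g. "/sparkui".
    """
    root = proxy_root.strip("/")
    prefix = f"/{root}" if root else ""
    # Normalise so the result always has exactly one slash between parts.
    return prefix + "/" + path.lstrip("/")


# Illustrative URL the browser requested through the proxy (assumed shape).
request_url = "https://example.org/sparkui/jobs/job/kill/?id=0"

# Existing behavior: an absolute redirect exposes the internal driver host.
absolute = urljoin(request_url, "http://drv.svc/jobs/")

# Proposed behavior: a host-less path is resolved against the proxy's origin.
location = redirect_location("/jobs/", "/sparkui")
relative = urljoin(request_url, location)

print(absolute)  # the browser is sent to the driver host directly
print(relative)  # the browser stays on the proxy host
```

Because the Location value carries no scheme or host, the client keeps whatever origin it used to reach the proxy, so the proxy hostname does not need to be known when the app is configured.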