huangzhir created SPARK-47934:
---------------------------------

             Summary: Inefficient Redirect Handling Due to Missing Trailing 
Slashes in URL Redirection
                 Key: SPARK-47934
                 URL: https://issues.apache.org/jira/browse/SPARK-47934
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 3.4.3, 3.5.1, 3.3.2, 3.2.4
            Reporter: huangzhir


*Summary:*
The current implementation of URL redirection in Spark's history web UI does 
not consistently add trailing slashes to URLs when constructing redirection 
targets. Jetty then issues an additional HTTP redirect to add the missing 
slash, which costs an extra round trip and increases the load time of the 
Spark UI.

*Problem Description:*
When constructing redirect URLs, particularly when an attempt ID needs to be 
appended, the history server does not ensure that the URL ends with a slash. 
Jetty then redirects the generated URL to add the trailing slash, causing an 
unnecessary additional HTTP redirect.

For example, when the `shouldAppendAttemptId` flag is true, the attempt ID is 
appended to the URL without a trailing slash after it, leading to two 
redirects: one by the history server to add the attempt ID, and another by 
Jetty to add the missing slash.

!image-2024-04-22-15-06-29-357.png!

*Proposed Solution:*

[https://github.com/apache/spark/blob/2d0b56c3eac611e743c41d16ea8e439bc8a504e4/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala#L118]

Ensure that all redirect URLs uniformly end with a trailing slash regardless of 
whether an attempt ID is appended. This can be achieved by modifying the URL 
construction logic as follows:

```scala
val redirect = if (shouldAppendAttemptId) {
  req.getRequestURI.stripSuffix("/") + "/" + attemptId.get + "/"
} else {
  req.getRequestURI.stripSuffix("/") + "/"
}
```
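To make the intended behaviour easier to check in isolation, here is a minimal 
standalone sketch of the same URL construction. It is not the HistoryServer 
code: the object `RedirectSketch` and the helper `redirectTarget` are 
hypothetical names, the request URI is passed in as a plain string instead of 
being read from the servlet request, and the `shouldAppendAttemptId` flag is 
modelled as an `Option` that is defined only when an attempt ID should be 
appended.

```scala
object RedirectSketch {
  // Hypothetical helper (not part of Spark): builds a redirect target that
  // always ends with exactly one trailing slash.
  // `attemptId` is Some(id) only when the attempt ID should be appended.
  def redirectTarget(requestUri: String, attemptId: Option[String]): String = {
    // Normalise the base so it ends with a single "/".
    val base = requestUri.stripSuffix("/") + "/"
    // Append the attempt ID, if any, followed by its own trailing slash.
    attemptId.fold(base)(id => base + id + "/")
  }

  def main(args: Array[String]): Unit = {
    // The application path below is made up for illustration.
    println(redirectTarget("/history/app-123", Some("1"))) // /history/app-123/1/
    println(redirectTarget("/history/app-123/", None))     // /history/app-123/
  }
}
```

Because both branches return a URL that already ends with a slash, Jetty 
should have nothing left to correct, so the client would follow a single 
redirect.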
 



