[ https://issues.apache.org/jira/browse/HTTPCLIENT-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Oleg Kalnichevski resolved HTTPCLIENT-2363.
-------------------------------------------
Fix Version/s: 5.4.3, 5.5-alpha2
Resolution: Fixed
> execute(HttpHost, HttpRequest, ResponseHandler) adds port to Host header
> while execute(HttpRequest, ResponseHandler) does not
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: HTTPCLIENT-2363
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2363
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient (classic)
> Affects Versions: 5.3.1, 5.4.2
> Reporter: Nicholas O'Connor
> Priority: Minor
> Fix For: 5.4.3, 5.5-alpha2
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> I've found what I think is a bug, though it could also be expected behavior
> that is surprising from the user's perspective.
> [https://gist.github.com/Earth-Turtle/c39c5282af1c8a306099e89091fafea9]
> Expected behavior: assume we have some URI
> {{https://www.example.com/some/path}}. {{HttpClient}} provides
> overloads of {{execute}} that allow the URI to be split into host and path
> components ("{{https://www.example.com}}", "{{/some/path}}"), or provided
> all in the same {{HttpRequest}} (where {{request.getAuthority()}} is
> "{{https://example.com}}" and {{request.getUri()}} is "/some/path").
> Using either of these two methods should produce exactly the same result.
>
> Actual behavior: {{execute(HttpHost, HttpRequest, ResponseHandler)}} sets the
> Host header to {{www.example.com:443}},
> while {{execute(HttpRequest, ResponseHandler)}} sets it to
> {{www.example.com}}.
>
> Normally, this difference has no effect. In fact,
> [https://echo.free.beeceptor.com] strips the port from the Host header
> when echoing back the headers of a request.
> However, I've recently come across a server that rejected some requests with
> "Invalid host header, this site must be accessed as
> https://www.example.com". Investigation revealed
> that it rejected requests where the port was included in the Host header, and
> would only accept requests where a port was not defined.
>
> This behavior is not defined by the HTTP spec; the port number is not
> required in the Host header sent by the client, nor is the server obligated
> to respect the host portion without the port. This case feels like an outlier
> from usual behavior; however, this hidden behavior from {{HttpClient}} was
> unexpected.
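
For illustration, the port handling discussed above can be sketched in plain Java. This is not HttpClient's code, nor the fix applied for this issue: `HostHeaderSketch`, `hostHeader`, and the `DEFAULT_PORTS` table are hypothetical names, and only the http and https defaults are assumed. It shows one way a Host value could be built so that an undefined or scheme-default port is left out:

```java
import java.util.Locale;
import java.util.Map;

public class HostHeaderSketch {
    // Assumption: only the two schemes mentioned in the report are covered.
    private static final Map<String, Integer> DEFAULT_PORTS =
            Map.of("http", 80, "https", 443);

    /**
     * Builds a Host header value. The port is dropped when it is
     * undefined (negative) or equal to the default for the scheme.
     */
    static String hostHeader(String scheme, String host, int port) {
        Integer defaultPort = DEFAULT_PORTS.get(scheme.toLowerCase(Locale.ROOT));
        boolean omitPort = port < 0 || (defaultPort != null && defaultPort == port);
        return omitPort ? host : host + ":" + port;
    }

    public static void main(String[] args) {
        // Scheme-default port: omitted.
        System.out.println(hostHeader("https", "www.example.com", 443));
        // Non-default port: kept.
        System.out.println(hostHeader("https", "www.example.com", 8443));
        // Undefined port: omitted.
        System.out.println(hostHeader("https", "www.example.com", -1));
    }
}
```

With this kind of normalization, both call styles from the report would produce the same Host value for a default-port target, while an explicitly non-default port would still be sent.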
>
> It appears that this happens when {{ProtocolExec}},
> {{AsyncProtocolExec}}, and {{MinimalHttpClient}} fill in the
> authority and scheme for a request that didn't have them to begin with.
> Because they fill from the {{HttpRoute}}'s target {{HttpHost}}, this
> host also contains port information (usually scheme-default) when it is set
> as the request's authority.
>
> This bug is very easily worked around by simply setting the request's
> authority from the target before calling execute, but it still seems unusual.
> Was this behavior intended?
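
The fill-in step and the workaround described in the report can be modeled roughly as follows. This is a simplified stand-in written with plain Java, not the library's classes: `AuthorityFillSketch`, `Target`, `Request`, and `hostHeaderFor` are all hypothetical, and the route target is assumed to already carry the resolved port 443.

```java
public class AuthorityFillSketch {
    /** Hypothetical stand-in for a route target whose port is already resolved. */
    static final class Target {
        final String host;
        final int port;
        Target(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    /** Hypothetical stand-in for a request; authority stays null until set. */
    static final class Request {
        String authority;
        final String path;
        Request(String path) {
            this.path = path;
        }
    }

    /**
     * Models the behavior described in the report: a request without an
     * authority gets one copied from the route target, port included.
     * A request that already has an authority is left alone.
     */
    static String hostHeaderFor(Target routeTarget, Request request) {
        if (request.authority == null) {
            request.authority = routeTarget.port >= 0
                    ? routeTarget.host + ":" + routeTarget.port
                    : routeTarget.host;
        }
        return request.authority;
    }

    public static void main(String[] args) {
        Target target = new Target("www.example.com", 443);

        // No authority on the request: the target's port leaks into Host.
        Request bare = new Request("/some/path");
        System.out.println(hostHeaderFor(target, bare));

        // Workaround: set the authority (without port) before executing.
        Request prepared = new Request("/some/path");
        prepared.authority = target.host;
        System.out.println(hostHeaderFor(target, prepared));
    }
}
```

In this model the first call yields a Host value with `:443` and the second yields the bare host name, which mirrors the difference between the two `execute` overloads reported above.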
--
This message was sent by Atlassian Jira
(v8.20.10#820010)