On Wed, 2025-03-19 at 21:07 +0000, Nicholas O'Connor wrote:
> I've found what I think is a bug, but could also be expected behavior
> that's surprising from the user's perspective.
> https://gist.github.com/Earth-Turtle/c39c5282af1c8a306099e89091fafea9
> Expected behavior: assume we have some URI
> https://www.example.com/some/path. HttpClient provides overloads for
> execute that allow the URI to be split into host and path
> components ("https://www.example.com", "/some/path"), or provided all
> in the same HttpRequest (where request.getAuthority() is
> "https://example.com" and request.getUri() is "/some/path"). Using
> either of these two methods provides the exact same result.
> 
> Actual behavior: execute(HttpHost, HttpRequest, ResponseHandler) sets
> the Host header to be www.example.com:443, while execute(HttpRequest,
> ResponseHandler) sets it to www.example.com.
> 
> Normally, this behavior has no effect. In fact,
> https://echo.free.beeceptor.com will strip the port in the Host
> header when echoing back the headers in a request. However, I've
> recently come across a server that rejected some requests with
> "Invalid host header, this site must be accessed as
> https://www.example.com". Investigation revealed that it rejected
> requests where the port was included in the Host header, and would
> only accept requests where a port was not defined.
> 
> This behavior is not defined by the HTTP spec; the port number is not
> required in the Host header sent by the client, nor is the server
> obligated to respect the host portion without the port. This case
> feels like an outlier from usual behavior; however, this hidden
> behavior from HttpClient was unexpected.
> 
> It appears that this happens when ProtocolExec, AsyncProtocolExec,
> and MinimalHttpClient are filling in the authority and scheme for a
> request if it didn't have one to begin with. Because they fill from
> the HttpRoute's target HttpHost, this host also contains port
> information (usually scheme-default) when it is set as the request's
> authority.
> 
> This bug is very easily worked around by simply setting the request's
> authority from the target before calling execute, but it still seems
> unusual. Was this behavior intended?
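
For illustration only, the difference described above comes down to
whether a scheme-default port is kept when the Host header value is
built. The following is a minimal stdlib-only sketch of that
normalization, not HttpClient's actual code; the class and method
names (HostHeaderSketch, defaultPort, hostHeader) are hypothetical and
only http and https are handled.

```java
import java.net.URI;

// Hypothetical helper, not part of HttpClient: shows one way a Host
// header value could be derived so that a scheme-default port is omitted.
final class HostHeaderSketch {

    // Scheme-default ports for the two schemes discussed in this thread.
    static int defaultPort(String scheme) {
        if ("https".equalsIgnoreCase(scheme)) {
            return 443;
        }
        if ("http".equalsIgnoreCase(scheme)) {
            return 80;
        }
        return -1; // unknown scheme: no default assumed
    }

    // Returns "host" when the port is absent (-1) or scheme-default,
    // otherwise "host:port".
    static String hostHeader(String scheme, String host, int port) {
        if (port < 0 || port == defaultPort(scheme)) {
            return host;
        }
        return host + ":" + port;
    }

    public static void main(String[] args) {
        URI uri = URI.create("https://www.example.com/some/path");
        // No explicit port in the URI, so getPort() returns -1.
        System.out.println(hostHeader(uri.getScheme(), uri.getHost(), uri.getPort()));
        // Explicit scheme-default port, as filled in from a target host.
        System.out.println(hostHeader("https", "www.example.com", 443));
        // A non-default port must be preserved.
        System.out.println(hostHeader("https", "www.example.com", 8443));
    }
}
```

With this sketch both the port-less and the explicit-443 inputs yield
www.example.com, while a non-default port such as 8443 is preserved.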

Hi Nicholas

Probably not. Please raise a JIRA ticket for this issue. I will look
into it in the coming days.

https://issues.apache.org/jira/browse/HTTPCLIENT

Oleg


---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscr...@hc.apache.org
For additional commands, e-mail: httpclient-users-h...@hc.apache.org
