[
https://issues.apache.org/jira/browse/HTTPCLIENT-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939227#comment-13939227
]
Oleg Kalnichevski commented on HTTPCLIENT-1486:
-----------------------------------------------
I do not think there is much we can do about it. The uri in question is
represented internally by java.net.URI as
{noformat}
scheme = {java.lang.String@470}"http"
fragment = null
authority = {java.lang.String@426}"www.example.com:80somepath"
userInfo = null
host = null
port = -1
path = {java.lang.String@473}"/someresource.html"
query = null
schemeSpecificPart =
{java.lang.String@474}"//www.example.com:80somepath/someresource.html"
hash = 0
decodedUserInfo = null
decodedAuthority = {java.lang.String@426}"www.example.com:80somepath"
decodedPath = null
decodedQuery = null
decodedFragment = null
decodedSchemeSpecificPart = null
string =
{java.lang.String@392}"http://www.example.com:80somepath/someresource.html"
{noformat}
This is hardly surprising given that the uri is clearly malformed.
The only option we have is implement a custom, more lenient URI parser.
Leniency in URI parsing will come at the cost of having to parse every request
URI twice. I am not sure it is worth it.
Oleg
> Quirky Behavior in URIUtils leads to Improper Request Execution
> ---------------------------------------------------------------
>
> Key: HTTPCLIENT-1486
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1486
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient
> Affects Versions: 4.3.3
> Reporter: William Porter
> Priority: Minor
>
> While executing a HttpUriRequest with a ClosableHttpClient, malformed URIs
> can lead to HTTP requests being executed for unexpected resources. The root
> issue is in the extractHost() method in URIUtils, and is demonstracted by the
> following example.
> {code:title=Main.java|borderStyle=solid}
> import java.io.IOException;
> import java.net.URI;
> import java.net.URISyntaxException;
> import org.apache.http.HttpHost;
> import org.apache.http.HttpResponse;
> import org.apache.http.client.ClientProtocolException;
> import org.apache.http.client.HttpClient;
> import org.apache.http.client.methods.HttpGet;
> import org.apache.http.client.methods.HttpUriRequest;
> import org.apache.http.client.utils.URIUtils;
> import org.apache.http.impl.client.HttpClientBuilder;
> import org.apache.log4j.BasicConfigurator;
> import org.junit.Assert;
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
> public class Main {
>
> private static final Logger LOG = LoggerFactory.getLogger(Main.class);
> public static void main(String [] args) {
>
> // Set up Log4J logging
> BasicConfigurator.configure();
>
> try {
>
> // The following is a strange URI string that is
> possibly a typo that
> // doesn't include the / between the authority and the
> 'intended' path
> final String strangeUriString =
> "http://www.example.com:80somepath/someresource.html";
> // Whereas it doesn't neccesarily seem like strange
> behavior to resolve the
> // host and port as www.example.com and 80 from the
> authority, it can have unintended
> // consequences at higher levels of indirection
> Assert.assertEquals(new HttpHost("www.example.com",
> 80), URIUtils.extractHost(new URI(strangeUriString)));
>
> // Now we construct a request with the strange URI
> String
> HttpUriRequest request = new HttpGet(strangeUriString);
>
> // We create a CloseableHttpClient to execute the
> request
> final HttpClientBuilder builder =
> HttpClientBuilder.create();
> HttpClient client = builder.build();
>
> // Here, the request is executed, but is actually a GET
> /someresource.html
> // on www.example.com:80 since part of the intended
> path was considered part
> // of the authority by the URI class, but disregarded
> by URIUtils
> final HttpResponse response = client.execute(request);
> LOG.info("Response: {}",
> response.getStatusLine().toString());
>
>
> } catch (final URISyntaxException e) {
> LOG.error("UriSyntaxException: {}", e.getMessage());
> } catch (final ClientProtocolException e) {
> LOG.error("ClientProtocolException: {}",
> e.getMessage());
> } catch (final IOException e) {
> LOG.error("IOException: {}", e.getMessage());
> }
>
> }
> }
> {code}
> This bug may be introduced by the fix for
> https://issues.apache.org/jira/browse/HTTPCLIENT-1166. It might be
> advantageous to throw an exception in this case rather than be lenient with
> the host and port parsing, but further discussion might be merited based on
> the comments in the aforementioned issue.
> Here is some debug output to show the request is actually a GET
> /someresource.html
> 87 [main] DEBUG org.apache.http.wire - http-outgoing-0 >> "GET
> /someresource.html HTTP/1.1[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire - http-outgoing-0 >> "Host:
> www.example.com:80[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire - http-outgoing-0 >> "Connection:
> Keep-Alive[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire - http-outgoing-0 >> "User-Agent:
> Apache-HttpClient/4.3.3 (java 1.5)[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire - http-outgoing-0 >> "Accept-Encoding:
> gzip,deflate[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire - http-outgoing-0 >> "[\r][\n]"
--
This message was sent by Atlassian JIRA
(v6.2#6252)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]