[
https://issues.apache.org/jira/browse/HTTPCLIENT-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940328#comment-13940328
]
Oleg Kalnichevski commented on HTTPCLIENT-1486:
-----------------------------------------------
I just cannot make everyone happy, can I?
The URI in question is not rejected by java.net.URI as invalid.
URIUtils#extractHost does a reasonable job of figuring out the right target host.
{code:java}
String strangeUriString =
"http://www.example.com:8888somepath/someresource.html";
Assert.assertEquals(new HttpHost("www.example.com", 8888),
URIUtils.extractHost(new URI(strangeUriString)));
{code}
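For reference, here is a minimal JDK-only sketch of how java.net.URI itself splits that URI. The class name UriComponentsSketch and the describe helper are made up for illustration; only the URI string comes from the example above.

```java
import java.net.URI;

public class UriComponentsSketch {

    // Hypothetical helper: summarises the components java.net.URI reports.
    static String describe(URI uri) {
        return "host=" + uri.getHost()
                + " port=" + uri.getPort()
                + " authority=" + uri.getAuthority()
                + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // The URI is accepted, but its authority is treated as registry-based,
        // so no host or port is reported.
        System.out.println(describe(
                URI.create("http://www.example.com:8888somepath/someresource.html")));
    }
}
```

The URI parses without an exception, yet getHost() is null and getPort() is -1; the whole "www.example.com:8888somepath" string is only available through getAuthority().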
The method parses the authority attribute *only* if the standard parsing route
has failed to determine the host attribute. In the normal execution flow there
is almost no extra overhead.
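The fallback described above can be sketched roughly as follows. This is an illustration using only java.net.URI, not the actual URIUtils source: the class AuthorityFallbackSketch and the method hostAndPort are hypothetical names, and returning a plain "host:port" string is a simplification standing in for HttpHost.

```java
import java.net.URI;

public class AuthorityFallbackSketch {

    // Hypothetical helper (not HttpClient API): returns "host:port", just "host"
    // when no port is known, or null when the URI has no authority at all.
    static String hostAndPort(URI uri) {
        String host = uri.getHost();
        int port = uri.getPort();
        if (host == null) {
            // Fallback: only reached when standard parsing found no host.
            String authority = uri.getAuthority();
            if (authority == null) {
                return null;
            }
            // Drop any user-info prefix.
            int at = authority.lastIndexOf('@');
            if (at >= 0) {
                authority = authority.substring(at + 1);
            }
            int colon = authority.indexOf(':');
            if (colon < 0) {
                host = authority;
            } else {
                host = authority.substring(0, colon);
                // Take the leading run of digits after the colon as the port
                // and ignore whatever follows it.
                int end = colon + 1;
                while (end < authority.length()
                        && Character.isDigit(authority.charAt(end))) {
                    end++;
                }
                if (end > colon + 1) {
                    try {
                        port = Integer.parseInt(authority.substring(colon + 1, end));
                    } catch (NumberFormatException e) {
                        port = -1; // digit run too long to be a port
                    }
                }
            }
        }
        return port >= 0 ? host + ":" + port : host;
    }

    public static void main(String[] args) {
        System.out.println(hostAndPort(
                URI.create("http://www.my_example.com:8888/someresource.html")));
    }
}
```

For a well-formed URI the first two accessor calls already supply host and port, so the authority is never inspected, which is where the "almost no extra overhead" comes from.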
Most importantly, though, there are legitimate cases where URIUtils#extractHost
can be very useful. Try to convince all those people who use underscore in
hostnames that the following URI should be rejected as invalid:
{code:java}
String strangeUriString = "http://www.my_example.com:8888/someresource.html";
URI uri = new URI(strangeUriString);
Assert.assertNull(uri.getHost());
Assert.assertEquals(new HttpHost("www.my_example.com", 8888),
URIUtils.extractHost(uri));
{code}
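Callers that do want such URIs rejected could check up front with java.net.URI#parseServerAuthority, which throws URISyntaxException when the authority cannot be parsed as [userinfo@]host[:port]. A minimal sketch of such a check follows; the class and method names are hypothetical and nothing like this is implied to exist in HttpClient.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class StrictAuthoritySketch {

    // Hypothetical helper: true only if java.net.URI can parse the authority
    // as a server-based one, that is, a host is actually recognised.
    static boolean hasServerAuthority(String uriString) {
        try {
            return new URI(uriString).parseServerAuthority().getHost() != null;
        } catch (URISyntaxException e) {
            // Either the URI as a whole or its authority is not parseable.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasServerAuthority(
                "http://www.my_example.com:8888/someresource.html"));
        System.out.println(hasServerAuthority(
                "http://www.example.com:8888/someresource.html"));
    }
}
```

With a check like this the strictness is the caller's choice, and URIUtils#extractHost can stay lenient for the underscore case.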
Oleg
> Quirky Behavior in URIUtils leads to Improper Request Execution
> ---------------------------------------------------------------
>
> Key: HTTPCLIENT-1486
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1486
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient
> Affects Versions: 4.3.3
> Reporter: William Porter
> Priority: Minor
>
> While executing an HttpUriRequest with a CloseableHttpClient, malformed URIs
> can lead to HTTP requests being executed for unexpected resources. The root
> issue is in the extractHost() method in URIUtils, and is demonstrated by the
> following example.
> {code:title=Main.java|borderStyle=solid}
> import java.io.IOException;
> import java.net.URI;
> import java.net.URISyntaxException;
> import org.apache.http.HttpHost;
> import org.apache.http.HttpResponse;
> import org.apache.http.client.ClientProtocolException;
> import org.apache.http.client.HttpClient;
> import org.apache.http.client.methods.HttpGet;
> import org.apache.http.client.methods.HttpUriRequest;
> import org.apache.http.client.utils.URIUtils;
> import org.apache.http.impl.client.HttpClientBuilder;
> import org.apache.log4j.BasicConfigurator;
> import org.junit.Assert;
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
> public class Main {
>
> private static final Logger LOG = LoggerFactory.getLogger(Main.class);
> public static void main(String [] args) {
>
> // Set up Log4J logging
> BasicConfigurator.configure();
>
> try {
>
> // The following is a strange URI string, possibly a typo, that
> // doesn't include the / between the authority and the 'intended' path
> final String strangeUriString = "http://www.example.com:80somepath/someresource.html";
> // While it doesn't necessarily seem like strange behavior to resolve the
> // host and port as www.example.com and 80 from the authority, it can have
> // unintended consequences at higher levels of indirection
> Assert.assertEquals(new HttpHost("www.example.com", 80),
> URIUtils.extractHost(new URI(strangeUriString)));
>
> // Now we construct a request with the strange URI String
> HttpUriRequest request = new HttpGet(strangeUriString);
>
> // We create a CloseableHttpClient to execute the request
> final HttpClientBuilder builder = HttpClientBuilder.create();
> HttpClient client = builder.build();
>
> // Here, the request is executed, but is actually a GET /someresource.html
> // on www.example.com:80 since part of the intended path was considered part
> // of the authority by the URI class, but disregarded by URIUtils
> final HttpResponse response = client.execute(request);
> LOG.info("Response: {}", response.getStatusLine().toString());
>
>
> } catch (final URISyntaxException e) {
> LOG.error("UriSyntaxException: {}", e.getMessage());
> } catch (final ClientProtocolException e) {
> LOG.error("ClientProtocolException: {}",
> e.getMessage());
> } catch (final IOException e) {
> LOG.error("IOException: {}", e.getMessage());
> }
>
> }
> }
> {code}
> This bug may have been introduced by the fix for
> https://issues.apache.org/jira/browse/HTTPCLIENT-1166. It might be
> advantageous to throw an exception in this case rather than be lenient with
> the host and port parsing, but further discussion might be merited based on
> the comments in the aforementioned issue.
> Here is some debug output to show the request is actually a GET
> /someresource.html
> 87 [main] DEBUG org.apache.http.wire - http-outgoing-0 >> "GET /someresource.html HTTP/1.1[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire - http-outgoing-0 >> "Host: www.example.com:80[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire - http-outgoing-0 >> "Connection: Keep-Alive[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire - http-outgoing-0 >> "User-Agent: Apache-HttpClient/4.3.3 (java 1.5)[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire - http-outgoing-0 >> "Accept-Encoding: gzip,deflate[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire - http-outgoing-0 >> "[\r][\n]"