[ 
https://issues.apache.org/jira/browse/HTTPCLIENT-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940328#comment-13940328
 ] 

Oleg Kalnichevski commented on HTTPCLIENT-1486:
-----------------------------------------------

I just cannot make everyone happy, can I? 

The uri in question is not rejected by java.net.URI as invalid. 
URIUtils#extractHost does a reasonable job figuring out the right target host. 

{code:java}
String strangeUriString = 
"http://www.example.com:8888somepath/someresource.html";;
Assert.assertEquals(new HttpHost("www.example.com", 8888), 
URIUtils.extractHost(new URI(strangeUriString)));
{code}

The method parses authority attribute *only* if the standard parsing route 
failed to determine the host attribute. In the normal execution flow there is 
almost no extra overhead.

Most importantly, though, there are legitimate cases where URIUtils#extractHost 
can be very useful. Try to convince all those people who use underscore in 
hostnames that the following URI should be rejected as invalid:

{code:java}
String strangeUriString = "http://www.my_example.com:8888/someresource.html";;
URI uri = new URI(strangeUriString);
Assert.assertNull(uri.getHost());
Assert.assertEquals(new HttpHost("www.my_example.com", 8888), 
URIUtils.extractHost(uri));
{code}

Oleg

> Quirky Behavior in URIUtils leads to Improper Request Execution
> ---------------------------------------------------------------
>
>                 Key: HTTPCLIENT-1486
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1486
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient
>    Affects Versions: 4.3.3
>            Reporter: William Porter
>            Priority: Minor
>
> While executing a HttpUriRequest with a ClosableHttpClient, malformed URIs 
> can lead to HTTP requests being executed for unexpected resources.  The root 
> issue is in the extractHost() method in URIUtils, and is demonstracted by the 
> following example.
> {code:title=Main.java|borderStyle=solid}
> import java.io.IOException;
> import java.net.URI;
> import java.net.URISyntaxException;
> import org.apache.http.HttpHost;
> import org.apache.http.HttpResponse;
> import org.apache.http.client.ClientProtocolException;
> import org.apache.http.client.HttpClient;
> import org.apache.http.client.methods.HttpGet;
> import org.apache.http.client.methods.HttpUriRequest;
> import org.apache.http.client.utils.URIUtils;
> import org.apache.http.impl.client.HttpClientBuilder;
> import org.apache.log4j.BasicConfigurator;
> import org.junit.Assert;
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
> public class Main {
>       
>       private static final Logger LOG = LoggerFactory.getLogger(Main.class);
>       public static void main(String [] args) {
>               
>               // Set up Log4J logging
>               BasicConfigurator.configure();
>               
>               try {
>                       
>                       // The following is a strange URI string that is 
> possibly a typo that
>                       // doesn't include the / between the authority and the 
> 'intended' path
>                       final String strangeUriString = 
> "http://www.example.com:80somepath/someresource.html";;
>                       // Whereas it doesn't neccesarily seem like strange 
> behavior to resolve the
>                       // host and port as www.example.com and 80 from the 
> authority, it can have unintended
>                       // consequences at higher levels of indirection
>                       Assert.assertEquals(new HttpHost("www.example.com", 
> 80), URIUtils.extractHost(new URI(strangeUriString)));
>                       
>                       // Now we construct a request with the strange URI 
> String
>                       HttpUriRequest request = new HttpGet(strangeUriString);
>                       
>                       // We create a CloseableHttpClient to execute the 
> request
>                       final HttpClientBuilder builder = 
> HttpClientBuilder.create();
>                       HttpClient client = builder.build();
>                       
>                       // Here, the request is executed, but is actually a GET 
> /someresource.html
>                       // on www.example.com:80 since part of the intended 
> path was considered part 
>                       // of the authority by the URI class, but disregarded 
> by URIUtils
>                       final HttpResponse response = client.execute(request);
>                       LOG.info("Response: {}", 
> response.getStatusLine().toString());
>                       
>                       
>               } catch (final URISyntaxException e) {
>                       LOG.error("UriSyntaxException: {}", e.getMessage());
>               } catch (final ClientProtocolException e) {
>                       LOG.error("ClientProtocolException: {}", 
> e.getMessage());
>               } catch (final IOException e) {
>                       LOG.error("IOException: {}", e.getMessage());
>               }
>               
>       }
> }
> {code}
> This bug may be introduced by the fix for 
> https://issues.apache.org/jira/browse/HTTPCLIENT-1166.  It might be 
> advantageous to throw an exception in this case rather than be lenient with 
> the host and port parsing, but further discussion might be merited based on 
> the comments in the aforementioned issue. 
> Here is some debug output to show the request is actually a GET 
> /someresource.html
> 87 [main] DEBUG org.apache.http.wire  - http-outgoing-0 >> "GET 
> /someresource.html HTTP/1.1[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire  - http-outgoing-0 >> "Host: 
> www.example.com:80[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire  - http-outgoing-0 >> "Connection: 
> Keep-Alive[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire  - http-outgoing-0 >> "User-Agent: 
> Apache-HttpClient/4.3.3 (java 1.5)[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire  - http-outgoing-0 >> "Accept-Encoding: 
> gzip,deflate[\r][\n]"
> 87 [main] DEBUG org.apache.http.wire  - http-outgoing-0 >> "[\r][\n]"



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to