[ 
https://issues.apache.org/jira/browse/HTTPCLIENT-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882073#comment-17882073
 ] 

Xavier BOURGOUIN edited comment on HTTPCLIENT-2341 at 9/16/24 2:32 PM:
-----------------------------------------------------------------------

Hi [~olegk],

Ok, thanks, and fine by me. I just wanted to double-check that there is no intention 
to change that behavior in HttpClient 4.x (in the meantime we have already 
pointed the users of our API to some workarounds, including upgrading to 
HttpClient5, which works fine as per the attached fixture).

(And my bad: I only just realized that, according to the "Standards compliance" section of 
https://hc.apache.org/httpcomponents-client-4.5.x/index.html, HttpClient 4.x 
only intends to comply with RFC 2396, even though it has since been obsoleted by 
RFC 3986.)

[~reschke] that makes no difference given Oleg's answer, but for your 
information, that section (6.2.2.2) seems to apply to "unreserved characters" 
only. See also the long comment thread in 
https://issues.apache.org/jira/browse/HTTPCLIENT-2271 which arrived at the 
same conclusion (that reserved characters are to be treated differently and, in 
particular, not converted to or from their percent-encoded representation).
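
To make the reserved/unreserved distinction concrete, here is a minimal sketch of a normalization step that decodes only percent-encoded unreserved characters and leaves everything else (including %40) untouched. It is an illustration under my own assumptions, not HttpClient code: the class name UnreservedDecodingSketch and the helpers isUnreserved and decodeUnreserved are hypothetical. The last lines also show that java.net.URI.normalize() only resolves dot segments and keeps %40 in the raw path.

```java
import java.net.URI;

// Hypothetical illustration only; not code from HttpClient or from the attached reproducer.
public class UnreservedDecodingSketch {

    // RFC 3986 section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    static boolean isUnreserved(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
    }

    // Decodes a %XX triplet only when it encodes an unreserved character;
    // every other triplet (reserved characters such as %40) is copied verbatim.
    static String decodeUnreserved(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c == '%' && i + 2 < raw.length()) {
                int hi = Character.digit(raw.charAt(i + 1), 16);
                int lo = Character.digit(raw.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    int octet = hi * 16 + lo;
                    if (isUnreserved(octet)) {
                        out.append((char) octet);
                    } else {
                        out.append(raw, i, i + 3);
                    }
                    i += 3;
                    continue;
                }
            }
            // Plain character or malformed escape: copy as-is.
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // %7E is "~" (unreserved) and gets decoded; %40 is "@" (reserved) and is kept.
        System.out.println(decodeUnreserved("/foo%40bar%7Ebaz")); // prints /foo%40bar~baz

        // java.net.URI.normalize() removes the "a/.." segments but keeps %40.
        URI uri = URI.create("http://127.0.0.1:8000/a/../foo%40bar").normalize();
        System.out.println(uri.getRawPath()); // prints /foo%40bar
        System.out.println(uri.getPath());    // prints /foo@bar (decoded view only)
    }
}
```

The decoded view returned by getPath() is what makes the two forms look interchangeable; it is the raw path that should go on the wire.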


> DefaultRedirect strategy breaks reserved chars in URI path
> ----------------------------------------------------------
>
>                 Key: HTTPCLIENT-2341
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2341
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient (classic)
>    Affects Versions: 4.5.14
>         Environment: httpclient4 (4.5.14)
> Linux/Ubuntu 22.04
>            Reporter: Xavier BOURGOUIN
>            Priority: Major
>         Attachments: hc4normalize.tar.gz
>
>
> When an HTTP response has a URI in the Location header with percent-encoded 
> reserved chars (such as %40), these chars are replaced by their decoded 
> equivalent ("@" in the case of %40). This seems to contradict RFC 
> 3986 ([https://www.rfc-editor.org/rfc/rfc3986#section-2.2] ), at least in the 
> sense that, for such reserved characters, the percent-encoded form doesn't 
> have the same semantic meaning as the literal character, so the two aren't 
> to be treated as equivalent.
> One of the impacts is that it breaks any server / API that redirects clients 
> to an S3 blob object (AWS S3 for instance) that happens to contain a %40 
> in the URI path (ex: location: https://<endpoint>/<some blob 
> container>/foo%40bar.file)
> Disabling URI normalization as shown below seems to work around it:
> {code:java}
> HttpGet httpget = new HttpGet("http://service-that-redirects");
> httpget.setConfig(RequestConfig.custom().setNormalizeUri(false).build());
>  {code}
> However, I'm not sure that's satisfying if, as we suspect above, it is 
> always wrong to "normalize" those reserved characters (plus normalization is 
> enabled by default).
> Note that httpclient5 is fine (the percent-encoded %40 is preserved as it 
> should be, and it seems there is no longer a toggle for the normalization 
> behavior anyway).
> Comparing httpclient 4.x vs 5.x, it seems the URI normalization utility isn't 
> the same, which might explain why httpclient5 has no issue: 
> https://github.com/apache/httpcomponents-client/blob/4.5.x/httpclient/src/main/java/org/apache/http/impl/client/DefaultRedirectStrategy.java#L163
> https://github.com/apache/httpcomponents-client/blob/5.3.x/httpclient5/src/main/java/org/apache/hc/client5/http/impl/DefaultRedirectStrategy.java#L116
> (org.apache.http.client.utils.URIUtils.normalize() for HC4, versus 
> java.net.URI.normalize for HC5)
>  
> This past ticket https://issues.apache.org/jira/browse/HTTPCLIENT-2271 
> discussed something very similar, except it was the other way around: some 
> reserved characters were replaced by their percent-encoded equivalent. 
> However, in the lengthy comment thread there, it seems a consensus was 
> finally reached that, for such chars, the percent-encoded value isn't 
> equivalent to the original value and thus shouldn't be transformed. So I 
> believe that reasoning should hold in both directions, and should also apply 
> to the case reported here.
> I worked out a reproducer in the form of a small Maven project that I'm 
> attaching to this ticket, inspired by the one from that other ticket. It 
> demonstrates the issue for httpclient 4.5.14 (all 4.x is probably the same) 
> and compares it with httpclient5 (5.3.1). It should run directly with _mvn 
> exec:java_ and hopefully the output and code are clear enough to be 
> self-explanatory.
>  
> In essence, what it does is:
>  * Start a dummy HTTP server with two services: */foo*, which redirects to 
> */foo%40bar*, and one that listens on */foo@bar* and replies with HTTP 200.
>  * Test httpclient4 (along with some other clients to demonstrate the 
> differences in behavior) by sending a GET request to */foo* and 
> observing if and how it follows the redirect toward {*}/foo@bar{*}, which 
> makes it possible to see whether *%40* was replaced by *@*
>  
> {code:java}
> // Dummy server
> public static void main(String[] args) throws IOException, InterruptedException {
>         HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
>         server.createContext("/foo", new RedirectHttpHandler());
>         server.createContext("/foo@bar", new SuccessHttpHandler());
>         server.setExecutor(null); // null selects the default executor
>         server.start();
>
>         // [... test client requests]
>
>         // Stop the server only once the client tests have run
>         server.stop(0);
> }
> public static class RedirectHttpHandler implements HttpHandler {
>         @Override
>         public void handle(HttpExchange t) throws IOException {
>             t.getResponseHeaders().add("Location", "/foo%40bar");
>             t.sendResponseHeaders(302, 0);
>             OutputStream os = t.getResponseBody();
>             os.close();
>         }
>     }    
>     
>     public static class SuccessHttpHandler implements HttpHandler {
>         @Override
>         public void handle(HttpExchange t) throws IOException {
>             System.out.println("[server] Received GET with URI: " + 
> t.getRequestURI().toString());
>             String response = "You followed the redirect!";
>             t.sendResponseHeaders(200, response.length());
>             OutputStream os = t.getResponseBody();
>             os.write(response.getBytes());
>             os.close();
>         }
>     }
> {code}
> And the httpclient4 test looks like this:
> {code:java}
> CloseableHttpClient client = HttpClients.createDefault();
> HttpGet httpget = new HttpGet("http://127.0.0.1:8000/foo");
> CloseableHttpResponse response = client.execute(httpget);
> if (response.getStatusLine().getStatusCode() == 302) {
>     System.out.println("-> Location header: " + 
> response.getFirstHeader("Location").getValue());
> } else if (response.getStatusLine().getStatusCode() == 200) {
>     System.out.println("-> Followed the redirect!");
> } else {
>     throw new RuntimeException("Unexpected response code: " + 
> response.getStatusLine().getStatusCode());
> }   
> {code}


