[ 
https://issues.apache.org/jira/browse/HTTPCLIENT-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Kalnichevski reopened HTTPCLIENT-679:
------------------------------------------


The fix looks reasonable to me. If I hear no complaints, I'll check it in later 
this week.

Oleg

> URI Absolutization does not follow browser behavior
> ---------------------------------------------------
>
>                 Key: HTTPCLIENT-679
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-679
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient
>    Affects Versions: 3.1 RC1
>         Environment: HttpClient 3.1 RC1, 
> JDK 1.6.0
> Ubuntu 7.04
>            Reporter: Jeff Dalton
>         Attachments: uri_fix.patch
>
>
> This was encountered using Heritrix to crawl a prominent website.
> The URI resulting from the HttpClient URI constructor (base, relative) does 
> not follow browser behavior:
> URI newUrl = new URI(new 
> URI("http://www.theirwebsite.com/browse/results?type=browse&att=1"), 
> "?sort=0&offset=11&pageSize=10");
> Results in newUrl:
> http://www.theirwebsite.com/browse/?sort=0&offset=11&pageSize=10
> The desired behavior based on Firefox and IE should be:
> http://www.theirwebsite.com/browse/results?sort=0&offset=11&pageSize=10
> These browsers treat the question mark similarly to a directory separator and 
> do not require a file to be specified before the query.
> HttpClient's current behavior does not correspond to current browser behavior 
> and leads to an inability to crawl certain websites if HttpClient's URI class 
> is used.
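
For reference, the behavior the report asks for matches the resolution rule in RFC 3986, section 5.2.2: when a relative reference has an empty path, the base path is kept unchanged and only the query is replaced. The following is a minimal sketch of that one case, not the contents of uri_fix.patch. The class QueryOnlyResolver and its resolve method are hypothetical names invented for illustration and are not part of HttpClient; the sketch works on plain strings and handles only references that start with "?".

```java
public final class QueryOnlyResolver {

    private QueryOnlyResolver() {
    }

    /**
     * Hypothetical helper (not an HttpClient API): resolves a query-only
     * reference such as "?a=1" against an absolute base URI string.
     * The base path is kept as-is; the base query and fragment are dropped.
     */
    public static String resolve(String base, String reference) {
        if (!reference.startsWith("?")) {
            throw new IllegalArgumentException(
                    "this sketch only handles references that start with '?'");
        }
        // Strip the fragment of the base first, since it may contain '?'.
        int hash = base.indexOf('#');
        String withoutFragment = hash >= 0 ? base.substring(0, hash) : base;

        // Strip the base query, keeping scheme, authority and full path.
        int question = withoutFragment.indexOf('?');
        String withoutQuery = question >= 0
                ? withoutFragment.substring(0, question)
                : withoutFragment;

        // The reference supplies the new query (and fragment, if any).
        return withoutQuery + reference;
    }

    public static void main(String[] args) {
        String base =
                "http://www.theirwebsite.com/browse/results?type=browse&att=1";
        // Prints the URL the reporter expects, with "results" preserved.
        System.out.println(resolve(base, "?sort=0&offset=11&pageSize=10"));
    }
}
```

With the base and reference from the report, this yields the URL with the "results" segment preserved, which is the result Firefox and IE produce according to the reporter.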

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

