DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://issues.apache.org/bugzilla/show_bug.cgi?id=36932>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=36932

           Summary: httpclient not able to download certain urls
           Product: HttpClient
           Version: 3.0 RC2
          Platform: Other
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Commons HttpClient
        AssignedTo: [email protected]
        ReportedBy: [EMAIL PROTECTED]
                CC: [EMAIL PROTECTED]

Hi guys,

I was using nutch-0.7 to crawl a site, but for certain URLs it threw the following exception and the fetch failed:

java.lang.IllegalArgumentException: Invalid uri 'http://www.trw.com/suppliers/home/0,,5^1^5^5,00.html': escaped absolute path not valid
    at org.apache.commons.httpclient.HttpMethodBase.<init>(HttpMethodBase.java:219)
    at org.apache.commons.httpclient.methods.GetMethod.<init>(GetMethod.java:88)
    at org.apache.nutch.protocol.httpclient.HttpResponse.<init>(HttpResponse.java:87)
    at org.apache.nutch.protocol.httpclient.Http.getProtocolOutput(Http.java:204)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:135)

This URL opens perfectly in a browser, and it is a valid, existing URL.

nutch-0.7 provides two http-protocol plugins: one built on plain Java and the other built on httpclient (commons-httpclient-3.0-rc2.jar). The plugin built on Java is able to download that URL, but the plugin based on httpclient throws the exception above.

Is there a bug in httpclient, or am I doing something wrong? I would really appreciate it if someone could shed some light on this matter.

TIA (Thanks in Advance)
Pushpesh
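
A likely cause, offered as a guess rather than a confirmed diagnosis: the path contains '^', which is not a legal unescaped character in a URI path under RFC 2396, so a strict parser rejects it even though browsers tolerate it. The sketch below reproduces the same kind of rejection using only the JDK's java.net.URI (not HttpClient's own URI class) and shows that percent-encoding the caret as %5E makes the URL parse. The class name CaretUriDemo and its two helper methods are hypothetical names made up for this illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class CaretUriDemo {

    /** Returns true if the string parses as a URI as-is, with no escaping applied. */
    public static boolean isStrictlyValid(String uri) {
        try {
            new URI(uri);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /**
     * Builds a URI from unescaped parts. The multi-argument java.net.URI
     * constructor percent-encodes characters that are illegal in a path.
     */
    public static String escape(String scheme, String host, String rawPath) {
        try {
            return new URI(scheme, host, rawPath, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "http://www.trw.com/suppliers/home/0,,5^1^5^5,00.html";
        System.out.println(isStrictlyValid(raw));      // false: '^' is illegal in a path

        String escaped = escape("http", "www.trw.com", "/suppliers/home/0,,5^1^5^5,00.html");
        System.out.println(escaped);                   // http://www.trw.com/suppliers/home/0,,5%5E1%5E5%5E5,00.html
        System.out.println(isStrictlyValid(escaped));  // true
    }
}
```

Note that this constructor always encodes '%', so it should only be given paths that are not already escaped. Whether passing the escaped form to GetMethod resolves the nutch failure is an assumption that would need testing against commons-httpclient-3.0-rc2 itself.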
