Re: \ in path (instead of /)
On Monday 22 December 2003 09:23, Ortwin Glück wrote:
> Sam Berlin wrote:
> > HttpClient (rc2) currently barfs on addresses that look like:
> > http://address:port\path\to\some\file.html
> > It might be worthwhile to allow these slashes to be parsed as if they were /'s.
>
> Why? Sorry, I don't think this sort of URI is defined by any URI RFC (teach me better). If you need this for your particular application then please write a converter for it. But this is certainly not something that will go into HttpClient.
>
> Odi

Sam,

some time ago, I also had to work with such backslashed http-URLs. My solution was to extend the URI class (see below). It was a quick hack only, but may suit your needs.

Feel free to use/modify the code in your program (feedback always appreciated).

I agree with Odi that working with backslashes in URIs is unwise and that support for this should not be included in the HttpClient core (but probably in contrib someday).

Christian

import org.apache.commons.httpclient.URI;
import org.apache.commons.httpclient.URIException;

/**
 * A non-standards compliant URI supporting unescaped backslashes ('\')
 * and more unwise things
 *
 * @author Christian Kohlschuetter
 */
public class UnwiseURI extends URI {
    public UnwiseURI(String uri) throws URIException {
        // second argument: treat the given string as already escaped
        super(uri, true);
    }

    // Accepts the raw path as-is, skipping the validation done by the superclass
    public void setRawPath(char[] escapedPath) throws URIException {
        if (escapedPath == null || escapedPath.length == 0) {
            _opaque = escapedPath;
        }
        escapedPath = removeFragmentIdentifier(escapedPath);
        _path = escapedPath;
        setURI();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new URI("http://www.newsclub.de/foo\\bar", false));
        System.out.println(new UnwiseURI("http://www.newsclub.de/foo\\bar"));
    }
}

- To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
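The converter Odi suggests in the message above could look roughly like the following. This is only a minimal sketch, assuming that every backslash in the input is meant as a path separator. The class name BackslashUrlConverter and its methods are hypothetical, and the sketch uses the JDK's java.net.URI rather than HttpClient's own URI class.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper for illustration; not part of HttpClient. */
final class BackslashUrlConverter {
    private BackslashUrlConverter() {
    }

    /**
     * Replaces every backslash with a forward slash.
     * Assumption: '\' never appears as literal data in the URL.
     */
    static String normalize(String url) {
        return url.replace('\\', '/');
    }

    /** Normalizes the string first, then parses it with the JDK's URI class. */
    static URI toUri(String url) throws URISyntaxException {
        return new URI(normalize(url));
    }

    public static void main(String[] args) throws Exception {
        // The port number 8080 is made up for this demonstration.
        System.out.println(toUri("http://address:8080\\path\\to\\some\\file.html"));
    }
}
```

Converting the string before it reaches the HTTP library keeps the library itself standards-compliant, which is the separation both Odi and Christian argue for.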
Re: DO NOT REPLY [Bug 24352] - NLTM Proxy and basic host authorization
On Monday 17 November 2003 20:33, Oleg Kalnichevski wrote:
> [Disregard my previous post. I responded to a wrong message by mistake]
>
> Odi,
>
> That would be REALLY cool! A simple authenticating proxy (or a proxy that could effectively 'fake' popular authentication schemes) would be a very much appreciated contribution.
>
> By the way, have a look at Christian Kohlschütter's SimpleHttpServer:
> http://nagoya.apache.org/bugzilla/showattachment.cgi?attach_id=9066
> I think that can be a good starting point for a better framework than SimpleHttpconnection.

Please have a look at the latest version (see http://nagoya.apache.org/bugzilla/showattachment.cgi?attach_id=9093 ).

It is more abstract than the BadHTTPServer example for Bug 24560 and truly test independent.

Christian
Re: DO NOT REPLY [Bug 24352] - NLTM Proxy and basic host authorization
On Tuesday 18 November 2003 11:26, Ortwin Glück wrote:
> Christian Kohlschütter wrote:
> > Please have a look at the latest version (see http://nagoya.apache.org/bugzilla/showattachment.cgi?attach_id=9093 ).
> > It is more abstract than the BadHTTPServer example for Bug 24560 and truly test independent.
>
> What sort of file is that? It seems binary...

It is a tar.gz archive.

Christian
Re: DO NOT REPLY [Bug 24352] - NLTM Proxy and basic host authorization
On Tuesday 18 November 2003 11:53, Ortwin Glück wrote:
> Christian Kohlschütter wrote:
> > On Tuesday 18 November 2003 11:26, Ortwin Glück wrote:
> > > Christian Kohlschütter wrote:
> > > > Please have a look at the latest version (see http://nagoya.apache.org/bugzilla/showattachment.cgi?attach_id=9093 ).
> > > > It is more abstract than the BadHTTPServer example for Bug 24560 and truly test independent.
> > >
> > > What sort of file is that? It seems binary...
> >
> > tar.gz
>
> Thanks. I am going to check in your server package in a minute. Please confirm that the code in attachment 9093 is meant to be published under the Apache License and is not copyrighted by any third party. I will then include the Apache License.
>
> Odi

I own the copyright for this code and I am willing to contribute / publish it under the conditions of the Apache License.

Christian
Re: DO NOT REPLY [Bug 24560] - HttpClient loops endlessly while trying to retrieve status line
On Tuesday, 11 November 2003 12:32, Ortwin Glück wrote:
> [EMAIL PROTECTED] wrote:
> > Agreed. In other words, can we assume that reusing the HTTP connection is unreliable/should be avoided if there are more bytes available than specified with Content-Length?
> > In this case, at least, I would suggest closing the current connection and opening a fresh one.
> >
> > Christian
>
> This is the pragmatic way of solving the problem. It always works and is very reliable. But it is expensive.

Has this already been implemented or should we do it now?
Re: DO NOT REPLY [Bug 24560] - HttpClient loops endlessly while trying to retrieve status line
On Tuesday, 11 November 2003 12:32, Ortwin Glück wrote:
> [EMAIL PROTECTED] wrote:
> > Agreed. In other words, can we assume that reusing the HTTP connection is unreliable/should be avoided if there are more bytes available than specified with Content-Length?
> > In this case, at least, I would suggest closing the current connection and opening a fresh one.
> >
> > Christian
>
> This is the pragmatic way of solving the problem. It always works and is very reliable. But it is expensive.

By the way, what do you mean by expensive? I think an available() check or a read()/unread() call pair at the right point would be enough.

Christian
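The two checks mentioned in the message above can be sketched with plain JDK streams. This is only an illustration under stated assumptions: the class name SurplusByteCheck and its methods are hypothetical, it is not HttpClient code, and it assumes the response body has already been read completely before the check runs.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

/** Hypothetical illustration; not taken from HttpClient. */
final class SurplusByteCheck {
    private SurplusByteCheck() {
    }

    /** Variant 1: ask the stream whether bytes can be read without blocking. */
    static boolean hasSurplusByAvailable(InputStream in) throws IOException {
        return in.available() > 0;
    }

    /**
     * Variant 2: read()/unread() pair. Peeks at one byte and pushes it back,
     * so the stream content is unchanged afterwards. The available() guard
     * keeps read() from blocking when nothing has arrived yet.
     */
    static boolean hasSurplusByPeek(PushbackInputStream in) throws IOException {
        if (in.available() <= 0) {
            return false;
        }
        int b = in.read();
        if (b < 0) {
            return false; // end of stream, nothing to push back
        }
        in.unread(b);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Two stray bytes stand in for garbage left over after a response body.
        PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(new byte[] {1, 2}));
        System.out.println(hasSurplusByPeek(in));
    }
}
```

Both variants only detect bytes that have already arrived; neither can prove that no garbage will arrive later.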
Re: DO NOT REPLY [Bug 24560] - HttpClient loops endlessly while trying to retrieve status line
On Tuesday, 11 November 2003 13:01, Kalnichevski, Oleg wrote:
> > In other words, can we assume that reusing the HTTP connection is unreliable/should be avoided if there are more bytes available than specified with Content-Length?
>
> Christian,
>
> Again, there's a similar problem. How long do you think we should wait on a connection to be really sure that there's no garbage coming out of it? I think this timeout is the performance hit Odi was talking about. Besides, regardless of how long you wait, there can never be 100% certainty that the connection is completely 'clean'. You could still potentially get some garbage out of a persistent connection when reading the status line of the following response. So, we are pretty much back to where we started.
>
> You are welcome to come up with a better solution than the existing one, but I am afraid there's none that is truly bullet-proof.
>
> Oleg

I perfectly agree. I do not see a bullet-proof solution either.

I should correct my assumption: Can we assume that reusing the HTTP connection is unreliable/should be avoided if there are more bytes *INSTANTLY* available than specified with Content-Length?

"Instantly" means without waiting/blocking, so at least for this situation a simple workaround would be feasible.

I think that the currently used SocketInputStream's available() method _does_ return values > 0. The only additional thing to change is HttpConnection.WrappedInputStream, as it currently lacks an available() method that delegates to the underlying stream.

Christian
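The point about the missing available() method can be illustrated with a small wrapper stream. The base class java.io.InputStream returns 0 from available() unless a subclass overrides it, so a wrapper that forwards only read() hides whatever the underlying stream reports. The class below is a hypothetical stand-in written for this illustration; it is not the actual HttpConnection.WrappedInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical wrapper for illustration; not HttpClient's WrappedInputStream. */
final class DelegatingInputStream extends InputStream {
    private final InputStream in;

    DelegatingInputStream(InputStream in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        return in.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return in.read(b, off, len);
    }

    /** Without this override, the inherited available() would always return 0. */
    @Override
    public int available() throws IOException {
        return in.available();
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    public static void main(String[] args) throws IOException {
        InputStream wrapped =
                new DelegatingInputStream(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        System.out.println(wrapped.available());
    }
}
```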
Re: DO NOT REPLY [Bug 24560] - HttpClient loops endlessly while trying to retrieve status line
On Tuesday 11 November 2003 13:44, Ortwin Glück wrote:
> Expensive: Throwing away the (known bad) connection and opening a new one is a performance hit, of course.

Sure. But this happens only if we have detected a connection to be bad, which is a very rare situation.

In strict mode, the detection of a bad connection should cause an exception to be thrown instead of opening another connection. What do you think?

Christian
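The strict versus lenient behaviour proposed in the message above could be expressed as a small decision function. This is a sketch only: the names ReusePolicy, Action and decide are hypothetical, and the choice of IOException as the exception type is an assumption, not something the thread specifies.

```java
import java.io.IOException;

/** Hypothetical policy for illustration; names are not from HttpClient. */
final class ReusePolicy {
    enum Action { REUSE, REOPEN }

    private ReusePolicy() {
    }

    /**
     * Decides what to do with a persistent connection after a response.
     *
     * @param surplusBytes bytes instantly available beyond Content-Length
     * @param strict       whether strict mode is enabled
     */
    static Action decide(int surplusBytes, boolean strict) throws IOException {
        if (surplusBytes <= 0) {
            return Action.REUSE; // nothing suspicious detected
        }
        if (strict) {
            // Strict mode: report the bad connection instead of hiding it.
            throw new IOException(surplusBytes + " unexpected byte(s) after response body");
        }
        return Action.REOPEN; // lenient mode: drop the connection, open a fresh one
    }

    public static void main(String[] args) throws IOException {
        System.out.println(decide(0, true));
        System.out.println(decide(5, false));
    }
}
```

Because the reopen path is taken only after surplus bytes were actually detected, its cost is paid only in the rare bad-connection case Christian describes.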
Re: Does HttpClient have support for some sort of Document object model?
On Monday, 13 October 2003 15:52, Raj Wagle wrote:
> I was looking for something that people would have actively used. The last release for the code of this project was almost a year ago, and it has not yet had a major release. But probably this is not the right forum to discuss htmlparser.
>
> Thanks for the inputs
> Raj

Hello Raj,

have a look at the CyberNeko HTML Parser:
http://www.apache.org/~andyc/neko/doc/html/

It has very nice features and straightforward APIs (DOM/SAX).

Christian