Re: \ in path (instead of /)

2003-12-29 Thread Christian Kohlschütter
On Monday 22 December 2003 09:23, Ortwin Glück wrote:
 Sam Berlin wrote:
  HttpClient (rc2) currently barfs on addresses that look like:
 
  http://address:port\path\to\some\file.html
 
  It might be worthwhile to allow these slashes to be parsed as if they
  were /'s.

 Why? Sorry, I don't think this sort of URI is defined by any URI RFC
 (teach me better). If you need this for your particular application then
 please write a convertor for it. But this is certainly not something
 that will go into HttpClient.

 Odi

Sam,

Some time ago, I also had to work with such backslashed http-URLs.
My solution was to extend the URI class (see below).

It was a quick hack only, but may suit your needs.
Feel free to use/modify the code in your program (feedback always 
appreciated).

I agree with Odi that working with backslashes in URIs is unwise and that 
support for this should not be included in the HttpClient core (but probably 
in contrib someday).


Christian

import org.apache.commons.httpclient.URI;
import org.apache.commons.httpclient.URIException;

/**
 * A non-standards compliant URI supporting unescaped backslashes ('\')
 * and more unwise things
 *
 * @author  Christian Kohlschuetter
 */
public class UnwiseURI extends URI {

    public UnwiseURI(String uri) throws URIException {
        // Treat the input as already escaped so that it is not re-encoded.
        super(uri, true);
    }

    // Overrides URI.setRawPath and stores the path without validating it.
    public void setRawPath(char[] escapedPath) throws URIException {
        if (escapedPath == null || escapedPath.length == 0) {
            _opaque = escapedPath;
        }

        escapedPath = removeFragmentIdentifier(escapedPath);
        _path = escapedPath;
        setURI();
    }

    public static void main(String[] args) throws Exception {
        // The standard URI class, for comparison.
        System.out.println(new URI("http://www.newsclub.de/foo\\bar", false));
        // The lenient subclass.
        System.out.println(new UnwiseURI("http://www.newsclub.de/foo\\bar"));
    }
}
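
A converter of the kind Odi suggests could look roughly like the sketch
below. It does not touch HttpClient at all; it only rewrites the string before
it is parsed. The class name BackslashUrlFixer, the method normalize and the
port 8080 in the example are made up for illustration, and leaving the query
and fragment untouched is an assumption about what an application would want.

```java
// Hypothetical helper, not part of HttpClient: rewrites backslashes to
// forward slashes before the string is handed to a URI parser.
final class BackslashUrlFixer {

    private BackslashUrlFixer() {
    }

    /**
     * Replaces '\' with '/' in everything before the first '?' or '#'.
     * The query string and the fragment are left exactly as they are.
     */
    static String normalize(String url) {
        int end = url.length();
        int query = url.indexOf('?');
        int fragment = url.indexOf('#');
        if (query >= 0) {
            end = Math.min(end, query);
        }
        if (fragment >= 0) {
            end = Math.min(end, fragment);
        }
        return url.substring(0, end).replace('\\', '/') + url.substring(end);
    }

    public static void main(String[] args) {
        // Port 8080 is an arbitrary example value.
        System.out.println(normalize("http://address:8080\\path\\to\\some\\file.html"));
    }
}
```

The normalized string can then be passed to the regular URI class, so the
non-standard input never reaches the library.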

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: DO NOT REPLY [Bug 24352] - NLTM Proxy and basic host authorization

2003-11-18 Thread Christian Kohlschütter
On Monday 17 November 2003 20:33, Oleg Kalnichevski wrote:
 [Disregard my previous post. I responded to a wrong message by mistake]

 Odi,
 That would be REALLY cool! A simple authenticating proxy (or a proxy
 that could effectively 'fake' popular authentication schemes) would be a
 very much appreciated contribution. By the way, have a look at the
 Christian Kohlschütter's SimpleHttpServer:

 http://nagoya.apache.org/bugzilla/showattachment.cgi?attach_id=9066

 I think that can be a good starting point for a better framework than
 SimpleHttpconnection.

Please have a look at the latest version (see
http://nagoya.apache.org/bugzilla/showattachment.cgi?attach_id=9093
).

It is more abstract than the BadHTTPServer example for Bug 24560 and truly 
test independent.


Christian




Re: DO NOT REPLY [Bug 24352] - NLTM Proxy and basic host authorization

2003-11-18 Thread Christian Kohlschütter
On Tuesday 18 November 2003 11:26, Ortwin Glück wrote:
 Christian Kohlschütter wrote:
  Please have a look at the latest version (see
  http://nagoya.apache.org/bugzilla/showattachment.cgi?attach_id=9093
  ).
 
  It is more abstract than the BadHTTPServer example for Bug 24560 and
  truly test independent.

 What sort of file is that? It seems binary...

tar.gz


Christian




Re: DO NOT REPLY [Bug 24352] - NLTM Proxy and basic host authorization

2003-11-18 Thread Christian Kohlschütter
On Tuesday 18 November 2003 11:53, Ortwin Glück wrote:
 Christian Kohlschütter wrote:
  On Tuesday 18 November 2003 11:26, Ortwin Glück wrote:
 Christian Kohlschütter wrote:
 Please have a look at the latest version (see
 http://nagoya.apache.org/bugzilla/showattachment.cgi?attach_id=9093
 ).
 
 It is more abstract than the BadHTTPServer example for Bug 24560 and
 truly test independent.
 
 What sort of file is that? It seems binary...
 
  tar.gz

 Thanks. I am gonna check your server package in in a minute. Please
 confirm that the code in attachment 9093 is meant to be published under
 the Apache License and is not copyright by any third party. I will then
 include the Apache License.

 Odi

I own the copyright for this code and I am willing to contribute / publish it 
under the conditions of the Apache License.


Christian




Re: DO NOT REPLY [Bug 24560] - HttpClient loops endlessly while trying to retrieve status line

2003-11-11 Thread Christian Kohlschütter
On Tuesday, 11 November 2003 12:32, Ortwin Glück wrote:
 [EMAIL PROTECTED] wrote:

  Agreed.

  In other words, can we assume that reusing the HTTP connection is
  unreliable/should be avoided if there are more bytes available than
  specified with Content-Length?

  In this case, at least, I would suggest to close the current connection
  and open a fresh one.


  Christian

 This is the pragmatic way of solving the problem. It always works and is 
 very reliable. But it is expensive.

Has this already been implemented or should we do it now?


Re: DO NOT REPLY [Bug 24560] - HttpClient loops endlessly while trying to retrieve status line

2003-11-11 Thread Christian Kohlschütter
On Tuesday, 11 November 2003 12:32, Ortwin Glück wrote:
 [EMAIL PROTECTED] wrote:

  Agreed.

  In other words, can we assume that reusing the HTTP connection is
  unreliable/should be avoided if there are more bytes available than
  specified with Content-Length?

  In this case, at least, I would suggest to close the current connection
  and open a fresh one.


  Christian

 
 This is the pragmatic way of solving the problem. It always works and is 
 very reliable. But it is expensive.

By the way, what do you mean by expensive?

I think an available() check or read()/unread() call pair at the right point 
would be enough.


Christian
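
The read()/unread() idea can be sketched with java.io.PushbackInputStream. This
is only an illustration under assumptions: the class LeftoverProbe and its
methods are made up, and the connection is simulated with a byte array instead
of a real socket.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not HttpClient code.
final class LeftoverProbe {

    private LeftoverProbe() {
    }

    /**
     * Returns true if at least one byte can be read without blocking.
     * The byte is pushed back, so the stream content is unchanged.
     */
    static boolean hasLeftover(PushbackInputStream in) throws IOException {
        if (in.available() <= 0) {
            return false; // nothing buffered right now, so do not block
        }
        int b = in.read();
        if (b < 0) {
            return false; // end of stream
        }
        in.unread(b);
        return true;
    }

    /** Simulates reading a body of contentLength bytes, then probes. */
    static boolean leftoverAfterBody(byte[] wire, int contentLength) {
        try {
            PushbackInputStream in =
                    new PushbackInputStream(new ByteArrayInputStream(wire));
            for (int i = 0; i < contentLength; i++) {
                in.read();
            }
            return hasLeftover(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = {'h', 'e', 'l', 'l', 'o', 'X', 'Y'};
        // Body declared as 5 bytes, but 7 bytes are on the wire.
        System.out.println(leftoverAfterBody(wire, 5));
    }
}
```

The probe only sees bytes that have already arrived, so it cannot prove that a
connection is clean.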


Re: DO NOT REPLY [Bug 24560] - HttpClient loops endlessly while trying to retrieve status line

2003-11-11 Thread Christian Kohlschütter
On Tuesday, 11 November 2003 13:01, Kalnichevski, Oleg wrote:
  In other words, can we assume that reusing the HTTP connection is
  unreliable/should be avoided if there are more bytes available than
  specified with Content-Length?

 Christian,
 Again, there's a similar problem. How long do you think we should wait on a
 connection to be really sure that there's no garbage coming out of it? I
 think this timeout is the performance hit Odi was talking about. Besides,
 regardless how long you wait there can never be 100% certainty that the
 connection is completely 'clean'. You could still potentially get some
 garbage out of persistent connection when reading the status line of the
 following response. So, we are pretty much back to where we started.

 You are welcome to come up with a better solution than the existing one,
 but I am afraid there's none that is truly bullet-proof.

 Oleg

I perfectly agree - I do not see a bullet-proof solution either.

I should correct my assumption:

Can we assume that reusing the HTTP connection is unreliable/should be 
avoided if there are more bytes *INSTANTLY* available than specified with 
Content-Length?

Instantly means without waiting/blocking, so at least for this situation, a 
simple workaround would be feasible.

I think that the currently used SocketInputStream's available() method _does_ 
return values > 0.

The only additional thing to change is HttpConnection.WrappedInputStream, as 
it currently lacks an available() method call to the underlying stream.


Christian
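
To illustrate why the wrapper matters: java.io.InputStream.available() returns
0 unless a subclass overrides it, so a wrapper that does not forward the call
hides whatever the underlying stream reports. The sketch below uses a made-up
class, ForwardingInputStream, as a stand-in; it is not the actual
HttpConnection.WrappedInputStream code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a connection-level stream wrapper.
final class ForwardingInputStream extends InputStream {

    private final InputStream in;

    ForwardingInputStream(InputStream in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        return in.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return in.read(b, off, len);
    }

    // Without this override, the inherited available() always returns 0.
    @Override
    public int available() throws IOException {
        return in.available();
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    /** Demo helper: wraps a byte array and asks the wrapper what is available. */
    static int availableThroughWrapper(byte[] data) {
        try {
            return new ForwardingInputStream(new ByteArrayInputStream(data)).available();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(availableThroughWrapper(new byte[] {1, 2, 3}));
    }
}
```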




Re: DO NOT REPLY [Bug 24560] - HttpClient loops endlessly while trying to retrieve status line

2003-11-11 Thread Christian Kohlschütter
On Tuesday 11 November 2003 13:44, Ortwin Glück wrote:
 
 Expensive: Throwing away the (known bad) connection and opening a new 
 one, is a performance hit of course.

Sure. But this happens only if we detect a bad connection, which is a very 
rare situation.

In strict mode, the detection of a bad connection should cause an exception 
to be thrown, instead of reopening another connection. What do you think?


Christian
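
The strict/lenient distinction could be expressed along these lines. Everything
in the sketch is hypothetical: the class StalePolicy, the Action enum and the
use of IllegalStateException are placeholders for whatever HttpClient would
actually use.

```java
// Hypothetical decision helper, not HttpClient code.
final class StalePolicy {

    enum Action { REUSE, REOPEN }

    private StalePolicy() {
    }

    /**
     * Decides what to do with a connection after a response was read.
     *
     * @param leftoverDetected true if extra bytes were found after the body
     * @param strict           true if protocol violations must be reported
     */
    static Action decide(boolean leftoverDetected, boolean strict) {
        if (!leftoverDetected) {
            return Action.REUSE;
        }
        if (strict) {
            // Placeholder exception type, chosen only for this sketch.
            throw new IllegalStateException(
                    "More bytes available than specified with Content-Length");
        }
        return Action.REOPEN; // lenient mode: drop the connection, open a new one
    }

    public static void main(String[] args) {
        System.out.println(decide(true, false));
    }
}
```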




Re: Does HttpClient have support for some sort of Document object model?

2003-10-13 Thread Christian Kohlschütter
On Monday, 13 October 2003 15:52, Raj Wagle wrote:
 I was looking for something that people would have actively used.
 The last release for the code of this project was almost a year ago, and
 it has not yet had a major release.

 But probably this is not the right forum to discuss htmlparser.

 Thanks for the inputs
 Raj

Hello Raj,

have a look at the CyberNeko HTML Parser:
http://www.apache.org/~andyc/neko/doc/html/

It has very nice features and straightforward APIs (DOM/SAX).


Christian

