Re: DO NOT REPLY [Bug 24352] - NLTM Proxy and basic host authorization
[EMAIL PROTECTED] wrote:

Oleg, I agree, our lack of auth/proxy tests is a continuous source of problems. One of our goals for 2.1 should be an effective method for testing all of the various combinations of proxy, authentication and SSL. Ideally it would be best to make this setup as simple as possible. Do you have any thoughts about how we can best accomplish this?

Mike

The various authentication methods should be tested against servlets in the Test-Webapp. As to proxies, we must implement a couple of tiny local servers running on different ports. Like:

TCP 81: Proxy
TCP 82: SSL Proxy

Those servers should be started and stopped by the test fixtures (setup / teardown). The servers must be configurable as to which authentication method they use. This will also ensure quality of the various authentication methods, as currently their test cases are somewhat minimalistic.

I'd love to hack up some code for the server side this week.

Odi

- To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
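A minimal sketch of the "tiny local server started and stopped by the test fixture" idea, not the code Odi went on to write. The class name TinyTestServer and its methods are hypothetical. It binds to an OS-chosen free port instead of the fixed ports 81/82 mentioned above, and it only sends a canned response (for example a 407 challenge) without reading the request, so it is a stand-in for a real authenticating proxy.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical test helper: answers every connection with one canned response. */
class TinyTestServer implements AutoCloseable {
    private final ServerSocket listener;

    TinyTestServer(String cannedResponse) {
        try {
            // Port 0 lets the OS pick a free port, so no fixed or privileged ports are needed.
            listener = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] payload = cannedResponse.getBytes(StandardCharsets.ISO_8859_1);
        Thread acceptor = new Thread(() -> serve(payload), "tiny-test-server");
        acceptor.setDaemon(true);
        acceptor.start();
    }

    private void serve(byte[] payload) {
        while (!listener.isClosed()) {
            try (Socket client = listener.accept()) {
                OutputStream out = client.getOutputStream();
                out.write(payload);
                out.flush();
            } catch (IOException e) {
                // accept() fails once close() is called; the loop condition then ends the thread.
            }
        }
    }

    int port() {
        return listener.getLocalPort();
    }

    /** Call from the fixture's teardown. */
    @Override
    public void close() {
        try {
            listener.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Test-side helper: connects to a local port and returns everything the server sends. */
    static String fetch(int port) {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port)) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            InputStream in = s.getInputStream();
            byte[] chunk = new byte[256];
            for (int n; (n = in.read(chunk)) != -1; ) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A fixture would create the server in setup, point the client's proxy settings at `port()`, and call `close()` in teardown.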
Does HttpClient decompress compressed HTTP transfers?
Hi,

If I send the right Accept-Encoding headers, the web server may answer with a gzip- or deflate-compressed stream. Does HttpClient decompress it? If yes, how can I turn that off?
Re: Does HttpClient decompress compressed HTTP transfers?
Sven Köhler wrote:

If I send the right Accept-Encoding headers, the web server may answer with a gzip- or deflate-compressed stream. Does HttpClient decompress it?

No, it doesn't.
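Since the client hands back the body as sent, a caller that asks for compressed content has to decode it. The following is a rough sketch using only java.util.zip; the class ContentDecoding and its method names are hypothetical, and how the Content-Encoding value and body stream are obtained from the response is left to the caller.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical helper: decodes a response body according to its Content-Encoding. */
class ContentDecoding {

    /** Wraps the raw body; contentEncoding is the header value and may be null. */
    static InputStream decode(InputStream rawBody, String contentEncoding) throws IOException {
        if (contentEncoding == null) {
            return rawBody;
        }
        String enc = contentEncoding.trim().toLowerCase(Locale.ROOT);
        if (enc.equals("gzip") || enc.equals("x-gzip")) {
            return new GZIPInputStream(rawBody);
        }
        if (enc.equals("deflate")) {
            // Expects the zlib-wrapped form; servers that send raw deflate need new Inflater(true).
            return new InflaterInputStream(rawBody);
        }
        return rawBody; // "identity" or an encoding this sketch does not handle
    }

    /** Demo helper: gzip-compresses a string the way a server might. */
    static byte[] gzip(String text) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
                gz.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: decodes a complete body held in memory. */
    static String decodeToString(byte[] body, String contentEncoding) {
        try (InputStream in = decode(new ByteArrayInputStream(body), contentEncoding)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```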
Re: DO NOT REPLY [Bug 24352] - NLTM Proxy and basic host authorization
My take is slightly different (and I wish I had time to implement it).

Start by virtualizing the access to the connection, and then, rather than having multiple servers, just have different implementations of a virtualized socket interface, for example. Then see to writing test cases that look something like this:

    # This marks what the server is supposed to receive, note that this is not
    # literally what is received, because headers might be sent in a different order
    # for example.
    GET /foo HTTP/1.1
    @Host: http://localhost:8080
    @Content-Length: 30
    @End-Headers
    # Note that on content lines, the CRLF (or just LF) should be
    # discarded. Instead, CRLF pairs should be explicitly encoded, perhaps
    # with %CRLF%? Content should (must?) allow substitutions, for example
    # multi-part boundaries. Perhaps do substitution with something like
    # %BOUNDARY%
    @Content: Content goes here
    # the following would wait for three seconds before sending more
    # content...
    @Wait: 3000
    @Content: Yet more content here...
    HTTP/1.1
    # Note, here since the test case knows the response it is supposed to
    # send, it can (by and large) simply send it.
    @Content: .

and so on.

I spend a lot of time working with XML, so I thought about doing some sort of test framework like the above using XML instead, which would get rid of some of the bizarre syntax that I suggest above, but I'm not sure whether that makes sense in the context of HttpClient.

My idea would be to take cases where we want to talk to actual servers, and replace them with test cases like the above, wherein we could mimic (or exactly duplicate) the odd behavior of various servers.

Hopefully this gives someone else an idea.

-Eric.
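To make the script idea a little more concrete, here is a small parser sketch for the line format suggested in Eric's message. Nothing like it exists in HttpClient; the class ScriptParser, the Step type and the "LINE" kind are invented for illustration, and only the %CRLF% substitution is handled (not %BOUNDARY% or the XML variant).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser for the "@Directive: value" test script format. */
class ScriptParser {

    static final class Step {
        final String kind;  // "LINE" for a raw request/status line, otherwise the directive name
        final String value;

        Step(String kind, String value) {
            this.kind = kind;
            this.value = value;
        }
    }

    static List<Step> parse(String script) {
        List<Step> steps = new ArrayList<>();
        for (String raw : script.split("\r?\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank lines and comments carry no instructions
            }
            if (line.startsWith("@")) {
                int colon = line.indexOf(':');
                String name = colon < 0 ? line.substring(1) : line.substring(1, colon);
                String value = colon < 0 ? "" : line.substring(colon + 1).trim();
                // Line breaks inside content are written explicitly as %CRLF%.
                steps.add(new Step(name.trim(), value.replace("%CRLF%", "\r\n")));
            } else {
                steps.add(new Step("LINE", line));
            }
        }
        return steps;
    }
}
```

A test driver could then walk the steps, comparing "LINE" and header steps against what the client actually sent and sleeping on "Wait".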
Re: DO NOT REPLY [Bug 24560] - HttpClient loops endlessly while trying to retrieve status line
[EMAIL PROTECTED] wrote:

--- Additional Comments From [EMAIL PROTECTED] 2003-11-17 14:38 ---
Odi, agreed - whitespace is a wrong term. CRLF is better.

CRLF or LF is correct ;-) (see RFC 2616, section 19.3).

Would you then prefer my first version of the patch, or do you have another idea how to handle this?

Sorry, I did not look at the patch. I just outlined my idea of how it should be, in my opinion.

Odi

PS: Please use the list for discussion and only post decisions to Bugzilla. Short lists of bug notes are quicker to read when fixing bugs.
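The point about tolerating CRLF or bare LF before the status line, without looping forever, can be sketched as follows. This is not the patch under discussion; StatusLineReader, readStatusLine and the maxBlankLines limit are hypothetical names, and BufferedReader.readLine is used because it accepts CRLF, LF and CR as line terminators.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical sketch: skip a bounded number of empty lines before the status line. */
class StatusLineReader {

    static String readStatusLine(BufferedReader in, int maxBlankLines) throws IOException {
        int skipped = 0;
        String line;
        while ((line = in.readLine()) != null) {
            if (!line.isEmpty()) {
                return line; // first non-empty line is taken as the status line
            }
            if (++skipped > maxBlankLines) {
                // The bound is what prevents an endless loop on a misbehaving server.
                throw new IOException("too many blank lines before status line");
            }
        }
        throw new IOException("connection closed before status line");
    }

    /** Demo helper for in-memory input. */
    static String fromString(String raw, int maxBlankLines) {
        try {
            return readStatusLine(new BufferedReader(new StringReader(raw)), maxBlankLines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```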
Re: DO NOT REPLY [Bug 24352] - NLTM Proxy and basic host authorization
Eric Johnson wrote:

My take is slightly different (and I wish I had time to implement it) Start by virtualizing the access to the connection, and then, rather than having multiple servers, just have different implementations of a virtualized socket interface, for example.

Eric, we can easily implement that by writing a special connection manager or socket factory. No need to introduce additional abstraction here. Socket is already a nice interface :-)

Odi
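One way to read "Socket is already a nice interface" is a Socket subclass that never touches the network. The sketch below is an assumption about how such a fake could look; ScriptedSocket and capturedRequest are made-up names, and wiring it into HttpClient through a custom socket factory or connection manager is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical fake socket: replays a canned response and records what the client writes. */
class ScriptedSocket extends Socket {
    private final ByteArrayInputStream cannedResponse;
    private final ByteArrayOutputStream captured = new ByteArrayOutputStream();

    ScriptedSocket(String response) {
        super(); // unconnected socket; no network I/O happens
        this.cannedResponse =
                new ByteArrayInputStream(response.getBytes(StandardCharsets.ISO_8859_1));
    }

    // Covariant return types keep the in-memory streams free of checked exceptions for callers.
    @Override
    public ByteArrayInputStream getInputStream() {
        return cannedResponse;
    }

    @Override
    public ByteArrayOutputStream getOutputStream() {
        return captured;
    }

    /** Everything the client has written so far, for assertions in a test. */
    String capturedRequest() {
        return new String(captured.toByteArray(), StandardCharsets.ISO_8859_1);
    }
}
```

A factory used only in tests would hand out such sockets, so the code under test runs unchanged while the "server" behaviour comes from a string.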
Re: Does HttpClient decompress compressed HTTP transfers?
It'd be rather easy to wrap the streams in a DeflaterOutputStream or an InflaterInputStream. Of course, due to limitations in Java's deflate compression, one must extend DeflaterOutputStream to allow true stream deflation. The problem with the current implementation is that there is no way to flush a partially deflated stream -- deflate waits until it reaches an optimal spot to actually perform the deflation and do the flush. You can see http://developer.java.sun.com/developer/bugParade/bugs/4255743.html and http://developer.java.sun.com/developer/bugParade/bugs/4206909.html .

In LimeWire we used a workaround listed on those pages: a class 'CompressingOutputStream' that extends DeflaterOutputStream, with the following as the flush method:

    private static final byte[] EMPTYBYTEARRAY = new byte[0];

    /**
     * Ensure all remaining data will be output.
     */
    public void flush() throws IOException {
        if (def.finished())
            return;

        /**
         * Now this is tricky: We force the Deflater to flush
         * its data by switching compression level.
         * As yet, a perplexingly simple workaround for
         * http://developer.java.sun.com/developer/bugParade/bugs/4255743.html
         */
        def.setInput(EMPTYBYTEARRAY, 0, 0);

        def.setLevel(Deflater.NO_COMPRESSION);
        deflate();

        def.setLevel(Deflater.DEFAULT_COMPRESSION);
        deflate();

        super.flush();
    }

The other thing you have to be careful of when using DeflaterOutputStream or InflaterInputStream is spurious NullPointerExceptions from native code if the connection happens to close while a deflate/inflate is being performed. We worked around that particular problem by extending the 'read' of the InflaterInputStream and the 'deflate' call of the DeflaterOutputStream to catch an NPE and rethrow an IOException.

Thanks,
Sam

otisg wrote:

I'm not too familiar with HttpClient source code any more, but it seems to me that this should be easy to add. Please correct me if I am wrong. I will be using HttpClient to pull large amounts of data from the web in a few months, and allowing compressed content may translate to less bandwidth and a lower hosting bill :)

Otis
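The NullPointerException workaround described above is only summarized in the message, so here is a guess at its shape for the read side. This is not LimeWire's code; GuardedInflaterInputStream and the roundTrip helper are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical sketch: turn a spurious NPE during inflation into an IOException. */
class GuardedInflaterInputStream extends InflaterInputStream {

    GuardedInflaterInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        try {
            return super.read(b, off, len);
        } catch (NullPointerException e) {
            // Callers already handle IOException for dropped connections.
            throw new IOException("inflate failed, stream probably closed mid-read", e);
        }
    }

    /** Demo helper: deflates a string in memory and reads it back through the guard. */
    static String roundTrip(String text) {
        try {
            ByteArrayOutputStream compressed = new ByteArrayOutputStream();
            try (DeflaterOutputStream out = new DeflaterOutputStream(compressed)) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            try (InputStream in = new GuardedInflaterInputStream(
                    new ByteArrayInputStream(compressed.toByteArray()))) {
                byte[] chunk = new byte[1024];
                for (int n; (n = in.read(chunk)) != -1; ) {
                    plain.write(chunk, 0, n);
                }
            }
            return new String(plain.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only the three-argument read needs overriding, since the other read variants of InflaterInputStream delegate to it.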
Re: Keeping Connections Alive
Hi Mike,

Using a listener seems like an interesting idea, although I don't quite see the rationale behind it. MTCM (well, CMs in general really) seem to be designed specifically for this purpose: to manage HTTP connections. Having an observable interface to HttpConnection would require adding new methods that notify HttpConnection when it is assigned/active, as currently the 'releaseConnection' method basically delegates the call to the HttpConnectionManager that is managing it, leaving HttpConnection as stateless as possible. Perhaps the HttpConnectionManager can actually have the task of informing the observers of the HttpConnection about its status -- that would remove any need to add extraneous methods and state to HttpConnection (other than a way of setting/getting listeners). Of course, it might actually be a good idea to add this state information to HttpConnection for other reasons (that I can't think of at the moment).

Thanks,
Sam

Michael Becke wrote:

Hi Sam,

I think this is definitely something that could be a useful addition to HttpClient. Though it would be possible to add this functionality to the MultiThreadedHttpConnectionManager (MTCM), I think I would like to keep it separate. Partially because I think MTCM is getting a little too complicated, but also because I think it could be used outside of the MTCM.

In general I think we want a pluggable method for tracking the lifecycle of a connection. Something like an HttpConnection listener that is informed when a connection is created, assigned, released, etc. This listener could then handle closing the connections after some period of idleness.

Any suggestions or contributions that you may have will be appreciated.

Thanks,
Mike

On Nov 14, 2003, at 12:58 PM, Sam Berlin wrote:

Thanks for the reply, Mike. Is there any interest in a feature that would close connections that have been unused for a certain amount of time? I imagine the easiest way to implement this would be to just add some settable parameters (set/getCloseConnectionTime) to MultiThreadedHttpConnectionManager along with another Thread that will occasionally iterate through the list of 'freeConnections' in the 'connectionPool', checking whether the configured amount of time has elapsed since the connection was last marked as free. HttpConnection (or HttpConnectionAdapter, since this feature would only be in MultiThreadedHttpConnectionManager) could have a new value added to it that stores the most recent time it was released. Note that this would all rely on the user correctly calling 'releaseConnection', but that's essentially a requirement already anyway.

If people are interested in such a feature, I would be more than willing to write up such a patch (as I will probably be doing it for the version LimeWire uses anyway).

Thanks,
Sam

Michael Becke wrote:

Hi Sam,

HttpClient does not do any active connection reclaiming, except when the resources are reused. In the case of the SimpleHttpConnectionManager the connection is never closed/reopened unless it is required for a new method execution. The case for MultiThreadedHttpConnectionManager is similar though a little more complicated. It keeps a pool of connections with a per-host and total connection limit. Again these connections are never closed until a request for a new connection warrants it.

Mike

On Nov 13, 2003, at 4:10 PM, Sam Berlin wrote:

Hi All,

I'd like to clarify a point about HttpClient that I do not fully understand. How/when does the actual connection to a server close? I understand that MultiThreadedHttpConnectionManager (and possibly SimpleHttpConnectionManager as well) will keep the connection alive and reuse it for subsequent HTTP requests. Is there a way to set a limit on how long the connection should be kept alive while waiting for a subsequent request to reuse that connection?

Thanks,
Sam
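The bookkeeping behind Sam's proposal (remember when each connection was released, periodically collect the ones idle for too long) can be sketched independently of HttpClient. IdleConnectionTracker and its methods are hypothetical; the connection type is left generic, the clock is passed in so the logic can be tested, and the background thread that would call reap() and close the results is omitted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: tracks release times and finds connections idle for too long. */
class IdleConnectionTracker<C> {
    private final Map<C, Long> releasedAt = new LinkedHashMap<>();

    /** Called when a connection goes back to the free pool. */
    synchronized void released(C connection, long nowMillis) {
        releasedAt.put(connection, nowMillis);
    }

    /** Called when a free connection is handed out again. */
    synchronized void acquired(C connection) {
        releasedAt.remove(connection);
    }

    /** Removes and returns connections idle for at least idleMillis; the caller closes them. */
    synchronized List<C> reap(long nowMillis, long idleMillis) {
        List<C> expired = new ArrayList<>();
        Iterator<Map.Entry<C, Long>> it = releasedAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<C, Long> entry = it.next();
            if (nowMillis - entry.getValue() >= idleMillis) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }
}
```

As the message notes, this only works if callers reliably release connections, because an unreleased connection never gets a release time.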
Re: DO NOT REPLY [Bug 24560] - HttpClient loops endlessly while trying to retrieve status line
Odi,

That would be REALLY cool! A simple authenticating proxy (or a proxy that could effectively 'fake' popular authentication schemes) would be a very much appreciated contribution.

By the way, have a look at Christian Kohlschütter's SimpleHttpServer:
http://nagoya.apache.org/bugzilla/showattachment.cgi?attach_id=9066

I think that can be a good starting point for a better framework than SimpleHttpConnection.

Oleg

On Mon, 2003-11-17 at 15:00, [EMAIL PROTECTED] wrote:

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24560. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24560

HttpClient loops endlessly while trying to retrieve status line

--- Additional Comments From [EMAIL PROTECTED] 2003-11-17 14:00 ---
Oleg, thank you for reviewing my patch. I think the reviewed version is OK in general (AFAICS from the diff - I haven't applied it yet).

Just a few comments (new ideas?) by me:

In my opinion, strict mode should be very pedantic about standards compliance. HttpClient should notify the user wherever a problematic, non-standards situation is detected. Of course, trailing whitespace should be silently ignored, but any other characters should be regarded as unexpected (is there a section in RFC 2616 for that? I haven't found it yet).

The question is: Is (non-whitespace) garbage following the response (caused by a wrong Content-Length header, for example) unexpected enough? In your version of the patch, there is no chance to get informed about such a situation - and in 'lenient' mode, the detection is disabled completely (did you check the TestBadContentLength testcase? does it pass?).

Regarding the ProtocolException/ResponseConsumedWatcher thing, of course, it _is_ a workaround to get that Exception thrown to the caller. However, I would appreciate it if the user _would_ receive that Exception (somehow). I even think it is not such a bad idea to keep that in responseConsumed(), just to inform every HttpClient component that there was an error while reading the response (the interface is not public, anyway). Instead of throwing an Exception, we could also have a boolean without/with errors return value, of course...

In short, I would prefer the following behaviour:

- For any mode: If garbage is detected, read until the end of the garbage, until a non-whitespace character is received, or until a limit of N bytes is reached; N is something like 10 (should be user-definable).
- For any mode, close the connection (the connection is definitely unreliable).
- For strict mode, throw a ProtocolException if anything else but whitespace has been received.
- (Optionally) introduce an extra pedantic mode (inherits strict mode) and throw a ProtocolException even if N bytes of _whitespace_ garbage have been received.

Best regards,
Christian
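The four rules Christian proposes boil down to classifying whatever is left on the connection after the response. A rough sketch, assuming the leftover bytes have already been drained into a buffer without blocking; TrailingGarbage, Kind and inspect are hypothetical names, and the decision to log or throw is left to the caller.

```java
/** Hypothetical sketch: classify bytes found after a complete response. */
class TrailingGarbage {

    enum Kind { NONE, WHITESPACE_ONLY, NON_WHITESPACE }

    /** Looks at no more than maxBytes of the leftover data (the "N" in the proposal). */
    static Kind inspect(byte[] leftover, int maxBytes) {
        Kind kind = Kind.NONE;
        int limit = Math.min(leftover.length, maxBytes);
        for (int i = 0; i < limit; i++) {
            byte b = leftover[i];
            if (b == '\r' || b == '\n' || b == ' ' || b == '\t') {
                kind = Kind.WHITESPACE_ONLY;
            } else {
                return Kind.NON_WHITESPACE; // stop at the first non-whitespace byte
            }
        }
        return kind;
    }
}
```

Under the proposal, anything other than NONE means the connection gets closed; strict mode would additionally raise a ProtocolException for NON_WHITESPACE, and the optional pedantic mode also for WHITESPACE_ONLY.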
Re: DO NOT REPLY [Bug 24560] - HttpClient loops endlessly while trying to retrieve status line
Christian, see my comments in-line.

In my opinion, strict mode should be very pedantic about standards compliance. HttpClient should notify the user wherever a problematic, non-standards situation is detected.

I do not mind being over-pedantic, but not at the expense of code quality. My personal impression was that throwing a protocol exception from responseConsumed required too much ugly plumbing. I believe that a big fat log warning should be enough. A protocol exception thrown from responseConsumed would not make HttpClient more reliable (a dirty connection would be dropped anyway), it would just make it, well, more pedantic. That is it.

Of course, trailing whitespace should be silently ignored, but any other characters should be regarded as unexpected (is there a section in RFC 2616 for that? I haven't found it yet).

This is what the RFC has to say:

"In the interest of robustness, servers SHOULD ignore any empty line(s) received where a Request-Line is expected. In other words, if the server is reading the protocol stream at the beginning of a message and receives a CRLF first, it should ignore the CRLF."

http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.1

I have not found any mention of unexpected content in the RFC, so this is another reason why I would be a bit cautious about throwing a protocol exception. It would suffice to spit out a warning, drop the connection and move on.

Folks, any strong opinions on this issue?
DO NOT REPLY [Bug 24671] - Basic Authentification fails with non-ASCII username/password characters
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24671

Basic Authentification fails with non-ASCII username/password characters

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |

--- Additional Comments From [EMAIL PROTECTED] 2003-11-17 21:38 ---
Mike, I set up Apache HTTP Server 2.0.48 on Win2K (Prof), enabled digest authentication for a directory, and created a user account with a password containing German umlauts. I hit the URL with Mozilla Firebird 0.7 and attempted to authenticate using the password. It did not work.

I may know why. If RFC 2617 is to be strictly adhered to, only ASCII characters in passwords should be allowed for basic and digest authentication.

RFC 2617, Section 2: Basic Authentication Scheme

    basic-credentials = base64-user-pass
    base64-user-pass  = <base64 [4] encoding of user-pass,
                         except not limited to 76 char/line>
    user-pass   = userid ":" password
    userid      = *<TEXT excluding ":">
    password    = *TEXT

RFC 822 defines TEXT as

    text        =  <any CHAR, including bare    ; => atoms, specials,
                    CR & bare LF, but NOT       ;  comments and
                    including CRLF>             ;  quoted-strings are
                                                ;  NOT recognized.

RFC 822 defines CHAR as

                                                ; (  Octal, Decimal.)
    CHAR        =  <any ASCII character>        ; (  0-177,  0.-127.)

However, I do think that in this instance the spec is too restrictive and we should be using ISO-8859-1 instead of ASCII. So, I reopen the bug. Sorry for having closed it prematurely.
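The difference between the two charsets shows up directly in the bytes that get Base64-encoded. A minimal illustration using java.util.Base64 (Java 8 and later, so not what HttpClient itself used at the time); BasicCredentials and headerValue are hypothetical names.

```java
import java.nio.charset.Charset;
import java.util.Base64;

/** Hypothetical sketch: build a Basic Authorization header value with an explicit charset. */
class BasicCredentials {

    static String headerValue(String user, String password, Charset charset) {
        // With US-ASCII, unmappable characters such as umlauts are replaced by '?',
        // so the server receives a different password than the user typed.
        byte[] raw = (user + ":" + password).getBytes(charset);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }
}
```

Encoding `userid:password` with ISO-8859-1 keeps the umlauts as single bytes, which is the change argued for above; whether a given server decodes them the same way remains a server-side matter.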
DO NOT REPLY [Bug 24671] - Basic Authentification fails with non-ASCII username/password characters
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24671

Basic Authentification fails with non-ASCII username/password characters

--- Additional Comments From [EMAIL PROTECTED] 2003-11-18 00:19 ---
Oleg,

No worries about closing the bug. I was a little slow in getting to it.

I agree. I think we should be using 8859-1 instead of ASCII. The following section from RFC 2616 seems to imply that TEXT should be 8859-1, though it is a little vague:

    The TEXT rule is only used for descriptive field contents and values
    that are not intended to be interpreted by the message parser. Words
    of *TEXT MAY contain characters from character sets other than
    ISO-8859-1 [22] only when encoded according to the rules of
    RFC 2047 [14].

        TEXT = <any OCTET except CTLs, but including LWS>

Either way I think digest needs to be fixed. I will create a patch.

Mike
Re: DO NOT REPLY [Bug 24352] - NLTM Proxy and basic host authorization
Odi, Eric,

I think a combination of these techniques would be great. One level to handle the socket management (as Odi outlined) and another to handle the content creation/validation (Eric's idea). These two methods in tandem should be sufficient to mimic any combination of servers/configurations.

Mike
Re: DO NOT REPLY [Bug 24560] - HttpClient loops endlessly while trying to retrieve status line
I have not found any mention of unexpected content in the RFC, so this is another reason why I would be a bit cautious about throwing a protocol exception. It would suffice to spit out a warning, drop the connection and move on. Folks, any strong opinions on this issue?

I would prefer a warning to an exception.

Mike
Re: Keeping Connections Alive
Hi Sam,

My thought, probably not well articulated, was to add the ability to observe and interact with the status of HttpConnections. This would allow classes other than the connection managers to play a hand in the connection management. Another possibility would be to have an HttpConnectionFactory that is responsible for creating connections and is also notified by the connection manager of connection state changes.

In general I would like to add plugin support to the connection management process. I think composition of functionality using the connection managers and various plugins will allow for greater flexibility.

Mike
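A listener-based plugin point of the kind Mike describes might look roughly like this. None of these types exist in HttpClient; ConnectionListener, ListenerSupport and RecordingListener are names made up for the sketch, and the connection type is generic so the example stays self-contained. Following Sam's suggestion, the connection manager (not the connection) would own the ListenerSupport and fire the events.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical callback interface for connection lifecycle events. */
interface ConnectionListener<C> {
    void connectionCreated(C connection);
    void connectionAssigned(C connection);
    void connectionReleased(C connection);
}

/** Hypothetical helper a connection manager could delegate to. */
class ListenerSupport<C> {
    // Copy-on-write keeps iteration safe if listeners are added from other threads.
    private final List<ConnectionListener<C>> listeners = new CopyOnWriteArrayList<>();

    void addListener(ConnectionListener<C> listener) {
        listeners.add(listener);
    }

    void fireCreated(C connection) {
        for (ConnectionListener<C> l : listeners) l.connectionCreated(connection);
    }

    void fireAssigned(C connection) {
        for (ConnectionListener<C> l : listeners) l.connectionAssigned(connection);
    }

    void fireReleased(C connection) {
        for (ConnectionListener<C> l : listeners) l.connectionReleased(connection);
    }
}

/** Example plugin: records events; an idle-timeout plugin would store release times instead. */
class RecordingListener<C> implements ConnectionListener<C> {
    final List<String> events = new ArrayList<>();

    @Override public void connectionCreated(C connection) { events.add("created:" + connection); }
    @Override public void connectionAssigned(C connection) { events.add("assigned:" + connection); }
    @Override public void connectionReleased(C connection) { events.add("released:" + connection); }
}
```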