Andrew,

The response produced by the web server is clearly wrong and violates the HTTP spec. I personally doubt that HttpClient can be expected to provide a workaround for every single problem caused by every single crappy web server out there. What if the server sent two or more extraneous bytes?
I do understand that developers of HTTP spiders need to deal with broken or non-compliant web servers. Just recently we had a lengthy and at times very animated discussion with another developer of HTTP crawler software regarding a somewhat similar problem in the HttpParser class. Still, I strongly disagree that it is feasible for the stock version of HttpClient to work around all the 'exotic' protocol violations. In my opinion the problem is better addressed by a generic plug-in mechanism that would allow custom implementations of HttpParser to enhance HttpClient's ability to recover from application-specific HTTP protocol violations.

There is already a feature request filed. Have a look and feel free to contribute your ideas if you are in agreement with the suggested approach:

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25468

Oleg

On Fri, 2004-01-09 at 22:04, Andrew W. Buchanan wrote:
> I've been encountering a frequent problem with the 2.0-rc2 release in the
> spider I'm working on where the HttpParser throws an exception when an extra
> byte is returned from a web server. When this exception is thrown, none of
> the Headers are returned even though they all contained valid data.
>
> An example packet from Ethereal is attached.
>
> As you can see, there is an extraneous byte (0x00) being sent that is causing
> the problem.
>
> I've attached a quick and dirty patch to fix this. There was already a test
> looking for a length < 1 in order to skip processing. Rather than
> specifically looking for this case, I simply changed the check to look for a
> length < 2 on the grounds that there could never be a valid header of one
> character anyway. The patch is against HEAD, but would probably apply to
> 2.0-rc2 release cleanly.
>
> Let me know what you think.
>
> Let me know if this is the wrong place to post this!
>
> Andrew Buchanan
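
For illustration only, here is a rough sketch of how the two ideas in this thread might fit together: a length-based check in the spirit of the patch, sitting behind a pluggable filter in the spirit of the feature request. None of this is HttpClient code. The class LenientHeaderSketch, the LineFilter interface and the parseHeaders method are hypothetical names made up for the example, and skipping the stray line (rather than whatever the actual patch does with it) is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; not part of HttpClient or its HttpParser.
class LenientHeaderSketch {

    /** Hypothetical plug-in point: decides whether a raw header line is junk to be skipped. */
    public interface LineFilter {
        boolean skip(String line);
    }

    /** Strict behaviour: nothing is skipped, so a malformed line is an error. */
    public static final LineFilter STRICT = line -> false;

    /** Lenient behaviour in the spirit of the patch: ignore lines shorter than 2 characters. */
    public static final LineFilter SKIP_SHORT = line -> line.length() < 2;

    /** Parses "Name: value" lines up to the first empty line into {name, value} pairs. */
    public static List<String[]> parseHeaders(List<String> lines, LineFilter filter) {
        List<String[]> headers = new ArrayList<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                break; // an empty line ends the header section
            }
            if (filter.skip(line)) {
                continue; // the application-specific policy says this line is junk
            }
            int colon = line.indexOf(':');
            if (colon < 1) {
                throw new IllegalArgumentException("Unable to parse header: " + line);
            }
            headers.add(new String[] {
                line.substring(0, colon).trim(),
                line.substring(colon + 1).trim()
            });
        }
        return headers;
    }

    public static void main(String[] args) {
        // A header section with a stray line holding a single 0x00 byte.
        List<String> raw = Arrays.asList(
            "Content-Type: text/html", "\0", "Content-Length: 42", "");
        for (String[] h : parseHeaders(raw, SKIP_SHORT)) {
            System.out.println(h[0] + " = " + h[1]);
        }
    }
}
```

With STRICT the same input is rejected with an exception, which is roughly the behaviour reported above. Note that SKIP_SHORT does nothing for a server that sends two or more extraneous bytes; a crawler that runs into that would have to plug in a different filter, which is the argument for a plug-in mechanism over a hard-coded check.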