Marc Saegesser wrote:

[snip]

>Now a question.  What is the status of the HttpClient 2.0 release?  The code
>is currently tagged alpha 1 but the RELEASE_PLAN_2_0.txt document hasn't
>been modified since October, 2001.  I ask because, depending on how imminent
>an actual release is, some of the changes that I'm proposing should probably
>be made on a separate branch.  
>
It's waiting on the committers being comfortable that it's ready. I've
been doing mainly maintenance on httpclient recently, so I'm not the
best one to decide when it's ready to go.

>Here's my story.  I have need of something like HttpClient in my product but
>I found that I had to extend it somewhat.  The extensions are very generic
>and I believe useful to others so I'd like to add to the HttpClient project.
>I also found several bugs that I fixed along the way.  I've documented these
>changes below.
>
Cool.

>I need to be able to use HttpClient (or a derivative) to navigate around the
>web pretty much like a regular user-agent.  I want to be able to access any
>site and any web application that I can reach with a reasonably modern
>browser.  HttpClient does a good job of implementing the client side of RFC
>2616.  Unfortunately, there are lots of sites and some very big name
>applications that do not implement the server side correctly.  Some sites
>(Yahoo! in particular) actually require a broken client implementation just
>to log in.  Here are two examples of things I've found so far.
>RFC2616/10.3.3 forbids changing a 302 redirected POST method into a GET
>method but acknowledges that most clients are broken in this regard (this is
>the failure that Yahoo! requires).  I have found sites that send relative
>URLs in the Location: header of a redirect (this violates RFC2616/14.30).
>Supporting these sites will require 'breaking' HttpClient.  I propose adding
>some kind of flag to put HttpClient into a 'compatibility mode' that
>implements this and any other required broken behaviour.
>
This sounds like a great idea.
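
To make the discussion concrete, here is a minimal sketch of how the two behaviours above could hang off a single mode switch. Everything in it (RedirectSketch, Mode, resolveLocation, methodAfter302) is a made-up name for illustration, not existing HttpClient API, and it only covers the two cases mentioned: a relative Location header and a POST answered with a 302.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of HttpClient.
final class RedirectSketch {

    enum Mode { STRICT, RELAXED }

    // Turns a Location header value into the URI to request next.
    // STRICT rejects relative values (RFC 2616 section 14.30 wants an
    // absolute URI); RELAXED resolves them against the original request URI.
    static URI resolveLocation(URI requestUri, String location, Mode mode) {
        URI target = URI.create(location);
        if (target.isAbsolute()) {
            return target;
        }
        if (mode == Mode.STRICT) {
            throw new IllegalArgumentException(
                "Relative Location not allowed in strict mode: " + location);
        }
        return requestUri.resolve(target);
    }

    // Picks the method used when following a 302.
    // RELAXED mimics common browsers and turns POST into GET;
    // STRICT keeps the original method unchanged.
    static String methodAfter302(String originalMethod, Mode mode) {
        if (mode == Mode.RELAXED && "POST".equals(originalMethod)) {
            return "GET";
        }
        return originalMethod;
    }
}
```

A caller would pick the mode once and pass it through, so strict remains the default and the relaxed behaviour is opt-in.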

>A second need is to provide a mechanism for getting user acknowledgment for
>certain actions.  For example, when redirecting from secure to non-secure
>sites.
>
>I am going to start working on these changes next but I want to discuss them
>with the HttpClient community to see if they feel they belong in the commons
>HttpClient project or if the project should be forked.
>
You've emailed the development community. I'm not sure many of the 
'user' community hang out here. My preference in this one is that it 
belongs in httpclient as a strict vs relaxed mode.

>Anyway, below is a description of the modified and new files.  The patches
>and new files are attached.
>
>Modified files...
>
>Cookie.java
>  -  Added support for old Netscape cookies.  The biggest difference is that
>the test for valid domains is different for Netscape cookies and RFC 2109
>cookies.
>  -  Added space after the semicolons separating the values.  This is
>required by sites that only implement the old Netscape cookie specification.
>  -  Added additional date format for expiration times.
>
>HttpConnection.java
>  -  The write*() and print*() methods now throw HttpRecoverableException.  
>
>HttpMethodBase.java
>  -  Added a new exception class, HttpRecoverableException.  There are some
>error conditions that we can try to recover from internally.  The biggest
>one I found was when a server unexpectedly closed the socket.  In this case
>we should just try to re-open the connection and try the request again.
>  -  Fixed a problem with the handling of 100 status codes.  If we get a 100
>after we've already sent the request body, RFC 2616 states that the response
>should be ignored.  The current implementation incorrectly broke out of
>the loop looking for the response.
>
This last one sounds like a bug that should be fixed anyway.
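
For what it's worth, the behaviour described for 100 responses can be sketched roughly like this. It is only an illustration under my own assumptions: ContinueSketch and readFinalStatus are invented names, and the parsing is far simpler than what HttpMethodBase actually has to do.

```java
import java.io.BufferedReader;
import java.io.IOException;

// Hypothetical illustration; not the actual HttpMethodBase code.
final class ContinueSketch {

    // Reads status lines until one arrives that is not "100 Continue" and
    // returns its status code. The header block of each skipped 100 response
    // is consumed; the headers of the final response are left unread.
    static int readFinalStatus(BufferedReader in) throws IOException {
        while (true) {
            String statusLine = in.readLine();
            if (statusLine == null) {
                throw new IOException("Connection closed before a final response arrived");
            }
            String[] parts = statusLine.split(" ", 3);
            if (parts.length < 2) {
                throw new IOException("Malformed status line: " + statusLine);
            }
            int status = Integer.parseInt(parts[1]);
            if (status != 100) {
                return status;
            }
            // Skip the (usually empty) header block of the 100 response,
            // then keep looping instead of breaking out.
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // discard
            }
        }
    }
}
```

The point is simply that a 100 is skipped and the loop keeps going, whether or not the request body has already been sent.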

>
>  -  Always recreate the cookie header.  A redirect response may have
>included additional cookies that we need to send with the redirected request
>and the path may have changed thus requiring a different cookie set.
>
Ditto.
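
A rough sketch of why the header has to be rebuilt per request: which cookies apply depends on the path being requested. The class and method names below are invented for the example, and it deliberately ignores domain, expiry and secure checks that the real Cookie class has to perform.

```java
import java.util.List;

// Hypothetical illustration; the real logic lives in Cookie.java.
final class CookieHeaderSketch {

    // Minimal stand-in for a stored cookie: name, value and path only.
    static final class SimpleCookie {
        final String name;
        final String value;
        final String path;

        SimpleCookie(String name, String value, String path) {
            this.name = name;
            this.value = value;
            this.path = path;
        }
    }

    // Builds the Cookie header value for one request path. Cookies whose
    // path is a prefix of the request path are included, separated by
    // "; " (semicolon plus space).
    static String buildCookieHeader(List<SimpleCookie> jar, String requestPath) {
        StringBuilder header = new StringBuilder();
        for (SimpleCookie c : jar) {
            if (requestPath.startsWith(c.path)) {
                if (header.length() > 0) {
                    header.append("; ");
                }
                header.append(c.name).append('=').append(c.value);
            }
        }
        return header.toString();
    }
}
```

Calling this again for the redirected path, with any newly received cookies already in the jar, gives the header for the follow-up request.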

>
>  -  Fixed readResponseBody implementation.  A new version of this function
>also takes an output stream.  This makes it easier for subclasses to use
>this implementation directly instead of having to re-implement it in order
>to support things like saving the response to a file.
>  -  Better support for responses that don't contain a Content-Length or
>Transfer-Encoding header.  By the specification, if these headers are both
>absent, the response has no body content.  In the real world what this means
>is that the server probably didn't know the length when the response was
>committed.  It just sends the response and closes the connection when the
>body is complete.  This assumption falls apart when we get a response that
>*can not* contain a body.  In this case, the simple implementation keeps
>reading looking for a response body and actually ends up reading the next
>response headers as the body.  I've added a list of responses that,
>according to the specification, can not ever have a body and fixed
>readResponseBody() to not read a body for these responses.
>
Again, sounds like another bug.
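
As I read RFC 2616 (section 4.3), the responses that can never carry a body are all 1xx, 204 and 304, plus any response to a HEAD request. A minimal sketch of the check, with BodySketch and mayHaveBody being names I made up rather than anything in the patch:

```java
// Hypothetical illustration of the "no body allowed" check.
final class BodySketch {

    // Returns false for responses that must not include a message body:
    // any response to HEAD, all 1xx responses, 204 and 304.
    static boolean mayHaveBody(String requestMethod, int status) {
        if ("HEAD".equals(requestMethod)) {
            return false;
        }
        if (status >= 100 && status < 200) {
            return false;
        }
        return status != 204 && status != 304;
    }
}
```

readResponseBody() would consult something like this before falling back to reading until the connection closes.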

>URIUtil.java
>  -  Added getPath() method.  This method returns the path portion of a
>given URL.  The only difference from java.net.URL.getPath() is that this
>method returns "/" if the URL's path is empty.
>
>GetMethod.java
>  -  Switched to new HttpMethodBase.readResponseBody().
>
>New files...
>
>HttpMultiClient.java
>  -  Replacement for HttpClient.  This class serves two purposes.  First it
>handles off-site redirects.  Second, it is intended to be used within a
>multithreaded application that, like a browser, may have more than one
>request outstanding to a given server and have requests going to more than
>one server.
>  -  Since HttpMultiClient, unlike HttpClient, simultaneously handles
>requests for multiple servers it can't use HttpMethod classes directly
>because they only include path information, not server information.  A new
>interface, HttpUrlMethod, is used that extends HttpMethod.
>
>HttpSharedState.java
>  -  A simple wrapper around HttpState to synchronize access to data.  This
>is required to support the multi-threaded nature of HttpMultiClient.
>
>HttpConnectionManager.java
>  -  This is actually the heart of HttpMultiClient.  It keeps track of
>available HttpConnections for host:port combinations.  The number of
>connections to a given host:port is limited (per RFC 2616) and if the limit
>is reached calls to getConnection() will block until a connection becomes
>available.
>
>HttpRecoverableException.java
>  -  Extends HttpException.  This exception is thrown when a potentially
>recoverable error has occurred (e.g. a socket connection was closed
>unexpectedly).  Higher level code can attempt to try the operation again.
>
>HttpUrlMethod.java
>  -  An interface that extends HttpMethod.  HttpUrlMethod classes are
>initialized with a fully qualified URL instead of just the path component.
>
>UrlGetMethod.java
>UrlPostMethod.java
>UrlDeleteMethod.java
>UrlOptionsMethod.java
>UrlPutMethod.java
>  -  These classes extend their respective method classes and implement
>HttpUrlMethod.
>
>Marc Saegesser 
>
These all sound like good additions. What I think we need to work out is 
how do we turn this on or off?
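
On the per-host limit described for HttpConnectionManager, here is a minimal sketch of the blocking behaviour as I understand it. It is not the attached code: ConnectionLimitSketch and its methods are invented for illustration, it tracks permits rather than real HttpConnection objects, and it assumes java.util.concurrent is available.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Semaphore;

// Hypothetical illustration of limiting connections per host:port.
final class ConnectionLimitSketch {

    private final int maxPerHost;
    private final ConcurrentMap<String, Semaphore> permits =
        new ConcurrentHashMap<String, Semaphore>();

    ConnectionLimitSketch(int maxPerHost) {
        this.maxPerHost = maxPerHost;
    }

    // One semaphore per host:port key, created lazily and shared by all threads.
    private Semaphore permitsFor(String host, int port) {
        String key = host + ":" + port;
        Semaphore existing = permits.get(key);
        if (existing == null) {
            Semaphore created = new Semaphore(maxPerHost, true);
            existing = permits.putIfAbsent(key, created);
            if (existing == null) {
                existing = created;
            }
        }
        return existing;
    }

    // Blocks until a connection slot for host:port is free.
    void acquire(String host, int port) throws InterruptedException {
        permitsFor(host, port).acquire();
    }

    // Non-blocking variant: returns false if the limit is already reached.
    boolean tryAcquire(String host, int port) {
        return permitsFor(host, port).tryAcquire();
    }

    // Hands the slot back so a waiting caller can proceed.
    void release(String host, int port) {
        permitsFor(host, port).release();
    }
}
```

Making the limit a constructor argument would also give us one obvious knob for switching the behaviour on or off.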

-- 
dIon Gillard, Multitask Consulting
http://www.multitask.com.au/developers



