On Sun, 2006-08-20 at 19:11 +0200, Roland Weber wrote: 
> Hello all,
> 
> I've been waiting for a quiet week-end to collect my thoughts and
> questions about proxy support in HttpCore, or HttpComponents in
> general. Since a quiet week-end doesn't seem to come my way, I've
> decided to write down what I have in mind right now, to get the
> discussion going.
> 

Hi Roland,

Many thanks for bringing this up. I agree proxy support in HttpCore
needs more work and at this point is likely to be broken. I have not
revisited the client related classes in HttpCore for quite a while as I
was mostly preoccupied with the server side stuff.

> We have two interfaces HttpClientConnection and HttpProxyConnection,
> along with a default implementation for each. Proxy is derived from
> Client, both in the interface and default implementation.
> I think it is a design flaw to separate plain and proxy connections.
> Consider connection management: we want to create and manage a
> number of connections, not knowing whether they'll be used through
> a proxy or not. The proxy connection is not layered on top of a
> plain connection, the class is derived from it. So we always have
> to create proxy connections for the connection manager in order to
> allow proxying. Then those proxy connections are used either as a
> plain or as a proxy connection. Being proxied or not is a runtime
> property, and can not be reflected in the class hierarchy.
> 

I am not sure I agree with that. From the RFC 2616 standpoint there is
no difference between a proxied and a plain client HTTP connection. The
sole difference is the request-URI, which must be absolute for requests
sent over a proxied connection. The connection itself should not be
aware of this distinction.
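
To illustrate the point about the request-URI, here is a minimal sketch
of how only the request line differs between the two cases. The class
RequestLineSketch and its method are hypothetical names made up for this
example; they are not part of HttpCore.

```java
import java.net.URI;

// Hypothetical helper, not part of HttpCore: shows that only the request
// line changes when a request goes through a (non-tunnelling) proxy.
final class RequestLineSketch {

    /** Builds an HTTP/1.1 request line for the given method and target. */
    static String requestLine(String method, URI target, boolean viaProxy) {
        String requestUri;
        if (viaProxy) {
            // RFC 2616, section 5.1.2: absoluteURI is required towards a proxy.
            requestUri = target.toString();
        } else {
            // Towards the origin server, send only the path and query.
            String path = target.getRawPath();
            if (path == null || path.isEmpty()) {
                path = "/";
            }
            String query = target.getRawQuery();
            requestUri = (query == null) ? path : path + "?" + query;
        }
        return method + " " + requestUri + " HTTP/1.1";
    }

    public static void main(String[] args) {
        URI target = URI.create("http://www.example.org/index.html?lang=en");
        System.out.println(requestLine("GET", target, false));
        System.out.println(requestLine("GET", target, true));
    }
}
```

The connection that writes these bytes does not need to know which of
the two forms it was handed, which is the argument made above.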

The special case is not connection proxying but rather connection
tunneling. My first knee-jerk reaction was to put all the tunneling code
into a separate super class

> Another problem is that proxying is not transparent. There is a
> check in HttpRequestExecutor.doEstablishConnection whether the
> connection is pointing to the correct target host. It might have
> been me who introduced that method as part of some refactoring.
> The way it is used, the connection always points to the correct
> host because the invocation argument is taken directly from the
> connection. But in general, the decision whether a connection is
> pointing appropriately can not be made without knowing whether
> it is proxied. If it is connected to a proxy and using a tunnel,
> then both proxy and ultimate target host have to be the intended
> ones. If it is proxied without (real) tunnelling, it can be kept
> alive even if the next request is going to a different target host
> but through the same proxy.

Presently this is one of the deficiencies of HttpClient 3.x (MTHCM, the
MultiThreadedHttpConnectionManager, to be exact). We definitely should
try to make HttpClient 4.0 a bit smarter about pooling proxied
connections.

> I think the general idea of HttpCore was that a connection would
> be established to the appropriate host or proxy prior to sending
> the request, so that HttpCore doesn't have to bother. But that
> idea is currently broken in HttpRequestExecutor. We either have
> to repair the request executor (and adapt the async processor to
> those changes), or we need a different idea for proxy handling
> in HttpCore.
> Proxy handling is also not transparent to a connection manager,
> for the keep-alive reason mentioned above. If there is a connection
> open to the proxy, and not tunnelled to a specific target host
> (and not associated with some inappropriate authentication state),
> then that connection can be re-used for a different target host.
> I believe that the connection itself would be a good place to
> implement the logic for deciding whether it is pointing correctly.
> 

I rather lean toward keeping this kind of logic in a connection manager,
but am open to considering alternative approaches. In my opinion the job
of a connection is to shove around HTTP messages, whereas the decision
about re-usability of connections should be left up to a connection
manager.
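
As a rough sketch of what such a re-usability check in a connection
manager might look like, following the rules Roland describes above:
a direct or tunnelled connection must match the target exactly, while a
plain proxied connection can serve any target behind the same proxy.
Route and ReuseCheck are hypothetical names invented for this example,
and authentication state is deliberately left out.

```java
import java.util.Objects;

// Hypothetical value type for illustration; nothing like it exists in
// HttpCore at this point.
final class Route {
    final String targetHost;  // "host:port" of the origin server
    final String proxyHost;   // "host:port" of the proxy, or null if direct
    final boolean tunnelled;  // true once a CONNECT tunnel is established

    Route(String targetHost, String proxyHost, boolean tunnelled) {
        if (tunnelled && proxyHost == null) {
            throw new IllegalArgumentException("a tunnel requires a proxy");
        }
        this.targetHost = Objects.requireNonNull(targetHost);
        this.proxyHost = proxyHost;
        this.tunnelled = tunnelled;
    }
}

// Hypothetical decision logic that would live in the connection manager,
// not in the connection itself.
final class ReuseCheck {

    /** Can a connection opened along 'open' serve a request for 'wanted'? */
    static boolean canReuse(Route open, Route wanted) {
        // The first hop (proxy or none) must always be the same.
        if (!Objects.equals(open.proxyHost, wanted.proxyHost)) {
            return false;
        }
        // Direct or tunnelled: the connection is bound to one target.
        if (open.proxyHost == null || open.tunnelled || wanted.tunnelled) {
            return open.tunnelled == wanted.tunnelled
                    && open.targetHost.equals(wanted.targetHost);
        }
        // Plain proxied: any target through the same proxy will do.
        return true;
    }
}
```

With this shape the connection stays ignorant of routing, and the
manager only needs the route a connection was opened for.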

> A minor detail is that only the HttpProxyConnection has an
> isSecure() method. A non-proxied connection can be secure, too :-)
> Another minor detail is the ProxyHost class, which is not used
> anywhere in the API, but only by the DefaultHttpProxyConnection.
> I'm not sure whether it adds any value.
> 

Let's fix it.

> Finally, I am wondering where we'll plug in logic for proxy
> selection. Before digging deeper into this, I thought we could
> have a request interceptor that picks a proxy for the request.
> But request interceptors are executed only after the connection
> is available, and we need the information about the proxy before
> requesting the connection from a connection manager. Also, there
> are problems with proxy requests having a different request line
> from non-proxy requests, which would be ugly to deal with in a
> request interceptor.
> Proxy selection will also affect HttpAsync. While it is possible
> to design HttpCore so that it does not establish connections and
> therefore does not have to know about a proxy, HttpAsync provides
> a different interface. It is the responsibility of an HttpDispatcher
> to establish connections, so those will need to know which proxy to
> use. I'd like to discuss this now, so we can agree on a common
> interface for HttpAsync and HttpClient.
> 

My suggestion would be to port MTHCM to the new API, hack up a very
simple HttpClient prototype (no cookies, no authentication, no
redirects) and see if we end up with some generic aspects that may prove
useful in HttpAsync or HttpCore. It may be a little easier to observe
commonalities rather than trying to 'guess' them.
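
For such a prototype, one possible shape for the proxy selection step
Roland asks about is to pick the proxy before asking the connection
manager for a connection. The sketch below uses the JDK's
java.net.ProxySelector only as a stand-in; nothing in this thread
commits to it. ProxyChoiceSketch and FixedProxySelector are hypothetical
names for this example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.util.Collections;
import java.util.List;

// Hypothetical selector that always offers the same proxy; handy for a
// prototype or a test.
final class FixedProxySelector extends ProxySelector {
    private final Proxy proxy;

    FixedProxySelector(Proxy proxy) {
        this.proxy = proxy;
    }

    @Override
    public List<Proxy> select(URI uri) {
        return Collections.singletonList(proxy);
    }

    @Override
    public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
        // Nothing to record in this sketch.
    }
}

// Hypothetical helper: decides on a proxy up front, so the result can be
// passed along when a connection is requested.
final class ProxyChoiceSketch {

    /** Returns the first HTTP proxy offered for the target, else NO_PROXY. */
    static Proxy chooseProxy(ProxySelector selector, URI target) {
        for (Proxy candidate : selector.select(target)) {
            if (candidate.type() == Proxy.Type.HTTP) {
                return candidate;
            }
        }
        return Proxy.NO_PROXY;
    }

    public static void main(String[] args) {
        Proxy proxy = new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved("proxy.example.org", 8080));
        URI target = URI.create("http://www.example.org/");
        System.out.println(chooseProxy(new FixedProxySelector(proxy), target));
    }
}
```

The chosen proxy would then become part of whatever is handed to the
connection manager or the HttpDispatcher, rather than being decided in a
request interceptor after the connection already exists.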

Oleg

> 
> OK, I think that's about all that has been bugging me about our
> proxy support those last few months. Let me know what you think.
> 
> cheers,
>   Roland
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 

