Gil,
The problem is that until Java 1.4 there was simply no way to enforce a
connect timeout. HttpClient only 'mimics' a connect timeout, at the
expense of having a controller thread watch over the process of socket
initialization. The controller thread attempts to instantiate a socket
for a given period of time, and if that fails, the controller thread
simply drops the socket on the floor, leaving it up to the garbage
collector to clean up the mess. All of this is very expensive in terms
of resource consumption, memory allocation and garbage collection.
Knowing about this problem, we have put a lot of effort into trying to
reuse connections as much as possible. This approach works only if you
keep HttpClient along with its connection manager alive. Creating an
HttpClient instance per request completely defeats connection reuse and
results in excessive creation and garbage collection of objects.
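
To make the cost concrete, here is a rough sketch of the controller
thread pattern using nothing but the JDK. It is not HttpClient's actual
code; the class name WatchdogConnect and its connect() helper are made
up for illustration. A helper thread performs the blocking socket
creation, and the caller waits for it with a bounded join().

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration only; not part of HttpClient.
public class WatchdogConnect {

    /**
     * Creates the socket on a helper thread and waits at most timeoutMs
     * for it to finish. Note that join(0) would wait forever.
     */
    static Socket connect(final String host, final int port, long timeoutMs)
            throws IOException, InterruptedException {
        final Socket[] result = new Socket[1];
        final IOException[] failure = new IOException[1];

        Thread worker = new Thread(new Runnable() {
            public void run() {
                try {
                    // Blocking call with no timeout of its own.
                    result[0] = new Socket(host, port);
                } catch (IOException e) {
                    failure[0] = e;
                }
            }
        });
        worker.setDaemon(true);
        worker.start();
        worker.join(timeoutMs); // give up waiting after timeoutMs

        if (worker.isAlive()) {
            // The helper thread, and any socket it eventually opens, are
            // simply abandoned here and left to the garbage collector.
            throw new IOException("connect timed out after " + timeoutMs + " ms");
        }
        if (failure[0] != null) {
            throw failure[0];
        }
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        // Local listener so the example does not need the network.
        ServerSocket server = new ServerSocket(0);
        try {
            Socket s = connect("127.0.0.1", server.getLocalPort(), 2000);
            System.out.println("connected: " + s.isConnected());
            s.close();
        } finally {
            server.close();
        }
    }
}
```

Every connection attempt costs one extra thread, and a timed-out
attempt leaves that thread and its socket behind, which is why this
gets expensive when connections are not reused.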

> The only setTimeout() calls that I can find are in HttpClient, but I'll
> have multiple concurrent requests that will want different timeouts. How
> do I set a timeout per request?
> 

The problem is that the 2.0 API does not allow timeouts to be controlled
on a per-request basis. There's an open ticket for this bug:

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24154

We are planning to fix the problem for the 3.0 release. If you are
absolutely certain you need different timeout values on a per-request
basis, I can even provide a fix for it this weekend. There are also
plans to add support for the 1.4 connect timeout through reflection,
which would circumvent the problem by eliminating the controller thread
when running on newer JDKs. The catch is that you'd have to use the
unstable branch of HttpClient, which is still in a pre-Alpha1 state.
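
For reference, the 1.4 connect timeout mentioned above is
Socket.connect(SocketAddress, int). A minimal sketch, assuming a JDK of
1.4 or later and calling the method directly rather than through
reflection; the class name NativeConnectTimeout and its helper are
hypothetical and not HttpClient API. Since the timeout is an argument,
each caller can pass its own value.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration only; not part of HttpClient.
public class NativeConnectTimeout {

    /**
     * Connects with a per-call timeout and no helper thread.
     * A timeout of 0 means "wait indefinitely".
     */
    static Socket connect(String host, int port, int timeoutMs) throws IOException {
        Socket socket = new Socket(); // created unconnected
        try {
            // Throws java.net.SocketTimeoutException (an IOException)
            // if the connection is not established in time.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return socket;
        } catch (IOException e) {
            socket.close(); // release the descriptor right away
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        // Local listener so the example does not need the network.
        ServerSocket server = new ServerSocket(0);
        try {
            Socket fast = connect("127.0.0.1", server.getLocalPort(), 500);
            Socket slow = connect("127.0.0.1", server.getLocalPort(), 5000);
            System.out.println("connected: " + (fast.isConnected() && slow.isConnected()));
            fast.close();
            slow.close();
        } finally {
            server.close();
        }
    }
}
```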

Oleg


> -----Original Message-----
> From: Oleg Kalnichevski [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, April 08, 2004 1:20 PM
> To: Commons HttpClient Project
> Subject: RE: question about performance
> 
> Gil,
> HttpClient#getHost / HttpClient#getPort return the DEFAULT host and port
> used when only relative request path is given
> 
> HttpClient agent = new HttpClient();
> GetMethod get1 = new GetMethod("/relative/whatever.html");
> // default host configuration applies
> GetMethod get2 = new
> GetMethod("http://www.whatever.com/absolute/whatever.html");
> 
> Oleg
> 
> 
> 
> On Thu, 2004-04-08 at 22:01, Alvarez, Gil wrote:
> > Ok, I considered reusing HttpClient, but when I saw methods such as
> > HttpClient.getHost() and getPort(), they implied that at the very least
> > it's not a thread safe class to use. If I have multiple threads
> > executing within one HttpClient object at the same time, and I call
> > HttpClient.getHost(), what's going to happen?
> > 
> > -----Original Message-----
> > From: Oleg Kalnichevski [mailto:[EMAIL PROTECTED] 
> > Sent: Thursday, April 08, 2004 12:23 PM
> > To: Commons HttpClient Project
> > Subject: Re: question about performance
> > 
> > Gil,
> > (1) First and foremost DO reuse HttpClient instances when using
> > multi-threaded connection manager. HttpClient class is thread-safe. In
> > fact there are no known problems with having just one instance of
> > HttpClient per application. Using a new instance of HttpClient for
> > processing each request totally defeats all the performance
> > optimizations we have built into HttpClient
> > 
> > (2) Use multi-threaded connection manager if you do not already
> > 
> > (3) Disable stale connection check
> > 
> > (4) Do not use connect timeout which causes a controller thread to be
> > spawned per connection attempt
> > 
> > Oleg
> > 
> > On Thu, 2004-04-08 at 21:02, Alvarez, Gil wrote:
> > > We recently ported our url-hitting code from using java.net.* code to
> > > httpclient code. We use it in a high-volume environment (20 machines are
> > > hitting an external 3rd party to retrieve images).
> > > 
> > > After the port, we saw a significant increase in cycles used by the
> > > machines, about 2-3 times (i.e., the load on the boxes increased from
> > > using up 20% of the CPU to about 50%-60% of the CPU).
> > > 
> > > For each request, we instantiate an HttpClient object, and a GetMethod
> > > object, and shut things down afterwards.
> > > 
> > > In order to reduce the use of cycles, what is the recommended approach?
> > > 
> > > Thank you.
> > > 
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail:
> > [EMAIL PROTECTED]
> > For additional commands, e-mail:
> > [EMAIL PROTECTED]
> > 


