That change made things better. Now I'm using 90Meg after downloading
2150 URLs.

Something's still wrong, though: the amount of memory in use is pretty
much monotonically increasing with the number of URLs downloaded, just
nowhere near as fast as before.

Do you have any other ideas about what I could be doing wrong?

When I get home from work, I'll try it on a machine with a different
JRE.

Regards,
Michael

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, June 09, 2004 1:24 PM
To: Jakarta Commons Developers List
Subject: RE: HttpClient -- possible resource leak?

>It doesn't terminate. Instead, it begins to take more and more CPU, and
>just basically hangs.

Hmm, I'd rather expect an OutOfMemoryError if there were a memory leak
caused by objects piling up.

>If I want to get smaller chunks, I don't just call getResponseBody?
>

Do something of this sort:

InputStream instream = httpget.getResponseBodyAsStream();
if (instream != null) {
  OutputStream outstream = .... // Storage of your choice
  try {
    byte[] buffer = new byte[4096];
    int len;
    while ((len = instream.read(buffer)) != -1) {
      outstream.write(buffer, 0, len);
    }
  } finally {
    outstream.close();
    instream.close();
  }
}
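
Once the body has been fully read, the connection should also be released
back to the connection manager. A minimal sketch of the surrounding pattern
(client here stands for your HttpClient instance):

try {
  client.executeMethod(httpget);
  // ... stream the response body to disk as above ...
} finally {
  httpget.releaseConnection();
}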

Finally, allow me to express my doubts that there is a high likelihood of a
resource leak in HttpClient. I have a mission-critical system with ~250,000
users, where HttpClient is used to implement a reverse proxy, which has been
running for 2.5 years already with a 99.99% uptime guarantee. If there were a
memory leak in HttpClient, I guess I'd have known about it by now.

Can you try testing the app on a different platform / different JRE / with
different heap size settings, just for the heck of it?
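
For instance, you could pin both the initial and maximum heap size when
launching the app (the values and the class name below are just illustrative):

java -Xms64m -Xmx128m YourSpiderApp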

Oleg



>-- Original Message --
>Reply-To: "Jakarta Commons Developers List"
<[EMAIL PROTECTED]>
>Subject: RE: HttpClient -- possible resource leak?
>Date: Wed, 9 Jun 2004 12:56:48 -0400
>From: "Michael Mastroianni" <[EMAIL PROTECTED]>
>To: "Jakarta Commons Developers List" <[EMAIL PROTECTED]>
>
>
>It doesn't terminate. Instead, it begins to take more and more CPU, and
>just basically hangs.
>
>It appears that these objects are not being GC-ed.
>
>If I want to get smaller chunks, I don't just call getResponseBody?
>
>Whatever I set the maximum heap size to, I get close to it in a fairly
>linear fashion, and then asymptotically approach the max while taking
>more and more CPU.
>
>Thanks,
>Michael
>-----Original Message-----
>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
>Sent: Wednesday, June 09, 2004 12:05 PM
>To: Jakarta Commons Developers List
>Subject: RE: HttpClient -- possible resource leak?
>
>>5. I looked at Performance Monitor and watched the amount of memory
>>Java was using. When it got up to the limit, my program stopped
>>downloading urls.
>
>Michael,
>
>What do you mean by "stopped downloading urls"? Does your application
>terminate with an OutOfMemoryError or something?
>
>My initial guess was that with a fairly high maximum heap size setting, the
>garbage collector _might_ not kick in for quite a while until a certain
>limit is reached, thus giving the impression of the application leaking
>memory. However, if I understand you right, you are saying that objects are
>not de-referenced and therefore are not GC-ed?
>
>Please do follow Odi's advice and do not buffer the content in memory.
>Rather, use an InputStream to read the data out in smaller chunks and
>persist it to disk. That will drastically reduce the amount of garbage
>generated by HttpClient.
>
>Oleg
>
>
>>-- Original Message --
>>Reply-To: "Jakarta Commons Developers List"
><[EMAIL PROTECTED]>
>>Subject: RE: HttpClient -- possible resource leak?
>>Date: Wed, 9 Jun 2004 11:46:09 -0400
>>From: "Michael Mastroianni" <[EMAIL PROTECTED]>
>>To: "Jakarta Commons Developers List" <[EMAIL PROTECTED]>
>>
>>
>>Thanks for your help. Here are some details:
>>
>>1. I've tried 2.1 final and 3.0 alpha: similar problems.
>>2. JDK 1.4.2
>>3. Windows XP Pro
>>4. I don't set the initial heap size, but I set the max to 500Meg.
>>5. I looked at Performance Monitor and watched the amount of memory
>>Java was using. When it got up to the limit, my program stopped
>>downloading urls.
>>
>>I went through the version 2 code in the debugger, and it looked as if the
>>method's request body buffer was never getting cleaned up when I called
>>releaseConnection() on it.
>>
>>This was my big suspicion, because my memory usage seemed to be going up
>>pretty linearly with downloading, by an amount that seemed reasonable
>>for a web page.
>>
>>I think I might be doing something drastically wrong, but I've read the
>>docs, looked at the example code, and not seen anything obvious.
>>
>>Thanks again.
>>
>>Michael
>>
>>-----Original Message-----
>>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
>>Sent: Wednesday, June 09, 2004 11:03 AM
>>To: Jakarta Commons Developers List
>>Subject: RE: HttpClient -- possible resource leak?
>>
>>Michael,
>>
>>Could you provide us with additional details on the execution environment
>>of your application?
>>
>>(1) What version of HttpClient are you using?
>>(2) What is the JDK version? 
>>(3) What platform?
>>(4) How exactly do you measure memory consumption by your application?
>>(5) Do you set initial and maximum heap size for the JRE?
>>
>>Oleg
>>
>>
>>>-- Original Message --
>>>Reply-To: "Jakarta Commons Developers List"
>><[EMAIL PROTECTED]>
>>>Subject: HttpClient -- possible resource leak?
>>>Date: Wed, 9 Jun 2004 10:43:10 -0400
>>>From: "Michael Mastroianni" <[EMAIL PROTECTED]>
>>>To: <[EMAIL PROTECTED]>
>>>
>>>
>>>I have a multi-threaded app, using HttpClient to download a few thousand
>>>urls at a time. Currently, I have one MultiThreadedHttpConnectionManager,
>>>which the thread manager creates and passes around to each of its worker
>>>threads.
>>>
>>>Each thread has a queue of urls, and it creates a new HttpClient, using the
>>>ConnectionManager, for each one. I've also tried using a single HttpClient,
>>>created at construction time for each worker thread, with no luck (see the
>>>shared-client sketch after the code below).
>>>
>>>The worker threads make executeMethod calls, and I notice that I'm leaking
>>>a lot of memory (it looks like the memory usage goes up every time I
>>>successfully download a page). It seems as if perhaps the underlying buffer
>>>of the GetMethod is not being cleaned up. I'm calling releaseConnection()
>>>on the GetMethod in a finally block. A relevant piece of code is below:
>>>
>>>private void SpiderUrlImpl()
>>>{
>>>    HttpMethod method = new GetMethod(m_sUrl);
>>>    try
>>>    {
>>>        //if(m_State == null)
>>>        //{
>>>            m_State = new HttpState();
>>>            m_State.setCookiePolicy(CookiePolicy.RFC2109);
>>>        //}
>>>
>>>        m_client.setState(m_State);
>>>        m_client.setConnectionTimeout(m_timeout);
>>>
>>>        method.setFollowRedirects(true);
>>>        method.setStrictMode(false);
>>>
>>>        int iCode = m_client.executeMethod(method);
>>>        String responseBody = method.getResponseBodyAsString();
>>>        Header hLoc = method.getResponseHeader("Location");
>>>
>>>        java.io.FileWriter fw = new java.io.FileWriter(m_sPath + "\\" + m_sFile);
>>>        fw.write(responseBody);
>>>        fw.close();
>>>    } //TODO: LOG STUFF GOES HERE
>>>    catch (org.apache.commons.httpclient.HttpException he)
>>>    {
>>>        System.err.println("Http error connecting to '" + m_sUrl + "'");
>>>        System.err.println(he.getMessage());
>>>    }
>>>    catch (IOException ioe)
>>>    {
>>>        System.err.println("Unable to connect to '" + m_sUrl
>>>            + "' or print file '" + m_sPath + "\\" + m_sFile + "'");
>>>        System.err.println(ioe.getMessage());
>>>    }
>>>    catch (Exception eExc)
>>>    {
>>>        System.err.println(eExc.getMessage());
>>>    }
>>>    finally
>>>    {
>>>        method.releaseConnection();
>>>    }
>>>}
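
For reference, the usual pattern with MultiThreadedHttpConnectionManager is
to create it once and share a single HttpClient built on top of it across
all worker threads, constructing only the method objects per URL. A minimal
sketch (the variable names are illustrative, not taken from the code above):

MultiThreadedHttpConnectionManager connectionManager =
    new MultiThreadedHttpConnectionManager();
HttpClient client = new HttpClient(connectionManager);

// In each worker thread, per URL:
GetMethod get = new GetMethod(url);
try {
  client.executeMethod(get);
  InputStream instream = get.getResponseBodyAsStream();
  // ... stream the body to disk in small chunks ...
} finally {
  get.releaseConnection();
}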