Re: Memory leak.

Richard van der Laan Wed, 28 Dec 2005 02:10:15 -0800

Hi Christopher,

The fix looks very good. I tried a lot of (commercial) profilers.
Our project uses a lot of Java 1.5 annotation code, so I am limited
to profilers that work correctly under 1.5.


Java hprof works fine, but I found it not very easy to work with.
The Eclipse profiler crashed many times while executing our code.

At this moment the commercial YourKit profiler is the only tool
that works properly on our annotated code.

Thanks a lot for the fix. I will check our repository to be sure that
there are no other issues that I forgot to share.

Best wishes,

Richard

Richard van der Laan, Luminis
TheWeb: http://www.luminis.nl
LOSS  : https://opensource.luminis.net

-----Original Message-----
From: "Christopher Brooks" <[EMAIL PROTECTED]>
Sent: Friday, December 23, 2005 12:33 pm
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], "Kepler-Dev" <[EMAIL PROTECTED]>
Subject: Re: Memory leak.

Hi Richard,

I fixed this by modifying MoMLParser, the CVS log comment is:

  parse(URL, Reader): If we get an exception, remove _toplevel from
  the workspace, clear the parameters to parse, reset the MoMLParser,
  and, if necessary, purge the model record.

The fix is in the CVS repository, see
http://chess.eecs.berkeley.edu/ptexternal.

The catch block of parse(URL, Reader) now looks like:

        } catch (Exception ex) {
            // If you change this code, try running
            // ptolemy.moml.test.MoMLParserLeak with the heap profiler
            // and look for leaks.
            if (_toplevel != null
                    && _toplevel instanceof ComponentEntity) {
                ((ComponentEntity) _toplevel).setContainer(null);

                // Since the container is probably already null, then
                // the setContainer(null) call probably did not do anything.
                // so, we remove the object from the workspace so it
                // can get gc'd.

                // FIXME: perhaps we should do more of what
                // ComponentEntity.setContainer() does and remove the ports?
                try {
                    _workspace.getWriteAccess();
                    _workspace.remove(_toplevel);
                } finally {
                    _workspace.doneWriting();
                }
                _toplevel = null;
            }

            _paramsToParse.clear();
            reset();
            if (base != null) {
                purgeModelRecord(base);
            }
            throw ex;
        } finally {


I think ComponentEntity.setContainer() has a bit of a bug.

MoMLParser._toplevel is the toplevel that is created by the
MoMLParser.  Its container is null.

When I call _toplevel.setContainer(null), ComponentEntity.setContainer()
ends up returning early because the old container and the new
container are both null.  Instead, I think that we should
only return if both are non-null.  If the containers are null, we
should probably continue with the method and remove the ports.
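
To make the early-return concern concrete, here is a minimal,
self-contained sketch. The Workspace and Entity classes below are
hypothetical stand-ins written for this note, not the Ptolemy classes.
They only model the idea that a toplevel with a null container is still
listed in its workspace, so a setContainer(null) call that returns early
leaves that reference in place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Ptolemy classes.
final class EarlyReturnSketch {

    /** Toy workspace that keeps a directory of objects that have no container. */
    static final class Workspace {
        final List<Object> directory = new ArrayList<>();
    }

    /** Toy entity that registers itself with the workspace when created. */
    static final class Entity {
        final Workspace workspace;
        Entity container; // null for a toplevel

        Entity(Workspace workspace) {
            this.workspace = workspace;
            workspace.directory.add(this);
        }

        /** Returns early when the container is unchanged, including null to null. */
        void setContainer(Entity newContainer) {
            if (newContainer == container) {
                return; // a toplevel passing null stops here, so nothing is cleaned up
            }
            if (container == null) {
                workspace.directory.remove(this);
            }
            container = newContainer;
        }
    }

    /** True if the workspace still lists a toplevel after setContainer(null). */
    static boolean listedAfterSetContainerNull() {
        Workspace workspace = new Workspace();
        Entity toplevel = new Entity(workspace);
        toplevel.setContainer(null);
        return workspace.directory.contains(toplevel);
    }

    /** True if the workspace still lists a toplevel after an explicit removal. */
    static boolean listedAfterExplicitRemove() {
        Workspace workspace = new Workspace();
        Entity toplevel = new Entity(workspace);
        toplevel.setContainer(null);
        workspace.directory.remove(toplevel);
        return workspace.directory.contains(toplevel);
    }

    public static void main(String[] args) {
        System.out.println("listed after setContainer(null): " + listedAfterSetContainerNull());
        System.out.println("listed after explicit remove:    " + listedAfterExplicitRemove());
    }
}
```

In this toy model the first check reports true and the second reports
false, which mirrors why the catch block removes _toplevel from the
workspace explicitly instead of relying on setContainer(null).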

A few notes about leak detection:

Kevin's notes about -Xrunhprof are below.

The way I tracked this down was by writing a small example
(ptolemy/moml/test/MoMLParserLeak.java) that created some xml
that first created a Ramp and then threw an exception.

Having a small example is critical because it compiled and ran
much faster than starting up all of Ptolemy and invoking the
profiler.  In my example, it was critical that the MoMLParser
stay in scope, because it was what was holding references to
the partially constructed entities.  Also, the leak code
requests that the garbage collector be run and then waits
a few seconds so that gc can complete.  I found it easiest
to create an example that did not leak and then add in
the bogus code that caused the leak.
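
The shape of such a test can be sketched without Ptolemy at all. The
example below is a rough illustration under stated assumptions: Holder
is a made-up stand-in for a long-lived object such as the parser, and
the check uses java.lang.ref.WeakReference instead of the hprof heap
dump that I actually used, so it is a different technique that only
shows the same idea (keep the holder in scope, request gc, wait, then
see what is still reachable).

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Illustration only: Holder is a hypothetical stand-in, not MoMLParser.
final class LeakCheckSketch {

    /** Long-lived object that may keep references to partially built objects. */
    static final class Holder {
        final List<Object> retained = new ArrayList<>();
    }

    /** Creates an object, optionally leaves it in the holder, and returns a weak reference to it. */
    static WeakReference<Object> create(Holder holder, boolean leak) {
        Object partial = new Object();
        if (leak) {
            holder.retained.add(partial); // the "bogus code" that causes the leak
        }
        return new WeakReference<>(partial);
    }

    /** Requests gc a few times, pausing in between, and reports whether the referent survives. */
    static boolean stillReachable(WeakReference<?> ref) {
        for (int attempt = 0; attempt < 5 && ref.get() != null; attempt++) {
            System.gc(); // only a request; the JVM is free to ignore it
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() != null;
    }

    public static void main(String[] args) {
        Holder holder = new Holder(); // must stay in scope for the whole check
        WeakReference<Object> leaked = create(holder, true);
        WeakReference<Object> clean = create(holder, false);
        System.out.println("leaked object still reachable: " + stillReachable(leaked));
        System.out.println("clean object still reachable:  " + stillReachable(clean));
        System.out.println("holder retains " + holder.retained.size() + " object(s)");
    }
}
```

The object left in the holder stays reachable for as long as the holder
does. The other object is normally collected, but System.gc() is only a
hint, so a "still reachable" result for it is not proof of a leak by
itself.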

I ran the example with
  c:/Program\ Files/java/jdk1.5.0_05/bin/java -Xrunhprof:depth=15 \
      -classpath $PTII ptolemy.moml.test.MoMLParserLeak

which created java.hprof.txt.

I would then look at the bottom of java.hprof.txt in the SITES section
for the Ramp actor.  With Java 1.5, the Ramp actor appears with all
the classes, which is fine, but if it appeared elsewhere in the SITES
section then I knew I had a leak still.

So, for example, if java.hprof.txt contains:

HEAP DUMP END
SITES BEGIN (ordered by live bytes) Fri Dec 23 09:14:17 2005
          percent          live          alloc'ed  stack class
...
443  0.04% 73.25%       320    1       320     1 301478 ptolemy.actor.lib.Ramp
...
916  0.03% 93.07%       192    1       192     1 301512 ptolemy.actor.lib.Ramp

Then site 443 is OK: this is the class loader, and the other adjacent
classes all have a size of 320.

Site 916 is the leaker; note that there is one reference of size 192.
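
The manual scan of the SITES section can be sketched as a small helper.
This is only an illustration: the class and method names are made up,
and it assumes that in the real java.hprof.txt each site entry sits on
one line with the class name as the last of nine whitespace-separated
fields, and that the section is closed by a "SITES END" line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Ptolemy or of hprof.
final class HprofSitesSketch {

    /** Returns the SITES entries whose last field is exactly the given class name. */
    static List<String> sitesFor(List<String> lines, String className) {
        List<String> hits = new ArrayList<>();
        boolean inSites = false;
        for (String line : lines) {
            if (line.startsWith("SITES BEGIN")) {
                inSites = true;
                continue;
            }
            if (line.startsWith("SITES END")) {
                break;
            }
            if (!inSites) {
                continue;
            }
            // Assumed layout: rank, self%, accum%, live bytes, live objs,
            // alloc'ed bytes, alloc'ed objs, stack trace number, class name.
            String[] fields = line.trim().split("\\s+");
            if (fields.length >= 9 && fields[fields.length - 1].equals(className)) {
                hits.add(line.trim());
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // With no arguments, fall back to the two entries quoted in this mail.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList(
                        "SITES BEGIN (ordered by live bytes) Fri Dec 23 09:14:17 2005",
                        "443  0.04% 73.25%       320    1       320     1 301478 ptolemy.actor.lib.Ramp",
                        "916  0.03% 93.07%       192    1       192     1 301512 ptolemy.actor.lib.Ramp",
                        "SITES END");
        String className = args.length > 1 ? args[1] : "ptolemy.actor.lib.Ramp";
        for (String hit : sitesFor(lines, className)) {
            System.out.println(hit);
        }
    }
}
```

More than one hit for the actor class is the signal to look closer. The
helper does not decide which entry is the class loader and which is the
leak; that still takes the comparison of sizes described above.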

One can then use the stack number to find the stack frame and start
tracing, but I found it easier to use HP's JMeter.
HP's JMeter is a tool available as a free download after a quick
registration from:
  http://www.hp.com/products1/unix/java/hpjmeter/

I downloaded it, untar'd it and ran:
  java -Xmx256m -jar $PTII/vendors/hpjmeter/HPjmeter.jar
I then opened java.hprof.txt, selected Metric -> Reference Graph Tree
and then searched for Type Name "Ramp"

This showed me that the Ramp was still referred to (eventually) by the
_workspace variable, which prompted me to add the code that forces
removal of the _toplevel from the workspace.

It is possible to figure this out by hand by looking at java.hprof.txt
but JMeter makes it easier.
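
For the by-hand route, the lookup from a stack number to its stack
frames can be sketched as follows. The class and method names are made
up for illustration, and the code assumes the hprof text layout in which
a "TRACE <number>:" header is followed by indented frame lines. The
frames in main are invented sample data, not output from the Ptolemy
run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Ptolemy or of hprof.
final class HprofTraceSketch {

    /** Returns the "TRACE <id>:" header line plus the indented frame lines that follow it. */
    static List<String> traceBlock(List<String> lines, String traceId) {
        List<String> block = new ArrayList<>();
        String header = "TRACE " + traceId + ":";
        boolean inBlock = false;
        for (String line : lines) {
            if (!inBlock) {
                if (line.startsWith(header)) {
                    inBlock = true;
                    block.add(line);
                }
            } else if (!line.isEmpty() && Character.isWhitespace(line.charAt(0))) {
                block.add(line); // frame lines are indented under the header
            } else {
                break; // first non-indented line ends the trace
            }
        }
        return block;
    }

    public static void main(String[] args) {
        // Invented sample frames, only to show the shape of the input.
        List<String> lines = Arrays.asList(
                "TRACE 301478:",
                "\texample.Loader.load(Loader.java:10)",
                "TRACE 301512:",
                "\texample.Factory.create(Factory.java:20)",
                "\texample.Parser.parse(Parser.java:30)",
                "SITES BEGIN (ordered by live bytes)");
        for (String line : traceBlock(lines, "301512")) {
            System.out.println(line);
        }
    }
}
```

The stack number to pass in is the second to last column of the SITES
entry, for example 301512 for the leaking Ramp entry above.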

I tried using Eclipse TPTP and I was never able to get some sort of
reference graph tree that made it easier to look at objects in a hierarchy.
Also TPTP seemed very slow to me when compared with running java
-Xrunhprof.  I believe that TPTP can do what JMeter can do, but I was
missing some key bit of information on how to use it.

Another tidbit is that running Java 1.4.2_08 -Xrunhprof would result
in a message "HPROF ERROR: heap dump size < 0" and then JMeter would
not be able to show the Reference Graph Tree.
This is a known bug in Java 1.4:
http://bugs.sun.com/bugdatabase/view_bug.do;:YfiG?bug_id=4507533
I think the problem was likely that I was running gc in a finally block
as the process was exiting.

The workaround is to use Java 1.5.

_Christopher



--------

    Hi Richard,

    This is an interesting problem.

    I hacked up a small test case that I checked in as
    moml/test/MoMLParserLeak.java

    package ptolemy.moml.test;

    import ptolemy.kernel.util.NamedObj;
    import ptolemy.moml.MoMLParser;

    /**
     Leak memory in MoMLParser by throwing an Exception.
     <p> Under Java 1.4, run this with:
     <pre>
     java -Xrunhprof:depth=15 -classpath "$PTII;." ptolemy.moml.test.MoMLParserLeak
     </pre>
     and then look in java.hprof.txt.

     @author Christopher Brooks
     @version $Id: MoMLParserLeak.java,v 1.1 2005/12/22 19:40:33 cxh Exp $
     @since Ptolemy II 5.2
     @Pt.ProposedRating Red (cxh)
     @Pt.AcceptedRating Red (cxh)
     */
    public class MoMLParserLeak {
        public static void main(String[] args) throws Exception {
            // The parser must stay in scope: it holds the references
            // to the partially constructed entities.
            MoMLParser parser = new MoMLParser();
            try {
                // The last entity names a class that does not exist,
                // so parse() throws after the Ramp has been created.
                NamedObj toplevel =
                    parser.parse("<?xml version=\"1.0\" standalone=\"no\"?>\n"
                            + "<!DOCTYPE entity PUBLIC \"-//UC Berkeley//DTD MoML 1//EN\"\n"
                            + "\"http://ptolemy.eecs.berkeley.edu/xml/dtd/MoML_1.dtd\">\n"
                            + "<entity name=\"top\" class=\"ptolemy.kernel.CompositeEntity\">\n"
                            + "<entity name=\"myRamp\" class=\"ptolemy.actor.lib.Ramp\"/>\n"
                            + "<entity name=\"notaclass\" class=\"Not.A.Class\"/>\n"
                            + "</entity>\n");
            } finally {
                System.gc();
                System.out.println("Sleeping for 2 seconds for any possible gc.");
                try {
                    Thread.sleep(2000);
                } catch (InterruptedException e) {
                    // Ignored: the sleep only gives gc time to complete.
                }
            }
        }
    }


    If I run it under Java 1.4 with -Xrunhprof, then I think I can see it
    leaks memory in that there are references to ptolemy.actor.lib.Ramp
    left around.

    I can think of two ways to clean this up by catching the exception
    in MoMLParser.parse(URL, Reader), cleaning up and rethrowing.

    1) Use the undo mechanism.  This seems really tricky.

    2) Traverse the objects that have been instantiated and call
    setContainer(null) on them.  I tried temporarily hacking this in
    by misusing the _topObjectsCreated list and then calling
    setContainer(null) on each element in _topObjectsCreated.
    Unfortunately, this did not quite work, I still appear to have
    references to Ramp.

    There are probably other ways as well.
    We did resolve/workaround the XmlParser leak by modifying
    MoMLParser.parse(URL, Reader) so it instantiates the XmlParser and
    then deletes it.  I had to make some other modifications as well,
    the fix is in the CVS repository, see
    http://chess.eecs.berkeley.edu/ptexternal.

    Two tools to use to track down leaks are the Java 1.4 -Xrunhprof
    command and Eclipse's tptp memory tool (http://eclipse.org/tptp/).

    Kevin Ruland wrote a pretty good description of -Xrunhprof:

    > I check memory leaks by using -Xrunhprof.  -Xrunhprof:help (Note the
    > colon) lists the arguments.  My preferred combination is
    > -Xrunhprof:cutoff=0.005,depth=15.  There are a number of resources on
    > the web describing runhprof although it has now been superseded by
    > jvmapi (or something) which gives greater control.  The brief rundown
    > is that the report generated in the file java.hprof.txt (which for me
    > was between 100M-250M depending on hprof options) is divided into
    > three parts:
    >
    > Stack traces -
    >
    > Allocation/object information -
    >
    > Active objects -
    >
    > I only know how to use the first and third sections...  There might be
    > an option to hprof to omit the second section, I don't really know.
    >
    > Every stack which allocates memory with an active reference at program
    > exit is represented.  These stacks are called "traces".
    >
    > The active object section tells you the collective size of the objects
    > "leaked" at each trace.  They are in descending order by total size.
    >
    > I tend to follow this procedure.  Find the largest culprit in the
    > active object section.  Identify its trace number, which is in the
    > second to the last column (it doesn't format the columns very well).
    > Suppose it's 10041.  I then search backwards through the file looking
    > for the string "TRACE 10041:"  (note the all caps and colon).  This
    > then gives you the stack trace which allocated all that memory.
    >
    > The arguments I give it:
    >
    > cutoff=0.005 means don't report on objects whose total is less than
    > 0.5% of the total memory allocated.
    >
    > depth=15 means generate stack traces 15 frames deep.  The default for
    > this is 4, that's almost never enough.
    >
    > One final note.  Sometimes hprof gets confused and reports entries for
    > stack trace 0.  I haven't found the answer to this.  Sometimes using
    > -verbose helps.


    I'll see if I can come up with a workable solution.

    _Christopher


    --------

        Hi,

        I have embedded the Ptolemy kernel in an OSGi service. This service
        acts as a container for deployed model graphs and can dynamically
        associate actor behavior with service behavior. In our situation,
        multiple parsed models (around 30) can coexist.

        Before a new model is parsed, I call reset() on MoMLParser. This
        avoids incremental model parsing, which would otherwise create
        dependencies across separate deployed models. Maybe this is related
        to the memory leak mentioned by Kevin.

        A few weeks ago I spent some time profiling our system. Regarding
        the current subject, I discovered a memory leak in one of the
        alternative paths of the MoMLParser:

        During parsing, the MoMLParser can throw an XMLException if the
        file content is invalid and can throw a ClassNotFoundException if
        some classes could not be resolved. If one of these exceptions is
        thrown, the entities contained by the partially constructed graph
        are still added to the workspace.

        In my opinion the MoMLParser should in that case undo the performed
        workspace actions.


        Best wishes,

        Richard van der Laan, Luminis
        TheWeb: http://www.luminis.nl
        LOSS  : https://opensource.luminis.net


    ----------------------------------------------------------------------------
    Posted to the ptolemy-hackers mailing list.  Please send administrative
    mail for this list to: [EMAIL PROTECTED]
--------



