Hi Christopher,

The fix looks very good. I tried a lot of (commercial) profilers. Our project uses a lot of 1.5 annotation code, so I am bound to profilers that work correctly under 1.5.
Java hprof works fine, but I found it not very easy to work with. The Eclipse profiler crashed many times executing our code. At this moment the YourKit commercial profiler is the only tool that works properly on our annotated code.

Thanks a lot for the fix. I will check our repository to be sure that there are no other issues that I forgot to share.

Best wishes,

Richard

Richard van der Laan, Luminis
TheWeb: http://www.luminis.nl
LOSS : https://opensource.luminis.net

-----Original Message-----
From: "Christopher Brooks" <[EMAIL PROTECTED]>
Sent: Friday, December 23, 2005 12:33 pm
To: [EMAIL PROTECTED]
Cc: ptolemy-hackers@bennett.EECS.Berkeley.EDU, "Kepler-Dev" <[EMAIL PROTECTED]>
Subject: Re: Memory leak.

Hi Richard,

I fixed this by modifying MoMLParser, the CVS log comment is:

    parse(URL, Reader): If we get an exception, remove _toplevel from
    the workspace, clear the parameters to parse, reset the MoMLParser,
    and, if necessary, purge the model record.

The fix is in the CVS repository, see
http://chess.eecs.berkeley.edu/ptexternal.

The catch block of parse(URL, Reader) now looks like:

    } catch (Exception ex) {
        // If you change this code, try running
        // ptolemy.moml.test.MoMLParserLeak with the heap profiler
        // and look for leaks.
        if (_toplevel != null && _toplevel instanceof ComponentEntity) {
            ((ComponentEntity) _toplevel).setContainer(null);

            // Since the container is probably already null, then
            // the setContainer(null) call probably did not do anything.
            // so, we remove the object from the workspace so it
            // can get gc'd.
            // FIXME: perhaps we should do more of what
            // ComponentEntity.setContainer() does and remove the ports?
            try {
                _workspace.getWriteAccess();
                _workspace.remove(_toplevel);
            } finally {
                _workspace.doneWriting();
            }

            _toplevel = null;
        }

        _paramsToParse.clear();
        reset();

        if (base != null) {
            purgeModelRecord(base);
        }

        throw ex;
    } finally {

I think ComponentEntity.setContainer() has a bit of a bug.
MoMLParser._toplevel is the toplevel that is created by the MoMLParser. Its container is null. When I call _toplevel.setContainer(null), ComponentEntity.setContainer() ends up returning early because the old container and the new container are both null. Instead, I think that we should only return if both are non-null. If the containers are null, we should probably continue with the method and remove the ports.

A few notes about leak detection:

Kevin's notes about -Xrunhprof are below.

The way I tracked this down was by writing a small example (ptolemy/moml/test/MoMLParserLeak.java) that created some XML that first created a Ramp and then threw an exception.

Having a small example is critical because it compiled and ran much faster than starting up all of Ptolemy and invoking the profiler.

In my example, it was critical that the MoMLParser stay in scope, because it was what was holding references to the partially constructed entities. Also, the leak code requests that the garbage collector be run and then waits a few seconds so that gc can complete.

I found it easiest to create an example that did not leak and then add in the bogus code that caused the leak.

I ran the example with

    c:/Program\ Files/java/jdk1.5.0_05/bin/java -Xrunhprof:depth=15 -classpath $PTII ptolemy.moml.test.MoMLParserLeak

which created java.hprof.txt.

I would then look at the bottom of java.hprof.txt in the SITES section for the Ramp actor. With Java 1.5, the Ramp actor appears with all the classes, which is fine, but if it appeared elsewhere in the SITES section then I knew I had a leak still.

So, for example, if java.hprof.txt contains:

    HEAP DUMP END
    SITES BEGIN (ordered by live bytes) Fri Dec 23 09:14:17 2005
              percent         live       alloc'ed  stack class
    ...
    443  0.04% 73.25%    320    1    320    1 301478 ptolemy.actor.lib.Ramp
    ...
    916  0.03% 93.07%    192    1    192    1 301512 ptolemy.actor.lib.Ramp

Then site 443 is ok, this is the class loader, the other adjacent classes all have a size of 320.
Site 916 is the leaker; note that there is one reference of size 192. One can then use the stack number to find the stack frame and start tracing, but I found it easier to use HP's JMeter.

HP's JMeter is a tool available as a free download after a quick registration from:
http://www.hp.com/products1/unix/java/hpjmeter/

I downloaded it, untar'd it and ran:

    java -Xmx256m -jar $PTII/vendors/hpjmeter/HPjmeter.jar

I then opened java.hprof.txt, selected Metric -> Reference Graph Tree and then searched for Type Name "Ramp".

This showed me that the Ramp was still referred to (eventually) by the _workspace variable, which prompted me to add the code that forces removal of the _toplevel from the workspace.

It is possible to figure this out by hand by looking at java.hprof.txt, but JMeter makes it easier.

I tried using Eclipse TPTP and I was never able to get some sort of reference graph tree that made it easier to look at objects in a hierarchy. Also, TPTP seemed very slow to me when compared with running java -Xrunhprof. I believe that TPTP can do what JMeter can do, but I was missing some key bit of information on how to use it.

Another tidbit is that running Java 1.4.2_08 -Xrunhprof would result in a message "HPROF ERROR: heap dump size < 0" and then JMeter would not be able to show the Reference Graph Tree. This is a known bug in Java 1.4:
http://bugs.sun.com/bugdatabase/view_bug.do;:YfiG?bug_id=4507533
I think the problem was likely that I was running gc in a finally block as the process was exiting. The workaround is to use Java 1.5.

_Christopher

--------

Hi Richard,

This is an interesting problem.

I hacked up a small test case that I checked in as moml/test/MoMLParserLeak.java

    package ptolemy.moml.test;

    import ptolemy.kernel.util.NamedObj;
    import ptolemy.moml.MoMLParser;

    /**
       Leak memory in MoMLParser by throwing an Exception.
       <p> Under Java 1.4, run this with:
       <pre>
       java -Xrunhprof:depth=15 -classpath "$PTII;." ptolemy.moml.test.MoMLParserLeak
       </pre>
       and then look in java.hprof.txt.

       @author Christopher Brooks
       @version $Id: MoMLParserLeak.java,v 1.1 2005/12/22 19:40:33 cxh Exp $
       @since Ptolemy II 5.2
       @Pt.ProposedRating Red (cxh)
       @Pt.AcceptedRating Red (cxh)
    */
    public class MoMLParserLeak {
        public static void main(String []args) throws Exception {
            // The parser must stay in scope: it holds the references to
            // the partially constructed entities.
            MoMLParser parser = new MoMLParser();
            try {
                // The second inner entity names a class that does not
                // exist, so parse() throws after the Ramp is created.
                NamedObj toplevel = parser.parse("<?xml version=\"1.0\" standalone=\"no\"?>\n"
                    + "<!DOCTYPE entity PUBLIC \"-//UC Berkeley//DTD MoML 1//EN\"\n"
                    + "\"http://ptolemy.eecs.berkeley.edu/xml/dtd/MoML_1.dtd\">\n"
                    + "<entity name=\"top\" class=\"ptolemy.kernel.CompositeEntity\">\n"
                    + "<entity name=\"myRamp\" class=\"ptolemy.actor.lib.Ramp\"/>\n"
                    + "<entity name=\"notaclass\" class=\"Not.A.Class\"/>\n"
                    + "</entity>\n");
            } finally {
                System.gc();
                System.out.println("Sleeping for 2 seconds for any possible gc.");
                try {
                    Thread.sleep(2000);
                } catch (InterruptedException e) {
                }
            }
        }
    }

If I run it under Java 1.4 with -Xrunhprof, then I think I can see it leaks memory in that there are references to ptolemy.actor.lib.Ramp left around.

I can think of two ways to clean this up by catching the exception in MoMLParser.parse(URL, Reader), cleaning up and rethrowing.

1) Use the undo mechanism. This seems really tricky.

2) Traverse the objects that have been instantiated and call setContainer(null) on them. I tried temporarily hacking this in by misusing the _topObjectsCreated list and then calling setContainer(null) on each element in _topObjectsCreated. Unfortunately, this did not quite work; I still appear to have references to Ramp.

There are probably other ways as well.

We did resolve/work around the XmlParser leak by modifying MoMLParser.parse(URL, Reader) so it instantiates the XmlParser and then deletes it. I had to make some other modifications as well; the fix is in the CVS repository, see http://chess.eecs.berkeley.edu/ptexternal.

Two tools to use to track down leaks are the Java 1.4 -Xrunhprof command and Eclipse's TPTP memory tool (http://eclipse.org/tptp/).

Kevin Ruland wrote a pretty good description of -Xrunhprof:

> I check memory leaks by using -Xrunhprof. -Xrunhprof:help (Note the
> colon) lists the arguments. My preferred combination is
> -Xrunhprof:cutoff=0.005,depth=15. There are a number of resources on
> the web describing runhprof although it has now been superseded by
> jvmapi (or something) which gives greater control. The brief rundown is
> the report generated in the file java.hprof.txt (which for me was
> between 100M-250M depending on hprof options) is divided into three parts:
>
> Stack traces -
>
> Allocation/object information -
>
> Active objects -
>
> I only know how to use the first and third sections... There might be
> an option to hprof to omit the second section, I don't really know.
>
> Every stack which allocates memory with an active reference at program
> exit is represented. These stacks are called "traces".
>
> The active object section tells you the collective size of the objects
> "leaked" at each trace. They are in descending order by total size.
>
> I tend to follow this procedure. Find the largest culprit in the active
> object section. Identify its trace number, which is in the second to
> the last column (it doesn't format the columns very well). Suppose it's
> 10041. I then search backwards through the file looking for the string
> "TRACE 10041:" (note the all caps and colon). This then gives you the
> stack trace which allocated all that memory.
>
> The arguments I give it:
>
> cutoff=0.005 means don't report on objects whose total is less than 0.5%
> of the total memory allocated.
>
> depth=15 means generate stack traces 15 frames deep. The default for
> this is 4, that's almost never enough.
>
> One final note. Sometimes hprof gets confused and reports entries for
> stack trace 0. I haven't found the answer to this. Sometimes using
> -verbose helps.

I'll see if I can come up with a workable solution.
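
Kevin's procedure (find the SITES row for a class, read its trace number, then search for the matching "TRACE n:" entry) could be sketched in Java roughly as follows. HprofSites and tracesFor are hypothetical names, not part of hprof or Ptolemy, and the nine-column row layout (rank, self, accum, live bytes, live objects, allocated bytes, allocated objects, trace, class) is an assumption read off the SITES excerpt quoted earlier in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for reading the SITES section of java.hprof.txt. */
public class HprofSites {

    /**
     * Return the trace numbers of all SITES rows for the given class.
     * Assumed row layout (nine whitespace-separated columns):
     * rank self% accum% liveBytes liveObjs allocBytes allocObjs trace class
     */
    public static List<Integer> tracesFor(String sitesText, String className) {
        List<Integer> traces = new ArrayList<Integer>();
        for (String line : sitesText.split("\n")) {
            String[] cols = line.trim().split("\\s+");
            // Header rows and "..." lines do not have nine columns, so skip them.
            if (cols.length == 9 && cols[8].equals(className)) {
                traces.add(Integer.parseInt(cols[7]));
            }
        }
        return traces;
    }

    public static void main(String[] args) {
        // The two Ramp rows from the SITES excerpt in this thread.
        String sites =
              "443 0.04% 73.25% 320 1 320 1 301478 ptolemy.actor.lib.Ramp\n"
            + "916 0.03% 93.07% 192 1 192 1 301512 ptolemy.actor.lib.Ramp\n";
        for (int trace : tracesFor(sites, "ptolemy.actor.lib.Ramp")) {
            System.out.println("Search java.hprof.txt for \"TRACE " + trace + ":\"");
        }
    }
}
```

The sketch only lists candidate traces; deciding which row is the class loader entry and which one is the leak still takes the kind of judgment described above.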

_Christopher

--------

Hi,

I have embedded the Ptolemy kernel in an OSGi service. This service acts as a container for deployed model graphs and can dynamically associate actor behavior with service behavior.

In our situation, multiple parsed models (around 30) can coexist. Before a new model is parsed, I call reset() on MoMLParser. This avoids incremental model parsing, which would otherwise create dependencies across separate deployed models. Maybe this is related to the memory leak mentioned by Kevin.

A few weeks ago I spent some time profiling our system. Regarding the current subject, I discovered a memory leak in one of the alternative paths of the MoMLParser:

During parsing, the MoMLParser can throw an XMLException if the file content is invalid and can throw a ClassNotFoundException if some classes could not be resolved. If one of these exceptions is thrown, the entities contained in the partially constructed graph are still added to the workspace.

In my opinion the MoMLParser should in that case undo the performed workspace actions.

Best wishes,

Richard van der Laan, Luminis
TheWeb: http://www.luminis.nl
LOSS : https://opensource.luminis.net

----------------------------------------------------------------------------
Posted to the ptolemy-hackers mailing list. Please send administrative
mail for this list to: [EMAIL PROTECTED]
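
The cleanup idea discussed in this thread (when parsing fails, remove whatever was already registered, then rethrow) can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: the registry set stands in for the workspace, Entity for a partially constructed actor, and build() for a parse call. It is not the MoMLParser code and uses no Ptolemy classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of undoing registrations when a build fails. */
public class CleanupOnFailure {

    /** Hypothetical stand-in for an entity created during parsing. */
    static class Entity {
        final String name;
        Entity(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for the workspace: it holds strong references. */
    public static final Set<Object> registry = new LinkedHashSet<Object>();

    /** Register one entity per name; the name "bad" simulates an unresolvable class. */
    public static void build(List<String> names) throws Exception {
        List<Object> created = new ArrayList<Object>();
        try {
            for (String name : names) {
                if (name.equals("bad")) {
                    throw new ClassNotFoundException(name);
                }
                Entity entity = new Entity(name);
                registry.add(entity);
                created.add(entity);
            }
        } catch (Exception ex) {
            // Undo the registrations made so far so nothing keeps the
            // partially built entities reachable, then rethrow.
            registry.removeAll(created);
            throw ex;
        }
    }

    public static void main(String[] args) {
        try {
            build(Arrays.asList("top", "myRamp", "bad"));
        } catch (Exception ex) {
            System.out.println("Build failed: " + ex);
        }
        System.out.println("Registry size after failure: " + registry.size());
    }
}
```

Without the removeAll() call in the catch block, the registry would still hold "top" and "myRamp" after the failure, which is the same shape of leak that the SITES section showed for Ramp.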