I have found a major source of the memory problems I am running into. The SAX2DTM class has two data members that are initialized to consume a large amount of memory. The m_chars data member is a FastStringBuffer initialized to a size of 8K chars (16 Kbytes), and the m_data data member is an int vector of 512 ints (2 Kbytes). That adds up to around 18 Kbytes per instance of SAX2DTM, and during a single transformation using the same testcase described below, 38 instances of SAX2DTM are created, for a total of roughly 684 Kbytes per transformation. Note that the input XML contains only a single element and no other documents are loaded.
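To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch (not Xalan code) of the footprint implied by those sizes. It assumes a 2-byte char and a 4-byte int, treats initChunkSize as a power-of-two exponent (13 meaning 2^13 = 8192 chars), and ignores object headers and any other SAX2DTM state; the class and variable names are purely illustrative.

    // Rough per-instance and per-transformation footprint of the two buffers
    // described above; all names here are illustrative, not Xalan internals.
    public class Sax2DtmFootprint {
        public static void main(String[] args) {
            int chunkChars    = 1 << 13;           // initChunkSize 13 => 8192 chars for m_chars
            long mCharsBytes  = chunkChars * 2L;   // ~16 Kbytes (2 bytes per char)
            long mDataBytes   = 512 * 4L;          // ~2 Kbytes for the 512-entry m_data int vector
            long perInstance  = mCharsBytes + mDataBytes;
            long perTransform = perInstance * 38;  // 38 instances observed in one transformation

            System.out.println("per instance      : " + perInstance / 1024 + " Kbytes");
            System.out.println("per transformation: " + perTransform / 1024 + " Kbytes");
            // Prints roughly 18 Kbytes per instance and ~684 Kbytes per transformation.
        }
    }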
If I change the initialization of the FastStringBuffer to use an initChunkSize of 0 instead of 13 (which means an initial size of 1 char instead of 8K chars; the buffer grows automatically as chars are added), the numbers in the Major Object Allocations table below drop to 3270K, which is even less than v2.1 and half of what they were before the change. There was no noticeable change in transformation speed. The change also dropped the total VM memory consumption down to a maximum of 8024K (without the forced GC -- see the first table below), which is a big improvement, although still significantly higher than v2.1.

Can someone shed some light on the role of SAX2DTM, and in particular on why 38 instances of it would be created during the transformation? My understanding is that it builds the DTM model from the SAX parser events, but why so many of them? I think this is worth analyzing a bit to see if a permanent change in this area (like the one I made) makes sense for the majority of applications. (Admittedly my test is a very narrow test, but it is not extreme, and it is only a slightly modified real-world case.)

Chris

Chris McCabe wrote:
> I have been doing some performance analysis this week comparing Xalan
> 2.1 vs. 2.2, because we have been unable to use version 2.2 -- it is
> so much slower. I thought I would share what I have found so far in
> case it helps.
>
> First of all, when testing just a single transformation at a time on a
> machine with plenty of RAM to hold everything in memory, the two
> versions perform about the same. Version 2.1 performed slightly better
> in all the test cases I tried, except when the stylesheet was fairly
> trivial and the input XML was large, in which case 2.2 performed
> slightly better. But in either case the numbers were close.
>
> The big difference between the two versions is the amount of memory
> used during a transformation, and this is what is bringing our server
> to its knees: with multiple concurrent transformations, the amount of
> memory consumed is many times what it was with 2.1. I was able to
> reproduce the behavior we are experiencing using a single stylesheet.
>
> The test I used to analyze this behavior is a single, fairly complex
> stylesheet that is compiled only once and used to transform an XML
> document with a single <page/> element. I ran the exact same
> transformation 10 times in succession using the cached compiled
> stylesheet, and dumped out the memory consumed by the JVM after each
> transformation. I found that after a few iterations, the memory used
> by the JVM is double what it was with v2.1.0. I then ran the same
> tests but ran the garbage collector before each iteration. This had
> no effect on v2.1.0, but it improved the v2.2 numbers, which would
> indicate that much of the problem is due to a lot more memory being
> allocated during the transformation but eventually freed. In between
> garbage collection runs, the amount of used memory becomes very large.
> I used JDK 1.2.2 for the test. JDK 1.3 core dumps, so I could not
> test it.
> Here are the numbers:
>
>   JVM (1.2.2) Memory Consumption (Kbytes)
>
>   Iteration   v2.1.0   v2.2.D13   v2.1 w/GC   D13 w/GC
>   -----------------------------------------------------
>       1        5680      7000       5680        6028
>       2        5680      7000       5680        6028
>       3        5680      9100       5680        7420
>       4        5680      9100       5680        7420
>       5        5680     10256       5680        7420
>       6        5680     10256       5680        7420
>       7        5680     10256       5680        7420
>       8        5680     11280       5680        7420   <---------
>   (numbers remain the same after the 8th iteration)
>
> I am continuing my investigation by looking at the objects that are
> being allocated during the transformation to find out where the
> differences are. The objects that are consuming most of the memory are
> listed below, with the amount allocated for each listed for each
> version after 6 iterations. This does not take into account free
> objects, just total allocations.
>
>   Major Object Allocations (Kbytes)
>
>                       v2.1.0   v2.2.D13
>   -------------------------------------
>   char[]                4142       7153
>   int[]                  825       3326
>   Object[]              1756       2220
>   AttributeIterator     1125       1878
>   byte[]                 769       1463
>
> So far I have only analyzed the first one, the char[] objects. In
> v2.1, most of the allocation comes from the StringBuffer class, but in
> v2.2, 70% of the char[] allocations come from the
> OutputStreamWriter.write() method, called from SerializerToXML.accum().
>
> I hope to get more information tomorrow.
>
> Chris
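For reference, here is a minimal sketch of the kind of measurement loop described in the quoted message, written against the standard TrAX (javax.xml.transform) API rather than the actual test harness. The file names test.xsl and page.xml are placeholders, and the Runtime-based figure is only an approximation of the JVM's used heap.

    import java.io.StringWriter;
    import javax.xml.transform.Templates;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public class RepeatTransform {
        public static void main(String[] args) throws Exception {
            // Compile the stylesheet once and reuse the Templates object,
            // as in the test described above.
            TransformerFactory factory = TransformerFactory.newInstance();
            Templates compiled = factory.newTemplates(new StreamSource("test.xsl"));

            Runtime rt = Runtime.getRuntime();
            for (int i = 1; i <= 10; i++) {
                // Uncomment to run the collector before each iteration,
                // as in the "w/GC" columns of the first table.
                // System.gc();

                Transformer transformer = compiled.newTransformer();
                transformer.transform(new StreamSource("page.xml"),
                                      new StreamResult(new StringWriter()));

                long usedKb = (rt.totalMemory() - rt.freeMemory()) / 1024;
                System.out.println("iteration " + i + ": " + usedKb + " Kbytes used");
            }
        }
    }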