Hi Thomas,
Thanks for paying attention to this. I've measured the values using
Runtime.getRuntime().freeMemory() and friends. Indeed the process size
is very misleading: in one case we had a process size of 5 GB
while the free memory was more than 3 GB.
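
For reference, the kind of measurement described above can be sketched as
follows. This is only a minimal illustration using the standard
java.lang.Runtime calls; the class and method names (HeapUsage, usedBytes,
report) are made up for the example and are not taken from the application.

```java
final class HeapUsage {
    /** Heap bytes currently occupied, including garbage not yet collected. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        // totalMemory() is what the JVM has committed for the heap,
        // freeMemory() is the unused part of that committed amount.
        return rt.totalMemory() - rt.freeMemory();
    }

    /** One-line summary in megabytes: used, committed, and the -Xmx ceiling. */
    static String report() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        return String.format("used=%d MB, committed=%d MB, max=%d MB",
                usedBytes() / mb, rt.totalMemory() / mb, rt.maxMemory() / mb);
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

The committed value is the closest of the three to what the operating system
shows as process size, which is why it can stay high while the used value is
low.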
I do have large images, but not a large number of them: about three, plus
a QR code for every student, each about 150 bytes (PNG). These are
embedded as base64-encoded data.
I am not using many features of Batik; it is mainly replacing text
content in specific SVG elements and then converting them into PDF using
the PDFTranscoder provided in the Apache Batik 1.7 package.
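
The text-replacement step can be sketched roughly as below. This is an
assumption-laden illustration, not the application's code: it uses the JDK's
own XML parser rather than Batik's SVG DOM, the class and method names
(SvgTextReplace, parse, setTextById) and the element id "name" are
hypothetical, and the PDFTranscoder step is left out because it needs the
Batik jars on the classpath.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class SvgTextReplace {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    /** Parses an SVG string into a namespace-aware W3C DOM document. */
    static Document parse(String svg) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(svg)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed SVG", e);
        }
    }

    /** Replaces the content of the <text> element with the given id. */
    static boolean setTextById(Document doc, String id, String value) {
        // Scan the <text> elements instead of calling getElementById,
        // which only works when the parser knows that "id" is an ID attribute.
        NodeList texts = doc.getElementsByTagNameNS(SVG_NS, "text");
        for (int i = 0; i < texts.getLength(); i++) {
            Element element = (Element) texts.item(i);
            if (id.equals(element.getAttribute("id"))) {
                element.setTextContent(value);
                return true;
            }
        }
        return false; // no such element; the document is left unchanged
    }

    public static void main(String[] args) {
        Document doc = parse("<svg xmlns='" + SVG_NS + "'>"
                + "<text id='name'>placeholder</text></svg>");
        setTextById(doc, "name", "Alice Example");
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```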
What I have noticed is that a CleanerThread is created when I
start generating PDF files, but it never runs; it is always in the waiting
state. Is there a command which triggers this thread?
Regards,
Hilbert Mostert
On 05/12/2012 07:17 PM, DeWeese Thomas wrote:
Hi Hilbert,
How are you measuring how much memory you are using after step 4? If
you are just looking at process size, that can be very misleading:
typically the JVM will grow, and even if the JVM has freed most of the
memory it will hold onto the larger memory block, partially since it may
be fragmented and partially since it may need the memory again shortly.
There are caches in Batik for documents, images and other assets, but
unless you have a lot of large images it is unlikely they would reach
1 GB. Filter effects may also cache some intermediate results, but
typically those will be cleaned up when the filter is disposed of (which,
given the lazy nature of the JVM, may not happen for a while). A general
outline of the features you are using from SVG might help identify areas
that might be responsible for the memory bloat.
Which, by the way, raises the other issue: are you forcing a GC? If not,
lots of currently unused stuff will hang around until the memory is
needed for something else. Finally, remember that just calling for a
single GC doesn't typically do much to clear out memory.
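
One way to act on the point about a single GC is to request several
collections and stop once heap usage no longer drops. The sketch below is only
an illustration under that assumption; GcSettle, settle and the 100 ms pause
are invented for the example, and System.gc() remains a request that the JVM
is free to ignore.

```java
final class GcSettle {
    /** Heap bytes currently occupied, including garbage not yet collected. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Requests up to maxRounds collections and returns the last measured
     * heap usage, stopping early once a round no longer frees anything.
     */
    static long settle(int maxRounds) {
        long previous = usedBytes();
        for (int round = 0; round < maxRounds; round++) {
            System.gc(); // a hint to the JVM, not a guarantee
            try {
                Thread.sleep(100); // give concurrent collectors time to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            long current = usedBytes();
            if (current >= previous) {
                return current; // usage stopped dropping
            }
            previous = current;
        }
        return previous;
    }

    public static void main(String[] args) {
        long before = usedBytes();
        long after = settle(5);
        System.out.println("before=" + before + " bytes, after=" + after + " bytes");
    }
}
```

Measuring after a call like settle(5) gives a fairer picture of what is still
reachable than a reading taken straight after the workers finish.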
Thomas
On May 11, 2012, at 8:39 AM, Hilbert Mostert wrote:
Recently I have started using Apache Batik to create PDF files from SVG
templates. The application is used to generate exam pages for students. It
works great, but it uses a huge amount of memory. This is sometimes annoying
because I have to increase the memory limit to over 4 GB to have it complete the
task. There are in general lots of students (500+), and in one case 2000+
students. This will, of course, eat memory like an elephant; I accept that.
I want to reduce this memory footprint and have found one issue in my program
with which I need help.
I am using the Java JRE 1.6.0_32, Batik 1.7 and PDFBox 1.6.0.
The program has the following flow:
1. fetch students from source (Excel file)
2. create workers to generate PDF from SVG
3. while not all students have been processed do
   3.1 replace information in SVG document (using W3C functions from the
       Document class); this is done by the worker
   3.2 generate PDF from SVG document (using PDFTranscoder)
   3.3 check if there are more students; true: goto 3.1; false: continue
       with step 4
4. clean up workers
5. generate single PDF from all generated PDFs using PDFBox
6. done
It is a multi-threaded environment and each worker runs in its own thread.
Each worker has its own copy of the SVG document; they don't share anything
(for obvious reasons).
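
The flow above can be sketched with a fixed thread pool as follows. This is a
hypothetical outline, not the actual program: ExamPipeline, renderPage and
renderAll are invented names, and renderPage merely stands in for steps 3.1
and 3.2 (the SVG update and the PDFTranscoder call) by returning a label.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ExamPipeline {
    /** Placeholder for steps 3.1 and 3.2; a real worker would produce a PDF. */
    static String renderPage(String student) {
        return "page:" + student;
    }

    /** Runs steps 2 to 4: create workers, process every student, clean up. */
    static List<String> renderAll(List<String> students, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers); // step 2
        try {
            List<Future<String>> pending = new ArrayList<Future<String>>();
            for (final String student : students) {                   // step 3
                pending.add(pool.submit(new Callable<String>() {
                    public String call() {
                        return renderPage(student);
                    }
                }));
            }
            List<String> pages = new ArrayList<String>();
            for (Future<String> result : pending) {
                pages.add(result.get()); // keeps the input order
            }
            return pages;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while rendering", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a worker failed", e.getCause());
        } finally {
            pool.shutdown(); // step 4: let the worker threads terminate
        }
    }

    public static void main(String[] args) {
        System.out.println(renderAll(Arrays.asList("alice", "bob"), 2));
    }
}
```

Shutting the pool down in step 4 matters for the measurement afterwards,
since anything still referenced by a live worker thread cannot be collected.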
The problem I have found comes after step 4: after cleaning up the workers I
am still using 1 GB of memory, which is much more than when I start (around
128 MB). I suspect there is some caching here and there, but I do not have
enough knowledge of Batik to fix this problem.
Who can help me or has the answer for me?
Thanks in advance,
Hilbert Mostert
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]