Well, the results are in. As long as the file can be handled by iText, which
means it is under 2 GB, my new process is several times faster than the old
one. The general steps are:
1) Preprocess the pages into a HashMap
2) Use the bookmarks to determine which pages go into which end result pdf,
removing found pages from the HashMap
3) Once 1000 of these are collected, create a new pdf containing only those
pages and save the page and file info for later use. I also release the
large file reader at this point, since each subsequent thread will be
creating its own smaller reader.
4) Start multiple threads that use the previously collected info to split the
smaller pdfs into the final pdfs
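
For anyone who wants to see the bookkeeping in steps 1-3 written down, here
is a minimal sketch in plain Java. It is not our production code and makes no
iText calls: the class ChunkPlanner, the Doc and Chunk types and the plan()
method are all hypothetical names, and the String values in the page map only
stand in for whatever preprocessed page data you keep. It only shows how
pages are claimed from the HashMap and renumbered relative to the chunk they
end up in.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ChunkPlanner {
    /** A final output pdf and the page numbers it needs (hypothetical type). */
    static final class Doc {
        final String name;
        final List<Integer> pages;
        Doc(String name, List<Integer> pages) { this.name = name; this.pages = pages; }
    }

    /** An intermediate pdf: source pages to copy, plus its docs renumbered to chunk-local pages. */
    static final class Chunk {
        final List<Integer> sourcePages = new ArrayList<>();
        final List<Doc> docs = new ArrayList<>();
    }

    /**
     * Groups docs into chunks of at most docsPerChunk. Each claimed page is removed
     * from pageMap, so a page is assigned at most once.
     */
    static List<Chunk> plan(Map<Integer, String> pageMap, List<Doc> bookmarkedDocs, int docsPerChunk) {
        List<Chunk> chunks = new ArrayList<>();
        Chunk current = new Chunk();
        for (Doc doc : bookmarkedDocs) {
            List<Integer> localPages = new ArrayList<>();
            for (int page : doc.pages) {
                if (pageMap.remove(page) == null) continue; // unknown or already claimed
                current.sourcePages.add(page);
                localPages.add(current.sourcePages.size()); // 1-based position inside the chunk
            }
            current.docs.add(new Doc(doc.name, localPages));
            if (current.docs.size() == docsPerChunk) {
                chunks.add(current); // in the real process the chunk pdf is written here
                current = new Chunk();
            }
        }
        if (!current.docs.isEmpty()) chunks.add(current);
        return chunks;
    }

    public static void main(String[] args) {
        Map<Integer, String> pageMap = new HashMap<>();
        for (int p = 1; p <= 6; p++) pageMap.put(p, "page-" + p); // stand-in for preprocessed page data
        List<Doc> docs = List.of(
            new Doc("a.pdf", List.of(1, 2)),
            new Doc("b.pdf", List.of(3)),
            new Doc("c.pdf", List.of(4, 5, 6)));
        for (Chunk c : plan(pageMap, docs, 2)) {
            System.out.println(c.sourcePages + " -> " + c.docs.size() + " docs");
        }
    }
}
```

The demo uses 2 docs per chunk to keep the output short; we use 1000. The
chunk-local page numbers are what each worker thread needs later, because it
only ever opens the small chunk pdf, not the original file.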

Steps 1-3 are run in a single thread to minimize memory usage and maximize
the file size that we can process.

Step 4 is easier to tune because we are working with smaller files that use
less memory, and the page data has been preparsed. Plus we have a general
idea of how much memory those processes are going to use because we control
the file size.
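
A rough sketch of how step 4 can be sized and run, again in plain Java with
no iText calls. The names ChunkSplitRunner, threadsFor() and runAll() are
made up for illustration, and the 64 MB per-worker figure and the "half the
heap" budget are assumptions you would replace with your own measurements.
The idea is simply that once the chunk size is fixed, the thread count can be
capped by memory as well as by CPU count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkSplitRunner {
    /** Thread count bounded by CPU count and by how many chunk-sized workers fit in the heap budget. */
    static int threadsFor(long heapBudgetBytes, long perChunkBytes, int cpus) {
        long byMemory = Math.max(1, heapBudgetBytes / perChunkBytes);
        return (int) Math.min(byMemory, Math.max(1, cpus));
    }

    /** Runs one task per chunk on a fixed pool and returns the results in chunk order. */
    static List<Integer> runAll(List<Callable<Integer>> chunkTasks, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Integer> written = new ArrayList<>();
            for (Future<Integer> f : pool.invokeAll(chunkTasks)) {
                written.add(f.get()); // rethrows any failure from the worker
            }
            return written;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        long perChunk = 64L * 1024 * 1024; // assumed per-worker footprint, measure your own
        int threads = threadsFor(Runtime.getRuntime().maxMemory() / 2, perChunk,
                                 Runtime.getRuntime().availableProcessors());
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int docs : new int[] {1000, 1000, 250}) {
            tasks.add(() -> docs); // placeholder: a real task opens its own reader on one chunk pdf
        }
        System.out.println(threads + " threads, results " + runAll(tasks, threads));
    }
}
```

Each placeholder task just returns a document count; in practice the task
body is where a thread would open its own reader on one small pdf and write
out the final pdfs for that chunk.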

We took a process that was running in 14 hours down to 3 hours this way.
Hope this helps.

Edward W. Rouse


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Edward W. Rouse
Sent: Thursday, August 14, 2008 10:50 AM
To: 'Post all your questions about iText here'
Subject: Re: [iText-questions] iText & multithreading delays

We are having trouble multi-threading iText because of memory issues. When
we parse normal-size files, our single- and multi-threaded programs work
fine. Once the file sizes get bigger (1.5 GB in some cases), even the
single-threaded program can run out of memory. With a 500 MB file, anything
more than 2 threads causes an OOM. We can get rid of the OOM condition by not
using the getRandomAccessFileOrArray, but then it runs so much slower that it
negates the whole reason for using multiple threads.

Our current plan is to try to break the file up into smaller 'chunks' and
then multi-thread the processing of the chunks. That is what I am in the
process of doing right now. I'll let ya know how that works when I get it
finished and tested.

Edward W. Rouse


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of robert meyer
Sent: Wednesday, August 13, 2008 7:11 PM
To: itext-questions@lists.sourceforge.net
Subject: Re: [iText-questions] iText & multithreading delays

Talmage:  thank you for your response.

I don't believe that making significant modifications
to iText is the way we'd like to go about solving this
problem.

I was hoping to hear from someone who has used iText in
a multi-thread / multi-processor environment, and to learn
whether iText performed better for them than it has for us in
our simple test.

We would also appreciate some input from the iText
developers.

Thanks in advance to all.
rm.



-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great
prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions

Buy the iText book: http://www.1t3xt.com/docs/book.php

