Nvn,

Nvn wrote
> My requirement is to read the 2-3GB file sequentially (in order of
> bookmarks), extract contents w.r.t bookmarks and create the smaller PDF
> based on previous result.
> 
> I believe to read the large PDF, I need to use input stream as below. 
> 
> InputStream inputStream = new FileInputStream(INPUTFILE);
> PdfReader reader = new PdfReader(inputStream);

As indicated by Bruno, you need at least 5.3.x and should use 5.4.x.

But even then, the PdfReader constructor you chose is probably the least
suitable one, because iText, when reading a PDF from an InputStream, first
reads the complete stream into a byte array (cf.
RandomAccessSourceFactory().createSource(InputStream is)) and then parses
the complete PDF from that array (the partialRead parameter is false). At
that moment, at least, you need a great deal of memory, more than twice the
size of the PDF: it is present once as the byte array and then again as
parsed objects, which carry additional overhead. And I'm not sure that the
array is released afterwards...

The least amount of memory is required if you create a
java.io.RandomAccessFile for that file, wrap it as a RandomAccessSource
using RandomAccessSourceFactory().createSource(RandomAccessFile raf), wrap
this object in turn as a RandomAccessFileOrArray using the constructor
RandomAccessFileOrArray(RandomAccessSource byteSource), and finally create
the PdfReader using the constructor PdfReader(final RandomAccessFileOrArray
raf, final byte ownerPassword[]).

(Other RandomAccessFileOrArray constructors which might present shortcuts
have been marked deprecated.)
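
As a minimal sketch of the chain just described (assuming iText 5.4.x on the
classpath; the class name PartialReadSketch and the command-line argument are
made up for illustration, and null is passed because no owner password is
assumed):

```java
import java.io.IOException;
import java.io.RandomAccessFile;

import com.itextpdf.text.io.RandomAccessSource;
import com.itextpdf.text.io.RandomAccessSourceFactory;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.RandomAccessFileOrArray;

public class PartialReadSketch {
    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: PartialReadSketch <large.pdf>");
            return;
        }
        String inputFile = args[0]; // hypothetical: path to the large PDF

        // Open the file for random access instead of streaming it.
        try (RandomAccessFile file = new RandomAccessFile(inputFile, "r")) {
            // Wrap the file as a RandomAccessSource ...
            RandomAccessSource source =
                    new RandomAccessSourceFactory().createSource(file);
            // ... and wrap that in turn as a RandomAccessFileOrArray.
            RandomAccessFileOrArray raf = new RandomAccessFileOrArray(source);

            // Second argument is the owner password; null assumes there is none.
            PdfReader reader = new PdfReader(raf, null);
            try {
                System.out.println("Pages: " + reader.getNumberOfPages());
            } finally {
                reader.close();
            }
        }
    }
}
```

The file has to stay open for as long as the reader is in use, which is why
the reader is closed inside the try block that owns the RandomAccessFile.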

This approach neither reads the file into an intermediate byte array nor
parses the complete PDF; it parses only the parts required at a given time.
(Currently this constructor actually seems to be the only one that parses
the PDF partially; all others seem to parse the complete source.)
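
To illustrate in plain java.io terms why a RandomAccessFile-based source can
get by with little memory, here is a small sketch that is not taken from
iText. It reads only the last bytes of a file, which is where a PDF keeps the
pointer to its cross-reference table (the startxref entry), without touching
the rest. The class name TailReadSketch and the method readTail are made up
for this illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class TailReadSketch {

    // Reads at most maxBytes (assumed non-negative) from the end of the file.
    // Only the returned bytes are held in memory, regardless of file size.
    static byte[] readTail(String path, int maxBytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long length = raf.length();
            int count = (int) Math.min(length, (long) maxBytes);
            byte[] tail = new byte[count];
            raf.seek(length - count); // jump directly to the tail
            raf.readFully(tail);
            return tail;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: TailReadSketch <file>");
            return;
        }
        byte[] tail = readTail(args[0], 1024);
        System.out.println(new String(tail, StandardCharsets.ISO_8859_1));
    }
}
```

Reading from an InputStream offers no such seek, so reaching the end of the
data means consuming everything before it.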

Regards,   Michael



