Thank you for the feedback. I've tried the loadNonSeq approach combined with importing an FDF into the AcroForm. When I do this, memory use looks excessive: it goes up by about 30 MB for a simple 13 KB PDF.
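For reference, a rough way to check how much of an increase like this is still retained after a call, rather than being short-lived garbage, is to compare used heap before and after it. The following is only a minimal sketch: HeapDelta and ThrowingAction are hypothetical helper names, System.gc() is just a hint to the JVM, and the resulting numbers are approximate.

```java
// Hypothetical helper (not part of PDFBox): estimates how much heap
// stays in use after an action has run.
final class HeapDelta {

    /** Like Runnable, but the action may throw checked exceptions such as IOException. */
    interface ThrowingAction {
        void run() throws Exception;
    }

    /** Best-effort used-heap reading in bytes; System.gc() is only a request. */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Runs the action and returns the approximate change in used heap, in bytes. */
    static long measure(ThrowingAction action) {
        long before = usedHeap();
        try {
            action.run();
        } catch (Exception e) {
            throw new IllegalStateException("measured action failed", e);
        }
        return usedHeap() - before;
    }

    public static void main(String[] args) {
        // Stand-in workload: keep 8 MB reachable so it survives the collection.
        byte[][] holder = new byte[1][];
        long delta = measure(() -> holder[0] = new byte[8 * 1024 * 1024]);
        System.out.printf("retained: ~%.1f MB (holding %d bytes)%n",
                delta / (1024.0 * 1024.0), holder[0].length);
    }
}
```

With the objects from this thread, the measured call would be HeapDelta.measure(() -> acroForm.importFDF(fdfDocument)). Because a collection is requested after the action, a large result suggests the data is still referenced, not merely allocated along the way.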
The memory increase happens on the call acroForm.importFDF(fdfDocument).

Code snippet:

    // load PDF, using a scratch file for intermediate data
    PDDocument pdfDocument = PDDocument.loadNonSeq(pdfFile, pdfScratchFile);
    // load XFDF
    FDFDocument fdfDocument = FDFDocument.loadXFDF(args[1]);
    // get AcroForm
    PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
    PDAcroForm acroForm = docCatalog.getAcroForm();
    acroForm.setCacheFields(false);
    // import FDF
    acroForm.importFDF(fdfDocument);

I get the impression this call requires the entire document to be loaded into memory. Is there a way to conserve memory and still import the FDF? Should I avoid the call to importFDF and approach this differently, e.g. by manually populating the AcroForm?

Let me know your thoughts.

Thanks,
T

On Fri, Jan 17, 2014 at 2:12 AM, Maruan Sahyoun <[email protected]> wrote:

> Hi Tom,
>
> PDF is not a format which is built sequentially but a random-access
> format. In order to lower the memory consumption you can pass a temp file
> which will be used to store intermediate data.
>
> Take a look at
> http://pdfbox.apache.org/docs/1.8.3/javadocs/org/apache/pdfbox/pdmodel/PDDocument.html
> especially the descriptions of load and loadNonSeq (which is the
> preferred method).
>
> PDFStreamParser is used internally to parse PDF streams (a PDF internal
> structure).
>
> BR
>
> Maruan Sahyoun
>
> On 17.01.2014 at 04:39, Tom Kesling <[email protected]> wrote:
>
> > Hello,
> > I would like to ask a few questions about streaming with PDFBox.
> >
> > I use the term "streaming" for lack of a better term. My code will
> > execute in a JEE container, so I need to conserve memory as much as
> > possible.
> >
> > Goals:
> > I want to be able to set form fields in a PDF without loading the PDF
> > into memory.
> > I would like to stream in the PDF and set the fields as they are
> > encountered.
> > A new PDF will be streamed to disk with the populated form fields.
> >
> > I would also like to be able to read form fields from a PDF without
> > loading it into memory.
> > I would like to stream in the PDF and read the fields as they are
> > encountered.
> >
> > I've messed around with the PDFStreamParser but I haven't figured out
> > how to locate form fields.
> >
> > If anyone can give me any guidance or examples of how to do this, that
> > would help a lot.
> >
> > Any help is appreciated.
> >
> > Thanks,
> > T

