Yes, that's a good idea. However, it will be much slower, and the objects will still be in memory; only the stream contents (e.g. images, fonts, content streams) will be on disk.

Tilman

On 07.01.2022 at 17:55, Kevin Day wrote:
If you use the temporary file memory storage, it should be possible to work
with very large files.

https://stackoverflow.com/questions/11301818/pdfbox-working-with-very-large-pdfs/38859566

This isn't streaming (PDF is not really amenable to streaming in the way you
are asking), but the disk-based scratch memory should get you what you need.
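
For illustration, a minimal sketch of the temporary-file approach, assuming PDFBox 2.0.x (where mergeDocuments accepts a MemoryUsageSetting; the 3.x API differs). The class name TempFileMerge and the method name merge are made up for this example. As noted at the top of the thread, only stream contents go to the scratch file, so the object structure still has to fit in memory.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;

// Hypothetical helper class, not part of PDFBox.
public class TempFileMerge {

    /**
     * Merges the source PDFs into one destination file, asking PDFBox to
     * buffer stream data in a temporary file instead of main memory.
     */
    public static void merge(List<File> sources, File destination) throws IOException {
        PDFMergerUtility merger = new PDFMergerUtility();
        for (File source : sources) {
            merger.addSource(source);
        }
        merger.setDestinationFileName(destination.getAbsolutePath());
        // Scratch data goes to a temp file; expect this to be slower than in-memory merging.
        merger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
    }
}
```

Calling TempFileMerge.merge(files, new File("merged.pdf")) would then write the combined document in one pass; whether a couple thousand inputs fit still depends on how large their object structures are.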

On Fri, Jan 7, 2022, 12:18 AM Tilman Hausherr <thaush...@t-online.de> wrote:

On 06.01.2022 at 18:26, John Lussmyer wrote:
I need to merge a couple thousand PDFs into one humongous PDF.
The old tool we use for PDF manipulation runs out of memory because it builds
the result PDF in memory and only writes it out when done.
Can PDFBox do something more like streaming the output as it's built?
Or even avoid loading the source PDF content streams until they are needed for output?


No to the first question and yes to the second, so you'll also run out of memory at some point.

If the huge job is for printing, then remove the structure tree from
each file; it is only there for screen readers, so it is not needed. You
should save each stripped file somewhere and reload it, so that the removed
structures are no longer in memory.
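
A rough sketch of that step, again assuming PDFBox 2.0.x. StructureTreeStripper and strip are hypothetical names chosen for this example. It only drops the catalog's structure tree root and writes a new file; any further cleanup of tagging-related entries is left out.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;

// Hypothetical helper class, not part of PDFBox.
public class StructureTreeStripper {

    /**
     * Loads the input PDF, removes its structure tree root, and saves the
     * result to a separate output file so it can be reloaded for merging.
     */
    public static void strip(File input, File output) throws IOException {
        try (PDDocument doc = PDDocument.load(input)) {
            // Passing null removes the /StructTreeRoot entry from the catalog.
            doc.getDocumentCatalog().setStructureTreeRoot(null);
            doc.save(output);
        }
    }
}
```

The stripped copies, not the originals, would then be the inputs to the merge.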

Tilman



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org



