[ 
https://issues.apache.org/jira/browse/PDFBOX-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nunop5 updated PDFBOX-5372:
---------------------------
    Description: 
Hello,

I've been using PDFBox for a very long time, and it's been working very well 👍

 

However, for 2 sets of PDFs I've been trying to merge over the past day, I'm 
always getting LOADS of these warnings, followed by an "out of memory" crash 
at the end.

I believe (but could be wrong) that the "out of memory" is due to these endless 
Warnings.
(heap is currently 15g)

 

But does anyone know what could be causing these warnings? (Is there anything 
specific you'd suggest I look at in these PDFs?)

 

Thank you very much!

 

(... LOADS of these, redacted ...)
Jan 31, 2022 2:41:48 AM org.apache.pdfbox.multipdf.PDFMergerUtility mergeIDTree
WARNING: key node00018714 already exists in destination IDTree
Jan 31, 2022 2:41:48 AM org.apache.pdfbox.multipdf.PDFMergerUtility mergeIDTree
WARNING: key node00018715 already exists in destination IDTree
Jan 31, 2022 2:41:48 AM org.apache.pdfbox.multipdf.PDFMergerUtility mergeIDTree
WARNING: key node00018716 already exists in destination IDTree
Jan 31, 2022 2:41:48 AM org.apache.pdfbox.multipdf.PDFMergerUtility mergeIDTree
WARNING: key node00018717 already exists in destination IDTree
Jan 31, 2022 2:41:48 AM org.apache.pdfbox.multipdf.PDFMergerUtility mergeIDTree
WARNING: key node00018718 already exists in destination IDTree
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.pdfbox.io.ScratchFileBuffer.addPage(ScratchFileBuffer.java:132)
        at org.apache.pdfbox.io.ScratchFileBuffer.ensureAvailableBytesInPage(ScratchFileBuffer.java:184)
        at org.apache.pdfbox.io.ScratchFileBuffer.write(ScratchFileBuffer.java:236)
        at org.apache.pdfbox.io.RandomAccessOutputStream.write(RandomAccessOutputStream.java:46)
        at org.apache.pdfbox.cos.COSStream$2.write(COSStream.java:281)
        at org.apache.pdfbox.io.IOUtils.copy(IOUtils.java:70)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:127)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:109)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:146)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:146)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:146)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:109)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:117)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:146)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:109)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:117)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:146)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:109)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:117)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:146)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:109)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:146)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:109)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:146)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:109)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:146)
        at org.apache.pdfbox.multipdf.PDFMergerUtility.appendDocument(PDFMergerUtility.java:800)
        at org.apache.pdfbox.multipdf.PDFMergerUtility.legacyMergeDocuments(PDFMergerUtility.java:459)
        at org.apache.pdfbox.multipdf.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:346)
        at org.apache.pdfbox.tools.PDFMerger.merge(PDFMerger.java:70)
        at org.apache.pdfbox.tools.PDFMerger.main(PDFMerger.java:49)
        at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:81)

 

–

 

EDIT - I increased the heap, and it seems I got something new too:

 

Jan 31, 2022 10:50:46 AM org.apache.pdfbox.multipdf.PDFMergerUtility mergeIDTree
WARNING: key node00018714 already exists in destination IDTree
Jan 31, 2022 10:51:29 AM org.apache.pdfbox.cos.COSDocument finalize
*WARNING: Warning: You did not close a PDF Document*
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

 

I'm not sure whether this means something is actually wrong with one of the 
PDFs, or whether it is a consequence of running out of memory.

(And how do I know which PDF it's referring to? :) )

I'm using Puppeteer to generate the PDFs, so I assume it generated them 
correctly... (also, it consistently fails even if I regenerate all of them)
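
One quick way to test the hypothesis above (that the flood of warnings itself contributes to the crash) is to silence that one logger before merging. The sketch below is a minimal illustration, not a fix: it assumes the warnings reach java.util.logging (the timestamp/WARNING layout in the log suggests this) and that the logger is named after the emitting class, org.apache.pdfbox.multipdf.PDFMergerUtility. The class name QuietMergeWarnings and its helper method are hypothetical. Note that the stack trace points at stream cloning (PDFCloneUtility, ScratchFileBuffer) rather than logging, so the OutOfMemoryError may well persist.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietMergeWarnings {
    // Assumption: the logger is named after the class that emits the warning.
    static final String MERGER_LOGGER = "org.apache.pdfbox.multipdf.PDFMergerUtility";

    // Strong reference, so the configured logger is not garbage-collected
    // (java.util.logging only holds loggers weakly).
    private static Logger mergerLogger;

    // Raise the threshold so WARNING and lower levels are dropped.
    static Logger silenceMergerWarnings() {
        mergerLogger = Logger.getLogger(MERGER_LOGGER);
        mergerLogger.setLevel(Level.SEVERE);
        return mergerLogger;
    }

    public static void main(String[] args) {
        Logger log = silenceMergerWarnings();
        log.warning("key node00018714 already exists in destination IDTree"); // suppressed
        log.severe("severe messages are still reported");                     // still printed
        // ... run the merge here ...
    }
}
```

When running the command-line tool rather than your own code, the same threshold can be set in a java.util.logging properties file (passed with -Djava.util.logging.config.file=...) using a line such as `org.apache.pdfbox.multipdf.PDFMergerUtility.level = SEVERE`, under the same logger-name assumption.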



> *LOADS of* "WARNING: key node000xxxxx already exists in destination IDTree"
> ---------------------------------------------------------------------------
>
>                 Key: PDFBOX-5372
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5372
>             Project: PDFBox
>          Issue Type: Bug
>            Reporter: nunop5
>            Priority: Major


