kohdai created PDFBOX-3443:
------------------------------

             Summary: use org.apache.pdfbox.multipdf.Splitter#split(), Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
                 Key: PDFBOX-3443
                 URL: https://issues.apache.org/jira/browse/PDFBOX-3443
             Project: PDFBox
          Issue Type: Bug
          Components: PDModel
    Affects Versions: 2.0.2
         Environment: Windows 7/Java 8
            Reporter: kohdai


I use org.apache.pdfbox.multipdf.Splitter#split(),
but an OutOfMemoryError occurs when I try to split a large PDF (20 MB, 100 pages).

= =
PDDocument document = PDDocument.load(new File("large_size.pdf"));
Splitter splitter = new Splitter();
List<PDDocument> splittedDocuments = splitter.split(document);

= =
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.pdfbox.io.ScratchFileBuffer.addPage(ScratchFileBuffer.java:132)
        at org.apache.pdfbox.io.ScratchFileBuffer.ensureAvailableBytesInPage(ScratchFileBuffer.java:184)
        at org.apache.pdfbox.io.ScratchFileBuffer.write(ScratchFileBuffer.java:203)
        at org.apache.pdfbox.io.RandomAccessOutputStream.write(RandomAccessOutputStream.java:58)
        at java.io.FilterOutputStream.write(Unknown Source)
        at java.io.FilterOutputStream.write(Unknown Source)
        at org.apache.pdfbox.io.IOUtils.copy(IOUtils.java:68)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:119)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:99)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:125)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:99)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:136)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:136)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:136)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:99)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:108)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:136)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:99)
        at org.apache.pdfbox.multipdf.PDFCloneUtility.cloneForNewDocument(PDFCloneUtility.java:136)
        at org.apache.pdfbox.pdmodel.PDDocument.importPage(PDDocument.java:525)
        at org.apache.pdfbox.multipdf.Splitter.processPage(Splitter.java:206)
        at org.apache.pdfbox.multipdf.Splitter.processPages(Splitter.java:128)
        at org.apache.pdfbox.multipdf.Splitter.split(Splitter.java:63)
        at main.PDFConvert2Chapter.convert(PDFConvert2Chapter.java:89)
        at main.PDFConvert2Chapter.main(PDFConvert2Chapter.java:39)

= =

PDFBox 2.0.2: OOM occurred
PDFBox 2.0.1: no problem

I think the following code is wrong.

org.apache.pdfbox.pdmodel.PDDocument#importPage() tag:2.0.2

public PDPage importPage(PDPage page) throws IOException
{
    PDFCloneUtility cloner = new PDFCloneUtility(this);
    COSBase pageBase = cloner.cloneForNewDocument(page.getCOSObject());
    PDPage importedPage = new PDPage((COSDictionary) pageBase, resourceCache);
    addPage(importedPage);
    return importedPage;
}
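
The report does not say what exactly is wrong with this method. One possible reading, offered here only as an assumption, is that a fresh cloner per importPage() call cannot reuse copies made for earlier pages, so anything shared between pages is copied once per page. The toy model below illustrates that effect in plain Java. Node, Cloner and the counting helpers are hypothetical stand-ins, not PDFBox classes, and the sketch makes no claim about how PDFCloneUtility actually behaves.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class ClonerSketch {
    /** Hypothetical stand-in for an object that references other objects. */
    static final class Node {
        final List<Node> children = new ArrayList<>();
    }

    /** Hypothetical deep cloner that remembers what it has already copied. */
    static final class Cloner {
        private final Map<Node, Node> cloned = new IdentityHashMap<>();
        int copies = 0;

        Node cloneNode(Node source) {
            Node existing = cloned.get(source);
            if (existing != null) {
                return existing; // already copied by this cloner instance
            }
            Node copy = new Node();
            cloned.put(source, copy);
            copies++;
            for (Node child : source.children) {
                copy.children.add(cloneNode(child));
            }
            return copy;
        }
    }

    /** Builds pageCount pages that all reference one shared node. */
    static List<Node> pagesSharing(Node shared, int pageCount) {
        List<Node> pages = new ArrayList<>();
        for (int i = 0; i < pageCount; i++) {
            Node page = new Node();
            page.children.add(shared);
            pages.add(page);
        }
        return pages;
    }

    /** One new cloner per page: the shared node is copied for every page. */
    static int copiesWithFreshCloner(int pageCount) {
        int total = 0;
        for (Node page : pagesSharing(new Node(), pageCount)) {
            Cloner cloner = new Cloner();
            cloner.cloneNode(page);
            total += cloner.copies;
        }
        return total;
    }

    /** One cloner for all pages: the shared node is copied only once. */
    static int copiesWithSharedCloner(int pageCount) {
        Cloner cloner = new Cloner();
        for (Node page : pagesSharing(new Node(), pageCount)) {
            cloner.cloneNode(page);
        }
        return cloner.copies;
    }

    public static void main(String[] args) {
        System.out.println(copiesWithFreshCloner(100));  // prints 200
        System.out.println(copiesWithSharedCloner(100)); // prints 101
    }
}
```

In this model, 100 pages that share one node cost 200 copies with a cloner per page and 101 copies with a single cloner. Whether that is the cause of the OOM reported here would need to be confirmed against the PDFBox sources.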


