[ 
https://issues.apache.org/jira/browse/PDFBOX-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13679435#comment-13679435
 ] 

Andrew Dale commented on PDFBOX-1586:
-------------------------------------

Although this bug is marked as fixed in the 1.8.2 release of PDFBox, I believe
it is still present.  A simplified test case is:

    @Test
    public void testPdfBox2() throws Exception {
        PDDocument returnDocument = new PDDocument();
        String outputFilename = "/tmp/output.pdf";

        List<Integer> pages = Arrays.asList(1, 2, 3, 4, 5);

        try {
            // get/load current document
            PDDocument currentPdf = PDDocument.load(new File("/tmp/input.pdf"));

            @SuppressWarnings("unchecked")
            List<PDPage> currentDocumentPages =
                    currentPdf.getDocumentCatalog().getAllPages();

            for (Integer currentPage : pages) {
                // page numbers are 1-based, list indexes are 0-based
                returnDocument.importPage(currentDocumentPages.get(currentPage - 1));
            }

            // Cause of the problem: everything works if currentPdf is closed
            // only after returnDocument.save() and returnDocument.close()
            // have been called.
            currentPdf.close();

        } finally {
            returnDocument.save(outputFilename);
            returnDocument.close();
        }
    }

This gives me the following stacktrace:

org.apache.pdfbox.exceptions.COSVisitorException: java.lang.IndexOutOfBoundsException: Index: 72, Size: 0
        at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1354)
        at org.apache.pdfbox.cos.COSStream.accept(COSStream.java:217)
        at org.apache.pdfbox.cos.COSObject.accept(COSObject.java:206)
        at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:525)
        at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:435)
        at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1122)
        at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:552)
        at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1501)
        at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1324)
        at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1305)
        at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1292)
        at com.test.PdfBoxTest.testPdfBox2(PdfBoxTest.java:77)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
        at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
        at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.lang.IndexOutOfBoundsException: Index: 72, Size: 0
        at java.util.ArrayList.RangeCheck(ArrayList.java:547)
        at java.util.ArrayList.get(ArrayList.java:322)
        at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
        at org.apache.pdfbox.io.RandomAccessFileInputStream.read(RandomAccessFileInputStream.java:96)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1337)
        ... 34 more

I am using JDK 1.6.0_33 on 64-bit Linux (Ubuntu).
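
To make the ordering issue easier to see outside PDFBox, here is a toy model.
It is not PDFBox code: ToySource, ToyTarget and all of their methods are
hypothetical names made up for illustration. It rests on one assumption, which
is my reading of the "Size: 0" in the trace and not something I have confirmed
in the PDFBox source: imported pages keep reading from the source document's
buffer until save() runs, and close() on the source empties that buffer.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types exist in PDFBox.
class SharedBufferDemo {

    /** Hypothetical source document backed by a list of page chunks. */
    static final class ToySource {
        private final List<byte[]> chunks = new ArrayList<byte[]>();

        ToySource(int pageCount) {
            for (int i = 0; i < pageCount; i++) {
                chunks.add(new byte[] { (byte) i });
            }
        }

        /** Throws IndexOutOfBoundsException once close() has emptied the list. */
        byte[] read(int index) {
            return chunks.get(index);
        }

        /** Assumption: closing releases the backing buffer. */
        void close() {
            chunks.clear();
        }
    }

    /** Hypothetical target that stores references to pages, not copies. */
    static final class ToyTarget {
        private final List<ToySource> sources = new ArrayList<ToySource>();
        private final List<Integer> indexes = new ArrayList<Integer>();

        void importPage(ToySource source, int index) {
            sources.add(source);
            indexes.add(index);
        }

        /** Page data is only read from the source here, at save time. */
        int save() {
            int written = 0;
            for (int i = 0; i < sources.size(); i++) {
                written += sources.get(i).read(indexes.get(i)).length;
            }
            return written;
        }
    }

    private static ToyTarget importFivePages(ToySource source) {
        ToyTarget target = new ToyTarget();
        for (int i = 0; i < 5; i++) {
            target.importPage(source, i);
        }
        return target;
    }

    /** Working order: save the target first, then close the source. */
    static int closeAfterSave() {
        ToySource source = new ToySource(5);
        ToyTarget target = importFivePages(source);
        int written = target.save();
        source.close();
        return written;
    }

    /** Failing order: the source is closed before the target is saved. */
    static boolean closeBeforeSaveFails() {
        ToySource source = new ToySource(5);
        ToyTarget target = importFivePages(source);
        source.close();
        try {
            target.save();
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("close after save, bytes written: " + closeAfterSave());
        System.out.println("close before save fails: " + closeBeforeSaveFails());
    }
}
```

In the test case above, the equivalent change is to move currentPdf.close()
so that it runs after returnDocument.save(outputFilename) and
returnDocument.close().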
                
> IndexOutOfBoundsException when saving a document (at random)
> ------------------------------------------------------------
>
>                 Key: PDFBOX-1586
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1586
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 1.8.1
>            Reporter: James Green
>            Assignee: Andreas Lehmkühler
>            Priority: Critical
>             Fix For: 1.8.2
>
>
> Getting the following stacktrace:
> org.apache.pdfbox.exceptions.COSVisitorException: java.lang.IndexOutOfBoundsException: Index: 28, Size: 0
>     at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1245)
>     at org.apache.pdfbox.cos.COSStream.accept(COSStream.java:201)
>     at org.apache.pdfbox.cos.COSObject.accept(COSObject.java:206)
>     at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:524)
>     at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:434)
>     at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1056)
>     at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:496)
>     at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1392)
>     at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1157)
>     at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1138)
> ...
> Caused by: java.lang.IndexOutOfBoundsException: Index: 28, Size: 0
>     at java.util.ArrayList.rangeCheck(ArrayList.java:604)
>     at java.util.ArrayList.get(ArrayList.java:382)
>     at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
>     at org.apache.pdfbox.io.RandomAccessFileInputStream.read(RandomAccessFileInputStream.java:96)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>     at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1232)
>
> I'll add some context. We have a "data pipeline" in which a Windows Print
> Monitor sends postscript into a servlet which then uses GhostScript 9.05 to
> convert in-memory to PDF. This PDF is then loaded into PDFBox using
> PDDocument.load().
>
> At this point we split the original PDF into multiple smaller ones each of
> which is saved to a ByteArrayOutputStream. At the point of save() we are
> having serious reliability issues.
>
> Taking an original PDF from Ghostscript we have saved this into a unit test
> to replicate the problem without success. If we attempt to re-execute the
> pipeline to take the original PDF and split it, we get apparently random
> percentages of saved documents.
>
> For instance, on a 990 page document (text, no images), to be split into 990
> 1-page documents using Tomcat 7 with -Xmx=512m:
>   Pass 1: 50% were saved, 50% ended with stack traces
>   Pass 2: 100% were saved
>   Pass 3: 100% were saved
>
> The same test with -Xmx=128m ended several times with just 1 document saved,
> the rest were stack traces.
>
> We have also seen this randomly hit a sample document consisting of four
> pages to be split into two two-page documents so it does not appear to be
> memory related. We also added code to catch the IndexOutOfBoundsException and
> make up to ten attempts to repeat, but it seems the save() either works the
> first time or not at all.
>
> We're thinking there are environmental factors here but we're now focused on
> getting this nailed. Any advice or assistance will be welcomed.

