Re: [jira] [Commented] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document
Hi John,

As much as I applaud your efforts on this project, do you have any idea how I can stop receiving the email updates?

Steve Tyler
Chief Information Officer
UK: +44 (0) 7917 005990
USA: +1 312 239 0593
Email: steve.ty...@episys.com
Web: www.episys.com

On 8 Feb 2014, at 19:33, John Hewson (JIRA) j...@apache.org wrote:

[ https://issues.apache.org/jira/browse/PDFBOX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13895704#comment-13895704 ]

John Hewson commented on PDFBOX-1498:

There's no way you're going to be able to open a 700MB PDF file with only 1024MB of heap. I think that is probably the cause of your problem. Try using at least 4x more heap space.

Index Out Of Bounds Exception while reading large PDF Document

Key: PDFBOX-1498
URL: https://issues.apache.org/jira/browse/PDFBOX-1498
Project: PDFBox
Issue Type: Bug
Reporter: Manoj Patel
Assignee: Andreas Lehmkühler

I am getting java.lang.IndexOutOfBoundsException while reading a large PDF document (800 MB). Below is the full stack trace:

Exception in thread "main" org.apache.pdfbox.exceptions.WrappedIOException
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:243)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1071)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1038)
    at imageData.AddFooter.main(AddFooter.java:26)
Caused by: java.lang.IndexOutOfBoundsException: Index: 3377, Size: 3377
    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
    at java.util.ArrayList.get(ArrayList.java:322)
    at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
    at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:106)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
    at java.io.FilterOutputStream.close(FilterOutputStream.java:140)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:606)
    at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:566)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:187)
    ... 3 more

--
This message was sent by Atlassian JIRA (v6.1.5#6160)

--
This e-mail is only intended for the person(s) to whom it is addressed as it may contain confidential information. For more information please visit www.episys.com/disclaimer.htm
Re: [jira] [Commented] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document
Hi,

On 08.02.2014 at 20:36, Steve Tyler wrote:

Hi John, As much as I applaud your efforts on this project, do you have any idea how I can stop receiving the email updates?

You have to unsubscribe; see [1] for further details.

BR
Andreas Lehmkühler

[1] http://pdfbox.apache.org/mailinglists.html
Re: [jira] [Commented] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document
I guess you could filter out any email containing [jira]; I'm not aware of any other method. Perhaps we need a separate mailing list for JIRA?

-- John

On 8 Feb 2014, at 11:36, Steve Tyler steve.ty...@episys.com wrote:

Hi John, As much as I applaud your efforts on this project, do you have any idea how I can stop receiving the email updates?
Re: [jira] [Commented] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document
Hi,

On 23.01.2013 at 10:27, Maruan Sahyoun wrote:

Hi Manoj,

I'm afraid Manoj isn't subscribed to this list.

BR
Andreas Lehmkühler
[jira] [Commented] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document
[ https://issues.apache.org/jira/browse/PDFBOX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561817#comment-13561817 ]

Andreas Lehmkühler commented on PDFBOX-1498:

Try the new Non Sequential Parser by using PDDocument.loadNonSeq(…) instead of PDDocument.load(...)

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document
[ https://issues.apache.org/jira/browse/PDFBOX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560504#comment-13560504 ]

Manoj Patel commented on PDFBOX-1498:

Sorry, but I cannot share the document with anyone. I have created a new document which is around 700 MB. Now when I try the same program, it gives the Java heap space exception below, even though I have set the -Xmx1024 parameter:

Exception in thread "main" org.apache.pdfbox.exceptions.WrappedIOException
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:243)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1071)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1038)
    at imageData.ReadLargeFile.main(ReadLargeFile.java:13)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
    at org.apache.pdfbox.cos.COSStream.createFilteredStream(COSStream.java:415)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:452)
    at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:566)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:187)
    ... 3 more

Is there any way to read it?
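Given the OutOfMemoryError reported here with -Xmx1024, and the advice elsewhere in this thread to use considerably more heap, it may help to check how much heap the JVM actually received before loading the file. The following is a minimal Java sketch, not part of PDFBox: the class HeapCheck and the helper hasHeadroom are hypothetical names, and reading the advice as "maximum heap at least four times the file size" is an assumption made only for illustration. Note also that the JVM's -Xmx option is normally written with a unit suffix (for example -Xmx1024m); without a suffix the value is taken as bytes.

```java
import java.io.File;

// Hypothetical helper, not part of PDFBox: compares the JVM's maximum heap
// with the size of the file that is about to be parsed.
class HeapCheck {

    // Returns true if maxHeapBytes is at least `factor` times fileBytes.
    // The factor of 4 used in main is an assumed reading of the thread's advice.
    static boolean hasHeadroom(long fileBytes, long maxHeapBytes, int factor) {
        return maxHeapBytes / factor >= fileBytes;
    }

    public static void main(String[] args) {
        // Maximum heap the JVM will attempt to use (reflects the -Xmx setting).
        long maxHeap = Runtime.getRuntime().maxMemory();
        // Use the file given on the command line, or assume a 700 MB file.
        long fileBytes = args.length > 0
                ? new File(args[0]).length()
                : 700L * 1024 * 1024;

        System.out.printf("max heap: %d MB, file: %d MB%n",
                maxHeap >> 20, fileBytes >> 20);
        System.out.println(hasHeadroom(fileBytes, maxHeap, 4)
                ? "heap looks sufficient under the assumed 4x rule"
                : "heap is probably too small under the assumed 4x rule");
    }
}
```

Running it with the same JVM options as the failing program shows whether the heap setting took effect at all; the 4x threshold itself is only a rough rule of thumb.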
Re: [jira] [Commented] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document
Hi Manoj,

the size alone is not the cause of the issue. In a recent project we were handling PDFs larger than the one you are talking about.

1. Can you test with the Non Sequential Parser, i.e. PDDocument.loadNonSeq(…), and confirm whether it causes the same issue?
2. Can you upload a sample PDF which enables us to reproduce the issue? Without that it will be very difficult to say why this is happening.
3. Of course you can try with larger heap settings until it works, but I don't think this is a good approach.

In addition to that, it would be good if you could describe what you want to achieve with the PDF. Maybe there are ways of doing so without parsing the complete file.

With kind regards
Maruan Sahyoun
[jira] [Commented] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document
[ https://issues.apache.org/jira/browse/PDFBOX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13559514#comment-13559514 ]

Andreas Lehmkühler commented on PDFBOX-1498:

The described IndexOutOfBounds issue isn't related to the size of the PDF (see PDFBOX-1490; the PDF in question there has a size of 102Kb). Are you sure that

- you have the latest code?
- you are really using the newly compiled code?
- the issue/stack trace is the same?
[jira] [Commented] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document
[ https://issues.apache.org/jira/browse/PDFBOX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13559536#comment-13559536 ]

Manoj Patel commented on PDFBOX-1498:

Yes, I am using fontbox-1.8.0-SNAPSHOT.jar, jempbox-1.8.0-SNAPSHOT.jar and pdfbox-1.8.0-SNAPSHOT.jar, and I am still getting the error below, even though I checked out the latest code and built the latest jars a few minutes ago. Below is the stack trace I am getting:

Exception in thread "main" org.apache.pdfbox.exceptions.WrappedIOException
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:243)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1071)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1038)
    at imageData.ReadLargeFile.main(ReadLargeFile.java:13)
Caused by: java.lang.IndexOutOfBoundsException: Index: 3376, Size: 3376
    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
    at java.util.ArrayList.get(ArrayList.java:322)
    at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
    at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:106)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
    at java.io.FilterOutputStream.close(FilterOutputStream.java:140)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:606)
    at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:566)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:187)
    ... 3 more
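The "Index: N, Size: N" message in these traces is what ArrayList.get reports when it is asked for the element one past the end of the list. The sketch below is a hypothetical, heavily simplified model of a chunked in-memory buffer. It is not the PDFBox RandomAccessBuffer code, and the names ChunkedBuffer, seekUnchecked and seekGrowing are invented for illustration. It only shows how a seek to a position just beyond the allocated chunks can raise this kind of exception regardless of file size, and how growing the list first avoids it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a buffer stored as a list of fixed-size chunks.
// This is an illustration only, not the actual PDFBox implementation.
class ChunkedBuffer {
    private static final int CHUNK_SIZE = 1024;
    private final List<byte[]> chunks = new ArrayList<byte[]>();

    ChunkedBuffer(int chunkCount) {
        for (int i = 0; i < chunkCount; i++) {
            chunks.add(new byte[CHUNK_SIZE]);
        }
    }

    // Naive seek: looks up the chunk without checking that it exists.
    // Throws IndexOutOfBoundsException when index == chunks.size().
    byte[] seekUnchecked(long position) {
        int index = (int) (position / CHUNK_SIZE);
        return chunks.get(index);
    }

    // Defensive seek: allocates any missing chunks before the lookup.
    byte[] seekGrowing(long position) {
        int index = (int) (position / CHUNK_SIZE);
        while (index >= chunks.size()) {
            chunks.add(new byte[CHUNK_SIZE]);
        }
        return chunks.get(index);
    }

    int chunkCount() {
        return chunks.size();
    }
}
```

With three chunks allocated, seeking to byte offset 3072 (the first byte after the last chunk) fails in seekUnchecked but succeeds in seekGrowing, which leaves four chunks allocated.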
[jira] [Commented] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document
[ https://issues.apache.org/jira/browse/PDFBOX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13559557#comment-13559557 ]

Andreas Lehmkühler commented on PDFBOX-1498:

Hmmm, there are two possible ways to proceed:

- upload the PDF in question to a sharehoster, so that we can download it (send me a private mail with the download link if you can't make the PDF public)
- attach the pdfbox.jar you're using so that we can double-check your environment
[jira] [Commented] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document
[ https://issues.apache.org/jira/browse/PDFBOX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13559429#comment-13559429 ]

Andreas Lehmkühler commented on PDFBOX-1498:

Please attach a sample PDF.
[jira] [Commented] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document
[ https://issues.apache.org/jira/browse/PDFBOX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13559441#comment-13559441 ]

Manoj Patel commented on PDFBOX-1498:

It's a document of around 800 MB. You can try any PDF file of the same size.