[
https://issues.apache.org/jira/browse/PDFBOX-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jim deVos updated PDFBOX-3142:
------------------------------
Description:
My team uses PDFMergerUtility to attach cover pages to various PDFs. We
recently tried using a scratch file (via
PDFMergerUtility.mergeDocumentsNonSeq()) to cut down on the amount of RAM we
use. This approach works for the majority of PDFs in our system, but
some files cause the merger utility to generate a result PDF with a blank
page. Specifically, the result PDF contains a blank page after the cover page
instead of the first page of the second document passed to the merger utility.
Whenever this problem occurs, we see the following line in our logs:
{{org.apache.pdfbox.pdfparser.NonSequentialPDFParser - Can't find the object 52
0 (origin offset 7187557)}}
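Since this warning shows up every time the problem occurs, matching it in captured log output could serve as a second automated signal next to a file-size check. The following is only a minimal sketch, assuming the message wording stays exactly as quoted above (other PDFBox versions may word it differently). The class and method names (MissingObjectLogLine, parse) are hypothetical and not part of PDFBox.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of PDFBox: spots the parser warning in a log line. */
public class MissingObjectLogLine {

    // Assumption: the message text is exactly the one quoted in this report.
    // \s+ between the two numbers tolerates a line wrap that was joined with spaces.
    private static final Pattern MISSING_OBJECT = Pattern.compile(
            "Can't find the object (\\d+)\\s+(\\d+) \\(origin offset (\\d+)\\)");

    /** The three numbers carried by the warning. */
    public static final class MissingObject {
        public final long objectNumber;
        public final int generation;
        public final long originOffset;

        MissingObject(long objectNumber, int generation, long originOffset) {
            this.objectNumber = objectNumber;
            this.generation = generation;
            this.originOffset = originOffset;
        }
    }

    /** Returns the parsed warning, or an empty Optional if the line does not contain it. */
    public static Optional<MissingObject> parse(String logLine) {
        Matcher m = MISSING_OBJECT.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new MissingObject(
                Long.parseLong(m.group(1)),
                Integer.parseInt(m.group(2)),
                Long.parseLong(m.group(3))));
    }
}
```

A test could feed each captured log line to parse() and fail the merge check as soon as a result is present.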
I'll try to attach or link an example PDF soon, but currently I don't have
permission to redistribute any files that exhibit the problem. However,
here's a simple snippet that replicates the problem - it's pretty
straightforward.
{code}
@Test
public void testMergeNonSeq() throws IOException, COSVisitorException {
    destinationPdf = new File(TMP_FOLDER, "result-nonseq.pdf");
    PDFMergerUtility ut = new PDFMergerUtility();
    // RandomAccess and RandomAccessFile are the org.apache.pdfbox.io types, not java.io
    RandomAccess ram = new RandomAccessFile(
            File.createTempFile("mergeram", ".bin"), "rw");
    ut.addSource(coverpagePdf);
    ut.addSource(documentPdf);
    ut.setDestinationFileName(destinationPdf.getCanonicalPath());
    ut.mergeDocumentsNonSeq(ram);

    // the only automated way we have to tell that something went wrong is
    // to check the size of the result
    assertThat("destination pdf should be larger than the original pdf",
            destinationPdf.length(), is(greaterThan(documentPdf.length())));
}
{code}
Note we only see this problem with PDFMergerUtility.mergeDocumentsNonSeq().
Using PDFMergerUtility.mergeDocuments() does not exhibit any problems.
was:
My team uses PDFMergerUtility to attach cover pages to various PDFs. We
recently tried using a scratch file (via
PDFMergerUtility.mergeNonSeq()) to cut down on the amount of RAM we use.
This approach works for the majority of PDFs in our system, but some files
cause the merger utility to generate a result PDF with a blank page.
Specifically, the result PDF contains a blank page after the cover page instead
of the first page of the second document passed to the merger utility.
Whenever this problem occurs, we see the following line in our logs:
{{org.apache.pdfbox.pdfparser.NonSequentialPDFParser - Can't find the object 52
0 (origin offset 7187557)}}
I'll try to attach or link an example PDF soon, but currently I don't have
permission to redistribute any files that exhibit the problem. However,
here's a simple snippet that replicates the problem - it's pretty
straightforward.
{code}
@Test
public void testMergeNonSeq() throws IOException, COSVisitorException {
    destinationPdf = new File(TMP_FOLDER, "result-nonseq.pdf");
    PDFMergerUtility ut = new PDFMergerUtility();
    RandomAccess ram = new RandomAccessFile(
            File.createTempFile("mergeram", ".bin"), "rw");
    ut.addSource(coverpagePdf);
    ut.addSource(documentPdf);
    ut.setDestinationFileName(destinationPdf.getCanonicalPath());
    ut.mergeDocumentsNonSeq(ram);

    // the only automated way we have to tell that something went wrong is
    // to check the size of the result
    assertThat("destination pdf should be larger than the original pdf",
            destinationPdf.length(), is(greaterThan(documentPdf.length())));
}
{code}
> PDFMergerUtility with scratch file generates result with blank pages for
> certain source files.
> ----------------------------------------------------------------------------------------------
>
> Key: PDFBOX-3142
> URL: https://issues.apache.org/jira/browse/PDFBOX-3142
> Project: PDFBox
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 1.8.10
> Environment: Ubuntu 14.04.3, java 1.8.0_66
> Reporter: Jim deVos