If you mean saving to a temp file, then re-reading, manipulating and writing it again: this is not an option, because performance is very important for us. As you can see from my code snippet, I am already using piped streams to avoid disk I/O.

But anyway, Maruan, what is your suggestion?
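For readers unfamiliar with the piped-stream approach mentioned above, here is a minimal sketch using only java.io. The names `PipedProducer`, `StreamProducer` and `pipe` are hypothetical and not part of PDFBox; the producer callback merely stands in for the call to `merger.mergeDocuments(...)` in the quoted snippet. The sketch assumes the write end should be closed in the worker thread whether or not the producer fails.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

final class PipedProducer {

    /** Hypothetical callback that writes its result to the given stream. */
    interface StreamProducer {
        void writeTo(OutputStream out) throws IOException;
    }

    /**
     * Runs the producer on a background thread and returns the read end
     * of an in-memory pipe, so no temp file is involved.
     */
    static InputStream pipe(StreamProducer producer) throws IOException {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out);
        Thread worker = new Thread(() -> {
            // try-with-resources closes the write end even if the producer
            // throws, so the reader reaches end-of-stream instead of
            // depending on the state of a dead writer thread.
            try (OutputStream o = out) {
                producer.writeTo(o);
            } catch (IOException e) {
                System.err.println("producer failed: " + e);
            }
        });
        worker.setDaemon(true);
        worker.start();
        return in;
    }
}
```

One caveat of this design: if the producer fails, the reader simply sees a truncated stream, so a real implementation would probably need a way to hand the exception over to the consumer.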
From: Maruan Sahyoun <[email protected]>
To: [email protected]
Date: 15.04.2016 12:54
Subject: Re: How to merge PDF/A-1b documents and keep conformity

Hi,

> On 15.04.2016 at 12:35, [email protected] wrote:
>
> Basically your hack works if I overwrite PDFMergerUtility (extending it
> is no option even in the same package, because 'appendDocument()' needs
> private members). I had to modify your snippet as follows to avoid
> adding multiple intents, which leads to a validation error:
>
>     private boolean hasIntent = false;
>     ...
>     public void appendDocument(PDDocument destination, PDDocument source)
>             throws IOException
>     {
>         ...
>         if (!hasIntent) {
>             hasIntent = true;
>             List<PDOutputIntent> srcOutputIntents =
>                     srcCatalog.getOutputIntents();
>             for (PDOutputIntent outputIntent : srcOutputIntents)
>                 destCatalog.addOutputIntent(outputIntent);
>         }
>         ...
>     }
>
> It would be really nice if I could either tell the merger to set a given
> output intent or to copy the first one as shown above. How do I achieve
> this without duplicating your original code? An additional parameter to
> the PDFMergerUtility constructor or to mergeDocuments() for setting the
> desired PDF/A standard type, or at least one for setting the top-level
> output intent, would be really nice.

Would it be an option to do the merge first and remove the output intent that is needed/you'd like to keep on the merged document afterwards?

BR
Maruan

> From: [email protected]
> To: [email protected]
> Date: 15.04.2016 11:11
> Subject: Reply: Re: How to merge PDF/A-1b documents and keep conformity
>
> Hi Tilman.
>
> What exactly do you need to know except for what I already told you in
> the "situation" paragraph? We currently use something like this:
>
>     public InputStream merge(final List<InputStream> sources)
>             throws IOException {
>         PDFMergerUtility merger = new PDFMergerUtility();
>         for (InputStream source : sources) {
>             logger.trace("PDF merger source = {}", source);
>             merger.addSource(source);
>         }
>         PipedOutputStream outputStream = new PipedOutputStream();
>         PipedInputStream inputStream = new PipedInputStream(outputStream);
>         merger.setDestinationStream(outputStream);
>         new Thread(() -> {
>             try {
>                 merger.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());
>             } catch (IOException e) {
>                 logger.error("PDF merge problem", e);
>             }
>         }).start();
>         return inputStream;
>     }
>
> Does that help? By the way, I need an automated, stable PDF merge
> solution, not a one-time hack including manual editing in Notepad++.
> Furthermore, I cannot just add code to your API; I would like to use the
> API as is. I tried a quick & dirty extension of PDFMergerUtility with a
> subclass overriding 'appendDocument', copying all the original source
> code. But the thing is that the method uses non-public classes like
> PDFCloneUtility, non-public members etc. I could only try to use the
> same package as the original, but this is not nice.
>
> The source documents are, as I said, PDF/A-1b compliant, all of them
> created by the same output management system. So I guess the output
> intents (whatever that means) are similar or identical.
>
> Regards
> --
> Alexander Kriegisch
>
>
> From: Tilman Hausherr <[email protected]>
> To: [email protected]
> Date: 13.04.2016 18:20
> Subject: Re: How to merge PDF/A-1b documents and keep conformity
>
> On 13.04.2016 at 12:03, [email protected] wrote:
>> Hi, I am new to this list.
>>
>> My profile is: experienced Java programmer, knowing how to use
>> PDFMergerUtility, but not a PDF or even PDF/A-1b expert.
>>
>> Situation: We have a bunch of PDF/A-1b compliant documents from a 3rd
>> party system and merge them into a new document. The end result is not
>> PDF/A-1b compliant though.
>>
>> I found this on the mailing list archive:
>> http://pdfbox-users.markmail.org/search/?q=merge%20pdf%2Fa#query:merge%20pdf%2Fa+page:1+mid:uwvybz6lhgof3agg+state:results
>>
>> Is there a better answer today than to look into PDFMergerUtility
>> sources? Because this class is what we are using, but it does not do
>> it, at least not in version 1.8.9. Is there a reason to assume that
>> this has changed in 2.x?
>>
> You didn't mention what went wrong. I had that problem once with 2 files
> from the same source. What I did is:
>
> 1) In the 2.0 source code (I won't bother with 1.8), add this in
> PDFMergerUtility.appendDocument() above the comment "merge logical
> structure hierarchy":
>
>     List<PDOutputIntent> srcOutputIntents =
>             srcCatalog.getOutputIntents();
>     for (PDOutputIntent outputIntent : srcOutputIntents)
>     {
>         destCatalog.addOutputIntent(outputIntent);
>     }
>
> 2) Then I edited the result PDF manually to remove one of the output
> intents. The result PDF should have something like this:
>
>     /OutputIntents [7 0 R 8 0 R]
>
> Just blank one of the two, e.g. like this:
>
>     /OutputIntents [7 0 R      ]
>
> Make sure that you don't change any positions, i.e. switch your editor
> (Notepad++) to overwrite mode.
>
> This may or may not work... if the two files have different output
> intents, then you'll have surprises, obviously.
>
> I haven't done any code changes... I don't know for sure what element of
> the outputIntent is the "key" (so as to skip others with the same key),
> and I don't know what I should do if files have different ones. I
> suspect it is "OutputConditionIdentifier".
>
> Example of an outputIntent:
>
>     <<
>     /Type/OutputIntent
>     /S/GTS_PDFA1
>     /OutputCondition(U.S. Web Coated \(SWOP\) v2)
>     /OutputConditionIdentifier(CGATS TR 001)
>     /Info(U.S. Web Coated \(SWOP\) v2)
>     /DestOutputProfile 4 0 R
>     >>
>
>     4 0 obj
>     <<
>     /N 4
>     /Filter/FlateDecode
>     /Length 389758
>     >>
>     stream
>     ...
>     endstream
>     endobj
>
> If you tell more about what you're trying to do (one-time-only problem
> or not?), maybe I can help...
>
> Tilman
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
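Tilman's guess in the thread above is that "OutputConditionIdentifier" is the key by which duplicate output intents could be skipped. The following is only a sketch of that idea, under the assumption that the guess holds. `OutputIntentDedup` and its nested `Intent` class are hypothetical stand-ins, not PDFBox classes, and carry just the fields needed to show the filtering.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OutputIntentDedup {

    /** Hypothetical stand-in for PDOutputIntent; not a PDFBox class. */
    static final class Intent {
        final String outputConditionIdentifier;
        final String info;

        Intent(String outputConditionIdentifier, String info) {
            this.outputConditionIdentifier = outputConditionIdentifier;
            this.info = info;
        }
    }

    /**
     * Keeps the first intent seen for each OutputConditionIdentifier,
     * preserving the order in which the intents were encountered.
     * Assumption: the identifier alone is a sufficient key.
     */
    static List<Intent> dedupe(List<Intent> intents) {
        Map<String, Intent> firstByKey = new LinkedHashMap<>();
        for (Intent intent : intents) {
            firstByKey.putIfAbsent(intent.outputConditionIdentifier, intent);
        }
        return new ArrayList<>(firstByKey.values());
    }
}
```

With PDFBox 2.x, the key would presumably be read through `PDOutputIntent.getOutputConditionIdentifier()`, and the filtered list written back to the merged document's catalog. Whether a result with several distinct intents still validates as PDF/A-1b is not settled in this thread.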

