Hi,
I'd like to use PDFBox to remove potentially confidential metadata (such as
author, keywords, comments, ...) from a document.
From http://pdfbox.apache.org/cookbook/workingwithmetadata.html , I see
that I can use the PDDocumentInformation.setXXX() methods to clear that
data; that part is straightforward.
But what about XML metadata attached to some PDModel structure? Can this
also be safely removed?
What about embedded files (attachments): how can I detect and remove them?
Is there perhaps a toolkit solution to remove all non-display-related
data from a document?
Thanks for your comments,
best regards
-hannes