Hi,

I'd like to use PDFBox to remove potentially confidential metadata (author, keywords, comments, ...) from a document.

From http://pdfbox.apache.org/cookbook/workingwithmetadata.html I see that I can use the PDDocumentInformation.setXXX() methods to clear that data; okay, that part was easy.
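
For reference, here is a minimal sketch of that first step. It assumes PDFBox 2.x (PDDocument.load(File); as far as I know, 3.x uses Loader.loadPDF instead). The class name ClearDocumentInfo and the helper clearInfo are made up for illustration, and only the standard text entries of the information dictionary are cleared, not dates or custom entries.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;

// Hypothetical helper class, not part of PDFBox.
public class ClearDocumentInfo {

    // Clears the standard text entries of the document information
    // dictionary. Passing null to a setter removes that entry.
    static void clearInfo(PDDocument doc) {
        PDDocumentInformation info = doc.getDocumentInformation();
        info.setTitle(null);
        info.setAuthor(null);
        info.setSubject(null);
        info.setKeywords(null);
        info.setCreator(null);
        info.setProducer(null);
    }

    // Usage (assumed): java ClearDocumentInfo input.pdf output.pdf
    public static void main(String[] args) throws IOException {
        // Assumption: PDFBox 2.x loading API.
        try (PDDocument doc = PDDocument.load(new File(args[0]))) {
            clearInfo(doc);
            doc.save(new File(args[1]));
        }
    }
}
```

This touches only the information dictionary; it does nothing about XML metadata or included files, which is what the questions below are about.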

But what about XML metadata attached to some PDModel structure? Can that also be removed safely?
And what about included (embedded) files: how can I detect and remove them?


Is there perhaps a toolkit solution to remove all non-display-related data from a document?


Thanks for your comments,
best regards

        -hannes
