[ 
https://issues.apache.org/jira/browse/PDFBOX-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17829197#comment-17829197
 ] 

Michael Klink commented on PDFBOX-5788:
---------------------------------------

You should not expect separately saved versions of a PDF to be identical as 
byte streams. In general, numerous details may differ: the second part of the 
ID, the modification date and time, and even all encrypted data (as there are 
encryption algorithms that require random inputs).
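
To illustrate the last point, here is a minimal, JDK-only sketch (it does not 
use PDFBox; the class name RandomIvDemo, the all-zero demo key and the sample 
text are made up for this example). It encrypts the same content twice with 
AES-CBC under a fresh random initialization vector, which is roughly how 
AES-based PDF encryption treats strings and streams. The two results should 
differ as bytes (barring an astronomically unlikely IV collision) while both 
decrypt to the same content.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

public class RandomIvDemo {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Encrypts with AES-CBC under a fresh random IV; returns IV followed by ciphertext.
    static byte[] encrypt(byte[] key, byte[] plaintext) throws GeneralSecurityException {
        byte[] iv = new byte[16];
        RANDOM.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
        byte[] out = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
        return out;
    }

    // Reads the IV from the first 16 bytes and decrypts the remainder.
    static byte[] decrypt(byte[] key, byte[] ivAndCiphertext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(ivAndCiphertext, 0, 16));
        return cipher.doFinal(ivAndCiphertext, 16, ivAndCiphertext.length - 16);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        byte[] key = new byte[16]; // all-zero demo key, for illustration only
        byte[] content = "same stream content".getBytes(StandardCharsets.UTF_8);
        byte[] first = encrypt(key, content);
        byte[] second = encrypt(key, content);
        // Byte-level comparison of the two encrypted results.
        System.out.println("bytes equal:   " + Arrays.equals(first, second));
        // Comparison of what the two results decrypt to.
        System.out.println("content equal: "
                + Arrays.equals(decrypt(key, first), decrypt(key, second)));
    }
}
```

A hash over the raw bytes behaves like the first comparison, so a test that 
needs stable results is better off comparing decoded content than file bytes.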


> ID References changes when saving PDFs.
> ---------------------------------------
>
>                 Key: PDFBOX-5788
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5788
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 3.0.1 PDFBox, 3.0.2 PDFBox
>            Reporter: Daniel Persson
>            Priority: Minor
>
>  
> {code:java}
> private static void runPDF(String name) throws IOException, 
> NoSuchAlgorithmException {
>     PDDocument doc = Loader.loadPDF(new File(name));
>     File tmpFile = File.createTempFile("tmp", ".pdf");
>     doc.save(tmpFile);
>     byte[] data = Files.readAllBytes(Paths.get(tmpFile.getAbsolutePath()));
>     byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
>     System.out.println(encodeHexString(hash));
>     File tmpFile2 = File.createTempFile("tmp", ".pdf");
>     doc.save(tmpFile2);
>     byte[] data2 = Files.readAllBytes(Paths.get(tmpFile2.getAbsolutePath()));
>     byte[] hash2 = MessageDigest.getInstance("SHA-256").digest(data2);
>     System.out.println(encodeHexString(hash2));
> } {code}
> Not sure, this might be expected behavior, but it makes my testing framework 
> a bit less robust, so I thought I'd report it here. In the newer versions 
> 3.0.1 and 3.0.2, when you save a PDF a second time, the reference IDs continue 
> incrementing, which means that the PDF stored the first time is not identical 
> to the one stored the second time.
> In my test case, depending on which thread executes first, the output may 
> differ between runs, so it no longer matches the expected result.
> I've not seen this with 3.0.0 and earlier versions of PDFBox.



