Daniel Persson created PDFBOX-5788:
--------------------------------------

             Summary: ID references change when saving PDFs.
                 Key: PDFBOX-5788
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5788
             Project: PDFBox
          Issue Type: Bug
    Affects Versions: 3.0.2 PDFBox, 3.0.1 PDFBox
            Reporter: Daniel Persson


 
{code:java}
// Saves the same loaded document twice and prints the SHA-256 of each output.
// encodeHexString is assumed to be a statically imported hex helper
// (for example org.apache.commons.codec.binary.Hex.encodeHexString).
private static void runPDF(String name) throws IOException, NoSuchAlgorithmException {
    // try-with-resources closes the document once both saves are done
    try (PDDocument doc = Loader.loadPDF(new File(name))) {
        // First save
        File tmpFile = File.createTempFile("tmp", ".pdf");
        doc.save(tmpFile);
        byte[] data = Files.readAllBytes(tmpFile.toPath());
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
        System.out.println(encodeHexString(hash));

        // Second save of the same in-memory document
        File tmpFile2 = File.createTempFile("tmp", ".pdf");
        doc.save(tmpFile2);
        byte[] data2 = Files.readAllBytes(tmpFile2.toPath());
        byte[] hash2 = MessageDigest.getInstance("SHA-256").digest(data2);
        System.out.println(encodeHexString(hash2));
    }
}
{code}
Not sure whether this is expected behavior, but it makes my testing framework a
bit less robust, so I thought I'd report it here. In the newer versions 3.0.1
and 3.0.2, when you save a PDF a second time the reference IDs continue
incrementing, which means that the PDF stored the first time is not identical
to the one stored the second time.

In my test case, depending on which thread executes first, the output may
differ between runs, so the expected result changes.

I've not seen this with 3.0.0 and earlier versions of PDFBox.
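
For anyone reproducing this without extra dependencies, here is a minimal sketch of the hashing step using only the JDK. It assumes that the encodeHexString call in the snippet is a plain lowercase hex encoder; the class name SaveDigest and the method sha256Hex are hypothetical names used only for illustration, not part of PDFBox.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of PDFBox): hashes a byte array with SHA-256
// and returns the digest as a lowercase hex string.
final class SaveDigest {
    private SaveDigest() {
    }

    static String sha256Hex(byte[] data) {
        try {
            // "SHA-256" is the standard JCA algorithm name.
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to an unsigned value so every byte becomes two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform must provide SHA-256.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

Calling SaveDigest.sha256Hex on the bytes of each saved file and comparing the two strings shows whether the two saves are byte-identical.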



