Daniel Persson created PDFBOX-5788:
--------------------------------------
Summary: ID References changes when saving PDFs.
Key: PDFBOX-5788
URL: https://issues.apache.org/jira/browse/PDFBOX-5788
Project: PDFBox
Issue Type: Bug
Affects Versions: 3.0.2 PDFBox, 3.0.1 PDFBox
Reporter: Daniel Persson
{code:java}
// encodeHexString is presumably Hex.encodeHexString from Apache Commons Codec (static import).
private static void runPDF(String name) throws IOException, NoSuchAlgorithmException {
    PDDocument doc = Loader.loadPDF(new File(name));

    // First save: hash the bytes written to disk.
    File tmpFile = File.createTempFile("tmp", ".pdf");
    doc.save(tmpFile);
    byte[] data = Files.readAllBytes(tmpFile.toPath());
    byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
    System.out.println(encodeHexString(hash));

    // Second save of the same, unmodified document: hash the new file.
    File tmpFile2 = File.createTempFile("tmp", ".pdf");
    doc.save(tmpFile2);
    byte[] data2 = Files.readAllBytes(tmpFile2.toPath());
    byte[] hash2 = MessageDigest.getInstance("SHA-256").digest(data2);
    System.out.println(encodeHexString(hash2));
}
{code}
This might be expected behavior, but it makes my testing framework a bit less
robust, so I thought I'd report it here. In the newer versions, 3.0.1 and
3.0.2, when the same document is saved a second time, the reference IDs
continue incrementing from where the first save left off, so the PDF stored
the second time is not identical to the one stored the first time.
In my test case, the output therefore depends on which thread executes first,
so a run may not match the expected result.
I have not seen this with 3.0.0 or earlier versions of PDFBox.
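
For anyone who wants to run the comparison without the hex-encoding helper, here is a minimal sketch that uses only the JDK. The names SaveHashCheck, sha256Hex and sameDigest are made up for illustration, and the two temp files are stand-ins for the two doc.save(...) outputs, so this only shows the comparison step and does not reproduce the PDFBox behavior itself.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SaveHashCheck {

    // Hex-encoded SHA-256 digest of a byte array, using only the JDK.
    static String sha256Hex(byte[] data) throws NoSuchAlgorithmException {
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
        StringBuilder sb = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            // Mask to 0..255 so negative bytes print as two hex digits.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // True when both files have the same SHA-256 digest.
    static boolean sameDigest(Path a, Path b)
            throws IOException, NoSuchAlgorithmException {
        return sha256Hex(Files.readAllBytes(a))
                .equals(sha256Hex(Files.readAllBytes(b)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-ins for the two saved PDFs: same content
        // except for the object number, as in the report.
        Path first = Files.createTempFile("first", ".pdf");
        Path second = Files.createTempFile("second", ".pdf");
        Files.write(first, "1 0 obj".getBytes(StandardCharsets.US_ASCII));
        Files.write(second, "2 0 obj".getBytes(StandardCharsets.US_ASCII));

        System.out.println("first vs first:  " + sameDigest(first, first));
        System.out.println("first vs second: " + sameDigest(first, second));
    }
}
```

In a test, sameDigest(tmpFile.toPath(), tmpFile2.toPath()) would be the assertion that passes on 3.0.0 and fails on 3.0.1 and 3.0.2 according to the description above.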