Hello, I am currently using JDK17 and pdfbox-2.0.33.jar to convert PDFs into images on a Windows 10 OS.
The PDF displays correctly in the MS Edge browser. However, after converting it to an image using PDFBox, some fields begin to appear blank starting from the seventh image. Interestingly, the first six images are generated correctly.

After comparing, I noticed that some fonts starting from the seventh page of the PDF differ from the ones used earlier. I suspect that missing fonts may be the cause of the issue, but since there are no errors or warnings in the debug information, I’m unsure which fonts are missing.

I have uploaded attachments to Google Drive <https://drive.google.com/drive/folders/1cxpVIJphGwQEQaqwtUlaVllXMSOL5yT0?usp=sharing>. The folder contains the original PDF, a screenshot of the seventh page opened in MS Edge, the converted images, the source code, and the "debugInfo.txt" file.

I have removed some redundant logs and only included what I believe to be important in this email. The full DEBUG information is included in the attached "debugInfo.txt".

Here are some key DEBUG log entries:

18:13:33.188 [main] DEBUG org.example.PdfFileTest - Page 5 rendered
18:13:33.234 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No PostScript name information is provided for the font SimSun
18:13:33.272 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl - getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid: null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.274 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl - getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid: null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.276 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl - getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid: null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.277 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl - getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid: null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.277 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl - getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid: null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.278 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl - getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid: null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.278 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl - getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid: null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.280 [main] DEBUG org.example.PdfFileTest - Page 6 rendered
18:13:33.310 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No PostScript name information is provided for the font SimSun
18:13:33.358 [main] DEBUG org.example.PdfFileTest - Page 7 rendered
18:13:33.392 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No PostScript name information is provided for the font SimSun
18:13:33.427 [main] DEBUG org.example.PdfFileTest - Page 8 rendered
18:13:33.461 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No PostScript name information is provided for the font SimSun
18:13:33.494 [main] DEBUG org.example.PdfFileTest - Page 9 rendered
18:13:33.526 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No PostScript name information is provided for the font SimSun
18:13:33.541 [main] DEBUG org.example.PdfFileTest - Page 10 rendered

It is worth noting that the message "No PostScript name information is provided for the font SimSun" appears on every page, but on the first five pages, it is immediately followed by "org.apache.pdfbox.pdmodel.font.FontMapperImpl - getFont(xxxx)". Starting from the sixth page, only the "No PostScript name" message is output, but the conversion for the sixth page still works fine. The issue only appears starting from the seventh page.

Best regards
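
P.S. One way to check the per-page observation against the full "debugInfo.txt" is to count how many FontMapperImpl getFont(...) lines are logged before each "Page N rendered" marker. The following is only a minimal sketch using the Java standard library. The class name FontLookupCounter and the method countLookupsPerPage are hypothetical names chosen for illustration and are not part of PDFBox. It assumes the log is UTF-8 text in which each "Page N rendered" marker is written after that page's font lookups, as in the excerpt above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of PDFBox.
final class FontLookupCounter {

    // Matches the "Page N rendered" marker that appears in the log excerpt.
    private static final Pattern PAGE_RENDERED = Pattern.compile("Page (\\d+) rendered");

    /**
     * Returns, for each "Page N rendered" marker, the number of
     * FontMapperImpl getFont(...) lines seen since the previous marker.
     */
    static Map<Integer, Integer> countLookupsPerPage(List<String> logLines) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        int pending = 0; // lookups seen since the last page marker
        for (String line : logLines) {
            if (line.contains("FontMapperImpl - getFont(")) {
                pending++;
            }
            Matcher m = PAGE_RENDERED.matcher(line);
            if (m.find()) {
                counts.put(Integer.parseInt(m.group(1)), pending);
                pending = 0;
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the path to the debug log, e.g. debugInfo.txt
        List<String> lines = Files.readAllLines(Path.of(args[0]));
        countLookupsPerPage(lines).forEach((page, n) ->
                System.out.println("page " + page + ": " + n + " font lookups"));
    }
}
```

Pages that report zero lookups are the ones for which FontMapperImpl logged no getFont(...) line, so they can be compared directly with the pages whose images have blank fields.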