Hello,

I am using JDK 17 and pdfbox-2.0.33.jar to convert PDFs into
images on Windows 10.

The PDF displays correctly in the MS Edge browser. However, after
converting it to images with PDFBox, some fields are blank from the
seventh image onward.

Interestingly, the first six images are generated correctly. Comparing
the pages, I noticed that some of the fonts used from the seventh page
onward differ from the ones used on the earlier pages.

I suspect that missing fonts may be the cause of the issue, but since there
are no errors or warnings in the debug information, I’m unsure which fonts
are missing.

I have uploaded attachments to Google Drive
<https://drive.google.com/drive/folders/1cxpVIJphGwQEQaqwtUlaVllXMSOL5yT0?usp=sharing>.
The folder contains the original PDF, a screenshot of the seventh page
opened in MS Edge, the converted images, the source code, and the
"debugInfo.txt" file.
I have removed some redundant log entries and included only what I believe
to be important in this email. The full DEBUG output is in the
attached "debugInfo.txt".

Here are some key DEBUG log entries:

18:13:33.188 [main] DEBUG org.example.PdfFileTest - Page 5 rendered
18:13:33.234 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
PostScript name information is provided for the font SimSun
18:13:33.272 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.274 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.276 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.277 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.277 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.278 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.278 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.280 [main] DEBUG org.example.PdfFileTest - Page 6 rendered
18:13:33.310 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
PostScript name information is provided for the font SimSun
18:13:33.358 [main] DEBUG org.example.PdfFileTest - Page 7 rendered
18:13:33.392 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
PostScript name information is provided for the font SimSun
18:13:33.427 [main] DEBUG org.example.PdfFileTest - Page 8 rendered
18:13:33.461 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
PostScript name information is provided for the font SimSun
18:13:33.494 [main] DEBUG org.example.PdfFileTest - Page 9 rendered
18:13:33.526 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
PostScript name information is provided for the font SimSun
18:13:33.541 [main] DEBUG org.example.PdfFileTest - Page 10 rendered


Note that the message "No PostScript name information is provided for the
font SimSun" appears for every page, but for the first five pages it is
immediately followed by
"org.apache.pdfbox.pdmodel.font.FontMapperImpl - getFont(xxxx)" entries.
From the sixth page onward, only the "No PostScript name" message is
logged, yet the sixth page still converts correctly. The problem appears
only from the seventh page onward.
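
To make this pattern easier to check against the full "debugInfo.txt", here
is a minimal sketch that counts the PostScriptTable and FontMapperImpl
getFont entries logged before each "Page N rendered" marker. The class and
method names (FontLogSummary, joinWrapped, countPerPage) are hypothetical
and it uses only the JDK, not PDFBox. It assumes that every log entry
starts with an HH:mm:ss.SSS timestamp followed by " [", and that the
"Page N rendered" line is written after page N has finished rendering. If
the marker is actually written before rendering, the counts belong to the
following page instead.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox: summarizes font-related DEBUG
// entries per "Page N rendered" marker.
public class FontLogSummary {

    // Assumption: each log entry starts like "18:13:33.188 [main] ..."
    private static final Pattern ENTRY_START =
            Pattern.compile("^\\d{2}:\\d{2}:\\d{2}\\.\\d{3} \\[");
    private static final Pattern PAGE_MARKER =
            Pattern.compile("Page (\\d+) rendered");

    // Re-joins entries that were wrapped across several lines (as in this
    // email). Lines that already hold one entry each pass through unchanged.
    public static List<String> joinWrapped(List<String> rawLines) {
        List<String> entries = new ArrayList<>();
        for (String line : rawLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            if (entries.isEmpty() || ENTRY_START.matcher(trimmed).find()) {
                entries.add(trimmed);
            } else {
                int last = entries.size() - 1;
                entries.set(last, entries.get(last) + " " + trimmed);
            }
        }
        return entries;
    }

    // For each marker number N, returns {PostScriptTable entries,
    // FontMapperImpl getFont entries} seen since the previous marker.
    public static Map<Integer, int[]> countPerPage(List<String> entries) {
        Map<Integer, int[]> result = new LinkedHashMap<>();
        int[] current = new int[2];
        for (String entry : entries) {
            Matcher m = PAGE_MARKER.matcher(entry);
            if (m.find()) {
                result.put(Integer.parseInt(m.group(1)), current);
                current = new int[2];
            } else if (entry.contains("PostScriptTable")) {
                current[0]++;
            } else if (entry.contains("FontMapperImpl")
                    && entry.contains("getFont(")) {
                current[1]++;
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the log file; assumed to be readable as UTF-8
        List<String> raw = Files.readAllLines(Paths.get(args[0]));
        countPerPage(joinWrapped(raw)).forEach((page, c) ->
                System.out.println("Page " + page + " marker: PostScriptTable="
                        + c[0] + ", getFont=" + c[1]));
    }
}
```

Applied to the excerpt above, the segment that ends with "Page 6 rendered"
contains one PostScriptTable entry and seven getFont entries, while the
segments ending with "Page 7 rendered" through "Page 10 rendered" each
contain one PostScriptTable entry and no getFont entry.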

Best regards
