[
https://issues.apache.org/jira/browse/PDFBOX-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904602#comment-17904602
]
Tilman Hausherr edited comment on PDFBOX-5920 at 12/11/24 9:23 AM:
-------------------------------------------------------------------
I tried using {{font.getStringWidth(" ")}} for {{getFontWidth()}} and there are
many text extraction differences. However, all except one are improvements! One
that did not improve is PDFBOX-2959. That's because Type 3 fonts don't support
encoding. I'll investigate that next.
improved:
7LRS5U6CAFMN2P6JPTZVNBUW6XOFYH4M.pdf
10.5445IR1000150280-p15.pdf
PDFBOX-3782-reduced.pdf
PDFBOX-4934-JP.pdf
others:
artikel1_20_arab.pdf: unclear
PDFBOX-756-p1.pdf: not better
PDFBOX-2959-reduced.pdf: not better
SO51672080-tiny-gaps.pdf: irrelevant
PDFBOX-5324.pdf: irrelevant
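
The idea in the comment above, taking the space width from the measured width of " " instead of a separately declared value, can be illustrated with a small self-contained sketch. It deliberately does not use PDFBox: {{WidthSource}}, {{tableFont}} and {{spaceWidth}} are hypothetical names made up for illustration. The fallback to the declared value when a space cannot be encoded is an assumption, not something the comment proposes. The values 278 and 584 come from the issue description; 556 is an arbitrary placeholder.

```java
import java.util.Map;

public class SpaceWidthSketch {

    /** Hypothetical stand-in for a font; this is NOT the PDFBox API. */
    interface WidthSource {
        /** Width of the text in 1/1000 units; throws if a character has no glyph. */
        float stringWidth(String text);

        /** Space width as declared by the font, which may be unreliable. */
        float declaredSpaceWidth();
    }

    /** Builds a toy font from a per-character width table (illustration only). */
    static WidthSource tableFont(Map<Character, Float> widths, float declaredSpace) {
        return new WidthSource() {
            @Override
            public float stringWidth(String text) {
                float total = 0f;
                for (char c : text.toCharArray()) {
                    Float w = widths.get(c);
                    if (w == null) {
                        throw new IllegalArgumentException("No glyph for '" + c + "'");
                    }
                    total += w;
                }
                return total;
            }

            @Override
            public float declaredSpaceWidth() {
                return declaredSpace;
            }
        };
    }

    /** Prefers the measured width of " "; falls back to the declared value (assumption). */
    static float spaceWidth(WidthSource font) {
        try {
            float measured = font.stringWidth(" ");
            if (measured > 0f) {
                return measured;
            }
        } catch (IllegalArgumentException e) {
            // The space character cannot be encoded; use the declared value instead.
        }
        return font.declaredSpaceWidth();
    }

    public static void main(String[] args) {
        WidthSource withSpace = tableFont(Map.of(' ', 278f), 584f);
        WidthSource withoutSpace = tableFont(Map.of('a', 556f), 584f);
        System.out.println(spaceWidth(withSpace));    // prints 278.0
        System.out.println(spaceWidth(withoutSpace)); // prints 584.0
    }
}
```

The second toy font has no glyph for the space character, which loosely mirrors the case where measuring " " is not possible and some other source for the width is needed.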
> PDType0Font return invalid space width
> --------------------------------------
>
> Key: PDFBOX-5920
> URL: https://issues.apache.org/jira/browse/PDFBOX-5920
> Project: PDFBox
> Issue Type: Bug
> Components: FontBox
> Affects Versions: 3.0.3 PDFBox
> Reporter: Miroslav Holubec
> Assignee: Tilman Hausherr
> Priority: Major
> Labels: fontwidth, truetype
> Attachments: texgyreheros-regular.ttf
>
>
> WinAnsiEncoding does not support all of the characters available in the font.
> For that reason we switched to the workaround proposed in the FAQ, which is to
> use PDType0Font. We have now realized that font.getSpaceWidth() returns an
> invalid value.
> {noformat}
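> import static org.junit.jupiter.api.Assertions.assertEquals;
>
> import java.io.IOException;
> import java.io.InputStream;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.font.PDFont;
> import org.apache.pdfbox.pdmodel.font.PDTrueTypeFont;
> import org.apache.pdfbox.pdmodel.font.PDType0Font;
> import org.apache.pdfbox.pdmodel.font.encoding.WinAnsiEncoding;
> import org.junit.jupiter.api.Test;
>
> class FontWidthTest {
>     @Test
>     void pdType0FontTest() throws IOException {
>         try (InputStream fontStream =
>                  FontWidthTest.class.getResourceAsStream("/texgyreheros-regular.ttf");
>              PDDocument document = new PDDocument()) {
>             // Load as a composite (Type 0) font without subsetting
>             PDFont font = PDType0Font.load(document, fontStream, false);
>             assertEquals(20064.0,
>                 font.getStringWidth("The quick brown fox jumps over the lazy dog."));
>             assertEquals(278.0, font.getSpaceWidth()); // FAIL: returns 584.0
>         }
>     }
>
>     @Test
>     void pdTrueTypeFontTest() throws IOException {
>         try (InputStream fontStream =
>                  FontWidthTest.class.getResourceAsStream("/texgyreheros-regular.ttf");
>              PDDocument document = new PDDocument()) {
>             // Load as a simple TrueType font with WinAnsiEncoding
>             PDFont font = PDTrueTypeFont.load(document, fontStream,
>                 WinAnsiEncoding.INSTANCE);
>             assertEquals(20064.0,
>                 font.getStringWidth("The quick brown fox jumps over the lazy dog."));
>             assertEquals(278.0, font.getSpaceWidth());
>         }
>     }
> }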
> {noformat}