[ https://issues.apache.org/jira/browse/PDFBOX-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973062#comment-15973062 ]
Tilman Hausherr commented on PDFBOX-3757:
-----------------------------------------

There's a snapshot here:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.6-SNAPSHOT/

Please test your application and give feedback.

> TTFSubsetter scrambles PostScript names and unicode codepoints when subset contains diaeresis
> ---------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-3757
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3757
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.5
>            Reporter: Tobias Fischer
>            Assignee: Tilman Hausherr
>             Fix For: 2.0.6, 3.0.0
>
>         Attachments: fontbox-2.0.5-ttfsubsetter_dieresis-scrambled-names.png,
>                      fontbox-2.0.5-ttfsubsetter_scrambled-codepoints.png,
>                      Subset-DejaVuSans__dieresis-scrambled-names.ttf,
>                      Subset-DejaVuSans__scrambled-codepoints.ttf
>
>
> I tried to build a standalone FontSubsetter with the great fontbox tools. It
> works so far for OpenType/TrueType fonts, but when the glyph subset contains
> characters with a diaeresis (like the German umlauts äöü), the TTFSubsetter
> class scrambles PostScript names and Unicode code points.
>
> For example, when creating a subset from DejaVuSans.ttf with only the two
> characters "Ö " (O umlaut and a hair space \u200A), the resulting font subset
> is recognized as a valid font, but the Unicode code point 200A in the
> resulting font file has the PostScript name "Dieresis" and the standalone
> dieresis glyph is named "uni200A". See the screenshot
> "fontbox-2.0.5-ttfsubsetter_dieresis-scrambled-names.png" and the subsetted
> font "Subset-DejaVuSans__dieresis-scrambled-names.ttf".
>
> When the subset contains more glyphs (more whitespace, special characters and
> umlauts), the scrambling goes further and affects Unicode code points as
> well as PostScript names:
>
> glyphs in subset: "RabenköigKrmloEyGfthsTjHdAu cvFüD. w,äUp:IzWVZSN-ßLC PB5M«»O2013Q©/;x978-()64XJ'!Ä?‹› ...ÜqY &Öé|_•{}[]>#*$^\\+"
> Resulting font: "Subset-DejaVuSans__scrambled-codepoints.ttf"
> Screenshot: "fontbox-2.0.5-ttfsubsetter_scrambled-codepoints.png"
>
> I consider this a bug, as it does not appear when there are no umlauts or
> diaeresis characters in the subset.
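
When testing the snapshot, the scrambling described in the report can be checked mechanically by comparing the glyph name attached to each code point before and after subsetting. The following is a minimal sketch in plain Java, not FontBox code: `GlyphNameCheck` and `findMismatches` are hypothetical names, and the two maps are filled in by hand rather than read from a font file. The names assumed for the original DejaVuSans glyphs ("Odieresis", "uni200A") are assumptions; only the name "Dieresis" for U+200A in the subset is taken from the report.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GlyphNameCheck {

    /**
     * Returns the code points in the subset whose glyph name differs from
     * the name the same code point has in the original font.
     */
    static List<Integer> findMismatches(Map<Integer, String> original,
                                        Map<Integer, String> subset) {
        List<Integer> mismatches = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : subset.entrySet()) {
            String expected = original.get(entry.getKey());
            // A missing or different name in the original counts as a mismatch.
            if (!entry.getValue().equals(expected)) {
                mismatches.add(entry.getKey());
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Assumed names in the original font (not stated in the report).
        Map<Integer, String> original = new LinkedHashMap<>();
        original.put(0x00D6, "Odieresis");
        original.put(0x200A, "uni200A");

        // Subset as described in the report: U+200A carries the name "Dieresis".
        Map<Integer, String> subset = new LinkedHashMap<>();
        subset.put(0x00D6, "Odieresis");
        subset.put(0x200A, "Dieresis");

        for (int cp : findMismatches(original, subset)) {
            System.out.printf("U+%04X: expected %s, found %s%n",
                    cp, original.get(cp), subset.get(cp));
        }
    }
}
```

In a real check, the two maps would be filled from the cmap and post tables of the original and the subsetted font with whatever font library is at hand; that step is deliberately left out here.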