[ https://issues.apache.org/jira/browse/PDFBOX-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr resolved PDFBOX-5230.
-------------------------------------
Assignee: Tilman Hausherr
Resolution: Fixed
Thanks for the contribution!
> Zero-width non-joiner characters visible in generated PDF
> ---------------------------------------------------------
>
> Key: PDFBOX-5230
> URL: https://issues.apache.org/jira/browse/PDFBOX-5230
> Project: PDFBox
> Issue Type: Bug
> Components: FontBox, PDModel, Writing
> Affects Versions: 2.0.16
> Reporter: Daniel Gredler
> Assignee: Tilman Hausherr
> Priority: Major
> Fix For: 2.0.34, 3.0.5 PDFBox, 4.0.0
>
> Attachments: Af.pdf, zwnj-pdfkit.pdf, zwnj.pdf, zwnj.png
>
>
> I'd like to use the [zero-width
> non-joiner|https://en.wikipedia.org/wiki/Zero-width_non-joiner] (ZWNJ)
> character to prevent character shaping in some cases when using Arabic and
> Indic scripts. This works correctly using some fonts like Arial Unicode
> (character shaping is prevented and no ZWNJ glyph is visible in the PDF), but
> does not work correctly when using fonts like Tahoma or Google Noto Sans
> Regular, where the ZWNJ character is visible in the PDF. The ZWNJ glyph is
> not visible when using these fonts in other programs, like Microsoft Word.
> I suspect that the `advanceWidth` values in the `hmtx` table should be
> taken into account but currently are not: the `advanceWidth` for this glyph
> is 0 in both of the fonts that erroneously produce a visible artifact for
> the ZWNJ character (Tahoma and Google Noto Sans Regular).
> Test case generating the attached PDF file:
> {code:java}
> import java.io.File;
> import java.io.IOException;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.PDPage;
> import org.apache.pdfbox.pdmodel.PDPageContentStream;
> import org.apache.pdfbox.pdmodel.common.PDRectangle;
> import org.apache.pdfbox.pdmodel.font.PDFont;
> import org.apache.pdfbox.pdmodel.font.PDType0Font;
>
> public class ZwnjTest {
>     public static void main(String[] args) throws IOException {
>         try (PDDocument document = new PDDocument()) {
>             PDPage page = new PDPage(PDRectangle.LETTER);
>             document.addPage(page);
>             try (PDPageContentStream stream = new PDPageContentStream(document, page)) {
>                 // In every string below, U+200C = zero width non-joiner.
>
>                 // Tahoma: ZWNJ glyph is a vertical bar, but advanceWidth in
>                 // hmtx table is 0 -> shown in PDF anyway (unexpected)
>                 PDFont tahoma = PDType0Font.load(document, new File("C:/Windows/Fonts/tahoma.ttf"));
>                 stream.beginText();
>                 stream.setFont(tahoma, 20);
>                 stream.newLineAtOffset(50, 650);
>                 stream.showText("t\u200Ce\u200Cs\u200Ct\u200C \u200C1");
>                 stream.endText();
>
>                 // Arial Unicode: ZWNJ glyph contains no outline -> not shown
>                 // in PDF (as expected)
>                 PDFont arialu = PDType0Font.load(document, new File("C:/Windows/Fonts/ARIALUNI.TTF"));
>                 stream.beginText();
>                 stream.setFont(arialu, 20);
>                 stream.newLineAtOffset(50, 600);
>                 stream.showText("t\u200Ce\u200Cs\u200Ct\u200C \u200C2");
>                 stream.endText();
>
>                 // Google Noto Sans Regular: ZWNJ glyph is a vertical bar, but
>                 // advanceWidth in hmtx table is 0 -> shown in PDF anyway (unexpected)
>                 PDFont gnotos = PDType0Font.load(document, new File("noto-sans-regular.ttf"));
>                 stream.beginText();
>                 stream.setFont(gnotos, 20);
>                 stream.newLineAtOffset(50, 550);
>                 stream.showText("t\u200Ce\u200Cs\u200Ct\u200C \u200C3");
>                 stream.endText();
>             }
>             document.save("zwnj.pdf");
>         }
>     }
> }
> {code}
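
For releases that predate the fix, one possible stopgap is to drop U+200C from a string before passing it to `showText`. The sketch below is only an illustration and is not the change made for this issue: the class `ZwnjFilter` and its methods are hypothetical names, it uses nothing but the Java standard library, and it throws away the shaping hint that ZWNJ carries, so it only suits text where ZWNJ has no shaping effect (such as the Latin test strings in the report).

```java
public class ZwnjFilter {
    // U+200C ZERO WIDTH NON-JOINER
    static final int ZWNJ = 0x200C;

    // Returns a copy of the text with every ZWNJ code point removed.
    // Hypothetical helper, not part of PDFBox.
    static String stripZwnj(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints()
            .filter(cp -> cp != ZWNJ)
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    // Counts the ZWNJ code points in the text, e.g. to log what was dropped.
    static long countZwnj(String text) {
        return text.codePoints().filter(cp -> cp == ZWNJ).count();
    }

    public static void main(String[] args) {
        // Same sample string as the first showText call in the test case.
        String sample = "t\u200Ce\u200Cs\u200Ct\u200C \u200C1";
        System.out.println(countZwnj(sample) + " ZWNJ removed: \""
                + stripZwnj(sample) + "\"");
    }
}
```

In the reporter's test case this would amount to calling `stream.showText(ZwnjFilter.stripZwnj(...))`. For Arabic or Indic text that relies on ZWNJ to suppress joining, upgrading to one of the listed fix versions is the safer route.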