[ https://issues.apache.org/jira/browse/PDFBOX-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212647#comment-17212647 ]
Volker Kunert commented on PDFBOX-4951:
---------------------------------------
1 The font must be loaded twice - yes, we have to load it twice because we use
the positioning features of FOP's MultiByteFont. We can't store a reference to
the MultiByteFont in PDType0Font, because the font is stored in a COSDictionary
and recreated from it, losing any extra attributes in the process.
2 The width of A̋ or Ž̧ is the same as that of A or Z for me; there is no new
code involved.
PDType0Font font = PDType0Font.load(pdDocument, new FileInputStream(fontFile), false);
System.out.printf("%f %f%n", font.getStringWidth("A"), font.getStringWidth("A̋"));
System.out.printf("%f %f%n", font.getStringWidth("Z"), font.getStringWidth("Ž̧"));
639,000000 639,000000
572,000000 572,000000
3 Which variant of Z plus accent is not OK? They look good to me.
4 The bug in FOP (FOP-2969) means, for example, that the accent is not placed
above the current letter but above the following letter.
5 Bengali processing and FOP positioning both reorder the glyphs, so they
can't work together at the moment. Integration based on the current
implementation or on FOP seems possible, but needs a programmer who knows the
Bengali language and script.
6 IMHO the user should be required to enable FOP positioning explicitly, so as
not to break other algorithms. Possibly it could be enabled for script latn.
7 I am preparing some small corrections to my code.
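
On point 2, one possible explanation for the identical widths (an assumption,
not something verified against the font) is that the combining accents are
non-spacing marks with zero advance width, so only the base letter contributes
to the string width. The sketch below uses only the Java standard library, not
PDFBox, to check that a sequence consists of a base letter plus non-spacing
marks, and that NFC normalization cannot replace A + U+030B with a single
precomposed character. The class and method names are made up for
illustration, and the encoding of the second sequence as U+017D U+0327 is
assumed.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of PDFBox or FOP.
class CombiningSequenceCheck {

    /** Counts code points whose Unicode general category is Mn (non-spacing mark). */
    static int countNonSpacingMarks(String s) {
        return (int) s.codePoints()
                .filter(cp -> Character.getType(cp) == Character.NON_SPACING_MARK)
                .count();
    }

    /** Returns the number of code points that remain after NFC normalization. */
    static int nfcCodePointCount(String s) {
        String nfc = Normalizer.normalize(s, Normalizer.Form.NFC);
        return nfc.codePointCount(0, nfc.length());
    }

    public static void main(String[] args) {
        // A followed by COMBINING DOUBLE ACUTE ACCENT, as in the issue description.
        String aDoubleAcute = "A\u030B";
        // Assumed encoding: Z WITH CARON followed by COMBINING CEDILLA.
        String zCaronCedilla = "\u017D\u0327";

        for (String s : new String[] {aDoubleAcute, zCaronCedilla}) {
            System.out.printf("code points: %d, non-spacing marks: %d, after NFC: %d%n",
                    s.codePointCount(0, s.length()),
                    countNonSpacingMarks(s),
                    nfcCodePointCount(s));
        }
    }
}
```

If a sequence still has more than one code point after NFC, the accent cannot
be handled by swapping in a precomposed glyph, which is why the mark
positioning discussed in points 1, 4 and 5 matters for it.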
> Sequences with combining letters are rendered incorrectly
> ---------------------------------------------------------
>
> Key: PDFBOX-4951
> URL: https://issues.apache.org/jira/browse/PDFBOX-4951
> Project: PDFBox
> Issue Type: Bug
> Components: Rendering
> Affects Versions: 2.0.21
> Reporter: Volker Kunert
> Priority: Major
> Attachments: DIN_SPEC_91379_Sequences-aa.pdf,
> DIN_SPEC_91379_Sequences-ab.pdf, DIN_SPEC_91379_Sequences-ac.pdf,
> DIN_SPEC_91379_Sequences.txt, DefaultScriptProcessor.java,
> ExamplePdfboxFopPos.java, ExamplePdfboxFopPos.pdf,
> ExamplePdfboxFopPosForm.java, ExamplePdfboxFopPosForm.pdf, TestPdfbox.java,
> TestPdfboxFop2.java, TestPdfboxFop2.pdf, TestPdfboxJava2D.java,
> TestPdfboxJava2D.pdf, patch-2020-10-02.txt, pdfbox.pdf, screenshot-1.png
>
>
> Accented letters composed of a Unicode base letter and a combining accent are
> rendered incorrectly. E.g. with 0041 030B (LATIN CAPITAL LETTER A followed by
> COMBINING DOUBLE ACUTE ACCENT) the accent appears to the right of the letter
> A, not above it.
> The position is wrong for most of the sequences defined in the following spec:
> DIN SPEC 91379: Characters in Unicode for the electronic processing of names
> and data exchange in Europe; with digital attachment
> [https://www.xoev.de/downloads-2316#StringLatin]
> [https://www.din.de/de/wdc-beuth:din21:301228458]
>
> The correct rendering should look like the output of hb-view 2.6.8, see files
> DIN_SPEC_91379_Sequences*.pdf.
> The output of PDFBox is appended in pdfbox.pdf, which is created by running
> TestPdfbox.java. The sequences are read from file
> DIN_SPEC_91379_Sequences.txt.
>
> Font used for testing: NotoSansMono-Regular.ttf, see
> [https://www.google.com/get/noto/]
> download:
> [https://noto-website-2.storage.googleapis.com/pkgs/NotoSansMono-hinted.zip]
> See also FOP-2969
>