[ 
https://issues.apache.org/jira/browse/PDFBOX-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18090160#comment-18090160
 ] 

Tilman Hausherr commented on PDFBOX-4951:
-----------------------------------------

Hello [~msahyoun] [~lehmi], should I commit the core of this change? This PR is 
a strong improvement not only for the letters mentioned in the title, but also 
for foreign scripts that we don't understand.
On the trunk only, or also on 3.0? As far as I can tell there is no breaking 
change. 
PDAbstractContentStream is different, but that class isn't public.
From the way the new API is implemented we could still dump the AWT part and 
replace it with FOP logic if we discover that AWT sucks.
Although I didn't get any comments on my post to the users mailing list, I 
believe the new look ("text2") is better. The small flaws I saw (the "loop" not 
exactly in place) also appear when drawing directly with AWT or when using the 
font in Microsoft Word, so it's a flaw in the font. I tried two other fonts from 
Windows (nirmala.ttf and shonar.ttf) and there the output looks perfect.
I'm also thinking of removing the GSUB worker logic, but it might still be 
needed, e.g. for special features like "aalt" (PDFBOX-5808).
We don't have tests yet but I would create tests based on the example code and 
{{TestFontEmbedding.testBengali()}}.

> Sequences of DIN SPEC 91379 with combining letters are rendered incorrectly
> ---------------------------------------------------------------------------
>
>                 Key: PDFBOX-4951
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4951
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.21
>            Reporter: Volker Kunert
>            Priority: Major
>         Attachments: DIN_SPEC_91379_Sequences-aa.pdf, 
> DIN_SPEC_91379_Sequences-ab.pdf, DIN_SPEC_91379_Sequences-ac.pdf, 
> DIN_SPEC_91379_Sequences.txt, DefaultScriptProcessor.java, DejaVuSans.ttf, 
> DoGlyphLayoutBidi.pdf, DoGlyphLayoutDinSpec91379.pdf, 
> DoGlyphLayoutDinSpec91379Form.pdf, DoGlyphPositionBengali.pdf, 
> ExamplePdfboxFopPos-By-Tilman.pdf, ExamplePdfboxFopPos.java, 
> ExamplePdfboxFopPos.pdf, ExamplePdfboxFopPosForm.java, 
> ExamplePdfboxFopPosForm.pdf, FiraCode-Regular.ttf, 
> FontForge-Lohit-Bengali.png, TestPdfbox.java, TestPdfboxFop2.java, 
> TestPdfboxFop2.pdf, TestPdfboxJava2D.java, TestPdfboxJava2D.pdf, bidi-1.png, 
> bidi-2.png, bidi.png, example-PDFBOX-3147-NotoSansThaiLooped-Regular.png, 
> image-2026-05-23-16-16-53-442.png, image-2026-05-23-16-17-28-172.png, 
> image-2026-05-26-16-49-45-529.png, ligatures-kerning.png, 
> patch-2020-10-02.txt, pdfbox.patch, pdfbox.pdf, screenshot-1.png
>
>
> Accented letters composed of a Unicode base letter and a combining accent are 
> rendered incorrectly. E.g. for 0041 030B (LATIN CAPITAL LETTER A followed by 
> COMBINING DOUBLE ACUTE ACCENT) the accent appears to the right of the letter 
> A, not above it.
> The position is wrong for most of the sequences defined in the following spec:
> DIN SPEC 91379: Characters in Unicode for the electronic processing of names 
> and data exchange in Europe; with digital attachment
>  [https://www.xoev.de/downloads-2316#StringLatin]
>  [https://www.din.de/de/wdc-beuth:din21:301228458]
>  
> The correct rendering should look like the output of hb-view 2.6.8, see files 
> DIN_SPEC_91379_Sequences*.pdf.
> The output of PDFBox is appended in pdfbox.pdf, which is created by running 
> TestPdfbox.java. The sequences are read from file 
> DIN_SPEC_91379_Sequences.txt.
>  
> Font used for testing: NotoSansMono-Regular.ttf, see 
> [https://www.google.com/get/noto/] 
> download: 
> [https://noto-website-2.storage.googleapis.com/pkgs/NotoSansMono-hinted.zip]
>  See also FOP-2969
>  
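
To make the quoted example concrete, here is a minimal sketch (plain JDK, no PDFBox calls; the class and method names are made up for illustration) showing why a sequence such as 0041 030B needs real mark positioning. Unicode has a precomposed character for O with double acute (U+0150) but none for A with double acute, so NFC normalization leaves the A sequence as two code points, and a renderer cannot fall back on a single precomposed glyph.

```java
import java.text.Normalizer;

// Hypothetical demo class, not part of PDFBox or of the attached test files.
public class CombiningSequenceDemo {

    /** Formats the code points of s as space-separated 4-digit hex, e.g. "0041 030B". */
    public static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", cp));
        });
        return sb.toString();
    }

    /** Applies Unicode Normalization Form C (canonical composition). */
    public static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    /** True if the code point is a non-spacing (combining) mark such as U+030B. */
    public static boolean isNonSpacingMark(int codePoint) {
        return Character.getType(codePoint) == Character.NON_SPACING_MARK;
    }

    public static void main(String[] args) {
        String aDoubleAcute = "A\u030B"; // the sequence from the issue description
        String oDoubleAcute = "O\u030B"; // has a precomposed form, U+0150

        // No precomposed character exists, so the sequence is unchanged.
        System.out.println(hex(aDoubleAcute) + " -> " + hex(nfc(aDoubleAcute)));
        // Composes to the single code point U+0150.
        System.out.println(hex(oDoubleAcute) + " -> " + hex(nfc(oDoubleAcute)));
        System.out.println("U+030B is a non-spacing mark: " + isNonSpacingMark(0x030B));
    }
}
```

Normalizing text before drawing therefore only helps for sequences that happen to have a precomposed form. For the rest, the accent has to be placed relative to the base glyph, typically using the font's own positioning data.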



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
