Fatemeh Elyasi created PDFBOX-5487:
--------------------------------------

             Summary: extra whitespaces when extracting Arabic text
                 Key: PDFBOX-5487
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5487
             Project: PDFBox
          Issue Type: Bug
            Reporter: Fatemeh Elyasi
         Attachments: Malpass-at-the-G7-Leaders-Summit-Media-Briefing-AR.pdf

When extracting text from an Arabic PDF, some whitespace characters end up
in the wrong place: a space is inserted inside a word.

Example:
Original word: العالمية
Extracted word: العالمي ة

 

The PDF is attached; the example word is on the first line.
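
To make the symptom easy to check mechanically, here is a minimal sketch that compares an expected string with the extracted one and reports where extra whitespace appears. It is not part of PDFBox. The class name `StraySpaceCheck` and the method `straySpaces` are hypothetical names chosen for illustration, and the sketch assumes both strings are in logical (reading) order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a PDFBox class: locates whitespace that is present
// in the extracted text but absent from the expected text.
public class StraySpaceCheck {

    /**
     * Returns the indices (into extracted) of whitespace characters that have
     * no counterpart in expected. Returns null when the two strings differ in
     * anything other than inserted whitespace.
     */
    public static List<Integer> straySpaces(String expected, String extracted) {
        List<Integer> stray = new ArrayList<>();
        int i = 0; // current position in expected
        for (int j = 0; j < extracted.length(); j++) {
            char c = extracted.charAt(j);
            if (i < expected.length() && expected.charAt(i) == c) {
                i++;                      // characters agree, advance both
            } else if (Character.isWhitespace(c)) {
                stray.add(j);             // extra whitespace in the extracted text
            } else {
                return null;              // a non-whitespace difference
            }
        }
        return i == expected.length() ? stray : null;
    }

    public static void main(String[] args) {
        // The example word from this report, written with Unicode escapes.
        String expected = "\u0627\u0644\u0639\u0627\u0644\u0645\u064A\u0629";
        // The extracted form: a space before the final letter.
        String extracted = expected.substring(0, 7) + " " + expected.substring(7);
        System.out.println(straySpaces(expected, extracted)); // prints [7]
    }
}
```

For the example word, the reported index is the position of the space that precedes the final letter.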


