[ 
https://issues.apache.org/jira/browse/PDFBOX-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029925#comment-14029925
 ] 

John Hewson commented on PDFBOX-1919:
-------------------------------------

I took a detailed look at the PDF in question. Using Acrobat Pro XI I get "IN 
NoRtheRN IReLAND", and I get the same result whether I copy & paste, export 
plain text, or export accessible text. OS X Preview gives the same result, as 
does Chrome's PDF viewer.

I really don't think that Acrobat is using the span tags to repair the Unicode 
table. Andreas, what version of Acrobat did you use? Given that every PDF 
viewer I've tried produces the same text in all cases, I'd say that PDFBox's 
behaviour is correct.

> Font descriptor flags are not implemented
> -----------------------------------------
>
>                 Key: PDFBOX-1919
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1919
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.5, 1.8.6, 2.0.0
>            Reporter: Corentin Regal
>         Attachments: PDFBOX-1919.AdobeReader.txt, PDFBOX-1919.pdf, 
> PDFBOX-1919.txt
>
>
> The font descriptor flags are not set.
> They are described in the document "PDF reference 1.7" at : 5.7.1 Font 
> Descriptor Flags
> The methods in PDFontDescriptor are ready but never called :
> setFlags()
> setSerif()
> setAllCap() which is used in a lot of PDF
> ...
> I saw some TODO that relate to that issue in the code, is it planned to be 
> implemented soon?
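
For reference, the flags mentioned in the issue are single bits of the integer 
/Flags entry of the font descriptor (PDF Reference 1.7, section 5.7.1). The 
following is only a minimal sketch of how those bits can be read and set. The 
class FontDescriptorFlags and its methods are hypothetical names made up for 
illustration. They are not PDFBox API and do not show how PDFontDescriptor 
works internally. Only the bit positions are taken from the specification.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of PDFBox.
final class FontDescriptorFlags {
    // The PDF Reference numbers bits from 1, so bit n has the value 1 << (n - 1).
    static final int FIXED_PITCH = 1 << 0;   // bit 1
    static final int SERIF       = 1 << 1;   // bit 2
    static final int SYMBOLIC    = 1 << 2;   // bit 3
    static final int SCRIPT      = 1 << 3;   // bit 4
    static final int NONSYMBOLIC = 1 << 5;   // bit 6
    static final int ITALIC      = 1 << 6;   // bit 7
    static final int ALL_CAP     = 1 << 16;  // bit 17
    static final int SMALL_CAP   = 1 << 17;  // bit 18
    static final int FORCE_BOLD  = 1 << 18;  // bit 19

    private static final int[] MASKS = {
        FIXED_PITCH, SERIF, SYMBOLIC, SCRIPT, NONSYMBOLIC,
        ITALIC, ALL_CAP, SMALL_CAP, FORCE_BOLD
    };
    private static final String[] NAMES = {
        "FixedPitch", "Serif", "Symbolic", "Script", "Nonsymbolic",
        "Italic", "AllCap", "SmallCap", "ForceBold"
    };

    private FontDescriptorFlags() {
    }

    // Returns true if the bit selected by mask is set in flags.
    static boolean isSet(int flags, int mask) {
        return (flags & mask) != 0;
    }

    // Returns a copy of flags with the bit selected by mask set or cleared.
    static int with(int flags, int mask, boolean on) {
        return on ? (flags | mask) : (flags & ~mask);
    }

    // Lists the names of all set flags, in ascending bit order.
    static List<String> describe(int flags) {
        List<String> names = new ArrayList<String>();
        for (int i = 0; i < MASKS.length; i++) {
            if (isSet(flags, MASKS[i])) {
                names.add(NAMES[i]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        int flags = with(with(0, NONSYMBOLIC, true), SERIF, true);
        System.out.println(flags + " " + describe(flags));
    }
}
```

A setter such as setAllCap(true) would then amount to something like 
with(flags, ALL_CAP, true) followed by writing the integer back to /Flags.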



--
This message was sent by Atlassian JIRA
(v6.2#6252)
