[ 
https://issues.apache.org/jira/browse/PDFBOX-1129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475617#comment-13475617
 ] 

Andreas Lehmkühler edited comment on PDFBOX-1129 at 10/13/12 2:13 PM:
----------------------------------------------------------------------

I'm afraid the PDF doesn't contain any information on how to map those glyphs. 
Even Adobe Reader isn't able to map them.

Set resolution to "not a problem"
                
      was (Author: lehmi):
    I'm afraid the pdf doesn't contain any information how to map those glyphs. 
Even the adobe rerader isn't able to map those.

Set resolution to "not a problem"
                  
> Quote glyphs (quoteright, quotedblright, etc.) not mapped to the right 
> Unicode character
> ----------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-1129
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1129
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 1.7.0
>            Reporter: Michael McCandless
>            Assignee: Andreas Lehmkühler
>            Priority: Minor
>         Attachments: 000086.pdf
>
>
> I have an example PDF (will attach) that uses a right-single-quote
> character, but PDFBox extracts it incorrectly (using ExtractText).
> If I copy/paste, the text is correct (I get U+2019 for the right
> quote).
> Search for "cashier" on page 1 of the PDF to see it; I think that right
> quote is supposed to come through as U+2019.
> I looked at the PDF in PDFDebugger, and I see this fragment in the
> "Contents" for page 1:
>   (Bring the voucher handout to the cashier\325s office \(10-180\))Tj
> So somehow this \325 escape fails to map to the quoteright glyph.  The
> font is a partially embedded font, BPOLKO+TimesNewRomanPSMT, and I can see
> in the Charset (under FontDescriptor, for font F1) that it references
> this glyph.
> I also see a [correct] entry in glyphlist.txt, mapping to U+2019, so
> that's not the problem.
> Not sure what's going wrong... maybe somehow \325 fails to map to
> quoteright? 
> There are other glyphs (quotedblright, quotedblleft) that are also not
> converted correctly, e.g. search for "project review" on page 2.
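
The chain the report describes (octal escape, then character code, then glyph name, then Unicode) can be sketched as below. This is a minimal illustration, not PDFBox's implementation: the class and method names are hypothetical, and the code-to-glyph table is an assumed excerpt of the MacRomanEncoding values from the PDF specification, not data read from 000086.pdf. Per the comment above, the attached PDF does not carry the information needed to build that table, which is the step where extraction falls over.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: decode a PDF literal string body and map character
// codes to Unicode through glyph names. Not taken from PDFBox.
class QuoteEscapeSketch {

    // ASSUMPTION: excerpt of MacRomanEncoding (octal codes). The real file
    // may declare a different encoding or none at all.
    static final Map<Integer, String> CODE_TO_GLYPH = new HashMap<>();

    // Excerpt of the glyph-name-to-Unicode step (the role of glyphlist.txt).
    static final Map<String, Character> GLYPH_TO_UNICODE = new HashMap<>();

    static {
        CODE_TO_GLYPH.put(0322, "quotedblleft");
        CODE_TO_GLYPH.put(0323, "quotedblright");
        CODE_TO_GLYPH.put(0324, "quoteleft");
        CODE_TO_GLYPH.put(0325, "quoteright");
        GLYPH_TO_UNICODE.put("quotedblleft", '\u201C');
        GLYPH_TO_UNICODE.put("quotedblright", '\u201D');
        GLYPH_TO_UNICODE.put("quoteleft", '\u2018');
        GLYPH_TO_UNICODE.put("quoteright", '\u2019');
    }

    // Map one character code to Unicode; codes missing from the table are
    // passed through unchanged, mimicking a lookup with no encoding info.
    static char mapCode(int code) {
        String glyph = CODE_TO_GLYPH.get(code);
        Character unicode = (glyph == null) ? null : GLYPH_TO_UNICODE.get(glyph);
        return (unicode != null) ? unicode : (char) code;
    }

    // Decode the body of a literal string (the text between the outer
    // parentheses). Handles octal escapes of one to three digits; any other
    // escaped character is copied literally, which is a simplification.
    static String decode(String body) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < body.length()) {
            char c = body.charAt(i++);
            if (c != '\\' || i >= body.length()) {
                out.append(c);
                continue;
            }
            char next = body.charAt(i);
            if (next >= '0' && next <= '7') {
                int code = 0;
                int digits = 0;
                while (digits < 3 && i < body.length()
                        && body.charAt(i) >= '0' && body.charAt(i) <= '7') {
                    code = code * 8 + (body.charAt(i) - '0');
                    i++;
                    digits++;
                }
                out.append(mapCode(code & 0xFF));
            } else {
                out.append(next);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String body = "Bring the voucher handout to the cashier\\325s office \\(10-180\\)";
        System.out.println(decode(body));
    }
}
```

With the assumed table in place, octal 325 (decimal 213) resolves to quoteright and then to U+2019. Remove the CODE_TO_GLYPH entries and the same input yields the raw code instead, which is roughly the symptom reported here.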

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
