[jira] [Created] (PDFBOX-1283) Unicode characters displayed with wrong Advance

Daniel Schwinn (Created) (JIRA) Fri, 06 Apr 2012 15:04:45 -0700

Unicode characters displayed with wrong Advance
-----------------------------------------------


                 Key: PDFBOX-1283
                 URL: https://issues.apache.org/jira/browse/PDFBOX-1283
             Project: PDFBox
          Issue Type: Bug
          Components: PDFReader
    Affects Versions: 1.6.0
            Reporter: Daniel Schwinn
         Attachments: AnnahmeReport_MitRussischTest.pdf

The file AnnahmeReport_MitRussischTest.pdf is not displayed correctly: the 
advance of the characters is calculated incorrectly. The same document is 
displayed correctly in Adobe Reader.

In PDCIDFont.java, the method extractWidths() fills widthCache with the 
character widths taken from the font's /W array. The widthCache appears to be 
keyed by Unicode value, but the /W array maps CID codes to character widths.
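
As a rough illustration of why the keys of /W are CIDs, the sketch below 
expands a /W-style array into a CID-to-width map. It follows the two entry 
forms the PDF specification defines for /W ("c [w1 ... wn]" and 
"cFirst cLast w"). It does not use the PDFBox API: the class name 
CidWidthArray, the method expand() and the plain java.util.List input are 
hypothetical, and the width values are made up.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of PDFBox.
final class CidWidthArray {

    /**
     * Expands a /W-style array into a map from CID to width.
     * Elements are Numbers, or a List of Numbers for the bracketed form.
     */
    static Map<Integer, Float> expand(List<?> w) {
        Map<Integer, Float> widths = new HashMap<>();
        int i = 0;
        while (i < w.size()) {
            int first = ((Number) w.get(i++)).intValue();
            Object next = w.get(i++);
            if (next instanceof List) {
                // Form "c [w1 w2 ... wn]": consecutive CIDs starting at c.
                int cid = first;
                for (Object width : (List<?>) next) {
                    widths.put(cid++, ((Number) width).floatValue());
                }
            } else {
                // Form "cFirst cLast w": one width for the whole CID range.
                int last = ((Number) next).intValue();
                float width = ((Number) w.get(i++)).floatValue();
                for (int cid = first; cid <= last; cid++) {
                    widths.put(cid, width);
                }
            }
        }
        return widths;
    }

    public static void main(String[] args) {
        // Made-up values: CIDs 20..23 listed individually, 100..102 as a range.
        List<Object> w = Arrays.asList(
                20, Arrays.asList(500, 510, 520, 530), 100, 102, 750);
        System.out.println(expand(w));
    }
}
```

Every key in the resulting map is a CID, so looking it up with a Unicode 
value such as 0x31 ('1') finds either nothing or the width of an unrelated 
glyph.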

In this PDF file the TTF font is embedded, and the CID code is identical to 
the glyph ID (GID) in the TTF font. A cmap maps directly from Unicode to the 
CID/GID in the TTF font.

So either the cache is filled incorrectly, or the code that reads the cache 
does not take into account that the array contains widths indexed by CID/GID.

The cmap encoding has to be applied either when filling the cache or when 
reading values from it.
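
A minimal sketch of the second option (translating on read), assuming the 
cmap is available as a Unicode-to-CID map. This is not the PDFBox 
implementation: the class CidWidthLookup and its methods are hypothetical. 
The fallback of 1000 is the default the PDF specification gives for /DW, 
and the sample mapping of '1' to CID 20 follows the example in this report; 
the width 500 is made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox.
final class CidWidthLookup {
    private final Map<Integer, Integer> unicodeToCid; // from the cmap
    private final Map<Integer, Float> cidToWidth;     // from the /W array
    private final float defaultWidth;                 // from /DW

    CidWidthLookup(Map<Integer, Integer> unicodeToCid,
                   Map<Integer, Float> cidToWidth,
                   float defaultWidth) {
        this.unicodeToCid = unicodeToCid;
        this.cidToWidth = cidToWidth;
        this.defaultWidth = defaultWidth;
    }

    /** Width for a CID, falling back to the default width. */
    float widthForCid(int cid) {
        Float width = cidToWidth.get(cid);
        return width != null ? width : defaultWidth;
    }

    /** Width for a Unicode code point: map to the CID first, then look up. */
    float widthForUnicode(int codePoint) {
        Integer cid = unicodeToCid.get(codePoint);
        return cid != null ? widthForCid(cid) : defaultWidth;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> cmap = new HashMap<>();
        cmap.put((int) '1', 20);      // '1' is glyph 20 in this report
        Map<Integer, Float> widths = new HashMap<>();
        widths.put(20, 500f);         // made-up width for CID 20
        CidWidthLookup lookup = new CidWidthLookup(cmap, widths, 1000f);
        System.out.println(lookup.widthForUnicode('1'));
    }
}
```

The alternative is to apply the same translation once, while filling the 
cache, so that the cache really is keyed by Unicode value.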

To rule out the possibility that the PDF file is faulty and Adobe Reader 
simply ignores a faulty /W array, I checked whether Adobe Reader actually uses 
the values in /W to determine the widths.



When the entries for glyphs 20..23 in the /W array of the bold font are 
changed (the first 4 values in the second line of the array, which correspond 
to the characters '1'..'4'), the digits are displayed with wrong widths in 
Adobe Reader, while nothing changes in PDFBox 
(see file AnnahmeReport_MitRussischTest_Modified.pdf).







--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
