Darren,

FDnC Red wrote:
> What I don't understand is the correlation between the hex values and the
> difference array. I don't see anyway to map 0x1d to T even though I know
> and understand that T= /g55 and 55 + 29 = 84 which is T.  Where can I find
> that 0x1d = /g55? If I knew that I could extract the text properly, at
> least for this one PDF.

OK, so the question is how to read the Differences array as a mapping from
character codes to glyph names.

According to the PDF specification, ISO 32000-1, section 9.6.6.1:

> The value of the Differences entry shall be an array of character codes
> and character names organized as follows:
> 
> code1 name1,1 name1,2 …
> code2 name2,1 name2,2 …
> …
> coden namen,1 namen,2 …
> 
> Each code shall be the first index in a sequence of character codes to be
> changed. The first character name after the code becomes the name
> corresponding to that code. Subsequent names replace consecutive code
> indices until the next code appears in the array or the array ends. These
> sequences may be specified in any order but shall not overlap.

Thus, in the case of your Differences array

[2,  /g51, /g85, /g82, /g77, /g72, /g70,
/g87, /g48, /g68, /g81, /g88, /g79, /g76, /g71,
/g74, /g54, /g83, /g73, /g86, /g38, /g56, /g43,
/g36, /g44, /g3, /g9, /g40, /g55, /g53, /g50,
/g49, /g15, /g47, /g24, /g19, /g90, /g75, /g25,
/g37, /g20, /g28, /g23, /g21, /g11, /g12, /g16,
/g27, /g22, /g45, /g41]

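code 2 maps to /g51, 3 maps to /g85, 4 maps to /g82, and so on, one code per
name. Since the array starts at code 2, code 0x1d (decimal 29) takes the 28th
name in the list, which is /g55.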
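
To make the counting concrete, here is a minimal sketch in Python (not iText
code) that expands a Differences array into a code-to-name table following
the rule quoted above. The function names parse_differences and
glyph_name_to_char are made up for this illustration. The second helper uses
the "+ 29" offset from your own observation (55 + 29 = 84 = 'T'); that offset
is an assumption that appears to hold for this one PDF and is not something
the specification guarantees.

```python
def parse_differences(differences):
    """Expand a Differences array into a {code: glyph_name} dict.

    An integer starts a new run of codes; each following name is
    assigned to the next consecutive code (ISO 32000-1, 9.6.6.1).
    """
    mapping = {}
    code = None
    for item in differences:
        if isinstance(item, int):
            code = item              # start of a new run
        else:
            if code is None:
                raise ValueError("glyph name appears before any code")
            mapping[code] = item     # name for the current code
            code += 1                # next name gets the next code
    return mapping


def glyph_name_to_char(name, offset=29):
    """Hypothetical helper: '/g55' -> chr(55 + 29) -> 'T'.

    The offset of 29 is taken from the question and is only an
    assumption for this particular PDF.
    """
    return chr(int(name[2:]) + offset)


# The glyph numbers from the array in this thread, in order.
numbers = (
    "51 85 82 77 72 70 87 48 68 81 88 79 76 71 "
    "74 54 83 73 86 38 56 43 36 44 3 9 40 55 53 50 "
    "49 15 47 24 19 90 75 25 37 20 28 23 21 11 12 16 "
    "27 22 45 41"
)
differences = [2] + ["/g" + n for n in numbers.split()]

code_to_name = parse_differences(differences)
print(code_to_name[0x1d])                      # /g55
print(glyph_name_to_char(code_to_name[0x1d]))  # T
```

A dictionary is used rather than a list because a Differences array may
contain several runs starting at arbitrary codes, so the covered codes need
not be contiguous.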

Regards,   Michael