Darren,

FDnC Red wrote
> What I don't see is how do I know that the Text Stream
>
> [()40.2(\()18.3(")0()]TJ
>
> maps to the glyph names in the Differences array.

First of all, that extract from the content stream is incomplete. The pairs of round brackets (i.e. string delimiters) contain the following bytes:

1. pair: 0x1D 0x18 - both missing in your excerpt
2. pair: 0x5C 0x28 - an opening bracket (0x28) escaped by the backslash (0x5C)
3. pair: 0x22 - a double quote
4. pair: 0x1C - missing in your excerpt

(When making such an excerpt, always remember that you are dealing with arbitrary bytes here, not merely bytes that map properly to characters in ASCII, Latin-1 or Unicode! Your excerpt dropped all bytes in the control character range.)

So here you see the

> characters extracted as hex 0x1d 0x18 0x28 0x22 0x1c

The Differences array

[2, /g51, /g85, /g82, /g77, /g72, /g70, /g87, /g48, /g68, /g81, /g88, /g79, /g76, /g71, /g74, /g54, /g83, /g73, /g86, /g38, /g56, /g43, /g36, /g44, /g3, /g9, /g40, /g55, /g53, /g50, /g49, /g15, /g47, /g24, /g19, /g90, /g75, /g25, /g37, /g20, /g28, /g23, /g21, /g11, /g12, /g16, /g27, /g22, /g45, /g41]

starts at character code 2 and assigns the glyph names to consecutive codes from there. It maps these bytes as follows, and in combination with an offset of 29 you get:

byte   glyph   glyph number + 29   character
0x1d   /g55    55+29=84            T
0x18   /g36    36+29=65            A
0x28   /g37    37+29=66            B
0x22   /g47    47+29=76            L
0x1c   /g40    40+29=69            E

This offset of 29, while seemingly arbitrary, can often be seen in the glyph indices in fonts.

Regards,
Michael

--
View this message in context: http://itext-general.2136553.n4.nabble.com/Extracting-Text-tp4660444p4660451.html
Sent from the iText - General mailing list archive at Nabble.com.
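
The lookup described in the reply can be sketched in a few lines of Python. This is only an illustration: the function names `build_code_to_glyph` and `decode` are made up for this example, and the offset of 29 is the empirical value observed for this particular font, not something the PDF format guarantees. The array and the bytes are copied from the post.

```python
# Differences array as quoted in the post (commas are not part of PDF syntax).
DIFFERENCES = (
    "2 /g51 /g85 /g82 /g77 /g72 /g70 /g87 /g48 /g68 /g81 /g88 /g79 /g76 "
    "/g71 /g74 /g54 /g83 /g73 /g86 /g38 /g56 /g43 /g36 /g44 /g3 /g9 /g40 "
    "/g55 /g53 /g50 /g49 /g15 /g47 /g24 /g19 /g90 /g75 /g25 /g37 /g20 /g28 "
    "/g23 /g21 /g11 /g12 /g16 /g27 /g22 /g45 /g41"
)

# Assumption: the offset observed for this font in the post.
GLYPH_OFFSET = 29


def build_code_to_glyph(differences):
    """Map character codes to glyph names.

    A number sets the current character code; every name that follows is
    assigned to the current code, which is then incremented by one.
    """
    mapping = {}
    code = 0
    for token in differences.split():
        if token.startswith("/"):
            mapping[code] = token[1:]
            code += 1
        else:
            code = int(token)
    return mapping


def decode(raw, code_to_glyph, offset=GLYPH_OFFSET):
    """Turn content stream bytes into text using the 'gNN + offset' guess."""
    chars = []
    for byte in raw:
        glyph_number = int(code_to_glyph[byte][1:])  # "g55" -> 55
        chars.append(chr(glyph_number + offset))
    return "".join(chars)


code_to_glyph = build_code_to_glyph(DIFFERENCES)

# Bytes of the four strings in the TJ array, with "\(" already unescaped to 0x28.
raw = bytes([0x1D, 0x18, 0x28, 0x22, 0x1C])
print(decode(raw, code_to_glyph))  # prints "TABLE"
```

Note that this only works because the glyph names in this font happen to encode a number that is a fixed distance from the character code. For other fonts a ToUnicode map, if present, is the more reliable source.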
_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions