Darren,

FDnC Red wrote
> What I don't see is how do I know that the Text Stream
> 
> [()40.2(\()18.3(")0()]TJ
> 
> maps to the glyph names in the Differences array.

First of all, that excerpt from the content stream is incomplete. The pairs
of round brackets (i.e. string delimiters) contain the following bytes:

1st pair: 0x1D 0x18 - both missing in your excerpt
2nd pair: 0x5C 0x28 - an opening bracket (0x28) escaped by a backslash (0x5C)
3rd pair: 0x22 - a double quote
4th pair: 0x1C - missing in your excerpt

(When making such an excerpt, always remember that you are dealing with
arbitrary bytes here, not merely bytes that map properly to characters in
ASCII, Latin-1 or Unicode! Your excerpt dropped all bytes in the control
character range.)
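
To make the byte-level view concrete, here is a minimal Python sketch that pulls the raw string bytes out of the TJ operand. The function name literal_strings is hypothetical, and the parser is deliberately simplified: it assumes well-formed input and treats a backslash as "take the next byte literally", which is enough for this excerpt but ignores the other escapes PDF allows (\n, octal codes, and so on).

```python
def literal_strings(data: bytes) -> list[bytes]:
    """Collect the raw bytes of every (...) literal string in a TJ operand.

    Simplified sketch: only handles a backslash followed by a byte that is
    to be taken literally, such as an escaped bracket.
    """
    out = []
    i = 0
    while i < len(data):
        if data[i] != 0x28:          # not '(': skip numbers, [ ] and the operator
            i += 1
            continue
        i += 1                       # step past the opening bracket
        depth = 1
        buf = bytearray()
        while i < len(data):
            b = data[i]
            if b == 0x5C:            # backslash: keep the following byte as-is
                buf.append(data[i + 1])
                i += 2
                continue
            if b == 0x28:            # unescaped nested opening bracket
                depth += 1
            elif b == 0x29:          # closing bracket
                depth -= 1
                if depth == 0:
                    i += 1
                    break
            buf.append(b)
            i += 1
        out.append(bytes(buf))
    return out


# The TJ operand with the control bytes that the excerpt dropped put back in.
operand = b'[(\x1d\x18)40.2(\\()18.3(")0(\x1c)]TJ'
strings = literal_strings(operand)
print(b"".join(strings).hex(" "))
```

Printing the joined bytes as hex, rather than as text, is what keeps the control characters visible.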

So here you see the
> characters extracted as hex 0x1d 0x18 0x28 0x22 0x1c

The differences array

[2,  /g51, /g85, /g82, /g77, /g72, /g70,
 /g87, /g48, /g68, /g81, /g88, /g79, /g76, /g71,
 /g74, /g54, /g83, /g73, /g86, /g38, /g56, /g43,
 /g36, /g44, /g3, /g9, /g40, /g55, /g53, /g50,
 /g49, /g15, /g47, /g24, /g19, /g90, /g75, /g25,
 /g37, /g20, /g28, /g23, /g21, /g11, /g12, /g16,
 /g27, /g22, /g45, /g41]

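maps these bytes to glyph names as follows (the leading 2 means that the
first name, /g51, belongs to code 2, /g85 to code 3, and so on). Adding an
offset of 29 to the number in each glyph name then gives the ASCII code: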

0x1d /g55 55+29=84 T
0x18 /g36 36+29=65 A
0x28 /g37 37+29=66 B
0x22 /g47 47+29=76 L
0x1c /g40 40+29=69 E

This offset of 29, while seemingly arbitrary, can often be seen in the glyph
indices of fonts.
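
The lookup described above can be sketched in a few lines of Python. The names build_encoding, decode and OFFSET are made up for this illustration, and the offset of 29 together with the gNN naming scheme is an assumption that holds for this particular font only, not a general rule.

```python
# Differences array from the font dictionary: a start code followed by names.
DIFFERENCES = [
    2, "g51", "g85", "g82", "g77", "g72", "g70",
    "g87", "g48", "g68", "g81", "g88", "g79", "g76", "g71",
    "g74", "g54", "g83", "g73", "g86", "g38", "g56", "g43",
    "g36", "g44", "g3", "g9", "g40", "g55", "g53", "g50",
    "g49", "g15", "g47", "g24", "g19", "g90", "g75", "g25",
    "g37", "g20", "g28", "g23", "g21", "g11", "g12", "g16",
    "g27", "g22", "g45", "g41",
]

OFFSET = 29  # assumption: specific to this font


def build_encoding(differences):
    """Map character codes to glyph names.

    A number sets the next code; each following name takes that code,
    and the code is then incremented by one.
    """
    encoding = {}
    code = None
    for item in differences:
        if isinstance(item, int):
            code = item
        else:
            encoding[code] = item
            code += 1
    return encoding


def decode(data: bytes, encoding) -> str:
    """Turn string bytes into text via glyph name number plus OFFSET."""
    return "".join(chr(int(encoding[b][1:]) + OFFSET) for b in data)


encoding = build_encoding(DIFFERENCES)
print(decode(bytes([0x1D, 0x18, 0x28, 0x22, 0x1C]), encoding))  # TABLE
```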

Regards,   Michael



