On 10/31/07, Brad Wilmot <[EMAIL PROTECTED]> wrote:
>
> I'm trying to read a PDF with six embedded fonts whereby the content
> uses this technique of encoding. Most of the raw text characters are in
> the hex range of 01-40.
>
It's entirely possible that you're actually looking at a custom encoding
rather than glyph indexes.
If your fonts have an /Encoding dictionary with a /Differences array, you'll
need to parse that array to map each byte code to a glyph name, and from
there to a character.
Your font might look something like this:
<</Type /Font
/Encoding <</Type /Encoding
/Differences [ startIndex1 /charName1 /charName2 ... ] >>
...
>>
That's not valid for a /Subtype /Type0 font, but is perfectly legal for any
of the single-byte font types.
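As a rough sketch of what parsing a differences array involves (plain Python
for brevity, not iText API): each integer in the array sets the current code,
and each name after it is assigned to that code, which then increments. The
function names, the sample array, and the tiny glyph-name table below are
made up for illustration. A real decoder would use the Adobe Glyph List and
fall back to the font's base encoding for codes the array doesn't cover.

```python
def parse_differences(differences):
    """Expand a /Differences array into a {code: glyph_name} dict.

    `differences` is assumed to be already tokenized: Python ints for the
    start codes and strings such as "/A" for the glyph names.
    """
    mapping = {}
    code = None
    for item in differences:
        if isinstance(item, int):
            # An integer resets the running code.
            code = item
        else:
            if code is None:
                raise ValueError("glyph name appeared before any start code")
            mapping[code] = item.lstrip("/")
            code += 1  # consecutive names take consecutive codes
    return mapping


# Hypothetical stand-in for the Adobe Glyph List (glyph name -> character).
GLYPH_TO_UNICODE = {"H": "H", "e": "e", "l": "l", "o": "o", "space": " "}


def decode(raw, code_to_name, glyph_to_unicode=GLYPH_TO_UNICODE):
    """Decode single-byte content using the code -> glyph name map.

    Codes missing from the map become U+FFFD here; this sketch ignores the
    base encoding that a real font would fall back to.
    """
    out = []
    for byte in raw:
        name = code_to_name.get(byte)
        out.append(glyph_to_unicode.get(name, "\ufffd"))
    return "".join(out)


# Made-up subset font: codes 1-4 carry the glyphs actually used.
differences = [1, "/H", "/e", "/l", "/o", 32, "/space"]
code_to_name = parse_differences(differences)
text = decode(b"\x01\x02\x03\x03\x04", code_to_name)
print(text)
```

A subset built this way would explain raw text bytes clustered in a low range
like 01-40: the codes are just assigned in order as glyphs get added.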
PS: LiquidOffice uses this technique when building subsets.
PPS: Can you share one of your PDFs with us?
--
--Mark Storer
Professional Geek