Hello everyone,
 
An application I am developing can extract text from almost any PDF document. To do so, it has to parse each font's ToUnicode stream, which contains a CMap.
 
In a small percentage of files (3 out of 2000 in my test batch), the CMap looks like this:
 
/CIDInit /ProcSet findresource begin
12 dict begin
begincmap
/CIDSystemInfo <<
/Registry (Arial,Bold+0) def
/Ordering (T1UV) def
/Supplement 0 def
>> def
/CMapName /Arial,Bold+0 def
1 begincodespacerange <20> <ff> endcodespacerange
31 beginbfrange ....
 
Each entry in the embedded CIDSystemInfo dictionary is followed by the word def. My parser expects the strict << /Name Value >> structure, so this extra word makes the dictionary unreadable.
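
To make the problem concrete, here is a minimal Python sketch of the kind of lenient parsing that would get past such a dictionary. It is not my actual parser: the name parse_dict_lenient and the token pattern are made up for illustration, it only handles a flat dictionary with simple values (no nested dictionaries, arrays, hex strings or nested parentheses), and it assumes that a bare def following a value can simply be ignored, which is exactly the point I am unsure about.

```python
import re

# Hypothetical token pattern: dictionary delimiters, literal strings without
# nested parentheses, /Names, and bare words or numbers.
TOKEN = re.compile(r"<<|>>|\([^()]*\)|/[^\s/<>()\[\]]+|[^\s/<>()\[\]]+")


def parse_dict_lenient(text):
    """Parse one flat << /Name value ... >> dictionary.

    Values are returned as raw token strings. A bare 'def' that appears
    where a key is expected is skipped (an assumption, not spec behaviour).
    """
    tokens = TOKEN.findall(text)
    if not tokens or tokens[0] != "<<":
        raise ValueError("expected '<<' at start of dictionary")

    result = {}
    key = None
    for tok in tokens[1:]:
        if tok == ">>":
            if key is not None:
                raise ValueError("key /%s has no value" % key)
            return result
        if key is None:
            if tok == "def":
                continue  # assumption: trailing 'def' carries no data
            if not tok.startswith("/"):
                raise ValueError("expected a /Name, got %r" % tok)
            key = tok[1:]
        else:
            result[key] = tok
            key = None
    raise ValueError("missing '>>' at end of dictionary")


sample = """<<
/Registry (Arial,Bold+0) def
/Ordering (T1UV) def
/Supplement 0 def
>>"""

print(parse_dict_lenient(sample))
```

With this sketch, the dictionary above and the same dictionary without the def words yield the same three entries. Whether dropping def like this is actually correct is part of what I am asking.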
 
My questions are: what exactly does the word def mean, and what's it doing inside a dictionary? Can someone shed some light?
 
Any help would be greatly appreciated!

Peter
