Edward,

Edward Ling wrote
> an internal company project to extract the text from Mortgage Lender's
> (KFI) pdf documents [...]
> 
> It fails with: Dictionary key is not a name. [...]
> 
> Is this a known bug in the library? I do not have any control in the
> format/content of the pdf, and maybe it is malformed, but it is readable
> with numerous pdf readers. Maybe someone with more knowledge of the
> library could tell me why this pdf (and all the pdfs from this 'supplier')
> is causing this exception.

While I'm far from knowledgeable about fonts in PDF in general and their
handling in iText in particular, I started the debugger and observed that
iText stumbles when trying to parse the Type0 font LMDEPH+Tahoma-Bold (17 0
obj), more precisely when parsing its ToUnicode CMap (40 0 obj):

/CIDInit /ProcSet findresource begin 12 dict begin begincmap 
CIDSystemInfo <</Registry (F1+0) /Ordering (F1) /Supplement 0
/CMapName /F1+0 def
/CMapType 2 def
[...]

When parsing the CIDSystemInfo dictionary, iText fails at the first
appearance of "def", which indeed is not a name.

I think the closing brackets ">>" of the CIDSystemInfo dictionary
are simply missing and should have been right after "/Supplement 0".
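
To illustrate what I think the parser runs into, here is a minimal Python
sketch. It is not iText's actual code: the tokenizer and the function name
first_bad_dict_key are my own simplifications, and it only handles flat
key/value dictionaries like the one in the excerpt above. It walks the tokens
after "<<" in key/value pairs and reports the first token in key position
that is not a name.

```python
import re

# Very rough PDF/PostScript tokenizer (an assumption, not iText's lexer):
# dictionary delimiters, literal strings, names, and bare words or numbers.
TOKEN = re.compile(rb"<<|>>|\((?:\\.|[^\\)])*\)|/[^\s/<>()\[\]]*|[^\s/<>()\[\]]+")


def first_bad_dict_key(cmap: bytes):
    """Return the first token in dictionary-key position that is not a name
    (does not start with '/'), or None if no such token is found.

    Simplified: nested dictionaries and arrays as values are not handled,
    and a dictionary that merely runs off the end of the data is not flagged.
    """
    tokens = TOKEN.findall(cmap)
    i = 0
    while i < len(tokens):
        if tokens[i] == b"<<":
            i += 1
            # Inside a dictionary: tokens alternate key, value, key, value...
            while i < len(tokens) and tokens[i] != b">>":
                key = tokens[i]
                if not key.startswith(b"/"):
                    return key.decode("latin-1")
                i += 2  # skip the key and its value
        i += 1
    return None


# The CMap header as quoted above (the ">>" is missing).
EXCERPT = (
    b"/CIDInit /ProcSet findresource begin 12 dict begin begincmap \n"
    b"CIDSystemInfo <</Registry (F1+0) /Ordering (F1) /Supplement 0\n"
    b"/CMapName /F1+0 def\n"
    b"/CMapType 2 def\n"
)

# The same header with ">>" added where I believe it belongs.
PATCHED = EXCERPT.replace(b"/Supplement 0\n", b"/Supplement 0 >> def\n")

print(first_bad_dict_key(EXCERPT))  # prints: def
print(first_bad_dict_key(PATCHED))  # prints: None
```

Without the ">>", "/CMapName /F1+0" is swallowed as just another key/value
pair of the dictionary, so the next token expected to be a key is "def".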

So it seems that the ToUnicode map here indeed is malformed. On the other
hand, though, I know very little of fonts and may be completely off the
track... :)

Regards,   Michael

--
View this message in context: 
http://itext-general.2136553.n4.nabble.com/iTextSharpe-exception-calling-PdfTextExtractor-GetTextFromPage-tp4327037p4344694.html
Sent from the iText - General mailing list archive at Nabble.com.
