Edward,

Edward Ling wrote
> an internal company project to extract the text from Mortgage Lender's
> (KFI) pdf documents [...]
>
> It fails with: Dictionary key is not a name. [...]
>
> Is this a known bug in the library? I do not have any control in the
> format/content of the pdf, and maybe it is malformed, but it is readable
> with numerous pdf readers. Maybe someone with more knowledge of the
> library could tell me why this pdf (and all the pdfs from this 'supplier')
> is causing this exception.

While I'm far from knowledgeable about fonts in PDF in general and their handling in iText in particular, I started the debugger and observed that iText stumbles when trying to parse the Type0 font LMDEPH+Tahoma-Bold (17 0 obj), more exactly when parsing its ToUnicode CMap (40 0 obj):

    /CIDInit /ProcSet findresource begin
    12 dict begin
    begincmap
    CIDSystemInfo <</Registry (F1+0) /Ordering (F1) /Supplement 0
    /CMapName /F1+0 def
    /CMapType 2 def
    [...]

When parsing the CIDSystemInfo dictionary, iText fails at the first appearance of "def", which indeed is not a name: the parser is still inside the dictionary at that point and expects either another key or the end of the dictionary.

I think the closing brackets ">>" of the CIDSystemInfo dictionary are simply missing and should have come right after "/Supplement 0". So it seems that the ToUnicode map here is indeed malformed. That would also explain why other PDF readers display the file without complaint: the ToUnicode map is only needed for text extraction, not for rendering.

On the other hand, I know very little about fonts and may be completely off track... :)

Regards,
Michael
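
P.S. To double-check this kind of diagnosis outside the debugger, here is a minimal sketch in Python. It is not iText's parser: the tokenizer is deliberately simplified (no nested parentheses in strings, no comments), the function name first_bad_dict_key is my own invention, and the two sample strings are modeled on the excerpt above rather than copied from the actual stream. It walks the tokens and reports the first one that appears where a dictionary key (a name starting with "/") was expected.

```python
import re

# Simplified PDF/PostScript tokenizer (an assumption, not iText's lexer).
TOKEN = re.compile(r"""
    \((?:\\.|[^\\()])*\)      # literal string without nested parentheses
  | <<|>>                     # dictionary delimiters
  | <[0-9A-Fa-f\s]*>          # hex string
  | /[^\s/<>\[\]()]*          # name
  | [\[\]]                    # array delimiters
  | [^\s/<>\[\]()]+           # number or operator such as "def"
""", re.VERBOSE)


def _value_done(stack):
    # A complete value was read; an enclosing dictionary now expects a key.
    if stack and stack[-1][0] == "dict":
        stack[-1][1] = True


def first_bad_dict_key(text):
    """Return the first token found where a dictionary key was expected, or None."""
    stack = []  # open containers: ["dict", expecting_key] or ["array", False]
    for match in TOKEN.finditer(text):
        tok = match.group()
        top = stack[-1] if stack else None
        if top and top[0] == "dict" and top[1]:
            if tok == ">>":
                stack.pop()
                _value_done(stack)
            elif tok.startswith("/"):
                top[1] = False  # key read, a value comes next
            else:
                return tok  # e.g. "def" inside an unclosed dictionary
            continue
        if tok == "<<":
            stack.append(["dict", True])
        elif tok == "[":
            stack.append(["array", False])
        elif tok == "]":
            if top and top[0] == "array":
                stack.pop()
                _value_done(stack)
        elif tok == ">>":
            if top and top[0] == "dict":
                stack.pop()
                _value_done(stack)
        else:
            _value_done(stack)
    return None


# Sample data modeled on the excerpt above (hypothetical, not the real stream).
MALFORMED = ("begincmap CIDSystemInfo <</Registry (F1+0) /Ordering (F1) "
             "/Supplement 0 /CMapName /F1+0 def /CMapType 2 def")
REPAIRED = ("begincmap CIDSystemInfo <</Registry (F1+0) /Ordering (F1) "
            "/Supplement 0 >> def /CMapName /F1+0 def /CMapType 2 def")

print(first_bad_dict_key(MALFORMED))
print(first_bad_dict_key(REPAIRED))
```

With the ">>" missing, "/CMapName /F1+0" is consumed as one more key/value pair of the dictionary, so the first token that cannot be a key is "def". That matches what I saw in the debugger. With ">>" inserted after "/Supplement 0", the check finds nothing to complain about.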
