Hi, I was using PoDoFo to extract text from a PDF. Getting the Unicode characters from glyphs (i.e. for the Tj operator) did not work in some cases.
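For anyone unfamiliar with the mechanism: a font's ToUnicode CMap lists code-to-Unicode mappings in sections such as `N beginbfchar ... endbfchar` and `N beginbfrange ... endbfrange`, each with its own entry count. Below is a minimal standalone sketch of reading such sections. It is not PoDoFo's code: `ParseToUnicode` and `ParseHexToken` are hypothetical names, and it assumes whitespace-separated tokens, single-value destinations, and no array form for ranges. The point it shows is that the per-section counter starts from zero for every section.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: turn a token such as "<0041>" into its numeric value.
static std::uint32_t ParseHexToken(const std::string& tok)
{
    // Assumption: the token is a single hex string wrapped in '<' and '>'.
    assert(tok.size() >= 2 && tok.front() == '<' && tok.back() == '>');
    return static_cast<std::uint32_t>(
        std::stoul(tok.substr(1, tok.size() - 2), nullptr, 16));
}

// Hypothetical sketch: build a code -> Unicode table from CMap text.
std::map<std::uint32_t, std::uint32_t> ParseToUnicode(const std::string& cmap)
{
    std::map<std::uint32_t, std::uint32_t> table;
    std::istringstream in(cmap);
    std::string prev, tok;

    while (in >> tok) {
        if (tok == "beginbfchar") {
            // The token before the keyword is this section's entry count.
            const int count = std::stoi(prev);
            // The counter is declared here, so it restarts for each section.
            for (int i = 0; i < count; ++i) {
                std::string src, dst;
                if (!(in >> src >> dst)) break;
                table[ParseHexToken(src)] = ParseHexToken(dst);
            }
        } else if (tok == "beginbfrange") {
            const int count = std::stoi(prev);
            for (int i = 0; i < count; ++i) {
                std::string lo, hi, dst;
                if (!(in >> lo >> hi >> dst)) break;
                const std::uint32_t first = ParseHexToken(lo);
                const std::uint32_t last = ParseHexToken(hi);
                const std::uint32_t base = ParseHexToken(dst);
                // Consecutive codes map to consecutive Unicode values.
                for (std::uint32_t c = first; c <= last; ++c)
                    table[c] = base + (c - first);
            }
        }
        prev = tok;
    }
    return table;
}
```

Calling `ParseToUnicode` on the decoded ToUnicode stream would give a table in which to look up each character code taken from a Tj string operand.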
When a TrueType font has no Encoding entry but does have a ToUnicode map, the map is not read. The ToUnicode CMap parser (in PdfIdentityEncoding and PdfCMapEncoding) also has some bugs, such as a loop variable whose value is not reset between sections, so only part of the CMap is loaded. I have fixed these points and can now get all the text in the PDF file with PoDoFo.

I'm new to PoDoFo and don't know how to submit a patch for these corrections (if there is a way), in case it helps other people with the same problems. In the meantime, or if patches are not accepted or reviewed, anyone with the same issue can ask me for the patch.

Regards,
Hugues