Re: Fwd: Junk Characters while Extracting text from pdf file.

kulbhushan singh Wed, 06 Feb 2013 02:25:06 -0800

Hi Andreas,

I ran the Adobe test and it gives me the same junk characters as PDFBox. I
also tried "Save as Text..." but the result is the same. In the PDF
properties I found that the encoding is Identity-H. I googled this encoding
and found that many others have the same problem.


I am not even able to search for any text in my PDF. Are OCR and glyph-based
approaches my only options for extracting text from it, or is there any
other way to go about this?

Regards, Kulbhushan
