Hi Andreas, I did the Adobe test and it gives me the same junk characters as PDFBox. I also tried "Save as Text..." but the result is the same. In the PDF properties I found that the encoding is Identity-H. I googled this encoding and found that many others have the same problem.
In my PDF I cannot even search for any text. Are OCR and Glyph my only options to extract text from it, or is there any other way to go about this? Regards, Kulbhushan

