Hi Kulbhushan,

is it possible to extract the text using Adobe Reader or Adobe Acrobat without the junk characters? If not, PDFBox can't help either. If so, could you open an issue in Jira (https://issues.apache.org/jira/browse/PDFBOX) and attach a sample PDF that lets us reproduce the problem?

With kind regards,
Maruan Sahyoun

On 05.02.2013 at 14:54, kulbhushan singh <[email protected]> wrote:

> Hi,
>
> I am trying to extract text from a PDF file with custom fonts, but it is
> giving me junk characters. The fonts used are ArialMT (embedded subset) and
> Arial-BoldMT (embedded subset). The producer of the PDF file is GPL
> Ghostscript 8.15. I am using PDFTextStripper to extract the text. How can I
> do it for custom fonts? Any reference or solution would be appreciated.
>
> Regards,
> Kulbhushan

