The latest release of PDFBox changed the way it deals with fonts and introduced this bug. Please try the version in CVS and let me know if you are still having a problem.
Ben

On Thu, 25 Mar 2004, Ankur Goel wrote:
> Hi,
>
> I have to index PDF files. For that I am using pdfbox. But when I try to
> extract text from pdf file using pdfbox I get the following error:
>
> java.io.IOException: Error: No 'ToUnicode' and no 'Encoding' for Font
>     at org.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:347)
>     at org.pdfbox.util.PDFStreamEngine.showString(PDFStreamEngine.java:169)
>     at org.pdfbox.util.PDFTextStripper.showString(PDFTextStripper.java:461)
>     at org.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:692)
>     at org.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:128)
>     at org.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:268)
>     at org.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:200)
>     at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:172)
>     at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:120)
>     at org.pdfbox.ExtractText.main(ExtractText.java:213)
>     at test.LuceneExampleIndexer.indexFile(LuceneExampleIndexer.java:67)
>     at test.LuceneExampleIndexer.indexDirectory(LuceneExampleIndexer.java:47)
>     at test.LuceneExampleIndexer.index(LuceneExampleIndexer.java:30)
>     at test.LuceneExampleIndexer.main(LuceneExampleIndexer.java:118)
>
> Please tell me how to go about it.
>
> Thanks,
> Ankur