Hi,
"Czech, Christian" <[email protected]> wrote on 17 July 2012 at 14:42:

> Hello,
>
> I have a PDF document with Slovakian characters, e.g. "Nápověda pro klienta".
> How can I extract it correctly?
>
> My code:
>
> PDDocument document = null;
> document = PDDocument.load(pdfFile, true);
> PDFTextStripper stripper = null;
> stripper = new PDFTextStripper("ISO-8859-2");
> stripper.getText(document);
>
> I always get this result: "N\?pověda pro klienta"

There are two possibilities:

1. Try UTF-8 as the encoding, and make sure that the editor you use to open
   the result can handle that encoding.
2. There may be an issue in PDFBox. Which version are you using? Is it
   possible to get hold of that PDF?

> Thanks
> Christian

BR
Andreas Lehmkühler
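
P.S. To illustrate point 1: the sketch below uses only the JDK, not PDFBox, so it says nothing about what the text stripper itself does. It only shows how a correct string turns into "?" or garbage once it is written in one encoding and read in another, which is what a viewer with the wrong encoding setting would show. The class name EncodingRoundTrip and the helper roundTrip are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {

    // Encode the text with one charset and decode the bytes with another.
    // This mimics writing a file in one encoding and opening it in an
    // editor that assumes a different one.
    public static String roundTrip(String text, Charset writeAs, Charset readAs) {
        byte[] bytes = text.getBytes(writeAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // "Nápověda pro klienta", written with Unicode escapes so the
        // example does not depend on the encoding of this source file.
        String text = "N\u00e1pov\u011bda pro klienta";

        // Same encoding on both sides: the text survives unchanged.
        String utf8 = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("UTF-8 -> UTF-8 intact: " + utf8.equals(text));

        // A charset that cannot represent the accented letters:
        // String.getBytes substitutes '?' for each unmappable character.
        String ascii = roundTrip(text, StandardCharsets.US_ASCII, StandardCharsets.US_ASCII);
        System.out.println("US-ASCII result: " + ascii);

        // Written as UTF-8 but read as ISO-8859-1: every accented letter
        // becomes two wrong characters.
        String mixed = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("UTF-8 -> ISO-8859-1 intact: " + mixed.equals(text));
    }
}
```

If the extracted text only looks wrong in the editor, checking the result this way (comparing strings in code rather than by eye) may help to tell a display problem from an extraction problem.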

