Re: Slovakian characters

Andreas Lehmkühler Wed, 18 Jul 2012 03:53:23 -0700

Hi,

"Czech, Christian" <[email protected]> wrote on 17 July 2012 at 14:42:

> Hello,
>
> I have a PDF document with Slovakian characters e.g. "Nápověda pro klienta".
> How can I extract it correctly?
>
> My code:
>
>
> PDDocument document = null;
> document = PDDocument.load(pdfFile, true);
> PDFTextStripper stripper = null;
> stripper = new PDFTextStripper("ISO-8859-2");
> stripper.getText(document);
>
> I always get this result: "N\?pověda pro klienta"


There are two possibilities:

1. Try UTF-8 as the encoding, and make sure the editor you use to open the
result can handle that encoding.

2. There may be an issue in PDFBox itself. Which version are you using? Could
you make that PDF available?
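
To illustrate the first point, here is a minimal sketch that does not touch PDFBox at all. It only uses the standard Java charset API to show how a character that the target encoding cannot represent turns into "?" when text is converted to bytes. The class name EncodingRoundTrip and the method roundTrip are made up for this example, and whether this is what actually happens in your case is an assumption until we have seen the PDF.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of PDFBox.
class EncodingRoundTrip {

    // Encode the text into bytes with the given charset and decode it again.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // ('?' for the charsets used here) for characters it cannot represent.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "Nápověda pro klienta", written with Unicode escapes so the source
        // file's own encoding cannot interfere.
        String text = "N\u00e1pov\u011bda pro klienta";

        Charset[] charsets = {
            StandardCharsets.US_ASCII,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8
        };
        for (Charset cs : charsets) {
            boolean intact = roundTrip(text, cs).equals(text);
            System.out.println(cs.name() + ": " + (intact ? "intact" : "characters lost"));
        }
    }
}
```

Of the three charsets tried here, only UTF-8 returns the text unchanged. ISO-8859-1 has no "ě", so that character comes back as "?", and US-ASCII loses the "á" as well. The same kind of substitution can also happen later, when the extracted text is written to a file or displayed by an editor that assumes a different encoding.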


> Thanks
> Christian

Best regards,
Andreas Lehmkühler
