Hi,
On 16.01.2016 at 12:52, Diogo Ribeiro wrote:
Hi guys,
I'm using PDFBox 1.8.10 to extract some text from a PDF (see attachment).
Your attachment didn't make it through due to restrictions on the mailing list. Please
provide a link to a public download location, e.g. a file-sharing service such as Dropbox.
BR
Andreas
The output lines are not correctly sorted.
Got:
1/435 S LOPES CÂNDIDO FELIX LOPESABEL DIA 27-09-1964
FRANCISCA MARIA DIAS
Was expecting:
1/435 ABEL DIAS LOPES CÂNDIDO FELIX LOPES 27-09-1964
FRANCISCA MARIA DIAS
My simple code:
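import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.util.PDFTextStripper;  // package location in PDFBox 1.8.x

PDDocument pdf = PDDocument.load(new File(FILE_PATH));
PDFTextStripper stripper = new PDFTextStripper();
stripper.setStartPage(1);  // extract only the first page
stripper.setEndPage(1);
stripper.setSortByPosition(true);  // order text by page position, not stream order
String plainText = stripper.getText(pdf);
System.out.println(plainText);
pdf.close();  // release the document's resources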
Thanks in advance.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]