Hi,

On 26.09.2013 13:36, Tapani Vaulasto wrote:
Hi,
I use PDFBox 1.8.2 and this code to convert a PDF to a text file:

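// Load the PDF and write its extracted text to the output file.
PDDocument pd = PDDocument.load(input);
PDFTextStripper stripper = new PDFTextStripper();
BufferedWriter wr = new BufferedWriter(new OutputStreamWriter(new
  FileOutputStream(output)));
stripper.writeText(pd, wr);
wr.close();  // flush the buffered text to the file
pd.close();  // release the document's resources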

The PDF document has tables.
The problem is that sometimes a table row has one or more empty columns.
For example:
http://www.tulli.fi/fi/yksityisille/autoverotus/taulukot/autot/au/1308.pdf

On page 2 (of 44), some ALFA ROMEO rows have an empty column.

Question: How can I get every column, including the empty ones, marked on each line written to the BufferedWriter?
Sorry, this can't be done with PDFBox. You have to analyze the text on your own.
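
To give an idea of what analyzing the text on your own could look like, here is a minimal sketch, not a finished solution. It assumes you have already collected, for each line, the text fragments together with their x positions (in PDFBox 1.8 a subclass of PDFTextStripper can see these by overriding processTextPosition and reading TextPosition.getX()), and that you know roughly where each column starts. The class ColumnAligner, the type Fragment, the method toDelimitedLine and all coordinates below are made up for illustration; they are not part of PDFBox, and the sample values are not taken from your PDF.

```java
import java.util.ArrayList;
import java.util.List;

public class ColumnAligner {

    /** A piece of text plus the x coordinate where it starts (hypothetical type). */
    public static final class Fragment {
        final float x;
        final String text;

        public Fragment(float x, String text) {
            this.x = x;
            this.text = text;
        }
    }

    /**
     * Puts each fragment of one line into the last column whose start is at or
     * before the fragment's x position. Columns that receive no fragment stay
     * empty, so every line yields the same number of fields.
     * columnStarts must be sorted in ascending order.
     */
    public static String toDelimitedLine(List<Fragment> line, float[] columnStarts,
                                         String delimiter) {
        String[] cells = new String[columnStarts.length];
        for (int i = 0; i < cells.length; i++) {
            cells[i] = "";
        }
        for (Fragment f : line) {
            int col = 0; // fragments left of the first start fall into column 0
            for (int i = 0; i < columnStarts.length; i++) {
                if (f.x >= columnStarts[i]) {
                    col = i;
                }
            }
            cells[col] = cells[col].isEmpty() ? f.text : cells[col] + " " + f.text;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.length; i++) {
            if (i > 0) {
                sb.append(delimiter);
            }
            sb.append(cells[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed column starts; in practice they have to be measured from the PDF.
        float[] columnStarts = {50f, 150f, 250f, 350f};
        List<Fragment> line = new ArrayList<Fragment>();
        line.add(new Fragment(52f, "ALFA ROMEO"));
        line.add(new Fragment(151f, "147"));
        line.add(new Fragment(352f, "2005")); // nothing falls into the third column
        System.out.println(toDelimitedLine(line, columnStarts, ";"));
        // prints "ALFA ROMEO;147;;2005"
    }
}
```

The empty third field shows up as two delimiters in a row. Fixed column starts only work when the table layout is the same on every page; otherwise the boundaries have to be derived per page, for example from the header row.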

Regards
Tapani Vaulasto

BR
Andreas Lehmkühler
