Sorry, I understand PDFBox probably won't be able to do this... but perhaps
it can? :)

We use software from BCL called Jade that lets you select a
'zone' on a PDF page and extract it to text in such a way that the spacing
and line breaks are preserved. It did (and does!) a better job of this
than any other tool we have ever tried. But they no longer make or support
it! Just wondering if any of you PDF mavens have found a tool or method for
doing this that works really well? It seems impossible to do
programmatically unless you know the parameters of the text -- one needs to
select it manually. For example, we use this a lot for odd tables.
