On 31.03.2016 at 12:51, Felix Hermann wrote:
Hello,
How can I extract the text and coordinates from a PDF document? To be more precise: I would like to extract all words of the document, and for each word I need its coordinates. If PDFBox does not support this, how can I get the coordinates of each character? I tried to adapt the code of this example: https://gist.github.com/DavidYKay/82f20ba67c50c499ebb3

Yes, the PrintTextLocations (or DrawPrintTextLocations) example is a good start. It gives you the position of each character; look for the blanks and build words from the characters between them.

Tilman

However, I was not successful, as I use the new PDFBox version (2.0.0).
Regards Felix
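
Tilman's suggestion above (look for the blanks and build words from there) could be sketched roughly as follows. This is only a minimal illustration, not code from PDFBox or from the thread: `WordGrouper`, `Glyph` and `Word` are hypothetical names, and `Glyph` is a plain stand-in for PDFBox's `TextPosition` so that the sketch runs without the library. In PDFBox 2.0 the per-character data would typically be collected by overriding `PDFTextStripper.writeString(String, List<TextPosition>)`, as the PrintTextLocations example does. One possible mapping, which is an assumption here, is x = `getXDirAdj()`, y = `getYDirAdj() - getHeightDir()`, width = `getWidthDirAdj()`, height = `getHeightDir()`, and text = `getUnicode()`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups positioned characters into words by splitting at blanks. */
final class WordGrouper {

    /** One character with its box (top-left origin assumed); stand-in for a TextPosition. */
    static final class Glyph {
        public final String text;
        public final float x, y, width, height;

        Glyph(String text, float x, float y, float width, float height) {
            this.text = text;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** A word together with the bounding box that encloses all of its characters. */
    static final class Word {
        public final String text;
        public final float x, y, width, height;

        Word(String text, float x, float y, float width, float height) {
            this.text = text;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** Walks the glyphs in reading order and closes a word at every blank. */
    static List<Word> groupIntoWords(List<Glyph> glyphs) {
        List<Word> words = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        float minX = 0, minY = 0, maxX = 0, maxY = 0;

        for (Glyph g : glyphs) {
            // trim() strips characters up to U+0020, so spaces, tabs and newlines count as blanks.
            if (g.text.trim().isEmpty()) {
                if (text.length() > 0) {
                    words.add(new Word(text.toString(), minX, minY, maxX - minX, maxY - minY));
                    text.setLength(0);
                }
                continue;
            }
            if (text.length() == 0) {
                // First character of a new word: start a fresh bounding box.
                minX = g.x;
                minY = g.y;
                maxX = g.x + g.width;
                maxY = g.y + g.height;
            } else {
                // Later characters: grow the bounding box to include them.
                minX = Math.min(minX, g.x);
                minY = Math.min(minY, g.y);
                maxX = Math.max(maxX, g.x + g.width);
                maxY = Math.max(maxY, g.y + g.height);
            }
            text.append(g.text);
        }
        // Flush the last word if the input does not end with a blank.
        if (text.length() > 0) {
            words.add(new Word(text.toString(), minX, minY, maxX - minX, maxY - minY));
        }
        return words;
    }
}
```

Note that this only splits at blanks that are actually present in the character list. Depending on the PDF, spaces may not be stored as characters at all, in which case a gap between neighbouring x coordinates would have to be used instead; that case is not handled here.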

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


