On 04.04.2016 at 17:36, Felix Hermann wrote:
Thanks for the answer.
The problem is: how can I get the extracted characters and their coordinates with PDFBox
version 2.0.0?
The example refers to an older version of PDFBox.
The PrintTextLocations (or DrawPrintTextLocations) examples are also available in the 2.0
version, although slightly changed.
Tilman
Sent: Thursday, 31 March 2016 at 19:58
From: "Tilman Hausherr" <[email protected]>
To: [email protected]
Subject: Re: Extract Text of Document with coordinates
On 31.03.2016 at 12:51, Felix Hermann wrote:
Hello,
how can I extract the text + coordinates of a PDF document?
To be more precise: I would like to extract all words of the document. And for
each word I need the coordinates of this word.
If PDFBox does not support this: How can I get the coordinates of each
character?
I tried to adapt the code of this example:
https://gist.github.com/DavidYKay/82f20ba67c50c499ebb3
Yes, the PrintTextLocations (or DrawPrintTextLocations) example is a
good start. Look for the blanks and build words from there.
Tilman
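The "look for the blanks and build words" step can be sketched in plain Java as below. This is only an illustration under stated assumptions, not code from the PDFBox examples: `WordBuilder`, `Glyph` and `Word` are hypothetical names, and `Glyph` stands in for whatever per-character data the text stripper delivers (in PDFBox 2.0 that would presumably be filled from `TextPosition`, e.g. `getUnicode()`, `getXDirAdj()`, `getYDirAdj()`, `getWidthDirAdj()`, `getHeightDir()`). It assumes the characters arrive in reading order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WordBuilder {

    /** Hypothetical stand-in for one extracted character and its bounding box. */
    static final class Glyph {
        final String text;
        final float x, y, width, height;

        Glyph(String text, float x, float y, float width, float height) {
            this.text = text;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** A word plus the bounding box that covers all of its characters. */
    static final class Word {
        final String text;
        final float x, y, width, height;

        Word(String text, float x, float y, float width, float height) {
            this.text = text;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** Splits characters (assumed to be in reading order) into words at blanks. */
    static List<Word> buildWords(List<Glyph> glyphs) {
        List<Word> words = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        float minX = 0, minY = 0, maxX = 0, maxY = 0;

        for (Glyph g : glyphs) {
            if (g.text.trim().isEmpty()) {
                // A blank ends the word collected so far, if there is one.
                if (text.length() > 0) {
                    words.add(new Word(text.toString(), minX, minY, maxX - minX, maxY - minY));
                    text.setLength(0);
                }
                continue;
            }
            if (text.length() == 0) {
                // First character of a new word: start a fresh bounding box.
                minX = g.x;
                minY = g.y;
                maxX = g.x + g.width;
                maxY = g.y + g.height;
            } else {
                // Grow the bounding box to include this character.
                minX = Math.min(minX, g.x);
                minY = Math.min(minY, g.y);
                maxX = Math.max(maxX, g.x + g.width);
                maxY = Math.max(maxY, g.y + g.height);
            }
            text.append(g.text);
        }
        // Flush the last word if the input did not end with a blank.
        if (text.length() > 0) {
            words.add(new Word(text.toString(), minX, minY, maxX - minX, maxY - minY));
        }
        return words;
    }

    public static void main(String[] args) {
        // Made-up coordinates, for illustration only.
        List<Glyph> glyphs = Arrays.asList(
                new Glyph("H", 10f, 20f, 6f, 10f),
                new Glyph("i", 16f, 20f, 3f, 10f),
                new Glyph(" ", 19f, 20f, 3f, 10f),
                new Glyph("y", 22f, 20f, 5f, 10f),
                new Glyph("o", 27f, 20f, 5f, 10f),
                new Glyph("u", 32f, 20f, 5f, 10f));
        for (Word w : buildWords(glyphs)) {
            System.out.println(w.text + " x=" + w.x + " y=" + w.y
                    + " width=" + w.width + " height=" + w.height);
        }
    }
}
```

The sketch only splits at whitespace characters. Words that are separated by a line break or by a gap without an explicit blank would need an extra check, for example on a change in y or on the horizontal distance between neighbouring characters.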
However, I was not successful, as I use the new PDFBox version (2.0.0).
Regards
Felix
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]