Mario,
it would be great if you could provide us with a sample PDF on JIRA [1]. Are
you using the ExtractText tool, or did you write a program of your own
using PDFBox?
Greetings,
Erik
[1] https://issues.apache.org/jira/browse/PDFBOX
--
My blog: http://blog.elitecoderz.net
Mario Sangiorgio wrote:
Hi,
I am writing this e-mail because I am having issues parsing PDF documents
with PDFBox.
For example, I am trying to parse the PDF of a paper, but its title comes
out garbled, as in the following example.
An
Asp
e
ct-Orien
ted
F
ramew
o
rk
for
S
ervice
A
d
aptation
As you can see, I get newlines rather than spaces and, even worse, there are
other newlines in the middle of the words.
If it helps, feel free to ask me for any clarification or to run any test.
Mario