Tilman,

I haven't actually tested it, but I might try that version.


Best regards,
Hesham


------------------------------------------------------------------------
Included message:

Hi,

Does this also happen with the current version (1.8.4)?

Tilman

On 25.03.2014 13:53, Hesham G. wrote:
Hello,

While reading a PDF with PDFBox 1.7.1, many spaces are ignored, so words are merged together in the extracted text. You can test with a 1-page sample PDF from here:
http://www.4shared.com/office/yqJGUZn2ce/wrong_space_parsed_sample.html

You can see incorrectly merged words such as:
aboutmidnight, andbefore, CountyDonegal, ...

I have tried using PDFTextStripper.setAverageCharTolerance(...) to control the space sensitivity, but it made no difference.
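
As a rough illustration of the kind of heuristic such a tolerance usually controls, here is a self-contained sketch. It is not PDFBox's actual code: the GapSpaceSketch class, the Glyph type, the join method and the sample coordinates are all hypothetical. It assumes a space is inserted whenever the horizontal gap between two glyphs exceeds the average glyph width multiplied by the tolerance.

```java
import java.util.ArrayList;
import java.util.List;

public class GapSpaceSketch {
    /** One positioned glyph: its text, left edge and width (hypothetical units). */
    static final class Glyph {
        final String text;
        final float x;
        final float width;

        Glyph(String text, float x, float width) {
            this.text = text;
            this.x = x;
            this.width = width;
        }
    }

    /**
     * Joins glyphs from left to right and inserts a space whenever the gap
     * to the previous glyph is larger than (average glyph width * tolerance).
     * This is an assumed, simplified rule, not the library's implementation.
     */
    static String join(List<Glyph> glyphs, float tolerance) {
        if (glyphs.isEmpty()) {
            return "";
        }
        float totalWidth = 0f;
        for (Glyph g : glyphs) {
            totalWidth += g.width;
        }
        float threshold = (totalWidth / glyphs.size()) * tolerance;

        StringBuilder out = new StringBuilder();
        float previousEnd = Float.NaN; // no previous glyph yet
        for (Glyph g : glyphs) {
            if (!Float.isNaN(previousEnd) && g.x - previousEnd > threshold) {
                out.append(' ');
            }
            out.append(g.text);
            previousEnd = g.x + g.width;
        }
        return out.toString();
    }

    /** Made-up sample: "ab", a narrow gap of 1 unit, then "cd"; each glyph is 5 wide. */
    static List<Glyph> sample() {
        List<Glyph> glyphs = new ArrayList<>();
        glyphs.add(new Glyph("a", 0f, 5f));
        glyphs.add(new Glyph("b", 5f, 5f));
        glyphs.add(new Glyph("c", 11f, 5f));
        glyphs.add(new Glyph("d", 16f, 5f));
        return glyphs;
    }

    public static void main(String[] args) {
        System.out.println(join(sample(), 0.3f)); // narrow gap is below the threshold
        System.out.println(join(sample(), 0.1f)); // narrow gap is above the threshold
    }
}
```

In this sketch a smaller tolerance makes more gaps count as spaces, which is the effect one would hope for on a PDF with tightly set word gaps. PDFTextStripper also has setSpacingTolerance(...), which may be worth trying alongside setAverageCharTolerance(...).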

Any idea why this happens and how to fix it?


Best regards,
Hesham
