Re: What's wrong with this font ?
Hi,

Sébastien Dailly sebastien.dai...@elettermail.eu wrote on 20 March 2013 at 11:45:

> Hello,
>
> I've got a problem while reading the attached document. (It has been deflated and anonymised, the text has been removed, and the characters shuffled.)
>
> Text extraction works fine with some PDF readers (I tried Acrobat and Evince), but the text read by pdfbox is not the expected one, as if pdfbox were using a wrong font description to read the text. Instead of
>
> 60CO L4PU7L 03D4 DR DVGWEWNER5L STLERC MLIPHOAP6 AE0TE
>
> I get
>
> UvIKGMuK6RuN0TN 0 E4RREDRRRElPéNéOND5vRRrTvNDp 60pMRRRv4KS7v
>
> I'm using pdfbox 1.6.0 for that.

Please update to a more recent version such as 1.7.1, or wait a few more days, as the release process for the all-new 1.8.0 version started yesterday.

> Is the document invalid? What can I do to read the document correctly?

If the issue persists after upgrading to a more recent version, create an issue on JIRA [1] and attach the PDF in question to it.

P.S.: Make sure that you are correctly subscribed to the mailing list [2], otherwise you won't get any answers.

> Thanks!
> --
> Sébastien Dailly
> +33 1 56 29 78 67
> ELETTERMAIL

BR
Andreas Lehkühler

[1] https://issues.apache.org/jira/browse/PDFBOX
[2] http://pdfbox.apache.org/mail-lists.html
Re: What's wrong with this font ?
Hi,

using the latest version of pdfbox (1.7.1), this is what I got:

MLIPHOAP6 AE0TE 03D4 DR DVGWEWNER5L STLERC 60CO L4PU7L

Please give it a try.

Maruan Sahyoun

On 20.03.2013 at 11:45, Sébastien Dailly sebastien.dai...@elettermail.eu wrote:

> Hello,
>
> I've got a problem while reading the attached document. (It has been deflated and anonymised, the text has been removed, and the characters shuffled.)
>
> Text extraction works fine with some PDF readers (I tried Acrobat and Evince), but the text read by pdfbox is not the expected one, as if pdfbox were using a wrong font description to read the text. Instead of
>
> 60CO L4PU7L 03D4 DR DVGWEWNER5L STLERC MLIPHOAP6 AE0TE
>
> I get
>
> UvIKGMuK6RuN0TN 0 E4RREDRRRElPéNéOND5vRRrTvNDp 60pMRRRv4KS7v
>
> I'm using pdfbox 1.6.0 for that.
>
> Is the document invalid? What can I do to read the document correctly?
>
> Thanks!
> --
> Sébastien Dailly
> +33 1 56 29 78 67
> ELETTERMAIL
>
> document.pdf
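The 1.7.1 output above contains the same words as the expected text, only in a different order. When comparing extractions between versions, a quick order-insensitive check can help tell "wrong glyph mapping" apart from "different text ordering". The following is only a minimal sketch using the Java standard library; the class and method names (`ExtractionCheck`, `tokenCounts`, `sameTokens`) are made up for illustration and are not part of pdfbox.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of pdfbox: compares two extracted texts
// by their whitespace-separated tokens, ignoring the order of the tokens.
final class ExtractionCheck {

    // Counts how often each whitespace-separated token occurs in the text.
    static Map<String, Integer> tokenCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // True if both texts contain exactly the same tokens with the same counts.
    static boolean sameTokens(String expected, String actual) {
        return tokenCounts(expected).equals(tokenCounts(actual));
    }

    public static void main(String[] args) {
        // Strings copied from this thread; \u00e9 is the letter e with acute accent.
        String expected = "60CO L4PU7L 03D4 DR DVGWEWNER5L STLERC MLIPHOAP6 AE0TE";
        String with171 = "MLIPHOAP6 AE0TE 03D4 DR DVGWEWNER5L STLERC 60CO L4PU7L";
        String with160 = "UvIKGMuK6RuN0TN 0 E4RREDRRRElP\u00e9N\u00e9OND5vRRrTvNDp 60pMRRRv4KS7v";

        System.out.println("1.7.1 matches expected tokens: " + sameTokens(expected, with171));
        System.out.println("1.6.0 matches expected tokens: " + sameTokens(expected, with160));
    }
}
```

With the strings from this thread, the check reports a match for the 1.7.1 output and a mismatch for the 1.6.0 output, which is consistent with the problem being a decoding issue in the older version rather than in the document itself.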
Re: What's wrong with this font ?
On 20/03/2013 11:57, Maruan Sahyoun wrote:

> Hi,
>
> using the latest version of pdfbox (1.7.1), this is what I got:
>
> MLIPHOAP6 AE0TE 03D4 DR DVGWEWNER5L STLERC 60CO L4PU7L
>
> Please give it a try.

Thanks for answering so quickly. Sorry for the noise, I should have started with the latest pdfbox version. I'll upgrade and run some tests with the new library.

--
Sébastien Dailly
+33 1 56 29 78 67
ELETTERMAIL