Re: What's wrong with this font ?

2013-03-20 Thread Andreas Lehmkühler
Hi,


Sébastien Dailly sebastien.dai...@elettermail.eu wrote on 20 March 2013 at
11:45:
 Hello,

 I've got a problem while reading the attached document. (It has been
 deflated and anonymised, the text has been removed, and the characters
 shuffled.)

 Text extraction works fine with some PDF readers (I tried Acrobat and
 Evince), but the text read by pdfbox is not the expected one, as if
 pdfbox were using the wrong font description to read the text.
 Instead of


  60CO L4PU7L
   03D4 DR DVGWEWNER5L STLERC
  MLIPHOAP6 AE0TE

 I've got

  UvIKGMuK6RuN0TN
  0 E4RREDRRRElPéNéOND5vRRrTvNDp
  60pMRRRv4KS7v


 I'm using pdfbox 1.6.0 for that.
Please update to a more recent version such as 1.7.1, or wait a few more days,
as the release process for the all-new 1.8.0 version started just yesterday.

 Is the document invalid? What can I do to read the document correctly?
If the issue still persists after upgrading to a more recent version, please
create an issue on JIRA [1] and attach the PDF in question to it.

P.S.: Make sure that you are correctly subscribed to the mailing list [2];
otherwise you won't get any answers.

 Thanks !

 --
 Sébastien Dailly
 +33 1 56 29 78 67
 ELETTERMAIL

BR
Andreas Lehmkühler
[1] https://issues.apache.org/jira/browse/PDFBOX
[2] http://pdfbox.apache.org/mail-lists.html


Re: What's wrong with this font ?

2013-03-20 Thread Maruan Sahyoun
Hi,

Using the latest version of pdfbox (1.7.1), this is what I got:

MLIPHOAP6 AE0TE
03D4  DR   DVGWEWNER5L  STLERC
60CO   L4PU7L

Please give it a try.

Maruan Sahyoun





Re: What's wrong with this font ?

2013-03-20 Thread Sébastien Dailly

On 20/03/2013 at 11:57, Maruan Sahyoun wrote:

Hi,

Using the latest version of pdfbox (1.7.1), this is what I got:

MLIPHOAP6 AE0TE
03D4  DR   DVGWEWNER5L  STLERC
60CO   L4PU7L

Please give it a try.



Thanks for answering so quickly.

Sorry for the noise; I should have started with the latest pdfbox version.
I'll upgrade and run some tests with the new library.
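
A minimal sketch of such a test, assuming the extracted text is already
available as a string. The 1.7.1 output quoted in this thread lists the
same lines as the Acrobat output, but in a different order and with
different spacing, so the comparison ignores both. ExtractionCheck,
normalize and sameContent are hypothetical names for illustration only;
they are not part of pdfbox.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of pdfbox: compares two text extractions
// while ignoring line order, blank lines, and runs of whitespace.
class ExtractionCheck {

    // Trim each line, collapse whitespace runs to a single space,
    // drop empty lines, and sort so that line order does not matter.
    static List<String> normalize(String text) {
        List<String> lines = new ArrayList<String>();
        for (String line : text.split("\\r?\\n")) {
            String cleaned = line.trim().replaceAll("\\s+", " ");
            if (!cleaned.isEmpty()) {
                lines.add(cleaned);
            }
        }
        Collections.sort(lines);
        return lines;
    }

    // True when both extractions contain the same lines after normalization.
    static boolean sameContent(String expected, String actual) {
        return normalize(expected).equals(normalize(actual));
    }

    public static void main(String[] args) {
        // Sample strings copied from this thread.
        String expected = "60CO L4PU7L\n 03D4 DR DVGWEWNER5L STLERC\nMLIPHOAP6 AE0TE\n";
        String from171 = "MLIPHOAP6 AE0TE\n03D4  DR   DVGWEWNER5L  STLERC\n60CO   L4PU7L\n";
        String from160 = "UvIKGMuK6RuN0TN\n0 E4RREDRRRElP\u00e9N\u00e9OND5vRRrTvNDp\n60pMRRRv4KS7v\n";

        System.out.println("1.7.1 matches: " + sameContent(expected, from171)); // true
        System.out.println("1.6.0 matches: " + sameContent(expected, from160)); // false
    }
}
```

Sorting the lines is a deliberate simplification: it accepts any line
order, which is enough to tell a correct extraction from a garbled one
but would hide a genuine reading-order problem.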



--
Sébastien Dailly
+33 1 56 29 78 67
ELETTERMAIL