So some application printed to a slightly older version of the PostScript
driver on Windows, and since PostScript is for printing, there is no
requirement that the font encoding information be maintained. Then an old
version of Ghostscript was used to convert that PS into PDF. Definitely NOT a
good combination.
Is there any way that you can control this production process?
Leonard
From: Pakhu [mailto:[email protected]]
Sent: Friday, March 11, 2011 8:29 AM
To: [email protected]
Subject: Re: [iText-questions] Unreadable Pdf with PdfTextExtractor
Yes, sure:
<</Producer(GNU Ghostscript 7.05)
/Title(Provider Communication)
/Creator(PScript5.dll Version 5.2.2)
________________________________
From: Leonard Rosenthol-3 [via iText - General] [mailto:[hidden email]]
Sent: Friday, March 11, 2011 14:23
To: Pakhu
Subject: Re: Unreadable Pdf with PdfTextExtractor
Can you tell us what software was used to produce these PDFs??
-----Original Message-----
From: Pakhu [mailto:[hidden email]]
Sent: Friday, March 11, 2011 6:43 AM
To: [hidden email]
Subject: Re: [iText-questions] Unreadable Pdf with PdfTextExtractor
Thanks to both of you.
You are right: when copying to text there is nothing but random characters
because the font (namely the Differences array) is wrong.
But I have discovered why it is wrong: the character g3 in the vector, for
instance, means the ASCII code 29+3=32, which is a space. All characters
follow the same pattern gnn (the letter g followed by an integer). The ASCII
code is always 29+nn.
Therefore I made a little program that edits the PDF, gets the Differences
array, computes the right character and then rebuilds the array. Now I
can read the PDF once it has been rebuilt in this fashion.
I know I should not spend so much time correcting somebody else's mistakes,
but I receive plenty of PDFs like this...
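A minimal sketch of the decoding rule described above, in Python. This is not the poster's actual program: the function names and the list representation of the Differences array (integers for starting codes, strings for glyph names) are assumptions made for illustration, and it only recovers characters rather than rewriting the PDF.

```python
import re

# Glyph names in these files look like "g3" or "/g3": the letter g plus an integer.
GLYPH_PATTERN = re.compile(r"^g(\d+)$")
OFFSET = 29  # from the message above: ASCII code = 29 + nn


def glyph_to_char(name):
    """Return the character for a 'gNN' glyph name, or None if it does not match."""
    match = GLYPH_PATTERN.match(name.lstrip("/"))
    if match is None:
        return None
    return chr(OFFSET + int(match.group(1)))


def decode_differences(differences):
    """Map character codes to recovered characters.

    `differences` mimics a PDF Differences array: an integer sets the next
    character code, and each following name is assigned consecutive codes.
    """
    mapping = {}
    code = None
    for item in differences:
        if isinstance(item, int):
            code = item
            continue
        if code is None:
            raise ValueError("glyph name appears before any starting code")
        char = glyph_to_char(item)
        if char is not None:
            mapping[code] = char
        code += 1
    return mapping
```

For example, `decode_differences([1, "/g3", "/g36"])` maps code 1 to a space (29+3=32) and code 2 to "A" (29+36=65). Rebuilding the array in the PDF would additionally require translating each recovered character into a standard glyph name such as `space`, which is left out here.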
--
View this message in context:
http://itext-general.2136553.n4.nabble.com/Unreadable-Pdf-with-PdfTextExtractor-tp3345219p3347943.html
Sent from the iText - General mailing list archive at Nabble.com.
------------------------------------------------------------------------------
Colocation vs. Managed Hosting
A question and answer guide to determining the best fit
for your organization - today and in the future.
http://p.sf.net/sfu/internap-sfd2d
_______________________________________________
iText-questions mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/itext-questions
iText(R) is a registered trademark of 1T3XT BVBA.
Many questions posted to this list can (and will) be answered with a reference
to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples:
http://itextpdf.com/themes/keywords.php