You can't extract text with iText.

> -----Original Message-----
> From: Amir Kamel [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, February 02, 2005 10:09 AM
> To: Paulo Soares
> Subject: Re: [iText-questions] Read & parse special encodings
>
> To simplify the 1st step: from a PDF document, I want to
> be able to extract all the content and create a text
> file from it.
>
> On Wed, 2 Feb 2005 10:10:09 -0000
> "Paulo Soares" <[EMAIL PROTECTED]> wrote:
> >(Keep on the mailing list)
> >
> >What's your objective? "Read and parse" are too broad
> >descriptions.
> >
> >> -----Original Message-----
> >> From: Amir Kamel [mailto:[EMAIL PROTECTED]
> >> Sent: Wednesday, February 02, 2005 9:56 AM
> >> To: Paulo Soares
> >> Subject: Re: [iText-questions] Read & parse special encodings
> >>
> >> Thanks Paulo,
> >>
> >> Then what can I do to properly read and parse PDF files
> >> that have special fonts or encodings?
> >>
> >> Can you help?
> >>
> >> Many thanks
> >>
> >> Amir
> >>
> >> On Wed, 2 Feb 2005 09:58:41 -0000
> >> "Paulo Soares" <[EMAIL PROTECTED]> wrote:
> >> >BaseFont.getDocumentFonts(reader) only looks at single-byte
> >> >Type1 and TrueType fonts because those are the fonts that
> >> >it can reuse.
> >> >
> >> >> -----Original Message-----
> >> >> From: [EMAIL PROTECTED]
> >> >> [mailto:[EMAIL PROTECTED] On Behalf Of Amir Kamel
> >> >> Sent: Wednesday, February 02, 2005 9:33 AM
> >> >> To: [email protected]
> >> >> Subject: [iText-questions] Read & parse special encodings
> >> >>
> >> >> On Wed, 02 Feb 2005 10:29:00 +0100
> >> >> "Amir Kamel" <[EMAIL PROTECTED]> wrote:
> >> >> >Hello,
> >> >> >
> >> >> >I am very new to iText, Adobe and all this stuff, so
> >> >> >maybe I will ask a stupid question here:
> >> >> >
> >> >> >I cannot read/parse some of my PDF documents properly
> >> >> >with iText. From what I see, these documents have
> >> >> >special fonts/encodings, and my iText reader does not
> >> >> >recognize them.
> >> >> >When I try to get all the fonts that are in such a PDF
> >> >> >file using BaseFont.getDocumentFonts(reader), I only see
> >> >> >the fonts that are recognized. But when I open the PDF
> >> >> >in Acrobat Reader, I see some other "special fonts"
> >> >> >with weird names, and often the encoding is
> >> >> >"Identity-H".
> >> >> >
> >> >> >What should I do to read and parse these files properly?
> >> >> >
> >> >> >Any help is much appreciated.
> >> >> >
> >> >> >Regards,
> >> >> >
> >> >> >Amir
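
For context on why fonts shown with the "Identity-H" encoding are awkward to turn into text: with Identity-H, each 2-byte code in a content-stream string is used directly as a CID (often a glyph index in the embedded font), not as a character in any text encoding. Getting characters back needs a separate code-to-Unicode table, such as the font's /ToUnicode CMap when the PDF includes one. The sketch below is plain Java and does not use iText; the class name, the method names, and the two-entry table are hypothetical and only illustrate the lookup step.

```java
import java.util.HashMap;
import java.util.Map;

public class IdentityHDemo {

    /** Splits a hex string (as found in a content stream) into 2-byte codes. */
    public static int[] toCodes(String hex) {
        int[] codes = new int[hex.length() / 4];
        for (int i = 0; i < codes.length; i++) {
            codes[i] = Integer.parseInt(hex.substring(i * 4, i * 4 + 4), 16);
        }
        return codes;
    }

    /** Maps each code through a ToUnicode-style table; unmapped codes become U+FFFD. */
    public static String decode(int[] codes, Map<Integer, String> toUnicode) {
        StringBuilder sb = new StringBuilder();
        for (int code : codes) {
            sb.append(toUnicode.getOrDefault(code, "\uFFFD"));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical table; in a real PDF it would be parsed from /ToUnicode.
        Map<Integer, String> toUnicode = new HashMap<>();
        toUnicode.put(0x002B, "H");
        toUnicode.put(0x004C, "i");

        int[] codes = toCodes("002B004C");
        System.out.println(decode(codes, toUnicode));       // table available
        System.out.println(decode(codes, new HashMap<>())); // no table: nothing readable
    }
}
```

Without such a table, the codes alone carry no readable text, which is one reason a font listing can show these fonts while a plain text dump of the page cannot be produced from the bytes.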
_______________________________________________
iText-questions mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/itext-questions
