Many thanks, Brian. I realize how ugly this is. I've been experimenting with 
different generators (PDF producer applications) and have seen how vastly 
different the generated text streams can be. 

I'll check out the leads you've provided.

Thanks,

Raimi



----- Original Message -----
From: Brian McKeever <[EMAIL PROTECTED]>
Date: Thursday, April 19, 2007 5:56 pm
Subject: Re: [iText-questions] Reading a PDF file

> > Is there a way in iText to avoid having to manually parse the
> > raw bytes returned from PdfReader.getPageContent() in order to get
> > the text on a page?
> 
> There's the PRTokeniser class, but you should be aware of the
> potential difficulties involved in trying to extract text from a PDF
> (unexpected text order, font encodings, etc).  Check out section 18.2
> of the book (which is highly recommended if you're going to do much
> work with iText).
> 
> Good luck,
> Brian
> 
> -------------------------------------------------------------------------
> This SF.net email is sponsored by DB2 Express
> Download DB2 Express C - the FREE version of DB2 express and take
> control of your XML. No limits. Just data. Click to get it now.
> http://sourceforge.net/powerbar/db2/
> _______________________________________________
> iText-questions mailing list
> iText-questions@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/itext-questions
> Buy the iText book: http://itext.ugent.be/itext-in-action/
> 
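To make the difficulties Brian mentions a bit more concrete, here is a rough
sketch of what hand-parsing a page content stream involves. It does not use
PRTokeniser or any other iText class; ContentStreamTextSketch and its methods
are hypothetical names made up for this illustration. It assumes the content
stream has already been decoded into a String, and it only collects the
literal-string operands of the text-showing operators (Tj, TJ, ' and "). Hex
strings are skipped, octal and \n style escapes are not decoded, and font
encodings and text positioning are ignored, so the result is at best an
approximation of the text on the page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of iText.
class ContentStreamTextSketch {

    // Collects the literal-string operands of the text-showing operators
    // Tj, TJ, ' and ", in the order they occur in the content stream.
    static List<String> extractShownStrings(String content) {
        List<String> shown = new ArrayList<>();
        List<String> pending = new ArrayList<>(); // strings seen since the last operator
        int n = content.length();
        int i = 0;
        while (i < n) {
            char c = content.charAt(i);
            if (c == '(') {
                i = readLiteralString(content, i + 1, pending);
            } else if (c == '%') {
                // Comment: skip to the end of the line.
                while (i < n && content.charAt(i) != '\n' && content.charAt(i) != '\r') i++;
            } else if (c == '/') {
                // Name object such as /F1: skip it so it is not mistaken for an operator.
                i++;
                while (i < n && !isDelimiter(content.charAt(i))) i++;
            } else if (c == '<') {
                // Hex string: skipped, not decoded (a limitation of this sketch).
                int end = content.indexOf('>', i);
                i = (end < 0) ? n : end + 1;
            } else if (isOperatorChar(c)) {
                int start = i;
                while (i < n && isOperatorChar(content.charAt(i))) i++;
                String op = content.substring(start, i);
                if (op.equals("Tj") || op.equals("TJ") || op.equals("'") || op.equals("\"")) {
                    shown.addAll(pending);
                }
                pending.clear(); // operands belong to the operator that follows them
            } else {
                i++; // numbers, brackets, whitespace
            }
        }
        return shown;
    }

    // Reads a literal string starting just after its opening parenthesis and
    // returns the index just past the closing one. Nested parentheses are kept;
    // a backslash keeps the next character as-is (escapes are not decoded).
    static int readLiteralString(String s, int i, List<String> out) {
        StringBuilder sb = new StringBuilder();
        int depth = 1;
        while (i < s.length()) {
            char d = s.charAt(i);
            if (d == '\\' && i + 1 < s.length()) {
                sb.append(s.charAt(i + 1));
                i += 2;
                continue;
            }
            if (d == '(') depth++;
            if (d == ')' && --depth == 0) { i++; break; }
            sb.append(d);
            i++;
        }
        out.add(sb.toString());
        return i;
    }

    static boolean isDelimiter(char c) {
        return Character.isWhitespace(c) || "()[]<>/%".indexOf(c) >= 0;
    }

    static boolean isOperatorChar(char c) {
        return Character.isLetter(c) || c == '*' || c == '\'' || c == '"';
    }

    public static void main(String[] args) {
        String stream = "BT /F1 12 Tf 72 700 Td (Hello) Tj [(Wor) -20 (ld)] TJ ET";
        System.out.println(extractShownStrings(stream));
    }
}
```

Note that the TJ array in the sample comes back as two separate fragments
("Wor" and "ld"). Whether to join fragments or put a space between them
depends on the kerning values and the font, which is one reason the streams
written by different PDF producers are so hard to handle uniformly.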