amcereijo cereijo wrote:
> Hello, I am trying to find a text within the contents of a PDF file. I 
> tried "getContent"; it works, but it is not what I need. How can I read 
> the contents of the PDF so that I can search for a text?

In "iText in Action", there's a section named "Why iText doesn’t do text 
extraction" (18.2.2). When you do getContent, you get a PDF content 
stream. It is possible to examine that stream, but you really need to be 
a PDF specialist to know how to interpret the PDF syntax in that stream. 
There are classes in the package com.lowagie.text.pdf.parser, but you won't find 
any examples online on how to use these classes because such examples 
would give the (false) impression that "all PDF files can be parsed". 
That's not true: there are so many tools generating PDF (of which some 
aren't 100% in compliance with the PDF specification) and some PDFs 
won't ever give a good result when you try to extract the text.
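
To illustrate why examining the content stream takes PDF knowledge, here 
is a minimal sketch in plain Java. It is not iText code: the class 
ContentStreamSketch and the method literalStrings are hypothetical names 
made up for this example. It assumes the stream has already been 
decompressed and decoded to a String, and it only collects literal 
strings written as "( ... )", which are typically the operands of the 
text-showing operators Tj and TJ. Hex strings, octal escapes, font 
encodings and text positioning are all ignored, so it will only appear to 
work on very simple files.

```java
import java.util.ArrayList;
import java.util.List;

class ContentStreamSketch {

    /**
     * Collects every literal string "( ... )" found in a decoded content
     * stream. Simplification: an escaped character is copied as-is, so
     * sequences such as \n or \053 are NOT decoded properly.
     */
    static List<String> literalStrings(String stream) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < stream.length()) {
            if (stream.charAt(i) != '(') {
                i++;
                continue;
            }
            i++; // skip the opening parenthesis
            int depth = 1; // PDF allows balanced nested parentheses
            StringBuilder sb = new StringBuilder();
            while (i < stream.length() && depth > 0) {
                char c = stream.charAt(i++);
                if (c == '\\' && i < stream.length()) {
                    sb.append(stream.charAt(i++)); // keep the escaped character
                } else if (c == '(') {
                    depth++;
                    sb.append(c);
                } else if (c == ')') {
                    depth--;
                    if (depth > 0) {
                        sb.append(c);
                    }
                } else {
                    sb.append(c);
                }
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // A hand-written stream: one word split over a Tj and a TJ operator.
        String stream = "BT /F1 12 Tf 72 720 Td (Hel) Tj [(lo) -20 ( World)] TJ ET";
        List<String> parts = literalStrings(stream);
        System.out.println(parts);
        // Searching only works after gluing the fragments back together.
        System.out.println(String.join("", parts).contains("Hello"));
    }
}
```

Even in this tiny example the word "Hello" is spread over two operators, 
so a search on the raw stream would miss it. In real files the bytes 
between the parentheses are character codes in the font's encoding, not 
necessarily readable text, which is one reason a naive approach like this 
one fails.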

Read http://1t3xt.be/?X406 for more info.

You are asking something that is only possible using OCR tools.
-- 
This answer is provided by 1T3XT BVBA
http://www.1t3xt.com/ - http://www.1t3xt.info
