Re: searching in OCRed pdf

Péterfi Balázs Mon, 26 Jan 2009 08:48:19 -0800

I think it has already been OCRed because, as I wrote, I can search in the PDF with Adobe Reader and it also highlights the result. But what I see is a scanned page, and I guess there is a text layer "behind" it. Is that possible?


Paco Avila wrote:

You can write a text extractor that performs OCR.
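
A minimal sketch of that idea in Java, under stated assumptions: it is not Jackrabbit's own extractor API. The names PdfTextSource and OcrFallbackExtractor are hypothetical, and the two delegates stand in for whatever PDF parsing library and OCR engine you would actually plug in. It only shows the control flow: use the embedded text layer when there is one, and run OCR only when the PDF yields no text.

```java
/** Hypothetical abstraction: anything that can turn PDF bytes into plain text. */
interface PdfTextSource {
    String extract(byte[] pdfBytes);
}

/**
 * Tries the embedded text layer first and falls back to OCR only when
 * that yields no usable text (for example, a scan with no hidden text layer).
 */
final class OcrFallbackExtractor implements PdfTextSource {
    private final PdfTextSource textLayer; // assumed: backed by a PDF parsing library
    private final PdfTextSource ocr;       // assumed: backed by an OCR engine

    OcrFallbackExtractor(PdfTextSource textLayer, PdfTextSource ocr) {
        this.textLayer = textLayer;
        this.ocr = ocr;
    }

    @Override
    public String extract(byte[] pdfBytes) {
        String text = textLayer.extract(pdfBytes);
        // A non-blank result means the PDF already carries searchable text.
        if (text != null && !text.trim().isEmpty()) {
            return text;
        }
        // Otherwise the pages are probably images only, so recognize them.
        return ocr.extract(pdfBytes);
    }
}
```

The text returned by such a class would then have to be handed to the indexer through whatever extractor hook your Jackrabbit version provides. Trying the text layer first keeps the expensive OCR step off documents that do not need it.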


On Mon, Jan 26, 2009 at 5:25 PM, Péterfi Balázs wrote:

Hello,

I'm developing an application that uses Jackrabbit and have a problem
with searching in PDF files. When I search in a PDF that was generated from
a Word document, it works. When I search in a PDF that contains a scanned
document, I can search through its contents from within Adobe Reader (some
sort of Optical Character Recognition), but my application does not return
any results. I don't know how this kind of PDF works, but I need to search
in it. Does Jackrabbit support it?

Thank you!
Balazs
