To all of you - thank you so much.
Your solutions seem to help me :-)
Best regards
bernhard
--
View this message in context:
http://itext-general.2136553.n4.nabble.com/How-to-check-if-a-PDF-is-OCR-recognized-tp3678057p3680049.html
Sent from the iText - General mailing list archive at Nabble.com
I bought the second edition recently. I will catch up as soon as I can! :)
- Original Message -
From: "1T3XT BVBA"
To: "Post all your questions about iText here"
Sent: Tuesday, July 19, 2011 9:51 AM
Subject: Re: [iText-questions] How to check if a PDF is OCR rec
On 19/07/2011 15:47, AJ Weber wrote:
> I think the standard message was
> "iText doesn't do text extraction, we don't want to do it, use pdfbox or
> jpedal to do that."
Yes, that sounds like a literal quote from the first book.
But that was written 5 to 6 years ago. Time flies ;-)
From: "1T3XT BVBA"
To: "Post all your questions about iText here"
Sent: Tuesday, July 19, 2011 9:34 AM
Subject: Re: [iText-questions] How to check if a PDF is OCR recognized
> On 19/07/2011 15:29, AJ Weber wrote:
>> It's not foolproof
> True, it's an "educ
On 19/07/2011 15:29, AJ Weber wrote:
> It's not foolproof
True, it's an "educated guess", but I'm pretty sure the error margin is
very low.
By the way: iText can extract text from a PDF too ;-)
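
A rough sketch of that "educated guess", assuming the text of each page has already been extracted (with iText 5 that could be done with PdfTextExtractor.getTextFromPage(reader, pageNumber)). The class name, method names and both thresholds are made up for illustration and are not part of iText; tune them to your own documents.

```java
import java.util.List;

/**
 * Sketch of the "educated guess": a PDF probably already has a text (OCR)
 * layer if enough of its pages yield extractable text.
 * Class name, method names and thresholds are illustrative assumptions.
 */
final class OcrLayerGuess {

    /** Counts the non-whitespace characters in the text extracted from one page. */
    static int visibleChars(String pageText) {
        if (pageText == null) {
            return 0;
        }
        int count = 0;
        for (int i = 0; i < pageText.length(); i++) {
            if (!Character.isWhitespace(pageText.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    /**
     * @param pageTexts       text extracted per page (one entry per page)
     * @param minCharsPerPage pages with fewer visible characters count as "image only"
     * @param minPageFraction fraction of pages (0.0 to 1.0) that must carry text
     * @return true if the document probably does not need OCR
     */
    static boolean probablyHasTextLayer(List<String> pageTexts,
                                        int minCharsPerPage,
                                        double minPageFraction) {
        if (pageTexts.isEmpty()) {
            return false; // nothing to judge, so treat it as needing OCR
        }
        int pagesWithText = 0;
        for (String text : pageTexts) {
            if (visibleChars(text) >= minCharsPerPage) {
                pagesWithText++;
            }
        }
        return pagesWithText >= minPageFraction * pageTexts.size();
    }
}
```

It is not foolproof: a scanned page with a small text stamp or header can pass a low threshold, so documents near the limit can simply be sent for OCR anyway.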
n't hurt that I send it for OCR anyway -- just
takes a little longer).
-AJ
- Original Message -
From: "Bernhard Haslinger"
To:
Sent: Tuesday, July 19, 2011 8:58 AM
Subject: [iText-questions] How to check if a PDF is OCR recognized
> Dear all,
>
> I'
On 19/07/2011 14:58, Bernhard Haslinger wrote:
> Is there a way to check with the iText library if an existing PDF has an OCR
> layer or not?
iText can parse PDFs into plain text, provided that the text doesn't
consist of images.
- if you use iText to parse your PDFs, and there's no text; then the PD
Dear all,
I have a lot of PDF files - some of them are bitmaps, some of them are OCR
recognized.
Now I plan to have all files OCR recognized, but if possible I don't want to
scan all documents, because I think most of them are already recognized.
Is there a way to check with