I'm going to tell you the steps for Acrobat 6 because that's what I've got, but
I'm pretty sure the feature is there in older versions too.
1. Open the scan that is saved as PDF.
2. Document > Paper Capture > Start Capture...
In the settings you can choose either Searchable Image or Formatted Text
and Graphics. Searchable Image keeps the scan looking just like it is,
but it does the OCR magic and lays the recognized text under the image. So
when you search for a word, it highlights the scanned image of the word that
matches the hidden, underlying text. The other option, Formatted Text and
Graphics, is just like normal OCR. If it hits a word it can't cleanly convert,
it leaves that word as an image and places it inline with the rest of the text.
Super nifty. Saved my butt on a project I'm working on.
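
If it helps to picture the Searchable Image idea, here's a rough toy model in
Python. This is not how Acrobat actually stores anything; the Word class, the
find_highlights function, and the coordinates are all made up for illustration.
It just shows a hidden text layer where each OCR'd word remembers where it sits
in the scan, so a search on the text can highlight the matching spot in the
image.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str   # hidden OCR text for this word
    box: tuple  # (x, y, width, height) of the word in the scanned image


def find_highlights(hidden_layer, query):
    """Return the image boxes to highlight for hidden words matching query."""
    q = query.lower()
    # Compare case-insensitively and ignore trailing/leading punctuation.
    return [w.box for w in hidden_layer if w.text.lower().strip(".,;:") == q]


# Hypothetical OCR result for one scanned line; the coordinates are invented.
page = [
    Word("Languages", (72, 100, 88, 14)),
    Word("of", (166, 100, 16, 14)),
    Word("the", (188, 100, 26, 14)),
    Word("world.", (220, 100, 48, 14)),
]

print(find_highlights(page, "world"))
```

The search only ever looks at the hidden text; the image is never touched
except to draw the highlight boxes that come back.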
-Kevin
----- Original Message -----
From: "Smith, Matthew P -CONT(CSC)" <[EMAIL PROTECTED]>
To: "CF-Community" <[EMAIL PROTECTED]>
Sent: Thursday, January 15, 2004 10:13 AM
Subject: RE: quick OT question - pdf to format we can put in db
> Yes, I just tried it and get a blank .txt. It looks like one of those
> pdf's where someone scanned a page, little Xerox like artifacts and the
> like. Text selection tool doesn't grab anything either. I guess OCR would
> be the only option.
>
> Any good free OCR packages out there? Heck just a demo of one that costs
> tons would work; just need this one time.
>
> Thank you, Kevin.
>
> Matthew
>
> -----Original Message-----
> From: Kevin Graeme [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, January 14, 2004 4:25 PM
> To: CF-Community
> Subject: Re: quick OT question - pdf to format we can put in db
>
> Yes. With the full Acrobat you can save as text.
>
> -Kevin
>
> ----- Original Message -----
> From: "Smith, Matthew P -CONT(CSC)" <[EMAIL PROTECTED]>
> To: "CF-Community" <[EMAIL PROTECTED]>
> Sent: Wednesday, January 14, 2004 4:07 PM
> Subject: quick OT question - pdf to format we can put in db
>
> > My coworker is hand typing a pdf into the sql db. It's a pdf of the
> > languages of the world (like 12 pages). Any way to export to text or
> > something we can parse?
> >
> > Matthew P. Smith
> > Web Developer, Object Oriented
> > Naval Education & Training Professional
> > Development & Technology Center
> > (NETPDTC)
> > (850)452-1001 ext. 1245
> > [EMAIL PROTECTED]