On Mon, Jan 26, 2009 at 01:36:48PM -0800, Charlie Kester wrote:
> On Mon 26 Jan 2009 at 00:16:23 PST Polytropon wrote:
> >On Mon, 26 Jan 2009 00:06:18 -0800, Gary Kline <kl...@thought.org> wrote:
> >>    Thanks, Gents,
> >>
> >>    But according to one smallish pdf file that I sent to a web-based
> >>    tool, it was not a real pdf.  Or, more accurately, it (the pdf to
> >>    speech program) couldn't decode it.
> >
> >This is a typical problem with "poorly engineered" PDFs where the
> >author puts in the text as images (you'll see this stupidity across
> >the Web, too).
> 
> In most cases where I've seen this, it's because they had scanned an
> actual printed document.  Many old, out-of-print books are being made
> newly available this way, so I'm not inclined to complain.
> 
> Unfortunately, OCR software still isn't reliable enough (or, if
> reliable, cheap enough) to convert these scanned images to actual text.


        You're probably right about the cost/performance idea.  Still,
        before I get back to the last few pages of my thesis, maybe I'll
        try feeding parts of my most vanilla image-PDF file to an
        open-source OCR program.  I'm pretty sure there are a couple in
        ports.  IIRC, though, the images have to be JPEGs or TIFFs or the
        like.  If anybody knows, please give me a shout out!

        gary

-- 
 Gary Kline  kl...@thought.org  http://www.thought.org  Public Service Unix
        http://jottings.thought.org   http://transfinite.thought.org
    The 2.23a release of Jottings: http://jottings.thought.org/index.php

_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"
