Gary Kline wrote:
People,
You may remember that I'm trying to scan > 400 pages from a text.
Things work much better using the latest gocr and a greatly
enlarged JPEG image, tweaked with xv. I'm almost to the point
where I can use aspell -c to correct misinterpreted text. The
gotcha is that the sample jpg files I have are filled with
improper non-characters, including "_", "<", ">", along with
punctuation, and random integers. Is there any way to tell
aspell to look at (say) S_wiss and guess Swiss, an6yle and guess
angle, n:otio:1 and guess motion, and di.5tnnce and guess distance?
You might get somewhere with the bad-spellers suggestion mode
setting (--sug-mode=bad-spellers), which should make it more
aggressive about trying to find a match for mangled strings.
However, I understand that in this mode it's still looking for
soundslike mistakes, not "9 looks like g" and the like. This mode
also turns off checking for typos IIRC, but those checks really
won't be helping you anyway since they're looking for fumbled
keystrokes, not lookalike chars. Tuning the edit distance may or
may not help for those really bad mangles.
Other than that, you should probably ask this question in an aspell
support forum for best results.
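
Since aspell has no notion of lookalike characters, one thing you
could try is a small pre-pass that undoes the obvious OCR damage
before aspell sees the text. The following is only a rough sketch of
that idea, not an aspell feature: the lookalike table, the junk
character set, and the function names are all made up for
illustration, and Python's difflib stands in for a real spell checker
so the example runs on its own.

```python
import difflib
import itertools
import re

# Assumed lookalike table (OCR glyph -> letters it might stand for).
# Illustrative only; extend it from the errors your own scans show.
LOOKALIKES = {
    "0": ["o"],
    "1": ["l", "i"],
    "5": ["s"],
    "6": ["g", "b"],
    "9": ["g", "q"],
}

# Stray marks assumed to show up inside words; adjust to taste.
JUNK = re.compile(r"[_<>.:;,]")


def candidates(token, limit=256):
    """Yield lowercase spellings of token with junk removed and
    each digit either kept or swapped for a lookalike letter."""
    cleaned = JUNK.sub("", token)
    options = [[ch] + LOOKALIKES.get(ch, []) for ch in cleaned]
    for combo in itertools.islice(itertools.product(*options), limit):
        yield "".join(combo).lower()


def suggest(token, words, cutoff=0.75):
    """Return the entry of words closest to any candidate, or None."""
    best, best_score = None, 0.0
    for cand in candidates(token):
        # difflib is a stand-in here for the real dictionary lookup.
        for match in difflib.get_close_matches(cand, words, n=1, cutoff=cutoff):
            score = difflib.SequenceMatcher(None, cand, match).ratio()
            if score > best_score:
                best, best_score = match, score
    return best


if __name__ == "__main__":
    words = ["swiss", "angle", "motion", "distance"]  # stand-in word list
    for token in ["S_wiss", "an6yle", "di.5tnnce"]:
        print(token, "->", suggest(token, words))
```

With that tiny word list, S_wiss, an6yle and di.5tnnce come back as
swiss, angle and distance. A token like n:otio:1 is far enough from
motion that it will likely need a looser cutoff or a human eye. In
practice you would probably write the cleaned candidates back to the
file and let aspell -c handle whatever is left.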
--
Greg Barniskis, Computer Systems Integrator
South Central Library System (SCLS)
Library Interchange Network (LINK)
<gregb at scls.lib.wi.us>, (608) 266-6348
_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions