Probably not a practical solution for you to set up, but I love this idea: http://blog.wired.com/monkeybites/2007/05/recaptcha_fight.html

----- Original Message ----
From: Renaud Waldura <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, 25 January, 2008 1:43:06 AM
Subject: Lucene to index OCR text

I've been poking around the list archives and didn't really come up against anything interesting. Anyone using Lucene to index OCR text? Any strategies/algorithms/packages you recommend?

I have a large collection (10^7 docs) that's mostly the result of OCR. We index/search/etc. with Lucene without any trouble, but OCR errors are a problem, when doing exact phrase matches in particular. I'm looking for ideas on how to deal with this thorny problem.

--
Renaud Waldura
Applications Group Manager
Library and Center for Knowledge Management
University of California, San Francisco
(415) 502-6660