Hi Fredrik,

Thanks for your post. I like the approach you describe, it is really very nice ;)
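
A rough sketch of the approach quoted below, as I understand it: keep (query, frequency) pairs from the search log, find the logged queries that are close to the entered one by Levenshtein distance, and suggest the most frequent of them. This is plain Python rather than a Lucene index searched with a FuzzyQuery, and the function names, the 0.7 similarity threshold and the sample log are all made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalise the distance to 0..1, where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def suggest(query, query_log, min_similarity=0.7):
    """Return the most frequent logged query similar to `query`, or None.

    `query_log` maps a previously entered query to how often it was seen.
    The threshold is an assumed value, not one taken from Lucene.
    """
    query = query.lower()
    candidates = [(freq, logged) for logged, freq in query_log.items()
                  if similarity(query, logged) >= min_similarity]
    if not candidates:
        return None
    best = max(candidates)[1]
    # If the best match is what the user typed, there is nothing to suggest.
    return None if best == query else best


# Hypothetical query log; in practice this would be rebuilt nightly or
# weekly from the saved queries.
QUERY_LOG = {"lucene": 950, "lucent": 40, "nutch": 720, "search engine": 310}

if __name__ == "__main__":
    print(suggest("lucenne", QUERY_LOG))
    print(suggest("nutch", QUERY_LOG))
```

Scanning the whole log for every query only works for a small log. With a large one, the candidate lookup is the part an index with fuzzy search would take over.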

Regards
/Jack

On 9/29/05, Fredrik Andersson <[EMAIL PROTECTED]> wrote:
> Hi Jack!
>
> I like these things to be driven by statistics rather than by the content
> of the index. If you run a search engine and want any kind of feedback, you
> will at least save all queries entered. You can store these in an index or
> database, and run a Levenshtein metric on the potentially misspelled
> query. If my memory serves me right, a Lucene FuzzyQuery uses this metric,
> so a good approach would be to keep a Lucene index with |query,frequency|
> tuples (updated nightly, weekly, or whatever), and simply search this index
> with a FuzzyQuery with some defined similarity, and pick the most frequent
> query as the suggestion.
>
> Fredrik
>
> On 9/29/05, Jack Tang <[EMAIL PROTECTED]> wrote:
> > Hi
> >
> > I really like Google's "Did you mean" feature, and I notice that Nutch
> > does not provide this function yet.
> >
> > In this article, http://today.java.net/lpt/a/211 , the author Tim White
> > implemented suggestions using n-grams to generate a suggestion index. Do
> > you think this is a good fit for Nutch? I mean, the index in Nutch will
> > be really huge. Or should we just provide some dictionaries, like
> > jazzy (LGPL) does?
> >
> > Thanks
> > /Jack
> > --
> > Keep Discovering ... ...
> > http://www.jroller.com/page/jmars
> >
>

--
Keep Discovering ... ...
http://www.jroller.com/page/jmars