Hi!

On Jan 07, Erlend Hopso Stromsvik wrote:
> > What I can easily do without breaking the 4.0.x "gamma" status is to add
> > a command line switch, --disable-fulltext-stopwords. It can help as a
> > temporary solution until a proper fix (per-index options, that is)
> > is implemented.
> 
> That would be helpful for me, but what about Thomas Spahni's suggestion?

In todo already :)

> > > I remember working on a project when I was in school where we wrote
> > > a program using autogenerated stopword lists and N-gram
> > > matching for the text and search string. That way the stopword list
> > > was not hard-coded.
> > 
> > What is "N-gram matching" ?
> 
> ************************
> n-grams are used to describe objects as vectors. This makes it possible to
> apply geometric, statistical and other mathematical techniques, which are
> well defined for vectors, but not for objects in general. For example, one
> of the most common uses is to define a similarity measure between textual
> documents based on the application of a mathematical function to the vector
> representations of the documents.

Yep. This is how fulltext search in MySQL works (not IN BOOLEAN MODE, of
course). It is called the "vector space model", though words are used as
vector coordinates, not n-grams.
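
To illustrate the idea only, here is a minimal sketch in Python of a
word-based vector space comparison. It uses raw word counts and cosine
similarity; this is an assumption made for the example and not the
weighting MySQL actually applies. The names word_vector and
cosine_similarity are hypothetical.

```python
import math
from collections import Counter


def word_vector(text):
    """Represent a text as a vector whose coordinates are word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    # Only words present in both vectors contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = word_vector("fulltext index")
    doc = word_vector("mysql fulltext search")
    print(cosine_similarity(query, doc))
```

Documents that share more words with the query score closer to 1, and
documents with no words in common score 0.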

> ************************
> An n-gram is a sequence of n consecutive characters extracted from a word.
> The main idea behind this approach is that similar words will have a
> high proportion of n-grams in common. Typical values for n are 2 or 3,
> corresponding to the use of digrams or trigrams, respectively.
> 
> So if you have the word 'computer' you'll get the following digrams:
> *c, co, om, mp, pu, ut, te, er, r*
> 
> and the trigrams:
> **c,*co,com,omp,mpu,put,ute,ter,er*,r**
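
The extraction described above can be sketched in a few lines of Python.
The padding with n-1 '*' characters on each side is taken from the
'computer' example; the Dice coefficient used to compare two words is
only one common choice and is an assumption here, as are the function
names.

```python
def ngrams(word, n, pad="*"):
    """Return the character n-grams of word, padded with n-1 pad chars per side."""
    padded = pad * (n - 1) + word + pad * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def dice_similarity(a, b, n=2):
    """Share of distinct n-grams two words have in common (Dice coefficient)."""
    grams_a, grams_b = set(ngrams(a, n)), set(ngrams(b, n))
    if not grams_a and not grams_b:
        return 0.0  # nothing to compare (only possible for empty input with n=1)
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


if __name__ == "__main__":
    print(ngrams("computer", 2))
    print(ngrams("computer", 3))
    print(dice_similarity("computer", "computers"))
```

With this, ngrams("computer", 2) and ngrams("computer", 3) give the
digram and trigram lists quoted above, and a near-match such as
"computers" scores much higher against "computer" than an unrelated word.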
 
Regards,
Sergei

-- 
MySQL Development Team
   __  ___     ___ ____  __
  /  |/  /_ __/ __/ __ \/ /   Sergei Golubchik <[EMAIL PROTECTED]>
 / /|_/ / // /\ \/ /_/ / /__  MySQL AB, http://www.mysql.com/
/_/  /_/\_, /___/\___\_\___/  Osnabrueck, Germany
       <___/
