Hi!

On Dec 15, Bob Sidebotham wrote:
> Thanks for the note, Sergei. I admit to being confused by the
> documentation on boolean search. In particular, exactly what the problem
> domain for the two algorithms is supposed to be (and how relevancy is
> computed with boolean search). I don't understand why boolean search has
> (it seems) a totally different approach to relevancy.
> 
> Or to ask it a different way, which applications would use non-boolean
> search, and why?

A boolean search engine answers the question: which documents contain
these words, or these words, and do not contain those words? It's a
_boolean_ query: every document either matches (value = TRUE) or does
not (value = FALSE).

The "relevance" that MySQL returns for a boolean query does not mean
much - it's a simple estimate based on the number of words matched.
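
To make the match / no-match idea concrete, here is a minimal sketch in
Python. It is not MySQL's code or its actual relevance formula; the
function name boolean_match and the required / excluded / optional
parameters are made up for this illustration, and query words are
assumed to be lowercase already.

```python
def boolean_match(document, required=(), excluded=(), optional=()):
    """Return (matches, estimate) for one document.

    matches  - True or False, nothing in between
    estimate - number of query words found in the document; a crude
               stand-in for "relevance", NOT MySQL's real formula
    """
    words = set(document.lower().split())
    required, excluded, optional = set(required), set(excluded), set(optional)

    # Every required word must be present.
    if not required <= words:
        return False, 0
    # No excluded word may be present.
    if excluded & words:
        return False, 0
    # With only optional words, at least one of them must be present.
    if not required and optional and not (optional & words):
        return False, 0

    estimate = len(required & words) + len(optional & words)
    return True, estimate


doc = "MySQL supports fulltext search"
print(boolean_match(doc, required=["mysql"], excluded=["oracle"],
                    optional=["search", "index"]))
```

The document either passes all three checks or it does not; the number
returned alongside is just a count of matched words.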

A boolean query - by its nature - should not be subject to stopword
filtering. It is now, but that will be changed soon.

A natural language query engine is designed to find documents "about
this and that". It can do it any way it wants. It is _not_ guaranteed
that it will do it by comparing query and documents word-by-word.
It is _not_ guaranteed that a document found will have any words
in common with the query at all. It does now, but that can change.
It's not a query about _words_, it's a query about their _meaning_.
A document cannot simply match or mismatch - it can be partially
relevant. In fact, different people will have different notions of how
relevant a particular document is. To train a nl search engine,
so-called test collections are used - sets of documents and queries
where, for each query, the set of relevant documents is specified by a
group of human experts.

A natural language search engine can do anything that helps it
approximate those human judgements as closely as possible. It can apply
stopword filtering (to remove noise words), stemming, thesaurus
expansion, complex statistics, whatever.
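
For contrast with a yes/no match, here is a minimal Python sketch of
graded ranking. It assumes a simple term-frequency times
inverse-document-frequency weighting and a tiny made-up stopword list;
the names STOPWORDS, tokenize and rank are hypothetical, and this is
not the algorithm MySQL uses.

```python
import math
from collections import Counter

# Hypothetical noise words; a real engine ships a much longer list.
STOPWORDS = {"the", "a", "of", "and", "is", "in", "to"}


def tokenize(text):
    """Lowercase, split on whitespace, drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def rank(query, documents):
    """Return (score, index) pairs, best first; zero scores are dropped."""
    doc_counts = [Counter(tokenize(d)) for d in documents]
    n = len(documents)
    query_words = set(tokenize(query))
    results = []
    for i, counts in enumerate(doc_counts):
        score = 0.0
        for word in query_words:
            # Number of documents that contain this word.
            df = sum(1 for c in doc_counts if word in c)
            if counts[word]:
                # Rare words weigh more than common ones.
                score += counts[word] * math.log((n + 1) / df)
        if score > 0:
            results.append((score, i))
    results.sort(key=lambda r: (-r[0], r[1]))
    return results


docs = ["the cat sat in the hat", "a dog and a cat", "fish swim in water"]
for score, index in rank("the cat in the hat", docs):
    print(round(score, 3), docs[index])
```

Here the second document is still returned, only with a lower score than
the first: it is partially relevant rather than simply matching or not.
A query made up only of stopwords returns nothing, which is why this
kind of filtering fits a natural language query better than a boolean
one.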

Thanks for the good question.
The issue really requires some clarification.
We'll update the manual.

> Also my understanding from the documentation and release notes is that
> boolean search is only available in 4.0.1, which is not yet available.
> Is this correct?

Almost. A form of "boolean search" is available in 4.0.0 as well,
but it's built on top of the existing nl-search code.
In 4.0.1 it has been completely rewritten from scratch;
the syntax has changed in an incompatible way and is documented in the
manual.

("it is documented" means that there should be no further syntax
changes - or at least that backward compatibility will be maintained)

Regards,
Sergei

-- 
MySQL Development Team
   __  ___     ___ ____  __
  /  |/  /_ __/ __/ __ \/ /   Sergei Golubchik <[EMAIL PROTECTED]>
 / /|_/ / // /\ \/ /_/ / /__  MySQL AB, http://www.mysql.com/
/_/  /_/\_, /___/\___\_\___/  Osnabrueck, Germany
       <___/
