Hi,

   You are right about this, I should have thought of it. There is just
one problem: a word can occur multiple times in the same document, so the
table that gives the position of a word in a doc must hold all the
positions of that word. If we keep your logic, we would have to make a table
for each word, and I don't think that's a good idea. Suggestions?

By the way, my question was:
Which is faster: filtering the docs first and then performing a '%like%'
query on the relevant documents:
Ex: We search for 'red sun', so we first filter the documents to keep only
the ones containing both red and sun, so there is a chance that some of them
contain the expression 'red sun'. Then we perform a select query with
WHERE text LIKE '%red sun%' on the relevant documents.
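
To make this first option concrete, here is a rough sketch. It is only an
illustration under assumptions: it uses Python's sqlite3 module with an
in-memory database so it can be run standalone (MySQL would use
auto_increment and INSERT IGNORE instead), it borrows the documents, words
and occurences tables from the quoted message, and the helper names
add_document and search_phrase_like are made up for this example.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (doc_id INTEGER PRIMARY KEY, document TEXT);
    CREATE TABLE words (word_id INTEGER PRIMARY KEY, word TEXT UNIQUE);
    CREATE TABLE occurences (occ_id INTEGER PRIMARY KEY,
                             doc_id INTEGER, word_id INTEGER);
""")

def add_document(text):
    """Store a document and one occurences row per distinct word in it."""
    doc_id = conn.execute(
        "INSERT INTO documents (document) VALUES (?)", (text,)).lastrowid
    for word in set(text.lower().split()):
        conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
        word_id = conn.execute(
            "SELECT word_id FROM words WHERE word = ?", (word,)).fetchone()[0]
        conn.execute(
            "INSERT INTO occurences (doc_id, word_id) VALUES (?, ?)",
            (doc_id, word_id))
    return doc_id

def search_phrase_like(phrase):
    """Keep only docs containing every word, then run LIKE on those docs."""
    terms = phrase.lower().split()
    placeholders = ",".join("?" * len(terms))
    sql = f"""
        SELECT d.doc_id
        FROM documents d
        WHERE d.doc_id IN (
            -- step 1: docs that contain all of the searched words
            SELECT o.doc_id
            FROM occurences o JOIN words w ON w.word_id = o.word_id
            WHERE w.word IN ({placeholders})
            GROUP BY o.doc_id
            HAVING COUNT(DISTINCT w.word) = ?
        )
        -- step 2: substring match, evaluated on the surviving docs
        AND d.document LIKE ?
        ORDER BY d.doc_id
    """
    params = (*terms, len(set(terms)), f"%{phrase}%")
    return [row[0] for row in conn.execute(sql, params)]

add_document("the red sun rises")
add_document("the sun is red")
add_document("a red car")
print(search_phrase_like("red sun"))  # prints [1]
```

One caveat with this option: LIKE '%red sun%' matches substrings, so a
document that contains red and sun as words but also the text
"bored sunny" would be returned even without the exact phrase.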

OR not using LIKE statements at all, and instead using a word positions
table to find the docs where the position of sun = the position of red + 1?
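
The second option could look roughly like the sketch below. Again this is
only an illustration under assumptions: Python's sqlite3 module with an
in-memory database stands in for MySQL, the four tables follow the quoted
message, positions are counted in words starting at 0, and the names
add_document and search_two_word_phrase are made up. Note that a word
occurring several times in a document simply gets several rows in positions,
all pointing at the same occurences row.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (doc_id INTEGER PRIMARY KEY, document TEXT);
    CREATE TABLE words (word_id INTEGER PRIMARY KEY, word TEXT UNIQUE);
    CREATE TABLE occurences (occ_id INTEGER PRIMARY KEY,
                             doc_id INTEGER, word_id INTEGER,
                             UNIQUE (doc_id, word_id));
    CREATE TABLE positions (pos_id INTEGER PRIMARY KEY,
                            occ_id INTEGER, position INTEGER);
""")

def add_document(text):
    """Index a document: one occurences row per (doc, word) pair and
    one positions row for every place the word appears."""
    doc_id = conn.execute(
        "INSERT INTO documents (document) VALUES (?)", (text,)).lastrowid
    for position, word in enumerate(text.lower().split()):
        conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
        word_id = conn.execute(
            "SELECT word_id FROM words WHERE word = ?", (word,)).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO occurences (doc_id, word_id) VALUES (?, ?)",
            (doc_id, word_id))
        occ_id = conn.execute(
            "SELECT occ_id FROM occurences WHERE doc_id = ? AND word_id = ?",
            (doc_id, word_id)).fetchone()[0]
        conn.execute(
            "INSERT INTO positions (occ_id, position) VALUES (?, ?)",
            (occ_id, position))
    return doc_id

# Docs where some position of the second word = a position of the first + 1.
PHRASE_SQL = """
    SELECT DISTINCT o1.doc_id
    FROM words w1
    JOIN occurences o1 ON o1.word_id = w1.word_id
    JOIN positions  p1 ON p1.occ_id  = o1.occ_id
    JOIN occurences o2 ON o2.doc_id  = o1.doc_id
    JOIN words      w2 ON w2.word_id = o2.word_id
    JOIN positions  p2 ON p2.occ_id  = o2.occ_id
    WHERE w1.word = ? AND w2.word = ?
      AND p2.position = p1.position + 1
    ORDER BY o1.doc_id
"""

def search_two_word_phrase(first, second):
    """Return ids of docs where `second` directly follows `first`."""
    params = (first.lower(), second.lower())
    return [row[0] for row in conn.execute(PHRASE_SQL, params)]

add_document("red car under a red sun")
add_document("the sun is red")
add_document("red sun red sun")
print(search_two_word_phrase("red", "sun"))  # prints [1, 3]
```

Which of the two is faster on real data I cannot say from this sketch; it
would have to be measured with proper indexes on the join columns.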


Thank you,

Cedric Veilleux



> what you'll need is:
> 1 table with doc_ids (and perhaps document)
> 1 table with words
> 1 table which links words to docs
> 1 table which gives the position of a word in a doc.
> 
> create table documents (doc_id integer primary key auto_increment,
>   document text);
> create table words (word_id integer primary key auto_increment,
>   word varchar(255));
> create table occurences (occ_id integer primary key auto_increment,
>   doc_id integer, word_id integer);
> create table positions (pos_id integer primary key auto_increment,
>   occ_id integer, position integer);
> 
> this way you can handle "unlimited" words with "unlimited" occurrences in
> "unlimited" documents.
> 
> any other solution would force you to construct very inefficient tables, or
> to use blob fields, which would slow down your db horribly (when adding
> data, for example) and is generally a very bad approach that you shouldn't
> even consider.
