Hi!

I have posted this message yerstaday, but apperantly
did not give enough information for reader to figure out
what's going on. So, i am reposting it with more information.

I have the following table:

create table law_words (
    l_id int unsigned not null,
    w_id int unsigned not null,
    primary key (w_id, l_id)
);

The request is:

SELECT DISTINCT w0.l_id FROM   law_words as w0
inner join law_words as w1 on w0.l_id=w1.l_id
WHERE
w0.w_id  IN (258,282,287, 615, 1101, 1949, 1968, 3417, 3574,
3578, 3643,4345,4768, 5297, 5976,6133, 7243,7245, 9271, 9348, 11146,
11150, 11172, 11232,11847, 12542, 12859, 14811, 24839, 26653,27662)
AND
 w1.w_id IN (405, 2017,2192, 2592, 2595, 2603, 2981, 4055, 4068,
4346,5755, 6480, 9384,9408, 11513, 11514, 12126, 12134, 12638, 13052,
13643, 13769,13836, 13945, 14154, 14693, 14867, 14980, 15518, 15557,
17830, 19005, 19051, 19247, 20176, 20926, 22364, 22365, 22366, 22732,
24668,24793, 24956,  25286, 26242, 26665, 26847, 27144, 27348, 27815,
28494, 30910, 31878, 32161, 33586,  34396);

The basic idea is thart law_words holds index of
words (w_id) for each law (l_id), so law can be found by words, which
are specified by user and the their ids are looked up in
vocabulary table.

Now law_words has 228207 records and that request takes
about 2 seconds on a pc with 1GB of RAM and dual Pentium III XEON
550Mhz, which is TOO MUCH! Explain shows thart mysql is
using ' range' and primary index, and about 500 records for each
table.

HERE IS EXPLAIN:

+-------+-------+---------------+---------+---------+------+------+---
---------------------------------------+
| table | type  | possible_keys | key     | key_len | ref  | rows |
Extra                                    |
+-------+-------+---------------+---------+---------+------+------+---
---------------------------------------+
| w0    | range | PRIMARY       | PRIMARY |       4 | NULL |  473 |
where used; Using index; Using temporary |
| w1    | range | PRIMARY       | PRIMARY |       4 | NULL |  479 |
where used; Using index; Distinct        |
+-------+-------+---------------+---------+---------+------+------+---
---------------------------------------+
2 rows in set (0.01 sec)

For more than three specified words request takes about forever,
so no search is possible. The request is using INNER JOIN to get
the words in the 'AND' manner (laws which contain ALL specified
words). I mean, that the table is joined with itself, so only law_id
which
have all the specified words are returned.

I don't understand what I am doing wrong, since i thought it is
a basic technology behind any word search engine.

I really do not want to use fulltext search because it does not
do any morphology and some other stuff that I need.

Please, help, if you can.

Regards, Artem



---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to