Hello world, I'm pretty new at SQL. I'm just trying to build a simple document index. I'm hoping there is a better way to do things than what I've got here.
Here's what I've got for a table definition: >mysql> create table main_index( > sequence int unsigned primary key auto_increment not null, > doc_id int unsigned not null, > word_id int unsigned not null, > index idx_word (word_id), > index idx_doc (doc_id) ); >mysql> describe main_index; >+----------+------------------+------+-----+---------+----------------+ >| Field | Type | Null | Key | Default | Extra | >+----------+------------------+------+-----+---------+----------------+ >| sequence | int(10) unsigned | | PRI | NULL | auto_increment | >| doc_id | int(10) unsigned | | MUL | 0 | | >| word_id | int(10) unsigned | | MUL | 0 | | >+----------+------------------+------+-----+---------+----------------+ >3 rows in set (0.00 sec) In this table are about five million rows (at the moment but it will grow). There are a several thousand possible unique values for doc_id's and word_id's. I'm trying to find the unique doc_id's which have matching word_id's. Simple queries run very quickly of course. More complex queries are very unpredictable. They vary tremendously depending on the size of the (intermediate) result set. I expected that a certain amount but the variability is much larger than I expected. Do you have any suggestions about a better way to design/index my table or do the queries? Do I need to break my queries up and use temporary tables? If it matters, I'm running MySQL 3.23.49 on a AMD 1.3ghz Linux machine. Here are some typical queries and results: >mysql> select count(distinct m0.doc_id) from main_index as m0 where >m0.word_id=10; >+---------------------------+ >| count(distinct m0.doc_id) | >+---------------------------+ >| 405 | >+---------------------------+ >1 row in set (0.01 sec) > >mysql> select count(distinct m0.doc_id) from main_index as m0 where >m0.word_id=9; >+---------------------------+ >| count(distinct m0.doc_id) | >+---------------------------+ >| 221 | >+---------------------------+ >1 row in set (0.00 sec) > >mysql> select count(distinct m0.doc_id) from main_index as m0 where >m0.word_id=2327; >+---------------------------+ >| count(distinct m0.doc_id) | >+---------------------------+ >| 17 | >+---------------------------+ >1 row in set (0.00 sec) > >mysql> select count(distinct m0.doc_id) from main_index as m0, >main_index as m1 where m0.word_id=10 and m1.word_id=2327 and >m0.doc_id=m1.doc_id; >+---------------------------+ >| count(distinct m0.doc_id) | >+---------------------------+ >| 17 | >+---------------------------+ >1 row in set (0.59 sec) > >mysql> select count(distinct m0.doc_id) from main_index as m0, >main_index as m1 where m0.word_id=9 and m1.word_id=2327 and >m0.doc_id=m1.doc_id; >+---------------------------+ >| count(distinct m0.doc_id) | >+---------------------------+ >| 15 | >+---------------------------+ >1 row in set (0.55 sec) > >mysql> select count(distinct m0.doc_id) from main_index as m0, >main_index as m1 where m0.word_id=9 and m1.word_id=10 and >m0.doc_id=m1.doc_id; >+---------------------------+ >| count(distinct m0.doc_id) | >+---------------------------+ >| 181 | >+---------------------------+ >1 row in set (59.56 sec) > >mysql> select count(distinct m0.doc_id) from main_index as m0, >main_index as m1, main_index as m2 where m0.word_id=10 and >m1.word_id=2327 and m2.word_id=9 and m0.doc_id=m1.doc_id and >m0.doc_id=m2.doc_id; >+---------------------------+ >| count(distinct m0.doc_id) | >+---------------------------+ >| 15 | >+---------------------------+ >1 row in set (8.62 sec) Thank you in advance, Bill -- Bill Rausch, Software Development, UNIX, Mac, Windows Numerical Applications, Richland, WA 509-943-0861 --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php