help on query/group by

mel list_php Wed, 16 Mar 2005 09:47:30 -0800

Hi,

A friend of mine asked me to have a look at one of his queries, and I'm stuck. Here is his query:

  SELECT drugID, protID, COUNT(DISTINCT pmid),
         MAX(s1.syn) AS o1, MAX(s2.syn) AS o2
  FROM matches
  INNER JOIN synonyms AS s1 ON drugID=s1.nameID AND s1.syn LIKE 'a%'
  INNER JOIN synonyms AS s2 ON protID=s2.nameID AND s2.syn LIKE '%'
  INNER JOIN sentence ON sentID=id
  GROUP BY drugID, protID
  ORDER BY o1, o2
  LIMIT 601

and this is his goal:

"The idea is quite simple: The table called 'matches' contains triples

 drugID, protID, sentID

indicating a co-occurrence of a drug and a protein in a sentence. The
user of course searches for either drug name or protein name or
both. In the above query, the user wants everything for all drugs
starting with 'a'.

The MAX() calls more or less arbitrarily choose one of the many names
associated with drugID as a representative. With the COUNT() I want to
find out how many different medline abstracts (not sentences) have a
hit."
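
To make the intended semantics concrete, here is a minimal sketch that runs the same query on a toy dataset. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and every row in it is hypothetical: three matching sentences for one drug/protein pair, two of which come from the same abstract (pmid 100).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE matches  (protID TEXT, drugID TEXT, sentID INTEGER);
CREATE TABLE sentence (id INTEGER PRIMARY KEY, text TEXT, pmid INTEGER);
CREATE TABLE synonyms (nameID TEXT, syn TEXT);

-- Hypothetical toy rows: sentences 1 and 2 belong to the same abstract.
INSERT INTO sentence VALUES (1, 's1', 100), (2, 's2', 100), (3, 's3', 200);
INSERT INTO matches  VALUES ('P1', 'D1', 1), ('P1', 'D1', 2), ('P1', 'D1', 3);
INSERT INTO synonyms VALUES ('D1', 'aspirin'), ('D1', 'acetylsalicylic acid'),
                            ('P1', 'COX-1'), ('P1', 'PTGS1');
""")

# Same query as in the post, without the LIMIT.
row = conn.execute("""
SELECT drugID, protID, COUNT(DISTINCT pmid),
       MAX(s1.syn) AS o1, MAX(s2.syn) AS o2
FROM matches
INNER JOIN synonyms AS s1 ON drugID = s1.nameID AND s1.syn LIKE 'a%'
INNER JOIN synonyms AS s2 ON protID = s2.nameID AND s2.syn LIKE '%'
INNER JOIN sentence ON sentID = id
GROUP BY drugID, protID
ORDER BY o1, o2
""").fetchone()

print(row)  # ('D1', 'P1', 2, 'aspirin', 'PTGS1')
```

Three sentences match, but COUNT(DISTINCT pmid) reports 2 because two of them share an abstract, and each MAX() returns the alphabetically last synonym as the representative name.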

The matches table has 1,247,508 rows, sentence has 817,255 rows and synonyms has 225,497 rows.

First, I suspect that using INNER JOIN is not helpful in this case, because it seems to build a whole Cartesian product of the tables, whereas a LEFT JOIN might limit the number of rows. The second join, "INNER JOIN synonyms AS s2 ON protID=s2.nameID AND s2.syn LIKE '%'", also looks useless to me, because LIKE '%' matches any non-NULL value, so it just retrieves the non-NULL values for protID.
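
One way to check this suspicion is to count the rows each kind of join produces on a small example. The sketch below again uses Python's sqlite3 as a stand-in for MySQL, with hypothetical rows that include a protein without synonyms (P3) and one whose only synonym is NULL (P2).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE matches  (protID TEXT, drugID TEXT, sentID INTEGER);
CREATE TABLE synonyms (nameID TEXT, syn TEXT);

-- Hypothetical toy rows.
INSERT INTO matches  VALUES ('P1', 'D1', 1), ('P2', 'D1', 2), ('P3', 'D2', 3);
INSERT INTO synonyms VALUES ('D1', 'aspirin'), ('D2', 'ibuprofen'),
                            ('P1', 'COX-1'), ('P1', 'PTGS1'), ('P2', NULL);
""")

def count(from_clause):
    """Return the number of rows produced by the given FROM clause."""
    return conn.execute("SELECT COUNT(*) FROM " + from_clause).fetchone()[0]

# Full Cartesian product: every matches row paired with every synonyms row.
cross = count("matches CROSS JOIN synonyms AS s2")
# INNER JOIN with an ON condition keeps only the matching pairs.
inner = count("matches INNER JOIN synonyms AS s2 ON protID = s2.nameID")
# Adding LIKE '%' additionally drops the pair whose syn is NULL.
inner_like = count(
    "matches INNER JOIN synonyms AS s2 ON protID = s2.nameID AND s2.syn LIKE '%'")
# LEFT JOIN keeps every matches row, padding unmatched ones with NULLs.
left_like = count(
    "matches LEFT JOIN synonyms AS s2 ON protID = s2.nameID AND s2.syn LIKE '%'")

print(cross, inner, inner_like, left_like)  # 15 3 2 4
```

On this toy data the INNER JOIN with its ON condition yields far fewer rows than the Cartesian product, and the LEFT JOIN yields more rows than the INNER JOIN, because it keeps the matches rows that have no synonym. The LIKE '%' condition only removes the row with a NULL syn, but the join itself is still what supplies s2.syn to MAX(s2.syn).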

I also added indexes on the tables (I'm not very familiar with indexes, so that is probably where my problem is):
- on matches: index on protID,drugID and sentID
- on sentence: index on id (primary key)
- on synonyms: index on nameID,syn
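
For reference, the index setup above can be sketched as follows, once more with Python's sqlite3 standing in for MySQL. The index names are hypothetical, and the sketch assumes that "protID,drugID and sentID" means one composite index on (protID, drugID) plus a separate index on sentID, which is what the Key column of the desc output further down suggests. In MySQL itself, an index on a TEXT column also needs a prefix length, for example nameID(20), where the 20 is only an illustrative value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE matches  (protID TEXT, drugID TEXT, sentID INTEGER);
CREATE TABLE sentence (id INTEGER PRIMARY KEY, text TEXT, pmid INTEGER);
CREATE TABLE synonyms (nameID TEXT, syn TEXT);

-- Assumed reading of the index list; the index names are hypothetical.
CREATE INDEX idx_matches_prot_drug ON matches (protID, drugID);
CREATE INDEX idx_matches_sent      ON matches (sentID);
CREATE INDEX idx_synonyms_name_syn ON synonyms (nameID, syn);
""")

# List the explicitly created indexes.
index_names = sorted(
    name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'idx_%'"
    )
)
print(index_names)

# Ask the engine how it would look up the synonyms of one ID. The last
# column of each plan row is a human-readable description of the step.
plan = [row[-1] for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT syn FROM synonyms WHERE nameID = 'D1'")]
for step in plan:
    print(step)
```

The MySQL counterpart of the last step is to prefix the original query with EXPLAIN. One thing worth checking there: a composite index that leads with protID generally cannot serve a lookup by drugID alone.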

Here are the tables:
mysql> desc matches;
+--------+---------+------+-----+---------+-------+
| Field  | Type    | Null | Key | Default | Extra |
+--------+---------+------+-----+---------+-------+
| protID | text    | YES  | MUL | NULL    |       |
| drugID | text    | YES  |     | NULL    |       |
| sentID | int(11) | YES  | MUL | NULL    |       |
+--------+---------+------+-----+---------+-------+
3 rows in set (0.00 sec)

mysql> desc sentence;
+-------+------------------+------+-----+---------+----------------+
| Field | Type             | Null | Key | Default | Extra          |
+-------+------------------+------+-----+---------+----------------+
| id    | int(10) unsigned |      | PRI | NULL    | auto_increment |
| text  | text             | YES  |     | NULL    |                |
| pmid  | int(11)          | YES  |     | NULL    |                |
+-------+------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

mysql> desc synonyms;
+--------+------+------+-----+---------+-------+
| Field  | Type | Null | Key | Default | Extra |
+--------+------+------+-----+---------+-------+
| nameID | text | YES  | MUL | NULL    |       |
| syn    | text | YES  |     | NULL    |       |
+--------+------+------+-----+---------+-------+
2 rows in set (0.00 sec)

Thanks a lot,
Melanie

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
