I don't know about everyone else but I would build scores like this: Create an empty scores table. Make gene_name the Primary Key. Allow the other columns to be null or give them a default of 0.0. Whatever that value is will represent "no score available" (I suggest null as a score of 0 may be a possible value)
Insert the contents of one of your 3 feeder table into scores (I picked g2d): INSERT scores (gene_name, g2d) SELECT gene_name, score FROM g2d What I would like to be able to do next is an "INSERT...SELECT ... ON DUPLICATE KEY UPDATE" statement for the other two tables. However, that INSERT form is not available (http://dev.mysql.com/doc/mysql/en/INSERT.html) So what I wind up doing is creating a temporary table of IDs that I will need to update. I use that list to help me merge the data for the two remaining tables. CREATE TEMPORARY TABLE tmpMatches SELECT DISTINCT s.gene_name FROM scores s INNER JOIN dgp ON dgp.gene_name = s.gene_name; ALTER TABLE tmpMatches ADD KEY(gene_name); INSERT scores (gene_name, dgp) SELECT gene_name, score FROM dgp LEFT JOIN tmpMatches tm ON tm.gene_name = dgp.gene_name WHERE tm.gene_name is null; UPDATE scores s INNER JOIN dgp ON dgp.gene_name = s.gene_name SET s.dgp = dgp.score; TRUNCATE TABLE tmpMatches; INSERT tmpMatches (gene_name) SELECT DISTINCT s.gene_name FROM scores s INNER JOIN pocus p ON p.gene_name = s.gene_name; INSERT scores (gene_name, pocus) SELECT gene_name, score FROM pocus p LEFT JOIN tmpMatches tm ON tm.gene_name = p.gene_name WHERE tm.gene_name is null; UPDATE scores s INNER JOIN pocus p ON p.gene_name = s.gene_name SET s.pocus = p.score; DROP TABLE tmpMatches; What you should end up with is that scores should list each gene only once with the score values from the other tables (where they are available). Wherever you have null values will indicate no data from that table. Notes on technique: It is much faster for the query engine to find matches than non-matches (That's why I build a table of gene_names that already exist in both tables) . There should already be indexes on both tables so this will be a very fast comparison. The index on my temporary table considerably speeds up the LEFT JOIN in the INSERT in the next step. (reducing the cost of our non-matching query) The UPDATE is also joining tables on indexed columns so it should also execute very quickly. Using this same basic method to combine columns between similar tables, just like your situation, I have merged two 500,000 row tables *on a laptop* in less than 15 seconds Yours, Shawn Green Database Administrator Unimin Corporation - Spruce Pine Jeroen Van Goey <[EMAIL PROTECTED]> wrote on 08/10/2004 11:44:29 AM: > Hey all, > > I've three tables (g2d, dgp and pocus), with a gene-name, a score and > some other information: > > mysql> select * from dgp; > +--------+---------------+-------------- > +------------------------------+------------+--------+ > | chr | start | stop | gene_name | > score | strand | > +--------+---------------+-------------- > +------------------------------+------------+--------+ > | chr13 | 69810099 | 70239331 | ENSG00000165659 | 738.681 | - | > | chr13 | 64676788 | 65600573 | ENSG00000184226 | 627.447 | - | > | chr13 | 59781822 | 59787282 | ENSG00000150506 | 619.467 | + | > .... > > And I've a table with a lot of gene-name's. Now I want to combine > these 4 tables into a fifth table, named score, wich list in the first > column all the gene names, and in the three succesive columns the > score for that gene from the g2d, dgp, and pocus tables, if a score > is found. Otherwise the default value should be 0.00 > Like this: > > mysql> select * from score; > +------------------------------+-----------------+------------------ > +----------------------+ > | gene_name | score_g2d | score_dgp | score_pocus | > +------------------------------+-----------------+------------------ > +----------------------+ > | ENSG00000165659 | 0.000 | 738.681 | 0.00 | > | ENSG00000184226 | 123,123 | 627.447 | 0.00 | > | ENSG00000999999 | 0.000 | 0.00 | 987,987 | > ...... > > How should I construct my query? I've been trying for hours now, and > can't get it rigth. Thanks in advance. > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] >