Hi Olga, On Thu, Oct 29, 2009 at 03:29:58PM +0000, Olga Lyashevska wrote: > I have a table which contains taxonomic data (species, genera, family, > order, class) and it is organized as adjacency list model. > > mysql> select* from taxonomic_units1 limit 5; > +-----+-------------------------------------------+---------+ > | tsn | name | parent_tsn | rank_id | > +-----+--------------------------+--------------+---------+ > | 50 | Bacteria | 0 | 10 | > | 51 | Schizomycetes 202421 | 60 | > | 52 | Archangiaceae | 51 | 140 | > | 53 | Pseudomonadale | 51 | 100 | > | 54 | Rhodobacteriineae | 53 | 110 | > +-----+-----------------------------+------------+---------+ > > I am trying to flatten it, so that it can be used in further analysis > (e.g. in R) > I have been trying to run the following query, and it does what I want > it to do, but it takes really long time to get it done. > As a matter of fact I was not patient enough to get the whole output and > instead set LIMIT 10. > > SELECT O1.name AS tclass, O2.name AS torder, O4.name AS tfamily, O5.name > AS tgenus, O6.name AS tspecies > FROM taxonomic_units1 AS O1 > LEFT OUTER JOIN > taxonomic_units1 AS O2 > ON O1.tsn = O2.parent_tsn > LEFT OUTER JOIN > taxonomic_units1 AS O3 > ON O2.tsn = O3.parent_tsn > LEFT OUTER JOIN > taxonomic_units1 AS O4 > ON O3.tsn = O4.parent_tsn > LEFT OUTER JOIN > taxonomic_units1 AS O5 > ON O4.tsn = O5.parent_tsn > LEFT OUTER JOIN > taxonomic_units1 AS O6 > ON O5.tsn = O6.parent_tsn > LIMIT 10; > > +---------------+-----------------+------------------+-------------- > +-----------------------------------------------------------+ > | tclass | torder | tfamily > | tgenus | tspecies > | > +---------------+-----------------+------------------+-------------- > +-----------------------------------------------------------+ > | Bacteria | NULL | NULL > | NULL | NULL | > | Schizomycetes | Archangiaceae | NULL | > NULL | NULL | > | Schizomycetes | Pseudomonadales | NULL | NULL > | NULL | > | Schizomycetes | Pseudomonadales | Nitrobacteraceae | Nitrobacter | > Nitrobacteragilis | > | Schizomycetes | Pseudomonadales | Nitrobacteraceae | Nitrobacter | > Nitrobacterflavus | > | Schizomycetes | Pseudomonadales | Nitrobacteraceae | Nitrobacter | > Nitrobacteroligotrophis | > | Schizomycetes | Pseudomonadales | Nitrobacteraceae | Nitrobacter | > Nitrobacterpolytrophus | > | Schizomycetes | Pseudomonadales | Nitrobacteraceae | Nitrobacter | > Nitrobacterpunctata | > > > I have checked this query with EXPLAIN, and it is not using any indices, > even though column tsn is set as index in original table. > > +----+-------------+-------+------+---------------+------+--------- > +------+--------+-------+ > | id | select_type | table | type | possible_keys | key | key_len | ref > | rows | Extra | > +----+-------------+-------+------+---------------+------+--------- > +------+--------+-------+ > | 1 | SIMPLE | O1 | ALL | NULL | NULL | NULL | > NULL | 483305 | | > | 1 | SIMPLE | O2 | ALL | NULL | NULL | NULL | > NULL | 483305 | | > | 1 | SIMPLE | O3 | ALL | NULL | NULL | NULL | > NULL | 483305 | | > | 1 | SIMPLE | O4 | ALL | NULL | NULL | NULL | > NULL | 483305 | | > | 1 | SIMPLE | O5 | ALL | NULL | NULL | NULL | > NULL | 483305 | | > | 1 | SIMPLE | O6 | ALL | NULL | NULL | NULL | > NULL | 483305 | | > +----+-------------+-------+------+---------------+------+--------- > +------+--------+-------+ > 6 rows in set (0.00 sec) > > > What is wrong with this query? Or is it a problem of all adjacency list > models? > Is there a way to get columns indexed using self-joins?
For an outer join like ... taxonomic_units1 AS O1 LEFT OUTER JOIN taxonomic_units1 AS O2 ON O1.tsn = O2.parent_tsn current optimizer has only one option(*): use Nested-Loops Join algorthm, with the outer table being the first one. That is, it will execute these loops: for each record from O1 for each record in O2 such that O1.tsn = O2.parent_tsn ... this makes it clear that index on O1.tsn will not be useful. You need indexes on parent_tsn column. (*) certain kinds of WHERE clause may open other opportinites. BR Sergey -- Sergey Petrunia, Software Developer Monty Program AB, http://askmonty.org Blog: http://s.petrunia.net/blog -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org