Dear all,

I have a table which contains taxonomic data (species, genera, family, order, class) and it is organized as adjacency list model.

mysql> select* from taxonomic_units1 limit 5;
+-----+-------------------------------------------+---------+
 | tsn | name                         | parent_tsn | rank_id |
+-----+--------------------------+--------------+---------+
|  50 | Bacteria                         |            0 |           10 |
|  51 | Schizomycetes              202421 |           60 |
|  52 | Archangiaceae             |         51 |        140 |
|  53 | Pseudomonadale         |         51 |       100 |
|  54 | Rhodobacteriineae      |         53 |       110 |
+-----+-----------------------------+------------+---------+

I am trying to flatten it, so that it can be used in further analysis (e.g. in R) I have been trying to run the following query, and it does what I want it to do, but it takes really long time to get it done. As a matter of fact I was not patient enough to get the whole output and instead set LIMIT 10.

SELECT O1.name AS tclass, O2.name AS torder, O4.name AS tfamily, O5.name AS tgenus, O6.name AS tspecies
FROM taxonomic_units1 AS O1
LEFT OUTER JOIN
taxonomic_units1 AS O2
ON O1.tsn = O2.parent_tsn
LEFT OUTER JOIN
taxonomic_units1 AS O3
ON O2.tsn = O3.parent_tsn
LEFT OUTER JOIN
taxonomic_units1 AS O4
ON O3.tsn = O4.parent_tsn
LEFT OUTER JOIN
taxonomic_units1 AS O5
ON O4.tsn = O5.parent_tsn
LEFT OUTER JOIN
taxonomic_units1 AS O6
ON O5.tsn = O6.parent_tsn
LIMIT 10;

+---------------+-----------------+------------------+-------------- +-----------------------------------------------------------+ | tclass | torder | tfamily | tgenus | tspecies | +---------------+-----------------+------------------+-------------- +-----------------------------------------------------------+ | Bacteria | NULL | NULL | NULL | NULL | | Schizomycetes | Archangiaceae | NULL | NULL | NULL | | Schizomycetes | Pseudomonadales | NULL | NULL | NULL | | Schizomycetes | Pseudomonadales | Nitrobacteraceae | Nitrobacter | Nitrobacteragilis | | Schizomycetes | Pseudomonadales | Nitrobacteraceae | Nitrobacter | Nitrobacterflavus | | Schizomycetes | Pseudomonadales | Nitrobacteraceae | Nitrobacter | Nitrobacteroligotrophis | | Schizomycetes | Pseudomonadales | Nitrobacteraceae | Nitrobacter | Nitrobacterpolytrophus | | Schizomycetes | Pseudomonadales | Nitrobacteraceae | Nitrobacter | Nitrobacterpunctata |


I have checked this query with EXPLAIN, and it is not using any indices, even though column tsn is set as index in original table.

+----+-------------+-------+------+---------------+------+--------- +------+--------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+--------- +------+--------+-------+ | 1 | SIMPLE | O1 | ALL | NULL | NULL | NULL | NULL | 483305 | | | 1 | SIMPLE | O2 | ALL | NULL | NULL | NULL | NULL | 483305 | | | 1 | SIMPLE | O3 | ALL | NULL | NULL | NULL | NULL | 483305 | | | 1 | SIMPLE | O4 | ALL | NULL | NULL | NULL | NULL | 483305 | | | 1 | SIMPLE | O5 | ALL | NULL | NULL | NULL | NULL | 483305 | | | 1 | SIMPLE | O6 | ALL | NULL | NULL | NULL | NULL | 483305 | | +----+-------------+-------+------+---------------+------+--------- +------+--------+-------+
6 rows in set (0.00 sec)


What is wrong with this query? Or is it a problem of all adjacency list models?
Is there a way to get columns indexed using self-joins?

Thanks,
Olga

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org

Reply via email to