Jef Driesen wrote:
I want to store a tree in an sqlite database. My first choice was the
adjacency list model:
CREATE TABLE tree (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT,
parent_id INTEGER
);
But this method requires multiple queries to display the entire tree
(or a subtree) in my GUI (a gtk+ treeview). Because childs can only be
added to the treeview if all its parents are already added.
But then I found some resources on the nested set model [1,2]:
CREATE TABLE tree (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT,
lft INTEGER,
rgt INTEGER
);
Retrieving a (sub)tree can be done with only one sql query, at the
expense of more complex queries to add or remove rows. Because all lft
and rgt values to the right of the node have to be modified.
[1] http://www.sitepoint.com/article/hierarchical-data-database
[2] http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
Jef,
I have found an augmented adjacency list which stores a path through the
tree to each node to be very effective. I posted a sample implementation
on the list previously at
http://article.gmane.org/gmane.comp.db.sqlite.general/17286/match=tree
(follow the nabble link for more context).
I start to understand this model, but I still have some questions
(especially Q3):
Q1. Which is more efficient? Two simple queries or one self join?
I have seen two different types of queries to retrieve a tree. The
first one uses two very simple queries:
SELECT lft, rgt FROM tree WHERE name = @name;
SELECT * FROM tree WHERE lft BETWEEN @lft AND @rgt ORDER BY lft ASC;
The first query is only required to retrieve the lft and rgt values of
the node. The other type uses a self join (which I assume is more
expensive), but no extra query is required:
SELECT node.*
FROM tree AS node, tree AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND parent.name = @name
ORDER BY node.lft;
Which type of query is more efficient? Retrieving the path to a node
is very similar:
SELECT * FROM tree WHERE lft <= @lft AND rgt >= @rgt ORDER BY lft ASC;
or
SELECT parent.*
FROM tree AS node, tree AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.name = @name
ORDER BY parent.lft;
There is probably not a lot of difference assuming your calling sqlite
from an efficient programming language. I ssupect the join may be
slightly faster, but you would have to measure both cases to find out
for sure.
Q2. Which indices should I use to make my queries more efficient?
I think your best bet would be a compound index on lft and rgt.
Q3. How do I move a node (or subtree)?
In the adjacency list model, this is extremely easy by pointing the
parent_id to another node. But I don't know how to do that in the
nested set model.
I have no idea.
Q4. sqlite parameter binding for multiple queries?
For some operations (like deleting a node) I need multiple queries:
DELETE FROM tree WHERE lft BETWEEN @lft AND @rgt;
UPDATE tree SET rgt = rgt - (@rgt - @lft + 1) WHERE rgt > @rgt;
UPDATE tree SET lft = lft - (@rgt - @lft + 1) WHERE lft > @rgt;
and they all need the same parameters (@lft and @rgt). Do I have to
prepare each statement separately and bind the parameters every time?
Or is it possible to bind the parameters only once (because the values
remain the same) and execute all the queries at once. I think this is
not possible, but I could be wrong.
You will need to bind the parameters to each prepared statement.
HTH
Dennis Cote
-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------