Jef Driesen wrote:
I want to store a tree in an sqlite database. My first choice was the adjacency list model:

CREATE TABLE tree (
   id INTEGER PRIMARY KEY AUTOINCREMENT,
   name TEXT,
   parent_id INTEGER
);

But this method requires multiple queries to display the entire tree (or a subtree) in my GUI (a gtk+ treeview). Because childs can only be added to the treeview if all its parents are already added.

But then I found some resources on the nested set model [1,2]:

CREATE TABLE tree (
   id INTEGER PRIMARY KEY AUTOINCREMENT,
   name TEXT,
   lft INTEGER,
   rgt INTEGER
);

Retrieving a (sub)tree can be done with only one sql query, at the expense of more complex queries to add or remove rows. Because all lft and rgt values to the right of the node have to be modified.

[1] http://www.sitepoint.com/article/hierarchical-data-database
[2] http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
Jef,

I have found an augmented adjacency list which stores a path through the tree to each node to be very effective. I posted a sample implementation on the list previously at http://article.gmane.org/gmane.comp.db.sqlite.general/17286/match=tree (follow the nabble link for more context).

I start to understand this model, but I still have some questions (especially Q3):

Q1. Which is more efficient? Two simple queries or one self join?

I have seen two different types of queries to retrieve a tree. The first one uses two very simple queries:

SELECT lft, rgt FROM tree WHERE name = @name;
SELECT * FROM tree WHERE lft BETWEEN @lft AND @rgt ORDER BY lft ASC;

The first query is only required to retrieve the lft and rgt values of the node. The other type uses a self join (which I assume is more expensive), but no extra query is required:

SELECT node.*
FROM tree AS node, tree AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND parent.name = @name
ORDER BY node.lft;

Which type of query is more efficient? Retrieving the path to a node is very similar:

SELECT * FROM tree WHERE lft <= @lft AND rgt >= @rgt ORDER BY lft ASC;

or

SELECT parent.*
FROM tree AS node, tree AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.name = @name
ORDER BY parent.lft;
There is probably not a lot of difference assuming your calling sqlite from an efficient programming language. I ssupect the join may be slightly faster, but you would have to measure both cases to find out for sure.

Q2. Which indices should I use to make my queries more efficient?
I think your best bet would be a compound index on lft and rgt.

Q3. How do I move a node (or subtree)?

In the adjacency list model, this is extremely easy by pointing the parent_id to another node. But I don't know how to do that in the nested set model.
I have no idea.

Q4. sqlite parameter binding for multiple queries?

For some operations (like deleting a node) I need multiple queries:

DELETE FROM tree WHERE lft BETWEEN @lft AND @rgt;
UPDATE tree SET rgt = rgt - (@rgt - @lft + 1) WHERE rgt > @rgt;
UPDATE tree SET lft = lft - (@rgt - @lft + 1) WHERE lft > @rgt;

and they all need the same parameters (@lft and @rgt). Do I have to prepare each statement separately and bind the parameters every time? Or is it possible to bind the parameters only once (because the values remain the same) and execute all the queries at once. I think this is not possible, but I could be wrong.

You will need to bind the parameters to each prepared statement.

HTH
Dennis Cote

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Reply via email to