[sqlite] Re: Some questions on hierarchical data (nested set model)

Jef Driesen Tue, 10 Apr 2007 05:56:18 -0700

Dennis Cote wrote:

Jef Driesen wrote:
I want to store a tree in an sqlite database. My first choice was theadjacency list model:
CREATE TABLE tree (
   id INTEGER PRIMARY KEY AUTOINCREMENT,
   name TEXT,
   parent_id INTEGER
);
But this method requires multiple queries to display the entire tree(or a subtree) in my GUI (a gtk+ treeview). Because childs can only beadded to the treeview if all its parents are already added.
But then I found some resources on the nested set model [1,2]:

CREATE TABLE tree (
   id INTEGER PRIMARY KEY AUTOINCREMENT,
   name TEXT,
   lft INTEGER,
   rgt INTEGER
);
Retrieving a (sub)tree can be done with only one sql query, at theexpense of more complex queries to add or remove rows. Because all lftand rgt values to the right of the node have to be modified.
[1] http://www.sitepoint.com/article/hierarchical-data-database
[2] http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
I have found an augmented adjacency list which stores a path through thetree to each node to be very effective. I posted a sample implementationon the list previously athttp://article.gmane.org/gmane.comp.db.sqlite.general/17286/match=tree(follow the nabble link for more context).


I'll take a look at it...

I start to understand this model, but I still have some questions(especially Q3):
Q1. Which is more efficient? Two simple queries or one self join?
I have seen two different types of queries to retrieve a tree. Thefirst one uses two very simple queries:
SELECT lft, rgt FROM tree WHERE name = @name;
SELECT * FROM tree WHERE lft BETWEEN @lft AND @rgt ORDER BY lft ASC;
The first query is only required to retrieve the lft and rgt values ofthe node. The other type uses a self join (which I assume is moreexpensive), but no extra query is required:
SELECT node.*
FROM tree AS node, tree AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND parent.name = @name
ORDER BY node.lft;
Which type of query is more efficient? Retrieving the path to a nodeis very similar:
SELECT * FROM tree WHERE lft <= @lft AND rgt >= @rgt ORDER BY lft ASC;

or

SELECT parent.*
FROM tree AS node, tree AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.name = @name
ORDER BY parent.lft;
There is probably not a lot of difference assuming your calling sqlitefrom an efficient programming language. I ssupect the join may beslightly faster, but you would have to measure both cases to find outfor sure.


I'm using the C/C++ interface.

I know I have to measure to be really sure, but since I'm not very
experienced with SQL, I have no idea which types of queries are usually
faster than others. My first thought was that selecting a subset of a
table (1st SQL statement) would be more simple and thus faster than
doing the join (2nd SQL statement). But I could be wrong of course.

Q2. Which indices should I use to make my queries more efficient?


I think your best bet would be a compound index on lft and rgt.


Why would a compound index be better than a single index? I thought a
compound index is mainly used to order by the second column (rgt) only
when the first column (lft) has duplicates. But the lft (and rgt)
columns are unique in the nested set model.

Q3. How do I move a node (or subtree)?
In the adjacency list model, this is extremely easy by pointing theparent_id to another node. But I don't know how to do that in thenested set model.
I have no idea.


I think I found a way to do it. But it's not finished yet.

Q4. sqlite parameter binding for multiple queries?

For some operations (like deleting a node) I need multiple queries:

DELETE FROM tree WHERE lft BETWEEN @lft AND @rgt;
UPDATE tree SET rgt = rgt - (@rgt - @lft + 1) WHERE rgt > @rgt;
UPDATE tree SET lft = lft - (@rgt - @lft + 1) WHERE lft > @rgt;
and they all need the same parameters (@lft and @rgt). Do I have toprepare each statement separately and bind the parameters every time?Or is it possible to bind the parameters only once (because the valuesremain the same) and execute all the queries at once. I think this isnot possible, but I could be wrong.
You will need to bind the parameters to each prepared statement.


Can I use sqlite3_transfer_bindings here? The website only mentions the
case of preparing the same SQL multiple times. But my SQL statements are
different.




-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

[sqlite] Re: Some questions on hierarchical data (nested set model)

Reply via email to