Kreibich, You understood the problem perfectly. Thank you.
*>>I would just suggest using a single composite path value, >>rather than trying to break the path value down into >>individual components and store those each in their own >>column.* I already with single column (/root/local/bin/create_db), but search and listing is very difficult with single query. for ex: under root, lising distinct folders and files. So, I am planning to redesign the db for filepaths. I will go through Joe books and let you know. Thanks in advance. Thanks, Durga. On Mon, Jan 2, 2012 at 9:55 PM, Jay A. Kreibich <j...@kreibi.ch> wrote: > On Mon, Jan 02, 2012 at 07:01:29AM +0000, Simon Slavin scratched on the > wall: > > > > On 2 Jan 2012, at 5:25am, Durga D wrote: > > > > > "create table if not exists durtree (id integer primary key > autoincrement, > > [...] > > > Besides which, almost any schema where you have to number your columns is > > a bad schema. You should be able to hold the entire schema in your head > > at one time. > > In the sense of being a properly normalized and theoretically ideal > schema, I would agree, but the real world has a tendency of getting > in the way of theoretically ideals. > > "Normalized until it hurts, denormalize until it works." > > > > I want to make c1 to c70 as unique with default null. But, I could not > with > > > above query. I can make c42 to c70 as default null and c1 to c70 as > > > unique. If I add default null to c41 and c40, it gets failed. > > > > The above does not make each of the c nodes unique. It makes each > > combination of c nodes unique: every single c would have to be identical > > in two rows for SQLite to reject it as violating your UNIQUE constraint. > > Exactly? If he's looking to store a tree that allows direct access > to specific nodes, the numbered columns are essentially storing a path > through the tree. Any single node can have multiple children (or > itself be a node), so the individual columns don't need to be unique. > The path as a whole, however, must be unique to represent a single node. > > > There's no way to make each node unique in the above schema because a > > node could be in c65 in one row and c66 in another row. > > I'm not following you. Many types of trees allow leaf nodes at > different levels. > > > A data structure more used for storing trees would look something like > this: > > > > CREATE TABLE IF NOT EXISTS durTreeNodes ( > > treeNumber INTEGER, > > parent TEXT, > > nodes TEXT NOT NULL, > > UNIQUE (treeNumber, node)) > > > > This allows for trees of any height to be held efficiently. The parent > > value of the root node could be null, or 'root' or something. > > *Held* efficiently, perhaps-- but not accessed. What you're > describing is the "Adjacency Model", which is the most basic way to > store tree-like structures in SQL. It has some serious drawbacks, > however. Standard SQL has no type of recursive or looping features, > meaning "walking" a tree to find a specific node, even when the full > path is known, takes multiple queries. In specific, it takes as many > queries as the node is deep in the tree. This makes it extremely > slow. Oracle has some syntax short-cuts to deal with this, but they're > non-standard. > > Your unique constraint is also radically different. You're forcing > nodes to be unique across whole trees, while the original schema only > forces them to be unique if they have the same parent (which is > usually what most people want). > > These compromises are are part of the reason there are so many different > models to store tree-based structures in an SQL database (or even in > many "NoSQL" systems, for that matter). Each method has specific > advantages and disadvantages. Some are compact to store, but slow to > access, others (like the "nested set") are quick to access, but very > expensive to modify. > > Using a "path" based approach is definitely a valid approach. Most of > the time it is done with a single column, however, using some type of > delimiter for the tree levels. For example, have one column with the > value "/usr/local/bin" rather than {c1="usr", c2="local", c3="bin"}. > You can then use a UNIQUE constraint on just that one column. > Wild-card LIKE matches allow access to sub-trees. Using a single > composite path value also eliminates a specific depth limit. > > > > objective: I am trying to store tree in a sqlite3 db depth of 70. I > need > > > high performance when accessing any level of the tree. > > > > The above schema, with its very large number of columns, is not going > > to be very fast. To retrieve c70 from a row SQLite will need to count > > through 70 columns before it can get to it. > > That's hardly a big deal, especially if the common case is that most > of the columns are NULL. > > > If you want to retrieve an entire level of a tree at once, why not > > store it that way ? > > He's not asking about pulling an entire level, he's saying he wants > to pull any node, at any level, with a single query. The Adjacency > Model (storing a "parent" value) doesn't allow this... the > application has to "walk" the tree, making one query to get through > each level of the tree. A "path" model allows you to jump to any > node (at a known path) in a single query. > > > CREATE TABLE IF NOT EXISTS durTreeLevels ( > > treeNumber INTEGER, > > levelNumber INTEGER, > > nodes TEXT, > > UNIQUE (treeNumber, levelNumber)) > > > > In the 'nodes' column you put a list of nodes, separated by commas > > or something. This also allows trees of any height to be held > efficiently. > > Seriously? That breaks First Normal Form. Stuffing lists into a > single database value is a Database 101 no-no. > > Not to mention your constraint only allows one descending node per level. > That's not a tree, it's a stick. > > > > As I said, there are many different well understood and well tested > models for storing trees and other hierarchies in an SQL database. > I would suggest the OP do some web searches on hierarchies and SQL. > This is, for the most part, a solved problem. > > Joe Celko also has a whole book on on the subject: > > http://www.amazon.com/Hierarchies-Smarties-Kaufmann-Management-Systems/dp/0123877334/ > > Although much of it is just a rehash and expansion of the material > covered in his main book (which I would strongly recommend for > any one doing any semi-serious SQL programming): > http://www.amazon.com/Joe-Celkos-SQL-Smarties-Fourth/dp/0123820227/ > > Overall, using a "path" type model may very well fit your needs, I > would just suggest using a single composite path value, rather than > trying to break the path value down into individual components and > store those each in their own column. > > -j > > -- > Jay A. Kreibich < J A Y @ K R E I B I.C H > > > "Intelligence is like underwear: it is important that you have it, > but showing it to the wrong people has the tendency to make them > feel uncomfortable." -- Angela Johnson > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users