Kreibich,
You understood the problem perfectly. Thank you.
*>>I would just suggest using a single composite path value, >>rather
than trying to break the path value down into >>individual components
and store those each in their own >>column.*
I already with single column (/root/local/bin/create_db), but search
and listing is very difficult with single query. for ex: under root, lising
distinct folders and files. So, I am planning to redesign the db for
filepaths. I will go through Joe books and let you know.
Thanks in advance.
Thanks,
Durga.
On Mon, Jan 2, 2012 at 9:55 PM, Jay A. Kreibich <[email protected]> wrote:
> On Mon, Jan 02, 2012 at 07:01:29AM +0000, Simon Slavin scratched on the
> wall:
> >
> > On 2 Jan 2012, at 5:25am, Durga D wrote:
> >
> > > "create table if not exists durtree (id integer primary key
> autoincrement,
>
> [...]
>
> > Besides which, almost any schema where you have to number your columns is
> > a bad schema. You should be able to hold the entire schema in your head
> > at one time.
>
> In the sense of being a properly normalized and theoretically ideal
> schema, I would agree, but the real world has a tendency of getting
> in the way of theoretically ideals.
>
> "Normalized until it hurts, denormalize until it works."
>
> > > I want to make c1 to c70 as unique with default null. But, I could not
> with
> > > above query. I can make c42 to c70 as default null and c1 to c70 as
> > > unique. If I add default null to c41 and c40, it gets failed.
> >
> > The above does not make each of the c nodes unique. It makes each
> > combination of c nodes unique: every single c would have to be identical
> > in two rows for SQLite to reject it as violating your UNIQUE constraint.
>
> Exactly? If he's looking to store a tree that allows direct access
> to specific nodes, the numbered columns are essentially storing a path
> through the tree. Any single node can have multiple children (or
> itself be a node), so the individual columns don't need to be unique.
> The path as a whole, however, must be unique to represent a single node.
>
> > There's no way to make each node unique in the above schema because a
> > node could be in c65 in one row and c66 in another row.
>
> I'm not following you. Many types of trees allow leaf nodes at
> different levels.
>
> > A data structure more used for storing trees would look something like
> this:
> >
> > CREATE TABLE IF NOT EXISTS durTreeNodes (
> > treeNumber INTEGER,
> > parent TEXT,
> > nodes TEXT NOT NULL,
> > UNIQUE (treeNumber, node))
> >
> > This allows for trees of any height to be held efficiently. The parent
> > value of the root node could be null, or 'root' or something.
>
> *Held* efficiently, perhaps-- but not accessed. What you're
> describing is the "Adjacency Model", which is the most basic way to
> store tree-like structures in SQL. It has some serious drawbacks,
> however. Standard SQL has no type of recursive or looping features,
> meaning "walking" a tree to find a specific node, even when the full
> path is known, takes multiple queries. In specific, it takes as many
> queries as the node is deep in the tree. This makes it extremely
> slow. Oracle has some syntax short-cuts to deal with this, but they're
> non-standard.
>
> Your unique constraint is also radically different. You're forcing
> nodes to be unique across whole trees, while the original schema only
> forces them to be unique if they have the same parent (which is
> usually what most people want).
>
> These compromises are are part of the reason there are so many different
> models to store tree-based structures in an SQL database (or even in
> many "NoSQL" systems, for that matter). Each method has specific
> advantages and disadvantages. Some are compact to store, but slow to
> access, others (like the "nested set") are quick to access, but very
> expensive to modify.
>
> Using a "path" based approach is definitely a valid approach. Most of
> the time it is done with a single column, however, using some type of
> delimiter for the tree levels. For example, have one column with the
> value "/usr/local/bin" rather than {c1="usr", c2="local", c3="bin"}.
> You can then use a UNIQUE constraint on just that one column.
> Wild-card LIKE matches allow access to sub-trees. Using a single
> composite path value also eliminates a specific depth limit.
>
> > > objective: I am trying to store tree in a sqlite3 db depth of 70. I
> need
> > > high performance when accessing any level of the tree.
> >
> > The above schema, with its very large number of columns, is not going
> > to be very fast. To retrieve c70 from a row SQLite will need to count
> > through 70 columns before it can get to it.
>
> That's hardly a big deal, especially if the common case is that most
> of the columns are NULL.
>
> > If you want to retrieve an entire level of a tree at once, why not
> > store it that way ?
>
> He's not asking about pulling an entire level, he's saying he wants
> to pull any node, at any level, with a single query. The Adjacency
> Model (storing a "parent" value) doesn't allow this... the
> application has to "walk" the tree, making one query to get through
> each level of the tree. A "path" model allows you to jump to any
> node (at a known path) in a single query.
>
> > CREATE TABLE IF NOT EXISTS durTreeLevels (
> > treeNumber INTEGER,
> > levelNumber INTEGER,
> > nodes TEXT,
> > UNIQUE (treeNumber, levelNumber))
> >
> > In the 'nodes' column you put a list of nodes, separated by commas
> > or something. This also allows trees of any height to be held
> efficiently.
>
> Seriously? That breaks First Normal Form. Stuffing lists into a
> single database value is a Database 101 no-no.
>
> Not to mention your constraint only allows one descending node per level.
> That's not a tree, it's a stick.
>
>
>
> As I said, there are many different well understood and well tested
> models for storing trees and other hierarchies in an SQL database.
> I would suggest the OP do some web searches on hierarchies and SQL.
> This is, for the most part, a solved problem.
>
> Joe Celko also has a whole book on on the subject:
>
> http://www.amazon.com/Hierarchies-Smarties-Kaufmann-Management-Systems/dp/0123877334/
>
> Although much of it is just a rehash and expansion of the material
> covered in his main book (which I would strongly recommend for
> any one doing any semi-serious SQL programming):
> http://www.amazon.com/Joe-Celkos-SQL-Smarties-Fourth/dp/0123820227/
>
> Overall, using a "path" type model may very well fit your needs, I
> would just suggest using a single composite path value, rather than
> trying to break the path value down into individual components and
> store those each in their own column.
>
> -j
>
> --
> Jay A. Kreibich < J A Y @ K R E I B I.C H >
>
> "Intelligence is like underwear: it is important that you have it,
> but showing it to the wrong people has the tendency to make them
> feel uncomfortable." -- Angela Johnson
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users