Kreibich,

    You understood the problem perfectly. Thank you.

*>>I  would just suggest using a single composite path value, >>rather
than trying to break the path value down into >>individual components
and store those each in their own >>column.*

    I already with single column (/root/local/bin/create_db), but search
and listing is very difficult with single query. for ex: under root, lising
distinct folders and files.  So, I am planning to  redesign the db for
filepaths. I will go through Joe books and let you know.

Thanks in advance.

Thanks,
Durga.

On Mon, Jan 2, 2012 at 9:55 PM, Jay A. Kreibich <j...@kreibi.ch> wrote:

> On Mon, Jan 02, 2012 at 07:01:29AM +0000, Simon Slavin scratched on the
> wall:
> >
> > On 2 Jan 2012, at 5:25am, Durga D wrote:
> >
> > > "create table if not exists durtree  (id integer primary key
> autoincrement,
>
> [...]
>
> > Besides which, almost any schema where you have to number your columns is
> > a bad schema.  You should be able to hold the entire schema in your head
> > at one time.
>
>   In the sense of being a properly normalized and theoretically ideal
>  schema, I would agree, but the real world has a tendency of getting
>  in the way of theoretically ideals.
>
>  "Normalized until it hurts, denormalize until it works."
>
> > > I want to make c1 to c70 as unique with default null. But, I could not
> with
> > > above query. I can make c42 to c70 as default null  and c1 to c70 as
> > > unique. If I add default null to c41 and c40, it gets failed.
> >
> > The above does not make each of the c nodes unique.  It makes each
> > combination of c nodes unique: every single c would have to be identical
> > in two rows for SQLite to reject it as violating your UNIQUE constraint.
>
>   Exactly?  If he's looking to store a tree that allows direct access
>  to specific nodes, the numbered columns are essentially storing a path
>  through the tree.  Any single node can have multiple children (or
>  itself be a node), so the individual columns don't need to be unique.
>  The path as a whole, however, must be unique to represent a single node.
>
> > There's no way to make each node unique in the above schema because a
> > node could be in c65 in one row and c66 in another row.
>
>   I'm not following you.  Many types of trees allow leaf nodes at
>  different levels.
>
> > A data structure more used for storing trees would look something like
> this:
> >
> > CREATE TABLE IF NOT EXISTS durTreeNodes (
> >       treeNumber INTEGER,
> >       parent TEXT,
> >       nodes TEXT NOT NULL,
> >       UNIQUE (treeNumber, node))
> >
> > This allows for trees of any height to be held efficiently.  The parent
> > value of the root node could be null, or 'root' or something.
>
>   *Held* efficiently, perhaps-- but not accessed.  What you're
>  describing is the "Adjacency Model", which is the most basic way to
>  store tree-like structures in SQL.  It has some serious drawbacks,
>  however.  Standard SQL has no type of recursive or looping features,
>  meaning "walking" a tree to find a specific node, even when the full
>  path is known, takes multiple queries.  In specific, it takes as many
>  queries as the node is deep in the tree.  This makes it extremely
>  slow.  Oracle has some syntax short-cuts to deal with this, but they're
>  non-standard.
>
>  Your unique constraint is also radically different.  You're forcing
>  nodes to be unique across whole trees, while the original schema only
>  forces them to be unique if they have the same parent (which is
>  usually what most people want).
>
>  These compromises are are part of the reason there are so many different
>  models to store tree-based structures in an SQL database (or even in
>  many "NoSQL" systems, for that matter).  Each method has specific
>  advantages and disadvantages.  Some are compact to store, but slow to
>  access, others (like the "nested set") are quick to access, but very
>  expensive to modify.
>
>  Using a "path" based approach is definitely a valid approach.  Most of
>  the time it is done with a single column, however, using some type of
>  delimiter for the tree levels.  For example, have one column with the
>  value "/usr/local/bin" rather than {c1="usr", c2="local", c3="bin"}.
>  You can then use a UNIQUE constraint on just that one column.
>  Wild-card LIKE matches allow access to sub-trees.  Using a single
>  composite path value also eliminates a specific depth limit.
>
> > > objective: I am trying to store tree in a sqlite3 db depth of 70.  I
> need
> > > high performance when accessing any level of the tree.
> >
> > The above schema, with its very large number of columns, is not going
> > to be very fast.  To retrieve c70 from a row SQLite will need to count
> > through 70 columns before it can get to it.
>
>   That's hardly a big deal, especially if the common case is that most
>  of the columns are NULL.
>
> > If you want to retrieve an entire level of a tree at once, why not
> > store it that way ?
>
>   He's not asking about pulling an entire level, he's saying he wants
>  to pull any node, at any level, with a single query.  The Adjacency
>  Model (storing a "parent" value) doesn't allow this... the
>  application has to "walk" the tree, making one query to get through
>  each level of the tree.  A "path" model allows you to jump to any
>  node (at a known path) in a single query.
>
> > CREATE TABLE IF NOT EXISTS durTreeLevels (
> >       treeNumber INTEGER,
> >       levelNumber INTEGER,
> >       nodes TEXT,
> >       UNIQUE (treeNumber, levelNumber))
> >
> > In the 'nodes' column you put a list of nodes, separated by commas
> > or something.  This also allows trees of any height to be held
> efficiently.
>
>   Seriously?  That breaks First Normal Form.  Stuffing lists into a
>  single database value is a Database 101 no-no.
>
>  Not to mention your constraint only allows one descending node per level.
>  That's not a tree, it's a stick.
>
>
>
>  As I said, there are many different well understood and well tested
>  models for storing trees and other hierarchies in an SQL database.
>  I would suggest the OP do some web searches on hierarchies and SQL.
>  This is, for the most part, a solved problem.
>
>  Joe Celko also has a whole book on on the subject:
>
> http://www.amazon.com/Hierarchies-Smarties-Kaufmann-Management-Systems/dp/0123877334/
>
>  Although much of it is just a rehash and expansion of the material
>  covered in his main book (which I would strongly recommend for
>  any one doing any semi-serious SQL programming):
>  http://www.amazon.com/Joe-Celkos-SQL-Smarties-Fourth/dp/0123820227/
>
>  Overall, using a "path" type model may very well fit your needs, I
>  would just suggest using a single composite path value, rather than
>  trying to break the path value down into individual components and
>  store those each in their own column.
>
>    -j
>
> --
> Jay A. Kreibich < J A Y  @  K R E I B I.C H >
>
> "Intelligence is like underwear: it is important that you have it,
>  but showing it to the wrong people has the tendency to make them
>  feel uncomfortable." -- Angela Johnson
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to