Well, you pretty much guessed right on just about everything going through
my head.  I am brand new to database programming and have been trying to
learn
just enough to accomplish what I want to do within my own app.

The app I'm working on has evolved over the years into something that could
now
benefit from something more sophisticated than text file data storage.  Your
explanations
were EXTREMELY helpful.  If anything it gives me the confidence that I'm
going down
the right road.  If not for the great help from everyone in this thread I
might have abandoned
this effort altogether.  My developer instincts were telling me that I was
missing
something important or asking the wrong questions.

Thanks again for all the help.

P.S. your developer vs. database perspective should be a sticky or FAQ for
other newbies on this mailing list.


Mark

On Sat, Oct 31, 2009 at 9:32 PM, Jay A. Kreibich <[email protected]> wrote:

>
> On Sat, Oct 31, 2009 at 01:13:31AM -0400, mark m scratched on the wall:
> > O.K.  I think I am starting to get the idea.  It is just so foreign for
> me
> > to organize things this way. A master work history table for all cases
> > almost seems confusing.  It will just take a bit of adjustment for me
> > to "trust" the database way of doing things.  Text files organized in
> > the way I described has always made it easy for me to figure things
> > out when there was a problem.
>
>   This is not uncommon.  If I had to guess, I'd say you're an experienced
>  applications developer, but are somewhat new to databases.  You're
>  applying your instincts and experience in designing and laying out
>  runtime-data structures and classes to your database design, but find
>  yourself stumbling over some of the minor conceptual differences in
>  this new environment.
>
>  The good news is that this is normal, and nearly everyone makes these
>  mistakes.  The great news is that I've found much of that experience
>  can be utilized and successfully applied to database design as soon
>  as you wrap your head around a few minor differences.
>
>  The biggest stumbling block-- and the one you've hit head-on here--
>  is to think of tables as *instances* of compound data structures.  They
>  look a lot like arrays or dynamic lists, so this is an easy mistake.
>
>  In your case, you need a bunch of records associated with some other
>  table row, so the instinct is to create a brand new table to fit that
>  need.  You'll end up with a bunch of tables with the exact same
>  type-signature (column/type/name pattern), but that's how instances
>  of data structures or classes work-- this is exactly what you'd do
>  if that table was some fancy C++ class that offered a dynamic list
>  of elements.  It also provides for a clear and direct route to get
>  from a main data record to a collection of related sub-members.
>
>  Unfortunately, in the database world, that's the wrong concept and,
>  as others have pointed out, this will lead to no end of problems.
>
>  Here is the first big rule, and it alone will get you pretty far:
>
>                Tables are data structures and classes.
>
>  If you would define a data structure or class to fit some specific
>  and well defined purpose, that's a table.  The .sql file that
>  defines your tables should map to the .h file in your head.
>
>  Don't think of tables themselves as multi-element data structures,
>  think of them as the data structures definitions.  It just happens that
>  in the database environment, every instance of a particular data
>  structure (i.e. a row) happens to be managed by a global instance pool.
>
>  In extension of that, the database world has no "contains a."   All
>  your compound structures are built from "references a."  And the
>  references that bind everything together are not pointers, but ID
>  values.
>
>  The other big stumbling block is that most of these references are
>  "backwards".  If you were building this as an application, you'd have
>  your main records, and each one of those would have a pointer off to
>  some dynamic list or array that held the log records.  In the database
>  world, the main record simply exists with some type of unique
>  identifier.  Rather than the main record pointing to the associated log
>  records, the log records point back and the main record they're
>  associated with.
>
>  This tends to make application developers uncomfortable.  If you had
>  a main record and wanted a list of all the log records associated
>  with it, you'd want to be able to de-reference some pointer or other
>  association directly to some container item, like an array or list of
>  log records.  This is generally what leads to the desire to build
>  multiple tables of the same type, because tables look a lot like
>  instances of a container item.  You're thinking of those tables as
>  *instances* of multi-element data structures, rather than a global
>  collection of all sub-instances of that type.
>
>  That's pretty natural.  After all, it would be an odd way to build an
>  application.  While many environments have global lists of objects
>  (e.g. static class variables that hold a list of every instance of
>  that particular class), you rarely use those lists for building
>  general associations.  In specific, if you were manipulating record X
>  and wanted a list of all the associated log records for X, you would
>  never consider putting a "I belong to data X" field into the log
>  record.  This would require scanning the global list of log records
>  to get from a main record to all the associated log records.  Not
>  only is that somewhat indirect, it sounds very inefficient.
>
>  But this is exactly what you do in the database world.  The main
>  difference being that this is exactly the kind of thing that
>  databases are very very good at-- both in terms of optimized searches
>  for specific records in a large collection (if properly indexed) as
>  well as powerful set manipulations that allow very complex
>  manipulation of large sets of data as single operations.
>
>  This second point is actually more important than many people first
>  realize, and that's fine.  But as you get better with databases
>  you'll start to think in doing operations to large sets of data
>  items, rather than direct manipulation of discrete items.  That comes
>  later, however, and for now it isn't a big limitation to just ignore it.
>
>
>
>  So... to make the jump from application development to database design,
>  I'd first start by more or less thinking things through the same way.
>  If you have years of experience designing data models and class trees,
>  there is no good reason to throw all that experience and understanding
>  out.  Use it to lay out your block diagrams and datatypes, but with a
>  few specific changes:
>
>  1) Tables are your data structure definitions.  If you would organize
>  something by defining a data-structure or class, that's a table in
>  your database.
>
>  2) Rather than pointers, you use Foreign Key ID values to bind
>  different data structures (tables) together.
>
>  3) Most of the "pointers" or references are backwards.  Rather than
>  a master record holding pointers to sub-records, in a database it is
>  the sub-records that have the Foreign Keys that point back to the
>  master record.
>
>
>
>  That will get you pretty far.  I could come up with some more points,
>  but that would already be getting into some pretty find details.  The
>  only other thing I might add is that there are some pretty standard
>  idioms in database design, like One-to-Many and Many-to-Many
>  relationships, that are worth a quick web-search and doing some quick
>  reading about.  Their use and purpose should be fairly obvious to the
>  traditional developer, and provides more reinforcement in bridging
>  your traditional programming experience to the database world.
>
>
>
>  I also realize that if there are any traditional database data
>  modelers out there (that are still reading), I know you're cringing
>  pretty hard right now.  At some point, if you want to become very
>  very good at database design, you do need to uncouple your database
>  thinking from your programming language thinking.  Personally I think
>  the bigger obstacle is "thinking in sets" but there are also a number
>  of subtle things in the database design where this is true.
>
>  That said, most of us on this list don't need to get that far--
>  especially most of us working with SQLite.  Many of us are developers
>  that have some nagging data problem that has gotten a bit out of
>  hand, and we found this easy, fast, drop-in database to replace our
>  ugly hacks with text files and out-of-control file formats.  I know
>  there are a few people out there doing some pretty heavy-lifting
>  work-- and I'm definitely not saying the folks around here aren't
>  "real" database programmers-- but I'm guessing a fair number of us are
>  SQLite users for secondary reasons (I'd like to hear if anyone thinks
>  I'm totally off on this one).  For the majority of those people, a
>  few basic rules and a few standard design idioms is all we need to
>  address our needs and get ou with our "real" work, quite thankful
>  that SQLite is so easy to integrate and makes so many ugly data
>  management issues just disappear.  I know that's where I am.
>
>
>  Anyways... I've gone on long enough.  Good luck with your design.
>  Think a bit, ask good questions, and hopefully we can all see
>  a different point of view and learn to see something new.
>
>   -j
>
> --
> Jay A. Kreibich < J A Y  @  K R E I B I.C H >
>
> "Our opponent is an alien starship packed with atomic bombs.  We have
>  a protractor."   "I'll go home and see if I can scrounge up a ruler
>  and a piece of string."  --from Anathem by Neal Stephenson
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to