On Sun, Nov 01, 2009 at 01:23:52PM -0800, CityDev scratched on the wall:
>
> Darren Duncan wrote:
> >
> > Or at least it is in the version of the relational model
> > that allows non-scalar attribute values, but that is the one that Chris
> > Date et al, as well as myself ascribe to.
>
> I didn't read this through but I recall Chris Date defining a relational
> database as formed from relations normalised to 1NF. First Normal Form
> basically means no substructure ie each column is a single value from a
> scalar domain.
> To say it would be nice to have structured data is ignoring
> two things: i) we've got this for just fine without it, and ii) SQL is a
> first-order language so it can't easily handle substructure.
Let me address these two points first. As for "i", then please
explain to my why nearly every SQL database supports array values?
I'm not sure it is in the standard, but it is so common it might
as well be (for those wondering, no, SQLite does not support arrays).
The answer for "ii" is easy: SQL is not the Relational Model, so the
only point you might be making is that SQL makes for a poor relational
language.
Now back to your first paragraph. This is the more interesting one.
Executive Summary: Multi-valued attributes (column types)
are not allowed in the Relational Model. Relational-Valued
Attributes (tables in tables) are OK, however, because they
don't count.
Now a few pages on why....
The term used by Date (and Codd) is "atomic", not scalar. Of course,
this is generally meaningless, as it is very hard to define what
makes a value "atomic." Are text values atomic? The SQL standard
includes a number of functions to extract, match, and manipulate
sub-strings. What about integers? SQLite includes bit operations
that can be applied to integer values. Does that make them non-atomic?
In "Database In Depth", Date eventually just gives up trying to
define "atomic" in a formal sense, says it cannot and should not be
done. In his writings, Date points out that even atoms are not
"atomic," and can be broken down into quarks-- but that the idea of
atomic values is still a good and powerful one, even if it is inexact.
Given all that, it might come as a surprise that Codd and Date *do*
support the idea that RVA (Relation-Valued Attribute) fit into the
Relational Model. A few points on that. First, the whole concept
of 1NF is from the SQL world. To put a table into 1NF is basically
to make it mimic the requirements of a relation. Under the
Relational Model, it is basically impossible to have a valid
Relation that is not in 1NF. If it isn't in 1NF, it isn't a relation.
So why allow RVAs? Well, to start with, under the Relational Model
relations are atomic. Yes, really. Under the formal Model you're
not allowed to change or alter any sub-value of a relation. A
relational value is what it is, and the whole state of the relation
is what represents its value. By taking the product of the column
type domains, you can compute a known, finite domain for every possible
value for a given relation (relations don't allow duplicate rows,
so the number of rows is bound). That's typically a *very* large
domain, but since it is known and finite, it is a valid type domain.
As an extension of that idea, when you alter a column or row under
the Model you're not changing a sub-part of the relation, you're
replacing the whole relational value with a new value. INSERT
doesn't add a new row to an existing relation, it computes a new
relation value that is the UNION of the old relation and the new row.
Similar manipulations can apply the UPDATE and DELETE, but in all
cases you're not altering a relation, you're computing a new one.
So, in terms of the model that defines both what a relation is and
what it can contain, relations are atomic-- and from that RVAs are
naturally allowed.
At this point, if you're raising an eyebrow and thinking that sounds
like a pretty questionable slight of hand in the formal definition of
atomic values, I'd be quick to agree. But here's another point that
is, perhaps, more important: Any relation with a RVA can be
decomposed into a non-RVA relation. And, just as importantly, can
be re-composed back into the original RVA. That is, you can "round
trip" in and out of a RVA representation <IMPORTANT!>without any
loss of data or meaning</IMPORTANT!>. One can make a strong
argument that RVAs are just a different way to organize the data
"on paper", but in the end the addition (or removal) of RVA doesn't
alter what the Model is capable of representing or doing. This works
the same way that any static multi-dimensional array type can be
reduced to a one-dimensional array type without loss of functionality.
And it is *this* point that separates RVAs from storing arrays,
lists, or other "multi-value" containers as a row/column value.
Exactly why this is true would take pages and pages to explain, but
basically boils down to the idea that the only thing that can
flawlessly represent a relation without adding or losing state or
informational value is... (surprise!) another relation. On the
details of this you're just going to have to trust me.
Or, better yet: don't. Pick up a copy of "Database In Depth"
yourself. You're likely to look at databases and database design in
a whole new light.
-j
--
Jay A. Kreibich < J A Y @ K R E I B I.C H >
"Our opponent is an alien starship packed with atomic bombs. We have
a protractor." "I'll go home and see if I can scrounge up a ruler
and a piece of string." --from Anathem by Neal Stephenson
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users