The problem is actually a little more subtle than that: The basic
problem is not one of wasted space. Think of a much simpler
abstraction, the file. When it is initially created, it may or may
not be allocated as a contiguous block (we'll ignore issues such as
striping). What happens if you come along with vim and insert a
sentence in the middle of the file. If there is no room in an
allocated block, it will be necessary to "split" the file,
essentially breaking it up into blocks linked by pointers. This
storage model may be suitable for most files (e.g., executables that
are created once by the compiler), but not suitable for databases
that need to support inserts. What a B-tree does is provide a data
structure that minimizes the cost of this splitting. If you're
familiar with heaps, or just balanced binary trees, you've already
seen the basic idea. You can maintain a sorted list as binary tree,
so that average search time is approximately log n, rathe than n
(binary vs. linear search). But after many updates, the tree can
become unbalanced, in the extreme, just turning into another linear
list. There are various data structures like red-black trees that can
be used to keep the tree balanced by stopping to shift nodes around.
But that's not really practical for a database, either. What B-trees
do is generalize this approach by replacing binary trees with trees
in which a single node may have many children. This adds a fair
amount of complexity, but it really cuts down on the overhead, making
it practical to maintain much larger data structures without
sacrificing (too much) efficiency.
===
Gregory Woodhouse
[EMAIL PROTECTED]
"The whole of science is nothing more than a refinement
of everyday thinking." -- Albert Einstein
On Sep 19, 2005, at 11:56 AM, Kevin Toppenberg wrote:
Also, the database in M is called by some a "sparce array." This
means that there are no "blank spaces" left for data to be later
filled into. So with M, if there is no data present, then no space is
wasted. I find this to lead to many many fields being defined for a
given file. With a traditional database, having all these fields with
empty/wasted space, would lead to huge database files. But with M,
one can can store years of patient information on a relatively small
disk.
-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP. Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Hardhats-members mailing list
Hardhats-members@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/hardhats-members