The problem is actually a little more subtle than that: The basic problem is not one of wasted space. Think of a much simpler abstraction, the file. When it is initially created, it may or may not be allocated as a contiguous block (we'll ignore issues such as striping). What happens if you come along with vim and insert a sentence in the middle of the file. If there is no room in an allocated block, it will be necessary to "split" the file, essentially breaking it up into blocks linked by pointers. This storage model may be suitable for most files (e.g., executables that are created once by the compiler), but not suitable for databases that need to support inserts. What a B-tree does is provide a data structure that minimizes the cost of this splitting. If you're familiar with heaps, or just balanced binary trees, you've already seen the basic idea. You can maintain a sorted list as binary tree, so that average search time is approximately log n, rathe than n (binary vs. linear search). But after many updates, the tree can become unbalanced, in the extreme, just turning into another linear list. There are various data structures like red-black trees that can be used to keep the tree balanced by stopping to shift nodes around. But that's not really practical for a database, either. What B-trees do is generalize this approach by replacing binary trees with trees in which a single node may have many children. This adds a fair amount of complexity, but it really cuts down on the overhead, making it practical to maintain much larger data structures without sacrificing (too much) efficiency.

===
Gregory Woodhouse
[EMAIL PROTECTED]

"The whole of science is nothing more than a refinement
 of everyday thinking."  -- Albert Einstein


On Sep 19, 2005, at 11:56 AM, Kevin Toppenberg wrote:

Also, the database in M is called by some a "sparce array."  This
means that there are no "blank spaces" left for data to be later
filled into.  So with M, if there is no data present, then no space is
wasted.  I find this to lead to many many fields being defined for a
given file.  With a traditional database, having all these fields with
empty/wasted space, would lead to huge database files.  But with M,
one can can store years of patient information on a relatively small
disk.




-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Hardhats-members mailing list
Hardhats-members@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/hardhats-members

Reply via email to