Re: Derby architecture/design documents

Mike Matrigali 25 Jan 2005 19:48:05 -0000

Some information to expand on the topics below (not all of them, but a
start).  Can I suggest you start to maintain some sort of TODO list for
gathering information?  I will update it as I get time:

container/conglomerate/table - This terminology comes from the original
modular design of the system.  The store was really 2 modules: the
lower level raw store and then the upper level access.

Raw store uses containers to store rows.  Currently these containers
always map to a single file in the seg0 directory of the database.

Access provides conglomerates as the interface to its clients for
storing rows.  Currently there is a one to one mapping between a
conglomerate and a container.

The language level implements SQL tables and indexes using Access
provided Conglomerates.
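
As a rough illustration of the layering just described, here is a small
Java sketch.  None of these class or method names are Derby's, and the
file name format is made up for the example; it only shows a SQL table
or index sitting on a conglomerate, which in turn sits on exactly one
container backed by one file in seg0.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: none of these class or method names are Derby's.
final class StoreLayersSketch {

    // Raw store level: a container stores rows and maps to one file in seg0.
    static final class Container {
        final long id;
        final String fileName;
        Container(long id, String fileName) { this.id = id; this.fileName = fileName; }
    }

    // Access level: the interface handed to clients; one-to-one with a container today.
    static final class Conglomerate {
        final Container container;
        Conglomerate(Container container) { this.container = container; }
    }

    // Language level: each SQL table or index is implemented on a conglomerate.
    private final Map<String, Conglomerate> sqlObjects = new HashMap<>();
    private long nextContainerId = 1;

    Conglomerate create(String tableOrIndexName) {
        long id = nextContainerId++;
        // The file name format is made up for the example.
        Container container = new Container(id, "seg0/container-" + id);
        Conglomerate conglomerate = new Conglomerate(container);
        sqlObjects.put(tableOrIndexName, conglomerate);
        return conglomerate;
    }

    Conglomerate lookup(String tableOrIndexName) {
        return sqlObjects.get(tableOrIndexName);
    }

    public static void main(String[] args) {
        StoreLayersSketch db = new StoreLayersSketch();
        db.create("ORDERS");
        db.create("ORDERS_PK");
        System.out.println(db.lookup("ORDERS").container.fileName);
        System.out.println(db.lookup("ORDERS_PK").container.fileName);
    }
}
```

A table and its index each get their own conglomerate here, and so
their own container and file.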

The layers leave room for a future decision if Derby is ever to
support tables and/or indexes spread across multiple disks.  The
implementation could either happen at the language layer, by spreading
the table across multiple conglomerates, or it could happen at the store
level, by spreading it across multiple containers.  My opinion is that
if the table is fragmented by key, then language should do it, as it best
understands the keys and indexes involved.  If a raw partitioning of
the data is desired, then it would be best done in the store (but I
actually think that a raw partitioning is best done below the database
itself, by the OS or hardware).

Latches - Latches are used in Derby.  They are short term locks on a
page in the buffer cache, requested any time the raw store needs to
read or write information on the page.  They are never held during any
operation which can wait (like an I/O or a lock request).  Latches are
implemented using the Derby lock manager.
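
A minimal sketch of the latch discipline, assuming a hypothetical Page
class.  Derby gets its latches from its own lock manager; here
java.util.concurrent's ReentrantLock only stands in for that, and the
row counter stands in for whatever information lives on the page.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; ReentrantLock stands in for a latch from the lock manager.
final class LatchSketch {

    static final class Page {
        private final ReentrantLock latch = new ReentrantLock();
        private int rowCount; // stands in for information stored on the page

        // Latch, touch the page, unlatch.  Nothing in here may wait on I/O or a lock.
        void addRow() {
            latch.lock();
            try {
                rowCount++;
            } finally {
                latch.unlock();
            }
        }

        int rowCount() {
            latch.lock();
            try {
                return rowCount;
            } finally {
                latch.unlock();
            }
        }
    }

    // Two threads update the same page; the latch keeps each update atomic.
    static int runDemo() {
        Page page = new Page();
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                page.addRow();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return page.rowCount();
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

The point of the try/finally shape is that the latch covers only the
in-memory read or write and is released before anything that could
wait.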

page, row and field formats - Check out the comments at the top of
java/engine/org/apache/derby/impl/store/raw/data/StoredPage.java and
StoredFieldHeader.java; let me know if you need more info.

Handling of large rows -
The terminology used in raw store is long rows and long columns.  A
column is long if it can't fit on a single page.  A row is long if all
of its columns can't fit on a single page.  A long column is marked as
long in the base row, and its field contains a pointer to a chain of
other rows in the same container which contain the data of the row.  Each
of the subsequent rows is on a page to itself.  Each subsequent row,
except for the last piece, has 2 columns: the first is the next segment
of the row and the second is the pointer to the following segment.
The last segment only has the data segment.
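
The chain just described can be sketched in Java as follows.  This is
only an illustration: the class and method names are hypothetical, a
list index plays the role of the row pointer, and segmentSize stands
in for however much data fits on one page.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a long-column chain; not Derby's on-disk format.
final class LongColumnSketch {
    static final int NO_NEXT = -1;

    // One overflow row: a data segment plus, except in the last piece, a next pointer.
    static final class Segment {
        final byte[] data;
        final int next;
        Segment(byte[] data, int next) { this.data = data; this.next = next; }
    }

    // Stand-in for a container; the list index plays the role of a row pointer.
    final List<Segment> rows = new ArrayList<>();

    // Stores the value as a chain and returns the pointer kept in the base row's field.
    int store(byte[] value, int segmentSize) {
        int next = NO_NEXT;
        // Build back to front so each piece can record where the following piece lives.
        int lastStart = ((value.length - 1) / segmentSize) * segmentSize;
        for (int start = lastStart; start >= 0; start -= segmentSize) {
            int end = Math.min(start + segmentSize, value.length);
            rows.add(new Segment(Arrays.copyOfRange(value, start, end), next));
            next = rows.size() - 1;
        }
        return next;
    }

    // Follows the chain from the base row's pointer and reassembles the value.
    byte[] read(int head) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int p = head; p != NO_NEXT; p = rows.get(p).next) {
            byte[] d = rows.get(p).data;
            out.write(d, 0, d.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        LongColumnSketch container = new LongColumnSketch();
        byte[] value = new byte[10];
        int head = container.store(value, 4);
        System.out.println(container.rows.size() + " segments, "
                + container.read(head).length + " bytes read back");
    }
}
```

Only the last segment carries NO_NEXT, which mirrors the last piece
having just the data column.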

Similarly for a long row, the segment of the row which fits on the page
is left there, and a pointer column is added at the end of the row.  It
points to another row in the same container on a different page.  That
row will contain the next set of columns and a continuation pointer if
necessary.  The overflow portion will be on an "overflow" page, and that
page may have overflow portions of other rows on it (unlike overflow
columns).
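
A similar sketch for a long row, again with hypothetical names.  It
assumes a simple byte budget per row piece and ignores the fact that an
overflow page can hold pieces of several rows; it only shows the
columns that fit staying in one piece, followed by a continuation
pointer to the next piece.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of splitting a long row into pieces; not Derby's code.
final class LongRowSketch {

    // The part of a row stored on one page: some columns, then an optional pointer.
    static final class RowPiece {
        final List<byte[]> columns = new ArrayList<>();
        RowPiece continuation; // null when the row ends in this piece
    }

    // Splits a row so each piece carries at most maxBytes of column data.
    // Assumes every single column fits in maxBytes (otherwise it is a long column).
    static RowPiece split(List<byte[]> row, int maxBytes) {
        RowPiece head = new RowPiece();
        RowPiece current = head;
        int used = 0;
        for (byte[] column : row) {
            if (column.length > maxBytes) {
                throw new IllegalArgumentException("long column, not handled here");
            }
            if (used + column.length > maxBytes && !current.columns.isEmpty()) {
                current.continuation = new RowPiece();
                current = current.continuation;
                used = 0;
            }
            current.columns.add(column);
            used += column.length;
        }
        return head;
    }

    static int pieces(RowPiece head) {
        int n = 0;
        for (RowPiece p = head; p != null; p = p.continuation) {
            n++;
        }
        return n;
    }

    // Follows the continuation pointers and returns the columns in order.
    static List<byte[]> readAll(RowPiece head) {
        List<byte[]> out = new ArrayList<>();
        for (RowPiece p = head; p != null; p = p.continuation) {
            out.addAll(p.columns);
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> row = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            row.add(new byte[3]);
        }
        RowPiece head = split(row, 7);
        System.out.println(pieces(head) + " pieces, " + readAll(head).size() + " columns");
    }
}
```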

Dibyendu Majumdar wrote:

> Here are some ideas on how I would like to structure the documentation.
> 
> 
> For every topic, I'd like to create:
> 
> Introduction to concepts - A general introduction to the topic.
> Derby implementation details - This will be the main substance of the
> document, where I will describe how Derby works.
> References - these will point to published papers, books, etc. that discuss
> the topic in question.
> 
> In terms of topics, here is what I have come up with (let me know if things
> should be added):
> 
> Terms - this will be a glossary of Derby specific terms so that I don't have
> to keep explaining the same terms in every document.
> 
> Row ID - How rows are identified - RecordId and SlotId.
> 
> Row management within a page - storage format of rows, slot table, etc.
> 
> Handling of large rows - how does Derby handle rows that won't fit into one
> page.
> 
> Container - what is a container?
> 
> Space management in Containers - how is space management implemented? How
> does Derby locate an empty page?
> 
> Latches - are latches used by Derby? How are they implemented?
> 
> Lock management - description of lock management, lock conflict resolution,
> deadlocks, lock escalation.
> 
> Buffer cache - how does Derby implement the buffer cache, and what is
> interaction between Buffer cache and Log, and Buffer cache and Transaction
> manager.
> 
> Write ahead log - description of how the log is implemented - this would
> mainly cover the format of log records, how log records are represented in
> memory and in disk, log archiving, checkpointing, etc.
> 
> Transactions - how is an Xid allocated? What is the representation of a
> transaction? How is a transaction related to a thread?
> 
> Transaction Manager - description of how Derby implements ARIES. What
> happens at system restart. How rollbacks and commits work. Different types
> of log records used by the transaction manager - such as do, redo, undo,
> compensation, checkpoint, etc.
> 
> Row locking in tables - how are rows locked? What happens when a row spans
> pages?
> 
> Row recovery - Do/Redo/Undo actions for rows - inserts, updates, deletes.
> 
> BTree - page organisation and structure
> 
> BTree - concurrency - how does Derby handle concurrent updates to the tree -
> inserts and deletes? How are structural changes serialised? Do updates block
> readers (not as in locking but while the change is being made) or can they
> progress concurrently?
> 
> BTree - locking - data row locking or index row locking? Is next-key
> locking used for serializable reads?
> 
> BTree - recovery - Do/Redo/Undo actions for key inserts, updates, deletes.
> 
> Row scans on tables - is this handled by "store"?
> 
> Row scans in BTrees - is this handled by "store"?
> 
> Conglomerates - what is a Conglomerate?
> 
