Title: Temporal or Time-Oriented Databases (was auditing tables)

The thread on auditing tables and the mention of temporal or time-oriented databases prompted me to ask about a design problem that has recurred on different projects for me.  Seeing as this problem has occurred multiple times I figured that other people have probably encountered the same and have different solutions.  I'm hoping this email might spark some discussion/debate on the merits of this and other solutions.

Essentially the problem is looking up historical data about an entity in a reliable and performant manner.  For example, paging (via the web) through a set of transactions that have occurred on an account.  Or finding the value of an account at a particular point in time, in order to reconcile the account.  I'm assuming that any solution would support the system running 24/7 because that has been a requirement of most of the projects I've worked on.

A number of designs that I have seen have used a time field as the key to selecting subsets of the transactions for generation of web pages or for the identification of a particular transaction.  I've found that relying on time for this purpose has the following consequences:

1) More than one transaction can occur within the same second, which makes it difficult or impossible to order the transactions correctly (assuming no other suitable key exists).  This applies to the discussion on auditing tables too.

2) Finding the next or previous batch of n rows is difficult because you have no idea what selection criteria to use.  Typically the from and to dates are set to encompass ALL the data and only the first n rows are fetched from a result set that contains more data than necessary.  I don't think this is necessarily a major problem so long as no sorting is involved to evaluate the result set.
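To make the two consequences concrete, here is a minimal sketch.  It uses Python's sqlite3 module purely as a stand-in for Oracle so that it can be run anywhere, and the AccountTxn table, its columns and the sample rows are all hypothetical.  It shows two transactions sharing a one-second timestamp, and how paging on "time greater than the last one seen" then never reaches one of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; txnTime has one-second resolution, like an Oracle DATE.
conn.execute(
    "create table AccountTxn (accountKey integer, txnTime text, amount integer)"
)
conn.executemany(
    "insert into AccountTxn values (?, ?, ?)",
    [
        (1, "2002-01-30 07:11:00", 100),
        (1, "2002-01-30 07:11:05", -40),  # these two rows share a timestamp,
        (1, "2002-01-30 07:11:05", 25),   # so time alone cannot order them
        (1, "2002-01-30 07:11:09", -10),
    ],
)

# Consequence 1: timestamps that identify more than one transaction.
ambiguous = conn.execute(
    "select txnTime, count(*) from AccountTxn "
    "where accountKey = 1 group by txnTime having count(*) > 1"
).fetchall()

# Consequence 2: page through the rows two at a time, keyed on time only.
PAGE_SIZE = 2
page1 = conn.execute(
    "select txnTime, amount from AccountTxn "
    "where accountKey = 1 order by txnTime limit ?",
    (PAGE_SIZE,),
).fetchall()
last_seen = page1[-1][0]  # the only "bookmark" a time-based design has

page2 = conn.execute(
    "select txnTime, amount from AccountTxn "
    "where accountKey = 1 and txnTime > ? order by txnTime limit ?",
    (last_seen, PAGE_SIZE),
).fetchall()

print("ambiguous timestamps:", ambiguous)
print("rows reached by paging:", len(page1) + len(page2), "of 4")
```

Changing the comparison to >= does not help: the row already shown on the first page comes back again.  Either way the application needs some other key to break the tie.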

I've found that a more general solution is to have two relations, one to represent the entity itself and the other to represent the history of changes to the entity.  For example,

create table Entity
(
        myKey             number(38),  -- primary key
        lastVersionNumber number(38),
        attribute1        number(38),  -- non-version dependent data
        constraint Entity_pk primary key (myKey)
);

create table EntityHistory
(
        myKey          number(38),  -- primary key (key of Entity)
        versionNumber  number(38),  -- primary key
        startTime      date,
        endTime        date,
        attribute2     number(38),  -- version dependent data
        attribute3     number(38),  -- version dependent data
        constraint EntityHistory_pk primary key (myKey, versionNumber)
);

The key to the EntityHistory table contains the key of Entity plus a version number that is allocated from Entity.lastVersionNumber.  startTime and endTime denote when the change was in effect, and both are needed to perform a time-based search effectively.  This design has the following consequences:

1) The versionNumber provides a definitive order of the EntityHistory records for each Entity (identified by myKey).

2) The application has to lock Entity.lastVersionNumber to log a change to the Entity.  This does not appear to add any additional serialisation to the application because the applications I have seen would have locked the row anyway.  For example, if Entity contained the current account balance, the application would have to lock the row in order to update the balance.

3) EntityHistory.versionNumber can be used to find the next n changes to Entity.  Similarly, finding the first, last or previous n changes to Entity is just as straightforward.  In this case only the data that is needed is returned within the result set.

4) The startTime and endTime can be used to find changes to an Entity based on time.

5) Data that is not version dependent can be stored in Entity so that it is not repeated unnecessarily.

6) Other relations that refer to a particular past version of Entity have a solid key to reference in EntityHistory, rather than relying on the time-based attributes.
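The points above can be sketched roughly as follows.  Again this uses Python's sqlite3 as a stand-in for Oracle, so the column types differ from the DDL above.  The helper names (record_change, next_versions, version_at) are hypothetical, and two conventions are my assumptions rather than part of the design as stated: endTime is left null while a version is current, and a version is in effect from startTime up to but not including endTime.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Entity (
    myKey             integer primary key,
    lastVersionNumber integer not null,
    attribute1        integer
);
create table EntityHistory (
    myKey         integer not null,
    versionNumber integer not null,
    startTime     text not null,
    endTime       text,          -- assumption: null while the version is current
    attribute2    integer,
    attribute3    integer,
    primary key (myKey, versionNumber)
);
""")

def record_change(conn, my_key, when, attribute2, attribute3):
    """Log one change as the next version (points 1 and 2)."""
    with conn:  # one transaction; in Oracle the Entity row would be locked here
        conn.execute(
            "update Entity set lastVersionNumber = lastVersionNumber + 1 "
            "where myKey = ?", (my_key,))
        (version,) = conn.execute(
            "select lastVersionNumber from Entity where myKey = ?",
            (my_key,)).fetchone()
        # Close the previous version, if there is one, then open the new one.
        conn.execute(
            "update EntityHistory set endTime = ? "
            "where myKey = ? and versionNumber = ?",
            (when, my_key, version - 1))
        conn.execute(
            "insert into EntityHistory values (?, ?, ?, null, ?, ?)",
            (my_key, version, when, attribute2, attribute3))
    return version

def next_versions(conn, my_key, after_version, n):
    """Fetch the next n changes after a known version (point 3)."""
    return conn.execute(
        "select versionNumber, attribute2 from EntityHistory "
        "where myKey = ? and versionNumber > ? "
        "order by versionNumber limit ?",
        (my_key, after_version, n)).fetchall()

def version_at(conn, my_key, when):
    """Find the version in effect at a point in time (point 4)."""
    row = conn.execute(
        "select versionNumber from EntityHistory "
        "where myKey = ? and startTime <= ? "
        "and (endTime is null or endTime > ?)",
        (my_key, when, when)).fetchone()
    return row[0] if row else None

# Sample data: versions 2 and 3 are logged within the same second.
conn.execute("insert into Entity values (1, 0, 7)")
for when, balance in [
    ("2002-01-30 07:11:00", 100),
    ("2002-01-30 07:11:05", 60),
    ("2002-01-30 07:11:05", 85),
    ("2002-01-30 07:11:09", 75),
    ("2002-01-30 07:12:00", 50),
]:
    record_change(conn, 1, when, balance, 0)

first_page = next_versions(conn, 1, 0, 2)
second_page = next_versions(conn, 1, first_page[-1][0], 2)
print(first_page, second_page)
print(version_at(conn, 1, "2002-01-30 07:11:07"))
```

Because the bookmark for paging is the last versionNumber seen, the two changes logged in the same second are neither skipped nor repeated, and each page query returns only the rows it needs.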

Any and all comments welcome.

Cheers,
Craig.






-----Original Message-----
From: Thomas B. Cox [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 30 January 2002 7:11 AM
To: Multiple recipients of list ORACLE-L
Subject: Re: AW: auditing tables



Now you're getting into the realm of Temporal or Time-
Oriented Databases. 

Suppose you want to know what change Fred made on
Tuesday.  With your design, the audit row only
shows what the old value was, not what the new
value is.  To find that, you have to find either
the current production row, OR the next-most-recent
change for that row in the audit table.

Finding the next-most-recent row in an audit
table is not a lot of fun, and can be a bit of
a performance pig.

And suppose the next change is a deletion.  A typical
way to track that is to record only the PK value
of the deleted row.  If you do that, then you've
lost the 'new' value that Fred put in.

So, if you regularly report on old-and-new values
from the audit table, it makes sense to store them
both for the same change.  My most recent design
looked something like this, with one row in AUD_TAB
for each row changed, and one row in AUD_COL for
each column changed in that row (inserts and updates
only):

AUD_TAB
change_id (pk)
table_name
change_type
pk_value

AUD_COL
change_id (fk) (pk)
column_name    (pk)
old_value
new_value
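
As a rough illustration of the two-table layout above (an editorial sketch, not Tom's actual code), the following uses Python's sqlite3 as a stand-in for Oracle.  The log_update helper, the 'U' change-type code, the text column types and the sample row are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table AUD_TAB (
    change_id   integer primary key,
    table_name  text not null,
    change_type text not null,   -- 'U' for update; the code is an assumption
    pk_value    text not null
);
create table AUD_COL (
    change_id   integer not null references AUD_TAB (change_id),
    column_name text not null,
    old_value   text,
    new_value   text,
    primary key (change_id, column_name)
);
""")

def log_update(conn, table_name, pk_value, old_row, new_row):
    """Write one AUD_TAB row for the change and one AUD_COL row per changed column."""
    with conn:  # both tables are written in a single transaction
        cur = conn.execute(
            "insert into AUD_TAB (table_name, change_type, pk_value) "
            "values (?, 'U', ?)",
            (table_name, str(pk_value)))
        change_id = cur.lastrowid
        for column, new_value in new_row.items():
            old_value = old_row.get(column)
            if old_value != new_value:  # unchanged columns are not logged
                conn.execute(
                    "insert into AUD_COL values (?, ?, ?, ?)",
                    (change_id, column, old_value, new_value))
    return change_id

# Hypothetical change to row 42 of table1: only the status column differs.
change_id = log_update(
    conn, "table1", 42,
    old_row={"status": "open", "owner": "fred"},
    new_row={"status": "closed", "owner": "fred"},
)
changed = conn.execute(
    "select column_name, old_value, new_value from AUD_COL where change_id = ?",
    (change_id,)).fetchall()
print(changed)
```

Old and new values come back from a single AUD_COL row, so there is no need to look up the next-most-recent change to see what a given change set the value to.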

Cheers.
 -Tom

--- Jared Still <[EMAIL PROTECTED]> wrote:
> I don't think you need two rows for updates.  The old values
> will be in the audit table, the new ones are in the production
> table.
> At least that's the way I've always done it.
> Is there some other reason for saving both in the audit table?


> On Tuesday 29 January 2002 03:00, Rachel Carmichael wrote:
> > Update -- add two rows to the auditing table -- first with old
> > values and type = O
> > second with all the new values and type =N

> > --- "Foelz.Frank" <[EMAIL PROTECTED]> wrote:
> > > What I need is exactly what Oracle doesn't support. Logging "who"
> > > changed "what" in a special area of our database.
> > >
> > > I think triggering the events will be much more specific and more
> > > easy to change.
> > >
> > > In case all our applications use the same database and user, I am
> > > trying to check out
> > > what application is changing monitored tables (i.e.
> > > c:\app\userapp\app.exe is changing table1).
> > > What do you think of that ??



=====
Thomas B. Cox "Saepe in errore sed numquam in dubito"
[EMAIL PROTECTED]   http://www.geocities.com/tbcox23/

"The whole aim of practical politics is to keep the
populace alarmed (and hence clamorous to be led to
safety) by menacing it with an endless series of
hobgoblins, all of them imaginary." --H.L. Mencken
