Re: Temporal or Time-Oriented Databases (was auditing tables)

Jared Still Mon, 04 Feb 2002 21:33:04 -0800


Craig,


This is a huge subject.  I wish I knew more about it, but such as it is,
the bit I do know is too much to deal with properly in an email.

First the books:

Data Warehouse Toolkit by Kimball
Data Warehouse Life Cycle Toolkit by Kimball
Developing Time Oriented Database Applications in SQL by Snodgrass

A series of articles that preceded the book are available at:

http://www.cs.arizona.edu/people/rts/DBPD/index.html

Snodgrass' book is the seminal work on temporal databases as implemented
in an RDBMS.  It can be heavy going at times, but well worth it.

I recommend Kimball's books, as what you are trying to represent might best
be done using DW methods.  Retrieving your temporal data would likely be
easier to implement and faster in a data mart with a very robust time-series
attribute table.

Using the time as a key for any part of the lookup table may not be
the best way to organize your data.   

I'm really pushed for time for the foreseeable future, or I would really
delve into this.  It's really quite interesting.

Contact me offline if you like, maybe we can continue this discussion a bit.

If anything interesting comes of it, we could summarize for the list.

Jared


On Sunday 03 February 2002 15:30, Craig Munday wrote:
> The thread on auditing tables and the mention of temporal or time-oriented
> databases prompted me to ask about a design problem that has recurred on
> different projects for me.  Seeing as this problem has occurred multiple
> times I figured that other people have probably encountered the same and
> have different solutions.  I'm hoping this email might spark some
> discussion/debate on the merits of this and other solutions.
>
> Essentially the problem is looking up historical data about an entity in a
> reliable and performant manner.  For example, paging (via the web) through
> a set of transactions that have occurred on an account.  Or finding the
> value of an account at a particular point in time, in order to reconcile
> the account.  I'm assuming that any solution would support the system
> running 24/7 because that has been a requirement of most of the projects
> I've worked on.
>
> A number of designs that I have seen have used a time field as the key to
> selecting subsets of the transactions for generation of web pages or for
> the identification of a particular transaction.  I've found that relying on
> time for this purpose has the following consequences:
>
> 1) It is possible that more than one transaction will occur within the same
> second and therefore make it difficult or impossible to order the
> transactions correctly (assuming no other suitable key exists).  This is
> applicable to the discussion on auditing tables too.
>
> 2) Finding the next or previous batch of n rows is difficult because you have
> no idea of the selection criteria to use.  Typically the from and to dates
> are set to encompass ALL the data and only the first n rows are fetched
> from a result set that contains more data than necessary.  I don't
> think this is necessarily a major problem so long as no sorting is involved
> to evaluate the result set.
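A rough sketch of how the two consequences above can combine, using Python's sqlite3 module in place of Oracle.  The txn table, its columns and the sample rows are invented for illustration.  Paging forward with the time column as the only key silently drops a row when two transactions share a second at a page boundary.

```python
import sqlite3

# Hypothetical table and data; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (account INTEGER, txn_time TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO txn VALUES (?, ?, ?)",
    [
        (1, "2002-02-03 15:30:00", 100),
        (1, "2002-02-03 15:30:01", -40),  # same second as the next row
        (1, "2002-02-03 15:30:01", 25),
        (1, "2002-02-03 15:30:02", 10),
    ],
)

def next_page_by_time(after_time, page_size):
    """Fetch the next page using only the timestamp as the paging key."""
    return conn.execute(
        "SELECT txn_time, amount FROM txn "
        "WHERE account = 1 AND txn_time > ? "
        "ORDER BY txn_time LIMIT ?",
        (after_time, page_size),
    ).fetchall()

page1 = next_page_by_time("", 2)             # the empty string sorts before any time
page2 = next_page_by_time(page1[-1][0], 2)   # resume after the last time seen
seen = page1 + page2                          # one of the 15:30:01 rows never appears
```

Changing the comparison to >= would return the missing row but repeat the one already shown, so either way the time alone is not enough to resume from.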
>
> I've found that a more general solution is to have two relations, one to
> represent the entity itself and the other to represent the history of
> changes to the entity.  For example,
>
> create table Entity
> (
>       myKey             number(38) primary key,
>       lastVersionNumber number(38),
>       attribute1        number(38)   -- non-version dependent data
> );
>
> create table EntityHistory
> (
>       myKey          number(38),  -- primary key
>       versionNumber  number(38),  -- primary key
>       startTime      date,
>       endTime        date,
>       attribute2     number(38), -- version dependent data
>       attribute3     number(38), -- version dependent data
>       primary key (myKey, versionNumber)
> );
>
> The key to the EntityHistory table contains the key of Entity plus a
> version number that is allocated from Entity.lastVersionNumber.  StartTime
> and EndTime denote when the change was in effect and both are needed to
> perform a time based search effectively.  This design has the following
> consequences:
>
> 1) The versionNumber provides a definitive order of the EntityHistory
> records for each Entity (identified by myKey).
>
> 2) The application has to lock Entity.lastVersionNumber to log a change to
> the Entity.  This does not appear to add any additional serialisation to
> the application because the applications I have seen would have locked the
> row anyway.  For example, if Entity contained the current account balance,
> the application would have to lock the row in order to update the balance.
>
> 3) EntityHistory.versionNumber can be used to find the next n changes to
> Entity.  Similarly, finding the first, last or previous n changes to Entity
> is just as straightforward.  In this case only the data that is needed is
> returned within the result set.
>
> 4) The startTime and endTime can be used to find changes to an Entity based
> on time.
>
> 5) Data that is not version dependent can be stored in Entity so that it is
> not repeated unnecessarily.
>
> 6) Other relations that are related to a particular version of Entity as it
> existed sometime in the past have a solid key to refer to in EntityHistory,
> rather than relying on the time based attributes.
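A minimal sketch of the design described above, again with Python's sqlite3 standing in for Oracle.  The helper names (log_change, next_changes, as_of), the sample values, and the choice of a NULL endTime for the current version are my assumptions and not part of the original proposal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Entity (
    myKey             INTEGER PRIMARY KEY,
    lastVersionNumber INTEGER NOT NULL,
    attribute1        INTEGER            -- non-version dependent data
);
CREATE TABLE EntityHistory (
    myKey         INTEGER NOT NULL REFERENCES Entity (myKey),
    versionNumber INTEGER NOT NULL,
    startTime     TEXT NOT NULL,
    endTime       TEXT,                  -- NULL while current (an assumption)
    attribute2    INTEGER,               -- version dependent data
    attribute3    INTEGER,               -- version dependent data
    PRIMARY KEY (myKey, versionNumber)
);
""")

def log_change(key, when, attribute2, attribute3):
    """Allocate the next version from Entity and record it in one transaction."""
    with conn:  # commits on success, rolls back on error
        # In Oracle this UPDATE would lock the Entity row; SQLite locks the
        # whole database instead, which serialises writers just the same.
        conn.execute(
            "UPDATE Entity SET lastVersionNumber = lastVersionNumber + 1 "
            "WHERE myKey = ?",
            (key,),
        )
        (version,) = conn.execute(
            "SELECT lastVersionNumber FROM Entity WHERE myKey = ?", (key,)
        ).fetchone()
        # Close the previous version (if any), then open the new one.
        conn.execute(
            "UPDATE EntityHistory SET endTime = ? "
            "WHERE myKey = ? AND versionNumber = ?",
            (when, key, version - 1),
        )
        conn.execute(
            "INSERT INTO EntityHistory VALUES (?, ?, ?, NULL, ?, ?)",
            (key, version, when, attribute2, attribute3),
        )
    return version

def next_changes(key, after_version, n):
    """Next n changes after a known version; no date range is needed."""
    return conn.execute(
        "SELECT versionNumber, attribute2 FROM EntityHistory "
        "WHERE myKey = ? AND versionNumber > ? ORDER BY versionNumber LIMIT ?",
        (key, after_version, n),
    ).fetchall()

def as_of(key, when):
    """Version in effect at a point in time (startTime <= when < endTime)."""
    return conn.execute(
        "SELECT versionNumber, attribute2 FROM EntityHistory "
        "WHERE myKey = ? AND startTime <= ? AND (endTime IS NULL OR endTime > ?)",
        (key, when, when),
    ).fetchone()

with conn:
    conn.execute("INSERT INTO Entity VALUES (1, 0, 42)")
v1 = log_change(1, "2002-02-01 09:00:00", 100, 0)
v2 = log_change(1, "2002-02-02 09:00:00", 60, 0)
v3 = log_change(1, "2002-02-02 09:00:00", 85, 0)  # same second as v2
```

Here next_changes(1, 0, 2) followed by next_changes(1, 2, 2) walks the history in order even though two changes share a timestamp, and as_of(1, "2002-02-01 12:00:00") finds the first version.  Treating the interval as half-open is one way to keep a point-in-time lookup from matching two versions at a boundary.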
>
> Any and all comments welcome.
>
> Cheers,
> Craig.
>
>
>
>
>
>
> -----Original Message-----
> Sent: Wednesday, 30 January 2002 7:11 AM
> To: Multiple recipients of list ORACLE-L
>
>
>
> Now you're getting into the realm of Temporal or Time-
> Oriented Databases.
>
> Suppose you want to know what change Fred made on
> Tuesday.  With your design, the audit row only
> shows what the old value was, not what the new
> value is.  To find that, you have to find either
> the current production row, OR the next-most-recent
> change for that row in the audit table.
>
> Finding the next-most-recent row in an audit
> table is not a lot of fun, and can be a bit of
> a performance pig.
>
> And suppose the next change is a deletion.  A typical
> way to track that is to record only the PK value
> of the deleted row.  If you do that, then you've
> lost the 'new' value that Fred put in.
>
> So, if you regularly report on old-and-new values
> from the audit table, it makes sense to store them
> both for the same change.  My most recent design
> looked something like this, with one row in AUD_TAB
> for each row changed, and one row in AUD_COL for
> each column changed in that row (inserts and updates
> only):
>
> AUD_TAB
> change_id (pk)
> table_name
> change_type
> pk_value
>
> AUD_COL
> change_id (fk) (pk)
> column_name    (pk)
> old_value
> new_value
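One possible reading of the two-table layout above, sketched with Python's sqlite3 rather than Oracle.  The column types, the change_type codes, the audit_change helper and the EMP/SAL sample data are all assumptions made for illustration.  The listed columns do not say who made a change or when, so the sketch leaves that out too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AUD_TAB (
    change_id   INTEGER PRIMARY KEY,
    table_name  TEXT NOT NULL,
    change_type TEXT NOT NULL,   -- 'I', 'U' or 'D' (codes are assumed)
    pk_value    TEXT NOT NULL
);
CREATE TABLE AUD_COL (
    change_id   INTEGER NOT NULL REFERENCES AUD_TAB (change_id),
    column_name TEXT NOT NULL,
    old_value   TEXT,
    new_value   TEXT,
    PRIMARY KEY (change_id, column_name)
);
""")

def audit_change(table_name, change_type, pk_value, columns=None):
    """Record one changed row; columns maps column name -> (old, new)."""
    with conn:  # one transaction per audited change
        cur = conn.execute(
            "INSERT INTO AUD_TAB (table_name, change_type, pk_value) "
            "VALUES (?, ?, ?)",
            (table_name, change_type, pk_value),
        )
        change_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO AUD_COL VALUES (?, ?, ?, ?)",
            [(change_id, col, old, new)
             for col, (old, new) in (columns or {}).items()],
        )
    return change_id

# An update followed by a deletion of the same (hypothetical) row.
fred = audit_change("EMP", "U", "7369", {"SAL": ("800", "950")})
audit_change("EMP", "D", "7369")  # deletion: PK only, no AUD_COL rows

what_fred_changed = conn.execute(
    "SELECT column_name, old_value, new_value FROM AUD_COL WHERE change_id = ?",
    (fred,),
).fetchall()
```

Because old and new values sit in the same AUD_COL row, the update can still be reported in full after the production row has been deleted, without searching for the next-most-recent audit row.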
>
> Cheers.
>  -Tom
>
> --- Jared Still <[EMAIL PROTECTED]> wrote:
> > I don't think you need two rows for updates.  The old values
> > will be in the audit table, the new ones are in the production
> > table.
> > At least that's the way I've always done it.
> > Is there some other reason for saving both in the audit table?
> >
> > On Tuesday 29 January 2002 03:00, Rachel Carmichael wrote:
> > > Update -- add two rows to the auditing table -- first with old
> > > values and type = O
> > > second with all the new values and type =N
> > >
> > > --- "Foelz.Frank" <[EMAIL PROTECTED]> wrote:
> > > > What I need is exactly what Oracle doesn't support. Logging "who"
> > > > changed "what" in a special area of our database.
> > > >
> > > > I think triggering the events will be much more specific and more
> > > > easy to change.
> > > >
> > > > In case all our applications use the same database and user, I am
> > > > trying to check out
> > > > what application is changing monitored tables (i.e.
> > > > c:\app\userapp\app.exe is changing table1).
> > > > What do you think of that ??
>
> =====
> Thomas B. Cox "Saepe in errore sed numquam in dubito"
> [EMAIL PROTECTED]   http://www.geocities.com/tbcox23/
>
> "The whole aim of practical politics is to keep the
> populace alarmed (and hence clamorous to be led to
> safety) by menacing it with an endless series of
> hobgoblins, all of them imaginary." --H.L. Mencken
>
