[sqlite] Best way to handle time slicing of SQL data.

Stephen Chrzanowski Tue, 2 Jun 2015 11:33:54 -0400

On Tue, Jun 2, 2015 at 7:34 AM, Richard Warburton <
richard at skagerraksoftware.com> wrote:


> 1) Do I need UID?  I'm currently using it to ensure a unique Id for when
> the user is creating a new entry.  This however means two operations, an
> Insert, then an Update to set the record's Id to match its UID. It also has
> the overhead of autoincrement, ensuring that UID is never ever reused and
> only goes upwards (I'm currently using MAX(UID)  to get the latest record
> <= the time I'm interested in.  I guess there's no extra storage as rowid
> would exist anyway.
>

This is basic Parent/Child relationship, or, One-To-Many relationships.
Your ID field can have many references to a parent table, so it is
absolutely 100% valid to have multiple IDs of the same value in one field,
but as I mention later, rather confusing to have multiple fields treated as
"primary information".

To answer your question for this particular example, no, you might not need
UID, but that'll depend on what you plan on doing with the information
you're inserting into this table.  If you plan on showing this record to
the user somehow, and they have the option to interact with the data
(Double-click to show more information on what was changed for this parent
record) then you'll need to provide your user interface with that UID.  In
Delphi/Pascal/Lazarus, every item within a list box or dropdown/combo box
contains an "object" pointer, which is just an integer.  I typically use
this "object" pointer to refer to whatever ID I pulled from the database
query when I'm populating the list.  If you're just displaying the data out
as a dump of text, then no, UID wouldn't be needed since you don't need to
go and reference that data back ever.

An example query to get the oldest record for a particular client would be

select * from data where ID=? order by At desc limit 1

In a multi-threaded/multi-client environment, I'd NEVER query against the
maximum identifier as that can change between the time you make the query
when when you actually want to do something with the result.  I want to
find the maximum value of relevant data I need, not something used
internally by the database to maintain referential integrity.  It "might"
BE the same result, but, at a code review, it doesn't LOOK like it is doing
the right thing.  Basically, your query doesn't tell the developer WHY
you're even making this query.

>From what I understand of your reasoning, you're "kind of" on the right
track.  You're looking to store historical record of changes on data for
any relevant child information to some sort of parent entity.  Say for
instance a change of address for a company or person, or recording the
change of the color of a car.  What you're missing out on is that your ID
field should already be known by the application to whatever the parent
record ID is, or in other words, your application should already know
you're changing the address of a particular person.  In this particular
table, you can probably keep the UID, but for all intents and purposes for
this example, if you ever make an update anything stored in this table,
your two relevant unique identifying keys would probably be the ID linking
to the parent record and the "At" field.  Unless the "At" value is being
changed/added multiple times in a second.


>
> 2) Can I auto fill Id to UID on insert instead of having to do two
> operations?
>

In my mind, having two "primary identifiers that mean two different things,
for one piece of information" indicates a bit more normalization needs to
happen.  Using cars as an example, it is confusing to say that that a
vehicle VIN contains information about the color of the car.  'Primary'
information should never be considered as something that can change later,
and, paint shops would be out of business in this particular scenario.  A
single vehicle has a single VIN, which has identifiable information that
CAN'T change at time of manufacturing, such as who made it, what location,
body trim, etc.  Sure, you can take every component from a Camaro SS and
put it  into the body of a Camaro LT, but, as far as the VIN for that LT is
concerned, it will be just an LT with SS parts.

The other thing is that, as I mentioned, you already should have ID
available when you're actually updating the record.  You shouldn't need to
have the application insert data, call the data back only to have to update
the record later.  In order to keep a paper trail on what has to be
changed, you obviously already know what is to be, or has changed, or what
new values are being entered into this table.  Even if you're adding a
brand new vehicle to the database, you need to add the VIN to a parent
table first and either get the LastID value after the insert, or retain the
VIN, then you add a new record to your "data" table referencing the ID
field to the VIN value.  If you're adding a new change record to the
vehicle (Brown paint to chrome paint anyone?) you already know the subjects
ID (The VIN or car) so it is just a matter of adding to this data table.

Thanks in advance for any feedback.
>
>
Since this was a very simplistic table you provided us, I'm thinking that
you've got your hands on something that deals with recording a set of
changes, and not just one individual change.  In that case, I'd highly
recommend that you create a "change record" table that has a unique
identifier (autoinc will work)  and field with a current_timestamp default
value, and then have the data.UID field reference this "change record"
table ID field, and your data.ID referencing to what specifically was
changed.  So for example, UID would reference the Chevy Nova, while ID
references the body color field, or engine field, or whatever.  When your
application goes to "apply the change" your application would insert a row
into the "change record" table ( [ insert into change_record default values
] -- IIRC], you'd retrieve the LastID of that insert, then bulk insert into
the "data" table.  UID would be the LastID from the insert, and ID would
reference color, engine, or transmission fields.  You'd be then able to
look up specific information about this cars color history, as an example.

Think of this "change record" table I mention as an invoice.  This invoice
contains all the information about what you've purchased.  It doesn't
matter WHAT you've purchased, but it is a list of what you're taking home,
and what has changed in the stores inventory, and that Invoice is a
transactional record of ALL changes applied at that single instance in time.


> --
> Richard Warburton
> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>

[sqlite] Best way to handle time slicing of SQL data.

Reply via email to