On Jul 2, 2008, at 2:12 AM, Shawn Wilsher wrote:

> Hey all,
>
> I'm working with a partitioned table setup with a permanent table and
> a temp table with the same columns and indexes.  Every X time we dump
> all the records in the temp table over to the permanent one.  In order
> to make selection queries easier to manage, I've gone and created a
> view like so:
> CREATE TEMPORARY VIEW table_view AS
> SELECT * FROM table_temp
> UNION
> SELECT * FROM table
>
> This was all going well, until I realized that updating was going to
> be very hard (insertion always goes to the temporary table).  That
> seemed easy enough to manage if I use an INSTEAD OF trigger on the
> view for UPDATE statements.  The problem is what I want to do in the
> trigger, which is this:
> 1) if the data is in the temporary table, update that
> 2) if the data is not in the temporary table, copy the data from the
> permanent table into the temp one, and then update the temp table
> Sadly, my SQL-fu isn't strong enough to know how to do this, and I'm
> starting to wonder if it's even possible.  If someone could tell me if
> I can do it, and then provide a pointer as to how to go about it, I'd
> really appreciate it.

(Please excuse the rambling nature of this post. Thinking out loud here.
None of the sql is tested either.)

The idea here is to use a temp table as a cache for write operations  
on the
real table, correct? To minimize the number of writes (and therefore  
fsync()
operations) on the real table?

And the cache has to support both UPDATE and INSERT operations? How  
about
DELETE?

Is the following correct?

   * When inserting a new row into the system, it should be added to the
     temporary table.

   * When updating a row, if it is not already in the temp table, it  
should
     be copied from the real table into the temp table and then the temp
     table copy updated.

   * When reading from the system, we want to return all the records  
from
     the temp table, and all those records from the "real" table that do
     not have corresponding temp table records.

   * (not sure how a delete, it it is required, should work).

I think that supporting the UPDATE operation makes it more difficult to
arrange all this using SQL triggers and views than it would be if each
record existed in either the temp or real tables (but not both).

For example, say we're working with:

     CREATE TABLE uris_real(host PRIMARY KEY, hits INTEGER);
     INSERT INTO uris_real VALUES('sqlite.org', 100);
     INSERT INTO uris_real VALUES('slashdot.org', 200);
     INSERT INTO uris_real VALUES('mozilla.org', 300);

So to create the cache table, we could do:

     CREATE TEMP TABLE uris_temp(host PRIMARY KEY, hits INTEGER);

Records are identified by their primary key and at some point in the  
future
the cache will be flushed through by doing something like:

   BEGIN;
     INSERT OR REPLACE INTO uris_real SELECT * FROM uris_temp;
     DELETE FROM uris_temp;         /* Optional step */
   COMMIT;

Ok, then:

     CREATE TEMP VIEW uris AS
       SELECT * FROM uris_temp
       UNION ALL
       SELECT * FROM uris_real
     ;

A trigger to support INSERT operations seems easy enough:

     CREATE TRIGGER uris_insert INSTEAD OF INSERT ON uris BEGIN
       INSERT INTO uris_temp VALUES(new.host, new.hits);
     END;

Of course, that would allow me to insert a new record ('sqlite.org', 1),
which violates the PRIMARY KEY uniqueness constraint. So I could  
adjust the
trigger to deal with that:

     CREATE TRIGGER uris_insert INSTEAD OF INSERT ON uris BEGIN
       SELECT CASE
         WHEN EXISTS (SELECT host FROM uris_real WHERE host =  
new.host) THEN
           RAISE(ABORT, 'primary key constraint violated');
         END;
       INSERT INTO uris_temp VALUES(new.host, new.hits);
     END;

Now the UPDATE trigger.

     CREATE TRIGGER uris_update INSTEAD OF UPDATE ON uris BEGIN
       INSERT OR REPLACE INTO uris_temp VALUES(new.host, new.hits);
     END;

The constraint again... The trigger above would let me do something like
"UPDATE uris SET host = 'sqlite.org'". And besides, if the primary key
is updated, how will we tell which records to copy over in the real  
table
when the cache is flushed through? So maybe we disallow updates on the
primary key:

     CREATE TRIGGER uris_update INSTEAD OF UPDATE ON uris BEGIN
       SELECT CASE
         WHEN new.host != old.host
           RAISE(ABORT, 'cannot update primary key')
         END;
       INSERT OR REPLACE INTO uris_temp VALUES(new.host, new.hits);
     END;

One more problem - after an UPDATE, the view won't work properly  
anymore,
as duplicate records may be returned. So we can modify it to:

     CREATE TEMP VIEW uris AS
       SELECT * FROM uris_temp
       UNION ALL
       SELECT * FROM uris_real WHERE host NOT IN (SELECT host FROM  
uris_temp)
     ;

So we now have a system that supports UPDATE and INSERT. So long as one
doesn't UPDATE the table's primary key. Still not sure how to support
DELETE operations.

I'm starting to wonder if using an SQL view and triggers is the best way
to implement this. It seems pretty straightforward to support INSERT,
but UPDATE and DELETE start getting a bit complicated. It could be that
the logic for distributing records between the temporary and real
tables would be better done using a procedural programming language.
Either as part of application logic or as a virtual table.

Dan.



_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to