On 16 Apr 2012, at 11:25pm, Puneet Kishor <[email protected]> wrote:
> I absolutely don't get any of the above. Why is "keeping the data" worse than
> keeping the commands? I am not even sure what is a command vs. what is a
> data. A SQL command has data embedded in it. I mean, if I have
>
> UPDATE t
> SET name = 'foo', val = 3.1415
> WHERE id = 22;
>
> which part was the command and which part was the data?

The command is the three lines above. Imagine you'd passed that command to _exec(). So as well as executing that command you write it to a file somewhere:

logid   timestamp    command
163742  2347634.133  UPDATE t SET name = 'foo', val = 3.1415 WHERE id = 22;

In terms of the data, I'm not sure whether you're keeping copies of the entire row, or just triplets. If you're keeping a copy of the row every time the row changes then you would need to store

logid   timestamp    tablename  id  name  val     col3  col4  col5  ...
213876  2347634.133  t          22  foo   3.1415  6     fred  x     ...

If instead you are storing triplets then you would need to store two rows of data:

logid   timestamp    tablename  id  column  value
824724  2347634.133  t          22  name    foo
824725  2347634.133  t          22  val     3.1415

Obviously I just made up the column names off the top of my head, and I didn't invent a mechanism for noting DELETE.

On 16 Apr 2012, at 11:10pm, Nico Williams <[email protected]> wrote:

> The nice thing about having all historical and current and future data
> in one table is that you can:
>
> a) trivially review the past,
> b) trivially create future changes that become effective as time passes.

I agree that this has its advantages. I suppose it comes down to what kind of use you're going to make of the data.

>> Whether you are keeping copies of the rows in the table, or timestamping SQL
>> commands, I suggest that for SQLite your timestamps should be unixepoch
>> stored as a REAL rather than a text expression of seconds.
>
> Why REAL instead of INTEGER?

Because with sufficient resolution they are unique, which gives you an understanding of which change was made before which other change.
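As a minimal sketch of the command-logging idea, here is how you might wrap _exec() in Python's sqlite3 module. The command_log table and the logged_exec wrapper are names I've just made up for illustration; time.time() supplies the fractional-second REAL timestamp:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT, val REAL)")
conn.execute("INSERT INTO t (id, name, val) VALUES (22, 'bar', 2.71)")

# Hypothetical command log; 'ts' is a REAL unixepoch timestamp, so two
# changes made within the same second still get distinct, ordered values.
conn.execute("""
    CREATE TABLE command_log (
        logid   INTEGER PRIMARY KEY,
        ts      REAL,
        command TEXT
    )""")

def logged_exec(conn, sql):
    """Execute sql and append it, with a timestamp, to the command log."""
    conn.execute(sql)
    conn.execute("INSERT INTO command_log (ts, command) VALUES (?, ?)",
                 (time.time(), sql))

logged_exec(conn, "UPDATE t SET name = 'foo', val = 3.1415 WHERE id = 22")
```

Storing the timestamp as REAL rather than TEXT also keeps the column cheap to compare and sort on.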
If you use just integer seconds you can be left with two changes with the same timestamp.

On 16 Apr 2012, at 11:15pm, Petite Abeille <[email protected]> wrote:

> But, in practice, how would one use such DML logs? Say someone got a small
> million rows, which over a period of time were subjected to 10x that many
> DMLs (update, insert, delete). How would one now practically use these DML
> logs to query data at two different points in time? Replay all the logs from
> the beginning each and every time?

Good question. You can take periodic snapshots of your entire database, and keep those together with your log of changes. So to restore your data as of time T, you would

1) restore the latest snapshot made before time T
2) then replay all commands issued after that snapshot but before time T.

Now, consider what it takes if you're logging value changes instead of commands issued. Your first problem is figuring out which rows exist. Are you storing triplets, or complete rows? What SELECTs are you going to do?

Simon.

_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
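The two-step restore above can be sketched with Python's sqlite3 module. The snapshots and command_log tables and the restore_as_of helper are invented for this sketch, not part of any real API:

```python
import shutil
import sqlite3
import tempfile

def restore_as_of(log_conn, t):
    """Return a connection to the database as it was at time t.

    Assumes a snapshots table (ts REAL, path TEXT) recording when each
    snapshot file was taken, and a command_log table
    (logid INTEGER PRIMARY KEY, ts REAL, command TEXT); both table
    names are made up for illustration.
    """
    # 1) Find the latest snapshot taken at or before time t.
    snap_ts, path = log_conn.execute(
        "SELECT ts, path FROM snapshots WHERE ts <= ? "
        "ORDER BY ts DESC LIMIT 1", (t,)).fetchone()

    # Work on a copy so the snapshot file itself stays pristine.
    work = tempfile.NamedTemporaryFile(suffix=".db", delete=False).name
    shutil.copy(path, work)
    db = sqlite3.connect(work)

    # 2) Replay commands issued after the snapshot but before time t.
    #    Ordering by logid breaks ties between equal timestamps.
    for (cmd,) in log_conn.execute(
            "SELECT command FROM command_log "
            "WHERE ts > ? AND ts < ? ORDER BY logid", (snap_ts, t)):
        db.execute(cmd)
    db.commit()
    return db
```

The more often you snapshot, the fewer commands each restore has to replay; that trade-off is yours to tune.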

