What are the recommended "best practices" around using SQLite in a
distributed scenario?
(I saw Shailesh's email in March; I'm considering building a two-phase
commit layer atop SQLite nodes and looking for advice.)
Clearly, this is a big topic including failover, recovery,
load-balancing, and so on. But right now I'm most curious about the
transaction layer.
The most obvious approach is to just process all transactions with a
basic serialized two-phase commit protocol:
1) Master begins transaction in the local database
2) Master sends query to all slaves
3) Slaves begin transaction in their local database
4) Slaves send approval to the master
5) Master commits to the local database
6) Master sends commit command to all slaves
7) Slaves commit the transaction
8) Goto 1
This works great and is what I have now.
But this can only process one transaction at a time, and the commit
speed is no faster than the slowest slave latency. I'm curious if you
can recommend a better way? (I'm sure this is well documented
somewhere; any suggestions?)
For example, it seems that transactions could be "pipelined" to decouple
throughput from latency: after all, barring edge conditions, every slave
should approve every transaction. (And any failed transaction would
probably result in all future transactions failing as well -- eg disk
fail/full/corrupt.)
So the master could send a constant stream of transactions, and the
slave could send a constant stream of approvals, and everybody is happy.
The only tricky case is in the rare case the slave actually does need
to abort the transaction.
Assuming there are 10 transactions that haven't yet been committed, the
slave could commit the first 3 and then abort the remaining 7. This
would look like:
1) Master begins 10 transaction in the local database
2) Master sends 10 queries to slave
3) Slave begins 3 transactions successfully, but the next 7 fail.
4) So the slave send 3 approval messages, and then 7 abort messages
5) Master commits the first 3, and then rolls back the remaining 7
6) Master sends commit messages for the first 3, and rollbacks for 7
7) Slave commits the first 3, and rolls back the last 7
Seems pretty straightforward in theory. But my question is how to
actually accomplish step 5 with SQLite.
I read a big discussion on nested transactions in the list archives from
a year ago, and I understand they're not currently supported. But I'm
not sure what I want to do is actually nested transactions (else I'd
nest infinitely deep before committing anything).
Rather, I'm thinking of some kind of... serial commits, or savepoints or
something. Basically, maintain a set of "savepoints" that I can revert
the entire database to at any time. Then I just commit transactions in
order, and create a new "savepoint" after each. Later, when I get the
approval from the slave, I delete the associated savepoint (to save
resources, presumably). Alternatively, if I get a rollback from a
slave, then I just revert to the savepoint before that transaction and
-- effect -- rollback every transaction made after.
Can you think of any good way to accomplish this?
One horrifically bad way would be to copy the database after each
transaction.
One less horrible way would be to manually maintain an undo log (just
create SQL statements that -- when applied in reverse -- gradually roll
back the database to any previous state).
http://www.sqlite.org/cvstrac/wiki?p=UndoRedo
But I'm curious if there are any better ways to go about it. I'm
guessing this is a pretty well-explored area -- anybody who's gone ahead
care to share their wisdom?
Thanks!
-david
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users