G'day,




"D. Richard Hipp" <[EMAIL PROTECTED]>
12/06/2004 08:16 AM

 
        To:     [EMAIL PROTECTED]
        cc: 
        Subject:        [sqlite] Locking and concurrency in SQLite version 3.0


>   http://www.sqlite.org/lockingv3.html

My thoughts are listed as they come to me.

Thought 1:

Section 5.0, entitled "Writing to a database file":
After the in-memory cache initially spills to disk the exclusive lock must 
be maintained because the database file is changed. One way to avoid this 
happening might be to change the database file and log structure as 
follows:
1) Add a numeric entry to each page in the database file that refers to a 
specific page (or file offset) into the log file.
2) Add an entry to each log file entry indicating a 0 or 1.

If a page is read from the database and has a non-zero file offset, that 
page refers to the roll-forward log that superceeds it. A zero in the log 
file entry indicates it is a rollback entry, while a 1 indicates it is a 
roll-forward entry.

The algorithms described would change in the following ways:
1) Instead of writing the dirty page to the main file when memory 
spilliage occurs, write it to the journal. If the main file entry already 
has a file offset encoded into it, write the page to that offset. If the 
main file entry has no offset, write it at the end of the journal file and 
overwrite only the offset of the main page.
2) Readers with shared locks should always overlook any such offsets it 
finds in main files. Readers with any of the writer locks should refer to 
the journal for the updated version of such pages.
3) When rolling back a journal file, only rollback pages with a 0 entry in 
the rollback/roll-forward field.
4) When committing a transaction write all pages from memory, but also 
commit any pages in the journal with a 1 in the rollback/roll-forward 
field.
5) You might have to rethink any vacuum operation and some other small 
aspects of life. By using the main file as an index into the roll-forward 
log you make truncating the database file more difficult.

One extra alternative to throw in is to keep the roll-back and 
roll-forward journals in separate files. That would avoid the need to 
identify the individual log entries as roll-back or roll-forward and may 
improve performance of large changes. The roll-forward file would never 
have to be committed.

This approach differs slightly from previous suggestions of the shadow 
pager or of creating tree structures in the journal file. It does not 
completely virtualise the pager level, although the concept is similar. It 
requires only trivial extra structure in the journal file since it uses 
the real main file as an index into the roll-forward section of the 
journal. If this kind of scheme were to be implimented in the future the 
groundwork in file format changes could be laid now in a 
forward-compatable way by allocating the necessary spaces and always 
ensuring they had a zero value.

Thought 2:

I'm a little concerned about when SQLITE_BUSY can be returned. In section 
7.0, entitled "Transaction Control At The SQL Level" a mention is made of 
locks not be acquired with the BEGIN statement. Personally I don't like to 
see SQLITE_BUSY at all. I currently modify my sqlite version to use 
blocking locks in restricted ways to avoid getting the message and ensure 
optimum fairness. If they do occur, I would prefer they happen at 
well-defined and designated places. Hmmm... I guess I can't think of any 
cases where this is really an issue, though.

I would like to see blocking locks supported by SQLITE. If that's not 
possible it's ok, but my preference is that the capability should exist. 
Currently sqlite provides an API to execute a function when SQLITE_BUSY 
would be returned. That's ok, but doesn't suit blocking locks well for two 
reasons: 1) The locking semantics of sqlite use operating system locks in 
specific ways that would be unwise to mess with in a callback function. 2) 
I don't belive there is an API to register a corresponding unlock function 
to the sqlite_busy_callback, so whatever locks might be put in place can't 
be unmade at appropriate times. Perhaps the API should be changed to 
support replacement of the various os.c lock functions for each of the 
specific lock types in the new sqlite locking model.

As a matter of interest, the current sqlite isn't far off being able to 
work with blocking lock in place of its existing non-blocking locks. The 
main prohibition that needs to be imposed is that shared locks cannot be 
upgraded to exclusive locks. The current sqlite can be "tuned" to ensure 
exclusive locks are obtained early to prevent blocking locks from 
deadlocking. I haven't seen the new sqlite3 code and haven't seen detail 
of how the various locking mode transitions will be implimented in a posix 
environment to know whether extra problems will be introduced in this 
area.

By my reading the allowable transitions will be these:
Unlocked -> Shared (non-blocking, may return busy)
Shared -> Pending (non-blocking, my return busy)
Exclusive -> Shared (will always work immediately)
Shared -> Reserved (non-blocking, may return busy)
Reserved -> Pending (will always work immediately when preparing to 
write?)
Pending -> Exclusive (non-blocking, may return busy when preparing to 
rollback)
Pending -> Exclusive (blocking, returns succesful eventually when 
preparing to write)

Again, I'm not sure of the exact posix operations involved in each 
transition but I read this as a lot of shared locks that are transfomed 
into exclusive locks only twice: Once in a non-blocking manner during 
rollback, and once in a blocking manner during write operations. More may 
be introduced in the fiddling with reserved and pending locks. I would 
guess that Shared -> Reserved requires this at least once, and that 
Reserved -> Pending does too.

Either way, it looks like blocking locks could be used most of the time, 
at least in principle. Hmmm... I'm rambling a little... but to get to the 
point: Making blocking locks can be tricky. It kinda looks like it'll be 
trickier under the new system. It's only natural when you're trying to 
avoid holding exclusive locks for any extended period of time, but still 
guaranteeing that you will be able to get one when the time is right. I 
suppose the balance is right there, and it'll be my code that has to 
change in the end to get the most of this sqlite version. Anything that 
preserves the new functionality and reduces the opportunities for getting 
SQLITE_BUSY would be good, though.

Benjamin.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to