Re: [sqlite] Locking and concurrency in SQLite version 3.0

2004-06-16 Thread Daniel K
--- Christian Smith <[EMAIL PROTECTED]> wrote:
> On Wed, 16 Jun 2004, Daniel K wrote:
> 
> >--- [EMAIL PROTECTED] wrote:
> >>
> >> >   http://www.sqlite.org/lockingv3.html
> >>
> >> My thoughts are listed as they come to me.
> >>
> >> Thought 1:
> >>
> >> Section 5.0, entitled "Writing to a database
> file":
> >> After the in-memory cache initially spills to
> disk
> >> the exclusive lock must be maintained because the
> >> database file is changed. One way to avoid
> this
> >
> >Would you get the same effect if you had a
> infinitely
> >large pager-cache? i.e. if the cache was never
> spilled
> >until the transaction is committed.
> >
> >If so, then it might be better to figure out how
> the
> >pager cache itself could use secondary storage once
> >it reached a configured size. This wouldn't be a
> file
> >format change.
> 
> 
> There would be a certain performance hit in this
> case, as data spilled
> would have to be written to disk twice (once on the
> spill, once on the
> commit.) 

Yes that's true. You would safe an fsync() call when
the spill occurred, as the journal file has to be 
synced before the database is written to. However
the same amount of journal file would hit the disk
surface in the end anyway, so I'm not sure if that
would help any.

I'm not suggesting using external memory in the pager
cache is a good general strategy.

Dan.






__
Do you Yahoo!?
Yahoo! Mail is new and improved - Check it out!
http://promotions.yahoo.com/new_mail

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] Locking and concurrency in SQLite version 3.0

2004-06-16 Thread Christian Smith
On Wed, 16 Jun 2004, Daniel K wrote:

>--- [EMAIL PROTECTED] wrote:
>>
>> >   http://www.sqlite.org/lockingv3.html
>>
>> My thoughts are listed as they come to me.
>>
>> Thought 1:
>>
>> Section 5.0, entitled "Writing to a database file":
>> After the in-memory cache initially spills to disk
>> the exclusive lock must be maintained because the
>> database file is changed. One way to avoid this
>
>Would you get the same effect if you had a infinitely
>large pager-cache? i.e. if the cache was never spilled
>until the transaction is committed.
>
>If so, then it might be better to figure out how the
>pager cache itself could use secondary storage once
>it reached a configured size. This wouldn't be a file
>format change.


There would be a certain performance hit in this case, as data spilled
would have to be written to disk twice (once on the spill, once on the
commit.) To make matters worse, spilled data could potentially have to be
re-read from the spill file, making a spilled page cause upto 3 disk IOs.
The problem would be compounded by the fact that spilled pages are likely
to be caused by large transactions, which are also more likely to cause
spill data to be purged from the OS cache. So we're talking mucho
performance hit in worst case scenarios.

Spilling directly to the database file may hold up readers, but would be
much more performant.

There was much discussion on how to avoid these problems using version
trees, shadow paging, and probably other solutions. The solution DRH is
implementing is likely to give good concurrency in most cases (small to
medium updates) and no worse than current worst case performance for large
updates. Conversely, a spill file would have much worse than current worst
case performance.


>
>Dan.
>

Christian

-- 
/"\
\ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X   - AGAINST MS ATTACHMENTS
/ \

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] Locking and concurrency in SQLite version 3.0

2004-06-16 Thread Daniel K
--- [EMAIL PROTECTED] wrote:
> G'day,
> 
> 
> 
> 
> 
> "D. Richard Hipp" <[EMAIL PROTECTED]>
> 12/06/2004 08:16 AM
> 
>  
> To: [EMAIL PROTECTED]
> cc: 
> Subject:    [sqlite] Locking and
> concurrency in SQLite version 3.0
> 
> 
> >   http://www.sqlite.org/lockingv3.html
> 
> My thoughts are listed as they come to me.
> 
> Thought 1:
> 
> Section 5.0, entitled "Writing to a database file":
> After the in-memory cache initially spills to disk
> the exclusive lock must be maintained because the
> database file is changed. One way to avoid this

Would you get the same effect if you had a infinitely 
large pager-cache? i.e. if the cache was never spilled
until the transaction is committed.

If so, then it might be better to figure out how the
pager cache itself could use secondary storage once
it reached a configured size. This wouldn't be a file
format change.

Dan.




__
Do you Yahoo!?
Yahoo! Mail - Helps protect you from nasty viruses.
http://promotions.yahoo.com/new_mail

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] Locking and concurrency in SQLite version 3.0

2004-06-16 Thread ben . carlyle
G'day,





"D. Richard Hipp" <[EMAIL PROTECTED]>
12/06/2004 08:16 AM

 
To: [EMAIL PROTECTED]
cc: 
Subject:    [sqlite] Locking and concurrency in SQLite version 3.0


>   http://www.sqlite.org/lockingv3.html

My thoughts are listed as they come to me.

Thought 1:

Section 5.0, entitled "Writing to a database file":
After the in-memory cache initially spills to disk the exclusive lock must 
be maintained because the database file is changed. One way to avoid this 
happening might be to change the database file and log structure as 
follows:
1) Add a numeric entry to each page in the database file that refers to a 
specific page (or file offset) into the log file.
2) Add an entry to each log file entry indicating a 0 or 1.

If a page is read from the database and has a non-zero file offset, that 
page refers to the roll-forward log that superceeds it. A zero in the log 
file entry indicates it is a rollback entry, while a 1 indicates it is a 
roll-forward entry.

The algorithms described would change in the following ways:
1) Instead of writing the dirty page to the main file when memory 
spilliage occurs, write it to the journal. If the main file entry already 
has a file offset encoded into it, write the page to that offset. If the 
main file entry has no offset, write it at the end of the journal file and 
overwrite only the offset of the main page.
2) Readers with shared locks should always overlook any such offsets it 
finds in main files. Readers with any of the writer locks should refer to 
the journal for the updated version of such pages.
3) When rolling back a journal file, only rollback pages with a 0 entry in 
the rollback/roll-forward field.
4) When committing a transaction write all pages from memory, but also 
commit any pages in the journal with a 1 in the rollback/roll-forward 
field.
5) You might have to rethink any vacuum operation and some other small 
aspects of life. By using the main file as an index into the roll-forward 
log you make truncating the database file more difficult.

One extra alternative to throw in is to keep the roll-back and 
roll-forward journals in separate files. That would avoid the need to 
identify the individual log entries as roll-back or roll-forward and may 
improve performance of large changes. The roll-forward file would never 
have to be committed.

This approach differs slightly from previous suggestions of the shadow 
pager or of creating tree structures in the journal file. It does not 
completely virtualise the pager level, although the concept is similar. It 
requires only trivial extra structure in the journal file since it uses 
the real main file as an index into the roll-forward section of the 
journal. If this kind of scheme were to be implimented in the future the 
groundwork in file format changes could be laid now in a 
forward-compatable way by allocating the necessary spaces and always 
ensuring they had a zero value.

Thought 2:

I'm a little concerned about when SQLITE_BUSY can be returned. In section 
7.0, entitled "Transaction Control At The SQL Level" a mention is made of 
locks not be acquired with the BEGIN statement. Personally I don't like to 
see SQLITE_BUSY at all. I currently modify my sqlite version to use 
blocking locks in restricted ways to avoid getting the message and ensure 
optimum fairness. If they do occur, I would prefer they happen at 
well-defined and designated places. Hmmm... I guess I can't think of any 
cases where this is really an issue, though.

I would like to see blocking locks supported by SQLITE. If that's not 
possible it's ok, but my preference is that the capability should exist. 
Currently sqlite provides an API to execute a function when SQLITE_BUSY 
would be returned. That's ok, but doesn't suit blocking locks well for two 
reasons: 1) The locking semantics of sqlite use operating system locks in 
specific ways that would be unwise to mess with in a callback function. 2) 
I don't belive there is an API to register a corresponding unlock function 
to the sqlite_busy_callback, so whatever locks might be put in place can't 
be unmade at appropriate times. Perhaps the API should be changed to 
support replacement of the various os.c lock functions for each of the 
specific lock types in the new sqlite locking model.

As a matter of interest, the current sqlite isn't far off being able to 
work with blocking lock in place of its existing non-blocking locks. The 
main prohibition that needs to be imposed is that shared locks cannot be 
upgraded to exclusive locks. The current sqlite can be "tuned" to ensure 
exclusive locks are obtained early to prevent blocking locks from 
deadlocking. I haven't seen the new sqlite3 code and haven't seen detail 
of how the various locking mode transitions will be implimented in a posix 
environment to know whether extra problems will be introduced in this 
area.

By my reading the allowa

Re: [sqlite] Locking and concurrency in SQLite version 3.0

2004-06-12 Thread D. Richard Hipp
Christian Smith wrote:
"Note that the BEGIN command does not acquire any locks on the database.
After a BEGIN command, a SHARED lock will be acquired when the first
SELECT statement is executed. A RESERVED lock will be acquired when the
first INSERT, UPDATE, or DELETE statement is executed."
Surely, after a BEGIN is executed, the database should be considered a
current snapshot, and no subsequent changes should be visible in this
transaction.
However, if connection A executes a BEGIN, then is preempted by connection
B (perhaps in a different process) which updates the database, by the time
connection A is queried, the database will have changed between the BEGIN
and the first statement.
I'd expect the transaction boundry to begin with the BEGIN statement, not
wait until the first command.
Suppose user A executes the following SQL:
(Slot 1)-->
   BEGIN;
(Slot 2)-->
   SELECT * FROM t1;
(Slot 3)-->
   UPDATE t1 SET ;
You point out that user B could have changed the database
during Slot 2.  This is true.  But from the point of view
of user A, because A has no knowledge of the content of the
database until it does its first SELECT, the database looks
the same to A as if B had made his change at Slot 1.  And
if A cannot tell the different, it doesn't really matter.
Transactions are still isolated.
The important thing is that B is not able to change the
database during Slot 3.
Another suggestion I might add, how about maintaining a transaction ID of
the last transaction committed? That would allow connections to maintain
their cache between transactions using the following rules:
- Cache is valid if:
  - Last transaction ID is unchanged from the data in the cache.
  - Last transaction ID is the same as our last write.
- Else, the cache is invalid.
The transaction ID will then be simply incremented for every write
transaction.
Something like this is in the code but it probably won't be
fully activated in the first release.
--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: [sqlite] Locking and concurrency in SQLite version 3.0

2004-06-12 Thread Darren Duncan
At 6:16 PM -0400 6/11/04, D. Richard Hipp wrote:
The first alpha release of SQLite version 3.0 will occur
in less than 7 days.  In preparation for that release, a
document describing the new file locking mechanism has be
prepared and placed on the website.
  http://www.sqlite.org/lockingv3.html
Readers who are so inclined are encouraged to read this
document carefully and try to poke holes in the logic
of the new locking mechanism.  Once the code is published
next week, you are further encouraged to try to poke holes
in the code itself.
Readers are encouraged to be merciless in their criticism
of the new locking scheme.  Intense, unrestricted peer review
is needed in order to make SQLite version 3 a solid product.
Thank you for your help.
Okay, some comments or questions ...
3.0 Locking
Q: Is the set of [RESERVED, PENDING, EXCLUSIVE] mutually exclusive 
such that exactly one lock of exactly one of those can be held at 
once?  That is, if one process/thread holds one of any of those, then 
others are forbidden from acquiring any one of those?  If so, then I 
see these 3 are different from each other only in how or whether they 
allow concurrent SHARED locks.

Q: Would a RESERVED lock be acquired automatically, or alternately be 
the best thing to get, if someone runs a "SELECT ... FOR UPDATE"?

Q: It looks like PENDING only describes the temporary state of a 
process that requested an EXCLUSIVE lock, but then has to wait 
(blocked or non-blocked) until the EXCLUSIVE lock is granted. 
Correct?  But if so, then PENDING locks exist at the same time as 
EXCLUSIVE, while the EXCLUSIVE description says no other locks exist 
at the same time.

Q: The OS layer doesn't track PENDING locks, given that they seem to 
be unborn EXCLUSIVE locks.  In my mind, PENDING should not even be 
considered a lock state like UNLOCKED|SHARED|RESERVED|EXCLUSIVE are. 
Or alternately, I think you should add 2 more states that represent 
unborn SHARED or RESERVED; those 2 new states refer to what state the 
process/thread is in while it is waiting to acquire a SHARED or 
RESERVED lock (whether blocked or non-blocked), just as PENDING is a 
wait state for EXCLUSIVE.  So you should either have 4 or 7 official 
states.

Q: No matter what kind of lock is requested, do you provide both 
blocking and non-blocking options for what happens to the process 
until the lock is granted?  I think both should be provided, or all 
non-blocking if only one option is given.

Q: Given that SQLite uses the operating system's built-in locking 
mechanisms, how is RESERVED implemented?  I have only known the 
operating system to support UNLOCKED, SHARED, EXCLUSIVE.  Mind you, 
that is just because it is all I have used on plain files, or all I 
have seen documented.

4.0 The Rollback Journal
Q: If you ATTACH a database to the one you currently have open, then 
is that database always given the same lock status as the main 
database, or can it be different?  For example, if you have EXCLUSIVE 
on the main, will attached also be EXCLUSIVE?  Or can you attach a 
database SHARED, such as if you just want to read from it while 
updating the main?

4.1 Dealing with hot journals
Q: Is this saying that one process/thread can acquire multiple locks 
at once, or does your language "acquire" and "drop" mean to change 
the single existing lock from one type to another?  Eg: drop 
exclusive means switch to shared?

5.0 Writing to a database file
Q: It appears from this section that RESERVED locks are the 
prepubescent form of an EXCLUSIVE lock; they are conceptually the 
same as the early part of the life of an EXCLUSIVE lock, and exist as 
an optimization meant to cut down on the amount of time that a 
process actually does have exclusive file access; it will actually 
become exclusive as late as possible, such as when a commit is to 
happen or the change cache fills.  Correct?

... in the end, all these questions may just relate to documentation clarity.
In general, the documentation looks good.  Though I had to read to 
the end to understand some issues that were raised earlier on.

In general, the current structure of locking and transactions looks 
great, as I understand it.  I may have further comments/questions 
after the above questions are answered.

-- Darren Duncan
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]