Hi,

I am currently developing a system, which have the following requirements:
1. should be very fast and responsive
2. run forever (or at least a very long time) without restart.
3. have steady stream of writes to the DB.

Currently I am using sqlite with one sql connection, and I run sql 
checkpoint every once in a while (e.g. after 5 minutes, or after 10000 
inserts to the DB) - I don't use auto checkpoint.
Since I run the checkpoint in the main thread, the system is 
unresponsive for a long time (can take from 400ms to 20000ms!)

What I would like to do, is to run the checkpoint in another thread, so 
the main thread can keep working (and using sql), and do the long 
checkpoint work in the background.

The problem is that when I try to do it, the WAL file grows without limit.
I wrote a little program that do exactly this (insert rows as fast as C 
allows, and in the background run checkpoints).
The result (after 30 secs and 400K records) was: DB size 19MB and WAL 
size 451MB! (tried on linux Debian with sqlite 3.7.2, and on winXP with 
same sqlite version).

I think that the problem relies on the fact the during the checkpoint, I 
keep writing to the WAL file, thus the checkpoint can never get to the 
end of the file. And since WAL file is not cyclic, it will grow forever, 
leaving the most of the file unused (the beginning), and using only the 
end of the file for real data.

I found this thread 
http://www.mail-archive.com/sqlite-users@sqlite.org/msg56074.html, 
discussing a similar problem, and also saw dan's commit 
(http://www.sqlite.org/src/info/72787c010c) approaching the same 
problem, but not solving it, since it requires to block all DB activity.

I think this can be fixed, if sqlite use a cyclic WAL file.
I found that PostgreSQL also bumped into this, and used multiple WAL 
files, in order to avoid this kind of problem 
(http://www.sql.org/sql-database/postgresql/manual/wal-implementation.html).
Now, we do not need multiple files to do it, we just need to write to 
the beginning of the WAL file, when we have enough space (e.g. at least 
16MB of unused space - configurable).
So the WAL file might look like this:

                     logical end           logical start
                          |                    |
                          V                    V
------------------------------------------------------------------
header | committed data  | unused space       | committed data   |==>
------------------------------------------------------------------  |
        ^                                                            |
         ============================================================|

when the data at the head of the file is logically at the end of it.

When the writing is too fast, eventually they will get to logical start 
of the file. in this point we can force a blocking checkpoint, since 
this situation is very rare.

This will allow limiting WAL file size, while avoiding blocking 
checkpoint, and will increase responsiveness significantly!

Yoni.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to