Richard, Ryan,

Thanks for this. We were dimly aware of WAL but until now hadn’t needed to use it.

We’ve done a quick check with it and it *seems* to work on a test database. We’ve all re-read the docs and paid particular attention to https://www.sqlite.org/wal.html#bigwal

To test it, we started our long-running analytics query, which takes around 8 mins on our test machine. We then sped up the rate at which we update the database with external data: in the real world an update arrives every 3-5 mins, but in our test system we queue them up so that one arrives every 6-10 secs, roughly 30x faster. The updates are real data, around 3-5MB each.

We monitored the -wal and -shm files created as we threw data into the database.

The -wal file gets larger and larger until it hits 224MB and then stays constant; the -shm file is only 1.8MB and seems to stay at that size. We can also see that the main SQLite database file is NOT updated (or at least its timestamp isn’t) whilst we are running the updates in WAL mode. This appears to be correct, as the updates would be in the -wal file.
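For what it's worth, the behaviour above can be reproduced in miniature (this is a sketch, not our real schema; paths, table names and sizes are made up): a long-lived read transaction pins a snapshot, so automatic checkpoints cannot reset the -wal file and it grows with every committed update.

```python
import os
import sqlite3
import tempfile

db = os.path.join(tempfile.mkdtemp(), "test.db")

writer = sqlite3.connect(db, isolation_level=None)  # autocommit mode
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE updates (id INTEGER PRIMARY KEY, payload BLOB)")

# Stand-in for the long-running analytics query: an open read
# transaction that pins a snapshot of the database.
reader = sqlite3.connect(db, isolation_level=None)
reader.execute("BEGIN")
reader.execute("SELECT count(*) FROM updates").fetchone()

# Stand-in for the external updates arriving every few seconds; each
# commit appends frames to the -wal file, which cannot be reset while
# the reader's snapshot is open.
for _ in range(200):
    writer.execute("INSERT INTO updates (payload) VALUES (randomblob(4096))")

wal_size = os.path.getsize(db + "-wal")  # grows while the reader is open
reader.execute("COMMIT")
reader.close()
writer.close()  # the last connection to close checkpoints and removes the -wal file
```

Ending the read transaction and closing the last connection is what finally lets the -wal file be checkpointed and deleted, matching what we saw.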

The time taken for each update seems a little slower (perhaps 10%), but since the data is real and variable in size, that might just be our subjective impression.

Once the long-running read-only analytics query completes, the main SQLite database appears to get updated (or at least the timestamp on the file is updated) as we continue applying our test updates, and the -wal file is still being used.

Once we stop applying our test updates, the -wal and -shm files disappear (as expected).

A quick check of the database appears to show it’s correct.
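As a slightly firmer check than eyeballing the file, SQLite's built-in integrity check walks every page, cell and index; for example (the path here is a placeholder, not our real database):

```python
import sqlite3

# PRAGMA integrity_check returns a single row containing 'ok' if the
# whole database is internally consistent, or a list of problems if not.
conn = sqlite3.connect("test.db")
result = conn.execute("PRAGMA integrity_check").fetchall()
conn.close()
print(result)  # [('ok',)] on a healthy database
```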

One question, though: the size of the -wal file worries us. https://www.sqlite.org/wal.html#bigwal states:

```
Avoiding Excessively Large WAL Files

In normal cases, new content is appended to the WAL file until the WAL file accumulates about 1000 pages (and is thus about 4MB in size) at which point a checkpoint is automatically run and the WAL file is recycled. The checkpoint does not normally truncate the WAL file (unless the journal_size_limit pragma is set). Instead, it merely causes SQLite to start overwriting the WAL file from the beginning. This is done because it is normally faster to overwrite an existing file than to append. When the last connection to a database closes, that connection does one last checkpoint and then deletes the WAL and its associated shared-memory file, to clean up the disk.
```

We have not set journal_size_limit, and we have a -wal file which is 224MB in size, somewhat larger than 4MB. We are running:

3.8.2 2013-12-06 14:53:30 27392118af4c38c5203a04b8013e1afdb1cebd0d

which does not appear to include the change in 3.11.0 that keeps the WAL file proportional to the size of the transaction. From the same page of the manual:

```
Very large write transactions.

A checkpoint can only complete when no other transactions are running, which means the WAL file cannot be reset in the middle of a write transaction. So a large change to a large database might result in a large WAL file. The WAL file will be checkpointed once the write transaction completes (assuming there are no other readers blocking it) but in the meantime, the file can grow very big.

As of SQLite version 3.11.0, the WAL file for a single transaction should be proportional in size to the transaction itself. Pages that are changed by the transaction should only be written into the WAL file once. However, with older versions of SQLite, the same page might be written into the WAL file multiple times if the transaction grows larger than the page cache.
```
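If it helps the discussion, one way to bound the -wal file between runs of the analytics query might look like the sketch below ("traffic.db" is a placeholder path, not our real file). As far as we can tell, `journal_size_limit` and `PRAGMA wal_checkpoint(RESTART)` are both available in 3.8.2; `wal_checkpoint(TRUNCATE)` appears to need 3.8.8 or later.

```python
import sqlite3

conn = sqlite3.connect("traffic.db", isolation_level=None)  # autocommit mode
conn.execute("PRAGMA journal_mode=WAL")
# After a successful checkpoint, truncate the -wal file back to at most
# this many bytes instead of leaving it at its high-water mark.
conn.execute("PRAGMA journal_size_limit = 8388608")  # 8 MB
conn.execute(
    "CREATE TABLE IF NOT EXISTS updates (id INTEGER PRIMARY KEY, payload BLOB)")

# ... apply updates here ...

# Once the long-running reader has finished, force a checkpoint so the
# -wal file can actually be reset. busy=1 would mean a reader blocked it.
busy, wal_frames, moved = conn.execute(
    "PRAGMA wal_checkpoint(RESTART)").fetchone()
conn.close()
```

In other words, the checkpoint only bounds the file once no long-lived read transaction is holding the WAL open, which seems consistent with what we observed.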

We think that using WAL mode works for us, and indeed inspection seems to indicate it does, but the -wal file appears to be far larger than we would expect. Is there a problem here? It doesn’t appear to be, but we would welcome any comments.

Thanks for taking the time to reply.

Rob


On 6 Aug 2016, at 22:35, R Smith wrote:

On 2016/08/06 10:50 PM, Rob Willett wrote:

Our understanding of this is that many processes can READ the database at the same time but NO process can INSERT/UPDATE if another is reading. We had thought that one process can write and multiple processes can read. Our reading (no pun intended) now of this paragraph from the manual is that you cannot write if one or more processes is reading. Have we understood this correctly? If so is there an easy way to get around this?

The Write-Ahead-Log (WAL) journal mode will help you. It basically allows a writer to write to the WAL log instead of the main database so that any number of readers can still do their thing reading the database (and the parts of the WAL journal that are already committed, or even parts still in progress if you use "read_uncommitted" mode). SQLite then pushes committed data into the DB file based on checkpoints which you can invoke directly or set up to happen every so often.

This is the new way to do things and what you should always use unless you have a specific reason not to (which might include file restrictions, needing read-only-ness, separate speedy DBs that don't fsync() so much, etc.)

More information here:
https://www.sqlite.org/wal.html

Your DB is quite big and it seems you write often, so please take special note of this section:
https://www.sqlite.org/wal.html#bigwal


HTH - Cheers,
Ryan
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users