In WAL mode with synchronous=NORMAL, SQLite only syncs (FlushFileBuffers()
on windows) when it does a checkpoint operation.  Checkpoints should be
happening automatically whenever the WAL file exceeds about 1MB in size.

For an 8GB database, probably there are about 8000 sync operations,
therefore.  If each takes about 2 seconds, that would pretty much account
for the extra 4.5 hours.

If you are creating a new database from scratch, it is safe to set
synchronous=OFF.  If you lose power in the middle, your database file will
probably be corrupt, but since you were creating it from scratch you can
easily recover just be starting the database creation process over again
from the beginning.

If you want to try running with synchronous=NORMAL, you might try setting
PRAGMA wal_autocheckpoint=100000; (from the default of 1000) which will
make for dramatically larger WAL files, but also dramatically fewer syncs.
Then the syncs will use just 5 or 6 minutes instead of 4.5 hours. Hopefully.



On Sun, Jan 19, 2014 at 9:00 AM, Mario M. Westphal <m...@mwlabs.de> wrote:

> I have a performance effect which I don't quite understand.
> Maybe I'm using the wrong settings or something. Sorry for the long post,
> but I wanted to include all the info that may be important.
>
> My software is written in C++, runs on Windows 7/8, the SQLite database
> file
> is either on a local SATA RAID disk or a SSD.
> Typical database sizes are between 2 GB and 8 GB.
> The largest tables hold several million entries. Also FTS4 is used, which
> also creates large tables.
> Fast internal RAID disks, SDD. Four Xeon cores. 8 GB RAM.
>
> I'm using SQLite 3.8.0.2
> WAL mode, shared cache enabled.
> locking_mode=NORMAL
> checkpoint_fullfsync=0
> pragma page_size=4096
> pragma cache_size=16384
>
> General (retrieval) performance is excellent!
>
>
> During an ingest phase, my application pumps in hundreds of thousands of
> records into multiple tables.
> There are massive amounts of writes during that phase, different record
> sizes, tables with one to four indices etc.
>
> My application is multi-threaded and inserts data into the database
> concurrently from multiple threads.
> The threads process data in batches, and use SQLite transactions to process
> all records of a batch into the database. Transactions gain a lot of speed,
> which outweighs the side effects of potential blocking.
> The threads monitor the execution times of the database operations and
> adjust the batch size to balance speed and transaction lock duration.
> Slower
> operations cause smaller batches, which results in shorter database locks
> and better concurrency. The system adapts fairly well to system performance
> and data structure.
>
> The performance was not that bad, but far from good.
>
> For a given set of input data (100,000 "elements"), the execution estimate
> was about 5 hours.
> Database on a high-speed SSD.
> The largest table holds about 5 million entries afterwards.
>
> ***With one single change*** I improved the execution time from 5 hours
> down
> to about 30 minutes!
>
> I changed
>
> PRAGMA synchronous=NORMAL
>
> to
>
> PRAGMA synchronous=OFF
>
> Also all other database write operations just 'fly' now.
> I'm even more impressed with SQLite than before, but I wonder why is the
> change so _dramatic_ ?
>
> From the docs my impression was that using WAL mode is ideal for bulk
> inserts. That wrapping large bulks of data into smaller batches, wrapped in
> BEGIN/COMMT is best for performance etc. That using synchronous=NORMAL
> limits the file system flush/wait operations certain really important
> operations.
>
> I logged the execution times of various operations in this phase to a text
> file. Everything was fast, the processing, the INSERTs etc.
> But COMMIT operations sometimes took 20s, then 0.2s, then again 10s. That's
> the time SQLite spends in the execute call with "COMMIT".
>
> Of course the amount of data written in each transaction block varied, but
> in general, 85% of the total execution time of my code was spent in the
> COMMIT call.
> My application was the only application with measurable disk I/O at the
> time. No virus checker etc.
>
> Changing to synchronous=OFF made the commits 10 times faster.
>
> Is this the expected behavior or am I missing something obvious?
>
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to