Warren,

On 30 Jul 2018, at 12:28, Warren Young wrote:

On Jul 30, 2018, at 4:51 AM, Rob Willett <rob.sql...@robertwillett.com> wrote:

The system has been running for 2-3 years

Has performance changed over that span?  Which direction?

Performance hasn't changed on the large 60GB database; it's pretty consistent.


we have taken the opportunity to try and prune the database from 60GB down to 600MB.

SQLite’s speed is only weakly affected by database size. A starting guess is log2(N) where N is file size.

Since your data access didn’t get 10x faster from your 100x size drop, you can fairly guess that the speed problem isn’t due to the inherent time required to traverse tree-based data structures in SQLite.
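The log2(N) rule of thumb above can be checked with quick arithmetic; a sketch (the 60 GB and 600 MB figures are the ones from this thread):

```python
from math import log2

big = 60e9     # ~60 GB database, in bytes
small = 600e6  # ~600 MB database, in bytes

# If lookup cost scales roughly with B-tree depth, i.e. log2(file size),
# a 100x shrink in file size only changes traversal cost by ~20%:
ratio = log2(big) / log2(small)
print(round(ratio, 2))  # -> 1.23, nowhere near the 100x size ratio
```

Which is why a 100x smaller file was never going to mean a 100x faster lookup.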


The reason for the pruning is not for the benefit of SQLite, it's for our own administration. Backing up 60GB files is hard work, we're also struggling to fire up replicas, so we wanted to look at the database as a whole and get the size down. The 99% reduction is great, we were hoping for a 50% to 80% reduction.

The system has a restriction of IOPS as it's a Virtual Private Server. Technically it's running Ubuntu 16.04 under OpenVZ. We can get 69MB/sec with a disk to disk copy, which wouldn't be brilliant if we had sustained disk traffic, which we don't.

I am sensing a spinning disk. (Else, throughput should be a lot higher.)

I suspect that is the case. However 69MB/sec is adequate for the 60GB database and for normal usage. Our commits are fine on the 60GB database, just not the 600MB one.

SQLite takes data durability (the D in ACID) seriously, unlike most other software applications, so it is uncommon in that it flushes each transaction to disk before proceeding with further writes to that table.

A flush to disk takes a full disk rotation, and due to the way SQLite’s journal operates, each transaction requires two flushes. That means that with a 7200 RPM disk, you can get a maximum of 60 transactions per second per table with SQLite.
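(For reference, the 60/sec figure follows from 7200 RPM = 120 rotations per second, divided by two flushes per commit.) One common way to cut the per-commit flush cost, not raised in this thread so treat it as a side note, is WAL mode; a minimal sketch using Python's stdlib sqlite3:

```python
import os
import sqlite3
import tempfile

db = os.path.join(tempfile.mkdtemp(), "wal.db")
conn = sqlite3.connect(db)

# WAL appends committed pages to a single log file, so a commit needs
# fewer syncs than the rollback journal's two flushes per transaction.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # -> wal

# synchronous=NORMAL defers most fsyncs to checkpoint time, trading
# durability of the last few transactions for commit throughput.
conn.execute("PRAGMA synchronous=NORMAL")
conn.close()
```

Whether that trade-off is acceptable depends on how much data loss is tolerable after a power failure.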

I agree with your logic, but I would wonder why writing the data to the 60GB database and doing a commit is fast, while writing exactly the same data to the 600MB database is slow. The programs doing it are the same, and the database schema is identical.


Sound familiar?

If I’ve guessed the problem correctly, the solutions are:

1. Batch multiple writes in a transaction.

Already done.
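For comparison, batching in Python's stdlib sqlite3 looks like this (table and column names are illustrative, not from the thread):

```python
import os
import sqlite3
import tempfile

db = os.path.join(tempfile.mkdtemp(), "batch.db")
conn = sqlite3.connect(db)
conn.execute("CREATE TABLE readings (sensor_id INTEGER, value REAL)")

rows = [(i % 4, float(i)) for i in range(1000)]

# One explicit transaction for the whole batch: one pair of journal
# flushes in total, instead of one pair per INSERT in autocommit mode.
with conn:  # commits on success, rolls back on exception
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(count)  # -> 1000
```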


2. Switch to an SSD.

Not an option in the short term for production, BUT the test system we have set up has an SSD. This may skew the results, though. It's difficult to get an identical system set up; the other option is a VMware Fusion system on a Mac with a spinning disk. Most of our local systems are SSD; we have a Mac with a spinning disk for backup.


3. Use multiple tables and/or multiple DB files. In your case, I’d suggest one SQLite DB per sensor, with one thread per sensor, each of which keeps one of the SQLite DBs open continuously. That way, a blocked DB conn won’t block any other writers.

We have one process that reads in a single file every 5 minutes. No need for multiple databases or multiple threads.


Those solutions are given in order of ease of application and cost of implementation.

Nothing in the log files gives us any concern

Have you tried SQLite’s new .expert feature?

No, but we will now :)

   https://sqlite.org/cli.html#index_recommendations_sqlite_expert_

dropping a very large table is really, really, really slow.

If you put your main data table in a file of its own, you can quickly “drop” the table by just closing the DB and removing the DB file from disk.

That's what we will do on our test system.
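A minimal sketch of that approach, assuming a hypothetical events table kept in its own attached file:

```python
import os
import sqlite3
import tempfile

# Hypothetical paths; the main data table lives in its own DB file.
workdir = tempfile.mkdtemp()
main_db = os.path.join(workdir, "app.db")
events_db = os.path.join(workdir, "events.db")

# Create the big table in its own file and attach it to the main DB.
conn = sqlite3.connect(main_db)
conn.execute(f"ATTACH DATABASE '{events_db}' AS events_db")
conn.execute("CREATE TABLE events_db.events (ts INTEGER, payload TEXT)")
conn.execute("INSERT INTO events_db.events VALUES (1, 'x')")
conn.commit()

# "Dropping" the table: detach, close, and delete the file -- a single
# unlink instead of walking the whole B-tree as DROP TABLE would.
conn.execute("DETACH DATABASE events_db")
conn.close()
os.remove(events_db)

# Recreating the file gives a fresh, effectively vacuumed table.
conn = sqlite3.connect(main_db)
conn.execute(f"ATTACH DATABASE '{events_db}' AS events_db")
conn.execute("CREATE TABLE events_db.events (ts INTEGER, payload TEXT)")
n = conn.execute("SELECT COUNT(*) FROM events_db.events").fetchone()[0]
print(n)  # -> 0
conn.close()
```

Queries can join across the attached file as if it were one database, so the application schema barely changes.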


When you then recreate the DB file with a fresh schema, it’s effectively defragged/vacuumed as well.

We have a copy of the 60GB data file (took a long time to download) locally now. We will use this (or rather a copy) to start the testing.

Thanks very much for the thoughtful and useful comments.

Rob

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users