Droedel,

Fortunately we have no performance issues in production with the 60GB database. These issues came out in testing (which is what testing is for).

We're investigating the newly generated IDs as we speak, or rather we will be once we get our replica production system set up.

sqlite3_analyzer has been a problem for us. We've struggled to get a build of it.

Rob

On 30 Jul 2018, at 13:49, Droedel wrote:

Hi Rob,
Answers are in the text below

On Mon, Jul 30, 2018, at 13:45, Rob Willett wrote:
Droedel,

We don't think there is significant read access. The database is a single database on a single thread in a single process. The only access to it is a Perl script that logs the incoming information. We never have two accesses at the same time.

Can you also _measure_ read access, preferably at system level on your production database, e.g. by using iostat? I've seen cases where (other, non-SQLite) databases had unexpected disk access patterns due to an application error.

We have a nagging feeling (and that's all it is) about the autoincrement value. We do use that feature in the table, but we have deleted so many

[snip]
If both databases use autoincrement, then performance should be similar (or at least that's what I expect). Can you easily check if the newly generated IDs are as expected and larger than any existing ID in your table?
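
If you're on AUTOINCREMENT, SQLite keeps the high-water mark in the sqlite_sequence table, so a quick check could be (using the table name from the schema in your earlier mail; adjust as needed):

  SELECT seq FROM sqlite_sequence WHERE name = 'Disruptions';
  SELECT max(id) FROM Disruptions;

seq should be >= max(id), and newly inserted rows should get ids above both.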

We did wonder if we are filling up pages in the middle or something. However, we expected the vacuum and analyse to sort this out. Now it's

[snip]
sqlite3_analyzer can give some measurements, e.g. unused bytes on index pages.
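
If you don't have a build of it handy: it's normally compiled from the full SQLite source tree (./configure, then make sqlite3_analyzer; it needs Tcl), not from the amalgamation. And if your sqlite3 shell happens to be compiled with the DBSTAT virtual table, a rough substitute for the unused-space figures might be:

  SELECT name, count(*) AS pages, sum(unused) AS unused_bytes
  FROM dbstat
  GROUP BY name
  ORDER BY unused_bytes DESC;

No guarantee DBSTAT is in your build though, so treat this as a fallback.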

We've built the replica test system now and we're going to run some initial checks and get some benchmarks in place.

It could be an interesting and exciting ride :)

Sure. It's always fun learning something new. But it's less fun in full production when customers are yelling :-(

Regards,

Droedel



Rob


On 30 Jul 2018, at 12:32, Droedel wrote:

Hi Rob,

Is there significant read access (iostat: r/s) during these slow writes? If yes, it might be due to a small cache, requiring the database to read (index) pages before updating them.
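
If a small cache turns out to be the problem, checking and temporarily raising it is cheap to try; the figure below is illustrative, not a recommendation:

  PRAGMA cache_size;            -- negative values mean KiB rather than pages
  PRAGMA cache_size = -262144;  -- e.g. 256 MiB of page cache (illustrative)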

And is the data you're adding in both databases (large/small) added at the end of the table using the autoincrement, or do you insert some items in the middle? I'm not a SQLite performance expert, but in other databases performance can be very different, because in the former case fewer pages must be updated.

Microsoft SQL Server has something called "fill factor", basically the max percentage of an index page that is used during initial fill, which helps avoid too much page shuffling in the index when extra items are added. Disadvantage: it makes DBAs argue endlessly about the best fill factor ;-) Maybe there's something similar possible in SQLite, but I couldn't find a pragma for this.

Oh, and do both databases have the same page size, preferably 4K?
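
Checking is a one-liner, but changing the page size of an existing database needs a VACUUM, and as far as I know it can't be changed while the database is in WAL mode. So something like this, on a copy first:

  PRAGMA page_size;              -- current value
  PRAGMA journal_mode = DELETE;  -- page size can't change under WAL
  PRAGMA page_size = 4096;
  VACUUM;                        -- rebuilds the file with the new page size
  PRAGMA journal_mode = WAL;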

Regards,

Droedel



On Mon, Jul 30, 2018, at 12:51, Rob Willett wrote:
Droedel,

Thanks for the comprehensive reply. We have actually done all of this.

The system has been running for 2-3 years and we have taken the opportunity to try and prune the database from 60GB down to 600MB. Currently the live system is working OK with a 60GB database, but our test system (which is smaller) is struggling with 600MB.

The system has a restriction on IOPS as it's a Virtual Private Server. Technically it's running Ubuntu 16.04 under OpenVZ. We can get 69MB/sec with a disk-to-disk copy, which wouldn't be brilliant if we had sustained disk traffic, which we don't.

We log approx 600-800 items of around 3-5K every five minutes. These 600-800 items are mainly inserts into a single table; there are other things happening as well, but this is the bulk of the work. We can see that the -wal files grow a small amount (4-5MB) just before the commit. It then takes 7 seconds to execute the commit. This is the bit that we're struggling with. We know we can get circa 70MB/sec data throughput, so this should take a fraction of a second. Now SQLite needs to work out which pages to commit, so that's a little longer, but we know SQLite is fast, so it shouldn't take 7 seconds on the small database when it doesn't take that long on the large 60GB database. That's the puzzling bit: the large database is quick, the small one slow.

We have no logging turned on. We can turn SQL logging on at the DBI level, but that turns a 20 sec run into a 2-3 minute run as it captures everything :) Nothing in the log files gives us any concern (apart from the really long commit time). Simon Slavin suggested dropping the indexes, which we did, and that turned the commit into a fast commit, so it's something to do with the indexes, but we can't see what.
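
Roughly what we did, sketched with one of the indexes; the other three got the same treatment (index definition copied from the schema further down the thread):

  DROP INDEX "Disruptions_Idx2";

  -- run the inserts and the COMMIT here, then rebuild:

  CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions
      ("Disruption_id" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC,
       "category" COLLATE NOCASE ASC, "subCategory" COLLATE NOCASE ASC);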

What we are now doing is going back to the very beginning:

1. We built a replacement system yesterday with 8GB memory, 8 cores and 150GB disk space. It's virtualised (ESXi) but under our control.
2. We've installed a copy of the old 60GB database on the new system.
3. We're going to benchmark the new system over a couple of thousand runs to see what the average time is.
4. We'll then work our way through the deduping of the database step by step to see when the commit time blows up (a sketch of the kind of dedup query is below, after this list). This will take a few days, as working out the duplications of 200,000,000 rows isn't that quick :) As we found out, dropping a very large table is really, really, really slow.
5. We'll apply some of the ideas that people have suggested since yesterday to see if they work, but I'm keen that we have a repeatable problem that we solve, rather than use a scatter-gun approach to fixing it. We think SQLite is well written, so we figure the problem is ours to solve rather than simply blaming the software.
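
For step 4, the dedup queries are variations on the usual keep-the-lowest-id pattern. The grouping columns below are illustrative only; our real duplicate rule is more involved:

  DELETE FROM Disruptions
  WHERE id NOT IN (
      SELECT min(id)
      FROM Disruptions
      GROUP BY "Disruption_id", "version", "category", "subCategory"
  );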


Thanks

Rob

On 30 Jul 2018, at 11:11, Droedel wrote:

Hi,

When having bad performance, I usually first try to find out if the slowness is due to disk throughput (sequential), slow random access or something else. In Linux, try "iostat -xtc 5". Do this with and without your application writing to disk.

If you see high CPU %iowait and high %util on your disk, then disk is the bottleneck. If not: start profiling / analyzing other bottlenecks (CPU / network / ...)

If the disk throughput (wMB/s) is close to your normal sequential throughput (69 MB/s): try to write less data or get a faster disk. If the disk throughput is low but the number of writes (w/s) is high: there's too much seeking / too many small writes to your disk. Page cache too small? Checkpointing too often?
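
On the checkpointing point: in WAL mode these two are worth a look (1000 pages is the default autocheckpoint interval, if I remember correctly):

  PRAGMA wal_autocheckpoint;        -- pages between automatic checkpoints
  PRAGMA wal_checkpoint(TRUNCATE);  -- force a full checkpoint and reset the -wal file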

Sometimes this kind of problem is caused by other applications (logging / ...) causing too much baseload. %util should be low when your application isn't running.

Just my 2 cents.

Kind regards,

Droedel


On Sun, Jul 29, 2018, at 10:14, Rob Willett wrote:
Hi,

Background

We've been refactoring our database to reduce the size of it. Through some simple logic we've managed to pull out 99% of the data to reduce the size from 51GB down to approx 600MB. This logic has been to remove rows that are almost the same but not quite identical. As with all things, the thinking was the difficult bit, the execution somewhat easier.

As part of the testing we've been doing, we've now hit on an odd and weird problem to do with the COMMIT statement. A commit of a few hundred (circa 600-800) rows takes approx 7 seconds, whereas before we never even noticed it, though we now know it was two seconds before. Each row is probably 1-2K of data, so it's not very much at all.

Details of what we have tried:

1. We've turned synchronous on and off

PRAGMA synchronous=ON

and that's not made any difference.
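
For reference, synchronous has three standard settings rather than just on/off; under WAL, NORMAL is the commonly suggested middle ground:

  PRAGMA synchronous = OFF;     -- fastest, least durable on power loss
  PRAGMA synchronous = NORMAL;  -- under WAL, syncs mainly at checkpoints
  PRAGMA synchronous = FULL;    -- the default, syncs on each commit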

2. We are using, and have been using, WAL mode for years.

PRAGMA journal_mode;
journal_mode
wal

3. We've tested that the server copies OK; we get a consistent 69MB/sec. This is not as fast as we would like, but it's the same across all our virtual servers.

4. We've tested the commit on our existing 60GB database and it takes 2 seconds, which is longer than we thought it would be. The server for the 60GB database is a large VPS with 8GB/8 cores and runs Ubuntu 14.04. The server we are testing on is a 2GB/2 core test server running Ubuntu 16.04. Whilst the test server is smaller, we wouldn't expect it to take 3 times longer to do a commit.

5. The code is identical across the servers. We are running Perl and the DBI module. The code for doing a commit in Perl::DBI is

  $dbh->do("COMMIT");

We are getting the expected performance elsewhere on the system and in the code. It's just the commit that is taking a long time.

6. The code we are committing is adding 600-800 rows to a table that used to be 200,000,000 rows in size. It's now 400,000 rows. We are wondering if the deletion of the rows has had an impact we didn't expect. We have vacuumed and analysed the database.

The schema for the table we insert into is:

CREATE TABLE IF NOT EXISTS "Disruptions" (
         "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
         "version" integer NOT NULL,
         "Disruption_id" INTEGER NOT NULL,
         "status" integer NOT NULL,
         "severity" integer NOT NULL,
         "levelOfInterest" integer NOT NULL,
         "category" integer NOT NULL,
         "subCategory" integer NOT NULL,
         "startTime" TEXT NOT NULL,
         "endTime" text NOT NULL,
         "location" integer NOT NULL,
         "corridor" integer NOT NULL,
         "comments" integer NOT NULL,
         "currentUpdate" integer NOT NULL,
         "remarkTime" TEXT NOT NULL,
         "lastModTime" TEXT NOT NULL,
         "CauseAreaPointX" real NOT NULL,
         "CauseAreaPointY" real NOT NULL,
         "Direction" TEXT
);
CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE NOCASE ASC, "corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE
ASC,
"category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC,
"levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions
("Disruption_id"
COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category"
COLLATE
NOCASE ASC, "subCategory" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version"
COLLATE
NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" COLLATE
NOCASE ASC);
CREATE INDEX Disruptions_Idx5 ON Disruptions ("status",
"Disruption_id",
"Severity", "levelOfInterest", "category", "subCategory",
"version");

We have checked that this schema is consistent across the databases.

We're about to recreate the table to see if that makes a difference.
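
The plan for the rebuild is the usual copy-and-rename, roughly as below. Sketch only: CREATE TABLE ... AS SELECT drops the PK and NOT NULL constraints, so the real run will use the full CREATE TABLE statement above plus an INSERT ... SELECT.

  BEGIN;
  CREATE TABLE "Disruptions_copy" AS SELECT * FROM "Disruptions";
  DROP TABLE "Disruptions";
  ALTER TABLE "Disruptions_copy" RENAME TO "Disruptions";
  COMMIT;
  -- then recreate Disruptions_Idx1/2/3/5 and run ANALYZE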

Any help or advice welcomed.

Thanks

Rob
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users