Hi,
Background
We've been refactoring our database to reduce its size. Through some simple logic we've managed to pull out 99% of the data, reducing the size from 51GB down to approx 600MB. The logic removes rows that are almost the same but not quite identical. As with all things, the thinking was the difficult bit, the execution somewhat easier.

As part of the testing we've been doing, we've now hit an odd problem with the COMMIT statement. A commit of a few hundred rows (circa 600-800) takes approx 7 seconds, whereas before we never even noticed it, though we now know it took about two seconds. Each row is probably 1-2K of data, so it's not very much at all.
Details of what we have tried:
1. We've turned synchronous on and off, e.g.
PRAGMA synchronous=ON
and that's not made any difference.
2. We have been using WAL mode for years.
PRAGMA journal_mode;
journal_mode
wal
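For anyone wanting to reproduce the checks above, here's a minimal sketch of how the two pragmas can be set and read back. It's shown with Python's stdlib sqlite3 rather than our Perl/DBI code purely so it's self-contained, and it uses a throwaway database file, not our real one:

```python
import sqlite3, tempfile, os

# WAL mode needs a real file; an in-memory database reports "memory".
path = os.path.join(tempfile.mkdtemp(), "test.db")
conn = sqlite3.connect(path)

# PRAGMA journal_mode=WAL returns the mode actually in effect.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# synchronous reads back as a number: 0=OFF, 1=NORMAL, 2=FULL.
conn.execute("PRAGMA synchronous=NORMAL")
sync = conn.execute("PRAGMA synchronous").fetchone()[0]

print(mode)   # wal
print(sync)   # 1
conn.close()
```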
3. We've tested that the server copies files OK; we get a consistent 69MB/sec. This is not as fast as we would like, but it's the same across all our virtual servers.
4. We've tested the commit on our existing 60GB database and it takes 2 seconds, which is longer than we thought it would be. The server for the 60GB database is a large VPS with 8GB RAM/8 cores and runs Ubuntu 14.04. The server we are testing on is a 2GB/2-core test server running Ubuntu 16.04. Whilst the test server is smaller, we wouldn't expect it to take 3 times longer to do a commit.
5. The code is identical across the servers. We are running Perl with the DBI module, and the commit is issued as
$dbh->do("COMMIT");
We are getting the expected performance elsewhere on the system and in the code. It's just the commit that is taking a long time.
6. The transaction we are committing adds 600-800 rows to a table that used to be 200,000,000 rows in size; it's now 400,000 rows. We are wondering if the mass deletion has had an impact we didn't expect. We have vacuumed and analysed the database.
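One way to see whether a mass delete has left unreclaimed pages behind is to compare PRAGMA freelist_count before and after VACUUM. A self-contained sketch (Python stdlib sqlite3 again, with a throwaway database simulating a grow-then-delete pattern like ours):

```python
import sqlite3, tempfile, os

path = os.path.join(tempfile.mkdtemp(), "check.db")
conn = sqlite3.connect(path)

# Simulate a table that grows large and is then mostly deleted.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO t (payload) VALUES (?)",
                 [("x" * 1500,) for _ in range(5000)])
conn.execute("DELETE FROM t WHERE id % 100 != 0")  # keep roughly 1%
conn.commit()

# Deleted pages sit on the freelist; VACUUM rebuilds the file and releases them.
before = conn.execute("PRAGMA freelist_count").fetchone()[0]
conn.execute("VACUUM")
after = conn.execute("PRAGMA freelist_count").fetchone()[0]
print(before, after)  # before > 0, after == 0
conn.close()
```

After a VACUUM the freelist count should be zero, so if it is, the leftover-pages theory is probably a dead end.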
The schema for the table we insert into is
CREATE TABLE IF NOT EXISTS "Disruptions" (
"id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
"version" integer NOT NULL,
"Disruption_id" INTEGER NOT NULL,
"status" integer NOT NULL,
"severity" integer NOT NULL,
"levelOfInterest" integer NOT NULL,
"category" integer NOT NULL,
"subCategory" integer NOT NULL,
"startTime" TEXT NOT NULL,
"endTime" text NOT NULL,
"location" integer NOT NULL,
"corridor" integer NOT NULL,
"comments" integer NOT NULL,
"currentUpdate" integer NOT NULL,
"remarkTime" TEXT NOT NULL,
"lastModTime" TEXT NOT NULL,
"CauseAreaPointX" real NOT NULL,
"CauseAreaPointY" real NOT NULL,
"Direction" TEXT
);
CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE NOCASE ASC,
"corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE ASC,
"category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC,
"levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC,
"subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions
("Disruption_id" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC,
"category" COLLATE NOCASE ASC, "subCategory" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions
("version" COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC,
"location" COLLATE NOCASE ASC);
CREATE INDEX Disruptions_Idx5 ON Disruptions ("status",
"Disruption_id", "Severity", "levelOfInterest", "category",
"subCategory", "version");
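For completeness, here is the shape of the test we're using to time the commit, reduced to a standalone sketch: a copy of the Disruptions schema, ~700 rows of roughly 1-2K each inserted in one transaction, then just the COMMIT timed. Again it's Python's stdlib sqlite3 rather than our Perl/DBI code, and on a tiny fresh database the timing will of course be near-instant; the point is the method, not the number:

```python
import sqlite3, tempfile, os, time

path = os.path.join(tempfile.mkdtemp(), "disruptions.db")
conn = sqlite3.connect(path, isolation_level=None)  # manage transactions by hand
conn.execute("PRAGMA journal_mode=WAL")

conn.execute("""CREATE TABLE Disruptions (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
    version INTEGER NOT NULL, Disruption_id INTEGER NOT NULL,
    status INTEGER NOT NULL, severity INTEGER NOT NULL,
    levelOfInterest INTEGER NOT NULL, category INTEGER NOT NULL,
    subCategory INTEGER NOT NULL, startTime TEXT NOT NULL,
    endTime TEXT NOT NULL, location INTEGER NOT NULL,
    corridor INTEGER NOT NULL, comments INTEGER NOT NULL,
    currentUpdate INTEGER NOT NULL, remarkTime TEXT NOT NULL,
    lastModTime TEXT NOT NULL, CauseAreaPointX REAL NOT NULL,
    CauseAreaPointY REAL NOT NULL, Direction TEXT)""")

# One dummy row of roughly the right size; the Direction padding stands in
# for our real payload.
row = (1, 1, 0, 1, 1, 1, 1, "2016-01-01", "2016-01-02",
       1, 1, 1, 1, "2016-01-01", "2016-01-01", 0.0, 0.0, "N" * 1500)

conn.execute("BEGIN")
conn.executemany("""INSERT INTO Disruptions
    (version, Disruption_id, status, severity, levelOfInterest, category,
     subCategory, startTime, endTime, location, corridor, comments,
     currentUpdate, remarkTime, lastModTime, CauseAreaPointX,
     CauseAreaPointY, Direction)
    VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)""", [row] * 700)

start = time.perf_counter()
conn.execute("COMMIT")  # time only the commit, as in production
elapsed = time.perf_counter() - start
print(f"COMMIT of 700 rows took {elapsed:.3f}s")
conn.close()
```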
We have checked that this schema is consistent across the databases.
We're about to recreate the table to see if that makes a difference.
Any help or advice welcomed.
Thanks
Rob
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users