John,

Thanks for the prompt reply and a very good question..

We've dropped the indexes and the commit is now very quick, approx two seconds

However the overall performance of the run is much the same as other areas of the code are now significantly slower, whereas before they were quick.

Where were you going with that question?

Thanks

Rob

On 29 Jul 2018, at 10:33, John Found wrote:

What is the performance without the indexes?

On Sun, 29 Jul 2018 10:20:11 +0100
"Rob Willett" <rob.sql...@robertwillett.com> wrote:

Update 1

We've copied the disruptions table to a new table, dropped the old
table, copied the new table back in and recreated all the indexes.

Exactly the sam commit performance.

We've also tracked the -shm and -wal files and they are around 5MB in
size.

Mmmm.... tricky (as Deepthought said).

Rob

On 29 Jul 2018, at 9:14, Rob Willett wrote:

Hi,

Background

We've been refactoring our database to reduce the size of it. Through some simple logic we've managed to pull out 99% of the data to reduce the size from 51GB down to approx 600MB. This logic has been to remove
rows that are almost the same but not quite identical. As with all
things, the thinking was the difficult bit, the execution somewhat
easier.

As part of the testing we've been doing, we've now hit on an odd and
weird problem to do with the COMMIT statement. A commit of a few
hundred (circa 600-800) rows takes approx 7 seconds whereas before we
never even noticed it, though we now know it was two seconds before.
Each row is probably 1-2K of data, so its not very much at all.

Details of what we have tried:

1. We've turned synchronous  on and off

PRAGMA synchronous=ON

and thats not made any difference.

2. We are using and have been using WAL mode for years.

PRAGMA journal_mode;
journal_mode
wal

3. We've tested that the server copies OK, we get a consistent
69MB/sec. This is not as fast we would like, but it's the same across
all our virtual servers.

4. We've tested the commit on our existing 60GB database and it takes 2 seconds, which is longer than we thought it would be. The server for
the 60GB database is a large VPS with 8GB/8 cores and runs Ubuntu
14.04. The server we are testing on is a 2GB/2 core test server
running Ubuntu 16.04. Whilst the test server is smaller, we wouldn't
expect it to take 3 times longer to do a commit.

5. The code is identical across the servers. We are running Perl and
the DBI module. The code for doing a commit in Perl::DBI is
 $dbh->do("COMMIT");

We are getting the expected performance elsewhere on the system and
in the code. It's just the commit that is taking a long time.

6. The code we are committing is adding 600-800 lines to a table that
used to be 200,000,000 rows in size. It's now 400,000 lines in size.
We are wondering if the deletion of the lines has had an impact we
didn't expect. We have vacuumed and analysed the database.

The schema for the table we insert into is

CREATE TABLE IF NOT EXISTS "Disruptions" (
         "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
         "version" integer NOT NULL,
         "Disruption_id" INTEGER NOT NULL,
         "status" integer NOT NULL,
         "severity" integer NOT NULL,
         "levelOfInterest" integer NOT NULL,
         "category" integer NOT NULL,
         "subCategory" integer NOT NULL,
         "startTime" TEXT NOT NULL,
         "endTime" text NOT NULL,
         "location" integer NOT NULL,
         "corridor" integer NOT NULL,
         "comments" integer NOT NULL,
         "currentUpdate" integer NOT NULL,
         "remarkTime" TEXT NOT NULL,
         "lastModTime" TEXT NOT NULL,
         "CauseAreaPointX" real NOT NULL,
         "CauseAreaPointY" real NOT NULL,
         "Direction" TEXT
);
CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE
NOCASE ASC, "corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE
ASC, "category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC,
"levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC,
"subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions ("Disruption_id"
COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category" COLLATE
NOCASE ASC, "subCategory" COLLATE NOCASE ASC);
CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version"
COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location"
COLLATE NOCASE ASC);
CREATE INDEX Disruptions_Idx5 ON Disruptions ("status",
"Disruption_id", "Severity", "levelOfInterest", "category",
"subCategory", "version");

We have checked that this schema is consistent across the databases.

We're about to recreate the table to see if that makes a difference.

Any help or advice welcomed.

Thanks

Rob
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


--
John Found <johnfo...@asm32.info>
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to