Hi,

When performance is bad, I usually first try to find out whether the
slowness is due to disk throughput (sequential), slow random access or
something else. On Linux, try "iostat -xtc 5". Do this with and without
your application writing to disk.
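
A minimal session sketch, assuming the sysstat package is installed
(adding -m makes iostat report throughput in MB/s, which matches the
wMB/s column mentioned below):

  # extended device stats and CPU stats, timestamped, every 5 seconds
  iostat -xmtc 5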

If you see high CPU %iowait and high %util on your disk, then the disk
is the bottleneck. If not, start profiling / analyzing other
bottlenecks (CPU / network / ...).

If the disk throughput (wMB/s) is close to your normal sequential
throughput (69 MB/s): try to write less data or get a faster disk.
If the disk throughput is low but the number of writes (w/s) is high:
there's too much seeking / too many small writes to your disk. Page
cache too small? Checkpointing too often?
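
If it's the latter, the SQLite knobs to look at are the page cache and
the WAL auto-checkpoint interval. A rough sketch (the values are
illustrative only, and both pragmas are per-connection, so issue them
from your application at connect time):

  PRAGMA cache_size = -64000;        -- ~64 MB page cache (negative means KiB)
  PRAGMA wal_autocheckpoint = 4000;  -- default is 1000 pages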

Sometimes this kind of problem is caused by other applications
(logging / ...) generating too much baseload. %util should be low when
your application isn't running.
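
To see which process is generating that baseload, something along
these lines helps (assuming the sysstat and iotop packages are
installed):

  pidstat -d 5    # per-process disk I/O, refreshed every 5 seconds
  sudo iotop -o   # interactive; shows only processes currently doing I/O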

Just my 2 cents.

Kind regards,

Droedel


On Sun, Jul 29, 2018, at 10:14, Rob Willett wrote:
> Hi,
> 
> Background
> 
> We've been refactoring our database to reduce its size. Through some 
> simple logic we've managed to pull out 99% of the data, reducing it 
> from 51GB down to approx 600MB. The logic removes rows that are 
> almost the same but not quite identical. As with all things, the 
> thinking was the difficult bit, the execution somewhat easier.
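> 
> (A rough, hypothetical illustration only, since the real logic is 
> more involved: the idea is along the lines of keeping just the 
> newest version of each disruption.)
> 
>    -- hypothetical sketch, not our actual statement
>    DELETE FROM Disruptions
>    WHERE version < (SELECT MAX(d2.version)
>                     FROM Disruptions d2
>                     WHERE d2.Disruption_id = Disruptions.Disruption_id);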
> 
> As part of the testing we've been doing, we've now hit an odd 
> problem with the COMMIT statement. A commit of a few hundred rows 
> (circa 600-800) takes approx 7 seconds, whereas before we never even 
> noticed it (we now know it took two seconds on the old database). 
> Each row is probably 1-2K of data, so it's not very much at all.
> 
> Details of what we have tried:
> 
> 1. We've turned synchronous on and off
> 
> PRAGMA synchronous=ON
> 
> and that's not made any difference.
> 
> 2. We are using WAL mode, and have been for years.
> 
> PRAGMA journal_mode;
> journal_mode
> wal
> 
> 3. We've tested that the server copies OK; we get a consistent 
> 69MB/sec. This is not as fast as we would like, but it's the same 
> across all our virtual servers.
> 
> 4. We've tested the commit on our existing 60GB database and it takes 2 
> seconds, which is longer than we thought it would be. The server for the 
> 60GB database is a large VPS with 8GB/8 cores and runs Ubuntu 14.04. The 
> server we are testing on is a 2GB/2 core test server running Ubuntu 
> 16.04. Whilst the test server is smaller, we wouldn't expect it to take 
> 3 times longer to do a commit.
> 
> 5. The code is identical across the servers. We are running Perl and 
> the DBI module. The code for doing a commit with Perl DBI is
>   $dbh->do("COMMIT");
> 
>    We are getting the expected performance elsewhere on the system and 
> in the code. It's just the commit that is taking a long time.
> 
> 6. The code we are committing adds 600-800 rows to a table that used 
> to be 200,000,000 rows in size. It's now 400,000 rows. We are 
> wondering if deleting those rows has had an impact we didn't expect. 
> We have vacuumed and analysed the database.
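> 
> 7. We haven't yet ruled out the WAL checkpoint as the hidden cost. A 
> rough sketch of what we plan to check next (file names here are 
> placeholders):
> 
>    ls -lh test.db test.db-wal   # a large -wal file means a backlog
>    time sqlite3 test.db "PRAGMA wal_checkpoint(TRUNCATE);"
> 
> If the checkpoint itself takes seconds, the COMMIT is presumably 
> paying for it.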
> 
> The schema for the table we insert into is
> 
> CREATE TABLE IF NOT EXISTS "Disruptions" (
>        "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
>        "version" integer NOT NULL,
>        "Disruption_id" INTEGER NOT NULL,
>        "status" integer NOT NULL,
>        "severity" integer NOT NULL,
>        "levelOfInterest" integer NOT NULL,
>        "category" integer NOT NULL,
>        "subCategory" integer NOT NULL,
>        "startTime" TEXT NOT NULL,
>        "endTime" text NOT NULL,
>        "location" integer NOT NULL,
>        "corridor" integer NOT NULL,
>        "comments" integer NOT NULL,
>        "currentUpdate" integer NOT NULL,
>        "remarkTime" TEXT NOT NULL,
>        "lastModTime" TEXT NOT NULL,
>        "CauseAreaPointX" real NOT NULL,
>        "CauseAreaPointY" real NOT NULL,
>        "Direction" TEXT
> );
> CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE 
> NOCASE ASC, "corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE ASC, 
> "category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC, 
> "levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, 
> "subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC);
> CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions ("Disruption_id" 
> COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category" COLLATE 
> NOCASE ASC, "subCategory" COLLATE NOCASE ASC);
> CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" COLLATE 
> NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" COLLATE 
> NOCASE ASC);
> CREATE INDEX Disruptions_Idx5 ON Disruptions ("status", "Disruption_id", 
> "Severity", "levelOfInterest", "category", "subCategory", "version");
> 
> We have checked that this schema is consistent across the databases.
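> 
> For reference, the check was just a diff of the dumped schemas (the 
> paths are placeholders):
> 
>    sqlite3 live.db .schema > live.sql
>    sqlite3 test.db .schema > test.sql
>    diff live.sql test.sql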
> 
> We're about to recreate the table to see if that makes a difference.
> 
> Any help or advice welcomed.
> 
> Thanks
> 
> Rob
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
