Simon,

Absolutely no need to apologise. We should apologise for all the time we have taken from other people :(

We recognise that the collate no case is inappropriate for our database. We suspect this was added from a SQLite tool we used some time ago. We are going to use this opportunity to remove this sort of nonsense.

We still think we have an inherent design fault in our database that we are trying to understand. One of the problems we had was that checking the performance of a 200M row table for bad indexes is time consuming. We now have a workflow to get from a 60GB database to a 600MB database in a few hours. We cannot do all the work in SQL as it involves an external program to analyse the data but a few hours to run isn't bad. As we now have the database held locally, we can thrash the local server silly to get the performance we need.

Rob

On 31 Jul 2018, at 16:18, Simon Slavin wrote:

On 31 Jul 2018, at 2:59pm, Rob Willett <rob.sql...@robertwillett.com> wrote:

We've created a new table based on your ideas, moved the collate into the table, analysed the database. We did **not** add COLLATE NOCASE to the columns which are defined as integers. Would that make a difference?

What you did is correct. I gave wrong advice for which I apologise. But I am now confused since your original code is a little strange. Your original has a table definition including

        "version" integer NOT NULL,

but then

CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions ("Disruption_id" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category" COLLATE NOCASE ASC, "subCategory" COLLATE NOCASE ASC);

If "version" really is an INTEGER, then it is incorrect to use version COLLATE NOCASE in the index. NOCASE is purely for text values. This may be slowing things down.

To solve it, in the table definition, use COLLATE NOCASE for TEXT columns and not for INTEGER columns. Also, remove all mentions of COLLATE NOCASE in your index definitions. Collation methods should be set in the table definition, not in indexes, except for some unusual situations.

This should increase your speed relative to your original timings. If it slows things down, something else I haven't spotted is wrong.

We've found it now takes around 10% longer to do the queries than before.

That is understandable given the incorrect advice I gave you before.

In another post you report some strange timing problems with no simple explanation. When I get such things I suspect database corruption or hardware problems and run an integrity_check. But with a 60Gig database I might think twice.

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to