If you test the last query in this email, be careful to test with your original 
" where x or y" clause along with the "in" version. There is a serious problem 
where "in" sometimes confuses the search in even simple queries and sqlite ends 
up doing a table scan instead of an obvious index. I sent detailed email about 
this a couple months ago but never got any traction other than one other 
person's confirmation that it happened for them as well. For my case changing 
the statement made my query time improvement from nearly 100ms to < 2ms 

Todd 

----- Original Message ----- 
From: "Igor Tandetnik" <itandet...@mvps.org> 
To: sqlite-users@sqlite.org 
Sent: Wednesday, January 6, 2010 3:02:23 PM 
Subject: Re: [sqlite] SQlite query performs 10 times slower than MS Access 
query 

Doyel5 <doyel...@gmail.com> wrote: 
> I have a 800MB MS Access database that I migrated to SQLite. The 
> structure of the database is as follows (the SQLite database, after 
> migration, is around 330MB): 
> 
> The table ‘Occurrence’ has 1,600,000 records. The table looks like: 
> CREATE TABLE Occurrence 
> ( 
> SimulationID INTEGER, SimRunID INTEGER, OccurrenceID 
> INTEGER, OccurrenceTypeID INTEGER, Period INTEGER, 
> HasSucceeded BOOL, PRIMARY KEY (SimulationID, SimRunID, 
> OccurrenceID) ) 
> 
> It has the following indexes: 
> CREATE INDEX "Occurrence_HasSucceeded_idx" ON "Occurrence" 
> ("HasSucceeded" ASC) 

Don't create indexes on fields with low selectivity (fields that only have a 
small number of distinct values, such as boolean fields). A query like "select 
* from Occurrence where HasSucceeded" would likely run faster doing a full scan 
than using this index. 

> CREATE INDEX "Occurrence_OccurrenceID_idx" ON "Occurrence" 
> ("OccurrenceID" ASC) 
> 
> CREATE INDEX "Occurrence_SimRunID_idx" ON "Occurrence" ("SimRunID" 
> ASC) 
> 
> CREATE INDEX "Occurrence_SimulationID_idx" ON "Occurrence" 
> ("SimulationID" ASC) 

This index is redundant. Your primary key can be used for searches on 
SimulationID. 


> The table ‘InitialParticipant’ has 130 records. The structure of the 
> table is 
> 
> CREATE TABLE InitialParticipant 
> ( 
> ParticipantID INTEGER PRIMARY KEY, ParticipantTypeID 
> INTEGER, ParticipantGroupID INTEGER 
> ) 
> 
> The table has the following indexes: 
> CREATE INDEX "initialpart_participantTypeID_idx" ON 
> "InitialParticipant" ("ParticipantGroupID" ASC) 
> 
> CREATE INDEX "initialpart_ParticipantID_idx" ON "InitialParticipant" 
> ("ParticipantID" ASC) 

Similarly, this index is redundant. The fact that ParticipantID is a primary 
key means it's automatically indexed. 

> The table ‘ParticipantGroup’ has 22 records. It looks like 
> CREATE TABLE ParticipantGroup 
> ( 
> ParticipantGroupID INTEGER, ParticipantGroupTypeID INTEGER, 
> Description varchar (50), PRIMARY KEY( ParticipantGroupID ) 
> ) 
> 
> The table has the following index: 
> CREATE INDEX "ParticipantGroup_ParticipantGroupID_idx" ON 
> "ParticipantGroup" ("ParticipantGroupID" ASC) 

Again, this index is redundant. 

> The query in Access is (the entire query takes 1 second to execute): 
> SELECT ParticipantGroup.Description, Occurrence.SimulationID, 
> Occurrence.SimRunID, Occurrence.Period, 
> Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID 
> FROM ( 
> ParticipantGroup INNER JOIN InitialParticipant ON 
> ParticipantGroup.ParticipantGroupID = 
> InitialParticipant.ParticipantGroupID ) INNER JOIN 
> ( 
> tmpPartArgs INNER JOIN 
> ( 
> ( 
> tmpSimArgs INNER JOIN Occurrence ON 
> (tmpSimArgs.SimRunID = Occurrence.SimRunID) AND 
> (tmpSimArgs.SimulationID = Occurrence.SimulationID) ) INNER 
> JOIN OccurrenceParticipant ON (Occurrence.OccurrenceID = 
> OccurrenceParticipant.OccurrenceID) AND (Occurrence.SimRunID = 
> OccurrenceParticipant.SimRunID) AND (Occurrence.SimulationID = 
> OccurrenceParticipant.SimulationID) ) ON 
> tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID ) ON 
> InitialParticipant.ParticipantID = 
> OccurrenceParticipant.ParticipantID WHERE 
> (((OccurrenceParticipant.RoleTypeID)=52 Or 
> (OccurrenceParticipant.RoleTypeID)=49)) AND Occurrence.HasSucceeded = 
> True GROUP BY ParticipantGroup.Description, Occurrence.SimulationID, 
> Occurrence.SimRunID, Occurrence.Period; 

I recommend you drop all those parentheses around joins - they only confuse 
matters. Just write 

SELECT ParticipantGroup.Description, Occurrence.SimulationID, 
Occurrence.SimRunID, Occurrence.Period, 
Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID 
FROM 
ParticipantGroup INNER JOIN InitialParticipant ON 
(ParticipantGroup.ParticipantGroupID = InitialParticipant.ParticipantGroupID) 
INNER JOIN OccurrenceParticipant ON 
(InitialParticipant.ParticipantID = OccurrenceParticipant.ParticipantID) 
INNER JOIN tmpPartArgs ON 
(tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID) 
INNER JOIN Occurrence ON 
(Occurrence.OccurrenceID = OccurrenceParticipant.OccurrenceID) AND 
(Occurrence.SimRunID = OccurrenceParticipant.SimRunID) AND 
(Occurrence.SimulationID = OccurrenceParticipant.SimulationID) 
INNER JOIN tmpSimArgs ON 
(tmpSimArgs.SimRunID = Occurrence.SimRunID) AND 
(tmpSimArgs.SimulationID = Occurrence.SimulationID) 
WHERE 
OccurrenceParticipant.RoleTypeID IN (49, 52) AND Occurrence.HasSucceeded 
GROUP BY 
ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, 
Occurrence.Period; 


> The SQLite query is as follows (this query takes around 10 seconds): 

Why did you feel the need to rewrite it? This mess of nested selects makes 
SQLite's job much harder than it needs to be. 

The original query should have worked out of the box (well, almost - there's no 
TRUE keyword in SQLite). In any case, the query shown above should work fine in 
both Access and SQLite, and is hopefully easier to read and undersand. 

Igor Tandetnik 


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to