If you test the last query in this email, be careful to test with your original " where x or y" clause along with the "in" version. There is a serious problem where "in" sometimes confuses the search in even simple queries and sqlite ends up doing a table scan instead of an obvious index. I sent detailed email about this a couple months ago but never got any traction other than one other person's confirmation that it happened for them as well. For my case changing the statement made my query time improvement from nearly 100ms to < 2ms
Todd ----- Original Message ----- From: "Igor Tandetnik" <itandet...@mvps.org> To: sqlite-users@sqlite.org Sent: Wednesday, January 6, 2010 3:02:23 PM Subject: Re: [sqlite] SQlite query performs 10 times slower than MS Access query Doyel5 <doyel...@gmail.com> wrote: > I have a 800MB MS Access database that I migrated to SQLite. The > structure of the database is as follows (the SQLite database, after > migration, is around 330MB): > > The table ‘Occurrence’ has 1,600,000 records. The table looks like: > CREATE TABLE Occurrence > ( > SimulationID INTEGER, SimRunID INTEGER, OccurrenceID > INTEGER, OccurrenceTypeID INTEGER, Period INTEGER, > HasSucceeded BOOL, PRIMARY KEY (SimulationID, SimRunID, > OccurrenceID) ) > > It has the following indexes: > CREATE INDEX "Occurrence_HasSucceeded_idx" ON "Occurrence" > ("HasSucceeded" ASC) Don't create indexes on fields with low selectivity (fields that only have a small number of distinct values, such as boolean fields). A query like "select * from Occurrence where HasSucceeded" would likely run faster doing a full scan than using this index. > CREATE INDEX "Occurrence_OccurrenceID_idx" ON "Occurrence" > ("OccurrenceID" ASC) > > CREATE INDEX "Occurrence_SimRunID_idx" ON "Occurrence" ("SimRunID" > ASC) > > CREATE INDEX "Occurrence_SimulationID_idx" ON "Occurrence" > ("SimulationID" ASC) This index is redundant. Your primary key can be used for searches on SimulationID. > The table ‘InitialParticipant’ has 130 records. The structure of the > table is > > CREATE TABLE InitialParticipant > ( > ParticipantID INTEGER PRIMARY KEY, ParticipantTypeID > INTEGER, ParticipantGroupID INTEGER > ) > > The table has the following indexes: > CREATE INDEX "initialpart_participantTypeID_idx" ON > "InitialParticipant" ("ParticipantGroupID" ASC) > > CREATE INDEX "initialpart_ParticipantID_idx" ON "InitialParticipant" > ("ParticipantID" ASC) Similarly, this index is redundant. The fact that ParticipantID is a primary key means it's automatically indexed. > The table ‘ParticipantGroup’ has 22 records. It looks like > CREATE TABLE ParticipantGroup > ( > ParticipantGroupID INTEGER, ParticipantGroupTypeID INTEGER, > Description varchar (50), PRIMARY KEY( ParticipantGroupID ) > ) > > The table has the following index: > CREATE INDEX "ParticipantGroup_ParticipantGroupID_idx" ON > "ParticipantGroup" ("ParticipantGroupID" ASC) Again, this index is redundant. > The query in Access is (the entire query takes 1 second to execute): > SELECT ParticipantGroup.Description, Occurrence.SimulationID, > Occurrence.SimRunID, Occurrence.Period, > Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID > FROM ( > ParticipantGroup INNER JOIN InitialParticipant ON > ParticipantGroup.ParticipantGroupID = > InitialParticipant.ParticipantGroupID ) INNER JOIN > ( > tmpPartArgs INNER JOIN > ( > ( > tmpSimArgs INNER JOIN Occurrence ON > (tmpSimArgs.SimRunID = Occurrence.SimRunID) AND > (tmpSimArgs.SimulationID = Occurrence.SimulationID) ) INNER > JOIN OccurrenceParticipant ON (Occurrence.OccurrenceID = > OccurrenceParticipant.OccurrenceID) AND (Occurrence.SimRunID = > OccurrenceParticipant.SimRunID) AND (Occurrence.SimulationID = > OccurrenceParticipant.SimulationID) ) ON > tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID ) ON > InitialParticipant.ParticipantID = > OccurrenceParticipant.ParticipantID WHERE > (((OccurrenceParticipant.RoleTypeID)=52 Or > (OccurrenceParticipant.RoleTypeID)=49)) AND Occurrence.HasSucceeded = > True GROUP BY ParticipantGroup.Description, Occurrence.SimulationID, > Occurrence.SimRunID, Occurrence.Period; I recommend you drop all those parentheses around joins - they only confuse matters. Just write SELECT ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, Occurrence.Period, Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID FROM ParticipantGroup INNER JOIN InitialParticipant ON (ParticipantGroup.ParticipantGroupID = InitialParticipant.ParticipantGroupID) INNER JOIN OccurrenceParticipant ON (InitialParticipant.ParticipantID = OccurrenceParticipant.ParticipantID) INNER JOIN tmpPartArgs ON (tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID) INNER JOIN Occurrence ON (Occurrence.OccurrenceID = OccurrenceParticipant.OccurrenceID) AND (Occurrence.SimRunID = OccurrenceParticipant.SimRunID) AND (Occurrence.SimulationID = OccurrenceParticipant.SimulationID) INNER JOIN tmpSimArgs ON (tmpSimArgs.SimRunID = Occurrence.SimRunID) AND (tmpSimArgs.SimulationID = Occurrence.SimulationID) WHERE OccurrenceParticipant.RoleTypeID IN (49, 52) AND Occurrence.HasSucceeded GROUP BY ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, Occurrence.Period; > The SQLite query is as follows (this query takes around 10 seconds): Why did you feel the need to rewrite it? This mess of nested selects makes SQLite's job much harder than it needs to be. The original query should have worked out of the box (well, almost - there's no TRUE keyword in SQLite). In any case, the query shown above should work fine in both Access and SQLite, and is hopefully easier to read and undersand. Igor Tandetnik _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users