Doyel5 <doyel...@gmail.com> wrote:
> I have a 800MB MS Access database that I migrated to SQLite. The
> structure of the database is as follows (the SQLite database, after
> migration, is around 330MB):
> 
> The table ‘Occurrence’ has  1,600,000 records. The table looks like:
> CREATE TABLE Occurrence
> (
> SimulationID  INTEGER,    SimRunID   INTEGER,    OccurrenceID  
> INTEGER, OccurrenceTypeID    INTEGER,    Period    INTEGER,   
> HasSucceeded    BOOL, PRIMARY KEY (SimulationID,  SimRunID,  
> OccurrenceID) )
> 
> It has the following indexes:
> CREATE INDEX "Occurrence_HasSucceeded_idx" ON "Occurrence"
> ("HasSucceeded" ASC)

Don't create indexes on fields with low selectivity (fields that only have a 
small number of distinct values, such as boolean fields). A query like "select 
* from Occurrence where HasSucceeded" would likely run faster doing a full scan 
than using this index.

> CREATE INDEX "Occurrence_OccurrenceID_idx" ON "Occurrence"
> ("OccurrenceID" ASC)
> 
> CREATE INDEX "Occurrence_SimRunID_idx" ON "Occurrence" ("SimRunID"
> ASC) 
> 
> CREATE INDEX "Occurrence_SimulationID_idx" ON "Occurrence"
> ("SimulationID" ASC)

This index is redundant. Your primary key can be used for searches on 
SimulationID.


> The table ‘InitialParticipant’ has 130 records. The structure of the
> table is
> 
> CREATE TABLE InitialParticipant
> (
> ParticipantID    INTEGER  PRIMARY KEY,     ParticipantTypeID   
> INTEGER, ParticipantGroupID     INTEGER
> )
> 
> The table has the following indexes:
> CREATE INDEX "initialpart_participantTypeID_idx" ON
> "InitialParticipant" ("ParticipantGroupID" ASC)
> 
> CREATE INDEX "initialpart_ParticipantID_idx" ON "InitialParticipant"
> ("ParticipantID" ASC)

Similarly, this index is redundant. The fact that ParticipantID is a primary 
key means it's automatically indexed.

> The table ‘ParticipantGroup’ has 22 records. It looks like
> CREATE TABLE ParticipantGroup
> (
> ParticipantGroupID    INTEGER,    ParticipantGroupTypeID     INTEGER,
> Description    varchar (50),      PRIMARY KEY(  ParticipantGroupID  )
> )
> 
> The table has the following index:
> CREATE INDEX "ParticipantGroup_ParticipantGroupID_idx" ON
> "ParticipantGroup" ("ParticipantGroupID" ASC)

Again, this index is redundant.

> The query in Access is (the entire query takes 1 second to execute):
> SELECT ParticipantGroup.Description, Occurrence.SimulationID,
> Occurrence.SimRunID, Occurrence.Period,
> Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID
> FROM (
>     ParticipantGroup INNER JOIN InitialParticipant ON
> ParticipantGroup.ParticipantGroupID =
> InitialParticipant.ParticipantGroupID ) INNER JOIN
> (
>    tmpPartArgs INNER JOIN
>       (
>           (
>               tmpSimArgs INNER JOIN Occurrence ON
> (tmpSimArgs.SimRunID = Occurrence.SimRunID) AND
>           (tmpSimArgs.SimulationID = Occurrence.SimulationID) ) INNER
> JOIN OccurrenceParticipant ON (Occurrence.OccurrenceID =
> OccurrenceParticipant.OccurrenceID) AND (Occurrence.SimRunID =
> OccurrenceParticipant.SimRunID) AND (Occurrence.SimulationID =
>      OccurrenceParticipant.SimulationID) ) ON
> tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID ) ON
> InitialParticipant.ParticipantID =
> OccurrenceParticipant.ParticipantID WHERE
> (((OccurrenceParticipant.RoleTypeID)=52 Or
> (OccurrenceParticipant.RoleTypeID)=49)) AND Occurrence.HasSucceeded =
> True GROUP BY ParticipantGroup.Description, Occurrence.SimulationID,
> Occurrence.SimRunID, Occurrence.Period;  

I recommend you drop all those parentheses around joins - they only confuse 
matters. Just write

SELECT ParticipantGroup.Description, Occurrence.SimulationID,
    Occurrence.SimRunID, Occurrence.Period,
    Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID
FROM
    ParticipantGroup INNER JOIN InitialParticipant ON
        (ParticipantGroup.ParticipantGroupID = 
InitialParticipant.ParticipantGroupID)
    INNER JOIN OccurrenceParticipant ON
        (InitialParticipant.ParticipantID = OccurrenceParticipant.ParticipantID)
    INNER JOIN tmpPartArgs ON
        (tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID)
    INNER JOIN Occurrence ON
        (Occurrence.OccurrenceID = OccurrenceParticipant.OccurrenceID) AND
        (Occurrence.SimRunID = OccurrenceParticipant.SimRunID) AND
        (Occurrence.SimulationID = OccurrenceParticipant.SimulationID)
    INNER JOIN tmpSimArgs ON
        (tmpSimArgs.SimRunID = Occurrence.SimRunID) AND
        (tmpSimArgs.SimulationID = Occurrence.SimulationID)
WHERE
    OccurrenceParticipant.RoleTypeID IN (49, 52) AND Occurrence.HasSucceeded
GROUP BY
    ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, 
Occurrence.Period;


> The SQLite  query is as follows (this query takes around 10 seconds):

Why did you feel the need to rewrite it? This mess of nested selects makes 
SQLite's job much harder than it needs to be.

The original query should have worked out of the box (well, almost - there's no 
TRUE keyword in SQLite). In any case, the query shown above should work fine in 
both Access and SQLite, and is hopefully easier to read and undersand.

Igor Tandetnik

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to