Re: [sqlite] SQlite query performs 10 times slower than MS Access query
> > > The sqlite3_stmt_status() interface was designed for this purpose. > http://www.sqlite.org/c3ref/stmt_status.html > Thanks for the info, I hope it will be extended also with other counters I did some test with this Rowid/Id trick. A larger base, 22mb, 100,000 records (220 bytes per record average) and the difference for "SELECT rowid .." query was more noticable (20 mb read without Id index, 1mb with it). But I also noticed that sometimes sqlite doesn't allow to use explicit index with ORDER BY clause considering them (primary and explicitely created) equivalent. I think it's not a bug, but in terms of performance maybe it would be more appropriate to obey ORDER BY clause in this case. The table is CREATE TABLE [MainData] ([Id] INTEGER PRIMARY KEY AUTOINCREMENT,... The index is CREATE INDEX [idx_ID] ON [MainData] ([ID] ) Different examples (the second line is EXPLAIN QUERY PLAN result): SELECT rowid FROM MainData TABLE MainData 22 Mb read (expected, Primary index used) SELECT rowid FROM MainData ORDER BY Id TABLE MainData WITH INDEX idx_ID ORDER BY 1,2 Mb read (expected, idx_ID used) SELECT rowid FROM MainData Where Id > 1 AND Id < 30 TABLE MainData USING PRIMARY KEY 22 Mb read (expected, Primary index used) SELECT rowid FROM MainData Where Id > 1 AND Id < 30 ORDER BY Id TABLE MainData USING PRIMARY KEY ORDER BY 22 Mb read (not fully expected, ORDER BY points to Id field, but its actually an alias) SELECT rowid FROM MainData INDEXED BY idx_ID Where Id > 1 AND Id < 30 ORDER BY Id TABLE MainData WITH INDEX idx_ID ORDER BY 1,2 Mb read (expected, but INDEXED BY is not recommended for query tuning) So when the query contains WHERE clause only when INDEXED BY was added this explicit index was used. As for use usefullness of this trick, I think it really can be useful when the application wants to load one "virtual" long list of data actually loading only rowids and optionally query full data record for visible rows. I think in this case 22 MB disk access vs 1 MB disk access makes difference Max ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQlite query performs 10 times slower than MS Access query
On Jan 7, 2010, at 8:18 AM, Tim Romano wrote: > I don't claim any SQLite expertise, but indexes on values like > True/False, Male/Female -- i.e. where there are only a couple or a few > different values possible in the column -- are normally to be > avoided > because low-cardinality indexes tend to be inefficient. What is the > advice of the SQLite gurus in this respect? This is good advice. If you run ANALYZE, SQLite will figure out that the index is mostly useless and will hardly ever use it. (There are some obscure queries for which such an index would be useful, but they are the exception rather than the rule.) On the other hand, SQLite still has to maintain the index on INSERT, UPDATE, and DELETE operations which will slow those operations down. D. Richard Hipp d...@hwaci.com ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQlite query performs 10 times slower than MS Access query
On Jan 29, 2010, at 6:15 AM, Max Vlasov wrote: > To be sure that there's no > full or almost-full table scan I use VFS monitoring in such cases. The sqlite3_stmt_status() interface was designed for this purpose. http://www.sqlite.org/c3ref/stmt_status.html D. Richard Hipp d...@hwaci.com ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQlite query performs 10 times slower than MS Access query
On Fri, Jan 29, 2010 at 6:02 AM, Doyel5 wrote: > I deleted all indexes, kept only the indexes on the temporary tables, > namely > - tempSimArgs and tempPartArgs, and seemingly my original query's running > time decreased to around 1 sec (when I ran the query the second time). The > execution time of the first time I run my query varies wildly, I dunno why. > Any ideas? > You mentioned 800 MB as the size of your base. I think that although indexes effectively used, the actual data can be spread across the size of your base so first query involves many hard disk seek operations and disk cache doing unnecessary job (for example sqlite needs only 1024 bytes at a particular offset but the OS reads much more than that). To be sure that there's no full or almost-full table scan I use VFS monitoring in such cases. If you install your functions as your own file system but forward all calls to original VFS, you can calculate real data flow for your query. As for Access and MS db engines like Jet, we only can guess how it works, maybe when you first open your db, the engine starts some caching immediately, we don't know, it's closed software. Also If you want your first join or select operations for a large base to be faster as for general rule I'd suggest to minimize data record size for any table that involved in join or select operations or make the page size lower. I did some tests to explain why I think it would help. My explanation can be questionable so please correct me if I'm wrong. Sqlite bases works with pages that for example can have size 1024 bytes, all data packed in pages and operate by pages. As long as I know the only operation involving fewer sizes is header operations. Also B-tree cells always come with the data so they're not separated inside the db file. The only complex thing is overflow pages when one single record can not fit inside one page. The consequence of such format is that if the data record is fewer than 1024 bytes, visiting a record is always full record read. Just made a quick test with VFS monitoring: CREATE TABLE [TestTable] ([Id] INTEGER PRIMARY KEY AUTOINCREMENT,[Text] TEXT) 100,000 records INSERT INTO TestTable (Text) VALUES ("1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ") db size: 4,671,488 SELECT * FROM TestTable VFS read flow: 4,669,456 SELECT rowid FROM TestTable VFS read flow: 4,669,456 But if insert records with large texts (I chose intentionally 2 kb for overflow pages to be present), SELECT rowid made twice as low page reading comparing to SELECT *. By the way, just discovered a strange trick that may help in similar or other cases. Creating Index for this table on [Id] looks like terrible idea wasting the space (since rowid already ordered), but actually after creating the index CREATE INDEX [idx_ID] ON [TestTable] ([ID] ) changing SELECT rowid query to SELECT rowid FROM TestTable ORDER BY Id led to only 1,312,000 data flow (comparing to 4,669,456) and it really makes sence because: - indexes in sqlite contains the data and rowid so it doesn't need the main table to return the results, - comparing to the rowid B-tree that contains the full datarecords, this index B-tree more effectively packed so need to visit fewer pages. I don't know what one should do to apply this trick in complex queries, but I hope it is possible. Max ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQlite query performs 10 times slower than MS Access query
>The execution time of the first time I run my query varies wildly, I >dunno why. Probably due to an empty or dirty cache. That's fairly common with cache inclined applications, subsystems or OSes. The second time, most of what's needed is found in cache with much less variability. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQlite query performs 10 times slower than MS Access query
Hi Everybody, Thanks a lot for the suggestions. I really appreciate your help. :-) Just wanted to share my results.. I deleted all indexes, kept only the indexes on the temporary tables, namely - tempSimArgs and tempPartArgs, and seemingly my original query's running time decreased to around 1 sec (when I ran the query the second time). The execution time of the first time I run my query varies wildly, I dunno why. Any ideas? Thanks in advance! Regards, Saswati Igor Tandetnik wrote: > > Doyel5 wrote: >> I have a 800MB MS Access database that I migrated to SQLite. The >> structure of the database is as follows (the SQLite database, after >> migration, is around 330MB): >> >> The table ‘Occurrence’ has 1,600,000 records. The table looks like: >> CREATE TABLE Occurrence >> ( >> SimulationID INTEGER,SimRunID INTEGER,OccurrenceID >> INTEGER, OccurrenceTypeIDINTEGER,PeriodINTEGER, >> HasSucceededBOOL, PRIMARY KEY (SimulationID, SimRunID, >> OccurrenceID) ) >> >> It has the following indexes: >> CREATE INDEX "Occurrence_HasSucceeded_idx" ON "Occurrence" >> ("HasSucceeded" ASC) > > Don't create indexes on fields with low selectivity (fields that only have > a small number of distinct values, such as boolean fields). A query like > "select * from Occurrence where HasSucceeded" would likely run faster > doing a full scan than using this index. > >> CREATE INDEX "Occurrence_OccurrenceID_idx" ON "Occurrence" >> ("OccurrenceID" ASC) >> >> CREATE INDEX "Occurrence_SimRunID_idx" ON "Occurrence" ("SimRunID" >> ASC) >> >> CREATE INDEX "Occurrence_SimulationID_idx" ON "Occurrence" >> ("SimulationID" ASC) > > This index is redundant. Your primary key can be used for searches on > SimulationID. > > >> The table ‘InitialParticipant’ has 130 records. The structure of the >> table is >> >> CREATE TABLE InitialParticipant >> ( >> ParticipantIDINTEGER PRIMARY KEY, ParticipantTypeID >> INTEGER, ParticipantGroupID INTEGER >> ) >> >> The table has the following indexes: >> CREATE INDEX "initialpart_participantTypeID_idx" ON >> "InitialParticipant" ("ParticipantGroupID" ASC) >> >> CREATE INDEX "initialpart_ParticipantID_idx" ON "InitialParticipant" >> ("ParticipantID" ASC) > > Similarly, this index is redundant. The fact that ParticipantID is a > primary key means it's automatically indexed. > >> The table ‘ParticipantGroup’ has 22 records. It looks like >> CREATE TABLE ParticipantGroup >> ( >> ParticipantGroupIDINTEGER,ParticipantGroupTypeID INTEGER, >> Descriptionvarchar (50), PRIMARY KEY( ParticipantGroupID ) >> ) >> >> The table has the following index: >> CREATE INDEX "ParticipantGroup_ParticipantGroupID_idx" ON >> "ParticipantGroup" ("ParticipantGroupID" ASC) > > Again, this index is redundant. > >> The query in Access is (the entire query takes 1 second to execute): >> SELECT ParticipantGroup.Description, Occurrence.SimulationID, >> Occurrence.SimRunID, Occurrence.Period, >> Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID >> FROM ( >> ParticipantGroup INNER JOIN InitialParticipant ON >> ParticipantGroup.ParticipantGroupID = >> InitialParticipant.ParticipantGroupID ) INNER JOIN >> ( >>tmpPartArgs INNER JOIN >> ( >> ( >> tmpSimArgs INNER JOIN Occurrence ON >> (tmpSimArgs.SimRunID = Occurrence.SimRunID) AND >> (tmpSimArgs.SimulationID = Occurrence.SimulationID) ) INNER >> JOIN OccurrenceParticipant ON (Occurrence.OccurrenceID = >> OccurrenceParticipant.OccurrenceID) AND (Occurrence.SimRunID = >> OccurrenceParticipant.SimRunID) AND (Occurrence.SimulationID = >> OccurrenceParticipant.SimulationID) ) ON >> tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID ) ON >> InitialParticipant.ParticipantID = >> OccurrenceParticipant.ParticipantID WHERE >> (((OccurrenceParticipant.RoleTypeID)=52 Or >> (OccurrenceParticipant.RoleTypeID)=49)) AND Occurrence.HasSucceeded = >> True GROUP BY ParticipantGroup.Description, Occurrence.SimulationID, >> Occurrence.SimRunID, Occurrence.Period; > > I recommend you drop all those parentheses around joins - they only > confuse matters. Just write > > SELECT ParticipantGroup.Description, Occurrence.SimulationID, > Occurrence.SimRunID, Occurrence.Period, > Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID > FROM > ParticipantGroup INNER JOIN InitialParticipant ON > (ParticipantGroup.ParticipantGroupID = > InitialParticipant.ParticipantGroupID) > INNER JOIN OccurrenceParticipant ON > (InitialParticipant.ParticipantID = > OccurrenceParticipant.ParticipantID) > INNER JOIN tmpPartArgs ON > (tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID) > INNER JOIN Occurrence ON > (Occurrence.OccurrenceID = OccurrenceParticipant.OccurrenceID) AND > (Occurrence.SimRunID = OccurrenceParticipant.SimRunID) AND > (Occurrence.SimulationID = O
Re: [sqlite] SQlite query performs 10 times slower than MS Access query
If you test the last query in this email, be careful to test with your original " where x or y" clause along with the "in" version. There is a serious problem where "in" sometimes confuses the search in even simple queries and sqlite ends up doing a table scan instead of an obvious index. I sent detailed email about this a couple months ago but never got any traction other than one other person's confirmation that it happened for them as well. For my case changing the statement made my query time improvement from nearly 100ms to < 2ms Todd - Original Message - From: "Igor Tandetnik" To: sqlite-users@sqlite.org Sent: Wednesday, January 6, 2010 3:02:23 PM Subject: Re: [sqlite] SQlite query performs 10 times slower than MS Access query Doyel5 wrote: > I have a 800MB MS Access database that I migrated to SQLite. The > structure of the database is as follows (the SQLite database, after > migration, is around 330MB): > > The table ‘Occurrence’ has 1,600,000 records. The table looks like: > CREATE TABLE Occurrence > ( > SimulationID INTEGER, SimRunID INTEGER, OccurrenceID > INTEGER, OccurrenceTypeID INTEGER, Period INTEGER, > HasSucceeded BOOL, PRIMARY KEY (SimulationID, SimRunID, > OccurrenceID) ) > > It has the following indexes: > CREATE INDEX "Occurrence_HasSucceeded_idx" ON "Occurrence" > ("HasSucceeded" ASC) Don't create indexes on fields with low selectivity (fields that only have a small number of distinct values, such as boolean fields). A query like "select * from Occurrence where HasSucceeded" would likely run faster doing a full scan than using this index. > CREATE INDEX "Occurrence_OccurrenceID_idx" ON "Occurrence" > ("OccurrenceID" ASC) > > CREATE INDEX "Occurrence_SimRunID_idx" ON "Occurrence" ("SimRunID" > ASC) > > CREATE INDEX "Occurrence_SimulationID_idx" ON "Occurrence" > ("SimulationID" ASC) This index is redundant. Your primary key can be used for searches on SimulationID. > The table ‘InitialParticipant’ has 130 records. The structure of the > table is > > CREATE TABLE InitialParticipant > ( > ParticipantID INTEGER PRIMARY KEY, ParticipantTypeID > INTEGER, ParticipantGroupID INTEGER > ) > > The table has the following indexes: > CREATE INDEX "initialpart_participantTypeID_idx" ON > "InitialParticipant" ("ParticipantGroupID" ASC) > > CREATE INDEX "initialpart_ParticipantID_idx" ON "InitialParticipant" > ("ParticipantID" ASC) Similarly, this index is redundant. The fact that ParticipantID is a primary key means it's automatically indexed. > The table ‘ParticipantGroup’ has 22 records. It looks like > CREATE TABLE ParticipantGroup > ( > ParticipantGroupID INTEGER, ParticipantGroupTypeID INTEGER, > Description varchar (50), PRIMARY KEY( ParticipantGroupID ) > ) > > The table has the following index: > CREATE INDEX "ParticipantGroup_ParticipantGroupID_idx" ON > "ParticipantGroup" ("ParticipantGroupID" ASC) Again, this index is redundant. > The query in Access is (the entire query takes 1 second to execute): > SELECT ParticipantGroup.Description, Occurrence.SimulationID, > Occurrence.SimRunID, Occurrence.Period, > Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID > FROM ( > ParticipantGroup INNER JOIN InitialParticipant ON > ParticipantGroup.ParticipantGroupID = > InitialParticipant.ParticipantGroupID ) INNER JOIN > ( > tmpPartArgs INNER JOIN > ( > ( > tmpSimArgs INNER JOIN Occurrence ON > (tmpSimArgs.SimRunID = Occurrence.SimRunID) AND > (tmpSimArgs.SimulationID = Occurrence.SimulationID) ) INNER > JOIN OccurrenceParticipant ON (Occurrence.OccurrenceID = > OccurrenceParticipant.OccurrenceID) AND (Occurrence.SimRunID = > OccurrenceParticipant.SimRunID) AND (Occurrence.SimulationID = > OccurrenceParticipant.SimulationID) ) ON > tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID ) ON > InitialParticipant.ParticipantID = > OccurrenceParticipant.ParticipantID WHERE > (((OccurrenceParticipant.RoleTypeID)=52 Or > (OccurrenceParticipant.RoleTypeID)=49)) AND Occurrence.HasSucceeded = > True GROUP BY ParticipantGroup.Description, Occurrence.SimulationID, > Occurrence.SimRunID, Occurrence.Period; I recommend you drop all those parentheses around joins - they only confuse matters. Just write SELECT ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, Occurrence.Period, Count(OccurrenceParticipant.ParticipantID) AS CountOfParticip
Re: [sqlite] SQlite query performs 10 times slower than MS Access query
On Thu, Jan 07, 2010 at 08:18:56AM -0500, Tim Romano scratched on the wall: > I don't claim any SQLite expertise, but indexes on values like > True/False, Male/Female -- i.e. where there are only a couple or a few > different values possible in the column -- are normally to be avoided > because low-cardinality indexes tend to be inefficient. What is the > advice of the SQLite gurus in this respect? "Yes, but..." In general, yes. If a condition and index can't cut a table down to about 15% or less, indexes are usually not worth it... a full table scan will be faster. This is because an index will typically provide a ROWID, which then needs to be looked up in the main table. This makes each index look-up at least twice as expensive, and typically much more due to increased I/O, scattered I/O, and cache thrash. However, in the case of multi-column indexes, if SQLite can extract the column data it needs from the index itself the second table look-up will be skipped. This means there are times when multi-column indexes and boost performance, even if the later columns aren't used as part of the index look-up. There is a insert/update/delete cost associated with that, however. Many server-based database systems keep statistics on their indexes. For example, the number of values vs unique values to measure the selectivity of the index. This meta-data is typically built during idle times. SQLite doesn't have "idle times", but you can manually build basic statistics using ANALYZE. If you built the library with the extended statistics, ANALYZE will also build a histogram to compute the selectivity of ranges. I have no idea how advanced the query optimizer is, or how much this data is used, but it is one more thing to try. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Our opponent is an alien starship packed with atomic bombs. We have a protractor." "I'll go home and see if I can scrounge up a ruler and a piece of string." --from Anathem by Neal Stephenson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQlite query performs 10 times slower than MS Access query
On 7 Jan 2010, at 1:18pm, Tim Romano wrote: > I don't claim any SQLite expertise, but indexes on values like > True/False, Male/Female -- i.e. where there are only a couple or a few > different values possible in the column -- are normally to be avoided > because low-cardinality indexes tend to be inefficient. What is the > advice of the SQLite gurus in this respect? It's complicated because more factors are involved than that. For instance, if you have a database where INSERT is used thousands of times a day but you only SELECT records once a week, all indexes will be inefficient: you want INSERT to work as quickly as possible and it doesn't matter as much if your SELECT works slowly. For a situation like this you often make no indexes besides the primary one. This lets each INSERT operation finish faster. However, it is true that if data in a column is 'chunky' then an index on that column is less useful. Some database systems like MySQL use chunkiness as part of a profiling system to work out what the most efficient indexes will be. Certainly if you need two columns in a WHERE clause, you tend to put the chunkier one second when you're making the index the SELECT will use. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQlite query performs 10 times slower than MS Access query
I don't claim any SQLite expertise, but indexes on values like True/False, Male/Female -- i.e. where there are only a couple or a few different values possible in the column -- are normally to be avoided because low-cardinality indexes tend to be inefficient. What is the advice of the SQLite gurus in this respect? Regards Tim Romano P.S. I've noticed a striking performance boost in SQLite when joined inline views (as shown below) are used instead of standard table joins. I haven't delved into it but it could be that because the transient table instantiated by the inline select contains only a few rows and SQLite uses this set of rows as the "inner loop", this turns out be a better approach for the query than the indexes that might have been chosen otherwise. select a, b, c from myTable T1 JOIN (select x, y, z from myOtherTable where ...) as T2 on T1.a = T2.z On 1/6/2010 5:28 PM, Doyel5 wrote: > The table ‘Occurrence’ has 1,600,000 records. The table looks like: > CREATE TABLE Occurrence > ( > SimulationID INTEGER,SimRunID INTEGER,OccurrenceID INTEGER, > OccurrenceTypeIDINTEGER,PeriodINTEGER,HasSucceededBOOL, > PRIMARY KEY (SimulationID, SimRunID, OccurrenceID) > ) > > > CREATE INDEX "Occurrence_HasSucceeded_idx" ON "Occurrence" ("HasSucceeded" > ASC) > > > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQlite query performs 10 times slower than MS Access query
Doyel5 wrote: > I have a 800MB MS Access database that I migrated to SQLite. The > structure of the database is as follows (the SQLite database, after > migration, is around 330MB): > > The table ‘Occurrence’ has 1,600,000 records. The table looks like: > CREATE TABLE Occurrence > ( > SimulationID INTEGER,SimRunID INTEGER,OccurrenceID > INTEGER, OccurrenceTypeIDINTEGER,PeriodINTEGER, > HasSucceededBOOL, PRIMARY KEY (SimulationID, SimRunID, > OccurrenceID) ) > > It has the following indexes: > CREATE INDEX "Occurrence_HasSucceeded_idx" ON "Occurrence" > ("HasSucceeded" ASC) Don't create indexes on fields with low selectivity (fields that only have a small number of distinct values, such as boolean fields). A query like "select * from Occurrence where HasSucceeded" would likely run faster doing a full scan than using this index. > CREATE INDEX "Occurrence_OccurrenceID_idx" ON "Occurrence" > ("OccurrenceID" ASC) > > CREATE INDEX "Occurrence_SimRunID_idx" ON "Occurrence" ("SimRunID" > ASC) > > CREATE INDEX "Occurrence_SimulationID_idx" ON "Occurrence" > ("SimulationID" ASC) This index is redundant. Your primary key can be used for searches on SimulationID. > The table ‘InitialParticipant’ has 130 records. The structure of the > table is > > CREATE TABLE InitialParticipant > ( > ParticipantIDINTEGER PRIMARY KEY, ParticipantTypeID > INTEGER, ParticipantGroupID INTEGER > ) > > The table has the following indexes: > CREATE INDEX "initialpart_participantTypeID_idx" ON > "InitialParticipant" ("ParticipantGroupID" ASC) > > CREATE INDEX "initialpart_ParticipantID_idx" ON "InitialParticipant" > ("ParticipantID" ASC) Similarly, this index is redundant. The fact that ParticipantID is a primary key means it's automatically indexed. > The table ‘ParticipantGroup’ has 22 records. It looks like > CREATE TABLE ParticipantGroup > ( > ParticipantGroupIDINTEGER,ParticipantGroupTypeID INTEGER, > Descriptionvarchar (50), PRIMARY KEY( ParticipantGroupID ) > ) > > The table has the following index: > CREATE INDEX "ParticipantGroup_ParticipantGroupID_idx" ON > "ParticipantGroup" ("ParticipantGroupID" ASC) Again, this index is redundant. > The query in Access is (the entire query takes 1 second to execute): > SELECT ParticipantGroup.Description, Occurrence.SimulationID, > Occurrence.SimRunID, Occurrence.Period, > Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID > FROM ( > ParticipantGroup INNER JOIN InitialParticipant ON > ParticipantGroup.ParticipantGroupID = > InitialParticipant.ParticipantGroupID ) INNER JOIN > ( >tmpPartArgs INNER JOIN > ( > ( > tmpSimArgs INNER JOIN Occurrence ON > (tmpSimArgs.SimRunID = Occurrence.SimRunID) AND > (tmpSimArgs.SimulationID = Occurrence.SimulationID) ) INNER > JOIN OccurrenceParticipant ON (Occurrence.OccurrenceID = > OccurrenceParticipant.OccurrenceID) AND (Occurrence.SimRunID = > OccurrenceParticipant.SimRunID) AND (Occurrence.SimulationID = > OccurrenceParticipant.SimulationID) ) ON > tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID ) ON > InitialParticipant.ParticipantID = > OccurrenceParticipant.ParticipantID WHERE > (((OccurrenceParticipant.RoleTypeID)=52 Or > (OccurrenceParticipant.RoleTypeID)=49)) AND Occurrence.HasSucceeded = > True GROUP BY ParticipantGroup.Description, Occurrence.SimulationID, > Occurrence.SimRunID, Occurrence.Period; I recommend you drop all those parentheses around joins - they only confuse matters. Just write SELECT ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, Occurrence.Period, Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID FROM ParticipantGroup INNER JOIN InitialParticipant ON (ParticipantGroup.ParticipantGroupID = InitialParticipant.ParticipantGroupID) INNER JOIN OccurrenceParticipant ON (InitialParticipant.ParticipantID = OccurrenceParticipant.ParticipantID) INNER JOIN tmpPartArgs ON (tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID) INNER JOIN Occurrence ON (Occurrence.OccurrenceID = OccurrenceParticipant.OccurrenceID) AND (Occurrence.SimRunID = OccurrenceParticipant.SimRunID) AND (Occurrence.SimulationID = OccurrenceParticipant.SimulationID) INNER JOIN tmpSimArgs ON (tmpSimArgs.SimRunID = Occurrence.SimRunID) AND (tmpSimArgs.SimulationID = Occurrence.SimulationID) WHERE OccurrenceParticipant.RoleTypeID IN (49, 52) AND Occurrence.HasSucceeded GROUP BY ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, Occurrence.Period; > The SQLite query is as follows (this query takes around 10 seconds): Why did you feel the need to rewrite it? This mess of nested selects makes SQLite's job much harder than it needs to be. The original query should have worked out of the box (well, almost - there
Re: [sqlite] SQlite query performs 10 times slower than MS Access query
Why you change the query? Use de Access version or someone without joining a subquery. Is preferible to join table to table (to more tables) than join table to subqueries. Emilio Doyel5 escribió: > I have a 800MB MS Access database that I migrated to SQLite. The structure of > the database is as follows (the SQLite database, after migration, is around > 330MB): > > The table ‘Occurrence’ has 1,600,000 records. The table looks like: > CREATE TABLE Occurrence > ( > SimulationID INTEGER,SimRunID INTEGER,OccurrenceID INTEGER, > OccurrenceTypeIDINTEGER,PeriodINTEGER,HasSucceededBOOL, > PRIMARY KEY (SimulationID, SimRunID, OccurrenceID) > ) > > It has the following indexes: > CREATE INDEX "Occurrence_HasSucceeded_idx" ON "Occurrence" ("HasSucceeded" > ASC) > > CREATE INDEX "Occurrence_OccurrenceID_idx" ON "Occurrence" ("OccurrenceID" > ASC) > > CREATE INDEX "Occurrence_SimRunID_idx" ON "Occurrence" ("SimRunID" ASC) > > CREATE INDEX "Occurrence_SimulationID_idx" ON "Occurrence" ("SimulationID" > ASC) > > > > The table ‘OccurrenceParticipant’ has 3,400,000 records. The table looks > like: > CREATE TABLE OccurrenceParticipant > ( > SimulationIDINTEGER, SimRunIDINTEGER,OccurrenceID > INTEGER, > RoleTypeID INTEGER, ParticipantIDINTEGER > ) > > It has the following indexes: > CREATE INDEX "OccurrenceParticipant_OccurrenceID_idx" ON > "OccurrenceParticipant" ("OccurrenceID" ASC) > > CREATE INDEX "OccurrenceParticipant_ParticipantID_idx" ON > "OccurrenceParticipant" ("ParticipantID" ASC) > > CREATE INDEX "OccurrenceParticipant_RoleType_idx" ON "OccurrenceParticipant" > ("RoleTypeID" ASC) > > CREATE INDEX "OccurrenceParticipant_SimRunID_idx" ON "OccurrenceParticipant" > ("SimRunID" ASC) > > CREATE INDEX "OccurrenceParticipant_SimulationID_idx" ON > "OccurrenceParticipant" ("SimulationID" ASC) > The table ‘InitialParticipant’ has 130 records. The structure of the table > is > > > CREATE TABLE InitialParticipant > ( > ParticipantIDINTEGER PRIMARY KEY, ParticipantTypeIDINTEGER, > ParticipantGroupID INTEGER > ) > > The table has the following indexes: > CREATE INDEX "initialpart_participantTypeID_idx" ON "InitialParticipant" > ("ParticipantGroupID" ASC) > > CREATE INDEX "initialpart_ParticipantID_idx" ON "InitialParticipant" > ("ParticipantID" ASC) > > > > The table ‘ParticipantGroup’ has 22 records. It looks like > CREATE TABLE ParticipantGroup > ( > ParticipantGroupIDINTEGER,ParticipantGroupTypeID INTEGER, > Descriptionvarchar (50), PRIMARY KEY( ParticipantGroupID ) > ) > > The table has the following index: > CREATE INDEX "ParticipantGroup_ParticipantGroupID_idx" ON "ParticipantGroup" > ("ParticipantGroupID" ASC) > > > > The table ‘tmpSimArgs’ has 18 records. It has the following structure: > CREATE TABLE tmpSimArgs (SimulationID varchar, SimRunID int(10)) > > And the following indexes: > CREATE INDEX tmpSimArgs_SimRunID_idx ON tmpSimArgs(SimRunID ASC) > > CREATE INDEX tmpSimArgs_SimulationID_idx ON tmpSimArgs(SimulationID ASC) > > > > The table ‘tmpPartArgs’ has 80 records. It has the below structure: > CREATE TABLE tmpPartArgs(participantID INT) > > And the below index: > CREATE INDEX tmpPartArgs_participantID_idx ON tmpPartArgs(participantID ASC) > > I have a query that involves multiple INNER JOINs and the problem I am > facing is the Access version of the query takes about a second whereas the > SQLite version of the same query takes 10 seconds (about 10 times slow!) It > is impossible for me to migrate back to Access and SQLite is my only option. > > I am new to writing database queries so these queries might look stupid, so > please advise on anything you see faulty or kid-dish. > > The query in Access is (the entire query takes 1 second to execute): > SELECT ParticipantGroup.Description, Occurrence.SimulationID, > Occurrence.SimRunID, Occurrence.Period, > Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID FROM > ( > ParticipantGroup INNER JOIN InitialParticipant ON > ParticipantGroup.ParticipantGroupID = InitialParticipant.ParticipantGroupID > ) INNER JOIN > ( > tmpPartArgs INNER JOIN >( >( >tmpSimArgs INNER JOIN Occurrence ON (tmpSimArgs.SimRunID = > Occurrence.SimRunID) AND (tmpSimArgs.SimulationID = Occurrence.SimulationID) >) INNER JOIN OccurrenceParticipant ON (Occurrence.OccurrenceID = > OccurrenceParticipant.OccurrenceID) AND (Occurrence.SimRunID = > OccurrenceParticipant.SimRunID) AND (Occurrence.SimulationID = > OccurrenceParticipant.SimulationID) > ) ON tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID > ) ON InitialParticipant.ParticipantID = OccurrenceParticipant.ParticipantID > WHERE (((OccurrenceParticipant.RoleTypeID)=52 Or > (OccurrenceParticipant.RoleTypeID)=49)) AND Occurrence.HasSucceeded = True > GROUP BY ParticipantGroup.Description, Occurrence.SimulationID, > Occurrence.Sim