[sqlite] Just want to double check on index need
I am porting an application (American Football Results Picking Competition) over from a Postgres databaseo to SQLite which involves some fairly intense queries. I am doing this partially to do some performance comparisons although I have some other reasons too. I just want to make sure that I am setting up the indexes to some of the tables correctly. Here is an example of a representative type of table CREATE TABLE div_winner_pick ( cid integer NOT NULL REFERENCES competition(cid) ON UPDATE CASCADE ON DELETE CASCADE, -- Competition ID confid character(3) NOT NULL REFERENCES conference(confid) ON UPDATE CASCADE ON DELETE CASCADE, --Conference ID divid character(1) NOT NULL REFERENCES division(divid) ON UPDATE CASCADE ON DELETE CASCADE, --Division ID uid integer NOT NULL REFERENCES participant(uid) ON UPDATE CASCADE ON DELETE CASCADE, --User ID tid character(3) NOT NULL REFERENCES team(tid) ON UPDATE CASCADE ON DELETE CASCADE, --Team who will win division submit_time bigint DEFAULT (strftime('%s','now')) NOT NULL, --Time of submission PRIMARY KEY (cid,confid,divid,uid) ); where the Primary key references several columns For this particular table - in my Postgres definition I created the following two indexes CREATE INDEX div_win_pick_uid_idx ON div_winner_pick (uid); CREATE INDEX div_win_pick_cid_idx ON div_winner_pick (cid); i.e. Two of the 4 fields that make up the primary key. and I was anticipating doing the same - or something similar - I am not yet convinced I don't need to do CREATE INDEX div_win_pick_uid_cid_idx ON div_winner_pick (uid,cid); However, I came across the following text on the SQLite Web Site as part of the explanation of the CREATE TABLE command INTEGER PRIMARY KEY columns aside, both UNIQUE and PRIMARY KEY constraints are implemented by creating an index in the database (in the same way as a CREATE UNIQUE INDEX statement would). Such an index is used like any other index in the database to optimize queries. As a result, there often no advantage (but significant overhead) in creating an index on a set of columns that are already collectively subject to a UNIQUE or PRIMARY KEY constraint. I just wanted to check that lack of advantage (and overhead) applies purely to an index across all columns of the primary key and that if I need the index across a lesser number of columns (because I am querying for all records that match where I can define the values cid and uid in the example above) it is still and advantage to create it separately. -- Alan Chandler http://www.chandlerfamily.org.uk ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Just want to double check on index need
On Oct 14, 2010, at 10:43 PM, Alan Chandler wrote: I am porting an application (American Football Results Picking Competition) over from a Postgres databaseo to SQLite which involves some fairly intense queries. I am doing this partially to do some performance comparisons although I have some other reasons too. I just want to make sure that I am setting up the indexes to some of the tables correctly. Here is an example of a representative type of table CREATE TABLE div_winner_pick ( cid integer NOT NULL REFERENCES competition(cid) ON UPDATE CASCADE ON DELETE CASCADE, -- Competition ID confid character(3) NOT NULL REFERENCES conference(confid) ON UPDATE CASCADE ON DELETE CASCADE, --Conference ID divid character(1) NOT NULL REFERENCES division(divid) ON UPDATE CASCADE ON DELETE CASCADE, --Division ID uid integer NOT NULL REFERENCES participant(uid) ON UPDATE CASCADE ON DELETE CASCADE, --User ID tid character(3) NOT NULL REFERENCES team(tid) ON UPDATE CASCADE ON DELETE CASCADE, --Team who will win division submit_time bigint DEFAULT (strftime('%s','now')) NOT NULL, --Time of submission PRIMARY KEY (cid,confid,divid,uid) ); where the Primary key references several columns For this particular table - in my Postgres definition I created the following two indexes CREATE INDEX div_win_pick_uid_idx ON div_winner_pick (uid); CREATE INDEX div_win_pick_cid_idx ON div_winner_pick (cid); i.e. Two of the 4 fields that make up the primary key. and I was anticipating doing the same - or something similar - I am not yet convinced I don't need to do CREATE INDEX div_win_pick_uid_cid_idx ON div_winner_pick (uid,cid); However, I came across the following text on the SQLite Web Site as part of the explanation of the CREATE TABLE command INTEGER PRIMARY KEY columns aside, both UNIQUE and PRIMARY KEY constraints are implemented by creating an index in the database (in the same way as a CREATE UNIQUE INDEX statement would). Such an index is used like any other index in the database to optimize queries. As a result, there often no advantage (but significant overhead) in creating an index on a set of columns that are already collectively subject to a UNIQUE or PRIMARY KEY constraint. I just wanted to check that lack of advantage (and overhead) applies purely to an index across all columns of the primary key and that if I need the index across a lesser number of columns (because I am querying for all records that match where I can define the values cid and uid in the example above) it is still and advantage to create it separately. It should be clearer. Basically the index would be redundant if it contains the same columns in the same order as the primary key. Or a prefix thereof. i.e. the following indexes would be all be redundant (pure overhead for no benefit): CREATE INDEX x ON div_pick_winner(cid); CREATE INDEX x ON div_pick_winner(cid,confid); CREATE INDEX x ON div_pick_winner(cid,confid,divid); CREATE INDEX x ON div_pick_winner(cid,confid,divid,uid); Your index is not redundant though. Dan. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Just want to double check on index need
On 14/10/10 17:28, Dan Kennedy wrote: On Oct 14, 2010, at 10:43 PM, Alan Chandler wrote: CREATE TABLE div_winner_pick ( ... PRIMARY KEY (cid,confid,divid,uid) ); ... CREATE INDEX div_win_pick_uid_cid_idx ON div_winner_pick (uid,cid); ... It should be clearer. Basically the index would be redundant if it contains the same columns in the same order as the primary key. Or a prefix thereof. i.e. the following indexes would be all be redundant (pure overhead for no benefit): CREATE INDEX x ON div_pick_winner(cid); CREATE INDEX x ON div_pick_winner(cid,confid); CREATE INDEX x ON div_pick_winner(cid,confid,divid); CREATE INDEX x ON div_pick_winner(cid,confid,divid,uid); Your index is not redundant though. This is interesting - what if I changed the primary key to be PRIMARY KEY (uid,cid,confid,divid) Is that an optimisation that is useful to make? -- Alan Chandler http://www.chandlerfamily.org.uk ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Just want to double check on index need
On Oct 14, 2010, at 11:56 PM, Alan Chandler wrote: On 14/10/10 17:28, Dan Kennedy wrote: On Oct 14, 2010, at 10:43 PM, Alan Chandler wrote: CREATE TABLE div_winner_pick ( ... PRIMARY KEY (cid,confid,divid,uid) ); ... CREATE INDEX div_win_pick_uid_cid_idx ON div_winner_pick (uid,cid); ... It should be clearer. Basically the index would be redundant if it contains the same columns in the same order as the primary key. Or a prefix thereof. i.e. the following indexes would be all be redundant (pure overhead for no benefit): CREATE INDEX x ON div_pick_winner(cid); CREATE INDEX x ON div_pick_winner(cid,confid); CREATE INDEX x ON div_pick_winner(cid,confid,divid); CREATE INDEX x ON div_pick_winner(cid,confid,divid,uid); Your index is not redundant though. This is interesting - what if I changed the primary key to be PRIMARY KEY (uid,cid,confid,divid) Is that an optimisation that is useful to make? Quite possibly. If you change the primary key to the above then you can leave out the (uid, cid) index. All queries that would have used the (uid, cid) index will use the automatic index created by the PRIMARY KEY instead. One less index means a smaller database file and less work for SQLite to do when you insert/update/delete a row. Dan. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users