[sqlite] Just want to double check on index need

2010-10-14 Thread Alan Chandler
I am porting an application (American Football Results Picking 
Competition) over from a Postgres databaseo to SQLite which involves 
some fairly intense queries.  I am doing this partially to do some 
performance comparisons although I have some other reasons too.

I just want to make sure that I am setting up the indexes to some of the 
tables correctly.

Here is an example of a representative type of table

CREATE TABLE div_winner_pick (
 cid integer NOT NULL REFERENCES competition(cid) ON UPDATE CASCADE 
ON DELETE CASCADE, -- Competition ID
 confid character(3) NOT NULL REFERENCES conference(confid) ON 
UPDATE CASCADE ON DELETE CASCADE, --Conference ID
 divid character(1) NOT NULL REFERENCES division(divid) ON UPDATE 
CASCADE ON DELETE CASCADE, --Division ID
 uid integer NOT NULL REFERENCES participant(uid) ON UPDATE CASCADE 
ON DELETE CASCADE, --User ID
 tid character(3) NOT NULL REFERENCES team(tid) ON UPDATE CASCADE ON 
DELETE CASCADE, --Team who will win division
 submit_time bigint DEFAULT (strftime('%s','now')) NOT NULL, --Time 
of submission
 PRIMARY KEY (cid,confid,divid,uid)
);

where the Primary key references several columns

For this particular table - in my Postgres definition I created the 
following two indexes

CREATE INDEX div_win_pick_uid_idx ON div_winner_pick (uid);
CREATE INDEX div_win_pick_cid_idx ON div_winner_pick (cid);

i.e.  Two of the 4 fields that make up the primary key.

and I was anticipating doing the same - or something similar - I am not 
yet convinced I don't need to do

CREATE INDEX div_win_pick_uid_cid_idx ON div_winner_pick (uid,cid);


However, I came across the following text on the SQLite Web Site as part 
of the explanation of the CREATE TABLE command



INTEGER PRIMARY KEY columns aside, both UNIQUE and PRIMARY KEY 
constraints are implemented by creating an index in the database (in the 
same way as a CREATE UNIQUE INDEX statement would). Such an index is 
used like any other index in the database to optimize queries. As a 
result, there often no advantage (but significant overhead) in creating 
an index on a set of columns that are already collectively subject to a 
UNIQUE or PRIMARY KEY constraint.


I just wanted to check that lack of advantage (and overhead) applies 
purely to an index across all columns of the primary key and that if I 
need the index across a lesser number of columns (because I am querying 
for all records that match where I can define the values cid and uid 
in the example above) it is still and advantage to create it separately.



-- 
Alan Chandler
http://www.chandlerfamily.org.uk
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Just want to double check on index need

2010-10-14 Thread Dan Kennedy

On Oct 14, 2010, at 10:43 PM, Alan Chandler wrote:

 I am porting an application (American Football Results Picking
 Competition) over from a Postgres databaseo to SQLite which involves
 some fairly intense queries.  I am doing this partially to do some
 performance comparisons although I have some other reasons too.

 I just want to make sure that I am setting up the indexes to some of  
 the
 tables correctly.

 Here is an example of a representative type of table

 CREATE TABLE div_winner_pick (
 cid integer NOT NULL REFERENCES competition(cid) ON UPDATE CASCADE
 ON DELETE CASCADE, -- Competition ID
 confid character(3) NOT NULL REFERENCES conference(confid) ON
 UPDATE CASCADE ON DELETE CASCADE, --Conference ID
 divid character(1) NOT NULL REFERENCES division(divid) ON UPDATE
 CASCADE ON DELETE CASCADE, --Division ID
 uid integer NOT NULL REFERENCES participant(uid) ON UPDATE CASCADE
 ON DELETE CASCADE, --User ID
 tid character(3) NOT NULL REFERENCES team(tid) ON UPDATE CASCADE  
 ON
 DELETE CASCADE, --Team who will win division
 submit_time bigint DEFAULT (strftime('%s','now')) NOT NULL, --Time
 of submission
 PRIMARY KEY (cid,confid,divid,uid)
 );

 where the Primary key references several columns

 For this particular table - in my Postgres definition I created the
 following two indexes

 CREATE INDEX div_win_pick_uid_idx ON div_winner_pick (uid);
 CREATE INDEX div_win_pick_cid_idx ON div_winner_pick (cid);

 i.e.  Two of the 4 fields that make up the primary key.

 and I was anticipating doing the same - or something similar - I am  
 not
 yet convinced I don't need to do

 CREATE INDEX div_win_pick_uid_cid_idx ON div_winner_pick (uid,cid);


 However, I came across the following text on the SQLite Web Site as  
 part
 of the explanation of the CREATE TABLE command



 INTEGER PRIMARY KEY columns aside, both UNIQUE and PRIMARY KEY
 constraints are implemented by creating an index in the database (in  
 the
 same way as a CREATE UNIQUE INDEX statement would). Such an index is
 used like any other index in the database to optimize queries. As a
 result, there often no advantage (but significant overhead) in  
 creating
 an index on a set of columns that are already collectively subject  
 to a
 UNIQUE or PRIMARY KEY constraint.


 I just wanted to check that lack of advantage (and overhead) applies
 purely to an index across all columns of the primary key and that if I
 need the index across a lesser number of columns (because I am  
 querying
 for all records that match where I can define the values cid and  
 uid
 in the example above) it is still and advantage to create it  
 separately.

It should be clearer. Basically the index would be redundant
if it contains the same columns in the same order as the primary
key. Or a prefix thereof. i.e. the following indexes would be
all be redundant (pure overhead for no benefit):

CREATE INDEX x ON div_pick_winner(cid);
CREATE INDEX x ON div_pick_winner(cid,confid);
CREATE INDEX x ON div_pick_winner(cid,confid,divid);
CREATE INDEX x ON div_pick_winner(cid,confid,divid,uid);

Your index is not redundant though.

Dan.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Just want to double check on index need

2010-10-14 Thread Alan Chandler
On 14/10/10 17:28, Dan Kennedy wrote:

 On Oct 14, 2010, at 10:43 PM, Alan Chandler wrote:
 CREATE TABLE div_winner_pick (
...
  PRIMARY KEY (cid,confid,divid,uid)
 );

...

 CREATE INDEX div_win_pick_uid_cid_idx ON div_winner_pick (uid,cid);


...

 It should be clearer. Basically the index would be redundant
 if it contains the same columns in the same order as the primary
 key. Or a prefix thereof. i.e. the following indexes would be
 all be redundant (pure overhead for no benefit):

  CREATE INDEX x ON div_pick_winner(cid);
  CREATE INDEX x ON div_pick_winner(cid,confid);
  CREATE INDEX x ON div_pick_winner(cid,confid,divid);
  CREATE INDEX x ON div_pick_winner(cid,confid,divid,uid);

 Your index is not redundant though.

This is interesting - what if I changed the primary key to be

PRIMARY KEY (uid,cid,confid,divid)

Is that an optimisation that is useful to make?


-- 
Alan Chandler
http://www.chandlerfamily.org.uk
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Just want to double check on index need

2010-10-14 Thread Dan Kennedy

On Oct 14, 2010, at 11:56 PM, Alan Chandler wrote:

 On 14/10/10 17:28, Dan Kennedy wrote:

 On Oct 14, 2010, at 10:43 PM, Alan Chandler wrote:
 CREATE TABLE div_winner_pick (
 ...
 PRIMARY KEY (cid,confid,divid,uid)
 );

 ...

 CREATE INDEX div_win_pick_uid_cid_idx ON div_winner_pick (uid,cid);


 ...

 It should be clearer. Basically the index would be redundant
 if it contains the same columns in the same order as the primary
 key. Or a prefix thereof. i.e. the following indexes would be
 all be redundant (pure overhead for no benefit):

 CREATE INDEX x ON div_pick_winner(cid);
 CREATE INDEX x ON div_pick_winner(cid,confid);
 CREATE INDEX x ON div_pick_winner(cid,confid,divid);
 CREATE INDEX x ON div_pick_winner(cid,confid,divid,uid);

 Your index is not redundant though.

 This is interesting - what if I changed the primary key to be

 PRIMARY KEY (uid,cid,confid,divid)

 Is that an optimisation that is useful to make?

Quite possibly.

If you change the primary key to the above then you can
leave out the (uid, cid) index. All queries that would
have used the (uid, cid) index will use the automatic
index created by the PRIMARY KEY instead.

One less index means a smaller database file and less
work for SQLite to do when you insert/update/delete a
row.

Dan.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users