I have a sqlite database with two tables, each with 50,000 rows in, containing names of (fake) people. I've constructed a simple query to find out how many names there are (given name, middle initial, surname) that are common to both tables:
select count(*) from fakenames_uk inner join fakenames_usa on fakenames_uk.givenname=fakenames_usa.givenname and fakenames_uk.surname=fakenames_usa.surname and fakenames_uk.middleinitial=fakenames_usa.middleinitial; When there are no indexes except on the primary keys (irrelevant to this query), it runs quickly: [james@marlon Downloads] $ time sqlite3 generic_data_no_indexes.sqlite "select count(*) from fakenames_uk inner join fakenames_usa on fakenames_uk.givenname=fakenames_usa.givenname and fakenames_uk.surname=fakenames_usa.surname and fakenames_uk.middleinitial=fakenames_usa.middleinitial;" 131 real 0m0.115s user 0m0.111s sys 0m0.004s But if I add indexes to the three columns on each table (six indexes in all): CREATE INDEX `idx_uk_givenname` ON `fakenames_uk` (`givenname` ) //etc. then it runs painfully slowly: [james@marlon Downloads] $ time sqlite3 generic_data.sqlite "select count(*) from fakenames_uk inner join fakenames_usa on fakenames_uk.givenname=fakenames_usa.givenname and fakenames_uk.surname=fakenames_usa.surname and fakenames_uk.middleinitial=fakenames_usa.middleinitial;" 131 real 1m43.102s user 0m52.397s sys 0m50.696s Is there any rhyme or reason to this? I'm running 3.13.0, but I've also tried with 3.11.0, and got the same problem. Here's the query plan for the version with the indexes: addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 26 0 00 Start at 26 1 Null 0 1 1 00 r[1..1]=NULL 2 OpenRead 0 2 0 6 00 root=2 iDb=0; fakenames_uk 3 OpenRead 1 5208 0 6 00 root=5208 iDb=0; fakenames_usa 4 OpenRead 2 11215 0 k(2,,) 02 root=11215 iDb=0; idx_us_middleinitial 5 Rewind 0 19 0 00 6 Column 0 4 2 00 r[2]=fakenames_uk.middleinitial 7 SeekGE 2 18 2 1 00 key=r[2] 8 IdxGT 2 18 2 1 00 key=r[2] 9 Seek 2 0 1 00 Move 1 to 2.rowid 10 Column 0 3 3 00 r[3]=fakenames_uk.givenname 11 Column 1 3 4 00 r[4]=fakenames_usa.givenname 12 Ne 4 17 3 (BINARY) 51 if r[4]!=r[3] goto 17 13 Column 0 5 5 00 r[5]=fakenames_uk.surname 14 Column 1 5 6 00 r[6]=fakenames_usa.surname 15 Ne 6 17 5 (BINARY) 51 if r[6]!=r[5] goto 17 16 AggStep0 0 0 1 count(0) 00 accum=r[1] step(r[0]) 17 Next 2 8 1 00 18 Next 0 6 0 01 19 Close 0 0 0 00 20 Close 1 0 0 00 21 Close 2 0 0 00 22 AggFinal 1 0 0 count(0) 00 accum=r[1] N=0 23 Copy 1 7 0 00 r[7]=r[1] 24 ResultRow 7 1 0 00 output=r[7] 25 Halt 0 0 0 00 26 Transaction 0 0 26 0 01 usesStmtJournal=0 27 TableLock 0 2 0 fakenames_uk 00 iDb=0 root=2 write=0 28 TableLock 0 5208 0 fakenames_usa 00 iDb=0 root=5208 write=0 29 Goto 0 1 0 00 Thanks very much for your help! The database is moderately large (45MB or so), but contains no private information, so I can provide it if it would help. James ________________________________ Elsevier Limited. Registered Office: The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084, Registered in England and Wales. _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users