*I'm leaving aside the rant that your first delete is not identical to combination of the select and delete in the second approach and select in second approach contains typos...*
But did you try to combine your insert and delete statements from the second approach? This approach quicker because of exactly that - select and delete statements are independent whereas in first approach your select is executed again and again for each row in TABLE2. So just make it like this: DELETE FROM TABLE2 WHERE ENTRY_ID NOT IN (SELECT T2.ENTRY_ID FROM TABLE2 T2 INNER JOIN TABLE1 T1 ON (T1.PATIENT_ID = T2.PATIENT_ID) WHERE julianday(T2.START_DATE, '+15 month') > julianday(T1.START_DATE) ) And about indexes: for this query index on (PATIENT_ID, START_DATE) doesn't do any better than on (PATIENT_ID) because SQLite must to check all rows with given PATIENT_ID anyway. And that is because START_DATE is in the query inside function call to julianday(). Index on 2 fields could help only if your condition was e.g. T2.START_DATE > T1.START_DATE. And in most cases there's no benefit creating index on 1 field (PATIENT_ID) when there's already index on 2 fields (PATIENT_ID, START_DATE) where PATIENT_ID is the first field. Pavel On Fri, Sep 18, 2009 at 4:46 PM, Bart Smissaert <bart.smissa...@gmail.com> wrote: > This is what I am dealing with: > 2 tables with exactly the same schema (but could be slightly > different, so can't put in same table): > > CREATE TABLE TABLE1( > [PATIENT_ID] INTEGER, > [ENTRY_ID] INTEGER PRIMARY KEY, > [READ_CODE] TEXT, > [ADDED_DATE] TEXT, > [START_DATE] TEXT) > > The last 2 date fields are in the format yyyy-mm-dd > > Again in both tables the following indexes: > CREATE INDEX IDX_TABLE1_PATIENT_ID ON TABLE1(PATIENT_ID) > CREATE INDEX IDX_TABLE1_PATIENT_ID_START_DATE ON TABLE1(PATIENT_ID, > START_DATE) > CREATE INDEX IDX_TABLE1_PATIENT_ID_ENTRY_ID ON TABLE1(PATIENT_ID, ENTRY_ID) > CREATE INDEX TABLE1_START_DATE ON TABLE1(START_DATE) > > Then some delete SQL's will be run so that in the end PATIENT_ID is > unique in both tables. > > Then the SQL I was trying to improve: > > DELETE > FROM > TABLE2 > WHERE > PATIENT_ID NOT IN ( > SELECT > TABLE1.PATIENT_ID > FROM > TABLE1 > WHERE > JULIANDAY(TABLE2.START_DATE, '-14 month') > > JULIANDAY(TABLE1.START_DATE) AND > TABLE1.PATIENT_ID = TABLE2.PATIENT_ID) > > The index it will use for this SQL is IDX_TABLE1_PATIENT_ID_ENTRY_ID > and that surprises me as ENTRY_ID is not in the above SQL. If I drop > that index then it will > use IDX_TABLE1_PATIENT_ID and only if that is dropped as well will it > use IDX_TABLE1_PATIENT_ID_START_DATE, which I thought would be the > best, but turns > out to be no better than the first. Whatever way I do this it is slow > and I can do it a lot quicker by doing this: > > CREATE TEMP TABLE DATE_COMPARE_TEMP([E_ID] INTEGER) > > INSERT INTO DATE_COMPARE_TEMP (E_ID) > SELECT T2.ENTRY_ID FROM > TABLE2 T2 INNER JOIN TABLE1 T2 ON > (T1.PATIENT_ID = T2.PATIENT_ID) > WHERE julianday(T2.START_DATE, '+15 month') > > julianday(T1.START_DATE) > > CREATE INDEX IDX1_DATE_COMPARE_TEMP_E_ID ON DATE_COMPARE_TEMP(E_ID) > > analyze DATE_COMPARE_TEMP > > and then the delete SQL like this: > > DELETE > FROM > TABLE2 > WHERE > ENTRY_ID NOT IN > (SELECT E_ID FROM DATE_COMPARE_TEMP) > > Although this involves a third temp table this method is about twice > as fast as the first one. > Maybe that is just the way it is and there just is no way to do this > as fast without the intermediate temp table, but I just wondered. > Again there is no problem here as the second method is simple and > fast. Just trying to increase my understanding of SQLite and indexes. > > > RBS > > > > > On Fri, Sep 18, 2009 at 8:25 PM, Simon Slavin > <slav...@hearsay.demon.co.uk> wrote: >> >> On 18 Sep 2009, at 4:49pm, bartsmissa...@blueyonder.co.uk wrote: >> >>>> Did something in the documentation make >>>> you think SQLite wouldn't use a >>>> multi-column index unless you forced it ? >>> >>> No, but I just noticed it didn't use the index I thought would be >>> best. As >>> it turned out it looks I was wrong in that that index didn't give the >>> quickest result. >> >> Ah. Okay, that's fine. It can be quite surprising what's best. Glad >> you got a good result. >> >> Simon. >> _______________________________________________ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users