I am joining a large table (ENTRY_ATTRIBUTES) and a small table (BPNewENTRY) and putting the resulting records in a third table, BP3. The large table may have a few million records and the small table a few hundred. The join field is called ENTRY_ID in both tables; it has a non-unique index in the large table and is the integer primary key in the small table. The data type is integer in both fields.
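
For context, a minimal sketch of the schema as described above. Only ENTRY_ID, the index name, and the BP3 column names are taken from the message; the other column types, and the primary key on BP3, are assumptions for illustration.

-- Hypothetical schema; types of the non-key columns are guesses.
CREATE TABLE ENTRY_ATTRIBUTES (
    ENTRY_ID           INTEGER,   -- non-unique, indexed below
    NUMERIC_VALUE      REAL,
    TYPE_SPECIFIC_INFO TEXT
);
CREATE INDEX IDX_ENTRY_ATTRIBUTES_ENTRY_ID ON ENTRY_ATTRIBUTES (ENTRY_ID);

CREATE TABLE BPNewENTRY (
    ENTRY_ID INTEGER PRIMARY KEY  -- integer primary key of the small table
);

CREATE TABLE BP3 (
    ENTRY_ID      INTEGER PRIMARY KEY,  -- assumed unique, since INSERT OR REPLACE is used
    NUMERIC_VALUE REAL,
    UNIT          TEXT
);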
This is the SQL:

INSERT OR REPLACE INTO BP3 (ENTRY_ID, NUMERIC_VALUE, UNIT)
SELECT EA.ENTRY_ID, EA.NUMERIC_VALUE, EA.TYPE_SPECIFIC_INFO
FROM ENTRY_ATTRIBUTES EA
INNER JOIN BPNewENTRY E ON (EA.ENTRY_ID = E.ENTRY_ID)

and this is the generated query plan:

0--0--0--SCAN TABLE ENTRY_ATTRIBUTES AS EA (~54855 rows)
0--1--1--SEARCH TABLE BPNewENTRY AS E USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)

Now, I can speed up the query considerably by adding a WHERE clause with the min(ENTRY_ID) of the small table:

INSERT OR REPLACE INTO BP3 (ENTRY_ID, NUMERIC_VALUE, UNIT)
SELECT EA.ENTRY_ID, EA.NUMERIC_VALUE, EA.TYPE_SPECIFIC_INFO
FROM ENTRY_ATTRIBUTES EA
INNER JOIN BPNewENTRY E ON (EA.ENTRY_ID = E.ENTRY_ID)
WHERE EA.ENTRY_ID >= 4262936

which gives this query plan:

0--0--0--SEARCH TABLE ENTRY_ATTRIBUTES AS EA USING INDEX IDX_ENTRY_ATTRIBUTES_ENTRY_ID (ENTRY_ID>?) (~18102 rows)
0--1--1--SEARCH TABLE BPNewENTRY AS E USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)

Getting this min(ENTRY_ID) is done in a separate query, and because it runs against the small table it is very fast.

What surprises me is that SQLite does not apply this optimization automatically. I suppose I just overestimate the capabilities of the SQLite plan generator. Or would this be something that could be improved?

I am using SQLite version 3.7.5.

RBS
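
For reference, the separate lower-bound query mentioned above is presumably just a single-row aggregate over the small table, along these lines (a sketch, not necessarily the exact statement used):

-- Fast because BPNewENTRY has only a few hundred rows and ENTRY_ID is its
-- integer primary key; the result (4262936 in the example) is then
-- substituted into the WHERE clause of the INSERT ... SELECT.
SELECT min(ENTRY_ID) FROM BPNewENTRY;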