> Now what surprises me is that this optimization is not done > automatically by SQLite. > I suppose I just over estimate the capabilities of the SQLite plan generator. > Or, would this be something that could be improved?
It's very non-obvious optimization and I think other type of optimization will give much better result in your case. To understand the optimizer's behavior look at numbers it shows you: it thinks that scanning the whole ENTRY_ATTRIBUTES table will read about 54855 rows. And you say that it has much more rows. That's why optimizer selects sub-optimal plan. I think running ANALYZE on your database should fix selected plans and even first query will run much faster. Pavel On Tue, Jul 12, 2011 at 3:39 PM, Bart Smissaert <bart.smissa...@gmail.com> wrote: > Joining a large table (ENTRY_ATTRIBUTES) and a small table > (BPNewENTRY) and putting the resulting records > in a third table, BP3. Large table may have a few million records and > small table a few hundred records. > The join field is called ENTRY_ID in both tables and this has a > non-unique index in the large table and is the integer primary key > in the small table. Data type is integer in both these fields. > > This is the SQL: > > INSERT OR REPLACE INTO BP3 > (ENTRY_ID, NUMERIC_VALUE, UNIT) > SELECT EA.ENTRY_ID, EA.NUMERIC_VALUE, EA.TYPE_SPECIFIC_INFO > FROM > ENTRY_ATTRIBUTES EA INNER JOIN BPNewENTRY E ON > (EA.ENTRY_ID = E.ENTRY_ID) > > and this is the generated query plan for that: > > 0--0--0--SCAN TABLE ENTRY_ATTRIBUTES AS EA (~54855 rows) <--> > 0--1--1--SEARCH TABLE BPNewENTRY AS E USING INTEGER PRIMARY KEY > (rowid=?) (~1 rows) > > Now I can speed up the query a lot by putting a where clause in with > the min(ENTRY_ID) of the small table: > > INSERT OR REPLACE INTO BP3 > (ENTRY_ID, NUMERIC_VALUE, UNIT) > SELECT EA.ENTRY_ID, EA.NUMERIC_VALUE, EA.TYPE_SPECIFIC_INFO > FROM > ENTRY_ATTRIBUTES EA INNER JOIN BPNewENTRY E ON > (EA.ENTRY_ID = E.ENTRY_ID) > WHERE > EA.ENTRY_ID >= 4262936 > > and that will have this query plan: > > 0--0--0--SEARCH TABLE ENTRY_ATTRIBUTES AS EA USING INDEX > IDX_ENTRY_ATTRIBUTES_ENTRY_ID (ENTRY_ID>?) (~18102 rows) <--> > 0--1--1--SEARCH TABLE BPNewENTRY AS E USING INTEGER PRIMARY KEY > (rowid=?) (~1 rows) > > Getting this min(ENTRY_ID) is done in a separate query and because it > is done on the small table it is very fast. > > Now what surprises me is that this optimization is not done > automatically by SQLite. > I suppose I just over estimate the capabilities of the SQLite plan generator. > Or, would this be something that could be improved? > > I am using SQLite version 3.7.5. > > > RBS > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users