> Now what surprises me is that this optimization is not done
> automatically by SQLite.
> I suppose I just over estimate the capabilities of the SQLite plan generator.
> Or, would this be something that could be improved?

It's very non-obvious optimization and I think other type of
optimization will give much better result in your case.
To understand the optimizer's behavior look at numbers it shows you:
it thinks that scanning the whole ENTRY_ATTRIBUTES table will read
about 54855 rows. And you say that it has much more rows. That's why
optimizer selects sub-optimal plan.
I think running ANALYZE on your database should fix selected plans and
even first query will run much faster.


Pavel


On Tue, Jul 12, 2011 at 3:39 PM, Bart Smissaert
<bart.smissa...@gmail.com> wrote:
> Joining a large table (ENTRY_ATTRIBUTES) and a small table
> (BPNewENTRY) and putting the resulting records
> in a third table, BP3. Large table may have a few million records and
> small table a few hundred records.
> The join field is called ENTRY_ID in both tables and this has a
> non-unique index in the large table and is the integer primary key
> in the small table. Data type is integer in both these fields.
>
> This is the SQL:
>
> INSERT OR REPLACE INTO BP3
> (ENTRY_ID, NUMERIC_VALUE, UNIT)
> SELECT EA.ENTRY_ID, EA.NUMERIC_VALUE, EA.TYPE_SPECIFIC_INFO
> FROM
> ENTRY_ATTRIBUTES EA INNER JOIN BPNewENTRY E ON
> (EA.ENTRY_ID = E.ENTRY_ID)
>
> and this is the generated query plan for that:
>
> 0--0--0--SCAN TABLE ENTRY_ATTRIBUTES AS EA (~54855 rows) <-->
> 0--1--1--SEARCH TABLE BPNewENTRY AS E USING INTEGER PRIMARY KEY
> (rowid=?) (~1 rows)
>
> Now I can speed up the query a lot by putting a where clause in with
> the min(ENTRY_ID) of the small table:
>
> INSERT OR REPLACE INTO BP3
> (ENTRY_ID, NUMERIC_VALUE, UNIT)
> SELECT EA.ENTRY_ID, EA.NUMERIC_VALUE, EA.TYPE_SPECIFIC_INFO
> FROM
> ENTRY_ATTRIBUTES EA INNER JOIN BPNewENTRY E ON
> (EA.ENTRY_ID = E.ENTRY_ID)
> WHERE
> EA.ENTRY_ID >= 4262936
>
> and that will have this query plan:
>
> 0--0--0--SEARCH TABLE ENTRY_ATTRIBUTES AS EA USING INDEX
> IDX_ENTRY_ATTRIBUTES_ENTRY_ID (ENTRY_ID>?) (~18102 rows) <-->
> 0--1--1--SEARCH TABLE BPNewENTRY AS E USING INTEGER PRIMARY KEY
> (rowid=?) (~1 rows)
>
> Getting this min(ENTRY_ID) is done in a separate query and because it
> is done on the small table it is very fast.
>
> Now what surprises me is that this optimization is not done
> automatically by SQLite.
> I suppose I just over estimate the capabilities of the SQLite plan generator.
> Or, would this be something that could be improved?
>
> I am using SQLite version 3.7.5.
>
>
> RBS
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to