I have checked, and a missing ANALYZE is indeed not the cause of this
difference in query speed.

RBS


On Tue, Jul 12, 2011 at 9:06 PM, Bart Smissaert
<bart.smissa...@gmail.com> wrote:
>> it thinks that scanning the whole ENTRY_ATTRIBUTES table will
>> read about 54855 rows, while you say it has many more rows.
>
> This particular database has fewer rows; the millions I mentioned are
> in a different database. I think the figures are right, but I will
> check.
>
>> I think running ANALYZE on your database
>
> ANALYZE was done, but I will double-check.
>
> RBS
>
>
> On Tue, Jul 12, 2011 at 8:59 PM, Pavel Ivanov <paiva...@gmail.com> wrote:
>>> Now what surprises me is that this optimization is not done
>>> automatically by SQLite.
>>> I suppose I just overestimate the capabilities of the SQLite plan
>>> generator. Or would this be something that could be improved?
>>
>> It's a very non-obvious optimization, and I think another type of
>> optimization will give a much better result in your case.
>> To understand the optimizer's behavior, look at the numbers it shows
>> you: it thinks that scanning the whole ENTRY_ATTRIBUTES table will
>> read about 54855 rows, while you say it has many more rows. That's
>> why the optimizer selects a sub-optimal plan.
>> I think running ANALYZE on your database should fix the selected
>> plans, and even the first query should then run much faster.
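>>
>> For example, a minimal check (sqlite_stat1 is where ANALYZE stores
>> its statistics; the table name is taken from your plan output):
>>
>> ANALYZE;
>> -- inspect the row estimates the optimizer will now use
>> SELECT * FROM sqlite_stat1 WHERE tbl = 'ENTRY_ATTRIBUTES';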
>>
>>
>> Pavel
>>
>>
>> On Tue, Jul 12, 2011 at 3:39 PM, Bart Smissaert
>> <bart.smissa...@gmail.com> wrote:
>>> I am joining a large table (ENTRY_ATTRIBUTES) and a small table
>>> (BPNewENTRY) and putting the resulting records in a third table,
>>> BP3. The large table may have a few million records and the small
>>> table a few hundred. The join field is called ENTRY_ID in both
>>> tables; it has a non-unique index in the large table and is the
>>> integer primary key in the small table. The data type is integer in
>>> both fields.
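>>>
>>> Simplified, the schema looks roughly like this (other columns
>>> omitted, and the two value columns are shown with indicative types
>>> only; the index name is the one in the query plans below):
>>>
>>> CREATE TABLE ENTRY_ATTRIBUTES (
>>>     ENTRY_ID INTEGER,             -- non-unique join key
>>>     NUMERIC_VALUE REAL,           -- indicative type
>>>     TYPE_SPECIFIC_INFO TEXT       -- indicative type
>>> );
>>> CREATE INDEX IDX_ENTRY_ATTRIBUTES_ENTRY_ID
>>>     ON ENTRY_ATTRIBUTES(ENTRY_ID);
>>>
>>> CREATE TABLE BPNewENTRY (
>>>     ENTRY_ID INTEGER PRIMARY KEY  -- join key
>>> );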
>>>
>>> This is the SQL:
>>>
>>> INSERT OR REPLACE INTO BP3
>>> (ENTRY_ID, NUMERIC_VALUE, UNIT)
>>> SELECT EA.ENTRY_ID, EA.NUMERIC_VALUE, EA.TYPE_SPECIFIC_INFO
>>> FROM
>>> ENTRY_ATTRIBUTES EA INNER JOIN BPNewENTRY E ON
>>> (EA.ENTRY_ID = E.ENTRY_ID)
>>>
>>> and this is the generated query plan for that:
>>>
>>> 0--0--0--SCAN TABLE ENTRY_ATTRIBUTES AS EA (~54855 rows) <-->
>>> 0--1--1--SEARCH TABLE BPNewENTRY AS E USING INTEGER PRIMARY KEY
>>> (rowid=?) (~1 rows)
>>>
>>> Now I can speed the query up a lot by adding a WHERE clause with
>>> the min(ENTRY_ID) of the small table:
>>>
>>> INSERT OR REPLACE INTO BP3
>>> (ENTRY_ID, NUMERIC_VALUE, UNIT)
>>> SELECT EA.ENTRY_ID, EA.NUMERIC_VALUE, EA.TYPE_SPECIFIC_INFO
>>> FROM
>>> ENTRY_ATTRIBUTES EA INNER JOIN BPNewENTRY E ON
>>> (EA.ENTRY_ID = E.ENTRY_ID)
>>> WHERE
>>> EA.ENTRY_ID >= 4262936
>>>
>>> and that will have this query plan:
>>>
>>> 0--0--0--SEARCH TABLE ENTRY_ATTRIBUTES AS EA USING INDEX
>>> IDX_ENTRY_ATTRIBUTES_ENTRY_ID (ENTRY_ID>?) (~18102 rows) <-->
>>> 0--1--1--SEARCH TABLE BPNewENTRY AS E USING INTEGER PRIMARY KEY
>>> (rowid=?) (~1 rows)
>>>
>>> Getting this min(ENTRY_ID) is done in a separate query, and because
>>> it runs against the small table it is very fast.
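>>>
>>> That separate query is simply:
>>>
>>> SELECT min(ENTRY_ID) FROM BPNewENTRY;
>>>
>>> and since ENTRY_ID is the integer primary key there, SQLite can
>>> answer it without scanning the table.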
>>>
>>> Now what surprises me is that this optimization is not done
>>> automatically by SQLite.
>>> I suppose I just overestimate the capabilities of the SQLite plan
>>> generator. Or would this be something that could be improved?
>>>
>>> I am using SQLite version 3.7.5.
>>>
>>>
>>> RBS
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
