Dennis Cote wrote:
Hi All,
I have a situation where populating data in a temp table takes a long time,
about 8 minutes, and I am wondering if anyone can offer any suggestions for
ways to speed it up.
I have the permanent table and associated index, shown below, which holds
tree structured component data. Each node is related to its parent by the
container_id field.
create table Component (
component_id integer primary key,
component_kind_id integer not null references Component_Kind,
component_number integer not null, -- number of this component
-- within its container
...
component_function_id integer default 0 references Component_Function,
unique (container_id, component_kind_id, component_number)
);
A primary key is always unique. So putting the primary key as
the first element of a UNIQUE index accomplishes no good. But
it does make the database larger and it makes it run slowly
by given the database engine more things to have to keep
up-to-date.
--index used to speed container lookups
create index Component_Container_Index on Component (container_id);
An index that is on the primary key is redundant. This is especially
true of an INTEGER PRIMARY KEY in SQLite. The index will never be used
to speed up a search. (If it where used, it would only slow the search
down.) But CPU cycles and disk space are consumed maintaining the
index all the same. So an index on a primary key is a net loss.
and I populate the temp table with the following SQL statement;
insert into Component_Count
select sub.id, sub_count, peer_count
from
(select pid as id, count(nid) as sub_count
from
(select p.component_id as pid, n.component_id as nid
from Component p left outer join Component n on p.component_id =
n.container_id)
group by pid) as sub
join
(select nid as id, count(pid) as peer_count
from
(select n.component_id as nid, p.component_id as pid
from Component n left outer join Component p
on (n.component_id != p.component_id
and n.container_id = p.container_id
and n.component_kind_id = p.component_kind_id))
group by id) as peer
using (id);
Joining two unindexed tables (and the result of a subquery is not going
to have an index) will always be an O(N**2) operation. You can change
it into an O(NlogN) operation by storing the results of subqueries in
separate temporary tables, creating indices on those temporary tables,
then doing the join on the temporary tables.
--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565