Dennis Cote wrote:
Hi All,

I have a situation where populating data in a temp table takes a long time,
about 8 minutes, and I am wondering if anyone can offer any suggestions for
ways to speed it up.

I have the permanent table and associated index, shown below, which holds
tree structured component data. Each node is related to its parent by the
container_id field.

create table Component (
    component_id            integer primary key,
    component_kind_id       integer not null references Component_Kind,
    component_number        integer not null,   -- number of this component
                                                --  within its container
...
    component_function_id   integer default 0 references Component_Function,
    unique (container_id, component_kind_id, component_number)
);

A primary key is always unique. So putting the primary key as the first element of a UNIQUE index accomplishes no good. But it does make the database larger and it makes it run slowly by given the database engine more things to have to keep up-to-date.


--index used to speed container lookups create index Component_Container_Index on Component (container_id);

An index that is on the primary key is redundant. This is especially true of an INTEGER PRIMARY KEY in SQLite. The index will never be used to speed up a search. (If it where used, it would only slow the search down.) But CPU cycles and disk space are consumed maintaining the index all the same. So an index on a primary key is a net loss.


and I populate the temp table with the following SQL statement;

insert into Component_Count
    select sub.id, sub_count, peer_count
    from
        (select pid as id, count(nid) as sub_count
        from
            (select p.component_id as pid, n.component_id as nid
            from Component p left outer join Component n on p.component_id =
n.container_id)
        group by pid) as sub
    join
        (select nid as id, count(pid) as peer_count
        from
            (select n.component_id as nid, p.component_id as pid
            from Component n left outer join Component p
                on (n.component_id != p.component_id
                and n.container_id = p.container_id
                and n.component_kind_id = p.component_kind_id))
        group by id) as peer
    using (id);


Joining two unindexed tables (and the result of a subquery is not going to have an index) will always be an O(N**2) operation. You can change it into an O(NlogN) operation by storing the results of subqueries in separate temporary tables, creating indices on those temporary tables, then doing the join on the temporary tables.

--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565



Reply via email to