On Wed, 2006-02-08 at 22:31 +0100, Markus Schaber wrote:
Hi, Ken,

Ken Hill schrieb:
> I need some help with a bit of SQL. I have two tables. I want to find
> records in one table that don't match records in another table based on
> a common column in the two tables. Both tables have a column named
> 'key100'. I was trying something like:
> 
> SELECT count(*)
> FROM table1, table2
> WHERE (table1.key100 != table2.key100);
> 
> But the query is very slow and I finally just cancel it. Any help is
> very much appreciated.

Do you have indices on the key100 columns? Is autovacuum running, or do
you do analyze manually?

Can you send us the output from "EXPLAIN ANALYZE [your query]"?

Btw, I don't think this query will do what you wanted, it basically
creates a cross product, that means if your tables look like:

schabitest=# select * from table1;
 key100 | valuea | valueb
--------+--------+--------
      1 | foo    | bar
      2 | blah   | blubb
      3 | manga  | mungo

schabitest=# select * from table2;
 key100 | valuec | valued
--------+--------+--------
      1 | monday | euro
      2 | sunday | dollar
      4 | friday | pounds

Then your query will produce something like:
schabitest=# select * from table1, table2 WHERE (table1.key100 !=
table2.key100);
 key100 | valuea | valueb | key100 | valuec | valued
--------+--------+--------+--------+--------+--------
      1 | foo    | bar    |      2 | sunday | dollar
      1 | foo    | bar    |      4 | friday | pounds
      2 | blah   | blubb  |      1 | monday | euro
      2 | blah   | blubb  |      4 | friday | pounds
      3 | manga  | mungo  |      1 | monday | euro
      3 | manga  | mungo  |      2 | sunday | dollar
      3 | manga  | mungo  |      4 | friday | pounds

I suggest you would like to have all records from table1 that don't have
a corresponding record in table2:

schabitest=# select * from table1 where table1.key100 not in (select
key100 from table2);
 key100 | valuea | valueb
--------+--------+--------
      3 | manga  | mungo

HTH,
Markus

Here is my query SQL:

SELECT key100 FROM ncccr10
WHERE ncccr10.key100 NOT IN (SELECT key100 FROM ncccr9);

It is is running after 30 minutes. Here is the query plan:

                               QUERY PLAN
-------------------------------------------------------------------------
Seq Scan on ncccr10  (cost=0.00..20417160510.08 rows=305782 width=104)
   Filter: (NOT (subplan))
   SubPlan
     ->  Seq Scan on ncccr9  (cost=0.00..65533.71 rows=494471 width=104)
(4 rows)

Any ideas why it is so slow?

Reply via email to