Here is my query SQL:Hi, Ken, Ken Hill schrieb: > I need some help with a bit of SQL. I have two tables. I want to find > records in one table that don't match records in another table based on > a common column in the two tables. Both tables have a column named > 'key100'. I was trying something like: > > SELECT count(*) > FROM table1, table2 > WHERE (table1.key100 != table2.key100); > > But the query is very slow and I finally just cancel it. Any help is > very much appreciated. Do you have indices on the key100 columns? Is autovacuum running, or do you do analyze manually? Can you send us the output from "EXPLAIN ANALYZE [your query]"? Btw, I don't think this query will do what you wanted, it basically creates a cross product, that means if your tables look like: schabitest=# select * from table1; key100 | valuea | valueb --------+--------+-------- 1 | foo | bar 2 | blah | blubb 3 | manga | mungo schabitest=# select * from table2; key100 | valuec | valued --------+--------+-------- 1 | monday | euro 2 | sunday | dollar 4 | friday | pounds Then your query will produce something like: schabitest=# select * from table1, table2 WHERE (table1.key100 != table2.key100); key100 | valuea | valueb | key100 | valuec | valued --------+--------+--------+--------+--------+-------- 1 | foo | bar | 2 | sunday | dollar 1 | foo | bar | 4 | friday | pounds 2 | blah | blubb | 1 | monday | euro 2 | blah | blubb | 4 | friday | pounds 3 | manga | mungo | 1 | monday | euro 3 | manga | mungo | 2 | sunday | dollar 3 | manga | mungo | 4 | friday | pounds I suggest you would like to have all records from table1 that don't have a corresponding record in table2: schabitest=# select * from table1 where table1.key100 not in (select key100 from table2); key100 | valuea | valueb --------+--------+-------- 3 | manga | mungo HTH, Markus
SELECT key100 FROM ncccr10
WHERE ncccr10.key100 NOT IN (SELECT key100 FROM ncccr9);
It is is running after 30 minutes. Here is the query plan:
QUERY PLAN
-------------------------------------------------------------------------
Seq Scan on ncccr10 (cost=0.00..20417160510.08 rows=305782 width=104)
Filter: (NOT (subplan))
SubPlan
-> Seq Scan on ncccr9 (cost=0.00..65533.71 rows=494471 width=104)
(4 rows)
Any ideas why it is so slow?