On May 22, 2009, at 1:51 PM, Emi Lu wrote:

Two tables, each contains more than hundreds of thousands records. Is there any efficiency differences between (1) and (2)?

(1) T1 inner join T2 using (c1, c2)


(2) T1 left join T2 using (c1, c2) where c2 is not null

Yes, stick with the first. In the second you're asking the db to generate a result set with tuples for every row in T1 and then filter it down to where there are only matching T2 rows whereas in the first it does the filtering as it goes. The LEFT JOIN ... WHERE X NOT NULL construct is typically used as an alternative to a NOT IN or NOT EXISTS (<subquery>). So, this:

SELECT *
FROM t1
WHERE id NOT IN (SELECT some_id
                                    FROM T2);

becomes

SELECT t1.*
FROM t1 LEFT JOIN t2 ON (t1.id = t2.some_id)
WHERE t2.id IS NULL;

Basically, it's used in the opposite case of what you're asking about.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k






--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

Reply via email to