I tried to see how values were distributed across two tables, and tried 
something similar to (slightly simplified):

SELECT COALESCE(A.F1, B.F1), COALESCE(A.F2, B.F2),
       CASE 
         WHEN A.PK IS NULL THEN 'B only'
         WHEN B.PK IS NULL THEN 'A only'
         ELSE 'Both'
       END InTable, COUNT(*)
FROM A
FULL JOIN B ON A.F1 = B.F1 AND A.F2 = B.F2
GROUP BY 1, 2, 3

This took longer than I expected, so I stopped the query (I waited for about 15 
minutes before stopping it), and rewrote to:

WITH T(F1, F2) AS
(SELECT F1, F2
 FROM A
 UNION
 SELECT F1, F2
 FROM B)
SELECT T.F1, T.F2,
       CASE 
         WHEN A.PK IS NULL THEN 'B only'
         WHEN B.PK IS NULL THEN 'A only'
         ELSE 'Both'
       END InTable, COUNT(*)
FROM T
LEFT JOIN A ON T.F1 = A.F1 AND T.F2 = A.F2
LEFT JOIN B ON T.F1 = B.F1 AND T.F2 = B.F2
GROUP BY 1, 2, 3

This query executed fine (don't remember whether it used 30 or 90 seconds to 
execute).

A contains the about 8000 allowed combinations of B, whereas B contains 1.6 
millions of rows. Think of B.F1, B.F2 as being a relaxed FOREIGN KEY - no new 
entries that aren't in A should be allowed, but historical content should not 
be deleted, so we don't want to actually implement it as a FK (moreover, in the 
real database A.F2 contains five digits, whereas B.F2 is six digits with no 
restrictions on the last digit).

Naturally, the first query use NATURAL for A and B, whereas the second query 
use natural for both tables in the CTE, but indexes for the LEFT JOIN.

The server where I ran the queries is a Firebird 2.5.1 SuperServer installation.

I hardly ever use FULL JOIN and wonder why the first query should take so much 
longer than the second? 
Set
-INNER JOIN expert, fairly knowledgeable regarding LEFT/RIGHT JOIN, but a 
newbie regarding FULL JOIN
  • ... Svein Erling Tysvær svein.erling.tysv...@kreftregisteret.no [firebird-support]
    • ... Dmitry Yemanov dim...@users.sourceforge.net [firebird-support]

Reply via email to