Jamie Norrish wrote:
> Is there any optimisation done for the case of a symmetric EXCEPT, to
> avoid performing duplicate SELECT statements?

Some subqueries can be flattened by SQLite, but otherwise, subqueries
aren't really optimized at all.

> (SELECT A... EXCEPT SELECT B...) UNION (SELECT B... EXCEPT SELECT A...)
>
> would seem to be a poor approach to getting the set of rows in either A
> or B. Are there potentially better generic approaches, or other tricks
> I've missed?

The description "the set of rows in A or B, except those that are in
both A and B" leads to

  (SELECT A... UNION SELECT B...) EXCEPT (SELECT A... INTERSECT SELECT B...)

There doesn't seem to be much of a difference in practice; performance
probably depends on the size of the subquery results:

sqlite> explain query plan select * from a except select * from b union all
            select * from (select * from b except select * from a);
sele  order          from  deta
----  -------------  ----  ----
2     0              0     SCAN TABLE a (~1000000 rows)
3     0              0     SCAN TABLE b (~1000000 rows)
1     0              0     COMPOUND SUBQUERIES 2 AND 3 USING TEMP B-TREE 
(EXCEPT)
6     0              0     SCAN TABLE b (~1000000 rows)
7     0              0     SCAN TABLE a (~1000000 rows)
5     0              0     COMPOUND SUBQUERIES 6 AND 7 USING TEMP B-TREE 
(EXCEPT)
4     0              0     SCAN SUBQUERY 5 (~1000000 rows)
0     0              0     COMPOUND SUBQUERIES 1 AND 4 (UNION ALL)

sqlite> explain query plan select * from a union all select * from b except
            select * from (select * from a intersect select * from b);
sele  order          from  deta
----  -------------  ----  ----
2     0              0     SCAN TABLE a (~1000000 rows)
3     0              0     SCAN TABLE b (~1000000 rows)
1     0              0     COMPOUND SUBQUERIES 2 AND 3 (UNION ALL)
6     0              0     SCAN TABLE a (~1000000 rows)
7     0              0     SCAN TABLE b (~1000000 rows)
5     0              0     COMPOUND SUBQUERIES 6 AND 7 USING TEMP B-TREE 
(INTERSECT)
4     0              0     SCAN SUBQUERY 5 (~1000000 rows)
0     0              0     COMPOUND SUBQUERIES 1 AND 4 USING TEMP B-TREE 
(EXCEPT)


If the rows can be represented by some key column(s), the lookups could
be implemented with indexes that could be created in advance, instead of
with temporary B-trees:

sqlite> explain query plan select * from a where not exists (select 1 from b 
where b.id = a.id)
                 union all select * from b where not exists (select 1 from a 
where a.id = b.id);
sele  order          from  deta
----  -------------  ----  ----
1     0              0     SCAN TABLE a (~500000 rows)
1     0              0     EXECUTE CORRELATED SCALAR SUBQUERY 2
2     0              0     SEARCH TABLE b USING AUTOMATIC COVERING INDEX (id=?) 
(~7 rows)
3     0              0     SCAN TABLE b (~500000 rows)
3     0              0     EXECUTE CORRELATED SCALAR SUBQUERY 4
4     0              0     SEARCH TABLE a USING AUTOMATIC COVERING INDEX (id=?) 
(~7 rows)
0     0              0     COMPOUND SUBQUERIES 1 AND 3 (UNION ALL)


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to