Hi all, Today I've stumbled on a situation where I think I really need to use a RIGHT OUTER JOIN, and looking at all the examples on the internet I could find so far, I'm not finding a way to simulate it properly using LEFT OUTER JOINs.
So I thought, before I commit to an inefficient alternative I would check to see if someone on this list has an idea of how I can form an efficient query for this. I'll start by highlighting some of the schema which is relevant to the query: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ CREATE TABLE shift ( uid INTEGER PRIMARY KEY AUTOINCREMENT, ... name and start time/duration of a shift ... ); CREATE TABLE resource ( uid INTEGER PRIMARY KEY AUTOINCREMENT, ... details about the resource ... ); CREATE TABLE event ( uid INTEGER PRIMARY KEY AUTOINCREMENT, date INTEGER NOT NULL, /* unix timestamp */ shift_uid INTEGER NOT NULL REFERENCES shift (uid), ... other details about the event ... ); CREATE TABLE IF NOT EXISTS event_participant ( id INTEGER CHECK (id > 0), event_uid INTEGER REFERENCES event (uid) ON DELETE CASCADE, resource_uid INTEGER REFERENCES resource (uid), ... other details about this participant ... PRIMARY KEY (event_uid, id) ); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Now my goal here, is to pull out every resource UID which is _not_ assigned to any event on a given date and shift UID. My first attempt that gets the job done looks like this: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~ Statement 1 ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SELECT resource.uid /* All resources... */ FROM resource /* Subtract from this set all the booked resources */ LEFT JOIN ( /* All resources which are booked on the given date / shift_uid */ SELECT participant.resource_uid AS uid FROM event JOIN event_participant AS participant ON (participant.event_uid = event.uid) WHERE (event.date = 1411185600 AND event.shift_uid = 1 AND participant.resource_uid IS NOT NULL) ) AS booked_resource ON (booked_resource.uid = resource.uid) WHERE booked_resource.uid IS NULL; ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ EXPLAIN SAYS ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1|0|0|SEARCH TABLE event USING INDEX event_shift_uid_idx (shift_uid=?) (~2 rows) 1|1|1|SEARCH TABLE event_participant AS participant USING INDEX event_participant_pk_idx (event_uid=?) (~5 rows) 0|0|0|SCAN TABLE resource (~1000000 rows) 0|1|1|SEARCH SUBQUERY 1 AS booked_resource USING AUTOMATIC COVERING INDEX (uid=?) (~2 rows) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Now, this is not all that bad, except that I try to avoid nesting select statements in cases like this as it prevents the query planner from doing anything intelligent. So I made another attempt which discards the nested statement: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~ Statement 2 ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SELECT resource.uid /* All Resources */ FROM resource /* Get the cartesian product of event * resources, dangerously big, * but hopefully trimmed down by constraints found in the final WHERE * clause */ JOIN event /* Subtract from this set all the booked resources */ LEFT JOIN event_participant AS participant ON (participant.event_uid = event.uid AND participant.resource_uid = resource.uid) WHERE event.date = 1411185600 AND event.shift_uid = 1 AND participant.resource_uid IS NULL; ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ EXPLAIN SAYS ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 0|0|1|SEARCH TABLE event USING INDEX event_shift_uid_idx (shift_uid=?) (~2 rows) 0|1|0|SCAN TABLE resource (~1000000 rows) 0|2|2|SEARCH TABLE event_participant AS participant USING INDEX event_participant_resource_uid_idx (resource_uid=?) (~2 rows) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ It turns out that this query is almost semantically the same as the first, however without the nested select, EXPLAIN QUERY PLAN seems to indicate the performance is better (the SCAN on the 'resource' table is still present, but the extra SEARCH SUBQUERY is not gone). Finally, after doing some more digging, I think I've found that the correct way to make this query should really be with RIGHT OUTER JOIN as follows: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~ Statement 3 ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SELECT resource.uid /* All events on the date specified in the final WHERE clause */ FROM event /* Join participants by indexed event key, only those with assigned * resources */ JOIN event_participant AS participant ON (participant.event_uid = event.uid AND participant.resource_uid IS NOT NULL) /* Join in all the resources which do not appear in the preceding set */ RIGHT OUTER JOIN resource ON (resource.uid = participant.resource_uid) WHERE event.DATE = 1411185600 AND event.shift_uid = 1 AND participant.resource_uid IS NULL; ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ However, unlike other simple use cases of LEFT OUTER JOIN, this one cannot simply be inverted to get the same performance/results as with RIGHT OUTER JOIN (inverting the statement only works in the very simple case where you really only have 2 tables to join). So, in closing, I have only been doing SQL for a couple of years now and would consider myself a 'newbie' at this - and I was wondering if someone more enlightened could point out something I've missed. If it's the case that RIGHT OUTER JOIN would be the correct way but SQLite simply doesn't offer that (and there is no way to efficiently work around this), I will gladly except that and probably go with a variation of 'Statement 2' above. I would also be interested to know if 'Statement 1' may be superior to 'Statement 2' due to the dangerously huge cartesian product of the 'event' and 'resource' tables. Thank you all for your attention :) Best Regards, -Tristan _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users