On Wed, 27 Nov 2013 at 21:21:43 -0800, Igor Tandetnik wrote > Why are you using outer joins when your WHERE clause discards > unmatched records anyway? If you replace LEFT OUTER with INNER, the > end result would be exactly the same.
Not for all queries. Consider the query (or (beginswith "full_name" "foo") (beginswith "email" "foo")) The full_name field is in the main table, since each vcard only has *one* full_name field. There can be multiple email addresses, so those are in an auxiliary table. The SQL query becomes: SELECT DISTINCT summary.uid, summary.vcard from 'folder_id' AS summary LEFT OUTER JOIN 'folder_id_email_list' AS email_list ON email_list.uid = summary.uid WHERE (email_list.value like 'foo%') OR (summary.full_name like 'foo%'); If that *isn't* a LEFT OUTER JOIN, and there's a record in the main 'folder_id' table which doesn't have *any* email addresses, then it will be incorrectly omitted from the results. As it happens, the performance on the LEFT OUTER JOIN sucks somewhat, since it doesn't use any indices on folder_id(full_name): 0|0|0|SCAN TABLE folder_id AS summary USING INDEX sqlite_autoindex_folder_id_1 0|1|1|SEARCH TABLE folder_id_email_list AS email_list USING INDEX UID_INDEX_email_folder_id (uid=?) 0|0|0|USE TEMP B-TREE FOR DISTINCT There's an optimisation there which maybe it would be nice if sqlite would see for itself. It works a *whole* lot faster if we express it thus: SELECT DISTINCT summary.uid, summary.vcard from 'folder_id' as summary JOIN 'folder_id_email_list' as email_list ON +email_list.uid = summary.uid WHERE (email_list.value like 'foo%') UNION SELECT summary.uid, summary.vcard from 'folder_id' as summary WHERE (summary.full_name like 'foo%'); (Note that this time it *can* be an INNER JOIN.) 1|0|1|SEARCH TABLE folder_id_email_list AS email_list USING INDEX INDEX_email_folder_id (value>? AND value<?) 1|1|0|SEARCH TABLE folder_id AS summary USING INDEX sqlite_autoindex_folder_id_1 (uid=?) 2|0|0|SEARCH TABLE folder_id AS summary USING INDEX INDEX_full_name_folder_id (full_name>? AND full_name<?) 0|0|0|COMPOUND SUBQUERIES 1 AND 2 USING TEMP B-TREE (UNION) FWIW on my data set of about 240,000 records from the company address book, the top query runs in about 1400ms while the last one takes about 6ms. Since this is the query that happens when we start typing into the To: or Cc: field of an email composer window, we want it to be fast :) Tristan's original query was about using nested SELECT to implement logical 'AND'. I don't care much about that, but I sure as hell want him to use UNION to implement logical 'OR'. Unless there's a better way? Should we be asking for sqlite to actually spot the optimisation for itself? -- David Woodhouse Open Source Technology Centre david.woodho...@intel.com Intel Corporation
_______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users