On 01/03/2013 11:10 AM, Stephen Chrzanowski wrote:
***I'm waiting for the repair man to show up to fix my water heater... so...
I'm bored. This is going to be to the point at the beginning, but get wordy
and technical near the end. ;)  Super overkill..... ahem***
Nice explanation... just a couple of nitpicks:

You will ALWAYS incur slower speeds when using foreign keys in either a join
or "ON [DELETE/UPDATE]".  Additional lookups have to happen, which means more
time spent, typically in the millisecond range per lookup.
I can't think of any reason a foreign key constraint would impact the cost of joins in any query. The cost is paid entirely at write time (insert, update, delete), when the constraint has to be enforced.
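
A quick way to see this for yourself (just a sketch; the table and column names below are made up): create the same toy schema with and without the REFERENCES clause and compare the join plans. The constraint shouldn't change how the SELECT gets executed.

    -- toy schema, names are illustrative only
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
      id          INTEGER PRIMARY KEY,
      customer_id INTEGER REFERENCES customer(id),  -- try it with and without this clause
      total       REAL
    );

    -- the join plan is driven by the indexes available, not by the constraint
    EXPLAIN QUERY PLAN
    SELECT o.id, c.name
      FROM orders o JOIN customer c ON c.id = o.customer_id;

The extra work shows up when you write to these tables instead, because that's when the constraint has to be checked (and in sqlite3 only if you've turned PRAGMA foreign_keys on).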

FKs are two (or more) pointers that say one field in one table is related
to that of another table in some regard.  FKs are typically used to delete
data at the database level.
Foreign keys protect inserts and updates as well: they prevent you from accidentally inserting a sales order with a bogus customer ID, for example, and you're not allowed to change the ID of a customer who has orders without chasing down all those orders and updating them as well.
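
For example (a sketch again, reusing the made-up customer/orders schema above and assuming foreign key enforcement is on):

    PRAGMA foreign_keys = ON;   -- sqlite3 leaves enforcement off unless you ask for it

    INSERT INTO customer (id, name) VALUES (1, 'Alice');
    INSERT INTO orders (id, customer_id, total) VALUES (10, 1, 19.95);   -- OK
    INSERT INTO orders (id, customer_id, total) VALUES (11, 999, 5.00);  -- rejected: no customer 999
    UPDATE customer SET id = 2 WHERE id = 1;                             -- rejected: order 10 still points at id 1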

The easiest way to think of it might be this: foreign key constraints make sure your foreign key joins always behave as expected. A bogus customer ID would cause the order to disappear when joined with customer, and that's probably a Bad Thing. Note that the DBMS can be told to allow a NULL customer ID in orders, if that's useful to you.
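
To make the disappearing order concrete (same made-up schema, this time with enforcement left off so the bad row can exist at all):

    -- with PRAGMA foreign_keys = OFF the bogus row goes in silently...
    INSERT INTO orders (id, customer_id, total) VALUES (12, 999, 5.00);

    -- ...and then silently vanishes from the join with customer
    SELECT o.id, c.name
      FROM orders o JOIN customer c ON c.id = o.customer_id;

    -- a NULL customer_id, by contrast, satisfies the constraint (unless the
    -- column is declared NOT NULL); such rows also drop out of the inner join
    -- above, but a LEFT JOIN will keep them
    INSERT INTO orders (id, customer_id, total) VALUES (13, NULL, 7.50);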

*1)
A basic index would be something similar to a key/value pair.  The list of
keys would be sorted, however the list of values that key holds doesn't
necessarily have to be.  From memory, back when I was doing my MCDBA cert
for SQL2k, the basic index lookup engine would count how many unique
indexes exist, read the key in the middle, decide if further lookups had
to be done.  If more had to be done, it'd either look at the key at the 1/4
mark, or the 3/4 mark, and decide again.  It'd keep drilling the index page
until it found what it needed.  It'd then look at all the data pages
required and process the data.  So if you were looking for the number 30 in
a list of 100 unique numbers (1-100), it'd look at 50, decide what it found
was too high, look at 25, decide it was too low, then look at 37, decide
too high, then look at 31, again find it too high, then look at 30 and read
in the data which may live on pages 99, 45, 58, 109, and 55.
That describes a binary search, which is actually pretty inefficient for disk-resident data. Most database engines (sqlite3 included) use B+Tree search. It requires a more complex data structure, but the effect is that you could find your needle in a one-billion-entry haystack by fetching perhaps 5 pages from disk [1]; a binary search would touch around 30 pages, and without an index you'd have to scan all billion records on disk directly [2]. Rough numbers below.

[1] If the index gets used very often, 2-4 of those pages will probably still be in memory from previous searches, cutting the disk access count even further.

[2] If you only need exact match (rather than range or nearest-neighbor) you can go with a hash index and cut the disk access count to one (in expectation). Sqlite3 doesn't have hash indexes, though, and B+Tree indexes fit in memory well enough that hashing isn't usually worth the trouble.
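
To put rough numbers on the billion-entry example (my assumptions, not from the post I'm replying to: 4 KB index pages holding on the order of 200 keys each):

    B+Tree:         200^4 = 1.6 billion keys reachable in 4 levels, so roughly 4-5 page reads
    binary search:  log2(1,000,000,000) is about 30 probes, mostly landing on different pages
    no index:       all 1,000,000,000 rows have to be examined

The exact tree height depends on page size and key width, but the point stands: the huge fanout is what collapses ~30 probes down to a handful.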

Ryan
