On Tue, Apr 1, 2008 at 2:01 AM, kdecapite <[EMAIL PROTECTED]> wrote: > > > Oh, and don't forget to create an index on (fn, ln) for both tables. > > Sorry for being out of touch lately. I had to table this issue to make > progress on a different work-related project. Anyway, I spent some > time reading up on indexes. Just by adding the (ln, fn) index to both > tables (contacts and import_contacts), my query went from an 18-21 > second execution time to a 9-10 second execution time, but not > consistently. I ran the query again a few minutes later and it took 18 > seconds still. I tried again 10-15 minutes later and the execution > time went back down to 11 seconds. Not sure what's happening here > exactly.
Table cache is the name of what is happening, and your DB engine is who is doing it. Also about setting indexes: http://www.mysqlperformanceblog.com/2006/11/23/covering-index-and-prefix-indexes/ http://peter-zaitsev.livejournal.com/6949.html http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/ > > > Sounds about right, except for the difference in number of dupes. How many > are there in total? > > Using my method of finding duplicates, I'm counting 14 matches. > However, when I looked again at b logica's query, I'm really getting > 16 matches and not 24. > > Here's why: > > Even though 24 records are returned, some of them are duplicate names > due to record redundancy (a separate issue related to how the data is > managed by my client). For instance, the name "Joe Rohan" appears 3 > times simply because this name exists 3 times in the contacts table as > individual records (there are 3 Joe Rohan's in the contacts table, I > mean). > > The only confusing thing is that some names appear more than once in b > logica's result set but they are displaying the same "contact.id" > value. For example, the contact with the name "Kami Dolney" appears > twice in the result set, but each record is showing the same primary > key ID. So Kami Dolney does not exist twice in the contacts table. I'm > not sure why this contact is appearing twice in the query's result > set. > > Even more confounding is the fact that this contact (Kami Dolney) does > NOT even appear at all as a duplicate using my method of finding dupes > (along with a couple other names). In other words, b logica's query > seems to be returning an accurate result set of matching contact names > from both tables. > I you are importing this: 1 John Smith 2 John Smith 3 Other Guy And already have this contacts: 100 John Smith 200 More People B Logica's query : SELECT c.id, c.fn, c.ln FROM contacts AS c INNER JOIN import_contacts AS ic ON c.fn = ic.fn AND c.ln = ic.ln; Will return : 100 John Smith // this matches "1 John Smith" 100 John Smith // this matches "2 John Smith" So, fix it this way: SELECT DISTINCT c.id, c.fn, c.ln FROM contacts AS c INNER JOIN import_contacts AS ic ON c.fn = ic.fn AND c.ln = ic.ln; or: SELECT c.id, c.fn, c.ln FROM contacts AS c INNER JOIN import_contacts AS ic ON c.fn = ic.fn AND c.ln = ic.ln GROUP BY (c.id); But, this is selecting the contacts that you already have in the ones that you are importing. If you need the contacts beign imported that are repeated: SELECT ic.id, ic.fn, ic.ln FROM import_contacts AS ic INNER JOIN contacts AS c ON c.fn = ic.fn AND c.ln = ic.ln; - HTH, Dardo Sordi. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Cake PHP" group. To post to this group, send email to cake-php@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/cake-php?hl=en -~----------~----~----~----~------~----~------~--~---