On Tue, Apr 1, 2008 at 2:01 AM, kdecapite <[EMAIL PROTECTED]> wrote:
>
>  > Oh, and don't forget to create an index on (fn, ln) for both tables.
>
>  Sorry for being out of touch lately. I had to table this issue to make
>  progress on a different work-related project. Anyway, I spent some
>  time reading up on indexes. Just by adding the (ln, fn) index to both
>  tables (contacts and import_contacts), my query went from an 18-21
>  second execution time to a 9-10 second execution time, but not
>  consistently. I ran the query again a few minutes later and it took 18
>  seconds still. I tried again 10-15 minutes later and the execution
>  time went back down to 11 seconds. Not sure what's happening here
>  exactly.

Table cache is the name of what is happening, and your DB engine is
who is doing it. Also about setting indexes:

http://www.mysqlperformanceblog.com/2006/11/23/covering-index-and-prefix-indexes/
http://peter-zaitsev.livejournal.com/6949.html
http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/

>
>  > Sounds about right, except for the difference in number of dupes. How many 
> are there in total?
>
>  Using my method of finding duplicates, I'm counting 14 matches.
>  However, when I looked again at b logica's query, I'm really getting
>  16 matches and not 24.
>
>  Here's why:
>
>  Even though 24 records are returned, some of them are duplicate names
>  due to record redundancy (a separate issue related to how the data is
>  managed by my client). For instance, the name "Joe Rohan" appears 3
>  times simply because this name exists 3 times in the contacts table as
>  individual records (there are 3 Joe Rohan's in the contacts table, I
>  mean).
>
>  The only confusing thing is that some names appear more than once in b
>  logica's result set but they are displaying the same "contact.id"
>  value. For example, the contact with the name "Kami Dolney" appears
>  twice in the result set, but each record is showing the same primary
>  key ID. So Kami Dolney does not exist twice in the contacts table. I'm
>  not sure why this contact is appearing twice in the query's result
>  set.
>
>  Even more confounding is the fact that this contact (Kami Dolney) does
>  NOT even appear at all as a duplicate using my method of finding dupes
>  (along with a couple other names). In other words, b logica's query
>  seems to be returning an accurate result set of matching contact names
>  from both tables.
>

I you are importing this:

1 John Smith
2 John Smith
3 Other Guy

And already have this contacts:

100 John Smith
200 More People

B Logica's query :

SELECT c.id, c.fn, c.ln FROM contacts AS c INNER JOIN import_contacts
AS ic ON c.fn = ic.fn AND c.ln = ic.ln;

Will return :

100 John Smith // this matches "1 John Smith"
100 John Smith // this matches "2 John Smith"

So, fix it this way:

SELECT DISTINCT c.id, c.fn, c.ln FROM contacts AS c INNER JOIN
import_contacts AS ic ON c.fn = ic.fn AND c.ln = ic.ln;

or:

SELECT  c.id, c.fn, c.ln FROM contacts AS c INNER JOIN import_contacts
AS ic ON c.fn = ic.fn AND c.ln = ic.ln GROUP BY (c.id);


But, this is selecting the contacts that you already have in the ones
that you are importing. If you need the contacts beign imported that
are repeated:

SELECT ic.id, ic.fn, ic.ln FROM import_contacts  AS ic INNER JOIN
contacts AS c ON c.fn = ic.fn AND c.ln = ic.ln;

- HTH,
Dardo Sordi.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Cake 
PHP" group.
To post to this group, send email to cake-php@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/cake-php?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to