You _can_ do this in SQL, it just takes several steps.
What you need to do is create a table that contains the ID values of the
records you want to get rid of. If you want to keep the "first" record in a
set of dupes try this:
CREATE TABLE DeleteMe
SELECT t.email, t.ID
FROM yourtablenamehere t
GROUP BY email
HAVING count(1) >1;
ALTER TABLE DeleteMe add Key(ID);
This makes a full list of all IDs for all duplicate emails and puts an
index on it (you may not need the index if your dupes list is small
enough). But you said that you don't want to get rid of all dupes, you want
to keep the "first" dupe in each set. Let's make another list:
CREATE TABLE KeepMe
SELECT email, MIN(ID)
FROM DeleteMe
GROUP BY email
This is a list of the IDs you want to keep. Now to eliminate the "keepers"
from the full dupes list.
DELETE DeleteMe
FROM DeleteMe d
LEFT JOIN KeepMe k
on k.id = d.id
WHERE k.id is null
You now have a list of all duplicate IDs except the "first" ones. We will
use this list to clear the dupes from your original table:
DELETE yourtablenamehere
FROM yourtablenamehere t
INNER JOIN Deleteme d
ON d.id = t.id
Now, all that's left to do is to clean up our working tables:
DROP TABLE KeepMe, DeleteMe
And you are done!
Yours,
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine
"Robert A.
Rosenberg" To: <[EMAIL PROTECTED]>
<[EMAIL PROTECTED] cc:
> Fax to:
Subject: RE: Weeding out duplicates
07/10/2004 10:05
PM
At 10:25 -0600 on 07/09/2004, Jonathan Duncan wrote about Re: Weeding
out duplicates:
>Lachlan,
>
>I want to identify the entries in the table where the email addresses
>are the same as another entry. Whatever else is in the record does not
>matter to me.
>
>However, a second requirement for the query is that it show me the last
>duplicate instead of the first. This way I keep the first entries and
>remove subsequent ones.
>
>Thanks,
>Jonathan Duncan
If you are willing to go with a PHP/MySQL solution as opposed to a
pure MySQL one, try this:
1) Use that query to get a list of what Email Addresses are duplicated.
2) Now do a query with a WHERE Email=Duplicate-Email-Address and
fetch the Primary Key.
3) Do a Delete by Primary Key for all rows fetched in 2 EXCEPT for
those keys fetched in 1.
The sequence is that you do step 1, loop though the results one at a
time (step 2) doing step 3 in that loop.
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]