Taking it step by step, this query will give you all the lowest ids, for those records with duplicates.
SELECT min(id), email, count(*) AS cnt
FROM addressbook
GROUP BY email
HAVING cnt>1

Now think of that query as an already existing table, which you can do, you just need to name the query result, in this case I'm calling it t1. Then you LEFT JOIN it with the addressbook table, but filtering out the ids you already have.
SELECT id1, email1,
t2.id AS id2, t2.email AS email2
FROM
(
SELECT min(id) AS id1, email AS email1, count(*) AS cnt
FROM addressbook
GROUP BY email
HAVING cnt>1
) AS t1
LEFT JOIN
addressbook AS t2
ON t1.email1=t2.email AND t1.id1!=t2.id
ORDER BY email1

I haven't tested it, but that query should work and give you the output you want. I don't recall if it works in v4.0, but v4.1 and above should work fine.

Brent


On Dec 12, 2007, at 8:35 AM, Yashesh Bhatia wrote:

Hello:

    I have the following table

select * from addressbook
+----+--------------------+
| id | email              |
+----+--------------------+
|  1 | [EMAIL PROTECTED] |
|  2 | [EMAIL PROTECTED] |
|  3 | [EMAIL PROTECTED] |
+----+--------------------+
3 rows in set (0.00 sec)

Now i wanted to find a list of duplicate contacts wherein i can get
the 1st contact with same email
and merge with the others. so in the above case id 1 has duplicates 2
and 3, 2 has 1 & 3 and 3 and 1 & 2.

I'm only interested in getting the first set of duplicates i.e. "1 has
duplicates 2 & 3".

So i tried the query
select t1.id as id1, t2.id as id2, t1.email as email1, t2.email as email2
from addressbook t1, addressbook t2
where t1.email = t2.email and t1.id != t2.id
order by t1.id
+-----+-----+--------------------+--------------------+
| id1 | id2 | email1             | email2             |
+-----+-----+--------------------+--------------------+
|   1 |   2 | [EMAIL PROTECTED] | [EMAIL PROTECTED] |
|   1 |   3 | [EMAIL PROTECTED] | [EMAIL PROTECTED] |
|   2 |   1 | [EMAIL PROTECTED] | [EMAIL PROTECTED] |
|   2 |   3 | [EMAIL PROTECTED] | [EMAIL PROTECTED] |
|   3 |   1 | [EMAIL PROTECTED] | [EMAIL PROTECTED] |
|   3 |   2 | [EMAIL PROTECTED] | [EMAIL PROTECTED] |
+-----+-----+--------------------+--------------------+

then i tried the query similar to the one suggested in the MySQL
Cookbook Recipe 14.5

select DISTINCT if(t1.id < t2.id, t1.id, t2.id) as id1,
       if(t1.id < t2.id, t1.email, t2.email) as email1,
       if(t1.id < t2.id, t2.id, t1.id) as id2,
       if(t1.id < t2.id, t2.email, t1.email) as email2
from addressbook t1, addressbook t2
where t1.email = t2.email and t1.id != t2.id
order by t1.id
+-----+--------------------+-----+--------------------+
| id1 | email1             | id2 | email2             |
+-----+--------------------+-----+--------------------+
|   1 | [EMAIL PROTECTED] |   2 | [EMAIL PROTECTED] |
|   1 | [EMAIL PROTECTED] |   3 | [EMAIL PROTECTED] |
|   2 | [EMAIL PROTECTED] |   3 | [EMAIL PROTECTED] |
+-----+--------------------+-----+--------------------+

I'm stuck trying to get a query that will give me only
+-----+--------------------+-----+--------------------+
| id1 | email1             | id2 | email2             |
+-----+--------------------+-----+--------------------+
|   1 | [EMAIL PROTECTED] |   2 | [EMAIL PROTECTED] |
|   1 | [EMAIL PROTECTED] |   3 | [EMAIL PROTECTED] |
+-----+--------------------+-----+--------------------+

Any help, feeback is deeply appreciated. Thanks a bunch in advance.

Yashesh Bhatia

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? [EMAIL PROTECTED]



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to