RE: removing duplicate entries

Magnus Smith Wed, 06 Aug 2008 05:12:42 -0700

When I try the first suggestion (i) then I get all the 1682 duplicate
rows.  The thing is that I need to keep the originals which are the ones
with the lowest ACCOUNTACTION.ID value.
 
The second suggestion (ii) gives me 563 rows that are the duplicates
with the lowest ACCOUNTACTION.ID which are the ones I wish to keep
 
So the ones I want to delete are the ones in (i) and not (ii)
 
When I use 
 
select ACCOUNTACTION.ID from ACCOUNTACTION
where ACCOUNTACTION.ACTIONDATE like '2008-08-01 02:00%'
group by ACCOUNTACTION.ACCOUNT_ID
having (count(ACCOUNTACTION.ACCOUNT_ID) > 1 
and ACCOUNTACTION.ID != min(ACCOUNTACTION.ID));
 
then I get 1119 rows which is all the duplicates in (i) less the
originals in (ii)
 
 
The problem I'm having is using this in a delete statement.


________________________________

From: Ananda Kumar [mailto:[EMAIL PROTECTED] 
Sent: 06 August 2008 10:11
To: Magnus Smith
Cc: mysql@lists.mysql.com
Subject: Re: removing duplicate entries


I doubt the belwo sql will give you duplcates
 
select ACCOUNTACTION.ID <http://accountaction.id/>  from ACCOUNTACTION
where ACCOUNTACTION.ACTIONDATE like '2008-08-01 02:00%'
group by ACCOUNTACTION.ACCOUNT_ID
having (count(ACCOUNTACTION.ACCOUNT_ID) > 1 and ACCOUNTACTION.ID
<http://accountaction.id/>  !=
min(ACCOUNTACTION.ID <http://accountaction.id/> ));
 
The reason being, for duplicates records accountaction.id will always
equal to min(accountaction.id).
 
 
try this
select ACCOUNTACTION.ID <http://accountaction.id/>  from ACCOUNTACTION
where ACCOUNTACTION.ACTIONDATE like '2008-08-01 02:00%'
group by ACCOUNTACTION.ACCOUNT_ID
having (count(ACCOUNTACTION.ACCOUNT_ID) > 1
 
or 
 
select ACCOUNTACTION.ID <http://accountaction.id/>  from ACCOUNTACTION
where ACCOUNTACTION.ACTIONDATE like '2008-08-01 02:00%'
group by ACCOUNTACTION.ACCOUNT_ID
having (count(ACCOUNTACTION.ACCOUNT_ID) > 1 and ACCOUNTACTION.ID
<http://accountaction.id/> =min(accountaction.id);
 
I would use the first select statement.
 

 
On 8/6/08, Magnus Smith <[EMAIL PROTECTED]> wrote: 

        I have the following two tables
        
        ACCOUNTACTION
        
+-------------------+--------------+------+-----+---------+-------+
        | Field             | Type         | Null | Key | Default |
Extra |
        
+-------------------+--------------+------+-----+---------+-------+
        | ID                | bigint(20)   | NO   | PRI |         |
|
        | AccountActionType | varchar(31)  | YES  |     | NULL    |
|
        | DESCRIPTION       | varchar(255) | YES  |     | NULL    |
|
        | ACTIONDATE        | datetime     | YES  |     | NULL    |
|
        | ACCOUNT_ID        | bigint(20)   | YES  | MUL | NULL    |
|
        
+-------------------+--------------+------+-----+---------+-------+
        
        and
        
        ACCOUNTPAYMENTACTION
        +---------------+------------+------+-----+---------+-------+
        | Field         | Type       | Null | Key | Default | Extra |
        +---------------+------------+------+-----+---------+-------+
        | ID            | bigint(20) | NO   | PRI |         |       |
        | AMOUNTINPENCE | bigint(20) | YES  |     | NULL    |       |
        +---------------+------------+------+-----+---------+-------+
        
        ACCOUNTPAYMENTACTION shares the primary key with ACCOUNTACTION
        
        I need to remove duplicate entries that occured at a specific
time in
        ACCOUNTACTION I then plan to remove the rows in
ACCOUNTPAYMENTACTION
        that are no longer referenced in ACCOUNTACTION by using an outer
join
        
        I can select the duplicate records in ACCOUNTACTION using
        
        select ACCOUNTACTION.ID from ACCOUNTACTION
        where ACCOUNTACTION.ACTIONDATE like '2008-08-01 02:00%'
        group by ACCOUNTACTION.ACCOUNT_ID
        having (count(ACCOUNTACTION.ACCOUNT_ID) > 1 and ACCOUNTACTION.ID
!=
        min(ACCOUNTACTION.ID));
        
        I am trying to delete these records but am having trouble with
the sql
        delete
        
        I tried the following but nothing happened
        
        delete ACCOUNTACTION where ACCOUNTACTION.ID in
        (select ACCOUNTACTION.ID from ACCOUNTACTION
        where ACCOUNTACTION.ACTIONDATE like '2008-08-01 02:00%'
        group by ACCOUNTACTION.ACCOUNT_ID
        having (count(ACCOUNTACTION.ACCOUNT_ID) > 1 and ACCOUNTACTION.ID
!=
        min(ACCOUNTACTION.ID)));
        
        Can anyone help me?

RE: removing duplicate entries

Reply via email to