RE: SQL statement to find and delete double entries

gavin . hale Mon, 22 Sep 2003 07:57:51 -0700

You don't mention your RDBMS.

But some RDBMS (and I presume all internally) include a unique row 
identifier which you can access (e.g Ingres - tid <tuple-id>)
If this is available, then the rows are no longer 'unique' 
and you can decide which row you want to get rid of 
e.g max/min(tid) group by <my uniqueness> having count(*) > 1
in a one-off SQL statement, thereby avoiding front-end/back-end 
and programming solutions.

I suggest you check your RDBMS DBA/SQL guides.

-----Original Message-----
From: Chuck Fox [mailto:[EMAIL PROTECTED]
Sent: 22 September 2003 15:45
To: [EMAIL PROTECTED]
Cc: Christian Merz; [EMAIL PROTECTED]; Morrison, Trevor (Trevor)
Subject: Re: SQL statement to find and delete double entries

Jeff,

Here is the syntax for doing this in Sybase as a stored procedure.  
Assumption:  id is an int.

CREATE PROCEDURE del_dupes
AS
BEGIN
    DECLARE @id int, @id_count int, @delete_rows int

    DECLARE read_keys CURSOR FOR
    SELECT id, count(*)
    FROM table
    GROUP BY id
    HAVING count(*) > 1

    OPEN read_keys

    FETCH read_keys INTO @id, @id_count

    WHILE( @@sqlstatus = 0 )
    BEGIN
       SELECT @delete_rows = @id_count - 1

       SET ROWCOUNT @delete_rows

       DELETE table WHERE id = @id

       FETCH read_keys INTO @id, @id_count
    END
END
GO

Stored procedures are the fastest way to do it.  As the data never 
leaves the server.  Check with your RDBMS on correct syntax for your 
particular db.  The same steps could be done in perl, but the data for 
the select is transmitted to the client and then each delete statement 
is passed, compiled, optimized and then executed.

HTH,

Chuck Fox
Principal DBA
America Online, INC

[EMAIL PROTECTED] wrote:

>The only problem with that approach is that it deletes all of the entries 
>and doesn't leave "singles" behind.  I'd probably do it programatically. 
>Grab the results of query 1, store the data in a hash of hashes, then do 
>the delete and re-insert.
>
>But I'd love to hear an SQL solution to leaving one copy of each duplicate 
>behind.
>
>
>
>____________________________
>Jeff Seger
>Fairchild Semiconductor
>[EMAIL PROTECTED]
>____________________________
>
>
>
>
>
>"Christian Merz" <[EMAIL PROTECTED]>
>09/18/2003 08:33 AM
>
> 
>        To:     "Morrison, Trevor (Trevor)" <[EMAIL PROTECTED]>,
<[EMAIL PROTECTED]>
>        cc: 
>        Subject:        Re: SQL statement to find and delete double entries
>
>
>Hi,
>
>the basic idea to find duplicate or multiple values is:
>    select id, count(*)
>    from table
>    group by id
>    having count(*) > 1;
>
>to delete ALL such values you may do this:
>    delete from table where id in
>      ( select id
>        from table
>        group by id
>        having count(*) > 1
>     );
>
>cu, Christian
>
>----- Original Message -----
>From: "Morrison, Trevor (Trevor)" <[EMAIL PROTECTED]>
>To: <[EMAIL PROTECTED]>
>Sent: Saturday, August 16, 2003 6:39 PM
>Subject: SQL statement to find and delete double entries
>
>
>Hi,
>
>What would be an SQL statement that will find duplicate order numbers in
>table and then delete them?
>
>TIA
>
>Trevor
>
>
>
>
>
>
>  
>

Internet communications are not secure and therefore the Barclays Group
does not accept legal responsibility for the contents of this message.
Although the Barclays Group operates anti-virus programmes, it does not
accept responsibility for any damage whatsoever that is caused by
viruses being passed.  Any views or opinions presented are solely those
of the author and do not necessarily represent those of the Barclays
Group.  Replies to this email may be monitored by the Barclays Group
for operational or business reasons.

RE: SQL statement to find and delete double entries

Reply via email to