Re: removing duplicate records

Vivian Wang Tue, 30 Jul 2002 14:18:43 -0700

Walt,

Don't trust group by.
I am using mysqldump and sort -u on *nix for duplicate.


Try both ways, let me know your result.

Thanks
At 04:11 PM 7/30/2002 -0400, walt wrote:
>Thanks David!
>
>The entire row, not just one or two columns, is a duplicate which makes life
>fun..
>:-)
>
>I can key or unique index only a few columns once the data is cleaned up to
>prevent this problem.
>
>If I create an identical table and include either a key or unique index
>(innodb seems to like the index better) on all the columns and do a
>
>insert into new_table using select * from old_table
>
>will mysql quit inserting once a duplicate is hit, or will it keep going and
>skip over the duplicates?
>
>walt
>
>On Tuesday 30 July 2002 03:57 pm, David Kramer wrote:
> > You could always use an insert statement into a second table, when
> > performing the insert use a GROUP BY clause to consalidate your records
> >
> > something along these lines but this isnt 100% accurate, I would need the
> > table DDL and business rules behind the DEDUP
> >
> > Insert into tableB
> > (
> >       column names, ...
> >
> >
> > )
> > (select
> >       column a,
> >       column b,
> >       max(column c), --or you could use MIN
> > from
> >       table A
> >
> > group by
> >       column a,
> >       column b);
> >
> >
> >
> >
> >
> > **JUST make sure your Identifing column, i.e. the column you use to tell
> > which is a duplicate record or not is included in the group by.  Also what
> > are the business rules behind the DEDUP(Deduplication)? Are the other
> > values contained in the other columns necessary?  If you tell me more about
> > what your trying to do and provide some Table DDL I can help you write this
> > query.  Just let me know!
> >
> > Thanks,
> >
> > DK
> >
> >   group by statement
> >
> > -----Original Message-----
> > From: walt [mailto:[EMAIL PROTECTED]]
> > Sent: Tuesday, July 30, 2002 12:43 PM
> > To: [EMAIL PROTECTED]
> > Subject: removing duplicate records
> >
> >
> > Does anyone know a good way find and remove duplicate records from a table?
> > I can create an identical table structure and use a script to pull records
> > from the existing table and insert them into the new table if they are not
> > duplicates, but I'd rather not do it that way. Here is an example of an sql
> > script  I use for Oracle databases
> >
> > delete   from employee a
> > where    rowid < (
> > select  max(rowid)
> > from    employee b
> > where   b.COL1 = a.COL1
> > and     b.COL2 = a.COL2
> > and     b.COL# = a.COL#);
> >
> > sql, query
> >
> > Thanks in advance!
>
>
>
>
>---------------------------------------------------------------------
>Before posting, please check:
>    http://www.mysql.com/manual.php   (the manual)
>    http://lists.mysql.com/           (the list archive)
>
>To request this thread, e-mail <[EMAIL PROTECTED]>
>To unsubscribe, e-mail 
><[EMAIL PROTECTED]>
>Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: removing duplicate records

Reply via email to