Vivian - Im just curious as to why I shouldnt trust group by? Is there a bug or something within Mysql regarding the group by clause?
Walt - you could also create a compound primary key that consists of all columns, then create a unique constraint on the Primary key..... I've used this many times in the past, and it works quite well. The only caveat is the SQL will puke when it tries to load the duplicate records into the table, I personally would use the DEDUP process of Insert/select with a group by... But you might want to see what Vivian's response is on the group by issue. DK -----Original Message----- From: Vivian Wang [mailto:[EMAIL PROTECTED]] Sent: Tuesday, July 30, 2002 2:37 PM To: walt; David Kramer; [EMAIL PROTECTED] Subject: Re: removing duplicate records Walt, Don't trust group by. I am using mysqldump and sort -u on *nix for duplicate. Try both ways, let me know your result. Thanks At 04:11 PM 7/30/2002 -0400, walt wrote: >Thanks David! > >The entire row, not just one or two columns, is a duplicate which makes life >fun.. >:-) > >I can key or unique index only a few columns once the data is cleaned up to >prevent this problem. > >If I create an identical table and include either a key or unique index >(innodb seems to like the index better) on all the columns and do a > >insert into new_table using select * from old_table > >will mysql quit inserting once a duplicate is hit, or will it keep going and >skip over the duplicates? > >walt > >On Tuesday 30 July 2002 03:57 pm, David Kramer wrote: > > You could always use an insert statement into a second table, when > > performing the insert use a GROUP BY clause to consalidate your records > > > > something along these lines but this isnt 100% accurate, I would need the > > table DDL and business rules behind the DEDUP > > > > Insert into tableB > > ( > > column names, ... > > > > > > ) > > (select > > column a, > > column b, > > max(column c), --or you could use MIN > > from > > table A > > > > group by > > column a, > > column b); > > > > > > > > > > > > **JUST make sure your Identifing column, i.e. the column you use to tell > > which is a duplicate record or not is included in the group by. Also what > > are the business rules behind the DEDUP(Deduplication)? Are the other > > values contained in the other columns necessary? If you tell me more about > > what your trying to do and provide some Table DDL I can help you write this > > query. Just let me know! > > > > Thanks, > > > > DK > > > > group by statement > > > > -----Original Message----- > > From: walt [mailto:[EMAIL PROTECTED]] > > Sent: Tuesday, July 30, 2002 12:43 PM > > To: [EMAIL PROTECTED] > > Subject: removing duplicate records > > > > > > Does anyone know a good way find and remove duplicate records from a table? > > I can create an identical table structure and use a script to pull records > > from the existing table and insert them into the new table if they are not > > duplicates, but I'd rather not do it that way. Here is an example of an sql > > script I use for Oracle databases > > > > delete from employee a > > where rowid < ( > > select max(rowid) > > from employee b > > where b.COL1 = a.COL1 > > and b.COL2 = a.COL2 > > and b.COL# = a.COL#); > > > > sql, query > > > > Thanks in advance! > > > > >--------------------------------------------------------------------- >Before posting, please check: > http://www.mysql.com/manual.php (the manual) > http://lists.mysql.com/ (the list archive) > >To request this thread, e-mail <[EMAIL PROTECTED]> >To unsubscribe, e-mail ><[EMAIL PROTECTED]> >Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php