Dan, [I've cc-ed this to the list and to Rich - who asked the original question]
> You can do the following to remove duplicates: > > - create table foo as select distinct cols from > table_name_containing_duplicates; > - drop table_name_containing_duplicates; > - alter table foo rename to table_name_containing_duplicates; > > Done. > > Now, make a primay key for the new table!! -- this will avoid duplicates. This is one of a couple of rather neat SQL methods for dealing with duplicates - which strictly-speaking answers Rich's question. What is its disadvantage? The problem with a 'delete' technique is that no real record is kept of the action - which in a commercial situation makes auditors and other 'suits' unhappy. Even in a personal situation it can make life difficult because "you don't know what you've got till it's gone"! Consequently when it comes to 'deletions', us worry-worts tend to take the longer-winded/more boring approach, eg doing a SELECT before a DELETE, because then one is sure of two things: 1 the actual data being DELETEd (and an audit trail for those that regard such things as 'light reading'); and 2 that there is no 'slight error' in the deletion criteria (or assumptions about the data) which leads to a major 'oops'. A visual inspection row by row (as I recommended) on a table of millions of records would be totally impractical - automated methods would be the only way to go! Regards, =dn > > Rich, > > > > > How does one go about removing one of two identical records in a MySQL > > > database? My mistake in an earlier database was not applying a unique > > > number to each record (1, 2, 3, ---). I know it's possible to use the > > > DISTINCT operator to show only one of the identical records in a result, > > > but that does not remove one of them from the database. > > > > =Ouch! > > > > =Do you have a particular criteria to delete one or other of any > > duplicate records, or are they absolutely > > identical (and therefore it doesn't matter which stays/goes)? > > > > =You cannot really risk automated deletion in either case! > > - if the records are slightly different, criteria must be > > established to determine which is right/should be > > kept; > > - if they are absolutely identical, how will you identify in a > > WHERE clause that only one record of that > > criteria is to be removed of two identical rows? > > > > =To identify the duplicate rows try:- > > > > SELECT column(s), COUNT(*) as duplicates > > FROM tbl > > GROUP BY key > > HAVING duplicates > 1 > > > > =Depending upon the number of rows returned, you could then feed > > those keys into a series of DELETE ... LIMIT 1 > > commands - assuming duplicates are strictly 'two of'; otherwise > > it might be easier to take the listing and do it > > by hand (using a mgmt package). > > > > > One thought that I had would be to add a unique number to each record, > > > and that could probably be done manually, one record at a time, but is > > > there a way to automate the process so that unique numbers could be > > > assigned with one command? > > > > =ALTER TABLE allows the addition of a new column. The only > > question is whether to do it before (if it might help > > the editing job) or after, weeding out the duplicates (to get a > > more continuous AUTO_INCREMENT sequence, if it's > > at all of interest). > > > > =Regards, > > =dn --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php