Re: Help with cleaning up data
delete b from icd9x10 a join icd9x10 b on a.icd9 = b.icd9 and a.id < b.id >... > CREATE TABLE `ICD9X10` ( > ... > id icd9 icd10 > 25 29182 F10182 > 26 29182 F10282 > ... Good luck, Bob
Re: Help with cleaning up data
On 3/29/2014 2:26 PM, william drescher wrote: I am given a table: ICD9X10 which is a maping of ICD9 codes to ICD10 codes. Unfortunately the table contains duplicate entries that I need to remove. CREATE TABLE `ICD9X10` ( `id` smallint(6) NOT NULL AUTO_INCREMENT, `icd9` char(8) NOT NULL, `icd10` char(6) NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `icd9` (`icd9`,`id`), UNIQUE KEY `icd10` (`icd10`,`id`) ) ENGINE=InnoDB AUTO_INCREMENT=671 DEFAULT CHARSET=ascii id icd9 icd10 25 29182 F10182 26 29182 F10282 27 29182 F10982 I just can't think of a way to write a querey to delete the duplicates. Does anyone have a suggestion ? bill Thanks for all the suggestions. I learned a lot, which is the most important part of the exercise. bill -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
RE: Help with cleaning up data
Bill, here is one approach: The following query will return the id's that should NOT be deleted: Select min (id) from icd9x10 group by icd9, icd10 Once you run it and happy with the results then you subquery it in a DELETE statement. Something like: Delete from icd9x10 A where A.id not in (Select min (B.id) from icd9x10 B group by B.icd9, B.icd10). I have not tested it (sorry it is a weekend here...), but I hope it will lead you into the right direction. David. David Lerer | Director, Database Administration | Interactive | 605 Third Avenue, 12th Floor, New York, NY 10158 Direct: (646) 487-6522 | Fax: (646) 487-1569 | dle...@univision.net | www.univision.net -Original Message- From: william drescher [mailto:will...@techservsys.com] Sent: Saturday, March 29, 2014 2:26 PM To: mysql@lists.mysql.com Subject: Help with cleaning up data I am given a table: ICD9X10 which is a maping of ICD9 codes to ICD10 codes. Unfortunately the table contains duplicate entries that I need to remove. CREATE TABLE `ICD9X10` ( `id` smallint(6) NOT NULL AUTO_INCREMENT, `icd9` char(8) NOT NULL, `icd10` char(6) NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `icd9` (`icd9`,`id`), UNIQUE KEY `icd10` (`icd10`,`id`) ) ENGINE=InnoDB AUTO_INCREMENT=671 DEFAULT CHARSET=ascii id icd9 icd10 25 29182 F10182 26 29182 F10282 27 29182 F10982 I just can't think of a way to write a querey to delete the duplicates. Does anyone have a suggestion ? bill -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql The information contained in this e-mail and any attached documents may be privileged, confidential and protected from disclosure. If you are not the intended recipient you may not read, copy, distribute or use this information. If you have received this communication in error, please notify the sender immediately by replying to this message and then delete it from your system.
Re: Help with cleaning up data
On 29-03-2014 19:26, william drescher wrote: I am given a table: ICD9X10 which is a maping of ICD9 codes to ICD10 codes. Unfortunately the table contains duplicate entries that I need to remove. ... I just can't think of a way to write a querey to delete the duplicates. Does anyone have a suggestion ? http://bit.ly/1hKCVHi -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: Help with cleaning up data
Hi Bill, How big is your table? It seems to me that you might want to change your unique keys to something like (icd9, icd10), thus guaranteeing that every mapping will exist only once in your table. You could create a new table with that constraint and copy all your data to it: CREATE TABLE `ICD9X10_2` ( `id` smallint(6) NOT NULL AUTO_INCREMENT, `icd9` char(8) NOT NULL, `icd10` char(6) NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `icd9_icd10` (`icd9`,`icd10`) ) ENGINE=InnoDB DEFAULT CHARSET=ascii INSERT IGNORE INTO ICD9X10_2 SELECT * FROM ICD9X10; -- This will skip the duplicates -- Once you've checked the new table and it looks fine to you, you can swap them: RENAME TABLE ICD9X10 TO ICD9X10_old, ICD9X10_2 TO ICD9X10; Or, alternatively, you can also directly alter your table by adding that unique index like this: ALTER IGNORE TABLE ICD9X10 ADD UNIQUE KEY (ICD9, ICD10); Hope that helps 2014-03-29 18:26 GMT+00:00 william drescher : > I am given a table: ICD9X10 which is a maping of ICD9 codes to ICD10 > codes. Unfortunately the table contains duplicate entries that I need to > remove. > > CREATE TABLE `ICD9X10` ( > `id` smallint(6) NOT NULL AUTO_INCREMENT, > `icd9` char(8) NOT NULL, > `icd10` char(6) NOT NULL, > PRIMARY KEY (`id`), > UNIQUE KEY `icd9` (`icd9`,`id`), > UNIQUE KEY `icd10` (`icd10`,`id`) > ) ENGINE=InnoDB AUTO_INCREMENT=671 DEFAULT CHARSET=ascii > > id icd9 icd10 > 25 29182 F10182 > 26 29182 F10282 > 27 29182 F10982 > > I just can't think of a way to write a querey to delete the duplicates. > Does anyone have a suggestion ? > > bill > > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe:http://lists.mysql.com/mysql > >