I have been researching how to deal with duplicate data. I have a case where duplicates can occur and I want to get rid of them; the general ideas I keep finding are below, plus my own, which I would like opinions on since I have not seen it mentioned anywhere.
I am not in a situation where I can test for duplicates on insert, so the ON DUPLICATE KEY UPDATE trick does me no good here. Most advice points to selecting the distinct records into a temp table, deleting the source records, and selecting them back into the original: three steps. I imagine there is a bit of overhead in the temp table creation; it is, after all, writing a small file to disk.

Say I want to favor newer records over old, and I have a PK of id. Is there any flaw in:

  SELECT DISTINCT(id) FROM table WHERE my_param = 'foo' ORDER BY id;

That gives me the newest unique record ids. Then, with a little logic to convert the result set into a MySQL list:

  DELETE FROM table WHERE my_param = 'foo' AND id NOT IN (the_result);

So I have two steps, I save the temp table creation, AND I don't do anything at all in the case where there are only unique records to begin with, i.e. nothing to delete. Is there something wrong with this? It seems a wee bit more efficient than the temp table deal for my case. Rough sketches of both approaches are below.
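For reference, the three-step temp-table recipe, as I understand it, looks roughly like this. The names (my_table, my_param, tmp_unique) are placeholders of my own; the sketch assumes id is an AUTO_INCREMENT PK and that two rows count as duplicates when they share the same my_param value:

  -- 1. Copy one copy of each distinct record into a scratch table.
  CREATE TEMPORARY TABLE tmp_unique
      SELECT DISTINCT my_param FROM my_table;

  -- 2. Wipe the source rows.
  DELETE FROM my_table;

  -- 3. Copy the de-duplicated rows back; note the id values are
  --    regenerated by AUTO_INCREMENT, so the old ids are lost.
  INSERT INTO my_table (my_param)
      SELECT my_param FROM tmp_unique;

  DROP TEMPORARY TABLE tmp_unique;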
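And here is my two-step idea spelled out under the same assumptions, with MAX(id) standing in for "the newest record wins":

  -- 1. Find the one id worth keeping: the newest duplicate.
  SELECT MAX(id) FROM my_table WHERE my_param = 'foo';

  -- 2. Application code turns that result set into a list, then:
  DELETE FROM my_table
      WHERE my_param = 'foo'
        AND id NOT IN (42);  -- 42 = the id returned by step 1

To clean up every my_param value in one pass rather than just 'foo', step 1 could presumably be SELECT MAX(id) FROM my_table GROUP BY my_param instead, with the whole result list fed into the NOT IN.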