Hi, i've a rather large table (~ 1.800.000 rows) with five CHAR columns - let's say col1, col2, ...., col5. Col1has the primary key. The columns col2,col3,col4,col5 hold strings of variable length. I need to find duplicate entries that have the same value for col2,col3,col4 & col5 but (and thats the tricky part :-)) must use the LIKE operator as the columns contain whitespaces.
e.g. 0000000001 xufnc qj3n5 qcm2 .... 0000000002 x% qj3n% qcm2 .... 0000000003 xufnc qj3n5 qcm2 .... the rows 1 and 3 are duplicates to each other (both directions), while row 2 is duplicate to row 1 and row 3 (single direction). currently i'm using a java tool that selects all rows from the table and then generates SELECT queries for every single one of them. the table has two indexes on (col2,col3,col4,col5) and (col3,col2,col4,col5). As I use one query for each row, the server is able to pick the most valueable key each time. SELECT COL1 FROM myTable WHERE COL2 = 'xufnc' AND COL3 = 'qj3n5' AND COL4 = 'qcm2' AND ... AND COL1 != '0000000001'; SELECT COL1 FROM myTable WHERE COL2 LIKE 'x%' AND COL3 LIKE 'qj3n%' AND COL4 = 'qcm2' AND ... AND COL1 != '0000000002'; well, as you probably think yourself sending 1.800.000 queries to the server sux a lot :-) on the other hand the LIKE operator that some of the queries require makes choosing the most valueable key essential. anyone with a better idea to solve this? maybe the whole whitespaces-in-rows-thing can be solved differently? regards kst -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]