[Ilugc] Mysql query issue

JAGANADH G Fri, 01 Apr 2011 03:36:10 -0700

Hi All

I have a MySQL table that contains around 20,000,00 records.
The table structure is: id (primary key, auto-generated), review (text),
hash (int).


There is a column called "hash", which holds a hash value (generated
programmatically).
I am pretty sure that there are duplicate records in my table. That is why I
generated a hash.
Now I would like to dedupe the table using the hash.
This is the query I used for that purpose:

Positive : table name
id : auto generated id
hash : hash value
review : reviews

delete Positive
from Positive,
    (
        select MIN(id) minIdent, hash s
        from Positive m
        group by hash
        having count(1) > 1
    ) as derived
where Positive.hash= derived.s
and id > minIdent

The above dedupe query works. I checked it on a table which contains
10,000 records, and all the duplicate hash values were removed.
But my problem is that when I try the same query on the large table
(20,000,00 records), it takes too long.
On a test run the query ran for 24 hours and did not complete.
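
To make the intended behaviour concrete, here is a minimal sketch of the
dedupe on a tiny made-up table. It is only an illustration: it uses Python's
sqlite3 module as a stand-in for MySQL, the five sample rows are invented,
and because SQLite has no multi-table DELETE the query is written in an
equivalent "keep MIN(id) per hash" form rather than the exact statement
above. It says nothing about how either form performs on the large table.

```python
import sqlite3

# Hypothetical miniature of the Positive table; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Positive ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, review TEXT, hash INTEGER)"
)

# Invented sample data: hash 11 appears three times, 22 and 33 once each.
rows = [("good", 11), ("good", 11), ("bad", 22), ("good", 11), ("fine", 33)]
conn.executemany("INSERT INTO Positive (review, hash) VALUES (?, ?)", rows)

# Keep the row with the smallest id for each hash and delete the rest.
conn.execute(
    "DELETE FROM Positive "
    "WHERE id NOT IN (SELECT MIN(id) FROM Positive GROUP BY hash)"
)
conn.commit()

remaining = conn.execute(
    "SELECT id, hash FROM Positive ORDER BY id"
).fetchall()
print(remaining)  # [(1, 11), (3, 22), (5, 33)]
```

One row per hash value is left, and it is always the one with the lowest id,
which is the result I expect from the query on the real table.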


Is there anything wrong with the query? I am not much of an expert in databases.

-- 
**********************************
JAGANADH G
http://jaganadhg.freeflux.net/blog
*ILUGCBE*
http://ilugcbe.techstud.org
_______________________________________________
ILUGC Mailing List:
http://www.ae.iitm.ac.in/mailman/listinfo/ilugc
