Hi everyone! I have a very large 2-column table (about 500M records) from which I want to remove duplicate records.
I have tried many approaches, but they all take forever.

The table's definition consists of two short TEXT columns. It is a temporary table generated from a query:

    CREATE TEMP TABLE huge_table AS SELECT x, y FROM ... ;

Initially I tried

    CREATE TEMP TABLE huge_table AS SELECT DISTINCT x, y FROM ... ;

but after waiting for nearly an hour I aborted the query, and repeated it after getting rid of the DISTINCT clause.

Everything takes forever with this monster! It's uncanny. Even printing it out to a file takes forever, let alone creating an index for it.

Any words of wisdom on how to speed this up would be appreciated.

TIA!

Kynn
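
P.S. In case it helps anyone reproduce the setup, here is a rough sketch of the two statements in the order I ran them. The real FROM clause is omitted above, so source_rows is only a hypothetical, empty stand-in for it, and the DROP is there just so the sketch runs from top to bottom.

```sql
-- Hypothetical stand-in for the real source query, which is not shown here.
CREATE TEMP TABLE source_rows (x TEXT, y TEXT);

-- First attempt: de-duplicate while building the table.
-- On the real data this was aborted after nearly an hour.
CREATE TEMP TABLE huge_table AS
SELECT DISTINCT x, y
FROM source_rows;

-- Only needed in this sketch; the aborted attempt never created the table.
DROP TABLE IF EXISTS huge_table;

-- Second attempt: same query without DISTINCT, so duplicates remain.
CREATE TEMP TABLE huge_table AS
SELECT x, y
FROM source_rows;
```

The second statement is the one that produced the roughly 500M-record table I am now trying to clean up.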