Well guys, Your replies have been more than helpful to me, showing me both the learning stuff I still have to get in my mind about real SQL and the wonder called PostgreSQL and a very good solution from Tom Lane (thanks a lot sir!)!
Indeed, changing mem_sort and other server parmeters along with the quite strange (to me!) outer join Tom mentioned finally got me to finalize the cleaning task and indeed in warp speed (some 5 mintues or less!). I am running PG v7.4.5 on a PIV Celeron 1,7Ghz with 1,5Gb ram so talking about the time performance I might say that I'm more than pleased with the result. As with the amazement PG "caused" me through its reliability so far I am decided to go even deeper in learning it! Thanks again all for your precious help! Regards, Victor > > If you are using PG 7.4, you can get reasonable performance out of this > approach, but you need to jack sort_mem up to the point where the whole > DIRTY table will fit into sort_mem (so that you get a hashed-subplan > plan and not a plain subplan). If you find yourself setting sort_mem to > more than say half of your machine's available RAM, you should probably > forget that idea. > > > [explain] select * from bigma,dirty where bigma.email!=dirty.email; > > This of course does not give the right answer at all. > > A trick that people sometimes use is an outer join: > > select * from bigma left join dirty on (bigma.email=dirty.email) > where dirty.email is null; > > Understanding why this works is left as an exercise for the reader > ... but it does work, and pretty well too. If you're using pre-7.4 > PG then this is about the only effective solution AFAIR. > > regards, tom lane > ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match