Thanks baron, I will try this just for test purposes as I already wrote a script, which is slow but not as bad as using IN()
Olaf On 9/5/07 3:29 PM, "Baron Schwartz" <[EMAIL PROTECTED]> wrote: > IN() subqueries in MySQL are badly optimized. It's usually better to > use a JOIN, even though it's non-standard: > > DELETE geno_260k.* > FROM geno_260k > INNER JOIN ( > SELECT ident FROM geno_260k > WHERE a1=0 > GROUP BY ident HAVING count(*)>250000 > ) AS der USING(ident); > > Try profiling this and see if it's faster. It probably will be on any > reasonably large data set, as long as the table has an index on ident. > > Note I changed the COUNT(a1) to COUNT(*) for efficiency. Counting a > column counts the number of values (e.g. non-null). Counting * just > counts the number of rows and can be faster. COUNT(*) is what you want > to use 99% of the time. > > Regards > Baron > > Olaf Stein wrote: >> Thanks, >> >> This seems to work but that IN seems to be really slow... >> >> >> On 9/5/07 9:41 AM, "Justin" <[EMAIL PROTECTED]> wrote: >> >>> try >>> >>> SELECT * from geno_260k WHERE ident IN (SELECT ident FROM geno_260k WHERE >>> a1=0 >>> GROUP BY ident HAVING count(a1)>250000); >>> >>> This will give you what you're deleting first.. then if that is good. do >>> >>> >>> DELETE FROM geno_260k WHERE ident IN (SELECT ident FROM geno_260k WHERE >>> a1=0 >>> GROUP BY ident HAVING count(a1)>250000); >>> >>> (note the change in case is just my way of seeing things.. it's not >>> necessary that I know of) >>> >>> >>> ----- Original Message ----- >>> From: "Olaf Stein" <[EMAIL PROTECTED]> >>> To: "MySql" <mysql@lists.mysql.com> >>> Sent: Wednesday, September 05, 2007 9:35 AM >>> Subject: Delete query question >>> >>> >>>> Hey all >>>> >>>> I am stuck here (thinking wise) and need some ideas: >>>> >>>> I have this table: >>>> >>>> CREATE TABLE `geno_260k` ( >>>> `genotype_id` int(10) unsigned NOT NULL auto_increment, >>>> `ident` int(10) unsigned NOT NULL, >>>> `marker_id` int(10) unsigned NOT NULL, >>>> `a1` tinyint(3) unsigned NOT NULL, >>>> `a2` tinyint(3) unsigned NOT NULL default '0', >>>> PRIMARY KEY (`genotype_id`), >>>> KEY `ident` (`ident`), >>>> KEY `marker_id` (`marker_id`), >>>> CONSTRAINT `geno_260k_ibfk_2` FOREIGN KEY (`marker_id`) REFERENCES >>>> `markers` (`marker_id`), >>>> CONSTRAINT `geno_260k_ibfk_1` FOREIGN KEY (`ident`) REFERENCES >>>> `individual` (`ident`) >>>> ) ENGINE=InnoDB AUTO_INCREMENT=91318273 DEFAULT CHARSET=utf8 >>>> >>>> >>>> And with the following query I get 159 ident's back: >>>> >>>> select ident from geno_260k where a1=0 group by ident having >>>> count(a1)>250000; >>>> >>>> I want to delete all records containing those idents (about 260000 per >>>> ident >>>> so 159*260000). >>>> So I thought >>>> >>>> delete from geno_260k where ident=(select ident from geno_260k where a1=0 >>>> group by ident having count(a1)>250000); >>>> >>>> But mysql can not select and delete from the same table. >>>> >>>> Any ideas? >>>> >>>> Thanks >>>> Olaf >>>> >>>> >>>> >>>> -- >>>> MySQL General Mailing List >>>> For list archives: http://lists.mysql.com/mysql >>>> To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] >>>> >>>> >> >> >> >> >> >> ------------------------- >> Olaf Stein >> DBA >> Center for Quantitative and Computational Biology >> Columbus Children's Research Institute >> 700 Children's Drive >> phone: 1-614-355-5685 >> cell: 1-614-843-0432 >> email: [EMAIL PROTECTED] >> >> -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]