IN() subqueries in MySQL are badly optimized. It's usually better to
use a JOIN, even though it's non-standard:
DELETE geno_260k.*
FROM geno_260k
INNER JOIN (
SELECT ident FROM geno_260k
WHERE a1=0
GROUP BY ident HAVING count(*)>250000
) AS der USING(ident);
Try profiling this and see if it's faster. It probably will be on any
reasonably large data set, as long as the table has an index on ident.
Note I changed the COUNT(a1) to COUNT(*) for efficiency. Counting a
column counts the number of values (e.g. non-null). Counting * just
counts the number of rows and can be faster. COUNT(*) is what you want
to use 99% of the time.
Regards
Baron
Olaf Stein wrote:
Thanks,
This seems to work but that IN seems to be really slow...
On 9/5/07 9:41 AM, "Justin" <[EMAIL PROTECTED]> wrote:
try
SELECT * from geno_260k WHERE ident IN (SELECT ident FROM geno_260k WHERE
a1=0
GROUP BY ident HAVING count(a1)>250000);
This will give you what you're deleting first.. then if that is good. do
DELETE FROM geno_260k WHERE ident IN (SELECT ident FROM geno_260k WHERE
a1=0
GROUP BY ident HAVING count(a1)>250000);
(note the change in case is just my way of seeing things.. it's not
necessary that I know of)
----- Original Message -----
From: "Olaf Stein" <[EMAIL PROTECTED]>
To: "MySql" <mysql@lists.mysql.com>
Sent: Wednesday, September 05, 2007 9:35 AM
Subject: Delete query question
Hey all
I am stuck here (thinking wise) and need some ideas:
I have this table:
CREATE TABLE `geno_260k` (
`genotype_id` int(10) unsigned NOT NULL auto_increment,
`ident` int(10) unsigned NOT NULL,
`marker_id` int(10) unsigned NOT NULL,
`a1` tinyint(3) unsigned NOT NULL,
`a2` tinyint(3) unsigned NOT NULL default '0',
PRIMARY KEY (`genotype_id`),
KEY `ident` (`ident`),
KEY `marker_id` (`marker_id`),
CONSTRAINT `geno_260k_ibfk_2` FOREIGN KEY (`marker_id`) REFERENCES
`markers` (`marker_id`),
CONSTRAINT `geno_260k_ibfk_1` FOREIGN KEY (`ident`) REFERENCES
`individual` (`ident`)
) ENGINE=InnoDB AUTO_INCREMENT=91318273 DEFAULT CHARSET=utf8
And with the following query I get 159 ident's back:
select ident from geno_260k where a1=0 group by ident having
count(a1)>250000;
I want to delete all records containing those idents (about 260000 per
ident
so 159*260000).
So I thought
delete from geno_260k where ident=(select ident from geno_260k where a1=0
group by ident having count(a1)>250000);
But mysql can not select and delete from the same table.
Any ideas?
Thanks
Olaf
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
-------------------------
Olaf Stein
DBA
Center for Quantitative and Computational Biology
Columbus Children's Research Institute
700 Children's Drive
phone: 1-614-355-5685
cell: 1-614-843-0432
email: [EMAIL PROTECTED]
--
Baron Schwartz
Xaprb LLC
http://www.xaprb.com/
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]