Thanks baron,

I will try this just for test purposes as I already wrote a script, which is
slow but not as bad as using IN()

Olaf


On 9/5/07 3:29 PM, "Baron Schwartz" <[EMAIL PROTECTED]> wrote:

> IN() subqueries in MySQL are badly optimized.  It's usually better to
> use a JOIN, even though it's non-standard:
> 
> DELETE geno_260k.*
> FROM geno_260k
>     INNER JOIN (
>        SELECT ident FROM geno_260k
>        WHERE a1=0
>        GROUP BY ident HAVING count(*)>250000
>     ) AS der USING(ident);
> 
> Try profiling this and see if it's faster.  It probably will be on any
> reasonably large data set, as long as the table has an index on ident.
> 
> Note I changed the COUNT(a1) to COUNT(*) for efficiency.  Counting a
> column counts the number of values (e.g. non-null).  Counting * just
> counts the number of rows and can be faster.  COUNT(*) is what you want
> to use 99% of the time.
> 
> Regards
> Baron
> 
> Olaf Stein wrote:
>> Thanks,
>> 
>> This seems to work but that IN seems to be really slow...
>> 
>> 
>> On 9/5/07 9:41 AM, "Justin" <[EMAIL PROTECTED]> wrote:
>> 
>>> try
>>> 
>>> SELECT * from geno_260k WHERE  ident IN (SELECT ident FROM geno_260k WHERE
>>> a1=0
>>>  GROUP BY ident HAVING count(a1)>250000);
>>> 
>>> This will give you what you're deleting first.. then if that is good. do
>>> 
>>> 
>>> DELETE FROM geno_260k WHERE  ident IN (SELECT ident FROM geno_260k WHERE
>>> a1=0
>>>  GROUP BY ident HAVING count(a1)>250000);
>>> 
>>> (note the change in case is just my way of seeing things.. it's not
>>> necessary that I know of)
>>> 
>>> 
>>> ----- Original Message -----
>>> From: "Olaf Stein" <[EMAIL PROTECTED]>
>>> To: "MySql" <mysql@lists.mysql.com>
>>> Sent: Wednesday, September 05, 2007 9:35 AM
>>> Subject: Delete query question
>>> 
>>> 
>>>> Hey all
>>>> 
>>>> I am stuck here (thinking wise) and need some ideas:
>>>> 
>>>> I have this table:
>>>> 
>>>> CREATE TABLE `geno_260k` (
>>>>  `genotype_id` int(10) unsigned NOT NULL auto_increment,
>>>>  `ident` int(10) unsigned NOT NULL,
>>>>  `marker_id` int(10) unsigned NOT NULL,
>>>>  `a1` tinyint(3) unsigned NOT NULL,
>>>>  `a2` tinyint(3) unsigned NOT NULL default '0',
>>>>  PRIMARY KEY  (`genotype_id`),
>>>>  KEY `ident` (`ident`),
>>>>  KEY `marker_id` (`marker_id`),
>>>>  CONSTRAINT `geno_260k_ibfk_2` FOREIGN KEY (`marker_id`) REFERENCES
>>>> `markers` (`marker_id`),
>>>>  CONSTRAINT `geno_260k_ibfk_1` FOREIGN KEY (`ident`) REFERENCES
>>>> `individual` (`ident`)
>>>> ) ENGINE=InnoDB AUTO_INCREMENT=91318273 DEFAULT CHARSET=utf8
>>>> 
>>>> 
>>>> And with the following query I get 159 ident's back:
>>>> 
>>>> select ident from geno_260k where a1=0 group by ident having
>>>> count(a1)>250000;
>>>> 
>>>> I want to delete all records containing those idents (about 260000 per
>>>> ident
>>>> so 159*260000).
>>>> So I thought
>>>> 
>>>> delete from geno_260k where ident=(select ident from geno_260k where a1=0
>>>> group by ident having count(a1)>250000);
>>>> 
>>>> But mysql can not select and delete from the same table.
>>>> 
>>>> Any ideas?
>>>> 
>>>> Thanks
>>>> Olaf
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> MySQL General Mailing List
>>>> For list archives: http://lists.mysql.com/mysql
>>>> To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
>>>> 
>>>> 
>> 
>> 
>> 
>> 
>> 
>> -------------------------
>> Olaf Stein
>> DBA
>> Center for Quantitative and Computational Biology
>> Columbus Children's Research Institute
>> 700 Children's Drive
>> phone: 1-614-355-5685
>> cell: 1-614-843-0432
>> email: [EMAIL PROTECTED]
>> 
>> 


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to