On Fri, Jan 1, 2016 at 2:12 PM, Peter Brawley <peter.braw...@earthlink.net> wrote: > On 12/31/2015 0:51, Larry Martell wrote: >> >> I need to count the number of rows in a table that are grouped by a >> list of columns, but I also need to exclude rows that have more then >> some count when grouped by a different set of columns. Conceptually, >> this is not hard, but I am having trouble doing this efficiently. >> >> My first counting query would be this: >> >> SELECT count(*) >> FROM cst_rollup >> GROUP BY target_name_id, ep, roiname, recipe_process, >> recipe_product, recipe_layer, f_tag_bottom, >> measname, recipe_id >> >> But from this count I need to subtract the count of rows that have >> more then 50 rows with a different grouping: >> >> SELECT count(*) >> FROM cst_rollup >> GROUP BY target_name_id, ep, wafer_id >> HAVING count(*) >= 50 >> >> As you can see, the second query has wafer_id, but the first query does >> not. >> >> Currently I am doing this in python, and it's slow. In my current >> implementation I have one query, and it selects the columns (i.e. >> doesn't just count), and I have added wafer_id: >> >> SELECT target_name_id, ep, roiname, recipe_process, >> recipe_product, recipe_layer, f_tag_bottom, >> measname, recipe_id, wafer_id >> FROM cst_rollup >> >> Then I go through the result set (which can be over 200k rows) and I >> count the number of rows with matching (target_name_id, ep, wafer_id). >> Then I go through the rows again and regroup them without wafer_id, >> but skipping the rows that have more then 50 rows for that row's >> (target_name_id, ep, wafer_id). >> >> Is this clear to everyone what I am trying to do? > > > If I've understand this correctly, the resultset you wish to aggregate on is > ... > > select target_name_id, ep, wafer_id > from cst_rollup a > left join ( -- exclude rows for which wafer_id count >= 50 > select name_id, ep, wafer, count(*) n > from cst_rollup > group by target_name_id, ep, wafer_id > having n >= 50 > ) b using ( target_name_id, ep, wafer_id ) > where b.target_name is null ; > > If that's so, you could assemble that resultset in a temp table then run the > desired aggregate query on it, or you could aggregate on it directly as a > subquery.
That query gives: ERROR 1137 (HY000): Can't reopen table: 'a' -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql