I need to count the number of rows in a table that are grouped by a
list of columns, but I also need to exclude rows that have more then
some count when grouped by a different set of columns. Conceptually,
this is not hard, but I am having trouble doing this efficiently.

My first counting query would be this:

SELECT count(*)
FROM cst_rollup
GROUP BY target_name_id, ep, roiname, recipe_process,
recipe_product, recipe_layer, f_tag_bottom,
measname, recipe_id

But from this count I need to subtract the count of rows that have
more then 50 rows with a different grouping:

SELECT count(*)
FROM cst_rollup
GROUP BY target_name_id, ep, wafer_id
HAVING count(*) >= 50

As you can see, the second query has wafer_id, but the first query does not.

Currently I am doing this in python, and it's slow. In my current
implementation I have one query, and it selects the columns (i.e.
doesn't just count), and I have added wafer_id:

SELECT target_name_id, ep, roiname, recipe_process,
recipe_product, recipe_layer, f_tag_bottom,
measname, recipe_id, wafer_id
FROM cst_rollup

Then I go through the result set (which can be over 200k rows) and I
count the number of rows with matching (target_name_id, ep, wafer_id).
Then I go through the rows again and regroup them without wafer_id,
but skipping the rows that have more then 50 rows for that row's
(target_name_id, ep, wafer_id).

Is this clear to everyone what I am trying to do?

I'd like to do this all in sql with count because then I do not have
to actually return and parse the data in python.

Can anyone think of a way to do this in sql in a way that will be more
efficient then my current implementation?


Thanks!
-Larry

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql

Reply via email to