OK, I think I have this working. The last join was this: JOIN (select id, target_name_id, ep,date_time from data_cst WHERE target_name_id IN (775, 776, 777, 778, 779, 780, 45, 44, 116, 117, 118, 119, 120, 121) AND data_cst.date_time BETWEEN '2010-03-04 00:00:00' AND '2010-03-04 23:59:59' GROUP BY target_name_id, ep HAVING count(*) < 999 ORDER BY ABS(data_cst.bottom - AVG(bottom))) cm on cm.target_name_id = q1.target_name_id and cm.ep = q1.ep and cm.date_time BETWEEN '2010-03-04 00:00:00' AND '2010-03-04 23:59:59'
I have do to more testing to ensure I'm getting the correct rows back. Thanks for all the help! On Wed, Sep 19, 2012 at 10:07 AM, Larry Martell <larry.mart...@gmail.com> wrote: > On Tue, Sep 18, 2012 at 7:41 PM, Rick James <rja...@yahoo-inc.com> wrote: >> SELECT ((the appropriate id)) -- <-- >> FROM data_cst, rollup, data_target >> WHERE data_target.name = rollup.Target >> AND data_cst.ep = rollup.EP >> AND data_cst.date_time BETWEEN '2010-03-04 00:00:00' AND '2010-03-04 >> 23:59:59' >> AND data_target.id = data_cst.target_name_id >> GROUP BY rollup.Target, rollup.EP >> ORDER BY ABS(data_cst.bottom - rollup.`Avg Bottom`) >> LIMIT 1 -- <-- > > rollup is the name of the temp table that I was testing with. It > contains the output of the original query. I don't have that in the > real world. That's my issue - I can't figure out how to join with the > result set and apply the order by to the join condition. What would I > use in place of rollup to get the 'Avg Bottom'? > > Here's what my query looks like now: > > SELECT data_target.name as Target, > q1.ep as EP, > COUNT(*) as Wafers, > Lots, > SUM(numonep)/(COUNT(*)+SUM(CASE WHEN reruns > 0 THEN reruns > ELSE 0 END)) as 'Sites/Wafer', > MAX(LastRun) as "Last Run", > SUM(CASE WHEN reruns > 0 THEN reruns ELSE 0 END) as Rerun, > COUNT(*)+SUM(CASE WHEN reruns > 0 THEN reruns ELSE 0 END) as Runs, > avgbottom as "Avg Bottom", > 3*stdbottom as "3 Sig", > maxbottom as Max, > minbottom as Min, > SUM(numonep) as Count, SUM(numonep) - SUM(numbottoms) as NAs, > 100-((SUM(numonep) - SUM(numbottoms))/SUM(numonep)*100) as "% Success", > 3*stdbottom/avgbottom as "3Sig/Avg", > AVG(avgbottom) as 'Wafer Avg', > AVG(Wafer3Sigma) as 'Wafer 3 Sigma', > AVG(Ranges) as 'Avg Range', > 3*STD(Ranges) as '3Sig of Ranges', > MAX(Ranges) as 'Max Range', > MIN(Ranges) as 'Min Range', (SUM(numonep) - > SUM(numbottoms))/COUNT(*) as 'NAs/Wafer', > mx.id as maxID, > mn.id as minID, > lr.id as lrID > FROM (SELECT target_name_id, > ep, > wafer_id, > COUNT(bottom) as numbottoms, > AVG(bottom) as avgbottom, > STD(bottom) as stdbottom, > MAX(bottom) as maxbottom, > MIN(bottom) as minbottom, > MAX(date_time) as "LastRun", > COUNT(*) as numonep, > COUNT(DISTINCT target_name_id, ep, lot_id, > data_file_id)-1 as reruns, > COUNT(DISTINCT(lot_id)) as Lots, > 3*STD(bottom) as Wafer3Sigma, > MAX(bottom) - MIN(bottom) as Ranges > FROM data_cst > WHERE target_name_id IN (775, 776, 777, 778, 779, 780, 45, 44, > 116, 117, 118, 119, 120, 121) > AND data_cst.date_time BETWEEN '2010-03-04 00:00:00' AND > '2010-03-04 23:59:59' > GROUP BY target_name_id, ep, wafer_id > HAVING count(*) < 999) q1 > join data_cst mn ON mn.bottom = q1.minbottom > join data_cst mx on mx.bottom = q1.maxbottom > join data_cst lr on lr.date_time = q1.LastRun, > data_target > WHERE data_target.id = q1.target_name_id > GROUP BY q1.target_name_id, q1.ep; > >> >>> -----Original Message----- >>> From: Larry Martell [mailto:larry.mart...@gmail.com] >>> Sent: Tuesday, September 18, 2012 2:57 PM >>> To: Rick James >>> Cc: peter.braw...@earthlink.net; mysql@lists.mysql.com >>> Subject: Re: getting certain rows from a group by >>> >>> On Tue, Sep 18, 2012 at 4:01 PM, Rick James <rja...@yahoo-inc.com> >>> wrote: >>> > SELECT mn.* >>> > mx.* >>> >> > FROM ( SELECT @min = MIN(bottom), >>> >> > @max = MAX(bottom), ... ) >>> >> > JOIN data_cst mn ON bottom = @min >>> >> > JOIN data_cst mx ON bottom = @max >>> > >>> > That is, the FROM finds the value for the desired row. >>> > The JOIN then gets to the rest of the fields. >>> > Caveat: If two rows have bottom = @min, it returns both. >>> >>> >>> Thanks a lot Rick! This was super helpful. I've got it working for 3 of >>> the 4 cases - min(bottom), max(bottom), and max(date_time). But I can't >>> figure out how to work in the last case - where bottom is closest to >>> avg(bottom). In an individual query I can get it with an order by, like >>> this: >>> >>> SELECT rollup.Target, rollup.EP, rollup.`Avg Bottom`, data_cst.id, >>> data_cst.bottom FROM data_cst, rollup, data_target WHERE >>> data_target.name = rollup.Target AND data_cst.ep = rollup.EP AND >>> data_cst.date_time BETWEEN '2010-03-04 00:00:00' AND '2010-03-04 >>> 23:59:59' >>> AND data_target.id = data_cst.target_name_id GROUP BY rollup.Target, >>> rollup.EP ORDER BY ABS(data_cst.bottom - rollup.`Avg Bottom`); >>> >>> Any way to work that into another join? >>> >>> Thanks! >>> >>> > >>> >> -----Original Message----- >>> >> From: Larry Martell [mailto:larry.mart...@gmail.com] >>> >> Sent: Tuesday, September 18, 2012 12:54 PM >>> >> To: Rick James >>> >> Cc: peter.braw...@earthlink.net; mysql@lists.mysql.com >>> >> Subject: Re: getting certain rows from a group by >>> >> >>> >> On Tue, Sep 18, 2012 at 2:05 PM, Rick James <rja...@yahoo-inc.com> >>> >> wrote: >>> >> > For single query... >>> >> > >>> >> > Plan A: >>> >> > Repeat the SELECT as once for each (MIN, AVG, etc) as a subquery >>> in >>> >> > WHERE bottom = ( SELECT MIN(bottom) ... ) >>> >> > >>> >> > Plan B: >>> >> > FROM ( SELECT @min = MIN(bottom), >>> >> > @max = MAX(bottom), ... ) >>> >> > JOIN data_cst mn ON bottom = @min >>> >> > JOIN data_cst mx ON bottom = @max >>> >> >>> >> >>> >> Can you elaborate on this? I don't see how this will give me what I >>> >> need. I'm not looking for the min or max bottom - I already have >>> that >>> >> - I'm looking for the row from each group that has the min and max >>> >> bottom. >>> >> >>> >> > Plan C: >>> >> > Get rid of 3rd party packages that eventually get in your way >>> >> > instead >>> >> of 'helping'. >>> >> >>> >> It's not django that is getting in the way. django is a fantastic >>> >> framework for web development. It's just changing requirements. The >>> >> original developer (who no longer works here) created a few >>> different >>> >> classes, and then based on the original requirements, chose to use >>> >> the one that supports a single query. That worked then, but may not >>> >> work now. There are other classes that allow multiple queries, but >>> >> then you have to parse the data and stuff it into the context that >>> >> will be picked up by the browser yourself. The single query class >>> >> does that for you. I may yet go that way, but I'm trying to avoid a >>> >> big rewrite if I can. >>> >> >>> >> >> -----Original Message----- >>> >> >> From: Larry Martell [mailto:larry.mart...@gmail.com] >>> >> >> Sent: Tuesday, September 18, 2012 4:57 AM >>> >> >> To: peter.braw...@earthlink.net >>> >> >> Cc: mysql@lists.mysql.com >>> >> >> Subject: Re: getting certain rows from a group by >>> >> >> >>> >> >> On Tue, Sep 18, 2012 at 7:52 AM, Peter Brawley >>> >> >> <peter.braw...@earthlink.net> wrote: >>> >> >> > On 2012-09-18 5:53 AM, Larry Martell wrote: >>> >> >> >> >>> >> >> >> I have this query: >>> >> >> >> >>> >> >> >> SELECT data_target.name as Target, >>> >> >> >> q1.ep as EP, >>> >> >> >> COUNT(*) as Wafers, >>> >> >> >> Lots, >>> >> >> >> SUM(numonep)/(COUNT(*)+SUM(CASE WHEN reruns > 0 THEN >>> >> >> >> reruns ELSE 0 END)) as 'Sites/Wafer', >>> >> >> >> MAX(LastRun) as "Last Run", >>> >> >> >> SUM(CASE WHEN reruns > 0 THEN reruns ELSE 0 END) as >>> >> Rerun, >>> >> >> >> COUNT(*)+SUM(CASE WHEN reruns > 0 THEN reruns ELSE 0 >>> >> >> >> END) as >>> >> >> Runs, >>> >> >> >> avgbottom as "Avg Bottom", >>> >> >> >> 3*stdbottom as "3 Sig", >>> >> >> >> maxbottom as Max, >>> >> >> >> minbottom as Min, >>> >> >> >> SUM(numonep) as Count, >>> >> >> >> SUM(numonep) - SUM(numbottoms) as NAs, >>> >> >> >> 100-((SUM(numonep) - >>> SUM(numbottoms))/SUM(numonep)*100) >>> >> as >>> >> >> "% >>> >> >> >> Success", >>> >> >> >> 3*stdbottom/avgbottom as "3Sig/Avg", >>> >> >> >> AVG(avgbottom) as 'Wafer Avg', >>> >> >> >> AVG(Wafer3Sigma) as 'Wafer 3 Sigma', >>> >> >> >> AVG(Ranges) as 'Avg Range', >>> >> >> >> 3*STD(Ranges) as '3Sig of Ranges', >>> >> >> >> MAX(Ranges) as 'Max Range', >>> >> >> >> MIN(Ranges) as 'Min Range', >>> >> >> >> (SUM(numonep) - SUM(numbottoms))/COUNT(*) as >>> 'NAs/Wafer' >>> >> >> >> FROM (SELECT target_name_id, >>> >> >> >> ep, >>> >> >> >> wafer_id, >>> >> >> >> COUNT(bottom) as numbottoms, >>> >> >> >> AVG(bottom) as avgbottom, >>> >> >> >> STD(bottom) as stdbottom, >>> >> >> >> MAX(bottom) as maxbottom, >>> >> >> >> MIN(bottom) as minbottom, >>> >> >> >> MAX(date_time) as "LastRun", >>> >> >> >> COUNT(*) as numonep, >>> >> >> >> COUNT(DISTINCT target_name_id, ep, lot_id, >>> >> >> >> data_file_id)-1 as reruns, >>> >> >> >> COUNT(DISTINCT(lot_id)) as Lots, >>> >> >> >> 3*STD(bottom) as Wafer3Sigma, >>> >> >> >> MAX(bottom) - MIN(bottom) as Ranges >>> >> >> >> FROM data_cst >>> >> >> >> WHERE target_name_id IN (775, 776, 777, 778, 779, 780, >>> >> >> >> 45, >>> >> >> 44, >>> >> >> >> 116, 117, 118, 119, 120, 121) >>> >> >> >> AND data_cst.date_time BETWEEN '2010-03-04 00:00:00' >>> AND >>> >> >> >> '2010-03-04 23:59:59' >>> >> >> >> GROUP BY target_name_id, ep, wafer_id >>> >> >> >> HAVING count(*) < 999) q1, >>> >> >> >> data_target >>> >> >> >> WHERE data_target.id = target_name_id GROUP BY >>> >> >> >> q1.target_name_id, q1.ep; >>> >> >> >> >>> >> >> >> >>> >> >> >> This works fine. But now I need to get a certain column >>> >> >> >> (image_measurer_id) with each row returned that corresponds to >>> >> the >>> >> >> >> row from the group that has bottom = Min(bottom), bottom = >>> >> >> >> Max(bottom), bottom closest to Avg(bottom), and bottom from >>> the >>> >> >> >> row where date_time = Max(date_time). >>> >> >> >> >>> >> >> >> Is this even possible from one query? >>> >> >> > >>> >> >> > >>> >> >> > Might be, but what's the importance of doing it as one query? >>> >> >> >>> >> >> Because it's part of a django based web app, and the class that >>> >> >> this is part of only supports having a single query. To use >>> >> >> multiple queries will require a fairly major rewrite of the >>> server >>> >> >> side of >>> >> that app. >>> >> >> >>> >> >> > I'd start by >>> >> >> > saving this result to a temp table and developing the new >>> query. >>> >> >> > When that's running, see if you can to optimise a query built >>> by >>> >> >> > replacing the reference to the temp table with the original >>> query. >>> >> >> >>> >> >> Thanks, I'll look into this. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql