On Mon, Jan 13, 2014 at 2:43 AM, Peter Brawley <peter.braw...@earthlink.net> wrote: > > On 2014-01-12 9:13 PM, Larry Martell wrote: >> >> On Sun, Jan 12, 2014 at 2:47 PM, Peter Brawley >> <peter.braw...@earthlink.net> wrote: >>> >>> On 2014-01-12 1:17 PM, Larry Martell wrote: >>>> >>>> I've been asked to do something that I do not think is possible in SQL. >>>> >>>> I have a query that has this basic form: >>>> >>>> SELECT a, b, c, d, AVG(e), STD(e), CONCAT(x, ',', y) as f >>>> FROM t >>>> GROUP BY a, b, c, d, f >>>> >>>> x and y are numbers (378.18, 2213.797 or 378.218, 2213.949 or >>>> 10053.490, 2542.094). >>>> >>>> The business issue is that if either x or y in 2 rows that are in the >>>> same a, b, c, d group are within 1 of each other then they should be >>>> grouped together. And to make it more complicated, the tolerance is >>>> applied as a rolling continuum. For example, if the x and y in a set >>>> of grouped rows are: >>>> >>>> row 1: 1.5, 9.5 >>>> row 2: 2.4, 20.8 >>>> row 3: 3.3, 40.6 >>>> row 4: 4.2, 2.5 >>>> row 5: 5.1, 10.1 >>>> row 6: 6.0, 7.9 >>>> row 7: 8.0, 21.0 >>>> row 8: 100, 200 >>>> >>>> 1 through 6 get combined because all their X values are within the >>>> tolerance of some other X in the set that's been combined. 7's Y value >>>> is within the tolerance of 2's Y, so that should be combined as well. >>>> 8 is not combined because neither the X or Y value is within the >>>> tolerance of any X or Y in the set that was combined. >>>> >>>> In python I can easily parse the data and identify the rows that need >>>> to be combined, but then I've lost the ability to calculate the >>>> average and std. The only way I can think of to do this is to remove >>>> the grouping from the SQL and do all the grouping and aggregating >>>> myself. But this query often returns 20k to 30k rows after grouping. >>>> It could easily be 80k to 100k rows that I have to process if I remove >>>> the grouping and I think that will be very slow. >>>> >>>> Anyone have any ideas? >>> >>> >>> Could you compute the row-to-row values & write them to a temp table, >>> then >>> run the SQL that incorporates that result column? >> >> I thought of temp tables, but I could not come up with a way to use >> them for this. How can I apply the x/y tolerance grouping in sql? > > > Run the query you showed, saving the result to a temp table. In an sproc or > your preferred app language, do the row-to-row processing to generate a new > column in the temp table from the biz rules you outlined, now query the > revised temp table as desired.
Not too clear on how this will help me. The issue with the query I showed is that I don't have the individual rows that make up the aggregate data of the rows I need to combine. I think I have to run a query with no group by and do all the grouping and aggregation myself. In any case, unfortunately this has been made a low priority task and I've been put on to something else (I hate when they do that). I'll revive this thread when I'm allowed to get back on this. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql