Re: SQL query question for GROUP BY
On Fri, Apr 11, 2008 at 4:01 PM, Victor Danilchenko <[EMAIL PROTECTED]> wrote:
> Oooh, this looks evil. It seems like such a simple thing. I guess
> creating max(log_date) as a field, and then joining on it, is a solution --
> but my actual query (not the abridged version) is already half a page long.

A derived table or a LEFT JOIN are your best bets, as shown here:
http://dev.mysql.com/doc/refman/5.0/en/example-maximum-column-group-row.html

In most cases, the derived table is best. MySQL creates a temp table for it automatically, so it's similar to using a view. My experiments with actual views gave dismal performance, and the user variable trick described on Baron's blog is pretty hard to get right.

> I think at this point, unless someone else suggests a better
> solution, this would be easier to do programmatically -- skip the group
> altogether, and instead simply order the rows, and grab the last one for
> each username in code.

If you don't have that much data to worry about then this could be good, but it's often tricky to code correctly because of the state you have to keep track of.

Also, use UNION ALL if you don't need MySQL to remove duplicate rows. It makes a big difference in performance.

- Perrin

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
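For readers weighing the "do it in code" option discussed in this message, here is a minimal Python sketch of the state that has to be tracked. It is an illustration only: the row layout (id, username, log_date, event_type) is taken from the SELECT list quoted elsewhere in the thread, the sample rows and the function name `last_row_per_user` are made up, and log_date is assumed to be an ISO-style string that sorts chronologically. Rather than relying on the rows arriving in order, this variant keeps the newest row seen so far for each username in a dictionary.

```python
from typing import Dict, Iterable, Tuple

# Assumed row layout: (id, username, log_date, event_type)
Row = Tuple[int, str, str, str]


def last_row_per_user(rows: Iterable[Row]) -> Dict[str, Row]:
    """Return, for each username, the row with the greatest log_date."""
    latest: Dict[str, Row] = {}
    for row in rows:
        username, log_date = row[1], row[2]
        current = latest.get(username)
        # Keep the row only if it is newer than the one already stored.
        # On a tie, the row seen first wins.
        if current is None or log_date > current[2]:
            latest[username] = row
    return latest


# Hypothetical sample data, as if already fetched from tables a and b.
rows = [
    (1, "alice", "2008-04-10 09:00:00", "login"),
    (2, "bob", "2008-04-10 09:05:00", "login"),
    (3, "alice", "2008-04-11 16:00:00", "logout"),
]
result = last_row_per_user(rows)
```

The dictionary is the "state" in question: it grows with the number of distinct usernames, and every row of both tables still has to be transferred to the client, which is why this tends to suit small data sets only.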
Re: SQL query question for GROUP BY
I just thought of something else... could the same be accomplished using stored routines? I could find no way in MySQL to create stored routines that can be used as aggregates in 'GROUP BY' queries, though. If this were possible, it should then also be possible to define a 'LAST' stored routine, or something similar, which would output a given field's value from the row where some other field (say, a numeric ID or a timestamp) is the highest in its group.

This looks to be possible with external functions ('CREATE AGGREGATE FUNCTION'), but this would require writing an external library to handle the call, too. It would be strange if it were impossible to create an aggregate stored procedure. Does anyone know if it's possible to define stored procedures this way?

Rob Wultsch wrote:
> On Fri, Apr 11, 2008 at 1:01 PM, Victor Danilchenko <[EMAIL PROTECTED]> wrote:
>> Oooh, this looks evil. It seems like such a simple thing. I guess
>> creating max(log_date) as a field, and then joining on it, is a solution --
>> but my actual query (not the abridged version) is already half a page long.
>>
>> I think at this point, unless someone else suggests a better
>> solution, this would be easier to do programmatically -- skip the group
>> altogether, and instead simply order the rows, and grab the last one for
>> each username in code.
>>
>> I guess another alternative would be to use a View for the UNIONized
>> query, but doesn't MySQL 'fake' views in 5.0 somehow?
>
> I have used views with good results; however, I have read bad things
> about them. I would not be surprised if they worked well for this use.
>
> I would also not be surprised if the merge storage engine was a better
> option for you.
>
> Possibly interesting:
> http://www.mysqlperformanceblog.com/2007/08/12/mysql-view-as-performance-troublemaker/

--
Victor Danilchenko
Senior Software Engineer, AskOnline.net
[EMAIL PROTECTED] - 617-273-0119
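To make the 'LAST' idea concrete, here is a sketch of what such an aggregate would compute. It does not answer the question about MySQL stored routines: it uses SQLite through Python's standard `sqlite3` module, whose `create_aggregate` call accepts a class with `step` and `finalize` methods. The aggregate name `last_by`, the table `log`, and the sample rows are all hypothetical.

```python
import sqlite3


class LastBy:
    """Aggregate: value of the first argument on the row where the
    second argument (the ordering key) is greatest within the group."""

    def __init__(self):
        self.best_key = None
        self.best_value = None

    def step(self, value, key):
        # Called once per row in the group; rows with a NULL key are skipped.
        if key is None:
            return
        if self.best_key is None or key > self.best_key:
            self.best_key = key
            self.best_value = value

    def finalize(self):
        # Called once per group to produce the aggregate's result.
        return self.best_value


conn = sqlite3.connect(":memory:")
conn.create_aggregate("last_by", 2, LastBy)  # name, argument count, class
conn.executescript("""
    CREATE TABLE log (id INTEGER, username TEXT, log_date TEXT, event_type TEXT);
    INSERT INTO log VALUES (1, 'alice', '2008-04-10 09:00:00', 'login');
    INSERT INTO log VALUES (2, 'bob',   '2008-04-10 09:05:00', 'login');
    INSERT INTO log VALUES (3, 'alice', '2008-04-11 16:00:00', 'logout');
""")

rows = conn.execute("""
    SELECT username, last_by(event_type, log_date), MAX(log_date)
    FROM log
    GROUP BY username
    ORDER BY username
""").fetchall()
```

With an aggregate like this, each non-grouped column needs its own `last_by(column, log_date)` call, so it shortens the query without removing the per-column repetition.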
Re: SQL query question for GROUP BY
On Fri, Apr 11, 2008 at 1:01 PM, Victor Danilchenko <[EMAIL PROTECTED]> wrote:
> Oooh, this looks evil. It seems like such a simple thing. I guess
> creating max(log_date) as a field, and then joining on it, is a solution --
> but my actual query (not the abridged version) is already half a page long.
>
> I think at this point, unless someone else suggests a better
> solution, this would be easier to do programmatically -- skip the group
> altogether, and instead simply order the rows, and grab the last one for
> each username in code.
>
> I guess another alternative would be to use a View for the UNIONized
> query, but doesn't MySQL 'fake' views in 5.0 somehow?

I have used views with good results; however, I have read bad things about them. I would not be surprised if they worked well for this use.

I would also not be surprised if the merge storage engine was a better option for you.

Possibly interesting:
http://www.mysqlperformanceblog.com/2007/08/12/mysql-view-as-performance-troublemaker/

--
Rob Wultsch
[EMAIL PROTECTED]
wultsch (aim)
Re: SQL query question for GROUP BY
Oooh, this looks evil. It seems like such a simple thing. I guess creating max(log_date) as a field, and then joining on it, is a solution -- but my actual query (not the abridged version) is already half a page long.

I think at this point, unless someone else suggests a better solution, this would be easier to do programmatically -- skip the group altogether, and instead simply order the rows, and grab the last one for each username in code.

I guess another alternative would be to use a View for the UNIONized query, but doesn't MySQL 'fake' views in 5.0 somehow?

Rob Wultsch wrote:
> On Fri, Apr 11, 2008 at 11:46 AM, Victor Danilchenko <[EMAIL PROTECTED]> wrote:
>> GROUP BY seems like an obvious choice; 'GROUP BY username', to be
>> exact. However, this seems to produce not the last row's values, but ones
>> from a random row in the group.
>
> Under most databases your query is erroneous. Take a look at
> http://lists.mysql.com/mysql/212084 .
>
>> I don't think the fact that I am doing this on a subquery is
>> relevant, but just in case, I am including this info.
>>
>> Here is what the query looks like, abridged:
>>
>> SELECT id,username,log_date,event_type
>> FROM (SELECT * FROM a
>>       UNION
>>       SELECT * from b) as h
>> GROUP BY username
>
> Read
> http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
>
> You are probably going to end up with a fairly ugly query (mostly
> because of the union), in which you have a derived table that joins
> against a and b again.
>
> SELECT if(a2.id is NULL, b2.id, a2.id), a1.username ...
> FROM (SELECT username, MAX(log_date) as mlog_date
>       FROM (SELECT * FROM a
>             UNION
>             SELECT * from b) as h
>       GROUP BY username) AS a1
> LEFT JOIN a AS a2 ON a1.mlog_date = a2.log_date AND username...
> LEFT JOIN b AS b2 ...
>
> Anyone have a suggestion for how to do this in a way that is not ugly
> as heck?

--
Victor Danilchenko
Senior Software Engineer, AskOnline.net
[EMAIL PROTECTED] - 617-273-0119
Re: SQL query question for GROUP BY
On Fri, Apr 11, 2008 at 11:46 AM, Victor Danilchenko <[EMAIL PROTECTED]> wrote:
> GROUP BY seems like an obvious choice; 'GROUP BY username', to be
> exact. However, this seems to produce not the last row's values, but ones
> from a random row in the group.

Under most databases your query is erroneous. Take a look at
http://lists.mysql.com/mysql/212084 .

> I don't think the fact that I am doing this on a subquery is
> relevant, but just in case, I am including this info.
>
> Here is what the query looks like, abridged:
>
> SELECT id,username,log_date,event_type
> FROM (SELECT * FROM a
>       UNION
>       SELECT * from b) as h
> GROUP BY username

Read
http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/

You are probably going to end up with a fairly ugly query (mostly because of the union), in which you have a derived table that joins against a and b again.

    SELECT if(a2.id is NULL, b2.id, a2.id), a1.username ...
    FROM (SELECT username, MAX(log_date) as mlog_date
          FROM (SELECT * FROM a
                UNION
                SELECT * from b) as h
          GROUP BY username) AS a1
    LEFT JOIN a AS a2 ON a1.mlog_date = a2.log_date AND username...
    LEFT JOIN b AS b2 ...

Anyone have a suggestion for how to do this in a way that is not ugly as heck?

--
Rob Wultsch
[EMAIL PROTECTED]
wultsch (aim)
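As an illustration of the derived-table technique this message points to, here is a small runnable sketch. It is not the exact query from the message: instead of two LEFT JOINs against a and b, it joins the per-user maximum back against the same union, which is one common variation. It runs on SQLite through Python's `sqlite3` module purely so it is self-contained; the column list for tables a and b is assumed from the SELECT in the thread, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema: both tables share the columns named in the thread.
    CREATE TABLE a (id INTEGER, username TEXT, log_date TEXT, event_type TEXT);
    CREATE TABLE b (id INTEGER, username TEXT, log_date TEXT, event_type TEXT);
    INSERT INTO a VALUES (1, 'alice', '2008-04-10 09:00:00', 'login');
    INSERT INTO a VALUES (2, 'bob',   '2008-04-10 09:05:00', 'login');
    INSERT INTO b VALUES (3, 'alice', '2008-04-11 16:00:00', 'logout');
""")

# Derived table m holds the newest log_date per username; joining it back
# to the union h picks out the full row that carries that date.
LATEST_PER_USER = """
    SELECT h.id, h.username, h.log_date, h.event_type
    FROM (SELECT * FROM a UNION SELECT * FROM b) AS h
    JOIN (SELECT username, MAX(log_date) AS mlog_date
          FROM (SELECT * FROM a UNION SELECT * FROM b) AS u
          GROUP BY username) AS m
      ON h.username = m.username AND h.log_date = m.mlog_date
    ORDER BY h.username
"""
latest_rows = conn.execute(LATEST_PER_USER).fetchall()
```

Note that if two rows for the same username share the maximum log_date, this form returns both of them, so a tie-breaker (for example on id) would be needed where exactly one row per user is required.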