Re: Updating two fields from an aggregate query
Robert, you might give insert ... select ... on duplicate key update a try: http://dev.mysql.com/doc/refman/5.0/en/insert-select.html something like this (untested): INSERT INTO parent (id, maxChildAge, childCount) SELECT parentid, MAX(age) as maxAge, COUNT(*) as ct FROM child WHERE parentid IN (1, 2, 3, 4) ON DUPLICATE KEY UPDATE maxChildAge=maxAge, childCount=ct That seems like it ought to work as long as the id column in the parent table is a unique key. One consideration is that if you are writing this query programmatically and using a long list of parentid values in the IN clause, the SQL could get pretty long, perhaps too long for the default value of MAX_ALLOWED_PACKET. You can adjust that up pretty easily, but bear in mind you need to adjust it for both client and server. http://dev.mysql.com/doc/refman/5.0/en/packet-too-large.html HTH, Dan On 9/26/06, Robert DiFalco [EMAIL PROTECTED] wrote: I have two tables that are related: Parent LONG id LONG childCount LONG maxChildAge ... Child LONG parentId LONG age ... There can be thousands of parents and millions of children, that is why I have denormalized childCount and maxChildAge. The values are too expensive to calculate each time the data is viewed so I update these values each time a Child is added, removed, or modified. I currently have to update the Parent table with two queries like so: SELECT MAX( Child.age ), COUNT(*) FROM Child WHERE parentID = x; UPDATE Parent SET maxChildAge = MAX, childCount = COUNT WHERE id = x; Worse yet I might be updating the stats for several hundred Parents at a time, so I have to loop through the above where x is the current Parent.id in the batch. What I would like to do is something like the following (made up syntax): UPDATE Parent SET maxChildAge AND childCount = COUNT WITH ( SELECT MAX( Child.age ), COUNT(*) FROM Child WHERE parentID = Parent.id ) WHERE id IN ( set_of_parents_to_update ); Any suggestions? TIA R. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Updating two fields from an aggregate query
I have two tables that are related: Parent LONG id LONG childCount LONG maxChildAge ... Child LONG parentId LONG age ... There can be thousands of parents and millions of children, that is why I have denormalized childCount and maxChildAge. The values are too expensive to calculate each time the data is viewed so I update these values each time a Child is added, removed, or modified. I currently have to update the Parent table with two queries like so: SELECT MAX( Child.age ), COUNT(*) FROM Child WHERE parentID = x; UPDATE Parent SET maxChildAge = MAX, childCount = COUNT WHERE id = x; Worse yet I might be updating the stats for several hundred Parents at a time, so I have to loop through the above where x is the current Parent.id in the batch. What I would like to do is something like the following (made up syntax): UPDATE Parent SET maxChildAge AND childCount = COUNT WITH ( SELECT MAX( Child.age ), COUNT(*) FROM Child WHERE parentID = Parent.id ) WHERE id IN ( set_of_parents_to_update ); Any suggestions? TIA R. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: RE: Updating two fields from an aggregate query
I'd expect it to be quicker, especially in your situation where you are updating potentially hundreds of records at a time. If you have 250 records to update, today you're performing 500 queries - first a select and then an update for each parentid. This is one query for all 250 records. I haven't ever used INSERT - SELECT - ON DUPLICATE KEY UPDATE myself for anything in production, but I have used its cousin, REPLACE, on a couple of projects which have been in production for more than 5 years, and it's very fast. Dan On 9/26/06, Robert DiFalco [EMAIL PROTECTED] wrote: Hdo you think this would perform better than simply using the two queries? I wonder if the overhead associated with the ON DUPLICATE KEY exception and handler might not outweigh the benefits of a single query. -Original Message- From: Dan Buettner [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 26, 2006 7:15 AM To: Robert DiFalco Cc: mysql@lists.mysql.com Subject: Re: Updating two fields from an aggregate query Robert, you might give insert ... select ... on duplicate key update a try: http://dev.mysql.com/doc/refman/5.0/en/insert-select.html something like this (untested): INSERT INTO parent (id, maxChildAge, childCount) SELECT parentid, MAX(age) as maxAge, COUNT(*) as ct FROM child WHERE parentid IN (1, 2, 3, 4) ON DUPLICATE KEY UPDATE maxChildAge=maxAge, childCount=ct That seems like it ought to work as long as the id column in the parent table is a unique key. One consideration is that if you are writing this query programmatically and using a long list of parentid values in the IN clause, the SQL could get pretty long, perhaps too long for the default value of MAX_ALLOWED_PACKET. You can adjust that up pretty easily, but bear in mind you need to adjust it for both client and server. http://dev.mysql.com/doc/refman/5.0/en/packet-too-large.html HTH, Dan On 9/26/06, Robert DiFalco [EMAIL PROTECTED] wrote: I have two tables that are related: Parent LONG id LONG childCount LONG maxChildAge ... Child LONG parentId LONG age ... There can be thousands of parents and millions of children, that is why I have denormalized childCount and maxChildAge. The values are too expensive to calculate each time the data is viewed so I update these values each time a Child is added, removed, or modified. I currently have to update the Parent table with two queries like so: SELECT MAX( Child.age ), COUNT(*) FROM Child WHERE parentID = x; UPDATE Parent SET maxChildAge = MAX, childCount = COUNT WHERE id = x; Worse yet I might be updating the stats for several hundred Parents at a time, so I have to loop through the above where x is the current Parent.id in the batch. What I would like to do is something like the following (made up syntax): UPDATE Parent SET maxChildAge AND childCount = COUNT WITH ( SELECT MAX( Child.age ), COUNT(*) FROM Child WHERE parentID = Parent.id ) WHERE id IN ( set_of_parents_to_update ); Any suggestions? TIA R. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: Updating two fields from an aggregate query
Hdo you think this would perform better than simply using the two queries? I wonder if the overhead associated with the ON DUPLICATE KEY exception and handler might not outweigh the benefits of a single query. -Original Message- From: Dan Buettner [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 26, 2006 7:15 AM To: Robert DiFalco Cc: mysql@lists.mysql.com Subject: Re: Updating two fields from an aggregate query Robert, you might give insert ... select ... on duplicate key update a try: http://dev.mysql.com/doc/refman/5.0/en/insert-select.html something like this (untested): INSERT INTO parent (id, maxChildAge, childCount) SELECT parentid, MAX(age) as maxAge, COUNT(*) as ct FROM child WHERE parentid IN (1, 2, 3, 4) ON DUPLICATE KEY UPDATE maxChildAge=maxAge, childCount=ct That seems like it ought to work as long as the id column in the parent table is a unique key. One consideration is that if you are writing this query programmatically and using a long list of parentid values in the IN clause, the SQL could get pretty long, perhaps too long for the default value of MAX_ALLOWED_PACKET. You can adjust that up pretty easily, but bear in mind you need to adjust it for both client and server. http://dev.mysql.com/doc/refman/5.0/en/packet-too-large.html HTH, Dan On 9/26/06, Robert DiFalco [EMAIL PROTECTED] wrote: I have two tables that are related: Parent LONG id LONG childCount LONG maxChildAge ... Child LONG parentId LONG age ... There can be thousands of parents and millions of children, that is why I have denormalized childCount and maxChildAge. The values are too expensive to calculate each time the data is viewed so I update these values each time a Child is added, removed, or modified. I currently have to update the Parent table with two queries like so: SELECT MAX( Child.age ), COUNT(*) FROM Child WHERE parentID = x; UPDATE Parent SET maxChildAge = MAX, childCount = COUNT WHERE id = x; Worse yet I might be updating the stats for several hundred Parents at a time, so I have to loop through the above where x is the current Parent.id in the batch. What I would like to do is something like the following (made up syntax): UPDATE Parent SET maxChildAge AND childCount = COUNT WITH ( SELECT MAX( Child.age ), COUNT(*) FROM Child WHERE parentID = Parent.id ) WHERE id IN ( set_of_parents_to_update ); Any suggestions? TIA R. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]