I'd expect it to be quicker, especially in your situation where you are updating potentially hundreds of records at a time. If you have 250 records to update, today you're performing 500 queries - first a select and then an update for each parentid. This is one query for all 250 records.
I haven't ever used INSERT - SELECT - ON DUPLICATE KEY UPDATE myself for anything in production, but I have used its cousin, REPLACE, on a couple of projects which have been in production for more than 5 years, and it's very fast. Dan On 9/26/06, Robert DiFalco <[EMAIL PROTECTED]> wrote:
Hmmmm....do you think this would perform better than simply using the two queries? I wonder if the overhead associated with the ON DUPLICATE KEY exception and handler might not outweigh the benefits of a single query. -----Original Message----- From: Dan Buettner [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 26, 2006 7:15 AM To: Robert DiFalco Cc: mysql@lists.mysql.com Subject: Re: Updating two fields from an aggregate query Robert, you might give "insert ... select ... on duplicate key update" a try: http://dev.mysql.com/doc/refman/5.0/en/insert-select.html something like this (untested): INSERT INTO parent (id, maxChildAge, childCount) SELECT parentid, MAX(age) as maxAge, COUNT(*) as ct FROM child WHERE parentid IN (1, 2, 3, 4) ON DUPLICATE KEY UPDATE maxChildAge=maxAge, childCount=ct That seems like it ought to work as long as the id column in the parent table is a unique key. One consideration is that if you are writing this query programmatically and using a long list of parentid values in the IN clause, the SQL could get pretty long, perhaps too long for the default value of MAX_ALLOWED_PACKET. You can adjust that up pretty easily, but bear in mind you need to adjust it for both client and server. http://dev.mysql.com/doc/refman/5.0/en/packet-too-large.html HTH, Dan On 9/26/06, Robert DiFalco <[EMAIL PROTECTED]> wrote: > I have two tables that are related: > > Parent > LONG id > LONG childCount > LONG maxChildAge > ... > > Child > LONG parentId > LONG age > ... > > There can be thousands of parents and millions of children, that is > why I have denormalized "childCount" and "maxChildAge". The values are > too expensive to calculate each time the data is viewed so I update > these values each time a Child is added, removed, or modified. > > I currently have to update the Parent table with two queries like so: > > SELECT MAX( Child.age ), COUNT(*) > FROM Child > WHERE parentID = <x>; > > UPDATE Parent > SET maxChildAge = MAX, childCount = COUNT > WHERE id = <x>; > > Worse yet I might be updating the stats for several hundred Parents at > a time, so I have to loop through the above where <x> is the current > Parent.id in the batch. What I would like to do is something like the > following (made up syntax): > > UPDATE Parent > SET maxChildAge AND childCount = COUNT > WITH ( SELECT MAX( Child.age ), COUNT(*) > FROM Child > WHERE parentID = Parent.id ) > WHERE id IN ( <set_of_parents_to_update> ); > > Any suggestions? > > TIA > > R. > > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] > > -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]