Re: Updating two fields from an aggregate query

2006-09-26 Thread Dan Buettner

Robert, you might give insert ... select ... on duplicate key update a try:
http://dev.mysql.com/doc/refman/5.0/en/insert-select.html

something like this (untested):

INSERT INTO parent (id, maxChildAge, childCount)
SELECT parentid, MAX(age) as maxAge, COUNT(*) as ct
FROM child
WHERE parentid IN (1, 2, 3, 4)
ON DUPLICATE KEY UPDATE maxChildAge=maxAge, childCount=ct

That seems like it ought to work as long as the id column in the
parent table is a unique key.

One consideration is that if you are writing this query
programmatically and using a long list of parentid values in the IN
clause, the SQL could get pretty long, perhaps too long for the
default value of MAX_ALLOWED_PACKET.  You can adjust that up pretty
easily, but bear in mind you need to adjust it for both client and
server.
http://dev.mysql.com/doc/refman/5.0/en/packet-too-large.html

HTH,
Dan

On 9/26/06, Robert DiFalco [EMAIL PROTECTED] wrote:

I have two tables that are related:

Parent
   LONG id
   LONG childCount
   LONG maxChildAge
   ...

Child
   LONG parentId
   LONG age
   ...

There can be thousands of parents and millions of children, that is why
I have denormalized childCount and maxChildAge. The values are too
expensive to calculate each time the data is viewed so I update these
values each time a Child is added, removed, or modified.

I currently have to update the Parent table with two queries like so:

   SELECT MAX( Child.age ), COUNT(*)
   FROM Child
   WHERE parentID = x;

   UPDATE Parent
   SET maxChildAge = MAX, childCount = COUNT
   WHERE id = x;

Worse yet I might be updating the stats for several hundred Parents at a
time, so I have to loop through the above where x is the current
Parent.id in the batch. What I would like to do is something like the
following (made up syntax):

   UPDATE Parent
   SET maxChildAge AND childCount = COUNT
   WITH ( SELECT MAX( Child.age ), COUNT(*)
  FROM Child
WHERE parentID = Parent.id )
   WHERE id IN ( set_of_parents_to_update );

Any suggestions?

TIA

R.


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Updating two fields from an aggregate query

2006-09-26 Thread Robert DiFalco
I have two tables that are related:

Parent
   LONG id
   LONG childCount
   LONG maxChildAge
   ...

Child
   LONG parentId
   LONG age
   ...

There can be thousands of parents and millions of children, that is why
I have denormalized childCount and maxChildAge. The values are too
expensive to calculate each time the data is viewed so I update these
values each time a Child is added, removed, or modified.

I currently have to update the Parent table with two queries like so:

   SELECT MAX( Child.age ), COUNT(*) 
   FROM Child 
   WHERE parentID = x;

   UPDATE Parent
   SET maxChildAge = MAX, childCount = COUNT
   WHERE id = x;

Worse yet I might be updating the stats for several hundred Parents at a
time, so I have to loop through the above where x is the current
Parent.id in the batch. What I would like to do is something like the
following (made up syntax):

   UPDATE Parent
   SET maxChildAge AND childCount = COUNT
   WITH ( SELECT MAX( Child.age ), COUNT(*) 
  FROM Child 
WHERE parentID = Parent.id )
   WHERE id IN ( set_of_parents_to_update );

Any suggestions?

TIA

R.  


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: RE: Updating two fields from an aggregate query

2006-09-26 Thread Dan Buettner

I'd expect it to be quicker, especially in your situation where you
are updating potentially hundreds of records at a time.  If you have
250 records to update, today you're performing 500 queries - first a
select and then an update for each parentid.  This is one query for
all 250 records.

I haven't ever used INSERT - SELECT - ON DUPLICATE KEY UPDATE myself
for anything in production, but I have used its cousin, REPLACE, on a
couple of projects which have been in production for more than 5
years, and it's very fast.

Dan

On 9/26/06, Robert DiFalco [EMAIL PROTECTED] wrote:

Hdo you think this would perform better than simply using the
two queries? I wonder if the overhead associated with the ON DUPLICATE
KEY exception and handler might not outweigh the benefits of a single
query.

-Original Message-
From: Dan Buettner [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 26, 2006 7:15 AM
To: Robert DiFalco
Cc: mysql@lists.mysql.com
Subject: Re: Updating two fields from an aggregate query

Robert, you might give insert ... select ... on duplicate key update a
try:
http://dev.mysql.com/doc/refman/5.0/en/insert-select.html

something like this (untested):

INSERT INTO parent (id, maxChildAge, childCount) SELECT parentid,
MAX(age) as maxAge, COUNT(*) as ct FROM child WHERE parentid IN (1, 2,
3, 4) ON DUPLICATE KEY UPDATE maxChildAge=maxAge, childCount=ct

That seems like it ought to work as long as the id column in the parent
table is a unique key.

One consideration is that if you are writing this query programmatically
and using a long list of parentid values in the IN clause, the SQL could
get pretty long, perhaps too long for the default value of
MAX_ALLOWED_PACKET.  You can adjust that up pretty easily, but bear in
mind you need to adjust it for both client and server.
http://dev.mysql.com/doc/refman/5.0/en/packet-too-large.html

HTH,
Dan

On 9/26/06, Robert DiFalco [EMAIL PROTECTED] wrote:
 I have two tables that are related:

 Parent
LONG id
LONG childCount
LONG maxChildAge
...

 Child
LONG parentId
LONG age
...

 There can be thousands of parents and millions of children, that is
 why I have denormalized childCount and maxChildAge. The values are

 too expensive to calculate each time the data is viewed so I update
 these values each time a Child is added, removed, or modified.

 I currently have to update the Parent table with two queries like so:

SELECT MAX( Child.age ), COUNT(*)
FROM Child
WHERE parentID = x;

UPDATE Parent
SET maxChildAge = MAX, childCount = COUNT
WHERE id = x;

 Worse yet I might be updating the stats for several hundred Parents at

 a time, so I have to loop through the above where x is the current
 Parent.id in the batch. What I would like to do is something like the
 following (made up syntax):

UPDATE Parent
SET maxChildAge AND childCount = COUNT
WITH ( SELECT MAX( Child.age ), COUNT(*)
   FROM Child
 WHERE parentID = Parent.id )
WHERE id IN ( set_of_parents_to_update );

 Any suggestions?

 TIA

 R.


 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]





--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: Updating two fields from an aggregate query

2006-09-26 Thread Robert DiFalco
Hdo you think this would perform better than simply using the
two queries? I wonder if the overhead associated with the ON DUPLICATE
KEY exception and handler might not outweigh the benefits of a single
query. 

-Original Message-
From: Dan Buettner [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, September 26, 2006 7:15 AM
To: Robert DiFalco
Cc: mysql@lists.mysql.com
Subject: Re: Updating two fields from an aggregate query

Robert, you might give insert ... select ... on duplicate key update a
try:
http://dev.mysql.com/doc/refman/5.0/en/insert-select.html

something like this (untested):

INSERT INTO parent (id, maxChildAge, childCount) SELECT parentid,
MAX(age) as maxAge, COUNT(*) as ct FROM child WHERE parentid IN (1, 2,
3, 4) ON DUPLICATE KEY UPDATE maxChildAge=maxAge, childCount=ct

That seems like it ought to work as long as the id column in the parent
table is a unique key.

One consideration is that if you are writing this query programmatically
and using a long list of parentid values in the IN clause, the SQL could
get pretty long, perhaps too long for the default value of
MAX_ALLOWED_PACKET.  You can adjust that up pretty easily, but bear in
mind you need to adjust it for both client and server.
http://dev.mysql.com/doc/refman/5.0/en/packet-too-large.html

HTH,
Dan

On 9/26/06, Robert DiFalco [EMAIL PROTECTED] wrote:
 I have two tables that are related:

 Parent
LONG id
LONG childCount
LONG maxChildAge
...

 Child
LONG parentId
LONG age
...

 There can be thousands of parents and millions of children, that is 
 why I have denormalized childCount and maxChildAge. The values are

 too expensive to calculate each time the data is viewed so I update 
 these values each time a Child is added, removed, or modified.

 I currently have to update the Parent table with two queries like so:

SELECT MAX( Child.age ), COUNT(*)
FROM Child
WHERE parentID = x;

UPDATE Parent
SET maxChildAge = MAX, childCount = COUNT
WHERE id = x;

 Worse yet I might be updating the stats for several hundred Parents at

 a time, so I have to loop through the above where x is the current 
 Parent.id in the batch. What I would like to do is something like the 
 following (made up syntax):

UPDATE Parent
SET maxChildAge AND childCount = COUNT
WITH ( SELECT MAX( Child.age ), COUNT(*)
   FROM Child
 WHERE parentID = Parent.id )
WHERE id IN ( set_of_parents_to_update );

 Any suggestions?

 TIA

 R.


 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]





--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]