Re: slow insert into select statement

2004-05-27 Thread nyem
Thanks for all the feedback.  Here's my latest attempt:
SELECT @start:=NOW();
SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH);
SELECT @date2:=CURDATE();
LOCK TABLES  rptPricingTEST READ, rptPricingTEST a READ;
CREATE TEMPORARY TABLE tmpLatestDates
   SELECT
 commodity,
 MAX(PricingDt) as MaxDate
   FROM rptPricingTEST
   WHERE   PricingDt  @date1
 AND   PricingDt = @date2
   GROUP BY commodity;
CREATE TEMPORARY TABLE tmpLatestPrices
   SELECT
 b.commodity,
 a.PricingDt,
 a.PricingHighPrice,
 a.PricingLowPrice,
 a.PricingAvgPrice
   FROM rptPricingTEST a
   INNER JOIN tmpLatestDates b
   ON b.commodity = a.commodity
   AND b.MaxDate = a.PricingDt;
CREATE TEMPORARY TABLE tmpPrevDates
   SELECT
 a.commodity,
 MAX(a.PricingDt) as PrevDate
   FROM rptPricingTEST a, tmpLatestPrices b
   WHERE   a.PricingDt  @date1
 AND a.commodity = b.commodity
   GROUP BY commodity;
CREATE TEMPORARY TABLE tmpPrevPrices
   SELECT
 a.commodity,
 a.PricingDt,
 a.PricingHighPrice,
 a.PricingLowPrice,
 a.PricingAvgPrice
   FROM rptPricingTEST a
   INNER JOIN tmpPrevDates b
   ON b.commodity = a.commodity
   AND b.PrevDate = a.PricingDt;
SELECT
   a.commodity,
   a.PricingDtas PrevDate,
   a.PricingAvgPrice  as PrevAvg,
   b.PricingDtas LatestDate,
   b.PricingAvgPrice  as LatestAvg,
   ((b.PricingAvgPrice - a.PricingAvgPrice)
   /a.PricingAvgPrice) * 100  as priceChange
FROM tmpPrevPrices a, tmpLatestPrices b
WHERE
   a.commodity = b.commodity
   AND a.PricingAvgPrice  b.PricingAvgPrice;
UNLOCK TABLES;
DROP TABLE IF EXISTS
   tmpLatestDates,
   tmpPrevDates,
   tmpLatestPrices,
   tmpPrevPrices;
SELECT TIMEDIFF(NOW(),@start);
++
| TIMEDIFF(NOW(),@start) |
++
| 00:00:08   |
++
1 row in set (0.00 sec)
Now I get it all executed in 8 seconds.
[EMAIL PROTECTED] wrote:

.. snip
After reviewing your original post, I am not confident that you are
answering your question with this query. You said I need to generate a
report of price change for all commodity for the date interval selected by
users. which to me implies a report something like:
+-+-+---+--++-+-+-+
|commodity|starting date|ending date|starting price|ending price|max
price|min price|avg price|
+-+-+---+--++-+-+-+
I am just not sure where you are going with the query you are building. Can
you provide a template of the results you want?
Thanks,
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine 

You were right about this. I was merely posting the first part of my
query, as that's where my trouble began. Here's the result table that
I'm looking for, where priceChange is the % increase in price for the
two date interval.
+---+--+-++---+-+
| commodity | PrevDate | PrevAvg | LatestDate | LatestAvg | priceChange
+--+-++---+-+
The slowest part of the script is when generating this table (5.00 sec).
   How could I optimise it further? I don't mind creating more temp
tables as long as the total execution time is much reduced.
I'm using mysql 4.1.1a-alpha on win2k.
Thanks,
nyem



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: slow insert into select statement

2004-05-27 Thread SGreen

Something I have done in the past (with another DB system) was to put
indexes on my temp tables. I have just gone over my copy of the MySQL docs
and I don't see where that is NOT allowed so I think its fair game to try
it.

You have two options to do this:
Create the temp tables then populate them (two statements)
-or-
Use ALTER TABLE your temp tables to create the indexes.

I have no data about which is definitely faster although I have read many
times that if you index a table AFTER filling it with data, the entire
operation tends to finish quicker (especially if you have lots of data).

Since you are using multiple values in your joins (when you are collecting
the historical prices) lets try indexing tmpLatestDates and tmpPrevDates on
both columns. Do you have an index on rptPricingTest for (commodity,
PricingDt) ? That could also speed up the joins.

You could also combine the collection of one set of data (I chose to do
your previous months values)with your final results calculations. Maybe
this will work faster

SELECT @start:=NOW();
SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH);
SELECT @date2:=CURDATE();

CREATE TEMPORARY TABLE tmpLatestDates
SELECT
  commodity,
  MAX(PricingDt) as MaxDate
FROM rptPricingTEST
WHERE   PricingDt  @date1
  AND   PricingDt = @date2
GROUP BY commodity;

ALTER TABLE tmpLatestDates ADD KEY (commodity, MaxDate);

CREATE TEMPORARY TABLE tmpLatestPrices
SELECT
  b.commodity,
  a.PricingDt,
  a.PricingHighPrice,
  a.PricingLowPrice,
  a.PricingAvgPrice
FROM rptPricingTEST a
INNER JOIN tmpLatestDates b
ON b.commodity = a.commodity
AND b.MaxDate = a.PricingDt;

ALTER TABLE tmpLatestPrices ADD KEY(commodity);

CREATE TEMPORARY TABLE tmpPrevDates
SELECT
  a.commodity,
  MAX(a.PricingDt) as PrevDate
FROM rptPricingTEST a, tmpLatestPrices b
WHERE   a.PricingDt  @date1
  AND a.commodity = b.commodity
GROUP BY commodity;

ALTER TABLE tmpPrevDates ADD KEY(commodity, PrevDate);

/* I eliminated one temp table and added some indexes */

SELECT
a.commodity,
a.PricingDtas PrevDate,
a.PricingAvgPrice  as PrevAvg,
tlp.PricingDtas LatestDate,
tlp.PricingAvgPrice  as LatestAvg,
((tlp.PricingAvgPrice - a.PricingAvgPrice)
/a.PricingAvgPrice) * 100  as priceChange
FROM rptPricingTEST a
INNER JOIN tmpPrevDates tpd
  ON a.commodity = tpd.commodity
  AND a.PricingDt = tpd.PrevDate
INNER JOIN tmpLatestPrices tlp
  ON tlp.commodity = a.commodity;


DROP TABLE IF EXISTS
tmpLatestDates,
tmpPrevDates,
tmpLatestPrices

SELECT TIMEDIFF(NOW(),@start);


Now, because we are using INNER JOINS, only those commodities that actually
existed in the previous month will be returned. That actually makes some
sense as you cannot calculate a % change from nonexistence to some
value.

Let us know how this works out, OK?

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine


   
  
  nyem 
  
  [EMAIL PROTECTED]To:   [EMAIL PROTECTED]
   
  .my cc: 
  
   Fax to: 
  
  05/27/2004 05:01 Subject:  Re: slow insert into select 
statement   
  AM   
  
   
  
   
  




Thanks for all the feedback.  Here's my latest attempt:

SELECT @start:=NOW();
SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH);
SELECT @date2:=CURDATE();

LOCK TABLES  rptPricingTEST READ, rptPricingTEST a READ;

CREATE TEMPORARY TABLE tmpLatestDates
SELECT
  commodity,
  MAX(PricingDt) as MaxDate
FROM rptPricingTEST
WHERE   PricingDt  @date1
  AND   PricingDt = @date2
GROUP BY commodity;

CREATE TEMPORARY TABLE tmpLatestPrices
SELECT
  b.commodity,
  a.PricingDt,
  a.PricingHighPrice,
  a.PricingLowPrice,
  a.PricingAvgPrice
FROM rptPricingTEST a
INNER JOIN tmpLatestDates b
ON b.commodity = a.commodity
AND b.MaxDate = a.PricingDt;

CREATE TEMPORARY TABLE tmpPrevDates
SELECT

RE: slow insert into select statement

2004-05-26 Thread Victor Pendleton
If you do the math a large result set will be created. You could rewriting
your query or adding more indexes to see if this speeds up the process.

-Original Message-
From: nyem
To: [EMAIL PROTECTED]
Sent: 5/26/04 2:57 AM
Subject: slow insert into select statement

I have this insert statement that took a long time to execute:

INSERT INTO priceLast5
SELECT
   DISTINCT a.commodity,
   a.PricingDt
FROM rptPricingTEST a
WHERE
   a.PricingDt = (
 SELECT MAX(PricingDt)
 FROM rptPricingTEST b
 WHERE
   b.PricingDt  @date1
   AND b.PricingDt = @date2
   AND a.commodity = b.commodity
   );

  Query OK, 3010 rows affected (4 min 17.37 sec)

Is there any way I can modify the sql statement to speed this up?

Here's the result of explain (splitted for easy reading):

id select_typetable type  possible_keys key
-- -- - - - 
  1 PRIMARYa index NULL  idxcommodity
  2 DEPENDENT SUBQUERY b ref   idxcommodity  idxcommodity

id key_len ref   rows   Extra
-- --- - -- 
  1  25 NULL  384155 Using where; Using index
  2  22 dbsdvitmp.a.commodity 52 Using where; Using index



And here's the rest of the sql before the above insert:

SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH);
SELECT @date2:=CURDATE();

CREATE TEMPORARY TABLE priceLast5 (
commodity char(22) NOT NULL,
PricingDt date NOT NULL,
PRIMARY KEY  (commodity),
KEY idxPricingDt (PricingDt),
);

LOCK TABLES rptPricingTEST a READ;


The source table rptPricingTEST consists of commodity id (which is a 
concat of several ids), pricing date, and prices (low, hi, avg). I need 
to generate a report of price change for all commodity for the date 
interval selected by users. Since this will be a web-based report, to 
wait for 4 minutes for a response is unacceptable. The actual table has 
2 million rows which I guess can take forever to execute.

regards,
nyem


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: slow insert into select statement

2004-05-26 Thread Andrew Braithwaite
Hi,

I would start with finding out if  it's the select or the insert that's
taking a long time...

Does the priceLast5 table have heavy indexes to build?

Try running the select seperately and see how long it takes...

Andrew


-Original Message-
From: Victor Pendleton [mailto:[EMAIL PROTECTED] 
Sent: Wednesday 26 May 2004 14:13
To: 'nyem '; '[EMAIL PROTECTED] '
Subject: RE: slow insert into select statement

If you do the math a large result set will be created. You could
rewriting your query or adding more indexes to see if this speeds up the
process.

-Original Message-
From: nyem
To: [EMAIL PROTECTED]
Sent: 5/26/04 2:57 AM
Subject: slow insert into select statement

I have this insert statement that took a long time to execute:

INSERT INTO priceLast5
SELECT
   DISTINCT a.commodity,
   a.PricingDt
FROM rptPricingTEST a
WHERE
   a.PricingDt = (
 SELECT MAX(PricingDt)
 FROM rptPricingTEST b
 WHERE
   b.PricingDt  @date1
   AND b.PricingDt = @date2
   AND a.commodity = b.commodity
   );

  Query OK, 3010 rows affected (4 min 17.37 sec)

Is there any way I can modify the sql statement to speed this up?

Here's the result of explain (splitted for easy reading):

id select_typetable type  possible_keys key
-- -- - - - 
  1 PRIMARYa index NULL  idxcommodity
  2 DEPENDENT SUBQUERY b ref   idxcommodity  idxcommodity

id key_len ref   rows   Extra
-- --- - -- 
  1  25 NULL  384155 Using where; Using index
  2  22 dbsdvitmp.a.commodity 52 Using where; Using index



And here's the rest of the sql before the above insert:

SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH);
SELECT @date2:=CURDATE();

CREATE TEMPORARY TABLE priceLast5 (
commodity char(22) NOT NULL,
PricingDt date NOT NULL,
PRIMARY KEY  (commodity),
KEY idxPricingDt (PricingDt),
);

LOCK TABLES rptPricingTEST a READ;


The source table rptPricingTEST consists of commodity id (which is a
concat of several ids), pricing date, and prices (low, hi, avg). I need
to generate a report of price change for all commodity for the date
interval selected by users. Since this will be a web-based report, to
wait for 4 minutes for a response is unacceptable. The actual table has
2 million rows which I guess can take forever to execute.

regards,
nyem


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: slow insert into select statement

2004-05-26 Thread SGreen

I would also question the sub-select in the WHERE clause. I suspect that it
is being evaluated once for each row of rptPricingTEST. You could get a
significant performance boost if you move the results of that select into a
temp table and INNER JOIN to it. That way you calculate the MAX() date only
once for the entire table and not once per row (per row, per row,...)

CREATE TEMPORARY TABLE tmpLatestDates
SELECT commodity, MAX(PricingDt) as MaxDate
FROM rptPricingTEST
WHERE PricingDt  @date1
AND PricingDt = @date2
GROUP BY commodity

INSERT INTO priceLast5
SELECT DISTINCT a.commodity, a.PricingDt
FROM rptPricingTEST a
INNER JOIN tmpLatestDates b
ON b.commodity = a.commodity
  AND b.PricingDt = a.PricingDt

(By using the INNER JOIN, I question if the DISTINCT still needed, too.)

After reviewing your original post, I am not confident that you are
answering your question with this query. You said I need to generate a
report of price change for all commodity for the date interval selected by
users. which to me implies a report something like:

+-+-+---+--++-+-+-+
|commodity|starting date|ending date|starting price|ending price|max
price|min price|avg price|
+-+-+---+--++-+-+-+

I am just not sure where you are going with the query you are building. Can
you provide a template of the results you want?

Thanks,
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine





   
  
  Andrew  
  
  Braithwaite To:   Victor Pendleton [EMAIL 
PROTECTED], nyem 
  [EMAIL PROTECTED] [EMAIL PROTECTED], [EMAIL 
PROTECTED]   
  com cc: 
  
   Fax to: 
  
  05/26/2004 10:28 Subject:  RE: slow insert into select 
statement   
  AM   
  
   
  
   
  




Hi,

I would start with finding out if  it's the select or the insert that's
taking a long time...

Does the priceLast5 table have heavy indexes to build?

Try running the select seperately and see how long it takes...

Andrew


-Original Message-
From: Victor Pendleton [mailto:[EMAIL PROTECTED]
Sent: Wednesday 26 May 2004 14:13
To: 'nyem '; '[EMAIL PROTECTED] '
Subject: RE: slow insert into select statement

If you do the math a large result set will be created. You could
rewriting your query or adding more indexes to see if this speeds up the
process.

-Original Message-
From: nyem
To: [EMAIL PROTECTED]
Sent: 5/26/04 2:57 AM
Subject: slow insert into select statement

I have this insert statement that took a long time to execute:

 INSERT INTO priceLast5
 SELECT
DISTINCT a.commodity,
a.PricingDt
 FROM rptPricingTEST a
 WHERE
a.PricingDt = (SELECT MAX(PricingDt)
  FROM rptPricingTEST b
  WHERE
b.PricingDt  @date1
AND b.PricingDt = @date2
AND a.commodity = b.commodity
);

  Query OK, 3010 rows affected (4 min 17.37 sec)

Is there any way I can modify the sql statement to speed this up?

Here's the result of explain (splitted for easy reading):

id select_typetable type  possible_keys key
-- -- - - - 
  1 PRIMARYa index NULL  idxcommodity
  2 DEPENDENT SUBQUERY b ref   idxcommodity  idxcommodity

id key_len ref   rows   Extra
-- --- - -- 
  1  25 NULL  384155 Using where; Using index
  2  22 dbsdvitmp.a.commodity 52 Using where; Using index



And here's the rest of the sql before the above insert:

 SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH);
 SELECT @date2:=CURDATE();

 CREATE TEMPORARY TABLE priceLast5 (
 commodity char(22) NOT NULL