Something I have done in the past (with another DB system) was to put
indexes on my temp tables. I have just gone over my copy of the MySQL docs
and I don't see where that is NOT allowed so I think its fair game to try
it.

You have two options to do this:
Create the temp tables then populate them (two statements)
-or-
Use ALTER TABLE your temp tables to create the indexes.

I have no data about which is definitely faster although I have read many
times that if you index a table AFTER filling it with data, the entire
operation tends to finish quicker (especially if you have lots of data).

Since you are using multiple values in your joins (when you are collecting
the historical prices) lets try indexing tmpLatestDates and tmpPrevDates on
both columns. Do you have an index on rptPricingTest for (commodity,
PricingDt) ? That could also speed up the joins.

You could also combine the collection of one set of data (I chose to do
your previous months values)with your final results calculations. Maybe
this will work faster....

SELECT @start:=NOW();
SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH);
SELECT @date2:=CURDATE();

CREATE TEMPORARY TABLE tmpLatestDates
    SELECT
      commodity,
      MAX(PricingDt) as MaxDate
    FROM rptPricingTEST
    WHERE   PricingDt > @date1
      AND   PricingDt <= @date2
    GROUP BY commodity;

ALTER TABLE tmpLatestDates ADD KEY (commodity, MaxDate);

CREATE TEMPORARY TABLE tmpLatestPrices
    SELECT
      b.commodity,
      a.PricingDt,
      a.PricingHighPrice,
      a.PricingLowPrice,
      a.PricingAvgPrice
    FROM rptPricingTEST a
    INNER JOIN tmpLatestDates b
    ON b.commodity = a.commodity
        AND b.MaxDate = a.PricingDt;

ALTER TABLE tmpLatestPrices ADD KEY(commodity);

CREATE TEMPORARY TABLE tmpPrevDates
    SELECT
      a.commodity,
      MAX(a.PricingDt) as PrevDate
    FROM rptPricingTEST a, tmpLatestPrices b
    WHERE   a.PricingDt < @date1
      AND a.commodity = b.commodity
    GROUP BY commodity;

ALTER TABLE tmpPrevDates ADD KEY(commodity, PrevDate);

/* I eliminated one temp table and added some indexes */

SELECT
    a.commodity,
    a.PricingDt                as PrevDate,
    a.PricingAvgPrice          as PrevAvg,
    tlp.PricingDt                as LatestDate,
    tlp.PricingAvgPrice          as LatestAvg,
    ((tlp.PricingAvgPrice - a.PricingAvgPrice)
    /a.PricingAvgPrice) * 100  as priceChange
FROM rptPricingTEST a
INNER JOIN tmpPrevDates tpd
      ON a.commodity = tpd.commodity
      AND a.PricingDt = tpd.PrevDate
INNER JOIN tmpLatestPrices tlp
      ON tlp.commodity = a.commodity;


DROP TABLE IF EXISTS
    tmpLatestDates,
    tmpPrevDates,
    tmpLatestPrices

SELECT TIMEDIFF(NOW(),@start);


Now, because we are using INNER JOINS, only those commodities that actually
existed in the previous month will be returned. That actually makes some
sense as you cannot calculate a % change from "nonexistence" to "some
value".

Let us know how this works out, OK?

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine


                                                                                       
                                          
                      nyem                                                             
                                          
                      <[EMAIL PROTECTED]        To:       [EMAIL PROTECTED]            
                                       
                      .my>                     cc:                                     
                                          
                                               Fax to:                                 
                                          
                      05/27/2004 05:01         Subject:  Re: slow insert into select 
statement                                   
                      AM                                                               
                                          
                                                                                       
                                          
                                                                                       
                                          




Thanks for all the feedback.  Here's my latest attempt:

SELECT @start:=NOW();
SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH);
SELECT @date2:=CURDATE();

LOCK TABLES  rptPricingTEST READ, rptPricingTEST a READ;

CREATE TEMPORARY TABLE tmpLatestDates
    SELECT
      commodity,
      MAX(PricingDt) as MaxDate
    FROM rptPricingTEST
    WHERE   PricingDt > @date1
      AND   PricingDt <= @date2
    GROUP BY commodity;

CREATE TEMPORARY TABLE tmpLatestPrices
    SELECT
      b.commodity,
      a.PricingDt,
      a.PricingHighPrice,
      a.PricingLowPrice,
      a.PricingAvgPrice
    FROM rptPricingTEST a
    INNER JOIN tmpLatestDates b
    ON b.commodity = a.commodity
        AND b.MaxDate = a.PricingDt;

CREATE TEMPORARY TABLE tmpPrevDates
    SELECT
      a.commodity,
      MAX(a.PricingDt) as PrevDate
    FROM rptPricingTEST a, tmpLatestPrices b
    WHERE   a.PricingDt < @date1
      AND a.commodity = b.commodity
    GROUP BY commodity;

CREATE TEMPORARY TABLE tmpPrevPrices
    SELECT
      a.commodity,
      a.PricingDt,
      a.PricingHighPrice,
      a.PricingLowPrice,
      a.PricingAvgPrice
    FROM rptPricingTEST a
    INNER JOIN tmpPrevDates b
    ON b.commodity = a.commodity
        AND b.PrevDate = a.PricingDt;

SELECT
    a.commodity,
    a.PricingDt                as PrevDate,
    a.PricingAvgPrice          as PrevAvg,
    b.PricingDt                as LatestDate,
    b.PricingAvgPrice          as LatestAvg,
    ((b.PricingAvgPrice - a.PricingAvgPrice)
    /a.PricingAvgPrice) * 100  as priceChange
FROM tmpPrevPrices a, tmpLatestPrices b
WHERE
    a.commodity = b.commodity
    AND a.PricingAvgPrice < b.PricingAvgPrice;

UNLOCK TABLES;

DROP TABLE IF EXISTS
    tmpLatestDates,
    tmpPrevDates,
    tmpLatestPrices,
    tmpPrevPrices;

SELECT TIMEDIFF(NOW(),@start);
+------------------------+
| TIMEDIFF(NOW(),@start) |
+------------------------+
| 00:00:08               |
+------------------------+
1 row in set (0.00 sec)


Now I get it all executed in 8 seconds.


[EMAIL PROTECTED] wrote:
>
.. snip
>
> After reviewing your original post, I am not confident that you are
> answering your question with this query. You said "I need to generate a
> report of price change for all commodity for the date interval selected
by
> users." which to me implies a report something like:
>
>
+---------+-------------+-----------+--------------+------------+---------+---------+---------+

> |commodity|starting date|ending date|starting price|ending price|max
> price|min price|avg price|
>
+---------+-------------+-----------+--------------+------------+---------+---------+---------+

>
> I am just not sure where you are going with the query you are building.
Can
> you provide a template of the results you want?
>
> Thanks,
> Shawn Green
> Database Administrator
> Unimin Corporation - Spruce Pine



You were right about this. I was merely posting the first part of my
query, as that's where my trouble began. Here's the result table that
I'm looking for, where priceChange is the % increase in price for the
two date interval.
+-----------+----------+---------+------------+-----------+-------------+
| commodity | PrevDate | PrevAvg | LatestDate | LatestAvg | priceChange
+----------------------+---------+------------+-----------+-------------+

The slowest part of the script is when generating this table (5.00 sec).
    How could I optimise it further? I don't mind creating more temp
tables as long as the total execution time is much reduced.

I'm using mysql 4.1.1a-alpha on win2k.

Thanks,
nyem








--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]







-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to