Re: slow insert into select statement
Thanks for all the feedback. Here's my latest attempt: SELECT @start:=NOW(); SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH); SELECT @date2:=CURDATE(); LOCK TABLES rptPricingTEST READ, rptPricingTEST a READ; CREATE TEMPORARY TABLE tmpLatestDates SELECT commodity, MAX(PricingDt) as MaxDate FROM rptPricingTEST WHERE PricingDt @date1 AND PricingDt = @date2 GROUP BY commodity; CREATE TEMPORARY TABLE tmpLatestPrices SELECT b.commodity, a.PricingDt, a.PricingHighPrice, a.PricingLowPrice, a.PricingAvgPrice FROM rptPricingTEST a INNER JOIN tmpLatestDates b ON b.commodity = a.commodity AND b.MaxDate = a.PricingDt; CREATE TEMPORARY TABLE tmpPrevDates SELECT a.commodity, MAX(a.PricingDt) as PrevDate FROM rptPricingTEST a, tmpLatestPrices b WHERE a.PricingDt @date1 AND a.commodity = b.commodity GROUP BY commodity; CREATE TEMPORARY TABLE tmpPrevPrices SELECT a.commodity, a.PricingDt, a.PricingHighPrice, a.PricingLowPrice, a.PricingAvgPrice FROM rptPricingTEST a INNER JOIN tmpPrevDates b ON b.commodity = a.commodity AND b.PrevDate = a.PricingDt; SELECT a.commodity, a.PricingDtas PrevDate, a.PricingAvgPrice as PrevAvg, b.PricingDtas LatestDate, b.PricingAvgPrice as LatestAvg, ((b.PricingAvgPrice - a.PricingAvgPrice) /a.PricingAvgPrice) * 100 as priceChange FROM tmpPrevPrices a, tmpLatestPrices b WHERE a.commodity = b.commodity AND a.PricingAvgPrice b.PricingAvgPrice; UNLOCK TABLES; DROP TABLE IF EXISTS tmpLatestDates, tmpPrevDates, tmpLatestPrices, tmpPrevPrices; SELECT TIMEDIFF(NOW(),@start); ++ | TIMEDIFF(NOW(),@start) | ++ | 00:00:08 | ++ 1 row in set (0.00 sec) Now I get it all executed in 8 seconds. [EMAIL PROTECTED] wrote: .. snip After reviewing your original post, I am not confident that you are answering your question with this query. You said I need to generate a report of price change for all commodity for the date interval selected by users. which to me implies a report something like: +-+-+---+--++-+-+-+ |commodity|starting date|ending date|starting price|ending price|max price|min price|avg price| +-+-+---+--++-+-+-+ I am just not sure where you are going with the query you are building. Can you provide a template of the results you want? Thanks, Shawn Green Database Administrator Unimin Corporation - Spruce Pine You were right about this. I was merely posting the first part of my query, as that's where my trouble began. Here's the result table that I'm looking for, where priceChange is the % increase in price for the two date interval. +---+--+-++---+-+ | commodity | PrevDate | PrevAvg | LatestDate | LatestAvg | priceChange +--+-++---+-+ The slowest part of the script is when generating this table (5.00 sec). How could I optimise it further? I don't mind creating more temp tables as long as the total execution time is much reduced. I'm using mysql 4.1.1a-alpha on win2k. Thanks, nyem -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: slow insert into select statement
Something I have done in the past (with another DB system) was to put indexes on my temp tables. I have just gone over my copy of the MySQL docs and I don't see where that is NOT allowed so I think its fair game to try it. You have two options to do this: Create the temp tables then populate them (two statements) -or- Use ALTER TABLE your temp tables to create the indexes. I have no data about which is definitely faster although I have read many times that if you index a table AFTER filling it with data, the entire operation tends to finish quicker (especially if you have lots of data). Since you are using multiple values in your joins (when you are collecting the historical prices) lets try indexing tmpLatestDates and tmpPrevDates on both columns. Do you have an index on rptPricingTest for (commodity, PricingDt) ? That could also speed up the joins. You could also combine the collection of one set of data (I chose to do your previous months values)with your final results calculations. Maybe this will work faster SELECT @start:=NOW(); SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH); SELECT @date2:=CURDATE(); CREATE TEMPORARY TABLE tmpLatestDates SELECT commodity, MAX(PricingDt) as MaxDate FROM rptPricingTEST WHERE PricingDt @date1 AND PricingDt = @date2 GROUP BY commodity; ALTER TABLE tmpLatestDates ADD KEY (commodity, MaxDate); CREATE TEMPORARY TABLE tmpLatestPrices SELECT b.commodity, a.PricingDt, a.PricingHighPrice, a.PricingLowPrice, a.PricingAvgPrice FROM rptPricingTEST a INNER JOIN tmpLatestDates b ON b.commodity = a.commodity AND b.MaxDate = a.PricingDt; ALTER TABLE tmpLatestPrices ADD KEY(commodity); CREATE TEMPORARY TABLE tmpPrevDates SELECT a.commodity, MAX(a.PricingDt) as PrevDate FROM rptPricingTEST a, tmpLatestPrices b WHERE a.PricingDt @date1 AND a.commodity = b.commodity GROUP BY commodity; ALTER TABLE tmpPrevDates ADD KEY(commodity, PrevDate); /* I eliminated one temp table and added some indexes */ SELECT a.commodity, a.PricingDtas PrevDate, a.PricingAvgPrice as PrevAvg, tlp.PricingDtas LatestDate, tlp.PricingAvgPrice as LatestAvg, ((tlp.PricingAvgPrice - a.PricingAvgPrice) /a.PricingAvgPrice) * 100 as priceChange FROM rptPricingTEST a INNER JOIN tmpPrevDates tpd ON a.commodity = tpd.commodity AND a.PricingDt = tpd.PrevDate INNER JOIN tmpLatestPrices tlp ON tlp.commodity = a.commodity; DROP TABLE IF EXISTS tmpLatestDates, tmpPrevDates, tmpLatestPrices SELECT TIMEDIFF(NOW(),@start); Now, because we are using INNER JOINS, only those commodities that actually existed in the previous month will be returned. That actually makes some sense as you cannot calculate a % change from nonexistence to some value. Let us know how this works out, OK? Shawn Green Database Administrator Unimin Corporation - Spruce Pine nyem [EMAIL PROTECTED]To: [EMAIL PROTECTED] .my cc: Fax to: 05/27/2004 05:01 Subject: Re: slow insert into select statement AM Thanks for all the feedback. Here's my latest attempt: SELECT @start:=NOW(); SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH); SELECT @date2:=CURDATE(); LOCK TABLES rptPricingTEST READ, rptPricingTEST a READ; CREATE TEMPORARY TABLE tmpLatestDates SELECT commodity, MAX(PricingDt) as MaxDate FROM rptPricingTEST WHERE PricingDt @date1 AND PricingDt = @date2 GROUP BY commodity; CREATE TEMPORARY TABLE tmpLatestPrices SELECT b.commodity, a.PricingDt, a.PricingHighPrice, a.PricingLowPrice, a.PricingAvgPrice FROM rptPricingTEST a INNER JOIN tmpLatestDates b ON b.commodity = a.commodity AND b.MaxDate = a.PricingDt; CREATE TEMPORARY TABLE tmpPrevDates SELECT
RE: slow insert into select statement
If you do the math a large result set will be created. You could rewriting your query or adding more indexes to see if this speeds up the process. -Original Message- From: nyem To: [EMAIL PROTECTED] Sent: 5/26/04 2:57 AM Subject: slow insert into select statement I have this insert statement that took a long time to execute: INSERT INTO priceLast5 SELECT DISTINCT a.commodity, a.PricingDt FROM rptPricingTEST a WHERE a.PricingDt = ( SELECT MAX(PricingDt) FROM rptPricingTEST b WHERE b.PricingDt @date1 AND b.PricingDt = @date2 AND a.commodity = b.commodity ); Query OK, 3010 rows affected (4 min 17.37 sec) Is there any way I can modify the sql statement to speed this up? Here's the result of explain (splitted for easy reading): id select_typetable type possible_keys key -- -- - - - 1 PRIMARYa index NULL idxcommodity 2 DEPENDENT SUBQUERY b ref idxcommodity idxcommodity id key_len ref rows Extra -- --- - -- 1 25 NULL 384155 Using where; Using index 2 22 dbsdvitmp.a.commodity 52 Using where; Using index And here's the rest of the sql before the above insert: SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH); SELECT @date2:=CURDATE(); CREATE TEMPORARY TABLE priceLast5 ( commodity char(22) NOT NULL, PricingDt date NOT NULL, PRIMARY KEY (commodity), KEY idxPricingDt (PricingDt), ); LOCK TABLES rptPricingTEST a READ; The source table rptPricingTEST consists of commodity id (which is a concat of several ids), pricing date, and prices (low, hi, avg). I need to generate a report of price change for all commodity for the date interval selected by users. Since this will be a web-based report, to wait for 4 minutes for a response is unacceptable. The actual table has 2 million rows which I guess can take forever to execute. regards, nyem -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: slow insert into select statement
Hi, I would start with finding out if it's the select or the insert that's taking a long time... Does the priceLast5 table have heavy indexes to build? Try running the select seperately and see how long it takes... Andrew -Original Message- From: Victor Pendleton [mailto:[EMAIL PROTECTED] Sent: Wednesday 26 May 2004 14:13 To: 'nyem '; '[EMAIL PROTECTED] ' Subject: RE: slow insert into select statement If you do the math a large result set will be created. You could rewriting your query or adding more indexes to see if this speeds up the process. -Original Message- From: nyem To: [EMAIL PROTECTED] Sent: 5/26/04 2:57 AM Subject: slow insert into select statement I have this insert statement that took a long time to execute: INSERT INTO priceLast5 SELECT DISTINCT a.commodity, a.PricingDt FROM rptPricingTEST a WHERE a.PricingDt = ( SELECT MAX(PricingDt) FROM rptPricingTEST b WHERE b.PricingDt @date1 AND b.PricingDt = @date2 AND a.commodity = b.commodity ); Query OK, 3010 rows affected (4 min 17.37 sec) Is there any way I can modify the sql statement to speed this up? Here's the result of explain (splitted for easy reading): id select_typetable type possible_keys key -- -- - - - 1 PRIMARYa index NULL idxcommodity 2 DEPENDENT SUBQUERY b ref idxcommodity idxcommodity id key_len ref rows Extra -- --- - -- 1 25 NULL 384155 Using where; Using index 2 22 dbsdvitmp.a.commodity 52 Using where; Using index And here's the rest of the sql before the above insert: SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH); SELECT @date2:=CURDATE(); CREATE TEMPORARY TABLE priceLast5 ( commodity char(22) NOT NULL, PricingDt date NOT NULL, PRIMARY KEY (commodity), KEY idxPricingDt (PricingDt), ); LOCK TABLES rptPricingTEST a READ; The source table rptPricingTEST consists of commodity id (which is a concat of several ids), pricing date, and prices (low, hi, avg). I need to generate a report of price change for all commodity for the date interval selected by users. Since this will be a web-based report, to wait for 4 minutes for a response is unacceptable. The actual table has 2 million rows which I guess can take forever to execute. regards, nyem -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: slow insert into select statement
I would also question the sub-select in the WHERE clause. I suspect that it is being evaluated once for each row of rptPricingTEST. You could get a significant performance boost if you move the results of that select into a temp table and INNER JOIN to it. That way you calculate the MAX() date only once for the entire table and not once per row (per row, per row,...) CREATE TEMPORARY TABLE tmpLatestDates SELECT commodity, MAX(PricingDt) as MaxDate FROM rptPricingTEST WHERE PricingDt @date1 AND PricingDt = @date2 GROUP BY commodity INSERT INTO priceLast5 SELECT DISTINCT a.commodity, a.PricingDt FROM rptPricingTEST a INNER JOIN tmpLatestDates b ON b.commodity = a.commodity AND b.PricingDt = a.PricingDt (By using the INNER JOIN, I question if the DISTINCT still needed, too.) After reviewing your original post, I am not confident that you are answering your question with this query. You said I need to generate a report of price change for all commodity for the date interval selected by users. which to me implies a report something like: +-+-+---+--++-+-+-+ |commodity|starting date|ending date|starting price|ending price|max price|min price|avg price| +-+-+---+--++-+-+-+ I am just not sure where you are going with the query you are building. Can you provide a template of the results you want? Thanks, Shawn Green Database Administrator Unimin Corporation - Spruce Pine Andrew Braithwaite To: Victor Pendleton [EMAIL PROTECTED], nyem [EMAIL PROTECTED] [EMAIL PROTECTED], [EMAIL PROTECTED] com cc: Fax to: 05/26/2004 10:28 Subject: RE: slow insert into select statement AM Hi, I would start with finding out if it's the select or the insert that's taking a long time... Does the priceLast5 table have heavy indexes to build? Try running the select seperately and see how long it takes... Andrew -Original Message- From: Victor Pendleton [mailto:[EMAIL PROTECTED] Sent: Wednesday 26 May 2004 14:13 To: 'nyem '; '[EMAIL PROTECTED] ' Subject: RE: slow insert into select statement If you do the math a large result set will be created. You could rewriting your query or adding more indexes to see if this speeds up the process. -Original Message- From: nyem To: [EMAIL PROTECTED] Sent: 5/26/04 2:57 AM Subject: slow insert into select statement I have this insert statement that took a long time to execute: INSERT INTO priceLast5 SELECT DISTINCT a.commodity, a.PricingDt FROM rptPricingTEST a WHERE a.PricingDt = ( SELECT MAX(PricingDt) FROM rptPricingTEST b WHERE b.PricingDt @date1 AND b.PricingDt = @date2 AND a.commodity = b.commodity ); Query OK, 3010 rows affected (4 min 17.37 sec) Is there any way I can modify the sql statement to speed this up? Here's the result of explain (splitted for easy reading): id select_typetable type possible_keys key -- -- - - - 1 PRIMARYa index NULL idxcommodity 2 DEPENDENT SUBQUERY b ref idxcommodity idxcommodity id key_len ref rows Extra -- --- - -- 1 25 NULL 384155 Using where; Using index 2 22 dbsdvitmp.a.commodity 52 Using where; Using index And here's the rest of the sql before the above insert: SELECT @date1:=DATE_SUB( CURDATE(), INTERVAL 1 MONTH); SELECT @date2:=CURDATE(); CREATE TEMPORARY TABLE priceLast5 ( commodity char(22) NOT NULL