Re: Help improving query performance

Larry Martell Wed, 04 Feb 2015 12:23:12 -0800

On Wed, Feb 4, 2015 at 2:56 PM, shawn l.green <shawn.l.gr...@oracle.com> wrote:
> Hi Larry,
>
>
> On 2/1/2015 4:49 PM, Larry Martell wrote:
>>
>> I have 2 queries. One takes 4 hours to run and returns 21 rows, and
>> the other, which has 1 additional where clause, takes 3 minutes and
>> returns 20 rows. The main table being selected from is largish
>> (37,247,884 rows with 282 columns). Caching is off for my testing, so
>> it's not related to that. To short circuit anyone asking, these
>> queries are generated by python code, which is why there's an IN
>> clause with 1 value, as oppose to an =.
>>
>> Here are the queries and their explains. The significant difference is
>> that the faster query has "Using
>> intersect(data_cst_bbccbce0,data_cst_fba12377)" in the query plan -
>> those 2 indexes are on the 2 columns in the where clause, so that's
>> why the second one is faster. But I am wondering what I can do to make
>> the first one faster.
>>
>>
>> 4 hour query:
>>
>> SELECT MIN(data_tool.name) as tool,
>>         MIN(data_cst.date_time) "start",
>>         MAX(data_cst.date_time) "end",
>>         MIN(data_target.name) as target,
>>         MIN(data_lot.name) as lot,
>>         MIN(data_wafer.name) as wafer,
>>         MIN(measname) as measname,
>>         MIN(data_recipe.name) as recipe
>> FROM data_cst
>> INNER JOIN data_tool ON data_tool.id = data_cst.tool_id
>> INNER JOIN data_target ON data_target.id = data_cst.target_name_id
>> INNER JOIN data_lot ON data_lot.id = data_cst.lot_id
>> INNER JOIN data_wafer ON data_wafer.id = data_cst.wafer_id
>> INNER JOIN data_measparams ON data_measparams.id =
>> data_cst.meas_params_name_id
>> INNER JOIN data_recipe ON data_recipe.id = data_cst.recipe_id
>> WHERE data_target.id IN (172) AND
>>        data_cst.date_time BETWEEN '2015-01-26 00:00:00' AND '2015-01-26
>> 23:59:59'
>> GROUP BY wafer_id, data_cst.lot_id, target_name_id
>>
>
> ... snipped ...
>
>>
>>
>> Faster query:
>>
>> SELECT MIN(data_tool.name) as tool,
>>         MIN(data_cst.date_time) "start",
>>         MAX(data_cst.date_time) "end",
>>         MIN(data_target.name) as target,
>>         MIN(data_lot.name) as lot,
>>         MIN(data_wafer.name) as wafer,
>>         MIN(measname) as measname,
>>         MIN(data_recipe.name) as recipe
>> FROM data_cst
>> INNER JOIN data_tool ON data_tool.id = data_cst.tool_id
>> INNER JOIN data_target ON data_target.id = data_cst.target_name_id
>> INNER JOIN data_lot ON data_lot.id = data_cst.lot_id
>> INNER JOIN data_wafer ON data_wafer.id = data_cst.wafer_id
>> INNER JOIN data_measparams ON data_measparams.id =
>> data_cst.meas_params_name_id
>> INNER JOIN data_recipe ON data_recipe.id = data_cst.recipe_id
>> WHERE data_target.id IN (172) AND
>>        data_recipe.id IN (148) AND
>>        data_cst.date_time BETWEEN '2015-01-26 00:00:00' AND '2015-01-26
>> 23:59:59'
>> GROUP BY wafer_id, data_cst.lot_id, target_name_id
>>
> ... snip ...
>>
>>
>> Thanks for taking the time to read this, and for any help or pointers
>> you can give me.
>>
>
> The biggest difference is the added selectivity generated by the WHERE term
> against the data_recipe table.
>
> Compare the two EXPLAINS, in the faster query you see that data_recipe is
> listed second. This allows the additional term a chance to reduce the number
> of row combinations for the entire query.
>
> To really get at the logic behind how the Optimizer chooses its execution
> plan, get an optimizer trace. Look at the "cost" estimates for each phase
> being considered.
> http://dev.mysql.com/doc/refman/5.6/en/optimizer-trace-table.html
> http://dev.mysql.com/doc/internals/en/optimizer-tracing.html


Thanks very much Shawn for the reply and the links. I will check those
out and I'm sure I will find them very useful.

Meanwhile I changed the query to select from data_cst using the where
clause into a temp table and then I join the temp table with the other
tables. That has improved the slow query from 4 hours to 10 seconds
(!)

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql

Re: Help improving query performance

Reply via email to