Thanks for your help, Rick!

Interspersed are some questions and rationales for you to shoot down... :-)

> From: Rick James <rja...@yahoo-inc.com>
> 
> s_product_sales_log has no PRIMARY KEY.  All InnoDB tables 'should' have an 
> explicit PK.

This table really has no identifying information. There could be two identical, 
valid rows, if the same person sold the same amount of the same product to the 
same other person on the same day.

All the foreign keys were indexed. Is there something I don't understand about 
something a PK field does? If an individual record cannot be uniquely 
identified by its information, is there really any need for a primary key?

None the less, I added field "ID" as an unsigned autoincrement INT and made it 
PK.

> INT(5) is not what you think.  INT is always a 32-bit, 4-byte quantity, 
> regardless of the number.

> 
> Use TINYINT UNSIGNED, SMALLINT UNSIGNED, etc. wherever reasonable.

Understood. I make all my keys UINT even when they could be smaller, because 
I've gotten into some gnarly consistency problems. with foreign keys.

>> KEY `is_value_added` (`is_value_added`),
> A single-column INDEX on a flag, ENUM, and other low-cardinality field, is 
> almost always useless.

Why is that? Surely, even a flag separates the record space into two?

> Performance issues...
>> WHERE  YEAR(sales.`Date`) = '{{{1}}}'
> won't use
>> KEY `Date` (`Date`),
> because the column (Date) is hidden in a function.  A workaround:
>   WHERE `Date` >= '{{{1}}}-01-01'
>     AND `Date` <  '{{{1}}}-01-01' + INTERVAL 1 YEAR

Thanks! Good catch.

> JOINing two subqueries -- There is no way to index either of them, so the 
> JOIN will have to do a table scan of one temp table for every row of the 
> other temp table.
> (The alternative is to CREATE TEMPORARY TABLE... with an index, for one of 
> the subqueries.)

But I made sure the subqueries were the smallest possible sets -- essentially, 
the domain of s_profit_centre, which only has 12 records.

I had the entire thing coded up into one massive JOIN of everything, and it 
took 30 minutes to run! By LEFT JOINing down to a dozen or fewer records, it 
seems to run in reasonable time, even though it's two subqueries that are not 
indexed.

> It would probably be better to move the mt.tot!=0 test inside:
> 
>                   GROUP BY  `Profit Centre`
>           ) mt ON mt.pcid = tt.pcid
>           WHERE  mt.tot != 0
>   ) xx
> -->
>                   GROUP BY  `Profit Centre`
>                   HAVING tot != 0                   -- added
>           ) mt ON mt.pcid = tt.pcid
>           -- removed:   WHERE  mt.tot != 0
>   ) xx
> 
> That would make mt have fewer rows, hence that unindexed JOIN could run 
> faster.

My first attempt to do that produced an error. And again, both the subqueries 
will have 12 or fewer records, so I'm wondering if this really helps anything.

Thanks for your help!

[clip]

----------------
:::: People see what they have been conditioned to see; they refuse to see what 
they don't expect to see. -- Merle P. Martin
:::: Jan Steinman, EcoReality Co-op ::::





--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql

Reply via email to