> Hello,
>
> I need to be able to sum over distinct values but I can't seem to do it
> unless I use sub-selects (which I want to avoid doing).
>
> To see what I mean, I've constructed a toy DB:
>
> DROP TABLE IF EXISTS spell;
> CREATE TABLE spell (
> id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
> power FLOAT,
> time_casted DATETIME
> );
>
> DROP TABLE IF EXISTS wizard;
> CREATE TABLE wizard (
> id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
> age INT UNSIGNED,
> name VARCHAR(255)
> );
>
> DROP TABLE IF EXISTS spellcast;
> CREATE TABLE spellcast (
> id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
> type ENUM ('fire', 'air', 'water', 'earth'),
> spell_id INT UNSIGNED,
> wizard_id INT UNSIGNED
> );
>
> INSERT INTO wizard (name, age) VALUES
> ('alan', 20),
> ('bill', 23),
> ('chris', 21);
>
> INSERT INTO spell (power, time_casted) VALUES
> (400, '2006-02-02 12:00'),
> (432, '2006-02-04 12:00'),
> (123, '2006-02-03 12:00'),
> (543, '2006-02-08 12:00'),
> (320, '2006-02-01 12:00'),
> (102, '2006-02-12 12:00'),
> (732, '2006-02-14 12:00'),
> (948, '2006-02-18 12:00'),
> (932, '2006-02-21 12:00'),
> (842, '2006-02-26 12:00');
>
> INSERT INTO spellcast (type, spell_id, wizard_id) VALUES
> ('fire', 1, 1),
> ('air', 1, 1),
> ('water', 1, 1),
> ('earth', 2, 1),
> ('water', 2, 1),
> ('fire', 3, 1),
> ('water', 3, 1),
> ('water', 4, 1),
> ('fire', 4, 1),
> ('air', 5, 1),
> ('fire', 6, 1),
> ('water', 7, 1),
> ('water', 1, 2),
> ('fire', 1, 2),
> ('air', 2, 2),
> ('earth', 3, 2),
> ('water', 3, 2),
> ('earth', 4, 2),
> ('fire', 4, 2),
> ('air', 4, 2),
> ('water', 1, 3),
> ('earth', 1, 3),
> ('air', 1, 3),
> ('water', 5, 3),
> ('fire', 5, 3),
> ('earth', 5, 3),
> ('water', 6, 3),
> ('air', 7, 3);
>
> A spell is an individual spell that's been cast. A spellcast is the
> action of casting the spell by a particular wizard (or a group of
> wizards). When casting a spell, a wizard can contribute various
essenses
> (fire, earth, air, water). So for example, Alan cast a spell (id=1) and
> contributed three essences (fire, air & water) - this means that there
> are 3 spellcast rows for this contribution to this spell.
>
> Let's say I want to find the total power of all the spells cast by each
> wizard that involve fire & air. At first I thought the following
might work:
>
> SELECT wizard.name,
> SUM(spell.power)
> FROM spell,
> spellcast,
> wizard
> WHERE wizard.id = spellcast.wizard_id AND
> spellcast.spell_id = spell.id AND
> spellcast.type IN ('fire', 'air')
> GROUP BY wizard.id;
>
> But this is wrong. The above query will count some spells more than
> once, so the resulting sum is greater than it should be.
>
> The only way I can think of doing this correctly is to use sub-queries:
>
> SELECT DISTINCT wizard.name,
> sub.s
> FROM
> (SELECT SUM(inner_sub.power) AS s,
> inner_sub.wiz_id
> FROM
> (SELECT DISTINCT
> spell.id,
> spell.power,
> wizard.id AS wiz_id
> FROM spell,
> spellcast,
> wizard
> WHERE wizard.id = spellcast.wizard_id AND
> spellcast.spell_id = spell.id AND
> spellcast.type IN ('fire', 'air')
> ) AS inner_sub
> GROUP BY inner_sub.wiz_id
> ) AS sub,
> wizard,
> spellcast
> WHERE wizard.id = sub.wiz_id AND
> spellcast.wizard_id = wizard.id AND
> spellcast.type IN ('fire', 'air');
>
> This works but I was wondering whether there was a simpler way to do
it.
> All my queries are generated dynamically, and I want to avoid
generating
> complex subqueries.
>
> Anyone know of a way to do the above a lot more simply? I can change
the
> DB schema if needs be.
>
> Thanks,
> Yasir
>
>
Based on database theory, your schema is correct (so long as each
spell only consumes up to 1 unit of essence). As you have discovered,
you are trying to take a second-level summary within a single
statement. While the COUNT() aggregate function has a DISTINCT
modifier none of the others do. That is why you posted.
One technique you could try is to create an intermediate pivot table.
That way you can know how much of each essence has been used in each
spell.
CREATE TEMPORARY TABLE tmpSpellSummary (PRIMARY KEY
(wizard_id,spell_id)) SELECT
wizard_id,
spell_id,
SUM(if(type='AIR',1,0)) air,
SUM(if(type='EARTH',1,0)) earth,
SUM(if(type='FIRE',1,0)) fire,
SUM(if(type='WATER',1,0)) water
FROM spellcast
GROUP BY wizard_id, spell_id
Now you can join this pivot table to your other query and you won't
have the duplication. You can also modify this by adding (after the
GROUP BY clause):
HAVING air>0 and water>0
to pick out just those spell_id's that used both air and water.