A question on grouping I've never been able to solve...

create table j (proj char(3), id int, score double,cpid char(32),team
char(10));

insert into j values('aaa',1,100,'aaaaaaaaaaaaaaaaa','team1');
insert into j values('bbb',2,200,'aaaaaaaaaaaaaaaaa','team1');
insert into j values('ccc',3,300,'aaaaaaaaaaaaaaaaa','team2');
insert into j values('aaa',4,100,'bbbbbbbbbbbbbbbbb','team2');
insert into j values('bbb',5,300,'bbbbbbbbbbbbbbbbb','team1');
insert into j values('ccc',6,400,'bbbbbbbbbbbbbbbbb','team1');
insert into j values('aaa',7,101,'ccccccccccccccccc','team1');
insert into j values('bbb',8,302,'ccccccccccccccccc','team2');
insert into j values('ccc',9,503,'ccccccccccccccccc','team2');

mysql> select * from j;
+------+------+-------+-------------------+-------+
| proj | id   | score | cpid              | team  |
+------+------+-------+-------------------+-------+
| aaa  |    1 |   100 | aaaaaaaaaaaaaaaaa | team1 |
| bbb  |    2 |   200 | aaaaaaaaaaaaaaaaa | team1 |
| ccc  |    3 |   300 | aaaaaaaaaaaaaaaaa | team2 |
| aaa  |    4 |   100 | bbbbbbbbbbbbbbbbb | team2 |
| bbb  |    5 |   300 | bbbbbbbbbbbbbbbbb | team1 |
| ccc  |    6 |   400 | bbbbbbbbbbbbbbbbb | team1 |
| aaa  |    7 |   101 | ccccccccccccccccc | team1 |
| bbb  |    8 |   302 | ccccccccccccccccc | team2 |
| ccc  |    9 |   503 | ccccccccccccccccc | team2 |
+------+------+-------+-------------------+-------+
9 rows in set (0.00 sec)

mysql> select cpid,sum(score),team from j group by cpid;
+-------------------+------------+-------+
| cpid              | sum(score) | team  |
+-------------------+------------+-------+
| aaaaaaaaaaaaaaaaa |        600 | team1 |
| bbbbbbbbbbbbbbbbb |        800 | team2 |
| ccccccccccccccccc |        906 | team1 |
+-------------------+------------+-------+
3 rows in set (0.00 sec)

Using MAX or MIN on the team gives different but not necessarily closer
results.

mysql> select cpid,sum(score),max(team) from j group by cpid;
+-------------------+------------+-----------+
| cpid              | sum(score) | max(team) |
+-------------------+------------+-----------+
| aaaaaaaaaaaaaaaaa |        600 | team2     |
| bbbbbbbbbbbbbbbbb |        800 | team2     |
| ccccccccccccccccc |        906 | team2     |
+-------------------+------------+-----------+
3 rows in set (0.00 sec)

mysql> select cpid,sum(score),min(team) from j group by cpid;
+-------------------+------------+-----------+
| cpid              | sum(score) | min(team) |
+-------------------+------------+-----------+
| aaaaaaaaaaaaaaaaa |        600 | team1     |
| bbbbbbbbbbbbbbbbb |        800 | team1     |
| ccccccccccccccccc |        906 | team1     |
+-------------------+------------+-----------+
3 rows in set (0.00 sec)

Given that for cpid = 'bbbbbbbbbbbbbbb', they have 2 rows where it is team1,
and only 1 with team2 but the original query gives team2 and rightly so as
it just uses the first row in mysql's slightly illegal (but useful!) use of
allowing other columns in the query but not in the group by.

The question is, is there any way to modify this query so that it would
return the team having the most entries?

Theoretical what I would like:

| cpid              | sum(score) | team  |
+-------------------+------------+-------+
| aaaaaaaaaaaaaaaaa |        600 | team1 |
| bbbbbbbbbbbbbbbbb |        800 | team1 |
| ccccccccccccccccc |        906 | team2 |


If not, is there an easy way to have another column, say mostteam char(10)
and run an update statement on the whole table which would put the correct
value in?

Regards

Phil

-- 
Distributed Computing stats
http://stats.free-dc.org

Reply via email to