Re: optimizer: GROUP BY unnessesary slow using ANSI notation for redundant group by's.

2004-01-25 Thread Sergei Golubchik
Hi!

On Jan 24, Stefan Traby wrote:
 Hi !
 
 For my forum system, I use the following query to generate the
 main-overview: (using 4.0.16-log)
 
 select f1.id as BoardId,
f1.name as Board,
f1.more as BoardDesc,
f2.id as AreaId,
f2.name as Area,
f2.more as AreaDesc,
count(distinct f3.id) as ThemenCount,
count(distinct m1.ctime) as MessageCount,
max(m1.ctime) as LastMessageStamp
 from forum as f1,
  forum as f2
  left join forum as f3 on (f3.rid = f2.id)
  left join forum_msg as m1 on (m1.fid = f3.id)
 where f1.rid = 0 
  and f2.rid = f1.id
 group by AreaId -- note ANSI: group by AreaId, Area, AreaDesc
 order by BoardId, AreaId;
 
 ANSI requires to use group by AreaId, Area, AreaDesc instead
 of group by AreaId (which is a documented MySQL shortcut against
 this redundancy) but the ANSI notation is ~4 times slower.
 
 This performance penalty is really unnessesary because the optimizer
 could detect this kind of redundancy in many cases, especially this
 simple case because group by f2.id generates clearly the same
 results as group by f2.id, f2.name, f2.more does.

Yes, you are right.
But though MySQL doesn't have this optimization, I still cannot
understand where this ~4 came from.

How big are your tables (rows and bytes) ? What are typical values for
ThemenCount and MessageCount in the result of this query ?

Can you provide the results of SHOW CREATE TABLE for forum and forum_msg ?

Regards,
Sergei

-- 
   __  ___ ___   __
  /  |/  /_ __/ __/ __ \/ /   Sergei Golubchik [EMAIL PROTECTED]
 / /|_/ / // /\ \/ /_/ / /__  MySQL AB, Senior Software Developer
/_/  /_/\_, /___/\___\_\___/  Osnabrueck, Germany
   ___/  www.mysql.com

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



optimizer: GROUP BY unnessesary slow using ANSI notation for redundant group by's.

2004-01-23 Thread Stefan Traby
Hi !

For my forum system, I use the following query to generate the
main-overview: (using 4.0.16-log)


select f1.id as BoardId,
   f1.name as Board,
   f1.more as BoardDesc,
   f2.id as AreaId,
   f2.name as Area,
   f2.more as AreaDesc,
   count(distinct f3.id) as ThemenCount,
   count(distinct m1.ctime) as MessageCount,
   max(m1.ctime) as LastMessageStamp
from forum as f1,
 forum as f2
 left join forum as f3 on (f3.rid = f2.id)
 left join forum_msg as m1 on (m1.fid = f3.id)
where f1.rid = 0 
 and f2.rid = f1.id
group by AreaId -- note ANSI: group by AreaId, Area, AreaDesc
order by BoardId, AreaId;


ANSI requires to use group by AreaId, Area, AreaDesc instead
of group by AreaId (which is a documented MySQL shortcut against
this redundancy) but the ANSI notation is ~4 times slower.

This performance penalty is really unnessesary because the optimizer
could detect this kind of redundancy in many cases, especially this
simple case because group by f2.id generates clearly the same
results as group by f2.id, f2.name, f2.more does.

I think it's really *bad* to allow the non-ANSI shortcut and
to not detect the ANSI-notation as redundant!

shame on you :) :)

-- 

  ciao - 
Stefan

aclocal - emit a warning if -ac kernel is used. 

Stefan TrabyLinux/ia32 office: +49-721-3523165
Mathystr. 18-20 V/8 Linux/alpha  cell: +XX-XXX-XXX
76133 Karlsruhe Linux/sparc   http://graz03.kwc.at
Germany Linux/arm   mailto:[EMAIL PROTECTED]
Europe  Linux/mips mailto:[EMAIL PROTECTED]

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]