Re: A question about the select count(*) performance and the InnoDB engine

Mathias Tue, 28 Jun 2005 10:14:28 -0700

Selon [EMAIL PROTECTED]:

> Mathias <[EMAIL PROTECTED]> wrote on 06/28/2005 06:13:08 AM:
>
> > Selon Behrang Saeedzadeh <[EMAIL PROTECTED]>:
> >
> > > Mathias,
> > >
> > > Thanks a lot!
> > >
> > > > I will not explain the same thing for sqlserver, sybase ..., but
> when
> > > > your RDBMs
> > > > have a data dictionnary, you don't need to execute count(*)  :o)
> WITH
> > > > Updated
> > > > statistics of course.
> > >
> > > I'm a little bit confused here. Why the count(*) is not transformed to
> a
> > > select from the data dictionary if this way is faster? And what's the
> > > difference between updated statistics and statistics not updated?
> > >
> > > > With information_schema in 5.x and higher, innodb will act as it's
> done
> > > > in all
> > > > the other RDBMS.
> > > >
> > > > Hope that helps
> > >
> > > Sure! It helped by orders of magnitured more than I thought it can
> help ;-)
> > >
> > > > :o)
> > > > Mathias
> > >
> > >
> > >
> > > --
> > > Behrang Saeedzadeh
> > > http://www.jroller.com/page/behrangsa
> > >
> > > Using Opera's revolutionary e-mail client
> > >
> >
> > Well,
> > The information in data dictionnary are correct only just after updating
> them.
> > imagine at 12h, you update statistics, num_rows=2000. At 12h05, you
> > insert 1000
> > lignes and delete 500.
> >
> > At 12h10, you ask the data dictinary num_rows, it will give you 2000,
> even if
> > they are 2500.
> >
> >
> > Hope that helps
> > :o)
> > Mathias
> >
> Mathias,
>
> COUNT(*) is not slow in InnoDB due to a lack of statistics. It's due to
> the fact that for any user the value of COUNT(*) can be completely
> different than for any other user.  Assume for a moment that there is a
> table stored in InnoDB that has 1000 records in it. UserA starts a
> transaction that adds 200 records and changes 50. UserB also starts a
> transaction and adds 500 records of his own. For the rest of this example,
> both transactions remain "pending".
>
> Physically, the database now contains 1000 (original) + 200 (added by
> UserA) + 50 (changes pending from UserA) + 500 (added by UserB) = 1750
> total records. However, if UserA performs a COUNT(*) query, they would
> only be able to see the 1200 records visible within their transaction.
> UserB will only be able to count 1500 records for the same reason.
>
> The slowness of performing a COUNT(*) query is caused by the need to
> individually evaluate all 1750 records to see if the user that asked to
> "count" them should actually know about them. Unless the engine is changed
> to maintain a separate set of table statistics for each user there won't
> be any way to just "look up" the number because the record count can (and
> usually will) be different for each user.
>
> After both transactions commit, the database will only have 1700 records
> (total) as the 50 pending updates, from UserA's transaction, will have
> overwritten the 50 original records.
>
> Does that help?
>
> Shawn Green
> Database Administrator
> Unimin Corporation - Spruce Pine


Thanks Shawn, but i'm not speaking about data consistency during transaction and
isolation levels.
I spoke about what is seen in the data dictionary as num_rows an why it  can not
be used even it's quite faster.



Hope that helps
:o)
Mathias

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Re: A question about the select count(*) performance and the InnoDB engine

Reply via email to