On 10/10/2010 6:29 AM, Neil Whelchel wrote:
On Saturday 09 October 2010 23:56:15 Craig Ringer wrote:
On 10/10/2010 11:02 AM, Neil Whelchel wrote:
On Saturday 09 October 2010 18:47:34 Scott Marlowe wrote:
On Sat, Oct 9, 2010 at 5:26 PM, Neil Whelchel<neil.whelc...@gmail.com>
wrote:
I know that there have been many discussions on the slowness of
count(*) even when an index is involved, because the visibility of the
rows has to be checked. In the past I have seen many suggestions about
using triggers and tables to keep track of counts, and while this works
fine in a situation where you know what the report is going to be
ahead of time, it is simply not an option when an unknown WHERE
clause is to be used (dynamically generated). I ran into a fine
example of this when I was searching this mailing list: "Searching in
856,646 pages took 13.48202 seconds. Site search powered by PostgreSQL
8.3." Obviously at some point count(*) came into play here, because the
site made a list of pages (1 2 3 4 5 6>   next).

I very commonly make a list of pages from search results, and the
biggest time killer here is the count(*) portion. Worse yet, I
sometimes have to hit the database with two SELECT statements: one
with OFFSET and LIMIT to get the page of results I need, and another
to get the total number of rows so I can estimate how many pages of
results are available. The point I am driving at here is that since
building a list of result pages is such a common thing to do, there
needs to be some specific high-speed way to do it in one query. Maybe
an estimate(*) that works like count but gives an answer from the
index without checking visibility? I am sure that this would be good
enough to make a page list; it is really no big deal if it errs on the
positive side, maybe the list of pages has an extra page off the end.
I can live with that. What I can't live with is taking 13 seconds to
get a page of results from 850,000 rows in a table.
Good point. Maybe this is turning into a discussion of how to generate a
list of result pages and one page of results with a single query, so we don't
have to run the same painfully slow query twice to do a very common task.
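For what it's worth, one way to fold the page and the total into a single
statement on PostgreSQL 8.4 or later (which added window functions) is
count(*) OVER (). A sketch, assuming a hypothetical table items with an
indexed column val:

```sql
-- One round trip: each returned row also carries the total match count.
-- Note: the server still visits all matching rows to compute the window
-- aggregate, so this saves a round trip and a second scan of the data,
-- not the cost of counting itself.
SELECT *,
       count(*) OVER () AS total_matches
FROM items
WHERE val > 50
ORDER BY val
LIMIT 25 OFFSET 0;
```

One caveat: if the requested page is past the end of the result set, zero
rows come back and the total comes back with them, so an out-of-range page
needs separate handling.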

On the other hand, I copied a table out of one of my production servers that
has about 60,000 rows with 6 columns (numeric, numeric, bool, bool, timestamp,
text). The first numeric column has numbers evenly spread between 0 and 100,
and it is indexed. I put the table in a pair of database servers, both running
on the same physical hardware. One server is Postgres, the other is a popular
server (I am not mentioning names here). On Postgres, SELECT count(*) FROM
table WHERE column > 50; takes about 8 seconds to run. The other database
server took less than one second (about 25 ms), as it is using the index (I
assume) to come up with the result. It is true that this is not a fair test,
because both servers were tested with their default settings, and the defaults
for Postgres are much more conservative; however, I don't think any amount of
settings tweaking will bring them into the same ballpark.

There has been discussion about the other server returning an incorrect count
because all of the indexed rows may not be live at the time. This is not a
problem for the intended use; that is why I suggested another function like
estimate(*). Its name suggests that the result will be close, not 100%
correct, which is plenty good enough for generating a list of result pages in
most cases.

I am faced with a very serious problem here. If the query to make a list of
pages takes, say, 6 seconds, and it takes another 6 seconds to generate a page
of results, the customer is waiting 12 seconds. This is not going to work. If
count made a quick estimate, say in less than a second, and it took 6 seconds
to come up with the actual results, I could live with that. Or if the query
that produces the window of results (via OFFSET and LIMIT) also returned the
total number of rows that would have matched, I would still have everything I
need to render the page in a reasonable time. I really think this needs to be
addressed somewhere. It's not like I am the only one who does this; you see it
nearly everywhere a long list of results is (expected to be) returned on a web
site. Among the people I work with, this seems to be the most mentioned reason
they don't use Postgres for their projects.
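For the whole-table case (no WHERE clause), an estimate(*)-style answer
already exists in the statistics the planner keeps: pg_class.reltuples, which
VACUUM, ANALYZE, and autovacuum maintain. A sketch (the table name here is
just an example):

```sql
-- Approximate row count without scanning the table. Accuracy depends on
-- how recently the table was vacuumed or analyzed.
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'my_production_table';
```

This does not help with an arbitrary, dynamically generated WHERE clause,
which is the harder part of the problem.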

It would be nice to see how the server comes up with the search results and
list of links to pages of results for this mailing list.
(http://search.postgresql.org/search?q=slow+count%28%29&m=1&l=&d=365&s=r) I am
guessing that it probably uses the count and query method I am talking about.

I guess there could be *some* way to expose the query planner's cost
estimates in a manner useful for result count estimation ... but given
how coarse its stats are and how wildly off the estimates can be, I kind
of doubt it. It's really intended for query planning decisions, is more
interested in orders of magnitude ("0, 1, or more than that" measures),
and seems to consider 30% here or there pretty insignificant most of the
time.
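That said, the planner's estimate can be scraped today without any server
change, by running EXPLAIN from PL/pgSQL and pulling the row estimate out of
the top plan node. A hedged sketch (the function name count_estimate is my
own, and the result is only as good as the planner's statistics):

```sql
CREATE OR REPLACE FUNCTION count_estimate(query text) RETURNS integer AS $$
DECLARE
    rec  record;
    cnt  integer;
BEGIN
    -- Run EXPLAIN (not the query itself) and regex out "rows=N"
    -- from the first plan line that carries an estimate.
    FOR rec IN EXECUTE 'EXPLAIN ' || query LOOP
        cnt := substring(rec."QUERY PLAN" FROM ' rows=([[:digit:]]+)');
        EXIT WHEN cnt IS NOT NULL;
    END LOOP;
    RETURN cnt;
END;
$$ LANGUAGE plpgsql;

-- e.g. SELECT count_estimate('SELECT * FROM items WHERE val > 50');
```

This runs in planning time only (milliseconds), which is the "quick estimate"
behaviour being asked for, with exactly the accuracy caveats above.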

It's bad enough that count(*) is slow; then you have to do it all over
again to get the results you need! I have not dug into this much yet,
but would it be possible to return the number of rows that a WHERE
clause would match if the LIMIT and OFFSET were not applied? That is,
when a normal query is executed, the server returns the number of rows
alongside the actual row data. Would it be a big deal to modify this so
it can also return the number of rows before the LIMIT and OFFSET are
applied?
It'd force the server to fully execute the query. Then again, it sounds
like you're doing that anyway.
How big is your DB?
How fast is your disk access?
Any chance disks/RAM can be addressed?

My disk access is pitiful... first run over 2.3 million rows: 0m35.38s;
subsequent runs: real 0m2.55s.

rthom...@hw-prod-repdb1> time psql -c "select count(*) from 
my_production_table" reporting
  count
---------
 2340704
(1 row)


real    0m35.38s
user    0m0.25s
sys     0m0.03s

subsequent runs.... (count changes due to inserts.)

rthom...@hw-prod-repdb1> time psql -c "select count(*) from 
my_production_table" reporting
  count
---------
 2363707
(1 row)


real    0m2.70s
user    0m0.27s
sys     0m0.02s
rthom...@hw-prod-repdb1> time psql -c "select count(*) from 
my_production_table" reporting
  count
---------
 2363707
(1 row)


real    0m2.55s
user    0m0.26s
sys     0m0.02s
rthom...@hw-prod-repdb1> time psql -c "select count(*) from 
my_production_table" reporting
  count
---------
 2363707
(1 row)


real    0m2.50s
user    0m0.26s
sys     0m0.02s

reporting=# SELECT 
pg_size_pretty(pg_total_relation_size('my_production_table'));
 pg_size_pretty
----------------
 1890 MB
(1 row)


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance