Phoenix Kiula wrote:
> [Posted similar note to PG General but I suppose it's more appropriate
> in this list. Apologies for cross-posting.]
>
> Hi. Further to my bafflement with the "count(*)" queries as described
> in this thread:
>
> http://archives.postgresql.org/pgsql-general/2009-01/msg00804.php
>
> It seems that whenever this question has come up, PostgreSQL comes up
> very short in terms of "count(*)" functions.

Sorry - I'm confused. That thread doesn't seem to contain a slow
count(*) query. You seem to be saying you're having problems with the
query taking 10-15 seconds, but the example takes less than half a
second. How have you identified count(*) as being the problem here?

> The performance is always slow, because of the planner's need to guess
> and such. I don't fully understand how the statistics work (and the
> explanation on the PG website is way too geeky) but the columns I work
> with already have a stat level of 100. Not helping at all.

But your own email says it's slow sometimes: "My queries are fast in
general *except* the first time". I'm not sure how the planner comes
into this - slow on the first run and fast afterwards usually points to
a cold disk cache rather than a bad plan.

> We are now considering a web based logging functionality for users of
> our website. This means the table could be heavily INSERTed into. We
> get about 10 million hits a day, and I'm guessing that we will have to
> keep this data around for a while.
>
> My question: with that kind of volume and the underlying aggregation
> functions (by product id, dates, possibly IP addresses or at least
> countries of origin..) will PG ever be a good choice?

A good choice compared to what?

> Or should I be
> looking at some other kind of tools? I wonder if OLAP tools would be
> overkill for something that needs to look like a barebones version of
> Google Analytics limited to our site..

Typically you'd summarise the data by hour/day via triggers or a
scheduled script if you weren't going towards a pre-packaged OLAP
toolkit - something along the lines of the sketch below. Otherwise
you're going to have to scan the hundreds of millions of rows you've
accumulated every time you want a total.
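Purely for illustration - the table and column names here are invented,
not taken from your mail - a minimal trigger-maintained summary might
look something like this:

    -- Sketch only: names invented for illustration.
    -- Raw hits: one row per page view, heavily INSERTed into.
    CREATE TABLE hits (
        product_id  integer     NOT NULL,
        hit_time    timestamptz NOT NULL DEFAULT now(),
        ip          inet
    );

    -- Pre-aggregated counts: one row per product per hour.
    CREATE TABLE hits_by_hour (
        product_id  integer     NOT NULL,
        hour        timestamptz NOT NULL,
        num_hits    bigint      NOT NULL DEFAULT 0,
        PRIMARY KEY (product_id, hour)
    );

    CREATE FUNCTION bump_hits_by_hour() RETURNS trigger AS $$
    BEGIN
        -- Try to bump the existing summary row first...
        UPDATE hits_by_hour
           SET num_hits = num_hits + 1
         WHERE product_id = NEW.product_id
           AND hour = date_trunc('hour', NEW.hit_time);
        -- ...and create it if this is the first hit in that hour.
        IF NOT FOUND THEN
            INSERT INTO hits_by_hour (product_id, hour, num_hits)
            VALUES (NEW.product_id, date_trunc('hour', NEW.hit_time), 1);
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER hits_summarise
        AFTER INSERT ON hits
        FOR EACH ROW EXECUTE PROCEDURE bump_hits_by_hour();

Your reports then read hits_by_hour (thousands of rows) instead of hits
(hundreds of millions). Two caveats: the update-or-insert above can
fail under concurrent inserts for the same product/hour (you'd need to
retry or lock), and a row-level trigger adds cost to every one of your
10 million daily inserts - which is why a scheduled script summarising
the previous hour in one pass is often the better trade-off.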
> Appreciate any thoughts. If possible I would prefer to tone down any
> requests for MySQL and such!

I'm not sure MySQL is going to help you here - if you were running lots
of small, simple queries it might make sense. If you want to aggregate
data by varying criteria I don't think there is any sensible
optimisation (other than pre-calculating summaries).

--
  Richard Huxton
  Archonet Ltd