On 17/12/2010, at 8:27 PM, Filip Rembiałkowski wrote:

> 2010/12/17 Royce Ausburn <ro...@inomial.com>
>
>> Hi all,
>>
>> I have a table that in the typical case holds two-minute sample data for
>> a few thousand sources. Often we need to report on these data for a
>> particular source over a particular time period, and we're finding this
>> query tends to get a bit slow.
>
> how about (auto)vacuuming?

A key piece of information I left out: we almost never update rows in this
table.

>> I figure at most there should only be ~20,000 rows to be read from disk,
>> and I expect that the index is doing a pretty good job of making sure
>> only the rows that need reading are read. Inclusion of the ip in the
>> query is almost redundant, as most of the time an ip has its own
>> collection... My suspicion is that the rows we're interested in are very
>> sparsely distributed on disk, so we're having to read too many pages for
>> the query...
>
> You can test this suspicion in a very simple way:
> - create a test table (like yours, including indexes and constraints, but
>   with no data)
> - insert into test select * from yours order by
> - analyze the test table
> - test the query on the new table
>
> If the new query is much faster, and if you have intensive random
> UPD/DEL/INS activity, a periodic CLUSTER could be a good idea...
> but it depends on actual usage patterns (SELECT/modify ratio, types of
> updates, and so on).

Good idea! This vastly improves query times.
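For the archives, here's roughly what I ran to test it. The table and
column names below are stand-ins for our real schema (a "samples" table
keyed by source "ip" and "sample_time"):

    -- Build an ordered copy so rows for a given ip end up on adjacent
    -- pages ("samples", "ip" and "sample_time" are made-up names).
    CREATE TABLE samples_test AS
        SELECT * FROM samples ORDER BY ip, sample_time;

    -- CREATE TABLE AS doesn't copy indexes, so recreate the one the
    -- report query relies on.
    CREATE INDEX samples_test_ip_time_idx
        ON samples_test (ip, sample_time);

    -- Refresh planner statistics for the new table.
    ANALYZE samples_test;

    -- Re-run the report query against the copy and compare timings.
    EXPLAIN ANALYZE
    SELECT *
    FROM samples_test
    WHERE ip = '10.0.0.1'
      AND sample_time >= '2010-12-01'
      AND sample_time <  '2010-12-02';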
> and finally, you did not specify what PostgreSQL version you are using.

In the case I've been working with it's 8.1 =(  But we have a few instances
of this database... I believe the rest are a mixture of 8.4s, and they all
have the same problem.

--Royce
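P.S. For anyone finding this thread later: the periodic CLUSTER Filip
suggests would look something like this on our versions (again, the table
and index names are stand-ins):

    -- Physically rewrite the table in index order. Note that CLUSTER
    -- takes an exclusive lock on the table for the duration, so it needs
    -- a quiet period.
    CLUSTER samples_ip_time_idx ON samples;

    -- CLUSTER doesn't update statistics, so analyze afterwards.
    ANALYZE samples;

    -- Later runs can reuse the remembered clustering index:
    CLUSTER samples;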