We have a ~10 million row table but are expecting it to get larger, possibly by a factor of 10 or more. The rows are quite long and fixed length (just over 500 bytes).
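(At just over 500 bytes a row, that is roughly 5GB of table data now, and on the order of 50GB if it does grow by that factor of 10.)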
We have an index of (symbol, source, date) on this table, and we run queries like the following to get the latest 1000 rows for that symbol and source:

SELECT * FROM article WHERE symbol = 12646 AND source = 19
ORDER BY time DESC LIMIT 1000;

However, this takes quite a while even at the best of times (1-10 seconds). The same query without the ORDER BY and the LIMIT tends to return about 70000 rows, which adds up to about 30MB of data. Once the pages are in the cache the query takes around 100ms, but that is to be expected. Unfortunately the initial query required to cache them is unacceptably long for a web application like ours.

My (not yet implemented) solution to this problem is to add an id column fed from a SEQUENCE and index it, so that adding a WHERE id > [max_id]-1000 and ordering by time DESC should cut the I/O down quite a lot (rough sketch below). Am I right here?

It would also be nice if there were a way to get PostgreSQL to try to precache the table's pages; is there any way I could achieve something like that?

I have toyed with creating a ramdisk to put a lookup table of sorts on (we only care about a few columns initially) to speed this up a bit, but it's a right pain in the arse to do by the looks of things.
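If I did go down that route, I imagine it would look roughly like this (the mount point and columns are only illustrative, and the lookup table would still need to be kept in sync with article somehow):

-- tablespace on a ramdisk / tmpfs mount
-- (the directory has to exist and be writable by the postgres user,
--  and anything in it disappears on reboot)
CREATE TABLESPACE ramdisk LOCATION '/mnt/ramdisk/pgdata';

-- narrow lookup table holding just the columns we care about at first
CREATE TABLE article_lookup (
    article_id  bigint,
    symbol      integer,
    source      integer,
    time        timestamp
) TABLESPACE ramdisk;

CREATE INDEX article_lookup_idx ON article_lookup (symbol, source, time);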
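For what it's worth, here is a rough sketch of the sequence idea mentioned above (names are made up, and the (SELECT max(id) ...) subquery just stands in for the [max_id] value, however we end up fetching it):

-- add an id column fed from a sequence, plus an index on it
CREATE SEQUENCE article_id_seq;
ALTER TABLE article ADD COLUMN id bigint DEFAULT nextval('article_id_seq');
CREATE INDEX article_id_idx ON article (id);

-- the query then only has to look at the most recent ids
SELECT * FROM article
WHERE symbol = 12646 AND source = 19
  AND id > (SELECT max(id) FROM article) - 1000
ORDER BY time DESC LIMIT 1000;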
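As for precaching, the only thing I can think of myself is firing the expensive query off ahead of time (from a cron job or some background task) purely so the pages are already cached by the time a user asks for them:

-- crude warm-up: run the query once so its pages end up in the cache,
-- then the "real" query from the web app hits the ~100ms case
SELECT * FROM article WHERE symbol = 12646 AND source = 19
ORDER BY time DESC LIMIT 1000;

But that only helps for combinations we can predict in advance, so something built in would be much nicer.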