Simon,

have you read src/backend/storage/buffer/README of current CVS tip?

The algorithm in the new replacement strategy is an attempt to figure that SMALL_TABLE_THRESHOLD automatically. Do you see anything that can be improved in that algorithm?


Jan


Simon Riggs wrote:

This discussion seems likely to have a major effect on DBT-3 (DSS-type)
performance from PostgreSQL...

On Fri, 16 Jan 2004, Michael Brusser wrote:

> Is there a way to force database to load
> a frequently-accessed table into cache and keep it there?

Scott Marlow replied...

Nope. But there is a new cache buffer handler that may make it into
7.5
that would make that happen automagically.

The important question here is "what forces blocks out of cache?" rather than thinking about how to directly keep them there.

Jeroen T. Vermeulen wrote:
Sent: Friday, January 16, 2004 23:02
Subject: [HACKERS] Reverse scans?

Would it be doable, and would it be useful, to try to alternate the
directions of table and index scans every time each table/index was
fully scanned?

I was thinking that it could help cache performance at various levels
in cases where data at the end of a large table, say, that remained in
memory after a scan, would otherwise be flushed out by a new scan of
the
same table.  If the next scan of the same table was to go in the other
direction, any remains of the last time around that were still in the
filesystem cache, buffer pool, hard disk cache etc. would stand a
greater
chance of being reused.

Jereon's idea is a good one when we consider the current behaviour,
which is that large scans are placed into block buffer cache, which then forces
other data out. I would like to question the latter behaviour, so we can
address the cause and not just the symptom.


Earlier versions of Oracle had a parameter called something like
SMALL_TABLE_THRESHOLD. Any scan of a table larger than that threshold
was NOT placed into buffer cache, but was consumed directly by the
shadow process (the backend). Teradata also uses a similar buffer
control technique for large table scans.

If a table is too large to fit into buffer, it clearly wasn't going to
be cached properly in the first place; Jereon's idea only works well for
tables near to the size of the cache. If the table is MUCH bigger then
it will have very little gain. Good original thinking, though I'm not
sure its worth it.

Oracle 9i now offers some variety for buffer cache management (as does
DB2). You can specify at the tablespace and object level whether to use
one of two buffer pools, KEEP or RECYCLE. The idea is to separate out
the two types of blocks - ones that are there because they're well used
and other blocks that are there at the moment, but unlikely to stay.

My suggestion would be to:
- split the buffer cache into two, just as Oracle does: KEEP & RECYCLE.
This could default to KEEP=66% of total memory available, but could also
be settable by init parameter.
[changes to the memory management routines]
- if we do a scan on a table whose size in blocks is more than some
fraction (25%?) of KEEP bufferpool then we place the blocks into RECYCLE
bufferpool. This can be decided immediately following optimization,
rather than including it within the optimizer decision process since we
aren't going to change the way the statement executes, we're just going
to stop it from having an adverse effect on other current or future
statements.
[additional test to set parameter, then work out where to note it]

Notice that I haven't suggested that the KEEP/RECYCLE option could be
specified at table level. That optionality sounds like a lot of extra
work, when what is needed is the automatic avoidance of cache-spoiling
behaviour. (This would still mean that very large indexes with random
request patterns would still spoil cache...maybe implement that later?)

This would remove most reasons for spoiling the cache and blocks would
then leave the cache only when they were genuinely no longer wanted.

Any comments?? Takers?


---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives?

http://archives.postgresql.org


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #


---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match

Reply via email to