Re: [HACKERS] Avoiding redundant fetches of btree index metapages

2006-04-26 Thread Simon Riggs
On Tue, 2006-04-25 at 13:43 -0400, Tom Lane wrote:

 What I'm considering doing to fix that is require any change to a
 btree's metapage data to send out a relcache inval message for the
 index.  That will force all backends to flush the stale cache item
 not later than the start of their next transaction, and thereby
 guarantee that they aren't using pointers that are too old to be safe
 against vacuuming.  (There are other ways we could handle this, but
 that one seems like the simplest and least invasive.)
 
 Comments?  Anyone see any flaws in the reasoning?

Hmmm I'm slightly worried that frequently-churning small tables will
be made even slower by this. What do you think?

 * Re-using rd_targblock was OK for a quick hack because we don't use it
 for indexes, but it's too klugy for real use, and it's not quite enough
 info anyway (we really ought to cache the root level as well as root
 block number).  I'm planning to add a void * pointer to the Relation
 struct that the index AM is allowed to use as it sees fit, with the
 understanding that any pointed-to data lives in rd_indexcxt and the
 pointer will be reset to NULL on any relcache clear.  btree would store
 a copy of the BTMetaPageData struct.  The other AMs might have some
 other use for this.

So we would be able to cache other items also? I'd also considered
caching the rightmost page, for when the table grows via a monotonically
increasing key. Similar benefits, same problems, so same-ish solution
sounds the right way.

For that case we'd save N block accesses to locate the rightmost leaf
page. If the cache is wrong, we could seek again from the root at a cost
of 1 additional access (or move right until we find it depending upon
how stale we think the cache is, if we can find a heuristic to gauge
that). We would only need to send an invalidation message when VACUUM
removes a page, as well as for any insertion other than the rightmost
page. Maybe do this as an index option, e.g. APPEND (MONOTONIC seems
like a poor choice, even if it would be more correct)?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Avoiding redundant fetches of btree index metapages

2006-04-26 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Hmmm I'm slightly worried that frequently-churning small tables will
 be made even slower by this. What do you think?

How so?

 So we would be able to cache other items also?

Only to the extent that you can guarantee a stale cache entry isn't a
problem.  We've already done the analysis involved for the existing
metapage entries, but anything else would require more thought.  (And
more cache flush events.)

 For that case we'd save N block accesses to locate the rightmost leaf
 page.

Surely you mean log(N).

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Avoiding redundant fetches of btree index metapages

2006-04-26 Thread Simon Riggs
On Wed, 2006-04-26 at 12:53 -0400, Tom Lane wrote:
  So we would be able to cache other items also?
 
 Only to the extent that you can guarantee a stale cache entry isn't a
 problem.  We've already done the analysis involved for the existing
 metapage entries, but anything else would require more thought.  (And
 more cache flush events.)

You mean performance tests! Will do.

Methinks that cache flushing is the key to performance for that idea.

  For that case we'd save N block accesses to locate the rightmost leaf
  page.
 
 Surely you mean log(N).

Depends what N is. I meant the level, not the number of rows.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com/


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Avoiding redundant fetches of btree index metapages

2006-04-25 Thread Alvaro Herrera
Tom Lane wrote:
 Some experimentation over the weekend turned up the result that $SUBJECT 
 is a good idea.  I had never thought about this much before, figuring
 that in a heavily-used index the metapage would stay in cache anyway and
 so fetching it at the start of each index search isn't costing any extra
 I/O.  That's true, but what it does cost is bufmgr contention, and in
 fact more than you'd expect because the amount of work actually done
 before releasing the page again is minuscule.  (See off-list discussion
 attached below.)

Wow, this is extremely nice.  Congratulations on another well-spotted
performance problem solved.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings