On Tue, 2006-04-25 at 13:43 -0400, Tom Lane wrote: > What I'm considering doing to fix that is require any change to a > btree's metapage data to send out a relcache inval message for the > index. That will force all backends to flush the stale cache item > not later than the start of their next transaction, and thereby > guarantee that they aren't using pointers that are too old to be safe > against vacuuming. (There are other ways we could handle this, but > that one seems like the simplest and least invasive.) > > Comments? Anyone see any flaws in the reasoning?
Hmmm.... I'm slightly worried that frequently-churning small tables will be made even slower by this. What do you think? > * Re-using rd_targblock was OK for a quick hack because we don't use it > for indexes, but it's too klugy for real use, and it's not quite enough > info anyway (we really ought to cache the root level as well as root > block number). I'm planning to add a "void *" pointer to the Relation > struct that the index AM is allowed to use as it sees fit, with the > understanding that any pointed-to data lives in rd_indexcxt and the > pointer will be reset to NULL on any relcache clear. btree would store > a copy of the BTMetaPageData struct. The other AMs might have some > other use for this. So we would be able to cache other items also? I'd also considered caching the rightmost page, for when the table grows via a monotonically increasing key. Similar benefits, same problems, so same-ish solution sounds the right way. For that case we'd save N block accesses to locate the rightmost leaf page. If the cache is wrong, we could seek again from the root at a cost of 1 additional access (or move right until we find it depending upon how stale we think the cache is, if we can find a heuristic to gauge that). We would only need to send an invalidation message when VACUUM removes a page, as well as for any insertion other than the rightmost page. Maybe do this as an index option, e.g. APPEND (MONOTONIC seems like a poor choice, even if it would be more correct)? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org