Re: [HACKERS] New FSM patch

Heikki Linnakangas Thu, 04 Sep 2008 01:10:17 -0700

Thanks for the review!

Simon Riggs wrote:

Can I check some aspects of this related to Hot Standby? Some of them
sound obvious, but worth double checking.


* There will be no need to read FSM by any normal operation of a
read-only transaction, so locking correctness considerations can
possibly be ignored during recovery.


Correct.

A HOT prune operation doesn't currently update the FSM, but if it did,that would be a case of a read-only transaction updating the FSM. But wecan't prune anyway in a hot standby.

 pg_freespacemap exists though:
would we need to prevent that from executing during recovery, or will
the FSM be fully readable? i.e. does redo take appropriate locks already
(I don't see any Cleanup locks being required).


pg_freespacemap, the contrib module? Yeah, the FSM should be fully readable.

During normal operation, when a bottom level page is updated, and theupdate needs to be bubbled up, the upper level page is pinned and lockedbefore the lock on the lower level page is released. That interlockingis not done during WAL replay, and the lock on the lower level page isreleased before locking the upper page. It's not required during WALreplay, as there's no concurrent updates to the FSM.

* FSM will be continuously maintained during recovery, so FSM will now
be correct and immediately available when recovery completes?


Correct,

* There are no cases where a screwed-up FSM will crash either recovery
(FATAL+) or halt normal operation (PANIC)?

Hmm, there's no explicit elog(FATAL/PANIC) calls, but if the FSM isreally corrupt, you can probably get a segfault. That should be fixableby adding more sanity checks, though.

* incomplete action cleanup is fairly cheap and doesn't rely on the FSM
being searchable to correct the error? This last is a hard one...


Correct.

Do we have the concept of a invalid/corrupt FSM?  What happens if the
logic goes wrong and we have a corrupt page? Will that mean we can't

> complete actions against the heap?

Some scenarios I can think of:

Scenario: The binary tree on a page is corrupt, so that the value of anupper node is < Max(leftchild, rightchild).Consequence: Searchers won't see the free space below that node, andwill look elsewhere.

Scenario: The binary tree on a page is corrupt, so that the value of anupper node is > Max(leftchild, rightchild).Consequence: Searchers will notice the corruption while trying totraverse down that path, and throw an elog(WARNING) in search_avail().fsm_search will retry the search, and will in worst case go into aninfinite loop. That's obviously not good. We could automatically fix theupper nodes of the tree, but that would wipe evidence that would beuseful in debugging.

Scenario: An upper level page is corrupt, claiming that there's no freespace on a lower level page, while there actually is. (the opposite,where an upper level page claims that there *is* free space on a lowerlevel page, while there actually isn't, is now normal. The searcher willupdate the upper level page in that case)

Consequence: A searcher won't see that free space, and will look elsewhere.

Scenario: An upper level page is corrupt, claiming that there is freespace on a lower level page that doesn't exist.Consequence: Searchers will elog(ERROR), trying to read the non-existentFSM page.

The 3rd scenario would lead to heap inserts/updates failing. We couldavoid that by checking that the page exists withRelationGetNumberOfBlocks(), but I'm not sure if it's worth the cost.

Are there really any changes to these files?
src/include/storage/bufmgr.h
src/include/postmaster/bgwriter.h


Hmm, apparently not.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] New FSM patch

Reply via email to