On Sun, 2006-07-16 at 10:51 -0400, Tom Lane wrote:
> Andreas Seltenreich <[EMAIL PROTECTED]> writes:
> > Simon Riggs <[EMAIL PROTECTED]> writes:
> >> [2. text/x-patch; restartableRecovery.patch]
> >
> > Hmm, wouldn't you have to reboot the resource managers at each
> > checkpoint? I'm afraid otherwise things like postponed page splits
> > could get lost on restart from a later checkpoint.
>
> Ouch. That's a bit nasty. You can't just apply a postponed split at
> checkpoint time, because the WAL record could easily be somewhere after
> the checkpoint, leading to duplicate insertions. Right offhand I don't
> see how to make this work :-(

Yes, ouch. So much for gung-ho code sprints; thanks Andreas.

To do this we would need another rmgr-specific routine that gets
called at a recovery checkpoint. This would then write to disk the
current state of the incomplete multi-WAL actions, in some manner.
During the startup routines we would check for any pre-existing state
files and use those to initialise the incomplete action cache. Cleanup
would then discard all state files.

That allows us to not-forget actions, but it doesn't help us if there
are problems with applying an action twice. We would at least know that
we are in a potential double-action zone and could give different kinds
of errors or handling.

Or we can simply mark any index incomplete-needs-rebuild if it had a
page split during the overlap time between the last known good recovery
checkpoint and the following one. But that leads to an unpredictably
bounded recovery time, in which case it might have been better to start
from scratch anyway.

Given the time available for 8.2, neither one is a quick fix.

-- 
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com
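
For illustration only, the state-file idea described above (save the incomplete-action cache at a recovery checkpoint, reload it at startup, discard it at cleanup) might look roughly like the sketch below. Every name here (IncompleteAction, IncompleteCache, rm_save_state, rm_load_state, rm_discard_state) and the file layout are hypothetical and are not part of the patch or of any existing rmgr interface. The sketch only covers not forgetting actions; it does nothing about the double-application problem.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_INCOMPLETE 64

/* Hypothetical record of one unfinished multi-WAL-record action,
 * e.g. a page split whose parent insertion has not been replayed yet. */
typedef struct IncompleteAction
{
    uint32_t rel;       /* index identifier (assumed representation) */
    uint32_t leftblk;   /* left half of the split */
    uint32_t rightblk;  /* right half of the split */
} IncompleteAction;

/* Hypothetical in-memory cache of pending actions for one rmgr. */
typedef struct IncompleteCache
{
    uint32_t         n;
    IncompleteAction items[MAX_INCOMPLETE];
} IncompleteCache;

/* Called at a recovery checkpoint: write pending actions to a state file.
 * Returns 0 on success, -1 on failure. */
static int
rm_save_state(const IncompleteCache *cache, const char *path)
{
    assert(cache != NULL && cache->n <= MAX_INCOMPLETE);

    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    int ok = fwrite(&cache->n, sizeof cache->n, 1, f) == 1 &&
             fwrite(cache->items, sizeof cache->items[0], cache->n, f) == cache->n;

    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Called during startup: initialise the cache from a pre-existing state
 * file. A missing file simply means an empty cache.
 * Returns 0 on success, -1 if the file exists but cannot be read. */
static int
rm_load_state(IncompleteCache *cache, const char *path)
{
    cache->n = 0;

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;

    uint32_t n = 0;
    int ok = fread(&n, sizeof n, 1, f) == 1 &&
             n <= MAX_INCOMPLETE &&
             fread(cache->items, sizeof cache->items[0], n, f) == n;

    fclose(f);
    if (!ok)
        return -1;

    cache->n = n;
    return 0;
}

/* Called at cleanup: discard the state file. */
static void
rm_discard_state(const char *path)
{
    remove(path);
}
```

A real version would also need to fsync the file and replace it atomically (write to a temporary name, then rename) so that a crash in the middle of a recovery checkpoint cannot leave a torn state file behind.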