Hi,

After struggling with xlog.c and friends for long enough to be able to refactor StartupXLOG to suit the needs of concurrent recovery, I think I've finally reached a workable (but still somewhat hacky) solution.

My design is centered around the idea of a bgreplay process that takes over the role of the bgwriter in readonly mode and continuously replays WAL segments as they arrive. But since recovery during startup is still necessary (we need to bring a filesystem-level backup into a consistent state, i.e. past minRecoveryLoc, before allowing connections), this means doing recovery in two steps, from two different processes.

I've changed StartupXLOG to recover only up to minRecoveryLoc in readonly mode, and to skip all steps that are not required if no writes to the database will be done later (in particular, creating a checkpoint at the end of recovery). Instead, it posts the pointer to the last recovered xlog record to shared memory. bgreplay then uses that pointer for an initial call to ReadRecord to set up WAL reading for the bgreplay process. Afterwards, it repeatedly calls ReplayXLOG (a new function), which always replays at least one record (if there is one; otherwise it returns false) and keeps going until it reaches a safe restart point.

Currently, in my test setup, I can start a slave in readonly mode, and it will do the initial recovery, bring postgres online, and continuously recover from inside bgreplay.

There isn't yet any locking between WAL replay and queries. I'll add that locking during the next few days, which should result in a very early prototype. The next steps will then be finding a way to flush backend caches after replaying records that modified system tables, and (related) finding a way to deal with the flatfiles.

I'd appreciate any comments on this, especially those pointing out problems that I overlooked.

greetings, Florian Pflug
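
For readers who want the control flow spelled out, the two-step recovery described above can be sketched roughly as below. This is only a toy model under stated assumptions, not the actual patch: every type and function name (DemoWalRecord, DemoReplayState, demo_startup_recovery, demo_replay_xlog, demo_bgreplay_drain) is hypothetical, the WAL is an in-memory array, the "shared memory" slot is a plain struct, and "up to minRecoveryLoc" is interpreted as "stop once the last replayed position is at or past it".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a WAL record; NOT PostgreSQL's XLogRecord. */
typedef struct {
    unsigned long lsn;        /* position of this record in the log */
    bool safe_restart_point;  /* replay may pause after this record */
} DemoWalRecord;

/* Hypothetical stand-in for the state handed over via shared memory. */
typedef struct {
    const DemoWalRecord *wal;        /* records that have arrived so far */
    size_t n_records;
    size_t next;                     /* index of the next record to replay */
    unsigned long last_replayed_lsn; /* the pointer the startup step posts */
} DemoReplayState;

/* Step 1 (startup process): replay only until the last replayed position
 * is at or past min_recovery_loc, then stop. No end-of-recovery checkpoint
 * is modeled, mirroring the readonly case in the post. */
static void demo_startup_recovery(DemoReplayState *st,
                                  unsigned long min_recovery_loc)
{
    while (st->next < st->n_records &&
           st->last_replayed_lsn < min_recovery_loc) {
        st->last_replayed_lsn = st->wal[st->next].lsn;
        st->next++;
    }
}

/* Step 2 (one call from bgreplay): returns false if no record is available;
 * otherwise replays at least one record and keeps going until a safe
 * restart point is reached or the available records run out. */
static bool demo_replay_xlog(DemoReplayState *st)
{
    if (st->next >= st->n_records)
        return false;

    for (;;) {
        const DemoWalRecord *rec = &st->wal[st->next++];

        assert(rec->lsn > st->last_replayed_lsn); /* log must move forward */
        st->last_replayed_lsn = rec->lsn;

        if (rec->safe_restart_point || st->next >= st->n_records)
            return true;
    }
}

/* bgreplay main loop, reduced to "replay until nothing is left";
 * returns the number of successful demo_replay_xlog calls. */
static size_t demo_bgreplay_drain(DemoReplayState *st)
{
    size_t calls = 0;

    while (demo_replay_xlog(st))
        calls++;
    return calls;
}

#ifdef DEMO_MAIN
/* Optional driver: compile with -DDEMO_MAIN to run the sketch standalone. */
int main(void)
{
    static const DemoWalRecord wal[] = {
        {100, false}, {200, true}, {300, false}, {400, true}
    };
    DemoReplayState st = { wal, 4, 0, 0 };

    demo_startup_recovery(&st, 150);  /* startup step */
    (void) demo_bgreplay_drain(&st);  /* continuous replay step */
    return st.next == 4 ? 0 : 1;
}
#endif
```

The sketch deliberately leaves out everything that is still open in the design: locking between replay and queries, cache invalidation, and waiting for new WAL to arrive (a real bgreplay would sleep and retry when the replay call returns false rather than exit its loop).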