On Thu, Sep 29, 2011 at 12:31 PM, Fujii Masao <[email protected]> wrote: > On Tue, Sep 27, 2011 at 8:06 PM, Florian Pflug <[email protected]> wrote: >> On Sep27, 2011, at 07:59 , Heikki Linnakangas wrote: >>> On 27.09.2011 00:28, Florian Pflug wrote: >>>> On Sep26, 2011, at 22:39 , Tom Lane wrote: >>>>> It might be worthwhile to invoke XLogCheckInvalidPages() as soon as >>>>> we (think we) have reached consistency, rather than leaving it to be >>>>> done only when we exit recovery mode. >>>> >>>> I believe we also need to prevent the creation of restart points before >>>> we've reached consistency. >>> >>> Seems reasonable. We could still allow restartpoints when the hash table is >>> empty, though. And once we've reached consistency, we can throw an error >>> immediately in log_invalid_page(), instead of adding the entry in the hash >>> table. >> >> That mimics the way the rm_safe_restartpoint callbacks work, which is good. >> >> Actually, why don't we use that machinery to implement this? There's >> currently no rm_safe_restartpoint callback for RM_XLOG_ID, so we'd just need >> to create one that checks whether invalid_page_tab is empty. > > Okay, the attached patch prevents the creation of restartpoints by using > rm_safe_restartpoint callback if we've not reached a consistent state yet > and the invalid-page table is not empty. But the invalid-page table is not > tied to the specific resource manager, so using rm_safe_restartpoint for > that seems to slightly odd. Is this OK? > > Also, according to other suggestions, the patch changes > XLogCheckInvalidPages() > so that it's called as soon as we've reached a consistent state, and changes > log_invalid_page() so that it emits PANIC immediately if consistency is > already > reached. These are very good changes, I think. Because they enable us to > notice serious problem which causes PANIC error immediately. Without these > changes, you unfortunately might notice that the standby database is corrupted > when failover happens. Though such a problem might rarely happen, I think it's > worth doing those changes.
Patch does everything we agreed it should. Good suggestion from Florian. This worries me slightly now though because the patch makes us PANIC in a place we didn't used to and once we do that we cannot restart the server at all. Are we sure we want that? It's certainly a great way to shake down errors in other code... -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
