On Tue, Jul 27, 2010 at 2:06 PM, Jeff Davis <pg...@j-davis.com> wrote:
> I reported a problem here:
>
> http://archives.postgresql.org/pgsql-bugs/2010-07/msg00173.php
>
> Perhaps I used a poor subject line, but I believe it's a serious issue.
> That reproducible sequence seems like an obvious bug to me on 8.3+, and
> what's worse, the corruption propagates to the standby as I found out
> today (through a test, fortunately).

I think that the problem is not so much your choice of subject line as
your misfortune to discover this bug when Tom and Heikki were both on
vacation.

> The only mitigating factor is that it doesn't actually lose data, and
> you can fix it (I believe) with zero_damaged_pages (or careful use of
> dd).
>
> There are two fixes that I can see:
>
> 1. Have log_newpage() and heap_xlog_newpage() only call PageSetLSN() and
> PageSetTLI() if the page is not new. This seems slightly awkward because
> most WAL replay stuff doesn't have to worry about zero pages, but in
> this case I think it does.
>
> 2. Have copy_relation_data() initialize new pages. I don't like this
> because (a) it's not really the job of SET TABLESPACE to clean up zero
> pages; and (b) it could be an index with different special size, etc.,
> and it doesn't seem like a good place to figure that out.

It appears to me that all of the callers of log_newpage() other than
copy_relation_data() do so with pages that they've just constructed,
and which therefore can't be new. So maybe we could just modify
copy_relation_data() to check PageIsNew(buf), or something like that,
and only call log_newpage() if the page is not new.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
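
For illustration only, here is a minimal, self-contained sketch of the
guard discussed above: when copying a block, skip the WAL record for a
page that is still all zeros, so that nothing stamps an LSN onto it.
This is not PostgreSQL source. ToyPage, toy_page_is_new(),
toy_log_newpage() and toy_copy_block() are hypothetical stand-ins, and
the header layout is simplified; the only thing borrowed from the real
code is the idea that a page counts as "new" when its pd_upper field is
zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192             /* PostgreSQL's default block size */

/* Simplified stand-in for a page header; NOT the real PageHeaderData. */
typedef struct
{
    uint64_t    lsn;            /* stand-in for pd_lsn */
    uint16_t    lower;          /* stand-in for pd_lower */
    uint16_t    upper;          /* stand-in for pd_upper; 0 on a zero page */
} ToyPageHeader;

/* A page is a fixed-size block whose first bytes are the header. */
typedef union
{
    ToyPageHeader hdr;
    unsigned char bytes[BLCKSZ];
} ToyPage;

/* Mirrors the idea behind PageIsNew(): an uninitialized page has upper == 0. */
static bool
toy_page_is_new(const ToyPage *page)
{
    return page->hdr.upper == 0;
}

/* Hypothetical stand-in for log_newpage(): it stamps an LSN on the page. */
static void
toy_log_newpage(ToyPage *page, uint64_t lsn)
{
    page->hdr.lsn = lsn;
}

/*
 * Copy one block from src to dst.  The copy is "WAL-logged" only if the
 * page is not new, so an all-zero page stays all-zero at the destination.
 * Returns true if the page was logged.
 */
static bool
toy_copy_block(const ToyPage *src, ToyPage *dst, uint64_t next_lsn)
{
    assert(src != NULL && dst != NULL);

    memcpy(dst, src, sizeof(ToyPage));

    if (toy_page_is_new(dst))
        return false;           /* zero page: copy it, but do not log it */

    toy_log_newpage(dst, next_lsn);
    return true;
}

/* Helper for checking that a page has not been touched at all. */
static bool
toy_page_is_all_zeros(const ToyPage *page)
{
    for (size_t i = 0; i < BLCKSZ; i++)
    {
        if (page->bytes[i] != 0)
            return false;
    }
    return true;
}
```

With the guard in place, copying a zero page leaves a zero page behind,
while an initialized page is copied and logged as before. Whether the
real fix belongs in copy_relation_data() or in the replay path is the
open question in this thread; the sketch only shows the shape of the
first option.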