On Wed, Jul 28, 2010 at 12:23 AM, Jeff Davis <pg...@j-davis.com> wrote: > On Tue, 2010-07-27 at 15:23 -0700, Jeff Davis wrote: >> On Tue, 2010-07-27 at 17:18 -0400, Robert Haas wrote: >> > > My first concern with that idea was that it may create an inconsistency >> > > between the primary and the standby. The primary could have a bunch of >> > > zero pages that never make it to the standby. >> > >> > Maybe I'm slow on the uptake here, but don't the pages start out >> > all-zeroes on the standby just as they do on the primary? The only way >> > it seems like this would be a problem is if a page that previously >> > contained data on the primary was subsequently zeroed without writing >> > a WAL record - or am I confused? >> >> The case I was concerned about is when you have a table on the primary >> with a bunch of zero pages at the end. Then you SET TABLESPACE, and none >> of the copied pages (or even the fact that they exist) would be sent to >> the standby, but they would exist on the primary. And later pages may >> have data, so the standby may see page N but not N-1. >> >> Generally, most of the code is not expecting to read or write past the >> end of the file, unless it's doing an extension. >> >> However, I think everything is fine during recovery, because it looks >> like it's designed to create zero pages as needed. So your idea seems >> safe to me, although I do still have some doubts because of my lack of >> knowledge in this area; particularly hot standby conflict >> detection/resolution. >> >> My idea was different: still log the zero page, just don't set LSN or >> TLI for a zero page in log_newpage() or heap_xlog_newpage(). This isn't >> as clean as your idea, but I'm a little more confident that it is >> correct. >> > > Both potential fixes attached and both appear to work. > > fix1 -- Only call PageSetLSN/TLI inside log_newpage() and > heap_xlog_newpage() if the page is not zeroed. > > fix2 -- Don't call log_newpage() at all if the page is not zeroed. > > Please review. I don't have a strong opinion about which one should be > applied.
Looks good. I still prefer fix2; it seems cleaner to me. It has another advantage, too, which is that copy_relation_data() is used ONLY by ALTER TABLE SET TABLESPACE. So if I stick to patching that function, that's the only thing I can possibly break, whereas log_newpage() is used in a bunch of other places. I don't think either fix is going to break anything at all, but considering that this is going to need back-patching, I'd rather be conservative. Speaking of back-patching, the subject line describes this as an 8.3+ problem, but it looks to me like this goes all the way back to 8.0. The code is a little different in 8.2 and prior, but ISTM it's vulnerable to the same issue. Don't we need a modified version of this patch for the 8.0 - 8.2 branches? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers