Re: [HACKERS] WAL & RC1 status
[EMAIL PROTECTED] (Nathan Myers) writes:
> It Seems to Me that after an orderly shutdown, the WAL files should be,
> effectively, slag -- they should contain no deltas from the current
> table contents. In practice that means the only part of the format that
> *should* matter is whatever it takes to discover that they really are
> slag.
> That *should* mean that, at worst, a change to the WAL file format should
> only require doing an orderly shutdown, and then (perhaps) running a simple
> program to generate a new-format empty WAL. It ought not to require an
> initdb.

Excellent point, considering that we were already thinking of making a
handy-dandy little utility to remove broken WAL files... Shouldn't take
much more than that to build something that also reformats pg_control.

Thanks for the suggestion!

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] WAL & RC1 status
> It Seems to Me that after an orderly shutdown, the WAL files should be,
> effectively, slag -- they should contain no deltas from the current
> table contents. In practice that means the only part of the format that
> *should* matter is whatever it takes to discover that they really are
> slag.
>
> That *should* mean that, at worst, a change to the WAL file format should
> only require doing an orderly shutdown, and then (perhaps) running a simple
> program to generate a new-format empty WAL. It ought not to require an
> initdb.
>
> Of course the details of the current implementation may interfere with
> that ideal, but it seems a worthy goal for the next beta, if it's not
> possible already. Given the opportunity to change the current WAL format,
> it ought to be possible to avoid even needing to run a program to generate
> an empty WAL.

This was my question too. If we are just changing WAL, why can't we just
have them stop the postmaster, install the new binaries, and restart?

Tom told me on the phone that there was a magic number in the WAL log
file, and I see it now:

    #define XLOG_PAGE_MAGIC 0x17345168

Couldn't we just have our new beta ignore WAL pages with this entry,
knowing that startup/shutdown creates new WAL files anyway?

Aside from inconveniencing the beta users, people can do testing easier
if we don't require a dump/reload for every WAL format change.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
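The idea of ignoring pages that carry the old magic number can be sketched
roughly as follows. This is only an illustration, not the xlog.c code: it
assumes the magic is a 32-bit value stored in native byte order at the very
start of each page, and XLOG_PAGE_MAGIC_NEW, WalPageKind and
classify_wal_page() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Magic number quoted in the message above. */
#define XLOG_PAGE_MAGIC     0x17345168

/* Hypothetical magic for a revised page format; not from the thread. */
#define XLOG_PAGE_MAGIC_NEW 0x17345169

typedef enum
{
    WAL_PAGE_CURRENT,   /* written in the new format, usable */
    WAL_PAGE_OLD,       /* written in the old format, to be ignored */
    WAL_PAGE_UNKNOWN    /* neither magic matched, treat as corrupt */
} WalPageKind;

/*
 * Classify a WAL page by its leading magic number.
 * Assumption: the magic occupies the first 4 bytes of the page.
 */
WalPageKind
classify_wal_page(const unsigned char *page, size_t len)
{
    uint32_t    magic;

    assert(page != NULL);

    if (len < sizeof(magic))
        return WAL_PAGE_UNKNOWN;

    /* memcpy avoids an unaligned read from the raw buffer */
    memcpy(&magic, page, sizeof(magic));

    if (magic == XLOG_PAGE_MAGIC_NEW)
        return WAL_PAGE_CURRENT;
    if (magic == XLOG_PAGE_MAGIC)
        return WAL_PAGE_OLD;
    return WAL_PAGE_UNKNOWN;
}
```

A startup routine could skip pages reported as WAL_PAGE_OLD, but that is
only safe if the previous shutdown was orderly, which is the premise of the
quoted proposal.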
Re: [HACKERS] WAL & RC1 status
On Fri, Mar 02, 2001 at 10:54:04AM -0500, Bruce Momjian wrote:
> > Bruce Momjian <[EMAIL PROTECTED]> writes:
> > > Is there a version number in the WAL file?
> >
> > catversion.h will do fine, no?
> >
> > > Can we put conditional code in there to create
> > > new log file records with an updated format?
> >
> > The WAL stuff is *far* too complex already. I've spent a week studying
> > it and I only partially understand it. I will not consent to trying to
> > support multiple log file formats concurrently.
>
> Well, I was thinking a few things. Right now, if we update the
> catversion.h, we will require a dump/reload. If we can update just the
> WAL version stamp, that will allow us to fix WAL format problems without
> requiring people to dump/reload. I can imagine this would be valuable
> if we find we need to make changes in 7.1.1, where we can not require
> dump/reload.

It Seems to Me that after an orderly shutdown, the WAL files should be,
effectively, slag -- they should contain no deltas from the current
table contents. In practice that means the only part of the format that
*should* matter is whatever it takes to discover that they really are
slag.

That *should* mean that, at worst, a change to the WAL file format should
only require doing an orderly shutdown, and then (perhaps) running a simple
program to generate a new-format empty WAL. It ought not to require an
initdb.

Of course the details of the current implementation may interfere with
that ideal, but it seems a worthy goal for the next beta, if it's not
possible already. Given the opportunity to change the current WAL format,
it ought to be possible to avoid even needing to run a program to generate
an empty WAL.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] WAL & RC1 status
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Well, I was thinking a few things. Right now, if we update the
> > catversion.h, we will require a dump/reload. If we can update just the
> > WAL version stamp, that will allow us to fix WAL format problems without
> > requiring people to dump/reload.
>
> Since there is not a separate WAL version stamp, introducing one now
> would certainly force an initdb. I don't mind adding one if you think
> it's useful; another 4 bytes in pg_control won't hurt anything. But
> it's not going to save anyone's bacon on this cycle.
>
> At least one of my concerns (single point of failure) would require a
> change to the layout of pg_control, which would force initdb anyway.
> Anyone want to propose a third version# for pg_control?

I now remember Hiroshi complaining about major WAL problems also,
particularly corrupt WAL files preventing the database from starting.

-- 
  Bruce Momjian
RE: [HACKERS] WAL & RC1 status
> From: Bruce Momjian [SMTP:[EMAIL PROTECTED]]
> Sent: Friday, March 02, 2001 9:54 AM
> To: Tom Lane
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: Re: [HACKERS] WAL & RC1 status
>
> > Bruce Momjian <[EMAIL PROTECTED]> writes:
> > > Is there a version number in the WAL file?
> >
> > catversion.h will do fine, no?
> >
> > > Can we put conditional code in there to create
> > > new log file records with an updated format?
> >
>

While it may be unfortunate to have to do an initdb at this point in the
beta cycle, it is a beta and that is part of the deal.

PostgreSQL has the reputation of being the highest quality open-source
database and we should do nothing to tarnish that. Release it when it's
ready and not before.
Re: [HACKERS] WAL & RC1 status
> I've been going through the WAL code, trying to understand it and
> document it. I've found a number of minor problems and several major
> ones ("major" meaning "can't really fix without an incompatible file
> format change, hence initdb"). I've reported the major problems to
> the mailing lists but gotten almost no feedback about what to do.

Sorry for the "no feedback", but I've assumed that this will be more
productively discussed with Vadim in the loop. I don't disagree with
your observations, but of course that is from a position of happy
ignorance :)

> ... I want to veto putting out an RC1 until these issues are resolved...
> comments?

OK with me.

                     - Thomas
Re: [HACKERS] WAL & RC1 status
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Well, I was thinking a few things. Right now, if we update the
> > catversion.h, we will require a dump/reload. If we can update just the
> > WAL version stamp, that will allow us to fix WAL format problems without
> > requiring people to dump/reload.
>
> Since there is not a separate WAL version stamp, introducing one now
> would certainly force an initdb. I don't mind adding one if you think
> it's useful; another 4 bytes in pg_control won't hurt anything. But
> it's not going to save anyone's bacon on this cycle.

Having a version number in binary files has saved me many times because
I can add a little 'if' to allow upward binary compatibility without
breaking old binary files. I think we should have one. I see one in our
btree files, but I don't see one in heap. I am going to recommend that
for 7.2. All our files should have versions just in case we ever need
it.

Some day, we may be able to skip dump/reload for major versions.

-- 
  Bruce Momjian
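The "little 'if'" approach to upward binary compatibility might look
something like the sketch below. Everything in it is hypothetical and made
up for illustration (the FileHeader layout, the FILE_FORMAT_V1 and
FILE_FORMAT_V2 constants, read_file_header()); it does not describe any
actual PostgreSQL file format. It assumes 4-byte fields in native byte
order, with version 2 appending one extra field that version 1 files lack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk format versions. */
#define FILE_FORMAT_V1 1u
#define FILE_FORMAT_V2 2u   /* adds a 4-byte flags word after nitems */

typedef struct
{
    uint32_t    version;    /* format version stamp, first field on disk */
    uint32_t    nitems;     /* present in both versions */
    uint32_t    flags;      /* stored on disk only from V2 onward */
} FileHeader;

/*
 * Parse a header from a raw buffer.
 * Returns 0 on success, -1 for a short buffer or an unknown version.
 */
int
read_file_header(const unsigned char *buf, size_t len, FileHeader *out)
{
    assert(buf != NULL && out != NULL);

    if (len < 8)
        return -1;
    memcpy(&out->version, buf, 4);
    memcpy(&out->nitems, buf + 4, 4);

    if (out->version == FILE_FORMAT_V1)
    {
        /* The "little if": old files get a default for the new field. */
        out->flags = 0;
        return 0;
    }
    if (out->version == FILE_FORMAT_V2)
    {
        if (len < 12)
            return -1;
        memcpy(&out->flags, buf + 8, 4);
        return 0;
    }
    return -1;              /* written by software we do not understand */
}
```

The reader keeps accepting old files, while anything with an unrecognised
stamp is rejected rather than misread.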
Re: [HACKERS] WAL & RC1 status
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Well, I was thinking a few things. Right now, if we update the
> catversion.h, we will require a dump/reload. If we can update just the
> WAL version stamp, that will allow us to fix WAL format problems without
> requiring people to dump/reload.

Since there is not a separate WAL version stamp, introducing one now
would certainly force an initdb. I don't mind adding one if you think
it's useful; another 4 bytes in pg_control won't hurt anything. But
it's not going to save anyone's bacon on this cycle.

At least one of my concerns (single point of failure) would require a
change to the layout of pg_control, which would force initdb anyway.
Anyone want to propose a third version# for pg_control?

regards, tom lane
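How a separate WAL version stamp could change the startup decision is
sketched below. The struct is not the real pg_control layout, and the
constants, ControlStamps, StartupAction and check_stamps() are all
hypothetical names with made-up values. It only shows the distinction being
discussed: a catalog mismatch forces initdb, while a WAL-only mismatch might
be handled by regenerating the log after an orderly shutdown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compiled-in version stamps; the values are made up. */
#define MY_CATALOG_VERSION 200103021u
#define MY_WAL_VERSION     2u

/* Hypothetical subset of a control file, NOT the real pg_control. */
typedef struct
{
    uint32_t    catalog_version;    /* stamp in the style of catversion.h */
    uint32_t    wal_version;        /* the proposed extra 4 bytes */
} ControlStamps;

typedef enum
{
    START_OK,           /* both stamps match the running binaries */
    NEED_WAL_RESET,     /* only the WAL format differs */
    NEED_INITDB         /* catalog layout differs, dump/reload required */
} StartupAction;

/* Decide what the server would have to do given the on-disk stamps. */
StartupAction
check_stamps(const ControlStamps *c)
{
    assert(c != NULL);

    if (c->catalog_version != MY_CATALOG_VERSION)
        return NEED_INITDB;
    if (c->wal_version != MY_WAL_VERSION)
        return NEED_WAL_RESET;
    return START_OK;
}
```

As noted in the message, adding such a field changes the control file
layout itself, so it cannot avoid the initdb needed to introduce it.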
Re: [HACKERS] WAL & RC1 status
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Is there a version number in the WAL file?

catversion.h will do fine, no?

> Can we put conditional code in there to create
> new log file records with an updated format?

The WAL stuff is *far* too complex already. I've spent a week studying
it and I only partially understand it. I will not consent to trying to
support multiple log file formats concurrently.

regards, tom lane
Re: [HACKERS] WAL & RC1 status
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Is there a version number in the WAL file?
>
> catversion.h will do fine, no?
>
> > Can we put conditional code in there to create
> > new log file records with an updated format?
>
> The WAL stuff is *far* too complex already. I've spent a week studying
> it and I only partially understand it. I will not consent to trying to
> support multiple log file formats concurrently.

Well, I was thinking a few things. Right now, if we update the
catversion.h, we will require a dump/reload. If we can update just the
WAL version stamp, that will allow us to fix WAL format problems without
requiring people to dump/reload. I can imagine this would be valuable
if we find we need to make changes in 7.1.1, where we can not require
dump/reload.

-- 
  Bruce Momjian
Re: [HACKERS] WAL & RC1 status
> I am *not* feeling good about pushing out an RC1 release candidate
> today.
>
> I've been going through the WAL code, trying to understand it and
> document it. I've found a number of minor problems and several major
> ones ("major" meaning "can't really fix without an incompatible file
> format change, hence initdb"). I've reported the major problems to
> the mailing lists but gotten almost no feedback about what to do.
>
> In addition, I'm still looking for the bug that I originally went in to
> find: Scott Parish's report of being unable to restart after a normal
> shutdown of beta4. Examination of his WAL log shows some pretty serious
> lossage (see attached dump). My current theory is that the
> buffer-slinging logic in xlog.c dropped one or more whole buffers' worth
> of log records, but I haven't figured out exactly how.
>
> I want to veto putting out an RC1 until these issues are resolved...
> comments?

I was not sure how to respond. Requiring an initdb at this stage seems
like it could be a pretty major blow to beta testers. However, if we
will have 7.1 problems with WAL that can not be fixed without a file
format change, we will have problems down the road.

Is there a version number in the WAL file? Can we put conditional code
in there to create new log file records with an updated format?

-- 
  Bruce Momjian
Re: [HACKERS] WAL & RC1 status
On Fri, 2 Mar 2001, Tom Lane wrote:
> I am *not* feeling good about pushing out an RC1 release candidate
> today.
>
> I've been going through the WAL code, trying to understand it and
> document it. I've found a number of minor problems and several major
> ones ("major" meaning "can't really fix without an incompatible file
> format change, hence initdb"). I've reported the major problems to
> the mailing lists but gotten almost no feedback about what to do.
>
> In addition, I'm still looking for the bug that I originally went in to
> find: Scott Parish's report of being unable to restart after a normal
> shutdown of beta4. Examination of his WAL log shows some pretty serious
> lossage (see attached dump). My current theory is that the
> buffer-slinging logic in xlog.c dropped one or more whole buffers' worth
> of log records, but I haven't figured out exactly how.
>
> I want to veto putting out an RC1 until these issues are resolved...
> comments?

Will second it ... Vadim is supposed to be back on the 6th, and Peter
has a couple of changes to configure he wants to do this weekend for the
JDBC stuff ...

Thomas and I are in SF the end of next week for some meetings, so if you
can pop off a summary of what you've found to either of us, and assuming
that Vadim doesn't get caught up by then, we can bring them up "in
person" at that time ... ?