Re: [HACKERS] WAL & RC1 status

2001-03-02 Thread Tom Lane

[EMAIL PROTECTED] (Nathan Myers) writes:
> It Seems to Me that after an orderly shutdown, the WAL files should be, 
> effectively, slag -- they should contain no deltas from the current 
> table contents.  In practice that means the only part of the format that 
> *should* matter is whatever it takes to discover that they really are 
> slag.

> That *should* mean that, at worst, a change to the WAL file format should 
> only require doing an orderly shutdown, and then (perhaps) running a simple
> program to generate a new-format empty WAL.  It ought not to require an 
> initdb.  

Excellent point, considering that we were already thinking of making a
handy-dandy little utility to remove broken WAL files...  Shouldn't take
much more than that to build something that also reformats pg_control.
Thanks for the suggestion!

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



Re: [HACKERS] WAL & RC1 status

2001-03-02 Thread Bruce Momjian

> It Seems to Me that after an orderly shutdown, the WAL files should be, 
> effectively, slag -- they should contain no deltas from the current 
> table contents.  In practice that means the only part of the format that 
> *should* matter is whatever it takes to discover that they really are 
> slag.
> 
> That *should* mean that, at worst, a change to the WAL file format should 
> only require doing an orderly shutdown, and then (perhaps) running a simple
> program to generate a new-format empty WAL.  It ought not to require an 
> initdb.  
> 
> Of course the details of the current implementation may interfere with
> that ideal, but it seems a worthy goal for the next beta, if it's not
> possible already.  Given the opportunity to change the current WAL format, 
> it ought to be possible to avoid even needing to run a program to generate 
> an empty WAL.

This was my question too.  If we are just changing WAL, why can't we
just have them stop the postmaster, install the new binaries, and
restart?

Tom told me on the phone that there was a magic number in the WAL log
file, and I see it now:

#define XLOG_PAGE_MAGIC 0x17345168

Couldn't we just have our new beta ignore WAL pages with this entry,
knowing that startup/shutdown creates new WAL files anyway?
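
A minimal sketch of the check being proposed, for illustration only.  It
assumes the magic number is the first field of each WAL page; the header
struct, the second magic value, and the function name are hypothetical and
are not taken from xlog.c.  Skipping an old-format page is only safe under
the assumption discussed in this thread, namely that the pages hold nothing
needed after an orderly shutdown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XLOG_PAGE_MAGIC 0x17345168      /* value quoted in the message above */
#define XLOG_PAGE_MAGIC_NEW 0x17345169  /* hypothetical stamp for a changed format */

/* Hypothetical page header; the real layout may differ. */
typedef struct {
    uint32_t magic;
} SketchPageHeader;

typedef enum {
    PAGE_CURRENT,     /* written by the new binaries: replay normally */
    PAGE_OLD_FORMAT,  /* written by the old binaries: candidate for skipping */
    PAGE_INVALID      /* neither stamp: treat as corrupt or uninitialized */
} PageStatus;

/* Classify a raw WAL page by the magic number at its start. */
static PageStatus classify_page(const unsigned char *page, size_t len)
{
    SketchPageHeader hdr;

    assert(page != NULL);
    if (len < sizeof hdr)
        return PAGE_INVALID;

    /* Copy rather than cast, so the buffer needs no particular alignment. */
    memcpy(&hdr, page, sizeof hdr);

    if (hdr.magic == XLOG_PAGE_MAGIC_NEW)
        return PAGE_CURRENT;
    if (hdr.magic == XLOG_PAGE_MAGIC)
        return PAGE_OLD_FORMAT;
    return PAGE_INVALID;
}
```

A startup routine built on this would replay PAGE_CURRENT pages, skip
PAGE_OLD_FORMAT pages, and refuse to continue on PAGE_INVALID.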

Aside from sparing the beta users the inconvenience, people can test
more easily if we don't require a dump/reload for every WAL format change.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] WAL & RC1 status

2001-03-02 Thread Nathan Myers

On Fri, Mar 02, 2001 at 10:54:04AM -0500, Bruce Momjian wrote:
> > Bruce Momjian <[EMAIL PROTECTED]> writes:
> > > Is there a version number in the WAL file?
> > 
> > catversion.h will do fine, no?
> > 
> > > Can we put conditional code in there to create
> > > new log file records with an updated format?
> > 
> > The WAL stuff is *far* too complex already.  I've spent a week studying
> > it and I only partially understand it.  I will not consent to trying to
> > support multiple log file formats concurrently.
> 
> Well, I was thinking a few things.  Right now, if we update the
> catversion.h, we will require a dump/reload.  If we can update just the
> WAL version stamp, that will allow us to fix WAL format problems without
> requiring people to dump/reload.  I can imagine this would be valuable
> if we find we need to make changes in 7.1.1, where we can not require
> dump/reload.

It Seems to Me that after an orderly shutdown, the WAL files should be, 
effectively, slag -- they should contain no deltas from the current 
table contents.  In practice that means the only part of the format that 
*should* matter is whatever it takes to discover that they really are 
slag.

That *should* mean that, at worst, a change to the WAL file format should 
only require doing an orderly shutdown, and then (perhaps) running a simple
program to generate a new-format empty WAL.  It ought not to require an 
initdb.  

Of course the details of the current implementation may interfere with
that ideal, but it seems a worthy goal for the next beta, if it's not
possible already.  Given the opportunity to change the current WAL format, 
it ought to be possible to avoid even needing to run a program to generate 
an empty WAL.

Nathan Myers
[EMAIL PROTECTED]




Re: [HACKERS] WAL & RC1 status

2001-03-02 Thread Bruce Momjian

> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Well, I was thinking a few things.  Right now, if we update the
> > catversion.h, we will require a dump/reload.  If we can update just the
> > WAL version stamp, that will allow us to fix WAL format problems without
> > requiring people to dump/reload.
> 
> Since there is not a separate WAL version stamp, introducing one now
> would certainly force an initdb.  I don't mind adding one if you think
> it's useful; another 4 bytes in pg_control won't hurt anything.  But
> it's not going to save anyone's bacon on this cycle.
> 
> At least one of my concerns (single point of failure) would require a
> change to the layout of pg_control, which would force initdb anyway.
> Anyone want to propose a third version# for pg_control?

I now remember Hiroshi complaining about major WAL problems also,
particularly corrupt WAL files preventing the database from starting.





RE: [HACKERS] WAL & RC1 status

2001-03-02 Thread Matthew



> From: Bruce Momjian [SMTP:[EMAIL PROTECTED]]
> Sent: Friday, March 02, 2001 9:54 AM
> To:   Tom Lane
> Cc:   [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject:      Re: [HACKERS] WAL & RC1 status
> 
> > Bruce Momjian <[EMAIL PROTECTED]> writes:
> > > Is there a version number in the WAL file?
> > 
> > catversion.h will do fine, no?
> > 
> > > Can we put conditional code in there to create
> > > new log file records with an updated format?
> > 
> 
While it may be unfortunate to have to do an initdb at this point in
the beta cycle, it is a beta, and that is part of the deal.  PostgreSQL has the
reputation of being the highest-quality open-source database, and we should do
nothing to tarnish that.  Release it when it's ready and not before.




Re: [HACKERS] WAL & RC1 status

2001-03-02 Thread Thomas Lockhart

> I've been going through the WAL code, trying to understand it and
> document it.  I've found a number of minor problems and several major
> ones ("major" meaning "can't really fix without an incompatible file
> format change, hence initdb").  I've reported the major problems to
> the mailing lists but gotten almost no feedback about what to do.

Sorry for the "no feedback", but I've assumed that this will be more
productively discussed with Vadim in the loop. I don't disagree with
your observations, but of course that is from a position of happy
ignorance :)

> ... I want to veto putting out an RC1 until these issues are resolved...
> comments?

OK with me.

- Thomas




Re: [HACKERS] WAL & RC1 status

2001-03-02 Thread Bruce Momjian

> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Well, I was thinking a few things.  Right now, if we update the
> > catversion.h, we will require a dump/reload.  If we can update just the
> > WAL version stamp, that will allow us to fix WAL format problems without
> > requiring people to dump/reload.
> 
> Since there is not a separate WAL version stamp, introducing one now
> would certainly force an initdb.  I don't mind adding one if you think
> it's useful; another 4 bytes in pg_control won't hurt anything.  But
> it's not going to save anyone's bacon on this cycle.

Having a version number in binary files has saved me many times, because
I can add a little 'if' to allow upward binary compatibility without
breaking old binary files.  I think we should have one.
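
As a rough illustration of the "little 'if'" idea, here is a sketch of a
reader for a file header that gained a field in its second version.  The
header layout, version constants, and function name are all hypothetical;
nothing here reflects an actual PostgreSQL file format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_VERSION_1 1u  /* hypothetical: header is {version, value} */
#define SKETCH_VERSION_2 2u  /* hypothetical: header adds an 'extra' field */

typedef struct {
    uint32_t version;
    uint32_t value;
    uint32_t extra;  /* present on disk only from version 2 on */
} SketchHeader;

/*
 * Parse a header from buf.  Returns 0 on success, -1 if the buffer is too
 * short or the version is unknown.  Fields are read in host byte order.
 */
static int read_header(const unsigned char *buf, size_t len, SketchHeader *out)
{
    const size_t word = sizeof(uint32_t);

    assert(buf != NULL && out != NULL);
    if (len < 2 * word)
        return -1;

    memcpy(&out->version, buf, word);
    memcpy(&out->value, buf + word, word);

    if (out->version == SKETCH_VERSION_1) {
        /* The "little if": old files lack the field, so use a default. */
        out->extra = 0;
        return 0;
    }
    if (out->version == SKETCH_VERSION_2) {
        if (len < 3 * word)
            return -1;
        memcpy(&out->extra, buf + 2 * word, word);
        return 0;
    }
    return -1;  /* written by something newer or not a valid header */
}
```

The point of the stamp is that the reader can tell the two layouts apart
instead of guessing, so old files stay readable after the format grows.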

I see one in our btree files, but I don't see one in heap files.  I am
going to recommend that for 7.2.  All our files should have version
numbers just in case we ever need them.  Some day, we may be able to skip
dump/reload for major versions.





Re: [HACKERS] WAL & RC1 status

2001-03-02 Thread Tom Lane

Bruce Momjian <[EMAIL PROTECTED]> writes:
> Well, I was thinking a few things.  Right now, if we update the
> catversion.h, we will require a dump/reload.  If we can update just the
> WAL version stamp, that will allow us to fix WAL format problems without
> requiring people to dump/reload.

Since there is not a separate WAL version stamp, introducing one now
would certainly force an initdb.  I don't mind adding one if you think
it's useful; another 4 bytes in pg_control won't hurt anything.  But
it's not going to save anyone's bacon on this cycle.

At least one of my concerns (single point of failure) would require a
change to the layout of pg_control, which would force initdb anyway.
Anyone want to propose a third version# for pg_control?
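
To make the three stamps under discussion concrete, here is a sketch of how
a startup check might use them.  The struct is hypothetical and is not the
real pg_control layout; the policy it encodes (a WAL-only mismatch needs a
WAL reset rather than an initdb) is the idea proposed in this thread, not
existing behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control-file stamps; not the real pg_control layout. */
typedef struct {
    uint32_t control_version;  /* layout of this structure itself */
    uint32_t catalog_version;  /* plays the role of catversion.h */
    uint32_t wal_version;      /* the proposed extra 4 bytes */
} SketchControl;

typedef enum {
    START_OK,        /* all stamps match the running binaries */
    NEED_WAL_RESET,  /* only the WAL format changed */
    NEED_INITDB      /* control layout or catalogs changed */
} StartAction;

/* Compare the stamps found on disk with those the binaries expect. */
static StartAction check_stamps(const SketchControl *on_disk,
                                const SketchControl *expected)
{
    assert(on_disk != NULL && expected != NULL);

    /* If the control layout differs, the other fields cannot be trusted. */
    if (on_disk->control_version != expected->control_version)
        return NEED_INITDB;
    if (on_disk->catalog_version != expected->catalog_version)
        return NEED_INITDB;
    if (on_disk->wal_version != expected->wal_version)
        return NEED_WAL_RESET;
    return START_OK;
}
```

As noted above, adding the WAL stamp is itself a layout change, so the
sketch only pays off for format changes made after the stamp exists.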

regards, tom lane




Re: [HACKERS] WAL & RC1 status

2001-03-02 Thread Tom Lane

Bruce Momjian <[EMAIL PROTECTED]> writes:
> Is there a version number in the WAL file?

catversion.h will do fine, no?

> Can we put conditional code in there to create
> new log file records with an updated format?

The WAL stuff is *far* too complex already.  I've spent a week studying
it and I only partially understand it.  I will not consent to trying to
support multiple log file formats concurrently.

regards, tom lane




Re: [HACKERS] WAL & RC1 status

2001-03-02 Thread Bruce Momjian

> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Is there a version number in the WAL file?
> 
> catversion.h will do fine, no?
> 
> > Can we put conditional code in there to create
> > new log file records with an updated format?
> 
> The WAL stuff is *far* too complex already.  I've spent a week studying
> it and I only partially understand it.  I will not consent to trying to
> support multiple log file formats concurrently.

Well, I was thinking a few things.  Right now, if we update the
catversion.h, we will require a dump/reload.  If we can update just the
WAL version stamp, that will allow us to fix WAL format problems without
requiring people to dump/reload.  I can imagine this would be valuable
if we find we need to make changes in 7.1.1, where we can not require
dump/reload.





Re: [HACKERS] WAL & RC1 status

2001-03-02 Thread Bruce Momjian

> I am *not* feeling good about pushing out an RC1 release candidate
> today.
> 
> I've been going through the WAL code, trying to understand it and
> document it.  I've found a number of minor problems and several major
> ones ("major" meaning "can't really fix without an incompatible file
> format change, hence initdb").  I've reported the major problems to
> the mailing lists but gotten almost no feedback about what to do.
> 
> In addition, I'm still looking for the bug that I originally went in to
> find: Scott Parish's report of being unable to restart after a normal
> shutdown of beta4.  Examination of his WAL log shows some pretty serious
> lossage (see attached dump).  My current theory is that the
> buffer-slinging logic in xlog.c dropped one or more whole buffers' worth
> of log records, but I haven't figured out exactly how.
> 
> I want to veto putting out an RC1 until these issues are resolved...
> comments?

I was not sure how to respond.  Requiring an initdb at this stage seems
like it could be a pretty major blow to beta testers.  However, if we
will have 7.1 problems with WAL that can not be fixed without a file
format change, we will have problems down the road.  Is there a version
number in the WAL file?  Can we put conditional code in there to create
new log file records with an updated format?





Re: [HACKERS] WAL & RC1 status

2001-03-02 Thread The Hermit Hacker

On Fri, 2 Mar 2001, Tom Lane wrote:

> I am *not* feeling good about pushing out an RC1 release candidate
> today.
>
> I've been going through the WAL code, trying to understand it and
> document it.  I've found a number of minor problems and several major
> ones ("major" meaning "can't really fix without an incompatible file
> format change, hence initdb").  I've reported the major problems to
> the mailing lists but gotten almost no feedback about what to do.
>
> In addition, I'm still looking for the bug that I originally went in to
> find: Scott Parish's report of being unable to restart after a normal
> shutdown of beta4.  Examination of his WAL log shows some pretty serious
> lossage (see attached dump).  My current theory is that the
> buffer-slinging logic in xlog.c dropped one or more whole buffers' worth
> of log records, but I haven't figured out exactly how.
>
> I want to veto putting out an RC1 until these issues are resolved...
> comments?

Will second it ... Vadim is supposed to be back on the 6th, and Peter has
a couple of changes to configure he wants to do this weekend for the JDBC
stuff ... Thomas and I are in SF the end of next week for some meetings,
so if you can pop off a summary of what you've found to either of us, and
assuming that Vadim doesn't get caught up by then, we can bring them up
"in person" at that time ... ?


