On Monday, December 19, 2011 12:10:11 PM Simon Riggs wrote:
The only sensible way to handle this is to change the page format as
discussed. IMHO the only sensible way that can happen is if we also
support an online upgrade feature. I will take on the online upgrade
feature if others work on
On Mon, Dec 19, 2011 at 6:10 AM, Simon Riggs si...@2ndquadrant.com wrote:
Throwing WARNINGs for normal events would not help anybody; thousands
of false positives would just make Postgres appear to be less robust
than it really is. That would be a credibility disaster. VMWare
already have
* Aidan Van Dyk (ai...@highrise.ca) wrote:
But the scary part is you don't know how long *ago* the crash was.
Because a hint-bit-only change w/ a torn-page is a non event in
PostgreSQL *DESIGN*, on crash recovery, it doesn't do anything to try
and scrub every page in the database.
Fair
* Aidan Van Dyk (ai...@highrise.ca) wrote:
#) Anybody investigated putting the CRC in a relation fork, but not
right in the data block? If the CRC contains a timestamp, and is WAL
logged before the write, at least on reading a block with a wrong
checksum, if a warning is emitted, the
Excerpts from Stephen Frost's message of Mon Dec 19 11:18:21 -0300 2011:
* Aidan Van Dyk (ai...@highrise.ca) wrote:
#) Anybody investigated putting the CRC in a relation fork, but not
right in the data block? If the CRC contains a timestamp, and is WAL
logged before the write, at least on
On Mon, Dec 19, 2011 at 9:14 AM, Stephen Frost sfr...@snowman.net wrote:
* Aidan Van Dyk (ai...@highrise.ca) wrote:
But the scary part is you don't know how long *ago* the crash was.
Because a hint-bit-only change w/ a torn-page is a non event in
PostgreSQL *DESIGN*, on crash recovery, it
On Mon, Dec 19, 2011 at 09:34:51AM -0500, Robert Haas wrote:
On Mon, Dec 19, 2011 at 9:14 AM, Stephen Frost sfr...@snowman.net wrote:
* Aidan Van Dyk (ai...@highrise.ca) wrote:
But the scary part is you don't know how long *ago* the crash was.
Because a hint-bit-only change w/ a torn-page
On Monday, December 19, 2011 03:33:22 PM Alvaro Herrera wrote:
Excerpts from Stephen Frost's message of Mon Dec 19 11:18:21 -0300 2011:
* Aidan Van Dyk (ai...@highrise.ca) wrote:
#) Anybody investigated putting the CRC in a relation fork, but not
right in the data block? If the CRC
* David Fetter (da...@fetter.org) wrote:
On Mon, Dec 19, 2011 at 09:34:51AM -0500, Robert Haas wrote:
On Mon, Dec 19, 2011 at 9:14 AM, Stephen Frost sfr...@snowman.net wrote:
Fair enough, but, could we distinguish these two cases? In other words,
would it be possible to detect if a page
* Andres Freund (and...@anarazel.de) wrote:
On Monday, December 19, 2011 03:33:22 PM Alvaro Herrera wrote:
I do like the idea of putting the CRC info in a relation fork, if it can
be made to work decently, as we might be able to then support it on a
per-relation basis, and maybe even
On 12/19/2011 07:50 AM, Robert Haas wrote:
On Mon, Dec 19, 2011 at 6:10 AM, Simon Riggs si...@2ndquadrant.com wrote:
The only sensible way to handle this is to change the page format as
discussed. IMHO the only sensible way that can happen is if we also
support an online upgrade feature. I will
Greg Smith g...@2ndquadrant.com wrote:
2) Rework hint bits to make the torn page problem go away.
Checksums go elsewhere? More WAL logging to eliminate the bad
situations? Eliminate some types of hint bit writes? It seems
every alternative has trade-offs that will require serious
On Mon, Dec 19, 2011 at 12:07 PM, David Fetter da...@fetter.org wrote:
On Mon, Dec 19, 2011 at 09:34:51AM -0500, Robert Haas wrote:
On Mon, Dec 19, 2011 at 9:14 AM, Stephen Frost sfr...@snowman.net wrote:
* Aidan Van Dyk (ai...@highrise.ca) wrote:
But the scary part is you don't know how
On Mon, Dec 19, 2011 at 2:16 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
It seems to me that on a typical production system you would
probably have zero or one such page per OS crash, with zero being
far more likely than one. If we can get that one fixed (if it
exists) before enough
Robert Haas robertmh...@gmail.com wrote:
On Mon, Dec 19, 2011 at 2:16 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
It seems to me that on a typical production system you would
probably have zero or one such page per OS crash, with zero being
far more likely than one. If we can get
On 12/19/2011 02:44 PM, Kevin Grittner wrote:
I was thinking that we would warn when such was found, set hint bits
as needed, and rewrite with the new CRC. In the unlikely event that
it was a torn hint-bit-only page update, it would be a warning about
something which is a benign side-effect of
On 19.12.2011 21:27, Robert Haas wrote:
To put this another way, we currently WAL-log just about everything.
We get away with NOT WAL-logging some things when we don't care about
whether they make it to disk. Hint bits, killed index tuple pointers,
etc. cause no harm if they don't get written
Greg Smith g...@2ndquadrant.com wrote:
But if you need all that infrastructure just to get the feature
launched, that's a bit hard to stomach.
Triggering a vacuum or some hypothetical scrubbing feature?
Also, as someone who follows Murphy's Law as my chosen religion,
If you don't think
On 17.12.2011 23:33, David Fetter wrote:
What:
Please find attached a patch for 9.2-to-be which implements page
checksums. It changes the page format, so it's an initdb-forcing
change.
How:
In order to ensure that the checksum actually matches the hint
bits, this
On Sun, Dec 18, 2011 at 10:14:38AM +0200, Heikki Linnakangas wrote:
On 17.12.2011 23:33, David Fetter wrote:
What:
Please find attached a patch for 9.2-to-be which implements page
checksums. It changes the page format, so it's an initdb-forcing
change.
How:
In order
On 18.12.2011 10:54, David Fetter wrote:
On Sun, Dec 18, 2011 at 10:14:38AM +0200, Heikki Linnakangas wrote:
On 17.12.2011 23:33, David Fetter wrote:
If this introduces new failure modes, please detail, and preferably
demonstrate, just what those new modes are.
Hint bits, torn pages -
On Sun, Dec 18, 2011 at 12:19:32PM +0200, Heikki Linnakangas wrote:
On 18.12.2011 10:54, David Fetter wrote:
On Sun, Dec 18, 2011 at 10:14:38AM +0200, Heikki Linnakangas wrote:
On 17.12.2011 23:33, David Fetter wrote:
If this introduces new failure modes, please detail, and preferably
On 18.12.2011 20:44, David Fetter wrote:
On Sun, Dec 18, 2011 at 12:19:32PM +0200, Heikki Linnakangas wrote:
On 18.12.2011 10:54, David Fetter wrote:
On Sun, Dec 18, 2011 at 10:14:38AM +0200, Heikki Linnakangas wrote:
On 17.12.2011 23:33, David Fetter wrote:
If this introduces new failure
On Sun, 2011-12-18 at 21:34 +0200, Heikki Linnakangas wrote:
On 18.12.2011 20:44, David Fetter wrote:
Any way to
simulate them, even if it's by injecting faults into the source code?
Hmm, it's hard to persuade the OS to suffer a torn page on purpose. What
you could do is split the
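A torn page can at least be modelled outside the server. The following is a minimal sketch, not anything from the patch: it assumes a 512-byte sector size, and `torn_write` is a hypothetical helper that keeps the first sectors of the new page image and the remaining sectors of the old one, which is what a crash in the middle of a page write can leave on disk.

```python
import zlib

SECTOR = 512  # assumed sector size; real hardware may differ


def torn_write(old_page, new_page, sectors_written):
    """Return the on-disk image if only the first sectors reached disk."""
    cut = sectors_written * SECTOR
    return new_page[:cut] + old_page[cut:]


# Toy two-sector page; bytes 5 and 700 stand in for hint-bit flags.
old = bytes(1024)
new = bytearray(old)
new[5] = 1      # hint-bit change in the first sector
new[700] = 1    # hint-bit change in the second sector
new = bytes(new)

new_crc = zlib.crc32(new)          # checksum computed for the full new image
torn = torn_write(old, new, 1)     # only the first sector was written
torn_detected = zlib.crc32(torn) != new_crc
```

Here the torn image mixes new and old hint bits, so its CRC no longer matches the one computed for the new page, even though no user data was lost.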
On 2011-12-18 11:19, Heikki Linnakangas wrote:
The patch requires that full page writes be on in order to obviate
this problem by never reading a torn page.
Doesn't help. Hint bit updates are not WAL-logged.
I don't know if it would be seen as a half-baked feature or similar,
and I don't
On Sun, Dec 18, 2011 at 7:51 PM, Jesper Krogh jes...@krogh.cc wrote:
I don't know if it would be seen as a half-baked feature or similar,
and I don't know if the hint bit problem is solvable at all, but I could
easily imagine checksumming just skipping the hint bits entirely.
That was one
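The idea of a checksum that skips the hint bits can be illustrated with a small sketch. Everything about the layout here is an assumption made for illustration: `HINT_MASK`, the flag-byte offsets, and the use of `zlib.crc32` are hypothetical and do not reflect the PostgreSQL page format or the posted patch.

```python
import zlib

HINT_MASK = 0x0F  # hypothetical: low four bits of each flag byte are hint bits


def checksum_ignoring_hints(page, flag_offsets, hint_mask=HINT_MASK):
    """CRC32 of the page with the hint bits cleared in a scratch copy."""
    scratch = bytearray(page)            # never modify the caller's page
    for off in flag_offsets:
        scratch[off] &= ~hint_mask & 0xFF
    return zlib.crc32(bytes(scratch)) & 0xFFFFFFFF


# Toy page: one flag byte at offset 10, some payload at offsets 20..23.
page = bytearray(64)
page[10] = 0xA0
page[20:24] = b"data"
flag_offsets = [10]

before = checksum_ignoring_hints(page, flag_offsets)
page[10] |= 0x01                         # set a hint bit
after_hint = checksum_ignoring_hints(page, flag_offsets)
page[21] ^= 0xFF                         # corrupt the payload
after_data = checksum_ignoring_hints(page, flag_offsets)
```

With this scheme a hint-bit-only change leaves the checksum unchanged, while damage to the payload still changes it. The cost is that corruption confined to the hint bits themselves goes undetected.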
On 12/18/11 5:55 PM, Greg Stark wrote:
There is another way to look at this problem. Perhaps it's worth
having a checksum *even if* there are ways for the checksum to be
spuriously wrong. Obviously having an invalid checksum can't be a
fatal error then but it might still be useful information.
On Sun, Dec 18, 2011 at 11:21 PM, Josh Berkus j...@agliodbs.com wrote:
On 12/18/11 5:55 PM, Greg Stark wrote:
There is another way to look at this problem. Perhaps it's worth
having a checksum *even if* there are ways for the checksum to be
spuriously wrong. Obviously having an invalid
Folks,
What:
Please find attached a patch for 9.2-to-be which implements page
checksums. It changes the page format, so it's an initdb-forcing
change.
How:
In order to ensure that the checksum actually matches the hint
bits, this makes a copy of the page, calculates the
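The snippet above is cut off, so the following is only a guess at the general shape of the copy-then-checksum approach, not the patch itself. It assumes the checksum is a CRC32 stored in the first four bytes of the page; `prepare_page_for_write` and `verify_page` are hypothetical names.

```python
import struct
import zlib

CHECKSUM_SIZE = 4  # hypothetical: checksum lives in the first 4 bytes


def prepare_page_for_write(shared_page):
    """Copy the page, checksum the copy, and return the copy for writing."""
    snapshot = bytearray(shared_page)    # private copy, immune to later changes
    crc = zlib.crc32(bytes(snapshot[CHECKSUM_SIZE:])) & 0xFFFFFFFF
    snapshot[:CHECKSUM_SIZE] = struct.pack("<I", crc)
    return bytes(snapshot)


def verify_page(page):
    """Check the stored checksum against the rest of the page."""
    (stored,) = struct.unpack("<I", page[:CHECKSUM_SIZE])
    return stored == (zlib.crc32(page[CHECKSUM_SIZE:]) & 0xFFFFFFFF)


shared = bytearray(32)
shared[8] = 0x10
on_disk = prepare_page_for_write(shared)
shared[8] |= 0x01                        # hint bit set after the copy was taken
```

Because the checksum is computed over the private copy and that same copy is what gets written, a hint bit set in the shared buffer afterwards cannot make the written image inconsistent with its checksum.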