Re: [HACKERS] [GENERAL] Undetected corruption of table files

2007-08-27 Thread Jonah H. Harris
On 8/27/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> that and the lack of evidence that they'd actually gain anything

I find it somewhat ironic that PostgreSQL strives to be fairly
non-corruptible, yet has no way to detect a corrupted page.  The only
reason for not having CRCs is that they would hurt performance...
which is the exact opposite of conventional PostgreSQL wisdom (no
performance trade-off for durability).

-- 
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 3rd Floor            | [EMAIL PROTECTED]
Iselin, New Jersey 08830            | http://www.enterprisedb.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [GENERAL] Undetected corruption of table files

2007-08-27 Thread Trevor Talbot
On 8/27/07, Jonah H. Harris <[EMAIL PROTECTED]> wrote:
> On 8/27/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> > that and the lack of evidence that they'd actually gain anything
>
> I find it somewhat ironic that PostgreSQL strives to be fairly
> non-corruptible, yet has no way to detect a corrupted page.  The only
> reason for not having CRCs is that they would hurt performance...
> which is the exact opposite of conventional PostgreSQL wisdom (no
> performance trade-off for durability).

But how does detecting a corrupted data page gain you any durability?
All it means is that the platform underneath screwed up, and you've
already *lost* durability.  What do you do then?

It seems like the same idea as an application trying to detect RAM errors.


Re: [HACKERS] [GENERAL] Undetected corruption of table files

2007-08-27 Thread Alban Hertroys
Jonah H. Harris wrote:
> On 8/27/07, Tom Lane <[EMAIL PROTECTED]> wrote:
>> that and the lack of evidence that they'd actually gain anything
> 
> I find it somewhat ironic that PostgreSQL strives to be fairly
> non-corruptible, yet has no way to detect a corrupted page.  The only
> reason for not having CRCs is that they would hurt performance...
> which is the exact opposite of conventional PostgreSQL wisdom (no
> performance trade-off for durability).

Why? I can't say I speak for the developers, but I think the reason is
that data corruption can (with the very rare exception of undetected
programming errors) only be caused by hardware problems.

If you have a "proper" production database server, your memory has error
checking, and your RAID controller has something of the kind as well. If
not you would probably be running the database on a filesystem that has
reliable integrity verification mechanisms.

In the worst case (all the above mechanisms fail), you have backups.

IMHO the problem is covered quite adequately. The operating system and
the hardware cover for the database, as they should; it's _their_ job.

-- 
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //


Re: [HACKERS] [GENERAL] Undetected corruption of table files

2007-08-27 Thread Tom Lane
"Trevor Talbot" <[EMAIL PROTECTED]> writes:
> On 8/27/07, Jonah H. Harris <[EMAIL PROTECTED]> wrote:
>> I find it somewhat ironic that PostgreSQL strives to be fairly
>> non-corruptible, yet has no way to detect a corrupted page.

> But how does detecting a corrupted data page gain you any durability?
> All it means is that the platform underneath screwed up, and you've
> already *lost* durability.  What do you do then?

Indeed.  In fact, the most likely implementation of this (refuse to do
anything with a page with a bad CRC) would be a net loss from that
standpoint, because you couldn't get *any* data out of a page, even if
only part of it had been zapped.
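
To make the trade-off concrete, here is a minimal sketch of the "refuse to do anything with a page with a bad CRC" behaviour. It is not PostgreSQL code: the 8192-byte page size, the layout (a 4-byte CRC-32 in front of the payload) and the names seal_page, read_page and PageCorruptionError are all assumptions made for illustration.

```python
import zlib

PAGE_SIZE = 8192   # assumed page size for this sketch
CRC_SIZE = 4       # assumed layout: 4-byte CRC-32, then the payload


class PageCorruptionError(Exception):
    """Raised when the stored CRC does not match the page contents."""


def seal_page(payload: bytes) -> bytes:
    """Prefix the payload with its CRC-32 to form a full page."""
    if len(payload) != PAGE_SIZE - CRC_SIZE:
        raise ValueError("payload must fill the page exactly")
    return zlib.crc32(payload).to_bytes(CRC_SIZE, "big") + payload


def read_page(page: bytes) -> bytes:
    """Return the payload, or refuse the whole page on a CRC mismatch."""
    stored = int.from_bytes(page[:CRC_SIZE], "big")
    payload = page[CRC_SIZE:]
    if zlib.crc32(payload) != stored:
        # Strict policy: nothing on the page is returned, even though
        # most of its bytes may still be intact.
        raise PageCorruptionError("page checksum mismatch")
    return payload
```

With this policy a single damaged byte makes every row on the page unreadable, which is the net loss described above.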

regards, tom lane


Re: [HACKERS] [GENERAL] Undetected corruption of table files

2007-08-27 Thread Jonah H. Harris
On 8/27/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> Indeed.  In fact, the most likely implementation of this (refuse to do
> anything with a page with a bad CRC) would be a net loss from that
> standpoint, because you couldn't get *any* data out of a page, even if
> only part of it had been zapped.

At least you would know it was corrupted, instead of getting funky
errors and/or crashes.

-- 
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 3rd Floor            | [EMAIL PROTECTED]
Iselin, New Jersey 08830            | http://www.enterprisedb.com/


Re: [HACKERS] [GENERAL] Undetected corruption of table files

2007-08-27 Thread Decibel!
On Mon, Aug 27, 2007 at 12:08:17PM -0400, Jonah H. Harris wrote:
> On 8/27/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> > Indeed.  In fact, the most likely implementation of this (refuse to do
> > anything with a page with a bad CRC) would be a net loss from that
> > standpoint, because you couldn't get *any* data out of a page, even if
> > only part of it had been zapped.

I think it'd be perfectly reasonable to have a mode where you could
bypass the check so that you could see what was in the corrupted page
(as well as deleting everything on the page so that you could "fix" the
corruption). Obviously, this should be restricted to superusers.
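
A rough sketch of what such a bypass mode could look like follows. Everything here is hypothetical: the bypass_check and is_superuser parameters, the 4-byte CRC-32 prefix and the function names are assumptions for illustration, not an existing PostgreSQL interface.

```python
import zlib

CRC_SIZE = 4  # assumed layout: 4-byte CRC-32, then the payload


class PageCorruptionError(Exception):
    """Raised when the stored CRC does not match the page contents."""


def checksum_ok(page: bytes) -> bool:
    """Compare the stored CRC-32 with one computed over the payload."""
    stored = int.from_bytes(page[:CRC_SIZE], "big")
    return zlib.crc32(page[CRC_SIZE:]) == stored


def read_page(page: bytes, *, bypass_check: bool = False,
              is_superuser: bool = False) -> bytes:
    """Return the payload; only a superuser may read past a bad CRC."""
    if bypass_check and not is_superuser:
        raise PermissionError("only superusers may bypass the page check")
    if not bypass_check and not checksum_ok(page):
        raise PageCorruptionError("page checksum mismatch")
    # Either the checksum matched or a superuser asked to see the raw data.
    return page[CRC_SIZE:]
```

Normal readers still get a clean error on a damaged page, while a superuser can pull out whatever survives before deciding how to repair it.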

> At least you would know it was corrupted, instead of getting funky
> errors and/or crashes.

Or worse, getting what appears to be perfectly valid data, but isn't.
-- 
Decibel!, aka Jim Nasby                        [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [HACKERS] [GENERAL] Undetected corruption of table files

2007-08-28 Thread Lincoln Yeoh

At 11:48 PM 8/27/2007, Trevor Talbot wrote:
> On 8/27/07, Jonah H. Harris <[EMAIL PROTECTED]> wrote:
> > On 8/27/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> > > that and the lack of evidence that they'd actually gain anything
> >
> > I find it somewhat ironic that PostgreSQL strives to be fairly
> > non-corruptible, yet has no way to detect a corrupted page.  The only
> > reason for not having CRCs is that they would hurt performance...
> > which is the exact opposite of conventional PostgreSQL wisdom (no
> > performance trade-off for durability).
>
> But how does detecting a corrupted data page gain you any durability?
> All it means is that the platform underneath screwed up, and you've
> already *lost* durability.  What do you do then?

The benefit I see is that you get to change the platform underneath
sooner rather than later.

Whether that's worth it or not I don't know - real world stats/info
would be good.

Even my home PATA drives tend to grumble about stuff first before
they fail, so it might not be worthwhile doing the extra work.


Regards,
Link.





Re: [HACKERS] [GENERAL] Undetected corruption of table files

2007-08-29 Thread Florian Weimer
* Alban Hertroys:

> If you have a "proper" production database server, your memory has
> error checking, and your RAID controller has something of the kind
> as well.

To my knowledge, no readily available controller performs validation
on reads (not even for RAID-1 or RAID-10, where it would be pretty
straightforward).

Something like an Adler32 checksum (not a full CRC) on each page might
be helpful.  However, what I'd really like to see is something that
catches missed writes, but this is very difficult to implement AFAICT.
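
As a small illustration of both points, the sketch below uses zlib.adler32 from the Python standard library as the per-page checksum. The (checksum, payload) pair standing in for an on-disk page and the names seal and verify are assumptions for illustration only.

```python
import zlib

PAGE_SIZE = 8192  # assumed page size for this sketch


def seal(payload: bytes) -> tuple[int, bytes]:
    """Pair a page payload with its Adler-32 checksum."""
    return zlib.adler32(payload), payload


def verify(stored: tuple[int, bytes]) -> bool:
    """Check that the payload still matches its stored checksum."""
    checksum, payload = stored
    return zlib.adler32(payload) == checksum


old_version = b"balance=100".ljust(PAGE_SIZE, b"\x00")
new_version = b"balance=250".ljust(PAGE_SIZE, b"\x00")

# Case 1: a byte is damaged on disk, so the checksum no longer matches.
checksum, payload = seal(new_version)
damaged = (checksum, b"X" + payload[1:])
damaged_detected = not verify(damaged)

# Case 2: a missed write. The new version never reached the disk, so the
# disk still holds the old page together with its own valid checksum.
on_disk = seal(old_version)
missed_write_detected = not verify(on_disk)
```

Here damaged_detected ends up True and missed_write_detected ends up False: the stale page is internally consistent, so a checksum stored inside the page cannot reveal that a newer version was lost. Catching that would need information kept outside the page.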

-- 
Florian Weimer                <[EMAIL PROTECTED]>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99
