Re: [ADMIN] Detecting DB corruption

2012-11-01 Thread Gunnar Nick Bluth

Am 01.11.2012 06:47, schrieb Craig Ringer:

On 11/01/2012 01:10 PM, Scott Ribe wrote:

On Oct 31, 2012, at 8:50 PM, Craig Ringer wrote:


Seriously, if you're facing DB corruption then something is already
horribly wrong with your setup.
Horribly is not strong enough a word IMHO when we're discussing double 
primary key values... except if Raj is not using sequences to generate 
them. Although on the other hand, in that case, it's in turn an even 
more horrible setup; questionable DB design on unreliable hardware.


Raj, would you mind pasting your schema somewhere, at least of the 
tables you experienced the corruption?

True, but. In a past life, complaints from the db (it was a db that stored a 
checksum with every block) were the very first symptom when something went horribly 
wrong with the hardware. (Partial short between wires of an internal SCSI cable; 
eventually we determined that about every 1MB, 1 bit would get flipped between the 
controller  disk.)

So, if there were an official db verifier tool for PG, I for one would have it 
run periodically.

If there were a way to reliably detect corruption, so would I. As things
stand there are no block checksums, so if a bit gets flipped in some
random `text` field you're never going to know, corruption-checker or
no. Some forms of random corruption - like bad blocks on disks causing
I think checksums are currently being worked on and are to be expected 
for 9.3. Might be interesting to scan -hackers for that once more...

I/O errors, zeroed blocks, truncated files, etc - will become apparent
with general checking, but others won't be detectable unless you know
what the expected vs actual data is.

If page checksumming or any other reliable method of detecting possible
incipient corruption were available I'd quite likely want to use it for
much the same reason you outlined. For that matter, if there were a
general sanity check my tables and indexes tool I'd probably use that
too. However, no such tool exists - and in a good setup, none should be
needed. I'd want to use one anyway purely out of paranoia.

--
Craig Ringer


On a side note, Raj, you might want to read the descriptions of MVCC and 
WAL once more, then re-think about your idea of updating all rows and 
rolling back the transaction. That would potentially produce the effect 
you're looking for with InnoDB or Oracle, but not with PG.


Cheers,

--
Gunnar Nick Bluth
RHCE/SCLA

Mobil   +49 172 8853339
Email: gunnar.bl...@pro-open.de
__
In 1984 mainstream users were choosing VMS over UNIX.  Ten years later
they are choosing Windows over UNIX.  What part of that message aren't you
getting? - Tom Payne



--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] Detecting DB corruption

2012-11-01 Thread Raj Gandhi
Re-sending to correct addresses.

Seriously, if you're facing DB corruption then something is already
horribly wrong with your setup.


Horribly is not strong enough a word IMHO when we're discussing double
primary key values... except if Raj is not using sequences to generate
them. Although on the other hand, in that case, it's in turn an even more
horrible setup; questionable DB design on unreliable hardware.

Raj, would you mind pasting your schema somewhere, at least of the tables
you experienced the corruption?



Each DB table has primary key that is populated using DB-sequence. There is
a UNIQUE constraint created on natural keys.
The problem on the test setup was because disk cache was enabled.  Indexes
were corrupted when powering down the host. I have noticed that integrity
of both PK and UNIQUE constraint were violated - Table had rows with
duplicate primary keys and in other case there were rows with duplicate
unique key constraint.

We are now evaluating to turn off the disk cache to avoid this kind of
corruption.

I would still like to have some mechanism to detect any kind of corruption
in Postgres. As Craig pointed out I am planning to run reindex database
dbname which will detect any duplicates and will also remove any
corruption as indexes are recreated.

About the corruption in table -  will running VACUUM FULL on all tables
detect the corruption?
I see 8.4 and later version has param 'vacuum_freeze_table_age' which by
setting to 0 will force regular vacuum to run on whole database and will
check every block.  I don't see that param in 8.3 though so I guess vacuum
full is the only option.

If vacuum full is not going to detect the corruption then I am also
thinking to run pg_dump which should catch the corruption.


On Thu, Nov 1, 2012 at 4:06 AM, Gunnar Nick Bluth 
gunnar.bl...@pro-open.de wrote:

 Am 01.11.2012 06:47, schrieb Craig Ringer:

  On 11/01/2012 01:10 PM, Scott Ribe wrote:

 On Oct 31, 2012, at 8:50 PM, Craig Ringer wrote:

  Seriously, if you're facing DB corruption then something is already
 horribly wrong with your setup.

 Horribly is not strong enough a word IMHO when we're discussing double
 primary key values... except if Raj is not using sequences to generate
 them. Although on the other hand, in that case, it's in turn an even more
 horrible setup; questionable DB design on unreliable hardware.

 Raj, would you mind pasting your schema somewhere, at least of the tables
 you experienced the corruption?

  True, but. In a past life, complaints from the db (it was a db that
 stored a checksum with every block) were the very first symptom when
 something went horribly wrong with the hardware. (Partial short between
 wires of an internal SCSI cable; eventually we determined that about every
 1MB, 1 bit would get flipped between the controller  disk.)

 So, if there were an official db verifier tool for PG, I for one would
 have it run periodically.

 If there were a way to reliably detect corruption, so would I. As things
 stand there are no block checksums, so if a bit gets flipped in some
 random `text` field you're never going to know, corruption-checker or
 no. Some forms of random corruption - like bad blocks on disks causing

 I think checksums are currently being worked on and are to be expected for
 9.3. Might be interesting to scan -hackers for that once more...

  I/O errors, zeroed blocks, truncated files, etc - will become apparent
 with general checking, but others won't be detectable unless you know
 what the expected vs actual data is.

 If page checksumming or any other reliable method of detecting possible
 incipient corruption were available I'd quite likely want to use it for
 much the same reason you outlined. For that matter, if there were a
 general sanity check my tables and indexes tool I'd probably use that
 too. However, no such tool exists - and in a good setup, none should be
 needed. I'd want to use one anyway purely out of paranoia.

 --
 Craig Ringer


  On a side note, Raj, you might want to read the descriptions of MVCC and
 WAL once more, then re-think about your idea of updating all rows and
 rolling back the transaction. That would potentially produce the effect
 you're looking for with InnoDB or Oracle, but not with PG.

 Cheers,

 --
 Gunnar Nick Bluth
 RHCE/SCLA

 Mobil   +49 172 8853339
 Email: gunnar.bl...@pro-open.de
 __**__**
 __
 In 1984 mainstream users were choosing VMS over UNIX.  Ten years later
 they are choosing Windows over UNIX.  What part of that message aren't you
 getting? - Tom Payne




Re: [ADMIN] Public key for wxBase-2.8.12-1.el5.i386.rpm is not installed for pg_admin3

2012-11-01 Thread Devrim GÜNDÜZ

Hi,



On Tue, 2012-10-30 at 23:33 +, jkells wrote:
 I am trying to install pgadmin3_91.i386 0:1.16.0-1.rhel5  on a centos 5.7

snip

 warning: rpmts_HdrFromFdno: Header V3 DSA signature: NOKEY, key ID 
 217521f6
 pgdg91/gpgkey   | 1.7 kB 00:00 
 
 
 Public key for wxBase-2.8.12-1.el5.i386.rpm is not installed
 Trying to just install wxbase has the same outcome 

It was my mistake. Just resigned the packages. Please run

yum makecache

first, and then try installing pgadmin3 again.

Sorry for the inconvenience.

Regards,
-- 
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz


signature.asc
Description: This is a digitally signed message part


[ADMIN] Fwd: Errors on pg_dumpall

2012-11-01 Thread Terry Khatri
Somebody PLEASE HELP ! is Tome Lane around !
Tks
T

-- Forwarded message --
From: Terry Khatri terrykhatri...@gmail.com
Date: 22 October 2012 00:20
Subject: Errors on pg_dumpall
To: pgsql-admin-ow...@postgresql.org


Hi

I am getting the following error msg when I do a pg_dumpall

-

TK@OraLinux /usr/local/pgsql/backups
$ pg_dumpall -U sns84  completebackup20121020,sql
pg_dump: schema with OID 74363 does not exist
pg_dumpall: pg_dump failed on database bihistory, exiting

-

I have set the logs to debug5 to catch where the problem is while taking
the dump but nothing shows up there.

If you have any ideas please help me.

And thanks to all for responding to my earlier post.

Rgds
T


Re: [ADMIN] Fwd: Errors on pg_dumpall

2012-11-01 Thread Fernando Hevia
Oops. You were posting to the right list.
I took it for the pgAdmin list. Sorry. :)



On Thu, Nov 1, 2012 at 6:06 PM, Fernando Hevia fhe...@gmail.com wrote:

 You are posting to the wrong list.
 You should address this in the 
 pgsql-adminhttp://archives.postgresql.org/pgsql-admin/
  list.




 On Thu, Nov 1, 2012 at 11:05 AM, Terry Khatri terrykhatri...@gmail.comwrote:

 Somebody PLEASE HELP ! is Tome Lane around !
 Tks
 T

 -- Forwarded message --
 From: Terry Khatri terrykhatri...@gmail.com
 Date: 22 October 2012 00:20
 Subject: Errors on pg_dumpall
 To: pgsql-admin-ow...@postgresql.org


 Hi

 I am getting the following error msg when I do a pg_dumpall

 -

 TK@OraLinux /usr/local/pgsql/backups
 $ pg_dumpall -U sns84  completebackup20121020,sql
 pg_dump: schema with OID 74363 does not exist
 pg_dumpall: pg_dump failed on database bihistory, exiting

 -

 I have set the logs to debug5 to catch where the problem is while taking
 the dump but nothing shows up there.

 If you have any ideas please help me.

 And thanks to all for responding to my earlier post.

 Rgds
 T