Hi Cory,

We have *zero_damaged_pages* parameter in PostgreSQL configuration,by
default it is set be *off*.
To recover data from corrupted table,we can turn *on* this parameter as a
 super user and populate new table using dump or copy utility.

Note : The damaged pages we can't recover from table,it will set to 0 and
it will skip while fetching data from table.

Please follow below steps, if decided to recover data from corrupted table.

*Sample case :*

[postgres@instructor ~]$ /usr/local/pgsql/bin/psql
psql (9.4rc1)
Type "help" for help.

postgres=# select count(*) from test;
*ERROR:  invalid page in block 7 of relation base/13003/16384*
postgres=# show zero_damaged_pages;
 zero_damaged_pages
--------------------
 off
(1 row)

postgres=# *set zero_damaged_pages=on;*
SET
postgres=# show zero_damaged_pages;
 zero_damaged_pages
--------------------
 on
(1 row)

postgres=# select count(*) from test;
*WARNING:  invalid page in block 7 of relation base/13003/16384; zeroing
out page*
WARNING:  invalid page in block 8 of relation base/13003/16384; zeroing out
page
WARNING:  invalid page in block 9 of relation base/13003/16384; zeroing out
page
WARNING:  invalid page in block 10 of relation base/13003/16384; zeroing
out page
WARNING:  invalid page in block 11 of relation base/13003/16384; zeroing
out page
WARNING:  invalid page in block 12 of relation base/13003/16384; zeroing
out page
WARNING:  invalid page in block 13 of relation base/13003/16384; zeroing
out page

count
--------
 979163
(1 row)


On Tue, Dec 23, 2014 at 8:47 AM, Cory Zue <c...@dimagi.com> wrote:

> Hi all,
>
> Our postgres instance on one of our production machines has recently been
> returning errors of the form "DatabaseError: invalid page header in block
> 1 of relation base/16384/76623" from normal queries. I've been reading that
> these are often linked to hardware errors, but I would like to better
> understand what else it could be or how to determine that for sure. I've
> filled out the standard issue reporting template below. Any feedback or
> troubleshooting instructions would be much appreciated.
>
> ---
> A description of what you are trying to achieve and what results you
> expect.:
>
> Intermittent queries are failing with the error "DatabaseError: invalid
> page header in block 1 of relation base/16384/76623"
>
> PostgreSQL version number you are running:
>
> PostgreSQL 8.4.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC)
> 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit
>
> How you installed PostgreSQL:
>
> from standard package installer
>
> Changes made to the settings in the postgresql.conf file:
>
>
>              name             |       current_setting       |
>  source
>
> ------------------------------+-----------------------------+----------------------
>  checkpoint_completion_target | 0.9                         |
> configuration file
>  checkpoint_segments          | 32                          |
> configuration file
>  checkpoint_timeout           | 15min                       |
> configuration file
>  DateStyle                    | ISO, MDY                    |
> configuration file
>  default_text_search_config   | pg_catalog.english          |
> configuration file
>  effective_cache_size         | 1GB                         |
> configuration file
>  lc_messages                  | en_US.UTF-8                 |
> configuration file
>  lc_monetary                  | en_US.UTF-8                 |
> configuration file
>  lc_numeric                   | en_US.UTF-8                 |
> configuration file
>  lc_time                      | en_US.UTF-8                 |
> configuration file
>  log_checkpoints              | on                          |
> configuration file
>  log_connections              | off                         |
> configuration file
>  log_destination              | csvlog                      |
> configuration file
>  log_directory                | /opt/data/pgsql/data/pg_log |
> configuration file
>  log_disconnections           | off                         |
> configuration file
>  log_duration                 | on                          |
> configuration file
>  log_filename                 | postgres-%Y-%m-%d_%H%M%S    |
> configuration file
>  log_lock_waits               | on                          |
> configuration file
>  log_min_duration_statement   | 250ms                       |
> configuration file
>  log_rotation_age             | 1d                          |
> configuration file
>  log_rotation_size            | 1GB                         |
> configuration file
>  log_temp_files               | 0                           |
> configuration file
>  log_timezone                 | Asia/Kolkata                | command line
>  log_truncate_on_rotation     | on                          |
> configuration file
>  logging_collector            | on                          |
> configuration file
>  maintenance_work_mem         | 768MB                       |
> configuration file
>  max_connections              | 500                         |
> configuration file
>  max_stack_depth              | 2MB                         | environment
> variable
>  port                         | 5432                        | command line
>  shared_buffers               | 4GB                         |
> configuration file
>  ssl                          | on                          |
> configuration file
>  TimeZone                     | Asia/Kolkata                | command line
>  timezone_abbreviations       | Default                     | command line
>  wal_buffers                  | 16MB                        |
> configuration file
>  work_mem                     | 48MB                        |
> configuration file
>
> It's also probably worth noting that postgres is installed on an encrypted
> volume which is mounted using ecryptfs.
>
> Operating system and version:
>
> RedHatEnterpriseServer, version 6.6
>
> What program you're using to connect to PostgreSQL:
>
> Python (django)
>
> Is there anything relevant or unusual in the PostgreSQL server logs?:
>
> I see lots of instances of this error (and similar). I'm not sure what
> else I should be looking for.
>
> What you were doing when the error happened / how to cause the error:
>
> I haven't explicitly tried to reproduce it, but it seems to consistently
> happen with certain queries. However, the system was rebooted shortly
> before the errors started occuring. The system was rebooted because another
> database (elasticsearch) was having problems on the same machine and the
> reboot was to attempt to resolve things.
>
> The EXACT TEXT of the error message you're getting, if there is one:
>
> DatabaseError: invalid page header in block 1 of relation base/16384/76623
>
> (block and relation numbers change)
>
> Unfortunately, I'm not completely familiar with the CPU and disk/RAID
> configurations used on the server. However it is storing to a (software)
> encrypted volume as mentioned above.
>
>
>    - Have you *ever* set fsync=off in the postgresql config file?
>
> No
>
>    - Have you had any unexpected power loss lately? Replaced a failed
>    RAID disk? Had an operating system crash?
>
> Not recently, though the system did reboot normally as described above.
>
>    - Have you run a file system check? (chkdsk / fsck)
>
> No.
>
>    - Are there any error messages in the system logs? (unix/linux: dmesg,
>    /var/log/syslog ;
>
> I haven't seen anything obvious but I wasn't sure what to look for.
>
> thanks,
> Cory
>

Reply via email to