On 12 February 2016 at 17:56, Oliver Stöneberg wrote:
> A few weeks ago we already had a data corruption when the disk was
> full. There are other services running on the same machine that could
cause the disk to fill up (e.g. local caching when the network is
> acting
We are running a 64-bit PostgreSQL 9.4.5 server on Windows Server
2012. The system is a virtual machine on a VMware ESX 6.0 server and
has 24 GB of memory. The database server is only accessed locally by
two services and there is only a single database in the server. The
disk is located on a
On Fri, 12 Feb 2016 10:56:04 +0100
"Oliver Stöneberg" wrote:
> We are running a 64-bit PostgreSQL 9.4.5 server on Windows Server
> 2012. The system is a virtual machine on a VMware ESX 6.0 server and
> has 24 GB of memory. The database server is only accessed locally by
>
On Fri, Feb 12, 2016 at 07:46:25AM -0500, Bill Moran wrote:
> Long term, you need to fix your hardware. Postgres doesn't corrupt
> itself just because the disks fill up, so your hardware must be lying
> about what writes completed successfully, otherwise, Postgres would
> be able to recover after
(I originally posted this to pgsql-admin and was pointed to here instead.)
Folks-
I'm doing a postmortem on a corruption event we had. I have an idea on
what happened, but not sure. I figure I'd share what happened and see if
I'm close to right here.
Event: Running 9.1.6 with hot-standby,
Ned Wolpert ned.wolp...@imemories.com wrote:
I'm doing a postmortem on a corruption event we had. I have an
idea on what happened, but not sure. I figure I'd share what
happened and see if I'm close to right here.
Running 9.1.6 with hot-standby, archiving 4 months of wal files,
and even a
Ned Wolpert ned.wolp...@imemories.com writes:
Event: Running 9.1.6 with hot-standby, archiving 4 months of wal files,
and even a nightly pg_dump all. 50G database. Trying to update or delete a
row in a small (21-row, but heavily used) table would lock up completely.
Never finish. Removed
Tom and Kevin-
There were two entries in pg_prepared_xacts. In the test-bed, executing
the 'ROLLBACK PREPARED' on both allowed the system to continue processing.
All locks I saw in 'pg_locks' whose virtualtransaction started with
'-1/' were also gone. That was indeed the issue. More
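For anyone landing here with the same symptom, the diagnosis and fix above can be sketched as follows; the gid shown is hypothetical, and COMMIT PREPARED is the alternative if the transaction's work must be kept:

```sql
-- Orphaned prepared transactions hold locks whose virtualtransaction
-- in pg_locks starts with '-1/'.
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts;

-- Discard one by its gid (hypothetical value shown):
ROLLBACK PREPARED 'some_orphaned_gid';
```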
Heine Ferreira wrote:
Are there any best practices for avoiding database corruption?
First and foremost, do not turn off fsync or full_page_writes in your
configuration. After that the most common causes for database
corruption I've seen are bad RAM (ECC RAM is a requirement, not an
option for
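The two settings named above can be checked from any psql session; a minimal sketch:

```sql
-- Both should report 'on'; turning either off trades crash safety
-- for write performance.
SHOW fsync;
SHOW full_page_writes;
```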
Craig Ringer wrote:
On 10/14/2012 05:53 AM, Heine Ferreira wrote:
Hi
Are there any best practices for avoiding database
corruption?
* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week for the last 4 weeks, then
one a month for the
On 10/18/2012 01:06 AM, Daniel Serodio wrote:
Craig Ringer wrote:
On 10/14/2012 05:53 AM, Heine Ferreira wrote:
Hi
Are there any best practices for avoiding database
corruption?
* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week
On Sun, Oct 14, 2012 at 11:26:40AM +0800, Craig Ringer wrote:
On 10/14/2012 11:00 AM, John R Pierce wrote:
On 10/13/12 7:13 PM, Craig Ringer wrote:
* Use a good quality hardware RAID controller with a battery backup
cache unit if you're using spinning disks in RAID. This is as much for
Hi
Are there any best practices for avoiding database
corruption? I suppose the most obvious one is
to have a ups if it's a desktop machine.
How do you detect corruption in a Postgresql
database and are there any ways to fix it besides
restoring a backup?
Thanks
H.F.
Lørdag 13. oktober 2012 23.53.03 skrev Heine Ferreira :
Hi
Are there any best practices for avoiding database
corruption?
In my experience, database corruption always comes down to flaky disk drives.
Keep your disks new and shiny, e.g. less than 3 years old, and go for some kind of
redundancy
On 10/13/12 3:04 PM, Leif Biberg Kristensen wrote:
Lørdag 13. oktober 2012 23.53.03 skrev Heine Ferreira :
Hi
Are there any best practices for avoiding database
corruption?
In my experience, database corruption always comes down to flaky disk drives.
Keep your disks new and shiny, e.g. less
On 10/14/2012 05:53 AM, Heine Ferreira wrote:
Hi
Are there any best practices for avoiding database
corruption?
* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week for the last 4 weeks, then one
a month for the rest of the year,
On 10/13/12 7:13 PM, Craig Ringer wrote:
* Use a good quality hardware RAID controller with a battery backup
cache unit if you're using spinning disks in RAID. This is as much for
performance as reliability; a BBU will make an immense difference to
database performance.
a comment on this
On 10/14/2012 11:00 AM, John R Pierce wrote:
On 10/13/12 7:13 PM, Craig Ringer wrote:
* Use a good quality hardware RAID controller with a battery backup
cache unit if you're using spinning disks in RAID. This is as much for
performance as reliability; a BBU will make an immense difference to
On 10/14/2012 05:53 AM, Heine Ferreira wrote:
Hi
Are there any best practices for avoiding database
corruption?
I forgot to mention, you should also read:
http://www.postgresql.org/docs/current/static/wal-reliability.html
--
Craig Ringer
On Sun, Oct 14, 2012 at 1:13 PM, Craig Ringer ring...@ringerc.id.au wrote:
* Never, ever, ever use cheap SSDs. Use good quality hard drives or (after
proper testing) high end SSDs. Read the SSD reviews periodically posted on
this mailing list if considering using SSDs. Make sure the SSD has a
On 10/14/2012 12:02 PM, Chris Angelico wrote:
Is there an article somewhere about how best to do a plug-pull test?
Or is it as simple as fire up pgbench, kill the power, bring things
back up, and see if anything isn't working?
That's what I'd do and what I've always done in the past, but
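A rough outline of such a plug-pull test, assuming a disposable test server and pgbench (the database name and data directory are hypothetical); the power cut itself has to be a real cord pull or a hard VM power-off, not a clean shutdown:

```shell
# 1. Initialize and start a write-heavy workload on a throwaway cluster.
pgbench -i -s 50 testdb
pgbench -c 8 -T 600 testdb &

# 2. While it runs, cut power to the machine (physically, or a hard
#    power-off of the VM -- not an OS shutdown).

# 3. After reboot, let PostgreSQL perform crash recovery, then verify:
pg_ctl -D /path/to/data start        # hypothetical data directory
psql testdb -c "SELECT count(*) FROM pgbench_accounts;"
pg_dump testdb > /dev/null           # a full read pass over every table
```

Repeat several times; a single clean recovery proves little, since the lying-write-cache failures discussed in this thread are probabilistic.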
I am running 8.3.3 currently on this box.
Last week we had a database corruption issue that started as:
Aug 24 07:15:19 iprobe028 postgres[20034]: [3-1] ERROR: could not read
block 0 of relation 1663/16554/7463400: read only 0 of 8192 bytes
Aug 24 07:15:49 iprobe028 postgres[27663]: [3-1] ERROR:
Excerpts from George Woodring's message of lun ago 30 08:17:56 -0400 2010:
I am running 8.3.3 currently on this box.
Last week we had a database corruption issue that started as:
Aug 24 07:15:19 iprobe028 postgres[20034]: [3-1] ERROR: could not read
block 0 of relation 1663/16554/7463400:
I have found that I have a database problem after receiving the
following error from pg_dump:
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: more than one row returned
by a subquery used as an expression
pg_dump: The command was: SELECT tableoid, oid, typname,
George Woodring wrote:
I have found that I have a database problem after receiving the
following error from pg_dump:
Lack of vacuuming, most likely. What version is this? Did you read
previous threads about this problem on the archives?
--
Alvaro Herrera
George Woodring george.woodr...@iglass.net writes:
Upon investigation I found that I have a table that is in the database twice
db= select oid, relname from pg_class where oid IN (26770910,
26770918, 26770919);
oid|relname
The version is 8.3.3, and I use autovacuum for the routine maintenance.
The ctid's are distinct
grande=# select oid, ctid, relname from pg_class where oid IN
(26770910, 26770918, 26770919, 26770920);
oid| ctid |relname
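The usual follow-up in this situation is a ctid-based cleanup of the duplicate catalog row; a hedged sketch (the table name and ctid value are hypothetical, this modifies pg_class directly, and a fresh backup plus quiescing other activity are strongly advised first):

```sql
-- The physically duplicated rows share a name but have distinct ctids.
SELECT oid, ctid, xmin, relname
FROM pg_class
WHERE relname = 'duplicated_table';   -- hypothetical name

-- Remove the stale physical copy by its ctid (hypothetical value):
DELETE FROM pg_class WHERE ctid = '(123,4)';
```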
This thread is a top posting mess. I'll try to rearrange:
Jeff Brenton wrote:
REINDEX INDEX testrun_log_pkey;
ERROR: could not write block 1832079 of temporary file: No space left
on device
HINT: Perhaps out of disk space?
There is currently 14GB free on the disk that postgres is
On Wed, 8 Apr 2009 22:14:38 -0400
Jeff Brenton jbren...@sandvine.com wrote:
There are no filesystem level content size restrictions that I am
aware of on this system. The user pgsql should have full access
to the filesystems indicated except for the root filesystem.
Out of inodes?
A lot
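On the inode question: a filesystem can refuse writes with "No space left on device" even when df shows free blocks, if its inode table is exhausted. A quick check, runnable anywhere:

```shell
# Block usage for the filesystem holding the current directory:
df -P .

# Inode usage for the same filesystem; an IUse% of 100% means no new
# files can be created no matter how many free blocks remain.
df -Pi .
```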
I've encountered some db corruption after restarting postgres on my
database server running 8.2.4. I think that postgres did not shut down
cleanly. Postgres started appropriately but crashed 45 minutes later.
I used pg_resetxlog after the crash to get the db to start again but it
appears that
I would imagine you would have better luck dropping the index and
recreating. But considering you're 98% full on that drive, it looks like
you're about to have other problems...
On Wed, Apr 8, 2009 at 8:32 PM, Jeff Brenton jbren...@sandvine.com wrote:
I’ve encountered some db corruption after
On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
I've encountered some db corruption after restarting postgres on my
database server running 8.2.4. I think that postgres did not shut down
cleanly. Postgres started appropriately but crashed 45 minutes later.
I used pg_resetxlog after
: [GENERAL] database corruption
On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
I've encountered some db corruption after restarting postgres on my
database server running 8.2.4. I think that postgres did not shut down
cleanly. Postgres started appropriately but crashed 45 minutes later
To: Jeff Brenton
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] database corruption
I would imagine you would have better luck dropping the index and
recreating. But considering you're 98% full on that drive, it looks
like you're about to have other problems...
To: pgsql-general@postgresql.org
Cc: Jeff Brenton
Subject: Re: [GENERAL] database corruption
On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
I've encountered some db corruption after restarting postgres on my
database server running 8.2.4. I think that postgres did
: [GENERAL] database corruption
On Wed, 2009-04-08 at 22:14 -0400, Jeff Brenton wrote:
There are no filesystem level content size restrictions that I am
aware
of on this system. The user pgsql should have full access to the
filesystems indicated except for the root filesystem.
Inodes
Jeff Brenton wrote:
I've attempted to re-index the pkey listed but after an hour it fails
with
REINDEX INDEX testrun_log_pkey;
ERROR: could not write block 1832079 of temporary file: No space left
on device
HINT: Perhaps out of disk space?
There is currently 14GB free on the
Hi all,
I just had the following error on one of our data bases:
ERROR: could not access status of transaction 1038286848
DETAIL: could not open file pg_clog/03DE: No such file or directory
I researched on the mailing list and it looks like the usual suspect is
disk page corruption. There are
On Thu, 2007-07-12 at 15:09 +0200, Csaba Nagy wrote:
Luckily I remembered I have a WAL logging based replica, so I
recovered
the rest of the truncated file from the replica's same file... this
being an insert only table I was lucky I guess that this was an
option.
To my surprise, the same
On Thu, 2007-07-12 at 16:18, Simon Riggs wrote:
The corruption could only migrate if the WAL records themselves caused
the damage, which is much less likely than corruption of the data blocks
at hardware level. ISTM that both Slony and Log shipping replication
protect fairly well against block
On Jul 12, 2007, at 8:09 AM, Csaba Nagy wrote:
Hi all,
I just had the following error on one of our data bases:
ERROR: could not access status of transaction 1038286848
DETAIL: could not open file pg_clog/03DE: No such file or directory
I researched on the mailing list and it looks like
Shane wrote:
Hello all,
Whilst running a regular pg_dumpall, I received the
following error from our spamassassin DB.
pg_dump: ERROR: could not access status of transaction
4521992
DETAIL: could not open file pg_clog/0004: No such file
or directory
pg_dump: SQL command to dump the
Tom Lane wrote:
Michael Guerin [EMAIL PROTECTED] writes:
Hmm, that makes it sound like a plain old data-corruption problem, ie,
trashed xmin or xmax in some tuple header. Can you do a select
count(*) from this table without getting the error?
no, select count(*) fails around 25
Michael Guerin [EMAIL PROTECTED] writes:
Ok, so I'm trying to track down the rows now (big table slow queries :(
) How does one zero out a corrupt row, plain delete? I see references
for creating the missing pg_clog file but I don't believe that's what
you're suggesting..
Zeroing out the
Zeroing out the whole block containing it is the usual recipe. I forget
the exact command but if you trawl the archives for mention of dd and
/dev/zero you'll probably find it. Keep in mind you want to stop the
postmaster first, to ensure it doesn't have a copy of the bad block
cached in
Michael Guerin [EMAIL PROTECTED] writes:
You're suggesting to zero out the block in the underlying table files,
or creating the missing pg_clog file and start filling with zero's?
The former. Making up clog data is unlikely to help --- the bad xmin is
just the first symptom of what's probably
Zeroing out the whole block containing it is the usual recipe.
Something like this worked for me in the past:
% dd bs=8k count=X < /dev/zero >> clog-file
I had to calculate X, because I usually had a situation with truncated
clog-file, and a failed attempt to read it from offset XYZ.
And I
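Putting Tom's recipe together: zeroing one 8 kB block in place is a dd call with seek and conv=notrunc. The demo below runs against a scratch file standing in for the real relation file under $PGDATA (the path and block number are hypothetical); on a real cluster, stop the postmaster and copy the file away first:

```shell
relfile=demo_relfile   # stand-in for $PGDATA/base/<dboid>/<relfilenode>
bs=8192                # PostgreSQL block size
badblock=3             # block number from the error message (hypothetical)

# Build a 5-block scratch file of non-zero bytes so the effect is visible.
dd if=/dev/urandom of="$relfile" bs=$bs count=5 2>/dev/null

# Zero exactly one block: seek past the good blocks, write one block of
# zeros, and conv=notrunc keeps the rest of the file intact.
dd if=/dev/zero of="$relfile" bs=$bs seek=$badblock count=1 conv=notrunc 2>/dev/null
```

Afterward the file is still 40960 bytes; only the targeted block is zeroed, and on a real table a subsequent REINDEX and dump are needed to flush out references to the lost tuples.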
Hi,
Our database filled up and now I'm getting this error on one of the
tables. Is there any way to recover from this? Please let me know if
more information is needed.
pg_version version
Also, all files in pg_clog are sequential with the last file being 0135.
Michael Guerin wrote:
Hi,
Our database filled up and now I'm getting this error on one of the
tables. Is there any way to recover from this? Please let me know if
more information is needed.
pg_version
Michael Guerin [EMAIL PROTECTED] writes:
Also, all files in pg_clog are sequential with the last file being 0135.
Hmm, that makes it sound like a plain old data-corruption problem, ie,
trashed xmin or xmax in some tuple header. Can you do a select
count(*) from this table without getting the
Hmm, that makes it sound like a plain old data-corruption problem, ie,
trashed xmin or xmax in some tuple header. Can you do a select
count(*) from this table without getting the error?
no, select count(*) fails around 25 million rows.
PostgreSQL 8.1RC1 on x86_64-unknown-linux-gnu,
Michael Guerin [EMAIL PROTECTED] writes:
Hmm, that makes it sound like a plain old data-corruption problem, ie,
trashed xmin or xmax in some tuple header. Can you do a select
count(*) from this table without getting the error?
no, select count(*) fails around 25 million rows.
OK, so you
On Jan 5, 2007, at 10:01 PM, Tom Lane wrote:
Michael Best [EMAIL PROTECTED] writes:
Set your memory requirement too high in postgresql.conf, reload
instead
of restarting the database, it silently fails sometime later?
Yeah, wouldn't surprise me, since the reload is going to ignore any
Thomas F. O'Connell [EMAIL PROTECTED] writes:
Michael Best [EMAIL PROTECTED] writes:
Set your memory requirement too high in postgresql.conf, reload
instead of restarting the database, it silently fails sometime later?
Wait, now I'm curious. If a change in postgresql.conf that requires a
On Jan 4, 2007, at 11:24 PM, Michael Best wrote:
When I finally got the error report in the morning the database was
in this state:
$ psql dbname
dbname=# \dt
ERROR: cache lookup failed for relation 20884
Do you have your error logs, and were there any relevant errors in
them
Thomas F. O'Connell wrote:
On Jan 4, 2007, at 11:24 PM, Michael Best wrote:
When I finally got the error report in the morning the database was in
this state:
$ psql dbname
dbname=# \dt
ERROR: cache lookup failed for relation 20884
Do you have your error logs, and were there any
Michael Best [EMAIL PROTECTED] writes:
Set your memory requirement too high in postgresql.conf, reload instead
of restarting the database, it silently fails sometime later?
Yeah, wouldn't surprise me, since the reload is going to ignore any
changes related to resizing shared memory. I think
Had some database corruption problems today. Since they came on the
heels of making some minor database changes yesterday, they may or may
not be related to that. Centos 4.x, Postgresql 8.1.4
I modified the following settings and then issued a reload. I hadn't
turned up the kernel.shmmax
In the document Transaction Processing in PostgreSQL
( http://www.postgresql.org/files/developer/transactions.pdf )
I read :
Postgres transactions are only guaranteed atomic if a disk page write
is an atomic action. On most modern hard drives that's true if a page
is a physical sector, but most
[EMAIL PROTECTED] writes:
In the document Transaction Processing in PostgreSQL
( http://www.postgresql.org/files/developer/transactions.pdf )
That's very, very old information.
I read :
Postgres transactions are only guaranteed atomic if a disk page write
is an atomic action.
Not true
We have recently upgraded from PostgreSQL 7.4.5 to 8.1.4. Our DB is
about 45GB in size and has about 100 tables, lots of stored
procedures, triggers etc.
The DB lived on a SAN and is accessed via Fibre Channel cards in an
IBM BladeCenter. The system is running RedHat Linux Enterprise 3 I
It shouldn't run into these problems from time to time; that kind of scenario only happened to me once, so I don't know exactly how often this can happen. But a recommendation from my end would be to upgrade to a newer PostgreSQL version, as you are using an old release. Also try running some disk
Hello,
We are stressing testing our application. It adds and deletes a lot of
rows. Within 24 hours we ran into some sort of database corruption
problem. We got this error when trying to insert into the users table.
ERROR XX001: invalid page header in block 2552 of relation
Try doing a REINDEX and see if you can recover all data blocks, as it appears to me you have some data blocks messed up. If possible, try taking a backup of your database as well.
Thanks,
-- Shoaib Mir
EnterpriseDB (www.enterprisedb.com)
On 7/27/06, aurora [EMAIL PROTECTED] wrote:
Hello,
We are
From your experience, do you expect the database would run into this from
time to time, requiring a DBA's intervention? If so, it would become a
problem for our customers because our product is a standalone system. We
don't intend to expose the Postgres database underneath.
wy
Try doing a
On 7/26/06, aurora [EMAIL PROTECTED] wrote:
From your experience, do you expect the database would run into this from
time to time, requiring a DBA's intervention? If so, it would become a
problem for our customers because our product is a standalone system. We
don't intend to expose the
On Fri, Jun 18, 2004 at 02:32:16PM -0400, Tom Lane wrote:
Since that 7.4.2 release-note only talked about crashing queries due to the
7.4.1 bug, but not about data corruption occurring, I wondered if the
symptoms I have seen are related to the alignment bug in 7.4.1 or not.
No, I don't
Hi
One of our production systems was running 7.4.1 for a few months, when
suddenly some queries that used a specific table (a cache table) started
crashing the backend.
A colleague of mine fixed the problem by simply dumping and rebuilding
the affected table (That was possible since it was only
Florian G. Pflug [EMAIL PROTECTED] writes:
... I upgraded to 7.4.2, and fixed the system-tables
according to the 7.4.2 release-note. But this didn't really help - the
analyze table issued after fixing the system-tables exited with an
error about an invalid page header in one of our tables.
Chris Stokes [EMAIL PROTECTED] writes:
PANIC: XLogWrite: write request 1/812D is past end of log 1/812D
This sure looks like the symptom of the 7.3.3 failure-to-restart bug.
If you are on 7.3.3 then an update to 7.3.4 will fix it.
regards, tom lane
Chris Stokes [EMAIL PROTECTED] writes:
Just one more question, Where can I read up on this bug, I would like to inform
myself better before I promise a fix to our customer.
See the list archives from just before the 7.3.4 release. The failure
occurs when the old WAL ends exactly on a page
Chris Stokes [EMAIL PROTECTED] writes:
We use the RPM installation, if I do an rpm -Uvh for the packages to upgrade to the
new 7.3.4 will that be sufficient or does it require some sort of database upgrade
or unload/reload?
Not for an update within the 7.3.* series. Just stop postmaster,
Corey Minter [EMAIL PROTECTED] writes:
I don't understand how I
wouldn't be able to run initdb.
How much free disk space have you got?
regards, tom lane
Looks like one of my tables got corrupted. Can someone explain how to
recover from this?? Trying to drop the table is not working...Postgres
hangs.
Any help is appreciated.
Arthur
NOTICE: Index error_interface_idx: NUMBER OF INDEX' TUPLES (226766)
IS NOT THE SAME AS HEAP' (226765)
...
NOTICE: Index pg_class_relname_index: NUMBER OF INDEX' TUPLES (74)
IS NOT THE SAME AS HEAP' (75)
...
IIRC, I think the problem and solution is basically the same:
Bryan Henderson wrote:
NOTICE: Index error_interface_idx: NUMBER OF INDEX' TUPLES (226766)
IS NOT THE SAME AS HEAP' (226765)
...
NOTICE: Index pg_class_relname_index: NUMBER OF INDEX' TUPLES (74)
IS NOT THE SAME AS HEAP' (75)
...
IIRC, I think the problem and
Chris Jones schrieb:
I'm currently getting this error on my nightly vacuum. These two
indices (as you may have guessed already) are on columns named
interface and ewhen, on a table named error. The error table is
constantly being updated. (No comments about the implications of
that,
Hi, all.
I'm relatively new to PostgreSQL, but I've been quite impressed with
it so far. This may be due to too much experience with MySQL. :)
I'm currently getting this error on my nightly vacuum. These two
indices (as you may have guessed already) are on columns named
interface and ewhen,
Chris Jones wrote:
NOTICE: Index error_interface_idx: NUMBER OF INDEX' TUPLES (226766) IS NOT THE SAME
AS HEAP' (226765)
NOTICE: Index error_ewhen_idx: NUMBER OF INDEX' TUPLES (226766) IS NOT THE SAME AS
HEAP' (226765)
Hope this was not already answered...
I believe it means that the
Hi there,
I'm using LibPQ from C++ to access a database and I am doing an update
on a table with two primary keys but my where clause is only using one
of these keys. I am somehow corrupting the database and after that, the
backend dies every time I try to access that table. Any ideas on what