Re: [GENERAL] ltree gist index errors and fill factor questions

2014-12-31 Thread Mike Broers
The database is not crashing, thankfully.  We are waiting for the errors to
come back so we can turn up logging, in the hopes of creating a reproducible
test case.


Re: [GENERAL] ltree gist index errors and fill factor questions

2014-12-31 Thread Mike Broers
I will do my best to provide a reproducible test case.  Is there any more
information I can supply in the meantime that would help?


[GENERAL] ltree gist index errors and fill factor questions

2014-12-31 Thread Mike Broers
Hello,

We have an ltree column (tree_path) that has a gist index
(index_nodes_on_tree_path).  This is in a 9.3.5 database.  Recently errors
started occurring in the postgres log on some updates to this table:

fixing incomplete split in index "index_nodes_on_tree_path", block 2358

STATEMENT:  UPDATE "nodes" SET "parent_id" = $1, "contents_id" =
$2, "contents_type" = $3, "trough_code" = $4, "live_flag" = $5, "ordering"
=$6, "id" = $7, "tree_path" = $8 WHERE "nodes"."id" = 127695

ERROR:  failed to add item to index page in "index_nodes_on_tree_path"


Reindexing index_nodes_on_tree_path fixes the problem temporarily, but it
returns.

I'm wondering if this is a bug (googling the incomplete split error
returns results about a similar bug with b-tree indexes) or a problem with
an inadequate fill factor setting.  It doesn't look like there is a
specified fill factor for this index, and I'm not sure what the gist default
is.

CREATE INDEX index_nodes_on_tree_path ON nodes USING gist (tree_path)

The table in question has about 94k rows; an example of the widest
tree_path value is 69066.69090.69091.69094
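
In case it is useful to anyone reading the archive, here is a minimal sketch of
how to check whether a fillfactor is stored for the index and how to experiment
with one.  The GiST default is 90 as far as I know, and a new value only takes
effect when the index is rebuilt.

-- reloptions is NULL when the index is using the default fillfactor
SELECT relname, reloptions
  FROM pg_class
 WHERE relname = 'index_nodes_on_tree_path';

-- to experiment with a lower fillfactor:
ALTER INDEX index_nodes_on_tree_path SET (fillfactor = 70);
REINDEX INDEX index_nodes_on_tree_path;  -- the setting applies on rebuild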

Any advice is appreciated, happy new year!

Mike


Re: [GENERAL] question about checksum in 9.3

2014-01-14 Thread Mike Broers
Thanks for the responses, that clarifies the checksum feature for me.

FWIW, my pgbench tests between a 9.2 server and a 9.3 server with checksums
enabled showed very similar performance characteristics and system resource
utilization.  I'm going to set up another load test with our actual
application to see if that reveals any noticeable performance difference.

Thanks again
Mike


On Mon, Jan 13, 2014 at 7:11 PM, Michael Paquier wrote:

> On Tue, Jan 14, 2014 at 1:50 AM, Mike Broers  wrote:
> > Hello, I am in the process of planning a 9.3 migration of postgres and I am
> > curious about the checksum features available.  In my test 9.3 instance it
> > seemed like this feature provides a log entry of the exact database/oid of
> > the corrupt object when it is accessed, but not much else.  I can't find
> > much documentation on anything else this feature provides.
> A few things:
> - The only way to know if a server is using data checksums is to use
> pg_controldata.
> - Be aware as well of the potential performance impact on your CPU:
> checksums are checked each time a page is read, and recalculated each
> time a page is updated.
> - ignore_checksum_failure can be used to ignore failures.  Don't use
> that on a production system.
>
> > Is there a built-in method of scanning the server to check for corruption
> > or will I have to wait for a corrupt object to be accessed to see the log
> > entry?
> You can also read tables manually, for example with a sequential scan,
> to check whether blocks are broken or not.
>
> > Is there a relation that stores last checksum status or any way of
> > reporting on what objects are identified by postgres as corrupt or not
> > corrupt?
> No, though you could build one with a background worker that scans
> relation pages and registers the failing blocks.
>
> > Are there any other features of the checksum I am missing besides the log
> > entry?
> 9.4 has a new GUC parameter called data_checksums that allows you to
> check from a psql client whether checksums are used on a server.
>
> Regards,
> --
> Michael
>
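
A minimal sketch of the manual check Michael describes above - forcing full
sequential reads so that any checksum failure is reported - with the table name
as a placeholder:

-- disable index paths so the table itself is read end to end
SET enable_indexscan = off;
SET enable_bitmapscan = off;
SELECT count(*) FROM some_table;  -- placeholder table name

-- on 9.4 or later only, per the note above:
SHOW data_checksums;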


[GENERAL] question about checksum in 9.3

2014-01-13 Thread Mike Broers
Hello, I am in the process of planning a 9.3 migration of postgres and I am
curious about the checksum features available.  In my test 9.3 instance it
seemed like this feature provides a log entry of the exact database/oid of
the corrupt object when it is accessed, but not much else.  I can't find
much documentation on anything else this feature provides.

Is there a built-in method of scanning the server to check for corruption
or will I have to wait for a corrupt object to be accessed to see the log
entry?  Is there a relation that stores last checksum status or any way of
reporting on what objects are identified by postgres as corrupt or not
corrupt?

Are there any other features of the checksum I am missing besides the log
entry?

Thanks
Mike


Re: [GENERAL] Re: corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-26 Thread Mike Broers
The restore of a post-crash production backup worked as hoped and the 2nd
replication slave is back into its happy hot standby state.

So if this problem replicated to our standby servers, does that indicate
that the potentially problematic fsync occurred during a pg_xlog write?
Would breaking replication at the time of the crash have prevented this from
cascading, or was it already too late at that point?

Thanks again for the input, it's been very helpful!
Mike




On Mon, Nov 25, 2013 at 12:20 PM, Mike Broers  wrote:

> Thanks Shaun,
>
> I'm planning to schedule a time to do the vacuum freeze suggested
> previously.  So far the extent of the problem seems limited to the one
> session table and the one session row that was being used by a heavy bot
> scan at the time of the crash.  Currently I'm testing a recovery of a
> production backup from today to rebase one of the replication targets that
> I was using to test fixes last week.  Hopefully that validates the current
> backups and I can proceed inquiring with our managed services provider
> about the false notification of the disk write and ways to prevent that
> going forward.
>
> I'll update the list if I uncover anything interesting in the process
> and/or need more advice; thanks again for your input - it's much appreciated
> as always.  Nothing like a little crash corruption to get the blood flowing!
>
> Mike
>
>
> On Mon, Nov 25, 2013 at 10:29 AM, Shaun Thomas wrote:
>
>> > Update - I have two hot replication slaves of this db; both have the
>> > problem.  I took one out of recovery and ran REINDEX table
>> > session_session and it fixed the errors about this row.  Now I'm going
>> > to run vacuum and see if there are other tables that complain, but I'm
>> > guessing if so I will need to see if there is a way to force vacuum to
>> > continue on error; worst case I might have to script a table-by-table
>> > vacuum, I guess.  If anyone has a better suggestion for determining the
>> > extent of the damage I'd appreciate it.
>>
>> Oh man. I'm sorry, Mike.
>>
>> One of the cardinal rules I have is to disconnect any replication
>> following a database crash. It's just too easy for damaged replicated rows
>> to be propagated unless you're on 9.3 and have checksums enabled. If you
>> want to perform a  table-by-table check, don't vacuum the database, but the
>> individual tables. I'd go with a DO loop and have it raise notices into the
>> log so you can investigate further:
>>
>> COPY (
>> SELECT 'VACUUM ' || oid::regclass::text || ';'
>>   FROM pg_class
>>  WHERE relkind = 'r'
>> ) to '/tmp/vac_all.sql';
>>
>> Run the /tmp/vac_all.sql through psql and pipe the contents into a log
>> file. Any table that doesn't vacuum successfully will need to be repaired
>> manually. One way you can do this if there are dupes, is by checking the
>> ctid value after disabling index scans:
>>
>> SET enable_indexscan TO False;
>>
>> SELECT ctid, * FROM [broken_table] WHERE ...;
>>
>> Just construct the WHERE clause based on the error output, and you should
>> get all rows if there are dupes. You'll need to figure out which row to
>> keep, then delete the bad row based on the ctid. Do this as many times as
>> it takes, then reindex to make sure the proper row versions are indexed.
>>
>> It's also a good idea to dump any table that came back with an error,
>> just in case.
>>
>> After you've done all of that, you should re-base your replicas once
>> you've determined your production system is usable. In the meantime, I
>> highly recommend you set up a VIP you can assign to one of your replicas if
>> your production system dies again, and remove any autostart code. If your
>> production system crashes, switch the VIP immediately to a replica, and
>> invalidate your old production system. Data corruption is insidious when
>> streaming replication is involved.
>>
>> Look into tools like repmgr to handle managing your replicas as a cluster
>> to make forced invalidation and re-basing easier.
>>
>> Good luck!
>>
>> --
>> Shaun Thomas
>> OptionsHouse | 141 W. Jackson Blvd | Suite 500 | Chicago IL, 60604
>> 312-676-8870
>> stho...@optionshouse.com
>>
>> __
>>
>> See http://www.peak6.com/email_disclaimer/ for terms and conditions
>> related to this email
>>
>
>
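
A minimal sketch of the DO-loop variant Shaun mentions, which reads each table
in full and raises a warning on failure (VACUUM itself cannot run inside a DO
block, which is presumably why the COPY approach above writes a script
instead):

DO $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT oid::regclass AS tbl
               FROM pg_class
              WHERE relkind = 'r'
    LOOP
        BEGIN
            -- a full sequential read; errors are caught per table
            EXECUTE format('SELECT count(*) FROM %s', r.tbl);
        EXCEPTION WHEN OTHERS THEN
            RAISE WARNING 'full read of % failed: %', r.tbl, SQLERRM;
        END;
    END LOOP;
END
$$;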


Re: [GENERAL] Re: corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-25 Thread Mike Broers
Thanks Shaun,

I'm planning to schedule a time to do the vacuum freeze suggested
previously.  So far the extent of the problem seems limited to the one
session table and the one session row that was being used by a heavy bot
scan at the time of the crash.  Currently I'm testing a recovery of a
production backup from today to rebase one of the replication targets that
I was using to test fixes last week.  Hopefully that validates the current
backups and I can proceed inquiring with our managed services provider
about the false notification of the disk write and ways to prevent that
going forward.

I'll update the list if I uncover anything interesting in the process
and/or need more advice; thanks again for your input - it's much appreciated
as always.  Nothing like a little crash corruption to get the blood flowing!

Mike


On Mon, Nov 25, 2013 at 10:29 AM, Shaun Thomas wrote:

> > Update - I have two hot replication slaves of this db; both have the
> > problem.  I took one out of recovery and ran REINDEX table session_session
> > and it fixed the errors about this row.  Now I'm going to run vacuum and
> > see if there are other tables that complain, but I'm guessing if so I will
> > need to see if there is a way to force vacuum to continue on error; worst
> > case I might have to script a table-by-table vacuum, I guess.  If anyone
> > has a better suggestion for determining the extent of the damage I'd
> > appreciate it.
>
> Oh man. I'm sorry, Mike.
>
> One of the cardinal rules I have is to disconnect any replication
> following a database crash. It's just too easy for damaged replicated rows
> to be propagated unless you're on 9.3 and have checksums enabled. If you
> want to perform a  table-by-table check, don't vacuum the database, but the
> individual tables. I'd go with a DO loop and have it raise notices into the
> log so you can investigate further:
>
> COPY (
> SELECT 'VACUUM ' || oid::regclass::text || ';'
>   FROM pg_class
>  WHERE relkind = 'r'
> ) to '/tmp/vac_all.sql';
>
> Run the /tmp/vac_all.sql through psql and pipe the contents into a log
> file. Any table that doesn't vacuum successfully will need to be repaired
> manually. One way you can do this if there are dupes, is by checking the
> ctid value after disabling index scans:
>
> SET enable_indexscan TO False;
>
> SELECT ctid, * FROM [broken_table] WHERE ...;
>
> Just construct the WHERE clause based on the error output, and you should
> get all rows if there are dupes. You'll need to figure out which row to
> keep, then delete the bad row based on the ctid. Do this as many times as
> it takes, then reindex to make sure the proper row versions are indexed.
>
> It's also a good idea to dump any table that came back with an error, just
> in case.
>
> After you've done all of that, you should re-base your replicas once
> you've determined your production system is usable. In the meantime, I
> highly recommend you set up a VIP you can assign to one of your replicas if
> your production system dies again, and remove any autostart code. If your
> production system crashes, switch the VIP immediately to a replica, and
> invalidate your old production system. Data corruption is insidious when
> streaming replication is involved.
>
> Look into tools like repmgr to handle managing your replicas as a cluster
> to make forced invalidation and re-basing easier.
>
> Good luck!
>
> --
> Shaun Thomas
> OptionsHouse | 141 W. Jackson Blvd | Suite 500 | Chicago IL, 60604
> 312-676-8870
> stho...@optionshouse.com
>
> __
>
> See http://www.peak6.com/email_disclaimer/ for terms and conditions
> related to this email
>
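
A minimal sketch of the ctid-based cleanup Shaun describes above, using the
table from this thread and a placeholder ctid - keep the row you want, delete
the other version by its physical location, then reindex:

SET enable_indexscan TO off;

-- the WHERE clause comes from the error output
SELECT ctid, * FROM session_session
 WHERE session_key = 'gk9aap5d7btp6tzquh0kf73gpfmik5w3';

-- delete the unwanted version by its ctid (placeholder value)
DELETE FROM session_session WHERE ctid = '(1234,5)';

REINDEX TABLE session_session;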


Re: [GENERAL] corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-22 Thread Mike Broers
vacuumdb -avz, pg_dumpall, and VACUUM FREEZE ANALYZE on the former standby
database that received the corruption via replication all came back without
errors.  Is the vacuum freeze intended to potentially fix problems, or just
to reveal whether other tables may have corruption?  I'm trying to decide if
this needs to be run in production.
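
For reference, a minimal per-table form of that vacuum (assuming the 9.0+
option syntax; VERBOSE makes per-table warnings easier to spot, and
session_session is just the table from this thread):

VACUUM (FREEZE, ANALYZE, VERBOSE) session_session;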


On Thu, Nov 21, 2013 at 5:09 PM, Kevin Grittner  wrote:

> Mike Broers  wrote:
>
> > Is there anything I should look out for with vacuum freeze?
>
> Just check the logs and the vacuum output for errors and warnings.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [GENERAL] corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-21 Thread Mike Broers
Thanks.  After this pg_dumpall I am going to see what kind of impact I can
expect from running VACUUM FREEZE ANALYZE (normally I just run vacuumdb
-avz nightly via a cron job) and schedule time to run this in production
against all the tables in the database.  Is there anything I should look
out for with vacuum freeze?

Much appreciated,
Mike


On Thu, Nov 21, 2013 at 4:51 PM, Kevin Grittner  wrote:

> Mike Broers  wrote:
>
> > Thanks for the response.  fsync and full_page_writes are both on.
>
> > [ corruption appeared following power loss on the machine hosting
> > the VM running PostgreSQL ]
>
> That leaves three possibilities:
>   (1)  fsync doesn't actually guarantee persistence in your stack.
>   (2)  There is a hardware problem which has not been recognized.
>   (3)  There is a so-far unrecognized bug in PostgreSQL.
>
> Based on my personal experience, those are listed in descending
> order of probability.  I seem to recall reports of some VM for
> which an fsync did not force data all the way to persistent
> storage, but I don't recall which one.  You might want to talk to
> your service provider about what guarantees they make in this
> regard.
>
> > Is there something else I can run to confirm we are more or less
> > ok at the database level after the pg_dumpall or is there no way
> > to be sure and a fresh initdb is required.
>
> Given that you had persistence options in their default state of
> "on", and the corruption appeared after a power failure in a VM
> environment, I would guess that the damage is probably limited.
> That said, damage from this sort of event can remain hidden and
> cause data loss later.  Unfortunately we do not yet have a
> consistency checker that can root out such problems.  If you can
> arrange a maintenance window to dump and load to a fresh initdb,
> that would eliminate the possibility that some hidden corruption is
> lurking.  If that is not possible, running VACUUM FREEZE ANALYZE
> will reduce the number of things that can go wrong, without
> requiring down time.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [GENERAL] corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-21 Thread Mike Broers
Thanks for the response.  fsync and full_page_writes are both on.

Our database runs on a managed hosting provider's VM host server/SAN; I can
possibly ask them to provide some hardware test results - do you have any
specific diagnostics in mind?  The crash was apparently due to our VM host
suddenly losing power.  The only row that it has complained about with the
chunk error also migrated into both standby servers, and as previously
stated it was fixed with a reindex of the parent table on one of the standby
servers after taking it out of recovery.  The vacuumdb -avz on this test
copy didn't have any errors or warnings; I'm going to also run a pg_dumpall
on this host to see if any other rows are problematic.

Is there something else I can run to confirm we are more or less ok at the
database level after the pg_dumpall, or is there no way to be sure and a
fresh initdb is required?

I am planning on running the reindex in actual production tonight during
our maintenance window, but was hoping if that worked we would be out of
the woods.



On Thu, Nov 21, 2013 at 3:56 PM, Kevin Grittner  wrote:

> Mike Broers  wrote:
>
> > Hello we are running postgres 9.2.5 on RHEL6, our production
> > server crashed hard and when it came back up our logs were
> > flooded with:
>
> > ERROR:  unexpected chunk number 0 (expected 1) for toast value 117927127
> in pg_toast_19122
>
> Your database is corrupted.  Unless you were running with fsync =
> off or full_page_writes = off, that should not happen.  It is
> likely to be caused by a hardware problem (bad RAM, a bad disk
> drive, or network problems if your storage is across a network).
>
> If it were me, I would stop the database service and copy the full
> data directory tree.
>
> http://wiki.postgresql.org/wiki/Corruption
>
> If fsync or full_page_writes were off, your best bet is probably to
> go to your backup.  If you don't go to a backup, you should try to
> get to a point where you can run pg_dump, and dump and load to a
> freshly initdb'd cluster.
>
> If fsync and full_page_writes were both on, you should run hardware
> diagnostics at your earliest opportunity.  When hardware starts to
> fail, the first episode is rarely the last or the most severe.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


[GENERAL] Re: corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-21 Thread Mike Broers
Update - I have two hot replication slaves of this db; both have the
problem.  I took one out of recovery and ran REINDEX table session_session
and it fixed the errors about this row.  Now I'm going to run vacuum and see
if there are other tables that complain, but I'm guessing if so I will need
to see if there is a way to force vacuum to continue on error; worst case I
might have to script a table-by-table vacuum, I guess.  If anyone has a
better suggestion for determining the extent of the damage I'd appreciate it.


On Thu, Nov 21, 2013 at 2:10 PM, Mike Broers  wrote:

> Hello we are running postgres 9.2.5 on RHEL6, our production server
> crashed hard and when it came back up our logs were flooded with:
>
> STATEMENT:  SELECT "session_session"."session_key",
> "session_session"."session_data", "session_session"."expire_date",
> "session_session"."nonce" FROM "session_session" WHERE
> ("session_session"."session_key" = 'gk9aap5d7btp6tzquh0kf73gpfmik5w3'  AND
> "session_session"."expire_date" > '2013-11-21 13:27:33.107913' )
>
> ERROR:  unexpected chunk number 0 (expected 1) for toast value 117927127
> in pg_toast_19122
>
> We restarted the application and whatever session was constantly hitting
> that row stopped, but I'm concerned about remediation.  When I attempt to
> read from that row the error occurs.
>
> select * from session_session where session_key =
> 'gk9aap5d7btp6tzquh0kf73gpfmik5w3';
> ERROR:  unexpected chunk number 0 (expected 1) for toast value 117927127
> in pg_toast_19122
>
> When I attempt to delete this row I get this error:
> delete from session_session where session_key =
> 'gk9aap5d7btp6tzquh0kf73gpfmik5w3';
> ERROR:  tuple concurrently updated
>
> We happen to have a maintenance window tonight so I will have some time
> when the app is down to run some database fixes.  I saw other threads
> suggesting a reindex of the toast table, but this is a 14GB table and I'm
> not sure how long that will take or if it will even be successful.  We also
> have a full db vacuum/analyze scheduled nightly for 2am so I am expecting
> to learn if there are other impacted tables, but it's troubling if I don't
> know what the remediation is.  This particular table could be truncated if
> necessary, if that is an option, but I'm not sure about other tables.
>
> Any suggestions for how to handle the tuple concurrently updated error? Or
> if a reindex is likely to help with the unexpected chunk error?
>
> Thanks
> Mike
>


[GENERAL] corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-21 Thread Mike Broers
Hello we are running postgres 9.2.5 on RHEL6, our production server crashed
hard and when it came back up our logs were flooded with:

STATEMENT:  SELECT "session_session"."session_key",
"session_session"."session_data", "session_session"."expire_date",
"session_session"."nonce" FROM "session_session" WHERE
("session_session"."session_key" = 'gk9aap5d7btp6tzquh0kf73gpfmik5w3'  AND
"session_session"."expire_date" > '2013-11-21 13:27:33.107913' )

ERROR:  unexpected chunk number 0 (expected 1) for toast value 117927127 in
pg_toast_19122

We restarted the application and whatever session was constantly hitting
that row stopped, but I'm concerned about remediation.  When I attempt to
read from that row the error occurs.

select * from session_session where session_key =
'gk9aap5d7btp6tzquh0kf73gpfmik5w3';
ERROR:  unexpected chunk number 0 (expected 1) for toast value 117927127 in
pg_toast_19122

When I attempt to delete this row I get this error:
delete from session_session where session_key =
'gk9aap5d7btp6tzquh0kf73gpfmik5w3';
ERROR:  tuple concurrently updated

We happen to have a maintenance window tonight so I will have some time
when the app is down to run some database fixes.  I saw other threads
suggesting a reindex of the toast table, but this is a 14GB table and I'm
not sure how long that will take or if it will even be successful.  We also
have a full db vacuum/analyze scheduled nightly for 2am so I am expecting
to learn if there are other impacted tables, but it's troubling if I don't
know what the remediation is.  This particular table could be truncated if
necessary, if that is an option, but I'm not sure about other tables.

Any suggestions for how to handle the tuple concurrently updated error? Or
if a reindex is likely to help with the unexpected chunk error?
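
For reference, a minimal sketch of the reindex being considered - either the
toast table named in the error or the parent table, since reindexing a table
also rebuilds its TOAST index (per the update elsewhere in this thread, the
parent-table form cleared the error on a standby):

REINDEX TABLE pg_toast.pg_toast_19122;
-- or:
REINDEX TABLE session_session;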

Thanks
Mike


[GENERAL] 8.3.8 question about backup/recovery behavior

2010-09-15 Thread Mike Broers
We take nightly backups by starting a backup, copying the data directory and
archived logs, then stopping the backup.
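
For reference, a minimal sketch of that sequence on 8.3 (the label and copy
step are placeholders; archive_command is assumed to already be configured):

SELECT pg_start_backup('nightly');
-- ... file-level copy of the data directory here (e.g. rsync or tar),
--     with the archived WAL copied separately ...
SELECT pg_stop_backup();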

Today I tested the recoverability of the backup by mounting this backup
directory on a different server, copying the 3 hours of transaction logs
from after last night's backup up until this morning to that server, and
starting up postgres there from the backup.  I want to verify that what I
experienced was normal behavior:

My first attempt was to see if I could start the postmaster without setting
up recovery.conf.  To my surprise, it worked: pg looked in the pg_xlog
folder, noted in the log that the last completed transaction was at the time
of the last available transaction log (9:15), and opened the database for
connections.  I guess I was surprised that it automatically rolled forward
through all the logs up to the last one and then opened, instead of just
opening the database at the time of the stop backup command, which was about
3 hours prior to the last available transaction log.

Here is what I saw in the log:
2010-09-14 10:48:43.626 CDT <@> [6848] LOG:  0: database system was not
properly shut down; automatic recovery in progress
2010-09-14 10:48:43.626 CDT <@> [6848] LOCATION:  StartupXLOG, xlog.c:5006
2010-09-14 10:48:43.679 CDT <@> [6848] LOG:  0: redo starts at
15A/4E01A5D8
2010-09-14 10:48:43.679 CDT <@> [6848] LOCATION:  StartupXLOG, xlog.c:5068
2010-09-14 10:49:01.059 CDT <@> [6848] LOG:  58P01: could not open file
"pg_xlog/0001015A00C6" (log file 346, segment 198): No such file
or directory
2010-09-14 10:49:01.059 CDT <@> [6848] LOCATION:  XLogFileRead, xlog.c:2365
2010-09-14 10:49:01.059 CDT <@> [6848] LOG:  0: redo done at
15A/C5FFF1F8
2010-09-14 10:49:01.059 CDT <@> [6848] LOCATION:  StartupXLOG, xlog.c:5138
2010-09-14 10:49:01.097 CDT <@> [6848] LOG:  0: last completed
transaction was at log time 2010-09-14 09:15:07.450314-05
2010-09-14 10:49:01.097 CDT <@> [6848] LOCATION:  StartupXLOG, xlog.c:5142

The database was open and I could log in, but I couldn't determine a way to
see if the database was brought up to 9:15 am or to some earlier time.

My second attempt was to shut down, set up recovery.conf, and start the
database in recovery.  This also worked, as expected.  Can anyone verify
that the first attempt is valid or expected behavior?  I guess I'm so used
to Oracle's very specific and strict backup/recovery methods that this
shocked me a little.
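
For anyone reproducing the second attempt, a minimal recovery.conf sketch (the
archive path is a placeholder; recovery_target_time is optional if you simply
want to replay all available WAL):

restore_command = 'cp /path/to/archive/%f %p'
# optional: stop replay at a specific time instead of end of WAL
# recovery_target_time = '2010-09-14 08:00:00 CDT'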


Thanks
Mike