Re: [GENERAL] Streaming replication slave crash

2013-11-25 Thread Mahlon E. Smith
On Tue, Sep 10, 2013, Mahlon E. Smith wrote: > On Mon, Sep 09, 2013, Jeff Davis wrote: > > > > You may have seen only partial information about that bug and the fix. > > Yep, I totally glazed over the REINDEX. Giving it a go -- thank you! As a followup for anyone else landing on this thread, th

Re: [GENERAL] Streaming replication slave crash

2013-09-10 Thread Mahlon E. Smith
On Mon, Sep 09, 2013, Jeff Davis wrote: > > You may have seen only partial information about that bug and the fix. Yep, I totally glazed over the REINDEX. Giving it a go -- thank you! -- Mahlon E. Smith http://www.martini.nu/contact.html

Re: [GENERAL] Streaming replication slave crash

2013-09-09 Thread Jeff Davis
On Mon, 2013-09-09 at 13:04 -0700, Mahlon E. Smith wrote: > After some wild googlin' "research", I saw the index visibility map fix > for 9.2.1. We did pg_upgrade in-between versions, but just to be sure I > wasn't somehow carrying corrupt data across versions (?), I went ahead > and VACUUMed eve

Re: [GENERAL] Streaming replication slave crash

2013-09-09 Thread Mahlon E. Smith
[piggybackin' on older (seeming very similar) thread...] On Fri, Mar 29, 2013, Quentin Hartman wrote: > Yesterday morning, one of my streaming replication slaves running 9.2.3 > crashed with the following in the log file: > > 2013-03-28 12:49:30 GMT WARNING: page 1441792 of relation base/63229/

Re: [GENERAL] Streaming replication slave crash

2013-03-29 Thread Quentin Hartman
On Fri, Mar 29, 2013 at 10:50 AM, Tom Lane wrote: > Quentin Hartman writes: > > On Fri, Mar 29, 2013 at 10:37 AM, Tom Lane wrote: > >> What process did you use for setting up the slave? > > > I used an rsync from the master while both were stopped. > > If the master was shut down cleanly (not -

Re: [GENERAL] Streaming replication slave crash

2013-03-29 Thread Tom Lane
Quentin Hartman writes: > On Fri, Mar 29, 2013 at 10:37 AM, Tom Lane wrote: >> What process did you use for setting up the slave? > I used an rsync from the master while both were stopped. If the master was shut down cleanly (not -m immediate) then the bug fix I was thinking about wouldn't expl

Re: [GENERAL] Streaming replication slave crash

2013-03-29 Thread Quentin Hartman
On Fri, Mar 29, 2013 at 10:37 AM, Tom Lane wrote: > Quentin Hartman writes: > > Yesterday morning, one of my streaming replication slaves running 9.2.3 > > crashed with the following in the log file: > > What process did you use for setting up the slave? > I used an rsync from the master while

Re: [GENERAL] Streaming replication slave crash

2013-03-29 Thread Tom Lane
Quentin Hartman writes: > Yesterday morning, one of my streaming replication slaves running 9.2.3 > crashed with the following in the log file: What process did you use for setting up the slave? There's a fix awaiting release in 9.2.4 that might explain data corruption on a slave, depending on h

Re: [GENERAL] Streaming replication slave crash

2013-03-29 Thread Quentin Hartman
On Fri, Mar 29, 2013 at 10:23 AM, Lonni J Friedman wrote: > Looks like you've got some form of corruption: > page 1441792 of relation base/63229/63370 does not exist > Thanks for the insight. I thought that might be it, but never having seen this before I'm glad to have some confirmation. The que

Re: [GENERAL] Streaming replication slave crash

2013-03-29 Thread Lonni J Friedman
Looks like you've got some form of corruption: page 1441792 of relation base/63229/63370 does not exist The question is whether it was corrupted on the master and then replicated to the slave, or if it was corrupted on the slave. I'd guess that the pg_dump tried to read from that page and barfed.
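
For anyone trying to make sense of the warning quoted above, here is a minimal sketch of how the message could be broken down. It is not from the thread: the helper name describe_missing_page is hypothetical, and the arithmetic assumes a stock build with 8 kB blocks and 1 GB relation segment files. The path base/63229/63370 is read as database OID 63229 and relfilenode 63370.

```python
import re

# Assumptions (not stated in the thread): default 8 kB block size and
# 1 GB relation segment files, as in a stock PostgreSQL build.
BLOCK_SIZE = 8192
BLOCKS_PER_SEGMENT = (1024 ** 3) // BLOCK_SIZE  # 131072 blocks per segment

# Matches the WARNING text shown in the original report.
WARNING_RE = re.compile(
    r"page (?P<page>\d+) of relation "
    r"base/(?P<db_oid>\d+)/(?P<relfilenode>\d+) does not exist"
)


def describe_missing_page(log_line):
    """Hypothetical helper: split the warning into its numeric parts.

    Returns None if the line does not contain the warning.
    """
    match = WARNING_RE.search(log_line)
    if match is None:
        return None
    page = int(match.group("page"))
    return {
        "db_oid": int(match.group("db_oid")),
        "relfilenode": int(match.group("relfilenode")),
        "page": page,
        # Which segment file the page would live in, and where.
        "segment": page // BLOCKS_PER_SEGMENT,
        "block_in_segment": page % BLOCKS_PER_SEGMENT,
        "byte_offset": page * BLOCK_SIZE,
    }


line = ("2013-03-28 12:49:30 GMT WARNING: page 1441792 of relation "
        "base/63229/63370 does not exist")
info = describe_missing_page(line)
print(info)
```

Under those assumptions, page 1441792 would be the first block of segment 11, roughly 11 GiB into the relation. The relfilenode can usually be mapped back to a relation name by looking it up in pg_class while connected to the database whose OID matches pg_database.oid.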

[GENERAL] Streaming replication slave crash

2013-03-29 Thread Quentin Hartman
Yesterday morning, one of my streaming replication slaves running 9.2.3 crashed with the following in the log file: 2013-03-28 12:49:30 GMT WARNING: page 1441792 of relation base/63229/63370 does not exist 2013-03-28 12:49:30 GMT CONTEXT: xlog redo delete: index 1663/63229/109956; iblk 303, heap