On Tuesday, February 11, 2014 5:43:24 PM UTC-6, Daniel Farina wrote:
>
> On Tue, Feb 11, 2014 at 3:39 PM, Kevin Harriss 
> <[email protected]> 
> wrote: 
> >> 
> >> 
> >> The fact that WAL-E suggests it's downloading the log is troubling. 
> >> 
> >> I've seen WAL corruption manifest this way: postgres will look at the 
> >> segment, give up, but then try restoring again without so much as a 
> >> peep if memory serves.  Is postgres complaining somewhere? 
> > 
> > 
> > There aren't any postgres errors or complaints in any of the logs. It
> > just always says it is waiting to startup when a client tries to
> > connect to the slave. 
>
> Yeah, it's stuck in crash recovery, perhaps vainly hoping to someday 
> escape. 
>
> You can try wal-fetching this segment and placing it in pg_xlog, then 
> turning off archiving.  Maybe Postgres will be convinced to die and 
> tell you why, then. 
>
> >> Sadly, the last time I figured this out it was a corruption so severe 
> >> that I downloaded the WAL to break it open and noticed it had very 
> >> much the wrong file size, as were all the WAL leading up to it before 
> >> an EBS crash.  Somehow the server continued on happily for hours 
> >> afterwards which did not make for an easy recovery (I was lucky that 
> >> there was not a double-failure and pg_resetxlog plus dump/restore was 
> >> available to me). 
> >> 
> >> It could also be a more pedestrian bug somewhere else, but if so, it'd 
> >> be the first. 
> >> 
> >> Try a new base backup/restore and cross your fingers, and perhaps 
> >> preserve 000000010000001C000000CE and try running it through xlogdump 
> >> and submitting information to pgsql-bugs if things are amiss. 
> > 
> > 
> > Are you recommending to push a fresh backup from the master to S3 and
> > then do a fresh restore on the slave? 
>
> Yes.  You probably want to do this first before digging around in the 
> old system for forensics.
>

On the slave, I would do the following steps: 
1 stop postgres
2 delete $PGDATA
3 envdir /etc/wal-e.d/env wal-e backup-fetch $PGDATA LATEST
4 start up postgres and watch the recovery
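
Roughly, those steps could be scripted as below. This is only a sketch under assumptions: the use of pg_ctl to stop and start the server, the default data directory path, the extra mkdir step, and the `run` and `resync_slave` helper names are all illustrative and not taken from this thread. It defaults to a dry run that just prints each command.

```shell
#!/bin/sh
# Sketch: re-sync a slave from the latest WAL-E base backup.
# Run as the postgres OS user. DRY_RUN=1 (the default) only prints commands.

# Assumed data directory; override by exporting PGDATA before running.
PGDATA="${PGDATA:-/var/lib/postgresql/9.3/main}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

resync_slave() {
    # 1. stop postgres (pg_ctl is an assumption; an init script also works)
    run pg_ctl -D "$PGDATA" stop -m fast
    # 2. delete the old data directory (":?" aborts if PGDATA is empty)
    run rm -rf "${PGDATA:?}"
    # Recreate an empty directory with the permissions postgres expects.
    run mkdir -m 700 -p "$PGDATA"
    # 3. fetch the most recent base backup from S3
    run envdir /etc/wal-e.d/env wal-e backup-fetch "$PGDATA" LATEST
    # A standby also needs recovery.conf (with restore_command) in $PGDATA
    # before it is started; putting that file back is not shown here.
    # 4. start postgres, then watch the server log for recovery progress
    run pg_ctl -D "$PGDATA" start
}

resync_slave
```

Once the printed commands look right, setting DRY_RUN=0 would actually execute them.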
 

>
> Also, what version of WAL-E are you using? 
>

The wal-e version is 0.6.8.
