Re: [GENERAL] Streaming Replication - Error on Standby

2014-02-12 Thread bobJobS
Thanks for the information and the URLs!



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Streaming-Replication-Error-on-Standby-tp5791463p5791588.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Streaming Replication - Error on Standby

2014-02-11 Thread Adrian Klaver

On 02/11/2014 10:12 AM, bobJobS wrote:

> Postgres 9.3.2.
> RHEL 5
>
> After performing all of the Streaming Replication setup steps, I get the
> following error message in my standby DB log file.
>
>    database system identifier differs between the primary and standby
>
> I've double checked the recovery.conf file and it contains the correct
> hostname, port, username and password.
>
> I've also verified that ssh into either the primary or standby as the
> replication user does not require a password.
>
> When building the standby I took the latest backup from the primary and
> loaded the standby. After, I executed the start backup command on the
> primary, followed by rsync, then the stop backup command.
>
> I've googled the heck out of the error message and I found only references
> to invalid primary connection information.
>
> Any ideas?




Also meant to send this link which is newer and covers pg_basebackup:

https://wiki.postgresql.org/wiki/Hot_Standby



--
Adrian Klaver
adrian.kla...@gmail.com




Re: [GENERAL] Streaming Replication - Error on Standby

2014-02-11 Thread Adrian Klaver

On 02/11/2014 10:12 AM, bobJobS wrote:

> Postgres 9.3.2.
> RHEL 5
>
> After performing all of the Streaming Replication setup steps, I get the
> following error message in my standby DB log file.
>
>    database system identifier differs between the primary and standby
>
> I've double checked the recovery.conf file and it contains the correct
> hostname, port, username and password.
>
> I've also verified that ssh into either the primary or standby as the
> replication user does not require a password.
>
> When building the standby I took the latest backup from the primary and
> loaded the standby. After, I executed the start backup command on the
> primary, followed by rsync, then the stop backup command.
>
> I've googled the heck out of the error message and I found only references
> to invalid primary connection information.
>
> Any ideas?


Look at this tutorial:

https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial



--
Adrian Klaver
adrian.kla...@gmail.com




Re: [GENERAL] Streaming Replication - Error on Standby

2014-02-11 Thread bricklen
On Tue, Feb 11, 2014 at 11:25 AM, bobJobS  wrote:

>
> To get the standby server to a starting point, I took a globals dump and a
> data dump of the primary server and built the standby.
>
> Then I executed pg_start_backup, rsynced the data dir to the standby data
> dir (to catch any changes made while I was building the standby) and
> finally pg_stop_backup... all on the primary.
>


The steps you are using will not work. You cannot use a pg_dump/pg_dumpall
backup from a master to set up a slave. pg_dump generates a "logical"
backup, which is used for recovery, not for setting up slaves. A very
high-level view of the replication setup:
- put the master in backup mode (only needed for the rsync method;
pg_basebackup does this itself)
- copy the master's data directory to the slave with pg_basebackup or rsync
(no slave data exists prior to this step)
- take the master out of backup mode (again, rsync method only)
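
As a rough illustration of the pg_basebackup route, here is a small Python sketch that only assembles the command line you would run on the standby host. The host name, role name and data directory are hypothetical placeholders, not values from this thread. The flags themselves are standard pg_basebackup options; -R (write a minimal recovery.conf) is available from 9.3 on.

```python
import shlex


def basebackup_command(host, port, user, data_dir):
    """Build a pg_basebackup invocation to be run on the standby host."""
    return [
        "pg_basebackup",
        "-h", host,        # primary's host name
        "-p", str(port),   # primary's port
        "-U", user,        # role with the REPLICATION attribute
        "-D", data_dir,    # target data directory; must be empty or absent
        "-X", "stream",    # also stream the WAL generated during the backup
        "-R",              # write a minimal recovery.conf (9.3 and later)
        "-P",              # show progress
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    cmd = basebackup_command("primary.example.com", 5432, "replicator",
                             "/var/lib/pgsql/9.3/data")
    print(" ".join(shlex.quote(part) for part in cmd))
```

Because the copy is taken at the file level, the standby ends up with the same system identifier as the master, which a dump-and-restore into a freshly initialised cluster cannot give you.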


Re: [GENERAL] Streaming Replication - Error on Standby

2014-02-11 Thread bobJobS


To get the standby server to a starting point, I took a globals dump and a
data dump of the primary server and built the standby.

Then I executed pg_start_backup, rsynced the data dir to the standby data
dir (to catch any changes made while I was building the standby) and
finally pg_stop_backup... all on the primary.

Thank you for the URL. I'll check it out.





Re: [GENERAL] Streaming Replication - Error on Standby

2014-02-11 Thread bricklen
On Tue, Feb 11, 2014 at 10:12 AM, bobJobS  wrote:

> Postgres 9.3.2.
> RHEL 5
>
> After performing all of the Streaming Replication setup steps,
>

What replication steps?


>   database system identifier differs between the primary and standby
>

How did you take the initial backup of the master? Did you rsync the master
filesystem (after issuing pg_start_backup()) or use pg_basebackup, or did
you literally take a pg_dump of the master and try to turn that backup into
a slave? If the latter, you will need to use the rsync/pg_basebackup method.

There are some reasonably thorough steps at
http://dba.stackexchange.com/a/53546/24393 if you want to compare them
against what you tried already.
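
If it helps to confirm the diagnosis: initdb gives every cluster its own system identifier, and pg_controldata prints it on a "Database system identifier" line. A minimal Python sketch, assuming you have captured the pg_controldata output from both data directories as text, could compare them like this. The identifiers in the example are made up.

```python
import re


def system_identifier(controldata_output):
    """Return the system identifier from pg_controldata output, or None."""
    match = re.search(r"^Database system identifier:\s*(\d+)\s*$",
                      controldata_output, re.MULTILINE)
    return match.group(1) if match else None


def same_cluster(primary_output, standby_output):
    """True only if both outputs carry the same, non-missing identifier."""
    primary_id = system_identifier(primary_output)
    standby_id = system_identifier(standby_output)
    return primary_id is not None and primary_id == standby_id


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    primary = "pg_control version number:            937\n" \
              "Database system identifier:           5979865233222733927\n"
    standby = "pg_control version number:            937\n" \
              "Database system identifier:           5980001112223334445\n"
    print(same_cluster(primary, standby))
```

A standby built by loading a dump into a newly initialised cluster will show a different identifier from the master, and streaming replication will refuse to start.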


[GENERAL] Streaming Replication - Error on Standby

2014-02-11 Thread bobJobS
Postgres 9.3.2.
RHEL 5

After performing all of the Streaming Replication setup steps, I get the
following error message in my standby DB log file.

  database system identifier differs between the primary and standby

I've double checked the recovery.conf file and it contains the correct
hostname, port, username and password.

I've also verified that ssh into either the primary or standby as the
replication user does not require a password.

When building the standby I took the latest backup from the primary and
loaded the standby. After, I executed the start backup command on the
primary, followed by rsync, then the stop backup command.

I've googled the heck out of the error message and I found only references
to invalid primary connection information.

Any ideas?





[GENERAL] Streaming Replication error - relation uninitialized and WAL contains references to invalid pages

2012-11-27 Thread Gareth Lyons
PostgreSQL 9.2 on Windows Server 2008 R2 64-bit

Streaming Replication was initialised and working on a 500GB database.
The following day, the slave's log showed these errors:

2012-11-27 09:57:53 GMT WARNING:  page 911726 of relation pg_tblspc/16570/PG_9.2_201204301/16571/16595 is uninitialized
2012-11-27 09:57:53 GMT CONTEXT:  xlog redo vacuum: rel 16570/16571/16595; blk 911727, lastBlockVacuumed 911725
2012-11-27 09:57:53 GMT PANIC:  WAL contains references to invalid pages
2012-11-27 09:57:53 GMT CONTEXT:  xlog redo vacuum: rel 16570/16571/16595; blk 911727, lastBlockVacuumed 911725
2012-11-27 09:57:54 GMT LOG:  startup process (PID 1392) exited with exit code 3
2012-11-27 09:57:54 GMT LOG:  terminating any other active server processes

Restarting the PostgreSQL service on the slave, the log shows the same errors:

2012-11-27 10:26:39 GMT LOG:  database system was interrupted while in recovery at log time 2012-11-27 09:37:29 GMT
2012-11-27 10:26:39 GMT HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2012-11-27 10:26:39 GMT LOG:  entering standby mode
2012-11-27 10:26:39 GMT LOG:  redo starts at 3A8/11A5A9A8
2012-11-27 10:26:40 GMT FATAL:  the database system is starting up
2012-11-27 10:26:41 GMT FATAL:  the database system is starting up
2012-11-27 10:26:42 GMT FATAL:  the database system is starting up
2012-11-27 10:26:43 GMT FATAL:  the database system is starting up
2012-11-27 10:26:44 GMT FATAL:  the database system is starting up
2012-11-27 10:26:45 GMT FATAL:  the database system is starting up
2012-11-27 10:26:46 GMT FATAL:  the database system is starting up
2012-11-27 10:26:47 GMT FATAL:  the database system is starting up
2012-11-27 10:26:48 GMT FATAL:  the database system is starting up
2012-11-27 10:26:49 GMT FATAL:  the database system is starting up
2012-11-27 10:26:50 GMT LOG:  consistent recovery state reached at 3A8/387BFF48
2012-11-27 10:26:50 GMT LOG:  database system is ready to accept read only connections
2012-11-27 10:26:50 GMT WARNING:  page 911726 of relation pg_tblspc/16570/PG_9.2_201204301/16571/16595 is uninitialized
2012-11-27 10:26:50 GMT CONTEXT:  xlog redo vacuum: rel 16570/16571/16595; blk 911727, lastBlockVacuumed 911725
2012-11-27 10:26:50 GMT PANIC:  WAL contains references to invalid pages
2012-11-27 10:26:50 GMT CONTEXT:  xlog redo vacuum: rel 16570/16571/16595; blk 911727, lastBlockVacuumed 911725
2012-11-27 10:26:50 GMT LOG:  startup process (PID 2640) exited with exit code 3
2012-11-27 10:26:50 GMT LOG:  terminating any other active server processes

I noticed an autovacuum process started on the master at 09:04 and was still
running when the slave fell over - could this be a factor?
Or could this be a consequence of the WAL corruption issue fixed in 9.2.1/9.1.6?

full_page_writes is switched on.

Thanks
Gareth
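
For anyone who wants to inspect the affected page on disk, here is a small Python sketch that works out which file and byte offset block 911726 falls in. It assumes the default 8 kB block size and the default 1 GB relation segment files; a server built with other sizes would give different numbers. It locates the page only and says nothing about the cause.

```python
BLOCK_SIZE = 8192                                # assumed default block size
SEGMENT_SIZE = 1024 ** 3                         # assumed 1 GB relation segments
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE  # 131072 blocks per file


def locate_block(rel_path, block_number):
    """Return (file path, byte offset in that file) for a relation block."""
    segment, block_in_segment = divmod(block_number, BLOCKS_PER_SEGMENT)
    # The first segment has no suffix; later ones are named <path>.1, <path>.2, ...
    file_path = rel_path if segment == 0 else "%s.%d" % (rel_path, segment)
    return file_path, block_in_segment * BLOCK_SIZE


if __name__ == "__main__":
    rel = "pg_tblspc/16570/PG_9.2_201204301/16571/16595"
    print(locate_block(rel, 911726))
```

Under those assumptions the page sits in the file with suffix .6, so comparing the size of that file on the master and on the slave would be a reasonable first check.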





Re: [GENERAL] Streaming Replication Error

2012-05-30 Thread Jeff Davis
On Mon, 2012-04-30 at 17:23 -0400, Andrew Hannon wrote:

> 1. Is our data intact? PG eventually starts up, and it seems like once
> the streaming suffers the FATAL error, it falls back to performing log
> restores.

I don't see anything alarming there. Postgres will not start up if it
thinks it's really missing data.

I'd advise using a restore command that does not output anything unless
it's something you really need to know. A log file missing from the
archive is normal operation for recovery mode, so notices telling you
that are just cluttering the log.
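
A minimal sketch of such a quiet command, written here as a small Python script that could stand in for the plain cp. The script name and the way it is wired into recovery.conf are hypothetical. The one firm requirement is that the command still returns a nonzero exit status when the requested file is not in the archive.

```python
import os
import shutil
import sys


def restore_segment(archive_dir, file_name, destination):
    """Copy one WAL file out of the archive; return False if it is absent."""
    source = os.path.join(archive_dir, file_name)
    if not os.path.isfile(source):
        return False          # stay silent: a missing segment is expected
    shutil.copyfile(source, destination)
    return True


if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical recovery.conf entry:
    #   restore_command = 'python /path/to/quiet_restore.py /ebs-raid0/archive %f %p'
    ok = restore_segment(sys.argv[1], sys.argv[2], sys.argv[3])
    sys.exit(0 if ok else 1)  # nonzero tells the server the file is not there
```

Real failures, such as a permissions problem, still raise an exception and show up in the log, while the routine "no such file" case no longer does.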

> 2. What triggers this error? Too much time between log recovery,
> streaming startup and a low wal_keep_segments value (currently 128)?

128 sounds like a high-enough number, so after it catches up fully, it
should be plenty.

It looks like, while trying to catch up, it falls within the 128
segments and begins streaming, and then momentarily falls back out and
needs to restore from the archive.

Unless you have steady-state replication lag, it should catch up fully
and then just be able to use streaming all the time. Do you see it
resume streaming later on in the logfile?
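
To put a number on the 128 figure, here is a back-of-the-envelope sketch in Python. It assumes the default 16 MB WAL segment size, and the WAL generation rate used in the example is purely hypothetical; substitute whatever your primary actually produces.

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024   # assumed default WAL segment size


def retained_wal_bytes(wal_keep_segments):
    """Minimum amount of past WAL the primary keeps for standbys."""
    return wal_keep_segments * WAL_SEGMENT_BYTES


def seconds_of_headroom(wal_keep_segments, wal_bytes_per_second):
    """How long a standby can lag at a given WAL rate before falling out."""
    return retained_wal_bytes(wal_keep_segments) / float(wal_bytes_per_second)


if __name__ == "__main__":
    print(retained_wal_bytes(128))                 # bytes kept with 128 segments
    print(seconds_of_headroom(128, 1024 * 1024))   # at a hypothetical 1 MB/s
```

With 128 segments that is 2 GB of WAL. A standby that is further behind than that when it connects has to go back to the archive, which matches the pattern in the log.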

Disclaimer: I'm not 100% confident in my response, so please take it
with a grain of salt, but I hope it is helpful anyway.

Regards,
Jeff Davis




[GENERAL] Streaming Replication Error

2012-04-30 Thread Andrew Hannon
Hello,

We were auditing our logs on one of our PG 9.0.6 standby servers that we use 
for nightly snapshotting. The high-level process is:

1. Stop PG
2. Snapshot
3. Start PG

Where "Snapshot" includes several steps to ensure data/filesystem integrity. 
The archive command on the master continues throughout this process, so the 
standby does have all of the log files. When we restart the cluster, we see the 
typical startup message about restoring files from the archive. However, we 
have noticed that occasionally the following occurs:

LOG:  restored log file "00000001000044560000007F" from archive
LOG:  restored log file "000000010000445600000080" from archive
cp: cannot stat `/ebs-raid0/archive/000000010000445600000081': No such file or directory
LOG:  unexpected pageaddr 4454/74000000 in log file 17494, segment 129, offset 0
cp: cannot stat `/ebs-raid0/archive/000000010000445600000081': No such file or directory
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000445600000091 has already been removed

LOG:  restored log file "000000010000445600000091" from archive
LOG:  restored log file "000000010000445600000092" from archive
LOG:  restored log file "000000010000445600000093" from archive
…
LOG:  restored log file "000000010000445700000092" from archive
cp: cannot stat `/ebs-raid0/archive/000000010000445700000093': No such file or directory
LOG:  streaming replication successfully connected to primary

--

The concerning bit here is that we receive the FATAL message "requested WAL 
segment 000000010000445600000091 has already been removed" after streaming 
replication connects successfully, which seems to trigger an additional 
sequence of log restores.

The questions we have are:

1. Is our data intact? PG eventually starts up, and it seems like once the 
streaming suffers the FATAL error, it falls back to performing log restores.
2. What triggers this error? Too much time between log recovery, streaming 
startup and a low wal_keep_segments value (currently 128)?
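
For reference when reading the log above: the "log file 17494, segment 129" line gives the position in decimal, whereas WAL file names are hexadecimal. In releases before 9.3 a WAL file name is three 8-digit uppercase hex fields: timeline, log file and segment. A short Python sketch of that mapping follows; timeline 1 is an assumption on my part.

```python
def wal_file_name(timeline, log_id, segment):
    """Build a pre-9.3 WAL segment file name from its three components."""
    return "%08X%08X%08X" % (timeline, log_id, segment)


if __name__ == "__main__":
    # Log file 17494 is 0x4456 and segment 129 is 0x81; timeline 1 is assumed.
    print(wal_file_name(1, 17494, 129))
```

This makes it easier to line up the decimal positions in the "unexpected pageaddr" message with the file names the restore command is asked for.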

Thank you very much,

Andrew Hannon