Re: [GENERAL] Streaming Replication - Error on Standby
Thanks for the information and the URLs! -- View this message in context: http://postgresql.1045698.n5.nabble.com/Streaming-Replication-Error-on-Standby-tp5791463p5791588.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Streaming Replication - Error on Standby
On 02/11/2014 10:12 AM, bobJobS wrote:
> Postgres 9.3.2.
> RHEL 5
>
> After performing all of the Streaming Replication setup steps, I get the
> following error message in my standby DB log file.
>
> database system identifier differs between the primary and standby
>
> I've double checked the recovery.conf file and it contains the correct
> hostname, port, username and password.
>
> I've also verified that ssh into either the primary or standby as the
> replication user does not require a password.
>
> When building the standby I took the latest backup from the primary and
> loaded the standby. After, I executed the start backup command on the
> primary, followed by rsync, then the stop backup command.
>
> I've googled the heck out of the error message and I found only references
> to invalid primary connection information.
>
> Any ideas?

Also meant to send this link which is newer and covers pg_basebackup:

https://wiki.postgresql.org/wiki/Hot_Standby

--
Adrian Klaver
adrian.kla...@gmail.com
Re: [GENERAL] Streaming Replication - Error on Standby
On 02/11/2014 10:12 AM, bobJobS wrote:
> Postgres 9.3.2.
> RHEL 5
>
> After performing all of the Streaming Replication setup steps, I get the
> following error message in my standby DB log file.
>
> database system identifier differs between the primary and standby
>
> I've double checked the recovery.conf file and it contains the correct
> hostname, port, username and password.
>
> I've also verified that ssh into either the primary or standby as the
> replication user does not require a password.
>
> When building the standby I took the latest backup from the primary and
> loaded the standby. After, I executed the start backup command on the
> primary, followed by rsync, then the stop backup command.
>
> I've googled the heck out of the error message and I found only references
> to invalid primary connection information.
>
> Any ideas?

Look at this tutorial:

https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial

--
Adrian Klaver
adrian.kla...@gmail.com
Re: [GENERAL] Streaming Replication - Error on Standby
On Tue, Feb 11, 2014 at 11:25 AM, bobJobS wrote:
> To get the standby server to a point, I took a globals dump and a data dump
> of the primary server and built the standby.
>
> Then I executed pg_start_backup(), rsynced the data dir to the standby data
> dir (to catch any changes made while I was building the standby) and finally
> pg_stop_backup()... all on the primary.

The steps you are using will not work. You cannot use a pg_dump/pg_dumpall backup from a master to set up a slave. pg_dump generates a "logical" backup, which is meant for restoring data, not for setting up slaves. A slave built by restoring a dump is a separately initialized cluster with its own system identifier, which is consistent with the error you are seeing.

A very high-level view of the replication setup:

- put the master in backup mode (pg_start_backup())
- pg_basebackup of the master to the slave, or rsync of the master's data directory (no slave data exists prior to this step)
- take the master out of backup mode (pg_stop_backup())

With pg_basebackup the first and last steps are done for you; they are only needed when copying with rsync.
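
The base-backup step above can be sketched with a small Python helper. It only assembles a pg_basebackup command line and prints it; nothing is executed. The host name, role name and data directory are hypothetical placeholders, and the options used (-h, -p, -U, -D, -X stream, -P, -R) are the ones documented for pg_basebackup in 9.3. Treat it as an illustration under those assumptions, not as the exact procedure anyone in this thread used.

```python
import shlex


def basebackup_command(host, user, target_dir, port=5432):
    """Assemble (but do not run) a pg_basebackup command line.

    All argument values are placeholders supplied by the caller.
    """
    return [
        "pg_basebackup",
        "-h", host,          # master to copy from
        "-p", str(port),
        "-U", user,          # role with the REPLICATION attribute
        "-D", target_dir,    # must be empty or not exist on the standby
        "-X", "stream",      # also stream the WAL generated during the backup
        "-P",                # show progress
        "-R",                # write a minimal recovery.conf (9.3 and later)
    ]


if __name__ == "__main__":
    # Hypothetical host, role and path, for illustration only.
    cmd = basebackup_command("primary.example.com", "replicator",
                             "/var/lib/pgsql/9.3/data")
    print(" ".join(shlex.quote(part) for part in cmd))
```

Because the copy is physical, the standby ends up with the same system identifier as the master, which a dump-and-restore can never provide.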
Re: [GENERAL] Streaming Replication - Error on Standby
To get the standby server to a point, I took a globals dump and a data dump of the primary server and built the standby.

Then I executed pg_start_backup(), rsynced the data dir to the standby data dir (to catch any changes made while I was building the standby) and finally pg_stop_backup()... all on the primary.

Thank you for the URL. I'll check it out.
Re: [GENERAL] Streaming Replication - Error on Standby
On Tue, Feb 11, 2014 at 10:12 AM, bobJobS wrote:
> Postgres 9.3.2.
> RHEL 5
>
> After performing all of the Streaming Replication setup steps,

What replication steps?

> database system identifier differs between the primary and standby

How did you take the initial backup of the master? Did you rsync the master filesystem (after issuing pg_start_backup()) or use pg_basebackup, or did you literally take a pg_dump of the master and try to turn that backup into a slave? If the latter, you will need to use the rsync/pg_basebackup method.

There are some reasonably thorough steps at http://dba.stackexchange.com/a/53546/24393 if you want to compare them against what you tried already.
[GENERAL] Streaming Replication - Error on Standby
Postgres 9.3.2.
RHEL 5

After performing all of the Streaming Replication setup steps, I get the following error message in my standby DB log file.

database system identifier differs between the primary and standby

I've double checked the recovery.conf file and it contains the correct hostname, port, username and password.

I've also verified that ssh into either the primary or standby as the replication user does not require a password.

When building the standby I took the latest backup from the primary and loaded the standby. After, I executed the start backup command on the primary, followed by rsync, then the stop backup command.

I've googled the heck out of the error message and I found only references to invalid primary connection information.

Any ideas?
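
One way to confirm this kind of mismatch is to run pg_controldata against the data directory on each server and compare the "Database system identifier" line. The sketch below only parses text that has already been captured; the two sample outputs are abbreviated and made up for illustration (real output has many more lines), and the function name is hypothetical.

```python
def system_identifier(controldata_output):
    """Return the system identifier from pg_controldata output, or None."""
    for line in controldata_output.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip() == "Database system identifier":
            return value.strip()
    return None


# Made-up, abbreviated samples standing in for real pg_controldata output.
PRIMARY_SAMPLE = (
    "Database system identifier:           5978412345678901234\n"
    "Database cluster state:               in production\n"
)
STANDBY_SAMPLE = (
    "Database system identifier:           5979998887776665554\n"
    "Database cluster state:               in archive recovery\n"
)

if __name__ == "__main__":
    primary_id = system_identifier(PRIMARY_SAMPLE)
    standby_id = system_identifier(STANDBY_SAMPLE)
    if primary_id != standby_id:
        print("identifiers differ: the standby is not a physical copy")
```

If the two values differ, the standby was initialized as its own cluster and has to be rebuilt from a base backup of the primary.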
[GENERAL] Streaming Replication error - relation uninitialized and WAL contains references to invalid pages
PostgreSQL 9.2 on Windows Server 2008 R2 64-bit

Streaming Replication was initialised and working on a 500GB database. The following day, the slave log showed these errors:

2012-11-27 09:57:53 GMT WARNING: page 911726 of relation pg_tblspc/16570/PG_9.2_201204301/16571/16595 is uninitialized
2012-11-27 09:57:53 GMT CONTEXT: xlog redo vacuum: rel 16570/16571/16595; blk 911727, lastBlockVacuumed 911725
2012-11-27 09:57:53 GMT PANIC: WAL contains references to invalid pages
2012-11-27 09:57:53 GMT CONTEXT: xlog redo vacuum: rel 16570/16571/16595; blk 911727, lastBlockVacuumed 911725
2012-11-27 09:57:54 GMT LOG: startup process (PID 1392) exited with exit code 3
2012-11-27 09:57:54 GMT LOG: terminating any other active server processes

After restarting the PostgreSQL service on the slave, the log shows the same errors:

2012-11-27 10:26:39 GMT LOG: database system was interrupted while in recovery at log time 2012-11-27 09:37:29 GMT
2012-11-27 10:26:39 GMT HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2012-11-27 10:26:39 GMT LOG: entering standby mode
2012-11-27 10:26:39 GMT LOG: redo starts at 3A8/11A5A9A8
2012-11-27 10:26:40 GMT FATAL: the database system is starting up
2012-11-27 10:26:41 GMT FATAL: the database system is starting up
2012-11-27 10:26:42 GMT FATAL: the database system is starting up
2012-11-27 10:26:43 GMT FATAL: the database system is starting up
2012-11-27 10:26:44 GMT FATAL: the database system is starting up
2012-11-27 10:26:45 GMT FATAL: the database system is starting up
2012-11-27 10:26:46 GMT FATAL: the database system is starting up
2012-11-27 10:26:47 GMT FATAL: the database system is starting up
2012-11-27 10:26:48 GMT FATAL: the database system is starting up
2012-11-27 10:26:49 GMT FATAL: the database system is starting up
2012-11-27 10:26:50 GMT LOG: consistent recovery state reached at 3A8/387BFF48
2012-11-27 10:26:50 GMT LOG: database system is ready to accept read only connections
2012-11-27 10:26:50 GMT WARNING: page 911726 of relation pg_tblspc/16570/PG_9.2_201204301/16571/16595 is uninitialized
2012-11-27 10:26:50 GMT CONTEXT: xlog redo vacuum: rel 16570/16571/16595; blk 911727, lastBlockVacuumed 911725
2012-11-27 10:26:50 GMT PANIC: WAL contains references to invalid pages
2012-11-27 10:26:50 GMT CONTEXT: xlog redo vacuum: rel 16570/16571/16595; blk 911727, lastBlockVacuumed 911725
2012-11-27 10:26:50 GMT LOG: startup process (PID 2640) exited with exit code 3
2012-11-27 10:26:50 GMT LOG: terminating any other active server processes

I noticed an autovacuum process started on the master at 09:04am and was still running when the slave tripped over. Could this be a factor? Or could this be a consequence of the WAL corruption issue fixed in 9.2.1/9.1.6? Full page writes is switched on.

Thanks
Gareth
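
To inspect the page named in the WARNING on disk, it helps to know which segment file of the relation holds it. The sketch below does that arithmetic, assuming the default 8 kB block size and 1 GB relation segment size; both are compile-time settings and may differ on a custom build. The function name is hypothetical.

```python
BLOCK_SIZE = 8192                          # assumed default block size
SEGMENT_BYTES = 1024 ** 3                  # assumed default 1 GB segment files
BLOCKS_PER_SEGMENT = SEGMENT_BYTES // BLOCK_SIZE


def locate_block(relfilenode, block):
    """Map a block number to (segment file name, byte offset in that file)."""
    segment, block_in_segment = divmod(block, BLOCKS_PER_SEGMENT)
    # The first segment has no suffix; later ones are named <relfilenode>.<n>.
    if segment == 0:
        name = str(relfilenode)
    else:
        name = "%d.%d" % (relfilenode, segment)
    return name, block_in_segment * BLOCK_SIZE


if __name__ == "__main__":
    # Relation 16595, page 911726, as reported in the log above.
    print(locate_block(16595, 911726))
```

Under those assumptions, page 911726 falls in the file 16595.6 inside pg_tblspc/16570/PG_9.2_201204301/16571, so comparing that file's size on the master and the slave is a reasonable first check.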
Re: [GENERAL] Streaming Replication Error
On Mon, 2012-04-30 at 17:23 -0400, Andrew Hannon wrote:
> 1. Is our data intact? PG eventually starts up, and it seems like once
> the streaming suffers the FATAL error, it falls back to performing log
> restores.

I don't see anything alarming there. Postgres will not start up if it thinks it's really missing data.

I'd advise using a restore command that does not output anything unless it's something you really need to know. A log file missing from the archive is normal operation for recovery mode, so notices telling you that are just cluttering the log.

> 2. What triggers this error? Too much time between log recovery,
> streaming startup and a low wal_keep_segments value (currently 128)?

128 sounds like a high-enough number, so after it catches up fully, it should be plenty. It looks like, while trying to catch up, it falls within the 128 segments and begins streaming, and then momentarily falls back out and needs to restore from the archive.

Unless you have steady-state replication lag, it should catch up fully and then just be able to use streaming all the time. Do you see it resume streaming later on in the logfile?

Disclaimer: I'm not 100% confident in my response, so please take it with a grain of salt, but I hope it is helpful anyway.

Regards,
Jeff Davis
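
To put the wal_keep_segments value in perspective, the arithmetic can be sketched as below. It assumes the default 16 MB WAL segment size; the WAL generation rate in the example is a made-up figure, not a measurement from this system, and the function names are hypothetical.

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024       # assumed default WAL segment size


def retained_wal_bytes(wal_keep_segments, segment_bytes=WAL_SEGMENT_BYTES):
    """Minimum amount of WAL the master keeps around for standbys."""
    return wal_keep_segments * segment_bytes


def seconds_of_headroom(wal_keep_segments, wal_bytes_per_second):
    """How long a standby can lag before the master may recycle what it needs."""
    return retained_wal_bytes(wal_keep_segments) / float(wal_bytes_per_second)


if __name__ == "__main__":
    kept = retained_wal_bytes(128)
    print("retained WAL: %d MiB" % (kept // (1024 * 1024)))
    # Hypothetical write rate of 1 MiB of WAL per second.
    print("headroom: %.0f seconds" % seconds_of_headroom(128, 1024 * 1024))
```

A standby that is further behind than this window when it connects gets the "requested WAL segment ... has already been removed" error and has to go back to the archive.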
[GENERAL] Streaming Replication Error
Hello,

We were auditing our logs on one of our PG 9.0.6 standby servers that we use for nightly snapshotting. The high-level process is:

1. Stop PG
2. Snapshot
3. Start PG

Where "Snapshot" includes several steps to ensure data/filesystem integrity. The archive command on the master continues throughout this process, so the standby does have all of the log files. When we restart the cluster, we see the typical startup message about restoring files from the archive. However, we have noticed that occasionally the following occurs:

LOG: restored log file "00014456007F" from archive
LOG: restored log file "000144560080" from archive
cp: cannot stat `/ebs-raid0/archive/000144560081': No such file or directory
LOG: unexpected pageaddr 4454/7400 in log file 17494, segment 129, offset 0
cp: cannot stat `/ebs-raid0/archive/000144560081': No such file or directory
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000144560091 has already been removed
LOG: restored log file "000144560091" from archive
LOG: restored log file "000144560092" from archive
LOG: restored log file "000144560093" from archive
…
LOG: restored log file "000144570092" from archive
cp: cannot stat `/ebs-raid0/archive/000144570093': No such file or directory
LOG: streaming replication successfully connected to primary

The concerning bit here is that we receive the FATAL message "requested WAL segment 000144560091 has already been removed" after streaming replication connects successfully, which seems to trigger an additional sequence of log restores.

The questions we have are:

1. Is our data intact? PG eventually starts up, and it seems like once the streaming suffers the FATAL error, it falls back to performing log restores.

2. What triggers this error? Too much time between log recovery, streaming startup and a low wal_keep_segments value (currently 128)?

Thank you very much,

Andrew Hannon
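
The "cp: cannot stat" lines in the log above are printed by the restore command itself when it asks for a segment the archive does not have yet. A quieter replacement could look like the sketch below. The script name, its argument order and the way it would be wired into recovery.conf are all assumptions for illustration; the only behaviour relied on is that PostgreSQL treats a nonzero exit status from the restore command as "file not available".

```python
import os
import shutil


def restore_segment(archive_dir, file_name, dest_path):
    """Copy one archived WAL file; return a process exit status."""
    source = os.path.join(archive_dir, file_name)
    if not os.path.isfile(source):
        # A missing segment is normal at the end of the archive: stay silent.
        return 1
    shutil.copyfile(source, dest_path)
    return 0


def main(argv):
    """Entry point; a real script would end with sys.exit(main(sys.argv)).

    Hypothetical recovery.conf line:
      restore_command = 'python /path/to/restore_wal.py /ebs-raid0/archive %f %p'
    """
    if len(argv) != 4:
        return 2
    return restore_segment(argv[1], argv[2], argv[3])
```

With something like this in place, the standby log would still show the "restored log file" and streaming messages, but not the cp errors for segments that simply have not been archived yet.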