Re: Setting up replication on Windows, v9.4
Note that WAL replication replicates *the whole instance* not just specific databases. You need logical replication for that. Also, I just learned that 9.4 *does* have WAL replication slots (which makes replication *much* easier). v9.6 is where replication is "like rolling off a log" simple. On 11/4/22 17:59, Brad White wrote: I'm setting up a backup for our primary postgres server using the archived WAL files. Then I'll try to upgrade it to Streaming Replication. Then I'll upgrade the system to v.latest. For now, we are on v.9.4. I do a base backup from the primary to a directory on the NAS. "C:\Program Files\PostgreSQL\9.4\bin\pg_basebackup.exe" -D \\diskstation\AccessData\Dev\Backup -P -X s -v -h 192.168.1.118 -p 5432 -U postgres That appears to go fine. Then I delete data\*.* and copy everything except the config files from the backup into data. Copy in recovery.conf -- standby_mode = 'on' primary_conninfo = 'host=192.168.1.118 port=5432 user=replication password=**' restore_command = 'copy "DISKSTATION\\AccessData\\WALfiles\\%f" "%p"' -- Copy in postgresql.conf, with settings -- listen_addresses = '127.0.0.1,192.168.1.118' wal_level = archive hot_standby = on -- Interestingly, the recovery file says # Note that recovery.conf must be in $PGDATA directory. # It should NOT be located in the same directory as postgresql.conf Those seem contradictory. And if I remove the postgresql, it just refuses to start. With all this in place, I start the service, it runs for a bit, then shuts down. No errors in the event log. 5 postgres processes are left running along with a pid file. The log file says -- LOG: database system was interrupted while in recovery at log time 2022-11-04 13:17:28 PDT HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. FATAL: the database system is starting up FATAL: the database system is starting up FATAL: the database system is starting up LOG: entering standby mode LOG: consistent recovery state reached at 6A/3590 LOG: record with zero length at 6A/3590 LOG: started streaming WAL from primary at 6A/3500 on timeline 1 LOG: redo starts at 6A/3590 FATAL: the database system is starting up FATAL: the database system is starting up FATAL: the database system is starting up FATAL: the database system is starting up FATAL: the database system is starting up So it seems that the backup didn't work as well as first appeared. -- Angular momentum makes the world go 'round.
Re: Setting up replication on Windows, v9.4
Cranking up the log_min_messages level might help, to see what is (or is not) reaching the server. On 11/11/22 14:35, Brad White wrote: > Or manually running the application queries? That was the odd thing. It didn't appear to be in the middle of running any queries. The database shouldn't have had any effect. > Anyway, try Pg 9.6. I have Pg 12 installed, so I'll try that. I'll start a new thread on my adventures in upgrading. Thanks, Brad. On Fri, Nov 11, 2022 at 12:38 PM Ron wrote: Nothing in the application error logs? What about the Postgresql error logs? Or manually running the application queries? Anyway, try Pg 9.6. Still EOL, but /not as/ EOL. -- Angular momentum makes the world go 'round. -- Angular momentum makes the world go 'round.
Re: Setting up replication on Windows, v9.4
> Or manually running the application queries? That was the odd thing. It didn't appear to be in the middle of running any queries. The database shouldn't have had any effect. > Anyway, try Pg 9.6. I have Pg 12 installed, so I'll try that. I'll start a new thread on my adventures in upgrading. Thanks, Brad. On Fri, Nov 11, 2022 at 12:38 PM Ron wrote: >Nothing in the application error logs? What about the Postgresql error > logs? Or manually running the application queries? > > Anyway, try Pg 9.6. Still EOL, but *not as* EOL. > > -- > Angular momentum makes the world go 'round. >
Re: Setting up replication on Windows, v9.4
On 11/10/22 21:24, Brad White wrote: On 11/7/2022 3:42 PM, Rob Sargent wrote: > Care to share some of the ways the app stopped working? You might get a leg up on where best to remediate. I don't recall, as that was a few months ago. We are running MS-Access as a front end with Postgres as the back end. It appeared to read and write data perfectly fine, but failed in the business logic on the Access side where there didn't appear to be any DB involvement. But switching to v14 caused it and switching away fixed it. No obvious cause that I could see. Nothing in the application error logs? What about the Postgresql error logs? Or manually running the application queries? Anyway, try Pg 9.6. Still EOL, but /not as/ EOL. -- Angular momentum makes the world go 'round.
Re: Setting up replication on Windows, v9.4
From the error log: FATAL: hot standby is not possible because wal_level was not set to "hot_standby" or higher on the master server HINT: Either set wal_level to "hot_standby" on the master, or turn off hot_standby here. I tried setting hot_standby to off, but that didn't help. I set the wal_level to "hot_standby" and hot_standby to on, and now it appears to have made progress. The service starts up without failing. > check if there's an entry in pg_stat_replication on the primary. No. No entries. There is now an entry on the primary. But still can't connect from Navicat on the primary server and the log says FATAL: could not receive data from WAL stream: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. LOG: record with zero length at 6B/6390 FATAL: could not connect to the primary server: FATAL: the database system is starting up LOG: started streaming WAL from primary at 6B/6300 on timeline 1 LOG: startup process (PID 3368) exited with exit code 1 LOG: terminating any other active server processes ¯\_(ツ)_/¯
Re: Setting up replication on Windows, v9.4
On 11/7/2022 3:42 PM, Rob Sargent wrote: > Care to share some of the ways the app stopped working? You might get a leg up on where best to remediate. I don't recall, as that was a few months ago. We are running MS-Access as a front end with Postgres as the back end. It appeared to read and write data perfectly fine, but failed in the business logic on the Access side where there didn't appear to be any DB involvement. But switching to v14 caused it and switching away fixed it. No obvious cause that I could see. As written, switching databases was very time-consuming. Since then, I've rewritten the app so that the network admin just changes an INI file that the app reads on start up. If there is a change, it switches all the tables over in about 15 seconds. We wanted that ability anyway in case we ever needed to switch to using the backup server. Now, I am working on getting replication going. Next up: back to upgrading. But, since you ask, I'll post here when I get back to that so y'all can follow along as I narrow it down to a specific version.
Re: Setting up replication on Windows, v9.4
On 11/7/22 13:59, Brad White wrote: > v9.4 has been EOL for 2 years 9 months. As I said, the next step will be to upgrade. It would make sense to upgrade first, since "there have been some big advances since then which make replication much easier" But when we upgraded, the app stopped working. So I'll need to go through and nail down which exact version causes the issue and then see if we can get it resolved. I have no way of knowing how long that will take. So we're doing the replication first. Care to share some of the ways the app stopped working? You might get a leg up on where best to remediate.
Re: Setting up replication on Windows, v9.4
> > > > v9.4 has been EOL for 2 years 9 months. As I said, the next step will be to upgrade. It would make sense to upgrade first, since "there have been some big advances since then which make replication much easier" But when we upgraded, the app stopped working. So I'll need to go through and nail down which exact version causes the issue and then see if we can get it resolved. I have no way of knowing how long that will take. So we're doing the replication first.
Re: Setting up replication on Windows, v9.4
On 11/4/22 17:59, Brad White wrote: I'm setting up a backup for our primary postgres server using the archived WAL files. Then I'll try to upgrade it to Streaming Replication. Then I'll upgrade the system to v.latest. For now, we are on v.9.4. FYI: v9.4 has been EOL for 2 years 9 months. (And there have been some big advances since then which make replication much easier.) -- Angular momentum makes the world go 'round.
Re: Setting up replication on Windows, v9.4
2022年11月5日(土) 10:02 Ian Lawrence Barwick : > > 2022年11月5日(土) 7:59 Brad White : ... > > Interestingly, the recovery file says > > # Note that recovery.conf must be in $PGDATA directory. > > # It should NOT be located in the same directory as postgresql.conf > > Those seem contradictory. > > I don't know where those lines come from, they're not generated by PostgreSQL. > > It is certainly true that recovery.conf *must* be in the $PGDATA directory. > The > above lines would make sense if it's expected that postgresql.conf will be > located in another location (as is usually the case with Debian/Ubuntu > packages), but unless your setup is specifying that, just put both files in > $PGDATA. ... Hmm, looks like those lines came from here: https://wiki.postgresql.org/wiki/Streaming_Replication That's plain wrong and confusing [1], I'll update. [1] not for the first time either: https://serverfault.com/questions/760802/why-should-postgresql-conf-not-be-in-the-same-dir-as-recovery-conf Ian Barwick
Re: Setting up replication on Windows, v9.4
2022年11月5日(土) 7:59 Brad White : > > I'm setting up a backup for our primary postgres server using the archived > WAL files. > Then I'll try to upgrade it to Streaming Replication. > Then I'll upgrade the system to v.latest. > For now, we are on v.9.4. > > I do a base backup from the primary to a directory on the NAS. > > "C:\Program Files\PostgreSQL\9.4\bin\pg_basebackup.exe" -D > \\diskstation\AccessData\Dev\Backup -P -X s -v -h 192.168.1.118 -p 5432 -U > postgres > > That appears to go fine. > Then I delete data\*.* and copy everything except the config files from the > backup into data. > > Copy in recovery.conf > -- > standby_mode = 'on' > primary_conninfo = 'host=192.168.1.118 port=5432 user=replication > password=**' > restore_command = 'copy "DISKSTATION\\AccessData\\WALfiles\\%f" "%p"' > -- > Copy in postgresql.conf, with settings > -- > listen_addresses = '127.0.0.1,192.168.1.118' > wal_level = archive > hot_standby = on > -- Is this the postgresql.conf applied to the standby? Just wondering as one of the listen_addresses is the same as the host in primary_conninfo. > Interestingly, the recovery file says > # Note that recovery.conf must be in $PGDATA directory. > # It should NOT be located in the same directory as postgresql.conf > Those seem contradictory. I don't know where those lines come from, they're not generated by PostgreSQL. It is certainly true that recovery.conf *must* be in the $PGDATA directory. The above lines would make sense if it's expected that postgresql.conf will be located in another location (as is usually the case with Debian/Ubuntu packages), but unless your setup is specifying that, just put both files in $PGDATA. > And if I remove the postgresql, it just refuses to start. > > With all this in place, I start the service, it runs for a bit, then shuts > down. > No errors in the event log. > 5 postgres processes are left running along with a pid file. I'm not familiar with Windows, but it sounds like PostgreSQL is actually running. What happens if you try and connect to it? > The log file says > -- > LOG: database system was interrupted while in recovery at log time > 2022-11-04 13:17:28 PDT > HINT: If this has occurred more than once some data might be corrupted and > you might need to choose an earlier recovery target. > FATAL: the database system is starting up > FATAL: the database system is starting up > FATAL: the database system is starting up > LOG: entering standby mode > LOG: consistent recovery state reached at 6A/3590 > LOG: record with zero length at 6A/3590 > LOG: started streaming WAL from primary at 6A/3500 on timeline 1 > LOG: redo starts at 6A/3590 > FATAL: the database system is starting up > FATAL: the database system is starting up > FATAL: the database system is starting up > > FATAL: the database system is starting up > FATAL: the database system is starting up > > So it seems that the backup didn't work as well as first appeared. The above state looks very much what would happen if the standby is running with "hot_standby = off", and connection attempts are being made. Try attempting to connect to it. Also, check if there's an entry in pg_stat_replication on the primary. Regards Ian Barwick
Setting up replication on Windows, v9.4
I'm setting up a backup for our primary postgres server using the archived WAL files. Then I'll try to upgrade it to Streaming Replication. Then I'll upgrade the system to v.latest. For now, we are on v.9.4. I do a base backup from the primary to a directory on the NAS. "C:\Program Files\PostgreSQL\9.4\bin\pg_basebackup.exe" -D \\diskstation\AccessData\Dev\Backup -P -X s -v -h 192.168.1.118 -p 5432 -U postgres That appears to go fine. Then I delete data\*.* and copy everything except the config files from the backup into data. Copy in recovery.conf -- standby_mode = 'on' primary_conninfo = 'host=192.168.1.118 port=5432 user=replication password=**' restore_command = 'copy "DISKSTATION\\AccessData\\WALfiles\\%f" "%p"' -- Copy in postgresql.conf, with settings -- listen_addresses = '127.0.0.1,192.168.1.118' wal_level = archive hot_standby = on -- Interestingly, the recovery file says # Note that recovery.conf must be in $PGDATA directory. # It should NOT be located in the same directory as postgresql.conf Those seem contradictory. And if I remove the postgresql, it just refuses to start. With all this in place, I start the service, it runs for a bit, then shuts down. No errors in the event log. 5 postgres processes are left running along with a pid file. The log file says -- LOG: database system was interrupted while in recovery at log time 2022-11-04 13:17:28 PDT HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. FATAL: the database system is starting up FATAL: the database system is starting up FATAL: the database system is starting up LOG: entering standby mode LOG: consistent recovery state reached at 6A/3590 LOG: record with zero length at 6A/3590 LOG: started streaming WAL from primary at 6A/3500 on timeline 1 LOG: redo starts at 6A/3590 FATAL: the database system is starting up FATAL: the database system is starting up FATAL: the database system is starting up FATAL: the database system is starting up FATAL: the database system is starting up So it seems that the backup didn't work as well as first appeared.