Hello, I'm having an issue where streaming replication randomly stops working. I haven't been able to find anything in the logs that points to a problem, but the Postgres startup process shows a "waiting" status on the slave:

postgres  5639  0.1 24.3 3428264 2970236 ?  Ss  Aug14   1:54 postgres: startup process   recovering 000000010000053D0000003F waiting
postgres  5642  0.0 21.4 3428356 2613252 ?  Ss  Aug14   0:30 postgres: writer process
postgres  5659  0.0  0.0  177524     788 ?  Ss  Aug14   0:03 postgres: stats collector process
postgres  7159  1.2  0.1 3451360   18352 ?  Ss  Aug14  17:31 postgres: wal receiver process   streaming 549/216B3730

The replication works great for days, but randomly seems to lock up and replication halts. I verified that the two databases were out of sync with a query on both of them. Has anyone experienced this issue before?

Here are some relevant config settings:

Master:

    wal_level = hot_standby
    checkpoint_segments = 32
    checkpoint_completion_target = 0.9
    archive_mode = on
    archive_command = 'rsync -a %p foo@foo:/var/lib/pgsql/9.1/wals/%f </dev/null'
    max_wal_senders = 2
    wal_keep_segments = 32

Slave:

    wal_level = hot_standby
    checkpoint_segments = 32
    #checkpoint_completion_target = 0.5
    hot_standby = on
    max_standby_archive_delay = -1
    max_standby_streaming_delay = -1
    #wal_receiver_status_interval = 10s
    #hot_standby_feedback = off

Thank you for any help you can provide!

Andrew
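
P.S. In case it is useful for comparing the two positions in the ps output above (the segment the startup process is recovering and the location the wal receiver is streaming), here is a minimal Python sketch that converts both to byte offsets. It assumes the pre-9.3 WAL layout (the archive path suggests 9.1), where each log file holds 255 segments, and the default 16 MB segment size. The function names are made up for illustration and are not part of PostgreSQL.

```python
# Assumptions: pre-9.3 WAL layout and the default 16 MB segment size.
SEGMENT_SIZE = 16 * 1024 * 1024
SEGMENTS_PER_LOG = 255  # pre-9.3 releases skip the last segment of each log file
LOG_SIZE = SEGMENT_SIZE * SEGMENTS_PER_LOG  # 0xFF000000 bytes


def location_to_bytes(location):
    """Convert an xlog location such as '549/216B3730' to a byte position."""
    log_id, offset = location.split("/")
    return int(log_id, 16) * LOG_SIZE + int(offset, 16)


def segment_start_bytes(segment_name):
    """Byte position at which a 24-character WAL segment file name begins."""
    if len(segment_name) != 24:
        raise ValueError("expected a 24-character WAL segment name")
    # Characters 0-7 are the timeline, 8-15 the log id, 16-23 the segment number.
    log_id = int(segment_name[8:16], 16)
    segment_no = int(segment_name[16:24], 16)
    return log_id * LOG_SIZE + segment_no * SEGMENT_SIZE


if __name__ == "__main__":
    # Values copied from the ps output above.
    received = location_to_bytes("549/216B3730")
    replaying = segment_start_bytes("000000010000053D0000003F")
    print("approximate gap between receive and replay, in bytes:",
          received - replaying)
```

The result is only approximate, because the segment name gives the start of the segment being replayed rather than the exact replay position.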