Hi,
I just tried another round of tests, without running "sync; echo 3 >
/proc/sys/vm/drop_caches',
still got the same error, following FATAL errors are found in pg_log (slave),
can anyone please advise how to resolve this error?
regards
LOG: entering standby mode
LOG: consistent recovery state reached at 11/42000318
LOG: redo starts at 11/42000280
LOG: invalid record length at 11/42000318
LOG: database system is ready to accept read only connections
LOG: streaming replication successfully connected to primary
FATAL: could not send data to WAL stream: server closed the connection
unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
LOG: unexpected pageaddr 10/D2EC0000 in log file 18, segment 5, offset 15466496
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment
000000010000001200000005 has already been removed
On 6 Aug 2013, at 12:39 PM, [email protected] wrote:
> Hi,
>
> I found the problem should be because I tried to clean RAM cache in the slave
> by running "sync; echo 3 > /proc/sys/vm/drop_caches'
> that caused the "receiver" of slave gone away.
>
> ps -ef | grep receiver
> postgres 6182 6178 0 12:11 ? 00:00:06 postgres: wal receiver
> process streaming D/FB8DA000
> sync; echo 3 > /proc/sys/vm/drop_caches
> ps -ef | grep receiver
> root 8804 30447 0 12:29 pts/2 00:00:00 grep --color=auto receiver
>
> regards
>
> On 6 Aug 2013, at 10:44 AM, [email protected] wrote:
>
>> Hi,
>>
>> I am doing some stress tests to a pair of PG servers to monitor the
>> pg_stat_replication, during the test, the pg_stat_replication suddenly
>> became empty.
>>
>> PG version: 9.2.4
>> O/S: Ubuntu: 12.04
>>
>> Since I need to monitor the replication lag from time to time, if the
>> pg_stat_replication becomes empty, the lag calculation in the slave will be
>> wrong.
>> Please advise if this is a bug.
>>
>> regards
>>
>>
>>
>>
>> How to reproduce:
>>
>> session 1: Master server - try to insert a large number of records into a
>> test table
>> postgres=# drop table test;CREATE TABLE test (id INTEGER PRIMARY KEY);
>> INSERT INTO test VALUES (generate_series(1,100000000)); EXPLAIN ANALYZE
>> SELECT COUNT(*) FROM test;
>>
>> 2) session 2: Master server - check the byte_lag from time to time
>> postgres=# SELECT
>> sent_offset - (
>> replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS
>> byte_lag
>> FROM (
>> SELECT
>> client_addr,
>> ('x' || lpad(split_part(sent_location, '/', 1), 8,
>> '0'))::bit(32)::bigint AS sent_xlog,
>> ('x' || lpad(split_part(replay_location, '/', 1), 8,
>> '0'))::bit(32)::bigint AS replay_xlog,
>> ('x' || lpad(split_part(sent_location, '/', 2), 8,
>> '0'))::bit(32)::bigint AS sent_offset,
>> ('x' || lpad(split_part(replay_location, '/', 2), 8,
>> '0'))::bit(32)::bigint AS replay_offset
>> FROM pg_stat_replication
>> ) AS s;
>> byte_lag
>> ----------
>> 2097216
>> (1 row)
>>
>> postgres=# SELECT
>>
>> sent_offset - (
>> replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS
>> byte_lag
>> FROM (
>> SELECT
>> client_addr,
>> ('x' || lpad(split_part(sent_location, '/', 1), 8,
>> '0'))::bit(32)::bigint AS sent_xlog,
>> ('x' || lpad(split_part(replay_location, '/', 1), 8,
>> '0'))::bit(32)::bigint AS replay_xlog,
>> ('x' || lpad(split_part(sent_location, '/', 2), 8,
>> '0'))::bit(32)::bigint AS sent_offset,
>> ('x' || lpad(split_part(replay_location, '/', 2), 8,
>> '0'))::bit(32)::bigint AS replay_offset
>> FROM pg_stat_replication
>> ) AS s;
>> byte_lag
>> ----------
>> (0 rows)
>>
>>
>> 3) session 3: Slave server -
>> postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() =
>> pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
>> pg_last_xact_replay_timestamp()) END AS log_delay;
>> log_delay
>> -----------
>> 0
>> (1 row)
>>
>> postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() =
>> pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
>> pg_last_xact_replay_timestamp()) END AS log_delay;
>> log_delay
>> -----------
>> 4.873282
>> (1 row)
>>
>> .
>> .
>> .
>> postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() =
>> pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() -
>> pg_last_xact_replay_timestamp()) END AS log_delay;
>> log_delay
>> -------------
>> 4070.325329
>> (1 row)
>>
>>
>