Hi,

I found the problem should be because I tried to clean RAM cache in the slave 
by running "sync; echo 3 > /proc/sys/vm/drop_caches' 
that caused the "receiver" of slave gone away.

ps -ef | grep receiver
        postgres  6182  6178  0 12:11 ?        00:00:06 postgres: wal receiver 
process   streaming D/FB8DA000    
sync; echo 3 > /proc/sys/vm/drop_caches
ps -ef | grep receiver
root      8804 30447  0 12:29 pts/2    00:00:00 grep --color=auto receiver

regards

On 6 Aug 2013, at 10:44 AM, ascot.m...@gmail.com wrote:

> Hi,
> 
> I am doing some stress tests to a pair of PG servers to monitor the 
> pg_stat_replication, during the test, the pg_stat_replication suddenly became 
> empty. 
> 
> PG version: 9.2.4
> O/S: Ubuntu: 12.04
> 
> Since I need to monitor the replication lag from time to time, if the 
> pg_stat_replication becomes empty, the lag calculation in the slave will be 
> wrong.
> Please advise if this is a bug.
> 
> regards
> 
> 
> 
> 
> How to reproduce:
> 
> session 1: Master server - try to insert a large number of records into a 
> test table 
> postgres=# drop table test;CREATE TABLE test (id INTEGER PRIMARY KEY); INSERT 
> INTO test VALUES (generate_series(1,100000000));  EXPLAIN ANALYZE SELECT 
> COUNT(*) FROM test;
> 
> 2) session 2: Master server - check the byte_lag from time to time
> postgres=# SELECT
>     sent_offset - (
>         replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS byte_lag
> FROM (
>     SELECT
>         client_addr,
>         ('x' || lpad(split_part(sent_location,   '/', 1), 8, 
> '0'))::bit(32)::bigint AS sent_xlog,
>         ('x' || lpad(split_part(replay_location, '/', 1), 8, 
> '0'))::bit(32)::bigint AS replay_xlog,
>         ('x' || lpad(split_part(sent_location,   '/', 2), 8, 
> '0'))::bit(32)::bigint AS sent_offset,
>         ('x' || lpad(split_part(replay_location, '/', 2), 8, 
> '0'))::bit(32)::bigint AS replay_offset
>     FROM pg_stat_replication
> ) AS s;
> byte_lag 
> ----------
> 2097216
> (1 row)
> 
> postgres=# SELECT
> 
>     sent_offset - (
>         replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS byte_lag
> FROM (
>     SELECT
>         client_addr,
>         ('x' || lpad(split_part(sent_location,   '/', 1), 8, 
> '0'))::bit(32)::bigint AS sent_xlog,
>         ('x' || lpad(split_part(replay_location, '/', 1), 8, 
> '0'))::bit(32)::bigint AS replay_xlog,
>         ('x' || lpad(split_part(sent_location,   '/', 2), 8, 
> '0'))::bit(32)::bigint AS sent_offset,
>         ('x' || lpad(split_part(replay_location, '/', 2), 8, 
> '0'))::bit(32)::bigint AS replay_offset
>     FROM pg_stat_replication
> ) AS s;
> byte_lag 
> ----------
> (0 rows)
> 
> 
> 3) session 3: Slave server - 
> postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = 
> pg_last_xlog_replay_location()  THEN 0 ELSE EXTRACT (EPOCH FROM now() - 
> pg_last_xact_replay_timestamp()) END AS log_delay;
>  log_delay 
> -----------
>          0
> (1 row)
> 
> postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = 
> pg_last_xlog_replay_location()  THEN 0 ELSE EXTRACT (EPOCH FROM now() - 
> pg_last_xact_replay_timestamp()) END AS log_delay;
>  log_delay 
> -----------
>   4.873282
> (1 row)
> 
> .
> .
> .
> postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = 
> pg_last_xlog_replay_location()  THEN 0 ELSE EXTRACT (EPOCH FROM now() - 
> pg_last_xact_replay_timestamp()) END AS log_delay;
>   log_delay  
> -------------
>  4070.325329
> (1 row)
> 
> 

Reply via email to