[BUGS] BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica

federico Thu, 30 May 2013 09:43:40 -0700

The following bug has been logged on the website:

Bug reference:      8192
Logged by:          Federico Campoli
Email address:      [email protected]
PostgreSQL version: 9.2.4
Operating system:   Debian 6.0
Description:


/*

Description:

It seems on very large tables the concurrent update with vacuum (or
autovacuum), 
when the slave is in hot standby mode, generates long loops in read on a
single wal segment during the recovery process.

This have two nasty effects.
A massive read IO peak and the replay lag increasing as the recovery process
hangs for long periods on a pointless loop.

PostgreSQL version: PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu, compiled
by gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 64-bit

Steps to reproduce the error:
setup an hot standby server.
the error occurs with streaming replication enabled and disabled

*/
SET client_min_messages='debug2';
SET trace_sort='on';


--create a new empty table
DROP TABLE IF EXISTS t_vacuum;
CREATE table t_vacuum
        (
           i_id_row integer,
           ts_time  timestamp with time zone default now()
        )
;
--disable the autovacuum, we need to run it manually
ALTER TABLE t_vacuum set (autovacuum_enabled='off');

--insert into the table enough data to have multiple file nodes
INSERT INTO t_vacuum
        (
                i_id_row
        )
        SELECT * FROM generate_series(1,40000000)
;


/*
start iotop on the hot standby monitoring the postgres:startup process and
eventually the wal receiver
for example iotop -p 31293
wait for all the wal files generated during the update to be applied 
this does not apply on live servers of course but is useful to trap only the
IO activity generated by vacuum

If the streaming replication is on, this select will tell when the slave has
finished.

In any case is useful to check the replication lag during the vacuum.
SELECT 
        pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_location(),
replay_location))
FROM 
        pg_stat_replication
;

*/

--in a new session and start an huge table update
UPDATE t_vacuum set ts_time=now() WHERE i_id_row<20000000;

--then vacuum the table 
VACUUM VERBOSE t_vacuum;

--at some point the startup process will stuck recovering one single wal
file and
--the DISK READ column will show a huge IO for a while. 
--if you monitor the wal receiver also you will notice stream will continue
without problems
--sooner or later the startup process will resume the recovery but meanwhile
the DISK READ is not zero
--the slave will accumulate replication lag 



-- 
Sent via pgsql-bugs mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

[BUGS] BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica

Reply via email to