> 27 окт. 2015 г., в 19:45, Vladimir Borodin <r...@simply.name> написал(а):
> 
> Hi all.
> 
> I’m wondering why do I get conflicts with recovery on hot standby using 
> replication slots and read commited isolation level? And if I start 
> repeatable read transaction I don’t get any errors. Below is some diagnostics.

+hackers@

Could anybody explain, why this is happening?

> 
> I’m using 9.4.4 (upgrade is planned) from yum.postgresql.org 
> <http://yum.postgresql.org/> packages on both master and standby. Configs are 
> the same on both master and standby:
> 
> rpopdb01d/postgres M # SELECT name, setting FROM pg_settings
> WHERE category LIKE 'Replication%' or category LIKE 'Write-Ahead Log';
>              name             | setting
> ------------------------------+---------
>  hot_standby                  | on
>  hot_standby_feedback         | on
>  max_replication_slots        | 1
>  max_standby_archive_delay    | 30000
>  max_standby_streaming_delay  | 30000
>  max_wal_senders              | 10
>  synchronous_standby_names    |
>  vacuum_defer_cleanup_age     | 200000
>  wal_keep_segments            | 64
>  wal_receiver_status_interval | 1
>  wal_receiver_timeout         | 60000
>  wal_sender_timeout           | 3000
> (12 rows)
> 
> Time: 1.583 ms
> rpopdb01d/postgres M #
> 
> On the master I’ve created a physical replication slot and attached standby 
> to it, I do see changing xmin and restart_lsn fields in pg_replication_slots 
> view.
> 
> rpopdb01d/postgres M # select * from pg_replication_slots ;
>          slot_name    | plugin | slot_type | datoid | database | active |    
> xmin    | catalog_xmin |  restart_lsn
> ----------------------+--------+-----------+--------+----------+--------+------------+--------------+---------------
>  rpopdb01e_domain_com | [null] | physical  | [null] | [null]   | t      | 
> 2127399287 |       [null] | 960B/415C79C8
> (1 row)
> 
> Time: 0.463 ms
> rpopdb01d/postgres M #
> 
> When I start a read commited transaction on standby (or use autocommit mode, 
> doesn’t matter) I still see that xmin in pg_replication_slots view on master 
> increases. If I do run a heavy SELECT statement, at some point of time 
> (presumably after vacuum_defer_cleanup_age expires) standby starts to lag 
> replication apply and when it hits max_standby_streaming_delay I get 40001 
> sql code, either ERROR or FATAL:
> 
> rpopdb01e/rpopdb R # SHOW transaction_isolation ;
>  transaction_isolation
> -----------------------
>  read committed
> (1 row)
> 
> Time: 0.324 ms
> rpopdb01e/rpopdb R # SELECT count(*) FROM big_table;
> ERROR:  40001: canceling statement due to conflict with recovery
> DETAIL:  User query might have needed to see row versions that must be 
> removed.
> LOCATION:  ProcessInterrupts, postgres.c:2990
> Time: 199791.339 ms
> rpopdb01e/rpopdb R #
> 
> 
> 
> rpopdb01e/rpopdb R # SHOW transaction_isolation ;
>  transaction_isolation
> -----------------------
>  read committed
> (1 row)
> 
> Time: 0.258 ms
> rpopdb01e/rpopdb R # BEGIN;
> BEGIN
> Time: 0.067 ms
> rpopdb01e/rpopdb R # SELECT count(*) FROM big_table;
> FATAL:  40001: terminating connection due to conflict with recovery
> DETAIL:  User was holding a relation lock for too long.
> LOCATION:  ProcessInterrupts, postgres.c:2857
> server closed the connection unexpectedly
>       This probably means the server terminated abnormally
>       before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
> Time: 307864.830 ms
> rpopdb01e/rpopdb R #
> 
> The behavior is the same as expected to be without using replication slots.
> 
> But when I start repeatable read transaction xmin field in 
> pg_replication_slots view on master freezes (while restart_lsn is still 
> increasing) and I don’t get any replication lag and conflicts with recovery. 
> When I end this transaction, xmin starts increasing again.
> 
> rpopdb01e/rpopdb R # begin transaction isolation level repeatable read;
> BEGIN
> Time: 0.118 ms
> rpopdb01e/rpopdb R # SELECT count(*) FROM big_table;
>    count
> ------------
>  3106222429
> (1 row)
> 
> Time: 411944.889 ms
> rpopdb01e/rpopdb R # ROLLBACK;
> ROLLBACK
> Time: 0.269 ms
> rpopdb01e/rpopdb R #
> 
>  And that is what I expect. Am I missing something or is it expected behavior 
> in read commited mode?
> 
> Thanks in advance.
> 
> --
> May the force be with you…
> https://simply.name <https://simply.name/>


--
May the force be with you…
https://simply.name

Reply via email to