Re: [HACKERS] Re: Hot Standby query cancellation and Streaming Replication integration

Bruce Momjian Tue, 02 Mar 2010 15:39:41 -0800

Greg Stark wrote:
> On Mon, Mar 1, 2010 at 5:50 PM, Josh Berkus <[email protected]> wrote:
> > I don't think that defer_cleanup_age is a long-term solution. ?But we
> > need *a* solution which does not involve delaying 9.0.
> 
> So I think the primary solution currently is to raise max_standby_age.
> 
> However there is a concern with max_standby_age. If you set it to,
> say, 300s. Then run a 300s query on the slave which causes the slave
> to fall 299s behind. Now you start a new query on the slave -- it gets
> a snapshot based on the point in time that the slave is currently at.
> If it hits a conflict it will only have 1s to finish before the
> conflict causes the query to be cancelled.
> 
> In short in the current setup I think there is no safe value of
> max_standby_age which will prevent query cancellations short of -1. If
> the slave has a constant stream of queries and always has at least one
> concurrent query running then it's possible that the slave will run
> continuously max_standby_age-epsilon behind the master and cancel
> queries left and right, regardless of how large max_standby_age is.


This is sobering.  I have added the attached documentation so at least
this odd behavior is documented.

-- 
  Bruce Momjian  <[email protected]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do

Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.257
diff -c -c -r1.257 config.sgml
*** doc/src/sgml/config.sgml	2 Mar 2010 21:18:59 -0000	1.257
--- doc/src/sgml/config.sgml	2 Mar 2010 23:34:38 -0000
***************
*** 1862,1879 ****
        <listitem>
         <para>
          When server acts as a standby, this parameter specifies a wait policy
!         for queries that conflict with data changes being replayed by recovery.
          If a conflict should occur the server will delay up to this number
!         of seconds before it begins trying to resolve things less amicably, as
!         described in <xref linkend="hot-standby-conflict">. Typically,
!         this parameter makes sense only during replication, so when
!         performing an archive recovery to recover from data loss a very high
!         parameter setting or -1 which means wait forever is recommended.
!         The default is 30 seconds.  Increasing this parameter can delay
!         master server changes from appearing on the standby.
          This parameter can only be set in the <filename>postgresql.conf</>
          file or on the server command line.
         </para>
        </listitem>
       </varlistentry>
  
--- 1862,1892 ----
        <listitem>
         <para>
          When server acts as a standby, this parameter specifies a wait policy
!         for applying WAL entries that conflict with active queries.
          If a conflict should occur the server will delay up to this number
!         of seconds before it cancels conflicting queries, as
!         described in <xref linkend="hot-standby-conflict">.
!         Typically, this parameter is used only during replication.
!         The default is 30 seconds.
          This parameter can only be set in the <filename>postgresql.conf</>
          file or on the server command line.
         </para>
+        <para>
+         A high value makes query cancel less likely, and -1
+         causes the standby to wait forever for a conflicting query to
+         complete.  Increasing this parameter might delay master server
+         changes from appearing on the standby.
+       </para>
+       <para>
+        While it is tempting to believe that <varname>max_standby_delay</>
+        is the maximum number of seconds a query can run before
+        cancellation is possible, this is not true.  When a long-running
+        query ends, there is a finite time required to apply backlogged
+        WAL logs.  If a second long-running query appears before the
+        WAL has caught up, the snapshot taken by the second query will
+        allow significantly less than <varname>max_standby_delay</>
+        before query cancellation is possible.
+       </para>
        </listitem>
       </varlistentry>

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: Hot Standby query cancellation and Streaming Replication integration

Reply via email to