Greg Stark wrote:
> On Mon, Mar 1, 2010 at 5:50 PM, Josh Berkus <[email protected]> wrote:
> > I don't think that defer_cleanup_age is a long-term solution. ?But we
> > need *a* solution which does not involve delaying 9.0.
>
> So I think the primary solution currently is to raise max_standby_age.
>
> However there is a concern with max_standby_age. If you set it to,
> say, 300s. Then run a 300s query on the slave which causes the slave
> to fall 299s behind. Now you start a new query on the slave -- it gets
> a snapshot based on the point in time that the slave is currently at.
> If it hits a conflict it will only have 1s to finish before the
> conflict causes the query to be cancelled.
>
> In short in the current setup I think there is no safe value of
> max_standby_age which will prevent query cancellations short of -1. If
> the slave has a constant stream of queries and always has at least one
> concurrent query running then it's possible that the slave will run
> continuously max_standby_age-epsilon behind the master and cancel
> queries left and right, regardless of how large max_standby_age is.
This is sobering. I have added the attached documentation so at least
this odd behavior is documented.
--
Bruce Momjian <[email protected]> http://momjian.us
EnterpriseDB http://enterprisedb.com
PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.257
diff -c -c -r1.257 config.sgml
*** doc/src/sgml/config.sgml 2 Mar 2010 21:18:59 -0000 1.257
--- doc/src/sgml/config.sgml 2 Mar 2010 23:34:38 -0000
***************
*** 1862,1879 ****
<listitem>
<para>
When server acts as a standby, this parameter specifies a wait policy
! for queries that conflict with data changes being replayed by recovery.
If a conflict should occur the server will delay up to this number
! of seconds before it begins trying to resolve things less amicably, as
! described in <xref linkend="hot-standby-conflict">. Typically,
! this parameter makes sense only during replication, so when
! performing an archive recovery to recover from data loss a very high
! parameter setting or -1 which means wait forever is recommended.
! The default is 30 seconds. Increasing this parameter can delay
! master server changes from appearing on the standby.
This parameter can only be set in the <filename>postgresql.conf</>
file or on the server command line.
</para>
</listitem>
</varlistentry>
--- 1862,1892 ----
<listitem>
<para>
When server acts as a standby, this parameter specifies a wait policy
! for applying WAL entries that conflict with active queries.
If a conflict should occur the server will delay up to this number
! of seconds before it cancels conflicting queries, as
! described in <xref linkend="hot-standby-conflict">.
! Typically, this parameter is used only during replication.
! The default is 30 seconds.
This parameter can only be set in the <filename>postgresql.conf</>
file or on the server command line.
</para>
+ <para>
+ A high value makes query cancel less likely, and -1
+ causes the standby to wait forever for a conflicting query to
+ complete. Increasing this parameter might delay master server
+ changes from appearing on the standby.
+ </para>
+ <para>
+ While it is tempting to believe that <varname>max_standby_delay</>
+ is the maximum number of seconds a query can run before
+ cancellation is possible, this is not true. When a long-running
+ query ends, there is a finite time required to apply backlogged
+ WAL logs. If a second long-running query appears before the
+ WAL has caught up, the snapshot taken by the second query will
+ allow significantly less than <varname>max_standby_delay</>
+ before query cancellation is possible.
+ </para>
</listitem>
</varlistentry>
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers