Re: [HACKERS] Standalone synchronous master

2014-01-27 Thread Robert Haas

On Sun, Jan 26, 2014 at 10:56 PM, Rajeev rastogi
rajeev.rast...@huawei.com wrote:
 On 01/25/2014, Josh Berkus wrote:
  ISTM the consensus is that we need better monitoring/administration
  interfaces so that people can script the behavior they want in
  external tools. Also, a new synchronous apply replication mode would
  be handy, but that'd be a whole different patch. We don't have a patch
  on the table that we could consider committing any time soon, so I'm
  going to mark this as rejected in the commitfest app.

 I don't feel that "we'll never do auto-degrade" is determinative;
 several hackers were for auto-degrade, and they have a good use-case
 argument.  However, we do have consensus that we need more scaffolding
 than this patch supplies in order to make auto-degrade *safe*.

 I encourage the submitter to resubmit an improved version of this
 patch (one with more monitorability) for 9.5 CF1.  That'll give us a
 whole dev cycle to argue about it.

 I shall rework to improve this patch. Below is a summary of all
 discussions, which will be used as input for improving the patch:

 1. Method of degrading the synchronous mode:
 a. Expose the configuration variable through a new SQL-callable function.
 b. Using ALTER SYSTEM SET.
 c. Auto-degrade using some sort of configuration parameter, as done in the
 current patch.
 d. Or maybe a combination of the above, which the DBA can use depending on
 their use case.

   We can discuss further to decide on one of the approaches.

 2. Synchronous mode should be upgraded/restored after at least one synchronous
 standby comes up and has caught up with the master.

 3. Better monitoring/administration interfaces, which would be even better if
 made as a generic trap system.

   I shall propose a better approach for this.

 4. Send committing clients a WARNING if they have committed a synchronous
 transaction and we are in degraded mode.

 5. Please add more if I am missing something.

All of those things have been mentioned, but I'm not sure we have
consensus on which of them we actually want to do, or how.  Figuring
that out seems like the next step.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Standalone synchronous master

2014-01-27 Thread Josh Berkus

On 01/26/2014 07:56 PM, Rajeev rastogi wrote:
 I shall rework to improve this patch. Below is a summary of all
 discussions, which will be used as input for improving the patch:
 
 1. Method of degrading the synchronous mode:
   a. Expose the configuration variable through a new SQL-callable function.
   b. Using ALTER SYSTEM SET.
   c. Auto-degrade using some sort of configuration parameter, as done in the
 current patch.
   d. Or maybe a combination of the above, which the DBA can use depending on
 their use case.
 
   We can discuss further to decide on one of the approaches.
 
 2. Synchronous mode should be upgraded/restored after at least one synchronous
 standby comes up and has caught up with the master.
 
 3. Better monitoring/administration interfaces, which would be even better if
 made as a generic trap system.
 
   I shall propose a better approach for this.
 
 4. Send committing clients a WARNING if they have committed a synchronous
 transaction and we are in degraded mode.
 
 5. Please add more if I am missing something.

I think we actually need two degrade modes:

A. "degrade once": if the sync standby connection is ever lost, degrade
and do not resync.

B. "reconnect": if the sync standby catches up again, return it to sync
status.

The reason you'd want "degrade once" is to avoid the "flaky network"
issue where you're constantly degrading then reattaching the sync
standby, resulting in horrible performance.

If we did offer "degrade once" though, we'd need some easy way to
determine that the master was in a state of permanent degrade, and a
command to make it resync.
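
One way to picture modes A and B is as a small state machine. The sketch
below is a toy model in Python, not PostgreSQL code; every name in it
(SyncRepTracker, force_resync, and so on) is invented for illustration.

```python
from enum import Enum


class Policy(Enum):
    DEGRADE_ONCE = "degrade once"  # mode A
    RECONNECT = "reconnect"        # mode B


class State(Enum):
    SYNC = "sync"
    DEGRADED = "degraded"


class SyncRepTracker:
    """Toy model of the master's sync-rep state (hypothetical, not PostgreSQL)."""

    def __init__(self, policy):
        self.policy = policy
        self.state = State.SYNC
        self.degrade_count = 0  # how many times we dropped out of sync mode

    def standby_lost(self):
        # Losing the sync standby degrades the master in both modes.
        if self.state is State.SYNC:
            self.state = State.DEGRADED
            self.degrade_count += 1

    def standby_caught_up(self):
        # Only mode B returns to sync on its own.
        if self.policy is Policy.RECONNECT:
            self.state = State.SYNC

    def is_permanently_degraded(self):
        # The "easy way to determine" permanent degrade asked for in mode A.
        return self.policy is Policy.DEGRADE_ONCE and self.state is State.DEGRADED

    def force_resync(self):
        # The explicit "make it resync" command for mode A.
        self.state = State.SYNC
```

With a flapping link, the mode B tracker increments degrade_count on every
flap, while the mode A tracker degrades once and stays there until
force_resync() is called.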

Discuss?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Standalone synchronous master

2014-01-26 Thread Hannu Krosing

On 01/24/2014 10:29 PM, Josh Berkus wrote:
 On 01/24/2014 12:47 PM, Heikki Linnakangas wrote:
 ISTM the consensus is that we need better monitoring/administration
 interfaces so that people can script the behavior they want in external
 tools. Also, a new synchronous apply replication mode would be handy,
 but that'd be a whole different patch. We don't have a patch on the
 table that we could consider committing any time soon, so I'm going to
 mark this as rejected in the commitfest app.
 I don't feel that "we'll never do auto-degrade" is determinative;
 several hackers were for auto-degrade, and they have a good use-case
 argument.  

Auto-degrade may make sense together with synchronous apply
mentioned by Heikki.

I do not see much use for synchronous-(noapply)-if-you-can mode,
though it may make some sense in some scenarios if sync failure
is accompanied by loud screaming ("hey DBA, we are writing checks
with no money in the bank, do something fast!")

Perhaps some kind of sync-with-timeout mode, where timing out
results in a weak error (something between the current
warning and error) returned to the client and/or where it causes an
external command to be run, which could then be used to flood the
admins' mailboxes :)

 However, we do have consensus that we need more scaffolding
 than this patch supplies in order to make auto-degrade *safe*.

 I encourage the submitter to resubmit an improved version of this patch
 (one with more monitorability) for 9.5 CF1.  That'll give us a whole
 dev cycle to argue about it.


Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ




Re: [HACKERS] Standalone synchronous master

2014-01-26 Thread Rajeev rastogi

On 01/25/2014, Josh Berkus wrote:
  ISTM the consensus is that we need better monitoring/administration
  interfaces so that people can script the behavior they want in
  external tools. Also, a new synchronous apply replication mode would
  be handy, but that'd be a whole different patch. We don't have a patch
  on the table that we could consider committing any time soon, so I'm
  going to mark this as rejected in the commitfest app.
 
 I don't feel that "we'll never do auto-degrade" is determinative;
 several hackers were for auto-degrade, and they have a good use-case
 argument.  However, we do have consensus that we need more scaffolding
 than this patch supplies in order to make auto-degrade *safe*.
 
 I encourage the submitter to resubmit an improved version of this
 patch (one with more monitorability) for 9.5 CF1.  That'll give us a
 whole dev cycle to argue about it.

I shall rework to improve this patch. Below is a summary of all
discussions, which will be used as input for improving the patch:

1. Method of degrading the synchronous mode:
a. Expose the configuration variable through a new SQL-callable function.
b. Using ALTER SYSTEM SET.
c. Auto-degrade using some sort of configuration parameter, as done in the
current patch.
d. Or maybe a combination of the above, which the DBA can use depending on
their use case.

  We can discuss further to decide on one of the approaches.

2. Synchronous mode should be upgraded/restored after at least one synchronous
standby comes up and has caught up with the master.

3. Better monitoring/administration interfaces, which would be even better if
made as a generic trap system.

  I shall propose a better approach for this.

4. Send committing clients a WARNING if they have committed a synchronous
transaction and we are in degraded mode.

5. Please add more if I am missing something.

Thanks and Regards,
Kumar Rajeev Rastogi
 


Re: [HACKERS] Standalone synchronous master

2014-01-24 Thread Heikki Linnakangas

ISTM the consensus is that we need better monitoring/administration 
interfaces so that people can script the behavior they want in external 
tools. Also, a new synchronous apply replication mode would be handy, 
but that'd be a whole different patch. We don't have a patch on the 
table that we could consider committing any time soon, so I'm going to 
mark this as rejected in the commitfest app.


- Heikki



Re: [HACKERS] Standalone synchronous master

2014-01-24 Thread Josh Berkus

On 01/24/2014 12:47 PM, Heikki Linnakangas wrote:
 ISTM the consensus is that we need better monitoring/administration
 interfaces so that people can script the behavior they want in external
 tools. Also, a new synchronous apply replication mode would be handy,
 but that'd be a whole different patch. We don't have a patch on the
 table that we could consider committing any time soon, so I'm going to
 mark this as rejected in the commitfest app.

I don't feel that "we'll never do auto-degrade" is determinative;
several hackers were for auto-degrade, and they have a good use-case
argument.  However, we do have consensus that we need more scaffolding
than this patch supplies in order to make auto-degrade *safe*.

I encourage the submitter to resubmit an improved version of this patch
(one with more monitorability) for 9.5 CF1.  That'll give us a whole
dev cycle to argue about it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Standalone synchronous master

2014-01-24 Thread Florian Pflug

On Jan24, 2014, at 22:29 , Josh Berkus j...@agliodbs.com wrote:
 On 01/24/2014 12:47 PM, Heikki Linnakangas wrote:
 ISTM the consensus is that we need better monitoring/administration
 interfaces so that people can script the behavior they want in external
 tools. Also, a new synchronous apply replication mode would be handy,
 but that'd be a whole different patch. We don't have a patch on the
 table that we could consider committing any time soon, so I'm going to
 mark this as rejected in the commitfest app.
 
 I don't feel that "we'll never do auto-degrade" is determinative;
 several hackers were for auto-degrade, and they have a good use-case
 argument.  However, we do have consensus that we need more scaffolding
 than this patch supplies in order to make auto-degrade *safe*.
 
 I encourage the submitter to resubmit an improved version of this patch
 (one with more monitorability) for 9.5 CF1.  That'll give us a whole
 dev cycle to argue about it.

There seemed to be at least some support for having a way to manually
degrade from sync rep to async rep via something like

  ALTER SYSTEM SET synchronous_commit='local';

Doing that seems unlikely to meet much resistance on grounds of principle,
so it seems to me that working on that would be the best way forward for
the submitter. I don't know how hard it would be to pull this off,
though.

best regards,
Florian Pflug




Re: [HACKERS] Standalone synchronous master

2014-01-13 Thread Rajeev rastogi

 
 On Sun, Jan 12, Amit Kapila wrote:
  How would that work?  Would it be a tool in contrib?  There already
  is a timeout, so if a tool checked more frequently than the timeout,
  it should work.  The durable notification of the admin would happen
  in the tool, right?
 
  Well, you know what tool *I'm* planning to use.
 
  Thing is, when we talk about auto-degrade, we need to determine things
  like "Is the replica down or is this just a network blip?" and take
  action according to the user's desired configuration.  This is not
  something, realistically, that we can do on a single request.  Whereas
  it would be fairly simple for an external monitoring utility to do:
 
  1. decide replica is offline for the duration (several poll attempts
  have failed)
 
  2. Send ALTER SYSTEM SET to the master and change/disable the
  synch_replicas.
 
Will it be possible with the current mechanism, given that presently the
master will not accept any new command when the sync replica is not
available? Or is there something else that needs to be done along with
the above 2 points to make it possible?

Since there is no WAL written for the ALTER SYSTEM SET command,
it should be able to handle this command even though the sync
replica is not available.

Thanks and Regards,
Kumar Rajeev Rastogi



Re: [HACKERS] Standalone synchronous master

2014-01-13 Thread Florian Pflug

On Jan12, 2014, at 04:18 , Josh Berkus j...@agliodbs.com wrote:
 Thing is, when we talk about auto-degrade, we need to determine things
 like "Is the replica down or is this just a network blip?" and take
 action according to the user's desired configuration.  This is not
 something, realistically, that we can do on a single request.  Whereas
 it would be fairly simple for an external monitoring utility to do:
 
 1. decide replica is offline for the duration (several poll attempts
 have failed)
 
 2. Send ALTER SYSTEM SET to the master and change/disable the
 synch_replicas.
 
 In other words, if we're going to have auto-degrade, the most
 intelligent place for it is in
 RepMgr/HandyRep/OmniPITR/pgPoolII/whatever.  It's also the *easiest*
 place.  Anything we do *inside* Postgres is going to have a really,
 really hard time determining when to degrade.

+1

This is also how 2PC works, btw - the database provides the building
blocks, i.e. PREPARE and COMMIT, and leaves it to a transaction manager
to deal with issues that require a whole-cluster perspective.

best regards,
Florian Pflug




Re: [HACKERS] Standalone synchronous master

2014-01-13 Thread Hannu Krosing

On 01/13/2014 04:12 PM, Florian Pflug wrote:
 On Jan12, 2014, at 04:18 , Josh Berkus j...@agliodbs.com wrote:
 Thing is, when we talk about auto-degrade, we need to determine things
 like "Is the replica down or is this just a network blip?" and take
 action according to the user's desired configuration.  This is not
 something, realistically, that we can do on a single request.  Whereas
 it would be fairly simple for an external monitoring utility to do:

 1. decide replica is offline for the duration (several poll attempts
 have failed)

 2. Send ALTER SYSTEM SET to the master and change/disable the
 synch_replicas.

 In other words, if we're going to have auto-degrade, the most
 intelligent place for it is in
 RepMgr/HandyRep/OmniPITR/pgPoolII/whatever.  It's also the *easiest*
 place.  Anything we do *inside* Postgres is going to have a really,
 really hard time determining when to degrade.
 +1

 This is also how 2PC works, btw - the database provides the building
 blocks, i.e. PREPARE and COMMIT, and leaves it to a transaction manager
 to deal with issues that require a whole-cluster perspective.


++1

I like Simon's idea to have a pg_xxx function for switching between
replication modes, which should be enough to support a monitor
daemon doing the switching.

Maybe we could have a 'syncrep_taking_too_long_command' GUC
which could be used to alert such a monitoring daemon, so it can
immediately check whether to

a) switch master to async rep or standalone mode (in case of sync slave
becoming unavailable)

or

b) fail over to slave (in the almost equally likely case that it was the
master which became disconnected from the world and slave is available)

or

c) do something else depending on circumstances/policy :)


NB! Note that in case of b) 'syncrep_taking_too_long_command' will
very likely also not reach the monitor daemon, so it cannot rely on
this as its main trigger!
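
A rough Python sketch of how such a monitor daemon might pick between
a), b) and c) follows. It assumes the daemon polls both nodes itself, since
the alert command cannot be the only trigger. The poll callables, the
action names and the DEGRADE_SQL list are illustrative assumptions, not an
existing tool's API. Clearing synchronous_standby_names is just one possible
way to "disable the sync replicas", and ALTER SYSTEM changes only take
effect after a configuration reload.

```python
# Statements a monitor could send to the master for option a).
# Assumption: sync rep is disabled by clearing synchronous_standby_names.
DEGRADE_SQL = [
    "ALTER SYSTEM SET synchronous_standby_names = ''",
    "SELECT pg_reload_conf()",
]


def node_alive(poll, attempts):
    """A node counts as alive if any of the poll attempts succeeds."""
    return any(poll() for _ in range(attempts))


def choose_action(poll_standby, poll_master, attempts=3):
    """Return 'none', 'degrade', 'failover' or 'escalate'.

    poll_standby and poll_master are caller-supplied callables that return
    True when the node answered (hypothetical hooks for this sketch).
    """
    standby_ok = node_alive(poll_standby, attempts)
    master_ok = node_alive(poll_master, attempts)
    if master_ok and standby_ok:
        return "none"      # just a network blip, keep sync rep
    if master_ok:
        return "degrade"   # a) standby is gone, run DEGRADE_SQL on master
    if standby_ok:
        return "failover"  # b) master is unreachable, promote the slave
    return "escalate"      # c) neither answers, leave it to policy/humans
```

A standby that misses two polls and answers the third is treated as a blip
and leaves the configuration untouched.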

Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ




Re: [HACKERS] Standalone synchronous master

2014-01-13 Thread Joshua D. Drake



On 01/13/2014 10:12 AM, Hannu Krosing wrote:

In other words, if we're going to have auto-degrade, the most
intelligent place for it is in
RepMgr/HandyRep/OmniPITR/pgPoolII/whatever.  It's also the *easiest*
place.  Anything we do *inside* Postgres is going to have a really,
really hard time determining when to degrade.

+1

This is also how 2PC works, btw - the database provides the building
blocks, i.e. PREPARE and COMMIT, and leaves it to a transaction manager
to deal with issues that require a whole-cluster perspective.



++1


+1



I like Simon's idea to have a pg_xxx function for switching between
replication modes, which should be enough to support a monitor
daemon doing the switching.

Maybe we could have a 'syncrep_taking_too_long_command' GUC
which could be used to alert such a monitoring daemon, so it can
immediately check whether to



I would think that would be a column in pg_stat_replication. Basically 
last_ack or something like that.




a) switch master to async rep or standalone mode (in case of sync slave
becoming unavailable)


Yep.



or

b) fail over to slave (in the almost equally likely case that it was the
master which became disconnected from the world and slave is available)

or


I think this should be left to external tools.

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
In a time of universal deceit - telling the truth is a revolutionary 
act., George Orwell




Re: [HACKERS] Standalone synchronous master

2014-01-13 Thread Jim Nasby


On 1/13/14, 12:21 PM, Joshua D. Drake wrote:


On 01/13/2014 10:12 AM, Hannu Krosing wrote:

In other words, if we're going to have auto-degrade, the most
intelligent place for it is in
RepMgr/HandyRep/OmniPITR/pgPoolII/whatever.  It's also the *easiest*
place.  Anything we do *inside* Postgres is going to have a really,
really hard time determining when to degrade.

+1

This is also how 2PC works, btw - the database provides the building
blocks, i.e. PREPARE and COMMIT, and leaves it to a transaction manager
to deal with issues that require a whole-cluster perspective.



++1


+1


Josh, what do you think of the upthread idea of being able to recover 
in-progress transactions that are waiting when we turn off sync rep? I'm 
thinking that would be a very good feature to have... and it's not something 
you can easily do externally.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



Re: [HACKERS] Standalone synchronous master

2014-01-13 Thread Andres Freund

On 2014-01-13 15:14:21 -0600, Jim Nasby wrote:
 On 1/13/14, 12:21 PM, Joshua D. Drake wrote:
 
 On 01/13/2014 10:12 AM, Hannu Krosing wrote:
 In other words, if we're going to have auto-degrade, the most
 intelligent place for it is in
 RepMgr/HandyRep/OmniPITR/pgPoolII/whatever.  It's also the *easiest*
 place.  Anything we do *inside* Postgres is going to have a really,
 really hard time determining when to degrade.
 +1
 
 This is also how 2PC works, btw - the database provides the building
 blocks, i.e. PREPARE and COMMIT, and leaves it to a transaction manager
 to deal with issues that require a whole-cluster perspective.
 
 
 ++1
 
 +1
 
 Josh, what do you think of the upthread idea of being able to recover 
 in-progress transactions that are waiting when we turn off sync rep? I'm 
 thinking that would be a very good feature to have... and it's not something 
 you can easily do externally.

I think it'd be a fairly simple patch to re-check the state of syncrep
config in SyncRepWaitForLsn(). Alternatively you can just write code to
iterate over the procarray and set Proc->syncRepState to
SYNC_REP_WAIT_CANCELLED or such.
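
The second alternative can be modelled in a few lines of Python. This is
only a toy illustration of the loop, not backend code: the Proc class and
the numeric state values are made up, SYNC_REP_WAIT_CANCELLED is the "or
such" name from above, and a real patch would presumably also have to take
the appropriate lock and wake the waiting backends.

```python
from dataclasses import dataclass

# State values are arbitrary here; only the names matter for the sketch.
SYNC_REP_NOT_WAITING = 0
SYNC_REP_WAITING = 1
SYNC_REP_WAIT_CANCELLED = 2  # hypothetical new state


@dataclass
class Proc:
    """Stand-in for one backend's entry in the proc array."""
    pid: int
    sync_rep_state: int = SYNC_REP_NOT_WAITING


def cancel_sync_rep_waits(proc_array):
    """Mark every waiting backend as cancelled; return how many were released."""
    released = 0
    for proc in proc_array:
        if proc.sync_rep_state == SYNC_REP_WAITING:
            proc.sync_rep_state = SYNC_REP_WAIT_CANCELLED
            released += 1
    return released
```

Backends that are not waiting are left alone, and calling the function a
second time releases nothing.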

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Standalone synchronous master

2014-01-13 Thread Joshua D. Drake



On 01/13/2014 01:14 PM, Jim Nasby wrote:


On 1/13/14, 12:21 PM, Joshua D. Drake wrote:


On 01/13/2014 10:12 AM, Hannu Krosing wrote:

In other words, if we're going to have auto-degrade, the most
intelligent place for it is in
RepMgr/HandyRep/OmniPITR/pgPoolII/whatever.  It's also the *easiest*
place.  Anything we do *inside* Postgres is going to have a really,
really hard time determining when to degrade.

+1

This is also how 2PC works, btw - the database provides the building
blocks, i.e. PREPARE and COMMIT, and leaves it to a transaction manager
to deal with issues that require a whole-cluster perspective.



++1


+1


Josh, what do you think of the upthread idea of being able to recover
in-progress transactions that are waiting when we turn off sync rep? I'm
thinking that would be a very good feature to have... and it's not
something you can easily do externally.


I think it is extremely valuable, else we have lost those transactions 
which is exactly what we don't want.


JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
In a time of universal deceit - telling the truth is a revolutionary 
act., George Orwell




Re: [HACKERS] Standalone synchronous master

2014-01-13 Thread Florian Pflug

On Jan13, 2014, at 22:30 , Joshua D. Drake j...@commandprompt.com wrote:
 On 01/13/2014 01:14 PM, Jim Nasby wrote:
 
 On 1/13/14, 12:21 PM, Joshua D. Drake wrote:
 
 On 01/13/2014 10:12 AM, Hannu Krosing wrote:
 In other words, if we're going to have auto-degrade, the most
 intelligent place for it is in
 RepMgr/HandyRep/OmniPITR/pgPoolII/whatever.  It's also the *easiest*
 place.  Anything we do *inside* Postgres is going to have a really,
 really hard time determining when to degrade.
 +1
 
 This is also how 2PC works, btw - the database provides the building
 blocks, i.e. PREPARE and COMMIT, and leaves it to a transaction manager
 to deal with issues that require a whole-cluster perspective.
 
 
 ++1
 
 +1
 
 Josh, what do you think of the upthread idea of being able to recover
 in-progress transactions that are waiting when we turn off sync rep? I'm
 thinking that would be a very good feature to have... and it's not
 something you can easily do externally.
 
 I think it is extremely valuable, else we have lost those transactions which
 is exactly what we don't want.

We *have* to "recover" waiting transactions upon switching off sync rep.

A transaction that waits for a sync standby to respond has already committed
locally (i.e., updated the clog), it just hasn't updated the proc array yet,
and thus is still seen as in-progress by the rest of the system. But rolling
back the transaction is nevertheless *impossible* at that point (except by
PITR, hence the quotes around "recover"). So the only alternative to
recovering them, i.e. having them abort their waiting, is to let them linger
indefinitely, still holding their locks, preventing xmin from advancing, etc.,
until either the client disconnects or the server is restarted.
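
To make the state of such a transaction concrete, here is a deliberately
simplified Python model. The class and its fields are invented for
illustration and do not mirror real backend structures. The point it shows
is that the only possible transition is forward: the waiter finishes its
commit, it never rolls back.

```python
class WaitingTxn:
    """Toy model of a transaction stuck waiting for the sync standby."""

    def __init__(self, xid):
        self.xid = xid
        self.committed_in_clog = True  # commit already recorded locally
        self.in_proc_array = True      # so others still see it as in-progress
        self.locks_held = True         # and it keeps holding its locks

    def visible_as_committed(self):
        # Other sessions see the commit only once the proc array is updated.
        return self.committed_in_clog and not self.in_proc_array

    def stop_waiting(self):
        # "Recovering" the transaction: abort the wait, complete the commit.
        self.in_proc_array = False
        self.locks_held = False
```

There is intentionally no rollback method, because the clog update has
already happened by the time the wait begins.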

best regards,
Florian Pflug




Re: [HACKERS] Standalone synchronous master

2014-01-12 Thread Florian Pflug

On Jan11, 2014, at 18:53 , Andres Freund and...@2ndquadrant.com wrote:
 On 2014-01-11 18:28:31 +0100, Florian Pflug wrote:
 Hm, I was about to suggest that you can set statement_timeout before
 doing COMMIT to limit the amount of time you want to wait for the
 standby to respond. Interestingly, however, that doesn't seem to work,
 which is weird, since AFAICS statement_timeout simply generates a
 query cancel request after the timeout has elapsed, and cancelling
 the COMMIT with Ctrl-C in psql *does* work.
 
 I think that'd be a pretty bad API since you won't know whether the
 commit failed or succeeded but replication timed out. There very well
 might have been longrunning constraint triggers or such taking a long
 time.

You could still distinguish these cases because the COMMIT would succeed
with a WARNING if the timeout elapses while waiting for the standby, just
as it does for query cancellations already.

I'm not saying that this is a great API, though - I brought it up only
because accepting cancellation requests but ignoring timeouts seems
a bit inconsistent to me.

best regards,
Florian Pflug




Re: [HACKERS] Standalone synchronous master

2014-01-12 Thread Josh Berkus

All,

I'm leading this off with a review of the features offered by the actual
patch submitted.  My general discussion of the issues of Sync Degrade,
which justifies my specific suggestions below, follows that.  Rajeev,
please be aware that other hackers may have different opinions than me
on what needs to change about the patch, so you should collect all
opinions before changing code.

===

 Add a new parameter :

 synchronous_standalone_master = on | off

I think this is a TERRIBLE name for any such parameter.  What does
"synchronous standalone" even mean?  A better name for the parameter
would be "auto_degrade_sync_replication" or "synchronous_timeout_action
= error | degrade", or something similar.  It would be even better for
this to be a mode of synchronous_commit, except that synchronous_commit
is heavily overloaded already.

Some issues raised by this log script:

LOG:  standby tx0113 is now the synchronous standby with priority 1
LOG:  waiting for standby synchronization
  -- standby wal receiver on the standby is killed (SIGKILL)
LOG:  unexpected EOF on standby connection
LOG:  not waiting for standby synchronization
  -- restart standby so that it connects again
LOG:  standby tx0113 is now the synchronous standby with priority 1
LOG:  waiting for standby synchronization
  -- standby wal receiver is first stopped (SIGSTOP) to make sure

The "not waiting for standby synchronization" message should be marked
something stronger than LOG.  I'd like ERROR.

Second, you have the master resuming sync rep when the standby
reconnects.  How do you determine when it's safe to do that?  You're
making the assumption that you have a failing sync standby instead of
one which simply can't keep up with the master, or a flaky network
connection (see discussion below).

 a.   Master_to_standalone_cmd: To be executed before master
switches to standalone mode.

 b.  Master_to_sync_cmd: To be executed before master switches from
sync mode to standalone mode.

I'm not at all clear what the difference between these two commands is.
 When would one be executed, and when would the other be executed?  Also,
renaming ...

Missing features:

a) we should at least send committing clients a WARNING if they have
committed a synchronous transaction and we are in degraded mode.

I know others have dismissed this idea as too "talky", but from my
perspective, the agreement with the client for each synchronous commit
is being violated, so each and every synchronous commit should report
failure to sync.  Also, having a warning on every commit would make it
easier to troubleshoot degraded mode for users who have ignored the
other warnings we give them.

b) pg_stat_replication needs to show degraded mode in some way, or we
need pg_sync_rep_degraded(), or (ideally) both.

I'm also wondering if we need a more sophisticated approach to
wal_sender_timeout to go with all this.

===

On 01/11/2014 08:33 PM, Bruce Momjian wrote:
 On Sat, Jan 11, 2014 at 07:18:02PM -0800, Josh Berkus wrote:
 In other words, if we're going to have auto-degrade, the most
 intelligent place for it is in
 RepMgr/HandyRep/OmniPITR/pgPoolII/whatever.  It's also the *easiest*
 place.  Anything we do *inside* Postgres is going to have a really,
 really hard time determining when to degrade.
 
 Well, one goal I was considering is that if a commit is hung waiting for
 slave sync confirmation, and the timeout happens, then the mode is
 changed to degraded and the commit returns success.  I am not sure how
 you would do that in an external tool, meaning there is going to be
 period where commits fail, unless you think there is a way that when the
 external tool changes the mode to degrade that all hung commits
 complete.  That would be nice.

Realistically, though, that's pretty unavoidable.  Any technique which
waits a reasonable interval to determine that the replica isn't going to
respond is liable to go beyond the application's timeout threshold
anyway.  There are undoubtedly exceptions to that, but it will be the
case a lot of the time -- how many applications are willing to wait
*minutes* for a COMMIT?

I also don't see any way to allow the hung transactions to commit
without allowing the walsender to make a decision on degrading.  As I've
outlined elsewhere (and below), the walsender just doesn't have enough
information to make a good decision.

On 01/11/2014 08:52 PM, Amit Kapila wrote:
 It is better than async mode in a way such that in async mode it never
 waits for commits to be written to standby, but in this new mode it will
 do so unless it is not possible (all sync standbys go down).
 Can't we use existing wal_sender_timeout, or even if user expects a
 different timeout because for this new mode, he expects master to wait
 more before it start operating like standalone sync master, we can provide
 a new parameter.

One of the reasons that there's so much disagreement about this feature
is that most of the

Re: [HACKERS] Standalone synchronous master

2014-01-12 Thread Stephen Frost

* Josh Berkus (j...@agliodbs.com) wrote:
 On 01/11/2014 08:52 PM, Amit Kapila wrote:
  It is better than async mode in a way such that in async mode it never
  waits for commits to be written to standby, but in this new mode it will
  do so unless it is not possible (all sync standbys go down).
  Can't we use existing wal_sender_timeout, or even if user expects a
  different timeout because for this new mode, he expects master to wait
  more before it start operating like standalone sync master, we can provide
  a new parameter.
 
 One of the reasons that there's so much disagreement about this feature
 is that most of the folks strongly in favor of auto-degrade are thinking
 *only* of the case that the standby is completely down.  There are many
 other reasons for a sync transaction to hang, and the walsender has
 absolutely no way of knowing which is the case.  For example:

Uhh, yea, no, I'm pretty sure those in favor of auto-degrade are very
specifically thinking of cases like "Standby is restarting", which is
not a reason for the master to fall over.

 * Transient network issues
 * Standby can't keep up with master
 * Postgres bug
 * Storage/IO issues (think EBS)
 * Standby is restarting
 
 You don't want to handle all of those issues the same way as far as sync
 rep is concerned.  For example, if the standby is restarting, you
 probably want to wait instead of degrading.

*What*?!  Certainly not in any kind of OLTP-type system; a system
restart can easily take minutes.  Clearly, you want to resume once the
standby is back up, which I feel like the people against an auto-degrade
mode are missing, but holding up a commit until the standby finishes
rebooting isn't practical.

 There's also the issue that this patch, and necessarily any
 walsender-level auto-degrade, has IMHO no safe way to resume sync
 replication.  This means that any user who has a network or storage blip
 once a day (again, think AWS) would be constantly in degraded mode, even
 though both the master and the replica are up and running -- and it will
 come as a complete surprise to them when they lose the master and
 discover that they've lost data.

I don't follow this logic at all- why is there no safe way to resume?
You wait til the slave is caught up fully and then go back to sync mode.
If that turns out to be an extended problem then an alarm needs to be
raised, of course.

Thanks,

Stephen



Re: [HACKERS] Standalone synchronous master

2014-01-12 Thread Kevin Grittner

Josh Berkus j...@agliodbs.com wrote:

  Add a new parameter :

 
  synchronous_standalone_master = on | off
 
 I think this is a TERRIBLE name for any such parameter.  What does
 "synchronous standalone" even mean?  A better name for the parameter
 would be auto_degrade_sync_replication or synchronous_timeout_action
 = error | degrade, or something similar.  It would be even better for
 this to be a mode of synchronous_commit, except that synchronous_commit
 is heavily overloaded already.

+1

 a) we should at least send committing clients a WARNING if they have
 committed a synchronous transaction and we are in degraded mode.
 
 I know others have dismissed this idea as too talky, but from my
 perspective, the agreement with the client for each synchronous commit
 is being violated, so each and every synchronous commit should report
 failure to sync.  Also, having a warning on every commit would make it
 easier to troubleshoot degraded mode for users who have ignored the
 other warnings we give them.

I agree that every synchronous commit on a master which is configured for 
synchronous replication which returns without persisting the work of the 
transaction on both the (local) primary and a synchronous replica should issue 
a WARNING.  That said, the API for some connectors (like JDBC) puts the burden 
on the application or its framework to check for warnings each time and do 
something reasonable if found; I fear that a Venn diagram of those shops which 
would use this new feature and those shops that don't rigorously look for and 
reasonably deal with warnings would have significant overlap.

 b) pg_stat_replication needs to show degraded mode in some way, or we
 need pg_sync_rep_degraded(), or (ideally) both.

+1

Since this new feature, where enabled, would cause synchronous replication to 
provide no guarantees beyond what asynchronous replication does[1], but would 
tend to cause people to have an *expectation* that they have some additional 
protection, I think proper documentation will be a big challenge.


--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1]  If I understand correctly, this is what the feature is intended to provide:
- A transaction successfully committed on the primary is guaranteed to be 
visible on the replica?  No, in all modes.
- A transaction successfully committed on the primary is guaranteed *not* to be 
visible on the replica?  No, in all modes.
- The work of a transaction which has not returned from a commit request may
be visible on the primary and/or the standby?  Yes in all modes.
- A failure of the primary is guaranteed not to lose successfully committed 
transactions when failing over to the replica?  Yes for sync rep without this 
feature, no for async or when this feature is used.  If things are going well 
up to the moment of primary failure, the feature improves the odds (versus 
async) that successfully committed transactions will not be lost, or may reduce 
the number of successfully committed transactions lost.
- A failure of the replica allows transactions on the primary to continue?  
Read only for sync rep without this feature if the last sync standby has 
failed, read only for some interval and then read write with this feature or if 
there is still another working sync rep target, all transactions without 
interruption with async.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Standalone synchronous master

2014-01-12 Thread Josh Berkus

On 01/12/2014 12:35 PM, Stephen Frost wrote:
 * Josh Berkus (j...@agliodbs.com) wrote:
 You don't want to handle all of those issues the same way as far as sync
 rep is concerned.  For example, if the standby is restarting, you
 probably want to wait instead of degrading.
 
 *What*?!  Certainly not in any kind of OLTP-type system; a system
 restart can easily take minutes.  Clearly, you want to resume once the
 standby is back up, which I feel like the people against an auto-degrade
 mode are missing, but holding up a commit until the standby finishes
 rebooting isn't practical.

Well, then that becomes a reason to want better/more configurability.
In the couple of sync rep sites I admin, I *would* want to wait.

 There's also the issue that this patch, and necessarily any
 walsender-level auto-degrade, has IMHO no safe way to resume sync
 replication.  This means that any user who has a network or storage blip
 once a day (again, think AWS) would be constantly in degraded mode, even
 though both the master and the replica are up and running -- and it will
 come as a complete surprise to them when they lose the master and
 discover that they've lost data.
 
 I don't follow this logic at all- why is there no safe way to resume?
 You wait til the slave is caught up fully and then go back to sync mode.
 If that turns out to be an extended problem then an alarm needs to be
 raised, of course.

So, if you have auto-resume, how do you handle the flaky network case?
 And how would an alarm be raised?

On 01/12/2014 12:51 PM, Kevin Grittner wrote:
 Josh Berkus j...@agliodbs.com wrote:
 I know others have dismissed this idea as too talky, but from my
 perspective, the agreement with the client for each synchronous
 commit is being violated, so each and every synchronous commit
 should report failure to sync.  Also, having a warning on every
 commit would make it easier to troubleshoot degraded mode for users
 who have ignored the other warnings we give them.

 I agree that every synchronous commit on a master which is configured
 for synchronous replication which returns without persisting the work
 of the transaction on both the (local) primary and a synchronous
 replica should issue a WARNING.  That said, the API for some
 connectors (like JDBC) puts the burden on the application or its
 framework to check for warnings each time and do something reasonable
 if found; I fear that a Venn diagram of those shops which would use
 this new feature and those shops that don't rigorously look for and
 reasonably deal with warnings would have significant overlap.

Oh, no question.  However, having such a WARNING would help with
interactive troubleshooting once a problem has been identified, and
that's my main reason for wanting it.

Imagine the case where you have auto-degrade and a flaky network.  The
user would experience problems as performance problems; that is, some
commits take minutes on-again, off-again.  They wouldn't necessarily
even LOOK at the sync rep settings.  So next step is to try walking
through a sample transaction on the command line, and then the
DBA/consultant gets WARNING messages, which gives an idea where the real
problem lies.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Standalone synchronous master

2014-01-12 Thread Stephen Frost

* Josh Berkus (j...@agliodbs.com) wrote:
 Well, then that becomes a reason to want better/more configurability.

I agree with this- the challenge is figuring out what those options
should be and how we should document them.

 In the couple of sync rep sites I admin, I *would* want to wait.

That's certainly an interesting data point.  One of the specific
use-cases that I'm thinking of is to auto-degrade on a graceful shutdown
of the slave for upgrades and/or maintenance.  Perhaps we don't need
*auto* degrade in that case, but then an actual failure of the slave
will also bring down the master.

  I don't follow this logic at all- why is there no safe way to resume?
  You wait til the slave is caught up fully and then go back to sync mode.
  If that turns out to be an extended problem then an alarm needs to be
  raised, of course.
 
 So, if you have auto-resume, how do you handle the flaky network case?
  And how would an alarm be raised?

Ideally, every time there is an auto-degrade, messages are logged to log
files which are monitored, and notices are sent to admins about it
happening, who, upon getting repeated such emails, would realize there's
a problem and work to fix it.
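
The log-and-notify idea above could be approximated outside the server. What follows is only a minimal sketch, assuming a hypothetical monitoring script that is handed one timestamp (in seconds) per auto-degrade event it finds in the log. The class name, the threshold and the window are illustrative assumptions, not part of the patch or of PostgreSQL.

```python
from collections import deque


class DegradeAlarm:
    """Hypothetical helper: flags repeated auto-degrade events.

    An alarm fires once `max_events` degrade events have been seen
    within the last `window_seconds` seconds.
    """

    def __init__(self, max_events=3, window_seconds=86400.0):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events = deque()  # timestamps of recent degrade events

    def record(self, timestamp):
        """Record one degrade event; return True if an alarm should fire."""
        self._events.append(timestamp)
        # Forget events that have fallen out of the sliding window.
        while timestamp - self._events[0] > self.window_seconds:
            self._events.popleft()
        return len(self._events) >= self.max_events
```

With such a helper, a single blip stays quiet, while a flaky network that degrades several times a day would trip the alarm; how the alarm reaches the admin (email, pager) is left to the surrounding tooling.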

 On 01/12/2014 12:51 PM, Kevin Grittner wrote:
  Josh Berkus j...@agliodbs.com wrote:
  I know others have dismissed this idea as too talky, but from my
  perspective, the agreement with the client for each synchronous
  commit is being violated, so each and every synchronous commit
  should report failure to sync.  Also, having a warning on every
  commit would make it easier to troubleshoot degraded mode for users
  who have ignored the other warnings we give them.
 
  I agree that every synchronous commit on a master which is configured
  for synchronous replication which returns without persisting the work
  of the transaction on both the (local) primary and a synchronous
  replica should issue a WARNING.  That said, the API for some
  connectors (like JDBC) puts the burden on the application or its
  framework to check for warnings each time and do something reasonable
  if found; I fear that a Venn diagram of those shops which would use
  this new feature and those shops that don't rigorously look for and
  reasonably deal with warnings would have significant overlap.
 
 Oh, no question.  However, having such a WARNING would help with
 interactive troubleshooting once a problem has been identified, and
 that's my main reason for wanting it.

I'm in the camp of this being too 'talky'.

 Imagine the case where you have auto-degrade and a flaky network.  The
 user would experience problems as performance problems; that is, some
 commits take minutes on-again, off-again.  They wouldn't necessarily
 even LOOK at the sync rep settings.  So next step is to try walking
 through a sample transaction on the command line, and then the
 DBA/consultant gets WARNING messages, which gives an idea where the real
 problem lies.

Or they look in the logs which hopefully say that their slave keeps
getting disconnected...

Thanks,

Stephen



Re: [HACKERS] Standalone synchronous master

2014-01-12 Thread Amit Kapila

 On 01/11/2014 08:52 PM, Amit Kapila wrote:
 It is better than async mode in that async mode never waits for commits
 to be written to the standby, but this new mode will do so unless it is
 not possible (all sync standbys go down).
 Can't we use the existing wal_sender_timeout? Or, if the user expects a
 different timeout for this new mode because he expects the master to wait
 longer before it starts operating like a standalone sync master, we can
 provide a new parameter.

 One of the reasons that there's so much disagreement about this feature
 is that most of the folks strongly in favor of auto-degrade are thinking
 *only* of the case that the standby is completely down.  There are many
 other reasons for a sync transaction to hang, and the walsender has
 absolutely no way of knowing which is the case.  For example:

 * Transient network issues
 * Standby can't keep up with master
 * Postgres bug
 * Storage/IO issues (think EBS)
 * Standby is restarting

 You don't want to handle all of those issues the same way as far as sync
 rep is concerned.  For example, if the standby is restarting, you
 probably want to wait instead of degrading.

   I think it might be difficult to differentiate the cases, except maybe
   by having a separate timeout for this mode, so that it can wait longer
   when the server runs in this mode. OTOH, why can't we define this new
   mode such that it behaves the same in all cases? Basically, we can say
   that whenever the sync standby is not available (n/w issue or m/c down),
   it will behave as a master in async mode.
   Here I think the important point would be to gracefully allow resuming
   the sync standby when it tries to reconnect (we can allow it to reconnect
   if it can resolve all WAL differences).


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] Standalone synchronous master

2014-01-12 Thread Rajeev rastogi


On 13th January 2013, Josh Berkus Wrote:

 I'm leading this off with a review of the features offered by the
 actual patch submitted.  My general discussion of the issues of Sync
 Degrade, which justifies my specific suggestions below, follows that.
 Rajeev, please be aware that other hackers may have different opinions
 than me on what needs to change about the patch, so you should collect
 all opinions before changing code.

Thanks for reviewing and providing the first level of comments. We will
certainly collect all feedback to improve this patch.

 
  Add a new parameter :
 
  synchronous_standalone_master = on | off
 
 I think this is a TERRIBLE name for any such parameter.  What does
 "synchronous standalone" even mean?  A better name for the parameter
 would be auto_degrade_sync_replication or synchronous_timeout_action
 = error | degrade, or something similar.  It would be even better for
 this to be a mode of synchronous_commit, except that synchronous_commit
 is heavily overloaded already.

Yes, we can change this parameter name. Some of the suggestions for how to
degrade the mode:
1. Auto-degrade using some sort of configuration parameter, as done in the
current patch.
2. Expose the configuration variable to a new SQL-callable function, as
suggested by Heikki.
3. Or use ALTER SYSTEM SET, as suggested by others.

 Some issues raised by this log script:
 
 LOG:  standby tx0113 is now the synchronous standby with priority 1
 LOG:  waiting for standby synchronization
   -- standby wal receiver on the standby is killed (SIGKILL)
 LOG:  unexpected EOF on standby connection
 LOG:  not waiting for standby synchronization
   -- restart standby so that it connects again
 LOG:  standby tx0113 is now the synchronous standby with priority 1
 LOG:  waiting for standby synchronization
   -- standby wal receiver is first stopped (SIGSTOP) to make sure
 
 The "not waiting for standby synchronization" message should be marked
 something stronger than LOG.  I'd like ERROR.

Yes we can change this to ERROR.

 Second, you have the master resuming sync rep when the standby
 reconnects.  How do you determine when it's safe to do that?  You're
 making the assumption that you have a failing sync standby instead of
 one which simply can't keep up with the master, or a flakey network
 connection (see discussion below).

Yes, this can be further improved so that the master is upgraded to synchronous
mode, by one of the methods discussed above, only once we have made sure that
the synchronous standby has caught up with the master node (this may require a
better design).

  a.   Master_to_standalone_cmd: To be executed before master
 switches to standalone mode.
 
  b.  Master_to_sync_cmd: To be executed before master switches
 from
 sync mode to standalone mode.
 
 I'm not at all clear what the difference between these two commands is.
  When would one be executed, and when would the other be executed?  Also,
 renaming ...

There is a typo in the above explanation; the meanings of the two commands are:
a.  Master_to_standalone_cmd: To be executed during degradation of sync mode.
b.  Master_to_sync_cmd: To be executed before upgrade or restoration of mode.

These two commands are per the TODO item to inform the DBA.

But as per Heikki's suggestion, we should not use this mechanism to inform
the DBA; rather, we should have some sort of generic trap system, instead of
adding this one particular extra config option specifically for this feature.
This looks to be a better idea, so we can have further discussion to come up
with a proper design.


 Missing features:
 
 a) we should at least send committing clients a WARNING if they have
 committed a synchronous transaction and we are in degraded mode.

Yes, it is a great idea.

 One of the reasons that there's so much disagreement about this feature
 is that most of the folks strongly in favor of auto-degrade are
 thinking
 *only* of the case that the standby is completely down.  There are many
 other reasons for a sync transaction to hang, and the walsender has
 absolutely no way of knowing which is the case.  For example:
 
 * Transient network issues
 * Standby can't keep up with master
 * Postgres bug
 * Storage/IO issues (think EBS)
 * Standby is restarting
 
 You don't want to handle all of those issues the same way as far as
 sync rep is concerned.  For example, if the standby is restarting, you
 probably want to wait instead of degrading.

I think if we support an external SQL-callable function to degrade, as Heikki
suggested, instead of auto-degrade, then users can handle at least some of the
above scenarios, if not all, based on their experience and observation.


Thanks and Regards,
Kumar Rajeev Rastogi



Re: [HACKERS] Standalone synchronous master

2014-01-11 Thread Amit Kapila

On Fri, Jan 10, 2014 at 9:17 PM, Bruce Momjian br...@momjian.us wrote:
 On Fri, Jan 10, 2014 at 10:21:42AM +0530, Amit Kapila wrote:
 Here I think if the user is aware from the beginning that this is the
 behaviour, then maybe the importance of the message is not very high.
 What I want to say is that we could provide a UI such that the user
 decides, during setup of the server, the behavior that he requires.

 For example, we could provide a new parameter
 available_synchronous_standby_names along with the current parameter
 and ask the user to use this new parameter if he wishes to synchronously
 commit transactions on another server when it is available; otherwise it
 will operate as a standalone sync master.

 I know there was a desire to remove this TODO item, but I think we have
 brought up enough new issues that we can keep it to see if we can come
 up with a solution.

  I am not saying any such thing; rather, I am suggesting some other way
  for this new mode.

 I have added a link to this discussion on the TODO
 item.

 I think we will need at least four new GUC variables:

 *  timeout control for degraded mode
 *  command to run during switch to degraded mode
 *  command to run during switch from degraded mode
 *  read-only variable to report degraded mode

Okay, this is one way of providing this new mode; others could be:

a.
Have just one GUC, sync_standalone_mode = true|false, and make
it a PGC_POSTMASTER parameter, so that the user is only
allowed to set this mode at startup. Even if we don't want it as a
postmaster parameter, we can tell users that they can
change this parameter only before the server reaches the current situation.
I understand that without an alarm or some other mechanism it is difficult
for the user to know and change it, but I think in that case he should
set it before server startup.

b.
Along the same lines, instead of a boolean parameter, provide a parameter
similar to the current one, such as available_synchronous_standby_names;
setting it should follow what I said in point a. The benefit of this
compared to 'a' is that it appears to be more like what we currently have.

I think if we try to solve this problem by providing a way for the user
to change it at runtime, or when the problem actually occurs, it can
make the UI more complex, and it is difficult for us to provide a way for
the user to be alerted in such a situation. We can keep our options open
so that if tomorrow we find a reasonable way, we can
provide the user a mechanism for changing this at runtime, but I don't
think that stops us from providing a way for the user to get the
benefit of this mode through a start-time parameter.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] Standalone synchronous master

2014-01-11 Thread Bruce Momjian

On Sat, Jan 11, 2014 at 01:29:23PM +0530, Amit Kapila wrote:
 Okay, this is one way of providing this new mode; others could be:
 
 a.
 Have just one GUC, sync_standalone_mode = true|false, and make
 it a PGC_POSTMASTER parameter, so that the user is only
 allowed to set this mode at startup. Even if we don't want it as a
 postmaster parameter, we can tell users that they can
 change this parameter only before the server reaches the current situation.
 I understand that without an alarm or some other mechanism it is difficult
 for the user to know and change it, but I think in that case he should
 set it before server startup.
 
 b.
 Along the same lines, instead of a boolean parameter, provide a parameter
 similar to the current one, such as available_synchronous_standby_names;
 setting it should follow what I said in point a. The benefit of this
 compared to 'a' is that it appears to be more like what we currently have.
 
 I think if we try to solve this problem by providing a way for the user
 to change it at runtime, or when the problem actually occurs, it can
 make the UI more complex, and it is difficult for us to provide a way for
 the user to be alerted in such a situation. We can keep our options open
 so that if tomorrow we find a reasonable way, we can
 provide the user a mechanism for changing this at runtime, but I don't
 think that stops us from providing a way for the user to get the
 benefit of this mode through a start-time parameter.

I am not sure how this would work.  Right now we wait for one of the
synchronous_standby_names servers to verify the writes.   We need some
way of telling the system how long to wait before continuing in degraded
mode.  Without a timeout and admin notification, it doesn't seem much
better than our async mode, which is what many people were complaining
about.
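
To make the timeout-plus-notification requirement concrete, here is a rough sketch in Python. It is not how the walsender or the submitted patch works. It only models the decision described above: wait up to a configured interval for a standby acknowledgement, and if none arrives, notify the admin and continue in degraded mode. The function name, the `notify_admin` callback and the return values are all hypothetical.

```python
import threading


def wait_for_standby_ack(ack, timeout_s, notify_admin):
    """Wait up to timeout_s seconds for the standby to acknowledge a commit.

    ack          -- threading.Event set by whoever receives the standby's reply
    timeout_s    -- how long to wait before continuing in degraded mode
    notify_admin -- callable taking one message string (hypothetical hook)

    Returns "synced" if the acknowledgement arrived in time, else "degraded".
    """
    if ack.wait(timeout_s):
        return "synced"
    # No acknowledgement within the timeout: tell the admin, then carry on
    # without the standby instead of blocking the commit forever.
    notify_admin(
        "standby did not acknowledge within %.1f s; continuing degraded"
        % timeout_s
    )
    return "degraded"
```

The point of the sketch is that both pieces are needed together: a timeout alone silently turns sync rep into async, while the notification is what tells the admin that the guarantee has been dropped.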

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +



Re: [HACKERS] Standalone synchronous master

2014-01-11 Thread Florian Pflug

On Jan11, 2014, at 01:48 , Joshua D. Drake j...@commandprompt.com wrote:
 On 01/10/2014 04:38 PM, Stephen Frost wrote:
 Adrian,
 
 * Adrian Klaver (adrian.kla...@gmail.com) wrote:
 On 01/10/2014 04:25 PM, Stephen Frost wrote:
 * Adrian Klaver (adrian.kla...@gmail.com) wrote:
 A) Change the existing sync mode to allow the master and standby
 to fall out of sync should a standby fall over.
 
 I'm not sure that anyone is arguing for this..
 
 Looks like here, unless I am really missing the point:
 
 Elsewhere in the thread, JD agreed that having it as an independent
 option was fine.
 
 Yes. I am fine with an independent option.

Hm, I was about to suggest that you can set statement_timeout before
doing COMMIT to limit the amount of time you want to wait for the
standby to respond. Interestingly, however, that doesn't seem to work,
which is weird, since AFAICS statement_timeout simply generates a
query cancel request after the timeout has elapsed, and cancelling
the COMMIT with Ctrl-C in psql *does* work.

I'm quite probably missing something, but what?

best regards,
Florian Pflug




Re: [HACKERS] Standalone synchronous master

2014-01-11 Thread Tom Lane

Florian Pflug f...@phlo.org writes:
 Hm, I was about to suggest that you can set statement_timeout before
 doing COMMIT to limit the amount of time you want to wait for the
 standby to respond. Interestingly, however, that doesn't seem to work,
 which is weird, since AFAICS statement_timeout simply generates a
 query cancel request after the timeout has elapsed, and cancelling
 the COMMIT with Ctrl-C in psql *does* work.

 I'm quite probably missing something, but what?

finish_xact_command() disables statement timeout before committing.

Not sure about the pros and cons of doing that later in the sequence.

regards, tom lane



Re: [HACKERS] Standalone synchronous master

2014-01-11 Thread Andres Freund

On 2014-01-11 18:28:31 +0100, Florian Pflug wrote:
 Hm, I was about to suggest that you can set statement_timeout before
 doing COMMIT to limit the amount of time you want to wait for the
 standby to respond. Interestingly, however, that doesn't seem to work,
 which is weird, since AFAICS statement_timeout simply generates a
 query cancel request after the timeout has elapsed, and cancelling
 the COMMIT with Ctrl-C in psql *does* work.

I think that'd be a pretty bad API since you won't know whether the
commit failed or succeeded but replication timed out. There very well
might have been longrunning constraint triggers or such taking a long
time.
So it really would need a separate GUC.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Standalone synchronous master

2014-01-11 Thread Mark Kirkwood


On 11/01/14 13:25, Stephen Frost wrote:

Adrian,


* Adrian Klaver (adrian.kla...@gmail.com) wrote:

A) Change the existing sync mode to allow the master and standby
to fall out of sync should a standby fall over.


I'm not sure that anyone is arguing for this..


B) Create a new mode that does this without changing the existing sync mode.

My two cents would be to implement B. Sync to me is a contract that
master and standby are in sync at any point in time. Anything else
should be called something else. Then it is up to the documentation
to clearly point out the benefits/pitfalls. If you want to implement
something as important as replication without reading the docs then
the results are on you.


The issue is that there are folks who are arguing, essentially, that
B is worthless, wrong, and no one should want it and therefore we
shouldn't have it.



We have some people who clearly do want it (and seemed to have provided 
sensible arguments about why it might be worthwhile), and the others who 
say they should not.


My 2c is:

The current behavior in CAP theorem speak is 'Cap' - i.e. focused on
consistency at the expense of availability. A reasonable thing to want.


The other behavior being asked for is 'cAp' - i.e. focused on
availability. Also a reasonable configuration to want. Now the desire to 
use sync rather than async is to achieve as much consistency as 
possible, which is also reasonable.


I think an option to control whether we operate 'Cap' or 'cAp' 
(defaulting to the current 'Cap' I guess) is probably the best solution.


Regards

Mark




Re: [HACKERS] Standalone synchronous master

2014-01-11 Thread Tom Lane

Mark Kirkwood mark.kirkw...@catalyst.net.nz writes [slightly rearranged]
 My 2c is:

 The current behavior in CAP theorem speak is 'Cap' - i.e. focused on
 consistency at the expense of availability. A reasonable thing to want.

 The other behavior being asked for is 'cAp' - i.e. focused on
 availability. Also a reasonable configuration to want.

 I think an option to control whether we operate 'Cap' or 'cAp' 
 (defaulting to the current 'Cap' I guess) is probably the best solution.

The above is all perfectly reasonable.  The argument that's not been made
to my satisfaction is that the proposed patch is a good implementation of
'cAp'-optimized behavior.  In particular,

 ... Now the desire to 
 use sync rather than async is to achieve as much consistency as 
 possible, which is also reasonable.

I don't think that the existing sync mode is designed to do that, and
simply lobotomizing it as proposed doesn't get you there.  I think we
need a replication mode that's been designed *from the ground up*
with cAp priorities in mind.  There may end up being only a few actual
differences in behavior --- but I fear that some of those differences
will be crucial.

regards, tom lane



Re: [HACKERS] Standalone synchronous master

2014-01-11 Thread Josh Berkus

On 01/10/2014 06:27 PM, Bruce Momjian wrote:
 How would that work?  Would it be a tool in contrib?  There already is a
 timeout, so if a tool checked more frequently than the timeout, it
 should work.  The durable notification of the admin would happen in the
 tool, right?

Well, you know what tool *I'm* planning to use.

Thing is, when we talk about auto-degrade, we need to determine things
like "Is the replica down or is this just a network blip?" and take
action according to the user's desired configuration.  This is not
something, realistically, that we can do on a single request.  Whereas
it would be fairly simple for an external monitoring utility to do:

1. decide replica is offline for the duration (several poll attempts
have failed)

2. Send ALTER SYSTEM SET to the master and change/disable the
synch_replicas.

Such a tool would *also* be capable of detecting when the synchronous
replica was back up and operating, and switch back to sync mode,
something we simply can't do inside Postgres.  And it would be a lot
easier to configure an external tool with monitoring system integration
so that it can alert the DBA to degradation in a way which the DBA was
liable to actually see (which is NOT the Postgres log).
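
A minimal sketch of such an external watchdog, in Python, might look like the following. Everything named here (SyncWatchdog, the three callbacks, the standby name db02, the threshold of three failed polls) is hypothetical. In practice set_sync_standbys would presumably issue ALTER SYSTEM SET for synchronous_standby_names and then reload the configuration, and standby_is_usable would have to mean "up and caught up" before sync mode is restored.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SyncWatchdog:
    """Hypothetical external monitor; none of these names exist in Postgres."""
    standby_is_usable: Callable[[], bool]      # one poll: standby up and caught up?
    set_sync_standbys: Callable[[str], None]   # e.g. ALTER SYSTEM SET ... plus reload
    alert: Callable[[str], None]               # notify the DBA somewhere visible
    sync_names: str = "db02"                   # assumed standby name
    max_failures: int = 3                      # assumed meaning of "several polls"
    failures: int = 0
    degraded: bool = False

    def poll(self) -> None:
        if self.standby_is_usable():
            self.failures = 0
            if self.degraded:
                # Standby is back: restore synchronous replication.
                self.set_sync_standbys(self.sync_names)
                self.degraded = False
                self.alert("sync replication restored")
            return
        self.failures += 1
        if not self.degraded and self.failures >= self.max_failures:
            # More than a blip: tell the DBA first, then drop to async.
            self.alert("standby down, degrading master to async")
            self.set_sync_standbys("")
            self.degraded = True
```

Calling poll() on a timer gives the "is it down or just a blip" decision: a single failed poll changes nothing, and only max_failures consecutive failures trigger the switch.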

In other words, if we're going to have auto-degrade, the most
intelligent place for it is in
RepMgr/HandyRep/OmniPITR/pgPoolII/whatever.  It's also the *easiest*
place.  Anything we do *inside* Postgres is going to have a really,
really hard time determining when to degrade.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Standalone synchronous master

2014-01-11 Thread Amit Kapila

On Sun, Jan 12, 2014 at 8:48 AM, Josh Berkus j...@agliodbs.com wrote:
 On 01/10/2014 06:27 PM, Bruce Momjian wrote:
 How would that work?  Would it be a tool in contrib?  There already is a
 timeout, so if a tool checked more frequently than the timeout, it
 should work.  The durable notification of the admin would happen in the
 tool, right?

 Well, you know what tool *I'm* planning to use.

 Thing is, when we talk about auto-degrade, we need to determine things
 like "Is the replica down or is this just a network blip?" and take
 action according to the user's desired configuration.  This is not
 something, realistically, that we can do on a single request.  Whereas
 it would be fairly simple for an external monitoring utility to do:

 1. decide replica is offline for the duration (several poll attempts
 have failed)

 2. Send ALTER SYSTEM SET to the master and change/disable the
 synch_replicas.

   Will this be possible with the current mechanism, given that at present
   the master will not accept any new command when the sync replica is
   unavailable? Or is there something else that needs to be done along
   with the above two points to make it possible?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] Standalone synchronous master

2014-01-11 Thread Bruce Momjian

On Sat, Jan 11, 2014 at 07:18:02PM -0800, Josh Berkus wrote:
 In other words, if we're going to have auto-degrade, the most
 intelligent place for it is in
 RepMgr/HandyRep/OmniPITR/pgPoolII/whatever.  It's also the *easiest*
 place.  Anything we do *inside* Postgres is going to have a really,
 really hard time determining when to degrade.

Well, one goal I was considering is that if a commit is hung waiting for
slave sync confirmation, and the timeout happens, then the mode is
changed to degraded and the commit returns success.  I am not sure how
you would do that in an external tool, meaning there is going to be a
period where commits fail, unless you think there is a way that when the
external tool changes the mode to degrade that all hung commits
complete.  That would be nice.
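
The behavior wished for here, where switching to degraded mode releases every commit that is already waiting, can be pictured with a condition variable. This is only an illustrative Python sketch; CommitGate and its methods are made-up names and say nothing about how Postgres actually queues waiting backends.

```python
import threading
from typing import Optional


class CommitGate:
    """Hypothetical model: commits wait for a standby ack or for degraded mode."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._acked_lsn = 0        # highest position confirmed by the standby
        self._degraded = False     # set by the admin or an external tool

    def wait_for_commit(self, lsn: int, timeout: Optional[float] = None) -> bool:
        """Return True once the commit may report success, False on timeout."""
        with self._cond:
            return self._cond.wait_for(
                lambda: self._degraded or self._acked_lsn >= lsn, timeout
            )

    def standby_ack(self, lsn: int) -> None:
        with self._cond:
            self._acked_lsn = max(self._acked_lsn, lsn)
            self._cond.notify_all()

    def degrade(self) -> None:
        # Changing the mode wakes every hung commit at once.
        with self._cond:
            self._degraded = True
            self._cond.notify_all()
```

In this model a tool that calls degrade() does not cause any commit to fail; waiters simply return as soon as the mode changes.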

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +



Re: [HACKERS] Standalone synchronous master

2014-01-11 Thread Amit Kapila

On Sat, Jan 11, 2014 at 9:41 PM, Bruce Momjian br...@momjian.us wrote:
 On Sat, Jan 11, 2014 at 01:29:23PM +0530, Amit Kapila wrote:
 Okay, this is one way of providing this new mode, others could be:

 a.
 Have just one GUC sync_standalone_mode = true|false and make
 this as PGC_POSTMASTER parameter, so that user is only
 allowed to set this mode at startup. Even if we don't want it as
 Postmaster parameter, we can mention to users that they can
 change this parameter only before server reaches current situation.
 I understand that without any alarm or some other way, it is difficult
 for user to know and change it, but I think in that case he should
 set it before server startup.

 b.
 On above lines, instead of boolean parameter, provide a parameter
 similar to current one such as available_synchronous_standby_names,
 setting of this should follow what I said in point a. The benefit in this
 as compare to 'a' is that it appears to be more like what we currently have.

 I think if we try to solve this problem by providing a way so that user
 can change it at runtime or when the problem actually occurred, it can
 make the UI more complex and difficult for us to provide a way so that
 user can be alerted on such situation. We can keep our options open
 so that if tomorrow, we can find any reasonable way, then we can
 provide it to user a mechanism for changing this at runtime, but I don't
 think it is stopping us from providing a way with which user can get the
 benefit of this mode by providing start time parameter.

 I am not sure how this would work.  Right now we wait for one of the
 synchronous_standby_names servers to verify the writes.   We need some
 way of telling the system how long to wait before continuing in degraded
 mode.  Without a timeout and admin notification, it doesn't seem much
 better than our async mode, which is what many people were complaining
 about.

It is better than async mode in that async mode never waits for commits
to be written to a standby, whereas this new mode will do so unless it is
not possible (all sync standbys have gone down).
Can't we use the existing wal_sender_timeout? Or, if the user expects a
different timeout for this new mode because he wants the master to wait
longer before it starts operating like a standalone sync master, we can
provide a new parameter.

With this the definition of new mode is to provide maximum
availability.

We can define the behavior in this new mode as:
a. It will operate like the current synchronous master as long as one of
the standbys mentioned in available_synchronous_standby_names is available.
b. If none is available, it will start operating like the current async
master: if any async standby is configured, it will send WAL to that
standby asynchronously; otherwise it will operate as a standalone master.
c. We can even provide a new (non-persistent) parameter, replication_mode,
which tells the user that the master has switched its mode; this can be
made available through a view. The value of the parameter is updated when
the server switches to the new mode.
d. When one of the standbys mentioned in
available_synchronous_standby_names comes back and is able to resolve
all WAL differences, the master will switch back to sync mode, where it
writes to that standby before the commit finishes. After the switch, it
will update the replication_mode parameter.

Now I think that, with the above definition and behavior, the master can
switch to the new mode and can report that to the user through a view.
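
The four points can be read as a small state machine. The Python below is only a sketch of that reading: replication_mode is the parameter proposed in point c, while ReplicationMode, choose_mode and MasterState are names invented for illustration.

```python
from enum import Enum


class ReplicationMode(Enum):
    SYNC = "sync"
    ASYNC = "async"
    STANDALONE = "standalone"


def choose_mode(caught_up_sync_standbys: int, async_standbys: int) -> ReplicationMode:
    """Points a, b and d: the mode follows from which standbys are usable."""
    if caught_up_sync_standbys > 0:
        return ReplicationMode.SYNC          # a/d: wait for a sync standby
    if async_standbys > 0:
        return ReplicationMode.ASYNC         # b: ship WAL without waiting
    return ReplicationMode.STANDALONE        # b: nobody to ship WAL to


class MasterState:
    """Point c: a non-persistent value a view could expose."""

    def __init__(self) -> None:
        self.replication_mode = ReplicationMode.SYNC
        self.switches = []                   # history of (old, new) mode changes

    def update(self, caught_up_sync_standbys: int, async_standbys: int) -> ReplicationMode:
        new_mode = choose_mode(caught_up_sync_standbys, async_standbys)
        if new_mode is not self.replication_mode:
            self.switches.append((self.replication_mode, new_mode))
            self.replication_mode = new_mode
        return self.replication_mode
```

The hard part of point d is hidden in the first argument: deciding that a returning standby has really caught up is exactly what this sketch leaves to the caller.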

In the above behaviour, the tricky part would be point 'd', where the master
has to switch back to sync mode when one of the sync standbys becomes
available, but I think we can work out a design for that if you are positive
about the above definition and behaviour as defined by the 4 points.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Bruce Momjian

On Fri, Jan 10, 2014 at 10:21:42AM +0530, Amit Kapila wrote:
 On Thu, Jan 9, 2014 at 10:45 PM, Bruce Momjian br...@momjian.us wrote:
 
  I think RAID-1 is a very good comparison because it is successful
  technology and has similar issues.
 
  RAID-1 is like Postgres synchronous_standby_names mode in the sense that
  the RAID-1 controller will not return success until writes have happened
  on both mirrors, but it is unlike synchronous_standby_names in that it
  will degrade and continue writes even when it can't write to both
  mirrors.  What is being discussed is to allow the RAID-1 behavior in
  Postgres.
 
  One issue that came up in discussions is the insufficiency of writing a
  degrade notice in a server log file because the log file isn't durable
  from server failures, meaning you don't know if a fail-over to the slave
  lost commits.  The degrade message has to be stored durably against a
  server failure, e.g. on a pager, probably using a command like we do for
  archive_command, and has to return success before the server continues
  in degrade mode.  I assume degraded RAID-1 controllers inform
  administrators in the same way.
 
 Here I think if user is aware from beginning that this is the behaviour,
 then may be the importance of message is not very high.
 What I want to say is that if we provide a UI in such a way that user
 decides during setup of server the behavior that is required by him.
 
 For example, if we provide a new parameter
 available_synchronous_standby_names along with current parameter
 and ask user to use this new parameter, if he wishes to synchronously
 commit transactions on another server when it is available, else it will
 operate as a standalone sync master.

I know there was a desire to remove this TODO item, but I think we have
brought up enough new issues that we can keep it to see if we can come
up with a solution.  I have added a link to this discussion on the TODO
item.

I think we will need at least four new GUC variables:

*  timeout control for degraded mode
*  command to run during switch to degraded mode
*  command to run during switch from degraded mode
*  read-only variable to report degraded mode
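
As a rough illustration of how these four settings could interact, here is a Python sketch. None of the names are real GUCs; degrade_timeout_s, on_degrade_command and on_restore_command are placeholders, and run_command stands in for whatever would execute the command (in the style of archive_command). The sketch follows the requirement quoted above that the degrade notification must return success before the server continues in degraded mode.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DegradeSettings:
    degrade_timeout_s: float     # hypothetical: timeout control for degraded mode
    on_degrade_command: str      # hypothetical: run during switch to degraded mode
    on_restore_command: str      # hypothetical: run during switch from degraded mode


class MasterSketch:
    def __init__(self, settings: DegradeSettings, run_command: Callable[[str], int]) -> None:
        self.settings = settings
        self.run_command = run_command       # returns an exit status, 0 = success
        self._degraded = False

    @property
    def degraded(self) -> bool:
        """The read-only variable reporting degraded mode."""
        return self._degraded

    def standby_silent_for(self, seconds: float) -> None:
        if self._degraded or seconds < self.settings.degrade_timeout_s:
            return
        # Degrade only after the notification has been delivered successfully.
        if self.run_command(self.settings.on_degrade_command) == 0:
            self._degraded = True

    def standby_caught_up(self) -> None:
        if self._degraded:
            self.run_command(self.settings.on_restore_command)
            self._degraded = False
```

If the degrade command fails, the master in this sketch keeps waiting, which is the conservative choice; whether that is the right policy is one of the open questions in this thread.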

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Simon Riggs

On 10 January 2014 15:47, Bruce Momjian br...@momjian.us wrote:

 I know there was a desire to remove this TODO item, but I think we have
 brought up enough new issues that we can keep it to see if we can come
 up with a solution.

Can you summarise what you think the new issues are? All I see is some
further rehashing of old discussions.

There is already a solution to the problem because the docs are
already very clear that you need multiple standbys to achieve commit
guarantees AND high availability. RTFM is usually used as some form of
put down, but that is what needs to happen here.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Hannu Krosing

On 01/10/2014 05:09 PM, Simon Riggs wrote:
 On 10 January 2014 15:47, Bruce Momjian br...@momjian.us wrote:

 I know there was a desire to remove this TODO item, but I think we have
 brought up enough new issues that we can keep it to see if we can come
 up with a solution.
 Can you summarise what you think the new issues are? All I see is some
 further rehashing of old discussions.

 There is already a solution to the problem because the docs are
 already very clear that you need multiple standbys to achieve commit
 guarantees AND high availability. RTFM is usually used as some form of
 put down, but that is what needs to happen here.

If we want to get the guarantees that often come up in sync rep
discussions - namely that you can assume that your change is applied
on standby when commit returns - then we could implement this by
returning LSN from commit at protocol level and having an option in
queries on standby to wait for this LSN (again passed on wire below
the level of query)  to be applied.

This can be mostly hidden in drivers and would need very little effort
from the end user. Basically, you tell the driver that one connection
is bound as the slave of another, and the driver can manage using the
right LSNs. That is, the last LSN received from the master is always
attached to queries on slaves.
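
A driver-side sketch of that bookkeeping, in Python, could look like this. It assumes the LSN travels in the usual textual form of two hexadecimal numbers separated by a slash; BoundConnections and its methods are hypothetical names, and no such protocol-level feature exists in the drivers discussed here.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN such as '0/16B3748' into a comparable integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


class BoundConnections:
    """Hypothetical pairing of one master connection with one standby connection."""

    def __init__(self) -> None:
        self.last_master_lsn = 0

    def on_master_commit(self, commit_lsn: str) -> None:
        # Remember the newest LSN the master returned from a commit.
        self.last_master_lsn = max(self.last_master_lsn, parse_lsn(commit_lsn))

    def standby_may_answer(self, standby_replay_lsn: str) -> bool:
        # The standby should run the query only once it has applied that LSN.
        return parse_lsn(standby_replay_lsn) >= self.last_master_lsn
```

Under this scheme the standby, not the master's commit, does the waiting, so commits never block on a missing standby.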

Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ




Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Joshua D. Drake



On 01/10/2014 07:47 AM, Bruce Momjian wrote:


I know there was a desire to remove this TODO item, but I think we have
brought up enough new issues that we can keep it to see if we can come
up with a solution.  I have added a link to this discussion on the TODO
item.

I think we will need at least four new GUC variables:

*  timeout control for degraded mode
*  command to run during switch to degraded mode
*  command to run during switch from degraded mode
*  read-only variable to report degraded mode



I know I am the one that instigated all of this so I want to be very
clear on what I, and I am confident my customers, would expect.


If a synchronous slave goes down, the master continues to operate. That
is all. I don't care if it is configurable (I would be fine with that).
I don't care if it is not automatic (e.g., slave goes down and we have to
tell the master to continue).


I have read through this thread more than once, and I have also gone
back to the docs. I understand why we do it the way we do it. I also
understand that, as a business requirement for 99% of CMD's customers,
it's wrong. At least in the sense of providing continuity of service.


Sincerely,

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
In a time of universal deceit - telling the truth is a revolutionary 
act., George Orwell




Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Jim Nasby


On 1/10/14, 12:59 PM, Joshua D. Drake wrote:

I know I am the one that instigated all of this so I want to be very clear on
what I, and I am confident my customers, would expect.

If a synchronous slave goes down, the master continues to operate. That is all.
I don't care if it is configurable (I would be fine with that). I don't care if
it is not automatic (e.g., slave goes down and we have to tell the master to
continue).

I have read through this thread more than once, and I have also gone back to
the docs. I understand why we do it the way we do it. I also understand that,
as a business requirement for 99% of CMD's customers, it's wrong. At least in
the sense of providing continuity of service.


+1

I understand that this is a degradation of full-on sync rep. But sync rep
that can automatically (or at least easily) degrade adds definite value
over async; it protects you from single failures. I fully understand
that it will not protect you from a double failure. That's OK in many cases.

-- 
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Andres Freund

On 2014-01-10 10:59:23 -0800, Joshua D. Drake wrote:
 
 On 01/10/2014 07:47 AM, Bruce Momjian wrote:
 
 I know there was a desire to remove this TODO item, but I think we have
 brought up enough new issues that we can keep it to see if we can come
 up with a solution.  I have added a link to this discussion on the TODO
 item.
 
 I think we will need at least four new GUC variables:
 
 *  timeout control for degraded mode
 *  command to run during switch to degraded mode
 *  command to run during switch from degraded mode
 *  read-only variable to report degraded mode
 
 
 I know I am the one that instigated all of this so I want to be very clear
 on what I, and I am confident my customers, would expect.
 
 If a synchronous slave goes down, the master continues to operate. That is
 all. I don't care if it is configurable (I would be fine with that). I don't
 care if it is not automatic (e.g; slave goes down and we have to tell the
 master to continue).

Would you please explain, as precisely as possible, what the advantages of
using a synchronous standby would be in such a scenario?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Stephen Frost

* Andres Freund (and...@2ndquadrant.com) wrote:
 On 2014-01-10 10:59:23 -0800, Joshua D. Drake wrote:
  If a synchronous slave goes down, the master continues to operate. That is
  all. I don't care if it is configurable (I would be fine with that). I don't
  care if it is not automatic (e.g; slave goes down and we have to tell the
  master to continue).
 
 Would you please explain, as precise as possible, what the advantages of
 using a synchronous standby would be in such a scenario?

In a degraded/failure state, things continue to *work*.  In a
non-degraded/failure state, you're able to handle a system failure and
know that you didn't lose any transactions.

Tom's point is correct, that you will fail on the "have two copies of
everything" in this mode, but that could certainly be acceptable in the
case where there is a system failure.  As pointed out by someone
previously, that's how RAID-1 works (which I imagine quite a few of us
use).

I've been thinking about this a fair bit and I've come to like the RAID1
analogy.  Stinks that we can't keep things going (automatically) if
either side fails, but perhaps we will one day...

Thanks,

Stephen



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Andres Freund

On 2014-01-10 17:02:08 -0500, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  On 2014-01-10 10:59:23 -0800, Joshua D. Drake wrote:
   If a synchronous slave goes down, the master continues to operate. That is
   all. I don't care if it is configurable (I would be fine with that). I don't
   care if it is not automatic (e.g; slave goes down and we have to tell the
   master to continue).
  
  Would you please explain, as precise as possible, what the advantages of
  using a synchronous standby would be in such a scenario?
 
 In a degraded/failure state, things continue to *work*.  In a
 non-degraded/failure state, you're able to handle a system failure and
 know that you didn't lose any transactions.

Why do you know that you didn't lose any transactions? Trivial network
hiccups, a restart of a standby, IO overload on the standby all can
cause very short interruptions in the walsender connection - leading
to degradation.

 As pointed out by someone
 previously, that's how RAID-1 works (which I imagine quite a few of us
 use).

I don't think that argument makes much sense. Raid-1 isn't safe
as-is. It's only safe if you use some sort of journaling or similar
on top. If you issued a write during a crash you normally will just get
either the version from before or the version after the last write back,
depending on the state on the individual disks and which disk is treated
as authoritative by the raid software.

And even if you disregard that, there's not much outside influence that
can lead to losing connection to a disk drive inside a raid other than an
actually broken drive. Any network connection is normally kept *outside*
the level at which you build raids.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Stephen Frost

Andres,

On Friday, January 10, 2014, Andres Freund wrote:

 On 2014-01-10 17:02:08 -0500, Stephen Frost wrote:
  * Andres Freund (and...@2ndquadrant.com) wrote:
   On 2014-01-10 10:59:23 -0800, Joshua D. Drake wrote:
    If a synchronous slave goes down, the master continues to operate.
    That is all. I don't care if it is configurable (I would be fine with
    that). I don't care if it is not automatic (e.g; slave goes down and
    we have to tell the master to continue).
  
   Would you please explain, as precisely as possible, what the advantages
   of using a synchronous standby would be in such a scenario?
 
  In a degraded/failure state, things continue to *work*.  In a
  non-degraded/failure state, you're able to handle a system failure and
  know that you didn't lose any transactions.

 Why do you know that you didn't lose any transactions? Trivial network
 hiccups, a restart of a standby, IO overload on the standby all can
 cause very short interruptions in the walsender connection - leading
 to degradation.


You know that you haven't *lost* any by virtue of the master still being
up. The case you describe is a double-failure scenario- the link between
the master and slave has to go away AND the master must accept a
transaction and then fail independently.


  As pointed out by someone
  previously, that's how RAID-1 works (which I imagine quite a few of us
  use).

 I don't think that argument makes much sense. Raid-1 isn't safe
 as-is. It's only safe if you use some sort of journaling or similar
 on top. If you issued a write during a crash you normally will just get
 either the version from before or the version after the last write back,
 depending on the state on the individual disks and which disk is treated
 as authoritative by the raid software.


Uh, you need a decent raid controller then and we're talking about after a
transaction commit/sync.

 And even if you disregard that, there's not much outside influence that
 can lead to losing connection to a disk drive inside a raid outside an
 actually broken drive. Any network connection is normally kept *outside*
 the level at which you build raids.


This is a fair point and perhaps we should have the timeout or jitter GUC
which was proposed elsewhere, but the notion that this configuration is
completely unreasonable is not accurate and therefore having it would be a
benefit overall.

Thanks,

Stephen

Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Joshua D. Drake



On 01/10/2014 01:49 PM, Andres Freund wrote:


I know I am the one that instigated all of this so I want to be very clear
on what I, and I am confident my customers, would expect.

If a synchronous slave goes down, the master continues to operate. That is
all. I don't care if it is configurable (I would be fine with that). I don't
care if it is not automatic (e.g; slave goes down and we have to tell the
master to continue).


Would you please explain, as precise as possible, what the advantages of
using a synchronous standby would be in such a scenario?


Current behavior:

db01-sync-db02

Transactions are happening. Everything is happy. Website is up. Orders 
are being made.


db02 goes down. It doesn't matter why. It is down. Because it is down, 
db01 for all intents and purposes is also down because we are using sync 
replication. We have just lost continuity of service, we can no longer 
accept orders, we can no longer allow people to log into the website, we 
can no longer service accounts.


In short, we are out of business.

Proposed behavior:

db01-sync-db02

Transactions are happening. Everything is happy. Website is up. Orders 
are being made.


db02 goes down. It doesn't matter why. It is down. db01 continues to 
accept orders, allow people to log into the website and we can still 
service accounts. The continuity of service continues.


Yes, there are all kinds of things that need to be considered when that 
happens, that isn't the point. The point is, PostgreSQL continues its 
uptime guarantee and allows the business to continue to function as (if) 
nothing has happened.


For many and I dare say the majority of businesses, this is enough. They 
know that if the slave goes down they can continue to operate. They know 
if the master goes down they can fail over. They know that while both 
are up they are using sync rep (with various caveats). They are happy. 
They like that it is simple and just works. They continue to use PostgreSQL.



Sincerely,

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
In a time of universal deceit - telling the truth is a revolutionary 
act., George Orwell




Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Andres Freund

On 2014-01-10 14:29:58 -0800, Joshua D. Drake wrote:
 db02 goes down. It doesn't matter why. It is down. db01 continues to accept
 orders, allow people to log into the website and we can still service
 accounts. The continuity of service continues.

Why is that configuration advantageous over an async configuration is the
question. Why, with those requirements, are you using a synchronous
standby at all?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Adrian Klaver


On 01/10/2014 02:33 PM, Andres Freund wrote:

On 2014-01-10 14:29:58 -0800, Joshua D. Drake wrote:

db02 goes down. It doesn't matter why. It is down. db01 continues to accept
orders, allow people to log into the website and we can still service
accounts. The continuity of service continues.


Why is that configuration advantageous over a async configuration is the
question. Why, with those requirements, are you using a synchronous
standby at all?


+1



Greetings,

Andres Freund




--
Adrian Klaver
adrian.kla...@gmail.com



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Joshua D. Drake



On 01/10/2014 02:33 PM, Andres Freund wrote:


On 2014-01-10 14:29:58 -0800, Joshua D. Drake wrote:

db02 goes down. It doesn't matter why. It is down. db01 continues to accept
orders, allow people to log into the website and we can still service
accounts. The continuity of service continues.


Why is that configuration advantageous over a async configuration is the
question. Why, with those requirements, are you using a synchronous
standby at all?


If the master goes down, I can fail over knowing that as many of my 
transactions as possible have been replicated.


JD




--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
In a time of universal deceit - telling the truth is a revolutionary 
act., George Orwell




Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Andres Freund

Hi,

On 2014-01-10 17:28:55 -0500, Stephen Frost wrote:
  Why do you know that you didn't lose any transactions? Trivial network
  hiccups, a restart of a standby, IO overload on the standby all can
  cause very short interruptions in the walsender connection - leading
  to degradation.

 You know that you haven't *lost* any by virtue of the master still being
 up. The case you describe is a double-failure scenario- the link between
 the master and slave has to go away AND the master must accept a
 transaction and then fail independently.

Unfortunately network outages do correlate with other system
faults. What you're wishing for really is the "I like the world to be
friendly to me" mode.
Even if you have only disk problems, quite often if your disks die, you
can continue to write (especially with a BBU), but uncached reads
fail. So the walsender connection errors out because a read failed, and
you're degrading into async mode. *Because* your primary is about to die.

   As pointed out by someone
   previously, that's how RAID-1 works (which I imagine quite a few of us
   use).
 
  I don't think that argument makes much sense. Raid-1 isn't safe
  as-is. It's only safe if you use some sort of journaling or similar
  on top. If you issued a write during a crash you normally will just get
  either the version from before or the version after the last write back,
  depending on the state on the individual disks and which disk is treated
  as authoritative by the raid software.

 Uh, you need a decent raid controller then and we're talking about after a
 transaction commit/sync.

Yes, if you have a BBU that memory is authoritative in most cases. But
in that case the argument of having two disks is pretty much pointless,
the SPOF suddenly became the battery + ram.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Jeff Janes

On Fri, Jan 10, 2014 at 2:33 PM, Andres Freund and...@2ndquadrant.com wrote:

 On 2014-01-10 14:29:58 -0800, Joshua D. Drake wrote:
  db02 goes down. It doesn't matter why. It is down. db01 continues to accept
  orders, allow people to log into the website and we can still service
  accounts. The continuity of service continues.

 Why is that configuration advantageous over an async configuration is the
 question.


Because it is orders of magnitude less likely to lose transactions that
were reported to have been committed.  A permanent failure of the master is
almost guaranteed to lose transactions with async.  With auto-degrade, a
permanent failure of the master only loses reported-committed transactions
if it co-occurs with a temporary failure of the replica or the network,
lasting longer than the time out period.
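
To make the "orders of magnitude" claim concrete, here is a back-of-the-envelope calculation in Python. The numbers are purely illustrative assumptions, not measurements, and the model treats a master failure and a replica or network outage as independent events, which is precisely the assumption Andres disputes elsewhere in this thread.

```python
def p_loss_async(p_master_fails: float, p_lag_at_failure: float = 1.0) -> float:
    """Async: a permanent master failure almost always finds unreplicated commits."""
    return p_master_fails * p_lag_at_failure


def p_loss_auto_degrade(p_master_fails: float, fraction_of_time_degraded: float) -> float:
    """Auto-degrade: loss needs the master to die while the system is degraded."""
    return p_master_fails * fraction_of_time_degraded


p_master = 0.01              # assumed chance of a permanent master failure per year
degraded_fraction = 0.001    # assumed share of time spent degraded past the timeout

ratio = p_loss_async(p_master) / p_loss_auto_degrade(p_master, degraded_fraction)
print(f"async is roughly {ratio:.0f}x more likely to lose a reported commit")
```

With these assumed inputs the gap is about three orders of magnitude; correlated failures would shrink it, since a dying master makes time spent in degraded mode more likely.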


 Why, with those requirements, are you using a synchronous
 standby at all?


They aren't using synchronous standby, they are using asynchronous standby
because we fail to provide the choice they prefer, which is a compromise
between the two.

Cheers,

Jeff

Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Andres Freund

On 2014-01-10 14:44:28 -0800, Joshua D. Drake wrote:
 
 On 01/10/2014 02:33 PM, Andres Freund wrote:
 
 On 2014-01-10 14:29:58 -0800, Joshua D. Drake wrote:
 db02 goes down. It doesn't matter why. It is down. db01 continues to accept
 orders, allow people to log into the website and we can still service
 accounts. The continuity of service continues.
 
 Why is that configuration advantageous over an async configuration is the
 question. Why, with those requirements, are you using a synchronous
 standby at all?
 
 If the master goes down, I can fail over knowing that as many of my
 transactions as possible have been replicated.

It's not like async replication mode delays sending data to the standby
in any way.

Really, the commits themselves are sent to the server at exactly the
same speed independent of sync/async. The only thing that's delayed is
the *notification* of the client that sent the commit. Not the commit
itself.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Stephen Frost

Greetings,

On Friday, January 10, 2014, Andres Freund wrote:

 Hi,

 On 2014-01-10 17:28:55 -0500, Stephen Frost wrote:
   Why do you know that you didn't lose any transactions? Trivial network
   hiccups, a restart of a standby, IO overload on the standby all can
   cause very short interruptions in the walsender connection - leading
   to degradation.

  You know that you haven't *lost* any by virtue of the master still being
  up. The case you describe is a double-failure scenario- the link between
  the master and slave has to go away AND the master must accept a
  transaction and then fail independently.

 Unfortunately network outages do correlate with other system
 faults. What you're wishing for really is the "I like the world to be
 friendly to me" mode.
 Even if you have only disk problems, quite often if your disks die, you
 can continue to write (especially with a BBU), but uncached reads
 fail. So the walsender connection errors out because a read failed, and
 you're degrading into async mode. *Because* your primary is about to die.


That can happen, sure, but I don't agree that the "single drive with a
BBU" or "two drives in a RAID-1 die at the same time" cases are
reasonable arguments against this option. Not to mention that, today, if
the master has an issue then we're SOL anyway. Also, if the network fails
then likely there aren't any new transactions happening.


As pointed out by someone previously, that's how RAID-1 works (which I
imagine quite a few of us use).

   I don't think that argument makes much sense. Raid-1 isn't safe
   as-is. It's only safe if you use some sort of journaling or similar
   on top. If you issued a write during a crash you normally will just get
   either the version from before or the version after the last write
   back, depending on the state on the individual disks and which disk is
   treated as authoritative by the raid software.

  Uh, you need a decent raid controller then and we're talking about after
  a transaction commit/sync.

 Yes, if you have a BBU that memory is authoritative in most cases. But
 in that case the argument of having two disks is pretty much pointless,
 the SPOF suddenly became the battery + ram.


If that is a concern then use multiple controllers. Certainly not unheard
of- look at SANs...

Thanks,

Stephen

Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Joshua D. Drake



On 01/10/2014 02:47 PM, Andres Freund wrote:


Really, the commits themselves are sent to the server at exactly the
same speed independent of sync/async. The only thing that's delayed is
the *notification* of the client that sent the commit. Not the commit
itself.


Which is irrelevant to the point that if the standby goes down, we are 
now out of business.


Any continuous replication should not be a SPOF. The current behavior 
guarantees that a two node sync cluster is a SPOF. The proposed behavior 
removes that.


Sincerely,

Joshua D. Drake



--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
In a time of universal deceit - telling the truth is a revolutionary 
act., George Orwell




Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Joshua D. Drake



On 01/10/2014 02:57 PM, Stephen Frost wrote:


Yes, if you have a BBU that memory is authoritative in most cases. But
in that case the argument of having two disks is pretty much pointless,
the SPOF suddenly became the battery + ram.


If that is a concern then use multiple controllers. Certainly not
unheard of- look at SANs...



And in PostgreSQL we obviously have the option of having a third or 
fourth standby but that isn't the problem we are trying to solve.


JD



--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
In a time of universal deceit - telling the truth is a revolutionary 
act., George Orwell




Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Hannu Krosing

On 01/10/2014 11:59 PM, Joshua D. Drake wrote:

 On 01/10/2014 02:57 PM, Stephen Frost wrote:

 Yes, if you have a BBU that memory is authoritative in most cases. But
 in that case the argument of having two disks is pretty much pointless,
 the SPOF suddenly became the battery + ram.


 If that is a concern then use multiple controllers. Certainly not
 unheard of- look at SANs...


 And in PostgreSQL we obviously have the option of having a third or
 fourth standby but that isn't the problem we are trying to solve.
The problem you are trying to solve is a controller with enough
Battery Backed Cache RAM to cache the entire database but with
write-through mode.

And you want it to degrade to write-back in case of disk failure so that
you can continue while the disk is broken.

People here are telling you that it would not be safe; use at least RAID-1
if you want availability.

Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ




Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Josh Berkus

On 01/10/2014 02:59 PM, Joshua D. Drake wrote:
 
 On 01/10/2014 02:47 PM, Andres Freund wrote:
 
 Really, the commits themselves are sent to the server at exactly the
 same speed independent of sync/async. The only thing that's delayed is
 the *notification* of the client that sent the commit. Not the commit
 itself.
 
 Which is irrelevant to the point that if the standby goes down, we are
 now out of business.
 
 Any continuous replication should not be a SPOF. The current behavior
 guarantees that a two node sync cluster is a SPOF. The proposed behavior
 removes that.

Again, if that's your goal, then use async replication.

I really don't understand the use-case here.

The purpose of sync rep is to know determinatively whether or not you
have lost data when disaster strikes.  If knowing for certain isn't
important to you, then use async.

BTW, people are using RAID1 as an analogy to 2-node sync replication.
That's a very bad analogy, because in RAID1 you have a *single*
controller which is capable of determining if the disks are in a failed
state or not, and this is all happening on a single node where things
like network outages aren't a consideration.  It's really not the same
situation at all.

Also, frankly, I absolutely can't count the number of times I've had to
rescue a customer or family member who had RAID1 but wasn't monitoring
syslog, and so one of their disks had been down for months without them
knowing it.  Heck, I've done this myself.

So ... the Filesystem geeks have already been through this.  Filesystem
clustering started out with systems like DRBD, which includes an
auto-degrade option.  However, DRBD with auto-degrade is widely
considered untrustworthy and is a significant portion of why DRBD isn't
trusted today.

From here, clustered filesystems went in two directions: RHCS added
layers of monitoring and management to make auto-degrade a safer option
than it is with DRBD (and still not the default option).  Scalable
clustered filesystems added N(M) quorum commit in order to support more
than 2 nodes.  Either of these courses is reasonable for us to pursue.

What's a bad idea is adding an auto-degrade option without any tools to
manage and monitor it, which is what this patch does by my reading.  If
I'm wrong, then someone can point it out to me.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Josh Berkus

On 01/10/2014 01:49 PM, Andres Freund wrote:
 On 2014-01-10 10:59:23 -0800, Joshua D. Drake wrote:

 On 01/10/2014 07:47 AM, Bruce Momjian wrote:

 I know there was a desire to remove this TODO item, but I think we have
 brought up enough new issues that we can keep it to see if we can come
 up with a solution.  I have added a link to this discussion on the TODO
 item.

 I think we will need at least four new GUC variables:

 *  timeout control for degraded mode
 *  command to run during switch to degraded mode
 *  command to run during switch from degraded mode
 *  read-only variable to report degraded mode

I would argue that we don't need the first.  We just want a command to
switch synchronous/degraded, and a variable (or function) to report on
degraded mode.  If we have those things, then it becomes completely
possible to have an external monitoring framework, which is capable of
answering questions like "is the replica down or just slow?", control
degrade.

Oh, wait!  We DO have such a command.  It's called ALTER SYSTEM SET!
Recently committed.  So this is really a solvable issue if one is
willing to use an external utility.
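
The external-utility approach described above could look roughly like the following sketch (editorial illustration, not part of the original message). It only decides which statements to run on the master; connecting to the database, checking the standby (for example through the pg_stat_replication view) and notifying the administrator are left out. The function name, its parameters and the standby name db02 are illustrative assumptions, not an existing tool.

```python
from typing import List


def plan_statements(degraded: bool, standby_healthy: bool,
                    sync_names: str) -> List[str]:
    """Return the SQL an external monitor would run on the master.

    degraded        -- True if synchronous_standby_names is currently empty
    standby_healthy -- the monitor's own verdict (connected and caught up)
    sync_names      -- the value to restore when leaving degraded mode
    """
    if not degraded and not standby_healthy:
        # Degrade: stop waiting for any synchronous standby.
        return ["ALTER SYSTEM SET synchronous_standby_names = ''",
                "SELECT pg_reload_conf()"]
    if degraded and standby_healthy:
        # Restore: go back to waiting for the named standby.
        quoted = sync_names.replace("'", "''")
        return [f"ALTER SYSTEM SET synchronous_standby_names = '{quoted}'",
                "SELECT pg_reload_conf()"]
    return []  # nothing to change


for stmt in plan_statements(degraded=False, standby_healthy=False,
                            sync_names="db02"):
    print(stmt)
```

ALTER SYSTEM only records the new value, so the reload is what makes it take effect. How the monitor judges "healthy" is the hard part and is deliberately left to the caller here.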

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Joshua D. Drake



On 01/10/2014 03:17 PM, Josh Berkus wrote:


Any continuous replication should not be a SPOF. The current behavior
guarantees that a two node sync cluster is a SPOF. The proposed behavior
removes that.


Again, if that's your goal, then use async replication.


I think I have gone about this the wrong way. Async does not meet the 
technical or business requirements that I have. Sync does except that it 
increases the possibility of an outage. That is the requirement I am 
trying to address.




The purpose of sync rep is to know determinatively whether or not you
have lost data when disaster strikes.  If knowing for certain isn't
important to you, then use async.


PostgreSQL Sync replication increases the possibility of an outage. That 
is incorrect behavior.


I want sync because on the chance that the master goes down, I have as 
much data as possible to fail over to. However, I can't use sync because 
it increases the possibility that my business will not be able to 
function on the chance that the standby goes down.




What's a bad idea is adding an auto-degrade option without any tools to
manage and monitor it, which is what this patch does by my reading.  If


This we absolutely agree on.

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
In a time of universal deceit - telling the truth is a revolutionary 
act., George Orwell




Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Adrian Klaver


On 01/10/2014 03:38 PM, Joshua D. Drake wrote:


On 01/10/2014 03:17 PM, Josh Berkus wrote:


Any continuous replication should not be a SPOF. The current behavior
guarantees that a two node sync cluster is a SPOF. The proposed behavior
removes that.


Again, if that's your goal, then use async replication.


I think I have gone about this the wrong way. Async does not meet the
technical or business requirements that I have. Sync does except that it
increases the possibility of an outage. That is the requirement I am
trying to address.



The purpose of sync rep is to know determinatively whether or not you
have lost data when disaster strikes.  If knowing for certain isn't
important to you, then use async.


PostgreSQL Sync replication increases the possibility of an outage. That
is incorrect behavior.

I want sync because on the chance that the master goes down, I have as
much data as possible to fail over to. However, I can't use sync because
it increases the possibility that my business will not be able to
function on the chance that the standby goes down.



What's a bad idea is adding an auto-degrade option without any tools to
manage and monitor it, which is what this patch does by my reading.  If


This we absolutely agree on.



As I see it the state of replication in Postgres is as follows.

1) Async. Runs at the speed of the master as it does not have to wait on 
the standby to signal a successful commit. There is some degree of 
offset between master and standby(s) due to latency.


2) Sync. Runs at the speed of the standby + latency between master and 
standby. This is counterbalanced by knowledge that the master and 
standby are in the same state. As Josh Berkus pointed out there is a 
loophole in this when multiple standbys are involved.


The topic under discussion is an intermediate mode between 1 and 2. 
There seems to be a consensus that this is not unreasonable.


The issue seems to be how to achieve this with ideas falling into 
roughly two camps.


A) Change the existing sync mode to allow the master and standby to fall 
out of sync should a standby fall over.


B) Create a new mode that does this without changing the existing sync mode.


My two cents would be to implement B. Sync to me is a contract that 
master and standby are in sync at any point in time. Anything else 
should be called something else. Then it is up to the documentation to 
clearly point out the benefits/pitfalls. If you want to implement 
something as important as replication without reading the docs then the 
results are on you.




JD





--
Adrian Klaver
adrian.kla...@gmail.com



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Stephen Frost

Adrian,


* Adrian Klaver (adrian.kla...@gmail.com) wrote:
 A) Change the existing sync mode to allow the master and standby
 fall out of sync should a standby fall over.

I'm not sure that anyone is arguing for this..

 B) Create a new mode that does this without changing the existing sync mode.
 
 My two cents would be to implement B. Sync to me is a contract that
 master and standby are in sync at any point in time. Anything else
 should be called something else. Then it is up to the documentation
 to clearly point out the benefits/pitfalls. If you want to implement
 something as important as replication without reading the docs then
 the results are on you.

The issue is that there are folks who are arguing, essentially, that
B is worthless, wrong, and no one should want it and therefore we
shouldn't have it.

Thanks,

Stephen



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Adrian Klaver


On 01/10/2014 04:25 PM, Stephen Frost wrote:

Adrian,


* Adrian Klaver (adrian.kla...@gmail.com) wrote:

A) Change the existing sync mode to allow the master and standby
fall out of sync should a standby fall over.


I'm not sure that anyone is arguing for this..


Looks like here, unless I am really missing the point:

http://www.postgresql.org/message-id/52d07466.6070...@commandprompt.com

Proposed behavior:

db01-sync-db02

Transactions are happening. Everything is happy. Website is up. Orders 
are being made.


db02 goes down. It doesn't matter why. It is down. db01 continues to 
accept orders, allow people to log into the website and we can still 
service accounts. The continuity of service continues.


Yes, there are all kinds of things that need to be considered when that 
happens, that isn't the point. The point is, PostgreSQL continues its 
uptime guarantee and allows the business to continue to function as (if) 
nothing has happened.


For many and I dare say the majority of businesses, this is enough. They 
know that if the slave goes down they can continue to operate. They know 
if the master goes down they can fail over. They know that while both 
are up they are using sync rep (with various caveats). They are happy. 
They like that it is simple and just works. They continue to use 
PostgreSQL. 





B) Create a new mode that does this without changing the existing sync mode.

My two cents would be to implement B. Sync to me is a contract that
master and standby are in sync at any point in time. Anything else
should be called something else. Then it is up to the documentation
to clearly point out the benefits/pitfalls. If you want to implement
something as important as replication without reading the docs then
the results are on you.


The issue is that there are folks who are arguing, essentially, that
B is worthless, wrong, and no one should want it and therefore we
shouldn't have it.


Well you will not please everyone, just displease the least.



Thanks,

Stephen




--
Adrian Klaver
adrian.kla...@gmail.com



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Stephen Frost

Adrian,

* Adrian Klaver (adrian.kla...@gmail.com) wrote:
 On 01/10/2014 04:25 PM, Stephen Frost wrote:
 * Adrian Klaver (adrian.kla...@gmail.com) wrote:
 A) Change the existing sync mode to allow the master and standby
 fall out of sync should a standby fall over.
 
 I'm not sure that anyone is arguing for this..
 
 Looks like here, unless I am really missing the point:

Elsewhere in the thread, JD agreed that having it as an independent
option was fine.

 Well you will not please everyone, just displease the least.

Well, sure, but we do generally try to reach consensus. :)

Thanks,

Stephen



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Joshua D. Drake



On 01/10/2014 04:38 PM, Stephen Frost wrote:

Adrian,

* Adrian Klaver (adrian.kla...@gmail.com) wrote:

On 01/10/2014 04:25 PM, Stephen Frost wrote:

* Adrian Klaver (adrian.kla...@gmail.com) wrote:

A) Change the existing sync mode to allow the master and standby
fall out of sync should a standby fall over.


I'm not sure that anyone is arguing for this..


Looks like here, unless I am really missing the point:


Elsewhere in the thread, JD agreed that having it as an independent
option was fine.


Yes. I am fine with an independent option.

JD



--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
In a time of universal deceit - telling the truth is a revolutionary 
act., George Orwell




Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Jim Nasby


On 1/10/14, 6:19 PM, Adrian Klaver wrote:

1) Async. Runs at the speed of the master as it does not have to wait on the 
standby to signal a successful commit. There is some degree of offset between 
master and standby(s) due to latency.

2) Sync. Runs at the speed of the standby + latency between master and standby. 
This is counterbalanced by knowledge that the master and standby are in the 
same state. As Josh Berkus pointed out there is a loophole in this when 
multiple standbys are involved.

The topic under discussion is an intermediate mode between 1 and 2. There seems 
to be a consensus that this is not unreasonable.


That's not what's actually under debate; allow me to restate as option 3:

3) Sync. Everything you said, plus: If for ANY reason the master can not talk to 
the slave it becomes read-only.

That's the current state.

What many people want is something along the lines of what you said in 2: The 
slave ALWAYS has everything the master does (at least on disk) unless the 
connection between master and slave fails.

The reason people want this is it protects you against a *single* fault. If 
just the master blows up, you have a 100% reliable slave. If the connection (or 
the slave itself) blows up, the master is still working.

I agree that there's a non-obvious gotcha here: in the case of a master failure 
you might also have experienced a connection failure, and without some kind of 
3rd party involved you have no way to know that.

We should make best efforts to make that gotcha as clear to users as we can. 
But just because some users will blindly ignore that doesn't mean we flat-out 
shouldn't support those that will understand the gotcha and accept its 
limitations.

BTW, if ALTER SYSTEM SET actually does make it possible to implement automated 
failover without directly adding it to Postgres then I think a good compromise 
would be to have an external project that does just that and have the docs 
reference that project and explain why we haven't built it in.
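
The behaviors discussed above can be summarized in a small model (editorial sketch, not part of the original message). The mode names are labels for this sketch only, not PostgreSQL settings, and "guaranteed on standby" refers to the moment the client is told the commit succeeded.

```python
from enum import Enum
from typing import Tuple


class Mode(Enum):
    ASYNC = "async"
    SYNC = "sync (current behavior)"
    SYNC_AUTO_DEGRADE = "sync with auto-degrade (proposed)"


def commit_outcome(mode: Mode, standby_reachable: bool) -> Tuple[bool, bool]:
    """Return (client_is_acknowledged, guaranteed_on_standby)."""
    if mode is Mode.ASYNC:
        # Never waits for the standby, so nothing is guaranteed there.
        return True, False
    if mode is Mode.SYNC:
        # Waits for the standby; with no standby the commit never returns.
        return standby_reachable, standby_reachable
    # Auto-degrade: behaves like SYNC while the standby is reachable,
    # like ASYNC once it is not.
    return True, standby_reachable


for mode in Mode:
    for reachable in (True, False):
        print(mode.value, reachable, commit_outcome(mode, reachable))
```

The gotcha described above is the last case: with auto-degrade and an unreachable standby, the client is acknowledged without the standby guarantee, and without a third party nothing tells you afterwards that this happened.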
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Adrian Klaver


On 01/10/2014 04:48 PM, Joshua D. Drake wrote:


On 01/10/2014 04:38 PM, Stephen Frost wrote:

Adrian,

* Adrian Klaver (adrian.kla...@gmail.com) wrote:

On 01/10/2014 04:25 PM, Stephen Frost wrote:

* Adrian Klaver (adrian.kla...@gmail.com) wrote:

A) Change the existing sync mode to allow the master and standby
fall out of sync should a standby fall over.


I'm not sure that anyone is arguing for this..


Looks like here, unless I am really missing the point:


Elsewhere in the thread, JD agreed that having it as an independent
option was fine.


Yes. I am fine with an independent option.


I missed that. What confused me and seems to be generally confusing is 
the overloading of the term sync:


Proposed behavior:

db01-sync-db02 

In my mind if that is an independent option it should have a different 
name. I propose Schrödinger:)




JD






--
Adrian Klaver
adrian.kla...@gmail.com



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Peter Eisentraut

On Wed, 2014-01-08 at 17:56 -0500, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  That's why you should configure a second standby as another (candidate)
  synchronous replica, also listed in synchronous_standby_names.
 
 Perhaps we should stress in the docs that this is, in fact, the *only*
 reasonable mode in which to run with sync rep on?  Where there are
 multiple replicas, because otherwise Drake is correct that you'll just
 end up having both nodes go offline if the slave fails.

It's not unreasonable to run with only two if the writers are consuming
from a reliable message queue (or another system that maintains its own
reliable persistence).  Then you can just continue processing messages
after you have repaired your replication pair.
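
For reference, the quoted advice about listing a second candidate in synchronous_standby_names relies on priority order: the first listed standby that is connected acts as the synchronous one, and a later entry takes over if it disappears. A simplified model of that selection (editorial sketch; it ignores the '*' wildcard and quoted names, and db03 is a hypothetical second standby):

```python
from typing import Optional, Set


def choose_sync_standby(synchronous_standby_names: str,
                        connected: Set[str]) -> Optional[str]:
    """Return the first listed standby that is currently connected."""
    for name in synchronous_standby_names.split(","):
        name = name.strip()
        if name and name in connected:
            return name
    return None  # nobody to wait for: synchronous commits would block


names = "db02, db03"  # db03 is a hypothetical second candidate
print(choose_sync_standby(names, {"db02", "db03"}))
print(choose_sync_standby(names, {"db03"}))
print(choose_sync_standby(names, set()))
```

With only one name listed, losing that standby leaves no candidate at all, which is the two-node situation this thread is arguing about.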





Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Bruce Momjian

On Fri, Jan 10, 2014 at 03:17:34PM -0800, Josh Berkus wrote:
 The purpose of sync rep is to know determinatively whether or not you
 have lost data when disaster strikes.  If knowing for certain isn't
 important to you, then use async.
 
 BTW, people are using RAID1 as an analogy to 2-node sync replication.
 That's a very bad analogy, because in RAID1 you have a *single*
 controller which is capable of determining if the disks are in a failed
 state or not, and this is all happening on a single node where things
 like network outages aren't a consideration.  It's really not the same
 situation at all.
 
 Also, frankly, I absolutely can't count the number of times I've had to
 rescue a customer or family member who had RAID1 but wasn't monitoring
 syslog, and so one of their disks had been down for months without them
 knowing it.  Heck, I've done this myself.
 
 So ... the Filesystem geeks have already been through this.  Filesystem
 clustering started out with systems like DRBD, which includes an
 auto-degrade option.  However, DRBD with auto-degrade is widely
 considered untrustworthy and is a significant portion of why DRBD isn't
 trusted today.
 
 From here, clustered filesystems went in two directions: RHCS added
 layers of monitoring and management to make auto-degrade a safer option
 than it is with DRBD (and still not the default option).  Scalable
 clustered filesystems added N(M) quorum commit in order to support more
 than 2 nodes.  Either of these courses is reasonable for us to pursue.
 
 What's a bad idea is adding an auto-degrade option without any tools to
 manage and monitor it, which is what this patch does by my reading.  If
 I'm wrong, then someone can point it out to me.

Yes, my big take-away from the discussion is that informing the admin in
a durable way is a requirement for this degraded mode.  You are right
that many ignore RAID degradation warnings, but with the warnings
heeded, degraded functionality can be useful.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +



Re: [HACKERS] Standalone synchronous master

2014-01-10 Thread Bruce Momjian

On Fri, Jan 10, 2014 at 03:27:10PM -0800, Josh Berkus wrote:
 On 01/10/2014 01:49 PM, Andres Freund wrote:
  On 2014-01-10 10:59:23 -0800, Joshua D. Drake wrote:
 
  On 01/10/2014 07:47 AM, Bruce Momjian wrote:
 
  I know there was a desire to remove this TODO item, but I think we have
  brought up enough new issues that we can keep it to see if we can come
  up with a solution.  I have added a link to this discussion on the TODO
  item.
 
  I think we will need at least four new GUC variables:
 
  *  timeout control for degraded mode
  *  command to run during switch to degraded mode
  *  command to run during switch from degraded mode
  *  read-only variable to report degraded mode
 
 I would argue that we don't need the first.  We just want a command to
 switch synchronous/degraded, and a variable (or function) to report on
 degraded mode.  If we have those things, then it becomes completely
 possible to have an external monitoring framework, which is capable of
 answering questions like "is the replica down or just slow?", control
 degrade.
 
 Oh, wait!  We DO have such a command.  It's called ALTER SYSTEM SET!
 Recently committed.  So this is really a solvable issue if one is
 willing to use an external utility.

How would that work?  Would it be a tool in contrib?  There already is a
timeout, so if a tool checked more frequently than the timeout, it
should work.  The durable notification of the admin would happen in the
tool, right?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +



Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread MauMau


From: Andres Freund and...@2ndquadrant.com

On 2014-01-08 14:42:37 -0800, Joshua D. Drake wrote:

If we have the following:

db0-db1:down

Using the model (as I understand it) that is being discussed we have
increased our failure rate because the moment db1:down we also lose db0.
The node db0 may be up but if it isn't going to process transactions it is
useless. I can tell you that I have exactly 0 customers that would want
that model because a single node failure would cause a double node failure.


That's why you should configure a second standby as another (candidate)
synchronous replica, also listed in synchronous_standby_names.


Let me ask a (probably) stupid question.  How is the sync rep different from 
RAID-1?


When I first saw sync rep, I expected that it would provide the same 
guarantees as RAID-1 in terms of durability (data is always mirrored on two 
servers) and availability (if one server goes down, another server continues 
full service).


The cost is reasonable with RAID-1.  The sync rep requires high cost to get 
both durability and availability --- three servers.


Am I expecting too much?


Regards
MauMau




Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Hannu Krosing

On 01/09/2014 05:09 AM, Robert Treat wrote:
 On Wed, Jan 8, 2014 at 6:15 PM, Josh Berkus j...@agliodbs.com wrote:
 Stephen,


 I'm aware, my point was simply that we should state, up-front in
 25.2.7.3 *and* where we document synchronous_standby_names, that it
 requires at least three servers to be involved to be a workable
 solution.
 It's a workable solution with 2 servers.  That's a low-availability,
 high-integrity solution; the user has chosen to double their risk of
 not accepting writes against never losing a write.  That's a perfectly
 valid configuration, and I believe that NTT runs several applications
 this way.

 In fact, that can already be looked at as a kind of auto-degrade mode:
 if there aren't two nodes, then the database goes read-only.

 Might I also point out that transactions are synchronous or not
 individually?  The sensible configuration is for only the important
 writes being synchronous -- in which case auto-degrade makes even less
 sense.

 I really think that demand for auto-degrade is coming from users who
 don't know what sync rep is for in the first place.  The fact that other
 vendors are offering auto-degrade as a feature instead of the ginormous
 foot-gun it is adds to the confusion, but we can't help that.

 I think the problem here is that we tend to have a limited view of
 the right way to use synch rep. If I have 5 nodes, and I set 1
 synchronous and the other 3 asynchronous, I've set up a known
 successor in the event that the leader fails. 
But there is no guarantee that the synchronous replica actually
is ahead of async ones.

 In this scenario
 though, if the successor fails, you actually probably want to keep
 accepting writes; since you weren't using synchronous for durability
 but for operational simplicity. I suspect there are probably other
 scenarios where users are willing to trade latency for improved and/or
 directed durability but not at the extent of availability, don't you?

Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ




Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Hannu Krosing

On 01/09/2014 12:05 AM, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
 On 2014-01-08 17:56:37 -0500, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
 That's why you should configure a second standby as another (candidate)
 synchronous replica, also listed in synchronous_standby_names.
 Perhaps we should stress in the docs that this is, in fact, the *only*
 reasonable mode in which to run with sync rep on?  Where there are
 multiple replicas, because otherwise Drake is correct that you'll just
 end up having both nodes go offline if the slave fails.
 Which, as it happens, is actually documented.
 I'm aware, my point was simply that we should state, up-front in
 25.2.7.3 *and* where we document synchronous_standby_names, that it
 requires at least three servers to be involved to be a workable
 solution.

 Perhaps we should even log a warning if only one value is found in
 synchronous_standby_names...
You can have only one name in synchronous_standby_names and
still have multiple slaves connecting with that name.

Also, I can attest that I have had clients who want exactly that - a system
stop until admin intervention in case of a designated sync standby failing.

And they actually run more than one standby, they just want to make
sure that sync rep to 2nd data center always happens.
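
To make the "one name, several servers" point concrete, here is a minimal Python sketch of how the synchronous standby could be chosen. It assumes the documented rule that the first name in synchronous_standby_names with a connected, streaming standby wins. The function pick_sync_standby, the tuple layout and the host names are hypothetical illustrations, not PostgreSQL internals.

```python
from typing import Optional, Sequence, Tuple

# Each connected standby is modelled as (application_name, host).
# The tuple layout and all names below are illustrative assumptions.
Standby = Tuple[str, str]


def pick_sync_standby(sync_names: Sequence[str],
                      connected: Sequence[Standby]) -> Optional[Standby]:
    """Return the standby that commits would wait for, or None.

    Names are tried in priority order; the first connected standby whose
    application_name matches wins. None means commits would block.
    """
    for name in sync_names:
        for standby in connected:
            if standby[0] == name:
                return standby
    return None


# One configured name, two servers in the second data center sharing it.
names = ["dc2"]
both_up = [("dc2", "dc2-a"), ("dc2", "dc2-b"), ("local", "dc1-b")]
one_down = [("dc2", "dc2-b"), ("local", "dc1-b")]
none_up = [("local", "dc1-b")]

print(pick_sync_standby(names, both_up))
print(pick_sync_standby(names, one_down))
print(pick_sync_standby(names, none_up))
```

In this model, losing one of the two "dc2" servers leaves commits flowing to the other, while losing both makes the master wait even though a local standby is still attached, which is the behaviour those clients asked for.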


Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ




Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Hannu Krosing

On 01/08/2014 11:49 PM, Tom Lane wrote:
 Joshua D. Drake j...@commandprompt.com writes:
 On 01/08/2014 01:55 PM, Tom Lane wrote:
 Sync mode is about providing a guarantee that the data exists on more than
 one server *before* we tell the client it's committed.  If you don't need
 that guarantee, you shouldn't be using sync mode.  If you do need it,
 it's not clear to me why you'd suddenly not need it the moment the going
 actually gets tough.
 As I understand it what is being suggested is that if a subscriber or 
 target goes down, then the master will just sit there and wait. When I 
 read that, I read that the master will no longer process write 
 transactions. If I am wrong in that understanding then cool. If I am not 
 then that is a serious problem with a production scenario. There is an 
 expectation that a master will continue to function if the target is 
 down, synchronous or not.
 Then you don't understand the point of sync mode, and you shouldn't be
 using it.  The point is *exactly* to refuse to commit transactions unless
 we can guarantee the data's been replicated.
For a single-host scenario this would be similar to asking for
a mode which turns fsync=off in case of disk failure :)


Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ




Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Hannu Krosing

On 01/09/2014 01:57 PM, MauMau wrote:
 From: Andres Freund and...@2ndquadrant.com
 On 2014-01-08 14:42:37 -0800, Joshua D. Drake wrote:
 If we have the following:

 db0-db1:down

 Using the model (as I understand it) that is being discussed we have
 increased our failure rate because the moment db1:down we also lose
 db0. The
 node db0 may be up but if it isn't going to process transactions it is
 useless. I can tell you that I have exactly 0 customers that would
 want that
 model because a single node failure would cause a double node failure.

 That's why you should configure a second standby as another (candidate)
 synchronous replica, also listed in synchronous_standby_names.

 Let me ask a (probably) stupid question.  How is the sync rep
 different from RAID-1?

 When I first saw sync rep, I expected that it would provide the same
 guarantees as RAID-1 in terms of durability (data is always mirrored
 on two servers) and availability (if one server goes down, another
 server continues full service).
What you describe is most like A-sync rep.

Sync rep makes sure that data is always replicated before confirming to
writer.


Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ




Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Hannu Krosing

On 01/09/2014 02:01 AM, Jim Nasby wrote:
 On 1/8/14, 6:05 PM, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 On 01/08/2014 03:27 PM, Tom Lane wrote:
 What we lack, and should work on, is a way for sync mode to have
 M larger
 than one.  AFAICS, right now we'll report commit as soon as
 there's one
 up-to-date replica, and some high-reliability cases are going to
 want
 more.
 Sync N times is really just a guarantee against data loss as long as
 you lose N-1 servers or fewer.  And it becomes an even
 lower-availability solution if you don't have at least N+1 replicas.
 For that reason, I'd like to see some realistic actual user demand
 before we take the idea seriously.
 Sure.  I wasn't volunteering to implement it, just saying that what
 we've got now is not designed to guarantee data survival across failure
 of more than one server.  Changing things around the margins isn't
 going to improve such scenarios very much.

 It struck me after re-reading your example scenario that the most
 likely way to figure out what you had left would be to see if some
 additional system (think Nagios monitor, or monitors) had records
 of when the various database servers went down.  This might be
 what you were getting at when you said logging, but the key point
 is it has to be logging done on an external server that could survive
 failure of the database server.  postmaster.log ain't gonna do it.

 Yeah, and I think that the logging command that was suggested allows
 for that *if configured correctly*.
*But* for relying on this, we would also need to make logging
*synchronous*,
which would probably not go down well with many people, as it makes things
even more fragile from availability viewpoint (and slower as well).

Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ




Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread MauMau


From: Hannu Krosing ha...@2ndquadrant.com

On 01/09/2014 01:57 PM, MauMau wrote:

Let me ask a (probably) stupid question.  How is the sync rep
different from RAID-1?

When I first saw sync rep, I expected that it would provide the same
guarantees as RAID-1 in terms of durability (data is always mirrored
on two servers) and availability (if one server goes down, another
server continues full service).

What you describe is most like A-sync rep.

Sync rep makes sure that data is always replicated before confirming to
writer.


Really?  RAID-1 is a-sync?

Regards
MauMau





Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Hannu Krosing

On 01/09/2014 04:15 PM, MauMau wrote:
 From: Hannu Krosing ha...@2ndquadrant.com
 On 01/09/2014 01:57 PM, MauMau wrote:
 Let me ask a (probably) stupid question.  How is the sync rep
 different from RAID-1?

 When I first saw sync rep, I expected that it would provide the same
 guarantees as RAID-1 in terms of durability (data is always mirrored
 on two servers) and availability (if one server goes down, another
 server continues full service).
 What you describe is most like A-sync rep.

 Sync rep makes sure that data is always replicated before confirming to
 writer.

 Really?  RAID-1 is a-sync?
Not exactly, as there is no master just controller writing to two
equal disks.

But having a degraded mode makes it
more like async - it continues even with single disk and syncs later if
and when the 2nd disk comes back.

Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ




Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Bruce Momjian

On Thu, Jan  9, 2014 at 04:55:22PM +0100, Hannu Krosing wrote:
 On 01/09/2014 04:15 PM, MauMau wrote:
  From: Hannu Krosing ha...@2ndquadrant.com
  On 01/09/2014 01:57 PM, MauMau wrote:
  Let me ask a (probably) stupid question.  How is the sync rep
  different from RAID-1?
 
  When I first saw sync rep, I expected that it would provide the same
  guarantees as RAID-1 in terms of durability (data is always mirrored
  on two servers) and availability (if one server goes down, another
  server continues full service).
  What you describe is most like A-sync rep.
 
  Sync rep makes sure that data is always replicated before confirming to
  writer.
 
  Really?  RAID-1 is a-sync?
 Not exactly, as there is no master just controller writing to two
 equal disks.
 
 But having a degraded mode makes it
 more like async - it continues even with single disk and syncs later if
 and when the 2nd disk comes back.

I think RAID-1 is a very good comparison because it is successful
technology and has similar issues.

RAID-1 is like Postgres synchronous_standby_names mode in the sense that
the RAID-1 controller will not return success until writes have happened
on both mirrors, but it is unlike synchronous_standby_names in that it
will degrade and continue writes even when it can't write to both
mirrors.  What is being discussed is to allow the RAID-1 behavior in
Postgres.

One issue that came up in discussions is the insufficiency of writing a
degrade notice in a server log file because the log file isn't durable
from server failures, meaning you don't know if a fail-over to the slave
lost commits.  The degrade message has to be stored durably against a
server failure, e.g. on a pager, probably using a command like we do for
archive_command, and has to return success before the server continues
in degrade mode.  I assume degraded RAID-1 controllers inform
administrators in the same way.
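
As a rough illustration of "has to return success before the server continues in degrade mode", here is a minimal Python sketch of such a gate. It is not taken from the patch: the function may_degrade and its timeout parameter are hypothetical, and the only assumption borrowed from archive_command is that an exit status of zero means success.

```python
import subprocess
import sys
from typing import Sequence


def may_degrade(notify_cmd: Sequence[str], timeout_s: float = 30.0) -> bool:
    """Run the notification command; allow degrade only if it exits with 0.

    A missing command, a timeout, or a non-zero exit status all mean the
    notice was not stored durably, so the caller must keep waiting in
    synchronous mode.
    """
    try:
        result = subprocess.run(list(notify_cmd), timeout=timeout_s)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


# Stand-ins for a real pager command: one that succeeds, one that fails.
ok_cmd = [sys.executable, "-c", "raise SystemExit(0)"]
bad_cmd = [sys.executable, "-c", "raise SystemExit(1)"]

print(may_degrade(ok_cmd))
print(may_degrade(bad_cmd))
```

The design choice here is to fail closed: any doubt about whether the administrator was told keeps the server in synchronous mode.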

I think RAID-1 controllers operate successfully with this behavior
because they are seen as durable and authoritative in reporting the
status of mirrors, while with Postgres, there is no central authority
that can report the degrade status of master/slaves.

Another concern with degrade mode is that once Postgres enters degrade
mode, how does it get back to synchronous_standby_names mode?  We could
have each commit wait for the timeout before continuing, but that is
going to make degrade mode unusably slow.  Would there be an admin
command?  With a timeout to force degrade mode, a temporary network
outage could cause degrade mode, while our current behavior would
recover synchronous_standby_names mode once the network was repaired.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +



Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Jeff Janes

On Wed, Jan 8, 2014 at 3:00 PM, Josh Berkus j...@agliodbs.com wrote:

 On 01/08/2014 01:49 PM, Tom Lane wrote:
  Josh Berkus j...@agliodbs.com writes:
  If we really want auto-degrading sync rep, then we'd (at a minimum) need
  a way to determine *from the replica* whether or not it was in degraded
  mode when the master died.  What good do messages to the master log do
  you if the master no longer exists?
 
  How would it be possible for a replica to know whether the master had
  committed more transactions while communication was lost, if the master
  dies without ever restoring communication?  It sounds like pie in the
  sky from here ...

 Oh, right.  Because the main reason for a sync replica degrading is that
 it's down.  In which case it isn't going to record anything.  This would
 still be useful for sync rep candidates, though, and I'll document why
 below.  But first, lemme demolish the case for auto-degrade.

 So here's the case that we can't possibly solve for auto-degrade.
 Anyone who wants auto-degrade needs to come up with a solution for this
 case as a first requirement:


It seems like the only deterministically useful thing to do is to send a
NOTICE to the *client* that the commit has succeeded, but in degraded mode,
so keep your receipts and have your lawyer's number handy.  Whether anyone
is willing to add code to the client to process that message is doubtful,
as well as whether the client will even ever receive it if we are in the
middle of a major disruption.

But I think there is a good probabilistic justification for an
auto-degrade mode.  (And really, what else is there?  There are never any
real guarantees of anything.  Maybe none of your replicas ever come back
up.  Maybe none of your customers do, either.)




 1. A data center network/power event starts.

 2. The sync replica goes down.

 3. A short time later, the master goes down.

 4. Data center power is restored.

 5. The master is fried and is a permanent loss.  The replica is ok, though.

 Question: how does the DBA know whether data has been lost or not?


What if he had a way of knowing that some data *has* been lost?  What can
he do about it?  What is the value in knowing it was lost after the fact,
but without the ability to do anything about it?

But let's say that instead of a permanent loss, the master can be brought
back up in a few days after replacing a few components, or in a few weeks
after sending the drives out to clean-room data recovery specialists.
 Writing has already failed over to the replica, because you couldn't wait
that long to bring things back up.

Once you get your old master back, you can see if transactions have been
lost.  If they have been, you can dump the tables out to a human-readable
format, use PITR to restore a copy of the replica to the point just before
the failover (although I'm not really sure exactly how to identify that
point) and dump that out, then use 'diff' tools to figure out what changes
to the database were lost.  Then consult with the application specialists to
figure out what the application was doing that led to those changes (if
that is not obvious), with business operations people to figure out how to
apply the analogous changes to the top of the database, and with a customer
service VP or someone to figure out how to retroactively fix transactions
that were done after the failover and would have been done differently had
the lost transactions not been lost.  Or instead of all that, you could look
at the recovered data and learn that in fact nothing had been lost, so
nothing further needs to be done.

If you were running in async replication mode on a busy server, there is a
virtual certainty that some transactions have been lost.  If you were
running in sync mode with possibility of auto-degrade, it is far from
certain.  That depends on how long the power event lasted, compared to how
long you had the timeout set to.

Or rather than a data-center-wide power spike, what if your master just
done fell over with no drama to the rest of the neighborhood? Inspection
after the fail-over to the replica shows the RAID controller card failed.
 There is no reason to think that a RAID controller, in the process of
failing, would have caused the replication to kick into degraded mode.  You
know from the surviving logs that the master spent 60 seconds total in
degraded mode over the last 3 months, so there is a 99.999% chance no
confirmed transactions were lost.  To be conservative, let's drop it to
99.99% because maybe some unknown mechanism did allow a failing RAID
controller to blip the network card without leaving any evidence behind.
That's a lot better than the chances of lost transactions while in async
replication mode, which could be 99.9% in the other direction.
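
The 99.999% figure can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes "3 months" means about 90 days and that the moment of failure is independent of, and uniformly spread over, the observation window. The function name degraded_fraction is made up for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def degraded_fraction(degraded_seconds: float, window_days: float) -> float:
    """Fraction of the window that the master spent in degraded mode."""
    return degraded_seconds / (window_days * SECONDS_PER_DAY)


# 60 seconds of degraded operation over roughly three months (assumed 90 days).
fraction = degraded_fraction(60, 90)
not_degraded = 1.0 - fraction

print(f"fraction of time degraded: {fraction:.2e}")
print(f"chance the failure hit a non-degraded moment: {not_degraded * 100:.4f}%")
```

Under those assumptions the degraded fraction is a little under one in a hundred thousand, which is where the "five nines" estimate comes from.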

Cheers,

Jeff

Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Bruce Momjian

On Thu, Jan  9, 2014 at 09:36:47AM -0800, Jeff Janes wrote:
 Oh, right.  Because the main reason for a sync replica degrading is that
 it's down.  In which case it isn't going to record anything.  This would
 still be useful for sync rep candidates, though, and I'll document why
 below.  But first, lemme demolish the case for auto-degrade.
 
 So here's the case that we can't possibly solve for auto-degrade.
 Anyone who wants auto-degrade needs to come up with a solution for this
 case as a first requirement:
 
 
 It seems like the only deterministically useful thing to do is to send a 
 NOTICE
 to the *client* that the commit has succeeded, but in degraded mode, so keep
 your receipts and have your lawyer's number handy.  Whether anyone is willing
 to add code to the client to process that message is doubtful, as well as
 whether the client will even ever receive it if we are in the middle of a 
 major
 disruption.

I don't think clients are the right place for notification.  Clients
running on a single server could have fsync=off set by the admin or
lying drives and never know it.  I can't imagine a client only willing to
run if synchronous_standby_names is set.

The synchronous slave is something the administrator has set up and is
responsible for, so the administrator should be notified.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +



Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Josh Berkus

Robert,

 I think the problem here is that we tend to have a limited view of
 the right way to use synch rep. If I have 5 nodes, and I set 1
 synchronous and the other 3 asynchronous, I've set up a known
 successor in the event that the leader fails. In this scenario
 though, if the successor fails, you actually probably want to keep
 accepting writes; since you weren't using synchronous for durability
 but for operational simplicity. I suspect there are probably other
 scenarios where users are willing to trade latency for improved and/or
 directed durability but not at the extent of availability, don't you?

That's a workaround for a completely different limitation though; the
inability to designate a specific async replica as first.  That is, if
there were some way to do so, you would be using that rather than sync
rep.  Extending the capabilities of that workaround is not something I
would gladly do until I had exhausted other options.

The other problem is that *many* users think they can get improved
availability, consistency AND durability on two nodes somehow, and to
heck with the CAP theorem (certain companies are happy to foster this
illusion).  Having a simple, easily accessible auto-degrade without
treating degrade as a major monitoring event will feed this
self-deception.  I know I already have to explain the difference between
synchronous and simultaneous to practically every one of my clients
for whom I set up replication.

Realistically, degrade shouldn't be something that happens inside a
single PostgreSQL node, either the master or the replica.  It should be
controlled by some external controller which is capable of deciding on
degrade or not based on a more complex set of circumstances (e.g. Is
the replica actually down or just slow?).  Certainly this is the case
with Cassandra, VoltDB, Riak, and the other serious multinode databases.
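
A minimal Python sketch of the kind of decision such an external controller might make is shown here. Everything in it is hypothetical: the Observation fields, the 60-second threshold and the decide function are illustrative assumptions, not an existing tool or PostgreSQL API.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    STAY_SYNC = "stay_sync"
    DEGRADE = "degrade"


@dataclass(frozen=True)
class Observation:
    standby_reachable: bool        # did the controller's own probe reach the standby?
    seconds_since_last_ack: float  # time since the master last saw a replication ack
    admins_notified: bool          # has a durable alert already been delivered?


def decide(obs: Observation, ack_timeout_s: float = 60.0) -> Action:
    """Degrade only when the standby looks down, not merely slow,
    and only after the administrators have been told."""
    if obs.standby_reachable:
        return Action.STAY_SYNC    # slow but alive: keep waiting
    if obs.seconds_since_last_ack < ack_timeout_s:
        return Action.STAY_SYNC    # possibly a short network blip
    if not obs.admins_notified:
        return Action.STAY_SYNC    # never degrade silently
    return Action.DEGRADE


print(decide(Observation(True, 300.0, True)))
print(decide(Observation(False, 10.0, True)))
print(decide(Observation(False, 300.0, False)))
print(decide(Observation(False, 300.0, True)))
```

The point of putting this outside the database is that the controller can combine evidence the master alone does not have, such as its own reachability probe.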

 This isn't to say there isn't a lot of confusion around the issue.
 Designing, implementing, and configuring different guarantees in the
 presence of node failures is a non-trivial problem. Still, I'd prefer
 to see Postgres head in the direction of providing more options in
 this area rather than drawing a firm line at being a CP-oriented
 system.

I'm not categorically opposed to having any form of auto-degrade at all;
what I'm opposed to is a patch which adds auto-degrade **without adding
any additional monitoring or management infrastructure at all**.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Simon Riggs

On 8 January 2014 21:40, Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kgri...@ymail.com writes:
 I'm torn on whether we should cave to popular demand on this; but
 if we do, we sure need to be very clear in the documentation about
 what a successful return from a commit request means.  Sooner or
 later, Murphy's Law being what it is, if we do this someone will
 lose the primary and blame us because the synchronous replica is
 missing gobs of transactions that were successfully committed.

 I'm for not caving.  I think people who are asking for this don't
 actually understand what they'd be getting.

Agreed.


Just to be clear, I made this mistake initially. Now I realise Heikki
was right and if you think about it long enough, you will too. If you
still disagree, think hard, read the archives until you do.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Jim Nasby


On 1/9/14, 9:01 AM, Hannu Krosing wrote:

Yeah, and I think that the logging command that was suggested allows
for that *if configured correctly*.

*But*  for relying on this, we would also need to make logging
*synchronous*,
which would probably not go down well with many people, as it makes things
even more fragile from availability viewpoint (and slower as well).


Not really... you only care about monitoring performance when the standby has 
gone AWOL *and* you haven't sent a notification yet. Once you've notified once 
you're done.

So in this case the master won't go down unless you have a double fault: 
standby goes down AND you can't get to your monitoring.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Amit Kapila

On Thu, Jan 9, 2014 at 10:45 PM, Bruce Momjian br...@momjian.us wrote:

 I think RAID-1 is a very good comparison because it is successful
 technology and has similar issues.

 RAID-1 is like Postgres synchronous_standby_names mode in the sense that
 the RAID-1 controller will not return success until writes have happened
 on both mirrors, but it is unlike synchronous_standby_names in that it
 will degrade and continue writes even when it can't write to both
 mirrors.  What is being discussed is to allow the RAID-1 behavior in
 Postgres.

 One issue that came up in discussions is the insufficiency of writing a
 degrade notice in a server log file because the log file isn't durable
 from server failures, meaning you don't know if a fail-over to the slave
 lost commits.  The degrade message has to be stored durably against a
 server failure, e.g. on a pager, probably using a command like we do for
 archive_command, and has to return success before the server continues
 in degrade mode.  I assume degraded RAID-1 controllers inform
 administrators in the same way.

Here I think that if the user is aware from the beginning that this is the
behaviour, then maybe the message is not that important.
What I mean is that we could provide the interface in such a way that the
user decides, while setting up the server, which behaviour he requires.

For example, we could provide a new parameter,
available_synchronous_standby_names, alongside the current parameter,
and ask the user to set the new one if he wishes to commit transactions
synchronously on another server when it is available, and otherwise
operate as a standalone sync master.


 I think RAID-1 controllers operate successfully with this behavior
 because they are seen as durable and authoritative in reporting the
 status of mirrors, while with Postgres, there is no central authority
 that can report that degrade status of master/slaves.

 Another concern with degrade mode is that once Postgres enters degrade
 mode, how does it get back to synchronous_standby_names mode?

   It will get back to the mode where it commits transactions to another
   server before the commit completes once the whole gap in WAL is resolved.
   I think in the new mode it will operate as if there is no
   synchronous_standby_names.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] Standalone synchronous master

2014-01-09 Thread Michael Paquier

On Fri, Jan 10, 2014 at 3:23 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 8 January 2014 21:40, Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kgri...@ymail.com writes:
 I'm torn on whether we should cave to popular demand on this; but
 if we do, we sure need to be very clear in the documentation about
 what a successful return from a commit request means.  Sooner or
 later, Murphy's Law being what it is, if we do this someone will
 lose the primary and blame us because the synchronous replica is
 missing gobs of transactions that were successfully committed.

 I'm for not caving.  I think people who are asking for this don't
 actually understand what they'd be getting.

 Agreed.


 Just to be clear, I made this mistake initially. Now I realise Heikki
 was right and if you think about it long enough, you will too. If you
 still disagree, think hard, read the archives until you do.
+1. I see far more potential in having an N-sync solution from the
usability viewpoint, and consistency with the existing mechanisms in
place. A synchronous apply mode would be nice as well.
-- 
Michael



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Heikki Linnakangas


On 11/13/2013 03:09 PM, Rajeev rastogi wrote:

This patch implements the following TODO item:

Add a new eager synchronous mode that starts out synchronous but reverts to 
asynchronous after a failure timeout period
This would require some type of command to be executed to alert administrators 
of this change.
http://archives.postgresql.org/pgsql-hackers/2011-12/msg01224.php

This patch implementation is in the same line as it was given in the earlier 
thread.
Some Of the additional important changes are:

1. Have added two GUC variables to take commands from the user to be executed:

   a. Master_to_standalone_cmd: To be executed before master switches to
      standalone mode.

   b. Master_to_sync_cmd: To be executed before master switches from
      standalone mode to sync mode.

2. Master mode switch will happen only if the corresponding command
   executed successfully.

3. Taken care of replication timeout to decide whether the synchronous
   standby has gone down, i.e. only after expiry of wal_sender_timeout
   will the master switch from sync mode to standalone mode.

Please provide your opinion or any other expectation out of this patch.


I'm going to say right off the bat that I think the whole notion to 
automatically disable synchronous replication when the standby goes down 
is completely bonkers. If you don't need the strong guarantee that your 
transaction is safe in at least two servers before it's acknowledged to 
the client, there's no point enabling synchronous replication in the 
first place. If you do need it, then you shouldn't fall back to a 
degraded mode, at least not automatically. It's an idea that keeps 
coming back, but I have not heard a convincing argument why it makes 
sense. It's been discussed many times before, most recently in that 
thread you linked to.


Now that I got that out of the way, I concur that some sort of hooks or 
commands that fire when a standby goes down or comes back up makes 
sense, for monitoring purposes. I don't much like this particular 
design. If you just want to write a log entry when all the standbys are 
disconnected, running a shell command seems like an awkward interface. 
It's OK for raising an alarm, but there are many other situations where 
you might want to raise alarms, so I'd rather have us implement some 
sort of a generic trap system, instead of adding this one particular 
extra config option. What do people usually use to monitor replication?


There are two things we're trying to solve here: raising an alarm when 
something interesting happens, and changing the configuration to 
temporarily disable synchronous replication. What would be a good API to 
disable synchronous replication? Editing the config file and SIGHUPing 
is not very nice. There's been talk of an ALTER command to change the 
config, but I'm not sure that's a very good API either. Perhaps expose 
the sync_master_in_standalone_mode variable you have in your patch to 
new SQL-callable functions. Something like:


pg_disable_synchronous_replication()
pg_enable_synchronous_replication()

I'm not sure where that state would be stored. Should it persist 
restarts? And you probably should get some sort of warnings in the log 
when synchronous replication is disabled.


In summary, more work is required to design a good 
user/admin/programming interface. Let's hear a solid proposal for that, 
before writing patches.


BTW, calling an external command with system(), while holding 
SyncRepLock in exclusive-mode, seems like a bad idea. For starters, 
holding a lock will prevent a new WAL sender from starting up and 
becoming a synchronous standby, and the external command might take a 
long time to return.


- Heikki



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Andres Freund

On 2014-01-08 11:07:48 +0200, Heikki Linnakangas wrote:
 I'm going to say right off the bat that I think the whole notion to
 automatically disable synchronous replication when the standby goes down is
 completely bonkers. If you don't need the strong guarantee that your
 transaction is safe in at least two servers before it's acknowledged to the
 client, there's no point enabling synchronous replication in the first
 place.

I think that's likely caused by the misconception that synchronous
replication is synchronous in apply, not just remote write/fsync. I have
now seen several sites that assumed that and set up sync rep for that
reason, so that they could query standbys instead of the primary after
the commit finished.
If that assumption were true, supporting a timeout that way would
possibly be helpful, but it is not at the moment...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Simon Riggs

On 8 January 2014 09:07, Heikki Linnakangas hlinnakan...@vmware.com wrote:

 I'm going to say right off the bat that I think the whole notion to
 automatically disable synchronous replication when the standby goes down is
 completely bonkers.

Agreed

We had this discussion across 3 months and we don't want it again.
This should not have been added as a TODO item.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Bruce Momjian

On Wed, Jan  8, 2014 at 05:39:23PM +, Simon Riggs wrote:
 On 8 January 2014 09:07, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 
  I'm going to say right off the bat that I think the whole notion to
  automatically disable synchronous replication when the standby goes down is
  completely bonkers.
 
 Agreed
 
 We had this discussion across 3 months and we don't want it again.
 This should not have been added as a TODO item.

I am glad Heikki and Simon agree, but I don't.  ;-)

The way that I understand it is that you might want durability, but
might not want to sacrifice availability.  Phrased that way, it makes
sense, and notifying the administrator seems the appropriate action.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Hans-Jürgen Schönig


On Jan 8, 2014, at 9:27 PM, Bruce Momjian wrote:

 On Wed, Jan  8, 2014 at 05:39:23PM +, Simon Riggs wrote:
 On 8 January 2014 09:07, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 
 I'm going to say right off the bat that I think the whole notion to
 automatically disable synchronous replication when the standby goes down is
 completely bonkers.
 
 Agreed
 
 We had this discussion across 3 months and we don't want it again.
 This should not have been added as a TODO item.
 
 I am glad Heikki and Simon agree, but I don't.  ;-)
 
 The way that I understand it is that you might want durability, but
 might not want to sacrifice availability.  Phrased that way, it makes
 sense, and notifying the administrator seems the appropriate action.
 

Technically and conceptually I agree with Andres and Simon, but from daily
experience I would say that we should make it configurable.
Some people have had nasty experiences when their systems stopped working.

+1 for a GUC to control this one.

many thanks,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de




Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Heikki Linnakangas


On 01/08/2014 10:27 PM, Bruce Momjian wrote:
 On Wed, Jan  8, 2014 at 05:39:23PM +, Simon Riggs wrote:
 On 8 January 2014 09:07, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 
 I'm going to say right off the bat that I think the whole notion to
 automatically disable synchronous replication when the standby goes down is
 completely bonkers.
 
 Agreed
 
 We had this discussion across 3 months and we don't want it again.
 This should not have been added as a TODO item.
 
 I am glad Heikki and Simon agree, but I don't.  ;-)
 
 The way that I understand it is that you might want durability, but
 might not want to sacrifice availability.  Phrased that way, it makes
 sense, and notifying the administrator seems the appropriate action.


They want to have the cake and eat it too. But they're not actually 
getting that. What they actually get is extra latency when things work, 
with no gain in durability.


- Heikki



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Bruce Momjian

On Wed, Jan  8, 2014 at 10:46:51PM +0200, Heikki Linnakangas wrote:
 On 01/08/2014 10:27 PM, Bruce Momjian wrote:
 On Wed, Jan  8, 2014 at 05:39:23PM +, Simon Riggs wrote:
 On 8 January 2014 09:07, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 
 I'm going to say right off the bat that I think the whole notion to
 automatically disable synchronous replication when the standby goes down is
 completely bonkers.
 
 Agreed
 
 We had this discussion across 3 months and we don't want it again.
 This should not have been added as a TODO item.
 
 I am glad Heikki and Simon agree, but I don't.  ;-)
 
 The way that I understand it is that you might want durability, but
 might not want to sacrifice availability.  Phrased that way, it makes
 sense, and notifying the administrator seems the appropriate action.
 
 They want to have the cake and eat it too. But they're not actually
 getting that. What they actually get is extra latency when things
 work, with no gain in durability.

They are getting guaranteed durability until they get a notification ---
that seems valuable.  When they get the notification, they can
reevaluate if they want that tradeoff.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Kevin Grittner

Bruce Momjian br...@momjian.us wrote:
 Heikki Linnakangas wrote:

 They want to have the cake and eat it too. But they're not
 actually getting that. What they actually get is extra latency
 when things work, with no gain in durability.

 They are getting guaranteed durability until they get a
 notification --- that seems valuable.  When they get the
 notification, they can reevaluate if they want that tradeoff.

My first reaction to this has been that if you want synchronous
replication without having the system wait if the synchronous
target goes down, you should configure an alternate target.  With
the requested change we can no longer state that when a COMMIT
returns with an indication of success, the data has been
persisted to multiple clusters.  We would be moving to a situation
where the difference between synchronous and asynchronous is subtle -- either way
the data may or may not be on a second cluster by the time the
committer is notified of success.  We wait up to some threshold
time to try to make the success indication indicate that, but then
return success even if the guarantee has not been provided, without
any way for the committer to know the difference.
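
A small Python model of that last point, as a sketch only. The CommitResult type, the ack_delay parameter, and the function names are hypothetical; nothing here is PostgreSQL's actual commit path. It assumes the server waits up to a timeout for the standby's acknowledgement and then reports success regardless.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CommitResult:
    success: bool      # what the client is told
    replicated: bool   # what only the server (and its log) knows


def commit_with_auto_degrade(ack_delay: Optional[float], timeout: float) -> CommitResult:
    """ack_delay is the standby's response time in seconds, or None if it is down."""
    replicated = ack_delay is not None and ack_delay <= timeout
    # Under auto-degrade the commit succeeds whether or not the ack arrived.
    return CommitResult(success=True, replicated=replicated)


def client_view(result: CommitResult) -> str:
    """The committer sees only the command status, not the replication state."""
    return "COMMIT" if result.success else "ERROR"
```

In this model a commit acknowledged by the standby and one that timed out produce the same client_view, which is the ambiguity described above.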

On the other hand, we keep getting people saying they want the
database to make the promise of synchronous replication, and tell
applications that it has been successful even when it hasn't been,
as long as there's a line in the server log to record the lie.  Or,
more likely, to record the boundaries of time blocks where it has
been a lie.  This appears to be requested because other products
behave that way.

I'm torn on whether we should cave to popular demand on this; but
if we do, we sure need to be very clear in the documentation about
what a successful return from a commit request means.  Sooner or
later, Murphy's Law being what it is, if we do this someone will
lose the primary and blame us because the synchronous replica is
missing gobs of transactions that were successfully committed.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Tom Lane

Kevin Grittner kgri...@ymail.com writes:
 I'm torn on whether we should cave to popular demand on this; but
 if we do, we sure need to be very clear in the documentation about
 what a successful return from a commit request means.  Sooner or
 later, Murphy's Law being what it is, if we do this someone will
 lose the primary and blame us because the synchronous replica is
 missing gobs of transactions that were successfully committed.

I'm for not caving.  I think people who are asking for this don't
actually understand what they'd be getting.

regards, tom lane



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Andres Freund

On 2014-01-08 13:34:08 -0800, Kevin Grittner wrote:
 On the other hand, we keep getting people saying they want the
 database to make the promise of synchronous replication, and tell
 applications that it has been successful even when it hasn't been,
 as long as there's a line in the server log to record the lie.

Most people having such a position I've talked to have held that
position because they thought synchronous replication would mean that
apply (and thus visibility) would also be synchronous. Is that
different from your experience?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Josh Berkus

On 01/08/2014 12:27 PM, Bruce Momjian wrote:
 I am glad Heikki and Simon agree, but I don't.  ;-)
 
 The way that I understand it is that you might want durability, but
 might not want to sacrifice availability.  Phrased that way, it makes
 sense, and notifying the administrator seems the appropriate action.

I think there's a valid argument to want things the other way, but I
find the argument not persuasive.  In general, people who want
auto-degrade for sync rep either:

a) don't understand what sync rep actually does (lots of folks confuse
synchronous with simultaneous), or

b) want more infrastructure than we actually have around managing sync
replicas

Now, the folks who want (b) have a legitimate need, and I'll point out
that we always planned to have more features around sync rep, it's just
that we never actually worked on any.  For example, quorum sync was
extensively discussed and originally projected for 9.2, only certain
hackers changed jobs and interests.

If we just did the minimal change, that is, added an auto-degrade GUC
and an alert to the logs each time the master server went into degraded
mode, as Heikki says we'd be loading a big foot-gun for a bunch of
ill-informed DBAs.  People who want that are really much better off with
async rep in the first place.

If we really want auto-degrading sync rep, then we'd (at a minimum) need
a way to determine *from the replica* whether or not it was in degraded
mode when the master died.  What good do messages to the master log do
you if the master no longer exists?

Mind you, being able to determine on the replica whether it was
synchronous or not when it lost communication with the master would be a
great feature to have for sync rep groups as well, and would make them
practical (right now, they're pretty useless).  However, I seriously
doubt that someone is going to code that up in the next 5 days.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Joshua D. Drake



On 01/08/2014 01:34 PM, Kevin Grittner wrote:
 I'm torn on whether we should cave to popular demand on this; but
 if we do, we sure need to be very clear in the documentation about
 what a successful return from a commit request means.  Sooner or
 later, Murphy's Law being what it is, if we do this someone will
 lose the primary and blame us because the synchronous replica is
 missing gobs of transactions that were successfully committed.


I am trying to follow this thread and perhaps I am just being dense but 
it seems to me that:


If you are running synchronous replication, as long as the target
(subscriber) is up, synchronous replication operates as it should. That
is, the origin will wait for a notification from the subscriber that
the write has been successful before continuing.


However, if the subscriber is down, the origin should NEVER wait. That 
is just silly behavior and makes synchronous replication pretty much 
useless. Machines go down, that is the nature of things. Yes, we should 
log and log loudly if the subscriber is down:


ERROR: target xyz is non-communicative: switching to async replication.

We then should store the wal logs up to wal_keep_segments.

When the subscriber comes back up, it will then replicate in async mode 
until the two are back in sync and then switch (perhaps by hand) to sync 
mode. This of course assumes that we have a valid database on the 
subscriber and we have not overrun wal_keep_segments.
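
The "have not overrun wal_keep_segments" condition can be sketched roughly as below. This is a simplification under stated assumptions: WAL segments are modelled as plain increasing integers, the function name is made up, and the check is conservative because a real server may retain more WAL than wal_keep_segments requires (and an archive could supply older segments).

```python
def standby_can_catch_up(master_segment: int,
                         standby_segment: int,
                         wal_keep_segments: int) -> bool:
    """Conservative check: is the segment the standby needs still retained?

    master_segment    -- segment the master is currently writing
    standby_segment   -- next segment the standby needs to receive
    wal_keep_segments -- number of past segments the master keeps around
    """
    if standby_segment > master_segment:
        raise ValueError("standby cannot be ahead of the master")
    lag = master_segment - standby_segment
    return lag <= wal_keep_segments
```

If the check fails in this model, streaming cannot simply resume and the standby would need its WAL from elsewhere or a fresh base backup.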


Sincerely,

Joshua D. Drake



--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Heikki Linnakangas


On 01/08/2014 11:37 PM, Andres Freund wrote:
 On 2014-01-08 13:34:08 -0800, Kevin Grittner wrote:
 On the other hand, we keep getting people saying they want the
 database to make the promise of synchronous replication, and tell
 applications that it has been successful even when it hasn't been,
 as long as there's a line in the server log to record the lie.
 
 Most people having such a position I've talked to have held that
 position because they thought synchronous replication would mean that
 apply (and thus visibility) would also be synchronous.


And I totally agree that it would be a useful mode if apply was 
synchronous. You could then build a master-standby pair where it's 
guaranteed that when you commit a transaction in the master, it's 
thereafter always seen as committed in the standby too. In that usage, 
if the link between the two is broken, you could set up timeouts, e.g. so
that the standby stops accepting new queries after 20 seconds, and then 
the master proceeds without the standby after 25 seconds. Then the 
guarantee would hold.
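
The timeout ordering can be checked with a toy model like the following. It presumes the synchronous-apply mode discussed here, which does not exist; the function names and the detection_skew parameter are assumptions for illustration. Time t is measured in seconds since the link broke.

```python
def standby_serving(t: float, standby_stop_after: float) -> bool:
    """The standby still answers queries until its timeout expires."""
    return t < standby_stop_after


def master_proceeded(t: float, master_proceed_after: float) -> bool:
    """After its timeout the master commits without waiting for the standby."""
    return t >= master_proceed_after


def stale_read_possible(t: float,
                        standby_stop_after: float = 20.0,
                        master_proceed_after: float = 25.0) -> bool:
    """A stale read needs the standby serving while the master has moved on."""
    return (standby_serving(t, standby_stop_after)
            and master_proceeded(t, master_proceed_after))


def guarantee_holds(standby_stop_after: float,
                    master_proceed_after: float,
                    detection_skew: float = 0.0) -> bool:
    """detection_skew: assumed worst-case gap in when each side notices the break."""
    return standby_stop_after + detection_skew <= master_proceed_after
```

With the 20 and 25 second values from the text there is no instant at which a stale read is possible in this model; swapping the two timeouts opens a window between them.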


I don't know if the people asking for the fallback mode are thinking 
that synchronous replication means synchronous apply, or if they're 
trying to have the cake and eat it too wrt. durability and availability.


Synchronous apply would be cool..

- Heikki



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Tom Lane

Josh Berkus j...@agliodbs.com writes:
 If we really want auto-degrading sync rep, then we'd (at a minimum) need
 a way to determine *from the replica* whether or not it was in degraded
 mode when the master died.  What good do messages to the master log do
 you if the master no longer exists?

How would it be possible for a replica to know whether the master had
committed more transactions while communication was lost, if the master
dies without ever restoring communication?  It sounds like pie in the
sky from here ...

regards, tom lane



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Tom Lane

Joshua D. Drake j...@commandprompt.com writes:
 However, if the subscriber is down, the origin should NEVER wait. That 
 is just silly behavior and makes synchronous replication pretty much 
 useless. Machines go down, that is the nature of things. Yes, we should 
 log and log loudly if the subscriber is down:

 ERROR: target xyz is non-communicative: switching to async replication.

 We then should store the wal logs up to wal_keep_segments.

 When the subscriber comes back up, it will then replicate in async mode 
 until the two are back in sync and then switch (perhaps by hand) to sync 
 mode. This of course assumes that we have a valid database on the 
 subscriber and we have not overrun wal_keep_segments.

It sounds to me like you are describing the existing behavior of async
mode, with the possible exception of exactly what shows up in the
postmaster log.

Sync mode is about providing a guarantee that the data exists on more than
one server *before* we tell the client it's committed.  If you don't need
that guarantee, you shouldn't be using sync mode.  If you do need it,
it's not clear to me why you'd suddenly not need it the moment the going
actually gets tough.

regards, tom lane



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Kevin Grittner

Andres Freund and...@2ndquadrant.com wrote:
 On 2014-01-08 13:34:08 -0800, Kevin Grittner wrote:

 On the other hand, we keep getting people saying they want the
 database to make the promise of synchronous replication, and
 tell applications that it has been successful even when it
 hasn't been, as long as there's a line in the server log to
 record the lie.

 Most people having such a position I've talked to have held that
 position because they thought synchronous replication would mean
 that apply (and thus visibility) would also be synchronous. Is
 that different from your experience?

I haven't pursued it that far because we don't have
maybe-synchronous mode yet and seem unlikely to ever support it.
I'm not sure why that use-case is any better than any other.  You
still would never really know whether the data read is current.  If
we were to implement this, the supposedly synchronous replica could
be out-of-date by any arbitrary amount of time (from milliseconds
to months).  (Consider what could happen if the replication
connection authorizations got messed up while application
connections to the replica were fine.)

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Standalone synchronous master

2014-01-08 Thread Joshua D. Drake



On 01/08/2014 01:55 PM, Tom Lane wrote:
 Sync mode is about providing a guarantee that the data exists on more than
 one server *before* we tell the client it's committed.  If you don't need
 that guarantee, you shouldn't be using sync mode.  If you do need it,
 it's not clear to me why you'd suddenly not need it the moment the going
 actually gets tough.


As I understand it what is being suggested is that if a subscriber or 
target goes down, then the master will just sit there and wait. When I 
read that, I read that the master will no longer process write 
transactions. If I am wrong in that understanding then cool. If I am not 
then that is a serious problem with a production scenario. There is an 
expectation that a master will continue to function if the target is 
down, synchronous or not.


Sincerely,

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats


