Re: [HACKERS] Sync Rep: Second thoughts

Emmanuel Cecchet Sun, 21 Dec 2008 15:55:25 -0800

Hi Markus,

I am happy to see that Postgres-R is alive again. The paper was writtenin 07 (and published in 08, the review process is longer than aCommitFest ;-)) and at the time of the writing there was no version ofPostgres-R available, hence the 'obsolete' mention referring to pastversions.I think that it is legitimate for users to expect more guarantees from areplicated database than from a single database. Not knowing what happenwhen a failure happens at commit time when some nodes are still activein a cluster is not intuitive for users.I did not look at the source, but if Postgres -R continue to elaborateon Bettina's ideas with writeset extraction and a certificationprotocol, I think that it will be a bad idea to try to mix it with SyncRep (mentioned in another thread). If you delay commits, you willincrease the window for transactions to conflict and therefore induce ahigher abort rate (thus less scalability). Certification-basedapproaches have already multiple reliability issues to improve writeperformance compared to statement-based replication, but this is verydependent on the capacity of the system to limit the conflicting windowfor concurrent transactions. The writeset extraction mechanisms have hadtoo many limitations so far to allow the use of certification-basedreplication in production (AFAIK).


Good luck with Postgres-R.
Emmanuel

Emmanuel Cecchet wrote:

What the application is going to see is a failure when the postmaster it
is connected to is going down. If this happen at commit time, I think
that there is no guarantee for the application to know what happened:
1. failure occurred before the request reached postmaster:  no instance
committed
2. failure occurred during commit: might be committed on either nodes
3. failure occurred while sending back ack of commit to client: both
instances have committed
But for the client, it will all look the same: an error on commit.


This is very much the same for a single node database system, so I think
current application developers are used to that behavior.

A distributed database system just needs to make sure, that committed
transactions can and will eventually get committed on all nodes. So in
case the client doesn't receive a COMMIT acknowledgment due to atny kind
of failure, it can only be sure the distributed database is in a
consistent state. The client cannot tell, if it applied the transaction
or not. Much like for single node systems.

I agree, that a distributed database system could theoretically do
better, but that would require changes on the client side connection
library as well, letting the client connect to two or even more nodes of
the distributed database system.

This is just to point out that despite all your efforts, the client
might think that some transactions have failed (error on commit) but
they are actually committed.


As pointed out above, that would currently be an erroneous conclusion.

If you don't put some state in the driver
that is able to check at failover time if the commit operation succeeded
or not, it does not really matter what happens for in-flight
transactions (or in-commit transactions) at failure time.


Sure it does. The database system still needs to guarantee consistency.

Hm.. well, you're right if there's only one single standby left (as is
obviously the case for the proposed Sync Rep). Ensuring consistency is
pretty simple in such a case. But imagine having two or more standby
servers. Those would need to agree on a set of in-flight transactions
from the master they both need to apply.

Actually, if there was a way to query the database about the status of a
particular transaction by providing a cluster-wide unique id, that would
help a lot.


You're certainly aware, that Postgres-R features such a global
transaction id... And I guess Sync Rep could easily add such an
identifier as well.

I wrote a paper on the issues with database replication at
Sigmod earlier this year (http://infoscience.epfl.ch/record/129042).
Even though it was targeted at middleware replication, I think that some
of it is still relevant for the problem at hand.


Interesting read, thanks for the pointer. It's pretty obvious that I
don't consider Postgres-R to be obsolete...

I've linked your paper from www.postgres-r.org [1].

Regarding the wording, if experts can't agree, you can be sure that
users won't either. Most of them don't have a clue about the different
flavors of replication. So as long as you state clearly how it behaves
and define all the terms you use that should be fine.


I mostly agree to that, without repeating my concerns again, here.

Regards

Markus Wanner


[1]: Referenced Papers from Postgres-R website:
http://www.postgres-r.org/documentation/references



--
Emmanuel Cecchet

FTO @ Frog ThinkerOpen Source Development & Consulting

--
Web: http://www.frogthinker.org
email: m...@frogthinker.org
Skype: emmanuel_cecchet


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep: Second thoughts

Reply via email to