Re: [HACKERS] Sync Rep: First Thoughts on Code

Mark Mielke Sat, 13 Dec 2008 15:36:16 -0800

Markus Wanner wrote:

I don't think synchronous replication guarantees that it will be
immediately visible. Even if it did push the change to the other
machine, and the other machine had committed it, that doesn't guarantee
that any reader sees it any more than if I commit to the same machine
(no replication), I am guaranteed to see the change from another
session.


AFAIK every snapshot taken after a transaction has acknowledged its
commit is guaranteed to see changes from that transaction. Isn't that a
pretty frequent and obvious user expectation?

Yes - but that's only really true while the session continues. Fromanother session? I've never assumed that I could reconnect and beguaranteed to get the latest snapshot that includes absolutelyeverything that has been committed.

Any system that guaranteed this even when involving multiple machineswould be guaranteed to be inefficient and difficult to scale in myopinion. How could any system promise to have reasonable commit timeswhile also guaranteeing that once a commit completes, any session to anyother server will be able to see the commit? I think this forces somesort of serialization between multiple machines and defeats the purposeof having multiple machines. Where before it was indeterminate to knowwhen the commit would take effect at each replica, it's notindeterminate when my commit will succeed. That is, my commit cannotsucceed until every single server acknowledge that it is has fullyreceived and committed my transaction. What happens if there are networkproblems, or what happens if I am replicating over a slower link? Whatif I am committing to 100 servers? Is it reasonable to expect 100 servernegotiations to complete in full before my own commit will return?

Synchronous replication only means that I can be assured that
my change has been saved permanently by the time my commit completes. It
doesn't mean anybody else can see my change or is guaranteed to see my
change if the query from another session.

So you wouldn't be surprised if a transaction from two hours ago isn't
visible on another node, just because that node happens to be rather
busy with lots of other readers and maintenance tasks?

Any system that is two hours behind should fall out of the pool used tosatisfy reads from. So, if there was a surprise, it would be this. Idon't believe ACID requires that a commit on one server is immediatelyvisible on another server. Any work I do on the "behind" server wouldstill be safe from a transaction and referential integrity perspective.However, if I executed 'commit' on this "behind" server, I would expectthe commit to wait until it catches up, or in the case of a 2 hourbehind, I would expect the commit to fail. Look at the alternative - allcommits to any server in the pool would be locked up waiting for thisone machine to catch up on 2 hours of transaction. This emphasizes thatthe problem is that a server two hours of date is still in the pool,rather than the problem being keeping things up-to-date.

If my application assumes that it can commit to one server, and then
read back the commit from another server, and my application breaks as a
result, it's because I didn't understand the problem.

Well, yeah, depends on user expectations. I'm surprised to hear that you
have that understanding of synchronous replication.

I've seen people face it in the past. Most recently we had apresentation from the developer of digg.com, and he described how he hadthis problem with MySQL and that he had to work around it.

On a smaller scale and slightly unrelated, I had this problem frequentlybetween memcache and PostgreSQL. That is, memcache would always belatest, but PostgreSQL might not be latest, because the commit had notoccurred.

It seems like a standard enough problem to me. I don't expect Postgres-Rto do the impossible. As with my previous paragraph, I don't expectPostgres-R to wait 2-hours to commit just because one server is fallingbehind.

Even if PostgreSQL
didn't use the word "synchronous replication", I could still be
confused. I need to understand the problem no matter what words are used.


As said, it depends on what the common understanding of "synchronous
replication" is. I've so far been under the impression, that these
potential lags are unexpected and confusing. Several people pointed me
at that problem and I've thus "relabeled" Postgres-R as not being
synchronous. I'm at least surprised to suddenly get pushed into the
other direction. :-)

However, I absolutely agree that it's not that important how we name it.
What is important, is that users and developers understand the difference

I agree they are unexpected and confusing. I don't agree that they areunexpected or confusing to those knowledgeable in the domain. So, thequestion becomes - whose expectation is wrong? Should the user learnmore? Or should we push for a change in terminology? Does it make sensefor Postgres-R (which looks excellent to me BTW, at least in principle)be marketed differently, because a few users tie "synchronousreplication" to "serialized access"?

Because that's really what we're talking about - we're talking abouttransactions in all sessions being serialized between machines toprovide less surprise to users who don't understand the complexity ofhaving multiple replicas.

Forget replication - even for the exact same server - I don't expectthat if I commit from one session, I will be able to see the changeimmediately from my other session or a new session that I just opened.Perhaps this is often stable to rely on this, and it is useful for thedatabase server to minimize the window during which the commit becomesvisible to others, but I think it's a false expectation from the startthat it absolutely will be immediately visible to another session. I'mthinking of situations where some part of the table is in cache. Theonly way the commit can communicate that the new transaction isavailable is by during communication between the processes or threads,or between the multiple CPUs on the machine. Do I want every commit toforce each session to become fully in alignment before my commitcompletes? Does PostgreSQL make this guarantee today? I bet it doesn'tif you look far enough into the guts. It might be very fast - I don'tthink it is infinitely fast.


Cheers,
mark

--
Mark Mielke <m...@mielke.cc>

Re: [HACKERS] Sync Rep: First Thoughts on Code

Reply via email to