On 27/02/16 09:19, Konstantin Knizhnik wrote:
On 02/27/2016 06:54 AM, Robert Haas wrote:

[...]

So maybe the goal for the GTM isn't to provide true serializability
across the cluster but some lesser degree of transaction isolation.
But then exactly which serialization anomalies are we trying to
prevent, and why is it OK to prevent those and not others?

Absolutely agree. There are some theoretical discussions regarding CAP and different distributed levels of isolation. But in practice people want to solve their tasks. Most PostgreSQL users are using the default isolation level, read committed, although there are a lot of "wonderful" anomalies with it. Serializable transactions in Oracle actually violate the fundamental serializability rule, and still Oracle is one of the most popular databases in the world... There was an isolation bug in Postgres-XL which doesn't prevent commercial customers from using it...

I think this might be a dangerous line of thought. While I agree PostgreSQL should definitely look at the market and answer questions that (current and prospective) users may ask, and be more practical than idealistic, easily ditching isolation guarantees might not be a good thing.

That Oracle is the leader despite its isolation problems, or that most people run PostgreSQL under read committed, is not a good argument to cut corners and settle for the bare minimum (if any) of isolation guarantees. First, because PostgreSQL has always been trusted and understood as a system with *strong* guarantees (whatever that means). Second, because what we may perceive as OK from the market might change soon. From my observations, while I agree with you that most people "don't care" or, worse, "don't realize", this is rapidly changing. More and more people are becoming aware of the problems of distributed systems and the significant consequences they may have on them.

A lot of them have been illustrated in the famous Jepsen posts. One example, and a good one given that you have mentioned Galera before, is this one: https://aphyr.com/posts/327-jepsen-mariadb-galera-cluster, which demonstrates how Galera fails to provide Snapshot Isolation, even in a healthy state, despite claiming that it does.

As of today, I would expect any distributed system to clearly state its guarantees in the documentation. And then adhere to them, for instance by proving it with tests such as Jepsen.


So I am not saying that discussing all these theoretical questions is not needed, nor that a formally proven correctness of the distributed algorithm is unnecessary.

I would like to see work move forward here, so I really appreciate all your work on this. I cannot give an opinion on whether the DTM API is good or not, but I agree with Robert that a good technical discussion on these issues is a good, and a needed, starting point. Feedback may also help you avoid pitfalls that may otherwise go unnoticed until tons of code are implemented.

Academic approaches are sometimes "very academic", but studying them doesn't hurt either :)


    Álvaro


--
Álvaro Hernández Tortosa


-----------
8Kdata


