[MariaDB discuss] Re: Repeatable read anomalies

Kristian Nielsen via discuss Tue, 19 Dec 2023 15:00:34 -0800

Kyle Kingsbury via discuss <[email protected]> writes:

> I've just completed an analysis of isolation levels in MySQL. The
> report focuses on MySQL, but the test suite runs against MariaDB as
> well. MariaDB exhibits all the same behaviors--"Repeatable Read"
> allows a whole bunch of anomalies that should be prohibited under
> Repeatable Read, including G2-item, G-single (even without
> predicates), non-repeatable reads, violations of Monotonic Atomic
> View, etc. Figured this might be of interest to the list.
>
> https://jepsen.io/analyses/mysql-8.0.34
>
> I'm also curious whether this is widely-understood behavior in the
> MariaDB community, or if it's news to most people. The extant


>From a quick read of the link, the main point seems to be about what happens
to INSERT/UPDATE/DELETE ("updates") in a REPEATABLE READ transaction.

As is mentioned, the basic idea in InnoDB REPEATABLE READ is to use a
logical snapshot at the time of transaction start to satisfy SELECTs.
I think it was always obvious to me that this cannot work for updates.
Updates have to modify the current version of the database, not some old
version.

So I have always thought of REPEATABLE READ to be "repeatable" for read-only
transactions spanning multiple selects.

The primary purpose of the way locking is done in InnoDB is to support
statement-based asynchronous replication. The idea is that taking the update
statements and re-running them transaction by transaction, in commit order,
will produce identical data. Similar to SERIALIZABLE, but for the
INSERT/UPDATE/DELETE statements, not necessarily the SELECTs.

The ability to replicate using statements is very important for performance.
Queries that modify a large number of rows otherwise have to write huge
amounts of replication data ("binlogs" in MySQL/MariaDB terms) recording the
contents of every single changed database row.

As is also remarked in the link, using statements for replication is also
very complicated to do correctly in all corner cases, and in practise quite
fragile, unfortunately.

In the end, it is an evolutionary compromise between getting flexibility and
performance on the one hand, and sufficiently correct behaviour on the
other. My personal opinion is that the success of the database is in large
part due to the constant focus on ensuring the flexibility and performance
needed by practical use. But the resulting complexity and fragility of
replication is a very real problem for users, and needs improvement. So this
kind of deep analysis that you have done is very valuable I think, to
understand the current situation and point out current weaknesses.

As to concrete suggestions, getting documented what REPEATABLE READ actually
does would be very valuable I think.

As to the SQL standard, I am not intimately familiar with it, but the
inadequacy of the isolation levels as defined by the standard has been
"folklore" in the community for decades. Unlike something like the C/C++
standard, the SQL standard seems to be far removed from the needs of users,
at least in the applications that typically employ MySQL/MariaDB. And from
my experience, people are resigned rather than hopeful that this could ever
change. This is sad, as a good standard could help improve the portability
of applications between different SQL implementations.

 - Kristian.
_______________________________________________
discuss mailing list -- [email protected]
To unsubscribe send an email to [email protected]

[MariaDB discuss] Re: Repeatable read anomalies

Reply via email to