Traditional database journalling logged page before-images (to allow a
database to be rolled back to a point in time), page after-images (to
roll forward from a backup for disaster recovery), or both. Either
technique eliminates a single point of failure (the disk).
Interbase originally offered both before- and after-image journalling to
a separate journal server that supported multiple databases.
A write ahead log is a totally different animal. A write ahead log
writes page changes to a single serial file so when a transaction
commits, only the serial log gets flushed to storage rather than all
dirty pages in the cache. It doesn't, however, solve the problem of a
single point of failure without RAID.
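To make that concrete, here is a minimal sketch of the idea in Python.
It is purely illustrative -- the WriteAheadLog class, the JSON record
format, and the file name are all made up, not how Interbase, Falcon,
or any other engine actually lays out its log:

import json
import os

class WriteAheadLog:
    def __init__(self, path):
        # Unbuffered binary append so every write goes straight to the OS.
        self.log = open(path, "ab", buffering=0)

    def commit(self, txn_id, page_changes):
        # Append all of the transaction's page changes as one serial record,
        # then force the log to storage.  The dirty pages themselves can be
        # written back lazily; only this one flush gates the commit.
        record = json.dumps({"txn": txn_id, "changes": page_changes}) + "\n"
        self.log.write(record.encode())
        os.fsync(self.log.fileno())

wal = WriteAheadLog("example.wal")
wal.commit(42, {"page 7": "after-image bytes", "page 9": "after-image bytes"})
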
During the Borland years, Interbase tried to do a write ahead log and
dropped the journalling code and server. When it was pointed out that
this introduced a single point of failure, they abandoned the write
ahead log and concentrated on shadowing.
The Falcon storage engine used a write ahead log so a transaction could
be committed with a single non-buffered write.
I also put a write ahead log for replication messages into NuoDB storage
managers to meet some customers' insistence that every piece of data be
stored on at least two devices before a transaction could be reported as
committed. And one of these days I gotta write one for Amorphous for
the same reason.
Write ahead logs are implemented in almost all commercial database systems.
I've forgotten the details of the InnoDB logs, but they implemented MVCC
with a pointer in their lock manager to a prior version of a record in
their log. It does (or used to) have some crock where it stops working
when the lock space is exhausted.
I haven't a clue as to how contemporary Interbase works.
Almost everyone lies about serializability. Everyone should know the
formal definition: a database is serializable if, for any set of
concurrent transactions, there exists an ordering of those transactions
such that executing them serially in that order yields the same database
state.
Here's a test case: given a database with variables a and b initialized
to 1 and variables c and d initialized to zero, consider two concurrent
transactions A and B. Transaction A copies b to c and bumps a.
Transaction B copies a to d and bumps b.
A serializable database will either deadlock or have c and d with values
of either 1 and 2 or 2 and 1. An MVCC database will have both c and d
with values of 1.
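The test case is easy to simulate. The toy Python below (no real engine
behind it; the function names are made up) runs A and B against
per-transaction snapshots the way an MVCC system would, then runs them
serially in both orders:

def run_mvcc_concurrently():
    committed = {"a": 1, "b": 1, "c": 0, "d": 0}
    snap_a = dict(committed)   # A's snapshot when it starts
    snap_b = dict(committed)   # B's snapshot when it starts
    a_writes = {"c": snap_a["b"], "a": snap_a["a"] + 1}   # A: copy b to c, bump a
    b_writes = {"d": snap_b["a"], "b": snap_b["b"] + 1}   # B: copy a to d, bump b
    committed.update(a_writes)   # the write sets don't overlap, so both commit
    committed.update(b_writes)
    return committed

def run_serially(a_first):
    db = {"a": 1, "b": 1, "c": 0, "d": 0}
    def txn_a(db): db["c"] = db["b"]; db["a"] += 1
    def txn_b(db): db["d"] = db["a"]; db["b"] += 1
    for txn in ([txn_a, txn_b] if a_first else [txn_b, txn_a]):
        txn(db)
    return db

print(run_mvcc_concurrently())   # c == 1 and d == 1: no serial order produces this
print(run_serially(True))        # c == 1, d == 2
print(run_serially(False))       # c == 2, d == 1
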
CockroachDB, which implements MVCC with record timestamps, claims to be
both MVCC and serializable. Some of their literature says they are
"virtually serializable," which, translated from marketing to English,
means "not serializable." I haven't been able to find anything that
says they can handle the above test case. It is possible that they
retain the full record read set and re-read and verify every record
before commit, but they don't say they do, and the cost would be
prohibitive. If anyone knows, I'd like to hear about it.
Two-phase locking without phantom control, however, isn't serializable,
but the concurrency cost of phantom control is too high for most
database systems. Many systems implement a truly serializable mode to
get a marketing check mark that they expect nobody to ever use in
practice (Interbase implemented a two phase locking scheme for tables,
which was both serializable and unusable).
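To illustrate why phantom control matters, here is a toy sketch (not any
real lock manager): T1 read-locks every row that currently matches its
predicate, but nothing stops T2 from inserting a new matching row,
because there is no existing row to lock.

table = [{"dept": "sales", "salary": 100}]
read_locks = set()

def t1_sum_sales():
    # T1 locks and sums every existing row matching dept == "sales".
    rows = [r for r in table if r["dept"] == "sales"]
    for r in rows:
        read_locks.add(id(r))
    return sum(r["salary"] for r in rows)

def t2_insert_sales_row():
    # T2 inserts a new matching row; no lock conflict, because no row lock
    # can cover a row that doesn't exist yet.  Predicate (phantom) control
    # would have blocked this insert.
    table.append({"dept": "sales", "salary": 200})

first = t1_sum_sales()
t2_insert_sales_row()
second = t1_sum_sales()
print(first, second)   # 100 then 300: the new row is a phantom for T1
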
Personally, I believe that while serializability is a sufficient
condition for consistency, it isn't a necessary one. In my book,
consistency means:
1. A transaction sees a consistent view of the database plus its own
updates.
2. A transaction sees only committed data.
3. A transaction can't overwrite any data it couldn't see.
4. The database enforces any additional declared consistency constraints.
All of these work for MVCC, as the sketch below suggests.
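Here is a toy sketch of rules 1 through 3 under MVCC -- purely
illustrative, not Interbase's or anyone else's actual code. Each record
keeps a chain of versions stamped with the writing transaction, a reader
takes the newest version committed as of its snapshot, and a writer may
only replace a version it could see. Rule 4 would be a commit-time
constraint check on top of this:

class UpdateConflict(Exception):
    pass

class Record:
    def __init__(self, value, txn_id):
        self.versions = [(txn_id, value)]   # newest version last

    def read(self, snapshot, committed):
        # Rules 1 and 2: newest version committed as of the reader's snapshot.
        for txn_id, value in reversed(self.versions):
            if txn_id in committed and txn_id <= snapshot:
                return value
        return None

    def write(self, snapshot, committed, txn_id, value):
        newest_txn, _ = self.versions[-1]
        # Rule 3: refuse to overwrite a version the writer couldn't see.
        if newest_txn != txn_id and (newest_txn not in committed
                                     or newest_txn > snapshot):
            raise UpdateConflict("record updated by a transaction we can't see")
        self.versions.append((txn_id, value))

committed = {1}
rec = Record("v1", txn_id=1)
rec.write(snapshot=1, committed=committed, txn_id=2, value="v2")  # saw txn 1's version: ok
try:
    rec.write(snapshot=1, committed=committed, txn_id=3, value="v3")  # conflicts with uncommitted txn 2
except UpdateConflict as e:
    print("rolled back:", e)
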
On 6/3/2022 4:05 AM, Pól Ua L. via Firebird-devel wrote:
Hi again Jim, and thanks for your replies - it's interesting reading about the
history of MVCC inter alia - see below.
Your answers bring up a couple of questions though.
One day I was driving down Route 3 in Manchester, New Hampshire, when it
occurred to me that rather than keeping multiple page images, I could
keep multiple record versions, hopefully on the same page, and with
clever bookkeeping have individual transactions keep track of which of
several record versions they should see. So it solved concurrency
control, transaction backout, garbage collection, and database restart
without journalling.
Q.1) If MVCC doesn't require journalling, then why does Interbase now tout the
fact that it has a Write Ahead Log (WAL - which I assume is a synonym for
journalling)?
From the page (https://en.wikipedia.org/wiki/InterBase):
RESILIENT
Live Backups
Distinguished Data Dumps
Write-Ahead Logging <<--------**
Point-in-Time Recovery
Oracle and MySQL (InnoDB engine) use MVCC and have Redo logs - which (at least
AFAICS) are a WAL by another name.
Refs:
https://docs.oracle.com/cd/E18283_01/server.112/e17120/onlineredo001.htm
https://dev.mysql.com/blog-archive/mysql-8-0-new-lock-free-scalable-wal-design/
========================================================
And, in another reply, there's this:
For what it's worth, David Reed's dissertation was on a
non-transactional distributed directory system. Bernstein and Goodman's
book "proved" that MVCC was serializable, which it most definitely was not.
Q.2) How then do the various MVCC systems implement SERIALIZABLE?
It's quite a confusing topic - there's an excellent article (which I haven't
fully digested yet) here:
https://medium.com/paypal-tech/think-twice-before-dropping-acid-and-throw-your-cap-away-dbe0d6171dc0
which appears to imply that none of the major systems have a true SERIALIZABLE
transaction isolation level?
Thanks to anyone for any input.
Best and regards,
Pól...
--
Jim Starkey, AmorphousDB, LLC
Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel