Re: [sqlite] Conflict between snapshots and checkpoints

Gwendal Roué Sun, 29 Sep 2019 08:40:57 -0700

Thank you very much Keith.

Apologies for my imprecise vocabulary, and the use of the same "snapshot"
word with different meanings.


I have used the term "snapshot isolation" as used in
https://www.sqlite.org/isolation.html; But I'll remember about
REPEATABLE-READ isolation.

I also thank you very much for confirming that preventing checkpoints
allows the WAL markers to remain valid.

My goal is to provide a set of concurrency primitives that are useful for
GUI app developers. I'm learning a lot about SQLite in the process, of
course.

On Sun, Sep 29, 2019 at 2:13 PM Keith Medcalf <kmedc...@dessus.com> wrote:

>
> On Sunday, 29 September, 2019 01:28, Gwendal Roué <gwendal.r...@gmail.com>
> wrote:
>
> >But now I fail to understand the indented use case of sqlite3 snapshots..
> >Why allow to reuse snapshots with several calls to open()? Why do they
> >exist at all, since we can already profit from snapshot isolation with
> >one transaction (at the cost of keeping it open)?
>
> The sqlite3_snapshot_* interfaces allow you to record information about
> transactions maintained by the WAL transaction manager.  The interface DOES
> NOT manage the WAL transaction manager nor does it manage your so-called
> snapshots (which are not snapshots).  It allows you to "get" information
> about a particular point-in-time location in the WAL transaction log and
> "goto" a specific point-in-time in the WAL transaction log, to "free" the
> memory used to record this information, and to "cmp" which of two
> structures point to a later point-in-time in the WAL transaction log.
>
> They do NOT modify or control the WAL transaction system or the WAL file.
> You are responsible for doing this.
>
> https://sqlite.org/wal.html
>
> So, *IF* you want to maintain the availability of a specific point in time
> in the WAL transaction history, they YOU must make sure that YOU do not
> erase that particular point-in-time location by checkpointing the WAL
> transaction log.  You have the capability of doing this by disabling the
> autocheckpoint mechanism and controlling when you do checkpoints.  The WAL
> transaction system neither knows nor cares that you happen to have recorded
> some information about a particular point-in-time offset in the WAL file.
>
> It is sort of like "No Smoking" signs.  The sign does not mean that you
> cannot smoke.  It means that someone posted a sign that says "No Smoking".
> The fact that there is a sign and what it says is an entirely separate fact
> that has no bearing at all on whether or not one may smoke, nor does the
> sign somehow magically prevent smoking, nor will it prevent you from
> catching on fire and smoking as a result thereof.  It is merely a sign that
> has something written on it which someone stuck up on the wall (or whatever
> support structure it is stuck to).  And what is written on the sign is
> entirely independent of any other state of affairs.
>
> >For information, my quest for snapshot protection has met some success (
> >https://github.com/groue/GRDB.swift/pull/625)
>
> This appears to be updating a wrapper interface to SQLite3 so that if you
> happen to have a point-in-time location recorded, then the automatic WAL
> checkpoint system is disabled, plus doing some prevention to protect users
> of the wrapper from modifying (checkpointing) the WAL transaction log if
> they happen to have some recorded "points-in-time" that are dependent on
> not modifying the WAL transaction log.  (ie, things that you should be
> doing yourself so as to not shoot yourself in the foot).
>
> >Given I control the connections to a given database file, I am able to
> >open one writer connection, and, on demand, up to N read-only connections.
>
> >Those N reader connections allow concurrent database reads. Those "reads"
> are
> >generally wrapped in a deferred transaction which provides snapshot
> >isolation.
>
> No, it provides REPEATABLE-READ isolation.  There is no actual "snapshot"
> taken and no snapshot exists.  You are merely not seeing data written to
> the WAL transaction log at a point-in-time subsequent to the point in time
> at which you commenced the "repeatable-read".
>
> The sqlite3_snapshot_* interface merely provides a way to record
> information about this repeatable-read point-in-time position in the WAL
> file so that you can go back to that "repeatable-read" point-in-time
> location sometime in the future IF IT STILL EXISTS in the WAL transaction
> log.  Since YOU are in control of whether or not the WAL file is
> checkpointed, then YOU are in control of whether this point-in-time still
> exists in the WAL transaction log.  If you remove data from the WAL
> transaction log by performing a checkpoint (which moves transactions into
> the main database file) then this recorded "point-in-time" may no longer
> exist in the WAL transaction log.
>
> >At the end of this transaction, the reader connection becomes
> >available for another read.
>
> Yes.  And by default when the transaction starts it gets a repeatable-read
> isolation "point-in-time" stamp corresponding to the last transaction
> written to the WAL transaction log which has been committed, or gets the
> current point-in-time representing the "current database" if there are no
> committed transactions in the WAL transaction log.
>
> >One can now create a "snapshot". Those snapshots use the same pool of N
> >readers: snapshot reads are then wrapped in a deferred transaction and
> >sqlite3_snapshot_open. At the end of this transaction, the reader
> >connection becomes available for another regular read or for another
> >snapshot read.
>
> There is the error.  There is no "snapshot".  You do not "create a
> snapshot".  You record a point-in-time marker location within the WAL
> transaction log.  Think of it as taking a picture of the state of the
> moving van after it has been half filled.  You can go back to the
> point-in-time represented by this picture right up until the moving van
> drives to the new location and unloads those boxes at your new house (does
> a checkpoint).  When this happens, you still have the picture, but now a
> new van has arrived to be loaded up and the old one is gone forever.  While
> you still might have the picture, it is just a bunch of data that is not
> useful for anything except entertainment.
>
> >In order to protect unused snapshots, checkpoints are prevented as long
> >as there exists snapshots, with sqlite3_wal_hook().
>
> There are no "snapshots".  There are merely recorded point-in-time markers
> from the WAL transaction log.  If you modify the WAL transaction log by
> performing a checkpoint operation, your "point-in-time" may no longer be
> accessible because that "point-in-time marker" may no longer exist in the
> WAL transaction log.
>
> >I **really** hope this protects snapshots for good. Of course, a
> >confirmation from knowledgeable people would be appreciated :-)
>
> Yes.  If you do not remove transactions from the WAL transaction log then
> all the point-in-time markers before each of these transactions will still
> exist.  As soon as you move a committed transaction from the WAL file to
> the main database (this is called a checkpoint), the "point-in-time" marker
> prior to that transaction no longer exists in the WAL file, so you "cannot
> get there from here" (you cannot go to that point-in-time location because
> it is no longer recorded in the transaction log).
>
> --
> The fact that there's a Highway to Hell but only a Stairway to Heaven says
> a lot about anticipated traffic volume.
>
>
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Conflict between snapshots and checkpoints

Reply via email to