On Thu, 2015-12-10 at 17:12 +0000, Ian Jackson wrote:
> diff --git a/schema/README.updates b/schema/README.updates
> new file mode 100644
> index 0000000..622410c
> --- /dev/null
> +++ b/schema/README.updates
> @@ -0,0 +1,179 @@
> +SCHEMA DEFINITION AND SCHEMA UPDATES (PRODUCTION `EXECUTIVE' MODE)
> +==================================================================
> +
> +To generate a new DB, we apply the original schema (in initial.sql)
> +and then apply all the updates, in order.
> +
> +We maintain a table in the DB which records which updates are applied.
> +
> +
> +Schema update snippet format
> +----------------------------
> +
> +Schema update snippets should be called
> +   schema/<updatename>.sql
> +
> +They should contain DDL commands (ALTER TABLE etc.) to make whatever
> +changes are needed.
> +
> +They MUST NOT contain BEGIN or COMMIT.
> +
> +They must contain a special comment near the top:
> +
> +  -- ##OSSTEST## <sequence> <status>
> +
> +<updatename> is a string (/^[a-z][0-9a-z-]+$/) which uniquely identifies
> +the update.  It must not be changed because existing installations
> +rely on updates having stable names.
> +
> +<sequence> is a positive integer, which should be unique.  Updates are
> +applied in order.

Don't these also need to be monotonically increasing over time/commits?

i.e. committing (and applying through all the states) sequence #42 and then
later committing #12 would be at best confusing and at worse perhaps
produce different results when recreating the db (which, I think, would run
#12 first).

So maybe the rule needs to be something about being larger than the largest
currently applied patch?


> +
> +<status> reflects the compatibility of various schema versions.  It is
> +a literal string naming one of the statuses shown in `Update orders',
> +below.
> +
> +<status> depends on the nature of the specific database change, and
> +the behaviour and capabilities of the other code in the same revision
> +of osstest.git.  But, so <status> does not depend on the state of the
> +database.  Applying a schema update to a database does not change its
> +`status'.

"But, so ..." ? I think maybe s/But, s/S/ is what you meant? Or maybe there
was a missing thought?

It's not stated outright, but AIUI the <status> of an update changes in a
commit which either adds/edits a schema update, or which adds code which
adds compatibility/requirements for a particular schema update. Is that
right?

> +In principle, each update can separately be in applied or not applied
> +in any one moment in time (in various databases), and simultaneously
> +have different statuses in different relevant versions of osstest.git.
> +So overall the possible states of an update in the whole world are the
> +cross product of (i) status in each relevant osstest revision, and
> +(ii) appliedness (boolean) in each relevant database instance.
> +
> +
> +Update orders
> +-------------
> +
> +There are four reasonable plans for schema changes:
> +
> + * Fully intercompatible: both old code and new code are each
> +   compatible with both old schema and new schema.  The code and
> +   schema updates may be done in any order.
> +
> +   Such a schema change always has status:
> +      Harmless

What happens if some subsequent change (perhaps a long time later) causes
the code to require the changes made by a "Harmless" schema update?

Would that be a bug in that later code for not coping with the old schema,
or a bug in the commit adding it for not updating the header of the schema
update (to "Needed", presumably) or is it not an issue because there is
some point at which a schema update becomes part of the assumed baseline?

> +
> + * Explicit conditional: first update the code to understand both
> +   versions of the schema; then update the schema; then drop the
> +   compatibility code.
> +
> +   Such a schema change always has status:
> +      Unfinished (or absent)   in old code
> +      Ready                    in intermediate code
> +      Needed                   in the final code

So, a plausible sequence of commits to osstest.git might be:

1: Add schema/foo.update with initial status "Unfinished".
2a: Add code to partially implement compat with the new schema:
        no status change
[2b,2c...] more compat or unrelated changes
3: Add final code to completely implement compat with the new schema:
        status changed to "Ready" in that same commit
4a,b,c: Maybe other unrelated changes
5: Remove any piece code which provides support for the old schema:
        status changes to "Needed"
6: Eventually remove remove other compat code.

I've broken down the addition/removal of compat code into stages to
illustrate, that code might really all come/go in a single commit.

(I see now that there is a more comprehensive example further down the doc)

> +
> + * Code first: the new code works with either old or new schema,
> +   but the old code cannot cope with the new schema.
> +
> +   Such a schema change has status:
> +      Unfinished (or absent)   in old code
> +      Ready                    in new code
> +
> + * Schema first: the new schema works with any code; but the old
> +   schema does not work with new code.
> +
> +   Such a schema change has status:
> +      Preparatory              in old code
> +      Needed                   in the new code
> +
> +
> +Update order for Populate-then-rely
> +-----------------------------------
> +
> +This is for when we want to record new information and then later rely
> +on it.  There are typically two schema changes:
> +
> +* To add the column(s).  I will call this `add'.  It is a `Schema
> +  first' change, in the taxonomy above.
> +
> +* To add appropriate constraints, to prevent the new information being
> +  left blank.  I will call this `constraint'.  This is a `Code first'
> +  or `Explicit conditional' change in the taxonomy above.
> +
> +1. Commit: new schema update `add', status Preparatory.
> +
> +2. Commit: new schema update `constraint', status Unfinished.

At this point we want to wait for those commits to pass the push gate,
before we can apply `add', since applying `add' should be done from a
"properly acked version of osstest.git".

This is made pretty clear by the following commentary, for `Apply' but I
wanted to check I'd got the placement of the wait correct.

> +3. Apply: `add'.
> +
> +4. Optionally commit: code to read new column, but which tolerates
> +   both complete absence of the column, and/or it containing NULL
> +   (or whatever the DEFAULT value is).
> +
> +5. Commit: code to populate new column; changing `add' to status
> +   Needed and `constraint' to status Ready.
> +
> +6. Optionally commit: code which read new column, but which tolerates
> +   it containing NULL/DEFAULT.  (`add' is already Needed.)
> +
> +7. If necessary commit: idempotent utility script to populate missing
> +   data.  (Alternatively, this can be done with DML statements in the
> +   `constraint' schema update .sql file.  This is better if it is
> +   possible.)
> +
> +8. Wait for all executions of old code to finish.  (This obviously
> +   implies first getting a push of all the commits mentioned above.)
> +
> +8. If necessary, execute utility script to populate missing data.
> +
> +9. Apply: `constraint'.
> +
> +10. Optionally commit: code which relies on new column, and does not
> +   necessarily tolerate NULL/DEFAULT; changing `constraint' to Needed.
> +
> +
> +`Commit' means committing somewhere public and probably pushing to
> +osstest.git#pretest, but not necessarily getting a push.  (It
> +necessarily precedes any formal testing of the relevant changes on a
> +production instance.)
> +
> +`Apply' (and `execute utility script') should only be done using a
> +properly acked version of osstest.git.  If verifying the sanity of the
> +schema change is nontrivial then ad-hoc tests may need to have been
> +run with a testing instance of the database.  Using only a pushed
> +production version is a good idea to avoid the possibility that the
> +production database might contain changes which are not evident in
> +published code (or worse, which are different in future versions).
> +
> +Subject to those conditions, `Apply' means an administrator running
> +./mg-schema-update as osstest; if `wait for executions of old code to
> +finish is needed', this will usually involve passing an appropriate
> +`-o' option.
> +
> +
> +Statuses and rules for push and db update
> +-----------------------------------------
> +
> +  Harmless
> +  Preparatory
> +     No restrictions
> +
> +  Unfinished
> +  (sql fragment entirely missing is equivalent to Unfinished)
> +     Schema update: prevented

In the case of "entirely missing" "prevented" must really mean "there can't
possibly be anything to do/prevent"?

> +     Code push: unrestricted
> +
> +  Ready
> +     Schema update: need all live code to be Preparatory/Ready/Needed
> +     Code push: unrestricted
> +
> +  Needed
> +     Schema update: need all live code to be Preparatory/Ready/Needed
> +     Code push: depends on schema update
> +
> +
> +"Code push: depends on schema update" is not currently implemented.
> +However, many (most?) such changes would cause the push gate itself to
> +fail.
> +
> +"Need all live code to be ..." means to look for the status of this
> +schema update in other running versions of osstest.  An attempt at
> +this is provided in the form of the `-o' option to mg-schema-update.
> +It is the administrator's responsibility to select an appropriate
> +argument to `-o'.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Reply via email to