On 2018-04-06 17:59:28 -0700, Andres Freund wrote:
> +     /*
> +      * Create a database list.  We don't need to concern ourselves with
> +      * rebuilding this list during runtime since any database created after
> +      * this process started will be running with checksums turned on from the
> +      * start.
> +      */
> 
> Why is this true? What if somebody runs CREATE DATABASE while the
> launcher / worker are processing a different database? It'll copy the
> template database on the filesystem level, and it very well might not
> yet have checksums set?  Afaict the second time we go through this list
> that's not cought.

*caught

It's indeed trivial to reproduce this. Just slowing down a checksum run
and copying the database yields:
./pg_verify_checksums -D /srv/dev/pgdev-dev
pg_verify_checksums: checksum verification failed in file "/srv/dev/pgdev-dev/base/16385/2703", block 0: calculated checksum 45A7 but expected 0
pg_verify_checksums: checksum verification failed in file "/srv/dev/pgdev-dev/base/16385/2703", block 1: calculated checksum 8C7D but expected 0
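
To spell out the race, here's a toy model in plain Python. It is not
PostgreSQL code: the dict, the function names and the hook are all made up
for illustration, and "checksummed" is reduced to a boolean per database.
It only assumes what the quoted comment describes, namely that the database
list is built once at startup.

```python
# Toy model of the race; every name here is hypothetical.
# databases maps a database name to True once all its pages carry checksums.

def run_launcher(databases, on_processed=None):
    """Process a database list that is built once, at launcher start."""
    snapshot = list(databases)        # the list is never rebuilt afterwards
    for name in snapshot:
        databases[name] = True        # the worker checksums this database
        if on_processed is not None:
            on_processed(name)        # hook to simulate concurrent activity
    return snapshot


databases = {"postgres": False, "template1": False}


def concurrent_create_database(just_processed):
    # Simulate CREATE DATABASE running after "postgres" was processed but
    # before "template1" was: the copy inherits the template's state.
    if just_processed == "postgres":
        databases["newdb"] = databases["template1"]


run_launcher(databases, on_processed=concurrent_create_database)
missing = [name for name, done in databases.items() if not done]
print(missing)
```

Here newdb is copied from a template that hasn't been processed yet, and
since it's not in the snapshot nothing ever visits it.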



Further complaints:

The new isolation test cannot be re-run on an existing cluster. That's
because the first test expects checksums to be disabled. As even
remarked upon:
# The checksum_enable suite will enable checksums for the cluster so should
# not run before anything expecting the cluster to have checksums turned off

How's that ok? You can leave database-wide objects around, but the
cluster-wide stuff needs to be cleaned up.


The tests don't actually make sure that no checksum launcher / apply is
running anymore. They just assume that it's gone once the GUC shows
checksums have been set.  If you wanted to make the tests stable, you'd
need to wait for that to show true *and* then check that no workers are
around anymore.
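
Roughly, the wait would have to look like the sketch below. It's plain
Python with the two probes passed in as callables, so nothing here is an
existing test-suite API: get_guc is assumed to return the current value of
the checksum GUC as a string, and count_checksum_procs is assumed to return
the number of launcher / worker processes still alive (e.g. from a query
against pg_stat_activity, which I'm not spelling out here).

```python
import time


def wait_until_checksums_settled(get_guc, count_checksum_procs,
                                 timeout=60.0, interval=0.1,
                                 sleep=time.sleep, clock=time.monotonic):
    """Return True once the GUC reads "on" AND no launcher/worker is left.

    Both probes are hypothetical callables supplied by the caller.
    Returns False if the timeout expires first.
    """
    deadline = clock() + timeout
    while True:
        # Both conditions must hold at the same time; the GUC flipping to
        # "on" alone does not prove the background processes have exited.
        if get_guc() == "on" and count_checksum_procs() == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The point is just that the test proceeds only when both conditions hold,
instead of returning as soon as the GUC flips.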


If it's not obvious: This isn't ready, should be reverted, cleaned up,
and re-submitted for v12.


Greetings,

Andres Freund
