On Nov 6, 2008, at 1:31 PM, Bruce Momjian wrote:
3. What about multi-release upgrades?  Say someone wants to upgrade
from 8.3 to 8.6.  8.6 only knows how to read pages that are
8.5-and-a-half or better, 8.5 only knows how to read pages that are
8.4-and-a-half or better, and 8.4 only knows how to read pages that
are 8.3-and-a-half or better.  So the user will have to upgrade to
8.3.MAX, then 8.4.MAX, then 8.5.MAX, and then 8.6.

Yes.


I think that's pretty seriously un-desirable. It's not at all uncommon for databases to stick around for a very long time and then jump ahead many versions. I don't think we want to tell people they can't do that.

More importantly, I think we're barking up the wrong tree by putting migration knowledge into old versions. All that the old versions need to do is guarantee a specific amount of free space per page. We should provide a mechanism to tell a cluster what that free space requirement is, and not hard-code it into the backend.

Unless I'm mistaken, there are only two cases we care about for additional space: per-page and per-tuple. Those requirements could also vary for different types of pg_class objects. What we need is an API that allows an administrator to tell the database to start setting this space aside. One possibility:

pg_min_free_space( version, relkind, bytes_per_page, bytes_per_tuple );
pg_min_free_space_index( version, indexkind, bytes_per_page, bytes_per_tuple );

version: This would be provided as a safety mechanism. You would have to provide the major version that matches what the backend is running. See below for an example.

relkind: Essentially, heap vs toast, though I suppose it's possible we might need this for sequences.

indexkind: Because we support different types of indexes, I think we need to handle them differently than heap/toast. If we wanted, we could have a single function that demands that indexkind is NULL if relkind != 'index'.

bytes_per_(page|tuple): obvious. :)


Once we have an API, we need to get users to make use of it. I'm thinking add something like the following to the release notes:

"To upgrade from a prior version to 8.4, you will need to run some of the following commands, depending on what version you are currently using:

For version 8.3:
SELECT pg_min_free_space( '8.3', 'heap', 4, 12 );
SELECT pg_min_free_space( '8.3', 'toast', 4, 12 );

For version 8.2:
SELECT pg_min_free_space( '8.2', 'heap', 14, 12 );
SELECT pg_min_free_space( '8.2', 'toast', 14, 12 );
SELECT pg_min_free_space_index( '8.2', 'b-tree', 4, 4);"

(Note I'm just pulling numbers out of thin air in this example.)

As you can see, we pass in the version number to ensure that if someone accidentally cut and pastes the wrong stuff they know what they did wrong immediately.

One downside to this scheme is that it doesn't provide a mechanism to ensure that all required minimum free space requirements were passed in. Perhaps we want a function that takes an array of complex types and forces you to supply information for all known storage mechanisms. Another possibility would be to pass in some kind of binary format that contains a checksum.

Even if we do come up with a pretty fool-proof way to tell the old version what free space it needs to set aside, I think we should still have a mechanism for the new version to know exactly what the old version has set aside, and if it's actually been accomplished or not. One option that comes to mind is to add min_free_space_per_page and min_free_space_per_tuple to pg_class. Normally these fields would be NULL; the old version would only set them once it had verified that all pages in a given relation met those requirements (presumably via vacuum). The new version would check all these values on startup to ensure they made sense.

OTOH, we might not want to go mucking around with changing the catalog for older versions (I'm not even sure if we can). So perhaps it would be better to store this information in a separate table, or maybe a separate file. That might be best anyway; we generally wouldn't need this information, so it would be nice if it wasn't bloating pg_class all the time.
--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to