Heikki Linnakangas wrote:
Tom Lane wrote:
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
(this won't come as a surprise as we talked about this at PGCon, but) I think we should rather convert the page structure to the new format in ReadBuffer the first time a page is read in. That would keep the changes a lot more isolated.
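
A minimal sketch of the convert-on-first-read idea, as a toy model rather than actual bufmgr code. Everything here is an assumption made for illustration: the ToyPage layout, the version numbers, and the toy_read_page and toy_convert_page names are hypothetical and do not correspond to PostgreSQL structures or functions. The only point is that the version check and the conversion live in a single read path, so callers never see an old-format page.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE   64   /* hypothetical; real pages are far larger */
#define TOY_OLD_VERSION 3    /* hypothetical layout version numbers */
#define TOY_NEW_VERSION 4

/* Toy page: one version byte followed by payload. Not PostgreSQL's page header. */
typedef struct ToyPage {
    uint8_t version;
    uint8_t payload[TOY_PAGE_SIZE - 1];
} ToyPage;

/*
 * Hypothetical conversion step. In this toy model it only bumps the
 * version byte; a real conversion would rewrite header and tuple layout.
 */
static void toy_convert_page(ToyPage *page)
{
    assert(page->version == TOY_OLD_VERSION);
    page->version = TOY_NEW_VERSION;
}

/*
 * Stand-in for the read path: copy the block from "disk" into the buffer
 * and convert it if it is still in the old format.
 * Returns 1 if a conversion happened, 0 if the page was already current.
 */
static int toy_read_page(const ToyPage *on_disk, ToyPage *buffer)
{
    memcpy(buffer, on_disk, sizeof(ToyPage));
    if (buffer->version == TOY_NEW_VERSION)
        return 0;
    toy_convert_page(buffer);
    return 1;
}
```

Note that the on-disk copy stays in the old format until the buffer is written back, so a page that is only ever read gets converted again on each read in this model.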

The problem is that ReadBuffer is an extremely low-level environment,
and it's not clear that it's possible (let alone practical) to do a
conversion at that level in every case.

Well, we can't predict the future, and can't guarantee that it's possible or practical to do the things we need to do in the future no matter what approach we choose.

 In particular it hardly seems
sane to expect ReadBuffer to do tuple content conversion, which is going
to be practically impossible to perform without any catalog accesses.

ReadBuffer has access to Relation, which has information about what kind of a relation it's dealing with, and TupleDesc. That should get us pretty far. It would be a modularity violation, for sure, but I could live with that for the purpose of page version conversion.

But if you look at the hash index implementation, for example, some pages are not in the regular format, and converting them could need more information than ReadBuffer necessarily has.

Another issue is that it might not be possible to update a page for
lack of space.  Are we prepared to assume that there will never be a
transformation we need to apply that makes the data bigger?

We do need some solution to that. One idea is to run a pre-upgrade script in the old version that scans the database and moves tuples that would no longer fit on their pages in the new version. This could be run before the upgrade, while the old database is still running, so it would be acceptable for that to take some time.
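
The space check such a pre-upgrade script would need can be sketched as below. This is a toy model under stated assumptions: the ToyPageInfo and ToyGrowth structures, the function names, and the idea that the new format grows by a fixed number of bytes per page header and per tuple are all hypothetical, chosen only to show the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Toy description of one old-format page; all sizes in bytes (hypothetical model). */
typedef struct ToyPageInfo {
    size_t page_size;       /* total page size */
    size_t header_size;     /* page header size in the old format */
    size_t ntuples;         /* number of tuples on the page */
    size_t used_by_tuples;  /* bytes occupied by tuples and their line pointers */
} ToyPageInfo;

/* Assumed growth introduced by the new format (hypothetical). */
typedef struct ToyGrowth {
    size_t header_growth;     /* extra bytes per page header */
    size_t per_tuple_growth;  /* extra bytes per tuple */
} ToyGrowth;

/* How many bytes the page would be short after conversion; 0 if it still fits. */
static size_t toy_bytes_short(const ToyPageInfo *p, const ToyGrowth *g)
{
    /* The old-format page must be valid to begin with. */
    assert(p->header_size + p->used_by_tuples <= p->page_size);

    size_t needed = p->header_size + g->header_growth
                  + p->used_by_tuples + p->ntuples * g->per_tuple_growth;
    return needed > p->page_size ? needed - p->page_size : 0;
}

/* Count the pages that need tuples moved off before the upgrade. */
static size_t toy_count_pages_needing_space(const ToyPageInfo *pages,
                                            size_t npages,
                                            const ToyGrowth *g)
{
    size_t count = 0;
    for (size_t i = 0; i < npages; i++) {
        if (toy_bytes_short(&pages[i], g) > 0)
            count++;
    }
    return count;
}
```

For example, with an assumed growth of 4 bytes per header and 2 bytes per tuple, an 8192-byte page with a 24-byte header and 100 tuples using 8000 bytes would need 8228 bytes, so it comes up 36 bytes short and at least that much would have to be moved elsewhere first.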

That would not work for indexes, and do not forget TOAST chunks. I think in some cases you could end up with a quarter of each page in a TOAST table unused.

No doubt people would prefer something better than that. Another idea would be to have some over-sized buffers that can be used as the target of conversion, until some tuples are moved off to another page. Perhaps the over-sized buffer wouldn't need to be in shared memory, if they're read-only until some tuples are moved.

Anyway, you need a mechanism to mark such a page as read-only, which would also require a lot of modification, and some mechanism to decide when the page gets converted. I guess this approach would require modifications similar to those for convert-on-write.

This is pretty hand-wavy, I know. The point is, I don't think these problems are insurmountable.

 (Likely counterexample: adding collation info to text values.)

I doubt it, as collation is not a property of text values, but operations. But that's off-topic...

Yes, it is off-topic; however, I think Tom is right :-).

                Zdenek




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
