On 11/12/12 3:44 AM, Markus Wanner wrote:
> Sorry if that has been discussed before, but can't we do without that
> bit at all? It adds a checksum switch to each page, where we just agreed
> we don't even want a per-database switch.

Once you accept that eventually there need to be online conversion tools, there also needs to be some easy way to distinguish which pages have already been processed, for several of the potential implementations. The options seem to be adding some bits just for that, or bumping the page format. I would like to just bump the format, but that has its own pile of issues to work through. I'd rather not make that a prerequisite for this month's work.
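
To make the trade-off concrete: a per-page marker would presumably land in the pd_flags bits of the page header, while a format bump means a new pd_pagesize_version value. Both are already visible from SQL via the pageinspect extension, so a conversion tool (or a curious DBA) could check them directly. A rough illustration, with 'some_table' standing in for any relation:

    CREATE EXTENSION pageinspect;
    -- flags holds the per-page header bits, version is the page format number
    SELECT flags, version
      FROM page_header(get_raw_page('some_table', 0));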

> Can we simply write a progress indicator to pg_control or someplace
> saying that all pages up to X of relation Y are supposed to have valid
> checksums?

All of the table-based checksum enabling ideas seem destined to add metadata to pg_class, or to something related to it, for this purpose. While I think everyone agrees that this is a secondary priority to getting basic cluster-level checksums going right now, I'd like to have at least a prototype of it before 9.3 development ends.
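
Just to sketch the shape of it--and this is purely hypothetical, no such column exists today--per-relation tracking might boil down to a flag in pg_class that the conversion tool flips once a relation has been fully processed, say relhaschecksums:

    -- Hypothetical only: relhaschecksums is not a real pg_class column.
    SELECT oid::regclass AS relation
      FROM pg_class
     WHERE relkind IN ('r', 'i', 't')
       AND NOT relhaschecksums;

Something along those lines would give a conversion or progress-reporting tool a cheap way to find the relations still left to do.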

> I realize this doesn't support Jesper's use case of wanting to have the
> checksums only for newly dirtied pages. However, I'd argue that
> prolonging the migration to spread the load would allow even big shops
> to go through this without much of an impact on performance.

I'm thinking of this in some ways like the way creating a new (but not yet validated) foreign key works. Once that constraint is in place, new activity is protected immediately going forward. Eventually there's a cleanup step needed, and it's one you can inch forward over a few days.
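
For reference, the existing pattern looks like this (table and constraint names made up):

    -- New and updated rows are checked against the constraint right away:
    ALTER TABLE orders
      ADD CONSTRAINT orders_customer_fk
      FOREIGN KEY (customer_id) REFERENCES customers (id)
      NOT VALID;

    -- The expensive full-table check can be run later, whenever convenient:
    ALTER TABLE orders VALIDATE CONSTRAINT orders_customer_fk;

The checksum case would ideally follow the same split: protect new writes immediately, and let the scan over old data happen separately, at whatever pace the site can afford.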

The main upper limit on load spreading here is that the conversion program may need to grab a snapshot. In that case a conversion that runs too long becomes a problem, because holding a snapshot open keeps vacuum from cleaning up anything newer than that point. This is why I think any good solution to this problem needs to incorporate restartable conversion. We were just getting complaints recently about how losing a CREATE INDEX CONCURRENTLY session means the whole build is gone and has to be started over. The way autovacuum runs right now, it can be stopped and restarted later with only a small amount of duplicated work in many common cases. If the checksum conversion can keep that property, it would be very helpful to larger sites. It doesn't matter if adding checksums to the old data takes a week with the load throttled down, so long as you're not forced to hold an open snapshot the whole time.
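
To make "restartable" concrete, here's one hypothetical shape for it--none of this exists, and the pg_checksum_block_range() helper in particular is invented purely for illustration. The point is only that progress is persisted per relation, so a crash or a deliberate stop loses at most one batch, and no snapshot has to be held across the whole run:

    -- Hypothetical sketch: track how far each relation's conversion has gotten.
    CREATE TABLE checksum_progress (
        relid      oid PRIMARY KEY,
        next_blkno bigint NOT NULL DEFAULT 0
    );

    -- One small, throttled batch; repeat (e.g. from a cron job) until done.
    -- pg_checksum_block_range() is an invented function that would checksum
    -- blocks [next_blkno, next_blkno + 1000) and return the new position.
    UPDATE checksum_progress
       SET next_blkno = pg_checksum_block_range(relid, next_blkno, next_blkno + 1000)
     WHERE relid = 'some_table'::regclass;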

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

