On 4/23/15 8:42 AM, Robert Haas wrote:
On Thu, Apr 23, 2015 at 4:19 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
We were talking about having an incremental backup map also. Which sounds a
lot like the freeze map.

Yeah, possibly.  I think we should try to set things up so that the
backup map can be updated asynchronously by a background worker, so
that we're not adding more work to the foreground path just for the
benefit of maintenance operations.  That might make the logic for
autovacuum to use it a little bit more complex, but it seems
manageable.

I'm not sure an actual map makes sense here. For incremental backups you need some kind of stream that tells you not only what changed but when it changed. A simple freeze map won't work for that, because the operation of freezing itself writes data (and the same can be true for the visibility map). Though if the backup utility were actually comparing live data to an existing backup, maybe this would work...

We only need a freeze/backup map for larger relations. So if we map 1000
blocks per map page, we skip having a map at all when the relation is smaller
than 1000 blocks.

Agreed.  We might also want to map multiple blocks per map slot - e.g.
one slot per 32 blocks.  That would keep the map quite small even for
very large relations, and would not compromise efficiency that much
since reading 256kB sequentially probably takes only a little longer
than reading 8kB.
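To illustrate why coarser slots cost so little: a quick back-of-the-envelope sketch (the seek time and throughput figures below are assumptions for illustration, not measurements) shows that on spinning disks the seek dominates, so reading a 32-block (256kB) range is only modestly slower than reading one 8kB block.

```c
#include <assert.h>

/* Illustrative model only: assume an 8 ms seek and 150 MB/s sequential
 * throughput. Returns the estimated time in milliseconds to position and
 * read the given number of kilobytes. */
static double read_time_ms(double kilobytes)
{
    const double seek_ms = 8.0;     /* assumed average seek + rotational delay */
    const double mb_per_s = 150.0;  /* assumed sequential throughput */

    return seek_ms + (kilobytes / 1024.0) / mb_per_s * 1000.0;
}
```

Under those assumptions, reading 256kB takes under 1.3x as long as reading 8kB, since nearly all the time is the seek.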

The problem with mapping a range of pages per bit is the locking involved when you set the bit. Currently that's easy because we're holding the cleanup lock on the page, but you can't do that across a whole range of pages. Though if each 'slot' weren't a simple binary value, we could have a third state indicating that we're in the process of marking that slot all-visible/frozen; readers would still need to treat such a slot as cleared.
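The three-state slot idea could be encoded with two bits per slot. A minimal sketch, assuming a hypothetical layout (the names `slot_state`, `get_slot`, etc. are made up for illustration, not actual PostgreSQL code); the key invariant is that readers must treat an in-progress slot exactly like a cleared one:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 2-bit slot states; each slot covers a range of heap pages. */
enum slot_state
{
    SLOT_CLEAR = 0,         /* range not known all-visible/frozen */
    SLOT_IN_PROGRESS = 1,   /* being marked; readers must treat as clear */
    SLOT_SET = 2            /* entire range is all-visible/frozen */
};

#define SLOTS_PER_BYTE 4    /* 2 bits per slot */

static enum slot_state get_slot(const uint8_t *map, int slot)
{
    int shift = (slot % SLOTS_PER_BYTE) * 2;

    return (enum slot_state) ((map[slot / SLOTS_PER_BYTE] >> shift) & 0x3);
}

static void set_slot(uint8_t *map, int slot, enum slot_state st)
{
    int      shift = (slot % SLOTS_PER_BYTE) * 2;
    uint8_t *byte = &map[slot / SLOTS_PER_BYTE];

    *byte = (uint8_t) ((*byte & ~(0x3 << shift)) | ((int) st << shift));
}

/* Readers only skip a range when the slot is fully SET; IN_PROGRESS
 * counts as cleared, which is the invariant described above. */
static int slot_usable(const uint8_t *map, int slot)
{
    return get_slot(map, slot) == SLOT_SET;
}
```

The setter would transition CLEAR -> IN_PROGRESS before scanning the range and IN_PROGRESS -> SET only after every page in the range has been processed.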

Honestly though, I think concerns about the size of the map are a bit overblown. Even if we double its size, it's still 32,000 times smaller than the heap with 8k pages. I suspect that if you have tables large enough to care, you'll also be using 32k pages, which would make the map 128,000 times smaller than the heap. I have a hard time believing that's going to be even a faint blip on the performance radar.
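The arithmetic behind those ratios, as a sketch (assuming 2 bits of map per heap page, i.e. the "doubled" map):

```c
#include <assert.h>

/* How many times smaller the map is than the heap: each heap page of
 * page_size_bytes is summarized by bits_per_page bits in the map. */
static long map_shrink_factor(long page_size_bytes, int bits_per_page)
{
    return page_size_bytes * 8L / bits_per_page;
}
```

With 8k pages and 2 bits per page that gives 32768 (the ~32,000 figure above); with 32k pages it gives 131072 (~128,000).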
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers