I found a nice paper describing a few free space management algorithms:

M. L. McAuliffe, M. J. Carey and M. H. Solomon, Towards Effective and Efficient Free Space Management, Proceedings of the ACM SIGMOD, Jun. 1996, pages 389--400. http://citeseer.ist.psu.edu/mcauliffe96towards.html

The basic data structure used by all of the discussed algorithms is essentially a bitmap, with a few bits per each heap page, indicating the amount of free space on each page. Just like Tom suggested (http://archives.postgresql.org/pgsql-hackers/2007-11/msg00204.php). The paper calls this the Free Space Inventory Page, or FSIP.

The problem is efficiently finding a page with X bytes from the FSIP. The algorithms surveyed in that paper aim to solve that problem, and they're all pretty simple. The general trick is to cache some of the information in the FSIP, so that you don't always have to linearly scan it.

Another nice property of the FSIP is that you can quickly get a summary of the distribution of the free space, and sum of free space and utilization %.

I still feel the FSM should be in a file of it's own, rather than distributed on every nth heap page. It makes scanning it quicker, because it's sequential rather than random access, we're going to need a solution for indexes as well, and using the special area of heap pages would make the locking quite tricky.

I think we can, however, mix the visibility map with the FSM. The visibility map would be spread over many more pages that way, which might affect scan performance. But it'd also ease the potential lock contention of updates, and you could then update the FSM and the visibility map in one operation.

Just thinking ahead...

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

              http://www.postgresql.org/docs/faq

Reply via email to