Good ideas.  I think the master solution is to hook the statistics
daemon information into an automatic vacuum that could _know_ which
tables need attention.

---------------------------------------------------------------------------

Curtis Faith wrote:
> tom lane wrote:
> > Sure, it's just shuffling the housekeeping work from one place to
> > another.  The thing that I like about Postgres' approach is that we
> > put the housekeeping in a background task (VACUUM) rather than in the
> > critical path of foreground transaction commit.
> 
> Thinking with my marketing hat on, MVCC would be a much bigger win if VACUUM
> was not required (or was done automagically). The need for periodic VACUUM
> just gives ammunition to the PostgreSQL opponents who can claim we are
> deferring work but that it amounts to the same thing.
> 
> A fully automatic background VACUUM will significantly reduce but will not
> eliminate this perceived weakness.
> 
> However, it always seemed to me there should be some way to reuse the space
> more dynamically and quickly than a background VACUUM thereby reducing the
> percentage of tuples that are expired in heavy update cases. If only a very
> tiny number of tuples on the disk are expired this will reduce the aggregate
> performance/space penalty of MVCC into insignificance for the majority of
> uses.
> 
> Couldn't we reuse tuple and index space as soon as there are no transactions
> that depend on the old tuple or index values. I have imagined that this was
> always part of the long-term master plan.
> 
> Couldn't we keep a list of dead tuples in shared memory and look in the list
> first when deciding where to place new values for inserts or updates so we
> don't have to rely on VACUUM (even a background one)? If there are expired
> tuple slots in the list these would be used before allocating a new slot from
> the tuple heap.
> 
> The only issue is determining the lowest transaction ID for in-process
> transactions which seems relatively easy to do (if it's not already done
> somewhere).
> 
> In the normal shutdown and startup case, a tuple VACUUM could be performed
> automatically. This would normally be very fast since there would not be many
> tuples in the list.
> 
> Index slots would be handled differently since these cannot be substituted
> one for another. However, these could be recovered as part of every index
> page update. Pages would be scanned before being written and any expired
> slots that had transaction ID's lower than the lowest active slot would be
> removed. This could be done for non-leaf pages as well and would result in
> only reorganizing a page that is already going to be written thereby not
> adding much to the overall work.
> 
> I don't think that internal pages that contain pointers to values in nodes
> further down the tree that are no longer in the leaf nodes because of this
> partial expired entry elimination will cause a problem since searches and
> scans will still work fine.
> 
> Does VACUUM do something that could not be handled in this realtime manner?
> 
> - Curtis
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
> 

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

Reply via email to