On Tue, Feb 11, 2014 at 11:28:36AM -0500, Robert Haas wrote:
> A sequential scan will set hint bits and will prune the page, but
> pruning the page doesn't ever mark it all-visible; that logic is
> entirely in vacuum.  If that could be made cheap enough to be
> negligible, it might well be worth doing in heap_page_prune().  I
> think there might be a way to do that, but it's a bit tricky because
> the pruning logic iterates over the page in a somewhat complex way,
> not just a straightforward scan of all the item pointers the way the
> existing logic doesn't.  It would be pretty cool if we could just use
> a bit out of the heap-prune xlog record to indicate whether the
> all-visible bit should be set; then we'd gain the benefit of marking
> things all-visible much more often without needing vacuum.
> 
> That doesn't help insert-only tables much, though, because those won't
> require pruning.  We set hint bits (which dirties the page) but
> currently don't write WAL.  We'd have to change that to set the
> all-visible bit when scanning such a table, and that would be
> expensive.  :-(

Yes, that pretty much sums it up.  We introduced index-only scans in 9.2
(2012) but they still seem to be not usable for insert-only workloads
two years later.  Based on current progress, it doesn't look like this
will be corrected until 9.5 (2015).  I am kind of confused why this has
not generated more urgency.

I guess my question is what approach do we want to take to fixing this? 
If we are doing pruning, aren't we emitting WAL?  You are right that for
an insert-only workload, we aren't going to prune, but if pruning WAL
overhead is acceptable for a sequential scan, isn't index-only
page-all-visible WAL overhead acceptable?

Do we want to track the number of inserts in statistics and trigger an
auto-vacuum after a specified number of inserts?  The problem there is
that we really don't need to do any index cleanup, which is what vacuum
typically does --- we just want to scan the table and set the
all-visible bits, so that approach seems non-optimal.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to