On 08/04/2015 08:03 AM, Jeff Janes wrote:
> For a GIN index with fastupdate turned on, both user backends and the
> autoanalyze routine will clear out the pending list, pushing the entries
> into the normal index structure and deleting the pages used by the pending
> list.  But those deleted pages will not get added to the free space map
> until a vacuum is done.  This leads to horrible bloat on insert-only
> tables, which are never vacuumed, so the pending list space is never
> reused.  And the pending list is very inefficient in space usage to start
> with, even compared to the old-style posting lists and especially compared
> to the new compressed ones.  (If the pages were aggressively recycled, this
> inefficient use wouldn't be much of a problem.)

Good point.

> The attached proof-of-concept patch greatly reduces the bloat for both the
> insert and the update cases.  You need to turn on both features, adding the
> pages to the FSM and vacuuming the FSM, to get the benefit (so JJ_GIN=3).
> The first of those two things could probably be adopted for real, but the
> second probably is not acceptable.  What is the right way to do this?
> Could a variant of RecordFreeIndexPage bubble the free space up the map
> immediately rather than waiting for a vacuum?  It would only have to move
> up until it found a page with free space already recorded in it, which the
> vast majority of the time would mean looking up one level and then not
> writing to it, assuming the pending list pages remain well clustered.

Yep, that sounds good.

- Heikki
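
To make the "bubble up until free space is already recorded" idea concrete,
here is a minimal toy sketch in Python.  It is not PostgreSQL code and does
not reflect the real FSM page layout; the class ToyFreeSpaceTree and its
methods are hypothetical names invented for illustration.  It assumes a
simple binary max-tree in which each internal node stores the larger of its
two children, and it only handles the case discussed above, where a page
becomes free (its recorded value goes up).

```python
class ToyFreeSpaceTree:
    """Toy max-tree: each internal node stores the max of its two children.

    Hypothetical illustration only; not the PostgreSQL FSM structure.
    """

    def __init__(self, n_leaves):
        # n_leaves must be a power of two for this simple array layout.
        assert n_leaves > 0 and n_leaves & (n_leaves - 1) == 0
        self.n_leaves = n_leaves
        # Index 0 is unused; 1..n_leaves-1 are internal nodes;
        # n_leaves..2*n_leaves-1 are the leaves (one per page).
        self.nodes = [0] * (2 * n_leaves)

    def record_free(self, page, avail):
        """Record 'avail' for 'page' and bubble it upward.

        Stops at the first ancestor that already advertises at least
        'avail'.  Returns the number of ancestor nodes written.
        Only increases are handled, which is all a freed page needs.
        """
        i = self.n_leaves + page
        self.nodes[i] = avail
        writes = 0
        i //= 2
        while i >= 1:
            if self.nodes[i] >= avail:
                break  # free space already recorded here; nothing to write
            self.nodes[i] = avail
            writes += 1
            i //= 2
        return writes

    def root(self):
        """Largest free-space value visible from the top of the tree."""
        return self.nodes[1]
```

With an empty 8-page tree, record_free(4, 255) writes all three ancestors,
but a following record_free(5, 255) on the neighbouring page finds its
parent already marked and writes nothing.  That is the "look up one level
and then not write" case that well-clustered pending list pages would hit
most of the time.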


