Re: [HACKERS] On-the-fly index tuple deletion vs. hot_standby

Simon Riggs Mon, 13 Jun 2011 08:55:15 -0700

On Thu, Jun 9, 2011 at 10:38 PM, Noah Misch <[email protected]> wrote:
> On Fri, Apr 22, 2011 at 11:10:34AM -0400, Noah Misch wrote:
>> On Tue, Mar 15, 2011 at 10:22:59PM -0400, Noah Misch wrote:
>> > On Mon, Mar 14, 2011 at 01:56:22PM +0200, Heikki Linnakangas wrote:
>> > > On 12.03.2011 12:40, Noah Misch wrote:
>> > >> The installation that inspired my original report recently upgraded
>> > >> from 9.0.1 to 9.0.3, and your fix did significantly decrease its
>> > >> conflict frequency.  The last several conflicts I have captured
>> > >> involve XLOG_BTREE_REUSE_PAGE records.  (FWIW, the index has generally
>> > >> been pg_attribute_relid_attnam_index.)  I've attached a test script
>> > >> demonstrating the behavior.  _bt_page_recyclable approves any page
>> > >> deleted no more recently than RecentXmin, because we need only ensure
>> > >> that every ongoing scan has witnessed the page as dead.  For the hot
>> > >> standby case, we need to account for possibly-ongoing standby
>> > >> transactions.  Using RecentGlobalXmin covers that, albeit with some
>> > >> pessimism: we really only need
>> > >> LEAST(RecentXmin, PGPROC->xmin of walsender_1, .., PGPROC->xmin of
>> > >> walsender_N) - vacuum_defer_cleanup_age.  Not sure the accounting to
>> > >> achieve that would pay off, though.  Thoughts?
>> > >
>> > > Hmm, instead of bloating the master, I wonder if we could detect more
>> > > accurately if there are any on-going scans, in the standby. For example,
>> > > you must hold a lock on the index to scan it, so only transactions
>> > > holding the lock need to be checked for conflict.
>> >
>> > That would be nice.  Do you have an outline of an implementation in mind?
>>
>> In an attempt to resuscitate this thread, here's my own shot at that.
>> Apologies in advance if it's just an already-burning straw man.
> [full proposal at 
> http://archives.postgresql.org/message-id/[email protected]]
>
> Anyone care to comment?  On this system, which has vacuum_defer_cleanup_age
> set to 3 peak hours worth of xid consumption, the problem caps recovery
> conflict hold off at 10-20 minutes.  It will have the same effect on standby
> feedback in 9.1.  I think we should start by using RecentGlobalXmin instead
> of RecentXmin as the reuse horizon when wal_level = hot_standby, and
> backpatch that.  Then, independently consider for master a bloat-avoidance
> improvement like I outlined most recently; I'm not sure whether that's worth
> it.  In any event, I'm hoping to get some consensus on the way forward.
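
For reference, the reuse-horizon arithmetic quoted above, LEAST(RecentXmin,
walsender xmins) - vacuum_defer_cleanup_age, can be sketched in standalone C.
This is only an illustration under stated assumptions, not PostgreSQL source:
the function names (xid_precedes, reuse_horizon, page_recyclable) are
hypothetical, xids are modeled as plain 32-bit counters compared modulo 2^32,
and the special handling of reserved or invalid xids is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a transaction id: a 32-bit counter that wraps around. */
typedef uint32_t TransactionId;

/* Circular comparison: true if a is logically older than b (modulo 2^32). */
static bool xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t)(a - b) < 0;
}

/* True if a is logically older than, or equal to, b. */
static bool xid_precedes_or_equals(TransactionId a, TransactionId b)
{
    return (int32_t)(a - b) <= 0;
}

/*
 * Hypothetical helper: take the oldest of recent_xmin and the xmins
 * advertised by n walsenders, then step back by defer_age xids.
 * Assumption: every entry is a valid, normal xid; a real implementation
 * would also have to skip unset entries and reserved xid values.
 */
static TransactionId reuse_horizon(TransactionId recent_xmin,
                                   const TransactionId *walsender_xmins,
                                   size_t n, uint32_t defer_age)
{
    assert(walsender_xmins != NULL || n == 0);

    TransactionId horizon = recent_xmin;
    for (size_t i = 0; i < n; i++)
    {
        if (xid_precedes(walsender_xmins[i], horizon))
            horizon = walsender_xmins[i];
    }
    return horizon - defer_age;     /* unsigned subtraction wraps modulo 2^32 */
}

/*
 * A deleted page may be reused once it was deleted "no more recently than"
 * the horizon, i.e. its deletion xid precedes or equals the horizon.
 */
static bool page_recyclable(TransactionId page_deleted_xid,
                            TransactionId horizon)
{
    return xid_precedes_or_equals(page_deleted_xid, horizon);
}
```

With recent_xmin = 1000, walsender xmins of 950 and 980, and a deferral of 100
xids, the sketch yields a horizon of 850, so a page deleted at xid 850 would be
reusable and one deleted at xid 851 would not. Passing RecentGlobalXmin as the
horizon instead is the simpler, more pessimistic choice proposed above.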


I like your ideas.

(Also, I note that using xids in this way unnecessarily keeps bloat
around for a long time, if we have periods of mostly read-only
activity. Interesting point.)

I think we would only get away with this approach on leaf pages of the
index. It doesn't seem worth trying for the locks if we were higher
up.

On the standby side, it's possible this could generate additional
buffer pin deadlocks and/or contention. So I would also want to look
at some deferral mechanism, so that we can mark the block removed, but
not actually do so until some time later, or we really need to, for
example when we write new data to that page.

Got time for a patch in this coming CF?

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers