Re: [HACKERS] SSI freezing bug

2013-10-08 Thread Kevin Grittner
Heikki Linnakangas hlinnakan...@vmware.com wrote:

 A comment somewhere would be nice to explain why we're no longer worried
 about confusing an old tuple version with a new tuple in the same slot.
 Not sure where.

For now I counted on the commit message.  Perhaps we also want to
add something to the README file?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI freezing bug

2013-10-08 Thread Kevin Grittner
Heikki Linnakangas hlinnakan...@vmware.com wrote:
 On 07.10.2013 23:45, Heikki Linnakangas wrote:

 To fix the bug that Hannu pointed out, we also need to apply these fixes:

 http://www.postgresql.org/message-id/52440266.5040...@vmware.com

 Per a chat with Bruce, I'm going to apply that patch now to get it into
 the minor releases.

Please do.  (Somehow I had it in my head that you already had done so.)

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Heikki Linnakangas

On 06.10.2013 20:34, Kevin Grittner wrote:

Note this comment, which I think was written by Heikki back when
there was a lot more benchmarking and profiling of SSI going on:

   * Because a particular target might become obsolete, due to update to a new
   * version, before the reading transaction is obsolete, we need some way to
   * prevent errors from reuse of a tuple ID.  Rather than attempting to clean
   * up the targets as the related tuples are pruned or vacuumed, we check the
   * xmin on access.  This should be far less costly.

Based on what this patch looks like, I'm afraid he may have been
right when he wrote that.  In any event, after the exercise of
developing a first draft of searching for predicate locks to clean
up every time a tuple is pruned or vacuumed, I continue to feel
strongly that the previously-posted patch, which only takes action
when tuples are frozen by a vacuum process, is much more suitable
for backpatching.  I don't think we should switch to anything
resembling the attached without some careful benchmarking.


Hmm, you're probably right. I was thinking that the overhead of scanning 
the lock hash on a regular vacuum is negligible, but I didn't consider 
pruning. It might be significant there.


I'd like to give this line of investigation some more thought:


On 04.10.2013 07:14, Dan Ports wrote:

On Thu, Oct 03, 2013 at 06:19:49AM -0700, Kevin Grittner wrote:

Heikki Linnakangas hlinnakan...@vmware.com wrote:

IMHO it would be better to remove xmin from the lock key, and vacuum
away the old predicate locks when the corresponding tuple is vacuumed.
The xmin field is only required to handle the case that a tuple is
vacuumed, and a new unrelated tuple is inserted to the same slot.
Removing the lock when the tuple is removed fixes that.


This seems definitely safe: we need the predicate locks to determine if
someone is modifying a tuple we read, and certainly if it's eligible
for vacuum nobody's going to be modifying that tuple anymore.


In fact, I cannot even come up with a situation where you would have a
problem if we just removed xmin from the key, even if we didn't vacuum
away old locks. I don't think the old lock can conflict with anything
that would see the new tuple that gets inserted in the same slot. I have
a feeling that you could probably prove that if you stare long enough at
the diagram of a dangerous structure and the properties required for a
conflict.


This would also be safe, in the sense that it's OK to flag a
conflict even if one doesn't exist. I'm not convinced that it isn't
possible to have false positives this way. I think it's possible for a
tuple to be vacuumed away and the ctid reused before the predicate
locks on it are eligible for cleanup. (In fact, isn't this what was
happening in the thread Kevin linked?)


Time to stare at the dangerous structure again:


SSI is based on the observation [2] that each snapshot isolation
anomaly corresponds to a cycle that contains a dangerous structure
of two adjacent rw-conflict edges:

        rw           rw
  Tin ------> Tpivot ------> Tout



Furthermore, Tout must commit first.

The following are true:

* A transaction can only hold a predicate lock on a tuple that's visible 
to it.


* A tuple that's visible to Tin or Tpivot cannot be vacuumed away until 
Tout commits.


When updating a tuple, CheckTargetForConflictsIn() only marks a conflict 
if the transaction holding the predicate lock overlapped with the 
updating transaction. And if a tuple is vacuumed away and the slot is 
reused, a transaction updating the new tuple cannot overlap with any 
transaction holding a lock on the old tuple. Otherwise the tuple 
wouldn't have been eligible for vacuuming in the first place.


So I don't think you can ever get a false conflict because of slot 
reuse. And if there's a hole in that thinking I can't see right now, the 
worst that will happen is some unnecessary conflicts, i.e. it's still 
correct. It surely can't be worse than upgrading the lock to a 
page-level lock, which would also create unnecessary conflicts.


Summary: IMHO we should just remove xmin from the predicate lock tag.

- Heikki




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Kevin Grittner
Heikki Linnakangas hlinnakan...@vmware.com wrote:

 So I don't think you can ever get a false conflict because of
 slot reuse.

I spent some time looking at this, and I now agree.

 And if there's a hole in that thinking I can't see right now, the
 worst that will happen is some unnecessary conflicts, i.e. it's
 still correct.

That is definitely true; no doubt about that part.

 Summary: IMHO we should just remove xmin from the predicate lock
 tag.

I spent some time trying to see how the vacuum could happen at a
point that could cause a false positive, and was unable to do so. 
Even if that is just a failure of imagination on my part, the above
argument that it doesn't produce incorrect results still holds.  I
think the fact that I couldn't find a sequence of events which
would trigger a false positive suggests it would be a rare case,
anyway.

I found the original bug report which led to the addition of xmin
to the tag, and it was because we were still carrying tuple locks
forward to new versions of those tuples at the time.  This was later
proven to be unnecessary, which simplified other code; we
apparently missed a trick in not removing xmin from the lock tag at
that point.  Since leaving it there has now been shown to *cause* a
bug, I'm inclined to agree that we should simply remove it, and
backpatch that.

Patch attached.  Any objections to applying that Real Soon Now? 
(When, exactly is the deadline to make today's minor release
cut-off?)

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

*** a/src/backend/storage/lmgr/predicate.c
--- b/src/backend/storage/lmgr/predicate.c
***
*** 156,164 
   *		PredicateLockTuple(Relation relation, HeapTuple tuple,
   *		Snapshot snapshot)
   *		PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
!  *			   BlockNumber newblkno);
   *		PredicateLockPageCombine(Relation relation, BlockNumber oldblkno,
!  * BlockNumber newblkno);
   *		TransferPredicateLocksToHeapRelation(Relation relation)
   *		ReleasePredicateLocks(bool isCommit)
   *
--- 156,164 
   *		PredicateLockTuple(Relation relation, HeapTuple tuple,
   *		Snapshot snapshot)
   *		PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
!  *			   BlockNumber newblkno)
   *		PredicateLockPageCombine(Relation relation, BlockNumber oldblkno,
!  * BlockNumber newblkno)
   *		TransferPredicateLocksToHeapRelation(Relation relation)
   *		ReleasePredicateLocks(bool isCommit)
   *
***
*** 381,387  static SHM_QUEUE *FinishedSerializableTransactions;
   * this entry, you can ensure that there's enough scratch space available for
   * inserting one entry in the hash table. This is an otherwise-invalid tag.
   */
! static const PREDICATELOCKTARGETTAG ScratchTargetTag = {0, 0, 0, 0, 0};
  static uint32 ScratchTargetTagHash;
  static int	ScratchPartitionLock;
  
--- 381,387 
   * this entry, you can ensure that there's enough scratch space available for
   * inserting one entry in the hash table. This is an otherwise-invalid tag.
   */
! static const PREDICATELOCKTARGETTAG ScratchTargetTag = {0, 0, 0, 0};
  static uint32 ScratchTargetTagHash;
  static int	ScratchPartitionLock;
  
***
*** 2492,2499  PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
  			}
  		}
  	}
- 	else
- 		targetxmin = InvalidTransactionId;
  
  	/*
  	 * Do quick-but-not-definitive test for a relation lock first.	This will
--- 2492,2497 
***
*** 2512,2519  PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
  	 relation->rd_node.dbNode,
  	 relation->rd_id,
  	 ItemPointerGetBlockNumber(tid),
! 	 ItemPointerGetOffsetNumber(tid),
! 	 targetxmin);
  	PredicateLockAcquire(tag);
  }
  
--- 2510,2516 
  	 relation->rd_node.dbNode,
  	 relation->rd_id,
  	 ItemPointerGetBlockNumber(tid),
! 	 ItemPointerGetOffsetNumber(tid));
  	PredicateLockAcquire(tag);
  }
  
***
*** 4283,4290  CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
  		 relation-rd_node.dbNode,
  		 relation-rd_id,
  		 ItemPointerGetBlockNumber((tuple->t_data->t_ctid)),
! 		ItemPointerGetOffsetNumber((tuple->t_data->t_ctid)),
! 	  HeapTupleHeaderGetXmin(tuple->t_data));
  		CheckTargetForConflictsIn(targettag);
  	}
  
--- 4280,4286 
  		 relation-rd_node.dbNode,
  		 relation-rd_id,
  		 ItemPointerGetBlockNumber((tuple->t_data->t_ctid)),
! 		ItemPointerGetOffsetNumber((tuple->t_data->t_ctid)));
  		CheckTargetForConflictsIn(targettag);
  	}
  
*** a/src/include/storage/predicate.h
--- b/src/include/storage/predicate.h
***
*** 52,57  extern void PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snap
--- 52,58 
  extern void PredicateLockPageSplit(Relation relation, BlockNumber oldblkno, BlockNumber 

Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote:

 Patch attached.  Any objections to applying that Real Soon Now?

Oh, without the new line in predicate.h.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Andres Freund
On 2013-10-07 06:44:19 -0700, Kevin Grittner wrote:
 Patch attached.  Any objections to applying that Real Soon Now? 
 (When, exactly is the deadline to make today's minor release
 cut-off?)

Maybe it's overly careful, but I personally slightly vote for applying
it after the back-branch releases. The current behaviour doesn't have any
harsh consequences, is mostly reproducible only in artificial scenarios,
and the logic here is complex enough that we might miss something.

A day just doesn't leave much time for noticing any issues.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Heikki Linnakangas

On 07.10.2013 16:58, Andres Freund wrote:

On 2013-10-07 06:44:19 -0700, Kevin Grittner wrote:

Patch attached.  Any objections to applying that Real Soon Now?
(When, exactly is the deadline to make today's minor release
cut-off?)


Maybe it's overly careful, but I personally slightly vote for applying
it after the back-branch releases. The current behaviour doesn't have any
harsh consequences, is mostly reproducible only in artificial scenarios,
and the logic here is complex enough that we might miss something.


Well, it's fairly harsh if the feature isn't working as advertised.


A day just doesn't leave much time for noticing any issues.


It's less than ideal, I agree, but it doesn't seem better to me to just 
leave the bug unfixed for another minor release. Even if we sit on this 
for another few months, I don't think we'll get any meaningful amount of 
extra testing on it.


- Heikki




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Heikki Linnakangas

On 07.10.2013 16:44, Kevin Grittner wrote:

Heikki Linnakangas hlinnakan...@vmware.com wrote:


So I don't think you can ever get a false conflict because of
slot reuse.


I spent some time looking at this, and I now agree.


And if there's a hole in that thinking I can't see right now, the
worst that will happen is some unnecessary conflicts, i.e. it's
still correct.


That is definitely true; no doubt about that part.


Summary: IMHO we should just remove xmin from the predicate lock
tag.


I spent some time trying to see how the vacuum could happen at a
point that could cause a false positive, and was unable to do so.
Even if that is just a failure of imagination on my part, the above
argument that it doesn't produce incorrect results still holds.  I
think the fact that I couldn't find a sequence of events which
would trigger a false positive suggests it would be a rare case,
anyway.

I found the original bug report which led to the addition of xmin
to the tag, and it was because we were still carrying tuple locks
forward to new versions of those tuples at the time.  This was later
proven to be unnecessary, which simplified other code; we
apparently missed a trick in not removing xmin from the lock tag at
that point.  Since leaving it there has now been shown to *cause* a
bug, I'm inclined to agree that we should simply remove it, and
backpatch that.

Patch attached.  Any objections to applying that Real Soon Now?
(When, exactly is the deadline to make today's minor release
cut-off?)


A comment somewhere would be nice to explain why we're no longer worried 
about confusing an old tuple version with a new tuple in the same slot. 
Not sure where.


- Heikki




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Andres Freund
On 2013-10-07 17:07:16 +0300, Heikki Linnakangas wrote:
 On 07.10.2013 16:58, Andres Freund wrote:
 On 2013-10-07 06:44:19 -0700, Kevin Grittner wrote:
 Patch attached.  Any objections to applying that Real Soon Now?
 (When, exactly is the deadline to make today's minor release
 cut-off?)
 
 Maybe it's overly careful, but I personally slightly vote for applying
 it after the back-branch releases. The current behaviour doesn't have any
 harsh consequences, is mostly reproducible only in artificial scenarios,
 and the logic here is complex enough that we might miss something.
 
 Well, it's fairly harsh if the feature isn't working as advertised.

Well, you need to have a predicate lock on a tuple that's getting frozen
(i.e., hasn't been written to for 50 million+ xids) and you need to have
an update during the time that predicate lock is held. That's not too
likely without explicit VACUUM FREEZEs interleaved there somewhere.

 A day just doesn't leave much time for noticing any issues.
 
 It's less than ideal, I agree, but it doesn't seem better to me to just
 leave the bug unfixed for another minor release. Even if we sit on this for
 another few months, I don't think we'll get any meaningful amount of extra
 testing on it.

Yea, maybe. Although HEAD does get some testing...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Kevin Grittner
Heikki Linnakangas hlinnakan...@vmware.com wrote:

 Well, it's fairly harsh if the feature isn't working as
 advertised.

Right -- there are people counting on serializable transactions to
avoid data corruption (creation of data which violates the business
rules which they believe are being enforced).  A tuple freeze at
the wrong moment could currently break that.  The patch allows the
guarantee to hold.  The fact that this bug is only seen if there is
a very-long-running transaction or if VACUUM FREEZE is run while
data-modifying transactions are active does not eliminate the fact
that without this patch data can be corrupted.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-10-07 06:44:19 -0700, Kevin Grittner wrote:

 Patch attached.  Any objections to applying that Real Soon Now?
 (When, exactly is the deadline to make today's minor release
 cut-off?)

 Maybe it's overly careful, but I personally slightly vote for applying
 it after the back-branch releases. The current behaviour doesn't have any
 harsh consequences, is mostly reproducible only in artificial scenarios,
 and the logic here is complex enough that we might miss something.

 A day just doesn't leave much time for noticing any issues.

I grant that the bug in existing production code is not likely to
get hit very often, but it is a bug; the new isolation test shows
the bug clearly and shows that the suggested patch fixes it.  What
tips the scales for me is that the only possible downside if we
missed something is an occasional false positive serialization
failure, which does not break correctness -- we try to minimize
those for performance reasons, but the algorithm allows them and
they currently do happen.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote:
 Andres Freund and...@2ndquadrant.com wrote:
 On 2013-10-07 06:44:19 -0700, Kevin Grittner wrote:

 Patch attached.  Any objections to applying that Real Soon Now?
 (When, exactly is the deadline to make today's minor release
 cut-off?)

 Maybe it's overly careful, but I personally slightly vote for
 applying it after the back-branch releases. The current behaviour
 doesn't have any harsh consequences, is mostly reproducible only in
 artificial scenarios, and the logic here is complex enough that we
 might miss something.

 A day just doesn't leave much time for noticing any issues.

 I grant that the bug in existing production code is not likely to
 get hit very often, but it is a bug; the new isolation test shows
 the bug clearly and shows that the suggested patch fixes it. 
 What tips the scales for me is that the only possible downside if
 we missed something is an occasional false positive serialization
 failure, which does not break correctness -- we try to minimize
 those for performance reasons, but the algorithm allows them and
 they currently do happen.

I am, of course, continuing to review this.

There might be a problem if someone applies this fix while any
prepared transactions are pending.  Still investigating the impact
and possible fixes.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote:

 There might be a problem if someone applies this fix while any
 prepared transactions are pending.  Still investigating the impact
 and possible fixes.

I found a one-line fix for this.  It tested without problem.

Pushed.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Dan Ports
On Mon, Oct 07, 2013 at 12:26:37PM +0300, Heikki Linnakangas wrote:
 When updating a tuple, CheckTargetForConflictsIn() only marks a
 conflict if the transaction holding the predicate lock overlapped
 with the updating transaction.

Ah, this is the bit I was forgetting. (I really ought to have
remembered that, but it's been a while...)

I think it's possible, then, to construct a scenario where a slot is
reused before predicate locks on the old tuple are eligible for
cleanup -- but those locks will never cause a conflict.

So I agree: it's correct to just remove the xmin from the key
unconditionally.

And this is also true:

 And if there's a hole in that thinking I can't see right now,
 the worst that will happen is some unnecessary conflicts, i.e. it's
 still correct. It surely can't be worse than upgrading the lock to a
 page-level lock, which would also create unnecessary conflicts.

Dan

-- 
Dan R. K. Ports    UW CSE    http://drkp.net/




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Heikki Linnakangas

On 07.10.2013 22:35, Kevin Grittner wrote:

Kevin Grittner kgri...@ymail.com wrote:


There might be a problem if someone applies this fix while any
prepared transactions are pending.  Still investigating the impact
and possible fixes.


I found a one-line fix for this.  It tested without problem.


Good catch.

To fix the bug that Hannu pointed out, we also need to apply these fixes:

http://www.postgresql.org/message-id/52440266.5040...@vmware.com

- Heikki




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Heikki Linnakangas

On 07.10.2013 23:45, Heikki Linnakangas wrote:

On 07.10.2013 22:35, Kevin Grittner wrote:

Kevin Grittner kgri...@ymail.com wrote:


There might be a problem if someone applies this fix while any
prepared transactions are pending. Still investigating the impact
and possible fixes.


I found a one-line fix for this. It tested without problem.


Good catch.

To fix the bug that Hannu pointed out, we also need to apply these fixes:

http://www.postgresql.org/message-id/52440266.5040...@vmware.com


Per a chat with Bruce, I'm going to apply that patch now to get it into 
the minor releases.


- Heikki




Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Peter Eisentraut
On Mon, 2013-10-07 at 23:49 +0300, Heikki Linnakangas wrote:
 On 07.10.2013 23:45, Heikki Linnakangas wrote:
  On 07.10.2013 22:35, Kevin Grittner wrote:
  Kevin Grittner kgri...@ymail.com wrote:
 
  There might be a problem if someone applies this fix while any
  prepared transactions are pending. Still investigating the impact
  and possible fixes.
 
  I found a one-line fix for this. It tested without problem.
 
  Good catch.
 
  To fix the bug that Hannu pointed out, we also need to apply these fixes:
 
  http://www.postgresql.org/message-id/52440266.5040...@vmware.com
 
 Per a chat with Bruce, I'm going to apply that patch now to get it into 
 the minor releases.

9.1 branch is broken now.





Re: [HACKERS] SSI freezing bug

2013-10-07 Thread Heikki Linnakangas

On 08.10.2013 03:25, Peter Eisentraut wrote:

On Mon, 2013-10-07 at 23:49 +0300, Heikki Linnakangas wrote:

To fix the bug that Hannu pointed out, we also need to apply these fixes:

http://www.postgresql.org/message-id/52440266.5040...@vmware.com


Per a chat with Bruce, I'm going to apply that patch now to get it into
the minor releases.


9.1 branch is broken now.


Rats. I forgot to git add the latest changes while backpatching.

Committed a fix for that.

- Heikki




Re: [HACKERS] SSI freezing bug

2013-10-06 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote:
 Kevin Grittner kgri...@ymail.com wrote:

 I'm strongly leaning toward the idea that a slightly tweaked
 version of the proposed patch is appropriate for the back-branches,
 because the fix Heikki is now suggesting seems too invasive to
 back-patch.  I think it would make sense to apply it to master,
 too, so that the new isolation tests can be immediately added.  We
 can then work on the new approach for 9.4 and have the tests to
 help confirm we are not breaking anything.  The tweak would be to
 preserve the signature of heap_freeze_tuple(), because after the
 more invasive fix in 9.4 the new parameters will not be needed.
 They are only passed as non-NULL from one of the three callers, so
 it seems best to leave those call sites alone rather than change
 them back-and-forth.

 I will post a new patch today or tomorrow.

 Attached.

And here's a rough cut of what I think the alternative now
suggested by Heikki would look like.  (I've omitted the new
isolation test because that is the same as the previously posted
patch.)

Note this comment, which I think was written by Heikki back when
there was a lot more benchmarking and profiling of SSI going on:

  * Because a particular target might become obsolete, due to update to a new
  * version, before the reading transaction is obsolete, we need some way to
  * prevent errors from reuse of a tuple ID.  Rather than attempting to clean
  * up the targets as the related tuples are pruned or vacuumed, we check the
  * xmin on access.  This should be far less costly.

Based on what this patch looks like, I'm afraid he may have been
right when he wrote that.  In any event, after the exercise of
developing a first draft of searching for predicate locks to clean
up every time a tuple is pruned or vacuumed, I continue to feel
strongly that the previously-posted patch, which only takes action
when tuples are frozen by a vacuum process, is much more suitable
for backpatching.  I don't think we should switch to anything
resembling the attached without some careful benchmarking.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

*** a/src/backend/access/heap/pruneheap.c
--- b/src/backend/access/heap/pruneheap.c
***
*** 21,26 
--- 21,27 
  #include "miscadmin.h"
  #include "pgstat.h"
  #include "storage/bufmgr.h"
+ #include "storage/predicate.h"
  #include "utils/rel.h"
  #include "utils/tqual.h"
  
***
*** 52,58  static void heap_prune_record_redirect(PruneState *prstate,
  		   OffsetNumber offnum, OffsetNumber rdoffnum);
  static void heap_prune_record_dead(PruneState *prstate, OffsetNumber offnum);
  static void heap_prune_record_unused(PruneState *prstate, OffsetNumber offnum);
! 
  
  /*
   * Optionally prune and repair fragmentation in the specified page.
--- 53,62 
  		   OffsetNumber offnum, OffsetNumber rdoffnum);
  static void heap_prune_record_dead(PruneState *prstate, OffsetNumber offnum);
  static void heap_prune_record_unused(PruneState *prstate, OffsetNumber offnum);
! static void heap_prune_page_predicate_locks(Relation relation, Buffer buffer,
! OffsetNumber *redirected, int nredirected,
! OffsetNumber *nowdead, int ndead,
! OffsetNumber *nowunused, int nunused);
  
  /*
   * Optionally prune and repair fragmentation in the specified page.
***
*** 200,205  heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
--- 204,218 
  	 prstate);
  	}
  
+ 	if (prstate.nredirected > 0 || prstate.ndead > 0 || prstate.nunused > 0)
+ 	{
+ 		heap_prune_page_predicate_locks(relation, buffer,
+ 		prstate.redirected,
+ 		prstate.nredirected,
+ 		prstate.nowdead, prstate.ndead,
+ 		prstate.nowunused, prstate.nunused);
+ 	}
+ 
  	/* Any error while applying the changes is critical */
  	START_CRIT_SECTION();
  
***
*** 640,645  heap_prune_record_unused(PruneState *prstate, OffsetNumber offnum)
--- 653,692 
  	prstate->marked[offnum] = true;
  }
  
+ /* Release predicate locks for now-unreachable tuples */
+ static void
+ heap_prune_page_predicate_locks(Relation relation, Buffer buffer,
+ OffsetNumber *redirected, int nredirected,
+ OffsetNumber *nowdead, int ndead,
+ OffsetNumber *nowunused, int nunused)
+ {
+ 	BlockNumber blkno = BufferGetBlockNumber(buffer);
+ 	OffsetNumber *offnum;
+ 	int			i;
+ 
+ 	/* Release predicate locks on all redirected line pointers */
+ 	offnum = redirected;
+ 	for (i = 0; i < nredirected; i++)
+ 	{
+ 		ReleasePredicateTupleLocks(relation, blkno, *offnum++);
+ 		offnum++;  /* skip redirect to offset */
+ 	}
+ 
+ 	/* Release predicate locks on all now-dead line pointers */
+ 	offnum = nowdead;
+ 	for (i = 0; i < ndead; i++)
+ 	{
+ 		ReleasePredicateTupleLocks(relation, blkno, *offnum++);
+ 	}
+ 
+ 	/* Release predicate locks on all now-unused line pointers */
+ 	offnum = nowunused;
+ 	for (i = 

Re: [HACKERS] SSI freezing bug

2013-10-04 Thread Andres Freund
On 2013-10-03 21:14:17 -0700, Dan Ports wrote:
 On Thu, Oct 03, 2013 at 06:19:49AM -0700, Kevin Grittner wrote:
  Heikki Linnakangas hlinnakan...@vmware.com wrote:
   IMHO it would be better to remove xmin from the lock key, and vacuum
   away the old predicate locks when the corresponding tuple is vacuumed.
   The xmin field is only required to handle the case that a tuple is
   vacuumed, and a new unrelated tuple is inserted to the same slot.
   Removing the lock when the tuple is removed fixes that.
 
 This seems definitely safe: we need the predicate locks to determine if
 someone is modifying a tuple we read, and certainly if it's eligible
 for vacuum nobody's going to be modifying that tuple anymore.

But we're talking about freezing a tuple, not removing a dead tuple. I
don't see anything preventing modification of a frozen tuple. Right?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] SSI freezing bug

2013-10-04 Thread Heikki Linnakangas

On 04.10.2013 13:23, Andres Freund wrote:

On 2013-10-03 21:14:17 -0700, Dan Ports wrote:

On Thu, Oct 03, 2013 at 06:19:49AM -0700, Kevin Grittner wrote:

Heikki Linnakangas hlinnakan...@vmware.com wrote:

IMHO it would be better to remove xmin from the lock key, and vacuum
away the old predicate locks when the corresponding tuple is vacuumed.
The xmin field is only required to handle the case that a tuple is
vacuumed, and a new unrelated tuple is inserted to the same slot.
Removing the lock when the tuple is removed fixes that.


This seems definitely safe: we need the predicate locks to determine if
someone is modifying a tuple we read, and certainly if it's eligible
for vacuum nobody's going to be modifying that tuple anymore.


But we're talking about freezing a tuple, not removing a dead tuple. I
don't see anything preventing modification of a frozen tuple. Right?


Right, but if we no longer include xmin in the lock key, freezing a 
tuple makes no difference. Currently, the problem is that when a tuple 
is frozen, the locks on the tuple are lost, as the 
xmin+ctid of the lock no longer matches xmin+ctid of the tuple.


Removing xmin from the lock altogether solves that problem, but it 
introduces the possibility that when an old tuple is vacuumed away and a 
new tuple is inserted on the same slot, a lock on the old tuple is 
confused to be a lock on the new tuple. And that problem can be fixed by 
vacuuming locks, when the corresponding tuple is vacuumed.


- Heikki




Re: [HACKERS] SSI freezing bug

2013-10-04 Thread Andres Freund
On 2013-10-04 13:51:00 +0300, Heikki Linnakangas wrote:
 On 04.10.2013 13:23, Andres Freund wrote:
 On 2013-10-03 21:14:17 -0700, Dan Ports wrote:
 On Thu, Oct 03, 2013 at 06:19:49AM -0700, Kevin Grittner wrote:
 Heikki Linnakangas hlinnakan...@vmware.com  wrote:
 IMHO it would be better to remove xmin from the lock key, and vacuum
 away the old predicate locks when the corresponding tuple is vacuumed.
 The xmin field is only required to handle the case that a tuple is
 vacuumed, and a new unrelated tuple is inserted to the same slot.
 Removing the lock when the tuple is removed fixes that.
 
 This seems definitely safe: we need the predicate locks to determine if
 someone is modifying a tuple we read, and certainly if it's eligible
 for vacuum nobody's going to be modifying that tuple anymore.
 
 But we're talking about freezing a tuple, not removing a dead tuple. I
 don't see anything preventing modification of a frozen tuple. Right?
 
 Right, but if we no longer include xmin in the lock key, freezing a tuple
 makes no difference. Currently, the problem is that when a tuple is frozen,
 the locks on the tuple on the tuple are lost, as the xmin+ctid of the lock
 no longer matches xmin+ctid of the tuple.
 
 Removing xmin from the lock altogether solves that problem, but it
 introduces the possibility that when an old tuple is vacuumed away and a new
 tuple is inserted on the same slot, a lock on the old tuple is confused to
 be a lock on the new tuple. And that problem can be fixed by vacuuming
 locks, when the corresponding tuple is vacuumed.

But locks on those still can have meaning, right? From my very limited
understanding of predicate.c/SSI I don't see why we cannot have
meaningful conflicts on tuples that are already vacuumable. You'd need
some curiously interleaved transactions, sure, but it seems possible?

ISTM we'd need to peg the xmin horizon for vacuum to the oldest xmin
predicate.c keeps track of.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] SSI freezing bug

2013-10-04 Thread Heikki Linnakangas

On 04.10.2013 14:02, Andres Freund wrote:

But locks on those still can have meaning, right? From my very limited
understanding of predicate.c/SSI I don't see why we cannot have
meaningful conflicts on tuples that are already vacuumable. You'd need
some curiously interleaved transactions, sure, but it seems possible?


To conflict with a lock, a backend would need to read or update the 
tuple the lock is on. If the tuple is vacuumable, it's no longer visible 
to anyone, so no backend can possibly read or update it anymore.


- Heikki




Re: [HACKERS] SSI freezing bug

2013-10-04 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-10-03 21:14:17 -0700, Dan Ports wrote:
 Heikki Linnakangas hlinnakan...@vmware.com  wrote:
 IMHO it would be better to remove xmin from the lock key,
 and vacuum away the old predicate locks when the
 corresponding tuple is vacuumed.
 The xmin field is only required to handle the case that a
 tuple is vacuumed, and a new unrelated tuple is inserted to
 the same slot.
 Removing the lock when the tuple is removed fixes that.

 This seems definitely safe: we need the predicate locks to
 determine if someone is modifying a tuple we read, and
 certainly if it's eligible for vacuum nobody's going to be
 modifying that tuple anymore.

I totally agree.  It would be safe, and generally seems better, to
omit xmin from the hash tag for heap tuple locks.

 But locks on those still can have meaning, right? From my very
 limited understanding of predicate.c/SSI I don't see why we
 cannot have meaningful conflicts on tuples that are already
 vacuumable. You'd need some curiously interleaved transactions,
 sure, but it seems possible?

No.  We established quite firmly that predicate locks on a tuple do
not need to be carried forward to new versions of that tuple.  It
is only the *version* of the row that was read to create the
predicate lock which needs to be monitored for write conflicts.  If
the row has been vacuumed, or even determined to be eligible for
vacuuming, we know it is not a candidate for update or delete -- so
it is safe to drop the predicate locks.  We do *not* want to do any
other cleanup for the transaction which read that tuple based on
this; just the individual tuple lock should be cleared.  Any
conflict with the lock which occurred earlier will be preserved.

 ISTM we'd need to peg the xmin horizon for vacuum to the oldest
 xmin predicate.c keeps track of.

That would be another alternative, but it seems more problematic
and less effective.

I'm strongly leaning toward the idea that a slightly tweaked
version of the proposed patch is appropriate for the back-branches,
because the fix Heikki is now suggesting seems too invasive to
back-patch.  I think it would make sense to apply it to master,
too, so that the new isolation tests can be immediately added.  We
can then work on the new approach for 9.4 and have the tests to
help confirm we are not breaking anything.  The tweak would be to
preserve the signature of heap_freeze_tuple(), because after the
more invasive fix in 9.4 the new parameters will not be needed. 
They are only passed as non-NULL from one of the three callers, so
it seems best to leave those call sites alone rather than change
them back-and-forth.

I will post a new patch today or tomorrow.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SSI freezing bug

2013-10-03 Thread Heikki Linnakangas

On 03.10.2013 01:05, Kevin Grittner wrote:

Andres Freund and...@2ndquadrant.com  wrote:

On 2013-10-01 07:41:46 -0700, Kevin Grittner wrote:

Andres Freund and...@2ndquadrant.com  wrote:


A better solution probably is to promote tuple-level locks if
they exist to a relation level one upon freezing I guess?


It would be sufficient to promote the tuple lock to a page lock.
It would be pretty easy to add a function to predicate.c which
would accept a Relation and HeapTuple, check for a predicate lock
for the tuple, and add a page lock if found (which will
automatically clear the tuple lock).  This new function would be
called when a tuple was chosen for freezing.  Since freezing always
causes WAL-logging and disk I/O, the cost of a couple hash table
operations should not be noticeable.


Yea, not sure why I was thinking of table level locks.


This seems like a bug fix which should be back-patched to 9.1, yes?


Yes.


Patch attached, including new isolation test based on Heikki's
example.  This patch does change the signature of
heap_freeze_tuple().  If anyone thinks there is risk that external
code may be calling this, I could keep the old function with its
old behavior (including the bug) and add a new function name with
the new parameters added -- the old function could call the new one
with NULL for the last two parameters.  I'm not sure whether that
is better than breaking a compile of code which uses the old
signature, which would force a choice about what to do.

Will give it a couple days for feedback before pushing.


IMHO it would be better to remove xmin from the lock key, and vacuum 
away the old predicate locks when the corresponding tuple is vacuumed. 
The xmin field is only required to handle the case that a tuple is 
vacuumed, and a new unrelated tuple is inserted to the same slot. 
Removing the lock when the tuple is removed fixes that.


In fact, I cannot even come up with a situation where you would have a 
problem if we just removed xmin from the key, even if we didn't vacuum 
away old locks. I don't think the old lock can conflict with anything 
that would see the new tuple that gets inserted in the same slot. I have 
a feeling that you could probably prove that if you stare long enough at 
the diagram of a dangerous structure and the properties required for a 
conflict.


- Heikki




Re: [HACKERS] SSI freezing bug

2013-10-03 Thread Kevin Grittner
Heikki Linnakangas hlinnakan...@vmware.com wrote:

 IMHO it would be better to remove xmin from the lock key, and vacuum
 away the old predicate locks when the corresponding tuple is vacuumed.
 The xmin field is only required to handle the case that a tuple is
 vacuumed, and a new unrelated tuple is inserted to the same slot.
 Removing the lock when the tuple is removed fixes that.

 In fact, I cannot even come up with a situation where you would have a
 problem if we just removed xmin from the key, even if we didn't vacuum
 away old locks. I don't think the old lock can conflict with anything
 that would see the new tuple that gets inserted in the same slot. I have
 a feeling that you could probably prove that if you stare long enough at
 the diagram of a dangerous structure and the properties required for a
 conflict.

You are the one who suggested adding xmin to the key:

http://www.postgresql.org/message-id/4d5a36fc.6010...@enterprisedb.com

I will review that thread in light of your recent comments, but the
fact is that xmin was not originally in the lock key, testing
uncovered bugs, and adding xmin fixed those bugs.  I know I tried
some other approach first, which turned out to be complex and quite
messy -- it may have been similar to what you are proposing now.

It seems to me that a change such as you are now suggesting is
likely to be too invasive to back-patch.  Do you agree that it
would make sense to apply the patch I have proposed, back to 9.1,
and then consider any alternative as 9.4 material?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SSI freezing bug

2013-10-03 Thread Dan Ports
On Thu, Oct 03, 2013 at 06:19:49AM -0700, Kevin Grittner wrote:
 Heikki Linnakangas hlinnakan...@vmware.com wrote:
  IMHO it would be better to remove xmin from the lock key, and vacuum
  away the old predicate locks when the corresponding tuple is vacuumed.
  The xmin field is only required to handle the case that a tuple is
  vacuumed, and a new unrelated tuple is inserted to the same slot.
  Removing the lock when the tuple is removed fixes that.

This seems definitely safe: we need the predicate locks to determine if
someone is modifying a tuple we read, and certainly if it's eligible
for vacuum nobody's going to be modifying that tuple anymore.

  In fact, I cannot even come up with a situation where you would have a
  problem if we just removed xmin from the key, even if we didn't vacuum
  away old locks. I don't think the old lock can conflict with anything
  that would see the new tuple that gets inserted in the same slot. I have
  a feeling that you could probably prove that if you stare long enough at
  the diagram of a dangerous structure and the properties required for a
  conflict.

This would also be safe, in the sense that it's OK to flag a
conflict even if one doesn't exist. I'm not convinced that it isn't
possible to have false positives this way. I think it's possible for a
tuple to be vacuumed away and the ctid reused before the predicate
locks on it are eligible for cleanup. (In fact, isn't this what was
happening in the thread Kevin linked?)

 You are the one who suggested adding xmin to the key:
 
 http://www.postgresql.org/message-id/4d5a36fc.6010...@enterprisedb.com
 
 I will review that thread in light of your recent comments, but the
 fact is that xmin was not originally in the lock key, testing
 uncovered bugs, and adding xmin fixed those bugs.  I know I tried
 some other approach first, which turned out to be complex and quite
 messy -- it may have been similar to what you are proposing now.

At the time, we thought it was necessary for a predicate lock to lock
*all future versions* of a tuple, and so we had a bunch of code to
maintain a version chain. That was fraught with bugs, and turned out
not to be necessary (IIRC, we worked that out at the pub at PGcon).
That made it critical to distinguish different tuples that had the same
ctid because they could wind up in the wrong chain or cause a cycle.
With that code ripped out, that's no longer an issue.

But all this is an exceptionally subtle part of what was an
exceptionally complex patch, so a lot of careful thought is needed
here...

 It seems to me that a change such as you are now suggesting is
 likely to be too invasive to back-patch.  Do you agree that it
 would make sense to apply the patch I have proposed, back to 9.1,
 and then consider any alternative as 9.4 material?

I agree with this.

Dan

-- 
Dan R. K. Ports    UW CSE    http://drkp.net/




Re: [HACKERS] SSI freezing bug

2013-10-01 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:

 A better solution probably is to promote tuple-level locks if
 they exist to a relation level one upon freezing I guess?

It would be sufficient to promote the tuple lock to a page lock.
It would be pretty easy to add a function to predicate.c which
would accept a Relation and HeapTuple, check for a predicate lock
for the tuple, and add a page lock if found (which will
automatically clear the tuple lock).  This new function would be
called when a tuple was chosen for freezing.  Since freezing always
causes WAL-logging and disk I/O, the cost of a couple hash table
operations should not be noticeable.

This seems like a bug fix which should be back-patched to 9.1, yes?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SSI freezing bug

2013-10-01 Thread Andres Freund
On 2013-10-01 07:41:46 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
 
  A better solution probably is to promote tuple-level locks if
  they exist to a relation level one upon freezing I guess?
 
 It would be sufficient to promote the tuple lock to a page lock.
 It would be pretty easy to add a function to predicate.c which
 would accept a Relation and HeapTuple, check for a predicate lock
 for the tuple, and add a page lock if found (which will
 automatically clear the tuple lock).  This new function would be
 called when a tuple was chosen for freezing.  Since freezing always
 causes WAL-logging and disk I/O, the cost of a couple hash table
 operations should not be noticeable.

Yea, not sure why I was thinking of table level locks.

 This seems like a bug fix which should be back-patched to 9.1, yes?

Yes.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] SSI freezing bug

2013-09-26 Thread Heikki Linnakangas

On 23.09.2013 01:07, Hannu Krosing wrote:

On 09/20/2013 12:55 PM, Heikki Linnakangas wrote:

Hi,

Prompted by Andres Freund's comments on my Freezing without Write I/O
patch, I realized that there's an existing bug in the way
predicate locking handles freezing (or rather, it doesn't handle it).

When a tuple is predicate-locked, the key of the lock is ctid+xmin.
However, when a tuple is frozen, its xmin is changed to FrozenXid.
That effectively invalidates any predicate lock on the tuple, as
checking for a lock on the same tuple later won't find it as the xmin
is different.

Attached is an isolationtester spec to demonstrate this.

The case is even fishier than that.

That is, you can get bad behaviour on at least v9.2.4 even without
VACUUM FREEZE.

You just need to run

permutation r1 r2 w1 w2 c1 c2

twice in a row.

the first time it does get serialization error at c2
but the 2nd time both c1 and c2 complete successfully


Oh, interesting. I did some debugging on this: there are actually *two* 
bugs, either one of which alone is enough to cause this on its own:


1. in heap_hot_search_buffer(), the PredicateLockTuple() call is passed 
the wrong offset number. heapTuple->t_self is set to the tid of the first 
tuple in the chain that's visited, not the one actually being read.


2. CheckForSerializableConflictIn() uses the tuple's t_ctid field 
instead of t_self to check for existing predicate locks on the tuple. If 
the tuple was updated, but the updater rolled back, t_ctid points to the 
aborted dead tuple.


After fixing both of those bugs, running the test case twice in a row 
works, ie. causes a conflict and a rollback both times. Anyone see a 
problem with this?


That still leaves the original problem I spotted, with freezing; that's 
yet another unrelated bug.


- Heikki
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index ead3d69..a21f31b 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1688,6 +1688,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 	at_chain_start = first_call;
 	skip = !first_call;
 
+	heapTuple->t_self = *tid;
+
 	/* Scan through possible multiple members of HOT-chain */
 	for (;;)
 	{
@@ -1717,7 +1719,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 		heapTuple->t_data = (HeapTupleHeader) PageGetItem(dp, lp);
 		heapTuple->t_len = ItemIdGetLength(lp);
 		heapTuple->t_tableOid = RelationGetRelid(relation);
-		heapTuple->t_self = *tid;
+		ItemPointerSetOffsetNumber(&heapTuple->t_self, offnum);
 
 		/*
 		 * Shouldn't see a HEAP_ONLY tuple at chain start.
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index d656d62..50583b3 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -4282,8 +4282,8 @@ CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
 		SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
 		 relation->rd_node.dbNode,
 		 relation->rd_id,
-		 ItemPointerGetBlockNumber(&(tuple->t_data->t_ctid)),
-		ItemPointerGetOffsetNumber(&(tuple->t_data->t_ctid)),
+ ItemPointerGetBlockNumber(&(tuple->t_self)),
+ItemPointerGetOffsetNumber(&(tuple->t_self)),
 	  HeapTupleHeaderGetXmin(tuple->t_data));
 		CheckTargetForConflictsIn(targettag);
 	}



Re: [HACKERS] SSI freezing bug

2013-09-25 Thread Heikki Linnakangas

On 22.09.2013 00:12, Hannu Krosing wrote:

On 09/21/2013 10:46 PM, Andres Freund wrote:


Heikki Linnakangas hlinnakan...@vmware.com  schrieb:

Kevin Grittner kgri...@ymail.com  wrote:

Andres Freund and...@2ndquadrant.com  wrote:

On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:

When a tuple is predicate-locked, the key of the lock is ctid+xmin.
However, when a tuple is frozen, its xmin is changed to FrozenXid. That
effectively invalidates any predicate lock on the tuple, as checking for a
lock on the same tuple later won't find it as the xmin is different.

Attached is an isolationtester spec to demonstrate this.

Do you have any idea to fix that besides keeping the xmin horizon below the
lowest of the xids that are predicate locked? Which seems nasty to
compute and is probably not trivial to fit into the procarray.c
machinery?

A better solution probably is to promote tuple-level locks if they
exist to a relation level one upon freezing I guess?

That would work.  A couple other ideas would be to use the oldest
serializable xmin which is being calculated in predicate.c to
either prevent freezing of newer transaction or to summarize
predicate locks (using the existing SLRU-based summarization
functions) which use xmin values for xids which are being frozen.
The transactions must already be committed, and so are eligible for
summarization.

What's the point of using xmin as part of the lock key in the first
place, wouldn't ctid alone suffice? If the tuple was visible to the
reading transaction, it cannot be vacuumed away until the transaction
commits. No other tuple can appear with the same ctid.

SSI locks can live longer than one TX/snapshot.

But the only way that ctid can change without xmin changing is when
you update the tuple in the same TX that created it.


I don't think that's relevant. We're not talking about the ctid field 
of a tuple, ie. HeapTupleHeader.t_ctid. We're talking about its physical 
location, ie. HeapTuple->t_self.


- Heikki




Re: [HACKERS] SSI freezing bug

2013-09-25 Thread Heikki Linnakangas

On 21.09.2013 23:46, Andres Freund wrote:



Heikki Linnakangas hlinnakan...@vmware.com  schrieb:

Kevin Grittner kgri...@ymail.com  wrote:

Andres Freund and...@2ndquadrant.com  wrote:

On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:

When a tuple is predicate-locked, the key of the lock is ctid+xmin.
However, when a tuple is frozen, its xmin is changed to FrozenXid. That
effectively invalidates any predicate lock on the tuple, as checking for a
lock on the same tuple later won't find it as the xmin is different.

Attached is an isolationtester spec to demonstrate this.

Do you have any idea to fix that besides keeping the xmin horizon below the
lowest of the xids that are predicate locked? Which seems nasty to
compute and is probably not trivial to fit into the procarray.c
machinery?

A better solution probably is to promote tuple-level locks if they
exist to a relation level one upon freezing I guess?


That would work.  A couple other ideas would be to use the oldest
serializable xmin which is being calculated in predicate.c to
either prevent freezing of newer transaction or to summarize
predicate locks (using the existing SLRU-based summarization
functions) which use xmin values for xids which are being frozen.
The transactions must already be committed, and so are eligible for
summarization.


What's the point of using xmin as part of the lock key in the first
place, wouldn't ctid alone suffice? If the tuple was visible to the
reading transaction, it cannot be vacuumed away until the transaction
commits. No other tuple can appear with the same ctid.


SSI locks can live longer than one TX/snapshot.


Hmm, true.

Another idea: we could vacuum away predicate locks, when the 
corresponding tuples are vacuumed from the heap. Before the 2nd phase of 
vacuum, before removing the dead tuples, we could scan the predicate 
lock hash and remove any locks belonging to the tuples we're about to 
remove. And not include xmin in the lock key.


- Heikki




Re: [HACKERS] SSI freezing bug

2013-09-22 Thread Hannu Krosing
On 09/20/2013 12:55 PM, Heikki Linnakangas wrote:
 Hi,

 Prompted by Andres Freund's comments on my Freezing without Write I/O
 patch, I realized that there's an existing bug in the way
 predicate locking handles freezing (or rather, it doesn't handle it).

 When a tuple is predicate-locked, the key of the lock is ctid+xmin.
 However, when a tuple is frozen, its xmin is changed to FrozenXid.
 That effectively invalidates any predicate lock on the tuple, as
 checking for a lock on the same tuple later won't find it as the xmin
 is different.

 Attached is an isolationtester spec to demonstrate this.
The case is even fishier than that.

That is, you can get bad behaviour on at least v9.2.4 even without
VACUUM FREEZE.

You just need to run 

permutation r1 r2 w1 w2 c1 c2

twice in a row.

the first time it does get serialization error at c2
but the 2nd time both c1 and c2 complete successfully

Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] SSI freezing bug

2013-09-21 Thread Heikki Linnakangas
Kevin Grittner kgri...@ymail.com wrote:
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
 When a tuple is predicate-locked, the key of the lock is ctid+xmin.
 However, when a tuple is frozen, its xmin is changed to FrozenXid. That
 effectively invalidates any predicate lock on the tuple, as checking for
 a lock on the same tuple later won't find it as the xmin is different.

 Attached is an isolationtester spec to demonstrate this.

 Do you have any idea to fix that besides keeping the xmin horizon below
 the lowest of the xids that are predicate locked? Which seems nasty to
 compute and is probably not trivial to fit into the procarray.c
 machinery?

 A better solution probably is to promote tuple-level locks if they exist
 to a relation level one upon freezing I guess?

That would work.  A couple other ideas would be to use the oldest
serializable xmin which is being calculated in predicate.c to
either prevent freezing of newer transaction or to summarize
predicate locks (using the existing SLRU-based summarization
functions) which use xmin values for xids which are being frozen. 
The transactions must already be committed, and so are eligible for
summarization.

What's the point of using xmin as part of the lock key in the first place, 
wouldn't ctid alone suffice? If the tuple was visible to the reading 
transaction, it cannot be vacuumed away until the transaction commits. No other 
tuple can appear with the same ctid.


- Heikki




Re: [HACKERS] SSI freezing bug

2013-09-21 Thread Andres Freund


Heikki Linnakangas hlinnakan...@vmware.com schrieb:
Kevin Grittner kgri...@ymail.com wrote:
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
 When a tuple is predicate-locked, the key of the lock is ctid+xmin.
 However, when a tuple is frozen, its xmin is changed to FrozenXid. That
 effectively invalidates any predicate lock on the tuple, as checking for
 a lock on the same tuple later won't find it as the xmin is different.

 Attached is an isolationtester spec to demonstrate this.

 Do you have any idea to fix that besides keeping the xmin horizon below
 the lowest of the xids that are predicate locked? Which seems nasty to
 compute and is probably not trivial to fit into the procarray.c
 machinery?

 A better solution probably is to promote tuple-level locks if they exist
 to a relation level one upon freezing I guess?

That would work.  A couple other ideas would be to use the oldest
serializable xmin which is being calculated in predicate.c to
either prevent freezing of newer transaction or to summarize
predicate locks (using the existing SLRU-based summarization
functions) which use xmin values for xids which are being frozen. 
The transactions must already be committed, and so are eligible for
summarization.

What's the point of using xmin as part of the lock key in the first
place, wouldn't ctid alone suffice? If the tuple was visible to the
reading transaction, it cannot be vacuumed away until the transaction
commits. No other tuple can appear with the same ctid.

SSI locks can live longer than one TX/snapshot.

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] SSI freezing bug

2013-09-21 Thread Hannu Krosing
On 09/21/2013 10:46 PM, Andres Freund wrote:

 Heikki Linnakangas hlinnakan...@vmware.com schrieb:
 Kevin Grittner kgri...@ymail.com wrote:
 Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
 When a tuple is predicate-locked, the key of the lock is ctid+xmin.
 However, when a tuple is frozen, its xmin is changed to FrozenXid. That
 effectively invalidates any predicate lock on the tuple, as checking for
 a lock on the same tuple later won't find it as the xmin is different.

 Attached is an isolationtester spec to demonstrate this.

 Do you have any idea to fix that besides keeping the xmin horizon below
 the lowest of the xids that are predicate locked? Which seems nasty to
 compute and is probably not trivial to fit into the procarray.c
 machinery?

 A better solution probably is to promote tuple-level locks if they exist
 to a relation level one upon freezing I guess?
 That would work.  A couple other ideas would be to use the oldest
 serializable xmin which is being calculated in predicate.c to
 either prevent freezing of newer transaction or to summarize
 predicate locks (using the existing SLRU-based summarization
 functions) which use xmin values for xids which are being frozen. 
 The transactions must already be committed, and so are eligible for
 summarization.
 What's the point of using xmin as part of the lock key in the first
 place, wouldn't ctid alone suffice? If the tuple was visible to the
 reading transaction, it cannot be vacuumed away until the transaction
 commits. No other tuple can appear with the same ctid.
 SSI locks can live longer than one TX/snapshot.
But the only way that ctid can change without xmin changing is when
you update the tuple in the same TX that created it.

Could it be the case here, or can we safely exclude this?


-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





[HACKERS] SSI freezing bug

2013-09-20 Thread Heikki Linnakangas

Hi,

Prompted by Andres Freund's comments on my "Freezing without Write I/O"
patch, I realized that there's an existing bug in the way predicate
locking handles freezing (or rather, it doesn't handle it).


When a tuple is predicate-locked, the key of the lock is ctid+xmin. 
However, when a tuple is frozen, its xmin is changed to FrozenXid. That 
effectively invalidates any predicate lock on the tuple, as checking for 
a lock on the same tuple later won't find it as the xmin is different.
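The failure mode can be illustrated with a toy model of the lock table.
This is purely a sketch of the keying problem, not PostgreSQL's actual
data structures; FROZEN_XID stands in for FrozenTransactionId.

```python
# Toy model: SSI predicate locks keyed by (ctid, xmin). Rewriting xmin
# to the frozen xid during VACUUM FREEZE orphans any existing lock,
# because later conflict checks look up the tuple under a different key.
FROZEN_XID = 2  # stand-in for FrozenTransactionId

predicate_locks = {}  # (ctid, xmin) -> set of locking transactions

def lock_tuple(ctid, xmin, xact):
    # Record that xact has read (predicate-locked) this tuple version.
    predicate_locks.setdefault((ctid, xmin), set()).add(xact)

def check_conflict(ctid, xmin, writer):
    # On write, find other transactions holding a lock on this tuple:
    # the lookup uses the tuple's *current* (ctid, xmin).
    return predicate_locks.get((ctid, xmin), set()) - {writer}

# Transaction 1 reads the tuple at ctid (0, 7), created by xid 1234.
lock_tuple((0, 7), 1234, xact=1)

# VACUUM FREEZE rewrites the tuple's xmin in place...
tuple_xmin = FROZEN_XID

# ...so transaction 2's write no longer finds transaction 1's lock,
assert check_conflict((0, 7), tuple_xmin, writer=2) == set()
# whereas before freezing the conflict would have been detected.
assert check_conflict((0, 7), 1234, writer=2) == {1}
```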


Attached is an isolationtester spec to demonstrate this.

- Heikki
# Test predicate locks with freezing
#
# This test has two serializable transactions. Both select two rows
# from the table, and then update one of them.
# If these were serialized (run one at a time), the transaction that
# runs later would see one of the rows to be updated.
#
# Any overlap between the transactions must cause a serialization failure.
#
# Normally that works, but freezing a tuple screws up predicate locking.
# After freezing a tuple, any predicate locks on it are effectively lost.

setup
{
  CREATE TABLE test (i int PRIMARY KEY, t text);
  INSERT INTO test VALUES (5, 'apple'), (7, 'pear'), (11, 'banana');
}

teardown
{
  DROP TABLE test;
}

session "s1"
setup { BEGIN ISOLATION LEVEL SERIALIZABLE; SET enable_seqscan = off; }
step "r1" { SELECT * FROM test WHERE i IN (5, 7); }
step "w1" { UPDATE test SET t = 'pear_xact1' WHERE i = 7; }
step "c1" { COMMIT; }

session "s2"
setup { BEGIN ISOLATION LEVEL SERIALIZABLE; SET enable_seqscan = off; }
step "r2" { SELECT * FROM test WHERE i IN (5, 7); }
step "w2" { UPDATE test SET t = 'apple_xact2' WHERE i = 5; }
step "c2" { COMMIT; }

session "freezer"
step "freeze" { VACUUM FREEZE test; }

# This should cause a serialization error - and it does if you remove the
# freeze step.
permutation "r1" "r2" "freeze" "w1" "w2" "c1" "c2"



Re: [HACKERS] SSI freezing bug

2013-09-20 Thread Andres Freund
Hi,


On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
 When a tuple is predicate-locked, the key of the lock is ctid+xmin. However,
 when a tuple is frozen, its xmin is changed to FrozenXid. That effectively
 invalidates any predicate lock on the tuple, as checking for a lock on the
 same tuple later won't find it as the xmin is different.
 
 Attached is an isolationtester spec to demonstrate this.

Do you have any idea to fix that besides keeping the xmin horizon below the
lowest of the xids that are predicate locked? Which seems nasty to
compute and is probably not trivial to fit into the procarray.c
machinery?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] SSI freezing bug

2013-09-20 Thread Andres Freund
On 2013-09-20 13:53:04 +0200, Andres Freund wrote:
 Hi,
 
 
 On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
  When a tuple is predicate-locked, the key of the lock is ctid+xmin. However,
  when a tuple is frozen, its xmin is changed to FrozenXid. That effectively
  invalidates any predicate lock on the tuple, as checking for a lock on the
  same tuple later won't find it as the xmin is different.
  
  Attached is an isolationtester spec to demonstrate this.
 
 Do you have any idea to fix that besides keeping the xmin horizon below the
 lowest of the xids that are predicate locked? Which seems nasty to
 compute and is probably not trivial to fit into the procarray.c
 machinery?

A better solution probably is to promote tuple-level locks if they exist
to a relation level one upon freezing I guess?
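Extending the toy model, the promotion idea could look roughly like
this — a sketch only, with illustrative names, not the predicate.c
implementation:

```python
# Sketch of the suggested fix: when freezing a tuple, promote any
# tuple-level predicate locks on it to a relation-level lock, so the
# conflict survives the xmin rewrite.
FROZEN_XID = 2  # stand-in for FrozenTransactionId

tuple_locks = {}     # (relation, ctid, xmin) -> set of xacts
relation_locks = {}  # relation -> set of xacts

def promote_on_freeze(relation, ctid, xmin):
    # Move any lock on this tuple version up to the relation level.
    xacts = tuple_locks.pop((relation, ctid, xmin), set())
    if xacts:
        relation_locks.setdefault(relation, set()).update(xacts)

def check_conflict(relation, ctid, xmin, writer):
    # A write conflicts with tuple-level locks on this version
    # and with any relation-level locks on the table.
    holders = set(tuple_locks.get((relation, ctid, xmin), set()))
    holders |= relation_locks.get(relation, set())
    return holders - {writer}

# Transaction 1 holds a lock on the tuple created by xid 1234.
tuple_locks[("test", (0, 7), 1234)] = {1}
promote_on_freeze("test", (0, 7), 1234)  # VACUUM FREEZE runs

# The conflict is still detected even though xmin is now frozen.
assert check_conflict("test", (0, 7), FROZEN_XID, writer=2) == {1}
```

The trade-off is granularity: the relation-level lock will also flag
conflicts with writes to unrelated tuples in the same table, producing
some false-positive serialization failures.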

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] SSI freezing bug

2013-09-20 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
 When a tuple is predicate-locked, the key of the lock is ctid+xmin.
 However, when a tuple is frozen, its xmin is changed to FrozenXid.
 That effectively invalidates any predicate lock on the tuple, as
 checking for a lock on the same tuple later won't find it as the
 xmin is different.

 Attached is an isolationtester spec to demonstrate this.

 Do you have any idea to fix that besides keeping the xmin horizon below the
 lowest of the xids that are predicate locked? Which seems nasty to
 compute and is probably not trivial to fit into the procarray.c
 machinery?

 A better solution probably is to promote tuple-level locks if they exist
 to a relation level one upon freezing I guess?

That would work.  A couple other ideas would be to use the oldest
serializable xmin which is being calculated in predicate.c to
either prevent freezing of newer transactions or to summarize
predicate locks (using the existing SLRU-based summarization
functions) which use xmin values for xids which are being frozen.
The transactions must already be committed, and so are eligible for
summarization.

I'm not sure which is best.  Will review, probably this weekend.
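The first alternative — gating freezing on the oldest serializable
xmin — reduces to a simple predicate. This is a sketch with
hypothetical names, treating xids as plainly ordered integers and
ignoring xid wraparound:

```python
# Sketch: only freeze tuples whose xmin precedes every xid that an
# active serializable transaction's predicate locks could still
# reference (the "oldest serializable xmin" tracked in predicate.c).
def can_freeze(tuple_xmin, oldest_serializable_xmin):
    return tuple_xmin < oldest_serializable_xmin

# A tuple older than all possibly-locked xids may be frozen...
assert can_freeze(100, 200)
# ...but one that a predicate lock might still key on may not.
assert not can_freeze(250, 200)
```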

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

