[HACKERS] LISTEN and tuple concurrently updated

2003-09-15 Thread Gavin Sherry
Hi all,

A user on IRC came across the following tuple concurrently updated error
when using LISTEN/NOTIFY intensively. The user managed to isolate the
problem and SQL generating the problem. A few sessions are required to
generate the error.

T1:
---
begin;
listen test;
commit;

T2:
---
begin;
notify test;
commit;

T1:
---
begin;
-- got notify
unlisten test;


T3:
---
begin;
notify test;
commit;
-- blocks

T1:
---
commit;

T3 then receives:

WARNING:  AbortTransaction and not in in-progress state
ERROR:  tuple concurrently updated

A brief look into this:

heap_update() in T3 (called by AtCommit_Notify()) calls
HeapTupleSatisfiesUpdate(). This returns HeapTupleBeingUpdated. Once we
issue COMMIT; in T1 updates pg_listen and the tuple T3 is trying to
update no longer exists.

I've attached a patch which solves this problem. Basically, T1 will now
just hold AccessExclusiveLock on pg_listen for the rest of the
transaction. I've also modified AsyncExistsPendingNotify() to step through
pg_listen which allows T3's NOTIFY to block until T1 commits. This is not
really necessary due to the semantics of LISTEN/NOTIFY -- it is not an
error if a record exists in pg_listen already.

Thanks,

GavinIndex: src/backend/commands/async.c
===
RCS file: /usr/local/cvsroot/pgsql-server/src/backend/commands/async.c,v
retrieving revision 1.99
diff -2 -c -r1.99 async.c
*** src/backend/commands/async.c15 Sep 2003 00:26:31 -  1.99
--- src/backend/commands/async.c16 Sep 2003 00:16:27 -
***
*** 317,321 
heap_endscan(scan);
  
!   heap_close(lRel, AccessExclusiveLock);
  
/*
--- 317,321 
heap_endscan(scan);
  
!   heap_close(lRel, NoLock);
  
/*
***
*** 869,878 
  }
  
! /* Does pendingNotifies include the given relname? */
  static bool
  AsyncExistsPendingNotify(const char *relname)
  {
!   List   *p;
  
foreach(p, pendingNotifies)
{
--- 869,903 
  }
  
! /* Does pendingNotifies and pg_listen include the given relname? */
  static bool
  AsyncExistsPendingNotify(const char *relname)
  {
!   List*p;
!   RelationlRel;
!   boolfound = false;
!   HeapScanDescscan;
!   HeapTuple   tuple;
! 
! lRel = heap_openr(ListenerRelationName, AccessExclusiveLock);
! 
! scan = heap_beginscan(lRel, SnapshotNow, 0, (ScanKey) NULL);
! while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
! {
! Form_pg_listener listener = (Form_pg_listener) GETSTRUCT(tuple);
! 
! if(strncmp(NameStr(listener-relname), relname, NAMEDATALEN) == 0)
! {
! /* Found the matching tuple */
!   found = true;
! break;
! }
! }
! 
! heap_endscan(scan);
! 
! heap_close(lRel, AccessExclusiveLock);
  
+   if(!found)
+   return(false);
foreach(p, pendingNotifies)
{

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] LISTEN and tuple concurrently updated

2003-09-15 Thread Tom Lane
Gavin Sherry [EMAIL PROTECTED] writes:
 ERROR:  tuple concurrently updated

 A brief look into this:

 heap_update() in T3 (called by AtCommit_Notify()) calls
 HeapTupleSatisfiesUpdate(). This returns HeapTupleBeingUpdated. Once we
 issue COMMIT; in T1 updates pg_listen and the tuple T3 is trying to
 update no longer exists.

Ugh.

 I've attached a patch which solves this problem. Basically, T1 will now
 just hold AccessExclusiveLock on pg_listen for the rest of the
 transaction.

That seems quite unworkable --- it creates the potential for deadlock,
and in any case the exclusive lock could be held for an unreasonably
long time.

 I've also modified AsyncExistsPendingNotify() to step through
 pg_listen which allows T3's NOTIFY to block until T1 commits. This is not
 really necessary due to the semantics of LISTEN/NOTIFY -- it is not an
 error if a record exists in pg_listen already.

This appears to turn AtCommit_Notify into an O(N^2) operation, which
doesn't strike me as a pleasant answer at all.  I think it also breaks
the semantics of the other caller, Async_Notify.

What we probably need to do instead of this is not use
simple_heap_update in AtCommit_Notify; instead we have to use
heap_update directly and cope with concurrent-update situations.
The simple_heap_delete calls may need work too, now that I think
about it ...

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] LISTEN and tuple concurrently updated

2003-09-15 Thread Tom Lane
Gavin Sherry [EMAIL PROTECTED] writes:
 A user on IRC came across the following tuple concurrently updated error
 when using LISTEN/NOTIFY intensively.

I've applied a fix for this to CVS tip.

Thinking about the implications of rolling back UNLISTEN, it occurs to
me that there is another possible solution, which is to queue up LISTEN
and UNLISTEN commands locally in the backend and not apply them to
pg_listener until commit.  Doing it this way, it'd be okay to hold the
pg_listener lock until commit, which would eliminate the problem of
uncommitted unlistens being visible to notifiers.  However it would
require adding a fair amount more code to async.c.

I think that whenever we get around to rewriting LISTEN/NOTIFY to use
shared memory messages instead of a table, it will be necessary to apply
listen/unlisten commands that way (hold them until commit) to preserve
transactional semantics.  But for now, I'm not going to do the extra
work.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] LISTEN and tuple concurrently updated

2003-09-15 Thread Gavin Sherry
On Mon, 15 Sep 2003, Tom Lane wrote:

 Gavin Sherry [EMAIL PROTECTED] writes:
  A user on IRC came across the following tuple concurrently updated error
  when using LISTEN/NOTIFY intensively.

 I've applied a fix for this to CVS tip.

Great.

 I think that whenever we get around to rewriting LISTEN/NOTIFY to use
 shared memory messages instead of a table, it will be necessary to apply
 listen/unlisten commands that way (hold them until commit) to preserve
 transactional semantics.  But for now, I'm not going to do the extra
 work.

I wasn't thinking about the deadlock/performance problems when I sent in
that patch. It was more a proof of my theory. I was certainly thinking
about the various discussions about reworking LISTEN/NOTIFY into
shared memory when looking at the code, but as you say, its not a job for
right now :-).

Thanks,

Gavin

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org