Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-12-08 Thread Andres Freund
On 2012-10-18 22:40:15 +0200, Boszormenyi Zoltan wrote:
 2012-10-18 20:08 keltezéssel, Tom Lane írta:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Boszormenyi Zoltan escribió:
 this is the latest one, fixing a bug in the accounting
 of per-statement lock timeout handling and tweaking
 some comments.
 Tom, are you able to give this patch some more time on this commitfest?
 I'm still hoping to get to it, but I've been spending a lot of time on
 bug fixing rather than patch review lately :-(.  If you're hoping to
 close out the current CF soon, maybe we should just slip it to the next
 one.

 Fine by me. Thanks.

According to this the current state of the patch should be Ready for
Committer not Needs Review is that right? I changed the state for
now...

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-12-08 Thread Boszormenyi Zoltan

2012-12-08 15:30 keltezéssel, Andres Freund írta:

On 2012-10-18 22:40:15 +0200, Boszormenyi Zoltan wrote:

2012-10-18 20:08 keltezéssel, Tom Lane írta:

Alvaro Herrera alvhe...@2ndquadrant.com writes:

Boszormenyi Zoltan escribió:

this is the latest one, fixing a bug in the accounting
of per-statement lock timeout handling and tweaking
some comments.

Tom, are you able to give this patch some more time on this commitfest?

I'm still hoping to get to it, but I've been spending a lot of time on
bug fixing rather than patch review lately :-(.  If you're hoping to
close out the current CF soon, maybe we should just slip it to the next
one.

Fine by me. Thanks.

According to this the current state of the patch should be Ready for
Committer not Needs Review is that right? I changed the state for
now...


Thanks. I just tried the patch on current GIT HEAD and
gives some offset warnings but no rejects. Also, it compiles
without warnings and still works as it should.
Should I post a new patch that applies cleanly?

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-12-08 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes:
 Thanks. I just tried the patch on current GIT HEAD and
 gives some offset warnings but no rejects. Also, it compiles
 without warnings and still works as it should.
 Should I post a new patch that applies cleanly?

Offsets are not a problem --- if you tried to keep them exact you'd just
be making a lot of unnecessary updates.  If there were fuzz warnings I'd
be more concerned.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-10-18 Thread Alvaro Herrera
Boszormenyi Zoltan escribió:
 Hi,
 
 this is the latest one, fixing a bug in the accounting
 of per-statement lock timeout handling and tweaking
 some comments.

Tom, are you able to give this patch some more time on this commitfest?

(If not, I think it would be fair to boot it to CF3; this is final in a
series, there's nothing that depends on it, and there's been good
movement on it; there's plenty of time before the devel cycle closes.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-10-18 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Boszormenyi Zoltan escribió:
 this is the latest one, fixing a bug in the accounting
 of per-statement lock timeout handling and tweaking
 some comments.

 Tom, are you able to give this patch some more time on this commitfest?

I'm still hoping to get to it, but I've been spending a lot of time on
bug fixing rather than patch review lately :-(.  If you're hoping to
close out the current CF soon, maybe we should just slip it to the next
one.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-10-18 Thread Boszormenyi Zoltan

2012-10-18 20:08 keltezéssel, Tom Lane írta:

Alvaro Herrera alvhe...@2ndquadrant.com writes:

Boszormenyi Zoltan escribió:

this is the latest one, fixing a bug in the accounting
of per-statement lock timeout handling and tweaking
some comments.

Tom, are you able to give this patch some more time on this commitfest?

I'm still hoping to get to it, but I've been spending a lot of time on
bug fixing rather than patch review lately :-(.  If you're hoping to
close out the current CF soon, maybe we should just slip it to the next
one.


Fine by me. Thanks.

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-10-03 Thread Boszormenyi Zoltan

Hi,

this is the latest one, fixing a bug in the accounting
of per-statement lock timeout handling and tweaking
some comments.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



2-lock_timeout-v27.patch.gz
Description: Unix tar archive

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-09-24 Thread Boszormenyi Zoltan

2012-09-22 20:49 keltezéssel, Tom Lane írta:

Boszormenyi Zoltan z...@cybertec.at writes:

new version with a lot more cleanup is attached.

I looked at this patch, and frankly I'm rather dismayed.  It's a mess.


Thank you for the kind words. :-)


To start at the bottom level, the changes to PGSemaphoreLock broke it,
and seem probably unnecessary anyway.  As coded, calling the condition
checker breaks handling of error returns from semop(), unless the checker
is careful to preserve errno, which LmgrTimeoutCondition isn't (and really
shouldn't need to be anyway).


I would call it an oversight, nothing more.


   More, if the checker does return true,
it causes PGSemaphoreLock to utterly violate its contract: it returns to
the caller without having acquired the semaphore, and without even telling
the caller so.  Worse, if we *did* acquire the semaphore, we might still
exit via this path, since the placement of the condition check call
ignores the comment a few lines up:

  * Once we acquire the lock, we do NOT check for an interrupt before
  * returning.  The caller needs to be able to record ownership of the lock
  * before any interrupt can be accepted.

We could possibly fix all this with a redesigned API contract for
PGSemaphoreLock, but frankly I do not see a good reason to be tinkering
with it at all.  We never needed to get it involved with deadlock check
handling, and I don't see why that needs to change for lock timeouts.

One very good reason why monkeying with PGSemaphoreLock is wrong is that
on some platforms a SIGALRM interrupt won't interrupt the semop() call,
and thus control would never reach the checker anyway.  If we're going
to throw an error, it must be thrown from the interrupt handler.


Then please, explain to me, how on Earth can the current deadlock_timeout
can report the error? Sure, I can see the PG_TRY() ... PG_END_TRY() block in
lock.c but as far as I can see, nothing in the CheckDeadLock() - 
DeadLockCheck()
- DeadLockCheckRecurse() path diverts the code to return to a different
address from the signal handler, i.e. there is no elog(ERROR) or ereport(ERROR)
even in the DS_HARD_DEADLOCK case, so nothing calls siglongjmp(). So logically
the code shouldn't end up in the PG_CATCH() branch.

semop() will only get an errno = EINTR when returning from the signal handler
and would loop again. Then what makes it return beside being able to lock the
semaphore? The conditions in ProcSleep() that e.g. print the lock stats work 
somehow.


The whole lmgrtimeout module seems to me to be far more mechanism than is
warranted, for too little added functionality.  In the first place, there
is nothing on the horizon suggesting that we need to let any plug-in code
get control here, and if anything the delicacy of what's going on leads me
to not wish to expose such a possibility.  In the second place, it isn't
adding any actually useful functionality, it's just agglomerating some
checks.  The minimum thing I would want it to do is avoid calling
timeout.c multiple times, which is what would happen right now (leading
to four extra syscalls per lock acquisition, which is enough new overhead
to constitute a strong objection to committing this patch at all).

On the whole I think we could forget lmgrtimeout and just hardwire the
lock timeout and deadlock check cases.  But in any case we're going to
need support in timeout.c for enabling/disabling multiple timeouts at
once without extra setitimer calls.


OK, so you prefer the previous hardcoding PGSemaphoreTimedLock()
that makes every LockAcquire() check its return code and the detailed
error message about failed to lock the given object?

I will add new functions to timeout.c to remove many timeout sources
at once to decrease the amount of syscalls needed.


I'm also not thrilled about the way in which the existing deadlock
checking code has been hacked up.  As an example, you added this to
DeadLockReport():

+   if (!DeadLockTimeoutCondition())
+   return;

which again causes it to violate its contract, namely to report a
deadlock, in the most fundamental way -- existing callers aren't
expecting it to return *at all*.


Existing caller*s*? There is only one caller. The reasoning behind this
change was that if the code reaches ReportLmgrTimeoutError() in lock.c
then at least one timeout triggered and one *Report function in the chain
will do ereport(ERROR). *THAT* would trigger te siglongjmp() ending
up in PG_CATCH(). The added lines in DeadLockReport() ensures that
the deadlock error is not reported if it didn't trigger.


   Surely we can decouple the deadlock
and lock timeout cases better than that; or at least if we can't it's
a delusion to propose anything like lmgrtimeout in the first place.

There's considerable lack of attention to updating comments, too.
For instance in WaitOnLock you only bothered to update the comment
immediately adjacent to the changed code, and not the two comment
blocks above that, which 

Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-09-24 Thread Boszormenyi Zoltan

Hi,

2012-09-22 20:49 keltezéssel, Tom Lane írta:

Boszormenyi Zoltan z...@cybertec.at writes:

new version with a lot more cleanup is attached.

I looked at this patch, and frankly I'm rather dismayed.  It's a mess.


I hope you won't find this one a mess. I tried to address all your complaints.


[rather long diatribe on modifying PGSemaphoreLock improperly]


I have returned to the previous version that used PGSemaphoreTimedLock
and this time I save and restore errno around calling the timeout condition
checker.


The whole lmgrtimeout module seems to me to be far more mechanism than is
warranted, for too little added functionality.  [...]


lmgrtimeout is no more, back to hardcoding things.


   The minimum thing I would want it to do is avoid calling
timeout.c multiple times, which is what would happen right now (leading
to four extra syscalls per lock acquisition, which is enough new overhead
to constitute a strong objection to committing this patch at all).


I have added enable_multiple_timeouts() and disable_multiple_timeouts()
that minimize the number setitimer() calls.


There's considerable lack of attention to updating comments, too.
For instance in WaitOnLock you only bothered to update the comment
immediately adjacent to the changed code, and not the two comment
blocks above that, which both have specific references to deadlocks
being the reason for failure.


I modified the comment in question. I hope the wording is right.


Also, the per statement mode for lock timeout doesn't seem to be
any such thing, because it's implemented like this:

+case LOCK_TIMEOUT_PER_STMT:
+enable_timeout_at(LOCK_TIMEOUT,
+TimestampTzPlusMilliseconds(
+GetCurrentStatementStartTimestamp(),
+LockTimeout));
+break;

That doesn't provide anything like you can spend at most N milliseconds
waiting for locks during a statement.  What it is is if you happen to be
waiting for a lock N milliseconds after the statement starts, or if you
attempt to acquire any lock more than N milliseconds after the statement
starts, you lose instantly.  I don't think that definition actually adds
any useful functionality compared to setting statement_timeout to N
milliseconds, and it's certainly wrongly documented.  To do what the
documentation implies would require tracking and adding up the time spent
waiting for locks during a statement.  Which might be a good thing to do,
especially if the required gettimeofday() calls could be shared with what
timeout.c probably has to do anyway at start and stop of a lock wait.
But this code doesn't do it.


The code now properly accumulates the time spent in waiting for
LOCK_TIMEOUT. This means that if STATEMENT_TIMEOUT and
LOCK_TIMEOUT are set to the same value, STATEMENT_TIMEOUT
will trigger because it considers the time as one contiguous unit,
LOCK_TIMEOUT only accounts the time spent in waiting, not the time
spent with useful work. This means that LOCK_TIMEOUT doesn't
need any special code in its handler function, it's a NOP. The
relation between timeouts is only handled by the timeout.c module.


Lastly, I'm not sure where is the best place to be adding the control
logic for this, but I'm pretty sure postinit.c is not it.  It oughta be
somewhere under storage/lmgr/, no?


The above change means that there is no control logic outside of
storage/lmgr now.

I temporarily abandoned the idea of detailed error reporting on
the object type and name/ID. WaitOnLock() reports canceling statement
due to lock timeout and LockAcquire() kept its previous semantics.
This can be quickly revived in case of demand, it would be another ~15K patch.

I hope you can find another time slot in this CF to review this one.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



2-lock_timeout-v26.patch.gz
Description: Unix tar archive

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-09-22 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes:
 new version with a lot more cleanup is attached.

I looked at this patch, and frankly I'm rather dismayed.  It's a mess.

To start at the bottom level, the changes to PGSemaphoreLock broke it,
and seem probably unnecessary anyway.  As coded, calling the condition
checker breaks handling of error returns from semop(), unless the checker
is careful to preserve errno, which LmgrTimeoutCondition isn't (and really
shouldn't need to be anyway).  More, if the checker does return true,
it causes PGSemaphoreLock to utterly violate its contract: it returns to
the caller without having acquired the semaphore, and without even telling
the caller so.  Worse, if we *did* acquire the semaphore, we might still
exit via this path, since the placement of the condition check call
ignores the comment a few lines up:

 * Once we acquire the lock, we do NOT check for an interrupt before
 * returning.  The caller needs to be able to record ownership of the lock
 * before any interrupt can be accepted.

We could possibly fix all this with a redesigned API contract for
PGSemaphoreLock, but frankly I do not see a good reason to be tinkering
with it at all.  We never needed to get it involved with deadlock check
handling, and I don't see why that needs to change for lock timeouts.

One very good reason why monkeying with PGSemaphoreLock is wrong is that
on some platforms a SIGALRM interrupt won't interrupt the semop() call,
and thus control would never reach the checker anyway.  If we're going
to throw an error, it must be thrown from the interrupt handler.

The whole lmgrtimeout module seems to me to be far more mechanism than is
warranted, for too little added functionality.  In the first place, there
is nothing on the horizon suggesting that we need to let any plug-in code
get control here, and if anything the delicacy of what's going on leads me
to not wish to expose such a possibility.  In the second place, it isn't
adding any actually useful functionality, it's just agglomerating some
checks.  The minimum thing I would want it to do is avoid calling
timeout.c multiple times, which is what would happen right now (leading
to four extra syscalls per lock acquisition, which is enough new overhead
to constitute a strong objection to committing this patch at all).

On the whole I think we could forget lmgrtimeout and just hardwire the
lock timeout and deadlock check cases.  But in any case we're going to
need support in timeout.c for enabling/disabling multiple timeouts at
once without extra setitimer calls.

I'm also not thrilled about the way in which the existing deadlock
checking code has been hacked up.  As an example, you added this to
DeadLockReport():

+   if (!DeadLockTimeoutCondition())
+   return;

which again causes it to violate its contract, namely to report a
deadlock, in the most fundamental way -- existing callers aren't
expecting it to return *at all*.  Surely we can decouple the deadlock
and lock timeout cases better than that; or at least if we can't it's
a delusion to propose anything like lmgrtimeout in the first place.

There's considerable lack of attention to updating comments, too.
For instance in WaitOnLock you only bothered to update the comment
immediately adjacent to the changed code, and not the two comment
blocks above that, which both have specific references to deadlocks
being the reason for failure.

Also, the per statement mode for lock timeout doesn't seem to be
any such thing, because it's implemented like this:

+case LOCK_TIMEOUT_PER_STMT:
+enable_timeout_at(LOCK_TIMEOUT,
+TimestampTzPlusMilliseconds(
+GetCurrentStatementStartTimestamp(),
+LockTimeout));
+break;

That doesn't provide anything like you can spend at most N milliseconds
waiting for locks during a statement.  What it is is if you happen to be
waiting for a lock N milliseconds after the statement starts, or if you
attempt to acquire any lock more than N milliseconds after the statement
starts, you lose instantly.  I don't think that definition actually adds
any useful functionality compared to setting statement_timeout to N
milliseconds, and it's certainly wrongly documented.  To do what the
documentation implies would require tracking and adding up the time spent
waiting for locks during a statement.  Which might be a good thing to do,
especially if the required gettimeofday() calls could be shared with what
timeout.c probably has to do anyway at start and stop of a lock wait.
But this code doesn't do it.

Lastly, I'm not sure where is the best place to be adding the control
logic for this, but I'm pretty sure postinit.c is not it.  It oughta be
somewhere under storage/lmgr/, no?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your 

Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-08-21 Thread Boszormenyi Zoltan

Hi,

new version with a lot more cleanup is attached.

2012-07-22 22:03 keltezéssel, Boszormenyi Zoltan írta:

Attached is the revised (and a lot leaner, more generic) lock timeout patch,
which introduces new functionality for the timeout registration framework.
The new functionality is called extra timeouts, better naming is welcome.
Instead of only the previously defined (deadlock and statement) timeouts,
the extra timeouts can also be activated from within ProcSleep() in a linked
way.


This mini-framework is now called lock manager timeouts and
both deadlock timeout and the new lock timeout belong to it.
The little piece of standalone code managing these are in
storage/lmgr/lmgrtimeout.c.

There is no PGSemaphoreTimedLock() any more. Instead,
PGSemaphoreLock() gained a new function argument for
checking timeouts. This has three advantages:
- There is only one PGSemaphoreLock() implementation and bug fixes
  like ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec don't need to
  touch several places.
- There is no layering violation between pg_sema.c and proc.c.
- The extra function can check other type of conditions from different
  callers, should the need arise.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/

diff -durpN postgresql/src/backend/port/ipc_test.c postgresql.1/src/backend/port/ipc_test.c
--- postgresql/src/backend/port/ipc_test.c	2012-04-16 19:57:22.437915477 +0200
+++ postgresql.1/src/backend/port/ipc_test.c	2012-08-21 15:53:50.059329927 +0200
@@ -240,7 +240,7 @@ main(int argc, char **argv)
 	printf(Testing Lock ... );
 	fflush(stdout);
 
-	PGSemaphoreLock(storage-sem, false);
+	PGSemaphoreLock(storage-sem, false, NULL);
 
 	printf(OK\n);
 
@@ -262,8 +262,8 @@ main(int argc, char **argv)
 	PGSemaphoreUnlock(storage-sem);
 	PGSemaphoreUnlock(storage-sem);
 
-	PGSemaphoreLock(storage-sem, false);
-	PGSemaphoreLock(storage-sem, false);
+	PGSemaphoreLock(storage-sem, false, NULL);
+	PGSemaphoreLock(storage-sem, false, NULL);
 
 	printf(OK\n);
 
@@ -311,7 +311,7 @@ main(int argc, char **argv)
 	printf(Waiting for child (should wait 3 sec here) ... );
 	fflush(stdout);
 
-	PGSemaphoreLock(storage-sem, false);
+	PGSemaphoreLock(storage-sem, false, NULL);
 
 	printf(OK\n);
 
diff -durpN postgresql/src/backend/port/posix_sema.c postgresql.1/src/backend/port/posix_sema.c
--- postgresql/src/backend/port/posix_sema.c	2012-04-16 19:57:22.438915489 +0200
+++ postgresql.1/src/backend/port/posix_sema.c	2012-08-21 15:49:26.215579665 +0200
@@ -236,9 +236,11 @@ PGSemaphoreReset(PGSemaphore sema)
  * Lock a semaphore (decrement count), blocking if count would be  0
  */
 void
-PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
+PGSemaphoreLock(PGSemaphore sema, bool interruptOK,
+	PGSemaphoreCondition condition_checker)
 {
 	int			errStatus;
+	bool			condition = false;
 
 	/*
 	 * See notes in sysv_sema.c's implementation of PGSemaphoreLock. Just as
@@ -252,8 +254,12 @@ PGSemaphoreLock(PGSemaphore sema, bool i
 		CHECK_FOR_INTERRUPTS();
 		errStatus = sem_wait(PG_SEM_REF(sema));
 		ImmediateInterruptOK = false;
-	} while (errStatus  0  errno == EINTR);
+		if (condition_checker)
+			condition = condition_checker();
+	} while (errStatus  0  errno == EINTR  !condition);
 
+	if (condition)
+		return;
 	if (errStatus  0)
 		elog(FATAL, sem_wait failed: %m);
 }
diff -durpN postgresql/src/backend/port/sysv_sema.c postgresql.1/src/backend/port/sysv_sema.c
--- postgresql/src/backend/port/sysv_sema.c	2012-05-14 08:20:56.284830580 +0200
+++ postgresql.1/src/backend/port/sysv_sema.c	2012-08-21 15:49:26.991584804 +0200
@@ -358,9 +358,11 @@ PGSemaphoreReset(PGSemaphore sema)
  * Lock a semaphore (decrement count), blocking if count would be  0
  */
 void
-PGSemaphoreLock(PGSemaphore sema, bool interruptOK)
+PGSemaphoreLock(PGSemaphore sema, bool interruptOK,
+	PGSemaphoreCondition condition_checker)
 {
 	int			errStatus;
+	bool			condition = false;
 	struct sembuf sops;
 
 	sops.sem_op = -1;			/* decrement */
@@ -414,8 +416,12 @@ PGSemaphoreLock(PGSemaphore sema, bool i
 		CHECK_FOR_INTERRUPTS();
 		errStatus = semop(sema-semId, sops, 1);
 		ImmediateInterruptOK = false;
-	} while (errStatus  0  errno == EINTR);
+		if (condition_checker)
+			condition = condition_checker();
+	} while (errStatus  0  errno == EINTR  !condition);
 
+	if (condition)
+		return;
 	if (errStatus  0)
 		elog(FATAL, semop(id=%d) failed: %m, sema-semId);
 }
diff -durpN postgresql/src/backend/port/win32_sema.c postgresql.1/src/backend/port/win32_sema.c
--- postgresql/src/backend/port/win32_sema.c	2012-06-11 06:22:48.137921483 +0200
+++ postgresql.1/src/backend/port/win32_sema.c	2012-08-21 15:49:24.921571070 +0200
@@ -116,10 +116,12 @@ PGSemaphoreReset(PGSemaphore sema)
  * Serve the interrupt if interruptOK is true.
  */
 void
-PGSemaphoreLock(PGSemaphore 

Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-22 Thread Boszormenyi Zoltan

2012-07-17 06:32 keltezéssel, Alvaro Herrera írta:

Excerpts from Tom Lane's message of vie jul 13 18:23:31 -0400 2012:

Boszormenyi Zoltan z...@cybertec.at writes:

Try SET deadlock_timeout = 0;

Actually, when I try that I get

ERROR:  0 is outside the valid range for parameter deadlock_timeout (1 .. 
2147483647)

So I don't see any bug here.

I committed this patch without changing this.  If this needs patched,
please speak up.  I also considered adding a comment on
enable_timeout_after about calling it with a delay of 0, but refrained;
if anybody thinks it's necessary, suggestions are welcome.


Thanks for committing this part.

Attached is the revised (and a lot leaner, more generic) lock timeout patch,
which introduces new functionality for the timeout registration framework.
The new functionality is called extra timeouts, better naming is welcome.
Instead of only the previously defined (deadlock and statement) timeouts,
the extra timeouts can also be activated from within ProcSleep() in a linked
way.

The lock timeout is a special case of this functionality. To show this, this 
patch
is split into two again to make reviewing easier.

This way, the timeout framework is really usable for external modules, as
envisioned by you guys

Also, rewriting statement and deadlock timeouts using this functionality
and unifying the two registration interfaces may be possible later. But
messing up proven and working code is not in the scope of this patch or
my job at this point.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/

diff -durpN postgresql/src/backend/port/posix_sema.c postgresql.1/src/backend/port/posix_sema.c
--- postgresql/src/backend/port/posix_sema.c	2012-04-16 19:57:22.438915489 +0200
+++ postgresql.1/src/backend/port/posix_sema.c	2012-07-22 21:34:50.475375677 +0200
@@ -24,6 +24,7 @@
 #include miscadmin.h
 #include storage/ipc.h
 #include storage/pg_sema.h
+#include utils/timeout.h
 
 
 #ifdef USE_NAMED_POSIX_SEMAPHORES
@@ -313,3 +314,31 @@ PGSemaphoreTryLock(PGSemaphore sema)
 
 	return true;
 }
+
+/*
+ * PGSemaphoreTimedLock
+ *
+ * Lock a semaphore (decrement count), blocking if count would be  0
+ * Return if lock_timeout expired
+ */
+bool
+PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+{
+	int			errStatus;
+	bool			timeout;
+
+	do
+	{
+		ImmediateInterruptOK = interruptOK;
+		CHECK_FOR_INTERRUPTS();
+		errStatus = sem_wait(PG_SEM_REF(sema));
+		ImmediateInterruptOK = false;
+		timeout = ExtraTimeoutCondition();
+	} while (errStatus  0  errno == EINTR  !timeout);
+
+	if (timeout)
+		return false;
+	if (errStatus  0)
+		elog(FATAL, sem_wait failed: %m);
+	return true;
+}
diff -durpN postgresql/src/backend/port/sysv_sema.c postgresql.1/src/backend/port/sysv_sema.c
--- postgresql/src/backend/port/sysv_sema.c	2012-05-14 08:20:56.284830580 +0200
+++ postgresql.1/src/backend/port/sysv_sema.c	2012-07-22 21:34:50.476375683 +0200
@@ -27,6 +27,7 @@
 #include miscadmin.h
 #include storage/ipc.h
 #include storage/pg_sema.h
+#include utils/timeout.h
 
 
 #ifndef HAVE_UNION_SEMUN
@@ -492,3 +493,36 @@ PGSemaphoreTryLock(PGSemaphore sema)
 
 	return true;
 }
+
+/*
+ * PGSemaphoreTimedLock
+ *
+ * Lock a semaphore (decrement count), blocking if count would be  0
+ * Return if lock_timeout expired
+ */
+bool
+PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+{
+	int			errStatus;
+	bool			timeout;
+	struct sembuf sops;
+
+	sops.sem_op = -1;			/* decrement */
+	sops.sem_flg = 0;
+	sops.sem_num = sema-semNum;
+
+	do
+	{
+		ImmediateInterruptOK = interruptOK;
+		CHECK_FOR_INTERRUPTS();
+		errStatus = semop(sema-semId, sops, 1);
+		ImmediateInterruptOK = false;
+		timeout = ExtraTimeoutCondition();
+	} while (errStatus  0  errno == EINTR  !timeout);
+
+	if (timeout)
+		return false;
+	if (errStatus  0)
+		elog(FATAL, semop(id=%d) failed: %m, sema-semId);
+	return true;
+}
diff -durpN postgresql/src/backend/port/win32_sema.c postgresql.1/src/backend/port/win32_sema.c
--- postgresql/src/backend/port/win32_sema.c	2012-06-11 06:22:48.137921483 +0200
+++ postgresql.1/src/backend/port/win32_sema.c	2012-07-22 21:34:50.476375683 +0200
@@ -16,6 +16,7 @@
 #include miscadmin.h
 #include storage/ipc.h
 #include storage/pg_sema.h
+#include utils/timeout.h
 
 static HANDLE *mySemSet;		/* IDs of sema sets acquired so far */
 static int	numSems;			/* number of sema sets acquired so far */
@@ -209,3 +210,65 @@ PGSemaphoreTryLock(PGSemaphore sema)
 	/* keep compiler quiet */
 	return false;
 }
+
+/*
+ * PGSemaphoreTimedLock
+ *
+ * Lock a semaphore (decrement count), blocking if count would be  0.
+ * Serve the interrupt if interruptOK is true.
+ * Return if lock_timeout expired.
+ */
+bool
+PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+{
+	DWORD		ret;
+	HANDLE		wh[2];
+	bool			timeout;
+
+	/*
+	 * Note: 

Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-16 Thread Alvaro Herrera

Excerpts from Tom Lane's message of vie jul 13 18:23:31 -0400 2012:
 
 Boszormenyi Zoltan z...@cybertec.at writes:
  Try SET deadlock_timeout = 0;
 
 Actually, when I try that I get
 
 ERROR:  0 is outside the valid range for parameter deadlock_timeout (1 .. 
 2147483647)
 
 So I don't see any bug here.

I committed this patch without changing this.  If this needs patched,
please speak up.  I also considered adding a comment on
enable_timeout_after about calling it with a delay of 0, but refrained;
if anybody thinks it's necessary, suggestions are welcome.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-14 Thread Boszormenyi Zoltan

2012-07-14 00:36 keltezéssel, Tom Lane írta:

Alvaro Herrera alvhe...@commandprompt.com writes:

For what it's worth, I would appreciate it if you would post the lock
timeout patch for the upcoming commitfest.

+1.  I think it's reasonable to get the infrastructure patch in now,
but we are running out of time in this commitfest (and there's still
a lot of patches that haven't been reviewed at all).

regards, tom lane



OK, I will do that.

Thanks for the review.

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-13 Thread Boszormenyi Zoltan

2012-07-11 21:47 keltezéssel, Tom Lane írta:

Boszormenyi Zoltan z...@cybertec.at writes:

Attached are the refreshed patches. InitializeTimeouts() can be called
twice and PGSemaphoreTimedLock() returns bool now. This saves
two calls to get_timeout_indicator().

I'm starting to look at this patch now.  There are a number of cosmetic
things I don't care for, the biggest one being the placement of
timeout.c under storage/lmgr/.  That seems an entirely random place,
since the functionality provided has got nothing to do with storage
let alone locks.  I'm inclined to think that utils/misc/ is about
the best option in the existing backend directory hierarchy.  Anybody
object to that, or have a better idea?


Good idea, storage/lmgr/timeout.c was chosen simply because
it was born out of files living there.


Another thing that needs some discussion is the handling of
InitializeTimeouts.  As designed, I think it's completely unsafe,
the reason being that if a process using timeouts forks off another
one, the child will inherit the parent's timeout reasons and be unable
to reset them.  Right now this might not be such a big problem because
the postmaster doesn't need any timeouts, but what if it does in the
future?  So I think we should drop the base_timeouts_initialized
protection, and that means we need a pretty consistent scheme for
where to call InitializeTimeouts.  But we already have the same issue
with respect to on_proc_exit callbacks, so we can just add
InitializeTimeouts calls in the same places as on_exit_reset().

Comments?

I'll work up a revised patch and post it.

regards, tom lane




--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-13 Thread Boszormenyi Zoltan

2012-07-11 22:59 keltezéssel, Tom Lane írta:

I wrote:

I'm starting to look at this patch now.

After reading this further, I think that the sched_next option is a
bad idea and we should get rid of it.  AFAICT, what it is meant to do
is (if !sched_next) automatically do disable_all_timeouts(true) if
the particular timeout happens to fire.  But there is no reason the
timeout's callback function couldn't do that; and doing it in the
callback is more flexible since you could have logic about whether to do
it or not, rather than freezing the decision at RegisterTimeout time.
Moreover, it does not seem to me to be a particularly good idea to
encourage timeouts to have such behavior, anyway.  Each time we add
another timeout we'd have to look to see if it's still sane for each
existing timeout to use !sched_next.  It would likely be better, in
most cases, for individual callbacks to explicitly disable any other
individual timeout reasons that should no longer be fired.


+1


I am also underwhelmed by the timeout_start callback function concept.


It was generalized out of static TimestampTz timeout_start_time;
in proc.c which is valid if deadlock_timeout is activated. It is used
in ProcSleep() for logging the time difference between the time when
the timeout was activated and now at several places in that function.


In the first place, that's broken enable_timeout, which incorrectly
assumes that the value it gets must be now (see its schedule_alarm
call).


You're right, it must be fixed.


   In the second place, it seems fairly likely that callers of
get_timeout_start would likewise want the clock time at which the
timeout was enabled, not the timeout_start reference time.  (If they
did want the latter, why couldn't they get it from wherever the callback
function had gotten it?)  I'm inclined to propose that we drop the
timeout_start concept and instead provide two functions for scheduling
interrupts:

enable_timeout_after(TimeoutName tn, int delay_ms);
enable_timeout_at(TimeoutName tn, TimestampTz fin_time);

where you use the former if you want the standard GetCurrentTimestamp +
n msec calculation, but if you want the stop time calculated in some
other way, you calculate it yourself and use the second function.

regards, tom lane




--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-13 Thread Boszormenyi Zoltan

2012-07-12 19:05 keltezéssel, Tom Lane írta:

Here is a revised version of the timeout-infrastructure patch.
I whacked it around quite a bit, notably:

* I decided that the most convenient way to handle the initialization
issue was to combine establishment of the signal handler with resetting
of the per-process variables.  So handle_sig_alarm is no longer global,
and InitializeTimeouts is called at the places where we used to do
pqsignal(SIGALRM, handle_sig_alarm);.  I believe this is correct
because any subprocess that was intending to use SIGALRM must have
called that before establishing any timeouts.


OK.


* BTW, doing that exposed the fact that walsender processes were failing
to establish a SIGALRM signal handler, which is a pre-existing bug,
because they run a normal authentication transaction during InitPostgres
and hence need to be able to cope with deadlock and statement timeouts.
I will do something about back-patching a fix for that.


Wow, my work uncovered a pre-existing bug in PostgreSQL. :-)


* I ended up putting the RegisterTimeout calls for DEADLOCK_TIMEOUT
and STATEMENT_TIMEOUT into InitPostgres, ensuring that they'd get
done in walsender and autovacuum processes.  I'm not totally satisfied
with that, but for sure it didn't work to only establish them in
regular backends.

* I didn't like the TimeoutName nomenclature, because to me name
suggests that the value is a string, not just an enum.  So I renamed
that to TimeoutId.


OK.


* I whacked around the logic in timeout.c a fair amount, because it
had race conditions if SIGALRM happened while enabling or disabling
a timeout.  I believe the attached coding is safe, but I'm not totally
happy with it from a performance standpoint, because it will do two
setitimer calls (a disable and then a re-enable) in many cases where
the old code did only one.


Disabling deadlock timeout, a.k.a. disable_sig_alarm(false) in
the original code called setitimer() twice if statement timeout
was still in effect, it was done in CheckStatementTimeout().
Considering this, I think there is no performance problem with
the new code you came up with.


I think what we ought to do is go ahead and apply this, so that we
can have the API nailed down, and then we can revisit the internals
of timeout.c to see if we can't get the performance back up.
It's clearly a much cleaner design than the old spaghetti logic in
proc.c, so I think we ought to go ahead with this independently of
whether the second patch gets accepted.


There is one tiny bit you might have broken. You wrote previously:


I am also underwhelmed by the timeout_start callback function concept.
In the first place, that's broken enable_timeout, which incorrectly
assumes that the value it gets must be now (see its schedule_alarm
call).  In the second place, it seems fairly likely that callers of
get_timeout_start would likewise want the clock time at which the
timeout was enabled, not the timeout_start reference time.  (If they
did want the latter, why couldn't they get it from wherever the callback
function had gotten it?)  I'm inclined to propose that we drop the
timeout_start concept and instead provide two functions for scheduling
interrupts:

enable_timeout_after(TimeoutName tn, int delay_ms);
enable_timeout_at(TimeoutName tn, TimestampTz fin_time);

where you use the former if you want the standard GetCurrentTimestamp +
n msec calculation, but if you want the stop time calculated in some
other way, you calculate it yourself and use the second function.


It's okay, but you haven't really followed it with STATEMENT_TIMEOUT:

-8--8--8--8--8-
*** 2396,2404 
/* Set statement timeout running, if any */
/* NB: this mustn't be enabled until we are within an xact */
if (StatementTimeout  0)
!   enable_sig_alarm(StatementTimeout, true);
else
!   cancel_from_timeout = false;

xact_started = true;
}
--- 2397,2405 
/* Set statement timeout running, if any */
/* NB: this mustn't be enabled until we are within an xact */
if (StatementTimeout  0)
!   enable_timeout_after(STATEMENT_TIMEOUT, 
StatementTimeout);
else
!   disable_timeout(STATEMENT_TIMEOUT, false);

xact_started = true;
}
-8--8--8--8--8-

It means that StatementTimeout losts its precision. It would trigger
in the future counting from now instead of counting from
GetCurrentStatementStartTimestamp(). It should be:

enable_timeout_at(STATEMENT_TIMEOUT,
TimestampTzPlusMilliseconds(GetCurrentStatementStartTimestamp(), 
StatementTimeout));


I haven't really looked at the second patch yet, but at minimum that
will need some rebasing to match the API tweaks here.


Yes, I will do that.

Thanks for your 

Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-13 Thread Boszormenyi Zoltan

2012-07-13 22:32 keltezéssel, Boszormenyi Zoltan írta:

2012-07-12 19:05 keltezéssel, Tom Lane írta:


I haven't really looked at the second patch yet, but at minimum that
will need some rebasing to match the API tweaks here.


Yes, I will do that.


While doing it, I discovered another bug you introduced.
enable_timeout_after(..., 0); would set an alarm instead of ignoring it.
Try SET deadlock_timeout = 0;

Same for enable_timeout_at(..., fin_time): if fin_time points to the past,
it enables a huge timeout that wouldn't possibly trigger for short
transactions but it's a bug nevertheless.



Thanks for your review and work.

Best regards,
Zoltán Böszörményi




--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-13 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes:
 It means that StatementTimeout losts its precision. It would trigger
 in the future counting from now instead of counting from
 GetCurrentStatementStartTimestamp().

That is, in fact, better not worse.  Note the comment in the existing
code:

 * Begin statement-level timeout
 *
 * Note that we compute statement_fin_time with reference to the
 * statement_timestamp, but apply the specified delay without any
 * correction; that is, we ignore whatever time has elapsed since
 * statement_timestamp was set.  In the normal case only a small
 * interval will have elapsed and so this doesn't matter, but there
 * are corner cases (involving multi-statement query strings with
 * embedded COMMIT or ROLLBACK) where we might re-initialize the
 * statement timeout long after initial receipt of the message. In
 * such cases the enforcement of the statement timeout will be a bit
 * inconsistent.  This annoyance is judged not worth the cost of
 * performing an additional gettimeofday() here.

That is, measuring from GetCurrentStatementStartTimestamp is a hack to
save one gettimeofday call, at the cost of making the timeout less
accurate, sometimes significantly so.  In the new model there isn't any
good way to duplicate this kluge (in particular, there's no point in
using enable_timeout_at, because that will still make a gettimeofday
call).  That doesn't bother me too much.  I'd rather try to buy back
whatever performance was lost by seeing if we can reduce the number of
setitimer calls.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-13 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes:
 While doing it, I discovered another bug you introduced.
 enable_timeout_after(..., 0); would set an alarm instead of ignoring it.
 Try SET deadlock_timeout = 0;

Hm.  I don't think it's a bug for enable_timeout_after(..., 0) to cause
a timeout ... but we'll have to change the calling code.  Thanks for
catching that.

 Same for enable_timeout_at(..., fin_time): if fin_time points to the past,
 it enables a huge timeout

No, it should cause an immediate interrupt, or at least after 1
microsecond.  Look at TimestampDifference.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-13 Thread Boszormenyi Zoltan

2012-07-13 23:51 keltezéssel, Tom Lane írta:

Boszormenyi Zoltan z...@cybertec.at writes:

While doing it, I discovered another bug you introduced.
enable_timeout_after(..., 0); would set an alarm instead of ignoring it.
Try SET deadlock_timeout = 0;

Hm.  I don't think it's a bug for enable_timeout_after(..., 0) to cause
a timeout ... but we'll have to change the calling code.  Thanks for
catching that.


You're welcome. This caused a segfault in my second patch,
the code didn't expect enable_timeout_after(..., 0);
to set up a timer.

So, the calling code should check for the value and not call
enable_timeout_*() when it shouldn't, right? It's making the code
more obvious for the casual reader, I agree it's better that way.

Will you post a new version with callers checking their *Timeout settings
or commit it with this change? I can then post a new second patch.

Regarding the lock_timeout functionality: the patch can be reduced to
about half of its current size and it would be a lot less intrusive if the
LockAcquire() callers don't need to report the individual object types
and names or OIDs. Do you prefer the verbose ereport()s or a
generic one about lock timeout triggered in ProcSleep()?


Same for enable_timeout_at(..., fin_time): if fin_time points to the past,
it enables a huge timeout

No, it should cause an immediate interrupt, or at least after 1
microsecond.  Look at TimestampDifference.


Okay.

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-13 Thread Alvaro Herrera

Excerpts from Boszormenyi Zoltan's message of vie jul 13 18:11:27 -0400 2012:

 Regarding the lock_timeout functionality: the patch can be reduced to
 about half of its current size and it would be a lot less intrusive if the
 LockAcquire() callers don't need to report the individual object types
 and names or OIDs. Do you prefer the verbose ereport()s or a
 generic one about lock timeout triggered in ProcSleep()?

For what it's worth, I would appreciate it if you would post the lock
timeout patch for the upcoming commitfest.  This one is already almost a
month long now.  That way we can close this CF item soon and concentrate
on the remaining patches.  This one has received its fair share of
committer attention already, ISTM.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-13 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes:
 Try SET deadlock_timeout = 0;

Actually, when I try that I get

ERROR:  0 is outside the valid range for parameter deadlock_timeout (1 .. 
2147483647)

So I don't see any bug here.  The places that are unconditionally doing
enable_timeout_after(..., DeadlockTimeout); are perfectly fine.  The
only place that might need an if-test has already got one:

  if (StatementTimeout  0)
! enable_timeout_after(STATEMENT_TIMEOUT, StatementTimeout);
  else
! disable_timeout(STATEMENT_TIMEOUT, false);


regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-13 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 For what it's worth, I would appreciate it if you would post the lock
 timeout patch for the upcoming commitfest.

+1.  I think it's reasonable to get the infrastructure patch in now,
but we are running out of time in this commitfest (and there's still
a lot of patches that haven't been reviewed at all).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-12 Thread Tom Lane
Here is a revised version of the timeout-infrastructure patch.
I whacked it around quite a bit, notably:

* I decided that the most convenient way to handle the initialization
issue was to combine establishment of the signal handler with resetting
of the per-process variables.  So handle_sig_alarm is no longer global,
and InitializeTimeouts is called at the places where we used to do
pqsignal(SIGALRM, handle_sig_alarm);.  I believe this is correct
because any subprocess that was intending to use SIGALRM must have
called that before establishing any timeouts.

* BTW, doing that exposed the fact that walsender processes were failing
to establish a SIGALRM signal handler, which is a pre-existing bug,
because they run a normal authentication transaction during InitPostgres
and hence need to be able to cope with deadlock and statement timeouts.
I will do something about back-patching a fix for that.

* I ended up putting the RegisterTimeout calls for DEADLOCK_TIMEOUT
and STATEMENT_TIMEOUT into InitPostgres, ensuring that they'd get
done in walsender and autovacuum processes.  I'm not totally satisfied
with that, but for sure it didn't work to only establish them in
regular backends.

* I didn't like the TimeoutName nomenclature, because to me name
suggests that the value is a string, not just an enum.  So I renamed
that to TimeoutId.

* I whacked around the logic in timeout.c a fair amount, because it
had race conditions if SIGALRM happened while enabling or disabling
a timeout.  I believe the attached coding is safe, but I'm not totally
happy with it from a performance standpoint, because it will do two
setitimer calls (a disable and then a re-enable) in many cases where
the old code did only one.

I think what we ought to do is go ahead and apply this, so that we
can have the API nailed down, and then we can revisit the internals
of timeout.c to see if we can't get the performance back up.
It's clearly a much cleaner design than the old spaghetti logic in
proc.c, so I think we ought to go ahead with this independently of
whether the second patch gets accepted.

I haven't really looked at the second patch yet, but at minimum that
will need some rebasing to match the API tweaks here.

regards, tom lane



binwIymnjnW5K.bin
Description: 1-timeout-framework-v16.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-11 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes:
 Attached are the refreshed patches. InitializeTimeouts() can be called
 twice and PGSemaphoreTimedLock() returns bool now. This saves
 two calls to get_timeout_indicator().

I'm starting to look at this patch now.  There are a number of cosmetic
things I don't care for, the biggest one being the placement of
timeout.c under storage/lmgr/.  That seems an entirely random place,
since the functionality provided has got nothing to do with storage
let alone locks.  I'm inclined to think that utils/misc/ is about
the best option in the existing backend directory hierarchy.  Anybody
object to that, or have a better idea?

Another thing that needs some discussion is the handling of
InitializeTimeouts.  As designed, I think it's completely unsafe,
the reason being that if a process using timeouts forks off another
one, the child will inherit the parent's timeout reasons and be unable
to reset them.  Right now this might not be such a big problem because
the postmaster doesn't need any timeouts, but what if it does in the
future?  So I think we should drop the base_timeouts_initialized
protection, and that means we need a pretty consistent scheme for
where to call InitializeTimeouts.  But we already have the same issue
with respect to on_proc_exit callbacks, so we can just add
InitializeTimeouts calls in the same places as on_exit_reset().

Comments?

I'll work up a revised patch and post it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-11 Thread Alvaro Herrera

Excerpts from Tom Lane's message of mié jul 11 15:47:47 -0400 2012:
 
 Boszormenyi Zoltan z...@cybertec.at writes:
  Attached are the refreshed patches. InitializeTimeouts() can be called
  twice and PGSemaphoreTimedLock() returns bool now. This saves
  two calls to get_timeout_indicator().
 
 I'm starting to look at this patch now.  There are a number of cosmetic
 things I don't care for, the biggest one being the placement of
 timeout.c under storage/lmgr/.  That seems an entirely random place,
 since the functionality provided has got nothing to do with storage
 let alone locks.  I'm inclined to think that utils/misc/ is about
 the best option in the existing backend directory hierarchy.  Anybody
 object to that, or have a better idea?

I agree with the proposed new location.

 Another thing that needs some discussion is the handling of
 InitializeTimeouts.  As designed, I think it's completely unsafe,
 the reason being that if a process using timeouts forks off another
 one, the child will inherit the parent's timeout reasons and be unable
 to reset them.  Right now this might not be such a big problem because
 the postmaster doesn't need any timeouts, but what if it does in the
 future?  So I think we should drop the base_timeouts_initialized
 protection, and that means we need a pretty consistent scheme for
 where to call InitializeTimeouts.  But we already have the same issue
 with respect to on_proc_exit callbacks, so we can just add
 InitializeTimeouts calls in the same places as on_exit_reset().

I do agree that InitializeTimeouts is not optimally placed.  We
discussed this upthread.

Some of the calls of on_exit_reset() are placed in code that's about to
die.  Surely we don't need InitializeTimeouts() then.  Maybe we should
have another routine, say InitializeProcess (noting we already
InitProcess so maybe some name would be good), that calls both
on_exit_reset and InitializeTimeouts.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-11 Thread Tom Lane
I wrote:
 I'm starting to look at this patch now.

After reading this further, I think that the sched_next option is a
bad idea and we should get rid of it.  AFAICT, what it is meant to do
is (if !sched_next) automatically do disable_all_timeouts(true) if
the particular timeout happens to fire.  But there is no reason the
timeout's callback function couldn't do that; and doing it in the
callback is more flexible since you could have logic about whether to do
it or not, rather than freezing the decision at RegisterTimeout time.
Moreover, it does not seem to me to be a particularly good idea to
encourage timeouts to have such behavior, anyway.  Each time we add
another timeout we'd have to look to see if it's still sane for each
existing timeout to use !sched_next.  It would likely be better, in
most cases, for individual callbacks to explicitly disable any other
individual timeout reasons that should no longer be fired.

I am also underwhelmed by the timeout_start callback function concept.
In the first place, that's broken enable_timeout, which incorrectly
assumes that the value it gets must be now (see its schedule_alarm
call).  In the second place, it seems fairly likely that callers of
get_timeout_start would likewise want the clock time at which the
timeout was enabled, not the timeout_start reference time.  (If they
did want the latter, why couldn't they get it from wherever the callback
function had gotten it?)  I'm inclined to propose that we drop the
timeout_start concept and instead provide two functions for scheduling
interrupts:

enable_timeout_after(TimeoutName tn, int delay_ms);
enable_timeout_at(TimeoutName tn, TimestampTz fin_time);

where you use the former if you want the standard GetCurrentTimestamp +
n msec calculation, but if you want the stop time calculated in some
other way, you calculate it yourself and use the second function.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-11 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Tom Lane's message of mié jul 11 15:47:47 -0400 2012:
 ... that means we need a pretty consistent scheme for
 where to call InitializeTimeouts.  But we already have the same issue
 with respect to on_proc_exit callbacks, so we can just add
 InitializeTimeouts calls in the same places as on_exit_reset().

 I do agree that InitializeTimeouts is not optimally placed.  We
 discussed this upthread.

 Some of the calls of on_exit_reset() are placed in code that's about to
 die.  Surely we don't need InitializeTimeouts() then.  Maybe we should
 have another routine, say InitializeProcess (noting we already
 InitProcess so maybe some name would be good), that calls both
 on_exit_reset and InitializeTimeouts.

Yeah, I was wondering about that too, but it seems a bit ad-hoc from a
modularity standpoint.  I gave some consideration to the idea of putting
these calls directly into fork_process(), but we'd have to be very sure
that there would never be a case where it was incorrect to do them after
forking.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-04 Thread Alvaro Herrera

Excerpts from Boszormenyi Zoltan's message of mié jul 04 06:32:46 -0400 2012:
 2012-07-04 12:09 keltezéssel, Boszormenyi Zoltan írta:

  You just broke initdb with this cleanup. :-)

Ouch.

  initdb starts postgres --single, that doesn't do BackendInitialize(),
  only PostgresMain(). So, you need InitializeTimeouts() before
  the RegisterTimeout() calls in PostgresMain and the elog(PANIC)
  must not be in InitializeTimeouts() if called twice.
 
 Attached is the fix for this problem. PostgresMain() has a new
 argument: bool single_user. This way, InitializeTimeouts() can
 keep its elog(PANIC) if called twice and postgres --single
 doesn't fail its Assert() in RegisterTimeout().

Hmm.  Maybe it's better to leave InitializeTimeouts to be called twice
after all.  The fix seems a lot uglier than the disease it's curing.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-04 Thread Alvaro Herrera

Excerpts from Boszormenyi Zoltan's message of mié jul 04 07:03:44 -0400 2012:
 
 2012-07-03 23:38 keltezéssel, Alvaro Herrera írta:
 
  I don't understand why PGSemaphoreTimedLock() is not broken.  I mean
  surely you need a bool return to let the caller know whether the
  acquisition succeeded or failed?
 
 Well, this is the same interface PGSemaphoreTryLock() uses.
 By this reasoning, it's also broken.

Eh, no -- as far as I can see, PGSemaphoreTryLock returns a boolean,
which is precisely my point.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-04 Thread Boszormenyi Zoltan

2012-07-04 17:25 keltezéssel, Alvaro Herrera írta:

Excerpts from Boszormenyi Zoltan's message of mié jul 04 07:03:44 -0400 2012:

2012-07-03 23:38 keltezéssel, Alvaro Herrera írta:

I don't understand why PGSemaphoreTimedLock() is not broken.  I mean
surely you need a bool return to let the caller know whether the
acquisition succeeded or failed?

Well, this is the same interface PGSemaphoreTryLock() uses.
By this reasoning, it's also broken.

Eh, no -- as far as I can see, PGSemaphoreTryLock returns a boolean,
which is precisely my point.


You're right. I blame the heat for not being able to properly
read my own code.

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-04 Thread Boszormenyi Zoltan

2012-07-03 23:31 keltezéssel, Alvaro Herrera írta:

Excerpts from Boszormenyi Zoltan's message of vie jun 29 14:30:28 -0400 2012:


Does anyone have a little time to look at the latest timeout framework
with the registration interface and the 2nd patch too? I am at work
until Friday next week, after that I will be on vacation for two weeks.
Just in case there is anything that needs tweaking to make it more
acceptable.

I cleaned up this a bit more and now I think it's ready to commit --
as soon as somebody tests that the standby bits still work.


You just broke initdb with this cleanup. :-)

---8--8--8--8--8--8--8---
$ cat src/test/regress/log/initdb.log
Running in noclean mode.  Mistakes will not be cleaned up.
The files belonging to this database system will be owned by user zozo.
This user must also own the server process.

The database cluster will be initialized with locales
  COLLATE:  hu_HU.utf8
  CTYPE:hu_HU.utf8
  MESSAGES: C
  MONETARY: hu_HU.utf8
  NUMERIC:  hu_HU.utf8
  TIME: hu_HU.utf8
The default database encoding has accordingly been set to UTF8.
The default text search configuration will be set to hungarian.

creating directory 
/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/./tmp_check/data ... ok

creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 32MB
creating configuration files ... ok
creating template1 database in 
/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/./tmp_check/data/base/1 ... ok
initializing pg_authid ... TRAP: FailedAssertion(!(base_timeouts_initialized), File: 
timeout.c, Line: 217)
sh: line 1: 29872 Aborted (core dumped) 
/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/tmp_check/install/home/zozo/pgc92dev-locktimeout/bin/postgres 
--single -F -O -c search_path=pg_catalog -c exit_on_error=true template1  /dev/null

child process exited with exit code 134
initdb: data directory 
/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/./tmp_check/data not 
removed at user's request

---8--8--8--8--8--8--8---

initdb starts postgres --single, that doesn't do BackendInitialize(),
only PostgresMain(). So, you need InitializeTimeouts() before
the RegisterTimeout() calls in PostgresMain and the elog(PANIC)
must not be in InitializeTimeouts() if called twice.


--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-04 Thread Boszormenyi Zoltan

2012-07-04 12:09 keltezéssel, Boszormenyi Zoltan írta:

2012-07-03 23:31 keltezéssel, Alvaro Herrera írta:

Excerpts from Boszormenyi Zoltan's message of vie jun 29 14:30:28 -0400 2012:


Does anyone have a little time to look at the latest timeout framework
with the registration interface and the 2nd patch too? I am at work
until Friday next week, after that I will be on vacation for two weeks.
Just in case there is anything that needs tweaking to make it more
acceptable.

I cleaned up this a bit more and now I think it's ready to commit --
as soon as somebody tests that the standby bits still work.


You just broke initdb with this cleanup. :-)

---8--8--8--8--8--8--8---
$ cat src/test/regress/log/initdb.log
Running in noclean mode.  Mistakes will not be cleaned up.
The files belonging to this database system will be owned by user zozo.
This user must also own the server process.

The database cluster will be initialized with locales
  COLLATE:  hu_HU.utf8
  CTYPE:hu_HU.utf8
  MESSAGES: C
  MONETARY: hu_HU.utf8
  NUMERIC:  hu_HU.utf8
  TIME: hu_HU.utf8
The default database encoding has accordingly been set to UTF8.
The default text search configuration will be set to hungarian.

creating directory 
/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/./tmp_check/data ... ok

creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 32MB
creating configuration files ... ok
creating template1 database in 
/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/./tmp_check/data/base/1 ... ok
initializing pg_authid ... TRAP: FailedAssertion(!(base_timeouts_initialized), File: 
timeout.c, Line: 217)
sh: line 1: 29872 Aborted (core dumped) 
/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/tmp_check/install/home/zozo/pgc92dev-locktimeout/bin/postgres 
--single -F -O -c search_path=pg_catalog -c exit_on_error=true template1  /dev/null

child process exited with exit code 134
initdb: data directory 
/home/zozo/lock-timeout/9.1/1/postgresql.14/src/test/regress/./tmp_check/data not 
removed at user's request

---8--8--8--8--8--8--8---

initdb starts postgres --single, that doesn't do BackendInitialize(),
only PostgresMain(). So, you need InitializeTimeouts() before
the RegisterTimeout() calls in PostgresMain and the elog(PANIC)
must not be in InitializeTimeouts() if called twice.




Attached is the fix for this problem. PostgresMain() has a new
argument: bool single_user. This way, InitializeTimeouts() can
keep its elog(PANIC) if called twice and postgres --single
doesn't fail its Assert() in RegisterTimeout().

Comments?

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/

diff -durpN postgresql.14.orig/src/backend/main/main.c postgresql.14/src/backend/main/main.c
--- postgresql.14.orig/src/backend/main/main.c	2012-06-26 09:10:21.272759354 +0200
+++ postgresql.14/src/backend/main/main.c	2012-07-04 12:21:58.869037014 +0200
@@ -192,7 +192,7 @@ main(int argc, char *argv[])
 	else if (argc  1  strcmp(argv[1], --describe-config) == 0)
 		GucInfoMain();			/* does not return */
 	else if (argc  1  strcmp(argv[1], --single) == 0)
-		PostgresMain(argc, argv, get_current_username(progname)); /* does not return */
+		PostgresMain(argc, argv, get_current_username(progname), true); /* does not return */
 	else
 		PostmasterMain(argc, argv); /* does not return */
 	abort();		/* should not get here */
diff -durpN postgresql.14.orig/src/backend/postmaster/postmaster.c postgresql.14/src/backend/postmaster/postmaster.c
--- postgresql.14.orig/src/backend/postmaster/postmaster.c	2012-07-04 12:25:09.247183727 +0200
+++ postgresql.14/src/backend/postmaster/postmaster.c	2012-07-04 12:23:22.933543240 +0200
@@ -3626,7 +3626,7 @@ BackendRun(Port *port)
 	 */
 	MemoryContextSwitchTo(TopMemoryContext);
 
-	PostgresMain(ac, av, port-user_name);
+	PostgresMain(ac, av, port-user_name, false);
 }
 
 
diff -durpN postgresql.14.orig/src/backend/tcop/postgres.c postgresql.14/src/backend/tcop/postgres.c
--- postgresql.14.orig/src/backend/tcop/postgres.c	2012-07-04 12:25:09.255183775 +0200
+++ postgresql.14/src/backend/tcop/postgres.c	2012-07-04 12:24:17.685873058 +0200
@@ -3509,7 +3509,7 @@ process_postgres_switches(int argc, char
  * 
  */
 void
-PostgresMain(int argc, char *argv[], const char *username)
+PostgresMain(int argc, char *argv[], const char *username, bool single_user)
 {
 	const char *dbname;
 	int			firstchar;
@@ -3603,8 +3603,11 @@ PostgresMain(int argc, char *argv[], con
 	{
 		/*
 		 * Register timeout sources needed by backend operation.  Note
-		 * that InitializeTimeout was already called by BackendInitialize.
+		 * that 

Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-04 Thread Boszormenyi Zoltan

2012-07-03 23:38 keltezéssel, Alvaro Herrera írta:


I don't understand why PGSemaphoreTimedLock() is not broken.  I mean
surely you need a bool return to let the caller know whether the
acquisition succeeded or failed?


Well, this is the same interface PGSemaphoreTryLock() uses.
By this reasoning, it's also broken.


   AFAICS you are relying on
get_timeout_indicator() but this seems to me the wrong thing to do ...
(not to mention how ugly it is to percolate through two levels of
abstraction)


What other way do you suggest? EINTR may come from
a different signal, which may also be ignored or not. Ctrl-C
is handled and leads to elog(ERROR) but an ignored signal
technically calls a NOP handler deep inside the OS runtime
libraries but the signal *is* delivered to the backend which
in turn interrupts semop() or whatever the platform equivalent is.

I can add a flag to timeout.c that is set whenever SIGALRM
is delivered but checking that would be another abstraction
violation as calling get_timeout_indicator() in your opinion.

The original coding of PGSemaphoreTryLock() used
semtimedop(), sem_timedwait() and the timeout value applied
to WaitForMultipleObjectsEx(). This was quickly shot down
as using the SIGALRM signal and its behaviour to interrupt the
locking operation is be better and fits the PostgreSQL portability
features. Also, OS X at the time didn't support sem_timedwait().

I am not complaining, just recalling the different details.

How about not hardcoding get_timeout_indicator(LOCK_TIMEOUT)
into PGSemaphoreTimedLock()? Passing TimeoutName to it would
make it more generic and usable for other timeout sources.

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-03 Thread Alvaro Herrera
Excerpts from Boszormenyi Zoltan's message of vie jun 29 14:30:28 -0400 2012:

 Does anyone have a little time to look at the latest timeout framework
 with the registration interface and the 2nd patch too? I am at work
 until Friday next week, after that I will be on vacation for two weeks.
 Just in case there is anything that needs tweaking to make it more
 acceptable.

I cleaned up this a bit more and now I think it's ready to commit --
as soon as somebody tests that the standby bits still work.

I still have not looked at the second patch.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


1-timeout-framework-v14.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-03 Thread Alvaro Herrera


I don't understand why PGSemaphoreTimedLock() is not broken.  I mean
surely you need a bool return to let the caller know whether the
acquisition succeeded or failed?  AFAICS you are relying on
get_timeout_indicator() but this seems to me the wrong thing to do ...
(not to mention how ugly it is to percolate through two levels of
abstraction)

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-29 Thread Boszormenyi Zoltan

2012-06-27 10:34 keltezéssel, Boszormenyi Zoltan írta:

2012-06-26 18:49 keltezéssel, Alvaro Herrera írta:

Excerpts from Boszormenyi Zoltan's message of mar jun 26 12:43:34 -0400 2012:


So, should I keep the enum TimeoutName? Are global variables for
keeping dynamically assigned values preferred over the enum?
Currently we have 5 timeout sources in total, 3 of them are used by
regular backends, the remaining 2 are used by replication standby.
We can have a fixed size array (say with 8 or 16 elements) for future use
and this would be plenty.

Opinions?

My opinion is that the fixed size array is fine.


Attached is the version which uses a registration interface.

Also, to further minimize knowledge of timeouts in timeout.c,
all GUCs are moved back to proc.c


I'll go set the patch waiting on author.  Also, remember to review
some other people's patches.


I will look into it.

Best regards,
Zoltán Böszörményi


Does anyone have a little time to look at the latest timeout framework
with the registration interface and the 2nd patch too? I am at work
until Friday next week, after that I will be on vacation for two weeks.
Just in case there is anything that needs tweaking to make it more
acceptable.

Thanks in advance,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-26 Thread Boszormenyi Zoltan

2012-06-26 06:59 keltezéssel, Alvaro Herrera írta:

I cleaned up the framework patch a bit.  My version's attached.  Mainly,
returning false for failure in some code paths that are only going to
have the caller elog(FATAL) is rather pointless -- it seems much better
to just have the code itself do the elog(FATAL).  In any case, I
searched for reports of those error messages being reported in the wild
and saw none.


OK. The cleanups are always good.

One nitpick, though. Your version doesn't contain the timeout.h
and compilation fails because of it. :-) You might have forgotten
to do git commit -a.


There are other things but they are in the nitpick department.  (A
reference to -check_timeout remains that needs to be fixed too).


Yes, it's called -timeout_func() now.


There is one thing that still bothers me on this whole framework patch,
but I'm not sure it's easily fixable.  Basically the API to timeout.c is
the whole list of timeouts and their whole behaviors.  If you want to
add a new one, you can't just call into the entry points, you have to
modify timeout.c itself (as well as timeout.h as well as whatever code
it is that you want to add timeouts to).  This may be good enough, but I
don't like it.  I think it boils down to proctimeout.h is cheating.

The interface I would actually like to have is something that lets the
modules trying to get a timeout register the timeout-checking function
as a callback.  API-wise, it would be much cleaner.  However, I'm not
raelly sure that code-wise it would be any cleaner at all.  In fact I
think it'd be rather cumbersome.  However, if you could give that idea
some thought, it'd be swell.


Well, I can make the registration interface similar to how LWLocks
are treated, but that doesn't avoid modification of the base_timeouts
array in case a new internal use case arises. Say:

#define USER_TIMEOUTS4

intn_timeouts = TIMEOUT_MAX;
static timeout_params base_timeouts[TIMEOUT_MAX + USER_TIMEOUTS];

and register_timeout() adds data after TIMEOUT_MAX.


I haven't looked at the second patch at all yet.


This is the whole point of the first patch. But as I said above for
registering a new timeout source, it's a new internal use case.
One that touches a lot of places which cannot simply be done
by registering a new timeout source.

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-26 Thread Robert Haas
On Tue, Jun 26, 2012 at 3:59 AM, Boszormenyi Zoltan z...@cybertec.at wrote:
 Well, I can make the registration interface similar to how LWLocks
 are treated, but that doesn't avoid modification of the base_timeouts
 array in case a new internal use case arises. Say:

 #define USER_TIMEOUTS    4

 int    n_timeouts = TIMEOUT_MAX;
 static timeout_params base_timeouts[TIMEOUT_MAX + USER_TIMEOUTS];

Since timeouts - unlike lwlocks - do not need to touch shared memory,
there's no need for a hard-coded limit here.  You can just allocate
the array using MemoryContextAlloc(TopMemoryContext, ...) and enlarge
it as necessary.  To avoid needing to modify the base_timeouts array,
you can just have internal callers push their entries into the array
during process startup using the same function call that an external
module would use.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-26 Thread Boszormenyi Zoltan

2012-06-26 13:50 keltezéssel, Robert Haas írta:

On Tue, Jun 26, 2012 at 3:59 AM, Boszormenyi Zoltan z...@cybertec.at wrote:

Well, I can make the registration interface similar to how LWLocks
are treated, but that doesn't avoid modification of the base_timeouts
array in case a new internal use case arises. Say:

#define USER_TIMEOUTS4

intn_timeouts = TIMEOUT_MAX;
static timeout_params base_timeouts[TIMEOUT_MAX + USER_TIMEOUTS];

Since timeouts - unlike lwlocks - do not need to touch shared memory,
there's no need for a hard-coded limit here.  You can just allocate
the array using MemoryContextAlloc(TopMemoryContext, ...) and enlarge
it as necessary.  To avoid needing to modify the base_timeouts array,
you can just have internal callers push their entries into the array
during process startup using the same function call that an external
module would use.


I know, but it doesn't feel right to register static functionality.

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-26 Thread Robert Haas
On Tue, Jun 26, 2012 at 8:03 AM, Boszormenyi Zoltan z...@cybertec.at wrote:
 I know, but it doesn't feel right to register static functionality.

We do it elsewhere.  The overhead is pretty minimal compared to other
things we already do during startup, and avoiding the need for the
array to have a fixed-size seems worth it, IMHO.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 26, 2012 at 8:03 AM, Boszormenyi Zoltan z...@cybertec.at wrote:
 I know, but it doesn't feel right to register static functionality.

 We do it elsewhere.  The overhead is pretty minimal compared to other
 things we already do during startup, and avoiding the need for the
 array to have a fixed-size seems worth it, IMHO.

It's not even clear that the array needs to be dynamically resizable (yet).
Compare for instance syscache invalidation callbacks, which have so far
gotten along fine with a fixed-size array to hold registrations.  It's
foreseeable that we'll need something better someday, but that day
hasn't arrived, and might never arrive.

I agree with the feeling that this patch isn't done if the core
timeout code has to know specifically about each usage.  We have that
situation already.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-26 Thread Alvaro Herrera
Excerpts from Boszormenyi Zoltan's message of mar jun 26 03:59:06 -0400 2012:
 
 2012-06-26 06:59 keltezéssel, Alvaro Herrera írta:
  I cleaned up the framework patch a bit.  My version's attached.  Mainly,
  returning false for failure in some code paths that are only going to
  have the caller elog(FATAL) is rather pointless -- it seems much better
  to just have the code itself do the elog(FATAL).  In any case, I
  searched for reports of those error messages being reported in the wild
  and saw none.
 
 OK. The cleanups are always good.
 
 One nitpick, though. Your version doesn't contain the timeout.h
 and compilation fails because of it. :-) You might have forgotten
 to do git commit -a.

Oops.  Attached. It's pretty much the same you had, except some bools
are changed to void.

  There is one thing that still bothers me on this whole framework patch,
  but I'm not sure it's easily fixable.  Basically the API to timeout.c is
  the whole list of timeouts and their whole behaviors.  If you want to
  add a new one, you can't just call into the entry points, you have to
  modify timeout.c itself (as well as timeout.h as well as whatever code
  it is that you want to add timeouts to).  This may be good enough, but I
  don't like it.  I think it boils down to proctimeout.h is cheating.
 
  The interface I would actually like to have is something that lets the
  modules trying to get a timeout register the timeout-checking function
  as a callback.  API-wise, it would be much cleaner.  However, I'm not
  raelly sure that code-wise it would be any cleaner at all.  In fact I
  think it'd be rather cumbersome.  However, if you could give that idea
  some thought, it'd be swell.
 
 Well, I can make the registration interface similar to how LWLocks
 are treated, but that doesn't avoid modification of the base_timeouts
 array in case a new internal use case arises. Say:
 
 #define USER_TIMEOUTS4
 
 intn_timeouts = TIMEOUT_MAX;
 static timeout_params base_timeouts[TIMEOUT_MAX + USER_TIMEOUTS];
 
 and register_timeout() adds data after TIMEOUT_MAX.

Well, I don't expect that we're going to have many external uses.  My
point about registration is so that internal callers use it as well.  I
was thinking we could do something like xact.c does, which is to have
somewhere in proc.c a call like this:

TimeoutRegister(DEADLOCK_TIMEOUT, 1, true, CheckDeadLock, GetCurrentTimestamp);

at process startup (the magic value 1 is the priority.  Maybe there's a
better way to handle this).  That way, timeout.c or timeout.h do not
need to know anything about proc.c at all; we don't need proctimeout.h
at all.  The only thing it (the timeout module) needs to know is that
there is a symbolic constant named DEADLOCK_TIMEOUT.  Similarly for
statement timeout, etc.

When you call enable_timeout you first have to ensure that
DEADLOCK_TIMEOUT has been registered; and if it's not, die on an
Assert().  That way you ensure that there are no bugs where you try to
enable a timeout that hasn't been registered.

(If we later find the need to let external modules add timeouts, which I
find unlikely, we can have TimeoutRegister return TimeoutName and have
this value be dynamically allocated.  But for now I think that would be
useless complication).

The difference with lwlocks is that each lwlock is just a number.  The
lwlock.c module doesn't need to know anything about how each lwlock is
actually used.  It's not the same thing here -- which is why I think it
would be better to have all modules, even the hardcoded ones, use a
registering interface.

... Now, it could be argued that it would be even better to have
TimeoutRegister not take the TimeoutName argument, and instead generate
it dynamically and pass it back to the caller -- that way you don't need
the enum in timeout.h.  The problem I have with that idea is that it
would force the caller modules to have a global variable to keep track
of that, which seems worse to me.

  I haven't looked at the second patch at all yet.
 
 This is the whole point of the first patch.

I know that.  But I want to get the details of the framework right
before we move on to add more stuff to it.

 But as I said above for
 registering a new timeout source, it's a new internal use case.
 One that touches a lot of places which cannot simply be done
 by registering a new timeout source.

Sure.  That's going to be the case for any other timeout we want to add
(which is why I think we don't really need dynamic timeouts).

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


1-timeout-framework-v9a.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-26 Thread Boszormenyi Zoltan

2012-06-26 18:12 keltezéssel, Alvaro Herrera írta:

Excerpts from Boszormenyi Zoltan's message of mar jun 26 03:59:06 -0400 2012:

2012-06-26 06:59 keltezéssel, Alvaro Herrera írta:

I cleaned up the framework patch a bit.  My version's attached.  Mainly,
returning false for failure in some code paths that are only going to
have the caller elog(FATAL) is rather pointless -- it seems much better
to just have the code itself do the elog(FATAL).  In any case, I
searched for reports of those error messages being reported in the wild
and saw none.

OK. The cleanups are always good.

One nitpick, though. Your version doesn't contain the timeout.h
and compilation fails because of it. :-) You might have forgotten
to do git commit -a.

Oops.  Attached. It's pretty much the same you had, except some bools
are changed to void.


There is one thing that still bothers me on this whole framework patch,
but I'm not sure it's easily fixable.  Basically the API to timeout.c is
the whole list of timeouts and their whole behaviors.  If you want to
add a new one, you can't just call into the entry points, you have to
modify timeout.c itself (as well as timeout.h as well as whatever code
it is that you want to add timeouts to).  This may be good enough, but I
don't like it.  I think it boils down to proctimeout.h is cheating.

The interface I would actually like to have is something that lets the
modules trying to get a timeout register the timeout-checking function
as a callback.  API-wise, it would be much cleaner.  However, I'm not
raelly sure that code-wise it would be any cleaner at all.  In fact I
think it'd be rather cumbersome.  However, if you could give that idea
some thought, it'd be swell.

Well, I can make the registration interface similar to how LWLocks
are treated, but that doesn't avoid modification of the base_timeouts
array in case a new internal use case arises. Say:

#define USER_TIMEOUTS4

intn_timeouts = TIMEOUT_MAX;
static timeout_params base_timeouts[TIMEOUT_MAX + USER_TIMEOUTS];

and register_timeout() adds data after TIMEOUT_MAX.

Well, I don't expect that we're going to have many external uses.  My
point about registration is so that internal callers use it as well.  I
was thinking we could do something like xact.c does, which is to have
somewhere in proc.c a call like this:

TimeoutRegister(DEADLOCK_TIMEOUT, 1, true, CheckDeadLock, GetCurrentTimestamp);

at process startup (the magic value 1 is the priority.  Maybe there's a
better way to handle this).  That way, timeout.c or timeout.h do not
need to know anything about proc.c at all; we don't need proctimeout.h
at all.  The only thing it (the timeout module) needs to know is that
there is a symbolic constant named DEADLOCK_TIMEOUT.  Similarly for
statement timeout, etc.

When you call enable_timeout you first have to ensure that
DEADLOCK_TIMEOUT has been registered; and if it's not, die on an
Assert().  That way you ensure that there are no bugs where you try to
enable a timeout that hasn't been registered.


Currently, TimeoutName (as an index value) and priority is the same
semantically.

I would also add an Assert into register_timeout() to avoid double registration
of the same to prevent overriding he callback function pointer. If that's in
place, the TimeoutName value may still work as is: an index into 
base_timeouts[].


(If we later find the need to let external modules add timeouts, which I
find unlikely, we can have TimeoutRegister return TimeoutName and have
this value be dynamically allocated.  But for now I think that would be
useless complication).

The difference with lwlocks is that each lwlock is just a number.


Strictly speaking, just as TimeoutName.


   The
lwlock.c module doesn't need to know anything about how each lwlock is
actually used.  It's not the same thing here -- which is why I think it
would be better to have all modules, even the hardcoded ones, use a
registering interface.


OK.


... Now, it could be argued that it would be even better to have
TimeoutRegister not take the TimeoutName argument, and instead generate
it dynamically and pass it back to the caller -- that way you don't need
the enum in timeout.h.


This was what I had in mind at first ...


   The problem I have with that idea is that it
would force the caller modules to have a global variable to keep track
of that, which seems worse to me.


... and realized this as well.

So, should I keep the enum TimeoutName? Are global variables for
keeping dynamically assigned values preferred over the enum?
Currently we have 5 timeout sources in total, 3 of them are used by
regular backends, the remaining 2 are used by replication standby.
We can have a fixed size array (say with 8 or 16 elements) for future use
and this would be plenty.

Opinions?




I haven't looked at the second patch at all yet.

This is the whole point of the first patch.

I know that.  But I want to get the details of the framework right
before we move on to add more stuff to 

Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-26 Thread Alvaro Herrera

Excerpts from Boszormenyi Zoltan's message of mar jun 26 12:43:34 -0400 2012:

 So, should I keep the enum TimeoutName? Are global variables for
 keeping dynamically assigned values preferred over the enum?
 Currently we have 5 timeout sources in total, 3 of them are used by
 regular backends, the remaining 2 are used by replication standby.
 We can have a fixed size array (say with 8 or 16 elements) for future use
 and this would be plenty.
 
 Opinions?

My opinion is that the fixed size array is fine.

I'll go set the patch waiting on author.  Also, remember to review
some other people's patches.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-25 Thread Alvaro Herrera
I cleaned up the framework patch a bit.  My version's attached.  Mainly,
returning false for failure in some code paths that are only going to
have the caller elog(FATAL) is rather pointless -- it seems much better
to just have the code itself do the elog(FATAL).  In any case, I
searched for reports of those error messages being reported in the wild
and saw none.

There are other things but they are in the nitpick department.  (A
reference to -check_timeout remains that needs to be fixed too).

There is one thing that still bothers me on this whole framework patch,
but I'm not sure it's easily fixable.  Basically the API to timeout.c is
the whole list of timeouts and their whole behaviors.  If you want to
add a new one, you can't just call into the entry points, you have to
modify timeout.c itself (as well as timeout.h as well as whatever code
it is that you want to add timeouts to).  This may be good enough, but I
don't like it.  I think it boils down to proctimeout.h is cheating.

The interface I would actually like to have is something that lets the
modules trying to get a timeout register the timeout-checking function
as a callback.  API-wise, it would be much cleaner.  However, I'm not
raelly sure that code-wise it would be any cleaner at all.  In fact I
think it'd be rather cumbersome.  However, if you could give that idea
some thought, it'd be swell.

I haven't looked at the second patch at all yet.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


1-timeout-framework-v9.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-19 Thread Boszormenyi Zoltan

2012-06-18 19:46 keltezéssel, Alvaro Herrera írta:

Excerpts from Boszormenyi Zoltan's message of vie may 11 03:54:13 -0400 2012:

Hi,

another rebasing and applied the GIT changes in
ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec to the
Windows implementation of PGSemaphoreTimedLock.

Hi,

I gave the framework patch a look.  One thing I'm not sure about is the
way you've defined the API.  It seems a bit strange to have a nice and
clean, generic interface in timeout.h; and then have the internal
implementation of the API cluttered with details such as what to do when
the deadlock timeout expires.  Wouldn't it be much better if we could
keep the details of CheckDeadLock itself within proc.c, for example?


Do you mean adding a callback function argument to for enable_timeout()
would be better?


Same for the other specific Check*Timeout functions.  It seems to me
that the only timeout.c specific requirement is to be able to poke
base_timeouts[].indicator and fin_time.  Maybe that can get done in
timeout.c and then have the generic checker call the module-specific
checker.


Or instead of static functions, Check* functions can be external
to timeout.c. It seemed to be a good idea to move all the timeout
related functions to timeout.c.


   ... I see that you have things to do before and after setting
indicator.  Maybe you could split the module-specific Check functions
in two and have timeout.c call each in turn.  Other than that I don't
see that this should pose any difficulty.

Also, I see you're calling GetCurrentTimestamp() in the checkers; maybe
more than once per signal if you have multiple timeouts enabled.


Actually, GetCurrentTimestamp() is not called multiple times,
because only the first timeout source in the queue can get triggered.


   Maybe
it'd be better to call it in once handle_sig_alarm and then pass the
value down, if necessary (AFAICS if you restructure the code as I
suggest above, you don't need to get the value down the the
module-specific code).


But yes, this way it can be cleaner.



As for the Init*Timeout() and Destroy*Timeout() functions, they seem
a bit pointless -- I mean if they just always call the generic
InitTimeout and DestroyTimeout functions, why not just define the struct
in a way that makes the specific functions unnecessary?  Maybe you even
already have all that's necessary ... I think you just change
base_timeouts[i].timeout_destroy(false); to DestroyTimeout(i, false) and
so on.


OK, I will experiment with it.


Minor nitpick: the Interface: comment in timeout.c seems useless.
That kind of thing tends to get overlooked and obsolete over time.  We
have examples of such things all over the place.  I'd just rip it and
have timeout.h be the interface documentation.


OK.


BTW it doesn't seem that datatype/timestamp.h is really necessary in
timeout.h.


You are wrong. get_timeout_start() returns TimestampTz and it's
defined in datatype/timestamp.h.

Thanks for the review, I will send the new code soon.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-19 Thread Alvaro Herrera

Excerpts from Boszormenyi Zoltan's message of mar jun 19 04:44:35 -0400 2012:
 
 2012-06-18 19:46 keltezéssel, Alvaro Herrera írta:
  Excerpts from Boszormenyi Zoltan's message of vie may 11 03:54:13 -0400 
  2012:
  Hi,
 
  another rebasing and applied the GIT changes in
  ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec to the
  Windows implementation of PGSemaphoreTimedLock.
  Hi,
 
  I gave the framework patch a look.  One thing I'm not sure about is the
  way you've defined the API.  It seems a bit strange to have a nice and
  clean, generic interface in timeout.h; and then have the internal
  implementation of the API cluttered with details such as what to do when
  the deadlock timeout expires.  Wouldn't it be much better if we could
  keep the details of CheckDeadLock itself within proc.c, for example?
 
 Do you mean adding a callback function argument to for enable_timeout()
 would be better?

Well, that's not exactly what I was thinking, but maybe your idea is
better than what I had in mind.

  Same for the other specific Check*Timeout functions.  It seems to me
  that the only timeout.c specific requirement is to be able to poke
  base_timeouts[].indicator and fin_time.  Maybe that can get done in
  timeout.c and then have the generic checker call the module-specific
  checker.
 
 Or instead of static functions, Check* functions can be external
 to timeout.c. It seemed to be a good idea to move all the timeout
 related functions to timeout.c.

Yeah, except that they play with members of the timeout_params struct,
which hopefully does not need to be exported.  So if you can just move
(that is, put back in their original places) the portions that do not
touch that strcut, it'd be great.

  Also, I see you're calling GetCurrentTimestamp() in the checkers; maybe
  more than once per signal if you have multiple timeouts enabled.
 
 Actually, GetCurrentTimestamp() is not called multiple times,
 because only the first timeout source in the queue can get triggered.

I see.

  BTW it doesn't seem that datatype/timestamp.h is really necessary in
  timeout.h.
 
 You are wrong. get_timeout_start() returns TimestampTz and it's
 defined in datatype/timestamp.h.

Oh, right.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-19 Thread Alvaro Herrera

Excerpts from Boszormenyi Zoltan's message of mar jun 19 12:44:04 -0400 2012:

 OK, all 4 Check* functions are now moved back into proc.c,
 nothing outside of timeout.c touches anything in it. New patches
 are attached.

Yeah, I like this one better, thanks.

It seems to me that the check functions are no longer check anymore,
right?  I mean, they don't check whether the time has expired.  It can
be argued that CheckDeadLock is well named, because it does check
whether there is a deadlock before doing anything else; but
CheckStandbyTimeout is no longer a check -- it just sends a signal.
Do we need to rename these functions?

Why are you using the deadlock timeout for authentication?  Wouldn't it
make more sense to have a separate TimeoutName, just to keep things
clean?

The NB: comment here doesn't seem to be useful anymore:
+ /*
+  * Init, Destroy and Check functions for different timeouts.
+  *
+  * NB: all Check* functions are run inside a signal handler, so be very wary
+  * about what is done in them or in called routines.
+  
*/


In base_timeouts you don't initialize fin_time for any of the members.
This is probably unimportant but then why initialize start_time?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-18 Thread Alvaro Herrera

Excerpts from Boszormenyi Zoltan's message of vie may 11 03:54:13 -0400 2012:
 Hi,
 
 another rebasing and applied the GIT changes in
 ada8fa08fc6cf5f199b6df935b4d0a730aaa4fec to the
 Windows implementation of PGSemaphoreTimedLock.

Hi,

I gave the framework patch a look.  One thing I'm not sure about is the
way you've defined the API.  It seems a bit strange to have a nice and
clean, generic interface in timeout.h; and then have the internal
implementation of the API cluttered with details such as what to do when
the deadlock timeout expires.  Wouldn't it be much better if we could
keep the details of CheckDeadLock itself within proc.c, for example?  

Same for the other specific Check*Timeout functions.  It seems to me
that the only timeout.c specific requirement is to be able to poke
base_timeouts[].indicator and fin_time.  Maybe that can get done in
timeout.c and then have the generic checker call the module-specific
checker.  ... I see that you have things to do before and after setting
indicator.  Maybe you could split the module-specific Check functions
in two and have timeout.c call each in turn.  Other than that I don't
see that this should pose any difficulty.

Also, I see you're calling GetCurrentTimestamp() in the checkers; maybe
more than once per signal if you have multiple timeouts enabled.  Maybe
it'd be better to call it in once handle_sig_alarm and then pass the
value down, if necessary (AFAICS if you restructure the code as I
suggest above, you don't need to get the value down the the
module-specific code).

As for the Init*Timeout() and Destroy*Timeout() functions, they seem
a bit pointless -- I mean if they just always call the generic
InitTimeout and DestroyTimeout functions, why not just define the struct
in a way that makes the specific functions unnecessary?  Maybe you even
already have all that's necessary ... I think you just change
base_timeouts[i].timeout_destroy(false); to DestroyTimeout(i, false) and
so on.

Minor nitpick: the Interface: comment in timeout.c seems useless.
That kind of thing tends to get overlooked and obsolete over time.  We
have examples of such things all over the place.  I'd just rip it and
have timeout.h be the interface documentation.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-06-18 Thread Alvaro Herrera


BTW it doesn't seem that datatype/timestamp.h is really necessary in
timeout.h.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-04-24 Thread Boszormenyi Zoltan

2012-04-23 15:08 keltezéssel, Marc Cousin írta:

On Mon, 2012-04-23 at 10:53 +0200, Boszormenyi Zoltan wrote:

2012-04-10 09:02 keltezéssel, Boszormenyi Zoltan írta:

2012-04-06 14:47 keltezéssel, Cousin Marc írta:

On 05/04/12 08:02, Boszormenyi Zoltan wrote:

2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:

I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that.  Would it work to
split the patch in two pieces?


Sure. Attached is the split version.

Best regards,
Zoltán Böszörményi


Hi,

I've started looking at and testing both patches.

Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future. I haven't tested it in depth though, because I encountered the
following problem:

While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.

Here is the way to reproduce it (I have done it with a pgbench schema):

- Set a small statement_timeout (just to save time during the tests)

Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;

Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR:  canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;

I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.

Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock-holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).

It can also be done without a statement_timeout, and a control-C on the
second lock table.

I didn't touch anything but this. It occurs everytime, when asserts are
activated.

I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?

Cheers


Attached are the new patches. I rebased them to current GIT and
they are expected to be applied after Robert Haas' patch in the
bug in fast-path locking thread.

Now it survives the above scenario.

Best regards,
Zoltán Böszörményi

New patch attached, rebased to today's GIT.

Best regards,
Zoltán Böszörményi


Ok, I've done what was missing from the review (from when I had a bug in
locking the other day), so here is the full review. By the way, this
patch doesn't belong to current commitfest, but to the next one.


It was added to 2012-Next when I posted it, 2012-01 was already
closed for new additions.


Is the patch in context diff format?
Yes

Does it apply cleanly to the current git master?
Yes

Does it include reasonable tests, necessary doc patches, etc?
The new lock_timeout GUC is documented. There are regression tests.

Read what the patch is supposed to do, and consider:
Does the patch actually implement that?
Yes

Do we want that?
I do. Mostly for administrative jobs which could lock the whole
application. It would be much easier to run reindexes, vacuum full, etc…
without worrying about bringing application down because of lock
contention.

Do we already have it?
No.

Does it follow SQL spec, or the community-agreed behavior?
I don't know if there is a consensus on this new GUC. statement_timeout
is obviously not in the SQL spec.

Does it include pg_dump support (if applicable)?
Not applicable

Are there dangers?
Yes, as it rewrites all the timeout code. I feel it is much cleaner this
way, as there is a generic set of function managing all sigalarm code,
but it heavily touches this part.

Have all the bases been covered?
I tried all sql statements I could think of (select, insert, update,
delete, truncate, drop, create index, adding constraint, lock.

I tried having statement_timeout, lock_timeout and deadlock_timeout at
very short and close or equal values. It worked too.

Rollback to savepoint while holding locks dont crash PostgreSQL anymore.

Other timeouts such as archive_timeout and checkpoint_timeout still
work.

Does the feature work as advertised?
Yes

Are there corner cases the author has failed to consider?
I didn't find any.

Are there any assertion failures or crashes?
No.

Does the patch slow down simple tests?
No

If it claims to improve performance, does it?
Not applicable

Does it slow down other things?
No

Does it follow the project coding guidelines?
I think so

Are there portability issues?
No, all the portable code (acquiring locks and manipulating sigalarm) is
the same as before.

Will it work on Windows/BSD etc?
It should. I couldn't test it though.

Are the comments sufficient and accurate?
Yes

Does it do what it says, correctly?
Yes

Does it produce compiler warnings?
No

Can you make it crash?
Not anymore

Is everything done in a way that fits together coherently with other
features/modules?
Yes, I think so. The new way of handling sigalarm seems more robust to
me.

Are there interdependencies that can cause problems?
I don't see any.

Regards,

Marc


Thanks for the review.

Best regards,
Zoltán 

Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-04-23 Thread Marc Cousin
On Mon, 2012-04-23 at 10:53 +0200, Boszormenyi Zoltan wrote:
 2012-04-10 09:02 keltezéssel, Boszormenyi Zoltan írta:
  2012-04-06 14:47 keltezéssel, Cousin Marc írta:
  On 05/04/12 08:02, Boszormenyi Zoltan wrote:
  2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
  I think this patch is doing two things: first touching infrastructure
  stuff and then adding lock_timeout on top of that.  Would it work to
  split the patch in two pieces?
 
  Sure. Attached is the split version.
 
  Best regards,
  Zoltán Böszörményi
 
  Hi,
 
  I've started looking at and testing both patches.
 
  Technically speaking, I think the source looks much better than the
  first version of lock timeout, and may help adding other timeouts in the
  future. I haven't tested it in depth though, because I encountered the
  following problem:
 
  While testing the patch, I found a way to crash PG. But what's weird is
  that it crashes also with an unpatched git version.
 
  Here is the way to reproduce it (I have done it with a pgbench schema):
 
  - Set a small statement_timeout (just to save time during the tests)
 
  Session1:
  =#BEGIN;
  =#lock TABLE pgbench_accounts ;
 
  Session 2:
  =#BEGIN;
  =#lock TABLE pgbench_accounts ;
  ERROR:  canceling statement due to statement timeout
  =# lock TABLE pgbench_accounts ;
 
  I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
  done with a rollback to savepoint of course.
 
  Session 2 crashes with this : TRAP : FailedAssertion(«
  !(locallock-holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
  749).
 
  It can also be done without a statement_timeout, and a control-C on the
  second lock table.
 
  I didn't touch anything but this. It occurs everytime, when asserts are
  activated.
 
  I tried it on 9.1.3, and I couldn't make it crash with the same sequence
  of events. So maybe it's something introduced since ? Or is the assert
  still valid ?
 
  Cheers
 
 
  Attached are the new patches. I rebased them to current GIT and
  they are expected to be applied after Robert Haas' patch in the
  bug in fast-path locking thread.
 
  Now it survives the above scenario.
 
  Best regards,
  Zoltán Böszörményi
 
 New patch attached, rebased to today's GIT.
 
 Best regards,
 Zoltán Böszörményi
 

Ok, I've done what was missing from the review (from when I had a bug in
locking the other day), so here is the full review. By the way, this
patch doesn't belong to current commitfest, but to the next one.



Is the patch in context diff format?
Yes

Does it apply cleanly to the current git master?
Yes

Does it include reasonable tests, necessary doc patches, etc?
The new lock_timeout GUC is documented. There are regression tests.

Read what the patch is supposed to do, and consider:
Does the patch actually implement that?
Yes

Do we want that?
I do. Mostly for administrative jobs which could lock the whole
application. It would be much easier to run reindexes, vacuum full, etc…
without worrying about bringing application down because of lock
contention.

Do we already have it?
No.

Does it follow SQL spec, or the community-agreed behavior?
I don't know if there is a consensus on this new GUC. statement_timeout
is obviously not in the SQL spec.

Does it include pg_dump support (if applicable)?
Not applicable

Are there dangers?
Yes, as it rewrites all the timeout code. I feel it is much cleaner this
way, as there is a generic set of function managing all sigalarm code,
but it heavily touches this part.

Have all the bases been covered?
I tried all sql statements I could think of (select, insert, update,
delete, truncate, drop, create index, adding constraint, lock.

I tried having statement_timeout, lock_timeout and deadlock_timeout at
very short and close or equal values. It worked too.

Rollback to savepoint while holding locks dont crash PostgreSQL anymore.

Other timeouts such as archive_timeout and checkpoint_timeout still
work.

Does the feature work as advertised?
Yes

Are there corner cases the author has failed to consider?
I didn't find any.

Are there any assertion failures or crashes?
No.

Does the patch slow down simple tests?
No

If it claims to improve performance, does it?
Not applicable

Does it slow down other things?
No

Does it follow the project coding guidelines?
I think so

Are there portability issues?
No, all the portable code (acquiring locks and manipulating sigalarm) is
the same as before.

Will it work on Windows/BSD etc?
It should. I couldn't test it though.

Are the comments sufficient and accurate?
Yes

Does it do what it says, correctly?
Yes

Does it produce compiler warnings?
No

Can you make it crash?
Not anymore

Is everything done in a way that fits together coherently with other
features/modules?
Yes, I think so. The new way of handling sigalarm seems more robust to
me.

Are there interdependencies that can cause problems?
I don't see any.

Regards,

Marc




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To 

Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-04-08 Thread Boszormenyi Zoltan

2012-04-06 14:47 keltezéssel, Cousin Marc írta:

On 05/04/12 08:02, Boszormenyi Zoltan wrote:

2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:

I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that.  Would it work to
split the patch in two pieces?


Sure. Attached is the split version.

Best regards,
Zoltán Böszörményi


Hi,

I've started looking at and testing both patches.

Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future.


Thanks.


  I haven't tested it in depth though, because I encountered the
following problem:

While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.

Here is the way to reproduce it (I have done it with a pgbench schema):

- Set a small statement_timeout (just to save time during the tests)

Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;

Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR:  canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;

I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.

Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock-holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).

It can also be done without a statement_timeout, and a control-C on the
second lock table.


Indeed, the unpatched GIT version crashes if you enter
  =#lock TABLE pgbench_accounts ;
the second time in session 2 after the first one failed. Also,
manually spelling it out:

Session 1:

$ psql
psql (9.2devel)
Type help for help.

zozo=# begin;
BEGIN
zozo=# lock table pgbench_accounts;
LOCK TABLE
zozo=#

Session 2:

zozo=# begin;
BEGIN
zozo=# savepoint a;
SAVEPOINT
zozo=# lock table pgbench_accounts;
ERROR:  canceling statement due to statement timeout
zozo=# rollback to a;
ROLLBACK
zozo=# savepoint b;
SAVEPOINT
zozo=# lock table pgbench_accounts;
The connection to the server was lost. Attempting reset: Failed.
!

Server log after the second lock table:

TRAP: FailedAssertion(!(locallock-holdsStrongLockCount == 0), File: 
lock.c, Line: 749)
LOG:  server process (PID 12978) was terminated by signal 6: Aborted

Best regards,
Zoltán Böszörményi



I didn't touch anything but this. It occurs everytime, when asserts are
activated.

I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?

Cheers




--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-04-08 Thread Boszormenyi Zoltan

2012-04-08 11:24 keltezéssel, Boszormenyi Zoltan írta:

2012-04-06 14:47 keltezéssel, Cousin Marc írta:

On 05/04/12 08:02, Boszormenyi Zoltan wrote:

2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:

I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that.  Would it work to
split the patch in two pieces?


Sure. Attached is the split version.

Best regards,
Zoltán Böszörményi


Hi,

I've started looking at and testing both patches.

Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future.


Thanks.


  I haven't tested it in depth though, because I encountered the
following problem:

While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.

Here is the way to reproduce it (I have done it with a pgbench schema):

- Set a small statement_timeout (just to save time during the tests)

Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;

Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR:  canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;

I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.

Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock-holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).

It can also be done without a statement_timeout, and a control-C on the
second lock table.


Indeed, the unpatched GIT version crashes if you enter
  =#lock TABLE pgbench_accounts ;
the second time in session 2 after the first one failed. Also,
manually spelling it out:

Session 1:

$ psql
psql (9.2devel)
Type help for help.

zozo=# begin;
BEGIN
zozo=# lock table pgbench_accounts;
LOCK TABLE
zozo=#

Session 2:

zozo=# begin;
BEGIN
zozo=# savepoint a;
SAVEPOINT
zozo=# lock table pgbench_accounts;
ERROR:  canceling statement due to statement timeout
zozo=# rollback to a;
ROLLBACK
zozo=# savepoint b;
SAVEPOINT
zozo=# lock table pgbench_accounts;
The connection to the server was lost. Attempting reset: Failed.
!

Server log after the second lock table:

TRAP: FailedAssertion(!(locallock-holdsStrongLockCount == 0), File: 
lock.c, Line: 749)
LOG:  server process (PID 12978) was terminated by signal 6: Aborted

Best regards,
Zoltán Böszörményi


Robert, the Assert triggering with the above procedure
is in your fast path locking code with current GIT.

Best regards,
Zoltán Böszörményi





I didn't touch anything but this. It occurs everytime, when asserts are
activated.

I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?

Cheers







--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-04-06 Thread Cousin Marc
On 05/04/12 08:02, Boszormenyi Zoltan wrote:
 2012-04-04 21:30 keltezéssel, Alvaro Herrera írta:
 I think this patch is doing two things: first touching infrastructure
 stuff and then adding lock_timeout on top of that.  Would it work to
 split the patch in two pieces?


 Sure. Attached is the split version.

 Best regards,
 Zoltán Böszörményi

Hi,

I've started looking at and testing both patches.

Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future. I haven't tested it in depth though, because I encountered the
following problem:

While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.

Here is the way to reproduce it (I have done it with a pgbench schema):

- Set a small statement_timeout (just to save time during the tests)

Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;

Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR:  canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;

I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.

Session 2 crashes with this : TRAP : FailedAssertion(«
!(locallock-holdsStrongLockCount == 0) », fichier : « lock.c », ligne :
749).

It can also be done without a statement_timeout, and a control-C on the
second lock table.

I didn't touch anything but this. It occurs everytime, when asserts are
activated.

I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since ? Or is the assert
still valid ?

Cheers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-04-04 Thread Boszormenyi Zoltan

2012-04-04 17:12 keltezéssel, Boszormenyi Zoltan írta:

2012-04-04 16:22 keltezéssel, Boszormenyi Zoltan írta:

2012-04-04 15:17 keltezéssel, Boszormenyi Zoltan írta:

Hi,

2012-04-04 12:30 keltezéssel, Boszormenyi Zoltan írta:

Hi,

attached is a patch to implement a framework to simplify and
correctly nest multiplexing more than two timeout sources
into the same SIGALRM signal handler.

The framework is using a new internal API for timeouts:

bool enable_timeout(TimeoutName tn, int delayms);
bool disable_timeout(TimeoutName tn, bool keep_indicator);
bool disable_all_timeouts(bool keep_indicators);

A timeout source has these features to allow different initialization,
cleanup and check functions and rescheduling:

typedef void (*timeout_init)(TimestampTz, TimestampTz);
typedef void (*timeout_destroy)(bool);
typedef bool (*timeout_check)(void);
typedef TimestampTz (*timeout_start)(void);

typedef struct {
TimeoutName index;
boolresched_next;
timeout_inittimeout_init;
timeout_destroy timeout_destroy;
timeout_check   timeout_check;
timeout_start   timeout_start;
TimestampTz fin_time;
} timeout_params;

This makes it possible to differentiate between the standby and
statement timeouts, regular deadlock and standby deadlock using
the same signal handler function.

And finally, this makes it possible to implement the lock_timeout
feature that we at Cybertec implemented more than 2 years ago.

The patch also adds two new tests into prepared_xacts.sql to trigger
the lock_timeout instead of statement_timeout.

Documentation and extensive comments are included.


Second version. Every timeout-related functions are now in a separate
source file, src/backend/storage/timeout.c with accessor functions.
There are no related global variables anymore, only the GUCs.


3rd and (for now) final version.


I lied. This is the final one. I fixed a typo in the documentation
and replaced timeout_start_time (previously static to proc.c)
with get_timeout_start(DEADLOCK_TIMEOUT). This one should
have happened in the second version.


Tidied comments, the time checks in
Check*() functions and function order in timeout.c.





Best regards,
Zoltán Böszörményi


One comment for testers: all the timeout GUC values are given in
milliseconds, the kernel interface (setitimer) and TimestampTz uses
microseconds.

The transaction that locks is inherently a read/write one and by the time
the code reaches ProcSleep(), at least a few tens of microseconds has
already passed since the start of the statement even on 3GHz+ CPUs.

Not to mention that computers nowadays have high precision timers
and OSs running on them utilitize those. So, the time returned by
GetCurrentStatementStartTimestamp() will certainly be different from
GetCurrentTimestamp(). This means that the timeouts' fin_time will also
be different.

Long story short, using the same value for statement_timeout and
lock_timeout (or deadlock_timeout for that matter) means that
statement_timeout will trigger first. The story is different only on
a combination of a fast CPU and an OS with greater-then-millisecond
timer resolution.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-04-04 Thread Alvaro Herrera

I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that.  Would it work to
split the patch in two pieces?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers