Hm. It doesn't seem that DeadLockCheck is taking very much of the time.
Isn't the real time big?
Yes, it sure is, but remember that the guy getting useful work done
(DeadLockCheck) is having to share the CPU with 999 other processes
that are waking up on every clock tick for just
Tatsuo Ishii [EMAIL PROTECTED] writes:
If so, what about increasing the deadlock timer in proportion to the
length of the waiting holder queue?
I don't think that's a good idea; it's not solving the problem, only
reducing performance, and in a fairly arbitrary way at that. (The
length of the
Tatsuo Ishii [EMAIL PROTECTED] writes:
I added some code to HandleDeadLock to measure how long the
LockLockTable and DeadLockCheck calls take. The following are the
results of running pgbench -c 1000 (it failed with a stuck spinlock
error). Real time shows how long they actually run (using
Hiroshi Inoue [EMAIL PROTECTED] writes:
DeadLockCheck: real time
 min | max   | avg
-----+-------+-----------------
   0 | 87671 | 3463.6996197719
DeadLockCheck: user time
 min | max | avg
-----+-----+----------------
   0 | 330 | 14.2205323194
DeadLockCheck: system time
Bruce Momjian [EMAIL PROTECTED] writes:
If you estimate that a process dispatch cycle is ~ 10 microseconds,
then waking 999 useless processes every 10 msec is just about enough
to consume 100% of the CPU doing nothing useful...
Don't we back off the sleeps or was that code removed?
Not
Tatsuo Ishii [EMAIL PROTECTED] writes:
In my understanding, the deadlock check is performed every time the
backend acquires a lock. Once it acquires the lock, it kills the timer.
However, under heavy transaction loads such as pgbench generates,
chances are that the check fires, and it tries to acquire
Tatsuo Ishii [EMAIL PROTECTED] writes
How can I check it?
The 'stuck' message should at least give you a code location...
FATAL: s_lock(0x2ac2d016) at spin.c:158, stuck spinlock. Aborting.
Hmm, that's SpinAcquire, so it's one of the predefined spinlocks
(and not, say, a
Tatsuo Ishii [EMAIL PROTECTED] writes:
It appears that the deadlock-check timer is the source of the
problem. With the default settings, it checks for deadlocks every 1
second PER backend.
I don't believe it. setitimer with it_interval = 0 should produce one
interrupt, no more.
FATAL: s_lock(0x2ac2d016) at spin.c:158, stuck spinlock. Aborting.
Hmm, that's SpinAcquire, so it's one of the predefined spinlocks
(and not, say, a buffer spinlock). You could try adding some
debug logging here, although the output would be voluminous.
But what would really be
Tatsuo Ishii [EMAIL PROTECTED] writes:
I got an interesting result. If I compile backend with -g (and without
-O2), I get no stuck spin lock errors. However, if s_lock.c is
compiled with -O2 enabled, I got the error again. It seems only
s_lock.c is related to this phenomenon.
That's very
Tatsuo Ishii [EMAIL PROTECTED] writes:
I have seen problems with extremely many concurrent users.
I run pgbench:
pgbench -c 1000 -t 1 test
And I get stuck spin lock errors. This is 100% reproducible (i.e. I
have never succeeded in pgbench -c 1000).
Is it actually stuck, or just
Tatsuo Ishii [EMAIL PROTECTED] writes:
If it is stuck, on which lock(s)?
How can I check it?
The 'stuck' message should at least give you a code location...
regards, tom lane
Tatsuo Ishii [EMAIL PROTECTED] writes:
If it is stuck, on which lock(s)?
How can I check it?
The 'stuck' message should at least give you a code location...
Here is the actual message:
FATAL: s_lock(0x2ac2d016) at spin.c:158, stuck spinlock. Aborting.
Last several queries before
Tatsuo Ishii [EMAIL PROTECTED] writes:
How can I check it?
The 'stuck' message should at least give you a code location...
FATAL: s_lock(0x2ac2d016) at spin.c:158, stuck spinlock. Aborting.
Hmm, that's SpinAcquire, so it's one of the predefined spinlocks
(and not, say, a buffer spinlock).
I have seen problems with extremely many concurrent users.
I run pgbench:
pgbench -c 1000 -t 1 test
And I get stuck spin lock errors. This is 100% reproducible (i.e. I
have never succeeded in pgbench -c 1000).
This is Linux kernel 2.2.18. The following are some resource settings
that seem