Re: [HACKERS] stuck spin lock with many concurrent users

2001-07-05 Thread Tatsuo Ishii
Hm. It doesn't seem that DeadLockCheck is taking very much of the time. Isn't the real time big? Yes, it sure is, but remember that the guy getting useful work done (DeadLockCheck) is having to share the CPU with 999 other processes that are waking up on every clock tick for just

Re: [HACKERS] stuck spin lock with many concurrent users

2001-07-05 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes: If so, what about increasing the deadlock timer in proportion to the length of the waiting holder queue? I don't think that's a good idea; it's not solving the problem, only reducing performance, and in a fairly arbitrary way at that. (The length of the

Re: [HACKERS] stuck spin lock with many concurrent users

2001-07-03 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes: I added some code to HandleDeadLock to measure how long the LockLockTable and DeadLockCheck calls take. The following are the results from running pgbench -c 1000 (it failed with a stuck spin lock error). real time shows how long they actually run (using

Re: [HACKERS] stuck spin lock with many concurrent users

2001-07-03 Thread Tom Lane
Hiroshi Inoue [EMAIL PROTECTED] writes:
DeadLockCheck: real time
 min |  max  |       avg
-----+-------+-----------------
   0 | 87671 | 3463.6996197719
DeadLockCheck: user time
 min | max |      avg
-----+-----+---------------
   0 | 330 | 14.2205323194
DeadLockCheck: system time

Re: [HACKERS] stuck spin lock with many concurrent users

2001-07-03 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: If you estimate that a process dispatch cycle is ~ 10 microseconds, then waking 999 useless processes every 10 msec is just about enough to consume 100% of the CPU doing nothing useful... Don't we back off the sleeps or was that code removed? Not
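The estimate in this message can be checked with a few lines of arithmetic. This is only a sketch of that back-of-the-envelope calculation: the 10 microsecond dispatch cost and the 10 msec tick are the figures quoted above, and the function and parameter names are made up for illustration.

```python
def wakeup_cpu_fraction(n_procs, dispatch_cost_s, tick_s):
    """Fraction of one CPU spent just dispatching n_procs processes once per tick."""
    return n_procs * dispatch_cost_s / tick_s


# 999 sleeping backends, ~10 microseconds per dispatch, woken every 10 msec.
fraction = wakeup_cpu_fraction(n_procs=999, dispatch_cost_s=10e-6, tick_s=10e-3)
print(f"{fraction:.1%} of one CPU spent on wakeups that do no useful work")
```

Under these assumed figures the wakeups alone come to just under a full CPU, which matches the "just about enough to consume 100%" remark.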

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-27 Thread Tatsuo Ishii
Tatsuo Ishii [EMAIL PROTECTED] writes: In my understanding the deadlock check is performed every time the backend acquires a lock. Once it acquires the lock, it kills the timer. However, under heavy transactions such as pgbench generates, chances are that the check fires, and it tries to acquire

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-26 Thread Tatsuo Ishii
Tatsuo Ishii [EMAIL PROTECTED] writes: How can I check it? The 'stuck' message should at least give you a code location... FATAL: s_lock(0x2ac2d016) at spin.c:158, stuck spinlock. Aborting. Hmm, that's SpinAcquire, so it's one of the predefined spinlocks (and not, say, a

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-26 Thread Tatsuo Ishii
Tatsuo Ishii wrote: Tatsuo Ishii [EMAIL PROTECTED] writes: How can I check it? The 'stuck' message should at least give you a code location... FATAL: s_lock(0x2ac2d016) at spin.c:158, stuck spinlock. Aborting. Hmm, that's SpinAcquire, so it's one of the

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-26 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes: The deadlock checking timer seems to be the source of the problem. With the default settings, it checks for deadlocks every 1 second PER backend. I don't believe it. setitimer with it_interval = 0 should produce one interrupt, no more.
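The one-shot behaviour described here is easy to observe from user space. The sketch below uses Python's signal.setitimer, whose interval argument corresponds to it_interval in the C call; it is not the backend's code, and the 50 ms delay and 300 ms observation window are arbitrary values chosen for the demonstration. It needs a Unix system and must run in the main thread.

```python
import signal
import time

fired = []


def on_alarm(signum, frame):
    # Record each SIGALRM delivery so deliveries can be counted afterwards.
    fired.append(time.monotonic())


old_handler = signal.signal(signal.SIGALRM, on_alarm)

# Arm a real-time timer: first expiry after 50 ms, interval 0 (no re-arm).
signal.setitimer(signal.ITIMER_REAL, 0.05, 0.0)

# Watch for 300 ms, long enough for several expiries if the timer repeated.
deadline = time.monotonic() + 0.3
while time.monotonic() < deadline:
    time.sleep(0.01)

# Disarm the timer and restore the previous handler.
signal.setitimer(signal.ITIMER_REAL, 0)
signal.signal(signal.SIGALRM, old_handler)

count = len(fired)
print(f"SIGALRM delivered {count} time(s)")
```

With a zero interval the timer is not re-armed after it expires, so a periodic check would need either a nonzero interval or an explicit re-arm after each expiry.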

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-26 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes: In my understanding the deadlock check is performed every time the backend acquires a lock. Once it acquires the lock, it kills the timer. However, under heavy transactions such as pgbench generates, chances are that the check fires, and it tries to acquire a

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-24 Thread Tatsuo Ishii
FATAL: s_lock(0x2ac2d016) at spin.c:158, stuck spinlock. Aborting. Hmm, that's SpinAcquire, so it's one of the predefined spinlocks (and not, say, a buffer spinlock). You could try adding some debug logging here, although the output would be voluminous. But what would really be

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-24 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes: I got an interesting result. If I compile the backend with -g (and without -O2), I get no stuck spin lock errors. However, if s_lock.c is compiled with -O2 enabled, I get the error again. It seems only s_lock.c is related to this phenomenon. That's very

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-24 Thread Tatsuo Ishii
Tatsuo Ishii [EMAIL PROTECTED] writes: I got an interesting result. If I compile the backend with -g (and without -O2), I get no stuck spin lock errors. However, if s_lock.c is compiled with -O2 enabled, I get the error again. It seems only s_lock.c is related to this phenomenon. That's

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-21 Thread Tatsuo Ishii
Tatsuo Ishii [EMAIL PROTECTED] writes: I have seen problems with extremely many concurrent users. I run pgbench: pgbench -c 1000 -t 1 test And I get stuck spin lock errors. This is 100% reproducible (i.e. I have never succeeded in pgbench -c 1000). Is it actually stuck, or just

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-21 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes: If it is stuck, on which lock(s)? How can I check it? The 'stuck' message should at least give you a code location... regards, tom lane

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-21 Thread Tatsuo Ishii
Tatsuo Ishii [EMAIL PROTECTED] writes: If it is stuck, on which lock(s)? How can I check it? The 'stuck' message should at least give you a code location... Here is the actual message: FATAL: s_lock(0x2ac2d016) at spin.c:158, stuck spinlock. Aborting. Last several queries before

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-21 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes: How can I check it? The 'stuck' message should at least give you a code location... FATAL: s_lock(0x2ac2d016) at spin.c:158, stuck spinlock. Aborting. Hmm, that's SpinAcquire, so it's one of the predefined spinlocks (and not, say, a buffer spinlock).

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-21 Thread Tatsuo Ishii
Tatsuo Ishii [EMAIL PROTECTED] writes: How can I check it? The 'stuck' message should at least give you a code location... FATAL: s_lock(0x2ac2d016) at spin.c:158, stuck spinlock. Aborting. Hmm, that's SpinAcquire, so it's one of the predefined spinlocks (and not, say, a buffer

[HACKERS] stuck spin lock with many concurrent users

2001-06-20 Thread Tatsuo Ishii
I have seen problems with extremely many concurrent users. I run pgbench: pgbench -c 1000 -t 1 test And I get stuck spin lock errors. This is 100% reproducible (i.e. I have never succeeded in pgbench -c 1000). This is Linux kernel 2.2.18. The following are some resource settings that seem

Re: [HACKERS] stuck spin lock with many concurrent users

2001-06-20 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes: I have seen problems with extremely many concurrent users. I run pgbench: pgbench -c 1000 -t 1 test And I get stuck spin lock errors. This is 100% reproducible (i.e. I have never succeeded in pgbench -c 1000). Is it actually stuck, or just timing