Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On Jun 20 19:29, Marco Atzeri wrote:
> On 20/06/2017 13:11, Corinna Vinschen wrote:
> > > > > I suggest reverting the cygwin-20170324 cygserver changes for now.
> > > > > Older versions can be configured to have reliable sysv semaphores,
> > > > > but I think no settings render sysv semaphores reliable in Cygwin
> > > > > 2.8.0.  What do you think?
> > > >
> > > > Just FYI, Corinna is away for a bit (in European time, "a bit" =
> > > > until June ;-) ) so don't be surprised if her response is delayed.
> > >
> > > as she is back, we can humbly raise her attention to the matter.
> >
> > I can do that, but wouldn't it be nice if somebody would actually
> > try to debug Cygserver further, to handle this new testcase correctly
> > as well?  6 weeks, and nobody actually tries.  Sigh.
> >
> > Corinna
>
> I agree.
>
> To my discharge:
> - I have no clue what to look for, my knowledge of cygwin inside
>   is not so good.

There are not a lot of Cygwin internals involved, it's pretty much all
in cygserver.

> - Real life is eating my available time for cygwin.
> - The Symantec Endpoint Protection BLODA, that I can not get rid of,
>   is making strace impossible and debugging a nightmare.
>
> Sorry
> Marco

No worries.  But I only have so much time on my hands either, so it
would be nice if others (not necessarily you) would try to plunge into
the code and debug it from the inside.  Sources *are* available after
all, and there's no ban on asking source code related questions on
cygwin-developers.

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On 20/06/2017 13:11, Corinna Vinschen wrote:
> > > > I suggest reverting the cygwin-20170324 cygserver changes for now.
> > > > Older versions can be configured to have reliable sysv semaphores,
> > > > but I think no settings render sysv semaphores reliable in Cygwin
> > > > 2.8.0.  What do you think?
> > >
> > > Just FYI, Corinna is away for a bit (in European time, "a bit" =
> > > until June ;-) ) so don't be surprised if her response is delayed.
> >
> > as she is back, we can humbly raise her attention to the matter.
>
> I can do that, but wouldn't it be nice if somebody would actually
> try to debug Cygserver further, to handle this new testcase correctly
> as well?  6 weeks, and nobody actually tries.  Sigh.
>
> Corinna

I agree.

To my discharge:
- I have no clue what to look for, my knowledge of cygwin inside
  is not so good.
- Real life is eating my available time for cygwin.
- The Symantec Endpoint Protection BLODA, that I can not get rid of,
  is making strace impossible and debugging a nightmare.

Sorry
Marco

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On Jun 15 01:32, Marco Atzeri wrote:
> On 07/05/2017 05:47, Larry Hall (Cygwin) wrote:
> > On 05/06/2017 11:27 PM, Noah Misch wrote:
> > > On Sat, Apr 01, 2017 at 10:36:24PM -0400, Noah Misch wrote:
> > > > On Tue, Mar 28, 2017 at 01:26:52AM -0400, Noah Misch wrote:
> > > > > On Fri, Mar 24, 2017 at 06:11:01PM +0100, Corinna Vinschen wrote:
> > > > > > I pushed a patchset now, and uploaded new developer snapshots
> > > > > > for testing to https://cygwin.com/snapshots/
> > > > > >
> > > > > > Please give it a try
> > > > >
> > > > > I call the cygwin-20170324 freezes "limited" because the symptoms
> > > > > differ from the classic freeze I described upthread.  "strace
> > > > > /bin/true" and "cat /proc/sysvipc/sem" do not hang, but every
> > > > > PostgreSQL backend process is stuck waiting on a synchronization
> > > > > primitive.
> > > > >
> > > > > I can distill another self-contained test case for the limited
> > > > > freeze seen in cygwin-20170324, but that may take a while.  I'm
> > > > > sending this early report so you're aware of the possible
> > > > > regression in cygwin-20170324.
> > > >
> > > > I'm attaching a new test program that demonstrates the regression.
> > > > My previous test program created sixteen processes that each picked
> > > > a random semaphore to lock.  Now, each process picks two semaphores
> > > > and locks them in order.  This proceeds smoothly on GNU/Linux and on
> > > > cygwin-20170321.tar.xz "cygserver -r 40".  It freezes within one
> > > > second on cygwin-20170324.tar.xz "cygserver -r 40".
> > >
> > > I suggest reverting the cygwin-20170324 cygserver changes for now.
> > > Older versions can be configured to have reliable sysv semaphores,
> > > but I think no settings render sysv semaphores reliable in Cygwin
> > > 2.8.0.  What do you think?
> >
> > Just FYI, Corinna is away for a bit (in European time, "a bit" = until
> > June ;-) ) so don't be surprised if her response is delayed.
>
> as she is back, we can humbly raise her attention to the matter.

I can do that, but wouldn't it be nice if somebody would actually
try to debug Cygserver further, to handle this new testcase correctly
as well?  6 weeks, and nobody actually tries.  Sigh.

Corinna
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On 07/05/2017 05:47, Larry Hall (Cygwin) wrote:
> On 05/06/2017 11:27 PM, Noah Misch wrote:
> > On Sat, Apr 01, 2017 at 10:36:24PM -0400, Noah Misch wrote:
> > > On Tue, Mar 28, 2017 at 01:26:52AM -0400, Noah Misch wrote:
> > > > On Fri, Mar 24, 2017 at 06:11:01PM +0100, Corinna Vinschen wrote:
> > > > > I pushed a patchset now, and uploaded new developer snapshots
> > > > > for testing to https://cygwin.com/snapshots/
> > > > >
> > > > > Please give it a try
> > > >
> > > > I call the cygwin-20170324 freezes "limited" because the symptoms
> > > > differ from the classic freeze I described upthread.  "strace
> > > > /bin/true" and "cat /proc/sysvipc/sem" do not hang, but every
> > > > PostgreSQL backend process is stuck waiting on a synchronization
> > > > primitive.
> > > >
> > > > I can distill another self-contained test case for the limited
> > > > freeze seen in cygwin-20170324, but that may take a while.  I'm
> > > > sending this early report so you're aware of the possible
> > > > regression in cygwin-20170324.
> > >
> > > I'm attaching a new test program that demonstrates the regression.
> > > My previous test program created sixteen processes that each picked
> > > a random semaphore to lock.  Now, each process picks two semaphores
> > > and locks them in order.  This proceeds smoothly on GNU/Linux and on
> > > cygwin-20170321.tar.xz "cygserver -r 40".  It freezes within one
> > > second on cygwin-20170324.tar.xz "cygserver -r 40".
> >
> > I suggest reverting the cygwin-20170324 cygserver changes for now.
> > Older versions can be configured to have reliable sysv semaphores, but
> > I think no settings render sysv semaphores reliable in Cygwin 2.8.0.
> > What do you think?
>
> Just FYI, Corinna is away for a bit (in European time, "a bit" = until
> June ;-) ) so don't be surprised if her response is delayed.

as she is back, we can humbly raise her attention to the matter.

Regards
Marco
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On 05/06/2017 11:27 PM, Noah Misch wrote:
> On Sat, Apr 01, 2017 at 10:36:24PM -0400, Noah Misch wrote:
> > On Tue, Mar 28, 2017 at 01:26:52AM -0400, Noah Misch wrote:
> > > On Fri, Mar 24, 2017 at 06:11:01PM +0100, Corinna Vinschen wrote:
> > > > I pushed a patchset now, and uploaded new developer snapshots for
> > > > testing to https://cygwin.com/snapshots/
> > > >
> > > > Please give it a try
> > >
> > > I call the cygwin-20170324 freezes "limited" because the symptoms
> > > differ from the classic freeze I described upthread.  "strace
> > > /bin/true" and "cat /proc/sysvipc/sem" do not hang, but every
> > > PostgreSQL backend process is stuck waiting on a synchronization
> > > primitive.
> > >
> > > I can distill another self-contained test case for the limited
> > > freeze seen in cygwin-20170324, but that may take a while.  I'm
> > > sending this early report so you're aware of the possible regression
> > > in cygwin-20170324.
> >
> > I'm attaching a new test program that demonstrates the regression.  My
> > previous test program created sixteen processes that each picked a
> > random semaphore to lock.  Now, each process picks two semaphores and
> > locks them in order.  This proceeds smoothly on GNU/Linux and on
> > cygwin-20170321.tar.xz "cygserver -r 40".  It freezes within one second
> > on cygwin-20170324.tar.xz "cygserver -r 40".
>
> I suggest reverting the cygwin-20170324 cygserver changes for now.  Older
> versions can be configured to have reliable sysv semaphores, but I think
> no settings render sysv semaphores reliable in Cygwin 2.8.0.  What do you
> think?

Just FYI, Corinna is away for a bit (in European time, "a bit" = until
June ;-) ) so don't be surprised if her response is delayed.

-- 
Larry

A: Yes.
> Q: Are you sure?
>> A: Because it reverses the logical flow of conversation.
>>> Q: Why is top posting annoying in email?
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On Sat, Apr 01, 2017 at 10:36:24PM -0400, Noah Misch wrote:
> On Tue, Mar 28, 2017 at 01:26:52AM -0400, Noah Misch wrote:
> > On Fri, Mar 24, 2017 at 06:11:01PM +0100, Corinna Vinschen wrote:
> > > I pushed a patchset now, and uploaded new developer snapshots for
> > > testing to https://cygwin.com/snapshots/
> > >
> > > Please give it a try
> >
> > I call the cygwin-20170324 freezes "limited" because the symptoms
> > differ from the classic freeze I described upthread.  "strace
> > /bin/true" and "cat /proc/sysvipc/sem" do not hang, but every
> > PostgreSQL backend process is stuck waiting on a synchronization
> > primitive.
> >
> > I can distill another self-contained test case for the limited freeze
> > seen in cygwin-20170324, but that may take a while.  I'm sending this
> > early report so you're aware of the possible regression in
> > cygwin-20170324.
>
> I'm attaching a new test program that demonstrates the regression.  My
> previous test program created sixteen processes that each picked a random
> semaphore to lock.  Now, each process picks two semaphores and locks them
> in order.  This proceeds smoothly on GNU/Linux and on
> cygwin-20170321.tar.xz "cygserver -r 40".  It freezes within one second on
> cygwin-20170324.tar.xz "cygserver -r 40".

I suggest reverting the cygwin-20170324 cygserver changes for now.  Older
versions can be configured to have reliable sysv semaphores, but I think no
settings render sysv semaphores reliable in Cygwin 2.8.0.  What do you think?
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On Tue, Mar 28, 2017 at 01:26:52AM -0400, Noah Misch wrote:
> On Fri, Mar 24, 2017 at 06:11:01PM +0100, Corinna Vinschen wrote:
> > I pushed a patchset now, and uploaded new developer snapshots for
> > testing to https://cygwin.com/snapshots/
> >
> > Please give it a try
>
> I call the cygwin-20170324 freezes "limited" because the symptoms differ
> from the classic freeze I described upthread.  "strace /bin/true" and "cat
> /proc/sysvipc/sem" do not hang, but every PostgreSQL backend process is
> stuck waiting on a synchronization primitive.
>
> I can distill another self-contained test case for the limited freeze seen
> in cygwin-20170324, but that may take a while.  I'm sending this early
> report so you're aware of the possible regression in cygwin-20170324.

I'm attaching a new test program that demonstrates the regression.  My
previous test program created sixteen processes that each picked a random
semaphore to lock.  Now, each process picks two semaphores and locks them in
order.  This proceeds smoothly on GNU/Linux and on cygwin-20170321.tar.xz
"cygserver -r 40".  It freezes within one second on cygwin-20170324.tar.xz
"cygserver -r 40".

/*
 * Demonstrate cygserver bug introduced in cygwin-20170324.tar.xz snapshot.  Run
 * without arguments.  Test against "cygserver -r 40" to get enough threads.
 * This is otherwise compatible with default cygserver settings; it uses a
 * single semaphore set of eight semaphores.
 *
 * Output will cease within a few seconds.  On older cygserver and non-Cygwin
 * systems, it will run to completion.
 */

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>

#define SEM_KEY 0x631a2c3f
#define N_WORKER 16
#define N_SEMA (N_WORKER/2)
#define N_CYCLE 100

union semun
{
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

static int print_every = 1;

/* Decrement one semaphore, blocking until its value is positive. */
static int
lock(int set, unsigned short sem_num)
{
    struct sembuf op;

    op.sem_num = sem_num;
    op.sem_op = -1;
    op.sem_flg = 0;
    if (0 > semop(set, &op, 1))
    {
        perror("semop");
        return 1;
    }
    return 0;
}

/* Increment one semaphore, waking a waiter if there is one. */
static int
unlock(int set, unsigned short sem_num)
{
    struct sembuf op;

    op.sem_num = sem_num;
    op.sem_op = 1;              /* only difference vs. lock() */
    op.sem_flg = 0;
    if (0 > semop(set, &op, 1))
    {
        perror("semop");
        return 1;
    }
    return 0;
}

/* In parallel, N_WORKER processes run this function. */
static int
do_worker(int ordinal, int set)
{
    int i;

    printf("start worker %d\n", ordinal);
    fflush(stdout);

    for (i = 1; i <= N_CYCLE; i++)
    {
        unsigned short s0, s1;

        /* Pick two non-identical semaphore numbers. */
        s0 = random() % N_SEMA;
        do
        {
            s1 = random() % N_SEMA;
        } while (s0 == s1);

        /* Lock the lower one first, thereby preventing deadlock. */
        if (lock(set, s0 < s1 ? s0 : s1) ||
            lock(set, s0 < s1 ? s1 : s0) ||
            unlock(set, s0) ||
            unlock(set, s1))
            return 1;

        if (i % print_every == 0)
        {
            printf("worker %d: %d cycles elapsed\n", ordinal, i);
            fflush(stdout);
        }
    }

    return 0;
}

int
main(int argc, char **argv)
{
    int status = 1, set, i, child_status;

    if (argc == 2)
        print_every = atoi(argv[1]);
    else if (argc != 1)
    {
        fprintf(stderr, "Usage: sema_two [print-every-N]\n");
        return status;
    }

    puts("semget");
    fflush(stdout);
    set = semget(SEM_KEY, N_SEMA, IPC_CREAT | 0600);
    if (set == -1)
    {
        perror("semget");
        return status;
    }

    /* Every semaphore starts at 1, i.e. unlocked. */
    puts("SETVAL");
    fflush(stdout);
    for (i = 0; i < N_SEMA; i++)
    {
        union semun s;

        s.val = 1;
        if (0 > semctl(set, i, SETVAL, s))
        {
            perror("semctl(SETVAL)");
            goto cleanup;
        }
    }

    for (i = 0; i < N_WORKER; i++)
    {
        pid_t pid;

        pid = fork();
        switch (pid)
        {
            case -1:
                perror("fork");
                goto cleanup;
            case 0:
                return do_worker(i, set);
        }
    }

    status = 0;

cleanup:
    /* Reap all children, then remove the semaphore set. */
    while (wait(&child_status) != -1)
        ;
    if (errno != ECHILD)
    {
        perror("wait");
        status = 1;
    }

    if (0 > semctl(set, 0, IPC_RMID))
    {
        perror("semctl(IPC_RMID)");
        status = 1;
    }

    return status;
}
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On Fri, Mar 24, 2017 at 06:11:01PM +0100, Corinna Vinschen wrote:
> - cygserver is using a defined number of threads in a thread pool for
>   application requests.  Every request is added to a request submission
>   queue and handled by the next free thread in the pool.
>
>   The default number of threads in the pool is 10.  Each wait for a
>   semaphore is blocking one thread.  If more than the number of threads
>   in the pool are supposed to wait on a semaphore the pool starves.

Interesting.  I can confirm that, without updating software, "cygserver -r 40"
fixes both my self-contained test and my PostgreSQL test case.  Folks can use
that workaround in released-version installations.

>   So what I did now is to allow cygserver to raise the number of worker
>   threads on demand.  That is, if a request is in the queue and all
>   worker threads are busy, just create a new one.
>
>   There's no way yet to drop threads again, but this should be a minor
>   problem in scenarios which really have a lot of contention.

Agreed.  This is nicer.

> I pushed a patchset now, and uploaded new developer snapshots for
> testing to https://cygwin.com/snapshots/
>
> Please give it a try

Self-contained test case results look good:

cygwin-20170321.tar.xz "cygserver -r40": ok
cygwin-20170324.tar.xz "cygserver -r40": ok
cygwin-20170321.tar.xz "cygserver -r10": freezes (expected)
cygwin-20170324.tar.xz "cygserver -r10": ok; cygserver output concludes with
  "cygserver: All threads busy, added one (now 21)".

I then tried my PostgreSQL test case ("pgbench -i -s 50" once to setup, then
"pgbench -S -j2 -c16 -T900 -P5" to test):

cygwin-20170321.tar.xz "cygserver -r40": ok for >3600s
cygwin-20170324.tar.xz "cygserver -r40": limited freeze in <1000s; no
  cygserver output
cygwin-20170321.tar.xz "cygserver -r10": classic freeze in <1000s (expected)
cygwin-20170324.tar.xz "cygserver -r10": limited freeze in <1000s; no
  cygserver output for most of the run, then output concluding with
  "cygserver: All threads busy, added one (now 15)" just before the freeze

I call the cygwin-20170324 freezes "limited" because the symptoms differ from
the classic freeze I described upthread.  "strace /bin/true" and "cat
/proc/sysvipc/sem" do not hang, but every PostgreSQL backend process is stuck
waiting on a synchronization primitive.

I can distill another self-contained test case for the limited freeze seen in
cygwin-20170324, but that may take a while.  I'm sending this early report so
you're aware of the possible regression in cygwin-20170324.

Thanks,
nm
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On 25/03/2017 12:30, Corinna Vinschen wrote:
> On Mar 25 09:09, Marco Atzeri wrote:
> > It seems that the number of max available semaphores is frozen to first
> > call value.
>
> That's normal and documented.  An existing semaphore set using the same
> key has the number of semaphores defined in the first call, until you
> remove the semaphore set with, for instance, ipcrm -s.
>
> POSIX has this to say:
>
>   [EINVAL]
>     The value of nsems is either less than or equal to 0 or greater than
>     the system-imposed limit, or a semaphore identifier exists for the
>     argument key, but the number of semaphores in the set associated with
>     it is less than nsems and nsems is not equal to 0.
>
> Linux doesn't care, but BSD does, and our XSI IPC code is 95% BSD.
>
> Corinna

noted.
Thanks

Marco
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On Mar 25 09:09, Marco Atzeri wrote:
> On 24/03/2017 18:11, Corinna Vinschen wrote:
> > Hi Noah,
> >
> > > On GNU/Linux, AIX, and Solaris, the processes keep busy and finish one
> > > million lock/unlock cycles apiece in a few minutes.  On Cygwin, they
> > > hang within a few seconds and under one hundred cycles apiece.  At
> > > that point, cygserver is unresponsive to other clients; for example,
> > > "strace /bin/true", opening a new Cygwin terminal, "cat
> > > /proc/sysvipc/sem" and "cygserver -S" all hang.  In most tests,
> > > cygserver was not consuming CPU while unresponsive.
> >
> > I pushed a patchset now, and uploaded new developer snapshots for
> > testing to https://cygwin.com/snapshots/
> >
> > I'm also going to create a 2.8.0-0.4 test release later today.
> >
> > Please give it a try, and please note that *all* patches affect
> > cygserver itself, so you have to test the new cygserver in the
> > first place.  The Cygwin DLL is not affected by the changes.
> >
> > Thanks,
> > Corinna
>
> Hi Corinna,
> just noted a small glitch.
>
> Attached a modification of Noah's test, it now accepts the number of
> workers, and the semaphores are, as before, workers/4
>
>  ./sema_parallel-2 32
>  worker
>  OK
>
>  ./sema_parallel-2 64
>  semget
>  semget: Invalid argument
>
> If I restart the cygserver
>
>  ./sema_parallel-2 64
>  worker
>  OK
>
>  ./sema_parallel-2 128
>  semget
>  semget: Invalid argument
>
> It seems that the number of max available semaphores is frozen to first
> call value.

That's normal and documented.  An existing semaphore set using the same
key has the number of semaphores defined in the first call, until you
remove the semaphore set with, for instance, ipcrm -s.

POSIX has this to say:

  [EINVAL]
    The value of nsems is either less than or equal to 0 or greater than
    the system-imposed limit, or a semaphore identifier exists for the
    argument key, but the number of semaphores in the set associated with
    it is less than nsems and nsems is not equal to 0.

Linux doesn't care, but BSD does, and our XSI IPC code is 95% BSD.

Corinna
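The rule quoted from POSIX can be checked with a few lines of C. This is only a sketch: the key DEMO_KEY and the helper reopen_errno() are made up for illustration and appear nowhere in the thread. The helper creates a set, asks for the same key again with a different nsems, records what happened, and removes the set again, which is the in-program counterpart of ipcrm -s.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical key, chosen for this sketch only. */
#define DEMO_KEY 0x631a2c40

/*
 * Create a set of `first` semaphores under DEMO_KEY, then call semget()
 * again on the same key asking for `second` semaphores.
 *
 * Returns  0 if the second semget() succeeded,
 *          the errno value if it failed,
 *         -1 if the set could not be created in the first place.
 */
static int
reopen_errno(int first, int second)
{
    int set, again, err = 0;

    /* IPC_EXCL: refuse to reuse a set left behind by someone else. */
    set = semget(DEMO_KEY, first, IPC_CREAT | IPC_EXCL | 0600);
    if (set == -1)
        return -1;

    again = semget(DEMO_KEY, second, IPC_CREAT | 0600);
    if (again == -1)
        err = errno;

    /* Remove the set so the size is not "frozen" for the next caller. */
    semctl(set, 0, IPC_RMID);
    return err;
}
```

On an implementation that enforces the POSIX wording, reopen_errno(4, 8) should come back as EINVAL, while reopen_errno(8, 4) and reopen_errno(4, 4) should return 0. That has the same shape as the 32-then-64 run above: the second call asks for more semaphores than the existing set holds.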
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On 24/03/2017 18:11, Corinna Vinschen wrote:
> Hi Noah,
>
> > On GNU/Linux, AIX, and Solaris, the processes keep busy and finish one
> > million lock/unlock cycles apiece in a few minutes.  On Cygwin, they
> > hang within a few seconds and under one hundred cycles apiece.  At that
> > point, cygserver is unresponsive to other clients; for example, "strace
> > /bin/true", opening a new Cygwin terminal, "cat /proc/sysvipc/sem" and
> > "cygserver -S" all hang.  In most tests, cygserver was not consuming
> > CPU while unresponsive.
>
> I pushed a patchset now, and uploaded new developer snapshots for
> testing to https://cygwin.com/snapshots/
>
> I'm also going to create a 2.8.0-0.4 test release later today.
>
> Please give it a try, and please note that *all* patches affect
> cygserver itself, so you have to test the new cygserver in the
> first place.  The Cygwin DLL is not affected by the changes.
>
> Thanks,
> Corinna

Hi Corinna,
just noted a small glitch.

Attached a modification of Noah's test, it now accepts the number of workers,
and the semaphores are, as before, workers/4

 ./sema_parallel-2 32
 worker
 OK

 ./sema_parallel-2 64
 semget
 semget: Invalid argument

If I restart the cygserver

 ./sema_parallel-2 64
 worker
 OK

 ./sema_parallel-2 128
 semget
 semget: Invalid argument

It seems that the number of max available semaphores is frozen to first call
value.

/*
 * Demonstrate cygserver hang under concurrent sysv semaphore traffic.  Run
 * with the number of workers as first argument.  Output will cease within a
 * few seconds, and cygserver will be unresponsive to all clients.
 *
 * It uses a single semaphore set of workers/4 semaphores.
 */

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>

#define SEM_KEY 0x631a2c3e
#define N_CYCLE 100

union semun
{
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

static int print_every = 1;

/* In parallel, N_WORKER processes run this function. */
static int
do_worker(int ordinal, int set, int N_SEMA)
{
    int i;
    struct sembuf op;

    printf("start worker %d\n", ordinal);
    fflush(stdout);

    op.sem_flg = 0;
    for (i = 1; i <= N_CYCLE; i++)
    {
        /* Lock a randomly chosen semaphore ... */
        op.sem_num = random() % N_SEMA;
        op.sem_op = -1;
        if (0 > semop(set, &op, 1))
        {
            perror("semop");
            return 1;
        }

        /* ... and unlock it again right away. */
        op.sem_op = 1;
        if (0 > semop(set, &op, 1))
        {
            perror("semop");
            return 1;
        }

        if (i % print_every == 0)
        {
            printf("worker %d: %d cycles elapsed\n", ordinal, i);
            fflush(stdout);
        }
    }

    return 0;
}

int
main(int argc, char **argv)
{
    int status = 1, set, i, child_status;
    int N_WORKER, N_SEMA;

    switch (argc)
    {
        case 3:
            print_every = atoi(argv[2]);
            /* fall through */
        case 2:
            N_WORKER = atoi(argv[1]);
            N_SEMA = (N_WORKER / 4);
            break;
        default:
            fprintf(stderr, "Usage: sema_parallel workers [print-every-N]\n");
            return status;
    }

    puts("semget");
    fflush(stdout);
    set = semget(SEM_KEY, N_SEMA, IPC_CREAT | 0600);
    if (set == -1)
    {
        perror("semget");
        return status;
    }

    puts("SETVAL");
    fflush(stdout);
    for (i = 0; i < N_SEMA; i++)
    {
        union semun s;

        s.val = 1;
        if (0 > semctl(set, i, SETVAL, s))
        {
            perror("semctl(SETVAL)");
            goto cleanup;
        }
    }

    for (i = 0; i < N_WORKER; i++)
    {
        pid_t pid;

        pid = fork();
        switch (pid)
        {
            case -1:
                perror("fork");
                goto cleanup;
            case 0:
                return do_worker(i, set, N_SEMA);
        }
    }

    status = 0;

cleanup:
    while (wait(&child_status) != -1)
        ;
    if (errno != ECHILD)
    {
        perror("wait");
        status = 1;
    }

    if (0 > semctl(set, 0, IPC_RMID))
    {
        perror("semctl(IPC_RMID)");
        status = 1;
    }

    return status;
}
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
Hi Noah,

thanks for the report and especially the testcase.  It took me a while
to debug that, but I think I fixed it now.  At least your testcase is
working for me now.  It also got faster, albeit always slower than Linux
because of the communication overhead between processes and cygserver.

On Mar 21 02:56, Noah Misch wrote:
> On Tue, Aug 03, 2004 at 12:06:12PM +0200, Corinna Vinschen wrote:
> > On Aug  2 20:33, sarbx-cygwin6...@mailblocks.com wrote:
> > > This time around, cygserver does not eat CPU. But after 5 to 6
> > > concurrent connections nothing seems to work, looks kind of hung.
> > > There is no activity in the Postgres log file. Opening a new database
> > > connection also hangs. There is no activity on the machine.
> >
> > Any chance to create a simple testcase which uncovers that behaviour
> > without involving a whole database system?
>
> Attached test program reproduces it on Cygwin 2.7.0, Cygwin 1.7.5, and a
> few intermediate versions.  The program creates sixteen processes that
> each perform a tight loop over the following:
>
> - select one of four semaphores
> - reduce semaphore's value from 1 to 0 ("lock" it)
> - raise semaphore's value from 0 to 1 ("unlock" it)
>
> On GNU/Linux, AIX, and Solaris, the processes keep busy and finish one
> million lock/unlock cycles apiece in a few minutes.  On Cygwin, they hang
> within a few seconds and under one hundred cycles apiece.  At that point,
> cygserver is unresponsive to other clients; for example, "strace
> /bin/true", opening a new Cygwin terminal, "cat /proc/sysvipc/sem" and
> "cygserver -S" all hang.  In most tests, cygserver was not consuming CPU
> while unresponsive.

There are two problems here.

- cygserver is using a defined number of threads in a thread pool for
  application requests.  Every request is added to a request submission
  queue and handled by the next free thread in the pool.

  The default number of threads in the pool is 10.  Each wait for a
  semaphore is blocking one thread.  If more than the number of threads
  in the pool are supposed to wait on a semaphore the pool starves.

  Raising the pool size fixes this part, but the situation is still a
  bit unsatisfying.  You may not know the load and the number of
  competing processes in every scenario beforehand, but raising
  cygserver's thread pool to some really big value doesn't always make
  sense either.

  So what I did now is to allow cygserver to raise the number of worker
  threads on demand.  That is, if a request is in the queue and all
  worker threads are busy, just create a new one.

  There's no way yet to drop threads again, but this should be a minor
  problem in scenarios which really have a lot of contention.

- The code emulating BSD msleep/wakeup wasn't quite up to speed.  I
  rewrote a major part of the code to be more robust and faster.

I pushed a patchset now, and uploaded new developer snapshots for
testing to https://cygwin.com/snapshots/

I'm also going to create a 2.8.0-0.4 test release later today.

Please give it a try, and please note that *all* patches affect
cygserver itself, so you have to test the new cygserver in the
first place.  The Cygwin DLL is not affected by the changes.

Thanks,
Corinna
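The first problem described in the message above, a fixed pool that starves once every worker is blocked, can be pictured with a toy model. This is not cygserver code and it creates no real threads: struct pool, submit_blocking() and stuck_requests() are hypothetical names for this sketch, which only counts how many blocking requests find a worker under the two policies (fixed size versus grow on demand).

```c
#include <stdbool.h>

/* Toy model only: counts workers instead of creating threads. */
struct pool
{
    int threads;    /* workers that exist */
    int busy;       /* workers currently blocked inside a request */
    bool grow;      /* true: add a worker whenever all are busy */
};

/*
 * A request arrives that will block its worker, e.g. a wait on a
 * semaphore that somebody else holds.  Returns true if a worker picked
 * it up, false if it stays in the queue because the pool is starved.
 */
static bool
submit_blocking(struct pool *p)
{
    if (p->busy == p->threads)
    {
        if (!p->grow)
            return false;   /* fixed pool: nobody is left to serve it */
        p->threads++;       /* on-demand pool: create one more worker */
    }
    p->busy++;
    return true;
}

/* How many of n blocking requests are left unserved in the queue. */
static int
stuck_requests(int initial_threads, bool grow, int n)
{
    struct pool p = { initial_threads, 0, grow };
    int i, stuck = 0;

    for (i = 0; i < n; i++)
        if (!submit_blocking(&p))
            stuck++;
    return stuck;
}
```

With the default of 10 threads and 16 requests that all block, stuck_requests(10, false, 16) leaves 6 requests in the queue. In this model nothing queued behind them is served either, which would include a request that could have released one of the waiters. Starting with 40 threads, or letting the pool grow, leaves none stuck for the same load.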
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On 21/03/2017 03:56, Noah Misch wrote:
> On Tue, Aug 03, 2004 at 12:06:12PM +0200, Corinna Vinschen wrote:
> > On Aug  2 20:33, sarbx-cygwin6...@mailblocks.com wrote:
> > > This time around, cygserver does not eat CPU. But after 5 to 6
> > > concurrent connections nothing seems to work, looks kind of hung.
> > > There is no activity in the Postgres log file. Opening a new database
> > > connection also hangs. There is no activity on the machine.
> >
> > Any chance to create a simple testcase which uncovers that behaviour
> > without involving a whole database system?
>
> Attached test program reproduces it on Cygwin 2.7.0, Cygwin 1.7.5, and a
> few intermediate versions.  The program creates sixteen processes that
> each perform a tight loop over the following:

same on x86_64 2.8.0-0.1

> - select one of four semaphores
> - reduce semaphore's value from 1 to 0 ("lock" it)
> - raise semaphore's value from 0 to 1 ("unlock" it)
>
> On GNU/Linux, AIX, and Solaris, the processes keep busy and finish one
> million lock/unlock cycles apiece in a few minutes.  On Cygwin, they hang
> within a few seconds and under one hundred cycles apiece.  At that point,
> cygserver is unresponsive to other clients; for example, "strace
> /bin/true", opening a new Cygwin terminal, "cat /proc/sysvipc/sem" and
> "cygserver -S" all hang.  In most tests, cygserver was not consuming CPU
> while unresponsive.

confirmed

> Thanks,
> nm

Regards
Marco
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On Tue, Aug 03, 2004 at 12:06:12PM +0200, Corinna Vinschen wrote:
> On Aug  2 20:33, sarbx-cygwin6...@mailblocks.com wrote:
> > This time around, cygserver does not eat CPU. But after 5 to 6
> > concurrent connections nothing seems to work, looks kind of hung.
> > There is no activity in the Postgres log file. Opening a new database
> > connection also hangs. There is no activity on the machine.
>
> Any chance to create a simple testcase which uncovers that behaviour
> without involving a whole database system?

Attached test program reproduces it on Cygwin 2.7.0, Cygwin 1.7.5, and a few
intermediate versions.  The program creates sixteen processes that each
perform a tight loop over the following:

- select one of four semaphores
- reduce semaphore's value from 1 to 0 ("lock" it)
- raise semaphore's value from 0 to 1 ("unlock" it)

On GNU/Linux, AIX, and Solaris, the processes keep busy and finish one million
lock/unlock cycles apiece in a few minutes.  On Cygwin, they hang within a few
seconds and under one hundred cycles apiece.  At that point, cygserver is
unresponsive to other clients; for example, "strace /bin/true", opening a new
Cygwin terminal, "cat /proc/sysvipc/sem" and "cygserver -S" all hang.  In most
tests, cygserver was not consuming CPU while unresponsive.

Thanks,
nm

/*
 * Demonstrate cygserver hang under concurrent sysv semaphore traffic.  Run
 * without arguments.  Output will cease within a few seconds, and cygserver
 * will be unresponsive to all clients.
 *
 * This is compatible with default cygserver settings; it uses a single
 * semaphore set of four semaphores.
 */

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>

#define SEM_KEY 0x631a2c3e
#define N_WORKER 16
#define N_SEMA (N_WORKER/4)
#define N_CYCLE 100

union semun
{
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

static int print_every = 1;

/* In parallel, N_WORKER processes run this function. */
static int
do_worker(int ordinal, int set)
{
    int i;
    struct sembuf op;

    printf("start worker %d\n", ordinal);
    fflush(stdout);

    op.sem_flg = 0;
    for (i = 1; i <= N_CYCLE; i++)
    {
        /* Lock a randomly chosen semaphore ... */
        op.sem_num = random() % N_SEMA;
        op.sem_op = -1;
        if (0 > semop(set, &op, 1))
        {
            perror("semop");
            return 1;
        }

        /* ... and unlock it again right away. */
        op.sem_op = 1;
        if (0 > semop(set, &op, 1))
        {
            perror("semop");
            return 1;
        }

        if (i % print_every == 0)
        {
            printf("worker %d: %d cycles elapsed\n", ordinal, i);
            fflush(stdout);
        }
    }

    return 0;
}

int
main(int argc, char **argv)
{
    int status = 1, set, i, child_status;

    if (argc == 2)
        print_every = atoi(argv[1]);
    else if (argc != 1)
    {
        fprintf(stderr, "Usage: sema_parallel [print-every-N]\n");
        return status;
    }

    puts("semget");
    fflush(stdout);
    set = semget(SEM_KEY, N_SEMA, IPC_CREAT | 0600);
    if (set == -1)
    {
        perror("semget");
        return status;
    }

    /* Every semaphore starts at 1, i.e. unlocked. */
    puts("SETVAL");
    fflush(stdout);
    for (i = 0; i < N_SEMA; i++)
    {
        union semun s;

        s.val = 1;
        if (0 > semctl(set, i, SETVAL, s))
        {
            perror("semctl(SETVAL)");
            goto cleanup;
        }
    }

    for (i = 0; i < N_WORKER; i++)
    {
        pid_t pid;

        pid = fork();
        switch (pid)
        {
            case -1:
                perror("fork");
                goto cleanup;
            case 0:
                return do_worker(i, set);
        }
    }

    status = 0;

cleanup:
    /* Reap all children, then remove the semaphore set. */
    while (wait(&child_status) != -1)
        ;
    if (errno != ECHILD)
    {
        perror("wait");
        status = 1;
    }

    if (0 > semctl(set, 0, IPC_RMID))
    {
        perror("semctl(IPC_RMID)");
        status = 1;
    }

    return status;
}
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
Thanks for the information. I'm attaching the gzipped log file.

Sarva

-----Original Message-----
From: Christopher Faylor <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Tue, 3 Aug 2004 16:54:34 -0400
Subject: Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop

On Tue, Aug 03, 2004 at 01:45:07PM -0700, Saravanan Bellan wrote:
> The compressed log file .zip is about 19K. Uncompressed is large. This
> mailing list does not accept attachments.

Yes, it does. sourceware.org doesn't accept zip, exe, bat, etc. attachments.
It does accept gzipped attachments.

cgf

cygserver.log.gz
Description: Binary data
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On Tue, Aug 03, 2004 at 01:45:07PM -0700, Saravanan Bellan wrote:
>The compressed log file .zip is about 19K. Uncompressed is large. This
>mailing list does not accept attachments.

Yes, it does. sourceware.org doesn't accept zip, exe, bat, etc. attachments.
It does accept gzipped attachments.

cgf
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
The compressed log file .zip is about 19K. Uncompressed is large. This
mailing list does not accept attachments. I do not know of any other way
to reproduce the problem, except through load testing of the database.

Thanks,
-Sarva

-----Original Message-----
From: Corinna Vinschen <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Tue, 3 Aug 2004 12:06:12 +0200
Subject: Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop

On Aug 2 20:33, [EMAIL PROTECTED] wrote:
> This time around, cygserver does not eat CPU. But after 5 to 6
> concurrent connections nothing seems to work, looks kind of hung.
> There is no activity in the Postgres log file. Opening a new
> database connection also hangs. There is no activity on the machine.
>
> Let me know an alternate address where I can send you the log file.

Send it to this list, please.

Any chance to create a simple testcase which uncovers that behaviour
without involving a whole database system?

Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Co-Project Leader          mailto:[EMAIL PROTECTED]
Red Hat, Inc.
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On Aug 2 20:33, [EMAIL PROTECTED] wrote:
> This time around, cygserver does not eat CPU. But after 5 to 6
> concurrent connections nothing seems to work, looks kind of hung.
> There is no activity in the Postgres log file. Opening a new
> database connection also hangs. There is no activity on the machine.
>
> Let me know an alternate address where I can send you the log file.

Send it to this list, please.

Any chance to create a simple testcase which uncovers that behaviour
without involving a whole database system?

Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Co-Project Leader          mailto:[EMAIL PROTECTED]
Red Hat, Inc.
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On Jul 30 14:32, [EMAIL PROTECTED] wrote:
>On Jul 28 20:19, [EMAIL PROTECTED] wrote:
>>$ postgres --version
>>postgres (PostgreSQL) 7.4.3
>>
>>$ uname -a
>>CYGWIN_NT-5.0 sbellan-nb 1.5.10(0.116/4/2) 2004-05-25 22:07 i686 unknown unknown Cygwin
>>
>>While doing Load testing using DOTS, after 5 to 6 connections the
>>machine starts to slow down and cygserver seems to hog most of the CPU.
>>After running the cygserver in debug mode, I see the following in the
>>log file repeating infinitely.
>>[...]
>
>Just to be sure, are you talking about 5 or 6 *concurrent* connections?
>That might be important. I'm not exactly sure so far but there might
>be a race condition in my mutex code.
>
>Corinna

Yes, I'm talking about 5 or 6 *concurrent* connections.

Ok, that's what I'd suspected. I have patched Cygserver and I'd like to
get some feedback on whether this solves the problem or not. Would you
please try the latest developer snapshot from http://cygwin.com/snapshots/
and report if that version of Cygserver still eats up CPU time under the
above circumstances?

Corinna

This time around, cygserver does not eat CPU. But after 5 to 6 concurrent
connections nothing seems to work, looks kind of hung. There is no activity
in the Postgres log file. Opening a new database connection also hangs.
There is no activity on the machine.

Let me know an alternate address where I can send you the log file.
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On Jul 30 14:32, [EMAIL PROTECTED] wrote:
> >On Jul 28 20:19, [EMAIL PROTECTED] wrote:
> >>$ postgres --version
> >>postgres (PostgreSQL) 7.4.3
> >>
> >>$ uname -a
> >>CYGWIN_NT-5.0 sbellan-nb 1.5.10(0.116/4/2) 2004-05-25 22:07 i686 unknown unknown Cygwin
> >>
> >>While doing Load testing using DOTS, after 5 to 6 connections the
> >>machine starts to slow down and cygserver seems to hog most of the CPU.
> >>After running the cygserver in debug mode, I see the following in the
> >>log file repeating infinitely.
> >>[...]
> >
> >Just to be sure, are you talking about 5 or 6 *concurrent* connections?
> >That might be important. I'm not exactly sure so far but there might
> >be a race condition in my mutex code.
> >
> >Corinna
>
> Yes, I'm talking about 5 or 6 *concurrent* connections.

Ok, that's what I'd suspected. I have patched Cygserver and I'd like to
get some feedback on whether this solves the problem or not. Would you
please try the latest developer snapshot from http://cygwin.com/snapshots/
and report if that version of Cygserver still eats up CPU time under the
above circumstances?

Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Co-Project Leader          mailto:[EMAIL PROTECTED]
Red Hat, Inc.
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
>On Jul 28 20:19, [EMAIL PROTECTED] wrote:
>>$ postgres --version
>>postgres (PostgreSQL) 7.4.3
>>
>>$ uname -a
>>CYGWIN_NT-5.0 sbellan-nb 1.5.10(0.116/4/2) 2004-05-25 22:07 i686 unknown unknown Cygwin
>>
>>While doing Load testing using DOTS, after 5 to 6 connections the
>>machine starts to slow down and cygserver seems to hog most of the CPU.
>>After running the cygserver in debug mode, I see the following in the
>>log file repeating infinitely.
>>[...]
>
>Just to be sure, are you talking about 5 or 6 *concurrent* connections?
>That might be important. I'm not exactly sure so far but there might
>be a race condition in my mutex code.
>
>Corinna

Yes, I'm talking about 5 or 6 *concurrent* connections.

Thanks,
Re: cygserver - Postgres Multiple connection Load Testing - Inifinte Loop
On Jul 28 20:19, [EMAIL PROTECTED] wrote:
> $ postgres --version
> postgres (PostgreSQL) 7.4.3
>
> $ uname -a
> CYGWIN_NT-5.0 sbellan-nb 1.5.10(0.116/4/2) 2004-05-25 22:07 i686 unknown unknown Cygwin
>
> While doing Load testing using DOTS, after 5 to 6 connections the
> machine starts to slow down and cygserver seems to hog most of the CPU.
> After running the cygserver in debug mode, I see the following in the
> log file repeating infinitely.
> [...]

Just to be sure, are you talking about 5 or 6 *concurrent* connections?
That might be important. I'm not exactly sure so far but there might
be a race condition in my mutex code.

Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Co-Project Leader          mailto:[EMAIL PROTECTED]
Red Hat, Inc.