Re: [HACKERS] Unixware 714 pthreads
On Thu, 28 Oct 2004, Bruce Momjian wrote:

> Date: Thu, 28 Oct 2004 19:58:56 -0400 (EDT)
> From: Bruce Momjian [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Cc: Tom Lane [EMAIL PROTECTED], [EMAIL PROTECTED]
> Subject: Re: [HACKERS] Unixware 714 pthreads
>
> [EMAIL PROTECTED] wrote:
> > I agree with all that you say, Tom. I'm just asking for some help to debug this. Now that Larry is a little off the list, I'm feeling really lonely on UW.
> >
> > SCO won't do anything until I come up with a test program that fails. All my attempts have worked so far. I use other threaded programs like postfix or bind that never fail.
> >
> > I'm really at a loss. Would you/someone help me?
>
> The problem is that we are then spending our time debugging Unixware problems rather than focusing on our database software. I think this is why few have offered assistance.

I understand your concerns; OTOH you all spend quite a lot of time debugging Windows...

Anyway, the little program attached does more or less what PostgreSQL does in terms of sigaction and setitimer (yes, Unixware DOES have POSIX signals) and works like a charm whether compiled with or without pthreads.

Regards
-- 
Olivier PRENANT                 Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou           +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                       +33-6-07-63-80-64 (GSM)
FRANCE                          Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/time.h>

void handler(int sig);

int
main(void)
{
    sigset_t UnBlockSig, BlockSig, AuthBlockSig;
    struct sigaction act, oact;

    sigemptyset(&UnBlockSig);
    sigfillset(&BlockSig);
    sigfillset(&AuthBlockSig);

    /*
     * Unmark those signals that should never be blocked. Some of these
     * signal names don't exist on all platforms. Most do, but might as
     * well ifdef them all for consistency...
     */
#ifdef SIGTRAP
    sigdelset(&BlockSig, SIGTRAP);
    sigdelset(&AuthBlockSig, SIGTRAP);
#endif
#ifdef SIGABRT
    sigdelset(&BlockSig, SIGABRT);
    sigdelset(&AuthBlockSig, SIGABRT);
#endif
#ifdef SIGILL
    sigdelset(&BlockSig, SIGILL);
    sigdelset(&AuthBlockSig, SIGILL);
#endif
#ifdef SIGFPE
    sigdelset(&BlockSig, SIGFPE);
    sigdelset(&AuthBlockSig, SIGFPE);
#endif
#ifdef SIGSEGV
    sigdelset(&BlockSig, SIGSEGV);
    sigdelset(&AuthBlockSig, SIGSEGV);
#endif
#ifdef SIGBUS
    sigdelset(&BlockSig, SIGBUS);
    sigdelset(&AuthBlockSig, SIGBUS);
#endif
#ifdef SIGSYS
    sigdelset(&BlockSig, SIGSYS);
    sigdelset(&AuthBlockSig, SIGSYS);
#endif
#ifdef SIGCONT
    sigdelset(&BlockSig, SIGCONT);
    sigdelset(&AuthBlockSig, SIGCONT);
#endif
#ifdef SIGTERM
    sigdelset(&AuthBlockSig, SIGTERM);
#endif
#ifdef SIGQUIT
    sigdelset(&AuthBlockSig, SIGQUIT);
#endif
#ifdef SIGALRM
    sigdelset(&AuthBlockSig, SIGALRM);
#endif

    act.sa_handler = handler;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    // if (signo != SIGALRM)
    //     act.sa_flags |= SA_RESTART;
    /*
    #ifdef SA_NOCLDSTOP
    if (signo == SIGCHLD)
        act.sa_flags |= SA_NOCLDSTOP;
    #endif
    */
    if (sigaction(SIGALRM, &act, &oact) < 0)
        fprintf(stderr, "sigaction failed, errno = %d\n", errno);

    handler(0);                 /* arm the first timer by hand */
    for (;;)
        ;                       /* spin; SIGALRM interrupts the loop */
}

void
handler(int sig)
{
    struct itimerval timeval;

    /* Re-arm a one-shot 5 second timer on every invocation. */
    memset(&timeval, 0, sizeof(struct itimerval));
    timeval.it_value.tv_sec = 5;
    timeval.it_value.tv_usec = 0;
    if (setitimer(ITIMER_REAL, &timeval, NULL) < 0)
        fprintf(stderr, "could not set itimer, errno = %d\n", errno);
    printf("caught sig %d\n", sig);
}

---(end of broadcast)---
TIP 8: explain analyze is your friend
Re: [HACKERS] Unixware 714 pthreads
[EMAIL PROTECTED] wrote:
> On Thu, 28 Oct 2004, Bruce Momjian wrote:
>
> > Date: Thu, 28 Oct 2004 19:58:56 -0400 (EDT)
> > From: Bruce Momjian [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
> > Cc: Tom Lane [EMAIL PROTECTED], [EMAIL PROTECTED]
> > Subject: Re: [HACKERS] Unixware 714 pthreads
> >
> > [EMAIL PROTECTED] wrote:
> > > I agree with all that you say, Tom. I'm just asking for some help to debug this. Now that Larry is a little off the list, I'm feeling really lonely on UW.
> > >
> > > SCO won't do anything until I come up with a test program that fails. All my attempts have worked so far. I use other threaded programs like postfix or bind that never fail.
> > >
> > > I'm really at a loss. Would you/someone help me?
> >
> > The problem is that we are then spending our time debugging Unixware problems rather than focusing on our database software. I think this is why few have offered assistance.
>
> I understand your concerns; OTOH you all spend quite a lot of time debugging Windows...
>
> Anyway, the little program attached does more or less what PostgreSQL does in terms of sigaction and setitimer (yes, Unixware DOES have POSIX signals) and works like a charm whether compiled with or without pthreads.

That work is done by people who care about the Win32 port and plan to use it, just like you are debugging Unixware because you use it. We don't expect non-Win32 users to fix Win32 problems and we don't expect non-Unixware people to fix Unixware problems. Each platform's users have to track down its own bugs for us to work efficiently.

Also, we haven't spent much time tracking down Win32 bugs but more figuring out how to use the Win32 API.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Re: [HACKERS] Unixware 714 pthreads
Sorry to follow up my own post.

I made some more tests:

create table foo (f1 int)   -- so that it is not removed if I kill the process

Each time I do:

psql template1
template1# set statement_timeout=1000;
SET
template1# select block_me();

it works OK. If I do it a second time in the same session, block_me() never returns.

I wonder if handle_sig_alarm is re-armed after being used, or if sleep is used anywhere in the backend.

The Unixware doc for setitimer (http://www.pyrenet.fr:8458/en/man/html.3C/getitimer.3C.html) says that "A sleep following a setitimer wipes out knowledge of the user signal handler".

What can I do next?

Regards

On Wed, 27 Oct 2004 [EMAIL PROTECTED] wrote:

> Date: Wed, 27 Oct 2004 13:01:45 +0200 (MET DST)
> From: [EMAIL PROTECTED]
> Newsgroups: comp.databases.postgresql.hackers
> Subject: Re: Unixware 714 pthreads
>
> Dear Bruce,
>
> Thanks for your reply; I was desperate that I didn't get one!
>
> As I said, I'm quite sure there is a bug in the pthread library. Before saying this to SCO, I have to prove it. PostgreSQL is the way to prove it!
>
> What I need is to know where to start from (I'd like to put elogs where statement_timeout is processed to see what really happens and why it doesn't cancel the query).
>
> Could someone tell me where to look? If anyone is interested in debugging this issue with me, I can set up an account on a test Unixware machine.
>
> TIA
>
> On Tue, 26 Oct 2004, Bruce Momjian wrote:
>
> > Date: Tue, 26 Oct 2004 17:59:17 -0400 (EDT)
> > From: Bruce Momjian [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
> > Cc: [EMAIL PROTECTED]
> > Subject: Re: [HACKERS] Unixware 714 pthreads
> >
> > The only help I can be is that on Unixware (only) the backend is compiled with threading enabled. This might be showing some thread bugs.
> >
> > ---
> >
> > [EMAIL PROTECTED] wrote:
> > > Hi everyone,
> > >
> > > I need help debugging a problem I have on Unixware 714 with beta3.
> > >
> > > PostgreSQL's make check hangs on the plpgsql test when compiled with --enable-thread-safety.
> > >
> > > It hangs on select block_me();
> > >
> > > This select should be canceled by the set statement_timeout=1000; instead, the backend is 100% CPU bound and only kill -9 can kill it.
> > >
> > > It works OK when compiled without --enable-thread-safety.
> > >
> > > I've tried almost everything I could think of, but not knowing much about threads and the PG source code, I'm asking for someone to help me find a way to debug this.
> > >
> > > It worked up until beta2, but I'm not sure block_me() was there.
> > >
> > > I really need someone to tell me where to begin.
> > >
> > > TIA

-- 
Olivier PRENANT                 Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou           +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                       +33-6-07-63-80-64 (GSM)
FRANCE                          Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)
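The question raised in this message, whether a SIGALRM handler keeps working when a one-shot timer is armed a second time in the same process, can be checked outside the backend. The following is a minimal sketch, not the backend's timeout code: the names count_alarm_deliveries, on_alarm and alarms_seen are made up for illustration, and it assumes a POSIX system with sigaction, sigsuspend and setitimer. The handler is installed once without SA_RESETHAND, and the timer is then armed repeatedly.

```c
#define _XOPEN_SOURCE 700       /* request POSIX/XSI declarations */
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Illustrative names only; none of this is taken from the PostgreSQL source. */
static volatile sig_atomic_t alarms_seen = 0;

static void on_alarm(int sig)
{
    (void) sig;
    alarms_seen++;
}

/*
 * Install on_alarm once, then arm a one-shot ITIMER_REAL timer `rounds`
 * times, waiting for each expiry before arming the next one.
 * Returns the number of SIGALRM deliveries seen, or -1 on error.
 */
int count_alarm_deliveries(int rounds, long usec)
{
    struct sigaction act;
    struct itimerval tv;
    sigset_t block, old, wait;
    int i;

    assert(rounds >= 0 && usec > 0);    /* a zero value would disarm the timer */

    memset(&act, 0, sizeof(act));
    act.sa_handler = on_alarm;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;                   /* note: no SA_RESETHAND */
    if (sigaction(SIGALRM, &act, NULL) < 0)
        return -1;

    /* Block SIGALRM so it can only be delivered inside sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;
    wait = old;
    sigdelset(&wait, SIGALRM);

    alarms_seen = 0;
    for (i = 0; i < rounds; i++)
    {
        int before = alarms_seen;

        memset(&tv, 0, sizeof(tv));     /* it_interval = 0: one-shot */
        tv.it_value.tv_sec = usec / 1000000;
        tv.it_value.tv_usec = usec % 1000000;
        if (setitimer(ITIMER_REAL, &tv, NULL) < 0)
        {
            sigprocmask(SIG_SETMASK, &old, NULL);
            return -1;
        }
        while (alarms_seen == before)
            sigsuspend(&wait);          /* atomically unblock and wait */
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    return alarms_seen;
}
```

Calling count_alarm_deliveries(2, 10000) from a small main() should return 2 on a system where the handler stays installed. If the handler were reset after the first delivery, the second SIGALRM would take its default action and terminate the process instead.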
Re: [HACKERS] Unixware 714 pthreads
[EMAIL PROTECTED] writes:
> if I do it a second time in the same session, block_me() never returns
> I wonder if handle_sig_alarm is re-armed after being used

No. Why should the signal handler need re-arming?

> or if sleep is used anywhere in the backend.

Nope.

			regards, tom lane
Re: [HACKERS] Unixware 714 pthreads
Tom Lane wrote:
> [EMAIL PROTECTED] writes:
> > if I do it a second time in the same session, block_me() never returns
> > I wonder if handle_sig_alarm is re-armed after being used
>
> No. Why should the signal handler need re-arming?
>
> > or if sleep is used anywhere in the backend.
>
> Nope.

Actually, I just noticed that postmaster/pgarch.c has a call to sleep() that needs to be removed ... I guess it snuck in after all that stuff was adjusted.

cheers

andrew
Re: [HACKERS] Unixware 714 pthreads
On Thu, 28 Oct 2004, Tom Lane wrote:

> Date: Thu, 28 Oct 2004 12:11:12 -0400
> From: Tom Lane [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: [HACKERS] Unixware 714 pthreads
>
> [EMAIL PROTECTED] writes:
> > if I do it a second time in the same session, block_me() never returns
> > I wonder if handle_sig_alarm is re-armed after being used
>
> No. Why should the signal handler need re-arming?

My impression was that once caught, the signal handler for a particular signal is reset to SIG_DFL.

> > or if sleep is used anywhere in the backend.
>
> Nope.
>
> 			regards, tom lane

Oh well, bye bye threads on Unixware, I give up! (very disappointed)

cheers
-- 
Olivier PRENANT                 Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou           +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                       +33-6-07-63-80-64 (GSM)
FRANCE                          Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)
Re: [HACKERS] Unixware 714 pthreads
[EMAIL PROTECTED] writes:
> On Thu, 28 Oct 2004, Tom Lane wrote:
> > No. Why should the signal handler need re-arming?
>
> My impression was that once caught, the signal handler for a particular signal is reset to SIG_DFL.

No. If your signal support is POSIX-compatible, it should not do that because we don't set SA_RESETHAND when calling sigaction(2). If you don't have POSIX signals, you had better have BSD-style signal(2), which doesn't reset either. If this is not happening as expected, you will have much worse problems than whether statement_timeout works :-(

			regards, tom lane
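The POSIX behaviour described in this reply can be observed with a few lines of C. This is a minimal sketch under the assumption of a POSIX-conforming sigaction(2); the names deliveries_without_resethand, on_usr1 and hits are hypothetical, and SIGUSR1 is used instead of SIGALRM so that no timer is involved. A handler installed with sa_flags = 0 should receive every delivery, not just the first.

```c
#define _XOPEN_SOURCE 700       /* request POSIX declarations */
#include <signal.h>
#include <string.h>

/* Hypothetical names, chosen for this illustration. */
static volatile sig_atomic_t hits = 0;

static void on_usr1(int sig)
{
    (void) sig;
    hits++;
}

/*
 * Install a handler WITHOUT SA_RESETHAND, raise SIGUSR1 n times and
 * report how many times the handler ran. Returns -1 on error, or if
 * the handler is no longer installed afterwards.
 */
int deliveries_without_resethand(int n)
{
    struct sigaction act, cur;
    int i;

    memset(&act, 0, sizeof(act));
    act.sa_handler = on_usr1;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;                   /* SA_RESETHAND deliberately not set */
    if (sigaction(SIGUSR1, &act, NULL) < 0)
        return -1;

    hits = 0;
    for (i = 0; i < n; i++)
    {
        /* raise() returns only after the handler has run. */
        if (raise(SIGUSR1) != 0)
            return -1;
    }

    /* Query the current disposition without changing it. */
    if (sigaction(SIGUSR1, NULL, &cur) < 0)
        return -1;
    if (cur.sa_handler != on_usr1)
        return -1;

    return hits;
}
```

On a conforming system deliveries_without_resethand(2) returns 2. A platform that reset the disposition after the first delivery would terminate the process on the second raise(), since the default action for SIGUSR1 is to terminate.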
Re: [HACKERS] Unixware 714 pthreads
I agree with all that you say, Tom. I'm just asking for some help to debug this. Now that Larry is a little off the list, I'm feeling really lonely on UW.

SCO won't do anything until I come up with a test program that fails. All my attempts have worked so far. I use other threaded programs like postfix or bind that never fail.

I'm really at a loss. Would you/someone help me?

Best regards

On Thu, 28 Oct 2004, Tom Lane wrote:

> Date: Thu, 28 Oct 2004 13:55:56 -0400
> From: Tom Lane [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: [HACKERS] Unixware 714 pthreads
>
> [EMAIL PROTECTED] writes:
> > On Thu, 28 Oct 2004, Tom Lane wrote:
> > > No. Why should the signal handler need re-arming?
> >
> > My impression was that once caught, the signal handler for a particular signal is reset to SIG_DFL.
>
> No. If your signal support is POSIX-compatible, it should not do that because we don't set SA_RESETHAND when calling sigaction(2). If you don't have POSIX signals, you had better have BSD-style signal(2), which doesn't reset either. If this is not happening as expected, you will have much worse problems than whether statement_timeout works :-(
>
> 			regards, tom lane

-- 
Olivier PRENANT                 Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou           +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                       +33-6-07-63-80-64 (GSM)
FRANCE                          Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)
Re: [HACKERS] Unixware 714 pthreads
Tom Lane wrote:
> [EMAIL PROTECTED] writes:
> > On Thu, 28 Oct 2004, Tom Lane wrote:
> > > No. Why should the signal handler need re-arming?
> >
> > My impression was that once caught, the signal handler for a particular signal is reset to SIG_DFL.
>
> No. If your signal support is POSIX-compatible, it should not do that because we don't set SA_RESETHAND when calling sigaction(2). If you don't have POSIX signals, you had better have BSD-style signal(2), which doesn't reset either. If this is not happening as expected, you will have much worse problems than whether statement_timeout works :-(

SysV-style signal(2) handling does indeed require that the signal handler be re-enabled. The attached program demonstrates this on Solaris, and probably on Unixware as well (I don't have access to the latter). Just run it and interrupt it with ctrl-c. It should print something the first time around, and actually be interrupted the second time.

So if Unixware doesn't have sigaction() or it's not being picked up by autoconf then yeah, he'll have big problems...

-- 
Kevin Brown	[EMAIL PROTECTED]

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

void sighandler(int sig)
{
    printf("Received signal %d\n", sig);
}

int main(int argc, char *argv[])
{
    /* With SysV signal() semantics the handler is reset after one delivery. */
    signal(SIGINT, sighandler);
    while (1) {
        sleep(1);
    }
}
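The one-shot behaviour attributed to SysV signal(2) above depends on the platform, so the attached program behaves differently from system to system. The same reset can be requested explicitly, and portably, through sigaction() with SA_RESETHAND. The sketch below assumes a POSIX system; the names disposition_is_default_after_one_delivery, on_usr2 and seen are invented for this example. It delivers one signal and then queries the disposition instead of sending a second signal, so the process survives to report the result.

```c
#define _XOPEN_SOURCE 700       /* request POSIX/XSI declarations */
#include <signal.h>
#include <string.h>

/* Hypothetical names, chosen for this illustration. */
static volatile sig_atomic_t seen = 0;

static void on_usr2(int sig)
{
    (void) sig;
    seen++;
}

/*
 * Install a handler WITH SA_RESETHAND (explicit one-shot semantics),
 * deliver one SIGUSR2, then check that the disposition is SIG_DFL again.
 * Returns 1 if the handler ran once and was reset, 0 if not, -1 on error.
 */
int disposition_is_default_after_one_delivery(void)
{
    struct sigaction act, cur;

    memset(&act, 0, sizeof(act));
    act.sa_handler = on_usr2;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_RESETHAND;        /* reset to SIG_DFL on handler entry */
    if (sigaction(SIGUSR2, &act, NULL) < 0)
        return -1;

    seen = 0;
    if (raise(SIGUSR2) != 0)            /* handler runs before raise() returns */
        return -1;

    /* Passing NULL as the new action only queries the current one. */
    if (sigaction(SIGUSR2, NULL, &cur) < 0)
        return -1;

    return seen == 1 && cur.sa_handler == SIG_DFL;
}
```

A return value of 1 shows the reset that a SysV-style signal() would perform implicitly. A second raise(SIGUSR2) at that point would take the default action and terminate the process, which matches the "interrupted the second time" outcome described for the ctrl-c test.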
Re: [HACKERS] Unixware 714 pthreads
[EMAIL PROTECTED] wrote:
> I agree with all that you say, Tom. I'm just asking for some help to debug this. Now that Larry is a little off the list, I'm feeling really lonely on UW.
>
> SCO won't do anything until I come up with a test program that fails. All my attempts have worked so far. I use other threaded programs like postfix or bind that never fail.
>
> I'm really at a loss. Would you/someone help me?

The problem is that we are then spending our time debugging Unixware problems rather than focusing on our database software. I think this is why few have offered assistance.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Re: [HACKERS] Unixware 714 pthreads
Dear Bruce,

Thanks for your reply; I was desperate that I didn't get one!

As I said, I'm quite sure there is a bug in the pthread library. Before saying this to SCO, I have to prove it. PostgreSQL is the way to prove it!

What I need is to know where to start from (I'd like to put elogs where statement_timeout is processed to see what really happens and why it doesn't cancel the query).

Could someone tell me where to look? If anyone is interested in debugging this issue with me, I can set up an account on a test Unixware machine.

TIA

On Tue, 26 Oct 2004, Bruce Momjian wrote:

> Date: Tue, 26 Oct 2004 17:59:17 -0400 (EDT)
> From: Bruce Momjian [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: [HACKERS] Unixware 714 pthreads
>
> The only help I can be is that on Unixware (only) the backend is compiled with threading enabled. This might be showing some thread bugs.
>
> ---
>
> [EMAIL PROTECTED] wrote:
> > Hi everyone,
> >
> > I need help debugging a problem I have on Unixware 714 with beta3.
> >
> > PostgreSQL's make check hangs on the plpgsql test when compiled with --enable-thread-safety.
> >
> > It hangs on select block_me();
> >
> > This select should be canceled by the set statement_timeout=1000; instead, the backend is 100% CPU bound and only kill -9 can kill it.
> >
> > It works OK when compiled without --enable-thread-safety.
> >
> > I've tried almost everything I could think of, but not knowing much about threads and the PG source code, I'm asking for someone to help me find a way to debug this.
> >
> > It worked up until beta2, but I'm not sure block_me() was there.
> >
> > I really need someone to tell me where to begin.
> >
> > TIA

-- 
Olivier PRENANT                 Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou           +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                       +33-6-07-63-80-64 (GSM)
FRANCE                          Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)
Re: [HACKERS] Unixware 714 pthreads
[EMAIL PROTECTED] wrote:
> Dear Bruce,
>
> Thanks for your reply; I was desperate that I didn't get one!
>
> As I said, I'm quite sure there is a bug in the pthread library. Before saying this to SCO, I have to prove it. PostgreSQL is the way to prove it!
>
> What I need is to know where to start from (I'd like to put elogs where statement_timeout is processed to see what really happens and why it doesn't cancel the query).
>
> Could someone tell me where to look? If anyone is interested in debugging this issue with me, I can set up an account on a test Unixware machine.

My guess is that there is some problem with delivering alarm signals because that is how the timeout code works.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Re: [HACKERS] Unixware 714 pthreads
On Wed, 27 Oct 2004, Bruce Momjian wrote:

> Date: Wed, 27 Oct 2004 14:53:26 -0400 (EDT)
> From: Bruce Momjian [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: [HACKERS] Unixware 714 pthreads
>
> [EMAIL PROTECTED] wrote:
> > Dear Bruce,
> >
> > Thanks for your reply; I was desperate that I didn't get one!
> >
> > As I said, I'm quite sure there is a bug in the pthread library. Before saying this to SCO, I have to prove it. PostgreSQL is the way to prove it!
> >
> > What I need is to know where to start from (I'd like to put elogs where statement_timeout is processed to see what really happens and why it doesn't cancel the query).
> >
> > Could someone tell me where to look? If anyone is interested in debugging this issue with me, I can set up an account on a test Unixware machine.
>
> My guess is that there is some problem with delivering alarm signals because that is how the timeout code works.

That's my guess too. I've tracked that to src/backend/storage/lmgr/proc.c, where kill is called. The Unixware doc says that a kill sent to the process's own ID delivers the signal to the thread that called it. For some reason, this backend has 2 threads (I can't figure out why), and IMHO kill should be pthread_kill. I wanted to try that but found no way to get the other thread's ID.

I need the help of a PostgreSQL/thread guru here.

Many thanks
-- 
Olivier PRENANT                 Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou           +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                       +33-6-07-63-80-64 (GSM)
FRANCE                          Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)
Re: [HACKERS] Unixware 714 pthreads
The only help I can be is that on Unixware (only) the backend is compiled with threading enabled. This might be showing some thread bugs.

---

[EMAIL PROTECTED] wrote:
> Hi everyone,
>
> I need help debugging a problem I have on Unixware 714 with beta3.
>
> PostgreSQL's make check hangs on the plpgsql test when compiled with --enable-thread-safety.
>
> It hangs on select block_me();
>
> This select should be canceled by the set statement_timeout=1000; instead, the backend is 100% CPU bound and only kill -9 can kill it.
>
> It works OK when compiled without --enable-thread-safety.
>
> I've tried almost everything I could think of, but not knowing much about threads and the PG source code, I'm asking for someone to help me find a way to debug this.
>
> It worked up until beta2, but I'm not sure block_me() was there.
>
> I really need someone to tell me where to begin.
>
> TIA

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
[HACKERS] Unixware 714 pthreads
Hi everyone,

I need help debugging a problem I have on Unixware 714 with beta3.

PostgreSQL's make check hangs on the plpgsql test when compiled with --enable-thread-safety.

It hangs on select block_me();

This select should be canceled by the set statement_timeout=1000; instead, the backend is 100% CPU bound and only kill -9 can kill it.

It works OK when compiled without --enable-thread-safety.

I've tried almost everything I could think of, but not knowing much about threads and the PG source code, I'm asking for someone to help me find a way to debug this.

It worked up until beta2, but I'm not sure block_me() was there.

I really need someone to tell me where to begin.

TIA
-- 
Olivier PRENANT                 Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou           +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                       +33-6-07-63-80-64 (GSM)
FRANCE                          Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)