That's awesome, thank you. I will download the latest CVS version and try it out.
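
A minimal sketch of the behaviour described in the quoted message below: when no idle limit applies, select(2) is given a NULL timeout and blocks with no time limit; when a limit applies, it is given a timeout and returns 0 once that time passes with nothing to read. This is not the actual pool_process_query.c code. The names idle_config, effective_idle_limit, wait_readable and tick_usec are hypothetical and exist only for this illustration; only the two setting names come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical stand-in for the two settings named in the thread. */
struct idle_config {
    int client_idle_limit;             /* seconds; 0 means no limit */
    int client_idle_limit_in_recovery; /* seconds; 0 means no limit */
};

/* Return the idle limit that applies in the current state, or 0 if none. */
static int effective_idle_limit(const struct idle_config *cfg, int in_recovery)
{
    int limit = in_recovery ? cfg->client_idle_limit_in_recovery
                            : cfg->client_idle_limit;
    return limit > 0 ? limit : 0;
}

/*
 * Wait until fd is readable.
 * Returns select()'s result: >0 if readable, 0 on timeout, -1 on error.
 * If no idle limit applies, the timeout pointer stays NULL and the call
 * blocks until data arrives.
 */
static int wait_readable(int fd, const struct idle_config *cfg,
                         int in_recovery, long tick_usec)
{
    fd_set readmask;
    struct timeval timeout;
    struct timeval *timeout_ptr = NULL; /* NULL: wait with no time limit */

    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&readmask);
    FD_SET(fd, &readmask);

    if (effective_idle_limit(cfg, in_recovery) > 0) {
        /* A limit applies, so wake up after one tick even if idle. */
        timeout.tv_sec = tick_usec / 1000000;
        timeout.tv_usec = tick_usec % 1000000;
        timeout_ptr = &timeout;
    }

    return select(fd + 1, &readmask, NULL, NULL, timeout_ptr);
}
```

With client_idle_limit = 0 and client_idle_limit_in_recovery = 7 (the values printed in gdb below), a call with in_recovery = 1 on an idle descriptor returns 0 after one tick, which is the case an "fds == 0" idle check can act on. A call that passes NULL never returns 0, so such a check is never reached while the client stays idle.
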

On Dec 4, 2008, at 20:39, Tatsuo Ishii <[EMAIL PROTECTED]> wrote:

> Hi Marcelo,
>
> With your help, I was able to find the problem.
> If you connect to pgpool *before* starting recovery, the timeout
> parameter to select(2) is set to NULL, which means it will wait
> forever. I have modified pool_process_query.c so that it will set
> timeout whenever client_idle_limit_in_recovery > 0. Please grab the
> CVS Head and try it out.
>
> Thanks for your great help!
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
>
>> Hi Tatsuo,
>>
>> I have also checked the value for fds
>>
>> This "if (((*InRecovery == 0 && pool_config->client_idle_limit > 0) ||
>> (*InRecovery && pool_config->client_idle_limit_in_recovery > 0)) &&
>> fds == 0)" never becomes true unless I run some query inside the psql
>> connection that I created on step 1.
>>
>> 1) connect to pgpool with psql
>>
>> 2) run pcp_recovery_node
>>
>> 3) When the log shows that it is stuck at the start of stage 2,
>> attach gdb to the pgpool PID from step 1
>>
>> - 1st GDB backtrace -
>>
>> - on frame 2 - in pool_process_query (frontend=0x8128c18,
>> backend=0x8128a10, connection_reuse=0,
>> first_ready_for_query_received=0) at pool_process_query.c:363
>> 363 fds = select(num_fds,
>> &readmask, &writemask, &exceptmask, &timeout);
>>
>> (gdb) frame 2
>> #2 0x0805a499 in pool_process_query (frontend=0x8128c18,
>> backend=0x8128a10, connection_reuse=0,
>> first_ready_for_query_received=0) at pool_process_query.c:365
>> 365 fds = select(num_fds,
>> &readmask, &writemask, &exceptmask, NULL);
>> (gdb) print *InRecovery
>> $1 = 1
>> (gdb) print pool_config->client_idle_limit
>> $2 = 0
>> (gdb) print pool_config->client_idle_limit_in_recovery
>> $3 = 7
>> (gdb) print fds
>> $4 = 135432720
>>
>>
>> 4) List databases inside the psql connection created on step 1 ("\l")
>>
>> 5) Detach gdb from the PID and attach it back to let "\l" run
>>
>> - 2nd GDB backtrace -
>>
>> - on frame 2 - in pool_process_query (frontend=0x8128c18,
>> backend=0x8128a10,
>> connection_reuse=0,
>> first_ready_for_query_received=0) at pool_process_query.c:363
>> 363 fds = select(num_fds,
>> &readmask, &writemask, &exceptmask, &timeout);
>>
>> (gdb) bt
>> #0 0xb7f69402 in ?? ()
>> #1 0xb7e810fd in select () from /lib/tls/i686/cmov/libc.so.6
>> #2 0x0805a463 in pool_process_query (frontend=0x8128c18,
>> backend=0x8128a10, connection_reuse=0,
>> first_ready_for_query_received=0) at pool_process_query.c:363
>> #3 0x0804f03e in do_child (unix_fd=3, inet_fd=4) at child.c:428
>> #4 0x0804bc21 in fork_a_child (unix_fd=3, inet_fd=4, id=3) at main.c:814
>> #5 0x0804d1e8 in failover () at main.c:1328
>> #6 0x0804b16b in main (argc=7, argv=0xbfef7c64) at main.c:519
>> (gdb) frame 2
>> #2 0x0805a463 in pool_process_query (frontend=0x8128c18,
>> backend=0x8128a10, connection_reuse=0,
>> first_ready_for_query_received=0) at pool_process_query.c:363
>> 363 fds = select(num_fds,
>> &readmask, &writemask, &exceptmask, &timeout);
>> (gdb) print *InRecovery
>> $1 = 1
>> (gdb) print pool_config->client_idle_limit
>> $2 = 0
>> (gdb) print pool_config->client_idle_limit_in_recovery
>> $3 = 7
>> (gdb) print fds
>> $4 = 0
>>
>>
>> Once I attach back to the process I'm able to see a line in the pgpool
>> LOG file as shown below
>>
>> Dec 4 09:41:58 debian-db6 pgpool: 2008-12-04 09:41:58 DEBUG: pid 24697: idle count:1 InRecovery:0 client_idle_limit:7 client_idle_limit_in_recovery:-1074827064
>>
>> Then I let gdb continue the process and recovery proceeds, since the
>> if statement is now able to evaluate to true
>>
>>
>> Hope that helps
>>
>> If you want to see this happening let me know and I can set up some
>> VMs and then provide you with access to them
>>
>> -
>> Marcelo
>>
>>
>> On Dec 4, 2008, at 4:17 AM, Tatsuo Ishii wrote:
>>
>>> Thanks!
>>>
>>> Can you please print the value of:
>>>
>>> *InRecovery
>>> *pool_config
>>>
>>> at frame #2?
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>>
>>>> Hi Tatsuo,
>>>>
>>>> sorry for the delay here.
>>>> I was able to compile the CVS version now, with no problems with
>>>> regard to bison, thanks.
>>>>
>>>> I have also placed this back on the list
>>>>>
>>>>> Thanks. What I want to know is the following:
>>>>>
>>>>> 1) connect to pgpool-II using psql
>>>>
>>>> Ok, connected to pgpool through psql
>>>>>
>>>>> 2) start recovery
>>>>
>>>> Ok, ./pcp_recovery_node 100 localhost 9898 nastpcp nastpcp 1
>>>>
>>>>>
>>>>> 3) pgpool-II gets stuck at the beginning of 2nd stage (this is
>>>>> what I couldn't reproduce)
>>>>
>>>> Ok, got stuck
>>>>
>>>>>
>>>>> 4) attach gdb to the pgpool-II child process which psql connected
>>>>> to at 1)
>>>>
>>>> Ok, gdb pgpool PID
>>>>
>>>>>
>>>>> 5) get a backtrace to know where pgpool-II is stuck
>>>>>
>>>>
>>>> Attaching to process 23712
>>>> Reading symbols from /opt/pgpool-cvs.1.117/bin/pgpool...done.
>>>> Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
>>>> Reading symbols from /usr/lib/libpq.so.5...done.
>>>> Loaded symbols for /usr/lib/libpq.so.5
>>>> Reading symbols from /opt/pgpool-cvs.1.117/lib/libpcp.so.0...done.
>>>> Loaded symbols for /opt/pgpool-cvs.1.117/lib/libpcp.so.0
>>>> Reading symbols from /lib/tls/i686/cmov/libresolv.so.2...done.
>>>> Loaded symbols for /lib/tls/i686/cmov/libresolv.so.2
>>>> Reading symbols from /lib/tls/i686/cmov/libnsl.so.1...done.
>>>> Loaded symbols for /lib/tls/i686/cmov/libnsl.so.1
>>>> Reading symbols from /lib/tls/i686/cmov/libm.so.6...done.
>>>> Loaded symbols for /lib/tls/i686/cmov/libm.so.6
>>>> Reading symbols from /lib/tls/i686/cmov/libc.so.6...done.
>>>> Loaded symbols for /lib/tls/i686/cmov/libc.so.6
>>>> Reading symbols from /lib/tls/i686/cmov/libcrypt.so.1...done.
>>>> Loaded symbols for /lib/tls/i686/cmov/libcrypt.so.1
>>>> Reading symbols from /usr/lib/i686/cmov/libssl.so.0.9.8...done.
>>>> Loaded symbols for /usr/lib/i686/cmov/libssl.so.0.9.8
>>>> Reading symbols from /usr/lib/i686/cmov/libcrypto.so.0.9.8...done.
>>>> Loaded symbols for /usr/lib/i686/cmov/libcrypto.so.0.9.8
>>>> Reading symbols from /usr/lib/libkrb5.so.3...done.
>>>> Loaded symbols for /usr/lib/libkrb5.so.3
>>>> Reading symbols from /lib/libcom_err.so.2...done.
>>>> Loaded symbols for /lib/libcom_err.so.2
>>>> Reading symbols from /usr/lib/libgssapi_krb5.so.2...done.
>>>> Loaded symbols for /usr/lib/libgssapi_krb5.so.2
>>>> Reading symbols from /usr/lib/libldap_r.so.2...done.
>>>> Loaded symbols for /usr/lib/libldap_r.so.2
>>>> Reading symbols from /lib/tls/i686/cmov/libpthread.so.0...done.
>>>> [Thread debugging using libthread_db enabled]
>>>> [New Thread -1214495040 (LWP 23712)]
>>>> Loaded symbols for /lib/tls/i686/cmov/libpthread.so.0
>>>> Reading symbols from /lib/ld-linux.so.2...done.
>>>> Loaded symbols for /lib/ld-linux.so.2
>>>> Reading symbols from /lib/tls/i686/cmov/libdl.so.2...done.
>>>> Loaded symbols for /lib/tls/i686/cmov/libdl.so.2
>>>> Reading symbols from /usr/lib/libz.so.1...done.
>>>> Loaded symbols for /usr/lib/libz.so.1
>>>> Reading symbols from /usr/lib/libk5crypto.so.3...done.
>>>> Loaded symbols for /usr/lib/libk5crypto.so.3
>>>> Reading symbols from /usr/lib/libkrb5support.so.0...done.
>>>> Loaded symbols for /usr/lib/libkrb5support.so.0
>>>> Reading symbols from /usr/lib/liblber.so.2...done.
>>>> root 5312 6 0 Dec03 ? 00:00:00 [pdflush]
>>>> Loaded symbols for /usr/lib/liblber.so.2
>>>> Reading symbols from /usr/lib/libsasl2.so.2...done.
>>>> Loaded symbols for /usr/lib/libsasl2.so.2
>>>> Reading symbols from /usr/lib/libgnutls.so.13...done.
>>>> Loaded symbols for /usr/lib/libgnutls.so.13
>>>> Reading symbols from /usr/lib/libtasn1.so.3...done.
>>>> Loaded symbols for /usr/lib/libtasn1.so.3
>>>> Reading symbols from /usr/lib/libgcrypt.so.11...done.
>>>> Loaded symbols for /usr/lib/libgcrypt.so.11
>>>> Reading symbols from /usr/lib/libgpg-error.so.0...done.
>>>> Loaded symbols for /usr/lib/libgpg-error.so.0
>>>> Reading symbols from /lib/tls/i686/cmov/libnss_files.so.2...done.
>>>> Loaded symbols for /lib/tls/i686/cmov/libnss_files.so.2
>>>> Failed to read a valid object file image from memory.
>>>> 0xb7f39402 in ?? ()
>>>>
>>>> (gdb) bt
>>>> #0 0xb7f39402 in ?? ()
>>>> #1 0xb7e510fd in select () from /lib/tls/i686/cmov/libc.so.6
>>>> #2 0x0805a499 in pool_process_query (frontend=0x8128c18,
>>>> backend=0x8128a10, connection_reuse=0,
>>>> first_ready_for_query_received=0)
>>>> at pool_process_query.c:365
>>>> #3 0x0804f03e in do_child (unix_fd=3, inet_fd=4) at child.c:428
>>>> #4 0x0804bc21 in fork_a_child (unix_fd=3, inet_fd=4, id=2) at main.c:814
>>>> #5 0x0804d1e8 in failover () at main.c:1328
>>>> #6 0x0804b16b in main (argc=7, argv=0xbff10594) at main.c:519
>>>>

_______________________________________________
Pgpool-general mailing list
[email protected]
http://pgfoundry.org/mailman/listinfo/pgpool-general
