Stefan Kaltenbrunner wrote:
> Tom Lane wrote:
>> Stefan Kaltenbrunner <[EMAIL PROTECTED]> writes:
>>> one of my new buildfarm boxes (a Debian/Etch based ARM box) is
>>> sometimes failing to stop the database during the regression tests:
>>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=quagga&dt=2007-01-08%2003:03:03
>>> this only seems to happen sometimes and only if --with-tcl is enabled on
>>> quagga.
>>> lionfish (my mipsel box) is able to trigger that on every build if I
>>> enable --with-tcl but it is nearly impossible to debug it there because
>>> of the low amount of memory and diskspace it has.
>> Hm, could pl/tcl somehow be preventing the backend from exiting once
>> it's run any pl/tcl stuff?  I have no idea why though, and even less
>> why it wouldn't be repeatable. 
>>
>>> After the stopdb failure we still have those processes running:
>>> pgbuild   3488  0.0  2.4  43640  6300 ?        Ss   06:15   0:01 postgres: pgbuild pl_regression [local] idle
>> Can you get a stack trace from this process?
> 
> (gdb) bt
> #0  0x406b9d80 in __pthread_sigsuspend () from /lib/libpthread.so.0
> #1  0x406b8a7c in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
> #2  0x406b91f8 in pthread_onexit_process () from /lib/libpthread.so.0
> #3  0x40438658 in exit () from /lib/libc.so.6
> #4  0x40438658 in exit () from /lib/libc.so.6
> Previous frame identical to this frame (corrupt stack?)
> 
> 
> 
>>> pgbuild   3489  0.0  0.0      0     0 ?        Z    06:15   0:00 [postgres] <defunct>
>> This is a bit odd ... if that process is a direct child of the
>> postmaster it should have been reaped promptly.  Could it be a child
>> of the other backend?  If so, why was it started?  Please try the
>> ps again with whatever switch it needs to list parent process ID.
> 
> looks like you are right - the defunct 3489 seems to be a child of 3488:
> 
>  PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
>     1  3389 18341 18341 ?           -1 S     1001   0:03 /home/pgbuild/pgbuildfarm/HEAD/inst/bin/postgres -D data
>  3389  3391  3391  3391 ?           -1 Ss    1001   0:00 postgres: writer process
>  3389  3392  3392  3392 ?           -1 Ss    1001   0:00 postgres: stats collector process
>  3389  3488  3488  3488 ?           -1 Ss    1001   0:01 postgres: pgbuild pl_regression [local] idle
>  3488  3489  3488  3488 ?           -1 Z     1001   0:00 [postgres] <defunct>

FWIW - I removed --with-tcl from quagga's configuration about two weeks
ago and it has not failed (for that reason) again. So the issue most
definitely looks pl/tcl related ...
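
For anyone who wants to check another box for the same symptom, the
parent/child lookup done by hand above can be sketched in a few lines of
Python. This is only an illustration, not part of the buildfarm client:
the function name find_zombies is made up, and it assumes ps output with
a header naming the PPID, PID and STAT columns (the layout quoted above,
which on Linux is probably what "ps axj" prints).

```python
def find_zombies(ps_text):
    """Return (pid, ppid) pairs for processes whose STAT starts with 'Z'.

    Assumption: the first non-empty line is a header containing the
    PPID, PID and STAT column names, as in the listing quoted above.
    """
    lines = [ln for ln in ps_text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    i_ppid, i_pid, i_stat = (header.index(c) for c in ("PPID", "PID", "STAT"))
    needed = max(i_ppid, i_pid, i_stat)
    zombies = []
    for ln in lines[1:]:
        fields = ln.split()
        # Skip mail-wrapped continuation lines, which lack the numeric columns.
        if len(fields) <= needed:
            continue
        if not (fields[i_pid].isdigit() and fields[i_ppid].isdigit()):
            continue
        if fields[i_stat].startswith("Z"):
            zombies.append((int(fields[i_pid]), int(fields[i_ppid])))
    return zombies


# Shortened copy of the listing from this thread.
sample = """\
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
    1  3389 18341 18341 ?           -1 S     1001   0:03 postgres -D data
 3389  3488  3488  3488 ?           -1 Ss    1001   0:01 postgres: pgbuild pl_regression [local] idle
 3488  3489  3488  3488 ?           -1 Z     1001   0:00 [postgres] <defunct>
"""

print(find_zombies(sample))  # [(3489, 3488)]
```

A zombie whose parent is a backend rather than the postmaster (3389 here)
is the pattern seen in this failure.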


Stefan
