Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-20 Thread Alvaro Herrera
Tom Lane wrote: > Anyway, it happens consistently on my HP box. I find that your proposed > patch fixes it, but makes the "normal" path crash :-( --- the loop in > do_autovacuum has to be executed in AutovacMemCxt, because it creates an > Oid List that gets passed to vacuum() and had better not b

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-17 Thread Tom Lane
"Justin Pasher" writes: >> From: Tom Lane [mailto:t...@sss.pgh.pa.us] >> Anyway, it happens consistently on my HP box. I find that your proposed >> patch fixes it, but makes the "normal" path crash :-( --- the loop in >> do_autovacuum has to be executed in AutovacMemCxt, because it creates an >>

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-17 Thread Justin Pasher
> -----Original Message-----
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> Sent: Saturday, January 17, 2009 9:50 AM
> To: Alvaro Herrera
> Cc: Justin Pasher; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Autovacuum daemon terminated by signal 11
>
> Alvaro Herr

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-17 Thread Tom Lane
Alvaro Herrera writes: > Hmm, in retrospect this is pretty obviously buggy. I can't say that > it's that easy for me to reproduce it though; I definitely can't make it > crash. Maybe by sheer luck, the new TopTransactionContext pointer > points to the same memory area that the old was stored in.

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-17 Thread Martijn van Oosterhout
On Fri, Jan 16, 2009 at 05:13:17PM -0600, Justin Pasher wrote: > Dang it. I wonder why the --enable-debug option doesn't seem to actually > be enabling debug. :( For reference, here is the configure command that > the package uses according to the config.log (in case you spot anything > wrong).

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-16 Thread Alvaro Herrera
Tom Lane wrote:
> What is happening is that autovacuum_do_vac_analyze contains
>
>     old_cxt = MemoryContextSwitchTo(AutovacMemCxt);
>     ...
>     vacuum(vacstmt, relids);
>     ...
>     MemoryContextSwitchTo(old_cxt);
>
> and at the time it is called by process_whole_db, CurrentM
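For readers following the quoted fragment, here is a toy model in C of the save/switch/restore pattern. It is only a sketch and makes an assumption the truncated message does not spell out here: that the context saved in old_cxt can be destroyed while the call in the middle runs, so the final switch restores a stale pointer. ToyContext, toy_switch_to, toy_end_transaction, toy_do_vac_analyze and toy_run are hypothetical names invented for illustration, not PostgreSQL APIs, and the "alive" flag stands in for real memory being freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memory context; NOT PostgreSQL's real structure. */
typedef struct ToyContext {
    const char *name;
    bool        alive;   /* set to false once the context is "deleted" */
} ToyContext;

static ToyContext *current_context = NULL;

/* Same shape as MemoryContextSwitchTo: install new_cxt, return the old one. */
ToyContext *toy_switch_to(ToyContext *new_cxt)
{
    assert(new_cxt != NULL);
    ToyContext *old = current_context;
    current_context = new_cxt;
    return old;
}

/* Hypothetical stand-in for work that ends a transaction and, with it,
 * the transaction-lifetime context (assumption for this sketch). */
void toy_end_transaction(ToyContext *xact_cxt)
{
    xact_cxt->alive = false;
}

/* The pattern from the quoted code. Returns true if the context that is
 * current afterwards is still alive, false if it is a stale pointer. */
bool toy_do_vac_analyze(ToyContext *autovac_cxt, ToyContext *xact_cxt)
{
    ToyContext *old_cxt = toy_switch_to(autovac_cxt);
    toy_end_transaction(xact_cxt);   /* stands in for the call in the middle */
    toy_switch_to(old_cxt);          /* restores whatever was saved */
    return current_context->alive;
}

/* Run the pattern with the caller starting in either a long-lived
 * context or the transaction-lifetime one. */
bool toy_run(bool caller_in_xact_cxt)
{
    ToyContext autovac = { "AutovacMemCxt", true };
    ToyContext xact    = { "transaction context", true };

    current_context = caller_in_xact_cxt ? &xact : &autovac;
    bool ok = toy_do_vac_analyze(&autovac, &xact);
    current_context = NULL;          /* do not keep pointers to locals */
    return ok;
}
```

In this model toy_run(false) leaves a live context current, while toy_run(true) ends with the current context pointing at something already torn down. In a real server the next allocation through such a pointer could fail in an allocator routine, which would be consistent with the MemoryContextAlloc frames in the backtraces posted earlier in the thread.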

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-16 Thread Tom Lane
I wrote: > ... and you've seemingly not managed to install the debug symbols where > gdb can find them. But never mind that --- it turns out to be trivial to reproduce the crash. Just create a database, set its datfrozenxid and datvacuumxid far in the past (via a manual update of pg_database), en

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-16 Thread Justin Pasher
Tom Lane wrote:
#1 0xb7c37811 in raise () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7c38fb9 in abort () from /lib/tls/i686/cmov/libc.so.6
#3 0x0828cdf3 in ExceptionalCondition ()
#4 0x082a8cd2 in MemoryContextAlloc ()
#5 0x082a8d67 in MemoryContextStrdup ()
#6 0x0829749c in database_getflat

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-16 Thread Alvaro Herrera
Justin Pasher wrote: > Dang it. I wonder why the --enable-debug option doesn't seem to actually > be enabling debug. :( For reference, here is the configure command that > the package uses according to the config.log (in case you spot anything > wrong). Maybe the executable is getting strip

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-16 Thread Justin Pasher
Tom Lane wrote: Justin Pasher writes: I recompiled from the Debian source package and added --enable-cassert (--enable-debug was already there). I replaced the Debian standard packages with the recompiled versions and started up the cluster. Now it is hitting a failure on one of the assert

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-16 Thread Tom Lane
Justin Pasher writes: > I recompiled from the Debian source package and added --enable-cassert > (--enable-debug was already there). I replaced the Debian standard > packages with the recompiled versions and started up the cluster. Now it > is hitting a failure on one of the assert lines, and t

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-16 Thread Justin Pasher
Tom Lane wrote:
I read it like this:
#0 0x0827441d in MemoryContextAlloc () <-- real
#1 0x08274467 in MemoryContextStrdup () <-- real
#2 0x0826501c in database_getflatfilename () <-- real
#3 0x0826504e in database_getflatfilename () <-- must be write_database_file
#4 0x08

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-15 Thread Justin Pasher
Tom Lane wrote:
I read it like this:
#0 0x0827441d in MemoryContextAlloc () <-- real
#1 0x08274467 in MemoryContextStrdup () <-- real
#2 0x0826501c in database_getflatfilename () <-- real
#3 0x0826504e in database_getflatfilename () <-- must be write_database_file
#4 0x08

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-15 Thread Tom Lane
Alvaro Herrera writes: > Tom Lane wrote: >> Hmm. This isn't very trustworthy for lack of debug symbols (what we're >> probably looking at are the nearest global function names before the >> actual locations). > The lack of debug symbols makes this all mere guesses though. The > backtrace did no

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-15 Thread Alvaro Herrera
Tom Lane wrote: > Hmm. This isn't very trustworthy for lack of debug symbols (what we're > probably looking at are the nearest global function names before the > actual locations). However, it strongly suggests that something is > broken in the active memory context, and the most likely explanat

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-15 Thread Tom Lane
Justin Pasher writes:
> Program terminated with signal 11, Segmentation fault.
> #0 0x0827441d in MemoryContextAlloc ()
> (gdb) bt
> #0 0x0827441d in MemoryContextAlloc ()
> #1 0x08274467 in MemoryContextStrdup ()
> #2 0x0826501c in database_getflatfilename ()
> #3 0x0826504e in database_getf

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-15 Thread Justin Pasher
Tom Lane wrote: Having debug symbols would be more useful, but unless the binary is totally stripped, a backtrace might provide enough info without that. Try it and see if you get any function names in the trace, or only numbers. (BTW, does Debian have anything comparable to Red Hat's debuginfo

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-15 Thread Tom Lane
Justin Pasher writes: > I'll let you know when I get a chance to get a core dump from the > process. I assume I will need a version of Postgres built with debug > symbols for it to be useful? I'm not seeing one in the standard Debian > repositories, so I might have to compile from source. Havi

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-15 Thread Justin Pasher
Tom Lane wrote: Justin Pasher writes: Richard Huxton wrote: Segmentation fault - probably a bug or bad RAM. It's a relatively new machine, but that's obviously a possibility with any hardware. I haven't seen any other programs experiencing problems on the box, but the Postgres

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-15 Thread Tom Lane
Justin Pasher writes: > Richard Huxton wrote: >> Segmentation fault - probably a bug or bad RAM. > It's a relatively new machine, but that's obviously a possibility with > any hardware. I haven't seen any other programs experiencing problems on > the box, but the Postgres daemon is the one that

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-15 Thread Justin Pasher
Richard Huxton wrote: Justin Pasher wrote: Hello, I have a server running PostgreSQL 8.1.15-0etch1 (Debian etch) that was recently put into production. Last week a developer started having a problem with his psql connection being terminated every couple of minutes when he was running a query

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-15 Thread Alvaro Herrera
>> Justin Pasher wrote: > Are there any internal Postgres tables I can look at that may shed some > light on this? Any particular maintenance commands that could be run for > repair? Please obtain a backtrace from the core file. If there's no core file, please set "ulimit -c unlimited" in th
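As background to the "ulimit -c unlimited" advice: that shell builtin adjusts the RLIMIT_CORE resource limit, which child processes inherit. The C sketch below shows the same limit through the POSIX getrlimit/setrlimit calls, in case it helps to check whether core files are permitted at all. The function names raise_core_limit and core_dumps_enabled are hypothetical, this is not part of PostgreSQL, and it raises the soft limit only as far as the existing hard limit, which may or may not be unlimited on a given system.

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit for this process.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;

    rl.rlim_cur = rl.rlim_max;   /* soft limit may go up to the hard limit */
    return setrlimit(RLIMIT_CORE, &rl);
}

/* Returns 1 if the current soft limit allows a core file of nonzero size,
 * 0 if core files are disabled, -1 if the limit could not be read. */
int core_dumps_enabled(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;

    return rl.rlim_cur != 0;
}
```

A soft limit of zero is the usual reason no core file appears after a signal 11, so checking core_dumps_enabled() in the environment that starts the server is one way to confirm the setting took effect.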

Re: [GENERAL] Autovacuum daemon terminated by signal 11

2009-01-15 Thread Richard Huxton
Justin Pasher wrote: > Hello, > > I have a server running PostgreSQL 8.1.15-0etch1 (Debian etch) that was > recently put into production. Last week a developer started having a problem > with his psql connection being terminated every couple of minutes when he > was running a query. When I look th

[GENERAL] Autovacuum daemon terminated by signal 11

2009-01-14 Thread Justin Pasher
Hello, I have a server running PostgreSQL 8.1.15-0etch1 (Debian etch) that was recently put into production. Last week a developer started having a problem with his psql connection being terminated every couple of minutes when he was running a query. When I look through the logs, I noticed this me