Re: [GENERAL] "show all" command crashes server * FIXED *

Grant Maxwell Sun, 13 Sep 2009 16:33:19 -0700

First of all thanks to those who provided input.

This problem is now fixed and I thought I would post this solution so that others might benefit in the future.


For the sake of completeness:

The error was that if "show all" was run on this postgresql (version 8.3) server, postgres would crash and then recover.

        Otherwise the server "seemed" healthy

        The postgres log showed:

Sep 10 23:55:36 theconsole postgres[31118]: [4-1] 0: LOG: 00000: server process (PID 31145) was terminated by signal 11: Segmentation fault
Sep 10 23:55:36 theconsole postgres[31118]: [4-2] 0: LOCATION: LogChildExit, postmaster.c:2529
Sep 10 23:55:36 theconsole postgres[31118]: [5-1] 0: LOG: 00000: terminating any other active server processes
Sep 10 23:55:36 theconsole postgres[31118]: [5-2] 0: LOCATION: HandleChildCrash, postmaster.c:2374
Sep 10 23:55:36 theconsole postgres[31118]: [6-1] 0: LOG: 00000: all server processes terminated; reinitializing
Sep 10 23:55:36 theconsole postgres[31118]: [6-2] 0: LOCATION: PostmasterStateMachine, postmaster.c:2690
Sep 10 23:55:36 theconsole postgres[31146]: [7-1] 0: LOG: 00000: database system was interrupted; last known up at 2009-09-10 23:55:14 EST
Sep 10 23:55:36 theconsole postgres[31146]: [7-2] 0: LOCATION: StartupXLOG, xlog.c:4836
Sep 10 23:55:36 theconsole postgres[31147]: [7-1] [local] postgres postgres 0: FATAL: 57P03: the database system is in recovery mode
Sep 10 23:55:36 theconsole postgres[31147]: [7-2] [local] postgres postgres 0: LOCATION: ProcessStartupPacket, postmaster.c:1648
Sep 10 23:55:36 theconsole postgres[31146]: [8-1] 0: LOG: 00000: database system was not properly shut down; automatic recovery in progress
Sep 10 23:55:36 theconsole postgres[31146]: [8-2] 0: LOCATION: StartupXLOG, xlog.c:5003
Sep 10 23:55:36 theconsole postgres[31146]: [9-1] 0: LOG: 00000: record with zero length at 2A/E734761C
Sep 10 23:55:36 theconsole postgres[31146]: [9-2] 0: LOCATION: ReadRecord, xlog.c:3126
Sep 10 23:55:36 theconsole postgres[31146]: [10-1] 0: LOG: 00000: redo is not required
Sep 10 23:55:36 theconsole postgres[31146]: [10-2] 0: LOCATION: StartupXLOG, xlog.c:5146
Sep 10 23:55:36 theconsole postgres[31150]: [7-1] 0: LOG: 00000: autovacuum launcher started
Sep 10 23:55:36 theconsole postgres[31150]: [7-2] 0: LOCATION: AutoVacLauncherMain, autovacuum.c:520
Sep 10 23:55:36 theconsole postgres[31118]: [7-1] 0: LOG: 00000: database system is ready to accept connections
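
For anyone scanning their own logs for the same symptom, here is a minimal sketch in Python that picks out the "terminated by signal" message shown above. The function name find_crashes is hypothetical, and the pattern only assumes the message wording seen in this log; other PostgreSQL versions or locales may word it differently.

```python
import re

# Matches the postmaster's report that a backend was killed by a signal.
# The wording is taken from the log excerpt above.
CRASH_RE = re.compile(
    r"server process \(PID (\d+)\) was terminated by signal (\d+)"
)


def find_crashes(lines):
    """Return a list of (pid, signal) tuples, one per crash report found."""
    crashes = []
    for line in lines:
        match = CRASH_RE.search(line)
        if match:
            crashes.append((int(match.group(1)), int(match.group(2))))
    return crashes
```

Feeding it the lines of the syslog file would return one entry per crashed backend; signal 11 is the segmentation fault seen here.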


SOLUTION:
        Increase the memory on the server.

WHY

We had recently (a month before) installed splunk on the server. It was running OK.
The combination of splunk and other tasks running had pushed the memory too close to the limit.
What we did not notice was that swap had been almost completely consumed - nasty.


RESULT

We shut it all down, doubled the memory and voila - problem gone.

It goes to show that when hunting problems we should not ignore the basic environmental elements.
It also goes to show that our monitoring system was not looking at this relatively new server.
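
As an illustration of the kind of check that would have caught this, here is a minimal sketch in Python. It assumes a Linux host where /proc/meminfo reports SwapTotal and SwapFree in kB. The function names and the 0.8 threshold are hypothetical choices for the example, not something we actually ran.

```python
def swap_used_fraction(meminfo_text):
    """Return the fraction of swap in use (0.0 to 1.0), or None if no swap is configured."""
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])  # sizes are reported in kB
    total = values.get("SwapTotal", 0)
    if total == 0:
        return None
    return (total - values.get("SwapFree", 0)) / total


def swap_alert(meminfo_text, threshold=0.8):
    """Return True when swap usage is at or above the (assumed) threshold."""
    fraction = swap_used_fraction(meminfo_text)
    return fraction is not None and fraction >= threshold
```

To use it, read /proc/meminfo into a string and pass it to swap_alert; a monitoring job could run that periodically and raise a warning when it returns True.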

(this confession is not an invitation for a spanking)

again thanks for the help
Grant


On 11/09/2009, at 9:09 AM, Grant Maxwell wrote:


On 11/09/2009, at 8:36 AM, Tom Lane wrote:

Grant Maxwell <grant.maxw...@maxan.com.au> writes:

On the problem server:
        shared_preload_libraries = 'pgmemcache'
        #local_preload_libraries = ''

on the others both are empty.


Sounds like a smoking gun to me.

For good measure I removed pgmemcache but the problem persists.


Did you restart the postmaster afterwards?  shared_preload_libraries
is only considered at postmaster start.


        yep - full restart.


                        regards, tom lane



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
