Re: [GENERAL] Trimming transaction logs after extended WAL archive failures

Steven Schlansker Tue, 25 Mar 2014 16:53:52 -0700

On Mar 25, 2014, at 4:45 PM, Adrian Klaver <[email protected]> wrote:


> On 03/25/2014 04:17 PM, Steven Schlansker wrote:
>> 
>> On Mar 25, 2014, at 4:02 PM, Adrian Klaver <[email protected]> wrote:
>> 
>>> On 03/25/2014 03:54 PM, Steven Schlansker wrote:
>>>> 
>>>> On Mar 25, 2014, at 3:52 PM, Adrian Klaver <[email protected]> wrote:
>>>> 
>>>>> On 03/25/2014 01:56 PM, Steven Schlansker wrote:
>>>>>> Hi everyone,
>>>>>> 
>>>>>> I have a Postgres 9.3.3 database machine.  Due to some intelligent work 
>>>>>> on the part of someone who shall remain nameless, the WAL archive 
>>>>>> command included a ‘> /dev/null 2>&1’ which masked archive failures 
>>>>>> until the disk entirely filled with 400GB of pg_xlog entries.
>>>>>> 
>>>>>> I have fixed the archive command and can see WAL segments being shipped 
>>>>>> off of the server, however the xlog remains at a stable size and is not 
>>>>>> shrinking.  In fact, it’s still growing at a (much slower) rate.
>>>>> 
>>>>> So what is wal_keep_segments set at in postgresql.conf?
>>>>> 
>>>> 
>>>> 5000.  There are currently about 18000 WAL segments in pg_xlog.
>>> 
>>> I guess what I should have also asked previously is what exactly are you 
>>> doing, are you streaming as well as archiving?
>> 
>> Yes, we have both enabled.  Here’s some hopefully relevant configuration 
>> stanzas and information:
>> 
> 
>> 
>> I have verified that WAL segments are being archived to the archive 
>> destination, and that the slave is connected and receiving segments.
> 
> Some more questions, what happens when things begin to dawn on me:)
> 
> You said the disk filled up entirely with log files yet currently the 
> number(size) of logs is growing.

It’s holding stable now.  I tried running a vacuum to free some space, but that 
generated more pg_xlog activity than the space it saved, and (I assume) the 
archiver fell behind, which is why the log kept growing.  There haven’t been 
any new segments since I stopped doing that.
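
One way to check whether the archiver is keeping up is to count the status files 
under pg_xlog/archive_status: a ".ready" file marks a segment still waiting to be 
archived, and a ".done" file marks one that has been archived.  The following is 
a minimal sketch; the function name is made up for illustration, and the 
directory passed in is assumed to be the server's pg_xlog directory.

```python
import os


def archive_backlog(pg_xlog_dir):
    """Return (ready, done): segments awaiting archive vs. already archived."""
    status_dir = os.path.join(pg_xlog_dir, "archive_status")
    ready = 0
    done = 0
    for name in os.listdir(status_dir):
        if name.endswith(".ready"):
            ready += 1   # archive_command has not yet succeeded for this segment
        elif name.endswith(".done"):
            done += 1    # archived; eligible for recycling at a later checkpoint
    return ready, done
```

A "ready" count that keeps shrinking suggests the archiver is catching up; one 
that keeps climbing suggests it is still falling behind.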

> 
> So did you grow the disk, move the logs or find some way to reduce the number?

I used tune2fs to temporarily free up some of the “reserved” filesystem space.  
I was too scared to move log segments away; this is a production database.
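
For context, the “reserved” space is the set of blocks the filesystem holds 
back from unprivileged users such as the postgres account.  A rough sketch of 
how to see how much that is on a given mount, using the standard statvfs 
fields (the helper names are hypothetical, and the exact tune2fs invocation 
used here is not given in this thread):

```python
import os


def reserved_bytes(f_bfree, f_bavail, f_frsize):
    """Bytes that are free on disk but not available to unprivileged users."""
    return (f_bfree - f_bavail) * f_frsize


def reserved_bytes_for_path(path):
    """Look up the reserved space for the filesystem containing `path`."""
    st = os.statvfs(path)
    return reserved_bytes(st.f_bfree, st.f_bavail, st.f_frsize)
```

Calling reserved_bytes_for_path() on the data directory's mount point before 
and after the change would show how much headroom was actually released.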

> 
> What happened to the server when the disk filled up?

PostgreSQL PANICed due to failed writes:
Mar 25 22:46:41 prd-db1a postgres[18995]: [12-1] db=checkin,user=postgres PANIC:  could not write to file "pg_xlog/xlogtemp.18995": No space left on device

> In other words do the log entries at the time show it recovered gracefully?

The database is currently up and running, although I do not have much time 
until it fails again; there are only a few precious GB free.
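
For a rough sense of scale, the figures quoted earlier in this thread (about 
18000 segments in pg_xlog, wal_keep_segments = 5000) can be converted to disk 
space.  This is only back-of-the-envelope arithmetic and assumes the default 
16 MB WAL segment size; a server built with a different segment size would 
give different numbers.

```python
# Assumption: default WAL segment size of 16 MB (not confirmed in this thread).
SEGMENT_BYTES = 16 * 1024 * 1024


def wal_gib(segments, segment_bytes=SEGMENT_BYTES):
    """Disk space, in GiB, taken by the given number of WAL segments."""
    return segments * segment_bytes / float(1024 ** 3)


current = wal_gib(18000)  # segments reported in pg_xlog
floor = wal_gib(5000)     # minimum retained because of wal_keep_segments
print("current: %.2f GiB, kept by wal_keep_segments: %.2f GiB" % (current, floor))
```

Under that assumption the directory holds roughly 281 GiB now, and about 78 GiB 
of it would stay in place even after the archiver fully catches up, because 
wal_keep_segments sets a minimum number of segments to retain.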

> If not what did you do to get it running again?
> 

I ran tune2fs and restarted postgres.

> The concern being that the server is actually fully recovered.

I believe it is.  Our production site is back up and running, seemingly 
normally, and the postgres log has no obvious complaints.
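
As for the root cause mentioned at the start of this thread (an archive command 
whose output was sent to /dev/null), the sketch below shows one shape an archive 
helper could take so that failures stay visible.  It is an illustration only, 
not the command actually used here: the function names and the idea of a local 
destination directory are assumptions.  The points it tries to show are that 
errors go to stderr, which typically ends up in the server log, and that the 
return value is non-zero on any failure, so the server keeps the segment and 
retries.

```python
import os
import shutil
import sys


def archive_segment(src_path, dest_dir):
    """Copy one WAL segment into dest_dir; return 0 on success, 1 on failure."""
    dest_path = os.path.join(dest_dir, os.path.basename(src_path))
    if os.path.exists(dest_path):
        # Never silently overwrite a segment that is already archived.
        sys.stderr.write("archive: refusing to overwrite %s\n" % dest_path)
        return 1
    tmp_path = dest_path + ".tmp"
    try:
        # Copy under a temporary name, then rename, so a partial copy is
        # never mistaken for a complete segment.  A production script would
        # probably also fsync the file and directory; omitted for brevity.
        shutil.copyfile(src_path, tmp_path)
        os.rename(tmp_path, dest_path)
    except (IOError, OSError) as exc:
        sys.stderr.write("archive: failed to copy %s: %s\n" % (src_path, exc))
        return 1
    return 0


def main(argv):
    """Hypothetical entry point: argv[1] is the segment path, argv[2] the target."""
    return archive_segment(argv[1], argv[2])
```

A script built around this would end with sys.exit(main(sys.argv)) and be 
referenced from archive_command with %p as the first argument, without any 
redirection that hides its output.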



