Re: 8.2 Autovacuum BUG ?

2018-01-24 Thread pavan95
Hi Álvaro Herrera,

Please find the corresponding output:

1) select name, setting, source, sourcefile, sourceline from pg_settings
   where name like '%vacuum%';
-[ RECORD 1 ]
name   | autovacuum
setting| on
source | configuration file
sourcefile | /etc/postgresql/9.1/main/postgresql.conf
sourceline | 437
-[ RECORD 2 ]
name   | autovacuum_analyze_scale_factor
setting| 0.1
source | configuration file
sourcefile | /etc/postgresql/9.1/main/postgresql.conf
sourceline | 451
-[ RECORD 3 ]
name   | autovacuum_analyze_threshold
setting| 50
source | configuration file
sourcefile | /etc/postgresql/9.1/main/postgresql.conf
sourceline | 448
-[ RECORD 4 ]
name   | autovacuum_freeze_max_age
setting| 200000000
source | configuration file
sourcefile | /etc/postgresql/9.1/main/postgresql.conf
sourceline | 452
-[ RECORD 5 ]
name   | autovacuum_max_workers
setting| 3
source | configuration file
sourcefile | /etc/postgresql/9.1/main/postgresql.conf
sourceline | 443
-[ RECORD 6 ]
name   | autovacuum_naptime
setting| 60
source | configuration file
sourcefile | /etc/postgresql/9.1/main/postgresql.conf
sourceline | 445
-[ RECORD 7 ]
name   | autovacuum_vacuum_cost_delay
setting| 20
source | configuration file
sourcefile | /etc/postgresql/9.1/main/postgresql.conf
sourceline | 454
-[ RECORD 8 ]
name   | autovacuum_vacuum_cost_limit
setting| -1
source | configuration file
sourcefile | /etc/postgresql/9.1/main/postgresql.conf
sourceline | 457
-[ RECORD 9 ]
name   | autovacuum_vacuum_scale_factor
setting| 0.2
source | configuration file
sourcefile | /etc/postgresql/9.1/main/postgresql.conf
sourceline | 450
-[ RECORD 10 ]---
name   | autovacuum_vacuum_threshold
setting| 50
source | configuration file
sourcefile | /etc/postgresql/9.1/main/postgresql.conf
sourceline | 446
-[ RECORD 11 ]---
name   | log_autovacuum_min_duration
setting| 100
source | configuration file
sourcefile | /etc/postgresql/9.1/main/postgresql.conf
sourceline | 439
-[ RECORD 12 ]---
name   | vacuum_cost_delay
setting| 0
source | default
sourcefile |
sourceline |
-[ RECORD 13 ]---
name   | vacuum_cost_limit
setting| 200
source | default
sourcefile |
sourceline |
-[ RECORD 14 ]---
name   | vacuum_cost_page_dirty
setting| 20
source | default
sourcefile |
sourceline |
-[ RECORD 15 ]---
name   | vacuum_cost_page_hit
setting| 1
source | default
sourcefile |
sourceline |
-[ RECORD 16 ]---
name   | vacuum_cost_page_miss
setting| 10
source | default
sourcefile |
sourceline |
-[ RECORD 17 ]---
name   | vacuum_defer_cleanup_age
setting| 0
source | default
sourcefile |
sourceline |
-[ RECORD 18 ]---
name   | vacuum_freeze_min_age
setting| 50000000
source | default
sourcefile |
sourceline |
-[ RECORD 19 ]---
name   | vacuum_freeze_table_age
setting| 150000000
source | default
sourcefile |
sourceline |


2) select oid::regclass, reloptions from pg_class where reloptions is not
   null;

(No rows)



Thanks in advance.


Regards,
Pavan




--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html



Re: 8.2 Autovacuum BUG ?

2018-01-24 Thread Alvaro Herrera
Please show the output of these queries in the relevant databases:

select name, setting, source, sourcefile, sourceline from pg_settings where 
name like '%vacuum%';
select oid::regclass, reloptions from pg_class where reloptions is not null;

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: 8.2 Autovacuum BUG ?

2018-01-24 Thread David G. Johnston
On Wed, Jan 24, 2018 at 7:48 AM, Pavan Teja wrote:

>
> Yes, but why is doing a VACUUM FREEZE of a table causing such rapid
> archiving? Any idea?
>
IIUC, freezing involves physically altering those pages that are not
frozen to make them frozen.  Those changes are logged just like any
(most?) other physical changes to pages.  The rapidness is because
freezing is not that difficult, so lots of pages can be changed in a
relatively short period of time.
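
As a concrete illustration (a sketch, not from the original thread), the
effect can be measured from psql on 9.x, where the WAL functions still
carry the pre-10 "xlog" names; "some_table" here is a placeholder:

    -- Capture the WAL position before the freeze (psql's \gset stores
    -- it in a variable), run the freeze, then compute the WAL written.
    SELECT pg_current_xlog_location() AS before_lsn \gset
    VACUUM (FREEZE) some_table;
    SELECT pg_size_pretty(
             pg_xlog_location_diff(pg_current_xlog_location(), :'before_lsn')
           ) AS wal_generated;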

David J.


Re: 8.2 Autovacuum BUG ?

2018-01-24 Thread Pavan Teja
On Jan 24, 2018 7:57 PM, "Claudio Freire" wrote:



On Wed, Jan 24, 2018 at 8:50 AM, pavan95 wrote:

> Hello all,
>
> One more interesting observation made by me.
>
> I have run the below query(s) on production:
>
> SELECT
> relname,
> age(relfrozenxid) as xid_age,
> pg_size_pretty(pg_table_size(oid)) as table_size
> FROM pg_class
> WHERE relkind = 'r' and pg_table_size(oid) > 1073741824
> ORDER BY age(relfrozenxid) DESC ;
>  relname  | xid_age | table_size
> ----------+---------+------------
>  hxx      | 7798262 | 3245 MB
>  hrx      | 7797554 | 4917 MB
>  irxx     | 7796771 | 2841 MB
>  hr_      | 7744262 | 4778 MB
>  reimbxxx | 6767712 | 1110 MB
>
> show autovacuum_freeze_max_age;
>  autovacuum_freeze_max_age
> ---------------------------
>  200000000
> (1 row)
>

You seem to be rather far from the freeze_max_age. Unless you're consuming
txids at a very high rate, I don't think that's your problem.


Hi,


Yes, but why is doing a VACUUM FREEZE of a table causing such rapid
archiving? Any idea?

Regards,
Pavan


Re: 8.2 Autovacuum BUG ?

2018-01-24 Thread Claudio Freire
On Wed, Jan 24, 2018 at 8:50 AM, pavan95 wrote:

> Hello all,
>
> One more interesting observation made by me.
>
> I have run the below query(s) on production:
>
> SELECT
> relname,
> age(relfrozenxid) as xid_age,
> pg_size_pretty(pg_table_size(oid)) as table_size
> FROM pg_class
> WHERE relkind = 'r' and pg_table_size(oid) > 1073741824
> ORDER BY age(relfrozenxid) DESC ;
>  relname  | xid_age | table_size
> ----------+---------+------------
>  hxx      | 7798262 | 3245 MB
>  hrx      | 7797554 | 4917 MB
>  irxx     | 7796771 | 2841 MB
>  hr_      | 7744262 | 4778 MB
>  reimbxxx | 6767712 | 1110 MB
>
> show autovacuum_freeze_max_age;
>  autovacuum_freeze_max_age
> ---------------------------
>  200000000
> (1 row)
>

You seem to be rather far from the freeze_max_age. Unless you're consuming
txids at a very high rate, I don't think that's your problem.
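
For what it's worth, a minimal sketch of that check (assuming the stock
autovacuum_freeze_max_age of 200000000; adjust to the actual setting):

    -- Per-database transaction ID age, compared against an assumed
    -- freeze_max_age of 200000000, to gauge txid consumption pressure.
    SELECT datname,
           age(datfrozenxid) AS xid_age,
           round(100.0 * age(datfrozenxid) / 200000000, 1) AS pct_of_max
    FROM pg_database
    ORDER BY xid_age DESC;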


pg_xlog unbounded growth

2018-01-24 Thread Stefan Petrea
Hello,

This email is structured in sections as follows:

1 - Estimating the size of pg_xlog depending on postgresql.conf parameters
2 - Cleaning up pg_xlog using a watchdog script
3 - Mailing list survey of related bugs
4 - Thoughts

We're using PostgreSQL 9.6.6 on Ubuntu 16.04.3 LTS.
During some database imports (using pg_restore), we're noticing fast
and unbounded growth of pg_xlog, up to the point where the partition
that stores it (280G in our case) fills up and PostgreSQL shuts down.
The error seen in the logs:

2018-01-17 01:46:23.035 CST [41671] LOG:  database system was shut down at 
2018-01-16 15:49:26 CST
2018-01-17 01:46:23.038 CST [41671] FATAL:  could not write to file 
"pg_xlog/xlogtemp.41671": No space left on device
2018-01-17 01:46:23.039 CST [41662] LOG:  startup process (PID 41671) 
exited with exit code 1
2018-01-17 01:46:23.039 CST [41662] LOG:  aborting startup due to startup 
process failure
2018-01-17 01:46:23.078 CST [41662] LOG:  database system is shut down

The config settings I thought were relevant are the following (I'm
also attaching the entire postgresql.conf in case there are others
that I missed):

wal_level=replica
archive_command='exit 0;'
min_wal_size=2GB
max_wal_size=500MB
checkpoint_completion_target = 0.7
wal_keep_segments = 8

So currently the pg_xlog is growing a lot, and there doesn't seem to
be any way to stop it.
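
A minimal sketch of watching the growth from SQL on 9.6 (assumes
superuser access; WAL segment file names are 24 hex characters):

    -- Count and total size of the WAL segment files currently in pg_xlog.
    SELECT count(*) AS segment_count,
           pg_size_pretty(sum((pg_stat_file('pg_xlog/' || f)).size)) AS total_size
    FROM pg_ls_dir('pg_xlog') AS f
    WHERE f ~ '^[0-9A-F]{24}$';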

There are some formulas I came across that allow one to compute the
maximum number of WAL allowed in pg_xlog as a function of the
PostgreSQL config parameters.

1.1) Method from 2012 found in [2]

The formula for the upper bound on WAL files in pg_xlog is

(2 + checkpoint_completion_target) * checkpoint_segments + 1

which is

( (2 + 0.7) * (2048/16 * 1/3) ) + 1 = 2.7 * 42.67 + 1 ≈ 116 WAL files

I used the 1/3 because of the shift from checkpoint_segments to
max_wal_size in 9.5 [6], the relevant quote from the release notes being:

If you previously adjusted checkpoint_segments, the following formula
will give you an approximately equivalent setting:
max_wal_size = (3 * checkpoint_segments) * 16MB

Another way of computing it, also according to [2], is

2 * checkpoint_segments + wal_keep_segments + 1

which, if 2 * checkpoint_segments is taken to be the full max_wal_size
of 2048/16 = 128 segments, is 128 + 8 + 1 = 137 WAL files.

So far we have two answers; in practice, neither of them checks out,
since pg_xlog grows indefinitely.

1.2) Method from the PostgreSQL internals book 

The book [4] says the following:

it could temporarily become up to "3 * checkpoint_segments + 1"

Ok, let's compute this too: 3 * (128/3) + 1 = 129 WAL files

This doesn't check out either.

1.3) On the mailing list [3], I found formulas similar to the ones
above.

1.4) The post at [5] says max_wal_size is a soft limit, and also sets
wal_keep_segments = 0 in order to enforce keeping as little WAL
around as possible.  Would this work?

Does wal_keep_segments = 0 turn off WAL recycling? Frankly, I would
rather have WAL not be recycled/reused, and just deleted, to keep
pg_xlog below the expected size.

Another question: does wal_level = replica affect the size of
pg_xlog in any way?  We have an archive_command that just exits with
exit code 0, so I don't see any reason for the pg_xlog files not to
be cleaned up.
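
One check worth running here is the archiver statistics view (available
since 9.4); a minimal sketch:

    -- With archive_command = 'exit 0;', every segment should be reported
    -- as successfully archived; a growing failed_count would mean the
    -- archiver itself is what keeps segments from being removed.
    SELECT archived_count, last_archived_wal, last_archived_time,
           failed_count, last_failed_wal, last_failed_time
    FROM pg_stat_archiver;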

2) Cleaning up pg_xlog using a watchdog script

To get the import done, I wrote a script inspired by a blog post that
addresses the pg_xlog out-of-disk-space problem [1].  It periodically
reads the last checkpoint's REDO WAL file and deletes all WAL in
pg_xlog before that one.
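
On 9.6 the REDO file can also be read via SQL; a minimal sketch of one
way to obtain it (not necessarily what the script does; the deletion
step itself is arguably safer left to the bundled pg_archivecleanup
tool):

    -- The last checkpoint's REDO segment: WAL files older than this one
    -- are no longer needed for crash recovery.
    SELECT redo_wal_file FROM pg_control_checkpoint();
    -- e.g.: pg_archivecleanup <pg_xlog_directory> <redo_wal_file>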

The intended usage is for this script to run alongside the imports
in order for pg_xlog to be cleaned up gradually and prevent the disk
from filling up.

Unlike the blog post, and probably slightly wrong, I used
lexicographic ordering rather than ordering by date.  But I guess it
worked because the checks were frequent enough that no WAL ever got
recycled.  In retrospect, I should have used date ordering.

Does this script have the same effect as checkpoint_completion_target = 0?

At the end of the day, this script seems to have allowed the import we
needed to get done, but I acknowledge it was a stop-gap measure and not
a long-term solution, hence my posting to the mailing list to find a
better one.

3) Mailing list survey of related bugs

On the mailing lists, in the past, there have been bugs around pg_xlog
growing out of control:

BUG 7902 [7] - Discusses a situation where WAL is produced faster than
checkpoints can complete (be written to disk), and therefore the WAL
files in pg_xlog cannot be recycled/deleted.  The status of this bug
report is unclear; I have a feeling it's still open.  Is that the case?

BUG 14340 [9] - A user (Sonu Gupta) reported unbounded pg_xlog growth
and was asked to do some checks, then directed to the pgsql-general
mailing list, where he did not follow up.  I quote the checks that were
suggested:

Check that your archive_command is functioning