[GENERAL] Try to understand VACUUM and its settings

2015-10-05 Thread Michael Chau
Hi,

Last Friday, I ran :

postgres=# select max(age(datfrozenxid)) from pg_database;

max
42579490

and then I ran :

SELECT relname, age(relfrozenxid) as xid_age,
pg_size_pretty(pg_table_size(oid)) as table_size
FROM pg_class
WHERE relkind = 'r' and pg_table_size(oid) > 1073741824
ORDER BY age(relfrozenxid) DESC LIMIT 20;


"table_1";42574012;"10111 MB"
"table_2";42567398;"5961 MB"
"table_3";20320509;"12 GB"

Today, the max is

43068744

and

"table_1";43063214;"10134 MB"
"table_2";43056600;"5984 MB"
"table_3";20809711;"12 GB"



It looks like the  age(relfrozenxid) of the tables go up in tandem with the
max. autovacuum_freeze_max_age and vacuum_freeze_table_age are commented
out in postgresql.conf

#autovacuum_freeze_max_age = 2
#vacuum_freeze_table_age = 15000

So, do I need to run vacuum freeze on those tables? Also, if
autovacuum_freeze_max_age is commented, does it still mean that the default
is 200M?

Thanks,
Michael


Re: [GENERAL] Broken primary key after backup restore.

2015-09-21 Thread Michael Chau
Hi,

Just want to let everybody knows

So, I found out this morning what went wrong.

For some reason, there were some bad wal log files in pg_xlog. I believe
that they got generated during the backup last Monday, but I don't know
why. I speculate that may be the wal_keep_segments was not set high enough
as I have changed it recently.

Luckily, I have archived the wal log files. And by comparing between the
two directories, I did see those bad wal log files in pg_xlog directory
only.

Yes, we were looking at Barman before. But at that time Barman didn't
support our version of PostgreSQL. I think that I will pursue this.

Thanks again.

On Fri, Sep 18, 2015 at 2:40 PM, David Steele  wrote:

> On 9/18/15 3:44 PM, Michael Chau wrote:
>
>> Hi Jeff,
>>
>> Only if you are very lucky.  If your tar command tars up the pg_xlog
>>> directory as the last thing it does, then you are probably going to be OK.
>>> Otherwise, it is a crap shoot.
>>>
>>
>> May be that's it. I have another similar set up, but the pg_xlog is a
>> soft link to another directory, and I use 'tar -chvzf'. It tar up the
>> pg_xlog at the very last. And the restore is fine.
>>
>
> This is still not always safe.  It depends on your wal_keep_segments
> settings and some luck.  WAL segments can be recycled during the backup.
>
> For this one, DB1 and DB2, the pg_xlog is the directory itself, and I
>> use 'tar -cvzf'. And it tar up pg_xlog at the beginning. I always have
>> doubt about it. But I though pg_stop_backup() and pg_start_backup() like
>> freezing would prevent the inconsistency.
>>
>
> This is definitely not a good idea.
>
> Indeed, I will look inot pgbasebackup.
>>
>
> pg_basebackup is good for creating replicas but for real backup you might
> want to consider purpose-built backup software like pgBackRest or barman.
>
> --
> -David
> da...@pgmasters.net
>



-- 
*Michael Chau*
*Database Administrator*
*GAME GOLF*
77 Geary St, 5th floor
San Francisco, CA 94108
c) *510-366-3800*
e) *michael.c...@gameyourgame.com <http://gameyourgame.com>*
f) www.facebook.com/gamegolf <http://www.facebook.com/gamegolf.gyg>
t) @GAMEGOLF
w) www.gamegolf.c <http://www.gameyourgame.com/>*om*


Re: [GENERAL] Broken primary key after backup restore.

2015-09-18 Thread Michael Chau
Hi Jeff,

>Only if you are very lucky.  If your tar command tars up the pg_xlog
directory as the last thing it does, then you are probably going to be OK.
Otherwise, it is a crap shoot.

May be that's it. I have another similar set up, but the pg_xlog is a soft
link to another directory, and I use 'tar -chvzf'. It tar up the pg_xlog at
the very last. And the restore is fine.

For this one, DB1 and DB2, the pg_xlog is the directory itself, and I use
'tar -cvzf'. And it tar up pg_xlog at the beginning. I always have doubt
about it. But I though pg_stop_backup() and pg_start_backup() like freezing
would prevent the inconsistency.

Indeed, I will look inot pgbasebackup.

Thanks,



On Fri, Sep 18, 2015 at 11:20 AM, Jeff Janes  wrote:

> On Fri, Sep 18, 2015 at 6:16 AM, Adrian Klaver 
> wrote:
>
>> On 09/17/2015 11:15 PM, Guillaume Lelarge wrote:
>>
>>> Le 18 sept. 2015 5:23 AM, "Adrian Klaver" >> <mailto:adrian.kla...@aklaver.com>> a écrit :
>>>  >
>>>  > On 09/17/2015 05:37 PM, Michael Chau wrote:
>>>  >>
>>>  >> To restore on test server:
>>>  >>
>>>  >> 1) Just untar the tar ball, then start up Postgres. Of course the
>>> data
>>>  >> directory is empty beforehand.
>>>  >>
>>>  >> This has been working for almost 2 years without any problem until
>>> last
>>>  >> Monday. I remember that I just ran vacuum analyze that table on both
>>> DB1
>>>  >> and DB2  that morning. But, I don't think that it harms anything.
>>>  >
>>>  >
>>>  > Well it looks fairly straight forward, to me at least.
>>>  >
>>>
>>> Do I miss something obvious? Because this is to me the wrong way to do
>>> the restore. You need to apply WAL files archived between
>>> pg_start_backup and pg_stop_backup to get consistent data files.
>>>
>>
>> Would that not be taken care of by the tar data directory/ untar data
>> directory?
>>
>
> Only if you are very lucky.  If your tar command tars up the pg_xlog
> directory as the last thing it does, then you are probably going to be OK.
> Otherwise, it is a crap shoot.
>
>
>
>> I would think if it was a WAL issue the OP could never get the server to
>> start and get to the point the query failed on a single table and column.
>
>
> With pg_basebackup, that is probably the case, as it either doesn't copy
> xlog at all, or if it does it makes sure it is complete.  But with tar, you
> have no such protection.
>
>
>
>> All that being said, I think the OP would be better served by
>> pg_basebackup:
>>
>> http://www.postgresql.org/docs/9.4/static/app-pgbasebackup.html
>
>
>
> Yes, indeed.
>
> Cheers,
>
> Jeff
>
>


-- 
*Michael Chau*
*Database Administrator*
*GAME GOLF*
77 Geary St, 5th floor
San Francisco, CA 94108
c) *510-366-3800*
e) *michael.c...@gameyourgame.com <http://gameyourgame.com>*
f) www.facebook.com/gamegolf <http://www.facebook.com/gamegolf.gyg>
t) @GAMEGOLF
w) www.gamegolf.c <http://www.gameyourgame.com/>*om*


Re: [GENERAL] Broken primary key after backup restore.

2015-09-17 Thread Michael Chau
1)
>
> In Production, I have a DB2 which is replicated partially using Londiste
> from DB1.
>

Well I think the above needs more explanation to help understand how the
DB2 backup got into this state and possibly prevent it in the future.

A: So, the DB1 has several schemas in the database. We use Londiste to
replicate just one of the schemas to DB2.  The table in question is in that
schema. Backup is done on both DB1 and DB2.

2)
I make file-system backups nightly on both DBs.

How is that done exactly?

A: To backup:

1) pg_start_backup()
2) tar up the files under the data directory
3) pg_stop_backup()

To restore on test server:

1) Just untar the tar ball, then start up Postgres. Of course the data
directory is empty beforehand.

This has been working for almost 2 years without any problem until last
Monday. I remember that I just ran vacuum analyze that table on both DB1
and DB2  that morning. But, I don't think that it harms anything.

Thanks


On Thu, Sep 17, 2015 at 4:53 PM, Adrian Klaver 
wrote:

> On 09/17/2015 04:31 PM, Michael Chau wrote:
>
>> Hi,
>>
>> In Production, I have a DB2 which is replicated partially using Londiste
>> from DB1.
>>
>
> Well I think the above needs more explanation to help understand how the
> DB2 backup got into this state and possibly prevent it in the future.
>
>
> I make file-system backups nightly on both DBs.
>
> How is that done exactly?
>
>
>> Last Monday, when I restored the backup made from DB2 to a test server,
>> Postgres(9.3.5) started up fine. But, I found out that the primary key
>> of one of the tables is broken
>>
>> # select * from  order by id desc;
>> ERROR:  could not find left sibling of block 17392 in index "mytable_pkey"
>>
>> I am able to select without using the id index. On Prod DB1 , DB2 and on
>> another test server restored from backup made from DB1, there is no
>> problem, as I am able to select the table with and without index.
>>
>
> Did you restore to the DB2 derived test server in the same way as you did
> the other servers?
>
>
>
>> The table has 5 million rows. And I run Vacuum Analyze once a week.
>>
>> 1) To fix the above error, I tried to run vacuum full on the table and
>> run 'reindex table ;. But it didn't help as the reindexing has
>> taken very very long time and not sure if it has finished or just timed
>> out.
>>
>> There is also a suggestion to recreate the primary key constraint
>> concurrently which I will look into later.
>>
>> 2) However, my main concern right now is whether there is any corruption
>> in the Prods table as it does look fine. Is there any way to check? And
>> also should we trust a file-system backup in this case?
>>
>> Thanks
>>
>>
>>
>>
>>
>>
>>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>



-- 
*Michael Chau*
*Database Administrator*
*GAME GOLF*
77 Geary St, 5th floor
San Francisco, CA 94108
c) *510-366-3800 <510-366-3800>*
e) *michael.c...@gameyourgame.com <http://gameyourgame.com>*
f) www.facebook.com/gamegolf <http://www.facebook.com/gamegolf.gyg>
t) @GAMEGOLF
w) www.gamegolf.c <http://www.gameyourgame.com/>*om*


[GENERAL] Broken primary key after backup restore.

2015-09-17 Thread Michael Chau
Hi,

In Production, I have a DB2 which is replicated partially using Londiste
from DB1. I make file-system backups nightly on both DBs.

Last Monday, when I restored the backup made from DB2 to a test server,
Postgres(9.3.5) started up fine. But, I found out that the primary key of
one of the tables is broken

# select * from  order by id desc;
ERROR:  could not find left sibling of block 17392 in index "mytable_pkey"

I am able to select without using the id index. On Prod DB1 , DB2 and on
another test server restored from backup made from DB1, there is no
problem, as I am able to select the table with and without index.

The table has 5 million rows. And I run Vacuum Analyze once a week.

1) To fix the above error, I tried to run vacuum full on the table and run
'reindex table ;. But it didn't help as the reindexing has taken
very very long time and not sure if it has finished or just timed out.

There is also a suggestion to recreate the primary key constraint
concurrently which I will look into later.

2) However, my main concern right now is whether there is any corruption in
the Prods table as it does look fine. Is there any way to check? And also
should we trust a file-system backup in this case?

Thanks