Re: [GENERAL] New Slave - timeline ERROR

2016-01-13 Thread drum.lu...@gmail.com
Hi guys..
I started a new PG_BASEBACKUP and it's working now..

The problem was the line "standby_mode = on" in the recovery.conf on the
STANDBY server.

Previously, after the basebackup, I commented that line out and started
PostgreSQL. But you shouldn't do that.

This last time I didn't comment it out, and it worked.

Thank you!



Lucas Possamai

kinghost.co.nz


On 10 January 2016 at 19:22, drum.lu...@gmail.com 
wrote:

> What is the --pgdata=- in your original command? Are you perhaps in the
>> wrong directory and not getting all the required files?
>
>
> I run the pg_basebackup from the Slave on /var/lib/pgsql/9.2/data.
> So I'm not in the wrong directory...
>
> I'm out of fresh ideas. The rsync command is what I would go with, given
>> that I can't think of any other commands to try.
>
>
> I chose the pg_basebackup command precisely so I wouldn't have to stop any
> database. Stopping even the slave is out of the question... sorry...
>
> I really don't know what else to do. Have tried everything!
>
> Lucas
>
> On 10 January 2016 at 13:31, bricklen  wrote:
>
>> Bottom-posting is the convention in the postgresql lists, and makes it
>> easier to follow a long thread.
>>
>> On Sat, Jan 9, 2016 at 3:16 PM, drum.lu...@gmail.com <
>> drum.lu...@gmail.com> wrote:
>>
>>> My servers are not in the same network. A new pg_basebackup would take 30
>>> hours to complete, as I use --rate-limit 100MB.
>>
>>
>> If you had enough bandwidth, you could do some shell magic to parallelize
>> the rsync commands, or use something like
>> http://moo.nac.uci.edu/~hjm/parsync/ to do that. If you are limited by
>> bandwidth, then a single rsync run is probably what you're stuck with.
>>
>>
>>> I really need to put this server up! =\
>>>
>>
>> If you were running zfs you could also take a snapshot of the fs and use
>> that for your base backup, but I assume you would have mentioned that if it
>> was an option.
>>
>>
>>
>>> I don't think that running pg_basebackup one more time will solve the
>>> problem, because I've already done that!
>>> I could actually run it again, but the problem is that it takes 30h! hahahahah
>>>
>>
>> What is the --pgdata=- in your original command? Are you perhaps in the
>> wrong directory and not getting all the required files?
>>
>>
>> I'm out of fresh ideas. The rsync command is what I would go with, given
>> that I can't think of any other commands to try.
>>
>>
>>
>>>
>>> *Have a look:*
>>> http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html
>>>
>>> Note that there are some limitations in an online backup from the
>>> standby:
>>>
>>> The backup history file is not created in the database cluster backed up.
>>> There is no guarantee that all WAL files required for the backup are
>>> archived at the end of backup. If you are planning to use the backup for an
>>> archive recovery and want to ensure that all required files are available
>>> at that moment, you need to include them into the backup by using the -x
>>> option.
>>>
>> You had that in your original command I believe.
>>
>
>


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread drum.lu...@gmail.com
Hi @bricklen. Thanks for your reply. I've been working on it for 20h =(
So:

The master is currently shipping the WALs to the slave.
When pg_basebackup finished, I got a successful log:

postgres(iostreams)[10037]:   2016-01-09 00:07:26.604
UTC|10085|LOG:  database system is ready to accept read only
connections

The problem happens when, after pg_basebackup, I change the recovery.conf
to replicate from the live server, changing these 2 lines:

standby_mode = on

primary_conninfo = 'host=IP_TO_THE_OTHER_SLAVE port=5432
user=replicator application_name=replication_slave02'


I read the link you have sent: http://dba.stackexchange.com/a/53546/24393
I'm doing all exactly the same, but only one parameter is different:

time pg_basebackup --pgdata=$PGDATA --host=IP_OF_MASTER --port=5432
--username=replication --password --xlog-method=stream --format=plain
--progress --verbose

I'm not using --xlog-method=stream.
I'm using:
ssh postgres@slave1 'pg_basebackup --pgdata=- --format=tar
--label=bb_master --progress --host=localhost --port=5432
--username=replicator *--xlog* | pv --quiet --rate-limit 100M' | tar -x
--no-same-owner

On http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html it says:

--xlog

Using this option is equivalent to using -X with method fetch.
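For reference, the difference between the two modes can be sketched as below; connection values are placeholders from this thread, and (if I recall correctly) in 9.2 --xlog-method=stream only works with --format=plain, which may be why the tar-over-ssh command uses --xlog instead:

```shell
# --xlog (same as -X fetch): WAL segments are collected only AFTER the
# data copy finishes; they must still exist on the server at that point
# (raise wal_keep_segments if there is any doubt on a 30h run).
pg_basebackup --pgdata="$PGDATA" --host="$MASTER_IP" --port=5432 \
    --username=replicator --format=plain --progress --verbose --xlog

# --xlog-method=stream (-X stream): a second connection streams WAL
# concurrently with the data copy, so the backup is self-contained even
# if the server recycles old segments while the copy is running.
pg_basebackup --pgdata="$PGDATA" --host="$MASTER_IP" --port=5432 \
    --username=replicator --format=plain --progress --verbose \
    --xlog-method=stream
```

This is a sketch of the option semantics only, not a tested recipe for this particular setup.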
So... I have 2 TB of data. I wouldn't like to run the pg_basebackup again
with the option --xlog-method=stream without being sure that it is going to
work.


Do you have any other idea? Do you know if --xlog is the problem and I
should re-run pg_basebackup with the *--xlog-method=stream*
option?

Thank you



Lucas Possamai

kinghost.co.nz


On 10 January 2016 at 06:06, bricklen  wrote:

> On Fri, Jan 8, 2016 at 8:44 PM, drum.lu...@gmail.com wrote:
>
> Hi, I'm a bit too lazy to try to suss out the exact reasons for your failure,
> but here is a reasonably thorough guide to set up replication:
> http://dba.stackexchange.com/a/53546/24393
>
> A few tips:
> - Having the master ship WALs to the slaves is handy if you can pull it
> off. If you are doing it over the wire and using rsync, "-z" for
> compression is recommended. If you are doing the tar format of the
> pg_basebackup, you *must* have the master ship the WALs to the slave,
> otherwise it won't be able to synchronize (the "stream" method ships WALs
> over the wire, so the end result is a synchronized system).
>
> - I always run pg_basebackup from the slave I am building, for simplicity.
> - I create new slaves almost every day (we have thousands of databases)
> using a bash script and it almost never fails. In essence it is a big
> wrapper around the pg_basebackup command (though we are using pg93 mostly).
>
> The base backup command that I run from the slave I am building:
> pg_basebackup --pgdata=$PGDATA --host=$MASTER_IP --port=$PGPORT
> --username=replication --no-password --xlog-method=stream --format=plain
> --progress --verbose
>
> The recovery.conf:
> standby_mode = 'on'
> primary_conninfo = 'user=replication host=$IP_OF_UPSTREAM_SLAVE_OR_MASTER
> port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'
> recovery_target_timeline = 'latest'
> archive_cleanup_command = '/usr/pgsql-9.3/bin/pg_archivecleanup
> /path/to/WALs %r'
> restore_command = 'cp /path/to/WALs/%f "%p" 2>>
> /your/PGDATA/path/pg_log/standby.log'
>


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread bricklen
On Sat, Jan 9, 2016 at 1:49 PM, drum.lu...@gmail.com 
wrote:

> Hi,
>
> If the master is successfully shipping WALs to the slave you are setting up
>> you do not need the "stream" option.
>
>
> yes.. the master is successfully shipping the WALs
>
> Is there anything else? Help, please hehehehe
>

If you are able to stop Postgres on the slave you are taking the base
backup from, you could do this:

1). Stop postgres on slave1
2). Rsync slave1 to slave2 to copy only the deltas.
3). When you start up slave2 the WALs that the master has shipped to slave2
should apply and bring your system up to consistency.
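A shell sketch of those three steps, run from slave2; the host name, data directory, and pg_ctl invocations are illustrative assumptions, not taken from the thread:

```shell
# 1) Stop postgres on slave1 (the source of the copy) so its data
#    directory stops changing underneath rsync.
ssh postgres@slave1 'pg_ctl -D /var/lib/pgsql/9.2/data stop -m fast'

# 2) Pull only the deltas from slave1 into slave2's data directory;
#    exclude postmaster.pid so slave2 doesn't inherit a stale pidfile.
rsync -a --delete --exclude postmaster.pid \
    postgres@slave1:/var/lib/pgsql/9.2/data/ /var/lib/pgsql/9.2/data/

# 3) Restart slave1, then start slave2; the WALs the master has been
#    shipping should replay and bring slave2 to consistency.
ssh postgres@slave1 'pg_ctl -D /var/lib/pgsql/9.2/data start'
pg_ctl -D /var/lib/pgsql/9.2/data start
```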


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread drum.lu...@gmail.com
Hmm... I see...
>
>
>> Depending on when you took the base backup and how many changes have
>> occurred at your source (slave1) database cluster, the rsync execution time
>> may or may not take as long as a new base backup, if it is only shipping
>> deltas (changed files).
>
>
I could stop the slave, then... But I'm afraid of getting it back online and
getting some other errors hehehehe =\







Lucas Possamai

kinghost.co.nz


On 10 January 2016 at 10:59, bricklen  wrote:

> On Sat, Jan 9, 2016 at 1:54 PM, drum.lu...@gmail.com wrote:
>
>> Hi,
>>
>> If you are able to stop Postgres on the slave you are taking the base
>>> backup from, you could do this:
>>
>>
>> I'm not... the database is 2 TB.
>> So, an RSYNC would take DAYS.  And I'm not able to stop the SLAVE for
>> that long a time
>>
>
> Depending on when you took the base backup and how many changes have
> occurred at your source (slave1) database cluster, the rsync execution time
> may or may not take as long as a new base backup, if it is only shipping
> deltas (changed files).
>


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread bricklen
On Sat, Jan 9, 2016 at 12:36 PM, drum.lu...@gmail.com 
wrote:

> Do you have any other idea? Do you know if --xlog is the problem and I
> should re-run pg_basebackup with the *--xlog-method=stream*
> option?
>


If the master is successfully shipping WALs to the slave you are setting up,
you do not need the "stream" option.


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread drum.lu...@gmail.com
*rsync would be something like:*

from slave1:
rsync -av data/pg_xlog slave2:/var/lib/postgresql/data/

Is that correct?

At this point I think your options are slim. If you are feeling
> adventurous, you can try doing the rsync with the slave running, then do a
> second rsync with the slave stopped, or do it from the master after putting
> the master in backup mode (eg. executing "pg_start_backup('slave_backup')")


I didn't understand why doing the RSYNC twice... sorry

Lucas



Lucas Possamai

kinghost.co.nz


On 10 January 2016 at 11:19, bricklen  wrote:

> On Sat, Jan 9, 2016 at 2:10 PM, drum.lu...@gmail.com wrote:
>
>> I could stop the slave then But I'm afraid getting it back online and
>> get some other errors
>>
>
> At this point I think your options are slim. If you are feeling
> adventurous, you can try doing the rsync with the slave running, then do a
> second rsync with the slave stopped or do it from the master after putting
> the master in backup mode (eg. executing "pg_start_backup('slave_backup')")
>
>


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread bricklen
On Sat, Jan 9, 2016 at 2:22 PM, drum.lu...@gmail.com 
wrote:

> *rsync would be something like:*
>
> from slave1:
> rsync -av data/pg_xlog slave2:/var/lib/postgresql/data/
>

Normally I run something like the following from the slave I am setting up.

rsync -azr --progress --partial postgres@$MASTER_IP:/var/lib/postgresql/data
/var/lib/postgresql/data/ --exclude postmaster.pid


> I didn't understand why doing the RSYNC twice... sorry
>

Unless the source db cluster you are rsync'ing from is stopped, there will
be changes to data files replicated from the master. The second rsync might
not be necessary given the WALs are shipping from the master to slave2.


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread drum.lu...@gmail.com
>
> rsync -azr --progress --partial postgres@$MASTER_IP:/var/lib/postgresql/data
> /var/lib/postgresql/data/ --exclude postmaster.pid


Ah ok! So this will do an incremental copy, right? It's not supposed to copy
ALL of base/ again?





Lucas Possamai

kinghost.co.nz


On 10 January 2016 at 11:31, bricklen  wrote:

>
>
> On Sat, Jan 9, 2016 at 2:22 PM, drum.lu...@gmail.com wrote:
>
>> *rsync would be something like:*
>>
>> from slave1:
>> rsync -av data/pg_xlog slave2:/var/lib/postgresql/data/
>>
>
> Normally I run something like the following from the slave I am setting up.
>
> rsync -azr --progress --partial postgres@$MASTER_IP:/var/lib/postgresql/data
> /var/lib/postgresql/data/ --exclude postmaster.pid
>
>
>> I didn't understand why doing the RSYNC twice... sorry
>>
>
> Unless the source db cluster you are rsync'ing from is stopped, there will
> be changes to data files replicated from the master. The second rsync might
> not be necessary given the WALs are shipping from the master to slave2.
>
>


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread drum.lu...@gmail.com
Hi,

If the master is successfully shipping WALs to the slave you are setting up
> you do not need the "stream" option.


yes.. the master is successfully shipping the WALs

Is there anything else? Help, please hehehehe





Lucas Possamai

kinghost.co.nz


On 10 January 2016 at 10:34, bricklen  wrote:

>
> On Sat, Jan 9, 2016 at 12:36 PM, drum.lu...@gmail.com <
> drum.lu...@gmail.com> wrote:
>
>> Do you have any other idea? Do you know if --xlog is the problem and I
>> should re-run pg_basebackup with the *--xlog-method=stream*
>> option?
>>
>
>
> If the master is successfully shipping WALs to the slave you are setting up,
> you do not need the "stream" option.
>


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread drum.lu...@gmail.com
Hi,

If you are able to stop Postgres on the slave you are taking the base
> backup from, you could do this:


I'm not... the database is 2 TB.
So, an RSYNC would take DAYS.  And I'm not able to stop the SLAVE for
that long a time

Lucas



Lucas Possamai

kinghost.co.nz


On 10 January 2016 at 10:53, bricklen  wrote:

> On Sat, Jan 9, 2016 at 1:49 PM, drum.lu...@gmail.com wrote:
>
>> Hi,
>>
>> If the master is successfully shipping WALs to the slave you are setting up
>>> you do not need the "stream" option.
>>
>>
>> yes.. the master is successfully shipping the WALs
>>
>> Is there anything else? Help, please hehehehe
>>
>
> If you are able to stop Postgres on the slave you are taking the base
> backup from, you could do this:
>
> 1). Stop postgres on slave1
> 2). Rsync slave1 to slave2 to copy only the deltas.
> 3). When you start up slave2 the WALs that the master has shipped to
> slave2 should apply and bring your system up to consistency.
>


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread bricklen
On Sat, Jan 9, 2016 at 1:54 PM, drum.lu...@gmail.com 
wrote:

> Hi,
>
> If you are able to stop Postgres on the slave you are taking the base
>> backup from, you could do this:
>
>
> I'm not... the data base is 2 TB.
> So, a RSYNC would take DAYS.  And I'm not able to stop the SLAVE for
> that long time
>

Depending on when you took the base backup and how many changes have
occurred at your source (slave1) database cluster, the rsync execution time
may or may not take as long as a new base backup, if it is only shipping
deltas (changed files).


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread bricklen
On Sat, Jan 9, 2016 at 2:10 PM, drum.lu...@gmail.com 
wrote:

> I could stop the slave, then... But I'm afraid of getting it back online and
> getting some other errors
>

At this point I think your options are slim. If you are feeling
adventurous, you can try doing the rsync with the slave running, then do a
second rsync with the slave stopped or do it from the master after putting
the master in backup mode (eg. executing "pg_start_backup('slave_backup')")
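Sketched out, the adventurous two-pass route might look like this; it assumes the copy source is the master, the paths are illustrative, and pg_stop_backup() must be called afterwards or the master stays in backup mode:

```shell
# Pass 1: rsync while the source is live. Most data is copied here, but
# files keep changing underneath, so this copy alone is inconsistent.
rsync -az --partial --exclude postmaster.pid \
    postgres@$MASTER_IP:/var/lib/pgsql/9.2/data/ /var/lib/pgsql/9.2/data/

# Put the master into backup mode so a consistent copy can be taken.
psql -h "$MASTER_IP" -U postgres -c "SELECT pg_start_backup('slave_backup')"

# Pass 2: only the deltas since pass 1 are transferred, so it is short.
rsync -az --partial --exclude postmaster.pid \
    postgres@$MASTER_IP:/var/lib/pgsql/9.2/data/ /var/lib/pgsql/9.2/data/

# Leave backup mode; the WALs generated in between must still be replayed
# on the new slave before it is consistent.
psql -h "$MASTER_IP" -U postgres -c "SELECT pg_stop_backup()"
```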


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread bricklen
On Fri, Jan 8, 2016 at 8:44 PM, drum.lu...@gmail.com 
wrote:

Hi, I'm a bit too lazy to try to suss out the exact reasons for your failure,
but here is a reasonably thorough guide to set up replication:
http://dba.stackexchange.com/a/53546/24393

A few tips:
- Having the master ship WALs to the slaves is handy if you can pull it
off. If you are doing it over the wire and using rsync, "-z" for
compression is recommended. If you are doing the tar format of the
pg_basebackup, you *must* have the master ship the WALs to the slave,
otherwise it won't be able to synchronize (the "stream" method ships WALs
over the wire, so the end result is a synchronized system).

- I always run pg_basebackup from the slave I am building, for simplicity.
- I create new slaves almost every day (we have thousands of databases)
using a bash script and it almost never fails. In essence it is a big
wrapper around the pg_basebackup command (though we are using pg93 mostly).

The base backup command that I run from the slave I am building:
pg_basebackup --pgdata=$PGDATA --host=$MASTER_IP --port=$PGPORT
--username=replication --no-password --xlog-method=stream --format=plain
--progress --verbose

The recovery.conf:
standby_mode = 'on'
primary_conninfo = 'user=replication host=$IP_OF_UPSTREAM_SLAVE_OR_MASTER
port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'
recovery_target_timeline = 'latest'
archive_cleanup_command = '/usr/pgsql-9.3/bin/pg_archivecleanup
/path/to/WALs %r'
restore_command = 'cp /path/to/WALs/%f "%p" 2>>
/your/PGDATA/path/pg_log/standby.log'


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread bricklen
On Sat, Jan 9, 2016 at 2:35 PM, drum.lu...@gmail.com 
wrote:

> rsync -azr --progress --partial postgres@$MASTER_IP:/var/lib/postgresql/data
>> /var/lib/postgresql/data/ --exclude postmaster.pid
>
>
> Ah ok! So this will do an incremental copy, right? It's not supposed to copy
> ALL of base/ again?
>

Yes, this is for incremental copying from the upstream source.
Actually, you don't need the -r with -a (it is implied), and you can run it
first with --dry-run to see what it _would_ do.
If you are not shipping over the WAN, then omit the -z flag as you do not
need compression.
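The incremental behaviour and --dry-run are easy to rehearse locally before pointing rsync at a 2 TB cluster; a throwaway sketch with made-up paths (no Postgres involved):

```shell
# Build a small source tree and make an initial full copy.
rm -rf /tmp/rsync_demo
mkdir -p /tmp/rsync_demo/src /tmp/rsync_demo/dst
echo "unchanged contents" > /tmp/rsync_demo/src/base_file
echo "v1"                 > /tmp/rsync_demo/src/changed_file
rsync -a /tmp/rsync_demo/src/ /tmp/rsync_demo/dst/

# Change one file (different size, so rsync must flag it).
echo "version 2, longer" > /tmp/rsync_demo/src/changed_file

# --dry-run + --itemize-changes reports what WOULD be sent: only the delta.
rsync -a --dry-run --itemize-changes /tmp/rsync_demo/src/ /tmp/rsync_demo/dst/
```

Only changed_file shows up in the itemized listing; base_file is skipped, which is the "deltas only" behaviour discussed above.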


Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread John R Pierce

On 1/9/2016 4:33 PM, drum.lu...@gmail.com wrote:

Should I point of replication new slave to same DB?



I can't even guess what you're asking here.



--
john r pierce, recycling bits in santa cruz



Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread drum.lu...@gmail.com
Should I point of replication new slave to same DB?

Lucas

On Sunday, 10 January 2016, bricklen  wrote:

> On Sat, Jan 9, 2016 at 2:35 PM, drum.lu...@gmail.com wrote:
>
>> rsync -azr --progress --partial postgres@$MASTER_IP:/var/lib/postgresql/data
>>> /var/lib/postgresql/data/ --exclude postmaster.pid
>>
>>
>> Ah ok! So this will do an incremental copy, right? It's not supposed to copy
>> ALL of base/ again?
>>
>
> Yes, this is for incremental copying from the upstream source.
> Actually, you don't need the -r with -a (it is implied), and you can run
> it first with --dry-run to see what it _would_ do.
> If you are not shipping over the WAN, then omit the -z flag as you do not
> need compression.
>


-- 


Lucas Possamai

kinghost.co.nz



Re: [GENERAL] New Slave - timeline ERROR

2016-01-09 Thread drum.lu...@gmail.com
What is the --pgdata=- in your original command? Are you perhaps in the
> wrong directory and not getting all the required files?


I run the pg_basebackup from the Slave on /var/lib/pgsql/9.2/data.
So I'm not in the wrong directory...

I'm out of fresh ideas. The rsync command is what I would go with, given
> that I can't think of any other commands to try.


I chose the pg_basebackup command precisely so I wouldn't have to stop any
database. Stopping even the slave is out of the question... sorry...

I really don't know what else to do. Have tried everything!

Lucas

On 10 January 2016 at 13:31, bricklen  wrote:

> Bottom-posting is the convention in the postgresql lists, and makes it
> easier to follow a long thread.
>
> On Sat, Jan 9, 2016 at 3:16 PM, drum.lu...@gmail.com wrote:
>
>> My servers are not in the same network. A new pg_basebackup would take 30
>> hours to complete, as I use --rate-limit 100MB.
>
>
> If you had enough bandwidth, you could do some shell magic to parallelize
> the rsync commands, or use something like
> http://moo.nac.uci.edu/~hjm/parsync/ to do that. If you are limited by
> bandwidth, then a single rsync run is probably what you're stuck with.
>
>
>> I really need to put this server up! =\
>>
>
> If you were running zfs you could also take a snapshot of the fs and use
> that for your base backup, but I assume you would have mentioned that if it
> was an option.
>
>
>
>> I don't think that running pg_basebackup one more time will solve the
>> problem, because I've already done that!
>> I could actually run it again, but the problem is that it takes 30h! hahahahah
>>
>
> What is the --pgdata=- in your original command? Are you perhaps in the
> wrong directory and not getting all the required files?
>
>
> I'm out of fresh ideas. The rsync command is what I would go with, given
> that I can't think of any other commands to try.
>
>
>
>>
>> *Have a look:*
>> http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html
>>
>> Note that there are some limitations in an online backup from the standby:
>>>
>>
>>
>> The backup history file is not created in the database cluster backed up.
>>> There is no guarantee that all WAL files required for the backup are
>>> archived at the end of backup. If you are planning to use the backup for an
>>> archive recovery and want to ensure that all required files are available
>>> at that moment, you need to include them into the backup by using the -x
>>> option.
>>>
>>
> You had that in your original command I believe.
>


[GENERAL] New Slave - timeline ERROR

2016-01-08 Thread drum.lu...@gmail.com
I've started a new SLAVE PostgreSQL server set up.

*NOTE: I run the pg_basebackup from another STANDBY SERVER, not from the
MASTER*

1 - screen -t basebackup

2 - su - postgres

3 - cd ~/9.2/data/

4 - ssh postgres@slave01 'pg_basebackup --pgdata=- --format=tar
--label=bb_master --progress --host=localhost --port=5432
--username=replicator --xlog | pv --quiet --rate-limit 100M' | tar -x
--no-same-owner

5 - I've commented the "primary_conninfo =" and "standby_mode=" so the
slave can get the files from WAL_ARCHIVE

6 - After that, I got these logs:

postgres(iostreams)[10037]:   2016-01-09 00:07:26.604
UTC|10085|LOG:  database system is ready to accept read only
connections

7 - After the server finished replaying the WAL_ARCHIVE, I turned on
replication from the MASTER in recovery.conf:

*recovery.conf on the New Slave:*

restore_command = 'exec nice -n 19 ionice -c 2 -n 7
../../bin/restore_wal_segment.bash "../wal_archive/%f" "%p"'
archive_cleanup_command = 'exec nice -n 19 ionice -c 2 -n 7
../../bin/pg_archivecleaup_mv.bash -d "../wal_archive" "%r"'
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=192.168.100.XX port=5432 user=replicator
application_name=replication_slave02'

But, once I've restarted the POSTGRESQL I got this error:

WAL segment `../wal_archive/0005.history` not found2016-01-09
01:13:39.183 UTC|774|FATAL:  timeline 2 of the primary does not match
recovery target timeline 4

What can I do to solve the problem?

It's really important as it's a production New Slave. Thank you!