Re: PG 14 pg_basebackup accepts --compress=server-zst option

2024-06-07 Thread Tom Lane
Ron Johnson  writes:
> On Fri, Jun 7, 2024 at 12:32 AM David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>> I don’t see us adding an error message at this point.

> Me neither.  It just seemed odd.

v14 thinks the argument of --compress must be an integer, and doesn't
really bother with any syntax error checks:

case 'Z':
compresslevel = atoi(optarg);
if (compresslevel < 0 || compresslevel > 9)
{
pg_log_error("invalid compression level \"%s\"", optarg);
exit(1);
}
break;

In your example, atoi() will return zero and it will sail along with
no compression.  Releases 15 and up have more complex ideas of what
--compress can specify, and seem to syntax-check it much more
thoroughly.

This is a pretty common coding pattern, so I can't get excited
about changing it, especially not in long-stable branches.

regards, tom lane




Re: PG 14 pg_basebackup accepts --compress=server-zst option

2024-06-07 Thread Ron Johnson
On Fri, Jun 7, 2024 at 12:32 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Thursday, June 6, 2024, Kashif Zeeshan  wrote:
>
>> Hi
>>
>> On Fri, Jun 7, 2024 at 6:54 AM Ron Johnson 
>> wrote:
>>
>>>
>>> https://www.postgresql.org/docs/14/app-pgbasebackup.html doesn't
>>> mention "--compress=[{client|server}-]method".  That first appears in the
>>> v15 docs.
>>>
>>> And yet pg_basebackup doesn't complain about an invalid option.
>>> (Technically, this is a bug; I first noticed it a week after copying a
>>> script from a PG 15 server to five PG 14 servers, and running it quite a
>>> few times without fail.)
>>>
>>
> Seems a bit suspect, but as your script doesn’t mention tar the option
> itself is apparently ignored, I guess silently.
>

Does this mean that "--compress=server-zst" is only relevant with
--format=tar?


> Assuming this isn’t an actual regression in behavior in a patch-released
> older version
>

My apologies for not mentioning the version: 14.12-1PGDG-rhel8.


> I don’t see us adding an error message at this point.
>

Me neither.  It just seemed odd.


Re: PG 14 pg_basebackup accepts --compress=server-zst option

2024-06-06 Thread David G. Johnston
On Thursday, June 6, 2024, Kashif Zeeshan  wrote:

> Hi
>
> On Fri, Jun 7, 2024 at 6:54 AM Ron Johnson 
> wrote:
>
>>
>> https://www.postgresql.org/docs/14/app-pgbasebackup.html doesn't mention
>> "--compress=[{client|server}-]method".  That first appears in the v15
>> docs.
>>
>> And yet pg_basebackup doesn't complain about an invalid option.
>> (Technically, this is a bug; I first noticed it a week after copying a
>> script from a PG 15 server to five PG 14 servers, and running it quite a
>> few times without fail.)
>>
>
Seems a bit suspect, but as your script doesn’t mention tar the option
itself is apparently ignored, I guess silently.  Assuming this isn’t an
actual regression in behavior in a patch-released older version I don’t see
us adding an error message at this point.


> If the support is removed then it should be mentioned in the official
> documentation.
>

Support wasn’t removed.  Re-read the email and check the version/times
being mentioned again.

David J.


Re: PG 14 pg_basebackup accepts --compress=server-zst option

2024-06-06 Thread Kashif Zeeshan
Hi

On Fri, Jun 7, 2024 at 6:54 AM Ron Johnson  wrote:

>
> https://www.postgresql.org/docs/14/app-pgbasebackup.html doesn't mention
> "--compress=[{client|server}-]method".  That first appears in the v15 docs.
>
> And yet pg_basebackup doesn't complain about an invalid option.
> (Technically, this is a bug; I first noticed it a week after copying a
> script from a PG 15 server to five PG 14 servers, and running it quite a
> few times without fail.)
>
If the support is removed then it should be mentioned in the official
documentation.

Regards
Kashif Zeeshan
Bitnine Global

>
> $ pg_basebackup \
> > --pgdata=$PGDATA \
> > --dbname=service=basebackup \
> > --verbose --progress \
> > --checkpoint=fast \
> > --write-recovery-conf \
> > --wal-method=stream \
> > --create-slot --slot=pgstandby1 \
> > --compress=server-zst ; echo $?
> pg_basebackup: initiating base backup, waiting for checkpoint to complete
> pg_basebackup: checkpoint completed
> pg_basebackup: write-ahead log start point: 256/BC28 on timeline 1
> pg_basebackup: starting background WAL receiver
> pg_basebackup: created replication slot "pgstandby1"
> 42567083/42567083 kB (100%), 1/1 tablespace
> pg_basebackup: write-ahead log end point: 256/BC000138
> pg_basebackup: waiting for background process to finish streaming ...
> pg_basebackup: syncing data to disk ...
> pg_basebackup: renaming backup_manifest.tmp to backup_manifest
> pg_basebackup: base backup completed
> 0
>
>


PG 14 pg_basebackup accepts --compress=server-zst option

2024-06-06 Thread Ron Johnson
https://www.postgresql.org/docs/14/app-pgbasebackup.html doesn't mention
"--compress=[{client|server}-]method".  That first appears in the v15 docs.

And yet pg_basebackup doesn't complain about an invalid option.
(Technically, this is a bug; I first noticed it a week after copying a
script from a PG 15 server to five PG 14 servers, and running it quite a
few times without fail.)

$ pg_basebackup \
> --pgdata=$PGDATA \
> --dbname=service=basebackup \
> --verbose --progress \
> --checkpoint=fast \
> --write-recovery-conf \
> --wal-method=stream \
> --create-slot --slot=pgstandby1 \
>     --compress=server-zst ; echo $?
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 256/BC28 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created replication slot "pgstandby1"
42567083/42567083 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 256/BC000138
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed
0


pg_basebackup "Permission denied" error when initiating streaming replication

2024-05-07 Thread Frans Simmelvuo (Papula-Nevinpat)
Hi,

I'm trying to initiate streaming replication between two Postgres 10 servers on 
Windows. The source cluster is of size ~450gb, and the cluster consists of the 
primary data folder, and two separated tablespaces. One of the tablespaces is 
configured with a symbolic link inside Windows.

Now, the problem I'm encountering is the following:
I call pg_basebackup at the target as follows:
>> pg_basebackup -h  -U  -p  -W -D  -T 
>> "C:\Server 
>> Files\Data\ProgramsData\PostgreSQL\10\data\base\demo_tablespace"=E:\PG_tablespace_data\10\demo_tablespace
>>  -T "C:\Server 
>> Files\Data\ProgramsData\PostgreSQL\10\data\base\tsb_tablespace"=E:\PG_tablespace_data\10\tsb_tablespace
>>  -P -R -v

This starts the pg_basebackup and it runs without problems for the duration of 
copying the 450gbs of data.
Then the output is the following:

>>pg_basebackup: initiating base backup, waiting for checkpoint to complete
>>pg_basebackup: checkpoint complete
>>pg_basebackup: write-ahead log start point: 181/928 on timeline 1
>>pg_basebackup: starting background WAL receiver
>>432917950/482036946 kB (100%), 3/3 tablespaces
>>pg_basebackup: could not get write-ahead log end position from server: ERROR: 
>>could not open file "./base/demo_tablespace": Permission denied

The demo_tablespace is the one that is a symbolic link from another folder. 
After the error, I find that all the data is copied to the target tablespaces 
and primary data directory, but the data directory is missing all pg_... 
directories and configuration files. So no running database at target.

I'm running the source Postgres service on Windows as a local account, the 
command is given with administrator cmd and the ./base/demo_tablespace folder's 
security settings should have all needed permissions for the operation.

I have a hard time understanding why pg_basebackup can retrieve the data from 
the ./base/demo_tablespace directory, and then suddenly gets denied. I'm 
running out of ideas on what to try next.

BR,
Frans Simmelvuo


Re: pg_basebackup Restore problem

2024-01-17 Thread Johnathan Tiamoh
Thank You very much for your time.

On Wed, Jan 17, 2024 at 4:51 PM Ron Johnson  wrote:

> Perfectly understandable, but tar *did* fail.  Time to start debugging
> your shell script.
>
> On Wed, Jan 17, 2024 at 4:26 PM Johnathan Tiamoh <
> johnathantia...@gmail.com> wrote:
>
>> Ok.
>>
>> I'm a little confused because has always work
>>
>> On Wed, Jan 17, 2024 at 4:11 PM Ron Johnson 
>> wrote:
>>
>>> Then you've got a bug somewhere in:
>>> tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz
>>> ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}
>>>
>>> On Wed, Jan 17, 2024 at 4:07 PM Johnathan Tiamoh <
>>> johnathantia...@gmail.com> wrote:
>>>
>>>> Yes.
>>>>
>>>> I am trying to restore the backups on a standby
>>>>
>>>> On Wed, Jan 17, 2024 at 4:04 PM Ron Johnson 
>>>> wrote:
>>>>
>>>>> Wait a minute... *tar* is throwing the errors, not pg_basebackup, no?
>>>>>
>>>>> On Wed, Jan 17, 2024 at 3:27 PM Johnathan Tiamoh <
>>>>> johnathantia...@gmail.com> wrote:
>>>>>
>>>>>> 1. What's in $PATH_FOLDER?
>>>>>>
>>>>>> /tnt/backup/current
>>>>>>
>>>>>> 2. What pg_basebackup command did you use?
>>>>>>
>>>>>> pg_basebackup -D "$baseback_dir" --format=tar   \
>>>>>>     "${comp_opts[@]}" --wal-method=stream --no-password
>>>>>> --verbose "${PG_DUMP_OPTS[@]}"
>>>>>>
>>>>>> /bin/mv "$baseback_dir"/* "/enf/backup/current/"
>>>>>> /bin/rm -r "$baseback_dir"
>>>>>>
>>>>>>
>>>>>> 3. Why aren't you letting pg_basebackup maintain the WAL files it
>>>>>> needs?
>>>>>>
>>>>>>
>>>>>> On Wed, Jan 17, 2024 at 11:46 AM Ron Johnson 
>>>>>> wrote:
>>>>>>
>>>>>>> 1. What's in $PATH_FOLDER?
>>>>>>> 2. What pg_basebackup command did you use?
>>>>>>> 3. Why aren't you letting pg_basebackup maintain the WAL files it
>>>>>>> needs?
>>>>>>>
>>>>>>> On Wed, Jan 17, 2024 at 11:27 AM Johnathan Tiamoh <
>>>>>>> johnathantia...@gmail.com> wrote:
>>>>>>>
>>>>>>>> You need to tell us the PG version number *and* show us the full
>>>>>>>> command you ran. ?
>>>>>>>>
>>>>>>>> Postgresql Version 14.10
>>>>>>>>
>>>>>>>> tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz
>>>>>>>>
>>>>>>>> ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}
>>>>>>>>
>>>>>>>> What user are you running pg_basebackup as?
>>>>>>>>
>>>>>>>> I ran it as postgres and now I'm restoring as postgres
>>>>>>>>
>>>>>>>> /tnt???  Or /mnt?
>>>>>>>> /tnt/backup/current
>>>>>>>>
>>>>>>>> This is where the backup files are.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jan 17, 2024 at 11:15 AM Ron Johnson <
>>>>>>>> ronljohnso...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> On Wed, Jan 17, 2024 at 11:11 AM Johnathan Tiamoh <
>>>>>>>>> johnathantia...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I am trying to restore a pg_basebackup and have the following
>>>>>>>>>> errors.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> You need to tell us the PG version number *and* show us the full
>>>>>>>>> command you ran.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> nohup: ignoring input
>>>>>>>>>> tar: /tnt/backup/current/7400.tar.gz: Not found in archive
>>>>>>>>>>
>>>>>>>>> tar: /tnt/backup/current/7401.tar.gz: Not found in archive
>>>>>>>>>> tar: /tnt/backup/current/7402.tar.gz: Not found in archive
>>>>>>>>>> tar: /tnt/backup/current/base.tar.gz: Not found in archive
>>>>>>>>>> tar: /tnt/backup/current/pg_wal.tar.gz: Not found in archive
>>>>>>>>>> tar: Exiting with failure status due to previous errors
>>>>>>>>>> PG_14_202107181/
>>>>>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> What user are you running pg_basebackup as?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> PG_14_202107181/137502/
>>>>>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>>>>> tar: PG_14_202107181/137502: Cannot mkdir: No such file or
>>>>>>>>>> directory
>>>>>>>>>> PG_14_202107181/137502/3222926016
>>>>>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>>>>>
>>>>>>>>>> The files exist in the specified directory.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> /tnt???  Or /mnt?
>>>>>>>>>
>>>>>>>>>


Re: pg_basebackup Restore problem

2024-01-17 Thread Ron Johnson
Perfectly understandable, but tar *did* fail.  Time to start debugging your
shell script.

On Wed, Jan 17, 2024 at 4:26 PM Johnathan Tiamoh 
wrote:

> Ok.
>
> I'm a little confused because has always work
>
> On Wed, Jan 17, 2024 at 4:11 PM Ron Johnson 
> wrote:
>
>> Then you've got a bug somewhere in:
>> tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz
>> ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}
>>
>> On Wed, Jan 17, 2024 at 4:07 PM Johnathan Tiamoh <
>> johnathantia...@gmail.com> wrote:
>>
>>> Yes.
>>>
>>> I am trying to restore the backups on a standby
>>>
>>> On Wed, Jan 17, 2024 at 4:04 PM Ron Johnson 
>>> wrote:
>>>
>>>> Wait a minute... *tar* is throwing the errors, not pg_basebackup, no?
>>>>
>>>> On Wed, Jan 17, 2024 at 3:27 PM Johnathan Tiamoh <
>>>> johnathantia...@gmail.com> wrote:
>>>>
>>>>> 1. What's in $PATH_FOLDER?
>>>>>
>>>>> /tnt/backup/current
>>>>>
>>>>> 2. What pg_basebackup command did you use?
>>>>>
>>>>> pg_basebackup -D "$baseback_dir" --format=tar   \
>>>>> "${comp_opts[@]}" --wal-method=stream --no-password
>>>>> --verbose "${PG_DUMP_OPTS[@]}"
>>>>>
>>>>>     /bin/mv "$baseback_dir"/* "/enf/backup/current/"
>>>>> /bin/rm -r "$baseback_dir"
>>>>>
>>>>>
>>>>> 3. Why aren't you letting pg_basebackup maintain the WAL files it
>>>>> needs?
>>>>>
>>>>>
>>>>> On Wed, Jan 17, 2024 at 11:46 AM Ron Johnson 
>>>>> wrote:
>>>>>
>>>>>> 1. What's in $PATH_FOLDER?
>>>>>> 2. What pg_basebackup command did you use?
>>>>>> 3. Why aren't you letting pg_basebackup maintain the WAL files it
>>>>>> needs?
>>>>>>
>>>>>> On Wed, Jan 17, 2024 at 11:27 AM Johnathan Tiamoh <
>>>>>> johnathantia...@gmail.com> wrote:
>>>>>>
>>>>>>> You need to tell us the PG version number *and* show us the full
>>>>>>> command you ran. ?
>>>>>>>
>>>>>>> Postgresql Version 14.10
>>>>>>>
>>>>>>> tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz
>>>>>>>
>>>>>>> ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}
>>>>>>>
>>>>>>> What user are you running pg_basebackup as?
>>>>>>>
>>>>>>> I ran it as postgres and now I'm restoring as postgres
>>>>>>>
>>>>>>> /tnt???  Or /mnt?
>>>>>>> /tnt/backup/current
>>>>>>>
>>>>>>> This is where the backup files are.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jan 17, 2024 at 11:15 AM Ron Johnson <
>>>>>>> ronljohnso...@gmail.com> wrote:
>>>>>>>
>>>>>>>> On Wed, Jan 17, 2024 at 11:11 AM Johnathan Tiamoh <
>>>>>>>> johnathantia...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I am trying to restore a pg_basebackup and have the following
>>>>>>>>> errors.
>>>>>>>>>
>>>>>>>>>
>>>>>>>> You need to tell us the PG version number *and* show us the full
>>>>>>>> command you ran.
>>>>>>>>
>>>>>>>>
>>>>>>>>> nohup: ignoring input
>>>>>>>>> tar: /tnt/backup/current/7400.tar.gz: Not found in archive
>>>>>>>>>
>>>>>>>> tar: /tnt/backup/current/7401.tar.gz: Not found in archive
>>>>>>>>> tar: /tnt/backup/current/7402.tar.gz: Not found in archive
>>>>>>>>> tar: /tnt/backup/current/base.tar.gz: Not found in archive
>>>>>>>>> tar: /tnt/backup/current/pg_wal.tar.gz: Not found in archive
>>>>>>>>> tar: Exiting with failure status due to previous errors
>>>>>>>>> PG_14_202107181/
>>>>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>>>>
>>>>>>>>
>>>>>>>> What user are you running pg_basebackup as?
>>>>>>>>
>>>>>>>>
>>>>>>>>> PG_14_202107181/137502/
>>>>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>>>> tar: PG_14_202107181/137502: Cannot mkdir: No such file or
>>>>>>>>> directory
>>>>>>>>> PG_14_202107181/137502/3222926016
>>>>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>>>>
>>>>>>>>> The files exist in the specified directory.
>>>>>>>>>
>>>>>>>>
>>>>>>>> /tnt???  Or /mnt?
>>>>>>>>
>>>>>>>>


Re: pg_basebackup Restore problem

2024-01-17 Thread Johnathan Tiamoh
Ok.

I'm a little confused because has always work

On Wed, Jan 17, 2024 at 4:11 PM Ron Johnson  wrote:

> Then you've got a bug somewhere in:
> tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz
> ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}
>
> On Wed, Jan 17, 2024 at 4:07 PM Johnathan Tiamoh <
> johnathantia...@gmail.com> wrote:
>
>> Yes.
>>
>> I am trying to restore the backups on a standby
>>
>> On Wed, Jan 17, 2024 at 4:04 PM Ron Johnson 
>> wrote:
>>
>>> Wait a minute... *tar* is throwing the errors, not pg_basebackup, no?
>>>
>>> On Wed, Jan 17, 2024 at 3:27 PM Johnathan Tiamoh <
>>> johnathantia...@gmail.com> wrote:
>>>
>>>> 1. What's in $PATH_FOLDER?
>>>>
>>>> /tnt/backup/current
>>>>
>>>> 2. What pg_basebackup command did you use?
>>>>
>>>> pg_basebackup -D "$baseback_dir" --format=tar   \
>>>> "${comp_opts[@]}" --wal-method=stream --no-password
>>>> --verbose "${PG_DUMP_OPTS[@]}"
>>>>
>>>> /bin/mv "$baseback_dir"/* "/enf/backup/current/"
>>>>     /bin/rm -r "$baseback_dir"
>>>>
>>>>
>>>> 3. Why aren't you letting pg_basebackup maintain the WAL files it needs?
>>>>
>>>>
>>>> On Wed, Jan 17, 2024 at 11:46 AM Ron Johnson 
>>>> wrote:
>>>>
>>>>> 1. What's in $PATH_FOLDER?
>>>>> 2. What pg_basebackup command did you use?
>>>>> 3. Why aren't you letting pg_basebackup maintain the WAL files it
>>>>> needs?
>>>>>
>>>>> On Wed, Jan 17, 2024 at 11:27 AM Johnathan Tiamoh <
>>>>> johnathantia...@gmail.com> wrote:
>>>>>
>>>>>> You need to tell us the PG version number *and* show us the full
>>>>>> command you ran. ?
>>>>>>
>>>>>> Postgresql Version 14.10
>>>>>>
>>>>>> tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz
>>>>>>
>>>>>> ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}
>>>>>>
>>>>>> What user are you running pg_basebackup as?
>>>>>>
>>>>>> I ran it as postgres and now I'm restoring as postgres
>>>>>>
>>>>>> /tnt???  Or /mnt?
>>>>>> /tnt/backup/current
>>>>>>
>>>>>> This is where the backup files are.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jan 17, 2024 at 11:15 AM Ron Johnson 
>>>>>> wrote:
>>>>>>
>>>>>>> On Wed, Jan 17, 2024 at 11:11 AM Johnathan Tiamoh <
>>>>>>> johnathantia...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I am trying to restore a pg_basebackup and have the following
>>>>>>>> errors.
>>>>>>>>
>>>>>>>>
>>>>>>> You need to tell us the PG version number *and* show us the full
>>>>>>> command you ran.
>>>>>>>
>>>>>>>
>>>>>>>> nohup: ignoring input
>>>>>>>> tar: /tnt/backup/current/7400.tar.gz: Not found in archive
>>>>>>>>
>>>>>>> tar: /tnt/backup/current/7401.tar.gz: Not found in archive
>>>>>>>> tar: /tnt/backup/current/7402.tar.gz: Not found in archive
>>>>>>>> tar: /tnt/backup/current/base.tar.gz: Not found in archive
>>>>>>>> tar: /tnt/backup/current/pg_wal.tar.gz: Not found in archive
>>>>>>>> tar: Exiting with failure status due to previous errors
>>>>>>>> PG_14_202107181/
>>>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>>>
>>>>>>>
>>>>>>> What user are you running pg_basebackup as?
>>>>>>>
>>>>>>>
>>>>>>>> PG_14_202107181/137502/
>>>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>>> tar: PG_14_202107181/137502: Cannot mkdir: No such file or directory
>>>>>>>> PG_14_202107181/137502/3222926016
>>>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>>>
>>>>>>>> The files exist in the specified directory.
>>>>>>>>
>>>>>>>
>>>>>>> /tnt???  Or /mnt?
>>>>>>>
>>>>>>>


Re: pg_basebackup Restore problem

2024-01-17 Thread Ron Johnson
Then you've got a bug somewhere in:
tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz
ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}

On Wed, Jan 17, 2024 at 4:07 PM Johnathan Tiamoh 
wrote:

> Yes.
>
> I am trying to restore the backups on a standby
>
> On Wed, Jan 17, 2024 at 4:04 PM Ron Johnson 
> wrote:
>
>> Wait a minute... *tar* is throwing the errors, not pg_basebackup, no?
>>
>> On Wed, Jan 17, 2024 at 3:27 PM Johnathan Tiamoh <
>> johnathantia...@gmail.com> wrote:
>>
>>> 1. What's in $PATH_FOLDER?
>>>
>>> /tnt/backup/current
>>>
>>> 2. What pg_basebackup command did you use?
>>>
>>> pg_basebackup -D "$baseback_dir" --format=tar   \
>>> "${comp_opts[@]}" --wal-method=stream --no-password
>>> --verbose "${PG_DUMP_OPTS[@]}"
>>>
>>> /bin/mv "$baseback_dir"/* "/enf/backup/current/"
>>> /bin/rm -r "$baseback_dir"
>>>
>>>
>>> 3. Why aren't you letting pg_basebackup maintain the WAL files it needs?
>>>
>>>
>>> On Wed, Jan 17, 2024 at 11:46 AM Ron Johnson 
>>> wrote:
>>>
>>>> 1. What's in $PATH_FOLDER?
>>>> 2. What pg_basebackup command did you use?
>>>> 3. Why aren't you letting pg_basebackup maintain the WAL files it needs?
>>>>
>>>> On Wed, Jan 17, 2024 at 11:27 AM Johnathan Tiamoh <
>>>> johnathantia...@gmail.com> wrote:
>>>>
>>>>> You need to tell us the PG version number *and* show us the full
>>>>> command you ran. ?
>>>>>
>>>>> Postgresql Version 14.10
>>>>>
>>>>> tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz
>>>>>
>>>>> ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}
>>>>>
>>>>> What user are you running pg_basebackup as?
>>>>>
>>>>> I ran it as postgres and now I'm restoring as postgres
>>>>>
>>>>> /tnt???  Or /mnt?
>>>>> /tnt/backup/current
>>>>>
>>>>> This is where the backup files are.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jan 17, 2024 at 11:15 AM Ron Johnson 
>>>>> wrote:
>>>>>
>>>>>> On Wed, Jan 17, 2024 at 11:11 AM Johnathan Tiamoh <
>>>>>> johnathantia...@gmail.com> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I am trying to restore a pg_basebackup and have the following errors.
>>>>>>>
>>>>>>>
>>>>>> You need to tell us the PG version number *and* show us the full
>>>>>> command you ran.
>>>>>>
>>>>>>
>>>>>>> nohup: ignoring input
>>>>>>> tar: /tnt/backup/current/7400.tar.gz: Not found in archive
>>>>>>>
>>>>>> tar: /tnt/backup/current/7401.tar.gz: Not found in archive
>>>>>>> tar: /tnt/backup/current/7402.tar.gz: Not found in archive
>>>>>>> tar: /tnt/backup/current/base.tar.gz: Not found in archive
>>>>>>> tar: /tnt/backup/current/pg_wal.tar.gz: Not found in archive
>>>>>>> tar: Exiting with failure status due to previous errors
>>>>>>> PG_14_202107181/
>>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>>
>>>>>>
>>>>>> What user are you running pg_basebackup as?
>>>>>>
>>>>>>
>>>>>>> PG_14_202107181/137502/
>>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>> tar: PG_14_202107181/137502: Cannot mkdir: No such file or directory
>>>>>>> PG_14_202107181/137502/3222926016
>>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>>
>>>>>>> The files exist in the specified directory.
>>>>>>>
>>>>>>
>>>>>> /tnt???  Or /mnt?
>>>>>>
>>>>>>


Re: pg_basebackup Restore problem

2024-01-17 Thread Johnathan Tiamoh
Yes.

I am trying to restore the backups on a standby

On Wed, Jan 17, 2024 at 4:04 PM Ron Johnson  wrote:

> Wait a minute... *tar* is throwing the errors, not pg_basebackup, no?
>
> On Wed, Jan 17, 2024 at 3:27 PM Johnathan Tiamoh <
> johnathantia...@gmail.com> wrote:
>
>> 1. What's in $PATH_FOLDER?
>>
>> /tnt/backup/current
>>
>> 2. What pg_basebackup command did you use?
>>
>> pg_basebackup -D "$baseback_dir" --format=tar   \
>> "${comp_opts[@]}" --wal-method=stream --no-password
>> --verbose "${PG_DUMP_OPTS[@]}"
>>
>> /bin/mv "$baseback_dir"/* "/enf/backup/current/"
>> /bin/rm -r "$baseback_dir"
>>
>>
>> 3. Why aren't you letting pg_basebackup maintain the WAL files it needs?
>>
>>
>> On Wed, Jan 17, 2024 at 11:46 AM Ron Johnson 
>> wrote:
>>
>>> 1. What's in $PATH_FOLDER?
>>> 2. What pg_basebackup command did you use?
>>> 3. Why aren't you letting pg_basebackup maintain the WAL files it needs?
>>>
>>> On Wed, Jan 17, 2024 at 11:27 AM Johnathan Tiamoh <
>>> johnathantia...@gmail.com> wrote:
>>>
>>>> You need to tell us the PG version number *and* show us the full
>>>> command you ran. ?
>>>>
>>>> Postgresql Version 14.10
>>>>
>>>> tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz
>>>>
>>>> ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}
>>>>
>>>> What user are you running pg_basebackup as?
>>>>
>>>> I ran it as postgres and now I'm restoring as postgres
>>>>
>>>> /tnt???  Or /mnt?
>>>> /tnt/backup/current
>>>>
>>>> This is where the backup files are.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Jan 17, 2024 at 11:15 AM Ron Johnson 
>>>> wrote:
>>>>
>>>>> On Wed, Jan 17, 2024 at 11:11 AM Johnathan Tiamoh <
>>>>> johnathantia...@gmail.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I am trying to restore a pg_basebackup and have the following errors.
>>>>>>
>>>>>>
>>>>> You need to tell us the PG version number *and* show us the full
>>>>> command you ran.
>>>>>
>>>>>
>>>>>> nohup: ignoring input
>>>>>> tar: /tnt/backup/current/7400.tar.gz: Not found in archive
>>>>>>
>>>>> tar: /tnt/backup/current/7401.tar.gz: Not found in archive
>>>>>> tar: /tnt/backup/current/7402.tar.gz: Not found in archive
>>>>>> tar: /tnt/backup/current/base.tar.gz: Not found in archive
>>>>>> tar: /tnt/backup/current/pg_wal.tar.gz: Not found in archive
>>>>>> tar: Exiting with failure status due to previous errors
>>>>>> PG_14_202107181/
>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>
>>>>>
>>>>> What user are you running pg_basebackup as?
>>>>>
>>>>>
>>>>>> PG_14_202107181/137502/
>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>> tar: PG_14_202107181/137502: Cannot mkdir: No such file or directory
>>>>>> PG_14_202107181/137502/3222926016
>>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>>
>>>>>> The files exist in the specified directory.
>>>>>>
>>>>>
>>>>> /tnt???  Or /mnt?
>>>>>
>>>>>


Re: pg_basebackup Restore problem

2024-01-17 Thread Ron Johnson
Wait a minute... *tar* is throwing the errors, not pg_basebackup, no?

On Wed, Jan 17, 2024 at 3:27 PM Johnathan Tiamoh 
wrote:

> 1. What's in $PATH_FOLDER?
>
> /tnt/backup/current
>
> 2. What pg_basebackup command did you use?
>
> pg_basebackup -D "$baseback_dir" --format=tar   \
> "${comp_opts[@]}" --wal-method=stream --no-password
> --verbose "${PG_DUMP_OPTS[@]}"
>
> /bin/mv "$baseback_dir"/* "/enf/backup/current/"
> /bin/rm -r "$baseback_dir"
>
>
> 3. Why aren't you letting pg_basebackup maintain the WAL files it needs?
>
>
> On Wed, Jan 17, 2024 at 11:46 AM Ron Johnson 
> wrote:
>
>> 1. What's in $PATH_FOLDER?
>> 2. What pg_basebackup command did you use?
>> 3. Why aren't you letting pg_basebackup maintain the WAL files it needs?
>>
>> On Wed, Jan 17, 2024 at 11:27 AM Johnathan Tiamoh <
>> johnathantia...@gmail.com> wrote:
>>
>>> You need to tell us the PG version number *and* show us the full
>>> command you ran. ?
>>>
>>> Postgresql Version 14.10
>>>
>>> tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz
>>>
>>> ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}
>>>
>>> What user are you running pg_basebackup as?
>>>
>>> I ran it as postgres and now I'm restoring as postgres
>>>
>>> /tnt???  Or /mnt?
>>> /tnt/backup/current
>>>
>>> This is where the backup files are.
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Jan 17, 2024 at 11:15 AM Ron Johnson 
>>> wrote:
>>>
>>>> On Wed, Jan 17, 2024 at 11:11 AM Johnathan Tiamoh <
>>>> johnathantia...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am trying to restore a pg_basebackup and have the following errors.
>>>>>
>>>>>
>>>> You need to tell us the PG version number *and* show us the full
>>>> command you ran.
>>>>
>>>>
>>>>> nohup: ignoring input
>>>>> tar: /tnt/backup/current/7400.tar.gz: Not found in archive
>>>>>
>>>> tar: /tnt/backup/current/7401.tar.gz: Not found in archive
>>>>> tar: /tnt/backup/current/7402.tar.gz: Not found in archive
>>>>> tar: /tnt/backup/current/base.tar.gz: Not found in archive
>>>>> tar: /tnt/backup/current/pg_wal.tar.gz: Not found in archive
>>>>> tar: Exiting with failure status due to previous errors
>>>>> PG_14_202107181/
>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>
>>>>
>>>> What user are you running pg_basebackup as?
>>>>
>>>>
>>>>> PG_14_202107181/137502/
>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>> tar: PG_14_202107181/137502: Cannot mkdir: No such file or directory
>>>>> PG_14_202107181/137502/3222926016
>>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>>
>>>>> The files exist in the specified directory.
>>>>>
>>>>
>>>> /tnt???  Or /mnt?
>>>>
>>>>


Re: pg_basebackup Restore problem

2024-01-17 Thread Johnathan Tiamoh
1. What's in $PATH_FOLDER?

/tnt/backup/current

2. What pg_basebackup command did you use?

pg_basebackup -D "$baseback_dir" --format=tar   \
"${comp_opts[@]}" --wal-method=stream --no-password
--verbose "${PG_DUMP_OPTS[@]}"

/bin/mv "$baseback_dir"/* "/enf/backup/current/"
/bin/rm -r "$baseback_dir"


3. Why aren't you letting pg_basebackup maintain the WAL files it needs?


On Wed, Jan 17, 2024 at 11:46 AM Ron Johnson 
wrote:

> 1. What's in $PATH_FOLDER?
> 2. What pg_basebackup command did you use?
> 3. Why aren't you letting pg_basebackup maintain the WAL files it needs?
>
> On Wed, Jan 17, 2024 at 11:27 AM Johnathan Tiamoh <
> johnathantia...@gmail.com> wrote:
>
>> You need to tell us the PG version number *and* show us the full command
>> you ran. ?
>>
>> Postgresql Version 14.10
>>
>> tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz
>>
>> ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}
>>
>> What user are you running pg_basebackup as?
>>
>> I ran it as postgres and now I'm restoring as postgres
>>
>> /tnt???  Or /mnt?
>> /tnt/backup/current
>>
>> This is where the backup files are.
>>
>>
>>
>>
>>
>> On Wed, Jan 17, 2024 at 11:15 AM Ron Johnson 
>> wrote:
>>
>>> On Wed, Jan 17, 2024 at 11:11 AM Johnathan Tiamoh <
>>> johnathantia...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I am trying to restore a pg_basebackup and have the following errors.
>>>>
>>>>
>>> You need to tell us the PG version number *and* show us the full
>>> command you ran.
>>>
>>>
>>>> nohup: ignoring input
>>>> tar: /tnt/backup/current/7400.tar.gz: Not found in archive
>>>>
>>> tar: /tnt/backup/current/7401.tar.gz: Not found in archive
>>>> tar: /tnt/backup/current/7402.tar.gz: Not found in archive
>>>> tar: /tnt/backup/current/base.tar.gz: Not found in archive
>>>> tar: /tnt/backup/current/pg_wal.tar.gz: Not found in archive
>>>> tar: Exiting with failure status due to previous errors
>>>> PG_14_202107181/
>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>
>>>
>>> What user are you running pg_basebackup as?
>>>
>>>
>>>> PG_14_202107181/137502/
>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>> tar: PG_14_202107181/137502: Cannot mkdir: No such file or directory
>>>> PG_14_202107181/137502/3222926016
>>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>>
>>>> The files exist in the specified directory.
>>>>
>>>
>>> /tnt???  Or /mnt?
>>>
>>>


Re: pg_basebackup Restore problem

2024-01-17 Thread Ron Johnson
1. What's in $PATH_FOLDER?
2. What pg_basebackup command did you use?
3. Why aren't you letting pg_basebackup maintain the WAL files it needs?

On Wed, Jan 17, 2024 at 11:27 AM Johnathan Tiamoh 
wrote:

> You need to tell us the PG version number *and* show us the full command
> you ran. ?
>
> Postgresql Version 14.10
>
> tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz
>
> ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}
>
> What user are you running pg_basebackup as?
>
> I ran it as postgres and now I'm restoring as postgres
>
> /tnt???  Or /mnt?
> /tnt/backup/current
>
> This is where the backup files are.
>
>
>
>
>
> On Wed, Jan 17, 2024 at 11:15 AM Ron Johnson 
> wrote:
>
>> On Wed, Jan 17, 2024 at 11:11 AM Johnathan Tiamoh <
>> johnathantia...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I am trying to restore a pg_basebackup and have the following errors.
>>>
>>>
>> You need to tell us the PG version number *and* show us the full command
>> you ran.
>>
>>
>>> nohup: ignoring input
>>> tar: /tnt/backup/current/7400.tar.gz: Not found in archive
>>>
>> tar: /tnt/backup/current/7401.tar.gz: Not found in archive
>>> tar: /tnt/backup/current/7402.tar.gz: Not found in archive
>>> tar: /tnt/backup/current/base.tar.gz: Not found in archive
>>> tar: /tnt/backup/current/pg_wal.tar.gz: Not found in archive
>>> tar: Exiting with failure status due to previous errors
>>> PG_14_202107181/
>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>
>>
>> What user are you running pg_basebackup as?
>>
>>
>>> PG_14_202107181/137502/
>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>> tar: PG_14_202107181/137502: Cannot mkdir: No such file or directory
>>> PG_14_202107181/137502/3222926016
>>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>>
>>> The files exist in the specified directory.
>>>
>>
>> /tnt???  Or /mnt?
>>
>>


Re: pg_basebackup Restore problem

2024-01-17 Thread Johnathan Tiamoh
You need to tell us the PG version number *and* show us the full command
you ran. ?

Postgresql Version 14.10

tar -h  -zxvf   $PATH_FOLDER/*  .tar.gz

ls $PATH_FOLDER/*.tar.gz | xargs -I  {} tar -h  -zxvf  {}

What user are you running pg_basebackup as?

I ran it as postgres and now I'm restoring as postgres

/tnt???  Or /mnt?
/tnt/backup/current

This is where the backup files are.





On Wed, Jan 17, 2024 at 11:15 AM Ron Johnson 
wrote:

> On Wed, Jan 17, 2024 at 11:11 AM Johnathan Tiamoh <
> johnathantia...@gmail.com> wrote:
>
>> Hello,
>>
>> I am trying to restore a pg_basebackup and have the following errors.
>>
>>
> You need to tell us the PG version number *and* show us the full command
> you ran.
>
>
>> nohup: ignoring input
>> tar: /tnt/backup/current/7400.tar.gz: Not found in archive
>>
> tar: /tnt/backup/current/7401.tar.gz: Not found in archive
>> tar: /tnt/backup/current/7402.tar.gz: Not found in archive
>> tar: /tnt/backup/current/base.tar.gz: Not found in archive
>> tar: /tnt/backup/current/pg_wal.tar.gz: Not found in archive
>> tar: Exiting with failure status due to previous errors
>> PG_14_202107181/
>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>
>
> What user are you running pg_basebackup as?
>
>
>> PG_14_202107181/137502/
>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>> tar: PG_14_202107181/137502: Cannot mkdir: No such file or directory
>> PG_14_202107181/137502/3222926016
>> tar: PG_14_202107181: Cannot mkdir: Permission denied
>>
>> The files exist in the specified directory.
>>
>
> /tnt???  Or /mnt?
>
>


Re: pg_basebackup Restore problem

2024-01-17 Thread Ron Johnson
On Wed, Jan 17, 2024 at 11:11 AM Johnathan Tiamoh 
wrote:

> Hello,
>
> I am trying to restore a pg_basebackup and have the following errors.
>
>
You need to tell us the PG version number *and* show us the full command
you ran.


> nohup: ignoring input
> tar: /tnt/backup/current/7400.tar.gz: Not found in archive
>
tar: /tnt/backup/current/7401.tar.gz: Not found in archive
> tar: /tnt/backup/current/7402.tar.gz: Not found in archive
> tar: /tnt/backup/current/base.tar.gz: Not found in archive
> tar: /tnt/backup/current/pg_wal.tar.gz: Not found in archive
> tar: Exiting with failure status due to previous errors
> PG_14_202107181/
> tar: PG_14_202107181: Cannot mkdir: Permission denied
>

What user are you running pg_basebackup as?


> PG_14_202107181/137502/
> tar: PG_14_202107181: Cannot mkdir: Permission denied
> tar: PG_14_202107181/137502: Cannot mkdir: No such file or directory
> PG_14_202107181/137502/3222926016
> tar: PG_14_202107181: Cannot mkdir: Permission denied
>
> The files exist in the specified directory.
>

/tnt???  Or /mnt?


pg_basebackup Restore problem

2024-01-17 Thread Johnathan Tiamoh
Hello,

I am trying to restore a pg_basebackup and have the following errors.

nohup: ignoring input
tar: /tnt/backup/current/7400.tar.gz: Not found in archive
tar: /tnt/backup/current/7401.tar.gz: Not found in archive
tar: /tnt/backup/current/7402.tar.gz: Not found in archive
tar: /tnt/backup/current/base.tar.gz: Not found in archive
tar: /tnt/backup/current/pg_wal.tar.gz: Not found in archive
tar: Exiting with failure status due to previous errors
PG_14_202107181/
tar: PG_14_202107181: Cannot mkdir: Permission denied
PG_14_202107181/137502/
tar: PG_14_202107181: Cannot mkdir: Permission denied
tar: PG_14_202107181/137502: Cannot mkdir: No such file or directory
PG_14_202107181/137502/3222926016
tar: PG_14_202107181: Cannot mkdir: Permission denied

The files exist in the specified directory.

Thank you
Johnathan T.


Re: pg_basebackup

2023-11-20 Thread Christoph Moench-Tegeder
## Matthias Apitz (g...@unixarea.de):

> 2023-11-16 20:34:13.538 CET [6250] LOG:  terminating walsender process due to 
> replication timeout

Besides "what Lauenz said" (especially about the horribly ooutdated
PostgreSQL version): check IO speed and saturation during backup
and make sure you're not stalling. I've seen this beaviour a few
times, mostly in conjunction with btrfs - using a suitably proven
filesystem usually solved the problem (overloaded hardware can
be a problem, too - but modern systems can take quite a bit more
than in the olden days of spinning rust).

Regards,
Christoph

-- 
Spare Space.




Re: pg_basebackup

2023-11-20 Thread Laurenz Albe
On Mon, 2023-11-20 at 07:30 +0100, Matthias Apitz wrote:
> We're facing in a customer installation (PostgreSQL 13.1 on Linux) the
> following problem for the first time and not reproducible:

13.1?  Your immediate reaction should be "update to the latest minor release".

> ${BINDIR}/pg_basebackup -U ${DBSUSER} -Ft -z -D ${BACKUPDIR}-${DATE}-${NUM}
> 
> The resulting stdout/stderr of the script:
> 
> 16.11.2023-20:20:02: pg_basebackup the cluster to 
> /Backup/postgres/sisis-20231116-1 ... 
> pg_basebackup: could not receive data from WAL stream: server closed the 
> connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> pg_basebackup: child process exited with error 1
> 
> pg-error.log:
> 
> 2023-11-16 20:34:13.538 CET [6250] LOG:  terminating walsender process due to 
> replication timeout
> 
> Why the PostgreSQL server says something about "replication", we do
> pg_basebackup?

Because "pg_basebackup" uses a replication connection.

> Some more information:
> 
> - wal_sender_timeout has default value (60s)

Increase "wal_sender_timeout", perhaps to 0 (which means "infinite").

Yours,
Laurenz Albe




pg_basebackup

2023-11-19 Thread Matthias Apitz


Hello,

We're facing in a customer installation (PostgreSQL 13.1 on Linux) the
following problem for the first time and not reproducible:

The effective part of our backup script contains:
...
test -d ${BACKUPWAL}-${DATE}-${NUM}/ || mkdir -p ${BACKUPWAL}-${DATE}-${NUM}/

# kick to archive the current log; use a DB which will exist;
#
psql -U ${DBSUSER} -dpostgres -c "select pg_switch_wal();" > /dev/null

# backup the cluster
#
printf "%s: pg_basebackup the cluster to %s ... " "`date "+%d.%m.%Y-%H:%M:%S"`" 
${BACKUPDIR}-${DATE}-${NUM}
${BINDIR}/pg_basebackup -U ${DBSUSER} -Ft -z -D ${BACKUPDIR}-${DATE}-${NUM}

...


The resulting stdout/stderr of the script:

16.11.2023-20:20:02: pg_basebackup the cluster to 
/Backup/postgres/sisis-20231116-1 ... 
pg_basebackup: could not receive data from WAL stream: server closed the 
connection unexpectedly
This probably means the server terminated abnormally
    before or while processing the request.
pg_basebackup: child process exited with error 1

pg-error.log:

2023-11-16 20:34:13.538 CET [6250] LOG:  terminating walsender process due to 
replication timeout

Why the PostgreSQL server says something about "replication", we do
pg_basebackup?

Some more information:

- wal_sender_timeout has default value (60s)
- backup target is a local file, not a network storage
- the Linux SLES 15 server is good equipped
- nothing is logged in /var/log/messages

Any ideas? Thanks.

matthias


-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub




Re: pg_basebackup / recovery

2023-04-12 Thread Achilleas Mantzios

Στις 13/4/23 01:31, ο/η Michael Paquier έγραψε:

On Wed, Apr 12, 2023 at 01:45:56PM +0300, Achilleas Mantzios - cloud wrote:

On 4/12/23 12:32, Fabrice Chapuis wrote:

During recovery process of a self contained backup, how postgres know to
stop reading wal when consistency is reached?

Because it knows the full packup info. It will observe the

STOP WAL LOCATION: 3BC7/4B000130 (file 00023BC7004B)

inside the backup file

There is a bit more to that in the recovery logic, depending mostly
on the presence of backup_ label file in the data folder when recovery
begins.  Once the backup_label is found at the beginning of recovery,
its information is stored in the control file and the file is renamed
to backup_label.old hence stopping the server when recovery has not
reached its expected point would rely on the control file contents
later on.  Then, the startup process and its WAL redo makes sure that
WAL replays until it finds the WAL record marking the end of the
backup.  Grepping for XLOG_BACKUP_END (WAL record type in this case)
shows all the areas that rely on that, and xlogrecovery.c covers the
most relevant bits.


Thank you for the info!

Sorry about my stupid typo "packup" :(


--
Michael


--
Achilleas Mantzios
 IT DEV - HEAD
 IT DEPT
 Dynacom Tankers Mgmt





Re: pg_basebackup / recovery

2023-04-12 Thread Michael Paquier
On Wed, Apr 12, 2023 at 01:45:56PM +0300, Achilleas Mantzios - cloud wrote:
> On 4/12/23 12:32, Fabrice Chapuis wrote:
>> During recovery process of a self contained backup, how postgres know to
>> stop reading wal when consistency is reached?
> 
> Because it knows the full packup info. It will observe the
> 
> STOP WAL LOCATION: 3BC7/4B000130 (file 00023BC7004B)
> 
> inside the backup file

There is a bit more to that in the recovery logic, depending mostly
on the presence of backup_ label file in the data folder when recovery
begins.  Once the backup_label is found at the beginning of recovery,
its information is stored in the control file and the file is renamed
to backup_label.old hence stopping the server when recovery has not
reached its expected point would rely on the control file contents
later on.  Then, the startup process and its WAL redo makes sure that
WAL replays until it finds the WAL record marking the end of the
backup.  Grepping for XLOG_BACKUP_END (WAL record type in this case)
shows all the areas that rely on that, and xlogrecovery.c covers the
most relevant bits.
--
Michael


signature.asc
Description: PGP signature


Re: pg_basebackup / recovery

2023-04-12 Thread Achilleas Mantzios - cloud

On 4/12/23 12:32, Fabrice Chapuis wrote:

During recovery process of a self contained backup, how postgres know 
to stop reading wal when consistency is reached?



Because it knows the full packup info. It will observe the

STOP WAL LOCATION: 3BC7/4B000130 (file 00023BC7004B)

inside the backup file

 It's a full contained backup because you haven't called with the |-X 
/|method none|/| , and it doesn't get into standby because you haven't 
called with |--write-recovery-conf !|


|I believe by default it will do what you meant that you want.
|


pg_basebackup / recovery

2023-04-12 Thread Fabrice Chapuis
During recovery process of a self contained backup, how postgres know to
stop reading wal when consistency is reached?


Re: postgres replication without pg_basebackup? postgres 13.3

2022-11-07 Thread Laurenz Albe
On Mon, 2022-11-07 at 23:11 +0100, Pilar de Teodoro wrote:
> Thank you very much for the idea. [of running pg_rewind]
> We ran pg_rewind correctly:
> [postgres@gacsdb05 data-13.3]$ pg_rewind -c -R 
> --target-pgdata=/PostgresDB/sas_hdd/data-13.3/  --source-server="host= 
> port= user=postgres password=XXX" 
> pg_rewind: source and target cluster are on the same timeline
> pg_rewind: no rewind required

I cannot verify that you ran it correctly.

"Target" should be the old server with the extra transactions.
"Source" should be the promoted standby server.  Promotion switches to a new 
timeline,
so it looks like you did something wrong.

> but then we got the following error:
> 
> [2022-11-07 22:57:55 CET-]LOG:  starting PostgreSQL 13.3 on 
> x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10), 
> 64-bit
> [2022-11-07 22:57:55 CET-]LOG:  listening on IPv4 address "0.0.0.0", port 
> [2022-11-07 22:57:55 CET-]LOG:  could not create IPv6 socket for address 
> "::": Address family not supported by protocol
> [2022-11-07 22:57:55 CET-]LOG:  listening on Unix socket "/tmp/.s.PGSQL."
> [2022-11-07 22:57:55 CET-]LOG:  database system was shut down in recovery at 
> 2022-11-07 22:57:47 CET
> [2022-11-07 22:57:55 CET-]LOG:  restored log file "0003.history" from 
> archive
> cp: cannot stat '/PostgresWalLogArchive/new/0004.history': No such file 
> or directory
> [2022-11-07 22:57:55 CET-]LOG:  entering standby mode
> [2022-11-07 22:57:55 CET-]LOG:  restored log file "0003.history" from 
> archive
> [2022-11-07 22:57:55 CET-]LOG:  invalid primary checkpoint record
> [2022-11-07 22:57:55 CET-]PANIC:  could not locate a valid checkpoint record
> [2022-11-07 22:57:55 CET-]LOG:  startup process (PID 3011860) was terminated 
> by signal 6: Aborted
> [2022-11-07 22:57:55 CET-]LOG:  aborting startup due to startup process 
> failure
> [2022-11-07 22:57:55 CET-]LOG:  database system is shut down
> 
> We have read we can run pg_resetwal but the Wal folder is the folder where 
> the Wals are archived in the primary. Would that be correct to reset them?

Don't run "pg_resetwal".  At the very least, it will break your standby.

At this point, your standby seems to be broken.  I don't know what exactly you
did, but it leeks like you should run a "pg_basebackup" after all.

Yours,
Laurenz Albe




Re: postgres replication without pg_basebackup? postgres 13.3

2022-11-07 Thread Pilar de Teodoro
Dear Laurenz,
Thank you very much for the idea.
We ran pg_rewind correctly:

[postgres@gacsdb05 data-13.3]$ pg_rewind -c -R
--target-pgdata=/PostgresDB/sas_hdd/data-13.3/  --source-server="host=
port= user=postgres password=XXX"

pg_rewind: source and target cluster are on the same timeline

pg_rewind: no rewind required


but then we got the following error:


[2022-11-07 22:57:55 CET-]LOG:  starting PostgreSQL 13.3 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat
8.5.0-10), 64-bit

[2022-11-07 22:57:55 CET-]LOG:  listening on IPv4 address "0.0.0.0", port


[2022-11-07 22:57:55 CET-]LOG:  could not create IPv6 socket for address
"::": Address family not supported by protocol

[2022-11-07 22:57:55 CET-]LOG:  listening on Unix socket
"/tmp/.s.PGSQL."

[2022-11-07 22:57:55 CET-]LOG:  database system was shut down in recovery
at 2022-11-07 22:57:47 CET

[2022-11-07 22:57:55 CET-]LOG:  restored log file "0003.history" from
archive

cp: cannot stat '/PostgresWalLogArchive/new/0004.history': No such file
or directory

[2022-11-07 22:57:55 CET-]LOG:  entering standby mode

[2022-11-07 22:57:55 CET-]LOG:  restored log file "0003.history" from
archive

[2022-11-07 22:57:55 CET-]LOG:  invalid primary checkpoint record

[2022-11-07 22:57:55 CET-]PANIC:  could not locate a valid checkpoint record

[2022-11-07 22:57:55 CET-]LOG:  startup process (PID 3011860) was
terminated by signal 6: Aborted

[2022-11-07 22:57:55 CET-]LOG:  aborting startup due to startup process
failure

[2022-11-07 22:57:55 CET-]LOG:  database system is shut down


We have read we can run pg_resetwal but the Wal folder is the folder where
the Wals are archived in the primary. Would that be correct to reset them?


Any suggestion?

Thank you so much,


Pilar

El lun, 7 nov 2022 a las 12:21, Laurenz Albe ()
escribió:

> On Mon, 2022-11-07 at 11:02 +0100, Pilar de Teodoro wrote:
> > We have a very large database of 37TB and we had to promote our standby
> to primary due to
> > some disk failures. Now the issues are solved, we would like to make
> standby  the old primary
> > from a copy of the new primary which is already in place.  Is it
> possible without using pg_basebackup?
>
> That's exactly what "pg_rewind" is for.  It is a fast version of
> "pg_basebackup" for exactly
> that case.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>


Re: postgres replication without pg_basebackup? postgres 13.3

2022-11-07 Thread Laurenz Albe
On Mon, 2022-11-07 at 11:02 +0100, Pilar de Teodoro wrote:
> We have a very large database of 37TB and we had to promote our standby to 
> primary due to
> some disk failures. Now the issues are solved, we would like to make standby  
> the old primary
> from a copy of the new primary which is already in place.  Is it possible 
> without using pg_basebackup?

That's exactly what "pg_rewind" is for.  It is a fast version of 
"pg_basebackup" for exactly
that case.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




postgres replication without pg_basebackup? postgres 13.3

2022-11-07 Thread Pilar de Teodoro
Dear all,

We have a very large database of 37TB and we had to promote our standby to
primary due to some disk failures. Now the issues are solved, we would like
to make standby  the old primary  from a copy of the new primary which is
already in place.  Is it possible without using pg_basebackup? We have
already a copy created than can be started but if we open as standby
(standby.signal created) the log says when started:

[2022-10-28 21:27:12 CEST-]LOG:  starting PostgreSQL 13.3 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat
8.5.0-10), 64-bit

[2022-10-28 21:27:12 CEST-]LOG:  listening on IPv4 address "0.0.0.0", port
8300

[2022-10-28 21:27:12 CEST-]LOG:  could not create IPv6 socket for address
"::": Address family not supported by protocol

[2022-10-28 21:27:12 CEST-]LOG:  listening on Unix socket
"/tmp/.s.PGSQL.8300"

[2022-10-28 21:27:12 CEST-]LOG:  database system was interrupted; last
known up at 2022-10-28 20:31:08 CEST

[2022-10-28 21:28:06 CEST-]LOG:  restored log file "0003.history" from
archive

cp: cannot stat '/PostgresWalLogArchive/new/0004.history': No such file
or directory

[2022-10-28 21:28:06 CEST-]LOG:  entering standby mode

[2022-10-28 21:28:06 CEST-]LOG:  restored log file "0003.history" from
archive

cp: cannot stat '/PostgresWalLogArchive/new/0003BF72002F': No
such file or directory

[2022-10-28 21:28:06 CEST-]LOG:  restored log file
"0002BF72002F" from archive

[2022-10-28 21:28:06 CEST-]FATAL:  requested timeline 3 is not a child of
this server's history

*[2022-10-28 21:28:06 CEST-]DETAIL:  Latest checkpoint is at BF72/2F0309C0
on timeline 2, but in the history of the requested timeline, the server
forked off from that timeline at B1D6/6000.*

*[2022-10-28 21:28:06 CEST-]LOG:  startup process (PID 1298266) exited with
exit code 1*

[2022-10-28 21:28:06 CEST-]LOG:  aborting startup due to startup process
failure

[2022-10-28 21:28:07 CEST-]LOG:  database system is shut down





Do you know if we can make it work without having to run pg_basebackup for
37TB which will take about 3 days to copy? Why it is requesting a different
timeline if it is a copy of the primary?


Primary replica parameters:



wal_level = replica

archive_mode = on

archive_command = 'cp -i %p /PostgresWalLogArchive/new/%f'



standby replica parameters:


primary_conninfo = 'host=XXX port=8300 user=postgres password=***'

restore_command = 'cp /PostgresWalLogArchive/new/%f %p'

archive_cleanup_command =
'/home/postgres/software/postgresql/bin/pg_archivecleanup
/PostgresWalLogArchive/new/%f %r'



Thank you very much for your help in advance.


Pilar de Teodoro


Re: About pg_basebackup

2022-05-19 Thread Laurenz Albe
On Thu, 2022-05-19 at 13:18 +0900, 菊池祐 wrote:
> I executed the pg_basebackup command to set up a replication configuration in 
> postgresql,
> but the following message appears and replication didn’t complete.
>
> 2741/2742 tablespaces (/var/lib/pgsql/9.2/bac12609150596/12609150596 kB 
> (100%), 2741/2742 tablespaces (/var/lib/pgsql/9.2/bac12609150596/12609150596 
> kB (100%), 2742/2742 tablespaces
> NOTICE: pg_stop_backup cleanup done, waiting for required WAL segments to be 
> archived
> WARNING: pg_stop_backup still waiting for all required WAL segments to be 
> archived (60 seconds elapsed)
> HINT: Check that your archive_command is executing properly. pg_stop_backup 
> can be canceled safely, but the database backup will not be usable without 
> all the WAL segments.
> WARNING: pg_stop_backup still waiting for all required WAL segments to be 
> archived (120 seconds elapsed)

I guess that your "archive_command" is hanging or failing.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




About pg_basebackup

2022-05-18 Thread 菊池祐
Hi,I executed the pg_basebackup command to set up a replication configuration in postgresql, but the following message appears and replication didn’t complete.2741/2742 tablespaces (/var/lib/pgsql/9.2/bac12609150596/12609150596 kB (100%), 2741/2742 tablespaces (/var/lib/pgsql/9.2/bac12609150596/12609150596 kB (100%), 2742/2742 tablespaces NOTICE: pg_stop_backup cleanup done, waiting for required WAL segments to be archived WARNING: pg_stop_backup still waiting for all required WAL segments to be archived (60 seconds elapsed) HINT: Check that your archive_command is executing properly. pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments. WARNING: pg_stop_backup still waiting for all required WAL segments to be archived (120 seconds elapsed) ・ ・ ・Please let me know how to deal with this. The environment is as follows and I attached postgresql.conf file.Version : 9.2OS : Cent 6.6

-6124946774920295119postgresql.conf
Description: Binary data


Re: pg_basebackup with hostssl ?

2022-02-02 Thread Laura Smith



Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐

On Wednesday, February 2nd, 2022 at 17:20, Adrian Klaver 
 wrote:

>
> Before you do that I would establish that you are connecting to the
>
> correct Postgres instance.
>
>

Good news, all up and running !

The new "postgresql.auto.conf" automagic is pure awesome. ;-)




Re: pg_basebackup with hostssl ?

2022-02-02 Thread Adrian Klaver

On 2/2/22 09:18, Laura Smith wrote:



Seems like I was looking for the wrong words on the right page !

Thanks, will try that.


Before you do that I would establish that you are connecting to the 
correct Postgres instance.




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: pg_basebackup with hostssl ?

2022-02-02 Thread Laura Smith



Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐

On Wednesday, February 2nd, 2022 at 16:50, Adrian Klaver 
 wrote:

> Not completely:
>
> https://www.postgresql.org/docs/current/app-pgbasebackup.html
>
> -d connstr
>
> --dbname=connstr
>
> Specifies parameters used to connect to the server, as a
>
> ; these will override any conflicting command line
>
> options.
>
> The option is called --dbname for consistency with other client
>
> applications, but because pg_basebackup doesn't connect to any
>
> particular database in the cluster, any database name in the connection
>
> string will be ignored.
>

Seems like I was looking for the wrong words on the right page !

Thanks, will try that.




Re: pg_basebackup with hostssl ?

2022-02-02 Thread David G. Johnston
On Wed, Feb 2, 2022 at 9:37 AM Laura Smith <
n5d9xq3ti233xiyif...@protonmail.ch> wrote:

> The point I am making is that pg_basebackup is connecting without
> encryption and I want to know how to tell it to use encryption. The docs
> are silent on the subject.
>

It is a client application that talks libpq.  All of them understand a
common set of environment variables:

https://www.postgresql.org/docs/current/libpq-envars.html

Many of those can also be supplied as part of the connection string.  These
include SSL mode control.

The question I would ask is whether psql connects by default using ssl in
the same basic configuration.  If so, then the inconsistency amounts to
pg_basebackup having an insecure default connection method while psql has a
secure one.

David J.


Re: pg_basebackup with hostssl ?

2022-02-02 Thread Adrian Klaver

On 2/2/22 08:37, Laura Smith wrote:



Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐

On Wednesday, February 2nd, 2022 at 16:30, Adrian Klaver 
 wrote:


I am not following. pg_basebackup is a stand alone client that is not

involved in replication other then establishing a starting point from

which to later establish a replication relationship.



Indeed. And that is exactly that I am trying to do (i.e. have master, want new 
slave).


The error is saying that whatever instance you are pointing

pg_basebackup at does not have entry in pg_hba.conf for the

host/user/encryption combination pg_basebackup is trying to connect as.



But it *DOES* have an entry, per my original message:
"hostsslreplicationall10.0.0.0/8md5"


From the OP:

"
Long story short, I've got the following in pg_hba.conf of the master:
hostsslreplicationall10.0.0.0/8md5

But the slave is complaining:
FATAL:  no pg_hba.conf entry for replication connection from host 
"10.1.2.3", user "myrepl", no encryption


"

The error is coming from a connection to the slave which does not have 
data yet, correct? Or a matching pg_hba.conf entry, it would seem.






And yes, that hba is loaded and live because other remote clients are happily 
connected to that server and thus reliant on a valid hba.conf.

The host is correct. The user is correct.

The point I am making is that pg_basebackup is connecting without encryption 
and I want to know how to tell it to use encryption. The docs are silent on the 
subject.


Not completely:

https://www.postgresql.org/docs/current/app-pgbasebackup.html

-d connstr
--dbname=connstr

Specifies parameters used to connect to the server, as a 
; these will override any conflicting command line 
options.


The option is called --dbname for consistency with other client 
applications, but because pg_basebackup doesn't connect to any 
particular database in the cluster, any database name in the connection 
string will be ignored.


Following the  link:

https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING

You can use that to set sslmode.

--
Adrian Klaver
adrian.kla...@aklaver.com




Re: pg_basebackup with hostssl ?

2022-02-02 Thread Laura Smith
Forgot to add that I also have :
"hostsslallall10.0.0.0/8md5"






Re: pg_basebackup with hostssl ?

2022-02-02 Thread Laura Smith



Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐

On Wednesday, February 2nd, 2022 at 16:30, Adrian Klaver 
 wrote:

> I am not following. pg_basebackup is a stand alone client that is not
>
> involved in replication other then establishing a starting point from
>
> which to later establish a replication relationship.
>

Indeed. And that is exactly that I am trying to do (i.e. have master, want new 
slave).

> The error is saying that whatever instance you are pointing
>
> pg_basebackup at does not have entry in pg_hba.conf for the
>
> host/user/encryption combination pg_basebackup is trying to connect as.
>

But it *DOES* have an entry, per my original message:
"hostsslreplicationall10.0.0.0/8md5"

And yes, that hba is loaded and live because other remote clients are happily 
connected to that server and thus reliant on a valid hba.conf.

The host is correct. The user is correct.

The point I am making is that pg_basebackup is connecting without encryption 
and I want to know how to tell it to use encryption. The docs are silent on the 
subject.




Re: pg_basebackup with hostssl ?

2022-02-02 Thread Adrian Klaver

On 2/2/22 07:48, Laura Smith wrote:

I've had a quick glance through the man page for pg_basebackup but can't see 
any flags to set ssl.

Long story short, I've got the following in pg_hba.conf of the master:
hostsslreplicationall10.0.0.0/8md5

But the slave is complaining:
FATAL:  no pg_hba.conf entry for replication connection from host "10.1.2.3", user 
"myrepl", no encryption

I suspect this is almost certainly because I'm using "hostssl" instead of 
"host".

But other than the obvious and undesirable quick-fix, how should I be calling 
pg_basebackup to make sure it uses encryption ?


I am not following. pg_basebackup is a stand alone client that is not 
involved in replication other then establishing a starting point from 
which to later establish a replication relationship.


The error is saying that whatever instance you are pointing 
pg_basebackup at does not have entry in pg_hba.conf for the 
host/user/encryption combination pg_basebackup is trying to connect as.






--
Adrian Klaver
adrian.kla...@aklaver.com




pg_basebackup with hostssl ?

2022-02-02 Thread Laura Smith
I've had a quick glance through the man page for pg_basebackup but can't see 
any flags to set ssl.

Long story short, I've got the following in pg_hba.conf of the master:
hostsslreplicationall10.0.0.0/8md5

But the slave is complaining:
FATAL:  no pg_hba.conf entry for replication connection from host "10.1.2.3", 
user "myrepl", no encryption

I suspect this is almost certainly because I'm using "hostssl" instead of 
"host".

But other than the obvious and undesirable quick-fix, how should I be calling 
pg_basebackup to make sure it uses encryption ?




Re: Issue with pg_basebackup v.11

2021-10-25 Thread Ninad Shah
Thanks Tom.


Regards,
Ninad Shah

On Sat, 23 Oct 2021 at 20:12, Tom Lane  wrote:

> Ninad Shah  writes:
> > Would keepalive setting address and mitigate the issue?
>
> [ shrug... ]  Maybe; nobody else has more information about this
> situation than you do.  I suggested something to experiment with.
>
> regards, tom lane
>


Re: Issue with pg_basebackup v.11

2021-10-23 Thread Tom Lane
Ninad Shah  writes:
> Would keepalive setting address and mitigate the issue?

[ shrug... ]  Maybe; nobody else has more information about this
situation than you do.  I suggested something to experiment with.

regards, tom lane




Re: Issue with pg_basebackup v.11

2021-10-22 Thread Ninad Shah
Hey Tom,

Thank you for your response. Actually, when we copy data using scp/rsync,
it works without any issue. But, it fails while attempting to transfer
using pg_basebackup.

Would keepalive setting address and mitigate the issue?


Regards,
Ninad Shah

On Fri, 22 Oct 2021 at 21:39, Tom Lane  wrote:

> Ninad Shah  writes:
> > What I observed is that it takes a couple of hours between below 2 lines.
>
> >  115454656/1304172127 kB (8%), 0/1 tablespace
> > (...atastaging/base/115868/154220.2)
> > pgbasebackup: could not read COPY data: could not receive data from
> server:
> > Connection timed out
>
> We have heard reports of network connections dropping while pg_basebackup
> is busy doing something disk-intensive such as fsync'ing.  The apparent
> 2-hour delay here does not mean that pg_basebackup was out to lunch for
> 2 hours; more likely that reflects the TCP timeout delay before the kernel
> realizes that the connection is lost.  The actual blame probably resides
> with some firewall or router that has a short timeout for idle
> connections.
>
> I'd try turning on fairly aggressive TCP keepalive settings for the
> connection, say keepalives_idle=30 or so.
>
> regards, tom lane
>


Re: Issue with pg_basebackup v.11

2021-10-22 Thread Tom Lane
Ninad Shah  writes:
> What I observed is that it takes a couple of hours between below 2 lines.

>  115454656/1304172127 kB (8%), 0/1 tablespace
> (...atastaging/base/115868/154220.2)
> pgbasebackup: could not read COPY data: could not receive data from server:
> Connection timed out

We have heard reports of network connections dropping while pg_basebackup
is busy doing something disk-intensive such as fsync'ing.  The apparent
2-hour delay here does not mean that pg_basebackup was out to lunch for
2 hours; more likely that reflects the TCP timeout delay before the kernel
realizes that the connection is lost.  The actual blame probably resides
with some firewall or router that has a short timeout for idle
connections.

I'd try turning on fairly aggressive TCP keepalive settings for the
connection, say keepalives_idle=30 or so.

regards, tom lane




Issue with pg_basebackup v.11

2021-10-22 Thread Ninad Shah
Hello experts,

I am facing an issue with a customer's production server while trying to
take backup using pg_basebackup.

Below is the log from pg_basebackup execution.

* 115338208/1304172127 kB (8%), 0/1 tablespace
(...atastaging/base/115868/154220.1)
 115355616/1304172127 kB (8%), 0/1 tablespace
(...atastaging/base/115868/154220.1)
 115372640/1304172127 kB (8%), 0/1 tablespace
(...atastaging/base/115868/154220.1)
 115389568/1304172127 kB (8%), 0/1 tablespace
(...atastaging/base/115868/154220.1)
 115405792/1304172127 kB (8%), 0/1 tablespace
(...atastaging/base/115868/154220.1)
 115423776/1304172127 kB (8%), 0/1 tablespace
(...atastaging/base/115868/154220.1)
 115440640/1304172127 kB (8%), 0/1 tablespace
(...atastaging/base/115868/154220.2)
 115454656/1304172127 kB (8%), 0/1 tablespace
(...atastaging/base/115868/154220.2)
pgbasebackup: could not read COPY data: could not receive data from server:
Connection timed out
pgbasebackup: removing contents of data directory
"/u01/PostgreSQL/11/datastaging"*

It copied nearly 110 GB of data and exited. Initially, we suspected it as a
network/OS issue. However, we tried to copy a 150 GB large file over the
network, which finished successfully.

What I observed is that it takes a couple of hours between below 2 lines.

 115454656/1304172127 kB (8%), 0/1 tablespace
(...atastaging/base/115868/154220.2)
pgbasebackup: could not read COPY data: could not receive data from server:
Connection timed out

In other words, it run for an hour, and later, it takes 2 hours before it
times out.

Can someone please help me out here?


Regards,
Ninad Shah


Re: Problem in pg_basebackup of 1.7 TB database while replicating in PostgreSQL 9.4.( Master and slave both are in postgreSQL 9.4)

2021-10-04 Thread Ron

On 10/4/21 2:28 AM, Amal Chakravarty wrote:
Hi all. I am working with a database of 1.7 TB size which is in PostgreSQL 
9.4. Kindly suggest how to take the pg_basebackup of such a huge data 
while replicating from master to slave.


Obligatory "9.4 is EOL" comment.

--
Angular momentum makes the world go 'round.




Re: Problem in pg_basebackup of 1.7 TB database while replicating in PostgreSQL 9.4.( Master and slave both are in postgreSQL 9.4)

2021-10-04 Thread Tomas Vondra

On 10/4/21 9:28 AM, Amal Chakravarty wrote:
Hi all. I am working with a database of 1.7 TB size which is in 
PostgreSQL 9.4. Kindly suggest how to take the pg_basebackup of such a 
huge data while replicating from master to slave.




And what exactly is the issue? Does it fail in some way or are you 
looking for a faster / more efficient way to transfer the data?



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Problem in pg_basebackup of 1.7 TB database while replicating in PostgreSQL 9.4.( Master and slave both are in postgreSQL 9.4)

2021-10-04 Thread Amal Chakravarty
Hi all. I am working with a database of 1.7 TB size which is in PostgreSQL
9.4. Kindly suggest how to take the pg_basebackup of such a huge data while
replicating from master to slave.

Regards,
Amal Chakravarty.


Re: pg_basebackup fails with "COPY stream ended"

2021-06-15 Thread Julien Rouhaud
On Tue, Jun 15, 2021 at 09:53:45PM -0700, Dipanjan Das wrote:
> 
> I am running "pg_basebackup -h  -U postgres -D  -X stream". It
> fails with either of the following two error messages:
> [...]
> WARNING:  terminating connection because of crash of another server process
> DETAIL:  The postmaster has commanded this server process to roll back the
> current transaction and exit, because another server process exited
> abnormally and possibly corrupted shared memory.

Do you have frequent emergency restarts like that?  You should start by
investigating on why this is happening.  The logs should tell you what process
crashed and what it was executing, could you report those messages?

We will also probably need a backtrace, see
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Getting_a_trace_from_a_randomly_crashing_backend
on how do to that.




pg_basebackup fails with "COPY stream ended"

2021-06-15 Thread Dipanjan Das
Hi,

I am running "pg_basebackup -h  -U postgres -D  -X stream". It
fails with either of the following two error messages:

ERROR: Backup failed copying files.
DETAILS: data transfer failure on directory
'/mnt/data/barman/base/20210615T212304/data'
pg_basebackup error:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
pg_basebackup: error: could not read COPY data: SSL SYSCALL error: EOF
detected
pg_basebackup: removing contents of data directory
"/mnt/data/barman/base/20210615T212304/data"

OR,

ERROR: Backup failed copying files.
DETAILS: data transfer failure on directory
'/mnt/data/barman/base/20210615T212849/data'
pg_basebackup error:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: error: COPY stream ended before last file was finished
pg_basebackup: removing contents of data directory
"/mnt/data/barman/base/20210615T212849/data"

I am running Postgres 12.7 on Ubuntu 20.04. Can anyone please help me
figure out what's going wrong?


Re: "missing chunk number 0 for toast value xxx in pg_toast_xxx" when pg_basebackup

2021-04-13 Thread Laurenz Albe
On Tue, 2021-04-13 at 06:36 -0700, MaXinjian wrote:
> > Why do you write the WAL to /tmp/pg_wal, only to later mount that at the
> > default location?
>
> pg_wal dir has size limitation, if wal files are too large, they will be
> overwrited, right?

No, they won't.
You could run out of space on the file system though.

> > I see nothing wrong with what you are doing, but I may have got lost in
> > your complicated procedure.
> > You don't happen to remove "backup_label", do you?
>
> em, I do remove backup_label...

Then that's your problem.
That will corrupt your data, because recovery starts from the wrong
checkpoint.

> 1. It means recovery.conf is not necessary, backup_label is necessary?

Yes, exactly.

> 2. Which key in backup_label is necessary?

The whole file needs to be preserved unchanged, just as it is.
Don't mess with that file.

> 3. I searched the log, it do has recoveried.
> Then, if there is no backup_label, what's the default START WAL LOCATION and
> CHECKPOINT LOCATION?

That's the catch.

"backup_label" is the *only way* to tell a backup from a crashed
PostgreSQL cluster.

If there is no "backup_label", PostgreSQL will get the latest checkpoint
from the control file (global/pg_control), which may well be later than
the checkpoint that started the backup, so you will miss to recover some
transactions.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: "missing chunk number 0 for toast value xxx in pg_toast_xxx" when pg_basebackup

2021-04-13 Thread Tom Lane
Ma Xinjian  writes:
> When I use pg_basebackup to backup and restore db(Let's call it A) to a
> standalone instance(Let's call it B), "missing chunk number 0 for toast
> value xxx in pg_toast_xxx" errors output.
> PG version: 10.3

10.3 is quite a few bug fixes ago.  Maybe you'd have better results
with the current release (10.16).

regards, tom lane




Re: "missing chunk number 0 for toast value xxx in pg_toast_xxx" when pg_basebackup

2021-04-13 Thread MaXinjian
> Why do you write the WAL to /tmp/pg_wal, only to later mount that at the
> default location?
pg_wal dir has size limitation, if wal files are too large, they will be
overwrited, right?

> I see nothing wrong with what you are doing, but I may have got lost in
> your complicated procedure.

> You don't happen to remove "backup_label", do you?
em, I do remove backup_label...

1. It means recovery.conf is not necessary, backup_label is necessary?
2. Which key in backup_label is necessary?
3. I searched the log, it do has recoveried.
Then, if there is no backup_label, what's the default START WAL LOCATION and
CHECKPOINT LOCATION?



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html




Re: "missing chunk number 0 for toast value xxx in pg_toast_xxx" when pg_basebackup

2021-04-13 Thread Laurenz Albe
On Tue, 2021-04-13 at 02:38 -0700, Ma Xinjian wrote:
> When I use pg_basebackup to backup and restore db(Let's call it A) to a
> standalone instance(Let's call it B), "missing chunk number 0 for toast
> value xxx in pg_toast_xxx" errors output.
> 
> PG version: 10.3
> pg_basebackup command:
>  /usr/pgsql-10/bin/pg_basebackup -h p-rdb-c01 -D /var/lib/pgsql/10/data
> -Xs -P -n --waldir=/tmp/pg_wal
>I have mounted a disk to /tmp/pg_wal before, then I will mount the disk
> to /var/lib/pgsql/10/data/pg_wal, so as to ensure completeness of wal
> records during backup.
>  
> Since I don't want B to be a standy server, I just want it to be a
> standalone server.
> I removed recovery.conf, then simply start postgresql-10.service. It turned
> out that postgresql-10.service
> can be started successfully. But when I use this postgresql(reindex, vacumm
> and so on), "missing chunk number 0 for toast value xxx in pg_toast_xxx"
> errors output.
> 
> When pg_basebackup, it will store wal under pg_wal, can't postgresql work
> with wal records locally?
> I think primary_conninfo in recovery.conf is just used to get newer wal
> records from A. Right?
> 
> I have also tested:
> If I start postgresql-10.service with recovery.conf firstly, then split it
> from postgresql cluster, everything works fine.
> 
> Above test seems proved that it is wal records's problem. I am really
> confused.

Your mail got me confused...

Why do you write the WAL to /tmp/pg_wal, only to later mount that at the
default location?

I see nothing wrong with what you are doing, but I may have got lost
in your complicated procedure.

You don't happen to remove "backup_label", do you?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





"missing chunk number 0 for toast value xxx in pg_toast_xxx" when pg_basebackup

2021-04-13 Thread Ma Xinjian
Hi,

When I use pg_basebackup to backup and restore db(Let's call it A) to a
standalone instance(Let's call it B), "missing chunk number 0 for toast
value xxx in pg_toast_xxx" errors output.

PG version: 10.3
pg_basebackup command:
 /usr/pgsql-10/bin/pg_basebackup -h p-rdb-c01 -D /var/lib/pgsql/10/data
-Xs -P -n --waldir=/tmp/pg_wal
   I have mounted a disk to /tmp/pg_wal before, then I will mount the disk
to /var/lib/pgsql/10/data/pg_wal, so as to ensure completeness of wal
records during backup.
 
Since I don't want B to be a standy server, I just want it to be a
standalone server.
I removed recovery.conf, then simply start postgresql-10.service. It turned
out that postgresql-10.service
can be started successfully. But when I use this postgresql(reindex, vacumm
and so on), "missing chunk number 0 for toast value xxx in pg_toast_xxx"
errors output.

When pg_basebackup, it will store wal under pg_wal, can't postgresql work
with wal records locally?
I think primary_conninfo in recovery.conf is just used to get newer wal
records from A. Right?

I have also tested:
If I start postgresql-10.service with recovery.conf firstly, then split it
from postgresql cluster, everything works fine.

Above test seems proved that it is wal records's problem. I am really
confused.

Regards
Ma Xinjian






--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html




Re: "invalid record length" after restoring pg_basebackup

2020-11-16 Thread Tom Lane
Dennis Jacobfeuerborn  writes:
> On 11/13/20 4:02 PM, Tom Lane wrote:
>> This looks quite normal to me.  If you'd pulled the power plug on the
>> primary system at the time you made this backup, you would likely see
>> the same message at the end of its crash recovery.  Some sort of
>> corrupt-WAL-entry report is expected at the end of WAL replay anytime
>> you didn't have a clean shutdown.

> But the system the backup was pulled from kept running fine. Also
> wouldn't that make re-attaching the system to the primary impossible
> since replication cannot be continued due to the broken WAL record?

There is no "broken WAL record".  There is only junk following the
primary's current WAL write point.

> What I would expect is that pg_basebackup only transfer healthy WAL
> entries so that a restored system can pick up right after that with
> streaming replication.

You need to adjust your expectations.  pg_basebackup doesn't parse the WAL
data, because it has no need to.  It just copies whole WAL segment files.

regards, tom lane




Re: "invalid record length" after restoring pg_basebackup

2020-11-16 Thread Dennis Jacobfeuerborn
On 11/13/20 4:02 PM, Tom Lane wrote:
> Dennis Jacobfeuerborn  writes:
>> All of this works fine and the logs report that the db reaches a
>> consistent recovery state but as last entry it reports an "invalid
>> record length":
> 
> This looks quite normal to me.  If you'd pulled the power plug on the
> primary system at the time you made this backup, you would likely see
> the same message at the end of its crash recovery.  Some sort of
> corrupt-WAL-entry report is expected at the end of WAL replay anytime
> you didn't have a clean shutdown.

But the system the backup was pulled from kept running fine. Also
wouldn't that make re-attaching the system to the primary impossible
since replication cannot be continued due to the broken WAL record?
What I would expect is that pg_basebackup only transfer healthy WAL
entries so that a restored system can pick up right after that with
streaming replication.

Regards,
 Dennis




Re: "invalid record length" after restoring pg_basebackup

2020-11-13 Thread Tom Lane
Dennis Jacobfeuerborn  writes:
> All of this works fine and the logs report that the db reaches a
> consistent recovery state but as last entry it reports an "invalid
> record length":

This looks quite normal to me.  If you'd pulled the power plug on the
primary system at the time you made this backup, you would likely see
the same message at the end of its crash recovery.  Some sort of
corrupt-WAL-entry report is expected at the end of WAL replay anytime
you didn't have a clean shutdown.

regards, tom lane




"invalid record length" after restoring pg_basebackup

2020-11-13 Thread Dennis Jacobfeuerborn
Hi,
I've run into a strange issue after restoring a backup that I created
using pg_basebackup on a standby instance. The command I use to create
the backup is this:

pg_basebackup -v --write-recovery-conf -h$BACKUP_HOST -p5432
-U$BACKUP_USER --format tar --wal-method stream --compress=2 -D
"$BACKUP_DIR"

This backup runs fine and produces a "base.tar.gz" and "pg_wal.tar.gz"
file. The server version is 11.7.

To restore the backup on another system I unpack the "base.tar.gz" file
into the data directory and "pg_wal.tar.gz" into the "pg_wal"
sub-directory. I then comment out the "primary_conninfo" directive in
the "recovery.conf" file since I don't want to replicate from the
primary on this system I just want to restore the state from the backup.

All of this works fine and the logs report that the db reaches a
consistent recovery state but as last entry it reports an "invalid
record length":

2020-11-13 12:25:34.266 UTC [1] LOG:  listening on IPv4 address
"0.0.0.0", port 5432
2020-11-13 12:25:34.266 UTC [1] LOG:  listening on IPv6 address "::",
port 5432
2020-11-13 12:25:34.269 UTC [1] LOG:  listening on Unix socket
"/var/run/postgresql/.s.PGSQL.5432"
2020-11-13 12:25:34.437 UTC [20] LOG:  database system was interrupted
while in recovery at log time 2020-11-03 14:36:00 UTC
2020-11-13 12:25:34.437 UTC [20] HINT:  If this has occurred more than
once some data might be corrupted and you might need to choose an
earlier recovery target.
2020-11-13 12:25:35.359 UTC [20] WARNING:  recovery command file
"recovery.conf" specified neither primary_conninfo nor restore_command
2020-11-13 12:25:35.359 UTC [20] HINT:  The database server will
regularly poll the pg_wal subdirectory to check for files placed there.
2020-11-13 12:25:35.359 UTC [20] LOG:  entering standby mode
2020-11-13 12:25:35.372 UTC [20] LOG:  redo starts at 932/9D123E70
2020-11-13 12:26:45.435 UTC [20] LOG:  consistent recovery state reached
at 933/AA63F600
2020-11-13 12:26:45.435 UTC [1] LOG:  database system is ready to accept
read only connections
2020-11-13 12:26:45.444 UTC [20] LOG:  invalid record length at
933/AA6DA660: wanted 24, got 0

Looking at the WAL data the last file is "0001093300AA" and
there is a file "archive_status/0001093300A9.done" but no
corresponding ".done" file for the "...AA" WAL file.

Running pg_waldump on the "...AA" file shows the following at the end:

$ pg_waldump 0001093300AA
...
rmgr: Heaplen (rec/tot): 54/54, tx:1743220, lsn:
933/AA6DA510, prev 933/AA6D8F08, desc: LOCK off 12: xid 1743220: flags 0
LOCK_ONLY KEYSHR_LOCK , blkref #0: rel 1663/13117/874105 blk 0
rmgr: Heaplen (rec/tot): 81/81, tx:1743220, lsn:
933/AA6DA548, prev 933/AA6DA510, desc: INSERT off 19, blkref #0: rel
1663/13117/868296 blk 0
rmgr: Btree   len (rec/tot): 80/80, tx:1743220, lsn:
933/AA6DA5A0, prev 933/AA6DA548, desc: INSERT_LEAF off 24, blkref #0:
rel 1663/13117/868299 blk 1
rmgr: Btree   len (rec/tot): 72/72, tx:1743220, lsn:
933/AA6DA5F0, prev 933/AA6DA5A0, desc: INSERT_LEAF off 23, blkref #0:
rel 1663/13117/915254 blk 2
rmgr: Transaction len (rec/tot): 34/34, tx:1743220, lsn:
933/AA6DA638, prev 933/AA6DA5F0, desc: COMMIT 2020-11-03 14:49:29.205562 UTC
pg_waldump: FATAL:  error in WAL record at 933/AA6DA638: invalid record
length at 933/AA6DA660: wanted 24, got 0

It looks like the WAL in the backup is inconsistent at the end but I'd
expect pg_basebackup to create a clean backup. Is this something I
should expect or is this a problem with pg_basebackup?

Regards,
  Dennis




Re: pg_basebackup + delta base backups

2020-08-15 Thread Stephen Frost
Greetings,

* Christopher Pereira (krip...@imatronix.cl) wrote:
> On 26-May-20 10:20, Stephen Frost wrote:
> >"out of sync" is a bit of an odd concept, but having a replica fall
> >behind a long way is certainly something that can happen and may require
> >a rebuild from a backup (or from a new sync off of the primary in some
> >other way, as you suggest below).  In a situation where there's async
> >replication happening and you promote a replica to take over, that's
> >definitely a case where you might also have to rebuild the former
> >primary.
> 
> Yes, a common case with async streaming is when primary (A) goes down and
> replica is promoted as a new master (B).
> Then A comes back and has some data that was not streamed to B so pg_rewind
> is useless.
> 
> I wonder if there is some option to just discard this branched data from A
> in order to start as a new replica.

... This is exactly what pg_rewind is for, so I'm not really sure what
you're asking about here.

> I noticed that pg_rewind is useless even when both DBs are identical
> (according to pg_dumpall | md5sum).

That isn't really a meaningful or useful thing to consider.

> >>As you said, all the pieces are there and it would be quite easy to write a
> >>new "pg_basebackup_delta" script that could be executed on the standby host
> >>to:
> >>
> >>1) setup a pgBackRest repo on the primary host (via SSH)
> >>
> >>2) create a backup on the primary host (via SSH)
> >>
> >>3) do a delta restore on the standby
> >>
> >>Even when the repository on the primary host is only created temporarily
> >>(and require double storage, resources, etc), it may still be worth
> >>considering the traffic that can be saved by doing a delta restore on a
> >>standby host in a different region, right?
> >So...  There's actually a way to do this with pgbackrest, but it doesn't
> >support the delta capability.
> 
> If I understood correctly the method you described, you were basically doing
> a "backup" between A (primary) and B (repo) and in such a way the repo is
> then compatible with the pg_data structure, but without delta support (ie.
> transfering the whole database)?

Yes.

> Delta support is critical for VLDBs, so I see two alternatives to replace
> pg_basebackup with pgbackrest to rebuild a replica:
> 
> 1) Create a temporary repo on the primary
> 
> 2) Create a temporary repo on the replica

Yes, if you use a repo then you can use pgbackrest's incremental and
delta features.

> All configurations would be undone after the replica has been rebuilt and
> both alternatives would be using delta over the wire.
> In your opinion, which alternative is better considering network traffic?

If you want to minimize network traffic then probably using either
incremental backups or delta restore would make the most sense.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: canceling statement due to conflict with recovery after pg_basebackup

2020-06-03 Thread Kyotaro Horiguchi
At Wed, 3 Jun 2020 10:07:14 +0300, "Andrus"  wrote in 
> Hi!

Hi.

> Async binary replication hot standby was started after pg_basebackup.
> Running query in slave throws error 
> 
> ERROR:  canceling statement due to conflict with recovery
> 
> Why ?

As written in the messages.
> 2020-06-03 09:54:23 EEST 85.253.131.166 user@sba DETAIL: User query
> might have needed to see row versions that must be removed.

Despite the master had removed some rows by vacuum and that must be
reflected to your standby, the standby cannot do that because of the
transaction that has started before the rows are vacuumed. More
technically, the transaction on the standby was using a snapshot with
older transaction ids than the vacuum cutoff transaction id on the
master. So the standby needed to cancel the the statement or session
in order to continue replication.

> Query should return table and other sizes in decreasing order.
> How to improve it so that this error does not occur.

Hot-standby-feedback would work.

https://www.postgresql.org/docs/12/runtime-config-replication.html#GUC-HOT-STANDBY-FEEDBACK

> Log:
> 
> 2020-06-03 09:40:52 EEST LOG: database system was interrupted; last
> known up at 2020-06-03 07:59:56 EEST
> 2020-06-03 09:41:10 EEST  LOG:  entering standby mode
> 2020-06-03 09:41:10 EEST  LOG:  redo starts at 2E2/28
> 2020-06-03 09:41:19 EEST LOG: consistent recovery state reached at
> 2E2/B5A56C8
> 2020-06-03 09:41:19 EEST LOG: database system is ready to accept read
> only connections
> 2020-06-03 09:41:19 EEST LOG: started streaming WAL from primary at
> 2E2/C00 on timeline 1
> 2020-06-03 09:54:23 EEST 85.253.131.166 user@sba ERROR: canceling
> statement due to conflict with recovery
> 2020-06-03 09:54:23 EEST 85.253.131.166 user@sba DETAIL: User query
> might have needed to see row versions that must be removed.
> 2020-06-03 09:54:23 EEST 85.253.131.166 user@sba STATEMENT:  select
> company_name(n.nspname)::char(20) as company,
>  relname::char(25),
>pg_size_pretty(pg_total_relation_size(c.oid))::char(10) as totalsize,
> n.nspname::char(12),
>case
>   when c.relkind='i' then 'index'
>when c.relkind='t' then 'toast'
>   when c.relkind='r' then 'table'
>   when c.relkind='v' then 'view'
>   when c.relkind='c' then 'composite type'
>   when c.relkind='S' then 'sequence'
>else c.relkind::text
>  end ::char(14) as "type"
> from
>pg_class c
>left join pg_namespace n on n.oid = c.relnamespace
>left join pg_tablespace t on t.oid = c.reltablespace
> where
>(pg_total_relation_size(c.oid)>>21)>0 and c.relkind!='t'
> order by
>pg_total_relation_size(c.oid) desc

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




canceling statement due to conflict with recovery after pg_basebackup

2020-06-03 Thread Andrus

Hi!

Async binary replication hot standby was started after pg_basebackup.
Running query in slave throws error 


ERROR:  canceling statement due to conflict with recovery

Why ?
Query should return table and other sizes in decreasing order.
How to improve it so that this error does not occur.


Log:

2020-06-03 09:40:52 EEST  LOG:  database system was interrupted; last known up 
at 2020-06-03 07:59:56 EEST
2020-06-03 09:41:10 EEST  LOG:  entering standby mode
2020-06-03 09:41:10 EEST  LOG:  redo starts at 2E2/28
2020-06-03 09:41:19 EEST  LOG:  consistent recovery state reached at 2E2/B5A56C8
2020-06-03 09:41:19 EEST  LOG:  database system is ready to accept read only 
connections
2020-06-03 09:41:19 EEST  LOG:  started streaming WAL from primary at 
2E2/C00 on timeline 1
2020-06-03 09:54:23 EEST 85.253.131.166 user@sba ERROR:  canceling statement 
due to conflict with recovery
2020-06-03 09:54:23 EEST 85.253.131.166 user@sba DETAIL:  User query might have 
needed to see row versions that must be removed.
2020-06-03 09:54:23 EEST 85.253.131.166 user@sba STATEMENT:  select
company_name(n.nspname)::char(20) as company,
 relname::char(25),
   pg_size_pretty(pg_total_relation_size(c.oid))::char(10) as totalsize,
n.nspname::char(12),
   case
  when c.relkind='i' then 'index'
   when c.relkind='t' then 'toast'
  when c.relkind='r' then 'table'
  when c.relkind='v' then 'view'
  when c.relkind='c' then 'composite type'
  when c.relkind='S' then 'sequence'
   else c.relkind::text
 end ::char(14) as "type"
from
   pg_class c
   left join pg_namespace n on n.oid = c.relnamespace
   left join pg_tablespace t on t.oid = c.reltablespace
where
   (pg_total_relation_size(c.oid)>>21)>0 and c.relkind!='t'
order by
   pg_total_relation_size(c.oid) desc

Andrus.




Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-06-01 Thread Paul Förster
Hi Andrus,

> On 01. Jun, 2020, at 12:26, Andrus  wrote:
> My script does the same thing as your comands.

no, it does not. And as long as you refuse to understand that, I can't help you.

> I want to create replication server.

no, you want to create a streaming replica.

> Old cluster is empty, from initdb. Backup is not needed

then why the mv stuff?

> pg_basebackup uses environment varuables if not specified in command line. So 
> my script does the same thing.

have you looked at the other options?

> I tried
> sudo --user=postgres pg_basebackup 
> but got error
> could not change directory to "/root": Permission denied

try
sudo su - postgres pg_basebackup 

Note the "-"! It's essential to get the environment of postgres, which is what 
you want.

Cheers,
Paul



Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-06-01 Thread Andrus

Hi!


How to create replication server ?
I always do it this way and it work for me:
$ pg_basebackup -h ${PGHOST} -p ${PGPORT} -U replicator -W -R -D ${PGDATA} -P 
-v -Fp -Xs
After that, I edit ${PGDATA}/postgresql.conf and (w/ PostgreSQL 11 and older 
${PGDATA}/recovery.conf) to make it do what I want and
then I just launch it:
$ pg_ctl start


My script does the same thing as your comands.


From that moment onward, it replicates and applies to the replica. Checks in 
pg_stat_replication on the master and
pg_stat_wal_receiver on the replica >confirm that. They also show the WAL 
switches.
To provoke a WAL switch I always do:
postgres=# checkpoint; select pg_switch_wal();
CHECKPOINT
pg_switch_wal
I just don't understand what you're trying to achieve here.


I want to create replication server.


My guess is, you want to stop and backup the old database cluster,


Old cluster is empty, from initdb. Backup is not needed


then create a new one in its old directory, right?


pg_basebackup creates new main directory.


In this case, you probably need to change your script to something like this:
PGHOST=remote.example.com
PGPASSWORD=mypass
PGUSER=replikaator
PGDATA=/var/lib/postgresql/12/main
export PGHOST PGPASSWORD PGUSER PGDATA
/etc/init.d/postgresql stop
mv ${PGDATA} /var/lib/postgresql/12/mainennebaasbakuppi
pg_basebackup -h ${PGHOST} -p ${PGPORT} -U ${PGUSER} -W -R -D ${PGDATA} -P -v 
-Fp -Xs
/etc/init.d/postgresql start


pg_basebackup uses environment varuables if not specified in command line. So 
my script does the same thing.


Note that my invocation of pg_basebackup asks for the replicator password. This 
is intended. You'd probably want to change that.
Also, no need to play around with ownership and permissions. Do it as "postgres", not as 
"root".


I tried

sudo --user=postgres pg_basebackup 

but got error

could not change directory to "/root": Permission denied

Andrus.





Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-06-01 Thread Magnus Hagander
On Mon, Jun 1, 2020 at 10:17 AM Andrus  wrote:

> Hi!
>
> > I have tried to re-initiate replica serveral times in low-use time but
> this error occurs again.
> >remove the whole replica's PGDATA/* and do a pg_basebackup again. But
> before that, make sure wal_keep_segments in big enough on the
> >master and,
>
> I renamed whole cluster before pg_basebackup
>
> >just as much important, do a vacuumdb -a (takes much space during the
> process) and use archiving!
>
> I run vacuumdb --full --all before pg_basebackup
>
> > If named replication slot is used commands like
> > vacuumdb --all --full
> > will cause main server crash due to disk space limit. pg_wal directory
> will occupy free disk space. After that main server stops.
> >>if you have disk constraints you will run into trouble sooner or later
> anyway. Make sure, you have enough disk space. There's no
> >>way around that anyway.
>
> This space is sufficient for base backup and replication.
>
> >> I tried using wal_keep_segments =180
> >> Will setting wal_keep_segments to higher value allw replication start
> after pg_basebackup ?
> >it depends. If you start the replica immediately and don't wait for hours
> or days, you should be good to go. But that depends on
> >different factors, for example, how >many WAL files are written during
> the pg_basebackup and pg_ctl start of the replica. If more
> >than 180 WALs have gone by on the master because it is really busy, >then
> you're probably lost again. Point being, you'll have to
> >launch the replica before WALs are expired!
> >Again: Make sure you have enough disk space, use archiving and use a
> replication slot.
>
> I tried with wal_keep_segments=360 but problem persisists.
> Server generates lot of less than 300 wal files.
>

Have you verified that wal_keep_segments actually end up at 360, by
connecting to the database and issuing SHOW wal_keep_segments? I've seen
far too many examples of people who accidentally had a second line that
overrode the one they thought they changed, and thus still ran with a lower
number.


Shell script starts server after pg_basebackup completes automatically:
>
> PGHOST=example.com
> PGPASSWORD=mypass
> PGUSER=replikaator
> export PGHOST  PGPASSWORD PGUSER
> /etc/init.d/postgresql stop
> mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainennebaasbakuppi
> pg_basebackup --verbose --progress --write-recovery-conf -D
> /var/lib/postgresql/12/main
> chmod --recursive --verbose 0700 /var/lib/postgresql/12/main
> chown -Rv postgres:postgres /var/lib/postgresql/12/main
> /etc/init.d/postgresql start
>

Do you get any useful output from the -v part of pg_basebackup? It should
for example tell you the exact start and stop point in the wal during the
basebackup, that can be  correlated to the msising file.

Normally the window between end of pg_basebackup and start of the actual
service is not big enough to cause a problem (since v12 will do a streaming
receive of the logs *during* the backup -- it could be a big problem before
that was possible, or if one forgot to enable it before it was the
default), and it certainly sounds weird that it should be in your case,
unless the chmod and chown commands take a *long* time. But if it is, there
is nothing preventing you from creating a slot just during setup and then
get rid of it. That is:

1. create slot
2. pg_basebackup with slot
3. start replication with slot
4. restart replication without slot  once it's caught up
5. drop slot

However, if you want reliable replication, you really should have a slot.
Or at least, you should have either a slot *or* log archiving that's
read-accessible from the replica.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ <http://www.hagander.net/>
 Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>


Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-06-01 Thread Paul Förster
Hi Andrus,

> On 01. Jun, 2020, at 10:17, Andrus  wrote:
> Shell script starts server after pg_basebackup completes automatically:
> 
> PGHOST=example.com
> PGPASSWORD=mypass
> PGUSER=replikaator
> export PGHOST  PGPASSWORD PGUSER
> /etc/init.d/postgresql stop
> mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainennebaasbakuppi
> pg_basebackup --verbose --progress --write-recovery-conf -D 
> /var/lib/postgresql/12/main
> chmod --recursive --verbose 0700 /var/lib/postgresql/12/main
> chown -Rv postgres:postgres /var/lib/postgresql/12/main
> /etc/init.d/postgresql start
> 
> How to create replication server ?

I always do it this way and it work for me:

$ pg_basebackup -h ${PGHOST} -p ${PGPORT} -U replicator -W -R -D ${PGDATA} -P 
-v -Fp -Xs

After that, I edit ${PGDATA}/postgresql.conf and (w/ PostgreSQL 11 and older 
${PGDATA}/recovery.conf) to make it do what I want and then I just launch it:

$ pg_ctl start

From that moment onward, it replicates and applies to the replica. Checks in 
pg_stat_replication on the master and pg_stat_wal_receiver on the replica 
confirm that. They also show the WAL switches.

To provoke a WAL switch I always do:

postgres=# checkpoint; select pg_switch_wal();
CHECKPOINT
 pg_switch_wal 
---
 C/5128
(1 row)

I just don't understand what you're trying to achieve here. My guess is, you 
want to stop and backup the old database cluster, then create a new one in its 
old directory, right? In this case, you probably need to change your script to 
something like this:

PGHOST=remote.example.com
PGPASSWORD=mypass
PGUSER=replikaator
PGDATA=/var/lib/postgresql/12/main
export PGHOST PGPASSWORD PGUSER PGDATA

/etc/init.d/postgresql stop
mv ${PGDATA} /var/lib/postgresql/12/mainennebaasbakuppi
pg_basebackup -h ${PGHOST} -p ${PGPORT} -U ${PGUSER} -W -R -D ${PGDATA} -P -v 
-Fp -Xs
/etc/init.d/postgresql start

Note that my invocation of pg_basebackup asks for the replicator password. This 
is intended. You'd probably want to change that.

Also, no need to play around with ownership and permissions. Do it as 
"postgres", not as "root".

Cheers,
Paul



Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-05-31 Thread Andrus

Hi!


I want to create  hot standby async server using
/etc/init.d/postgresql stop
mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainold
pg_basebackup --write-recovery-conf -D /var/lib/postgresql/12/main

I don't see where the base backup is being taken from just where it is going.


It is taken from VPS server over 20 Mbit public internet connection.
Both servers are running Debian Linux.


I dont receive WALs.
If you are doing binary replication then you are receiving WALs. It just a matter of whether you are streaming them or shipping 
them over complete.


Using wal_keep_segments=360 also causes same wal file not found error after 
pg_basebackup.
master server has 6GB wal files.  wal log during pg_basebackup is much slower 
than 360.
Maybe pg_basebackup skips wal segments . Maybe using wal_compression=on causes 
the issue.

How to fix this ?
How to create base backup so that cluster is tranferred over internet faster? Maybe it can transferred in compressed form over 
internet.


Andrus.




Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-05-31 Thread Adrian Klaver

On 5/31/20 2:43 PM, Andrus wrote:

Hi!


In addition to my most recent questions:
What are you trying to achieve?


I want to create  hot standby async server using
/etc/init.d/postgresql stop
mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainold
pg_basebackup --write-recovery-conf -D /var/lib/postgresql/12/main


I don't see where the base backup is being taken from just where it is 
going.



chmod --recursive --verbose 0700 /var/lib/postgresql/12/main
chown -Rv postgres:postgres /var/lib/postgresql/12/main
/etc/init.d/postgresql start

In other words why do a pg_basebackup if you have a standby receiving 
WALs?


I dont receive WALs.


If you are doing binary replication then you are receiving WALs. It just 
a matter of whether you are streaming them or shipping them over complete.




Andrus.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-05-31 Thread Andrus

Hi!

Will wal_keep_segments  keep segments also if named replication slot is 
lot used ?
Well if you are using a replication slot there is no point in using 
wal_keep_segments. Slots where created in, part at least, so you did not 
have to guess at a wal_keep_segments number.


I dont use slot.

To really answer this we will need to see the exact commands you are 
using and the sequence they are done in.


Replication server is created using

/etc/init.d/postgresql stop
mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainold
pg_basebackup --write-recovery-conf -D /var/lib/postgresql/12/main
chmod --recursive --verbose 0700 /var/lib/postgresql/12/main
chown -Rv postgres:postgres /var/lib/postgresql/12/main
/etc/init.d/postgresql start

Andrus.





Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-05-31 Thread Andrus

Hi!


In addition to my most recent questions:
What are you trying to achieve?


I want to create  hot standby async server using 


/etc/init.d/postgresql stop
mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainold
pg_basebackup --write-recovery-conf -D /var/lib/postgresql/12/main
chmod --recursive --verbose 0700 /var/lib/postgresql/12/main
chown -Rv postgres:postgres /var/lib/postgresql/12/main
/etc/init.d/postgresql start


In other words why do a pg_basebackup if you have a standby receiving WALs?


I dont receive WALs.

Andrus.




Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-05-31 Thread Paul Förster
Hi Andrus,

> On 31. May, 2020, at 22:56, Andrus  wrote:
> wal files are not archieved.

IMHO a bad decision. They should be. Now you're in the situation where you see 
why.

> I have tried to re-initiate replica serveral times in low-use time but this 
> error occurs again.

remove the whole replica's PGDATA/* and do a pg_basebackup again. But before 
that, make sure wal_keep_segments in big enough on the master and, just as much 
important, do a vacuumdb -a (takes much space during the process) and use 
archiving!

> If named replication slot is used commands like
> vacuumdb --all --full
> will cause main server crash due to disk space limit. pg_wal directory will 
> occupy free disk space. After that main server stops.

if you have disk constraints you will run into trouble sooner or later anyway. 
Make sure, you have enough disk space. There's no way around that anyway.

> I tried using wal_keep_segments =180
> Will setting wal_keep_segments to higher value allw replication start after 
> pg_basebackup ?

it depends. If you start the replica immediately and don't wait for hours or 
days, you should be good to go. But that depends on different factors, for 
example, how many WAL files are written during the pg_basebackup and pg_ctl 
start of the replica. If more than 180 WALs have gone by on the master because 
it is really busy, then you're probably lost again. Point being, you'll have to 
launch the replica before WALs are expired!

Again: Make sure you have enough disk space, use archiving and use a 
replication slot.

Cheers,
Paul



Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-05-31 Thread Adrian Klaver

On 5/31/20 2:03 PM, Andrus wrote:

Hi!


I'm guessing are looking for:
https://www.postgresql.org/docs/12/runtime-config-replication.html
26.2.6. Replication Slots
Replication slots provide an automated way to ensure that the master 
does not remove WAL segments until they have been received by all 
standbys, and that the master does not remove rows which could cause a 
recovery conflict even when the standby is disconnected.


Using replication slot can cause pg_wal directoy to occupy all free disk 
space and after that server stop respondig.



This is spelled out here:
https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION 

"If you use streaming replication without file-based continuous 
archiving, the server might recycle old WAL segments before the 
standby has received them. If this occurs, the standby will need to be 
reinitialized from a new base backup. You can avoid this by setting 
wal_keep_segments to a value large enough to ensure that WAL segments 
are not recycled too early, or by configuring a replication slot for 
the standby. 


Will wal_keep_segments  keep segments also if named replication slot is 
lot used ?


In addition to my most recent questions:

What are you trying to achieve?

In other words why do a pg_basebackup if you have a standby receiving WALs?



Andrus.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-05-31 Thread Adrian Klaver

On 5/31/20 2:03 PM, Andrus wrote:

Hi!


I'm guessing are looking for:
https://www.postgresql.org/docs/12/runtime-config-replication.html
26.2.6. Replication Slots
Replication slots provide an automated way to ensure that the master 
does not remove WAL segments until they have been received by all 
standbys, and that the master does not remove rows which could cause a 
recovery conflict even when the standby is disconnected.


Using replication slot can cause pg_wal directoy to occupy all free disk 
space and after that server stop respondig.



This is spelled out here:
https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION 

"If you use streaming replication without file-based continuous 
archiving, the server might recycle old WAL segments before the 
standby has received them. If this occurs, the standby will need to be 
reinitialized from a new base backup. You can avoid this by setting 
wal_keep_segments to a value large enough to ensure that WAL segments 
are not recycled too early, or by configuring a replication slot for 
the standby. 


Will wal_keep_segments  keep segments also if named replication slot is 
lot used ?


Well if you are using a replication slot there is no point in using 
wal_keep_segments. Slots where created in, part at least, so you did not 
have to guess at a wal_keep_segments number.


To really answer this we will need to see the exact commands you are 
using and the sequence they are done in.




Andrus.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-05-31 Thread Andrus

Hi!


I'm guessing are looking for:
https://www.postgresql.org/docs/12/runtime-config-replication.html
26.2.6. Replication Slots
Replication slots provide an automated way to ensure that the master 
does not remove WAL segments until they have been received by all 
standbys, and that the master does not remove rows which could cause a 
recovery conflict even when the standby is disconnected.


Using replication slot can cause pg_wal directoy to occupy all free disk space 
and after that server stop respondig.


This is spelled out here:
https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION
"If you use streaming replication without file-based continuous 
archiving, the server might recycle old WAL segments before the standby 
has received them. If this occurs, the standby will need to be 
reinitialized from a new base backup. You can avoid this by setting 
wal_keep_segments to a value large enough to ensure that WAL segments 
are not recycled too early, or by configuring a replication slot for the 
standby. 


Will wal_keep_segments  keep segments also if named replication slot is lot 
used ?

Andrus.




Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-05-31 Thread Paul Förster
Hi Andrus,

> On 31. May, 2020, at 21:47, Andrus  wrote:
> replikaator@[unknown] ERROR:  requested WAL segment 000102CF00E9 
> has already been removed

the message says it all. You need to copy the WAL file 000102CF00E9 
and newer to the replica's pg_wal directory because it has been removed already 
on the master site. Obviously, you can only do that if the files have been 
archived. Otherwise, you'd have to fully reinitiate the replica.

Replication will start again as soon as the requested WALs are copied over to 
the replica.

Use a replication slot to avoid this situation. If you use a replication slot, 
the master will only remove WAL files which are not needed by any one replica.

Hope this helps.

Cheers,
Paul



Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-05-31 Thread Adrian Klaver

On 5/31/20 12:47 PM, Andrus wrote:

Hi!

pg_basebackup takes 8 hours.
After it is finished,  replication slave does not start:

LOG:  consistent recovery state reached at 2DE/985A5BE0
LOG:  database system is ready to accept read only connections
LOG:  started streaming WAL from primary at 2DE/9900 on timeline 1
replikaator@[unknown] LOG:  received replication command: SHOW 
data_directory_mode

replikaator@[unknown] LOG:  received replication command: IDENTIFY_SYSTEM
replikaator@[unknown] LOG:  received replication command: 
START_REPLICATION 2CF/E900 TIMELIN
replikaator@[unknown] ERROR:  requested WAL segment 
000102CF00E9 has already been re
replikaator@[unknown] LOG:  received replication command: SHOW 
data_directory_mode

replikaator@[unknown] LOG:  received replication command: IDENTIFY_SYSTEM
replikaator@[unknown] LOG:  received replication command: 
START_REPLICATION 2CF/E900 TIMELIN
replikaator@[unknown] ERROR:  requested WAL segment 
000102CF00E9 has already been removed


There's your problem ^


...

i tried it again and same error occured.

How to force replication to start?


If the WAL is gone you can't.

More below.



I increased wal parameters in master to

wal_compression=on
max_wal_size = 5GB
min_wal_size = 4GB # was 80MB
wal_keep_segments= 360 # was 180

Will this allow replication to start after pg_basebackup ?

According to doc min_wal_size and wal_keep_segments both keep the 
minimum number of wal segments for replication.


No it doesn't:

https://www.postgresql.org/docs/12/runtime-config-replication.html

"wal_keep_segments (integer)

Specifies the minimum number of past log file segments kept in the 
pg_wal directory, in case a standby server needs to fetch them for 
streaming replication. Each segment is normally 16 megabytes. If a 
standby server connected to the sending server falls behind by more than 
wal_keep_segments segments, the sending server might remove a WAL 
segment still needed by the standby, in which case the replication

^^
connection will be terminated. Downstream connections will also
^
eventually fail as a result. (However, the standby server can recover by 
fetching the segment from archive, if WAL archiving is in use.)


...

https://www.postgresql.org/docs/12/runtime-config-wal.html

"min_wal_size (integer)

As long as WAL disk usage stays below this setting, old WAL files 
are always recycled for future use at a checkpoint, rather than removed. 
This can be used to ensure that enough WAL space is reserved to handle 
spikes in WAL usage, for example when running large batch jobs. If this 
value is specified without units, it is taken as megabytes. The default 
is 80 MB. This parameter can only be set in the postgresql.conf file or 
on the server command line.

"

I'm guessing are looking for:

https://www.postgresql.org/docs/12/runtime-config-replication.html

"
26.2.6. Replication Slots

Replication slots provide an automated way to ensure that the master 
does not remove WAL segments until they have been received by all 
standbys, and that the master does not remove rows which could cause a 
recovery conflict even when the standby is disconnected.


...
"

This is spelled out here:

https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION

"If you use streaming replication without file-based continuous 
archiving, the server might recycle old WAL segments before the standby 
has received them. If this occurs, the standby will need to be 
reinitialized from a new base backup. You can avoid this by setting 
wal_keep_segments to a value large enough to ensure that WAL segments 
are not recycled too early, or by configuring a replication slot for the 
standby. If you set up a WAL archive that's accessible from the standby, 
these solutions are not required, since the standby can always use the 
archive to catch up provided it retains enough segments."





Why those parameters are duplicated?

Andrus.





--
Adrian Klaver
adrian.kla...@aklaver.com




How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

2020-05-31 Thread Andrus

Hi!

pg_basebackup takes 8 hours.
After it is finished,  replication slave does not start:

LOG:  consistent recovery state reached at 2DE/985A5BE0
LOG:  database system is ready to accept read only connections
LOG:  started streaming WAL from primary at 2DE/9900 on timeline 1
replikaator@[unknown] LOG:  received replication command: SHOW 
data_directory_mode
replikaator@[unknown] LOG:  received replication command: IDENTIFY_SYSTEM
replikaator@[unknown] LOG:  received replication command: START_REPLICATION 
2CF/E900 TIMELIN
replikaator@[unknown] ERROR:  requested WAL segment 000102CF00E9 
has already been re
replikaator@[unknown] LOG:  received replication command: SHOW 
data_directory_mode
replikaator@[unknown] LOG:  received replication command: IDENTIFY_SYSTEM
replikaator@[unknown] LOG:  received replication command: START_REPLICATION 
2CF/E900 TIMELIN
replikaator@[unknown] ERROR:  requested WAL segment 000102CF00E9 
has already been removed
...

i tried it again and same error occured.

How to force replication to start?

I increased wal parameters in master to

wal_compression=on
max_wal_size = 5GB
min_wal_size = 4GB # was 80MB
wal_keep_segments= 360 # was 180

Will this allow replication to start after pg_basebackup ?

According to doc min_wal_size and wal_keep_segments both keep the minimum 
number of wal segments for replication.
Why those parameters are duplicated?

Andrus.




Re: pg_basebackup + delta base backups

2020-05-27 Thread Christopher Pereira

On 26-May-20 10:20, Stephen Frost wrote:

[...]

"out of sync" is a bit of an odd concept, but having a replica fall
behind a long way is certainly something that can happen and may require
a rebuild from a backup (or from a new sync off of the primary in some
other way, as you suggest below).  In a situation where there's async
replication happening and you promote a replica to take over, that's
definitely a case where you might also have to rebuild the former
primary.


Hi Stepehen,

Yes, a common case with async streaming is when primary (A) goes down 
and replica is promoted as a new master (B).
Then A comes back and has some data that was not streamed to B so 
pg_rewind is useless.


I wonder if there is some option to just discard this branched data from 
A in order to start as a new replica.


I noticed that pg_rewind is useless even when both DBs are identical 
(according to pg_dumpall | md5sum).



[...]

As you said, all the pieces are there and it would be quite easy to write a
new "pg_basebackup_delta" script that could be executed on the standby host
to:

1) setup a pgBackRest repo on the primary host (via SSH)

2) create a backup on the primary host (via SSH)

3) do a delta restore on the standby

Even when the repository on the primary host is only created temporarily
(and require double storage, resources, etc), it may still be worth
considering the traffic that can be saved by doing a delta restore on a
standby host in a different region, right?

So...  There's actually a way to do this with pgbackrest, but it doesn't
support the delta capability.


If I understood correctly the method you described, you were basically 
doing a "backup" between A (primary) and B (repo) and in such a way the 
repo is then compatible with the pg_data structure, but without delta 
support (ie. transfering the whole database)?


Delta support is critical for VLDBs, so I see two alternatives to 
replace pg_basebackup with pgbackrest to rebuild a replica:


1) Create a temporary repo on the primary

2) Create a temporary repo on the replica

All configurations would be undone after the replica has been rebuilt 
and both alternatives would be using delta over the wire.

In your opinion, which alternative is better considering network traffic?

Thanks,

Christopher





Re: pg_basebackup + incremental base backups

2020-05-26 Thread Stephen Frost
Greetings,

* Christopher Pereira (krip...@imatronix.cl) wrote:
> On 24-May-20 15:48, Stephen Frost wrote:
> >That really shouldn't be possible.  I'm very curious as to exactly what
> >happened that resulted in your primary/replica being 'out of sync', as
> >you say.
> 
> Actually this was more a hypothetical question to find a solution in case
> some day one of our standby clusters goes out of sync and we have to rebuild
> it having a very big database.

"out of sync" is a bit of an odd concept, but having a replica fall
behind a long way is certainly something that can happen and may require
a rebuild from a backup (or from a new sync off of the primary in some
other way, as you suggest below).  In a situation where there's async
replication happening and you promote a replica to take over, that's
definitely a case where you might also have to rebuild the former
primary.

> With proper WAL archiving this shouldn't happen but we wanted to be prepared
> for this scenario just in case.

Sure, having WAL archiving and good backups is definitely my strong
recommendation for how to address any cases where the replica falls
behind, or the replica is promoted and you want to remaster the former
primary.

> We did some tests measuring IO and traffic and are very happy with the
> results. We will definitely be adding pgBackRest to our toolchain.

Glad to hear that.

> Regarding my initial question, I still believe that the world deserves a
> simple direct pg_basebackup replacement even when putting an additional
> "repo host" in the middle is a better idea in the long term.

Perhaps..

> As you said, all the pieces are there and it would be quite easy to write a
> new "pg_basebackup_delta" script that could be executed on the standby host
> to:
> 
> 1) setup a pgBackRest repo on the primary host (via SSH)
> 
> 2) create a backup on the primary host (via SSH)
> 
> 3) do a delta restore on the standby
> 
> Even when the repository on the primary host is only created temporarily
> (and require double storage, resources, etc), it may still be worth
> considering the traffic that can be saved by doing a delta restore on a
> standby host in a different region, right?

So...  There's actually a way to do this with pgbackrest, but it doesn't
support the delta capability.

We take care in pgbackrest to make the repo format for full backups
actually match exactly what a PG cluster would look like, specifically
because we wish to allow users to, if absolutely everything else fails
and pgbackrest is non-functional, rebuild from the repo.  To that point,
we even make sure that command-line tools like gzip, bzip2, openssl,
etc, work with the files we create.

For what you're asking about though, you would do something like:

- Set up a pgbackrest repo on the host you're rebuilding (the replica)
- Disable all repo compression, encryption, et al.
- Enable archive copy, so the WAL for the backup is put into the backup
- Enable parallel jobs
- Set up pgbackrest on the primary with the replica configured as the
  repo host, get WAL archiving working and such.
- Run create-stanza
- Perform a *full* backup, make sure everything works.

Once that's done, you go find the pg_data directory inside the full
backup that you made inside of the pgbackrest repo, copy/move/rsync
(with checksums enabled!) that to where your PG data directory should
be, set up a recovery.conf to point to the primary, maybe also set it up
with a restore_command pointing to that repo (may or may not be needed,
depending) and start up PG.  That should allow PG to start, replay all
of the necessary WAL, and then connect to the primary and start
streaming.

If you have any tablespaces, you'd need to deal with those too, of
course.  If you start moving things out of the pgbackrest repo, you're
corrupting it, naturally, so you wouldn't really want to continue using
it once you've gotten all of this done.

Again, this isn't an approach which I'd generally recommend...  We used
to do it when we were first writing pgbackrest with a ZFS filesystem and
after a backup we'd create a snapshot off of the repo and start PG right
up and then run pg_dump on it and do other checks to make sure it
worked, but have moved to using pgbackrest delta restores instead for
that kind of use-case.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: pg_basebackup + incremental base backups

2020-05-25 Thread Christopher Pereira



On 24-May-20 15:48, Stephen Frost wrote:

That really shouldn't be possible.  I'm very curious as to exactly what
happened that resulted in your primary/replica being 'out of sync', as
you say.


Hi Stephen,

Actually this was more a hypothetical question to find a solution in 
case some day one of our standby clusters goes out of sync and we have 
to rebuild it having a very big database.
With proper WAL archiving this shouldn't happen but we wanted to be 
prepared for this scenario just in case.


We did some tests measuring IO and traffic and are very happy with the 
results. We will definitely be adding pgBackRest to our toolchain.


Regarding my initial question, I still believe that the world deserves a 
simple direct pg_basebackup replacement even when putting an additional 
"repo host" in the middle is a better idea in the long term.


As you said, all the pieces are there and it would be quite easy to 
write a new "pg_basebackup_delta" script that could be executed on the 
standby host to:


1) setup a pgBackRest repo on the primary host (via SSH)

2) create a backup on the primary host (via SSH)

3) do a delta restore on the standby

Even when the repository on the primary host is only created temporarily 
(and require double storage, resources, etc), it may still be worth 
considering the traffic that can be saved by doing a delta restore on a 
standby host in a different region, right?


Thanks and congratulations for the good work.





Re: Query returns no rows in pg_basebackup cluster

2020-05-25 Thread Andrus

Hi!


How to set logical replication for all user databases in cluster so that
when new database is added or new tables are added to database they will
start replicate automatically ?

I think that it would be good if you spend some time reading the
documentation on this stuff, particularly the part about restrictions,
to understand the use cases where that can become useful:
https://www.postgresql.org/docs/devel/logical-replication.html


Thank you.
I read it and havent found any reference to PITR recovery.
For PITR recovery it should probably save sql statements to files and allow to 
specify recovery target time
for applying sql statements to base backup.

Is PITR recovery supported only using binary WAL files ?

Other limits can probably be solved.

Andrus.




Re: Query returns no rows in pg_basebackup cluster

2020-05-25 Thread David G. Johnston
On Sunday, May 24, 2020, Andrus  wrote:

> Hi!
>
> Backup in created in Windows from Linux server using pg_receivewal and
>>> pg_basebackup .
>>> Can this backup used for PITR in Linux ?
>>>
>> No.  Physical copies need to be based on the same platform.  If you
>> wish to replicate a cluster without any platform, architecture or even
>> not-too-many major version constraints, there is also logical
>> replication available since v10.
>>
>
> Will logical replication also allow two modes:
>  1. PITR recovery can used if needed
>  2. Hot standby: User databases in both clusters contain same data.
>
>
Why are you spending so much effort on this Window/Linux hybrid setup?  Get
yourself another Linux server and setup physical replication.  It sounds
like it will exactly meet your requirements and you will waste more time
and money working out alternatives than the server would cost.

David J.


Re: Query returns no rows in pg_basebackup cluster

2020-05-25 Thread Michael Paquier
On Mon, May 25, 2020 at 09:02:49AM +0300, Andrus wrote:
> Will logical replication also allow two modes:
>  1. PITR recovery can used if needed
>  2. Hot standby: User databases in both clusters contain same data.
> 
> How to set logical replication for all user databases in cluster so that
> when new database is added or new tables are added to database they will
> start replicate automatically ?

I think that it would be good if you spend some time reading the
documentation on this stuff, particularly the part about restrictions,
to understand the use cases where that can become useful:
https://www.postgresql.org/docs/devel/logical-replication.html
--
Michael


signature.asc
Description: PGP signature


Re: Query returns no rows in pg_basebackup cluster

2020-05-25 Thread Andrus

Hi!


No.  Physical copies need to be based on the same platform.

Does the O/S that the client software runs on really affect this?

To the extent that the O/S determines text sort order, yes; see thread.
The short answer here is that we aren't going to support such cases.
If you try to replicate across platforms, and it works, you're in luck.
If it doesn't work, you get to keep both pieces; we will not accept
that as a bug.


In 2017 Peter wrote that  ICU-based collations will offered alongside the 
libc-based collations (1)
Currently it still requires re-compilation of Postgres for all binary 
replication platforms.
Maybe ICU locale will selected during installation automatically in Postgres 13 . Using same ICU locale in all replication platforms 
will hopefully fix the issue.


Currently option is to use ucs_basic as default collation when creating cluster.

(1) https://www.2ndquadrant.com/en/blog/icu-support-postgresql-10/

Andrus. 






Re: Query returns no rows in pg_basebackup cluster

2020-05-25 Thread Andrus

Hi!


Backup in created in Windows from Linux server using pg_receivewal and 
pg_basebackup .
Can this backup used for PITR in Linux ?

No.  Physical copies need to be based on the same platform.  If you
wish to replicate a cluster without any platform, architecture or even
not-too-many major version constraints, there is also logical
replication available since v10.


Will logical replication also allow two modes:
 1. PITR recovery can used if needed
 2. Hot standby: User databases in both clusters contain same data.

How to set logical replication for all user databases in cluster so that when new database is added or new tables are added to 
database they will start replicate automatically ?

Will it require more powerful backup server to replay main server sql stream 
from different databases.

Andrus. 






Re: Query returns no rows in pg_basebackup cluster

2020-05-24 Thread Tom Lane
"David G. Johnston"  writes:
> On Sun, May 24, 2020 at 4:10 PM Michael Paquier  wrote:
>> No.  Physical copies need to be based on the same platform.

> Does the O/S that the client software runs on really affect this?

To the extent that the O/S determines text sort order, yes; see thread.

The short answer here is that we aren't going to support such cases.
If you try to replicate across platforms, and it works, you're in luck.
If it doesn't work, you get to keep both pieces; we will not accept
that as a bug.

regards, tom lane




Re: Query returns no rows in pg_basebackup cluster

2020-05-24 Thread David G. Johnston
On Sun, May 24, 2020 at 4:10 PM Michael Paquier  wrote:

> On Fri, May 22, 2020 at 09:53:37AM +0300, Andrus wrote:
> > Backup in created in Windows from Linux server using pg_receivewal and
> pg_basebackup .
> > Can this backup used for PITR in Linux ?
>
> No.  Physical copies need to be based on the same platform.  If you
> wish to replicate a cluster without any platform, architecture or even
> not-too-many major version constraints, there is also logical
> replication available since v10.
>

Does the O/S that the client software runs on really affect this?  I would
expect that you could store the offline files anywhere.  As long as the
architecture your original server is on and the one you are restoring to
are the same the restored server should work.  They are just bytes until a
server interprets them, no?

David J.


Re: Query returns no rows in pg_basebackup cluster

2020-05-24 Thread Michael Paquier
On Fri, May 22, 2020 at 09:53:37AM +0300, Andrus wrote:
> Backup in created in Windows from Linux server using pg_receivewal and 
> pg_basebackup .
> Can this backup used for PITR in Linux ?

No.  Physical copies need to be based on the same platform.  If you
wish to replicate a cluster without any platform, architecture or even
not-too-many major version constraints, there is also logical
replication available since v10.
--
Michael


signature.asc
Description: PGP signature


Re: pg_basebackup + incremental base backups

2020-05-24 Thread Stephen Frost
Greetings,

* Christopher Pereira (krip...@imatronix.cl) wrote:
> Ok, we want to use pgbackrest to *rebuild a standby that has fallen behind*
> (where pg_rewind won't work). After reading the docs, we believe we should
> use this setup:
> 
> a) Primary host: primary cluster
> 
> b) Repository host: needed for rebuilding the standby (and having PITR as
> bonus).
> 
> c) Standby host: standby cluster

That would work.

> 1) The standby will use streaming replication and will be in sync until
> someday something funny happens and both standby and repository get out of
> sync with the primary.

Having failures in both the standby and repo would have to be something
very 'funny' indeed..

> Now, to rebuild the standby first we will have to create a new backup
> transferring the data from *primary -> repository*, right?

If the repo is entirely gone *and* the standby server is broken, then,
yes, you'd have to do a new backup into the repo and then restore that.

> Wouldn't this also have a load impact on the primary cluster?

Yes.

The distinction I was trying to make earlier is that if the standby
server gets messed up in any shape, and the repo/repo-server is still
operating correctly, then you can perform a pgbackrest delta restore
whereby the standby is rebuilt from the last backup that was performed.
That process only involves the repo server and the standby server, there
wouldn't be any load on the primary, and further, only those files which
are different on the standby server vs. the last backup in the repo
would be copied, minimizing bandwidth between the two.

> 2) In the user guide section 17.3 is explained how to create a "pg-standby
> host" to replicate the data *from the repository host*.
> And in section 17.4 is explained how to setup Streaming Replication to
> replicate the data *from the primary host*.
> Do 17.3 and 17.4 work together so that the data is *replicated from the
> repository* and then *streamed from the primary*?

It's more of an either-or.  That is, if you have a pgbackrest repo, and
you configure PostgreSQL to have a restore_command which will fetch WAL
from the repo *and* you configure PostgreSQL to have a primary_conninfo
line to connect to an upstream primary and stream WAL, then you give PG
two ways to get the WAL and it'll try one and then the other.  If either
are able to provide the necessary WAL, then the replica will replay the
WAL until it gets to the end of all available WAL, at which point it
should be able to connect to the primary and stream WAL.

Should the replica ever fall behind and the primary no longer has the
WAL that the replica needs, it'll automatically go to the repo to fetch
that WAL.  Only if the WAL is no longer available from either the
primary or the WAL repo is the replica so far behind that it would need
a delta restore from a backup to be caught back up.

> 3) Before being able to rebuild the standby cluster, would we first need to
> update the backup on the repository (backup from primary -> repository) in
> order for streaming replication to work (from primary -> standby)?

This isn't required, no.  Doing so, however, perhaps using a pgbackrest
incremental backup, would reduce the amount of WAL the replica would
need to replay to get caught back up with the primary though.

> 4) Once the backup on the repository is ready, what are the chances that
> streaming replication from primary to standby won't work because they got
> out of sync again?

That really shouldn't be possible.  I'm very curious as to exactly what
happened that resulted in your primary/replica being 'out of sync', as
you say.

> 5) Could we just work with 2 hosts (primary and standby) instead of 3?
> FAQ section 8 says the repository shouldn't be on the same host as the
> standby and having it on the primary doesn't make much sense because if the
> primary host is down we won't have access to the backup.

It's possible but it's really not recommended, particularly if the idea
is to have the primary/replica providing high availability.  When you
are setting up such an HA solution, you *really* want to have your
configurations and such be symmetric between the two (and pgbackrest
makes this easy for you, when you have a repo host, by allowing you to
configure both in the pgbackrest config file and pgbackrest will figure
out which is primary and which is the replica, and perform the backup
from whichever one you'd prefer to).

In the 2 node configuration you're contemplating, if the replia is down,
the primary won't be able to ship WAL off of the system, resulting in a
WAL buildup on the primary which could lead to a database failure due to
running out of disk space, and further increases risk that the primary
then fails and you have lost both availability *and* whatever had been
still on the primary, potentially going back hours or more.

> It would be ideal to have the repository on the standby host and taking good
> care of the configurations. What exactly should be cared of for 

Re: pg_basebackup + incremental base backups

2020-05-24 Thread Christopher Pereira



We've contemplated adding support for something like this to pgbackrest,
since all the pieces are there, but there hasn't been a lot of demand
for it and it kind of goes against the idea of having a proper backup
solution, really..  It'd also create quite a bit of load on the primary
to checksum all the files to do the comparison against what's on the
replica that you're trying to update, so not something you'd probably
want to do a lot more than necessary.


Ok, we want to use pgbackrest to *rebuild a standby that has fallen 
behind* (where pg_rewind won't work). After reading the docs, we believe 
we should use this setup:


a) Primary host: primary cluster

b) Repository host: needed for rebuilding the standby (and having PITR 
as bonus).


c) Standby host: standby cluster

Some questions:

1) The standby will use streaming replication and will be in sync until 
someday something funny happens and both standby and repository get out 
of sync with the primary.
Now, to rebuild the standby first we will have to create a new backup 
transferring the data from *primary -> repository*, right?

Wouldn't this also have a load impact on the primary cluster?

2) In the user guide section 17.3 is explained how to create a 
"pg-standby host" to replicate the data *from the repository host*.
And in section 17.4 is explained how to setup Streaming Replication to 
replicate the data *from the primary host*.
Do 17.3 and 17.4 work together so that the data is *replicated from the 
repository* and then *streamed from the primary*?


3) Before being able to rebuild the standby cluster, would we first need 
to update the backup on the repository (backup from primary -> 
repository) in order for streaming replication to work (from primary -> 
standby)?


4) Once the backup on the repository is ready, what are the chances that 
streaming replication from primary to standby won't work because they 
got out of sync again?


5) Could we just work with 2 hosts (primary and standby) instead of 3?
FAQ section 8 says the repository shouldn't be on the same host as the 
standby and having it on the primary doesn't make much sense because if 
the primary host is down we won't have access to the backup.


It would be ideal to have the repository on the standby host and taking 
good care of the configurations. What exactly should be cared of for 
this setup to be safe?


I'm afraid I'm not understanding very well the pgbackrest design or how 
to use it efficiently to rebuild a standby cluster that got out of sync.




Re: Query returns no rows in pg_basebackup cluster

2020-05-22 Thread Andrus

Hi!


ERROR:  item order invariant violated for index "desktop_baas_liigid_idx"
DETAIL:  Lower index tid=(3,15) (points to index tid=(16,4098)) higher index tid=(3,16) (points to index tid=(17,4098)) page 
lsn=292/630C0CE8.

SQL state: XX002

Uh huh ... and I'll bet the same test on the source server is just fine?
I don't find it surprising in the least that different platforms have
different ideas on fine points like how to sort a leading underscore.
Those things just aren't that well standardized.


This column is not used for locale specific data.

Running 

alter table desktop alter  baas type char(8) collate ucs_basic 

fixes the issue. 
Is this fix reasonable ?

What other issues may occur ?

Can base backup created in windows using pg_basecakup used in Linux without 
such fix?

Andrus.




Re: Query returns no rows in pg_basebackup cluster

2020-05-22 Thread Tom Lane
"Andrus"  writes:
>> Possibly you could try running contrib/amcheck on the index in question
>> and see if it reports any issues.

> I tried and it reports error

> ERROR:  item order invariant violated for index "desktop_baas_liigid_idx"
> DETAIL:  Lower index tid=(3,15) (points to index tid=(16,4098)) higher index 
> tid=(3,16) (points to index tid=(17,4098)) page 
> lsn=292/630C0CE8.
> SQL state: XX002

Uh huh ... and I'll bet the same test on the source server is just fine?

I don't find it surprising in the least that different platforms have
different ideas on fine points like how to sort a leading underscore.
Those things just aren't that well standardized.

regards, tom lane




Re: Query returns no rows in pg_basebackup cluster

2020-05-22 Thread Andrus

Hi!


The sorting rules for this locale must be the same in both platforms.
Only locale names are different.

I think they are less alike than you hoped, because if they were alike,
you wouldn't be seeing this problem.
Possibly you could try running contrib/amcheck on the index in question
and see if it reports any issues.


I tried and it reports error

ERROR:  item order invariant violated for index "desktop_baas_liigid_idx"
DETAIL:  Lower index tid=(3,15) (points to index tid=(16,4098)) higher index tid=(3,16) (points to index tid=(17,4098)) page 
lsn=292/630C0CE8.

SQL state: XX002

Andrus.





Re: Query returns no rows in pg_basebackup cluster

2020-05-22 Thread Tom Lane
"Andrus"  writes:
>> No, what it sounds like is the OP tried to physically replicate a
>> database on another platform with completely different sorting rules.

> The sorting rules for this locale must be the same in both platforms.
> Only locale names are different.

I think they are less alike than you hoped, because if they were alike,
you wouldn't be seeing this problem.

Possibly you could try running contrib/amcheck on the index in question
and see if it reports any issues.

regards, tom lane




Re: pg_basebackup + incremental base backups

2020-05-22 Thread Christopher Pereira



On 21-May-20 08:43, Stephen Frost wrote:

* Christopher Pereira (krip...@imatronix.cl) wrote:

[...]

Is there some way to rebuild the standby cluster by doing a differential
backup of the primary cluster directly?

We've contemplated adding support for something like this to pgbackrest,
since all the pieces are there, but there hasn't been a lot of demand
for it and it kind of goes against the idea of having a proper backup
solution, really..  It'd also create quite a bit of load on the primary
to checksum all the files to do the comparison against what's on the
replica that you're trying to update, so not something you'd probably
want to do a lot more than necessary.


We have backups of the whole server and only need a efficient way to 
rebuild the hot-standby cluster when pg_rewind is not able to do so.


I agree with your concerns about the increased load on the primary 
server, but this rebuilding process would only be done in case of 
emergency or during low load hours.


pg_basebackup works fine but does not support differential/incremental 
backups which is a blocker.


Do you know any alternative software that is able to rebuild the standby 
PG data dir using rsync or similar while the primary is still online?


It seems a simple pg_start_backup + rsync + pg_stop_backup (maybe 
combined with a LVM snapshot) would do, but we would prefer to use some 
existing tool.


We just tried barman, but it also seems to require a restore from the 
backup before being able to start the standby server (?), and we are 
afraid this would require double storage, IO and time for rebuilding the 
standby cluster.


Thanks.





Re: Query returns no rows in pg_basebackup cluster

2020-05-22 Thread Andrus

Hi!


Database in Windows is in read-only (recovery) mode so it cannot changed.

Then you might as well just rm -rf it (or whatever the equivalent Windows
incantation is).  On Windows, that database is broken and useless.


Backup in created in Windows from Linux server using pg_receivewal and 
pg_basebackup .
Can this backup used for PITR in Linux ?

Andrus.




Re: Query returns no rows in pg_basebackup cluster

2020-05-22 Thread Andrus

Hi!


Main server is in Linux and backup server is in windows.

This is not a supported setup if you want to run a physical backup.
Your backup and your primary need to be the same - software and hardware.  
Consider anything that is working to be a false
negative – assume >something will break or simply give incorrect results.


This base backup should used for recovery. Taking new base backup in Linux does 
not allow to recover to earlier date.
Both servers have Intel 64 bit CPUs.
I understand that only issue is the index structure and that REINDEX will fix 
this.
What other issues may occur ?

Will pg_dump/pg_restore in Windows server fix all issues.

Andrus. 






Re: Query returns no rows in pg_basebackup cluster

2020-05-21 Thread David G. Johnston
On Thu, May 21, 2020 at 10:41 PM Andrus  wrote:

> Main server is in Linux and backup server is in windows.
>

This is not a supported setup if you want to run a physical backup.

Your backup and your primary need to be the same - software and hardware.
Consider anything that is working to be a false negative - assume something
will break or simply give incorrect results.

David J.


Re: Query returns no rows in pg_basebackup cluster

2020-05-21 Thread Andrus

Hi!


No, what it sounds like is the OP tried to physically replicate a
database on another platform with completely different sorting rules.


The sorting rules for this locale must be the same in both platforms.
Only locale names are different. It looks like windows server does not 
recognize Linux locale name.


Which means all his text indexes are corrupt according to the
destination platform's sorting rules, which easily explains the
observed misbehavior (ie, index searches not finding the expected rows).


Lot of queries seems working properly.

REINDEX would fix it. 


REINDEX throws error

ERROR:  cannot execute REINDEX during recovery
SQL state: 25006


But the major point here is you can't just ignore
a collation mismatch, which in turn implies that you can't do physical
replication from Linux to Windows, or vice versa (and most other
cross-platform cases are just as dangerous).


Database is used in recovery mode to find proper recovery point and to get data 
from it in this point.
Locales are actually same. In windows Postgres does not recognize Linux locale 
name.


Database in Windows is in read-only (recovery) mode so it cannot changed.

Then you might as well just rm -rf it (or whatever the equivalent Windows
incantation is).  On Windows, that database is broken and useless.


Most queries seems to work. 
Database should examined to get accidently deleted data from it.


Is making it read-write and index only solution or can it fixed in read-only 
database also, e-q forcing same local in postgres.conf

Andrus.




Re: Query returns no rows in pg_basebackup cluster

2020-05-21 Thread Andrus

Hi!


The LIKE query probably doesn't use an index and thus finds the relevant
data via sequential scan and equality checks on each record.



Yeah, exactly.  An equality condition will use a btree index if
available.  LIKE, however, sees the "_" as a wildcard so it cannot
use an index and resorts to a seqscan --- which will work fine.
It's just index searches (and index-based sorts) that are broken.
Of course, if there isn't an index on the column in question
then this theory falls to the ground.


There is composite index on baas column

CREATE TABLE public.desktop
(
   id integer NOT NULL DEFAULT nextval('desktop_id_seq'::regclass),
   recordtype character(5) COLLATE pg_catalog."default" NOT NULL,
   klass character(1) COLLATE pg_catalog."default",
   baas character(8) COLLATE pg_catalog."default" NOT NULL,
   liigid character(1) COLLATE pg_catalog."default" NOT NULL DEFAULT ''::bpchar,
   jrk numeric(4,0) NOT NULL DEFAULT 0,
...
   CONSTRAINT desktop_pkey PRIMARY KEY (id),
   CONSTRAINT desktop_baas_not_empty CHECK (baas <> ''::bpchar),
   CONSTRAINT desktop_id_check CHECK (id > 0),
   CONSTRAINT desktop_recordtype_check CHECK (recordtype = 'Aken'::bpchar OR 
recordtype = 'Veerg'::bpchar)
)

TABLESPACE pg_default;
CREATE INDEX desktop_baas_liigid_idx
   ON public.desktop USING btree
   (baas COLLATE pg_catalog."default" ASC NULLS LAST, liigid COLLATE 
pg_catalog."default" ASC NULLS LAST)
   TABLESPACE pg_default;

Maybe it is possible to force postgres in windows to use the same locale as in 
Linux. Locales are actually the same.

Andrus.





Re: Query returns no rows in pg_basebackup cluster

2020-05-21 Thread Andrus

Hi!


Are you referring to two different instances of Postgres on Windows?


No. 
Main server is in Linux and backup server is in windows.


Andrus.




  1   2   3   >