Re: PostgreSQL 14.8 - server fails to start even though all transaction logs with base backup are restored

2023-08-08 Thread Stephen Frost
Greetings,

* Meera Nair (mn...@commvault.com) wrote:
> We are following 
> https://www.postgresql.org/docs/14/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP
> Making An Exclusive Low-Level Backup.

Exclusive backups have been removed, so you really don't want to be
depending on it.

> After restoring, we see the server fails to start with below error:

Your archive doesn't seem to have all of the WAL which was generated
during the backup, which means that your backup wasn't complete and the
system cannot be restored from this backup.

> We are restoring all the transaction logs required for base backup to be 
> consistent.

Evidently not, per the complaint from PG when it starts up.

> BACKUP_END is present in the 000400020046 segment. Why did 
> recovery fail to find this?

It's possible to have multiple backups going concurrently.  Presumably,
this was from a different backup and that's why it wasn't picked up as
being the end for this backup.  You seem to have quite a few .backup
files in your archive directory which would seem to support this.

Not sure exactly what you're doing, but unless your goal is to spend a
great deal of effort developing a PG backup solution, you're really
better off using one of the existing solutions (eg: pg_basebackup or
pgBackRest).  If you are developing your own backup solution for PG, you
definitely want to be using the new APIs and not using the exclusive
backup method.  Please review the current (15) documentation:

https://www.postgresql.org/docs/current/continuous-archiving.html

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: PostgreSQL 14.8 - server fails to start even though all transaction logs with base backup are restored

2023-08-08 Thread Ron


"26.3.3.2. Making An Exclusive Low-Level Backup


 Note

The exclusive backup method is deprecated and should be avoided. Prior 
toPostgreSQL9.6, this was the only low-level method available, but it is now 
recommended that all users upgrade their scripts to use non-exclusive backups."



Use PgBackRest, instead.


On 8/8/23 02:39, Meera Nair wrote:


Hi Team,

We are following 
https://www.postgresql.org/docs/14/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP


Making An Exclusive Low-Level Backup.

After restoring, we see the server fails to start with below error:

PostgreSQL server log:

2023-08-04 16:47:47.227 IST [40582] LOG:  starting PostgreSQL 14.8 on 
x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 
4.8.5-44), 64-bit


2023-08-04 16:47:47.228 IST [40582] LOG:  listening on IPv4 address 
"0.0.0.0", port 5414


2023-08-04 16:47:47.228 IST [40582] LOG:  listening on IPv6 address "::", 
port 5414


2023-08-04 16:47:47.231 IST [40582] LOG:  listening on Unix socket 
"/var/run/postgresql/.s.PGSQL.5414"


2023-08-04 16:47:47.236 IST [40582] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5414"


2023-08-04 16:47:47.243 IST [40584] LOG:  database system was interrupted; 
last known up at 2023-08-04 14:48:29 IST


2023-08-04 16:47:49.254 IST [40584] LOG:  starting archive recovery

cp: cannot stat ‘/var/lib/pgsql/14/wal/0004.history’: No such file or 
directory


2023-08-04 16:47:49.275 IST [40584] LOG:  restored log file 
"000400020041" from archive


2023-08-04 16:47:49.290 IST [40584] LOG:  redo starts at 2/4128

2023-08-04 16:47:49.310 IST [40584] LOG:  restored log file 
"000400020042" from archive


2023-08-04 16:47:49.344 IST [40584] LOG:  restored log file 
"000400020043" from archive


2023-08-04 16:47:49.368 IST [40584] LOG:  restored log file 
"000400020044" from archive


2023-08-04 16:47:49.395 IST [40584] LOG:  restored log file 
"000400020045" from archive


2023-08-04 16:47:49.423 IST [40584] LOG:  restored log file 
"000400020046" from archive


2023-08-04 16:47:49.460 IST [40584] LOG:  restored log file 
"000400020047" from archive


2023-08-04 16:47:50.649 IST [40584] LOG:  restored log file 
"000400020048" from archive


2023-08-04 16:47:51.219 IST [40584] LOG:  restored log file 
"000400020049" from archive


cp: cannot stat ‘/var/lib/pgsql/14/wal/00040002004A’: No such 
file or directory


2023-08-04 16:47:51.603 IST [40584] LOG:  redo done at 2/49013500 system 
usage: CPU: user: 0.00 s, system: 0.19 s, elapsed: 2.31 s


2023-08-04 16:47:51.603 IST [40584] LOG:  last completed transaction was 
at log time 2023-08-04 14:58:51.021414+05:30


2023-08-04 16:47:51.628 IST [40584] LOG:  restored log file 
"000400020049" from archive


2023-08-04 16:47:51.973 IST [40584] FATAL:  WAL ends before end of online 
backup


2023-08-04 16:47:51.973 IST [40584] HINT:  Online backup started with 
pg_start_backup() must be ended with pg_stop_backup(), and all WAL up to 
that point must be available at recovery.


2023-08-04 16:47:51.975 IST [40582] LOG:  startup process (PID 40584) 
exited with exit code 1


2023-08-04 16:47:51.975 IST [40582] LOG:  terminating any other active 
server processes


2023-08-04 16:47:51.975 IST [40582] LOG:  shutting down due to startup 
process failure


2023-08-04 16:47:51.977 IST [40582] LOG:  database system is shut down

Backup_label from data directory:

[root@gkstandby2 data]# cat backup_label

START WAL LOCATION: 2/4128 (file 000400020041)

CHECKPOINT LOCATION: 2/4160

BACKUP METHOD: pg_start_backup

BACKUP FROM: primary

START TIME: 2023-08-04 14:48:29 IST

LABEL: pgida_backup_5414_108625_1691140709

START TIMELINE: 4

[root@gkstandby2 data]#

Archive log directory:

[root@gkstandby2 wal]# ls

00020002003B 00020002003E.0028.backup 
000400020040.0028.backup 000400020044 
000400020048


00020002003C 0002.history 000400020041 
000400020045 000400020049


00020002003C.0028.backup 0003000500E4 
000400020042 000400020045.5AE0.backup


00020002003D 00040002003F 000400020043 
000400020046


00020002003E 000400020040 
000400020043.0028.backup 000400020047


[root@gkstandby2 wal]# /usr/pgsql-14/bin/pg_waldump 
000400020046 | grep BACKUP


rmgr: XLOG    len (rec/tot): 34/    34, tx:  0, lsn: 
2/46000110, prev 2/4698, desc: BACKUP_END 2/45005AE0


[root@gkstandby2 wal]#

We are restoring all the transaction logs required for base backup to be 
consistent.


BACKUP_END is present in the 000400020046 segment. Why did 
recovery fail to find this?


Regards,

Meera



--
Born 

Re: DB Server slow down & hang during Peak hours of Usage

2023-08-08 Thread KK CHN
On Tue, Aug 8, 2023 at 5:49 PM Marc Millas  wrote:

> also,
> checkpoint setup are all default values
>
> you may try to
> checkpoint_completion_target = 0.9
> checkpoint_timeout = 15min
> max_wal_size = 5GB
>
> and, as said in the previous mail, check the checkpoint logs
>
> Also, all vacuum and autovacuum values are defaults
> so, as autovacuum_work_mem = -1
> the autovacuum processes will use the 4 GB setuped by maintenance_work_mem
> = 4096MB
> as there are 3 launched at the same time, its 12 GB "eaten"
> which doesn't look like a good idea, so set
> autovacuum_work_mem = 128MB
>
> also pls read the autovacuum doc for your version (which is ?) here for
> postgres 12:
> https://www.postgresql.org/docs/12/runtime-config-autovacuum.html
>
>
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>
>
>
> On Tue, Aug 8, 2023 at 1:59 PM Marc Millas  wrote:
>
>> Hello,
>> in the postgresql.conf joined, 2 things (at least) look strange:
>> 1) the values for background writer are the default values, fit for a
>> server with a limited writes throughput.
>> you may want to increase those, like:
>> bgwriter_delay = 50ms
>> bgwriter_lru_maxpages = 400
>> bgwriter_lru_multiplier = 4.0
>> and check the checkpoint log to see if there are still backend processes
>> writes.
>>
>> 2) work_mem is set to 2 GB.
>> so, if 50 simultaneous requests use at least one buffer for sorting,
>> joining, ..., you will consume 100 GB of RAM
>> this value seems huge for the kind of config/usage you describe.
>> You may try to set work_mem to 100 MB and check what's happening.
>>
>> Also check the logs, postgres tells his life there...
>>
>>
>>
>>
>>
>> Marc MILLAS
>> Senior Architect
>> +33607850334
>> www.mokadb.com
>>
>>
>>
Thank you all for your time and the valuable inputs to fix the issue.  Let
me tune conf parameters as advised and   will get back with the results and
log outputs .

Krishane

>
>> On Mon, Aug 7, 2023 at 3:36 PM KK CHN  wrote:
>>
>>> List ,
>>>
>>> *Description:*
>>>
>>> Maintaining a DB Server Postgres and with a lot of read writes to this
>>> Server( virtual machine running on  ESXi 7 with CentOS 7) .
>>>
>>> ( I am not sure how to get the read / write counts or required IOPS or
>>> any other parameters for you. If  you point our  I can execute those
>>> commands and get the data. )
>>>
>>> Peak hours  say 19:00 Hrs to 21:00 hrs it hangs ( The application is an
>>> Emergency call response system  writing many  Emergency Response vehicles
>>> locations coordinates to the DB every 30 Seconds and every emergency call
>>> metadata (username, phone number, location info and address of the caller
>>> to the DB for each call)
>>>
>>> During these hours  the system hangs and the  Application ( which shows
>>> the location of the vehicles on a  GIS map hangs ) and the CAD machines
>>> which connects to the system hangs as those machines can't  connect to the
>>> DB and get data for displaying the caller information to the call taking
>>> persons working on them. )
>>>
>>> *Issue : *
>>> How to trace out what makes this DB  hangs and make it slow  and how to
>>> fix it..
>>>
>>> *Resource poured on the system :*
>>>
>>> *64 vCPUs  allocate ( Out of a host machine comprised of 2 processor
>>> slots of 20 cores each with Hyper Threading, intel xeon 2nd Gen, CPU usage
>>> show 50 % in vCentre Console), and RAM 64 GB allocated ( buy usage always
>>> showing around 33 GB only ) *
>>>
>>> *Query :*
>>>
>>> How to rectify the issues that makes the DB server underperforming and
>>> find a permanent fix for this slow down issue*. *
>>>
>>> *Attached the  Postgres.conf file here for reference .*
>>>
>>> *Any more information required I can share for analysis to fix the
>>> issue. *
>>>
>>>
>>> *Krishane *
>>>
>>


Re: DB Server slow down & hang during Peak hours of Usage

2023-08-08 Thread Marc Millas
also,
checkpoint setup are all default values

you may try to
checkpoint_completion_target = 0.9
checkpoint_timeout = 15min
max_wal_size = 5GB

and, as said in the previous mail, check the checkpoint logs

Also, all vacuum and autovacuum values are defaults
so, as autovacuum_work_mem = -1
the autovacuum processes will use the 4 GB setuped by maintenance_work_mem
= 4096MB
as there are 3 launched at the same time, its 12 GB "eaten"
which doesn't look like a good idea, so set
autovacuum_work_mem = 128MB

also pls read the autovacuum doc for your version (which is ?) here for
postgres 12:
https://www.postgresql.org/docs/12/runtime-config-autovacuum.html



Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Aug 8, 2023 at 1:59 PM Marc Millas  wrote:

> Hello,
> in the postgresql.conf joined, 2 things (at least) look strange:
> 1) the values for background writer are the default values, fit for a
> server with a limited writes throughput.
> you may want to increase those, like:
> bgwriter_delay = 50ms
> bgwriter_lru_maxpages = 400
> bgwriter_lru_multiplier = 4.0
> and check the checkpoint log to see if there are still backend processes
> writes.
>
> 2) work_mem is set to 2 GB.
> so, if 50 simultaneous requests use at least one buffer for sorting,
> joining, ..., you will consume 100 GB of RAM
> this value seems huge for the kind of config/usage you describe.
> You may try to set work_mem to 100 MB and check what's happening.
>
> Also check the logs, postgres tells his life there...
>
>
>
>
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>
>
>
> On Mon, Aug 7, 2023 at 3:36 PM KK CHN  wrote:
>
>> List ,
>>
>> *Description:*
>>
>> Maintaining a DB Server Postgres and with a lot of read writes to this
>> Server( virtual machine running on  ESXi 7 with CentOS 7) .
>>
>> ( I am not sure how to get the read / write counts or required IOPS or
>> any other parameters for you. If  you point our  I can execute those
>> commands and get the data. )
>>
>> Peak hours  say 19:00 Hrs to 21:00 hrs it hangs ( The application is an
>> Emergency call response system  writing many  Emergency Response vehicles
>> locations coordinates to the DB every 30 Seconds and every emergency call
>> metadata (username, phone number, location info and address of the caller
>> to the DB for each call)
>>
>> During these hours  the system hangs and the  Application ( which shows
>> the location of the vehicles on a  GIS map hangs ) and the CAD machines
>> which connects to the system hangs as those machines can't  connect to the
>> DB and get data for displaying the caller information to the call taking
>> persons working on them. )
>>
>> *Issue : *
>> How to trace out what makes this DB  hangs and make it slow  and how to
>> fix it..
>>
>> *Resource poured on the system :*
>>
>> *64 vCPUs  allocate ( Out of a host machine comprised of 2 processor
>> slots of 20 cores each with Hyper Threading, intel xeon 2nd Gen, CPU usage
>> show 50 % in vCentre Console), and RAM 64 GB allocated ( buy usage always
>> showing around 33 GB only ) *
>>
>> *Query :*
>>
>> How to rectify the issues that makes the DB server underperforming and
>> find a permanent fix for this slow down issue*. *
>>
>> *Attached the  Postgres.conf file here for reference .*
>>
>> *Any more information required I can share for analysis to fix the
>> issue. *
>>
>>
>> *Krishane *
>>
>


Re: DB Server slow down & hang during Peak hours of Usage

2023-08-08 Thread Marc Millas
Hello,
in the postgresql.conf joined, 2 things (at least) look strange:
1) the values for background writer are the default values, fit for a
server with a limited writes throughput.
you may want to increase those, like:
bgwriter_delay = 50ms
bgwriter_lru_maxpages = 400
bgwriter_lru_multiplier = 4.0
and check the checkpoint log to see if there are still backend processes
writes.

2) work_mem is set to 2 GB.
so, if 50 simultaneous requests use at least one buffer for sorting,
joining, ..., you will consume 100 GB of RAM
this value seems huge for the kind of config/usage you describe.
You may try to set work_mem to 100 MB and check what's happening.

Also check the logs, postgres tells his life there...





Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Mon, Aug 7, 2023 at 3:36 PM KK CHN  wrote:

> List ,
>
> *Description:*
>
> Maintaining a DB Server Postgres and with a lot of read writes to this
> Server( virtual machine running on  ESXi 7 with CentOS 7) .
>
> ( I am not sure how to get the read / write counts or required IOPS or any
> other parameters for you. If  you point our  I can execute those commands
> and get the data. )
>
> Peak hours  say 19:00 Hrs to 21:00 hrs it hangs ( The application is an
> Emergency call response system  writing many  Emergency Response vehicles
> locations coordinates to the DB every 30 Seconds and every emergency call
> metadata (username, phone number, location info and address of the caller
> to the DB for each call)
>
> During these hours  the system hangs and the  Application ( which shows
> the location of the vehicles on a  GIS map hangs ) and the CAD machines
> which connects to the system hangs as those machines can't  connect to the
> DB and get data for displaying the caller information to the call taking
> persons working on them. )
>
> *Issue : *
> How to trace out what makes this DB  hangs and make it slow  and how to
> fix it..
>
> *Resource poured on the system :*
>
> *64 vCPUs  allocate ( Out of a host machine comprised of 2 processor slots
> of 20 cores each with Hyper Threading, intel xeon 2nd Gen, CPU usage show
> 50 % in vCentre Console), and RAM 64 GB allocated ( buy usage always
> showing around 33 GB only ) *
>
> *Query :*
>
> How to rectify the issues that makes the DB server underperforming and
> find a permanent fix for this slow down issue*. *
>
> *Attached the  Postgres.conf file here for reference .*
>
> *Any more information required I can share for analysis to fix the issue. *
>
>
> *Krishane *
>


Re: My 1st TABLESPACE

2023-08-08 Thread KK CHN
On Mon, Aug 7, 2023 at 5:47 PM Amn Ojee Uw  wrote:

> Thanks Negora.
>
> Makes sense, I will check it out.
>
> On 8/7/23 1:48 a.m., negora wrote:
>
> Hi:
>
> Although the "postgres" user owns the "data" directory, Has he access to
> the whole branch of directories? Maybe the problem is that he can't reach
> the "data" directory.
>
> Regards.
>
>
> On 07/08/2023 07:43, Amn Ojee Uw wrote:
>
> I'd like to create a TABLESPACE, so, following this web page
> ,  I
> have done the following :
>
> *mkdir
> /home/my_debian_account/Documents/NetbeansWorkSpace/JavaSE/Jme/database/postgresql/data*
>
> *sudo chown postgres:postgres
> /home/my_debian_account/Documents/NetbeansWorkSpace/JavaSE/Jme/database/postgresql/data*
>
> *sudo -u postgres psql*
>
> *\du*
> * arbolone| Cannot login  | {}*
> * chispa
> || {prosafe}*
> * workerbee | Superuser, Create DB| {arbolone}*
> * jme
> || {arbolone}*
> * postgres| Superuser, Create role, Create DB, Replication, Bypass RLS
> | {}*
> * prosafe  | Cannot login  | {}*
>
> *CREATE TABLESPACE jmetablespace OWNER jme LOCATION
> '/home/my_debian_account/Documents/NetbeansWorkSpace/JavaSE/Jme/database/postgresql/data';*
>
>
Here owner is jme   and the  data dir  you created must have owner jme..

> The *CREATE **TABLESPACE* schema throws this error message :
>
> *ERROR:  could not set permissions on directory
> "/home/my_debian_account/Documents/NetbeansWorkSpace/JavaSE/Jme/database/postgresql/data":
> Permission denied*
>
> I have followed the web page to the best of my abilities, and AFAIK, the
> postgres user owns the folder '*data*'.
>
> I know that something is missing, where did I go wrong and how can I
> resolve this issue?
>
>
> Thanks in advance.
>
>
>


PostgreSQL 14.8 - server fails to start even though all transaction logs with base backup are restored

2023-08-08 Thread Meera Nair
Hi Team,

We are following 
https://www.postgresql.org/docs/14/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP
Making An Exclusive Low-Level Backup.

After restoring, we see the server fails to start with below error:

PostgreSQL server log:
2023-08-04 16:47:47.227 IST [40582] LOG:  starting PostgreSQL 14.8 on 
x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 
64-bit
2023-08-04 16:47:47.228 IST [40582] LOG:  listening on IPv4 address "0.0.0.0", 
port 5414
2023-08-04 16:47:47.228 IST [40582] LOG:  listening on IPv6 address "::", port 
5414
2023-08-04 16:47:47.231 IST [40582] LOG:  listening on Unix socket 
"/var/run/postgresql/.s.PGSQL.5414"
2023-08-04 16:47:47.236 IST [40582] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5414"
2023-08-04 16:47:47.243 IST [40584] LOG:  database system was interrupted; last 
known up at 2023-08-04 14:48:29 IST
2023-08-04 16:47:49.254 IST [40584] LOG:  starting archive recovery
cp: cannot stat '/var/lib/pgsql/14/wal/0004.history': No such file or 
directory
2023-08-04 16:47:49.275 IST [40584] LOG:  restored log file 
"000400020041" from archive
2023-08-04 16:47:49.290 IST [40584] LOG:  redo starts at 2/4128
2023-08-04 16:47:49.310 IST [40584] LOG:  restored log file 
"000400020042" from archive
2023-08-04 16:47:49.344 IST [40584] LOG:  restored log file 
"000400020043" from archive
2023-08-04 16:47:49.368 IST [40584] LOG:  restored log file 
"000400020044" from archive
2023-08-04 16:47:49.395 IST [40584] LOG:  restored log file 
"000400020045" from archive
2023-08-04 16:47:49.423 IST [40584] LOG:  restored log file 
"000400020046" from archive
2023-08-04 16:47:49.460 IST [40584] LOG:  restored log file 
"000400020047" from archive
2023-08-04 16:47:50.649 IST [40584] LOG:  restored log file 
"000400020048" from archive
2023-08-04 16:47:51.219 IST [40584] LOG:  restored log file 
"000400020049" from archive
cp: cannot stat '/var/lib/pgsql/14/wal/00040002004A': No such file 
or directory
2023-08-04 16:47:51.603 IST [40584] LOG:  redo done at 2/49013500 system usage: 
CPU: user: 0.00 s, system: 0.19 s, elapsed: 2.31 s
2023-08-04 16:47:51.603 IST [40584] LOG:  last completed transaction was at log 
time 2023-08-04 14:58:51.021414+05:30
2023-08-04 16:47:51.628 IST [40584] LOG:  restored log file 
"000400020049" from archive
2023-08-04 16:47:51.973 IST [40584] FATAL:  WAL ends before end of online backup
2023-08-04 16:47:51.973 IST [40584] HINT:  Online backup started with 
pg_start_backup() must be ended with pg_stop_backup(), and all WAL up to that 
point must be available at recovery.
2023-08-04 16:47:51.975 IST [40582] LOG:  startup process (PID 40584) exited 
with exit code 1
2023-08-04 16:47:51.975 IST [40582] LOG:  terminating any other active server 
processes
2023-08-04 16:47:51.975 IST [40582] LOG:  shutting down due to startup process 
failure
2023-08-04 16:47:51.977 IST [40582] LOG:  database system is shut down


Backup_label from data directory:

[root@gkstandby2 data]# cat backup_label
START WAL LOCATION: 2/4128 (file 000400020041)
CHECKPOINT LOCATION: 2/4160
BACKUP METHOD: pg_start_backup
BACKUP FROM: primary
START TIME: 2023-08-04 14:48:29 IST
LABEL: pgida_backup_5414_108625_1691140709
START TIMELINE: 4
[root@gkstandby2 data]#


Archive log directory:

[root@gkstandby2 wal]# ls
00020002003B  
00020002003E.0028.backup  
000400020040.0028.backup  000400020044  
000400020048
00020002003C  0002.history  
000400020041  000400020045  
000400020049
00020002003C.0028.backup  0003000500E4  
000400020042  
000400020045.5AE0.backup
00020002003D  00040002003F  
000400020043  000400020046
00020002003E  000400020040  
000400020043.0028.backup  000400020047


[root@gkstandby2 wal]# /usr/pgsql-14/bin/pg_waldump 000400020046 | 
grep BACKUP
rmgr: XLOGlen (rec/tot): 34/34, tx:  0, lsn: 
2/46000110, prev 2/4698, desc: BACKUP_END 2/45005AE0
[root@gkstandby2 wal]#


We are restoring all the transaction logs required for base backup to be 
consistent.
BACKUP_END is present in the 000400020046 segment. Why did recovery 
fail to find this?


Regards,
Meera