Re: pgBackRest for a 50 TB database

2023-12-19 Thread Abhishek Bhola
Hello Stephen

Just an update on this. After we deployed it on our PROD system, the
results were far better than in testing.
The time taken is only around 4-5 hours, and that has been the case for the
last 3 months or so.
full backup: 20231209-150002F
timestamp start/stop: 2023-12-09 15:00:02+09 / 2023-12-09
19:33:56+09
wal start/stop: 00010001DCC3008E /
00010001DCC300A6
database size: 32834.8GB, database backup size: 32834.8GB
repo1: backup size: 5096.4GB

Now a question: I restored this big DB and it all works fine. However, I
was wondering if there is a way to disable the subscription on Postgres
while restoring the data using pgBackRest.
So for example, I have been taking backups of this DB, which has an active
subscription.
When I restore the DB for test purposes, I don't want the subscription
to be there. Is there any option to ignore the subscription?
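The only workaround I can think of (untested, and the subscription name
"mysub" and the paths below are placeholders) is to bring the restored
cluster up with logical replication workers disabled and then drop the
subscription without touching the publisher's slot, since a pgBackRest
restore is a physical copy and the subscription catalog entries come back
with it:

# start the restored cluster with apply workers disabled
pg_ctl -D /restore/test/data -o "-c max_logical_replication_workers=0" start

# detach the subscription from the production slot before dropping it,
# so the replication slot on the publisher is left alone
psql -c "ALTER SUBSCRIPTION mysub DISABLE;"
psql -c "ALTER SUBSCRIPTION mysub SET (slot_name = NONE);"
psql -c "DROP SUBSCRIPTION mysub;"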

Thanks

On Thu, Oct 5, 2023 at 10:19 PM Stephen Frost  wrote:

> Greetings,
>
> On Thu, Oct 5, 2023 at 03:10 Abhishek Bhola <
> abhishek.bh...@japannext.co.jp> wrote:
>
>> Here is the update with compress-type=zst in the config file
>> Process-max is still 30. *But it took longer than before, around 27 hours
>> 50 mins.*
>>
>> full backup: 20231004-130621F
>> timestamp start/stop: 2023-10-04 13:06:21+09 / 2023-10-05
>> 15:56:03+09
>> wal start/stop: 00010001AC0E0054 /
>> 00010001AC0E0054
>> database size: 38249.0GB, database backup size: 38249.0GB
>> repo1: backup size: 5799.8GB
>>
>> Do you think I could be missing something?
>>
>
> Sounds like there’s something else which is the bottleneck once you have
> process-max at 30. I suspect you could reduce that process-max value and
> have around the same time still with zstd.  Ultimately if you want it to be
> faster then you’ll need to figure out what the bottleneck is (seemingly not
> CPU, unlikely to be memory, so that leaves network or storage) and address
> that.
>
> We’ve seen numbers approaching 10TB/hr with lots of processes and zstd and
> fast storage on high end physical hardware.
>
> Thanks,
>
> Stephen
>



Re: pgBackRest for a 50 TB database

2023-10-05 Thread Stephen Frost
Greetings,

On Thu, Oct 5, 2023 at 03:10 Abhishek Bhola 
wrote:

> Here is the update with compress-type=zst in the config file
> Process-max is still 30. *But it took longer than before, around 27 hours
> 50 mins.*
>
> full backup: 20231004-130621F
> timestamp start/stop: 2023-10-04 13:06:21+09 / 2023-10-05
> 15:56:03+09
> wal start/stop: 00010001AC0E0054 /
> 00010001AC0E0054
> database size: 38249.0GB, database backup size: 38249.0GB
> repo1: backup size: 5799.8GB
>
> Do you think I could be missing something?
>

Sounds like there’s something else which is the bottleneck once you have
process-max at 30. I suspect you could reduce that process-max value and
have around the same time still with zstd.  Ultimately if you want it to be
faster then you’ll need to figure out what the bottleneck is (seemingly not
CPU, unlikely to be memory, so that leaves network or storage) and address
that.

We’ve seen numbers approaching 10TB/hr with lots of processes and zstd and
fast storage on high end physical hardware.
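If it helps, a quick way to see where the time goes while a backup is
running (a sketch using standard Linux tools; adjust to your devices):

iostat -xm 5    # per-device utilization/throughput for DB and NAS volumes
sar -n DEV 5    # NIC throughput, to spot a saturated network link
top -H          # per-thread CPU, to confirm compression isn't the limit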

Thanks,

Stephen


Re: pgBackRest for a 50 TB database

2023-10-05 Thread Abhishek Bhola
Hi Stephen

Here is the update with compress-type=zst in the config file
Process-max is still 30. *But it took longer than before, around 27 hours
50 mins.*

full backup: 20231004-130621F
timestamp start/stop: 2023-10-04 13:06:21+09 / 2023-10-05
15:56:03+09
wal start/stop: 00010001AC0E0054 /
00010001AC0E0054
database size: 38249.0GB, database backup size: 38249.0GB
repo1: backup size: 5799.8GB

Do you think I could be missing something?

@Krishane

Let me try to answer the questions as best I can:
1. The connectivity protocol for the DB is FC.
I cannot pinpoint the exact reason why it takes 26 hours; if I knew
exactly, I would have improved it myself.
I don't think 10 hours is even realistic, although if you can improve on
this number, please let us know.

2. Yes, it is a dedicated DB server.

3. You're right, it is a NAS.

Thanks

On Wed, Oct 4, 2023 at 2:37 PM KK CHN  wrote:

> Greetings,
> Happy to hear you successfully performed a pgBackRest backup of a 50TB
> DB. Out of curiosity, I would like to know your infrastructure settings.
>
> 1. What connectivity protocol and bandwidth did you use for your backend
> storage? Is it iSCSI, FC, FCoE or GbE? What is the exact reason it took
> 26 hours in the best case? What factors might reduce those 26 hours to
> much less, say 10 hours or so, for a 50 TB DB to reach the backup
> destination? What should be fine-tuned or deployed for better
> performance?
>
> 2. You said you are running the DB on a 2-socket, 18-core processor = 36
> physical cores. Is it dedicated server hardware, entirely dedicated to
> the 50 TB database alone?
> I ask because nowadays DB servers mostly run on VMs in virtualized
> environments, so I would like to know whether all 36 physical cores and
> the associated RAM are utilized by your 50 TB database server, or whether
> there are vacant CPU cores/free RAM on those machines.
>
> 3. What kind of connectivity/bandwidth did you establish between the DB
> server and the storage backend? (I want to know the server NIC details,
> the connectivity channel protocol/bandwidth, and the spec of the switch
> connecting the DB server to the storage backend, a NAS in this case,
> right?)
>
> Could you share your recommendations/details, because I also need to
> perform such a pgBackRest trial from a production DB to a suitable
> storage device (most likely DELL Unity unified storage)?
>
> Any inputs are most welcome.
>
> Thanks,
> Krishane
>
> On Tue, Oct 3, 2023 at 12:14 PM Abhishek Bhola <
> abhishek.bh...@japannext.co.jp> wrote:
>
>> Hello,
>>
>> As said above, I tested pgBackRest on my bigger DB and here are the
>> results.
>> Server on which this is running has the following config:
>> Architecture:  x86_64
>> CPU op-mode(s):32-bit, 64-bit
>> Byte Order:Little Endian
>> CPU(s):36
>> On-line CPU(s) list:   0-35
>> Thread(s) per core:1
>> Core(s) per socket:18
>> Socket(s): 2
>> NUMA node(s):  2
>>
>> Data folder size: 52 TB (has some duplicate files since it is restored
>> from tapes)
>> Backup is being written on to DELL Storage, mounted on the server.
>>
>> pgbackrest.conf with the following options enabled:
>> repo1-block=y
>> repo1-bundle=y
>> start-fast=y
>>
>>
>> 1. *Using process-max: 30, Time taken: ~26 hours*
>> full backup: 20230926-092555F
>> timestamp start/stop: 2023-09-26 09:25:55+09 / 2023-09-27
>> 11:07:18+09
>> wal start/stop: 00010001AC0E0044 /
>> 00010001AC0E0044
>> database size: 38248.9GB, database backup size: 38248.9GB
>> repo1: backup size: 6222.0GB
>>
>> 2. *Using process-max: 10, Time taken: ~37 hours*
>>  full backup: 20230930-190002F
>> timestamp start/stop: 2023-09-30 19:00:02+09 / 2023-10-02
>> 08:01:20+09
>> wal start/stop: 00010001AC0E004E /
>> 00010001AC0E004E
>> database size: 38248.9GB, database backup size: 38248.9GB
>> repo1: backup size: 6222.0GB
>>
>> Hope these numbers can serve someone as a reference.
>>
>> Thanks
>>
>>
>> On Mon, Aug 28, 2023 at 12:30 AM Abhishek Bhola <
>> abhishek.bh...@japannext.co.jp> wrote:
>>
>>> Hi Stephen
>>>
>>> Thank you for the prompt response.
>>> Hearing it from you makes me more confident about rolling it out to PROD.
>>> I will have a discussion with the network team about it, hear what
>>> they have to say, and make an estimate accordingly.
>>>
>>> If you happen to know anyone using it at that size who has published
>>> their numbers, that would be great; if not, I will post mine once I
>>> set it up.
>>>
>>> Thanks for your help.
>>>
>>> Cheers,
>>> Abhishek
>>>
>>> On Mon, Aug 28, 2023 at 12:22 AM Stephen Frost 
>>> wrote:
>>>
 Greetings,

 * Abhishek Bhola (abhishek.bh...@japannext.co.jp) wrote:
 > I am trying to use pgBackRest for all my Postgres servers. I have
 > tested it on a sample database and it works fine.

Re: pgBackRest for a 50 TB database

2023-10-03 Thread KK CHN
Greetings,
Happy to hear you successfully performed a pgBackRest backup of a 50TB DB.
Out of curiosity, I would like to know your infrastructure settings.

1. What connectivity protocol and bandwidth did you use for your backend
storage? Is it iSCSI, FC, FCoE or GbE? What is the exact reason it took 26
hours in the best case? What factors might reduce those 26 hours to much
less, say 10 hours or so, for a 50 TB DB to reach the backup destination?
What should be fine-tuned or deployed for better performance?

2. You said you are running the DB on a 2-socket, 18-core processor = 36
physical cores. Is it dedicated server hardware, entirely dedicated to the
50 TB database alone?
I ask because nowadays DB servers mostly run on VMs in virtualized
environments, so I would like to know whether all 36 physical cores and the
associated RAM are utilized by your 50 TB database server, or whether there
are vacant CPU cores/free RAM on those machines.

3. What kind of connectivity/bandwidth did you establish between the DB
server and the storage backend? (I want to know the server NIC details, the
connectivity channel protocol/bandwidth, and the spec of the switch
connecting the DB server to the storage backend, a NAS in this case,
right?)

Could you share your recommendations/details, because I also need to
perform such a pgBackRest trial from a production DB to a suitable storage
device (most likely DELL Unity unified storage)?

Any inputs are most welcome.

Thanks,
Krishane

On Tue, Oct 3, 2023 at 12:14 PM Abhishek Bhola <
abhishek.bh...@japannext.co.jp> wrote:

> Hello,
>
> As said above, I tested pgBackRest on my bigger DB and here are the
> results.
> Server on which this is running has the following config:
> Architecture:  x86_64
> CPU op-mode(s):32-bit, 64-bit
> Byte Order:Little Endian
> CPU(s):36
> On-line CPU(s) list:   0-35
> Thread(s) per core:1
> Core(s) per socket:18
> Socket(s): 2
> NUMA node(s):  2
>
> Data folder size: 52 TB (has some duplicate files since it is restored
> from tapes)
> Backup is being written on to DELL Storage, mounted on the server.
>
> pgbackrest.conf with the following options enabled:
> repo1-block=y
> repo1-bundle=y
> start-fast=y
>
>
> 1. *Using process-max: 30, Time taken: ~26 hours*
> full backup: 20230926-092555F
> timestamp start/stop: 2023-09-26 09:25:55+09 / 2023-09-27
> 11:07:18+09
> wal start/stop: 00010001AC0E0044 /
> 00010001AC0E0044
> database size: 38248.9GB, database backup size: 38248.9GB
> repo1: backup size: 6222.0GB
>
> 2. *Using process-max: 10, Time taken: ~37 hours*
>  full backup: 20230930-190002F
> timestamp start/stop: 2023-09-30 19:00:02+09 / 2023-10-02
> 08:01:20+09
> wal start/stop: 00010001AC0E004E /
> 00010001AC0E004E
> database size: 38248.9GB, database backup size: 38248.9GB
> repo1: backup size: 6222.0GB
>
> Hope these numbers can serve someone as a reference.
>
> Thanks
>
>
> On Mon, Aug 28, 2023 at 12:30 AM Abhishek Bhola <
> abhishek.bh...@japannext.co.jp> wrote:
>
>> Hi Stephen
>>
>> Thank you for the prompt response.
>> Hearing it from you makes me more confident about rolling it out to PROD.
>> I will have a discussion with the network team about it, hear what
>> they have to say, and make an estimate accordingly.
>>
>> If you happen to know anyone using it at that size who has published
>> their numbers, that would be great; if not, I will post mine once I set
>> it up.
>>
>> Thanks for your help.
>>
>> Cheers,
>> Abhishek
>>
>> On Mon, Aug 28, 2023 at 12:22 AM Stephen Frost 
>> wrote:
>>
>>> Greetings,
>>>
>>> * Abhishek Bhola (abhishek.bh...@japannext.co.jp) wrote:
>>> > I am trying to use pgBackRest for all my Postgres servers. I have
>>> tested it
>>> > on a sample database and it works fine. But my concern is for some of
>>> the
>>> > bigger DB clusters, the largest one being 50TB and growing by about
>>> > 200-300GB a day.
>>>
>>> Glad pgBackRest has been working well for you.
>>>
>>> > I plan to mount NAS storage on my DB server to store my backup. The
>>> server
>>> > with 50 TB data is using DELL Storage underneath to store this data
>>> and has
>>> > 36 18-core CPUs.
>>>
>>> How much free CPU capacity does the system have?
>>>
>>> > As I understand, pgBackRest recommends having 2 full backups and then
>>> > having incremental or differential backups as per requirement. Does
>>> anyone
>>> > have any reference numbers on how much time a backup for such a DB
>>> would
>>> > usually take, just for reference. If I take a full backup every Sunday
>>> and
>>> > then incremental backups for the rest of the week, I believe the
>>> > incremental backups should not be a problem, but the full backup every
>>> > Sunday might not finish in time.
>>>
>>> pgBackRest scales extremely well; what's going to matter here is how
>>> much you can give it in terms of resources.

Re: pgBackRest for a 50 TB database

2023-10-03 Thread Abhishek Bhola
Hi Stephen

No, I did not try that. Let me try that now and report the numbers here,
both in terms of size and time taken.
Thanks for the suggestion.


On Tue, Oct 3, 2023 at 10:39 PM Stephen Frost  wrote:

> Greetings,
>
> On Mon, Oct 2, 2023 at 20:08 Abhishek Bhola <
> abhishek.bh...@japannext.co.jp> wrote:
>
>> As said above, I tested pgBackRest on my bigger DB and here are the
>> results.
>> Server on which this is running has the following config:
>> Architecture:  x86_64
>> CPU op-mode(s):32-bit, 64-bit
>> Byte Order:Little Endian
>> CPU(s):36
>> On-line CPU(s) list:   0-35
>> Thread(s) per core:1
>> Core(s) per socket:18
>> Socket(s): 2
>> NUMA node(s):  2
>>
>> Data folder size: 52 TB (has some duplicate files since it is restored
>> from tapes)
>> Backup is being written on to DELL Storage, mounted on the server.
>>
>> pgbackrest.conf with the following options enabled:
>> repo1-block=y
>> repo1-bundle=y
>> start-fast=y
>>
>
> Thanks for sharing!  Did you perhaps consider using zstd for the
> compression..?  You might find that you get similar compression in less
> time.
>
> Thanks.
>
> Stephen
>



Re: pgBackRest for a 50 TB database

2023-10-03 Thread Stephen Frost
Greetings,

On Mon, Oct 2, 2023 at 20:08 Abhishek Bhola 
wrote:

> As said above, I tested pgBackRest on my bigger DB and here are the
> results.
> Server on which this is running has the following config:
> Architecture:  x86_64
> CPU op-mode(s):32-bit, 64-bit
> Byte Order:Little Endian
> CPU(s):36
> On-line CPU(s) list:   0-35
> Thread(s) per core:1
> Core(s) per socket:18
> Socket(s): 2
> NUMA node(s):  2
>
> Data folder size: 52 TB (has some duplicate files since it is restored
> from tapes)
> Backup is being written on to DELL Storage, mounted on the server.
>
> pgbackrest.conf with the following options enabled:
> repo1-block=y
> repo1-bundle=y
> start-fast=y
>

Thanks for sharing!  Did you perhaps consider using zstd for the
compression..?  You might find that you get similar compression in less
time.
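In pgbackrest.conf terms that would be something like the following (level
3 is, I believe, the pgBackRest default for zstd; shown only to make it
explicit):

compress-type=zst
compress-level=3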

Thanks.

Stephen


Re: pgBackRest for a 50 TB database

2023-10-02 Thread Abhishek Bhola
Hello,

As said above, I tested pgBackRest on my bigger DB and here are the results.
Server on which this is running has the following config:
Architecture:  x86_64
CPU op-mode(s):32-bit, 64-bit
Byte Order:Little Endian
CPU(s):36
On-line CPU(s) list:   0-35
Thread(s) per core:1
Core(s) per socket:18
Socket(s): 2
NUMA node(s):  2

Data folder size: 52 TB (has some duplicate files since it is restored from
tapes)
Backup is being written on to DELL Storage, mounted on the server.

pgbackrest.conf with the following options enabled:
repo1-block=y
repo1-bundle=y
start-fast=y
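
For anyone reproducing this, the relevant parts of the file looked roughly
like the following (the stanza name and paths here are placeholders, and
process-max was varied per run as shown below):

[global]
repo1-path=/mnt/dell-storage/pgbackrest
repo1-block=y
repo1-bundle=y
start-fast=y
process-max=30

[mydb]
pg1-path=/var/lib/pgsql/data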


1. *Using process-max: 30, Time taken: ~26 hours*
full backup: 20230926-092555F
timestamp start/stop: 2023-09-26 09:25:55+09 / 2023-09-27
11:07:18+09
wal start/stop: 00010001AC0E0044 /
00010001AC0E0044
database size: 38248.9GB, database backup size: 38248.9GB
repo1: backup size: 6222.0GB

2. *Using process-max: 10, Time taken: ~37 hours*
 full backup: 20230930-190002F
timestamp start/stop: 2023-09-30 19:00:02+09 / 2023-10-02
08:01:20+09
wal start/stop: 00010001AC0E004E /
00010001AC0E004E
database size: 38248.9GB, database backup size: 38248.9GB
repo1: backup size: 6222.0GB

Hope these numbers can serve someone as a reference.

Thanks


On Mon, Aug 28, 2023 at 12:30 AM Abhishek Bhola <
abhishek.bh...@japannext.co.jp> wrote:

> Hi Stephen
>
> Thank you for the prompt response.
> Hearing it from you makes me more confident about rolling it out to PROD.
> I will have a discussion with the network team about it, hear what
> they have to say, and make an estimate accordingly.
>
> If you happen to know anyone using it at that size who has published
> their numbers, that would be great; if not, I will post mine once I set
> it up.
>
> Thanks for your help.
>
> Cheers,
> Abhishek
>
> On Mon, Aug 28, 2023 at 12:22 AM Stephen Frost  wrote:
>
>> Greetings,
>>
>> * Abhishek Bhola (abhishek.bh...@japannext.co.jp) wrote:
>> > I am trying to use pgBackRest for all my Postgres servers. I have
>> tested it
>> > on a sample database and it works fine. But my concern is for some of
>> the
>> > bigger DB clusters, the largest one being 50TB and growing by about
>> > 200-300GB a day.
>>
>> Glad pgBackRest has been working well for you.
>>
>> > I plan to mount NAS storage on my DB server to store my backup. The
>> server
>> > with 50 TB data is using DELL Storage underneath to store this data and
>> has
>> > 36 18-core CPUs.
>>
>> How much free CPU capacity does the system have?
>>
>> > As I understand, pgBackRest recommends having 2 full backups and then
>> > having incremental or differential backups as per requirement. Does
>> anyone
>> > have any reference numbers on how much time a backup for such a DB would
>> > usually take, just for reference. If I take a full backup every Sunday
>> and
>> > then incremental backups for the rest of the week, I believe the
>> > incremental backups should not be a problem, but the full backup every
>> > Sunday might not finish in time.
>>
>> pgBackRest scales extremely well; what's going to matter here is how
>> much you can give it in terms of resources.  The primary bottlenecks
>> will be CPU time for compression, network bandwidth for the NAS, and
>> storage bandwidth of the NAS and the DB filesystems.  Typically, CPU
>> time dominates due to the compression, though if you're able to give
>> pgBackRest a lot of those CPUs then you might get to the point of
>> running out of network bandwidth or storage bandwidth on your NAS.
>> We've certainly seen folks pushing upwards of 3TB/hr, so a 50TB backup
>> should be able to complete in less than a day.  Strongly recommend
>> taking an incremental backup more-or-less immediately after the full
>> backup to minimize the amount of WAL you'd have to replay on a restore.
>> Also strongly recommend actually doing serious restore tests of this
>> system to make sure you understand the process, have an idea how long
>> it'll take to restore the actual files with pgBackRest and then how long
>> PG will take to come up and replay the WAL generated during the backup.
>>
>> > I think converting a diff/incr backup to a full backup has been
>> discussed
>> here, but not yet
>> > implemented. If there is a workaround, please let me know. Or if
>> someone is
>> > simply using pgBackRest for a bigger DB (comparable to 50TB), please
>> share
>> > your experience with the exact numbers and config/schedule of backups. I
>> > know the easiest way would be to use it myself and find out, but since
>> it
>> > is a PROD DB, I wanted to get some ideas before starting.
>>
>> No, we haven't implemented that yet.  It's starting to come up higher in
>> our list of things we want to work on though.  There are risks to doing
>> such conversions that have to be considered.

Re: pgBackRest for a 50 TB database

2023-08-27 Thread o1bigtenor
On Sun, Aug 27, 2023 at 10:57 AM Abhishek Bhola
 wrote:
>
> Hi
>
> I am trying to use pgBackRest for all my Postgres servers. I have tested it 
> on a sample database and it works fine. But my concern is for some of the 
> bigger DB clusters, the largest one being 50TB and growing by about 200-300GB 
> a day.
>

Hopefully you are able to say something, but what kind of work is being
done that generates 200 to 300 GB per day?

Genomic research is one possibility I can think of.

(Sorry, just a curious George.)

Regards




Re: pgBackRest for a 50 TB database

2023-08-27 Thread Abhishek Bhola
Hi Stephen

Thank you for the prompt response.
Hearing it from you makes me more confident about rolling it out to PROD.
I will have a discussion with the network team about it, hear what they
have to say, and make an estimate accordingly.

If you happen to know anyone using it at that size who has published their
numbers, that would be great; if not, I will post mine once I set it up.

Thanks for your help.

Cheers,
Abhishek

On Mon, Aug 28, 2023 at 12:22 AM Stephen Frost  wrote:

> Greetings,
>
> * Abhishek Bhola (abhishek.bh...@japannext.co.jp) wrote:
> > I am trying to use pgBackRest for all my Postgres servers. I have tested
> it
> > on a sample database and it works fine. But my concern is for some of the
> > bigger DB clusters, the largest one being 50TB and growing by about
> > 200-300GB a day.
>
> Glad pgBackRest has been working well for you.
>
> > I plan to mount NAS storage on my DB server to store my backup. The
> server
> > with 50 TB data is using DELL Storage underneath to store this data and
> has
> > 36 18-core CPUs.
>
> How much free CPU capacity does the system have?
>
> > As I understand, pgBackRest recommends having 2 full backups and then
> > having incremental or differential backups as per requirement. Does
> anyone
> > have any reference numbers on how much time a backup for such a DB would
> > usually take, just for reference. If I take a full backup every Sunday
> and
> > then incremental backups for the rest of the week, I believe the
> > incremental backups should not be a problem, but the full backup every
> > Sunday might not finish in time.
>
> pgBackRest scales extremely well; what's going to matter here is how
> much you can give it in terms of resources.  The primary bottlenecks
> will be CPU time for compression, network bandwidth for the NAS, and
> storage bandwidth of the NAS and the DB filesystems.  Typically, CPU
> time dominates due to the compression, though if you're able to give
> pgBackRest a lot of those CPUs then you might get to the point of
> running out of network bandwidth or storage bandwidth on your NAS.
> We've certainly seen folks pushing upwards of 3TB/hr, so a 50TB backup
> should be able to complete in less than a day.  Strongly recommend
> taking an incremental backup more-or-less immediately after the full
> backup to minimize the amount of WAL you'd have to replay on a restore.
> Also strongly recommend actually doing serious restore tests of this
> system to make sure you understand the process, have an idea how long
> it'll take to restore the actual files with pgBackRest and then how long
> PG will take to come up and replay the WAL generated during the backup.
>
> > I think converting a diff/incr backup to a full backup has been discussed
> here, but not yet
> > implemented. If there is a workaround, please let me know. Or if someone
> is
> > simply using pgBackRest for a bigger DB (comparable to 50TB), please
> share
> > your experience with the exact numbers and config/schedule of backups. I
> > know the easiest way would be to use it myself and find out, but since it
> > is a PROD DB, I wanted to get some ideas before starting.
>
> No, we haven't implemented that yet.  It's starting to come up higher in
> our list of things we want to work on though.  There are risks to doing
> such conversions that have to be considered: it creates long
> dependencies on things all working because if there's a PG or pgBackRest
> bug or some way that corruption slipped in then that ends up getting
> propagated down.  If you feel really confident that your restore testing
> is good (full restore w/ PG replaying WAL, running amcheck across the
> entire restored system, then pg_dump'ing everything and restoring it
> into a new PG cluster to re-validate all constraints, doing additional
> app-level review and testing...) then that can certainly help with
> mitigation of the risks mentioned above.
>
> Overall though, yes, people certainly use pgBackRest for 50TB+ PG
> clusters.
>
> Thanks,
>
> Stephen
>



Re: pgBackRest for a 50 TB database

2023-08-27 Thread Stephen Frost
Greetings,

* Abhishek Bhola (abhishek.bh...@japannext.co.jp) wrote:
> I am trying to use pgBackRest for all my Postgres servers. I have tested it
> on a sample database and it works fine. But my concern is for some of the
> bigger DB clusters, the largest one being 50TB and growing by about
> 200-300GB a day.

Glad pgBackRest has been working well for you.

> I plan to mount NAS storage on my DB server to store my backup. The server
> with 50 TB data is using DELL Storage underneath to store this data and has
> 36 18-core CPUs.

How much free CPU capacity does the system have?

> As I understand, pgBackRest recommends having 2 full backups and then
> having incremental or differential backups as per requirement. Does anyone
> have any reference numbers on how much time a backup for such a DB would
> usually take, just for reference. If I take a full backup every Sunday and
> then incremental backups for the rest of the week, I believe the
> incremental backups should not be a problem, but the full backup every
> Sunday might not finish in time.

pgBackRest scales extremely well; what's going to matter here is how
much you can give it in terms of resources.  The primary bottlenecks
will be CPU time for compression, network bandwidth for the NAS, and
storage bandwidth of the NAS and the DB filesystems.  Typically, CPU
time dominates due to the compression, though if you're able to give
pgBackRest a lot of those CPUs then you might get to the point of
running out of network bandwidth or storage bandwidth on your NAS.
We've certainly seen folks pushing upwards of 3TB/hr, so a 50TB backup
should be able to complete in less than a day.  Strongly recommend
taking an incremental backup more-or-less immediately after the full
backup to minimize the amount of WAL you'd have to replay on a restore.
Also strongly recommend actually doing serious restore tests of this
system to make sure you understand the process, have an idea how long
it'll take to restore the actual files with pgBackRest and then how long
PG will take to come up and replay the WAL generated during the backup.
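
In command terms (the stanza name and scratch path are placeholders):

pgbackrest --stanza=mydb --type=incr backup   # right after the full completes

# restore drill onto a scratch data directory
pgbackrest --stanza=mydb --pg1-path=/restore/scratch/data restore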

> I think converting a diff/incr backup to a full backup has been discussed
> here, but not yet
> implemented. If there is a workaround, please let me know. Or if someone is
> simply using pgBackRest for a bigger DB (comparable to 50TB), please share
> your experience with the exact numbers and config/schedule of backups. I
> know the easiest way would be to use it myself and find out, but since it
> is a PROD DB, I wanted to get some ideas before starting.

No, we haven't implemented that yet.  It's starting to come up higher in
our list of things we want to work on though.  There are risks to doing
such conversions that have to be considered: it creates long
dependencies on things all working because if there's a PG or pgBackRest
bug or some way that corruption slipped in then that ends up getting
propagated down.  If you feel really confident that your restore testing
is good (full restore w/ PG replaying WAL, running amcheck across the
entire restored system, then pg_dump'ing everything and restoring it
into a new PG cluster to re-validate all constraints, doing additional
app-level review and testing...) then that can certainly help with
mitigation of the risks mentioned above.
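
The amcheck pass, for instance, can be driven per database with something
along these lines ("mydb" is a placeholder; heapallindexed=true is slower
but more thorough):

psql -d mydb -c "CREATE EXTENSION IF NOT EXISTS amcheck;"
psql -d mydb -c "SELECT bt_index_check(i.indexrelid, true)
                 FROM pg_index i
                 JOIN pg_class c ON c.oid = i.indexrelid
                 JOIN pg_am am ON am.oid = c.relam
                 WHERE am.amname = 'btree' AND i.indisvalid;"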

Overall though, yes, people certainly use pgBackRest for 50TB+ PG
clusters.

Thanks,

Stephen




pgBackRest for a 50 TB database

2023-08-27 Thread Abhishek Bhola
Hi

I am trying to use pgBackRest for all my Postgres servers. I have tested it
on a sample database and it works fine. But my concern is for some of the
bigger DB clusters, the largest one being 50TB and growing by about
200-300GB a day.

I plan to mount NAS storage on my DB server to store my backup. The server
with 50 TB data is using DELL Storage underneath to store this data and has
36 18-core CPUs.

As I understand, pgBackRest recommends having 2 full backups and then
having incremental or differential backups as per requirement. Does anyone
have any reference numbers on how much time a backup for such a DB would
usually take, just for reference. If I take a full backup every Sunday and
then incremental backups for the rest of the week, I believe the
incremental backups should not be a problem, but the full backup every
Sunday might not finish in time.
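
The schedule I have in mind is essentially this (a crontab sketch; the
stanza name and times are illustrative):

# full backup Sunday night, incrementals the other six days
0 1 * * 0   pgbackrest --stanza=mydb --type=full backup
0 1 * * 1-6 pgbackrest --stanza=mydb --type=incr backup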

I think converting a diff/incr backup to a full backup has been discussed
here, but not yet
implemented. If there is a workaround, please let me know. Or if someone is
simply using pgBackRest for a bigger DB (comparable to 50TB), please share
your experience with the exact numbers and config/schedule of backups. I
know the easiest way would be to use it myself and find out, but since it
is a PROD DB, I wanted to get some ideas before starting.

Thanks
Abhishek
