Re: pgBackRest for a 50 TB database

2023-12-19 Thread Abhishek Bhola
Hello Stephen

Just an update on this: after we deployed it on our PROD system, the results
were far better than in testing.
The full backup now takes only around 4-5 hours, and that has been the case
for the last 3 months or so.
full backup: 20231209-150002F
timestamp start/stop: 2023-12-09 15:00:02+09 / 2023-12-09
19:33:56+09
wal start/stop: 00010001DCC3008E /
00010001DCC300A6
database size: 32834.8GB, database backup size: 32834.8GB
repo1: backup size: 5096.4GB

Now a question: I restored this big DB and it all works fine. However, I was
wondering whether there is a way to disable the subscription on Postgres while
restoring the data using pgBackRest.
For example, I have been taking backups of this DB, which has an active
subscription. When I restore the DB for test purposes, I don't want the
subscription to be there. Is there any option to ignore the subscription?
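To make the question concrete, the manual workaround I currently have in mind
is roughly the following, run on the restored copy before opening it up (this
is only a sketch: it assumes the restored cluster is first started with
max_logical_replication_workers = 0 so the apply worker never connects to the
publisher, and my_sub is a placeholder subscription name):

-- the subscription must be disabled before its slot association can be cleared
ALTER SUBSCRIPTION my_sub DISABLE;
-- detach it from the replication slot so that dropping it does not touch the
-- slot on the production publisher
ALTER SUBSCRIPTION my_sub SET (slot_name = NONE);
DROP SUBSCRIPTION my_sub;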

Thanks

On Thu, Oct 5, 2023 at 10:19 PM Stephen Frost  wrote:

> Greetings,
>
> On Thu, Oct 5, 2023 at 03:10 Abhishek Bhola <
> abhishek.bh...@japannext.co.jp> wrote:
>
>> Here is the update with compress-type=zst in the config file
>> Process-max is still 30. *But it took longer than before, around 27 hours 50
>> mins.*
>>
>> full backup: 20231004-130621F
>> timestamp start/stop: 2023-10-04 13:06:21+09 / 2023-10-05
>> 15:56:03+09
>> wal start/stop: 00010001AC0E0054 /
>> 00010001AC0E0054
>> database size: 38249.0GB, database backup size: 38249.0GB
>> repo1: backup size: 5799.8GB
>>
>> Do you think I could be missing something?
>>
>
> Sounds like there’s something else which is the bottleneck once you have
> process-max at 30. I suspect you could reduce that process-max value and
> have around the same time still with zstd.  Ultimately if you want it to be
> faster then you’ll need to figure out what the bottleneck is (seemingly not
> CPU, unlikely to be memory, so that leaves network or storage) and address
> that.
>
> We’ve seen numbers approaching 10TB/hr with lots of processes and zstd and
> fast storage on high end physical hardware.
>
> Thanks,
>
> Stephen
>



Re: Read write performance check

2023-12-19 Thread veem v
Thank you.

It would really be helpful if such test scripts or similar setups are already
available. Could someone please point me to some docs, blogs, or sample
scripts on this?

On Wed, 20 Dec, 2023, 10:34 am Lok P,  wrote:

> As Rob mentioned, the syntax you posted is not correct. You need to
> process or read a certain batch of rows like 1000 or 10k etc. Not all 100M
> at one shot.
>
> But again your uses case seems common one considering you want to compare
> the read and write performance on multiple databases with similar table
> structure as per your usecase. So in that case, you may want to use some
> test scripts which others must have already done rather reinventing the
> wheel.
>
>
> On Wed, 20 Dec, 2023, 10:19 am veem v,  wrote:
>
>> Thank you.
>>
>> Yes, actually we are trying to compare and see what maximum TPS are we
>> able to reach with both of these row by row and batch read/write test. And
>> then afterwards, this figure may be compared with other databases etc with
>> similar setups.
>>
>>  So wanted to understand from experts here, if this approach is fine? Or
>> some other approach is advisable?
>>
>> I agree to the point that , network will play a role in real world app,
>> but here, we are mainly wanted to see the database capability, as network
>> will always play a similar kind of role across all databases. Do you
>> suggest some other approach to achieve this objective?
>>
>>
>> On Wed, 20 Dec, 2023, 2:42 am Peter J. Holzer,  wrote:
>>
>>> On 2023-12-20 00:44:48 +0530, veem v wrote:
>>> >  So at first, we need to populate the base tables with the necessary
>>> data (say
>>> > 100million rows) with required skewness using random functions to
>>> generate the
>>> > variation in the values of different data types. Then in case of row
>>> by row
>>> > write/read test , we can traverse in a cursor loop. and in case of
>>> batch write/
>>> > insert , we need to traverse in a bulk collect loop. Something like
>>> below and
>>> > then this code can be wrapped into a procedure and passed to the
>>> pgbench and
>>> > executed from there. Please correct me if I'm wrong.
>>>
>>> One important point to consider for benchmarks is that your benchmark
>>> has to be similar to the real application to be useful. If your real
>>> application runs on a different node and connects to the database over
>>> the network, a benchmark running within a stored procedure may not be
>>> very indicative of real performance.
>>>
>>> hp
>>>
>>> --
>>>_  | Peter J. Holzer| Story must make more sense than reality.
>>> |_|_) ||
>>> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
>>> __/   | http://www.hjp.at/ |   challenge!"
>>>
>>


Re: Read write performance check

2023-12-19 Thread Lok P
As Rob mentioned, the syntax you posted is not correct. You need to process
or read a certain batch of rows, say 1,000 or 10,000 at a time, not all 100M
in one shot.

But then again, your use case seems a common one, given that you want to
compare read and write performance across multiple databases with a similar
table structure. In that case, you may want to use test scripts that others
have already written rather than reinventing the wheel.
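
For example, a batched read is usually done with keyset pagination rather
than one huge query. A rough sketch, just to illustrate the idea (it assumes
an indexed integer primary key named id, and :last_id is a client-side
variable such as a pgbench/psql variable; both are placeholders for your
schema):

-- fetch the next batch of ~10k rows, walking the table in primary-key order
SELECT *
FROM base_table
WHERE id > :last_id
ORDER BY id
LIMIT 10000;
-- the client remembers max(id) from this batch and passes it as :last_id next time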


On Wed, 20 Dec, 2023, 10:19 am veem v,  wrote:

> Thank you.
>
> Yes, actually we are trying to compare and see what maximum TPS are we
> able to reach with both of these row by row and batch read/write test. And
> then afterwards, this figure may be compared with other databases etc with
> similar setups.
>
>  So wanted to understand from experts here, if this approach is fine? Or
> some other approach is advisable?
>
> I agree to the point that , network will play a role in real world app,
> but here, we are mainly wanted to see the database capability, as network
> will always play a similar kind of role across all databases. Do you
> suggest some other approach to achieve this objective?
>
>
> On Wed, 20 Dec, 2023, 2:42 am Peter J. Holzer,  wrote:
>
>> On 2023-12-20 00:44:48 +0530, veem v wrote:
>> >  So at first, we need to populate the base tables with the necessary
>> data (say
>> > 100million rows) with required skewness using random functions to
>> generate the
>> > variation in the values of different data types. Then in case of row by
>> row
>> > write/read test , we can traverse in a cursor loop. and in case of
>> batch write/
>> > insert , we need to traverse in a bulk collect loop. Something like
>> below and
>> > then this code can be wrapped into a procedure and passed to the
>> pgbench and
>> > executed from there. Please correct me if I'm wrong.
>>
>> One important point to consider for benchmarks is that your benchmark
>> has to be similar to the real application to be useful. If your real
>> application runs on a different node and connects to the database over
>> the network, a benchmark running within a stored procedure may not be
>> very indicative of real performance.
>>
>> hp
>>
>> --
>>_  | Peter J. Holzer| Story must make more sense than reality.
>> |_|_) ||
>> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
>> __/   | http://www.hjp.at/ |   challenge!"
>>
>


Re: Read write performance check

2023-12-19 Thread veem v
Thank you.

Yes, we are actually trying to see what maximum TPS we are able to reach with
both the row-by-row and the batch read/write tests. Afterwards, this figure
may be compared with other databases with similar setups.

So I wanted to understand from the experts here whether this approach is
fine, or whether some other approach is advisable.

I agree that the network will play a role in a real-world application, but
here we mainly want to see the database's capability, since the network will
play a similar kind of role across all databases. Do you suggest some other
approach to achieve this objective?


On Wed, 20 Dec, 2023, 2:42 am Peter J. Holzer,  wrote:

> On 2023-12-20 00:44:48 +0530, veem v wrote:
> >  So at first, we need to populate the base tables with the necessary
> data (say
> > 100million rows) with required skewness using random functions to
> generate the
> > variation in the values of different data types. Then in case of row by
> row
> > write/read test , we can traverse in a cursor loop. and in case of batch
> write/
> > insert , we need to traverse in a bulk collect loop. Something like
> below and
> > then this code can be wrapped into a procedure and passed to the pgbench
> and
> > executed from there. Please correct me if I'm wrong.
>
> One important point to consider for benchmarks is that your benchmark
> has to be similar to the real application to be useful. If your real
> application runs on a different node and connects to the database over
> the network, a benchmark running within a stored procedure may not be
> very indicative of real performance.
>
> hp
>
> --
>_  | Peter J. Holzer| Story must make more sense than reality.
> |_|_) ||
> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |   challenge!"
>


Re: Read write performance check

2023-12-19 Thread Peter J. Holzer
On 2023-12-20 00:44:48 +0530, veem v wrote:
>  So at first, we need to populate the base tables with the necessary data (say
> 100million rows) with required skewness using random functions to generate the
> variation in the values of different data types. Then in case of row by row
> write/read test , we can traverse in a cursor loop. and in case of batch 
> write/
> insert , we need to traverse in a bulk collect loop. Something like below and
> then this code can be wrapped into a procedure and passed to the pgbench and
> executed from there. Please correct me if I'm wrong.

One important point to consider for benchmarks is that your benchmark
has to be similar to the real application to be useful. If your real
application runs on a different node and connects to the database over
the network, a benchmark running within a stored procedure may not be
very indicative of real performance.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: Read write performance check

2023-12-19 Thread Rob Sargent

On 12/19/23 12:14, veem v wrote:

Thank you for the confirmation.

 So at first, we need to populate the base tables with the necessary 
data (say 100million rows) with required skewness using random 
functions to generate the variation in the values of different data 
types. Then in case of row by row write/read test , we can traverse in 
a cursor loop. and in case of batch write/insert , we need to traverse 
in a bulk collect loop. Something like below and then this code can be 
wrapped into a procedure and passed to the pgbench and executed from 
there. Please correct me if I'm wrong.


Also can you please guide how the batch(say batch size of ~1000) 
Insert can be written ?


-- Row by row write
 FOR i IN 1..total_rows LOOP
        data_row := (SELECT
            md5(random()::TEXT),
            floor(random() * 100)::INT,
            random() * 1000::NUMERIC,
            NOW()::TIMESTAMP
        );
        INSERT INTO BASE_TABLE(column1, column2, column3, column4)
        VALUES (data_row.column1, data_row.column2, data_row.column3, 
data_row.column4);

    END LOOP;

--Row by row read
BEGIN
    FOR i IN 1..total_rows LOOP
        -- Row by row read
        SELECT * INTO data_row FROM BASE_TABLE WHERE limit 1;
    END LOOP;
END;

This row-by-row read is guaranteed to be slow if there's no index on the 100M
rows.

-- Batch read
BEGIN
    -- Batch read
    OPEN data_set FOR SELECT * FROM BASE_TABLE LIMIT total_rows;
    CLOSE data_set;
END;

Does this batch read pull in the entire 100M-row table? And there is some
suspicious syntax in there.
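
For what it's worth, here is roughly what I'd expect a row-by-row read test to
look like, as a sketch only (it assumes an indexed integer primary key named
id, which is a placeholder for whatever key your table actually has):

    FOR i IN 1..total_rows LOOP
        -- each iteration fetches exactly one row via the primary-key index
        SELECT * INTO data_row FROM base_table WHERE id = i;
    END LOOP;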


PS: Notice that top posting is frowned upon on this list.





Re: Read write performance check

2023-12-19 Thread veem v
Thank you for the confirmation.

So at first we need to populate the base tables with the necessary data (say
100 million rows) with the required skewness, using random functions to
generate variation in the values of the different data types. Then, for the
row-by-row write/read test, we can traverse in a cursor loop, and for the
batch write/insert, we can traverse in a bulk-collect loop. Something like
the code below; it can then be wrapped into a procedure, passed to pgbench,
and executed from there. Please correct me if I'm wrong.

Also, can you please guide me on how the batch insert (say with a batch size
of ~1000) can be written? A rough sketch of what I have in mind follows the
code below.

-- Row by row write
 FOR i IN 1..total_rows LOOP
data_row := (SELECT
md5(random()::TEXT),
floor(random() * 100)::INT,
random() * 1000::NUMERIC,
NOW()::TIMESTAMP
);
INSERT INTO BASE_TABLE(column1, column2, column3, column4)
VALUES (data_row.column1, data_row.column2, data_row.column3,
data_row.column4);
END LOOP;

--Row by row read
BEGIN
FOR i IN 1..total_rows LOOP
-- Row by row read
SELECT * INTO data_row FROM BASE_TABLE WHERE limit 1;
END LOOP;
END;

-- Batch read
BEGIN
-- Batch read
OPEN data_set FOR SELECT * FROM BASE_TABLE LIMIT total_rows;
CLOSE data_set;
END;
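
For the batched insert specifically, the rough sketch I have in mind is the
following procedure (column names and types are placeholders matching the
table above; it commits after every batch, which requires invoking it with
CALL outside an explicit transaction block):

CREATE OR REPLACE PROCEDURE batch_insert_test(total_rows bigint,
                                              batch_size int DEFAULT 1000)
LANGUAGE plpgsql
AS $$
DECLARE
    inserted   bigint := 0;
    this_batch bigint;
BEGIN
    WHILE inserted < total_rows LOOP
        this_batch := least(batch_size, total_rows - inserted);
        -- one multi-row INSERT per batch instead of one INSERT per row
        INSERT INTO base_table (column1, column2, column3, column4)
        SELECT md5(random()::text),
               floor(random() * 100)::int,
               (random() * 1000)::numeric,
               now()
        FROM generate_series(1, this_batch);
        inserted := inserted + this_batch;
        COMMIT;  -- each batch becomes its own transaction
    END LOOP;
END;
$$;

And my understanding is that this could then be driven concurrently from a
custom pgbench script along the lines of the following (file name and values
are just placeholders):

-- rw_batch_test.sql, run e.g. as: pgbench -n -f rw_batch_test.sql -c 10 -T 300 mydb
CALL batch_insert_test(10000, 1000);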






On Tue, 19 Dec 2023 at 12:58, Ilya Kosmodemiansky  wrote:

> Hi Veem,
>
> On Tue, Dec 19, 2023 at 7:36 AM veem v  wrote:
> > 1)For write performance , the rows needs to be inserted from multiple
> sessions at same time, with required random values as per the data types
> i.e. Character, Number, date columns. And this needs to be tested for row
> by row insert and batched insert.
> >
> > 2)For the read performance test , the table first has to be populated
> with those ~100million rows. Then querying will happen on that table row by
> row and batched way from the concurrent session.
> >
> > I am new to postgresql but mostly worked with Oracle, so I wanted to
> understand if the above can be achieved by creating a simple procedure or
> will a major effort be needed? And I saw a few blogs , pgbench to be used
> for concurrency tests.
>
> Yes, you are right, pgbench with customized script is what you are looking
> for
>
> >I want to know if this will still work on Aurora postgresql from intellij
> client worksheet.
>
> pgbench would work with aurora as with normal postgres, it is
> basically a shell script which connects to the database. Not sure if
> idea worksheet would help you in such case however, you can run it
> just from any machine with pgbench installed
>
>
> Best regards,
> Ilya
>
>
> --
> Ilya Kosmodemiansky
> CEO, Founder
>
> Data Egret GmbH
> Your remote PostgreSQL DBA team
> T.: +49 6821 919 3297
> i...@dataegret.com
>


Re: psql crash with custom build on RedHat 7

2023-12-19 Thread Thomas Munro
On Wed, Dec 20, 2023 at 4:41 AM Dominique Devienne  wrote:
> On Tue, Dec 19, 2023 at 2:02 PM Thomas Munro  wrote:
>> On Wed, Dec 20, 2023 at 1:39 AM Dominique Devienne  
>> wrote:
>> > Program received signal SIGSEGV, Segmentation fault.
>> > 0x004232b8 in slash_yylex ()
>>
>> I think this might have to do with flex changing.  Does it help if you
>> "make maintainer-clean"?
>
> My colleague who did the custom build double-checked the flex/bison 
> requirements,
> and the version of the packages on the RH7 machine he built on, and they 
> check out (see below).
>
> He also tells me he builds debug and release versions off different 
> workspaces/checkouts,
> thus there are no remnants of previous builds, assuming that's what `make 
> maintainer-clean` is for.

OK but be warned that if you're using tarballs, we shipped lexer
remnants in the tree (until
https://github.com/postgres/postgres/commit/721856ff, an interesting
commit to read).  The slash lexer is a kind of extension that (IIRC)
shares the same PsqlScanState (opaque pointer to private lexer state),
but if these two things are compiled to C by different flex versions,
they may contain non-identical 'struct yyguts_t' (and even if the
structs were identical, what the code does with them might still be
incompatible, but I guess the struct itself would be a good first
thing to look at along with the versions mentioned near the top of the
.c):

src/fe_utils/psqlscan.l -> psqlscan.c
src/bin/psql/psqlscanslash.l -> psqlscanslash.c

The usual "clean" doesn't remove those .c files in PG < 17, which
means that if your pipeline involves tarballs but you finished up
regenerating one of the files, or some other sequence involving
different flex versions, you could get that.  I've seen it myself on a
few systems, a decade ago when I guess flex rolled out an incompatible
change (maybe contemporaneous with RHEL7) and flex was upgraded
underneath my feet.  I remember that "maintainer-clean" (or maybe I'm
misremembering and it was "distclean") fixed it.




Re: psql crash with custom build on RedHat 7

2023-12-19 Thread Dominique Devienne
On Tue, Dec 19, 2023 at 2:02 PM Thomas Munro  wrote:

> On Wed, Dec 20, 2023 at 1:39 AM Dominique Devienne 
> wrote:
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x004232b8 in slash_yylex ()
>
> I think this might have to do with flex changing.  Does it help if you
> "make maintainer-clean"?
>

My colleague who did the custom build double-checked the flex/bison
requirements and the versions of the packages on the RH7 machine he built on,
and they check out (see below).

He also tells me he builds the debug and release versions off different
workspaces/checkouts, so there are no remnants of previous builds, assuming
that's what `make maintainer-clean` is for.

Thanks, --DD

From the Build Manager:
Here, at https://www.postgresql.org/docs/current/install-requirements.html, I
read:
> Flex and Bison are needed to build from a Git checkout, or if you changed
the actual scanner and parser definition files.
> If you need them, be sure to get Flex 2.5.35 or later and Bison 2.3 or
later.

On the cf-re7-toolkits (RH7) machine where I built PostgreSQL 16.1, the system
packages:

$ rpm -qa | grep flex
flex-2.5.37-3.el7.x86_64
$ rpm -qa | grep bison
bison-3.0.4-1.el7.x86_64

are installed on the system, so they look good.


Re: psql crash with custom build on RedHat 7

2023-12-19 Thread Tom Lane
Thomas Munro  writes:
> On Wed, Dec 20, 2023 at 1:39 AM Dominique Devienne  
> wrote:
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x004232b8 in slash_yylex ()

> I think this might have to do with flex changing.  Does it help if you
> "make maintainer-clean"?

If that doesn't fix it, please build with --enable-debug so that you
can get a more detailed stack trace.

regards, tom lane




Re: psql crash with custom build on RedHat 7

2023-12-19 Thread Thomas Munro
On Wed, Dec 20, 2023 at 1:39 AM Dominique Devienne  wrote:
> Program received signal SIGSEGV, Segmentation fault.
> 0x004232b8 in slash_yylex ()

I think this might have to do with flex changing.  Does it help if you
"make maintainer-clean"?




psql crash with custom build on RedHat 7

2023-12-19 Thread Dominique Devienne
Hi. We've recently upgraded from libpq 15.2 to 16.1.
We custom-build PostgreSQL using the documented instructions and GCC 9.1 (from
RH7's dts9).
We used the same process for building 15.2 and 16.1.
But somehow psql crashes on any backslash command, while 15.2 works fine.
I've included the small backtrace below.
I've used \conninfo, but \dn crashes just the same.
Regular SQL, OTOH, is fine (I tried that before sending this email), so this
is specific to backslash commands.
At this point, we're not sure what's going on.
I've tried against a 14.8 server (as shown below), but also a 12.5 one,
same results.
So it seems related to the client side and how it was compiled, not the
server side.

16.1 (custom) built on Windows, or on RH8 with GCC 12, works fine OTOH.

Would anyone have a clue why 16.1 on RH7 would fail as shown below?
Were there any specific changes between 15.2 and 16.1 that could explain
this behavior?

Another data point: our own apps built using our custom-built libpq on RH7
(the same one used by psql; see ldd below) pass all their unit tests, with no
crashes.

Thanks for any help, clues, anything that might help. Thanks, --DD

[ddevienne@marsu SharedComponents]$ ldd
.../postgresql/16.1/Linux_x64_2.17_gcc91/bin/psql
linux-vdso.so.1 =>  (0x7ffcf9cb1000)
libpq.so.5 =>
.../postgresql/16.1/Linux_x64_2.17_gcc91//lib/libpq.so.5
(0x7f278e8e6000)
libreadline.so.6 => /lib64/libreadline.so.6 (0x7f278e4d4000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x7f278e2b8000)
librt.so.1 => /lib64/librt.so.1 (0x7f278e0b)
libm.so.6 => /lib64/libm.so.6 (0x7f278ddae000)
libc.so.6 => /lib64/libc.so.6 (0x7f278d9e1000)
libssl.so.10 => /lib64/libssl.so.10 (0x7f278d77)
libcrypto.so.10 => /lib64/libcrypto.so.10 (0x7f278d30f000)
libtinfo.so.5 => /lib64/libtinfo.so.5 (0x7f278d0e5000)
/lib64/ld-linux-x86-64.so.2 (0x7f278e71a000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2
(0x7f278ce98000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x7f278cbb)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x7f278c9ac000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x7f278c779000)
libdl.so.2 => /lib64/libdl.so.2 (0x7f278c575000)
libz.so.1 => /lib64/libz.so.1 (0x7f278c35f000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0
(0x7f278c151000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x7f278bf4d000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x7f278bd34000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x7f278bb0d000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x7f278b8ab000)
[ddevienne@marsu SharedComponents]$ cat /etc/redhat-release
Red Hat Enterprise Linux Workstation release 7.5 (Maipo)

[ddevienne@marsu SharedComponents]$ gdb
.../postgresql/16.1/Linux_x64_2.17_gcc91/bin/psql
GNU gdb (GDB) Red Hat Enterprise Linux 8.3-3.el7
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from .../postgresql/16.1/Linux_x64_2.17_gcc91/bin/psql...
(No debugging symbols found in
.../postgresql/16.1/Linux_x64_2.17_gcc91/bin/psql)
(gdb) run postgresql://ddevienne@db/migrated
Starting program: .../postgresql/16.1/Linux_x64_2.17_gcc91/bin/psql
postgresql://ddevienne@db/migrated
Missing separate debuginfos, use: debuginfo-install
glibc-2.17-222.el7.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Password for user ddevienne:
psql (16.1, server 14.8)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384,
compression: off)
Type "help" for help.

migrated=> \conninfo

Program received signal SIGSEGV, Segmentation fault.
0x004232b8 in slash_yylex ()
Missing separate debuginfos, use: debuginfo-install
keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-18.el7.x86_64
libcom_err-1.42.9-11.el7.x86_64 libselinux-2.5-12.el7.x86_64
ncurses-libs-5.9-14.20130511.el7_4.x86_64 openssl-libs-1.0.2k-12.el7.x86_64
pcre-8.32-17.el7.x86_64 readline-6.2-10.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  0x004232b8 in slash_yylex ()
#1  0x0042456b in psql_scan_slash_command ()
#2  0x0040d56f in HandleSlashCmds ()
#3  0x00421d63 in MainLoop ()
#4