Re: "PANIC: could not open critical system index 2662" - twice

2023-04-12 Thread Evgeny Morozov
On 12/04/2023 2:35 am, Michael Paquier wrote:
> initdb does not enable checksums by default, requiring
> -k/--data-checksums, so likely this addition comes from your
> environment.

Indeed, turns out we had it in init_db_options.


>> However, the docs say "Only
>> data pages are protected by checksums; internal data structures and
>> temporary files are not.", so I guess pg_class_oid_index might be an
>> "internal data structure"?
> pg_class_oid_index is a btree index that relies on 8k on-disk pages
> (default size), so it is subject to the same rules as normal relations
> regarding checksums for the pages flushed to disk, even if it is on a
> catalog.

OK, so then what does that mean for the error in the subject? At what
point should that problem have been detected by the data checksums?





Re: Guidance on INSERT RETURNING order

2023-04-12 Thread Mike Bayer
Here's what would be cool: a new token called "tuple_order" or something.


INSERT INTO table (a, b, c) VALUES (1, 2, 3), (4, 5, 6), ... RETURNING 
table.id, inserted.tuple_order

tuple_order would be incrementing values 1, 2, 3, 4, 5, ... which correlate 
each row delivered by RETURNING to each entry in the VALUES clause, in the 
order they were stated in that VALUES clause; that is, entry (1, 2, 3) would be 
tuple_order 1, entry (4, 5, 6) would be tuple_order 2, etc.

That token would solve the entire problem in all cases. The 1800 line 
changeset I've been working on all week would go away (if every database had 
this) and the change would be reduced to a few dozen lines.
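For comparison, a sketch of how the proposed token might sit next to the workaround available today (table and column names are invented for illustration, and "tuple_order" is purely hypothetical syntax, implemented nowhere):

```sql
-- Hypothetical: not valid in any database today.
INSERT INTO mytable (a, b, c)
VALUES (1, 2, 3), (4, 5, 6)
RETURNING mytable.id, inserted.tuple_order;

-- Closest equivalent currently available: thread a sentinel column
-- through an ordered SELECT, then sort the RETURNING rows by the
-- generated id (assuming ids increment with the SELECT's order).
INSERT INTO mytable (a, b, c)
SELECT p1, p2, p3
FROM (VALUES (1, 2, 3, 1), (4, 5, 6, 2)) AS sel(p1, p2, p3, num)
ORDER BY num
RETURNING mytable.id;
```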


On Wed, Apr 12, 2023, at 11:05 PM, John Howroyd wrote:
> To clarify, the work around (for SQLAlchemy) does not require the RETURNING 
> values to be ordered; they will be reordered code side.  Longer term matching 
> the order up would be an added bonus!
> 
> On Thu, 13 Apr 2023 at 04:02, John Howroyd  wrote:
>> The ideal solution would be that the RETURNING values are in the same order 
>> as the declared rows in the table literal "FROM (VALUES (10, 11, 1), (12, 
>> 13, 2)) AS sel(p1, p2, num)".  But before any of you jump down my throat 
>> about sets and a bar to parallel inserts, I think it's a small matter 
>> of some internal smoke and mirrors: simply marshal the RETURNING results 
>> through an in-memory/temporary table maintaining the order from the SELECT 
>> and return these to the client upon commit.  At this moment, I believe the 
>> folk at SQLAlchemy are looking for a workaround that can work for most use 
>> cases with either an identity column or a fake "sentinel" column for 
>> matching back to the ORM objects.  There may be an upper limit of 1000 to 
>> the number of rows in a single insert (this arises in MS SQL Server, which 
>> absolutely does not guarantee the order of its OUTPUT clause -- it's tied to 
>> the physical row insert rather than marshalled).
>> 
>> My feeling is that PostgreSQL may already do this.  See the commit:
>> https://github.com/postgres/postgres/commit/9118d03a8cca3d97327c56bf89a72e328e454e63
>> and the original message thread
>> https://www.postgresql.org/message-id/flat/CADBwkHv1nUx5bqxqRLp26k8iUWSnY3uW-1_0EMYxKns0t2JnUA%40mail.gmail.com
>> and references to undesired behaviour prior to PostgreSQL 9.6 such as in 
>> https://www.postgresql.org/docs/current/sql-select.html.
>> 
>> The test cases in the above commit use syntax of the form:
>> SELECT nextval('sequence'), datavals FROM ... ORDER BY something;
>> And I presume that's exactly what's happening internally when 
>> there's a serial identity column (via query rewrite rules or another 
>> mechanism).
>> 
>> So really, that just needs confirming.  What happens with triggers?  
>> Presumably the same as long as query rewrite rules are used, but this is 
>> presumably getting deeper into the code for actual inserts after the initial 
>> select.  The jump to the output being ordered, is just a guess but there's 
>> some bits and pieces that seem to suggest that there may indeed be a 
>> marshalling process going on in the background (whether or not that is 
>> linked to the original order is another matter).
>> 
>> I have set up a PostgreSQL server to test if I can break this hypothesis and 
>> see what query explains can allude to.  Does anyone have a test case where 
>> the order is not preserved?
>> 
>> Might I also point out that if the implementation of parallel inserts does 
>> create a bar then doing so may end up with the programmatic interfaces (such 
>> as SQLAlchemy) not being able to use that feature (possibly reverting back 
>> to single inserts).  Er, so what would be the problem being solved with 
>> parallel inserts?
>> 
>> On Thu, 13 Apr 2023 at 02:27, Mike Bayer  wrote:
>>> We do something different for uuids.  These are usually created client side 
>>> anyway or are very easy to create client side if not and then we also have 
>>> a separate sentinel column option. 
>>> 
>>> Overall the whole thing is an optimization that can be turned off for 
>>> uncommon cases.   We just can't go back to having insert of 1000 rows be 
>>> 3-5x slower for all tables w server generated primary keys which is what 
>>> statement at a time gets us.  Surrogate integer Pks w sequence or identity 
>>> is the big case.
>>> 
>>> On Wed, Apr 12, 2023, at 8:58 PM, Thorsten Glaser wrote:
>>> > On Wed, 12 Apr 2023, Mike Bayer wrote:
>>> >
>>> >>ascending values for "mytable_id_seq" will correspond to the ascending
>>> >>values for "num".
>>> >
>>> > But, again, what if it’s uuid or something instead of a sequence?
>>> >
>>> > bye,
>>> > //mirabilos
>>> > -- 
>>> > 15:41⎜ Somebody write a testsuite for helloworld 
>>> > :-)
>>> 


Re: pg_basebackup / recovery

2023-04-12 Thread Achilleas Mantzios

On 13/4/23 01:31, Michael Paquier wrote:

On Wed, Apr 12, 2023 at 01:45:56PM +0300, Achilleas Mantzios - cloud wrote:

On 4/12/23 12:32, Fabrice Chapuis wrote:

During the recovery process of a self-contained backup, how does postgres know
to stop reading wal when consistency is reached?

Because it knows the full packup info. It will observe the

STOP WAL LOCATION: 3BC7/4B000130 (file 00023BC7004B)

inside the backup file

There is a bit more to that in the recovery logic, depending mostly
on the presence of a backup_label file in the data folder when recovery
begins.  Once the backup_label is found at the beginning of recovery,
its information is stored in the control file and the file is renamed
to backup_label.old; hence, stopping the server when recovery has not
reached its expected point would rely on the control file contents
later on.  Then, the startup process and its WAL redo makes sure that
WAL replays until it finds the WAL record marking the end of the
backup.  Grepping for XLOG_BACKUP_END (WAL record type in this case)
shows all the areas that rely on that, and xlogrecovery.c covers the
most relevant bits.


Thank you for the info!

Sorry about my stupid typo "packup" :(


--
Michael


--
Achilleas Mantzios
 IT DEV - HEAD
 IT DEPT
 Dynacom Tankers Mgmt





Re: Guidance on INSERT RETURNING order

2023-04-12 Thread John Howroyd
To clarify, the work around (for SQLAlchemy) does not require the RETURNING
values to be ordered; they will be reordered code side.  Longer term
matching the order up would be an added bonus!

On Thu, 13 Apr 2023 at 04:02, John Howroyd  wrote:

> The ideal solution would be that the RETURNING values are in the same
> order as the declared rows in the table literal "FROM (VALUES (10, 11, 1),
> (12, 13, 2)) AS sel(p1, p2, num)".  But before any of you jump down my
> throat about sets and a bar to parallel inserts, I think it's a small
> matter of some internal smoke and mirrors: simply marshal the RETURNING
> results through an in-memory/temporary table maintaining the order from
> the SELECT and return these to the client upon commit.  At this moment, I
> believe the folk at SQLAlchemy are looking for a workaround that can work
> for most use cases with either an identity column or a fake "sentinel"
> column for matching back to the ORM objects.  There may be an upper limit
> of 1000 to the number of rows in a single insert (this arises in MS SQL
> Server, which absolutely does not guarantee the order of its OUTPUT clause --
> it's tied to the physical row insert rather than marshalled).
>
> My feeling is that PostgreSQL may already do this.  See the commit:
>
> https://github.com/postgres/postgres/commit/9118d03a8cca3d97327c56bf89a72e328e454e63
> and the original message thread
>
> https://www.postgresql.org/message-id/flat/CADBwkHv1nUx5bqxqRLp26k8iUWSnY3uW-1_0EMYxKns0t2JnUA%40mail.gmail.com
> and references to undesired behaviour prior to PostgreSQL 9.6 such as in
> https://www.postgresql.org/docs/current/sql-select.html.
>
> The test cases in the above commit use syntax of the form:
> SELECT nextval('sequence'), datavals FROM ... ORDER BY something;
> And I presume that's exactly what's happening internally
> when there's a serial identity column (via query rewrite rules or another
> mechanism).
>
> So really, that just needs confirming.  What happens with triggers?
> Presumably the same as long as query rewrite rules are used, but this is
> presumably getting deeper into the code for actual inserts after the
> initial select.  The jump to the output being ordered, is just a guess but
> there's some bits and pieces that seem to suggest that there may indeed be
> a marshalling process going on in the background (whether or not that is
> linked to the original order is another matter).
>
> I have set up a PostgreSQL server to test if I can break this hypothesis
> and see what query explains can allude to.  Does anyone have a test case
> where the order is not preserved?
>
> Might I also point out that if the implementation of parallel inserts does
> create a bar then doing so may end up with the programmatic interfaces
> (such as SQLAlchemy) not being able to use that feature (possibly reverting
> back to single inserts).  Er, so what would be the problem being solved
> with parallel inserts?
>
> On Thu, 13 Apr 2023 at 02:27, Mike Bayer  wrote:
>
>> We do something different for uuids.  These are usually created client
>> side anyway or are very easy to create client side if not and then we also
>> have a separate sentinel column option.
>>
>> Overall the whole thing is an optimization that can be turned off for
>> uncommon cases.   We just can't go back to having insert of 1000 rows be
>> 3-5x slower for all tables w server generated primary keys which is what
>> statement at a time gets us.  Surrogate integer Pks w sequence or identity
>> is the big case.
>>
>> On Wed, Apr 12, 2023, at 8:58 PM, Thorsten Glaser wrote:
>> > On Wed, 12 Apr 2023, Mike Bayer wrote:
>> >
>> >>ascending values for "mytable_id_seq" will correspond to the ascending
>> >>values for "num".
>> >
>> > But, again, what if it’s uuid or something instead of a sequence?
>> >
>> > bye,
>> > //mirabilos
>> > --
>> > 15:41⎜ Somebody write a testsuite for
>> helloworld :-)
>>
>>
>>


Re: Guidance on INSERT RETURNING order

2023-04-12 Thread John Howroyd
The ideal solution would be that the RETURNING values are in the same order
as the declared rows in the table literal "FROM (VALUES (10, 11, 1), (12,
13, 2)) AS sel(p1, p2, num)".  But before any of you jump down my throat
about sets and a bar to parallel inserts, I think it's a small matter
of some internal smoke and mirrors: simply marshal the RETURNING results
through an in-memory/temporary table maintaining the order from the SELECT
and return these to the client upon commit.  At this moment, I believe the
folk at SQLAlchemy are looking for a workaround that can work for most use
cases with either an identity column or a fake "sentinel" column for
matching back to the ORM objects.  There may be an upper limit of 1000 to
the number of rows in a single insert (this arises in MS SQL Server, which
absolutely does not guarantee the order of its OUTPUT clause -- it's tied
to the physical row insert rather than marshalled).

My feeling is that PostgreSQL may already do this.  See the commit:
https://github.com/postgres/postgres/commit/9118d03a8cca3d97327c56bf89a72e328e454e63
and the original message thread
https://www.postgresql.org/message-id/flat/CADBwkHv1nUx5bqxqRLp26k8iUWSnY3uW-1_0EMYxKns0t2JnUA%40mail.gmail.com
and references to undesired behaviour prior to PostgreSQL 9.6 such as in
https://www.postgresql.org/docs/current/sql-select.html.

The test cases in the above commit use syntax of the form:
SELECT nextval('sequence'), datavals FROM ... ORDER BY something;
And I presume that's exactly what's happening internally when
there's a serial identity column (via query rewrite rules or another
mechanism).

So really, that just needs confirming.  What happens with triggers?
Presumably the same as long as query rewrite rules are used, but this is
presumably getting deeper into the code for actual inserts after the
initial select.  The jump to the output being ordered, is just a guess but
there's some bits and pieces that seem to suggest that there may indeed be
a marshalling process going on in the background (whether or not that is
linked to the original order is another matter).

I have set up a PostgreSQL server to test if I can break this hypothesis
and see what query explains can allude to.  Does anyone have a test case
where the order is not preserved?
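For anyone wanting to probe this, a minimal sketch of such a test case might look as follows (the table name is invented; whether the serial values really track the ORDER BY is exactly the open question):

```sql
CREATE TABLE t (id serial PRIMARY KEY, v int);

-- If serial values are assigned in the SELECT's order, then sorting the
-- RETURNING rows by id should reproduce the ORDER BY num order
-- (v = 30, 10, 20), regardless of the physical insertion order.
INSERT INTO t (v)
SELECT p
FROM (VALUES (10, 2), (20, 3), (30, 1)) AS sel(p, num)
ORDER BY num
RETURNING id, v;
```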

Might I also point out that if the implementation of parallel inserts does
create a bar then doing so may end up with the programmatic interfaces
(such as SQLAlchemy) not being able to use that feature (possibly reverting
back to single inserts).  Er, so what would be the problem being solved
with parallel inserts?

On Thu, 13 Apr 2023 at 02:27, Mike Bayer  wrote:

> We do something different for uuids.  These are usually created client
> side anyway or are very easy to create client side if not and then we also
> have a separate sentinel column option.
>
> Overall the whole thing is an optimization that can be turned off for
> uncommon cases.   We just can't go back to having insert of 1000 rows be
> 3-5x slower for all tables w server generated primary keys which is what
> statement at a time gets us.  Surrogate integer Pks w sequence or identity
> is the big case.
>
> On Wed, Apr 12, 2023, at 8:58 PM, Thorsten Glaser wrote:
> > On Wed, 12 Apr 2023, Mike Bayer wrote:
> >
> >>ascending values for "mytable_id_seq" will correspond to the ascending
> >>values for "num".
> >
> > But, again, what if it’s uuid or something instead of a sequence?
> >
> > bye,
> > //mirabilos
> > --
> > 15:41⎜ Somebody write a testsuite for helloworld
> :-)
>
>
>


Re: [E] Re: parallel aggregation

2023-04-12 Thread David Rowley
On Thu, 13 Apr 2023 at 14:31, Alexander Saydakov  wrote:
> 1. I wonder if there is a clean separation between the phases: once partial 
> aggregation is done, only combining is done after that (the state transition 
> function never receives results of combining).

Currently the transfn won't be called again on a state that has had
the combinefn called on it.  I can't think of a future reason that we
might change that. My imagination might be lacking, however.

> 2. Can a combiner output also go through serial-deserial before hitting 
> another combiner or finalizer?

Not currently. However, I *can* think of reasons why that might change
in the future.  If we wanted to accept partially aggregated results
from foreign servers and then combine them locally then, if those
foreign servers did parallel aggregation, the foreign server's
combined states would need to be serialised before sending over the
network.  It feels like just a matter of time before we grow the
ability to do that. Lots of work has been done on foreign data
wrappers in the past few years. It feels like it has tailed off a bit,
but I wouldn't be surprised if we had the ability to do that in the
next few years.

David




Re: [E] Re: parallel aggregation

2023-04-12 Thread David Rowley
(Let's keep this on this list)

On Thu, 13 Apr 2023 at 12:08, Alexander Saydakov  wrote:
>
> Yes, I am talking about a custom aggregate in C++ with internal state type.
> You did not really answer my first question. Does the state always go through 
> serialfunc-deserialfunc before reaching a combiner?

Well, I think maybe you asked the question wrongly.  The answer I gave
was "No" because the serial and deserial functions are only used for
internal typed aggregates.  But if the question is "are serial and
deserial functions always used for internal typed aggregates between
the Partial and Finalize phase", the answer is "Yes", they are
*currently*. I wouldn't want to rely on that staying true forever,
however.  I could think of a couple of reasons that this could change
in the future:

1) Partition-wise aggregates don't really require it.  Someone might
submit a patch that allows the Partial Aggregate phase just to return
a pointer to memory and have the Finalize Aggregate state just work on
that pointer directly rather than having its own copy.
2) If we ever changed away from the process model into a threading
model then we *may* consider not performing serialisation as an
optimisation.

Even if we ever did those 2, we might still need serial/deserial
functions for sending the states to other servers.  That's something
we don't currently do as there's no SQL-level way to express "just
give me the raw states and don't call the final functions".
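For reference, the function slots being discussed here map onto the CREATE AGGREGATE options; a sketch (the my_* support functions are placeholders for C-level functions you would provide, not real functions):

```sql
CREATE AGGREGATE my_agg(anyelement) (
    sfunc        = my_sfunc,      -- transition function: consumes raw input rows
    stype        = internal,      -- opaque pointer state
    combinefunc  = my_combine,    -- merges two partial states
    serialfunc   = my_serial,     -- internal -> bytea, crossing worker boundaries
    deserialfunc = my_deserial,   -- bytea -> internal
    finalfunc    = my_final,      -- produces the aggregate's result
    parallel     = safe
);
```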

> The type "internal" in C++ is just a pointer. So it can be an arbitrary data 
> structure. For some aggregations it would be better to have different state 
> types in the first phase of the aggregation (processing raw data using state 
> transition function) and the second phase (combining partial results). So I 
> wonder if there is a clean separation between the phases: once partial 
> aggregation is done the results go through serial-deserial barrier and only 
> combining is done after that (sfunc never receives results of combining). If 
> so, the question remains how to make finalfunc understand both states: 
> directly from sfunc if there is no partial aggregation, and from the combiner.
> Can a combiner receive results of another combiner? Can a combiner output 
> also go through serial-deserial?

You have to remember that the final func can be called without the
state ever having gone through the combine func. This is what happens
in non-parallel aggregation.  Also, think of the case with > 2
parallel workers.  There might be more than 2 states to combine for
any given group.  So the combine function must be able to operate on
aggregate states that have already been combined from other states.

You could just do something similar to how we handle NodeTags in
PostgreSQL. Effectively all Node typed structs have a NodeTag field at
the start of the struct.  This is just an enum that code can look at
to determine the node type of whichever pointer it is looking at.
Perhaps you can get away with coding your aggregate function's
component functions in a way that can handle both types, you'd just
need to look at the first 4 bytes of the pointer so you know what to
do. In Postgres, we have an IsA macro to help us with that.  Have a
look at nodes.h.

David




Re: Guidance on INSERT RETURNING order

2023-04-12 Thread Mike Bayer
We do something different for uuids.  These are usually created client side 
anyway, or are very easy to create client side if not, and then we also have a 
separate sentinel column option. 

Overall the whole thing is an optimization that can be turned off for uncommon 
cases.  We just can't go back to having an insert of 1000 rows be 3-5x slower 
for all tables w server generated primary keys, which is what statement-at-a-time 
gets us.  Surrogate integer PKs w sequence or identity is the big case.

On Wed, Apr 12, 2023, at 8:58 PM, Thorsten Glaser wrote:
> On Wed, 12 Apr 2023, Mike Bayer wrote:
>
>>ascending values for "mytable_id_seq" will correspond to the ascending
>>values for "num".
>
> But, again, what if it’s uuid or something instead of a sequence?
>
> bye,
> //mirabilos
> -- 
> 15:41⎜ Somebody write a testsuite for helloworld :-)




Re: Guidance on INSERT RETURNING order

2023-04-12 Thread Adrian Klaver

On 4/12/23 17:58, Thorsten Glaser wrote:

On Wed, 12 Apr 2023, Mike Bayer wrote:


ascending values for "mytable_id_seq" will correspond to the ascending
values for "num".


But, again, what if it’s uuid or something instead of a sequence?


This for an ORM that will be using a sequence.


bye,
//mirabilos


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Guidance on INSERT RETURNING order

2023-04-12 Thread Thorsten Glaser
On Wed, 12 Apr 2023, Mike Bayer wrote:

>ascending values for "mytable_id_seq" will correspond to the ascending
>values for "num".

But, again, what if it’s uuid or something instead of a sequence?

bye,
//mirabilos
-- 
15:41⎜ Somebody write a testsuite for helloworld :-)




Re: FW: Error!

2023-04-12 Thread Adrian Klaver

On 4/12/23 14:47, Arquimedes Aguirre wrote:

Hello Dear!

I attach the errors that the system threw me when installing the 
Postgresql program.


The first screenshot is about pgAdmin, which is a Postgres client, not 
Postgres. Also it is from 2023-02-19, so has anything changed since then?


The second screenshot shows you actually connecting to Postgres so I 
don't see an error.




I was installing it on the same website as the link you sent me.

I am using the same password that I used to create the password and 
despite that it throws me the same error.


Thanks! I hope you can help me solve this error that does not allow me 
to move forward, sorry for the inconveniences.


Thanks again, best regards!

Sent from Mail for Windows

*From: *Adrian Klaver 
*Sent: *Wednesday, April 12, 2023 10:27 AM
*To: *Arquimedes Aguirre ; pgsql-general 
list 

*Subject: *Re: FW: Error!

On 4/10/23 17:55, Arquimedes Aguirre wrote:
 > Sent from Mail for Windows

 >
 > *From: *Arquimedes Aguirre >

 > *Sent: *Sunday, April 9, 2023 5:59 PM
 > *To: *pgsql-advoc...@postgresql.org 
>

 > *Subject: *Error
 >
 > Hello there Dear!
 >
 > I have an error initializing PSQL shell, who can help me to solve it.
 >
 > Please I require your attention, I need to start developing a project,
 > apparently there is a hacker that is intervening in this process, I
 > appreciate your investigation some research and answer.
 >
 > Just to install the program, I got several errors without any common
 > sense, I asked the Microsoft support team for help and in 5 minutes they
 > solved the problem, it's very weird and odd. Apparently they have
 > control of my personal computer.

If MS support solved it why are you asking how to solve it?

FYI, do not use images for textual information, copy and paste the text
from the terminal.

I don't see any evidence of a hacker. I do see evidence of incorrect
passwords being used.

Did you install Postgres from here?:

https://www.enterprisedb.com/downloads/postgres-postgresql-downloads 



If so, at some point in the process you were asked to create a password.
Use that password when invoking the psql shell command with the user
being the default postgres.


 > I hope that you can pay due attention to these problems that do not
 > allow me to advance in what I want to develop for my project. I hope you
 > can find the person responsible for these abnormalities.
 >
 > Thanks!
 >
 > Sent from Mail for Windows

 >

--
Adrian Klaver
adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Guidance on INSERT RETURNING order

2023-04-12 Thread Mike Bayer


On Wed, Apr 12, 2023, at 6:24 PM, Thorsten Glaser wrote:
> On Wed, 12 Apr 2023, Kirk Wolak wrote:
> 
> >The solution exists.  Pre-fetch the IDs, assign them and insert them with
> >the IDs.  Then you have 100% control.
> >SELECT NEXTVAL('tbl_seq') from GENERATE_SERIES(1, );
> 
> This would be a nice solution… but the PK could be always generated,
> or not even sequential (UUIDs for example). If you’re developing
> schema-first the application would not even (need to) know about how
> the IDs are generated as it’s the DB that generates them.
> 
> AIUI the OP’s an SQLAlchemy developer, which is an ORM. Therefore he
> needs a generic solution.

In my view I already have the solution, which is the form I referred to in my 
earlier email:

INSERT INTO mytable (a, b)
SELECT p1, p2 FROM (VALUES (10, 11, 1), (12, 13, 2)) AS sel(p1, p2, num)
ORDER BY num
RETURNING mytable.id

The columns we want to INSERT, a and b, come from p1 and p2 in the SELECT.  The 
third value in each table-row value inside the VALUES clause we call the 
"sentinel", and we apply it as an incrementing number which we also ORDER BY.


per Tom Lane's email at 
https://www.postgresql.org/message-id/29386.1528813619%40sss.pgh.pa.us in 2018, 
assuming mytable.id is serial (or I would assume IDENTITY), this resolves to 
something equivalent to:

INSERT INTO mytable (id, a, b)
SELECT nextval('mytable_id_seq'), p1, p2 FROM (
    SELECT p1, p2 FROM (VALUES (10, 11, 1), (12, 13, 2)) AS sel(p1, p2, num)
    ORDER BY num
) AS ss RETURNING mytable.id


So the above is one part we want to confirm is accurate, hasn't changed, and is 
not going to change much as far as its ultimate behavior.  It means that as rows 
are generated from the SELECT, the ascending values for "mytable_id_seq" will 
correspond to the ascending values for "num".  That's the important part; we 
are sending in a number that controls the sorting of records as they are 
delivered outwards, and the generated identifiers, assuming they are 
incrementing, will sort in that same way as each value is associated with the 
rows emanating from the inner SELECT.  It does not matter if the sequence has 
gaps in it from other transactions or whatever, only that it increments in the 
same direction as "num" which we pass in.
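Under that assumption, the correspondence can even be exploited entirely server side; as a sketch (not a tested recipe), wrapping the INSERT in a CTE and ordering its RETURNING output by the generated ids reproduces the VALUES order:

```sql
WITH ins AS (
    INSERT INTO mytable (a, b)
    SELECT p1, p2
    FROM (VALUES (10, 11, 1), (12, 13, 2)) AS sel(p1, p2, num)
    ORDER BY num
    RETURNING mytable.id
)
-- Sorting by the sequence-generated id recovers the VALUES order,
-- assuming the ids were assigned in ascending order of "num".
SELECT id FROM ins ORDER BY id;
```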

In this sequence of events, the INSERT conceptually hasn't even happened for a 
particular row before the row has been associated with a sequence value.  
INSERT can do whatever it wants, it can break out into 20 threads and jumble up 
all the rows or whatever.  When it sends them out in RETURNING, whatever order 
it comes back is no problem; we just want those "mytable_id_seq" values back.  
We sort by that when we get the rows back.  This ordering matches the one we 
put in.  So the way these threads often discuss how INSERTs and RETURNING are 
both not ordered misses this particular point: we really just need a way to 
relate server-generated values to the rows we put in.

This form comes from two places. One is Microsoft SQL Server, which documents 
and guarantees this order-maintaining behavior at 
https://learn.microsoft.com/en-us/sql/t-sql/statements/insert-transact-sql?view=sql-server-ver16 :
"INSERT queries that use SELECT with ORDER BY to populate rows guarantees how 
identity values are computed but not the order in which the rows are inserted."
For PostgreSQL, the 2018 thread above seems to confirm a similar behavior.

So we were looking for feedback from PG developers on the above assumptions.  
Beyond that, throughout all the theoretical talk about "well, PG might change 
this someday": while PG most certainly can and maybe will change some of these 
assumptions, doing so would produce a lot of surprises for many users who 
knowingly or unknowingly have relied on various orderings of this kind, and it 
would likely lead to some very good documentation about this area of 
functionality, maybe even some new syntaxes or keywords to help with this 
extremely common use case.  Documentation for these things is what the various 
threads on this subject often ask for.  The first such thread I can find asking 
about this, which also ended kind of ambiguously, was in 2012 here: 
https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg204156.html
Then there's the 2018 thread, and also a bunch on stackoverflow.  Everyone is 
left guessing or not totally sure since there's no official documentation for 
this.

We're going to document in SQLAlchemy the approaches we are using for different 
backends, and I wouldn't be surprised if, when people google for "PostgreSQL 
INSERT RETURNING ORDERING", our own docs are going to come up near the top.  
That seems to happen a lot.





Re: FW: Error!

2023-04-12 Thread Greg Hennessy


> Just to install the program, I got several errors without any common
> sense, I asked the Microsoft support team for help and in 5 minutes
> they solved the problem, it's very weird and odd. Apparently they have
> control of my personal computer.

Are you sure you were talking to Microsoft, rather than a hacker who
claimed they were Microsoft?


Re: pg_basebackup / recovery

2023-04-12 Thread Michael Paquier
On Wed, Apr 12, 2023 at 01:45:56PM +0300, Achilleas Mantzios - cloud wrote:
> On 4/12/23 12:32, Fabrice Chapuis wrote:
>> During the recovery process of a self-contained backup, how does postgres
>> know to stop reading wal when consistency is reached?
> 
> Because it knows the full packup info. It will observe the
> 
> STOP WAL LOCATION: 3BC7/4B000130 (file 00023BC7004B)
> 
> inside the backup file

There is a bit more to that in the recovery logic, depending mostly
on the presence of a backup_label file in the data folder when recovery
begins.  Once the backup_label is found at the beginning of recovery,
its information is stored in the control file and the file is renamed
to backup_label.old; hence, stopping the server when recovery has not
reached its expected point would rely on the control file contents
later on.  Then, the startup process and its WAL redo makes sure that
WAL replays until it finds the WAL record marking the end of the
backup.  Grepping for XLOG_BACKUP_END (WAL record type in this case)
shows all the areas that rely on that, and xlogrecovery.c covers the
most relevant bits.
--
Michael




Re: Guidance on INSERT RETURNING order

2023-04-12 Thread Rob Sargent




This would be a nice solution… but the PK could be always generated,
or not even sequential (UUIDs for example). If you’re developing
schema-first the application would not even (need to) know about how
the IDs are generated as it’s the DB that generates them.

AIUI the OP’s an SQLAlchemy developer, which is an ORM. Therefore he
needs a generic solution.


Or a different dev env. :)





Re: Guidance on INSERT RETURNING order

2023-04-12 Thread Thorsten Glaser
On Wed, 12 Apr 2023, Kirk Wolak wrote:

>The solution exists.  Pre-fetch the IDs, assign them and insert them with
>the IDs.  Then you have 100% control.
>SELECT NEXTVAL('tbl_seq') from GENERATE_SERIES(1, );

This would be a nice solution… but the PK could be always generated,
or not even sequential (UUIDs for example). If you’re developing
schema-first the application would not even (need to) know about how
the IDs are generated as it’s the DB that generates them.

AIUI the OP’s an SQLAlchemy developer, which is an ORM. Therefore he
needs a generic solution.

bye,
//mirabilos
-- 
15:41⎜ Somebody write a testsuite for helloworld :-)




Re: Guidance on INSERT RETURNING order

2023-04-12 Thread Kirk Wolak
On Wed, Apr 12, 2023 at 5:49 PM Adrian Klaver 
wrote:

> On 4/12/23 2:35 PM, Kirk Wolak wrote:
> > On Tue, Apr 11, 2023 at 4:38 PM Federico 
> >
> > A couple of comments.  For the more generic, I prefer RETURNING *;
> > you get back all the columns for matching.  To me, this solves the
> > problem in a very generic way.
>
>  From what I gather from the conversation RETURNING is the red herring.
>
> The request is that for:
>
> INSERT INTO some_table(char_fld) VALUES('a'), ('b'), ('c')
>
> where some_table has an auto increment field that the values created for
> said field will always be done in the order that VALUES data was
> presented so:
>
> SELECT id, char_fld from some_table will always return:
>
> (1, 'a')
> (2, 'b')
> (3, 'c')
>

The solution exists.  Pre-fetch the IDs, assign them and insert them with
the IDs.  Then you have 100% control.
SELECT NEXTVAL('tbl_seq') from GENERATE_SERIES(1, );
// Update your structure, then insert, using these values.  SINCE the
intention is to update your structure anyways.
// This simply changes the order of operation and requires nothing to work
in many environments

Or, with RETURNING *, assign them into your structure based on how the
system assigned the IDs
Clearly this is harder than the first suggestion.  But it works, without
changing anything.
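The bookkeeping for this pre-fetch pattern is small; here is a hypothetical client-side sketch (pure Python, with the id list standing in for the result of SELECT NEXTVAL('tbl_seq') FROM GENERATE_SERIES(1, n); all names are illustrative):

```python
def assign_prefetched_ids(rows, prefetched_ids):
    """Attach server-generated ids to in-memory rows before INSERT.

    rows: list of dicts not yet carrying an 'id' key.
    prefetched_ids: values fetched up front from the sequence, e.g. via
      SELECT NEXTVAL('tbl_seq') FROM GENERATE_SERIES(1, n);
    """
    if len(prefetched_ids) < len(rows):
        raise ValueError("not enough pre-fetched ids for the rows")
    for row, new_id in zip(rows, prefetched_ids):
        row["id"] = new_id
    return rows

rows = [{"data": "a"}, {"data": "b"}, {"data": "c"}]
assign_prefetched_ids(rows, [101, 102, 103])
# Each row now knows its id before the INSERT is sent, so the order in
# which RETURNING delivers rows no longer matters.
```

Unused ids simply become gaps in the sequence, which is harmless.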

But I find the recommendation that a DB adhere to ordering "non-ordered"
sets questionable, especially when, as stated,
it would not allow for parallelism.  I would much rather have parallelism
in my INSERTs than some arbitrary commitment
that the slew of data I throw at the DB be processed in an order for some
"edge case" that really doesn't simplify the coding.


>
> > But SQL (and SET THEORY) basically imply you cannot trust the sequencing
> > of a set *of transactions*.  Parallel execution is just a great simple
> > example.
> >
> > Secondarily, many frameworks I've worked with (and custom ones
> > developed) would actually call the SEQUENCE.NEXTVAL, and assign the IDs,
> > in memory, accepting that we would have gaping holes if some
> > transactions were never actually sent to the server.  We did this a lot
> > in master-detail GUI type stuff.  It's just easier.  The children knew
> > their parent ID, and all the children ID's were effectively known before
> > committing.  It made for simple code that never failed.
> > (for large datasets we would want one query that returned a set of IDs,
> > we could order that.  And apply it to the records we were about to
> > insert). [Be Careful with GENERATED ALWAYS pks to OVERRIDE]
> >
> > HTH
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Guidance on INSERT RETURNING order

2023-04-12 Thread Federico
On Wed, 12 Apr 2023 at 23:49, Adrian Klaver  wrote:
>
> On 4/12/23 2:35 PM, Kirk Wolak wrote:
> > On Tue, Apr 11, 2023 at 4:38 PM Federico 
> >
> > A couple of comments.  For the more generic, I prefer RETURNING *
> > you get back all the columns for matching.  To me, this solves the
> > problem in a very generic way.
>
>  From what I gather from the conversation RETURNING is the red herring.
>
> The request is that for:
>
> INSERT INTO some_table(char_fld) VALUES('a'), ('b'), ('c')
>
> where some_table has an auto increment field that the values created for
> said field will always be done in the order that VALUES data was
> presented so:
>
> SELECT id, char_fld from some_table will always return:
>
> (1, 'a')
> (2, 'b')
> (3, 'c')

It's actually for an insert query like this

INSERT INTO t(data)
SELECT data FROM (VALUES ('a', 1), ('b', 2), ('c', 3)) as vv(data,
num) ORDER BY num

Or even (but the above would be nicer)

INSERT INTO t(id, data)
SELECT nextval(pg_get_serial_sequence('t', 'id')), data
FROM (VALUES ('a', 1), ('b', 2), ('c', 3)) as vv(data, num)
ORDER BY num
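Client-side, the correlation can then be recovered by sorting the returned ids and zipping them with the inputs in VALUES order. This is only a sketch, and it assumes the property under discussion in this thread (nextval() evaluated in the ORDER BY order), not guaranteed behavior:

```python
def correlate(returning_rows, ordered_values):
    """Match RETURNING (id,) rows, delivered in arbitrary order, back to
    the VALUES entries, assuming ids were assigned in ORDER BY order."""
    ids = sorted(row[0] for row in returning_rows)
    return list(zip(ids, ordered_values))

# RETURNING delivered the rows out of order:
pairs = correlate([(12,), (10,), (11,)], ["a", "b", "c"])
# -> [(10, 'a'), (11, 'b'), (12, 'c')]
```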

  Federico
>
> > But SQL (and SET THEORY) basically imply you cannot trust the sequencing
> > of a set of transactions.  Parallel execution is just a great simple
> > example.
> >
> > Secondarily, many frameworks I've worked with (and custom ones
> > developed) would actually call the SEQUENCE.NEXTVAL, and assign the IDs,
> > in memory, accepting that we would have gaping holes if some
> > transactions were never actually sent to the server.  We did this a lot
> > in master-detail GUI type stuff.  It's just easier.  The children knew
> > their parent ID, and all the children ID's were effectively known before
> > committing.  It made for simple code that never failed.
> > (for large datasets we would want one query that returned a set of IDs,
> > we could order that.  And apply it to the records we were about to
> > insert). [Be Careful with GENERATED ALWAYS pks to OVERRIDE]
> >
> > HTH
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com




Re: Guidance on INSERT RETURNING order

2023-04-12 Thread Adrian Klaver

On 4/12/23 2:35 PM, Kirk Wolak wrote:
On Tue, Apr 11, 2023 at 4:38 PM Federico 



A couple of comments.  For the more generic, I prefer RETURNING *
you get back all the columns for matching.  To me, this solves the 
problem in a very generic way.


From what I gather from the conversation RETURNING is the red herring.

The request is that for:

INSERT INTO some_table(char_fld) VALUES('a'), ('b'), ('c')

where some_table has an auto increment field that the values created for 
said field will always be done in the order that VALUES data was 
presented so:


SELECT id, char_fld from some_table will always return:

(1, 'a')
(2, 'b')
(3, 'c')

But SQL (and SET THEORY) basically imply you cannot trust the sequencing 
of a set of transactions.  Parallel execution is just a great simple 
example.


Secondarily, many frameworks I've worked with (and custom ones 
developed) would actually call the SEQUENCE.NEXTVAL, and assign the IDs, 
in memory, accepting that we would have gaping holes if some 
transactions were never actually sent to the server.  We did this a lot 
in master-detail GUI type stuff.  It's just easier.  The children knew 
their parent ID, and all the children ID's were effectively known before 
committing.  It made for simple code that never failed.
(for large datasets we would want one query that returned a set of IDs, 
we could order that.  And apply it to the records we were about to 
insert). [Be Careful with GENERATED ALWAYS pks to OVERRIDE]


HTH



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: FW: Error!

2023-04-12 Thread Kirk Wolak
On Wed, Apr 12, 2023 at 10:14 AM Ron  wrote:

> Are you sure that you're entering the correct password?
>
> On 4/10/23 19:55, Arquimedes Aguirre wrote:
>
>
>
>
>
> Sent from Mail  for
> Windows
>
>
>
> *From: *Arquimedes Aguirre 
> *Sent: *Sunday, April 9, 2023 5:59 PM
> *To: *pgsql-advoc...@postgresql.org
> *Subject: *Error
>
>
>
> Hello there Dear!
>
>
>
> I have an error initializing the PSQL shell; who can help me solve it?
>
>
>
> Please, I require your attention. I need to start developing a project;
> apparently there is a hacker intervening in this process. I would
> appreciate your investigation, some research, and an answer.
>
>
>
> Just to install the program, I got several errors without any common
> sense, I asked the Microsoft support team for help and in 5 minutes they
> solved the problem, it's very weird and odd. Apparently they have control
> of my personal computer.
>
>
>
> I hope that you can pay due attention to these problems that do not allow
> me to advance in what I want to develop for my project. I hope you can find
> the person responsible for these abnormalities.
>
>
>
> Thanks!
>
>
>
> Sent from Mail  for
> Windows
>
>
>
>
>
>
> --
> Born in Arizona, moved to Babylonia.
>

My recommendation is to connect with full details:
psql -h localhost  -U postgres postgres

Note the username is postgres and the DBname is postgres (the last
parameter).
I would even go so far as:
psql -h 127.0.0.1 -U postgres postgres

Because IPV6 might not be bound as expected.

This simply gets rid of the GUESSING of what PSQL might be trying to do,
like connecting to a different database or logging in as a user that does
not exist.
HTH


Re: Guidance on INSERT RETURNING order

2023-04-12 Thread Kirk Wolak
On Tue, Apr 11, 2023 at 4:38 PM Federico  wrote:

> Thanks for the answers
>
> > 2) What would you order by, id or data or both?
>
> by values order, (that incidentally seems to be what PG does)
>
> > with i as (INSERT INTO t(data) VALUES ('a', 1), ('b', 2), ('c', 3)
> > returning id)
> > select i.id from i order by id;
>
> The problem here is not having the auto increment id in a particular
> order; it's that there
> is apparently no correlation between the position of an element in the
> values clause and the
> id generated. That's the reason for using the sentinel column in the
> general solution in the previous message.
>
> To expand on the use case: SQLAlchemy has 3 objects T that have
> T(data='a'), T(data='b'), T(data='c') but no
> value for the id column. The objective is to insert the 3 data values,
> get back the ids and correctly match them with
> the correct 3 objects.
>
> > No.  Sadly, adding that ORDER BY is just voodoo programming, because
> > it applies to the result of the SELECT while promising nothing about
> > the order in which INSERT/RETURNING will act on those rows.
>
> I probably wasn't clear: it's fine if INSERT/RETURNING order is
> arbitrary, what matters is that the
> autoincrementing of values is executed in the same order as the select, like
> mentioned in this
> previous message
> https://www.postgresql.org/message-id/29386.1528813619%40sss.pgh.pa.us
>
> Is that not the case?
>
> > Re-reading that 2012 thread, the main new observation I'd make today
> > is that parallel operation is a thing now, and it's not hard to foresee
> > that sometime soon we'll want to parallelize INSERTs.  Which'd make it
> > *really* hard to promise anything about the order of RETURNING output.
>
> I think it's fine not promising anything about the order of RETURNING, but
> it would be very helpful having a way of tracking what input row
> generated a particular
> output row. Basically the sentinel case in the original post,
> without actually having to insert the sentinel into the table.
>
> > I think if you want to use RETURNING with multi-row inserts, the
> > thing to do is more like
> >
> >   INSERT INTO t(data) VALUES ('a'), ('b'), ('c') RETURNING data, id
> >
> > and then explicitly match up the returned "data" values rather than
> > presuming they appear in the same order you wrote them in in VALUES.
> > Admittedly this might be problematic if some of the VALUES rows
> > are identical, but how much should you care?
>
> Well, the example is very easy, but it's hard to generalize when
> inserting multiple columns
> with possible complex values in them, since it would mean matching on
> possibly large json values,
> arrays, etc. So definitely not ideal
>
> Thanks,
> Federico
>
> On Tue, 11 Apr 2023 at 22:06, Adrian Klaver 
> wrote:
> >
> > On 4/11/23 12:47, Federico wrote:
> > > Hello list,
> > >
> > https://www.sqlite.org/lang_returning.html#limitations_and_caveats
> > >
> > > Searching the archive seems that a using the INSERT SELECT ORDER BY
> > > form should be a better solution,
> > > so the above insert should be rewritten as
> > >
> > >  INSERT INTO t(data)
> > >  SELECT data FROM (VALUES ('a', 1), ('b', 2), ('c', 3)) as vv(data,
> > > num) ORDER BY num
> > >  RETURNING id
> >
> > Or
> >
> > with i as (INSERT INTO t(data) VALUES ('a', 1), ('b', 2), ('c', 3)
> > returning id)
> > select i.id from i order by id;
> >
> > > Sorry for the long email,
> > > Thanks
> > >
> > >  Federico
> > >
> > >
> >
> > --
> > Adrian Klaver
> > adrian.kla...@aklaver.com
> >
>
A couple of comments.  For the more generic, I prefer RETURNING *
you get back all the columns for matching.  To me, this solves the problem
in a very generic way.
But SQL (and SET THEORY) basically imply you cannot trust the sequencing of
a set of transactions.  Parallel execution is just a great simple example.

Secondarily, many frameworks I've worked with (and custom ones developed)
would actually call the SEQUENCE.NEXTVAL, and assign the IDs, in memory,
accepting that we would have gaping holes if some transactions were never
actually sent to the server.  We did this a lot in master-detail GUI type
stuff.  It's just easier.  The children knew their parent ID, and all the
children ID's were effectively known before committing.  It made for simple
code that never failed.
(for large datasets we would want one query that returned a set of IDs, we
could order that.  And apply it to the records we were about to insert).
[Be Careful with GENERATED ALWAYS pks to OVERRIDE]

HTH
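The value-matching approach mentioned upthread (match the returned "data" values back to the inputs instead of relying on order) is easy to sketch, and the sketch also shows why identical VALUES entries break it. This is an illustration only, not anything a driver actually ships:

```python
def match_by_value(returning_rows, input_values):
    """Match RETURNING (data, id) rows back to inputs by the data value."""
    by_value = {}
    for data, row_id in returning_rows:
        by_value.setdefault(data, []).append(row_id)
    matched = []
    for data in input_values:
        ids = by_value[data]
        if len(ids) > 1:
            # Ambiguous: identical input values cannot be told apart.
            raise ValueError(f"duplicate value {data!r} is ambiguous")
        matched.append((data, ids[0]))
    return matched

print(match_by_value([("b", 2), ("a", 1)], ["a", "b"]))  # [('a', 1), ('b', 2)]
```

With large json or array values as the match key, the dictionary approach still works mechanically, but hashing multi-megabyte keys is exactly the cost Federico objects to.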


Re: TEXT column > 1Gb

2023-04-12 Thread Rob Sargent

On 4/12/23 15:03, Joe Carlson wrote:




On Apr 12, 2023, at 12:21 PM, Rob Sargent  wrote:

On 4/12/23 13:02, Ron wrote:
/Must/ the genome all be in one big file, or can you store them one 
line per table row?


The assumption in the schema I’m using is 1 chromosome per record. 
Chromosomes are typically strings of continuous sequence (A, C, G, or 
T) separated by gaps (N) of approximately known, or completely unknown 
size. In the past this has not been a problem since sequenced 
chromosomes were maybe 100 megabases. But sequencing is better now 
with the technology improvements and tackling more complex genomes. So 
gigabase chromosomes are common.


A typical use case might be from someone interested in seeing if they 
can identify the regulatory elements (the on or off switches) of a 
gene. The protein coding part of a gene can be predicted pretty 
reliably, but the upstream untranslated region and regulatory elements 
are tougher. So they might come to our web site and want to extract 
the 5 kb bit of sequence before the start of the gene and look for 
some of the common motifs that signify a protein binding site. Being 
able to quickly pull out a substring of the genome to drive a web app 
is something we want to do quickly.




Well if you're actually using the sequence, both text and bytea are 
inherently substring friendly.  Your problem goes back to transferring 
large strings and that's where http/tomcat is you friend.  Sounds like 
you're web friendly already.  You have to stream from the 
client/supplier, of course.





Re: TEXT column > 1Gb

2023-04-12 Thread Joe Carlson


> On Apr 12, 2023, at 12:21 PM, Rob Sargent  wrote:
> 
> On 4/12/23 13:02, Ron wrote:
>> Must the genome all be in one big file, or can you store them one line per 
>> table row?

The assumption in the schema I’m using is 1 chromosome per record. Chromosomes 
are typically strings of continuous sequence (A, C, G, or T) separated by gaps 
(N) of approximately known, or completely unknown size. In the past this has 
not been a problem since sequenced chromosomes were maybe 100 megabases. But 
sequencing is better now with the technology improvements and tackling more 
complex genomes. So gigabase chromosomes are common. 

A typical use case might be from someone interested in seeing if they can 
identify the regulatory elements (the on or off switches) of a gene. The 
protein coding part of a gene can be predicted pretty reliably, but the 
upstream untranslated region and regulatory elements are tougher. So they might 
come to our web site and want to extract the 5 kb bit of sequence before the 
start of the gene and look for some of the common motifs that signify a protein 
binding site. Being able to quickly pull out a substring of the genome to drive 
a web app is something we want to do quickly. 
> 
> Not sure what OP is doing with plant genomes (other than some genomics) but 
> the tools all use files and pipeline of sub-tools.  In and out of tuples 
> would be expensive.  Very,very little "editing" done in the usual "update 
> table set val where id" sense.

yeah. it’s basically a warehouse. Stick data in, but then make all the 
connections between the functional elements, their products and the predictions 
on the products. It’s definitely more than a document store and we require a 
relational database.
> 
> Lines in a vcf file can have thousands of columns of nasty, cryptic garbage 
> data that only really makes sense to the tools reading them.  Highly denormalized of 
> course.  (Btw, I hate sequencing :) )

Imagine a discipline where some beleaguered grad student has to get something 
out the door by the end of the term. It gets published and the rest of the 
community say GREAT! we have a standard! Then the abuse of the standard 
happens. People who specialize in bioinformatics know just enough computer 
science, statistics and molecular biology to annoy experts in three different 
fields.



Re: TEXT column > 1Gb

2023-04-12 Thread Ron

On 4/12/23 14:21, Rob Sargent wrote:

On 4/12/23 13:02, Ron wrote:
/Must/ the genome all be in one big file, or can you store them one line 
per table row?


Not sure what OP is doing with plant genomes (other than some genomics) 
but the tools all use files and pipeline of sub-tools. In and out of 
tuples would be expensive.  Very,very little "editing" done in the usual 
"update table set val where id" sense.


Lines in a vcf file can have thousands of columns of nasty, cryptic garbage 
data that only really makes sense to the tools reading them. Highly denormalized of 
course.  (Btw, I hate sequencing :) )


My thinking is that you'd never hit a 1GiB limit if every line in the vcf 
file was in its own row in a table.


Performance /would/ drop, though.

--
Born in Arizona, moved to Babylonia.

Re: TEXT column > 1Gb

2023-04-12 Thread Benedict Holland
Yea. For ease of use, out of the box solutions that will just work, large
objects. You might know them as BLOBS in other SQL varieties. If you are
dealing with that much data, I'm going to assume that storage isn't really
your concern. I wouldn't even waste time compressing. I use them frequently
to store all sorts of weird objects like pictures or serialized pickle
files. They are really fast and extremely easy to use. They do not play
nicely with a lot of 3rd party software, particularly UIs sitting on top of
a database but again, if that isn't a concern or you can use stored
procedures for the selects, it should be just fine.

On Wed, Apr 12, 2023, 3:21 PM Rob Sargent  wrote:

> On 4/12/23 13:02, Ron wrote:
>
> *Must* the genome all be in one big file, or can you store them one line
> per table row?
>
>
> Not sure what OP is doing with plant genomes (other than some genomics)
> but the tools all use files and pipeline of sub-tools.  In and out of
> tuples would be expensive.  Very,very little "editing" done in the usual
> "update table set val where id" sense.
>
> Lines in a vcf file can have thousands of columns of nasty, cryptic garbage
> data that only really makes sense to the tools reading them.  Highly denormalized of
> course.  (Btw, I hate sequencing :) )
>
>
>
>


Re: TEXT column > 1Gb

2023-04-12 Thread Rob Sargent

On 4/12/23 13:02, Ron wrote:
/Must/ the genome all be in one big file, or can you store them one 
line per table row?


Not sure what OP is doing with plant genomes (other than some genomics) 
but the tools all use files and pipeline of sub-tools.  In and out of 
tuples would be expensive.  Very,very little "editing" done in the usual 
"update table set val where id" sense.


Lines in a vcf file can have thousands of columns of nasty, cryptic 
garbage data that only really makes sense to the tools reading them.  Highly 
denormalized of course.  (Btw, I hate sequencing :) )





Re: TEXT column > 1Gb

2023-04-12 Thread Ron
/Must/ the genome all be in one big file, or can you store them one line per 
table row?


On 4/12/23 12:19, Joe Carlson wrote:

I’ve certainly thought about using a different representation. A factor of 2x 
would be good, for a while anyway. For nucleotide sequence, we’d need to 
consider a 10 character alphabet (A, C, G, T, N and the lower case forms when 
representing ’soft masked’ sequence*). So it would be 2 bases/byte. (Proteins 
are not nearly so long so a straight storage is simpler.) But these would be 
bigger changes on the client side than storing in chunks so I think this is the 
way to go.

We’re working with plant genomes, which compared to human chromosomes, are 
HUGE. One chromosome of fava bean is over a gig. And pine tree is another 
monster. This, together with the fact that sequence data collection and 
assembly have improved so much in the past couple years has forced us to 
rethink a lot of our data storage assumptions.

* for those curious, especially in plants, much of sequence consists of 
repetitive elements that are remnants of ancient viruses, simple repeats and the 
like. For people who want to identify particular functional components in a 
genome, they typically do not want to search against this sequence but restrict 
searching to coding regions. But the repetitive sequence is still important and 
we need to keep it.


On Apr 12, 2023, at 10:04 AM, Mark Dilger  wrote:




On Apr 12, 2023, at 7:59 AM, Joe Carlson  wrote:

The use case is genomics. Extracting substrings is common. So going to chunked 
storage makes sense.

Are you storing nucleotide sequences as text strings?  If using the simple 
4-character (A,C,G,T) alphabet, you can store four bases per byte.  If using a 
nucleotide code 16-character alphabet you can still get two bases per byte.  An 
amino acid 20-character alphabet can be stored 8 bases per 5 bytes, and so 
forth.  Such a representation might allow you to store sequences two or four 
times longer than the limit you currently hit, but then you are still at an 
impasse.  Would a factor of 2x or 4x be enough for your needs?

—
Mark Dilger
EnterpriseDB:http://www.enterprisedb.com
The Enterprise PostgreSQL Company








--
Born in Arizona, moved to Babylonia.

Re: TEXT column > 1Gb

2023-04-12 Thread Rob Sargent

On 4/12/23 11:24, Benedict Holland wrote:
For documents that long I would seriously consider using large objects 
and referencing them with their OIDs. Text fields get put in a special 
location within the database. It's similar (possibly exactly) to using 
large objects. Also, you can potentially compress them to save space 
on write and read. 1gb of text is a lot of text.




I've gone down the premature optimization coded/compressed route and am 
still regretting it. I would either not bother or move to LO as per 
Benedict's suggestion.  The test may come down to whether you plan to use 
database features (functions, procedures) to work on the data versus 
feeding it out to the endless list of sequence/genetic analysis tools.   
General compression is robust and fast.  My custom code/comp is likely 
neither :( .










Re: TEXT column > 1Gb

2023-04-12 Thread Benedict Holland
For documents that long I would seriously consider using large objects and
referencing them with their OIDs. Text fields get put in a special location
within the database. It's similar (possibly exactly) to using large
objects. Also, you can potentially compress them to save space on write and
read. 1gb of text is a lot of text.

See https://www.postgresql.org/docs/current/largeobjects.html

Thanks,
Ben

On Wed, Apr 12, 2023, 1:20 PM Joe Carlson  wrote:

> I’ve certainly thought about using a different representation. A factor of
> 2x would be good, for a while anyway. For nucleotide sequence, we’d need to
> consider a 10 character alphabet (A, C, G, T, N and the lower case forms
> when representing ’soft masked’ sequence*). So it would be 2 bases/byte.
> (Proteins are not nearly so long so a straight storage is simpler.) But
> these would be bigger changes on the client side than storing in chunks so
> I think this is the way to go.
>
> We’re working with plant genomes, which compared to human chromosomes, are
> HUGE. One chromosome of fava bean is over a gig. And pine tree is another
> monster. This, together with the fact that sequence data collection and
> assembly have improved so much in the past couple years has forced us to
> rethink a lot of our data storage assumptions.
>
> * for those curious, especially in plants, much of sequence consists of
> repetitive elements that are remnants of ancient viruses, simple repeats and
> the like. For people who want to identify particular functional components
> in a genome, they typically do not want to search against this sequence but
> restrict searching to coding regions. But the repetitive sequence is still
> important and we need to keep it.
>
> > On Apr 12, 2023, at 10:04 AM, Mark Dilger 
> wrote:
> >
> >
> >
> >> On Apr 12, 2023, at 7:59 AM, Joe Carlson  wrote:
> >>
> >> The use case is genomics. Extracting substrings is common. So going to
> chunked storage makes sense.
> >
> > Are you storing nucleotide sequences as text strings?  If using the
> simple 4-character (A,C,G,T) alphabet, you can store four bases per byte.
> If using a nucleotide code 16-character alphabet you can still get two
> bases per byte.  An amino acid 20-character alphabet can be stored 8 bases
> per 5 bytes, and so forth.  Such a representation might allow you to store
> sequences two or four times longer than the limit you currently hit, but
> then you are still at an impasse.  Would a factor of 2x or 4x be enough for
> your needs?
> >
> > —
> > Mark Dilger
> > EnterpriseDB: http://www.enterprisedb.com
> > The Enterprise PostgreSQL Company
> >
> >
> >
>
>
>
>


Re: TEXT column > 1Gb

2023-04-12 Thread Joe Carlson
I’ve certainly thought about using a different representation. A factor of 2x 
would be good, for a while anyway. For nucleotide sequence, we’d need to 
consider a 10 character alphabet (A, C, G, T, N and the lower case forms when 
representing ’soft masked’ sequence*). So it would be 2 bases/byte. (Proteins 
are not nearly so long so a straight storage is simpler.) But these would be 
bigger changes on the client side than storing in chunks so I think this is the 
way to go.

We’re working with plant genomes, which compared to human chromosomes, are 
HUGE. One chromosome of fava bean is over a gig. And pine tree is another 
monster. This, together with the fact that sequence data collection and 
assembly have improved so much in the past couple years has forced us to 
rethink a lot of our data storage assumptions. 

* for those curious, especially in plants, much of sequence consists of 
repetitive elements that are remnants of ancient viruses, simple repeats and the 
like. For people who want to identify particular functional components in a 
genome, they typically do not want to search against this sequence but restrict 
searching to coding regions. But the repetitive sequence is still important and 
we need to keep it.

> On Apr 12, 2023, at 10:04 AM, Mark Dilger  
> wrote:
> 
> 
> 
>> On Apr 12, 2023, at 7:59 AM, Joe Carlson  wrote:
>> 
>> The use case is genomics. Extracting substrings is common. So going to 
>> chunked storage makes sense.
> 
> Are you storing nucleotide sequences as text strings?  If using the simple 
> 4-character (A,C,G,T) alphabet, you can store four bases per byte.  If using 
> a nucleotide code 16-character alphabet you can still get two bases per byte. 
>  An amino acid 20-character alphabet can be stored 8 bases per 5 bytes, and 
> so forth.  Such a representation might allow you to store sequences two or 
> four times longer than the limit you currently hit, but then you are still at 
> an impasse.  Would a factor of 2x or 4x be enough for your needs? 
> 
> —
> Mark Dilger
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
> 
> 
> 





Re: TEXT column > 1Gb

2023-04-12 Thread Mark Dilger



> On Apr 12, 2023, at 7:59 AM, Joe Carlson  wrote:
> 
> The use case is genomics. Extracting substrings is common. So going to 
> chunked storage makes sense.

Are you storing nucleotide sequences as text strings?  If using the simple 
4-character (A,C,G,T) alphabet, you can store four bases per byte.  If using a 
nucleotide code 16-character alphabet you can still get two bases per byte.  An 
amino acid 20-character alphabet can be stored 8 bases per 5 bytes, and so 
forth.  Such a representation might allow you to store sequences two or four 
times longer than the limit you currently hit, but then you are still at an 
impasse.  Would a factor of 2x or 4x be enough for your needs? 
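Mark's four-bases-per-byte figure corresponds to a plain 2-bit encoding. A minimal sketch, covering only the simple A/C/G/T alphabet (the N and soft-masked lower-case characters discussed elsewhere in the thread would need one of the wider alphabets he mentions):

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"

def pack(seq: str) -> bytes:
    """Pack an A/C/G/T string at 4 bases per byte (2 bits per base)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        group = seq[i:i + 4]
        b = 0
        for ch in group:
            b = (b << 2) | CODE[ch]
        # Left-align a short final group so unpack() can ignore the padding.
        b <<= 2 * (4 - len(group))
        out.append(b)
    return bytes(out)

def unpack(data: bytes, n: int) -> str:
    """Recover the first n bases from pack() output."""
    bases = []
    for b in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASE[(b >> shift) & 3])
    return "".join(bases[:n])

seq = "GATTACA"
assert unpack(pack(seq), len(seq)) == seq
assert len(pack(seq)) == 2   # 7 bases fit in 2 bytes
```

Because each byte holds a fixed number of bases, a substring of the sequence maps to a computable byte range, so the substring-extraction use case survives the encoding.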

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: TEXT column > 1Gb

2023-04-12 Thread Rob Sargent

On 4/12/23 08:59, Joe Carlson wrote:
I’m curious what you learned. I’ve been tripping over the buffer 
allocation issue when either splitting input text into chunks or 
aggregating chunks in selects. I’ve decided that I need to move this 
to client side.


The use case is genomics. Extracting substrings is common. So going to 
chunked storage makes sense.


And we have a menagerie of code to deal with. Legacy Perl for loading. 
Clients in Perl, Java, node, python. A whole zoo.


Thanks


My first stab was a Java implementation based on the Selector class, which 
worked ok using a fixed buffer size, but finding the end of the last 
transmission was hokey.  I moved to an embedded Tomcat implementation and 
tomcat does the heavy lifting PERFECTLY.  Easily slipped in to both 
client and backend.


rjs





Re: FW: Error!

2023-04-12 Thread Adrian Klaver

On 4/10/23 17:55, Arquimedes Aguirre wrote:

Sent from Mail  for Windows

*From: *Arquimedes Aguirre 
*Sent: *Sunday, April 9, 2023 5:59 PM
*To: *pgsql-advoc...@postgresql.org 
*Subject: *Error

Hello there Dear!

I have an error initializing the PSQL shell; who can help me solve it?

Please, I require your attention. I need to start developing a project; 
apparently there is a hacker intervening in this process. I would 
appreciate your investigation, some research, and an answer.


Just to install the program, I got several errors without any common 
sense, I asked the Microsoft support team for help and in 5 minutes they 
solved the problem, it's very weird and odd. Apparently they have 
control of my personal computer.


If MS support solved it why are you asking how to solve it?

FYI, do not use images for textual information, copy and paste the text 
from the terminal .


I don't see any evidence of a hacker. I do see evidence of incorrect 
passwords being used.


Did you install Postgres from here?:

https://www.enterprisedb.com/downloads/postgres-postgresql-downloads

If so at some point in the process you were asked to create a password. 
Use that password when invoking the psql shell command with the user 
being the default postgres.




I hope that you can pay due attention to these problems that do not 
allow me to advance in what I want to develop for my project. I hope you 
can find the person responsible for these abnormalities.


Thanks!

Sent from Mail  for Windows



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: TEXT column > 1Gb

2023-04-12 Thread Joe Carlson
I’m curious what you learned. I’ve been tripping over the buffer allocation 
issue when either splitting input text into chunks or aggregating chunks in 
selects. I’ve decided that I need to move this to client side.

The use case is genomics. Extracting substrings is common. So going to chunked 
storage makes sense.

And we have a menagerie of code to deal with. Legacy Perl for loading. Clients 
in Perl, Java, node, python. A whole zoo. 

Thanks

> On Apr 11, 2023, at 10:51 AM, Rob Sargent  wrote:
> 
> 
>> On 4/11/23 11:41, Joe Carlson wrote:
>> Hello,
>> 
>> I’ve recently encountered the issue of trying to insert more than 1 Gb into 
>> a TEXT column. While the docs say TEXT is unlimited length, I had been 
>> unaware of the 1Gb buffer size limitations.
>> 
>> We can debate whether or not saving something this big in a single column is 
>> a good idea (spoiler: it isn’t. But not my design and, in fairness, was not 
>> anticipated when the schema was designed.), I’d like to implement something 
>> that is not a major disruption and try to keep the mods on the server side. 
>> My first idea is to have a chunked associated table (in pseudo code)
>> 
>> CREATE TABLE associated(key_id integer references main_table(key_id), chunk 
>> integer, text_start integer, text_end integer, text_chunk TEXT);
>> 
>> And define functions for inserting and selecting by dividing into 1Mb chunks
>> 
>> CREATE FUNCTION insertText(INTEGER,TEXT) RETURNS INTEGER AS $$
>> DECLARE
>>   chunk INTEGER := 0;
>>   key_id ALIAS for $1;
>>   the_text ALIAS for $2;
>>   text_chunk TEXT;
>> BEGIN
>>   LOOP
>> text_chunk := substr(the_text, chunk*1000000 + 1, 1000000);
>> IF length(text_chunk) = 0 THEN
>>   EXIT;
>> END IF;
>> INSERT INTO associated(key_id,chunk,text_start,text_end,text_chunk) 
>> VALUES 
>> (key_id, chunk, chunk*1000000, chunk*1000000 + length(text_chunk), text_chunk);
>> chunk := chunk + 1;
>>   END LOOP;
>>   RETURN chunk;
>> END;
>> $$ LANGUAGE plpgsql;
>> 
>> This apparently runs into the same issues of buffers size: I get an ‘invalid 
>> message length’ in the log file and the insert fails. I can see from adding 
>> notices in the code that I never enter the LOOP; I assume having function 
>> arguments > 1Gb is also a bad thing.
>> 
>> I’d like to continue to keep the modifications on the server size. And I’d 
>> like to believe someone else has had this problem before. Any suggestions 
>> other than have the client do the chunking? Can I use a different language 
>> binding and get around the argument length limitations?
>> 
>> Thanks
>> 
>> 
>> 
> I've hit this same limitation in Java (with write to db).  What is your stack 
> in this case?  Not sure my solution applies.
> 
> 
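Moving the chunking to the client side, as Joe describes, can be sketched roughly as follows. This is a minimal illustration only: the table layout mirrors the pseudocode above, the 1 MB chunk size matches the stated intent, and the psycopg2 insert shown in the comments is an assumption about the client stack, not code from the thread.

```python
def chunk_text(the_text, chunk_size=1_000_000):
    """Split a large string into (chunk_no, start, end, piece) tuples.

    Offsets are 0-based, matching the text_start/text_end columns of the
    hypothetical "associated" table from the thread.
    """
    chunks = []
    for chunk_no, start in enumerate(range(0, len(the_text), chunk_size)):
        piece = the_text[start:start + chunk_size]
        chunks.append((chunk_no, start, start + len(piece), piece))
    return chunks

# The client would then insert each piece as its own statement, keeping every
# message well under the 1 GB limit, e.g. with psycopg2 (assumed client library):
#   for chunk_no, start, end, piece in chunk_text(big_string):
#       cur.execute(
#           "INSERT INTO associated(key_id, chunk, text_start, text_end, text_chunk)"
#           " VALUES (%s, %s, %s, %s, %s)",
#           (key_id, chunk_no, start, end, piece))
```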


Re: Performance issue after migration from 9.4 to 15

2023-04-12 Thread Adrian Klaver

On 4/11/23 07:41, Christian Schröder wrote:

Hi all,
Thanks for all your tips!
I have meanwhile run "ANALYZE" on all tables. Either that was as important as everybody tells me 😉 or it is just a coincidence (e.g., fewer other queries on the database). 


It is:

https://www.postgresql.org/docs/current/pgupgrade.html


Statistics

Because optimizer statistics are not transferred by pg_upgrade, you will 
be instructed to run a command to regenerate that information at the end 
of the upgrade. You might need to set connection parameters to match 
your new cluster.





--
Adrian Klaver
adrian.kla...@aklaver.com
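As a sketch of what that post-upgrade step looks like in practice (the exact command pg_upgrade suggests varies by version):

```sql
-- Regenerate optimizer statistics after pg_upgrade (they are not carried over):
ANALYZE;   -- run in each database, or use the shell equivalent:
-- vacuumdb --all --analyze-in-stages
```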





Re: subscribe

2023-04-12 Thread Adrian Klaver

On 4/12/23 07:00, Mike Bayer wrote:

What is it you are trying to do?

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: FW: Error!

2023-04-12 Thread Ron

Are you sure that you're entering the correct password?

On 4/10/23 19:55, Arquimedes Aguirre wrote:


Sent from Mail for Windows

*From: *Arquimedes Aguirre 
*Sent: *Sunday, April 9, 2023 5:59 PM
*To: *pgsql-advoc...@postgresql.org
*Subject: *Error

Hello there Dear!

I have an error initializing the psql shell; who can help me solve it?

Please, I require your attention. I need to start developing a project; 
apparently there is a hacker who is interfering in this process, and I would 
appreciate your investigation and an answer.


Just installing the program, I got several errors that made no sense. I asked 
the Microsoft support team for help and in 5 minutes they solved the problem; 
it's very weird and odd. Apparently they have control of my personal computer.


I hope that you can pay due attention to these problems, which do not allow 
me to advance in what I want to develop for my project. I hope you can 
find the person responsible for these abnormalities.


Thanks!

Sent from Mail for Windows



--
Born in Arizona, moved to Babylonia.

subscribe

2023-04-12 Thread Mike Bayer


Re: lippq client library and openssl initialization: PQinitOpenSSL()

2023-04-12 Thread Tom Lane
Daniel Gustafsson  writes:
> Reviving an old thread that got buried, I propose to apply the attached and
> backpatch it as OpenSSL of such vintage is equally likely to be used in old
> versions of postgres as current.

+1

regards, tom lane




Re: lippq client library and openssl initialization: PQinitOpenSSL()

2023-04-12 Thread Daniel Gustafsson
> On 12 Sep 2022, at 15:47, Tom Lane  wrote:

> Personally I'd put this up front, more like
> 
>   have been initialized by your application, so that
>   libpq will not also initialize those libraries.
> +  However, this is unnecessary when using OpenSSL 1.1.0 or later,
> +  as duplicate initializations are no longer problematic.
>  
> 
> If you do use wording that specifically mentions PQinitOpenSSL,
> it should also mention PQinitSSL, just for clarity.

Reviving an old thread that got buried, I propose to apply the attached and
backpatch it as OpenSSL of such vintage is equally likely to be used in old
versions of postgres as current.

--
Daniel Gustafsson



pqinitopenssl_v2.diff
Description: Binary data


Re: Guidance on INSERT RETURNING order

2023-04-12 Thread John Howroyd
Hi All,

The fundamental question here is:
What does "insert into object(col1, ... , coln) select sel1, ... , seln
from ... order by ord1, ... ordm returning val1, ..., valk" mean?

In particular, what does the "order by" do in "insert ... select ... order
by"?  Will this be honoured with parallel inserts?  Is that really too much
to ask?

Is a DBMS just a black box archive of data (throw it in and forget about
it)?  In my opinion, it's a system for the consistent and efficient storage
and *retrieval* of data, to be used in the context of other software
systems.  Programmatically, that means matching up what's in the database
with what's stored in memory.

And yes, a DBMS needs to give guarantees!  ACID compliance is one such
guarantee, but it is not the only one required.

Finally, please don't read this badly.  It is meant purely to question
where we want to go, I strongly support PostgreSQL and believe it has made
many major contributions to the development of DBMS over the years.

Many thanks,
John.



On Wed, 12 Apr 2023 at 11:10, Federico  wrote:

> On Wed, 12 Apr 2023 at 11:46, Thorsten Glaser  wrote:
> >
> > On Tue, 11 Apr 2023, Federico wrote:
> >
> > >My understanding was that they are generated in select order
> >
> > But are they? (I don’t know, but I’d not assume that.)
>
> That's kind of the point for this question, to see if that's correct or
> not.
>
> > >If my understanding is incorrect, would this alternative guarantee the
> above
> >
> > >INSERT INTO t(id, data)
> >SELECT nextval(pg_get_serial_sequence('t', 'id')), data
> > >FROM (VALUES ('a', 1), ('b', 2), ('c', 3)) as vv(data, num)
> > >ORDER BY num
> > >RETURNING id
> >
> > Wouldn’t, at that point, it be better to just send multiple
> > individual INSERT statements? The overhead (on both sides)
> > for all mentioned… workarounds… surely is larger than that?
>
> No, not by a long shot. Sending thousands of single inserts
> sequentially over the network requires a lot more time even when doing
> that on localhost.
> Using a single statement is many times faster.
>
> Federico
>
> > bye,
> > //mirabilos
> > --
> > 15:41⎜ Somebody write a testsuite for helloworld :-)
> >
> >
> >
> >
>
>
>
>
>


Re: pg_basebackup / recovery

2023-04-12 Thread Achilleas Mantzios - cloud

On 4/12/23 12:32, Fabrice Chapuis wrote:

During recovery process of a self contained backup, how postgres know 
to stop reading wal when consistency is reached?



Because it knows the full backup info. It will observe the

STOP WAL LOCATION: 3BC7/4B000130 (file 00023BC7004B)

inside the backup file

It's a self-contained backup because you haven't called it with -X none 
(the "none" WAL method), and it doesn't go into standby mode because you 
haven't called it with --write-recovery-conf!


I believe that by default it will do what you want.
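For reference, the WAL-location bookkeeping quoted above comes from the backup's label/history information. A backup history file looks roughly like the following (illustrative values only; the exact set of fields varies by PostgreSQL version):

```
START WAL LOCATION: 3BC7/4B000028 (file 00023BC7004B)
STOP WAL LOCATION: 3BC7/4B000130 (file 00023BC7004B)
CHECKPOINT LOCATION: 3BC7/4B000060
BACKUP METHOD: streamed
BACKUP FROM: primary
START TIME: 2023-04-12 10:00:00 CEST
STOP TIME: 2023-04-12 10:05:00 CEST
```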


Re: parallel aggregation

2023-04-12 Thread David Rowley
On Wed, 12 Apr 2023 at 22:14, Alexander Saydakov  wrote:
>
> I have a few questions regarding aggregate functions that would be parallel 
> safe.
> 1. do the inputs of combinefunc always go through serialfunc-deserialfunc or 
> they can come directly from sfunc (worker on the same machine, perhaps)?

Only aggregates with an INTERNAL transition state must be serialised
and deserialised.  Non-internal-state aggregates, i.e. ones that have a
corresponding database type, can be pushed through the normal means
that we transfer tuples from parallel workers to the main process
without any serialisation or deserialisation at the other end.

All serial functions must return bytea and accept a single INTERNAL
parameter, so you can't even make a serial func for an aggregate
that's not got an INTERNAL aggregate state type.

> 2. can the result of combinefunc ever be fed to sfunc as opposed to other 
> combiners or finalfunc?

combinefuncs take 2 STYPEs, so it's not valid to pass those to an
SFUNC (those are only given a BASETYPE to transition into the
aggregate state).  The finalfunc will be called (if it exists) during
the Finalize Aggregate plan node. The Finalize Aggregate node also
gathers intermediate aggregate states from parallel workers and calls
the combinefunc on ones from the same group, so yes, the finalfunc
will be called on aggregate states that have been combined with the
combinefunc.

> I have in mind a scenario, when a different data structure is used in the 
> combine stage. For that it would be good if the conversion can happen in 
> serialfunc-deserialfunc, and the combiner does not even know about the other 
> structure used for state transition during aggregation. If that is the case, 
> the only problem remains with the finalfunc. It has to be ready to receive 
> both types.

What's the use case for that?

David
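For a concrete (if trivial) illustration of the pieces David describes, here is a parallel-safe aggregate whose state is a plain SQL type, so no serialfunc/deserialfunc is needed. This is a hedged sketch, not code from the thread:

```sql
-- my_sum: a toy parallel-safe aggregate.  STYPE is a regular SQL type (int),
-- so partial states travel from workers to the leader without serialization.
CREATE AGGREGATE my_sum(int) (
    SFUNC       = int4pl,   -- transition: (state, incoming value) -> state
    STYPE       = int,      -- non-INTERNAL state type
    COMBINEFUNC = int4pl,   -- merges two partial states: (STYPE, STYPE) -> STYPE
    INITCOND    = '0',
    PARALLEL    = SAFE
);
-- An aggregate with STYPE = internal would additionally have to declare
--   SERIALFUNC = ..., DESERIALFUNC = ...   -- internal <-> bytea
```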




RE: Performance issue after migration from 9.4 to 15

2023-04-12 Thread Christian Schröder
Hi,
I will definitely give it a try; however, I do not understand how this could 
explain the observed performance changes. As far as I understand, an 
outdated/non-optimized index would not be relevant in a sequential scan, and 
outdated statistics could cause a non-optimal plan to be chosen, but they 
should not affect the performance of a single sequential scan.

Best,
Christian


From: Inzamam Shafiq 
Sent: Tuesday, April 11, 2023 1:25 PM
To: Christian Schröder 
Subject: Re: Performance issue after migration from 9.4 to 15

Hi,

How did you upgrade the databse? if it is done by pg_upgrade then I think you 
have to do the following,

Try re-indexing the database
Update/analyze statistics

Regards,

Inzamam Shafiq
Sr. DBA

From: Christian Schröder 
Sent: Tuesday, April 11, 2023 3:58 PM
To: pgsql-general@lists.postgresql.org 

Subject: Performance issue after migration from 9.4 to 15

Hi list,
We have recently migrated from our old PostgreSQL 9.4 server to a PostgreSQL 15 
server. Even though the new machine has more resources, we see a considerable 
decrease in the performance of some of our heavier queries, and I have no idea 
where I should start tuning.

Old database: PostgreSQL 9.4.26 on x86_64-unknown-linux-gnu, compiled by gcc 
(GCC) 4.4.7 20120313 (Red Hat 4.4.7-17), 64-bit
New database: PostgreSQL 15.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 
4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit

I tested the same query against the old and the new database.

Old: https://explain.dalibo.com/plan/b7d7gab73f6c7274#plan/node/21
New: https://explain.dalibo.com/plan/84h16cg1f19266be#plan/node/34

Particularly interesting are the sequential scans. In the old plan, we have 
node #21, which took 32 seconds. Almost all of the time goes into actual I/O. 
In the new plan, the corresponding node is #34. It took 55 seconds, but less 
than 1 second was actual I/O (because most of the data was already in memory). 
Why did this step still take about twice the time?
There is another Seq Scan of the same table. Node #10 in the old plan took 3 
seconds, whereas the corresponding node #21 in the new plan took more than 2 
minutes (of which less than 2 seconds was actual I/O).

Am I misreading the plans? If not, I have no idea why the sequential scans take 
so much longer in the new database, even though the I/O is even faster than 
before.

The configuration was left almost unchanged, with only some adjustments due to 
changes between the versions. As far as I can tell, none of these changes is 
performance related.

Can anybody give me a hint into which direction I should investigate further?

Thanks,
Christian





RE: Performance issue after migration from 9.4 to 15

2023-04-12 Thread Christian Schröder
Hi all,
Thanks for all your tips!
I have meanwhile run "ANALYZE" on all tables. Either that was as important as 
everybody tells me 😉 or it is just a coincidence (e.g., fewer other queries on 
the database). In any case, the performance has indeed improved considerably. I 
will observe the situation tomorrow in the morning, when there is typically 
more load on the system.
I also noticed that, after analyzing, the sequential scan node shows fewer 
columns in the "output" section 
(https://explain.dalibo.com/plan/b8bfa5a3d2dc33bc#plan/node/18). I am not sure 
if this has an impact on the performance of this particular operation.
I will also follow the suggestion and increase the number of parallel workers. 
The new machine has enough cores, so this should be possible.

Best,
Christian

-Original Message-
From: Tom Lane  
Sent: Tuesday, April 11, 2023 4:09 PM
To: David Rowley 
Cc: Christian Schröder ; 
pgsql-general@lists.postgresql.org
Subject: Re: Performance issue after migration from 9.4 to 15

David Rowley  writes:
> On Tue, 11 Apr 2023 at 23:03, Christian Schröder 
>  wrote:
>> Particularly interesting are the sequential scans. In the old plan, we have 
>> node #21, which took 32 seconds. Almost all of the time goes into actual 
>> I/O. In the new plan, the corresponding node is #34. It took 55 seconds, but 
>> less than 1 second was actual I/O (because most of the data was already in 
>> memory). Why did this step still take about twice the time?

> Perhaps your 15 server is under more load than 9.4 due to all 
> concurrent plans being slower from bad statistics? Load averages might 
> be a good indicator. (I assume the server is busy due to the "Workers
> Launched": 0)

I think the extra time is due to useless overhead from trying and failing to 
parallelize: the leader has to do all the work, but there's probably overhead 
added anyway.  9.4 of course knew nothing of parallelism.

My guess is that the OP is trying to run with a large number of backends and 
has not raised the max number of parallel workers to match.  It does look like 
the stats might need updating (since 9.4's rowcount estimate is OK and 15's 
less so) but that is not why we see "Workers
planned: 2, Workers launched: 0".  Either provision enough parallel workers to 
fix that, or disable parallelism.

regards, tom lane
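To check the knobs Tom mentions (these are the standard GUC names; note that changing max_worker_processes requires a server restart):

```sql
-- How many parallel workers can actually be launched:
SHOW max_worker_processes;              -- total background-worker pool (restart to change)
SHOW max_parallel_workers;              -- cap on parallel workers within that pool
SHOW max_parallel_workers_per_gather;   -- per-query limit

-- If "Workers launched: 0" keeps appearing, either raise the pool, e.g.
--   ALTER SYSTEM SET max_parallel_workers = 16;  SELECT pg_reload_conf();
-- or disable parallelism for the session:
SET max_parallel_workers_per_gather = 0;
```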


parallel aggregation

2023-04-12 Thread Alexander Saydakov
I have a few questions regarding aggregate functions that would be parallel
safe.
1. do the inputs of combinefunc always go through
serialfunc-deserialfunc or they can come directly from sfunc (worker on the
same machine, perhaps)?
2. can the result of combinefunc ever be fed to sfunc as opposed to other
combiners or finalfunc?

I have in mind a scenario, when a different data structure is used in the
combine stage. For that it would be good if the conversion can happen in
serialfunc-deserialfunc, and the combiner does not even know about the
other structure used for state transition during aggregation. If that is
the case, the only problem remains with the finalfunc. It has to be ready
to receive both types.

Thanks.


Re: Guidance on INSERT RETURNING order

2023-04-12 Thread Federico
On Wed, 12 Apr 2023 at 11:46, Thorsten Glaser  wrote:
>
> On Tue, 11 Apr 2023, Federico wrote:
>
> >My understanding was that they are generated in select order
>
> But are they? (I don’t know, but I’d not assume that.)

That's kind of the point for this question, to see if that's correct or not.

> >If my understanding is incorrect, would this alternative guarantee the above
>
> >INSERT INTO t(id, data)
> >SELECT nextval(pg_get_serial_sequence('t', 'id')), data
> >FROM (VALUES ('a', 1), ('b', 2), ('c', 3)) as vv(data, num)
> >ORDER BY num
> >RETURNING id
>
> Wouldn’t, at that point, it be better to just send multiple
> individual INSERT statements? The overhead (on both sides)
> for all mentioned… workarounds… surely is larger than that?

No, not by a long shot. Sending thousands of single inserts
sequentially over the network requires a lot more time even when doing
that on localhost.
Using a single statement is many times faster.

Federico

> bye,
> //mirabilos
> --
> 15:41⎜ Somebody write a testsuite for helloworld :-)
>
>
>
>
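A client-side variant of the workaround discussed in this thread is to reorder the RETURNING rows after the fact, by having RETURNING echo a column that correlates each row back to its input (an explicit ordinal, or the inserted data itself if unique). A minimal sketch, not tied to any particular driver:

```python
def reorder_returned(params, returned, key):
    """Match RETURNING rows back to the original parameter order.

    params   : list of dicts, in the order the application supplied them
    returned : list of dicts from RETURNING, in arbitrary (server) order
    key      : column present in both, unique per row, used to correlate
    """
    by_key = {row[key]: row for row in returned}
    return [by_key[p[key]] for p in params]

params = [{"data": "a"}, {"data": "b"}, {"data": "c"}]
# Suppose the server returned the rows in a different order:
returned = [{"id": 11, "data": "b"}, {"id": 12, "data": "c"}, {"id": 10, "data": "a"}]
rows = reorder_returned(params, returned, "data")
# rows are now aligned with params: ids 10, 11, 12
```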




pg_basebackup / recovery

2023-04-12 Thread Fabrice Chapuis
During recovery process of a self contained backup, how postgres know to
stop reading wal when consistency is reached?