Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Ron Johnson
On Tue, Jun 4, 2024 at 3:47 PM Gavin Roy  wrote:

>
> On Tue, Jun 4, 2024 at 3:15 PM Ron Johnson 
> wrote:
>
>>
>> But why tar instead of custom? That was part of my original question.
>>
>
> I've found it pretty useful for programmatically accessing data in a dump
> for large databases outside of the normal pg_dump/pg_restore workflow. You
> don't have to seek through one large binary file to get to the data section
> to get at the data.
>

Interesting.  Please explain, though, since a big tarball _is_ "one large
binary file" that you have to sequentially scan.  (I don't know the
internal structure of custom format files, and whether they have file
pointers to each table.)

Is it because you need individual .dat "COPY" files for something other
than loading into PG tables (since pg_restore --table= does that, too),
and directory format archives can be inconvenient?


Re: Questions on logical replication

2024-06-04 Thread Adrian Klaver

On 6/4/24 15:55, Koen De Groote wrote:
I recently read the entire documentation on logical replication, but am 
left with a question on the buildup of WAL


On this page: 
https://www.postgresql.org/docs/current/logical-replication-subscription.html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT 


It is written: " When dropping a subscription, the remote host is not 
reachable. In that case, disassociate the slot from the subscription 
using |ALTER SUBSCRIPTION| before attempting to drop the subscription. 
If the remote database instance no longer exists, no further action is 
then necessary. If, however, the remote database instance is just 
unreachable, the replication slot (and any still remaining table 
synchronization slots) should then be dropped manually; otherwise 
it/they would continue to reserve WAL and might eventually cause the 
disk to fill up. Such cases should be carefully investigated."



Assuming a situation where I add tables 1 at a time to the publisher, 
and refresh the subscription every time.


What happens if I shut down the subscriber database for a while? The 
subscription isn't dropped, so am I reading it right that the disk on 
the publisher will slowly be filling up with WAL? Isn't that always the 
case if wall is enabled?


https://www.postgresql.org/docs/current/wal-configuration.html

"Checkpoints are points in the sequence of transactions at which it is 
guaranteed that the heap and index data files have been updated with all 
information written before that checkpoint. At checkpoint time, all 
dirty data pages are flushed to disk and a special checkpoint record is 
written to the WAL file. (The change records were previously flushed to 
the WAL files.) In the event of a crash, the crash recovery procedure 
looks at the latest checkpoint record to determine the point in the WAL 
(known as the redo record) from which it should start the REDO 
operation. Any changes made to data files before that point are 
guaranteed to be already on disk. Hence, after a checkpoint, WAL 
segments preceding the one containing the redo record are no longer 
needed and can be recycled or removed. (When WAL archiving is being 
done, the WAL segments must be archived before being recycled or removed.)"




This "cause disk to fill up" warning is quite concerning, and I'd like 
to understand what could cause it and how likely it is? I thought 
logical replication uses WAL by default, so doesn't that mean there has 
to be a log of changes kept anyhow? Even if the WAL isn't written to 
disk by an "archive_command"?


https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS

"Replication slots provide an automated way to ensure that the primary 
does not remove WAL segments until they have been received by all 
standbys, and that the primary does not remove rows which could cause a 
recovery conflict even when the standby is disconnected."


When you set up logical replication you are 'asking' via the replication 
slot that WAL records be kept on the publisher until the subscriber 
retrieves them.




Regards,
Koen De Groote


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Questions on logical replication

2024-06-04 Thread Koen De Groote
Reading this:
https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS

" Replication slots provide an automated way to ensure that the primary
does not remove WAL segments until they have been received by all standbys,
and that the primary does not remove rows which could cause a recovery
conflict

even when the standby is disconnected. "

Am I to understand that a subscription is considered that same as a
standby, in this context?

On Wed, Jun 5, 2024 at 12:55 AM Koen De Groote  wrote:

> I recently read the entire documentation on logical replication, but am
> left with a question on the buildup of WAL
>
> On this page:
> https://www.postgresql.org/docs/current/logical-replication-subscription.html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT
>
> It is written: " When dropping a subscription, the remote host is not
> reachable. In that case, disassociate the slot from the subscription using 
> ALTER
> SUBSCRIPTION before attempting to drop the subscription. If the remote
> database instance no longer exists, no further action is then necessary.
> If, however, the remote database instance is just unreachable, the
> replication slot (and any still remaining table synchronization slots)
> should then be dropped manually; otherwise it/they would continue to
> reserve WAL and might eventually cause the disk to fill up. Such cases
> should be carefully investigated."
>
>
> Assuming a situation where I add tables 1 at a time to the publisher, and
> refresh the subscription every time.
>
> What happens if I shut down the subscriber database for a while? The
> subscription isn't dropped, so am I reading it right that the disk on the
> publisher will slowly be filling up with WAL? Isn't that always the case if
> wall is enabled?
>
> This "cause disk to fill up" warning is quite concerning, and I'd like to
> understand what could cause it and how likely it is? I thought logical
> replication uses WAL by default, so doesn't that mean there has to be a log
> of changes kept anyhow? Even if the WAL isn't written to disk by an
> "archive_command"?
>
> Regards,
> Koen De Groote
>


Questions on logical replication

2024-06-04 Thread Koen De Groote
I recently read the entire documentation on logical replication, but am
left with a question on the buildup of WAL

On this page:
https://www.postgresql.org/docs/current/logical-replication-subscription.html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT

It is written: " When dropping a subscription, the remote host is not
reachable. In that case, disassociate the slot from the subscription
using ALTER
SUBSCRIPTION before attempting to drop the subscription. If the remote
database instance no longer exists, no further action is then necessary.
If, however, the remote database instance is just unreachable, the
replication slot (and any still remaining table synchronization slots)
should then be dropped manually; otherwise it/they would continue to
reserve WAL and might eventually cause the disk to fill up. Such cases
should be carefully investigated."


Assuming a situation where I add tables 1 at a time to the publisher, and
refresh the subscription every time.

What happens if I shut down the subscriber database for a while? The
subscription isn't dropped, so am I reading it right that the disk on the
publisher will slowly be filling up with WAL? Isn't that always the case if
wall is enabled?

This "cause disk to fill up" warning is quite concerning, and I'd like to
understand what could cause it and how likely it is? I thought logical
replication uses WAL by default, so doesn't that mean there has to be a log
of changes kept anyhow? Even if the WAL isn't written to disk by an
"archive_command"?

Regards,
Koen De Groote


Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Shaheed Haque
On Tue, 4 Jun 2024 at 20:47, Gavin Roy  wrote:

>
> On Tue, Jun 4, 2024 at 3:15 PM Ron Johnson 
> wrote:
>
>>
>> But why tar instead of custom? That was part of my original question.
>>
>
> I've found it pretty useful for programmatically accessing data in a dump
> for large databases outside of the normal pg_dump/pg_restore workflow. You
> don't have to seek through one large binary file to get to the data section
> to get at the data.
>

This is true for us too; specifically, tar, including with compression, is
very convenient for both CLI and Python ecosystems.


Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Gavin Roy
On Tue, Jun 4, 2024 at 3:15 PM Ron Johnson  wrote:

>
> But why tar instead of custom? That was part of my original question.
>

I've found it pretty useful for programmatically accessing data in a dump
for large databases outside of the normal pg_dump/pg_restore workflow. You
don't have to seek through one large binary file to get to the data section
to get at the data.


Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Rob Sargent



On 6/4/24 13:15, Ron Johnson wrote:

On Tue, Jun 4, 2024 at 2:55 PM Rob Sargent  wrote:



On 6/4/24 11:40, Shaheed Haque wrote:
>
> We use it. I bet lots of others do too.
>
>

Of course.  There are lots of small, real, useful databases in the
wild.

But why tar instead of custom? That was part of my original question.


That I can't say.  Familiarity?  Fewer keystrokes?



Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Ron Johnson
On Tue, Jun 4, 2024 at 2:55 PM Rob Sargent  wrote:

>
>
> On 6/4/24 11:40, Shaheed Haque wrote:
> >
> > We use it. I bet lots of others do too.
> >
> >
>
> Of course.  There are lots of small, real, useful databases in the wild.
>

But why tar instead of custom? That was part of my original question.


Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Rob Sargent




On 6/4/24 11:40, Shaheed Haque wrote:


We use it. I bet lots of others do too.




Of course.  There are lots of small, real, useful databases in the wild.





Re: Bugs details.

2024-06-04 Thread Erik Wienhold
On 2024-06-04 19:55 +0200, Muhammad Salahuddin Manzoor wrote:
> I need to get detailed information about PG bugs and fixes.
> 
> I can get the information about new features and Fixes in the release
> document but I find only descriptions for each fix. I need detailed
> information about these fixes. How can I get details about the fix.
> 
> Suppose the release document says.
> 
> *Fix INSERT from multiple VALUES rows into a target column that is a domain
> over an array or composite type (Tom Lane)*
> 
> I want to get more details about this fix. How can I search this. I want to
> reproduce it and demonstrate the fix. For that I need detailed information.

You can check the DocBook source of the release notes.  Each item should
have a comment about the respective Git commits, e.g., [1] for the item
that you quoted.  The commit messages usually link to the mailing list
thread where you may find more details.

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=doc/src/sgml/release-16.sgml;h=54860aa0e1dfb677ec16cf3672e07bfbd41d431e;hb=refs/heads/REL_16_STABLE#l130

-- 
Erik




Bugs details.

2024-06-04 Thread Muhammad Salahuddin Manzoor
Greetings,

I need to get detailed information about PG bugs and fixes.

I can get the information about new features and Fixes in the release
document but I find only descriptions for each fix. I need detailed
information about these fixes. How can I get details about the fix.

Suppose the release document says.

*Fix INSERT from multiple VALUES rows into a target column that is a domain
over an array or composite type (Tom Lane)*

I want to get more details about this fix. How can I search this. I want to
reproduce it and demonstrate the fix. For that I need detailed information.

*Salahuddin (살라후딘**)*


Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Shaheed Haque
We use it. I bet lots of others do too.

On Tue, 4 Jun 2024, 18:06 Adrian Klaver,  wrote:

> Reply to list also.
> Ccing list
>
> On 6/4/24 10:03 AM, Ron Johnson wrote:
>
> >
> > If you don't need the tar format then don't use it.
> >
> >
> > That's neither the purpose nor the point of my question.
> >
> > I think that a note in the docs mentioning that it's obsolete would be
> > helpful for new users who recognize "tar" so choose it.
>
> You are assuming facts not in evidence, namely that the format is obsolete.
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>


Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Adrian Klaver

Reply to list also.
Ccing list

On 6/4/24 10:03 AM, Ron Johnson wrote:



If you don't need the tar format then don't use it.


That's neither the purpose nor the point of my question.

I think that a note in the docs mentioning that it's obsolete would be 
helpful for new users who recognize "tar" so choose it.


You are assuming facts not in evidence, namely that the format is obsolete.

--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Tom Lane
Adrian Klaver  writes:
> If you don't need the tar format then don't use it.

Indeed.  FTR, I think the original motivation for tar format was to
have a "standard" structured output format that could be manipulated
with tools other than pg_restore.  Years later, we added the directory
format which could be argued to serve the same purpose.  But nobody
particularly wanted to remove the tar functionality, especially since
the two were made to be compatible:

commit 7f508f1c6b515df66d27f860b2faa7b5761fa55d
Author: Heikki Linnakangas 
Date:   Sun Jan 23 23:10:15 2011 +0200

Add 'directory' format to pg_dump. The new directory format is compatible
with the 'tar' format, in that untarring a tar format archive produces a
valid directory format archive.

regards, tom lane




Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Adrian Klaver

On 6/4/24 08:25, Ron Johnson wrote:
On Tue, Jun 4, 2024 at 10:43 AM Adrian Klaver > wrote:


On 6/4/24 05:13, Ron Johnson wrote:
 > It doesn't support compression nor restore reordering like the
custom
 > format, so I'm having trouble seeing why it still exists (at least
 > without a doc warning that it's obsolete).

pg_dump -d test -U postgres -Ft  | gzip --stdout > test.tgz


Who's got meaningful databases that small anymore?

And if you've got meaningfully sized databases, open port 5432 and move 
them using pg_dump.


If you don't need the tar format then don't use it.

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Ron Johnson
On Tue, Jun 4, 2024 at 10:43 AM Adrian Klaver 
wrote:

> On 6/4/24 05:13, Ron Johnson wrote:
> > It doesn't support compression nor restore reordering like the custom
> > format, so I'm having trouble seeing why it still exists (at least
> > without a doc warning that it's obsolete).
>
> pg_dump -d test -U postgres -Ft  | gzip --stdout > test.tgz
>

Who's got meaningful databases that small anymore?

And if you've got meaningfully sized databases, open port 5432 and move
them using pg_dump.


Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Adrian Klaver

On 6/4/24 05:13, Ron Johnson wrote:
It doesn't support compression nor restore reordering like the custom 
format, so I'm having trouble seeing why it still exists (at least 
without a doc warning that it's obsolete).


pg_dump -d test -U postgres -Ft  | gzip --stdout > test.tgz

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Erik Wienhold
On 2024-06-04 14:13 +0200, Ron Johnson wrote:
> It doesn't support compression nor restore reordering like the custom
> format, so I'm having trouble seeing why it still exists (at least without
> a doc warning that it's obsolete).

Maybe because of this:

> * The tar format also includes a 'restore.sql' script which is there for
> * the benefit of humans. This script is never used by pg_restore.

I've never used that restore.sql, so I can't say how useful it is,
especially since it requires editing $$PATH$$.  Should be mentioned in
the docs if that file is still relevant.

-- 
Erik




Purpose of pg_dump tar archive format?

2024-06-04 Thread Ron Johnson
It doesn't support compression nor restore reordering like the custom
format, so I'm having trouble seeing why it still exists (at least without
a doc warning that it's obsolete).