Re: 28.4.4. Progress Reporting phase status

2024-06-03 Thread Euler Taveira
On Thu, May 30, 2024, at 12:50 PM, PG Doc comments form wrote:
> I noticed that in the "28.4.4. Progress Reporting" chapter for
> `pg_stat_progress_create_index`, "Table 28.43. CREATE INDEX Phases" may
> be misleading, as the phase displayed for "building index" is often more
> detailed.
> I found in this presentation[1] (slide 20) what seems to be the exact status
> displayed by the table while creating the index.
> Maybe the documentation table listing all statuses could be extended to
> display the following statuses:

The description is accurate. Since this step (building index) is AM-specific, it
shouldn't contain the additional information (after the colon).

> - building index: initializing [2]
> - building index: scanning table
> - building index: sorting live tuples
> - building index: sorting dead tuples
> - building index: loading tuples in tree

These are the B-tree build phases. Although the other access methods (such as
Hash, GIN, GiST, BRIN) do not provide a function to report the current build
phase, one might be added in the future. I'm not sure it is worth adding such
information here. You can certainly obtain the build phases from all access
methods with a query like:

-- List every build phase name reported by each index access method.
WITH amidx AS (
    SELECT oid, amname FROM pg_am WHERE amtype = 'i'
)
SELECT a.amname, pg_indexam_progress_phasename(a.oid, i)
FROM amidx a, generate_series(0, 100) i
WHERE pg_indexam_progress_phasename(a.oid, i) IS NOT NULL
ORDER BY a.amname, i;
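
To see which of these phases a running build is currently in, the progress
view itself can be polled from a second session while CREATE INDEX runs. A
minimal sketch, assuming a build is in progress (the view returns no rows
otherwise); the table_name alias is only illustrative:

-- One row per backend currently running CREATE INDEX or REINDEX.
SELECT pid,
       relid::regclass AS table_name,
       phase,
       blocks_done, blocks_total,
       tuples_done, tuples_total
FROM pg_stat_progress_create_index;

For a B-tree build, the phase column is where the more detailed
"building index: ..." strings show up.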


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Ambiguous description on new columns

2024-06-03 Thread vignesh C
On Fri, 31 May 2024 at 08:58, vignesh C wrote:
>
> On Thu, 30 May 2024 at 06:21, Peter Smith wrote:
> >
> > On Wed, May 29, 2024 at 8:04 PM vignesh C wrote:
> > >
> > > On Wed, 22 May 2024 at 14:26, Peter Smith wrote:
> > > >
> > > > On Tue, May 21, 2024 at 8:40 PM PG Doc comments form wrote:
> > > > >
> > > > > The following documentation comment has been logged on the website:
> > > > >
> > > > > Page: https://www.postgresql.org/docs/16/logical-replication-col-lists.html
> > > > > Description:
> > > > >
> > > > > The documentation on this page mentions:
> > > > >
> > > > > "If no column list is specified, any columns added later are
> > > > > automatically replicated."
> > > > >
> > > > > It feels ambiguous what this could mean. Does it mean:
> > > > >
> > > > > 1/ That if you alter the table on the publisher and add a new column,
> > > > > it will be replicated
> > > > >
> > > > > 2/ If you add a column list later and add a column to it, it will be
> > > > > replicated
> > > > >
> > > > > In both cases, does the subscriber automatically create this column
> > > > > if it wasn't there before?
> > > >
> > > > No, the subscriber will not automatically create the column. That is
> > > > already clearly said at the top of the same page you linked "The table
> > > > on the subscriber side must have at least all the columns that are
> > > > published."
> > > >
> > > > All that "If no column list..." paragraph was trying to say is:
> > > >
> > > > CREATE PUBLICATION pub FOR TABLE T;
> > > >
> > > > is not quite the same as:
> > > >
> > > > CREATE PUBLICATION pub FOR TABLE T(a,b,c);
> > > >
> > > > The difference is, in the 1st case if you then ALTER the TABLE T to
> > > > have a new column 'd' then that will automatically start replicating
> > > > the 'd' data without having to do anything to either the PUBLICATION
> > > > or the SUBSCRIPTION. Of course, if TABLE T at the subscriber side does
> > > > not have a column 'd' then you'll get an error because your subscriber
> > > > table needs to have *at least* all the replicated columns. (I
> > > > demonstrate this error below)
> > > >
> > > > Whereas in the 2nd case, even though you ALTER'ed the TABLE T to have
> > > > a new column 'd' then that won't be replicated because 'd' was not
> > > > named in the PUBLICATION's column list.
> > > >
> > > > 
> > > >
> > > > Here's an example where you can see this in action
> > > >
> > > > Here is an example of the 1st case -- it shows 'd' is automatically
> > > > replicated and also shows the subscriber-side error caused by the
> > > > missing column:
> > > >
> > > > test_pub=# CREATE TABLE T(a int,b int, c int);
> > > > test_pub=# CREATE PUBLICATION pub FOR TABLE T;
> > > >
> > > > test_sub=# CREATE TABLE T(a int,b int, c int);
> > > > test_sub=# CREATE SUBSCRIPTION sub CONNECTION 'dbname=test_pub' PUBLICATION pub;
> > > >
> > > > See the replication happening
> > > > test_pub=# INSERT INTO T VALUES (1,2,3);
> > > > test_sub=# SELECT * FROM t;
> > > >  a | b | c
> > > > ---+---+---
> > > >  1 | 2 | 3
> > > > (1 row)
> > > >
> > > > Now alter the publisher table T and insert some new data
> > > > test_pub=# ALTER TABLE T ADD COLUMN d int;
> > > > test_pub=# INSERT INTO T VALUES (5,6,7,8);
> > > >
> > > > This will cause subscription errors like:
> > > > 2024-05-22 11:53:19.098 AEST [16226] ERROR:  logical replication
> > > > target relation "public.t" is missing replicated column: "d"
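> > > >
> > > > For comparison, a sketch of the 2nd case, assuming a fresh pair of
> > > > databases and the hypothetical names T2, pub2 and sub2. Here 'd' is
> > > > not named in the column list, so it is not replicated and the
> > > > subscriber is not expected to raise the error above; the new row
> > > > should arrive with only a, b and c:
> > > >
> > > > test_pub=# CREATE TABLE T2(a int, b int, c int);
> > > > test_pub=# CREATE PUBLICATION pub2 FOR TABLE T2(a,b,c);
> > > >
> > > > test_sub=# CREATE TABLE T2(a int, b int, c int);
> > > > test_sub=# CREATE SUBSCRIPTION sub2 CONNECTION 'dbname=test_pub' PUBLICATION pub2;
> > > >
> > > > test_pub=# ALTER TABLE T2 ADD COLUMN d int;
> > > > test_pub=# INSERT INTO T2 VALUES (5,6,7,8);
> > > >
> > > > test_sub=# SELECT * FROM t2;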
> > > >
> > > > 
> > > >
> > > > I think the following small change will remove any ambiguity:
> > > >
> > > > BEFORE
> > > > If no column list is specified, any columns added later are
> > > > automatically replicated.
> > > >
> > > > SUGGESTION
> > > > If no column list is specified, any columns added to the table later
> > > > are automatically replicated.
> > > >
> > > > ~~
> > > >
> > > > I attached a small patch to make the above change.
> > >
> > > A small recommendation:
> > > It would enhance clarity to include a line break following "If no
> > > column list is specified, any columns added to the table later are":
> > > -   If no column list is specified, any columns added later are automatically
> > > +   If no column list is specified, any columns added to the table later are automatically
> > > replicated. This means that having a column list which names all columns
> >
> > Hi Vignesh,
> >
> > IIUC you're saying my v1 patch *content* and rendering are OK, but you
> > only wanted the SGML text to have better wrapping, with lines of < 80
> > chars. So I have attached a patch v2 with improved wrapping. If you
> > meant something different then please explain.
>
> Yes, that is what I meant and the updated patch looks good.

Adding Amit to get his opinion on the same.

Regards,
Vignesh




Re: Ambiguous description on new columns

2024-06-03 Thread Amit Kapila
On Fri, May 31, 2024 at 10:54 PM Peter Smith wrote:
>
> On Wed, May 29, 2024 at 8:04 PM vignesh C wrote:
> >
> > > >
> > > > The following documentation comment has been logged on the website:
> > > >
> > > > Page: https://www.postgresql.org/docs/16/logical-replication-col-lists.html
> > > > Description:
> > > >
> > > > The documentation on this page mentions:
> > > >
> > > > "If no column list is specified, any columns added later are
> > > > automatically replicated."
> > > >
> > > > It feels ambiguous what this could mean. Does it mean:
> > > >
> > > > 1/ That if you alter the table on the publisher and add a new column, it
> > > > will be replicated
> > > >
> > > > 2/ If you add a column list later and add a column to it, it will be
> > > > replicated
> > > >
> > > > In both cases, does the subscriber automatically create this column if
> > > > it wasn't there before?
> > >
> > > 
> > >
> > > I think the following small change will remove any ambiguity:
> > >
> > > BEFORE
> > > If no column list is specified, any columns added later are
> > > automatically replicated.
> > >
> > > SUGGESTION
> > > If no column list is specified, any columns added to the table later
> > > are automatically replicated.
> > >
> > > ~~
> > >
> > > I attached a small patch to make the above change.
> >
> > A small recommendation:
> > It would enhance clarity to include a line break following "If no
> > column list is specified, any columns added to the table later are":
> > -   If no column list is specified, any columns added later are automatically
> > +   If no column list is specified, any columns added to the table later are automatically
> > replicated. This means that having a column list which names all columns
>
> Hi Vignesh,
>
> IIUC you're saying my v1 patch *content* and rendering are OK, but you
> only wanted the SGML text to have better wrapping, with lines of < 80
> chars. So I have attached a patch v2 with improved wrapping. If you
> meant something different then please explain.
>

Your patch is an improvement. Koen, does the proposed change make
things clear to you?

-- 
With Regards,
Amit Kapila.