Dear Euler,

Here are comments for your v5 patch.

01.
In the document, two words target/standby are used as almost the same meaning.
Can you unify them?

02.
```
 <refsynopsisdiv>
  <cmdsynopsis>
   <command>pg_subscriber</command>
   <arg rep="repeat"><replaceable>option</replaceable></arg>
  </cmdsynopsis>
 </refsynopsisdiv>
```

There are some mandatory options like -D/-S/-P. It must be listed in Synopsis 
chapter.

03.
```
  <para>
   <application>pg_subscriber</application> takes the publisher and subscriber
   connection strings, a cluster directory from a standby server and a list of
   database names and it sets up a new logical replica using the physical
   recovery process.
  </para>
```

I briefly checked other pages and they do not describe accepted options here.
A summary of the application should be mentioned. Based on that, how about:

```
pg_subscriber creates a new <link linkend="logical-replication-subscription">
subscriber</link> from a physical standby server. This allows users to quickly
set up logical replication system.
```

04.
```
  <para>
   The <application>pg_subscriber</application> should be run at the target
   server. The source server (known as publisher server) should accept logical
   replication connections from the target server (known as subscriber server).
   The target server should accept local logical replication connection.
  </para>
```

I'm not native speaker, but they are not just recommmendations - they are surely
required. So, should we replace s/should/has to/?

05.
```
     <varlistentry>
      <term><option>-S <replaceable 
class="parameter">conninfo</replaceable></option></term>
      <term><option>--subscriber-conninfo=<replaceable 
class="parameter">conninfo</replaceable></option></term>
      <listitem>
       <para>
        The connection string to the subscriber. For details see <xref 
linkend="libpq-connstring"/>.
       </para>
      </listitem>
     </varlistentry>
```

I became not sure whether it is "The connection string to the subscriber.".
The server is still physical standby at that time.

06.
```
 * IDENTIFICATION
 *              src/bin/pg_subscriber/pg_subscriber.c
```

The identification is not correct.

07.
I felt that there were too many global variables and LogicalRepInfo should be
refactored. Because...

* Some info related with clusters(e.g., subscriber_dir, conninfo, ...) should be
  gathered in one struct.
* pubconninfo/subsconninfo are stored per db, but it is not needed if we have
  one base_conninfo.
* pubname/subname are not needed because we have fixed naming rule.
* pg_ctl_path and pg_resetwal_path can be conbimed into one bindir.
* num_dbs should not be alone.
...

Based on above, how about using structures like below?

```
typedef struct LogicalRepPerdbInfo
{
        Oid             oid;
        char   *dbname;
        bool    made_replslot;  /* replication slot was created */
        bool    made_publication;       /* publication was created */
        bool    made_subscription;      /* subscription was created */
} LogicalRepPerdbInfo;

typedef struct PrimaryInfo
{
        char   *base_conninfo;
        bool    made_transient_replslot;
} PrimaryInfo;

typedef struct StandbyInfo
{
        char   *base_conninfo;
        char   *bindir;
        char   *pgdata;
        char   *primary_slot_name;
} StandbyInfo;

typedef struct LogicalRepInfo
{
        int                                             num_dbs;
        LogicalRepPerdbInfo        *perdb;
        PrimaryInfo                        *primary;
        StandbyInfo                        *standby;
} LogicalRepInfo;
```

08.
```
        char       *subconninfo;        /* subscription connection string for 
logical
                                                                 * replication 
*/
```

Not sure how we should notate because the target has not been subscriber yet.

09.
```
enum WaitPMResult
{
        POSTMASTER_READY,
        POSTMASTER_STANDBY,
        POSTMASTER_STILL_STARTING,
        POSTMASTER_FAILED
};
```

This enum has been already defined in pg_ctl.c. Not sure we can use the same 
name.
Can we rename to PGSWaitPMResult. or export pre-existing one?

10.
```
/* Options */
static const char *progname;
```

I think it is not an option.

11.
```
/*
 * Validate a connection string. Returns a base connection string that is a
 * connection string without a database name plus a fallback application name.
 * Since we might process multiple databases, each database name will be
 * appended to this base connection string to provide a final connection string.
 * If the second argument (dbname) is not null, returns dbname if the provided
 * connection string contains it. If option --database is not provided, uses
 * dbname as the only database to setup the logical replica.
 * It is the caller's responsibility to free the returned connection string and
 * dbname.
 */
static char *
get_base_conninfo(char *conninfo, char *dbname, const char *noderole)
```
Just FYI - adding fallback_application_name may be too optimisitic. Currently
the output was used by both pg_subscriber and subscription connection. 

12.
Can we add an option not to remove log files even operations were succeeded.

13.
```
                /*
                 * Since the standby server is running, check if it is using an
                 * existing replication slot for WAL retention purposes. This
                 * replication slot has no use after the transformation, hence, 
it
                 * will be removed at the end of this process.
                 */
                primary_slot_name = use_primary_slot_name();
```

Now primary_slot_name is checked only when the server have been started, but
it should be checked in any cases.

14.
```
        consistent_lsn = create_logical_replication_slot(conn, &dbinfo[0],
                                                                                
                         temp_replslot);
```

Can we create a temporary slot here?

15.
I found that subscriptions cannot be started if tuples are inserted on publisher
after creating temp_replslot. After starting a subscriber, I got below output 
on the log.

```
ERROR:  could not receive data from WAL stream: ERROR:  publication 
"pg_subscriber_5" does not exist
CONTEXT:  slot "pg_subscriber_5_3632", output plugin "pgoutput", in the change 
callback, associated LSN 0/30008A8
LOG:  background worker "logical replication apply worker" (PID 3669) exited 
with exit code 1
```

But this is strange. I confirmed that the specified publication surely exists.
Do you know the reason?

```
publisher=# SELECT pubname FROM pg_publication;
     pubname     
-----------------
 pg_subscriber_5
(1 row)
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Reply via email to