On Thu, Jun 29, 2023 at 3:52 PM Hayato Kuroda (Fujitsu)
<kuroda.hay...@fujitsu.com> wrote:
>
> 2. I want to confirm the reason why new replication command is added.
>

Are you referring to the LIST_SLOTS command? If so, I don't think that is
required and instead, we can use a query to fetch the required information.

> IIUC the launcher connects to primary by using primary_conninfo connection
> string, but it establishes the physical replication connection so that any
> SQL cannot be executed.
> Is it right? Another approach not to use is to specify the target database
> via GUC, whereas not smart. How do you think?
> 3. You chose the per-db worker approach, however, it is difficult to extend
> the feature to support physical slots. This may be problematic. Was there
> any reasons for that? I doubted ReplicationSlotCreate() or advance functions
> might not be used from other databases and these may be reasons, but not
> sure.
> If these operations can do without connecting to specific database, I think
> the architecture can be changed.
>

I think this point needs some investigation but do we want just one worker
that syncs all the slots? That may lead to lag in keeping the slots
up-to-date. We probably need some tests.

> 4. Currently the launcher establishes the connection every time. Isn't it
> better to reuse the same one instead?
>

I feel it is not the launcher but a separate sync slot worker that
establishes the connection. It is not clear to me what exactly you have in
mind. Can you please explain a bit more?

> Following comments are assumed the configuration, maybe the straightforward:
>
> primary->standby
>    |->subscriber
>
> 5. After constructing the system, I dropped the subscription on the
> subscriber.
> In this case the logical slot on primary was removed, but that was not
> replicated to standby server. Did you support the workload or not?
>

This should work.

>
> 6. Current approach may delay the startpoint of sync.
>
> Assuming that physical replication system is created first, and then the
> subscriber connects to the publisher node.
> In this case the launcher connects to primary earlier than the apply
> worker, and reads the slot. At that time there are no slots on primary, so
> launcher disconnects from primary and waits a time period (up to 3min).
> Even if the apply worker creates the slot on publisher, but the launcher on
> standby cannot notice that. The synchronization may start 3 min later.
>

I feel this should be based on some GUC like 'wal_retrieve_retry_interval'
which we are already using in the launcher or probably a new one if that
doesn't seem to match.

--
With Regards,
Amit Kapila.
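
To illustrate point 6, here is a minimal Python sketch of polling for slots
at a configurable interval rather than sleeping for a fixed long period. It is
not code from the patch: poll_for_slots, fetch_slots, and the parameter names
are hypothetical, and the 5000 ms value in the usage note is only assumed to
mirror the default of wal_retrieve_retry_interval.

```python
from typing import Callable, List, Tuple


def poll_for_slots(
    fetch_slots: Callable[[], List[str]],
    retry_interval_ms: int,
    max_attempts: int,
    sleep: Callable[[float], None],
) -> Tuple[List[str], int]:
    """Poll until fetch_slots() returns a non-empty list.

    Returns (slots, waited_ms). fetch_slots stands in for whatever the
    standby uses to read slot names from the primary (hypothetical here).
    """
    waited_ms = 0
    for _ in range(max_attempts):
        slots = fetch_slots()
        if slots:
            # Slots found: sync can start after at most one interval of delay
            # past the moment the slot was created.
            return slots, waited_ms
        # Nothing to sync yet: wait one retry interval and look again.
        sleep(retry_interval_ms / 1000.0)
        waited_ms += retry_interval_ms
    return [], waited_ms
```

For example, if the slot shows up on the third poll and retry_interval_ms is
5000, the loop has waited 10000 ms in total, so the delay before sync starts
is bounded by the interval rather than by a fixed 3 min nap.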