Dear hackers,

Based on the discussion in which Sawada-san pointed out [1] that the current
approach to time-delayed logical replication prevents WAL from being recycled,
I'm planning to close the CF entry for now.
This thread, or a forked one, will be registered again once we have decided on
an alternative approach. Thank you very much for taking the time to join our
earlier discussions.

I think that, to solve the issue, logical changes must first be flushed on the
subscriber, and the workers must then apply them after the specified delay has
elapsed. The straightforward approach is to follow physical replication and
introduce a walreceiver process on the subscriber. We must research this more,
but there are at least some benefits:

* The publisher can be shut down even if the apply worker is stuck. The apply
  worker is more likely to get stuck than in physical replication, so this may
  improve robustness. For more detail, please see another thread [2].
* With synchronous_commit = 'remote_write', the publisher can COMMIT faster.
  This is because the walreceiver will flush changes immediately and reply soon.
  Even if the time delay is enabled, the wait time will not increase.
* It may be used as infrastructure for parallel apply of non-streaming
  transactions. The basic designs are similar: one process receives changes
  and the others apply them.
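
To make the receive/apply split above a bit more concrete, here is a minimal
sketch in Python (not PostgreSQL code). All names in it (Receiver, ApplyWorker,
min_apply_delay as a plain integer, the spool file) are hypothetical, and
timestamps are plain integers chosen for illustration. It only shows the idea
that a change can be acknowledged as soon as it is flushed locally, while the
apply step waits until the delay has elapsed.

```python
import os
import tempfile
from collections import deque


class Receiver:
    """Hypothetical stand-in for a subscriber-side receiver process."""

    def __init__(self, spool_path):
        self.spool_path = spool_path
        self.flushed = deque()  # (commit_time, change) pairs already durable locally

    def receive(self, commit_time, change):
        # Make the change durable locally before acknowledging it.
        with open(self.spool_path, "a", encoding="utf-8") as f:
            f.write(f"{commit_time}\t{change}\n")
            f.flush()
            os.fsync(f.fileno())
        self.flushed.append((commit_time, change))
        # The acknowledgement does not depend on the apply delay.
        return commit_time


class ApplyWorker:
    """Hypothetical apply worker that honours a minimum apply delay."""

    def __init__(self, receiver, min_apply_delay):
        self.receiver = receiver
        self.min_apply_delay = min_apply_delay

    def apply_ready(self, now):
        # Apply, in order, only the changes whose delay has elapsed.
        applied = []
        queue = self.receiver.flushed
        while queue and queue[0][0] + self.min_apply_delay <= now:
            applied.append(queue.popleft()[1])
        return applied


def demo():
    with tempfile.TemporaryDirectory() as d:
        receiver = Receiver(os.path.join(d, "spool.log"))
        worker = ApplyWorker(receiver, min_apply_delay=10)
        acked = [receiver.receive(0, "INSERT 1"), receiver.receive(5, "INSERT 2")]
        return acked, worker.apply_ready(9), worker.apply_ready(10), worker.apply_ready(15)
```

In demo(), both changes are acknowledged at reception, while the worker applies
each one only once its commit time plus the delay has passed. Crash recovery and
progress sharing between the two sides are deliberately left out of this sketch.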

I searched old discussions [3] and wiki pages, and found that the initial
prototype had a logical walreceiver, but in a later version [4] the apply
worker received changes directly. I could not find the reason for that
decision, but I suspect it was for the following reasons. Could you please
tell me the actual background?

* Performance bottlenecks. If the walreceiver flushes changes and the worker
  applies them, fsync() is called for every reception.
* Complexity. In this design the walreceiver and the apply worker must share
  their flush/apply progress, and crash recovery needs more consideration.
  The related discussion can be found in [5].
* Extensibility. In-core logical replication should serve as an example for
  external projects. The apply worker is just a background worker that can be
  launched from an extension, so it is easy to understand. If it depended
  deeply on the walreceiver, other projects could not follow it.

[1]: https://www.postgresql.org/message-id/CAD21AoAeG2%2BRsUYD9%2BmEwr8-rrt8R1bqpe56T2D%3DeuO-Qs-GAg%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/flat/TYAPR01MB586668E50FC2447AD7F92491F5E89%40TYAPR01MB5866.jpnprd01.prod.outlook.com
[3]: https://www.postgresql.org/message-id/201206131327.24092.andres%402ndquadrant.com
[4]: https://www.postgresql.org/message-id/37e19ad5-f667-2fe2-b95b-bba69c5b6...@2ndquadrant.com
[5]: https://www.postgresql.org/message-id/1339586927-13156-12-git-send-email-andres%402ndquadrant.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
