On 2016-08-31 22:51, Petr Jelinek wrote:
> Hi,
>
> and one more version with bug fixes, improved code docs and couple more tests, some general cleanup and also rebased on current master for the start of CF.

Clear, well-written docs, thanks.

Here are some small changes to logical-replication.sgml.

Erik Rijkers

--- doc/src/sgml/logical-replication.sgml.orig	2016-09-01 00:04:11.484729675 +0200
+++ doc/src/sgml/logical-replication.sgml	2016-09-01 00:54:07.817799915 +0200
@@ -22,11 +22,11 @@
  cascading replication or more complex configurations.
  </para>
  <para>
- Logical replication typically starts with snapshot of the data on
- the publisher database. Once that is done, the changes on publisher
- are sent to subscriber as they occur in real-time. The subscriber
- applies the data in same order as the publisher so that the
- transactional consistency is guaranteed for the publications within
+ Logical replication typically starts with a snapshot of the data on
+ the publisher database. Once that is done, the changes on the publisher
+ are sent to the subscriber as they occur in real-time. The subscriber
+ applies the data in the same order as the publisher so that
+ transactional consistency is guaranteed for publications within
  a single subscription. This method of data replication is sometimes
  referred to as transactional replication.
  </para>
@@ -54,23 +54,23 @@
  </listitem>
  <listitem>
  <para>
- Replicating between different major versions of the PostgreSQL
+ Replicating between different major versions of PostgreSQL
  </para>
  </listitem>
  <listitem>
  <para>
- Giving access to the replicated data to different groups of
+ Giving access to replicated data to different groups of
  users.
  </para>
  </listitem>
  <listitem>
  <para>
- Sharing subset of the database between multiple databases.
+ Sharing a subset of the database between multiple databases.
  </para>
  </listitem>
  </itemizedlist>
  <para>
- The subscriber database behaves in a same way as any other
+ The subscriber database behaves in the same way as any other
  PostgreSQL instance and can be used as a publisher for other
  databases by defining its own publications. When the subscriber is
  treated as read-only by application, there will be no conflicts from
@@ -83,9 +83,9 @@
  <title>Publication</title>
  <para>
  A <firstterm>publication</> object can be defined on any physical
- replication master, the node where publication is deined is referred to
- as <firstterm>publisher</>. Only superusers or members of
- <literal>REPLICATION</> role can define publication. A publication is
+ replication master. The node where a publication is defined is referred
+ to as <firstterm>publisher</>. Only superusers or members of the
+ <literal>REPLICATION</> role can define a publication. A publication is
  a set of changes generated from a group of tables, and might also be
  described as a <firstterm>change set</> or <firstterm>replication set</>.
  Each publication exists in only one database.
@@ -93,7 +93,7 @@
  <para>
  Publications are different from table schema and do not affect
  how the table is accessed. Each table can be added to multiple
- publications if needed. Publications may currenly only contain
+ publications if needed. Publications may currently only contain
  tables. Objects must be added explicitly, except when a publication
  is created for <literal>ALL TABLES</>. There is no default name for
  a publication which specifies all tables.
@@ -103,8 +103,8 @@
  any combination of <command>INSERT</>, <command>UPDATE</>,
  <command>DELETE</> and <command>TRUNCATE</> in a similar way to the
  way triggers are fired by particular event types. Only
- tables with <literal>REPLICA IDENTITY</> index can be added to
- publication which replicate <command>UPDATE</> and <command>DELETE</>
+ tables with a <literal>REPLICA IDENTITY</> index can be added to a
+ publication which replicates <command>UPDATE</> and <command>DELETE</>
  operation.
  </para>
  <para>
@@ -129,20 +129,20 @@
  <sect1 id="logical-replication-subscription">
  <title>Subscription</title>
  <para>
- A <firstterm>subscription</> is the downstream side of the logical
+ A <firstterm>subscription</> is the downstream side of logical
  replication. The node where subscription is defined is referred to as
- a <firstterm>subscriber</>. Subscription defines the connection to
+ the <firstterm>subscriber</>. Subscription defines the connection to
  another database and set of publications (one or more) to which it
  wants to be subscribed.
  </para>
  <para>
- The subscriber database behaves in a same way as any other
+ The subscriber database behaves in the same way as any other
  PostgreSQL instance and can be used as a publisher for other
  databases by defining its own publications.
  </para>
  <para>
  A subscriber may have multiple subscriptions if desired. It is
- possible to define multiple subscriptions between single
+ possible to define multiple subscriptions between a single
  publisher-subscriber pair, in which case extra care must be taken to
  ensure that the subscribed publication objects don't overlap.
  </para>
@@ -153,17 +153,17 @@
  of pre-existing table data.
  </para>
  <para>
- Subscriptions are not dumped by pg_dump by default, but can be
- requested using --subscriptions parameter.
+ Subscriptions are not dumped by pg_dump by default but can be
+ requested using the --subscriptions parameter.
  </para>
  <para>
  The subscription is added using <xref linkend="sql-createsubscription">
- and can be stopped/resumed at any time using
+ and can be stopped/resumed at any time using the
  <xref linkend="sql-altersubscription"> command or removed using
  <xref linkend="sql-dropsubscription">.
  </para>
  <para>
- When subscription is dropped and recreated the synchronization
+ When subscription is dropped and recreated, the synchronization
  information is lost. This means that the data has to be
  resynchronized afterwards.
  </para>
@@ -173,25 +173,25 @@
  <para>
  The logical replication behaves similarly to normal <literal>DML</>
  operations in that the data will be updated even if it was changed
- locally on the subscriber node. In case when the incoming data
- violates any constraints the replication will stop. This is refered
+ locally on the subscriber node. If the incoming data
+ violates any constraints the replication will stop. This is referred
  to as a <firstterm>conflict</>. When replicating <command>UPDATE</>
- or <command>DELETE</> operations any missing data will not produce
- conflict and such operation will simply be skipped.
+ or <command>DELETE</> operations, missing data will not produce a
+ conflict and such operations will simply be skipped.
  </para>
  <para>
- The conflicts will produce error and stop the replication and must
- be resolved manually by user.
+ A conflict will produce an error and will stop the replication; it
+ must be resolved manually by the user.
  </para>
  <para>
- The resolution can be done either by changing dota on the subscriber
+ The resolution can be done either by changing data on the subscriber
  so that it does not conflict with incoming change or by skipping the
- transaction which conflicts with the existing data. The transaction
+ transaction that conflicts with the existing data. The transaction
  can be skipped by calling the
  <link linkend="pg-replication-origin-advance">
  <function>pg_replication_origin_advance()</function></link> function
- with <literal>node_name</> corresponding to the subscription name. The
- current position of origins can be seen in
+ with a <literal>node_name</> corresponding to the subscription name. The
+ current position of origins can be seen in the
  <link linkend="view-pg-replication-origin-status">
  <structname>pg_replication_origin_status</structname></link> system view.
  </para>
@@ -200,28 +200,28 @@
  <title>Architecture</title>
  <para>
  Logical replication starts by copying a snapshot of the data on
- the publisher database. Once that is done, the changes on publisher
- are sent to subscriber as they occur in real-time. The subscriber
- applies the data in the order in which commits were made on the
+ the publisher database. Once that is done, changes on the publisher
+ are sent to the subscriber as they occur in real-time. The subscriber
+ applies data in the order in which commits were made on the
  publisher so that transactional consistency is guaranteed for the
  publications within any single Subscription.
  </para>
  <para>
- The logical replication is built on the similar architecture as the
+ Logical replication is built with an architecture similar to
  physical streaming replication (see
  <xref linkend="streaming-replication">). It is implemented by
- WalSender and the Apply processes. The WalSender starts the logical
+ WalSender and the Apply processes. The WalSender starts logical
  decoding (described in <xref linkend="logicaldecoding">) of the WAL
  and loads the standard logical decoding plugin (pgoutput). The plugin
  transforms the changes read from WAL to the logical replication protocol
  (see <xref linkend="protocol-logical-replication">) and filters the data
- according to publication specifications. The data are then continuously
+ according to publication specification. The data are then continuously
  transferred using the streaming replication protocol to the Apply worker
  which maps them to the local tables and applies the individual changes as
  they are received in exact transactional order.
  </para>
  <para>
- The Apply process on subscriber database always runs with
+ The Apply process on the subscriber database always runs with
  session_replication_role set to replica, which produces the usual
  effects on triggers and constraints.
  </para>
@@ -229,15 +229,15 @@
  <title>Initial snapshot</title>
  <para>
  The initial data in existing subscribed tables are snapshotted and
- copied in a parallel instance of special kind of Apply process.
- This process will create it's own temporary replication slot and
- copies the existing data. Once existing data is copied, the worker
+ copied in a parallel instance of a special kind of Apply process.
+ This process will create its own temporary replication slot and
+ copy the existing data. Once existing data is copied, the worker
  enters synchronization mode which ensures that the table is brought
- up to synchronized state with the main Apply proccess by streaming
+ up to synchronized state with the main Apply process by streaming
  any changes which happened during the initial data copy using standard
- logical replication. Once the sycnhronization is done, the control
- of of the replication of the table is given back to the main Apply
- proccess where the replication continues as normal.
+ logical replication. Once the synchronization is done, the control
+ of the replication of the table is given back to the main Apply
+ process where the replication continues as normal.
  </para>
  </sect2>
  </sect1>
@@ -246,18 +246,18 @@
  <para>
  Because logical replication is based on similar architecture as
  <link linkend="streaming-replication">physical streaming
- replication</link> the monitoring on publicasher is very similar to
- monitoring of physical replication master(see
+ replication</link> the monitoring on publication is very similar to
+ monitoring of physical replication master (see
  <xref linkend="streaming-replication-monitoring">).
  </para>
  <para>
- The monitoring information about subscription can is visible in
+ The monitoring information about subscription is visible in
  <link linkend="pg-stat-subscription"><literal>pg_stat_subscription</></link>.
- This view contains one row per every subscription worker. Subscription
- can have zero or more active subscription workers depending on it's state.
+ This view contains one row for every subscription worker. Subscription
+ can have zero or more active subscription workers depending on its state.
  </para>
  <para>
- Normally there is single Apply process running for the enabled
+ Normally there is a single Apply process running for the enabled
  subscription. The disabled subscription of crashed subscription will
  have zero rows in this view. If the initial data synchronization of
  any table is in progress there will be additional worker(s) for the
@@ -270,7 +270,7 @@
  Replication connection can occur in the same way as physical streaming
  replication. It requires access to be specifically given using
  pg_hba.conf. The role used for the replication must have
- <literal>REPLICATION</literal> privilege <command>GRANTED</command>.
+ <literal>REPLICATION</literal> privilege <command>GRANT</command>ED.
  This gives a role access to both logical and physical replication.
  </para>
  <para>
@@ -281,7 +281,7 @@
  To create a subscription the user must be a superuser.
  </para>
  <para>
- The subscription apply process will run in local database
+ The subscription apply process will run in the local database
  with the privileges of a superuser.
  </para>
  <para>
@@ -298,10 +298,10 @@
  </para>
  <para>
  On the publisher side the <varname>wal_level</> must be set to
- <literal>logical</>, <varname>max_replication_slots</> has to be set to
- at least number of Subscriptions expected to connect with some reserve
+ <literal>logical</>, and <varname>max_replication_slots</> has to be set to
+ at least the number of Subscriptions expected to connect with some reserve
  for table synchronization as well. And <varname>max_wal_senders</>
- should be set to at least same as <varname>max_replication_slots</> plus
+ should be set to at least the same as <varname>max_replication_slots</> plus
  the number of physical replicas that are connected at the same time.
  </para>
  <para>
@@ -311,7 +311,7 @@
  <varname>max_logical_replication_workers</> has to be set to at least
  the number of Subscriptions again with some reserve for the table
  synchronization. Additionally the <varname>max_worker_processes</> may
- need to be adjusted to accommodate for replication workers at least
+ need to be adjusted to accommodate for replication workers, at least
  (<varname>max_logical_replication_workers</> + <literal>1</>). Please
  note that some extensions and parallel queries also take worker slots
  from <varname>max_worker_processes</>.
@@ -325,7 +325,7 @@
 wal_level = logical
 max_worker_processes = 10 # one per subscription + one per instance needed on subscriber
 max_logical_replication_workers = 10 # one per subscription + one per instance needed on subscriber
-max_replication_slots = 10 # one per subscription needed both publisher and subscriber
+max_replication_slots = 10 # one per subscription needed on both publisher and subscriber
 max_wal_senders = 10 # one per subscription needed on publisher
 </programlisting>
  </para>
@@ -338,13 +338,13 @@
 </programlisting>
  </para>
  <para>
- Then on publisher database:
+ Then on the publisher database:
 <programlisting>
 CREATE PUBLICATION mypub FOR TABLE users, departments;
 </programlisting>
  </para>
  <para>
- And on Subscriber database:
+ And on the Subscriber database:
 <programlisting>
 CREATE SUBSCRIPTION mysub WITH CONNECTION <quote>dbname=foo host=bar user=repuser</quote> PUBLICATION mypub;
 </programlisting>
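
Not part of the patch, but as a rough cross-check of the sizing rules in the configuration paragraphs above, the lower bounds can be sketched in Python. The function name, its parameters and the sync_reserve allowance are hypothetical; the docs only say "some reserve" for table synchronization and do not give a number.

```python
def minimum_settings(subscriptions, physical_replicas=0, sync_reserve=0):
    """Lower bounds implied by the configuration paragraphs in the diff.

    sync_reserve is a hypothetical stand-in for the unspecified
    "some reserve" kept for table synchronization.
    """
    # Publisher: at least one slot per expected subscription, plus reserve.
    max_replication_slots = subscriptions + sync_reserve
    # Publisher: at least max_replication_slots plus connected physical replicas.
    max_wal_senders = max_replication_slots + physical_replicas
    # Subscriber: at least one worker per subscription, plus reserve.
    max_logical_replication_workers = subscriptions + sync_reserve
    # Subscriber: at least max_logical_replication_workers + 1.
    max_worker_processes = max_logical_replication_workers + 1
    return {
        "max_replication_slots": max_replication_slots,
        "max_wal_senders": max_wal_senders,
        "max_logical_replication_workers": max_logical_replication_workers,
        "max_worker_processes": max_worker_processes,
    }


if __name__ == "__main__":
    # Example: 4 subscriptions, 2 physical standbys, reserve of 2 sync workers.
    for name, value in minimum_settings(4, 2, 2).items():
        print(f"{name} = {value}")
```

These are minimums only; as the docs note, extensions and parallel queries also draw from max_worker_processes, so real values would likely need to be higher.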