On Tue, Oct 11, 2016 at 4:18 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote: > On Wed, Sep 28, 2016 at 5:14 PM, Michael Paquier > <michael.paqu...@gmail.com> wrote: >> On Sat, Sep 24, 2016 at 5:37 PM, Masahiko Sawada <sawada.m...@gmail.com> >> wrote: >>> I still vote for changing behaviour of existing syntax 'k (n1, n2)' to >>> quorum commit. >>> That is, >>> 1. 'First k (n1, n2, n3)' means that the master server waits for ACKs >>> from k standby servers whose name appear earlier in the list. >>> 2. 'Any k (n1, n2, n3)' means that the master server waits for ACKs >>> from any k listed standby servers. >>> 3. 'n1, n2, n3' is the same as #1 with k=1. >>> 4. '(n1, n2, n3)' is the same as #2 with k=1. >> >> OK, so I have done a review of this patch keeping that in mind as >> that's the consensus. I am still getting familiar with the code... > > Thank you for reviewing! > >> - transactions will wait until all the standby servers which are >> considered >> + transactions will wait until the multiple standby servers which >> are considered >> There is no real need to update this sentence. >> >> + <literal>FIRST</> means to control the standby servers with >> + different priorities. The synchronous standbys will be those >> + whose name appear earlier in this list, and that are both >> + currently connected and streaming data in real-time(as shown >> + by a state of <literal>streaming</> in the >> + <link linkend="monitoring-stats-views-table"> >> + <literal>pg_stat_replication</></link> view). Other standby >> + servers appearing later in this list represent potential >> + synchronous standbys. If any of the current synchronous >> + standbys disconnects for whatever reason, it will be replaced >> + immediately with the next-highest-priority standby. >> + For example, a setting of <literal>FIRST 3 (s1, s2, s3, s4)</> >> + makes transaction commits wait until their WAL records are received >> + by three higher-priority standbys chosen from standby servers >> + <literal>s1</>, <literal>s2</>, <literal>s3</> and <literal>s4</>. >> It does not seem necessary to me to enter in this level of details: >> The keyword FIRST, coupled with an integer number N, chooses the first >> N higher-priority standbys and makes transaction commit when their WAL >> records are received. For example <literal>FIRST 3 (s1, s2, s3, s4)</> >> makes transaction commits wait until their WAL records are received by >> the three high-priority standbys chosen from standby servers s1, s2, >> s3 and s4. > > Will fix. > >> + <literal>ANY</> means to control all of standby servers with >> + same priority. The master sever will wait for receipt from >> + at least <replaceable class="parameter">num_sync</replaceable> >> + standbys, which is quorum commit in the literature. The all of >> + listed standbys are considered as candidate of quorum commit. >> + For example, a setting of <literal> ANY 3 (s1, s2, s3, s4)</> makes >> + transaction commits wait until receiving receipts from at least >> + any three standbys of four listed servers <literal>s1</>, >> + <literal>s2</>, <literal>s3</>, <literal>s4</>. >> >> Similarly, something like that... >> The keyword ANY, coupled with an integer number N, chooses N standbys >> in a set of standbys with the same, lowest, priority and makes >> transaction commit when WAL records are received those N standbys. For >> example ANY 3(s1, s2, s3, s4) makes transaction commits wait until WAL >> records have been received from 3 servers in the set s1, s2, s3 and >> s4. > > Will fix. > >> It could be good also to mention that no keyword specified means ANY, >> which is incompatible with 9.6. The docs also miss the fact that if a >> simple list of servers is given, without parenthesis and keywords, >> this is equivalent to FIRST 1. > > Right. I will add those documentations. > >> -synchronous_standby_names = '2 (s1, s2, s3)' >> +synchronous_standby_names = 'First 2 (s1, s2, s3)' >> Nit here. It may be a good idea to just use upper-case characters in >> the docs, or just lower-case for consistency, but not mix both. >> Usually GUCs use lower-case characters. > > Agree. Will fix. > >> + when standby is considered as a condidate of quorum commit.</entry> >> s/condidate/candidate/ > > Will fix. > >> -syncrep_scanner.c: FLEXFLAGS = -CF -p >> +syncrep_scanner.c: FLEXFLAGS = -CF -p -i >> Hm... Is that actually a good idea? Now "NODE" and "node" are two >> different things for application_name, but with this patch both would >> have the same meaning. I am getting to think that we could just use >> the lower-case characters for the keywords any/first. Is this -i >> switch a problem for elements in standby_list? > > The string of standby name is not changed actually, only the parser > doesn't distinguish between "NODE" and "node". > The values used for checking application_name will still works fine. > If we want to name "first" or "any" as the standby name then it should > be double quoted. > >> + * Calculate the 'pos' newest Write, Flush and Apply positions among >> sync standbys. >> I don't understand this comment. >> >> + if (SyncRepConfig->sync_method == SYNC_REP_PRIORITY) >> + got_recptr = SyncRepGetOldestSyncRecPtr(&writePtr, &flushPtr, >> + &applyPtr, &am_sync); >> + else /* SYNC_REP_QUORUM */ >> + got_recptr = SyncRepGetNNewestSyncRecPtr(&writePtr, &flushPtr, >> + &applyPtr, >> SyncRepConfig->num_sync, >> + &am_sync); >> Those could be grouped together, there is no need to have pos as an argument. > > Will fix. > >> + /* In quroum method, all sync standby priorities are always 1 */ >> + if (found && SyncRepConfig->sync_method == SYNC_REP_QUORUM) >> + priority = 1; >> This is dead code, SyncRepGetSyncStandbysPriority is not called for >> QUORUM. > > Well, this code is in SyncRepGetStandbyPriority which is called by > SyncRepInitConifig. > SyncRepGetStandbyPriority can be called regardless of the the > synchronization method. > > >> You may want to add an assert in >> SyncRepGetSyncStandbysPriority and SyncRepGetSyncStandbysQuorum to be >> sure that they are getting called for the correct method. >> + /* Sort each array in descending order to get 'pos' newest element */ >> + qsort(write_array, len, sizeof(XLogRecPtr), cmp_lsn); >> + qsort(flush_array, len, sizeof(XLogRecPtr), cmp_lsn); >> + qsort(apply_array, len, sizeof(XLogRecPtr), cmp_lsn); >> There is no need to reorder things again and to use arrays, you can >> choose the newest LSNs when scanning the WalSnd entries. > > I considered it that but it depends on performance. > Current patch avoids O(N*M). >
Attached latest patch. Please review it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
000_quorum_commit_v4.patch
Description: binary/octet-stream
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers