Re: src/bin/pg_upgrade/t/004_subscription.pl test comment fix
On Mon, Feb 5, 2024 at 2:42 AM Peter Smith wrote: > > On Sat, Feb 3, 2024 at 5:28 PM Amit Kapila wrote: > > > > On Thu, Feb 1, 2024 at 5:58 AM Peter Smith wrote: > > > > > > OK. Now using your suggested 2nd sentence: > > > > > > +# The subscription's running status and failover option should be > > > preserved > > > +# in the upgraded instance. So regress_sub1 should still have > > > subenabled,subfailover > > > +# set to true, while regress_sub2 should have both set to false. > > > > > > ~ > > > > > > I also tweaked some other nearby comments/messages which did not > > > mention the 'failover' preservation. > > > > > > > Looks mostly good to me. One minor nitpick: > > * > > along with retaining the replication origin's remote lsn > > -# and subscription's running status. > > +# and subscription's running status and failover option. > > > > I don't think we need to use 'and' twice in the above sentence. We > > should use ',' between different properties. I can change this on > > Monday and push it unless you think otherwise. > > > > +1 > Pushed! -- With Regards, Amit Kapila.
Re: src/bin/pg_upgrade/t/004_subscription.pl test comment fix
On Sat, Feb 3, 2024 at 5:28 PM Amit Kapila wrote: > > On Thu, Feb 1, 2024 at 5:58 AM Peter Smith wrote: > > > > OK. Now using your suggested 2nd sentence: > > > > +# The subscription's running status and failover option should be preserved > > +# in the upgraded instance. So regress_sub1 should still have > > subenabled,subfailover > > +# set to true, while regress_sub2 should have both set to false. > > > > ~ > > > > I also tweaked some other nearby comments/messages which did not > > mention the 'failover' preservation. > > > > Looks mostly good to me. One minor nitpick: > * > along with retaining the replication origin's remote lsn > -# and subscription's running status. > +# and subscription's running status and failover option. > > I don't think we need to use 'and' twice in the above sentence. We > should use ',' between different properties. I can change this on > Monday and push it unless you think otherwise. > +1 == Kind Regards, Peter Smith. Fujitsu Australia
Re: src/bin/pg_upgrade/t/004_subscription.pl test comment fix
On Thu, Feb 1, 2024 at 5:58 AM Peter Smith wrote: > > OK. Now using your suggested 2nd sentence: > > +# The subscription's running status and failover option should be preserved > +# in the upgraded instance. So regress_sub1 should still have > subenabled,subfailover > +# set to true, while regress_sub2 should have both set to false. > > ~ > > I also tweaked some other nearby comments/messages which did not > mention the 'failover' preservation. > Looks mostly good to me. One minor nitpick: * along with retaining the replication origin's remote lsn -# and subscription's running status. +# and subscription's running status and failover option. I don't think we need to use 'and' twice in the above sentence. We should use ',' between different properties. I can change this on Monday and push it unless you think otherwise. -- With Regards, Amit Kapila.
Re: src/bin/pg_upgrade/t/004_subscription.pl test comment fix
On Wed, Jan 31, 2024 at 7:48 PM Alvaro Herrera wrote: > > How about rewording it more extensively? It doesn't read great IMO. > I would use something like > > # In the upgraded instance, the running status and failover option of the > # subscription with the failover option should have been preserved; the other > # should not. > # So regress_sub1 should still have subenabled,subfailover set to true, > # while regress_sub2 should have both set to false. > IIUC this suggested comment is implying that the running status is *only* preserved when the failover option is true. But AFAIK that is not correct. e.g. I hacked the test to keep subscription regress_sub2 as ENABLED but the result after the upgrade was subenabled/subfailover = t/f, not f/f. > I think the symmetry between the two lines confuses more than helps. > It's not a huge thing but since we're editing anyway, why not? > OK. Now using your suggested 2nd sentence: +# The subscription's running status and failover option should be preserved +# in the upgraded instance. So regress_sub1 should still have subenabled,subfailover +# set to true, while regress_sub2 should have both set to false. ~ I also tweaked some other nearby comments/messages which did not mention the 'failover' preservation. PSA v2. == Kind Regards, Peter Smith. Fujitsu Australia v2-0001-Fix-pg_upgrade-test-comment.patch Description: Binary data
Re: src/bin/pg_upgrade/t/004_subscription.pl test comment fix
How about rewording it more extensively? It doesn't read great IMO. I would use something like # In the upgraded instance, the running status and failover option of the # subscription with the failover option should have been preserved; the other # should not. # So regress_sub1 should still have subenabled,subfailover set to true, # while regress_sub2 should have both set to false. I think the symmetry between the two lines confuses more than helps. It's not a huge thing but since we're editing anyway, why not? -- Álvaro HerreraBreisgau, Deutschland — https://www.EnterpriseDB.com/ "The saddest aspect of life right now is that science gathers knowledge faster than society gathers wisdom." (Isaac Asimov)
Re: src/bin/pg_upgrade/t/004_subscription.pl test comment fix
On Wed, Jan 31, 2024 at 4:51 PM vignesh C wrote: > > On Wed, 31 Jan 2024 at 10:27, Peter Smith wrote: > > > > Hi, > > > > PSA a small fix for a misleading comment found in the pg_upgrade test code. > > Is the double # intentional here, as I did not see this style of > commenting used elsewhere: > +# # Upgraded regress_sub1 should still have enabled and failover as true. > +# # Upgraded regress_sub2 should still have enabled and failover as false. > Unintentional caused by copy/paste. Thanks for reporting. I will post a fixed patch tomorrow. == Kind Regards, Peter Smith. Fujitsu Australia
Re: src/bin/pg_upgrade/t/004_subscription.pl test comment fix
On Wed, 31 Jan 2024 at 10:27, Peter Smith wrote: > > Hi, > > PSA a small fix for a misleading comment found in the pg_upgrade test code. Is the double # intentional here, as I did not see this style of commenting used elsewhere: +# # Upgraded regress_sub1 should still have enabled and failover as true. +# # Upgraded regress_sub2 should still have enabled and failover as false. Regards, Vignesh
src/bin/pg_upgrade/t/004_subscription.pl test comment fix
Hi, PSA a small fix for a misleading comment found in the pg_upgrade test code. == Kind Regards, Peter Smith. Fujitsu Australia v1-0001-Fix-test-comment.patch Description: Binary data
Re: trivial comment fix
On Thu, Apr 28, 2022 at 7:27 AM Euler Taveira wrote: > > Hi, > > While reading worker.c, I noticed that the referred SQL command was wrong. > ALTER SUBSCRIPTION ... REFRESH PUBLICATION instead of ALTER TABLE ... REFRESH > PUBLICATION. Trivial fix attached. Pushed, thanks! -- John Naylor EDB: http://www.enterprisedb.com
trivial comment fix
Hi, While reading worker.c, I noticed that the referred SQL command was wrong. ALTER SUBSCRIPTION ... REFRESH PUBLICATION instead of ALTER TABLE ... REFRESH PUBLICATION. Trivial fix attached. -- Euler Taveira EDB https://www.enterprisedb.com/ diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index 4171371296..7da7823c35 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -93,7 +93,7 @@ * ReorderBufferFinishPrepared. * * If the subscription has no tables then a two_phase tri-state PENDING is - * left unchanged. This lets the user still do an ALTER TABLE REFRESH + * left unchanged. This lets the user still do an ALTER SUBSCRIPTION REFRESH * PUBLICATION which might otherwise be disallowed (see below). * * If ever a user needs to be aware of the tri-state value, they can fetch it
Re: comment fix in postmaster.c
On 2021/03/16 18:27, Fujii Masao wrote: Thanks for the path! LGTM. Barring any objection, I will push the patch. Pushed. Thanks! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: Code comment fix
On 3/17/21 9:11 AM, Michael Paquier wrote: > On Wed, Mar 17, 2021 at 08:31:16AM +0100, Vik Fearing wrote: >> When table oids were removed by commit 578b229718e, the function >> CatalogTupleInsert() was modified to return void but its comment was >> left intact. Here is a trivial patch to fix that. > > Thanks, Vik. Good catch. I'll fix that in a bit. Cheers. -- Vik Fearing
Re: Code comment fix
On Wed, Mar 17, 2021 at 08:31:16AM +0100, Vik Fearing wrote: > When table oids were removed by commit 578b229718e, the function > CatalogTupleInsert() was modified to return void but its comment was > left intact. Here is a trivial patch to fix that. Thanks, Vik. Good catch. I'll fix that in a bit. -- Michael signature.asc Description: PGP signature
Code comment fix
When table oids were removed by commit 578b229718e, the function CatalogTupleInsert() was modified to return void but its comment was left intact. Here is a trivial patch to fix that. -- Vik Fearing diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c index 284ceaa6b9..4d1440cd3a 100644 --- a/src/backend/catalog/indexing.c +++ b/src/backend/catalog/indexing.c @@ -210,7 +210,6 @@ CatalogTupleCheckConstraints(Relation heapRel, HeapTuple tup) * CatalogTupleInsert - do heap and indexing work for a new catalog tuple * * Insert the tuple data in "tup" into the specified catalog relation. - * The Oid of the inserted tuple is returned. * * This is a convenience routine for the common case of inserting a single * tuple in a system catalog; it inserts a new heap tuple, keeping indexes
Re: comment fix in postmaster.c
On 2021/03/16 16:51, Kyotaro Horiguchi wrote: While I worked on a patch, I noticed a comment that is inconsistent with the fact. * SIGQUIT is the special signal that says exit without proc_exit * and let the user know what's going on. But if SendStop is set * (-s on command line), then we send SIGSTOP instead, so that we * can get core dumps from all backends by hand. SendStop is set by "-T" option. It was changed by 86c23a6eb2 from "-s" in 2006. The attaches fixes the comment for the master branch. Thanks for the path! LGTM. Barring any objection, I will push the patch. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: A comment fix
At Tue, 12 May 2020 14:45:22 +0900, Michael Paquier wrote in > On Mon, May 11, 2020 at 02:22:36PM +0900, Michael Paquier wrote: > > Looks right to me, so will fix if there are no objections. > > read_local_xlog_page() uses the replay location when in recovery. > > Done this part now. Thanks! regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: A comment fix
On Mon, May 11, 2020 at 02:22:36PM +0900, Michael Paquier wrote: > Looks right to me, so will fix if there are no objections. > read_local_xlog_page() uses the replay location when in recovery. Done this part now. -- Michael signature.asc Description: PGP signature
Re: A comment fix
On Mon, May 11, 2020 at 10:16:19AM +0900, Kyotaro Horiguchi wrote: > The comment is mentioning "replay position" and the callers are > actually using GetXLogReplayRecPtr to check TLI and target LSN. The > comment was written in that way when the function is introduced by > 1148e22a82. The attached fixes that. Looks right to me, so will fix if there are no objections. read_local_xlog_page() uses the replay location when in recovery. > The function GetWalRcvWriteRecPtr is not called from anywhere in core > but I don't think we need to bother removing it since it is a public > function. Yes, I don't think that's removable (just look at the log message of d140f2f3), and the function is dead simple so that's not really going to break even if this is dead in-core now. Worth noting some future WAL prefetch stuff may actually use it. -- Michael signature.asc Description: PGP signature
A comment fix
Hello. I happened to notice a wrong function name in the comment of XLogReadDetermineTimeline. * The caller must also make sure it doesn't read past the current replay * position (using GetWalRcvWriteRecPtr) if executing in recovery, so it The comment is mentioning "replay position" and the callers are actually using GetXLogReplayRecPtr to check TLI and target LSN. The comment was written in that way when the function is introduced by 1148e22a82. The attached fixes that. The function GetWalRcvWriteRecPtr is not called from anywhere in core but I don't think we need to bother removing it since it is a public function. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c index bbd801513a..0bb69447c2 100644 --- a/src/backend/access/transam/xlogutils.c +++ b/src/backend/access/transam/xlogutils.c @@ -681,10 +681,10 @@ XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum, * copied to a new timeline. * * The caller must also make sure it doesn't read past the current replay - * position (using GetWalRcvWriteRecPtr) if executing in recovery, so it + * position (using GetXLogReplayRecPtr) if executing in recovery, so it * doesn't fail to notice that the current timeline became historical. The * caller must also update ThisTimeLineID with the result of - * GetWalRcvWriteRecPtr and must check RecoveryInProgress(). + * GetXLogReplayRecPtr and must check RecoveryInProgress(). */ void XLogReadDetermineTimeline(XLogReaderState *state, XLogRecPtr wantPage, uint32 wantLength)
Re: Comment fix in session.h
On Mon, Jan 13, 2020 at 2:08 PM Amit Kapila wrote: > > On Mon, Jan 13, 2020 at 12:51 PM Antonin Houska wrote: > > > > This diff fixes what I consider a typo. > > > > LGTM. I'll push this in some time. > Pushed. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Comment fix in session.h
On Mon, Jan 13, 2020 at 12:51 PM Antonin Houska wrote: > > This diff fixes what I consider a typo. > LGTM. I'll push this in some time. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: A comment fix in xlogreader.c
At Thu, 26 Sep 2019 11:57:59 +0900, Michael Paquier wrote in <20190926025759.gb2...@paquier.xyz> > On Thu, Sep 26, 2019 at 11:08:09AM +0900, Kyotaro Horiguchi wrote: > > While rechecking another patch, I found that 709d003fbd forgot to > > edit a comment mentioning three members removed from > > XLogReaderState. > > > @@ -103,8 +103,7 @@ XLogReaderAllocate(int wal_segment_size, const char > *waldir, > state->read_page = pagereadfunc; > /* system_identifier initialized to zeroes above */ > state->private_data = private_data; > -/* ReadRecPtr and EndRecPtr initialized to zeroes above */ > -/* readSegNo, readOff, readLen, readPageTLI initialized to zeroes above > */ > +/* ReadRecPtr, EndRecPtr and readLen initialized to zeroes above */ > state->errormsg_buf = palloc_extended(MAX_ERRORMSG_LEN + 1, >MCXT_ALLOC_NO_OOM); > if (!state->errormsg_buf) > > I see. readSegNo and readOff have been moved to WALOpenSegment and > replaced by new, equivalent fields, still all those three fields are > still initialized for the palloc_extended() call to allocate > XLogReaderState, while the two others are now part of > WALOpenSegmentInit(). Your change is correct, so applied. Exactly. Thanks! regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: A comment fix in xlogreader.c
On Thu, Sep 26, 2019 at 11:08:09AM +0900, Kyotaro Horiguchi wrote: > While rechecking another patch, I found that 709d003fbd forgot to > edit a comment mentioning three members removed from > XLogReaderState. > @@ -103,8 +103,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir, state->read_page = pagereadfunc; /* system_identifier initialized to zeroes above */ state->private_data = private_data; -/* ReadRecPtr and EndRecPtr initialized to zeroes above */ -/* readSegNo, readOff, readLen, readPageTLI initialized to zeroes above */ +/* ReadRecPtr, EndRecPtr and readLen initialized to zeroes above */ state->errormsg_buf = palloc_extended(MAX_ERRORMSG_LEN + 1, MCXT_ALLOC_NO_OOM); if (!state->errormsg_buf) I see. readSegNo and readOff have been moved to WALOpenSegment and replaced by new, equivalent fields, still all those three fields are still initialized for the palloc_extended() call to allocate XLogReaderState, while the two others are now part of WALOpenSegmentInit(). Your change is correct, so applied. -- Michael signature.asc Description: PGP signature
A comment fix in xlogreader.c
While rechecking another patch, I found that 709d003fbd forgot to edit a comment mentioning three members removed from XLogReaderState. See the attached. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index 27c27303d6..c8b0d2303d 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -103,8 +103,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir, state->read_page = pagereadfunc; /* system_identifier initialized to zeroes above */ state->private_data = private_data; - /* ReadRecPtr and EndRecPtr initialized to zeroes above */ - /* readSegNo, readOff, readLen, readPageTLI initialized to zeroes above */ + /* ReadRecPtr, EndRecPtr and readLen initialized to zeroes above */ state->errormsg_buf = palloc_extended(MAX_ERRORMSG_LEN + 1, MCXT_ALLOC_NO_OOM); if (!state->errormsg_buf)
Re: Comment fix of config_default.pl
At Sat, 13 Jul 2019 16:53:45 +0900, Michael Paquier wrote in <20190713075345.gc2...@paquier.xyz> > On Fri, Jul 12, 2019 at 05:01:41PM +0900, Michael Paquier wrote: > > I would also patch GetFakeConfigure in Solution.pm (no need to send a > > new patch), and I thought that you'd actually do the change. What do > > you think? > > OK, applied as I have been able to look at it again, and after fixing > the portion for GetFakeConfigure. Thanks! Thanks for committing and your additional part. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Comment fix of config_default.pl
On Fri, Jul 12, 2019 at 05:01:41PM +0900, Michael Paquier wrote: > I would also patch GetFakeConfigure in Solution.pm (no need to send a > new patch), and I thought that you'd actually do the change. What do > you think? OK, applied as I have been able to look at it again, and after fixing the portion for GetFakeConfigure. Thanks! -- Michael signature.asc Description: PGP signature
Re: Comment fix of config_default.pl
On Fri, Jul 12, 2019 at 03:34:11PM +0900, Kyotaro Horiguchi wrote: > At Fri, 12 Jul 2019 13:01:13 +0900, Michael Paquier > wrote in <20190712040113.gd2...@paquier.xyz> >> --with-ossp-uuid is an obsolete spelling. Wouldn't it be better to >> replace it with --with-uuid=? That would be a bit inconsistent > > Oops! Right. My eyes slipped over the difference.. I would also patch GetFakeConfigure in Solution.pm (no need to send a new patch), and I thought that you'd actually do the change. What do you think? -- Michael signature.asc Description: PGP signature
Re: Comment fix of config_default.pl
Thanks. At Fri, 12 Jul 2019 13:01:13 +0900, Michael Paquier wrote in <20190712040113.gd2...@paquier.xyz> > On Fri, Jul 12, 2019 at 12:15:29PM +0900, Kyotaro Horiguchi wrote: > > In src/tools/msvc/config_default.pl, parameter "perl" requires a > > path string, not a bool differently from that of configure > > script. --with-python has the same characteristics and the > > comment is suggesting that. > > > > We need to fix --with-perl and --with-uuid the same way. > > > > + uuid => undef,# --with-ossp-uuid= > > --with-ossp-uuid is an obsolete spelling. Wouldn't it be better to > replace it with --with-uuid=? That would be a bit inconsistent Oops! Right. My eyes slipped over the difference.. > with configure which can only take a set of hardcoded names, still > there is little point in keeping an option which would get removed > sooner than later? Agreed. Attached the fixed version. regards. -- Kyotaro Horiguchi NTT Open Source Software Center >From 621cd2a6c63b4ae99a51dd3a082d889cf7de286f Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Fri, 12 Jul 2019 15:20:27 +0900 Subject: [PATCH] Fix comments in config_default.pl Each configuration option in config_default.pl corresponds to one of the configure script options but some of the correspondents of boolean or item-selection options take path string. Such options are suggested to do so in their comments, but "perl" and "uuid" options were forgotten. Modify them so. --with-ossp-uuid is changed to --with-uuid since it is obsolete. --- src/tools/msvc/config_default.pl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/tools/msvc/config_default.pl b/src/tools/msvc/config_default.pl index 2553636dc1..043df4c5e8 100644 --- a/src/tools/msvc/config_default.pl +++ b/src/tools/msvc/config_default.pl @@ -18,10 +18,10 @@ our $config = { nls => undef,# --enable-nls= tap_tests => undef,# --enable-tap-tests tcl => undef,# --with-tcl= - perl => undef,# --with-perl + perl => undef,# --with-perl= python=> undef,# --with-python= openssl => undef,# --with-openssl= - uuid => undef,# --with-ossp-uuid + uuid => undef,# --with-uuid= xml => undef,# --with-libxml= xslt => undef,# --with-libxslt= iconv => undef,# (not in configure, path to iconv) -- 2.16.3
Re: Comment fix of config_default.pl
On Fri, Jul 12, 2019 at 12:15:29PM +0900, Kyotaro Horiguchi wrote: > In src/tools/msvc/config_default.pl, parameter "perl" requires a > path string, not a bool differently from that of configure > script. --with-python has the same characteristics and the > comment is suggesting that. > > We need to fix --with-perl and --with-uuid the same way. > > + uuid => undef,# --with-ossp-uuid= --with-ossp-uuid is an obsolete spelling. Wouldn't it be better to replace it with --with-uuid=? That would be a bit inconsistent with configure which can only take a set of hardcoded names, still there is little point in keeping an option which would get removed sooner than later? -- Michael signature.asc Description: PGP signature
Re: Comment fix for renamed functions
On Thu, Apr 25, 2019 at 6:51 PM Kyotaro HORIGUCHI wrote: > > Hello. > > I happened to find that several symbols renamed in 3eb77eba5a are > left in comments. Please find the attached. Thanks for the patch! Committed. Regards, -- Fujii Masao
Re: some minor comment fix in md.c
On Wed, Jan 09, 2019 at 08:30:53AM +, Jamison, Kirk wrote: > Here are few minor fix in md.c comments > src/backend/storage/smgr/md.c > > 1. @L174 - removed the unnecessary word "is". > - […] Note that this is breaks mdnblocks() and related functionality [...] > + […] Note that this breaks mdnblocks() and related functionality [...] > > 2. @L885 - grammar fix > - We used to pass O_CREAT here, but that's has the disadvantage that it might > [...] > + We used to pass O_CREAT here, but that has the disadvantage that it might > [...] Thanks, that looks good to me so pushed. -- Michael signature.asc Description: PGP signature
some minor comment fix in md.c
Hi, Here are few minor fix in md.c comments src/backend/storage/smgr/md.c 1. @L174 - removed the unnecessary word "is". - […] Note that this is breaks mdnblocks() and related functionality [...] + […] Note that this breaks mdnblocks() and related functionality [...] 2. @L885 - grammar fix - We used to pass O_CREAT here, but that's has the disadvantage that it might [...] + We used to pass O_CREAT here, but that has the disadvantage that it might [...] Regards, Kirk J. 0001-minor-comment-fix-in-md.c.patch Description: 0001-minor-comment-fix-in-md.c.patch
Re: Minor comment fix for pg_config_manual.h
On 12/29/18 8:27 AM, Michael Paquier wrote: On Fri, Dec 28, 2018 at 12:37:41AM -0300, Alvaro Herrera wrote: Looks good to me. Thanks for the lookup. I have committed and back-patched to v11 for consistency. Thanks! Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Minor comment fix for pg_config_manual.h
On 2018-Dec-28, Michael Paquier wrote: > On Wed, Dec 26, 2018 at 09:36:57AM +0900, Michael Paquier wrote: > > like the attached perhaps? At the same time I am thinking about > > reformulating the second sentence as well.. > > > > /* > > - * This is default value for wal_segment_size to be used at initdb when run > > - * without --walsegsize option. Must be a valid segment size. > > + * This is the default value for wal_segment_size to be used when initdb > > is run > > + * without the --wal-segsize option. It must be a valid segment size. > > */ > > #define DEFAULT_XLOG_SEG_SIZE (16*1024*1024) > > So, any objections with this change? If somebody has a better > wording, please feel free to chime in. Looks good to me. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Minor comment fix for pg_config_manual.h
On Wed, Dec 26, 2018 at 09:36:57AM +0900, Michael Paquier wrote: > like the attached perhaps? At the same time I am thinking about > reformulating the second sentence as well.. > > /* > - * This is default value for wal_segment_size to be used at initdb when run > - * without --walsegsize option. Must be a valid segment size. > + * This is the default value for wal_segment_size to be used when initdb is > run > + * without the --wal-segsize option. It must be a valid segment size. > */ > #define DEFAULT_XLOG_SEG_SIZE(16*1024*1024) So, any objections with this change? If somebody has a better wording, please feel free to chime in. -- Michael signature.asc Description: PGP signature
Re: Minor comment fix for pg_config_manual.h
On Tue, Dec 25, 2018 at 10:22:30AM -0500, Tom Lane wrote: > The text still seems a bit awkward. Maybe "... to be used when initdb > is run without the ..." like the attached perhaps? At the same time I am thinking about reformulating the second sentence as well.. -- Michael diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h index cc5eedfc41..02395af797 100644 --- a/src/include/pg_config_manual.h +++ b/src/include/pg_config_manual.h @@ -14,8 +14,8 @@ */ /* - * This is default value for wal_segment_size to be used at initdb when run - * without --walsegsize option. Must be a valid segment size. + * This is the default value for wal_segment_size to be used when initdb is run + * without the --wal-segsize option. It must be a valid segment size. */ #define DEFAULT_XLOG_SEG_SIZE (16*1024*1024) signature.asc Description: PGP signature
Re: Minor comment fix for pg_config_manual.h
Michael Paquier writes: > On Mon, Dec 24, 2018 at 01:05:25PM +0900, Ian Barwick wrote: >> Attached is mainly to fix a comment in $subject which has a typo in >> the referenced initdb option ("--walsegsize", should be >> "--wal-segsize"), and while I'm there also adds a couple of "the" >> for readability. > All that (the error as well as the extra "the" for clarity in this > sentence) seems right to me. Any opinions from others? The text still seems a bit awkward. Maybe "... to be used when initdb is run without the ..." regards, tom lane
Re: Minor comment fix for pg_config_manual.h
On Mon, Dec 24, 2018 at 01:05:25PM +0900, Ian Barwick wrote: > Attached is mainly to fix a comment in $subject which has a typo in > the referenced initdb option ("--walsegsize", should be > "--wal-segsize"), and while I'm there also adds a couple of "the" > for readability. All that (the error as well as the extra "the" for clarity in this sentence) seems right to me. Any opinions from others? -- Michael signature.asc Description: PGP signature
Minor comment fix for pg_config_manual.h
Hi Attached is mainly to fix a comment in $subject which has a typo in the referenced initdb option ("--walsegsize", should be "--wal-segsize"), and while I'm there also adds a couple of "the" for readability. Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h new file mode 100644 index cc5eedf..cf1cfb4 *** a/src/include/pg_config_manual.h --- b/src/include/pg_config_manual.h *** *** 14,21 */ /* ! * This is default value for wal_segment_size to be used at initdb when run ! * without --walsegsize option. Must be a valid segment size. */ #define DEFAULT_XLOG_SEG_SIZE (16*1024*1024) --- 14,21 */ /* ! * This is the default value for wal_segment_size to be used at initdb when run ! * without the --wal-segsize option. Must be a valid segment size. */ #define DEFAULT_XLOG_SEG_SIZE (16*1024*1024)
Re: Comment fix and question about dshash.c
On Sun, Oct 28, 2018 at 1:22 AM Tomas Vondra wrote: > On 10/27/2018 01:51 PM, Antonin Houska wrote: > > Antonin Houska wrote: > >> Thomas Munro wrote: > >>> On Sat, Oct 27, 2018 at 12:30 AM Antonin Houska wrote: > >>> Are you saying there is a bug in this logic (which is nbuckets * 0.75 > >>> expressed as integer maths), or saying that 0.75 is not a good maximum > >>> load factor? I looked around at a couple of general purpose hash > >>> tables and saw that some used 0.75 and some used 1.0, as a wasted > >>> space-vs-collision trade-off. If I have my maths right[1], with 0.75 > >>> you expect to have 75 entries in ~53 buckets, but with 1.0 you expect > >>> to have 100 entries in ~64 buckets. > >> > >> I don't know how exactly you apply the [1] formula (what is "n" and what is > >> "N" in your case?), but my consideration was much simpler: For example, if > >> BUCKETS_PER_PARTITION returns 8 (power of 2 is expected here and also more > >> convenient), then MAX_COUNT_PER_PARTITION returns 8 / 2 + 8 / 4 = 6. Thus > >> the > >> hashtable gets resized if we're going to add the 7th entry to the > >> partition, > >> i.e. we the number of entries in the partition is lower than the number of > >> buckets. Is that o.k.? n balls = the keys being hashed and insert N bins = the hash table buckets Unless we know of some special properties of the hash function and input data, I believe we have to treat the hashes like uniformly distributed random numbers when making predictions, hence the balls-into-bins probability stuff that can tell you about the expected distribution. The expected number of occupied bins for 1 million balls into 1 million bins: select 100.0 * (1.0 - pow(1.0 - (1.0 / 100), 100.0)); -> 632120.742768354905714205000 The same thing by counting distinct positive random numbers modulo 1 million: select count(distinct (random() * 2)::int % 100) from generate_series(1, 100) ss; -> 632373, 632246, 631954, ... Distinct hashes of the first 1 million integers (arbitrary keys) modulo 1 million (buckets): select count(distinct abs(hashoid(n)) % 100) from generate_series(1, 100) ss(n); -> 632115 (For a moment I was confused about getting a higher number until I realised I need abs() to avoid doubling the effective number of buckets...) > > Well, it may be o.k. I've just checked what the fill factor means in hash > > index and it's also the number of entries divided by the number of buckets. > > > > Using load factor ~0.75 is definitely the right thing to do. One way to > interpret it is "average chain length" (or whatever is the approach in > that particular hash table implementations) and one of the main points > of hash tables is to eliminate linear scans. We could pick a value > closer to 1.0, but experience says it's not worth it - it's easy to get > a annoyingly long chains due to hash collisions, in exchange for fairly > minimal space savings. Using the hashes of the first 1 million integers, let's try putting them into different numbers of buckets, and see how many buckets we get with each chain length: WITH entries AS (SELECT abs(hashoid(generate_series(1, 100))) % NBUCKETS AS bucket), lengths AS (SELECT bucket, COUNT(*) AS chain_length FROM entries GROUP BY bucket) SELECT chain_length, COUNT(*) AS count FROM lengths GROUP BY chain_length ORDER BY 2 DESC; NBUCKETS = 100 (load factor 1.0) chain_length | count --+ 1 | 367537 2 | 184580 3 | 61109 4 | 15197 5 | 3079 6 |509 7 | 95 8 | 7 9 | 2 NBUCKETS = 133 (load factor 0.75) chain_length | count --+ 1 | 472012 2 | 177174 3 | 44348 4 | 8361 5 | 1243 6 |143 7 | 9 8 | 2 NBUCKETS = 200 (load factor 0.5) chain_length | count --+ 1 | 606451 2 | 151741 3 | 25226 4 | 3148 5 |323 6 | 28 7 | 2 > That being said, I wonder if we should tweak NTUP_PER_BUCKET=1 in hash > joins to a lower value. IIRC we ended up with 1.0 because originally it > was set to 10.0, and we reduced it to 1.0 in 9.5 which gave us nice > speedup. But I don't recall if we tried using even lower value (probably > not). Maybe we don't need to do that because we only build the hash > table at the very end, when we exactly know how many entries will it > contain, so we don't need to do lookups and inserts at the same time, > and we don't need to grow the hash table (at least in the non-parallel > case). And we end up with 0.75 load factor on average, due to the > doubling (the sizes are essentially uniformly distributed between > 0.5+epsilon and 1.0-epsilon). Yeah, it would indeed be interesting to experiment with
Re: Comment fix and question about dshash.c
On 10/27/2018 01:51 PM, Antonin Houska wrote: Antonin Houska wrote: Thomas Munro wrote: On Sat, Oct 27, 2018 at 12:30 AM Antonin Houska wrote: Are you saying there is a bug in this logic (which is nbuckets * 0.75 expressed as integer maths), or saying that 0.75 is not a good maximum load factor? I looked around at a couple of general purpose hash tables and saw that some used 0.75 and some used 1.0, as a wasted space-vs-collision trade-off. If I have my maths right[1], with 0.75 you expect to have 75 entries in ~53 buckets, but with 1.0 you expect to have 100 entries in ~64 buckets. I don't know how exactly you apply the [1] formula (what is "n" and what is "N" in your case?), but my consideration was much simpler: For example, if BUCKETS_PER_PARTITION returns 8 (power of 2 is expected here and also more convenient), then MAX_COUNT_PER_PARTITION returns 8 / 2 + 8 / 4 = 6. Thus the hashtable gets resized if we're going to add the 7th entry to the partition, i.e. we the number of entries in the partition is lower than the number of buckets. Is that o.k.? Well, it may be o.k. I've just checked what the fill factor means in hash index and it's also the number of entries divided by the number of buckets. Using load factor ~0.75 is definitely the right thing to do. One way to interpret it is "average chain length" (or whatever is the approach in that particular hash table implementations) and one of the main points of hash tables is to eliminate linear scans. We could pick a value closer to 1.0, but experience says it's not worth it - it's easy to get a annoyingly long chains due to hash collisions, in exchange for fairly minimal space savings. That being said, I wonder if we should tweak NTUP_PER_BUCKET=1 in hash joins to a lower value. IIRC we ended up with 1.0 because originally it was set to 10.0, and we reduced it to 1.0 in 9.5 which gave us nice speedup. But I don't recall if we tried using even lower value (probably not). Maybe we don't need to do that because we only build the hash table at the very end, when we exactly know how many entries will it contain, so we don't need to do lookups and inserts at the same time, and we don't need to grow the hash table (at least in the non-parallel case). And we end up with 0.75 load factor on average, due to the doubling (the sizes are essentially uniformly distributed between 0.5+epsilon and 1.0-epsilon). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Comment fix and question about dshash.c
Antonin Houska wrote: > Thomas Munro wrote: > > > On Sat, Oct 27, 2018 at 12:30 AM Antonin Houska wrote: > > > > Are you saying there is a bug in this logic (which is nbuckets * 0.75 > > expressed as integer maths), or saying that 0.75 is not a good maximum > > load factor? I looked around at a couple of general purpose hash > > tables and saw that some used 0.75 and some used 1.0, as a wasted > > space-vs-collision trade-off. If I have my maths right[1], with 0.75 > > you expect to have 75 entries in ~53 buckets, but with 1.0 you expect > > to have 100 entries in ~64 buckets. > > I don't know how exactly you apply the [1] formula (what is "n" and what is > "N" in your case?), but my consideration was much simpler: For example, if > BUCKETS_PER_PARTITION returns 8 (power of 2 is expected here and also more > convenient), then MAX_COUNT_PER_PARTITION returns 8 / 2 + 8 / 4 = 6. Thus the > hashtable gets resized if we're going to add the 7th entry to the partition, > i.e. we the number of entries in the partition is lower than the number of > buckets. Is that o.k.? Well, it may be o.k. I've just checked what the fill factor means in hash index and it's also the number of entries divided by the number of buckets. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26, A-2700 Wiener Neustadt Web: https://www.cybertec-postgresql.com
Re: Comment fix and question about dshash.c
Thomas Munro wrote: > On Sat, Oct 27, 2018 at 12:30 AM Antonin Houska wrote: > > Thanks, will fix. > > > 2. Can anyone please explain this macro? > > > > /* Max entries before we need to grow. Half + quarter = 75% load factor. */ > > #define MAX_COUNT_PER_PARTITION(hash_table) \ > > (BUCKETS_PER_PARTITION(hash_table->size_log2) / 2 + \ > > BUCKETS_PER_PARTITION(hash_table->size_log2) / 4) > > > > I'm failing to understand why the maximum number of hash table entries in a > > partition should be smaller than the number of buckets in that partition. > > > > The fact that MAX_COUNT_PER_PARTITION refers to entries and not buckets is > > obvious from this condition in dshash_find_or_insert() > > > > /* Check if we are getting too full. */ > > if (partition->count > MAX_COUNT_PER_PARTITION(hash_table)) > > Are you saying there is a bug in this logic (which is nbuckets * 0.75 > expressed as integer maths), or saying that 0.75 is not a good maximum > load factor? I looked around at a couple of general purpose hash > tables and saw that some used 0.75 and some used 1.0, as a wasted > space-vs-collision trade-off. If I have my maths right[1], with 0.75 > you expect to have 75 entries in ~53 buckets, but with 1.0 you expect > to have 100 entries in ~64 buckets. I don't know how exactly you apply the [1] formula (what is "n" and what is "N" in your case?), but my consideration was much simpler: For example, if BUCKETS_PER_PARTITION returns 8 (power of 2 is expected here and also more convenient), then MAX_COUNT_PER_PARTITION returns 8 / 2 + 8 / 4 = 6. Thus the hashtable gets resized if we're going to add the 7th entry to the partition, i.e. we the number of entries in the partition is lower than the number of buckets. Is that o.k.? > [1] > https://math.stackexchange.com/questions/470662/average-number-of-bins-occupied-when-throwing-n-balls-into-n-bins -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26, A-2700 Wiener Neustadt Web: https://www.cybertec-postgresql.com
Re: Comment fix and question about dshash.c
On Sat, Oct 27, 2018 at 10:03 AM Thomas Munro wrote: > space-vs-collision trade-off. If I have my maths right[1], with 0.75 > you expect to have 75 entries in ~53 buckets, but with 1.0 you expect > to have 100 entries in ~64 buckets. It'd be a fair criticism that Forgot to add: assuming 100 buckets, for illustration (the real number is a power of 2). -- Thomas Munro http://www.enterprisedb.com
Re: Comment fix and question about dshash.c
On Sat, Oct 27, 2018 at 12:30 AM Antonin Houska wrote: > 1. The return type of resize() function is void, so I propose part of the > header comment to be removed: > > diff --git a/src/backend/lib/dshash.c b/src/backend/lib/dshash.c > index b46f7c4cfd..b2b8fe60e1 100644 > --- a/src/backend/lib/dshash.c > +++ b/src/backend/lib/dshash.c > @@ -672,9 +672,7 @@ delete_item(dshash_table *hash_table, dshash_table_item > *item) > > /* > * Grow the hash table if necessary to the requested number of buckets. The > - * requested size must be double some previously observed size. Returns true > - * if the table was successfully expanded or found to be big enough already > - * (because another backend expanded it). > + * requested size must be double some previously observed size. > * > * Must be called without any partition lock held. > */ Thanks, will fix. > 2. Can anyone please explain this macro? > > /* Max entries before we need to grow. Half + quarter = 75% load factor. */ > #define MAX_COUNT_PER_PARTITION(hash_table) \ > (BUCKETS_PER_PARTITION(hash_table->size_log2) / 2 + \ > BUCKETS_PER_PARTITION(hash_table->size_log2) / 4) > > I'm failing to understand why the maximum number of hash table entries in a > partition should be smaller than the number of buckets in that partition. > > The fact that MAX_COUNT_PER_PARTITION refers to entries and not buckets is > obvious from this condition in dshash_find_or_insert() > > /* Check if we are getting too full. */ > if (partition->count > MAX_COUNT_PER_PARTITION(hash_table)) Are you saying there is a bug in this logic (which is nbuckets * 0.75 expressed as integer maths), or saying that 0.75 is not a good maximum load factor? I looked around at a couple of general purpose hash tables and saw that some used 0.75 and some used 1.0, as a wasted space-vs-collision trade-off. If I have my maths right[1], with 0.75 you expect to have 75 entries in ~53 buckets, but with 1.0 you expect to have 100 entries in ~64 buckets. It'd be a fair criticism that that's arbitrarily different to the choice made for hash joins, and dynahash's default is 1 (though it's a run-time parameter). [1] https://math.stackexchange.com/questions/470662/average-number-of-bins-occupied-when-throwing-n-balls-into-n-bins -- Thomas Munro http://www.enterprisedb.com