Re: [HACKERS] increasing the default WAL segment size
Hi, On 2017-09-14 11:31:33 +0530, Beena Emerson wrote: > The change looks good and is working as expected. > PFA the updated patch after running pgindent. I've pushed this version. Yay! Thanks for the work Beena, everyone! The only change I made is to run the pg_upgrade tests with a 1 MB segment size, as discussed in [1]. We'll probably want to refine that, but let's discuss that in the other thread. Regards, Andres [1] http://archives.postgresql.org/message-id/20170919175457.liz3oreqiambuhca%40alap3.anarazel.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hi, On 2017-09-06 20:24:16 +0530, Beena Emerson wrote: > > - pg_standby's RetrieveWALSegSize() does too much for it's name. It > > seems quite weird that a function named that way has the section below > > "/* check if clean up is necessary */" > > we set 2 cleanup related variables once WalSegSize is set, namely > need_cleanup and exclusiveCleanupFileName. Does > SetWALSegSizeAndCleanupValues look good? It's better, but see below. > > - the way you redid the ReadControlFile() invocation doesn't quite seem > > right. Consider what happens if XLOGbuffers isn't -1 - then we > > wouldn't read the control file, but you unconditionally copy it in > > XLOGShmemInit(). I think we instead should introduce something like > > XLOGPreShmemInit() that reads the control file unless in bootstrap > > mode. Then get rid of the second ReadControlFile() already present. > > I did not think it was necessary to create a new function, I have > simply added the check and > function call within the XLOGShmemInit(). Which is wrong. XLogShmemSize() already needs to know the actual size, otherwise we allocate the wrong shmem size. You may sometimes succeed nevertheless because we leave some slop unused shared memory space, but it's not ok to rely on. See the refactoring I did in 0001. Changes: - refactored the way the control file was handled, moved it to separate phase. I wrote this last and it's late, so I'm not yet fully confident in it, but it survives plain and EXEC_BACKEND builds. This also gets rid of ferrying wal_segment_size through the EXEC_BACKEND variable stuff, which didn't really do much, given how many other parts weren't carried over. - renamed all the non-postgres binary version of wal_segment_size to WalSegSz, diverging seems pointless, and the WalSegsz seems inconsistent. - changed malloc in pg_waldump's search_directory() to a stack allocation. Less because of efficiency, more because there wasn't any error handling. - removed redundant char * -> XLogPageHeader -> XLogLongPageHeader casting. - replace new malloc with pg_malloc in initdb (failure handling!) - replaced the floating point logic in pretty_wal_size with a, imo much simpler, (sz % 1024) == 0 - it's inconsistent that the new code for pg_standby was added to the top of the file, where all the customizable stuff resides. - other small changes Issues: - I think the pg_standby stuff isn't correct. And it's hard to understand. Consider the case where the first file restored is *not* a timeline history file, but *also* not a complete file. We'll start to spew "not enough data in file" errors and such, which we previously didn't. My preferred solution would be to remove pg_standby ([1]), but that's probably not quick enough. Unless we can quickly agree on that, I think we need to refactor this a bit, I've done so in the attached, but it's untested. Could you please verify it works and if not fix it up? What do you think? Regards, Andres [1] http://archives.postgresql.org/message-id/20170913064824.rqflkadxwpboabgw%40alap3.anarazel.de >From 2ac42f99dacfd8b301d7d5c3d7adfd1bf7357508 Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Wed, 13 Sep 2017 02:12:17 -0700 Subject: [PATCH 1/2] Perform only one ReadControlFile() during startup. Previously we read the control file in multiple places. But soon the segment size will be configurable and stored in the control file, and that needs to be available earlier than it currently is needed. Instead of adding yet another place where it's read, refactor things so there's a single processing of the control file during startup (in EXEC_BACKEND that's every individual backend's startup). --- src/backend/access/transam/xlog.c | 48 ++--- src/backend/postmaster/postmaster.c | 6 + src/backend/tcop/postgres.c | 3 +++ src/include/access/xlog.h | 1 + 4 files changed, 44 insertions(+), 14 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index a3e8ce092f..5c66a8de7d 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -4799,6 +4799,22 @@ check_wal_buffers(int *newval, void **extra, GucSource source) return true; } +/* + * Read the control file, set respective GUCs. + * + * This is to be called during startup, unless in bootstrap mode, where no + * control file yet exists. As there's no shared memory yet (it's sizing can + * depend on the contents of the control file!), first store data in local + * memory. XLOGShemInit() will then copy it to shared memory later. + */ +void +LocalProcessControlFile(void) +{ + Assert(ControlFile == NULL); + ControlFile = palloc(sizeof(ControlFileData)); + ReadControlFile(); +} + /* * Initialization of shared memory for XLOG */ @@ -4850,6 +4866,7 @@ XLOGShmemInit(void) foundXLog; char *allocptr;
Re: [HACKERS] increasing the default WAL segment size
Hi, I was looking to commit this, but the changes I made ended up being pretty large. Here's what I changed in the attached: - split GUC_UNIT_BYTE into a separate commit, squashed rest - renamed GUC_UNIT_BYT to GUC_UNIT_BYTE, don't see why we'd have such a weird abbreviation? - bumped control file version, otherwise things wouldn't work correctly - wal_segment_size text still said "Shows the number of pages per write ahead log segment." - I still feel strongly that exporting XLogSegSize, which previously was a macro and now a integer variable, is a bad idea. Hence I've renamed it to wal_segment_size. - There still were comments referencing XLOG_SEG_SIZE - IsPowerOf2 regarded 0 as a valid power of two - ConvertToXSegs() depended on a variable not passed as arg, bad idea. - As previously mentioned, I don't think it's ok to rely on vars like XLogSegSize to be defined both in backend and frontend code. - I don't think XLogReader can rely on XLogSegSize, needs to be parametrized. - pg_rewind exported another copy of extern int XLogSegSize - streamutil.h had a extern uint32 WalSegsz; but used RetrieveXlogSegSize, that seems needlessly different - moved wal_segment_size (aka XLogSegSize) to xlog.h - pg_standby included xlogreader, not sure why? - MaxSegmentsPerLogFile still had a conflicting naming scheme - you'd included "sys/stat.h", that's not really appropriate for system headers, should be (and then grouped w/ rest) - pg_controldata's warning about an invalid segsize missed newlines Unresolved: - this needs some new performance tests, the number of added instructions isn't trivial. Don't think there's anything, but ... - read through it again, check long lines - pg_standby's RetrieveWALSegSize() does too much for it's name. It seems quite weird that a function named that way has the section below "/* check if clean up is necessary */" - the way you redid the ReadControlFile() invocation doesn't quite seem right. Consider what happens if XLOGbuffers isn't -1 - then we wouldn't read the control file, but you unconditionally copy it in XLOGShmemInit(). I think we instead should introduce something like XLOGPreShmemInit() that reads the control file unless in bootstrap mode. Then get rid of the second ReadControlFile() already present. - In pg_resetwal.c:ReadControlFile() we ignore the file contents if there's an invalid segment size, but accept the contents as guessed if there's a crc failure - that seems a bit weird? - verify EXEC_BACKEND does the right thing - not this commit/patch, but XLogReadDetermineTimeline() could really use some simplifying of repetitive expresssions - XLOGShmemInit shouldn't memcpy to temp_cfile and such, why not just save previous pointer in a local variable? - could you fill in the Reviewed-By: line in the commit message? Running out of concentration / time now. - Andres >From 15d16b8e2146b0491e8b64e780227424162dd784 Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Tue, 5 Sep 2017 13:26:55 -0700 Subject: [PATCH 1/2] Introduce BYTES unit for GUCs. This is already useful for track_activity_query_size, and will further be used in followup commits. Author: Beena Emerson Reviewed-By: Andres Freund Discussion: https://postgr.es/m/caog9apeu8bxvwbxkoo9j7zpm76task_vfmeeicejwhmmsli...@mail.gmail.com --- src/backend/utils/misc/guc.c | 14 +- src/include/utils/guc.h | 1 + 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 246fea8693..25da06fffc 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -722,6 +722,11 @@ static const char *memory_units_hint = gettext_noop("Valid units for this parame static const unit_conversion memory_unit_conversion_table[] = { + {"GB", GUC_UNIT_BYTE, 1024 * 1024 * 1024}, + {"MB", GUC_UNIT_BYTE, 1024 * 1024}, + {"kB", GUC_UNIT_BYTE, 1024}, + {"B", GUC_UNIT_BYTE, 1}, + {"TB", GUC_UNIT_KB, 1024 * 1024 * 1024}, {"GB", GUC_UNIT_KB, 1024 * 1024}, {"MB", GUC_UNIT_KB, 1024}, @@ -2863,11 +2868,7 @@ static struct config_int ConfigureNamesInt[] = {"track_activity_query_size", PGC_POSTMASTER, RESOURCES_MEM, gettext_noop("Sets the size reserved for pg_stat_activity.query, in bytes."), NULL, - - /* - * There is no _bytes_ unit, so the user can't supply units for - * this. - */ + GUC_UNIT_BYTE }, &pgstat_track_activity_query_size, 1024, 100, 102400, @@ -8113,6 +8114,9 @@ GetConfigOptionByNum(int varnum, const char **values, bool *noshow) { switch (conf->flags & (GUC_UNIT_MEMORY | GUC_UNIT_TIME)) { + case GUC_UNIT_BYTE: +values[2] = "B"; +break; case GUC_UNIT_KB: values[2] = "kB"; break; diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h index c1870d2130..467125a09d 100644 --- a/src/include/utils/guc.h +++ b/src/include/utils/guc.h @@ -219,6 +219,7 @@ typedef enum #define GUC_UNIT_BLOCKS 0x2000 /* value is in blocks */
Re: [HACKERS] increasing the default WAL segment size
Hi, On 2017-08-23 12:13:15 +0530, Beena Emerson wrote: > >> + /* > >> + * The calculation of XLOGbuffers requires the run-time > >> parameter > >> + * XLogSegSize which is set from the control file. This > >> value is > >> + * required to create the shared memory segment. Hence, > >> temporarily > >> + * allocate space for reading the control file. > >> + */ > > > > This makes me uncomfortable. Having to choose the control file multiple > > times seems wrong. We're effectively treating the control file as part > > of the configuration now, and that means we should move it's parsing to > > an earlier part of startup. > > Yes, this may seem ugly. ControlFile was originally read into the > shared memory segment but then we now need the XLogSegSize from the > ControlFile to initialise the shared memory segment. I could not > figure out any other way to achieve this. I think reading it one into local memory inside the startup process and then copying it into shared memory from there should work? > >> @@ -8146,6 +8181,9 @@ InitXLOGAccess(void) > >> ThisTimeLineID = XLogCtl->ThisTimeLineID; > >> Assert(ThisTimeLineID != 0 || IsBootstrapProcessingMode()); > >> > >> + /* set XLogSegSize */ > >> + XLogSegSize = ControlFile->xlog_seg_size; > >> + > > > > Hm, why do we have two variables keeping track of the segment size? > > wal_segment_size and XLogSegSize? That's bound to lead to confusion. > > > > wal_segment_size is the guc which stores the number of segments > (XLogSegSize / XLOG_BLCKSZ). wal_segment_size and XLogSegSize are the same name, spelt different, so if that's where we want to go, we should name them differently. But perhaps more fundamentally, I don't see why we need both: What stops us from just defining the GUC in bytes? Regards, Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hi, Personally I find the split between 03 and 04 and their naming a bit confusing. I'd rather just merge them. These patches need a rebase, they don't apply anymore. On 2017-07-06 12:04:12 +0530, Beena Emerson wrote: > @@ -4813,6 +4836,18 @@ XLOGShmemSize(void) > { > charbuf[32]; > > + /* > + * The calculation of XLOGbuffers requires the run-time > parameter > + * XLogSegSize which is set from the control file. This value is > + * required to create the shared memory segment. Hence, > temporarily > + * allocate space for reading the control file. > + */ This makes me uncomfortable. Having to choose the control file multiple times seems wrong. We're effectively treating the control file as part of the configuration now, and that means we should move it's parsing to an earlier part of startup. > + if (!IsBootstrapProcessingMode()) > + { > + ControlFile = palloc(sizeof(ControlFileData)); > + ReadControlFile(); > + pfree(ControlFile); General plea: Please reset to NULL in cases like this, otherwise the pointer will [temporarily] point into a freed memory location, which makes debugging harder. > @@ -8146,6 +8181,9 @@ InitXLOGAccess(void) > ThisTimeLineID = XLogCtl->ThisTimeLineID; > Assert(ThisTimeLineID != 0 || IsBootstrapProcessingMode()); > > + /* set XLogSegSize */ > + XLogSegSize = ControlFile->xlog_seg_size; > + Hm, why do we have two variables keeping track of the segment size? wal_segment_size and XLogSegSize? That's bound to lead to confusion. > /* Use GetRedoRecPtr to copy the RedoRecPtr safely */ > (void) GetRedoRecPtr(); > /* Also update our copy of doPageWrites. */ > diff --git a/src/backend/bootstrap/bootstrap.c > b/src/backend/bootstrap/bootstrap.c > index b3f0b3c..d2c524b 100644 > --- a/src/backend/bootstrap/bootstrap.c > +++ b/src/backend/bootstrap/bootstrap.c > @@ -19,6 +19,7 @@ > > #include "access/htup_details.h" > #include "access/xact.h" > +#include "access/xlog_internal.h" > #include "bootstrap/bootstrap.h" > #include "catalog/index.h" > #include "catalog/pg_collation.h" > @@ -47,6 +48,7 @@ > #include "utils/tqual.h" > > uint32 bootstrap_data_checksum_version = 0;/* No checksum > */ > +uint32 XLogSegSize; Se we define the same variable declared in a header in multiple files (once for each applicationq)? I'm pretty strongly against that. Why's that needed/a good idea? Neither is it clear to me why the definition was moved from xlog.c to bootstrap.c? That doesn't sound like a natural place. > /* > + * Calculate the default wal_size in proper unit. > + */ > +static char * > +pretty_wal_size(int segment_count) > +{ > + double val = wal_segment_size / (1024 * 1024) * segment_count; > + double temp_val; > + char *result = malloc(10); > + > + /* > + * Wal segment size ranges from 1MB to 1GB and the default > + * segment_count is 5 for min_wal_size and 64 for max_wal_size, so the > + * final values can range from 5MB to 64GB. > + */ Referencing the defaults here seems unnecessary. And nearly a guarantee that the values in the comment will go out of date soon-ish. > + /* set default max_wal_size and min_wal_size */ > + snprintf(repltok, sizeof(repltok), "min_wal_size = %s", > + pretty_wal_size(DEFAULT_MIN_WAL_SEGS)); > + conflines = replace_token(conflines, "#min_wal_size = 80MB", repltok); > + > + snprintf(repltok, sizeof(repltok), "max_wal_size = %s", > + pretty_wal_size(DEFAULT_MAX_WAL_SEGS)); > + conflines = replace_token(conflines, "#max_wal_size = 1GB", repltok); > + Hm. So postgresql.conf.sample values are now going to contain misleading information for clusters with non-default segment sizes. Have we discussed instead defaulting min_wal_size/max_wal_size to a constant amount of megabytes and rounding up when it doesn't work for a particular segment size? > diff --git a/src/include/access/xlog_internal.h > b/src/include/access/xlog_internal.h > index 9c0039c..c805f12 100644 > --- a/src/include/access/xlog_internal.h > +++ b/src/include/access/xlog_internal.h > @@ -91,6 +91,11 @@ typedef XLogLongPageHeaderData *XLogLongPageHeader; > */ > > extern uint32 XLogSegSize; > +#define XLOG_SEG_SIZE XLogSegSize I don't think this is a good idea, we should rather rip the bandaid of and remove this macro. If people are assuming it's a macro they'll just run into more confusing errors/problems. > diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h > index f3b3529..f31c30e 100644 > --- a/src/include/pg_config_manual.h > +++ b/src/include/pg_config_manual.h > @@ -14,6 +14,12 @@ > */ > > /* > + * This is default value for WAL_segment_size to
Re: [HACKERS] increasing the default WAL segment size
On Thu, Jul 6, 2017 at 3:21 PM, tushar wrote: > On 07/06/2017 12:04 PM, Beena Emerson wrote: >> >> The 04-initdb-walsegsize_v2.patch has the following improvements: >> - Rebased over new 03 patch >> - Pass the wal-segsize intidb option as command-line option rathern >> than in an environment variable. >> - Since new function check_wal_size had only had two checks and was >> sed once, moved the code to ReadControlFile where it is used and >> removed this function. >> - improve comments and add validations where required. >> - Use DEFAULT_XLOG_SEG_SIZE to set the min_wal_size and >> max_wal_size,instead of the value 16. >> - Use XLogSegMaxSize and XLogSegMinSize to calculate the range of guc >> wal_segment_size instead 16 - INT_MAX. > > Thanks Beena. I tested with below following scenarios and all are working > as expected > .)Different WAL segment size i.e 1,2,8,16,32,64,512,1024 at the time of > initdb > .)SR setup in place. > .)Combinations of max/min_wal_size in postgresql.conf with different > wal_segment_size. > .)shutdown the server forcefully (kill -9) / promote slave / to make sure > -recovery happened successfully. > .)with different utilities like pg_resetwal/pg_upgrade/pg_waldump > .)running pg_bench for substantial workloads (~ 1 hour) > .)WAL segment size is not default (changed at the time of ./configure) + > different wal_segment_size (at the time of initdb) . > > Things looks fine. > Thank you Tushar. -- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 07/06/2017 12:04 PM, Beena Emerson wrote: The 04-initdb-walsegsize_v2.patch has the following improvements: - Rebased over new 03 patch - Pass the wal-segsize intidb option as command-line option rathern than in an environment variable. - Since new function check_wal_size had only had two checks and was sed once, moved the code to ReadControlFile where it is used and removed this function. - improve comments and add validations where required. - Use DEFAULT_XLOG_SEG_SIZE to set the min_wal_size and max_wal_size,instead of the value 16. - Use XLogSegMaxSize and XLogSegMinSize to calculate the range of guc wal_segment_size instead 16 - INT_MAX. Thanks Beena. I tested with below following scenarios and all are working as expected .)Different WAL segment size i.e 1,2,8,16,32,64,512,1024 at the time of initdb .)SR setup in place. .)Combinations of max/min_wal_size in postgresql.conf with different wal_segment_size. .)shutdown the server forcefully (kill -9) / promote slave / to make sure -recovery happened successfully. .)with different utilities like pg_resetwal/pg_upgrade/pg_waldump .)running pg_bench for substantial workloads (~ 1 hour) .)WAL segment size is not default (changed at the time of ./configure) + different wal_segment_size (at the time of initdb) . Things looks fine. -- regards,tushar EnterpriseDB https://www.enterprisedb.com/ The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hello, On Tue, Mar 28, 2017 at 1:06 AM, Beena Emerson wrote: > Hello, > > On Sat, Mar 25, 2017 at 10:32 PM, Peter Eisentraut > wrote: >> >> At this point, I suggest splitting this patch up into several >> potentially less controversial pieces. >> >> One big piece is that we currently don't support segment sizes larger >> than 64 GB, for various internal arithmetic reasons. Your patch appears >> to address that. So I suggest isolating that. Assuming it works >> correctly, I think there would be no great concern about it. >> >> The next piece would be making the various tools aware of varying >> segment sizes without having to rely on a built-in value. >> >> The third piece would then be the rest that allows you to set the size >> at initdb >> >> If we take these in order, we would make it easier to test various sizes >> and see if there are any more unforeseen issues when changing sizes. It >> would also make it easier to do performance testing so we can address >> the original question of what the default size should be. > > > PFA the patches divided into 3 parts: > > 02-increase-max-wal-segsize.patch - Increases the wal-segsize and changes > the internal representation of max_wal_size and min_wal_size to mb. Already committed. > > 03-modify-tools.patch - Makes XLogSegSize into a variable, currently set as > XLOG_SEG_SIZE and modifies the tools to fetch the size instead of using > inbuilt value. > The updated 03-modify-tools_v2.patch has the following changes: - Rebased over current HEAD - Impoverished comments - Adding error messages where applicable. - Replace XLOG_SEG_SIZE in the tools and xlog_internal.h to XLogSegSize. XLOG_SEG_SIZE is the wal_segment_size the server was compiled with and XLogSegSize is the wal_segment_size of the target server on which the tool is run. When the binaries used and the target server are compiled with different wal_segment_size, the calculations would be be affected and the tool would crash. To avoid it, all the calculations used by tool should use XLogSegSize. - pg_waldump : The fuzzy_open_file is split into two functions - open_file_in_directory and identify_target_directory so that code can be reused when determining the XLogSegSize from the WAL file header. - IsValidXLogSegSize macro is moved from 04 to here so that we can use it for validating the size in all the tools. > 04-initdb-walsegsize.patch - Adds the initdb option to set wal-segsize and > make related changes. Update pg_test_fsync to use DEFAULT_XLOG_SEG_SIZE > instead of XLOG_SEG_SIZE The 04-initdb-walsegsize_v2.patch has the following improvements: - Rebased over new 03 patch - Pass the wal-segsize intidb option as command-line option rathern than in an environment variable. - Since new function check_wal_size had only had two checks and was sed once, moved the code to ReadControlFile where it is used and removed this function. - improve comments and add validations where required. - Use DEFAULT_XLOG_SEG_SIZE to set the min_wal_size and max_wal_size,instead of the value 16. - Use XLogSegMaxSize and XLogSegMinSize to calculate the range of guc wal_segment_size instead 16 - INT_MAX. > >> >> One concern I have is that your patch does not contain any tests. There >> should probably be lots of tests. > > > 05-initdb_tests.patch adds tap tests to initialize cluster with different > wal_segment_size and then check the config values. What other tests do you > have in mind? Checking the various tools? > > -- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company 04-initdb-walsegsize_v2.patch Description: Binary data 03-modify-tools_v2.patch Description: Binary data 01-add-XLogSegmentOffset-macro_rebase.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 4/7/17 2:59 AM, Beena Emerson wrote: > I ran tests and following are the details: > > Machine details: > Architecture: ppc64le > Byte Order:Little Endian > CPU(s):192 > On-line CPU(s) list: 0-191 > Thread(s) per core:8 > Core(s) per socket:1 > Socket(s): 24 > NUMA node(s): 4 > Model: IBM,8286-42A > > clients> 16 32 64 > 128 > size > 16MB 18895.63486 28799.48759 37855.39521 27968.88309 > 32MB 18313.1461 29201.44954 40733.80051 32458.74147 > 64 MB18055.73141 30875.28687 42713.54447 38009.60542 > 128MB 18234.31424 33208.65419 48604.5593 45498.27689 > 256MB19524.36498 35740.19032 54686.16898 54060.11168 > 512MB 20351.90719 37426.72174 55045.60719 56194.99349 > 1024MB 19667.67062 35696.19194 53666.60373 54353.0614 > > I did not get any degradation, in fact, higher values showed performance > improvement for higher client count. This submission has been moved to CF 2017-07. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
I ran tests and following are the details: Machine details: Architecture: ppc64le Byte Order:Little Endian CPU(s):192 On-line CPU(s) list: 0-191 Thread(s) per core:8 Core(s) per socket:1 Socket(s): 24 NUMA node(s): 4 Model: IBM,8286-42A clients> 16 32 64 128 size 16MB 18895.63486 28799.48759 37855.39521 27968.88309 32MB 18313.1461 29201.44954 40733.80051 32458.74147 64 MB18055.73141 30875.28687 42713.54447 38009.60542 128MB 18234.31424 33208.65419 48604.5593 45498.27689 256MB19524.36498 35740.19032 54686.16898 54060.11168 512MB 20351.90719 37426.72174 55045.60719 56194.99349 1024MB 19667.67062 35696.19194 53666.60373 54353.0614 I did not get any degradation, in fact, higher values showed performance improvement for higher client count. -- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] increasing the default WAL segment size
On Fri, Apr 7, 2017 at 2:35 AM, Tomas Vondra wrote: > On 04/06/2017 08:33 PM, David Steele wrote: >> >> >> I'm in favor of 16,64,256,1024. >> >> > I don't see a particular reason for this, TBH. The sweet spots will be > likely dependent hardware / OS configuration etc. Assuming there actually > are sweet spots - no one demonstrated that yet. > > Also, I don't see how supporting additional WAL sizes increases chance of > incompatibility. We already allow that, so either the tools (e.g. backup > solutions) assume WAL segments are always 16MB (in which case are > essentially broken) or support valid file sizes (in which case they should > have no issues with the new ones). > > If we're going to do this, I'm in favor of deciding some reasonable upper > limit (say, 1GB or 2GB sounds good), and allowing all 2^n values up to that > limit. I think the majority consensus is to use all valid values. Since 1GB is what we have finalized as the upper limit, lets continue with that for now. -- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] increasing the default WAL segment size
Hello, On Wed, Apr 5, 2017 at 6:06 PM, Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > > (Roughly speaking, to get started, this would mean compiling with > --with-wal-segsize 16, 32, 64, 128, 256, run make check-world both > sequentially and in parallel, and take note of a) passing, b) run time, > c) disk space.) > > The attached patch updates a pg_upgrade test which fails for higher segment values: The output of SELECT restart_lsn FROM pg_replication_slots WHERE slot_name = 'slot1'}. The following are the results of the installcheck-world execution. wal_size time cluster_size pg_wal files 16 5m32.761s 539M 417M 26 32 5m32.618s 545M 417M 13 64 5m39.685s 571M 449M 7 128 5m52.961s641M 513M 4 256 6m13.402s 635M 512M 2 512 7m3.252s 1.2G 1G 2 1024 9m0.205s 1.2G 1G 1 wal_size time cluster_size pg_wal files 16 5m31.137s 542M 417M 26 32 5m29.532s 539M 417M 13 64 5m36.189s 571M 449M 7 128 5m50.027s635M 513M 4 256 6m13.603s 635M 512M 2 512 7m4.154s 1.2G 1G 2 1024 8m55.357s1.2G 1G 1 For every test, except for connect/test5 in src/interfaces/ecpg, all else passed. We can see that smaller chunks take lesser time for the same amount of WAL (128 and 256, 512 and 1024). But these tests are not large enough to conclude. Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company initdb_update_regress.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 4/6/17 6:52 PM, Tomas Vondra wrote: > On 04/06/2017 11:45 PM, David Steele wrote: >> >> How many people in the field are running custom builds of >> Postgres? And of those, how many have changed the WAL segment size? >> I've never encountered a non-standard segment size or talked to anyone >> who has. I'm not saying it has *never* happened but I would venture to >> say it's rare. > > I agree it's rare, but I don't think that means we can just consider the > option as 'unsupported'. We're even mentioning it in the docs as a valid > way to customize granularity of the WAL archival. > > I certainly know people who run custom builds, and some of them run with > custom WAL segment size. Some of them are our customers, some are not. > And yes, some of them actually patched the code to allow 256MB WAL > segments. I feel a little better knowing that *somebody* is doing it in the field. >> Just because we don't change the default doesn't mean that others won't. >> I still think testing for sizes other than 16MB is severely lacking and >> I don't believe caveat emptor is the way to go. > > Aren't you mixing regression and performance testing? I agree we need to > be sure all segment sizes are handled correctly, no argument here. Not intentionally. Our standard test suite is only regression as far as I can see. All the performance testing I've seen has been done ad hoc. >> I don't have plans to add animals. I think we'd need a way to tell >> 'make check' to use a different segment size for tests and then >> hopefully reconfigure some of the existing animals. > > OK. My point was that we don't have that capability now, and the latest > patch is not adding it either. Agreed. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 04/06/2017 11:45 PM, David Steele wrote: On 4/6/17 5:05 PM, Tomas Vondra wrote: On 04/06/2017 08:33 PM, David Steele wrote: On 4/5/17 7:29 AM, Simon Riggs wrote: 2. It's not clear to me the advantage of being able to pick varying filesizes. I see great disadvantage in having too many options, which greatly increases the chance of incompatibility, annoyance and breakage. I favour a small number of values that have been shown by testing to be sweet spots in performance and usability. (1GB has been suggested) I'm in favor of 16,64,256,1024. I don't see a particular reason for this, TBH. The sweet spots will be likely dependent hardware / OS configuration etc. Assuming there actually are sweet spots - no one demonstrated that yet. Fair enough, but my feeling is that this patch has never been about server performance, per se. Rather, is is about archive management and trying to stem the tide of WAL as servers get bigger and busier. Generally, archive commands have to make a remote connection to offload WAL and that has a cost per segment. Perhaps, although Robert also mentioned that the fsync at the end of each WAL segment is noticeable. But the thread is a bit difficult to follow, different people have different ideas about the motivation of the patch, etc. Also, I don't see how supporting additional WAL sizes increases chance of incompatibility. We already allow that, so either the tools (e.g. backup solutions) assume WAL segments are always 16MB (in which case are essentially broken) or support valid file sizes (in which case they should have no issues with the new ones). I don't see how a compile-time option counts as "supporting that" in practice. How many people in the field are running custom builds of Postgres? And of those, how many have changed the WAL segment size? I've never encountered a non-standard segment size or talked to anyone who has. I'm not saying it has *never* happened but I would venture to say it's rare. I agree it's rare, but I don't think that means we can just consider the option as 'unsupported'. We're even mentioning it in the docs as a valid way to customize granularity of the WAL archival. I certainly know people who run custom builds, and some of them run with custom WAL segment size. Some of them are our customers, some are not. And yes, some of them actually patched the code to allow 256MB WAL segments. If we're going to do this, I'm in favor of deciding some reasonable upper limit (say, 1GB or 2GB sounds good), and allowing all 2^n values up to that limit. I'm OK with that. I'm also OK with providing a few reasonable choices. I guess that means I'll just go with the majority opinion. 3. New file allocation has been a problem raised with this patch for some months now. I've been playing around with this and I don't think short tests show larger sizes off to advantage. Larger segments will definitely perform more poorly until Postgres starts recycling WAL. Once that happens I think performance differences should be negligible, though of course this needs to be verified with longer-running tests. I'm willing to do some extensive performance testing on the patch. I don't see how that could happen in the next few day (before the feature freeze), particularly considering we're interested in long tests. Cool. I've been thinking about how to do some meaningful tests for this (mostly pgbench related). I'd like to hear what you are thinking. My plan was to do some pgbench tests with different workloads, scales (in shared buffers, in RAM, exceeds RAM), and different storage configurations (SSD vs. HDD, WAL/datadir on the same/different device/fs, possibly also ext4/xfs). The question however is whether we need to do this testing when we don't actually change the default (at least the patch submitted on 3/27 does seem to keep the 16MB). I assume people specifying a custom value when calling initdb are expected to know what they are doing (and I don't see how we can prevent distros from choosing a bad value in their packages - they could already do that with configure-time option). Just because we don't change the default doesn't mean that others won't. I still think testing for sizes other than 16MB is severely lacking and I don't believe caveat emptor is the way to go. Aren't you mixing regression and performance testing? I agree we need to be sure all segment sizes are handled correctly, no argument here. Do we actually have any infrastructure for that? Or do you plan to add some new animals with different WAL segment sizes? I don't have plans to add animals. I think we'd need a way to tell 'make check' to use a different segment size for tests and then hopefully reconfigure some of the existing animals. OK. My point was that we don't have that capability now, and the latest patch is not adding it either. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x
Re: [HACKERS] increasing the default WAL segment size
On 4/6/17 5:05 PM, Tomas Vondra wrote: > On 04/06/2017 08:33 PM, David Steele wrote: >> On 4/5/17 7:29 AM, Simon Riggs wrote: >> >>> 2. It's not clear to me the advantage of being able to pick varying >>> filesizes. I see great disadvantage in having too many options, which >>> greatly increases the chance of incompatibility, annoyance and >>> breakage. I favour a small number of values that have been shown by >>> testing to be sweet spots in performance and usability. (1GB has been >>> suggested) >> >> I'm in favor of 16,64,256,1024. >> > > I don't see a particular reason for this, TBH. The sweet spots will be > likely dependent hardware / OS configuration etc. Assuming there > actually are sweet spots - no one demonstrated that yet. Fair enough, but my feeling is that this patch has never been about server performance, per se. Rather, is is about archive management and trying to stem the tide of WAL as servers get bigger and busier. Generally, archive commands have to make a remote connection to offload WAL and that has a cost per segment. > Also, I don't see how supporting additional WAL sizes increases chance > of incompatibility. We already allow that, so either the tools (e.g. > backup solutions) assume WAL segments are always 16MB (in which case are > essentially broken) or support valid file sizes (in which case they > should have no issues with the new ones). I don't see how a compile-time option counts as "supporting that" in practice. How many people in the field are running custom builds of Postgres? And of those, how many have changed the WAL segment size? I've never encountered a non-standard segment size or talked to anyone who has. I'm not saying it has *never* happened but I would venture to say it's rare. > If we're going to do this, I'm in favor of deciding some reasonable > upper limit (say, 1GB or 2GB sounds good), and allowing all 2^n values > up to that limit. I'm OK with that. I'm also OK with providing a few reasonable choices. I guess that means I'll just go with the majority opinion. >>> 3. New file allocation has been a problem raised with this patch for >>> some months now. >> >> I've been playing around with this and I don't think short tests show >> larger sizes off to advantage. Larger segments will definitely perform >> more poorly until Postgres starts recycling WAL. Once that happens I >> think performance differences should be negligible, though of course >> this needs to be verified with longer-running tests. >> > I'm willing to do some extensive performance testing on the patch. I > don't see how that could happen in the next few day (before the feature > freeze), particularly considering we're interested in long tests. Cool. I've been thinking about how to do some meaningful tests for this (mostly pgbench related). I'd like to hear what you are thinking. > The question however is whether we need to do this testing when we don't > actually change the default (at least the patch submitted on 3/27 does > seem to keep the 16MB). I assume people specifying a custom value when > calling initdb are expected to know what they are doing (and I don't see > how we can prevent distros from choosing a bad value in their packages - > they could already do that with configure-time option). Just because we don't change the default doesn't mean that others won't. I still think testing for sizes other than 16MB is severely lacking and I don't believe caveat emptor is the way to go. > Do we actually have any infrastructure for that? Or do you plan to add > some new animals with different WAL segment sizes? I don't have plans to add animals. I think we'd need a way to tell 'make check' to use a different segment size for tests and then hopefully reconfigure some of the existing animals. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 04/06/2017 08:33 PM, David Steele wrote: On 4/5/17 7:29 AM, Simon Riggs wrote: On 5 April 2017 at 06:04, Beena Emerson wrote: The WALfilename - LSN mapping disruption for higher values you mean? Is there anything else I have missed? I see various issues raised but not properly addressed 1. we would need to drop support for segment sizes < 16MB unless we adopt a new incompatible filename format. I think at 16MB the naming should be the same as now and that WALfilename -> LSN is very important. For this release, I think we should just allow >= 16MB and avoid the issue thru lack of time. +1. 2. It's not clear to me the advantage of being able to pick varying filesizes. I see great disadvantage in having too many options, which greatly increases the chance of incompatibility, annoyance and breakage. I favour a small number of values that have been shown by testing to be sweet spots in performance and usability. (1GB has been suggested) I'm in favor of 16,64,256,1024. I don't see a particular reason for this, TBH. The sweet spots will be likely dependent hardware / OS configuration etc. Assuming there actually are sweet spots - no one demonstrated that yet. Also, I don't see how supporting additional WAL sizes increases chance of incompatibility. We already allow that, so either the tools (e.g. backup solutions) assume WAL segments are always 16MB (in which case are essentially broken) or support valid file sizes (in which case they should have no issues with the new ones). If we're going to do this, I'm in favor of deciding some reasonable upper limit (say, 1GB or 2GB sounds good), and allowing all 2^n values up to that limit. 3. New file allocation has been a problem raised with this patch for some months now. I've been playing around with this and I don't think short tests show larger sizes off to advantage. Larger segments will definitely perform more poorly until Postgres starts recycling WAL. Once that happens I think performance differences should be negligible, though of course this needs to be verified with longer-running tests. I'm willing to do some extensive performance testing on the patch. I don't see how that could happen in the next few day (before the feature freeze), particularly considering we're interested in long tests. The question however is whether we need to do this testing when we don't actually change the default (at least the patch submitted on 3/27 does seem to keep the 16MB). I assume people specifying a custom value when calling initdb are expected to know what they are doing (and I don't see how we can prevent distros from choosing a bad value in their packages - they could already do that with configure-time option). If archive_timeout is set then there will also be additional time taken to zero out potentially larger unused parts of the segment. I don't see this as an issue, however, because if archive_timeout is being triggered then the system is very likely under lower than usual load. Lack of clarity and/or movement on these issues is very, very close to getting the patch rejected now. Let's not approach this with the viewpoint that I or others want it to be rejected, lets work forwards and get some solid changes that will improve the world without causing problems. I would definitely like to see this go in, though I agree with Peter that we need a lot more testing. I'd like to see some build farm animals testing with different sizes at the very least, even if there's no time to add new tests. Do we actually have any infrastructure for that? Or do you plan to add some new animals with different WAL segment sizes? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 4/5/17 7:29 AM, Simon Riggs wrote: > On 5 April 2017 at 06:04, Beena Emerson wrote: >> >> The WALfilename - LSN mapping disruption for higher values you mean? Is >> there anything else I have missed? > > I see various issues raised but not properly addressed > > 1. we would need to drop support for segment sizes < 16MB unless we > adopt a new incompatible filename format. > I think at 16MB the naming should be the same as now and that > WALfilename -> LSN is very important. > For this release, I think we should just allow >= 16MB and avoid the > issue thru lack of time. +1. > 2. It's not clear to me the advantage of being able to pick varying > filesizes. I see great disadvantage in having too many options, which > greatly increases the chance of incompatibility, annoyance and > breakage. I favour a small number of values that have been shown by > testing to be sweet spots in performance and usability. (1GB has been > suggested) I'm in favor of 16,64,256,1024. > 3. New file allocation has been a problem raised with this patch for > some months now. I've been playing around with this and I don't think short tests show larger sizes off to advantage. Larger segments will definitely perform more poorly until Postgres starts recycling WAL. Once that happens I think performance differences should be negligible, though of course this needs to be verified with longer-running tests. If archive_timeout is set then there will also be additional time taken to zero out potentially larger unused parts of the segment. I don't see this as an issue, however, because if archive_timeout is being triggered then the system is very likely under lower than usual load. > Lack of clarity and/or movement on these issues is very, very close to > getting the patch rejected now. Let's not approach this with the > viewpoint that I or others want it to be rejected, lets work forwards > and get some solid changes that will improve the world without causing > problems. I would definitely like to see this go in, though I agree with Peter that we need a lot more testing. I'd like to see some build farm animals testing with different sizes at the very least, even if there's no time to add new tests. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 4/6/17 07:13, Beena Emerson wrote: > Does the options 16, 64 and 1024 seem good. > We can remove sizes below 16 as most have agreed and as per the > discussion, 64MB and 1GB seems favoured. We could probably allow 32MB > since it was an already allowed size? I don't see the need to remove any options right now. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hello, On Wed, Apr 5, 2017 at 6:06 PM, Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote:e format and expand the range? > > > I don't think me saying it felt a bit slow around 256 MB is a proper > technical analysis that should lead to the conclusion that that upper > limit should be 128 MB. ;-) > I ran a couple of tests for 16MB and 1GB and found less than 4% performance dip. I am currently running test to check consistency of the results and for various sizes. I will update soon. -- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] increasing the default WAL segment size
Hello, On Wed, Apr 5, 2017 at 4:59 PM, Simon Riggs wrote: > On 5 April 2017 at 06:04, Beena Emerson wrote: > > I see various issues raised but not properly addressed > > 1. we would need to drop support for segment sizes < 16MB unless we > adopt a new incompatible filename format. > I think at 16MB the naming should be the same as now and that > WALfilename -> LSN is very important. > For this release, I think we should just allow >= 16MB and avoid the > issue thru lack of time. > > 2. It's not clear to me the advantage of being able to pick varying > filesizes. I see great disadvantage in having too many options, which > greatly increases the chance of incompatibility, annoyance and > breakage. I favour a small number of values that have been shown by > testing to be sweet spots in performance and usability. (1GB has been > suggested) > Does the options 16, 64 and 1024 seem good. We can remove sizes below 16 as most have agreed and as per the discussion, 64MB and 1GB seems favoured. We could probably allow 32MB since it was an already allowed size? > 3. New file allocation has been a problem raised with this patch for > some months now. > This did not seem to be an open issue, at least there was not many disagreements. Higher the server load, more WAL generated. For the same load, the frequency of file allocation reduces for higher wal-segsize values. Overall it is either filling up many files or few larger files. -- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] increasing the default WAL segment size
On 5 April 2017 at 08:36, Peter Eisentraut wrote: > On 4/5/17 06:04, Beena Emerson wrote: >> I suggest the next step is to dial up the allowed segment size in >> configure and run some tests about what a reasonable maximum value could >> be. I did a little bit of that, but somewhere around 256 MB, things got >> really slow. >> >> >> Would it be better if just increase the limit to 128MB for now? >> In next we can change the WAL file name format and expand the range? > > I don't think me saying it felt a bit slow around 256 MB is a proper > technical analysis that should lead to the conclusion that that upper > limit should be 128 MB. ;-) > > This tells me that there is a lot of explore and test here before we > should let it loose on users. Agreed > I think the best we should do now is spend a bit of time exploring > whether/how larger values of segment size behave, and bump the hardcoded > configure limit if we get positive results. Everything else should > probably be postponed. > > (Roughly speaking, to get started, this would mean compiling with > --with-wal-segsize 16, 32, 64, 128, 256, run make check-world both > sequentially and in parallel, and take note of a) passing, b) run time, > c) disk space.) I've committed the rest of Beena's patch to allow this testing to occur up to 1024MB. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 4/5/17 06:04, Beena Emerson wrote: > I suggest the next step is to dial up the allowed segment size in > configure and run some tests about what a reasonable maximum value could > be. I did a little bit of that, but somewhere around 256 MB, things got > really slow. > > > Would it be better if just increase the limit to 128MB for now? > In next we can change the WAL file name format and expand the range? I don't think me saying it felt a bit slow around 256 MB is a proper technical analysis that should lead to the conclusion that that upper limit should be 128 MB. ;-) This tells me that there is a lot of explore and test here before we should let it loose on users. I think the best we should do now is spend a bit of time exploring whether/how larger values of segment size behave, and bump the hardcoded configure limit if we get positive results. Everything else should probably be postponed. (Roughly speaking, to get started, this would mean compiling with --with-wal-segsize 16, 32, 64, 128, 256, run make check-world both sequentially and in parallel, and take note of a) passing, b) run time, c) disk space.) -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 5 April 2017 at 06:04, Beena Emerson wrote: >> >> No commitment yet to increasing wal-segsize in the way this patch has >> >> it. >> >> >> > >> > What part of patch you don't like and do you have any suggestions to >> > improve the same? >> >> I think there are still some questions and disagreements about how it >> should behave. > > > The WALfilename - LSN mapping disruption for higher values you mean? Is > there anything else I have missed? I see various issues raised but not properly addressed 1. we would need to drop support for segment sizes < 16MB unless we adopt a new incompatible filename format. I think at 16MB the naming should be the same as now and that WALfilename -> LSN is very important. For this release, I think we should just allow >= 16MB and avoid the issue thru lack of time. 2. It's not clear to me the advantage of being able to pick varying filesizes. I see great disadvantage in having too many options, which greatly increases the chance of incompatibility, annoyance and breakage. I favour a small number of values that have been shown by testing to be sweet spots in performance and usability. (1GB has been suggested) 3. New file allocation has been a problem raised with this patch for some months now. Lack of clarity and/or movement on these issues is very, very close to getting the patch rejected now. Let's not approach this with the viewpoint that I or others want it to be rejected, lets work forwards and get some solid changes that will improve the world without causing problems. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 4 April 2017 at 22:47, Amit Kapila wrote: > On Wed, Apr 5, 2017 at 3:36 AM, Simon Riggs wrote: >> On 27 March 2017 at 15:36, Beena Emerson wrote: >> >>> 02-increase-max-wal-segsize.patch - Increases the wal-segsize and changes >>> the internal representation of max_wal_size and min_wal_size to mb. >> >> Committed first part to allow internal representation change (only). >> >> No commitment yet to increasing wal-segsize in the way this patch has it. >> > > What part of patch you don't like and do you have any suggestions to > improve the same? The only part of the patch uncommitted was related to choice of WAL file size in the config file. I've already made suggestions on that upthread. I'm now looking at patch 03-modify-tools.patch * Peter's "lack of tests" comment still applies * I think we should remove pg_standby in this release, so we don't have to care about it * If we change pg_resetwal then it should allow changing XLogSegSize also * "coulnot access the archive location" 03 looks mostly OK 04 is much more of a mess * Lots of comments and notes pre-judge what the limits and configurability are, so its hard to commit the patches without committing to the basic assumptions. Please look at removing all assumptions about what the values/options are, so we can change them later 05 adds various tests but I don't think adds enough value to commit -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hello, On Wed, Apr 5, 2017 at 9:29 AM, Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > On 4/4/17 22:47, Amit Kapila wrote: > >> Committed first part to allow internal representation change (only). > >> > >> No commitment yet to increasing wal-segsize in the way this patch has > it. > >> > > > > What part of patch you don't like and do you have any suggestions to > > improve the same? > > I think there are still some questions and disagreements about how it > should behave. > The WALfilename - LSN mapping disruption for higher values you mean? Is there anything else I have missed? > > I suggest the next step is to dial up the allowed segment size in > configure and run some tests about what a reasonable maximum value could > be. I did a little bit of that, but somewhere around 256 MB, things got > really slow. > Would it be better if just increase the limit to 128MB for now? In next we can change the WAL file name format and expand the range? -- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] increasing the default WAL segment size
On 4/4/17 22:47, Amit Kapila wrote: >> Committed first part to allow internal representation change (only). >> >> No commitment yet to increasing wal-segsize in the way this patch has it. >> > > What part of patch you don't like and do you have any suggestions to > improve the same? I think there are still some questions and disagreements about how it should behave. I suggest the next step is to dial up the allowed segment size in configure and run some tests about what a reasonable maximum value could be. I did a little bit of that, but somewhere around 256 MB, things got really slow. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On Wed, Apr 5, 2017 at 3:36 AM, Simon Riggs wrote: > On 27 March 2017 at 15:36, Beena Emerson wrote: > >> 02-increase-max-wal-segsize.patch - Increases the wal-segsize and changes >> the internal representation of max_wal_size and min_wal_size to mb. > > Committed first part to allow internal representation change (only). > > No commitment yet to increasing wal-segsize in the way this patch has it. > What part of patch you don't like and do you have any suggestions to improve the same? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 27 March 2017 at 15:36, Beena Emerson wrote: > 02-increase-max-wal-segsize.patch - Increases the wal-segsize and changes > the internal representation of max_wal_size and min_wal_size to mb. Committed first part to allow internal representation change (only). No commitment yet to increasing wal-segsize in the way this patch has it. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hello, On Fri, Mar 31, 2017 at 11:20 AM, Kuntal Ghosh wrote: > On Fri, Mar 31, 2017 at 10:40 AM, Beena Emerson > wrote: > > On 30 Mar 2017 15:10, "Kuntal Ghosh" wrote: > > > I do not think a generalised function is a good idea. Besides, I feel the > > respective approaches are best kept in the modules used also because the > > internal code is not easily accessible by utils. > > > Ahh, I wonder what the reason can be. Anyway, I'll leave that decision > for the committer. I'm moving the status to Ready for committer. > > Thank you. -- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] increasing the default WAL segment size
On Fri, Mar 31, 2017 at 10:40 AM, Beena Emerson wrote: > On 30 Mar 2017 15:10, "Kuntal Ghosh" wrote: >> 03-modify-tools.patch - Makes XLogSegSize into a variable, currently set >> as >> XLOG_SEG_SIZE and modifies the tools to fetch the size instead of using >> inbuilt value. > Several methods are declared and defined in different tools to fetch > the size of wal-seg-size. > In pg_standby.c, > RetrieveXLogSegSize() - /* Set XLogSegSize from the WAL file header */ > > In pg_basebackup/streamutil.c, > on behaRetrieveXLogSegSize(PGconn *conn) - /* set XLogSegSize using > SHOW wal_segment_size */ > > In pg_waldump.c, > ReadXLogFromDir(char *archive_loc) > RetrieveXLogSegSize(char *archive_path) /* Scan through the archive > location to set XLogSegsize from the first WAL file */ > > IMHO, it's better to define a single method in xlog.c and based on the > different strategy, it can retrieve the XLogSegsize on behalf of > different modules. I've suggested the same in my first set review and > I'll still vote for it. For example, in xlog.c, you can define > something as following: > bool RetrieveXLogSegSize(RetrieveStrategy rs, void* ptr) > > Now based on the RetrieveStrategy(say Conn, File, Dir), you can cast > the void pointer to the appropriate type. So, when a new tool needs to > retrieve XLogSegSize, it can just call this function instead of > defining a new RetrieveXLogSegSize method. > > It's just a suggestion from my side. Is there anything I'm missing > which can cause the aforesaid approach not to be working? > Apart from that, I've nothing to add here. > > > > I do not think a generalised function is a good idea. Besides, I feel the > respective approaches are best kept in the modules used also because the > internal code is not easily accessible by utils. > Ahh, I wonder what the reason can be. Anyway, I'll leave that decision for the committer. I'm moving the status to Ready for committer. I've only tested the patch in my 64-bit linux system. It needs some testing on other environment settings. -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hello, Thanks for testing my patch. On 30 Mar 2017 15:10, "Kuntal Ghosh" wrote: On Tue, Mar 28, 2017 at 1:06 AM, Beena Emerson wrote: > On Sat, Mar 25, 2017 at 10:32 PM, Peter Eisentraut > wrote: >> >> At this point, I suggest splitting this patch up into several >> potentially less controversial pieces. >> >> One big piece is that we currently don't support segment sizes larger >> than 64 GB, for various internal arithmetic reasons. Your patch appears >> to address that. So I suggest isolating that. Assuming it works >> correctly, I think there would be no great concern about it. >> >> The next piece would be making the various tools aware of varying >> segment sizes without having to rely on a built-in value. >> >> The third piece would then be the rest that allows you to set the size >> at initdb >> >> If we take these in order, we would make it easier to test various sizes >> and see if there are any more unforeseen issues when changing sizes. It >> would also make it easier to do performance testing so we can address >> the original question of what the default size should be. > > > PFA the patches divided into 3 parts: Thanks for splitting the patches. 01-add-XLogSegmentOffset-macro.patch is same as before and it looks good. > 02-increase-max-wal-segsize.patch - Increases the wal-segsize and changes > the internal representation of max_wal_size and min_wal_size to mb. looks good. > 03-modify-tools.patch - Makes XLogSegSize into a variable, currently set as > XLOG_SEG_SIZE and modifies the tools to fetch the size instead of using > inbuilt value. Several methods are declared and defined in different tools to fetch the size of wal-seg-size. In pg_standby.c, RetrieveXLogSegSize() - /* Set XLogSegSize from the WAL file header */ In pg_basebackup/streamutil.c, on behaRetrieveXLogSegSize(PGconn *conn) - /* set XLogSegSize using SHOW wal_segment_size */ In pg_waldump.c, ReadXLogFromDir(char *archive_loc) RetrieveXLogSegSize(char *archive_path) /* Scan through the archive location to set XLogSegsize from the first WAL file */ IMHO, it's better to define a single method in xlog.c and based on the different strategy, it can retrieve the XLogSegsize on behalf of different modules. I've suggested the same in my first set review and I'll still vote for it. For example, in xlog.c, you can define something as following: bool RetrieveXLogSegSize(RetrieveStrategy rs, void* ptr) Now based on the RetrieveStrategy(say Conn, File, Dir), you can cast the void pointer to the appropriate type. So, when a new tool needs to retrieve XLogSegSize, it can just call this function instead of defining a new RetrieveXLogSegSize method. It's just a suggestion from my side. Is there anything I'm missing which can cause the aforesaid approach not to be working? Apart from that, I've nothing to add here. I do not think a generalised function is a good idea. Besides, I feel the respective approaches are best kept in the modules used also because the internal code is not easily accessible by utils. > 04-initdb-walsegsize.patch - Adds the initdb option to set wal-segsize and > make related changes. Update pg_test_fsync to use DEFAULT_XLOG_SEG_SIZE > instead of XLOG_SEG_SIZE looks good. >> >> One concern I have is that your patch does not contain any tests. There >> should probably be lots of tests. > > > 05-initdb_tests.patch adds tap tests to initialize cluster with different > wal_segment_size and then check the config values. What other tests do you > have in mind? Checking the various tools? Nothing from me to add here. I've nothing to add here for the attached set of patches. On top of these, David's patch can be used for preserving LSNs in the file naming for all segment sizes.
Re: [HACKERS] increasing the default WAL segment size
On Tue, Mar 28, 2017 at 1:06 AM, Beena Emerson wrote: > On Sat, Mar 25, 2017 at 10:32 PM, Peter Eisentraut > wrote: >> >> At this point, I suggest splitting this patch up into several >> potentially less controversial pieces. >> >> One big piece is that we currently don't support segment sizes larger >> than 64 GB, for various internal arithmetic reasons. Your patch appears >> to address that. So I suggest isolating that. Assuming it works >> correctly, I think there would be no great concern about it. >> >> The next piece would be making the various tools aware of varying >> segment sizes without having to rely on a built-in value. >> >> The third piece would then be the rest that allows you to set the size >> at initdb >> >> If we take these in order, we would make it easier to test various sizes >> and see if there are any more unforeseen issues when changing sizes. It >> would also make it easier to do performance testing so we can address >> the original question of what the default size should be. > > > PFA the patches divided into 3 parts: Thanks for splitting the patches. 01-add-XLogSegmentOffset-macro.patch is same as before and it looks good. > 02-increase-max-wal-segsize.patch - Increases the wal-segsize and changes > the internal representation of max_wal_size and min_wal_size to mb. looks good. > 03-modify-tools.patch - Makes XLogSegSize into a variable, currently set as > XLOG_SEG_SIZE and modifies the tools to fetch the size instead of using > inbuilt value. Several methods are declared and defined in different tools to fetch the size of wal-seg-size. In pg_standby.c, RetrieveXLogSegSize() - /* Set XLogSegSize from the WAL file header */ In pg_basebackup/streamutil.c, on behaRetrieveXLogSegSize(PGconn *conn) - /* set XLogSegSize using SHOW wal_segment_size */ In pg_waldump.c, ReadXLogFromDir(char *archive_loc) RetrieveXLogSegSize(char *archive_path) /* Scan through the archive location to set XLogSegsize from the first WAL file */ IMHO, it's better to define a single method in xlog.c and based on the different strategy, it can retrieve the XLogSegsize on behalf of different modules. I've suggested the same in my first set review and I'll still vote for it. For example, in xlog.c, you can define something as following: bool RetrieveXLogSegSize(RetrieveStrategy rs, void* ptr) Now based on the RetrieveStrategy(say Conn, File, Dir), you can cast the void pointer to the appropriate type. So, when a new tool needs to retrieve XLogSegSize, it can just call this function instead of defining a new RetrieveXLogSegSize method. It's just a suggestion from my side. Is there anything I'm missing which can cause the aforesaid approach not to be working? Apart from that, I've nothing to add here. > 04-initdb-walsegsize.patch - Adds the initdb option to set wal-segsize and > make related changes. Update pg_test_fsync to use DEFAULT_XLOG_SEG_SIZE > instead of XLOG_SEG_SIZE looks good. >> >> One concern I have is that your patch does not contain any tests. There >> should probably be lots of tests. > > > 05-initdb_tests.patch adds tap tests to initialize cluster with different > wal_segment_size and then check the config values. What other tests do you > have in mind? Checking the various tools? Nothing from me to add here. I've nothing to add here for the attached set of patches. On top of these, David's patch can be used for preserving LSNs in the file naming for all segment sizes. -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hello, On Sat, Mar 25, 2017 at 10:32 PM, Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > At this point, I suggest splitting this patch up into several > potentially less controversial pieces. > > One big piece is that we currently don't support segment sizes larger > than 64 GB, for various internal arithmetic reasons. Your patch appears > to address that. So I suggest isolating that. Assuming it works > correctly, I think there would be no great concern about it. > > The next piece would be making the various tools aware of varying > segment sizes without having to rely on a built-in value. > > The third piece would then be the rest that allows you to set the size > at initdb > > If we take these in order, we would make it easier to test various sizes > and see if there are any more unforeseen issues when changing sizes. It > would also make it easier to do performance testing so we can address > the original question of what the default size should be. > PFA the patches divided into 3 parts: 02-increase-max-wal-segsize.patch - Increases the wal-segsize and changes the internal representation of max_wal_size and min_wal_size to mb. 03-modify-tools.patch - Makes XLogSegSize into a variable, currently set as XLOG_SEG_SIZE and modifies the tools to fetch the size instead of using inbuilt value. 04-initdb-walsegsize.patch - Adds the initdb option to set wal-segsize and make related changes. Update pg_test_fsync to use DEFAULT_XLOG_SEG_SIZE instead of XLOG_SEG_SIZE > One concern I have is that your patch does not contain any tests. There > should probably be lots of tests. 05-initdb_tests.patch adds tap tests to initialize cluster with different wal_segment_size and then check the config values. What other tests do you have in mind? Checking the various tools? -- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company 01-add-XLogSegmentOffset-macro.patch Description: Binary data 02-increase-max-wal-segsize.patch Description: Binary data 03-modify-tools.patch Description: Binary data 05-initdb_tests.patch Description: Binary data 04-initdb-walsegsize.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 25 March 2017 at 17:02, Peter Eisentraut wrote: > At this point, I suggest splitting this patch up into several > potentially less controversial pieces. > > One big piece is that we currently don't support segment sizes larger > than 64 GB, for various internal arithmetic reasons. Your patch appears > to address that. So I suggest isolating that. Assuming it works > correctly, I think there would be no great concern about it. +1 > The next piece would be making the various tools aware of varying > segment sizes without having to rely on a built-in value. Hmm > The third piece would then be the rest that allows you to set the size > at initdb > > If we take these in order, we would make it easier to test various sizes > and see if there are any more unforeseen issues when changing sizes. It > would also make it easier to do performance testing so we can address > the original question of what the default size should be. > > One concern I have is that your patch does not contain any tests. There > should probably be lots of tests. This is looking like a reject in its current form. Changing WAL filename to a new form seems best plan, but we don't have time to do that and get it right, especially with no tests. My summary of useful requirements would be * Files smaller than 16MB and larger than 16MB are desirable * LSN <-> filename mapping must be clear * New filename format best for debugging and clarity My proposal from here is that we allow only one new size in this release, to minimize the splash zone. Keep the filename format as it is now, using David's suggestion. Suggest adding 1GB as the only additional option, which continues the idea of having 1GB as the max filesize. New filename format can come in PG11 allowing much wider range of WAL filesizes, bigger and smaller. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
At this point, I suggest splitting this patch up into several potentially less controversial pieces. One big piece is that we currently don't support segment sizes larger than 64 GB, for various internal arithmetic reasons. Your patch appears to address that. So I suggest isolating that. Assuming it works correctly, I think there would be no great concern about it. The next piece would be making the various tools aware of varying segment sizes without having to rely on a built-in value. The third piece would then be the rest that allows you to set the size at initdb If we take these in order, we would make it easier to test various sizes and see if there are any more unforeseen issues when changing sizes. It would also make it easier to do performance testing so we can address the original question of what the default size should be. One concern I have is that your patch does not contain any tests. There should probably be lots of tests. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Peter, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 3/24/17 08:18, Stephen Frost wrote: > > Beyond that, this also bakes in an assumption that we would then require > > access to a database > > That is a good point, but then any change to the naming whatsoever will > create trouble. Then we might as well choose which specific trouble. Right, and I'd rather we work that out before we start encouraging users to change their WAL segment size, which is what this patch will do. Personally, I'm alright with the patch David has produced, which is pretty small, maintains the same names when 16MB segments are used, and is pretty straight-forward to reason about. I do think it'll need added documentation to clarify how WAL segment names are calculated and perhaps another function which returns the size of WAL segments on a given cluster (I don't think we have that..?), and, ideally, added regression tests or buildfarm animals which try different sizes. On the other hand, I don't have any particular issue with the naming scheme you proposed up-thread, which uses proper separators between the components of a WAL filename, but that would change what happens with 16MB WAL segments today. I'm still of the opinion that we should be changing the default to 64MB for WAL segments. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
Peter, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 3/24/17 19:13, David Steele wrote: > > Behavior for the current default of 16MB is unchanged, and all other > > sizes go through a logical progression. > > Just at a glance, without analyzing the math behind it, this scheme > seems super confusing. Compared to: 1GB: 00010001 00010002 00010003 00010001 ...? Having the naming no longer match the LSN and also, seemingly, jump randomly, strikes me as very confusing. At least with the LSN-based approach, we aren't jumping randomly but exactly in-line with what the starting LSN of the file is, and always by the same amount (in hex). Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On 3/24/17 08:18, Stephen Frost wrote: > Peter, > > * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: >> There is a function for that. > [...] >> There is not a function for that, but there could be one. > > I'm not sure you've really considered what you're suggesting here. Create a set-returning function that returns all the to-be-expected file names between two LSNs. > Beyond that, this also bakes in an assumption that we would then require > access to a database That is a good point, but then any change to the naming whatsoever will create trouble. Then we might as well choose which specific trouble. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/24/17 19:13, David Steele wrote: > Behavior for the current default of 16MB is unchanged, and all other > sizes go through a logical progression. Just at a glance, without analyzing the math behind it, this scheme seems super confusing. > > 1GB: > 00010040 > 00010080 > 000100C0 > 00010001 > > 256GB: > > 00010010 > 00010020 > 00010030 > ... > 000100E0 > 000100F0 > 00010001 > > 64GB: > > 000100010004 > 000100010008 > 00010001000C > ... > 0001000100F8 > 0001000100FC > 00010001 -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hi Robert, On 3/24/17 3:00 PM, Robert Haas wrote: On Wed, Mar 22, 2017 at 6:05 PM, David Steele wrote: Wait, really? I thought you abandoned this approach because there's then no principled way to handle WAL segments of less than the default size. I did say that, but I thought I had hit on a compromise. But, as I originally pointed out the hex characters in the filename are not aligned correctly for > 8 bits (< 16MB segments) and using different alignments just made it less consistent. I don't think I understand what the compromise is. Are you saying we should have one rule for segments < 16MB and another rule for segments 16MB? I think using two different rules for file naming depending on the segment size will be a negative for both tool authors and ordinary users. Sorry for the confusion, I meant to say that if we want to keep LSNs in the filenames and not change alignment for the current default, then we would need to drop support for segment sizes < 16MB (more or less what I said below). Bad editing on my part. It would be OK if we were willing to drop the 1,2,4,8 segment sizes because then the alignment would make sense and not change the current 16MB sequence. Well, that is true. But the thing I'm trying to do here is to keep this patch down to what actually needs to be changed in order to accomplish the original purchase, not squeeze more and more changes into it. Attached is a patch to be applied on top of Beena's v8 patch that preserves LSNs in the file naming for all segment sizes. It's not quite complete because it doesn't modify the lower size limit everywhere, but I think it's enough so you can see what I'm getting at. This passes check-world and I've poked at it in other segment sizes as well manually. Behavior for the current default of 16MB is unchanged, and all other sizes go through a logical progression. 1GB: 00010040 00010080 000100C0 00010001 256GB: 00010010 00010020 00010030 ... 000100E0 000100F0 00010001 64GB: 000100010004 000100010008 00010001000C ... 0001000100F8 0001000100FC 00010001 I believe that maintaining an easy correspondence between LSN and filename is important. The cluster will not always be up to help with these calculations and tools that do the job may not be present or may have issues. I'm happy to merge this with Beena's patch (and tidy my patch up) if this looks like an improvement to everyone. -- -David da...@pgmasters.net diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h index e3f616a..08d6494 100644 --- a/src/include/access/xlog_internal.h +++ b/src/include/access/xlog_internal.h @@ -93,10 +93,12 @@ typedef XLogLongPageHeaderData *XLogLongPageHeader; extern uint32 XLogSegSize; #define XLOG_SEG_SIZE XLogSegSize -/* XLogSegSize can range from 1MB to 1GB */ -#define XLogSegMinSize 1024 * 1024 +/* XLogSegSize can range from 16MB to 1GB */ +#define XLogSegMinSize 1024 * 1024 * 16 #define XLogSegMaxSize 1024 * 1024 * 1024 +#define XLogSegMinSizeBits 24 + /* Default number of min and max wal segments */ #define DEFAULT_MIN_WAL_SEGS 5 #define DEFAULT_MAX_WAL_SEGS 64 @@ -158,10 +160,14 @@ extern uint32 XLogSegSize; /* Length of XLog file name */ #define XLOG_FNAME_LEN24 +#define XLogSegNoLower(logSegNo) \ + (((logSegNo) % XLogSegmentsPerXLogId) * \ + (XLogSegSize >> XLogSegMinSizeBits)) + #define XLogFileName(fname, tli, logSegNo) \ snprintf(fname, MAXFNAMELEN, "%08X%08X%08X", tli, \ (uint32) ((logSegNo) / XLogSegmentsPerXLogId), \ -(uint32) ((logSegNo) % XLogSegmentsPerXLogId)) +(uint32) XLogSegNoLower(logSegNo)) #define XLogFileNameById(fname, tli, log, seg) \ snprintf(fname, MAXFNAMELEN, "%08X%08X%08X", tli, log, seg) @@ -181,17 +187,18 @@ extern uint32 XLogSegSize; strcmp((fname) + XLOG_FNAME_LEN, ".partial") == 0) #define XLogFromFileName(fname, tli, logSegNo) \ - do { \ - uint32 log; \ - uint32 seg; \ - sscanf(fname, "%08X%08X%08X", tli, &log, &seg); \ - *logSegNo = (uint64) log * XLogSegmentsPerXLogId + seg; \ + do { \ + uint32 log; \ + uint32 seg;
Re: [HACKERS] increasing the default WAL segment size
On Wed, Mar 22, 2017 at 6:05 PM, David Steele wrote: >> Wait, really? I thought you abandoned this approach because there's >> then no principled way to handle WAL segments of less than the default >> size. > > I did say that, but I thought I had hit on a compromise. > > But, as I originally pointed out the hex characters in the filename are not > aligned correctly for > 8 bits (< 16MB segments) and using different > alignments just made it less consistent. I don't think I understand what the compromise is. Are you saying we should have one rule for segments < 16MB and another rule for segments > 16MB? I think using two different rules for file naming depending on the segment size will be a negative for both tool authors and ordinary users. > It would be OK if we were willing to drop the 1,2,4,8 segment sizes because > then the alignment would make sense and not change the current 16MB > sequence. Well, that is true. But the thing I'm trying to do here is to keep this patch down to what actually needs to be changed in order to accomplish the original purchase, not squeeze more and more changes into it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hello, On Wed, Mar 22, 2017 at 9:41 PM, Kuntal Ghosh wrote: > On Wed, Mar 22, 2017 at 3:14 PM, Beena Emerson > wrote: > > PFA an updated patch which fixes a minor bug I found. It only increases > the > > string size in pretty_wal_size function. > > The 01-add-XLogSegmentOffset-macro.patch has also been rebased. > Thanks for the updated versions. Here is a partial review of the patch: > > In pg_standby.c and pg_waldump.c, > + XLogPageHeader hdr = (XLogPageHeader) buf; > + XLogLongPageHeader NewLongPage = (XLogLongPageHeader) hdr; > + > + XLogSegSize = NewLongPage->xlp_seg_size; > It waits until the file is at least XLOG_BLCKSZ, then gets the > expected final size from XLogPageHeader. This looks really clean > compared to the previous approach. > thank you for testing. PFA the rebased patch incorporating your comments. > > + * Verify that the min and max wal_size meet the minimum requirements. > Better to write min_wal_size and max_wal_size. > Updated wherever applicable. > > + errmsg("Insufficient value for \"min_wal_size\""))); > "min_wal_size %d is too low" may be? Use lower case for error > messages. Same for max_wal_size. > updated. > > + /* Set the XLogSegSize */ > + XLogSegSize = ControlFile->xlog_seg_size; > + > A call to IsValidXLogSegSize() will be good after this, no? > Is it necessary? We already have the CRC check for ControlFiles. Is that not enough? > > + /* Update variables using XLogSegSize */ > + check_wal_size(); > The method name looks somewhat misleading compared to the comment for > it, doesn't it? > The method name is correct since it only checks if the value provided is sufficient (at least 2 segment size). I have updated the comment to say Check and update variables dependent on XLogSegSize > This patch introduces a new guc_unit having values in MB for > max_wal_size and min_wal_size. I'm not sure about the upper limit > which is set to INT_MAX for 32-bit systems as well. Is it needed to > define something like MAX_MEGABYTES similar to MAX_KILOBYTES? > It is worth mentioning that GUC_UNIT_KB can't be used in this case > since MAX_KILOBYTES is INT_MAX / 1024(<2GB) on 32-bit systems. That's > not a sufficient value for min_wal_size/max_wal_size. > You are right. I can use the same value as upper limit for GUC_UNIT_MB, we could probably change the name of MAX_KILOBYTES to something more general for both GUC_UNIT_MB and GUC_UNIT_KB. The max size in 32-bit systems would be 2TB. > While testing with pg_waldump, I got the following error. > bin/pg_waldump -p master/pg_wal/ -s 0/0100 > Floating point exception (core dumped) > Stack: > #0 0x004039d6 in ReadPageInternal () > #1 0x00404c84 in XLogFindNextRecord () > #2 0x00401e08 in main () > I think that the problem is in following code: > /* parse files as start/end boundaries, extract path if not specified */ > if (optind < argc) > { > > + if (!RetrieveXLogSegSize(full_path)) > ... > } > In this case, RetrieveXLogSegSize is conditionally called. So, if the > condition is false, XLogSegSize will not be initialized. > > pg_waldump code has been updated. -- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company 02-initdb-walsegsize-v8.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/23/17 4:45 PM, Peter Eisentraut wrote: On 3/22/17 17:33, David Steele wrote: I think if we don't change the default size it's very unlikely I would use alternate WAL segment sizes or recommend that anyone else does, at least in v10. I simply don't think it would get the level of testing required to be production worthy I think we could tweak the test harnesses to run all the tests with different segment sizes. That would get pretty good coverage. I would want to see 1,16,64 at a minimum. More might be nice but that gets a bit ridiculous at some point. I would be fine with different critters having different defaults. I don't think that each critter needs to test each value. More generally, the methodology that we should not add an option unless we also change the default because the option would otherwise not get enough testing is a bit dubious. Generally, I would agree, but I think this is a special case. This option has been around for a long time and we are just now exposing it in a way that's useful to end users. It's easy to see how various assumptions may have arisen around the default and led to code that is not quite right when using different values. Even if that's not true in the backend code, it might affect bin, and certainly affects third party tools. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Jeff, * Jeff Janes (jeff.ja...@gmail.com) wrote: > On Thu, Mar 23, 2017 at 1:45 PM, Peter Eisentraut < > peter.eisentr...@2ndquadrant.com> wrote: > > On 3/22/17 17:33, David Steele wrote: > > > and I doubt that most tool writers would be quick to > > > add support for a feature that very few people (if any) use. > > > > I'm not one of those tool writers, although I have written my share of > > DBA scripts over the years, but I wonder why those tools would really > > care. They are handed files with predetermined names to archive, and > > for restore files with predetermined names are requested back from them. > > What else do they need? If something is missing that requires them to > > parse file names, then maybe that should be added. > > I have a pg_restore which predicts the file 5 files ahead of the one it was > asked for, and initiates a pre-fetch and decompression of it. Then it > delivers the file it was asked for, either by pulling it out of the > pre-staging area set up by the N-5th invocation, or by going directly to > the archive to get it. This speeds up play-back dramatically when the > files are stored compressed and non-local. Ah, yes, that is on our road-map for pgBackrest to do also, along with parallel WAL fetch which also needs to figure out the WAL names before being asked for them. We do already have parallel push, which also needs to figure out what the upcoming file names are going to be so we can find them and push them when they're indicated as ready in archive_status. Perhaps we could just push whatever is ready and remember everything that was pushed for when PG asks, but that is really not ideal. > That is why I need to know how the files are numbered. I don't think that > that makes much of a difference, though. Any change is going to break > that, no matter which change. Then I'll fix it. Right, but the discussion here is actually about the idea that we're going to encourage people to use the initdb-time option to change the WAL size, meaning you'll need to deal with different WAL sizes and different naming due to that, and then we're going to turn around in the very next release and break the naming, meaning you'll have to adjust your tools first for the different possible WAL sizes in PG10 and then adjust again for the different naming in PG11. I'm trying to suggest that if we're going to do this that, perhaps, we should try to make both the changes in one release instead of across two. > If we are going to break it, I'd prefer to just do away with the 'segment' > thing altogether. You have timelines, and you have files. That's it. I'm not sure I follow this proposal. We have to know which WAL file has which LSN in it, how do you do that with just 'timelines and files'? Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On 3/24/17 12:27 AM, Peter Eisentraut wrote: On 3/23/17 16:58, Stephen Frost wrote: The backup tools need to also get the LSN from the pg_stop_backup and verify that they have the WAL file associated with that LSN. There is a function for that. They also need to make sure that they have all of the WAL files between the starting LSN and the ending LSN. Doing that requires understanding how the files are named to make sure there aren't any missing. There is not a function for that, but there could be one. A function would be nice, but tools often cannot depend on the database being operational so it's still necessary to re-implement them. Having a sane sequence in the WAL makes that easier. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Peter, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > There is a function for that. [...] > There is not a function for that, but there could be one. I'm not sure you've really considered what you're suggesting here. We need to to make sure we have every file between two LSNs. Yes, we could step the LSN forward one byte at a time, calling the appropriate function for every single byte, to make sure that we have that file, but that really isn't a reasonable approach. Nor would it be reasonable if we go on the assumption that WAL files can't be less than 1MB. Beyond that, this also bakes in an assumption that we would then require access to a database (of a current enough version to have the functions needed too!) to connect to and run these functions, which is a poor design. If the user is using a remote system to gather the WAL on, that system may not have easy access to PG. Further, backup tools will want to do things like off-line verification that the backup is complete, perhaps in another environment entirely which doesn't have PG, or maybe where what they're trying to do is make sure that a given backup is good before starting a restore to bring PG back up. Also, given that one of the things we're talking about here is specifically that we want to be able to change the WAL size for different databases, you would have to make sure that the database you're running these functions on uses the same WAL file size as the one which is being backed up. No, I don't agree that we can claim the LSN -> WAL filename mapping is an internal PG detail that we can whack around because there are functions to calculate the answer. External utilities need to be able to perform that translation and we need to document for them how to do so correctly. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On 3/23/17 21:47, Jeff Janes wrote: > I have a pg_restore which predicts the file 5 files ahead of the one it > was asked for, and initiates a pre-fetch and decompression of it. Then > it delivers the file it was asked for, either by pulling it out of the > pre-staging area set up by the N-5th invocation, or by going directly to > the archive to get it. This speeds up play-back dramatically when the > files are stored compressed and non-local. Yeah, some better support for prefetching would be necessary to avoid having to have any knowledge of the file naming. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/23/17 16:58, Stephen Frost wrote: > The backup tools need to also get the LSN from the pg_stop_backup and > verify that they have the WAL file associated with that LSN. There is a function for that. > They also > need to make sure that they have all of the WAL files between the > starting LSN and the ending LSN. Doing that requires understanding how > the files are named to make sure there aren't any missing. There is not a function for that, but there could be one. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On Thu, Mar 23, 2017 at 1:45 PM, Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > On 3/22/17 17:33, David Steele wrote: > > > and I doubt that most tool writers would be quick to > > add support for a feature that very few people (if any) use. > > I'm not one of those tool writers, although I have written my share of > DBA scripts over the years, but I wonder why those tools would really > care. They are handed files with predetermined names to archive, and > for restore files with predetermined names are requested back from them. > What else do they need? If something is missing that requires them to > parse file names, then maybe that should be added. > I have a pg_restore which predicts the file 5 files ahead of the one it was asked for, and initiates a pre-fetch and decompression of it. Then it delivers the file it was asked for, either by pulling it out of the pre-staging area set up by the N-5th invocation, or by going directly to the archive to get it. This speeds up play-back dramatically when the files are stored compressed and non-local. That is why I need to know how the files are numbered. I don't think that that makes much of a difference, though. Any change is going to break that, no matter which change. Then I'll fix it. If we are going to break it, I'd prefer to just do away with the 'segment' thing altogether. You have timelines, and you have files. That's it. Cheers, Jeff
Re: [HACKERS] increasing the default WAL segment size
Peter, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 3/22/17 17:33, David Steele wrote: > > and I doubt that most tool writers would be quick to > > add support for a feature that very few people (if any) use. > > I'm not one of those tool writers, although I have written my share of > DBA scripts over the years, but I wonder why those tools would really > care. They are handed files with predetermined names to archive, and > for restore files with predetermined names are requested back from them. > What else do they need? If something is missing that requires them to > parse file names, then maybe that should be added. PG backup technology has come a fair ways from that simple characterization of it. :) The backup tools need to also get the LSN from the pg_stop_backup and verify that they have the WAL file associated with that LSN. They also need to make sure that they have all of the WAL files between the starting LSN and the ending LSN. Doing that requires understanding how the files are named to make sure there aren't any missing. David will probably point out other reasons that the backup tools need to understand the file naming, but those are ones I know of off-hand. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On 3/22/17 17:33, David Steele wrote: > I think if we don't change the default size it's very unlikely I would > use alternate WAL segment sizes or recommend that anyone else does, at > least in v10. > > I simply don't think it would get the level of testing required to be > production worthy I think we could tweak the test harnesses to run all the tests with different segment sizes. That would get pretty good coverage. More generally, the methodology that we should not add an option unless we also change the default because the option would otherwise not get enough testing is a bit dubious. > and I doubt that most tool writers would be quick to > add support for a feature that very few people (if any) use. I'm not one of those tool writers, although I have written my share of DBA scripts over the years, but I wonder why those tools would really care. They are handed files with predetermined names to archive, and for restore files with predetermined names are requested back from them. What else do they need? If something is missing that requires them to parse file names, then maybe that should be added. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
David, * David Steele (da...@pgmasters.net) wrote: > On 3/22/17 3:45 PM, Robert Haas wrote: > >On Wed, Mar 22, 2017 at 3:24 PM, David Steele wrote: > >>>One of the reasons to go with the LSN is that we would actually be > >>>maintaining what happens when the WAL files are 16MB in size. > >>> > >>>David's initial expectation was this for 64MB WAL files: > >>> > >>>00010040 > >>>00010080 > >>>000100CO > >>>00010001 > >> > >> > >>This is the 1GB sequence, actually, but idea would be the same for 64MB > >>files. > > > >Wait, really? I thought you abandoned this approach because there's > >then no principled way to handle WAL segments of less than the default > >size. > > I did say that, but I thought I had hit on a compromise. Strikes me as one, at least. > But, as I originally pointed out the hex characters in the filename > are not aligned correctly for > 8 bits (< 16MB segments) and using > different alignments just made it less consistent. > > It would be OK if we were willing to drop the 1,2,4,8 segment sizes > because then the alignment would make sense and not change the > current 16MB sequence. For my 2c, at least, it seems extremely unlikely that people are using smaller-than-16MB segments. Also, we don't have to actually drop support for those sizes, just handle the numbering differently, if we feel like they're useful enough to keep- in particular I was thinking we could make the filename one digit longer, or shift the numbers up one position, but my general feeling is that it wouldn't ever be an exercised use-case and therefore we should just drop support for them. Perhaps I'm being overly paranoid, but I share David's concern about non-standard/non-default WAL sizes being a serious risk due to lack of exposure for those code paths, which is another reason that we should change the default if we feel it's valuable to have a large WAL segment, not just create this option which users can set at initdb time but which we very rarely actually test to ensure it's working. With any of these we need to have some buildfarm systems which are at *least* running our regression tests against the different options, if we would consider telling users to use them. > Even then, there are some interesting side effects. For 1GB > segments the "0001000100C0" segment would include LSNs > 1/C000 through 1/. This is correct but is not an > obvious filename to LSN mapping, at least for LSNs that appear later > in the segment. That doesn't seem unreasonable to me. If we're going to use the starting LSN of the segment then it's going to skip when you start varying the size of the segment. Even keeping the current scheme we end up with skipping happening, it just different skipping and goes "1, 2, 3, skip to the next!" where how high the count goes depends on the size. With this approach, we're consistently skipping the same amount which is exactly the size divided by 16MB, always. I do also like Peter's suggestion also of using separator between the components of the WAL filename, but that would change the naming for everyone, which is a concern I can understand us wishing to avoid. From a user-experience point of view, keeping the mapping from the WAL filename to the starting LSN is quite nice, even if this change might complicate the backend code a bit. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
Hi Robert, On 3/22/17 3:45 PM, Robert Haas wrote: On Wed, Mar 22, 2017 at 3:24 PM, David Steele wrote: One of the reasons to go with the LSN is that we would actually be maintaining what happens when the WAL files are 16MB in size. David's initial expectation was this for 64MB WAL files: 00010040 00010080 000100CO 00010001 This is the 1GB sequence, actually, but idea would be the same for 64MB files. Wait, really? I thought you abandoned this approach because there's then no principled way to handle WAL segments of less than the default size. I did say that, but I thought I had hit on a compromise. But, as I originally pointed out the hex characters in the filename are not aligned correctly for > 8 bits (< 16MB segments) and using different alignments just made it less consistent. It would be OK if we were willing to drop the 1,2,4,8 segment sizes because then the alignment would make sense and not change the current 16MB sequence. Even then, there are some interesting side effects. For 1GB segments the "0001000100C0" segment would include LSNs 1/C000 through 1/. This is correct but is not an obvious filename to LSN mapping, at least for LSNs that appear later in the segment. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/22/17 3:39 PM, Peter Eisentraut wrote: On 3/22/17 15:37, Peter Eisentraut wrote: If changing WAL sizes catches on, I do think we should keep thinking about a new format for a future release, I think that means that I'm skeptical about changing the default size right now. I think if we don't change the default size it's very unlikely I would use alternate WAL segment sizes or recommend that anyone else does, at least in v10. I simply don't think it would get the level of testing required to be production worthy and I doubt that most tool writers would be quick to add support for a feature that very few people (if any) use. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > The question is, which property is more useful to preserve: matching > LSN, or having a mostly consecutive numbering. > > Actually, I would really really like to have both, but if I had to pick > one, I'd lean 55% toward consecutive numbering. > For the issue at hand, I think it's fine to proceed with the naming > schema that the existing compile-time option gives you. What I don't particularly like about that is that it's *not* actually consecutive, you end up with this: 00010001 00010002 00010003 00010001 Which is part of what I don't particularly like about this approach. > In fact, that would flush out some of the tools that look directly at > the file names and interpret them, thus preserving the option to move to > a more radically different format. This doesn't make a lot of sense to me. If we get people to change to using larger WAL segments and the tools are modified to understand the pseudo-consecutive format, and then you want to change it on them again in another release or two? I'm generally a fan of not feeling too bad breaking backwards compatibility, but it seems pretty rough even to me to do so immediately. This is exactly why I think it'd be better to work out a good naming scheme now that actually makes sense and that we'll be able to stick with for a while instead of rushing to get this ability in now, when we'll have people actually starting to use it and then try to change it. > If changing WAL sizes catches on, I do think we should keep thinking > about a new format for a future release, because debugging will > otherwise become a bit wild. I'm thinking something like > > {integer timeline}_{integer seq number}_{hex lsn} > > might address various interests. Right, I'd rather not have debugging WAL files become a bit wild. If we can't work out a sensible approach to naming that we expect to last us for at least a couple of releases for different sizes of WAL files, then I don't think we should rush to encourage users to use different sizes of WAL files. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On Wed, Mar 22, 2017 at 3:24 PM, David Steele wrote: >> One of the reasons to go with the LSN is that we would actually be >> maintaining what happens when the WAL files are 16MB in size. >> >> David's initial expectation was this for 64MB WAL files: >> >> 00010040 >> 00010080 >> 000100CO >> 00010001 > > > This is the 1GB sequence, actually, but idea would be the same for 64MB > files. Wait, really? I thought you abandoned this approach because there's then no principled way to handle WAL segments of less than the default size. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/22/17 15:37, Peter Eisentraut wrote: > If changing WAL sizes catches on, I do think we should keep thinking > about a new format for a future release, I think that means that I'm skeptical about changing the default size right now. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/22/17 15:09, Stephen Frost wrote: > David's initial expectation was this for 64MB WAL files: > > 00010040 > 00010080 > 000100CO > 00010001 > > Which both matches the LSN *and* keeps the file names the same when > they're 16MB. This is what David's looking at writing a patch for and > is what I think we should be considering. This avoids breaking > compatibility for people who choose to continue using 16MB (assuming > we switch the default to 64MB, which I am still hopeful we will do). The question is, which property is more useful to preserve: matching LSN, or having a mostly consecutive numbering. Actually, I would really really like to have both, but if I had to pick one, I'd lean 55% toward consecutive numbering. For the issue at hand, I think it's fine to proceed with the naming schema that the existing compile-time option gives you. In fact, that would flush out some of the tools that look directly at the file names and interpret them, thus preserving the option to move to a more radically different format. If changing WAL sizes catches on, I do think we should keep thinking about a new format for a future release, because debugging will otherwise become a bit wild. I'm thinking something like {integer timeline}_{integer seq number}_{hex lsn} might address various interests. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
* David Steele (da...@pgmasters.net) wrote: > On 3/22/17 3:09 PM, Stephen Frost wrote: > >* Robert Haas (robertmh...@gmail.com) wrote: > >>On Wed, Mar 22, 2017 at 1:49 PM, Stephen Frost wrote: > >>>Then perhaps we do need to be thinking of moving this to PG11 instead of > >>>exposing an option that users will start to use which will result in WAL > >>>naming that'll be confusing and inconsistent. I certainly don't think > >>>it's a good idea to move forward exposing an option with a naming scheme > >>>that's agreed to be bad. > >> > > > >One of the reasons to go with the LSN is that we would actually be > >maintaining what happens when the WAL files are 16MB in size. > > > >David's initial expectation was this for 64MB WAL files: > > > >00010040 > >00010080 > >000100CO > >00010001 > > This is the 1GB sequence, actually, but idea would be the same for > 64MB files. Ah, right, sorry. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On 3/22/17 3:09 PM, Stephen Frost wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Wed, Mar 22, 2017 at 1:49 PM, Stephen Frost wrote: Then perhaps we do need to be thinking of moving this to PG11 instead of exposing an option that users will start to use which will result in WAL naming that'll be confusing and inconsistent. I certainly don't think it's a good idea to move forward exposing an option with a naming scheme that's agreed to be bad. One of the reasons to go with the LSN is that we would actually be maintaining what happens when the WAL files are 16MB in size. David's initial expectation was this for 64MB WAL files: 00010040 00010080 000100CO 00010001 This is the 1GB sequence, actually, but idea would be the same for 64MB files. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Wed, Mar 22, 2017 at 1:49 PM, Stephen Frost wrote: > > Then perhaps we do need to be thinking of moving this to PG11 instead of > > exposing an option that users will start to use which will result in WAL > > naming that'll be confusing and inconsistent. I certainly don't think > > it's a good idea to move forward exposing an option with a naming scheme > > that's agreed to be bad. > > I'm not sure there is any such agreement. I agree that the naming > scheme for WAL files probably isn't the greatest and that David's > proposal is probably better, but we've had that naming scheme for many > years, and I don't accept that making a previously-configure-time > option initdb-time means that it's suddenly necessary to break > everything for people who continue to use a 16MB WAL size. Apologies, I completely forgot to bring up how the discussion has evolved regarding the 16MB case even though we had moved past it in my head. Let me try to set that right here. One of the reasons to go with the LSN is that we would actually be maintaining what happens when the WAL files are 16MB in size. David's initial expectation was this for 64MB WAL files: 00010040 00010080 000100CO 00010001 Which both matches the LSN *and* keeps the file names the same when they're 16MB. This is what David's looking at writing a patch for and is what I think we should be considering. This avoids breaking compatibility for people who choose to continue using 16MB (assuming we switch the default to 64MB, which I am still hopeful we will do). David had offered up another idea which would change the WAL naming for all sizes, but he and I chatted about it and it seemed like it'd make more sense to maintain the 16MB filenames and then to use the LSN for other sizes also in the same manner. Regardless of which approach we end up using, I do think we need a formal function for converting WAL file names into LSNs and documentation included which spells out exactly how that's done. This is obviously important to backup tools which need to make sure that there aren't any gaps in the WAL stream and also need to figure out where the LSN returned by pg_start_backup() is. We have a function for the latter already, but no documentation explaining how it works, which I believe we should as tool authors need to implement this in their own code since they can't always assume access to a PG server is available. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On Wed, Mar 22, 2017 at 1:49 PM, Stephen Frost wrote: > * Robert Haas (robertmh...@gmail.com) wrote: >> On Wed, Mar 22, 2017 at 1:22 PM, Stephen Frost wrote: >> > To put this in another light, had this issue been brought up post >> > feature-freeze, your definition would mean that we would only have the >> > option to either revert the patch entirely or to live with the poor >> > naming scheme. >> >> Yeah, and I absolutely agree with that. In fact, I think it's >> *already* past the time when we should be considering the changes you >> want. > > Then perhaps we do need to be thinking of moving this to PG11 instead of > exposing an option that users will start to use which will result in WAL > naming that'll be confusing and inconsistent. I certainly don't think > it's a good idea to move forward exposing an option with a naming scheme > that's agreed to be bad. I'm not sure there is any such agreement. I agree that the naming scheme for WAL files probably isn't the greatest and that David's proposal is probably better, but we've had that naming scheme for many years, and I don't accept that making a previously-configure-time option initdb-time means that it's suddenly necessary to break everything for people who continue to use a 16MB WAL size. I really think that is very unlikely to be a majority position, no matter how firmly you and David hold to it. It is possible that a majority of people will agree that such a change should be made, but it seems very remote that a majority of people will agree that it has to (or even should be) the same commit that improves the configurability. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On Wed, Mar 22, 2017 at 9:51 AM, Stephen Frost wrote: > Robert, > > * Robert Haas (robertmh...@gmail.com) wrote: > > On Wed, Mar 22, 2017 at 12:22 PM, Stephen Frost > wrote: > > > While I understand that you'd like to separate the concerns between > > > changing the renaming scheme and changing the default and enabling this > > > option, I don't agree that they can or should be independently > > > considered. > > > > Well, I don't understand what you think is going to happen here. Neither > > Beena nor any other contributor you don't employ is obliged to write a > > patch for those changes because you'd like them to get made, and Peter > and > > I have already voted against including them. If you or David want to > write > > a patch for those changes, post it for discussion, and try to get > consensus > > to commit it, that's of course your right. But the patch will be more > than > > three weeks after the feature freeze deadline and will have two committer > > votes against it from the outset. > > This would clearly be an adjustment to the submitted patch, which > happens regularly during the review and commit process and is part of > the commitfest process, so I don't agree that holding it to new-feature > level is correct when it's a change which is entirely relevant to this > new feature, and one which a committer is asking to be included as part > of the change. Nor do I feel particularly bad about asking for feature > authors to be prepared to rework parts of their feature based on > feedback during the commitfest process. > ​Maybe it can be fit in as part of the overall patch set but wouldn't placing it either: First. changing the name behavior and use the existing configure-time ​knob to test it out or Second. commit the existing patch relying on the existing behavior and then implement the rename changes using the new initdb-time knob to test it out. ​in a series make reasoning and discussing the change considerably easier?​ > I would have liked to have realized this oddity with the WAL naming > scheme for not-16MB-WALs earlier too, but it's unfortunately not within > my abilities to change that. That does not mean that we shouldn't be > cognizant of the impact that this new feature will have in exposing this > naming scheme, one which there seems to be agreement is bad, to users. > > That said, David is taking a look at it to try and be helpful. > > Vote-counting seems a bit premature given that there hasn't been any > particularly clear asking for votes. Additionally, I believe Peter also > seemed concerned that the existing naming scheme which, if used with, > say, 64MB segments, wouldn't match LSNs either, in this post: > 9795723f-b4dd-f9e9-62e4-ddaf6cd88...@2ndquadrant.com ​While my DBA skills aren't that great I would think that having a system that relies upon the DBA learning how to mentally map between LSN IDs and WAL​ files is a failure in UX in the first place. The hacker-DBA might get a kick out of being able to operate efficiently with that knowledge and level of skill but the typical DBA would rather have something like "pg_wal --lsn " that they can rely upon. I would think tool builders would likewise rather rely on us to tell them where to go look instead of matching up two strings. This kinda reminds me of the discussion regarding our version number change. We are going to expose broken tools that rely on the decimal version number instead of the official integer one. Here we expose tools that rely on the equivalence between LSN and WAL filenames when using 16MB WAL files. What I haven't seen defined here is how those tools should be behaving - i.e., what is our supported API. If we lack an official way of working with these values then maybe we shouldn't give users an initdb-time way to change the WAL file size. For the uninformed like myself an actual concrete example of an actual program that would be affected would be helpful. David J.
Re: [HACKERS] increasing the default WAL segment size
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Wed, Mar 22, 2017 at 1:22 PM, Stephen Frost wrote: > > To put this in another light, had this issue been brought up post > > feature-freeze, your definition would mean that we would only have the > > option to either revert the patch entirely or to live with the poor > > naming scheme. > > Yeah, and I absolutely agree with that. In fact, I think it's > *already* past the time when we should be considering the changes you > want. Then perhaps we do need to be thinking of moving this to PG11 instead of exposing an option that users will start to use which will result in WAL naming that'll be confusing and inconsistent. I certainly don't think it's a good idea to move forward exposing an option with a naming scheme that's agreed to be bad. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On Wed, Mar 22, 2017 at 1:22 PM, Stephen Frost wrote: > To put this in another light, had this issue been brought up post > feature-freeze, your definition would mean that we would only have the > option to either revert the patch entirely or to live with the poor > naming scheme. Yeah, and I absolutely agree with that. In fact, I think it's *already* past the time when we should be considering the changes you want. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Wed, Mar 22, 2017 at 12:51 PM, Stephen Frost wrote: > > This would clearly be an adjustment to the submitted patch, which > > happens regularly during the review and commit process and is part of > > the commitfest process, so I don't agree that holding it to new-feature > > level is correct when it's a change which is entirely relevant to this > > new feature, and one which a committer is asking to be included as part > > of the change. Nor do I feel particularly bad about asking for feature > > authors to be prepared to rework parts of their feature based on > > feedback during the commitfest process. > > Obviously, reworking the patch is an expected part of the CommitFest > process. However, I disagree that what you're asking for can in any > way be fairly characterized that way. You're not trying to make it do > the thing that it does better or differently. You're trying to make > it do a second thing. I don't agree with the particularly narrow definition you're using in this case to say that adding an option to initdb to change how big WAL files are, which will also change how they're named (even though this patch doesn't *specifically* do anything with the naming because there was a configure-time switch which existed before) means that asking for the WAL files names, which are already being changed, to be changed in a different way, is really outside the scope and a new feature. To put this in another light, had this issue been brought up post feature-freeze, your definition would mean that we would only have the option to either revert the patch entirely or to live with the poor naming scheme. For my 2c, in such a case, I would have voted to make the change even post feature-freeze unless we were very close to release as it's not really a new 'feature'. Thankfully, that isn't the case here and we do have time to consider changing it without having to worry about having a post feature-freeze discussion about it. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On Wed, Mar 22, 2017 at 12:51 PM, Stephen Frost wrote: > This would clearly be an adjustment to the submitted patch, which > happens regularly during the review and commit process and is part of > the commitfest process, so I don't agree that holding it to new-feature > level is correct when it's a change which is entirely relevant to this > new feature, and one which a committer is asking to be included as part > of the change. Nor do I feel particularly bad about asking for feature > authors to be prepared to rework parts of their feature based on > feedback during the commitfest process. Obviously, reworking the patch is an expected part of the CommitFest process. However, I disagree that what you're asking for can in any way be fairly characterized that way. You're not trying to make it do the thing that it does better or differently. You're trying to make it do a second thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Wed, Mar 22, 2017 at 12:22 PM, Stephen Frost wrote: > > While I understand that you'd like to separate the concerns between > > changing the renaming scheme and changing the default and enabling this > > option, I don't agree that they can or should be independently > > considered. > > Well, I don't understand what you think is going to happen here. Neither > Beena nor any other contributor you don't employ is obliged to write a > patch for those changes because you'd like them to get made, and Peter and > I have already voted against including them. If you or David want to write > a patch for those changes, post it for discussion, and try to get consensus > to commit it, that's of course your right. But the patch will be more than > three weeks after the feature freeze deadline and will have two committer > votes against it from the outset. This would clearly be an adjustment to the submitted patch, which happens regularly during the review and commit process and is part of the commitfest process, so I don't agree that holding it to new-feature level is correct when it's a change which is entirely relevant to this new feature, and one which a committer is asking to be included as part of the change. Nor do I feel particularly bad about asking for feature authors to be prepared to rework parts of their feature based on feedback during the commitfest process. I would have liked to have realized this oddity with the WAL naming scheme for not-16MB-WALs earlier too, but it's unfortunately not within my abilities to change that. That does not mean that we shouldn't be cognizant of the impact that this new feature will have in exposing this naming scheme, one which there seems to be agreement is bad, to users. That said, David is taking a look at it to try and be helpful. Vote-counting seems a bit premature given that there hasn't been any particularly clear asking for votes. Additionally, I believe Peter also seemed concerned that the existing naming scheme which, if used with, say, 64MB segments, wouldn't match LSNs either, in this post: 9795723f-b4dd-f9e9-62e4-ddaf6cd88...@2ndquadrant.com Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On Wed, Mar 22, 2017 at 12:22 PM, Stephen Frost wrote: > While I understand that you'd like to separate the concerns between > changing the renaming scheme and changing the default and enabling this > option, I don't agree that they can or should be independently > considered. Well, I don't understand what you think is going to happen here. Neither Beena nor any other contributor you don't employ is obliged to write a patch for those changes because you'd like them to get made, and Peter and I have already voted against including them. If you or David want to write a patch for those changes, post it for discussion, and try to get consensus to commit it, that's of course your right. But the patch will be more than three weeks after the feature freeze deadline and will have two committer votes against it from the outset. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] increasing the default WAL segment size
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On the topic of whether to also change the default, I'm not sure what > is best and will defer to others. On the topic of whether to whack > around the file naming scheme, -1 from me. This patch was posted > three months ago and nobody suggested that course of action until this > week. Even though it is on a related topic, it is a conceptually > separate change that is previously undiscussed and on which we do not > have agreement. Making those changes just before feature freeze isn't > fair to the patch authors or people who may not have time to pay > attention to this thread right this minute. While I understand that you'd like to separate the concerns between changing the renaming scheme and changing the default and enabling this option, I don't agree that they can or should be independently considered. This is, in my view, basically the only opportunity we will have to change the naming scheme because once we make it an initdb option, while I don't think very many people will use it, there will be people who will and the various tool authors will also have to adjust to handle those cases. Chances are good that we will even see people start to recommend using that initdb option, leading to more people using a different default, at which point we simply are not going to be able to consider changing the nameing scheme. Therefore, I would much rather we take this opportunity to change the naming scheme and the default at the same time to be more sane, because if we have this patch as-is in PG10, we won't be able to do so in the future without a great deal more pain. I'm willing to forgo the ability to change the WAL size with just a server restart for PG10 because that's something which can clearly be added later without any concerns about backwards-compatibility, but the same is not true regarding the naming scheme. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On Wed, Mar 22, 2017 at 8:46 AM, Stephen Frost wrote: >> I was definitely initially in favor of >> raising the value, but I got cold feet, a bit, when Alvaro pointed out >> that going to 64MB would require a substantial increase in >> min_wal_size. > > The performance concern of having 3 segments is a red herring here if > we're talking about a default install because the default for > max_wal_size is 1G, not 192MB. I do think increasing the default WAL > size would be valuable to do even if it does mean a default install will > take up a bit more space. min_wal_size isn't the same thing as max_wal_size. > I didn't see much discussion of it, but if this is really a concern then > couldn't we set the default to be 2 segments worth instead of 3 also? > That would mean an increase from 80MB to 128MB in the default install if > you never touch more than 128MB during a checkpoint. Not sure. Need testing. >> I'm a little worried that this whole question of changing the file >> naming scheme is a diversion which will result in torpedoing any >> chance of getting some kind of improvement here for v11. I don't >> think the patch is all that far from being committable but it's not >> going to get there if we start redesigning the world around it. > > It's not my intent to 'torpedo' this patch but I'm pretty disappointed > that we're introducing yet another initdb-time option with, as far as I > can tell, no option to change it after the cluster has started (without > some serious hackery), and continuing to have a poor default, which is > what most users will end up with. > > I really don't like these kinds of options. I'd much rather have a > reasonable default that covers most cases and is less likely to be a > problem for most systems than have a minimal setting that's impossible > to change after you've got your data in the system. As much as I'd like > everyone to talk to me before doing an initdb, that's pretty rare and > instead we end up having to break the bad news that they should have > known better and done the right thing at initdb time and, no, sorry, > there's no answer today but to dump out all of the data and load it into > a new cluster which was set up with the right initdb settings. Well, right now, the alternative is to recompile the server, so I think being able to make the change at initdb time is pretty [ insert a word of your choice here ] great by comparison. Now, I completely agree that initdb-time configurability is inferior to server-restart configurability which is obviously inferior to on-the-fly reconfigurability, but we are not going to get either of those latter two things in v10, so I think we should take the one we can get, which is clearly better than what we've got now. In the future, if somebody is willing to put in the time and energy to allow this to be changed via a pg_resetexlog-like procedure, or even on the fly by some ALTER SYSTEM command, we can consider those changes then, but everything this patch does will still be necessary. On the topic of whether to also change the default, I'm not sure what is best and will defer to others. On the topic of whether to whack around the file naming scheme, -1 from me. This patch was posted three months ago and nobody suggested that course of action until this week. Even though it is on a related topic, it is a conceptually separate change that is previously undiscussed and on which we do not have agreement. Making those changes just before feature freeze isn't fair to the patch authors or people who may not have time to pay attention to this thread right this minute. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On Wed, Mar 22, 2017 at 3:14 PM, Beena Emerson wrote: > PFA an updated patch which fixes a minor bug I found. It only increases the > string size in pretty_wal_size function. > The 01-add-XLogSegmentOffset-macro.patch has also been rebased. Thanks for the updated versions. Here is a partial review of the patch: In pg_standby.c and pg_waldump.c, + XLogPageHeader hdr = (XLogPageHeader) buf; + XLogLongPageHeader NewLongPage = (XLogLongPageHeader) hdr; + + XLogSegSize = NewLongPage->xlp_seg_size; It waits until the file is at least XLOG_BLCKSZ, then gets the expected final size from XLogPageHeader. This looks really clean compared to the previous approach. + * Verify that the min and max wal_size meet the minimum requirements. Better to write min_wal_size and max_wal_size. + errmsg("Insufficient value for \"min_wal_size\""))); "min_wal_size %d is too low" may be? Use lower case for error messages. Same for max_wal_size. + /* Set the XLogSegSize */ + XLogSegSize = ControlFile->xlog_seg_size; + A call to IsValidXLogSegSize() will be good after this, no? + /* Update variables using XLogSegSize */ + check_wal_size(); The method name looks somewhat misleading compared to the comment for it, doesn't it? + * allocating space and reading ControlFile. s/and/for + {"TB", GUC_UNIT_MB, 1024 * 1024}, + {"GB", GUC_UNIT_MB, 1024}, + {"MB", GUC_UNIT_MB, 1}, + {"kB", GUC_UNIT_MB, -1024}, @@ -2235,10 +2231,10 @@ static struct config_int ConfigureNamesInt[] = {"min_wal_size", PGC_SIGHUP, WAL_CHECKPOINTS, gettext_noop("Sets the minimum size to shrink the WAL to."), NULL, - GUC_UNIT_XSEGS + GUC_UNIT_MB }, - &min_wal_size, - 5, 2, INT_MAX, + &min_wal_size_mb, + DEFAULT_MIN_WAL_SEGS * 16, 2, INT_MAX, NULL, NULL, NULL }, @@ -2246,10 +2242,10 @@ static struct config_int ConfigureNamesInt[] = {"max_wal_size", PGC_SIGHUP, WAL_CHECKPOINTS, gettext_noop("Sets the WAL size that triggers a checkpoint."), NULL, - GUC_UNIT_XSEGS + GUC_UNIT_MB }, - &max_wal_size, - 64, 2, INT_MAX, + &max_wal_size_mb, + DEFAULT_MAX_WAL_SEGS * 16, 2, INT_MAX, NULL, assign_max_wal_size, NULL }, This patch introduces a new guc_unit having values in MB for max_wal_size and min_wal_size. I'm not sure about the upper limit which is set to INT_MAX for 32-bit systems as well. Is it needed to define something like MAX_MEGABYTES similar to MAX_KILOBYTES? It is worth mentioning that GUC_UNIT_KB can't be used in this case since MAX_KILOBYTES is INT_MAX / 1024(<2GB) on 32-bit systems. That's not a sufficient value for min_wal_size/max_wal_size. While testing with pg_waldump, I got the following error. bin/pg_waldump -p master/pg_wal/ -s 0/0100 Floating point exception (core dumped) Stack: #0 0x004039d6 in ReadPageInternal () #1 0x00404c84 in XLogFindNextRecord () #2 0x00401e08 in main () I think that the problem is in following code: /* parse files as start/end boundaries, extract path if not specified */ if (optind < argc) { + if (!RetrieveXLogSegSize(full_path)) ... } In this case, RetrieveXLogSegSize is conditionally called. So, if the condition is false, XLogSegSize will not be initialized. I'm yet to review pg_basebackup module and I'll try to finish that ASAP. -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Peter, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 3/22/17 08:46, Stephen Frost wrote: > > It's not my intent to 'torpedo' this patch but I'm pretty disappointed > > that we're introducing yet another initdb-time option with, as far as I > > can tell, no option to change it after the cluster has started (without > > some serious hackery), and continuing to have a poor default, which is > > what most users will end up with. > > I understand this concern, but what alternative do you have in mind? Changing the default to a more reasonable value would at least reduce the issue. I think it'd also be nice to have a way to change it post-initdb, but that's less of an issue if we are at least setting it to a good default to begin with instead of a minimal one. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On 3/22/17 08:46, Stephen Frost wrote: > It's not my intent to 'torpedo' this patch but I'm pretty disappointed > that we're introducing yet another initdb-time option with, as far as I > can tell, no option to change it after the cluster has started (without > some serious hackery), and continuing to have a poor default, which is > what most users will end up with. I understand this concern, but what alternative do you have in mind? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On Tue, Mar 21, 2017 at 11:49:30PM -0400, Robert Haas wrote: > To be honest, I'd sort of forgotten about the change which is the > nominal subject of this thread - I was more focused on the patch, > which makes it configurable. I was definitely initially in favor of > raising the value, but I got cold feet, a bit, when Alvaro pointed out > that going to 64MB would require a substantial increase in > min_wal_size. I'm not sure people with small installations will > appreciate seeing that value cranked up from 5 segments * 16MB = 80MB > to, say, 3 segments * 64MB = 192MB. That's an extra 100+ MB of space > that doesn't really do anything for you. And nobody's done any > benchmarking to see whether having only 3 segments is even a workable, > performant configuration, so maybe we'll end up with 5 * 64MB = 320MB > by default. Maybe its time to have a documentation section listing suggested changes for small installs so we can have more reasonable defaults. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/22/17 05:44, Beena Emerson wrote: > As stated above, the default 16MB has not changed and so we can take > this separately and not as part of this patch. It's good to have that discussion separately, but if we're planning to do it for PG10 (not saying we should), then we should have that discussion very soon. Especially if we would be shipping a default configuration where the mapping of files to LSNs fails, which will require giving users some time to adjust. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Tue, Mar 21, 2017 at 8:10 PM, Stephen Frost wrote: > >> We've already > >> created quite a few incompatibilities in this release, and I'm not > >> entirely eager to just keep cranking them out at top speed. > > > > That position would seem to imply that you're in favor of keeping the > > current default of 16MB, but that doesn't make sense given that you > > started this discussion advocating to make it larger. Changing your > > position is certainly fine, but it'd be good to be more clear if that's > > what you meant here or if you were just referring to the file naming > > scheme but you do still want to increase the default size. > > To be honest, I'd sort of forgotten about the change which is the > nominal subject of this thread - I was more focused on the patch, > which makes it configurable. I was definitely initially in favor of > raising the value, but I got cold feet, a bit, when Alvaro pointed out > that going to 64MB would require a substantial increase in > min_wal_size. I'm not sure people with small installations will > appreciate seeing that value cranked up from 5 segments * 16MB = 80MB > to, say, 3 segments * 64MB = 192MB. That's an extra 100+ MB of space > that doesn't really do anything for you. And nobody's done any > benchmarking to see whether having only 3 segments is even a workable, > performant configuration, so maybe we'll end up with 5 * 64MB = 320MB > by default. The performance concern of having 3 segments is a red herring here if we're talking about a default install because the default for max_wal_size is 1G, not 192MB. I do think increasing the default WAL size would be valuable to do even if it does mean a default install will take up a bit more space. I didn't see much discussion of it, but if this is really a concern then couldn't we set the default to be 2 segments worth instead of 3 also? That would mean an increase from 80MB to 128MB in the default install if you never touch more than 128MB during a checkpoint. > I'm a little worried that this whole question of changing the file > naming scheme is a diversion which will result in torpedoing any > chance of getting some kind of improvement here for v11. I don't > think the patch is all that far from being committable but it's not > going to get there if we start redesigning the world around it. It's not my intent to 'torpedo' this patch but I'm pretty disappointed that we're introducing yet another initdb-time option with, as far as I can tell, no option to change it after the cluster has started (without some serious hackery), and continuing to have a poor default, which is what most users will end up with. I really don't like these kinds of options. I'd much rather have a reasonable default that covers most cases and is less likely to be a problem for most systems than have a minimal setting that's impossible to change after you've got your data in the system. As much as I'd like everyone to talk to me before doing an initdb, that's pretty rare and instead we end up having to break the bad news that they should have known better and done the right thing at initdb time and, no, sorry, there's no answer today but to dump out all of the data and load it into a new cluster which was set up with the right initdb settings. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
Hello, On Wed, Mar 22, 2017 at 9:19 AM, Robert Haas wrote: > > I'm a little worried that this whole question of changing the file > naming scheme is a diversion which will result in torpedoing any > chance of getting some kind of improvement here for v11. I don't > think the patch is all that far from being committable but it's not > going to get there if we start redesigning the world around it. > > As stated above, the default 16MB has not changed and so we can take this separately and not as part of this patch. PFA an updated patch which fixes a minor bug I found. It only increases the string size in pretty_wal_size function. The 01-add-XLogSegmentOffset-macro.patch has also been rebased. -- Thank you, Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company 02-initdb-walsegsize-v7.patch Description: Binary data 01-add-XLogSegmentOffset-macro.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On Tue, Mar 21, 2017 at 11:49 PM, Robert Haas wrote: > I'm a little worried that this whole question of changing the file > naming scheme is a diversion which will result in torpedoing any > chance of getting some kind of improvement here for v11. I don't > think the patch is all that far from being committable but it's not > going to get there if we start redesigning the world around it. Ha. A little Freudian slip there, since I obviously meant v10. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On Tue, Mar 21, 2017 at 8:10 PM, Stephen Frost wrote: >> We've already >> created quite a few incompatibilities in this release, and I'm not >> entirely eager to just keep cranking them out at top speed. > > That position would seem to imply that you're in favor of keeping the > current default of 16MB, but that doesn't make sense given that you > started this discussion advocating to make it larger. Changing your > position is certainly fine, but it'd be good to be more clear if that's > what you meant here or if you were just referring to the file naming > scheme but you do still want to increase the default size. To be honest, I'd sort of forgotten about the change which is the nominal subject of this thread - I was more focused on the patch, which makes it configurable. I was definitely initially in favor of raising the value, but I got cold feet, a bit, when Alvaro pointed out that going to 64MB would require a substantial increase in min_wal_size. I'm not sure people with small installations will appreciate seeing that value cranked up from 5 segments * 16MB = 80MB to, say, 3 segments * 64MB = 192MB. That's an extra 100+ MB of space that doesn't really do anything for you. And nobody's done any benchmarking to see whether having only 3 segments is even a workable, performant configuration, so maybe we'll end up with 5 * 64MB = 320MB by default. I'm a little worried that this whole question of changing the file naming scheme is a diversion which will result in torpedoing any chance of getting some kind of improvement here for v11. I don't think the patch is all that far from being committable but it's not going to get there if we start redesigning the world around it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Tue, Mar 21, 2017 at 6:02 PM, David Steele wrote: > > The biggest downside I can see is that this would change the naming scheme > > for the default of 16MB compared to previous versions of Postgres. However, > > for all other wal-seg-size values changes would need to be made anyway. > > I think changing the naming convention for 16MB WAL segments, which is > still going to be what 99% of people use, is an awfully large > compatibility break for an awfully marginal benefit. It seems extremely unlikely to me that we're going to actually see users deviate from whatever we set the default to and so I'm not sure that this is a real concern. We aren't changing what 9.6 and below's naming scheme is, just what PG10+ do, and PG10+ are going to have a different default WAL size. I realize the current patch still has the 16MB default even though a rather large portion of the early discussion appeared in favor of changing it to 64MB. Once we've done that, I don't think it makes one whit of difference what the naming scheme looks like when you're using 16MB sizes because essentially zero people are going to actually use such a setting. > We've already > created quite a few incompatibilities in this release, and I'm not > entirely eager to just keep cranking them out at top speed. That position would seem to imply that you're in favor of keeping the current default of 16MB, but that doesn't make sense given that you started this discussion advocating to make it larger. Changing your position is certainly fine, but it'd be good to be more clear if that's what you meant here or if you were just referring to the file naming scheme but you do still want to increase the default size. I'll admit that we might have a few more people using non-default sizes once we make it an initdb-option (though I'm tempted to suggest that one might be able to count them using their digits ;), but it seems very unlikely that they would do so to reduce it back down to 16MB, so I'm really not seeing the naming scheme change as a serious backwards-incompatibility change. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
On Tue, Mar 21, 2017 at 6:02 PM, David Steele wrote: > The biggest downside I can see is that this would change the naming scheme > for the default of 16MB compared to previous versions of Postgres. However, > for all other wal-seg-size values changes would need to be made anyway. I think changing the naming convention for 16MB WAL segments, which is still going to be what 99% of people use, is an awfully large compatibility break for an awfully marginal benefit. We've already created quite a few incompatibilities in this release, and I'm not entirely eager to just keep cranking them out at top speed. Where it's necessary to achieve forward progress in some area, sure, but this feels gratuitous to me. I agree that we might have picked your scheme if we were starting from scratch, but I have a hard time believing it's a good idea to do it now just because of this patch. Changing the WAL segment size has been supported for a long time, and I don't see the fact that it will now potentially be initdb-configurable rather than configure-configurable as a sufficient justification for whacking around the naming scheme -- even though I don't love the naming scheme we've got. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/21/17 15:22, Robert Haas wrote: > If you take the approach that Beena did, then you lose the > correspondence with LSNs, which is admittedly not great but there are > already helper functions available to deal with LSN -> filename > mappings and I assume those will continue to work. If you take the > opposite approach, then WAL filenames stop being consecutive, which > seems to me to be far worse in terms of user and tool confusion. Anecdotally, I think having the file numbers consecutive is very important, for debugging and feel-good factor. If you want to raise the segment size and preserve the LSN mapping, then pick 256 MB as your next size. I do think, however, that this has the potential of creating another ongoing source of confusion similar to oid vs relfilenode, where the numbers are often the same, except when they are not. With hindsight, I would have made the relfilenodes completely different from the OIDs. We chose to keep them (mostly) the same as the OIDs, for compatibility. We are seemingly making a similar kind of decision here. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/21/17 3:22 PM, Robert Haas wrote: On Tue, Mar 21, 2017 at 9:04 AM, Stephen Frost wrote: In short, I'm also concerned about this change to make WAL file names no longer match up with LSNs and also about the odd stepping that you get as a result of this change when it comes to WAL file names. OK, that's a bit surprising to me, but what do you want to do about it? If you take the approach that Beena did, then you lose the correspondence with LSNs, which is admittedly not great but there are already helper functions available to deal with LSN -> filename mappings and I assume those will continue to work. If you take the opposite approach, then WAL filenames stop being consecutive, which seems to me to be far worse in terms of user and tool confusion. They are already non-consecutive. Does 00010002 really logically follow 0001000100FF? Yeah, sort of, if you know the rules. Also note that, both currently and with the patch, you can also reduce the WAL segment size. David's proposed naming scheme doesn't handle that case, I think, and I think it would be all kinds of a bad idea to use one file-naming approach for segments < 16MB and a separate approach for segments > 16MB. That's not making anything easier for users or tool authors. I believe it does handle that case, actually. The minimum WAL segment size is 1MB so they would increase like: 00010001 000100010010 000100010020 ... 00010001FFF0 You could always calculate the next WAL file by adding (wal_seg_size_in_mb << 20) to the previous WAL file's LSN. This would even work for WAL segments > 4GB. Overall, I think this would make calculating WAL ranges simpler than it is now. The biggest downside I can see is that this would change the naming scheme for the default of 16MB compared to previous versions of Postgres. However, for all other wal-seg-size values changes would need to be made anyway. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On Tue, Mar 21, 2017 at 9:04 AM, Stephen Frost wrote: > In short, I'm also concerned about this change to make WAL file names no > longer match up with LSNs and also about the odd stepping that you get > as a result of this change when it comes to WAL file names. OK, that's a bit surprising to me, but what do you want to do about it? If you take the approach that Beena did, then you lose the correspondence with LSNs, which is admittedly not great but there are already helper functions available to deal with LSN -> filename mappings and I assume those will continue to work. If you take the opposite approach, then WAL filenames stop being consecutive, which seems to me to be far worse in terms of user and tool confusion. Also note that, both currently and with the patch, you can also reduce the WAL segment size. David's proposed naming scheme doesn't handle that case, I think, and I think it would be all kinds of a bad idea to use one file-naming approach for segments < 16MB and a separate approach for segments > 16MB. That's not making anything easier for users or tool authors. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/21/17 9:04 AM, Stephen Frost wrote: Robert, * Robert Haas (robertmh...@gmail.com) wrote: On Mon, Mar 20, 2017 at 7:23 PM, David Steele wrote: With 16MB WAL segments the filename neatly aligns with the LSN. For example: WAL FILE 0001000100FE = LSN 1/FE00 This no longer holds true with this patch. It is already possible to change the WAL segment size using the configure option --with-wal-segsize, and I think the patch should be consistent with whatever that existing option does. Considering how little usage that option has likely seen (I can't say I've ever run into usage of it so far...), I'm not really sure that it makes sense to treat it as final when we're talking about changing the default here. +1. A seldom-used compile-time option does not necessarily provide a good model for a user-facing feature. In short, I'm also concerned about this change to make WAL file names no longer match up with LSNs and also about the odd stepping that you get as a result of this change when it comes to WAL file names. I can't decide which way I like best. I like the filenames corresponding to LSNs as they do now, but it seems like a straight sequence might be easier to understand. Either way you need to know that different segment sizes mean different numbers of segments per lsn.xlogid. Even now the correspondence is a bit tenuous. I've always thought: 00010001000F Should be: 000100010F00 I'm really excited to (hopefully) have this feature in v10. I just want to be sure we discuss this as it will be a big change for tool authors and just about anybody who looks at WAL. Thanks, -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Mon, Mar 20, 2017 at 7:23 PM, David Steele wrote: > > With 16MB WAL segments the filename neatly aligns with the LSN. For > > example: > > > > WAL FILE 0001000100FE = LSN 1/FE00 > > > > This no longer holds true with this patch. > > It is already possible to change the WAL segment size using the > configure option --with-wal-segsize, and I think the patch should be > consistent with whatever that existing option does. Considering how little usage that option has likely seen (I can't say I've ever run into usage of it so far...), I'm not really sure that it makes sense to treat it as final when we're talking about changing the default here. In short, I'm also concerned about this change to make WAL file names no longer match up with LSNs and also about the odd stepping that you get as a result of this change when it comes to WAL file names. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] increasing the default WAL segment size
PFA an updated patch. This fixes an issue reported by Tushar internally. Since the patch changes the way min and max wal_size is stored internally from segment count to size in kb, it limited the maximum size of min and max_wal_size to 2GB in 32 bit systems. The minimum required segment is 2 and the minimum wal size is 1MB. So the lowest possible value of the min/max wal_size is 2MB. Hence, I have changed the internal representation to MB instead of KB so that we can increase the range. Also, for any wal-seg-size, it retains the default seg count as 5 and 64 for min and max wal_size. Consequently, the size of the variables increase automatically according to the wal_segment_size. This behaviour is similar to that of existing code. I have also run pg_indent on the files. On Mon, Mar 20, 2017 at 11:37 PM, Beena Emerson wrote: > Hello, > > PFA the updated patch. > > On Fri, Mar 17, 2017 at 6:40 AM, Robert Haas > wrote: > >> On Tue, Mar 14, 2017 at 1:44 AM, Beena Emerson >> wrote: >> > Attached is the updated patch. It fixes the issues and also updates few >> code >> > comments. >> >> I did an initial readthrough of this patch tonight just to get a >> feeling for what's going on. Based on that, here are a few review >> comments: >> >> The changes to pg_standby seem to completely break the logic to wait >> until the file has attained the correct size. I don't know how to >> salvage that logic off-hand, but just breaking it isn't acceptable. >> > > Using, the XLogLongPageHeader->xlp_seg_size, all the original checks have > been retained. This methid is even used in pg_waldump. > > > >> >> + Note that changing this value requires an initdb. >> >> Instead, maybe say something like "Note that this value is fixed for >> the lifetime of the database cluster." >> > > Corrected. > > >> >> -intmax_wal_size = 64;/* 1 GB */ >> -intmin_wal_size = 5;/* 80 MB */ >> +intwal_segment_size = 2048;/* 16 MB */ >> +intmax_wal_size = 1024 * 1024;/* 1 GB */ >> +intmin_wal_size = 80 * 1024;/* 80 MB */ >> >> If wal_segment_size is now measured in multiple of XLOG_BLCKSZ, then >> it's not the case that 2048 is always 16MB. If the other values are >> now measured in kB, perhaps rename the variables to add _kb, to avoid >> confusion with the way it used to work (and in general). The problem >> with leaving this as-is is that any existing references to >> max_wal_size in core or extension code will silently break; you want > > it to break in a noticeable way so that it gets fixed. >> >> > The wal_segment_size now is DEFAULT_XLOG_SEG_SIZE / XLOG_BLCKSZ; > min and max wal_size have _kb postfix > > >> + * UsableBytesInSegment: It is set in assign_wal_segment_size and stores >> the >> + * number of bytes in a WAL segment usable for WAL data. >> >> The comment doesn't need to say where it gets set, and it doesn't need >> to repeat the variable name. Just say "The number of bytes in a..." >> > > Done. > > >> >> +assign_wal_segment_size(int newval, void *extra) >> >> Why does a PGC_INTERNAL GUC need an assign hook? I think the GUC >> should only be there to expose the value; it shouldn't have >> calculation logic associated with it. >> > > Removed the function and called the functions in ReadControlFile. > > >> >> /* >> + * initdb passes the WAL segment size in an environment variable. We >> don't >> + * bother doing any sanity checking, we already check in initdb that >> the >> + * user gives a sane value. >> + */ >> +XLogSegSize = pg_atoi(getenv("XLOG_SEG_SIZE"), sizeof(uint32), 0); >> >> I think we should bother. I don't like the idea of the postmaster >> crashing in flames without so much as a reasonable error message if >> this parameter-passing mechanism goes wrong. >> > > I have rechecked the XLogSegSize. > > >> >> +{"wal-segsize", required_argument, NULL, 'Z'}, >> >> When adding an option with no documented short form, generally one >> picks a number that isn't a character for the value at the end. See >> pg_regress.c or initdb.c for examples. >> > > Done. > > >> >> + wal_segment_size = atoi(str_wal_segment_size); >> >> So, you're comfortable interpreting --wal-segsize=1TB or >> --wal-segsize=1GB as 1? Implicitly, 1MB? >> > > Imitating the current behaviour of config option --with-wal-segment, I > have used strtol to throw an error if the value is not only integers. > > >> >> + * ControlFile is not accessible here so use SHOW wal_segment_size >> command >> + * to set the XLogSegSize >> >> Breaks compatibility with pre-9.6 servers. >> > > Added check for the version, the SHOW command will be run only in v10 and > above. Previous versions do not need this. > > >> >> -- > Thank you, > > Beena Emerson > > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > -- Thank you, Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreS
Re: [HACKERS] increasing the default WAL segment size
On Mon, Mar 20, 2017 at 7:23 PM, David Steele wrote: > With 16MB WAL segments the filename neatly aligns with the LSN. For > example: > > WAL FILE 0001000100FE = LSN 1/FE00 > > This no longer holds true with this patch. It is already possible to change the WAL segment size using the configure option --with-wal-segsize, and I think the patch should be consistent with whatever that existing option does. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hi Beena, On 3/20/17 2:07 PM, Beena Emerson wrote: Added check for the version, the SHOW command will be run only in v10 and above. Previous versions do not need this. I've just had the chance to have a look at this patch. This is not a complete review, just a test of something I've been curious about. With 16MB WAL segments the filename neatly aligns with the LSN. For example: WAL FILE 0001000100FE = LSN 1/FE00 This no longer holds true with this patch. I created a cluster with 1GB segments and the sequence looked like: 00010001 00010002 00010003 00010001 Whereas I had expected something like: 00010040 00010080 000100CO 00010001 I scanned the thread but couldn't find any mention of this so I'm curious to know if it was considered? Was the prior correspondence merely serendipitous? I'm honestly not sure which way I think is better, but I know either way it represents a pretty big behavioral change for any tools looking at pg_wal or using the various helper functions. It's a probably a good thing to do at the same time as the rename, just want to make sure we are all aware of the changes. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hello, PFA the updated patch. On Fri, Mar 17, 2017 at 6:40 AM, Robert Haas wrote: > On Tue, Mar 14, 2017 at 1:44 AM, Beena Emerson > wrote: > > Attached is the updated patch. It fixes the issues and also updates few > code > > comments. > > I did an initial readthrough of this patch tonight just to get a > feeling for what's going on. Based on that, here are a few review > comments: > > The changes to pg_standby seem to completely break the logic to wait > until the file has attained the correct size. I don't know how to > salvage that logic off-hand, but just breaking it isn't acceptable. > Using, the XLogLongPageHeader->xlp_seg_size, all the original checks have been retained. This methid is even used in pg_waldump. > > + Note that changing this value requires an initdb. > > Instead, maybe say something like "Note that this value is fixed for > the lifetime of the database cluster." > Corrected. > > -intmax_wal_size = 64;/* 1 GB */ > -intmin_wal_size = 5;/* 80 MB */ > +intwal_segment_size = 2048;/* 16 MB */ > +intmax_wal_size = 1024 * 1024;/* 1 GB */ > +intmin_wal_size = 80 * 1024;/* 80 MB */ > > If wal_segment_size is now measured in multiple of XLOG_BLCKSZ, then > it's not the case that 2048 is always 16MB. If the other values are > now measured in kB, perhaps rename the variables to add _kb, to avoid > confusion with the way it used to work (and in general). The problem > with leaving this as-is is that any existing references to > max_wal_size in core or extension code will silently break; you want it to break in a noticeable way so that it gets fixed. > > The wal_segment_size now is DEFAULT_XLOG_SEG_SIZE / XLOG_BLCKSZ; min and max wal_size have _kb postfix > + * UsableBytesInSegment: It is set in assign_wal_segment_size and stores > the > + * number of bytes in a WAL segment usable for WAL data. > > The comment doesn't need to say where it gets set, and it doesn't need > to repeat the variable name. Just say "The number of bytes in a..." > Done. > > +assign_wal_segment_size(int newval, void *extra) > > Why does a PGC_INTERNAL GUC need an assign hook? I think the GUC > should only be there to expose the value; it shouldn't have > calculation logic associated with it. > Removed the function and called the functions in ReadControlFile. > > /* > + * initdb passes the WAL segment size in an environment variable. We > don't > + * bother doing any sanity checking, we already check in initdb that > the > + * user gives a sane value. > + */ > +XLogSegSize = pg_atoi(getenv("XLOG_SEG_SIZE"), sizeof(uint32), 0); > > I think we should bother. I don't like the idea of the postmaster > crashing in flames without so much as a reasonable error message if > this parameter-passing mechanism goes wrong. > I have rechecked the XLogSegSize. > > +{"wal-segsize", required_argument, NULL, 'Z'}, > > When adding an option with no documented short form, generally one > picks a number that isn't a character for the value at the end. See > pg_regress.c or initdb.c for examples. > Done. > > + wal_segment_size = atoi(str_wal_segment_size); > > So, you're comfortable interpreting --wal-segsize=1TB or > --wal-segsize=1GB as 1? Implicitly, 1MB? > Imitating the current behaviour of config option --with-wal-segment, I have used strtol to throw an error if the value is not only integers. > > + * ControlFile is not accessible here so use SHOW wal_segment_size command > + * to set the XLogSegSize > > Breaks compatibility with pre-9.6 servers. > Added check for the version, the SHOW command will be run only in v10 and above. Previous versions do not need this. > > -- Thank you, Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company 02-initdb-walsegsize-v5.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/17/17 4:56 PM, Tom Lane wrote: Peter Eisentraut writes: On 3/17/17 16:20, Peter Eisentraut wrote: I think we would have to extend restore_command with an additional placeholder that communicates the segment size, and add a new pg_standby option to accept that size somehow. And specifying the size would have to be mandatory, for complete robustness. Urgh. Another way would be to name the WAL files in a more self-describing way. For example, instead of Actually, if you're content with having tools obtain this info by examining the WAL files, we shouldn't need to muck with the WAL naming convention (which seems like it would be a horrid mess, anyway --- too much outside code knows that). Tools could get the segment size out of XLogLongPageHeaderData.xlp_seg_size in the first page of the segment. regards, tom lane +1 -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On Fri, Mar 17, 2017 at 6:11 PM, Peter Eisentraut wrote: > On 3/17/17 16:56, Tom Lane wrote: >> Tools could get the segment size out of >> XLogLongPageHeaderData.xlp_seg_size in the first page of the segment. > > OK, then pg_standby would have to wait until the file is at least > XLOG_BLCKSZ, then look inside and get the expected final size. A bit > more complicated than now, but seems doable. Yeah, that doesn't sound too bad. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/17/17 16:56, Tom Lane wrote: > Tools could get the segment size out of > XLogLongPageHeaderData.xlp_seg_size in the first page of the segment. OK, then pg_standby would have to wait until the file is at least XLOG_BLCKSZ, then look inside and get the expected final size. A bit more complicated than now, but seems doable. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Peter Eisentraut writes: > On 3/17/17 16:20, Peter Eisentraut wrote: >> I think we would have to extend restore_command with an additional >> placeholder that communicates the segment size, and add a new pg_standby >> option to accept that size somehow. And specifying the size would have >> to be mandatory, for complete robustness. Urgh. > Another way would be to name the WAL files in a more self-describing > way. For example, instead of Actually, if you're content with having tools obtain this info by examining the WAL files, we shouldn't need to muck with the WAL naming convention (which seems like it would be a horrid mess, anyway --- too much outside code knows that). Tools could get the segment size out of XLogLongPageHeaderData.xlp_seg_size in the first page of the segment. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/17/17 16:20, Peter Eisentraut wrote: > On 3/16/17 21:10, Robert Haas wrote: >> The changes to pg_standby seem to completely break the logic to wait >> until the file has attained the correct size. I don't know how to >> salvage that logic off-hand, but just breaking it isn't acceptable. > > I think we would have to extend restore_command with an additional > placeholder that communicates the segment size, and add a new pg_standby > option to accept that size somehow. And specifying the size would have > to be mandatory, for complete robustness. Urgh. Another way would be to name the WAL files in a more self-describing way. For example, instead of 00010001 00010002 00010003 name them (for 16 MB) 000101 000102 000103 Then, pg_standby and similar tools can compute the expected file size from the file name length: 16 ^ (24 - fnamelen) However, that way you can't actually support 64 MB segments. The next jump up would have to be 256 MB (unless you want to go to a base other than 16). -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 3/16/17 21:10, Robert Haas wrote: > The changes to pg_standby seem to completely break the logic to wait > until the file has attained the correct size. I don't know how to > salvage that logic off-hand, but just breaking it isn't acceptable. I think we would have to extend restore_command with an additional placeholder that communicates the segment size, and add a new pg_standby option to accept that size somehow. And specifying the size would have to be mandatory, for complete robustness. Urgh. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On Fri, Mar 17, 2017 at 2:08 AM, Beena Emerson wrote: > The option was intended to only accept values in MB as the original config > --with-wal-segsize option, unfortunately, the patch does not throw error as > in the config option when the units are specified. Yeah, you want to use strtol(), so that you can throw an error if *endptr isn't '\0'. > Error with config option --with-wal-segsize=1MB > configure: error: Invalid WAL segment size. Allowed values are > 1,2,4,8,16,32,64. > > Should we imitate this behaviour and just add a check to see if it only > contains numbers? or would it be better to allow the use of the units and > make appropriate code changes? I think just restricting it to numeric values would be fine. If somebody wants to do the work to make it accept a unit suffix, I don't have a problem with that, but it doesn't seem like a must-have. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
Hello, Thank you for your comments, I will post an updated patch soon. On Fri, Mar 17, 2017 at 6:40 AM, Robert Haas wrote: > > > +assign_wal_segment_size(int newval, void *extra) > > Why does a PGC_INTERNAL GUC need an assign hook? I think the GUC > should only be there to expose the value; it shouldn't have > calculation logic associated with it. > The Checkpoint Segments and the UsableBytesInSegment had to be changed depending on the value of wal_segment_size set during initdb. I will figure out another way to assign these values without using this assign_hook. > + wal_segment_size = atoi(str_wal_segment_size); > > So, you're comfortable interpreting --wal-segsize=1TB or > --wal-segsize=1GB as 1? Implicitly, 1MB? > The option was intended to only accept values in MB as the original config --with-wal-segsize option, unfortunately, the patch does not throw error as in the config option when the units are specified. Error with config option --with-wal-segsize=1MB configure: error: Invalid WAL segment size. Allowed values are 1,2,4,8,16,32,64. Should we imitate this behaviour and just add a check to see if it only contains numbers? or would it be better to allow the use of the units and make appropriate code changes? -- Thank you, Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] increasing the default WAL segment size
On Tue, Mar 14, 2017 at 1:44 AM, Beena Emerson wrote: > Attached is the updated patch. It fixes the issues and also updates few code > comments. I did an initial readthrough of this patch tonight just to get a feeling for what's going on. Based on that, here are a few review comments: The changes to pg_standby seem to completely break the logic to wait until the file has attained the correct size. I don't know how to salvage that logic off-hand, but just breaking it isn't acceptable. + Note that changing this value requires an initdb. Instead, maybe say something like "Note that this value is fixed for the lifetime of the database cluster." -intmax_wal_size = 64;/* 1 GB */ -intmin_wal_size = 5;/* 80 MB */ +intwal_segment_size = 2048;/* 16 MB */ +intmax_wal_size = 1024 * 1024;/* 1 GB */ +intmin_wal_size = 80 * 1024;/* 80 MB */ If wal_segment_size is now measured in multiple of XLOG_BLCKSZ, then it's not the case that 2048 is always 16MB. If the other values are now measured in kB, perhaps rename the variables to add _kb, to avoid confusion with the way it used to work (and in general). The problem with leaving this as-is is that any existing references to max_wal_size in core or extension code will silently break; you want it to break in a noticeable way so that it gets fixed. + * UsableBytesInSegment: It is set in assign_wal_segment_size and stores the + * number of bytes in a WAL segment usable for WAL data. The comment doesn't need to say where it gets set, and it doesn't need to repeat the variable name. Just say "The number of bytes in a..." +assign_wal_segment_size(int newval, void *extra) Why does a PGC_INTERNAL GUC need an assign hook? I think the GUC should only be there to expose the value; it shouldn't have calculation logic associated with it. /* + * initdb passes the WAL segment size in an environment variable. We don't + * bother doing any sanity checking, we already check in initdb that the + * user gives a sane value. + */ +XLogSegSize = pg_atoi(getenv("XLOG_SEG_SIZE"), sizeof(uint32), 0); I think we should bother. I don't like the idea of the postmaster crashing in flames without so much as a reasonable error message if this parameter-passing mechanism goes wrong. +{"wal-segsize", required_argument, NULL, 'Z'}, When adding an option with no documented short form, generally one picks a number that isn't a character for the value at the end. See pg_regress.c or initdb.c for examples. + wal_segment_size = atoi(str_wal_segment_size); So, you're comfortable interpreting --wal-segsize=1TB or --wal-segsize=1GB as 1? Implicitly, 1MB? + * ControlFile is not accessible here so use SHOW wal_segment_size command + * to set the XLogSegSize Breaks compatibility with pre-9.6 servers. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers