[PATCH] Performance Improvement For Copy From Binary Files
Hi Hackers,

For COPY FROM with binary files, the following information exists for each tuple/row:

1. field count (the number of columns)
2. for every field, the field size (column data length)
3. field data of that size (the actual column data)

Currently, all of the above data is read directly from the file with fread(), and this happens for every tuple/row. One observation is that, in the total execution time of a COPY FROM a binary file, the fread() calls take up to 20% of the time, and the fread() call count is also very high. For instance, with a dataset of size 5.3GB, 10 million tuples with 10 columns:

total exec time (sec) | total time in fread() (sec) | fread() call count
101.193 | *21.33* | 21005
101.345 | *21.436* | 21005

The total time taken by fread() and the corresponding call count may increase further with a larger number of columns, for instance 1000.

One solution to this problem is to read data from the binary file in RAW_BUF_SIZE (64KB) chunks, to avoid repeatedly calling fread() (and thus possibly avoiding a few disk IOs). This is similar to the approach already followed for csv/text files. Attached is a patch implementing the above solution for binary-format files. Below is the improvement gained:

total exec time (sec) | total time in fread() (sec) | fread() call count
75.757 | *2.73* | 160884
75.351 | *2.742* | 160884

*Execution is 1.36X faster, fread() time is reduced by 87%, and the fread() call count is reduced by 99%.*

I request the community to take this patch for review if the approach and improvement seem beneficial. Any suggestions for further improvement are most welcome. Also attached is the config file used for testing the above use case.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Postgres configuration used for the above testing:
shared_buffers = 40GB
max_worker_processes = 32
max_parallel_maintenance_workers = 24
max_parallel_workers = 32
synchronous_commit = off
checkpoint_timeout = 1d
max_wal_size = 24GB
min_wal_size = 15GB
autovacuum = off

System configuration:
RAM: 503GB
Disk Type: SSD
Disk Size: 1.6TB
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 128
On-line CPU(s) list: 0-127
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 8
NUMA node(s): 8
Vendor ID: GenuineIntel
CPU family: 6
Model: 47
Model name: Intel(R) Xeon(R) CPU E7-8830 @ 2.13GHz
Stepping: 2
CPU MHz: 1064.000
CPU max MHz: 2129.
CPU min MHz: 1064.
BogoMIPS: 4266.62
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 24576K

v1-0001-Performance-Improvement-For-Copy-From-Binary-File.patch
Description: Binary data
Re: POC: postgres_fdw insert batching
Hi Tomas,

On Mon, Jun 29, 2020 at 12:10 AM Tomas Vondra wrote:
>
> Hi,
>
> One of the issues I'm fairly regularly reminded of by users/customers is
> that inserting into tables sharded using FDWs is rather slow. We do
> even get it reported on pgsql-bugs from time to time [1].
>
> Some of the slowness / overhead is expected, due to the latency between
> machines in the sharded setup. Even just 1ms latency will make it way
> more expensive than a single instance.
>
> But let's do a simple experiment, comparing a hash-partitioned table with
> regular partitions, and one with FDW partitions in the same instance.
> Scripts to run this are attached. The duration of inserting 1M rows to
> this table (average of 10 runs on my laptop) looks like this:
>
>    regular: 2872 ms
>    FDW: 64454 ms
>
> Yep, it's ~20x slower. On a setup with ping latency well below 0.05ms.
> Imagine how it would look on sharded setups with 0.1ms or 1ms latency,
> which is probably where most single-DC clusters are :-(
>
> Now, the primary reason why the performance degrades like this is that
> while FDW has batching for SELECT queries (i.e. we read larger chunks of
> data from the cursors), we don't have that for INSERTs (or other DML).
> Every time you insert a row, it has to go all the way down into the
> partition synchronously.
>
> For some use cases this may be reduced by having many independent
> connections from different users, so the per-user latency is higher but
> acceptable. But if you need to import larger amounts of data (say, a CSV
> file for analytics, ...) this may not work.
>
> Some time ago I wrote an ugly PoC adding batching, just to see how far
> it would get us, and it seems quite promising - results for the same
> INSERT benchmarks look like this:
>
>    FDW batching: 4584 ms
>
> So, rather nice improvement, I'd say ...

Very nice indeed.

> Before I spend more time hacking on this, I have a couple of open
> questions about the design, restrictions etc.
I think you may want to take a look at this recent proposal by Andrey Lepikhov:

* [POC] Fast COPY FROM command for the table with foreign partitions
* https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com
Re: Resetting spilled txn statistics in pg_stat_replication
On Fri, 26 Jun 2020 at 17:53, Amit Kapila wrote:
>
> On Fri, Jun 26, 2020 at 11:31 AM Masahiko Sawada wrote:
> >
> > On Thu, 25 Jun 2020 at 19:35, Amit Kapila wrote:
> > >
> > > On Tue, Jun 23, 2020 at 6:39 PM Amit Kapila wrote:
> > > >
> > > > On Tue, Jun 23, 2020 at 3:48 PM Tomas Vondra wrote:
> > > > >
> > > > > On Tue, Jun 23, 2020 at 10:58:18AM +0530, Amit Kapila wrote:
> > > > > >On Tue, Jun 23, 2020 at 9:32 AM Masahiko Sawada wrote:
> > > > > >>
> > > > > >> On Sun, 21 Jun 2020 at 06:57, Tomas Vondra wrote:
> > > > > >> >
> > > > > >> > >What if the decoding has been performed by multiple backends
> > > > > >> > >using the same slot? In that case, it will be difficult to make
> > > > > >> > >the judgment for the value of logical_decoding_work_mem based
> > > > > >> > >on stats. It would make sense if we provide a way to set
> > > > > >> > >logical_decoding_work_mem for a slot but not sure if that is
> > > > > >> > >better than what we have now.
> > > > > >> >
> > > > > >> I thought that the stats are relevant to what
> > > > > >> logical_decoding_work_mem value was, but not to who performed
> > > > > >> logical decoding. So even if multiple backends perform logical
> > > > > >> decoding using the same slot, the user can directly use stats as
> > > > > >> long as the logical_decoding_work_mem value doesn't change.
> > > > > >>
> > >
> > > Today, I thought about it again, and if we consider the point that
> > > the logical_decoding_work_mem value doesn't change much, then having
> > > the stats at slot level would also allow computing
> > > logical_decoding_work_mem based on stats. Do you think it is a
> > > reasonable assumption that users won't change
> > > logical_decoding_work_mem for different processes (WALSender, etc.)?
> > FWIW, if we use logical_decoding_work_mem as a threshold for starting
> > to send changes to a subscriber, I think there might be use cases
> > where the user wants to set different logical_decoding_work_mem values
> > for different wal senders. For example, setting a lower value to
> > minimize the latency of synchronous logical replication to a near
> > site, whereas setting a large value minimizes the amount of data sent
> > to a far site.
>
> How can setting a large value minimize the amount of data sent?
> One possibility is if there are a lot of transaction aborts and
> transactions are not large enough that they cross the
> logical_decoding_work_mem threshold, but such cases shouldn't be many.

Yeah, this is what I meant.

I agree that it would not be a common case that the user sets different values for different processes. Based on that assumption, I also think having the stats at slot level is a good idea. But I might want to have the reset function.

Regards,

--
Masahiko Sawada
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
[Bug fix] There is a case where the archive_timeout parameter is ignored after recovery
Hi,

I found a bug related to the archive_timeout parameter: there is a case where archive_timeout is ignored after recovery.

[Problem]
When the value of archive_timeout is smaller than that of checkpoint_timeout and recovery runs, archive_timeout is ignored for the first WAL archiving. Once a WAL segment has been archived, archive_timeout seems to be honored after that. I attached a simple script for reproducing this problem on version 12. I also confirmed it on PostgreSQL 10, 11, and 12; I think the other supported versions have this problem as well.

[Investigation]
In CheckpointerMain(), the time (cur_timeout) to wait on WaitLatch is calculated:

-
now = (pg_time_t) time(NULL);
elapsed_secs = now - last_checkpoint_time;
if (elapsed_secs >= CheckPointTimeout)
    continue;            /* no sleep for us ... */
cur_timeout = CheckPointTimeout - elapsed_secs;

if (XLogArchiveTimeout > 0 && !RecoveryInProgress())
{
    elapsed_secs = now - last_xlog_switch_time;
    if (elapsed_secs >= XLogArchiveTimeout)
        continue;        /* no sleep for us ... */
    cur_timeout = Min(cur_timeout, XLogArchiveTimeout - elapsed_secs);
}

(void) WaitLatch(MyLatch,
                 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                 cur_timeout * 1000L /* convert to ms */ ,
                 WAIT_EVENT_CHECKPOINTER_MAIN);
-

Currently, cur_timeout is set according to only checkpoint_timeout while recovery is in progress. Even during recovery, cur_timeout should be calculated from archive_timeout as well as checkpoint_timeout, I think. I attached a patch to solve this problem.

Regards,
Daisuke Higuchi

archive_timeout_test.sh
Description: archive_timeout_test.sh

archive_timeout.patch
Description: archive_timeout.patch
Re: Fwd: PostgreSQL: WolfSSL support
Hi Jonah, On Sat, Jun 27, 2020 at 12:35 PM Jonah H. Harris wrote: > > Somewhere, I recall seeing an open-source OpenSSL compatibility wrapper for > WolfSSL. Assuming that still exists, this patch seems entirely unnecessary. The patch uses the OpenSSL compatibility layer. Kind regards Felix Lechner
Re: Fwd: PostgreSQL: WolfSSL support
Hi Tom, On Sat, Jun 27, 2020 at 11:52 AM Tom Lane wrote: > > The configure > script could add -I/usr/include/wolfssl (or wherever those files > are) to CPPFLAGS instead of touching all those #includes. That does not work well when OpenSSL's development files are installed. I did not think a segmentation fault was a good way to make friends. > However, as long as > it's available on GPL terms, I don't see a problem with it > [wolfSSL] being one alternative. A minimal patch against -13 is on its way. Kind regards Felix Lechner
Re: Fwd: PostgreSQL: WolfSSL support
Hi Tom,

On Sat, Jun 27, 2020 at 7:56 AM Tom Lane wrote:
>
> However, judging from the caveats mentioned in the initial message,
> my inclination would be to wait awhile for wolfSSL to mature.

Please have a closer look. The library has been around since 2004 and is popular in embedded systems. (It was previously known as CyaSSL.) If you bought a car or an appliance in the past ten years, you may be using it already.

wolfSSL's original claim to fame was that MySQL relied on it (I think Oracle changed that). MariaDB still bundles an older, captive version. The software is mature and widely deployed.

Kind regards,
Felix Lechner
Re: PostgreSQL: WolfSSL support
Hi Greg, On Sun, Jun 28, 2020 at 6:40 AM Greg Stark wrote: > > I'm more interested in a library like Amazon's Does S2N support TLS 1.3? https://github.com/awslabs/s2n/issues/388 Kind regards Felix Lechner
Re: pgsql: Enable Unix-domain sockets support on Windows
On Sun, Jun 28, 2020 at 2:03 PM Peter Eisentraut wrote:
>
> On 2020-06-27 13:57, Amit Kapila wrote:
> > BTW, in which format does the path need to be specified for
> > unix_socket_directories? I tried with '/c/tmp', 'c:/tmp', 'tmp' but
> > nothing seems to be working; it gives me errors like: "could not
> > create lock file "/c/tmp/.s.PGSQL.5432.lock": No such file or
> > directory" on server start. I am trying this on Win7 just to check
> > what the behavior of this feature is on it.
>
> Hmm, the only thing I remember about this now is that you need to use
> native Windows paths, meaning you can't just use /tmp under MSYS, but it
> needs to be something like C:\something.
>

I have tried giving something like that. After setting unix_socket_directories = 'C:\\akapila', I get the below errors on server start:

2020-06-29 08:19:13.174 IST [4460] LOG: could not create Unix socket for address "C:/akapila/.s.PGSQL.5432": An address incompatible with the requested protocol was used.
2020-06-29 08:19:13.205 IST [4460] WARNING: could not create Unix-domain socket in directory "C:/akapila"
2020-06-29 08:19:13.205 IST [4460] FATAL: could not create any Unix-domain sockets
2020-06-29 08:19:13.221 IST [4460] LOG: database system is shut down

After setting unix_socket_directories = 'C:\akapila', I get the below errors on server start:

2020-06-29 08:24:11.861 IST [4808] FATAL: could not create lock file "C:akapila/.s.PGSQL.5432.lock": No such file or directory
2020-06-29 08:24:11.877 IST [4808] LOG: database system is shut down

> But the error you have there is not even about the socket file but
> about the lock file, which is a normal file, so if that goes wrong, it
> might be an unrelated problem.
>

Yeah, but as I am trying this on a Win7 machine, I was expecting an error similar to what you were saying: "unsupported address family ...". It seems this error occurred after passing that phase.
I am not sure what is going on here, and maybe it is not important because we don't support this feature on Win7, but an appropriate error message would have been good.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
estimation problems for DISTINCT ON with FDW
If I use the attached sql file to set up the database with a loop-back postgres_fdw, and then turn on use_remote_estimate for this query:

select distinct on (id) id, z from fgn.priority order by id, priority desc, z;

it issues two queries for the foreign estimate, one with a sort and one without:

EXPLAIN SELECT id, priority, z FROM public.priority
EXPLAIN SELECT id, priority, z FROM public.priority ORDER BY id ASC NULLS LAST, priority DESC NULLS FIRST, z ASC NULLS LAST

It doesn't cost out the plan of pushing the DISTINCT ON down to the foreign side, which is probably the best way to run the query. I guess it makes sense that the FDW machinery in general doesn't want to try to push down a PostgreSQL-specific construct. But much worse than that, it horribly misestimates the number of unique rows it will get back, having never asked the remote side for an estimate of that:

Result (cost=100.51..88635.90 rows=1 width=16)
  -> Unique (cost=100.51..88635.90 rows=1 width=16)
       -> Foreign Scan on priority (cost=100.51..86135.90 rows=100 width=16)

Where does it come up with the idea that these 1,000,000 rows will DISTINCT/Unique down to just 1 row? I can't find the place in the code where that happens. I suspect it is happening somewhere in the core code based on data fed into it by postgres_fdw, not in postgres_fdw itself. This leads to horrible plans if the DISTINCT ON is actually in a subquery which is joined to other tables, for example.

If you don't use remote estimates, it at least comes up with a roughly sane estimate of 200 distinct rows, which is enough to inhibit selection of the worst plans. Why does an uninformative remote estimate do so much worse than no remote estimate at all?

Of course I could just disable remote estimates for this table, but then other queries that use the table without DISTINCT ON suffer. Another solution is to ANALYZE the foreign table, but that opens up a can of worms of its own.

I see this behavior in all supported or in-development versions.
Cheers,

Jeff

create table priority as
    select floor(random()*100)::int as id,
           floor(random()*10)::int as priority,
           random() as z
    from generate_series(1,100) f(id)
    order by random();
vacuum ANALYZE priority;
create index on priority (id, priority, z);
create extension postgres_fdw;
create server fdw foreign data wrapper postgres_fdw;
create user MAPPING FOR CURRENT_USER SERVER fdw;
create schema fgn;
import foreign schema public from server fdw into fgn;
explain select distinct on (id) id, z from fgn.priority order by id, priority desc, z;
alter foreign table fgn.priority options (use_remote_estimate 'true');
explain select distinct on (id) id, z from fgn.priority order by id, priority desc, z;
Re: Creating a function for exposing memory usage of backend process
On 2020-06-20 03:11, Robert Haas wrote:
> On Wed, Jun 17, 2020 at 11:56 PM Fujii Masao wrote:
> > As a first step, to deal with (3) I'd like to add
> > pg_stat_get_backend_memory_context() whose target is limited to the
> > local backend process.
>
> +1

+1 from me, too.

Attached is a patch that adds a function exposing the memory usage of the local backend. It's almost the same as pg_cheat_funcs's pg_stat_get_memory_context(). I've also added the MemoryContext identifier, because it seems useful for distinguishing memory contexts of the same kind. For example, when there are many prepared statements, we can distinguish them using identifiers, which show the cached query:

=# SELECT name, ident FROM pg_stat_get_memory_contexts() WHERE name = 'CachedPlanSource';
       name       |             ident
------------------+--------------------------------
 CachedPlanSource | PREPARE q1(text) AS SELECT ..;
 CachedPlanSource | PREPARE q2(text) AS SELECT ..;
(2 rows)

> Something that exposed this via shared memory would be quite useful,
> and there are other things we'd like to expose similarly, such as the
> plan(s) from the running queries, or even just the untruncated query
> string(s). I'd like to have a good infrastructure for that sort of
> thing, but I think it's quite tricky to do properly. You need a
> variable-size chunk of shared memory, so probably it has to use DSM
> somehow, and you need to update it as things change, but if you update
> it too frequently performance will stink. If you ping processes to do
> the updates, how do you know when they've completed them, and what
> happens if they don't respond in a timely fashion? These are probably
> all solvable problems, but I don't think they are very easy ones.

Thanks for your comments! It does seem hard, as you pointed out. I'm going to consider it along with the advice from Fujii and Kasahara.

Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

From 055af903a3dbf146d97dd3fb01a6a7d3d3bd2ae0 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi
Date: Mon, 29 Jun 2020 07:48:49 +0900
Subject: [PATCH] Add a function exposing memory usage of local backend.
Backend processes sometimes use a lot of memory for various reasons, such as caches, prepared statements, and cursors. Previously, the only way to examine the memory usage of backend processes was to attach a debugger and call MemoryContextStats(), which was not so convenient in some cases. This patch implements a new SQL-callable function, pg_stat_get_memory_contexts(), which exposes the memory usage of the local backend.
---
 doc/src/sgml/monitoring.sgml        | 14 +
 src/backend/postmaster/pgstat.c     | 80 +
 src/backend/utils/adt/pgstatfuncs.c | 45
 src/include/catalog/pg_proc.dat     |  9
 src/include/pgstat.h                |  6 ++-
 5 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index dfa9d0d641..74c8663981 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4617,6 +4617,20 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
+
+
+
+ pg_stat_get_memory_contexts
+
+pg_stat_get_memory_contexts ()
+setof record
+
+
+Returns records of information about all memory contexts of the
+local backend.
+
+
+
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index c022597bc0..140f111654 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -62,6 +62,7 @@
 #include "storage/procsignal.h"
 #include "storage/sinvaladt.h"
 #include "utils/ascii.h"
+#include "utils/builtins.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
@@ -167,6 +168,13 @@ static const char *const slru_names[] = {
  */
 static PgStat_MsgSLRU SLRUStats[SLRU_NUM_ELEMENTS];

+/* --
+ * The max bytes for showing identifiers of MemoryContext.
+ * --
+ */
+#define MEMORY_CONTEXT_IDENT_SIZE 1024
+
+
 /* --
  * Local data
  * --
@@ -2691,6 +2699,78 @@ pgstat_fetch_slru(void)
 	return slruStats;
 }

+void
+pgstat_put_memory_contexts_tuplestore(Tuplestorestate *tupstore,
+				TupleDesc tupdesc, MemoryContext context,
+				MemoryContext parent, int level)
+{
+#define PG_STAT_GET_MEMORY_CONTEXT_COLS 9
+	Datum		values[PG_STAT_GET_MEMORY_CONTEXT_COLS];
+	bool		nulls[PG_STAT_GET_MEMORY_CONTEXT_COLS];
+	MemoryContextCounters stat;
+	MemoryContext child;
+	const char *name = context->name;
+	const char *ident = context->ident;
+
+	if (context == NULL)
+		return;
+
+	/*
+	 * It seems preferable to label dynahash contexts with just the hash table
+	 * name. Those are already unique enough, so the "dynahash" part isn't
+	 * very helpful, and this way is more consistent with pre-v11 practice.
+	 */
+	if (ident && strcmp(name, "dynahash")
Re: Default setting for enable_hashagg_disk
On Sat, Jun 27, 2020 at 3:41 AM Tomas Vondra wrote: > Well, there are multiple ideas discussed in this (sub)thread, one of > them being a per-query memory limit. That requires decisions how much > memory should different nodes get, which I think would need to be > cost-based. A design like that probably makes sense. But it's way out of scope for Postgres 13, and not something that should be discussed further on this thread IMV. > Of course, a simpler scheme like this would not require that. And maybe > introducing hash_mem is a good idea - I'm not particularly opposed to > that, actually. But I think we should not introduce separate memory > limits for each node type, which was also mentioned earlier. I had imagined that hash_mem would apply to hash join and hash aggregate only. A GUC that either represents a multiple of work_mem, or an absolute work_mem-style KB value. > The problem of course is that hash_mem does not really solve the issue > discussed at the beginning of this thread, i.e. regressions due to > underestimates and unexpected spilling at execution time. Like Andres, I suspect that that's a smaller problem in practice. A hash aggregate that spills often has performance characteristics somewhat like a group aggregate + sort, anyway. I'm worried about cases where an *in-memory* hash aggregate is naturally far far faster than other strategies, and yet we can't use it -- despite the fact that Postgres 12 could "safely" do so. (It probably doesn't matter whether the slow plan that you get in Postgres 13 is a hash aggregate that spills, or something else -- this is not really a costing problem.) Besides, hash_mem *can* solve that problem to some extent. Other cases (cases where the estimates are so bad that hash_mem won't help) seem like less of a concern to me. To some extent, that's the price you pay to avoid the risk of an OOM. 
> The thread is getting a rather confusing mix of proposals on how to fix
> that for v13 and proposals on how to improve our configuration of
> memory limits :-(

As I said to Amit in my last message, I think that all of the ideas that are worth pursuing involve giving hash aggregate nodes license to use more memory than other nodes. One variant involves doing so only at execution time, while the hash_mem idea involves formalizing and documenting that hash-based nodes are special -- and taking that into account during both planning and execution.

> Interesting. What is not entirely clear to me is how these databases
> decide how much each node should get during planning. With the separate
> work_mem-like settings it's fairly obvious, but how do they do that
> with the global limit (either per-instance or per-query)?

I don't know, but that seems like a much more advanced way of approaching the problem. It isn't in scope here.

Perhaps I'm not considering some unintended consequence of the planner giving hash-based nodes extra memory "for free" in the common case where hash_mem exceeds work_mem (by 2x, say). But my guess is that that won't be a significant problem that biases the planner in some obviously undesirable way.

--
Peter Geoghegan
Re: ModifyTable overheads in generic plans
On Sat, 27 Jun 2020 at 00:36, Amit Langote wrote:
> 2. ExecCheckRTPerms(): checks permissions of *all* partitions before
> executing the plan tree, but maybe it's okay to check only the ones
> that will be accessed

I don't think it needs to be quite as complex as that. expand_single_inheritance_child will set the RangeTblEntry.requiredPerms to 0, so we never need to check permissions on a partition. The overhead of permission checking when there are many partitions is just down to the fact that ExecCheckRTPerms() loops over the entire rangetable and calls ExecCheckRTEPerms for each one. ExecCheckRTEPerms() does have very little work to do when requiredPerms is 0, but the loop itself and the function call overhead show up when you remove the other bottlenecks.

I have a patch somewhere that just had the planner add the RTindexes with a non-zero requiredPerms and set that in the plan so that ExecCheckRTPerms could just look at the ones that actually needed something checked. There's a slight disadvantage there in that, for queries to non-partitioned tables, we need to build a Bitmapset that has all items from the rangetable. That's likely a small overhead, but not free, so perhaps there is a better way.

David
Re: Fwd: PostgreSQL: WolfSSL support
On Sun, Jun 28, 2020 at 10:18:12AM +0200, Peter Eisentraut wrote:
> We have added support for allegedly-OpenSSL compatible libraries such as
> LibreSSL before, so some tweaks for wolfSSL would seem acceptable.
> However, I doubt we are going to backpatch them, so unless you want to
> take responsibility for that as a packager, it's not really going to
> help anyone soon.

That's a new feature to me.

> And OpenSSL 3.0.0 will have a new license, so for the next PostgreSQL
> release, this problem might be gone.

And there is this part too to consider, but I am no lawyer.

@@ -131,11 +131,11 @@ typedef union {
 #ifdef WOLFSSL_SHA3
     wc_Sha3 sha3;
 #endif
-} Hash;
+} WolfSSLHash;
[...]
 #endif
 #if !defined(XVALIDATE_DATE) && !defined(HAVE_VALIDATE_DATE)
 #define USE_WOLF_VALIDDATE
-#define XVALIDATE_DATE(d, f, t) ValidateDate((d), (f), (t))
+#define XVALIDATE_DATE(d, f, t) WolfSSLValidateDate((d), (f), (t))
 #endif

Looking at the patches, it seems to me that the part applying only to WolfSSL should be done anyway, at least for the Hash part which is a rather generic name, and that it may be better to do something as well on the Postgres part for the same plan node to avoid conflicts, but that's something old enough that it could vote (1054097). ValidateTime() is present in the Postgres tree since f901bb5, but it is always annoying to break stuff that could be used by external plugins...

Regarding the Postgres part of the WIP, the hard part is that we need more thinking about the refactoring bits, so that people compiling Postgres can choose between OpenSSL or something else. And as Tom mentioned upthread there is no need for that:

-#include
-#include
-#include
+#include
+#include
+#include
+#include

./configure should just append the correct path with -I.

- my_bio_methods->bread = my_sock_read;
- my_bio_methods->bwrite = my_sock_write;
+ my_bio_methods->readCb = my_sock_read;
+ my_bio_methods->writeCb = my_sock_write;

These parts could also be consolidated between OpenSSL and WolfSSL?

- dh = PEM_read_DHparams(fp, NULL, NULL, NULL);
  FreeFile(fp);
+ return NULL;

This part is not acceptable as-is. As a proof of concept, that's fine of course.
--
Michael

signature.asc
Description: PGP signature
Re: ModifyTable overheads in generic plans
On Fri, Jun 26, 2020 at 9:36 PM Amit Langote wrote: > I would like to discuss a refactoring patch that builds on top of the > patches at [1] to address $subject. I've added this to the next CF: https://commitfest.postgresql.org/28/2621/ -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
Re: Default setting for enable_hashagg_disk
On Sat, Jun 27, 2020 at 3:00 AM Amit Kapila wrote: > I think the advantage of delaying it is that we > might see some real problems (like where hash aggregate is not a good > choice) which can be fixed via the costing model. I think any problem that might come up with the costing is best thought of as a distinct problem. This thread is mostly about the problem of users getting fewer in-memory hash aggregates compared to a previous release running the same application (though there has been some discussion of the other problem, too [1], but it's thought to be less serious). The problem is that affected users were theoretically never entitled to the performance they came to rely on, and yet there is good reason to think that hash aggregate really should be entitled to more memory. They won't care that they were theoretically never entitled to that performance, though -- they *liked* the fact that hash agg could cheat. And they'll dislike the fact that this cannot be corrected by tuning work_mem, since that affects all node types that consume work_mem, not just hash aggregate -- that could cause OOMs for them. There are two or three similar ideas under discussion that might fix the problem. They all seem to involve admitting that hash aggregate's "cheating" might actually have been a good thing all along (even though giving hash aggregate much much more memory than other nodes is terrible), and giving hash aggregate license to "cheat openly". Note that the problem isn't exactly a problem with the hash aggregate spilling patch. You could think of the problem as a pre-existing issue -- a failure to give more memory to hash aggregate, which really should be entitled to more memory. Jeff's patch just made the issue more obvious. [1] https://postgr.es/m/20200624191433.5gnqgrxfmucex...@alap3.anarazel.de -- Peter Geoghegan
Re: Commitfest 2020-07
On Mon, Jun 29, 2020 at 08:04:35AM +0800, Andy Fan wrote: > Thanks for the volunteering! +1. -- Michael signature.asc Description: PGP signature
Re: Commitfest 2020-07
On Mon, Jun 29, 2020 at 2:50 AM Magnus Hagander wrote:
> On Sun, Jun 28, 2020 at 4:49 PM Tom Lane wrote:
>> Daniel Gustafsson writes:
>>> In a few days, the first commitfest of the 14 cycle - 2020-07 - will
>>> start. Unless anyone has already spoken up that I've missed, I'm happy
>>> to volunteer to run CFM for this one.
>>
>> No one has volunteered that I recall, so the baton is yours.
>>
>> Enjoy!
>
> //Magnus

Thanks for the volunteering!

--
Best Regards
Andy Fan
Re: pg_read_file() with virtual files returns empty string
Joe Conway writes:
> All good stuff -- I believe the attached checks all the boxes.

Looks okay to me, except I think you want

!	if (bytes_to_read > 0)

to be

!	if (bytes_to_read >= 0)

As it stands, a zero request will be treated like -1 (read all the rest of the file), while ISTM it ought to be an expensive way to read zero bytes --- perhaps useful to check the filename and seek offset validity?

> The intention here seems to be that if you pass bytes_to_read = -1 with
> a negative offset, it will give you the last offset bytes of the file.

I think it's just trying to convert bytes_to_read = -1 into an explicit positive length-to-read in all cases. We don't need that anymore with this code, so dropping it is fine.

regards, tom lane
Revert workarounds for unportability of printf %.*s format?
In connection with the discussion at [1], I realized that we could unwind the hacks we've introduced --- mostly in commit 54cd4f045 --- to avoid depending on the behavior of %.*s format in printf. Now that we always use our own snprintf.c code, we know that it measures field widths in bytes not characters, and we also know that use of this format won't cause random encoding-related failures. Some of the changes are not worth undoing; for example using strlcpy instead of snprintf to truncate a string is a net win by any measure. But places where we introduced a temporary buffer, such as the change in truncate_identifier() in 54cd4f045, would be better off with the old coding. In any case we could remove all the comments warning against using this feature. Thoughts? regards, tom lane [1] https://www.postgresql.org/message-id/flat/a120087c-4c88-d9d4-1ec5-808d7a7f133d%40gmail.com
Re: Commitfest 2020-07
On Sun, Jun 28, 2020 at 4:49 PM Tom Lane wrote:
> Daniel Gustafsson writes:
>> In a few days, the first commitfest of the 14 cycle - 2020-07 - will start.
>> Unless anyone has already spoken up that I've missed, I'm happy to volunteer to
>> run CFM for this one.
>
> No one has volunteered that I recall, so the baton is yours.

Enjoy!

//Magnus
Re: new heapcheck contrib module
> On Jun 28, 2020, at 9:05 AM, Dilip Kumar wrote: > > Some more comments on v9_0001. > 1. > + LWLockAcquire(XidGenLock, LW_SHARED); > + nextFullXid = ShmemVariableCache->nextFullXid; > + ctx.oldestValidXid = ShmemVariableCache->oldestXid; > + LWLockRelease(XidGenLock); > + ctx.nextKnownValidXid = XidFromFullTransactionId(nextFullXid); > ... > ... > + > + for (ctx.blkno = startblock; ctx.blkno < endblock; ctx.blkno++) > + { > + int32 mapbits; > + OffsetNumber maxoff; > + PageHeader ph; > + > + /* Optionally skip over all-frozen or all-visible blocks */ > + if (skip_all_frozen || skip_all_visible) > + { > + mapbits = (int32) visibilitymap_get_status(ctx.rel, ctx.blkno, > +); > + if (skip_all_visible && (mapbits & VISIBILITYMAP_ALL_VISIBLE) != 0) > + continue; > + if (skip_all_frozen && (mapbits & VISIBILITYMAP_ALL_FROZEN) != 0) > + continue; > + } > + > + /* Read and lock the next page. */ > + ctx.buffer = ReadBufferExtended(ctx.rel, MAIN_FORKNUM, ctx.blkno, > + RBM_NORMAL, ctx.bstrategy); > + LockBuffer(ctx.buffer, BUFFER_LOCK_SHARE); > > I might be missing something, but it appears that first we are getting > the nextFullXid and after that, we are scanning the block by block. > So while we are scanning the block if the nextXid is advanced and it > has updated some tuple in the heap pages, then it seems the current > logic will complain about out of range xid. I did not test this > behavior so please point me to the logic which is protecting this. We know the oldest valid Xid cannot advance, because we hold a lock that would prevent it from doing so. We cannot know that the newest Xid will not advance, but when we see an Xid beyond the end of the known valid range, we check its validity, and either report it as a corruption or advance our idea of the newest valid Xid, depending on that check. That logic is in TransactionIdValidInRel. > 2. > /* > * Helper function to construct the TupleDesc needed by verify_heapam. 
> */ > static TupleDesc > verify_heapam_tupdesc(void) > > From function name, it appeared that it is verifying tuple descriptor > but this is just creating the tuple descriptor. In amcheck--1.2--1.3.sql we define a function named verify_heapam which returns a set of records. This is the tuple descriptor for that function. I understand that the name can be parsed as verify_(heapam_tupdesc), but it is meant as (verify_heapam)_tupdesc. Do you have a name you would prefer? > 3. > + /* Optionally skip over all-frozen or all-visible blocks */ > + if (skip_all_frozen || skip_all_visible) > + { > + mapbits = (int32) visibilitymap_get_status(ctx.rel, ctx.blkno, > +); > + if (skip_all_visible && (mapbits & VISIBILITYMAP_ALL_VISIBLE) != 0) > + continue; > + if (skip_all_frozen && (mapbits & VISIBILITYMAP_ALL_FROZEN) != 0) > + continue; > + } > > Here, do we want to test that in VM the all visible bit is set whereas > on the page it is not set? That can lead to a wrong result in an > index-only scan. If the caller has specified that the corruption check should skip over all-frozen or all-visible data, then we cannot load the page that the VM claims is all-frozen or all-visible without defeating the purpose of the caller having specified these options. Without loading the page, we cannot check the page's header bits. When not skipping all-visible or all-frozen blocks, we might like to pin both the heap page and the visibility map page in order to compare the two, being careful not to hold a pin on the one while performing I/O on the other. See for example the logic in heap_delete(). But I'm not sure what guarantees the system makes about agreement between these two bits. Certainly, the VM should not claim a page is all visible when it isn't, but are we guaranteed that a page that is all-visible will always have its all-visible bit set? I don't know if (possibly transient) disagreement between these two bits constitutes corruption. 
Perhaps others following this thread can advise? > 4. One cosmetic comment > > + /* Skip non-varlena values, but update offset first */ > .. > + > + /* Ok, we're looking at a varlena attribute. */ > > Throughout the patch, I have noticed that some of your single-line > comments have "full stop" whereas other don't. Can we keep them > consistent? I try to use a "full stop" at the end of sentences, but not at the end of sentence fragments. To me, a "full stop" means that a sentence has reached its conclusion. I don't intentionally use one at the end of a fragment, unless the fragment precedes a full sentence, in which case the "full stop" is needed to separate the two. Of course, I may have violated my own rule in a few places, but before I submit a v10 patch with comment punctuation changes, perhaps we can agree on what the rule is? (This has probably been discussed before and agreed before. A link to the appropriate email thread would be sufficient.) For example: /* red, green, or blue */ /*
Re: Fwd: PostgreSQL: WolfSSL support
Felix Lechner writes:
> On Sat, Jun 27, 2020 at 7:56 AM Tom Lane wrote:
>> However, judging from the caveats mentioned in the initial message,
>> my inclination would be to wait awhile for wolfSSL to mature.

> Please have a closer look. The library has been around since 2004 and
> is popular in embedded systems. (It was previously known as cyaSSL.)

I don't really care where else it's used. If we have to hack the source code before we can use it, it's not mature for our purposes. Even when (if) that requirement reduces to "you have to use the latest bleeding edge release", it'll be problematic for people whose platforms supply a less bleeding edge version. I was signifying a desire to wait until compatible versions are reasonably common in the wild before we spend much time on this.

regards, tom lane
Re: bugfix: invalid bit/varbit input causes the log file to be unreadable
I wrote:
> Even granting the premise, the proposed patch seems like a significant
> decrease in user-friendliness for typical cases. I'd rather see us
> make an effort to print one valid-per-the-DB-encoding character.
> Now that we can rely on snprintf to count %.*s restrictions in bytes,
> I think something like this should work:
>     errmsg("\"%.*s\" is not a valid binary digit",
>            pg_mblen(sp), sp)));
> But the real problem is that this is only the tip of the iceberg.
> You didn't even hit all the %c usages in varbit.c.

I went through all the %c format sequences in the backend to see which ones could use this type of fix. There were not as many as I'd expected, but still a fair number. (I skipped cases where the input was coming from the catalogs, as well as some non-user-facing debug printouts.) That leads to the attached patch, which seems to do the job without breaking anything that works today.

regards, tom lane

PS: I failed to resist the temptation to improve some shoddy error messages nearby in pageinspect/heapfuncs.c.
diff --git a/contrib/hstore/hstore_io.c b/contrib/hstore/hstore_io.c index 60bdbea46b..b3304ff844 100644 --- a/contrib/hstore/hstore_io.c +++ b/contrib/hstore/hstore_io.c @@ -80,7 +80,9 @@ get_val(HSParser *state, bool ignoreeq, bool *escaped) } else if (*(state->ptr) == '=' && !ignoreeq) { -elog(ERROR, "Syntax error near '%c' at position %d", *(state->ptr), (int32) (state->ptr - state->begin)); +elog(ERROR, "Syntax error near \"%.*s\" at position %d", + pg_mblen(state->ptr), state->ptr, + (int32) (state->ptr - state->begin)); } else if (*(state->ptr) == '\\') { @@ -219,7 +221,9 @@ parse_hstore(HSParser *state) } else if (!isspace((unsigned char) *(state->ptr))) { -elog(ERROR, "Syntax error near '%c' at position %d", *(state->ptr), (int32) (state->ptr - state->begin)); +elog(ERROR, "Syntax error near \"%.*s\" at position %d", + pg_mblen(state->ptr), state->ptr, + (int32) (state->ptr - state->begin)); } } else if (st == WGT) @@ -234,7 +238,9 @@ parse_hstore(HSParser *state) } else { -elog(ERROR, "Syntax error near '%c' at position %d", *(state->ptr), (int32) (state->ptr - state->begin)); +elog(ERROR, "Syntax error near \"%.*s\" at position %d", + pg_mblen(state->ptr), state->ptr, + (int32) (state->ptr - state->begin)); } } else if (st == WVAL) @@ -267,7 +273,9 @@ parse_hstore(HSParser *state) } else if (!isspace((unsigned char) *(state->ptr))) { -elog(ERROR, "Syntax error near '%c' at position %d", *(state->ptr), (int32) (state->ptr - state->begin)); +elog(ERROR, "Syntax error near \"%.*s\" at position %d", + pg_mblen(state->ptr), state->ptr, + (int32) (state->ptr - state->begin)); } } else diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c index 11a910184b..f04455da12 100644 --- a/contrib/pageinspect/heapfuncs.c +++ b/contrib/pageinspect/heapfuncs.c @@ -30,6 +30,7 @@ #include "catalog/pg_am_d.h" #include "catalog/pg_type.h" #include "funcapi.h" +#include "mb/pg_wchar.h" #include "miscadmin.h" #include "pageinspect.h" #include 
"port/pg_bitutils.h" @@ -99,7 +100,8 @@ text_to_bits(char *str, int len) else ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), - errmsg("illegal character '%c' in t_bits string", str[off]))); + errmsg("invalid character \"%.*s\" in t_bits string", + pg_mblen(str + off), str + off))); if (off % 8 == 7) bits[off / 8] = byte; @@ -460,14 +462,13 @@ tuple_data_split(PG_FUNCTION_ARGS) if (!t_bits_str) ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), - errmsg("argument of t_bits is null, but it is expected to be null and %d character long", - bits_len))); + errmsg("t_bits string must not be NULL"))); bits_str_len = strlen(t_bits_str); if (bits_len != bits_str_len) ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), - errmsg("unexpected length of t_bits %u, expected %d", + errmsg("unexpected length of t_bits string: %u, expected %u", bits_str_len, bits_len))); /* do the conversion */ @@ -478,7 +479,7 @@ tuple_data_split(PG_FUNCTION_ARGS) if (t_bits_str) ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), - errmsg("t_bits string is expected to be NULL, but instead it is %zu bytes length", + errmsg("t_bits string is expected to be NULL, but instead it is %zu bytes long", strlen(t_bits_str; } diff --git a/src/backend/utils/adt/encode.c b/src/backend/utils/adt/encode.c index 61d318d93c..a609d49c12 100644 --- a/src/backend/utils/adt/encode.c +++ b/src/backend/utils/adt/encode.c @@ -15,6 +15,7 @@ #include +#include "mb/pg_wchar.h" #include "utils/builtins.h" #include "utils/memutils.h" @@ -171,17 +172,19 @@ hex_encode(const char *src, size_t len, char
Re: pg_read_file() with virtual files returns empty string
On 6/27/20 3:43 PM, Tom Lane wrote: > Joe Conway writes: >> The attached patch fixes this for me. I think it ought to be backpatched >> through >> pg11. > >> Comments? > > 1. Doesn't seem to be accounting for the possibility of an error in fread(). > > 2. Don't we want to remove the stat() call altogether, if we're not > going to believe its length? > > 3. This bit might need to cast the RHS to int64: > if (bytes_to_read > (MaxAllocSize - VARHDRSZ)) > otherwise it might be treated as an unsigned comparison. > Or you could check for bytes_to_read < 0 separately. > > 4. appendStringInfoString seems like quite the wrong thing to use > when the input is binary data. > > 5. Don't like the comment. Whether the file is virtual or not isn't > very relevant here. > > 6. If the file size exceeds 1GB, I fear we'll get some rather opaque > failure from the stringinfo infrastructure. It'd be better to > check for that here and give a file-too-large error. All good stuff -- I believe the attached checks all the boxes. I noted while at this, that the current code can never hit this case: ! if (bytes_to_read < 0) ! { ! if (seek_offset < 0) ! bytes_to_read = -seek_offset; The intention here seems to be that if you pass bytes_to_read = -1 with a negative offset, it will give you the last offset bytes of the file. But all of the SQL exposed paths disallow an explicit negative value for bytes_to_read. This was also not documented as far as I can tell so I eliminated that case in the attached. Is that actually a case I should fix/support instead? Separately, it seems to me that a two argument version of pg_read_file() would be useful: pg_read_file('filename', offset) In that case bytes_to_read = -1 could be passed down in order to read the entire file after the offset. In fact I think that would nicely handle the negative offset case as well. 
Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development diff --git a/contrib/adminpack/expected/adminpack.out b/contrib/adminpack/expected/adminpack.out index 5738b0f..edf3ebf 100644 *** a/contrib/adminpack/expected/adminpack.out --- b/contrib/adminpack/expected/adminpack.out *** SELECT pg_file_rename('test_file1', 'tes *** 79,85 (1 row) SELECT pg_read_file('test_file1'); -- not there ! ERROR: could not stat file "test_file1": No such file or directory SELECT pg_read_file('test_file2'); pg_read_file -- --- 79,85 (1 row) SELECT pg_read_file('test_file1'); -- not there ! ERROR: could not open file "test_file1" for reading: No such file or directory SELECT pg_read_file('test_file2'); pg_read_file -- *** SELECT pg_file_rename('test_file2', 'tes *** 108,114 (1 row) SELECT pg_read_file('test_file2'); -- not there ! ERROR: could not stat file "test_file2": No such file or directory SELECT pg_read_file('test_file3'); pg_read_file -- --- 108,114 (1 row) SELECT pg_read_file('test_file2'); -- not there ! ERROR: could not open file "test_file2" for reading: No such file or directory SELECT pg_read_file('test_file3'); pg_read_file -- diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c index ceaa618..6363dbe 100644 *** a/src/backend/utils/adt/genfile.c --- b/src/backend/utils/adt/genfile.c *** *** 36,41 --- 36,42 #include "utils/syscache.h" #include "utils/timestamp.h" + #define READBUF_SIZE 4096 /* * Convert a "text" filename argument to C string, and check it's allowable. *** read_binary_file(const char *filename, i *** 106,138 bool missing_ok) { bytea *buf; ! size_t nbytes; FILE *file; ! if (bytes_to_read < 0) ! { ! if (seek_offset < 0) ! bytes_to_read = -seek_offset; ! else ! { ! struct stat fst; ! ! if (stat(filename, ) < 0) ! { ! if (missing_ok && errno == ENOENT) ! return NULL; ! else ! ereport(ERROR, ! (errcode_for_file_access(), ! 
errmsg("could not stat file \"%s\": %m", filename))); ! } ! ! bytes_to_read = fst.st_size - seek_offset; ! } ! } ! ! /* not sure why anyone thought that int64 length was a good idea */ ! if (bytes_to_read > (MaxAllocSize - VARHDRSZ)) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("requested length too large"))); --- 107,117 bool missing_ok) { bytea *buf; ! size_t nbytes = 0; FILE *file; ! /* clamp request size to what we can actually deliver */ ! if (bytes_to_read > (int64) (MaxAllocSize - VARHDRSZ)) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("requested length too large"))); *** read_binary_file(const char *filename, i *** 154,162
Re: new heapcheck contrib module
On Sun, Jun 28, 2020 at 8:59 PM Dilip Kumar wrote: > > On Mon, Jun 22, 2020 at 5:44 AM Mark Dilger > wrote: > > > > > > > > > On Jun 21, 2020, at 2:54 AM, Dilip Kumar wrote: > > > > > > I have looked into 0001 patch and I have a few comments. > > > > > > 1. > > > + > > > + /* Skip over unused/dead/redirected line pointers */ > > > + if (!ItemIdIsUsed(ctx.itemid) || > > > + ItemIdIsDead(ctx.itemid) || > > > + ItemIdIsRedirected(ctx.itemid)) > > > + continue; > > > > > > Isn't it a good idea to verify the Redirected Itemtid? Because we > > > will still access the redirected item id to find the > > > actual tuple from the index scan. Maybe not exactly at this level, > > > but we can verify that the link itemid store in that > > > is within the itemid range of the page or not. > > > > Good idea. I've added checks that the redirection is valid, both in terms > > of being within bounds and in terms of alignment. > > > > > 2. > > > > > > + /* Check for tuple header corruption */ > > > + if (ctx->tuphdr->t_hoff < SizeofHeapTupleHeader) > > > + { > > > + confess(ctx, > > > + psprintf("t_hoff < SizeofHeapTupleHeader (%u < %u)", > > > + ctx->tuphdr->t_hoff, > > > + (unsigned) SizeofHeapTupleHeader)); > > > + fatal = true; > > > + } > > > > > > I think we can also check that if there is no NULL attributes (if > > > (!(t_infomask & HEAP_HASNULL)) then > > > ctx->tuphdr->t_hoff should be equal to SizeofHeapTupleHeader. > > > > You have to take alignment padding into account, but otherwise yes, and > > I've added a check for that. > > > > > 3. 
> > > + ctx->offset = 0; > > > + for (ctx->attnum = 0; ctx->attnum < ctx->natts; ctx->attnum++) > > > + { > > > + if (!check_tuple_attribute(ctx)) > > > + break; > > > + } > > > + ctx->offset = -1; > > > + ctx->attnum = -1; > > > > > > So we are first setting ctx->offset to 0, then inside > > > check_tuple_attribute, we will keep updating the offset as we process > > > the attributes and after the loop is over we set ctx->offset to -1, I > > > did not understand that why we need to reset it to -1, do we ever > > > check for that. We don't even initialize the ctx->offset to -1 while > > > initializing the context for the tuple so I do not understand what is > > > the meaning of the random value -1. > > > > Ahh, right, those are left over from a previous design of the code. Thanks > > for pointing them out. They are now removed. > > > > > 4. > > > + if (!VARATT_IS_EXTENDED(chunk)) > > > + { > > > + chunksize = VARSIZE(chunk) - VARHDRSZ; > > > + chunkdata = VARDATA(chunk); > > > + } > > > + else if (VARATT_IS_SHORT(chunk)) > > > + { > > > + /* > > > + * could happen due to heap_form_tuple doing its thing > > > + */ > > > + chunksize = VARSIZE_SHORT(chunk) - VARHDRSZ_SHORT; > > > + chunkdata = VARDATA_SHORT(chunk); > > > + } > > > + else > > > + { > > > + /* should never happen */ > > > + confess(ctx, > > > + pstrdup("toast chunk is neither short nor extended")); > > > + return; > > > + } > > > > > > I think the error message "toast chunk is neither short nor extended". > > > Because ideally, the toast chunk should not be further toasted. > > > So I think the check is correct, but the error message is not correct. > > > > I agree the error message was wrongly stated, and I've changed it, but you > > might suggest a better wording than what I came up with, "corrupt toast > > chunk va_header". > > > > > 5. 
> > > > > > + ctx.rel = relation_open(relid, ShareUpdateExclusiveLock); > > > + check_relation_relkind_and_relam(ctx.rel); > > > + > > > + /* > > > + * Open the toast relation, if any, also protected from concurrent > > > + * vacuums. > > > + */ > > > + if (ctx.rel->rd_rel->reltoastrelid) > > > + { > > > + int offset; > > > + > > > + /* Main relation has associated toast relation */ > > > + ctx.toastrel = table_open(ctx.rel->rd_rel->reltoastrelid, > > > + ShareUpdateExclusiveLock); > > > + offset = toast_open_indexes(ctx.toastrel, > > > > > > + if (TransactionIdIsNormal(ctx.relfrozenxid) && > > > + TransactionIdPrecedes(ctx.relfrozenxid, ctx.oldestValidXid)) > > > + { > > > + confess(, psprintf("relfrozenxid %u precedes global " > > > +"oldest valid xid %u ", > > > +ctx.relfrozenxid, ctx.oldestValidXid)); > > > + PG_RETURN_NULL(); > > > + } > > > > > > Don't we need to close the relation/toastrel/toastindexrel in such > > > return which is without an abort? IIRC, we > > > will get relcache leak WARNING on commit if we left them open in commit > > > path. > > > > Ok, I've added logic to close them. > > > > All changes inspired by your review are included in the v9-0001 patch. The > > differences since v8 are pulled out into v9_diffs for easier review. > > I have reviewed the changes in v9_diffs and looks fine to me. Some more comments on v9_0001. 1. + LWLockAcquire(XidGenLock, LW_SHARED); + nextFullXid = ShmemVariableCache->nextFullXid; + ctx.oldestValidXid = ShmemVariableCache->oldestXid; + LWLockRelease(XidGenLock); + ctx.nextKnownValidXid =
Re: new heapcheck contrib module
On Mon, Jun 22, 2020 at 5:44 AM Mark Dilger wrote: > > > > > On Jun 21, 2020, at 2:54 AM, Dilip Kumar wrote: > > > > I have looked into 0001 patch and I have a few comments. > > > > 1. > > + > > + /* Skip over unused/dead/redirected line pointers */ > > + if (!ItemIdIsUsed(ctx.itemid) || > > + ItemIdIsDead(ctx.itemid) || > > + ItemIdIsRedirected(ctx.itemid)) > > + continue; > > > > Isn't it a good idea to verify the Redirected Itemtid? Because we > > will still access the redirected item id to find the > > actual tuple from the index scan. Maybe not exactly at this level, > > but we can verify that the link itemid store in that > > is within the itemid range of the page or not. > > Good idea. I've added checks that the redirection is valid, both in terms of > being within bounds and in terms of alignment. > > > 2. > > > > + /* Check for tuple header corruption */ > > + if (ctx->tuphdr->t_hoff < SizeofHeapTupleHeader) > > + { > > + confess(ctx, > > + psprintf("t_hoff < SizeofHeapTupleHeader (%u < %u)", > > + ctx->tuphdr->t_hoff, > > + (unsigned) SizeofHeapTupleHeader)); > > + fatal = true; > > + } > > > > I think we can also check that if there is no NULL attributes (if > > (!(t_infomask & HEAP_HASNULL)) then > > ctx->tuphdr->t_hoff should be equal to SizeofHeapTupleHeader. > > You have to take alignment padding into account, but otherwise yes, and I've > added a check for that. > > > 3. > > + ctx->offset = 0; > > + for (ctx->attnum = 0; ctx->attnum < ctx->natts; ctx->attnum++) > > + { > > + if (!check_tuple_attribute(ctx)) > > + break; > > + } > > + ctx->offset = -1; > > + ctx->attnum = -1; > > > > So we are first setting ctx->offset to 0, then inside > > check_tuple_attribute, we will keep updating the offset as we process > > the attributes and after the loop is over we set ctx->offset to -1, I > > did not understand that why we need to reset it to -1, do we ever > > check for that. 
We don't even initialize the ctx->offset to -1 while > > initializing the context for the tuple so I do not understand what is > > the meaning of the random value -1. > > Ahh, right, those are left over from a previous design of the code. Thanks > for pointing them out. They are now removed. > > > 4. > > + if (!VARATT_IS_EXTENDED(chunk)) > > + { > > + chunksize = VARSIZE(chunk) - VARHDRSZ; > > + chunkdata = VARDATA(chunk); > > + } > > + else if (VARATT_IS_SHORT(chunk)) > > + { > > + /* > > + * could happen due to heap_form_tuple doing its thing > > + */ > > + chunksize = VARSIZE_SHORT(chunk) - VARHDRSZ_SHORT; > > + chunkdata = VARDATA_SHORT(chunk); > > + } > > + else > > + { > > + /* should never happen */ > > + confess(ctx, > > + pstrdup("toast chunk is neither short nor extended")); > > + return; > > + } > > > > I think the error message "toast chunk is neither short nor extended". > > Because ideally, the toast chunk should not be further toasted. > > So I think the check is correct, but the error message is not correct. > > I agree the error message was wrongly stated, and I've changed it, but you > might suggest a better wording than what I came up with, "corrupt toast chunk > va_header". > > > 5. > > > > + ctx.rel = relation_open(relid, ShareUpdateExclusiveLock); > > + check_relation_relkind_and_relam(ctx.rel); > > + > > + /* > > + * Open the toast relation, if any, also protected from concurrent > > + * vacuums. 
> > + */ > > + if (ctx.rel->rd_rel->reltoastrelid) > > + { > > + int offset; > > + > > + /* Main relation has associated toast relation */ > > + ctx.toastrel = table_open(ctx.rel->rd_rel->reltoastrelid, > > + ShareUpdateExclusiveLock); > > + offset = toast_open_indexes(ctx.toastrel, > > > > + if (TransactionIdIsNormal(ctx.relfrozenxid) && > > + TransactionIdPrecedes(ctx.relfrozenxid, ctx.oldestValidXid)) > > + { > > + confess(, psprintf("relfrozenxid %u precedes global " > > +"oldest valid xid %u ", > > +ctx.relfrozenxid, ctx.oldestValidXid)); > > + PG_RETURN_NULL(); > > + } > > > > Don't we need to close the relation/toastrel/toastindexrel in such > > return which is without an abort? IIRC, we > > will get relcache leak WARNING on commit if we left them open in commit > > path. > > Ok, I've added logic to close them. > > All changes inspired by your review are included in the v9-0001 patch. The > differences since v8 are pulled out into v9_diffs for easier review. I have reviewed the changes in v9_diffs and looks fine to me. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
On Fri, Jun 26, 2020 at 11:47 AM Amit Kapila wrote: > > On Thu, Jun 25, 2020 at 7:11 PM Dilip Kumar wrote: > > > > On Wed, Jun 24, 2020 at 4:04 PM Amit Kapila wrote: > > > > > > > > > Review comments on various patches. > > > > > > poc_shared_fileset_cleanup_on_procexit > > > = > > > 1. > > > - ent->subxact_fileset = > > > - MemoryContextAlloc(ApplyContext, sizeof(SharedFileSet)); > > > + MemoryContext oldctx; > > > > > > + /* Shared fileset handle must be allocated in the persistent context */ > > > + oldctx = MemoryContextSwitchTo(ApplyContext); > > > + ent->subxact_fileset = palloc(sizeof(SharedFileSet)); > > > SharedFileSetInit(ent->subxact_fileset, NULL); > > > + MemoryContextSwitchTo(oldctx); > > > fd = BufFileCreateShared(ent->subxact_fileset, path); > > > > > > Why is this change required for this patch and why we only cover > > > SharedFileSetInit in the Apply context and not BufFileCreateShared? > > > The comment is also not very clear on this point. > > > > Added the comments for the same. > > > > 1. > + /* > + * Shared fileset handle must be allocated in the persistent context. > + * Also, SharedFileSetInit allocate the memory for sharefileset list > + * so we need to allocate that in the long term meemory context. > + */ > > How about "We need to maintain shared fileset across multiple stream > open/close calls. So, we allocate it in a persistent context." Done > 2. > + /* > + * If the caller is following the dsm based cleanup then we don't > + * maintain the filesetlist so return. > + */ > + if (filesetlist == NULL) > + return; > > The check here should use 'NIL' instead of 'NULL' Done > Other than that the changes in this particular patch looks good to me. Added as a last patch in the series, in the next version I will merge this to 0012 and 0013. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: tar-related code in PostgreSQL
The following review has been posted through the commitfest application:

    make installcheck-world:  tested, passed
    Implements feature:       tested, passed
    Spec compliant:           not tested
    Documentation:            not tested

The patch works perfectly.

The new status of this patch is: Ready for Committer
POC: postgres_fdw insert batching
Hi,

One of the issues I'm fairly regularly reminded of by users/customers is that inserting into tables sharded using FDWs is rather slow. We even get it reported on pgsql-bugs from time to time [1].

Some of the slowness / overhead is expected, due to the latency between machines in the sharded setup. Even just 1ms latency will make it way more expensive than a single instance.

But let's do a simple experiment, comparing a hash-partitioned table with regular partitions to one with FDW partitions in the same instance. Scripts to run this are attached. The duration of inserting 1M rows into this table (average of 10 runs on my laptop) looks like this:

    regular: 2872 ms
    FDW: 64454 ms

Yep, it's ~20x slower, on a setup with ping latency well below 0.05ms. Imagine how it would look on sharded setups with 0.1ms or 1ms latency, which is probably where most single-DC clusters are :-(

Now, the primary reason the performance degrades like this is that while the FDW has batching for SELECT queries (i.e. we read larger chunks of data from the cursors), we don't have that for INSERTs (or other DML). Every time you insert a row, it has to go all the way down into the partition synchronously.

For some use cases this may be mitigated by having many independent connections from different users, so the per-user latency is higher but acceptable. But if you need to import larger amounts of data (say, a CSV file for analytics, ...) this may not work.

Some time ago I wrote an ugly PoC adding batching, just to see how far it would get us, and it seems quite promising - results for the same INSERT benchmark look like this:

    FDW batching: 4584 ms

So, a rather nice improvement, I'd say ...

Before I spend more time hacking on this, I have a couple of open questions about the design, restrictions etc.

1) Extend the FDW API?

In the patch, the batching is simply "injected" into the existing insert API method, i.e. ExecForeignInsert et al.
I wonder if it'd be better to extend the API with a "batched" version of the method, so that we can easily determine whether the FDW supports batching or not - it would require changes in the callers, though. OTOH it might be useful for COPY, where we could do something similar to multi_insert (COPY already benefits from this patch, but it does not use the batching built into COPY).

2) What about the insert results?

I'm not sure what to do about the "result" status for the inserted rows. We only really "stash" the rows into a buffer, so we don't know whether the insert will succeed or not. The patch simply assumes it will succeed, but that's clearly wrong, and it may result in reporting a wrong number of rows.

The patch also disables the batching when the insert has a RETURNING clause, because there's just a single slot (for the currently inserted row). I suppose a "batching" method would take an array of slots.

3) What about the other DML operations (DELETE/UPDATE)?

The other DML operations could probably benefit from batching too. INSERT was good enough for a PoC, but having batching only for INSERT seems somewhat asymmetric. DELETE/UPDATE seem more complicated because of quals, but likely doable.

4) Should we do batching for COPY instead?

While looking at multi_insert, I've realized it's mostly exactly what the new "batched insert" API function would need to be. But it's only really used in COPY, so I wonder if we should just abandon the idea of batching INSERTs and do batched COPY for FDW tables.

For cases that can replace INSERT with COPY this would be enough, but unfortunately it does nothing for DELETE/UPDATE, so I'm hesitant to do this :-(

5) Expected consistency?

I'm not entirely sure what the consistency expectations for FDWs are. Currently the FDW nodes pointing to the same server share a connection, so the inserted rows might be visible to other nodes. But if we only stash the rows in a local buffer for a while, that's no longer true.
So maybe this breaks the consistency expectations? But maybe that's OK - I'm not sure how the prepared statements/cursors affect this. I can imagine restricting the batching only to plans where this is not an issue (single FDW node or something), but it seems rather fragile and undesirable. I was thinking about adding a GUC to enable/disable the batching at some level (global, server, table, ...) but it seems like a bad match because the consistency expectations likely depend on a query. There should be a GUC to set the batch size, though (it's hardcoded to 100 for now). regards [1] https://www.postgresql.org/message-id/CACnz%2BQ1q0%2B2KoJam9LyNMk8JmdC6qYHXWB895Wu2xcpoip18xQ%40mail.gmail.com -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services fdw.sql Description: application/sql local.sql Description: application/sql >From df2cf502909886fbfc86f93f36b2daba03f785e4 Mon Sep 17 00:00:00 2001 From: Tomas Vondra Date: Sun, 28 Jun 2020 14:31:18 +0200 Subject: [PATCH]
Re: More tzdb fun: POSIXRULES is being deprecated upstream
Seems like I'm not getting any traction in convincing people that back-patching this change is wise. To get this closed out before the CF starts, I'm just going to put it into HEAD/v13 and call it a day. I remain of the opinion that we'll probably regret not doing anything in the back branches, sometime in the next 4+ years. regards, tom lane
Re: Commitfest 2020-07
Daniel Gustafsson writes:
> In a few days, the first commitfest of the 14 cycle - 2020-07 - will start.
> Unless anyone has already spoken up that I've missed, I'm happy to volunteer
> to run CFM for this one.

No one has volunteered that I recall, so the baton is yours.

regards, tom lane
Re: Fwd: PostgreSQL: WolfSSL support
On Sun, Jun 28, 2020 at 10:18:12AM +0200, Peter Eisentraut wrote: > On 2020-06-27 14:50, Christoph Berg wrote: > > Re: Peter Eisentraut > > > What would be the advantage of using wolfSSL over OpenSSL? > > > > Avoiding the OpenSSL-vs-GPL linkage problem with readline. > > We have added support for allegedly-OpenSSL compatible libraries such as > LibreSSL before, so some tweaks for wolfSSL would seem acceptable. However, > I doubt we are going to backpatch them, so unless you want to take > responsibility for that as a packager, it's not really going to help anyone > soon. And OpenSSL 3.0.0 will have a new license, so for the next PostgreSQL > release, this problem might be gone. Oh, that is a long time coming --- it would be nice. -- Bruce Momjian https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: PostgreSQL: WolfSSL support
Personally I'm more interested in a library like Amazon's which is trying to do less rather than more. I would rather a simpler, better tested, easier to audit code-base than one with more features and more complications. https://github.com/awslabs/s2n
Re: [HACKERS] Custom compression methods
On Wed, Jun 24, 2020 at 5:30 PM Robert Haas wrote:
> On Tue, Jun 23, 2020 at 4:00 PM Andres Freund wrote:
> > https://postgr.es/m/20130621000900.GA12425%40alap2.anarazel.de is a
> > thread with more information / patches further along.
> >
> > I confused this patch with the approach in
> > https://www.postgresql.org/message-id/d8576096-76ba-487d-515b-44fdedba8bb5%402ndquadrant.com
> > sorry for that. It obviously still differs by not having lower space
> > overhead (by virtue of not having a 4 byte 'va_cmid', but no additional
> > space for two methods, and then 1 byte overhead for 256 more), but
> > that's not that fundamental a difference.
>
> Wait a minute. Are we saying there are three (3) dueling patches for
> adding an alternate TOAST algorithm? It seems like there is:
>
> This "custom compression methods" thread - vintage 2017 - Original
> code by Nikita Glukhov, later work by Ildus Kurbangaliev
> The "pluggable compression support" thread - vintage 2013 - Andres Freund
> The "pglz performance" thread - vintage 2019 - Petr Jelinek
>
> Anyone want to point to a FOURTH implementation of this feature?
>
> I guess the next thing to do is figure out which one is the best basis
> for further work.

I have gone through these 3 threads and here is a summary of what I understand from them. Feel free to correct me if I have missed something.

#1. Custom compression methods: Provides a mechanism to create/drop compression methods using external libraries, and also provides a way to set the compression method for columns/types. There are a few complexities with this approach, listed below:
a. We need to maintain the dependencies between the column and the compression method. And the bigger issue is that even if the compression method is changed, we need to keep the dependencies on the older compression methods, as we might have some older tuples that were compressed with them.
b. Inside the compressed attribute, we need to record the compression method so that we know how to decompress it. For this, we use 2 bits from the raw_size of the compressed varlena header.

#2. pglz performance: Along with pglz, this patch provides an additional compression method using lz4. The new compression method can be enabled/disabled at configure time or using SIGHUP. We use 1 bit from the raw_size of the compressed varlena header to identify the compression method (pglz or lz4).

#3. pluggable compression: This proposal is to replace the existing pglz algorithm with snappy or lz4, whichever is better. As per the performance data[1], it appeared that lz4 is the winner in most cases.
- This also provides an additional patch to plug in any compression method.
- This will also use 2 bits from the raw_size of the compressed attribute for identifying the compression method.
- It provides an option to select the compression method using a GUC, but the comments in the patch suggest removing the GUC, so it seems the GUC was used only for the PoC.
- Honestly, I did not clearly understand from this patch set whether it proposes to replace the existing compression method with the best method (with the plugin just provided for performance testing), or whether it actually proposes an option to have pluggable compression methods.

IMHO, we can provide a solution based on #1 and #2, i.e. we can provide a few of the best compression methods in the core, and on top of that, we can also provide a mechanism to create/drop external compression methods.

[1] https://www.postgresql.org/message-id/20130621000900.GA12425%40alap2.anarazel.de

-- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
TLS checking in pgstat
As I mentioned in [1], checking (struct Port)->ssl for NULL to determine whether TLS is used for a connection is a bit of a leaky abstraction, as that's an OpenSSL-specific struct member. This sets the requirement that all TLS implementations use a pointer named ssl, and that the pointer is set to NULL in case of a failed connection, which may or may not fit. Is there a reason not to use the (struct Port)->ssl_in_use flag, which tracks just what we're looking for here? This also maps onto other parts of the abstraction in be-secure.c which do just that. The attached implements this.

cheers ./daniel

[1] fab21fc8-0f62-434f-aa78-6bd9336d6...@yesql.se

ssl_reporting.patch Description: Binary data
Commitfest 2020-07
In a few days, the first commitfest of the 14 cycle - 2020-07 - will start. Unless anyone has already spoken up that I've missed, I'm happy to volunteer to run CFM for this one. cheers ./daniel
Re: PostgreSQL: WolfSSL support
> On 27 Jun 2020, at 21:40, Bruce Momjian wrote: > On Sat, Jun 27, 2020 at 04:22:51PM -0300, Ranier Vilela wrote: >> WolfSSL, will provide a commercial license for PostgreSQL? >> Isn't LIbreSSL a better alternative? > > Seems it might be. That's not really an apples/apples comparison as the projects have vastly different goals (wolfSSL offers FIPS certification for example). cheers ./daniel
Re: pgsql: Enable Unix-domain sockets support on Windows
On 2020-06-27 13:57, Amit Kapila wrote:
> Fair enough, but what should be the behavior in the Windows versions (<10) where Unix-domain sockets are not supported?

You get an error about an unsupported address family, similar to trying to use IPv6 on a system that doesn't support it.

> BTW, in which format does the path need to be specified for unix_socket_directories? I tried with '/c/tmp', 'c:/tmp', 'tmp' but nothing seems to be working; it gives me errors like: "could not create lock file "/c/tmp/.s.PGSQL.5432.lock": No such file or directory" on server start. I am trying this on Win7 just to check what the behavior of this feature is on it.

Hmm, the only thing I remember about this now is that you need to use native Windows paths, meaning you can't just use /tmp under MSYS, but it needs to be something like C:\something. But the error you have there is not even about the socket file but about the lock file, which is a normal file, so if that goes wrong, it might be an unrelated problem.

-- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Fwd: PostgreSQL: WolfSSL support
On 2020-06-27 14:50, Christoph Berg wrote:
> Re: Peter Eisentraut
> > What would be the advantage of using wolfSSL over OpenSSL?
>
> Avoiding the OpenSSL-vs-GPL linkage problem with readline.

We have added support for allegedly-OpenSSL compatible libraries such as LibreSSL before, so some tweaks for wolfSSL would seem acceptable. However, I doubt we are going to backpatch them, so unless you want to take responsibility for that as a packager, it's not really going to help anyone soon. And OpenSSL 3.0.0 will have a new license, so for the next PostgreSQL release, this problem might be gone.

-- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Raising stop and warn limits
On Sun, Jun 21, 2020 at 01:35:13AM -0700, Noah Misch wrote:
> In brief, I'm proposing to raise xidWrapLimit-xidStopLimit to 3M and
> xidWrapLimit-xidWarnLimit to 40M. Likewise for mxact counterparts.

Here's the patch for it.

> 1. VACUUM, to advance a limit, may assign IDs subject to one of the limits.
>    VACUUM formerly consumed XIDs, not mxacts. It now consumes mxacts, not
>    XIDs.

Correction: a lazy_truncate_heap() at wal_level!=minimal does assign an XID, so XID consumption is impossible with "VACUUM (TRUNCATE false)" but possible otherwise. "VACUUM (ANALYZE)", which a DBA might do by reflex, also assigns XIDs. (These corrections do not affect $SUBJECT.)

Author: Noah Misch
Commit: Noah Misch

    Change XID and mxact limits to warn at 40M and stop at 3M.

    We have edge-case bugs when assigning values in the last few dozen pages
    before the wrap limit. We may introduce similar bugs in the future. At
    default BLCKSZ, this makes such bugs unreachable outside of single-user
    mode. Also, when VACUUM began to consume mxacts, multiStopLimit did not
    change to compensate.

    pg_upgrade may fail on a cluster that was already printing "must be
    vacuumed" warnings. Follow the warning's instructions to clear the
    warning, then run pg_upgrade again. One can still peacefully consume 98%
    of XIDs or mxacts, so DBAs need not change routine VACUUM settings.

    Reviewed by FIXME.
Discussion: https://postgr.es/m/20200621083513.ga3074...@rfd.leadboat.com

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 612e4cb..a28ea56 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -608,10 +608,10 @@ SELECT datname, age(datfrozenxid) FROM pg_database;
 If for some reason autovacuum fails to clear old XIDs from a table, the
 system will begin to emit warning messages like this when the database's
-oldest XIDs reach eleven million transactions from the wraparound point:
+oldest XIDs reach forty million transactions from the wraparound point:
 
-WARNING: database "mydb" must be vacuumed within 10985967 transactions
+WARNING: database "mydb" must be vacuumed within 39985967 transactions
 HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
 
@@ -621,7 +621,7 @@ be able to advance the database's datfrozenxid.)
 If these warnings are ignored, the system will shut down and refuse
 to start any new
-transactions once there are fewer than 1 million transactions left
+transactions once there are fewer than three million transactions left
 until wraparound:
 
@@ -629,7 +629,7 @@ ERROR: database is not accepting commands to avoid wraparound data loss in data
 HINT: Stop the postmaster and vacuum that database in single-user mode.
 
-The 1-million-transaction safety margin exists to let the
+The three-million-transaction safety margin exists to let the
 administrator recover without data loss, by manually executing the
 required VACUUM commands.
However, since the system will not execute commands once it has gone into the safety shutdown mode,

diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index ce84dac..475f5ed 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -2217,28 +2217,24 @@ SetMultiXactIdLimit(MultiXactId oldest_datminmxid, Oid oldest_datoid,
 	multiWrapLimit += FirstMultiXactId;
 
 	/*
-	 * We'll refuse to continue assigning MultiXactIds once we get within 100
-	 * multi of data loss.
-	 *
-	 * Note: This differs from the magic number used in
-	 * SetTransactionIdLimit() since vacuum itself will never generate new
-	 * multis. XXX actually it does, if it needs to freeze old multis.
+	 * We'll refuse to continue assigning MultiXactIds once we get within 3M
+	 * multi of data loss. See SetTransactionIdLimit.
 	 */
-	multiStopLimit = multiWrapLimit - 100;
+	multiStopLimit = multiWrapLimit - 3000000;
 	if (multiStopLimit < FirstMultiXactId)
 		multiStopLimit -= FirstMultiXactId;
 
 	/*
-	 * We'll start complaining loudly when we get within 10M multis of the
-	 * stop point. This is kind of arbitrary, but if you let your gas gauge
-	 * get down to 1% of full, would you be looking for the next gas station?
-	 * We need to be fairly liberal about this number because there are lots
-	 * of scenarios where most transactions are done by automatic clients that
-	 * won't pay attention to warnings. (No, we're not gonna make this
+	 * We'll start complaining loudly when we get within 40M multis of data
+	 * loss. This is kind of arbitrary, but if you let your
Re: update substring pattern matching syntax
Hello Peter,

The v2 patches apply cleanly, compile, global check ok, citext check ok, doc gen ok. No further comments. As I did not find an entry in the CF, I did nothing about tagging it "ready".

-- Fabien.