Re: Export log_line_prefix(); useful for emit_log_hook.
On Wed, Jun 29, 2022 at 03:09:42PM +0200, Alvaro Herrera wrote: > Hmm, maybe your hypothetical hook would prefer to use a different > setting for log line prefix than Log_line_prefix, so it would make sense > to pass the format string as a parameter to the function instead of > relying on the GUC global. +1. -- Michael signature.asc Description: PGP signature
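The parameterized shape being suggested can be illustrated with a minimal, hypothetical escape expander. This is not PostgreSQL's actual log_line_prefix() code (the real function handles many more escapes and appends to a StringInfo); it is only a sketch of taking the format string as an argument rather than reading the Log_line_prefix global:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: the format is a parameter, not a GUC global.
 * Only %p (pid), %a (application name) and %% are handled here. */
static void
format_log_prefix(char *buf, size_t buflen, const char *format,
                  int pid, const char *appname)
{
    size_t      n = 0;
    const char *p;

    for (p = format; *p != '\0' && n + 1 < buflen; p++)
    {
        if (*p != '%')
        {
            buf[n++] = *p;
            continue;
        }
        p++;
        if (*p == '\0')
            break;              /* stray trailing '%' */
        switch (*p)
        {
            case 'p':
                n += snprintf(buf + n, buflen - n, "%d", pid);
                break;
            case 'a':
                n += snprintf(buf + n, buflen - n, "%s", appname);
                break;
            case '%':
                buf[n++] = '%';
                break;
            default:            /* unknown escape: keep it verbatim */
                buf[n++] = '%';
                if (n + 1 < buflen)
                    buf[n++] = *p;
                break;
        }
    }
    buf[n] = '\0';
}
```

An emit_log_hook could then format with whatever template it likes instead of being tied to the server-wide setting.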
Re: Prevent writes on large objects in read-only transactions
On Wed, Jun 29, 2022 at 05:29:50PM +0900, Yugo NAGATA wrote: > Thank you for reviewing the patch. I attached the updated patch. Thanks for the new version. I have looked at that again, and the set of changes seems fine (including the change for the alternate output). So, applied. -- Michael
Re: Perform streaming logical transactions by background workers and parallel apply
Below are some review comments for patch v14-0004: v14-0004 4.0 General. This comment is an after-thought but as I write this mail I am wondering if most of this 0004 patch is even necessary at all? Instead of introducing a new column and all the baggage that goes with it, can't the same functionality be achieved just by toggling the streaming mode 'substream' value from 'p' (parallel) to 't' (on) whenever an error occurs causing a retry? Anyway, if you do change it this way then most of the following comments can be disregarded. == 4.1 Commit message Patch needs an explanatory commit message. Currently, there is nothing. == 4.2 doc/src/sgml/catalogs.sgml + + + subretry bool + + + If true, the subscription will not try to apply streaming transaction + in parallel mode. See +for more information. + + I think it is overkill to mention anything about streaming=parallel here because IIUC it has nothing to do with this field at all. I thought you really only need to say something brief like: SUGGESTION: True if the previous apply change failed and a retry was required. == 4.3 doc/src/sgml/ref/create_subscription.sgml @@ -244,6 +244,10 @@ CREATE SUBSCRIPTION subscription_nameon + mode. I don't think it is good to say "we" in the docs. SUGGESTION When applying a streaming transaction, if either requirement is not met, the background worker will exit with an error. Parallel mode is disregarded when retrying; instead the transaction will be applied using streaming = on. == 4.4 .../replication/logical/applybgworker.c + /* + * We don't start new background worker if retry was set as it's possible + * that the last time we tried to apply a transaction in background worker + * and the check failed (see function apply_bgworker_relation_check). So + * we will try to apply this transaction in apply worker. 
+ */ SUGGESTION (simplified, and remove "we") Don't use apply background workers for retries, because it is possible that the last time we tried to apply a transaction using an apply background worker the checks failed (see function apply_bgworker_relation_check). ~~~ 4.5 + elog(DEBUG1, "retry to apply an streaming transaction in apply " + "background worker"); IMO the log message is too confusing SUGGESTION "apply background workers are not used for retries" == 4.6 src/backend/replication/logical/worker.c 4.6.a - apply_handle_commit + /* Set the flag that we will not retry later. */ + set_subscription_retry(false); But the comment is wrong, isn't it? Shouldn’t it just say that we are not *currently* retrying? And couldn’t this anyway be made redundant if the catalog column had a DEFAULT value of false? 4.6.b - apply_handle_prepare Ditto 4.6.c - apply_handle_commit_prepared Ditto 4.6.d - apply_handle_rollback_prepared Ditto 4.6.e - apply_handle_stream_prepare Ditto 4.6.f - apply_handle_stream_abort Ditto 4.6.g - apply_handle_stream_commit Ditto ~~~ 4.7 src/backend/replication/logical/worker.c 4.7.a - start_table_sync @@ -3894,6 +3917,9 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname) } PG_CATCH(); { + /* Set the flag that we will retry later. */ + set_subscription_retry(true); Maybe this should say more like "Flag that the next apply will be the result of a retry" 4.7.b - start_apply Ditto ~~~ 4.8 src/backend/replication/logical/worker.c - set_subscription_retry + +/* + * Set subretry of pg_subscription catalog. + * + * If retry is true, subscriber is about to exit with an error. Otherwise, it + * means that the changes was applied successfully. + */ +static void +set_subscription_retry(bool retry) "changes" -> "change" ? ~~~ 4.8 src/backend/replication/logical/worker.c - set_subscription_retry Isn't this flag only ever used when streaming=parallel? 
But it does not seem to be checked anywhere before potentially executing all this code, which maybe will never be used. == 4.9 src/include/catalog/pg_subscription.h @@ -76,6 +76,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW bool subdisableonerr; /* True if a worker error should cause the * subscription to be disabled */ + bool subretry; /* True if the previous apply change failed. */ I was wondering if you can give this column a DEFAULT value of false, because then perhaps most of the patch code from worker.c may be able to be eliminated. == 4.10 .../subscription/t/032_streaming_apply.pl I felt that the test cases all seem to blend together. IMO it will be more readable if the main text parts are visually separated e.g. using a comment like: # = -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Lazy JIT IR code generation to increase JIT speed with partitions
Hi Alvaro, hi David, Thanks for reviewing this and the interesting examples! Wanted to give a bit of extra insight as to why I'd love to have a system that can lazily emit JIT code and hence creates roughly a module per function: In the end I'm hoping that we can migrate to a system where we only JIT after a configurable cost has been exceeded for this node, as well as a configurable number of rows has actually been processed. Reason is that this would safeguard against some problematic planning issues wrt JIT (node not being executed, row count being massively off). It would also allow for more fine-grained control, with a cost system similar to most other planning costs, as they are also per node and not global, and would potentially allow us to only JIT things where we expect to truly gain any runtime compared to the costs of doing it. If this means we have to invest more in making it cheap(er) to emit modules, I'm all for that. Kudos to David for fixing the caching in that sense :) @Andres if there are any other things we ought to fix to make this cheap (enough) compared to the previous code I'd love to know your thoughts. Best, Luc Vlaming (ServiceNow) From: David Geier Sent: Wednesday, June 29, 2022 11:03 AM To: Alvaro Herrera Cc: Luc Vlaming ; Andres Freund ; PostgreSQL-development Subject: Re: Lazy JIT IR code generation to increase JIT speed with partitions Hi Alvaro, That's a very interesting case and might indeed be fixed or at least improved by this patch. I tried to reproduce this, but at least when running a simple, serial query with increasing numbers of functions, the time spent per function is linear or even slightly sub-linear (same as Tom observed in [1]). I also couldn't reproduce the JIT runtimes you shared, when running the attached catalog query. 
The catalog query ran serially for me with the following JIT stats: JIT: Functions: 169 Options: Inlining true, Optimization true, Expressions true, Deforming true Timing: Generation 12.223 ms, Inlining 17.323 ms, Optimization 388.491 ms, Emission 283.464 ms, Total 701.501 ms Is it possible that the query ran in parallel for you? For parallel queries, every worker JITs all of the functions it uses. Even though the workers might JIT the functions in parallel, the time reported in the EXPLAIN ANALYZE output is the sum of the time spent by all workers. With this patch applied, the JIT time drops significantly, as many of the generated functions remain unused. JIT: Modules: 15 Functions: 26 Options: Inlining true, Optimization true, Expressions true, Deforming true Timing: Generation 1.931 ms, Inlining 0.722 ms, Optimization 67.195 ms, Emission 70.347 ms, Total 140.195 ms Of course, this does not prove that the nonlinearity that you observed went away. Could you share with me how you ran the query so that I can reproduce your numbers on master to then compare them with the patched version? Also, which LLVM version did you run with? I'm currently running with LLVM 13. Thanks! -- David Geier (ServiceNow) On Mon, Jun 27, 2022 at 5:37 PM Alvaro Herrera wrote: On 2021-Jan-18, Luc Vlaming wrote: > I would like this topic to somehow progress and was wondering what other > benchmarks / tests would be needed to have some progress? I've so far > provided benchmarks for small(ish) queries and some tpch numbers, assuming > those would be enough. 
Hi, some time ago I reported a case[1] where our JIT implementation does a very poor job and perhaps the changes that you're making could explain what is going on, and maybe even fix it: [1] https://postgr.es/m/20241706.wqq7xoyigwa2@alvherre.pgsql The query for which I investigated the problem involved some pg_logical metadata tables, so I didn't post it anywhere public; but the blog post I found later contains a link to a query that shows the same symptoms, and which is luckily still available online: https://gist.github.com/saicitus/251ba20b211e9e73285af35e61b19580 I attach it here in case it goes missing sometime. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
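The deferred-JIT idea discussed in this thread (compile a node's expressions only once a cost/row threshold is exceeded) can be sketched with a per-node dispatch pointer. This is purely illustrative, not PostgreSQL's executor: the struct and names below are invented, and "compilation" is modeled by swapping in a different function:

```c
#include <assert.h>

/* Hypothetical sketch: dispatch through a per-node function pointer
 * that starts on an interpreted path and is replaced by a "compiled"
 * implementation only after a row-count threshold is crossed. */

struct expr_state;
typedef long (*eval_fn) (struct expr_state *state);

struct expr_state
{
    eval_fn     evaluate;       /* current dispatch target */
    long        rows_seen;
    long        jit_threshold;  /* would be derived from plan cost */
    long        input;
};

/* Stands in for the specialized code a JIT compiler would emit. */
static long
eval_compiled(struct expr_state *state)
{
    return state->input * 2;
}

/* Slow interpreted path; triggers "compilation" once the node is hot. */
static long
eval_interpreted(struct expr_state *state)
{
    if (++state->rows_seen >= state->jit_threshold)
        state->evaluate = eval_compiled;
    return state->input * 2;    /* same semantics either way */
}
```

Each row the executor would simply call state->evaluate(state); nodes that never process enough rows (the mis-planned cases Alvaro and Luc describe) then never pay any compilation cost, which is exactly why cheap per-module emission matters.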
Re: Probable memory leak with ECPG and AIX
On Sat, Jul 02, 2022 at 11:59:58PM -0400, Tom Lane wrote: > Confirmed that either initializing ecpg_clocale or adding -fno-common > allows the ecpglib build to succeed. (Testing it beyond that would > require another hour or so to build the rest of the system, so I won't.) > > As I said, I was already leaning to the idea that initializing the > variable explicitly is better style, so I recommend we do that. Works for me. Pushed that way, and things have been clean.
Re: [PoC] Improve dead tuple storage for lazy vacuum
On Tue, Jun 28, 2022 at 10:10 PM John Naylor wrote: > > On Tue, Jun 28, 2022 at 1:24 PM Masahiko Sawada wrote: > > > > > I > > > suspect other optimizations would be worth a lot more than using AVX2: > > > - collapsing inner nodes > > > - taking care when constructing the key (more on this when we > > > integrate with VACUUM) > > > ...and a couple Andres mentioned: > > > - memory management: in > > > https://www.postgresql.org/message-id/flat/20210717194333.mr5io3zup3kxahfm%40alap3.anarazel.de > > > - node dispatch: > > > https://www.postgresql.org/message-id/20210728184139.qhvx6nbwdcvo63m6%40alap3.anarazel.de > > > > > > Therefore, I would suggest that we use SSE2 only, because: > > > - portability is very easy > > > - to avoid a performance hit from indirecting through a function pointer > > > > Okay, I'll try these optimizations and see if the performance becomes > > better. > > FWIW, I think it's fine if we delay these until after committing a > good-enough version. The exception is key construction and I think > that deserves some attention now (more on this below). Agreed. > > > I've done benchmark tests while changing the node types. The code base > > is v3 patch that doesn't have the optimization you mentioned below > > (memory management and node dispatch) but I added the code to use SSE2 > > for node-16 and node-32. > > Great, this is helpful to visualize what's going on! > > > * sse2_4_16_48_256 > > * nkeys = 9091, height = 3, n4 = 0, n16 = 0, n48 = 512, n256 = > > 916433 > > * nkeys = 2, height = 3, n4 = 2, n16 = 0, n48 = 207, n256 = 50 > > > > * sse2_4_32_128_256 > > * nkeys = 9091, height = 3, n4 = 0, n32 = 285, n128 = 916629, n256 > > = 31 > > * nkeys = 2, height = 3, n4 = 2, n32 = 48, n128 = 208, n256 = 1 > > > Observations are: > > > > In both test cases, There is not much difference between using AVX2 > > and SSE2. The more mode types, the more time it takes for loading the > > data (see sse2_4_16_32_128_256). > > Good to know. 
And as Andres mentioned in his PoC, more node types > would be a barrier for pointer tagging, since 32-bit platforms only > have two spare bits in the pointer. > > > In dense case, since most nodes have around 100 children, the radix > > tree that has node-128 had a good figure in terms of memory usage. On > > Looking at the node stats, and then your benchmark code, I think key > construction is a major influence, maybe more than node type. The > key/value scheme tested now makes sense: > > blockhi || blocklo || 9 bits of item offset > > (with the leaf nodes containing a bit map of the lowest few bits of > this whole thing) > > We want the lower fanout nodes at the top of the tree and higher > fanout ones at the bottom. So more inner nodes can fit in CPU cache, right? > > Note some consequences: If the table has enough columns such that much > fewer than 100 tuples fit on a page (maybe 30 or 40), then in the > dense case the nodes above the leaves will have lower fanout (maybe > they will fit in a node32). Also, the bitmap values in the leaves will > be more empty. In other words, many tables in the wild *resemble* the > sparse case a bit, even if truly all tuples on the page are dead. > > Note also that the dense case in the benchmark above has ~4500 times > more keys than the sparse case, and uses about ~1000 times more > memory. But the runtime is only 2-3 times longer. That's interesting > to me. > > To optimize for the sparse case, it seems to me that the key/value would be > > blockhi || 9 bits of item offset || blocklo > > I believe that would make the leaf nodes more dense, with fewer inner > nodes, and could drastically speed up the sparse case, and maybe many > realistic dense cases. Does it have an effect on the number of inner nodes? > I'm curious to hear your thoughts. Thank you for your analysis. It's worth trying. We use 9 bits for item offset but most pages don't use all bits in practice. 
So probably it might be better to move the most significant bit of item offset to the left of blockhi. Or more simply: 9 bits of item offset || blockhi || blocklo > > > the other hand, the radix tree that doesn't have node-128 has a better > > number in terms of insertion performance. This is probably because we > > need to iterate over 'isset' flags from the beginning of the array in > > order to find an empty slot when inserting new data. We do the same > > thing also for node-48 but it was better than node-128 as it's up to > > 48. > > I mentioned in my diff, but for those following along, I think we can > improve that by iterating over the bytes and if it's 0xFF all 8 bits > are set already so keep looking... Right. Using 0xFF also makes the code readable so I'll change that. > > > In terms of lookup performance, the results vary but I could not find > > any common pattern that makes the performance better or worse. Getting > > more statistics such as the number of each node type per tree level > > might
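The two key layouts under discussion can be sketched as follows. The widths and helper names are illustrative only (not the patch's actual encoding), but they show why the position of the 9 offset bits changes which TIDs share radix-tree paths:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the two TID key layouts: a radix tree shares inner nodes
 * between keys with equal high-order bytes, so where the offset bits
 * sit decides which TIDs cluster into the same leaves. */

#define OFFSET_BITS 9

/* current scheme: blockhi || blocklo || offset */
static uint64_t
key_offset_low(uint32_t block, uint32_t offset)
{
    return ((uint64_t) block << OFFSET_BITS) | offset;
}

/* proposed variant: offset || blockhi || blocklo */
static uint64_t
key_offset_high(uint32_t block, uint32_t offset)
{
    return ((uint64_t) offset << 32) | block;
}
```

With the offset bits hoisted to the top, dead tuples at the same offset on adjacent blocks differ only in the lowest key byte, so they fall into the same leaf; with the offset at the bottom, they diverge one level higher, which is the sparse-case bloat described above.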
Re: Perform streaming logical transactions by background workers and parallel apply
Below are some review comments for patch v14-0003: v14-0003 3.1 Commit message If any of the following checks are violated, an error will be reported. 1. The unique columns between publisher and subscriber are difference. 2. There is any non-immutable function present in expression in subscriber's relation. Check from the following 4 items: a. The function in triggers; b. Column default value expressions and domain constraints; c. Constraint expressions. d. The foreign keys. SUGGESTION (rewording to match the docs and the code). Add some checks before using apply background worker to apply changes. streaming=parallel mode has two requirements: 1) The unique columns must be the same between publisher and subscriber 2) There cannot be any non-immutable functions in the subscriber-side replicated table. Look for functions in the following places: * a. Trigger functions * b. Column default value expressions and domain constraints * c. Constraint expressions * d. Foreign keys == 3.2 doc/src/sgml/ref/create_subscription.sgml + To run in this mode, there are following two requirements. The first + is that the unique column should be the same between publisher and + subscriber; the second is that there should not be any non-immutable + function in subscriber-side replicated table. SUGGESTION Parallel mode has two requirements: 1) the unique columns must be the same between publisher and subscriber; 2) there cannot be any non-immutable functions in the subscriber-side replicated table. == 3.3 .../replication/logical/applybgworker.c - apply_bgworker_relation_check + * Check if changes on this logical replication relation can be applied by + * apply background worker. SUGGESTION Check if changes on this relation can be applied by an apply background worker. ~~~ 3.4 + * Although we maintains the commit order by allowing only one process to + * commit at a time, our access order to the relation has changed. 
SUGGESTION Although the commit order is maintained by only allowing one process to commit at a time, the access order to the relation has changed. ~~~ 3.5 + /* Check only we are in apply bgworker. */ + if (!am_apply_bgworker()) + return; SUGGESTION /* Skip check if not an apply background worker. */ ~~~ 3.6 + /* + * If it is a partitioned table, we do not check it, we will check its + * partition later. + */ This comment is lacking useful details: /* Partition table checks are done later in (?) */ ~~~ 3.7 + if (!rel->sameunique) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot replicate relation with different unique index"), + errhint("Please change the streaming option to 'on' instead of 'parallel'."))); Maybe the first message should change slightly so it is worded consistently with the other one. SUGGESTION errmsg("cannot replicate relation. Unique indexes must be the same"), == 3.8 src/backend/replication/logical/proto.c -#define LOGICALREP_IS_REPLICA_IDENTITY 1 +#define LOGICALREP_IS_REPLICA_IDENTITY 0x0001 +#define LOGICALREP_IS_UNIQUE 0x0002 I think these constants should be named differently to reflect that they are just attribute flags. They should use similar bitset styles to the other nearby constants. SUGGESTION #define ATTR_IS_REPLICA_IDENTITY (1 << 0) #define ATTR_IS_UNIQUE (1 << 1) ~~~ 3.9 src/backend/replication/logical/proto.c - logicalrep_write_attrs This big slab of new code to get the BMS looks very similar to RelationGetIdentityKeyBitmap. So perhaps this code should be encapsulated in another function like that one (in relcache.c?) and then just called from logicalrep_write_attrs == 3.10 src/backend/replication/logical/relation.c - logicalrep_relmap_reset_volatility_cb +/* + * Reset the flag volatility of all existing entry in the relation map cache. 
+ */ +static void +logicalrep_relmap_reset_volatility_cb(Datum arg, int cacheid, uint32 hashvalue) SUGGESTION Reset the volatility flag of all entries in the relation map cache. ~~~ 3.11 src/backend/replication/logical/relation.c - logicalrep_rel_mark_safe_in_apply_bgworker +/* + * Check if unique index/constraint matches and mark sameunique and volatility + * flag. + * + * Don't throw any error here just mark the relation entry as not sameunique or + * FUNCTION_NONIMMUTABLE as we only check these in apply background worker. + */ +static void +logicalrep_rel_mark_safe_in_apply_bgworker(LogicalRepRelMapEntry *entry) SUGGESTION Check if unique index/constraint matches and assign 'sameunique' flag. Check if there are any non-immutable functions and assign the 'volatility' flag. Note: Don't throw any error here - these flags will be checked in the apply background worker. ~~~ 3.12 src/backend/replication/logical/relation.c - logicalrep_rel_mark_safe_in_apply_bgworker I did not really understand why you used an enum for one flag (volatility) but not the other one (sameunique); shouldn’t bot
Re: Handle infinite recursion in logical replication setup
On Mon, Jul 4, 2022 at 3:59 AM Peter Smith wrote: > > On Mon, Jul 4, 2022 at 12:59 AM vignesh C wrote: > ... > > > 2. > > > /* ALTER SUBSCRIPTION SET ( */ > > > else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && > > > TailMatches("SET", "(")) > > > - COMPLETE_WITH("binary", "slot_name", "streaming", > > > "synchronous_commit", "disable_on_error"); > > > + COMPLETE_WITH("binary", "origin", "slot_name", "streaming", > > > "synchronous_commit", "disable_on_error"); > > > /* ALTER SUBSCRIPTION SKIP ( */ > > > else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && > > > TailMatches("SKIP", "(")) > > > COMPLETE_WITH("lsn"); > > > @@ -3152,7 +3152,7 @@ psql_completion(const char *text, int start, int > > > end) > > > /* Complete "CREATE SUBSCRIPTION ... WITH ( " */ > > > else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", > > > "(")) > > > COMPLETE_WITH("binary", "connect", "copy_data", "create_slot", > > > - "enabled", "slot_name", "streaming", > > > + "enabled", "origin", "slot_name", "streaming", > > > "synchronous_commit", "two_phase", "disable_on_error"); > > > > > > Why do you choose to add a new option in-between other parameters > > > instead of at the end which we normally do? The one possible reason I > > > can think of is that all the parameters at the end are boolean so you > > > want to add this before those but then why before slot_name, and again > > > I don't see such a rule being followed for other parameters. > > > > I was not sure if it should be maintained in alphabetical order, > > anyway since the last option "disable_on_error" is at the end, I have > > changed it to the end. > > > > Although it seems it is not a hard rule, mostly the COMPLETE_WITH are > coded using alphabetical order. Anyway, I think that was a clear > intention here too since 13 of 14 parameters were already in > alphabetical order; it is actually only that "disable_on_error" > parameter that was misplaced; not the new "origin" parameter. 
> Agreed, but let's not change disable_on_error as part of this patch. -- With Regards, Amit Kapila.
Re: "ERROR: latch already owned" on gharial
On Wed, Jun 1, 2022 at 12:55 AM Robert Haas wrote: > OK, I have access to the box now. I guess I might as well leave the > crontab jobs enabled until the next time this happens, since Thomas > just took steps to improve the logging, but I do think these BF > members are overdue to be killed off, and would like to do that as > soon as it seems like a reasonable step to take. A couple of months later, there has been no repeat of that error. I'd happily forget about that and move on, if you want to decommission these.
Re: Too-long socket paths are breaking several buildfarm members
On Sun, Jul 03, 2022 at 09:40:23PM -0400, Tom Lane wrote: > Yeah, I just came to the same conclusion and pushed an equivalent > patch. Sorry for the duplicated effort. No problem. Thanks for the quick fix. -- Michael
Re: pgsql: dshash: Add sequential scan support.
[Re-directing to -hackers] On Fri, Mar 11, 2022 at 2:27 PM Andres Freund wrote: > On 2022-03-10 20:09:56 -0500, Tom Lane wrote: > > Andres Freund writes: > > > dshash: Add sequential scan support. > > > Add ability to scan all entries sequentially to dshash. The interface is > > > similar but a bit different both from that of dynahash and simple dshash > > > search functions. The most significant differences is that dshash's > > > interface > > > always needs a call to dshash_seq_term when scan ends. > > > > Umm ... what about error recovery? Or have you just cemented the > > proposition that long-lived dshashes are unsafe? > > I don't think this commit made it worse. dshash_seq_term() releases an lwlock > (which will be released in case of an error) and unsets > hash_table->find_[exclusively_]locked. The latter weren't introduced by this > patch, and are also set by dshash_find(). > > I agree that ->find_[exclusively_]locked are problematic from an error > recovery perspective. Right, as seen in the build farm at [1]. Also reproducible with something like: @@ -269,6 +269,14 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, return false; } + /* XXX random fault injection */ + if (op == DSM_OP_ATTACH && random() < RAND_MAX / 8) + { + close(fd); + elog(ERROR, "chaos"); + return false; + } + I must have thought that it was easy and practical to write no-throw straight-line code and be sure to reach dshash_release_lock(), but I concede that it was a bad idea: even dsa_get_address() can throw*, and you're often likely to need to call that while accessing dshash elements. For example, in lookup_rowtype_tupdesc_internal(), there is a sequence dshash_find(), ..., dsa_get_address(), ..., dshash_release_lock(), and I must have considered the range of code between find and release to be no-throw, but now I know that it is not. > It's per-backend state at least and just used for assertions. We could remove > it. 
Or stop checking it in places where it could be set wrongly: dshash_find() > and dshash_detach() couldn't check anymore, but the rest of the assertions > would still be valid afaics? Yeah, it's all for assertions... let's just remove it. Those assertions were useful to me at some stage in development but won't hold as well as I thought, at least without widespread PG_FINALLY(), which wouldn't be nice. *dsa_get_address() might need to adjust the memory map with system calls, which might fail. If you think of DSA as not only an allocator but also a poor man's user level virtual memory scheme to tide us over until we get threads, then this is a pretty low level kind of should-not-happen failure that is analogous on some level to SIGBUS or SIGSEGV or something like that, and we should PANIC. Then we could claim that dsa_get_address() is no-throw. At least, that was one argument I had with myself while investigating that strange Solaris shm_open() failure, but ... I lost the argument. It's quite an extreme position to take just to support these assertions, which are of pretty limited value. [1] https://www.postgresql.org/message-id/20220701232009.jcwxpl45bptaxv5n%40alap3.anarazel.de From e3e87a9ec39d3456a7bf29796102502d3724993e Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Mon, 4 Jul 2022 11:56:02 +1200 Subject: [PATCH] Remove spurious assertions from dshash.c. dshash.c maintained flags to track locking, in order to make assertions that the dshash API was being used correctly. Unfortunately the interaction with ereport's non-local exits was not thought through carefully enough. While lwlocks are released automatically, the flags can get out of sync. Since they're only used for assertions anyway, we can just remove them. This problem was noted by Tom and Andres while reviewing changes to support the new shared memory stats system in 15, the second user of dshash.c. 
On closer inspection, even the earlier typcache.c code is not guaranteed to meet the asserted conditions, now that it's been highlighted that even dsa_get_address() can throw (albeit in unlikely circumstances). Back-patch to 11, where dshash.c arrived. Reported-by: Tom Lane Reported-by: Andres Freund Discussion: https://postgr.es/m/20220311012712.botrpsikaufzt...@alap3.anarazel.de diff --git a/src/backend/lib/dshash.c b/src/backend/lib/dshash.c index ec454b4d65..9706e7926b 100644 --- a/src/backend/lib/dshash.c +++ b/src/backend/lib/dshash.c @@ -110,8 +110,6 @@ struct dshash_table dshash_table_control *control; /* Control object in DSM. */ dsa_pointer *buckets; /* Current bucket pointers in DSM. */ size_t size_log2; /* log2(number of buckets) */ - bool find_locked; /* Is any partition lock held by 'find'? */ - bool find_exclusively_locked; /* ... exclusively? */ }; /* Given a pointer to an item, find the entry (user data) it holds. */ @@ -234,9 +232,6 @@ dshash_create(dsa_area *area, const dshash_para
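The failure mode being fixed here can be demonstrated outside PostgreSQL with plain setjmp/longjmp. This is a generic illustration, not dshash code (the stub names are invented): a manually maintained flag goes stale when the error path jumps over the cleanup, even though a real lwlock would be released automatically:

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Generic illustration of why assertion-only locking flags are
 * fragile under non-local error exits (as with ereport(ERROR)). */

static jmp_buf error_ctx;
static bool find_locked = false;

/* Models dsa_get_address() unexpectedly throwing. */
static void
dsa_get_address_stub(bool do_throw)
{
    if (do_throw)
        longjmp(error_ctx, 1);  /* models ereport(ERROR) */
}

static void
access_entry(bool do_throw)
{
    find_locked = true;         /* as dshash_find() would set it */
    dsa_get_address_stub(do_throw);
    find_locked = false;        /* as dshash_release_lock() would clear it */
}
```

On the error path the flag is left set, so a later dshash_find() in the same backend would trip the (now-removed) assertion even though no lock is actually held.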
Re: Issue with pg_stat_subscription_stats
On Sat, Jul 2, 2022 at 9:52 AM Masahiko Sawada wrote: > > > > On Sat, Jul 2, 2022 at 2:53 Andres Freund wrote: >> >> Hi, >> >> On 2022-07-01 16:08:48 +0900, Masahiko Sawada wrote: >> > Yes, my point is that it may be misleading that the subscription stats >> > are created when a subscription is created. >> >> I think it's important to create stats at that time, because otherwise it's >> basically impossible to ensure that stats are dropped when a transaction >> rolls >> back. If some / all columns should return something else before stats are >> reported that can be addressed easily by tracking that in a separate field. >> >> >> > On the other hand, I'm not sure we agreed that the behavior that >> > Melanie reported is not a problem. The user might get confused since >> > the subscription stats works differently than other stats when a >> > reset. Previously, the primary reason why I hesitated to create the >> > subscription stats when creating a subscription is that CREATE >> > SUBSCRIPTION (with create_slot = false) can be rolled back. But with >> > the shmem stats, we can easily resolve it by using >> > pgstat_create_transactional(). >> >> Yep. >> >> >> > > > The second problem is that the following code in DropSubscription() >> > > > should be updated: >> > > > >> > > > /* >> > > > * Tell the cumulative stats system that the subscription is >> > > > getting >> > > > * dropped. We can safely report dropping the subscription >> > > > statistics here >> > > > * if the subscription is associated with a replication slot since >> > > > we >> > > > * cannot run DROP SUBSCRIPTION inside a transaction block. >> > > > Subscription >> > > > * statistics will be removed later by (auto)vacuum either if it's >> > > > not >> > > > * associated with a replication slot or if the message for >> > > > dropping the >> > > > * subscription gets lost. 
>> > > > */ >> > > > if (slotname) >> > > > pgstat_drop_subscription(subid); >> > > > >> > > > I think we can call pgstat_drop_subscription() even if slotname is >> > > > NULL and need to update the comment. >> > > > >> > > >> > > +1. >> > > >> > > > IIUC autovacuum is no longer >> > > > responsible for garbage collection. >> > > > >> > > >> > > Right, this is my understanding as well. >> > >> > Thank you for the confirmation. >> >> Want to propose a patch? > > > Yes, I’ll propose a patch. > I've attached the patch, fix_drop_subscriptions_stats.patch, to fix it. I've also attached another PoC patch, poc_create_subscription_stats.patch, to create the stats entry when creating the subscription, which addresses the issue reported in this thread: pg_stat_reset_subscription_stats() doesn't update stats_reset if no error has been reported yet. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ fix_drop_subscription_stats.patch Description: Binary data poc_create_subscription_stats.patch Description: Binary data
Re: Too-long socket paths are breaking several buildfarm members
Michael Paquier writes: > There is PostgreSQL::Test::Utils::tempdir_short for that, which is > what all the nodes created in Cluster.pm use for > unix_socket_directories. One way to address the issue would be to > pass that to pg_upgrade with --socketdir, as of the attached. Yeah, I just came to the same conclusion and pushed an equivalent patch. Sorry for the duplicated effort. regards, tom lane
Re: Too-long socket paths are breaking several buildfarm members
On Sun, Jul 03, 2022 at 07:22:11PM -0400, Tom Lane wrote: > That path name is 3 bytes over the platform limit. Evidently, > "REL_15_STABLE" is just enough longer than "HEAD" to make this fail, > whereas we didn't see the problem as long as the test case only > ran in HEAD. That tells enough about UNIXSOCK_PATH_BUFLEN. It looks like test.sh has been using for ages /tmp/pg_upgrade_check* as socket directory to counter this issue. > Members butterflyfish, massasauga, and myna likewise have yet to pass > this test in REL_15_STABLE, though they're perfectly happy in HEAD. > They are returning cut-down logs that don't allow diagnosing for > certain, but a reasonable bet is that it's the same kind of problem. Hmm. That's possible. > I think that the conversion of pg_upgrade's test script to TAP > form missed a bet. IIRC, we have mechanism somewhere to ensure > that test socket path names are created under /tmp, or someplace else > that's not subject to possibly-long paths of installation directories. > That's evidently not being used here. There is PostgreSQL::Test::Utils::tempdir_short for that, which is what all the nodes created in Cluster.pm use for unix_socket_directories. One way to address the issue would be to pass that to pg_upgrade with --socketdir, as of the attached. -- Michael diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl index 67e0be6856..8dbda0950e 100644 --- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl +++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl @@ -216,6 +216,9 @@ chdir ${PostgreSQL::Test::Utils::tmp_check}; # Upgrade the instance. $oldnode->stop; +# Use a short path for socket directories. +my $socketdir = PostgreSQL::Test::Utils::tempdir_short; + # Cause a failure at the start of pg_upgrade, this should create the logging # directory pg_upgrade_output.d but leave it around. Keep --check for an # early exit. 
@@ -228,6 +231,7 @@ command_fails( '-B', $newbindir, '-p', $oldnode->port, '-P', $newnode->port, + '-s', $socketdir, '--check' ], 'run of pg_upgrade --check for new instance with incorrect binary path'); @@ -241,7 +245,8 @@ command_ok( 'pg_upgrade', '--no-sync','-d', $oldnode->data_dir, '-D', $newnode->data_dir, '-b', $oldbindir, '-B', $newbindir, '-p', $oldnode->port, - '-P', $newnode->port, '--check' + '-P', $newnode->port, '-s', $socketdir, + '--check' ], 'run of pg_upgrade --check for new instance'); ok(!-d $newnode->data_dir . "/pg_upgrade_output.d", @@ -253,7 +258,7 @@ command_ok( 'pg_upgrade', '--no-sync','-d', $oldnode->data_dir, '-D', $newnode->data_dir, '-b', $oldbindir, '-B', $newbindir, '-p', $oldnode->port, - '-P', $newnode->port + '-P', $newnode->port, '-s', $socketdir, ], 'run of pg_upgrade for new instance'); ok( !-d $newnode->data_dir . "/pg_upgrade_output.d", signature.asc Description: PGP signature
Re: AIX support - alignment issues
Andres Freund writes: > On 2022-07-03 20:08:19 -0400, Tom Lane wrote: >> I do still feel that HPPA is of interest, to keep us honest >> about spinlock support > I.e. forgetting to initialize them? Or the weird alignment stuff it has? The nonzero initialization mainly, and to a lesser extent the weird size of a lock. I think the fact that the active word is only part of the lock struct is pretty well encapsulated. > I'd started to work a patch to detect missing initialization for both > spinlocks and lwlocks, I think that'd be good to have for more common cases. No objection to having more than one check for this ;-) regards, tom lane
Re: AIX support - alignment issues
Hi, On 2022-07-03 20:08:19 -0400, Tom Lane wrote: > I would have preferred to keep pademelon, with its pre-C99 compiler, going > until v11 is EOL, but that ain't happening. I'm not too worried about that - clang with -std=c89 -Wc99-extensions -Werror=c99-extensions as it's running on mylodon for the older branches seems to do a decent job. And is obviously much faster :) > I would not stand in the way of dropping HP-UX and IA64 support as of > v16. Cool. > I do still feel that HPPA is of interest, to keep us honest > about spinlock support I.e. forgetting to initialize them? Or the weird alignment stuff it has? I'd started to work a patch to detect missing initialization for both spinlocks and lwlocks, I think that'd be good to have for more common cases. Greetings, Andres Freund
Re: AIX support - alignment issues
Hi, On 2022-07-04 10:33:37 +1200, Thomas Munro wrote: > I don't have a dog in this race, but AIX is clearly not in the same > category as HP-UX (and maybe Solaris is somewhere in between). The reason to consider whether it's worth supporting AIX is that its library model is very different from other Unix-like platforms (much closer to Windows though). We also have dedicated compiler support for it, which I guess could separately be dropped. Greetings, Andres Freund
Re: AIX support - alignment issues
Thomas Munro writes: > On Sun, Jul 3, 2022 at 8:34 AM Tom Lane wrote: >> I am a little concerned though that we don't have access to the latest >> version of AIX --- that seems like a non-maintainable situation. > The release history doesn't look too bad on that front: the live > versions are 7.1 (2010-2023), 7.2 (2015-TBA) and 7.3 (2021-TBA). 7.3 > only came out half a year ago, slightly after Windows 11, which we > aren't testing yet either. Those GCC AIX systems seem to be provided > by IBM and the Open Source Lab at Oregon State University which has a > POWER lab providing ongoing CI services etc to various OSS projects, > so I would assume that upgrades (and retirement of the > about-to-be-desupported 7.1 system) will come along eventually. OK, we can wait awhile to see what happens on that. > I don't have a dog in this race, but AIX is clearly not in the same > category as HP-UX (and maybe Solaris is somewhere in between). AIX > runs on hardware you can buy today that got a major refresh last year > (Power 10), while HP-UX runs only on discontinued CPUs, so while it's > a no-brainer to drop HP-UX support, it's a trickier question for AIX. Yeah. FTR, I'm out of the HP-UX game: due to a hardware failure, I can no longer boot that installation. I would have preferred to keep pademelon, with its pre-C99 compiler, going until v11 is EOL, but that ain't happening. I see that EDB are still running a couple of HP-UX/IA64 animals, but I wonder if they're prepared to do anything to support those animals --- like, say, fix platform-specific bugs. Robert has definitely indicated displeasure with doing so, but I don't know if he makes the decisions on that. I would not stand in the way of dropping HP-UX and IA64 support as of v16. (I do still feel that HPPA is of interest, to keep us honest about spinlock support --- but that dual-stack arrangement that IA64 uses is surely not part of anyone's future.) I have no opinion either way about Solaris. regards, tom lane
Too-long socket paths are breaking several buildfarm members
Buildfarm member thorntail has yet to pass the pg_upgrade test in the REL_15_STABLE branch. It looks like the problem reduces to an overlength pathname: 2022-07-04 00:27:03.404 MSK [2212393:2] LOG: Unix-domain socket path "/home/nm/farm/sparc64_deb10_gcc_64_ubsan/REL_15_STABLE/pgsql.build/src/bin/pg_upgrade/tmp_check/.s.PGSQL.49714" is too long (maximum 107 bytes) That path name is 3 bytes over the platform limit. Evidently, "REL_15_STABLE" is just enough longer than "HEAD" to make this fail, whereas we didn't see the problem as long as the test case only ran in HEAD. Members butterflyfish, massasauga, and myna likewise have yet to pass this test in REL_15_STABLE, though they're perfectly happy in HEAD. They are returning cut-down logs that don't allow diagnosing for certain, but a reasonable bet is that it's the same kind of problem. I think that the conversion of pg_upgrade's test script to TAP form missed a bet. IIRC, we have a mechanism somewhere to ensure that test socket path names are created under /tmp, or someplace else that's not subject to possibly-long paths of installation directories. That's evidently not being used here. regards, tom lane
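The arithmetic in the report can be checked directly; a minimal C sketch (the directory name is taken from the log line above, and the 107-byte limit — sizeof(sun_path) minus the terminating NUL — is platform-specific):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Limit from the log message above: sizeof(sun_path) - 1 on that platform. */
#define UNIXSOCK_LIMIT 107

/* Build the socket path the server would use and return its length. */
size_t socket_path_len(const char *dir, int port, char *buf, size_t bufsz)
{
    snprintf(buf, bufsz, "%s/.s.PGSQL.%d", dir, port);
    return strlen(buf);
}
```

With the thorntail tmp_check directory this comes to 110 bytes — exactly the 3 bytes over the limit the log complains about — while a short temporary directory keeps the path comfortably under it.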
Re: AIX support - alignment issues
On Sun, Jul 3, 2022 at 8:34 AM Tom Lane wrote: > I am a little concerned though that we don't have access to the latest > version of AIX --- that seems like a non-maintainable situation. The release history doesn't look too bad on that front: the live versions are 7.1 (2010-2023), 7.2 (2015-TBA) and 7.3 (2021-TBA). 7.3 only came out half a year ago, slightly after Windows 11, which we aren't testing yet either. Those GCC AIX systems seem to be provided by IBM and the Open Source Lab at Oregon State University which has a POWER lab providing ongoing CI services etc to various OSS projects, so I would assume that upgrades (and retirement of the about-to-be-desupported 7.1 system) will come along eventually. I don't have a dog in this race, but AIX is clearly not in the same category as HP-UX (and maybe Solaris is somewhere in between). AIX runs on hardware you can buy today that got a major refresh last year (Power 10), while HP-UX runs only on discontinued CPUs, so while it's a no-brainer to drop HP-UX support, it's a trickier question for AIX. I guess the way open source is supposed to work is that someone with a real interest in PostgreSQL on AIX helps maintain it, not only keeping it building and passing tests, but making it work really well (cf huge pages, scalable event handling, probably more things that would be obvious to an AIX expert...), and representing ongoing demand and interests from the AIX user community...
Re: generate_series for timestamptz and time zone problem
=?UTF-8?Q?Przemys=c5=82aw_Sztoch?= writes: > I have problem with this: > -- Considering only built-in procs (prolang = 12), look for multiple uses > -- of the same internal function (ie, matching prosrc fields). It's OK to > -- have several entries with different pronames for the same internal > function, > -- but conflicts in the number of arguments and other critical items should > -- be complained of. (We don't check data types here; see next query.) It's telling you you're violating project style. Don't make multiple pg_proc entries point at the same C function and then use PG_NARGS to disambiguate; instead point at two separate functions. The functions can share code at the next level down, if they want. (Just looking at the patch, though, I wonder if sharing code is really beneficial in this case. It seems quite messy, and I wouldn't be surprised if it hurts performance in the existing case.) You also need to expend some more effort on refactoring code, to eliminate silliness like looking up the timezone name each time through the SRF. That's got to be pretty awful performance-wise. regards, tom lane
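The project-style point above — one C entry point per pg_proc row, with sharing pushed one level down instead of PG_NARGS() dispatch — can be sketched outside the server roughly like this (plain-C stand-ins; the names and the counting body are illustrative, not the actual fmgr wrappers or patch code):

```c
#include <stddef.h>
#include <stdint.h>

/* Shared worker: the "next level down" that both entry points call.
 * tzname is NULL when no time zone argument was supplied; the real code
 * would resolve it once here, not once per returned row. */
static int64_t series_count(int64_t start, int64_t stop, int64_t step,
                            const char *tzname)
{
    int64_t n = 0;

    (void) tzname;
    for (int64_t v = start; step > 0 ? v <= stop : v >= stop; v += step)
        n++;
    return n;
}

/* One wrapper per catalog entry: no argument-count dispatch needed. */
int64_t generate_series_3arg(int64_t start, int64_t stop, int64_t step)
{
    return series_count(start, stop, step, NULL);
}

int64_t generate_series_4arg(int64_t start, int64_t stop, int64_t step,
                             const char *tzname)
{
    return series_count(start, stop, step, tzname);
}
```

Tom's second point — not looking up the timezone name each time through the SRF — corresponds to resolving tzname a single time in the worker rather than per row.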
Re: Handle infinite recursion in logical replication setup
On Mon, Jul 4, 2022 at 12:59 AM vignesh C wrote: ... > > 2. > > /* ALTER SUBSCRIPTION SET ( */ > > else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && > > TailMatches("SET", "(")) > > - COMPLETE_WITH("binary", "slot_name", "streaming", > > "synchronous_commit", "disable_on_error"); > > + COMPLETE_WITH("binary", "origin", "slot_name", "streaming", > > "synchronous_commit", "disable_on_error"); > > /* ALTER SUBSCRIPTION SKIP ( */ > > else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && > > TailMatches("SKIP", "(")) > > COMPLETE_WITH("lsn"); > > @@ -3152,7 +3152,7 @@ psql_completion(const char *text, int start, int end) > > /* Complete "CREATE SUBSCRIPTION ... WITH ( " */ > > else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", > > "(")) > > COMPLETE_WITH("binary", "connect", "copy_data", "create_slot", > > - "enabled", "slot_name", "streaming", > > + "enabled", "origin", "slot_name", "streaming", > > "synchronous_commit", "two_phase", "disable_on_error"); > > > > Why do you choose to add a new option in-between other parameters > > instead of at the end which we normally do? The one possible reason I > > can think of is that all the parameters at the end are boolean so you > > want to add this before those but then why before slot_name, and again > > I don't see such a rule being followed for other parameters. > > I was not sure if it should be maintained in alphabetical order, > anyway since the last option "disable_on_error" is at the end, I have > changed it to the end. > Although it seems it is not a hard rule, mostly the COMPLETE_WITH are coded using alphabetical order. Anyway, I think that was a clear intention here too since 13 of 14 parameters were already in alphabetical order; it is actually only that "disable_on_error" parameter that was misplaced; not the new "origin" parameter. 
Also, in practice, those completions will get output in alphabetical order, so IMO it makes more sense for the code to be consistent with the output: e.g. test_sub=# create subscription sub xxx connection '' publication pub WITH ( BINARY CONNECT COPY_DATA CREATE_SLOT DISABLE_ON_ERROR ENABLED ORIGIN SLOT_NAME STREAMING SYNCHRONOUS_COMMIT TWO_PHASE test_sub=# -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Allowing REINDEX to have an optional name
Simon Riggs writes: > Thanks for the review, new version attached. This is marked as Ready for Committer, but that seems unduly optimistic. The cfbot shows that it's failing on all platforms --- and not in the same way on each, suggesting there are multiple problems. regards, tom lane
Re: JSON/SQL: jsonpath: incomprehensible error message
On 2022-06-30 Th 04:19, Amit Kapila wrote: > On Wed, Jun 29, 2022 at 6:58 PM Erik Rijkers wrote: >> Op 29-06-2022 om 15:00 schreef Amit Kapila: >>> On Mon, Jun 27, 2022 at 8:46 PM Andrew Dunstan wrote: On 2022-06-26 Su 11:44, Erik Rijkers wrote: > JSON/SQL jsonpath > > For example, a jsonpath string with deliberate typo 'like_regexp' > (instead of 'like_regex'): > > select js > from (values (jsonb '{}')) as f(js) > where js @? '$ ? (@ like_regexp "^xxx")'; > > ERROR: syntax error, unexpected IDENT_P at or near " " of jsonpath input > LINE 1: ...s from (values (jsonb '{}')) as f(js) where js @? '$ ? (@ > li... > > Both 'IDENT_P' and 'at or near " "' seem pretty useless. > >>> removing this. One thing that is not clear to me is why OP sees an >>> acceptable message (ERROR: syntax error, unexpected invalid token at >>> or near "=" of jsonpath input) for a similar query in 14? >> To mention that was perhaps unwise of me because The IDENT_P (or more >> generally, *_P) messages can be provoked on 14 too. >> > Okay, then I think it is better to backpatch this fix. Done. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: [PoC] Reducing planning time when tables have many partitions
Yuya Watari writes: > On Thu, Mar 24, 2022 at 11:03 AM David Rowley wrote: >> I think a better way to solve this would be just to have a single hash >> table over all EquivalenceClasses that allows fast lookups of >> EquivalenceMember->em_expr. > If the predicate were "em->em_expr == something", the hash table whose > key is em_expr would be effective. However, the actual predicates are > not of this type but the following. > // Find EquivalenceMembers whose relids is equal to the given relids > (1) bms_equal(em->em_relids, relids) > // Find EquivalenceMembers whose relids is a subset of the given relids > (2) bms_is_subset(em->em_relids, relids) Yeah, that's a really interesting observation, and I agree that David's suggestion doesn't address it. Maybe after we fix this problem, matching of em_expr would be the next thing to look at, but your results say it isn't the first thing. I'm not real thrilled with trying to throw hashtables at the problem, though. As David noted, they'd be counterproductive for simple queries. Sure, we could address that with duplicate code paths, but that's a messy and hard-to-tune approach. Also, I find the idea of hashing on all subsets of relids to be outright scary. "m is not so large in most cases" does not help when m *is* large. For the bms_equal class of lookups, I wonder if we could get anywhere by adding an additional List field to every RelOptInfo that chains all EquivalenceMembers that match that RelOptInfo's relids. The trick here would be to figure out when to build those lists. The simple answer would be to do it lazily on-demand, but that would mean a separate scan of all the EquivalenceMembers for each RelOptInfo; I wonder if there's a way to do better? Perhaps the bms_is_subset class could be handled in a similar way, ie do a one-time pass to make a List of all EquivalenceMembers that use a RelOptInfo. regards, tom lane
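The difference between the two predicates shows up even on a toy one-word relids representation (PostgreSQL's Bitmapset generalizes this to arbitrary widths): equality has an obvious hash key, but a subset probe has no single key — a match can be any of the subsets of the given relids — which is why a plain hash table keyed on em_relids would only cover case (1). A sketch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for Bitmapset: bit i set means relation i is in the set. */
typedef uint64_t Relids;

/* Analogue of bms_equal(): usable directly as a hash-table key match. */
bool relids_equal(Relids a, Relids b)
{
    return a == b;
}

/* Analogue of bms_is_subset(): a is a subset of b iff a has no bit that
 * b lacks. There is no single value to hash here, so a hash keyed on
 * em_relids cannot answer this without enumerating candidate subsets. */
bool relids_is_subset(Relids a, Relids b)
{
    return (a & ~b) == 0;
}
```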
Re: [PATCH] Completed unaccent dictionary with many missing characters
Michael Paquier wrote on 6/28/2022 7:14 AM: On Thu, Jun 23, 2022 at 02:10:42PM +0200, Przemysław Sztoch wrote: The only division that is probably possible is the one attached. Well, the addition of cyrillic does not make necessary the removal of SOUND RECORDING COPYRIGHT or the DEGREEs, that implies the use of a dictionnary when manipulating the set of codepoints, but that's me being too picky. Just to say that I am fine with what you are proposing here. By the way, could you add a couple of regressions tests for each patch with a sample of the characters added? U+210C is a particularly sensitive case, as we should really make sure that it maps to what we want even if Latin-ASCII.xml tells a different story. This requires the addition of a couple of queries in unaccent.sql with the expected output updated in unaccent.out. -- Michael Regression tests has been added. -- Przemysław Sztoch | Mobile +48 509 99 00 66 From bc0cbd36f5dfece10306466437d5de2dc676f485 Mon Sep 17 00:00:00 2001 From: Przemyslaw Sztoch Date: Thu, 23 Jun 2022 13:56:09 +0200 Subject: [PATCH 1/2] Update unnaccent rules generator add cirillic alpha add digits set->dict --- contrib/unaccent/expected/unaccent.out | 24 + contrib/unaccent/generate_unaccent_rules.py | 66 +- contrib/unaccent/sql/unaccent.sql |5 + contrib/unaccent/unaccent.rules | 1056 +++ 4 files changed, 1115 insertions(+), 36 deletions(-) diff --git a/contrib/unaccent/expected/unaccent.out b/contrib/unaccent/expected/unaccent.out index c1bd7cd897..090a0b3889 100644 --- a/contrib/unaccent/expected/unaccent.out +++ b/contrib/unaccent/expected/unaccent.out @@ -37,6 +37,18 @@ SELECT unaccent('À'); -- Remove combining diacritical 0x0300 A (1 row) +SELECT unaccent('ℌ'); -- Controversial conversion from Latin-ASCII.xml + unaccent +-- + x +(1 row) + +SELECT unaccent('Chrząszcza szczudłem przechrzcił wąż, Strząsa skrzydła z dżdżu,'); -- Polish +unaccent +- + Chrzaszcza szczudlem przechrzcil waz, Strzasa skrzydla z dzdzu, +(1 row) + SELECT 
unaccent('unaccent', 'foobar'); unaccent -- @@ -67,6 +79,12 @@ SELECT unaccent('unaccent', 'À'); A (1 row) +SELECT unaccent('unaccent', 'Łódź ąćęłńóśżź ĄĆĘŁŃÓŚŻŹ'); -- Polish letters + unaccent +-- + Lodz acelnoszz ACELNOSZZ +(1 row) + SELECT ts_lexize('unaccent', 'foobar'); ts_lexize --- @@ -97,3 +115,9 @@ SELECT ts_lexize('unaccent', 'À'); {A} (1 row) +SELECT unaccent('unaccent', '⅓ ⅜ ℃ ℉ ℗'); + unaccent +--- + 1/3 3/8 °C °F (P) +(1 row) + diff --git a/contrib/unaccent/generate_unaccent_rules.py b/contrib/unaccent/generate_unaccent_rules.py index c405e231b3..71932c8224 100644 --- a/contrib/unaccent/generate_unaccent_rules.py +++ b/contrib/unaccent/generate_unaccent_rules.py @@ -35,13 +35,16 @@ import xml.etree.ElementTree as ET sys.stdout = codecs.getwriter('utf8')(sys.stdout.buffer) # The ranges of Unicode characters that we consider to be "plain letters". -# For now we are being conservative by including only Latin and Greek. This -# could be extended in future based on feedback from people with relevant -# language knowledge. +# For now we are being conservative by including only Latin, Greek and Cyrillic. +# This could be extended in future based on feedback from people with +# relevant language knowledge. PLAIN_LETTER_RANGES = ((ord('a'), ord('z')), # Latin lower case (ord('A'), ord('Z')), # Latin upper case - (0x03b1, 0x03c9), # GREEK SMALL LETTER ALPHA, GREEK SMALL LETTER OMEGA - (0x0391, 0x03a9)) # GREEK CAPITAL LETTER ALPHA, GREEK CAPITAL LETTER OMEGA + (ord('0'), ord('9')), # Digits + (0x0391, 0x03a9), # Greek capital letters (ALPHA-OMEGA) + (0x03b1, 0x03c9), # Greek small letters (ALPHA-OMEGA) + (0x0410, 0x044f), # Cyrillic capital and small letters + (0x00b0, 0x00b0)) # Degree sign # Combining marks follow a "base" character, and result in a composite # character. 
Example: "U&'A\0300'"produces "À".There are three types of @@ -139,24 +142,24 @@ def get_plain_letter(codepoint, table): return codepoint # Should not come here -assert(False) +assert False, 'Codepoint U+%0.2X' % codepoint.id def is_ligature(codepoint, table): """Return true for letters combined with letters.""" -return all(is_letter(table[i], table) for i in codepoint.combining_ids) +return all(i in table and is_letter(table[i], table) for i in codepoint.combining_ids) def get_plain_letters(codepoint, table): """Return a list of plain letters from a ligature.""" -assert(is_ligature(codepoint
Re: Allow makeaclitem() to accept multiple privileges
"Tharakan, Robins" writes: > Presently, makeaclitem() allows only a single privilege in a single call. > This > patch allows it to additionally accept multiple comma-separated privileges. Seems reasonable. Pushed with minor cosmetic adjustments (mostly docs changes). regards, tom lane
Re: Fix proposal for comparaison bugs in PostgreSQL::Version
On Sun, 3 Jul 2022 10:40:21 -0400 Andrew Dunstan wrote: > On 2022-06-29 We 05:09, Jehan-Guillaume de Rorthais wrote: > > On Tue, 28 Jun 2022 18:17:40 -0400 > > Andrew Dunstan wrote: > > > >> On 2022-06-28 Tu 16:53, Jehan-Guillaume de Rorthais wrote: > >>> ... > >>> A better fix would be to store the version internally as version_num that > >>> are trivial to compute and compare. Please, find in attachment an > >>> implementation of this. > >>> > >>> The patch is a bit bigger because it improved the devel version to support > >>> rc/beta/alpha comparison like 14rc2 > 14rc1. > >>> > >>> Moreover, it adds a bunch of TAP tests to check various use cases. > >> > >> Nice catch, but this looks like massive overkill. I think we can very > >> simply fix the test in just a few lines of code, instead of a 190 line > >> fix and a 130 line TAP test. > > I explained why the patch was a little bit larger than required: it fixes > > the bugs and do a little bit more. The _version_cmp sub is shorter and > > easier to understand, I use multi-line code where I could probably fold > > them in a one-liner, added some comments... Anyway, I don't feel the number > > of line changed is "massive". But I can probably remove some code and > > shrink some other if it is really important... > > > > Moreover, to be honest, I don't mind the number of additional lines of TAP > > tests. Especially since it runs really, really fast and doesn't hurt > > day-to-day devs as it is independent from other TAP tests anyway. It could > > be 1k, if it runs fast, is meaningful and helps avoid future regressions, > > I would welcome the addition. > > > I don't see the point of having a TAP test at all. We have TAP tests for > testing the substantive products we test, not for the test suite > infrastructure. Otherwise, where will we stop? Shall we have tests for > the things that test the test suite? Tons of Perl modules have regression tests.
When questioning where testing should stop, it seems the Test::More module itself is not the last frontier: https://github.com/Test-More/test-more/tree/master/t Moreover, PostgreSQL::Version is not a TAP test module, but a module to deal with PostgreSQL versions and compare them. Testing makes development faster as well when it comes to testing the code. Instead of testing vaguely manually, you can test a whole bunch of situations and accumulate some more when you think about a new one or when a bug is reported. Having TAP tests helps to make sure the code works as expected. It helped me when creating my patch. With all due respect, I just don't understand your arguments against them. The arguments about the number of lines, or about where testing should stop, don't hold much. > > If we really want to save some bytes, I have a two lines worth of code fix > > that looks more readable to me than fixing _version_cmp: > > > > +++ b/src/test/perl/PostgreSQL/Version.pm > > @@ -92,9 +92,13 @@ sub new > > # Split into an array > > my @numbers = split(/\./, $arg); > > > > + # make sure all digit of the array-represented version are set so > > we can > > + # keep _version_cmp code as a "simple" digit-to-digit comparison > > loop > > + $numbers[$_] += 0 for 0..3; > > + > > # Treat development versions as having a minor/micro version one > > less than # the first released version of that branch. > > - push @numbers, -1 if ($devel); > > + $numbers[3] = -1 if $devel; > > > > $devel ||= "" > I don't see why this is any more readable. The _version_cmp sub is much more readable. But anyway, this is not the point. Using an array to compare versions where we can use version_num seems like a useless and buggy convolution to me. Regards,
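The bug class under discussion — comparing versions as digit arrays of unequal length — and the quoted two-line fix can be mirrored in a small C sketch (four fixed slots, zero-filled, with slot 3 forced to -1 for development versions, matching the Perl hunk above):

```c
/* Compare two versions held as fixed four-slot arrays. Callers must
 * zero-fill missing parts and set slot 3 to -1 for devel versions, so
 * e.g. 12devel = {12,0,0,-1} sorts before 12.0 = {12,0,0,0}. */
int version_cmp(const int a[4], const int b[4])
{
    for (int i = 0; i < 4; i++)
    {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}
```

Collapsing the four slots into a single integer, as the proposed version_num approach does, yields the same ordering with a plain numeric comparison and no padding step at all.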
Re: 15beta1 tab completion of extension versions
Noah Misch writes: > On Sun, Jul 03, 2022 at 02:00:59PM -0400, Tom Lane wrote: >> This will require one extra when what you want is to update to >> a specific version, but I doubt that that's going to bother anyone >> very much. I don't want to try to resurrect the v14 behavior exactly >> because it's too much of a mess from a quoting standpoint. > Works for me, and I agree the patch implements that successfully. "ALTER > EXTENSION x UPDATE;" is an infrequent command, and "ALTER EXTENSION x UPDATE > TO ..." is even less frequent. It's not worth much special effort to shave > steps. Done that way then. Thanks for the report. regards, tom lane
Re: automatically generating node support functions
Peter Eisentraut writes: > Here is a patch that reformats the relevant (and a few more) comments > that way. This has been run through pgindent, so the formatting should > be stable. Now that that's been pushed, the main patch is of course quite broken. Are you working on a rebase? I looked through the last published version of the main patch (Alvaro's 0002 from 2022-04-19), without trying to actually test it, and found a couple of things that look wrong in the Makefiles: * AFAICT, the infrastructure for removing the generated files at "make *clean" is incomplete. In particular I don't see any code for removing the symlinks or the associated stamp file during "make clean". It looks like the existing header symlinks are all cleaned up in src/include/Makefile's "clean" rule, so you could do likewise for these. Also, the "make maintainer-clean" infrastructure seems incomplete --- shouldn't src/backend/Makefile's maintainer-clean rule now also do $(MAKE) -C nodes $@ ? * There are some useful comments in backend/utils/Makefile that I think should be carried over along with the make rules that (it looks like) you cribbed from there, notably # fmgr-stamp records the last time we ran Gen_fmgrtab.pl. We don't rely on # the timestamps of the individual output files, because the Perl script # won't update them if they didn't change (to avoid unnecessary recompiles). # These generated headers must be symlinked into builddir/src/include/, # using absolute links for the reasons explained in src/backend/Makefile. # We use header-stamp to record that we've done this because the symlinks # themselves may appear older than fmgr-stamp. and something similar to this for the "clean" rule: # fmgroids.h, fmgrprotos.h, fmgrtab.c, fmgr-stamp, and errcodes.h are in the # distribution tarball, so they are not cleaned here. Also, I share David's upthread allergy to the option names "path_hackN" and to documenting those only inside the conversion script. 
I think the existing text that you moved into the script, such as this bit: # We do not print the parent, else we'd be in infinite # recursion. We can print the parent's relids for # identification purposes, though. We print the pathtarget # only if it's not the default one for the rel. We also do # not print the whole of param_info, since it's printed via # RelOptInfo; it's sufficient and less cluttering to print # just the required outer relids. is perfectly adequate as documentation, it just needs to be somewhere else (pathnodes.h seems fine, if not nodes.h) and labeled as to exactly which pg_node_attr option invokes which behavior. BTW, I think this: "Unknown attributes are ignored" is a seriously bad idea; it will allow typos to escape detection. regards, tom lane
Re: 15beta1 tab completion of extension versions
On Sun, Jul 03, 2022 at 02:00:59PM -0400, Tom Lane wrote: > I wrote: > > Noah Misch writes: > >> "ALTER EXTENSION hstore UPDATE;" is a valid command (updates to the > >> control file default version). Hence, I think the v14 behavior was better. > > > Hmm ... good point, let me think about that. > > After consideration, my preferred solution is just this: > > diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c > index 463cac9fb0..c5cafe6f4b 100644 > --- a/src/bin/psql/tab-complete.c > +++ b/src/bin/psql/tab-complete.c > @@ -1927,7 +1927,7 @@ psql_completion(const char *text, int start, int end) > > /* ALTER EXTENSION */ > else if (Matches("ALTER", "EXTENSION", MatchAny)) > - COMPLETE_WITH("ADD", "DROP", "UPDATE TO", "SET SCHEMA"); > + COMPLETE_WITH("ADD", "DROP", "UPDATE", "SET SCHEMA"); > > /* ALTER EXTENSION UPDATE */ > else if (Matches("ALTER", "EXTENSION", MatchAny, "UPDATE")) > > This will require one extra keystroke when what you want is to update to > a specific version, but I doubt that that's going to bother anyone > very much. I don't want to try to resurrect the v14 behavior exactly > because it's too much of a mess from a quoting standpoint. Works for me, and I agree the patch implements that successfully. "ALTER EXTENSION x UPDATE;" is an infrequent command, and "ALTER EXTENSION x UPDATE TO ..." is even less frequent. It's not worth much special effort to shave steps.
Re: 15beta1 tab completion of extension versions
I wrote: > Noah Misch writes: >> "ALTER EXTENSION hstore UPDATE;" is a valid command (updates to the >> control file default version). Hence, I think the v14 behavior was better. > Hmm ... good point, let me think about that. After consideration, my preferred solution is just this: diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index 463cac9fb0..c5cafe6f4b 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -1927,7 +1927,7 @@ psql_completion(const char *text, int start, int end) /* ALTER EXTENSION */ else if (Matches("ALTER", "EXTENSION", MatchAny)) - COMPLETE_WITH("ADD", "DROP", "UPDATE TO", "SET SCHEMA"); + COMPLETE_WITH("ADD", "DROP", "UPDATE", "SET SCHEMA"); /* ALTER EXTENSION UPDATE */ else if (Matches("ALTER", "EXTENSION", MatchAny, "UPDATE")) This will require one extra keystroke when what you want is to update to a specific version, but I doubt that that's going to bother anyone very much. I don't want to try to resurrect the v14 behavior exactly because it's too much of a mess from a quoting standpoint. regards, tom lane
Re: O(n) tasks cause lengthy startups and checkpoints
Hi, On 2022-07-03 10:07:54 -0700, Nathan Bossart wrote: > Thanks for the prompt review. > > On Sat, Jul 02, 2022 at 03:54:56PM -0700, Andres Freund wrote: > > On 2022-07-02 15:05:54 -0700, Nathan Bossart wrote: > >> + /* Obtain requested tasks */ > >> + SpinLockAcquire(&CustodianShmem->cust_lck); > >> + flags = CustodianShmem->cust_flags; > >> + CustodianShmem->cust_flags = 0; > >> + SpinLockRelease(&CustodianShmem->cust_lck); > > > > Just resetting the flags to 0 is problematic. Consider what happens if > > there's > > two tasks and the one processed first errors out. You'll lose > > information > > about needing to run the second task. > > I think we also want to retry any failed tasks. I don't think so, at least not if it's just going to retry that task straight away - then we'll get stuck on that one task forever. If we had the ability to "queue" it at the end, to be processed after other already dequeued tasks, it'd be a different story. > The way v6 handles this is by requesting all tasks after an exception. Ick. That strikes me as a bad idea. > >> +/* > >> + * RequestCustodian > >> + *Called to request a custodian task. > >> + */ > >> +void > >> +RequestCustodian(int flags) > >> +{ > >> + SpinLockAcquire(&CustodianShmem->cust_lck); > >> + CustodianShmem->cust_flags |= flags; > >> + SpinLockRelease(&CustodianShmem->cust_lck); > >> + > >> + if (ProcGlobal->custodianLatch) > >> + SetLatch(ProcGlobal->custodianLatch); > >> +} > > > > With this representation we can't really implement waiting for a task or > > such. And it doesn't seem like a great API for the caller to just specify a > > mix of flags. > > At the moment, the idea is that nothing should need to wait for a task > because the custodian only handles things that are relatively non-critical. Which is just plainly not true as the patchset stands...
I think we're going to have to block if some cleanup as part of a checkpoint hasn't been completed by the next checkpoint - otherwise it'll just end up being way too confusing and there's absolutely no backpressure anymore. > If that changes, this could probably be expanded to look more like > RequestCheckpoint(). > > What would you suggest using instead of a mix of flags? I suspect an array of tasks with requested and completed counters or such? With a condition variable to wait on? > >> + /* let the custodian know what it can remove */ > >> + CustodianSetLogicalRewriteCutoff(cutoff); > > > > Setting this variable in a custodian datastructure and then fetching it from > > there seems architecturally wrong to me. > > Where do you think it should go? I previously had it in the checkpointer's > shared memory, but you didn't like that the functions were declared in > bgwriter.h (along with the other checkpoint stuff). If the checkpointer > shared memory is the right place, should we create checkpointer.h and use > that instead? Well, so far I have not understood what the whole point of the shared state is, so i have a bit of a hard time answering this ;) > >> +/* > >> + * Remove all mappings not needed anymore based on the logical restart > >> LSN saved > >> + * by the checkpointer. We use this saved value instead of calling > >> + * ReplicationSlotsComputeLogicalRestartLSN() so that we don't interfere > >> with an > >> + * ongoing call to CheckPointLogicalRewriteHeap() that is flushing > >> mappings to > >> + * disk. > >> + */ > > > > What interference could there be? > > My concern is that the custodian could obtain a later cutoff than what the > checkpointer does, which might cause files to be concurrently unlinked and > fsync'd. If we always use the checkpointer's cutoff, that shouldn't be a > problem. This could probably be better explained in this comment. How about having a Datum argument to RequestCustodian() that is forwarded to the task? 
> >> +void
> >> +RemoveOldLogicalRewriteMappings(void)
> >> +{
> >> +    XLogRecPtr    cutoff;
> >> +    DIR        *mappings_dir;
> >> +    struct dirent *mapping_de;
> >> +    char        path[MAXPGPATH + 20];
> >> +    bool        value_set = false;
> >> +
> >> +    cutoff = CustodianGetLogicalRewriteCutoff(&value_set);
> >> +    if (!value_set)
> >> +        return;
> >
> > Afaics nothing clears values_set - is that a good idea?
>
> I'm using value_set to differentiate the case where InvalidXLogRecPtr means
> the checkpointer hasn't determined a value yet versus the case where it
> has.  In the former, we don't want to take any action.  In the latter, we
> want to unlink all the files.  Since we're moving to a request model for
> the custodian, I might be able to remove this value_set stuff completely.
> If that's not possible, it probably deserves a better comment.

It would.

Greetings,

Andres Freund
Re: replacing role-level NOINHERIT with a grant-level option
On Sat, Jul 02, 2022 at 11:04:28PM -0400, Robert Haas wrote:
> On Sat, Jul 2, 2022 at 6:16 PM Nathan Bossart wrote:
>> I was thinking that when DEFAULT was removed, pg_dump would just need to
>> generate WITH INHERIT TRUE/FALSE based on the value of rolinherit for older
>> versions.  Using the role-level property as the default for future grants
>> seems a viable strategy, although it would break backward compatibility.
>> For example, if I create a NOINHERIT role, grant a bunch of roles to it,
>> and then change it to INHERIT, the role won't begin inheriting the
>> privileges of the roles it is a member of.  Right now, it does.
>
> I think the idea you propose here is interesting, because I think it
> proves that committing v2 or something like it doesn't really lock us
> into the role-level property any more than we already are, which at
> least makes me feel slightly less bad about that option. However, if
> there's implacable opposition to any compatibility break at any point,
> then maybe this plan would never actually be implemented in practice.
> And if there's not, maybe we can be bolder now.

If by "bolder" you mean "mark [NO]INHERIT as deprecated-and-to-be-removed
and begin emitting WARNINGs when it and WITH INHERIT DEFAULT are used," I
think it's worth consideration.  I suspect it will be hard to sell removing
[NO]INHERIT in v16 because it would introduce a compatibility break without
giving users much time to migrate.  I could be wrong, though.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Re: PSA: Autoconf has risen from the dead
Hi,

On 2022-07-03 10:50:49 -0400, Tom Lane wrote:
> Robert Haas writes:
> > Hmm, I also don't know how annoying it's going to be to get the new
> > ninja/meson stuff working on macOS ... I really hope someone puts a
> > good set of directions on the wiki or in the documentation or
> > someplace.

Yea, I guess I should start a documentation section...

I've only used homebrew on mac, but with that it should be something along
the lines of

  brew install meson
  meson setup --buildtype debug -Dcassert=true build-directory
  cd build-directory
  ninja

of course if you want to build against some dependencies and / or run tap
tests, you need to do something similar to what you have to do for
configure. I.e.
- install perl modules [1]
- tell the build about location of homebrew [2]

> If you use MacPorts it's just "install those packages", and I imagine
> the same for Homebrew.  I've not tried build-from-source on modern
> platforms.

I've done some semi automated testing (to be turned fully automatic) across
meson versions that didn't so far show any need for that. We do require a
certain minimum version of meson (indicated in the top-level meson.build,
raises an error if not met), which in turn requires a minimum version of
ninja (also errors).

The windows build with msbuild is slower on older versions of meson that
are unproblematic on other platforms. But given you're not going to install
an outdated meson from $package-manager there, I don't think it's worth
worrying about.

Greetings,

Andres Freund

[1] https://github.com/anarazel/postgres/blob/meson/.cirrus.yml#L638
[2] https://github.com/anarazel/postgres/blob/meson/.cirrus.yml#L742
Re: SQL/JSON functions vs. ECPG vs. STRING as a reserved word
Noah Misch writes:
> On Mon, May 30, 2022 at 05:20:15PM -0400, Tom Lane wrote:
>> [allow EXEC SQL TYPE unreserved_keyword IS ...]

> I didn't locate any problems beyond the test and doc gaps that you mentioned,
> so I've marked this Ready for Committer.

Thanks!  Here's a fleshed-out version with doc changes, plus adjustment
of preproc/type.pgc so that it exposes the existing problem.  (No code
changes from v1.)  I'll push this in a few days if there are not
objections.

			regards, tom lane

diff --git a/doc/src/sgml/ecpg.sgml b/doc/src/sgml/ecpg.sgml
index 7f8b4dd5c0..0ba1bf93a6 100644
--- a/doc/src/sgml/ecpg.sgml
+++ b/doc/src/sgml/ecpg.sgml
@@ -1483,6 +1483,10 @@ EXEC SQL END DECLARE SECTION;
    Typedefs
+
+   typedef
+   in ECPG
+
    Use the typedef keyword to map new types to already
@@ -1497,8 +1501,38 @@ EXEC SQL END DECLARE SECTION;
 EXEC SQL TYPE serial_t IS long;
-   This declaration does not need to be part of a declare section.
+   This declaration does not need to be part of a declare section;
+   that is, you can also write typedefs as normal C statements.
+
+
+   Any word you declare as a typedef cannot be used as a SQL keyword
+   in EXEC SQL commands later in the same program.
+   For example, this won't work:
+
+EXEC SQL BEGIN DECLARE SECTION;
+typedef int start;
+EXEC SQL END DECLARE SECTION;
+...
+EXEC SQL START TRANSACTION;
+
+   ECPG will report a syntax error for START
+   TRANSACTION, because it no longer
+   recognizes START as a SQL keyword,
+   only as a typedef.
+
+
+
+   In PostgreSQL releases before v16, use
+   of SQL keywords as typedef names was likely to result in syntax
+   errors associated with use of the typedef itself, rather than use
+   of the name as a SQL keyword.  The new behavior is less likely to
+   cause problems when an existing ECPG application is recompiled in
+   a new PostgreSQL release with new
+   keywords.
+
+
diff --git a/src/interfaces/ecpg/preproc/ecpg.trailer b/src/interfaces/ecpg/preproc/ecpg.trailer
index b95fc44314..fba35f6be6 100644
--- a/src/interfaces/ecpg/preproc/ecpg.trailer
+++ b/src/interfaces/ecpg/preproc/ecpg.trailer
@@ -564,8 +564,29 @@ var_type: simple_type
 			$$.type_index = mm_strdup("-1");
 			$$.type_sizeof = NULL;
 		}
-	| ECPGColLabelCommon '(' precision opt_scale ')'
+	| NUMERIC '(' precision opt_scale ')'
 		{
+			$$.type_enum = ECPGt_numeric;
+			$$.type_str = mm_strdup("numeric");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+	| DECIMAL_P '(' precision opt_scale ')'
+		{
+			$$.type_enum = ECPGt_decimal;
+			$$.type_str = mm_strdup("decimal");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+	| IDENT '(' precision opt_scale ')'
+		{
+			/*
+			 * In C parsing mode, NUMERIC and DECIMAL are not keywords, so
+			 * they will show up here as a plain identifier, and we need
+			 * this duplicate code to recognize them.
+			 */
 			if (strcmp($1, "numeric") == 0)
 			{
 				$$.type_enum = ECPGt_numeric;
@@ -587,15 +608,98 @@ var_type: simple_type
 			$$.type_index = mm_strdup("-1");
 			$$.type_sizeof = NULL;
 		}
-	| ECPGColLabelCommon ecpg_interval
+	| VARCHAR
 		{
-			if (strlen($2) != 0 && strcmp ($1, "datetime") != 0 && strcmp ($1, "interval") != 0)
-				mmerror (PARSE_ERROR, ET_ERROR, "interval specification not allowed here");
+			$$.type_enum = ECPGt_varchar;
+			$$.type_str = EMPTY; /*mm_strdup("varchar");*/
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+	| FLOAT_P
+		{
+			/* Note: DOUBLE is handled in simple_type */
+			$$.type_enum = ECPGt_float;
+			$$.type_str = mm_strdup("float");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+	| NUMERIC
+		{
+			$$.type_enum = ECPGt_numeric;
+			$$.type_str = mm_strdup("numeric");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+	| DECIMAL_P
+		{
+			$$.type_enum = ECPGt_decimal;
+			$$.type_str = mm_strdup("decimal");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+	| TIMESTAMP
+		{
+			$$.type_enum = ECPGt_timestamp;
+			$$.type_str = mm_strdup("timestamp");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+	| INTERVAL ecpg_interval
+		{
+			$$.type_enum = ECPGt_interval;
+			$$.type_str = mm_strdup("interval");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+	| STRING
+		{
+			if (INFORMIX_MODE)
+			{
+				/* In Informix mode, "string" is automatically a typedef */
+				$$.type_enum = ECPGt
Re: O(n) tasks cause lengthy startups and checkpoints
Hi Andres,

Thanks for the prompt review.

On Sat, Jul 02, 2022 at 03:54:56PM -0700, Andres Freund wrote:
> On 2022-07-02 15:05:54 -0700, Nathan Bossart wrote:
>> +    /* Obtain requested tasks */
>> +    SpinLockAcquire(&CustodianShmem->cust_lck);
>> +    flags = CustodianShmem->cust_flags;
>> +    CustodianShmem->cust_flags = 0;
>> +    SpinLockRelease(&CustodianShmem->cust_lck);
>
> Just resetting the flags to 0 is problematic. Consider what happens if
> there's two tasks and the one processed first errors out. You'll lose
> information about needing to run the second task.

I think we also want to retry any failed tasks.  The way v6 handles this
is by requesting all tasks after an exception.  Another way to handle this
could be to reset each individual flag before the task is executed, and
then we could surround each one with a PG_CATCH block that resets the
flag.  I'll do it this way in the next revision.

>> +/*
>> + * RequestCustodian
>> + *    Called to request a custodian task.
>> + */
>> +void
>> +RequestCustodian(int flags)
>> +{
>> +    SpinLockAcquire(&CustodianShmem->cust_lck);
>> +    CustodianShmem->cust_flags |= flags;
>> +    SpinLockRelease(&CustodianShmem->cust_lck);
>> +
>> +    if (ProcGlobal->custodianLatch)
>> +        SetLatch(ProcGlobal->custodianLatch);
>> +}
>
> With this representation we can't really implement waiting for a task or
> such. And it doesn't seem like a great API for the caller to just specify a
> mix of flags.

At the moment, the idea is that nothing should need to wait for a task
because the custodian only handles things that are relatively
non-critical.  If that changes, this could probably be expanded to look
more like RequestCheckpoint().  What would you suggest using instead of a
mix of flags?
>> +        /* Calculate how long to sleep */
>> +        end_time = (pg_time_t) time(NULL);
>> +        elapsed_secs = end_time - start_time;
>> +        if (elapsed_secs >= CUSTODIAN_TIMEOUT_S)
>> +            continue;            /* no sleep for us */
>> +        cur_timeout = CUSTODIAN_TIMEOUT_S - elapsed_secs;
>> +
>> +        (void) WaitLatch(MyLatch,
>> +                         WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
>> +                         cur_timeout * 1000L /* convert to ms */ ,
>> +                         WAIT_EVENT_CUSTODIAN_MAIN);
>> +}
>
> I don't think we should have this thing wake up on a regular basis. We're
> doing way too much of that already, and I don't think we should add
> more. Either we need a list of times when tasks need to be processed and wake
> up at that time, or just wake up if somebody requests a task.

I agree.  I will remove the timeout in the next revision.

>> diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
>> index e67370012f..82aa0c6307 100644
>> --- a/src/backend/postmaster/postmaster.c
>> +++ b/src/backend/postmaster/postmaster.c
>> @@ -1402,7 +1402,8 @@ PostmasterMain(int argc, char *argv[])
>>       * Remove old temporary files.  At this point there can be no other
>>       * Postgres processes running in this directory, so this should be safe.
>>       */
>> -    RemovePgTempFiles();
>> +    RemovePgTempFiles(true, true);
>> +    RemovePgTempFiles(false, false);
>
> This is imo hard to read and easy to get wrong. Make it multiple functions or
> pass named flags in.

Will do.

>> + * StagePgTempDirForRemoval
>> + *
>> + * This function renames the given directory with a special prefix that
>> + * RemoveStagedPgTempDirs() will know to look for.  An integer is appended to
>> + * the end of the new directory name in case previously staged pgsql_tmp
>> + * directories have not yet been removed.
>> + */
>
> It doesn't seem great to need to iterate through a directory that contains
> other files, potentially a significant number. How about having a
> staged_for_removal/ directory, and then only scanning that?
Yeah, that seems like a good idea.  Will do.

>> +    /*
>> +     * Find a name for the stage directory.  We just increment an integer at the
>> +     * end of the name until we find one that doesn't exist.
>> +     */
>> +    for (int n = 0; n <= INT_MAX; n++)
>> +    {
>> +        snprintf(stage_path, sizeof(stage_path), "%s/%s%d", parent_path,
>> +                 PG_TEMP_DIR_TO_REMOVE_PREFIX, n);
>> +
>> +        if (stat(stage_path, &statbuf) != 0)
>> +        {
>> +            if (errno == ENOENT)
>> +                break;
>> +
>> +            ereport(LOG,
>> +                    (errcode_for_file_access(),
>> +                     errmsg("could not stat file \"%s\": %m", stage_path)));
>> +            return;
>> +        }
>> +
>> +        stage_path[0] = '\0';

I still disli
Re: PSA: Autoconf has risen from the dead
Robert Haas writes:
> Hmm, I also don't know how annoying it's going to be to get the new
> ninja/meson stuff working on macOS ... I really hope someone puts a
> good set of directions on the wiki or in the documentation or
> someplace.

If you use MacPorts it's just "install those packages", and I imagine
the same for Homebrew.  I've not tried build-from-source on modern
platforms.

One thing I think we lack data on is whether we're going to need a
policy similar to everyone-must-use-exactly-this-autoconf-version.
If we do, that will greatly raise the importance of building from
source.

			regards, tom lane
Re: PSA: Autoconf has risen from the dead
On Sat, Jul 2, 2022 at 1:42 PM Tom Lane wrote:
> On the whole, I'm questioning the value of messing with our autoconf
> infrastructure at this stage.  We did agree at PGCon that we'd keep
> it going for a couple years more, but it's not real clear to me why
> we can't limp along with 2.69 until we decide to drop it.

If building it on macOS is going to be annoying, then -1 from me for
upgrading to a new version until that's resolved.

Hmm, I also don't know how annoying it's going to be to get the new
ninja/meson stuff working on macOS ... I really hope someone puts a
good set of directions on the wiki or in the documentation or
someplace.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Fix proposal for comparison bugs in PostgreSQL::Version
On 2022-06-29 We 05:09, Jehan-Guillaume de Rorthais wrote:
> On Tue, 28 Jun 2022 18:17:40 -0400
> Andrew Dunstan wrote:
>
>> On 2022-06-28 Tu 16:53, Jehan-Guillaume de Rorthais wrote:
>>> ...
>>> A better fix would be to store the version internally as version_num that
>>> are trivial to compute and compare. Please, find in attachment an
>>> implementation of this.
>>>
>>> The patch is a bit bigger because it improved the devel version to support
>>> rc/beta/alpha comparison like 14rc2 > 14rc1.
>>>
>>> Moreover, it adds a bunch of TAP tests to check various use cases.
>>
>> Nice catch, but this looks like massive overkill. I think we can very
>> simply fix the test in just a few lines of code, instead of a 190 line
>> fix and a 130 line TAP test.
>
> I explained why the patch was a little bit larger than required: it fixes
> the bugs and does a little bit more. The _version_cmp sub is shorter and
> easier to understand, I use multi-line code where I could probably fold
> them in a one-liner, added some comments... Anyway, I don't feel the number
> of lines changed is "massive". But I can probably remove some code and
> shrink some other if it is really important...
>
> Moreover, to be honest, I don't mind the number of additional lines of TAP
> tests. Especially since it runs really, really fast and doesn't hurt
> day-to-day devs as it is independent from other TAP tests anyway. It could
> be 1k, if it runs fast, is meaningful and helps avoid future regressions;
> I would welcome the addition.

I don't see the point of having a TAP test at all. We have TAP tests for
testing the substantive products we test, not for the test suite
infrastructure. Otherwise, where will we stop? Shall we have tests for the
things that test the test suite?
> If we really want to save some bytes, I have a two lines worth of code fix
> that looks more readable to me than fixing _version_cmp:
>
> +++ b/src/test/perl/PostgreSQL/Version.pm
> @@ -92,9 +92,13 @@ sub new
>  	# Split into an array
>  	my @numbers = split(/\./, $arg);
>
> +	# make sure all digit of the array-represented version are set so we can
> +	# keep _version_cmp code as a "simple" digit-to-digit comparison loop
> +	$numbers[$_] += 0 for 0..3;
> +
>  	# Treat development versions as having a minor/micro version one less than
>  	# the first released version of that branch.
> -	push @numbers, -1 if ($devel);
> +	$numbers[3] = -1 if $devel;
>
>  	$devel ||= "";

I don't see why this is any more readable.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: making relfilenodes 56 bits
On Sat, Jul 2, 2022 at 3:29 PM Andres Freund wrote:
> Why did you choose a quite small value for VAR_RFN_PREFETCH? VAR_OID_PREFETCH
> is 8192, but you chose 64 for VAR_RFN_PREFETCH?

As Dilip mentioned, I suggested a lower value. If that's too low, we can
go higher, but I think there is value in not making this excessively
large. Somebody somewhere is going to have a database that's
crash-restarting like mad, and I don't want that person to run through an
insane number of relfilenodes for no reason. I don't think there are going
to be a lot of people creating thousands upon thousands of relations in a
short period of time, and I'm not sure that it's a big deal if those who
do end up having to wait for a few extra xlog flushes.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: 15beta1 tab completion of extension versions
Noah Misch writes:
> I think it makes sense to send UPDATE TO as a single completion in places
> where no valid command can have the UPDATE without the TO. CREATE RULE foo AS
> ON UPDATE TO is a candidate, though CREATE RULE completion doesn't do that
> today. "ALTER EXTENSION hstore UPDATE;" is a valid command (updates to the
> control file default version). Hence, I think the v14 behavior was better.

Hmm ... good point, let me think about that.

			regards, tom lane
Re: making relfilenodes 56 bits
On Sun, Jul 3, 2022 at 12:59 AM Andres Freund wrote:
> Hm. Now that I think about it, isn't the XLogFlush() in
> XLogPutNextRelFileNumber() problematic performance wise? Yes, we'll spread the
> cost across a number of GetNewRelFileNumber() calls, but still, an additional
> f[data]sync for every 64 relfilenodes assigned isn't cheap - today there's
> zero fsyncs when creating a sequence or table inside a transaction (there are
> some for indexes, but there's patches to fix that).
>
> Not that I really see an obvious alternative.

I think to see the impact we need a workload which frequently creates
relfilenodes; maybe, in parallel to pgbench, we can frequently create
relations and indexes and check how much of a performance hit we see. And
if we do see an impact, then increasing the VAR_RFN_PREFETCH value can
help resolve it.

> I guess we could try to invent a flush-log-before-write type logic for
> relfilenodes somehow? So that the first block actually written to a file needs
> to ensure the WAL record that created the relation is flushed. But getting
> that to work reliably seems nontrivial.
>
> One thing that would be good is to add an assertion to a few places ensuring
> that relfilenodes aren't above ->nextRelFileNumber, most importantly somewhere
> in the recovery path.

Yes, it makes sense.

> Why did you choose a quite small value for VAR_RFN_PREFETCH? VAR_OID_PREFETCH
> is 8192, but you chose 64 for VAR_RFN_PREFETCH?

Earlier it was 8192; then there was a comment from Robert that we can use
an OID for many other things, like an identifier for a catalog tuple or a
TOAST chunk, but a RelFileNumber requires a filesystem operation, so the
amount of work that is needed to use up 8192 RelFileNumbers is a lot
bigger than the amount of work required to use up 8192 OIDs.
I think that makes sense, so I reduced it to 64, but now I tend to think
we also need to consider that after consuming VAR_RFN_PREFETCH values we
are going to do an XLogFlush(), so it's better to keep it high. And as
Robert said upthread, even with keeping it at 8192 we can still crash 2^41
(about 2 trillion) times before we completely run out of numbers. So I
think we can easily keep it at 8192, and I don't think we really need to
worry much about the performance impact of XLogFlush().

> I'd spell out RFN in VAR_RFN_PREFETCH btw, it took me a bit to expand RFN to
> relfilenode.

Okay.

> This is embarrassing. Trying to keep alignment match between C and catalog
> alignment on AIX, without actually making the system understand the alignment
> rules, is a remarkably shortsighted approach.
>
> I started a separate thread about it, since it's not really relevant to this
> thread:
> https://postgr.es/m/20220702183354.a6uhja35wta7agew%40alap3.anarazel.de
>
> Maybe we could at least make the field order to be something like
> oid, relam, relfilenode, relname

Yeah, that we can do.

> that should be ok alignment wise, keep oid first, and seems to make sense from
> an "importance" POV? Can't really interpret later fields without knowing relam
> etc.

Right.

> > I think due to wraparound if relfilenode gets reused by another
> > relation in the same checkpoint then there was an issue in crash
> > recovery with wal level minimal.  But the root of the issue is a
> > wraparound, right?
>
> I'm not convinced the tombstones were required solely in the oid wraparound
> case before, despite the comment saying so, with wal_level=minimal. I gotta do
> some non-work stuff for a bit, so I need to stop pondering this now :)
>
> I think it might be a good idea to have a few weeks in which we do *not*
> remove the tombstones, but have assertion checks against such files existing
> when we don't expect them to. I.e. commit 1-3, add the asserts, then commit 4
> a bit later.
I think this is a good idea.

> > Okay, so you mean to say that we can first drop the remaining segments
> > and at last we drop segment 0, right?
>
> I'd use the approach Robert suggested and delete from the end, going down.

Yeah, I got it, thanks.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: Slow standby snapshot
> On 1 Apr 2022, at 04:18, Michail Nikolaev wrote:
>
> Hello.
>
> Just an updated commit message.

I've looked into v5.

IMO the purpose of KnownAssignedXidsNext would be slightly more obvious if
it was named KnownAssignedXidsNextOffset.

Also please consider some editorialisation:
s/high value/big number/g
KnownAssignedXidsNext[] is updating while taking the snapshot. ->
KnownAssignedXidsNext[] is updated during taking the snapshot.
O(N) next call -> amortized O(N) on next call

Is it safe on all platforms to do "KnownAssignedXidsNext[prev] = n;" while
only holding shared lock? I think it is, per Alexander's comment, but
maybe let's document it?

Thank you!

Thanks!

Best regards, Andrey Borodin.
Re: 15beta1 tab completion of extension versions
On Sun, Jun 19, 2022 at 12:56:13AM -0400, Tom Lane wrote:
> Actually ... after further thought it seems like maybe we should
> make this more like other cases rather than less so.  ISTM that much
> of the issue here is somebody's decision that "TO version" should be
> offered as a completion of "UPDATE", which is unlike the way we do this
> anywhere else --- the usual thing is to offer "UPDATE TO" as a single
> completion.  So I'm thinking about the attached.

Which are the older completions that offer "UPDATE TO"?  I don't see any.

> This behaves a little differently from the old code.  In v14,
> 	alter extension pg_trgm upd
> gives you
> 	alter extension pg_trgm update
> and another produces
> 	alter extension pg_trgm update TO "1.
>
> With this,
> 	alter extension pg_trgm upd
> gives you
> 	alter extension pg_trgm update to
> and another produces
> 	alter extension pg_trgm update to "1.
>
> That seems more consistent with other cases, and it's the same
> number of presses.

I think it makes sense to send UPDATE TO as a single completion in places
where no valid command can have the UPDATE without the TO.  CREATE RULE
foo AS ON UPDATE TO is a candidate, though CREATE RULE completion doesn't
do that today.  "ALTER EXTENSION hstore UPDATE;" is a valid command
(updates to the control file default version).  Hence, I think the v14
behavior was better.
Re: SQL/JSON functions vs. ECPG vs. STRING as a reserved word
On Mon, May 30, 2022 at 05:20:15PM -0400, Tom Lane wrote:
[allow EXEC SQL TYPE unreserved_keyword IS ...]
> 1. In pgc.l, if an identifier is a typedef name, ignore any possible
> keyword meaning and return it as an IDENT.  (I'd originally supposed
> that we'd want to return some new TYPEDEF token type, but that does
> not seem to be necessary right now, and adding a new token type would
> increase the patch footprint quite a bit.)
>
> 2. In the var_type production, forget about ECPGColLabel[Common]
> and just handle the keywords we know we need, plus IDENT for the
> typedef case.  It turns out that we have to have duplicate coding
> because most of these keywords are not keywords in C lexing mode,
> so that they'll come through the IDENT path anyway when we're
> in a C rather than SQL context.  That seemed acceptable to me.
> I thought about adding them all to the C keywords list but that
> seemed likely to have undesirable side-effects, and again it'd
> bloat the patch footprint.
>
> This fix is not without downsides.  Disabling recognition of
> keywords that match typedefs means that, for example, if you
> declare a typedef named "work" then ECPG will fail to parse
> "EXEC SQL BEGIN WORK".  So in a real sense this is just trading
> one hazard for another.  But there is an important difference:
> with this, whether your ECPG program works depends only on what
> typedef names and SQL commands are used in the program.  If
> it compiles today it'll still compile next year, whereas with
> the present implementation the addition of some new unreserved
> SQL keyword could break it.  We'd have to document this change
> for sure, and it wouldn't be something to back-patch, but it
> seems like it might be acceptable from the users' standpoint.

I agree this change is more likely to please a user than to harm a user.
The user benefit is slim, but the patch is also slim.
> We could narrow (not eliminate) this hazard if we could get the
> typedef lookup in pgc.l to happen only when we're about to parse
> a var_type construct.  But because of Bison's lookahead behavior,
> that seems to be impossible, or at least undesirably messy
> and fragile.  But perhaps somebody else will see a way.

I don't, though I'm not much of a Bison wizard.

> Anyway, this seems like too big a change to consider for v15,
> so I'll stick this patch into the v16 CF queue.  It's only
> draft quality anyway --- lacks documentation changes and test
> cases.  There are also some coding points that could use review.
> Notably, I made the typedef lookup override SQL keywords but
> not C keywords; this is for consistency with the C-mode lookup
> rules, but is it the right thing?

That decision seems fine.  ScanCKeywordLookup() covers just twenty-six
keywords, and that list hasn't changed since 2003.  Moreover, most of them
are keywords of the C language itself, so allowing them would entail
mangling them in the generated C to avoid C compiler errors.  Given the
lack of complaints, let's not go there.

I didn't locate any problems beyond the test and doc gaps that you
mentioned, so I've marked this Ready for Committer.