Re: Shared detoast Datum proposal
Hi, Robert, thank you very much for your response to this thread. I agree with most of the things you've mentioned, but this proposal is a PoC and in any case has a long way to go before being presented (if it ever is) as something to be committed. Andy, glad you've not lost interest in this work, I'm looking forward to your improvements! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Shared detoast Datum proposal
Hi! Here's a slightly improved version of the patch Tomas provided above (v2), with cache invalidations and slice caching added, still as a PoC. The main issue I've encountered during tests is that when the same query retrieves both slices and the full value - slices meaning, for example, substring(t,...) - the order of the values is not predictable: with text fields the substrings are retrieved before the full value, so we cannot benefit from caching the full value first and serving the slices from the cached value. Still, the cache code is very compact and affects only the sources related to detoasting. Tomas, about HASH_ENTER - according to the comments it could throw an OOM error, so I've changed it to HASH_ENTER_NULL to avoid new errors. In this case we would just leave the value uncached, without an error. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/ 0001-toast-cache-v3.patch Description: Binary data
Re: Shared detoast Datum proposal
Hi! Tomas, I thought about this issue - >What if you only need the first slice? In that case decoding everything >will be a clear regression. And I completely agree with you; I'm currently working on it and will post a patch a bit later. Another issue we have here - if multiple slices are requested, then we have to update the cached slice from both sides, the beginning and the end. On update, yep, you're right >Surely the update creates a new TOAST value, with a completely new TOAST >pointer, new rows in the TOAST table etc. It's not updated in-place. So >it'd end up with two independent entries in the TOAST cache. >Or are you interested just in evicting the old value simply to free the >memory, because we're not going to need that (now invisible) value? That >seems interesting, but if it's too invasive or complex I'd just leave it >up to the LRU eviction (at least for v1). Again, yes, we do not need the old value after it has been updated, and it is better to take care of it explicitly. It's a simple and non-invasive addition to your patch. I could not agree with you about large values - it makes sense to cache them because the larger the value is, the more benefit we get from not detoasting it again and again. One way is to keep them compressed, but what if we have data with a very low compression rate? I also changed the HASHACTION field value from HASH_ENTER to HASH_ENTER_NULL to deal gracefully with the case when we do not have enough memory, and added checks for whether the value was stored or not. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
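For illustration, here is a rough sketch of what I mean by that change (the type and function names below are placeholders I made up, not the actual code from the patch):

typedef struct ToastCacheEntry		/* illustrative layout only */
{
	char		key[2 * sizeof(Oid)];	/* whatever key the cache uses */
	struct varlena *value;				/* cached detoasted copy */
} ToastCacheEntry;

static struct varlena *
toast_cache_lookup_or_add(HTAB *toast_cache, const void *key,
						  struct varlena *detoasted)
{
	bool		found;
	ToastCacheEntry *entry;

	/*
	 * HASH_ENTER would ereport an out-of-memory error here; HASH_ENTER_NULL
	 * returns NULL instead, so on allocation failure we just hand back the
	 * freshly detoasted value without caching it.
	 */
	entry = (ToastCacheEntry *) hash_search(toast_cache, key,
											HASH_ENTER_NULL, &found);
	if (entry == NULL)
		return detoasted;		/* not cached, but no error raised */

	if (!found)
		entry->value = detoasted;	/* new entry: remember the value */

	return entry->value;
}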
Re: Shared detoast Datum proposal
Hi, In addition to the previous message - for toast_fetch_datum_slice the first (seemingly obvious) way is to detoast the whole value, save it to the cache and serve slices from it on demand. I have another one in mind, but have to play with it first. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
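As a sketch of that first way (illustration only, and assuming the cached copy is already fully detoasted and uncompressed), serving a slice then becomes a plain copy out of the cached value:

static struct varlena *
slice_from_cached_value(struct varlena *cached, int32 sliceoffset,
						int32 slicelength)
{
	int32		total = VARSIZE_ANY_EXHDR(cached);
	struct varlena *result;

	if (sliceoffset > total)
		sliceoffset = total;
	if (slicelength < 0 || sliceoffset + slicelength > total)
		slicelength = total - sliceoffset;

	result = (struct varlena *) palloc(VARHDRSZ + slicelength);
	SET_VARSIZE(result, VARHDRSZ + slicelength);
	memcpy(VARDATA(result), VARDATA_ANY(cached) + sliceoffset, slicelength);
	return result;
}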
Re: Shared detoast Datum proposal
"default" algorithm if we don't have a very > > good idea of the exact life span of the values, etc. Which is why other > > nodes (e.g. Memoize) use LRU too. > > > But I wonder if there's a way to count how many times an attribute is > > accessed (or is likely to be). That might be used to inform a better > > eviction strategy. > > Yes, but in the current issue we can get a better estimation with the help > of the plan shape, and Memoize depends on some planner information as well. > If we bypass the planner information and try to resolve it at the > cache level, the code may become too complex as well, and all the cost is > run-time overhead while the other way is a planning-time overhead. > > > Also, we don't need to evict the whole entry - we could evict just the > > data part (guaranteed to be fairly large), but keep the header, and keep > > the counts, expected number of hits, and other info. And use this to > > e.g. release entries that reached the expected number of hits. But I'd > > probably start with LRU and only do this as an improvement later. > > A great lesson learnt here, thanks for sharing this! > > As for the current use case, what I want to highlight is that in the current > use case we are "caching" "user data" "locally". > > USER DATA indicates it might be very huge; it is not common to have > 1M tables, but it is much more common to have 1M tuples in one scan, so > keeping the header might mean extra memory usage as well, like 10M * 24 bytes > = 240MB. LOCALLY means it is not friendly to multiple active sessions. CACHE > indicates it is hard to evict correctly. My method also has the USER > DATA and LOCALLY attributes, but it would be better at eviction; eviction > then leads to the memory usage issue which is described at the beginning > of this writing. > > >>> Also, this leads me to the need of having some sort of memory limit. If > >>> we may be keeping entries for extended periods of time, and we don't > >>> have any way to limit the amount of memory, that does not seem great. > >>> > >>> AFAIK if we detoast everything into tts_values[] there's no way to > >>> implement and enforce such a limit. What happens if there's a row with > >>> multiple large-ish TOAST values? What happens if those rows are in > >>> different (and distant) parts of the plan? > >> > >> I think this can be done by tracking the memory usage at the EState level > >> or global variable level, disabling it if it exceeds the limits and > >> resuming it when we free the detoast datum when we don't need it. I think > >> no other changes need to be done. > >> > > > > That seems like a fair amount of additional complexity. And what if the > > toasted values are accessed in a context without an EState (I haven't checked > > how common / important that is)? > > > > And I'm not sure about just disabling the detoast as a way to enforce a memory > > limit, as explained earlier. > > > >>> It seems far easier to limit the memory with the toast cache. > >> > >> I think the memory limit and entry eviction are the key issue now. IMO, > >> there are still some differences even when both methods can support a memory > >> limit. The reason is my patch can guarantee the cached memory will be > >> reused, so if we set the limit to 10MB, we know all the 10MB is > >> useful, but the TOAST cache method probably can't guarantee that; then, > >> when we want to make it effective, we have to set a higher limit for > >> this. > >> > > > > Can it actually guarantee that?
It can guarantee the slot may be used, > > but I don't see how it could guarantee the detoasted value will be used. > > We may be keeping the slot for other reasons. And even if it could > > guarantee the detoasted value will be used, does that actually prove > > it's better to keep that value? What if it's only used once, but it's > > blocking detoasting of values that will be used 10x that? > > > > If we detoast a 10MB value in the outer side of the Nest Loop, what if > > the inner path has multiple accesses to another 10MB value that now > > can't be detoasted (as a shared value)? > > Guarantee may be the wrong word. The differences in my mind are: > 1. plan shape has better potential to know the use case of a datum, > since we know the plan tree and the rows passed to a given node. > 2. Planning-time effort is cheaper than run-time effort. > 3. eviction in my method is not as important as it is in the TOAST cache > method since it is reset per slot, so usually it doesn't hit the limit in > fact. But as a cache, it does. > 4. memory is used up to the limit we set in the TOAST cache case. > > >>> In any case, my concern is more about having to do this when creating > >>> the plan at all, the code complexity etc. Not just because it might have > >>> a performance impact. > >> > >> I think the main trade-off is that the TOAST cache method is pretty non-invasive > >> but can't control the eviction well; the impacts include: > >> 1. it may evict the datum we want and keep the datum we don't need. > > > > This applies to any eviction algorithm, not just LRU. Ultimately what > > matters is whether we have in the cache the most often used values, i.e. > > the hit ratio (perhaps in combination with how expensive detoasting that > > particular entry was). > > Correct, just that I am doubtful about designing a LOCAL CACHE for USER > DATA, for the reasons I described above. > > At last, thanks for your attention, really appreciated! > > -- > Best Regards > Andy Fan > > > > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Shared detoast Datum proposal
Hi, Andy! Sorry for the delay, I have had long flights this week. I've reviewed the patch set, thank you for your efforts. I have several notes about the patch set code, but first of all I'm not sure the overall approach is the best for the task. As Tomas wrote above, the approach is very invasive and spreads code related to detoasting among many parts of the code. Have you considered another one - altering pg_detoast_datum (actually, it would be the detoast_attr function) and saving detoasted datums in a detoast context derived from the query context? We have just enough information at this step to identify the datum - the toast relation id and value id - and could keep links to these detoasted values in, say, a linked list or hash table. Thus we would avoid altering the executor code, and all detoast-related code would reside within the detoast source files. I'll check this approach in the next several days and report the results here. There are also comments on the code itself; I'll write them up a bit later. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
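To show what I mean, here is a rough sketch (illustration only, not a patch; the cache type and helper names are made up) - the lookup key is right there in the TOAST pointer:

#include "postgres.h"
#include "varatt.h"
#include "access/detoast.h"
#include "utils/hsearch.h"

typedef struct DetoastCacheKey
{
	Oid			toastrelid;		/* TOAST relation the value is stored in */
	Oid			valueid;		/* va_valueid identifying the value */
} DetoastCacheKey;

typedef struct DetoastCacheEntry
{
	DetoastCacheKey key;
	struct varlena *value;		/* detoasted copy, kept in the detoast context */
} DetoastCacheEntry;

/* Look up / fill the cache instead of detoasting the same datum repeatedly. */
static struct varlena *
detoast_attr_cached(HTAB *detoast_cache, struct varlena *attr)
{
	struct varatt_external toast_pointer;
	DetoastCacheKey key;
	DetoastCacheEntry *entry;
	bool		found;

	if (!VARATT_IS_EXTERNAL_ONDISK(attr))
		return detoast_attr(attr);	/* nothing worth caching */

	VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
	key.toastrelid = toast_pointer.va_toastrelid;
	key.valueid = toast_pointer.va_valueid;

	entry = (DetoastCacheEntry *) hash_search(detoast_cache, &key,
											  HASH_ENTER, &found);
	if (!found)
		entry->value = detoast_attr(attr);	/* detoast once, remember the copy */

	return entry->value;
}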
Re: Shared detoast Datum proposal
Hi, Tomas, we already have a working jsonb partial detoast prototype, and currently I'm porting it to the recent master. Due to the size of the changes and their very invasive nature this takes a lot of effort, but it is already done. I'm also trying to make the core patch less invasive. Actually, it is a subject for a separate topic; as soon as I make it work on the current master we'll propose it to the community. Andy, thank you! I'll check the last patch set out and reply in a day or two. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Shared detoast Datum proposal
Hi! I see this as a very promising initiative, but some issues come to mind. When we store and detoast large values, say 1Gb - that's a very likely scenario, we have such cases from prod systems - we would end up using a lot of shared memory to keep these values alive, only to discard them later. Also, toasted values are not always used immediately and as a whole, i.e. jsonb values are fully detoasted (we're working on this right now) just to extract the smallest value from a big json, and such values are not worth keeping in memory. For text values too, we often do not need the whole value to be detoasted and kept in memory. What do you think? -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! I agree with Heikki on most topics, especially his recommendation to publish your extension on GitHub - this tool is very promising for highly loaded systems, and you will get a lot of feedback very soon. I'm curious about the SpinLock - it is recommended for very short operations, taking up to several instructions; the docs say that for longer ones it will be too expensive and recommend using an LWLock. Why have you chosen a SpinLock? Does it have some benefit here? Thank you! PS: the process_query_desc function has a trace_context argument that is neither used nor passed anywhere. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! It's a good idea to split a big patch into several smaller ones. But you've already implemented these features - you could provide them as sequential small patches (i.e. v13-0002-guc-context-propagation.patch and so on). Great job! I'm all for committing your patch set. On Fri, Jan 26, 2024 at 1:17 PM Anthonin Bonnefoy < anthonin.bonne...@datadoghq.com> wrote: > Hi, > > The last CFbot failure was on pg_upgrade/002_pg_upgrade and doesn't > seem related to the patch. > > # Failed test 'regression tests pass' > # at C:/cirrus/src/bin/pg_upgrade/t/002_pg_upgrade.pl line 212. > # got: '256' > # expected: '0' > # Looks like you failed 1 test of 18. > > Given that the patch grew a bit in complexity and features, I've tried > to reduce it to a minimum to make reviewing easier. The goal is to > keep the scope focused for a first version while additional features > and changes could be handled afterwards. > > For this version: > - Trace context can be propagated through SQLCommenter > - Sampling decision uses either a global sampling rate or on > SQLCommenter's sampled flag > - We generate spans for Planner, ExecutorRun and ExecutorFinish > > What was put on hold for now: > - Generate spans from planstate. This means we don't need the change > in instrument.h to track the start of a node for the first version. > - Generate spans for parallel processes > - Using a GUC variable to propagate trace context > - Filtering sampling on queryId > > With this, the current patch provides the core to: > - Generate, store and export spans with basic buffer management (drop > old spans when full or drop new spans when full) > - Keep variable strings in a separate file (a la pg_stat_statements) > - Track tracing state and top spans across nested execution levels > > Thoughts on this approach? > > On Mon, Jan 22, 2024 at 7:45 AM Peter Smith wrote: > > > > 2024-01 Commitfest. > > > > Hi, This patch has a CF status of "Needs Review" [1], but it seems > > there were CFbot test failures last time it was run [2]. Please have a > > look and post an updated version if necessary. > > > > == > > [1] https://commitfest.postgresql.org/46/4456/ > > [2] https://cirrus-ci.com/task/5581154296791040 > > > > Kind Regards, > > Peter Smith. > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: [PATCH] Compression dictionaries for JSONB
Hi, Aleksander, there was a quite straightforward answer regarding Pluggable TOAST in another thread - the Pluggable TOAST feature is not desired by the community, and advanced TOAST mechanics would be accepted as part of the extended functionality of problematic datatypes, on a par with in and out functions. So what I am actually doing now is re-writing the JSONb TOAST improvements to be called as separate functions, without the Pluggable TOAST API. This takes some time because there is a large and complex code base left by Nikita Glukhov, who has lost interest in this work for various reasons. I personally think that these two features could benefit from each other, but they could be adapted to each other after I introduce the JSONb Toaster into the v17 master. If you don't mind, please check the thread on extending the TOAST pointer - it is important for improving the TOAST mechanics. On Wed, Jan 17, 2024 at 5:27 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi again, > > > Yes it does for a while now. Until we reach any agreement regarding > > questions (1) and (2) personally I don't see the point in submitting > > rebased patches. We can continue the discussion but mark the CF entry > > as RwF for now it will be helpful. > > Sorry, what I actually meant were the following questions: > > """ > Several things occurred to me: > > - Does anyone believe that va_tag should be part of the utf8-like > bitmask in order to save a byte or two? > > - The described approach means that compression dictionaries are not > going to be used when data is compressed in-place (i.e. within a > tuple), since no TOAST pointer is involved in this case. Also we will > be unable to add additional compression algorithms here. Does anyone > have problems with this? Should we use the reserved compression > algorithm id instead as a marker of an extended TOAST? > > - It would be nice to decompose the feature in several independent > patches, e.g. modify TOAST first, then add compression dictionaries > without automatic update of the dictionaries, then add the automatic > update. I find it difficult to imagine however how to modify TOAST > pointers and test the code properly without a dependency on a larger > feature. Could anyone think of a trivial test case for extendable > TOAST? Maybe something we could add to src/test/modules similarly to > how we test SLRU, background workers, etc. > """ > > Since there was not much activity since then (for 3 months) I don't > really see how to process further. > > -- > Best regards, > Aleksander Alekseev > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! I meant exactly the thing you mentioned - > > > By queries you mean particular queries, not transactions? And since > > they have an assigned ID it means that the query is already executing > > and we want to enable the tracking in another session, right? > > I think that was the idea. The query identifier generated for a > specific statement is stable so we could have a GUC variable with a > list of query id and only queries matching the provided query ids > would be sampled. This would make it easier to generate traces for a > specific query within a session. > > This one. When maintaining production systems with a high load rate we often encountered the necessity to trace a number of queries with known IDs (that was the case for Oracle environments, but Postgres is not very different from this point of view), because enabling tracing for the whole session would immediately result in thousands of trace spans in a system with a throughput of hundreds or even thousands of tps, when we need to trace, say, a single problematic query. Thank you! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: introduce dynamic shared memory registry
Hi! This patch looks like a good solution for a pain in the ass, and I too am for committing this patch. I have looked through the code and agree with Andrei - the code looks good. Just a suggestion: maybe it is worth adding a function for detaching the segment, for cases when we unload and/or re-load the extension? -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Is a clearer memory lifespan for outerTuple and innerTuple useful?
Hi! Maybe an alternative way is to use a separate kind of context - say, name it 'ToastContext' - for all custom data related to toasted values? What do you think? On Sun, Dec 17, 2023 at 4:52 PM Andy Fan wrote: > > Andy Fan writes: > > > Andy Fan writes: > > > >> ..., I attached the 2 MemoryContext in > >> JoinState rather than MergeJoinState, which is for the "shared detoast > >> value"[0] more or less. > >> > > In order to delimit the scope of this discussion, I attached the 2 > MemoryContext to MergeJoinState. Since the code was written by Tom in > 2005, I added Tom to the cc-list. > > > -- > Best Regards > Andy Fan > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! Overall the solution looks good to me, except for SQLCommenter - query instrumentation with SQL comments is often not possible on production systems. Instead, a very often requested feature is to enable tracing for a given single query ID, or for several (a very limited number of) queries by their IDs. It could be done by adding an SQL function to add and remove query IDs in a list (even an array would do) stored in the top tracing context. Great job, thank you! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
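For illustration, a hypothetical sketch of that suggestion (my own invention, not code from the pg_tracing patch): an SQL-callable function remembers query IDs to trace, and the post-parse/planner hooks would compare the statement's queryId against this list before deciding to sample. It assumes the usual extension boilerplate (PG_MODULE_MAGIC etc.) elsewhere in the module.

#include "postgres.h"
#include "fmgr.h"

#define MAX_TRACED_QUERY_IDS 16

static uint64 traced_query_ids[MAX_TRACED_QUERY_IDS];
static int	num_traced_query_ids = 0;

PG_FUNCTION_INFO_V1(pg_tracing_trace_query_id);

Datum
pg_tracing_trace_query_id(PG_FUNCTION_ARGS)
{
	uint64		query_id = (uint64) PG_GETARG_INT64(0);

	if (num_traced_query_ids >= MAX_TRACED_QUERY_IDS)
		PG_RETURN_BOOL(false);	/* list is full, query id not added */

	traced_query_ids[num_traced_query_ids++] = query_id;
	PG_RETURN_BOOL(true);
}

On the SQL level this could be exposed as a function taking a bigint query ID (for example, the one reported by pg_stat_statements) and returning whether it was added; a matching function to remove an ID would work the same way.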
Re: RFI: Extending the TOAST Pointer
Hi, Here's the PoC for a custom TOAST pointer. The main idea is that the custom pointer provides data space that allows storing custom metadata (i.e. TOAST method, relation OIDs, advanced compression information, etc.) and even keeping part of the data inline. Any feedback would be greatly appreciated. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/ 0001_custom_toast_pointer_v1.patch Description: Binary data
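To give a feel for the idea, here is a purely hypothetical layout (illustration only - the attached patch may use something different):

/*
 * Hypothetical "custom" external TOAST pointer: a new varattrib_1b_e
 * variant carrying a variable-size area for method-specific metadata,
 * possibly including some inline data.  Not the actual layout from the
 * attached patch.
 */
typedef struct varatt_custom
{
	uint16		va_datalen;		/* size of the custom data area below */
	Oid			va_toasterid;	/* which TOAST method owns this pointer */
	char		va_data[FLEXIBLE_ARRAY_MEMBER]; /* method-specific metadata */
} varatt_custom;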
Re: Avoid detoast overhead when possible
Hi, With your setup (table created with setup.sql):

postgres@postgres=# explain analyze select big->'1', big->'2', big->'3', big->'5', big->'10' from b;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Seq Scan on b  (cost=0.00..29.52 rows=1001 width=160) (actual time=0.656..359.964 rows=1001 loops=1)
 Planning Time: 0.042 ms
 Execution Time: 360.177 ms
(3 rows)

Time: 361.054 ms

postgres@postgres=# explain analyze select big->'1' from b;
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Seq Scan on b  (cost=0.00..19.51 rows=1001 width=32) (actual time=0.170..63.996 rows=1001 loops=1)
 Planning Time: 0.042 ms
 Execution Time: 64.063 ms
(3 rows)

Time: 64.626 ms

Without patch, the same table and queries:

postgres@postgres=# explain analyze select big->'1', big->'2', big->'3', big->'5', big->'10' from b;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Seq Scan on b  (cost=0.00..29.52 rows=1001 width=160) (actual time=0.665..326.399 rows=1001 loops=1)
 Planning Time: 0.035 ms
 Execution Time: 326.508 ms
(3 rows)

Time: 327.132 ms

postgres@postgres=# explain analyze select big->'1' from b;
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Seq Scan on b  (cost=0.00..19.51 rows=1001 width=32) (actual time=0.159..62.807 rows=1001 loops=1)
 Planning Time: 0.033 ms
 Execution Time: 62.879 ms
(3 rows)

Time: 63.504 ms

-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Avoid detoast overhead when possible
Hi, Hmmm, I've checked this patch and can't see a performance difference on a large (2 key-value pairs) json; using the toasted json column several times makes no difference compared to the current implementation on master (with queries like those mentioned above). Maybe I'm doing something wrong? On Tue, Dec 5, 2023 at 4:16 AM wrote:
>
> Hi,
>
> Matthias van de Meent writes:
>
> > SELECT toastable_col FROM t1
> > WHERE f(t1.toastable_col)
> > ORDER BY nonindexed;
>
> Thanks for this example! It's true that the current design requires more memory to sort since toastable_col is detoasted at the scan stage and it is output to the sort node. It should be avoided.
>
> > SELECT ev_class
> > FROM pg_rewrite
> > WHERE octet_length(ev_action) > 1
> > ORDER BY ev_class;
>
> This one is different I think, since the ev_action (the toastable_col) is *NOT* output to the sort node, so no extra memory is required IIUC.
>
> * CP_SMALL_TLIST specifies that a narrower tlist is preferred. This is
> * passed down by parent nodes such as Sort and Hash, which will have to
> * store the returned tuples.
>
> We can also verify this by
>
> explain (costs off, verbose) SELECT ev_class
> FROM pg_rewrite
> WHERE octet_length(ev_action) > 1
> ORDER BY ev_class;
>                          QUERY PLAN
> --------------------------------------------------------------
>  Sort
>    Output: ev_class
>    Sort Key: pg_rewrite.ev_class
>    ->  Seq Scan on pg_catalog.pg_rewrite
>          Output: ev_class
>          Filter: (octet_length((pg_rewrite.ev_action)::text) > 1)
> (6 rows)
>
> Only ev_class is output to the Sort node.
>
> So if we want to make sure there is no performance regression for all the existing queries in any case, we can add 1 more restriction into the saved-detoast-value logic. It must be (NOT under CP_SMALL_TLIST) OR (the toastable_col is not in the output list). It can be a planner decision.
>
> If we code like this, the result will be that we need to detoast toastable_col N times in the quals for the below query.
>
> SELECT toastable_col FROM t
> WHERE f1(toastable_col)
> AND f2(toastable_col)
> ..
> AND fn(toastable_col)
> ORDER BY any-target-entry;
>
> However
>
> SELECT
> f1(toastable_col),
> f2(toastable_col),
> ..
> fn(toastable_col)
> FROM t
> ORDER BY any-target-entry;
>
> the current path still works for it.
>
> This one is my favorite one so far. Another option is saving the detoast-value in some other memory or in the existing slot in place for different situations; that would require more expression and planner changes. I just checked all the queries at hand; the current design can cover all of them.
>
> --
> Best Regards
> Andy Fan

-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Avoid detoast overhead when possible
Hi! There's a view from the other angle - detoast just the attributes that are needed (partial detoast), with optimized storage mechanics for JSONb. I'm preparing a patch for it, so maybe the best results could be achieved by combining these two techniques. What do you think? -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: [Proposal] global sequence implemented by snowflake ID
Hi! I have reviewed the patch in this topic and have a question concerning the machine ID - INSERT INTO snowflake_sequence.machine_id SELECT round((random() * (0 - 511))::numeric, 0) + 511; This kind of ID generation does not seem to guarantee that two instances in a pool won't end up with the same machine ID, does it? With only 512 possible values, even a few dozen randomly initialized instances have a good chance of a collision. On Thu, Nov 30, 2023 at 4:18 AM Michael Paquier wrote: > On Tue, Nov 28, 2023 at 02:23:44PM +0530, Amit Kapila wrote: > > It is interesting to see you want to work towards globally distributed > > sequences. I think it would be important to discuss how and what we > > want to achieve with sequences w.r.t logical replication and or > > active-active configuration. There is a patch [1] for logical > > replication of sequences which will primarily achieve the failover > > case, i.e. if the publisher goes down and the subscriber takes over > > the role, one can re-direct connections to it. Now, if we have global > > sequences, one can imagine that even after failover the clients can > > still get unique values of sequences. It will be a bit more flexible > > to use global sequences, for example, we can use the sequence on both > > nodes at the same time which won't be possible with the replication of > > sequences as they will become inconsistent. Now, it is also possible > > that both serve different use cases and we need both functionalities > > but it would be better to have some discussion on the same. > > > > Thoughts? > > > > [1] - https://commitfest.postgresql.org/45/3823/ > > Thanks for pointing this out. I've read through the patch proposed by > Tomas and both are independent things IMO. The logical decoding patch > relies on the SEQ_LOG records to find out which last_value/is_called > to transfer, which is something directly depending on the in-core > sequence implementation. Sequence AMs are concepts that cover much > more ground, leaving it up to the implementor to do what they want > while hiding the activity with a RELKIND_SEQUENCE (generated columns > included). > > To put it short, I have the impression that one and the other don't > really conflict, but just cover different ground. However, I agree > that depending on the sequence AM implementation used in a cluster > (snowflake IDs guarantee unicity with their machine ID), replication > may not be necessary because the sequence implementation may be able > to ensure that no replication is required from the start. > -- > Michael > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Table AM Interface Enhancements
Hi, Pavel, as far as I understand Alexander's idea, an assertion - and especially an ereport - here does not make any sense: this method is not supposed to report an error; it silently calls the underlying [free] function if there is one, and simply falls through otherwise. Also, take into account that it could be located in an uninterruptible part of the code. On the whole topic I have to

On Wed, Nov 29, 2023 at 4:56 PM Pavel Borisov wrote:
> Hi, Alexander!
> >> I'm planning to review some of the other patches from the current patchset soon.
>
> I've looked into the patch 0003. The patch looks in good shape and is uncontroversial to me. Making memory structures dynamically allocated is simple enough and it allows storing complex data like lists etc. I consider that places like this, which expect memory structures to be flat and allocated at once, are that way because there was no need for more complex ones previously. If there is a need for them, I think they could be added without much doubt, provided the simplicity of the change.
>
> For the code:
> +static inline void
> +table_free_rd_amcache(Relation rel)
> +{
> +	if (rel->rd_tableam && rel->rd_tableam->free_rd_amcache)
> +	{
> +		rel->rd_tableam->free_rd_amcache(rel);
> +	}
> +	else
> +	{
> +		if (rel->rd_amcache)
> +			pfree(rel->rd_amcache);
> +		rel->rd_amcache = NULL;
> +	}
>
> here I suggest adding Assert(rel->rd_amcache == NULL) (or maybe better an error report) after calling free_rd_amcache to be sure the custom implementation has done what it should do.
>
> Also, I think some brief documentation about writing this custom method is quite relevant, maybe based on already existing comments in the code.
>
> Kind regards,
> Pavel

-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Patch: Global Unique Index
Hi! Please advise on the status of this patch set - are there any improvements? Is there any work going on? Thanks! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: RFC: Pluggable TOAST
Hi! Matthias, regarding your message above, I have a question to ask. On typed TOAST implementations - we thought that the TOAST method used for storing data could depend not only on the data type, but on the flow or workload, like our bytea appendable toaster, which is much (hundreds of times) faster on update compared to the regular procedure. That was one of the ideas behind Pluggable TOAST - we can choose the most suitable TOAST implementation available. If we have a single TOAST entry point per data type, then we should have some means to control it or to choose a TOAST method suitable to our needs. Or should we not? -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: RFC: Pluggable TOAST
Hi, I've been thinking about Matthias' proposals for some time and have some questions: >So, in short, I don't think there is a need for a specific "Pluggable >toast API" like the one in the patchset at [0] that can be loaded >on-demand, but I think that updating our current TOAST system to a >system for which types can provide support functions would likely be >quite beneficial, for efficient extraction of data from composite >values. As I understand it, one of the reasons against Pluggable TOAST is that differences in plugged-in Toasters could result in incompatibility even between different versions of the same DB. The importance of a correct TOAST update is beyond question; I feel I have to prepare a patch for it. There are some questions though - I'd address them later, along with a patch. >Example support functions: >/* TODO: bikeshedding on names, signatures, further support functions. */ >Datum typsup_roastsliceofbread(Datum ptr, int sizetarget, char cmethod) >Datum typsup_unroastsliceofbread(Datum ptr) >void typsup_releaseroastedsliceofbread(Datump ptr) /* in case of >non-unitary in-memory datums */ Do I understand correctly that you mean extending pg_type and the type cache by adding a new function set for toasting/detoasting a value, in addition to in/out, etc.? I see several issues here: 1) We could benefit from knowledge of the internals of the data being toasted (i.e. in the case of a JSON value with a key-value structure) only when the EXTERNAL storage mode is set; otherwise the value will be compressed before being toasted. So we have to keep both TOAST mechanics, depending on the storage mode being used. It's the same issue as in Pluggable TOAST. Is that OK? 2) The TOAST pointer is very limited in terms of the data it keeps; we'd have to extend it anyway and keep both for backwards compatibility. 3) There is no API, and such an approach would require implementing toast and detoast in every data type we want to be custom toasted, resulting in modifications of multiple files. Maybe we have to consider introducing such an API? 4) 1 toast relation per regular relation. With update mechanics this will be less limiting, but it is still a limiting factor because 1 entry in the base table could have a lot of entries in the toast table. Are we doing something about this? >We would probably want at least 2 more subtypes of varattrib_1b_e - >one for on-disk pointers, and one for in-memory pointers - where the >payload of those pointers is managed by the type's toast mechanism and >considered opaque to the rest of PostgreSQL (and thus not compatible >with the binary transfer protocol). Types are currently already >expected to be able to handle their own binary representation, so >allowing types to manage parts of the toast representation should IMHO >not be too dangerous, though we should make sure that BINARY COERCIBLE >types share this toast support routine, or be returned to their >canonical binary version before they are cast to the coerced type, as >using different detoasting mechanisms could result in corrupted data >and thus crashes. >Lastly, there is the compression part of TOAST. I think it should be >relatively straightforward to expose the compression-related >components of TOAST through functions that can then be used by >type-specific toast support functions. >Note that this would be opt-in for a type, thus all functions that use >that type's internals should be aware of the different on-disk format >for toasted values and should thus be able to handle it gracefully. Thanks a lot for the answers!
-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi! I've read Tom Lane's note in the previous discussion (it is quite long, so I had missed it) on the pg_proc column - >I strongly recommend against having a new pg_proc column at all. >I doubt that you really need it, and having one will create >enormous mechanical burdens to making the conversion. (For example, >needing a catversion bump every time we convert one more function, >or an extension version bump to convert extensions.) - so we should figure out another way to do it. Regards, -- Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi! According to the discussion above, I've added the 'proerrsafe' attribute to the PG_PROC relation. The same was done some time ago by Nikita Glukhov but this part was reverted. This is a WIP patch, I am new to this part of Postgres, so please correct me if I'm going the wrong way. -- Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/ 0001_proerrsafe_attr_v1.patch Description: Binary data
Re: RFC: Pluggable TOAST
Hi! Matthias, thank you for your patience and explanation. I wish I had had it much earlier; it would have saved a lot of time. You've asked a lot of good questions, the answers we have for some of them seem not very satisfactory, and you've pointed out some topics that were not mentioned before. I have to rethink our approach to the TOAST enhancements accordingly. Thanks a lot! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi, Agreed on the latter - that must not be part of it, for sure. I'll think about how to make this part correct. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: RFC: Pluggable TOAST
Hi, I meant discussion preceding the patch set - there was none. And the goal of *THIS* topic is to get a picture of how the community sees improvements in the TOAST mechanics, if it doesn't want them the way we proposed before, in order to understand which way to go with advanced JSON storage and other enhancements we already have. The previous topic was not of any help here. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi, The main goal was to correctly process invalid queries (as in examples above). I'm not sure this could be done in type input functions. I thought that some coercions could be checked before evaluating expressions for saving reasons. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: RFC: Pluggable TOAST
Hi, Aleksander, the previous discussion was not a discussion, actually - we proposed a set of big and complex core changes without any discussion preceding them. That was not a very good approach, although the overall idea behind the patch set is very progressive and is ready to solve some old and painful issues in Postgres. Also, the introduction of SQL/JSON will further boost the usage of JSON in databases, so our improvements in JSON storage and performance would be very useful. These improvements depend on Pluggable TOAST: without an API that allows easily plugging in different TOAST implementations, they require heavy core modifications and are very unlikely to be accepted. Not to mention that such kinds of changes require upgrades, restarts and so on. Pluggable TOAST allows using advanced storage techniques on top of the default Postgres database engine, instead of implementing the complex Pluggable Storage API, and allows plugging in these advanced techniques on the fly - without even restarting the server, which is crucial for production systems. The discussion on extending the TOAST pointer showed some interest in this topic, so I hope this feature will draw some attention in the scope of widely used large JSON objects. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi! Amit, regarding the previous email, patch #2 - I agree that it is not the best idea to introduce a new kind of logic into the parser, so this logic could be moved to the executor, or removed altogether. What do you think of these options? On Wed, Oct 18, 2023 at 5:19 AM jian he wrote:
> Hi. based on v22.
>
> I added some tests against json_value for the sake of coverage testing.
>
> A previous email thread mentioned needing to check *empty in ExecEvalJsonExpr. Since JSON_VALUE_OP, JSON_QUERY_OP, JSON_EXISTS_OP all need to have *empty cases, I refactored a little bit. Might be helpful. Maybe we can also refactor the *error cases.
>
> The following part is not easy to understand.
> res = ExecPrepareJsonItemCoercion(jbv,
> + jsestate->item_jcstates,
> + &post_eval->jcstate);
> + if (post_eval->jcstate &&
> + post_eval->jcstate->coercion &&
> + (post_eval->jcstate->coercion->via_io ||
> + post_eval->jcstate->coercion->via_populate))

-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
RFC: Pluggable TOAST
Hi hackers! We need community feedback on a previously discussed topic [1]. There are some long-lived issues in Postgres related to the TOAST mechanics, like [2]. Some time ago we already proposed a set of patches with an API allowing different TOAST implementations to be plugged into a live database. The patch set introduced a lot of code and was quite crude in some places, so after several implementations we decided to try it in a production environment for further check-up. The main idea behind pluggable TOAST is to make it possible to easily plug in and use different implementations of large-value storage, preserving the existing mechanics to keep backward compatibility, and to provide an easy, Postgres-way means of giving users alternative, more effective mechanics for storing large column values - we already have custom and very effective (up to tens and even hundreds of times faster) TOAST implementations for the bytea and JSONb data types. As we see it, Pluggable TOAST proposes 1) changes in the TOAST pointer itself, extending it to store custom data - the current limitations of the TOAST pointer were discussed in [1] and [4]; 2) an API which allows calling custom TOAST implementations for certain table columns and (a topic for discussion) certain datatypes. Custom TOAST could also be used in not so trivial ways - for example, limited columnar storage could easily be implemented and plugged in without the heavy core modifications of implementing Pluggable Storage (Table Access Methods), preserving existing data and database structure, and it could be upgraded, replicated and so on. Any thoughts and proposals are welcome. [1] Pluggable TOAST https://www.postgresql.org/message-id/flat/224711f9-83b7-a307-b17f-4457ab73aa0a%40sigaev.ru [2] Infinite loop while acquiring new TOAST Oid https://www.postgresql.org/message-id/flat/CAN-LCVPRvRzxeUdYdDCZ7UwZQs1NmZpqBUCd%3D%2BRdMPFTyt-bRQ%40mail.gmail.com [3] JSONB Compression dictionaries https://www.postgresql.org/message-id/flat/CAJ7c6TOtAB0z1UrksvGTStNE-herK-43bj22%3D5xVBg7S4vr5rQ%40mail.gmail.com [4] Extending the TOAST pointer https://www.postgresql.org/message-id/flat/CAN-LCVMq2X%3Dfhx7KLxfeDyb3P%2BBXuCkHC0g%3D9GF%2BJD4izfVa0Q%40mail.gmail.com -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
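As a purely illustrative sketch of what point 2) could look like (the names below are hypothetical, modelled on the existing table AM routine struct, and not taken from our patch set):

typedef struct ToastAmRoutine
{
	NodeTag		type;

	/* replace a column value with a TOAST pointer produced by this method */
	Datum		(*toast) (Relation toastrel, Datum value, int options);

	/* turn a TOAST pointer of this method back into a plain datum */
	Datum		(*detoast) (Datum toastptr, int32 sliceoffset,
							int32 slicelength);

	/* release the on-disk data when the owning row goes away */
	void		(*delete_toast) (Relation toastrel, Datum toastptr);
} ToastAmRoutine;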
Re: Pro et contra of preserving pg_proc oids during pg_upgrade
Hi, Thank you very much, I'll check it out. It looks like the getObjectIdentity() used in pg_identify_object() could do. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi, Sorry, forgot to mention above - patches from our patch set should be applied onto SQL/JSON part 3 - v22-0003-SQL-JSON-query-functions.patch, thus they are numbered as v23-0003-1 and -2. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi, Also FYI - the following case results in a segmentation fault:

postgres@postgres=# CREATE TABLE test_jsonb_constraints (
	js text,
	i int,
	x jsonb DEFAULT JSON_QUERY(jsonb '[1,2]', '$[*]' WITH WRAPPER)
	CONSTRAINT test_jsonb_constraint1 CHECK (js IS JSON)
	CONSTRAINT test_jsonb_constraint5 CHECK (JSON_QUERY(js::jsonb, '$.mm' RETURNING char(5) OMIT QUOTES EMPTY ARRAY ON EMPTY) > 'a' COLLATE "C")
	CONSTRAINT test_jsonb_constraint6 CHECK (JSON_EXISTS(js::jsonb, 'strict $.a' RETURNING int TRUE ON ERROR) < 2)
);
CREATE TABLE
Time: 13.518 ms
postgres@postgres=# INSERT INTO test_jsonb_constraints VALUES ('[]');
server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
Time: 6.858 ms
@!>

We're currently looking into this case. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi! With the latest set of patches we encountered a failure with the following query:

postgres@postgres=# SELECT JSON_QUERY(jsonpath '"aaa"', '$' RETURNING text);
server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
Time: 11.165 ms

A colleague of mine, Anton Melnikov, proposed the following changes, which slightly alter the coercion functions to process this kind of error correctly. Please check the attached patch set. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/ v23-0003-1-transformJsonExprCommon-fixup.patch Description: Binary data v23-0003-2-json-query-coercion-override.patch Description: Binary data
Re: Pro et contra of preserving pg_proc oids during pg_upgrade
Hi, A textual representation requires a long text field because it could contain the schema and arguments; it is difficult and inefficient to save it as part of the data, and it must be parsed to retrieve the function oid. By using the oid directly (actually, a value of the regprocedure field) we avoid that, and the function can be retrieved by its primary key. Why can't pg_upgrade be used? OID preservation logic is already implemented for several OIDs in catalog tables, like pg_class, type, relfilenode, enum... I've mentioned twice that this logic is already implemented and I haven't encountered any problems with pg_upgrade. Actually, I've asked here because there are several references to pg_proc oids from other tables in the system catalog, so I was worried that this logic could break something I do not know about. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
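For illustration (generic backend calls, not part of any patch): with a stored OID the referenced function is a single syscache probe away, whereas a stored textual signature would have to be parsed first.

#include "postgres.h"
#include "catalog/pg_proc.h"
#include "utils/syscache.h"

/* Check that a function referenced by a stored OID still exists. */
static bool
stored_function_exists(Oid funcoid)
{
	return SearchSysCacheExists1(PROCOID, ObjectIdGetDatum(funcoid));
}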
Re: Pro et contra of preserving pg_proc oids during pg_upgrade
Hi, I've already implemented preserving pg_proc oids during pg_upgrade in a way similar to relfilenodes, etc.; actually, it is quite simple, and at first glance there are no problems. About using a surrogate key - this feature is more for data generated by the DBMS itself, i.e. data processed by some extension and saved and re-processed automatically or at the user's request, but without bothering the user with these internal keys. The main question is: are there pitfalls I am not aware of? Thanks for your replies! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Pro et contra of preserving pg_proc oids during pg_upgrade
Hi! Say we have data processed by some user function and we want to keep a reference to this function in our data. In this case we have two ways: the first is to store the string output of regprocedure, which is not very convenient, and the second is to store its OID, which requires a slight modification of pg_upgrade (pg_dump and the function/procedure creation function). I've read the previous threads about using regproc, and agree that this is not a very good case anyway, but I haven't found any serious obstacles that forbid modifying pg_upgrade this way. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Pro et contra of preserving pg_proc oids during pg_upgrade
Hi hackers! Please advise on the idea of preserving pg_proc oids during pg_upgrade, in a way similar to relfilenodes, type ids and so on. What are the possible downsides of such a solution? Thanks! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! Great you continue to work on this patch! I'm checking out the newest changes. On Fri, Sep 15, 2023 at 2:32 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi, > > > Renaming/Refactoring: > > - All spans are now tracked in the palloced current_trace_spans buffer > > compared to top_span and parse_span being kept in a static variable > > before. > > - I've renamed query_spans to top_span. A top_span serves as the > > parent for all spans in a specific nested level. > > - More debugging information and assertions. Spans now track their > > nested level, if they've been ended and if they are a top_span. > > > > Changes: > > - I've added the subxact_count to the span's metadata. This can help > > identify the moment a subtransaction was started. > > - I've reworked nested queries and utility statements. Previously, I > > made the assumptions that we could only have one top_span per nested > > level which is not the case. Some utility statements can execute > > multiple queries in the same nested level. Tracing utility statement > > now works better (see image of tracing a create extension). > > Many thanks for the updated patch. > > If it's not too much trouble, please use `git format-patch`. This > makes applying and testing the patch much easier. Also you can provide > a commit message which 1. will simplify the work for the committer and > 2. can be reviewed as well. > > The tests fail on CI [1]. I tried to run them locally and got the same > results. I'm using full-build.sh from this repository [2] for > Autotools and the following one-liner for Meson: > > ``` > time CPPFLAGS="-DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS" sh -c 'git > clean -dfx && meson setup --buildtype debug -Dcassert=true > -DPG_TEST_EXTRA="kerberos ldap ssl load_balance" -Dldap=disabled > -Dssl=openssl -Dtap_tests=enabled > -Dprefix=/home/eax/projects/pginstall build && ninja -C build && ninja > -C build docs && PG_TEST_EXTRA=1 meson test -C build' > ``` > > The comments for pg_tracing.c don't seem to be formatted properly. > Please make sure to run pg_indent. See src/tools/pgindent/README > > The interface pg_tracing_spans(true | false) doesn't strike me as > particularly readable. Maybe this function should be private and the > interface be like pg_tracing_spans() and pg_trancing_consume_spans(). > Alternatively you could use named arguments in the tests, but I don't > think this is a preferred solution. > > Speaking of the tests I suggest adding a bit more comments before > every (or most) of the queries. Figuring out what they test could be > not particularly straightforward for somebody who will make changes > after the patch will be accepted. > > [1]: http://cfbot.cputube.org/ > [2]: https://github.com/afiskon/pgscripts/ > > -- > Best regards, > Aleksander Alekseev > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! Storing spans is a tricky question. Monitoring systems often use a pull model and use probes or an SQL API for pulling data, so from this point of view it is much more convenient to keep spans in a separate table. But in this case we come to another issue - how do we flush this data into the table? Automatic flushing on a full buffer would randomly (from the user's point of view) and significantly affect query performance. I'd rather add a GUC, turned off by default, to store spans into the table instead of the buffer.

>3) I'm testing more complex queries. Most of my previous tests were using simple query protocol but extended protocol introduces
>differences that break some assumptions I did. For example, with multi statement transaction like
>BEGIN;
>SELECT 1;
>SELECT 2;
>The parse of SELECT 2 will happen before the ExecutorEnd (and the end_tracing) of SELECT 1. For now, I'm skipping the post parse
>hook if we still have an ongoing tracing.

I've checked this behavior before and haven't noticed the case you mentioned, but for loops like

for i in 1..2 loop
    StartTime := clock_timestamp();
    insert into t2 values (i, a_data);
    EndTime := clock_timestamp();
    Delta := 1000 * ( extract(epoch from EndTime) - extract(epoch from StartTime) );
end loop;

I've got the following call sequence:

psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_post_parse_analyze hook <1>
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook <1>
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_post_parse_analyze hook <2>
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook <2>
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_post_parse_analyze hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_post_parse_analyze hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorStart
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorRun
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorFinish
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorEnd
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_post_parse_analyze hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_post_parse_analyze hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorStart
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorRun
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorFinish
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorEnd

-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! Thanks for the improvements! >Here's a new patch with changes from the previous discussion: >- I'm now directly storing nanoseconds duration in the span instead of the instr_time. Using the instr_time macros was a bit awkward as the durations I generate don't necessarily have a starting and ending instr_time. >Moving to straight nanoseconds made things clearer from my point of view. Cool. It could easily be cast to ms for the user. >- I've added an additional sample rate pg_tracing.sample_rate (on top of the pg_tracing.caller_sample_rate). This one will allow queries to be sampled even without trace information propagated from the caller. >Setting this sample rate to 1 will basically trace everything. For now, this will only work when going through the post parse hook. I will add support for prepared statements and cached plans for the next patch. Cool, I've just made the same improvement and wanted to send a patch a bit later, after tests. >- I've improved how parse spans are created. It's a bit challenging to get the start of a parse as there's no pre parse hook or instrumentation around parse so it's only an estimation. I've also added a query id field to the span, and made a table and an SQL function that flushes spans to this table instead of returning a set of records - it is more convenient for maintenance to query the table. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! What do you think about using the INSTR_TIME_SET_CURRENT, INSTR_TIME_SUBTRACT and INSTR_TIME_GET_MILLISEC macros for timing calculations? Also, have you thought about a way to trace existing (running) queries without directly instrumenting them? It would be much more usable for maintenance and support personnel, because in production environments you can rarely change the query text directly. For the current state, the simplest solution is to switch tracing on and off by calling an SQL function, and possibly to switch tracing for a prepared statement the same way, but not for any random query. I'll check the patch for race conditions. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
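For reference, a minimal example of how those macros are typically used (generic usage of src/include/portability/instr_time.h, not code from the patch):

#include "postgres.h"
#include "portability/instr_time.h"

static void
report_elapsed_ms(void)
{
	instr_time	start_time;
	instr_time	duration;

	INSTR_TIME_SET_CURRENT(start_time);

	/* ... the work being measured goes here ... */

	INSTR_TIME_SET_CURRENT(duration);
	INSTR_TIME_SUBTRACT(duration, start_time);	/* duration -= start_time */

	elog(DEBUG1, "span took %.3f ms", INSTR_TIME_GET_MILLISEC(duration));
}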
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi, I'd keep the Autotools build up to date, because Meson is very limited on OS versions that are not very up to date. Anyway, it's OK now. I'm currently playing with the patch and reviewing the sources; if you need any cooperation - please let us know. I'm +1 for committing this patch after review. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi, I've fixed the Autotools build, please check the patch below (v2). On Thu, Jul 27, 2023 at 6:39 PM Aleksander Alekseev < aleksan...@timescale.com> wrote:
> Hi,
> > Also FYI, there are build warnings because functions > > const char * get_span_name(const Span * span, const char *qbuffer) > > and > > const char * get_operation_name(const Span * span, const char *qbuffer) > > do not have default inside switch and no return outside of switch.
>
> You are right, there are a few warnings:
>
> ```
> [1566/1887] Compiling C object contrib/pg_tracing/pg_tracing.so.p/span.c.o
> ../contrib/pg_tracing/span.c: In function ‘get_span_name’:
> ../contrib/pg_tracing/span.c:210:1: warning: control reaches end of non-void function [-Wreturn-type]
>   210 | }
>       | ^
> ../contrib/pg_tracing/span.c: In function ‘get_operation_name’:
> ../contrib/pg_tracing/span.c:249:1: warning: control reaches end of non-void function [-Wreturn-type]
>   249 | }
>       | ^
> ```
>
> Here is the patch v2 with a quick fix.
>
> > but got errors calling make check and cannot install the extension
>
> Agree, something goes wrong when using Autotools (but not Meson) on both Linux and MacOS. I didn't investigate the issue though.
>
> --
> Best regards,
> Aleksander Alekseev

-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/ pg-tracing-v2.patch Description: Binary data
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi, Also FYI, there are build warnings because the functions const char * get_span_name(const Span * span, const char *qbuffer) and const char * get_operation_name(const Span * span, const char *qbuffer) have no default case inside the switch and no return statement after it. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
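For what it's worth, the usual way to silence that kind of -Wreturn-type warning looks roughly like the sketch below (illustrative only; the real span kinds and names in pg_tracing are certainly different):

/*
 * Sketch only: adding a fallback return after the switch (or a default
 * case) keeps the compiler from warning that control can reach the end
 * of a non-void function.  The span kinds here are hypothetical.
 */
static const char *
get_span_name_sketch(int span_kind)
{
	switch (span_kind)
	{
		case 0:
			return "Planner";
		case 1:
			return "ExecutorRun";
		default:
			break;
	}

	/* not expected to be reached, but keeps -Wreturn-type quiet */
	return "???";
}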
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! I've tried to test the extension, but got errors calling make check and cannot install the extension. I've applied the patch onto current master and configured it as: ./configure --enable-debug --enable-cassert --enable-depend --enable-tap-tests Could you please advise if I'm doing something wrong? For make check please see the attached log. When installing the extension after setting the shared preload library I got an error in the Postgres logfile: 2023-07-27 13:41:52.324 MSK [12738] LOG: database system is shut down 2023-07-27 13:41:52.404 MSK [14126] FATAL: could not load library "/usr/local/pgsql/lib/pg_tracing.so": /usr/local/pgsql/lib/pg_tracing.so: undefined symbol: get_operation_name Thank you! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/ mc.log Description: Binary data
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! This patch looks very interesting; I'm working on the same subject too, but I've used another approach - a C wrapper around the C++ API library from OpenTelemetry, handing span storage and output over to that library. There are some nuances, but it is possible. Have you tried to use the OpenTelemetry APIs instead of implementing all the functionality around spans yourself? On Tue, Jul 25, 2023 at 1:16 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi Anthonin, > > > I have a working prototype of a pg_tracing extension and wanted some > feedback on the design and architecture. > > The patch looks very interesting, thanks for working on it and for > sharing. The facts that the patch doesn't change the core except for > two lines in instrument.{c.h} and that is uses pull-based model: > > > - Collect the spans through a new pg_tracing_spans() function output > > ... IMO were the right design decisions. The patch lacks the > documentation, but this is OK for a PoC. > > I added the patch to the nearest CF [1]. Let's see what the rest of > the community thinks. > > [1] https://commitfest.postgresql.org/44/4456/ > > -- > Best regards, > Aleksander Alekseev > > > -- Regards, -- Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Protect extension' internal tables - how?
Hi, Aleksander, thank you very much. The tables are already placed into a special schema, but there are some dynamically created tables, and the goal is to protect all of these tables from direct insert, update and delete operations by users. I've read up on SECURITY DEFINER; it will do the trick. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Protect extension' internal tables - how?
Hi hackers! While working on an extension I encountered a quite tricky question: the extension (with functions in C) creates tables during function calls, and these tables must be protected from direct user queries while remaining accessible to all functions of this extension for all users allowed to use it. Could you please advise, or give some hint, on the correct (and secure) way to implement this? Currently I make the extension's owner the owner of such a table when creating it inside the function, but maybe there are pitfalls in this kind of solution? Thanks in advance. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: [PATCH] Infinite loop while acquiring new TOAST Oid
Hi! Aleksander, thank you for reminding me of this patch; I'll try to get to it in a few days. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value
Hi! I like the idea of having a standard function which shows the TOAST value ID for a row; I've used my own to handle TOAST errors. Just, maybe, a more correct name would be "...value_id", because the function actually retrieves the valueid field from the TOAST pointer, while a chunk ID consists of valueid + chunk_seq. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
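To make the naming argument concrete, this is roughly the field such a function ends up reading from the pointer (a sketch using the existing varatt macros; the function wrapper itself is illustrative, not the proposed patch):

#include "postgres.h"
#include "varatt.h"

/*
 * Sketch only: extract the value ID from an on-disk TOAST pointer.
 * This is the field being discussed above; the chunks in the TOAST
 * table are then addressed by (va_valueid, chunk_seq).
 */
static Oid
toast_value_id_sketch(struct varlena *attr)
{
	struct varatt_external toast_pointer;

	if (!VARATT_IS_EXTERNAL_ONDISK(attr))
		return InvalidOid;		/* not an on-disk TOAST pointer */

	VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);

	return toast_pointer.va_valueid;
}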
Re: Pluggable toaster
Hi, It seems that I missed the thread mentioned above. I strongly disagree with such a statement: Pluggable TOAST could not be part of the Compression Dictionaries thread, because the TOAST improvement is a more general subject - it involves much deeper and trickier changes in the core, and it is also much more promising in terms of performance and storage improvements. We already have a lot of changes in Pluggable TOAST that were not committed to the main git branch of this thread, so it seems that I have to merge them and reopen it. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Pluggable toaster
Hi! I have a question for the community - why was this patch silently put into "rejected" status? Should there not be some kind of explanation? During this discussion I got the impression that, for some reason, some members of the community do not want the TOAST functionality - which has drawbacks that make it a real curse for an otherwise very good DBMS - to be touched. We cannot get rid of it because of backwards compatibility, so the best way is to make it more adaptable and extensible; this is what this thread is about. We proposed our vision of how to extend TOAST the Postgres way, like Pluggable Storage some time before. There are some very complex subjects left in this topic that really need the community's attention. I've mentioned them above, but there was no feedback on them. Pavel, we already have an UPDATE implementation for TOAST, but it is part of Pluggable TOAST and would hardly exist without it. I've started another thread on extending the TOAST pointer; maybe you would want to participate there [1]. We would still be grateful for feedback. [1] Extending the TOAST Pointer <https://www.postgresql.org/message-id/flat/CAJ7c6TNAYyeMYKVkiwOZChy7UpE_CkjpYOk73gcWTXMkLkEyzw%40mail.gmail.com#59aacdde27dd61277fe7c46c61c84b2c> -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: RFI: Extending the TOAST Pointer
Hi! I've made a WIP patch that uses a 64-bit TOAST value ID instead of the 32-bit one and sent it as part of a discussion, but there was no feedback on such a solution. There is a link to that discussion at the top of this thread. Also, I have to note that, based on our work on Pluggable TOAST, extending the TOAST pointer with additional structures would require a review of the logical replication engine - currently it is not suitable for any custom TOAST pointers, and we have no final solution yet for the logical replication problems with custom TOAST pointers. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: RFI: Extending the TOAST Pointer
Hi, Aleksander, I'm interested in extending the TOAST pointer in various ways. A 64-bit TOAST value ID allows resolving a very complex issue for production systems with large tables and a heavy update rate. I agree with Matthias that there should be no processing of TOAST pointer internals outside the TOAST macros. Currently, a TOASTed value is distinguished via VARATT_IS_EXTERNAL_ONDISK, and it should stay this way. Adding more compression methods requires another implementation (extension) of varatt_external, because the current format has room for very few of them - only two bits are reserved for the compression method - and there is a safe way to do so without affecting the default TOAST mechanics, which we must keep untouched for compatibility and to avoid breaking DB upgrades. Also, I must remind everyone that we should not forget about field alignment inside the TOAST pointer. As was already mentioned, it seems not very reasonable to try to save a byte or two while we are storing out-of-line values of at least 2 kB in size. On Mon, May 22, 2023 at 4:47 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi, > > > I see your point, but I do think we should also think about why we do > > the change. > > Personally at the moment I care only about implementing compression > dictionaries on top of this, as is discussed in the corresponding > thread [1]. I'm going to need new fields in the TOAST pointer's > including (but not necessarily limited to) dictionary id. > > As I understand, Nikita is interested in implementing 64-bit TOAST > pointers [2]. I must admit I didn't follow that thread too closely but > I can imagine the needs are similar. > > Last but not least I remember somebody on the mailing list suggested > adding ZSTD compression support for TOAST, besides LZ4. Assuming I > didn't dream it, the proposal was rejected due to the limited amount > of free bits in ToastCompressionId. It was argued that two possible > combinations that are left should be treated with care and ZSTD will > not bring enough value to the users compared to LZ4. > > These are 3 recent cases I could recall. This being said I think our > solution should be generic enough to cover possible future cases > and/or cases unknown to us yet. > > [1]: > https://postgr.es/m/CAJ7c6TM7%2BsTvwREeL74Y5U91%2B5ymNobRbOmnDRfdTonq9trZyQ%40mail.gmail.com > [2]: https://commitfest.postgresql.org/43/4296/ > > -- > Best regards, > Aleksander Alekseev > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: RFI: Extending the TOAST Pointer
Hi! Matthias, in the Pluggable TOAST thread we proposed additional pointer definition, without modification of the original varatt_external - we have to keep it untouched for compatibility issues. The following extension for the TOAST pointer was proposed: typedef struct varatt_custom { uint16 va_toasterdatalen;/* total size of toast pointer, < BLCKSZ */ uint32 va_rawsize; /* Original data size (includes header) */ uint32 va_toasterid; /* Toaster ID, actually Oid */ char va_toasterdata[FLEXIBLE_ARRAY_MEMBER]; /* Custom toaster data */ } varatt_custom; with the new tag VARTAG_CUSTOM = 127. Rawsize we have to keep because it is used by Executor. And Toaster ID is the OID by which we identify the extension that uses this pointer invariant. On Thu, May 18, 2023 at 5:34 PM Matthias van de Meent < boekewurm+postg...@gmail.com> wrote: > On Thu, 18 May 2023 at 15:12, Aleksander Alekseev > wrote: > > > > Hi, > > > > > I agree that va_tag can have another use. But since we are going to > > > make varatt_external variable in size (otherwise I don't see how it > > > could be really **extendable**) I don't think this is the right > > > approach. > > Why would we modify va_tag=18; data=varatt_external? A different > va_tag option would allow us to keep the current layout around without > much maintenance and a consistent low overhead. > > > On second thought, perhaps we are talking more or less about the same > thing? > > > > It doesn't matter what will be used as a sign of presence of a varint > > bitmask in the pointer. My initial proposal to use ToastCompressionId > > for this is probably redundant since va_tag > 18 will already tell > > that. > > I'm not sure "extendable" would be the right word, but as I see it: > > 1. We need more space to store more metadata; > 2. Essentially all bits in varatt_external are already accounted for; and > 3. There are still many options left in va_tag > > It seems to me that adding a new variant to va_att for marking new > features in the toast infrastructure makes the most sense, as we'd > also be able to do the new things like varints etc without needing to > modify existing toast paths significantly. > > We'd need to stop using the va_tag as length indicator, but I don't > think it's currently assumed to be a length indicator anyway (see > VARSIZE_EXTERNAL(ptr)). By not using the varatt_external struct > currently in use, we could be able to get down to <18B toast pointers > as well, though I'd consider that unlikely. > > Kind regards, > > Matthias van de Meent > Neon, Inc. > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
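To make the dispatch idea a bit more concrete, here is a rough sketch of how detoasting code could branch on the proposed tag (VARTAG_CUSTOM and the varatt_custom struct are taken from the proposal above; everything else, including the function wrapper, is illustrative and not committed code):

/*
 * Sketch only: distinguishing the proposed custom pointer from the
 * regular on-disk one.  VARTAG_CUSTOM (= 127) is the proposed new tag.
 */
static void
dispatch_toast_pointer_sketch(struct varlena *attr)
{
	if (!VARATT_IS_EXTERNAL(attr))
		return;					/* inline value, nothing to dispatch on */

	if (VARTAG_EXTERNAL(attr) == VARTAG_ONDISK)
	{
		/* regular varatt_external pointer - existing detoast path */
	}
	else if (VARTAG_EXTERNAL(attr) == VARTAG_CUSTOM)	/* proposed, = 127 */
	{
		varatt_custom *cptr = (varatt_custom *) VARDATA_EXTERNAL(attr);

		(void) cptr;			/* silence unused-variable warning in this sketch */

		/*
		 * Hand cptr->va_toasterdata over to the toaster identified by
		 * cptr->va_toasterid; cptr->va_rawsize still gives the executor
		 * the original datum size.
		 */
	}
}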
Re: [PATCH] Compression dictionaries for JSONB
Hi hackers! As discussed above, I've created a new thread on the Extension of the TOAST pointer subject - https://www.postgresql.org/message-id/flat/CAN-LCVMq2X%3Dfhx7KLxfeDyb3P%2BBXuCkHC0g%3D9GF%2BJD4izfVa0Q%40mail.gmail.com Please check and comment. On Thu, Apr 27, 2023 at 1:43 PM Nikita Malakhov wrote: > Hi, > > I think I should open a new thread related to TOAST pointer refactoring > based on Pluggable TOAST, COPY and looping in retrieving new TOAST > value OID issues. > > On Wed, Apr 26, 2023 at 4:00 PM Aleksander Alekseev < > aleksan...@timescale.com> wrote: > >> Hi Nikita, >> >> > The External TOAST pointer is very limited to the amount of service data >> > it could keep, that's why we introduced the Custom TOAST pointers in the >> > Pluggable TOAST. But keep in mind that changing the TOAST pointer >> > structure requires a lot of quite heavy modifications in the core - >> along with >> > some obvious places like insert/update/delete datum there is very >> serious >> > issue with logical replication. >> > The Pluggable TOAST was rejected, but we have a lot of improvements >> > based on changing the TOAST pointer structure. >> >> Now I see what you meant [1]. I agree that we should focus on >> refactoring TOAST pointers first. So I suggest we continue discussing >> this in a corresponding thread and return to this one later. >> >> [1]: >> https://www.postgresql.org/message-id/CAJ7c6TPSvR2rKpoVX5TSXo_kMxXF%2B-SxLtrpPaMf907tX%3DnVCw%40mail.gmail.com >> >> -- >> Best regards, >> Aleksander Alekseev >> > > > -- > Regards, > > -- > Nikita Malakhov > Postgres Professional > The Russian Postgres Company > https://postgrespro.ru/ > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
RFI: Extending the TOAST Pointer
Hi hackers! There were several discussions where the limitations of the existing TOAST pointers were mentioned ([1], [2] and [3]), and from time to time this topic appears in other places. We proposed a fresh approach to the TOAST mechanics in [2], but unfortunately the patch was received rather unkindly and was rejected after several iterations, although we still have hopes for it and several very promising features based on it. Anyway, the old TOAST pointer is also the cause of problems like [4], and this part of PostgreSQL is screaming to be revised and improved. TOAST begins with the pointer to the externalized value - the TOAST Pointer - which is very limited in terms of the data it can store, and all TOAST improvements require revising this pointer structure. So we decided to ask the community for thoughts and ideas on how to rework this pointer. The TOAST Pointer (the varatt_external structure) stores 4 fields: [varlena header][<4b - original data size><4b - size in TOAST table><4b - TOAST table OID><4b - ID of chunk>] In [2] we proposed a new Custom TOAST pointer structure whose main feature is extensibility: [varlena header][<2b - total size of the TOAST pointer><4b - size of original data><4b - OID of algorithm used for TOASTing>] where the Custom TOAST Pointer is distinguished from the regular one by the va_flag field, which is part of the varlena header, so the new pointer format does not interfere with the old (regular) one. The first field is necessary because the Custom TOAST pointer has variable length due to the tail used for inline storage, and the original value size is used by the Executor. The third field could be a subject for discussion. Thoughts? Objections? [1] [PATCH] Infinite loop while acquiring new TOAST Oid <https://www.postgresql.org/message-id/CAJ7c6TPSvR2rKpoVX5TSXo_kMxXF%2B-SxLtrpPaMf907tX%3DnVCw%40mail.gmail.com> [2] Pluggable Toaster <https://www.postgresql.org/message-id/flat/224711f9-83b7-a307-b17f-4457ab73aa0a%40sigaev.ru> [3] [PATCH] Compression dictionaries for JSONB <https://www.postgresql.org/message-id/flat/CAJ7c6TOtAB0z1UrksvGTStNE-herK-43bj22%3D5xVBg7S4vr5rQ%40mail.gmail.com> [4] BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table. <https://www.postgresql.org/message-id/flat/16722-93043fb459a41073%40postgresql.org> -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
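For readers who want to compare the first layout above with the code, the existing on-disk pointer it describes is the following struct (copied from the current headers as I understand them; if it has changed since, the headers win):

/*
 * Existing on-disk TOAST pointer (the "regular" pointer above).  The
 * top two bits of va_extinfo carry the compression method and the rest
 * is the external size; va_valueid plus the chunk_seq column in the
 * TOAST table addresses the individual chunks.
 */
typedef struct varatt_external
{
	int32		va_rawsize;		/* Original data size (includes header) */
	uint32		va_extinfo;		/* External saved size (without header) and
								 * compression method */
	Oid			va_valueid;		/* Unique ID of value within TOAST table */
	Oid			va_toastrelid;	/* RelID of TOAST table containing it */
}			varatt_external;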
Re: [PATCH] Compression dictionaries for JSONB
Hi, I think I should open a new thread related to TOAST pointer refactoring based on Pluggable TOAST, COPY and looping in retrieving new TOAST value OID issues. On Wed, Apr 26, 2023 at 4:00 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi Nikita, > > > The External TOAST pointer is very limited to the amount of service data > > it could keep, that's why we introduced the Custom TOAST pointers in the > > Pluggable TOAST. But keep in mind that changing the TOAST pointer > > structure requires a lot of quite heavy modifications in the core - > along with > > some obvious places like insert/update/delete datum there is very serious > > issue with logical replication. > > The Pluggable TOAST was rejected, but we have a lot of improvements > > based on changing the TOAST pointer structure. > > Now I see what you meant [1]. I agree that we should focus on > refactoring TOAST pointers first. So I suggest we continue discussing > this in a corresponding thread and return to this one later. > > [1]: > https://www.postgresql.org/message-id/CAJ7c6TPSvR2rKpoVX5TSXo_kMxXF%2B-SxLtrpPaMf907tX%3DnVCw%40mail.gmail.com > > -- > Best regards, > Aleksander Alekseev > -- Regards, -- Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: [PATCH] Infinite loop while acquiring new TOAST Oid
Hi! Widening the TOAST pointer is possible if we keep backward compatibility with old-fashioned TOAST tables - I mean distinguishing 'long' and 'short' TOAST pointers and processing them accordingly in the insert and delete paths, as well as in vacuum and logical replication. It is not very difficult, but it takes some effort. Recently I found out that I had not covered all compatibility cases, so the provided patch is functional but limited in compatibility. We already have a flag byte in the TOAST pointer which is responsible for the type of the pointer - the va_flag field; it was explained in the Pluggable TOAST patch. One is enough, there is no need to add another one. On Wed, Apr 26, 2023 at 3:55 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi, > > > I agree that we can't simply widen varatt_external to use 8 bytes for > > the toast ID in all cases. > > +1 > > Note that the user may have a table with multiple TOASTable > attributes. If we simply widen the TOAST pointer it may break the > existing tables in the edge case. Also this may be a reason why > certain users may prefer to continue using narrow pointers. IMO wide > TOAST pointers should be a table option. Whether the default for new > tables should be wide or narrow pointers is debatable. > > In another discussion [1] we seem to agree that we also want to have > an ability to include a 32-bit dictionary_id to the TOAST pointers and > perhaps support more compression methods (ZSTD to name one). Besides > that it would be nice to have an ability to extend TOAST pointers in > the future without breaking the existing pointers. One possible > solution would be to add a varint feature bitmask to every pointer. So > we could add flags like TOAST_IS_WIDE, TOAST_HAS_DICTIONARY, > TOAST_UNKNOWN_FEATURE_FROM_2077, etc indefinitely. > > I suggest we address all the current and future needs once and > completely refactor TOAST pointers rather than solving one problem at > a time. I believe this will be more beneficial for the community in > the long term. > > Thoughts? > > [1]: https://commitfest.postgresql.org/43/3626/ > > -- > Best regards, > Aleksander Alekseev > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns
Hi, This is a production case for large databases with high update rates, but it is often mistaken for hitting the table size limit, although the size limit itself is handled correctly. The note on the TOAST limitation does not mention that TOAST values are not actually updated on an UPDATE operation - the old value is marked as dead and a new one is inserted, and dead values have to be vacuumed before a value OID can be reused. The worst part is that the INSERT/UPDATE does not fail when no OID is available - it just spins in an infinite loop searching for an unused OID. On Sat, Apr 22, 2023 at 6:42 PM Gurjeet Singh wrote: > On Fri, Apr 21, 2023 at 12:14 AM Nikita Malakhov > wrote: > > This limitation applies not only to wide tables - it also applies to tables where TOASTed values > > are updated very often. You would soon be out of available TOAST value ID because in case of > > high frequency updates autovacuum cleanup rate won't keep up with the updates. It is discussed > > in [1]. > > > > On Fri, Apr 21, 2023 at 9:39 AM Jakub Wartak < jakub.war...@enterprisedb.com> wrote: > >> I would like to ask if it wouldn't be good idea to copy the > >> https://wiki.postgresql.org/wiki/TOAST#Total_table_size_limit > >> discussion (out-of-line OID usage per TOAST-ed columns / potential > >> limitation) to the official "Appendix K. PostgreSQL Limits" with also > >> little bonus mentioning the "still searching for an unused OID in > >> relation" notice. Although it is pretty obvious information for some > >> and from commit 7fbcee1b2d5f1012c67942126881bd492e95077e and the > >> discussion in [1], I wonder if the information shouldn't be a little > >> more well known via the limitation (especially to steer people away > >> from designing very wide non-partitioned tables). > >> > >> [1] - https://www.postgresql.org/message-id/flat/16722-93043fb459a41073%40postgresql.org > > > > [1] https://www.postgresql.org/message-id/CAN-LCVPRvRzxeUdYdDCZ7UwZQs1NmZpqBUCd%3D%2BRdMPFTyt-bRQ%40mail.gmail.com > > These 2 discussions show that it's a painful experience to run into > this problem, and that the hackers have ideas on how to fix it, but > those fixes haven't materialized for years. So I would say that, yes, > this info belongs in the hard-limits section, because who knows how > long it'll take this to be fixed. > > Please submit a patch. > > I anticipate that edits to Appendix K Postgres Limits will prompt > improving the note in there about the maximum column limit, That note > is too wordy, and sometimes confusing, especially for the audience > that it's written for: newcomers to Postgres ecosystem. > > Best regards, > Gurjeet https://Gurje.et > http://aws.amazon.com > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: [PATCH] Infinite loop while acquiring new TOAST Oid
Hi! No, it wasn't. It was a proposal, I thought I'd get some feedback on it before sending it to commitfest. On Sat, Apr 22, 2023 at 6:17 PM Gurjeet Singh wrote: > On Thu, Dec 22, 2022 at 10:07 AM Nikita Malakhov > wrote: > > Any suggestions on the previous message (64-bit toast value ID with > individual counters)? > > Was this patch ever added to CommitFest? I don't see it in the current > Open Commitfest. > > https://commitfest.postgresql.org/43/ > > Best regards, > Gurjeet http://Gurje.et > http://aws.amazon.com > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns
Hi! This limitation applies not only to wide tables - it also applies to tables where TOASTed values are updated very often. You would soon be out of available TOAST value ID because in case of high frequency updates autovacuum cleanup rate won't keep up with the updates. It is discussed in [1]. On Fri, Apr 21, 2023 at 9:39 AM Jakub Wartak wrote: > Hi -hackers, > > I would like to ask if it wouldn't be good idea to copy the > https://wiki.postgresql.org/wiki/TOAST#Total_table_size_limit > discussion (out-of-line OID usage per TOAST-ed columns / potential > limitation) to the official "Appendix K. PostgreSQL Limits" with also > little bonus mentioning the "still searching for an unused OID in > relation" notice. Although it is pretty obvious information for some > and from commit 7fbcee1b2d5f1012c67942126881bd492e95077e and the > discussion in [1], I wonder if the information shouldn't be a little > more well known via the limitation (especially to steer people away > from designing very wide non-partitioned tables). > > Regards, > -J. > > [1] - > https://www.postgresql.org/message-id/flat/16722-93043fb459a41073%40postgresql.org > > > [1] https://www.postgresql.org/message-id/CAN-LCVPRvRzxeUdYdDCZ7UwZQs1NmZpqBUCd%3D%2BRdMPFTyt-bRQ%40mail.gmail.com -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: [PATCH] Compression dictionaries for JSONB
there is no expensive TOAST lookup for the > attributes of the tuple; but that doesn't seem to be an option with > that design. > > > When user wants a given attribute to be compressed, he/she says: > > > > ALTER TABLE foo ALTER COLUMN bar SET STORAGE DICTIONARY; > > > > And the compression algorithms is chosen as usual: > > > > ALTER TABLE foo ALTER COLUMN bar SET COMPRESSION lz4; > > > > When there are no dictionaries yet, DICTIONARY works the same as > > EXTENDED. When a dictionary is trained the data is compressed using > > the latest version of this dictionary. For visibility we are going to > > need some sort of pg_stat_dictionaries view that shows the existing > > dictionaries, how much space they consume, etc. > > I think "AT_AC SET COMPRESSION lz4 {[WITH | WITHOUT] DICTIONARY}", > "AT_AC SET COMPRESSION lz4-dictionary", or "AT_AC SET > compression_dictionary = on" would be better from a design > perspective. > > > If we choose this approach there are a couple of questions/notes that > > come to mind: > > > > * The default compression algorithm is PGLZ and unlike LZ4 it doesn't > > support training dictionaries yet. This should be straightforward to > > implement though, or alternatively shared dictionaries could work only > > for LZ4; > > Didn't we get zstd support recently as well? > > > * Currently users have control of toast_tuple_target but not > > TOAST_TUPLE_THRESHOLD. Which means for tuples smaller than 1/4 of the > > page size shared dictionaries are not going to be triggered. Which is > > not necessarily a bad thing. Alternatively we could give the users > > toast_tuple_threshold setting. This shouldn't necessarily be part of > > this CF entry discussion however, we can always discuss it separately; > > That makes a lot of sense, but as you said handling that separately > would probably be better and easier to review. > > > * Should we allow setting DICTIONARY storage strategy for a given > > type, i.e. CREATE TYPE baz STORAGE = DICTIONARY? I suggest we forbid > > it in the first implementation, just for the sake of simplicity. > > Can we specify a default compression method for each postgresql type, > just like how we specify the default storage? If not, then the setting > could realistically be in conflict with a default_toast_compression > setting, assuming that dictionary support is not a requirement for > column compression methods. > > > * It looks like we won't be able to share a dictionary between > > multiple columns. Which again is not necessarily a bad thing: data in > > these columns can be completely different (e.g. BYTEA and XML), > > columns can be dropped independently, etc. > > Yes > > > If a user is interested in > > sharing a dictionary between several columns he/she can join these > > columns in a single JSONB column. > > It is unreasonable to expect this to be possible, due to e.g. > partitioning resulting in columns that share compressable patters to > be on different physical tables. > > > * TOAST currently doesn't support ZSTD. IMO this is not a big deal and > > adding the corresponding support can be discussed separately. > > * If memory serves, there were not so many free bits left in TOAST > > pointers. The pointers don't store a storage strategy though so > > hopefully this will not be a problem. We'll see. > > The toast pointer must store enough info about the compression used to > decompress the datum, which implies it needs to store the compression > algorithm used, and a reference to the compression dictionary (if > any). 
I think the idea about introducing a new toast pointer type (in > the custom toast patch) wasn't bad per se, and that change would allow > us to carry more or different info in the header. > > Kind regards, > > Matthias van de Meent > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: SQL/JSON revisited
Hi hackers! The latest SQL standard contains dot notation for JSON. Are there any plans to include it into Pg 16? Or maybe we should start a separate thread for it? Thanks! On Tue, Apr 4, 2023 at 3:36 PM Alvaro Herrera wrote: > On 2023-Apr-04, Amit Langote wrote: > > > On Tue, Apr 4, 2023 at 2:16 AM Alvaro Herrera > wrote: > > > > - the gram.y solution to the "ON ERROR/ON EMPTY" clauses is quite ugly. > > > I think we could make that stuff use something similar to > > > ConstraintAttributeSpec with an accompanying post-processing > function. > > > That would reduce the number of ad-hoc hacks, which seem excessive. > > > > Do you mean the solution involving the JsonBehavior node? > > Right. It has spilled as the separate on_behavior struct in the core > parser %union in addition to the raw jsbehavior, which is something > we've gone 30 years without having, and I don't see why we should start > now. > > This stuff is terrible: > > json_exists_error_clause_opt: > json_exists_error_behavior ON ERROR_P { $$ = $1; } > | /* EMPTY */ { $$ = NULL; } > ; > > json_exists_error_behavior: > ERROR_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, > NULL); } > | TRUE_P{ $$ = makeJsonBehavior(JSON_BEHAVIOR_TRUE, > NULL); } > | FALSE_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_FALSE, > NULL); } > | UNKNOWN { $$ = makeJsonBehavior(JSON_BEHAVIOR_UNKNOWN, > NULL); } > ; > > json_value_behavior: > NULL_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_NULL, NULL); > } > | ERROR_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, > NULL); } > | DEFAULT a_expr{ $$ = > makeJsonBehavior(JSON_BEHAVIOR_DEFAULT, $2); } > ; > > json_value_on_behavior_clause_opt: > json_value_behavior ON EMPTY_P > { $$.on_empty = $1; $$.on_error = > NULL; } > | json_value_behavior ON EMPTY_P json_value_behavior ON ERROR_P > { $$.on_empty = $1; $$.on_error = $4; } > | json_value_behavior ON ERROR_P > { $$.on_empty = NULL; $$.on_error = > $1; } > | /* EMPTY */ > { $$.on_empty = NULL; $$.on_error = > NULL; } > ; > > json_query_behavior: > ERROR_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, > NULL); } > | NULL_P{ $$ = makeJsonBehavior(JSON_BEHAVIOR_NULL, > NULL); } > | EMPTY_P ARRAY { $$ = > makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL); } > /* non-standard, for Oracle compatibility only */ > | EMPTY_P { $$ = > makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL); } > | EMPTY_P OBJECT_P { $$ = > makeJsonBehavior(JSON_BEHAVIOR_EMPTY_OBJECT, NULL); } > | DEFAULT a_expr{ $$ = > makeJsonBehavior(JSON_BEHAVIOR_DEFAULT, $2); } > ; > > json_query_on_behavior_clause_opt: > json_query_behavior ON EMPTY_P > { $$.on_empty = $1; $$.on_error = > NULL; } > | json_query_behavior ON EMPTY_P json_query_behavior ON ERROR_P > { $$.on_empty = $1; $$.on_error = $4; } > | json_query_behavior ON ERROR_P > { $$.on_empty = NULL; $$.on_error = > $1; } > | /* EMPTY */ > { $$.on_empty = NULL; $$.on_error = > NULL; } > ; > > Surely this can be made cleaner. > > By the way -- that comment about clauses being non-standard, can you > spot exactly *which* clauses that comment applies to? > > -- > Álvaro Herrera PostgreSQL Developer — > https://www.EnterpriseDB.com/ > "El número de instalaciones de UNIX se ha elevado a 10, > y se espera que este número aumente" (UPM, 1972) > > > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: SQL JSON path enhanced numeric literals
Hi! Sorry to bother, but there is a question on JsonPath - how many bits in the JsonPath header could be used for the version? The JsonPath header is 4 bytes, and currently the Version part is defined as #define JSONPATH_VERSION (0x01) Thanks! On Sun, Mar 5, 2023 at 6:55 PM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > On 03.03.23 21:16, Dean Rasheed wrote: > > I think this new feature ought to be mentioned in the docs somewhere. > > Perhaps a sentence or two in the note below table 9.49 would suffice, > > since it looks like that's where jsonpath numbers are mentioned for > > the first time. > > Done. I actually put it into the data types chapter, where some other > differences between SQL and SQL/JSON syntax were already discussed. > > > In jsonpath_scan.l, I think the hex/oct/bininteger cases could do with > > a comment, such as > > > > /* Non-decimal integers in ECMAScript; must not have underscore after > radix */ > > hexinteger0[xX]{hexdigit}(_?{hexdigit})* > > octinteger0[oO]{octdigit}(_?{octdigit})* > > bininteger0[bB]{bindigit}(_?{bindigit})* > > > > since that's different from the main lexer's syntax. > > done > > > Perhaps it's worth mentioning that difference in the docs. > > done > > > Otherwise, this looks good to me. > > committed > > > > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: JsonPath version bits
Hi hackers! Could the 1 byte from the JsonPath header be used to store version? Or how many bits from the header could be used for the version value? On Mon, Mar 27, 2023 at 12:54 PM Nikita Malakhov wrote: > Hi hackers! > > I've got a question on the JsonPath header - currently the header size > is 4 bytes, where there are version and mode bits. Is there somewhere > a defined size of the version part? There are some extensions working > with JsonPath, and we have some too, thus it is important how many > bits is it possible to use to store a version value? > > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
JsonPath version bits
Hi hackers! I've got a question on the JsonPath header - currently the header size is 4 bytes, holding both version and mode bits. Is the size of the version part defined anywhere? There are some extensions working with JsonPath, and we have some too, so it is important to know how many bits can be used to store a version value. Thanks! -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
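For reference, this is roughly how the header is defined and checked today, as far as I can tell from jsonpath.h and the jsonpath executor (please correct me if I misread it) - which would leave everything except the single lax/strict bit available for a version value:

/* From the current headers: the 4-byte header keeps the lax/strict mode
 * in the top bit and the version in the remaining bits. */
#define JSONPATH_VERSION	(0x01)
#define JSONPATH_LAX		(0x80000000)

/* And execution rejects anything but the current version, roughly: */
if ((path->header & ~JSONPATH_LAX) != JSONPATH_VERSION)
	elog(ERROR, "unrecognized jsonpath version: %d",
		 (int) (path->header & ~JSONPATH_LAX));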
Re: [PATCH] Compression dictionaries for JSONB
Hi! If I understand Andres' message correctly, the proposition is to make the use of compression dictionaries automatic, possibly just by setting a parameter when the table is created, something like CREATE TABLE t ( ..., t JSONB USE DICTIONARY); The question is then how to create such dictionaries automatically and extend them while data is being added to the table, because it is not unusual for circumstances to change over time so that a rather small table starts being loaded with huge amounts of data. I prefer extending a dictionary over re-creating it, because while a dictionary is being recreated we leave users two choices - to wait until dictionary creation is over, or to use the old version (say, kept as a snapshot while the new one is created). Keeping many versions simultaneously does not make sense and would inflate the DB size. Also, compressing small data with a large dictionary (the case for a one-dictionary-for-many-tables setup) would, I think, add considerable overhead to INSERT/UPDATE commands, so the most reasonable choice is a per-table dictionary. Am I right? Any ideas on how to create and extend such dictionaries automatically? On Thu, Feb 9, 2023 at 2:01 PM Andres Freund wrote: > Hi, > > On February 9, 2023 2:50:57 AM PST, Aleksander Alekseev < aleksan...@timescale.com> wrote: > >Hi Andres, > > > >> > So to clarify, are we talking about tuple-level compression? Or > >> > perhaps page-level compression? > >> > >> Tuple level. > > > >> although my own patch proposed attribute-level compression, not > >> tuple-level one, it is arguably closer to tuple-level approach than > >> page-level one > > > >Just wanted to make sure that by tuple-level we mean the same thing. > > > >When saying tuple-level do you mean that the entire tuple should be > >compressed as one large binary (i.e. similarly to page-level > >compression but more granularly), or every single attribute should be > >compressed separately (similarly to how TOAST does this)? > > Good point - should have been clearer. I meant attribute wise compression. > Like we do today, except that we would use a dictionary to increase > compression rates. > > Andres > -- > Sent from my Android device with K-9 Mail. Please excuse my brevity. > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: [PATCH] Compression dictionaries for JSONB
Hi, On updating dictionary - >You cannot "just" refresh a dictionary used once to compress an >object, because you need it to decompress the object too. and when you have many - updating an existing dictionary requires going through all objects compressed with it in the whole database. It's a very tricky question how to implement this feature correctly. Also, there are some thoughts on using JSON schema to optimize storage for JSON objects. (That's applied to the TOAST too, so at first glance we've decided to forbid dropping or changing TOAST implementations already registered in a particular database.) In my experience, in modern world, even with fast SSD storage arrays, with large database (about 40-50 Tb) we had disk access as a bottleneck more often than CPU, except for the cases with a lot of parallel execution threads for a single query (Oracle). On Mon, Feb 6, 2023 at 10:33 PM Andres Freund wrote: > Hi, > > On 2023-02-06 16:16:41 +0100, Matthias van de Meent wrote: > > On Mon, 6 Feb 2023 at 15:03, Aleksander Alekseev > > wrote: > > > > > > Hi, > > > > > > I see your point regarding the fact that creating dictionaries on a > > > training set is too beneficial to neglect it. Can't argue with this. > > > > > > What puzzles me though is: what prevents us from doing this on a page > > > level as suggested previously? > > > > The complexity of page-level compression is significant, as pages are > > currently a base primitive of our persistency and consistency scheme. > > +many > > It's also not all a panacea performance-wise, datum-level decompression can > often be deferred much longer than page level decompression. For things > like > json[b], you'd hopefully normally have some "pre-filtering" based on proper > columns, before you need to dig into the json datum. > > It's also not necessarily that good, compression ratio wise. Particularly > for > wider datums you're not going to be able to remove much duplication, > because > there's only a handful of tuples. Consider the case of json keys - the > dictionary will often do better than page level compression, because it'll > have the common keys in the dictionary, which means the "full" keys never > will > have to appear on a page, whereas page-level compression will have the > keys on > it, at least once. > > Of course you can use a dictionary for page-level compression too, but the > gains when it works well will often be limited, because in most OLTP usable > page-compression schemes I'm aware of, you can't compress a page all that > far > down, because you need a small number of possible "compressed page sizes". > > > > > More similar data you compress the more space and disk I/O you save. > > > Additionally you don't have to compress/decompress the data every time > > > you access it. Everything that's in shared buffers is uncompressed. > > > Not to mention the fact that you don't care what's in pg_attribute, > > > the fact that schema may change, etc. There is a table and a > > > dictionary for this table that you refresh from time to time. Very > > > simple. > > > > You cannot "just" refresh a dictionary used once to compress an > > object, because you need it to decompress the object too. > > Right. That's what I was trying to refer to when mentioning that we might > need > to add a bit of additional information to the varlena header for datums > compressed with a dictionary. > > Greetings, > > Andres Freund > -- Regards, -- Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Pluggable toaster
Hi, It'll be great to see more opinions on the approach as a whole. >The problem that I see being raised here, is that there was little >discussion and no observed community consensus about the design of >this complex feature *before* this patch with high complexity was >provided. >The next action that was requested is to take a step back and decide >how we would want to implement type-aware TOASTing (and the associated >patch compression dictionaries) *before* we look into the type-aware >toasting. We decided to present this improvement as a patch because we thought that the most complex and questionable part would be the TOAST implementations (the Toasters) themselves, while Pluggable TOAST is just a tool to make plugging different TOAST implementations in clean and simple. -- Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Pluggable toaster
Hi! The existing TOAST has several very painful drawbacks - the lack of an UPDATE operation, bloating of TOAST tables, and limits that are implicitly imposed on base tables by their TOAST tables - so it seems unfair to say that Pluggable TOAST does not solve any problems but just introduces new ones. The main reason behind this decision is that keeping the first implementation outside of vanilla (I mean rebasing it) over time is very difficult due to the very invasive nature of this solution. So we decided to reduce changes in the core to the minimum necessary to make it available through hooks, because the hooks part is very lightweight and simple to keep rebasing onto the vanilla core. We plan to keep this extension free under the PostgreSQL license, so any PostgreSQL user could benefit from TOAST on steroids, and sometime in the future it will be a much simpler task to integrate Pluggable TOAST into vanilla, along with our advanced TOAST implementations, which we plan to keep under Open Source licenses too. On Mon, Feb 6, 2023 at 1:49 PM Alvaro Herrera wrote: > On 2023-Feb-06, Nikita Malakhov wrote: > > > Currently we're busy revising the whole Pluggable TOAST API to make it > > available as an extension and based on hooks to minimize changes in > > the core. It will be available soon. > > Hmm, I'm not sure why would PGDG want to accept such a thing. I read > "minimize changes" as "open source Postgres can keep their crap > implementation and companies will offer good ones for a fee". I'd > rather have something that can give users direct benefit -- not hide it > behind proprietary licenses forcing each company to implement their own > performant toaster. > > -- > Álvaro Herrera 48°01'N 7°57'E — > https://www.EnterpriseDB.com/ > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Pluggable toaster
Hi hackers! In response to an opinion in the thread on Compression dictionaries for JSONB [1]: >The approaches are completely different, >but they seem to be trying to fix the same problem: the fact that the >default TOAST stuff isn't good enough for JSONB. The problem, actually, is that the default TOAST is often not good enough for modern loads and amounts of data. Pluggable TOAST is based not only on pure enthusiasm but on demands and tickets from production databases. The main demand is an effective and transparent storage subsystem for large values of some problematic data types, which we already have, with proven efficiency. So we're really quite surprised that it has gotten so little feedback. We've got some opinions on the approach, but no general one on the approach itself, apart from doubts about whether the TOAST mechanism needs revision at all. Currently we're busy revising the whole Pluggable TOAST API to make it available as an extension, based on hooks, to minimize changes in the core. It will be available soon. > [1] https://postgr.es/m/20230203095540.zutul5vmsbmantbm@alvherre.pgsql > -- Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Inconsistency in vacuum behavior
Hi! Yes, I've checked that. What would be the desirable behavior in the case above? Anyway, waiting for the table to be unlocked seems not quite right. On Sat, Jan 21, 2023 at 4:12 AM Nathan Bossart wrote: > On Mon, Jan 16, 2023 at 11:18:08AM +0300, Alexander Pyhalov wrote: > > Is it intended? Why don't we perform vacuum_is_permitted_for_relation() > > check for inheritors in expand_vacuum_rel()? > > Since no lock is held on the partition, the calls to functions like > object_ownercheck() and pg_class_aclcheck() in > vacuum_is_permitted_for_relation() will produce cache lookup ERRORs if the > relation is concurrently dropped. > > -- > Nathan Bossart > Amazon Web Services: https://aws.amazon.com > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: run pgindent on a regular basis / scripted manner
Hi! I've run pgindent on the whole of Postgres and it changed an awful lot of source files. Won't that create a lot of merge conflicts? On Mon, Jan 23, 2023 at 8:48 PM Alvaro Herrera wrote: > On 2023-Jan-23, Tom Lane wrote: > > > 1. [...] So now I think that we should > > stick to the convention that it's on the user to install > > pg_bsd_indent somewhere in their PATH; all we'll be doing with > > this change is eliminating the step of fetching pg_bsd_indent's > > source files from somewhere else. > > +1 > > > 2. Given #1, it'll be prudent to continue having pgindent > > double-check that pg_bsd_indent reports a specific version > > number. We could imagine starting to use the main Postgres > > version number for that, but I'm inclined to continue with > > its existing numbering series. > > +1 > > > 3. If we do nothing special, the first mass reindentation is > > going to reformat the pg_bsd_indent sources per PG style, > > which is ... er ... not the way they look now. Do we want > > to accept that outcome, or take steps to prevent pgindent > > from processing pg_bsd_indent? I have a feeling that manual > > cleanup would be necessary if we let such reindentation > > happen, but I haven't experimented. > > Hmm, initially it must just be easier to have an exception so that > pg_bsd_indent itself isn't indented. > > -- > Álvaro Herrera 48°01'N 7°57'E — > https://www.EnterpriseDB.com/ > #error "Operator lives in the wrong > universe" > ("Use of cookies in real-time system development", M. Gleixner, M. Mc > Guire) > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Inconsistency in vacuum behavior
Hi! For the test Alexander described in the beginning of the discussion - the results are postgres@postgres=# set role regress_vacuum_conflict; SET Time: 0.324 ms postgres@postgres=> vacuum vacuum_tab; WARNING: permission denied to vacuum "vacuum_tab", skipping it WARNING: permission denied to vacuum "vacuum_tab_1", skipping it WARNING: permission denied to vacuum "vacuum_tab_2", skipping it VACUUM Time: 1.782 ms postgres@postgres=> vacuum full; WARNING: permission denied to vacuum "pg_statistic", skipping it WARNING: permission denied to vacuum "vacuum_tab", skipping it ... postgres@postgres=> cluster vacuum_tab; ERROR: must be owner of table vacuum_tab Time: 0.384 ms For the standard role "Postgres" the behavior is the same as before patch. On Thu, Jan 19, 2023 at 10:37 AM Alexander Pyhalov wrote: > Justin Pryzby писал 2023-01-19 04:49: > > On Mon, Jan 16, 2023 at 08:12:18PM +0300, Nikita Malakhov wrote: > >> Hi, > >> > >> Currently there is no error in this case, so additional thrown error > >> would > >> require a new test. > >> Besides, throwing an error here does not make sense - it is just a > >> check > >> for a vacuum > >> permission, I think the right way is to just skip a relation that is > >> not > >> suitable for vacuum. > >> Any thoughts or objections? > > > > Could you check if this is consistent between the behavior of VACUUM > > FULL and CLUSTER ? See also Nathan's patches. > > Hi. > > Cluster behaves in a different way - it errors out immediately if > relation is not owned by user. For partitioned rel it would anyway raise > error later. > VACUUM and VACUUM FULL behave consistently after applying Nikita's patch > (for partitioned and regular tables) - issue warning "skipping > TABLE_NAME --- only table or database owner can vacuum it" and return > success status. > > -- > Best regards, > Alexander Pyhalov, > Postgres Professional > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Inconsistency in vacuum behavior
Hi! I've found the discussion you'd mentioned before, checking now. On Thu, Jan 19, 2023 at 4:49 AM Justin Pryzby wrote: > On Mon, Jan 16, 2023 at 08:12:18PM +0300, Nikita Malakhov wrote: > > Hi, > > > > Currently there is no error in this case, so additional thrown error > would > > require a new test. > > Besides, throwing an error here does not make sense - it is just a check > > for a vacuum > > permission, I think the right way is to just skip a relation that is not > > suitable for vacuum. > > Any thoughts or objections? > > Could you check if this is consistent between the behavior of VACUUM > FULL and CLUSTER ? See also Nathan's patches. > > -- > Justin > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Inconsistency in vacuum behavior
Hi hackers! Alexander found a very good issue. Please check the solution above. Any objections? It's a production case, please review, any thoughts and objections are welcome. On Mon, Jan 16, 2023 at 8:15 PM Alexander Pyhalov wrote: > Nikita Malakhov писал 2023-01-16 20:12: > > Hi, > > > > Currently there is no error in this case, so additional thrown error > > would require a new test. > > Besides, throwing an error here does not make sense - it is just a > > check for a vacuum > > permission, I think the right way is to just skip a relation that is > > not suitable for vacuum. > > Any thoughts or objections? > > > > No objections for not throwing an error. > > -- > Best regards, > Alexander Pyhalov, > Postgres Professional > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Inconsistency in vacuum behavior
Hi, Currently there is no error in this case, so additional thrown error would require a new test. Besides, throwing an error here does not make sense - it is just a check for a vacuum permission, I think the right way is to just skip a relation that is not suitable for vacuum. Any thoughts or objections? On Mon, Jan 16, 2023 at 7:46 PM Alexander Pyhalov wrote: > Nikita Malakhov писал 2023-01-16 17:26: > > Hi! > > > > Here's the patch that fixes this case, please check it out. > > The patch adds vacuum_is_permitted_for_relation() check before adding > > partition relation to the vacuum list, and if permission is denied the > > relation > > is not added, so it is not passed to vacuum_rel() and there are no try > > to > > acquire the lock. > > > > Cheers! > > Hi. > > The patch seems to solve the issue. > Two minor questions I have: > 1) should we error out if HeapTupleIsValid(part_tuple) is false? > 2) comment "Check partition relations for vacuum permit" seems to be > broken in some way. > > -- > Best regards, > Alexander Pyhalov, > Postgres Professional > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Inconsistency in vacuum behavior
Hi! Here's the patch that fixes this case, please check it out. The patch adds vacuum_is_permitted_for_relation() check before adding partition relation to the vacuum list, and if permission is denied the relation is not added, so it is not passed to vacuum_rel() and there are no try to acquire the lock. Cheers! On Mon, Jan 16, 2023 at 4:48 PM Nikita Malakhov wrote: > Hi! > > I've checked this expand_vacuum_rel() and made a quick fix for this.Here's > the result of the test: > > postgres@postgres=# set role regress_vacuum_conflict; > SET > Time: 0.369 ms > postgres@postgres=> vacuum vacuum_tab; > WARNING: permission denied to vacuum "vacuum_tab", skipping it > WARNING: permission denied to vacuum "vacuum_tab_1", skipping it > WARNING: permission denied to vacuum "vacuum_tab_2", skipping it > VACUUM > Time: 0.936 ms > postgres@postgres=> > > Looks like it's a subject for a patch. > > On Mon, Jan 16, 2023 at 11:18 AM Alexander Pyhalov < > a.pyha...@postgrespro.ru> wrote: > >> Hi. >> >> We've run regress isolation tests on partitioned tables and found >> interesting VACUUM behavior. I'm not sure, if it's intended. >> >> In the following example, partitioned tables and regular tables behave >> differently: >> >> CREATE TABLE vacuum_tab (a int) PARTITION BY HASH (a); >> CREATE TABLE vacuum_tab_1 PARTITION OF vacuum_tab FOR VALUES WITH >> (MODULUS 2, REMAINDER 0); >> CREATE TABLE vacuum_tab_2 PARTITION OF vacuum_tab FOR VALUES WITH >> (MODULUS 2, REMAINDER 1); >> CREATE ROLE regress_vacuum_conflict; >> >> In the first session: >> >> begin; >> LOCK vacuum_tab IN SHARE UPDATE EXCLUSIVE MODE; >> >> In the second: >> SET ROLE regress_vacuum_conflict; >> VACUUM vacuum_tab; >> WARNING: permission denied to vacuum "vacuum_tab", skipping it < >> hangs here, trying to lock vacuum_tab_1 >> >> In non-partitioned case second session exits after emitting warning. In >> partitioned case, it hangs, trying to get locks. >> This is due to the fact that in expand_vacuum_rel() we skip parent table >> if vacuum_is_permitted_for_relation(), but don't perform such check for >> its child. >> The check will be performed later in vacuum_rel(), but after >> vacuum_open_relation(), which leads to hang in the second session. >> >> Is it intended? Why don't we perform vacuum_is_permitted_for_relation() >> check for inheritors in expand_vacuum_rel()? >> >> -- >> Best regards, >> Alexander Pyhalov, >> Postgres Professional >> >> >> > > -- > Regards, > Nikita Malakhov > Postgres Professional > https://postgrespro.ru/ > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/ 0001_expand_vac_rel_part_chk_v1.patch Description: Binary data
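Since the attachment is stored as binary data in the archive, the shape of the change is roughly the following (a sketch reconstructed from the description above, not the patch itself; list handling and variable names are illustrative):

/*
 * Sketch only: inside expand_vacuum_rel(), check each partition with
 * vacuum_is_permitted_for_relation() before queueing it, so vacuum_rel()
 * never even tries to lock a partition the role is not allowed to vacuum.
 */
foreach(lc, part_oids)
{
	Oid			part_oid = lfirst_oid(lc);
	HeapTuple	part_tuple;

	part_tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(part_oid));
	if (!HeapTupleIsValid(part_tuple))
		continue;				/* partition dropped concurrently, skip it */

	if (vacuum_is_permitted_for_relation(part_oid,
										 (Form_pg_class) GETSTRUCT(part_tuple),
										 options))
		vacrels = lappend(vacrels,
						  makeVacuumRelation(NULL, part_oid, NIL));

	ReleaseSysCache(part_tuple);
}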
Re: Inconsistency in vacuum behavior
Hi! I've checked this expand_vacuum_rel() and made a quick fix for this.Here's the result of the test: postgres@postgres=# set role regress_vacuum_conflict; SET Time: 0.369 ms postgres@postgres=> vacuum vacuum_tab; WARNING: permission denied to vacuum "vacuum_tab", skipping it WARNING: permission denied to vacuum "vacuum_tab_1", skipping it WARNING: permission denied to vacuum "vacuum_tab_2", skipping it VACUUM Time: 0.936 ms postgres@postgres=> Looks like it's a subject for a patch. On Mon, Jan 16, 2023 at 11:18 AM Alexander Pyhalov wrote: > Hi. > > We've run regress isolation tests on partitioned tables and found > interesting VACUUM behavior. I'm not sure, if it's intended. > > In the following example, partitioned tables and regular tables behave > differently: > > CREATE TABLE vacuum_tab (a int) PARTITION BY HASH (a); > CREATE TABLE vacuum_tab_1 PARTITION OF vacuum_tab FOR VALUES WITH > (MODULUS 2, REMAINDER 0); > CREATE TABLE vacuum_tab_2 PARTITION OF vacuum_tab FOR VALUES WITH > (MODULUS 2, REMAINDER 1); > CREATE ROLE regress_vacuum_conflict; > > In the first session: > > begin; > LOCK vacuum_tab IN SHARE UPDATE EXCLUSIVE MODE; > > In the second: > SET ROLE regress_vacuum_conflict; > VACUUM vacuum_tab; > WARNING: permission denied to vacuum "vacuum_tab", skipping it < > hangs here, trying to lock vacuum_tab_1 > > In non-partitioned case second session exits after emitting warning. In > partitioned case, it hangs, trying to get locks. > This is due to the fact that in expand_vacuum_rel() we skip parent table > if vacuum_is_permitted_for_relation(), but don't perform such check for > its child. > The check will be performed later in vacuum_rel(), but after > vacuum_open_relation(), which leads to hang in the second session. > > Is it intended? Why don't we perform vacuum_is_permitted_for_relation() > check for inheritors in expand_vacuum_rel()? > > -- > Best regards, > Alexander Pyhalov, > Postgres Professional > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Pluggable toaster
Hi! Fails due to recent changes. Working on it. On Sat, Jan 14, 2023 at 9:56 AM vignesh C wrote:
> On Sun, 8 Jan 2023 at 01:40, Nikita Malakhov wrote:
> > Hi!
> > Thank you for your attention.
> > I've rebased the patchset onto the latest master (from 07.01), but the second part is still
> > in pre-patch shape - it is working, but tests are failing due to changes in TOAST relations
> > logic - I haven't adapted 'em yet.
>
> The patch does not apply on top of HEAD as in [1], please post a rebased patch:
> === Applying patches on top of PostgreSQL commit ID c44f6334ca6ff6d242d9eb6742441bc4e1294067 ===
> === expanding ./0002_toaster_default_v25.patch.gz
> === expanding ./0001_toaster_interface_v25.patch.gz
> === expanding ./0004_drop_toaster_v25.patch.gz
> === expanding ./0003_pg_toastrel_control_v25.patch.gz
> === applying patch ./0001_toaster_interface_v25.patch
> patching file src/include/postgres.h
> Hunk #1 succeeded at 80 with fuzz 2 (offset 4 lines).
> Hunk #2 FAILED at 148.
> Hunk #3 FAILED at 315.
> Hunk #4 FAILED at 344.
> Hunk #5 FAILED at 359.
> 4 out of 5 hunks FAILED -- saving rejects to file src/include/postgres.h.rej
>
> [1] - http://cfbot.cputube.org/patch_41_3490.log
>
> Regards,
> Vignesh
-- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: How to define template types in PostgreSQL
Hi! Perhaps an API could be created that defines a general function set with polymorphic input and dispatches to an implementation chosen by the actual input type? On Sat, Jan 7, 2023 at 12:32 PM Esteban Zimanyi wrote: > Dear all > > MobilityDB (https://github.com/MobilityDB/MobilityDB) defines at the C > level four template types: Set, Span, SpanSet, and Temporal. The type Set > is akin to PostgreSQL's ArrayType restricted to one dimension, but enforces > the constraint that sets do not have duplicates, the types Span and SpanSet > are akin to PostgreSQL's RangeType and MultirangeType but enforce the > constraints that span types are of fixed length and that empty spans and > infinite bounds are not allowed, and the type Temporal is used to > manipulate time-varying values. > > These template types need to be instantiated at the SQL level with base > types (int, bigint, float, timestamptz, text, ...) and because of this, > MobilityDB needs to define numerous SQL functions that all call the same > function in C. Taking as example the Set type, we need to define, e.g., > > CREATE FUNCTION intset_eq(intset, intset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_eq' ... > CREATE FUNCTION bigintset_eq(bigintset, bigintset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_eq' ... > CREATE FUNCTION floatset_eq(floatset, floatset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_eq' ... > CREATE FUNCTION textset_eq(textset, textset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_eq' ... > ... > > CREATE FUNCTION intset_ne(intset, intset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_ne' ... > CREATE FUNCTION bigintset_ne(bigintset, bigintset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_ne' ... > CREATE FUNCTION floatset_ne(floatset, floatset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_ne' ... > CREATE FUNCTION textset_ne(textset, textset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_ne' ... > ... > > In the case of arrays, ranges, and multiranges, PostgreSQL avoids this > redundancy using pseudo-types such as anyarray, anyrange, anymultirange, ... > > Is there a possibility that we can also define pseudo types such as > anyset, anyspan, anyspanset, anytemporal, ...? > > This will considerably reduce the number of SQL functions to define. > Currently, given the high number of functions in MobilityDB, creating the > extension takes a long time. > > Regards > > Esteban > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
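A minimal sketch of that dispatch idea follows (assumptions: a polymorphic SQL-level declaration would be available, Set_eq is the entry-point name taken from the quoted message, and set_eq_int4()/set_eq_float8() are hypothetical placeholder implementations, not MobilityDB code):

#include "postgres.h"
#include "fmgr.h"
#include "catalog/pg_type.h"

PG_MODULE_MAGIC;

/* Hypothetical per-type implementations; real ones would compare the sets. */
static Datum
set_eq_int4(PG_FUNCTION_ARGS)
{
    PG_RETURN_BOOL(false);      /* placeholder body */
}

static Datum
set_eq_float8(PG_FUNCTION_ARGS)
{
    PG_RETURN_BOOL(false);      /* placeholder body */
}

PG_FUNCTION_INFO_V1(Set_eq);

Datum
Set_eq(PG_FUNCTION_ARGS)
{
    /* Resolve the concrete type the polymorphic first argument was called with. */
    Oid         argtype = get_fn_expr_argtype(fcinfo->flinfo, 0);

    switch (argtype)
    {
        case INT4OID:
            return set_eq_int4(fcinfo);
        case FLOAT8OID:
            return set_eq_float8(fcinfo);
        default:
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg("set equality not implemented for type %u", argtype)));
    }

    PG_RETURN_NULL();           /* not reached */
}

With a single polymorphic entry point like this, only one SQL-level declaration per operation would be needed instead of one per base type.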
Re: split TOAST support out of postgres.h
Hi, I've thought about this while working on Pluggable TOAST and the 64-bit TOAST value ID myself. I agree that #2 seems the best of the options above. There are not so many places where the new header would need to be included. On Wed, Dec 28, 2022 at 6:08 PM Tom Lane wrote: > Peter Eisentraut writes: > > ... Then we could either > > > 1) Include varatt.h in postgres.h, similar to elog.h and palloc.h. That > > way we clean up the files a bit but don't change any external interfaces. > > > 2) Just let everyone who needs it include the new file. > > > 3) Compromise: You can avoid most "damage" by having fmgr.h include > > varatt.h. That satisfies most data types and extension code. That way, > > there are only a few places that need an explicit include of varatt.h. > > > I went with the last option in my patch. > > I dunno, #3 seems kind of unprincipled. Also, since fmgr.h is included > so widely, I doubt it is buying very much in terms of reducing header > footprint. How bad is it to do #2? > > regards, tom lane > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
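To make the practical impact of option #2 concrete, here is a minimal hypothetical sketch (assuming the varlena access macros end up in the new varatt.h, as discussed) of what extension code touching varlenas would look like; the only change is one extra include:

#include "postgres.h"
#include "fmgr.h"
#include "varatt.h"             /* now needed explicitly for VARSIZE_ANY_EXHDR() etc. */

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(my_text_length);

/* Hypothetical example function: returns the byte length of a text value. */
Datum
my_text_length(PG_FUNCTION_ARGS)
{
    text       *t = PG_GETARG_TEXT_PP(0);

    PG_RETURN_INT32(VARSIZE_ANY_EXHDR(t));
}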
Re: Passing relation metadata to Exec routine
Hi Tom! Thank you for your feedback. I agree that for columns computed with joins, grouping, etc., considering properties of the base table does not make sense at all. But for CREATE TABLE LIKE and simple columns inherited from existing relations it does, if we consider some advanced properties: from the user's perspective we want the new table's columns to behave exactly like the base ones (in some ways, like encryption, storage, and compression methods). LIKE options are a good idea, thank you, but maybe we should take this into account for CREATE TABLE AS too? I agree that passing these parameters in a backdoor fashion is not very transparent or user-friendly, either. On Tue, Dec 27, 2022 at 12:56 AM Tom Lane wrote: > Nikita Malakhov writes: > > While working on Pluggable TOAST [1] we found out that creation > > of new relation with CREATE TABLE AS... or CREATE TABLE LIKE - > > method > > static ObjectAddress create_ctas_internal(List *attrList, IntoClause > *into) > > does not receive any metadata from columns or tables used in query > > (if any). It makes sense to pass not only column type and size, but > > all other metadata - like attoptions, base relation OID (and, maybe, > > reloptions), if the column from existing relation was used. > > I am very, very skeptical of the premise here. > > CREATE TABLE AS creates a table based on the output of a query. > That query could involve arbitrarily complex processing -- joins, > grouping, what-have-you. I don't see how it makes sense to > consider the properties of the base table(s) in deciding how to > create the output table. I certainly do not think it'd be sane for > that to behave differently depending on how complicated the query is. > > As for CREATE TABLE LIKE, the point of that is to create a table > by copying a *specified* set of properties of a reference table. > I don't think letting an access method copy some other properties > behind the user's back is a great idea. If you've got things that > it'd make sense to be able to inherit, let's discuss adding more > LIKE options to support that --- in which case the implementation > would presumably pass the data through in a non-back-door fashion. > > regards, tom lane > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Passing relation metadata to Exec routine
Hi hackers! While working on Pluggable TOAST [1] we found out that when a new relation is created with CREATE TABLE AS... or CREATE TABLE LIKE, the method static ObjectAddress create_ctas_internal(List *attrList, IntoClause *into) does not receive any metadata from the columns or tables used in the query (if any). It makes sense to pass not only column type and size, but all other metadata - like attoptions, base relation OID (and, maybe, reloptions), if a column from an existing relation was used. A good example is the creation of a new relation from a base one where some other Toaster was assigned to a column - it seems reasonable that the same column in the new table should have the same Toaster assigned as the base one. And we already have a couple of other practical uses for the metadata passed along with column definitions. Any thoughts or suggestions? [1] https://commitfest.postgresql.org/41/3490/ -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: ARRNELEMS Out-of-bounds possible errors
Hi, My bad, I was misled by the unconditional return in ereturn_domain. Sorry for the noise. On Sat, Dec 24, 2022 at 7:05 PM Tom Lane wrote: > Nikita Malakhov writes: > > Even with null context it does not turn to ereport, and returns dummy > value > > Read the code. ArrayGetNItems passes NULL for escontext, therefore > if there's a problem the ereturn calls in ArrayGetNItemsSafe will > throw error, *not* return -1. > > Not sure how we could persuade Coverity of that, though, > if it fails to understand that for itself. > > regards, tom lane > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
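A minimal caller-side sketch may make the point concrete (hypothetical helper, assuming the soft-error API added by commit ccff2d2): with a non-NULL ErrorSaveContext the error is stashed and the dummy value is returned, but with a NULL escontext the same ereturn() falls back to ordinary error reporting and throws instead of returning.

#include "postgres.h"
#include "nodes/miscnodes.h"

/* Hypothetical helper using the soft-error API. */
static int
parse_positive_int_safe(const char *str, Node *escontext)
{
    int         val = atoi(str);

    if (val <= 0)
        ereturn(escontext, -1,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("value \"%s\" must be a positive integer", str)));

    return val;                 /* reached only when the value is valid */
}

A caller that passes NULL gets a regular ERROR on bad input, which is exactly why the -1 path flagged by Coverity cannot be taken through ArrayGetNItems().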
Re: ARRNELEMS Out-of-bounds possible errors
Hi, Even with a null context it does not turn into ereport, and returns the dummy value -

#define errsave_domain(context, domain, ...) \
	do { \
		struct Node *context_ = (context); \
		pg_prevent_errno_in_scope(); \
		if (errsave_start(context_, domain)) \
			__VA_ARGS__, errsave_finish(context_, __FILE__, __LINE__, __func__); \
	} while(0)

#define errsave(context, ...) \
	errsave_domain(context, TEXTDOMAIN, __VA_ARGS__)

/*
 * "ereturn(context, dummy_value, ...);" is exactly the same as
 * "errsave(context, ...); return dummy_value;".  This saves a bit
 * of typing in the common case where a function has no cleanup
 * actions to take after reporting a soft error.  "dummy_value"
 * can be empty if the function returns void.
 */
#define ereturn_domain(context, dummy_value, domain, ...) \
	do { \
		errsave_domain(context, domain, __VA_ARGS__); \
		return dummy_value; \
	} while(0)

#define ereturn(context, dummy_value, ...) \
	ereturn_domain(context, dummy_value, TEXTDOMAIN, __VA_ARGS__)

On Fri, Dec 23, 2022 at 11:40 AM Kyotaro Horiguchi wrote: > At Fri, 23 Dec 2022 17:37:55 +0900 (JST), Kyotaro Horiguchi < > horikyota@gmail.com> wrote in > > At Thu, 22 Dec 2022 12:35:58 -0300, Ranier Vilela > wrote in > > > Hi. > > > > > > Per Coverity. > > > > > > The commit ccff2d2 > > > < > https://github.com/postgres/postgres/commit/ccff2d20ed9622815df2a7deffce8a7b14830965 > >, > > > changed the behavior function ArrayGetNItems, > > > with the introduction of the function ArrayGetNItemsSafe. > > > > > > Now ArrayGetNItems may return -1, according to the comment. > > > " instead of throwing an exception. -1 is returned after an error." > > > > If I'm reading the code correctly, it's the definition of > > ArrayGetNItems*Safe*. ArrayGetNItems() calls that function with a NULL > > escontext and the NULL turns ereturn() into ereport(). > > > That doesn't seem to be changed by the commit. > > No.. It seems to me that the commit didn't change its behavior in that > regard. > > > Of course teaching Coverity not to issue the false warnings would be > > another actual issue that we should do, maybe. > > -- > Kyotaro Horiguchi > NTT Open Source Software Center > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: ARRNELEMS Out-of-bounds possible errors
Hi, The most obvious solution I see is to check all calls and, for cases like the ones we both mentioned, pass a flag saying whether safe or unsafe behavior is expected, like:

#define ARRNELEMS(x)  ArrayGetNItems(ARR_NDIM(x), ARR_DIMS(x), false)
...
int
ArrayGetNItems(int ndim, const int *dims, bool issafe)
{
	return ArrayGetNItemsSafe(ndim, dims, NULL, issafe);
}

int
ArrayGetNItemsSafe(int ndim, const int *dims, struct Node *escontext, bool issafe)
{
...

Another solution is a revision of the wrapping code for all calls of ArrayGetNItems. Safe functions are a good idea overall, but a lot of code needs to be revised. On Fri, Dec 23, 2022 at 1:20 AM Ranier Vilela wrote: > On Thu, Dec 22, 2022 at 3:45 PM Nikita Malakhov > wrote: > >> Hi, >> >> Actually, many more sources would be affected, like >> nbytes += subbytes[outer_nelems]; >> subnitems[outer_nelems] = ArrayGetNItems(this_ndims, >> ARR_DIMS(array)); >> nitems += subnitems[outer_nelems]; >> havenulls |= ARR_HASNULL(array); >> outer_nelems++; >> } >> >> Maybe it is better for most calls like this to keep old behavior, by >> passing a flag >> that says which behavior is expected by caller? >> > I agreed that it is better to keep old behavior. > Even the value 0 is problematic, with calls like this: > > nel = ARRNELEMS(ent); > memcpy(ptr, ARRPTR(ent), nel * sizeof(int32)); > > regards, > Ranier Vilela > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: ARRNELEMS Out-of-bounds possible errors
Hi, Actually, many more sources would be affected, like:

nbytes += subbytes[outer_nelems];
subnitems[outer_nelems] = ArrayGetNItems(this_ndims, ARR_DIMS(array));
nitems += subnitems[outer_nelems];
havenulls |= ARR_HASNULL(array);
outer_nelems++;
}

Maybe it is better for most calls like this to keep the old behavior, by passing a flag that says which behavior is expected by the caller? On Thu, Dec 22, 2022 at 6:36 PM Ranier Vilela wrote: > Hi. > > Per Coverity. > > The commit ccff2d2 > <https://github.com/postgres/postgres/commit/ccff2d20ed9622815df2a7deffce8a7b14830965>, > changed the behavior of the function ArrayGetNItems, > with the introduction of the function ArrayGetNItemsSafe. > > Now ArrayGetNItems may return -1, according to the comment. > " instead of throwing an exception. -1 is returned after an error." > > So the macro ARRNELEMS can fail entirely with -1 return, > resulting in code using the result without checking the function return. > > Like (contrib/intarray/_int_gist.c): > { > int nel; > > nel = ARRNELEMS(ent); > memcpy(ptr, ARRPTR(ent), nel * sizeof(int32)); > } > > Sources possibly affected: > contrib\cube\cube.c > contrib\intarray\_intbig_gist.c > contrib\intarray\_int_bool.c > contrib\intarray\_int_gin.c > contrib\intarray\_int_gist.c > contrib\intarray\_int_op.c > contrib\intarray\_int_tool.c: > > Thoughts? > > regards, > Ranier Vilela > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: [PATCH] Infinite loop while acquiring new TOAST Oid
Hi hackers! Any suggestions on the previous message (64-bit toast value ID with individual counters)? On Tue, Dec 13, 2022 at 1:41 PM Nikita Malakhov wrote: > Hi hackers! > > I've prepared a patch with a 64-bit TOAST Value ID. It is a kind of > prototype > and needs some further work, but it is already working and ready to play > with. > > I've introduced 64-bit value ID field in varatt_external, but to keep it > compatible > with older bases 64-bit value is stored as a structure of two 32-bit > fields and value > ID moved to be the last in the varatt_external structure. Also, I've > introduced > different ID assignment function - instead of using global OID assignment > function > I use individual counters for each TOAST table, automatically cached and > after > server restart when new value is inserted into TOAST table maximum used > value > is searched and used to assign the next one. > > >Andres Freund: > >I think we'll need to do something about the width of varatt_external to > make > >the conversion to 64bit toast oids viable - and if we do, I don't think > >there's a decent argument for staying with 4 byte toast OIDs. I think the > >varatt_external equivalent would end up being smaller in just about all > cases. > >And as you said earlier, the increased overhead inside the toast table / > index > >is not relevant compared to the size of toasted datums. > > There is some small overhead due to indexing 64-bit values. Also, for > smaller > tables we can use 32-bit ID instead of 64-bit, but then we would have a > problem > when we reach the limit of 2^32. > > Pg_upgrade seems to be not a very difficult case if we go keeping old TOAST > tables using 32-bit values, > > Please have a look. I'd be grateful for some further directions. > > GIT branch with this feature, rebased onto current master: > https://github.com/postgrespro/postgres/tree/toast_64bit > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
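For readers following along, a hypothetical sketch of the TOAST-pointer layout described above (field names and order are guesses based on the description — two 32-bit halves, with the value ID moved to the end — and not the actual patch):

typedef struct varatt_external_64       /* hypothetical name */
{
	int32		va_rawsize;		/* original data size (includes header) */
	uint32		va_extinfo;		/* external saved size and compression method */
	Oid			va_toastrelid;	/* RelID of the TOAST table containing the value */
	uint32		va_valueid_lo;	/* low 32 bits of the 64-bit value ID */
	uint32		va_valueid_hi;	/* high 32 bits of the 64-bit value ID */
} varatt_external_64;

Storing the ID as two 32-bit halves presumably keeps the struct free of any 8-byte alignment requirement (TOAST pointers are copied into unaligned varlena data), in line with the compatibility goal stated above.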
Re: Array initialisation notation in syscache.c
Hi! Wanted to ask this since I encountered a need for a cache with 5 keys - why is the syscache index still limited to 4 keys? Thanks! On Wed, Dec 21, 2022 at 7:36 PM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > On 21.12.22 04:16, Thomas Munro wrote: > > On Wed, Dec 21, 2022 at 1:33 PM Thomas Munro > wrote: > >> KEY(Anum_pg_attribute_attrelid, > >> Anum_pg_attribute_attnum), > > > > I independently rediscovered that our VA_ARGS_NARGS() macro in c.h > > always returns 1 on MSVC via trial-by-CI. Derp. Here is the same > > patch, no change from v2, but this time accompanied by Victor Spirin's > > fix, which I found via one of the tab-completion-is-busted-on-Windows > > discussions. I can't supply a useful commit message, because I > > haven't understood why it works, but it does indeed seem to work and > > this should make cfbot green. > > This looks like a good improvement to me. > > (I have also thought about having this generated from the catalog > definition files somehow, but one step at a time ...) > > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
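For context, the VA_ARGS_NARGS() problem quoted above refers to the classic argument-counting trick with variadic macros. A simplified generic illustration (capped at four arguments; these are not the exact c.h/syscache.c definitions) is shown below; MSVC's traditional preprocessor reportedly forwards __VA_ARGS__ as a single token, which is why such a count can always come out as 1 there:

/* Pick the argument that lands in the N slot to count how many were passed. */
#define MY_NARGS_(_1, _2, _3, _4, N, ...) N
#define MY_NARGS(...) MY_NARGS_(__VA_ARGS__, 4, 3, 2, 1)

/* MY_NARGS(a, b) expands to 2; MY_NARGS(a, b, c, d) expands to 4. */

/* A KEY()-style wrapper then yields both the key count and the key list: */
#define MY_KEY(...)  MY_NARGS(__VA_ARGS__), { __VA_ARGS__ }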
Re: meson and tmp_install
Hi! That's great, thanks! The discussion thread is very long, so I missed this topic. Just a suggestion - I've checked the link above; maybe a small section should be added on where the build files are located and how to add new sources so the build succeeds? On Tue, Dec 20, 2022 at 9:22 PM Andres Freund wrote: > Hi, > > On 2022-12-20 21:11:26 +0300, Nikita Malakhov wrote: > > Didn't know where to ask, so I've chosen this thread - there is no > > documentation on the meson build platform in PostgreSQL docs. > > There is now: > https://www.postgresql.org/docs/devel/install-meson.html > > Needs further work, but it's a start. > > > > Is this okay? For me it was a surprise when the meson platform was > > added > > It's been discussed on the list for a year or so before it was > added. It's a large change, so unfortunately it's not something that I > could get done in a single day, with perfect docs from the get go. > > Greetings, > > Andres Freund > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: meson and tmp_install
Hi! Didn't know where to ask, so I've chosen this thread - there is no documentation on the meson build platform in PostgreSQL docs. Is this okay? For me it was a surprise when the meson platform was added, and I had to spend some time sweeping through the meson docs when I added new source files, to build Postgres successfully. Thanks! On Tue, Dec 20, 2022 at 8:30 PM Andres Freund wrote: > Hi, > > On 2022-12-20 07:12:04 -0500, Andrew Dunstan wrote: > > Yesterday when testing a patch I got annoyed when my test failed. I > > tested it like this: > > > > meson test ldap_password_func/001_mutated_bindpasswd > > > > It turned out that I needed to do this: > > > > meson test tmp_install ldap_password_func/001_mutated_bindpasswd > > > > The Makefile equivalent ensures that you have a tmp_install without > > having to request it explicitly. It would be nice if we could make meson > > do the same thing, and also honor NO_TEMP_INSTALL if set. > > I would like that too, but there's no easy fix that doesn't have > downsides as far as I am aware. We could make the temp install a build > target that the tests depend on, but for historical reasons in meson > that means that the 'all' target depends on temp-install. Which isn't > great. > > My current thinking is that we should get away from needing the > temporary install and instead allow to run the tests against the build > directory itself. The temp-install adds a fair bit of overhead and > failure potential. The only reason we need it is that a) initdb and a > few other programs insist that postgres needs to be in the same > directory b) contrib modules currently need to reside in one single > directory. > > Greetings, > > Andres Freund > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: New strategies for freezing, advancing relfrozenxid early
Hi! I'll try to apply this patch onto my branch with Pluggable TOAST to test these mechanics with the new TOAST, and will reply with the result. It could be difficult though, because both have a lot of changes that affect the same code. >I'm not sure how much this would help with bloat. I suspect that it >could make a big difference with the right workload. If you always >need frequent autovacuums, just to deal with bloat, then there is >never a good time to run an aggressive antiwraparound autovacuum. An >aggressive AV will probably end up taking much longer than the typical >autovacuum that deals with bloat. While the aggressive AV will remove >as much bloat as any other AV, in theory, that might not help much. If >the aggressive AV takes as long as (say) 5 regular autovacuums would >have taken, and if you really needed those 5 separate autovacuums to >run, just to deal with the bloat, then that's a real problem. The >aggressive AV effectively causes bloat with such a workload. On Tue, Dec 20, 2022 at 12:01 PM Jeff Davis wrote: > On Sun, 2022-12-18 at 14:20 -0800, Peter Geoghegan wrote: > > Attached is v10, which fixes this issue, but using a different > > approach to the one I sketched here. > > In 0001, it's fairly straightforward rearrangement and looks like an > improvement to me. I have a few complaints, but they are about pre- > existing code that you moved around, and I like that you didn't > editorialize too much while just moving code around. +1 from me. > > > -- > Jeff Davis > PostgreSQL Contributor Team - AWS > > > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Patch: Global Unique Index
Hi! Sorry to bother - but is this patch used in IvorySQL? Here: https://www.ivorysql.org/docs/Global%20Unique%20Index/create_global_unique_index According to the syntax it definitely looks like this patch. Thank you! On Sat, Dec 3, 2022 at 3:05 AM David Zhang wrote: > On 2022-11-29 6:16 p.m., Tom Lane wrote: > > Assuming that you are inserting into index X, and you've checked > > index Y to find that it has no conflicts, what prevents another > > backend from inserting a conflict into index Y just after you look? > > AIUI the idea is to prevent that by continuing to hold an exclusive > > lock on the whole index Y until you've completed the insertion. > > Perhaps there's a better way to do that, but it's not what was > > described. > Another main change in patch > `0004-support-global-unique-index-insert-and-update.patch`,
> +search_global:
> +    stack = _bt_search(iRel, insertstate.itup_key,
> +                       &insertstate.buf, BT_READ, NULL);
> +    xwait = _bt_check_unique_gi(iRel, &insertstate,
> +                                hRel, checkUnique, &is_unique,
> +                                &speculativeToken, heapRel);
> +    if (unlikely(TransactionIdIsValid(xwait)))
> +    {
> ... ...
> +        goto search_global;
> +    }
> Here, I am trying to use `BT_READ` to require a LW_SHARED lock on the > buffer block if a match found using `itup_key` search key. The > cross-partition uniqueness checking will wait if the index tuple > insertion on this buffer block has not done yet, otherwise runs the > uniqueness check to see if there is an ongoing transaction which may > insert a conflict value. Once the ongoing insertion is done, it will go > back and check again (I think it can also handle the case that a > potential conflict index tuple was later marked as dead in the same > transaction). Based on this change, my test results are: > > 1) a select-only query will not be blocked by the ongoing insertion on > index X > > 2) insertion happening on index Y may wait for the buffer block lock > when inserting a different value but it does not wait for the > transaction lock held by insertion on index X. > > 3) when an insertion inserting a conflict value on index Y, > 3.1) it waits for buffer block lock if the lock has been held by > the insertion on index X. > 3.2) then, it waits for transaction lock until the insertion on > index X is done. > > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: On login trigger: take three
Hi, Mikhail, I've checked the patch and the previous discussion, and the issue mentioned earlier still applies:

>The example trigger from the documentation

+DECLARE
+  hour integer = EXTRACT('hour' FROM current_time);
+  rec boolean;
+BEGIN
+-- 1. Forbid logging in between 2AM and 4AM.
+IF hour BETWEEN 2 AND 4 THEN
+  RAISE EXCEPTION 'Login forbidden';
+END IF;

>can be bypassed with PGOPTIONS='-c timezone=...'. Probably this is
>nothing new and concerns any SECURITY DEFINER function, but still...

along with

+IF hour BETWEEN 8 AND 20 THEN

It seems to be a minor security issue, so I am mentioning it just in case you haven't noticed it. On Fri, Dec 16, 2022 at 9:14 PM Mikhail Gribkov wrote: > Hi hackers, > Attached v33 is a new rebase of the flagless version of the patch. As > there were no objections at first glance, I’ll try to post it to the > upcoming commitfest, thus the brief recap of all the patch details is below. > > v33-On_client_login_event_trigger > The patch introduces a trigger on login event, allowing to fire some > actions right on the user connection. This can be useful for logging or > connection check purposes as well as for some personalization of the > environment. Usage details are described in the documentation included, but > shortly usage is the same as for other triggers: create function returning > event_trigger and then create event trigger on login event. > > The patch is prepared to be applied to the master branch and tested when > applied after e52f8b301ed54aac5162b185b43f5f1e44b6b17e commit by Thomas > Munro (Date: Fri Dec 16 17:36:22 2022 +1300). > Regression tests and documentation included. > > A couple of words about what and why I changed compared to the previous > author's version. > First, the (en/dis)abling GUC was removed from the patch because > ideologically it is a separate feature and nowadays it’s discussed and > supported in a separate thread by Daniel Gustaffson: > > https://www.postgresql.org/message-id/flat/9140106E-F9BF-4D85-8FC8-F2D3C094A6D9%40yesql.se > Second, I have removed the dathasloginevt flag. The flag was initially > added to the patch for performance reasons and it did the job, although it > was quite a clumsy construct causing tons of bugs and could potentially > lead to tons more during further PostgreSQL evolution. I have removed the > flag and found that the performance drop is not that significant. > > And this is an aspect I should describe in more detail. > The possible performance drop is expected as an increased time used to > login a user NOT using a login trigger. > First of all, the method of performance check: > echo 'select 1;' > ./tst.sql > pgbench -n -C -T3 -f tst.sql -U postgres postgres > The output value "average connection time" is the one I use to compare > performance. > Now, what are the results. > * master branch: 0.641 ms > * patched version: 0.644 ms > No significant difference here and it is just what was expected. Based on > the patch design the performance drop can be expected when there are lots > of event triggers created, but not the login one. Thus I have created 1000 > drop triggers (the script for creating triggers is attached too) in the > database and repeated the test: > * master branch: 0.646 ms > * patched version: 0.754 ms > For 2000 triggers the patched version connection time is further increased > to 0.862. Thus we have a login time rise in about 16.5% per 1000 event > triggers in the database. It is a statistically noticeable value, still I > don’t think it’s a critical one we should be afraid of. > N.B.
> The exact values of the login times slightly differ from the ones I > posted in the previous email. Well, that’s the repeatability level we have. > This convinces me even more that the observed level of performance drop is > acceptable. > -- > best regards, > Mikhail A. Gribkov > > e-mail: youzh...@gmail.com > http://www.flickr.com/photos/youzhick/albums > http://www.strava.com/athletes/5085772 > phone: +7(916)604-71-12 > Telegram: @youzhick > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/