Re: XLog size reductions: Reduced XLog record header size for PG17
> On Jun 20, 2023, at 1:01 PM, Matthias van de Meent wrote:
>
> 0001 is copied essentially verbatim from [1] and reduces overhead in the registered block's length field where possible. It is included to improve code commonality between varcoded integer fields. See [1] for more details.

Hi Matthias! I am interested in seeing this patch move forward. We seem to be stuck. The disagreement on the other thread seems to be about whether we can generalize and reuse variable integer encoding. Could you comment on whether perhaps we just need a few versions of that? Perhaps one version where the number of length bytes is encoded in the length itself (such as is used for varlena and by Andres' patch) and one where the number of length bytes is stored elsewhere?

You are clearly using the "elsewhere" form, but perhaps you could pull the logic of that out into src/common? In struct XLogRecordBlockHeader.id, you are reserving two bits for the size class. (The code comments aren't clear about this, by the way.) Perhaps the generalized length encoding logic could take a couple of arguments to represent where and how the size class bits are to be stored, and where the length itself is stored? I doubt you need to sacrifice any performance gains of this patch to make that happen. You'd just need to restructure the patch.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
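[For readers following along, the two encodings under discussion could be sketched roughly as follows. This is an illustrative sketch only, not code from either patch; all function names are invented. It shows the "elsewhere" form, where the length occupies 1..4 bytes and a 2-bit size class is handed back for the caller to stash in spare bits of some other field, such as the two reserved bits of XLogRecordBlockHeader.id.]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch of the "elsewhere" encoding: write the length in
 * the minimum number of bytes (little-endian here for simplicity), and
 * return a size class of nbytes - 1, which fits in two spare bits.
 */
static int
encode_len(uint32_t len, uint8_t *out, uint8_t *size_class)
{
	int			nbytes;

	if (len <= 0xFF)
		nbytes = 1;
	else if (len <= 0xFFFF)
		nbytes = 2;
	else if (len <= 0xFFFFFF)
		nbytes = 3;
	else
		nbytes = 4;

	for (int i = 0; i < nbytes; i++)
		out[i] = (uint8_t) (len >> (8 * i));
	*size_class = (uint8_t) (nbytes - 1);	/* fits in two spare bits */
	return nbytes;
}

/* Decoding needs the externally stored size class to know nbytes. */
static uint32_t
decode_len(const uint8_t *in, uint8_t size_class)
{
	uint32_t	len = 0;

	for (int i = 0; i <= size_class; i++)
		len |= ((uint32_t) in[i]) << (8 * i);
	return len;
}
```

A generalized src/common version would presumably take additional arguments describing where the size-class bits live, so the same routine could serve both this header layout and the varlena-style form where the byte count is folded into the value itself.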
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
> On May 17, 2024, at 1:00 PM, Peter Geoghegan wrote:
>
> Many different parts of the B-Tree code will fight against allowing duplicates of the same value to span multiple leaf pages -- this is especially true for unique indexes.

The quick-and-dirty TAP test I wrote this morning is intended to introduce duplicates across page boundaries, not to test for ones that got there by normal database activity. In other words, the TAP test forcibly corrupts the index by changing a value on one side of a boundary to be equal to the value on the other side of the boundary. Prior to the corrupting action the values were all unique.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
> On May 17, 2024, at 12:42 PM, Pavel Borisov wrote:
>
> At the first glance it's not clear to me:
> - why your test creates cross-page unique constraint violations?

To see if they are detected.

> - how do you know they are not detected?

It appears that they are detected. At least, rerunning the test after adjusting the expected output, I no longer see problems.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
> On May 17, 2024, at 12:10 PM, Mark Dilger wrote:
>
>> Amcheck with checkunique option does check uniqueness violation between pages. But it doesn't warranty detection of cross page uniqueness violations in extremely rare cases when the first equal index entry on the next page corresponds to tuple that is not visible (e.g. dead). In this, I followed the Peter's notion [1] that checking across a number of dead equal entries that could theoretically span even across many pages is an unneeded code complication and amcheck is not a tool that provides any warranty when checking an index.
>
> This confuses me a bit. The regression test creates a table and index but never performs any DELETE nor any UPDATE operations, so none of the index entries should be dead. If I am understanding you correctly, I'd be forced to conclude that the uniqueness checking code is broken. Can you take a look?

On further review, the test was not anticipating the error message "high key invariant violated for index". That wasn't seen in calls to bt_index_parent_check(), but appears as one of the errors from bt_index_check(). I am rerunning the test now.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
> On May 17, 2024, at 11:51 AM, Pavel Borisov wrote:
>
> Amcheck with checkunique option does check uniqueness violation between pages. But it doesn't warranty detection of cross page uniqueness violations in extremely rare cases when the first equal index entry on the next page corresponds to tuple that is not visible (e.g. dead). In this, I followed the Peter's notion [1] that checking across a number of dead equal entries that could theoretically span even across many pages is an unneeded code complication and amcheck is not a tool that provides any warranty when checking an index.

This confuses me a bit. The regression test creates a table and index but never performs any DELETE nor any UPDATE operations, so none of the index entries should be dead. If I am understanding you correctly, I'd be forced to conclude that the uniqueness checking code is broken. Can you take a look?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
> On May 17, 2024, at 3:11 AM, Alexander Korotkov wrote:
>
> On Mon, May 13, 2024 at 4:42 AM Alexander Korotkov wrote:
>> On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov wrote:
>>> On Sat, May 11, 2024 at 4:13 AM Mark Dilger wrote:
>>>>> On May 10, 2024, at 12:05 PM, Alexander Korotkov wrote:
>>>>> The only bt_target_page_check() caller is bt_check_level_from_leftmost(), which overrides state->target in the next iteration anyway. I think the patch is just refactoring to eliminate the confusion pointer by Peter Geoghegan upthread.
>>>>
>>>> I find your argument unconvincing.
>>>>
>>>> After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target in the next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by state->target. See line 963.
>>>>
>>>> I'm left with four possibilities:
>>>>
>>>> 1) bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iteration where bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're making doesn't matter.
>>>>
>>>> 2) The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thing to happen at what is now line 963. The patch fixes the bug, because state->target no longer gets overwritten where you are now using "rightpage" for the value.
>>>>
>>>> 3) The code used to work, having set up state->target correctly in the place where you are now using "rightpage", but v2-0003 has broken that.
>>>>
>>>> 4) It's been broken all along and your patch just changes from wrong to wrong.
>>>>
>>>> If you believe (1) is true, then I'm complaining that you are relying far too much on action at a distance, and that you are not documenting it. Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way or the other. I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so adding Asserts doesn't help to investigate the question.
>>>>
>>>> If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix. Bug fixes should be clearly documented as such, otherwise future work might assume the commit can be reverted with only stylistic consequences.
>>>>
>>>> If (3) is true, then I'm complaining that the patch is flat busted.
>>>>
>>>> If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are needed.
>>>>
>>>> Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.
>>>>
>>>> For reference, I said something similar earlier today in another email to this thread:
>>>>
>>>> This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target", which the old implementation most certainly did. That means that after returning from bt_target_page_check() into the calling function bt_check_level_from_leftmost() the value in state->target is not what it would have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 lines further down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changing between the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch, wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ?
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
> On May 10, 2024, at 12:05 PM, Alexander Korotkov wrote:
>
> The only bt_target_page_check() caller is bt_check_level_from_leftmost(), which overrides state->target in the next iteration anyway. I think the patch is just refactoring to eliminate the confusion pointer by Peter Geoghegan upthread.

I find your argument unconvincing.

After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target in the next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by state->target. See line 963.

I'm left with four possibilities:

1) bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same iteration where bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're making doesn't matter.

2) The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong thing to happen at what is now line 963. The patch fixes the bug, because state->target no longer gets overwritten where you are now using "rightpage" for the value.

3) The code used to work, having set up state->target correctly in the place where you are now using "rightpage", but v2-0003 has broken that.

4) It's been broken all along and your patch just changes from wrong to wrong.

If you believe (1) is true, then I'm complaining that you are relying far too much on action at a distance, and that you are not documenting it. Even with documentation of this interrelationship, I'd be unhappy with how brittle the code is. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way or the other. I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so adding Asserts doesn't help to investigate the question.

If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix. Bug fixes should be clearly documented as such; otherwise future work might assume the commit can be reverted with only stylistic consequences.

If (3) is true, then I'm complaining that the patch is flat busted.

If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are needed.

Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.

For reference, I said something similar earlier today in another email to this thread:

This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target", which the old implementation most certainly did. That means that after returning from bt_target_page_check() into the calling function bt_check_level_from_leftmost() the value in state->target is not what it would have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 lines further down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changing between the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch, wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
> On May 10, 2024, at 11:42 AM, Pavel Borisov wrote:
>
> IMO 0003 doesn't introduce nor fixes a bug. It loads rightpage into a local variable, rather that to a BtreeCheckState that can have another users of state->target afterb uniqueness check in the future, but don't have now. So the original patch is correct, and the goal of this refactoring is to untie rightpage fron state structure as it's used only transiently for cross-page unuque check. It's the same style as already used bt_right_page_check_scankey() that loads rightpage into a local variable.

Well, you can put an Assert(false) dead in the middle of the code we're discussing and all the regression tests still pass, so I'd argue the change is getting zero test coverage.

This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting "state->target", which the old implementation most certainly did. That means that after returning from bt_target_page_check() into the calling function bt_check_level_from_leftmost() the value in state->target is not what it would have been prior to this patch. Now, that'd be irrelevant if nobody goes on to consult that value, but just 44 lines further down in bt_check_level_from_leftmost() state->target is clearly used. So the behavior at that point is changing between the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the patch, wrong after the patch, or something else? Is this a bug being introduced, being fixed, or ... ?

Having a regression test that actually touches this code would go a fair way towards helping the analysis.

> For 0002 I doubt I understand your actual positiob. Could you explain what it violates or doesn't violate?

v2-0002 does not violate the post feature freeze restriction on new features so far as I can tell, but I just don't care for the variable initialization because it doesn't name the fields. If anybody refactored the struct they might not notice the need to reorder this initialization, and depending on various factors including compiler flags they might not get an error.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
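[The fragility described above is the classic argument for C99 designated initializers. A small standalone illustration follows; DemoEntry is an invented stand-in, not the actual BtreeLastVisibleEntry definition.]

```c
#include <assert.h>
#include <stddef.h>

/* Invented stand-in for the struct under discussion. */
typedef struct DemoEntry
{
	int			blkno;
	int			offset;
	int			postingIndex;
	void	   *itemptr;
} DemoEntry;

/*
 * Positional initialization: if anyone reorders the struct's fields,
 * these values silently land in the wrong members (or, at best, the
 * compiler emits a warning only under certain flags).
 */
DemoEntry	lVis_positional = {0, 0, -1, NULL};

/*
 * Designated initializers name each field, so reordering the struct
 * definition cannot silently change which value goes where.
 */
DemoEntry	lVis_designated = {
	.blkno = 0,
	.offset = 0,
	.postingIndex = -1,
	.itemptr = NULL,
};
```

Both forms produce the same object today; the designated form is simply immune to field reordering.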
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
> On May 10, 2024, at 5:10 AM, Pavel Borisov wrote:
>
> Hi, Alexander!
>
> On Fri, 10 May 2024 at 12:39, Alexander Korotkov wrote:
>> On Fri, May 10, 2024 at 3:43 AM Tom Lane wrote:
>>> Alexander Korotkov writes:
>>>> The revised patchset is attached. I applied cosmetical changes. I'm going to push it if no objections.
>>>
>>> Is this really suitable material to be pushing post-feature-freeze? It doesn't look like it's fixing any new-in-v17 issues.
>>
>> These are code improvements to the 5ae2087202, which answer critics in the thread. 0001 comprises an optimization, but it's rather small and simple. 0002 and 0003 contain refactoring. 0004 contains better error reporting. For me this looks like pretty similar to what others commit post-FF, isn't it?
>
> I've re-checked patches v2. Differences from v1 are in improving naming/pgindent's/commit messages. In 0002 order of variables in struct BtreeLastVisibleEntry changed. This doesn't change code behavior.
>
> Patch v2-0003 doesn't contain credits and a discussion link. All other patches do.
>
> Overall, patches contain small performance optimization (0001), code refactoring and error reporting changes. IMO they could be pushed post-FF.

v2-0001's commit message itself says, "This commit implements skipping keys". I take no position on the correctness or value of the improvement, but it seems out of scope post feature freeze. The patch seems to postpone uniqueness checking until later in the scan than what the prior version did, and that kind of change could require more analysis than we have time for at this point in the release cycle.

v2-0002 does appear to just be refactoring. I don't care for a small portion of that patch, but I doubt it violates the post feature freeze rules. In particular:

+ BtreeLastVisibleEntry lVis = {InvalidBlockNumber, InvalidOffsetNumber, -1, NULL};

v2-0003 may be an improvement in some way, but it compounds some preexisting confusion also. There is already a member of the BtreeCheckState called "target" and a memory context in that struct called "targetcontext". That context is used to allocate pages "state->target", "rightpage", "child" and "page", but not "metapage". Perhaps "targetcontext" is a poor choice of name? "notmetacontext" is a terrible name, but closer to describing the purpose of the memory context. Care to propose something sensible?

Prior to applying v2-0003, the rightpage was stored in state->target, and continued to be in state->target later when entering

    /*
     * * Downlink check *
     *
     * Additional check of child items iff this is an internal page and
     * caller holds a ShareLock.  This happens for every downlink (item)
     * in target excluding the negative-infinity downlink (again, this is
     * because it has no useful value to compare).
     */
    if (!P_ISLEAF(topaque) && state->readonly)
        bt_child_check(state, skey, offset);

and thereafter. Now, the rightpage of state->target is created, checked, and free'd, and then the old state->target gets processed in the downlink check and thereafter. This is either introducing a bug or fixing one, but the commit message is totally ambiguous about which. I think this kind of patch should have a super clear commit message about what it thinks it is doing.

v2-0004 guards against a real threat, and is reasonable post feature freeze.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: BitmapHeapScan streaming read user and prelim refactoring
> On Feb 14, 2024, at 6:47 AM, Melanie Plageman wrote:
>
> Just curious, did your table AM implement table_scan_bitmap_next_block() and table_scan_bitmap_next_tuple(), and, if so, did you use the TBMIterateResult? Since it is not used in BitmapHeapNext() in my version, table AMs would have to change how they use TBMIterateResults anyway. But I assume they could add it to a table AM specific scan descriptor if they want access to a TBMIterateResult of their own making in both table_scan_bitmap_next_block() and next_tuple()?

My table AM does implement those two functions and does use the TBMIterateResult *tbmres argument, yes. I would deal with the issue in very much the same way that your patches modify heapam. I don't really have any additional comments about that.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: BitmapHeapScan streaming read user and prelim refactoring
> On Feb 13, 2024, at 3:11 PM, Melanie Plageman wrote:

Thanks for the patch...

> Attached is a patch set which refactors BitmapHeapScan such that it can use the streaming read API [1]. It also resolves the long-standing FIXME in the BitmapHeapScan code suggesting that the skip fetch optimization should be pushed into the table AMs. Additionally, it moves table scan initialization to after the index scan and bitmap initialization.
>
> patches 0001-0002 are assorted cleanup needed later in the set.
> patch 0003 moves the table scan initialization to after bitmap creation
> patch 0004 is, I think, a bug fix. see [2].
> patches 0005-0006 push the skip fetch optimization into the table AMs
> patches 0007-0009 change the control flow of BitmapHeapNext() to match that required by the streaming read API
> patch 0010 is the streaming read code not yet in master
> patch 0011 is the actual bitmapheapscan streaming read user.
>
> patches 0001-0009 apply on top of master but 0010 and 0011 must be applied on top of a commit before a 21d9c3ee4ef74e2 (until a rebased version of the streaming read API is on the mailing list).

I followed your lead and applied them to 6a8ffe812d194ba6f4f26791b6388a4837d17d6c. `make check` worked fine, though I expect you know that already.

> The caveat is that these patches introduce breaking changes to two table AM functions for bitmapheapscan: table_scan_bitmap_next_block() and table_scan_bitmap_next_tuple().

You might want an independent perspective on how much of a hassle those breaking changes are, so I took a stab at that. Having written a custom proprietary TAM for postgresql 15 here at EDB, and having ported it and released it for postgresql 16, I thought I'd try porting it to the above commit with your patches. Even without your patches, I already see breaking changes coming from commit f691f5b80a85c66d715b4340ffabb503eb19393e, which creates a similar amount of breakage for me as your patches do.

Dealing with the combined breakage might amount to a day of work, including testing, half of which I think I've already finished. In other words, it doesn't seem like a big deal. Were postgresql 17 shaping up to be compatible with TAMs written for 16, your patch would change that qualitatively, but since things are already incompatible, I think you're in the clear.

> A TBMIterateResult used to be threaded through both of these functions and used in BitmapHeapNext(). This patch set removes all references to TBMIterateResults from BitmapHeapNext. Because the streaming read API requires the callback to specify the next block, BitmapHeapNext() can no longer pass a TBMIterateResult to table_scan_bitmap_next_block().
>
> More subtly, table_scan_bitmap_next_block() used to return false if there were no more visible tuples on the page or if the block that was requested was not valid. With these changes, table_scan_bitmap_next_block() will only return false when the bitmap has been exhausted and the scan can end. In order to use the streaming read API, the user must be able to request the blocks it needs without requiring synchronous feedback per block. Thus, this table AM function must change its meaning.
>
> I think the way the patches are split up could be improved. I will think more about this. There are also probably a few mistakes with which comments are updated in which patches in the set.

I look forward to the next version of the patch set. Thanks again for working on this.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
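[To make the contract change described in the quoted text concrete for other table AM authors, here is a minimal sketch with invented types and names, not the actual API: under the new rules, the next-block function returns false only when the bitmap is exhausted, so the executor-side loop no longer needs per-block "try again" feedback.]

```c
#include <assert.h>
#include <stdbool.h>

/* Invented stand-in for a bitmap-driven scan's state. */
typedef struct FakeBitmapScan
{
	const int  *blocks;			/* blocks the bitmap says to visit */
	int			nblocks;
	int			pos;
} FakeBitmapScan;

/*
 * New-style contract: advance to the next block, returning false only
 * when the bitmap is exhausted.  (Under the old contract, false could
 * also mean "this block was invalid or had no visible tuples", forcing
 * the caller to synchronously retry block by block.)
 */
static bool
fake_scan_bitmap_next_block(FakeBitmapScan *scan, int *blkno)
{
	if (scan->pos >= scan->nblocks)
		return false;			/* bitmap exhausted: the scan ends */
	*blkno = scan->blocks[scan->pos++];
	return true;
}

/* The caller can now treat a false return as end-of-scan, full stop. */
static int
fake_count_blocks(FakeBitmapScan *scan)
{
	int			blkno;
	int			count = 0;

	while (fake_scan_bitmap_next_block(scan, &blkno))
		count++;
	return count;
}
```

This one-way contract is what lets a streaming read callback hand out block numbers ahead of time, since no per-block success/failure answer flows back.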
Re: Should we remove -Wdeclaration-after-statement?
> On Jan 29, 2024, at 7:35 AM, Isaac Morland wrote:
>
> On Mon, 29 Jan 2024 at 10:31, Mark Dilger wrote:
>> -Infinity for refactoring the entire codebase and backpatching.
>
> I don't think anybody is proposing re-working the existing codebase. I understand this to be only about allowing new code to use the newer style. Personally, I like, as much as possible, to use initializations to const variables and avoid assignment operators, so I much prefer to declare at first (and usually only) assignment.

I was responding to Jelte's paragraph upthread:

> On Dec 27, 2023, at 9:59 AM, Jelte Fennema-Nio wrote:
>
> But if the only reason not to remove the warning is that then there would be two styles of declaration in the codebase, then I'm happy to create another refactoring script that moves declarations down to their first usage. (Which could then be run on all backbranches to make sure there is no backpatching pain)

I find that kind of gratuitous code churn highly unpleasant.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Should we remove -Wdeclaration-after-statement?
> On Jan 29, 2024, at 7:03 AM, Jelte Fennema-Nio wrote:
>
> So my suggestion is for people to respond with -1, -0.5, +-0, +0.5, or +1 to indicate support against/for the change.

-1 for me. -Infinity for refactoring the entire codebase and backpatching.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Escape output of pg_amcheck test
> On Jan 8, 2024, at 5:41 AM, Mark Dilger wrote:
>
> The /r modifier defeats the purpose of the patch, at least for my perl version, perl 5, version 28, subversion 1 (v5.28.1). With just the /aeg modifier, it works fine.

Never mind. I might be wrong about that. I didn't have a test case handy that would generate index corruption resulting in characters of the problematic class, and so I quickly wrote some (wrong) instrumentation to try to test your patch.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Escape output of pg_amcheck test
On 1/7/24 23:27, Peter Eisentraut wrote:
> The pg_amcheck reports a skip message if the layout of the index does not match expectations. That message includes the bytes that were expected and the ones that were found. But the found ones are arbitrary bytes, which can have funny effects on the terminal when they are printed. To avoid that, escape non-word characters before printing.
>
> + # escape non-word characters to avoid confusing the terminal
> + $b =~ s{(\W)}{ sprintf '\x%02x', ord($1) }aegr);

The /r modifier defeats the purpose of the patch, at least for my perl version, perl 5, version 28, subversion 1 (v5.28.1). With just the /aeg modifier, it works fine.

-- 
Mark Dilger
Re: Table AM Interface Enhancements
> On Nov 25, 2023, at 9:47 AM, Alexander Korotkov wrote:
>
>> Should the patch at least document which parts of the EState are expected to be in which states, and which parts should be viewed as undefined? If the implementors of table AMs rely on any/all aspects of EState, doesn't that prevent future changes to how that structure is used?
>
> New tuple tuple_insert_with_arbiter() table AM API method needs EState argument to call executor functions: ExecCheckIndexConstraints(), ExecUpdateLockMode(), and ExecInsertIndexTuples(). I think we probably need to invent some opaque way to call this executor function without revealing EState to table AM. Do you think this could work?

We're clearly not accessing all of the EState, just some specific fields, such as es_per_tuple_exprcontext. I think you could at least refactor to pass the minimum amount of state information through the table AM API.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
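[One shape the "opaque way" might take — purely a sketch with invented names, not code from the thread's patches — is a small callback struct: the executor keeps its state private behind an opaque pointer and hands the table AM only wrappers for the few calls it actually needs.]

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-in for executor-private state.  In the real system this would
 * wrap whatever slice of EState the executor chooses to expose; the
 * table AM never sees the struct's contents.  Counters here are just
 * for demonstration.
 */
typedef struct ArbiterSupport
{
	int			constraint_checks;
	int			index_inserts;
} ArbiterSupport;

/* The narrow interface the table AM is allowed to see. */
typedef struct ArbiterCallbacks
{
	bool		(*check_index_constraints) (ArbiterSupport *sup);
	void		(*insert_index_tuples) (ArbiterSupport *sup);
} ArbiterCallbacks;

/* Executor-side wrappers; only these touch the private state. */
static bool
exec_check_index_constraints(ArbiterSupport *sup)
{
	sup->constraint_checks++;
	return true;				/* pretend no conflicting tuple exists */
}

static void
exec_insert_index_tuples(ArbiterSupport *sup)
{
	sup->index_inserts++;
}

/*
 * What a table AM's tuple_insert_with_arbiter() might look like: it
 * receives the callbacks plus an opaque pointer, never EState itself.
 */
static bool
am_insert_with_arbiter(ArbiterSupport *sup, const ArbiterCallbacks *cb)
{
	if (!cb->check_index_constraints(sup))
		return false;			/* conflict: caller handles ON CONFLICT */
	cb->insert_index_tuples(sup);
	return true;
}
```

The design point is the same one raised in the email: the AM can only depend on what the wrappers promise, so the executor remains free to rearrange EState internally.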
Re: Table AM Interface Enhancements
> On Nov 23, 2023, at 4:42 AM, Alexander Korotkov wrote:
>
> 0006-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v1.patch
>
> Provides a new table AM API method to encapsulate the whole INSERT ... ON CONFLICT ... algorithm rather than just implementation of speculative tokens.

I *think* I understand that you are taking the part of INSERT..ON CONFLICT that lives outside the table AM and pulling it inside so that table AM authors are free to come up with whatever implementation is more suited for them. The most straightforward way of doing so results in an EState parameter in the interface definition. That seems not so good, as the EState is a fairly complicated structure, and future changes in the executor might want to rearrange what EState tracks, which would change which values tuple_insert_with_arbiter() can depend on. Should the patch at least document which parts of the EState are expected to be in which states, and which parts should be viewed as undefined? If the implementors of table AMs rely on any/all aspects of EState, doesn't that prevent future changes to how that structure is used?

> 0008-Let-table-AM-insertion-methods-control-index-inse-v1.patch
>
> Allows table AM to skip index insertions in the executor and handle those insertions itself.

The new parameter could use more documentation.

> 0009-Custom-reloptions-for-table-AM-v1.patch
>
> Enables table AMs to define and override reloptions for tables and indexes.

This could use some regression tests to exercise the custom reloptions.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [PATCH] Allow Postgres to pick an unused port to listen
> On Apr 12, 2023, at 8:01 AM, Robert Haas wrote:
>
> "hey, I'd like that too"

I like the idea in principle. I have been developing a testing infrastructure in my spare time and would rather not duplicate Andrew's TAP logic. If we get this pushed down into the server itself, all the test infrastructure can use a single, shared solution.

As for the implementation, I just briefly scanned the patch.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: running logical replication as the subscription owner
> On Mar 24, 2023, at 11:35 AM, Robert Haas wrote:
>
> I don't know how bad that sounds to you, and if it does sound bad, I don't immediately see how to mitigate it. As I said to Jeff, if you can replicate into a table that has a casually-written SECURITY INVOKER trigger on it, you can probably hack into the table owner's account.

I assume you mean this bit:

>> Imagine for example that the table owner has a trigger which doesn't sanitize search_path. The subscription owner can potentially leverage that to get the table owner's privileges.

I don't find that terribly convincing. First, there's no reason a subscription owner should be an ordinary user able to volitionally do anything. The subscription owner should just be a role that the subscription runs under, as a means of the superuser dropping privileges before applying changes. So the only real problem would be that the changes coming from the publisher might, upon application, hack the table owner. But if that's the case, the table owner's vulnerability on the subscription-database side is equal to their vulnerability on the publication-database side (assuming equal schemas on both). Flagging this vulnerability as being logical replication related seems like a category error. Instead, it's a schema vulnerability.

> So I think that if we allow user A to replicate into user B's table with fewer privileges than A-can-set-role-to-B, we're building a privilege-escalation attack into the system. But if we do require A-can-set-role-to-B, then things change as described above.

I don't understand the direction this patch is going. I'm emphatically not objecting to it, merely expressing my confusion about it. I had imagined the solution to the replication security problem was to stop running the replication as superuser, and instead as a trivial user.

Imagine that the superuser creates roles "deadhead_bob" and "deadhead_alice" which cannot log in, are not members of any groups nor have any other roles as members of themselves, and have no privileges beyond being able to replicate into bob's and alice's tables, respectively. The superuser sets up the subscriptions disabled, transfers ownership to deadhead_bob and deadhead_alice, and only then enables the subscriptions.

Since deadhead_bob and deadhead_alice cannot log in, and nobody can set role to them, I don't see what the vulnerability is. Sure, maybe alice can attack deadhead_alice, or bob can attack deadhead_bob, but that's more of a privilege deescalation than a privilege escalation, so where's the risk?

That's not a rhetorical question. Is there a risk here? Or are we just concerned that most users will set up replication with superuser or some other high-privilege user, and we're trying to protect them from the consequences of that choice?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: running logical replication as the subscription owner
> On Mar 24, 2023, at 7:00 AM, Robert Haas wrote: > > More generally, Stephen Frost has elsewhere argued that we should want > the subscription owner to be a very low-privilege user, so that if > their privileges get stolen, it's no big deal. I disagree with that. I > think it's always a problem if one user can get unauthorized access to > another user's account, regardless of exactly what those accounts can > do. I think our goal should be to make it safe for the subscription > owner to be a very high-privilege user, because you're going to need > to be a very high-privilege user to set up replication. And if you do > have that level of privilege, it's more convenient and simpler if you > can just own the subscription yourself, rather than having to make a > dummy account to own it. To put that another way, I think that what > people are going to want to do in a lot of cases is have the superuser > own the subscription, so I think we need to make that case safe, > whatever it takes. I also think the subscription owner should be a low-privileged user, owing to the risk of the publisher injecting malicious content into the publication. I think you are focused on all the bad actors on the subscription-side database and what they can do to each other. That's also valid, but I get the impression that you're losing sight of the risk posed by malicious publishers. Or maybe you aren't, and can explain? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE
> On Mar 11, 2023, at 3:22 PM, Andres Freund wrote: > > Something like the attached. I like that your patch doesn't make the test longer. I assume you've already run the tests and that it works. > I don't know enough perl to know how to interpolate something like > use constant ROWCOUNT => 17; > so I just made it a variable. Seems fair. I certainly don't mind. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Transparent column encryption
> On Mar 9, 2023, at 11:18 PM, Peter Eisentraut > wrote: > > Here is an updated patch. Thanks, Peter. The patch appears to be in pretty good shape, but I have a few comments and concerns. CEKIsVisible() and CMKIsVisible() are obviously copied from TSParserIsVisible(), and the code comments weren't fully updated. Specifically, the phrase "hidden by another parser of the same name" should be updated to not mention "parser". Why does get_cmkalg_name() return the string "unspecified" for PG_CMK_UNSPECIFIED, but the next function get_cmkalg_jwa_name() returns NULL for PG_CMK_UNSPECIFIED? It seems they would both return NULL, or both return "unspecified". If there's a reason for the divergence, could you add a code comment to clarify? BTW, get_cmkalg_jwa_name() has no test coverage. Looking further at code coverage, the new conditional in printsimple_startup() is never tested with (MyProcPort->column_encryption_enabled), so the block is never entered. This would seem to be a consequence of backends like walsender not using column encryption, which is not terribly surprising, but it got me wondering if you had a particular client use case in mind when you added this block? The new function pg_encrypted_in() appears totally untested, but I have to wonder if that's because we're not round-trip testing pg_dump with column encryption...? The code coverage in pg_dump looks fairly decent, but some column encryption code is not covered. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE
> On Mar 8, 2023, at 4:15 PM, Andres Freund wrote: > > I worked some more on the fixes for amcheck, and fixes for amcheck. > > The second amcheck fix ends up correcting some inaccurate output in the tests > - previously xids from before xid 0 were reported to be in the future. > > Previously there was no test case exercising exceeding nextxid, without > wrapping around into the past. I added that at the end of > 004_verify_heapam.pl, because renumbering seemed too annoying. > > What do you think? The changes look reasonable to me. > Somewhat random note: > > Is it intentional that we VACUUM FREEZE test ROWCOUNT times? That's > effectively O(ROWCOUNT^2), albeit with small enough constants to not really > matter. I don't think we need to insert the rows one-by-one either. Changing > that to a single INSERT and FREEZE shaves 10-12% off the tests. I didn't > change that, but we also fire off a psql for each tuple for heap_page_items(), > with offset $N no less. That seems to be another 500ms. I don't recall the reasoning. Feel free to optimize the tests. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: HOT chain validation in verify_heapam()
> On Mar 7, 2023, at 10:16 AM, Robert Haas wrote: > > On Mon, Mar 6, 2023 at 12:36 PM Robert Haas wrote: >> So it seems that we still don't have a patch where the >> value of a variable called lp_valid corresponds to whether or not the >> L.P. is valid. > > Here's a worked-over version of this patch. Changes: > > - I got rid of the code that sets lp_valid in funny places and instead > arranged to have check_tuple_visibility() pass up the information on > the XID status. That's important, because we can't casually apply > operations like TransactionIdIsCommitted() to XIDs that, for all we > know, might not even be in the range covered by CLOG. In such cases, > we should not perform any HOT chain validation because we can't do it > sensibly; the new code accomplishes this, and also reduces the number > of CLOG lookups as compared with your version. > > - I moved most of the HOT chain checks from the loop over the > predecessor[] array to the loop over the successor[] array. It didn't > seem to have any value to put them in the third loop; it forces us to > expend extra code to distinguish between redirects and tuples, > information that we already had in the second loop. The only check > that seems to make sense to do in that last loop is the one for a HOT > chain that starts with a HOT tuple, which can't be done any earlier. > > - I realized that your patch had a guard against setting the > predecessor[] when it was set already only for tuples, not for > redirects. That means if a redirect pointed into the middle of a HOT > chain we might not report corruption appropriately. I fixed this and > reworded the associated messages a bit. > > - Assorted cosmetic and comment changes. > > I think this is easier to follow and more nearly correct, but what do > you (and others) think? Thanks, Robert. Quickly skimming over this patch, it looks like something reviewable. Your changes to t/004_verify_heapam.pl appear to be consistent with how that test was intended to function. 
Note that I have not tried any of this yet. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Non-superuser subscription owners
> On Feb 22, 2023, at 10:49 AM, Jeff Davis wrote: > > On Wed, 2023-02-22 at 09:27 -0800, Mark Dilger wrote: >> Another option is to execute under the intersection of their >> privileges, where both the definer and the invoker need the >> privileges in order for the action to succeed. That would be more >> permissive than the proposed SECURITY NONE, while still preventing >> either party from hijacking privileges of the other. > > Interesting idea, I haven't heard of something like that being done > before. Is there some precedent for that or a use case where it's > helpful? No current use case comes to mind, but I proposed it for event triggers one or two development cycles ago, to allow for non-superuser event trigger owners. The problems associated with allowing non-superusers to create and own event triggers were pretty similar to the problems being discussed in this thread. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Non-superuser subscription owners
> On Feb 22, 2023, at 9:18 AM, Jeff Davis wrote: > > Another option is having some kind SECURITY NONE that would run the > code as a very limited-privilege user that can basically only access > the catalog. That would be useful for running default expressions and > the like without the definer or invoker needing to be careful. Another option is to execute under the intersection of their privileges, where both the definer and the invoker need the privileges in order for the action to succeed. That would be more permissive than the proposed SECURITY NONE, while still preventing either party from hijacking privileges of the other. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
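The intersection-of-privileges idea above can be modeled with plain sets: an action succeeds only if both the definer and the invoker hold the required privilege. This is an illustrative sketch, not PostgreSQL's actual ACL machinery; the privilege strings are invented for the example.

```python
# Toy model: execution under the intersection of definer and invoker
# privileges. Neither party can leverage a privilege the other lacks.

definer_privs = {"SELECT:patient", "INSERT:audit_log"}
invoker_privs = {"SELECT:patient", "UPDATE:patient"}

def allowed(required: str) -> bool:
    # Only privileges held by BOTH parties are effective.
    effective = definer_privs & invoker_privs
    return required in effective

print(allowed("SELECT:patient"))    # True: both hold it
print(allowed("INSERT:audit_log"))  # False: only the definer holds it
print(allowed("UPDATE:patient"))    # False: only the invoker holds it
```

The point of the model is the hijack-prevention property: the definer cannot be tricked into exercising the invoker's extra privileges, and vice versa.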
Re: Transparent column encryption
> On Feb 11, 2023, at 1:54 PM, Mark Dilger wrote: > > Here are some observations I should mention, src/sgml/html/libpq-exec.html needs clarification: > paramFormats[]Specifies whether parameters are text (put a zero in the array > entry for the corresponding parameter) or binary (put a one in the array > entry for the corresponding parameter). If the array pointer is null then all > parameters are presumed to be text strings. Perhaps you should edit this last sentence to say that all parameters are presumed to be text strings without forced encryption. > Values passed in binary format require knowledge of the internal > representation expected by the backend. For example, integers must be passed > in network byte order. Passing numeric values requires knowledge of the > server storage format, as implemented in > src/backend/utils/adt/numeric.c::numeric_send() and > src/backend/utils/adt/numeric.c::numeric_recv(). > When column encryption is enabled, the second-least-significant byte of this > parameter specifies whether encryption should be forced for a parameter. The value 0x10 has a one as its second-least-significant *nibble*, but that's a really strange way to describe the high-order nibble, and perhaps not what you mean. Could you clarify? > Set this byte to one to force encryption. I think setting the byte to one (0x01) means "binary unencrypted", not "force encryption". Don't you mean to set this byte to 16? > For example, use the C code literal 0x10 to specify text format with forced > encryption. If the array pointer is null then encryption is not forced for > any parameter. > If encryption is forced for a parameter but the parameter does not correspond > to an encrypted column on the server, then the call will fail and the > parameter will not be sent. This can be used for additional security against > a compromised server. 
(The drawback is that application code then needs to be > kept up to date with knowledge about which columns are encrypted rather than > letting the server specify this.) I think you should say something about how specifying 0x11 will behave -- in other words, asking for encrypted binary data. I believe that this will draw a "format must be text for encrypted parameter" error, and that the docs should clearly say so. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
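To make the disputed flag semantics above concrete, here is a small illustration that simply takes the quoted literals at face value (0x01 meaning binary format, 0x10 meaning forced encryption). Note that 0x10 sets bit 4, i.e. the second-least-significant *nibble*, which is exactly the byte-vs-nibble inconsistency being pointed out; the constant names here are invented for illustration and are not libpq identifiers.

```python
# Toy model of a combined per-parameter format word, assuming the
# literals from the quoted documentation: low bit selects binary
# format, and 0x10 requests forced encryption.

BINARY_FORMAT = 0x01      # bit 0: 0 = text, 1 = binary
FORCE_ENCRYPTION = 0x10   # bit 4: the second-least-significant nibble

def describe(fmt: int) -> str:
    parts = ["binary" if fmt & BINARY_FORMAT else "text"]
    if fmt & FORCE_ENCRYPTION:
        parts.append("forced-encryption")
    return "+".join(parts)

print(describe(0x00))  # text
print(describe(0x01))  # binary
print(describe(0x10))  # text+forced-encryption
print(describe(0x11))  # binary+forced-encryption (the case the docs
                       # should say is rejected for encrypted columns)
```

Under this reading, 0x11 is representable but would be refused by the client library, which is why the docs ought to spell that case out.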
Re: Transparent column encryption
> On Jan 25, 2023, at 10:44 AM, Peter Eisentraut > wrote: > > Here is a new patch. Changes since v14: > > - Fixed some typos (review by Justin Pryzby) > - Fixed backward compat. psql and pg_dump (review by Justin Pryzby) > - Doc additions (review by Jacob Champion) > - Validate column_encryption option in libpq (review by Jacob Champion) > - Handle column encryption in inheritance > - Change CEKs and CMKs to live inside > schemas Thanks Peter. Here are some observations about the documentation in patch version 15. In acronyms.sgml, the CEK and CMK entries should link to documentation, perhaps linkend="glossary-column-encryption-key" and linkend="glossary-column-master-key". These glossary entries should in turn link to linkend="ddl-column-encryption". In ddl.sgml, the sentence "forcing encryption of certain parameters in the client library (see its documentation)" should link to linkend="libpq-connect-column-encryption". Did you intend the use of "transparent data encryption" (rather than "transparent column encryption") in datatype.sgml? If so, what's the difference? Is this feature intended to be available from ecpg? If so, can we maybe include an example in 36.3.4. Prepared Statements showing how to pass the encrypted values securely. If not, can we include verbiage about that limitation, so folks don't waste time trying to figure out how to do it? The documentation for pg_dump (and pg_dumpall) now includes a --decrypt-encrypted-columns option, which I suppose requires cmklookup to first be configured, and for PGCMKLOOKUP to be exported. There isn't anything in the pg_dump docs about this, though, so maybe a link to section 5.5.3 with a warning about not running pg_dump this way on the database server itself? How does a psql user mark a parameter as having forced encryption? A libpq user can specify this in the paramFormats array, but I don't see any syntax for doing this from psql. 
None of the perl tap tests you've included appear to do this (except indirectly when calling test_client); grep'ing for the libpq error message "parameter with forced encryption is not to be encrypted" in the tests has no matches. Is it just not possible? I thought you'd mentioned some syntax for this when we spoke in person, but I don't see it now. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Non-superuser subscription owners
> On Feb 1, 2023, at 6:43 AM, Robert Haas wrote: > The thing I'm > struggling to understand is: if you only want to replicate into tables > that Alice can write, why not just make Alice own the subscription? > For a run-as user to make sense, you need a scenario where we want the > replication to target only tables that Alice can touch, but we also > don't want Alice herself to be able to touch the subscription, so you > make Alice the run-as user and yourself the owner, or something like > that. But I'm not sure what that scenario is exactly. This "run-as" idea came about because we didn't want arbitrary roles to be able to change the subscription's connection string. A competing idea was to have a server object rather than a string, with roles like Alice being able to use the server object if they have been granted usage privilege, and not otherwise. So the "run-as" and "server" ideas were somewhat competing. > Mark was postulating a scenario where the publisher and subscriber > don't trust each other. I was thinking a little bit more about that. I > still maintain that the current system is poorly set up to make that > work, but suppose we wanted to do better. We could add filtering on > the subscriber side, like you list schemas or specific relations that > you are or are not willing to replicate into. Then you could, for > example, connect your subscription to a certain remote publication, > but with the restriction that you're only willing to replicate into > the "headquarters" schema. Then we'll replicate whatever tables they > send us, but if the dorks at headquarters mess up the publications on > their end (intentionally or otherwise) and add some tables from the > "locally_controlled_stuff" schema, we'll refuse to replicate that into > our eponymous schema. That example is good, though I don't see how "filters" are better than roles+privileges. Care to elaborate? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Non-superuser subscription owners
> On Jan 30, 2023, at 1:29 PM, Robert Haas wrote: > > I feel like you're accusing me of removing functionality that has > never existed. A subscription doesn't run as the subscription creator. > It runs as the subscription owner. If you or anyone else had added the > capability for it to run as someone other than the subscription owner, > I certainly wouldn't be trying to back that capability out as part of > this patch, and because there isn't, I'm not proposing to add that as > part of this patch. I don't see how that makes me guilty of squashing > anything together. The current state of affairs, where the run-as user > is taken from pg_subscription.subowner, the same field that is updated > by ALTER SUBSCRIPTION ... OWNER TO, is the result of your work, not > anything that I have done or am proposing to do. > > I also *emphatically* disagree with the idea that this undoes what you > worked on. My patch would be *impossible* without your work. Prior to > your work, the run-as user was always, basically, the superuser, and > so the idea of allowing anyone other than a superuser to execute > CREATE SUBSCRIPTION would be flat-out nuts. Because of your work, > that's now a thing that we may be able to reasonably allow, if we can > work through the remaining issues. So I'm grateful to you, and also > sorry to hear that you're annoyed with me. But I still don't think > that the fact that the division you want doesn't exist is somehow my > fault. > > I'm kind of curious why you *didn't* make this distinction at the time > that you did the other work in this area. Maybe my memory is > playing tricks on me again, but I seem to recall talking about the > idea with you at the time, and I seem to recall thinking that it > sounded like an OK idea. I seem to vaguely recall us discussing > hazards like: well, what if replication causes code to get executed as > the subscription owner that causes something bad to happen? 
But I > think the only way that happens is if they put triggers on the tables > that are being replicated, which is their choice, and they can avoid > installing problematic code there if they want. I think there might > have been some other scenarios, too, but I just can't remember. In any > case, I don't think the idea is completely without merit. I think it > could very well be something that we want to have for one reason or > another. But I don't currently understand exactly what those reasons > are, and I don't see any reason why one patch should both split owner > from run-as user and also allow the owner to be a non-superuser. That > seems like two different efforts to me. I don't have a concrete problem with your patch, and wouldn't object if you committed it. My concerns were more how you were phrasing things, but it seems not worth any additional conversation, because it's probably a distinction without a difference. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Non-superuser subscription owners
> On Jan 30, 2023, at 11:30 AM, Robert Haas wrote: > > CREATE SUBSCRIPTION > provides no tools at all for filtering the data that the subscriber > chooses to send. > > Now that can be changed, I suppose, and a run-as user would be one way > to make progress in that direction. But I'm not sure how viable that > is, because... > >>> In what >>> circumstances would be it be reasonable to give responsibility for >>> those objects to different and especially mutually untrusting users? >> >> When public repositories of data, such as the IANA whois database, publish >> their data via postgres publications. > > ... for that to work, IANA would need to set up the database so that > untrusted parties can create logical replication slots on their > PostgreSQL server. And I think that granting REPLICATION privilege on > your database to random people on the Internet is not really viable, > nor intended to be viable. That was an aspirational example in which there's infinite daylight between the publisher and subscriber. I, too, doubt that's ever going to be possible. But I still think we should aspire to some extra daylight between the two. Perhaps IANA doesn't publish to the whole world, but instead publishes only to subscribers who have a contract in place, and have agreed to monetary penalties should they abuse the publishing server. Whatever. There's going to be some amount of daylight possible if we design for it, and none otherwise. My real argument here isn't against your goal of having non-superusers who can create subscriptions. That part seems fine to me. Given that my work last year made it possible for subscriptions to run as somebody other than the subscription creator, it annoys me that you now want the subscription creator's privileges to be what the subscription runs as. That seems to undo what I worked on. 
In my mental model of a (superuser-creator, non-superuser-owner) pair, it seems you're logically only touching the lefthand side, so you should then have a (nonsuperuser-creator, nonsuperuser-owner) pair. But you don't. You go the apparently needless extra step of just squashing them together. I just don't see why it needs to be like that. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Non-superuser subscription owners
> On Jan 30, 2023, at 9:26 AM, Robert Haas wrote: > > So basically this doesn't really feel like a valid scenario to me. > We're supposed to believe that Alice is hostile to Bob, but the > superuser doesn't seem to have thought very carefully about how Bob is > supposed to defend himself against Alice, and Bob doesn't even seem to > be trying. Maybe we should rename the users to Samson and Delilah? :-) No, Atahualpa and Pizarro. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Non-superuser subscription owners
> On Jan 30, 2023, at 9:26 AM, Robert Haas wrote: > > First, it doesn't seem to make a lot of sense to have one person > managing the publications and someone else managing the subscriptions, > and especially if those parties are mutually untrusting. I can't think > of any real reason to set things up that way. Sure, you could, but why > would you? You could, equally, decide that one member of your > household was going to decide what's for dinner every night, and some > other member of your household was going to decide what gets purchased > at the grocery store each week. If those two people exercise their > responsibilities without tight coordination, or with hostile intent > toward each other, things are going to go badly, but that's not an > argument for putting a combination lock on the flour canister. It's an > argument for getting along better, or not having such a dumb system in > the first place. I don't quite see how the situation you postulate in > (A) and (B) is any different. Publications and subscriptions are as > closely connected as food purchases and meals. The point of a > publication is for it to connect up to a subscription. I have a grim view of the requirement that publishers and subscribers trust each other. Even when they do trust each other, they can firewall attacks by acting as if they do not. > In what > circumstances would be it be reasonable to give responsibility for > those objects to different and especially mutually untrusting users? When public repositories of data, such as the IANA whois database, publish their data via postgres publications. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Non-superuser subscription owners
> On Jan 30, 2023, at 7:44 AM, Robert Haas wrote: > > And if we suppose that > that already works and is safe, well then what's the case where I do > need a run-as user? A) Alice publishes tables, and occasionally adds new tables to existing publications. B) Bob manages subscriptions, and periodically runs "refresh publication". Bob also creates new subscriptions for people when a row is inserted into the "please create a subscription for me" table which Bob owns, using a trigger that Bob created on that table. C) Alice creates a "please create a subscription for me" table on the publishing database, adds lots of malicious requests, and adds that table to the publication. D) Bob replicates the table, fires the trigger, creates the malicious subscriptions, and starts replicating all that stuff, too. I think that having Charlie, not Bob, as the "run-as" user helps somewhere right around (D). — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Non-superuser subscription owners
> On Jan 27, 2023, at 1:35 PM, Robert Haas wrote: > >> I started out asking what benefits it provides to own a subscription one >> cannot modify. But I think it is a good capability to have, to restrict the >> set of relations that replication could target. Although perhaps it'd be >> better to set the "replay user" as a separate property on the subscription? > > That's been proposed previously, but for reasons I don't quite > remember it seems not to have happened. I don't think it achieved > consensus. > >> Does owning a subscription one isn't allowed to modify useful outside of >> that? > > Uh, possibly that's a question for Mark or Jeff. I don't know. I can't > see what they would be, but I just work here. If the owner cannot modify the subscription, then the owner degenerates into a mere "run-as" user. Note that this isn't how things work now, and even if we disallowed owners from modifying the connection string, there would still be other attributes the owner could modify, such as the set of publications subscribed. More generally, my thinking on this thread is that there need to be two non-superuser roles: a higher-privileged role which can create a subscription, and a lower-privileged role serving the "run-as" function. Those shouldn't be the same, because the "run-as" concept doesn't logically need to have subscription creation power, and likely *shouldn't* have that power. Depending on which sorts of attributes a subscription object has, such as the connection string, the answer differs for whether the owner/"run-as" user should get to change those attributes. One advantage of Jeff's idea of using a server object rather than a string is that it becomes more plausibly safe to allow the subscription owner to make changes to that attribute of the subscription.
Re: Non-superuser subscription owners
> On Jan 18, 2023, at 12:51 PM, Robert Haas wrote: > > Unless I'm missing something, it seems like this could be a quite small patch. I didn't like the idea of the create/alter subscription commands needing to parse the connection string and think about what it might do, because at some point in the future we might extend what things are allowed in that string, and we have to keep everything that contemplates that string in sync. I may have been overly hesitant to tackle that problem. Or maybe I just ran short of round tuits. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Non-superuser subscription owners
kly switched my focus elsewhere. > , and (b) what people think of (1)-(3) above There are different ways of solving (1), and Jeff and I discussed them in Dec 2021. My recollection was that idea (3) was the cleanest. Other ideas might be simpler than (3), or they may just appear simpler but in truth turn into a can of worms. I don't know, since I never went as far as trying to implement either approach. Idea (2) seems to contemplate non-superuser subscription owners as a theoretical thing, but it's quite real already. Again, see 027_nosuperuser.pl. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Transparent column encryption
> On Jan 10, 2023, at 9:26 AM, Mark Dilger wrote: > >-- Cryptographically connected to the encrypted record >patient_id BIGINT NOT NULL, >patient_ssn CHAR(11), > >-- The encrypted record >patient_record TEXT ENCRYPTED WITH (column_encryption_key = cek1, >column_encryption_salt = (patient_id, > patient_ssn)), As you mention upthread, tying columns together creates problems for statements that only operate on a subset of columns. Allowing schema designers a choice about tying the encrypted column to zero or more other columns allows them to choose which works best for their security needs. The example above would make a statement like "UPDATE patient_record SET patient_record = $1 \bind '{some json whatever}'" raise an exception at the libpq client level, but maybe that's what schema designers wants it to do. If not, they should omit the column_encryption_salt option in the create table statement; but if so, they should expect to have to specify the other columns as part of the update statement, possibly as part of the where clause, like UPDATE patient_record SET patient_record = $1 WHERE patient_id = 12345 AND patient_ssn = '111-11-' \bind '{some json record}' and have the libpq get the salt column values from the where clause (which may be tricky to implement), or perhaps use some new bind syntax like UPDATE patient_record SET patient_record = ($1:$2,$3) -- new, wonky syntax WHERE patient_id = $2 AND patient_ssn = $3 \bind '{some json record}' 12345 '111-11-' which would be error prone, since the sql statement could specify the ($1:$2,$3) inconsistently with the where clause, or perhaps specify the "new" salt columns even when not changed, like UPDATE patient_record SET patient_record = $1, patient_id = 12345, patient_ssn = "111-11-" WHERE patient_id = 12345 AND patient_ssn = "111-11-" \bind '{some json record}' which looks kind of nuts at first glance, but is grammatically consistent with cases where one or both of the patient_id or patient_ssn are also being 
changed, like UPDATE patient_record SET patient_record = $1, patient_id = 98765, patient_ssn = "999-99-" WHERE patient_id = 12345 AND patient_ssn = "111-11-" \bind '{some json record}' Or, of course, you can ignore these suggestions or punt them to some future patch that extends the current work, rather than trying to get it all done in the first column encryption commit. But it seems useful to think about what future directions would be, to avoid coding ourselves into a corner, making such future work harder. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Transparent column encryption
> On Dec 31, 2022, at 6:17 AM, Peter Eisentraut > wrote: > > Another update, with some merge conflicts resolved. Hi Peter, thanks for the patch! I wonder if logical replication could be made to work more easily with this feature. Specifically, subscribers of encrypted columns will need the encrypted column encryption key (CEK) and the name of the column master key (CMK) as they exist on the publisher, but getting access to that is not automated as far as I can see. It doesn't come through automatically as part of a subscription, and publishers can't publish the pg_catalog tables where the keys are kept (because publishing system tables is not supported). Is it reasonable to make available the CEK and CMK to subscribers in an automated fashion, to facilitate setting up logical replication with less manual distribution of key information? Is this already done, and I'm just not recognizing that you've done it? Can we do anything about the attack vector wherein a malicious DBA simply copies the encrypted datum from one row to another? Imagine the DBA Alice wants to murder a hospital patient Bob by removing the fact that Bob is deathly allergic to latex. She cannot modify Bob's encrypted and authenticated record, but she can easily update Bob's record with the encrypted record of a different patient Charlie. Likewise, if she wants Bob to pay Charlie's bill, she can replace Charlie's encrypted credit card number with Bob's, and once Bob is dead, he won't dispute the charges. An encrypted-and-authenticated column value should be connected with its row in some way that Alice cannot circumvent. In the patch as you have it written, the client application can include row information in the patient record (specifically, the patient's name, ssn, etc) and verify when any patient record is retrieved that this information matches. But that's hardly "transparent" to the client. It's something all clients will have to do, and easy to forget to do in some code path. 
Also, for encrypted fixed-width columns, it is not an option. So it seems the client needs to "salt" (maybe not the right term for what I have in mind) the encryption with some relevant other columns, and that's something the libpq client would need to understand, and something the patch's syntax needs to support. Something like:

CREATE TABLE patient_records (
    -- Cryptographically connected to the encrypted record
    patient_id BIGINT NOT NULL,
    patient_ssn CHAR(11),

    -- The encrypted record
    patient_record TEXT ENCRYPTED WITH (column_encryption_key = cek1,
                                        column_encryption_salt = (patient_id, patient_ssn)),

    -- Extra stuff, not cryptographically connected to anything
    next_of_kin TEXT,
    phone_number BIGINT,
    ...
);

I have not selected any algorithms that include such "salt"ing (again, maybe the wrong word) because I'm just trying to discuss the general feature, not get into the weeds about which cryptographic algorithm to select. Thoughts?

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
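[Editor's note: the "cryptographic connection" being asked for above can be sketched concretely. The following is a toy model only, using an HMAC as a stand-in for whatever AEAD authentication tag the feature would actually compute; the key, ciphertexts, and function names are all hypothetical, not part of the patch.]

```python
import hmac
import hashlib

def protect(key: bytes, ciphertext: bytes, patient_id: int) -> bytes:
    # Authenticate the ciphertext together with the row identity, the way
    # an AEAD cipher covers "associated data" alongside the plaintext.
    aad = patient_id.to_bytes(8, "big")
    return hmac.new(key, aad + ciphertext, hashlib.sha256).digest()

def verify(key: bytes, ciphertext: bytes, patient_id: int, tag: bytes) -> bool:
    return hmac.compare_digest(tag, protect(key, ciphertext, patient_id))

key = b"\x01" * 32                           # hypothetical CEK material
bob_ct = b"<bob's encrypted record>"         # placeholder ciphertexts
charlie_ct = b"<charlie's encrypted record>"
bob_tag = protect(key, bob_ct, patient_id=1)

assert verify(key, bob_ct, 1, bob_tag)           # legitimate row: passes
assert not verify(key, charlie_ct, 1, bob_tag)   # Alice swaps in Charlie's datum: fails
assert not verify(key, bob_ct, 2, bob_tag)       # Bob's datum moved to another row: fails
```

With the row key columns folded into the authenticated data, Alice's copy-a-datum-between-rows attack fails verification without any per-client discipline.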
Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE
> On Jan 9, 2023, at 2:07 PM, Andres Freund wrote: > > The tests encounter the issue today. If you add > Assert(TransactionIdIsValid(ctx->next_xid)); > Assert(FullTransactionIdIsValid(ctx->next_fxid)); > early in FullTransactionIdFromXidAndCtx() it'll be hit in the > amcheck/pg_amcheck tests. Ok, I can confirm that. I find the assertion Assert(epoch != (uint32)-1); a bit simpler to reason about, but either way, I agree it is a bug. Thanks for finding this. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
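[Editor's note: a toy model of the epoch adjustment being discussed may help readers follow the bug. This is a simplification of FullTransactionIdFromXidAndCtx as described in the thread, not the real C code; the function name and constants here are illustrative.]

```python
EPOCH_BITS = 32
MASK32 = (1 << 32) - 1

def full_xid_from_ctx(xid: int, next_fxid: int) -> int:
    """Derive a 64-bit xid from a 32-bit one using the cached next full
    xid; an xid numerically above next_xid must come from the previous
    epoch, so the epoch is decremented (with uint32 wraparound)."""
    next_xid = next_fxid & MASK32
    epoch = next_fxid >> EPOCH_BITS
    if xid > next_xid:
        epoch = (epoch - 1) & MASK32   # uint32 arithmetic wraps
    # The simpler assertion discussed above: a wrapped epoch means the
    # cached next_fxid was never initialized.
    assert epoch != MASK32, "uninitialized cache: epoch is (uint32)-1"
    return (epoch << EPOCH_BITS) | xid

# With an initialized cache this behaves sanely:
assert full_xid_from_ctx(100, (5 << 32) | 200) == (5 << 32) | 100
assert full_xid_from_ctx(300, (5 << 32) | 200) == (4 << 32) | 300
# With next_fxid still zero (the bug), the epoch underflows and the
# assertion fires:
try:
    full_xid_from_ctx(300, 0)
    tripped = False
except AssertionError:
    tripped = True
assert tripped
```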
Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE
> On Jan 9, 2023, at 11:34 AM, Andres Freund wrote: > > 1) Because ctx->next_xid is set after the XidFromFullTransactionId() call in > update_cached_xid_range(), we end up using the xid 0 (or an outdated value in > subsequent calls) to determine whether epoch needs to be reduced. Can you say a bit more about your analysis here, preferably with pointers to the lines of code you are analyzing? Does the problem exist in amcheck as currently committed, or are you thinking about a problem that arises only after applying your patch? I'm a bit fuzzy on where xid 0 gets used. Thanks — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] random_normal function
> On Dec 8, 2022, at 1:53 PM, Paul Ramsey wrote: > > Just a utility function to generate random numbers from a normal > distribution. I find myself doing this several times a year, and I am > sure I must not be the only one.

Thanks for the patch. What do you think about these results?

+-- The semantics of a negative stddev are not well defined
+SELECT random_normal(mean := 0, stddev := -1);
+    random_normal
+---------------------
+ -1.0285744583010896
+(1 row)
+
+SELECT random_normal(mean := 0, stddev := '-Inf');
+ random_normal
+---------------
+      Infinity
+(1 row)
+
+-- This result may be defensible...
+SELECT random_normal(mean := '-Inf', stddev := 'Inf');
+ random_normal
+---------------
+     -Infinity
+(1 row)
+
+-- but if so, why is this NaN?
+SELECT random_normal(mean := 'Inf', stddev := 'Inf');
+ random_normal
+---------------
+           NaN
+(1 row)
+

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
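[Editor's note: the asymmetry between the last two results has a plausible IEEE 754 explanation, assuming random_normal reduces to the conventional mean + stddev * z transform of a standard normal draw z; whether the patch is implemented exactly this way is an assumption.]

```python
import math

def scaled_draw(mean: float, stddev: float, z: float) -> float:
    # Conventional transform of a standard normal draw z.
    return mean + stddev * z

inf = math.inf
# mean = -Inf, stddev = Inf: with a negative draw both terms are -Inf,
# so the sum stays -Infinity...
assert scaled_draw(-inf, inf, -0.5) == -inf
# ...but with a positive draw it is -Inf + Inf, which IEEE 754 defines
# as NaN:
assert math.isnan(scaled_draw(-inf, inf, 0.5))
# Likewise mean = Inf, stddev = Inf with a negative draw is Inf + (-Inf):
assert math.isnan(scaled_draw(inf, inf, -0.5))
```

So whether an infinite mean/stddev yields an infinity or a NaN would just depend on the sign of the random draw, which explains why neither result looks principled.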
Re: fixing CREATEROLE
> On Nov 28, 2022, at 12:33 PM, Mark Dilger > wrote: > >> Isn't this just GRANT .. WITH SET FALSE, INHERIT FALSE, ADMIN TRUE? That >> should allow role administration, without actually granting membership in >> that role, yet, right? > > Can you clarify what you mean here? Are you inventing a new syntax? Nevermind. After reading Robert's email, it's clear enough what you mean here. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: fixing CREATEROLE
> On Nov 28, 2022, at 12:08 PM, walt...@technowledgy.de wrote: > > Isn't this just GRANT .. WITH SET FALSE, INHERIT FALSE, ADMIN TRUE? That > should allow role administration, without actually granting membership in > that role, yet, right?

Can you clarify what you mean here? Are you inventing a new syntax?

+GRANT bob TO alice WITH SET FALSE, INHERIT FALSE, ADMIN TRUE;
+ERROR:  syntax error at or near "SET"
+LINE 1: GRANT bob TO alice WITH SET FALSE, INHERIT FALSE, ADMIN TRUE...

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: fixing CREATEROLE
> On Nov 28, 2022, at 11:34 AM, David G. Johnston > wrote: > > No Defaults needed: David J., Mark?, Tom?

As Robert has the patch organized, I think defaults are needed, but I see that as a strike against the patch.

> Defaults needed - attached to role directly: Robert > Defaults needed - defined within Default Privileges: Walther? > The capability itself seems orthogonal to the rest of the patch to track > these details better. I think we can "Fix CREATEROLE" without any feature > regarding optional default behaviors and would suggest this patch be so > limited and that another thread be started for discussion of (assuming a > default specifying mechanism is wanted overall) how it should look. Let's > not let a usability debate distract us from fixing a real problem.

In Robert's initial email, he wrote, "It seems to me that the root of any fix in this area must be to change the rule that CREATEROLE can administer any role whatsoever." The obvious way to fix that is to revoke that rule and instead automatically grant ADMIN OPTION to a creator over any role they create. That's problematic, though, because as things stand, ADMIN OPTION is granted to somebody by granting them membership in the administered role WITH ADMIN OPTION, so membership in the role and administration of the role are conflated.

Robert's patch tries to deal with the (possibly unwanted) role membership by setting up defaults to mitigate the effects, but that is more confusing to me than just de-conflating role membership from role administration, and giving role creators administration over roles they create, without in so doing giving them role membership.

I don't recall enough details about how hard it is to de-conflate role membership from role administration; maybe that's a non-starter for reasons that escape me at the moment. I expect Robert has already contemplated that idea and instead proposed this patch for good reasons. Robert?
— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: fixing CREATEROLE
> On Nov 23, 2022, at 12:04 PM, Robert Haas wrote: > >> But if that's the case, did I misunderstand upthread that these are >> properties the superuser specifies about Alice? Can Alice just set these >> properties about herself, so she gets the behavior she wants? I'm confused >> now about who controls these settings. > > Because they are role-level properties, they can be set by whoever has > ADMIN OPTION on the role. That always includes every superuser, and it > never includes Alice herself (except if she's a superuser). It could > include other users depending on the system configuration. For > example, under this proposal, the superuser might create alice and set > her account to CREATEROLE, configuring the INHERITCREATEDROLES and > SETCREATEDROLES properties on Alice's account according to preference. > Then, alice might create another user, say bob, and make him > CREATEROLE as well. In such a case, either the superuser or alice > could set these properties for role bob, because alice enjoys ADMIN > OPTION on role bob. Ok, so the critical part of this proposal is that auditing tools can tell when Alice circumvents these settings. Without that bit, the whole thing is inane. Why make Alice jump through hoops that you are explicitly allowing her to jump through? Apparently the answer is that you can point a high speed camera at the hoops. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: fixing CREATEROLE
> On Nov 23, 2022, at 11:02 AM, Robert Haas wrote: > > For me, this > clearly falls into the "good" category: it's configuration that you > put into the database that makes things happen the way you want, not a > behavior-changing setting that comes along and ruins somebody's day. I had incorrectly imagined that if the bootstrap superuser granted CREATEROLE to Alice with particular settings, those settings would limit the things that Alice could do when creating role Bob, specifically limiting how much she could administer/inherit/set role Bob thereafter. Apparently, your proposal only configures what happens by default, and Alice can work around that if she wants to. But if that's the case, did I misunderstand upthread that these are properties the superuser specifies about Alice? Can Alice just set these properties about herself, so she gets the behavior she wants? I'm confused now about who controls these settings. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: fixing CREATEROLE
> ...significant improvement. The implicit grant that is > created when CREATE ROLE is issued has the bootstrap superuser as > grantor, because there is no other legal option, but then any further > grants performed by the CREATE ROLE user rely on that user having that > grant, and thus record the OID of the CREATEROLE user as the grantor, > not the bootstrap superuser. > > That, in turn, has a number of rather nice consequences. It means in > particular that the CREATEROLE user can't remove the implicit grant, > nor can they alter it. They are, after all, not the grantor, who is > the bootstrap superuser, nor do they any longer have the authority to > act as the bootstrap superuser. Thus, if you have two or more > CREATEROLE users running around doing stuff, you can tell who did > what. Every role that those users created is linked back to the > creating role in a way that the creator can't alter. A CREATEROLE user > is unable to contrive a situation where they no longer control a role > that they created. That also means that the superuser, if desired, can > revoke all membership grants performed by any particular CREATEROLE > user by revoking the implicit grants with CASCADE. > > But since this implicit grant has, and must have, the bootstrap > superuser as grantor, it is also only reasonable that superusers get > to determine what options are used when creating that grant, rather > than leaving that up to the CREATEROLE user. Yes, this all makes sense, but does it entail that the CREATE ROLE command must behave differently on the basis of a setting? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: fixing CREATEROLE
> On Nov 22, 2022, at 2:02 PM, Robert Haas wrote: > >> Patch 0004 feels like something that won't get committed. The >> INHERITCREATEDROLES and SETCREATEDROLES in 0004 seems clunky. > > I think role properties are kind of clunky in general, the way we've > implemented them in PostgreSQL, but I don't really see why these are > worse than anything else. I think we need some way to control the > behavior, and I don't really see a reasonable place to put it other > than a per-role property. And if we're going to do that then they > might as well look like the other properties that we've already got. > > Do you have a better idea? Whatever behavior is to happen in the CREATE ROLE statement should be spelled out in that statement. "CREATE ROLE bob WITH INHERIT false WITH SET false" doesn't seem too unwieldy, and has the merit that it can be read and understood without reference to hidden parameters. Forcing this to be explicit should be safer if these statements ultimately make their way into dump/restore scripts, or into logical replication. That's not to say that I wouldn't rather that it always work one way or always the other. It's just to say that I don't want it to work differently based on some poorly advertised property of the role executing the command. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: fixing CREATEROLE
> On Nov 21, 2022, at 12:39 PM, Robert Haas wrote: > > I have drafted a few patches to try to improve the situation. The 0001 and 0002 patches appear to be uncontroversial refactorings. Patch 0003 looks on-point and a move in the right direction. The commit message in that patch is well written. Patch 0004 feels like something that won't get committed. The INHERITCREATEDROLES and SETCREATEDROLES in 0004 seem clunky. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Unstable regression test for contrib/pageinspect
> On Nov 20, 2022, at 12:37 PM, Tom Lane wrote: > > contrib/amcheck and contrib/pg_visibility are also using > DISABLE_PAGE_SKIPPING, so I wonder if they have similar hazards. > I haven't seen them fall over, though.

In the amcheck regression test case, it's because the test isn't sensitive to whether the freeze actually happens. You can comment out that line, and the only test difference is the comment:

@@ -108,8 +108,8 @@
 ERROR:  ending block number must be between 0 and 0
 SELECT * FROM verify_heapam(relation := 'heaptest', startblock := 1, endblock := 11000);
 ERROR:  starting block number must be between 0 and 0
--- Vacuum freeze to change the xids encountered in subsequent tests
-VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) heaptest;
+-- -- Vacuum freeze to change the xids encountered in subsequent tests
+-- VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) heaptest;
 -- Check that valid options are not rejected nor corruption reported
 -- for a non-empty frozen table
 SELECT * FROM verify_heapam(relation := 'heaptest', skip := 'none');

The amcheck TAP test is sensitive to commenting out the freeze, though:

t/001_verify_heapam.pl .. 42/?
#   Failed test 'all-frozen corrupted table skipping all-frozen'
#   at t/001_verify_heapam.pl line 58.
#          got: '0|3||line pointer redirection to item at offset 21840 exceeds maximum offset 38
# 0|4||line pointer to page offset 21840 with length 21840 ends beyond maximum page offset 8192
# 0|5||line pointer redirection to item at offset 0 precedes minimum offset 1
# 0|6||line pointer length 0 is less than the minimum tuple header size 24
# 0|7||line pointer to page offset 15 is not maximally aligned
# 0|8||line pointer length 15 is less than the minimum tuple header size 24'
#     expected: ''
t/001_verify_heapam.pl .. 211/?
# Looks like you failed 1 test of 272.
t/001_verify_heapam.pl ..
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/272 subtests
t/002_cic.pl ok
t/003_cic_2pc.pl ok

Test Summary Report
-------------------
t/001_verify_heapam.pl (Wstat: 256 (exited 1) Tests: 272 Failed: 1)
  Failed test:  80
  Non-zero exit status: 1
Files=3, Tests=280, 10 wallclock secs ( 0.05 usr 0.02 sys + 3.84 cusr 3.10 csys = 7.01 CPU)
Result: FAIL
make: *** [check] Error 1

But the TAP test also disables autovacuum, so a background auto-analyze shouldn't be running. Maybe that's why you haven't seen amcheck fall over?

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Tables not getting vacuumed in postgres
> On Nov 8, 2022, at 5:21 AM, Karthik Jagadish (kjagadis) > wrote: > > I didn’t get your point dead tuples are visible to transaction means? > Vacuuming job is to remove dead tuples right?

Please see https://www.2ndquadrant.com/en/blog/when-autovacuum-does-not-vacuum/ for more information about your question. Specifically, you might look at the third section down, "Long transactions", which starts with "So, if the table is vacuumed regularly, surely it can’t accumulate a lot of dead rows, right?" You might benefit from reading the entire article rather than skipping down to that section. I hope it helps.

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Privileges on PUBLICATION
> On Nov 4, 2022, at 12:28 AM, Antonin Houska wrote: > > I thought about the whole concept a bit more and I doubt if the PUBLICATION > privilege is the best approach. In particular, the user specified in CREATE > SUBSCRIPTION ... CONNECTION ... (say "subscription user") needs to have SELECT > privilege on the tables replicated. So if the DBA excludes some columns from > the publication's column list and sets the (publication) privileges in such a > way that the user cannot get the column values via other publications, the > user still can connect to the database directly and get values of the excluded > columns. > > As an alternative to the publication privileges, I think that the CREATE > SUBSCRIPTION command could grant ACL_SELECT automatically to the subscription > user on the individual columns contained in the publication column list, and > DROP SUBSCRIPTION would revoke that privilege. > > Of course a question is what to do if the replication user already has that > privilege on some columns: either the CREATE SUBSCRIPTION command should raise > ERROR, or we should introduce a new privilege (e.g. ACL_SELECT_PUB) for this > purpose, which would effectivelly be ACL_SELECT, but hidden from the users of > GRANT / REVOKE. > > If this approach was taken, the USAGE privilege on schema would need to be > granted / revoked in a similar way. When you talk about a user needing to have privileges, it sounds like you mean privileges on the publishing database. But then you talk about CREATE SUBSCRIPTION granting privileges, which would necessarily be on the subscriber database. Can you clarify what you have in mind? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Incorrect comment regarding command completion tags
> On Oct 13, 2022, at 2:56 PM, David Rowley wrote: > > I was looking at the code in EndCommand() and noticed a comment > talking about some Asserts which didn't seem to exist in the code. > The comment also talks about LastOid which looks like the name of a > variable that's nowhere to be seen. > > It looks like the Asserts did exists when the completion tag patch was > being developed [1] but they disappeared somewhere along the line and > the comment didn't get an update before 2f9661311 went in. > > In the attached, I rewrote the comment to remove mention of the > Asserts. I also tried to form the comment in a way that's more > understandable about why we always write a "0" in "INSERT 0 ". Your wording is better. +1 — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should setting TupleTableSlotOps get_heap_tuple=NULL break INSERT..ON CONFLICT DO UPDATE..RETURNING?
> On Sep 29, 2022, at 9:22 AM, Andres Freund wrote: > > I would assume that this can be avoided by the tuple slot implementation, but > without seeing what precisely you did in your pile slot...

"pile" is just a copy of "heap" placed into an extension with a slightly smarter version of s/heap/pile/g performed across the sources. It is intended to behave exactly as heap does. Without disabling the get_heap_tuple function, it passes a wide variety of the regression/isolation/tap tests. To test the claim made in the TupleTableSlotOps code comments, I disabled that one function:

    /*
     * Return a heap tuple "owned" by the slot. It is slot's responsibility to
     * free the memory consumed by the heap tuple. If the slot can not "own" a
     * heap tuple, it should not implement this callback and should set it as
     * NULL.
     */
    HeapTuple (*get_heap_tuple) (TupleTableSlot *slot);

That comment suggests that I do not need to keep a copy of the heap tuple, and per the next comment:

    /*
     * Return a copy of heap tuple representing the contents of the slot. The
     * copy needs to be palloc'd in the current memory context. The slot
     * itself is expected to remain unaffected. It is *not* expected to have
     * meaningful "system columns" in the copy. The copy is not be "owned" by
     * the slot i.e. the caller has to take responsibility to free memory
     * consumed by the slot.
     */
    HeapTuple (*copy_heap_tuple) (TupleTableSlot *slot);

I do not need to keep a copy of the "system columns". But clearly this doesn't work. When get_heap_tuple=NULL, the AM's tuple_update is at liberty to free the update tuple (per the above documentation) and later return a copy of the slot's tuple sans any "system columns" (also per the above documentation) and that's when the core code breaks. It's not the TAM that is broken here, not according to the interface's documentation as I read it. Am I reading it wrong?

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
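[Editor's note: the contract being described can be modeled in miniature. This is a purely illustrative Python toy, not the real C interface: if get_heap_tuple is unimplemented, the only way to get a tuple out of the slot is copy_heap_tuple, whose copy is documented as carrying no meaningful system columns, so any core caller that then consults ctid/xmin on the result is broken by design.]

```python
class ToySlot:
    """Toy stand-in for a TupleTableSlot whose AM opted out of
    get_heap_tuple, leaving copy_heap_tuple as the only way out."""

    def __init__(self, values, sys_cols):
        self._values = values
        self._sys_cols = sys_cols     # e.g. {"ctid": (0, 1), "xmin": 742}
        self.get_heap_tuple = None    # callback set to NULL, per the docs

    def copy_heap_tuple(self):
        # Per the interface comment, the copy is *not* expected to carry
        # meaningful system columns.
        return {"values": self._values, "sys": None}

def fetch_tuple(slot):
    # Mirrors a caller that prefers the owned tuple, falling back to a copy.
    if slot.get_heap_tuple is not None:
        return slot.get_heap_tuple()
    return slot.copy_heap_tuple()

slot = ToySlot(values=(1, "Bar"), sys_cols={"ctid": (0, 1)})
tup = fetch_tuple(slot)
assert tup["values"] == (1, "Bar")
assert tup["sys"] is None   # any caller needing ctid/xmin here is out of luck
```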
Should setting TupleTableSlotOps get_heap_tuple=NULL break INSERT..ON CONFLICT DO UPDATE..RETURNING?
crosscheck=0x, wait=true, tmfd=0x7ffee3aa3cf8, lockmode=0x7ffee3aa3ce8, update_indexes=0x7ffee3aa3ce6) at tableam.h:1509:9
frame #8: 0x00010c518ec7 postgres`ExecUpdate(mtstate=0x7f9edd0acd40, resultRelInfo=0x7f9edd0acf58, tupleid=0x7ffee3aa3f30, oldtuple=0x, slot=0x7f9edb02c550, planSlot=0x7f9edd0ad540, epqstate=0x7f9edd0ace28, estate=0x7f9edd0abd20, canSetTag=true) at nodeModifyTable.c:1809:12
frame #9: 0x00010c51b187 postgres`ExecOnConflictUpdate(mtstate=0x7f9edd0acd40, resultRelInfo=0x7f9edd0acf58, conflictTid=0x7ffee3aa3f30, planSlot=0x7f9edd0ad540, excludedSlot=0x7f9edb02ec40, estate=0x7f9edd0abd20, canSetTag=true, returning=0x7ffee3aa3f18) at nodeModifyTable.c:2199:15
frame #10: 0x00010c518453 postgres`ExecInsert(mtstate=0x7f9edd0acd40, resultRelInfo=0x7f9edd0acf58, slot=0x7f9edb02ec40, planSlot=0x7f9edd0ad540, estate=0x7f9edd0abd20, canSetTag=true) at nodeModifyTable.c:870:10
frame #11: 0x00010c516fd4 postgres`ExecModifyTable(pstate=0x7f9edd0acd40) at nodeModifyTable.c:2583:12
frame #12: 0x00010c4d4862 postgres`ExecProcNodeFirst(node=0x7f9edd0acd40) at execProcnode.c:464:9
frame #13: 0x00010c4cc6d2 postgres`ExecProcNode(node=0x7f9edd0acd40) at executor.h:257:9
frame #14: 0x00010c4c7d21 postgres`ExecutePlan(estate=0x7f9edd0abd20, planstate=0x7f9edd0acd40, use_parallel_mode=false, operation=CMD_INSERT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0x7f9edc00b458, execute_once=true) at execMain.c:1551:10
frame #15: 0x00010c4c7bf1 postgres`standard_ExecutorRun(queryDesc=0x7f9edc00b4f0, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:361:3
frame #16: 0x00010c4c7982 postgres`ExecutorRun(queryDesc=0x7f9edc00b4f0, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:305:3
frame #17: 0x00010c7930dc postgres`ProcessQuery(plan=0x7f9edb021fc0, sourceText="WITH aaa AS (SELECT 1 AS a, 'Foo' AS b) INSERT INTO upsert_test\n VALUES (1, 'Bar') ON CONFLICT(a)\n DO UPDATE SET (b, a) = (SELECT b, a FROM aaa)
RETURNING *;", params=0x, queryEnv=0x, dest=0x7f9edc00b458, qc=0x7ffee3aa4408) at pquery.c:160:2
frame #18: 0x00010c791f07 postgres`PortalRunMulti(portal=0x7f9edd028920, isTopLevel=true, setHoldSnapshot=true, dest=0x7f9edc00b458, altdest=0x00010cbc8890, qc=0x7ffee3aa4408) at pquery.c:1274:5
frame #19: 0x00010c791835 postgres`FillPortalStore(portal=0x7f9edd028920, isTopLevel=true) at pquery.c:1023:4
frame #20: 0x00010c7913ee postgres`PortalRun(portal=0x7f9edd028920, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x7f9edb0220b0, altdest=0x7f9edb0220b0, qc=0x7ffee3aa46a0) at pquery.c:760:6
frame #21: 0x00010c78c394 postgres`exec_simple_query(query_string="WITH aaa AS (SELECT 1 AS a, 'Foo' AS b) INSERT INTO upsert_test\n VALUES (1, 'Bar') ON CONFLICT(a)\n DO UPDATE SET (b, a) = (SELECT b, a FROM aaa) RETURNING *;") at postgres.c:1213:10
frame #22: 0x00010c78b3f7 postgres`PostgresMain(argc=1, argv=0x7ffee3aa49d0, dbname="contrib_regression", username="mark.dilger") at postgres.c:4496:7
frame #23: 0x00010c692a59 postgres`BackendRun(port=0x7f9edc804080) at postmaster.c:4530:2
frame #24: 0x00010c691fa5 postgres`BackendStartup(port=0x7f9edc804080) at postmaster.c:4252:3
frame #25: 0x00010c690d0e postgres`ServerLoop at postmaster.c:1745:7
frame #26: 0x00010c68e23a postgres`PostmasterMain(argc=8, argv=0x7f9edac06440) at postmaster.c:1417:11
frame #27: 0x00010c565249 postgres`main(argc=8, argv=0x7f9edac06440) at main.c:209:3
frame #28: 0x7fff70d5ecc9 libdyld.dylib`start + 1
frame #29: 0x7fff70d5ecc9 libdyld.dylib`start + 1

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: why can't a table be part of the same publication as its schema
> On Sep 20, 2022, at 12:36 PM, Jonathan S. Katz wrote: > > This behavior exists "FOR ALL TABLES" without the "IN SCHEMA" qualifier. This > was discussed multiple times on the original thread[1].

Yes, nobody is debating that as far as I can see. And I do take your point that this stuff was discussed in other threads quite a while back.

> I tried to diligently read the sections where we talk about granting + > privileges[2][3] to see what it says about "ALL * IN SCHEMA". Unless I missed > it, and I read through it twice, it does not explicitly state whether or not > "GRANT" applies to all objects at only that given moment, or to future > objects of that type which are created in that schema. Maybe the behavior is > implied or is part of the standard, but it's not currently documented.

Interesting. Thanks for that bit of research.

> We do link to "ALTER DEFAULT PRIVILEGES" at the bottom of the GRANT[2] docs, > but we don't give any indication as to why. > > (This is also to say we should document in GRANT that ALL * IN SCHEMA does > not apply to future objects;

Yes, I agree this should be documented.

> if you need that behavior use ALTER DEFAULT PRIVILEGES. Separate thread :) > > I understand there is a risk of confusion of the similar grammar across > commands, but the current command in logical replication has this is building > on the existing behavior.

I don't complain that it is building on the existing behavior. I'm *only* concerned about the keywords we're using for this. Consider the following:

-- AS ADMIN
CREATE USER bob NOSUPERUSER;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA foo TO bob;

SET ROLE bob;
CREATE PUBLICATION bobs_pub FOR ALL TABLES IN SCHEMA foo;

We're going to have that fail in pg15 because the FOR ALL TABLES IN SCHEMA option is reserved to superusers. But we agreed that was a stop-gap solution that we'd potentially loosen in the future.
Certainly we'll need wiggle room in the syntax to perform that loosening:

--- Must be superuser for this in pg15, and in subsequent releases.
CREATE PUBLICATION bobs_pub FOR ALL FUTURE TABLES IN SCHEMA foo;

--- Not supported in pg15, but reserved for some future pg versions to allow
--- non-superusers to create publications on tables currently in schema foo,
--- assuming they have sufficient privileges on those tables
CREATE PUBLICATION bobs_pub FOR ALL TABLES IN SCHEMA foo;

Doing it this way makes the syntax consistent between the GRANT...TO bob and the CREATE PUBLICATION bobs_pub. Surely this makes more sense? I'm not a huge fan of the keyword "FUTURE" here, but I found a reference to another database that uses that keyword for what I think is a similar purpose. We should choose *something* for this, though, if we want things to be rational going forward.

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: why can't a table be part of the same publication as its schema
> On Sep 19, 2022, at 8:03 PM, Jonathan S. Katz wrote: > > "When a partitioned table is added to a publication, all of its existing and > future partitions are implicitly considered to be part of the > publication."[10] > > Additionally, this is the behavior that is already present in "FOR ALL > TABLES": > > "Marks the publication as one that replicates changes for all tables in the > database, including tables created in the future."[10] > > I don't think we should change this behavior that's already in logical > replication. The existing behavior in logical replication doesn't have any "IN SCHEMA" qualifiers. > While I understand the reasons why "GRANT ... ALL TABLES IN SCHEMA" has a > different behavior (i.e. it's not applied to future objects) and do not > advocate to change it, I have personally been affected where I thought a > permission would be applied to all future objects, only to discover > otherwise. I believe it's more intuitive to think that "ALL" applies to > "everything, always." The conversation is focusing on what "ALL TABLES" means, but the ambiguous part is what "IN SCHEMA" means. In GRANT it means "currently in schema, computed now." We are about to create confusion by adding the "IN SCHEMA" phrase to publication commands meaning "later in schema, computed then." A user who diligently consults the documentation for one command to discover what "IN SCHEMA" means may fairly, but wrongly, assume it means the same thing in another command. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: why can't a table be part of the same publication as its schema
> On Sep 10, 2022, at 4:17 PM, Robert Haas wrote: > >>> I don't understand why we >>> used this ALL TABLES IN SCHEMA language. >> >> The conversation, as I recall, was that "ADD SCHEMA foo" would only mean all >> tables in foo, until publication of other object types became supported, at >> which point "ADD SCHEMA foo" would suddenly mean more than it did before. >> People might find that surprising, so the "ALL TABLES IN" was intended to >> future-proof against surprising behavioral changes. > > If I encountered this syntax in a vacuum, that's not what I would > think. I would think that ADD ALL TABLES IN SCHEMA meant add all the > tables in the schema to the publication one by one as individual > objects Yes, it appears the syntax was chosen to avoid one kind of confusion, but created another kind. Per the docs on this feature: FOR ALL TABLES IN SCHEMA Marks the publication as one that replicates changes for all tables in the specified list of schemas, including tables created in the future. Like you, I wouldn't expect that definition, given the behavior of GRANT with respect to the same grammatical construction. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: why can't a table be part of the same publication as its schema
> On Sep 9, 2022, at 8:18 AM, Robert Haas wrote: > > Things might be clearer if we'd made the syntax "ALTER PUBLICATION p1 > { ADD | DROP } { TABLE | SCHEMA } name". I don't understand why we > used this ALL TABLES IN SCHEMA language. The conversation, as I recall, was that "ADD SCHEMA foo" would only mean all tables in foo, until publication of other object types became supported, at which point "ADD SCHEMA foo" would suddenly mean more than it did before. People might find that surprising, so the "ALL TABLES IN" was intended to future-proof against surprising behavioral changes. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: predefined role(s) for VACUUM and ANALYZE
> On Sep 7, 2022, at 4:09 PM, Stephen Frost wrote: > > Calling this a redesign is over-stating things, imv … and I’d much rather > have the per-relation granularity than predefined roles for this, so there is > that to consider too, perhaps. Ok, now I'm a bit lost. If I want to use Nathan's feature to create a role to vacuum and analyze my database on a regular basis, how does per-relation granularity help me? If somebody creates a new table and doesn't grant those privileges to the role, doesn't that break the usage case? To me, per-relation granularity sounds useful, but orthogonal, to this feature. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: predefined role(s) for VACUUM and ANALYZE
> On Sep 7, 2022, at 3:21 PM, Nathan Bossart wrote: > > There was some previous discussion around adding a pg_maintenance role that > could perform all of these commands [0]. I didn't intend to draw a line > around VACUUM and ANALYZE. Those are just the commands I started with. > If/when there are many of these roles, it might make sense to create a > pg_maintenance role that is a member of pg_vacuum_all_tables, > pg_analyze_all_tables, etc. Thank you, that sounds fair enough. It seems you've been pushed to make the patch-set more complicated, and now we're debating privilege bits, which seems pretty far off topic. I may be preaching to the choir here, but wouldn't it work to commit new roles pg_vacuum_all_tables and pg_analyze_all_tables with checks like you had in the original patch of this thread? That wouldn't block the later addition of finer grained controls allowing users to grant VACUUM or ANALYZE per-relation, would it? Something like what Stephen is requesting, and what you did with new privilege bits for VACUUM and ANALYZE could still be added, unless I'm missing something. I'd hate to see your patch set get further delayed by things that aren't logically prerequisites. The conversation upthread was useful to determine that they aren't prerequisites, but if anybody wants to explain to me why they are — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: predefined role(s) for VACUUM and ANALYZE
> On Jul 22, 2022, at 1:37 PM, Nathan Bossart wrote: > > The primary motivation for this is to continue chipping away at things that > require special privileges or even superuser. VACUUM and ANALYZE typically > require table ownership, database ownership, or superuser. And only > superusers can VACUUM/ANALYZE shared catalogs. A predefined role for these > operations would allow delegating such tasks (e.g., a nightly VACUUM > scheduled with pg_cron) to a role with fewer privileges. > > The attached patch adds a pg_vacuum_analyze role that allows VACUUM and > ANALYZE commands on all relations. Granting membership in a role that can VACUUM and ANALYZE any relation seems to grant a subset of a more general category, the ability to perform modifying administrative operations on a relation without necessarily being able to read or modify the logical contents of that relation. That more general category would seem to also include CLUSTER, REINDEX, REFRESH MATERIALIZED VIEW and more broadly ALTER SUBSCRIPTION ... REFRESH PUBLICATION and ALTER DATABASE ... REFRESH COLLATION VERSION. These latter operations may be less critical to database maintenance than is VACUUM, but arguably ANALYZE isn't as critical as is VACUUM, either. Assuming for the sake of argument that we should create a role something like you propose, can you explain why we should draw the line around just VACUUM and ANALYZE? I am not arguing for including these other commands, but don't want to regret having drawn the line in the wrong place when later we decide to add more roles like the one you are proposing. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock
> On Aug 9, 2022, at 7:26 PM, Andres Freund wrote: > > The relevant code triggering it: > > newbuf = XLogInitBufferForRedo(record, 1); > _hash_initbuf(newbuf, xlrec->new_bucket, xlrec->new_bucket, > xlrec->new_bucket_flag, true); > if (!IsBufferCleanupOK(newbuf)) > elog(PANIC, "hash_xlog_split_allocate_page: failed to acquire > cleanup lock"); > > Why do we just crash if we don't already have a cleanup lock? That can't be > right. Or is there supposed to be a guarantee this can't happen? Perhaps the code assumes that when the xl_hash_split_allocate_page record was written, the new_bucket field referred to an unused page, and so during replay it should also refer to an unused page which, being unused, nobody should have pinned. But at least in heap we sometimes pin unused pages just long enough to examine them and to see that they are unused. Maybe something like that is happening here? I'd be curious to see the count returned by BUF_STATE_GET_REFCOUNT(LockBufHdr(newbuf)) right before this panic. If it's just 1, then it's not another backend, but our own, and we'd want to debug why we're pinning the same page twice (or more) while replaying wal. Otherwise, maybe it's a race condition with some other process that transiently pins a buffer and occasionally causes this code to panic? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Memory leak fix in psql
> On Jul 19, 2022, at 2:02 AM, tanghy.f...@fujitsu.com wrote: > > I think there is a newly introduced memory leak in your patch d2d3547. I agree. Thanks for noticing, and for the patch! > Try to fix it in the attached patch. > Kindly to have a check. This looks ok, but comments down-thread seem reasonable, so I suspect a new patch will be needed. Would you like to author it, or would you prefer that I, as the guilty party, do so? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Modest proposal to extend TableAM API for controlling cluster commands
> On Jun 16, 2022, at 12:27 AM, Andres Freund wrote: > >> I don't think I should have to do so. It's like saying, "I think I should >> have freedom of speech", and you say, "well, I'm not sure about that; tell >> me what you want to say, and I'll decide if I'm going to let you say it".' >> That's not freedom. I think TAM authors should have broad discretion over >> anything that the core system doesn't have a compelling interest in >> controlling. > > That's insultingly ridiculous. You can say, do whatever you want, but that > doesn't mean I have to be convinced by it (i.e. +1 adding an API) - that'd be > compelled speech, to go with your image... Indeed it would be compelled speech, and I'm not trying to compel you, only to convince you. And my apologies if it came across as insulting. I have a lot of respect for you, as do others at EDB, given the invariably complimentary comments I've heard others express. > It's utterly normal to be asked what the use case for a new API is when > proposing one. It seems like we're talking on two different levels. I've said what the use case is, which is to implement a TAM that doesn't benefit from cluster or vacuum full, without the overhead of needlessly copying itself, and without causing argumentless VACUUM FULL commands to fail. I'm *emphatically* not asking the community to accept the TAM back as a patch. The freedom I'm talking about is the freedom to design and implement such a third-party TAM without seeking community approval of the TAM's merits. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Modest proposal to extend TableAM API for controlling cluster commands
> On Jun 16, 2022, at 12:28 AM, David G. Johnston > wrote: > > But you are basically proposing a reworking of the existing system into one > that makes pretty much any SQL Command something that a TAM can treat as > being an optional request by the user; Yes, and I think I'm perfectly correct in asking for that. If the standard you are proposing (albeit as Devil's Advocate) were applied to filesystems, nobody could ever implement /dev/null, on the argument that users have a reasonable expectation that data they write to a file will be there for them to read later. Yet Michael Paquier wrote a blackhole TAM, and although I don't find it terribly useful, I do think it's a reasonable thing for somebody to be able to write. > whereas today the system presumes that the implementations will respond to > these commands. That depends on what you mean by "respond to". A TAM which implements a tamper-resistant audit log responds to update and delete commands with a "permission denied" error. A TAM which implements running aggregates implements insert commands by computing and inserting a new running aggregate value and reclaiming space from old running aggregate values when no transaction could any longer see them. You can do this stuff at a higher level with hooks, functions, triggers, and rules, inserting into a heap, and having to periodically vacuum, but why would you want to? That's almost guaranteed to be slower, maybe even orders of magnitude slower. > And to make this change without any core code having such a need. The core code won't have any such need, because the core code is content with heap, and the API already accommodates heap. It seems Andres moved the project in the direction of allowing custom TAMs when he created the Table AM interface, and I'm quite pleased that he did so, but it doesn't allow nearly enough flexibility to do all the interesting things a TAM could otherwise do. 
Consider for example that the multi_insert hook uses a BulkInsertStateData parameter, defined as:

typedef struct BulkInsertStateData
{
    BufferAccessStrategy strategy;    /* our BULKWRITE strategy object */
    Buffer      current_buf;          /* current insertion target page */
} BulkInsertStateData;

which is just the structure heap would want, but what about a TAM that wants to route different tuples to different pages? The "current_buf" isn't enough information, and there's no void *extra field, so you're just sunk. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
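[Editor's sketch] For illustration, here is a standalone sketch of what an extensible variant might look like. The "extra" field and the RoutingState type are hypothetical — nothing like them exists in the current API — and the stand-in typedefs only mimic the real PostgreSQL ones.

```c
#include <stdio.h>

typedef void *BufferAccessStrategy;   /* stand-in for the real typedef */
typedef int Buffer;                   /* stand-in for the real typedef */

/* Hypothetical extensible variant of BulkInsertStateData: an opaque
 * "extra" slot would let a TAM that routes tuples to different pages
 * carry its own state through the bulk-insert machinery.  This struct
 * is a sketch, not a proposed or actual PostgreSQL definition. */
typedef struct BulkInsertStateData
{
    BufferAccessStrategy strategy;   /* BULKWRITE strategy object */
    Buffer      current_buf;         /* current insertion target page */
    void       *extra;               /* hypothetical TAM-private state */
} BulkInsertStateData;

/* Illustrative TAM-private payload: one target buffer per routing bucket. */
typedef struct RoutingState
{
    int         nbuckets;
    Buffer      bucket_buf[4];
} RoutingState;

/* Pick an insertion target based on TAM-private routing state rather
 * than the single current_buf. */
static Buffer
route_tuple(BulkInsertStateData *bistate, int hash)
{
    RoutingState *rs = (RoutingState *) bistate->extra;

    return rs->bucket_buf[hash % rs->nbuckets];
}
```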
Re: Modest proposal to extend TableAM API for controlling cluster commands
> On Jun 15, 2022, at 8:50 PM, David G. Johnston > wrote: > > On Wed, Jun 15, 2022 at 8:18 PM Andres Freund wrote: > > If a simple callback like > > relation_supports_cluster(Relation rel) is too simplistic > > Seems like it should be called: > relation_supports_compaction[_by_removal_of_interspersed_dead_tuples] Ok. > Basically, if the user tells the table to make itself smaller on disk by > removing dead tuples, should we support the case where the Table AM says: > "Sorry, I cannot do that"? I submit that's the only sane thing to do if the table AM already guarantees that the table will always be fully compacted. There is no justification for forcing the table contents to be copied without benefit. > If yes, then naming the table explicitly should elicit an error. Having the > table chosen implicitly should provoke a warning. For ALTER TABLE CLUSTER > there should be an error: which makes the implicit CLUSTER command a > non-factor. I'm basically fine with how you would design the TAM, but I'm going to argue again that the core project should not dictate these decisions. The TAM's relation_supports_compaction() function can return true/false, or raise an error. If raising an error is the right action, the TAM can do that. If the core code makes that decision, the TAM can't override, and that paints TAM authors into a corner. > However, given that should the table structure change it is imperative that > the Table AM be capable of producing a new physical relation with the correct > data, which will have been compacted as a side-effect, it seems like, > explicit or implicit, expecting any Table AM to do that when faced with > Vacuum Full is reasonable. Which leaves deciding how to allow a table with a > given TAM to prevent itself from being added to the CLUSTER roster. And > decide whether an opt-out feature for implicit VACUUM FULL is something we > should offer as well. 
> > I'm doubtful that a TAM that is pluggable into the MVCC and WAL architecture > of PostgreSQL could avoid this basic contract between the system and its > users. How about a TAM that implements write-once, read-many logic? You get one multi-insert, and forever after you can't modify it (other than to drop the table, or perhaps to truncate it). That's a completely made-up-on-the-spot example, but it's not entirely without merit. You could avoid a lot of locking overhead when using such a table, since you'd know a priori that nobody else is modifying it. It could also be implemented with a smaller tuple header, since a lot of the header bytes in heap tuples are dedicated to tracking updates. You wouldn't need a per-row inserting transaction ID either, since you could just store one per table, knowing that all the rows were inserted in the same transaction. In what sense does this made-up TAM conflict with MVCC and WAL? It doesn't have all the features of heap, but that's not the same thing as violating MVCC or breaking WAL. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Extending USING [heap | mytam | yourtam] grammar and behavior
> On Jun 15, 2022, at 8:51 PM, Michael Paquier wrote: > > I am not sure to see why this would be something users would actually > use in prod. That means to pick up something else than what the > server thinks is the best default AM but where somebody does not want > to trust the default, while generating an error if specifying the > default AM in the USING NOT clause. Sorry for the lack of clarity. I do not suggest raising an error. If you say "USING NOT foo", and foo is the default table access method, then you get the same behavior as a "USING heap" would have gotten you; otherwise, you get the same behavior as not providing any USING clause at all. In future, we might want to create a list of fallback TAMs rather than just hardcoding "heap" as the one and only fallback, but I haven't run into an actual need for that. If you're wondering what "USING NOT heap" falls back to, I think that could error, or it could just use heap anyway. Whatever. That's why I'm still soliciting comments at this stage rather than posting a patch. > On top of that > default_table_access_method is user-settable. Yeah, but specifying a "USING foo" clause is also open to any user, so I don't see why this matters. "USING NOT foo" is just shorthand for checking the current default_table_access_method, and then either appending a "USING heap" clause or appending no clause. Since the user can do this anyway, what's the security implication in some syntactic sugar? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Modest proposal to extend TableAM API for controlling cluster commands
> On Jun 15, 2022, at 8:18 PM, Andres Freund wrote: > > Hi, > > On 2022-06-15 20:10:30 -0700, Mark Dilger wrote: >>> On Jun 15, 2022, at 7:30 PM, Andres Freund wrote: >>>> But it's up to the TAM what it wants to do with that boolean, if in fact it >>>> does anything at all based on that. A TAM could decide to do the exact >>>> opposite of what Heap-AM does and instead sort on VACUUM FULL but not sort >>>> on CLUSTER, or perhaps perform a randomized shuffle, or <insert some other >>>> behavior here>. >>> >>> That's bogus. Yes, an AM can do stupid stuff in a callback. But so what, >>> that's possible with all extension APIs. >> >> I don't think it's "stupid stuff". A major motivation, perhaps the only >> useful motivation, for implementing a TAM is to get a non-trivial >> performance boost (relative to heap) for some target workload, almost >> certainly at the expense of worse performance for another workload. To >> optimize any particular workload sufficiently to make it worth the bother, >> you've pretty much got to do something meaningfully different than what heap >> does. > > Sure. I just don't see what that has to do with doing something widely > differing in VACUUM FULL. Which AM can't support that? I guess there's some > where implementing the full visibility semantics isn't feasible, but that's > afaics OK. The problem isn't that the TAM can't do it. Any TAM can probably copy its contents verbatim from the OldHeap to the NewHeap. But it's really stupid to have to do that if you're not going to change anything along the way, and I think if you divorce your thinking from how heap-AM works sufficiently, you might start to see how other TAMs might have nothing useful to do at this step. So there's a strong motivation to not be forced into a "move all the data uselessly" step. I don't really want to discuss the proprietary details of any TAMs I'm developing, so I'll use some made-up examples. 
Imagine you have decided to trade off the need to vacuum against the cost of storing 64-bit XIDs. That doesn't mean that compaction isn't maybe still a good thing, but you don't need to vacuum for anti-wraparound purposes anymore. Now imagine you've also decided to trade off insert speed for select speed, and you sort the entire table on every insert, and to keep indexes happy, you keep an "externally visible TID" to "internal actual location" mapping in a separate fork. Let's say you also don't support UPDATE or DELETE at all. Those immediately draw an error, "not implemented by this TAM". At this point, you have a fully sorted and completely compacted table at all times. It's simply an invariant of the TAM. So, now what exactly is VACUUM FULL or CLUSTER supposed to do? Seems like the answer is "diddly squat", and yet you seem to propose requiring the TAM to do it. I don't like that. >>>> From the point-of-view of a TAM implementor, VACUUM FULL and CLUSTER are >>>> synonyms. Or am I missing something? >>> >>> The callback gets passed use_sort. So just implement it use_sort = false and >>> error out if use_sort = true? >> >> I'm not going to say that your idea is unreasonable for a TAM that you might >> choose to implement, but I don't see why that should be required of all TAMs >> anybody might ever implement. > >> The callback that gets use_sort is called from copy_table_data(). By that >> time, it's too late to avoid the
>>
>> /*
>>  * Open the relations we need.
>>  */
>> NewHeap = table_open(OIDNewHeap, AccessExclusiveLock);
>> OldHeap = table_open(OIDOldHeap, AccessExclusiveLock);
>>
>> code that has already happened in cluster.c's copy_table_data() function, >> and unless I raise an error, after returning from my TAM's callback, the >> cluster code will replace the old table with the new one. I'm left no >> choices but to copy my data over, lose my data, or abort the command. None >> of those are OK options for me. 
> > I think you need to do a bit more explaining of what you're actually trying to > achieve here. You're just saying "I don't want to", which doesn't really help > me to understand the set of useful options. I'm trying to opt out of cluster/vacfull. >> I'm open to different solutions. If a simple callback like >> relation_supports_cluster(Relation rel) is too simplistic, and more >> parameters with more context information is wanted, then fine, let's do >> that. > > FWIW, I want to go *simpler* if anything, not more complicated. I.e. make the > relation_copy_for_cluster optional. > > I stil
Re: Extending USING [heap | mytam | yourtam] grammar and behavior
> On Jun 15, 2022, at 8:51 PM, Michael Paquier wrote: > > However, > your problem is basically that you develop multiple AMs, but you want > to have regression tests that do checks across more than one table AM > at the same time. It is true that I test multiple table AMs at the same time, but that's a somewhat different concern. > Am I getting that right? Not exactly. > Why is a grammar > extension necessary for what looks like a test structure problem when > there are interdependencies across multiple AMs developped? Ok, I didn't want to get into my exact process, because it involves other changes that I don't expect -hackers to want. But basically what I do is: ./configure --with-default-tam=chicago && make && make check-world That fails for a few tests, and I manually change the create table statements in tests that are not chicago-compatible to "using not chicago". Then ./configure --with-default-tam=detroit && make && make check-world That fails for some other set of tests, but note that the tests with "using not chicago" are still using detroit in this second run. That wouldn't be true if I'd fixed up the tests in the first run "using heap". Then I can also add my own tests which might make some chicago backed tables plus some detroit backed tables and see how they interact. But that's superfluous to the issue of just trying to leverage the existing tests as much as I can without having to reinvent tests to cover "chicago", and then reinvent again to cover "detroit", and so forth. If you develop enough TAMs in parallel, and go with the "using heap" solution, you eventually have zero coverage for any of the TAMs, because you'll eventually be "using heap" in all the tables of all the tests. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Modest proposal to extend TableAM API for controlling cluster commands
> On Jun 15, 2022, at 7:30 PM, Andres Freund wrote: > >> It's effectively a synonym which determines whether the "bool use_sort" >> parameter of the table AM's relation_copy_for_cluster will be set. Heap-AM >> plays along and sorts or not based on that. > > Hardly a synonym if it behaves differently? I don't think this point is really worth arguing. We don't have to call it a synonym, and the rest of the discussion remains the same. >> But it's up to the TAM what it wants to do with that boolean, if in fact it >> does anything at all based on that. A TAM could decide to do the exact >> opposite of what Heap-AM does and instead sort on VACUUM FULL but not sort >> on CLUSTER, or perhaps perform a randomized shuffle, or <insert some other > behavior here>. > > That's bogus. Yes, an AM can do stupid stuff in a callback. But so what, > that's possible with all extension APIs. I don't think it's "stupid stuff". A major motivation, perhaps the only useful motivation, for implementing a TAM is to get a non-trivial performance boost (relative to heap) for some target workload, almost certainly at the expense of worse performance for another workload. To optimize any particular workload sufficiently to make it worth the bother, you've pretty much got to do something meaningfully different than what heap does. >> From the point-of-view of a TAM implementor, VACUUM FULL and CLUSTER are >> synonyms. Or am I missing something? > > The callback gets passed use_sort. So just implement it use_sort = false and > error out if use_sort = true? I'm not going to say that your idea is unreasonable for a TAM that you might choose to implement, but I don't see why that should be required of all TAMs anybody might ever implement. The callback that gets use_sort is called from copy_table_data(). By that time, it's too late to avoid the

/*
 * Open the relations we need.
 */
NewHeap = table_open(OIDNewHeap, AccessExclusiveLock);
OldHeap = table_open(OIDOldHeap, AccessExclusiveLock);

code that has already happened in cluster.c's copy_table_data() function, and unless I raise an error, after returning from my TAM's callback, the cluster code will replace the old table with the new one. I'm left no choices but to copy my data over, lose my data, or abort the command. None of those are OK options for me. I'm open to different solutions. If a simple callback like relation_supports_cluster(Relation rel) is too simplistic, and more parameters with more context information is wanted, then fine, let's do that. But I can't really complete my work with the interface as it stands now. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Modest proposal to extend TableAM API for controlling cluster commands
> On Jun 15, 2022, at 7:21 PM, Mark Dilger wrote: > >> If a user does that for a table that doesn't support clustering, well, I >> don't >> see what's gained by not erroring out. > > Perhaps they want to give the TAM information about which index to use for > sorting, on those occasions when the TAM's logic dictates that sorting is > appropriate, but not in response to a cluster command. I should admit that this is a bit hack-ish, but TAM authors haven't been left a lot of options here. Index AMs allow for custom storage parameters, but Table AMs don't, so getting information to the TAM about how to behave takes more than a little sleight of hand. Simon's proposal from a while back (don't have the link just now) to allow TAMs to define custom storage parameters would go some distance here. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Modest proposal to extend TableAM API for controlling cluster commands
> On Jun 15, 2022, at 7:14 PM, Andres Freund wrote: > > Hi, > > On 2022-06-15 19:07:50 -0700, Mark Dilger wrote: >>> On Jun 15, 2022, at 6:55 PM, Andres Freund wrote: >>> >>> I think nothing would happen in this case - only pre-clustered tables get >>> clustered in an argumentless CLUSTER. What am I missing? >> >> The "VACUUM FULL" synonym of "CLUSTER" doesn't depend on whether the target >> is pre-clustered > > VACUUM FULL isn't a synonym of CLUSTER. While a good bit of the implementation > is shared, VACUUM FULL doesn't order the table contents. I see no reason why > an AM shouldn't support VACUUM FULL? It's effectively a synonym which determines whether the "bool use_sort" parameter of the table AM's relation_copy_for_cluster will be set. Heap-AM plays along and sorts or not based on that. But it's up to the TAM what it wants to do with that boolean, if in fact it does anything at all based on that. A TAM could decide to do the exact opposite of what Heap-AM does and instead sort on VACUUM FULL but not sort on CLUSTER, or perhaps perform a randomized shuffle, or <insert some other behavior here>. From the point-of-view of a TAM implementor, VACUUM FULL and CLUSTER are synonyms. Or am I missing something? >> , and both will run against the table if the user has run an ALTER >> TABLE..CLUSTER ON. > > If a user does that for a table that doesn't support clustering, well, I don't > see what's gained by not erroring out. Perhaps they want to give the TAM information about which index to use for sorting, on those occasions when the TAM's logic dictates that sorting is appropriate, but not in response to a cluster command. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Modest proposal to extend TableAM API for controlling cluster commands
> On Jun 15, 2022, at 6:55 PM, Andres Freund wrote: > > I think nothing would happen in this case - only pre-clustered tables get > clustered in an argumentless CLUSTER. What am I missing? The "VACUUM FULL" synonym of "CLUSTER" doesn't depend on whether the target is pre-clustered, and both will run against the table if the user has run an ALTER TABLE..CLUSTER ON. Now, we could try to catch that latter command with a utility hook, but since the VACUUM FULL is still problematic, it seems cleaner to just add the callback I am proposing. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Modest proposal to extend TableAM API for controlling cluster commands
> On Jun 15, 2022, at 6:01 PM, Andres Freund wrote: > > Hi, > > On 2022-06-15 17:21:56 -0700, Mark Dilger wrote: >> While developing various Table Access Methods, I have wanted a callback for >> determining if CLUSTER (and VACUUM FULL) should be run against a table >> backed by a given TAM. The current API contains a callback for doing the >> guts of the cluster, but by that time, it's a bit too late to cleanly back >> out. For single relation cluster commands, raising an error from that >> callback is probably not too bad. For multi-relation cluster commands, that >> aborts the clustering of other yet to be processed relations, which doesn't >> seem acceptable. > > Why not? What else do you want to do in that case? Silently ignoring > non-clusterable tables doesn't seem right either. What's the use-case for > swallowing the error? Imagine you develop a TAM for which the concept of "clustering" doesn't have any defined meaning. Perhaps you've arranged the data in a way that has no similarity to heap, and now somebody runs a CLUSTER command (with no arguments.) It's reasonable that they want all their heap tables to get the usual cluster_rel treatment, and that the existence of a table using your exotic TAM shouldn't interfere with that. Then what? You are forced to copy all the data from your OldHeap (badly named) to the NewHeap (also badly named), or to raise an error. That doesn't seem ok. >> I tried fixing this with a ProcessUtility_hook, but that >> fires before the multi-relation cluster command has compiled the list of >> relations to cluster, meaning the ProcessUtility_hook doesn't have access to >> the necessary information. (It can be hacked to compile the list of >> relations itself, but that duplicates both code and effort, with the usual >> risks that the code will get out of sync.) >> >> For my personal development, I have declared a new hook, bool >> (*relation_supports_cluster) (Relation rel). 
It gets called from >> commands/cluster.c in both the single-relation and multi-relation code >> paths, with warning or debug log messages output for relations that decline >> to be clustered, respectively. > > Do you actually need to dynamically decide whether CLUSTER is supported? > Otherwise we could just make the existing cluster callback optional and error > out if a table is clustered that doesn't have the callback. Same as above, I don't know why erroring would be the right thing to do. As a comparison, consider that we don't attempt to cluster a partitioned table, but rather just silently skip it. Imagine if, when we introduced the concept of partitioned tables, we made unqualified CLUSTER commands always fail when they encountered a partitioned table. >> Before posting a patch, do people think this sounds useful? Would you like >> the hook function signature to differ in some way? Is a simple "yes this >> relation may be clustered" vs. "no this relation may not be clustered" >> interface overly simplistic? > > It seems overly complicated, if anything ;) For the TAMs I've developed thus far, I don't need the (Relation rel) parameter, and could just have easily used (void). But that seems to fence in what other TAM authors could do in future. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Extending USING [heap | mytam | yourtam] grammar and behavior
Hackers, I have extended the grammar to allow "USING NOT method [, ...]" to exclude one or more TAMs in a CREATE TABLE statement. This may sound like a weird thing to do, but it is surprisingly useful when developing new Table Access Methods, particularly when you are developing two or more, not just one. To explain: Developing a new TAM takes an awful lot of testing, and much of it is duplicative of the existing core regression test suite. Leveraging the existing tests saves an awful lot of test development. When developing just one TAM, leveraging the existing tests isn't too hard. Without much work*, you can set default_table_access_method=mytam for the duration of the check-world. You'll get a few test failures this way. Some will be in tests that probe the catalogs to verify that /heap/ is stored there, and instead /mytam/ is found. Others will be tests that are sensitive to the number of rows that fit per page, etc. But a surprising number of tests just pass, at least after you get the TAM itself debugged. When developing two or more TAMs, this falls apart. Some tests may be worth fixing up (perhaps with alternate output files) for "mytam", but not for "columnar_tam". That might be because the test is checking fundamentally row-store-ish properties of the table, which has no applicability to your column-store-ish TAM. In that case, "USING NOT columnar_tam" fixes the test failure when columnar is the default, without preventing the test from testing "mytam" when it happens to be the default. Once you have enough TAMs developed and deployed, this USING NOT business becomes useful in production. You might have different defaults on different servers, or for different customers, etc., and for a given piece of DDL that you want to release you only want to say which TAMs not to use, not to nail down which TAM must be used. Thoughts? I'll hold off posting a patch until the general idea is debated. [*] It takes some extra work to get the TAP tests to play along. 
— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Modest proposal to extend TableAM API for controlling cluster commands
Hackers, While developing various Table Access Methods, I have wanted a callback for determining if CLUSTER (and VACUUM FULL) should be run against a table backed by a given TAM. The current API contains a callback for doing the guts of the cluster, but by that time, it's a bit too late to cleanly back out. For single relation cluster commands, raising an error from that callback is probably not too bad. For multi-relation cluster commands, that aborts the clustering of other yet to be processed relations, which doesn't seem acceptable. I tried fixing this with a ProcessUtility_hook, but that fires before the multi-relation cluster command has compiled the list of relations to cluster, meaning the ProcessUtility_hook doesn't have access to the necessary information. (It can be hacked to compile the list of relations itself, but that duplicates both code and effort, with the usual risks that the code will get out of sync.) For my personal development, I have declared a new hook, bool (*relation_supports_cluster) (Relation rel). It gets called from commands/cluster.c in both the single-relation and multi-relation code paths, with warning or debug log messages output for relations that decline to be clustered, respectively. Before posting a patch, do people think this sounds useful? Would you like the hook function signature to differ in some way? Is a simple "yes this relation may be clustered" vs. "no this relation may not be clustered" interface overly simplistic? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
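[Editor's sketch] A standalone C sketch of the proposed callback and how the multi-relation path might consult it. Only the callback signature comes from the proposal above; the stand-in Relation type and the driver loop are illustrative, not actual commands/cluster.c code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for PostgreSQL's Relation; only what the sketch needs. */
typedef struct RelationData
{
    const char *relname;
} RelationData, *Relation;

/* The proposed hook signature. */
typedef bool (*relation_supports_cluster_fn) (Relation rel);

/* A TAM whose invariant keeps the table fully compacted simply declines. */
static bool
always_compacted_supports_cluster(Relation rel)
{
    (void) rel;
    return false;
}

/* Skip-with-a-message rather than error, so one declining relation does
 * not abort clustering of the remaining ones.  Returns the number of
 * relations actually clustered. */
static int
cluster_relations(Relation *rels, int nrels, relation_supports_cluster_fn cb)
{
    int clustered = 0;

    for (int i = 0; i < nrels; i++)
    {
        if (cb != NULL && !cb(rels[i]))
        {
            printf("skipping \"%s\": table AM declines clustering\n",
                   rels[i]->relname);
            continue;
        }
        clustered++;            /* cluster_rel() would run here */
    }
    return clustered;
}
```

A missing callback behaves like today's code (every relation is clustered), so existing TAMs would be unaffected.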
Re: Checking for missing heap/index files
> On Jun 8, 2022, at 5:45 AM, Bruce Momjian wrote:
>
> Is this something anyone has even needed or had requested?

I might have put this in amcheck's verify_heapam() had there been an interface for it. I vaguely recall wanting something like this, yes. As it stands, verify_heapam() may trigger mdread()'s "could not open file" or "could not read block" error in the course of verifying the table. There isn't an option in amcheck to just verify that the underlying files exist. If struct f_smgr had a function for validating that all segment files exist, I might have added an option to amcheck (and the pg_amcheck frontend tool) to quickly look for missing files.

Looking at smgr/md.c, it seems mdnblocks() is close to what we want, but it skips already opened segments "to avoid redundant seeks". Perhaps we'd want to add a function to f_smgr, say "smgr_allexist", to check for all segment files? I'm not sure how heavy-handed the corresponding mdallexist() function should be. Should it close all open segments, then reopen and check the size of all of them by calling mdnblocks()? That seems safer than merely asking the filesystem whether the file exists without verifying that it can be opened.

If we made these changes, and added corresponding quick-check options to amcheck and pg_amcheck, would that meet your current needs? The downside to using amcheck for this sort of thing is that we did not (and likely will not) back-port it. I have had several occasions to want this functionality recently, but the customers were on pre-v14 servers, so these tools were not available anyway.

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Tightening behaviour for non-immutable behaviour in immutable functions
> On Jun 8, 2022, at 2:42 PM, Greg Stark wrote:
>
> Thinking of plpgsql here, we already run the raw parser on all sql when the function is defined. We could somehow check whether the raw_parser found any non-schema-qualified references. This looks like it would be awkward but doable. That would allow users to write non-search_path-dependent code and if postgres doesn't warn they would know they've done it properly. It would still put quite a burden on users, especially when it comes to operators...
>
> Or alternatively we could offer lexical scoping so that all objects are looked up at parse time and the fully qualified reference is stored instead of the non-qualified reference. That would be more similar to how views and other object references are handled.

I like the general idea, but I'm confused why you are limiting the analysis to search path resolution. The following is clearly wrong, but not for that reason:

create function public.identity () returns double precision as $$
  select random()::integer;
$$ language sql
immutable
parallel safe
-- set search_path to 'pg_catalog'
;

Uncommenting that last bit wouldn't make it much better. Isn't the more general approach to look for non-immutable (or non-stable) operations, with object resolution just one type of non-immutable operation? Perhaps raise an error when you can prove the given function's provolatile marking is wrong, and a warning when you cannot prove the marking is correct?

That would tend to give warnings for polymorphic functions that use functions or operators over the polymorphic types, or which use dynamic sql, but maybe that's ok. Those functions probably deserve closer scrutiny anyway.

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
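A quick way to see that the marking in the example above is wrong (illustrative, assuming the public.identity function from that email has been created):

```sql
-- An immutable function must return the same result for the same
-- arguments, but two calls here will almost certainly disagree,
-- because random() is volatile:
SELECT public.identity();
SELECT public.identity();

-- Worse, the planner is entitled to fold an "immutable" call to a
-- constant at plan time, baking one random value into a cached plan:
PREPARE p AS SELECT public.identity();
```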
Re: Collation version tracking for macOS
> On Jun 7, 2022, at 1:10 PM, Tom Lane wrote: > > This is not the concern that I have. I agree that if we tell a user > that collation X changed behavior and he'd better reindex his indexes > that use collation X, but none of them actually contain any cases that > changed behavior, that's not a "false positive" --- that's "it's cheaper > to reindex than to try to identify whether there's a problem". I don't see this problem as limited to indexes, though I do understand why that might be the most common place for the problem to manifest itself. As a simple example, text[] constructed using array_agg over sorted data can be corrupted by a collation change, and reindex won't fix it. If we extend the table-AM interface to allow query quals to be pushed down to the table-AM, we might develop table-AMs that care about sort order, too. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
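A sketch of the array_agg hazard described above (illustrative — the table and column names are invented; the point is that the stored array's element order was frozen under the old collation rules):

```sql
-- Build a text[] whose element order depends on the collation in effect:
CREATE TABLE names (name text COLLATE "en_US");
INSERT INTO names VALUES ('apple'), ('Banana'), ('cherry');

CREATE TABLE sorted_names AS
SELECT array_agg(name ORDER BY name) AS names_sorted FROM names;

-- If the provider's en_US rules later change, the array stored in
-- sorted_names may no longer be sorted under the new rules.  REINDEX
-- cannot repair this; the derived data itself must be rebuilt.
```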
Re: How about a psql backslash command to show GUCs?
> On Jun 6, 2022, at 9:02 AM, Tom Lane wrote: > > Thoughts? I think it depends on your mental model of what \dconfig is showing you. Is it showing you the list of configs that you can SET, or just the list of all configs?
Re: First draft of the PG 15 release notes
> On May 10, 2022, at 6:46 PM, Bruce Momjian wrote: > > Allow logical replication to run as the owner of the publication Make that "owner of the subscription". This change operates on the subscriber-side. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: First draft of the PG 15 release notes
> On May 10, 2022, at 8:44 AM, Bruce Momjian wrote:
>
> I have completed the first draft of the PG 15 release notes and you can see the results here

Thanks, Bruce! This release note:

• Prevent logical replication into tables where the subscription owner is subject to the table's row-level security policies (Mark Dilger)

... should mention that, independent of any RLS considerations, subscriptions are now applied under the privileges of the subscription owner. I don't think we can fit it in the release note, but the basic idea is that:

CREATE SUBSCRIPTION ... CONNECTION '...' PUBLICATION ... WITH (enabled = false);
ALTER SUBSCRIPTION ... OWNER TO nonsuperuser_whoever;
ALTER SUBSCRIPTION ... ENABLE;

can be used to replicate a subscription without sync or apply workers operating as superuser. That's the main advantage. Previously, subscriptions always ran with superuser privilege, which creates security concerns if the publisher is malicious (or foolish). Avoiding any unintentional bypassing of RLS was just a necessary detail to close the security loophole, not the main point of the security enhancement.

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Unstable tests for recovery conflict handling
> On Apr 27, 2022, at 6:26 PM, Andres Freund wrote: > > I'm a bit confused - what's the relation of that failure to this thread / the > tests / this commit? None, upon further reflection. It turned out to be unrelated. Sorry for the noise. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Unstable tests for recovery conflict handling
> On Apr 27, 2022, at 10:11 AM, Mark Dilger wrote:
>
> I'll try again on master

Still with coverage and dtrace enabled, I get the same thing, except that master formats the logs a bit differently:

# Postmaster PID for node "primary" is 19797
psql::1: ERROR:  prepared transaction with identifier "xact_012_1" does not exist
[10:26:16.314](1.215s) not ok 11 - Rollback of PGPROC_MAX_CACHED_SUBXIDS+ prepared transaction on promoted standby
[10:26:16.314](0.000s)
[10:26:16.314](0.000s) #   Failed test 'Rollback of PGPROC_MAX_CACHED_SUBXIDS+ prepared transaction on promoted standby'
[10:26:16.314](0.000s) #   at t/012_subtransactions.pl line 208.
[10:26:16.314](0.000s) #          got: '3'
#     expected: '0'

With coverage but not dtrace enabled, I still get the error, though the log leading up to the error now has a bunch of coverage noise lines like:

profiling: /Users/mark.dilger/recovery_test/postgresql/src/backend/utils/sort/tuplesort.gcda: cannot merge previous GCDA file: corrupt arc tag

The error itself looks the same except the timing numbers differ a little. With neither enabled, all tests pass. I'm inclined to think that either the recovery code or the test has a race condition, and that enabling coverage causes the race to come out differently. I'll keep poking.

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Unstable tests for recovery conflict handling
> On Apr 27, 2022, at 9:45 AM, Tom Lane wrote:
>
> [ starting a new thread cuz the shared-stats one is way too long ]
>
> Andres Freund writes:
>> Add minimal tests for recovery conflict handling.
>
> It's been kind of hidden by other buildfarm noise, but 031_recovery_conflict.pl is not as stable as it should be [1][2][3][4].
>
> Three of those failures look like

Interesting, I have been getting failures on REL_14_STABLE:

t/012_subtransactions.pl . 11/12
#   Failed test 'Rollback of PGPROC_MAX_CACHED_SUBXIDS+ prepared transaction on promoted standby'
#   at t/012_subtransactions.pl line 211.
#          got: '3'
#     expected: '0'
t/012_subtransactions.pl . 12/12 # Looks like you failed 1 test of 12.
t/012_subtransactions.pl . Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/12 subtests

And the logs, tmp_check/log/regress_log_012_subtransactions, showing:

### Enabling streaming replication for node "primary"
### Starting node "primary"
# Running: pg_ctl -D /Users/mark.dilger/recovery_test/postgresql/src/test/recovery/tmp_check/t_012_subtransactions_primary_data/pgdata -l /Users/mark.dilger/recovery_test/postgresql/src/test/recovery/tmp_check/log/012_subtransactions_primary.log -o --cluster-name=primary start
waiting for server to start done
server started
# Postmaster PID for node "primary" is 46270
psql::1: ERROR:  prepared transaction with identifier "xact_012_1" does not exist
not ok 11 - Rollback of PGPROC_MAX_CACHED_SUBXIDS+ prepared transaction on promoted standby
#   Failed test 'Rollback of PGPROC_MAX_CACHED_SUBXIDS+ prepared transaction on promoted standby'
#   at t/012_subtransactions.pl line 211.
#          got: '3'
#     expected: '0'

This is quite consistent for me, but only when I configure with --enable-coverage and --enable-dtrace. (I haven't yet tried one of those without the other.) I wasn't going to report this yet, having not yet completely narrowed this down, but I wonder if anybody else is seeing this?
I'll try again on master — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Postgres perl module namespace
> On Apr 18, 2022, at 1:19 PM, Andrew Dunstan wrote: > > that seems quite separate from the present issue. Thanks for the clarification. I agree, given your comments, that it is unrelated to this thread. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Postgres perl module namespace
> On Apr 18, 2022, at 10:59 AM, Andrew Dunstan wrote:
>
> No, I think we could probably just port the whole of src/test/PostgreSQL back if required, and have it live alongside the old modules. Each TAP test is a separate miracle - see comments elsewhere about port assignment in parallel TAP tests.

I think $last_port_assigned would need to be shared between the two modules. This global safeguard is already a bit buggy, but not sharing it between modules would be far worse. Currently, if a node which has a port assigned is stopped, and a parallel test creates a new node, this global variable prevents the new node from getting the port already assigned to the old stopped node, except when port assignment wraps around. Without sharing the global, wrap-around need not happen for port collisions. Or am I reading the code wrong?

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: pg14 psql broke \d datname.nspname.relname
> On Apr 8, 2022, at 4:11 AM, Robert Haas wrote: > > I don't personally see how we're going to come out ahead with that > approach, but if you or Tom or someone else want to put something > together, that's fine with me. I'm not stuck on this approach, I just > don't see how we come out ahead with the type of thing you're talking > about. I mean we could return the error text, but it's only to a > handful of places, so it just doesn't really seem like a win over what > the patch is already doing. Since there hasn't been any agreement on that point, I've just rebased the patch to apply cleanly against the current master: v9-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch Description: Binary data — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: pg14 psql broke \d datname.nspname.relname
> On Apr 7, 2022, at 3:37 PM, Tom Lane wrote: > >> >> I don't know whether that's a bug fix for the existing code or some >> new bit of functionality that \dconfig requires and nothing else >> needs. > > Well, \dconfig needs it because it would like foo.bar to get processed > as just a name. But I think it's a bug fix because as things stood, > if the caller doesn't provide a schemavar and the pattern contains a > dot, the code just silently throws away the dot and all to the left. > That doesn't seem very sane, even if it is a longstanding behavior. The patch submitted changes processSQLNamePattern() to return a dot count by reference. It's up to the caller to decide whether to raise an error. If you pass in no schemavar, and you get back dotcnt=2, you know it parsed it as a two part pattern, and you can pg_fatal(...) or ereport(ERROR, ...) or whatever. It looks like I'll need to post a new version of the patch with an argument telling the function to ignore dots, but I'm not prepared to say that for sure. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: How about a psql backslash command to show GUCs?
> On Apr 7, 2022, at 9:37 AM, Tom Lane wrote: > > Maybe I'm atypical, but I'm probably going to use tab completion > either way, so it's not really more keystrokes. Same here, because after tab-completing \dcon\t\t into \dconfig, I'm likely to also tab-complete to get the list of parameters. I frequently can't recall the exact spelling of them. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: How about a psql backslash command to show GUCs?
> On Apr 7, 2022, at 9:29 AM, Tom Lane wrote: > > I wouldn't > fight too hard if people want to lengthen it to \dconfig for consistency > with set_config(). I'd prefer \dconfig, but if the majority on this list view that as pedantically forcing them to type more, I'm not going to kick up a fuss about \dconf. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: How about a psql backslash command to show GUCs?
> On Apr 7, 2022, at 6:21 AM, Andrew Dunstan wrote: > > \dconf seems fine to me We have too many synonyms for configuration parameters. "config", "guc", "parameter", and "setting" are already in use. I thought we agreed on the other thread that "setting" means the value, and "parameter" is the thing being set. It's true that "config" refers to parameters in the name of pg_catalog.set_config, which is a pretty strong precedent, but sadly "config" also refers to configuration files, the build configuration (as in the "pg_config" tool), text search configuration, etc. While grep'ing through doc/src/sgml, I see no instances of "conf" ever referring to configuration parameters. It only ever refers to configuration files. I'd prefer not adding it to the list of synonyms. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: How about a psql backslash command to show GUCs?
> On Apr 6, 2022, at 2:34 PM, Jonathan S. Katz wrote: > > "\sc" would make sense I originally wrote the command as \dcp (describe configuration parameter) because \dp (describe parameter) wasn't available. The thing we're showing is a "parameter", not a "config". If we're going to use a single letter, I'd prefer /p/. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: How about a psql backslash command to show GUCs?
> On Apr 6, 2022, at 10:48 AM, Tom Lane wrote:
>
> So does anyone else like this idea?

Privileges on targets other than parameters have a \d command to show the privileges, as listed in doc/src/sgml/ddl.sgml. There isn't an obvious reason for omitting parameters from that list.

One of the ideas that got punted in the recent commit was to make it possible to revoke SET on a USERSET guc. For example, it might be nice to REVOKE SET ON PARAMETER work_mem FROM PUBLIC. That can't be done now, but for some select parameters we might implement that in the future by promoting them to SUSET with a default GRANT SET ... TO PUBLIC.

When connecting to databases of different postgres versions (albeit only those version 15 and above), it'd be nice to quickly see what context (USERSET vs. SUSET) is assigned to the parameter, and whether the GRANT has been revoked.

So yes, +1 from me.

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
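For context, the parameter-privilege grammar that did land in PostgreSQL 15 operates on settings an unprivileged role could not otherwise change; a sketch (the role name "app_admin" is invented for illustration):

```sql
-- Let an ordinary role change a superuser-only setting in its session:
GRANT SET ON PARAMETER log_min_duration_statement TO app_admin;

-- Let it persist changes via ALTER SYSTEM as well:
GRANT ALTER SYSTEM ON PARAMETER autovacuum_naptime TO app_admin;

-- And take the privilege back:
REVOKE SET ON PARAMETER log_min_duration_statement FROM app_admin;
```

What cannot be expressed, as noted above, is revoking SET on a USERSET parameter such as work_mem, since that privilege is implicit rather than granted.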
buildfarm failures, src/test/recovery
Hi Andrew, It appears to me that 2ef6f11b0c77ec323c688ddfd98ffabddb72c11d broke src/test/recovery. It looks like the following fixes it. Care to review and push? Or perhaps just revert that commit?

diff --git a/src/test/regress/expected/jsonb_sqljson.out b/src/test/regress/expected/jsonb_sqljson.out
index 230cfd3bfd..ee16f2770a 100644
--- a/src/test/regress/expected/jsonb_sqljson.out
+++ b/src/test/regress/expected/jsonb_sqljson.out
@@ -2088,7 +2088,7 @@
 ERROR:  only string constants supported in JSON_TABLE path specification
 LINE 1: SELECT * FROM JSON_TABLE(jsonb '{"a": 123}', '$' || '.' || '...
                                                      ^
 -- Test parallel JSON_VALUE()
-CREATE UNLOGGED TABLE test_parallel_jsonb_value AS
+CREATE TABLE test_parallel_jsonb_value AS
 SELECT i::text::jsonb AS js FROM generate_series(1, 50) i;
 -- Should be non-parallel due to subtransactions
diff --git a/src/test/regress/sql/jsonb_sqljson.sql b/src/test/regress/sql/jsonb_sqljson.sql
index 866c708a4d..2fc41a6df5 100644
--- a/src/test/regress/sql/jsonb_sqljson.sql
+++ b/src/test/regress/sql/jsonb_sqljson.sql
@@ -948,7 +948,7 @@ SELECT JSON_QUERY(jsonb '{"a": 123}', 'error' || ' ' || 'error');
 SELECT * FROM JSON_TABLE(jsonb '{"a": 123}', '$' || '.' || 'a' COLUMNS (foo int));

 -- Test parallel JSON_VALUE()
-CREATE UNLOGGED TABLE test_parallel_jsonb_value AS
+CREATE TABLE test_parallel_jsonb_value AS
 SELECT i::text::jsonb AS js FROM generate_series(1, 50) i;

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: pg14 psql broke \d datname.nspname.relname
> On Mar 29, 2022, at 8:20 AM, Robert Haas wrote:
>
> In describe.c, why are the various describeWhatever functions returning true when validateSQLNamePattern returns false? It seems to me that they should return false. That would cause exec_command_d() to set status = PSQL_CMD_ERROR, which seems appropriate. I wondered whether we should return PSQL_CMD_ERROR only for database errors, but that doesn't seem to be the case. For example, exec_command_a() sets PSQL_CMD_ERROR for a failure in do_pset().

Yes, I believe you are right. For scripting, the following should echo, but doesn't under the version 7 patch. Fixed in version 8.

% psql -c "\d a.b.c.d" || echo 'error'
improper qualified name (too many dotted names): a.b.c.d

> pg_dump's prohibit_crossdb_refs() has a special case for you are not connected to a database, but psql's validateSQLNamePattern() treats it as an invalid cross-database reference. Maybe that should be consistent, or just the other way around. After all, I would expect pg_dump to just bail out if we lose the database connection, but psql may continue, because we can reconnect. Putting more code into the tool where reconnecting doesn't really make sense seems odd.

Fixed psql in version 8 to issue the appropriate error message, either "You are currently not connected to a database." or "cross-database references are not implemented: %s". That matches the output for pg_dump.

> processSQLNamePattern() documents that dotcnt can be NULL, and then asserts that it isn't. That's ugly.

Fixed the documentation in version 8.

> processSQLNamePattern() introduces new local variables schema and name, which account for most of the notational churn in that function. I can't see a reason why those changes are needed. You do test whether the new variables are NULL in a couple of places, but you could equally well test schemavar/namevar/altnamevar directly.
> Actually, I don't really understand why this function needs any changes other than passing dbnamebuf and dotcnt through to patternToSQLRegex(). Is there a reason?

It looks like overeager optimization to me, to avoid passing buffers to patternToSQLRegex that aren't really wanted, consequently asking that function to parse things that the caller doesn't care about. But I don't think the optimization is worth the git history churn. Removed in version 8.

> patternToSQLRegex() restructures the system of buffers as well, and I don't understand the purpose of that either. It sort of looks like the idea might be to relax the rule against dbname.relname patterns, but why would we want to do that? If we don't want to do that, why remove the assertion?

This took a while to answer. I don't remember exactly what I was trying to do here, but it looks like I wanted callers who only want a (possibly database-qualified) schema name to pass that in the (dbnamebuf and) schemabuf, rather than using the (schemabuf and) namebuf. I obviously didn't finish that conversion, because the clients never got the message. What remained was some rearrangement in patternToSQLRegex which worked but served no purpose. I've reverted the useless refactoring.

> It is not very nice that patternToSQLRegex() ends up repeating the locution "if (left && want_literal_dbname) appendPQExpBufferChar(_literal, '"')" a whole bunch of times. Suppose we remove all that. Then, in the if (!inquotes && ch == '.') case, if left = true, we copy "cp - pattern" bytes starting at "pattern" into the buffer. Wouldn't that accomplish the same thing with less code?

We don't *quite* want the literal left string. If it is quoted, we still want the quotes removed. For example:

\d "robert.haas".accounts.acme

needs to return robert.haas (without the quotes) as the database name. Likewise, for embedded quotes:

\d "robert""haas".accounts.acme

needs to return robert"haas, and so forth.
I was able to clean up the "if (left && want_literal_dbname)" stuff, though. v8-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch Description: Binary data — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Granting SET and ALTER SYSTE privileges for GUCs
> On Apr 4, 2022, at 5:12 PM, Tom Lane wrote: > > Wrote it already, no need for you to do it. Thanks! — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Granting SET and ALTER SYSTE privileges for GUCs
> On Apr 4, 2022, at 2:26 PM, Tom Lane wrote:
>
> Thanks. As I'm working through this, I'm kind of inclined to drop the has_parameter_privilege() variants that take an OID as object identifier. This gets back to the fact that I don't think pg_parameter_acl OIDs have any outside use; we wouldn't even have them except that we need a way to track their role dependencies in pg_shdepend. AFAICS users will only ever be interested in looking up a GUC by name. Any objections?

None.

> Another thought here is that I see you're expending some code to store the canonical name of a GUC in pg_parameter_acl, but I think that's probably going too far. In particular, for the legacy mixed-case names like "DateStyle", what ends up in the table is the mixed-case name, which seems problematic. It would definitely be problematic if an extension used such a name, because we might or might not be aware of the idiosyncratic casing at the time a GRANT is issued. I'm thinking that we really want to avoid looking up custom GUCs at all during GRANT, because that can't do anything except create hazards.

Yikes. It took a few tries to see what you mean. Yes, if the GRANT happens before the LOAD, that can have bad consequences. So I agree something should be changed.

> So I think that instead of what you've got here, we should
> (1) apply the map_old_guc_names[] mapping, which is constant (for any one PG release anyway)
> (2) smash to lower case
> (3) verify validity per valid_variable_name.
>
> This also simplifies life on the lookup side, where it's sufficient to do steps (1) and (2) before performing a catalog search.
>
> Thoughts?

That sounds right. Do you already have something like that coded, or would you like me to post a patch?

— Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
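A rough illustration of steps (2) and (3) in SQL terms (step (1), the map_old_guc_names[] mapping, lives in a C-side table; the regex below is only an approximation of what valid_variable_name checks, shown for intuition):

```sql
-- Step (2): smash the name to lower case before storing it in
-- pg_parameter_acl or looking it up there.
SELECT lower('DateStyle');   -- yields 'datestyle'

-- Step (3), approximately: the canonical name should look like one or
-- more identifier-like parts, optionally dot-separated for custom GUCs.
SELECT lower('my_ext.Fancy_Knob')
       ~ '^[a-z_][a-z0-9_]*(\.[a-z_][a-z0-9_]*)*$' AS looks_valid;
```

The point of doing this at GRANT time rather than consulting the loaded GUC, per the discussion above, is that the grant may well precede the LOAD of the extension that defines the parameter.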