Re: Things I don't like about \du's "Attributes" column
On Sat, Jun 8, 2024 at 10:02 AM Pavel Luzanov wrote:
> Therefore, I think the current patch offers a better version of the \du command.
> However, I admit that these improvements are not enough to accept the patch.
> I would like to hear other opinions.

Hmm, I don't think I quite agree with this. If people like this version better than what we have now, that's all we need to accept the patch. I just don't really want to be the one to decide all by myself whether this is, in fact, better. So, like you, I would like to hear other opinions.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Thu, Jun 6, 2024 at 3:27 PM Jelte Fennema-Nio wrote:
> Of course there's always the possibility to review more. But I don't really agree with this summary of my review activity.

Nonetheless, I need to take a break from this to work on some of my own stuff. I'll circle back around to it.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: relfilenode statistics
On Thu, Jun 6, 2024 at 11:17 PM Andres Freund wrote:
> If we just want to keep prior stats upon a relation rewrite, we can just copy the stats from the old relfilenode. Or we can decide that those stats don't really make sense anymore, and start from scratch.

I think we need to think carefully about what we want the user experience to be here. "Per-relfilenode stats" could mean "sometimes I don't know the relation OID so I want to use the relfilenumber instead, without changing the user experience" or it could mean "some of these stats actually properly pertain to the relfilenode rather than the relation so I want to associate them with the right object and that will affect how the user sees things." We need to decide which it is.

If it's the former, then we need to examine whether the goal of hiding the distinction between relfilenode stats and relation stats from the user is in fact feasible. If it's the latter, then we need to make sure the whole patch reflects that design, which would include e.g. NOT copying stats from the old to the new relfilenode, and which would also include documenting the behavior in a way that will be understandable to users.

In my experience, the worst thing you can do in cases like this is be somewhere in the middle. Then you tend to end up with stuff like: the difference isn't supposed to be something that the user knows or cares about, except that they do have to know and care because you haven't thoroughly covered up the deception, and often they have to reverse engineer the behavior because you didn't document what was really happening because you imagined that they wouldn't notice.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: race condition in pg_class
On Thu, Jun 6, 2024 at 7:20 PM Michael Paquier wrote:
> On Thu, Jun 06, 2024 at 09:48:51AM -0400, Robert Haas wrote:
> > It's not this patch set's fault, but I'm not very pleased to see that the injection point wait events have been shoehorned into the "Extension" category - which they are not - instead of being a new wait_event_type. That would have avoided the ugly wait-event naming pattern, inconsistent with everything else, introduced by inplace050-tests-inj-v1.patch.
>
> Not sure to agree with that. The set of core backend APIs supporting injection points have nothing to do with wait events. The library attached to one or more injection points *may* decide to use a wait event like what the wait/wakeup calls in modules/injection_points do, but that's entirely optional. These rely on custom wait events, plugged into the Extension category as the code run is itself in an extension. I am not arguing against the point that it may be interesting to plug in custom wait event categories, but the current design of wait events makes that much harder than what core is currently able to handle, and I am not sure that this brings much at the end as long as the wait event strings can be customized.
>
> I've voiced upthread concerns over the naming enforced by the patch and the way it plugs the namings into the isolation functions, by the way.

I think the core code should provide an "Injection Point" wait event type and let extensions add specific wait events there, just like you did for "Extension". Then this ugly naming would go away. As I see it, "Extension" is only supposed to be used as a catch-all when we have no other information, but here we do. If we refuse to use the wait_event_type field to categorize waits, then people are going to have to find some other way to get that data into the system, as Noah has done.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Assert in heapgettup_pagemode() fails due to underlying buffer change
On Fri, Jun 7, 2024 at 4:05 AM Alvaro Herrera wrote:
> >  static void
> > -ZeroBuffer(Buffer buffer, ReadBufferMode mode)
> > +ZeroBuffer(Buffer buffer, ReadBufferMode mode, bool zero)
>
> This change makes the API very strange. Should the function be called ZeroAndLockBuffer() instead? Then the addition of a "bool zero" argument makes a lot more sense.

I agree that's better, but it still looks a bit weird. You have to realize that 'bool zero' means 'is already zeroed' here -- or at least, I guess that's the intention. But then I wonder why you'd call a function called ZeroAndLockBuffer if all you need to do is LockBuffer.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: question regarding policy for patches to out-of-support branches
On Thu, Jun 6, 2024 at 10:04 PM Tom Lane wrote:
> > I added them here with minimal copy editing and no attempt to organize or sort into groups:
> > https://wiki.postgresql.org/wiki/Committing_checklist#Policies
> > If someone has thoughts on how to improve I am happy to make more changes.
>
> Thanks! I summoned the energy to make a few more improvements, particularly updating stuff that seemed out-of-date. I'm sure there's more that could be added here.

This is nice! I wonder if we could interest anyone in creating tooling that could be used to check some of this stuff -- ideally run as part of the regular build process, so that you can't fail to notice that you did it wrong.

Not all of these rules are subject to automatic verification, e.g. it's hard to enforce that a change to an out-of-support branch makes no functional change. But an awful lot of them could be, and I would personally be significantly happier and less stressed if I knew that 'ninja && meson test' was going to tell me that I did it wrong before I pushed, instead of finding out afterward and then having to drop everything to go clean it up.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Things I don't like about \du's "Attributes" column
On Thu, Jun 6, 2024 at 5:10 PM Pavel Luzanov wrote:
> Agree.
> There is an additional technical argument for removing this replacement. I don't like explicit cast to text of the "Connection limit" column. Without 'Not allowed' it is no longer required. Value -1 can be replaced by NULL with an implicit cast to integer.

Yeah, +1 for that idea.

> Example output:
>
> \du+ regress_du*
>                                                  List of roles
>     Role name     | Login | Attributes  |         Valid until          | Connection limit |   Description
> ------------------+-------+-------------+------------------------------+------------------+------------------
>  regress_du_admin | yes   | Superuser  +|                              |                  | some description
>                   |       | Create DB  +|                              |                  |
>                   |       | Create role+|                              |                  |
>                   |       | Inherit    +|                              |                  |
>                   |       | Replication+|                              |                  |
>                   |       | Bypass RLS  |                              |                  |
>  regress_du_role0 | yes   | Inherit     | Tue Jun 04 00:00:00 2024 PDT |                0 |
>  regress_du_role1 | no    | Create role+| infinity                     |                  |
>                   |       | Inherit     |                              |                  |
>  regress_du_role2 | yes   | Inherit    +|                              |               42 |
>                   |       | Replication+|                              |                  |
>                   |       | Bypass RLS  |                              |                  |
> (4 rows)

This seems unobjectionable to me. I am not sure whether it is better than the current version, or whether it is what we want. But it seems reasonable.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Avoid orphaned objects dependencies, take 3
On Thu, Jun 6, 2024 at 1:56 AM Bertrand Drouvot wrote:
> v9 is more invasive (as it changes code in much more places) than v8 but it is easier to follow (as it is now clear where the new lock is acquired).

Hmm, this definitely isn't what I had in mind. Possibly that's a sign that what I had in mind was dumb, but for sure it's not what I imagined. What I thought you were going to do was add calls like LockDatabaseObject(NamespaceRelationId, schemaid, 0, AccessShareLock) in various places, or perhaps LockRelationOid(reloid, AccessShareLock), or whatever the case may be. Here you've got stuff like this:

- record_object_address_dependencies(, addrs_auto,
-                                    DEPENDENCY_AUTO);
+ lock_record_object_address_dependencies(, addrs_auto,
+                                         DEPENDENCY_AUTO);

...which to me looks like the locking is still pushed down inside the dependency code.

And you also have stuff like this:

  ObjectAddressSet(referenced, RelationRelationId, childTableId);
+ depLockAndCheckObject();
  recordDependencyOn(, , DEPENDENCY_PARTITION_SEC);

But in depLockAndCheckObject you have:

+ if (object->classId == RelationRelationId || object->classId == AuthMemRelationId)
+     return;

That doesn't seem right, because then it seems like the call isn't doing anything, but there isn't really any reason for it to not be doing anything. If we're dropping a dependency on a table, then it seems like we need to have a lock on that table. Presumably the reason why we don't end up with dangling dependencies in such cases now is because we're careful about doing LockRelation() in the right places, but we're not similarly careful about other operations e.g. ConstraintSetParentConstraint is called by DefineIndex which calls table_open(childRelId, ...) first, but there's no logic in DefineIndex to lock the constraint.

Thoughts?

--
Robert Haas
EDB: http://www.enterprisedb.com
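The lock-then-recheck discipline being debated can be pictured with a tiny self-contained model. All names here (ToyObject, lock_and_check) are hypothetical illustrations of the pattern, not code from the patch: take a lock on the referenced object, then verify it still exists before recording a dependency on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <pthread.h>

/* Toy model of a catalog object: a lock plus an "exists" flag. */
typedef struct
{
    pthread_mutex_t lock;
    bool        exists;
} ToyObject;

/*
 * Hypothetical analogue of a depLockAndCheckObject()-style helper: lock the
 * referenced object, then confirm it still exists.  Returns false if a
 * concurrent DROP removed the object before we got the lock, in which case
 * no dependency should be recorded.  On success the caller records the
 * dependency and releases the lock at end of transaction.
 */
static bool
lock_and_check(ToyObject *obj)
{
    pthread_mutex_lock(&obj->lock);
    if (!obj->exists)
    {
        pthread_mutex_unlock(&obj->lock);
        return false;           /* the object vanished under us */
    }
    return true;                /* safe to record the dependency */
}
```

The point of the sketch is that without the existence re-check after acquiring the lock, a dependency could be recorded on an object that a concurrent transaction has already dropped, leaving it orphaned.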
Re: problems with "Shared Memory and Semaphores" section of docs
On Thu, Jun 6, 2024 at 3:21 PM Nathan Bossart wrote:
> Here is a rebased version of the patch for v18 that adds a runtime-computed GUC. As I noted earlier, there still isn't a consensus on this approach.

I don't really like making this a GUC, but what's the other option? It's reasonable for people to want to ask the server how many resources it will need to start, and -C is the only tool we have for that right now. So I feel like this is a fair thing to do.

I do think the name could use some more thought, though. semaphores_required would end up being the same kind of thing as shared_memory_size_in_huge_pages, but the names seem randomly different. If semaphores_required is right here, why isn't shared_memory_required used there? Seems more like we ought to call this semaphores or os_semaphores or num_semaphores or num_os_semaphores or something.

--
Robert Haas
EDB: http://www.enterprisedb.com
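For readers wondering what such a runtime-computed value would actually report: the semaphore count scales with the allowed number of backends. A rough standalone sketch follows; the exact constant terms and the precise set of GUCs involved are assumptions here, not the authoritative formula, which lives in the "Shared Memory and Semaphores" section of the docs.

```c
#include <assert.h>

/*
 * Illustrative only: the server needs roughly one semaphore per allowed
 * connection and per allowed worker process.  The real computation in the
 * docs may include additional terms; treat this as a sketch of why the
 * value depends on several GUCs and so can only be computed at runtime.
 */
static int
num_os_semaphores(int max_connections, int autovacuum_max_workers,
                  int max_wal_senders, int max_worker_processes)
{
    return max_connections + autovacuum_max_workers
        + max_wal_senders + max_worker_processes;
}
```

This is why a -C-readable, runtime-computed GUC is attractive: an administrator could query the result of this arithmetic without starting the server and without redoing the formula by hand.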
Re: Assert in heapgettup_pagemode() fails due to underlying buffer change
On Thu, Jun 6, 2024 at 6:00 AM Alexander Lakhin wrote:
> Am I missing something or the page buffer indeed lacks locking there?

I don't know, but if the locks are really missing now, I feel like the first question is "which commit got rid of them?". It's a little hard to believe that they've never been there and somehow nobody has noticed. Then again, maybe we have; see Noah's thread about in-place updates breaking stuff and some of the surprising discoveries there. But it seems worth investigating.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)
On Wed, Jun 5, 2024 at 3:48 AM Ashutosh Bapat wrote:
> Here's planning time measurements.
>
>  num_joins | master (ms) | patched (ms) | change in planning time (ms) | change in planning time (%)
> -----------+-------------+--------------+------------------------------+-----------------------------
>          2 |      187.86 |       177.27 |                        10.59 |                       5.64%
>          3 |      721.81 |       758.80 |                       -36.99 |                      -5.12%
>          4 |     2239.87 |      2236.19 |                         3.68 |                       0.16%
>          5 |     6830.86 |      7027.76 |                      -196.90 |                      -2.88%

I find these results concerning. Why should the planning time sometimes go up? And why should it go up for odd numbers of joinrels and down for even numbers of joinrels? I wonder if these results are really correct, and if they are, I wonder what could account for such odd behavior.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: relfilenode statistics
On Wed, Jun 5, 2024 at 1:52 AM Bertrand Drouvot wrote:
> I think we should keep the stats in the relation during relfilenode changes. As a POC, v1 implemented a way to do so during TRUNCATE (see the changes in table_relation_set_new_filelocator() and in pg_statio_all_tables): as you can see in the example provided up-thread the new heap_blks_written statistic has been preserved during the TRUNCATE.

Yeah, I think there's something weird about this design. Somehow we're ending up with both per-relation and per-relfilenode counters:

+ pg_stat_get_blocks_written(C.oid)
+ pg_stat_get_relfilenode_blocks_written(d.oid, CASE WHEN C.reltablespace <> 0 THEN C.reltablespace ELSE d.dattablespace END, C.relfilenode) AS heap_blks_written,

I'll defer to Andres if he thinks that's awesome, but to me it does not seem right to track some blocks written in a per-relation counter and others in a per-relfilenode counter.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Thu, Jun 6, 2024 at 5:12 AM Jelte Fennema-Nio wrote:
> Looking at ssl_max_protocol_version closer though, to stay really consistent I'd have to change "latest" to be renamed to empty string (i.e. there is no max_protocol_version). I think I might prefer staying consistent over introducing an imho slightly clearer name. Another way to stay consistent would of course be also adding "latest" as an option to ssl_max_protocol_version? What do you think?

As I see it, the issue here is whether the default value would ever be different from the latest value. If not, then using blank to mean the latest seems fine, but if so, then I'd expect blank to mean the default version and latest to mean the latest version.

> I'll look into adding min_protocol_version to the patchset soonish. Some review of the existing code in the first few patches would definitely be appreciated.

Yeah, I understand, and I do want to do that, but keep in mind I've already spent considerable time on this patch set, way more than most others, and if I want to get it committed I'm nowhere close to being done. It's probably multiple weeks of additional work for me, and I think I've probably already spent close to a week on this, and I only work ~48 weeks a year, and there are ~300 patches in the CommitFest. Plus, right now there is no possibility of actually committing anything until after we branch.

And, respectfully, I feel like there has to be some give and take here. I've been trying to give this patch set higher priority because it's in an area that I know something about and have opinions about and also because I can tell that you're kind of frustrated and I don't want you to leave the development community. But, at the same time, I don't think you've done much to help me get my patches committed, and while you have done some review of other people's patches, it doesn't seem to often be the kind of detailed, line-by-line review that is needed to get most patches committed.

So I'm curious how you expect this system to scale.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: question regarding policy for patches to out-of-support branches
On Thu, Jun 6, 2024 at 4:25 AM Hannu Krosing wrote:
> Not absolutely sure, but would at least adding a page to PostgreSQL Wiki about this make sense ?

I feel like we need to do something. Tom says this is a policy, and he's made that comment before about other things, but the fact that they're not memorialized anywhere is a huge problem, IMHO. People don't read or remember every mailing list discussion forever, and even if they did, how would we enumerate all the policies for the benefit of a newcomer?

Maybe this belongs in the documentation, maybe in the wiki, maybe someplace else, but the real issue for me is that policies have to be discoverable by the people who need to adhere to them, and old mailing list discussions aren't.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: ResourceOwner refactoring
On Thu, Jun 6, 2024 at 7:32 AM Heikki Linnakangas wrote:
> > Barring objections, I'll commit this later today or tomorrow. Thanks for the report!
>
> Committed.

I think you may have forgotten to push.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Things I don't like about \du's "Attributes" column
On Thu, Jun 6, 2024 at 5:08 AM Pavel Luzanov wrote:
> But now there are no changes in pg_roles. Just a special interpretation of the two values of the "Connection limit" column:
> 0 - Not allowed (changed from 'No connections')
> -1 - empty string

I think the first of these special interpretations is unnecessary and should be removed. It seems pretty clear what 0 means.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: [multithreading] extension compatibility
On Thu, Jun 6, 2024 at 5:00 AM Heikki Linnakangas wrote:
> If there is some material harm from compiling with multithreading support even if you're not using it, we should try to fix that. I'm not dead set against having a compile-time option, but I don't see the need for it at the moment.

Well, OK, so it sounds like I'm outvoted, at least at the moment. Maybe that will change as more people vote, but for now, that's where we are. Given that, I suppose we want something more like Tristan's patch, but with a more extensible syntax. Does that sound right?

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: race condition in pg_class
te altogether. That sounds difficult and not back-patchable, but I can't classify what this patch set does as anything better than grotty hacks to work around serious design deficiencies. That is not a vote against these patches: I see no better way forward. Nonetheless, I dislike the lack of better options.

I have done only cursory review of the last two patches and don't feel I'm in a place to certify them, at least not now.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: [multithreading] extension compatibility
On Wed, Jun 5, 2024 at 10:09 PM Andres Freund wrote:
> Maybe. I think shipping a mode where users can fairly simply toggle between threaded and process mode will allow us to get this stable a *lot* quicker than if we distribute two builds. Most users don't build from source, distros will have to pick the mode. If they don't choose threaded mode, we'll not find problems. If they do choose threaded mode, we can't ask users to switch to a process based mode to check if the problem is related.

I don't believe that being coercive here is the right approach. I think distros see the value in compiling with as many things turned on as possible; when they ship with something turned off, it's because it's unstable or introduces weird dependencies or has some other disadvantage.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: [multithreading] extension compatibility
On Wed, Jun 5, 2024 at 9:50 PM Andres Freund wrote:
> Depending on the architecture / ABI / compiler options it's often not meaningfully more expensive to access a thread local variable than a "normal" variable.
>
> I think these days it's e.g. more expensive on x86-64 windows, but not linux. On arm the overhead of TLS is more noticeable, across platforms, afaict.

I mean, to me, this still sounds like we want multithreading to be a build-time option.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: [multithreading] extension compatibility
On Wed, Jun 5, 2024 at 8:01 PM Heikki Linnakangas wrote:
> I'm very much in favor of a runtime toggle. To be precise, a PGC_POSTMASTER setting. We'll get a lot more testing if you can easily turn it on/off, and so far I haven't seen anything that would require it to be a compile time option.

I was thinking about global variable annotations. If someone wants to build without multithreading, I think that they won't want to still end up with a ton of variables being changed to thread-local. So I think there has to be a build-time option controlling whether this build supports threading. I suspect there will be other people who want to just shut all of this experimental code off, which is probably going to be a second driver for a build-time toggle.

But even with that, we can still have a GUC controlling whether threading is actually used. Does that make sense to you?

Supposing it does, then how does the extension-marking system need to work? I suppose in this world we don't want any build failures: you're allowed to build a non-thread-aware extension against a threading-capable PostgreSQL; you're just not allowed to load the resulting extension when the server is in threading mode.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Wed, Jun 5, 2024 at 5:16 PM Jelte Fennema-Nio wrote:
> I agree longcancelkey=true is not what we want. In my patch 0004, you can specify max_protocol_version=latest to use the latest protocol version as opt-in. This is a future proof version of protocolversion=3.1 that you're proposing, because it will automatically start using 3.2 when it comes out. So I think that solves your concern here. (although maybe it should be called latest-3.x or something, in case we ever want to add a 4.0 protocol, naming is hard)
>
> I personally quite like the max_protocol_version connection parameter. I think even for testing it is pretty useful to tell libpq what protocol version to try to connect as. It could even be accompanied with a min_protocol_version, e.g. in case you only want the connection attempt to fail when the server does not support this more secure cancel key.

This makes some sense to me. I don't think that I believe max_protocol_version=latest is future-proof: just because I want to opt into this round of new features doesn't mean I feel the same way about the next round. But it may still be a good idea.

I suppose the semantics are that we try to connect with the version specified by max_protocol_version and, if we get downgraded by the server, we abort if we end up below min_protocol_version. I like those semantics, and I think I also like having both parameters, but I'm not 100% sure of the naming. It's a funny use of "max" and "min", because the max is really what we're trying to do and the min is what we end up with, and those terms don't necessarily bring those ideas to mind. I don't have a better idea off-hand, though.

--
Robert Haas
EDB: http://www.enterprisedb.com
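One way to picture those semantics is the following sketch. The parameter names come from the discussion; the integer version encoding (major * 10000 + minor) is invented purely for illustration and matches no actual wire format or libpq API.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of the proposed negotiation: the client asks for
 * max_protocol_version; the server may downgrade to the highest version it
 * supports; the connection is aborted only if the agreed version falls
 * below min_protocol_version.  Returns true on success and stores the
 * agreed version in *agreed.
 */
static bool
negotiate(int min_protocol_version, int max_protocol_version,
          int server_supported, int *agreed)
{
    int         v = (max_protocol_version < server_supported)
        ? max_protocol_version
        : server_supported;

    if (v < min_protocol_version)
        return false;           /* abort: below our required minimum */

    *agreed = v;
    return true;
}
```

This captures the "funny" asymmetry Robert describes: max is what the client attempts, while min only acts as a floor on whatever the downgrade produces.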
Re: [multithreading] extension compatibility
On Wed, Jun 5, 2024 at 4:32 PM Tristan Partin wrote:
> Not entirely sure how I feel about the approach you've taken, but here is a patch that Heikki and I put together for extension compatibility. It's not a build time solution, but a runtime solution. Instead of PG_MODULE_MAGIC, extensions would use PG_MAGIC_MODULE_REENTRANT. There is a GUC called `multithreaded` which controls the variable IsMultithreaded. We operated under the assumption that being able to toggle multithreading and multi-processing without recompiling has value.

That's interesting, because I thought Heikki was against having a runtime toggle.

I don't think PG_MODULE_MAGIC_REENTRANT is a good syntax. It all looks great as long as we only ever need the PG_MODULE_MAGIC line to signal one bit of information, but as soon as you get to two bits it's pretty questionable, and anything more than two bits is insane. If we want to do something with the PG_MODULE_MAGIC line, I think it should involve options-passing of some form rather than just having an alternate macro name.

--
Robert Haas
EDB: http://www.enterprisedb.com
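An options-passing magic line could be sketched with designated initializers, roughly as below. MY_MODULE_MAGIC, ModuleMagic, and the field names are all hypothetical, not an actual PostgreSQL API; the sketch only shows why this scales better than one macro name per capability bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability record an extension would emit once. */
typedef struct ModuleMagic
{
    bool        process_model;
    bool        thread_model;
} ModuleMagic;

/*
 * Options-passing magic macro: new fields can be added to ModuleMagic
 * without inventing PG_MODULE_MAGIC_REENTRANT-style macro variants, and
 * unspecified fields default to zero/false.
 */
#define MY_MODULE_MAGIC(...) \
    static const ModuleMagic module_magic = { __VA_ARGS__ }

/* What an extension's source file would contain. */
MY_MODULE_MAGIC(.process_model = false, .thread_model = true);

/* Accessor the server-side loader might use to inspect the record. */
static const ModuleMagic *
get_module_magic(void)
{
    return &module_magic;
}
```

With this shape, a loader can reject a module whose record lacks `.thread_model = true` when running threaded, while two-bit or ten-bit capability sets stay readable at the declaration site.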
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Wed, Jun 5, 2024 at 1:50 PM Jelte Fennema-Nio wrote:
> Totally agreed, and that's easily achievable because of the NegotiateProtocolVersion message. We're all good on v18 to v17 connections and protocol changes, as long as we make any new behaviour depend on either a protocol parameter or a protocol version bump.

Cool.

> I think at minimum we'll need a mechanism for people to connect using a StartupMessage that doesn't break existing poolers. If users have no way to connect at all to their semi-recently installed connection pooler with the newest libpq, then I expect we'll have many unhappy users. I think it's debatable whether the compatible StartupMessage should be opt-in or opt-out. I'd personally at minimum want to wait one PG release before using the incompatible StartupMessage by default, just to give pooler installs some time to catch up.

I agree that we need such a mechanism, but if the driving feature is cancel-key length, I expect that opt-in isn't going to work very well. Opt-in seems like it would work well with compression or transparent column encryption, but few users will specify a connection string option just to get a longer cancel key length, so the feature won't get much use if we do it that way.

I won't be crushed if we decide to somehow make it opt-in, but I kind of doubt that will happen. Would we make everyone add longcancelkey=true to all their connection strings for one release and then carry that connection parameter until the heat death of the universe even though after the 1-release transition period there would be no reason to ever use it? Would we rip the parameter back out after the transition period and break existing connection strings? Would we have people write protocolversion=3.1 to opt in and then they could just keep that in the connection string without harm, or at least without harm until 3.2 comes out? I don't really like any of these options that well.

--
Robert Haas
EDB: http://www.enterprisedb.com
[multithreading] extension compatibility
nd (3) one could also declare that extension authors should just document what they do or don't support rather than doing anything in code.

However, I think it makes sense to try to make extensions fail to compile against a threaded PostgreSQL unless the extension declares that it supports such builds of PostgreSQL. I think that by doing this, we'll make it a LOT easier for packagers to find out what extensions still need updating. A packager could possibly do light testing of an extension and miss the fact that the extension doesn't actually work properly against a threaded PostgreSQL, but you can't fail to notice a compile failure. There's still going to be some chaos because of (1), but I think we can mitigate that with good messaging: documentation, wiki pages, and blog posts explaining that this is coming and how to adapt to it can help a lot, IMHO.

== Extension Compatibility Solutions ==

The attached patch is a sketch of one possible approach: PostgreSQL signals whether it is multithreaded by defining or not defining PG_MULTITHREADING in pg_config_manual.h, and an extension signals thread-readiness by defining PG_THREADSAFE_EXTENSION before including any PostgreSQL headers other than postgres.h. If PostgreSQL is built multithreaded and the extension does not signal thread-safety, you get something like this:

../pgsql/src/test/modules/dummy_seclabel/dummy_seclabel.c:20:1: error: static assertion failed due to requirement '1 == 0': must define PG_THREADSAFE_EXTENSION or use unthreaded PostgreSQL
PG_MODULE_MAGIC;

I'm not entirely happy with this solution because the results are confusing if PG_THREADSAFE_EXTENSION is declared after including fmgr.h. Perhaps this can be adequately handled by documenting and demonstrating the right pattern, or maybe somebody has a better idea.
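The mechanism described above can be imitated in a self-contained file using C11 _Static_assert. The macro names follow the sketch in the text; the real patch's wiring through the PostgreSQL headers may differ, so treat this as an illustration of the gating pattern only.

```c
#include <assert.h>

/*
 * Imitation of the described gating: the server build would define
 * PG_MULTITHREADING (here we pretend it does), and an extension must
 * define PG_THREADSAFE_EXTENSION before the magic macro or compilation
 * fails with a static-assertion error like the one quoted above.
 */
#define PG_MULTITHREADING 1         /* pretend: threaded server build */
#define PG_THREADSAFE_EXTENSION 1   /* this extension opts in */

#if defined(PG_MULTITHREADING) && !defined(PG_THREADSAFE_EXTENSION)
#define PG_MODULE_MAGIC_CHECK \
    _Static_assert(0, "must define PG_THREADSAFE_EXTENSION " \
                      "or use unthreaded PostgreSQL")
#else
#define PG_MODULE_MAGIC_CHECK \
    _Static_assert(1, "thread-safety declaration accepted")
#endif

/* This line compiles only because both macros above are defined. */
PG_MODULE_MAGIC_CHECK;

/* Stand-in for the rest of the extension. */
static int
extension_compiled(void)
{
    return 1;
}
```

Deleting the PG_THREADSAFE_EXTENSION define in this file reproduces the compile failure, which is the property that makes the scheme useful to packagers: the incompatibility shows up at build time rather than in light runtime testing.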
Another idea I considered was to replace the PG_MODULE_MAGIC; declaration with something that allows for arguments, like PG_MODULE_MAGIC(.process_model = false, .thread_model = true). But on further reflection, that seems like the wrong thing. AFAICS, that's going to tell you at runtime about something that you really want to know at compile time. But this kind of idea might need more thought if we decide that the *same* build of PostgreSQL can either launch processes or threads per session, because then we'd need to know which extensions were available in whichever mode applied to the current session.

That's all I've got for today.

--
Robert Haas
EDB: http://www.enterprisedb.com

v1-0001-POC-Extension-API-for-multithreading.patch
Description: Binary data
Re: Make query cancellation keys longer
On Fri, Mar 8, 2024 at 5:20 PM Heikki Linnakangas wrote:
> The nice thing about relying on the message length was that we could just redefine the CancelRequest message to have a variable length secret, and old CancelRequest with 4-byte secret was compatible with the new definition too. But it doesn't matter much, so added an explicit length field.

I think I liked your original idea better than this one.

> One consequence of this patch that I didn't mention earlier is that it makes libpq incompatible with server version 9.2 and below, because the minor version negotiation was introduced in version 9.3. We could teach libpq to disconnect and reconnect with minor protocol version 3.0, if connecting with 3.1 fails, but IMHO it's better to not complicate this and accept the break in backwards-compatibility. 9.3 was released in 2013. We dropped pg_dump support for versions older than 9.2 a few years ago, this raises the bar for pg_dump to 9.3 as well.

I think we shouldn't underestimate the impact of bumping the minor protocol version. Minor version negotiation is probably not supported by all drivers and Jelte says that it's not supported by any poolers, so for anybody but direct libpq users, there will be some breakage. Now, on the one hand, as Jelte has said, there's little value in having a protocol version if we're too afraid to make use of it. But on the other hand, is this problem serious enough to justify the breakage we'll cause? I'm not sure. It seems pretty silly to be using a 32-bit value for this in 2024, but I'm sure some people aren't going to like it, and they may not all have noticed this thread. On the third hand, if we do this, it may help to unblock a bunch of other pending patches that also want to do protocol-related things.

--
Robert Haas
EDB: http://www.enterprisedb.com
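To make the two alternatives concrete: the explicit-length variant under discussion frames the secret roughly as below. The field layout (16-bit big-endian length prefix) is illustrative only, not the actual wire format in either Heikki's patch or the v3.0 protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Encode a variable-length cancel secret with an explicit length field:
 * a 2-byte big-endian length followed by the secret bytes.  Returns the
 * total number of bytes written.  With the message-length-based approach
 * Heikki originally described, this prefix would be unnecessary because
 * the secret's size falls out of the overall message length.
 */
static size_t
encode_cancel_secret(uint8_t *buf, const uint8_t *secret, uint16_t len)
{
    buf[0] = (uint8_t) (len >> 8);
    buf[1] = (uint8_t) (len & 0xFF);
    memcpy(buf + 2, secret, len);
    return (size_t) len + 2;
}

/* Read back the length field from an encoded buffer. */
static uint16_t
decode_cancel_secret_len(const uint8_t *buf)
{
    return (uint16_t) ((buf[0] << 8) | buf[1]);
}
```

The trade-off in the quoted text is exactly this prefix: an old 4-byte secret parses identically under a length-of-message rule, whereas an explicit length field changes the message layout for everyone.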
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Wed, Jun 5, 2024 at 10:06 AM Jelte Fennema-Nio wrote:
> FYI Heikki his patchset is here:
> https://www.postgresql.org/message-id/flat/508d0505-8b7a-4864-a681-e7e5edfe32aa%40iki.fi
>
> Afaict there's no way to implement this with old clients supporting the new message. So it would need to be opt-in from the client perspective, either using a version bump or a protocol parameter (e.g. large_cancel=true). IMHO a version bump would make more sense for this one.

Well, hang on. The discussion on that thread suggests that this is only going to break connections to servers that don't have NegotiateProtocolVersion. Personally, I think that's fairly OK. It's true that connections to, I guess, pre-9.3 servers will break, but there shouldn't be tons of those left around. It's not great to break connectivity to such servers, of course, but it seems kind of OK. What I really don't want is for v18 to break connections to v17 servers. That would be exponentially more disruptive.

I do take your point that poolers haven't added support for NegotiateProtocolVersion yet, but I bet that will change pretty quickly once there's a benefit to doing so. I think it's a lot easier for people to install a new pooler version than a new server version, because the server has the data in it. Also, it's not like they haven't had time.

But the real question here is whether we think the longer cancel key is really important enough to justify the partial compatibility break. I'm not deathly opposed to it, but I think it's debatable and I'm sure some people are going to be unhappy.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: meson "experimental"?
On Tue, Jun 4, 2024 at 12:56 PM Peter Eisentraut wrote: > Experimental is probably too harsh a word now. > > But it doesn't have feature parity with configure/make yet (for example, > no .bc files), so I wouldn't recommend it for production or packaging > builds. That's unfortunate. :-( > Then, there are issues like [0]. If it's experimental, then this is > like, meh, we'll fix it later. If not, then it's a bug. This feels like a case where I'd have to read the entire thread to understand what the issue is. > More generally, I don't think we've really done a comprehensive check of > how popular extensions build against pgxs-from-meson. Packagers that > make their production builds using meson now might be signing up for a > long tail of the unknown. That's a fair point, and I don't know what to do about it, but it seems like something that needs to be addressed. -- Robert Haas EDB: http://www.enterprisedb.com
Re: meson "experimental"?
On Wed, May 29, 2024 at 2:41 AM Peter Eisentraut wrote: > "Alternatively, PostgreSQL can be built using Meson. This is the only > option for building PostgreSQL in Windows using Visual Something[*]. > For other platforms, using Meson is currently experimental." Is it, though? I feel like we're beyond the experimental stage now. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Sat, May 25, 2024 at 6:40 AM Jelte Fennema-Nio wrote: > Of the proposed changes so far on the mailing list the only 2 that > would fall under 1 imho are: > 1. The ParameterSet message > 2. Longer than 32bit secret in BackendKeyData Yeah. I wonder how Heikki thinks he can do (2) without breaking everything. Maybe just adding an extra, optional, longer field onto the existing message and hoping that client implementations ignore the extra field? If that's not good enough, then I don't understand how that can be done without breaking compatibility in a fundamental and relatively serious way -- at which point maybe bumping the protocol version is the right thing to do. Really, what I'm strongly opposed to is not bumping the version, but rather doing things that break compatibility such that we need to bump the version. *If* we have a problem that's sufficiently serious to justify breaking compatibility anyway, then we don't really lose anything by bumping the version, and indeed we gain something. I just want us to be searching for ways to avoid breaking interoperability, rather than seeking them out. If it becomes impossible for a PG 18 (or whatever version) server to communicate with earlier servers without specifying special options, or worse yet at all, then a lot of people are going to be very sad about that. We will get a bunch of complaints and a bunch of frustrated users, and they will not be impressed by vague claims of necessity or desirability. They'll just be mad. The question for me here is not "what is the theoretically right thing to do?" but rather "what am I going to tell angry users when they demand to know why I committed the patch that broke this?". "The old way was insecure so we had to change it" might be a good enough reason for people to calm down and stop yelling at me, but "it's no use having a protocol version if we never bump it" definitely won't be. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Fri, May 24, 2024 at 6:09 PM Jelte Fennema-Nio wrote: > Separating it from the GUC infrastructure will mean we need to > duplicate a lot of the infrastructure. Assuming we don't care about > SHOW or pg_settings (which I agree are not super important), the > things that we would want for protocol parameters to have that guc.c > gives us for free are: > 1. Reporting the value of the parameter to the client (done using > ParameterStatus) > 2. Parsing and validating of the input, bool, int, enum, etc, but also > check_hook and assign_hook. > 3. Logic in all connection poolers to change GUC values to the > client's expected values whenever a server connection is handed off to > a client > 4. Permission checking, if we want some protocol extensions to only be > configurable by a highly privileged user > > All of those things would have to be duplicated/re-implemented if we > make protocol parameters their own dedicated thing. Doing that work > seems like a waste of time to me, and would imho add much more > complexity than the proposed 65 lines of code in 0011. I think that separating (1) and (3) is going to make us happy rather than sad. For example, if a client speaks to an intermediate pooler which speaks to a server, the client-pooler connection could have different values from the pooler-server connection, and then if everything is done via ParameterStatus messages it's actually more complicated for the pooler, which will have to edit ParameterStatus messages as they pass through, and possibly also create new ones out of nothing. If we separate things, the pooler can always pass ParameterStatus messages unchanged, and only has to worry about the separate infrastructure for handling these new things. Said differently, I'd agree with you if (a) ParameterStatus weren't so dubiously designed and (b) all poolers were going to want every protocol parameter to match on the input side and the output side. 
And maybe in practice (b) will happen, but I want to leave the door open to cases where it doesn't, and as soon as that happens, I think this becomes a hassle, whereas separate mechanisms don't really hurt much. As you and I discussed a bit in person last week, two for loops rather than one in the pooler isn't really that big of a deal. IMHO, that's a small price to pay for an increased chance of not boxing ourselves into a corner depending on how these parameters end up getting used in practice. As for (2) and (4), I agree there's some duplication, but I think it's quite minor. We have existing facilities for parsing integers and booleans that are reused by a lot of code already, and this is just one more place that can use them. That infrastructure is not GUC-specific. The permissions-checking stuff isn't either. The vast bulk of the GUC infrastructure is concerned with (1) allowing for GUCs to be set from many different sources and (2) handling their transactional nature. We don't need or want either of those characteristics here, so a lot of that complexity just isn't needed. -- Robert Haas EDB: http://www.enterprisedb.com
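The pooler argument above can be sketched as a toy model. Nothing here is the real protocol or a real pooler; the message shapes and function names are made up purely to show why shared ParameterStatus forces in-flight editing while a separate mechanism allows pass-through plus a second loop:

```python
# Toy model: a pooler sits between a client and a server; status messages
# are simple (name, value) pairs.

def relay_shared(status_msgs, client_values):
    # GUC-style: ParameterStatus is shared between hops, so the pooler
    # must edit any message whose client-facing value differs from what
    # the server reported, and possibly synthesize new ones.
    return [(name, client_values.get(name, val)) for name, val in status_msgs]

def relay_separate(status_msgs, protocol_params, client_protocol_params):
    # Separate mechanism: ParameterStatus passes through untouched, and a
    # second loop ("two for loops rather than one") reconciles the
    # protocol parameters on their own channel.
    passed_through = list(status_msgs)
    reconciled = [(name, client_protocol_params.get(name, val))
                  for name, val in protocol_params]
    return passed_through, reconciled
```

In the shared model every ParameterStatus is a rewrite candidate; in the separate model the pooler's ParameterStatus handling stays trivial even when client-side and server-side protocol parameters diverge.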
Re: DROP OWNED BY fails to clean out pg_init_privs grants
On Fri, May 24, 2024 at 4:00 PM Tom Lane wrote: > > Oh! That does seem like it would make what I said wrong, but how would > > it even know who the original owner was? Shouldn't we be recreating > > the object with the owner it had at dump time? > > Keep in mind that the whole point here is for the pg_dump script to > just say "CREATE EXTENSION foo", not to mess with the individual > objects therein. So the objects are (probably) going to be owned by > the user that issued CREATE EXTENSION. > > In the original conception, that was the end of it: what you got for > the member objects was whatever state CREATE EXTENSION left behind. > The idea of pg_init_privs is to support dump/reload of subsequent > manual alterations of privileges for extension-created objects. > I'm not, at this point, 100% certain that that's a fully realizable > goal. But I definitely think it's insane to expect that to work > without also tracking changes in the ownership of said objects. > > Maybe forbidding ALTER OWNER on extension-owned objects isn't > such a bad idea? I think the root of my confusion, or at least one of the roots of my confusion, was whether we were talking about altering the extension or the objects within the extension. If somebody creates an extension as user nitin and afterwards changes the ownership to user lucia, then I would hope for the same pg_init_privs contents as if user lucia had created the extension in the first place. I think that might be hard to achieve in complex cases, if the extension changes the ownership of some objects during the creation process, or if some ownership has been changed afterward. But in the simple case where nothing like that happens, it sounds possible. If, on the other hand, somebody creates the extension and modifies the ownership of an object in the extension, then I agree that pg_init_privs shouldn't be updated. 
However, it also seems pretty clear that pg_init_privs is not recording enough state for pg_dump to mimic the ownership change, because it only records the original privileges, not the original ownership. So, can we recreate the original privileges without recreating the original ownership? It doesn't really seem like this is going to work in general. If nitin creates the extension and grants privileges to lucia, and then the ownership of the extension is changed to swara, then nitin is no longer a valid grantor. Even if we could fix that by modifying the grants to substitute swara for nitin, that could create impermissible circularities in the permissions graph. Maybe there are some scenarios where we can fix things up, but it doesn't seem possible in general. In a perfect world, the fix here is probably to have pg_init_privs or something similar record the ownership as well as the permissions, but that is not back-patchable and nobody's on the hook to fix up somebody else's feature. So what do we do? Imposing a constraint that you can't change the ownership of an extension-owned object, as you propose, seems fairly likely to break a few existing extension scripts, and also won't fix existing instances that are already broken, but maybe it's still worth considering if we don't have a better idea. Another line of thinking might be to somehow nerf pg_init_privs, or the use of it, so that we don't even try to cover this case e.g. if the ownership is changed, we nuke the pg_init_privs entry or resnap it to the current state, and the dump fails to recreate the original state but we get out from under that by declaring the case unsupported (with appropriate documentation changes, hopefully). pg_init_privs seems adequate for normal system catalog entries where the ownership shouldn't ever change from the bootstrap superuser, but extensions seem like they require more infrastructure. I think the only thing we absolutely have to fix here is the dangling ACL references. 
Those are a hazard, because the OID could be reused by an unrelated role, which seems like it could even give rise to security concerns. Making the whole pg_init_privs system actually work for this case would be even better, but that seems to require filling in at least one major hole in the original design, which is a lot to ask for (1) a back-patched fix or (2) a patch that someone who is not the original author has to try to write and be responsible for despite not being the one who created the problem. -- Robert Haas EDB: http://www.enterprisedb.com
Re: DROP OWNED BY fails to clean out pg_init_privs grants
On Tue, May 28, 2024 at 9:06 PM Hannu Krosing wrote: > We should definitely also fix pg_dump, likely just checking that the > role exists when generating REVOKE commands (may be a good practice > for other cases too so instead of casting to ::regrole do the actual > join) > > ## here is the fix for pg_dump > > While flying to Vancouver I looked around in pg_dump code, and it > looks like the easiest way to mitigate the dangling pg_init_priv > entries is to replace the query in pg_dump with one that filters out > invalid entries +1 for this approach. I agree with Tom that fixing this in REVOKE is a bad plan; REVOKE is used by way too many things other than pg_dump, and the behavior change is not in general desirable. -- Robert Haas EDB: http://www.enterprisedb.com
Re: relfilenode statistics
On Mon, Jun 3, 2024 at 7:11 AM Bertrand Drouvot wrote: > The main argument is that we currently don’t have writes counters for > relations. > The reason is that we don’t have the relation OID when writing buffers out. OK. > Second argument is that this is also beneficial for the "Split index and > table statistics into different types of stats" thread (mentioned in the > previous > message). It would allow us to avoid additional branches in some situations > (like > the one mentioned by Andres in the link I provided up-thread). OK. > We’d move the current buffer read and buffer hit counters from the relation > stats > to the relfilenode stats (while still being able to retrieve them from the > pg_statio_all_tables/indexes views: see the example for the new > heap_blks_written > stat added in the patch). Generally speaking, I think that tracking counters > at > a common level (i.e relfilenode level instead of table or index level) is > beneficial (avoid storing/allocating space for the same counters in multiple > structs) and sounds more intuitive to me. Hmm. So if I CLUSTER or VACUUM FULL the relation, the relfilenode changes. Does that mean I lose all of those stats? Is that a problem? Or is it good? Or what? I also thought about the other direction. Suppose I drop a relation and create a new one that gets a different relation OID but the same relfilenode. But I don't think that's a problem: dropping the relation should forcibly remove the old stats, so there won't be any conflict in this case. -- Robert Haas EDB: http://www.enterprisedb.com
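The design question raised here (do rewrite-generated relfilenodes inherit the old counters, or start fresh?) can be modeled in a few lines. This is a toy dictionary model, not the pgstat machinery; `bump` and `rewrite` are invented names:

```python
# Toy model: stats keyed by relfilenode. A CLUSTER/VACUUM FULL assigns a
# new relfilenode, so the old key either gets copied or abandoned.

stats = {}  # relfilenode -> {counter_name: value}

def bump(relfilenode, counter):
    stats.setdefault(relfilenode, {}).setdefault(counter, 0)
    stats[relfilenode][counter] += 1

def rewrite(old_rfn, new_rfn, keep_stats):
    # The two options from the thread: copy the old relfilenode's stats
    # to the new one, or decide they no longer make sense and drop them.
    if keep_stats:
        stats[new_rfn] = stats.pop(old_rfn, {})
    else:
        stats.pop(old_rfn, None)  # new relfilenode starts from scratch
```

Either choice is mechanically easy; the hard part, as the email says, is deciding which behavior users should see.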
Re: relfilenode statistics
Hi Bertrand, It would be helpful to me if the reasons why we're splitting out relfilenode stats could be more clearly spelled out. I see Andres's comment in the thread to which you linked, but it's pretty vague about why we should do this ("it's not nice") and whether we should do this ("I wonder if this is an argument for") and maybe that's all fine if Andres is going to be the one to review and commit this, but even then it would be nice if the rest of us could follow along from home, and right now I can't. The commit message is often a good place to spell this kind of thing out, because then it's included with every version of the patch you post, and may be of some use to the eventual committer in writing their commit message. The body of the email where you post the patch set can be fine, too. ...Robert
Re: DROP OWNED BY fails to clean out pg_init_privs grants
On Fri, May 24, 2024 at 2:57 PM Tom Lane wrote: > Doesn't seem right to me. That will give pg_dump the wrong idea > of what the initial privileges actually were, and I don't see how > it can construct correct delta GRANT/REVOKE on the basis of false > information. During the dump reload, the extension will be > recreated with the original owner (I think), causing its objects' > privileges to go back to the original pg_init_privs values. Oh! That does seem like it would make what I said wrong, but how would it even know who the original owner was? Shouldn't we be recreating the object with the owner it had at dump time? > Although ... this is tickling a recollection that pg_dump doesn't > try very hard to run CREATE EXTENSION with the same owner that > the extension had originally. That's a leftover from the times > when basically all extensions required superuser to install, > and of course one superuser is as good as the next. There might > be some work we have to do on that side too if we want to up > our game in this area. Hmm, yeah. > Another case that's likely not handled well is what if the extension > really shouldn't have its original owner (e.g. you're using > --no-owner). If it's restored under a new owner then the > pg_init_privs data certainly doesn't apply, and it feels like it'd > be mostly luck if the precomputed delta GRANT/REVOKEs lead to a > state you like. I'm not sure exactly how this computation works, but if tgl granted nmisch privileges on an object and the extension is now owned by rhaas, it would seem like the right thing to do would be for rhaas to grant nmisch those same privileges. Conversely if tgl started with privileges to do X and Y and later was granted privileges to do Z and we dump and restore such that the extension is owned by rhaas, I'd presume rhaas would end up with those same privileges. 
I'm probably too far from the code to give terribly useful advice here, but I think the expected behavior is that the new owner replaces the old one for all purposes relating to the owned object(s). At least, I can't currently see what else makes any sense. -- Robert Haas EDB: http://www.enterprisedb.com
Re: apply_scanjoin_target_to_paths and partitionwise join
On Fri, May 24, 2024 at 2:02 PM wrote: > I am not sure, whether it's really a bug. I personally wouldn't be brave > enough to back patch this. I don't want to deal with complaining end > users. Suddenly their optimizer, which always had horrible estimates, > was actually able to do harmful stuff with them. Only due to a minor > version upgrade. I think that's a bad idea to backpatch something with > complex performance implications. Especially since they might even be > based on potentially inaccurate data... +1. -- Robert Haas EDB: http://www.enterprisedb.com
Re: DROP OWNED BY fails to clean out pg_init_privs grants
On Fri, May 24, 2024 at 11:59 AM Tom Lane wrote: > Thinking about this some more: the point of pg_init_privs is to record > an object's privileges as they stood at the end of CREATE EXTENSION > (or extension update), with the goal that pg_dump should be able to > compute the delta between that and the object's current privileges > and emit GRANT/REVOKE commands to restore those current privileges > after a fresh extension install. (We slide gently past the question > of whether the fresh extension install is certain to create privileges > matching the previous pg_init_privs entry.) +1 to all of this. > So this goal seems to > mean that neither ALTER OWNER nor REASSIGN OWNED should touch > pg_init_privs at all, as that would break its function of recording > a historical state. Only DROP OWNED should get rid of pg_init_privs > grants, and that only because there's no choice -- if the role is > about to go away, we can't hang on to a reference to its OID. But I would have thought that the right thing to do to pg_init_privs here would be essentially s/$OLDOWNER/$NEWOWNER/g. I know I'm late to the party here, but why is that idea wrong? -- Robert Haas EDB: http://www.enterprisedb.com
Re: warn if GUC set to an invalid shared library
On Fri, May 24, 2024 at 11:48 AM Justin Pryzby wrote: > You give me too much credit.. Gee, usually I'm very good at avoiding that mistake. :-) > We don't want to change the behavior to allow this to succeed -- it > would allow leaving the server in a state that it fails to start (rather > than helping to avoid doing so, as intended by this thread). +1. > Maybe there should be a comment explaining why PGC_FILE is used, and > maybe there should be a TAP test for the behavior, but they're pretty > unrelated to this thread. So, I've dropped the 001 patch. +1 for that, too. + /* Note that filename was already canonicalized */ I see that this comment is copied from load_libraries(), but I don't immediately see where the canonicalization actually happens. Do you know, or can you find out? Because that's crucial here, else stat() might not target the real filename. I wonder if it will anyway. Like, couldn't the library be versioned, and might not dlopen() try a few possibilities? + errdetail("The server will currently fail to start with this setting."), + errdetail("New sessions will currently fail to connect with the new setting.")); I understand why these messages have the word "currently" in them, but I bet the user won't. I'm not sure exactly what to recommend at the moment (and I'm quite busy today due to the upcoming conference) but I think we should try to find some way to rephrase these. -- Robert Haas EDB: http://www.enterprisedb.com
Re: First draft of PG 17 release notes
On Thu, May 23, 2024 at 11:04 PM Bruce Momjian wrote: > For a case where O(N^2) become O(N), we might not even know the > performance change since it is a micro-optimization. That is why I > suggested we call it "Internal Performance". I don't get this at all. If we can't tell the difference between O(N^2) and O(N), then N was small enough that there wasn't any real need to optimize in the first place. I think we should be assuming that if somebody took the trouble to write a patch, the difference did matter. Hence the change would be user-visible, and should be documented. "Internal Performance" doesn't make a lot of sense to me as a section heading. What does "Internal" mean here as opposed to "General"? I suspect you mean to imply that the user won't be able to tell the difference, but I doubt that very much. We make performance improvements because they are user-visible. If a performance improvement is so miniscule that nobody would ever notice the difference, well then I don't think it needs to be release-noted at all, and we might have a few changes like that where people were mostly aiming for code cleanliness. But in general, what people do is look for something that's slow (for them) and try to make it faster. So the presumption should be that a performance feature has a meaningful impact on users, and then in rare cases we may decide otherwise. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Thu, May 23, 2024 at 4:00 PM Tom Lane wrote: > I don't recall exactly what I thought earlier, but now I think we'd > be better off with separate infrastructure. guc.c is unduly complex > already. Perhaps there are bits of it that could be factored out > and shared, but I bet not a lot. OK. That seems fine to me, but I bet Jelte is going to disagree. -- Robert Haas EDB: http://www.enterprisedb.com
Re: warn if GUC set to an invalid shared library
On Thu, Jul 6, 2023 at 4:15 PM Justin Pryzby wrote: > I'm still hoping. Hi, I got asked to take a look at this thread. First, I want to explain why I think this thread hasn't gotten as much feedback as Justin was hoping. It is always possible for any thread to have that problem just because people are busy or not interested. However, I think in this case an aggravating factor is that the discussion is very "high context"; it's hard to understand what the open problems are without reading a lot of emails and understanding how they all relate to each other. One of the key questions is whether we should replace PGC_S_FILE with PGC_S_TEST in AlterSystemSetConfigFile. I originally thought, based on reading one of the emails, that the question was whether we should do that out of some sense of intellectual purity, and my answer was "probably not, because that would change the behavior in a way that doesn't seem good." But then I realized, reading another email, that Justin already knew that the behavior would change, or at least I'm 90% certain that he knows that. So now I think the question is whether we want that behavior change, but he only provided one example of how the behavior changes, and it's not clear how many other scenarios are affected or in what way, so it's still a bit hard to answer. Plus, it took me 10 minutes to figure out what the question was. I think that if the question had been phrased in a way that was easily understandable to any experienced PostgreSQL user, it's a lot more likely that one or more people would have had an opinion on whether it was good or bad. As it is, I think most people probably didn't understand the question, and the people who did understand the question may not have wanted to spend the time to do the research that they would have needed to do to come up with an intelligent answer. 
I'm not saying any of this to criticize Justin or to say that he did anything wrong, but I think we have lots of examples of stuff like this on the mailing list, where people are sad because they didn't get an answer, but don't always realize that there might be things they could do to improve their chances. On the behavior change itself, it seems to me that there's a big difference between shared_preload_libraries=bogus and work_mem=bogus. The former is valid or invalid according to whether bogus.so exists in an appropriate directory on the local machine, but the latter is categorically invalid. I'm not sure to what degree we have the infrastructure to distinguish those cases, but to the extent that we do, handling them differently is completely defensible. It's reasonable to allow the first one on the theory that the presently-invalid configuration may at a later time become valid, but that's not reasonable in the second case. So if changing PGC_S_FILE to PGC_S_TEST in AlterSystemSetConfigFile is going to have the effect of allowing garbage values into postgresql.auto.conf that would currently get blocked, I think that's a bad plan and we shouldn't do it. But it's quite possible I'm not fully understanding the situation. -- Robert Haas EDB: http://www.enterprisedb.com
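The distinction drawn above, between a syntactically malformed value and a value whose validity depends on external state, can be sketched like this. This is a hypothetical classifier, much simpler than the real GUC parsing code (which uses parse_int and friends), with invented function names:

```python
import os
import re

def work_mem_is_valid(value: str) -> bool:
    # Categorically decidable: syntax alone settles it, and no later
    # change to the environment can rescue a malformed value like "bogus".
    # (Crude pattern standing in for the real integer-with-unit parser.)
    return re.fullmatch(r"\d+\s*(kB|MB|GB|TB)?", value.strip()) is not None

def library_exists_now(name: str, libdir: str) -> bool:
    # Environmentally decidable: the answer can change when the file is
    # installed later, which is why a warning rather than a hard error is
    # defensible for shared_preload_libraries.
    return os.path.exists(os.path.join(libdir, name + ".so"))
```

A tool that applies the first check as a hard error and the second as a warning would match the policy argued for in the last paragraph.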
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Thu, May 23, 2024 at 2:12 PM Tom Lane wrote: > I got around to looking through this thread in preparation for next > week's patch review session. I have a couple of opinions to offer: I agree with these opinions. Independently of that, I'm glad you shared them. I think part of the reason we ended up with the protocol parameters = GUCs thing is because you seemed to be concurring with that approach upthread. I think it was Jelte's idea originally, but I interpreted some of your earlier remarks to be supporting it. I'm not sure whether you've revised your opinion, or just refined it, or whether we misinterpreted your earlier remarks. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Avoid orphaned objects dependencies, take 3
On Thu, May 23, 2024 at 12:19 AM Bertrand Drouvot wrote: > The reason why we are using a dirty snapshot here is for the cases where we > are > recording a dependency on a referenced object that we are creating at the same > time behind the scene (for example, creating a composite type while creating > a relation). Without the dirty snapshot, then the object we are creating > behind > the scene (the composite type) would not be visible and we would wrongly > assume > that it has been dropped. The usual reason for using a dirty snapshot is that you want to see uncommitted work by other transactions. It sounds like you're saying you just need to see uncommitted work by the same transaction. If that's true, I think using HeapTupleSatisfiesSelf would be clearer. Or maybe we just need to put CommandCounterIncrement() calls in the right places to avoid having the problem in the first place. Or maybe this is another sign that we're doing the work at the wrong level. -- Robert Haas EDB: http://www.enterprisedb.com
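The difference between the two snapshot semantics mentioned here can be shown with a toy visibility model. Real heap tuple visibility is far more involved (hint bits, MVCC snapshots, subtransactions); this only captures the one distinction at issue:

```python
# Toy model: each "row" records the creating transaction (xmin), whether
# that transaction committed, and whether it aborted.

def visible_to_dirty(row: dict) -> bool:
    # A dirty snapshot sees committed rows plus *anyone's* in-progress
    # work, skipping only rows whose creating transaction aborted.
    return not row.get("aborted", False)

def visible_to_self(row: dict, my_xid: int) -> bool:
    # HeapTupleSatisfiesSelf-style: committed rows plus only *our own*
    # uncommitted work -- which is all the quoted scenario needs, since
    # the composite type is created by the same transaction.
    return (not row.get("aborted", False)) and (
        row["committed"] or row["xmin"] == my_xid)
```

If the only requirement is seeing one's own same-transaction work, the second predicate suffices, which is the point of suggesting HeapTupleSatisfiesSelf (or a well-placed CommandCounterIncrement) over a dirty snapshot.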
Re: State of pg_createsubscriber
On Thu, May 23, 2024 at 4:40 AM Amit Kapila wrote: > Sorry, for the wrong link. See [1] for the correct link for --dry-run > related suggestion: > > [1] > https://www.postgresql.org/message-id/CAA4eK1J2fAvsJ2HihbWJ_GxETd6sdqSMrZdCVJEutRZRpm1MEQ%40mail.gmail.com Yeah, those should definitely be fixed. Seems simple enough. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Shared detoast Datum proposal
On Wed, May 22, 2024 at 9:46 PM Andy Fan wrote: > Please give me one more chance to explain this. What I mean is: > > Take SELECT f(a) FROM t1 join t2...; for example, > > When we read the Datum of a Var, we read it from tts->tts_values[*], no > matter what kind of TupleTableSlot it is. So if we can put the "detoast > datum" back to the "original" tts_values, then nothing needs to be > changed. Yeah, I don't think you can do that. > - Saving the "detoast datum" version into tts_values[*] doesn't break > the above semantics and the ExprEval engine just gets a detoasted version > so it doesn't need to detoast it again. I don't think this is true. If it is true, it needs to be extremely well-justified. Even if this seems to work in simple cases, I suspect there will be cases where it breaks badly, resulting in memory leaks or server crashes or some other kind of horrible problem that I can't quite imagine. Unfortunately, my knowledge of this code isn't fantastic, so I can't say exactly what bad thing will happen, and I can't even say with 100% certainty that anything bad will happen, but I bet it will. It seems like it goes against everything I understand about how TupleTableSlots are supposed to be used. (Andres would be the best person to give a definitive answer here.) > - The key point is the memory management and efficiency. For now I think > all the places where "slot->tts_nvalid" is set to 0 means the > tts_values[*] is no longer valid, then this is the place we should > release the memory for the "detoast datum". All the other places like > ExecMaterializeSlot or ExecCopySlot also need to think about the > "detoast datum" as well. This might be a way of addressing some of the issues with this idea, but I doubt it will be acceptable. I don't think we want to complicate the slot API for this feature. There's too much downside to doing that, in terms of performance and understandability. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Shared detoast Datum proposal
On Tue, May 21, 2024 at 10:02 PM Andy Fan wrote: > One more thing I want to highlight: "syscache" is used for metadata > and *detoast cache* is used for user data. User data is more > likely to be bigger than metadata, so cache size control (hence eviction logic > running more often) is more necessary in this case. The first graph in [1] > (including the quota text) highlights this idea. Yes. > I think that is the same idea as design a. Sounds similar. > I think the key points include: > > 1. Where to put the *detoast value* so that the ExprEval system can find > it cheaply. The solution in A is slot->tts_values[*]. > > 2. How to manage the memory for holding the detoast value. > > The solution in A is: > > - using a lazily created MemoryContext in slot. > - let the detoast system detoast the datum into the designed > MemoryContext. > > 3. How to let Expr evaluation run the extra cycles when needed. I don't understand (3) but I agree that (1) and (2) are key points. In many - nearly all - circumstances just overwriting slot->tts_values is unsafe and will cause problems. To make this work, we've got to find some way of doing this that is compatible with the general design of TupleTableSlot, and I think in that API, just about the only time you can touch slot->tts_values is when you're trying to populate a virtual slot. But often we'll have some other kind of slot. And even if we do have a virtual slot, I think you're only allowed to manipulate slot->tts_values up until you call ExecStoreVirtualTuple. That's why I mentioned using something temporary. You could legally (a) materialize the existing slot, (b) create a new virtual slot, (c) copy over the tts_values and tts_isnull arrays, (d) detoast any datums you like from tts_values and store the new datums back into the array, (e) call ExecStoreVirtualTuple, and then (f) use the new slot in place of the original one. That would avoid repeated detoasting, but it would also have a bunch of overhead. 
Having to fully materialize the existing slot is a cost that you wouldn't always need to pay if you didn't do this, but you have to do it to make this work. Creating the new virtual slot costs something. Copying the arrays costs something. Detoasting costs quite a lot potentially, if you don't end up using the values. If you end up needing a heap tuple representation of the slot, you're going to now have to make a new one rather than just using the one that came straight from the buffer. So I don't think we could do something like this unconditionally. We'd have to do it only when there was a reasonable expectation that it would work out, which means we'd have to be able to predict fairly accurately whether it was going to work out. But I do agree with you that *if* we could make it work, it would probably be better than a longer-lived cache. -- Robert Haas EDB: http://www.enterprisedb.com
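The copy-then-detoast sequence described above can be modeled outside the executor. Plain Python lists stand in for slots here; the real TupleTableSlot API is C (ExecMaterializeSlot, ExecStoreVirtualTuple, pg_detoast_datum) and considerably more involved, and the string prefix is just a marker for "expensive to expand":

```python
def detoast(value: str) -> str:
    # Stand-in for expanding a toasted datum; real detoasting operates on
    # varlena headers, not strings.
    return value[len("toasted:"):] if value.startswith("toasted:") else value

def copy_slot_with_detoast(src_values, src_isnull, wanted):
    # Materializing the source and creating a fresh virtual slot are
    # implicit here: we simply build new arrays.
    dst_values = list(src_values)      # copy the tts_values array
    dst_isnull = list(src_isnull)      # copy the tts_isnull array
    for i in wanted:                   # detoast the chosen attributes
        if not dst_isnull[i]:
            dst_values[i] = detoast(dst_values[i])
    # The caller would then "store" this new slot and use it in place of
    # the original; the original slot is left untouched.
    return dst_values, dst_isnull
```

Even this toy version shows where the overhead lives: every attribute is copied, and every `wanted` attribute is detoasted whether or not it is ultimately used, which is why the email argues it can't be done unconditionally.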
Re: Avoid orphaned objects dependencies, take 3
On Wed, May 22, 2024 at 6:21 AM Bertrand Drouvot wrote: > I started initially with [1] but it was focusing on function-schema only. Yeah, that's what I thought we would want to do. And then just extend that to the other cases. > Then I proposed [2] making use of a dirty snapshot when recording the > dependency. > But this approach appeared to be "scary" and it was still failing to close > some race conditions. The current patch still seems to be using dirty snapshots for some reason, which struck me as a bit odd. My intuition is that if we're relying on dirty snapshots to solve problems, we likely haven't solved the problems correctly, which seems consistent with your statement about "failing to close some race conditions". But I don't think I understand the situation well enough to be sure just yet. > Then, Tom proposed another approach in [2] which is that "creation DDL will > have > to take a lock on each referenced object that'd conflict with a lock taken by > DROP". > This is the one the current patch is trying to implement. It's a clever idea, but I'm not sure that I like it. > I think there is 2 cases here: > > First case: the "conflicting" lock mode is for one of our own lock then > LockCheckConflicts() > would report this as a NON conflict. > > Second case: the "conflicting" lock mode is NOT for one of our own lock then > LockCheckConflicts() > would report a conflict. But I've the feeling that the existing code would > already lock those sessions. > > Was your concern about "First case" or "Second case" or both? The second case, I guess. It's bad to take a weaker lock first and then a stronger lock on the same object later, because it can lead to deadlocks that would have been avoided if the stronger lock had been taken at the outset. Here, it seems like things would happen in the other order: if we took two locks, we'd probably take the stronger lock in the higher-level code and then the weaker lock in the dependency code. 
That shouldn't break anything; it's just a bit inefficient. My concern was really more about the maintainability of the code. I fear that if we add code that takes heavyweight locks in surprising places, we might later find the behavior difficult to reason about. Tom, what is your thought about that concern? -- Robert Haas EDB: http://www.enterprisedb.com
Re: State of pg_createsubscriber
On Mon, May 20, 2024 at 2:42 AM Amit Kapila wrote: > Just to summarize, apart from BF failures for which we had some > discussion, I could recall the following open points: > > 1. After promotion, the pre-existing replication objects should be > removed (either optionally or always), otherwise, it can lead to a new > subscriber not being able to restart or getting some unwarranted data. > [1][2]. > > 2. Retaining synced slots on new subscribers can lead to unnecessary > WAL retention and dead rows [3]. > > 3. We need to consider whether some of the messages displayed in > --dry-run mode are useful or not [4]. Amit, thanks for summarizing your understanding of the situation. Tom, is this list complete, to your knowledge? The original thread is quite complex and it's hard to pick out what the open items actually are. :-( I would like to see this open item broken up into multiple open items, one per issue. Link [4] goes to a message that doesn't seem to relate to --dry-run. -- Robert Haas EDB: http://www.enterprisedb.com
Re: State of pg_createsubscriber
On Wed, May 22, 2024 at 9:52 AM Robert Haas wrote: > Another option that we should at least consider is "do nothing". In a > case like the one Shlok describes, how are we supposed to know what > the right thing to do is? Is it unreasonable to say that if the user > doesn't want those publications or subscriptions to exist, the user > should drop them? > > Maybe it is unreasonable to say that, but it seems to me we should at > least talk about that. As another option, maybe we could disable subscriptions, so that nothing happens when the server is first started, and then the user could decide after that what they want to do. -- Robert Haas EDB: http://www.enterprisedb.com
Re: State of pg_createsubscriber
On Wed, May 22, 2024 at 7:42 AM Amit Kapila wrote: > So, what shall we do about such cases? I think by default we can > remove all pre-existing subscriptions and publications on the promoted > standby or instead we can remove them based on some switch. If we want > to go with this idea then we might need to distinguish the between > pre-existing subscriptions and the ones created by this tool. > > The other case I remember adding an option in this tool was to avoid > specifying slots, pubs, etc. for each database. See [1]. We can > probably leave if the same is not important but we never reached the > conclusion of same. Another option that we should at least consider is "do nothing". In a case like the one Shlok describes, how are we supposed to know what the right thing to do is? Is it unreasonable to say that if the user doesn't want those publications or subscriptions to exist, the user should drop them? Maybe it is unreasonable to say that, but it seems to me we should at least talk about that. -- Robert Haas EDB: http://www.enterprisedb.com
Re: First draft of PG 17 release notes
On Tue, May 21, 2024 at 2:26 PM Melanie Plageman wrote: > For the vacuum WAL volume reduction, there were a bunch of smaller > projects throughout the last development year that I worked on that > were committed by different people and with different individual > benefits. Some changes caused vacuum to do less visibility checks (so > less CPU usage), some changed WAL format in a way that saves some > space, and some, like the commit you mention, make vacuum emit less > WAL. That commit by itself doesn't contain all of the user benefits of > the whole project. I couldn't think of a good place to list all of the > commits together that were part of the same project. Perhaps you could > argue that they were not in fact part of the same project and instead > were just small individual changes -- none of which are individually > worth including in the release notes. Yeah, I think a lot of projects have this problem in one way or another, but I think it may be worse for performance-related projects. I wasn't intending to knock that particular commit, just to be clear, or the commit message. I'm just saying that sometimes summarizing the commit log may not be as easy as we'd hope. -- Robert Haas EDB: http://www.enterprisedb.com
Re: SQL:2011 application time
On Thu, May 16, 2024 at 7:22 PM Jeff Davis wrote: > An empty range does not "bypass" the an exclusion constraint. The > exclusion constraint has a documented meaning and it's enforced. > > Of course there are situations where an empty range doesn't make a lot > of sense. For many domains zero doesn't make any sense, either. > Consider receiving an email saying "thank you for purchasing 0 > widgets!". Check constraints seem like a reasonable way to prevent > those kinds of problems. I think that's true. Having infinitely many zero-length events scheduled at the same point in time isn't necessarily a problem: I can attend an infinite number of simultaneous meetings if I only need to attend them for exactly zero time. What I think is less clear is what that means for temporal primary keys. As Paul pointed out upthread, in every other case, a temporal primary key is at least as unique as a regular primary key, but in this case, it isn't. And someone might reasonably think that a temporal primary key should exclude empty ranges just as all primary keys exclude nulls. Or they might think the opposite. At least, so it seems to me. -- Robert Haas EDB: http://www.enterprisedb.com
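To make the semantics at issue concrete, here is a minimal Python sketch (not from the thread, and not PostgreSQL code) of overlap-based exclusion with half-open ranges: an empty range overlaps nothing, so any number of zero-length "events" can coexist without the constraint ever firing.

```python
def overlaps(a, b):
    """Half-open interval overlap, modeling the range && operator:
    an empty range (start >= end) overlaps nothing, not even itself."""
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start >= a_end or b_start >= b_end:  # either range is empty
        return False
    return a_start < b_end and b_start < a_end

# Two identical zero-length "meetings" at the same instant don't conflict:
print(overlaps((5, 5), (5, 5)))  # False
# ...while ordinary overlapping ranges do:
print(overlaps((1, 4), (3, 6)))  # True
```

This is exactly why a temporal key built on overlap exclusion admits duplicate rows with empty ranges, unlike a regular primary key.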
Re: First draft of PG 17 release notes
On Tue, May 21, 2024 at 12:27 PM Andres Freund wrote: > To me that's the "General Performance" section. If somebody reading the > release notes doesn't care about performance, they can just skip that section > ([1]). I don't see why we wouldn't want to include the same level of detail > as for other changes. I'm relatively sure that we've had this argument in previous years and essentially everyone but Bruce has agreed with the idea that performance changes ought to be treated the same as any other kind of improvement. The difficulty is that Bruce is the one doing the release notes. I think it might help if someone were willing to prepare a patch showing what they think specifically should be changed. Or maybe Bruce would be willing to provide a list of all of the performance improvements he doesn't think are worth release-noting or isn't sure how to release-note, and someone else can then have a go at them. Personally, I suspect that a part of the problem, other than the inevitable fact that the person doing the work has a perspective on how the work should be done with which not everyone will agree, is that a lot of performance changes have commit messages that don't really explain what the user impact is. For instance, consider 6dbb490261a6170a3fc3e326c6983ad63e795047 ("Combine freezing and pruning steps in VACUUM"). It does actually say what the benefit is ("That reduces the overall amount of WAL generated") but the reader could easily be left wondering whether that is really the selling point. Does it also reduce CPU consumption? Is that more or less important than the WAL reduction? Was the WAL reduction the motivation for the work? Is the WAL reduction significant enough that this is a feature in its own right, or is this just preparatory to some other work? These kinds of ambiguities can exist for any commit, not just performance commits, but I bet that on average the problem is worse for performance-related commits. 
-- Robert Haas EDB: http://www.enterprisedb.com
Re: Path to unreverting "Allow planner to use Merge Append to efficiently implement UNION"
On Tue, May 21, 2024 at 8:44 AM David Rowley wrote: > Thanks for having a look. I was planning to have the commit message > as per attached. I'd only split the patch for ease of review per > request of Tom. I should have mentioned that here. > > I would adjust the exact wording in the final paragraph as required > depending on what plan materialises. > > This also fixes up the comment stuff that Heikki mentioned. The consensus on pgsql-release was to unrevert this patch and commit the fix now, rather than waiting for the next beta. However, the consensus was also to push the un-revert as a separate commit from the bug fix, rather than together as suggested by Álvaro. Since time is short due to the impending release and it's very late where you are, I've taken care of this. Hope that's OK. Thanks, -- Robert Haas EDB: http://www.enterprisedb.com
Re: Possible Bug in relation_open
On Tue, May 21, 2024 at 9:58 AM Pradeep Kumar wrote: > If the user tries to open the relation in RangeVar and NoLock mode calling > table_openrv(relation, NoLock), it will internally call > relation_openrv()-->relation_open(). In relation_open() we checking the > Assert(lockmode >= NoLock && lockmode < MAX_LOCKMODES); , here we expecting > the lockmode is NoLock or greater than that, but in same function again we > checking this assert case Assert(lockmode != NoLock || > IsBootstrapProcessingMode() || CheckRelationLockedByMe(r, AccessShareLock, > true)); , here we are expecting (lockmode != NoLock) , so why are there two > cases that contradict? and What if the user tries to open the relation in > NoLock mode? and that will definitely cause the assert failure, Suppose the > user who writes some extension and reads some relation oid that is constant, > and wants to acquire NoLock?, need some clarification on this. You need to acquire a lock. Otherwise, the relcache entry could change underneath you while you're accessing it, which would result in PostgreSQL crashing. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Avoid orphaned objects dependencies, take 3
On Wed, May 15, 2024 at 6:31 AM Bertrand Drouvot wrote: > Please find attached v6 (only diff with v5 is moving the tests as suggested > above). I don't immediately know what to think about this patch. I've known about this issue for a long time, but I didn't think this was how we would fix it. I think the problem here is basically that we don't lock namespaces (schemas) when we're adding and removing things from the schema. So I assumed that if we ever did something about this, what we would do would be add a bunch of calls to lock schemas to the appropriate parts of the code. What you've done here instead is add locking at a much lower level - whenever we are adding a dependency on an object, we lock the object. The advantage of that approach is that we definitely won't miss anything. The disadvantage of that approach is that it means we have some very low-level code taking locks, which means it's not obvious which operations are taking what locks. Maybe it could even result in some redundancy, like the higher-level code taking a lock also (possibly in a different mode) and then this code taking another one. I haven't gone through the previous threads; it sounds like there's already been some discussion of this, but I'm just telling you how it strikes me on first look. -- Robert Haas EDB: http://www.enterprisedb.com
Re: libpq compression (part 3)
On Mon, May 20, 2024 at 4:12 PM Magnus Hagander wrote: > That used to be the case in HTTP/1. But header compression was one of the > headline features of HTTP/2, which isn't exactly new anymore. But there's a > special algorithm, HPACK, for it. And then http/3 uses QPACK. Cloudflare has > a pretty decent blog post explaining why and how: > https://blog.cloudflare.com/hpack-the-silent-killer-feature-of-http-2/, or > rfc7541 for all the details. > > tl;dr; is yes, let's be careful not to expose headers to a CRIME-style > attack. And I doubt our connections has as much to gain by compressing > "header style" fields as http, so we are probably better off just not > compressing > Work: https://www.redpill-linpro.com/ What do you think constitutes a header in the context of the PostgreSQL wire protocol? -- Robert Haas EDB: http://www.enterprisedb.com
Re: libpq compression (part 3)
On Mon, May 20, 2024 at 2:05 PM Andrey M. Borodin wrote: > Compression defeats encryption. That's why it's not in TLS anymore. > The thing is compression codecs use data self correlation. And if you mix > secret data with user's data, user might guess how correlated they are. Yeah, I'm aware that there are some problems like this. For example, suppose the bad guy can both supply some of the data sent over the connection (e.g. by typing search queries into a web page) and also observe the traffic between the web application and the database. Then they could supply data and try to guess how correlated that is with other data sent over the same connection. But if that's a practical attack, preventing compression prior to the authentication exchange probably isn't good enough: the user could also try to guess what queries are being sent on behalf of other users through the same pooled connection, or they could try to use the bits of the query that they can control to guess what the other bits of the query that they can't see look like. But, does this mean that we should just refuse to offer compression as a feature? This kind of attack isn't a threat in every environment, and in some environments, compression could be pretty useful. For instance, you might need to pull down a lot of data from the database over a slow connection. Perhaps you're the only user of the database, and you wrote all of the queries yourself in a locked vault, accepting no untrusted inputs. In that case, these kinds of attacks aren't possible, or at least I don't see how, but you might want both compression and encryption. I guess I don't understand why TLS removed support for compression entirely instead of disclaiming its use in some appropriate way. -- Robert Haas EDB: http://www.enterprisedb.com
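The correlation attack under discussion fits in a few lines. Below is a generic CRIME-style length oracle (purely illustrative; the secret and guesses are made up): when attacker-controlled input is compressed in the same stream as a secret, the compressed length alone, which encryption does not hide, reveals whether a guess matches part of the secret.

```python
import zlib

SECRET = "session_token=f81d4fae7dec"

def observed_length(attacker_input: str) -> int:
    # The attacker sees only ciphertext, but stream ciphers preserve
    # length, so the compressed size leaks through the encryption.
    return len(zlib.compress((attacker_input + SECRET).encode()))

# A guess sharing a long prefix with the secret compresses better than
# an equally long non-matching guess, because the codec back-references
# the repeated bytes:
matching = observed_length("session_token=f81d")
wrong = observed_length("kqzwvmnbjrhpdxuyag")
print(matching < wrong)  # True: the size difference betrays the match
```

By iterating one guessed byte at a time, an attacker can recover the secret, which is why compressing attacker-influenced data alongside secrets is dangerous regardless of where in the protocol it happens.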
Re: libpq compression (part 3)
On Mon, May 20, 2024 at 1:23 PM Jacob Champion wrote: > On Mon, May 20, 2024 at 10:01 AM Robert Haas wrote: > > I really hope that you can't poke big enough holes to kill the feature > > entirely, though. Because that sounds sad. > > Even if there are holes, I don't think the situation's going to be bad > enough to tank everything; otherwise no one would be able to use > decompression on the Internet. :D And I expect the authors of the > newer compression methods to have thought about these things [1]. > > I hesitate to ask as part of the same email, but what were the plans > for compression in combination with transport encryption? (Especially > if you plan to compress the authentication exchange, since mixing your > LDAP password into the compression context seems like it might be a > bad idea if you don't want to leak it.) So, the data would be compressed first, with framing around that, and then transport encryption would happen afterwards. I don't see how that would leak your password, but I have a feeling that might be a sign that I'm about to learn some unpleasant truths. -- Robert Haas EDB: http://www.enterprisedb.com
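The layering described above, compress first, frame the result, and let transport encryption wrap everything afterwards, might be sketched like this. The 'z' message type and exact layout here are invented for illustration and are not the patch's actual wire format; only the int32 self-inclusive length convention is borrowed from the PostgreSQL protocol.

```python
import struct
import zlib

def frame_compressed(payload: bytes) -> bytes:
    """Hypothetical 'CompressedData' message: a one-byte type code,
    an int32 length that includes itself (as PostgreSQL messages do),
    then the deflate-compressed payload. TLS, if used, would encrypt
    the framed bytes as an opaque blob afterwards."""
    body = zlib.compress(payload)
    return b"z" + struct.pack("!i", 4 + len(body)) + body

def unframe(msg: bytes) -> bytes:
    """Inverse: strip the type byte and length, then decompress."""
    assert msg[0:1] == b"z"
    (length,) = struct.unpack("!i", msg[1:5])
    return zlib.decompress(msg[5:5 + length - 4])

original = b"SELECT repeat('x', 100)"
assert unframe(frame_compressed(original)) == original
```

Because compression happens before encryption, the compression context necessarily spans whatever plaintext is fed into it, which is the root of the concern about mixing credentials into that context.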
Re: soliciting patches to review
On Tue, Apr 23, 2024 at 1:27 PM Robert Haas wrote: > Just a quick update. We have so far had 8 suggested patches from 6 > people, if I haven't missed anything. I'm fairly certain that not all > of those patches are going to be good candidates for this session, so > it would be great if a few more people wanted to volunteer their > patches. With approximately 1 week to go, I now have a list of 24 patches that I think are good candidates for this session and another 11 that were suggested but which I think are not good candidates for various reasons, including (1) being trivial, (2) being so complicated that it's not reasonable to review them in the time we'll have, and (3) patching something other than the C code, which I consider too specialized for this session. I think that's a long enough list that we probably won't get to all of the patches in the session, so I don't necessarily *need* more things to put on the list at this point. However, I am still accepting further nominations until approximately this time on Friday. Hence, if you have written a patch that you think would be a good candidate for some folks to review in their effort to become better reviewers, let me (and Andres) know. Thanks, -- Robert Haas EDB: http://www.enterprisedb.com
Re: Propagate sanity checks of ProcessUtility() to standard_ProcessUtility()?
On Fri, May 17, 2024 at 10:11 PM Michael Paquier wrote: > On Fri, May 17, 2024 at 03:53:58PM -0400, Robert Haas wrote: > > The usual pattern for using hooks like this is to call the next > > implementation, or the standard implementation, and pass down the > > arguments that you got. If you try to do that and mess it up, you're > > going to get a crash really, really quickly and it shouldn't be very > > difficult at all to figure out what you did wrong. Honestly, that > > doesn't seem like it would be hard even without the assertions: for > > the most part, a quick glance at the stack backtrace ought to be > > enough. If you're trying to do something more sophisticated, like > > mutating the node tree before passing it down, the chances of your > > mistakes getting caught by these assertions are pretty darn low, since > > they're very superficial checks. > > Perhaps, still that would be something. > > Hmm. We've had in the past bugs where DDL paths were playing with the > manipulation of Querys in core, corrupting their contents. It seems > like what you would mean is to have a check at the *end* of > ProcessUtility() that does an equalFuncs() on the Query, comparing it > with a copy of it taken at its beginning, all that hidden within two > USE_ASSERT_CHECKING blocks, when we'd expect the tree to not change. > With readOnlyTree, that would be just changing from one policy to > another, which is not really interesting at this stage based on how > ProcessUtility() is shaped. I don't think I meant to imply that. I think what I feel is that there's no real problem here, and that the patch sort of superficially looks useful, but isn't really. I'd suggest just dropping it. -- Robert Haas EDB: http://www.enterprisedb.com
Re: pg_combinebackup does not detect missing files
On Fri, May 17, 2024 at 6:14 PM David Steele wrote: > Then intentionally corrupt a file in the incr backup: > > $ truncate -s 0 test/backup/incr1/base/5/3764_fsm > > In this case pg_verifybackup will error: > > $ pg_verifybackup test/backup/incr1 > pg_verifybackup: error: "base/5/3764_fsm" has size 0 on disk but size > 24576 in the manifest > > But pg_combinebackup does not complain: > > $ pg_combinebackup test/backup/full test/backup/incr1 -o test/backup/combine > $ ls -lah test/backup/combine/base/5/3764_fsm > -rw--- 1 dev dialout 0 May 17 22:08 test/backup/combine/base/5/3764_fsm > > It would be nice if pg_combinebackup would (at least optionally but > prefferrably by default) complain in this case rather than the user > needing to separately run pg_verifybackup. My first reaction here is that it would be better to have people run pg_verifybackup for this. If we try to do this in pg_combinebackup, we're either going to be quite limited in the amount of validation we can do (which might lure users into a false sense of security) or we're going to make things quite a bit more complicated and expensive. Perhaps there's something here that is worth doing; I haven't thought about this deeply and can't really do so at present. I do believe in reasonable error detection, which I hope goes without saying, but I also believe strongly in orthogonality: a tool should do one job and do it as well as possible. -- Robert Haas EDB: http://www.enterprisedb.com
Re: libpq compression (part 3)
On Mon, May 20, 2024 at 12:49 PM Jacob Champion wrote: > ...and my response was that, no, the proposal doesn't seem to be > requiring that authentication take place before compression is done. > (As evidenced by your email. :D) If the claim is that there are no > security problems with letting unauthenticated clients force > decompression, then I can try to poke holes in that; I would prefer this approach, so I suggest trying to poke holes here first. If you find big enough holes then... > or if the claim > is that we don't need to worry about that at all because we'll wait > until after authentication, then I can poke holes in that too. My > request is just that we choose one. ...we can fall back to this and you can try to poke holes here. I really hope that you can't poke big enough holes to kill the feature entirely, though. Because that sounds sad. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Speed up clean meson builds by ~25%
On Mon, May 20, 2024 at 11:37 AM Andres Freund wrote: > On 2024-05-20 09:37:46 -0400, Robert Haas wrote: > > On Sat, May 18, 2024 at 6:09 PM Andres Freund wrote: > > > A few tests with ccache disabled: > > > > These tests seem to show no difference between the two releases, so I > > wonder what you're intending to demonstrate here. > > They show a few seconds of win for 'real' time. > -O0: 0m21.577s->0m19.529s > -O3: 0m59.730s->0m54.853s Ah, OK, I think I misinterpreted. -- Robert Haas EDB: http://www.enterprisedb.com
Re: libpq compression (part 3)
On Mon, May 20, 2024 at 10:15 AM Jacob Champion wrote: > This is just restating the "you can already do bad things" argument. I > understand that if your query gets executed, it's going to consume > resources on the thing that's executing it (for the record, though, > there are people working on constraining that). But introducing > disproportionate resource consumption into all traffic-inspecting > software, like pools and bouncers, seems like a different thing to me. > Many use cases are going to be fine with it, of course, but I don't > think it should be hand-waved. I can't follow this argument. I think it's important that the startup message is always sent uncompressed, because it's a strange exception to our usual message-formatting rules, and because it's so security-critical. I don't think we should do anything to allow more variation there, because any benefit will be small and the chances of introducing security vulnerabilities seems non-trivial. But if the client says in the startup message that it would like to send and receive compressed data and the server is happy with that request, I don't see why we need to postpone implementing that request until after the authentication exchange is completed. I think that will make the code more complicated and I don't see a security benefit. If the use of a particular compression algorithm is going to impose too much load, the server, or the pooler, is free to refuse it, and should. Deferring the use of the compression method until after authentication doesn't really solve any problem here, at least not that I can see. It does occur to me that if some compression algorithm has a buffer overrun bug, restricting its use until after authentication might reduce the score of the resulting CVE, because now you have to be able to authenticate to make an exploit work. Perhaps that's an argument for imposing a restriction here, but it doesn't seem to be the argument that you're making. 
-- Robert Haas EDB: http://www.enterprisedb.com
Re: libpq compression (part 3)
On Sat, May 18, 2024 at 1:18 AM Jacob Burroughs wrote: > I like this more than what I proposed, and will update the patches to > reflect this proposal. (I've gotten them locally back into a state of > applying cleanly and dealing with the changes needed to support direct > SSL connections, so refactoring the protocol layer shouldn't be too > hard now.) Sounds good! -- Robert Haas EDB: http://www.enterprisedb.com
Re: Shared detoast Datum proposal
Hi, Andy Fan asked me off-list for some feedback about this proposal. I have hesitated to comment on it for lack of having studied the matter in any detail, but since I've been asked for my input, here goes: I agree that there's a problem here, but I suspect this is not the right way to solve it. Let's compare this to something like the syscache. Specifically, let's think about the syscache on pg_class.relname. In the case of the syscache on pg_class.relname, there are two reasons why we might repeatedly look up the same values in the cache. One is that the code might do multiple name lookups when it really ought to do only one. Doing multiple lookups is bad for security and bad for performance, but we have code like that in some places and some of it is hard to fix. The other reason is that it is likely that the user will mention the same table names repeatedly; each time they do, they will trigger a lookup on the same name. By caching the result of the lookup, we can make it much faster. An important point to note here is that the queries that the user will issue in the future are unknown to us. In a certain sense, we cannot even know whether the same table name will be mentioned more than once in the same query: when we reach the first instance of the table name and look it up, we do not have any good way of knowing whether it will be mentioned again later, say due to a self-join. Hence, the pattern of cache lookups is fundamentally unknowable. But that's not the case in the motivating example that started this thread. In that case, the target list includes three separate references to the same column. We can therefore predict that if the column value is toasted, we're going to de-TOAST it exactly three times. If, on the other hand, the column value were mentioned only once, it would be detoasted just once. 
In that latter case, which is probably quite common, this whole cache idea seems like it ought to just waste cycles, which makes me suspect that no matter what is done to this patch, there will always be cases where it causes significant regressions. In the former case, where the column reference is repeated, it will win, but it will also hold onto the detoasted value after the third instance of detoasting, even though there will never be any more cache hits. Here, unlike the syscache case, the cache is thrown away at the end of the query, so future queries can't benefit. Even if we could find a way to preserve the cache in some cases, it's not very likely to pay off. People are much more likely to hit the same table more than once than to hit the same values in the same table more than once in the same session. Suppose we had a design where, when a value got detoasted, the detoasted value went into the place where the toasted value had come from. Then this problem would be handled pretty much perfectly: the detoasted value would last until the scan advanced to the next row, and then it would be thrown out. So there would be no query-lifespan memory leak. Now, I don't really know how this would work, because the TOAST pointer is coming out of a slot and we can't necessarily write one attribute of any arbitrary slot type without a bunch of extra overhead, but maybe there's some way. If it could be done cheaply enough, it would gain all the benefits of this proposal a lot more cheaply and with fewer downsides. Alternatively, maybe there's a way to notice the multiple references and introduce some kind of intermediate slot or other holding area where the detoasted value can be stored. 
In other words, instead of computing

    big_jsonb_col->'a', big_jsonb_col->'b', big_jsonb_col->'c'

Maybe we ought to be trying to compute this:

    big_jsonb_col=detoast(big_jsonb_col)
    big_jsonb_col->'a', big_jsonb_col->'b', big_jsonb_col->'c'

Or perhaps this:

    tmp=detoast(big_jsonb_col)
    tmp->'a', tmp->'b', tmp->'c'

In still other words, a query-lifespan cache for this information feels like it's tackling the problem at the wrong level. I do suspect there may be query shapes where the same value gets detoasted multiple times in ways that the proposals I just made wouldn't necessarily catch, such as in the case of a self-join. But I think those cases are rare. In most cases, repeated detoasting is probably very localized, with the same exact TOAST pointer being de-TOASTed a couple of times in quick succession, so a query-lifespan cache seems like the wrong thing. Also, in most cases, repeated detoasting is probably quite predictable: we could infer it from static analysis of the query. So throwing ANY kind of cache at the problem seems like the wrong approach unless the cache is super-cheap, which doesn't seem to be the case with this proposal, because a cache caters to cases where we can't know whether we're going to recompute the same result, and here, that seems like something we could probably figure out. Because I don't see any way to avoid regressions in the common case where detoasting isn't repeated, I
Re: commitfest.postgresql.org is no longer fit for purpose
On Mon, May 20, 2024 at 7:49 AM Alvaro Herrera wrote: > On 2024-May-19, Tom Lane wrote: > > (The cfbot tends to discourage this, since as soon as one of the > > patches is committed it no longer knows how to apply the rest. > > Can we improve on that tooling somehow?) > > I think a necessary next step to further improve the cfbot is to get it > integrated in pginfra. Right now it runs somewhere in Thomas's servers > or something, and there's no real integration with the commitfest proper > except by scraping. Yes, I think we really need to fix this. Also, there's a bunch of mechanical work that could be done to make cfbot better, like making the navigation not reset the scroll every time you drill down one level through the build products. I would also like to see the buildfarm and CI converged in some way. I'm not sure how. I understand that the buildfarm tests more different configurations than we can reasonably afford to do in CI, but there is no sense in pretending that having two different systems doing similar jobs has no cost. -- Robert Haas EDB: http://www.enterprisedb.com
Re: commitfest.postgresql.org is no longer fit for purpose
> dent in the problem. Maybe we could divide and conquer: get a > dozen-or-so senior contributors to split up the list of pending > patches and then look at each one with an eye to what needs to > happen to move it along (*not* to commit it right away, although > in some cases maybe that's the thing to do). It'd be great if > that could happen just before each commitfest, but that's probably > not practical with the current patch volume. What I'm thinking > for the moment is to try to make that happen once a year or so. I like this idea. > Yeah, all this stuff ultimately gets done "for the good of the > project", which isn't the most reliable motivation perhaps. > I don't see a better one... This is the really hard part. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Speed up clean meson builds by ~25%
On Sat, May 18, 2024 at 6:09 PM Andres Freund wrote: > A few tests with ccache disabled: These tests seem to show no difference between the two releases, so I wonder what you're intending to demonstrate here. Locally, I have ninja 1.11.1. I'm sure Andres will be absolutely shocked, shocked I say, to hear that I haven't upgraded to the very latest. Anyway, I tried a clean build with CC=clang without the patch and then with the patch and got:

unpatched - real 1m9.872s
patched - real 1m6.130s

That's the time to run my script that first calls meson setup and then afterward runs ninja. I tried ninja -v and put the output to a file. With the patch:

[292/2402] /opt/local/bin/perl ../pgsql/src/interfaces/ecpg/preproc/parse.pl --srcdir ../pgsql/src/interfaces/ecpg/preproc --parser ../pgsql/src/interfaces/ecpg/preproc/../../../backend/parser/gram.y --output src/interfaces/ecpg/preproc/preproc.y

And without:

[1854/2402] /opt/local/bin/perl ../pgsql/src/interfaces/ecpg/preproc/parse.pl --srcdir ../pgsql/src/interfaces/ecpg/preproc --parser ../pgsql/src/interfaces/ecpg/preproc/../../../backend/parser/gram.y --output src/interfaces/ecpg/preproc/preproc.y

With my usual CC='ccache clang', I get real 0m37.500s unpatched and real 0m37.786s patched. I also tried this with the original v1 patch (not to be confused with the revised, still-v1 patch) and got real 37.950s. So I guess I'm now feeling pretty unexcited about this. If the patch really did what the subject line says, that would be nice to have. However, since Tom will patch the underlying problem in v18, this will only help people building the back-branches. Of those, it will only help the people using a slow compiler; I read the thread as suggesting that you need to be using clang rather than gcc and also not using ccache. Plus, it sounds like you also need an old ninja version. Even then, the highest reported savings is 10s and I only save 3s.
I think that's a small enough savings with enough conditionals that we should not bother. It seems fine to say that people who need the fastest possible builds of back-branches should use at least one of gcc, ccache, and ninja >= 1.12. I'll go mark this patch Rejected in the CommitFest. Incidentally, this thread is an excellent object lesson in why it's so darn hard to cut down the size of the CF. I mean, this is a 7-line patch and we've now got a 30+ email thread about it. In my role as temporary self-appointed wielder of the CommitFest mace, I have now spent probably a good 2 hours trying to figure out what to do about it. There are more than 230 active patches in the CommitFest. That means CommitFest management would be more than a full-time job for an experienced committer even if every patch were as simple as this one, which is definitely not the case. If we want to restore some sanity here, we're going to have to find some way of distributing the burden across more people, including the patch authors. Note that Jelte's last update to the thread is over a month ago. I'm not saying that he's done something wrong, but I do strongly suspect that the time that the community as a whole has spent on this patch is a pretty significant multiple of the time that he personally spent on it, and such dynamics are bound to create scaling issues. I don't know how we do better: I just want to highlight the problem. -- Robert Haas EDB: http://www.enterprisedb.com
Re: libpq compression (part 3)
On Fri, May 17, 2024 at 4:53 PM Jacob Burroughs wrote: > I think I was more thinking that trying to let both parties control > the parameter seemed like a recipe for confusion and sadness, and so > the choice that felt most natural to me was to let the sender control > it, but I'm definitely open to changing that the other way around. To be clear, I am not arguing that it should be the receiver's choice. I'm arguing it should be the client's choice, which means the client decides what it sends and also tells the server what to send to it. I'm open to counter-arguments, but as I've thought about this more, I've come to the conclusion that letting the client control the behavior is the most likely to be useful and the most consistent with existing facilities. I think we're on the same page based on the rest of your email: I'm just clarifying. > On the server side, we use slash separated sets of options > connection_compression=DEFAULT_VALUE_FOR_BOTH_DIRECTIONS/client_to_server=OVERRIDE_FOR_THIS_DIRECTION/server_to_client=OVERRIDE_FOR_THIS_DIRECTION > with the values being semicolon separated compression algorithms. > On the client side, you can specify > compression=, > but on the client side you can actually specify compression options, > which the server will use if provided, and otherwise it will fall back > to defaults. I have some quibbles with the syntax but I agree with the concept. What I'd probably do is separate the server side thing into two GUCs, each with a list of algorithms, comma-separated, like we do for other lists in postgresql.conf. Maybe make the default 'all' meaning "everything this build of the server supports". On the client side, I'd allow both things to be specified using a single option, because wanting to do the same thing in both directions will be common, and you actually have to type in connection strings sometimes, so verbosity matters more. 
As far as the format of the value for that keyword, what do you think about either compression=DO_THIS_BOTH_WAYS or compression=DO_THIS_WHEN_SENDING/DO_THIS_WHEN_RECEIVING, with each "do this" being a specification of the same form already accepted for server-side compression e.g. gzip or gzip:level=9? If you don't like that, why do you think the proposal you made above is better, and why is that one now punctuated with slashes instead of semicolons? > If we think we need to, we could let the server specify defaults for > server-side compression. My overall thought though is that having an > excessive number of knobs increases the surface area for testing and > bugs while also increasing potential user confusion and that allowing > configuration on *both* sides doesn't seem sufficiently useful to be > worth adding that complexity. I agree. -- Robert Haas EDB: http://www.enterprisedb.com
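For illustration only, here is a rough Python sketch of how a client might parse the kind of value I'm proposing (a per-direction spec of the form alg:option=value, with a slash separating the sending and receiving directions). The syntax is hypothetical, not anything the patch actually implements:

```python
def parse_compression(spec: str):
    """Parse a hypothetical compression option of the form
    'alg[:k=v[,k=v...]]' or 'send_spec/recv_spec'."""
    def parse_one(s):
        alg, _, opts = s.partition(":")
        options = dict(kv.split("=", 1) for kv in opts.split(",")) if opts else {}
        return {"algorithm": alg, "options": options}

    parts = spec.split("/")
    if len(parts) == 1:
        one = parse_one(parts[0])
        return {"send": one, "recv": one}  # same spec applies both ways
    send, recv = parts
    return {"send": parse_one(send), "recv": parse_one(recv)}

# One spec for both directions, with an algorithm option:
print(parse_compression("gzip:level=9"))
# Different algorithms per direction:
print(parse_compression("zstd/gzip"))
```

The attraction of this shape is that the common case (same thing both ways) stays short, while the asymmetric case remains expressible in a single keyword.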
Re: Speed up clean meson builds by ~25%
On Fri, May 17, 2024 at 4:59 PM Tom Lane wrote: > As I mentioned upthread, I'm more worried about confusing error > reports than the machine time. Well, Jelte fixed that. > I grant that there are people who are more affected, but still, I'd > just as soon not contort the build rules for a temporary problem. Arguably we are contorting the build rules by doing this, but I don't think it's enough of a contortion to get excited about. Anyone else want to vote? -- Robert Haas EDB: http://www.enterprisedb.com
Re: Comments on Custom RMGRs
On Fri, May 17, 2024 at 4:20 PM Jeff Davis wrote: > Regarding this particular change: the checkpointing hook seems more > like a table AM feature, so I agree with you that we should have a good > idea how a real table AM might use this, rather than only > pg_stat_statements. I would even be OK with a pg_stat_statements example that is fully working and fully explained. I just don't want to have no example at all. The original proposal has been changed twice because of complaints that the hook wasn't quite useful enough, but I think that only proves that v3 is closer to being useful than v1. If v1 is 40% of the way to useful and v3 is 120% of the way to useful, wonderful! But if v1 is 20% of the way to being useful and v3 is 60% of the way to being useful, it's not time to commit anything yet. I don't know which is the case, and I think if someone wants this to be committed, they need to explain clearly why it's the first and not the second. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Speed up clean meson builds by ~25%
On Wed, Apr 17, 2024 at 11:11 PM Tom Lane wrote: > I think we should hold off on this. I found a simpler way to address > ecpg's problem than what I sketched upthread. I have a not-ready-to- > show-yet patch that allows the vast majority of ecpg's grammar > productions to use the default semantic action. Testing on my M1 > Macbook with clang 16.0.6 from MacPorts, I see the compile time for > preproc.o in HEAD as about 1m44 sec; but with this patch, 0.6 sec. If this is the consensus opinion, then https://commitfest.postgresql.org/48/4914/ should be marked Rejected. However, while I think the improvements that Tom was able to make here sound fantastic, I don't understand why that's an argument against Jelte's patches. After all, Tom's work will only go into v18, but this patch could be adopted in v17 and back-patched to all releases that support meson builds, saving oodles of compile time for as long as those releases are supported. The most obvious beneficiary of that course of action would seem to be Tom himself, since he back-patches more fixes than anybody, last I checked, but it'd also be useful to get slightly quicker results from the buildfarm and slightly quicker results for anyone using CI on back-branches and for other hackers who are looking to back-patch bug fixes. I don't quite understand why we want to throw those potential benefits out the window just because we have a better fix for the future. -- Robert Haas EDB: http://www.enterprisedb.com
Re: commitfest.postgresql.org is no longer fit for purpose
On Fri, May 17, 2024 at 3:51 PM Laurenz Albe wrote: > On Fri, 2024-05-17 at 13:12 -0400, Greg Sabino Mullane wrote: > > > Long time ago there was a "rule" that people submitting patches are > > > expected > > > to do reviews. Perhaps we should be more strict this. > > > > Big -1. How would we even be more strict about this? Public shaming? > > Withholding a commit? > > I think it is a good rule. I don't think that it shouldn't lead to putting > people on the pillory or kicking their patches, but I imagine that a committer > looking for somebody else's patch to work on could prefer patches by people > who are doing their share of reviews. If you give me an automated way to find that out, I'll consider paying some attention to it. However, in order to sort the list of patches needing review by the amount of review done by the patch author, we'd first need to have a list of patches needing review. And right now we don't, or at least not in any usable way. commitfest.postgresql.org is supposed to give us that, but it doesn't. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Propagate sanity checks of ProcessUtility() to standard_ProcessUtility()?
On Thu, Feb 29, 2024 at 2:21 AM Michael Paquier wrote: > It's been brought to me that an extension may finish by breaking the > assumptions ProcessUtility() relies on when calling > standard_ProcessUtility(), causing breakages when passing down data to > cascading utility hooks. > > Isn't the state of the arguments given something we should check not > only in the main entry point ProcessUtility() but also in > standard_ProcessUtility(), to prevent issues if an extension > incorrectly manipulates the arguments it needs to pass down to other > modules that use the utility hook, like using a NULL query string? I can't imagine a scenario where this change saves more than 5 minutes of debugging, so I'd rather leave things the way they are. If you do this, then people will see the macro and have to go look at what it does, whereas right now, they can see the assertions themselves, which is better. The usual pattern for using hooks like this is to call the next implementation, or the standard implementation, and pass down the arguments that you got. If you try to do that and mess it up, you're going to get a crash really, really quickly and it shouldn't be very difficult at all to figure out what you did wrong. Honestly, that doesn't seem like it would be hard even without the assertions: for the most part, a quick glance at the stack backtrace ought to be enough. If you're trying to do something more sophisticated, like mutating the node tree before passing it down, the chances of your mistakes getting caught by these assertions are pretty darn low, since they're very superficial checks. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Fri, May 17, 2024 at 1:26 PM Jacob Burroughs wrote: > I was leaving the details around triggering that for this conversation > and in that patch just designing the messages in a way that would > allow adding the reconfiguration/restarting to be easily added in a > backwards-compatible way in a future patch. I would imagine that an > explicit `ParameterSet` call that sets `_pq_.connection_compression` > (or whatever the implementation details turn out to be) would also > trigger a compressor restart, and when restarted it would pick an > algorithm/configuration based on the new value of the parameter rather > than the one used at connection establishment. Hmm, OK, interesting. I suppose that I thought we were going to handle problems like this by just adding bespoke messages for each case (e.g. CompressionSet). I wasn't thinking it would be practical to make a message whose remit was as general as what it seems like Jelte wants to do with ParameterSet, because I'm not sure everything can be handled that simply. It's not an exact analogy, but when you want to stop the server, it's not enough to say that you want to change from the Running state to the Stopped state. You have to specify which type of shutdown should be used to make the transition. You could even have more complicated cases, where one side says "prepare to do X" and the other side eventually says "OK, I'm prepared" and the first side says "great, now activate X" and the other side eventually says "OK, it's activated, please confirm that you've also activated it from your side" and the first side eventually says "OK, I confirm that". I think the fear that we're going to run into cases where such complex handshaking is necessary is a major reason why I'm afraid of Jelte's ideas about ParameterSet: it seems much more opinionated to me than he seems to think it is.
To the extent that I can wrap my head around what Jelte is proposing, and all signs point to that extent being quite limited, I suppose I was thinking of something like his option (2). That is, I assumed that a client would request all the optional features that said client might wish to use at connection startup time. But I can see how that assumption might be somewhat problematic in a connection-pooling environment. Still, at least to me, it seems better than trying to rely on GUC_REPORT. My opinion is (1) GUC_REPORT isn't a particularly well-designed mechanism so I dislike trying to double down on it and (2) trying to mix these protocol-level parameters and the transactional GUCs we have together in a single mechanism seems potentially problematic and (3) I'm still not particularly happy about the idea of making protocol parameters into GUCs in the first place. Perhaps these are all minority positions, but I can't tell you what everyone thinks, only what I think. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Comments on Custom RMGRs
On Fri, Mar 29, 2024 at 1:09 PM Jeff Davis wrote: > I am fine with this. > > You've moved the discussion forward in two ways: > > 1. Changes to pg_stat_statements to actually use the API; and > 2. The hook is called at multiple points. > > Those at least partially address the concerns raised by Andres and > Robert. But given that there was pushback from multiple people on the > feature, I'd like to hear from at least one of them. It's very late in > the cycle so I'm not sure we'll get more feedback in time, though. In my seemingly-neverending pass through the July CommitFest, I reached this patch. My comment is: it's possible that rmgr_003.v3.patch is enough to be useful, but does anyone in the world think they know that for a fact? I mean, pgss_001.v1.patch purports to demonstrate that it can be used, but that's based on rmgr_003.v2.patch, not the v3 patch, and the emails seem to indicate that it may not actually work. I also think, looking at it, that it looks much more like a POC than something we'd consider ready for commit. It also seems very unclear that we'd want pg_stat_statements to behave this way, and indeed "this way" isn't really spelled out anywhere. I think it would be nice if we had an example that uses the proposed hook that we could actually commit. Maybe that's asking too much, though. I think the minimum thing we need is a compelling rationale for why this particular hook design is going to be good enough. That could be demonstrated by means of (1) a well-commented example that accomplishes some understandable goal and/or (2) a detailed description of how a non-core table AM or index AM is expected to be able to make use of this. Bonus points if the person providing that rationale can say credibly that they've actually implemented this and it works great with 100TB of production data. The problem here is not only that we don't want to commit a hook that does nothing useful. 
We also don't want to commit a hook that works wonderfully for someone but we have no idea why. If we do that, then we don't know whether it's OK to modify the hook in the future as the code evolves, or more to the point, which kinds of modifications will be acceptable. And also, the next person who wants to use it is likely to have to figure out all on their own how to do so, instead of being able to refer to comments or documentation or the commit message or at least a mailing list post. My basic position is not that this patch is a bad idea, but that it isn't really finished. The idea is probably a pretty good one, but whether this is a reasonable implementation of the idea doesn't seem clear, at least not to me. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Postgres and --config-file option
On Thu, May 16, 2024 at 4:57 AM Aleksander Alekseev wrote: > I propose my original v1 patch for correcting the --help output of > 'postgres' too. I agree with the above comments that corresponding > changes in v4 became somewhat unwieldy. So, who is it exactly that will be confused by the status quo? I mean, let's say you get this error: postgres does not know where to find the server configuration file. You must specify the --config-file or -D invocation option or set the PGDATA environment variable. As I see it, either you know how it works and just made a mistake this time, or you are a beginner. If it's the former, the fact that the error message doesn't mention every possible way of solving the problem does not matter, because you already know how to fix your mistake. If it's the latter, you don't need to know *every* way to fix the problem. You just need to know *one* way to fix the problem. I don't really understand why somebody would look at the existing message and say "gosh, it didn't tell me that I could also use -c!". If you already know that, don't you just ignore the hint and get busy with fixing the problem? If the reason that somebody is upset is because it's not technically true to say that you *must* do one of those things, we could fix that with "You must" -> "You can" or with "You must specify" -> "Specify". The patch you propose is also not terrible or anything, but it goes in the direction of listing every alternative, which will become unpalatable as soon as somebody adds one more way to do it, or maybe it's unpalatable already. Even if we don't do that, I don't see that there's a huge problem here. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Internal error codes triggered by tests
On Tue, Apr 23, 2024 at 12:55 AM Michael Paquier wrote: > Thoughts? The patch as proposed seems fine. Marking RfC. -- Robert Haas EDB: http://www.enterprisedb.com
Re: remove Todo item: Allow infinite intervals just like infinite timestamps
On Fri, May 17, 2024 at 9:12 AM jian he wrote: > https://wiki.postgresql.org/wiki/Todo > Dates and Times[edit] > Allow infinite intervals just like infinite timestamps > https://www.postgresql.org/message-id/4eb095c8.1050...@agliodbs.com > > this done at > https://git.postgresql.org/cgit/postgresql.git/commit/?id=519fc1bd9e9d7b408903e44f55f83f6db30742b7 > > Should we remove this item? If the item is done, you should edit the wiki page and mark it that way. Note this note at the top of the page: [D] Completed item - marks changes that are done, and will appear in the PostgreSQL 17 release. Note that the Todo list is not a very good Todo list and most people don't use it to find projects (and haven't for a long time). So it may not get updated, or consulted, very often. -- Robert Haas EDB: http://www.enterprisedb.com
Re: commitfest.postgresql.org is no longer fit for purpose
On Fri, May 17, 2024 at 11:57 AM Tom Lane wrote: > Melanie Plageman writes: > > So, anyway, I'd argue that we need a parking lot for patches which we > > all agree are important and have a path forward but need someone to do > > the last 20-80% of the work. To avoid this being a dumping ground, > > patches should _only_ be allowed in the parking lot if they have a > > clear path forward. Patches which haven't gotten any interest don't go > > there. Patches in which the author has clearly not addressed feedback > > that is reasonable for them to address don't go there. These are > > effectively community TODOs which we agree need to be done -- if only > > someone had the time. > > Hmm. I was envisioning "parking lot" as meaning "this is on my > personal TODO list, and I'd like CI support for it, but I'm not > expecting anyone else to pay attention to it yet". +1. > I think what > you are describing is valuable but different. Maybe call it > "pending" or such? Or invent a different name for the other thing. Yeah, there should be someplace that we keep a list of things that are thought to be important but we haven't gotten around to doing anything about yet, but I think that's different from the parking lot CF. -- Robert Haas EDB: http://www.enterprisedb.com
Re: commitfest.postgresql.org is no longer fit for purpose
On Fri, May 17, 2024 at 11:05 AM Tom Lane wrote: > > We already have gone back to that model. We just haven't admitted it > > yet. And we're never going to get out of it until we find a way to get > > the contents of the CommitFest application down to a more reasonable > > size and level of complexity. There's just no way everyone's up for > > that level of pain. I'm not sure I'm up for that level of pain. > > Yeah, we clearly need to get the patch list to a point of > manageability, but I don't agree that abandoning time-boxed CFs > will improve anything. I'm not sure. Suppose we plotted commits generally, or commits of non-committer patches, or reviews on-list, vs. time. Would we see any uptick in activity during CommitFests? Would it vary by committer? I don't know. I bet the difference wouldn't be as much as Tom Lane would like to see. Realistically, we can't manage how contributors spend their time all that much, and trying to do so is largely tilting at windmills. To me, the value of time-based CommitFests is as a vehicle to ensure freshness, or cleanup, or whatever word you want to do. If you just make a list of things that need attention and keep incrementally updating it, eventually for various reasons that list no longer reflects your current list of priorities. At some point, you have to throw it out and make a new list, or at least that's what always happens to me. We've fallen into a system where the default treatment of a patch is to be carried over to the next CommitFest and CfMs are expected to exert effort to keep patches from getting that default treatment when they shouldn't. But that does not scale. We need a system where things drop off the list unless somebody makes an effort to keep them on the list, and the easiest way to do that is to periodically make a *fresh* list that *doesn't* just clone some previous list. I realize that many people here are (rightly!) concerned with burdening patch authors with more steps that they have to follow.
But the current system is serving new patch authors very poorly. If they get attention, it's much more likely to be because somebody saw their email and wrote back than it is to be because somebody went through the CommitFest and found their entry and was like "oh, I should review this". Honestly, if we get to a situation where a patch author is sad because they have to click a link every 2 months to say "yeah, I'm still here, please review my patch," we've already lost the game. That person isn't sad because we asked them to click a link. They're sad it's already been N * 2 months and nothing has happened. -- Robert Haas EDB: http://www.enterprisedb.com
Re: WAL record CRC calculated incorrectly because of underlying buffer modification
On Fri, May 17, 2024 at 10:56 AM Jeff Davis wrote: > I'm still not entirely clear on why hash indexes can't just follow the > rules and exclusive lock the buffer and dirty it. Presumably > performance would suffer, but I asked that question previously and > didn't get an answer: > > https://www.postgresql.org/message-id/CA%2BTgmoY%2BdagCyrMKau7UQeQU6w4LuVEu%2ByjsmJBoXKAo6XbUUA%40mail.gmail.com In my defense, the last time I worked on hash indexes was 7 years ago. If this question had come up within a year or two of that work, I probably would have both (a) had a much clearer idea of what the answer was and (b) felt obliged to drop everything and go research it if I didn't. But at this point, I feel like it's fair for me to tell you what I know and leave it to you to do further research if you feel like that's warranted. I know that we're each responsible for what we commit, but I don't really think that should extend to having to prioritize answering a hypothetical question ("what would happen if X thing worked like Y instead of the way it does?") about an area I haven't touched in long enough for every release that doesn't contain those commits to be out of support. If you feel otherwise, let's have that argument, but I have a feeling that it may be more that you're hoping I have some kind of oracular powers which, in reality, I lack. -- Robert Haas EDB: http://www.enterprisedb.com
Re: commitfest.postgresql.org is no longer fit for purpose
On Fri, May 17, 2024 at 10:31 AM Tom Lane wrote: > To my mind, the point of the time-boxed commitfests is to provide > a structure wherein people will (hopefully) pay some actual attention > to other peoples' patches. Conversely, the fact that we don't have > one running all the time gives committers some defined intervals > where they can work on their own stuff without feeling guilty that > they're not handling other people's patches. > > If we go back to the old its-development-mode-all-the-time approach, > what is likely to happen is that the commit rate for not-your-own- > patches goes to zero, because it's always possible to rationalize > your own stuff as being more important. We already have gone back to that model. We just haven't admitted it yet. And we're never going to get out of it until we find a way to get the contents of the CommitFest application down to a more reasonable size and level of complexity. There's just no way everyone's up for that level of pain. I'm not sure I'm up for that level of pain. -- Robert Haas EDB: http://www.enterprisedb.com
Re: commitfest.postgresql.org is no longer fit for purpose
On Fri, May 17, 2024 at 9:54 AM Joe Conway wrote: > I agree with you. Before commitfests were a thing, we had no structure > at all as I recall. The dates for the dev cycle were more fluid as I > recall, and we had no CF app to track things. Maybe retaining the > structure but going back to the continuous development would be just the > thing, with the CF app tracking just the currently (as of this > week/month/???) active stuff. The main thing that we'd gain from that is to avoid all the work of pushing lots of things forward to the next CommitFest at the end of every CommitFest. While I agree that we need to find a better way to handle that, I don't think it's really the biggest problem. The core problems here are (1) keeping stuff out of CommitFests that don't belong there and (2) labelling stuff that does belong in the CommitFest in useful ways. We should shape the solution around those problems. Maybe that will solve this problem along the way, but if it doesn't, that's easy enough to fix afterward. Like, we could also just have a button that says "move everything that's left to the next CommitFest". That, too, would avoid the manual work that this would avoid. But it wouldn't solve any other problems, so it's not really worth much consideration. -- Robert Haas EDB: http://www.enterprisedb.com
Re: psql JSON output format
On Fri, May 17, 2024 at 9:42 AM Christoph Berg wrote: > Thanks for summarizing the thread. > > Things mentioned in the thread: > > 1) rendering of SQL NULLs - include or omit the column > > 2) rendering of JSON values - both "quoted string" and "inline as >JSON" make sense > > 3) not quoting numeric values and booleans > > 4) no special treatment of other datatypes like arrays or compound >values, just quote them > > 5) row format: JSON object or array (array would be close to CSV >format) > > 6) overall format: array of rows, or simply print each row separately >("JSON Lines" format, https://jsonlines.org/) > > I think 1, 2 and perhaps 6 make sense to have configurable. Two or > three \pset options (or one option with a list of flags) don't sound > too bad complexity-wise. > > Or maybe just default to "omit NULL columns" and "inline JSON" (and > render null as NULL). If we're going to just have one option, I agree with making that the default, and I'd default to an array of row objects. If we're going to have something configurable, I'd at least consider making (4) configurable. It's tempting to just have one option, like \pset jsonformat nullcolumns=omit;inlinevalues=json,array;rowformat=object;resultcontainer=array simply because adding a ton of new options just for this isn't very appealing. But looking at how long that is, it's probably not a great idea. So I guess separate options is probably better? -- Robert Haas EDB: http://www.enterprisedb.com
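As a concrete illustration of the rendering choices being weighed (omit NULL columns, inline JSON values, rows as objects inside an array), here is a toy Python renderer; the behavior shown is one possible default, not anything psql actually does:

```python
import json

def render_row(row: dict, omit_nulls=True, inline_json=True):
    """Render one result row as a JSON object, mimicking the proposed
    defaults (purely illustrative)."""
    out = {}
    for col, val in row.items():
        if val is None and omit_nulls:
            continue                       # (1) SQL NULL: omit the column
        if inline_json and isinstance(val, str) and val.lstrip()[:1] in "{[":
            val = json.loads(val)          # (2) inline JSON values as JSON
        out[col] = val                     # (3) numbers/bools stay unquoted
    return out

rows = [{"id": 1, "data": '{"x": 10}', "note": None}]
print(json.dumps([render_row(r) for r in rows]))
# → [{"id": 1, "data": {"x": 10}}]
```

Making each of the numbered decisions a \pset flag is exactly where the option-count complexity comes from.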
Re: psql: Allow editing query results with \gedit
On Fri, May 17, 2024 at 9:24 AM Christoph Berg wrote: > Tom> The stated complaint was "it's too hard to build UPDATE commands", > Tom> which I can sympathize with. > > ... which this would perfectly address - it's building the commands. > > The editor will have a bit more clutter (the UPDATE SET WHERE > boilerplate), and the fields are somewhat out of order (the key at the > end), but SQL commands are what people understand, and there is > absolutely no ambiguity on what is going to be executed since the > commands are exactly what is leaving the editor. A point to consider is that the user can also do this in the query itself, if desired. It'd just be a matter of assembling the query string with appropriate calls to quote_literal() and quote_ident(), which is not actually all that hard and I suspect many of us have done that at one point or another. And then you know that you'll get the right set of key columns and update the right table (as opposed to, say, a view over the right table, or the wrong one out of several tables that you joined). Now you might say, well, that's not terribly convenient, which is probably true, but this might be a case of -- convenient, reliable, works with arbitrary queries -- pick two. I don't know. I wouldn't be all that surprised if there's something clever and useful we could do in this area, but I sure don't know what it is. -- Robert Haas EDB: http://www.enterprisedb.com
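The string-assembly approach I'm describing can be sketched in Python. quote_ident and quote_literal here are simplified stand-ins for PostgreSQL's functions of the same name, following the quote-doubling rules, but they are illustrative only and not hardened for every input:

```python
def quote_ident(name: str) -> str:
    # PostgreSQL-style identifier quoting: wrap in double quotes,
    # doubling any embedded double quotes.
    return '"' + name.replace('"', '""') + '"'

def quote_literal(value) -> str:
    # Simplified literal quoting: single quotes, embedded quotes doubled.
    if value is None:
        return "NULL"
    return "'" + str(value).replace("'", "''") + "'"

def build_update(table, key_col, key_val, assignments):
    sets = ", ".join(f"{quote_ident(c)} = {quote_literal(v)}"
                     for c, v in assignments.items())
    return (f"UPDATE {quote_ident(table)} SET {sets} "
            f"WHERE {quote_ident(key_col)} = {quote_literal(key_val)}")

print(build_update("users", "id", 42, {"name": "O'Brien"}))
# → UPDATE "users" SET "name" = 'O''Brien' WHERE "id" = '42'
```

Doing this in the query itself, with the server-side quote_ident()/quote_literal(), has the advantage that you have chosen the table and key columns explicitly rather than trusting a tool to infer them.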
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Thu, May 16, 2024 at 1:39 PM Jacob Burroughs wrote: > As currently implemented [1], the client sends the server the list of > all compression algorithms it is willing to accept, and the server can > use one of them. If the server knows what `_pq_.compression` means > but doesn't actually support any compression, it will both send the > client its empty list of supported algorithms and just never send any > compressed messages, and everyone involved will be (relatively) happy. > There is a libpq function that a client can use to check what > compression is in use if a client *really* doesn't want to continue > with the conversation without compression, but 99% of the time I can't > see why a client wouldn't prefer to continue using a connection with > whatever compression the server supports (or even none) without more > explicit negotiation. (Unlike TLS, where automagically picking > between using and not using TLS has strange security implications and > effects, compression is a convenience feature for everyone involved.) This all seems sensible to me. > As the protocol layer is currently designed [1], it explicitly makes > it very easy to change/restart compression streams, specifically for > this use case (and in particular for the general connection pooler use > case). Compressed data is already framed in a `CompressedData` > message, and that message has a header byte that corresponds to an > enum value for which algorithm is currently in use. Any time the > compression stream was restarted by the sender, the first > `CompressedData` message will set that byte, and then the client will > restart its decompression stream with the chosen algorithm from that > point. For `CompressedData` messages that continue using the > already-established stream, the byte is simply set to 0. 
(This is > also how the "each side sends a list" form of negotiation is able to > work without additional round trips, as the `CompressedData` framing > itself communicates which compression algorithm has been selected.) OK, so you made it so that compressed data is fully self-identifying. Hence, there's no need to worry if something gets changed: the receiver, if properly implemented, can't help but notice. The only downside that I can see to this design is that you only have one byte to identify the compression algorithm, but that doesn't actually seem like a real problem at all, because I expect the number of supported compression algorithms to grow very slowly. I think it would take centuries, possibly millennia, before we started to get short of identifiers. So, cool. But, in your system, how does the client ask the server to switch to a different compression algorithm, or to restart the compression stream? -- Robert Haas EDB: http://www.enterprisedb.com
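[Editor's illustration] The self-identifying framing described above can be sketched from the receiver's side. The numeric algorithm codes and message layout below are invented for illustration and are not the patch's actual wire format; the point is only that a nonzero header byte (re)starts the decompression stream, while zero continues it:

```python
# Illustrative codes only; not the real protocol's values.
ALGORITHMS = {1: "gzip", 2: "lz4", 3: "zstd"}

class CompressedStreamReader:
    """Tracks which decompression stream a CompressedData-style frame
    belongs to, based on its leading algorithm byte."""

    def __init__(self):
        self.current = None  # no decompression stream established yet

    def handle_message(self, payload: bytes):
        algo_byte = payload[0]
        if algo_byte != 0:
            # Sender restarted its compression stream with a (possibly
            # new) algorithm: reset our decompressor to match.
            self.current = ALGORITHMS[algo_byte]
        elif self.current is None:
            raise ValueError("continuation frame before any stream start")
        # payload[1:] would be fed to the decompressor for self.current;
        # actual decompression is elided in this sketch.
        return self.current
```

Because every frame carries this byte, the receiver notices a restart or an algorithm change without any separate negotiation message, which is exactly why no extra round trip is needed.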
Re: commitfest.postgresql.org is no longer fit for purpose
On Fri, May 17, 2024 at 9:02 AM Peter Eisentraut wrote: > We already have the annotations feature, which is kind of this. I didn't realize we had that feature. When was it added, and how is it supposed to be used? -- Robert Haas EDB: http://www.enterprisedb.com
Re: commitfest.postgresql.org is no longer fit for purpose
On Fri, May 17, 2024 at 8:31 AM Jelte Fennema-Nio wrote: > Such a proposal would basically mean that no-one that cares about > their patches getting reviews can go on holiday and leave work behind > during the week before a commit fest. That seems quite undesirable to > me. Well, then we make it ten days instead of seven, or give a few days grace after the CF starts to play catchup, or allow the CfM to make exceptions. To be fair, I'm not sure that forcing people to do something like this is going to solve our problem. I'm very open to other ideas. But one idea that I'm not open to is to just keep doing what we're doing. It clearly and obviously does not work. I just tried scrolling through the CommitFest to a more or less random spot by flicking the mouse up and down, and then clicked on whatever ended up in the middle of my screen. I did this four times. Two of those landed on patches that had extremely long discussion threads already. One hit a patch from a non-committer that hasn't been reviewed and needs to be. And the fourth hit a patch from a committer which maybe could benefit from review but I can already guess that the patch works fine and unless somebody can find some architectural downside to the approach taken, there's not really a whole lot to talk about. I don't entirely know how to think about that result, but it seems pretty clear that the unreviewed non-committer patch ought to get priority, especially if we're talking about the possibility of non-committers or even junior committers doing drive-by reviews. The high-quality committer patch might be worth a comment from me, pro or con or whatever, but it's probably not a great use of time for a more casual contributor: they probably aren't going to find too much wrong with it. And the threads with extremely long threads already, well, I don't know if there's something useful that can be done with those threads or not, but those patches certainly haven't been ignored. 
I'm not sure that any of these should be evicted from the CommitFest, but we need to think about how to impose some structure on the chaos. Just classifying all four of those entries as either "Needs Review" or "Waiting on Author" is pretty useless; then they all look the same, and they're not. And please don't suggest adding a bunch more status values that the CfM has to manually police as the solution. We need to find some way to create a system that does the right thing more often by default. -- Robert Haas EDB: http://www.enterprisedb.com
Re: pg_combinebackup does not detect missing files
On Fri, May 17, 2024 at 1:18 AM David Steele wrote: > However, I think allowing the user to optionally validate the input > would be a good feature. Running pg_verifybackup as a separate step is > going to be more expensive than verifying/copying at the same time. > Even with storage tricks to copy ranges of data, pg_combinebackup is > going to be aware of files that do not need to be verified for the current > operation, e.g. old copies of free space maps. In cases where pg_combinebackup reuses a checksum from the input manifest rather than recomputing it, this could accomplish something. However, for any file that's actually reconstructed, pg_combinebackup computes the checksum as it's writing the output file. I don't see how it's sensible to then turn around and verify that the checksum that we just computed is the same one that we now get. It makes sense to run pg_verifybackup on the output of pg_combinebackup at a later time, because that can catch bits that have been flipped on disk in the meanwhile. But running the equivalent of pg_verifybackup during pg_combinebackup would amount to doing the exact same checksum calculation twice and checking that it gets the same answer both times. > One more thing occurs to me -- if data checksums are enabled then a > rough and ready output verification would be to test the checksums > during combine. Data checksums aren't very good but something should be > triggered if a bunch of pages go wrong, especially since the block > offset is part of the checksum. This would be helpful for catching > combine bugs. I don't know, I'm not very enthused about this. I bet pg_combinebackup has some bugs, and it's possible that one of them could involve putting blocks in the wrong places, but it doesn't seem especially likely.
Even if it happens, it's more likely to be that pg_combinebackup thinks it's putting them in the right places but is actually writing them to the wrong offset in the file, in which case a block-checksum calculation inside pg_combinebackup is going to think everything's fine, but a standalone tool that isn't confused will be able to spot the damage. It's frustrating that we can't do better verification of these things, but to fix that I think we need better infrastructure elsewhere. For instance, if we made pg_basebackup copy blocks from shared_buffers rather than the filesystem, or at least copy them when they weren't being concurrently written to the filesystem, then we'd not have the risk of torn pages producing spurious bad checksums. If we could somehow render a backup consistent when taking it instead of when restoring it, we could verify tons of stuff. If we had some useful markers of how long files were supposed to be and which ones were supposed to be present, we could check a lot of things that are uncheckable today. pg_combinebackup does the best it can -- or the best I could make it do -- but there's a disappointing number of situations where it's like "hmm, in this situation, either something bad happened or it's just the somewhat unusual case where this happens in the normal course of events, and we have no way to tell which it is." -- Robert Haas EDB: http://www.enterprisedb.com
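[Editor's illustration] The point about offset-keyed checksums can be made concrete with a toy model. This is not PostgreSQL's actual checksum algorithm; the only property it shares with the real one is that the block number feeds into the result, so a page that lands at the wrong offset no longer verifies:

```python
import zlib

BLOCK_SIZE = 8192

def page_checksum(page: bytes, block_no: int) -> int:
    # Toy stand-in for a data checksum that mixes in the block number.
    return zlib.crc32(page) ^ block_no

page = b"\x01" * BLOCK_SIZE
intended_block = 7   # where the tool *thinks* the page goes
actual_block = 9     # where a hypothetical buggy write actually lands

stored_checksum = page_checksum(page, intended_block)  # written with the page

# A tool that is confused about offsets still checks against the block
# number it believes it used -- and passes its own check:
internal_ok = page_checksum(page, intended_block) == stored_checksum

# A standalone verifier derives the block number from the actual file
# offset, so it catches the misplaced page:
external_ok = page_checksum(page, actual_block) == stored_checksum
```

This is why a block-checksum calculation inside pg_combinebackup would miss exactly the bug class it is meant to catch, while a separate, unconfused tool would not.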
Re: commitfest.postgresql.org is no longer fit for purpose
ody who submitted a bug report, and at the same time, made the patch he submitted less visible to anyone who might want to help with it. Wahoo! -- Robert Haas EDB: http://www.enterprisedb.com
Re: commitfest.postgresql.org is no longer fit for purpose
On Fri, May 17, 2024 at 1:05 AM Peter Eisentraut wrote: > On 17.05.24 04:26, Robert Haas wrote: > > I do *emphatically* think we need a parking lot. > > People seem to like this idea; I'm not quite sure I follow it. If you > just want the automated patch testing, you can do that on your own by > setting up github/cirrus for your own account. If you keep emailing the > public your patches just because you don't want to set up your private > testing tooling, that seems a bit abusive? > > Are there other reasons why developers might want their patches > registered in a parking lot? It's easier to say what's happening than it is to say why it's happening. The CF contains a lot of stuff that appears to just be parked there, so evidently reasons exist. But if we are to guess what those reasons might be, Tom has already admitted he does that for CI, and I do the same, so probably other people also do it. I also suspect that some people are essentially using the CF app as a personal todo list. By sticking patches in there that they intend to commit next cycle, they both (1) feel virtuous, because they give at least the appearance of following the community process and inviting review before they commit and (2) avoid losing track of the stuff they plan to commit. There may be other reasons, too. -- Robert Haas EDB: http://www.enterprisedb.com
Re: commitfest.postgresql.org is no longer fit for purpose
On Thu, May 16, 2024 at 5:46 PM Tom Lane wrote: > Right, so what can we do about that? Does Needs Review state need to > be subdivided, and if so how? It doesn't really matter how many states we have if they're not set accurately. > At this point it seems like there's consensus to have a "parking" > section of the CF app, separate from the time-boxed CFs, and I hope > somebody will go make that happen. But I don't think that's our only > issue, so we need to keep thinking about what should be improved. I do *emphatically* think we need a parking lot. And a better integration between commitfest.postgresql.org and the cfbot, too. It's just ridiculous that more work hasn't been put into this. I'm not faulting Thomas or anyone else -- I mean, it's not his job to build the infrastructure the project needs any more than it is anyone else's -- but for a project of the size and importance of PostgreSQL to take years to make minor improvements to this kind of critical infrastructure is kind of nuts. If we don't have enough volunteers, let's go recruit some more and promise them cookies. I think you're overestimating the extent to which the problem is "we don't reject enough patches". That *is* a problem, but it seems we have also started rejecting some patches that just never really got much attention for no particularly good reason, while letting other things linger that didn't really get that much more attention, or which were objectively much worse ideas. I think that one place where the process is breaking down is in the tacit assumption that the person who wrote the patch wants to get it committed. In some cases, people park things in the CF for CI runs without a strong intention of pursuing them; in other cases, the patch author is in no kind of rush. 
I think we need to move to a system where patches exist independent of a CommitFest, and get CI run on them unless they get explicitly closed or are too inactive or some other criterion that boils down to nobody cares any more. Then, we need to get whatever subset of those patches need to be reviewed in a particular CommitFest added to that CommitFest. For example, imagine that the CommitFest is FORCIBLY empty until a week before it starts. You can still register patches in the system generally, but that just means they get CI runs, not that they're scheduled to be reviewed. A week before the CommitFest, everyone who has a patch registered in the system that still applies gets an email saying "click here if you think this patch should be reviewed in the upcoming CommitFest -- if you don't care about the patch any more or it needs more work before other people review it, don't click here". Then, the CommitFest ends up containing only the things where the patch author clicked there during that week. For bonus points, suppose we make it so that when you click the link, it takes you to a box where you can type in a text comment that will display in the app, explaining why your patch needs review: "Robert says the wire protocol changes in this patch are wrong, but I think he's full of hot garbage and want a second opinion!" (a purely hypothetical example, I'm sure) If you put in a comment like this when you register your patch for the CommitFest, it gets a sparkly gold border and floats to the top of the list, or we mail you a Kit-Kat bar, or something. I don't know. The point is - we need a much better signal to noise ratio here. I bet the number of patches in the CommitFest that actually need review is something like 25% of the total. The rest are things that are just parked there by a committer, or that the author doesn't care about right now, or that are already being actively discussed, or where there's not a clear way forward. 
We could create new statuses for all of those states - "Parked", "In Hibernation," "Under Discussion," and "Unclear" - but I think that's missing the point. What we really want is to not see that stuff in the first place. It's a CommitFest, not once-upon-a-time-I-wrote-a-patch-Fest. -- Robert Haas EDB: http://www.enterprisedb.com
commitfest.postgresql.org is no longer fit for purpose
deas people have for improving this situation. I doubt that there's any easy answer that just makes the problem go away -- keeping large groups of people organized is a tremendously difficult task under pretty much all circumstances, and the fact that, in this context, nobody's really the boss, makes it a whole lot harder. But I also feel like what we're doing right now can't possibly be the best that we can do. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Thu, May 16, 2024 at 12:09 PM Jelte Fennema-Nio wrote: > I don't really understand the benefit of your proposal over option 2 > that I proposed. Afaict you're proposing that for e.g. compression we > first set _pq_.supports_compression=1 in the StartupMessage and use > that to do feature detection, and then after we get the response we > send ParameterSet("compression", "gzip"). To me this is pretty much > identical to option 2, except that it introduces an extra round trip > for no benefit (as far as I can see). Why not go for option 2 and send > _pq_.compression=gzip in the StartupMessage directly. Ugh, it's so hard to communicate clearly about this stuff. I didn't really have any thought that we'd ever try to handle something as complicated as compression using ParameterSet. I tend to agree that for compression I'd like to see the startup packet contain more than _pq_.compression=1, but I'm not sure what would happen after that exactly. If the client asks for _pq_.compression=lz4 and the server tells the client that it doesn't understand _pq_.compression at all, then everybody's on the same page: no compression. But, if the server understands the option but isn't OK with the proposed value, what happens then? Does it send a NegotiateCompressionType message after the NegotiateProtocolVersion, for example? That seems like it could lead to the client having to be prepared for a lot of NegotiateX messages somewhere down the road. I think at some point in the past we had discussed having the client list all the algorithms it supported in the argument to _pq_.compression, and then the server would respond with the algorithm it wanted to use, or maybe a list of algorithms that it could allow, and then we'd go from there. But I'm not entirely sure if that's the right idea, either. Changing compression algorithms in mid-stream is tricky, too. If I tell the server "hey, turn on server-to-client compression!" then I need to be able to identify where exactly that happens.
Any messages already sent by the server and not yet processed by me, or any messages sent after that but before the server handles my request, are going to be uncompressed. Then, at some point, I'll start getting compressed data. If the compressed data is framed inside some message type created for that purpose, like I get a CompressedMessage message and then I decompress to get the actual message, this is simpler to manage. But even then, it's tricky if the protocol shifts. If I tell the server, you know what, gzip was a bad choice, I want lz4, I'll need to know where the switch happens to be able to decompress properly. I don't know if we want to support changing compression algorithms in mid-stream. I don't think there's any reason we can't, but it might be a bunch of work for something that nobody really cares about. Not sure. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Thu, May 16, 2024 at 5:22 AM Jelte Fennema-Nio wrote: > If we're not setting the protocol parameter in the StartupMessage, > there's currently no way for us to know if the protocol parameter is > supported by the server. If protocol parameters were unchangeable then > that would be fine, but the whole point of introducing ParameterSet is > to make it possible to change protocol parameters on an existing > connection. Having the function SupportsProtocolCompression return > false, even though you can enable compression just fine, only because > we didn't ask for compression when connecting seems quite silly and > confusing. You're probably not going to like this answer, but I feel like this is another sign that you're trying to use the protocol extensibility facilities in the wrong way. In my first reply to the thread, I proposed having the client send _pq_.protocol_set=1 in that startup message. If the server accepts that message, then you can send whatever set of message types are associated with that option, which could include messages to list known settings, as well as messages to set them. Alternatively, if we used a wire protocol bump for this, you could request version 3.1 and everything that I just said still applies. In other words, I feel that if you had adopted the design that I proposed back in March, or some variant of it, the problem you're having now wouldn't exist. IMHO, we need to negotiate the language that we're going to use to communicate before we start communicating. We should find out which protocol version we're using, and what protocol options are accepted, based on sending a StartupMessage and receiving a reply. Then, after that, having established a common language, we can do whatever. I think you're trying to meld those two steps into one, which is understandable from the point of view of saving a round trip, but I just don't see it working out well.
What we can do in the startup message is extremely limited, because any startup messages we generate can't break existing servers, and also because of the concerns I raised earlier about leaving room for more extension in the future. Once we get past the startup message negotiation, the sky's the limit! -- Robert Haas EDB: http://www.enterprisedb.com
Re: Direct SSL connection with ALPN and HBA rules
On Thu, May 16, 2024 at 10:23 AM Heikki Linnakangas wrote: > Yep, committed. Thanks everyone! Thanks! -- Robert Haas EDB: http://www.enterprisedb.com