Re: Should rolpassword be toastable?
"Jonathan S. Katz" writes: > I think Tom's initial suggestion (BLCKSZ/2) is better than 256, given we > really don't know what' out there in the wild, and this could end up > being a breaking change. Every other type in pg_authid is pretty small. I'm having second thoughts about that though, based on the argument that we don't really want a platform-dependent limit here. Admittedly, nobody changes BLCKSZ on production systems, but it's still theoretically an issue. I don't have a problem with selecting a larger limit such as 512 or 1024 though. > That said, I'm also imagining other things we may add that could require > TOAST support (remembering previous passwords? storing multiple > passwords options)? Things like previous passwords probably don't need to be accessed during authentication, so there are at least a couple of ways we could do that: * put the previous passwords in an auxiliary table; * put back pg_authid's toast table, but mark rolpassword as "STORAGE MAIN" so it doesn't go to toast, while letting columns that don't need to be touched at startup go there. However, if you wanted to allow multiple passwords I'm not sure about a good way. regards, tom lane
Re: [BUG FIX]Connection fails with whitespace after keepalives parameter value
Michael Paquier writes:
> On Thu, Oct 03, 2024 at 11:57:16AM -0400, Tom Lane wrote:
>> I don't have a strong opinion one way or the other about whether
>> we should make libpq permissive about extra spaces (as per
>> Michael's patch).  I guess you could argue that all of these
>> fixes are consistent with the principle of "be conservative
>> with what you send and liberal with what you accept".  But at
>> most I'd fix these remaining things in HEAD.

> Removing this extra whitespace from the ECPG strings sounds good here.
> FWIW, my argument about doing this in libpq is not really related to
> ECPG: it feels inconsistent to apply one rule for the parameters and a
> different one for the values in URIs.  So I'd be OK to see how this
> goes on as a HEAD-only change.

OK, if there's no objections let's push both remaining patches
to HEAD only.

			regards, tom lane
Re: Should rolpassword be toastable?
Nathan Bossart writes:
> I don't mind proceeding with the patch if there is strong support for it.
> I wavered only because it's hard to be confident that we are choosing the
> right limit.

I'm not that fussed about it; surely 256 is more than anyone is using?
If not, we'll get push-back and then we can have a discussion about
the correct limit that's informed by more than guesswork.

> ... But I can also buy the argument that none of this is a strong
> enough reason to avoid making the error message nicer...

There's that, and there's also the fact that if you assume someone is
using $sufficiently-long-passwords then we might have broken their
use-case already.  We can't have much of a conversation here without
a concrete case to look at.

			regards, tom lane
Re: Should rolpassword be toastable?
Nathan Bossart writes:
> For the reasons discussed upthread [0], I can't bring myself to add an
> arbitrary limit to the password hash length.  I am going to leave 0001
> uncommitted for now.

> [0] https://postgr.es/m/Zu2eT2H8OT3OXauc%40nathan

I'm confused, as in [0] you said

>> ... But on the off-chance
>> that someone is building a custom driver that generates long hashes for
>> whatever reason, I'd imagine that a clear error would be more helpful than
>> "row is too big."

I agree with the idea that complaining about the password being too
long is far more intelligible than that.

Another problem with leaving it as it stands in HEAD is that the
effective limit is now platform-specific, if not indeed dependent on
the phase of the moon (or at least, the other contents of the
pg_authid row).  I fear it would be very easy to construct cases where
a password is accepted on one machine but fails with "row is too big"
on another.  A uniform limit seems much less fraught with surprises.

			regards, tom lane
Re: Replace IN VALUES with ANY in WHERE clauses during optimization
Laurenz Albe writes:
> I wonder if it is worth the extra planning time to detect and improve
> such queries.

I'm skeptical too.

I'm *very* skeptical of implementing it in the grammar as shown here
--- I'd go so far as to say that that approach cannot be accepted.
That's far too early, and it risks all sorts of problems.  An example
is that the code as given seems to assume that all the sublists are
the same length ... but we haven't checked that yet.  I also suspect
that this does not behave the same as the original construct for
purposes like resolving dissimilar types in the VALUES list.  (In an
ideal world, perhaps it'd behave the same, but that ship sailed a
couple decades ago.)

			regards, tom lane
Re: pg_basebackup and error messages dependent on the order of the arguments
Robert Haas writes:
> On Wed, Oct 2, 2024 at 6:00 AM Daniel Westermann (DWE) wrote:
>> Maybe checking if a valid "-D" or "--pgdata" was given and return a more
>> generic error message would be an option?

> It doesn't really seem reasonable to me to make the tools guess
> whether somebody left out the argument to an option that requires an
> argument. Consider these equivalent cases:
> ...
> I assume there are similar cases that don't involve PostgreSQL at all.

Yeah.  This has to be a standard problem for anything that uses getopt
or getopt_long at all.  Unless there's a standard approach (which I've
not heard of) to resolving these ambiguities, I'm not sure that we
should try to outsmart everybody else.

In the case of getopt_long there's an additional problem, which is
that that function itself may contain heuristics that rearrange the
argument order based on what looks like a switch or not.  It's likely
that anything we did on top of that would behave differently depending
on which version of getopt_long it is.

			regards, tom lane
Re: [BUG FIX]Connection fails with whitespace after keepalives parameter value
I poked into the ecpg end of this and found that the extra space is
coming from one production in ecpg.trailer that's carelessly using
cat_str (which inserts spaces) instead of makeN_str (which doesn't).
So it's pretty trivial to fix, as attached.

I do not think we could rip out ECPGconnect's logic to remove the
spaces at runtime, because that would break existing ecpg applications
until they're recompiled.  It might be worth adding a comment there
about why it's being done, though.

I don't have a strong opinion one way or the other about whether
we should make libpq permissive about extra spaces (as per
Michael's patch).  I guess you could argue that all of these
fixes are consistent with the principle of "be conservative
with what you send and liberal with what you accept".  But at
most I'd fix these remaining things in HEAD.

			regards, tom lane

diff --git a/src/interfaces/ecpg/preproc/ecpg.trailer b/src/interfaces/ecpg/preproc/ecpg.trailer
index b6233e5e53..f3ab73bed6 100644
--- a/src/interfaces/ecpg/preproc/ecpg.trailer
+++ b/src/interfaces/ecpg/preproc/ecpg.trailer
@@ -348,7 +348,7 @@ connect_options: ColId opt_opt_value
 			if (strcmp($3, "&") != 0)
 				mmerror(PARSE_ERROR, ET_ERROR, "unrecognized token \"%s\"", $3);
-			$$ = cat_str(3, make2_str($1, $2), $3, $4);
+			$$ = make3_str(make2_str($1, $2), $3, $4);
 		}
 	;
diff --git a/src/interfaces/ecpg/test/expected/connect-test5.c b/src/interfaces/ecpg/test/expected/connect-test5.c
index ec1514ed9a..fc650f5cd5 100644
--- a/src/interfaces/ecpg/test/expected/connect-test5.c
+++ b/src/interfaces/ecpg/test/expected/connect-test5.c
@@ -117,7 +117,7 @@ main(void)
 #line 56 "test5.pgc"
 
-	{ ECPGconnect(__LINE__, 0, "unix:postgresql://localhost/ecpg2_regression?connect_timeout=180 & client_encoding=sql_ascii" , "regress_ecpg_user1" , "connectpw" , "main", 0); }
+	{ ECPGconnect(__LINE__, 0, "unix:postgresql://localhost/ecpg2_regression?connect_timeout=180&client_encoding=sql_ascii" , "regress_ecpg_user1" , "connectpw" , "main", 0); }
 #line 58 "test5.pgc"
 
 	{ ECPGdisconnect(__LINE__, "main");}
diff --git a/src/interfaces/ecpg/test/expected/connect-test5.stderr b/src/interfaces/ecpg/test/expected/connect-test5.stderr
index 51cc18916a..037db21758 100644
--- a/src/interfaces/ecpg/test/expected/connect-test5.stderr
+++ b/src/interfaces/ecpg/test/expected/connect-test5.stderr
@@ -50,7 +50,7 @@
 [NO_PID]: sqlca: code: 0, state: 0
 [NO_PID]: ecpg_finish: connection main closed
 [NO_PID]: sqlca: code: 0, state: 0
-[NO_PID]: ECPGconnect: opening database ecpg2_regression on port with options connect_timeout=180 & client_encoding=sql_ascii for user regress_ecpg_user1
+[NO_PID]: ECPGconnect: opening database ecpg2_regression on port with options connect_timeout=180&client_encoding=sql_ascii for user regress_ecpg_user1
 [NO_PID]: sqlca: code: 0, state: 0
 [NO_PID]: ecpg_finish: connection main closed
 [NO_PID]: sqlca: code: 0, state: 0
Re: make \d table Collation field showing domains collation if that attribute is type of domain.
jian he writes:
> main changes:
> @@ -1866,7 +1866,7 @@ describeOneTableDetails(const char *schemaname,
>  			attrdef_col = cols++;
>  			attnotnull_col = cols++;
>  			appendPQExpBufferStr(&buf, ",\n  (SELECT c.collname FROM pg_catalog.pg_collation c, pg_catalog.pg_type t\n"
> -								 "   WHERE c.oid = a.attcollation AND t.oid = a.atttypid AND a.attcollation <> t.typcollation) AS attcollation");
> +								 "   WHERE c.oid = a.attcollation AND t.oid = a.atttypid and c.collname <> 'default' ) AS attcollation");

I doubt that this is an improvement.  It will create as many weird
cases as it removes.  (To cite just one, there is nothing preventing
somebody from creating a collation named "default".)

The existing definition seems fine to me, anyway.  What's so wrong
with treating the domain's collation as being an implicit property of
the domain type?  What you want to do hopelessly confuses this display
between attributes of the table and attributes of the column type.

A nearby comparable case is that the "Default" column only tells about
default values applied in the table definition, not any that might
have come from the domain.  Nor does the "Check constraints" footer
tell about constraints coming from a domain column.

			regards, tom lane
Re: CI warnings test for 32 bit, and faster headerscheck
Thomas Munro writes:
> Hmm, given that configure uses more time than compiling (assuming 100%
> ccache hits) and is woefully serial, I wonder what ingredients you'd
> need to hash to have bulletproof cache invalidation for a persistent
> configure cache, ie that survives between runs.

The buildfarm uses a simple trick that seems to work remarkably well:
if the animal's previous run failed (for any reason) then blow away
the configure cache.  Maybe that could be adapted here.

			regards, tom lane
Re: pg_verifybackup: TAR format backup verification
I wrote:
> Sadly, it seems adder[1] is even pickier than mamba:

Nope, it was my testing error: I supposed that this patch only
affected pg_combinebackup and pg_verifybackup, so I only recompiled
those modules, not the whole tree.  But there's one more place with a
json_manifest_per_file_callback callback :-(.  I pushed a quick-hack
fix.

			regards, tom lane
Re: [BUG FIX]Connection fails with whitespace after keepalives parameter value
Fujii Masao writes:
> On 2024/10/02 11:35, Michael Paquier wrote:
>> On Tue, Oct 01, 2024 at 12:29:15PM -0400, Tom Lane wrote:
>>> I agree with Sasaki-san that useKeepalives seems rather bogus: almost
>>> every other place in fe-connect.c uses pqParseIntParam rather than
>>> calling strtol directly, so why not this one?

> I have no objection to improving the handling of the keepalives parameter.
> OTOH, I think ecpg might have an issue when converting the connection URI.

I went ahead and pushed Sasaki-san's patch.  I think anything we might
do in ecpg is probably just cosmetic and wouldn't get back-patched.

> In the converted URI, whitespace is added before and after the ? character.
> In my quick test, ECPGconnect() seems to handle this without error,
> but when I tried the same URI in psql, it returned an error:

> $ psql "postgresql://localhost:5432/postgres?keepalives=1 & keepalives_idle=1 & keepalives_interval=1 & keepalives_count=2"
> psql: error: invalid URI query parameter: " keepalives_idle"

Interesting.  This is unhappy about the space before a parameter name,
not the space after a parameter value, so it's a different issue.  But
it's weird that ecpg takes it while libpq doesn't.  Could libecpg be
modifying/reassembling the URI string?  I didn't look.

			regards, tom lane
Re: pg_verifybackup: TAR format backup verification
Robert Haas writes:
> On Tue, Oct 1, 2024 at 1:32 PM Tom Lane wrote:
>> Yes, mamba thinks this is OK.

> Committed.

Sadly, it seems adder[1] is even pickier than mamba:

../pgsql/src/backend/backup/basebackup_incremental.c: In function \342\200\230CreateIncrementalBackupInfo\342\200\231:
../pgsql/src/backend/backup/basebackup_incremental.c:179:30: error: assignment to \342\200\230json_manifest_per_file_callback\342\200\231 {aka \342\200\230void (*)(JsonManifestParseContext *, const char *, long long unsigned int, pg_checksum_type, int, unsigned char *)\342\200\231} from incompatible pointer type \342\200\230void (*)(JsonManifestParseContext *, const char *, size_t, pg_checksum_type, int, uint8 *)\342\200\231 {aka \342\200\230void (*)(JsonManifestParseContext *, const char *, unsigned int, pg_checksum_type, int, unsigned char *)\342\200\231} [-Wincompatible-pointer-types]
  179 |         context->per_file_cb = manifest_process_file;
      |                              ^

			regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder&dt=2024-10-02%2014%3A09%3A58
Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Noah Misch writes:
> On Tue, Oct 01, 2024 at 11:55:48AM -0700, Masahiko Sawada wrote:
>> Considering that the population of database cluster signedness will
>> converge to signedness=true in the future, we can consider using
>> -fsigned-char to prevent similar problems for the future. We need to
>> think about possible side-effects as well, though.

> It's good to think about -fsigned-char.  While I find it tempting, several
> things would need to hold for us to benefit from it:

> - Every supported compiler has to offer it or an equivalent.
> - The non-compiler parts of every supported C implementation need to
>   cooperate.  For example, CHAR_MIN must change in response to the flag.
>   See the first comment in cash_in().
> - Libraries we depend on can't do anything incompatible with it.

> Given that, I would lean toward not using -fsigned-char.  It's unlikely all
> three things will hold.  Even if they do, the benefit is not large.

I am very, very strongly against deciding that Postgres will only
support one setting of char signedness.  It's a step on the way to
hardware monoculture, and we know where that eventually leads.

(In other words, I categorically reject Sawada-san's assertion that
signed chars will become universal.  I'd reject the opposite assertion
as well.)

			regards, tom lane
Re: Fixing backslash dot for COPY FROM...CSV
"Daniel Verite" writes: > Tom Lane wrote: >> Returning to my upthread thought that >>> I think we should fix it so that \. that's not alone on a line >>> throws an error, but I wouldn't go further than that. >> here's a quick follow-on patch to make that happen. It could >> probably do with a test case to demonstrate the error, but >> I didn't bother yet pending approval that we want to do this. > +1 for fixing. It's been on the thread for awhile now without objections, so I'll go ahead and make that happen. regards, tom lane
Re: pg_verifybackup: TAR format backup verification
I wrote:
> Robert Haas writes:
>> Here is an attempt at cleaning this up. I'm far from convinced that
>> it's fully correct; my local compiler (clang version 15.0.7) doesn't
>> seem fussed about conflating size_t with uint64, not even with -Wall
>> -Werror. I don't suppose you have a fussier compiler locally that you
>> can use to test this?

> Sure, I can try it on mamba's host.  It's slow though ...

Yes, mamba thinks this is OK.

			regards, tom lane
Re: [BUG FIX]Connection fails with whitespace after keepalives parameter value
Fujii Masao writes:
> On 2024/10/01 14:11, Yuto Sasaki (Fujitsu) wrote:
>> Root cause: The method for parsing the keepalives parameter in the
>> useKeepalives function of the libpq library is not appropriate.
>> Specifically, it doesn't account for whitespace following the numeric value.

> Is a connection URL with whitespace, like
> "tcp:postgresql://localhost:5432/postgres?keepalives=1 & ...",
> considered valid? If not, the issue seems to be that ecpg adds
> unnecessary whitespace to the connection URL, especially after the
> "&" character.

I agree with Sasaki-san that useKeepalives seems rather bogus: almost
every other place in fe-connect.c uses pqParseIntParam rather than
calling strtol directly, so why not this one?  We might have some work
to do in ecpg as well, though.

			regards, tom lane
Re: pg_verifybackup: TAR format backup verification
Robert Haas writes:
> Here is an attempt at cleaning this up. I'm far from convinced that
> it's fully correct; my local compiler (clang version 15.0.7) doesn't
> seem fussed about conflating size_t with uint64, not even with -Wall
> -Werror. I don't suppose you have a fussier compiler locally that you
> can use to test this?

Sure, I can try it on mamba's host.  It's slow though ...

> Respectfully, if you'd just said in your first email about this "I
> understand that you were trying to be consistent with a format string
> somewhere else, but I don't think that's a good reason to do it this
> way, so please use %llu and insert a cast," I would have just said
> "fine, no problem" and I wouldn't have been irritated at all.

I apologize for rubbing you the wrong way on this.  It was not my
intent.  (But, in fact, I had not realized you copied that code from
somewhere else.)

			regards, tom lane
Re: Fixing backslash dot for COPY FROM...CSV
"Daniel Verite" writes: > [ v6-0001-Support-backslash-dot-on-a-line-by-itself-as-vali.patch ] I did some more work on the docs and comments, and pushed that. Returning to my upthread thought that >>> I think we should fix it so that \. that's not alone on a line >>> throws an error, but I wouldn't go further than that. here's a quick follow-on patch to make that happen. It could probably do with a test case to demonstrate the error, but I didn't bother yet pending approval that we want to do this. (This passes check-world as it stands, indicating we have no existing test that expects this case to work.) Also, I used the same error message "end-of-copy marker corrupt" that we have for the case of junk following the marker, but I can't say that I think much of that phrasing. What do people think of "end-of-copy marker is not alone on its line", instead? regards, tom lane diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c index a280efe23f..2ea9679b3c 100644 --- a/src/backend/commands/copyfromparse.c +++ b/src/backend/commands/copyfromparse.c @@ -1426,13 +1426,17 @@ CopyReadLineText(CopyFromState cstate) } /* - * Transfer only the data before the \. into line_buf, then - * discard the data and the \. sequence. + * If there is any data on this line before the \., complain. + */ +if (cstate->line_buf.len > 0 || + prev_raw_ptr > cstate->input_buf_index) + ereport(ERROR, + (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), + errmsg("end-of-copy marker corrupt"))); + +/* + * Discard the \. and newline, then report EOF. */ -if (prev_raw_ptr > cstate->input_buf_index) - appendBinaryStringInfo(&cstate->line_buf, - cstate->input_buf + cstate->input_buf_index, - prev_raw_ptr - cstate->input_buf_index); cstate->input_buf_index = input_buf_ptr; result = true; /* report EOF */ break;
Re: pg_upgrade check for invalid databases
Daniel Gustafsson writes:
>> On 30 Sep 2024, at 16:55, Tom Lane wrote:
>> TBH I'm not finding anything very much wrong with the current
>> behavior... this has to be a rare situation, do we need to add
>> debatable behavior to make it easier?

> One argument would be to make the checks consistent, pg_upgrade generally
> tries to report all the offending entries to help the user when fixing the
> source database.  Not sure if it's a strong enough argument for carrying
> code which really shouldn't see much use though.

OK, but the consistency argument would be to just report and fail.
I don't think there's a precedent in other pg_upgrade checks for
trying to fix problems automatically.

			regards, tom lane
Re: pg_verifybackup: TAR format backup verification
Robert Haas writes:
> On Mon, Sep 30, 2024 at 11:31 AM Tom Lane wrote:
>> WFM, modulo the suggestion about changing data types.

> I would prefer not to make the data type change here because it has
> quite a few tentacles.

I see your point for the function's "len" argument, but perhaps
it's worth doing

-	int			remaining;
+	size_t		remaining;
 
 	remaining = sizeof(ControlFileData) - mystreamer->control_file_bytes;
 	memcpy(((char *) &mystreamer->control_file)
 		   + mystreamer->control_file_bytes,
-		   data, Min(len, remaining));
+		   data, Min((size_t) len, remaining));

This is enough to ensure the Min() remains safe.

			regards, tom lane
Re: pg_verifybackup: TAR format backup verification
Robert Haas writes:
> On Mon, Sep 30, 2024 at 11:24 AM Tom Lane wrote:
>> Um, wait ... we do have strtou64(), so you should use that.

> The thing we should be worried about is not how large a JSON blob
> might be, but rather how large any file that appears in the data
> directory might be. So uint32 is out; and I think I hear you voting
> for uint64 over size_t.

Yes.  size_t might only be 32 bits.

> But then how do you think we should print
> that? Cast to unsigned long long and use %llu?

Our two standard solutions are to do that or to use UINT64_FORMAT.
But UINT64_FORMAT is problematic in translatable strings because then
the .po files would become platform-specific, so long long is what to
use in that case.  For a non-translated format string you can do
either.

> I don't understand what you think the widely-used, better solution is
> here.

What we just said above.

			regards, tom lane
Re: set_rel_pathlist function unnecessary params.
Kirill Reshke writes:
> I happened to notice that `set_rel_pathlist` params, RelOptInfo *rel
> and RangeTblEntry *rte are unnecessary, because upon all usages,
> `rte=root->simple_rte_array[rti]` and
> `rel=root->simple_rel_array[rti]` holds. What's the point of providing
> the same information 3 times?

To avoid having to re-fetch it from those arrays?

> So, I propose to refactor this a little bit.
> Am I missing something?

I'm -1 on changing this.  It'd provide no detectable benefit while
creating back-patching hazards in this code.

			regards, tom lane
Re: SET or STRICT modifiers on function affect planner row estimates
Michał Kłeczek writes:
>> On 30 Sep 2024, at 14:14, Ashutosh Bapat wrote:
>> It is difficult to understand the exact problem from your description.
>> Can you please provide EXPLAIN outputs showing the expected plan and
>> the unexpected plan; plans on the node where the query is run and
>> where the partitions are located.

> The table structure is as follows:
> CREATE TABLE tbl (…) PARTITION BY RANGE year(col02_date)

You're still expecting people to magically intuit what all those
"..."s are.  I could spend many minutes trying to reconstruct a
runnable example from these fragments, and if it didn't behave as you
say, it'd be wasted effort because I didn't guess right about some
un-mentioned detail.  Please provide a *self-contained* example if
you want someone to poke into this in any detail.  You have not
mentioned your PG version, either.

My first guess would be that adding STRICT or adding a SET clause
prevents function inlining, because it does.  However, your Plan 2
doesn't seem to involve a FunctionScan node, so either these plans
aren't really what you say or there's something else going on.

			regards, tom lane
Re: pg_basebackup and error messages dependent on the order of the arguments
"Daniel Westermann (DWE)" writes: >> Taking a closer look, many of the other switches-requiring-an-argument >> also just absorb "optarg" without checking its value till much later, >> so I'm not sure how far we could move the needle by special-casing >> --compress. > My point was not so much about --compress but rather giving a good error > message. Right, and my point was that the issue is bigger than --compress. For example, you get exactly the same misbehavior with $ pg_basebackup --checkpoint=fast --format=t -d --pgdata=/var/tmp/dummy pg_basebackup: error: must specify output directory or backup target pg_basebackup: hint: Try "pg_basebackup --help" for more information. I'm not sure how to solve the problem once you consider this larger scope. I don't think we could forbid arguments beginning with "-" for all of these switches. regards, tom lane
Re: pg_basebackup and error messages dependent on the order of the arguments
I wrote:
> As this example shows, we really ought to validate the compression
> argument on sight in order to get sensible error messages.  The
> trouble is that for server-side compression the code wants to just
> pass the string through to the server and not form its own opinion
> as to whether it's a known algorithm.

> Perhaps it would help if we simply rejected strings beginning
> with a dash?  I haven't tested, but roughly along the lines of

Taking a closer look, many of the other switches-requiring-an-argument
also just absorb "optarg" without checking its value till much later,
so I'm not sure how far we could move the needle by special-casing
--compress.

			regards, tom lane
Re: pg_basebackup and error messages dependent on the order of the arguments
"Daniel Westermann (DWE)" writes: > shouldn't this give the same error message? > $ pg_basebackup --checkpoint=fast --format=t --compress > --pgdata=/var/tmp/dummy > pg_basebackup: error: must specify output directory or backup target > pg_basebackup: hint: Try "pg_basebackup --help" for more information. > $ pg_basebackup --pgdata=/var/tmp/dummy --checkpoint=fast --format=t > --compress > pg_basebackup: option '--compress' requires an argument > pg_basebackup: hint: Try "pg_basebackup --help" for more information. > I don't see why the first case gives not the same message as the second, > especially as the "output directory" is specified. It appears that the first case treats "--pgdata=/var/tmp/dummy" as the argument of --compress, and it doesn't bother to check that that specifies a valid compression algorithm until much later. As this example shows, we really ought to validate the compression argument on sight in order to get sensible error messages. The trouble is that for server-side compression the code wants to just pass the string through to the server and not form its own opinion as to whether it's a known algorithm. Perhaps it would help if we simply rejected strings beginning with a dash? I haven't tested, but roughly along the lines of case 'Z': + /* we don't want to check the algorithm yet, but ... */ + if (optarg[0] == '-') + pg_fatal("invalid compress option \"%s\", optarg); backup_parse_compress_options(optarg, &compression_algorithm, &compression_detail, &compressloc); break; regards, tom lane
Re: pg_walsummary, Character-not-present-in-option
btsugieyuusuke writes:
>> Therefore, shouldn't “f:” and “w:” be removed?

Looks like that to me too.  Pushed.

			regards, tom lane
Re: ACL_MAINTAIN, Lack of comment content
Nathan Bossart writes:
> On Mon, Sep 30, 2024 at 04:13:55PM +0200, Daniel Gustafsson wrote:
>> I'm not a native speaker so I'm not sure which is right, but grepping for
>> other lists of items shows that the last "and" item is often preceded by a
>> comma so I'll do that.

> I'm not aware of a project policy around the Oxford comma [0], but I tend
> to include one.

Yeah, as that wikipedia article suggests, you can find support for
either choice.  I'd say do what looks best in context.

			regards, tom lane
Re: pg_verifybackup: TAR format backup verification
Robert Haas writes:
> On Sun, Sep 29, 2024 at 1:03 PM Tom Lane wrote:
>>> CID 1620458: Resource leaks (RESOURCE_LEAK)
>>> Variable "buffer" going out of scope leaks the storage it points to.

> This looks like a real leak. It can only happen once per tarfile when
> verifying a tar backup so it can never add up to much, but it makes
> sense to fix it.

+1

>>> CID 1620457: Memory - illegal accesses (OVERRUN)
>>> Overrunning array of 296 bytes at byte offset 296 by dereferencing
>>> pointer "(char *)&mystreamer->control_file + mystreamer->control_file_bytes".

> I think this might be complaining about a potential zero-length copy.
> Seems like perhaps the <= sizeof(ControlFileData) test should actually
> be < sizeof(ControlFileData).

That's clearly an improvement, but I was wondering if we should also
change "len" and "remaining" to be unsigned (probably size_t).
Coverity might be unhappy about the off-the-end array reference, but
perhaps it's also concerned about what happens if len is negative.

>>> CID 1620456: Null pointer dereferences (FORWARD_NULL)
>>> Passing null pointer "suffix" to "strcmp", which dereferences it.

> This one is happening, I believe, because report_backup_error()
> doesn't perform a non-local exit, but we have a bit of code here that
> acts like it does.

Check.

> Patch attached.

WFM, modulo the suggestion about changing data types.

			regards, tom lane
Re: pg_verifybackup: TAR format backup verification
Robert Haas writes:
> On Sat, Sep 28, 2024 at 6:59 PM Tom Lane wrote:
>> Now, manifest_file.size is in fact a size_t, so %zu is the correct
>> format spec for it.  But astreamer_verify.checksum_bytes is declared
>> uint64.  This leads me to question whether size_t was the correct
>> choice for manifest_file.size.

> It seems that manifest_file.size is size_t because parse_manifest.h is
> using size_t for json_manifest_per_file_callback. What's happening is
> that json_manifest_finalize_file() is parsing the file size
> information out of the manifest. It uses strtoul to do that and
> assigns the result to a size_t. I don't think I had any principled
> reason for making that decision; size_t is, I think, the size of an
> object in memory, and this is not that.

Correct, size_t is defined to measure in-memory object sizes.  It's
the argument type of malloc(), the result type of sizeof(), etc.
It does not restrict the size of disk files.

> This is just a string in a
> JSON file that represents an integer which will hopefully turn out to
> be the size of the file on disk. I guess I don't know what type I
> should be using here. Most things in PostgreSQL use a type like uint32
> or uint64, but technically what we're going to be comparing against in
> the end is probably an off_t produced by stat(), but the return value
> of strtoul() or strtoull() is unsigned long or unsigned long long,
> which is not any of those things. If you have a suggestion here, I'm
> all ears.

I don't know if it's realistic to expect that this code might be used
to process JSON blobs exceeding 4GB.  But if it is, I'd be inclined to
use uint64 and strtoull for these purposes, if only to avoid
cross-platform hazards with varying sizeof(long) and sizeof(size_t).

Um, wait ... we do have strtou64(), so you should use that.

>> Aside from that, I'm unimpressed with expending a five-line comment
>> at line 376 to justify casting control_file_bytes to int,

> I don't know what you mean by this.

I mean that we have a widely-used, better solution.  If you copied
this from someplace else, the someplace else could stand to be
improved too.

			regards, tom lane
Re: Do not lock temp relations
Maxim Orlov writes:
> But for the second one: do we really need any lock for temp relations?

Yes.  Our implementation restrictions preclude access to the contents
of another session's temp tables, but it is not forbidden to do DDL
on them so long as no content access is required.  (Without this, it'd
be problematic for example to clean out a crashed session's temp
tables.  See the "orphan temporary tables" logic in autovacuum.c.)
You need fairly realistic locking to ensure that's OK.

			regards, tom lane
Re: msys inet_pton strangeness
Andrew Dunstan writes: > On 2024-09-30 Mo 10:08 AM, Tom Lane wrote: >> Not entirely ... if fairywren had been generating that warning all >> along, I would have noticed it long ago, because I periodically >> scrape the BF database for compiler warnings. There has to have >> been some recent change in the system include files. > here's what I see on vendikar: Oh, wait, I forgot this is only about the v15 branch. I seldom search for warnings except on HEAD. Still, I'd have expected to notice it while v15 was development tip. Maybe we changed something since then? Anyway, it's pretty moot, I see no reason not to push forward with the proposed fix. regards, tom lane
Re: pg_upgrade check for invalid databases
Nathan Bossart writes: > Should we have pg_upgrade skip invalid databases? If the only valid action > is to drop them, IMHO it seems unnecessary to require users to manually > drop them before retrying pg_upgrade. I was thinking the same. But I wonder if there is any chance of losing data that could be recoverable. It feels like this should not be a default behavior. TBH I'm not finding anything very much wrong with the current behavior... this has to be a rare situation; do we need to add debatable behavior to make it easier? regards, tom lane
Re: msys inet_pton strangeness
Andrew Dunstan writes: > Ah, so this is because gcc 14.1.0 treats this as an error but gcc 12.2.0 > treats it as a warning. Now it makes sense. Not entirely ... if fairywren had been generating that warning all along, I would have noticed it long ago, because I periodically scrape the BF database for compiler warnings. There has to have been some recent change in the system include files. regards, tom lane
Re: Fixing backslash dot for COPY FROM...CSV
"Daniel Verite" writes: > To clarify the compatibility issue, the attached bash script > compares pre-patch and post-patch client/server combinations with > different cases, submitted with different copy variants. > ... > Also attaching the tables of results with the patch as it stands. > "Failed" is when psql reports an error and "Data mismatch" > is when it succeeds but with copied data differing from what was > expected. Thanks for doing that --- that does indeed clarify matters. Obviously, the cases we need to worry about most are the "Data mismatch" ones, since those might lead to silent misbehavior. I modified your script to show me the exact nature of the mismatches. The results are: * For case B, unpatched both sides, the "mismatch" is that the copy stops at the \. line. We should really call this the "old expected behavior", since it's not impossible that somebody is relying on it. We're okay with breaking that expectation, but the question is whether there might be other surprises. * For case B, unpatched-server+patched-psql, the mismatches are exactly the same, i.e. the user sees the old behavior. That's OK. * For case B, patched-server+unpatched-psql, the mismatches are different: the copy absorbs the \. as a data line and then stops. So that's surprising and bad, as it's neither the old expected behavior nor the new intended behavior. However, there are two things that make me less worried about this than I might be. First, it's pretty likely that the server will not find "\." alone on a line to be valid input data for the COPY, so that the result will be an error not silently wrong data. Second, the case of patched-server+unpatched-psql is fairly uncommon, and people know to expect that an old psql might not behave perfectly against a newer server. The other direction of version mismatch is of far more concern: people do expect a new psql to work with an old server. 
* For case C, patched-server+unpatched-psql, the mismatch for embedded data is again that the copy absorbs the \. as a data line and then stops. Again, this isn't great but the mitigating factors are the same as above. In all other cases, the results are the same or strictly better than before. So on the whole I think we are good on the compatibility front and I'm ready to press ahead with the v5 behavior. However, I've not looked at Artur's coding suggestions downthread. Do you want to review that and see if you want to adopt any of those changes? regards, tom lane
Re: pg_verifybackup: TAR format backup verification
Piling on a bit ... Coverity reported the following issues in this
new code.  I have not analyzed them to see if they're real problems.

*** CID 1620458:  Resource leaks  (RESOURCE_LEAK)
/srv/coverity/git/pgsql-git/postgresql/src/bin/pg_verifybackup/pg_verifybackup.c: 1025 in verify_tar_file()
1019                                 relpath);
1020
1021     /* Close the file. */
1022     if (close(fd) != 0)
1023         report_backup_error(context, "could not close file \"%s\": %m",
1024                             relpath);
>>> CID 1620458:  Resource leaks  (RESOURCE_LEAK)
>>> Variable "buffer" going out of scope leaks the storage it points to.
1025 }
1026
1027 /*
1028  * Scan the hash table for entries where the 'matched' flag is not set; report
1029  * that such files are present in the manifest but not on disk.
1030  */

*** CID 1620457:  Memory - illegal accesses  (OVERRUN)
/srv/coverity/git/pgsql-git/postgresql/src/bin/pg_verifybackup/astreamer_verify.c: 349 in member_copy_control_data()
343      */
344     if (mystreamer->control_file_bytes <= sizeof(ControlFileData))
345     {
346         int     remaining;
347
348         remaining = sizeof(ControlFileData) - mystreamer->control_file_bytes;
>>> CID 1620457:  Memory - illegal accesses  (OVERRUN)
>>> Overrunning array of 296 bytes at byte offset 296 by dereferencing
>>> pointer "(char *)&mystreamer->control_file + mystreamer->control_file_bytes".
349         memcpy(((char *) &mystreamer->control_file)
350                + mystreamer->control_file_bytes,
351                data, Min(len, remaining));
352     }
353
354     /* Remember how many bytes we saw, even if we didn't buffer them. */

*** CID 1620456:  Null pointer dereferences  (FORWARD_NULL)
/srv/coverity/git/pgsql-git/postgresql/src/bin/pg_verifybackup/pg_verifybackup.c: 939 in precheck_tar_backup_file()
933              "file \"%s\" is not expected in a tar format backup",
934              relpath);
935     tblspc_oid = (Oid) num;
936 }
937
938 /* Now, check the compression type of the tar */
>>> CID 1620456:  Null pointer dereferences  (FORWARD_NULL)
>>> Passing null pointer "suffix" to "strcmp", which dereferences it.
939 if (strcmp(suffix, ".tar") == 0)
940     compress_algorithm = PG_COMPRESSION_NONE;
941 else if (strcmp(suffix, ".tgz") == 0)
942     compress_algorithm = PG_COMPRESSION_GZIP;
943 else if (strcmp(suffix, ".tar.gz") == 0)
944     compress_algorithm = PG_COMPRESSION_GZIP;

			regards, tom lane
Re: msys inet_pton strangeness
Andrew Dunstan writes: > Yeah, src/include/port/win32/sys/socket.h has: > #include > #include > #include > I'm inclined to think we might need to reverse the order of the last > two. TBH I don't really understand how this has worked up to now. I see the same in src/include/port/win32_port.h ... wouldn't that get included first? regards, tom lane
Re: pg_verifybackup: TAR format backup verification
The 32-bit buildfarm animals don't like this too much [1]:

astreamer_verify.c: In function 'member_verify_checksum':
astreamer_verify.c:297:68: error: format '%zu' expects argument of type 'size_t', but argument 6 has type 'uint64' {aka 'long long unsigned int'} [-Werror=format=]
  297 |      "file \"%s\" in \"%s\" should contain %zu bytes, but read %zu bytes",
      |                                                                ~~^
      |                                                                  |
      |                                                                  unsigned int
      |                                                                %llu
  298 |      m->pathname, mystreamer->archive_name,
  299 |      m->size, mystreamer->checksum_bytes);
      |               ~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                        |
      |                        uint64 {aka long long unsigned int}

Now, manifest_file.size is in fact a size_t, so %zu is the correct
format spec for it.  But astreamer_verify.checksum_bytes is declared
uint64.  This leads me to question whether size_t was the correct
choice for manifest_file.size.  If it is, is it OK to use size_t for
checksum_bytes as well?  If not, your best bet is likely to write
%lld and add an explicit cast to long long, as we do in dozens of
other places.  I see that these messages are intended to be
translatable, so INT64_FORMAT is not usable here.

Aside from that, I'm unimpressed with expending a five-line comment
at line 376 to justify casting control_file_bytes to int, when you
could similarly cast it to long long, avoiding the need to justify
something that's not even in tune with project style.

			regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2024-09-28%2006%3A03%3A02
Re: msys inet_pton strangeness
Andrew Dunstan writes:
> We should have included ws2tcpip.h, which includes this:
> #define InetPtonA inet_pton
> WINSOCK_API_LINKAGE INT WSAAPI InetPtonA(INT Family, LPCSTR pStringBuf,
> PVOID pAddr);
> It's conditioned on (_WIN32_WINNT >= 0x0600), but that should be true.
> So I'm still very confused ;-(

Me too.  Does this compiler support the equivalent of -E, so that you
can verify that the InetPtonA declaration is being read?

			regards, tom lane
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
I wrote: > It looks like if we did want to suppress that, the right fix is to > make gram.y track statement start locations more honestly, as in > 0002 attached (0001 is the same as before). I did a little bit of further work on this: * I ran some performance checks and convinced myself that indeed the more complex definition of YYLLOC_DEFAULT has no measurable cost compared to the overall runtime of raw_parser(), never mind later processing. So I see no reason not to go ahead with that change. I swapped the two patches to make that 0001, and added a regression test illustrating its effect on pg_stat_statements. (Without the gram.y change, the added slash-star comment shows up in the pg_stat_statements output, which is surely odd.) * I concluded that the best way to report the individual statement when we're able to do that is to include it in an errcontext() message, similar to what spi.c's _SPI_error_callback does. Otherwise it interacts badly if some more-tightly-nested error context function has already set up an "internal error query", as for example SQL function errors will do if you enable check_function_bodies = on. So here's v3, now with commit messages and regression tests. I feel like this is out of the proof-of-concept stage and might now actually be committable. There's still a question of whether reporting the whole script as the query is OK when we have a syntax error, but I have no good ideas as to how to make that terser. And I think you're right that we shouldn't let perfection stand in the way of making things noticeably better. regards, tom lane From f6022f120c6c8e16f33a7b7d220d0f3a8b7ebf4e Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Sat, 28 Sep 2024 13:48:39 -0400 Subject: [PATCH v3 1/2] Improve parser's reporting of statement start locations. Up to now, the parser's reporting of a statement's stmt_location included any preceding whitespace or comments. 
This isn't really desirable but was done to avoid accounting honestly for nonterminals that reduce to empty. It causes problems for pg_stat_statements, which partially compensates by manually stripping whitespace, but is not bright enough to strip /*-style comments. There will be more problems with an upcoming patch to improve reporting of errors in extension scripts, so it's time to do something about this. The thing we have to do to make it work right is to adjust YYLLOC_DEFAULT to scan the inputs of each production to find the first one that has a valid location (i.e., did not reduce to empty). In theory this adds a little bit of per-reduction overhead, but in practice it's negligible. I checked by measuring the time to run raw_parser() on the contents of information_schema.sql, and there was basically no change. Having done that, we can rely on any nonterminal that didn't reduce to completely empty to have a correct starting location, and we don't need the kluges the stmtmulti production formerly used. This should have a side benefit of allowing parse error reports to include an error position in some cases where they formerly failed to do so, due to trying to report the position of an empty nonterminal. I did not go looking for an example though. The one previously known case where that could happen (OptSchemaEltList) no longer needs the kluge it had; but I rather doubt that that was the only case. 
Discussion: https://postgr.es/m/zvv1clhnbjlcz...@msg.df7cb.de --- .../pg_stat_statements/expected/select.out| 5 +- contrib/pg_stat_statements/sql/select.sql | 3 +- src/backend/nodes/queryjumblefuncs.c | 6 ++ src/backend/parser/gram.y | 66 +++ 4 files changed, 34 insertions(+), 46 deletions(-) diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out index dd6c756f67..e0e2fa265c 100644 --- a/contrib/pg_stat_statements/expected/select.out +++ b/contrib/pg_stat_statements/expected/select.out @@ -19,8 +19,9 @@ SELECT 1 AS "int"; 1 (1 row) +/* this comment should not appear in the output */ SELECT 'hello' - -- multiline + -- but this one will appear AS "text"; text --- @@ -129,7 +130,7 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; ---+--+-- 1 |1 | PREPARE pgss_test (int) AS SELECT $1, $2 LIMIT $3 4 |4 | SELECT $1 + - | | -- multiline + + | | -- but this one will appear + | | AS "text" 2 |2 | SELECT $1 + $2 3 |3 | SELECT $1 + $2 + $3 AS "add"
Re: msys inet_pton strangeness
Andrew Dunstan writes:
> It's complaining like this:
> C:/tools/xmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql/src/interfaces/libpq/fe-secure-common.c:219:21: error: implicit declaration of function 'inet_pton'; did you mean 'inet_aton'? [-Wimplicit-function-declaration]
>   219 |         if (inet_pton(AF_INET6, host, &addr) == 1)
>       |             ^
> configure has determined that we have inet_pton, and I have repeated the
> test manually.

configure's test is purely a linker test.  It does not check to see
where/whether the function is declared.  Meanwhile, the compiler is
complaining that it doesn't see a declaration.  So the problem
probably can be fixed by adding an #include, but you'll need to
figure out what.

I see that our other user of inet_pton, fe-secure-openssl.c, has a
rather different #include setup than fe-secure-common.c; does it
compile OK?

			regards, tom lane
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
Christoph Berg writes: > Re: Tom Lane >> (It might be worth some effort to trim away comments appearing >> just before a command, but I didn't tackle that here.) > The "error when psql" comments do look confusing, but I guess in other > places the comment just before the query adds valuable context, so I'd > say leaving the comments in is ok. It looks like if we did want to suppress that, the right fix is to make gram.y track statement start locations more honestly, as in 0002 attached (0001 is the same as before). This'd add a few cycles per grammar nonterminal reduction, which is kind of annoying but probably is negligible in the grand scheme of things. Still, I'd not propose it just for this. But if memory serves, we've had previous complaints about pg_stat_statements failing to strip leading comments from queries, and this'd fix that. I think it likely also improves error cursor positioning for cases involving empty productions --- I'm a tad surprised that no other regression cases changed. 
regards, tom lane diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index fab59ad5f6..0fae1332d2 100644 --- a/src/backend/commands/extension.c +++ b/src/backend/commands/extension.c @@ -54,6 +54,7 @@ #include "funcapi.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "nodes/queryjumble.h" #include "storage/fd.h" #include "tcop/utility.h" #include "utils/acl.h" @@ -107,6 +108,17 @@ typedef struct ExtensionVersionInfo struct ExtensionVersionInfo *previous; /* current best predecessor */ } ExtensionVersionInfo; +/* + * Information for script_error_callback() + */ +typedef struct +{ + const char *sql; /* entire script file contents */ + const char *filename; /* script file pathname */ + ParseLoc stmt_location; /* current stmt start loc, or -1 if unknown */ + ParseLoc stmt_len; /* length in bytes; 0 means "rest of string" */ +} script_error_callback_arg; + /* Local functions */ static List *find_update_path(List *evi_list, ExtensionVersionInfo *evi_start, @@ -670,9 +682,60 @@ read_extension_script_file(const ExtensionControlFile *control, return dest_str; } +/* + * error context callback for failures in script-file execution + */ +static void +script_error_callback(void *arg) +{ + script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg; + int syntaxerrposition; + const char *lastslash; + + /* If it's a syntax error, convert to internal syntax error report */ + syntaxerrposition = geterrposition(); + if (syntaxerrposition > 0) + { + /* + * We must report the whole string because otherwise details such as + * psql's line number report would be wrong. + */ + errposition(0); + internalerrposition(syntaxerrposition); + internalerrquery(callback_arg->sql); + } + else if (callback_arg->stmt_location >= 0) + { + /* + * Since no syntax cursor will be shown, it's okay and helpful to trim + * the reported query string to just the current statement. 
+ */ + const char *query = callback_arg->sql; + int location = callback_arg->stmt_location; + int len = callback_arg->stmt_len; + + query = CleanQuerytext(query, &location, &len); + internalerrquery(pnstrdup(query, len)); + } + + /* + * Trim the reported file name to remove the path. We know that + * get_extension_script_filename() inserted a '/', regardless of whether + * we're on Windows. + */ + lastslash = strrchr(callback_arg->filename, '/'); + if (lastslash) + lastslash++; + else + lastslash = callback_arg->filename; /* shouldn't happen, but cope */ + errcontext("extension script file \"%s\"", lastslash); +} + /* * Execute given SQL string. * + * The filename the string came from is also provided, for error reporting. + * * Note: it's tempting to just use SPI to execute the string, but that does * not work very well. The really serious problem is that SPI will parse, * analyze, and plan the whole string before executing any of it; of course @@ -682,12 +745,27 @@ read_extension_script_file(const ExtensionControlFile *control, * could be very long. */ static void -execute_sql_string(const char *sql) +execute_sql_string(const char *sql, const char *filename) { + script_error_callback_arg callback_arg; + ErrorContextCallback scripterrcontext; List *raw_parsetree_list; DestReceiver *dest; ListCell *lc1; + /* + * Setup error traceback support for ereport(). + */ + callback_arg.sql = sql; + callback_arg.filename = filename; + callback_arg.stmt_location = -1; + callback_arg.stmt_len = -1; + + scripterrcontext.callback = script_error_callback; + scripterrcontext.arg = (void *) &callback_arg; + scripterrcontext.previous = error_context_stack; + e
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
Christoph Berg writes: > Re: Tom Lane >> So the first part of that is great, but if your script file is >> large you probably won't be happy about having the whole thing >> repeated in the "QUERY" field. So this needs some work on >> user-friendliness. > Does this really have to be addressed? It would be way better than it > is now, and errors during extension creation are rare and mostly for > developers only, so it doesn't have to be pretty. Perhaps. I spent a little more effort on this and added code to report errors that don't come with an error location. On those, we don't have any constraints about what to report in the QUERY field, so I made it trim the string to just the current query within the script, which makes things quite a bit better. You can see the results in the test_extensions regression test changes. (It might be worth some effort to trim away comments appearing just before a command, but I didn't tackle that here.) regards, tom lane diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index fab59ad5f6..0fae1332d2 100644 --- a/src/backend/commands/extension.c +++ b/src/backend/commands/extension.c @@ -54,6 +54,7 @@ #include "funcapi.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "nodes/queryjumble.h" #include "storage/fd.h" #include "tcop/utility.h" #include "utils/acl.h" @@ -107,6 +108,17 @@ typedef struct ExtensionVersionInfo struct ExtensionVersionInfo *previous; /* current best predecessor */ } ExtensionVersionInfo; +/* + * Information for script_error_callback() + */ +typedef struct +{ + const char *sql; /* entire script file contents */ + const char *filename; /* script file pathname */ + ParseLoc stmt_location; /* current stmt start loc, or -1 if unknown */ + ParseLoc stmt_len; /* length in bytes; 0 means "rest of string" */ +} script_error_callback_arg; + /* Local functions */ static List *find_update_path(List *evi_list, ExtensionVersionInfo *evi_start, @@ -670,9 +682,60 @@ 
read_extension_script_file(const ExtensionControlFile *control, return dest_str; } +/* + * error context callback for failures in script-file execution + */ +static void +script_error_callback(void *arg) +{ + script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg; + int syntaxerrposition; + const char *lastslash; + + /* If it's a syntax error, convert to internal syntax error report */ + syntaxerrposition = geterrposition(); + if (syntaxerrposition > 0) + { + /* + * We must report the whole string because otherwise details such as + * psql's line number report would be wrong. + */ + errposition(0); + internalerrposition(syntaxerrposition); + internalerrquery(callback_arg->sql); + } + else if (callback_arg->stmt_location >= 0) + { + /* + * Since no syntax cursor will be shown, it's okay and helpful to trim + * the reported query string to just the current statement. + */ + const char *query = callback_arg->sql; + int location = callback_arg->stmt_location; + int len = callback_arg->stmt_len; + + query = CleanQuerytext(query, &location, &len); + internalerrquery(pnstrdup(query, len)); + } + + /* + * Trim the reported file name to remove the path. We know that + * get_extension_script_filename() inserted a '/', regardless of whether + * we're on Windows. + */ + lastslash = strrchr(callback_arg->filename, '/'); + if (lastslash) + lastslash++; + else + lastslash = callback_arg->filename; /* shouldn't happen, but cope */ + errcontext("extension script file \"%s\"", lastslash); +} + /* * Execute given SQL string. * + * The filename the string came from is also provided, for error reporting. + * * Note: it's tempting to just use SPI to execute the string, but that does * not work very well. The really serious problem is that SPI will parse, * analyze, and plan the whole string before executing any of it; of course @@ -682,12 +745,27 @@ read_extension_script_file(const ExtensionControlFile *control, * could be very long. 
*/ static void -execute_sql_string(const char *sql) +execute_sql_string(const char *sql, const char *filename) { + script_error_callback_arg callback_arg; + ErrorContextCallback scripterrcontext; List *raw_parsetree_list; DestReceiver *dest; ListCell *lc1; + /* + * Setup error traceback support for ereport(). + */ + callback_arg.sql = sql; + callback_arg.filename = filename; + callback_arg.stmt_location = -1; + callback_arg.stmt_len = -1; + + scripterrcontext.callback = script_error_callback; + scripterrcontext.arg = (void *) &callback_arg; + scripterrcontext.previous = error_context_stack; + error_context_stack = &scripterrcontext; +
Re: Retiring is_pushed_down
Richard Guo writes:
> When forming an outer join's joinrel, we have the is_pushed_down flag in
> RestrictInfo nodes to distinguish those quals that are in that join's
> JOIN/ON condition from those that were pushed down to the joinrel and
> thus act as filter quals. Since now we have the outer-join-aware-Var
> infrastructure, I think we can check to see whether a qual clause's
> required_relids reference the outer join(s) being formed, in order to
> tell if it's a join or filter clause. This seems like a more principled
> way. (Interesting that optimizer/README actually describes this way in
> section 'Relation Identification and Qual Clause Placement'.)

Sorry for being so slow to look at this patch.  The idea you're
following is one that I spent a fair amount of time on while working
on what became 2489d76c4 ("Make Vars be outer-join-aware").  I failed
to make it work though.  Digging in my notes from the time:

- How about is_pushed_down?  Would really like to get rid of that,
  because it's ugly/sloppily defined, and it's hard to determine the
  correct value for EquivClass-generated clauses once we allow
  derivations from OJ clauses.  However, my original idea of checking
  for join's ojrelid present in clause's required_relids has issues:
  * fails if clause is not pushed as far down as it can possibly be
    (and lateral refs mean that that's hard to do sometimes)
  * getting the join's ojrelid to everywhere we need to check this is
    messy.  I'd tolerate the mess if it worked nicely, but ...

So I'm worried that the point about lateral refs is still a problem
in your version.  To be clear, the hazard is that if a WHERE clause
ends up getting placed at an outer join that's higher than any of the
OJs specifically listed in its required_relids, we'd misinterpret it
as being a join clause for that OJ although it should be a filter
clause.

The other thing I find in my old notes is speculation that we could
use the concept of JoinDomains to replace is_pushed_down.
That is, we'd have to label every RestrictInfo with the JoinDomain of its syntactic source location, and then we could tell if the RI was "pushed down" relative to a particular join by seeing if the JD was above or below that join. This ought to be impervious to not-pushed-down-all-the-way problems. The thing I'd not figured out was how to make this work with quals of full joins: they don't belong to either the upper JoinDomain or either of the lower ones. We could possibly fix this by giving a full join its very own JoinDomain that is understood to be a parent of both lower domains, but I ran out of energy to pursue that. If we went this route, we'd basically be replacing the is_pushed_down field with a JoinDomain field, which is surely not simpler. But it seems more crisply defined and perhaps more amenable to my long-term desire to be able to use the EquivalenceClass machinery with outer join clauses. (The idea being that an EC would describe equalities that hold within a JoinDomain, but not necessarily elsewhere.) regards, tom lane
Re: [ecpg bug]: can not use single '*' in multi-line comment after c preprocessor directives
"Winter Loo" writes: > The following code fails to pass the ecpg compilation, although it is > accepted by the gcc compiler. Yeah ... an isolated "/" inside the comment doesn't work either. > Confused! I am uncertain how to rectify the regex. I hope someone can address > this bug. I poked at this for awhile and concluded that we probably cannot make it work with a single regexp for "cppline". The right thing would involve an exclusive start condition for parsing a cppline, more or less like the way that /* comments are parsed in the start condition. This is kind of a lot of work compared to the value :-(. Maybe somebody else would like to take a crack at it, but I can't get excited enough about it. There are other deficiencies too in ecpg's handling of these things, like the fact that (I think) comments are mishandled in #include directives. regards, tom lane
Re: [PATCH] Support Int64 GUCs
Alexander Korotkov writes: > Do you think we don't need int64 GUCs just now, when 64-bit > transaction ids are far from committable shape? Or do you think we > don't need int64 GUCs even if we have 64-bit transaction ids? If yes, > what do you think we should use for *_age variables with 64-bit > transaction ids? I seriously doubt that _age values exceeding INT32_MAX would be useful, even in the still-extremely-doubtful situation that we get to true 64-bit XIDs. But if you think we must have that, we could still use float8 GUCs for them. float8 is exact up to 2^53 (given IEEE math), and you certainly aren't going to convince me that anyone needs _age values exceeding that. For that matter, an imprecise representation of such an age limit would still be all right wouldn't it? regards, tom lane
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
Christoph Berg writes:
> I wish there was. The error reporting from failing extension scripts
> is really bad with no context at all, it has hit me a few times in the
> past already.

Nobody's spent any work on that :-(.  A really basic reporting
facility is not hard to add, as in the attached finger exercise.
The trouble with it can be explained by showing what I get after
intentionally breaking a script file command:

regression=# create extension cube;
ERROR:  syntax error at or near "CREAT"
LINE 16: CREAT FUNCTION cube_send(cube)
         ^
QUERY:  /* contrib/cube/cube--1.4--1.5.sql */

-- complain if script is sourced in psql, rather than via ALTER EXTENSION

-- Remove @ and ~
DROP OPERATOR @ (cube, cube);
DROP OPERATOR ~ (cube, cube);

-- Add binary input/output handlers
CREATE FUNCTION cube_recv(internal)
RETURNS cube
AS '$libdir/cube'
LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;

CREAT FUNCTION cube_send(cube)
RETURNS bytea
AS '$libdir/cube'
LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;

ALTER TYPE cube SET (
	RECEIVE = cube_recv,
	SEND = cube_send
);

CONTEXT:  extension script file "/home/postgres/install/share/extension/cube--1.4--1.5.sql"

So the first part of that is great, but if your script file is large
you probably won't be happy about having the whole thing repeated in
the "QUERY" field.  So this needs some work on user-friendliness.

I'm inclined to think that maybe we'd be best off keeping the server
end of it straightforward, and trying to teach psql to abbreviate the
QUERY field in a useful way.  IIRC you get this same type of problem
with very large SQL-language functions and suchlike.

Also, I believe this doesn't help much for non-syntax errors (those
that aren't reported with an error location).  It might be
interesting to use the RawStmt.stmt_location/stmt_len fields for the
current parsetree to identify what to report, but I've not dug
further than this.
			regards, tom lane

diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index fab59ad5f6..6af72ff422 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -107,6 +107,13 @@ typedef struct ExtensionVersionInfo
 	struct ExtensionVersionInfo *previous;	/* current best predecessor */
 } ExtensionVersionInfo;
 
+/* Info structure for script_error_callback() */
+typedef struct
+{
+	const char *sql;
+	const char *filename;
+} script_error_callback_arg;
+
 /* Local functions */
 static List *find_update_path(List *evi_list,
 							  ExtensionVersionInfo *evi_start,
@@ -670,9 +677,32 @@ read_extension_script_file(const ExtensionControlFile *control,
 	return dest_str;
 }
 
+/*
+ * error context callback for failures in script-file execution
+ */
+static void
+script_error_callback(void *arg)
+{
+	script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg;
+	int			syntaxerrposition;
+
+	/* If it's a syntax error, convert to internal syntax error report */
+	syntaxerrposition = geterrposition();
+	if (syntaxerrposition > 0)
+	{
+		errposition(0);
+		internalerrposition(syntaxerrposition);
+		internalerrquery(callback_arg->sql);
+	}
+
+	errcontext("extension script file \"%s\"", callback_arg->filename);
+}
+
 /*
  * Execute given SQL string.
  *
+ * The filename the string came from is also provided, for error reporting.
+ *
  * Note: it's tempting to just use SPI to execute the string, but that does
  * not work very well.  The really serious problem is that SPI will parse,
  * analyze, and plan the whole string before executing any of it; of course
@@ -682,12 +712,25 @@ read_extension_script_file(const ExtensionControlFile *control,
  * could be very long.
  */
 static void
-execute_sql_string(const char *sql)
+execute_sql_string(const char *sql, const char *filename)
 {
+	script_error_callback_arg callback_arg;
+	ErrorContextCallback scripterrcontext;
 	List	   *raw_parsetree_list;
 	DestReceiver *dest;
 	ListCell   *lc1;
 
+	/*
+	 * Setup error traceback support for ereport().
+	 */
+	callback_arg.sql = sql;
+	callback_arg.filename = filename;
+
+	scripterrcontext.callback = script_error_callback;
+	scripterrcontext.arg = (void *) &callback_arg;
+	scripterrcontext.previous = error_context_stack;
+	error_context_stack = &scripterrcontext;
+
 	/*
 	 * Parse the SQL string into a list of raw parse trees.
 	 */
@@ -780,6 +823,8 @@ execute_sql_string(const char *sql)
 
 	/* Be sure to advance the command counter after the last script command */
 	CommandCounterIncrement();
+
+	error_context_stack = scripterrcontext.previous;
 }
 
 /*
@@ -1054,7 +1099,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control,
 
 		/* And now back to C string */
 		c_sql = text_to_cstring(DatumGetTextPP(t_sql));
 
-		execute_sql_string(c_sql);
+		execute_sql_string(c_sql, filename);
 	}
 	PG_FINALLY();
 	{
Re: Test improvements and minor code fixes for formatting.c.
Nathan Bossart writes:
> On Sun, Sep 08, 2024 at 05:32:16PM -0400, Tom Lane wrote:
>> In looking at this, I found that there's also no test coverage
>> for the , V, or PL format codes.  Also, the possibility of
>> overflow while converting an input value to int in order to
>> pass it to int_to_roman was ignored.  Attached is a patch that
>> adds more test coverage and cleans up the Roman-numeral code
>> a little bit.

> I stared at the patch for a while, and it looks good to me.

Pushed, thanks for looking!

>> BTW, I also discovered that there is a little bit of support
>> for a "B" format code: we can parse it, but then we ignore it.

> AFAICT it's been like that since it was introduced [0].  I searched the
> archives and couldn't find any discussion about this format code.  Given
> that, I don't have any concerns about removing it unless it causes ERRORs
> for calls that currently succeed, but even then, it's probably fine.  This
> strikes me as something that might be fun for an aspiring hacker, though.

Yeah, I left that alone for now.  I don't have much interest in making
it work, but perhaps someone else will.

			regards, tom lane
Re: PATCH: jsonpath string methods: lower, upper, initcap, l/r/btrim, replace, split_part
Florents Tselai writes: > This patch is a follow-up and generalization to [0]. > It adds the following jsonpath methods: lower, upper, initcap, l/r/btrim, > replace, split_part. How are you going to deal with the fact that this makes jsonpath operations not guaranteed immutable? (See commit cb599b9dd for some context.) Those are all going to have behavior that's dependent on the underlying locale. We have the kluge of having separate "_tz" functions to support non-immutable datetime operations, but that way doesn't seem like it's going to scale well to multiple sources of mutability. regards, tom lane
Re: src/backend/optimizer/util/plancat.c -> Is this correct English
"Daniel Westermann (DWE)" writes: > That's what I would have expected. But, as said, maybe this only sounds > strange to me. "Need not" is perfectly good English, although perhaps it has a faintly archaic whiff to it. regards, tom lane
Re: BUG #18097: Immutable expression not allowed in generated at
Adrien Nayrat writes: > I see. So I understand we were lucky it worked before the commit added > the check of volatility in generated column ? Pretty much. There are other cases that could trigger expansion of such a function before the restore is complete. It is unfortunate that this bit you in a minor release, but there are lots of other ways you could have tripped over the missing dependency unexpectedly. regards, tom lane
Re: XMLSerialize: version and explicit XML declaration
Jim Jones writes:
> Is there any validation mechanism for the VERSION literal?

AFAICS, all we do with an embedded XML version string is pass it to
libxml2's xmlNewDoc(), which is the authority on whether it means
anything.  I'd be inclined to do the same here.

			regards, tom lane
Re: [PATCH] Support Int64 GUCs
FWIW, I agree with the upthread opinions that we shouldn't do this (invent int64 GUCs). I don't think we need the added code bloat and risk of breaking user code that isn't expecting this new GUC type. We invented the notion of GUC units in part to ensure that int32 GUCs could be adapted to handle potentially-large numbers. And there's always the fallback position of using a float8 GUC if you really feel you need a wider range. regards, tom lane
Re: BUG #18097: Immutable expression not allowed in generated at
Adrien Nayrat writes:
> A customer encountered an issue while restoring a dump of its database
> after applying 15.6 minor version.
> It seems due to this fix :
>>> Fix function volatility checking for GENERATED and DEFAULT
>>> expressions (Tom Lane)

I don't believe this example has anything to do with that.

> CREATE SCHEMA s1;
> CREATE SCHEMA s2;
> CREATE FUNCTION s2.f1 (c1 text) RETURNS text
> LANGUAGE SQL IMMUTABLE
> AS $$
>    SELECT c1
> $$;
> CREATE FUNCTION s2.f2 (c1 text) RETURNS text
> LANGUAGE SQL IMMUTABLE
> AS $$
>    SELECT s2.f1 (c1);
> $$;
> CREATE TABLE s1.t1 (c1 text, c2 text GENERATED ALWAYS AS (s2.f2 (c1))
> STORED);

The problem here is that to pg_dump, the body of s2.f2 is just an
opaque string, so it has no idea that that depends on s2.f1, and it
ends up picking a dump order that doesn't respect that dependency.

It used to be that there wasn't much you could do about this except
choose object names that wouldn't cause the problem.  In v14 and up
there's another way, at least for SQL-language functions: you can
write the function in SQL spec style.

CREATE FUNCTION s2.f2 (c1 text) RETURNS text
IMMUTABLE
BEGIN ATOMIC
  SELECT s2.f1 (c1);
END;

Then the dependency is visible, both to the server and to pg_dump,
and you get a valid dump order.

			regards, tom lane
Re: pg_ctl/miscinit: print "MyStartTime" as a long long instead of long to avoid 2038 problem.
Nathan Bossart writes: > I think we should use INT64_FORMAT here. That'll choose the right length > modifier for the platform. And I don't think we need to cast MyStartTime, > since it's a pg_time_t (which is just an int64). Agreed. However, a quick grep finds half a dozen other places that are casting MyStartTime to long. We should fix them all. Also note that if any of the other places are using translatable format strings, INT64_FORMAT is problematic in that context, and "long long" is a better answer for them. (I've not dug in the history, but I rather imagine that this is all a hangover from MyStartTime having once been plain "time_t".) regards, tom lane
Re: Proposal to Enable/Disable Index using ALTER INDEX
Peter Eisentraut writes: > On 23.09.24 22:51, Shayon Mukherjee wrote: >> I am happy to draft a patch for this as well. I think I have a working >> idea so far of where the necessary checks might go. However if you don’t >> mind, can you elaborate further on how the effect would be similar to >> enable_indexscan? > Planner settings like enable_indexscan used to just add a large number > (disable_cost) to the estimated plan node costs. It's a bit more > sophisticated in PG17. But in any case, I imagine "disabling an index" > could use the same mechanism. Or maybe not, maybe the setting would > just completely ignore the index. Yeah, I'd be inclined to implement this by having create_index_paths just not make any paths for rejected indexes. Or you could back up another step and keep plancat.c from building IndexOptInfos for them. The latter might have additional effects, such as preventing the plan from relying on a uniqueness condition enforced by the index. Not clear to me if that's desirable or not. [ thinks... ] One good reason for implementing it in plancat.c is that you'd have the index relation open and be able to see its name for purposes of matching to the filter. Anywhere else, getting the name would involve additional overhead. regards, tom lane
Re: pgsql: Improve default and empty privilege outputs in psql.
I wrote:
> Yes, that's the expectation.  I'm sure we can think of a more
> backwards-compatible way to test for empty datacl, though.

Looks like the attached should be sufficient.  As penance, I tried all
the commands in describe.c against 9.2 (our oldest supported server
version), and none of them fail now.

			regards, tom lane

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index faabecbc76..6a36c91083 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6678,7 +6678,7 @@ printACLColumn(PQExpBuffer buf, const char *colname)
 {
 	appendPQExpBuffer(buf,
 					  "CASE"
-					  " WHEN pg_catalog.cardinality(%s) = 0 THEN '%s'"
+					  " WHEN pg_catalog.array_length(%s, 1) = 0 THEN '%s'"
 					  " ELSE pg_catalog.array_to_string(%s, E'\\n')"
 					  " END AS \"%s\"",
 					  colname,
 					  gettext_noop("(none)"),
Re: Cleaning up ERRCODE usage in our XML code
Daniel Gustafsson writes: > On 23 Sep 2024, at 19:17, Tom Lane wrote: >> Yeah, I looked at that but wasn't sure what to do with it. We should >> have validated the decl header when the XML value was created, so if >> we get here then either the value got corrupted on-disk or in-transit, >> or something forgot to do that validation, or libxml has changed its >> mind since then about what's a valid decl. At least some of those >> cases are probably legitimately called INTERNAL_ERROR. I thought for >> awhile about ERRCODE_DATA_CORRUPTED, but I'm not convinced that's a >> better fit. > I agree that it might not be an obvious better fit, but also not an obvious > worse fit. It will make it easier to filter on during fleet analysis so I > would be inclined to change it, but the main value of the patch are other > hunks > so feel free to ignore. Fair enough. Pushed with ERRCODE_DATA_CORRUPTED used there. Thanks again for reviewing. regards, tom lane
Re: pgsql: Improve default and empty privilege outputs in psql.
Christoph Berg writes:
> Re: Tom Lane
>> Improve default and empty privilege outputs in psql.

> I'm sorry to report this when 17.0 has already been wrapped, but this
> change is breaking `psql -l` against 9.3-or-earlier servers:

> FEHLER:  42883: Funktion pg_catalog.cardinality(aclitem[]) existiert nicht
> [i.e., ERROR: function pg_catalog.cardinality(aclitem[]) does not exist]

Grumble.  Well, if that's the worst bug in 17.0 we should all be very
pleased indeed ;-).  I'll see about fixing it after the release freeze
lifts.

> The psql docs aren't really explicit about which old versions are
> still supported, but there's some mentioning that \d should work back
> to 9.2:

Yes, that's the expectation.  I'm sure we can think of a more
backwards-compatible way to test for empty datacl, though.

			regards, tom lane
Re: Add llvm version into the version string
Dmitry Dolgov <9erthali...@gmail.com> writes:
> On Sun, Sep 22, 2024 at 01:15:54PM GMT, Dmitry Dolgov wrote:
>> About a new function, I think that the llvm runtime version has to be
>> shown in some form by pgsql_version. The idea is to skip an emails
>> exchange like "here is the bug report" -> "can you attach the llvm
>> version?".

I'm not in favor of that, for a couple of reasons:

* People already have expectations about what version() returns.
Some distros and forks even modify it (see eg --extra-version).
I think we risk breaking obscure use-cases if we add stuff onto that.
Having version() return something different from the PG_VERSION_STR
constant could cause other problems too, perhaps.

* Believing that it'll save us questioning a bug submitter seems
fairly naive to me.  Nobody includes the result of version() unless
specifically asked for it.

* I'm not in favor of overloading version() with subsystem-related
version numbers, because it doesn't scale.  Who's to say we're not
going to need the version of ICU, or libxml2, to take a couple of
obvious examples?  When you take that larger view, multiple
subsystem-specific version functions seem to me to make more sense.

Maybe another idea could be a new system view?

=> select * from pg_system_version;
   property   |        value
--------------+---------------------
 core version | 18.1
 architecture | x86_64-pc-linux-gnu
 word size    | 64 bit
 compiler     | gcc (GCC) 12.5.0
 ICU version  | 60.3
 LLVM version | 18.1.0
 ...

Adding rows to a view over time seems way less likely to cause
problems than messing with a string people probably have crufty
parsing code for.

>> If it's going to be a new separate function, I guess it won't
>> make much difference to ask either to call the new function or the
>> llvm-config, right?

I do think that if we can get a version number out of the loaded
library, that is worth doing.  I don't trust the llvm-config that
happens to be in somebody's PATH to be the same version that their
PG is actually built with.

			regards, tom lane
Re: Cleaning up ERRCODE usage in our XML code
Daniel Gustafsson writes: >> On 20 Sep 2024, at 21:00, Tom Lane wrote: >> Per cfbot, rebased over d5622acb3. No functional changes. > Looking over these I don't see anything mis-characterized so +1 on going ahead > with these. It would be neat if we end up translating xml2 errors into XQuery > Error SQLSTATEs but this is a clear improvement over what we have until then. Thanks for looking at it! > There is an ERRCODE_INTERNAL_ERROR in xml_out_internal() which seems a tad odd > given that any error would be known to be parsing related and b) are caused by > libxml and not internally. Not sure if it's worth bothering with but with the > other ones improved it stood out. Yeah, I looked at that but wasn't sure what to do with it. We should have validated the decl header when the XML value was created, so if we get here then either the value got corrupted on-disk or in-transit, or something forgot to do that validation, or libxml has changed its mind since then about what's a valid decl. At least some of those cases are probably legitimately called INTERNAL_ERROR. I thought for awhile about ERRCODE_DATA_CORRUPTED, but I'm not convinced that's a better fit. regards, tom lane
Re: scalability bottlenecks with (many) partitions (and more)
Tomas Vondra writes: > Thanks. Pushed a fix for these issues, hopefully coverity will be happy. Thanks. > BTW is the coverity report accessible somewhere? I know someone > mentioned that in the past, but I don't recall the details. Maybe we > should have a list of all these resources, useful for committers, > somewhere on the wiki? Currently those reports only go to the security team. Perhaps we should rethink that? regards, tom lane
Re: scalability bottlenecks with (many) partitions (and more)
Tomas Vondra writes: > On 9/22/24 17:45, Tom Lane wrote: >> #define FAST_PATH_GROUP(index) \ >> -(AssertMacro(((index) >= 0) && ((index) < FP_LOCK_SLOTS_PER_BACKEND)), \ >> +(AssertMacro((uint32) (index) < FP_LOCK_SLOTS_PER_BACKEND), \ >> ((index) / FP_LOCK_SLOTS_PER_GROUP)) > For the (x >= 0) asserts, doing it this way relies on negative values > wrapping to large positive ones, correct? AFAIK it's guaranteed to be a > very large value, so it can't accidentally be less than the slot count. Right, any negative value would wrap to something more than INT32_MAX. regards, tom lane
Re: scalability bottlenecks with (many) partitions (and more)
*** CID 1619659:  Integer handling issues  (NO_EFFECT)
/srv/coverity/git/pgsql-git/postgresql/src/backend/storage/lmgr/lock.c: 3067 in GetLockConflicts()
3061
3062     			for (j = 0; j < FP_LOCK_SLOTS_PER_GROUP; j++)
3063     			{
3064     				uint32		lockmask;
3065
3066     				/* index into the whole per-backend array */
>>>     CID 1619659:  Integer handling issues  (NO_EFFECT)
>>>     This greater-than-or-equal-to-zero comparison of an unsigned value is always true. "group >= 0U".
3067     				uint32		f = FAST_PATH_SLOT(group, j);
3068
3069     				/* Look for an allocated slot matching the given relid. */
3070     				if (relid != proc->fpRelId[f])
3071     					continue;
3072     				lockmask = FAST_PATH_GET_BITS(proc, f);

*** CID 1619658:  Integer handling issues  (NO_EFFECT)
/srv/coverity/git/pgsql-git/postgresql/src/backend/storage/lmgr/lock.c: 2739 in FastPathUnGrantRelationLock()
2733     	uint32		group = FAST_PATH_REL_GROUP(relid);
2734
2735     	FastPathLocalUseCounts[group] = 0;
2736     	for (i = 0; i < FP_LOCK_SLOTS_PER_GROUP; i++)
2737     	{
2738     		/* index into the whole per-backend array */
>>>     CID 1619658:  Integer handling issues  (NO_EFFECT)
>>>     This greater-than-or-equal-to-zero comparison of an unsigned value is always true. "group >= 0U".
2739     		uint32		f = FAST_PATH_SLOT(group, i);
2740
2741     		if (MyProc->fpRelId[f] == relid
2742     			&& FAST_PATH_CHECK_LOCKMODE(MyProc, f, lockmode))
2743     		{
2744     			Assert(!result);

*** CID 1619657:  Integer handling issues  (NO_EFFECT)
/srv/coverity/git/pgsql-git/postgresql/src/backend/storage/lmgr/lock.c: 2878 in FastPathGetRelationLockEntry()
2872
2873     	for (i = 0; i < FP_LOCK_SLOTS_PER_GROUP; i++)
2874     	{
2875     		uint32		lockmode;
2876
2877     		/* index into the whole per-backend array */
>>>     CID 1619657:  Integer handling issues  (NO_EFFECT)
>>>     This greater-than-or-equal-to-zero comparison of an unsigned value is always true. "group >= 0U".
2878     		uint32		f = FAST_PATH_SLOT(group, i);
2879
2880     		/* Look for an allocated slot matching the given relid. */
2881     		if (relid != MyProc->fpRelId[f] || FAST_PATH_GET_BITS(MyProc, f) == 0)
2882     			continue;
2883

			regards, tom lane
Re: Add llvm version into the version string
Dmitry Dolgov <9erthali...@gmail.com> writes: > In many jit related bug reports, one of the first questions is often > "which llvm version is used". How about adding it into the > PG_VERSION_STR, similar to the gcc version? I'm pretty down on this idea as presented, first because our version strings are quite long already, and second because llvm is an external library. So I don't have any faith that what llvm-config said at configure time is necessarily the exact version we're using at run time. (Probably, the major version must be the same, but that doesn't prove anything about minor versions.) Is there a way to get the llvm library's version at run time? If so we could consider building a new function to return that. regards, tom lane
Re: Docs pg_restore: Shouldn't there be a note about -n ?
Florents Tselai writes:
> Ah, swapped them by mistake on the previous email:
> They're both available in the pg_dump and note on -n missing in pg_restore.
> The question remains though:
> Shouldn't there be a note about -n in pg_restore ?

Probably.  I see that pg_dump has a third copy of the exact same note
for "-e".  pg_restore lacks that switch for some reason, but this is
surely looking mighty duplicative.  I propose getting rid of the
per-switch Notes and putting a para into the Notes section, along the
lines of

    When -e, -n, or -t is specified, pg_dump makes no attempt to dump
    any other database objects that the selected object(s) might
    depend upon.  Therefore, there is no guarantee that the results of
    a selective dump can be successfully restored by themselves into a
    clean database.

and mutatis mutandis for pg_restore.

			regards, tom lane
Re: pg_checksums: Reorder headers in alphabetical order
Fujii Masao writes: > I don’t have any objections to this commit, but I’d like to confirm > whether we really want to proactively reorder #include directives, > even for standard C library headers. I'm hesitant to do that. We can afford to insist that our own header files be inclusion-order-independent, because we have the ability to fix any problems that might arise. We have no ability to do something about it if the system headers on $random_platform have inclusion order dependencies. (In fact, I'm fairly sure there are already places in plperl and plpython where we know we have to be careful about inclusion order around those languages' headers.) So I would tread pretty carefully around making changes of this type, especially in long-established code. I have no reason to think that the committed patch will cause any problems, but I do think it's mostly asking for trouble. regards, tom lane
Re: First draft of PG 17 release notes
Laurenz Albe writes: > [ assorted corrections ] I fixed a couple of these before seeing your message. regards, tom lane
Re: First draft of PG 17 release notes
Bruce Momjian writes: > Patch applied to PG 17. I don't see a push? regards, tom lane
Re: Cleaning up ERRCODE usage in our XML code
I wrote:
> [ v1-clean-up-errcodes-for-xml.patch ]

Per cfbot, rebased over d5622acb3.  No functional changes.

			regards, tom lane

diff --git a/contrib/xml2/xpath.c b/contrib/xml2/xpath.c
index 0fdf735faf..ef78aa00c8 100644
--- a/contrib/xml2/xpath.c
+++ b/contrib/xml2/xpath.c
@@ -388,7 +388,7 @@ pgxml_xpath(text *document, xmlChar *xpath, xpath_workspace *workspace)
 	/* compile the path */
 	comppath = xmlXPathCtxtCompile(workspace->ctxt, xpath);
 	if (comppath == NULL)
-		xml_ereport(xmlerrcxt, ERROR, ERRCODE_EXTERNAL_ROUTINE_EXCEPTION,
+		xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_ARGUMENT_FOR_XQUERY,
 					"XPath Syntax Error");
 
 	/* Now evaluate the path expression. */
@@ -652,7 +652,7 @@ xpath_table(PG_FUNCTION_ARGS)
 			comppath = xmlXPathCtxtCompile(ctxt, xpaths[j]);
 			if (comppath == NULL)
 				xml_ereport(xmlerrcxt, ERROR,
-							ERRCODE_EXTERNAL_ROUTINE_EXCEPTION,
+							ERRCODE_INVALID_ARGUMENT_FOR_XQUERY,
 							"XPath Syntax Error");
 
 			/* Now evaluate the path expression. */
diff --git a/contrib/xml2/xslt_proc.c b/contrib/xml2/xslt_proc.c
index f30a3a42c0..e761ca5cb5 100644
--- a/contrib/xml2/xslt_proc.c
+++ b/contrib/xml2/xslt_proc.c
@@ -90,7 +90,7 @@ xslt_process(PG_FUNCTION_ARGS)
 								 XML_PARSE_NOENT);
 
 	if (doctree == NULL)
-		xml_ereport(xmlerrcxt, ERROR, ERRCODE_EXTERNAL_ROUTINE_EXCEPTION,
+		xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_XML_DOCUMENT,
 					"error parsing XML document");
 
 	/* Same for stylesheet */
@@ -99,14 +99,14 @@ xslt_process(PG_FUNCTION_ARGS)
 							  XML_PARSE_NOENT);
 
 	if (ssdoc == NULL)
-		xml_ereport(xmlerrcxt, ERROR, ERRCODE_EXTERNAL_ROUTINE_EXCEPTION,
+		xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_XML_DOCUMENT,
 					"error parsing stylesheet as XML document");
 
 	/* After this call we need not free ssdoc separately */
 	stylesheet = xsltParseStylesheetDoc(ssdoc);
 
 	if (stylesheet == NULL)
-		xml_ereport(xmlerrcxt, ERROR, ERRCODE_EXTERNAL_ROUTINE_EXCEPTION,
+		xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_ARGUMENT_FOR_XQUERY,
 					"failed to parse stylesheet");
 
 	xslt_ctxt = xsltNewTransformContext(stylesheet, doctree);
@@ -141,7 +141,7 @@ xslt_process(PG_FUNCTION_ARGS)
 										NULL, NULL, xslt_ctxt);
 
 	if (restree == NULL)
-		xml_ereport(xmlerrcxt, ERROR, ERRCODE_EXTERNAL_ROUTINE_EXCEPTION,
+		xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_ARGUMENT_FOR_XQUERY,
 					"failed to apply stylesheet");
 
 	resstat = xsltSaveResultToString(&resstr, &reslen, restree, stylesheet);
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 41b1a5c6b0..2654c2d7ff 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -742,7 +742,7 @@ xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg, bool indent)
 	{
 		/* If it's a document, saving is easy. */
 		if (xmlSaveDoc(ctxt, doc) == -1 || xmlerrcxt->err_occurred)
-			xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
+			xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
						"could not save document to xmlBuffer");
 	}
 	else if (content_nodes != NULL)
@@ -785,7 +785,7 @@ xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg, bool indent)
 				if (xmlSaveTree(ctxt, newline) == -1 || xmlerrcxt->err_occurred)
 				{
 					xmlFreeNode(newline);
-					xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
+					xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
								"could not save newline to xmlBuffer");
 				}
 			}
@@ -793,7 +793,7 @@ xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg, bool indent)
 			if (xmlSaveTree(ctxt, node) == -1 || xmlerrcxt->err_occurred)
 			{
 				xmlFreeNode(newline);
-				xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
+				xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
							"could not save content to xmlBuffer");
 			}
 		}
@@ -1004,7 +1004,7 @@ xmlpi(const char *target, text *arg, bool arg_is_null, bool *result_is_null)
 
 	if (pg_strcasecmp(target, "xml") == 0)
 		ereport(ERROR,
-				(errcode(ERRCODE_SYNTAX_ERROR), /* really */
+				(errcode(ERRCODE_INVALID_XML_PROCESSING_INSTRUCTION),
 				 errmsg("invalid XML processing instruction"),
 				 errdetail("XML processing instruction target name cannot be \"%s\".",
 						   target)));
@@ -4383,7 +4383,7 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 	xpath_len = VARSIZE_ANY_EXHDR(xpath_expr_text);
 	if (xpath_len == 0)
 		ereport(ERROR,
-				(errcode(ERRCODE_DATA_EXCEPTION),
+				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_XQUERY),
 				 errmsg("empty XPath expression")));
 
 	string = pg_xmlCharStrndup(datastr, len);
@@ -4456,7 +4456,7 @@ xpath_internal(text *xpath_expr_text, xmltype *data, Array
Re: Should rolpassword be toastable?
Nathan Bossart writes: > Here is a v3 patch set that fixes the test comment and a compiler warning > in cfbot. Nitpick: the message should say "%d bytes" not "%d characters", because we're counting bytes. Passes an eyeball check otherwise. regards, tom lane
Re: Why mention to Oracle ?
Tomas Vondra writes: > On 9/20/24 14:36, Marcos Pegoraro wrote: >> Why PostgreSQL DOCs needs to show or compare the Oracle way of doing >> things ? > I didn't dig into all the places you mention, but I'd bet those places > reference Oracle simply because it was the most common DB people either > migrated from or needed to support in their application next to PG, and > thus were running into problems. The similarity of the interfaces and > SQL dialects also likely played a role. It's less likely to run into > subtle behavior differences e.g. SQL Server when you have to rewrite > T-SQL stuff from scratch anyway. As far as the mentions in "Data Type Formatting Functions" go, those are there because those functions are not in the SQL standard; we stole the API definitions for them from Oracle, lock stock and barrel. (Except for the discrepancies that are called out by referencing what Oracle does differently.) A number of the other references probably have similar origins. regards, tom lane
Re: Clock-skew management in logical replication
Nisha Moond writes: > While considering the implementation of timestamp-based conflict > resolution (last_update_wins) in logical replication (see [1]), there > was a feedback at [2] and the discussion on whether or not to manage > clock-skew at database level. FWIW, I cannot see why we would do anything beyond suggesting that people run NTP. That's standard anyway on the vast majority of machines these days. Why would we add complexity that we have to maintain (and document) in order to cater to somebody not doing that? regards, tom lane
Re: attndims, typndims still not enforced, but make the value within a sane threshold
Michael Paquier writes:
> On Fri, Sep 20, 2024 at 11:51:49AM +0800, Junwang Zhao wrote:
>> Should you also bump the catalog version?

> No need to worry about that when sending a patch because committers
> take care of that when merging a patch into the tree.  Doing that in
> each patch submitted just creates more conflicts and work for patch
> authors because they'd need to resolve conflicts each time a
> catversion bump happens.  And that can happen on a daily basis
> sometimes depending on what is committed.

Right.  Sometimes the committer forgets to do that :-(, which is not
great but it's not normally a big problem either.  We've concluded
it's better to err in that direction than impose additional work on
patch submitters.

If you feel concerned about the point, best practice is to include a
mention that a catversion bump is needed in your draft commit message.

			regards, tom lane
Rethinking parallel-scan relation identity checks
In [1] I whined about how the parallel heap scan machinery should have
noticed that the same ParallelTableScanDesc was being used to give out
block numbers for two different relations.  Looking closer, there are
Asserts that mean to catch this type of error --- but they are
comparing relation OIDs, whereas what would have been needed to detect
the problem was to compare RelFileLocators.

It seems to me that a scan is fundamentally operating at the physical
relation level, and therefore these tests should check RelFileLocators
not OIDs.  Hence I propose the attached.  (For master only, of course;
this would be an ABI break in the back branches.)  This passes
check-world and is able to catch the problem exposed in the other
thread.

Another possible view is that we should check both physical and
logical relation IDs, but that seems like overkill to me.

Thoughts?

			regards, tom lane

[1] https://www.postgresql.org/message-id/2042942.1726781733%40sss.pgh.pa.us

diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index dcd04b813d..1859be614c 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -500,8 +500,8 @@ index_parallelscan_initialize(Relation heapRelation, Relation indexRelation,
 											  EstimateSnapshotSpace(snapshot));
 	offset = MAXALIGN(offset);
 
-	target->ps_relid = RelationGetRelid(heapRelation);
-	target->ps_indexid = RelationGetRelid(indexRelation);
+	target->ps_locator = heapRelation->rd_locator;
+	target->ps_indexlocator = indexRelation->rd_locator;
 	target->ps_offset = offset;
 	SerializeSnapshot(snapshot, target->ps_snapshot_data);
@@ -544,7 +544,9 @@ index_beginscan_parallel(Relation heaprel, Relation indexrel, int nkeys,
 	Snapshot	snapshot;
 	IndexScanDesc scan;
 
-	Assert(RelationGetRelid(heaprel) == pscan->ps_relid);
+	Assert(RelFileLocatorEquals(heaprel->rd_locator, pscan->ps_locator));
+	Assert(RelFileLocatorEquals(indexrel->rd_locator, pscan->ps_indexlocator));
+
 	snapshot = RestoreSnapshot(pscan->ps_snapshot_data);
 	RegisterSnapshot(snapshot);
 	scan = index_beginscan_internal(indexrel, nkeys, norderbys, snapshot,
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index e57a0b7ea3..bd8715b679 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -168,7 +168,7 @@ table_beginscan_parallel(Relation relation, ParallelTableScanDesc pscan)
 	uint32		flags = SO_TYPE_SEQSCAN |
 		SO_ALLOW_STRAT | SO_ALLOW_SYNC | SO_ALLOW_PAGEMODE;
 
-	Assert(RelationGetRelid(relation) == pscan->phs_relid);
+	Assert(RelFileLocatorEquals(relation->rd_locator, pscan->phs_locator));
 
 	if (!pscan->phs_snapshot_any)
 	{
@@ -389,7 +389,7 @@ table_block_parallelscan_initialize(Relation rel, ParallelTableScanDesc pscan)
 {
 	ParallelBlockTableScanDesc bpscan = (ParallelBlockTableScanDesc) pscan;
 
-	bpscan->base.phs_relid = RelationGetRelid(rel);
+	bpscan->base.phs_locator = rel->rd_locator;
 	bpscan->phs_nblocks = RelationGetNumberOfBlocks(rel);
 	/* compare phs_syncscan initialization to similar logic in initscan */
 	bpscan->base.phs_syncscan = synchronize_seqscans &&
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index 521043304a..114a85dc47 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -18,6 +18,7 @@
 #include "access/itup.h"
 #include "port/atomics.h"
 #include "storage/buf.h"
+#include "storage/relfilelocator.h"
 #include "storage/spin.h"
 #include "utils/relcache.h"
@@ -62,7 +63,7 @@ typedef struct TableScanDescData *TableScanDesc;
  */
 typedef struct ParallelTableScanDescData
 {
-	Oid			phs_relid;		/* OID of relation to scan */
+	RelFileLocator phs_locator; /* physical relation to scan */
 	bool		phs_syncscan;	/* report location to syncscan logic? */
 	bool		phs_snapshot_any;	/* SnapshotAny, not phs_snapshot_data? */
 	Size		phs_snapshot_off;	/* data for snapshot */
@@ -169,8 +170,8 @@ typedef struct IndexScanDescData
 /* Generic structure for parallel scans */
 typedef struct ParallelIndexScanDescData
 {
-	Oid			ps_relid;
-	Oid			ps_indexid;
+	RelFileLocator ps_locator;	/* physical table relation to scan */
+	RelFileLocator ps_indexlocator; /* physical index relation to scan */
 	Size		ps_offset;		/* Offset in bytes of am specific structure */
 	char		ps_snapshot_data[FLEXIBLE_ARRAY_MEMBER];
 } ParallelIndexScanDescData;
Re: Should rolpassword be toastable?
Nathan Bossart writes: > Oh, actually, I see that we are already validating the hash, but you can > create valid SCRAM-SHA-256 hashes that are really long. So putting an > arbitrary limit (patch attached) is probably the correct path forward. I'd > also remove pg_authid's TOAST table while at it. Shouldn't we enforce the limit in every case in encrypt_password, not just this one? (I do agree that encrypt_password is an okay place to enforce it.) I think you will get pushback from a limit of 256 bytes --- I seem to recall discussion of actual use-cases where people were using strings of a couple of kB. Whatever the limit is, the error message had better cite it explicitly. Also, the ereport call needs an errcode. ERRCODE_PROGRAM_LIMIT_EXCEEDED is probably suitable. regards, tom lane
Re: BUG #18545: \dt breaks transaction, calling error when executed in SET SESSION AUTHORIZATION
I wrote: > Justin Pryzby writes: >> This commit seems to trigger elog(), not reproducible in the >> parent commit. >> 6e086fa2e77 Allow parallel workers to cope with a newly-created session user >> ID. >> postgres=# SET min_parallel_table_scan_size=0; CLUSTER pg_attribute USING >> pg_attribute_relid_attnum_index; >> ERROR: pg_attribute catalog is missing 26 attribute(s) for relation OID >> 70321 > I've been poking at this all day, and I still have little idea what's > going on. Got it, after a good deal more head-scratching. Here are the relevant parts of ParallelWorkerMain:

	/*
	 * We've changed which tuples we can see, and must therefore invalidate
	 * system caches.
	 */
	InvalidateSystemCaches();

	/*
	 * Restore GUC values from launching backend.  We can't do this earlier,
	 * because GUC check hooks that do catalog lookups need to see the same
	 * database state as the leader.
	 */
	gucspace = shm_toc_lookup(toc, PARALLEL_KEY_GUC, false);
	RestoreGUCState(gucspace);

	...

	/* Restore relmapper state. */
	relmapperspace = shm_toc_lookup(toc, PARALLEL_KEY_RELMAPPER_STATE, false);
	RestoreRelationMap(relmapperspace);

InvalidateSystemCaches blows away the worker's relcache. Then RestoreGUCState causes some catalog lookups (tracing shows that restoring default_text_search_config is what triggers this on my setup), and in particular pg_attribute's relcache entry will get constructed to support that. Then we wheel in a new set of relation map entries *without doing anything about what that might invalidate*. In the given test case, the globally-visible relmap says that pg_attribute's relfilenode is, say, X. But we are busy rewriting it, so the parent process has an "active" relmap entry that says pg_attribute's relfilenode is Y. Given the above, the worker process will have built a pg_attribute relcache entry that contains X, and even though it now knows Y is the value it should be using, that information never makes it to the worker's relcache.
The upshot of this is that when the parallel heap scan machinery doles out some block numbers for the parent process to read, and some other block numbers for the worker to read, the worker is reading those block numbers from the pre-clustering copy of pg_attribute, which most likely doesn't match the post-clustering image. This accounts for the missing and duplicate tuples I was seeing in the scan output. Of course, the reason 6e086fa2e made this visible is that before that, any catalog reads triggered by RestoreGUCState were done in an earlier transaction, and then we would blow away the ensuing relcache entries in InvalidateSystemCaches. So there was no bug as long as you assume that the "..." code doesn't cause any catalog reads. I'm not too sure of that though --- it's certainly not very comfortable to assume that functions like SetCurrentRoleId and SetTempNamespaceState will never attempt a catalog lookup. The code has another hazard too, which is that this all implies that the GUC-related catalog lookups will be done against the globally-visible relmap state not whatever is active in the parent process. I have not tried to construct a POC showing that that can give incorrect answers (that is, different from what the parent thinks), but it seems plausible that it could. So the fix seems clear to me: RestoreRelationMap needs to happen before anything that could result in catalog lookups. I'm kind of inclined to move up the adjacent restores of non-transactional low-level stuff too, particularly RestoreReindexState which has direct impact on how catalog lookups are done. Independently of that, it's annoying that the parallel heap scan machinery failed to notice that it was handing out block numbers for two different relfilenodes. I'm inclined to see if we can put some Asserts in there that would detect that. 
This particular bug would have been far easier to diagnose that way, and it hardly seems unlikely that "worker is reading the wrong relation" could happen with other mistakes in future. regards, tom lane
Re: Should rolpassword be toastable?
Nathan Bossart writes: > Hm. It does seem like there's little point in giving pg_authid a TOAST > table, as rolpassword is the only varlena column, and it obviously has > problems. But wouldn't removing it just trade one unhelpful internal error > when trying to log in for another when trying to add a really long password > hash (which hopefully nobody is really trying to do in practice)? I wonder > if we could make this a little more user-friendly. We could put an arbitrary limit (say, half of BLCKSZ) on the length of passwords. regards, tom lane
Re: BUG #18545: \dt breaks transaction, calling error when executed in SET SESSION AUTHORIZATION
Justin Pryzby writes: > This commit seems to trigger elog(), not reproducible in the > parent commit. > 6e086fa2e77 Allow parallel workers to cope with a newly-created session user > ID. > postgres=# SET min_parallel_table_scan_size=0; CLUSTER pg_attribute USING > pg_attribute_relid_attnum_index; > ERROR: pg_attribute catalog is missing 26 attribute(s) for relation OID 70321 I've been poking at this all day, and I still have little idea what's going on. I've added a bunch of throwaway instrumentation, and have managed to convince myself that the problem is that parallel heap scan is broken. The scans done to rebuild pg_attribute's indexes seem to sometimes miss heap pages or visit pages twice (in different workers). I have no idea why this is, and even less idea how 6e086fa2e is provoking it. As you say, the behavior isn't entirely reproducible, but I couldn't make it happen at all after reverting 6e086fa2e's changes in transam/parallel.c, so apparently there is some connection. Another possibly useful data point is that for me it reproduces fairly well (more than one time in two) on x86_64 Linux, but I could not make it happen on macOS ARM64. If it's a race condition, which smells plausible, that's perhaps not hugely surprising. regards, tom lane
Re: detoast datum into the given buffer as a optimization.
Andy Fan writes: > * Note if caller provides a non-NULL buffer, it is the duty of caller > * to make sure it has enough room for the detoasted format (Usually > * they can use toast_raw_datum_size to get the size) This is a pretty awful, unsafe API design. It puts it on the caller to know how to get the detoasted length, and it implies double decoding of the toast datum. > One of the key point is we can always get the varlena rawsize cheaply > without any real detoast activity in advance, thanks to the existing > varlena design. This is not an assumption I care to wire into the API design. How about a variant like struct varlena * detoast_attr_cxt(struct varlena *attr, MemoryContext cxt) which promises to allocate the result in the specified context? That would cover most of the practical use-cases, I think. regards, tom lane
Re: Large expressions in indexes can't be stored (non-TOASTable)
Nathan Bossart writes: > On Wed, Sep 04, 2024 at 03:20:33PM -0400, Jonathan S. Katz wrote: >> On 9/4/24 3:08 PM, Tom Lane wrote: >>> Nathan Bossart writes: >>>> Thanks to commit 96cdeae, only a few catalogs remain that are missing TOAST >>>> tables: pg_attribute, pg_class, pg_index, pg_largeobject, and >>>> pg_largeobject_metadata. I've attached a short patch to add one for >>>> pg_index, which resolves the issue cited here. > Any objections to committing this? Nope. regards, tom lane
Re: Detailed release notes
Marcos Pegoraro writes: > On Wed, Sep 18, 2024 at 06:02, Peter Eisentraut wrote: >> Maybe this shouldn't be done between RC1 and GA. This is clearly a more >> complex topic that should go through a proper review and testing cycle. > I think this is just a question of decision, not reviewing or testing. I'd say the opposite: the thing we lack is exactly testing, in the sense of how non-hackers will react to this. Nonetheless, I'm not upset about trying to do it now. We will get more feedback about major-release notes than minor-release notes. And the key point is that it's okay to consider this experimental. Unlike say a SQL feature, there are no compatibility concerns that put a premium on getting it right the first time. We can adjust the annotations or give up on them without much cost. regards, tom lane
Re: Detailed release notes
Jelte Fennema-Nio writes: > On Wed, 18 Sept 2024 at 02:55, Bruce Momjian wrote: >>> Also very clutter-y. I'm not convinced that any of this is a good >>> idea that will stand the test of time: I estimate that approximately >>> 0.01% of people who read the release notes will want these links. >> Yes, I think 0.01% is accurate. > I think that is a severe underestimation. I'm sure a very large fraction of the people commenting on this thread would like to have these links. But we are by no means representative of the average Postgres user. regards, tom lane
Re: attndims, typndims still not enforced, but make the value within a sane threshold
jian he writes: > Can we error out at the stage "create table", "create domain" > time if the attndims or typndims is larger than MAXDIM (6) ? The last time this was discussed, I think the conclusion was we should remove attndims and typndims entirely on the grounds that they're useless. I certainly don't see a point in adding more logic that could give the misleading impression that they mean something. regards, tom lane
Re: DROP OWNED BY fails to clean out pg_init_privs grants
Egor Chindyaskin writes: > This query does not expect that the test database may already contain some > information about the custom user that ran test_pg_dump. I'm perfectly content to reject this as being an abuse of the test case. Our TAP tests are built on the assumption that they use databases created within the test case. Apparently, you've found a way to use the meson test infrastructure to execute a TAP test in the equivalent of "make installcheck" rather than "make check" mode. I am unwilling to buy into the proposition that our TAP tests should be proof against doing that after making arbitrary changes to the database's initial state. If anything, the fact that this is possible is a bug in our meson scripts. regards, tom lane
Re: BUG #18545: \dt breaks transaction, calling error when executed in SET SESSION AUTHORIZATION
Justin Pryzby writes: > This commit seems to trigger elog(), not reproducible in the > parent commit. Yeah, I can reproduce that. Will take a look tomorrow. regards, tom lane
Re: Detailed release notes
Bruce Momjian writes: > On Tue, Sep 17, 2024 at 03:29:54PM -0300, Marcos Pegoraro wrote: >> On Tue, Sep 17, 2024 at 05:12, Alvaro Herrera wrote: >>> Add backend support for injection points (Michael Paquier) [commit 1] [2] > I think trying to add text to each item is both redundant and confusing, Also very clutter-y. I'm not convinced that any of this is a good idea that will stand the test of time: I estimate that approximately 0.01% of people who read the release notes will want these links. But if we keep it I think the annotations have to be very unobtrusive. (Has anyone looked at the PDF rendering of this? It seems rather unfortunate to me.) regards, tom lane
Re: Regression tests fail with tzdata 2024b
Wolfgang Walther writes: > The core regression tests need to be run with a timezone that tests > special cases in the timezone handling code. But that might not be true > for extensions - all they want could be a stable output across major and > minor versions of postgres and versions of tzdata. It could be helpful > to set pg_regress' timezone to UTC, for example? I would not recommend that choice. It would mask simple errors such as failing to apply the correct conversion between timestamptz and timestamp values. Also, if you have test cases that are affected by this issue at all, you probably have a need/desire to test things like DST transitions. regards, tom lane
Re: Regression tests fail with tzdata 2024b
Sven Klemm writes: > On Mon, Sep 16, 2024 at 5:19 PM Tom Lane wrote: >> Configurable to what? If your test cases are dependent on the >> historical behavior of PST8PDT, you're out of luck, because that >> simply isn't available anymore (or won't be once 2024b reaches >> your platform, anyway). > I was wondering whether the timezone used by pg_regress could be made > configurable. Yes, I understood that you were suggesting that. My point is that it wouldn't do you any good: you will still have to change any regression test cases that depend on behavior PST8PDT has/had that is different from America/Los_Angeles. That being the case, I don't see much value in making it configurable. regards, tom lane
Re: Document DateStyle effect on jsonpath string()
"David E. Wheeler" writes: > BTW, will the back-patch to 17 (cc4fdfa) be included in 17.0 or 17.1? 17.0. If we were already past 17.0 I'd have a lot more angst about changing this behavior. regards, tom lane
Re: Psql meta-command conninfo+
Alvaro Herrera writes: > On 2024-Sep-16, Jim Jones wrote: >> * The value of "Current User" does not match the function current_user() >> --- as one might expect. It is a little confusing, as there is no >> mention of "Current User" in the docs. In case this is the intended >> behaviour, could you please add it to the docs? > It is intended. As Peter said[1], what we wanted was to display > client-side info, so PQuser() is the right thing to do. Now maybe > "Current User" is not the perfect column header, but at least the > definition seems consistent with the desired end result. Seems like "Session User" would be closer to being accurate, since PQuser()'s result does not change when you do SET ROLE etc. > Now, I think > the current docs saying to look at session_user() are wrong, they should > point to the libpq docs for the function instead; something like "The > name of the current user, as returned by PQuser()" and so on. Sure, but this does not excuse choosing a misleading column name when there are better choices readily available. regards, tom lane
Re: Regression tests fail with tzdata 2024b
Sven Klemm writes: > This is an unfortunate change as this will break extensions tests using > pg_regress for testing. We run our tests against multiple minor versions > and this getting backported means our tests will fail with the next minor > pg release. Is there a workaround available to make the timezone for > pg_regress configurable without going into every test? Configurable to what? If your test cases are dependent on the historical behavior of PST8PDT, you're out of luck, because that simply isn't available anymore (or won't be once 2024b reaches your platform, anyway). regards, tom lane
Re: Regression tests fail with tzdata 2024b
Wolfgang Walther writes: > Tom Lane: >> Also, as a real place to a greater extent >> than "PST8PDT" is, it's more subject to historical revisionism when >> somebody turns up evidence of local law having been different than >> TZDB currently thinks. > I now tried all versions of tzdata which we had in tree back to 2018g, > they all work fine with the same regression test output. 2018g was an > arbitrary cutoff, I just didn't try any further. Yeah, my belly-aching above is just about hypothetical future instability. In reality, I'm sure America/Los_Angeles is pretty well researched and so it's unlikely that there will be unexpected changes in its zone history. > In the end, we don't need a default timezone that will never change. We do, really. For example, there's a nonzero chance the USA will cancel DST altogether at some future time. (This would be driven by politicians who don't remember the last time, but there's no shortage of those.) That'd likely break some of our test results, and even if it happened not to, it'd still be bad because we'd probably lose some coverage of the DST-transition-related code paths in src/timezone/. So I'd really be way happier with a test timezone that never changes but does include DST behavior. I thought PST8PDT fit those requirements pretty well, and I'm annoyed at Eggert for having tossed it overboard for no benefit whatever. But I don't run tzdb, so here we are. > We just need one that didn't change in a reasonable number of > releases going backwards. We've had this sort of fire-drill before, e.g. commit 8d7af8fbe. It's no fun, and the potential surface area for unexpected changes is now much greater than the few tests affected by that change. But here we are, so I pushed your patch with a couple of other cosmetic bits. There are still a couple of references to PST8PDT in the tree, but they don't appear to care what the actual meaning of that zone is, so I left them be. regards, tom lane
Re: A starter task
sia kc writes: > About reply to all button, I think only sending to mailing list address > should suffice. Why including previous recipients too? It's a longstanding habit around here for a couple of reasons: * The mail list servers are occasionally slow. (Our infrastructure is way better than it once was, but that still happens sometimes.) If you directly cc: somebody, they can reply to that copy right away whether or not they get a copy from the list right away. * pgsql-hackers is a fire hose. cc'ing people who have shown interest in the thread is useful because they will get those copies separately from the list traffic, and so they can follow the thread without having to dig through all the traffic. regards, tom lane
Re: how to speed up 002_pg_upgrade.pl and 025_stream_regress.pl under valgrind
Thomas Munro writes: > (An interesting archeological detail about the regression tests is > that they seem to derive from the Wisconsin benchmark, famous for > benchmark wars and Oracle lawyers[1]. This is quite off-topic for the thread, but ... we actually had an implementation of the Wisconsin benchmark in src/test/bench, which we eventually removed (a05a4b478). It does look like the modern regression tests borrowed the definitions of "tenk1" and some related tables from there, but I think it'd be a stretch to say the tests descended from it. regards, tom lane
Re: A starter task
Tomas Vondra writes: > Presumably a new contributor will start by discussing the patch first, > and won't waste too much time on it. Yeah, that is a really critical piece of advice for a newbie: no matter what size of patch you are thinking about, a big part of the job will be to sell it to the rest of the community. It helps a lot to solicit advice while you're still at the design stage, before you spend a lot of time writing code you might have to throw away. Stuff that is on the TODO list has a bit of an advantage here, because that indicates there's been at least some interest and previous discussion. But that doesn't go very far, particularly if there was not consensus about how to do the item. Job 1 is to build that consensus. regards, tom lane
Re: A starter task
Tomas Vondra writes: > I think you can take a look at https://wiki.postgresql.org/wiki/Todo and > see if there's a patch/topic you would be interested in. It's really > difficult to "assign" a task based on a single sentence, with no info > about the person (experience with other projects, etc.). Beware that that TODO list is poorly maintained, so items may be out of date. Worse, most of what's there got there because it's hard, or there's not consensus about what the feature should look like, or both. So IMO it's not a great place for a beginner to start. > FWIW, maybe it'd be better to start by looking at existing patches and > do a bit of a review, learn how to apply/test those and learn from them. Yeah, this is a good way to get some exposure to our code base and development practices. regards, tom lane
Re: Trim the heap free memory
I wrote: > The single test case you showed suggested that maybe we could > usefully prod glibc to free memory at query completion, but we > don't need all this interrupt infrastructure to do that. I think > we could likely get 95% of the benefit with about a five-line > patch.

To try to quantify that a little, I wrote a very quick-n-dirty patch to apply malloc_trim during finish_xact_command and log the effects. (I am not asserting this is the best place to call malloc_trim; it's just one plausible possibility.) Patch attached, as well as statistics collected from a run of the core regression tests followed by

	grep malloc_trim postmaster.log | sed 's/.*LOG:/LOG:/' | sort -k4n | uniq -c >trim_savings.txt

We can see that out of about 43K test queries, 32K saved nothing whatever, and in only four was more than a couple of meg saved. That's pretty discouraging IMO. It might be useful to look closer at the behavior of those top four though. I see them as

2024-09-15 14:58:06.146 EDT [960138] LOG: malloc_trim saved 7228 kB
2024-09-15 14:58:06.146 EDT [960138] STATEMENT: ALTER TABLE delete_test_table ADD PRIMARY KEY (a,b,c,d);

2024-09-15 14:58:09.861 EDT [960949] LOG: malloc_trim saved 12488 kB
2024-09-15 14:58:09.861 EDT [960949] STATEMENT:
	with recursive search_graph(f, t, label, is_cycle, path) as (
		select *, false, array[row(g.f, g.t)] from graph g
		union distinct
		select g.*, row(g.f, g.t) = any(path), path || row(g.f, g.t)
		from graph g, search_graph sg
		where g.f = sg.t and not is_cycle
	)
	select * from search_graph;

2024-09-15 14:58:09.866 EDT [960949] LOG: malloc_trim saved 12488 kB
2024-09-15 14:58:09.866 EDT [960949] STATEMENT:
	with recursive search_graph(f, t, label) as (
		select * from graph g
		union distinct
		select g.* from graph g, search_graph sg
		where g.f = sg.t
	) cycle f, t set is_cycle to 'Y' default 'N' using path
	select * from search_graph;

2024-09-15 14:58:09.853 EDT [960949] LOG: malloc_trim saved 12616 kB
2024-09-15 14:58:09.853 EDT [960949] STATEMENT:
	with recursive search_graph(f, t, label) as (
		select * from graph0 g
		union distinct
		select g.* from graph0 g, search_graph sg
		where g.f = sg.t
	) search breadth first by f, t set seq
	select * from search_graph order by seq;

I don't understand why WITH RECURSIVE queries might be more prone to leave non-garbage-collected memory behind than other queries, but maybe that is worth looking into. regards, tom lane

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8bc6bea113..9efb4f7636 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -21,6 +21,7 @@
 #include <fcntl.h>
 #include <limits.h>
+#include <malloc.h>
 #include <signal.h>
 #include <unistd.h>
 #include <sys/resource.h>
@@ -2797,6 +2798,16 @@ finish_xact_command(void)
 		MemoryContextStats(TopMemoryContext);
 #endif
 
+		{
+			char	   *old_brk = sbrk(0);
+			char	   *new_brk;
+
+			malloc_trim(0);
+			new_brk = sbrk(0);
+			elog(LOG, "malloc_trim saved %zu kB",
+				 (old_brk - new_brk + 1023) / 1024);
+		}
+
 		xact_started = false;
 	}
 }

32293 LOG: malloc_trim saved 0 kB
    4 LOG: malloc_trim saved 4 kB
   12 LOG: malloc_trim saved 8 kB
    7 LOG: malloc_trim saved 12 kB
   57 LOG: malloc_trim saved 16 kB
    3 LOG: malloc_trim saved 20 kB
   22 LOG: malloc_trim saved 24 kB
   14 LOG: malloc_trim saved 28 kB
  288 LOG: malloc_trim saved 32 kB
   20 LOG: malloc_trim saved 36 kB
   26 LOG: malloc_trim saved 40 kB
   18 LOG: malloc_trim saved 44 kB
   26 LOG: malloc_trim saved 48 kB
   27 LOG: malloc_trim saved 52 kB
   37 LOG: malloc_trim saved 56 kB
   26 LOG: malloc_trim saved 60 kB
  218 LOG: malloc_trim saved 64 kB
   20 LOG: malloc_trim saved 68 kB
   44 LOG: malloc_trim saved 72 kB
   44 LOG: malloc_trim saved 76 kB
   45 LOG: malloc_trim saved 80 kB
   29 LOG: malloc_trim saved 84 kB
   91 LOG: malloc_trim saved 88 kB
   31 LOG: malloc_trim saved 92 kB
  191 LOG: malloc_trim saved 96 kB
   30 LOG: malloc_trim saved 100 kB
   81 LOG: malloc_trim saved 104 kB
   24 LOG: malloc_trim saved 108 kB
  214 LOG: malloc_trim saved 112 kB
   32 LOG: malloc_trim saved 116 kB
  178 LOG: malloc_trim saved 120 kB
   86 LOG: malloc_trim saved 124 kB
 4498 LOG: malloc_trim saved 128 kB
   29 LOG: malloc_trim saved 132 kB
  286 LOG: malloc_trim saved 136 kB
   34 LOG: malloc_trim saved 140 kB
  563 LOG: malloc_trim saved 144 kB
   20 LOG: malloc_trim saved 148 kB
  821 LOG: malloc_trim saved 152 kB
  987 LOG: malloc_trim saved 156 kB
  212 LOG: malloc_trim saved 160 kB
    8 LOG
Re: how to speed up 002_pg_upgrade.pl and 025_stream_regress.pl under valgrind
Tomas Vondra writes: > [ 002_pg_upgrade and 027_stream_regress are slow ] > I don't have a great idea how to speed up these tests, unfortunately. > But one of the problems is that all the TAP tests run serially - one > after each other. Could we instead run them in parallel? The tests setup > their "private" clusters anyway, right? But there's parallelism within those two tests already, or I would hope so at least. If you run them in parallel then you are probably causing 40 backends instead of 20 to be running at once (plus 40 valgrind instances). Maybe you have a machine beefy enough to make that useful, but I don't. Really the way to fix those two tests would be to rewrite them to not depend on the core regression tests. The core tests do a lot of work that's not especially useful for the purposes of those tests, and it's not even clear that they are exercising all that we'd like to have exercised for those purposes. In the case of 002_pg_upgrade, all we really need to do is create objects that will stress all of pg_dump. It's a little harder to scope out what we want to test for 027_stream_regress, but it's still clear that the core tests do a lot of work that's not helpful. regards, tom lane
Re: Trim the heap free memory
shawn wang writes: > I have successfully registered my patch for the commitfest. However, upon > integration, I encountered several errors during the testing phase. I am > currently investigating the root causes of these issues and will work on > providing the necessary fixes. I should think the root cause is pretty obvious: malloc_trim() is a glibc-ism. I'm fairly doubtful that this is something we should spend time on. It can never work on any non-glibc platform. Even granting that a Linux-only feature could be worth having, I'm really doubtful that our memory allocation patterns are such that malloc_trim() could be expected to free a useful amount of memory mid-query. The single test case you showed suggested that maybe we could usefully prod glibc to free memory at query completion, but we don't need all this interrupt infrastructure to do that. I think we could likely get 95% of the benefit with about a five-line patch. regards, tom lane