Re: Document DateStyle effect on jsonpath string()
On Jul 4, 2024, at 04:28, jian he wrote: > Do you need to reset the datestyle? Wouldn’t hurt but it’s not necessary, no. It’s set only for the execution of this file, and there are no more calls that rely on it. > also the above query is time zone sensitive, maybe the time zone is > set in another place, but that's not explicit? It’s implicit in how PostgreSQL runs its test suite; other tests later change it. > > -String value converted from a JSON boolean, number, string, or > datetime > +String value converted from a JSON boolean, number, string, or > +datetime. Note that the string output of datetimes is determined by > +the parameter. > > imho, your patch has just too many examples. I’m confused. There are no examples in my patch, or this bit you cite. > for explaining the above sentence, the following example should be enough. Are you referring to the tests? I made them comprehensive so that we reliably demonstrate the behavior of the string() method on all the date/time data types. They are not examples, not in the documentation sense at least. Best, David
Document DateStyle effect on jsonpath string()
Hackers, In fuzing around trying to work out what’s going on with the formatting of timestamptz values cast by the timestamp_tz() jsonpath method[1], I noticed that the formatting of the string() method applied to date and time objects was not fully tested, or how the output is determined by the DateStyle method. The attached path aims to rectify this situation by adding tests that chain string() after the jsonpath date/time methods, both with the default testing “PostreSQL” DateStyle and “ISO”. It also mentions the impact of the DateStyle parameter in the string() documentation. Also available to review as a pull request[2]. Best, David [1]: https://www.postgresql.org/message-id/7DE080CE-6D8C-4794-9BD1-7D9699172FAB%40justatheory.com [2]: https://github.com/theory/postgres/pull/7/files v1-0001-Document-impact-of-datestyle-on-jsonpath-string.patch Description: Binary data
Re: jsonpath: Inconsistency of timestamp_tz() Output
On Jul 1, 2024, at 11:02, David E. Wheeler wrote: > Anyway, should the output of timestamptz JSONB values be made more > consistent? I’m happy to make a patch to do so, but could use a hand figuring > out where the behavior varies. I think if the formatting was more consistent, the test output would be: ``` patch --- a/src/test/regress/expected/jsonb_jsonpath.out +++ b/src/test/regress/expected/jsonb_jsonpath.out @@ -2914,7 +2914,7 @@ HINT: Use *_tz() function for time zone support. select jsonb_path_query_tz('"2023-08-15"', '$.timestamp_tz()'); -- should work jsonb_path_query_tz - - "2023-08-15T07:00:00+00:00" + "2023-08-15T00:00:00-07:00" (1 row) select jsonb_path_query('"12:34:56"', '$.timestamp_tz()'); @@ -3003,7 +3003,7 @@ HINT: Use *_tz() function for time zone support. select jsonb_path_query_tz('"2023-08-15 12:34:56"', '$.timestamp_tz()'); -- should work jsonb_path_query_tz - - "2023-08-15T12:34:56+00:00" + "2023-08-15T12:34:56+10:00" (1 row) select jsonb_path_query('"10-03-2017 12:34"', '$.datetime("dd-mm- HH24:MI")'); ``` That second example is a bit different than I noticed up-thread, not just a formatting issue but the offset is never applied!. That test run under tz +10, and this is how it works with the non-JSONB data types: david=# set time zone '+10'; SET Time: 0.689 ms david=# select '2023-08-15 12:34:56'::timestamptz; timestamptz 2023-08-15 12:34:56+10 (1 row) Time: 0.491 ms david=# select '2023-08-15 12:34:56'::timestamp::timestamptz; timestamptz 2023-08-15 12:34:56+10 (1 row) Best, David
jsonpath: Inconsistency of timestamp_tz() Output
Hackers, There’s an odd difference in the behavior of timestamp_tz() outputs. Running with America/New_York as my TZ, it looks fine for a full timestamptz, identical to how casting the types directly works: david=# set time zone 'America/New_York'; SET david=# select '2024-08-15 12:34:56-04'::timestamptz; timestamptz 2024-08-15 12:34:56-04 (1 row) david=# select jsonb_path_query_tz('"2024-08-15 12:34:56-04"', '$.timestamp_tz()'); jsonb_path_query_tz - "2024-08-15T12:34:56-04:00" Both show the time in America/New_York, which is great. But when casting from a date, the behavior differs. Casting directly: david=# select '2024-08-15'::date::timestamptz; timestamptz 2024-08-15 00:00:00-04 It stringifies to the current zone setting again, as expected. But look at the output from a path query: david=# select jsonb_path_query_tz('"2023-08-15"', '$.timestamp_tz()'); jsonb_path_query_tz - "2023-08-15T04:00:00+00:00" It’s using UTC for the display output! Shouldn’t it be using America/New_York? Note that I’m comparing a cast from date to timestamptz because that’s how the jsonpath parsing works[1]: it ultimately uses date2timestamptz_opt_overflow()[2] to make the conversion, which appears to set the offset from the time zone GUC, so I’m not sure where it’s converted to UTC before stringifying. Maybe an argument is missing from the stringification path? FWIW, explicitly calling the string() jsonpath method does produce a result in the current TZ: david=# select jsonb_path_query_tz('"2023-08-15"', '$.timestamp_tz().string()'); jsonb_path_query_tz -- "2023-08-15 00:00:00-04" That bit uses timestamptz_out to format the output, but JSONB has its own stringification[4] (called here[5]), but I can’t tell what might be different between a timestamptz cast from a date and one not cast from a date. Note the same divergency in behavior occurs when the source value is a timestamp, too. Compare: david=# select '2024-08-15 12:34:56'::timestamp::timestamptz; timestamptz 2024-08-15 12:34:56-04 (1 row) david=# select jsonb_path_query_tz('"2023-08-15 12:34:56"', '$.timestamp_tz()'); jsonb_path_query_tz - "2023-08-15T16:34:56+00:00" (1 row) Anyway, should the output of timestamptz JSONB values be made more consistent? I’m happy to make a patch to do so, but could use a hand figuring out where the behavior varies. Best, David [1]: https://github.com/postgres/postgres/blob/3497c87/src/backend/utils/adt/jsonpath_exec.c#L2708-L2718 [2]: https://github.com/postgres/postgres/blob/3497c87/src/backend/utils/adt/date.c#L613-L698 [3]: https://github.com/postgres/postgres/blob/3fb59e789dd9f21610101d1ec106ad58095e24f3/src/backend/utils/adt/jsonpath_exec.c#L1650-L1653 [4]: https://github.com/postgres/postgres/blob/3fb59e789dd9f21610101d1ec106ad58095e24f3/src/backend/utils/adt/json.c#L369-L407 [5]: https://github.com/postgres/postgres/blob/3fb59e7/src/backend/utils/adt/jsonb.c#L743-L748
Re: Proposal: Document ABI Compatibility
On Jun 27, 2024, at 17:48, Jeremy Schneider wrote: > Minor nit - misspelled “considerd" Thank you, Jeremy. V3 attached. Best, David v3-0001-Add-API-an-ABI-guidance-to-the-C-language-docs.patch Description: Binary data
Re: Patch bug: Fix jsonpath .* on Arrays
On Jun 27, 2024, at 04:17, Michael Paquier wrote: > The tests of jsonb_jsonpath.sql include a lot of patterns for @? and > jsonb_path_query with the lax and strict modes, so shouldn't these > additions be grouped closer to the existing tests rather than added at > the end of the file? I think you could argue that they should go with other tests for array unwrapping, though it’s kind of mixed throughout. But that’s more the bit I was testing; almost all the tests are lax, and it’s less the strict behavior to test here than the explicit behavior of unwrapping in lax mode. But ultimately I don’t care where they go, just that we have them. D
Re: Proposal: Document ABI Compatibility
On Jun 26, 2024, at 15:20, David E. Wheeler wrote: > CF: https://commitfest.postgresql.org/48/5080/ > PR: https://github.com/theory/postgres/pull/6 Aaaand v2 without the unnecessary formatting of unrelated documentation 臘♂️. Best, David v2-0001-Add-API-an-ABI-guidance-to-the-C-language-docs.patch Description: Binary data
Re: Proposal: Document ABI Compatibility
On Jun 26, 2024, at 15:14, David E. Wheeler wrote: > Okay here’s a patch that adds the proposed API and ABI guidance to the C > Language docs. The content is the same as Peter proposed, with some light > copy-editing. CF: https://commitfest.postgresql.org/48/5080/ PR: https://github.com/theory/postgres/pull/6 D
Re: Proposal: Document ABI Compatibility
On Jun 25, 2024, at 13:55, David E. Wheeler wrote: > Oh man this is fantastic, thank you! I’d be more than happy to just turn this > into a patch. But where should it go? Upthread I assumed xfunc.sgml, and > still think that’s a likely candidate. Perhaps I’ll just start there --- > unless someone thinks it should go somewhere other than the docs. Okay here’s a patch that adds the proposed API and ABI guidance to the C Language docs. The content is the same as Peter proposed, with some light copy-editing. Best, David v1-0001-Add-API-an-ABI-guidance-to-the-C-language-docs.patch Description: Binary data
Re: Patch bug: Fix jsonpath .* on Arrays
On Jun 25, 2024, at 13:48, David E. Wheeler wrote: > I have since realized it’s not a complete fix for the issue, and hacked > around it in my Go version. Would be fine to remove that bit, but IIRC this > was the only execution function that would return `jperNotFound` when it in > fact adds items to the `found` list. The current implementation only looks at > one or the other, so it’s not super important, but I found the inconsistency > annoying and sometimes confusing. I’ve removed this change. >> [1] select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', 'strict $.*'); >> I propose adding a similar test with explicitly specified lax mode: select >> jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', 'lax $.*'); to show what lax mode >> is set by default. > > Very few of the other tests do so; I can add it if it’s important for this > case, though. Went ahead and added lax. > @? suppresses a number of errors. Perhaps I should add a variant of the > error-raising query that passes the silent arg, which would also suppress the > error. Added a variant where the silent param suppresses the error, too. V2 attached and the PR updated: https://github.com/theory/postgres/pull/4/files Best, David v2-0001-Add-tests-for-jsonpath-.-on-arrays.patch Description: Binary data
Re: Proposal: Document ABI Compatibility
On Jun 26, 2024, at 04:48, Laurenz Albe wrote: > Perhaps such information should go somewhere here: > https://www.postgresql.org/support/versioning/ This seems deeper and more detailed than what’s there now, but I can certainly imagine wanting to include this policy on the web site. That said, it didn’t occur to me to look under support when trying to find a place to put this; I was looking under Developers, on the principle that extension developers would look there. Best, David
Re: RFC: Additional Directory for Extensions
On Jun 25, 2024, at 10:43 AM, Robert Haas wrote: > If we want to work on making the sorts of changes that > you're proposing, let's do it on a separate thread. It's not going to > be meaningfully harder to move in that direction after some patch like > this than it is today. I appreciate this separation of concerns, Robert. In other news, here’s an updated patch that expands the documentation to record that the destination directory is a prefix, and full paths should be used under it. Also take the opportunity to document the PGXS DESTDIR variable as the thing to use to install files under the destination directory. It still requires a server restart; I can change it back to superuser-only if that’s the consensus. For those who prefer a GitHub patch review experience, see this PR: https://github.com/theory/postgres/pull/3/files Best, David v4-0001-Add-extension_destdir-GUC.patch Description: Binary data
Re: Proposal: Document ABI Compatibility
On Jun 25, 2024, at 7:33 AM, Peter Eisentraut wrote: > I took at a stab at this, using some of your text, but discussing API and ABI > separately. Oh man this is fantastic, thank you! I’d be more than happy to just turn this into a patch. But where should it go? Upthread I assumed xfunc.sgml, and still think that’s a likely candidate. Perhaps I’ll just start there --- unless someone thinks it should go somewhere other than the docs. Best, David
Re: Patch bug: Fix jsonpath .* on Arrays
On Jun 25, 2024, at 12:46 AM, Степан Неретин wrote: > Hi! Looks good to me, but I have several comments. Thanks for your review! > Your patch improves tests, but why did you change formatting in > jsonpath_exec.c? What's the motivation? It’s not just formatting. From the commit message: > While at it, teach `executeAnyItem()` to return `jperOk` when `found` > exist, not because it will be used (the result and `found` are inspected > by different functions), but because it seems like the proper thing to > return from `executeAnyItem()` when considered in isolation. I have since realized it’s not a complete fix for the issue, and hacked around it in my Go version. Would be fine to remove that bit, but IIRC this was the only execution function that would return `jperNotFound` when it in fact adds items to the `found` list. The current implementation only looks at one or the other, so it’s not super important, but I found the inconsistency annoying and sometimes confusing. > [1] select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', 'strict $.*'); > I propose adding a similar test with explicitly specified lax mode: select > jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', 'lax $.*'); to show what lax mode > is set by default. Very few of the other tests do so; I can add it if it’s important for this case, though. > Odd empty result for the test: select jsonb '[1,2,3,{"b": [3,4,5]}]' @? > 'strict $.*'; > I expected an error like in test [1]. This behavior is not obvious to me. @? suppresses a number of errors. Perhaps I should add a variant of the error-raising query that passes the silent arg, which would also suppress the error. > Everything else is cool. Thanks to the patch and the discussion above, I > began to understand better how wildcards in JSON work. Yeah, it’s kind of wild TBH. Best, David
Re: RFC: Additional Directory for Extensions
On Jun 24, 2024, at 5:32 PM, Jelte Fennema-Nio wrote: > Still, for the sake of completeness it might make sense to support > this whole list in extension_destdir. (assuming it's easy to do) It should be with the current patch, which just uses a prefix to paths in `pg_config`. So if SHAREDIR is set to /usr/share/postgresql/16 and extension_destdir is set to /mount/ext, then Postgres will look for files in /mount/ext/usr/share/postgresql/16. The same rule applies (or should apply) for all other pg_config directory configs and where the postmaster looks for specific files. And PGXS already supports installing files in these locations, thanks to its DESTDIR param. (I don’t know how it works on Windows, though.) That said, this is very much a pattern designed for RPM and Debian package management patterns, and not for actually installing and managing extensions. And maybe that’s fine for now, as it can still be used to address the immutability problems descried in the original post in this thread. Ultimately, I’d like to figure out a way to more tidily organize installed extension files, but I think that, too, might be a separate thread. Best, David
Re: RFC: Additional Directory for Extensions
On Jun 24, 2024, at 17:17, Jelte Fennema-Nio wrote: > If you want to only change $libdir during CREATE EXTENSION (or ALTER > EXTENSION UPDATE), then why not just change it there. And really you'd > only want to change it when creating an extension from which the > control file is coming from extension_destdir. IIUC, the postmaster needs to load an extension on first use in every session unless it’s in shared_preload_libraries. > However, I can also see a case for really always changing $libdir. > Because some extensions in shared_preload_libraries, might want to > trigger loading other libraries that they ship with dynamically. And > these libraries are probably also in extension_destdir. Right, it can be more than just the DSOs for the extension itself. Best, David
Re: RFC: Additional Directory for Extensions
On Jun 24, 2024, at 4:28 PM, Robert Haas wrote: > As long as the GUC is superuser-only, I'm not sure what else there is > to do here. The only question is whether there's some reason to > disallow this even from the superuser, but I'm not quite seeing such a > reason. I can switch it back from requiring a restart to allowing a superuser to set it. >> I sketched them quickly, so agree they can be better. Reading the code, I >> now see that it appears to be the former case. I’d like to advocate for the >> latter. > > Sounds good. Yeah, though then I have a harder time deciding how it should work. pg_config’s paths are absolute. With your first example, we just use them exactly as they are, but prefix them with the destination directory. So if it’s set to `/my/temp/root/`, then files go into /my/temp/root/$(pg_conifg --sharedir) /my/temp/root/$(pg_conifg --pkglibdir) /my/temp/root/$(pg_conifg --bindir) # etc. Which is exactly how RPM and Apt packages are built, but seems like an odd configuration for general use. >> BINDIR >> DOCDIR >> HTMLDIR >> PKGINCLUDEDIR >> LOCALEDIR >> MANDIR >> >> I can imagine an extension wanting or needing to use any and all of these. > > Are these really all relevant to backend code? Oh I think so. Especially BINDIR; lots of extensions ship with binary applications. And most ship with docs, too (PGXS puts items listed in DOCS into DOCDIR). Some might also produce man pages (for their binaries), HTML docs, and other stuff. Maybe an FTE extension would include locale files? I find it pretty easy to imagine use cases for all of them. So much so that I wrote an extension binary distribution RFC[1] and its POC[2] around them. Best, David [1]: https://github.com/orgs/pgxn/discussions/2 [1]: https://justatheory.com/2024/06/trunk-poc/
Re: Proposal: Document ABI Compatibility
On Jun 19, 2024, at 05:41, Peter Eisentraut wrote: > This is probably a bit confusing. This might as well mean client application > code against libpq. Better something like "server plugin code that uses the > PostgreSQL server APIs". That works. > But now we're talking about API. That might be subject of another document > or another section in this one, but it seems confusing to mix this with the > ABI discussion. Hrm. They’re super closely-related in my mind, as an extension developer. I need to know both! I guess I’m taking of this policy as what I can expect may be changed (and how to adapt to it) and what won’t. That said, I’m fine to remove the API stuff if there’s consensus objecting to it, to be defined in a separate policy (perhaps on the same doc page). >> PostgreSQL avoids unnecessary API changes in major releases, but usually >> ships a few necessary API changes, including deprecation, renaming, and >> argument variation. > > Obviously, as a practical matter, there won't be random pointless changes. > But I wouldn't go as far as writing anything down about how these APIs are > developed. Fair enough, was trying to give some idea of the sorts of changes. Don’t have to include them. >> In such cases the incompatible changes will be listed in the Release Notes. > > I don't think anyone is signing up to do that. It needn’t be comprehensive. Just mention that an ABI or API changed in the release note item. Unless they almost *all* make such changes. >> Minor Releases >> -- > I think one major problem besides actively avoiding or managing such > minor-version ABI breaks is some automated detection. Otherwise, this just > means "we try, but who knows”. I think you *do* try, and the fact that there are so few issues means you succeed at that. I’m not advocating for an ABI guarantee here, just a description of the policy committees already follow. Here’s an update based on all the feedback, framing things more from the perspective of “do I need to recompile this or change my code”. Many thanks! ``` md ABI Policy == Changes to the the PostgreSQL server APIs may require recompilation of server plugin code that uses them. This policy describes the core team's approach to such changes, and what server API users can expect. Major Releases -- Applications that use server APIs must be compiled for each major release supported by the application. The inclusion of `PG_MODULE_MAGIC` ensures that code compiled for one major version will rejected by other major versions. Developers needing to support multiple versions of PostgreSQL with incompatible APIs should use the `PG_VERSION_NUM` constant to adjust code as appropriate. For example: ``` c #if PG_VERSION_NUM >= 16 #include "varatt.h" #endif ``` The core team avoids unnecessary breakage, but users of the server APIs should expect and be prepared to make adjustments and recompile for every major release. Minor Releases -- PostgreSQL makes an effort to avoid server API and ABI breaks in minor releases. In general, an application compiled against any minor release will work with any other minor release, past or future. In the absence of automated detection of such changes, this is not a guarantee, but history such breaking changes have been extremely rare. When a change *is* required, PostgreSQL will choose the least invasive change possible, for example by squeezing a new field into padding space or appending it to the end of a struct. This sort of change should not impact dependent applications unless they use `sizeof(the struct)` on a struct with an appended field, or create their own instances of such structs --- patterns best avoided. In rare cases, however, even such non-invasive changes may be impractical or impossible. In such an event, the change will be documented in the Release Notes, and details on the issue will also be posted to [TBD; mail list? Blog post? News item?]. To minimize issues and catch changes early, the project strongly recommends that developers adopt continuous integration testing at least for the latest minor release all major versions of Postgres they support. ``` Best, David
Re: Proposal: Document ABI Compatibility
On Jun 24, 2024, at 14:51, Robert Haas wrote: > I suppose that it's true that we try to avoid gratuitous breakage, but > I feel like it would be weird to document that. I see how that can seem weird to a committer deeply familiar with the development process and how things happen. But people outside the -hackers bubble have very little idea. It’s fair to say it needn’t be a long statement for major versions: a single sentence such as “we try to avoid gratuitous breakage” is a perfectly reasonable framing. But I’d say, in the interest of completeness, it would be useful to document the policy for major release *as well as* minor releases. Best, David
Re: Proposal: Document ABI Compatibility
On Jun 19, 2024, at 05:42, Peter Eisentraut wrote: >>> https://postgr.es/m/CAH2-Wzm-W6hSn71sUkz0Rem=qdeu7tnfmc7_jg2djrlfef_...@mail.gmail.com >>> >>> Theoretically anybody can do this themselves. In practice they don't. >>> So something as simple as providing automated reports about ABI >>> changes might well move the needle here. >> What would be required to make such a thing? Maybe it’d make a good Summer >> of Code project. > > The above thread contains a lengthy discussion about what one could do. I somehow missed that GSoC 2024 is already going with contributors. Making a mental note to add an item for 2025. D
Re: RFC: Additional Directory for Extensions
On Jun 24, 2024, at 1:53 PM, Robert Haas wrote: > Is "tighten up what the superuser can do" on our list of objectives? > Personally, I think we should be focusing mostly, and maybe entirely, > on letting non-superusers do more things, with appropriate security > controls. The superuser can ultimately do anything, since they can > cause shell commands to be run and load arbitrary code into the > backend and write code in untrusted procedural languages and mutilate > the system catalogs and lots of other terrible things. I guess the question then is what security controls are appropriate for this feature, which after all tells the postmaster what directories to read files from. It feels a little outside the scope of a regular user to even be aware of the file system undergirding the service. But perhaps there’s a non-superuser role for whom it is appropriate? > Now, I think there are environments where people have used things like > containers to try to lock down the superuser, and while I'm not sure > that can ever be particularly water-tight, if it were the case that > this patch would make it a whole lot easier for a superuser to bypass > the kinds of controls that people are imposing today, that might be an > argument against this patch. But ... off-hand, I'm not seeing such an > exposure. Yeah I’m not even sure I follow. Containers are immutable, other than mutable mounted volumes --- which is one use case this patch is attempting to enable. > On the patch itself, I find the documentation for this to be fairly > hard to understand. I think it could benefit from an example. I'm > confused about whether this is intended to let me search for > extensions in /my/temp/root/usr/lib/postgresql/... by setting > extension_directory=/my/temp/dir, or whether it's intended me to > search both /usr/lib/postgresql as I normally would and also > /some/other/place. I sketched them quickly, so agree they can be better. Reading the code, I now see that it appears to be the former case. I’d like to advocate for the latter. > If the latter, I wonder why we don't handle shared > libraries by setting dynamic_library_path and then just have an > analogue of that for control files. The challenge is that it applies not just to shared object libraries and control files, but also extension SQL files and any other SHAREDIR files an extension might include. But also, I think it should support all the pg_config installation targets that extensions might use, including: BINDIR DOCDIR HTMLDIR PKGINCLUDEDIR LOCALEDIR MANDIR I can imagine an extension wanting or needing to use any and all of these. Best, David
Re: Inconsistent Parsing of Offsets with Seconds
On Jun 22, 2024, at 14:10, David E. Wheeler wrote: > I believe the former issue is caused by the latter: The jsonpath > implementation uses the formatting strings to parse the timestamps[1], and > since there is no formatting to support offsets with seconds, it doesn’t work > at all in JSON timestamp parsing. > > [1]: > https://github.com/postgres/postgres/blob/70a845c/src/backend/utils/adt/jsonpath_exec.c#L2420-L2442 A side-effect of this implementation of date/time parsing using the to_char templates is that only time zone offsets and abbreviations are supported. I find the behavior a little surprising TBH: david=# select to_timestamp('2024-06-03 12:35:00America/New_York', '-MM-DD HH24:MI:SSTZ'); ERROR: invalid value "America/New_York" for "TZ" DETAIL: Time zone abbreviation is not recognized. Unless the SQL standard only supports offsets and abbreviations, I wonder if we’d be better off updating the above parsing code to also try the various date/time input functions, as well as the custom formats that *are* defined by the standard. Best,
Re: Inconsistent Parsing of Offsets with Seconds
On Jun 22, 2024, at 13:15, Tom Lane wrote: > It's hard to get excited about this. I freely admit I’m getting into the weeds here. :-) >> The corresponding jsonpath methods don’t like offsets with seconds *at all*: > > Perhaps that should be fixed, but it's pretty low-priority IMO. > I doubt there is any standard saying that JSON timestamps need > to be able to include that. > >> I see from the source[1] that offsets between plus or minus 15:59:59 >> are allowed; should the `OF` and `TZ formats be able to parse them? > > I'd vote no. to_date/to_char already have enough trouble with format > strings being squishier than one might expect. I believe the former issue is caused by the latter: The jsonpath implementation uses the formatting strings to parse the timestamps[1], and since there is no formatting to support offsets with seconds, it doesn’t work at all in JSON timestamp parsing. [1]: https://github.com/postgres/postgres/blob/70a845c/src/backend/utils/adt/jsonpath_exec.c#L2420-L2442 So if we were to fix the parsing of offsets in jsonpath, we’d either have to change the parsing code there or augment the to_timestamp() formats and use them. Totally agree not a priority; happy to just pretend offsets with seconds don’t exist in any practical sense. Best, David
Inconsistent Parsing of Offsets with Seconds
Hackers, The treatment of timestamptz (and timetz) values with offsets that include seconds seems a bit inconsistent. One can create such timestamps through the input function: david=# select '2024-06-22T12:35:00+02:30:15'::timestamptz; timestamptz 2024-06-22 10:04:45+00 But the offset seconds are dropped (or rounded away?) by to_timestamp()’s `OF` and `TZ` formats[2]: david=# select to_timestamp('2024-06-03 12:35:00+02:30:15', '-MM-DD HH24:MI:SSOF'); to_timestamp 2024-06-03 10:05:00+00 david=# select to_timestamp('2024-06-03 12:35:00+02:30:15', '-MM-DD HH24:MI:SSTZ'); to_timestamp 2024-06-03 02:05:00-08 The corresponding jsonpath methods don’t like offsets with seconds *at all*: david=# select jsonb_path_query('"2024-06-03 12:35:00+02:30:15"', '$.datetime("-MM-DD HH24:MI:SSOF")'); ERROR: trailing characters remain in input string after datetime format david=# select jsonb_path_query('"2024-06-03 12:35:00+02:30:15"', '$.timestamp_tz()'); ERROR: timestamp_tz format is not recognized: "2024-06-03 12:35:00+02:30:15" I see from the source[1] that offsets between plus or minus 15:59:59 are allowed; should the `OF` and `TZ formats be able to parse them? Or perhaps there should be a `TZS` format to complement `TZH` and `TZM`? Best, David [1] https://github.com/postgres/postgres/blob/70a845c/src/include/datatype/timestamp.h#L136-L142 [2]: https://www.postgresql.org/docs/16/functions-formatting.html
Re: call for applications: mentoring program for code contributors
On Jun 20, 2024, at 13:12, Robert Haas wrote: > I'm working to start a mentoring program where code contributors can > be mentored by current committers. Applications are now open: > https://forms.gle/dgjmdxtHYXCSg6aB7 This is amazing! Thank you for putting it together, Robert. Best, David
Re: Extension security improvement: Add support for extensions with an owned schema
On Jun 19, 2024, at 13:50, Jelte Fennema-Nio wrote: > This indeed does sound like the behaviour that pretty much every > existing extension wants to have. One small addition/clarification > that I would make to your definition: fully qualified references to > other objects should still be allowed. Would be tricky for referring to objects from other extensions with no defined schema, or are relatable. > 1. To have a safe search_path that can be used in SET search_path on a > function (see also [1]). > 2. To make it easy for extension authors to avoid conflicts with other > extensions/UDFs. These would indeed be nice improvements IMO. Best, David
Re: Extension security improvement: Add support for extensions with an owned schema
On Jun 19, 2024, at 11:28, Robert Haas wrote: > But I wonder if there might also be another possible approach: could > we, somehow, prevent object references in extension scripts from > resolving to anything other than the system catalogs and the contents > of that extension? Perhaps with a control file setting to specify a > list of trusted extensions which we're also allowed to reference? It would also have to allow access to other extensions it depends upon. D
Re: jsonpath Time and Timestamp Special Cases
On Apr 29, 2024, at 20:45, David E. Wheeler wrote: > I noticed that the jsonpath date/time functions (.time() and timestamp(), et > al.) don’t support some valid but special-case PostgreSQL values, notably > `infinity`, `-infinity`, and, for times, '24:00:00`: Looking at ECMA-404[1], “The JSON data interchange syntax”, it says, of numbers: > Numeric values that cannot be represented as sequences of digits (such as > Infinity and NaN) are not permitted. So it makes sense that the JSON path standard would be the same, since such JSON explicitly cannot represent these values as numbers. Still not sure about `24:00:00` as a time, though. I presume the jsonpath standard disallows it. Best, David [1]: https://ecma-international.org/wp-content/uploads/ECMA-404_2nd_edition_december_2017.pdf
Re: jsonpath: Missing regex_like && starts with Errors?
On Jun 17, 2024, at 6:44 PM, Chapman Flack wrote: > The current implementation seems to have made each of our > s responsible for swallowing its own errors, which > is one perfectly cromulent way to satisfy the SQL standard behavior saying > all errors within a should be swallowed. Naw, executePredicate does it for all of them, as for the left operand here[1]. > The standard says nothing on how they should behave outside of a > , because as far as the standard's concerned, > they can't appear there. > > Ours currently behave the same way, and swallow their errors. Yes, and they’re handled consistently, at least. > It would have been possible to write them in such a way as to raise errors, > but not when inside a , and that would also satisfy > the standard, but it would also give us the errors you would like from our > nonstandard "predicate check expressions". And then you could easily use > silent => true if you wanted them silent. I’m okay without the errors, as long as the behaviors are consistent. I mean it might be cool to have a way to get them, but the consistency I thought I saw was the bit that seemed like a bug. > I'd be leery of changing that, though, as we've already documented that > a "predicate check expression" returns true, false, or unknown, so having > it throw by default seems like a change of documented behavior. Right, same for using jsonb_path_match(). > The current situation can't make much use of 'silent', since it's already > false by default; you can't make it any falser to make predicate-check > errors show up. EXTREAMLY FALSE! > Would it be a thinkable thought to change the 'silent' default to null? > That could have the same effect as false for SQL standard expressions, and > the same effect seen now for "predicate check expressions", and you could > pass it explicitly false if you wanted errors from the predicate checks. Thaat seems like it’d be confusing TBH. > If that's no good, I don't see an obvious solution other than adding > another nonstandard construct to what's nonstandard already, and allowing > something like nonsilent(predicate check expression). The only options I can think of are a GUC to turn on SUPER STRICT mode or something (yuck, action at a distance) or introduce new functions with the new behavior. I advocate for neither (at this point). Best, David [1]: https://github.com/postgres/postgres/blob/82ed67a/src/backend/utils/adt/jsonpath_exec.c#L2058-L2059
Re: Proposal: Document ABI Compatibility
On Jun 12, 2024, at 11:57, Peter Geoghegan wrote: > That having been said, it would be useful if there was a community web > resource for this -- something akin to coverage.postgresql.org, but > with differential ABI breakage reports. You can see an example report > here: > > https://postgr.es/m/CAH2-Wzm-W6hSn71sUkz0Rem=qdeu7tnfmc7_jg2djrlfef_...@mail.gmail.com > > Theoretically anybody can do this themselves. In practice they don't. > So something as simple as providing automated reports about ABI > changes might well move the needle here. What would be required to make such a thing? Maybe it’d make a good Summer of Code project. Best, David
Re: Proposal: Document ABI Compatibility
On Jun 12, 2024, at 11:30, Peter Geoghegan wrote: > I'm a little surprised that we don't seem to have all that many > problems with ABI breakage, though. Although we theoretically have a > huge number of APIs that extension authors might choose to use, that > isn't really true in practical terms. The universe of theoretically > possible problems is vastly larger than the areas where we see > problems in practice. You have to be pragmatic about it. Things go wrong far less often than one might fear! Given this relative stability, I think it’s reasonable to document what heretofore assumed the policy is so that the fears can largely be put to rest by clear expectations. Best, David
Re: Proposal: Document ABI Compatibility
On Jun 12, 2024, at 11:20, Andres Freund wrote: >> The PostgreSQL core team maintains two application binary interface (ABI) >> guarantees: one for major releases and one for minor releases. > > I.e. for major versions it's "there is none"? Is it? ISTM that there is the intention not to break things that don’t need to be broken, though that doesn’t rule out interface improvements. >> In such cases the incompatible changes will be listed in the Release Notes. > > I don't think we actually exhaustively list all of them. Should they be? I can maybe see the argument not to for major releases. But I’ve also has the experience of a new failure on a major release and having to go find the reason for it and the fix, often requiring the attention of someone on a mailing list who might rather tap the “compatibility changes” sign in the latest change log. :-) > s/every/a reasonable/ or just s/every/an/ ✅ > The padding case doesn't affect sizeof() fwiw. ✅ > I think there's too often not an alternative to using sizeof(), potentially > indirectly (via makeNode() or such. So this sounds a bit too general. Is there some other way to avoid the issue? Or would constructor APIs need to be added to core? Updated with your suggestions: ``` md ABI Policy == The PostgreSQL core team maintains two application binary interface (ABI) guarantees: one for major releases and one for minor releases. Major Releases -- Applications that use the PostgreSQL APIs must be compiled for each major release supported by the application. The inclusion of `PG_MODULE_MAGIC` ensures that code compiled for one major version will rejected by other major versions. Furthermore, new releases may make API changes that require code changes. Use the `PG_VERSION_NUM` constant to adjust code in a backwards compatible way: ``` c #if PG_VERSION_NUM >= 16 #include "varatt.h" #endif ``` PostgreSQL avoids unnecessary API changes in major releases, but usually ships a few necessary API changes, including deprecation, renaming, and argument variation. In such cases the incompatible changes will be listed in the Release Notes. Minor Releases -- PostgreSQL makes an effort to avoid both API and ABI breaks in minor releases. In general, an application compiled against any minor release will work with any other minor release, past or future. When a change *is* required, PostgreSQL will choose the least invasive way possible, for example by squeezing a new field into padding space or appending it to the end of a struct. This sort of change should not impact dependent applications unless, they use `sizeof(the struct)` on a struct with an appended field, or create their own instances of such structs --- patterns best avoided. In rare cases, however, even such non-invasive changes may be impractical or impossible. In such an event, the change will be documented in the Release Notes, and details on the issue will also be posted to [TBD; mail list? Blog post? News item?]. The project strongly recommends that developers adopt continuous integration testing at least for the latest minor release all major versions of Postgres they support. ``` Best, David
Re: jsonpath: Missing regex_like && starts with Errors?
On Jun 16, 2024, at 11:52, David E. Wheeler wrote: > I think that’s how it should be; I prefer that it raises errors by default > but you can silence them: > > david=# select jsonb_path_query(target => '{"x": "hi"}', path => > '$.integer()', silent => false); > ERROR: jsonpath item method .integer() can only be applied to a string or > numeric value > > david=# select jsonb_path_query(target => '{"x": "hi"}', path => > '$.integer()', silent => true); > jsonb_path_query > -- > (0 rows) > > I suggest that the same behavior be adopted for `like_regex` and `starts > with`. Okay, I think I’ve figured this out, and the key is that I am, once again, comparing predicate path queries to SQL standard queries. If I update the first example to use a comparison I no longer get an error: david=# select jsonb_path_query('{"x": "hi"}', '$.integer() == 1'); jsonb_path_query -- null So I think that’s the key: There’s not a difference between the behavior of `like_regex` and `starts with` vs other predicate expressions. This dichotomy continues to annoy. I would very much like some way to have jsonb_path_query() raise an error (or even a warning!) if passed a predate expression, and for jsonb_path_match() to raise an error or warning if its path is not a predicate expression. Because I keep confusing TF out of myself. Best, David
Re: FYI: LLVM Runtime Crash
On Jun 17, 2024, at 16:37, Andres Freund wrote: > I suspect the issue might be that the version of clang and LLVM are diverging > too far. Does it work if you pass CLANG=/opt/homebrew/opt/llvm/bin/clang to > configure? It does! It didn’t occur to me this would be the issue, but I presumes /usr/bin/clang is not compatible with the latest LLVM installed from Homebrew. Interesting! I’ll update that issue. Thanks, David
Re: FYI: LLVM Runtime Crash
On Jun 17, 2024, at 11:52, David E. Wheeler wrote: > I don’t *think* it’s something that can be fixed in Postgres core. This is > mostly in FYI in case anyone else runs into this issue or knows something > more about it. Okay, a response to the issue[1] says the bug is in Postgres: > The error message is LLVM reporting the backend can't handle the particular > form of "probe-stack" attribute in the input LLVM IR. So this is likely a bug > in the way postgres is generating LLVM IR: please file a bug against > Postgres. (Feel free to reopen if you have some reason to believe the issue > is on the LLVM side.) Would it be best for me to send a report to pgsql-bugs? Best, David [1]: https://github.com/llvm/llvm-project/issues/95804#issuecomment-2174310977
FYI: LLVM Runtime Crash
Hackers, FYI, I wanted to try using PostgreSQL with LLVM on my Mac, but the backend repeatedly crashes during `make check`. I found the same issue in master and REL_16_STABLE. The crash message is: FATAL: fatal llvm error: Unsupported stack probing method server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. connection to server was lost Same thing appears in the log output. I poked around and found a similar bug[1] was fixed in LLVM in December, but as I’m getting it with the latest release from 2 weeks ago, something else might be going on. I’ve opened a new LLVM bug report[2] for the issue. I don’t *think* it’s something that can be fixed in Postgres core. This is mostly in FYI in case anyone else runs into this issue or knows something more about it. In the meantime I’ll be building without --with-llvm. Best, David [1]: https://github.com/llvm/llvm-project/issues/57171 [2]: https://github.com/llvm/llvm-project/issues/95804
Re: jsonpath: Missing regex_like && starts with Errors?
On Jun 15, 2024, at 12:23, Chapman Flack wrote: > I see. Yes, that documentation now says "predicate check expressions return > the single three-valued result of the predicate: true, false, or unknown". It has been there since jsonpath was introduced in v12[1]: > A path expression can be a Boolean predicate, although the SQL/JSON standard > allows predicates only in filters. This is necessary for implementation of > the @@ operator. For example, the following jsonpath expression is valid in > PostgreSQL: > > '$.track.segments[*].HR < 70' > (Aside: are all readers of the docs assumed to have learned the habit > of calling SQL null "unknown" when speaking of a boolean? They can flip > back to 8.6 Boolean Type and see 'a third state, “unknown”, which is > represented by the SQL null value'. But would it save them some page > flipping to add " (represented by SQL null)" to the sentence here?) In 9.16.2[2] it says: > The unknown value plays the same role as SQL NULL and can be tested for with > the is unknown predicate. > As Unknown is typically what the predicates return within a filter (where > errors get trapped) when an error has occurred, the existing docs seem to > suggest they behave the same way in a "predicate check expression", so a > change to that behavior now would be a change to what we've documented. It’s reasonable to ask, then, whether `starts with` and `like_regex` are correct and the others shouldn’t throw errors in predicate check expressions, yes. I don’t know the answer, but would like it to be consistent. > Can't really overload jsonb_path_query's 'silent' parameter for that, > because it is already false by default. If predicate check expressions > were nonsilent by default, the existing 'silent' parameter would be a > perfect way to silence them. I think that’s how it should be; I prefer that it raises errors by default but you can silence them: david=# select jsonb_path_query(target => '{"x": "hi"}', path => '$.integer()', silent => false); ERROR: jsonpath item method .integer() can only be applied to a string or numeric value david=# select jsonb_path_query(target => '{"x": "hi"}', path => '$.integer()', silent => true); jsonb_path_query -- (0 rows) I suggest that the same behavior be adopted for `like_regex` and `starts with`. > No appetite to add yet another optional boolean parameter to > jsonb_path_query for the sole purpose of controlling the silence of > our nonstandard syntax extension You don’t need it IMO, the existing silent parameter already does it existing error-raising operators. Best, David [1]: https://www.postgresql.org/docs/12/functions-json.html#FUNCTIONS-SQLJSON-PATH [2]: https://www.postgresql.org/docs/devel/functions-json.html#FUNCTIONS-SQLJSON-PATH
Re: Shouldn't jsonpath .string() Unwrap?
On Jun 15, 2024, at 12:48, Andrew Dunstan wrote: > Not really needed, I will commit shortly. Ah, okay, I wasn’t sure so just defaulted to making sure it was tracked. :-) Thanks Andrew, D
Re: Shouldn't jsonpath .string() Unwrap?
On Jun 15, 2024, at 10:39, David E. Wheeler wrote: >> The changes are straightforward and look good to me. However, I have kept >> the existing test of an empty array as is, assuming that it is one of the >> valid tests. It now returns zero rows instead of an error. Your added test >> case covers this issue. > > Makes sense, thank you. Added https://commitfest.postgresql.org/48/5039/. D
Re: jsonpath: Missing regex_like && starts with Errors?
On Jun 14, 2024, at 22:29, Chapman Flack wrote: > So I should go look at our code to see what grammar we've implemented, > exactly. It is beginning to seem as if we have simply added > as another choice for an expression, not restricted > to only appearing in a filter. If so, and we add documentation about how > we diverge from the standard, that's probably the way to say it. Yes, if I understand correctly, these are predicate check expressions, supported and documented as an extension to the standard since Postgres 12[1]. I found their behavior quite confusing for a while, and spent some time figuring it out and submitting a doc patch (committed in 7014c9a[2]) to hopefully clarify things in Postgres 17. > So that's where the errors went. Ah, great, that explains the error suppression in filters. Thank you. I still think the supression of `like_regex` and `starts with` errors in predicate path queries is odd, though. > The question of what should happen to the errors when a > appears outside of a > of course isn't answered in the standard, because that's not supposed > to be possible. So if we're allowing predicates to appear on their own > as expressions, it's also up to us to say what should happen with errors > when they do. Right, and I think there’s an inconsistency right now. Best, David [1]: https://www.postgresql.org/docs/devel/functions-json.html#FUNCTIONS-SQLJSON-CHECK-EXPRESSIONS [2]: https://github.com/postgres/postgres/commit/7014c9a
Re: Shouldn't jsonpath .string() Unwrap?
On Jun 15, 2024, at 10:27, Jeevan Chalke wrote: > Sorry, I have missed this in the original patch. I am surprised how that > happened. But thanks for catching the same and fixing it. No worries. :-) > The changes are straightforward and look good to me. However, I have kept the > existing test of an empty array as is, assuming that it is one of the valid > tests. It now returns zero rows instead of an error. Your added test case > covers this issue. Makes sense, thank you. D
jsonpath: Missing regex_like && starts with Errors?
Hackers, I noticed that neither `regex_like` nor `starts with`, the jsonpath operators, raise an error when the operand is not a string (or array of strings): david=# select jsonb_path_query('true', '$ like_regex "^hi"'); jsonb_path_query -- null (1 row) david=# select jsonb_path_query('{"x": "hi"}', '$ starts with "^hi"'); jsonb_path_query -- null (1 row) This is true in strict and lax mode, and with verbosity enabled (as in these examples). Most other operators raise an error when they can’t operate on the operand: david=# select jsonb_path_query('{"x": "hi"}', '$.integer()'); ERROR: jsonpath item method .integer() can only be applied to a string or numeric value david=# select jsonb_path_query('{"x": "hi"}', '$+$'); ERROR: left operand of jsonpath operator + is not a single numeric value Should `like_regex` and `starts with` adopt this behavior, too? I note that filter expressions seem to suppress these sorts of errors, but I assume that’s by design: david=# select jsonb_path_query('{"x": "hi"}', 'strict $ ?(@ starts with "^hi")'); jsonb_path_query -- (0 rows) david=# select jsonb_path_query('{"x": "hi"}', 'strict $ ?(@ like_regex "^hi")'); jsonb_path_query -- (0 rows) david=# select jsonb_path_query('{"x": "hi"}', 'strict $ ?(@.integer() == 1)'); jsonb_path_query -- (0 rows) D
Re: Shouldn't jsonpath .string() Unwrap?
> On Jun 14, 2024, at 11:25, Chapman Flack wrote: > > I would s/extepsions/exceptions/ in the added documentation. :) Bah, fixed and attached, thanks. > Offhand (as GitHub PRs aren't really The PG Way), if they were The Way, > I would find this one a little hard to follow, being based at a point > 28 unrelated commits ahead of the ref it's opened against. I suspect > 'master' on theory/postgres could be fast-forwarded to f1affb6 and then > the PR would look much more approachable. Yeah, I pushed the PR and branch before I synced master, and GitHub was taking a while to notice and update the PR. I fixed it with `git commit --all --amend --date now --reedit-message HEAD` and force-pushed (then fixed the typo and fixed again). D v2-0001-Teach-jsonpath-string-to-unwrap-in-lax-mode.patch Description: Binary data
Re: Shouldn't jsonpath .string() Unwrap?
On Jun 13, 2024, at 21:55, Chapman Flack wrote: > My opinion is yes, that should be done. 9.46, umm, General > Rule 11 g ii 6) A) says just "if MODE is lax and is not > type or size, then let BASE be Unwrap(BASE)." No special exemption > there for string(), nor further below at C) XV) for the operation > of string(). Thank you! Cited that bit in the commit message in the attached patch (also available as a GitHub PR[1]). D [1]: https://github.com/theory/postgres/pull/5 v1-0001-Teach-jsonpath-string-to-unwrap-in-lax-mode.patch Description: Binary data
Re: jsonpath: Missing Binary Execution Path?
On Jun 13, 2024, at 22:31, Chapman Flack wrote: > It's baked right into the standard grammar: || can only have a > on its right and a > on its left. > > && can only have a on its right and a > on its left. Wow. > The case for ! is even more limiting: it can't be applied to anything > but a . That can be either the exists predicate, > or, any other but wrapped in parentheses. > > The language seems sort of gappy in the same way XPath 1.0 was. XPath 2.0 > became much more consistent and conceptually unified, only by that time, > XML was old school, and JSON was cool, and apparently started inventing > a path language. I suppose that’s the reason for this design. But if these sorts of limitations were changed in XPath, perhaps SQL-Next could fix them, too. Thanks for citing the standard; super helpful. D
Re: jsonpath: Missing Binary Execution Path?
On Jun 13, 2024, at 21:58, Chapman Flack wrote: david=# select jsonb_path_query('1', '$ >= 1'); >>> >>> Good point. I can't either. No way I can see to parse that as >>> a . >> >> Whether we note it as non-standard or not is an open question then, but it >> does work and opens up a documentation question. > > Does the fact that it does work raise any potential concern that our > grammar is nonconformant in some way that could present a headache > somewhere else, or down the road with a later standard edition? I believe this case is already covered in the docs as a Postgres-specific feature: predicate path expressions. But even inside filters I don’t understand why &&, ||, at least, currently only work if their operands are predicate expressions. Seems weird; and your notes above suggest that rule applies only to !, which makes slightly more sense. D
Re: Shouldn't jsonpath .string() Unwrap?
On Jun 13, 2024, at 3:53 PM, Andrew Dunstan wrote: > Hmm. You might be right. Many of these items have this code, but the string() > branch does not: > if (unwrap && JsonbType(jb) == jbvArray) >return executeItemUnwrapTargetArray(cxt, jsp, jb, found, >false); Exactly, would be pretty easy to add. I can work up a patch this weekend. D
Re: jsonpath: Missing Binary Execution Path?
On Jun 13, 2024, at 3:33 PM, Andrew Dunstan wrote: > What does the spec say about these? What do other implementations do? Paging Mr. Eisentraut! :-) D
Re: jsonpath: Missing Binary Execution Path?
On Jun 13, 2024, at 11:32, David E. Wheeler wrote: > Should && and || not also work on scalar operands? I see the same issue for unary !, too: david=# select jsonb_path_query('true', '!$'); ERROR: syntax error at or near "$" of jsonpath input LINE 1: select jsonb_path_query('true', '!$'); ^ david=# select jsonb_path_query('[1, 3, 7]', '$[*] ? (!true)'); ERROR: syntax error at end of jsonpath input LINE 1: select jsonb_path_query('[1, 3, 7]', '$[*] ? (!true)'); ^ david=# select jsonb_path_query('[1, 3, 7]', '$[*] ? (!@.boolean())'); ERROR: syntax error at or near "@" of jsonpath input LINE 1: select jsonb_path_query('[1, 3, 7]', '$[*] ? (!@.boolean())'... ^ Best, David
jsonpath: Missing Binary Execution Path?
Hackers, Another apparent inconsistency I’ve noticed in jsonpath queries is the treatment of the && and || operators: They can’t operate on scalar functions, only on other expressions. Some examples: david=# select jsonb_path_query('true', '$ && $'); ERROR: syntax error at or near "&&" of jsonpath input LINE 1: select jsonb_path_query('true', '$ && $'); ^ david=# select jsonb_path_query('true', '$.boolean() && $.boolean()'); ERROR: syntax error at or near "&&" of jsonpath input LINE 1: select jsonb_path_query('true', '$.boolean() && $.boolean()'... ^ The only place I’ve seen them work is inside filters with binary or unary operands: jsonb_path_query('[1, 3, 7]', '$[*] ? (@ > 1 && @ < 5)'); jsonb_path_query -- 3 It doesn’t even work with boolean methods! david=# select jsonb_path_query('[1, 3, 7]', '$[*] ? (@.boolean() && @.boolean())'); ERROR: syntax error at or near "&&" of jsonpath input LINE 1: select jsonb_path_query('[1, 3, 7]', '$[*] ? (@.boolean() &&... ^ Other binary operators work just fine in these sorts of contexts: david=# select jsonb_path_query('1', '$ >= 1'); jsonb_path_query -- true (1 row) david=# select jsonb_path_query('[1, 3, 7]', '$[*] ? (@ > 1)'); jsonb_path_query -- 3 7 (2 rows) Should && and || not also work on scalar operands? Best, David
Re: Shouldn't jsonpath .string() Unwrap?
On Jun 12, 2024, at 4:02 PM, David G. Johnston wrote: > Adding Andrew. Thank you. > I'm willing to call this an open item against this feature as I don't see any > documentation explaining that string() behaves differently than the others. Maybe there’s some wording in the standard on this topic? I’m happy to provide a patch to auto-unwrap .string() in lax mode. Seems pretty straightforward. D
Re: Proposal: Document ABI Compatibility
On Jun 12, 2024, at 10:47, Robert Haas wrote: > What I think would be useful to document is our usual practices e.g. > adding new struct members at the end of structs, trying to avoid > changing public function signatures. If we document promises to > extension authors, I don't know how much difference that will make: > we'll probably end up needing to violate them at some point for one > reason or another. I think that’s fine if there is some sort of notification process. The policy I drafted upthread starts with making sure the such a break is mentioned in the release notes. > But if we document what committers should do, then > we might do better than we're now, because committers will be more > likely to do it right, and extension authors can also read those > instructions to understand what our practices are. Yes, this, thank you! D
Re: Proposal: Document ABI Compatibility
On Jun 12, 2024, at 8:58 AM, Jelte Fennema-Nio wrote: > While not strictly an ABI break I guess, the backport of 32d5a4974c81 > broke building Citus against 13.10 and 14.7[1]. > > [1]: https://github.com/citusdata/citus/pull/6711 Interesting one. We might want to advise projects to use deferent names if they copy code from the core, use an extension-specific prefix perhaps. That way if it gets backported by the core, as in this example, it won’t break anything, and the extension can choose to switch. D
Re: Proposal: Document ABI Compatibility
On Jun 12, 2024, at 8:43 AM, Peter Eisentraut wrote: >> Right, it’s just that extension authors could use some notification that >> such a change is coming so they can update their code, if necessary. > > I think since around 6 years ago we have been much more vigilant about > avoiding ABI breaks. So if there aren't any more recent examples of > breakage, then maybe that was ultimately successful, and the upshot is, > continue to be vigilant at about the same level? That sounds great to me. I’d like to get it documented, though, so that extension and other third party developers are aware of it, and not just making wild guesses and scaring each other over (perhaps) misconceptions. D
Re: Proposal: Document ABI Compatibility
On Jun 10, 2024, at 15:39, Andres Freund wrote: > That's 6 years ago, not sure we can really learn that much from that. > > And it's not like it's actually impossible, #ifdefs aren't great, but they are > better than nothing. Right, it’s just that extension authors could use some notification that such a change is coming so they can update their code, if necessary. >> Or, to David C’s point, perhaps it would be better to say there are some >> categories of APIs that are not subject to any guarantees in minor releases? > > I'm honestly very dubious that this is a good point to introduce a bunch of > formalism. It's a already a lot of work to maintain them, if we make it even > harder we'll end up more fixes not being backported, because it's not worth > the pain. Well it’s a matter of distributing the work. I don’t want to increase anyone’s workload unnecessarily, but as it is stuff like this can be surprising to extension maintainers with some expectation of minor release stability who had no warning of the change. That kind of thing can dissuade some people from deciding to write or maintain extensions, and lead others to recompile and distribute binaries for every single minor release. > To be blunt, the number of examples raised here doesn't seem to indicate that > this is an area where we need to invest additional resources. We are already > severely constrained as a project by committer bandwidth, there are plenty > other things that seem more important to focus on. So my question is, what’s the least onerous thing for committers to commit to doing that we can write down to properly set expectations? That’s where I want to start: can we publish a policy that reflects what committers already adhere to? And is there some way to let people know that an incompatible change is being released? Even if it just starts out in the release notes? Based on this thread, I’ve drafted the sort of policy I have in mind. Please don’t assume I’m advocating for exactly the wording here! Let’s workshop this until it’s something the committers and core team can agree to. (At that point I’ll turn it into a doc patch) Have a look and let me know what you think. ``` md ABI Policy == The PostgreSQL core team maintains two application binary interface (ABI) guarantees: one for major releases and one for minor releases. Major Releases -- Applications that use the PostgreSQL APIs must be compiled for each major release supported by the application. The inclusion of `PG_MODULE_MAGIC` ensures that code compiled for one major version will rejected by other major versions. Furthermore, new releases may make API changes that require code changes. Use the `PG_VERSION_NUM` constant to adjust code in a backwards compatible way: ``` c #if PG_VERSION_NUM >= 16 #include "varatt.h" #endif ``` PostgreSQL avoids unnecessary API changes in major releases, but usually ships a few necessary API changes, including deprecation, renaming, and argument variation. In such cases the incompatible changes will be listed in the Release Notes. Minor Releases -- PostgreSQL makes every effort to avoid both API and ABI breaks in minor releases. In general, an application compiled against any minor release will work with any other minor release, past or future. When a change *is* required, PostgreSQL will choose the least invasive way possible, for example by squeezing a new field into padding space or appending it to the end of a struct. This sort of change should not impact dependent applications unless they use `sizeof(the struct)` or create their own instances of such structs --- patterns best avoided. In rare cases, however, even such non-invasive changes may be impractical or impossible. In such an event, the change will be documented in the Release Notes, and details on the issue will also be posted to [TBD; mail list? Blog post? News item?]. The project strongly recommends that developers adopt continuous integration testing at least for the latest minor release all major versions of Postgres they support. ``` Best, David
Re: Proposal: Document ABI Compatibility
On Jun 4, 2024, at 03:18, Peter Eisentraut wrote: > This could possibly be avoided by renaming the symbol in backbranches. Maybe > something like > > #define InitResultRelInfo InitResultRelInfo2 > > Then you'd get a specific error message when loading the module, rather than > a crash. That sounds more useful, yes. Is that a practice the project would consider adopting? There’s also oracle_fdw@d137d15[1], which says: > An API break in PostgreSQL 10.4 and 9.6.9 makes it impossible > to use these versions: the "extract_actual_join_clauses" function > gained an additional parameter. The 10.4 commit is 68fab04, and it does indeed add a new function: ``` patch --- a/src/include/optimizer/restrictinfo.h +++ b/src/include/optimizer/restrictinfo.h @@ -36,6 +36,7 @@ extern List *get_actual_clauses(List *restrictinfo_list); extern List *extract_actual_clauses(List *restrictinfo_list, bool pseudoconstant); extern void extract_actual_join_clauses(List *restrictinfo_list, + Relids joinrelids, List **joinquals, List **otherquals); extern bool join_clause_is_movable_to(RestrictInfo *rinfo, RelOptInfo *baserel); ``` I wonder if that sort of change could be avoided in backpatches, maybe by adding and using a `extract_actual_join_clauses_compat` function and using that internally instead? Or, to David C’s point, perhaps it would be better to say there are some categories of APIs that are not subject to any guarantees in minor releases? Best, David [1]: https://github.com/laurenz/oracle_fdw/commit/d137d15edca8c67df1e5a01f417f4833b028 [2]: https://github.com/postgres/postgres/commit/68fab04f7c2a07c5308e3d2957198ccd7a80ebc5#diff-bb6fa74cb115e19684092f0938131cd5d99b26fa2d49480f7ea7f28e937a7fb4
Shouldn't jsonpath .string() Unwrap?
Hackers, Most of the jsonpath methods auto-unwrap in lax mode: david=# select jsonb_path_query('[-2,5]', '$.abs()'); jsonb_path_query -- 2 5 (2 rows) The obvious exceptions are size() and type(), which apply directly to arrays, so no need to unwrap: david=# select jsonb_path_query('[-2,5]', '$.size()'); jsonb_path_query -- 2 (1 row) david=# select jsonb_path_query('[-2,5]', '$.type()'); jsonb_path_query -- "array" But what about string()? Is there some reason it doesn’t unwrap? david=# select jsonb_path_query('[-2,5]', '$.string()'); ERROR: jsonpath item method .string() can only be applied to a bool, string, numeric, or datetime value What I expect: david=# select jsonb_path_query('[-2,5]', '$.string()'); jsonb_path_query — "2" "5" (2 rows) However, I do see a test[1] for this behavior, so maybe there’s a reason for it? Best, David [1]: https://github.com/postgres/postgres/blob/REL_17_BETA1/src/test/regress/expected/jsonb_jsonpath.out#L2527-L2530
Re: Patch bug: Fix jsonpath .* on Arrays
On Jun 7, 2024, at 10:23, David E. Wheeler wrote: > Rebased and moved the new tests to the end of the file. Bah, sorry, that was the previous patch. Here’s v3. D v3-0001-Add-tests-for-jsonpath-.-on-arrays.patch Description: Binary data
Re: Patch bug: Fix jsonpath .* on Arrays
On Jun 4, 2024, at 20:45, David E. Wheeler wrote: > Oh FFS, unwrapping still breaks my brain. You’re right, of course. Here’s a > new patch that demonstrates that behavior, since that code path is not > currently represented in tests AFAICT (I would have expected to have broken > it with this patch). Rebased and moved the new tests to the end of the file. D v2-0001-Add-tests-for-jsonpath-.-on-arrays.patch Description: Binary data
Re: Patch bug: Fix jsonpath .* on Arrays
On Jun 4, 2024, at 20:45, David E. Wheeler wrote: > Here’s a new patch that demonstrates that behavior, since that code path is > not currently represented in tests AFAICT (I would have expected to have > broken it with this patch). Commitfest link: https://commitfest.postgresql.org/48/5017/ D
Re: Patch bug: Fix jsonpath .* on Arrays
On Jun 4, 2024, at 12:28 PM, David G. Johnston wrote: > This seems to be working correctly. Lax mode causes the first array level to > unwrap and produce new context item values. Then the wildcard member > accessor is applied to each. Numbers don’t have members so no matches exist > in the first example. The object in the second indeed has a single member > and so matches the wildcard and its value, the array, is returned. Oh FFS, unwrapping still breaks my brain. You’re right, of course. Here’s a new patch that demonstrates that behavior, since that code path is not currently represented in tests AFAICT (I would have expected to have broken it with this patch). D 0001-Add-tests-for-jsonpath-.-on-arrays.patch Description: Binary data
Patch bug: Fix jsonpath .* on Arrays
Hackers, The behavior of the .* jpiAnyKey jsonpath selector seems incorrect. ``` select jsonb_path_query('[1,2,3]', '$.*'); jsonb_path_query -- (0 rows) select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', '$.*'); jsonb_path_query -- [3, 4, 5] ``` The first example might be expected, since .* is intended for object keys, but the handing of `jpiAnyKey` has a branch for unwrapping arrays. The second example, however, just seems weird: this is .*, not .**. The attached patch fixes it by passing the next node to `executeAnyItem()` (via `executeItemUnwrapTargetArray()`) and then properly setting `jperOk` when `executeAnyItem()` finds values and there is no current (next) node. I took this approach given what appears to be the intended behavior or $* on arrays in lax mode. However, I could see an argument that .* should not apply to arrays at all. If so, I can submit a new patch removing the branch that unwraps an array with .*. Best, David 0001-Fix-behavior-of-jsonpath-.-on-arrays.patch Description: Binary data
Re: Proposal: Document ABI Compatibility
On Jun 3, 2024, at 5:56 PM, Andres Freund wrote: > I don't see how this would trigger random crashes. > > Unfortunately [4] doesn't seem to take me to a relevant message (pruned chat > history?), so I can't infer more from that context. You can use [4] to join the Slack (if you haven’t already) and [5] for the relevant post. > Regardless of ABI issues, it's probably a good idea to continually run tests > against in-development minor versions, just to prevent something breaking from > creeping in. IIRC there were a handful of cases where we accidentally broke > some extension, because they relied on some implementation details. Oh yeah, I run regular tests against the latest minor release of all supported Postgres version for my extensions, using pgxn-tools[6], which looks like this[7]. Which I consider absolutely essential. But it doesn’t mean that something compiled against .4 will work with .3 and vice versa. That’s what we could use the guidance/guarantees on. >> Sure, probably not a problem, but if that’s the sole qualifier for making >> binary changes, I think it’s worth saying, as opposed to “we don’t make >> any”. Something like “Only changes to padding, which you never used anyway, >> right?” :-) > > IDK, to me something like this seems to promise more than we actually can. What I’d like to do is figure out exactly what we *can* promise and perhaps some guidelines, and start with that. Best, David [4]: https://pgtreats.info/slack-invite [5]: https://postgresteam.slack.com/archives/C056ZA93H1A/p1716502630690559?thread_ts=1716500801.036709=C056ZA93H1A [6]: https://github.com/pgxn/docker-pgxn-tools [7]: https://github.com/pgxn/docker-pgxn-tools/actions/runs/9351752462
Re: Proposal: Document ABI Compatibility
On Jun 3, 2024, at 14:58, Andres Freund wrote: > Hi, Hello Andres. > Are there notes for the session? Yes, but not posted yet. Here’s what Andreas 'ads' Scherbaum sent me for that bit of the conversation: * Core is focused on core ABI stability * David: No "statement of stability" in Core * David/Jeremy/Tom: coding guidelines, style guidelines *useful to have docs in core about what's stable and what's not, what you should compile against or not, and ABI guarantees * Abigale: there are hooks, but no overall concept for extensions * Tom: Peter Eisentraut is working on tests for extensions stability * Jeremy: nothing is preventing people from installing incompatible versions > It'd be nice if the slides for the talk could be uploaded... He also talked about it at Mini-Summit Five[1], for which I posted slides[2] (slides 11-14) and video[3] (starting at 29:08). > I don't think we can really make this a hard guarantee. Yes, we try hard to > avoid ABI breaks, but there IIRC have been a few cases over the years where > that wasn't practical for some reason. If we have to decide between a bad bug > and causing an ABI issue that's unlikely to affect anybody, we'll probably > choose the ABI issue. We can document that too, and perhaps a policy for letting people know. I thought I recalled something like this in the past, but Rob Treat did some spelunking through change logs and found only CVEs that needed repairs by manually running some SQL. So some sense of its rarity would also be useful. > Thus I am not really on board with this statement as-is. That’s fine, we should get it to where there’s consensus on the ordering and agreement on what, exactly, it should be. > Extensions in general can do lots of stuff, guaranteeing that bug fixes don't > cause any problems is just not feasible. > > It'd be interesting to see a few examples of actual minor-version-upgrade > extension breakages, so we can judge what caused them. In the community Slack[4], Matthias van de Meent writes[5]: > Citus’ pg_version_compat.h[7] where it re-implements a low-level function > that was newly introduced in PG14.7. If you build against PG14.7+ headers, > you may get random crashes when running on 14.6. I suppose it would work fine on 14.7 if compiled on 14.6 though. I suspect there aren’t many examples, though, mostly just a lot of anxiety, and some have decided that extensions must be recompiled for every minor release in order to avoid the issue. StackGres[7] is one example, but I suspect Omni (Yurii’s company) may follow. > I don't think it's common for such new-fields-in-padding to cause problems > when using an earlier minor PG version. For that the extension would need to > actually rely on the presence of the new field, but typically that'd not be > the case when we introduce a new field in a minor version. That’s what I was thinking, so “compatibility assured only if you don’t use padding yourself” would be super helpful. >> Unless, that is, we can provide a complete list of things not to do (like >> make use of padding) to avoid it. Is that feasible? > > You can't really rely on the contents of padding, in general. So I don't think > this is really something that needs to be called out. Sure, probably not a problem, but if that’s the sole qualifier for making binary changes, I think it’s worth saying, as opposed to “we don’t make any”. Something like “Only changes to padding, which you never used anyway, right?” :-) D [1]: https://justatheory.com/2024/05/mini-summit-five/ [2]: https://justatheory.com/shared/extension-ecosystem-summit/omni-universally-buildable-extensions.pdf [3]: https://youtu.be/R5ijx8IJyaM [4]: https://pgtreats.info/slack-invite [5]: https://postgresteam.slack.com/archives/C056ZA93H1A/p1716502630690559?thread_ts=1716500801.036709=C056ZA93H1A [6] https://github.com/citusdata/citus/blob/main/src/include/pg_version_compat.h#L236-L248 [7]: https://stackgres.io/doc/latest/intro/extensions/
Proposal: Document ABI Compatibility
Hackers, At the PGConf Unconference session on improving extension support in core, we talked quite a bit about the recent anxiety among extension developers about a lack of an ABI compatibility guarantee in Postgres. Yurii Rashkovskii did a little header file spelunking and talked[1] about a few changes in minor version releases, including to apparent field order in structs. Jeremy Schneider posted the example on Twitter[2], and Peter G replied[3]: > You must be referring to my commit 714780dc. The new field is stored within > alignment padding (though only on back branches). Has this been tied to a > known problem? At the Unconference, Tom Lane said that this approach is pretty well drilled into the heads of every committer, and new ones pick it up through experience. The goal, IIUC, is to never introduce binary incompatibilities into the C APIs in minor releases. This standard would be good to document, to let extension developers know exactly what the guarantees are. I’m happy to start a doc patch to add an ABI compatibility guarantee (presumably in xfunc.sgml), but want to be sure the details are right, namely: * There are no source or binary compatibility guarantees for major releases. * The ABI is guaranteed to change only in backward compatible ways in minor releases. If for some reason it doesn’t it’s a bug that will need to be fixed. This ensures that an extension compiled against an earlier minor release will continue to work without recompilation on later minor releases of the same major version. But if I understand correctly, the use of techniques like adding a new field in padding does not mean that extensions compiled on later minor releases will work on earlier minor releases of the same major version. Unless, that is, we can provide a complete list of things not to do (like make use of padding) to avoid it. Is that feasible? Are there other details it should include? Thanks, David [1]: https://www.pgevents.ca/events/pgconfdev2024/schedule/session/14 [2]: https://x.com/jer_s/status/1785717368804815026 [3]: https://x.com/petervgeoghegan/status/1785720228237717627
jsonpath Time and Timestamp Special Cases
Hello hackers, I noticed that the jsonpath date/time functions (.time() and timestamp(), et al.) don’t support some valid but special-case PostgreSQL values, notably `infinity`, `-infinity`, and, for times, '24:00:00`: ❯ psql psql (17devel) Type "help" for help. david=# select jsonb_path_query(to_jsonb('24:00:00'::time), '$.time()'); ERROR: time format is not recognized: "24:00:00" david=# select jsonb_path_query(to_jsonb('infinity'::timestamptz), '$.timestamp_tz()'); ERROR: timestamp_tz format is not recognized: "infinity" I assume this is because the standard doesn’t support these, or references JavaScript-only values or some such. Is that right? Best, David
Re: New committers: Melanie Plageman, Richard Guo
On Apr 26, 2024, at 07:54, Jonathan S. Katz wrote: > The Core Team would like to extend our congratulations to Melanie Plageman > and Richard Guo, who have accepted invitations to become our newest > PostgreSQL committers. > > Please join us in wishing them much success and few reverts! Great news, congratulations! Best, David
Re: Q: Escapes in jsonpath Idents
On Apr 24, 2024, at 3:22 PM, Erik Wienhold wrote: > Thanks Peter! But what is the definition of the entire path expression? > Perhaps something like: > > ::= { "." } > > That would imply that "$.$foo" is a valid path that accesses a variable > member (but I guess the path evaluation is also specified somewhere). I read it as “if it starts with a dollar sign, it’s a variable and not a path identifier”, and I assume any `.foo` expression is a path identifier. > What bugs me about this description, after reading it a couple of times, > is that it's not clear what is meant by ."$varname". It could mean two > things: (1) the double-quoting masks $varname in order to not interpret > those characters as a variable or (2) an interpolated string that > resolves $varname and yields a dynamic member accessor. My understanding is that if it’s in double quotes it’s never anything other than a string (whether a string literal or a path identifier string literal). IOW, variables don’t interpolate inside strings. > Under case (2) I'd expect that query to return 456 (because $foo > resolves to "bar"). (Similar to how psql would resolve :'foo' to > 'bar'.) Yes, I suspect this is the correct interpretation, but agree the wording could use some massaging, especially since path identifiers cannot start with a dollar sign anyway. Perhaps: "If the key name starts with $ or does not meet the JavaScript rules for an identifier, it must be enclosed in double quotes to make it a string literal." > Variables already work in array accessors and table 8.25 says that "The > specified index can be an integer, as well as an expression returning a > single numeric value [...]". A variable is such an expression. > >=> select jsonb_path_query('[2,3,5]', '$[$i]', '{"i":1}'); > jsonb_path_query >-- > 3 >(1 row) > > So I'd expect a similar behavior for member accessors as well when > seeing ."$varname" in the same table. Oh, interesting point! Now I wonder if the standard has this inconsistency (and is aware of it). > Yes, I think so. That would be case C in the spec excerpt provided by > Peter. So it's just a key name that happens to contain (but not start > with) the dollar sign. Exactly. It also matches the doc you quote above. Something would have to change in src/backend/utils/adt/jsonpath_scan.l to fix that, but that file makes my eyes water, so I’m not gonna take a stab at it. :-) D
Re: Q: Escapes in jsonpath Idents
On Apr 24, 2024, at 05:46, Peter Eisentraut wrote: > I have committed this patch, and backpatched it, as a bug fix, because the > existing description was wrong. To keep the patch minimal for backpatching, > I didn't do the conversion to a list. I'm not sure I like that anyway, > because it tends to draw more attention to that part over the surrounding > parts, which didn't seem appropriate in this case. But anyway, if you have > any more non-bug-fix editing in this area, which would then target PG18, > please send more patches. Makes sense, that level of detail gets into the weeks so maybe doesn’t need to be quite so prominent as a list. Thank you! David
Re: Q: Escapes in jsonpath Idents
On Apr 24, 2024, at 05:51, Peter Eisentraut wrote: >A is classified as follows. > >Case: > >a) A that is a is acontext variable>. > >b) A that begins with is a > . > >c) Otherwise, a is a . > > Does this help? I wasn't following all the discussion to see if there is > anything wrong with the implementation. Yes, it does, as it ties the special meaning of the dollar sign to the *beginning* of an expression. So it makes sense that this would be an error: david=# select '$.$foo'::jsonpath; ERROR: syntax error at or near "$foo" of jsonpath input LINE 1: select '$.$foo'::jsonpath; ^ But I’m less sure when a dollar sign is used in the *middle* (or end) of a json path identifier: david=# select '$.xx$foo'::jsonpath; ERROR: syntax error at or near "$foo" of jsonpath input LINE 1: select '$.xx$foo'::jsonpath; ^ Perhaps that should be valid? Best, David
Re: RFC: Additional Directory for Extensions
On Apr 4, 2024, at 1:20 PM, David E. Wheeler wrote: > I’ve added some docs based on your GUC description; updated patch attached. Here’s a rebase. I realize this probably isn’t going to happen for 17, given the freeze, but I would very much welcome feedback and pointers to address concerns about providing a second directory for extensions and DSOs. Quite a few people have talked about the need for this in the Extension Mini Summits[1], so I’m sure I could get some collaborators to make improvements or look at a different approach. Best, David [1] https://justatheory.com/2024/02/extension-ecosystem-summit/#extension-ecosystem-mini-summit v3-0001-Add-extension_directory-GUC.patch Description: Binary data
Re: ❓ JSON Path Dot Precedence
On Apr 10, 2024, at 10:29, Peter Eisentraut wrote: > So the whole thing is > > > > The syntax of and is then punted to > ECMAScript 5.1. > > 0x2 is a HexIntegerLiteral. (There can be no dots in that.) > > p10 is an Identifier. > > So I think this is all correct. That makes sense, thanks. It’s just a little odd to me that the resulting path isn’t a query at all. To Erik’s point: what path can `'0x2.p10` even select? Best, David
Re: ❓ JSON Path Dot Precedence
On Apr 7, 2024, at 15:46, Erik Wienhold wrote: > I guess jsonpath assumes that hex, octal, and binary literals are > integers. So there's no ambiguity about any fractional part that might > follow. Yeah, that’s what the comment in the flex file says: https://github.com/postgres/postgres/blob/b4a71cf/src/backend/utils/adt/jsonpath_scan.l#L102-L105 > Also, is there even a use case for path "0x2.p10"? The path has to > start with "$" or ("@" in case of a filter expression), doesn't it? And > it that case it doesn't parse: > >test=# select '$.0x2.p10'::jsonpath; >ERROR: trailing junk after numeric literal at or near ".0x" of jsonpath > input >LINE 1: select '$.0x2.p10'::jsonpath; > > Even with extra whitespace: > >test=# select '$ . 0x2 . p10'::jsonpath; >ERROR: syntax error at or near "0x2" of jsonpath input >LINE 1: select '$ . 0x2 . p10'::jsonpath; > > Or should it behave like an array accessor? Similar to: > >test=# select jsonb_path_query('[0,1,{"p10":42},3]', > '$[0x2].p10'::jsonpath); > jsonb_path_query >-- > 42 >(1 row) I too am curious why these parse successfully, but don’t appear to be useful. Best, David
❓ JSON Path Dot Precedence
Hello Hackers, A question about the behavior of the JSON Path parser. The docs[1] have this to say about numbers: > Numeric literals in SQL/JSON path expressions follow JavaScript rules, which > are different from both SQL and JSON in some minor details. For example, > SQL/JSON path allows .1 and 1., which are invalid in JSON. In other words, this is valid: david=# select '2.'::jsonpath; jsonpath -- 2 But this feature creates a bit of a conflict with the use of a dot for path expressions. Consider `0x2.p10`. How should that be parsed? As an invalid decimal expression ("trailing junk after numeric literal”), or as a valid integer 2 followed by the path segment “p10”? Here’s the parser’s answer: david=# select '0x2.p10'::jsonpath; jsonpath --- (2)."p10" So it would seem that, other things being equal, a path key expression (`.foo`) is slightly higher precedence than a decimal expression. Is that intentional/correct? Discovered while writing my Go lexer and throwing all of Go’s floating point literal examples[2] at it and comparing to the Postgres path parser. Curiously, this is only an issue for 0x/0o/0b numeric expressions; a decimal expression does not behave in the same way: david=# select '2.p10'::jsonpath; ERROR: trailing junk after numeric literal at or near "2.p" of jsonpath input LINE 1: select '2.p10'::jsonpath; Which maybe seems a bit inconsistent. Thoughts on what the “correct” behavior should be? Best, David [1]: https://www.postgresql.org/docs/devel/datatype-json.html#DATATYPE-JSONPATH [2]: https://tip.golang.org/ref/spec#Floating-point_literals
Re: RFC: Additional Directory for Extensions
On Apr 3, 2024, at 12:46 PM, Christoph Berg wrote: > Thanks for bringing this up, I should have submitted this years ago. > (The patch is originally from September 2020.) That’s okay, it’s still 2020 in some ways. > I designed the patch to require a superuser to set it, so it doesn't > matter very much by which mechanism it gets updated. There should be > little reason to vary it at run-time, so I'd be fine with requiring a > restart, but otoh, why restrict the superuser from reloading it if > they know what they are doing? I think that’s fair. I’ll keep it requiring a restart now, on the theory it would be easier to loosen it later than have to tighten it later. > I'm not sure the concept of "core libraries" exists. PG happens to > dlopen things at run time, and it doesn't know/care if they were > installed by users or by the original PG server. Also, an exploited > libpgtypes library is not worse than any other untrusted "user" > library, so you really don't want to allow users to provide their own > .so files, no matter by what mechanism. Yes, I guess my concern is whether it could be used to “shadow” core libraries. Maybe it’s no different, really. >> This would also allow the lookup of extension libraries prefixed by the >> directory field from the control file, which would enable much tidier >> extension installation: The control file, SQL scripts, and DSOs could all be >> in a single directory for an extension. > > Nice idea, but that would mean installing .so files into PGSHAREDIR. > Perhaps the whole extension stuff would have to move to PKGLIBDIR > instead. Yes, I was just poking around the code, and realized that, when extension functions are created they may or may not not use `MODULE_PATHNAME`, but in any event, there is nothing different about loading an extension DSO than any other DSO. I was hoping to find a path where it knows it’s opening a DSO for the purpose of an extension, so we could limit the lookup there. But that does not (currently) exist. Maybe we could add an `$extensiondir` variable to complement `$libdir`? Or is PGKLIBDIR is the way to go? I’m not familiar with it. It looks like extension JIT files are put there already. > Fwiw, I wrote this patch to solve the problem of testing extensions at > build-time where the build process does not have write access to > PGSHAREDIR. It solves that problem quite well, almost all PG > extensions have build-time test coverage now (where there was > basically 0 before). Yeah, good additional use case. On Apr 3, 2024, at 1:03 PM, Christoph Berg wrote: > Also, it's called extension "destdir" because it behaves like DESTDIR > in Makefiles: It prepends the given path to the path that PG is trying > to open when set. So it doesn't allow arbitrary new locations as of > now, just /home/build/foo-1/debian/foo/usr/share/postgresql/17/extension > in addition to /usr/share/postgresql/17/extension. (That is what the > Debian package build process needs, so that restriction/design choice > made sense. Right, this makes perfect sense, in that you don’t have to copy all the extension files from the destdir to the SHAREDIR to test them, which I imagine could be a PITA. > That's also included in the current GUC description: > > This directory is prepended to paths when loading extensions > (control and SQL files), and to the '$libdir' directive when > loading modules that back functions. The location is made > configurable to allow build-time testing of extensions that do not > have been installed to their proper location yet. > > Perhaps I should have included a more verbose "NOT FOR PRODUCTION" > there. The use cases I described upthread are very much production use cases. Do you think it’s not for production just because we need to really think it through? I’ve added some docs based on your GUC description; updated patch attached. Best, David v2-0001-Add-extension_directory-GUC.patch Description: Binary data
Re: RFC: Additional Directory for Extensions
On Apr 3, 2024, at 8:54 AM, David E. Wheeler wrote: > Yes, I like the suggestion to make it require a restart, which lets the > sysadmin control it and not limited to whatever the person who compiled it > thought would make sense. Here’s a revision of the Debian patch that requires a server start. However, in studying the patch, it appears that the `extension_directory` is searched for *all* shared libraries, not just those being loaded for an extension. Am I reading the `expand_dynamic_library_name()` function right? If so, this seems like a good way for a bad actor to muck with things, by putting an exploited libpgtypes library into the extension directory, where it would be loaded in preference to the core libpgtypes library, if they couldn’t exploit the original. I’m thinking it would be better to have the dynamic library lookup for extension libraries (and LOAD libraries?) separate, so that the `extension_directory` would not be used for core libraries. This would also allow the lookup of extension libraries prefixed by the directory field from the control file, which would enable much tidier extension installation: The control file, SQL scripts, and DSOs could all be in a single directory for an extension. Thoughts? Best, David v1-0001-Add-extension_directory-GUC.patch Description: Binary data
Re: RFC: Additional Directory for Extensions
On Apr 3, 2024, at 8:54 AM, David E. Wheeler wrote: > Yes, I like the suggestion to make it require a restart, which lets the > sysadmin control it and not limited to whatever the person who compiled it > thought would make sense. Or SIGHUP? D
Re: RFC: Additional Directory for Extensions
On Apr 3, 2024, at 3:57 AM, walt...@technowledgy.de wrote: > I can also imagine that it would be very helpful in a container setup to be > able to set an environment variable with this path instead of having to > recompile all of postgres to change it. Yes, I like the suggestion to make it require a restart, which lets the sysadmin control it and not limited to whatever the person who compiled it thought would make sense. Best, David
RFC: Additional Directory for Extensions
Hackers, In the Security lessons from liblzma thread[1], walther broached the subject of an extension directory path[1]: > Also a configurable directoy to look up extensions, possibly even to be > changed at run-time like [2]. The patch says this: > >> This directory is prepended to paths when loading extensions (control and >> SQL files), and to the '$libdir' directive when loading modules that back >> functions. The location is made configurable to allow build-time testing of >> extensions that do not have been installed to their proper location yet. > > This seems like a great thing to have. This might also be relevant in > light of recent discussions in the ecosystem around extension management. That quotation comes from this Debian patch[2] maintained by Christoph Berg. I’d like to formally propose integrating this patch into the core. And not only because it’s overhead for package maintainers like Christoph, but because a number of use cases have emerged since we originally discussed something like this back in 2013[3]: Docker Immutability --- Docker images must be immutable. In order for users of a Docker image to install extensions that persist, they must create a persistent volume, map it to SHAREDIR/extensions, and copy over all the core extensions (or muck with symlink magic[4]). This makes upgrades trickier, because the core extensions are mixed in with third party extensions. By supporting a second directory pretended to the list of directories to search, as the Debian patch does, users of Docker images can keep extensions they install separate from core extensions, in a directory mounted to a persistent volume with none of the core extensions. Along with tweaking dynamic_library_path to support additional directories for shared object libraries, which can also be mounted to a separate path, we can have a persistent and clean separation of immutable core extensions and extensions installed at runtime. Postgres.app The Postgres.app project also supports installing extensions. However, because they must go into the SHAREDIR/extensions, once a user installs one the package has been modified and the Apple bundle signature will be broken. The OS will no longer be able to validate that the app is legit. If the core supported an additional extension (and PKGLIBDIR), it would allow an immutable PostgreSQL base package and still allow extensions to be installed into directories outside the app bundle, and thus preserve bundle signing on macOS (and presumably other systems --- is this the nix issue, too?) RFC --- I know there was some objection to changes like this in the past, but the support I’m seeing in the liblzma thread for making pkglibdir configurable me optimistic that this might be the right time to support additional configuration for the extension directory, as well, starting with the Debian patch, perhaps. Thoughts? I would be happy to submit a clean version of the Debian patch[2]. Best, David [1] https://www.postgresql.org/message-id/99c41b46-616e-49d0-9ffd-a29432cec818%40technowledgy.de [2] https://salsa.debian.org/postgresql/postgresql/-/blob/17/debian/patches/extension_destdir?ref_type=heads [3] https://www.postgresql.org/message-id/flat/51ae0845.8010...@ocharles.org.uk [4] https://speakerdeck.com/ongres/postgres-extensions-in-kubernetes?slide=14
Re: Security lessons from liblzma
On Apr 1, 2024, at 06:55, walt...@technowledgy.de wrote: > Also a configurable directoy to look up extensions, possibly even to be > changed at run-time like [2]. The patch says this: > >> This directory is prepended to paths when loading extensions (control and >> SQL files), and to the '$libdir' directive when loading modules that back >> functions. The location is made configurable to allow build-time testing of >> extensions that do not have been installed to their proper location yet. > > This seems like a great thing to have. This might also be relevant in light > of recent discussions in the ecosystem around extension management. > > All the path-related issues have in common, that while it's easy to move > files around to their proper locations later, they all need to adjust > pg_config's output. Funny timing, I was planning to resurrect this old patch[1] and propose that patch this week. One of motivators is the increasing use of Docker images in Kubernetes to run Postgres, where there’s a desire to keep the core service and extensions immutable, and to have a second directory mounted to a persistent volume into which other extensions can be installed and preserved independently of the Docker image. The current approach involves symlinking shenanigans[2] that complicate things pretty substantially, making it more difficult to administer. A second directory fit for purpose would be far better. There are some other motivators, so I’ll do some additional diligence and start a separate thread (or reply to the original[3]). Best, David [1] https://commitfest.postgresql.org/5/170/ [2] https://speakerdeck.com/ongres/postgres-extensions-in-kubernetes?slide=14 [3] https://www.postgresql.org/message-id/flat/51AE0845.8010600%40ocharles.org.uk
Re: Patch: Add parse_type Function
On Mar 20, 2024, at 17:23, Tom Lane wrote: > Pushed with some editorialization. Mostly, I whacked the > documentation around pretty heavily: we have a convention for what > examples in function descriptions should look like, and this wasn't > it. Not entirely your fault, since some nearby entries in that > table hadn't gotten the word either. Ah, great, and your wording on the parser error issue is much better, thank you! Best, David
Re: Q: Escapes in jsonpath Idents
On Mar 17, 2024, at 20:09, Erik Wienhold wrote: > > On 2024-03-17 20:50 +0100, David E. Wheeler wrote: >> On Mar 17, 2024, at 15:12, Erik Wienhold wrote: >>> So I think it makes sense to reword the entire backslash part of the >>> paragraph and remove references to JSON entirely. The attached patch >>> does that and also formats the backslash escapes as a bulleted list for >>> readability. >> >> Ah, it’s JavaScript format, not JSON! This does clarify things quite >> nicely, thank you. Happy to add my review once it’s in a commit fest. > > Thanks. https://commitfest.postgresql.org/48/4899/ Applies cleanly, `make -C doc/src/sgml check` runs without error. Doc improvement welcome and much clearer than before. > I had the same reasoning while writing my first reply but scrapped that > part because I found it obvious: That jsonpath is parsed before calling > jsonb_path_exists() and therefore the parser has no context about any > variables, which might not even be hardcoded but may result from a > query. Right, there’s a chicken/egg problem. > Unfortunately, I don't have access to that part of the SQL spec. So I > don't know how the jsonpath grammar is specified. Seems quite logical; I think it should be documented, but I’d also be interested to know what the 2016 and 2023 standards say, exactly. > Also checked git log src/backend/utils/adt/jsonpath_scan.l for some > insights but haven't found any yet. Everybody’s taking shortcuts relative to the standard, AFAICT. For example, jsonpath_scan.l matches unqouted identifiers with these two regular expressions: {other}+ \/\* \\. Plus the backslash escapes. {other} is defined as: /* "other" means anything that's not special, blank, or '\' or '"' */ other [^\?\%\$\.\[\]\{\}\(\)\|\&\!\=\<\>\@\#\,\*:\-\+\/\\\" \t\n\r\f] Which is wy more liberal than the ECMA standard[1], by my reading, but the MSDN[2] description is quite succinct (thanks for the links!): > In JavaScript, identifiers are commonly made of alphanumeric characters, > underscores (_), and dollar signs ($). Identifiers are not allowed to start > with numbers. However, JavaScript identifiers are not only limited to ASCII — > many Unicode code points are allowed as well. Namely, any character in the > ID_Start category can start an identifier, while any character in the > ID_Continue category can appear after the first character. ID_Start[3] and ID_Continue[4] point to the unicode standard codes lister, nether of which reference Emoji. Sure enough, in Safari: > x = {"": true} < {: true} > x. < SyntaxError: Invalid character '\ud83c’ But in Postgres jsonpath: david=# select '$.'::jsonpath; jsonpath -- $."" If the MSDN references to ID_Start and ID_Continue are correct, then the Postgres path parser is being overly-liberal. Maybe that’s totally fine? Not sure what should be documented and what’s not worth it. Aside: I’m only digging into these details because I’m busy porting the path parser, so trying to figure out where to be compatible and where not to. So far I’m rejecting '$' (but allowing '\$' and '\u0024') but taking advantage of the unicode support in Go to specifically validate against ID_Start and ID_Continue. Best, David [1] https://262.ecma-international.org/#sec-identifier-names [2] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Lexical_grammar#identifiers [3] https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BID_Start%7D [4] https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BID_Continue%7D
Re: Q: Escapes in jsonpath Idents
On Mar 17, 2024, at 15:12, Erik Wienhold wrote: > Hi David, Hey Erik. Thanks for the detailed reply and patch! > So I think it makes sense to reword the entire backslash part of the > paragraph and remove references to JSON entirely. The attached patch > does that and also formats the backslash escapes as a bulleted list for > readability. Ah, it’s JavaScript format, not JSON! This does clarify things quite nicely, thank you. Happy to add my review once it’s in a commit fest. > The first case ($.$foo) is in line with the restriction on member > accessors that you quoted first. Huh, that’s now how I read it. Here it is again: >> Member accessor that returns an object member with the specified >> key. If the key name matches some named variable starting with $ or >> does not meet the JavaScript rules for an identifier, it must be >> enclosed in double quotes to make it a string literal. Note that in my example `$foo` does not match a variable. I mean it looks like a variable, but none is used here. I guess it’s being conservative because it might be used in one of the functions, like jsonb_path_exists(), to which variables might be passed. > The error message 'syntax error at or near "$oo" of jsonpath input' for > the second case ($.f$oo), however, looks as if the scanner identifies > '$oo' as a variable instead of contiuing the scan of identifier (f$oo) > for the member accessor. Looks like a bug to me because a variable > doesn't even make sense in that place. Right. Maybe the docs should be updated to say that a literal dollar sign isn’t supported in identifiers, unlike in JavaScript, except through escapes like this: > What works though, besides double quoting, is escaping the dollar sign: > > regress=# select '$.\u0024foo'::jsonpath; > jsonpath > -- > $."$foo" > (1 row) > > And we've come full circle :) Best, David
Re: Q: Escapes in jsonpath Idents
On Mar 16, 2024, at 14:39, David E. Wheeler wrote: > I went looking for the JavaScript rules for an identifier and found this in > the MDN docs[2]: > >> In JavaScript, identifiers can contain Unicode letters, $, _, and digits >> (0-9), but may not start with a digit. An identifier differs from a string >> in that a string is data, while an identifier is part of the code. In >> JavaScript, there is no way to convert identifiers to strings, but sometimes >> it is possible to parse strings into identifiers. Coda: Dollar signs don’t work at all outside double-quoted string identifiers: david=# select '$.$foo'::jsonpath; ERROR: syntax error at or near "$foo" of jsonpath input LINE 1: select '$.$foo'::jsonpath; ^ david=# select '$.f$oo'::jsonpath; ERROR: syntax error at or near "$oo" of jsonpath input LINE 1: select '$.f$oo'::jsonpath; ^ david=# select '$."$foo"'::jsonpath; jsonpath -- $."$foo" This, too, contradicts the MDM definition an identifier (and some quick browser tests). Best, David
Q: Escapes in jsonpath Idents
Hackers, The jsonpath doc[1] has an excellent description of the format of strings, but for unquoted path keys, it simply says: > Member accessor that returns an object member with the specified key. If the > key name matches some named variable starting with $ or does not meet the > JavaScript rules for an identifier, it must be enclosed in double quotes to > make it a string literal. I went looking for the JavaScript rules for an identifier and found this in the MDN docs[2]: > In JavaScript, identifiers can contain Unicode letters, $, _, and digits > (0-9), but may not start with a digit. An identifier differs from a string in > that a string is data, while an identifier is part of the code. In > JavaScript, there is no way to convert identifiers to strings, but sometimes > it is possible to parse strings into identifiers. However, the Postgres parsing of jsonpath keys appears to follow the same rules as strings, allowing backslash escapes: david=# select '$.fo\u00f8 == $x'::jsonpath; jsonpath --- ($."foø" == $"x") This would seem to contradict the documentation. Is this behavior required by the SQL standard? Do the docs need updating? Or should the code actually follow the JSON identifier behavior? Thanks, David PS: Those excellent docs on strings mentions support for \v, but the grammar in the right nav of https://www.json.org/json-en.html does not. Another bonus feature? [1]: https://www.postgresql.org/docs/16/datatype-json.html#DATATYPE-JSONPATH [2]: https://developer.mozilla.org/en-US/docs/Glossary/Identifier
Re: Patch: Add parse_type Function
On Mar 7, 2024, at 23:39, Erik Wienhold wrote: > I think you need to swap the examples. The text mentions the error case > first and the NULL case second. 臘♂️ Thanks, fixed in the attached patch. David v11-0001-Add-to_regtypemod-SQL-function.patch Description: Binary data
Re: Patch: Add parse_type Function
Hello Hackers, On Feb 25, 2024, at 13:00, David E. Wheeler wrote: >> postgres=# SELECT to_regtypemod('timestamp(-4)'); >> ERROR: syntax error at or near "-" >> LINE 1: SELECT to_regtypemod('timestamp(-4)'); >> ^ >> CONTEXT: invalid type name "timestamp(-4)" >> >> postgres=# SELECT to_regtypemod('text(-4)'); >> ERROR: type modifier is not allowed for type "text" > > Yeah, there was quite a bit of discussion of this issue back in September[1]. > >> This behaviour is mentioned in the documentation, so I'd say it is ok. > > This is my attempt to make it clearer that it can return an error, but I > don’t love the wording TBH. I’ve rebased the patch and, in an attempt to clarify this behavior, added a couple of examples to the docs for to_regtype. Updated patch attached. Best, David v10-0001-Add-to_regtypemod-SQL-function.patch Description: Binary data
Re: An improved README experience for PostgreSQL
On Feb 28, 2024, at 1:51 PM, Alvaro Herrera wrote: > *IF* people don't go overboard, yes. I agree, but let's keep an eye so > that it doesn't become an unreadable mess. I've seen some really > horrible markdown files that I'm sure most of you would object to. Markdown++ IME the keys to decent-looking Markdown are: 1. Wrapping lines to a legible width (76-80 chars) 2. Link references rather than inline links I try to follow these for my blog; posts end up looking like this: https://justatheory.com/2024/02/extension-metadata-typology.text (Append `.text` to any post to see the raw(ish) Markdown. Best, David
Re: Patch: Add parse_type Function
On Feb 24, 2024, at 19:11, Jim Jones wrote: >> What’s the protocol for marking a patch ready for committer? > > I guess after the review of the last assigned reviewer Oh, I didn’t realize someone was assigned. :-) > The fact that a completely invalid type returns NULL .. > > SELECT to_regtypemod('foo'); > to_regtypemod > --- > > (1 row) > > > .. but a "partially" valid one returns an error might be confusing > > postgres=# SELECT to_regtypemod('timestamp(-4)'); > ERROR: syntax error at or near "-" > LINE 1: SELECT to_regtypemod('timestamp(-4)'); > ^ > CONTEXT: invalid type name "timestamp(-4)" > > postgres=# SELECT to_regtypemod('text(-4)'); > ERROR: type modifier is not allowed for type "text" Yeah, there was quite a bit of discussion of this issue back in September[1]. > This behaviour is mentioned in the documentation, so I'd say it is ok. This is my attempt to make it clearer that it can return an error, but I don’t love the wording TBH. > I would personally prefer either NULL or an error in both cases, but I > can totally live with the current design. SAME. Best, David [1] https://www.postgresql.org/message-id/flat/09f9cad6-5096-43cc-b6a7-685703e47...@justatheory.com
Re: Patch: Add parse_type Function
On Feb 21, 2024, at 19:13, David E. Wheeler wrote: > Thanks. Anyone else? Or is it ready for committer? What’s the protocol for marking a patch ready for committer? Thanks, David
Re: Patch: Add parse_type Function
On Feb 21, 2024, at 17:19, Erik Wienhold wrote: > Thanks David! LGTM. Thanks. Anyone else? Or is it ready for committer? Best, David
Re: Patch: Add parse_type Function
On Feb 21, 2024, at 16:43, Erik Wienhold wrote: > The docs still state that to_regtypemod() has a named parameter, which > is not the case per pg_proc.dat. Bah, I cleaned it up before but somehow put it back. Thanks for the catch. Fixed. Best, David v9-0001-Add-to_regtypemod-SQL-function.patch Description: Binary data
Re: to_regtype() Raises Error
On Feb 21, 2024, at 11:54 AM, David Wheeler wrote: > Merged this change into the [to_regtypemod > patch](https://commitfest.postgresql.org/47/4807/), which has exactly the > same issue. > > The new status of this patch is: Needs review Bah, withdrawn. D
Re: Patch: Add parse_type Function
On Feb 21, 2024, at 11:18, Erik Wienhold wrote: > Thanks. But it's an applefile again, not a patch :P WTF. I still have that feature disabled. Oh, I think I deleted the file after dragged it into Mail but before sending, because it’s empty everywhere I look. 臘♂️ Let’s try that again. Best, David v8-0001-Add-to_regtypemod-SQL-function.patch Description: Binary data
Re: Patch: Add parse_type Function
On Feb 20, 2024, at 21:09, Erik Wienhold wrote: > The references are printed as "???" right now. Can be fixed with > xreflabel="format_type" and xreflabel="to_regtype" on those > elements. Thanks. > The docs show parameter name "type" but pg_proc.data does not define > proargnames. So the to_regtypemod() cannot be called using named > notation: > > test=> select to_regtypemod(type => 'int'::text); > ERROR: function to_regtypemod(type => text) does not exist Yes, this format is identical to that of to_regtype(): david=# select to_regtype(type => 'int'::text); ERROR: function to_regtype(type => text) does not exist > +Parses a string of text, extracts a potential type name from it, and >> +translates its typmod, the modifier for the type, if any. Failure to > > s/typmod, the modifier of the type/type modifier/ > > Because format_type() already uses "type modifier" and I think it helps > searchability to use the same wording. Yes, definitely better wording, thanks. V8 attached. Best, David v8-0001-Add-to_regtypemod-SQL-function.patch Description: application/applefile
Re: to_regtype() Raises Error
On Feb 5, 2024, at 09:01, David E. Wheeler wrote: > Ah, thank you. Updated patch attached. I’ve moved this patch into the to_regtype patch thread[1], since it exhibits the same behavior. Best, David [1] https://www.postgresql.org/message-id/60ef4e11-bc1c-4034-b37b-695662d28...@justatheory.com
Re: Patch: Add parse_type Function
On Feb 20, 2024, at 01:30, jian he wrote: > the second hint `-- grammar error expected` seems to contradict with > the results? Quite right, thank you, that’s actually a trapped error. I’ve tweaked the comments and their order in v7, attached. This goes back to the discussion of the error raising of to_regtype[1], so I’ve incorporated the patch from that thread into this patch, and set up the docs for to_regtypemod() with similar information. The wording is still a little opaque for my taste, though, written more for someone who knows a bit about the internals, but it’s a start. I’ve also fixed the wayward parameter in the function signature in the docs, and added a note about why I’ve also patched genbki.pl. Best, David [1] https://www.postgresql.org/message-id/flat/57E1FDDC-5A38-452D-82D7-A44DA2E13862%40justatheory.com#1ae0b11634bc33c7ad3cd728e43d504e v7-0001-Add-to_regtypemod-SQL-function.patch Description: Binary data
Re: Patch: Add parse_type Function
On Feb 19, 2024, at 21:58, Erik Wienhold wrote: > See the patch I wrote for my benchmarks. But it's pretty easy anyway to > cut down parse_type() ;) LOL, I missed that, just wrote it myself in the last hour. :-) v6 attached. > But you don't actually need reformat_type() in pgTAP. You can just get > the type OID and modifier of the want_type and have_type and compare > those. Then use format_type() for the error message. Looks a bit > cleaner to me than doing the string comparison. Fair. > On second thought, I guess comparing the reformatted type names is > necessary in order to have a uniform API on older Postgres releases > where pgTAP has to provide its own to_regtypmod() based on typmodin > functions. Maybe. Worth playing with. >> For the latter, it could easily be an example in the docs. > > Can be mentioned right under format_type(). Well I included it in the to_regtypemod docs here, but could so either. Best, David v6-0001-Add-to_regtypemod-SQL-function.patch Description: Binary data
Re: Patch: Add parse_type Function
On Feb 19, 2024, at 15:47, Tom Lane wrote: >> 1. Add a to_regtypmod() for those who just want the typemod. > > Seems like there's a good case for doing that. I’ll work on that. > I'm less thrilled about that, mainly because I can't think of > a good name for it ("format_type_string" is certainly not that). > Is the use-case for this functionality really strong enough that > we need to provide it as a single function rather than something > assembled out of spare parts? Not essential for pgTAP, no, as we can just use the parts. It was the typmod bit that was missing. On Feb 19, 2024, at 17:13, Tom Lane wrote: > After some time ruminating, a couple of possibilities occurred to > me: > reformat_type(text) > canonical_type_name(text) I was just thinking about hitting the thesaurus entry for “canonical”, but I quite like reformat_type. It’s super clear and draws the parallel to format_type() more clearly. Will likely steal the name for pgTAP if we don’t add it to core. :-) > I'm still unconvinced about that, though. I guess the questions are: * What are the other use cases for it? * How obvious is it how to do it? For the latter, it could easily be an example in the docs. Best, David
Re: Patch: Add parse_type Function
On Feb 18, 2024, at 15:55, Erik Wienhold wrote: >> The overhead of parse_type_and_format can be related to higher planning >> time. PL/pgSQL can assign composite without usage FROM clause. > > Thanks, didn't know that this makes a difference. In that case both > variants are on par. Presumably this is a side-effect of the use of a RECORD here, which requires a FROM clause in pure SQL, yes? The only way I can think of to avoid that is to: 1. Add a to_regtypmod() for those who just want the typemod. 2. Add a format_type_string() function that returns a string, the equivalent of this function but in C: CREATE FUNCTION format_type_string(text) RETURNS TEXT AS $$ SELECT format_type(to_regtype($1), to_regtypmod($1)) $$; We could also keep the record-returning function for users who want both the regtype and the regytypemod at once, but with the other two I consider it less essential. Thoughts? Best, David