Re: PATCH: jsonpath string methods: lower, upper, initcap, l/r/btrim, replace, split_part
> On 27 Sep 2024, at 12:45 PM, David E. Wheeler wrote: > > On Sep 26, 2024, at 13:59, Florents Tselai wrote: > >> Speaking of extensible: the jsonpath standard does mention function >> extensions [1] , >> so it looks like we're covered by the standard, and the mutability aspect is >> an implementation detail. No? > > That’s not the standard used for Postgres jsonpath. Postgres follows the > SQL/JSON standard in the SQL standard, which is not publicly available, but a > few people on the list have copies they’ve purchased and so could provide > some context. > > In a previous post I wondered if the SQL standard had some facility for > function extensions, but I suspect not. Maybe in the next iteration? > >> And having said that, the whole jsonb/jsonpath parser/executor >> infrastructure is extremely powerful >> and kinda under-utilized if we use it "only" for jsonpath. >> Tbh, I can see it supporting more specific DSLs and even offering hooks for >> extensions. >> And I know for certain I'm not the only one thinking about this. >> See [2] for example where they've lifted, shifted and renamed the >> jsonb/jsonpath infra to build a separate language for graphs > > I’m all for extensibility, though jsonpath does need to continue to comply > with the SQL standard. Do you have some idea of the sorts of hooks that would > allow extension authors to use some of that underlying capability? Re-tracing what I had to do 1. Define a new JsonPathItemType jpiMyExtType and map it to a JsonPathKeyword 2. Add a new JsonPathKeyword and make the lexer and parser aware of that, 3. Tell the main executor executeItemOptUnwrapTarget what to do when the new type is matched. I think 1, 2 are the trickiest because they require hooks to jsonpath_scan.l and parser jsonpath_gram.y 3. is the meat of a potential hook, which would be something like extern JsonPathExecResult executeOnMyJsonpathItem(JsonPathExecContext *cxt, JsonbValue *jb, JsonValueList *found); This should be called by the main executor executeItemOptUnwrapTarget when it encounters case jpiMyExtType It looks like quite an endeavor, to be honest.
Re: PATCH: jsonpath string methods: lower, upper, initcap, l/r/btrim, replace, split_part
On Thu, Sep 26, 2024 at 1:55 PM Alexander Korotkov wrote: > On Thu, Sep 26, 2024 at 12:04 AM Tom Lane wrote: > > Florents Tselai writes: > > > This patch is a follow-up and generalization to [0]. > > > It adds the following jsonpath methods: lower, upper, initcap, > l/r/btrim, > > > replace, split_part. > > > > How are you going to deal with the fact that this makes jsonpath > > operations not guaranteed immutable? (See commit cb599b9dd > > for some context.) Those are all going to have behavior that's > > dependent on the underlying locale. > > > > We have the kluge of having separate "_tz" functions to support > > non-immutable datetime operations, but that way doesn't seem like > > it's going to scale well to multiple sources of mutability. > > While inventing "_tz" functions I was thinking about jsonpath methods > and operators defined in standard then. Now I see huge interest on > extending that. I wonder if we can introduce a notion of flexible > mutability? Imagine that jsonb_path_query() function (and others) has > another function which analyzes arguments and reports mutability. If > jsonpath argument is constant and all methods inside are safe then > jsonb_path_query() is immutable otherwise it is stable. I was > thinking about that back working on jsonpath, but that time problem > seemed too limited for this kind of solution. Now, it's possibly time > to shake off the dust from this idea. What do you think? > > -- > Regards, > Alexander Korotkov > Supabase > In case you're having a deja vu, while researching this I did come across [0] where disussing this back in 2019. In this patch I've conveniently left jspIsMutable and jspIsMutableWalker untouched and under the rug, but for the few seconds I pondered over this,the best answer I came with was a simple heuristic to what Alexander says above: if all elements are safe, then the whole jsp is immutable. If we really want to tackle this and make jsonpath richer though, I don't think we can avoid being a little more flexible/explicit wrt mutability. Speaking of extensible: the jsonpath standard does mention function extensions [1] , so it looks like we're covered by the standard, and the mutability aspect is an implementation detail. No? And having said that, the whole jsonb/jsonpath parser/executor infrastructure is extremely powerful and kinda under-utilized if we use it "only" for jsonpath. Tbh, I can see it supporting more specific DSLs and even offering hooks for extensions. And I know for certain I'm not the only one thinking about this. See [2] for example where they've lifted, shifted and renamed the jsonb/jsonpath infra to build a separate language for graphs [0] https://www.postgresql.org/message-id/capphfdvdci4iqnf9fhrktqhe-5_8hmzelt56drh+_rv2rnr...@mail.gmail.com [1] https://www.rfc-editor.org/rfc/rfc9535.html#name-function-extensions [2] https://github.com/apache/age/blob/master/src/include/utils/agtype.h
Re: Docs pg_restore: Shouldn't there be a note about -n ?
On Sat, Sep 21, 2024 at 9:48 PM Florents Tselai wrote: > > > > On 21 Sep 2024, at 9:22 PM, Tom Lane wrote: > > > > Florents Tselai writes: > >> Ah, swapped them by mistake on the previous email: > >> They're both available in the pg_dump and note on -n missing in > pg_restore. > >> The question remains though: > >> Shouldn’t there be a note about -n in pg_restore ? > > > > Probably. I see that pg_dump has a third copy of the exact same > > note for "-e". pg_restore lacks that switch for some reason, > > but this is surely looking mighty duplicative. I propose getting > > rid of the per-switch Notes and putting a para into the Notes > > section, along the lines of > > > >When -e, -n, or -t is specified, pg_dump makes no attempt to dump > >any other database objects that the selected object(s) might > >depend upon. Therefore, there is no guarantee that the results of > >a selective dump can be successfully restored by themselves into a > >clean database. > > Agree with that, but I think there should be a pointer like “see Notes” . > Otherwise I’m pretty sure most would expect pg doing magic. > Can’t remember I scrolledl to the bottom of a page “notes” after finding > the option I want. > > I would also add an example of what “depend upon” means, > To underline that it’s really not that uncommon. > Something like: > “If you pg_dump only with -t A and A has foreign key constraints to table > B, > Those constraints won’t succeed If B has not been already restored” > > Attached an idea. The A/B example may be wordy / redundant, but the -e note on the distinction between installing binaries / creating an extension I think it's important. v1-0001-Note-on-e-t-n.patch Description: Binary data
PATCH: jsonpath string methods: lower, upper, initcap, l/r/btrim, replace, split_part
Hello hackers, This patch is a follow-up and generalization to [0]. It adds the following jsonpath methods: lower, upper, initcap, l/r/btrim, replace, split_part. It makes jsonpath able to support expressions like these: select jsonb_path_query('" hElLo WorlD "', '$.btrim().lower().upper().lower().replace("hello","bye") starts with "bye"'); select jsonb_path_query('"abc~@~def~@~ghi"', '$.split_part("~@~", 2)') They, of course, forward their implementation to the internal pg_proc-registered function. As a first wip/poc I've picked the functions I typically need to clean up JSON data. I've also added a README.jsonpath with documentation on how to add a new jsonpath method. If I had this available when I started, it would have saved me some time. So, I am leaving it here for the next hacker. This patch is not particularly intrusive to existing code: Afaict, the only struct I've touched is JsonPathParseItem , where I added { JsonPathParseItem *arg0, *arg1; } method_args. Up until now, most of the jsonpath methods that accept arguments rely on left/right operands, which works, but it could be more convenient for future more complex methods. I've also added the appropriate jspGetArgX(JsonPathItem *v, JsonPathItem *a). Open items - What happens if the jsonpath standard adds a new method by the same name? A.D. mentioned this in [0] with the proposal of having a prefix like pg_ or initial-upper letter. - Still using the default collation like the rest of the jsonpath code. - documentation N/A yet - I do realize that the process of adding a new method sketches an imaginary. CREATE JSONPATH FUNCTION. This has been on the back of my mind for some time now, but I can't say I have an action plan for this yet. GitHub PR view if you prefer: https://github.com/Florents-Tselai/postgres/pull/18 [0] https://www.postgresql.org/message-id/flat/185BF814-9225-46DB-B1A1-6468CF2C8B63%40justatheory.com#1850a37a98198974cf543aefe225ba56 All the best, Flo v1-0001-This-patch-adds-the-following-string-processing-m.patch Description: Binary data
Re: Docs pg_restore: Shouldn't there be a note about -n ?
> On 21 Sep 2024, at 9:22 PM, Tom Lane wrote: > > Florents Tselai writes: >> Ah, swapped them by mistake on the previous email: >> They're both available in the pg_dump and note on -n missing in pg_restore. >> The question remains though: >> Shouldn’t there be a note about -n in pg_restore ? > > Probably. I see that pg_dump has a third copy of the exact same > note for "-e". pg_restore lacks that switch for some reason, > but this is surely looking mighty duplicative. I propose getting > rid of the per-switch Notes and putting a para into the Notes > section, along the lines of > >When -e, -n, or -t is specified, pg_dump makes no attempt to dump >any other database objects that the selected object(s) might >depend upon. Therefore, there is no guarantee that the results of >a selective dump can be successfully restored by themselves into a >clean database. Agree with that, but I think there should be a pointer like “see Notes” . Otherwise I’m pretty sure most would expect pg doing magic. Can’t remember I scrolledl to the bottom of a page “notes” after finding the option I want. I would also add an example of what “depend upon” means, To underline that it’s really not that uncommon. Something like: “If you pg_dump only with -t A and A has foreign key constraints to table B, Those constraints won’t succeed If B has not been already restored” > > and mutatis mutandis for pg_restore. > > regards, tom lane
Re: Docs pg_restore: Shouldn't there be a note about -n ?
On Sat, Sep 21, 2024 at 8:34 PM Florents Tselai wrote: > I’m in the process of trying to restore some PG15/16 backups in PG17. > > While playing with different -t and -n combinations I was browsing through > the docs. > > In *pg_restore* there are two notes about both -t / -n > > > When -n / -t is specified, pg_dump makes no attempt to ... > > In pg_dump though there’s the equivalent note only for the -t option. > > Shouldn’t it be a note as well for -n ? > > Otherwise I would expect -n to cascade the restore to objects in other > schemas; > Which I don’t think it does. > > Am I missing something? > Ah, swapped them by mistake on the previous email: They're both available in the pg_dump and note on -n missing in pg_restore. The question remains though: Shouldn’t there be a note about -n in pg_restore ?
Docs pg_restore: Shouldn't there be a note about -n ?
I’m in the process of trying to restore some PG15/16 backups in PG17. While playing with different -t and -n combinations I was browsing through the docs. In pg_restore there are two notes about both -t / -n > When -n / -t is specified, pg_dump makes no attempt to ... In pg_dump though there’s the equivalent note only for the -t option. Shouldn’t it be a note as well for -n ? Otherwise I would expect -n to cascade the restore to objects in other schemas; Which I don’t think it does. Am I missing something?
Re: [PATCH] WIP: replace method for jsonpath
On 18 Sep 2024, at 3:47 PM, Andrew Dunstan wrote: On 2024-09-18 We 4:23 AM, Peter Eisentraut wrote: On 17.09.24 21:16, David E. Wheeler wrote: On Sep 17, 2024, at 15:03, Florents Tselai wrote: Fallback scenario: make this an extension, but in a first pass I didn’t find any convenient hooks. One has to create a whole new scanner, grammar etc. Yeah, it got me thinking about the RFC-9535 JSONPath "Function Extension" feature[1], which allows users to add functions. Would be cool to have a way to register jsonpath functions somehow, but I would imagine it’d need quite a bit of specification similar to RFC-9535. Wouldn’t surprise me to see something like that appear in a future version of the spec, with an interface something like CREATE OPERATOR. Why can't we "just" call any suitable pg_proc-registered function from JSON path? The proposed patch routes the example '$.replace("hello","bye")' internally to the internal implementation of the SQL function replace(..., 'hello', 'bye'). Why can't we do this automatically for any function call in a JSON path expression? That might work. The thing that bothers me about the original proposal is this: what if we add a new non-spec jsonpath method and then a new version of the spec adds a method with the same name, but not compatible with our method? We'll be in a nasty place. At the very least I think we need to try hard to avoid that. Maybe we should prefix non-spec method names with "pg_", or maybe use an initial capital letter. If naming is your main reservation, then I take it you’re generally positive. Having said that, “pg_” is probably too long for a jsonpath expression, Most importantly though, “pg_” in my mind is a prefix for things like catalog lookup and system monitoring. Not a function that the average user would use. Thus, I lean towards initial-capital. The more general case would look like: A new jsonpath item of the format $.Func(arg1, …, argn) can be applied (recursively or not) to a json object. As a first iteration/version only pg_proc-registered functions of the format func(text, ...,) -> text are available. We can focus on oid(arg0) = TEXTOID and rettype = TEXTOID fist. The first arg0 has to be TEXTOID (the current json string) while subsequent args are provided by the user in the jsonpath expression. The functions we want to support will be part of jsonpath grammar and during execution we'll have enough info to find the appropriate PGFunction to call. What I'm missing yet is how we could handle vars jsonb, in case the user doesn't want to just hardcode the actual function arguments. Then resolving argtypes1...n is a bit more complex: The signature for jsonb_apply(doc jsonb, func text[, variadic "any" args1_n]); [0] Here, variadic "any" works beautifully, but that's a brand-new function. In existing jsonb{path} facilities, vars are jsonb objects which could work as well I think. Unless I'm missing something. [0] https://github.com/Florents-Tselai/jsonb_apply cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Get TupleDesc for extension-defined types
On Wed, Sep 18, 2024 at 1:09 PM Pavel Stehule wrote: > Hi > > st 18. 9. 2024 v 9:04 odesílatel Florents Tselai < > florents.tse...@gmail.com> napsal: > >> Correct me if I'm wrong, >> but for an extension that defines composite types, >> there's currently no easy way to get a TupleDesc, even for its own types. >> >> Something like >> TupleDesc get_extension_type_tupledesc(const char *extname, const char >> *typname) >> >> Here's a routine I've stolen borrowed from pramsey's code and have been >> using ever since. >> >> Could this be exposed in extension.h ? (probably without the version >> check) >> > > I don't think this functionality is generally useful. Wrapping > TypeGetTupleDesc(typoid, NIL) is very specific, and probably this code > should be inside the extension. > > Different question is API for searching in system catalog and > dependencies. I can imagine some functions like > That's a better phrasing > > Oid extid = get_extension_id(extname); > Oid objid = get_extension_object_id(extid, schema_can_be_null, name, > TYPEOID); // can be used for routine, table, ... > > tupdesc = TypeGetTupleDesc(objid, NIL); > These are valid. For context: The "problem" (inconvenience really) I'm trying to solve is this: Most extensions define some convenient PG_GETARG_MYTYPE(n) macros. When these types are varlena, things are easy. When they're composite types though things get more verbose. i.e. the lines of code the author needs to get from a Datum argument to struct MyType are too many and multiple extensions copy-paste the same logic. My hope is we could come up with a few routines that ease and standardize this a bit. You're right that extname isn't unique, so Oid should be the argument for extension, rather than char *extname, but in my mind the "default" is "the current extension" , but no arguing about that. > > Regards > > Pavel > > >
Re: [PATCH] WIP: replace method for jsonpath
> On 18 Sep 2024, at 11:23 AM, Peter Eisentraut wrote: > > On 17.09.24 21:16, David E. Wheeler wrote: >> On Sep 17, 2024, at 15:03, Florents Tselai wrote: >>> Fallback scenario: make this an extension, but in a first pass I didn’t >>> find any convenient hooks. >>> One has to create a whole new scanner, grammar etc. >> Yeah, it got me thinking about the RFC-9535 JSONPath "Function Extension" >> feature[1], which allows users to add functions. Would be cool to have a way >> to register jsonpath functions somehow, but I would imagine it’d need quite >> a bit of specification similar to RFC-9535. Wouldn’t surprise me to see >> something like that appear in a future version of the spec, with an >> interface something like CREATE OPERATOR. > > Why can't we "just" call any suitable pg_proc-registered function from JSON > path? The proposed patch routes the example '$.replace("hello","bye")' > internally to the internal implementation of the SQL function replace(..., > 'hello', 'bye'). Why can't we do this automatically for any function call in > a JSON path expression? > Well, we can. A couple of weeks ago, I discovered transform_jsonb_string_values, which is already available in jsonfuncs.h and that gave me the idea for this extension https://github.com/Florents-Tselai/jsonb_apply It does exactly what you’re saying: searches for a suitable pg_proc in the catalog, and directly applies it. select jsonb_apply('{ "id": 1, "name": "John", "messages": [ "hello" ] }', 'replace', 'hello', 'bye’); select jsonb_filter_apply('{ "id": 1, "name": "John", "messages": [ "hello" ] }', '{messages}', 'md5’); But, I don't know… jsonb_apply? That seemed “too fancy”/LISPy for standard Postgres. Now that you mention it, though, there’s an alternative of tweaking the grammar and calling the suitable text proc.
Get TupleDesc for extension-defined types
Correct me if I'm wrong, but for an extension that defines composite types, there's currently no easy way to get a TupleDesc, even for its own types. Something like TupleDesc get_extension_type_tupledesc(const char *extname, const char *typname) Here's a routine I've stolen borrowed from pramsey's code and have been using ever since. Could this be exposed in extension.h ? (probably without the version check) v1-0001-First-implementation-for-get_extension_type_tuple.patch Description: Binary data
Re: jsonb_strip_nulls with arrays?
On Tue, Sep 17, 2024 at 5:11 PM Andrew Dunstan wrote: > > On 2024-09-17 Tu 5:26 AM, Florents Tselai wrote: > > Currently: > > > jsonb_strip_nulls ( jsonb ) → jsonb > > Deletes all object fields that have null values from the given JSON value, > recursively. Null values that are not object fields are untouched. > > > > Null values that are not object fields are untouched. > > > Can we revisit this and make it work with arrays, too? > > Tbh, at first sight that looked like the expected behavior for me. > > That is strip nulls from arrays as well. > > > This has been available since 9.5 and iiuc predates lots of the jsonb > array work. > > > I don't think that's a great idea. Removing an object field which has a > null value shouldn't have any effect on the surrounding data, nor really > any on other operations (If you try to get the value of the missing field > it should give you back null). But removing a null array member isn't like > that at all - unless it's the trailing member of the array it will renumber > all the succeeding array members. > > And I don't think we should be changing the behaviour of a function, that > people might have been relying on for the better part of a decade. > > > > In practice, though, whenever jsonb_build_array is used (especially with > jsonpath), > > a few nulls do appear in the resulting array most of the times, > > Currently, there’s no expressive way to remove this. > > > We could also have jsonb_array_strip_nulls(jsonb) as well > > > We could, if we're going to do anything at all in this area. Another > possibility would be to provide a second optional parameter for > json{b}_strip_nulls. That's probably a better way to go. > Here's a patch that adds that argument (only for jsonb; no json implementation yet) That's how I imagined & implemented it, but there may be non-obvious pitfalls in the semantics. as-is version select jsonb_strip_nulls('[1,2,null,3,4]'); jsonb_strip_nulls [1, 2, null, 3, 4] (1 row) select jsonb_strip_nulls('{"a":1,"b":null,"c":[2,null,3],"d":{"e":4,"f":null}}'); jsonb_strip_nulls {"a": 1, "c": [2, null, 3], "d": {"e": 4}} (1 row) with the additional boolean flag added select jsonb_strip_nulls('[1,2,null,3,4]', *true*); jsonb_strip_nulls --- [1, 2, 3, 4] (1 row) select jsonb_strip_nulls('{"a":1,"b":null,"c":[2,null,3],"d":{"e":4,"f":null}}', *true*); jsonb_strip_nulls -- {"a": 1, "c": [2, 3], "d": {"e": 4}} (1 row) GH PR view: https://github.com/Florents-Tselai/postgres/pull/6/files > cheers > > > andrew > > > -- > Andrew Dunstan > EDB: https://www.enterprisedb.com > > v1-0002-Add-docs-for-strip_in_arrays-argument.patch Description: Binary data v1-0001-jsonb_strip_nulls-jsonb-bool-wip.patch Description: Binary data
Re: [PATCH] WIP: replace method for jsonpath
> On 17 Sep 2024, at 9:40 PM, David E. Wheeler wrote: > > On Sep 16, 2024, at 18:39, Florents Tselai wrote: > >> Here’s an updated version of this patch. > > Oh, nice function. > > But a broader question for hackers: Is replace() specified in the SQL/JSON > spec? If not, what’s the process for evaluating whether or not to add > features not specified by the spec? That’s the first argument I was expecting, and it’s a valid one. From a user’s perspective the answer is: Why not? The more text-processing facilities I have in jsonb, The less back-and-forth-parentheses-fu I do, The easier my life is. From a PG gatekeeping it’s a more complicated issue. It’s not part of the spec, But I think the jsonb infrastructure in PG is really powerful already and we can built on it, And can evolve into a superset DSL of jsonpath. For example, apache/age have lift-and-shifted this infra and built another DSL (cypher) on top of it. > >> As a future note: >> It’s worth noting that both this newly added jspItem and other ones like >> (jpiDecimal, jpiString) >> use jspGetRightArg and jspGetLeftArg. >> left and right can be confusing if more complex methods are added in the >> future. >> i.e. jsonpath methods with nargs>=3 . >> I was wondering if we’d like something like JSP_GETARG(n) > > So far I think we have only functions defined by the spec, which tend to be > unary or binary, so left and right haven’t been an issue. If the answer to the Q above is: “we stick to the spec” then there’s no thinking about this. But tbh, I’ve already started experimenting with other text methods in text $.strip() / trim() / upper() / lower() etc. Fallback scenario: make this an extension, but in a first pass I didn’t find any convenient hooks. One has to create a whole new scanner, grammar etc. > > Best, > > David >
jsonb_strip_nulls with arrays?
Currently: jsonb_strip_nulls ( jsonb ) → jsonb Deletes all object fields that have null values from the given JSON value, recursively. Null values that are not object fields are untouched. > Null values that are not object fields are untouched. Can we revisit this and make it work with arrays, too? Tbh, at first sight that looked like the expected behavior for me. That is strip nulls from arrays as well. This has been available since 9.5 and iiuc predates lots of the jsonb array work. In practice, though, whenever jsonb_build_array is used (especially with jsonpath), a few nulls do appear in the resulting array most of the times, Currently, there’s no expressive way to remove this. We could also have jsonb_array_strip_nulls(jsonb) as well
Re: [PATCH] WIP: replace method for jsonpath
Here’s an updated version of this patch.The previous version failed several CI steps; this passes them all.Unless someone disagrees,I’ll proceed with the documentation and add this for the next CF.As a future note:It’s worth noting that both this newly added jspItem and other ones like (jpiDecimal, jpiString)use jspGetRightArg and jspGetLeftArg.left and right can be confusing if more complex methods are added in the future.i.e. jsonpath methods with nargs>=3 .I was wondering if we’d like something like JSP_GETARG(n)GitHub PR in case you prefer it’s UI https://github.com/Florents-Tselai/postgres/pull/3 v2-0001-Add-first-working-version-for-jsonpath-.replace-s.patch Description: Binary data v2-0002-replace-should-work-with-strings-inside-arrays.patch Description: Binary data v2-0003-Fix-logic-for-jpiReplaceFunc.patch Description: Binary data On 15 Sep 2024, at 4:15 AM, Florents Tselai wrote:Hi,When working with jsonb/jsonpath, I’ve always wanted more fluent string operations for data cleaning.Ideally, chaining text processing methods together.Here’s a draft patch that adds a string replace method.It works like thisselect jsonb_path_query('"hello world"', '$.replace("hello","bye")'); jsonb_path_query -- "bye world"(1 row)And looks like plays nicely with existing facilities select jsonb_path_query('"hello world"', '$.replace("hello","bye") starts with "bye"'); jsonb_path_query -- true(1 row)I’d like some initial feedback on whether this is of interested before I proceed with the following:- I’ve tried respecting the surrounding code, but a more experienced eye can spot some inconsistencies. I’ll revisit those- Add more test cases (I’ve added the obvious ones, but ideas on more cases are welcome).- pg upgrades currently fail on CI (see)- better error handling/reporting: I’ve kept the wording of error messages, but we’ll need something akin to ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION.- documentation.
[PATCH] WIP: replace method for jsonpath
Hi,When working with jsonb/jsonpath, I’ve always wanted more fluent string operations for data cleaning.Ideally, chaining text processing methods together.Here’s a draft patch that adds a string replace method.It works like thisselect jsonb_path_query('"hello world"', '$.replace("hello","bye")'); jsonb_path_query -- "bye world"(1 row)And looks like plays nicely with existing facilities select jsonb_path_query('"hello world"', '$.replace("hello","bye") starts with "bye"'); jsonb_path_query -- true(1 row)I’d like some initial feedback on whether this is of interested before I proceed with the following:- I’ve tried respecting the surrounding code, but a more experienced eye can spot some inconsistencies. I’ll revisit those- Add more test cases (I’ve added the obvious ones, but ideas on more cases are welcome).- pg upgrades currently fail on CI (see)- better error handling/reporting: I’ve kept the wording of error messages, but we’ll need something akin to ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION.- documentation. v1-0001-jsonpath-replace-method.patch Description: Binary data
Re: Exporting float_to_shortest_decimal_buf(n) with Postgres 17 on Windows
> On 13 Sep 2024, at 11:07 PM, Andrew Kane wrote: > > Hi, > > With Postgres 17 RC1 on Windows, `float_to_shortest_decimal_buf` and > `float_to_shortest_decimal_bufn` are not longer exported. This causes > `unresolved external symbol` linking errors for extensions that rely on these > functions (like pgvector). Can these functions be exported like previous > versions of Postgres? Probably a Windows thing? Just tried on Darwin with 17_RC1, and pgvector 0.7.4 build installcheck’s OK. > > Thanks, > Andrew
[PATCH] Add some documentation on how to call internal functions
Hi, The documentation on extending using C functions, leaves a blank on how to call other internal Postgres functions.This can leave first-timers hanging.I think it’d be helpful to spend some paragraphs on discussing the DirectFunctionCall API.Here’s an attempt (also a PR[0]).Here’s the relevant HTML snippet for convenience:To call another version-1 function, you can use DirectFunctionCalln(func, arg1,...,argn). This is particularly useful when you want to call functions defined in the standard internal library, by using an interface similar to their SQL signature.Different flavors of similar macros can be found in fmgr.h. The main point though is that they expect a C function name to call as their first argument (or its Oid in some cases), and actual arguments should be supplied as Datums. They always return Datum.For example, to call the starts_with(text, text) from C, you can search through the catalog and find out that its C implementation is based on the Datum text_starts_with(PG_FUNCTION_ARGS) function.In fmgr.h there are also available macros the facilitate conversions between C types and Datum. For example to turn text* into Datum, you can use DatumGetTextPP(X). If your extension defines additional types, it is usually convenient to define similar macros for these types too.I’ve also added the below example function:PG_FUNCTION_INFO_V1(t_starts_with); Datum t_starts_with(PG_FUNCTION_ARGS) { Datum t1 = PG_GETARG_DATUM(0); Datum t2 = PG_GETARG_DATUM(1); bool bool_res; Datum datum_res = DirectFunctionCall2(text_starts_with, t1, t2); bool_res = DatumGetBool(datum_res); PG_RETURN_BOOL(bool_res); }PS1: I was not sure if src/tutorial is still relevant with this part of the documentation.If so, it needs updating too.[0] https://github.com/Florents-Tselai/postgres/pull/1/commits/1651b7bb68e0f9c2b61e1462367295d846d253ec 0001-Add-some-documentation-on-how-to-call-internal-funct.patch Description: Binary data
Re: Update platform notes to build Postgres on macos
./configure —help It will show that you can build —without-icu , you can also specify a path to pkg-config via PKG_CONFIG=/path/to/pkg-config side note: I’ve had better experience building with brew on macos, rather than macports. > On 4 Jul 2024, at 9:02 PM, Said Assemlal wrote: > > Hi, > > > > I just built postgresql on macos sonoma (v14) and I had to install the > following packages: > > * icu - https://ports.macports.org/port/icu/ > * pkg - https://ports.macports.org/port/pkgconfig/ > I don't see anything related to this on > https://www.postgresql.org/docs/devel/installation-platform-notes.html > > Did I miss something ? Should we add a note? > > best, > Saïd > > >
Re: SQL Property Graph Queries (SQL/PGQ)
In the ddl.sgml, I’d swap the first two paragraphs. I find the first one a bit confusing as-is. As far as I can tell, it’s an implementation detail. The first paragraph should answer, “I have some data modeled as a graph G=(V, E). Can Postgres help me?”. Then, introducing property graphs makes more sense. I'd also use the examples and fake data in `graph_table.sql` in ddl/queries.sgml). I was bummed that that copy-pasting didn't work as is. I’d keep explaining how a graph query translates to a relational one later in the page. As for the implementation, I can’t have an opinion yet, but for those not familiar, Apache Age uses a slightly different approach that mimics jsonpath (parses a sublanguage expression into an internal execution engine etc.). However, the standard requires mapping this to the relational model, which makes sense for core Postgres. > On 27 Jun 2024, at 3:31 PM, Peter Eisentraut wrote: > > Here is a new version of this patch. I have been working together with > Ashutosh on this. While the version 0 was more of a fragile demo, this > version 1 has a fairly complete minimal feature set and should be useful for > playing around with. We do have a long list of various internal bits that > still need to be fixed or revised or looked at again, so there is by no means > a claim that everything is completed. > > Documentation to get started is included (ddl.sgml and queries.sgml). (Of > course, feedback on the getting-started documentation would be most welcome.) >
Re: Do we want a hashset type?
Has anyone put this in a git repo / extension package or similar ? I’d like to try it out outside the core pg tree. > On 1 Jul 2023, at 12:04 PM, Joel Jacobson wrote: > > On Fri, Jun 30, 2023, at 06:50, jian he wrote: >> more like a C questions >> in this context does >> #define HASHSET_GET_VALUES(set) ((int32 *) ((set)->data + >> CEIL_DIV((set)->capacity, 8))) >> define first, then define struct int4hashset_t. Is this normally ok? > > Yes, it's fine. Macros are just text substitutions done pre-compilation. > >> Also does >> #define HASHSET_GET_VALUES(set) ((int32 *) ((set)->data + >> CEIL_DIV((set)->capacity, 8))) >> >> remove (int32 *) will make it generic? then when you use it, you can >> cast whatever type you like? > > Maybe, but might be less error-prone more descriptive with different > macros for each type, e.g. INT4HASHSET_GET_VALUES, > similar to the existing PG_GETARG_INT4HASHSET > > Curious to hear what everybody thinks about the interface, documentation, > semantics and implementation? > > Is there anything missing or something that you think should be > changed/improved? > > /Joel > >
Re: Improving FTS for Greek
> On 7 Jun 2023, at 12:13 AM, Peter Eisentraut wrote: > > On 03.06.23 19:47, Florents Tselai wrote: >> There’s another previous relevant patch [0] but was never merged. I’ve >> included these stop words and added some more (info in README.md). >> For my personal projects looks like it yields much better results. >> I’d like some feedback on the extension ; particularly on the installation >> infra (I’m not sure I’ve handled properly the permissions in the .sql files) >> I’ll then try to make a .patch for this. > > The open question at the previous attempt was that it wasn't clear what the > upstream source or long-term maintenance of the stop words list would be. If > it's just a personally composed list, then it's okay if you use it yourself, > but for including it into PostgreSQL it ought to come from a reputable > non-individual source like snowball. I’ve used the NLTK list [0] as my base of stopwords; Wouldn’t this be considered reputable enough ? 0 https://github.com/nltk/nltk_data/blob/gh-pages/packages/corpora/stopwords.zip (see greek.stop file in the archive) >
Improving FTS for Greek
I posted earlier in pgsql-general, that I realised there’s no greek.stop under $(pg_config —sharedir)/tsearch_data And indeed looks like stop words are maintained with to_tsvector(‘greek’, ..). I wrote an extension https://github.com/Florents-Tselai/pg_fts_greek that adds another ‘greek_ext’ regconfig Here’s how the results compare t to_tsvector('greek', t) to_tsvector('greek_ext', t) 'το τετράγωνο της υποτείνουσας ενός ορθογωνίου τριγώνου''εν':5 'ορθογων':6 'τ':3 'τετραγων':2 'το':1 'τριγων':7 'υποτεινουσ':4 'εν':5 'ορθογων':6 'τετραγων':2 'τριγων':7 'υποτεινουσ':4 'ο γιώργος είναι πονηρός' 'γιωργ':2 'εινα':3 'ο':1 'πονηρ':4 'γιωργ':2 'πονηρ':4 'ο ήλιος ο πράσινος o ήλιος που ανατέλλει' 'o':5 'ανατελλ':8 'ηλι':2,6 'ο':1,3 'π':7 'πρασιν':4'ανατελλ':8 'ηλι':2,6 'πρασιν':4 There’s another previous relevant patch [0] but was never merged. I’ve included these stop words and added some more (info in README.md). For my personal projects looks like it yields much better results. I’d like some feedback on the extension ; particularly on the installation infra (I’m not sure I’ve handled properly the permissions in the .sql files) I’ll then try to make a .patch for this. [0] https://www.postgresql.org/message-id/flat/e1c79330-48a5-abef-c309-8d4499e3180b%402ndquadrant.com#7431fdb9ae24b694155aef3f040b7b60