On Thu, Jun 25, 2026 at 9:00 PM Corey Huinker <[email protected]> wrote:
> On Wed, Jun 24, 2026 at 1:17 PM Florents Tselai <[email protected]> > wrote: > >> >> >> On Tue, Jun 23, 2026 at 6:30 PM Florents Tselai < >> [email protected]> wrote: >> >>> >>> >>> >>> On Tue, Jun 23, 2026 at 6:16 PM Corey Huinker <[email protected]> >>> wrote: >>> >>>> On Fri, Jun 19, 2026 at 3:19 AM Florents Tselai < >>>> [email protected]> wrote: >>>> >>>>> >>>>> On Tue, Jun 16, 2026 at 10:53 PM Corey Huinker < >>>>> [email protected]> wrote: >>>>> >>>>>> On Mon, Apr 13, 2026 at 5:57 AM Florents Tselai < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> Hello hackers, >>>>>>> >>>>>>> This is a follow-up to the work recently merged in bd4f879. >>>>>>> In hindsight, I regret not pushing these through for the previous >>>>>>> cycle, >>>>>>> as they represent the "missing pieces" for users trying to perform >>>>>>> data cleaning entirely within the JSONPath engine. >>>>>>> With these we can significantly reduce the need for users to "drop >>>>>>> out" of JSONPath >>>>>>> into standard SQL for basic string-to-string-or-array-and-back >>>>>>> workflows. >>>>>>> >>>>>>> >>>>>> Seems this one got overlooked, so I took a look. The first patch >>>>>> applies, but the second needs a rebase. >>>>>> >>>>> >>>>> Thanks for having a look Corey, >>>>> >>>>> here's a consolidated v2 . >>>>> >>>> >>>> My familiarity with the the internal json stuff is limited to my >>>> experience in reworking the I/O format of pg_ndistinct and pg_dependencies, >>>> plus some other misadventures in statistics export, so I apologize if some >>>> of these questions are naive. >>>> >>>> jsonpath.h: >>>> >>>> jpiStrJoin is added in the middle of the JsonPathItemType, but >>>> jpiStrSplit and jpiStrTranslate were added to the end of the enum. The >>>> comment specifically states that "the order must not be changed and new >>>> values must be added to the end" - is there some exception to this rule? If >>>> there is an exception should we modify the comment and have TAP test to >>>> show that it doesn't break pg_upgrade? >>>> >>> >>> >>> >>>> >>>> jsonpath.c: >>>> >>>> In jspInitBuffer(), you add 3 new cases, but they aren't in a row, >>>> aren't organized alphabetically, or any other guiding organization that I >>>> can derive. >>>> >>>> The order of new types in the Assert() in jspGetLeftArg() is different >>>> than the order in jspGetRightArg(). Is there a reason they're different? >>>> >>>> jsonpath_exec.c: >>>> >>>> Everything here checks out. There were some patterns that initially >>>> concerned me, but they all have precedent elsewhere in the file. >>>> >>>> jsonpath_gram.y and jsonpath_scan.l: >>>> LGTM >>>> >>>> jsonb_jsonpath.out/sql >>>> >>>> +-- Empty string delimiter (splits into individual characters) >>>> +select jsonb_path_query('"abc"', '$.split("")'); >>>> + jsonb_path_query >>>> +------------------ >>>> + ["abc"] >>>> >>>> Like Zsolt, I am confused by this test. The description does not match >>>> the result. Based on the description, I would expect ["a", "b", "c"], but >>>> the output here looks like it was from a '$.split(",")' or any delimiter >>>> that isn't one of the letters in the string. >>>> >>> >>> I have this planned, that comment is wrong and should read ; >>> >>> +-- Empty string delimiter (returns the original string as a single >>> element) >>> >>> >>>> >>>> the underlying function text_to_array is accessible via SQL >>>> string_to_array, and there we get: >>>> >>>> corey=# select string_to_array('foo', ''); >>>> string_to_array >>>> ----------------- >>>> {foo} >>>> (1 row) >>>> >>>> corey=# select string_to_array('foo', NULL); >>>> string_to_array >>>> ----------------- >>>> {f,o,o} >>>> (1 row) >>>> >>>> Assuming we don't find that behavior to be a bug, we should change the >>>> description of the test, or change the second parameter from '' to NULL. >>>> >>>> Moving along, the first "silent => true" query could use a comment. In >>>> general a comment above every test is nice for telling future developers >>>> what specifically is being tested. >>>> >>>> func-json.sgml: >>>> >>>> The entry for split() makes no mention of the proper behavior when the >>>> delimiter is empty >>>> >>>> Summary: >>>> >>>> The things I'm noticing seem like very random coding choices, which >>>> makes me think that they weren't choices, they're just randomness >>>> introduced by squashing and rebasing on top of bd4f879a9cdd. Could you >>>> check into that? >>>> >>> >>> Thanks for your thoroughness. >>> >>> From a quick review I think most of your comments are spot on, and the >>> enum ordering especially should be considered a bit critical as it could >>> break backwards compatibility and pg_upgrade. >>> >>> I'll incorporate these in an upcoming v4. >>> >> >> Here's that v4 that addresses your points. >> > > I think you may have posted the wrong file. The v4 patch only seems to > address my concern about the test case comment on split. Of particular > concern is the change to the enum in jsonpath.h, which will break binary > upgrades. > Argh, you're right . Apologies for the noise.
v4-0001-Add-more-jsonpath-string-methods-.translate-.spli.patch
Description: Binary data
