Re: jsonpath

2019-02-28 Thread Andres Freund
On 2019-03-01 03:36:49 +0300, Nikita Glukhov wrote:
> Patch 1 is what we are going to commit in PG12.

I think it's too early to make that determination. I think there's a
good chance, but that this needs more independent review.



Re: jsonpath

2019-02-28 Thread Alexander Korotkov
On Fri, Mar 1, 2019 at 3:36 AM Nikita Glukhov  wrote:
> I can also offset to explicitly pass timezone info into jsonpath function 
> using
> the special user dataype encapsulating struct pg_tz.

More interesting question is what would be the source of timezone.  If
even you encapsulate timezone in a separate datatype, the expression
will be still just stable assuming timezone is generated by stable
subexpression.  What we actually need is immutable timezone.  Day once
timezone is updated, you create new timezone version, while old
version is immutable.  Then if jsonpath has given particular *timezone
version*, it might remain immutable.  But that requires significant
rework of our timezone infrastructure.

> But simple integer timezone offset can be passed now using jsonpath variables
> (standard says only about integer timezone offsets; also it requires presence
> of timezone offset it in the input string if the format string contain 
> timezone
> components):
>
> =# SELECT jsonb_path_query(
>  '"28-02-2019 12:34"',
>  '$.datetime("DD-MM- HH24:MI TZH", $tz)',
>  jsonb_build_object('tz', EXTRACT(TIMEZONE FROM now()))
>);
>
>   jsonb_path_query
> -
>  "2019-02-28T12:34:00+03:00"
> (1 row)

Standard specifies fixed offset to be given for *particular datetime*.
For instance, if json contains offset in separate attribute or
whatever, then it's OK to use such two-arguments .datetime() method.
But that seems quite narrow use case.

Standard doesn't mean you get fixed offset extracted from "now()" and
apply it to random datetimes in your json collection.  That would work
correctly for real timezones only when they are fixed offsets, but
there are almost none of them!  So, that's just plain wrong, we never
should encourage users to do something like this.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-03 Thread Pavel Stehule
so 2. 3. 2019 v 6:15 odesílatel Alexander Korotkov <
a.korot...@postgrespro.ru> napsal:

> Hi!
>
> On Fri, Mar 1, 2019 at 3:36 AM Nikita Glukhov 
> wrote:
> >
> > Attached 34th version of the patches.
> >
> > 1. Partial jsonpath support:
> >- Fixed copying of jsonb with vars jsonb_path_query() into SRF context
> >- Fixed error message for jsonpath vars
> >- Fixed file-level comment in jsonpath.c
> >
> > 2. Suppression of numeric errors:
> >Now error handling is done without PG_TRY/PG_CATCH using a bunch of
> internal
> >numeric functions with 'bool *error' flag.
>
> Revised patches 1 and 2 are attached.  Changes are following
>
>  * Small refactoring, comments adjustment and function renaming.  In
> particular, I've removed "recursive" prefix from function names,
> because it actually not that informative assuming header comment
> explains that the whole jsonpath executor is recursive.  Also, I made
> "Unwrap" suffix more clear.  Not it's distinguished what is unwrapped
> target (UnwrapTarget) or result (UnwrapResult).  Also, now it's clear
> that function doesn't always unwraps but has an option to do so
> (OptUnwrap).
>  * Some more cases are covered by regression tests.
>

These patches are large, but I think so the granularity and modularity of
these patches are correct.

Now, it looks very well.

Pavel


> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>


Re: jsonpath

2019-03-03 Thread Tomas Vondra
Hi,

Here are some initial comments from a review of the 0001 part. I plan to
do more testing on a large data set and additional round of review over
the next week. FWIW I've passed this through valgrind and the usual
battery of regression tests, and there were no issues.

I haven't looked at 0002 yet, but it seems fairly small (especially
compared to 0001).

func.sgml
=

1) I see the changes removed  for some
reason. Is that intentional?

2) WHERE

We generally tag WHERE, not .

3) Filter expressions are applied from left to right

Perhaps s/applied/evaluated/ in reference to expressions?

4) The result of the filter expression may be true, false, or unknown.

It's not entirely clear to me what "unknown" means here. NULL, or
something else? There's a section in the SQL/JSON standard
explaining this (page 83), but perhaps we should explain it a bit
here too?

The standard says "In the SQL/JSON data model, there are no SQL
nulls, so Unknown is not part of the SQL/JSON data model." so I'm a
bit confused what "unknown" references to. Maybe some example?

Also, what happens when the result is unknown?


5) There's an example showing how to apply filter at a certain level,
using the @ variable, but what if we want to apply multiple filters at
different levels? Would it make sense to add such example?

6) ... extensions of the SQL/JSON standard

I'm not sure what "extension" is here. Is that an extension defined
in the SQL standard, or an additional PostgreSQL functionality not
described in the standard? (I assume the latter, just checking.)


7) There are references to "SQL/JSON sequences" without any explanation
what it means. Maybe I'm missing something obvious, though.

8) Implicit unwrapping in the lax mode is not performed in the following
cases:

I suggest to reword it like this:

In the lax mode, implicit unwrapping is not performed when:

9) We're not adding the datetime() method for now, due to the issues
with timezones. I wonder if we should add a note why it's missing to the
docs ...

10) I'm a bit puzzled though, because the standard says this in the
description of type() function on page 77

If I is a datetime, then “date”, “time without time zone”, “time
with time zone”, “timestamp without time zone”, or “timestamp with
time zone”, as appropriate.

But I see our type() function does not return anything like that (which
I presume is independent of timezone stuff). But I see jsonb.h has no
concept of datetime values, and the standard actually says this in the
datetime() section

JSON has no datetime types. Datetime values are most likely stored
in character strings.

Considering this, the type() section makes little sense, no?


jsonb_util.c


I see we're now handling NaN values in convertJsonbScalar(). Isn't it
actually a bug that we don't do this already? Or is it not needed for
some reason?

jsonpath.c
==

I suppose this should say "jsonpath version number" instead?

elog(ERROR, "unsupported jsonb version number %d", version);


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

2019-03-03 Thread Andrew Dunstan


On 3/3/19 1:08 PM, Tomas Vondra wrote:
>
>
> jsonb_util.c
> 
>
> I see we're now handling NaN values in convertJsonbScalar(). Isn't it
> actually a bug that we don't do this already? Or is it not needed for
> some reason?
>

JSON standard numerics don't support NaN, Infinity etc., so I assume
this can only happen in a jsonpath expression being converted to a jsonb
value. If so, the new section  should contain a comment to that effect,
otherwise it will be quite confusing.


cheers


andrew


-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonpath

2019-03-04 Thread Tomas Vondra
A bunch of additional comments, after looking at the patch a bit today.
All are mostly minor, and sometime perhaps a matter of preference.


1) There's a mismatch between the comment and actual function name for
jsonb_path_match_opr and jsonb_path_exists_opr(). The comments say
"_novars" instead.

2) In a couple of switches the "default" case does a return with a
value, following elog(ERROR). So it's practically unreachable, AFAICS
it's fine without it, and we don't do this elsewhere. And I don't get
any compiler warnings if I remove it either.

Examples:

JsonbTypeName

default:
elog(ERROR, "unrecognized jsonb value type: %d", jbv->type);
return "unknown";

jspOperationName

default:
elog(ERROR, "unrecognized jsonpath item type: %d", type);
return NULL;

compareItems

default:
elog(ERROR, "unrecognized jsonpath operation: %d", op);
return jpbUnknown;

3) jsonpath_send is using makeStringInfo() for a value that is not
returned - IMHO it should use regular stack-allocated variable and use
initStringInfo() instead

4) the version number should be defined/used as a constant, not as a
magic constant somewhere in the code

5) Why does jsonPathToCstring do this?

appendBinaryStringInfo(out, "strict ", 7);

Why not to use regular appendStringInfoString()? What am I missing?

6) comment typo: "Aling StringInfo"

7) alignStringInfoInt() should explain why we need this and why INTALIGN
is the right alignment.

8) I'm a bit puzzled by what flattenJsonPathParseItem does with 'next'

I don't quite understand what it's doing with 'next' value?

/*
 * Actual value will be recorded later, after next and children
 * processing.
 */
appendBinaryStringInfo(buf,
   (char *) &next, /* fake value */
   sizeof(next));

Perhaps a comment explaining it (why we need a fake value at all?) would
be a good idea here.

9) I see printJsonPathItem is only calling check_stack_depth while
flattenJsonPathParseItem also calls CHECK_INTERRUPTS. Why the
difference, considering they seem about equally expensive?

10) executeNumericItemMethod is missing a comment (unlike the other
executeXXX functions)

11) Wording of some of the error messages in the execute methods seems a
bit odd. For example executeNumericItemMethod may complain that it

... is applied to not a numeric value

but perhaps a more natural wording would be

... is applied to a non-numeric value

And similarly for the other execute methods. But I'm not a native
speaker, so perhaps the original wording is just fine.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

2019-03-05 Thread Robert Haas
On Mon, Mar 4, 2019 at 6:27 PM Tomas Vondra
 wrote:
> 11) Wording of some of the error messages in the execute methods seems a
> bit odd. For example executeNumericItemMethod may complain that it
>
> ... is applied to not a numeric value
>
> but perhaps a more natural wording would be
>
> ... is applied to a non-numeric value
>
> And similarly for the other execute methods. But I'm not a native
> speaker, so perhaps the original wording is just fine.

As a native speaker I can confirm that the first wording is definitely
not OK.  The second one is tolerable, but I wonder if there is
something better, like "can only be applied to a numeric value" or
maybe there's a way to rephrase it so that we complain about the
non-numeric value itself rather than the application, e.g. ERROR:
json_frobnitz can only frob numeric values or ERROR: argument to
json_frobnitz must be numeric.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: jsonpath

2019-03-14 Thread Alexander Korotkov
On Sun, Mar 10, 2019 at 1:51 PM Alexander Korotkov
 wrote:
> On Wed, Mar 6, 2019 at 12:40 AM Nikita Glukhov  
> wrote:
> > Attached 36th version of the patches.
>
> Thank yo for the revision!
>
> In the attached revision following changes are made:
>
> > "unknown" refers here to ordinary three-valued logical Unknown, which is
> > represented in SQL by NULL.
> >
> > JSON path expressions return sequences of SQL/JSON items, which are defined 
> > by
> > SQL/JSON data model.  But JSON path predicates (logical expressions), which 
> > are
> > used in filters, return three-valued logical values: False, True, or 
> > Unknown.
>
>  * I've added short explanation of this to the documentation.
>  * Removed no longer present data structures from typedefs.list of the
> first patch.
>  * Moved GIN support patch to number 3.  Seems to be well-isolated and
> not very complex patch.  I propose to consider this to 12 too.  I
> added high-level comment there, commit message and made some code
> beautification.

I think patches 1 and 2 are in committable shape (I reached Tomas
off-list, he doesn't have more notes regarding them).  While patch 3
requires more review.

I'm going to push 1 and 2 if no objections.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-16 Thread Alexander Korotkov
On Thu, Mar 14, 2019 at 12:07 PM Alexander Korotkov
 wrote:
> On Sun, Mar 10, 2019 at 1:51 PM Alexander Korotkov
>  wrote:
> > On Wed, Mar 6, 2019 at 12:40 AM Nikita Glukhov  
> > wrote:
> > > Attached 36th version of the patches.
> >
> > Thank yo for the revision!
> >
> > In the attached revision following changes are made:
> >
> > > "unknown" refers here to ordinary three-valued logical Unknown, which is
> > > represented in SQL by NULL.
> > >
> > > JSON path expressions return sequences of SQL/JSON items, which are 
> > > defined by
> > > SQL/JSON data model.  But JSON path predicates (logical expressions), 
> > > which are
> > > used in filters, return three-valued logical values: False, True, or 
> > > Unknown.
> >
> >  * I've added short explanation of this to the documentation.
> >  * Removed no longer present data structures from typedefs.list of the
> > first patch.
> >  * Moved GIN support patch to number 3.  Seems to be well-isolated and
> > not very complex patch.  I propose to consider this to 12 too.  I
> > added high-level comment there, commit message and made some code
> > beautification.
>
> I think patches 1 and 2 are in committable shape (I reached Tomas
> off-list, he doesn't have more notes regarding them).  While patch 3
> requires more review.
>
> I'm going to push 1 and 2 if no objections.

So, pushed.  Many thanks to reviewers and authors!

Remaining part I'm proposing for 12 is attached.  I appreciate review of it.


--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0001-Jsonpath-GIN-support-v38.path
Description: Binary data


Re: jsonpath

2019-03-16 Thread Jeff Janes
On Sat, Mar 16, 2019 at 5:36 AM Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

>
> So, pushed.  Many thanks to reviewers and authors!
>

I think these files have to be cleaned up by "make maintainer-clean"

./src/backend/utils/adt/jsonpath_gram.c
./src/backend/utils/adt/jsonpath_scan.c

Cheers,

Jeff


Re: jsonpath

2019-03-16 Thread Alexander Korotkov
сб, 16 мар. 2019 г., 20:52 Jeff Janes :

> On Sat, Mar 16, 2019 at 5:36 AM Alexander Korotkov <
> a.korot...@postgrespro.ru> wrote:
>
>>
>> So, pushed.  Many thanks to reviewers and authors!
>>
>
> I think these files have to be cleaned up by "make maintainer-clean"
>
> ./src/backend/utils/adt/jsonpath_gram.c
> ./src/backend/utils/adt/jsonpath_scan.c
>

Good catch, thanks!  Will fix.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

2019-03-16 Thread Tom Lane
Jeff Janes  writes:
> I think these files have to be cleaned up by "make maintainer-clean"

> ./src/backend/utils/adt/jsonpath_gram.c
> ./src/backend/utils/adt/jsonpath_scan.c

Good point.  I also see jsonpath_gram.h left behind after maintainer-clean:

$ git status --ignored
On branch master
Your branch is up-to-date with 'origin/master'.
Ignored files:
  (use "git add -f ..." to include in what will be committed)

src/backend/utils/adt/jsonpath_gram.c
src/backend/utils/adt/jsonpath_scan.c
src/include/utils/jsonpath_gram.h

Looks like whoever modified src/backend/Makefile's distprep target
didn't bother to read the comment.

regards, tom lane



Re: jsonpath

2019-03-16 Thread Tom Lane
I wrote:
> Good point.  I also see jsonpath_gram.h left behind after maintainer-clean:

Oh, and of potentially more significance: after maintainer-clean and
re-configure, make fails with

In file included from jsonpath_gram.y:24:
../../../../src/include/utils/jsonpath_scanner.h:25:33: error: 
utils/jsonpath_gram.h: No such file or directory

I first thought this was a problem with insufficient dependencies
allowing parallel make to do things in the wrong order, but the problem
repeats even without any parallelism.  It looks like the dependencies
have been constructed in such a way that if the symlink at
src/include/utils/jsonpath_gram.h exists but the underlying file
does not, nothing will make the underlying file.  This is pretty broken;
aside from this outright failure, it also suggests that nothing will
update that file if it exists but is out of date relative to its
sources.

Please make sure that the make rules associated with these files look
exactly like the previously-debugged rules for existing bison/flex
output files.  There are generally good reasons for every last bit
of weirdness in those.

regards, tom lane



Re: jsonpath

2019-03-16 Thread Alexander Korotkov
сб, 16 мар. 2019 г., 21:12 Tom Lane :

> I wrote:
> > Good point.  I also see jsonpath_gram.h left behind after
> maintainer-clean:
>
> Oh, and of potentially more significance: after maintainer-clean and
> re-configure, make fails with
>
> In file included from jsonpath_gram.y:24:
> ../../../../src/include/utils/jsonpath_scanner.h:25:33: error:
> utils/jsonpath_gram.h: No such file or directory
>
> I first thought this was a problem with insufficient dependencies
> allowing parallel make to do things in the wrong order, but the problem
> repeats even without any parallelism.  It looks like the dependencies
> have been constructed in such a way that if the symlink at
> src/include/utils/jsonpath_gram.h exists but the underlying file
> does not, nothing will make the underlying file.  This is pretty broken;
> aside from this outright failure, it also suggests that nothing will
> update that file if it exists but is out of date relative to its
> sources.
>
> Please make sure that the make rules associated with these files look
> exactly like the previously-debugged rules for existing bison/flex
> output files.  There are generally good reasons for every last bit
> of weirdness in those.


Uh, I didn't check that carefully enough.  Thank you for the explanation.
Will fix.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

2019-03-16 Thread Pavel Stehule
so 16. 3. 2019 v 10:36 odesílatel Alexander Korotkov <
a.korot...@postgrespro.ru> napsal:

> On Thu, Mar 14, 2019 at 12:07 PM Alexander Korotkov
>  wrote:
> > On Sun, Mar 10, 2019 at 1:51 PM Alexander Korotkov
> >  wrote:
> > > On Wed, Mar 6, 2019 at 12:40 AM Nikita Glukhov <
> n.glu...@postgrespro.ru> wrote:
> > > > Attached 36th version of the patches.
> > >
> > > Thank yo for the revision!
> > >
> > > In the attached revision following changes are made:
> > >
> > > > "unknown" refers here to ordinary three-valued logical Unknown,
> which is
> > > > represented in SQL by NULL.
> > > >
> > > > JSON path expressions return sequences of SQL/JSON items, which are
> defined by
> > > > SQL/JSON data model.  But JSON path predicates (logical
> expressions), which are
> > > > used in filters, return three-valued logical values: False, True, or
> Unknown.
> > >
> > >  * I've added short explanation of this to the documentation.
> > >  * Removed no longer present data structures from typedefs.list of the
> > > first patch.
> > >  * Moved GIN support patch to number 3.  Seems to be well-isolated and
> > > not very complex patch.  I propose to consider this to 12 too.  I
> > > added high-level comment there, commit message and made some code
> > > beautification.
> >
> > I think patches 1 and 2 are in committable shape (I reached Tomas
> > off-list, he doesn't have more notes regarding them).  While patch 3
> > requires more review.
> >
> > I'm going to push 1 and 2 if no objections.
>
> So, pushed.  Many thanks to reviewers and authors!
>
> Remaining part I'm proposing for 12 is attached.  I appreciate review of
> it.
>

I tested this patch and I didn't  find any issue - just I tested basic
functionality and regress tests.

looks well

Pavel


>
> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>


Re: jsonpath

2019-03-17 Thread Alexander Korotkov
On Sat, Mar 16, 2019 at 9:39 PM Pavel Stehule  wrote:
> so 16. 3. 2019 v 10:36 odesílatel Alexander Korotkov 
>  napsal:
>>
>> On Thu, Mar 14, 2019 at 12:07 PM Alexander Korotkov
>>  wrote:
>> > On Sun, Mar 10, 2019 at 1:51 PM Alexander Korotkov
>> >  wrote:
>> > > On Wed, Mar 6, 2019 at 12:40 AM Nikita Glukhov  
>> > > wrote:
>> > > > Attached 36th version of the patches.
>> > >
>> > > Thank yo for the revision!
>> > >
>> > > In the attached revision following changes are made:
>> > >
>> > > > "unknown" refers here to ordinary three-valued logical Unknown, which 
>> > > > is
>> > > > represented in SQL by NULL.
>> > > >
>> > > > JSON path expressions return sequences of SQL/JSON items, which are 
>> > > > defined by
>> > > > SQL/JSON data model.  But JSON path predicates (logical expressions), 
>> > > > which are
>> > > > used in filters, return three-valued logical values: False, True, or 
>> > > > Unknown.
>> > >
>> > >  * I've added short explanation of this to the documentation.
>> > >  * Removed no longer present data structures from typedefs.list of the
>> > > first patch.
>> > >  * Moved GIN support patch to number 3.  Seems to be well-isolated and
>> > > not very complex patch.  I propose to consider this to 12 too.  I
>> > > added high-level comment there, commit message and made some code
>> > > beautification.
>> >
>> > I think patches 1 and 2 are in committable shape (I reached Tomas
>> > off-list, he doesn't have more notes regarding them).  While patch 3
>> > requires more review.
>> >
>> > I'm going to push 1 and 2 if no objections.
>>
>> So, pushed.  Many thanks to reviewers and authors!
>>
>> Remaining part I'm proposing for 12 is attached.  I appreciate review of it.
>
>
> I tested this patch and I didn't  find any issue - just I tested basic 
> functionality and regress tests.
>
> looks well

Thank you for your feedback!

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-17 Thread Alexander Korotkov
On Sat, Mar 16, 2019 at 9:12 PM Tom Lane  wrote:
> I wrote:
> > Good point.  I also see jsonpath_gram.h left behind after maintainer-clean:
>
> Oh, and of potentially more significance: after maintainer-clean and
> re-configure, make fails with
>
> In file included from jsonpath_gram.y:24:
> ../../../../src/include/utils/jsonpath_scanner.h:25:33: error: 
> utils/jsonpath_gram.h: No such file or directory
>
> I first thought this was a problem with insufficient dependencies
> allowing parallel make to do things in the wrong order, but the problem
> repeats even without any parallelism.  It looks like the dependencies
> have been constructed in such a way that if the symlink at
> src/include/utils/jsonpath_gram.h exists but the underlying file
> does not, nothing will make the underlying file.  This is pretty broken;
> aside from this outright failure, it also suggests that nothing will
> update that file if it exists but is out of date relative to its
> sources.
>
> Please make sure that the make rules associated with these files look
> exactly like the previously-debugged rules for existing bison/flex
> output files.  There are generally good reasons for every last bit
> of weirdness in those.

I've pushed a fix.  I hope I didn't forget anything.

BTW, it appears that windows build scripts also needs some fixup.  I'm
not very familiar with that.  I would be glad if somebody review the
attached patch.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


jsonpath_windows_build_fixes.patch
Description: Binary data


Re: jsonpath

2019-03-17 Thread Andrew Dunstan


On 3/17/19 4:03 AM, Alexander Korotkov wrote:
> On Sat, Mar 16, 2019 at 9:12 PM Tom Lane  wrote:
>> I wrote:
>>> Good point.  I also see jsonpath_gram.h left behind after maintainer-clean:
>> Oh, and of potentially more significance: after maintainer-clean and
>> re-configure, make fails with
>>
>> In file included from jsonpath_gram.y:24:
>> ../../../../src/include/utils/jsonpath_scanner.h:25:33: error: 
>> utils/jsonpath_gram.h: No such file or directory
>>
>> I first thought this was a problem with insufficient dependencies
>> allowing parallel make to do things in the wrong order, but the problem
>> repeats even without any parallelism.  It looks like the dependencies
>> have been constructed in such a way that if the symlink at
>> src/include/utils/jsonpath_gram.h exists but the underlying file
>> does not, nothing will make the underlying file.  This is pretty broken;
>> aside from this outright failure, it also suggests that nothing will
>> update that file if it exists but is out of date relative to its
>> sources.
>>
>> Please make sure that the make rules associated with these files look
>> exactly like the previously-debugged rules for existing bison/flex
>> output files.  There are generally good reasons for every last bit
>> of weirdness in those.
> I've pushed a fix.  I hope I didn't forget anything.
>
> BTW, it appears that windows build scripts also needs some fixup.  I'm
> not very familiar with that.  I would be glad if somebody review the
> attached patch.




Why are we installing the jsonpath_gram.h file? It's not going to be
used by anything else, is it? TBH, I'm not sure I see why we're
installing the scanner.h file either.


cheers


andrew


-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonpath

2019-03-17 Thread Tom Lane
Andrew Dunstan  writes:
> Why are we installing the jsonpath_gram.h file? It's not going to be
> used by anything else, is it? TBH, I'm not sure I see why we're
> installing the scanner.h file either.

As near as I can see, jsonpath_gram.h and jsonpath_scanner.h exist only
for communication between jsonpath_gram.y and jsonpath_scan.l.  Maybe
we'd be better off handling that need by compiling the .l file as part
of the .y file, as we used to do with the core lexer and still do with
several others (cf comments for commit 72b1e3a21); then we wouldn't
even have to generate these files much less install them.

A quick look at jsonpath_scan.c shows that it's pretty innocent of
the tricks we've learned over the years for flex/bison portability;
in particular I see that it's #including  before postgres.h,
which is a no-no.  So that whole area needs more review anyway.

regards, tom lane



Re: jsonpath

2019-03-17 Thread Alexander Korotkov
On Sun, Mar 17, 2019 at 6:03 PM Tom Lane  wrote:
> Andrew Dunstan  writes:
> > Why are we installing the jsonpath_gram.h file? It's not going to be
> > used by anything else, is it? TBH, I'm not sure I see why we're
> > installing the scanner.h file either.
>
> As near as I can see, jsonpath_gram.h and jsonpath_scanner.h exist only
> for communication between jsonpath_gram.y and jsonpath_scan.l.  Maybe
> we'd be better off handling that need by compiling the .l file as part
> of the .y file, as we used to do with the core lexer and still do with
> several others (cf comments for commit 72b1e3a21); then we wouldn't
> even have to generate these files much less install them.
>
> A quick look at jsonpath_scan.c shows that it's pretty innocent of
> the tricks we've learned over the years for flex/bison portability;
> in particular I see that it's #including  before postgres.h,
> which is a no-no.  So that whole area needs more review anyway.

Yeah, I didn't review this part of patch carefully enough (and it
seems that other reviewers too).  I'm going to write a patch revising
this part in next couple of days.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-17 Thread Jonathan S. Katz
On 3/17/19 3:13 AM, Alexander Korotkov wrote:
> On Sat, Mar 16, 2019 at 9:39 PM Pavel Stehule  wrote:
>> so 16. 3. 2019 v 10:36 odesílatel Alexander Korotkov 
>>  napsal:
>>>
>>> On Thu, Mar 14, 2019 at 12:07 PM Alexander Korotkov
>>>  wrote:
 On Sun, Mar 10, 2019 at 1:51 PM Alexander Korotkov
  wrote:
> On Wed, Mar 6, 2019 at 12:40 AM Nikita Glukhov  
> wrote:
>> Attached 36th version of the patches.
>
> Thank yo for the revision!
>
> In the attached revision following changes are made:
>
>> "unknown" refers here to ordinary three-valued logical Unknown, which is
>> represented in SQL by NULL.
>>
>> JSON path expressions return sequences of SQL/JSON items, which are 
>> defined by
>> SQL/JSON data model.  But JSON path predicates (logical expressions), 
>> which are
>> used in filters, return three-valued logical values: False, True, or 
>> Unknown.
>
>  * I've added short explanation of this to the documentation.
>  * Removed no longer present data structures from typedefs.list of the
> first patch.
>  * Moved GIN support patch to number 3.  Seems to be well-isolated and
> not very complex patch.  I propose to consider this to 12 too.  I
> added high-level comment there, commit message and made some code
> beautification.

 I think patches 1 and 2 are in committable shape (I reached Tomas
 off-list, he doesn't have more notes regarding them).  While patch 3
 requires more review.

 I'm going to push 1 and 2 if no objections.
>>>
>>> So, pushed.  Many thanks to reviewers and authors!
>>>
>>> Remaining part I'm proposing for 12 is attached.  I appreciate review of it.
>>
>>
>> I tested this patch and I didn't  find any issue - just I tested basic 
>> functionality and regress tests.
>>
>> looks well
> 
> Thank you for your feedback!

Like Pavel, I did some basic testing of the patch (on current HEAD
4178d8b91c) trying out various JSON path expressions, and yes, it all
worked. I had a brief scare while testing on 4178d8b91c where initdb was
failing on the bootstrapping step, but after doing a thorough wipe of
build files and my output directory, it seems to be initializing okay.

I also did some testing of the GIN patch upthread, as the quickness of
retrieval of the data using JSON path is of course important as well.
Using a schema roughly like this:

CREATE TABLE news_feed (
id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
data jsonb NOT NULL
);
CREATE INDEX news_feed_data_gin_idx ON news_feed USING gin(data);

I loaded in a data set of roughly 420,000 rows. Each row had all the
same keys but differing values (e.g. "length" and "content" as keys)

I tested a few different JSON path scenarios. Some of the index scans
performed way better than the equivalent sequential scans, for instance:

SELECT count(*)
FROM news_feed
WHERE data @? '$.length ? (@ == 200)';

SELECT *
FROM news_feed
WHERE data @? '$.id ? (@ == "22613cbc-d83e-4a29-8b59-3b9f5cd61825")';

Using the index outperformed the sequential scan (and parallel seq scan)
by ~10-100x based on my config + laptop hardware!

However, when I did something a little more complex, like the below:

SELECT count(*)
FROM news_feed
WHERE data @? '$.length ? (@ < 150)';

SELECT count(*)
FROM news_feed
WHERE data @? '$.content ? (@ like_regex "^Start")';

SELECT id, jsonb_path_query(data, '$.content')
FROM news_feed
WHERE data @? '$.content ? (@ like_regex "risk" flag "i")';

I would find that the index scan performed as well as the sequential
scan. Additionally, on my laptop, the parallel sequential scan would
beat the index scan by ~2.5x in some cases.

Reading up on what the GIN patch does, this all makes sense: it's
optimized for equality, I understand there are challenges to be able to
handle inequality, regex exps, etc. And the cases where it really does
work well, it's _incredibly_ fast.

My suggestion would be adding some additional guidance in the user
documentation around how GIN works with the @@ and @? operators so they
can understand where GIN will work very well with JSON path + their data
and not be surprised when other types of JSON path queries are
performing on par with a sequential scan (or worse than a parallel seq
scan).

Thanks,

Jonathan



signature.asc
Description: OpenPGP digital signature


Re: jsonpath

2019-03-17 Thread Alexander Korotkov
Hi!

On Sun, Mar 17, 2019 at 7:46 PM Jonathan S. Katz  wrote:
> Like Pavel, I did some basic testing of the patch (on current HEAD
> 4178d8b91c) trying out various JSON path expressions, and yes, it all
> worked. I had a brief scare while testing on 4178d8b91c where initdb was
> failing on the bootstrapping step, but after doing a thorough wipe of
> build files and my output directory, it seems to be initializing okay.
>
> I also did some testing of the GIN patch upthread, as the quickness of
> retrieval of the data using JSON path is of course important as well.

Thank you very much for testing!

> Using a schema roughly like this:
>
> CREATE TABLE news_feed (
> id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
> data jsonb NOT NULL
> );
> CREATE INDEX news_feed_data_gin_idx ON news_feed USING gin(data);
>
> I loaded in a data set of roughly 420,000 rows. Each row had all the
> same keys but differing values (e.g. "length" and "content" as keys)
>
> I tested a few different JSON path scenarios. Some of the index scans
> performed way better than the equivalent sequential scans, for instance:
>
> SELECT count(*)
> FROM news_feed
> WHERE data @? '$.length ? (@ == 200)';
>
> SELECT *
> FROM news_feed
> WHERE data @? '$.id ? (@ == "22613cbc-d83e-4a29-8b59-3b9f5cd61825")';
>
> Using the index outperformed the sequential scan (and parallel seq scan)
> by ~10-100x based on my config + laptop hardware!

Great!

> However, when I did something a little more complex, like the below:
>
> SELECT count(*)
> FROM news_feed
> WHERE data @? '$.length ? (@ < 150)';
>
> SELECT count(*)
> FROM news_feed
> WHERE data @? '$.content ? (@ like_regex "^Start")';
>
> SELECT id, jsonb_path_query(data, '$.content')
> FROM news_feed
> WHERE data @? '$.content ? (@ like_regex "risk" flag "i")';
>
> I would find that the index scan performed as well as the sequential
> scan. Additionally, on my laptop, the parallel sequential scan would
> beat the index scan by ~2.5x in some cases.

Yeah, this cases are not supported.  Did optimizer automatically
select sequential scan in this case (if not touching enable_*
variables)?  It should, because optimizer understands that GIN scan
will be bad if extract_query method failed to extract anything.

> Reading up on what the GIN patch does, this all makes sense: it's
> optimized for equality, I understand there are challenges to be able to
> handle inequality, regex exps, etc. And the cases where it really does
> work well, it's _incredibly_ fast.

Yes, for more complex cases, we need different opclasses.  For
instance, we can consider porting jsquery opclasses to PG 13.  And it
become even more important to get parametrized opclasses, because we
don't necessary want to index all the json fields in this same way.
That's another challenge for future releases.  But what we have now is
just support for some of jsonpathes for existing opclasses.

> My suggestion would be adding some additional guidance in the user
> documentation around how GIN works with the @@ and @? operators so they
> can understand where GIN will work very well with JSON path + their data
> and not be surprised when other types of JSON path queries are
> performing on par with a sequential scan (or worse than a parallel seq
> scan).

Good point.  Will do.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-17 Thread Jonathan S. Katz
On 3/17/19 12:55 PM, Alexander Korotkov wrote:
> 
>> However, when I did something a little more complex, like the below:
>>
>> SELECT count(*)
>> FROM news_feed
>> WHERE data @? '$.length ? (@ < 150)';
>>
>> SELECT count(*)
>> FROM news_feed
>> WHERE data @? '$.content ? (@ like_regex "^Start")';
>>
>> SELECT id, jsonb_path_query(data, '$.content')
>> FROM news_feed
>> WHERE data @? '$.content ? (@ like_regex "risk" flag "i")';
>>
>> I would find that the index scan performed as well as the sequential
>> scan. Additionally, on my laptop, the parallel sequential scan would
>> beat the index scan by ~2.5x in some cases.
> 
> Yeah, this cases are not supported.  Did optimizer automatically
> select sequential scan in this case (if not touching enable_*
> variables)?  It should, because optimizer understands that GIN scan
> will be bad if extract_query method failed to extract anything.

It did not - it was doing a bitmap heap scan. I have default costs
setup. Example output from EXPLAIN ANALYZE with the index available:

 Aggregate  (cost=1539.78..1539.79 rows=1 width=8) (actual
time=270.419..270.419 rows=1 loops=1)
   ->  Bitmap Heap Scan on news_feed  (cost=23.24..1538.73 rows=418
width=0) (actual time=84.040..270.407 rows=5 loops=1)
 Recheck Cond: (data @? '$."length"?(@ < 150)'::jsonpath)
 Rows Removed by Index Recheck: 418360
 Heap Blocks: exact=28690
 ->  Bitmap Index Scan on news_feed_data_gin_idx
(cost=0.00..23.14 rows=418 width=0) (actual time=41.788..41.788
rows=418365 loops=1)
   Index Cond: (data @? '$."length"?(@ < 150)'::jsonpath)
 Planning Time: 0.168 ms
 Execution Time: 271.105 ms

And for arguments sake, after I dropped the index (and
max_parallel_workers = 8):

 Finalize Aggregate  (cost=30998.07..30998.08 rows=1 width=8) (actual
time=91.062..91.062 rows=1 loops=1)
   ->  Gather  (cost=30997.65..30998.06 rows=4 width=8) (actual
time=90.892..97.739 rows=5 loops=1)
 Workers Planned: 4
 Workers Launched: 4
 ->  Partial Aggregate  (cost=29997.65..29997.66 rows=1 width=8)
(actual time=76.977..76.977 rows=1 loops=5)
   ->  Parallel Seq Scan on news_feed  (cost=0.00..29997.39
rows=104 width=0) (actual time=39.736..76.964 rows=1 loops=5)
 Filter: (data @? '$."length"?(@ < 150)'::jsonpath)
 Rows Removed by Filter: 83672
 Planning Time: 0.127 ms
 Execution Time: 97.801 ms


>> Reading up on what the GIN patch does, this all makes sense: it's
>> optimized for equality, I understand there are challenges to be able to
>> handle inequality, regex exps, etc. And the cases where it really does
>> work well, it's _incredibly_ fast.
> 
> Yes, for more complex cases, we need different opclasses.  For
> instance, we can consider porting jsquery opclasses to PG 13.  And it
> become even more important to get parametrized opclasses, because we
> don't necessary want to index all the json fields in this same way.
> That's another challenge for future releases.  But what we have now is
> just support for some of jsonpathes for existing opclasses.

Yeah, that makes sense, and seems to be my recollection from the several
years of presentations I've seen on the topic ;)

>> My suggestion would be adding some additional guidance in the user
>> documentation around how GIN works with the @@ and @? operators so they
>> can understand where GIN will work very well with JSON path + their data
>> and not be surprised when other types of JSON path queries are
>> performing on par with a sequential scan (or worse than a parallel seq
>> scan).
> 
> Good point.  Will do.

Thanks!

Jonathan



signature.asc
Description: OpenPGP digital signature


Re: jsonpath

2019-03-17 Thread Alexander Korotkov
On Sun, Mar 17, 2019 at 8:00 PM Jonathan S. Katz  wrote:
> On 3/17/19 12:55 PM, Alexander Korotkov wrote:
> >
> >> However, when I did something a little more complex, like the below:
> >>
> >> SELECT count(*)
> >> FROM news_feed
> >> WHERE data @? '$.length ? (@ < 150)';
> >>
> >> SELECT count(*)
> >> FROM news_feed
> >> WHERE data @? '$.content ? (@ like_regex "^Start")';
> >>
> >> SELECT id, jsonb_path_query(data, '$.content')
> >> FROM news_feed
> >> WHERE data @? '$.content ? (@ like_regex "risk" flag "i")';
> >>
> >> I would find that the index scan performed as well as the sequential
> >> scan. Additionally, on my laptop, the parallel sequential scan would
> >> beat the index scan by ~2.5x in some cases.
> >
> > Yeah, this cases are not supported.  Did optimizer automatically
> > select sequential scan in this case (if not touching enable_*
> > variables)?  It should, because optimizer understands that GIN scan
> > will be bad if extract_query method failed to extract anything.
>
> It did not - it was doing a bitmap heap scan. I have default costs
> setup. Example output from EXPLAIN ANALYZE with the index available:
>
>  Aggregate  (cost=1539.78..1539.79 rows=1 width=8) (actual
> time=270.419..270.419 rows=1 loops=1)
>->  Bitmap Heap Scan on news_feed  (cost=23.24..1538.73 rows=418
> width=0) (actual time=84.040..270.407 rows=5 loops=1)
>  Recheck Cond: (data @? '$."length"?(@ < 150)'::jsonpath)
>  Rows Removed by Index Recheck: 418360
>  Heap Blocks: exact=28690
>  ->  Bitmap Index Scan on news_feed_data_gin_idx
> (cost=0.00..23.14 rows=418 width=0) (actual time=41.788..41.788
> rows=418365 loops=1)
>Index Cond: (data @? '$."length"?(@ < 150)'::jsonpath)
>  Planning Time: 0.168 ms
>  Execution Time: 271.105 ms
>
> And for arguments sake, after I dropped the index (and
> max_parallel_workers = 8):
>
>  Finalize Aggregate  (cost=30998.07..30998.08 rows=1 width=8) (actual
> time=91.062..91.062 rows=1 loops=1)
>->  Gather  (cost=30997.65..30998.06 rows=4 width=8) (actual
> time=90.892..97.739 rows=5 loops=1)
>  Workers Planned: 4
>  Workers Launched: 4
>  ->  Partial Aggregate  (cost=29997.65..29997.66 rows=1 width=8)
> (actual time=76.977..76.977 rows=1 loops=5)
>->  Parallel Seq Scan on news_feed  (cost=0.00..29997.39
> rows=104 width=0) (actual time=39.736..76.964 rows=1 loops=5)
>  Filter: (data @? '$."length"?(@ < 150)'::jsonpath)
>  Rows Removed by Filter: 83672
>  Planning Time: 0.127 ms
>  Execution Time: 97.801 ms

Thank you for the explanation.  Is it jsonb_ops or jsonb_path_ops?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-17 Thread Jonathan S. Katz
On 3/17/19 1:02 PM, Alexander Korotkov wrote:
> 
> Thank you for the explanation.  Is it jsonb_ops or jsonb_path_ops?

I just used "USING gin(col)" so jsonb_ops.

Thanks,

Jonathan



signature.asc
Description: OpenPGP digital signature


Re: jsonpath

2019-03-17 Thread Alexander Korotkov
On Sun, Mar 17, 2019 at 8:06 PM Jonathan S. Katz  wrote:
> On 3/17/19 1:02 PM, Alexander Korotkov wrote:
> >
> > Thank you for the explanation.  Is it jsonb_ops or jsonb_path_ops?
>
> I just used "USING gin(col)" so jsonb_ops.

I see.  So, jsonb_ops extracts from this query only existence of
.length key.  And I can bet it exists in all (or almost all) the
documents.  Thus, optimizer thinks index might be useful, while it's
useless.  There is not much can be done while we don't have statistics
for jsonb (and access to it from GIN extract_query).  So, for now we
can just refuse from extracting only keys from jsonpath in jsonb_ops.
But I think it would be better to just document this issue.  In future
we should improve that with statistics.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-17 Thread Jonathan S. Katz
On 3/17/19 1:14 PM, Alexander Korotkov wrote:
> On Sun, Mar 17, 2019 at 8:06 PM Jonathan S. Katz  wrote:
>> On 3/17/19 1:02 PM, Alexander Korotkov wrote:
>>>
>>> Thank you for the explanation.  Is it jsonb_ops or jsonb_path_ops?
>>
>> I just used "USING gin(col)" so jsonb_ops.
> 
> I see.  So, jsonb_ops extracts from this query only existence of
> .length key.  And I can bet it exists in all (or almost all) the
> documents.  Thus, optimizer thinks index might be useful, while it's
> useless.  There is not much can be done while we don't have statistics
> for jsonb (and access to it from GIN extract_query).  So, for now we
> can just refuse from extracting only keys from jsonpath in jsonb_ops.
> But I think it would be better to just document this issue.  In future
> we should improve that with statistics.

That seems to make sense, especially given how I've typically stored
JSON documents in PostgreSQL. It sounds like this particular problem
would be solved appropriately with JSONB statistics.

Thanks,

Jonathan



signature.asc
Description: OpenPGP digital signature


Re: jsonpath

2019-03-17 Thread Nikita Glukhov


On 17.03.2019 21:29, Jonathan S. Katz wrote:

On 3/17/19 1:14 PM, Alexander Korotkov wrote:

On Sun, Mar 17, 2019 at 8:06 PM Jonathan S. Katz  wrote:

On 3/17/19 1:02 PM, Alexander Korotkov wrote:

Thank you for the explanation.  Is it jsonb_ops or jsonb_path_ops?

I just used "USING gin(col)" so jsonb_ops.

I see.  So, jsonb_ops extracts from this query only existence of
.length key.  And I can bet it exists in all (or almost all) the
documents.  Thus, optimizer thinks index might be useful, while it's
useless.  There is not much can be done while we don't have statistics
for jsonb (and access to it from GIN extract_query).  So, for now we
can just refuse from extracting only keys from jsonpath in jsonb_ops.
But I think it would be better to just document this issue.  In future
we should improve that with statistics.

That seems to make sense, especially given how I've typically stored
JSON documents in PostgreSQL. It sounds like this particular problem
would be solved appropriately with JSONB statistics.


GIN jsonb_ops extracts from query

  data @? '$.length ? (@ < 150)'

the same GIN entries as from queries

  data @? '$.length'
  data ? 'length'


If you don't want to extract entries from unsupported expressions, you can try
to use another jsonpath operator @@. Queries will also look like a bit simpler:

  data @@ '$.length < 150'
  data @@ '$.content like_regex "^Start"'
  data @@ '$.content like_regex "risk" flag "i"'

All this queries emit no GIN entries. But note that

  data @@ '$ ? (@.content == "foo").length < 150'

emits the same entries as

  data @@ '$.content == "foo"'



We already have a POC implementation of jsonb statistics that was written
2 years ago.  I rebased it onto the current master yesterday.  If it is
interesting, you can find it on my GitHub [1].  But note, that there is
no support for  jsonpath operators yet, only boolean EXISTS ?, ?|, ?&, and
CONTAINS @> operators are supported.  Also there is no docs, and it works
slowly  (a more effective storage method for statistics of individual JSON
paths is needed).

Also there is ability to calculate derived statistics of expressions like

  js -> 'x' -> 0 -> 'y'
  js #> '{x,0,y}'

using jsonb statistics for columns "js". So the selectivity of expressions

  js -> 'x' -> 0 -> 'y' = '123'
  js #> '{x,0,y}' >= '123'

also can be estimated (but these expressions can't be used by index on "js").


This topic deserves a separate discussion.  I hope, we will start the
corresponding thread for PG13 after we find a more effective way of jsonb
statistics storing.


[1] https://github.com/glukhovn/postgres/tree/jsonb_stats

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-18 Thread Tom Lane
Just another minor bitch about this patch: jsonpath_scan.l has introduced
a typedef called "keyword".  This is causing pgindent to produce seriously
ugly results in libpq, and probably in other places where that is used as
a field or variable name.  Please rename that typedef to something less
generic.

regards, tom lane



Re: jsonpath

2019-03-18 Thread Alexander Korotkov
On Mon, Mar 18, 2019 at 10:08 PM Tom Lane  wrote:
> Just another minor bitch about this patch: jsonpath_scan.l has introduced
> a typedef called "keyword".  This is causing pgindent to produce seriously
> ugly results in libpq, and probably in other places where that is used as
> a field or variable name.  Please rename that typedef to something less
> generic.

Ooops... I propose to rename it to KeyWord, which is already
typedef'ed in formatting.c.  See the attached patch.  Is it OK?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


jsonpath_rename_typedef.patch
Description: Binary data


Re: jsonpath

2019-03-18 Thread Tom Lane
Alexander Korotkov  writes:
> On Mon, Mar 18, 2019 at 10:08 PM Tom Lane  wrote:
>> Just another minor bitch about this patch: jsonpath_scan.l has introduced
>> a typedef called "keyword".  This is causing pgindent to produce seriously
>> ugly results in libpq, and probably in other places where that is used as
>> a field or variable name.  Please rename that typedef to something less
>> generic.

> Ooops... I propose to rename it to KeyWord, which is already
> typedef'ed in formatting.c.  See the attached patch.  Is it OK?

I had in mind JsonPathKeyword or something like that.  If you re-use
formatting.c's typedef name, pgindent won't care, but it's possible
you'd be in for unhappiness when trying to look at these structs in
gdb for instance.

regards, tom lane



Re: jsonpath

2019-03-18 Thread Pavel Stehule
po 18. 3. 2019 v 21:23 odesílatel Tom Lane  napsal:

> Alexander Korotkov  writes:
> > On Mon, Mar 18, 2019 at 10:08 PM Tom Lane  wrote:
> >> Just another minor bitch about this patch: jsonpath_scan.l has
> introduced
> >> a typedef called "keyword".  This is causing pgindent to produce
> seriously
> >> ugly results in libpq, and probably in other places where that is used
> as
> >> a field or variable name.  Please rename that typedef to something less
> >> generic.
>
> > Ooops... I propose to rename it to KeyWord, which is already
> > typedef'ed in formatting.c.  See the attached patch.  Is it OK?
>
> I had in mind JsonPathKeyword or something like that.  If you re-use
> formatting.c's typedef name, pgindent won't care, but it's possible
> you'd be in for unhappiness when trying to look at these structs in
> gdb for instance.
>

JsonPathKeyword is better verbose

Pavel


> regards, tom lane
>


Re: jsonpath

2019-03-19 Thread John Naylor
On Sun, Feb 24, 2019 at 5:03 PM Alexander Korotkov
 wrote:
> On Wed, Jan 30, 2019 at 5:28 AM Andres Freund  wrote:
> > Why -CF, and why is -p repeated?
>
> BTW, for our SQL grammar we have
>
> > scan.c: FLEXFLAGS = -CF -p -p
>
> Is it kind of default?

I just saw this in the committed patch. This is not default, it's
chosen for maximum performance at the expense of binary/memory size.
That's fine, but with a little effort you can also make the scanner
non-backtracking for additional performance, as in the attached.

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


jsonpath-scan-nobackup.patch
Description: Binary data


Re: jsonpath

2019-03-19 Thread Alexander Korotkov
On Mon, Mar 18, 2019 at 11:23 PM Tom Lane  wrote:
> Alexander Korotkov  writes:
> > On Mon, Mar 18, 2019 at 10:08 PM Tom Lane  wrote:
> >> Just another minor bitch about this patch: jsonpath_scan.l has introduced
> >> a typedef called "keyword".  This is causing pgindent to produce seriously
> >> ugly results in libpq, and probably in other places where that is used as
> >> a field or variable name.  Please rename that typedef to something less
> >> generic.
>
> > Ooops... I propose to rename it to KeyWord, which is already
> > typedef'ed in formatting.c.  See the attached patch.  Is it OK?
>
> I had in mind JsonPathKeyword or something like that.  If you re-use
> formatting.c's typedef name, pgindent won't care, but it's possible
> you'd be in for unhappiness when trying to look at these structs in
> gdb for instance.

Good point, thanks!  Pushed.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-19 Thread Alexander Korotkov
On Tue, Mar 19, 2019 at 12:23 PM John Naylor
 wrote:
> On Sun, Feb 24, 2019 at 5:03 PM Alexander Korotkov
>  wrote:
> > On Wed, Jan 30, 2019 at 5:28 AM Andres Freund  wrote:
> > > Why -CF, and why is -p repeated?
> >
> > BTW, for our SQL grammar we have
> >
> > > scan.c: FLEXFLAGS = -CF -p -p
> >
> > Is it kind of default?
>
> I just saw this in the committed patch. This is not default, it's
> chosen for maximum performance at the expense of binary/memory size.
> That's fine, but with a little effort you can also make the scanner
> non-backtracking for additional performance, as in the attached.

We're working on patchset improving flex scanner.  Already have
similar change among the others.  But thanks a lot for your attention!

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-19 Thread Alexander Korotkov
On Sun, Mar 17, 2019 at 6:03 PM Tom Lane  wrote:
> Andrew Dunstan  writes:
> > Why are we installing the jsonpath_gram.h file? It's not going to be
> > used by anything else, is it? TBH, I'm not sure I see why we're
> > installing the scanner.h file either.
>
> As near as I can see, jsonpath_gram.h and jsonpath_scanner.h exist only
> for communication between jsonpath_gram.y and jsonpath_scan.l.  Maybe
> we'd be better off handling that need by compiling the .l file as part
> of the .y file, as we used to do with the core lexer and still do with
> several others (cf comments for commit 72b1e3a21); then we wouldn't
> even have to generate these files much less install them.
>
> A quick look at jsonpath_scan.c shows that it's pretty innocent of
> the tricks we've learned over the years for flex/bison portability;
> in particular I see that it's #including  before postgres.h,
> which is a no-no.  So that whole area needs more review anyway.

Attached patch is getting rid of jsonpath_gram.h.  Would like to see
results of http://commitfest.cputube.org/ before committing, because
I'm not available to test this of windows machine.  There would be
further patches rearranging jsonpath_gram.y and jsonpath_scan.l.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0001-get-rid-of-jsonpath_gram_h.patch
Description: Binary data


Re: jsonpath

2019-03-21 Thread Alexander Korotkov
On Tue, Mar 19, 2019 at 8:10 PM Alexander Korotkov
 wrote:
> On Sun, Mar 17, 2019 at 6:03 PM Tom Lane  wrote:
> > Andrew Dunstan  writes:
> > > Why are we installing the jsonpath_gram.h file? It's not going to be
> > > used by anything else, is it? TBH, I'm not sure I see why we're
> > > installing the scanner.h file either.
> >
> > As near as I can see, jsonpath_gram.h and jsonpath_scanner.h exist only
> > for communication between jsonpath_gram.y and jsonpath_scan.l.  Maybe
> > we'd be better off handling that need by compiling the .l file as part
> > of the .y file, as we used to do with the core lexer and still do with
> > several others (cf comments for commit 72b1e3a21); then we wouldn't
> > even have to generate these files much less install them.
> >
> > A quick look at jsonpath_scan.c shows that it's pretty innocent of
> > the tricks we've learned over the years for flex/bison portability;
> > in particular I see that it's #including  before postgres.h,
> > which is a no-no.  So that whole area needs more review anyway.
>
> Attached patch is getting rid of jsonpath_gram.h.  Would like to see
> results of http://commitfest.cputube.org/ before committing, because
> I'm not available to test this of windows machine.  There would be
> further patches rearranging jsonpath_gram.y and jsonpath_scan.l.

Attaches patches improving jsonpath parser.  First one introduces
cosmetic changes, while second gets rid of backtracking.  I'm also
planning to add high-level comment for both grammar and lexer.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0001-cosmetic-changes-jsonpath-parser.patch
Description: Binary data


0002-get-rid-of-backtracking.patch
Description: Binary data


Re: jsonpath

2019-03-21 Thread Nikita Glukhov

Hi.

Attached patch enables throwing of errors in jsonb_path_match() in its
non-silent mode when the jsonpath expression failed to return a singleton
boolean.  Previously, NULL was always returned, and it seemed to be
inconsistent with the behavior of other functions, in which structural and
other errors were not suppressed in non-silent mode.


We also think that jsonb_path_match() needs to be renamed, because its
current name is confusing to many users.  "Match" name is more suitable for
jsonpath-based pattern matching like that (maybe we'll implement it later):

  jsonb_path_match(
'{ "a": 1, "b": 2,"c": "str" }',
'{ "a": 1, "b": @ > 1, * : @.type == "string" }'
  )

Below are some possible name variants:

  jsonb_path_predicate() (original name)
  jsonb_path_pred()
  jsonb_path_test()
  jsonb_path_check()
  jsonb_path_bool()

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

>From f298aaa1258c6a4b4a487fb44980654b3a8e2e36 Mon Sep 17 00:00:00 2001
From: Nikita Glukhov 
Date: Fri, 22 Mar 2019 02:34:24 +0300
Subject: [PATCH] Throw error in jsonb_path_match() when silent is false

---
 src/backend/utils/adt/jsonpath_exec.c| 26 +-
 src/test/regress/expected/jsonb_jsonpath.out | 51 
 src/test/regress/sql/jsonb_jsonpath.sql  | 12 +++
 3 files changed, 80 insertions(+), 9 deletions(-)

diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c
index c072257..074cea2 100644
--- a/src/backend/utils/adt/jsonpath_exec.c
+++ b/src/backend/utils/adt/jsonpath_exec.c
@@ -320,7 +320,6 @@ jsonb_path_match(PG_FUNCTION_ARGS)
 {
 	Jsonb	   *jb = PG_GETARG_JSONB_P(0);
 	JsonPath   *jp = PG_GETARG_JSONPATH_P(1);
-	JsonbValue *jbv;
 	JsonValueList found = {0};
 	Jsonb	   *vars = NULL;
 	bool		silent = true;
@@ -333,18 +332,27 @@ jsonb_path_match(PG_FUNCTION_ARGS)
 
 	(void) executeJsonPath(jp, vars, jb, !silent, &found);
 
-	if (JsonValueListLength(&found) < 1)
-		PG_RETURN_NULL();
-
-	jbv = JsonValueListHead(&found);
-
 	PG_FREE_IF_COPY(jb, 0);
 	PG_FREE_IF_COPY(jp, 1);
 
-	if (jbv->type != jbvBool)
-		PG_RETURN_NULL();
+	if (JsonValueListLength(&found) == 1)
+	{
+		JsonbValue *jbv = JsonValueListHead(&found);
+
+		if (jbv->type == jbvBool)
+			PG_RETURN_BOOL(jbv->val.boolean);
+
+		if (jbv->type == jbvNull)
+			PG_RETURN_NULL();
+	}
+
+	if (!silent)
+		ereport(ERROR,
+(errcode(ERRCODE_SINGLETON_JSON_ITEM_REQUIRED),
+ errmsg(ERRMSG_SINGLETON_JSON_ITEM_REQUIRED),
+ errdetail("expression should return a singleton boolean")));
 
-	PG_RETURN_BOOL(jbv->val.boolean);
+	PG_RETURN_NULL();
 }
 
 /*
diff --git a/src/test/regress/expected/jsonb_jsonpath.out b/src/test/regress/expected/jsonb_jsonpath.out
index e604bae..66f0ffd 100644
--- a/src/test/regress/expected/jsonb_jsonpath.out
+++ b/src/test/regress/expected/jsonb_jsonpath.out
@@ -1769,6 +1769,57 @@ SELECT jsonb_path_exists('[{"a": 1}, {"a": 2}, {"a": 3}, {"a": 5}]', '$[*] ? (@.
  f
 (1 row)
 
+SELECT jsonb_path_match('true', '$', silent => false);
+ jsonb_path_match 
+--
+ t
+(1 row)
+
+SELECT jsonb_path_match('false', '$', silent => false);
+ jsonb_path_match 
+--
+ f
+(1 row)
+
+SELECT jsonb_path_match('null', '$', silent => false);
+ jsonb_path_match 
+--
+ 
+(1 row)
+
+SELECT jsonb_path_match('1', '$', silent => true);
+ jsonb_path_match 
+--
+ 
+(1 row)
+
+SELECT jsonb_path_match('1', '$', silent => false);
+ERROR:  singleton SQL/JSON item required
+DETAIL:  expression should return a singleton boolean
+SELECT jsonb_path_match('"a"', '$', silent => false);
+ERROR:  singleton SQL/JSON item required
+DETAIL:  expression should return a singleton boolean
+SELECT jsonb_path_match('{}', '$', silent => false);
+ERROR:  singleton SQL/JSON item required
+DETAIL:  expression should return a singleton boolean
+SELECT jsonb_path_match('[true]', '$', silent => false);
+ERROR:  singleton SQL/JSON item required
+DETAIL:  expression should return a singleton boolean
+SELECT jsonb_path_match('{}', 'lax $.a', silent => false);
+ERROR:  singleton SQL/JSON item required
+DETAIL:  expression should return a singleton boolean
+SELECT jsonb_path_match('{}', 'strict $.a', silent => false);
+ERROR:  SQL/JSON member not found
+DETAIL:  JSON object does not contain key "a"
+SELECT jsonb_path_match('{}', 'strict $.a', silent => true);
+ jsonb_path_match 
+--
+ 
+(1 row)
+
+SELECT jsonb_path_match('[true, true]', '$[*]', silent => false);
+ERROR:  singleton SQL/JSON item required
+DETAIL:  expression should return a singleton boolean
 SELECT jsonb '[{"a": 1}, {"a": 2}]' @@ '$[*].a > 1';
  ?column? 
 --
diff --git a/src/test/regress/sql/jsonb_jsonpath.sql b/src/test/regress/sql/jsonb_jsonpath.sql
index 41b346b..f8ef39c 100644
--- a/src/test/regress/sql/jsonb_jsonpath.sql
+++ b/src/test/regress/sql/jsonb_jsonpath.sql
@@ -366,6 +366,18 @@ SELECT jsonb_path_exis

Re: jsonpath

2019-03-21 Thread John Naylor
On Thu, Mar 21, 2019 at 9:59 PM Alexander Korotkov
 wrote:
> Attaches patches improving jsonpath parser.  First one introduces
> cosmetic changes, while second gets rid of backtracking.  I'm also
> planning to add high-level comment for both grammar and lexer.

The cosmetic changes look good to me. I just noticed a couple things
about the comments.

0001:

+/* Check if current scanstring value constitutes a keyword */

'is a keyword' is better. 'Constitutes' implies parts of a whole.

+ * Resize scanstring for appending of given length.  Reinitilize if required.

s/Reinitilize/Reinitialize/

The first sentence is not entirely clear to me.


0002:

These two rules are not strictly necessary:

{unicode}+\\ {
/* throw back the \\, and treat as unicode */
yyless(yyleng - 1);
parseUnicode(yytext, yyleng);
}

{hex_char}+\\ {
/* throw back the \\, and treat as hex */
yyless(yyleng - 1);
parseHexChars(yytext, yyleng);
}

...and only seem to be there because of how these are written:

{unicode}+ { parseUnicode(yytext, yyleng); }
{hex_char}+ { parseHexChars(yytext, yyleng); }
{unicode}*{unicodefail} { yyerror(NULL, "Unicode
sequence is invalid"); }
{hex_char}*{hex_fail} { yyerror(NULL, "Hex character
sequence is invalid"); }

I don't understand the reasoning here -- is it a micro-optimization?
The following is simpler, allow the rules I mentioned to be removed,
and make check still passes. I would prefer it unless there is a
performance penalty, in which case a comment to describe the
additional complexity would be helpful.

{unicode} { parseUnicode(yytext, yyleng); }
{hex_char} { parseHexChars(yytext, yyleng); }
{unicodefail} { yyerror(NULL, "Unicode sequence is invalid"); }
{hex_fail} { yyerror(NULL, "Hex character sequence is
invalid"); }

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

2019-03-21 Thread Oleg Bartunov
On Fri, 22 Mar 2019, 03:14 Nikita Glukhov,  wrote:

> Hi.
>
> Attached patch enables throwing of errors in jsonb_path_match() in its
> non-silent mode when the jsonpath expression failed to return a singleton
> boolean.  Previously, NULL was always returned, and it seemed to be
> inconsistent with the behavior of other functions, in which structural and
> other errors were not suppressed in non-silent mode.
>
>
> We also think that jsonb_path_match() needs to be renamed, because its
> current name is confusing to many users.  "Match" name is more suitable for
> jsonpath-based pattern matching like that (maybe we'll implement it later):
>
>   jsonb_path_match(
> '{ "a": 1, "b": 2,"c": "str" }',
> '{ "a": 1, "b": @ > 1, * : @.type == "string" }'
>   )
>
> Would be very useful for constraints.

>

> Below are some possible name variants:
>
>   jsonb_path_predicate() (original name)
>
> Too long

  jsonb_path_pred()
>   jsonb_path_test()
>   jsonb_path_check()
>
>
Check is good to me

>
>   jsonb_path_bool()
>
> --
> Nikita Glukhov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>


Re: jsonpath

2019-03-22 Thread Nikita Glukhov

On 21.03.2019 16:58, Alexander Korotkov wrote:


On Tue, Mar 19, 2019 at 8:10 PM Alexander Korotkov
 wrote:
Attaches patches improving jsonpath parser.  First one introduces
cosmetic changes, while second gets rid of backtracking.  I'm also
planning to add high-level comment for both grammar and lexer.


Parsing of integers now is wrong: neither JSON specification, nor our json[b]
allow leading zeros in integers and floats starting with a dot.

=# SELECT json '.1';
ERROR:  invalid input syntax for type json
LINE 1: SELECT jsonb '.1';
 ^
DETAIL:  Token "." is invalid.
CONTEXT:  JSON data, line 1: 

=# SELECT json '00';
ERROR:  invalid input syntax for type json
LINE 1: SELECT jsonb '00';
 ^
DETAIL:  Token "00" is invalid.
CONTEXT:  JSON data, line 1: 00

=# SELECT json '00.1';
ERROR:  invalid input syntax for type json
LINE 1: SELECT jsonb '00.1';
 ^
DETAIL:  Token "00" is invalid.
CONTEXT:  JSON data, line 1: 00...


In JavaScript integers with leading zero are treated as octal numbers,
but leading dot is a allowed:

v8 > 0123
83
v8 > 000.1
Uncaught SyntaxError: Unexpected number
v8> .1
0.1

Attached patch 0003 fixes this issue.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

>From 5b0844d77f4fe65d666072356f7c0e42d13c9a63 Mon Sep 17 00:00:00 2001
From: Nikita Glukhov 
Date: Fri, 22 Mar 2019 14:38:46 +0300
Subject: [PATCH 3/3] Fix parsing of numbers in jsonpath

---
 src/backend/utils/adt/jsonpath_scan.l  |   6 +-
 src/test/regress/expected/jsonpath.out | 117 +++--
 2 files changed, 55 insertions(+), 68 deletions(-)

diff --git a/src/backend/utils/adt/jsonpath_scan.l b/src/backend/utils/adt/jsonpath_scan.l
index 844ea5e..a6b67ea 100644
--- a/src/backend/utils/adt/jsonpath_scan.l
+++ b/src/backend/utils/adt/jsonpath_scan.l
@@ -77,9 +77,9 @@ any			[^\?\%\$\.\[\]\{\}\(\)\|\&\!\=\<\>\@\#\,\*:\-\+\/\\\"\' \t\n\r\f]
 blank		[ \t\n\r\f]
 
 digit		[0-9]
-integer		{digit}+
-decimal		{digit}*\.{digit}+
-decimalfail	{digit}+\.
+integer		(0|[1-9]{digit}*)
+decimal		{integer}\.{digit}+
+decimalfail	{integer}\.
 real		({integer}|{decimal})[Ee][-+]?{digit}+
 realfail1	({integer}|{decimal})[Ee]
 realfail2	({integer}|{decimal})[Ee][-+]
diff --git a/src/test/regress/expected/jsonpath.out b/src/test/regress/expected/jsonpath.out
index b7de491..a99643f 100644
--- a/src/test/regress/expected/jsonpath.out
+++ b/src/test/regress/expected/jsonpath.out
@@ -547,23 +547,20 @@ select '$ ? (@.a < +1)'::jsonpath;
 (1 row)
 
 select '$ ? (@.a < .1)'::jsonpath;
-jsonpath 
--
- $?(@."a" < 0.1)
-(1 row)
-
+ERROR:  bad jsonpath representation
+LINE 1: select '$ ? (@.a < .1)'::jsonpath;
+   ^
+DETAIL:  syntax error, unexpected '.' at or near "."
 select '$ ? (@.a < -.1)'::jsonpath;
- jsonpath 
---
- $?(@."a" < -0.1)
-(1 row)
-
+ERROR:  bad jsonpath representation
+LINE 1: select '$ ? (@.a < -.1)'::jsonpath;
+   ^
+DETAIL:  syntax error, unexpected '.' at or near "."
 select '$ ? (@.a < +.1)'::jsonpath;
-jsonpath 
--
- $?(@."a" < 0.1)
-(1 row)
-
+ERROR:  bad jsonpath representation
+LINE 1: select '$ ? (@.a < +.1)'::jsonpath;
+   ^
+DETAIL:  syntax error, unexpected '.' at or near "."
 select '$ ? (@.a < 0.1)'::jsonpath;
 jsonpath 
 -
@@ -619,23 +616,20 @@ select '$ ? (@.a < +1e1)'::jsonpath;
 (1 row)
 
 select '$ ? (@.a < .1e1)'::jsonpath;
-   jsonpath

- $?(@."a" < 1)
-(1 row)
-
+ERROR:  bad jsonpath representation
+LINE 1: select '$ ? (@.a < .1e1)'::jsonpath;
+   ^
+DETAIL:  syntax error, unexpected '.' at or near "."
 select '$ ? (@.a < -.1e1)'::jsonpath;
-jsonpath
-
- $?(@."a" < -1)
-(1 row)
-
+ERROR:  bad jsonpath representation
+LINE 1: select '$ ? (@.a < -.1e1)'::jsonpath;
+   ^
+DETAIL:  syntax error, unexpected '.' at or near "."
 select '$ ? (@.a < +.1e1)'::jsonpath;
-   jsonpath

- $?(@."a" < 1)
-(1 row)
-
+ERROR:  bad jsonpath representation
+LINE 1: select '$ ? (@.a < +.1e1)'::jsonpath;
+   ^
+DETAIL:  syntax error, unexpected '.' at or near "."
 select '$ ? (@.a < 0.1e1)'::jsonpath;
jsonpath
 ---
@@ -691,23 +685,20 @@ select '$ ? (@.a < +1e-1)'::jsonpath;
 (1 row)
 
 select '$ ? (@.a < .1e-1)'::jsonpath;
- jsonpath 
---
- $?(@."a" < 0.01)
-(1 row)
-
+ERROR:  bad jsonpath representation
+LINE 1: select '$ ? (@.a < .1e-1)'::jsonpath;
+   ^
+DETAIL:  syntax error, unexpected '.' at or near "."
 select '$ ? (@.a < -.1e-1)'::jsonpath;
- jsonpath  

- $?(@."a" < -0.01)
-(1 row)
-
+ERROR:  bad jsonpath representation
+LINE 1: select '$ ? (@.a < -.1e-1)'::jsonpath;
+   ^
+DETAIL:  syntax error, unexpected '.' at or near "."
 select '$ ? (@.a < +.1e-1)'::jsonpath;
- json

Re: jsonpath

2019-03-22 Thread Alexander Korotkov
On Fri, Mar 22, 2019 at 5:38 AM John Naylor  wrote:
> On Thu, Mar 21, 2019 at 9:59 PM Alexander Korotkov
>  wrote:
> > Attaches patches improving jsonpath parser.  First one introduces
> > cosmetic changes, while second gets rid of backtracking.  I'm also
> > planning to add high-level comment for both grammar and lexer.
>
> The cosmetic changes look good to me. I just noticed a couple things
> about the comments.
>
> 0001:
>
> +/* Check if current scanstring value constitutes a keyword */
>
> 'is a keyword' is better. 'Constitutes' implies parts of a whole.
>
> + * Resize scanstring for appending of given length.  Reinitilize if required.
>
> s/Reinitilize/Reinitialize/
>
> The first sentence is not entirely clear to me.

Thank you, fixed.

> 0002:
>
> These two rules are not strictly necessary:
>
> {unicode}+\\ {
> /* throw back the \\, and treat as unicode */
> yyless(yyleng - 1);
> parseUnicode(yytext, yyleng);
> }
>
> {hex_char}+\\ {
> /* throw back the \\, and treat as hex */
> yyless(yyleng - 1);
> parseHexChars(yytext, yyleng);
> }
>
> ...and only seem to be there because of how these are written:
>
> {unicode}+ { parseUnicode(yytext, yyleng); }
> {hex_char}+ { parseHexChars(yytext, yyleng); }
> {unicode}*{unicodefail} { yyerror(NULL, "Unicode
> sequence is invalid"); }
> {hex_char}*{hex_fail} { yyerror(NULL, "Hex character
> sequence is invalid"); }
>
> I don't understand the reasoning here -- is it a micro-optimization?
> The following is simpler, allow the rules I mentioned to be removed,
> and make check still passes. I would prefer it unless there is a
> performance penalty, in which case a comment to describe the
> additional complexity would be helpful.
>
> {unicode} { parseUnicode(yytext, yyleng); }
> {hex_char} { parseHexChars(yytext, yyleng); }
> {unicodefail} { yyerror(NULL, "Unicode sequence is invalid"); 
> }
> {hex_fail} { yyerror(NULL, "Hex character sequence is
> invalid"); }

These rules are needed for unicode.  Sequential escaped unicode
characters might be connected by hi surrogate value.  See
jsonpath_encoding regression test in attached patch.

Regarding hex, I made it so for the sake of uniformity.  But I changed
my mind and decided that simpler flex rules are more important.  So,
now they are considered one-by-one.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0001-cosmetic-changes-jsonpath-parser-2.patch
Description: Binary data


0002-get-rid-of-backtracking-2.patch
Description: Binary data


Re: jsonpath

2019-03-24 Thread Andres Freund
Hi,

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2019-03-23%2013%3A01%3A28

2019-03-23 14:28:31.147 CET [18056:45] pg_regress/jsonpath LOG:  statement: 
select '$.g ? (@.a == 1 || !(@.x >= 123 || @.a == 4) && @.b == 7)'::jsonpath;
2019-03-23 14:28:31.157 CET [18055:59] pg_regress/jsonb_jsonpath LOG:  
statement: select jsonb_path_query('1', 'lax $.a');
2019-03-23 14:28:31.163 CET [9597:311] LOG:  server process (PID 18056) was 
terminated by signal 11: Segmentation fault
2019-03-23 14:28:31.163 CET [9597:312] DETAIL:  Failed process was running: 
select '$.g ? (@.a == 1 || !(@.x >= 123 || @.a == 4) && @.b == 7)'::jsonpath;
2019-03-23 14:28:31.164 CET [9597:313] LOG:  terminating any other
active server processes

Something's not quite right... Note this appears to be 32bit sparc.

- Andres



Re: jsonpath

2019-03-24 Thread Alexander Korotkov
On Sun, Mar 24, 2019 at 7:45 PM Andres Freund  wrote:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2019-03-23%2013%3A01%3A28
>
> 2019-03-23 14:28:31.147 CET [18056:45] pg_regress/jsonpath LOG:  statement: 
> select '$.g ? (@.a == 1 || !(@.x >= 123 || @.a == 4) && @.b == 7)'::jsonpath;
> 2019-03-23 14:28:31.157 CET [18055:59] pg_regress/jsonb_jsonpath LOG:  
> statement: select jsonb_path_query('1', 'lax $.a');
> 2019-03-23 14:28:31.163 CET [9597:311] LOG:  server process (PID 18056) was 
> terminated by signal 11: Segmentation fault
> 2019-03-23 14:28:31.163 CET [9597:312] DETAIL:  Failed process was running: 
> select '$.g ? (@.a == 1 || !(@.x >= 123 || @.a == 4) && @.b == 7)'::jsonpath;
> 2019-03-23 14:28:31.164 CET [9597:313] LOG:  terminating any other
> active server processes
>
> Something's not quite right... Note this appears to be 32bit sparc.

Thank you for pointing.  Will investigate.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-26 Thread Alexander Korotkov
On Sun, Mar 24, 2019 at 9:09 PM Alexander Korotkov
 wrote:
> On Sun, Mar 24, 2019 at 7:45 PM Andres Freund  wrote:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2019-03-23%2013%3A01%3A28
> >
> > 2019-03-23 14:28:31.147 CET [18056:45] pg_regress/jsonpath LOG:  statement: 
> > select '$.g ? (@.a == 1 || !(@.x >= 123 || @.a == 4) && @.b == 
> > 7)'::jsonpath;
> > 2019-03-23 14:28:31.157 CET [18055:59] pg_regress/jsonb_jsonpath LOG:  
> > statement: select jsonb_path_query('1', 'lax $.a');
> > 2019-03-23 14:28:31.163 CET [9597:311] LOG:  server process (PID 18056) was 
> > terminated by signal 11: Segmentation fault
> > 2019-03-23 14:28:31.163 CET [9597:312] DETAIL:  Failed process was running: 
> > select '$.g ? (@.a == 1 || !(@.x >= 123 || @.a == 4) && @.b == 
> > 7)'::jsonpath;
> > 2019-03-23 14:28:31.164 CET [9597:313] LOG:  terminating any other
> > active server processes
> >
> > Something's not quite right... Note this appears to be 32bit sparc.
>
> Thank you for pointing.  Will investigate.

Got access to that buildfarm animal thanks to Tom Turelinckx.  Now
running check-world in a loop on the same commit hash with same build
options.  Error wasn't triggered yet.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-26 Thread Tom Lane
Alexander Korotkov  writes:
> Got access to that buildfarm animal thanks to Tom Turelinckx.  Now
> running check-world in a loop on the same commit hash with same build
> options.  Error wasn't triggered yet.

I notice that snapper is using force_parallel_mode = regress ...
have you got that enabled in your manual test?

regards, tom lane



Re: jsonpath

2019-03-26 Thread Alexander Korotkov
On Tue, Mar 26, 2019 at 5:32 PM Tom Lane  wrote:
> Alexander Korotkov  writes:
> > Got access to that buildfarm animal thanks to Tom Turelinckx.  Now
> > running check-world in a loop on the same commit hash with same build
> > options.  Error wasn't triggered yet.
>
> I notice that snapper is using force_parallel_mode = regress ...
> have you got that enabled in your manual test?

Nope.  Thank you for pointing!  I've rerun my test loop with this...

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-03-27 Thread Alexander Korotkov
On Tue, Mar 26, 2019 at 6:06 PM Alexander Korotkov
 wrote:
> On Tue, Mar 26, 2019 at 5:32 PM Tom Lane  wrote:
> > Alexander Korotkov  writes:
> > > Got access to that buildfarm animal thanks to Tom Turelinckx.  Now
> > > running check-world in a loop on the same commit hash with same build
> > > options.  Error wasn't triggered yet.
> >
> > I notice that snapper is using force_parallel_mode = regress ...
> > have you got that enabled in your manual test?
>
> Nope.  Thank you for pointing!  I've rerun my test loop with this...

Still no reproduction.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: jsonpath

2019-03-27 Thread Tom Lane
Alexander Korotkov  writes:
> Still no reproduction.

Annoying, but it's probably not worth expending more effort on
right now.  I wonder whether that buildfarm animal can be upgraded
to capture core dump stack traces --- if so, then if it happens
again we'd have more info.

regards, tom lane




Re: jsonpath

2019-03-27 Thread Alexander Korotkov
On Wed, Mar 27, 2019 at 4:49 PM Tom Lane  wrote:
> Alexander Korotkov  writes:
> > Still no reproduction.
>
> Annoying, but it's probably not worth expending more effort on
> right now.  I wonder whether that buildfarm animal can be upgraded
> to capture core dump stack traces --- if so, then if it happens
> again we'd have more info.

Hopefully, Andrew will manage to get a backtrace [1].

BTW, while searching for this bug, I've collected this backtrace using valgrind.

==00:00:00:14.596 10866== Conditional jump or move depends on
uninitialised value(s)
==00:00:00:14.596 10866==at 0x579F8C4: strtod_l_internal (in
/usr/lib64/libc-2.17.so)
==00:00:00:14.596 10866==by 0x771561: float8in_internal_opt_error
(float.c:394)
==00:00:00:14.596 10866==by 0x7718B9: float8in_internal (float.c:515)
==00:00:00:14.596 10866==by 0x7718B9: float8in (float.c:336)
==00:00:00:14.596 10866==by 0x842D43: DirectFunctionCall1Coll (fmgr.c:803)
==00:00:00:14.596 10866==by 0x7C9649: numeric_float8 (numeric.c:3417)
==00:00:00:14.596 10866==by 0x842D43: DirectFunctionCall1Coll (fmgr.c:803)
==00:00:00:14.596 10866==by 0x7A1D8D: jsonb_float8 (jsonb.c:2058)
==00:00:00:14.596 10866==by 0x5F8F54: ExecInterpExpr (execExprInterp.c:649)
==00:00:00:14.596 10866==by 0x6A2E19: ExecEvalExprSwitchContext
(executor.h:307)
==00:00:00:14.596 10866==by 0x6A2E19: evaluate_expr (clauses.c:4827)
==00:00:00:14.596 10866==by 0x6A45FF: evaluate_function (clauses.c:4369)
==00:00:00:14.596 10866==by 0x6A45FF: simplify_function (clauses.c:3999)
==00:00:00:14.596 10866==by 0x6A31C1:
eval_const_expressions_mutator (clauses.c:2474)
==00:00:00:14.596 10866==by 0x644466: expression_tree_mutator
(nodeFuncs.c:3072)
==00:00:00:14.596 10866==
{
   
   Memcheck:Cond
   fun:strtod_l_internal
   fun:float8in_internal_opt_error
   fun:float8in_internal
   fun:float8in
   fun:DirectFunctionCall1Coll
   fun:numeric_float8
   fun:DirectFunctionCall1Coll
   fun:jsonb_float8
   fun:ExecInterpExpr
   fun:ExecEvalExprSwitchContext
   fun:evaluate_expr
   fun:evaluate_function
   fun:simplify_function
   fun:eval_const_expressions_mutator
   fun:expression_tree_mutator
}

Not sure whether it's related to 16d489b0fe.  Will investigate more.

1. 
https://www.postgresql.org/message-id/91ff584f-75d2-471f-4d9e-9ee8f09cdc1d%402ndQuadrant.com

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: jsonpath

2019-03-27 Thread Tomas Vondra

On Wed, Mar 27, 2019 at 05:37:40PM +0300, Alexander Korotkov wrote:

On Wed, Mar 27, 2019 at 4:49 PM Tom Lane  wrote:

Alexander Korotkov  writes:
> Still no reproduction.

Annoying, but it's probably not worth expending more effort on
right now.  I wonder whether that buildfarm animal can be upgraded
to capture core dump stack traces --- if so, then if it happens
again we'd have more info.


Hopefully, Andrew will manage to get a backtrace [1].

BTW, while searching for this bug, I've collected this backtrace using valgrind.

==00:00:00:14.596 10866== Conditional jump or move depends on
uninitialised value(s)
==00:00:00:14.596 10866==at 0x579F8C4: strtod_l_internal (in
/usr/lib64/libc-2.17.so)
==00:00:00:14.596 10866==by 0x771561: float8in_internal_opt_error
(float.c:394)
==00:00:00:14.596 10866==by 0x7718B9: float8in_internal (float.c:515)
==00:00:00:14.596 10866==by 0x7718B9: float8in (float.c:336)
==00:00:00:14.596 10866==by 0x842D43: DirectFunctionCall1Coll (fmgr.c:803)
==00:00:00:14.596 10866==by 0x7C9649: numeric_float8 (numeric.c:3417)
==00:00:00:14.596 10866==by 0x842D43: DirectFunctionCall1Coll (fmgr.c:803)
==00:00:00:14.596 10866==by 0x7A1D8D: jsonb_float8 (jsonb.c:2058)
==00:00:00:14.596 10866==by 0x5F8F54: ExecInterpExpr (execExprInterp.c:649)
==00:00:00:14.596 10866==by 0x6A2E19: ExecEvalExprSwitchContext
(executor.h:307)
==00:00:00:14.596 10866==by 0x6A2E19: evaluate_expr (clauses.c:4827)
==00:00:00:14.596 10866==by 0x6A45FF: evaluate_function (clauses.c:4369)
==00:00:00:14.596 10866==by 0x6A45FF: simplify_function (clauses.c:3999)
==00:00:00:14.596 10866==by 0x6A31C1:
eval_const_expressions_mutator (clauses.c:2474)
==00:00:00:14.596 10866==by 0x644466: expression_tree_mutator
(nodeFuncs.c:3072)
==00:00:00:14.596 10866==
{
  
  Memcheck:Cond
  fun:strtod_l_internal
  fun:float8in_internal_opt_error
  fun:float8in_internal
  fun:float8in
  fun:DirectFunctionCall1Coll
  fun:numeric_float8
  fun:DirectFunctionCall1Coll
  fun:jsonb_float8
  fun:ExecInterpExpr
  fun:ExecEvalExprSwitchContext
  fun:evaluate_expr
  fun:evaluate_function
  fun:simplify_function
  fun:eval_const_expressions_mutator
  fun:expression_tree_mutator
}

Not sure whether it's related to 16d489b0fe.  Will investigate more.



This might be another case of false positive due to SSE (which I think is
used by strtod in some cases). But I'd expect strncasecmp in the stack in
that case, so maybe that's not it.


cheers

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: jsonpath

2019-03-27 Thread Andrew Dunstan

On 3/27/19 9:48 AM, Tom Lane wrote:
> Alexander Korotkov  writes:
>> Still no reproduction.
> Annoying, but it's probably not worth expending more effort on
> right now.  I wonder whether that buildfarm animal can be upgraded
> to capture core dump stack traces --- if so, then if it happens
> again we'd have more info.
>
>   



I was able to get this stack trace.


cheers


andrew



-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

[Switching to Thread 3960.0x34b4]
0x7ffb9ce6a458 in ntdll!RtlRaiseStatus ()
   from C:\WINDOWS\SYSTEM32\ntdll.dll
(gdb) bt
#0  0x7ffb9ce6a458 in ntdll!RtlRaiseStatus ()
   from C:\WINDOWS\SYSTEM32\ntdll.dll
#1  0x7ffb9ce7760e in ntdll!memset () from C:\WINDOWS\SYSTEM32\ntdll.dll
#2  0x7ffb9ac52e1a in msvcrt!_setjmpex ()
   from C:\WINDOWS\System32\msvcrt.dll
#3  0x0087431a in pg_re_throw ()
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/error/elog.c:1720
#4  0x00874106 in errfinish (dummy=)
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/error/elog.c:464
#5  0x007cc938 in jsonpath_yyerror (result=result@entry=0x0, 
message=message@entry=0xab0868 <__func__.110231+1592> "unrecognized flag of 
LIKE_REGEX predicate")
at 
/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath_scan.l:305
#6  0x007cec9d in makeItemLikeRegex (pattern=, 
pattern=, flags=, expr=0x73c7a80)
at 
/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath_gram.y:512
#7  jsonpath_yyparse (result=, result@entry=0x4b0e818)
at 
/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath_gram.y:183
#8  0x007ced61 in parsejsonpath (
str=str@entry=0x73c71e0 "$ ? (@ like_regex \"pattern\" flag \"a\")", 
len=len@entry=37)
at 
/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath_scan.l:481
#9  0x007cf347 in jsonPathFromCstring (
in=0x73c71e0 "$ ? (@ like_regex \"pattern\" flag \"a\")", len=37)
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath.c:170
#10 0x008795ea in InputFunctionCall (flinfo=flinfo@entry=0x4b0e970, 
str=str@entry=0x73c71e0 "$ ? (@ like_regex \"pattern\" flag \"a\")", 
typioparam=typioparam@entry=4072, typmod=typmod@entry=-1)
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/fmgr/fmgr.c:1548
#11 0x008797fc in OidInputFunctionCall (functionId=, 
str=0x73c71e0 "$ ? (@ like_regex \"pattern\" flag \"a\")", 
typioparam=4072, typmod=typmod@entry=-1)
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/fmgr/fmgr.c:1651
#12 0x0053e395 in stringTypeDatum (tp=tp@entry=0x7475b50, 
string=, atttypmod=atttypmod@entry=-1)
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/parser/parse_type.c:641
#13 0x0052729d in coerce_type (pstate=0x73c7618, node=0x73c7848, 
inputTypeId=, targetTypeId=4072, targetTypeMod=-1, 
ccontext=COERCION_EXPLICIT, cformat=COERCE_EXPLICIT_CAST, location=46)
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/parser/parse_coerce.c:304
#14 0x0052772b in coerce_to_target_type (
pstate=pstate@entry=0x73c7618, expr=, 
expr@entry=0x73c7848, exprtype=, exprtype@entry=705, 
targettype=, targettypmod=-1, 
ccontext=ccontext@entry=COERCION_EXPLICIT, 
cformat=cformat@entry=COERCE_EXPLICIT_CAST, location=location@entry=46)
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/parser/parse_coerce.c:103
#15 0x0052c489 in transformTypeCast (tc=0x73c7378, pstate=0x73c7618)
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/parser/parse_expr.c:2811
#16 transformExprRecurse (pstate=pstate@entry=0x73c7618, 
expr=expr@entry=0x73c7378)
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/parser/parse_expr.c:202
#17 0x0052f8aa in transformExprRecurse (expr=0x73c7378, 
pstate=0x73c7618)
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/parser/parse_expr.c:163
#18 transformExpr (pstate=pstate@entry=0x73c7618, expr=expr@entry=0x73c7378, 
exprKind=exprKind@entry=EXPR_KIND_SELECT_TARGET)
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/parser/parse_expr.c:155
#19 0x0053be7f in transformTargetEntry (
pstate=pstate@entry=0x73c7618, node=0x73c7378, expr=expr@entry=0x0, 
exprKind=exprKind@entry=EXPR_KIND_SELECT_TARGET, colname=0x0, 
resjunk=resjunk@entry=false)
at 
c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/bac

Re: jsonpath

2019-03-27 Thread Tom Lane
Andrew Dunstan  writes:
> I was able to get this stack trace.
> 
> (gdb) bt
> #0  0x7ffb9ce6a458 in ntdll!RtlRaiseStatus ()
>from C:\WINDOWS\SYSTEM32\ntdll.dll
> #1  0x7ffb9ce7760e in ntdll!memset () from C:\WINDOWS\SYSTEM32\ntdll.dll
> #2  0x7ffb9ac52e1a in msvcrt!_setjmpex ()
>from C:\WINDOWS\System32\msvcrt.dll
> #3  0x0087431a in pg_re_throw ()
> at 
> c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/error/elog.c:1720
> #4  0x00874106 in errfinish (dummy=)
> at 
> c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/error/elog.c:464
> #5  0x007cc938 in jsonpath_yyerror (result=result@entry=0x0, 
> message=message@entry=0xab0868 <__func__.110231+1592> "unrecognized flag 
> of LIKE_REGEX predicate")
> at 
> /home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath_scan.l:305
> #6  0x007cec9d in makeItemLikeRegex (pattern=, 
> pattern=, flags=, expr=0x73c7a80)
> at 
> /home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath_gram.y:512

Hmm.  Reaching the yyerror call is expected given this input, but
seemingly the siglongjmp stack has been trashed?  The best I can
think of is a wild store that either occurs only on this platform
or happens to be harmless elsewhere ... but neither idea is terribly
convincing.

BTW, the expected behavior according to the regression test is

regression=# select '$ ? (@ like_regex "pattern" flag "a"'::jsonpath;
ERROR:  bad jsonpath representation
LINE 1: select '$ ? (@ like_regex "pattern" flag "a"'::jsonpath;
   ^
DETAIL:  unrecognized flag of LIKE_REGEX predicate at or near """

which leaves quite a lot to be desired already.  The "bad whatever"
error wording is a flat out violation of our message style guide,
while the "at or near  bit is pretty darn unhelpful.

The latter problem occurs because the last flex production was

\"  {
yylval->str = scanstring;
BEGIN INITIAL;
return STRING_P;
}

so that flex thinks the last token was just the quote mark ending the
string.  This could be improved on by adopting something similar to the
SET_YYLLOC() convention used in scan.l to remember the start of what the
user would think the token is.  Probably it's not worth the work right
now, but details like this are important from a fit-and-finish
perspective, so I'd like to see it improved sometime.

regards, tom lane




Re: jsonpath

2019-03-28 Thread Alexander Korotkov
On Thu, Mar 28, 2019 at 5:55 AM Andrew Dunstan
 wrote:
> On 3/27/19 9:48 AM, Tom Lane wrote:
> > Alexander Korotkov  writes:
> >> Still no reproduction.
> > Annoying, but it's probably not worth expending more effort on
> > right now.  I wonder whether that buildfarm animal can be upgraded
> > to capture core dump stack traces --- if so, then if it happens
> > again we'd have more info.
>
> I was able to get this stack trace.

Thank you very much!  It appears to be hard to investigate this even
with backtrace.  You told you can almost reliably reproduce this.
Could you please, find exact commit caused this error using "git
bisect"?  I would very appreciate this.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: jsonpath

2019-03-28 Thread Andrew Dunstan


On 3/28/19 1:01 AM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> I was able to get this stack trace.
>>
>> (gdb) bt
>> #0  0x7ffb9ce6a458 in ntdll!RtlRaiseStatus ()
>>from C:\WINDOWS\SYSTEM32\ntdll.dll
>> #1  0x7ffb9ce7760e in ntdll!memset () from C:\WINDOWS\SYSTEM32\ntdll.dll
>> #2  0x7ffb9ac52e1a in msvcrt!_setjmpex ()
>>from C:\WINDOWS\System32\msvcrt.dll
>> #3  0x0087431a in pg_re_throw ()
>> at 
>> c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/error/elog.c:1720
>> #4  0x00874106 in errfinish (dummy=)
>> at 
>> c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/error/elog.c:464
>> #5  0x007cc938 in jsonpath_yyerror (result=result@entry=0x0, 
>> message=message@entry=0xab0868 <__func__.110231+1592> "unrecognized flag 
>> of LIKE_REGEX predicate")
>> at 
>> /home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath_scan.l:305
>> #6  0x007cec9d in makeItemLikeRegex (pattern=, 
>> pattern=, flags=, expr=0x73c7a80)
>> at 
>> /home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath_gram.y:512
> Hmm.  Reaching the yyerror call is expected given this input, but
> seemingly the siglongjmp stack has been trashed?  The best I can
> think of is a wild store that either occurs only on this platform
> or happens to be harmless elsewhere ... but neither idea is terribly
> convincing.



Further data point: if I just call the offending statement alone,
there's no crash. The crash only occurs when I call the whole script.
I'll see if I can triangulate a bit to get a minimal test case.


cheers


andrew



-- 

Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: jsonpath

2019-03-28 Thread Andrew Dunstan


On 3/28/19 5:38 AM, Alexander Korotkov wrote:
> On Thu, Mar 28, 2019 at 5:55 AM Andrew Dunstan
>  wrote:
>> On 3/27/19 9:48 AM, Tom Lane wrote:
>>> Alexander Korotkov  writes:
 Still no reproduction.
>>> Annoying, but it's probably not worth expending more effort on
>>> right now.  I wonder whether that buildfarm animal can be upgraded
>>> to capture core dump stack traces --- if so, then if it happens
>>> again we'd have more info.
>> I was able to get this stack trace.
> Thank you very much!  It appears to be hard to investigate this even
> with backtrace.  You told you can almost reliably reproduce this.
> Could you please, find exact commit caused this error using "git
> bisect"?  I would very appreciate this.
>


I'll try. It's time consuming given how long builds take.


Here's an interesting data point. If I run the whole jsonpath.sql script
it crashes every time. If I run just the offending statement it crashes
exactly every other time. It looks like in that case something gets
clobbered and then cleared.


cheers


andrew


-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: jsonpath

2019-03-28 Thread Alexander Korotkov
On Thu, Mar 28, 2019 at 3:25 PM Andrew Dunstan
 wrote:
> On 3/28/19 5:38 AM, Alexander Korotkov wrote:
> > On Thu, Mar 28, 2019 at 5:55 AM Andrew Dunstan
> >  wrote:
> >> On 3/27/19 9:48 AM, Tom Lane wrote:
> >>> Alexander Korotkov  writes:
>  Still no reproduction.
> >>> Annoying, but it's probably not worth expending more effort on
> >>> right now.  I wonder whether that buildfarm animal can be upgraded
> >>> to capture core dump stack traces --- if so, then if it happens
> >>> again we'd have more info.
> >> I was able to get this stack trace.
> > Thank you very much!  It appears to be hard to investigate this even
> > with backtrace.  You told you can almost reliably reproduce this.
> > Could you please, find exact commit caused this error using "git
> > bisect"?  I would very appreciate this.
>
> I'll try. It's time consuming given how long builds take.

Thank you very much for your efforts!

> Here's an interesting data point. If I run the whole jsonpath.sql script
> it crashes every time. If I run just the offending statement it crashes
> exactly every other time. It looks like in that case something gets
> clobbered and then cleared.

Could you clarify this a bit?  What is exactly every other time?  Do
you mean it doesn't crash first time, but crashes second time?  Do you
run each offending statement in the separate session?  Or do you run
multiple offending statements in the same session?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: jsonpath

2019-03-28 Thread Andrew Dunstan


On 3/28/19 8:49 AM, Alexander Korotkov wrote:
>
>> Here's an interesting data point. If I run the whole jsonpath.sql script
>> it crashes every time. If I run just the offending statement it crashes
>> exactly every other time. It looks like in that case something gets
>> clobbered and then cleared.
> Could you clarify this a bit?  What is exactly every other time?  Do
> you mean it doesn't crash first time, but crashes second time?  Do you
> run each offending statement in the separate session?  Or do you run
> multiple offending statements in the same session?
>

I mean repeated invocations of


    psql -c "select '$ ? (@ like_regex \"pattern\" flag \"a\")'::jsonpath"


These get crash, no crash, crash, no crash ...



cheers


andrew


-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: jsonpath

2019-03-28 Thread Tom Lane
Andrew Dunstan  writes:
> I mean repeated invocations of
>     psql -c "select '$ ? (@ like_regex \"pattern\" flag \"a\")'::jsonpath"
> These get crash, no crash, crash, no crash ...

That is just wacko ... but it does seem to support my thought of
a wild store somewhere.  The mechanism for this case might be
that memory layout is different depending on whether we had to
rebuild the relcache init file at session start or not.  Similarly,
the fact that the full test script reliably crashes might be
dependent on previous commands having left things in a certain
state.  Unfortunately that gets us little closer to a fix.

Has anybody gotten through a valgrind run on this code yet?

regards, tom lane




Re: jsonpath

2019-03-28 Thread Alexander Korotkov
On Thu, Mar 28, 2019 at 4:31 PM Tom Lane  wrote:
> Andrew Dunstan  writes:
> > I mean repeated invocations of
> > psql -c "select '$ ? (@ like_regex \"pattern\" flag \"a\")'::jsonpath"
> > These get crash, no crash, crash, no crash ...
>
> That is just wacko ... but it does seem to support my thought of
> a wild store somewhere.  The mechanism for this case might be
> that memory layout is different depending on whether we had to
> rebuild the relcache init file at session start or not.  Similarly,
> the fact that the full test script reliably crashes might be
> dependent on previous commands having left things in a certain
> state.  Unfortunately that gets us little closer to a fix.
>
> Has anybody gotten through a valgrind run on this code yet?

I've [1].  Find single backtrace, but it doesn't seem to be related to
our issue.

1. 
https://www.postgresql.org/message-id/CAPpHfdsgyPKbaqOsJ4tyC97Ybpm69eGLHzBGLKYXsfJi%3Dc-fjA%40mail.gmail.com

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: jsonpath

2019-03-28 Thread Andres Freund



On March 28, 2019 9:31:14 AM EDT, Tom Lane  wrote:
>Andrew Dunstan  writes:
>> I mean repeated invocations of
>>     psql -c "select '$ ? (@ like_regex \"pattern\" flag
>\"a\")'::jsonpath"
>> These get crash, no crash, crash, no crash ...
>
>That is just wacko ... but it does seem to support my thought of
>a wild store somewhere.  The mechanism for this case might be
>that memory layout is different depending on whether we had to
>rebuild the relcache init file at session start or not.  Similarly,
>the fact that the full test script reliably crashes might be
>dependent on previous commands having left things in a certain
>state.  Unfortunately that gets us little closer to a fix.
>
>Has anybody gotten through a valgrind run on this code yet?

Skink has successfully passed since - but that's x86...
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: jsonpath

2019-03-28 Thread Tom Lane
Andres Freund  writes:
> On March 28, 2019 9:31:14 AM EDT, Tom Lane  wrote:
>> Has anybody gotten through a valgrind run on this code yet?

> Skink has successfully passed since - but that's x86...

Yeah, there is a depressingly high chance that this is somehow specific
to the bison version, flex version, and/or compiler in use on jacana.

regards, tom lane




Re: jsonpath

2019-03-28 Thread Andrew Dunstan


On 3/28/19 9:50 AM, Tom Lane wrote:
> Andres Freund  writes:
>> On March 28, 2019 9:31:14 AM EDT, Tom Lane  wrote:
>>> Has anybody gotten through a valgrind run on this code yet?
>> Skink has successfully passed since - but that's x86...
> Yeah, there is a depressingly high chance that this is somehow specific
> to the bison version, flex version, and/or compiler in use on jacana.
>
>   



lousyjack has also passed it (x64).


git bisect on jacana blames commit 550b9d26f.


cheers


andrew


-- 

Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: jsonpath

2019-03-29 Thread Alexander Korotkov
On Thu, Mar 28, 2019 at 7:43 PM Andrew Dunstan
 wrote:
> On 3/28/19 9:50 AM, Tom Lane wrote:
> > Andres Freund  writes:
> >> On March 28, 2019 9:31:14 AM EDT, Tom Lane  wrote:
> >>> Has anybody gotten through a valgrind run on this code yet?
> >> Skink has successfully passed since - but that's x86...
> > Yeah, there is a depressingly high chance that this is somehow specific
> > to the bison version, flex version, and/or compiler in use on jacana.
>
> lousyjack has also passed it (x64).
>
> git bisect on jacana blames commit 550b9d26f.

Hmm... 550b9d26f just makes jsonpath_gram.y and jsonpath_scan.l
compile at once.  I've re-read this commit and didn't find anything
suspicious.
I've asked Andrew for access to jacana in order to investigate this myself.



--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: jsonpath

2019-03-30 Thread Alexander Korotkov
On Fri, Mar 29, 2019 at 4:15 PM Alexander Korotkov
 wrote:
> On Thu, Mar 28, 2019 at 7:43 PM Andrew Dunstan
>  wrote:
> > On 3/28/19 9:50 AM, Tom Lane wrote:
> > > Andres Freund  writes:
> > >> On March 28, 2019 9:31:14 AM EDT, Tom Lane  wrote:
> > >>> Has anybody gotten through a valgrind run on this code yet?
> > >> Skink has successfully passed since - but that's x86...
> > > Yeah, there is a depressingly high chance that this is somehow specific
> > > to the bison version, flex version, and/or compiler in use on jacana.
> >
> > lousyjack has also passed it (x64).
> >
> > git bisect on jacana blames commit 550b9d26f.
>
> Hmm... 550b9d26f just makes jsonpath_gram.y and jsonpath_scan.l
> compile at once.  I've re-read this commit and didn't find anything
> suspicious.
> I've asked Andrew for access to jacana in order to investigate this myself.

I'm going to push there 3 attached patches for jsonpath.

1st one is revised patch implementing GIN index support for jsonpath.
Based on feedback from Jonathan Katz, I decided to restrict jsonb_ops
from querying only case.  Now, jsonb_ops also works on if
"accessors_chain = const" statement is found.  Keys are frequently not
selective.  Given we have no statistics of them, purely key GIN
queries are likely confuse optimizer making it select inefficient
plan.  Actually, jsonb_ops may be used for pure key query when ?
operator is used.  But in this case, user explicitly searches for key.
With jsonpath, purely key GIN searches can easily happen unintended.
So, restrict that.

2nd and 3rd patches are from Nikita Glukhov upthread.  2nd restrict
some cases in parsing numerics.  3rd make jsonb_path_match() function
throw error when result is not single boolean and silent mode is off.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0001-jsonpath-gin-support.patch
Description: Binary data


0002-restrict-some-cases-in-jsonpath-numerics-parsing.patch
Description: Binary data


0003-make-jsonb_path_match-throw-error.patch
Description: Binary data


Re: jsonpath

2019-03-31 Thread Tom Lane
"Tom Turelinckx"  writes:
> Stack trace from skate:

Huh ... so that's nowhere near the jsonpath-syntax-error crash that
we saw before.

I assume this trace is from this run?

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skate&dt=2019-03-31%2006%3A24%3A35

That looks a whole lot like the previous failure on snapper:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2019-03-23%2013%3A01%3A28

right down to it having gotten through "make check" only to fail when the
same regression tests are run again during the pg_upgrade test.  I wonder
if there's any real significance to that, or if it's just that the failure
is not very repeatable.

BTW, what is the difference between skate and snapper?  They look to
be running on the same machine.

regards, tom lane




Re: jsonpath

2019-03-31 Thread Andrew Dunstan


On 3/31/19 12:21 PM, Tom Turelinckx wrote:
> Tom Lane wrote:
>
>> I assume this trace is from this run?
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skate&dt=2019-03-31%2006%3A24%3A35
> Yes. I did get a core file now, but it wasn't picked up by the buildfarm
> script, so I extracted the backtrace manually.
>
>

I have just committed a patch that should result in stack traces being
picked up in pg_upgrade testing. See



You should be able to drop the updated file in place, get it from



There will be a buildfarm release out very soon that includes this.


cheers


andrew


-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





RE: jsonpath

2019-03-31 Thread Tom Turelinckx
Alexander Korotkov wrote:

> Hmm... 550b9d26f just makes jsonpath_gram.y and jsonpath_scan.l
> compile at once.  I've re-read this commit and didn't find anything
> suspicious.
> I've asked Andrew for access to jacana in order to investigate this myself.

Stack trace from skate:

[New LWP 6614]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: pgbf regression [local] SELECT 
 '.
Program terminated with signal 11, Segmentation fault.
#0  strlen () at ../sysdeps/sparc/sparc64/strlen.S:34
34  ldx [%o0], %o5
#0  strlen () at ../sysdeps/sparc/sparc64/strlen.S:34
#1  0x0008a3e4 in printtup (slot=0x834888, self=0x864cc0) at printtup.c:435
#2  0x00259b60 in ExecutePlan (execute_once=, dest=0x864cc0, 
direction=, numberTuples=0, 
sendTuples=true, operation=CMD_SELECT, use_parallel_mode=, 
planstate=0x833eb0, estate=0x833d70)
at execMain.c:1686
#3  standard_ExecutorRun (queryDesc=0x7dcdc0, direction=, 
count=0, execute_once=)
at execMain.c:365
#4  0x00259da0 in ExecutorRun (queryDesc=0x808920, queryDesc@entry=0x7dcdc0, 
direction=direction@entry=ForwardScanDirection, count=8801472, 
execute_once=) at execMain.c:309
#5  0x003e6714 in PortalRunSelect (portal=portal@entry=0x808920, 
forward=forward@entry=true, count=0, 
count@entry=2147483647, dest=dest@entry=0x864cc0) at pquery.c:929
#6  0x003e7d3c in PortalRun (portal=portal@entry=0x808920, 
count=count@entry=2147483647, 
isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, 
dest=dest@entry=0x864cc0, 
altdest=altdest@entry=0x864cc0, 
completionTag=completionTag@entry=0xff86e830 "") at pquery.c:770
#7  0x003e32d0 in exec_simple_query (
query_string=0x7ba400 "select '$.g ? (@.a == 1 || @.a == 4 && @.b == 
7)'::jsonpath;") at postgres.c:1215
#8  0x003e4854 in PostgresMain (argc=, argv=argv@entry=0x7e2370, 
dbname=0x7e2148 "regression", 
username=) at postgres.c:4247
#9  0x0007ae6c in BackendRun (port=0x7ddd40) at postmaster.c:4399
#10 BackendStartup (port=0x7ddd40) at postmaster.c:4090
#11 ServerLoop () at postmaster.c:1703
#12 0x00353a68 in PostmasterMain (argc=argc@entry=6, argv=argv@entry=0x7b49a8) 
at postmaster.c:1376
#13 0x0007cc60 in main (argc=6, argv=0x7b49a8) at main.c:228

Does this help?

Best regards,
Tom Turelinckx






RE: jsonpath

2019-03-31 Thread Tom Turelinckx
Tom Lane wrote:

> I assume this trace is from this run?
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skate&dt=2019-03-31%2006%3A24%3A35

Yes. I did get a core file now, but it wasn't picked up by the buildfarm
script, so I extracted the backtrace manually.

> That looks a whole lot like the previous failure on snapper:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2019-03-23%2013%3A01%3A28

That's what I meant.

> Huh ... so that's nowhere near the jsonpath-syntax-error crash that
> we saw before.

Sorry, I wasn't aware there were multiple crashes.

> BTW, what is the difference between skate and snapper?  They look to
> be running on the same machine.

They are. Skate runs with default buildfarm options. Snapper mimics
the options used by the pgdg debian source packages. Both build the
same source (first skate then snapper). This to avoid a repeat of [1].
Snapper also runs more tests (UpgradeXversion, CollateLinuxUTF8).

Best regards,
Tom Turelinckx

1. 
https://www.postgresql.org/message-id/20160413175827.dmlbtdf7c3mgmnex%40alap3.anarazel.de






Re: jsonpath

2019-04-03 Thread Andres Freund
On 2019-03-30 18:25:12 +0300, Alexander Korotkov wrote:
> I'm going to push there 3 attached patches for jsonpath.

I noticed that
https://commitfest.postgresql.org/22/1472/
https://commitfest.postgresql.org/22/1473/
are still open and marked as needs-review. Is there a reason for that?

- Andres




Re: jsonpath

2019-04-03 Thread Alexander Korotkov
On Wed, Apr 3, 2019 at 11:03 PM Andres Freund  wrote:
>
> On 2019-03-30 18:25:12 +0300, Alexander Korotkov wrote:
> > I'm going to push there 3 attached patches for jsonpath.
>
> I noticed that
> https://commitfest.postgresql.org/22/1472/
> https://commitfest.postgresql.org/22/1473/
> are still open and marked as needs-review. Is there a reason for that?

Nope.  I've moved both to the next CF.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: jsonpath

2019-01-18 Thread Alexander Korotkov
Nikita, thank you!

I noticed another thing.  3-arguments version of functions
jsonpath_exists(), jsonpath_predicate(), jsonpath_query(),
jsonpath_query_wrapped() are:

1) Not documented
2) Not used in operator definitions
3) Not used in subsequent patches

So, it looks like we can just remove then.  But I think we're very
likely can commit jsonpath patch to PostgreSQL 12, but I'm not sure
about other patches.  So, having those functions exposed to user can
be extremely useful until we have separate nodes for SQL/JSON.  So, I
vote to document these functions.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-18 Thread Tomas Vondra
On 1/18/19 6:58 PM, Alexander Korotkov wrote:
> Nikita, thank you!
> 
> I noticed another thing.  3-arguments version of functions
> jsonpath_exists(), jsonpath_predicate(), jsonpath_query(),
> jsonpath_query_wrapped() are:
> 
> 1) Not documented
> 2) Not used in operator definitions
> 3) Not used in subsequent patches
> 
> So, it looks like we can just remove then.  But I think we're very
> likely can commit jsonpath patch to PostgreSQL 12, but I'm not sure
> about other patches.  So, having those functions exposed to user can
> be extremely useful until we have separate nodes for SQL/JSON.  So, I
> vote to document these functions.
> 

That seems a bit strange. If those functions are meant to be used by
other patches (which ones?) then why should not we make them part of
those future patches?

But it seems more like those functions are actually meant to be used by
users in the first place, in cases when we need to provide a third
parameter (which operators can't do). In that case yes - we should have
them documented properly, but also tested. Which is not the case
currently, because the regression tests only use the operators.

Two more comments:

1) I'm wondering why we even need separate functions for the different
numbers of arguments at the C level, as both simply call to the same
function anyway with a PG_NARGS() condition in it. Can't we ditch those
wrappers and reference that target function directly?

2) I once again ran into the jsonb->json business, which essentially
means that when you do this:

select json '{"a": { "b" : 10}}' @? '$.a.b';

it ends up calling jsonb_jsonpath_exists(), which then does this:

Jsonb *jb = PG_GETARG_JSONB_P(0);

I and am not sure why/how it seems to work, but I find it confusing
because the stack still looks like this:

#0  jsonb_jsonpath_exists (fcinfo=0x162f800) at jsonpath_exec.c:2857
#1  0x0096d721 in json_jsonpath_exists2 (fcinfo=0x162f800) at
jsonpath_exec.c:2882
#2  0x006c790a in ExecInterpExpr (state=0x162f300,
econtext=0x162ee18, isnull=0x7ffcea4c3857) at execExprInterp.c:648
...

and of course, the fcinfo->flinfo still points to the json-specific
function, which say the first parameter type is 'json'.

(gdb) print *fcinfo->flinfo
$23 = {fn_addr = 0x96d709 , fn_oid = 6043,
fn_nargs = 2, fn_strict = true, fn_retset = false, fn_stats = 2 '\002',
fn_extra = 0x0, fn_mcxt = 0x162e990, fn_expr = 0x15db378}


test=# select proname, prosrc, proargtypes from pg_proc where oid = 6043;
 proname |prosrc | proargtypes
-+---+-
 jsonpath_exists | json_jsonpath_exists2 | 114 6050
(1 row)

test=# select typname from pg_type where oid = 114;
 typname
-
 json
(1 row)


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

2019-01-18 Thread Alexander Korotkov
On Sat, Jan 19, 2019 at 1:32 AM Tomas Vondra
 wrote:
> On 1/18/19 6:58 PM, Alexander Korotkov wrote:
> > So, it looks like we can just remove then.  But I think we're very
> > likely can commit jsonpath patch to PostgreSQL 12, but I'm not sure
> > about other patches.  So, having those functions exposed to user can
> > be extremely useful until we have separate nodes for SQL/JSON.  So, I
> > vote to document these functions.
>
> That seems a bit strange. If those functions are meant to be used by
> other patches (which ones?) then why should not we make them part of
> those future patches?

No, these functions aren't used by other patches (it was my original
wrong idea).  Other patches provides SQL expressions making these
functions not that necessary.  But I think we should keep these
functions in jsonpath patch.

> But it seems more like those functions are actually meant to be used by
> users in the first place, in cases when we need to provide a third
> parameter (which operators can't do). In that case yes - we should have
> them documented properly, but also tested. Which is not the case
> currently, because the regression tests only use the operators.

+1
Thank you for noticing.  Tests should be provided as well.

> Two more comments:
>
> 1) I'm wondering why we even need separate functions for the different
> numbers of arguments at the C level, as both simply call to the same
> function anyway with a PG_NARGS() condition in it. Can't we ditch those
> wrappers and reference that target function directly?

That was surprising for me too.  Technically, it's OK to do this.  And
we do this for extensions.  But for in-core functions we have
following sanity check.

-- Considering only built-in procs (prolang = 12), look for multiple uses
-- of the same internal function (ie, matching prosrc fields).  It's OK to
-- have several entries with different pronames for the same internal function,
-- but conflicts in the number of arguments and other critical items should
-- be complained of.  (We don't check data types here; see next query.)
-- Note: ignore aggregate functions here, since they all point to the same
-- dummy built-in function.

SELECT p1.oid, p1.proname, p2.oid, p2.proname
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid < p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
(p1.prokind != 'a' OR p2.prokind != 'a') AND
(p1.prolang != p2.prolang OR
 p1.prokind != p2.prokind OR
 p1.prosecdef != p2.prosecdef OR
 p1.proleakproof != p2.proleakproof OR
 p1.proisstrict != p2.proisstrict OR
 p1.proretset != p2.proretset OR
 p1.provolatile != p2.provolatile OR
 p1.pronargs != p2.pronargs);

And we already have some code written especially to make this check happy.

/* This is separate to keep the opr_sanity regression test from complaining */
Datum
regexp_split_to_table_no_flags(PG_FUNCTION_ARGS)
{
return regexp_split_to_table(fcinfo);
}

Personally I'm not fan of this approach, and I would rather relax this
sanity check.  But that doesn't seem to be a subject of this patch.
For jsonpath, it's OK to just keep this tradition.

> 2) I once again ran into the jsonb->json business, which essentially
> means that when you do this:
>
> select json '{"a": { "b" : 10}}' @? '$.a.b';
>
> it ends up calling jsonb_jsonpath_exists(), which then does this:
>
> Jsonb *jb = PG_GETARG_JSONB_P(0);
>
> I and am not sure why/how it seems to work, but I find it confusing
> because the stack still looks like this:
>
> #0  jsonb_jsonpath_exists (fcinfo=0x162f800) at jsonpath_exec.c:2857
> #1  0x0096d721 in json_jsonpath_exists2 (fcinfo=0x162f800) at
> jsonpath_exec.c:2882
> #2  0x006c790a in ExecInterpExpr (state=0x162f300,
> econtext=0x162ee18, isnull=0x7ffcea4c3857) at execExprInterp.c:648
> ...
>
> and of course, the fcinfo->flinfo still points to the json-specific
> function, which say the first parameter type is 'json'.
>
> (gdb) print *fcinfo->flinfo
> $23 = {fn_addr = 0x96d709 , fn_oid = 6043,
> fn_nargs = 2, fn_strict = true, fn_retset = false, fn_stats = 2 '\002',
> fn_extra = 0x0, fn_mcxt = 0x162e990, fn_expr = 0x15db378}
>
>
> test=# select proname, prosrc, proargtypes from pg_proc where oid = 6043;
>  proname |prosrc | proargtypes
> -+---+-
>  jsonpath_exists | json_jsonpath_exists2 | 114 6050
> (1 row)
>
> test=# select typname from pg_type where oid = 114;
>  typname
> -
>  json
> (1 row)

It's tricky.  There is jsonpath_json.c, which includes jsonpath_exec.c
with set of macro definitions making that code work with json type.
I'm going to refactor that.  But general approach to include same
functions more than once seems OK for me.

Some more notes:

1) It seems that @* and @# are not going to be supported by any
indexes.  I think we should remove these operators and let users use
functions instead.
2) I propose to rename @~ operator to @@. 

Re: jsonpath

2019-01-19 Thread Alexander Korotkov
On Sat, Jan 19, 2019 at 2:54 AM Alexander Korotkov
 wrote:
> 1) It seems that @* and @# are not going to be supported by any
> indexes.  I think we should remove these operators and let users use
> functions instead.
> 2) I propose to rename @~ operator to @@.  We already use @@ as
> "satisfies" in multiple places, and I thinks this case fits too.

3) How do we calculate the "id" property returned by keyvalue()
function?  It's not documented.  Even presence of "id" columns isn't
documented.  Standard stands that it's implementation-depended
indetifier of object holding key-value pair.  The way of its
calculation is also not clear from the code.  Why do we need constant
of 100?

id = jb->type != jbvBinary ? 0 :
(int64)((char *) jb->val.binary.data -
(char *) cxt->baseObject.jbc);
id += (int64) cxt->baseObject.id * INT64CONST(100);

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-20 Thread Alexander Korotkov
On Sun, Jan 20, 2019 at 6:30 AM Alexander Korotkov
 wrote:
> I'll continue revising this patchset.  Nikita, could you please write
> tests for 3-argument versions of functions?  Also, please answer the
> question regarding "id" property.

I've some more notes regarding function set provided in jsonpath patch:
1) Function names don't follow the convention we have.  All our
functions dealing with jsonb have "jsonb_" prefix.  Exclusions have
"jsonb" in other part of name, for example, "to_jsonb(anyelement)".
We could name functions at SQL level in the same way they are named in
C.  So, they would be jsonb_jsonpath_exists() etc.  But it's probably
too long.  What about jsonb_path_exists() etc?
2) jsonpath_query_wrapped() name seems counter intuitive for me.  What
about jsonpath_query_array()?
3) jsonpath_query_wrapped() behavior looks tricky to me.  It returns
NULL for no results.  When result item is one, it is returned "as is".
When there are two or more result items, they are wrapped into array.
This representation of result seems extremely inconvenient for me.  It
becomes hard to solve even trivial tasks with that: for instance,
iterate all items found.  Without extra assumptions on what is going
to be returned it's also impossible for caller to distinguish single
array item found from multiple items found wrapped into array.  And
that seems very bad.  I know this behavior is inspired by SQL/JSON
standard.  But since these functions are anyway our extension, we can
implement them as we like.  So, I propose to make this function always
return array of items regardless on count of those items (even for 1
or 0 items).
4) If we change behavior of jsonpath_query_wrapped() as above, we can
also implement jsonpath_query_one() function, which would return first
result item or NULL for no items.
Any thoughts?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-21 Thread Nikita Glukhov

On 20.01.2019 2:45, Alexander Korotkov wrote:


3) How do we calculate the "id" property returned by keyvalue()
function?  It's not documented.  Even presence of "id" columns isn't
documented.  Standard stands that it's implementation-depended
indetifier of object holding key-value pair.  The way of its
calculation is also not clear from the code.  Why do we need constant
of 100?

 id = jb->type != jbvBinary ? 0 :
 (int64)((char *) jb->val.binary.data -
 (char *) cxt->baseObject.jbc);
 id += (int64) cxt->baseObject.id * INT64CONST(100);


I decided to construct object id from the two parts: base object id and its
binary offset in its base object's jsonb:

object_id = 100 * base_object_id + object_offset_in_base_object


100 (10^10) -- is a first round decimal number greater than 2^32
(maximal offset in jsonb).  Decimal multiplier is used here to improve the
readability of identifiers.


Base object is usually a root object of the path: context item '$' or path
variable '$var', literals can't produce objects for now.  But if the path
contains generated objects (.keyvalue() itself, for example), then they become
base object for the subsequent .keyvalue().  See example:

'$.a.b.keyvalue().value.keyvalue()' :
 - base for the first .keyvalue() is '$'
 - base for the second .keyvalue() is '$.a.b.keyvalue()'


Id of '$' is 0.
Id of '$var' is its ordinal (positive) number in the list of variables.
Ids for generated objects are assigned using global counter
'JsonPathExecContext.generatedObjectId' starting from 'number_of_vars + 1'.


Corresponding comments will be added in the upcoming version of the patches.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-21 Thread Oleg Bartunov
On Mon, Jan 21, 2019 at 1:40 AM Alexander Korotkov
 wrote:
>
> On Sun, Jan 20, 2019 at 6:30 AM Alexander Korotkov
>  wrote:
> > I'll continue revising this patchset.  Nikita, could you please write
> > tests for 3-argument versions of functions?  Also, please answer the
> > question regarding "id" property.
>
> I've some more notes regarding function set provided in jsonpath patch:
> 1) Function names don't follow the convention we have.  All our
> functions dealing with jsonb have "jsonb_" prefix.  Exclusions have
> "jsonb" in other part of name, for example, "to_jsonb(anyelement)".
> We could name functions at SQL level in the same way they are named in
> C.  So, they would be jsonb_jsonpath_exists() etc.  But it's probably
> too long.  What about jsonb_path_exists() etc?

jsonpath_exists is similar to xpath_exists.
Actually, we could use jsonb_exists(jsonb, jsonpath, json), but then
we should specify the type of the second argument.

> 2) jsonpath_query_wrapped() name seems counter intuitive for me.  What
> about jsonpath_query_array()?

The question is should we try to provide a functional interface for
all options of
JSON_QUERY clause ?  The same is for  other SQL/JSON clauses.
Currently,  patch contains very limited subset of JSON_QUERY
functionality,  mostly for jsonpath testing.

> 3) jsonpath_query_wrapped() behavior looks tricky to me.  It returns
> NULL for no results.  When result item is one, it is returned "as is".
> When there are two or more result items, they are wrapped into array.
> This representation of result seems extremely inconvenient for me.  It
> becomes hard to solve even trivial tasks with that: for instance,
> iterate all items found.  Without extra assumptions on what is going
> to be returned it's also impossible for caller to distinguish single
> array item found from multiple items found wrapped into array.  And
> that seems very bad.  I know this behavior is inspired by SQL/JSON
> standard.  But since these functions are anyway our extension, we can
> implement them as we like.  So, I propose to make this function always
> return array of items regardless on count of those items (even for 1
> or 0 items).

Fair enough, but if we agree, that we provide an exact functionality of
SQL clauses, then better to follow the standard to avoid problems.


> 4) If we change behavior of jsonpath_query_wrapped() as above, we can
> also implement jsonpath_query_one() function, which would return first
> result item or NULL for no items.
> Any thoughts?

I think, that we should postpone this functional interface, which could be
added later.

>
> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company



--
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-21 Thread Alexander Korotkov
On Mon, Jan 21, 2019 at 6:05 PM Oleg Bartunov  wrote:
> On Mon, Jan 21, 2019 at 1:40 AM Alexander Korotkov
>  wrote:
> >
> > On Sun, Jan 20, 2019 at 6:30 AM Alexander Korotkov
> >  wrote:
> > > I'll continue revising this patchset.  Nikita, could you please write
> > > tests for 3-argument versions of functions?  Also, please answer the
> > > question regarding "id" property.
> >
> > I've some more notes regarding function set provided in jsonpath patch:
> > 1) Function names don't follow the convention we have.  All our
> > functions dealing with jsonb have "jsonb_" prefix.  Exclusions have
> > "jsonb" in other part of name, for example, "to_jsonb(anyelement)".
> > We could name functions at SQL level in the same way they are named in
> > C.  So, they would be jsonb_jsonpath_exists() etc.  But it's probably
> > too long.  What about jsonb_path_exists() etc?
>
> jsonpath_exists is similar to xpath_exists.

That's true.  The question is whether it's more important to follow
json[b] naming convention or xml/xpath naming convention?  I guess
json[b] naming convention is more important in our case.

> Actually, we could use jsonb_exists(jsonb, jsonpath, json), but then
> we should specify the type of the second argument.

Yes, but AFAICS the key point of json[b]_ prefixes is to evade
function overloading.  So, I'm -1 for use jsonb_ prefix and have
function overload because of that.

> > 2) jsonpath_query_wrapped() name seems counter intuitive for me.  What
> > about jsonpath_query_array()?
>
> The question is should we try to provide a functional interface for
> all options of
> JSON_QUERY clause ?  The same is for  other SQL/JSON clauses.
> Currently,  patch contains very limited subset of JSON_QUERY
> functionality,  mostly for jsonpath testing.

Actually, my point is following.  We have jsonpath patch close to the
committable shape.  And we have patch for SQL/JSON clauses including
JSON_QUERY, which is huge, complex and didn't receive any serious
review yet.  So, we'll did our best on that patch during this release
cycle, but I can't guarantee it will get in PostgreSQL 12.  Thus, my
idea is to make jsonpath patch self contained by providing brief and
convenient procedural interface.  This procedural interface is anyway
not a part of standard.  It *might* be inspired by standard clauses,
but might be not.  I think we should try to make this procedural
interface as good and convenient by itself.  It's our extension, and
it wouldn't make us more or less standard conforming.

> > 3) jsonpath_query_wrapped() behavior looks tricky to me.  It returns
> > NULL for no results.  When result item is one, it is returned "as is".
> > When there are two or more result items, they are wrapped into array.
> > This representation of result seems extremely inconvenient for me.  It
> > becomes hard to solve even trivial tasks with that: for instance,
> > iterate all items found.  Without extra assumptions on what is going
> > to be returned it's also impossible for caller to distinguish single
> > array item found from multiple items found wrapped into array.  And
> > that seems very bad.  I know this behavior is inspired by SQL/JSON
> > standard.  But since these functions are anyway our extension, we can
> > implement them as we like.  So, I propose to make this function always
> > return array of items regardless on count of those items (even for 1
> > or 0 items).
>
> Fair enough, but if we agree, that we provide an exact functionality of
> SQL clauses, then better to follow the standard to avoid problems.

No, I see this as our extension.  And I don't see problems in being
different from standard clauses, because it's different anyway.  For
me, in this case it's better to evade problems of users.  And current
behavior of this function seems like just single big pain :)

> > 4) If we change behavior of jsonpath_query_wrapped() as above, we can
> > also implement jsonpath_query_one() function, which would return first
> > result item or NULL for no items.
> > Any thoughts?
>
> I think, that we should postpone this functional interface, which could be
> added later.

The reason we typically postpone things is that they are hard to bring
to committable shape.  jsonpath_query_one() doesn't cost us any real
development.  So, I don't see point in postponing that if consider
that as good part of procedural jsonpath interface.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-22 Thread Oleg Bartunov
On Tue, Jan 22, 2019 at 8:21 AM Alexander Korotkov
 wrote:
>
> The next revision is attached.
>
> Nikita made code and documentation improvements, renamed functions
> from "jsonpath_" prefix to "jsonb_path_" prefix.  He also renamed
> jsonpath_predicate() to jsonb_path_match() (that looks better for me
> too).
>
> I've further renamed jsonb_query_wrapped() to jsonb_query_array(), and
> changed that behavior to always wrap result into array.

agree with new names

so it mimic the behaviour of JSON_QUERY with UNCONDITIONAL WRAPPER option

 Also, I've
> introduced new function jsonb_query_first(), which returns first item
> from result.
>
> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company



-- 
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-22 Thread Alexander Korotkov
On Tue, Jan 22, 2019 at 12:13 PM Oleg Bartunov  wrote:
>
> On Tue, Jan 22, 2019 at 8:21 AM Alexander Korotkov
>  wrote:
> >
> > The next revision is attached.
> >
> > Nikita made code and documentation improvements, renamed functions
> > from "jsonpath_" prefix to "jsonb_path_" prefix.  He also renamed
> > jsonpath_predicate() to jsonb_path_match() (that looks better for me
> > too).
> >
> > I've further renamed jsonb_query_wrapped() to jsonb_query_array(), and
> > changed that behavior to always wrap result into array.
>
> agree with new names
>
> so it mimic the behaviour of JSON_QUERY with UNCONDITIONAL WRAPPER option

Good, thank you for feedback!

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-25 Thread Alexander Korotkov
On Wed, Jan 23, 2019 at 8:01 AM Alexander Korotkov
 wrote:
> Finally, I'm going to commit this if no objections.

BTW, I decided to postpone commit for few days.  Nikita and me are
still working on better error messages.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-28 Thread Tomas Vondra



On 1/28/19 1:22 AM, Alexander Korotkov wrote:
> On Sun, Jan 27, 2019 at 1:50 PM Alexander Korotkov
>  wrote:
>> On Sat, Jan 26, 2019 at 4:27 AM Alexander Korotkov
>>  wrote:
>>>
>>> On Wed, Jan 23, 2019 at 8:01 AM Alexander Korotkov
>>>  wrote:
 Finally, I'm going to commit this if no objections.
>>>
>>> BTW, I decided to postpone commit for few days.  Nikita and me are
>>> still working on better error messages.
>>
>> Updated patchset is attached.  This patchset includes:
>>
>> * Improved error handling by Nikita, revised by me,
>> * Code beautification.
>>
>> So, I'm going to commit this again.  This time seriously :)
> 
> I'm really sorry for confusing people, but I've one more revision.
> This is my first time attempting to commit such a large patch.
> 

I'm not sure what you're apologizing for, it simply shows how careful
you are when polishing such a large patch.

> Major changes are following:
>  * We find it ridiculous to save ErrorData for possible latter throw.
> Now, we either throw an error immediately or return jperError.  That
> also allows to get rid of unwanted changes in elog.c/elog.h.

OK.

>  * I decided to change behavior of jsonb_path_match() to throw as less
> errors as possible.  The reason is that it's used to implement
> potentially (patch is pending) indexable operator.  Index scan is not
> always capable to throw as many errors and sequential scan.  So, it's
> better to not introduce extra possible index scan and sequential scan
> results divergence.
> 

Hmmm, can you elaborate a bit more? Which errors were thrown before and
are not thrown with the current patch version?

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

2019-01-28 Thread Nikita Glukhov

On 28.01.2019 16:50, Tomas Vondra wrote:


On 1/28/19 1:22 AM, Alexander Korotkov wrote:

  * I decided to change behavior of jsonb_path_match() to throw as less
errors as possible.  The reason is that it's used to implement
potentially (patch is pending) indexable operator.  Index scan is not
always capable to throw as many errors and sequential scan.  So, it's
better to not introduce extra possible index scan and sequential scan
results divergence.

Hmmm, can you elaborate a bit more? Which errors were thrown before and
are not thrown with the current patch version?


In the previous version of the patch jsonb_path_match() threw error when
jsonpath did not return a singleton value, but in the last version in such cases
NULL is returned.  This problem arises because we cannot guarantee at compile
time that jsonpath expression used in jsonb_path_match() is a predicate.
Predicates by standard can return only True, False, and Unknown (errors occurred
during execution of their operands are transformed into Unknown values), so
predicates cannot throw errors, and there are no problems with errors.

GIN does not attempt to search non-predicate expressions, so there may be no
problem even we throw "not a singleton" error.


Here I want to remind that ability to use predicates in the root of jsonpath
expression is an our extension to standard that was created specially for the
operator @@.  By standard predicates are allowed only in filters.  Without this
extension we are still able to rewrite @@ using @?:
jsonb @@ 'predicate'is equivalent to
jsonb @? '$ ? (predicate)'
but such @? expression is a bit slower to execute and a bit verbose to write.

If we introduced special type 'jsonpath_predicate', then we could solve the
problem by checking the type of jsonpath expression at compile-time.


Another problem with error handling is that jsonb_path_query*() functions
always throw SQL/JSON errors and there is no easy and effective way to emulate
NULL ON ERROR behavior, which is used by default in SQL/JSON functions.  So I
think it's worth trying to add some kind of flag 'throwErrors' to
jsonb_path_query*() functions.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-28 Thread Andres Freund
Hi,

On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> + /*
> +  * It is safe to use here PG_TRY/PG_CATCH without subtransaction because
> +  * no function called inside performs data modification.
> +  */
> + PG_TRY();
> + {
> + res = DirectFunctionCall2(func, ldatum, rdatum);
> + }
> + PG_CATCH();
> + {
> + int errcode = geterrcode();
> +
> + if (jspThrowErrors(cxt) ||
> + ERRCODE_TO_CATEGORY(errcode) != ERRCODE_DATA_EXCEPTION)
> + PG_RE_THROW();
> +
> + MemoryContextSwitchTo(mcxt);
> + FlushErrorState();
> +
> + return jperError;
> + }
> + PG_END_TRY();

FWIW, I still think this is a terrible idea and shouldn't be merged this
way. The likelihood of introducing subtle bugs seems way too high - even
if it's possibly not buggy today, who says that it's not going to be in
the future?

Greetings,

Andres Freund



Re: jsonpath

2019-01-28 Thread Alexander Korotkov
On Tue, Jan 29, 2019 at 4:03 AM Andres Freund  wrote:
> On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > + /*
> > +  * It is safe to use here PG_TRY/PG_CATCH without subtransaction 
> > because
> > +  * no function called inside performs data modification.
> > +  */
> > + PG_TRY();
> > + {
> > + res = DirectFunctionCall2(func, ldatum, rdatum);
> > + }
> > + PG_CATCH();
> > + {
> > + int errcode = geterrcode();
> > +
> > + if (jspThrowErrors(cxt) ||
> > + ERRCODE_TO_CATEGORY(errcode) != 
> > ERRCODE_DATA_EXCEPTION)
> > + PG_RE_THROW();
> > +
> > + MemoryContextSwitchTo(mcxt);
> > + FlushErrorState();
> > +
> > + return jperError;
> > + }
> > + PG_END_TRY();
>
> FWIW, I still think this is a terrible idea and shouldn't be merged this
> way. The likelihood of introducing subtle bugs seems way too high - even
> if it's possibly not buggy today, who says that it's not going to be in
> the future?

I'm probably not yet understanding all the risks this code have.  So far I see:
1) One of functions called here performs database modification, while
it wasn't suppose to.  So, it becomes not safe to skip subtransaction.
2) ERRCODE_DATA_EXCEPTION was thrown for unexpected reason.  So, it
might appear that ERRCODE_DATA_EXCEPTION is not safe to ignore.
Could you complete this list?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-28 Thread Alexander Korotkov
On Tue, Jan 29, 2019 at 4:03 AM Andres Freund  wrote:
> On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > + /*
> > +  * It is safe to use here PG_TRY/PG_CATCH without subtransaction 
> > because
> > +  * no function called inside performs data modification.
> > +  */
> > + PG_TRY();
> > + {
> > + res = DirectFunctionCall2(func, ldatum, rdatum);
> > + }
> > + PG_CATCH();
> > + {
> > + int errcode = geterrcode();
> > +
> > + if (jspThrowErrors(cxt) ||
> > + ERRCODE_TO_CATEGORY(errcode) != 
> > ERRCODE_DATA_EXCEPTION)
> > + PG_RE_THROW();
> > +
> > + MemoryContextSwitchTo(mcxt);
> > + FlushErrorState();
> > +
> > + return jperError;
> > + }
> > + PG_END_TRY();

BTW, this code also looks... useless.  I can't see whole numeric.c
throwing ERRCODE_DATA_EXCEPTION.   Nikita, what do we mean here?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-28 Thread Andres Freund
On 2019-01-29 04:17:33 +0300, Alexander Korotkov wrote:
> On Tue, Jan 29, 2019 at 4:03 AM Andres Freund  wrote:
> > On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > > + /*
> > > +  * It is safe to use here PG_TRY/PG_CATCH without subtransaction 
> > > because
> > > +  * no function called inside performs data modification.
> > > +  */
> > > + PG_TRY();
> > > + {
> > > + res = DirectFunctionCall2(func, ldatum, rdatum);
> > > + }
> > > + PG_CATCH();
> > > + {
> > > + int errcode = geterrcode();
> > > +
> > > + if (jspThrowErrors(cxt) ||
> > > + ERRCODE_TO_CATEGORY(errcode) != 
> > > ERRCODE_DATA_EXCEPTION)
> > > + PG_RE_THROW();
> > > +
> > > + MemoryContextSwitchTo(mcxt);
> > > + FlushErrorState();
> > > +
> > > + return jperError;
> > > + }
> > > + PG_END_TRY();
> >
> > FWIW, I still think this is a terrible idea and shouldn't be merged this
> > way. The likelihood of introducing subtle bugs seems way too high - even
> > if it's possibly not buggy today, who says that it's not going to be in
> > the future?
> 
> I'm probably not yet understanding all the risks this code have.  So far I 
> see:

I find these *more* than sufficient to not go to the PG_TRY/CATCH
approach.


> 1) One of functions called here performs database modification, while
> it wasn't suppose to.  So, it becomes not safe to skip subtransaction.

It's not just data modifications. Even just modifying some memory
structures that'd normally be invalidated by an xact abort's
invalidation processing isn't safe.


> 2) ERRCODE_DATA_EXCEPTION was thrown for unexpected reason.  So, it
> might appear that ERRCODE_DATA_EXCEPTION is not safe to ignore.

It'd e.g. not surprise me very much if some OOM would end up translating
to ERRCODE_DATA_EXCEPTION, because some library function returned an
error due to ENOMEM.


> Could you complete this list?

3) The expression changed the current expression context, GUCs or any
   other such global variable. Without a proper subtrans reset this
   state isn't reverted.
4) The function acquires an LWLOCK, buffer reference, anything resowner
   owned. Skipping subtrans reset, that's not released in that
   moment. That's going to lead to potential hard deadlocks.
99) sigsetjmp is actually pretty expensive.

Greetings,

Andres Freund



Re: jsonpath

2019-01-28 Thread Tom Lane
Alexander Korotkov  writes:
> On Tue, Jan 29, 2019 at 4:03 AM Andres Freund  wrote:
>> FWIW, I still think this is a terrible idea and shouldn't be merged this
>> way. The likelihood of introducing subtle bugs seems way too high - even
>> if it's possibly not buggy today, who says that it's not going to be in
>> the future?

> I'm probably not yet understanding all the risks this code have.  So far I 
> see:
> 1) One of functions called here performs database modification, while
> it wasn't suppose to.  So, it becomes not safe to skip subtransaction.
> 2) ERRCODE_DATA_EXCEPTION was thrown for unexpected reason.  So, it
> might appear that ERRCODE_DATA_EXCEPTION is not safe to ignore.
> Could you complete this list?

Sure: every errcode we have is unsafe to treat this way.

The backend coding rule from day one has been that a thrown error requires
(sub)transaction cleanup to be done to make sure that things are back in a
good state.  You can *not* just decide that it's okay to ignore that,
especially not when invoking code outside the immediate area of what
you're doing.

As a counterexample, for any specific errcode you might claim is safe,
a plpgsql function might do something that requires cleanup (which is
not equal to "does data modification") and then throw that errcode.

You might argue that this code will only ever be used to call certain
functions in numeric.c and you've vetted every line of code that those
functions can reach ... but even to say that is to expose how fragile
the argument is.  Every time anybody touched numeric.c, or any code
reachable from there, we'd have to wonder whether we just introduced
a failure hazard for jsonpath.  That's not maintainable.  It's even
less maintainable if anybody decides they'd like to insert extension
hooks into any of that code.  I rather imagine that this could be
broken today by an ErrorContextCallback function, for example.
(That is, even today, "vetting every line" has to include vetting every
errcontext subroutine in the system, along with everything it can call.
Plus equivalent code in external PLs and so on.)

We've rejected code like this every time it was submitted to date,
and I don't see a reason to change that policy for jsonpath.

regards, tom lane



Re: jsonpath

2019-01-28 Thread Alexander Korotkov
On Tue, Jan 29, 2019 at 4:30 AM Alexander Korotkov
 wrote:
> On Tue, Jan 29, 2019 at 4:03 AM Andres Freund  wrote:
> > On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > > + /*
> > > +  * It is safe to use here PG_TRY/PG_CATCH without subtransaction 
> > > because
> > > +  * no function called inside performs data modification.
> > > +  */
> > > + PG_TRY();
> > > + {
> > > + res = DirectFunctionCall2(func, ldatum, rdatum);
> > > + }
> > > + PG_CATCH();
> > > + {
> > > + int errcode = geterrcode();
> > > +
> > > + if (jspThrowErrors(cxt) ||
> > > + ERRCODE_TO_CATEGORY(errcode) != 
> > > ERRCODE_DATA_EXCEPTION)
> > > + PG_RE_THROW();
> > > +
> > > + MemoryContextSwitchTo(mcxt);
> > > + FlushErrorState();
> > > +
> > > + return jperError;
> > > + }
> > > + PG_END_TRY();
>
> BTW, this code also looks... useless.  I can't see whole numeric.c
> throwing ERRCODE_DATA_EXCEPTION.   Nikita, what do we mean here?

Oh, sorry.  I've missed we have ERRCODE_TO_CATEGORY() here.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-28 Thread Tom Lane
Alexander Korotkov  writes:
> + if (jspThrowErrors(cxt) ||
> + ERRCODE_TO_CATEGORY(errcode) != ERRCODE_DATA_EXCEPTION)
> + PG_RE_THROW();

> BTW, this code also looks... useless.  I can't see whole numeric.c
> throwing ERRCODE_DATA_EXCEPTION.   Nikita, what do we mean here?

I think what this code is claiming is that any errcode in the 22xxx
category is safe to trap.  Presumably what it's expecting to see is
ERRCODE_DIVISION_BY_ZERO (22012) or another error that numeric.c
could throw ... but 22xxx covers a heck of a lot of ground.

regards, tom lane



Re: jsonpath

2019-01-28 Thread Alexander Korotkov
On Tue, Jan 29, 2019 at 4:44 AM Tom Lane  wrote:
> Alexander Korotkov  writes:
> > On Tue, Jan 29, 2019 at 4:03 AM Andres Freund  wrote:
> >> FWIW, I still think this is a terrible idea and shouldn't be merged this
> >> way. The likelihood of introducing subtle bugs seems way too high - even
> >> if it's possibly not buggy today, who says that it's not going to be in
> >> the future?
>
> > I'm probably not yet understanding all the risks this code have.  So far I 
> > see:
> > 1) One of functions called here performs database modification, while
> > it wasn't suppose to.  So, it becomes not safe to skip subtransaction.
> > 2) ERRCODE_DATA_EXCEPTION was thrown for unexpected reason.  So, it
> > might appear that ERRCODE_DATA_EXCEPTION is not safe to ignore.
> > Could you complete this list?
>
> Sure: every errcode we have is unsafe to treat this way.
>
> The backend coding rule from day one has been that a thrown error requires
> (sub)transaction cleanup to be done to make sure that things are back in a
> good state.  You can *not* just decide that it's okay to ignore that,
> especially not when invoking code outside the immediate area of what
> you're doing.
>
> As a counterexample, for any specific errcode you might claim is safe,
> a plpgsql function might do something that requires cleanup (which is
> not equal to "does data modification") and then throw that errcode.
>
> You might argue that this code will only ever be used to call certain
> functions in numeric.c and you've vetted every line of code that those
> functions can reach ... but even to say that is to expose how fragile
> the argument is.  Every time anybody touched numeric.c, or any code
> reachable from there, we'd have to wonder whether we just introduced
> a failure hazard for jsonpath.  That's not maintainable.  It's even
> less maintainable if anybody decides they'd like to insert extension
> hooks into any of that code.  I rather imagine that this could be
> broken today by an ErrorContextCallback function, for example.
> (That is, even today, "vetting every line" has to include vetting every
> errcontext subroutine in the system, along with everything it can call.
> Plus equivalent code in external PLs and so on.)
>
> We've rejected code like this every time it was submitted to date,
> and I don't see a reason to change that policy for jsonpath.

Tom, Andres thank you for the explanation.  More than clear for now.
Will search for workaround.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-28 Thread Andres Freund
Hi,

On 2019-01-28 17:31:15 -0800, Andres Freund wrote:
> On 2019-01-29 04:17:33 +0300, Alexander Korotkov wrote:
> > I'm probably not yet understanding all the risks this code have.  So far I 
> > see:
> 
> I find these *more* than sufficient to not go to the PG_TRY/CATCH
> approach.
> 
> 
> > 1) One of functions called here performs database modification, while
> > it wasn't suppose to.  So, it becomes not safe to skip subtransaction.
> 
> It's not just data modifications. Even just modifying some memory
> structures that'd normally be invalidated by an xact abort's
> invalidation processing isn't safe.
> 
> 
> > 2) ERRCODE_DATA_EXCEPTION was thrown for unexpected reason.  So, it
> > might appear that ERRCODE_DATA_EXCEPTION is not safe to ignore.
> 
> It'd e.g. not surprise me very much if some OOM would end up translating
> to ERRCODE_DATA_EXCEPTION, because some library function returned an
> error due to ENOMEM.
> 
> 
> > Could you complete this list?
> 
> 3) The expression changed the current expression context, GUCs or any
>other such global variable. Without a proper subtrans reset this
>state isn't reverted.
> 4) The function acquires an LWLOCK, buffer reference, anything resowner
>owned. Skipping subtrans reset, that's not released in that
>moment. That's going to lead to potential hard deadlocks.
> 99) sigsetjmp is actually pretty expensive.

And even if you, to address Tom's point about plpgsql, had a category
that could only be thrown by core code, and addressed 3) and 4) by
carefully and continuously auditing code, you'd still have the issue of

5) you'd likely leak memory at potentially prodiguous rate...

Greetings,

Andres Freund



Re: jsonpath

2019-01-28 Thread Tom Lane
Andres Freund  writes:
> And even if you, to address Tom's point about plpgsql, had a category
> that could only be thrown by core code,

Actually, that wasn't quite what my point was there.  Even if the error
that is thrown is perfectly safe and was thrown from a known spot in the
core code, elog.c will call ErrorContextCallback functions on the way out,
*before* longjmp'ing back to your code.  So, if PL foo executes a SQL
command that includes jsonpath operations, it's possible for PL foo's
error context reporting code to break whatever assumptions you thought
were safe to make.

Admittedly, a wise PL designer would refrain from assuming too much about
what they can get away with doing in an error context callback.  But
that very same caution should persuade us not to assume too much about
what actually has happened during error processing.

regards, tom lane



Re: jsonpath

2019-01-29 Thread Andres Freund
Hi,

On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> +/*
> + * Make datetime type from 'date_txt' which is formated at argument 'fmt'.
> + * Actual datatype (returned in 'typid', 'typmod') is determined by
> + * presence of date/time/zone components in the format string.
> + */
> +Datum
> +to_datetime(text *date_txt, text *fmt, char *tzname, bool strict, Oid *typid,
> + int32 *typmod, int *tz)

Given other to_ functions being SQL callable, I'm not a fan of the
naming here.

> +{
> + struct pg_tm tm;
> + fsec_t  fsec;
> + int fprec = 0;
> + int flags;
> +
> + do_to_timestamp(date_txt, fmt, strict, &tm, &fsec, &fprec, &flags);
> +
> + *typmod = fprec ? fprec : -1;   /* fractional part precision */
> + *tz = 0;
> +
> + if (flags & DCH_DATED)
> + {
> + if (flags & DCH_TIMED)
> + {
> + if (flags & DCH_ZONED)
> + {
> + TimestampTz result;
> +
> + if (tm.tm_zone)
> + tzname = (char *) tm.tm_zone;
> +
> + if (tzname)
> + {
> + int dterr = 
> DecodeTimezone(tzname, tz);
> +
> + if (dterr)
> + DateTimeParseError(dterr, 
> tzname, "timestamptz");


Do we really need 6/7 indentation levels in new code?




> +jsonpath_scan.c: FLEXFLAGS = -CF -p -p

Why -CF, and why is -p repeated?


> - JsonEncodeDateTime(buf, val, DATEOID);
> + JsonEncodeDateTime(buf, val, DATEOID, NULL);

ISTM API changes like this ought to be done in a separate patch, to ease
review.


>  }
>  
> +
>  /*
>   * Compare two jbvString JsonbValue values, a and b.
>   *

Random WS change.


> +/*INPUT/OUTPUT*/

Why do we need this much code to serialize data that we initially got in
serialized manner? Couldn't we just keep the original around?


> +Datum
> +jsonpath_in(PG_FUNCTION_ARGS)
> +{

No binary input support? I'd suggest adding that, but keeping the
representation the same. Otherwise just adds unnecesary complexity for
driver authors wishing to use the binar protocol.



> +/Support functions for 
> JsonPath**/
> +
> +/*
> + * Support macroses to read stored values
> + */

s/macroses/macros/


> +++ b/src/backend/utils/adt/jsonpath_exec.c

Gotta say, I'm unenthusiastic about yet another execution engine in some
PG subsystem.


> @@ -0,0 +1,2776 @@
> +/*-
> + *
> + * jsonpath_exec.c
> + *Routines for SQL/JSON path execution.
> + *
> + * Copyright (c) 2019, PostgreSQL Global Development Group
> + *
> + * IDENTIFICATION
> + *   src/backend/utils/adt/jsonpath_exec.c
> + *
> + *-
> + */

Somewhere there needs to be higher level documentation explaining how
this stuff is supposed to work on a code level.

> +
> +/* strict/lax flags is decomposed into four [un]wrap/error flags */
> +#define jspStrictAbsenseOfErrors(cxt)(!(cxt)->laxMode)
> +#define jspAutoUnwrap(cxt)   ((cxt)->laxMode)
> +#define jspAutoWrap(cxt) ((cxt)->laxMode)
> +#define jspIgnoreStructuralErrors(cxt)   ((cxt)->ignoreStructuralErrors)
> +#define jspThrowErrors(cxt)  ((cxt)->throwErrors)

What's the point?


> +#define ThrowJsonPathError(code, detail) \
> + ereport(ERROR, \
> +  (errcode(ERRCODE_ ## code), \
> +   errmsg(ERRMSG_ ## code), \
> +   errdetail detail))
> +
> +#define ThrowJsonPathErrorHint(code, detail, hint) \
> + ereport(ERROR, \
> +  (errcode(ERRCODE_ ## code), \
> +   errmsg(ERRMSG_ ## code), \
> +   errdetail detail, \
> +   errhint hint))

These seem ill-advised, just making it harder to understand the code,
and grepping for it, without actually meaningfully simplifying anything.


> +/*
> + * Find value of jsonpath variable in a list of passing params
> + */

What is that comment supposed to mean?


> +/*
> + * Get the type name of a SQL/JSON item.
> + */
> +static const char *
> +JsonbTypeName(JsonbValue *jb)
> +{

Wasn't there pretty similar infrastructure elsewhere?



> +/*
> + * Cross-type comparison of two datetime SQL/JSON items.  If items are
> + * uncomparable, 'error' flag is set.
> + */

Sounds like it'd not raise an error, but it does in a bunch of scenarios.


> @@ -206,6 +206,22 @@ Section: Class 22 - Data Exception
>  2200NEERRCODE_INVALID_XML_CONTENT 

Re: jsonpath

2019-01-29 Thread Alexander Korotkov
Hi!

Thank you for your review!  Let me answer some points of your review
while others will be answered later.

On Wed, Jan 30, 2019 at 5:28 AM Andres Freund  wrote:
> On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > +/*INPUT/OUTPUT*/
>
> Why do we need this much code to serialize data that we initially got in
> serialized manner? Couldn't we just keep the original around?

As I get, you concern related to fact that we have jsonpath in text
form (serialized) and convert it to the binary form (also serialized).
Yes, we should do so.  Otherwise, we would have to parse jsonpath for
each evaluation.  Basically, for the same reason we have binary
representation for jsonb.

> > +++ b/src/backend/utils/adt/jsonpath_exec.c
>
> Gotta say, I'm unenthusiastic about yet another execution engine in some
> PG subsystem.

Better ideas?  I can imagine we can evade introduction of jsonpath
datatype and turn jsonpath into executor nodes.  But I hardly can
imagine you would be more enthusiastic about that :)

> > @@ -206,6 +206,22 @@ Section: Class 22 - Data Exception
> >  2200NEERRCODE_INVALID_XML_CONTENT  
> >   invalid_xml_content
> >  2200SEERRCODE_INVALID_XML_COMMENT  
> >   invalid_xml_comment
> >  2200TEERRCODE_INVALID_XML_PROCESSING_INSTRUCTION   
> >   invalid_xml_processing_instruction
> > +22030EERRCODE_DUPLICATE_JSON_OBJECT_KEY_VALUE  
> >   duplicate_json_object_key_value
> > +22031EERRCODE_INVALID_ARGUMENT_FOR_JSON_DATETIME_FUNCTION  
> >   invalid_argument_for_json_datetime_function
> > +22032EERRCODE_INVALID_JSON_TEXT
> >   invalid_json_text
> > +22033EERRCODE_INVALID_JSON_SUBSCRIPT   
> >   invalid_json_subscript
> > +22034EERRCODE_MORE_THAN_ONE_JSON_ITEM  
> >   more_than_one_json_item
> > +22035EERRCODE_NO_JSON_ITEM 
> >   no_json_item
> > +22036EERRCODE_NON_NUMERIC_JSON_ITEM
> >   non_numeric_json_item
> > +22037EERRCODE_NON_UNIQUE_KEYS_IN_JSON_OBJECT   
> >   non_unique_keys_in_json_object
> > +22038EERRCODE_SINGLETON_JSON_ITEM_REQUIRED 
> >   singleton_json_item_required
> > +22039EERRCODE_JSON_ARRAY_NOT_FOUND 
> >   json_array_not_found
> > +2203AEERRCODE_JSON_MEMBER_NOT_FOUND
> >   json_member_not_found
> > +2203BEERRCODE_JSON_NUMBER_NOT_FOUND
> >   json_number_not_found
> > +2203CEERRCODE_JSON_OBJECT_NOT_FOUND
> >   object_not_found
> > +2203FEERRCODE_JSON_SCALAR_REQUIRED 
> >   json_scalar_required
> > +2203DEERRCODE_TOO_MANY_JSON_ARRAY_ELEMENTS 
> >   too_many_json_array_elements
> > +2203EEERRCODE_TOO_MANY_JSON_OBJECT_MEMBERS
> >  too_many_json_object_members
>
> Are all of these invented by us?

None of them are invented by us.  All of them are part of SQL 2016 standard.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

2019-01-29 Thread Andres Freund
Hi,

On 2019-01-30 07:34:00 +0300, Alexander Korotkov wrote:
> Thank you for your review!  Let me answer some points of your review
> while others will be answered later.
> 
> On Wed, Jan 30, 2019 at 5:28 AM Andres Freund  wrote:
> > On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > > +/*INPUT/OUTPUT*/
> >
> > Why do we need this much code to serialize data that we initially got in
> > serialized manner? Couldn't we just keep the original around?
> 
> As I get, you concern related to fact that we have jsonpath in text
> form (serialized) and convert it to the binary form (also serialized).
> Yes, we should do so.  Otherwise, we would have to parse jsonpath for
> each evaluation.  Basically, for the same reason we have binary
> representation for jsonb.

But why can't we keep the text around, instead of having all of that
code for converting back?


> > > +++ b/src/backend/utils/adt/jsonpath_exec.c
> >
> > Gotta say, I'm unenthusiastic about yet another execution engine in some
> > PG subsystem.
> 
> Better ideas?  I can imagine we can evade introduction of jsonpath
> datatype and turn jsonpath into executor nodes.  But I hardly can
> imagine you would be more enthusiastic about that :)

Not executor nodes, I think it could be possible to put it into the
expression evaluation codepaths, but it's probably too different to fit
well (would get you JIT compilation of the whole thing tho).

> > > @@ -206,6 +206,22 @@ Section: Class 22 - Data Exception
> > >  2200NEERRCODE_INVALID_XML_CONTENT
> > > invalid_xml_content
> > >  2200SEERRCODE_INVALID_XML_COMMENT
> > > invalid_xml_comment
> > >  2200TEERRCODE_INVALID_XML_PROCESSING_INSTRUCTION 
> > > invalid_xml_processing_instruction
> > > +22030EERRCODE_DUPLICATE_JSON_OBJECT_KEY_VALUE
> > > duplicate_json_object_key_value
> > > +22031EERRCODE_INVALID_ARGUMENT_FOR_JSON_DATETIME_FUNCTION
> > > invalid_argument_for_json_datetime_function
> > > +22032EERRCODE_INVALID_JSON_TEXT  
> > > invalid_json_text
> > > +22033EERRCODE_INVALID_JSON_SUBSCRIPT 
> > > invalid_json_subscript
> > > +22034EERRCODE_MORE_THAN_ONE_JSON_ITEM
> > > more_than_one_json_item
> > > +22035EERRCODE_NO_JSON_ITEM   
> > > no_json_item
> > > +22036EERRCODE_NON_NUMERIC_JSON_ITEM  
> > > non_numeric_json_item
> > > +22037EERRCODE_NON_UNIQUE_KEYS_IN_JSON_OBJECT 
> > > non_unique_keys_in_json_object
> > > +22038EERRCODE_SINGLETON_JSON_ITEM_REQUIRED   
> > > singleton_json_item_required
> > > +22039EERRCODE_JSON_ARRAY_NOT_FOUND   
> > > json_array_not_found
> > > +2203AEERRCODE_JSON_MEMBER_NOT_FOUND  
> > > json_member_not_found
> > > +2203BEERRCODE_JSON_NUMBER_NOT_FOUND  
> > > json_number_not_found
> > > +2203CEERRCODE_JSON_OBJECT_NOT_FOUND  
> > > object_not_found
> > > +2203FEERRCODE_JSON_SCALAR_REQUIRED   
> > > json_scalar_required
> > > +2203DEERRCODE_TOO_MANY_JSON_ARRAY_ELEMENTS   
> > > too_many_json_array_elements
> > > +2203EEERRCODE_TOO_MANY_JSON_OBJECT_MEMBERS
> > >  too_many_json_object_members
> >
> > Are all of these invented by us?
> 
> None of them are invented by us.  All of them are part of SQL 2016 standard.

Cool.

Greetings,

Andres Freund



Re: jsonpath

2019-01-30 Thread Alexander Korotkov
On Wed, Jan 30, 2019 at 9:36 AM Andres Freund  wrote:
> On 2019-01-30 07:34:00 +0300, Alexander Korotkov wrote:
> > Thank you for your review!  Let me answer some points of your review
> > while others will be answered later.
> >
> > On Wed, Jan 30, 2019 at 5:28 AM Andres Freund 
wrote:
> > > On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > > >
+/*INPUT/OUTPUT*/
> > >
> > > Why do we need this much code to serialize data that we initially got
in
> > > serialized manner? Couldn't we just keep the original around?
> >
> > As I get, you concern related to fact that we have jsonpath in text
> > form (serialized) and convert it to the binary form (also serialized).
> > Yes, we should do so.  Otherwise, we would have to parse jsonpath for
> > each evaluation.  Basically, for the same reason we have binary
> > representation for jsonb.
>
> But why can't we keep the text around, instead of having all of that
> code for converting back?

Yeah, that's possible.  But now converting back to string is one of ways to
test that jsonpath parsing is correct.  If we remove conversion back to
string, this possibility would be also removed.  Also, we would loose way
to normalize jsonpath, which is probably not that important.  As well it's
generally ugly redundancy.  So, I can't say I like idea to save few
hundreds lines of codes for this price.

> > > > +++ b/src/backend/utils/adt/jsonpath_exec.c
> > >
> > > Gotta say, I'm unenthusiastic about yet another execution engine in
some
> > > PG subsystem.
> >
> > Better ideas?  I can imagine we can evade introduction of jsonpath
> > datatype and turn jsonpath into executor nodes.  But I hardly can
> > imagine you would be more enthusiastic about that :)
>
> Not executor nodes, I think it could be possible to put it into the
> expression evaluation codepaths, but it's probably too different to fit
> well (would get you JIT compilation of the whole thing tho).

Consider given example. We need to check some predicate for each JSON item,
where predicate is given by expression and set of items is produced by
another expression.  In order to fit this into expression evaluation, we
probably need some kind of lamda functions there.  It seems unlikely for me
that we want to implement that.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

2019-02-03 Thread Michael Paquier
On Tue, Jan 29, 2019 at 04:51:46AM +0300, Alexander Korotkov wrote:
> Oh, sorry.  I've missed we have ERRCODE_TO_CATEGORY() here.

Note: patch set moved to next CF, still waiting on author.
--
Michael


signature.asc
Description: PGP signature


Re: jsonpath

2019-02-24 Thread Tomas Vondra
Hi,

On 2/24/19 10:03 AM, Alexander Korotkov wrote:
> Hi!
> 
> Attached is revised version of jsonpath.  Assuming that jsonpath have
> problem places, I decided to propose partial implementation.
> Following functionality was cut from jsonpath:
> 1) Support of datetime datatypes.  Besides error suppression, this
> functionality have troubles with timezones.  According to standard we
> should support timezones in jsonpath expressions.  But that would
> prevent jsonpath functions from being immutable, that in turn limits
> their usage in expression indexes.

Assuming we get the patch committed without the datetime stuff now, what
does that mean for the future? Does that mean we'll be unable to extend
it to support datetime in the future, or what? If we define the jsonpath
functions as immutable now, people may create indexes - which means we
won't be able to downgrade it to stable later.

So, what's the plan here? The only thing I can think of is having two
sets of functions - an immutable one, prohibiting datetime expressions,
and stable that can't be used for indexes etc.

> 2) Suppression of numeric errors.  I will post it as a separate patch.
> Pushing this even this partial implementation of jsonpath to PG 12 is
> still very useful.  Also it will simplify a lot pushing other parts of
> SQL/JSON to future releases.
> 

+1 to push at least partial (but still useful) subset, instead of just
bumping the patch to PG13


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



  1   2   3   4   5   >