Re: [HACKERS] btree_gin and btree_gist for enums
28.02.2017 00:22, Andrew Dunstan: OK, here's the whole series of patches. Patch 1 adds the CallerFInfoFunctionCall{1,2} functions. Patch 2 adds btree_gist support for their use for non-varlena types Patch 3 does the same for varlena types (Not required for patch 4, but better to be consistent, I think.) Patch 4 adds enum support to btree_gist Patch 5 adds enum support to btree_gin cheers andrew Thank you for implementing the feature! These patches seem to have some merge conflicts with recent commit: commit c7a9fa399d557c6366222e90b35db31e45d25678 Author: Stephen Frost Date: Wed Mar 15 11:16:25 2017 -0400 Add support for EUI-64 MAC addresses as macaddr8 And also, it's needed to update patch 0002 to consider macaddr8, I attached the diff. Complete patch (including 0002_addition fix) rebased to the current master is attached as well. It works as expected, code itself looks clear and well documented. The only issue I've found is a make check failure in contrib/btree_gin subdirectory: test numeric ... ok test enum ... /bin/sh: 1: cannot open /home/anastasia/newprojects/original_repo/postgres/contrib/btree_gin/sql/enum.sql: No such file diff: /home/anastasia/newprojects/original_repo/postgres/contrib/btree_gin/results/enum.out: No such file or directory diff command failed with status 512: diff "/home/anastasia/newprojects/original_repo/postgres/contrib/btree_gin/expected/enum.out" "/home/anastasia/newprojects/original_repo/postgres/contrib/btree_gin/results/enum.out" > "/home/anastasia/newprojects/original_repo/postgres/contrib/btree_gin/results/enum.out.diff" Please, add regression test for btree_gin, and this patch can be considered "Ready for committer". -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/btree_gist/btree_macaddr8.c b/contrib/btree_gist/btree_macaddr8.c index 13238ef..96afbcd 100644 --- a/contrib/btree_gist/btree_macaddr8.c +++ b/contrib/btree_gist/btree_macaddr8.c @@ -28,37 +28,37 @@ PG_FUNCTION_INFO_V1(gbt_macad8_same); static bool -gbt_macad8gt(const void *a, const void *b) +gbt_macad8gt(const void *a, const void *b, FmgrInfo *flinfo) { return DatumGetBool(DirectFunctionCall2(macaddr8_gt, PointerGetDatum(a), PointerGetDatum(b))); } static bool -gbt_macad8ge(const void *a, const void *b) +gbt_macad8ge(const void *a, const void *b, FmgrInfo *flinfo) { return DatumGetBool(DirectFunctionCall2(macaddr8_ge, PointerGetDatum(a), PointerGetDatum(b))); } static bool -gbt_macad8eq(const void *a, const void *b) +gbt_macad8eq(const void *a, const void *b, FmgrInfo *flinfo) { return DatumGetBool(DirectFunctionCall2(macaddr8_eq, PointerGetDatum(a), PointerGetDatum(b))); } static bool -gbt_macad8le(const void *a, const void *b) +gbt_macad8le(const void *a, const void *b, FmgrInfo *flinfo) { return DatumGetBool(DirectFunctionCall2(macaddr8_le, PointerGetDatum(a), PointerGetDatum(b))); } static bool -gbt_macad8lt(const void *a, const void *b) +gbt_macad8lt(const void *a, const void *b, FmgrInfo *flinfo) { return DatumGetBool(DirectFunctionCall2(macaddr8_lt, PointerGetDatum(a), PointerGetDatum(b))); } static int -gbt_macad8key_cmp(const void *a, const void *b) +gbt_macad8key_cmp(const void *a, const void *b, FmgrInfo *flinfo) { mac8KEY*ia = (mac8KEY *) (((const Nsrt *) a)->t); mac8KEY*ib = (mac8KEY *) (((const Nsrt *) b)->t); @@ -142,7 +142,7 @@ gbt_macad8_consistent(PG_FUNCTION_ARGS) key.upper = (GBT_NUMKEY *) &kkk->upper; PG_RETURN_BOOL( - gbt_num_consistent(&key, (void *) query, &strategy, GIST_LEAF(entry), &tinfo) + gbt_num_consistent(&key, (void *) query, &strategy, GIST_LEAF(entry), &tinfo, fcinfo->flinfo) ); } @@ -154,7 +154,7 @@ gbt_macad8_union(PG_FUNCTION_ARGS) void *out = palloc0(sizeof(mac8KEY)); *(int *) PG_GETARG_POINTER(1) = sizeof(mac8KEY); - PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo)); + PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo)); } @@ -184,7 +184,7 @@ gbt_macad8_picksplit(PG_FUNCTION_ARGS) PG_RETURN_POINTER(gbt_num_picksplit( (GistEntryVector *) PG_GETARG_POINTER(0), (GIST_SPLITVEC *) PG_GETARG_POINTER(1), - &tinfo + &tinfo, fcinfo->flinfo )); } @@ -195,6 +195,6 @@ gbt_macad8_same(PG_FUNCTION_ARGS) mac8KEY*b2 = (mac8KEY *) PG_GETARG_POINTER(1); bool *result = (bool *) PG_GETARG_POINTER(2); - *result = gbt_num_same((void *) b1, (void *) b2, &tinfo); + *result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo); PG_RETURN_POINTER(result); } commit 50f4a4efcb9ed3ae5e0378676459c6d1ce455bd2 Author: Anastasia Date: Tue Mar 21 10:11:37 2017 +0300 complete btree_gin_gist_enum patch diff --git a/contrib/btree_gin/Makefile b/contrib/btree_gin/Makefile index f22e4af..589755f 100644 --- a/contrib/btree_gin/Make
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/27/2017 04:41 PM, Tom Lane wrote: > Andrew Dunstan writes: >> OK, here's the whole series of patches. > I've not tested it at all, but this looks generally sane in a quick > once-over. > > A minor quibble is that in 0003, you weren't terribly consistent about > argument order --- in some places you have the FmgrInfo argument added > before the collation argument, and in some places after. I'd suggest > trying to make the argument orders consistent with the fmgr.c support > functions. (I'm generally -1 on blindly adding stuff at the end.) > > Thanks for reviewing. I don't mind changing it, but if I do I'm inclined to make it as consistent as possible with the 0002 patch, which did put all the FmgrInfo arguments at the end - there's not any other more obvious place for them in that case, as there is no collation argument. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
Andrew Dunstan writes: > OK, here's the whole series of patches. I've not tested it at all, but this looks generally sane in a quick once-over. A minor quibble is that in 0003, you weren't terribly consistent about argument order --- in some places you have the FmgrInfo argument added before the collation argument, and in some places after. I'd suggest trying to make the argument orders consistent with the fmgr.c support functions. (I'm generally -1 on blindly adding stuff at the end.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/26/2017 04:09 PM, Andrew Dunstan wrote: > > On 02/26/2017 03:26 PM, Tom Lane wrote: >> Andrew Dunstan writes: >>> This works for the btree_gin case. However, there's a difficulty for >>> btree_gist - in the puicksplit routine the cmp function is passed to >>> qsort() so there is no chance to pass it an flinfo to set up the call to >>> the real comparison routine. Implementing a custom sort routine to work >>> around the problem seems a bridge too far. I can;t think of an >>> alternative off hand. >> We already have qsort_arg ... can't you change it to use that? >> >> > > Yes, wasn't aware of that, that looks like exactly what I need. thanks. > > OK, here's the whole series of patches. Patch 1 adds the CallerFInfoFunctionCall{1,2} functions. Patch 2 adds btree_gist support for their use for non-varlena types Patch 3 does the same for varlena types (Not required for patch 4, but better to be consistent, I think.) Patch 4 adds enum support to btree_gist Patch 5 adds enum support to btree_gin cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >From 74ebe1490de916409458e3d1adf47dfa20b1fefb Mon Sep 17 00:00:00 2001 From: Andrew Dunstan Date: Mon, 27 Feb 2017 15:41:03 -0500 Subject: [PATCH 4/4] Add btree_gist support for enum types. This will alow enums to be used in exclusio n constraints. The code uses the new CallerFInfoFunctionCall infrastructure in fmgr, and the support for it added to btree_gist in a nearby patch. --- contrib/btree_gist/Makefile | 8 +- contrib/btree_gist/btree_enum.c | 187 + contrib/btree_gist/btree_gist--1.3--1.4.sql | 69 contrib/btree_gist/btree_gist.control | 2 +- contrib/btree_gist/btree_gist.h | 3 +- contrib/btree_gist/btree_utils_num.c| 2 + contrib/btree_gist/data/enum.data | 595 contrib/btree_gist/expected/enum.out| 91 + contrib/btree_gist/sql/enum.sql | 38 ++ doc/src/sgml/btree-gist.sgml| 2 +- 10 files changed, 991 insertions(+), 6 deletions(-) create mode 100644 contrib/btree_gist/btree_enum.c create mode 100644 contrib/btree_gist/btree_gist--1.3--1.4.sql create mode 100644 contrib/btree_gist/data/enum.data create mode 100644 contrib/btree_gist/expected/enum.out create mode 100644 contrib/btree_gist/sql/enum.sql diff --git a/contrib/btree_gist/Makefile b/contrib/btree_gist/Makefile index d36f517..beb2ed7 100644 --- a/contrib/btree_gist/Makefile +++ b/contrib/btree_gist/Makefile @@ -6,16 +6,18 @@ OBJS = btree_gist.o btree_utils_num.o btree_utils_var.o btree_int2.o \ btree_int4.o btree_int8.o btree_float4.o btree_float8.o btree_cash.o \ btree_oid.o btree_ts.o btree_time.o btree_date.o btree_interval.o \ btree_macaddr.o btree_inet.o btree_text.o btree_bytea.o btree_bit.o \ -btree_numeric.o btree_uuid.o $(WIN32RES) +btree_numeric.o btree_uuid.o btree_enum.o $(WIN32RES) EXTENSION = btree_gist DATA = btree_gist--unpackaged--1.0.sql btree_gist--1.0--1.1.sql \ - btree_gist--1.1--1.2.sql btree_gist--1.2.sql btree_gist--1.2--1.3.sql + btree_gist--1.1--1.2.sql btree_gist--1.2.sql btree_gist--1.2--1.3.sql \ + btree_gist--1.3--1.4.sql + PGFILEDESC = "btree_gist - B-tree equivalent GiST operator classes" REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz \ time timetz date interval macaddr inet cidr text varchar char bytea \ -bit varbit numeric uuid not_equal +bit varbit numeric uuid enum not_equal SHLIB_LINK += $(filter -lm, $(LIBS)) diff --git a/contrib/btree_gist/btree_enum.c b/contrib/btree_gist/btree_enum.c new file mode 100644 index 000..5e46e78 --- /dev/null +++ b/contrib/btree_gist/btree_enum.c @@ -0,0 +1,187 @@ +/* + * contrib/btree_gist/btree_enum.c + */ +#include "postgres.h" +#include "fmgr.h" +#include "utils/builtins.h" + +#include "btree_gist.h" +#include "btree_utils_num.h" + +/* enums are really Oids, so we just use the same structure */ + +typedef struct +{ + Oid lower; + Oid upper; +} oidKEY; + +/* +** enum ops +*/ +PG_FUNCTION_INFO_V1(gbt_enum_compress); +PG_FUNCTION_INFO_V1(gbt_enum_fetch); +PG_FUNCTION_INFO_V1(gbt_enum_union); +PG_FUNCTION_INFO_V1(gbt_enum_picksplit); +PG_FUNCTION_INFO_V1(gbt_enum_consistent); +PG_FUNCTION_INFO_V1(gbt_enum_penalty); +PG_FUNCTION_INFO_V1(gbt_enum_same); + + +static bool +gbt_enumgt(const void *a, const void *b, FmgrInfo *flinfo) +{ + return DatumGetBool( + CallerFInfoFunctionCall2(enum_gt, flinfo, InvalidOid, ObjectIdGetDatum(*((const Oid *) a)), ObjectIdGetDatum(*((const Oid *) b))) + ); +} +static bool +gbt_enumge(const void *a, const void *b, FmgrInfo *flinfo) +{ + return DatumGetBool( + CallerFInfoFunctionCall2(enum_ge, flinfo, InvalidOid, ObjectIdGetDatum(*((const Oid *) a)), ObjectIdGetDatum(*((
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/26/2017 03:26 PM, Tom Lane wrote: > Andrew Dunstan writes: >> This works for the btree_gin case. However, there's a difficulty for >> btree_gist - in the puicksplit routine the cmp function is passed to >> qsort() so there is no chance to pass it an flinfo to set up the call to >> the real comparison routine. Implementing a custom sort routine to work >> around the problem seems a bridge too far. I can;t think of an >> alternative off hand. > We already have qsort_arg ... can't you change it to use that? > > Yes, wasn't aware of that, that looks like exactly what I need. thanks. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
Andrew Dunstan writes: > This works for the btree_gin case. However, there's a difficulty for > btree_gist - in the puicksplit routine the cmp function is passed to > qsort() so there is no chance to pass it an flinfo to set up the call to > the real comparison routine. Implementing a custom sort routine to work > around the problem seems a bridge too far. I can;t think of an > alternative off hand. We already have qsort_arg ... can't you change it to use that? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/25/2017 01:39 PM, Andrew Dunstan wrote: > > On 02/25/2017 01:34 PM, Tom Lane wrote: >> Andrew Dunstan writes: >>> On 02/25/2017 12:04 PM, Tom Lane wrote: I think it'd be better to leave DirectFunctionCallN alone and just invent a small number of CallerFInfoFunctionCallN support functions (maybe N=1 and N=2 would be enough, at least for now). >>> See attached. >> Yeah, I like this better, except that instead of >> >> + * The callee should not look at anything except the fn_mcxt and fn_extra. >> + * Anything else is likely to be bogus. >> >> maybe >> >> + * It's recommended that the callee only use the fn_extra and fn_mcxt >> + * fields, as other fields will typically describe the calling function >> + * not the callee. Conversely, the calling function should not have >> + * used fn_extra, unless its use is known compatible with the callee's. >> >> > > OK, Works for me. Thanks. > This works for the btree_gin case. However, there's a difficulty for btree_gist - in the puicksplit routine the cmp function is passed to qsort() so there is no chance to pass it an flinfo to set up the call to the real comparison routine. Implementing a custom sort routine to work around the problem seems a bridge too far. I can;t think of an alternative off hand. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/25/2017 01:34 PM, Tom Lane wrote: > Andrew Dunstan writes: >> On 02/25/2017 12:04 PM, Tom Lane wrote: >>> I think it'd be better to leave DirectFunctionCallN alone and just invent >>> a small number of CallerFInfoFunctionCallN support functions (maybe N=1 >>> and N=2 would be enough, at least for now). >> See attached. > Yeah, I like this better, except that instead of > > + * The callee should not look at anything except the fn_mcxt and fn_extra. > + * Anything else is likely to be bogus. > > maybe > > + * It's recommended that the callee only use the fn_extra and fn_mcxt > + * fields, as other fields will typically describe the calling function > + * not the callee. Conversely, the calling function should not have > + * used fn_extra, unless its use is known compatible with the callee's. > > OK, Works for me. Thanks. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
Andrew Dunstan writes: > On 02/25/2017 12:04 PM, Tom Lane wrote: >> I think it'd be better to leave DirectFunctionCallN alone and just invent >> a small number of CallerFInfoFunctionCallN support functions (maybe N=1 >> and N=2 would be enough, at least for now). > See attached. Yeah, I like this better, except that instead of + * The callee should not look at anything except the fn_mcxt and fn_extra. + * Anything else is likely to be bogus. maybe + * It's recommended that the callee only use the fn_extra and fn_mcxt + * fields, as other fields will typically describe the calling function + * not the callee. Conversely, the calling function should not have + * used fn_extra, unless its use is known compatible with the callee's. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/25/2017 12:04 PM, Tom Lane wrote: > I think it'd be better to leave DirectFunctionCallN alone and just invent > a small number of CallerFInfoFunctionCallN support functions (maybe N=1 > and N=2 would be enough, at least for now). > > See attached. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c index 3976496..66227e5 100644 --- a/src/backend/utils/fmgr/fmgr.c +++ b/src/backend/utils/fmgr/fmgr.c @@ -1276,6 +1276,54 @@ DirectFunctionCall9Coll(PGFunction func, Oid collation, Datum arg1, Datum arg2, return result; } +/* + * These functions work like the DirectFunctionCall functions except that + * they use the flinfo parameter to initialise the fcinfo for the call. + * The callee should not look at anything except the fn_mcxt and fn_extra. + * Anything else is likely to be bogus. + */ + +Datum +CallerFInfoFunctionCall1(PGFunction func, FmgrInfo *flinfo, Oid collation, Datum arg1) +{ + FunctionCallInfoData fcinfo; + Datum result; + + InitFunctionCallInfoData(fcinfo, flinfo, 1, collation, NULL, NULL); + + fcinfo.arg[0] = arg1; + fcinfo.argnull[0] = false; + + result = (*func) (&fcinfo); + + /* Check for null result, since caller is clearly not expecting one */ + if (fcinfo.isnull) + elog(ERROR, "function %p returned NULL", (void *) func); + + return result; +} + +Datum +CallerFInfoFunctionCall2(PGFunction func, FmgrInfo *flinfo, Oid collation, Datum arg1, Datum arg2) +{ + FunctionCallInfoData fcinfo; + Datum result; + + InitFunctionCallInfoData(fcinfo, flinfo, 2, collation, NULL, NULL); + + fcinfo.arg[0] = arg1; + fcinfo.arg[1] = arg2; + fcinfo.argnull[0] = false; + fcinfo.argnull[1] = false; + + result = (*func) (&fcinfo); + + /* Check for null result, since caller is clearly not expecting one */ + if (fcinfo.isnull) + elog(ERROR, "function %p returned NULL", (void *) func); + + return result; +} /* * These are for invocation of a previously-looked-up function with a diff --git a/src/include/fmgr.h b/src/include/fmgr.h index a671480..739ce46 100644 --- a/src/include/fmgr.h +++ b/src/include/fmgr.h @@ -475,6 +475,19 @@ extern Datum DirectFunctionCall9Coll(PGFunction func, Oid collation, Datum arg6, Datum arg7, Datum arg8, Datum arg9); +/* + * These functions work like the DirectFunctionCall functions except that + * they use the flinfo parameter to initialise the fcinfo for the call. + * The callee should not look at anything except the fn_mcxt and fn_extra. + * Anything else is likely to be bogus. + */ + +extern Datum CallerFInfoFunctionCall1(PGFunction func, FmgrInfo *flinfo, + Oid collation, Datum arg1); +extern Datum CallerFInfoFunctionCall1(PGFunction func, FmgrInfo *flinfo, + Oid collation, Datum arg1); + + /* These are for invocation of a previously-looked-up function with a * directly-computed parameter list. Note that neither arguments nor result * are allowed to be NULL. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
> The reason this is kind of scary is that it's just blithely assuming > that the function won't look at the *other* fields of the FmgrInfo. > If it did, it would likely get very confused, since those fields > would be describing the GIN support function, not the function we're > calling. I am sorry if it doesn't make sense, but wouldn't the whole thing be better, if we refactor enum.c to call only he internal functions with TypeCacheEntry just like the rangetypes? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
Andrew Dunstan writes: > On 02/24/2017 05:34 PM, Andrew Dunstan wrote: >> It's occurred to me that we could reduce the code clutter in fmgr.c a >> bit by turning the DirectFunctionCall{n]Coll functions into macros >> calling these functions and passing NULL as the flinfo param. > here's a patch along those lines. If there's agreement on this I can > finish up the work on btree_gist and btree_gin supoport for enums. I'm not really thrilled with doing it like that, for two reasons: 1. This makes it look like CallerFInfoFunctionCallN is the more basic, common interface, which it isn't and never will be. At best, that's confusing. And I don't like having only one documentation block covering both interfaces; that does nothing for clarity either. 2. By my count there are over 500 uses of DirectFunctionCallN in the backend. This will add an additional hidden parameter to each one, implying code bloat and some fractional speed cost. I think it'd be better to leave DirectFunctionCallN alone and just invent a small number of CallerFInfoFunctionCallN support functions (maybe N=1 and N=2 would be enough, at least for now). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/24/2017 05:34 PM, Andrew Dunstan wrote: > > On 02/24/2017 02:55 PM, Andrew Dunstan wrote: >> On 02/24/2017 11:02 AM, Tom Lane wrote: I don't know what to call it either. In my test I used CallerContextFunctionCall2 - not sure if that's quite right, but should be close. >>> CallerInfo? CallerFInfo? Or we could spell out CallerFmgrInfo but >>> that seems a bit verbose. >>> >>> >> I'll go with CallerFInfoFunctionCall2 etc. >> >> In the btree_gist system the calls to the routines like enum_cmp are >> buried about three levels deep. I'm thinking I'll just pass the flinfo >> down the stack and just call these routines at the bottom level. >> >> >> > > It's occurred to me that we could reduce the code clutter in fmgr.c a > bit by turning the DirectFunctionCall{n]Coll functions into macros > calling these functions and passing NULL as the flinfo param. > here's a patch along those lines. If there's agreement on this I can finish up the work on btree_gist and btree_gin supoport for enums. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c index 3976496..d565d3e 100644 --- a/src/backend/utils/fmgr/fmgr.c +++ b/src/backend/utils/fmgr/fmgr.c @@ -1006,19 +1006,30 @@ fmgr_security_definer(PG_FUNCTION_ARGS) *- */ -/* - * These are for invocation of a specifically named function with a + +/* These are for invocation of a specifically named function with a * directly-computed parameter list. Note that neither arguments nor result - * are allowed to be NULL. Also, the function cannot be one that needs to - * look at FmgrInfo, since there won't be any. + * are allowed to be NULL. + * + * If the flinfo parameter is not NULL then this used to supply context + * to the called function. The callee should only use the passed in fn_mcxt + * and fn_extra. Anything else is likely to be bogus. + * + * But the fn_extra can be used for example for caching results + * (see enum_cmp for an example). + * + * No context is supplied if the parameter is NULL. See DirectFunctionCall + * macros. In this case the function must not try to look at FmgrInfo, since + * there won't be one and it is likely to result in an access violation. */ Datum -DirectFunctionCall1Coll(PGFunction func, Oid collation, Datum arg1) +CallerFInfoFunctionCall1(PGFunction func, FmgrInfo *flinfo, + Oid collation, Datum arg1) { FunctionCallInfoData fcinfo; Datum result; - InitFunctionCallInfoData(fcinfo, NULL, 1, collation, NULL, NULL); + InitFunctionCallInfoData(fcinfo, flinfo, 1, collation, NULL, NULL); fcinfo.arg[0] = arg1; fcinfo.argnull[0] = false; @@ -1033,12 +1044,13 @@ DirectFunctionCall1Coll(PGFunction func, Oid collation, Datum arg1) } Datum -DirectFunctionCall2Coll(PGFunction func, Oid collation, Datum arg1, Datum arg2) +CallerFInfoFunctionCall2(PGFunction func, FmgrInfo *flinfo, + Oid collation, Datum arg1, Datum arg2) { FunctionCallInfoData fcinfo; Datum result; - InitFunctionCallInfoData(fcinfo, NULL, 2, collation, NULL, NULL); + InitFunctionCallInfoData(fcinfo, flinfo, 2, collation, NULL, NULL); fcinfo.arg[0] = arg1; fcinfo.arg[1] = arg2; @@ -1055,13 +1067,14 @@ DirectFunctionCall2Coll(PGFunction func, Oid collation, Datum arg1, Datum arg2) } Datum -DirectFunctionCall3Coll(PGFunction func, Oid collation, Datum arg1, Datum arg2, - Datum arg3) +CallerFInfoFunctionCall3(PGFunction func, FmgrInfo *flinfo, + Oid collation, Datum arg1, Datum arg2, + Datum arg3) { FunctionCallInfoData fcinfo; Datum result; - InitFunctionCallInfoData(fcinfo, NULL, 3, collation, NULL, NULL); + InitFunctionCallInfoData(fcinfo, flinfo, 3, collation, NULL, NULL); fcinfo.arg[0] = arg1; fcinfo.arg[1] = arg2; @@ -1080,13 +1093,14 @@ DirectFunctionCall3Coll(PGFunction func, Oid collation, Datum arg1, Datum arg2, } Datum -DirectFunctionCall4Coll(PGFunction func, Oid collation, Datum arg1, Datum arg2, - Datum arg3, Datum arg4) +CallerFInfoFunctionCall4(PGFunction func, FmgrInfo *flinfo, + Oid collation, Datum arg1, Datum arg2, + Datum arg3, Datum arg4) { FunctionCallInfoData fcinfo; Datum result; - InitFunctionCallInfoData(fcinfo, NULL, 4, collation, NULL, NULL); + InitFunctionCallInfoData(fcinfo, flinfo, 4, collation, NULL, NULL); fcinfo.arg[0] = arg1; fcinfo.arg[1] = arg2; @@ -1107,13 +1121,14 @@ DirectFunctionCall4Coll(PGFunction func, Oid collation, Datum arg1, Datum arg2, } Datum -DirectFunctionCall5Coll(PGFunction func, Oid collation, Datum arg1, Datum arg2, - Datum arg3, Datum arg4, Datum arg5) +CallerFInfoFunctionCall5(PGFunction func, FmgrInfo *flinfo, + Oid collation, Datum arg1, Datum arg2, + Datum arg3, Datum arg4, Datum arg5) { FunctionCallI
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/24/2017 02:55 PM, Andrew Dunstan wrote: > > On 02/24/2017 11:02 AM, Tom Lane wrote: >>> I don't know what to call it either. In my test I used >>> CallerContextFunctionCall2 - not sure if that's quite right, but should >>> be close. >> CallerInfo? CallerFInfo? Or we could spell out CallerFmgrInfo but >> that seems a bit verbose. >> >> > I'll go with CallerFInfoFunctionCall2 etc. > > In the btree_gist system the calls to the routines like enum_cmp are > buried about three levels deep. I'm thinking I'll just pass the flinfo > down the stack and just call these routines at the bottom level. > > > It's occurred to me that we could reduce the code clutter in fmgr.c a bit by turning the DirectFunctionCall{n]Coll functions into macros calling these functions and passing NULL as the flinfo param. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/24/2017 11:02 AM, Tom Lane wrote: > Andrew Dunstan writes: >> On 02/23/2017 04:41 PM, Tom Lane wrote: >>> The reason this is kind of scary is that it's just blithely assuming >>> that the function won't look at the *other* fields of the FmgrInfo. >>> If it did, it would likely get very confused, since those fields >>> would be describing the GIN support function, not the function we're >>> calling. >>> >>> We could alternatively have this trampoline function set up a fresh, local >>> FmgrInfo struct that it zeroes except for copying fn_extra and fn_mcxt >>> from the caller's struct, and then it copies fn_extra back again on the >>> way out. That's a few more cycles but it would be safer, I think; if the >>> function tried to look at the other fields such as fn_oid it would see >>> obviously bogus data. >> Do we want one or both of these? I'm prepared to code up a patch to >> fmgr.[ch] to provide them. > On reflection I'm not sure that the double-copy approach is all that much > safer than just passing down the caller's flinfo pointer. Most of the > time it would be better, but suppose that the callee updates fn_extra > and then throws elog(ERROR) --- the outcome would be different, probably > creating a leak in fn_mcxt. Maybe this would still be okay, because > perhaps that FmgrInfo is never used again, but I don't think we can assume > that for the case at hand. > > At this point I'd be inclined to just document that the called function > should only use fn_extra/fn_mcxt. fair enough. Will do it that way. > >> I don't know what to call it either. In my test I used >> CallerContextFunctionCall2 - not sure if that's quite right, but should >> be close. > CallerInfo? CallerFInfo? Or we could spell out CallerFmgrInfo but > that seems a bit verbose. > > I'll go with CallerFInfoFunctionCall2 etc. In the btree_gist system the calls to the routines like enum_cmp are buried about three levels deep. I'm thinking I'll just pass the flinfo down the stack and just call these routines at the bottom level. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
Andrew Dunstan writes: > On 02/23/2017 04:41 PM, Tom Lane wrote: >> The reason this is kind of scary is that it's just blithely assuming >> that the function won't look at the *other* fields of the FmgrInfo. >> If it did, it would likely get very confused, since those fields >> would be describing the GIN support function, not the function we're >> calling. >> >> We could alternatively have this trampoline function set up a fresh, local >> FmgrInfo struct that it zeroes except for copying fn_extra and fn_mcxt >> from the caller's struct, and then it copies fn_extra back again on the >> way out. That's a few more cycles but it would be safer, I think; if the >> function tried to look at the other fields such as fn_oid it would see >> obviously bogus data. > Do we want one or both of these? I'm prepared to code up a patch to > fmgr.[ch] to provide them. On reflection I'm not sure that the double-copy approach is all that much safer than just passing down the caller's flinfo pointer. Most of the time it would be better, but suppose that the callee updates fn_extra and then throws elog(ERROR) --- the outcome would be different, probably creating a leak in fn_mcxt. Maybe this would still be okay, because perhaps that FmgrInfo is never used again, but I don't think we can assume that for the case at hand. At this point I'd be inclined to just document that the called function should only use fn_extra/fn_mcxt. > I don't know what to call it either. In my test I used > CallerContextFunctionCall2 - not sure if that's quite right, but should > be close. CallerInfo? CallerFInfo? Or we could spell out CallerFmgrInfo but that seems a bit verbose. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/23/2017 04:41 PM, Tom Lane wrote: > Andrew Dunstan writes: >> I'm not entirely sure how I should replace those DirectFunctionCall2 calls. > Basically what we want is for the called function to be able to use the > fcinfo->flinfo->fn_extra and fcinfo->flinfo->fn_mcxt fields of the > FmgrInfo struct that GIN has set up for the btree_gin support function. > > The fast but somewhat scary way to do it would just be to pass through > the flinfo pointer as-is. Imagine that fmgr.c grows a set of functions > like > > Datum > DontKnowWhatToCallThisFunctionCall2(PGFunction func, > FmgrInfo *flinfo, Oid collation, > Datum arg1, Datum arg2) > { > FunctionCallInfoData fcinfo; > Datumresult; > > InitFunctionCallInfoData(fcinfo, flinfo, 2, collation, NULL, NULL); > > fcinfo.arg[0] = arg1; > fcinfo.arg[1] = arg2; > fcinfo.argnull[0] = false; > fcinfo.argnull[1] = false; > > result = (*func) (&fcinfo); > > /* Check for null result, since caller is clearly not expecting one */ > if (fcinfo.isnull) > elog(ERROR, "function %p returned NULL", (void *) func); > > return result; > } > > and then you'd just pass through the fcinfo->flinfo you got. > > The reason this is kind of scary is that it's just blithely assuming > that the function won't look at the *other* fields of the FmgrInfo. > If it did, it would likely get very confused, since those fields > would be describing the GIN support function, not the function we're > calling. > > We could alternatively have this trampoline function set up a fresh, local > FmgrInfo struct that it zeroes except for copying fn_extra and fn_mcxt > from the caller's struct, and then it copies fn_extra back again on the > way out. That's a few more cycles but it would be safer, I think; if the > function tried to look at the other fields such as fn_oid it would see > obviously bogus data. > > Do we want one or both of these? I'm prepared to code up a patch to fmgr.[ch] to provide them. I don't know what to call it either. In my test I used CallerContextFunctionCall2 - not sure if that's quite right, but should be close. The technique is somewhat similar to what we do in plperl.c where we fake up a function call for handling DO blocks. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/23/2017 08:34 PM, Tom Lane wrote: > Andrew Dunstan writes: >> On 02/23/2017 04:41 PM, Tom Lane wrote: >>> BTW, while I'm looking at this ... isn't gin_enum_cmp broken on its face? >> Yes, that's what I'm trying to fix. > I'd forgotten where this thread started. For a minute there I thought > we had a live bug, rather than a deficiency in code-under-development. > > After thinking about that, I believe that enum_cmp_internal is a rather > considerable hazard. It might not be our only function that requires > fcinfo->flinfo cache space just some of the time not all the time, but > I don't recall anyplace else that could so easily undergo a reasonable > amount of testing without *ever* reaching the case where it requires > that cache space. So I'm now worried that it wouldn't be too hard for > such a bug to get past us. > > I think we should address that by adding a non-bypassable Assert that > the caller did provide flinfo, as in the attached. > > Looks sane. I don't believe there is anywhere in the core code that calls this without fcinfo->flinfo, But obviously I have tickled that with my original patch for the extension. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
Andrew Dunstan writes: > On 02/23/2017 04:41 PM, Tom Lane wrote: >> BTW, while I'm looking at this ... isn't gin_enum_cmp broken on its face? > Yes, that's what I'm trying to fix. I'd forgotten where this thread started. For a minute there I thought we had a live bug, rather than a deficiency in code-under-development. After thinking about that, I believe that enum_cmp_internal is a rather considerable hazard. It might not be our only function that requires fcinfo->flinfo cache space just some of the time not all the time, but I don't recall anyplace else that could so easily undergo a reasonable amount of testing without *ever* reaching the case where it requires that cache space. So I'm now worried that it wouldn't be too hard for such a bug to get past us. I think we should address that by adding a non-bypassable Assert that the caller did provide flinfo, as in the attached. regards, tom lane diff --git a/src/backend/utils/adt/enum.c b/src/backend/utils/adt/enum.c index 8110ee2..b1d2a6f 100644 *** a/src/backend/utils/adt/enum.c --- b/src/backend/utils/adt/enum.c *** enum_cmp_internal(Oid arg1, Oid arg2, Fu *** 263,268 --- 263,277 { TypeCacheEntry *tcache; + /* + * We don't need the typcache except in the hopefully-uncommon case that + * one or both Oids are odd. This means that cursory testing of code that + * fails to pass flinfo to an enum comparison function might not disclose + * the oversight. To make such errors more obvious, Assert that we have a + * place to cache even when we take a fast-path exit. + */ + Assert(fcinfo->flinfo != NULL); + /* Equal OIDs are equal no matter what */ if (arg1 == arg2) return 0; *** enum_cmp(PG_FUNCTION_ARGS) *** 381,392 Oid a = PG_GETARG_OID(0); Oid b = PG_GETARG_OID(1); ! if (a == b) ! PG_RETURN_INT32(0); ! else if (enum_cmp_internal(a, b, fcinfo) > 0) ! PG_RETURN_INT32(1); ! else ! PG_RETURN_INT32(-1); } /* Enum programming support functions */ --- 390,396 Oid a = PG_GETARG_OID(0); Oid b = PG_GETARG_OID(1); ! PG_RETURN_INT32(enum_cmp_internal(a, b, fcinfo)); } /* Enum programming support functions */ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/23/2017 04:41 PM, Tom Lane wrote: > Andrew Dunstan writes: >> I'm not entirely sure how I should replace those DirectFunctionCall2 calls. > Basically what we want is for the called function to be able to use the > fcinfo->flinfo->fn_extra and fcinfo->flinfo->fn_mcxt fields of the > FmgrInfo struct that GIN has set up for the btree_gin support function. > > The fast but somewhat scary way to do it would just be to pass through > the flinfo pointer as-is. Imagine that fmgr.c grows a set of functions > like > > [...] Here's a POC for btree_gin based on my original patch. I just made your function static in btree_gin.c at least for now. I stayed with the DirectFunctionCall2 use in the type-specific routines where calling context wasn't needed (i.e. everything except enums). But it looks more than ugly and highly invasive for btree_gist, and sadly that's my main use case, exclusion constraints with enum fields. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/contrib/btree_gin/Makefile b/contrib/btree_gin/Makefile index 0492091..c6aae26 100644 --- a/contrib/btree_gin/Makefile +++ b/contrib/btree_gin/Makefile @@ -4,13 +4,13 @@ MODULE_big = btree_gin OBJS = btree_gin.o $(WIN32RES) EXTENSION = btree_gin -DATA = btree_gin--1.0.sql btree_gin--unpackaged--1.0.sql +DATA = btree_gin--1.0.sql btree_gin--unpackaged--1.0.sql btree_gin--1.0--1.1.sql PGFILEDESC = "btree_gin - B-tree equivalent GIN operator classes" REGRESS = install_btree_gin int2 int4 int8 float4 float8 money oid \ timestamp timestamptz time timetz date interval \ macaddr inet cidr text varchar char bytea bit varbit \ - numeric + numeric enum ifdef USE_PGXS PG_CONFIG = pg_config diff --git a/contrib/btree_gin/btree_gin--1.0--1.1.sql b/contrib/btree_gin/btree_gin--1.0--1.1.sql new file mode 100644 index 000..3c40ccd --- /dev/null +++ b/contrib/btree_gin/btree_gin--1.0--1.1.sql @@ -0,0 +1,47 @@ +/* contrib/btree_gin/btree_gin--1.0--1.1.sql */ + +-- complain if script is sourced in psql, rather than via CREATE EXTENSION +\echo Use "ALTER EXTENSION btree_gin UPDATE TO '1.1'" to load this file. \quit + +-- +-- +-- +-- enum ops +-- +-- + + +CREATE FUNCTION gin_extract_value_anyenum(anyenum, internal) +RETURNS internal +AS 'MODULE_PATHNAME' +LANGUAGE C STRICT IMMUTABLE; + +CREATE FUNCTION gin_compare_prefix_anyenum(anyenum, anyenum, int2, internal) +RETURNS int4 +AS 'MODULE_PATHNAME' +LANGUAGE C STRICT IMMUTABLE; + +CREATE FUNCTION gin_extract_query_anyenum(anyenum, internal, int2, internal, internal) +RETURNS internal +AS 'MODULE_PATHNAME' +LANGUAGE C STRICT IMMUTABLE; + +CREATE FUNCTION gin_enum_cmp(anyenum, anyenum) +RETURNS int4 +AS 'MODULE_PATHNAME' +LANGUAGE C STRICT IMMUTABLE; + +CREATE OPERATOR CLASS enum_ops +DEFAULT FOR TYPE anyenum USING gin +AS +OPERATOR1 <, +OPERATOR2 <=, +OPERATOR3 =, +OPERATOR4 >=, +OPERATOR5 >, +FUNCTION1 gin_enum_cmp(anyenum,anyenum), +FUNCTION2 gin_extract_value_anyenum(anyenum, internal), +FUNCTION3 gin_extract_query_anyenum(anyenum, internal, int2, internal, internal), +FUNCTION4 gin_btree_consistent(internal, int2, anyelement, int4, internal, internal), +FUNCTION5 gin_compare_prefix_anyenum(anyenum,anyenum,int2, internal), +STORAGE anyenum; diff --git a/contrib/btree_gin/btree_gin.c b/contrib/btree_gin/btree_gin.c index 030b610..4694275 100644 --- a/contrib/btree_gin/btree_gin.c +++ b/contrib/btree_gin/btree_gin.c @@ -25,6 +25,35 @@ typedef struct QueryInfo Datum (*typecmp) (FunctionCallInfo); } QueryInfo; +/* + * Routine to provide context to a data type comparison call. + * Needed for enum support. + */ + +static Datum +CallerContextFunctionCall2(PGFunction func, + FmgrInfo *flinfo, Oid collation, + Datum arg1, Datum arg2) +{ + FunctionCallInfoData fcinfo; + Datumresult; + + InitFunctionCallInfoData(fcinfo, flinfo, 2, collation, NULL, NULL); + + fcinfo.arg[0] = arg1; + fcinfo.arg[1] = arg2; + fcinfo.argnull[0] = false; + fcinfo.argnull[1] = false; + + result = (*func) (&fcinfo); + + /* Check for null result, since caller is clearly not expecting one */ + if (fcinfo.isnull) + elog(ERROR, "function %p returned NULL", (void *) func); + + return result; +} + /*** GIN support functions shared by all datatypes ***/ @@ -112,13 +141,14 @@ gin_btree_compare_prefix(FunctionCallInfo fcinfo) int32 res, cmp; - cmp = DatumGetInt32(DirectFunctionCall2Coll( -data->typecmp, -PG_GET_COLLATION(), - (data->strategy == BTLessStrategyNumber || - data->strategy == BTLessEqualStrategyNumber) -? data->datum : a, -b)); + cmp = DatumGetInt32(CallerContextFunctionCall2( + data->typecmp, + fcinfo->fl
Re: [HACKERS] btree_gin and btree_gist for enums
On 02/23/2017 04:41 PM, Tom Lane wrote: > BTW, while I'm looking at this ... isn't gin_enum_cmp broken on its face? > It's using DirectFunctionCall2 to call enum_cmp, and that's wrong because > DirectFunctionCall2 does not pass through a flinfo but enum_cmp needs to > have one. I've not tested, but I'm certain that this would dump core if > asked to compare odd-numbered enum OIDs. > > Yes, that's what I'm trying to fix. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
Andrew Dunstan writes: > I'm not entirely sure how I should replace those DirectFunctionCall2 calls. Basically what we want is for the called function to be able to use the fcinfo->flinfo->fn_extra and fcinfo->flinfo->fn_mcxt fields of the FmgrInfo struct that GIN has set up for the btree_gin support function. The fast but somewhat scary way to do it would just be to pass through the flinfo pointer as-is. Imagine that fmgr.c grows a set of functions like Datum DontKnowWhatToCallThisFunctionCall2(PGFunction func, FmgrInfo *flinfo, Oid collation, Datum arg1, Datum arg2) { FunctionCallInfoData fcinfo; Datumresult; InitFunctionCallInfoData(fcinfo, flinfo, 2, collation, NULL, NULL); fcinfo.arg[0] = arg1; fcinfo.arg[1] = arg2; fcinfo.argnull[0] = false; fcinfo.argnull[1] = false; result = (*func) (&fcinfo); /* Check for null result, since caller is clearly not expecting one */ if (fcinfo.isnull) elog(ERROR, "function %p returned NULL", (void *) func); return result; } and then you'd just pass through the fcinfo->flinfo you got. The reason this is kind of scary is that it's just blithely assuming that the function won't look at the *other* fields of the FmgrInfo. If it did, it would likely get very confused, since those fields would be describing the GIN support function, not the function we're calling. We could alternatively have this trampoline function set up a fresh, local FmgrInfo struct that it zeroes except for copying fn_extra and fn_mcxt from the caller's struct, and then it copies fn_extra back again on the way out. That's a few more cycles but it would be safer, I think; if the function tried to look at the other fields such as fn_oid it would see obviously bogus data. BTW, while I'm looking at this ... isn't gin_enum_cmp broken on its face? It's using DirectFunctionCall2 to call enum_cmp, and that's wrong because DirectFunctionCall2 does not pass through a flinfo but enum_cmp needs to have one. I've not tested, but I'm certain that this would dump core if asked to compare odd-numbered enum OIDs. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 11/05/2016 03:11 PM, Andrew Dunstan wrote: > > > On 11/05/2016 01:13 PM, Tom Lane wrote: >> Andrew Dunstan writes: >>> On 11/05/2016 11:46 AM, Tom Lane wrote: That may be a good fix for robustness purposes, but it seems pretty horrid from an efficiency standpoint. Where is this call, and should we be modifying it to provide a flinfo? >>> I thought of providing an flinfo, but I couldn't see a simple way to do >>> it that would provide something much longer lived than the function >>> call, in which case it seemed a bit pointless. That's why I asked for >>> assistance :-) >> Hmm. The problem is that the intermediate layer in btree_gist (and >> probably btree_gin too, didn't look) does not pass through the >> FunctionCallInfo data structure that's provided by the GIST AM. >> That wasn't needed up to now, because none of the supported data types >> are complex enough that any cached state would be useful, but trying >> to extend it to enums exposes the shortcoming. >> >> We could fix this, but it would be pretty invasive since it would >> require >> touching just about every function in btree_gist/gin. Not entirely sure >> that it's worth it. On the other hand, the problem might well come up >> again in future, perhaps for a datatype where the penalty for lack of >> a cache is greater --- eg, it would be pretty painful to support >> record_cmp without caching. And the changes ought to be relatively >> mechanical, even if they are extensive. > > > > Yeah. I think it's probably worth doing. btree_gin is probably a > better place to start because it's largely macro-ized. So looking at this we have: static Datum gin_btree_compare_prefix(FunctionCallInfo fcinfo) { Datum a = PG_GETARG_DATUM(0); Datum b = PG_GETARG_DATUM(1); QueryInfo *data = (QueryInfo *) PG_GETARG_POINTER(3); int32 res, cmp; cmp = DatumGetInt32(DirectFunctionCall2Coll( data->typecmp, PG_GET_COLLATION(), (data->strategy == BTLessStrategyNumber || data->strategy == BTLessEqualStrategyNumber) ? data->datum : a, b)); and then the referred to routine in the enum case looks like this: Datum gin_enum_cmp(PG_FUNCTION_ARGS) { Oid a = PG_GETARG_OID(0); Oid b = PG_GETARG_OID(1); int res = 0; if (ENUM_IS_LEFTMOST(a)) { res = (ENUM_IS_LEFTMOST(b)) ? 0 : -1; } else if (ENUM_IS_LEFTMOST(b)) { res = 1; } else { res = DatumGetInt32(DirectFunctionCall2(enum_cmp, ObjectIdGetDatum(a), ObjectIdGetDatum(b))); } PG_RETURN_INT32(res); } I'm not entirely sure how I should replace those DirectFunctionCall2 calls. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On Wed, Nov 23, 2016 at 11:31 AM, Andrew Dunstan wrote: > I won't have time to fix this before the end of the Commitfest The patch is marked as "returned with feedback" in 2016-11 commitfest. Please free to submit an updated patch to the next commitfest. Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] btree_gin and btree_gist for enums
I won't have time to fix this before the end of the Commitfest -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 11/05/2016 01:13 PM, Tom Lane wrote: Andrew Dunstan writes: On 11/05/2016 11:46 AM, Tom Lane wrote: That may be a good fix for robustness purposes, but it seems pretty horrid from an efficiency standpoint. Where is this call, and should we be modifying it to provide a flinfo? I thought of providing an flinfo, but I couldn't see a simple way to do it that would provide something much longer lived than the function call, in which case it seemed a bit pointless. That's why I asked for assistance :-) Hmm. The problem is that the intermediate layer in btree_gist (and probably btree_gin too, didn't look) does not pass through the FunctionCallInfo data structure that's provided by the GIST AM. That wasn't needed up to now, because none of the supported data types are complex enough that any cached state would be useful, but trying to extend it to enums exposes the shortcoming. We could fix this, but it would be pretty invasive since it would require touching just about every function in btree_gist/gin. Not entirely sure that it's worth it. On the other hand, the problem might well come up again in future, perhaps for a datatype where the penalty for lack of a cache is greater --- eg, it would be pretty painful to support record_cmp without caching. And the changes ought to be relatively mechanical, even if they are extensive. Yeah. I think it's probably worth doing. btree_gin is probably a better place to start because it's largely macro-ized. I don't have time right now to do this. I'll try to get to it if nobody else does over then next couple of months. BTW, this patch would be a great deal shorter if you made use of the work done in 40b449ae8. That is, you no longer need to replace the base extension script --- just add an update script and change the default version in the .control file. See fd321a1df for an example. Oh, nice. My work was originally done in March, before this came in. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
Andrew Dunstan writes: > On 11/05/2016 11:46 AM, Tom Lane wrote: >> That may be a good fix for robustness purposes, but it seems pretty horrid >> from an efficiency standpoint. Where is this call, and should we be >> modifying it to provide a flinfo? > I thought of providing an flinfo, but I couldn't see a simple way to do > it that would provide something much longer lived than the function > call, in which case it seemed a bit pointless. That's why I asked for > assistance :-) Hmm. The problem is that the intermediate layer in btree_gist (and probably btree_gin too, didn't look) does not pass through the FunctionCallInfo data structure that's provided by the GIST AM. That wasn't needed up to now, because none of the supported data types are complex enough that any cached state would be useful, but trying to extend it to enums exposes the shortcoming. We could fix this, but it would be pretty invasive since it would require touching just about every function in btree_gist/gin. Not entirely sure that it's worth it. On the other hand, the problem might well come up again in future, perhaps for a datatype where the penalty for lack of a cache is greater --- eg, it would be pretty painful to support record_cmp without caching. And the changes ought to be relatively mechanical, even if they are extensive. BTW, this patch would be a great deal shorter if you made use of the work done in 40b449ae8. That is, you no longer need to replace the base extension script --- just add an update script and change the default version in the .control file. See fd321a1df for an example. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
Andrew Dunstan writes: > The real problem here is that enum_cmp_internal assumes that > fcinfo->flinfo has been set up, and DirectFunctionCallN doesn't, it sets > it to NULL. > The patch below cures the problem. I'm not sure if there is a better > way. Thoughts? That may be a good fix for robustness purposes, but it seems pretty horrid from an efficiency standpoint. Where is this call, and should we be modifying it to provide a flinfo? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 11/04/2016 04:14 PM, Emre Hasegeli wrote: The GiST part of it doesn't really work. The current patch compares oids. We need to change it to compare enumsortorder. I could make it fail easily: regression=# create type e as enum ('0', '2', '3'); CREATE TYPE regression=# alter type e add value '1' after '0'; ALTER TYPE regression=# create table t as select (i % 4)::text::e from generate_series(0, 10) as i; SELECT 11 regression=# create index on t using gist (e); SEGFAULT It calls the enum routines which use the sortorder if necessary. It's not necessary to use sortorder in the case of evenly numbered enum oids as we have carefully arranged for them to be directly comparable, and all the initially allocated oids are even, a very nifty efficiency measure devised by Tom when we added enum extension. The real problem here is that enum_cmp_internal assumes that fcinfo->flinfo has been set up, and DirectFunctionCallN doesn't, it sets it to NULL. The patch below cures the problem. I'm not sure if there is a better way. Thoughts? cheers andrew diff --git a/src/backend/utils/adt/enum.c b/src/backend/utils/adt/enum.c index 47d5355..64ba7a7 100644 --- a/src/backend/utils/adt/enum.c +++ b/src/backend/utils/adt/enum.c @@ -261,7 +261,7 @@ enum_send(PG_FUNCTION_ARGS) static int enum_cmp_internal(Oid arg1, Oid arg2, FunctionCallInfo fcinfo) { - TypeCacheEntry *tcache; + TypeCacheEntry *tcache = NULL; /* Equal OIDs are equal no matter what */ if (arg1 == arg2) @@ -277,7 +277,8 @@ enum_cmp_internal(Oid arg1, Oid arg2, FunctionCallInfo fcinfo) } /* Locate the typcache entry for the enum type */ - tcache = (TypeCacheEntry *) fcinfo->flinfo->fn_extra; + if (fcinfo->flinfo != NULL) + tcache = (TypeCacheEntry *) fcinfo->flinfo->fn_extra; if (tcache == NULL) { HeapTuple enum_tup; @@ -296,7 +297,8 @@ enum_cmp_internal(Oid arg1, Oid arg2, FunctionCallInfo fcinfo) ReleaseSysCache(enum_tup); /* Now locate and remember the typcache entry */ tcache = lookup_type_cache(typeoid, 0); - fcinfo->flinfo->fn_extra = (void *) tcache; + if (fcinfo->flinfo != NULL) + fcinfo->flinfo->fn_extra = (void *) tcache; } /* The remaining comparison logic is in typcache.c */ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 11/04/2016 04:14 PM, Emre Hasegeli wrote: Here is a patch to add enum support to btree_gin and btree_gist. I didn't include distance operations, as I didn't think it terribly important, and there isn't a simple way to compute it sanely and efficiently, so no KNN support. I don't think we can implement a meaningful distance operator for enums. Probably. This time including the data file for the gist regression tests. It doesn't apply to HEAD anymore. I have tested it on b12fd41. I'll fix the bitrot.. The GiST part of it doesn't really work. The current patch compares oids. We need to change it to compare enumsortorder. I could make it fail easily: ouch. Ok,I'll work on this. +ALTER OPERATOR FAMILY gist_enum_ops USING gist ADD Why don't we just create it with those changes? I'll take a look. + * Use a similar trick to that used for numeric for enums, since we don't Do you mean "similar trick that is used" or "trick similar to numeric"? This is perfectly legal English. It means "... a similar trick to the one used for numeric ... ". I'll change it to that if you think it's clearer. + * actually know the leftmost value of any enum without knowing the concrete + * type, so we use a dummy leftmost value of InvalidOid. +return DatumGetBool( +DirectFunctionCall2(enum_gt, ObjectIdGetDatum(*((const Oid *) a)), ObjectIdGetDatum(*((const Oid *) b))) +); Wouldn't it be nice to refactor enum_cmp_internal() to accept typcache and use it there like the rangetypes? Not sure. Will look. Thanks for the review. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
>> Here is a patch to add enum support to btree_gin and btree_gist. I didn't >> include distance operations, as I didn't think it terribly important, and >> there isn't a simple way to compute it sanely and efficiently, so no KNN >> support. I don't think we can implement a meaningful distance operator for enums. > This time including the data file for the gist regression tests. It doesn't apply to HEAD anymore. I have tested it on b12fd41. The GiST part of it doesn't really work. The current patch compares oids. We need to change it to compare enumsortorder. I could make it fail easily: > regression=# create type e as enum ('0', '2', '3'); > CREATE TYPE > regression=# alter type e add value '1' after '0'; > ALTER TYPE > regression=# create table t as select (i % 4)::text::e from > generate_series(0, 10) as i; > SELECT 11 > regression=# create index on t using gist (e); > SEGFAULT > +ALTER OPERATOR FAMILY gist_enum_ops USING gist ADD Why don't we just create it with those changes? > + * Use a similar trick to that used for numeric for enums, since we don't Do you mean "similar trick that is used" or "trick similar to numeric"? > + * actually know the leftmost value of any enum without knowing the concrete > + * type, so we use a dummy leftmost value of InvalidOid. > +return DatumGetBool( > +DirectFunctionCall2(enum_gt, > ObjectIdGetDatum(*((const Oid *) a)), ObjectIdGetDatum(*((const Oid *) b))) > +); Wouldn't it be nice to refactor enum_cmp_internal() to accept typcache and use it there like the rangetypes? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 03/24/2016 12:40 PM, Matt Wilmas wrote: (I notice the btree_gin docs don't mention "numeric," but it works.) Numeric does work - we have regression tests to prove it, do we should fix the docs. But I'm also curious to know why apparently we don't have distance operator support for numeric. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 03/24/2016 12:40 PM, Matt Wilmas wrote: It would be *really* nice to have this in 9.6. It seems it's simply filling out functionality that should already be there, right? An oversight/bug fix so it works "as advertised?" :-) I think that would be stretching the process a bit far. I'm certainly not prepared to commit it unless there is a general consensus to make an exception for it. Is any other btree type-compatibility missing from these modules? (I notice the btree_gin docs don't mention "numeric," but it works.) uuid would be another type that should be fairly easily covered but isn't. I just looked over the patch, and the actual code addition/changes seem pretty small and straightforward (or am I wrong?). Yes, sure, it's all fairly simple. Took me a little while to understand what I was doing, but once I did it was pretty much plain sailing. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
Hi Andrew, all, First message here! I didn't get around to sending an intro/"thank you all" e-mail yet, and a small performance (?) patch+idea(s)... (CPU stuff, since I don't otherwise know much about PG internals.) Anyway... - Original Message - From: "Andrew Dunstan" Sent: Thursday, March 17, 2016 Here is a patch to add enum support to btree_gin and btree_gist. I didn't include distance operations, as I didn't think it terribly important, and there isn't a simple way to compute it sanely and efficiently, so no KNN support. Major thanks for coming up with this a week after your "enums and indexing" message. My love of all things Postgres has a lot to do with being able to do nearly anything one could want (coming from MySQL :-/)! So I was like, "Wait, what?" when you brought up the subject, since this was something I hadn't actually tried, but was planning to use btree_gin/gist with a good mix of stuff, including enums (array and scalar). At first I thought it was just arrays of enums, until this patch for the contrib extensions (for the btree parts with GIN/GiST; plus GiST exclusion?), and your blog posts... [1][2] I wasn't certain from the second post whether your array solution can be used today without this patch (yes, just tried 9.5). So, two separate issues, and the patch addresses scalar stuff, IIUC. It would be *really* nice to have this in 9.6. It seems it's simply filling out functionality that should already be there, right? An oversight/bug fix so it works "as advertised?" :-) Is any other btree type-compatibility missing from these modules? (I notice the btree_gin docs don't mention "numeric," but it works.) I just looked over the patch, and the actual code addition/changes seem pretty small and straightforward (or am I wrong?). And it's not changing anything in the core, so... Well, just wanted to argue my case! :^) [1] http://adpgtech.blogspot.com/2016/03/gist-and-gin-support-for-enums.html [2] http://adpgtech.blogspot.com/2016/03/gin-indexing-array-of-enums.html cheers andrew Thanks, Matt -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On 03/18/2016 09:54 AM, Robert Haas wrote: On Thu, Mar 17, 2016 at 11:21 AM, Andrew Dunstan wrote: On 03/17/2016 06:44 AM, Andrew Dunstan wrote: Here is a patch to add enum support to btree_gin and btree_gist. I didn't include distance operations, as I didn't think it terribly important, and there isn't a simple way to compute it sanely and efficiently, so no KNN support. This time including the data file for the gist regression tests. Are you intending this as a 9.7 submission? Because it's pretty late for 9.6. I guess so. I would certainly find it hard to argue that it should be in 9.6 unless there is a consensus on it. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree_gin and btree_gist for enums
On Thu, Mar 17, 2016 at 11:21 AM, Andrew Dunstan wrote: > On 03/17/2016 06:44 AM, Andrew Dunstan wrote: >> >> Here is a patch to add enum support to btree_gin and btree_gist. I didn't >> include distance operations, as I didn't think it terribly important, and >> there isn't a simple way to compute it sanely and efficiently, so no KNN >> support. > > This time including the data file for the gist regression tests. Are you intending this as a 9.7 submission? Because it's pretty late for 9.6. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers