Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-19 Thread Mark Rofail
I have a concern that after supporting UPDATE/DELETE CASCADE, the
performance would drop.

On Thu, Jul 27, 2017 at 12:54 PM, Alexander Korotkov 
 wrote:
>
> I wonder how may RI trigger work so fast if it has to do some job besides
> index search with no results?
>

Best Regards,
Mark Rofail


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-08 Thread Mark Rofail
On Tue, Aug 8, 2017 at 2:25 PM, Alexander Korotkov 
wrote:
>
> Do we already assume that default btree opclass for array element type
> matches PK opclass when using @>> operator on UPDATE/DELETE of referenced
> table?
>
I believe so, since it's a polymorphic function.


> If so, we don't introduce additional restriction here...
>
You mean to remove the wrapper query ?


> GROUP BY would also use default btree/hash opclass for element type.  It
> doesn't differ from DISTINCT from that point.
>
Then there's no going around this limitation,

Best Regard,
Mark Rofail


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-05 Thread Mark Rofail
This is the query fired upon any UPDATE/DELETE for RI checks:

SELECT 1 FROM ONLY  x WHERE pkatt1 = $1 [AND ...] FOR KEY SHARE OF
x

in  the case of foreign key arrays, it's wrapped in this query:

SELECT 1 WHERE
(SELECT count(DISTINCT y) FROM unnest($1) y)
= (SELECT count(*) FROM () z)

This is where the limitation appears, the DISTINCT keyword. Since in
reality, count(DISTINCT) will fall back to the default btree opclass for
the array element type regardless of the opclass indicated in the access
method. Thus I believe going around DISTINCT is the way to go.

This is what I came up with:

SELECT 1 WHERE
(SELECT COUNT(*)
FROM
(
SELECT y
FROM unnest($1) y
GROUP BY y
)
)
= (SELECT count(*) () z)

I understand there might be some syntax errors but this is just a proof of
concept.

Is this the right way to go?
It's been a week and I don't think I made significant progress. Any
pointers?

Best Regards,
MarkRofail


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-03 Thread Mark Rofail
To better understand a limitation I ask 5 questions

What is the limitation?
Why is there a limitation?
Why is it a limitation?
What can we do?
Is it feasible?

Through some reading:

*What is the limitation?*
presupposes that count(distinct y) has exactly the same notion of equality
that the PK unique index has. In reality, count(distinct) will fall back to
the default btree opclass for the array element type.

the planner may choose an optimization of this sort when the index's
opclass matches the one
DISTINCT will use, ie the default for the data type.

*Why is there a limitation?*
necessary because ri_triggers.c relies on COUNT(DISTINCT x) on the element
type, as well as on array_eq() on the array type, and we need those
operations to have the same notion of equality that we're using otherwise.

*Why is it a limitation?*
That's wrong: DISTINCT should use the equality operator that corresponds
to the index' operator class instead, not the default one.

*What can we do ?*
I'm sure that we can replace array_eq() with a newer polymorphic version
but I don't know how we could get around COUNT(DISTINCT x)

*Is it feasible? *
I don't think I have the experience to answer that

Best Regards,
Mark Rofail


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-31 Thread Mark Rofail
On Mon, Jul 31, 2017 at 5:18 PM, Alvaro Herrera 
wrote:

> Tom Lane wrote:
> > Alvaro Herrera  writes:
> > > ...  However, when you create an index, you can
> > > indicate which operator class to use, and it may not be the default
> one.
> > > If a different one is chosen at index creation time, then a query using
> > > COUNT(distinct) will do the wrong thing, because DISTINCT will select
> > > an equality type using the type's default operator class, not the
> > > equality that belongs to the operator class used to create the index.
> >
> > > That's wrong: DISTINCT should use the equality operator that
> corresponds
> > > to the index' operator class instead, not the default one.
> >
> > Uh, what?  Surely the semantics of count(distinct x) *must not* vary
> > depending on what indexes happen to be available.
>
> Err ...
>
> > I think what you meant to say is that the planner may only choose an
> > optimization of this sort when the index's opclass matches the one
> > DISTINCT will use, ie the default for the data type.


I understand the problem. I am currently researching how to resolve it.

Best Regards,
Mark Rofail


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-29 Thread Mark Rofail
These are limitations of the patch ordered by importance:

✗ presupposes that count(distinct y) has exactly the same notion of
equality that the PK unique index has. In reality, count(distinct) will
fall back to the default btree opclass for the array element type.

- Supported actions:
 ✔ NO ACTION
 ✔ RESTRICT
 ✗ CASCADE
 ✗ SET NULL
 ✗ SET DEFAULT

✗ coercion is unsopported. i.e. a numeric can't refrence int8

✗ Only one "ELEMENT" column allowed in a multi-column key

✗ undesirable dependency on default opclass semantics in the patch, which
is that it supposes it can use array_eq() to detect whether or not the
referencing column has changed.  But I think that can be fixed without
undue pain by providing a refactored version of array_eq() that can be told
which element-comparison function to use

✗ cross-type FKs are unsupported

-- Resolved limitations =

✔ fatal performance issues.  If you issue any UPDATE or DELETE against the
PK table, you get a query like this for checking to see if the RI
constraint would be violated:
SELECT 1 FROM ONLY fktable x WHERE $1 = ANY (fkcol) FOR SHARE OF x;
/* Changed into SELECT 1 FROM ONLY fktable x WHERE $1 @> fkcol FOR SHARE OF
x; */

-- 

Can someone help me understand the first limitation?


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-28 Thread Mark Rofail
On Fri, Jul 28, 2017 at 1:19 PM, Erik Rijkers  wrote:

> One small thing while building docs:
>
> $  cd doc/src/sgml && make html
> osx -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . -x lower
> postgres.sgml >postgres.xml.tmp
> osx:ref/create_table.sgml:960:100:E: document type does not allow element
> "VARLISTENTRY" here
> Makefile:147: recipe for target 'postgres.xml' failed
> make: *** [postgres.xml] Error 1
>

I will work on it.

How's the rest of the patch ?


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-27 Thread Mark Rofail
On Thu, Jul 27, 2017 at 7:30 PM, Alexander Korotkov 
wrote:

> Oh, ok.  I missed that.
>>
> Could you remind me why don't we have DELETE CASCADE?  I understand that
> UPDATE CASCADE is problematic because it's unclear which way should we
> delete elements from array.  But what about DELETE CASCADE?
>

Honestly, I didn't touch that part of the patch. It's very interesting
though, I think it would be great to spend the rest of GSoC in it.

Off the top of my head though, there's many ways to go about DELETE
CASCADE. You could only delete the member of the referencing array or the
whole array. I think there's a lot of options the user might want to
consider and it's hard to generalize to DELETE CASCADE. Maybe new grammar
would be introduced here ?|

Best Regards,
Mark Rofail


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-27 Thread Mark Rofail
On Thu, Jul 27, 2017 at 7:15 PM, Erik Rijkers  wrote:

> It would help (me at least) if you could be more explicit about what
> exactly each instance is.
>

I apologize, I thought it was clear through the context.

I meant by the original patch is all the work done before my GSoC project.
The latest of which, was submitted by Tom Lane[1]. And rebased here[2].

The new patch is the latest one submitted by me[3].

And the new patch with index is the same[3], but with a GIN index built
over it. CREATE INDEX ON fktableforarray USING gin (fktest array_ops);

[1] https://www.postgresql.org/message-id/28617.1351095...@sss.pgh.pa.us
[2]
https://www.postgresql.org/message-id/CAJvoCutcMEYNFYK8Hdiui-M2y0ZGg%3DBe17fHgQ%3D8nHexZ6ft7w%40mail.gmail.com
[3]
https://www.postgresql.org/message-id/CAJvoCuuoGo5zJTpmPm90doYTUWoeUc%2BONXK2%2BH_vxsi%2BZi09bQ%40mail.gmail.com


Best Regards,
Mark Rofail


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-27 Thread Mark Rofail
On Thu, Jul 27, 2017 at 12:54 PM, Alexander Korotkov 
wrote:
>
> How many rows of FK table were referencing the PK table row you're
> updating/deleting.
> I wonder how may RI trigger work so fast if it has to do some job besides
> index search with no results?
>
The problem here is that the only to option for the foreign key arrays are
NO ACTION and RESTRICT which don't allow me to update/delete a refrenced
row in the PK Table. the EXPLAIN ANALYZE only tells me that this violates
the FK constraint.

So we have two options. Either implement CASCADE or if there's a
configration for EXPLAIN to show costs even if it violates the FK
constraints.


> I think we should also vary the number of referencing rows.
>
The x axis is the number if refrencing rows in the FK table


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-24 Thread Mark Rofail
It certainly is, thank you for the heads up. I included a note to encourage
the user to index the referencing column instead.

On Sun, Jul 23, 2017 at 4:41 AM, Robert Haas  wrote:
>
> This is a jumbo king-sized can of worms, and even a very experienced
> contributor would likely find it extremely difficult to sort all of
> the problems that would result from a change in this area.


Best Regards,
Mark Rofail


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-24 Thread Mark Rofail
>
> However, there is a bug that prevented me from testing the third scenario,
> I assume there's an issue of incompatible types problem since the right
> operand type is anyelement and the supporting procedures expect anyarray.
> I am working on debugging it right now.
>

I have also solved the bug that prevented me from performance testing the
New Patch with the Index in place.

Here is a summary of the results:

A-  Original Patch
DELETE Average Execution time = 3.508 ms
UPDATE Average Execution time = 3.239 ms

B- New Patch
DELETE Average Execution time = 4.970 ms
UPDATE Average Execution time = 4.170 ms

C- With Index
DELETE Average Execution time = 0.169 ms
UPDATE Average Execution time = 0.147 ms


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-19 Thread Mark Rofail
On Wed, Jul 19, 2017 at 10:08 PM, Alvaro Herrera 
wrote:

> So let's step back a bit,
> get a patch that works for the case where the types match on both sides
> of the FK, then we review that patch; if all is well, we can discuss the
> other problem as a stretch goal.


Agreed. This should be a future improvment.

I think the next step should be testing the performnce before/after the
modifiactions.


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-19 Thread Mark Rofail
On Wed, Jul 19, 2017 at 7:28 PM, Robert Haas  wrote:

> Why do we have to solve that limitation?


Since the regress test labled element_foreing_key fails now that I made the
RI queries utilise @(anyarray, anyelement), that means it's not functioning
as it is meant to be.


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-19 Thread Mark Rofail
*To summarise,* the options we have to solve the limitation of the
@>(anyarray , anyelement) where it produces the following error: operator
does not exist: integer[] @> smallint

*Option 1: *Multiple Operators
Have separate operators for every combination of datatypes instead of a
single polymorphic definition (i.e int4[] @>> int8, int4[] @>> int4, int4[]
@>> int2, int4[] @>> numeric.)

Drawback: High maintenance.


*Option 2: *Explicit casting
Where we compare the datatype of the 2 operands and cast with the
appropriate datatype

Drawback: figuring out the appropriate cast may require considerable
computation


*Option 3:* Unsafe Polymorphic datatypes
This a little out there. But since @>(anyarray, anyelement) have to resolve
to the same datatype. How about defining new datatypes without this
constraint? Where we handle the datatypes ourselves? It would ve something
like @>(unsafeAnyarray, unsafeAnyelement).

Drawback: a lot of defensive programming has to be implemented to guard
against any exception.


*Another thing*
Until this is settled, another thing I have to go through is performance
testing. To provide evidence that all we did actually enhances the
performance of the RI checks. How can I go about this?

Best Regards,
Mark Rofail


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-19 Thread Mark Rofail
On Tue, Jul 18, 2017 at 11:14 PM, Alvaro Herrera 
wrote:
>
> Why did we add an operator and not a support
> procedure?


I thought the support procedures were constant within an opclass. They
implement the mandotary function required of an opclass. I don't see why we
would need to implement new ones since they already deal with the lefthand
operand which is the refrencing coloumn and is always an array so anyarray
would suffice.

Also the support procedure don't interact with the left and right operands
simultanously. And we want to target the combinations of  int4[] @>> int8,
int4[] @>> int4, int4[] @>> int2, int4[] @>> numeric.

So I think implementing operators is the way to go.

Best Regards,
Mark Rofail.


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-18 Thread Mark Rofail
On Tue, 18 Jul 2017 at 7:43 pm, Alexander Korotkov 
wrote:

>  separate operators for int4[] @>> int8, int4[] @>> int4, int4[] @>> int2,
> int4[] @>> numeric.
>

My only comment on the separate operators is its high maintenance.  Any new
datatype introduced a corresponding operator should be created.


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-18 Thread Mark Rofail
On Tue, 18 Jul 2017 at 7:43 pm, Alexander Korotkov 
wrote:

> On T upue, Jul 18, 2017 at 2:24 AM, Mark Rofail 
> wrote:
>
>> On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera <
>> alvhe...@2ndquadrant.com> wrote:
>>>
>>> We have one opclass for each type combination -- int4 to int2, int4 to
>>> int4, int4 to int8, etc.  You just need to add the new strategy to all
>>> the opclasses.
>>
>>
>>  I tried this approach by manually declaring the operator multiple of
>> times in pg_amop.h (src/include/catalog/pg_amop.h)
>>
>> so instead of the polymorphic declaration
>> DATA(insert ( 2745   2277 2283 5 s 6108 2742 0 )); /* anyarray @>>
>> anyelem */
>>
>> multiple declarations were used, for example for int4[] :
>> DATA(insert ( 2745   1007 20 5 s 6108 2742 0 )); /* int4[] @>> int8 */
>> DATA(insert ( 2745   1007 23 5 s 6108 2742 0 )); /* int4[] @>> int4 */
>> DATA(insert ( 2745   1007 21 5 s 6108 2742 0 )); /* int4[] @>> int2 */
>> DATA(insert ( 2745   1007 1700 5 s 6108 2742 0 ));/* int4[] @>> numeric
>> */
>>
>> However, make check produced:
>> could not create unique index "pg_amop_opr_fam_index"
>> Key (amopopr, amoppurpose, amopfamily)=(6108, s, 2745) is duplicated.
>>
>> Am I implementing this the wrong way or do we need to look for another
>> approach?
>>
>
> The problem is that you need to have not only opclass entries for the
> operators, but also operators themselves.  I.e. separate operators for
> int4[] @>> int8, int4[] @>> int4, int4[] @>> int2, int4[] @>> numeric.  You
> tried to add multiple pg_amop rows for single operator and consequently get
> unique index violation.
>
> Alvaro, do you think we need to define all these operators?  I'm not
> sure.  If even we need it, I think
> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-17 Thread Mark Rofail
On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera 
 wrote:
>
> We have one opclass for each type combination -- int4 to int2, int4 to
> int4, int4 to int8, etc.  You just need to add the new strategy to all
> the opclasses.


 I tried this approach by manually declaring the operator multiple of times
in pg_amop.h (src/include/catalog/pg_amop.h)

so instead of the polymorphic declaration
DATA(insert ( 2745   2277 2283 5 s 6108 2742 0 )); /* anyarray @>> anyelem
*/

multiple declarations were used, for example for int4[] :
DATA(insert ( 2745   1007 20 5 s 6108 2742 0 )); /* int4[] @>> int8 */
DATA(insert ( 2745   1007 23 5 s 6108 2742 0 )); /* int4[] @>> int4 */
DATA(insert ( 2745   1007 21 5 s 6108 2742 0 )); /* int4[] @>> int2 */
DATA(insert ( 2745   1007 1700 5 s 6108 2742 0 ));/* int4[] @>> numeric */

However, make check produced:
could not create unique index "pg_amop_opr_fam_index"
Key (amopopr, amoppurpose, amopfamily)=(6108, s, 2745) is duplicated.

Am I implementing this the wrong way or do we need to look for another
approach?
diff --git a/src/backend/access/gin/ginarrayproc.c b/src/backend/access/gin/ginarrayproc.c
index a5238c3af5..9d6447923d 100644
--- a/src/backend/access/gin/ginarrayproc.c
+++ b/src/backend/access/gin/ginarrayproc.c
@@ -24,6 +24,7 @@
 #define GinContainsStrategy		2
 #define GinContainedStrategy	3
 #define GinEqualStrategy		4
+#define GinContainsElemStrategy	5
 
 
 /*
@@ -43,7 +44,7 @@ ginarrayextract(PG_FUNCTION_ARGS)
 	bool	   *nulls;
 	int			nelems;
 
-	get_typlenbyvalalign(ARR_ELEMTYPE(array),
+	get_typlenbyvalalign(ARR_ELEMTYPE(array),	
 		 &elmlen, &elmbyval, &elmalign);
 
 	deconstruct_array(array,
@@ -110,6 +111,11 @@ ginqueryarrayextract(PG_FUNCTION_ARGS)
 		case GinOverlapStrategy:
 			*searchMode = GIN_SEARCH_MODE_DEFAULT;
 			break;
+		case GinContainsElemStrategy:
+			/* only items that match the queried element 
+are considered candidate  */
+			*searchMode = GIN_SEARCH_MODE_DEFAULT;
+			break;
 		case GinContainsStrategy:
 			if (nelems > 0)
 *searchMode = GIN_SEARCH_MODE_DEFAULT;
@@ -171,6 +177,7 @@ ginarrayconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
+		case GinContainsElemStrategy:
 		case GinContainsStrategy:
 			/* result is not lossy */
 			*recheck = false;
@@ -258,7 +265,8 @@ ginarraytriconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
-		case GinContainsStrategy:
+			case GinContainsElemStrategy:
+			case GinContainsStrategy:
 			/* must have all elements in check[] true, and no nulls */
 			res = GIN_TRUE;
 			for (i = 0; i < nkeys; i++)
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 34dadd6e19..8c9eb0c676 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -4232,6 +4232,117 @@ arraycontained(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(result);
 }
 
+/*
+ * array_contains_elem : checks an array for a spefific element
+ */
+static bool
+array_contains_elem(AnyArrayType *array, Datum elem, Oid element_type,
+bool element_isnull, Oid collation,	void **fn_extra)
+{
+	Oid 		arr_type = AARR_ELEMTYPE(array);
+	TypeCacheEntry *typentry;
+	int 		nelems;
+	int			typlen;
+	bool		typbyval;
+	char		typalign;
+	int			i;
+	array_iter 	it1;
+	FunctionCallInfoData locfcinfo;
+
+	if (arr_type != element_type)
+		ereport(ERROR,
+(errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("cannot compare different element types")));
+	
+	if (element_isnull)
+		return false;
+		
+	/*
+	 * We arrange to look up the equality function only once per series of
+	 * calls, assuming the element type doesn't change underneath us.  The
+	 * typcache is used so that we have no memory leakage when being used as
+	 * an index support function.
+	 */
+	typentry = (TypeCacheEntry *)*fn_extra;
+	if (typentry == NULL ||
+		typentry->type_id != arr_type)
+	{
+		typentry = lookup_type_cache(arr_type,
+	 TYPECACHE_EQ_OPR_FINFO);
+		if (!OidIsValid(typentry->eq_opr_finfo.fn_oid))
+			ereport(ERROR,
+	(errcode(ERRCODE_UNDEFINED_FUNCTION),
+	 errmsg("could not identify an equality operator for type %s",
+			format_type_be(arr_type;
+		*fn_extra = (void *)typentry;
+	}
+	typlen = typentry->typlen;
+	typbyval = typentry->typbyval;
+	typalign = typentry->typalign;
+
+	/*
+	 * Apply the comparison operator to each pair of array elements.
+	 */
+	InitFunctionCallInfoData(locfcinfo, &typentry->eq_opr_finfo, 2,
+			 collation, NULL, NULL);
+
+	/* Loop over source data */
+	nelems = ArrayGetNItems(AARR_NDIM(array), AARR_DIMS(array));
+	array_iter_setup(&it1, array);
+
+	for (i = 0; i < nelems; i++)
+	{
+		Datum elt1;
+		bool isnull;
+		bool oprresult;
+
+		/* Get element, checking for NULL */
+		elt1 = array_iter_next(&it1, &isnull, i, typlen, typbyval, typalign);
+
+		/*
+		 * We assume that the comparison operator is strict, so a NULL can't
+		 * match anything.  XXX this diverges from the "NULL=NULL" behavior of
+		 * array_eq, should we act like that?
+		 */
+		if (isnull)
+			

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-14 Thread Mark Rofail
On Wed, Jul 12, 2017 at 2:30 PM, Mark Rofail  wrote:

> On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera  > wrote:
>>
>> We have one opclass for each type combination -- int4 to int2, int4 to
>> int4, int4 to int8, etc.  You just need to add the new strategy to all
>> the opclasses.
>>
>
> Can you clarify this solution ? I think another solution would be external
> casting
>
>>
>> If external casting is to be used. If for example the two types in
question are smallint and integer. Would a function get_common_type(Oid
leftopr, Oid rightopr) be useful ?, that given the two types return the
"common" type between the two in this case integer.

Best Regards,
 Mark Rofail


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-12 Thread Mark Rofail
On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera 
wrote:
>
> We have one opclass for each type combination -- int4 to int2, int4 to
> int4, int4 to int8, etc.  You just need to add the new strategy to all
> the opclasses.
>

Can you clarify this solution ? I think another solution would be external
casting

BTW now that we've gone through this a little further, it's starting to
> look like a mistake to me to use the same @> operator for (anyarray,
> anyelement) than we use for (anyarray, anyarray).


I agree. Changed to @>>

Best Regards,
Mark Rofail


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-11 Thread Mark Rofail
here are the modifications to ri_triggers.c

On Wed, Jul 12, 2017 at 12:26 AM, Mark Rofail 
wrote:
>
> *What I did *
>
>- now the RI checks utilise the @>(anyarray, anyelement)
>
> Best Regards,
> Mark Rofail
>
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 3a25ba52f3..2d2b8e6a4f 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -2650,7 +2650,7 @@ quoteRelationName(char *buffer, Relation rel)
  * ri_GenerateQual --- generate a WHERE clause equating two variables
  *
  * The idea is to append " sep leftop op rightop" to buf, or if fkreftype is
- * FKCONSTR_REF_EACH_ELEMENT, append " sep leftop op ANY(rightop)" to buf.
+ * FKCONSTR_REF_EACH_ELEMENT, append " sep leftop <@ rightop" to buf.
  *
  * The complexity comes from needing to be sure that the parser will select
  * the desired operator.  We always name the operator using
@@ -2694,21 +2694,34 @@ ri_GenerateQual(StringInfo buf,
  	else
  		oprright = operform->oprright;
  
-	appendStringInfo(buf, " %s %s", sep, leftop);
-	if (leftoptype != operform->oprleft)
-		ri_add_cast_to(buf, operform->oprleft);
- 
- 	appendStringInfo(buf, " OPERATOR(%s.%s) ",
+ 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT){
+		appendStringInfo(buf, " %s %s", sep, rightop);
+
+		if (rightoptype != oprright)
+ 			ri_add_cast_to(buf, oprright);
+
+		appendStringInfo(buf, " @> ");
+
+		appendStringInfoString(buf, leftop);
+
+		if (leftoptype != operform->oprleft)
+			ri_add_cast_to(buf, operform->oprleft);
+	 }	
+	else{
+		appendStringInfo(buf, " %s %s", sep, leftop);
+
+		if (leftoptype != operform->oprleft)
+			ri_add_cast_to(buf, operform->oprleft);
+
+		appendStringInfo(buf, " OPERATOR(%s.%s) ",
  	 quote_identifier(nspname), oprname);
- 
- 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT)
- 		appendStringInfoString(buf, "ANY (");
- 	appendStringInfoString(buf, rightop);
- 	if (rightoptype != oprright)
- 		ri_add_cast_to(buf, oprright);
- 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT)
- 		appendStringInfoChar(buf, ')');
 
+		appendStringInfoString(buf, rightop);
+
+		if (rightoptype != oprright)
+ 			ri_add_cast_to(buf, oprright);
+	}
+	
 	ReleaseSysCache(opertup);
 }
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-11 Thread Mark Rofail
On Sun, Jul 9, 2017 at 7:42 PM, Alexander Korotkov 
wrote:

> We may document that GIN index is required to accelerate RI queries for
> array FKs.  And users are encouraged to manually define them.
> It's also possible to define new option when index on referencing
> column(s) would be created automatically.  But I think this option should
> work the same way for regular FKs and array FKs.
>

I just thought because GIN index is suited for composite elements, it would
be appropriate for array FKs.

So we should leave it to the user ? I think tht would be fine too.

*What I did *

   - now the RI checks utilise the @>(anyarray, anyelement)
  - however there's a small problem:
  operator does not exist: integer[] @> smallint
  I assume that external casting would be required here. But how can I
  downcast smallint to integer or interger to numeric automatically ?

*What I plan to do*

   - work on the above mentioned buy/limitation
   - otherwise, I think this concludes limitation #5

fatal performance issues.  If you issue any UPDATE or DELETE against the PK
> table, you get a query like this for checking to see if the RI constraint
> would be violated:

SELECT 1 FROM ONLY fktable x WHERE $1 = ANY (fkcol) FOR SHARE OF x;.

or is there anything remaining ?

Best Regards,
Mark Rofail


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-09 Thread Mark Rofail
On Sun, Jul 9, 2017 at 2:38 AM, Alexander Korotkov 
wrote:

> Could you, please, specify idea of what you're implementing in more
> detail?
>

Ultimatley we would like an indexed scan instead of a sequential scan, so I
thought we needed to index the FK array columns first.


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-08 Thread Mark Rofail
* What I am working on*

   - since we want to create an index on the referencing column, I am
   working on firing a 'CREATE INDEX' query programatically right after the
   'CREATE TABLE' query
   - The problem I ran into is how to specify my Strategy (
  GinContainsElemStrategy) within the CREATE INDEX query. For
example: CREATE
  INDEX ON fktable USING gin (fkcolumn array_ops)
  Where does the strategy number fit?
  - The patch is attached here, is the approach I took to creating an
  index programmatically, correct?

Best Regard,
Mark Rofail
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index dc18fd1eae..085b63aa98 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -7139,6 +7139,31 @@ ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel,
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_FOREIGN_KEY),
 	 errmsg("array foreign keys support only NO ACTION and RESTRICT actions")));
+
+		IndexStmt *stmt = makeNode(IndexStmt);
+		stmt->unique = false; /* is index unique? Nope, should allow duplicates*/
+		stmt->concurrent = false; /* should this be a concurrent index build? we want 
+	to lock out writes on the table until it's done. */
+		stmt->idxname = NULL; 		/* let the idxname be generated */ 
+		stmt->relation = /* relation name */;
+		stmt->accessMethod = "gin";	/* name of access method: GIN */
+		stmt->indexParams = /* column name + */"array_ops";
+		stmt->options = NULL;
+		stmt->tableSpace = NULL; 	/* NULL for default */
+		stmt->whereClause = NULL;
+		stmt->excludeOpNames = NIL;
+		stmt->idxcomment = NULL;
+		stmt->indexOid = InvalidOid;
+		stmt->oldNode = InvalidOid; /* relfilenode of existing storage, if any: None*/
+		stmt->primary = false; 		/* is index a primary key? Nope */
+		stmt->isconstraint = false; /* is it for a pkey/unique constraint? Nope */
+		stmt->deferrable = false;
+		stmt->initdeferred = false;
+		stmt->transformed = false;
+		stmt->if_not_exists = false; /* just do nothing if index already exists? Nope 
+	(this shouldn't happen)*/
+
+		ATExecAddIndex(tab, rel, stmt, true, lockmode);
 	}
 
  	/*
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 3a25ba52f3..0045f64c9e 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -2650,7 +2650,7 @@ quoteRelationName(char *buffer, Relation rel)
  * ri_GenerateQual --- generate a WHERE clause equating two variables
  *
  * The idea is to append " sep leftop op rightop" to buf, or if fkreftype is
- * FKCONSTR_REF_EACH_ELEMENT, append " sep leftop op ANY(rightop)" to buf.
+ * FKCONSTR_REF_EACH_ELEMENT, append " sep leftop <@ rightop" to buf.
  *
  * The complexity comes from needing to be sure that the parser will select
  * the desired operator.  We always name the operator using
@@ -2697,17 +2697,10 @@ ri_GenerateQual(StringInfo buf,
 	appendStringInfo(buf, " %s %s", sep, leftop);
 	if (leftoptype != operform->oprleft)
 		ri_add_cast_to(buf, operform->oprleft);
- 
- 	appendStringInfo(buf, " OPERATOR(%s.%s) ",
- 	 quote_identifier(nspname), oprname);
- 
- 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT)
- 		appendStringInfoString(buf, "ANY (");
+ 	appendStringInfo(buf, " @> "); 
  	appendStringInfoString(buf, rightop);
  	if (rightoptype != oprright)
  		ri_add_cast_to(buf, oprright);
- 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT)
- 		appendStringInfoChar(buf, ')');
 
 	ReleaseSysCache(opertup);
 }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-05 Thread Mark Rofail
To make the queries fired by the RI triggers GIN indexed. We need to ‒ as
Tom Lane has previously suggested[1] ‒ to replace the query

SELECT 1 FROM ONLY fktable x WHERE $1 = ANY (fkcol) FOR SHARE OF x;

with

SELECT 1 FROM ONLY fktable x WHERE ARRAY[$1] <@ fkcol FOR SHARE OF x;

but since we have @<(anyarray, anyelement) it can be improved to

SELECT 1 FROM ONLY fktable x WHERE $1 @> fkcol FOR SHARE OF x;

and the piece of code responsible for all of this is ri_GenerateQual in
ri_triggers.c.

How to accomplish that is the next step. I don't know if we should hardcode
the "@>" symbol or if we just index the fk table then ri_GenerateQual would
be able to find the operator on it's own.

*What I plan to do:*

   - study how to index the fk table upon its creation. I suspect this can
   be done in tablecmds.c

*Questions:*

   - how can you programmatically in C index a table?

[1] https://www.postgresql.org/message-id/28389.1351094795%40sss.pgh.pa.us

Best Regards,
Mark Rofail
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 3a25ba52f3..0045f64c9e 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -2650,7 +2650,7 @@ quoteRelationName(char *buffer, Relation rel)
  * ri_GenerateQual --- generate a WHERE clause equating two variables
  *
  * The idea is to append " sep leftop op rightop" to buf, or if fkreftype is
- * FKCONSTR_REF_EACH_ELEMENT, append " sep leftop op ANY(rightop)" to buf.
+ * FKCONSTR_REF_EACH_ELEMENT, append " sep leftop <@ rightop" to buf.
  *
  * The complexity comes from needing to be sure that the parser will select
  * the desired operator.  We always name the operator using
@@ -2697,17 +2697,10 @@ ri_GenerateQual(StringInfo buf,
 	appendStringInfo(buf, " %s %s", sep, leftop);
 	if (leftoptype != operform->oprleft)
 		ri_add_cast_to(buf, operform->oprleft);
- 
- 	appendStringInfo(buf, " OPERATOR(%s.%s) ",
- 	 quote_identifier(nspname), oprname);
- 
- 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT)
- 		appendStringInfoString(buf, "ANY (");
+ 	appendStringInfo(buf, " @> "); 
  	appendStringInfoString(buf, rightop);
  	if (rightoptype != oprright)
  		ri_add_cast_to(buf, oprright);
- 	if (fkreftype == FKCONSTR_REF_EACH_ELEMENT)
- 		appendStringInfoChar(buf, ')');
 
 	ReleaseSysCache(opertup);
 }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-06-25 Thread Mark Rofail
*What I did:*



   - read into the old patch but couldn't apply it since it's quite old. It
   needs to be rebased and that's what I am working on.  It's a lot of work.
  - incomplete patch can be found attached here

*Bugs*

   - problem with the @>(anyarray, anyelement) opertator: if for example,
   you apply the operator as follows  '{AA646'}' @> 'AA646' it
   maps to @>(anyarray, anyarray) since 'AA646' is interpreted as
   char[] instead of Text

*Suggestion:*

   - since I needed to check if the Datum was null and its type, I had to
   do it in the arraycontainselem and pass it as a parameter to the underlying
   function array_contains_elem. I'm proposing to introduce a new struct like
   ArrayType, but ElementType along all with brand new MACROs to make dealing
   with anyelement easier in any polymorphic context.


Best Regards,
Mark Rofail

On Tue, Jun 20, 2017 at 12:19 AM, Alvaro Herrera 
wrote:

> Mark Rofail wrote:
> > Okay, so major breakthrough.
> >
> > *Updates:*
> >
> >- The operator @>(anyarray, anyelement) is now functional
> >   - The segmentation fault was due to applying PG_FREE_IF_COPY on a
> >   datum when it should only be applied on TOASTed inputs
> >   - The only problem now is if for example you apply the operator as
> >   follows  '{AA646'}' @> 'AA646' it maps to
> @>(anyarray,
> >   anyarray) since 'AA646' is interpreted as char[] instead
> of Text
> >- Added some regression tests (src/test/regress/sql/arrays.sql) and
> >their results(src/test/regress/expected/arrays.out)
> >- wokred on the new GIN strategy, I don't think it would vary much
> from
> >GinContainsStrategy.
>
> OK, that's great.
>
> > *What I plan to do:*
> >
> >- I need to start working on the Referential Integrity code but I
> don't
> >where to start
>
> You need to study the old patch posted by Marco Nenciarini.
>
> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index ea655a10a8..712f631e88 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2288,6 +2288,14 @@ SCRAM-SHA-256$<iteration count>:<salt><
  
 
  
+  confiselement
+  bool
+  
+  If a foreign key, is it an array ELEMENT
+  foreign key?
+ 
+
+ 
   coninhcount
   int4
   
@@ -2324,6 +2332,18 @@ SCRAM-SHA-256$<iteration count>:<salt><
  
 
  
+  confelement
+  bool[]
+  
+  
+ 	If a foreign key, list of booleans expressing which columns
+ 	are array ELEMENT columns; see
+ 	
+ 	for details
+  
+ 
+
+ 
   conpfeqop
   oid[]
   pg_operator.oid
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index b05a9c2150..c1c847bc7e 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -881,7 +881,112 @@ CREATE TABLE order_items (
 .

   
-
+  
+   
+Array ELEMENT Foreign Keys
+ 
+
+ ELEMENT foreign key
+
+ 
+
+ constraint
+ Array ELEMENT foreign key
+
+ 
+
+ constraint
+ ELEMENT foreign key
+
+ 
+
+ referential integrity
+
+ 
+
+ Another option you have with foreign keys is to use a
+ referencing column which is an array of elements with
+ the same type (or a compatible one) as the referenced
+ column in the related table. This feature is called
+ array element foreign key and is implemented
+ in PostgreSQL with ELEMENT foreign key constraints,
+ as described in the following example:
+ 
+
+CREATE TABLE drivers (
+driver_id integer PRIMARY KEY,
+first_name text,
+last_name text,
+...
+);
+
+CREATE TABLE races (
+race_id integer PRIMARY KEY,
+title text,
+race_day DATE,
+...
+final_positions integer[] ELEMENT REFERENCES drivers
+);
+
+ 
+ The above example uses an array (final_positions)
+ to store the results of a race: for each of its elements
+ a referential integrity check is enforced on the
+ drivers table.
+ Note that ELEMENT REFERENCES is an extension
+ of PostgreSQL and it is not included in the SQL standard.
+
+ 
+
+ Even though the most common use case for array ELEMENT
+ foreign keys is on a single column key, you can define an array
+ ELEMENT foreign key constraint on a group
+ of columns. As the following example shows, it must be written in table
+ constraint form:
+ 
+
+CREA

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-06-19 Thread Mark Rofail
Okay, so major breakthrough.

*Updates:*

   - The operator @>(anyarray, anyelement) is now functional
  - The segmentation fault was due to applying PG_FREE_IF_COPY on a
  datum when it should only be applied on TOASTed inputs
  - The only problem now is if for example you apply the operator as
  follows  '{AA646'}' @> 'AA646' it maps to @>(anyarray,
  anyarray) since 'AA646' is interpreted as char[] instead of Text
   - Added some regression tests (src/test/regress/sql/arrays.sql) and
   their results(src/test/regress/expected/arrays.out)
   - wokred on the new GIN strategy, I don't think it would vary much from
   GinContainsStrategy.

*What I plan to do:*

   - I need to start working on the Referential Integrity code but I don't
   where to start

Best Regards,
Mark Rofail
diff --git a/src/backend/access/gin/ginarrayproc.c b/src/backend/access/gin/ginarrayproc.c
index cc7435e030..a1b3f53ed9 100644
--- a/src/backend/access/gin/ginarrayproc.c
+++ b/src/backend/access/gin/ginarrayproc.c
@@ -24,6 +24,7 @@
 #define GinContainsStrategy		2
 #define GinContainedStrategy	3
 #define GinEqualStrategy		4
+#define GinContainsElemStrategy	5
 
 
 /*
@@ -110,6 +111,11 @@ ginqueryarrayextract(PG_FUNCTION_ARGS)
 		case GinOverlapStrategy:
 			*searchMode = GIN_SEARCH_MODE_DEFAULT;
 			break;
+		case GinContainsElemStrategy:
+			/* only items that match the queried element 
+are considered candidate  */
+			*searchMode = GIN_SEARCH_MODE_DEFAULT;
+			break;
 		case GinContainsStrategy:
 			if (nelems > 0)
 *searchMode = GIN_SEARCH_MODE_DEFAULT;
@@ -171,6 +177,7 @@ ginarrayconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
+		case GinContainsElemStrategy:
 		case GinContainsStrategy:
 			/* result is not lossy */
 			*recheck = false;
@@ -258,7 +265,8 @@ ginarraytriconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
-		case GinContainsStrategy:
+			case GinContainsElemStrategy:
+			case GinContainsStrategy:
 			/* must have all elements in check[] true, and no nulls */
 			res = GIN_TRUE;
 			for (i = 0; i < nkeys; i++)
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index d9c8aa569c..c563aa564e 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -4232,6 +4232,117 @@ arraycontained(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(result);
 }
 
+/*
+ * array_contains_elem : checks an array for a spefific element
+ */
+static bool
+array_contains_elem(AnyArrayType *array, Datum elem, Oid element_type,
+bool element_isnull, Oid collation,	void **fn_extra)
+{
+	Oid 		arr_type = AARR_ELEMTYPE(array);
+	TypeCacheEntry *typentry;
+	int 		nelems;
+	int			typlen;
+	bool		typbyval;
+	char		typalign;
+	int			i;
+	array_iter 	it1;
+	FunctionCallInfoData locfcinfo;
+
+	if (arr_type != element_type)
+		ereport(ERROR,
+(errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("cannot compare different element types")));
+	
+	if (element_isnull)
+		return false;
+		
+	/*
+	 * We arrange to look up the equality function only once per series of
+	 * calls, assuming the element type doesn't change underneath us.  The
+	 * typcache is used so that we have no memory leakage when being used as
+	 * an index support function.
+	 */
+	typentry = (TypeCacheEntry *)*fn_extra;
+	if (typentry == NULL ||
+		typentry->type_id != arr_type)
+	{
+		typentry = lookup_type_cache(arr_type,
+	 TYPECACHE_EQ_OPR_FINFO);
+		if (!OidIsValid(typentry->eq_opr_finfo.fn_oid))
+			ereport(ERROR,
+	(errcode(ERRCODE_UNDEFINED_FUNCTION),
+	 errmsg("could not identify an equality operator for type %s",
+			format_type_be(arr_type;
+		*fn_extra = (void *)typentry;
+	}
+	typlen = typentry->typlen;
+	typbyval = typentry->typbyval;
+	typalign = typentry->typalign;
+
+	/*
+	 * Apply the comparison operator to each pair of array elements.
+	 */
+	InitFunctionCallInfoData(locfcinfo, &typentry->eq_opr_finfo, 2,
+			 collation, NULL, NULL);
+
+	/* Loop over source data */
+	nelems = ArrayGetNItems(AARR_NDIM(array), AARR_DIMS(array));
+	array_iter_setup(&it1, array);
+
+	for (i = 0; i < nelems; i++)
+	{
+		Datum elt1;
+		bool isnull;
+		bool oprresult;
+
+		/* Get element, checking for NULL */
+		elt1 = array_iter_next(&it1, &isnull, i, typlen, typbyval, typalign);
+
+		/*
+		 * We assume that the comparison operator is strict, so a NULL can't
+		 * match anything.  XXX this diverges from the "NULL=NULL" behavior of
+		 * array_eq, should we act like that?
+		 */
+		if (isnull)
+			continue;
+
+		/*
+			* Apply the operator to the element pair
+			*/
+		locfcinfo.arg[0] = elt1;
+		locfcinfo.arg[1] = elem;
+		locfcinfo.argnull[0] = false;
+		locfcinfo.argnull[1] = false;
+		locfcinfo.isnull = false;
+		oprresult = DatumGetBool(FunctionCallInvoke(&locfcinfo));
+		if (oprresult)
+			return

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-06-17 Thread Mark Rofail
*Updates till now:*

   - added a record to pg_proc (src/include/catalog/pg_proc.h)
   - modified opr_sanity regression check expected results
   - implemented a  low-level function called `array_contains_elem` as an
   equivalent to `array_contain_compare` but accepts anyelement instead of
   anyarray as the right operand. This is more efficient than constructing an
   array and then immediately deconstructing it.

*Questions:*

   - I'd like to check that anyelem and anyarray have the same element
   type. but anyelem is obtained from PG_FUNCTION_ARGS as a Datum. How can
   I make such a check?

Best Regards,
Mark Rofail
diff --git a/src/backend/access/gin/ginarrayproc.c b/src/backend/access/gin/ginarrayproc.c
index cc7435e030..214aac8fba 100644
--- a/src/backend/access/gin/ginarrayproc.c
+++ b/src/backend/access/gin/ginarrayproc.c
@@ -24,6 +24,7 @@
 #define GinContainsStrategy		2
 #define GinContainedStrategy	3
 #define GinEqualStrategy		4
+#define GinContainsElemStrategy		5
 
 
 /*
@@ -43,7 +44,7 @@ ginarrayextract(PG_FUNCTION_ARGS)
 	bool	   *nulls;
 	int			nelems;
 
-	get_typlenbyvalalign(ARR_ELEMTYPE(array),
+	get_typlenbyvalalign(ARR_ELEMTYPE(array),	
 		 &elmlen, &elmbyval, &elmalign);
 
 	deconstruct_array(array,
@@ -110,7 +111,8 @@ ginqueryarrayextract(PG_FUNCTION_ARGS)
 		case GinOverlapStrategy:
 			*searchMode = GIN_SEARCH_MODE_DEFAULT;
 			break;
-		case GinContainsStrategy:
+			case GinContainsElemStrategy:
+			case GinContainsStrategy:
 			if (nelems > 0)
 *searchMode = GIN_SEARCH_MODE_DEFAULT;
 			else	/* everything contains the empty set */
@@ -171,6 +173,7 @@ ginarrayconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
+		case GinContainsElemStrategy:
 		case GinContainsStrategy:
 			/* result is not lossy */
 			*recheck = false;
@@ -258,7 +261,8 @@ ginarraytriconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
-		case GinContainsStrategy:
+			case GinContainsElemStrategy:
+			case GinContainsStrategy:
 			/* must have all elements in check[] true, and no nulls */
 			res = GIN_TRUE;
 			for (i = 0; i < nkeys; i++)
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index d9c8aa569c..8009ab5acb 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -4232,6 +4232,107 @@ arraycontained(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(result);
 }
 
+/*
+ * array_contains_elem : checks an array for a spefific element
+ */
+static bool
+array_contains_elem(AnyArrayType *array, Datum elem, Oid collation,
+	void **fn_extra)
+{
+	Oid 		element_type = AARR_ELEMTYPE(array);
+	TypeCacheEntry *typentry;
+	int 		nelems;
+	int			typlen;
+	bool		typbyval;
+	char		typalign;
+	int			i;
+	array_iter 	it1;
+	FunctionCallInfoData locfcinfo;
+
+	/*
+	 * We arrange to look up the equality function only once per series of
+	 * calls, assuming the element type doesn't change underneath us.  The
+	 * typcache is used so that we have no memory leakage when being used as
+	 * an index support function.
+	 */
+	typentry = (TypeCacheEntry *)*fn_extra;
+	if (typentry == NULL ||
+		typentry->type_id != element_type)
+	{
+		typentry = lookup_type_cache(element_type,
+	 TYPECACHE_EQ_OPR_FINFO);
+		if (!OidIsValid(typentry->eq_opr_finfo.fn_oid))
+			ereport(ERROR,
+	(errcode(ERRCODE_UNDEFINED_FUNCTION),
+	 errmsg("could not identify an equality operator for type %s",
+			format_type_be(element_type;
+		*fn_extra = (void *)typentry;
+	}
+	typlen = typentry->typlen;
+	typbyval = typentry->typbyval;
+	typalign = typentry->typalign;
+
+	/*
+	 * Apply the comparison operator to each pair of array elements.
+	 */
+	InitFunctionCallInfoData(locfcinfo, &typentry->eq_opr_finfo, 2,
+			 collation, NULL, NULL);
+
+	/* Loop over source data */
+	nelems = ArrayGetNItems(AARR_NDIM(array), AARR_DIMS(array));
+	array_iter_setup(&it1, array);
+
+	for (i = 0; i < nelems; i++)
+	{
+		Datum elt1;
+		bool isnull;
+		bool oprresult;
+
+		/* Get element, checking for NULL */
+		elt1 = array_iter_next(&it1, &isnull, i, typlen, typbyval, typalign);
+
+		/*
+		 * We assume that the comparison operator is strict, so a NULL can't
+		 * match anything.  XXX this diverges from the "NULL=NULL" behavior of
+		 * array_eq, should we act like that?
+		 */
+		if (isnull)
+			continue;
+
+		/*
+			* Apply the operator to the element pair
+			*/
+		locfcinfo.arg[0] = elt1;
+		locfcinfo.arg[1] = elem;
+		locfcinfo.argnull[0] = false;
+		locfcinfo.argnull[1] = false;
+		locfcinfo.isnull = false;
+		oprresult = DatumGetBool(FunctionCallInvoke(&locfcinfo));
+		if (oprresult)
+			return true;
+	}
+
+	return false;
+}
+
+Datum
+arraycontainselem(PG_FUNCTION_ARGS)
+{
+	AnyArrayType *array = PG_GETARG_ANY_ARRAY(0);
+	Datum elem = PG_GETARG_DATUM(1);
+	Oid collation = PG_GET_COLLATION();
+	bool result;
+
+	result = array_contains_elem(array, elem, collatio

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-06-10 Thread Mark Rofail
• After finding the arraycontains function, I implemented
arraycontainselem that corresponds to the operator @<(anyarray,
anyelem)
   ◦ Please read the attached patch file to view my progress.

•  In addition to src/backend/utils/adt/arrayfuncs.c where I
implemented arraycontainselem.

   ◦ I also edited pg_amop (src/include/catalog/pg_amop.h) since
it stores information about operators associated with access method
operator families.

+DATA(insert ( 2745   2277 2283 2 s 2753 2742 0 ));
{
2745: Oid amopfamily; (denotes gin array_ops)
277: Oid amoplefttype; (denotes anyaray)
2283: Oid amoprighttype; (denotes anyelem)
5: int16 amopstrategy; /* operator strategy number */ (denotes the new
startegy that is yet to be created)
's': char amoppurpose; (denotes 's' for search)
2753: Oid amopopr; (denotes the new operator Oid)
2742: Oid amopmethod;(denotes gin)
0: Oid amopsortfamily; (0 since search operator)
}

   ◦ And pg_operator (src/include/catalog/pg_operator.h) since it
stores information about operators.
+DATA(insert OID = 2753 (  "@>"   PGNSP PGUID b f f 2277 2283 16 0  0
arraycontainselem 0 0 ));
{
 "@>": NameData oprname; /* name of operator */
Oid oprnamespace; /* OID of namespace containing this oper */
Oid oprowner; /* operator owner */
'b': char oprkind; /* 'l', 'r', or 'b' */ (denotes infix)
'f': bool oprcanmerge; /* can be used in merge join? */
'f': bool oprcanhash; /* can be used in hash join? */
277: Oid oprleft; (denotes anyaray)
2283: Oid oprright; (denotes anyelem)
16: Oid oprresult;  (denotes boolean)
0: Oid oprcom; /* OID of commutator oper, or 0 if none */ (needs to be
revisited)
0: Oid oprnegate; /* OID of negator oper, or 0 if none */ (needs to be
revisited)
arraycontainselem: regproc oprcode; /* OID of underlying function */
0: regproc oprrest; /* OID of restriction estimator, or 0 */
0: regproc oprjoin; /* OID of join estimator, or 0 */
}
diff --git a/src/backend/access/gin/ginarrayproc.c b/src/backend/access/gin/ginarrayproc.c
index cc7435e030..14fedc8066 100644
--- a/src/backend/access/gin/ginarrayproc.c
+++ b/src/backend/access/gin/ginarrayproc.c
@@ -24,6 +24,7 @@
 #define GinContainsStrategy		2
 #define GinContainedStrategy	3
 #define GinEqualStrategy		4
+#define GinContainsElemStrategy		5
 
 
 /*
@@ -110,7 +111,8 @@ ginqueryarrayextract(PG_FUNCTION_ARGS)
 		case GinOverlapStrategy:
 			*searchMode = GIN_SEARCH_MODE_DEFAULT;
 			break;
+		case GinContainsElemStrategy:
 		case GinContainsStrategy:
 			if (nelems > 0)
 *searchMode = GIN_SEARCH_MODE_DEFAULT;
 			else	/* everything contains the empty set */
@@ -171,6 +173,7 @@ ginarrayconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
+		case GinContainsElemStrategy:
 		case GinContainsStrategy:
 			/* result is not lossy */
 			*recheck = false;
@@ -258,7 +261,8 @@ ginarraytriconsistent(PG_FUNCTION_ARGS)
 }
 			}
 			break;
-		case GinContainsStrategy:
+		case GinContainsElemStrategy:
 		case GinContainsStrategy:
 			/* must have all elements in check[] true, and no nulls */
 			res = GIN_TRUE;
 			for (i = 0; i < nkeys; i++)
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index d9c8aa569c..e1ff6d33b5 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -4215,6 +4215,40 @@ arraycontains(PG_FUNCTION_ARGS)
 }
 
 Datum
+arraycontainselem(PG_FUNCTION_ARGS)
+{
+		Datum *elem = PG_GETARG_DATUM(0);
+		AnyArrayType *array1;
+		AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
+		Oid collation = PG_GET_COLLATION();
+		bool result;
+
+		int16 typlen;
+		bool typbyval;
+		char typalign;
+		int nelems;
+
+		/* we have one element */
+		nelems= 1;
+
+		/* get required info about the element type */
+		get_typlenbyvalalign(ARR_ELEMTYPE(array),
+			 &typlen, &typbyval, &typalign);
+
+		/* now build the array */
+		array1 =  construct_array(&elem, nelems,collation, &typlen, &typbyval, &typalign);
+
+		result = array_contain_compare(array2, array1, collation, true,
+			&fcinfo->flinfo->fn_extra);
+
+		/* Avoid leaking memory when handed toasted input. */
+		PG_FREE_IF_COPY(elem,0);
+		AARR_FREE_IF_COPY(array, 1);
+
+		PG_RETURN_BOOL(result);
+}
+
+Datum
 arraycontained(PG_FUNCTION_ARGS)
 {
 	AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
index da0228de6b..2da9002577 100644
--- a/src/include/catalog/pg_amop.h
+++ b/src/include/catalog/pg_amop.h
@@ -687,6 +687,8 @@ DATA(insert (	2595   718 600 15 o 3291 783 1970 ));
  */
 DATA(insert (	2745   2277 2277 1 s 2750 2742 0 ));
 DATA(insert (	2745   2277 2277 2 s 2751 2742 0 ));
+//TODO link the operator's pg_operator OID
+DATA(insert ( 2745   2277 2283 5 s 2753 2742 0 ));
 DATA(insert (	2745   2277 2277 3 s 2752 2742 0 ));
 DATA(insert (	2745   2277 2277 4 s 1070 2742 0 ));
 
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index ccbb17efec..626a0b1c49 100644
--- a/src/include/catalog/pg_operator.h
+++ b/

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-05-29 Thread Mark Rofail
>
> rhaas=# select oid, * from pg_opfamily where opfmethod = 2742;
>  oid  | opfmethod |opfname | opfnamespace | opfowner
> --+---++--+--
>  2745 |  2742 | array_ops  |   11 |   10
>  3659 |  2742 | tsvector_ops   |   11 |   10
>  4036 |  2742 | jsonb_ops  |   11 |   10
>  4037 |  2742 | jsonb_path_ops |   11 |   10
> (4 rows)

I am particulary intrested in array_ops but I have failed in locating the
code behind it. Where is it reflected in the source code

Best Regards,
Mark Rofail


Fwd: [HACKERS] GSoC 2017 Proposal

2017-04-19 Thread Mark Rofail
Dear Mr Alexander,

I was checking the archives today and to my shock, I did not find my
reply to your previous question which was almost two weeks ago.
I apologise for the inconvenience, I have however replied within an
hour but apparently, it did not go through.

Best Regards,
Mark Moheb

On Thu, Apr 6, 2017 at 4:18 PM, Mark Rofail  wrote:
> Hello Mr Alexander,
>
> From my understanding, the main issue occurs whenever any UPDATE or
> DELETE statement is executed on the PK table,
> this triggers a referential integrity check on the FK table. In the
> previous patches, this was done by performing a sequential scan.
>
> To improve performance I propose that we index the FK column, and in
> my point of view the most suitable index would be the GIN index since
> it is targeted for composite items.
> However, to move forward with this approach, we have to be sure that
> the comparison semantics offered by GIN indexes satisfy our needs for
> the referential integrity check.
>
> This approach was proposed by Tom Lane:
> https://www.postgresql.org/message-id/28389.1351094795%40sss.pgh.pa.us
>
> I believe this can be accomplished by better understanding the GIN
> index implementation in postgreSQL, including its operators.
>
> This is the best to the knowledge I gained during the application
> period. I would like to investigate it further and would be delighted
> to hear your input regarding the matter,
>
> Best Regards,
> Mark Moheb


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers