2014-12-16 11:01 GMT+07:00 Ali Akbar <the.ap...@gmail.com>:
>
>
>
> 2014-12-16 10:47 GMT+07:00 Ali Akbar <the.ap...@gmail.com>:
>>
>>
>> 2014-12-16 6:27 GMT+07:00 Tomas Vondra <t...@fuzzy.cz>:
>> Just fast-viewing the patch.
>>
>> The patch is not implementing the checking for not creating new context
>> in initArrayResultArr. I think we should implement it also there for
>> consistency (and preventing future problems).
>>
>

Testing the performance with your query, looks promising: speedup is
between 12% ~ 15%.

Because i'm using 32-bit systems, setting work_mem to 1024GB failed:

> ERROR:  1073741824 is outside the valid range for parameter "work_mem" (64
> .. 2097151)
> STATEMENT:  SET work_mem = '1024GB';
> psql:/media/truecrypt1/oss/postgresql/postgresql/../patches/array-agg.sql:20:
> ERROR:  1073741824 is outside the valid range for parameter "work_mem" (64
> .. 2097151)
>

Maybe because of that, in the large groups a test, the speedup is awesome:

> master: 16,819 ms
>
with patch: 1,720 ms
>
Looks like with master, postgres resort to disk, but with the patch it fits
in memory.

Note: I hasn't tested the large dataset.

As expected, testing array_agg(anyarray), the performance is still the
same, because the subcontext hasn't implemented there (test script modified
from Tomas', attached).

I implemented the subcontext checking in initArrayResultArr by changing the
v3 patch like this:

> +++ b/src/backend/utils/adt/arrayfuncs.c
> @@ -4797,10 +4797,11 @@ initArrayResultArr(Oid array_type, Oid
> element_type, MemoryContext rcontext,
>                    bool subcontext)
>  {
>     ArrayBuildStateArr *astate;
> -   MemoryContext arr_context;
> +   MemoryContext arr_context = rcontext;   /* by default use the parent
> ctx */
>
>     /* Make a temporary context to hold all the junk */
> -   arr_context = AllocSetContextCreate(rcontext,
> +   if (subcontext)
> +       arr_context = AllocSetContextCreate(rcontext,
>                                         "accumArrayResultArr",
>                                         ALLOCSET_DEFAULT_MINSIZE,
>                                         ALLOCSET_DEFAULT_INITSIZE,
>

Testing the performance, it got the 12%~15% speedup. Good. (patch attached)


Looking at the modification in accumArrayResult* functions, i don't really
> comfortable with:
>
>    1. Code that calls accumArrayResult* after explicitly calling
>    initArrayResult* must always passing subcontext, but it has no effect.
>    2. All existing codes that calls accumArrayResult must be changed.
>
> Just an idea: why don't we minimize the change in API like this:
>
>    1. Adding parameter bool subcontext, only in initArrayResult*
>    functions but not in accumArrayResult*
>    2. Code that want to not creating subcontext must calls
>    initArrayResult* explicitly.
>
> Other codes that calls directly to accumArrayResult can only be changed in
> the call to makeArrayResult* (with release=true parameter). In places that
> we don't want to create subcontext (as in array_agg_transfn), modify it to
> use initArrayResult* before calling accumArrayResult*.
>
> What do you think?
>


As per your concern about calling initArrayResult* with subcontext=false,
while makeArrayResult* with release=true:

>   Also, it seems that using 'subcontext=false' and then 'release=true'
>   would be a bug. Maybe it would be appropriate to store the
>   'subcontext' value into the ArrayBuildState and then throw an error
>   if makeArrayResult* is called with (release=true && subcontext=false).
>

Yes, i think we should do that to minimize unexpected coding errors. In
makeArrayResult*, i think its better to not throwing an error, but using
assertions:

> Assert(release == false || astate->subcontext == true);
>

Regards,
-- 
Ali Akbar

Attachment: array-agg-anyarray.sql
Description: application/sql

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 18ae318..48e66bf 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -1271,13 +1271,13 @@ dblink_get_connections(PG_FUNCTION_ARGS)
 			/* stash away current value */
 			astate = accumArrayResult(astate,
 									  CStringGetTextDatum(hentry->name),
-									  false, TEXTOID, CurrentMemoryContext);
+									  false, TEXTOID, CurrentMemoryContext, true);
 		}
 	}
 
 	if (astate)
 		PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate,
-											  CurrentMemoryContext));
+											  CurrentMemoryContext, true));
 	else
 		PG_RETURN_NULL();
 }
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index c16b38e..bd5eb32 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -671,7 +671,7 @@ transformRelOptions(Datum oldOptions, List *defList, char *namspace,
 				/* No match, so keep old option */
 				astate = accumArrayResult(astate, oldoptions[i],
 										  false, TEXTOID,
-										  CurrentMemoryContext);
+										  CurrentMemoryContext, true);
 			}
 		}
 	}
@@ -758,12 +758,12 @@ transformRelOptions(Datum oldOptions, List *defList, char *namspace,
 
 			astate = accumArrayResult(astate, PointerGetDatum(t),
 									  false, TEXTOID,
-									  CurrentMemoryContext);
+									  CurrentMemoryContext, true);
 		}
 	}
 
 	if (astate)
-		result = makeArrayResult(astate, CurrentMemoryContext);
+		result = makeArrayResult(astate, CurrentMemoryContext, true);
 	else
 		result = (Datum) 0;
 
diff --git a/src/backend/commands/foreigncmds.c b/src/backend/commands/foreigncmds.c
index ab4ed6c..804398a 100644
--- a/src/backend/commands/foreigncmds.c
+++ b/src/backend/commands/foreigncmds.c
@@ -82,11 +82,11 @@ optionListToArray(List *options)
 
 		astate = accumArrayResult(astate, PointerGetDatum(t),
 								  false, TEXTOID,
-								  CurrentMemoryContext);
+								  CurrentMemoryContext, true);
 	}
 
 	if (astate)
-		return makeArrayResult(astate, CurrentMemoryContext);
+		return makeArrayResult(astate, CurrentMemoryContext, true);
 
 	return PointerGetDatum(NULL);
 }
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c
index d9faf20..39faa4c 100644
--- a/src/backend/executor/nodeSubplan.c
+++ b/src/backend/executor/nodeSubplan.c
@@ -262,7 +262,7 @@ ExecScanSubPlan(SubPlanState *node,
 	/* Initialize ArrayBuildStateAny in caller's context, if needed */
 	if (subLinkType == ARRAY_SUBLINK)
 		astate = initArrayResultAny(subplan->firstColType,
-									CurrentMemoryContext);
+									CurrentMemoryContext, true);
 
 	/*
 	 * We are probably in a short-lived expression-evaluation context. Switch
@@ -372,7 +372,7 @@ ExecScanSubPlan(SubPlanState *node,
 			Assert(subplan->firstColType == tdesc->attrs[0]->atttypid);
 			dvalue = slot_getattr(slot, 1, &disnull);
 			astate = accumArrayResultAny(astate, dvalue, disnull,
-										 subplan->firstColType, oldcontext);
+										 subplan->firstColType, oldcontext, true);
 			/* keep scanning subplan to collect all values */
 			continue;
 		}
@@ -964,7 +964,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
 	/* Initialize ArrayBuildStateAny in caller's context, if needed */
 	if (subLinkType == ARRAY_SUBLINK)
 		astate = initArrayResultAny(subplan->firstColType,
-									CurrentMemoryContext);
+									CurrentMemoryContext, true);
 
 	/*
 	 * Must switch to per-query memory context.
@@ -1026,7 +1026,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
 			Assert(subplan->firstColType == tdesc->attrs[0]->atttypid);
 			dvalue = slot_getattr(slot, 1, &disnull);
 			astate = accumArrayResultAny(astate, dvalue, disnull,
-										 subplan->firstColType, oldcontext);
+										 subplan->firstColType, oldcontext, true);
 			/* keep scanning subplan to collect all values */
 			continue;
 		}
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
index 50ea4d2..d290a06 100644
--- a/src/backend/utils/adt/array_userfuncs.c
+++ b/src/backend/utils/adt/array_userfuncs.c
@@ -504,7 +504,7 @@ array_agg_transfn(PG_FUNCTION_ARGS)
 							 elem,
 							 PG_ARGISNULL(1),
 							 arg1_typeid,
-							 aggcontext);
+							 aggcontext, false);
 
 	/*
 	 * The transition type for array_agg() is declared to be "internal", which
@@ -578,7 +578,7 @@ array_agg_array_transfn(PG_FUNCTION_ARGS)
 								PG_GETARG_DATUM(1),
 								PG_ARGISNULL(1),
 								arg1_typeid,
-								aggcontext);
+								aggcontext, false);
 
 	/*
 	 * The transition type for array_agg() is declared to be "internal", which
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 933c6b0..1a2e070 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -4617,22 +4617,23 @@ array_insert_slice(ArrayType *destArray,
  * were no elements.  This is preferred if an empty array is what you want.
  */
 ArrayBuildState *
-initArrayResult(Oid element_type, MemoryContext rcontext)
+initArrayResult(Oid element_type, MemoryContext rcontext, bool subcontext)
 {
 	ArrayBuildState *astate;
-	MemoryContext arr_context;
+	MemoryContext arr_context = rcontext;	/* by default use the parent ctx */
 
 	/* Make a temporary context to hold all the junk */
-	arr_context = AllocSetContextCreate(rcontext,
-										"accumArrayResult",
-										ALLOCSET_DEFAULT_MINSIZE,
-										ALLOCSET_DEFAULT_INITSIZE,
-										ALLOCSET_DEFAULT_MAXSIZE);
+	if (subcontext)
+		arr_context = AllocSetContextCreate(rcontext,
+											"accumArrayResult",
+											ALLOCSET_DEFAULT_MINSIZE,
+											ALLOCSET_DEFAULT_INITSIZE,
+											ALLOCSET_DEFAULT_MAXSIZE);
 
 	astate = (ArrayBuildState *)
 		MemoryContextAlloc(arr_context, sizeof(ArrayBuildState));
 	astate->mcontext = arr_context;
-	astate->alen = 64;			/* arbitrary starting array size */
+	astate->alen = 8;			/* arbitrary starting array size */
 	astate->dvalues = (Datum *)
 		MemoryContextAlloc(arr_context, astate->alen * sizeof(Datum));
 	astate->dnulls = (bool *)
@@ -4659,14 +4660,14 @@ ArrayBuildState *
 accumArrayResult(ArrayBuildState *astate,
 				 Datum dvalue, bool disnull,
 				 Oid element_type,
-				 MemoryContext rcontext)
+				 MemoryContext rcontext, bool subcontext)
 {
 	MemoryContext oldcontext;
 
 	if (astate == NULL)
 	{
 		/* First time through --- initialize */
-		astate = initArrayResult(element_type, rcontext);
+		astate = initArrayResult(element_type, rcontext, subcontext);
 	}
 	else
 	{
@@ -4718,7 +4719,8 @@ accumArrayResult(ArrayBuildState *astate,
  */
 Datum
 makeArrayResult(ArrayBuildState *astate,
-				MemoryContext rcontext)
+				MemoryContext rcontext,
+				bool release)
 {
 	int			ndims;
 	int			dims[1];
@@ -4729,7 +4731,7 @@ makeArrayResult(ArrayBuildState *astate,
 	dims[0] = astate->nelems;
 	lbs[0] = 1;
 
-	return makeMdArrayResult(astate, ndims, dims, lbs, rcontext, true);
+	return makeMdArrayResult(astate, ndims, dims, lbs, rcontext, release);
 }
 
 /*
@@ -4791,13 +4793,15 @@ makeMdArrayResult(ArrayBuildState *astate,
  *	rcontext is where to keep working state
  */
 ArrayBuildStateArr *
-initArrayResultArr(Oid array_type, Oid element_type, MemoryContext rcontext)
+initArrayResultArr(Oid array_type, Oid element_type, MemoryContext rcontext,
+				   bool subcontext)
 {
 	ArrayBuildStateArr *astate;
-	MemoryContext arr_context;
+	MemoryContext arr_context = rcontext;	/* by default use the parent ctx */
 
 	/* Make a temporary context to hold all the junk */
-	arr_context = AllocSetContextCreate(rcontext,
+	if (subcontext)
+		arr_context = AllocSetContextCreate(rcontext,
 										"accumArrayResultArr",
 										ALLOCSET_DEFAULT_MINSIZE,
 										ALLOCSET_DEFAULT_INITSIZE,
@@ -4827,7 +4831,7 @@ ArrayBuildStateArr *
 accumArrayResultArr(ArrayBuildStateArr *astate,
 					Datum dvalue, bool disnull,
 					Oid array_type,
-					MemoryContext rcontext)
+					MemoryContext rcontext, bool subcontext)
 {
 	ArrayType  *arg;
 	MemoryContext oldcontext;
@@ -4862,7 +4866,7 @@ accumArrayResultArr(ArrayBuildStateArr *astate,
 					(errcode(ERRCODE_DATATYPE_MISMATCH),
 					 errmsg("data type %s is not an array type",
 							format_type_be(array_type))));
-		astate = initArrayResultArr(array_type, element_type, rcontext);
+		astate = initArrayResultArr(array_type, element_type, rcontext, subcontext);
 	}
 	else
 	{
@@ -5064,7 +5068,7 @@ makeArrayResultArr(ArrayBuildStateArr *astate,
  *	rcontext is where to keep working state
  */
 ArrayBuildStateAny *
-initArrayResultAny(Oid input_type, MemoryContext rcontext)
+initArrayResultAny(Oid input_type, MemoryContext rcontext, bool subcontext)
 {
 	ArrayBuildStateAny *astate;
 	Oid			element_type = get_element_type(input_type);
@@ -5074,7 +5078,7 @@ initArrayResultAny(Oid input_type, MemoryContext rcontext)
 		/* Array case */
 		ArrayBuildStateArr *arraystate;
 
-		arraystate = initArrayResultArr(input_type, element_type, rcontext);
+		arraystate = initArrayResultArr(input_type, element_type, rcontext, subcontext);
 		astate = (ArrayBuildStateAny *)
 			MemoryContextAlloc(arraystate->mcontext,
 							   sizeof(ArrayBuildStateAny));
@@ -5089,7 +5093,7 @@ initArrayResultAny(Oid input_type, MemoryContext rcontext)
 		/* Let's just check that we have a type that can be put into arrays */
 		Assert(OidIsValid(get_array_type(input_type)));
 
-		scalarstate = initArrayResult(input_type, rcontext);
+		scalarstate = initArrayResult(input_type, rcontext, subcontext);
 		astate = (ArrayBuildStateAny *)
 			MemoryContextAlloc(scalarstate->mcontext,
 							   sizeof(ArrayBuildStateAny));
@@ -5112,19 +5116,19 @@ ArrayBuildStateAny *
 accumArrayResultAny(ArrayBuildStateAny *astate,
 					Datum dvalue, bool disnull,
 					Oid input_type,
-					MemoryContext rcontext)
+					MemoryContext rcontext, bool subcontext)
 {
 	if (astate == NULL)
-		astate = initArrayResultAny(input_type, rcontext);
+		astate = initArrayResultAny(input_type, rcontext, subcontext);
 
 	if (astate->scalarstate)
 		(void) accumArrayResult(astate->scalarstate,
 								dvalue, disnull,
-								input_type, rcontext);
+								input_type, rcontext, subcontext);
 	else
 		(void) accumArrayResultArr(astate->arraystate,
 								   dvalue, disnull,
-								   input_type, rcontext);
+								   input_type, rcontext, subcontext);
 
 	return astate;
 }
diff --git a/src/backend/utils/adt/regexp.c b/src/backend/utils/adt/regexp.c
index 50b33f6..441a6bd 100644
--- a/src/backend/utils/adt/regexp.c
+++ b/src/backend/utils/adt/regexp.c
@@ -1175,7 +1175,7 @@ regexp_split_to_array(PG_FUNCTION_ARGS)
 								  build_regexp_split_result(splitctx),
 								  false,
 								  TEXTOID,
-								  CurrentMemoryContext);
+								  CurrentMemoryContext, true);
 		splitctx->next_match++;
 	}
 
@@ -1185,7 +1185,7 @@ regexp_split_to_array(PG_FUNCTION_ARGS)
 	 * memory context anyway.
 	 */
 
-	PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext));
+	PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext, true));
 }
 
 /* This is separate to keep the opr_sanity regression test from complaining */
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index b3f397e..aa045a2 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -3576,7 +3576,7 @@ text_to_array_internal(PG_FUNCTION_ARGS)
 									  PointerGetDatum(result_text),
 									  is_null,
 									  TEXTOID,
-									  CurrentMemoryContext);
+									  CurrentMemoryContext, true);
 
 			pfree(result_text);
 
@@ -3621,7 +3621,7 @@ text_to_array_internal(PG_FUNCTION_ARGS)
 									  PointerGetDatum(result_text),
 									  is_null,
 									  TEXTOID,
-									  CurrentMemoryContext);
+									  CurrentMemoryContext, true);
 
 			pfree(result_text);
 
@@ -3631,7 +3631,7 @@ text_to_array_internal(PG_FUNCTION_ARGS)
 	}
 
 	PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate,
-										  CurrentMemoryContext));
+										  CurrentMemoryContext, true));
 }
 
 /*
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 7d90312..ccf0820 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3681,7 +3681,7 @@ xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
 					{
 						datum = PointerGetDatum(xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[i]));
 						(void) accumArrayResult(astate, datum, false,
-												XMLOID, CurrentMemoryContext);
+												XMLOID, CurrentMemoryContext, true);
 					}
 				}
 			}
@@ -3718,7 +3718,7 @@ xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
 	result_str = map_sql_value_to_xml_value(datum, datumtype, true);
 	datum = PointerGetDatum(cstring_to_xmltype(result_str));
 	(void) accumArrayResult(astate, datum, false,
-							XMLOID, CurrentMemoryContext);
+							XMLOID, CurrentMemoryContext, true);
 	return 1;
 }
 
@@ -3930,10 +3930,10 @@ xpath(PG_FUNCTION_ARGS)
 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
 	ArrayBuildState *astate;
 
-	astate = initArrayResult(XMLOID, CurrentMemoryContext);
+	astate = initArrayResult(XMLOID, CurrentMemoryContext, true);
 	xpath_internal(xpath_expr_text, data, namespaces,
 				   NULL, astate);
-	PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext));
+	PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext, true));
 #else
 	NO_XML_SUPPORT();
 	return 0;
diff --git a/src/include/utils/array.h b/src/include/utils/array.h
index e1d5749..385eda8 100644
--- a/src/include/utils/array.h
+++ b/src/include/utils/array.h
@@ -286,31 +286,31 @@ extern void deconstruct_array(ArrayType *array,
 extern bool array_contains_nulls(ArrayType *array);
 
 extern ArrayBuildState *initArrayResult(Oid element_type,
-				MemoryContext rcontext);
+				MemoryContext rcontext, bool subcontext);
 extern ArrayBuildState *accumArrayResult(ArrayBuildState *astate,
 				 Datum dvalue, bool disnull,
 				 Oid element_type,
-				 MemoryContext rcontext);
+				 MemoryContext rcontext, bool subcontext);
 extern Datum makeArrayResult(ArrayBuildState *astate,
-				MemoryContext rcontext);
+				MemoryContext rcontext, bool release);
 extern Datum makeMdArrayResult(ArrayBuildState *astate, int ndims,
 				  int *dims, int *lbs, MemoryContext rcontext, bool release);
 
 extern ArrayBuildStateArr *initArrayResultArr(Oid array_type, Oid element_type,
-				   MemoryContext rcontext);
+				   MemoryContext rcontext, bool subcontext);
 extern ArrayBuildStateArr *accumArrayResultArr(ArrayBuildStateArr *astate,
 					Datum dvalue, bool disnull,
 					Oid array_type,
-					MemoryContext rcontext);
+					MemoryContext rcontext, bool subcontext);
 extern Datum makeArrayResultArr(ArrayBuildStateArr *astate,
 				   MemoryContext rcontext, bool release);
 
 extern ArrayBuildStateAny *initArrayResultAny(Oid input_type,
-				   MemoryContext rcontext);
+				   MemoryContext rcontext, bool subcontext);
 extern ArrayBuildStateAny *accumArrayResultAny(ArrayBuildStateAny *astate,
 					Datum dvalue, bool disnull,
 					Oid input_type,
-					MemoryContext rcontext);
+					MemoryContext rcontext, bool subcontext);
 extern Datum makeArrayResultAny(ArrayBuildStateAny *astate,
 				   MemoryContext rcontext, bool release);
 
diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c
index 492c1ef..817c606 100644
--- a/src/pl/plperl/plperl.c
+++ b/src/pl/plperl/plperl.c
@@ -1191,7 +1191,7 @@ array_to_datum_internal(AV *av, ArrayBuildState *astate,
 									 &isnull);
 
 			(void) accumArrayResult(astate, dat, isnull,
-									elemtypid, CurrentMemoryContext);
+									elemtypid, CurrentMemoryContext, true);
 		}
 	}
 }
@@ -1218,7 +1218,7 @@ plperl_array_to_datum(SV *src, Oid typid, int32 typmod)
 				 errmsg("cannot convert Perl array to non-array type %s",
 						format_type_be(typid))));
 
-	astate = initArrayResult(elemtypid, CurrentMemoryContext);
+	astate = initArrayResult(elemtypid, CurrentMemoryContext, true);
 
 	_sv_to_datum_finfo(elemtypid, &finfo, &typioparam);
 
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to