2014-12-16 11:01 GMT+07:00 Ali Akbar <[email protected]>: > > > > 2014-12-16 10:47 GMT+07:00 Ali Akbar <[email protected]>: >> >> >> 2014-12-16 6:27 GMT+07:00 Tomas Vondra <[email protected]>: >> Just fast-viewing the patch. >> >> The patch is not implementing the checking for not creating new context >> in initArrayResultArr. I think we should implement it also there for >> consistency (and preventing future problems). >> >
Testing the performance with your query, looks promising: speedup is
between 12% ~ 15%.
Because i'm using 32-bit systems, setting work_mem to 1024GB failed:
> ERROR: 1073741824 is outside the valid range for parameter "work_mem" (64
> .. 2097151)
> STATEMENT: SET work_mem = '1024GB';
> psql:/media/truecrypt1/oss/postgresql/postgresql/../patches/array-agg.sql:20:
> ERROR: 1073741824 is outside the valid range for parameter "work_mem" (64
> .. 2097151)
>
Maybe because of that, in the large groups a test, the speedup is awesome:
> master: 16,819 ms
>
with patch: 1,720 ms
>
Looks like with master, postgres resort to disk, but with the patch it fits
in memory.
Note: I hasn't tested the large dataset.
As expected, testing array_agg(anyarray), the performance is still the
same, because the subcontext hasn't implemented there (test script modified
from Tomas', attached).
I implemented the subcontext checking in initArrayResultArr by changing the
v3 patch like this:
> +++ b/src/backend/utils/adt/arrayfuncs.c
> @@ -4797,10 +4797,11 @@ initArrayResultArr(Oid array_type, Oid
> element_type, MemoryContext rcontext,
> bool subcontext)
> {
> ArrayBuildStateArr *astate;
> - MemoryContext arr_context;
> + MemoryContext arr_context = rcontext; /* by default use the parent
> ctx */
>
> /* Make a temporary context to hold all the junk */
> - arr_context = AllocSetContextCreate(rcontext,
> + if (subcontext)
> + arr_context = AllocSetContextCreate(rcontext,
> "accumArrayResultArr",
> ALLOCSET_DEFAULT_MINSIZE,
> ALLOCSET_DEFAULT_INITSIZE,
>
Testing the performance, it got the 12%~15% speedup. Good. (patch attached)
Looking at the modification in accumArrayResult* functions, i don't really
> comfortable with:
>
> 1. Code that calls accumArrayResult* after explicitly calling
> initArrayResult* must always passing subcontext, but it has no effect.
> 2. All existing codes that calls accumArrayResult must be changed.
>
> Just an idea: why don't we minimize the change in API like this:
>
> 1. Adding parameter bool subcontext, only in initArrayResult*
> functions but not in accumArrayResult*
> 2. Code that want to not creating subcontext must calls
> initArrayResult* explicitly.
>
> Other codes that calls directly to accumArrayResult can only be changed in
> the call to makeArrayResult* (with release=true parameter). In places that
> we don't want to create subcontext (as in array_agg_transfn), modify it to
> use initArrayResult* before calling accumArrayResult*.
>
> What do you think?
>
As per your concern about calling initArrayResult* with subcontext=false,
while makeArrayResult* with release=true:
> Also, it seems that using 'subcontext=false' and then 'release=true'
> would be a bug. Maybe it would be appropriate to store the
> 'subcontext' value into the ArrayBuildState and then throw an error
> if makeArrayResult* is called with (release=true && subcontext=false).
>
Yes, i think we should do that to minimize unexpected coding errors. In
makeArrayResult*, i think its better to not throwing an error, but using
assertions:
> Assert(release == false || astate->subcontext == true);
>
Regards,
--
Ali Akbar
array-agg-anyarray.sql
Description: application/sql
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 18ae318..48e66bf 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -1271,13 +1271,13 @@ dblink_get_connections(PG_FUNCTION_ARGS)
/* stash away current value */
astate = accumArrayResult(astate,
CStringGetTextDatum(hentry->name),
- false, TEXTOID, CurrentMemoryContext);
+ false, TEXTOID, CurrentMemoryContext, true);
}
}
if (astate)
PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate,
- CurrentMemoryContext));
+ CurrentMemoryContext, true));
else
PG_RETURN_NULL();
}
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index c16b38e..bd5eb32 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -671,7 +671,7 @@ transformRelOptions(Datum oldOptions, List *defList, char *namspace,
/* No match, so keep old option */
astate = accumArrayResult(astate, oldoptions[i],
false, TEXTOID,
- CurrentMemoryContext);
+ CurrentMemoryContext, true);
}
}
}
@@ -758,12 +758,12 @@ transformRelOptions(Datum oldOptions, List *defList, char *namspace,
astate = accumArrayResult(astate, PointerGetDatum(t),
false, TEXTOID,
- CurrentMemoryContext);
+ CurrentMemoryContext, true);
}
}
if (astate)
- result = makeArrayResult(astate, CurrentMemoryContext);
+ result = makeArrayResult(astate, CurrentMemoryContext, true);
else
result = (Datum) 0;
diff --git a/src/backend/commands/foreigncmds.c b/src/backend/commands/foreigncmds.c
index ab4ed6c..804398a 100644
--- a/src/backend/commands/foreigncmds.c
+++ b/src/backend/commands/foreigncmds.c
@@ -82,11 +82,11 @@ optionListToArray(List *options)
astate = accumArrayResult(astate, PointerGetDatum(t),
false, TEXTOID,
- CurrentMemoryContext);
+ CurrentMemoryContext, true);
}
if (astate)
- return makeArrayResult(astate, CurrentMemoryContext);
+ return makeArrayResult(astate, CurrentMemoryContext, true);
return PointerGetDatum(NULL);
}
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c
index d9faf20..39faa4c 100644
--- a/src/backend/executor/nodeSubplan.c
+++ b/src/backend/executor/nodeSubplan.c
@@ -262,7 +262,7 @@ ExecScanSubPlan(SubPlanState *node,
/* Initialize ArrayBuildStateAny in caller's context, if needed */
if (subLinkType == ARRAY_SUBLINK)
astate = initArrayResultAny(subplan->firstColType,
- CurrentMemoryContext);
+ CurrentMemoryContext, true);
/*
* We are probably in a short-lived expression-evaluation context. Switch
@@ -372,7 +372,7 @@ ExecScanSubPlan(SubPlanState *node,
Assert(subplan->firstColType == tdesc->attrs[0]->atttypid);
dvalue = slot_getattr(slot, 1, &disnull);
astate = accumArrayResultAny(astate, dvalue, disnull,
- subplan->firstColType, oldcontext);
+ subplan->firstColType, oldcontext, true);
/* keep scanning subplan to collect all values */
continue;
}
@@ -964,7 +964,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
/* Initialize ArrayBuildStateAny in caller's context, if needed */
if (subLinkType == ARRAY_SUBLINK)
astate = initArrayResultAny(subplan->firstColType,
- CurrentMemoryContext);
+ CurrentMemoryContext, true);
/*
* Must switch to per-query memory context.
@@ -1026,7 +1026,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
Assert(subplan->firstColType == tdesc->attrs[0]->atttypid);
dvalue = slot_getattr(slot, 1, &disnull);
astate = accumArrayResultAny(astate, dvalue, disnull,
- subplan->firstColType, oldcontext);
+ subplan->firstColType, oldcontext, true);
/* keep scanning subplan to collect all values */
continue;
}
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
index 50ea4d2..d290a06 100644
--- a/src/backend/utils/adt/array_userfuncs.c
+++ b/src/backend/utils/adt/array_userfuncs.c
@@ -504,7 +504,7 @@ array_agg_transfn(PG_FUNCTION_ARGS)
elem,
PG_ARGISNULL(1),
arg1_typeid,
- aggcontext);
+ aggcontext, false);
/*
* The transition type for array_agg() is declared to be "internal", which
@@ -578,7 +578,7 @@ array_agg_array_transfn(PG_FUNCTION_ARGS)
PG_GETARG_DATUM(1),
PG_ARGISNULL(1),
arg1_typeid,
- aggcontext);
+ aggcontext, false);
/*
* The transition type for array_agg() is declared to be "internal", which
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 933c6b0..1a2e070 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -4617,22 +4617,23 @@ array_insert_slice(ArrayType *destArray,
* were no elements. This is preferred if an empty array is what you want.
*/
ArrayBuildState *
-initArrayResult(Oid element_type, MemoryContext rcontext)
+initArrayResult(Oid element_type, MemoryContext rcontext, bool subcontext)
{
ArrayBuildState *astate;
- MemoryContext arr_context;
+ MemoryContext arr_context = rcontext; /* by default use the parent ctx */
/* Make a temporary context to hold all the junk */
- arr_context = AllocSetContextCreate(rcontext,
- "accumArrayResult",
- ALLOCSET_DEFAULT_MINSIZE,
- ALLOCSET_DEFAULT_INITSIZE,
- ALLOCSET_DEFAULT_MAXSIZE);
+ if (subcontext)
+ arr_context = AllocSetContextCreate(rcontext,
+ "accumArrayResult",
+ ALLOCSET_DEFAULT_MINSIZE,
+ ALLOCSET_DEFAULT_INITSIZE,
+ ALLOCSET_DEFAULT_MAXSIZE);
astate = (ArrayBuildState *)
MemoryContextAlloc(arr_context, sizeof(ArrayBuildState));
astate->mcontext = arr_context;
- astate->alen = 64; /* arbitrary starting array size */
+ astate->alen = 8; /* arbitrary starting array size */
astate->dvalues = (Datum *)
MemoryContextAlloc(arr_context, astate->alen * sizeof(Datum));
astate->dnulls = (bool *)
@@ -4659,14 +4660,14 @@ ArrayBuildState *
accumArrayResult(ArrayBuildState *astate,
Datum dvalue, bool disnull,
Oid element_type,
- MemoryContext rcontext)
+ MemoryContext rcontext, bool subcontext)
{
MemoryContext oldcontext;
if (astate == NULL)
{
/* First time through --- initialize */
- astate = initArrayResult(element_type, rcontext);
+ astate = initArrayResult(element_type, rcontext, subcontext);
}
else
{
@@ -4718,7 +4719,8 @@ accumArrayResult(ArrayBuildState *astate,
*/
Datum
makeArrayResult(ArrayBuildState *astate,
- MemoryContext rcontext)
+ MemoryContext rcontext,
+ bool release)
{
int ndims;
int dims[1];
@@ -4729,7 +4731,7 @@ makeArrayResult(ArrayBuildState *astate,
dims[0] = astate->nelems;
lbs[0] = 1;
- return makeMdArrayResult(astate, ndims, dims, lbs, rcontext, true);
+ return makeMdArrayResult(astate, ndims, dims, lbs, rcontext, release);
}
/*
@@ -4791,13 +4793,15 @@ makeMdArrayResult(ArrayBuildState *astate,
* rcontext is where to keep working state
*/
ArrayBuildStateArr *
-initArrayResultArr(Oid array_type, Oid element_type, MemoryContext rcontext)
+initArrayResultArr(Oid array_type, Oid element_type, MemoryContext rcontext,
+ bool subcontext)
{
ArrayBuildStateArr *astate;
- MemoryContext arr_context;
+ MemoryContext arr_context = rcontext; /* by default use the parent ctx */
/* Make a temporary context to hold all the junk */
- arr_context = AllocSetContextCreate(rcontext,
+ if (subcontext)
+ arr_context = AllocSetContextCreate(rcontext,
"accumArrayResultArr",
ALLOCSET_DEFAULT_MINSIZE,
ALLOCSET_DEFAULT_INITSIZE,
@@ -4827,7 +4831,7 @@ ArrayBuildStateArr *
accumArrayResultArr(ArrayBuildStateArr *astate,
Datum dvalue, bool disnull,
Oid array_type,
- MemoryContext rcontext)
+ MemoryContext rcontext, bool subcontext)
{
ArrayType *arg;
MemoryContext oldcontext;
@@ -4862,7 +4866,7 @@ accumArrayResultArr(ArrayBuildStateArr *astate,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("data type %s is not an array type",
format_type_be(array_type))));
- astate = initArrayResultArr(array_type, element_type, rcontext);
+ astate = initArrayResultArr(array_type, element_type, rcontext, subcontext);
}
else
{
@@ -5064,7 +5068,7 @@ makeArrayResultArr(ArrayBuildStateArr *astate,
* rcontext is where to keep working state
*/
ArrayBuildStateAny *
-initArrayResultAny(Oid input_type, MemoryContext rcontext)
+initArrayResultAny(Oid input_type, MemoryContext rcontext, bool subcontext)
{
ArrayBuildStateAny *astate;
Oid element_type = get_element_type(input_type);
@@ -5074,7 +5078,7 @@ initArrayResultAny(Oid input_type, MemoryContext rcontext)
/* Array case */
ArrayBuildStateArr *arraystate;
- arraystate = initArrayResultArr(input_type, element_type, rcontext);
+ arraystate = initArrayResultArr(input_type, element_type, rcontext, subcontext);
astate = (ArrayBuildStateAny *)
MemoryContextAlloc(arraystate->mcontext,
sizeof(ArrayBuildStateAny));
@@ -5089,7 +5093,7 @@ initArrayResultAny(Oid input_type, MemoryContext rcontext)
/* Let's just check that we have a type that can be put into arrays */
Assert(OidIsValid(get_array_type(input_type)));
- scalarstate = initArrayResult(input_type, rcontext);
+ scalarstate = initArrayResult(input_type, rcontext, subcontext);
astate = (ArrayBuildStateAny *)
MemoryContextAlloc(scalarstate->mcontext,
sizeof(ArrayBuildStateAny));
@@ -5112,19 +5116,19 @@ ArrayBuildStateAny *
accumArrayResultAny(ArrayBuildStateAny *astate,
Datum dvalue, bool disnull,
Oid input_type,
- MemoryContext rcontext)
+ MemoryContext rcontext, bool subcontext)
{
if (astate == NULL)
- astate = initArrayResultAny(input_type, rcontext);
+ astate = initArrayResultAny(input_type, rcontext, subcontext);
if (astate->scalarstate)
(void) accumArrayResult(astate->scalarstate,
dvalue, disnull,
- input_type, rcontext);
+ input_type, rcontext, subcontext);
else
(void) accumArrayResultArr(astate->arraystate,
dvalue, disnull,
- input_type, rcontext);
+ input_type, rcontext, subcontext);
return astate;
}
diff --git a/src/backend/utils/adt/regexp.c b/src/backend/utils/adt/regexp.c
index 50b33f6..441a6bd 100644
--- a/src/backend/utils/adt/regexp.c
+++ b/src/backend/utils/adt/regexp.c
@@ -1175,7 +1175,7 @@ regexp_split_to_array(PG_FUNCTION_ARGS)
build_regexp_split_result(splitctx),
false,
TEXTOID,
- CurrentMemoryContext);
+ CurrentMemoryContext, true);
splitctx->next_match++;
}
@@ -1185,7 +1185,7 @@ regexp_split_to_array(PG_FUNCTION_ARGS)
* memory context anyway.
*/
- PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext));
+ PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext, true));
}
/* This is separate to keep the opr_sanity regression test from complaining */
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index b3f397e..aa045a2 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -3576,7 +3576,7 @@ text_to_array_internal(PG_FUNCTION_ARGS)
PointerGetDatum(result_text),
is_null,
TEXTOID,
- CurrentMemoryContext);
+ CurrentMemoryContext, true);
pfree(result_text);
@@ -3621,7 +3621,7 @@ text_to_array_internal(PG_FUNCTION_ARGS)
PointerGetDatum(result_text),
is_null,
TEXTOID,
- CurrentMemoryContext);
+ CurrentMemoryContext, true);
pfree(result_text);
@@ -3631,7 +3631,7 @@ text_to_array_internal(PG_FUNCTION_ARGS)
}
PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate,
- CurrentMemoryContext));
+ CurrentMemoryContext, true));
}
/*
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 7d90312..ccf0820 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3681,7 +3681,7 @@ xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
{
datum = PointerGetDatum(xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[i]));
(void) accumArrayResult(astate, datum, false,
- XMLOID, CurrentMemoryContext);
+ XMLOID, CurrentMemoryContext, true);
}
}
}
@@ -3718,7 +3718,7 @@ xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
result_str = map_sql_value_to_xml_value(datum, datumtype, true);
datum = PointerGetDatum(cstring_to_xmltype(result_str));
(void) accumArrayResult(astate, datum, false,
- XMLOID, CurrentMemoryContext);
+ XMLOID, CurrentMemoryContext, true);
return 1;
}
@@ -3930,10 +3930,10 @@ xpath(PG_FUNCTION_ARGS)
ArrayType *namespaces = PG_GETARG_ARRAYTYPE_P(2);
ArrayBuildState *astate;
- astate = initArrayResult(XMLOID, CurrentMemoryContext);
+ astate = initArrayResult(XMLOID, CurrentMemoryContext, true);
xpath_internal(xpath_expr_text, data, namespaces,
NULL, astate);
- PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext));
+ PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext, true));
#else
NO_XML_SUPPORT();
return 0;
diff --git a/src/include/utils/array.h b/src/include/utils/array.h
index e1d5749..385eda8 100644
--- a/src/include/utils/array.h
+++ b/src/include/utils/array.h
@@ -286,31 +286,31 @@ extern void deconstruct_array(ArrayType *array,
extern bool array_contains_nulls(ArrayType *array);
extern ArrayBuildState *initArrayResult(Oid element_type,
- MemoryContext rcontext);
+ MemoryContext rcontext, bool subcontext);
extern ArrayBuildState *accumArrayResult(ArrayBuildState *astate,
Datum dvalue, bool disnull,
Oid element_type,
- MemoryContext rcontext);
+ MemoryContext rcontext, bool subcontext);
extern Datum makeArrayResult(ArrayBuildState *astate,
- MemoryContext rcontext);
+ MemoryContext rcontext, bool release);
extern Datum makeMdArrayResult(ArrayBuildState *astate, int ndims,
int *dims, int *lbs, MemoryContext rcontext, bool release);
extern ArrayBuildStateArr *initArrayResultArr(Oid array_type, Oid element_type,
- MemoryContext rcontext);
+ MemoryContext rcontext, bool subcontext);
extern ArrayBuildStateArr *accumArrayResultArr(ArrayBuildStateArr *astate,
Datum dvalue, bool disnull,
Oid array_type,
- MemoryContext rcontext);
+ MemoryContext rcontext, bool subcontext);
extern Datum makeArrayResultArr(ArrayBuildStateArr *astate,
MemoryContext rcontext, bool release);
extern ArrayBuildStateAny *initArrayResultAny(Oid input_type,
- MemoryContext rcontext);
+ MemoryContext rcontext, bool subcontext);
extern ArrayBuildStateAny *accumArrayResultAny(ArrayBuildStateAny *astate,
Datum dvalue, bool disnull,
Oid input_type,
- MemoryContext rcontext);
+ MemoryContext rcontext, bool subcontext);
extern Datum makeArrayResultAny(ArrayBuildStateAny *astate,
MemoryContext rcontext, bool release);
diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c
index 492c1ef..817c606 100644
--- a/src/pl/plperl/plperl.c
+++ b/src/pl/plperl/plperl.c
@@ -1191,7 +1191,7 @@ array_to_datum_internal(AV *av, ArrayBuildState *astate,
&isnull);
(void) accumArrayResult(astate, dat, isnull,
- elemtypid, CurrentMemoryContext);
+ elemtypid, CurrentMemoryContext, true);
}
}
}
@@ -1218,7 +1218,7 @@ plperl_array_to_datum(SV *src, Oid typid, int32 typmod)
errmsg("cannot convert Perl array to non-array type %s",
format_type_be(typid))));
- astate = initArrayResult(elemtypid, CurrentMemoryContext);
+ astate = initArrayResult(elemtypid, CurrentMemoryContext, true);
_sv_to_datum_finfo(elemtypid, &finfo, &typioparam);
-- Sent via pgsql-hackers mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
