2014-12-16 11:01 GMT+07:00 Ali Akbar <the.ap...@gmail.com>: > > > > 2014-12-16 10:47 GMT+07:00 Ali Akbar <the.ap...@gmail.com>: >> >> >> 2014-12-16 6:27 GMT+07:00 Tomas Vondra <t...@fuzzy.cz>: >> Just fast-viewing the patch. >> >> The patch is not implementing the checking for not creating new context >> in initArrayResultArr. I think we should implement it also there for >> consistency (and preventing future problems). >> >
Testing the performance with your query, looks promising: speedup is between 12% ~ 15%. Because i'm using 32-bit systems, setting work_mem to 1024GB failed: > ERROR: 1073741824 is outside the valid range for parameter "work_mem" (64 > .. 2097151) > STATEMENT: SET work_mem = '1024GB'; > psql:/media/truecrypt1/oss/postgresql/postgresql/../patches/array-agg.sql:20: > ERROR: 1073741824 is outside the valid range for parameter "work_mem" (64 > .. 2097151) > Maybe because of that, in the large groups a test, the speedup is awesome: > master: 16,819 ms > with patch: 1,720 ms > Looks like with master, postgres resort to disk, but with the patch it fits in memory. Note: I hasn't tested the large dataset. As expected, testing array_agg(anyarray), the performance is still the same, because the subcontext hasn't implemented there (test script modified from Tomas', attached). I implemented the subcontext checking in initArrayResultArr by changing the v3 patch like this: > +++ b/src/backend/utils/adt/arrayfuncs.c > @@ -4797,10 +4797,11 @@ initArrayResultArr(Oid array_type, Oid > element_type, MemoryContext rcontext, > bool subcontext) > { > ArrayBuildStateArr *astate; > - MemoryContext arr_context; > + MemoryContext arr_context = rcontext; /* by default use the parent > ctx */ > > /* Make a temporary context to hold all the junk */ > - arr_context = AllocSetContextCreate(rcontext, > + if (subcontext) > + arr_context = AllocSetContextCreate(rcontext, > "accumArrayResultArr", > ALLOCSET_DEFAULT_MINSIZE, > ALLOCSET_DEFAULT_INITSIZE, > Testing the performance, it got the 12%~15% speedup. Good. (patch attached) Looking at the modification in accumArrayResult* functions, i don't really > comfortable with: > > 1. Code that calls accumArrayResult* after explicitly calling > initArrayResult* must always passing subcontext, but it has no effect. > 2. All existing codes that calls accumArrayResult must be changed. > > Just an idea: why don't we minimize the change in API like this: > > 1. Adding parameter bool subcontext, only in initArrayResult* > functions but not in accumArrayResult* > 2. Code that want to not creating subcontext must calls > initArrayResult* explicitly. > > Other codes that calls directly to accumArrayResult can only be changed in > the call to makeArrayResult* (with release=true parameter). In places that > we don't want to create subcontext (as in array_agg_transfn), modify it to > use initArrayResult* before calling accumArrayResult*. > > What do you think? > As per your concern about calling initArrayResult* with subcontext=false, while makeArrayResult* with release=true: > Also, it seems that using 'subcontext=false' and then 'release=true' > would be a bug. Maybe it would be appropriate to store the > 'subcontext' value into the ArrayBuildState and then throw an error > if makeArrayResult* is called with (release=true && subcontext=false). > Yes, i think we should do that to minimize unexpected coding errors. In makeArrayResult*, i think its better to not throwing an error, but using assertions: > Assert(release == false || astate->subcontext == true); > Regards, -- Ali Akbar
array-agg-anyarray.sql
Description: application/sql
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c index 18ae318..48e66bf 100644 --- a/contrib/dblink/dblink.c +++ b/contrib/dblink/dblink.c @@ -1271,13 +1271,13 @@ dblink_get_connections(PG_FUNCTION_ARGS) /* stash away current value */ astate = accumArrayResult(astate, CStringGetTextDatum(hentry->name), - false, TEXTOID, CurrentMemoryContext); + false, TEXTOID, CurrentMemoryContext, true); } } if (astate) PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, - CurrentMemoryContext)); + CurrentMemoryContext, true)); else PG_RETURN_NULL(); } diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c index c16b38e..bd5eb32 100644 --- a/src/backend/access/common/reloptions.c +++ b/src/backend/access/common/reloptions.c @@ -671,7 +671,7 @@ transformRelOptions(Datum oldOptions, List *defList, char *namspace, /* No match, so keep old option */ astate = accumArrayResult(astate, oldoptions[i], false, TEXTOID, - CurrentMemoryContext); + CurrentMemoryContext, true); } } } @@ -758,12 +758,12 @@ transformRelOptions(Datum oldOptions, List *defList, char *namspace, astate = accumArrayResult(astate, PointerGetDatum(t), false, TEXTOID, - CurrentMemoryContext); + CurrentMemoryContext, true); } } if (astate) - result = makeArrayResult(astate, CurrentMemoryContext); + result = makeArrayResult(astate, CurrentMemoryContext, true); else result = (Datum) 0; diff --git a/src/backend/commands/foreigncmds.c b/src/backend/commands/foreigncmds.c index ab4ed6c..804398a 100644 --- a/src/backend/commands/foreigncmds.c +++ b/src/backend/commands/foreigncmds.c @@ -82,11 +82,11 @@ optionListToArray(List *options) astate = accumArrayResult(astate, PointerGetDatum(t), false, TEXTOID, - CurrentMemoryContext); + CurrentMemoryContext, true); } if (astate) - return makeArrayResult(astate, CurrentMemoryContext); + return makeArrayResult(astate, CurrentMemoryContext, true); return PointerGetDatum(NULL); } diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c index d9faf20..39faa4c 100644 --- a/src/backend/executor/nodeSubplan.c +++ b/src/backend/executor/nodeSubplan.c @@ -262,7 +262,7 @@ ExecScanSubPlan(SubPlanState *node, /* Initialize ArrayBuildStateAny in caller's context, if needed */ if (subLinkType == ARRAY_SUBLINK) astate = initArrayResultAny(subplan->firstColType, - CurrentMemoryContext); + CurrentMemoryContext, true); /* * We are probably in a short-lived expression-evaluation context. Switch @@ -372,7 +372,7 @@ ExecScanSubPlan(SubPlanState *node, Assert(subplan->firstColType == tdesc->attrs[0]->atttypid); dvalue = slot_getattr(slot, 1, &disnull); astate = accumArrayResultAny(astate, dvalue, disnull, - subplan->firstColType, oldcontext); + subplan->firstColType, oldcontext, true); /* keep scanning subplan to collect all values */ continue; } @@ -964,7 +964,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext) /* Initialize ArrayBuildStateAny in caller's context, if needed */ if (subLinkType == ARRAY_SUBLINK) astate = initArrayResultAny(subplan->firstColType, - CurrentMemoryContext); + CurrentMemoryContext, true); /* * Must switch to per-query memory context. @@ -1026,7 +1026,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext) Assert(subplan->firstColType == tdesc->attrs[0]->atttypid); dvalue = slot_getattr(slot, 1, &disnull); astate = accumArrayResultAny(astate, dvalue, disnull, - subplan->firstColType, oldcontext); + subplan->firstColType, oldcontext, true); /* keep scanning subplan to collect all values */ continue; } diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c index 50ea4d2..d290a06 100644 --- a/src/backend/utils/adt/array_userfuncs.c +++ b/src/backend/utils/adt/array_userfuncs.c @@ -504,7 +504,7 @@ array_agg_transfn(PG_FUNCTION_ARGS) elem, PG_ARGISNULL(1), arg1_typeid, - aggcontext); + aggcontext, false); /* * The transition type for array_agg() is declared to be "internal", which @@ -578,7 +578,7 @@ array_agg_array_transfn(PG_FUNCTION_ARGS) PG_GETARG_DATUM(1), PG_ARGISNULL(1), arg1_typeid, - aggcontext); + aggcontext, false); /* * The transition type for array_agg() is declared to be "internal", which diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c index 933c6b0..1a2e070 100644 --- a/src/backend/utils/adt/arrayfuncs.c +++ b/src/backend/utils/adt/arrayfuncs.c @@ -4617,22 +4617,23 @@ array_insert_slice(ArrayType *destArray, * were no elements. This is preferred if an empty array is what you want. */ ArrayBuildState * -initArrayResult(Oid element_type, MemoryContext rcontext) +initArrayResult(Oid element_type, MemoryContext rcontext, bool subcontext) { ArrayBuildState *astate; - MemoryContext arr_context; + MemoryContext arr_context = rcontext; /* by default use the parent ctx */ /* Make a temporary context to hold all the junk */ - arr_context = AllocSetContextCreate(rcontext, - "accumArrayResult", - ALLOCSET_DEFAULT_MINSIZE, - ALLOCSET_DEFAULT_INITSIZE, - ALLOCSET_DEFAULT_MAXSIZE); + if (subcontext) + arr_context = AllocSetContextCreate(rcontext, + "accumArrayResult", + ALLOCSET_DEFAULT_MINSIZE, + ALLOCSET_DEFAULT_INITSIZE, + ALLOCSET_DEFAULT_MAXSIZE); astate = (ArrayBuildState *) MemoryContextAlloc(arr_context, sizeof(ArrayBuildState)); astate->mcontext = arr_context; - astate->alen = 64; /* arbitrary starting array size */ + astate->alen = 8; /* arbitrary starting array size */ astate->dvalues = (Datum *) MemoryContextAlloc(arr_context, astate->alen * sizeof(Datum)); astate->dnulls = (bool *) @@ -4659,14 +4660,14 @@ ArrayBuildState * accumArrayResult(ArrayBuildState *astate, Datum dvalue, bool disnull, Oid element_type, - MemoryContext rcontext) + MemoryContext rcontext, bool subcontext) { MemoryContext oldcontext; if (astate == NULL) { /* First time through --- initialize */ - astate = initArrayResult(element_type, rcontext); + astate = initArrayResult(element_type, rcontext, subcontext); } else { @@ -4718,7 +4719,8 @@ accumArrayResult(ArrayBuildState *astate, */ Datum makeArrayResult(ArrayBuildState *astate, - MemoryContext rcontext) + MemoryContext rcontext, + bool release) { int ndims; int dims[1]; @@ -4729,7 +4731,7 @@ makeArrayResult(ArrayBuildState *astate, dims[0] = astate->nelems; lbs[0] = 1; - return makeMdArrayResult(astate, ndims, dims, lbs, rcontext, true); + return makeMdArrayResult(astate, ndims, dims, lbs, rcontext, release); } /* @@ -4791,13 +4793,15 @@ makeMdArrayResult(ArrayBuildState *astate, * rcontext is where to keep working state */ ArrayBuildStateArr * -initArrayResultArr(Oid array_type, Oid element_type, MemoryContext rcontext) +initArrayResultArr(Oid array_type, Oid element_type, MemoryContext rcontext, + bool subcontext) { ArrayBuildStateArr *astate; - MemoryContext arr_context; + MemoryContext arr_context = rcontext; /* by default use the parent ctx */ /* Make a temporary context to hold all the junk */ - arr_context = AllocSetContextCreate(rcontext, + if (subcontext) + arr_context = AllocSetContextCreate(rcontext, "accumArrayResultArr", ALLOCSET_DEFAULT_MINSIZE, ALLOCSET_DEFAULT_INITSIZE, @@ -4827,7 +4831,7 @@ ArrayBuildStateArr * accumArrayResultArr(ArrayBuildStateArr *astate, Datum dvalue, bool disnull, Oid array_type, - MemoryContext rcontext) + MemoryContext rcontext, bool subcontext) { ArrayType *arg; MemoryContext oldcontext; @@ -4862,7 +4866,7 @@ accumArrayResultArr(ArrayBuildStateArr *astate, (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("data type %s is not an array type", format_type_be(array_type)))); - astate = initArrayResultArr(array_type, element_type, rcontext); + astate = initArrayResultArr(array_type, element_type, rcontext, subcontext); } else { @@ -5064,7 +5068,7 @@ makeArrayResultArr(ArrayBuildStateArr *astate, * rcontext is where to keep working state */ ArrayBuildStateAny * -initArrayResultAny(Oid input_type, MemoryContext rcontext) +initArrayResultAny(Oid input_type, MemoryContext rcontext, bool subcontext) { ArrayBuildStateAny *astate; Oid element_type = get_element_type(input_type); @@ -5074,7 +5078,7 @@ initArrayResultAny(Oid input_type, MemoryContext rcontext) /* Array case */ ArrayBuildStateArr *arraystate; - arraystate = initArrayResultArr(input_type, element_type, rcontext); + arraystate = initArrayResultArr(input_type, element_type, rcontext, subcontext); astate = (ArrayBuildStateAny *) MemoryContextAlloc(arraystate->mcontext, sizeof(ArrayBuildStateAny)); @@ -5089,7 +5093,7 @@ initArrayResultAny(Oid input_type, MemoryContext rcontext) /* Let's just check that we have a type that can be put into arrays */ Assert(OidIsValid(get_array_type(input_type))); - scalarstate = initArrayResult(input_type, rcontext); + scalarstate = initArrayResult(input_type, rcontext, subcontext); astate = (ArrayBuildStateAny *) MemoryContextAlloc(scalarstate->mcontext, sizeof(ArrayBuildStateAny)); @@ -5112,19 +5116,19 @@ ArrayBuildStateAny * accumArrayResultAny(ArrayBuildStateAny *astate, Datum dvalue, bool disnull, Oid input_type, - MemoryContext rcontext) + MemoryContext rcontext, bool subcontext) { if (astate == NULL) - astate = initArrayResultAny(input_type, rcontext); + astate = initArrayResultAny(input_type, rcontext, subcontext); if (astate->scalarstate) (void) accumArrayResult(astate->scalarstate, dvalue, disnull, - input_type, rcontext); + input_type, rcontext, subcontext); else (void) accumArrayResultArr(astate->arraystate, dvalue, disnull, - input_type, rcontext); + input_type, rcontext, subcontext); return astate; } diff --git a/src/backend/utils/adt/regexp.c b/src/backend/utils/adt/regexp.c index 50b33f6..441a6bd 100644 --- a/src/backend/utils/adt/regexp.c +++ b/src/backend/utils/adt/regexp.c @@ -1175,7 +1175,7 @@ regexp_split_to_array(PG_FUNCTION_ARGS) build_regexp_split_result(splitctx), false, TEXTOID, - CurrentMemoryContext); + CurrentMemoryContext, true); splitctx->next_match++; } @@ -1185,7 +1185,7 @@ regexp_split_to_array(PG_FUNCTION_ARGS) * memory context anyway. */ - PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext)); + PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext, true)); } /* This is separate to keep the opr_sanity regression test from complaining */ diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index b3f397e..aa045a2 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -3576,7 +3576,7 @@ text_to_array_internal(PG_FUNCTION_ARGS) PointerGetDatum(result_text), is_null, TEXTOID, - CurrentMemoryContext); + CurrentMemoryContext, true); pfree(result_text); @@ -3621,7 +3621,7 @@ text_to_array_internal(PG_FUNCTION_ARGS) PointerGetDatum(result_text), is_null, TEXTOID, - CurrentMemoryContext); + CurrentMemoryContext, true); pfree(result_text); @@ -3631,7 +3631,7 @@ text_to_array_internal(PG_FUNCTION_ARGS) } PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, - CurrentMemoryContext)); + CurrentMemoryContext, true)); } /* diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c index 7d90312..ccf0820 100644 --- a/src/backend/utils/adt/xml.c +++ b/src/backend/utils/adt/xml.c @@ -3681,7 +3681,7 @@ xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj, { datum = PointerGetDatum(xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[i])); (void) accumArrayResult(astate, datum, false, - XMLOID, CurrentMemoryContext); + XMLOID, CurrentMemoryContext, true); } } } @@ -3718,7 +3718,7 @@ xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj, result_str = map_sql_value_to_xml_value(datum, datumtype, true); datum = PointerGetDatum(cstring_to_xmltype(result_str)); (void) accumArrayResult(astate, datum, false, - XMLOID, CurrentMemoryContext); + XMLOID, CurrentMemoryContext, true); return 1; } @@ -3930,10 +3930,10 @@ xpath(PG_FUNCTION_ARGS) ArrayType *namespaces = PG_GETARG_ARRAYTYPE_P(2); ArrayBuildState *astate; - astate = initArrayResult(XMLOID, CurrentMemoryContext); + astate = initArrayResult(XMLOID, CurrentMemoryContext, true); xpath_internal(xpath_expr_text, data, namespaces, NULL, astate); - PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext)); + PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext, true)); #else NO_XML_SUPPORT(); return 0; diff --git a/src/include/utils/array.h b/src/include/utils/array.h index e1d5749..385eda8 100644 --- a/src/include/utils/array.h +++ b/src/include/utils/array.h @@ -286,31 +286,31 @@ extern void deconstruct_array(ArrayType *array, extern bool array_contains_nulls(ArrayType *array); extern ArrayBuildState *initArrayResult(Oid element_type, - MemoryContext rcontext); + MemoryContext rcontext, bool subcontext); extern ArrayBuildState *accumArrayResult(ArrayBuildState *astate, Datum dvalue, bool disnull, Oid element_type, - MemoryContext rcontext); + MemoryContext rcontext, bool subcontext); extern Datum makeArrayResult(ArrayBuildState *astate, - MemoryContext rcontext); + MemoryContext rcontext, bool release); extern Datum makeMdArrayResult(ArrayBuildState *astate, int ndims, int *dims, int *lbs, MemoryContext rcontext, bool release); extern ArrayBuildStateArr *initArrayResultArr(Oid array_type, Oid element_type, - MemoryContext rcontext); + MemoryContext rcontext, bool subcontext); extern ArrayBuildStateArr *accumArrayResultArr(ArrayBuildStateArr *astate, Datum dvalue, bool disnull, Oid array_type, - MemoryContext rcontext); + MemoryContext rcontext, bool subcontext); extern Datum makeArrayResultArr(ArrayBuildStateArr *astate, MemoryContext rcontext, bool release); extern ArrayBuildStateAny *initArrayResultAny(Oid input_type, - MemoryContext rcontext); + MemoryContext rcontext, bool subcontext); extern ArrayBuildStateAny *accumArrayResultAny(ArrayBuildStateAny *astate, Datum dvalue, bool disnull, Oid input_type, - MemoryContext rcontext); + MemoryContext rcontext, bool subcontext); extern Datum makeArrayResultAny(ArrayBuildStateAny *astate, MemoryContext rcontext, bool release); diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c index 492c1ef..817c606 100644 --- a/src/pl/plperl/plperl.c +++ b/src/pl/plperl/plperl.c @@ -1191,7 +1191,7 @@ array_to_datum_internal(AV *av, ArrayBuildState *astate, &isnull); (void) accumArrayResult(astate, dat, isnull, - elemtypid, CurrentMemoryContext); + elemtypid, CurrentMemoryContext, true); } } } @@ -1218,7 +1218,7 @@ plperl_array_to_datum(SV *src, Oid typid, int32 typmod) errmsg("cannot convert Perl array to non-array type %s", format_type_be(typid)))); - astate = initArrayResult(elemtypid, CurrentMemoryContext); + astate = initArrayResult(elemtypid, CurrentMemoryContext, true); _sv_to_datum_finfo(elemtypid, &finfo, &typioparam);
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers