On Tue, Jun 25, 2019 at 11:18:19AM +0200, Tomas Vondra wrote:
On Mon, Jun 24, 2019 at 02:54:01PM +0100, Dean Rasheed wrote:
On Mon, 24 Jun 2019 at 00:42, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:

On Sun, Jun 23, 2019 at 10:23:19PM +0200, Tomas Vondra wrote:
On Sun, Jun 23, 2019 at 08:48:26PM +0100, Dean Rasheed wrote:
On Sat, 22 Jun 2019 at 15:10, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
One annoying thing I noticed is that the base_frequency tends to end up
being 0, most likely due to getting too small. It's a bit strange, though,
because with statistic target set to 10k the smallest frequency for a
single column is 1/3e6, so for 2 columns it'd be ~1/9e12 (which I think is
something the float8 can represent).


Yeah, it should be impossible for the base frequency to underflow to
0. However, it looks like the problem is with mcv_list_items()'s use
of %f to convert to text, which is pretty ugly.


Yeah, I realized that too, eventually. One way to fix that would be
adding %.15f to the sprintf() call, but that just adds ugliness. It's
probably time to rewrite the function to build the tuple from datums,
instead of relying on BuildTupleFromCStrings.


OK, attached is a patch doing this. It's pretty simple, and it does
resolve the issue with frequency precision.

There's one issue with the signature, though - currently the function
returns null flags as bool array, but values are returned as simple
text value (formatted in array-like way, but still just a text).

In the attached patch I've reworked both to proper arrays, but obviously
that'd require a CATVERSION bump - and there's not much apetite for that
past beta2, I suppose. So I'll just undo this bit.


Hmm, I didn't spot that the old code was using a single text value
rather than a text array. That's clearly broken, especially since it
wasn't even necessarily constructing a valid textual representation of
an array (e.g., if an individual value's textual representation
included the array markers "{" or "}").

IMO fixing this to return a text array is worth doing, even though it
means a catversion bump.


Yeah :-(

It used to be just a "debugging" function, but now that we're using it
e.g. in pg_stats_ext definition, we need to be more careful about the
output. Presumably we could keep the text output and make sure it's
escaped properly etc. We could even build an array internally and then
run it through an output function. That'd not require catversion bump.

I'll cleanup the patch changing the function signature. If others think
the catversion bump would be too significant annoyance at this point, I
will switch back to the text output (with proper formatting).

Opinions?


Attached is a cleaned-up version of that patch. The main difference is
that instead of using construct_md_array() this uses ArrayBuildState to
construct the arrays, which is much easier. The docs don't need any
update because those were already using text[] for the parameter, the
code was inconsistent with it.

This does require catversion bump, but as annoying as it is, I think
it's worth it (and there's also the thread discussing the serialization
issues). Barring objections, I'll get it committed later next week, once
I get back from PostgresLondon.

As I mentioned before, if we don't want any additional catversion bumps,
it's possible to pass the arrays through output functions - that would
allow us keeping the text output (but correct, unlike what we have now).


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 5fe61ea0a4..498d704a62 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -1134,9 +1134,6 @@ Datum
 pg_stats_ext_mcvlist_items(PG_FUNCTION_ARGS)
 {
        FuncCallContext *funcctx;
-       int                     call_cntr;
-       int                     max_calls;
-       AttInMetadata *attinmeta;
 
        /* stuff done only on the first call of the function */
        if (SRF_IS_FIRSTCALL())
@@ -1166,13 +1163,13 @@ pg_stats_ext_mcvlist_items(PG_FUNCTION_ARGS)
                                        (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                                         errmsg("function returning record 
called in context "
                                                        "that cannot accept 
type record")));
+               tupdesc = BlessTupleDesc(tupdesc);
 
                /*
                 * generate attribute metadata needed later to produce tuples 
from raw
                 * C strings
                 */
-               attinmeta = TupleDescGetAttInMetadata(tupdesc);
-               funcctx->attinmeta = attinmeta;
+               funcctx->attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
                MemoryContextSwitchTo(oldcontext);
        }
@@ -1180,18 +1177,14 @@ pg_stats_ext_mcvlist_items(PG_FUNCTION_ARGS)
        /* stuff done on every call of the function */
        funcctx = SRF_PERCALL_SETUP();
 
-       call_cntr = funcctx->call_cntr;
-       max_calls = funcctx->max_calls;
-       attinmeta = funcctx->attinmeta;
-
-       if (call_cntr < max_calls)      /* do when there is more left to send */
+       if (funcctx->call_cntr < funcctx->max_calls)    /* do when there is 
more left to send */
        {
-               char      **values;
+               Datum           values[5];
+               bool            nulls[5];
                HeapTuple       tuple;
                Datum           result;
-
-               StringInfoData itemValues;
-               StringInfoData itemNulls;
+               ArrayBuildState *astate_values = NULL;
+               ArrayBuildState *astate_nulls = NULL;
 
                int                     i;
 
@@ -1203,20 +1196,9 @@ pg_stats_ext_mcvlist_items(PG_FUNCTION_ARGS)
 
                mcvlist = (MCVList *) funcctx->user_fctx;
 
-               Assert(call_cntr < mcvlist->nitems);
-
-               item = &mcvlist->items[call_cntr];
-
-               /*
-                * Prepare a values array for building the returned tuple. This 
should
-                * be an array of C strings which will be processed later by 
the type
-                * input functions.
-                */
-               values = (char **) palloc0(5 * sizeof(char *));
+               Assert(funcctx->call_cntr < mcvlist->nitems);
 
-               values[0] = (char *) palloc(64 * sizeof(char)); /* item index */
-               values[3] = (char *) palloc(64 * sizeof(char)); /* frequency */
-               values[4] = (char *) palloc(64 * sizeof(char)); /* base 
frequency */
+               item = &mcvlist->items[funcctx->call_cntr];
 
                outfuncs = (Oid *) palloc0(sizeof(Oid) * mcvlist->ndimensions);
                fmgrinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * 
mcvlist->ndimensions);
@@ -1230,61 +1212,49 @@ pg_stats_ext_mcvlist_items(PG_FUNCTION_ARGS)
                        fmgr_info(outfuncs[i], &fmgrinfo[i]);
                }
 
-               /* build the arrays of values / nulls */
-               initStringInfo(&itemValues);
-               initStringInfo(&itemNulls);
-
-               appendStringInfoChar(&itemValues, '{');
-               appendStringInfoChar(&itemNulls, '{');
-
                for (i = 0; i < mcvlist->ndimensions; i++)
                {
-                       Datum           val,
-                                               valout;
 
-                       if (i > 0)
-                       {
-                               appendStringInfoString(&itemValues, ", ");
-                               appendStringInfoString(&itemNulls, ", ");
-                       }
+                       astate_nulls = accumArrayResult(astate_nulls,
+                                                                 
BoolGetDatum(item->isnull[i]),
+                                                                 false,
+                                                                 BOOLOID,
+                                                                 
CurrentMemoryContext);
 
-                       if (item->isnull[i])
-                               valout = CStringGetDatum("NULL");
-                       else
+                       if (!item->isnull[i])
                        {
-                               val = item->values[i];
-                               valout = FunctionCall1(&fmgrinfo[i], val);
+                               Datum   val = FunctionCall1(&fmgrinfo[i], 
item->values[i]);
+                               text   *txt = 
cstring_to_text(DatumGetPointer(val));
+
+                               astate_values = accumArrayResult(astate_values,
+                                                                 
PointerGetDatum(txt),
+                                                                 false,
+                                                                 TEXTOID,
+                                                                 
CurrentMemoryContext);
                        }
-
-                       appendStringInfoString(&itemValues, 
DatumGetCString(valout));
-                       appendStringInfoString(&itemNulls, item->isnull[i] ? 
"t" : "f");
+                       else
+                               astate_values = accumArrayResult(astate_values,
+                                                                 (Datum) 0,
+                                                                 true,
+                                                                 TEXTOID,
+                                                                 
CurrentMemoryContext);
                }
 
-               appendStringInfoChar(&itemValues, '}');
-               appendStringInfoChar(&itemNulls, '}');
-
-               snprintf(values[0], 64, "%d", call_cntr);
-               snprintf(values[3], 64, "%f", item->frequency);
-               snprintf(values[4], 64, "%f", item->base_frequency);
+               values[0] = Int32GetDatum(funcctx->call_cntr);
+               values[1] = PointerGetDatum(makeArrayResult(astate_values, 
CurrentMemoryContext));
+               values[2] = PointerGetDatum(makeArrayResult(astate_nulls, 
CurrentMemoryContext));
+               values[3] = Float8GetDatum(item->frequency);
+               values[4] = Float8GetDatum(item->base_frequency);
 
-               values[1] = itemValues.data;
-               values[2] = itemNulls.data;
+               /* no NULLs in the tuple */
+               memset(nulls, 0, sizeof(nulls));
 
                /* build a tuple */
-               tuple = BuildTupleFromCStrings(attinmeta, values);
+               tuple = heap_form_tuple(funcctx->attinmeta->tupdesc, values, 
nulls);
 
                /* make the tuple into a datum */
                result = HeapTupleGetDatum(tuple);
 
-               /* clean up (this is not really necessary) */
-               pfree(itemValues.data);
-               pfree(itemNulls.data);
-
-               pfree(values[0]);
-               pfree(values[3]);
-               pfree(values[4]);
-               pfree(values);
-
                SRF_RETURN_NEXT(funcctx, result);
        }
        else                                            /* do when there is no 
more left */
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 87335248a0..36ae2ac9d8 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5019,7 +5019,7 @@
 { oid => '3427', descr => 'details about MCV list items',
   proname => 'pg_mcv_list_items', prorows => '1000', proretset => 't',
   provolatile => 's', prorettype => 'record', proargtypes => 'pg_mcv_list',
-  proallargtypes => '{pg_mcv_list,int4,text,_bool,float8,float8}',
+  proallargtypes => '{pg_mcv_list,int4,_text,_bool,float8,float8}',
   proargmodes => '{i,o,o,o,o,o}',
   proargnames => '{mcv_list,index,values,nulls,frequency,base_frequency}',
   prosrc => 'pg_stats_ext_mcvlist_items' },
diff --git a/src/test/regress/expected/stats_ext.out 
b/src/test/regress/expected/stats_ext.out
index 9e770786cc..6a070a9649 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -614,9 +614,9 @@ SELECT m.*
        pg_mcv_list_items(d.stxdmcv) m
  WHERE s.stxname = 'mcv_lists_stats'
    AND d.stxoid = s.oid;
- index |  values   |  nulls  | frequency | base_frequency 
--------+-----------+---------+-----------+----------------
-     0 | {1, 2, 3} | {f,f,f} |         1 |              1
+ index | values  |  nulls  | frequency | base_frequency 
+-------+---------+---------+-----------+----------------
+     0 | {1,2,3} | {f,f,f} |         1 |              1
 (1 row)
 
 -- mcv with arrays

Reply via email to