Re: Freeing transient memory in aggregate functions

2021-12-21 Thread Matt Magoffin
On 21/12/2021, at 10:25 AM, Tom Lane  wrote:
> Not quite like that.  Look into nodeAgg.c, which solves a similar problem
> for the transvalues themselves with code like
> 
>/* forget the old value, if any */
>if (!oldIsNull && !pertrans->inputtypeByVal)
>pfree(DatumGetPointer(oldVal));

Oh great, thanks for the tip!
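
For what it's worth, applying that pattern to the snippet from the question below
might look roughly like this (a sketch only; state->last is assumed to hold the
previously copied pass-by-reference datum, and typbyval its pass-by-value flag):

if (!PG_ARGISNULL(0)) {
  current = PG_GETARG_DATUM(0);
  /* forget the previously copied value, if any (pass-by-reference types only) */
  if (!typbyval && DatumGetPointer(state->last) != NULL)
    pfree(DatumGetPointer(state->last));
  state->last = datumCopy(current, typbyval, typlen);
}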

— m@

Freeing transient memory in aggregate functions

2021-12-20 Thread Matt Magoffin
I have a question about trying to keep memory from growing too much in a C 
aggregate function with pass-by-reference types. I am trying to keep track of a 
last-seen value in my aggregate state, so I have code roughly doing this:

Datum current;
MemoryContext aggContext;
MemoryContext old;
AggCheckCallContext(fcinfo, &aggContext);
old = MemoryContextSwitchTo(aggContext);

if (!PG_ARGISNULL(0)) {
  current = PG_GETARG_DATUM(0);
  state->last = datumCopy(current, typbyval, typlen);
}
MemoryContextSwitchTo(old);

I’m essentially doing a datumCopy() on every non-null input value. I was 
wondering if there is a way to free the previously copied datum, since I don’t 
really need it anymore? Something like

if (!PG_ARGISNULL(0)) {
  current = PG_GETARG_DATUM(0);
  if (state->last != NULL) {
    pfree(state->last);
  }
  state->last = datumCopy(current, typbyval, typlen);
}

I wasn’t sure it was allowed to call pfree() like this. My actual function deals 
with array input values, and for large input sets I don’t want memory use to grow 
as large as the entire data set being aggregated.

Kind regards,
Matt

Re: Properly handling aggregate in nested function call

2021-12-15 Thread Matt Magoffin
On 16/12/2021, at 2:43 PM, Tom Lane  wrote:
> Of course what this function is actually returning is numeric[].
> There is some code such as array_out that will look at the
> element type OID embedded in the array value, and do the right
> thing.  But other code will believe the function's declared
> result type, and that sort of code will go wrong.

(Hand slap to face!) Oh my, thank you for spotting that; this has been driving 
me to distraction, as I kept thinking I was doing something wrong with memory 
contexts rather than it being that copy/paste error.
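
For anyone following the thread, the fix boils down to making the aggregate's
declared result type and the element type used to build the array agree. A
minimal sketch of building the numeric[] result explicitly in the final function
(assuming values/nulls hold nvalues numeric Datums; needs catalog/pg_type.h for
NUMERICOID and utils/array.h for construct_md_array):

int dims[1] = { nvalues };
int lbs[1] = { 1 };

/* numeric is a pass-by-reference varlena: typlen -1, byval false, align 'i' */
ArrayType *result = construct_md_array(values, nulls, 1, dims, lbs,
                                       NUMERICOID, -1, false, 'i');
PG_RETURN_ARRAYTYPE_P(result);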

> Alternatively you can declare one set of polymorphic
> functions and aggregates, in which case you'll need to closely
> study the stuff about the finalfunc_extra option [2].

Thanks for the tips, I have been eyeing that documentation.

— m@



Re: Properly handling aggregate in nested function call

2021-12-15 Thread Matt Magoffin
On 15/12/2021, at 11:51 AM, Tom Lane  wrote:
> You should
> probably palloc temp arrays right here and then use construct_md_array
> directly instead of dealing with an ArrayBuildState.

OK, I gave that a go [1] this time in a vec_agg_sum() aggregate, that operates 
much the same as the vec_agg_mean() one and is behaving in the same way. For 
example this shows the correct values:

SELECT vec_agg_sum(nums) FROM measurements;
NOTICE:  sum 0 = 4.92
NOTICE:  sum 1 = 5.91
NOTICE:  sum 2 = 14.80
    vec_agg_sum    
-------------------
 {4.92,5.91,14.80}
(1 row)


But adding unnest() goes back to seemingly memory address values:

SELECT unnest(vec_agg_sum(nums)) FROM measurements;
NOTICE:  sum 0 = 4.92
NOTICE:  sum 1 = 5.91
NOTICE:  sum 2 = 14.80
     unnest     
----------------
 94069010663032
 94069010663044
 94069010663056
(3 rows)

Any other ideas I could look into?

— m@

[1] 
https://github.com/SolarNetwork/aggs_for_vecs/blob/df8b950c73d9b32fa6f016b44dd8859e9341fe0f/vec_agg_sum.c#L50-L52
 




Re: Properly handling aggregate in nested function call

2021-12-15 Thread Matt Magoffin
On 15/12/2021, at 11:51 AM, Tom Lane  wrote:
> Hmm, I think you're abusing the ArrayBuildState API.  In particular,
> what guarantees that the astate->dvalues and astate->dnulls arrays
> are long enough for what you're stuffing into them?

The length is defined in the aggregate transition function (i.e. the length of 
the first input array) and enforced there [1] so I can stuff away here.

> You should
> probably palloc temp arrays right here and then use construct_md_array
> directly instead of dealing with an ArrayBuildState.

OK, I can try that out. I did have another branch where I create the 
ArrayBuildState in this final function [2] but I still got the same result. 
I’ll try using construct_md_array instead as you suggest, thank you.

> Also, I wonder what happens when state->vec_counts[i] is zero.
> That's seemingly not your problem right now, since the ereport(NOTICE)
> is being reached; but it sure looks like trouble waiting to happen.

It can happen, but the ArrayBuildState is initialised with all 
astate->dnulls[i] as true at the start [3], so the 

if (state->vec_counts[i]) {
}

test just skips those elements and leaves them as NULL.
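
Roughly, the shape of the final-function loop being described is something like
the following sketch (nelems and vec_sums are placeholder field names here, and
the numeric division is only illustrative of computing a mean):

for (i = 0; i < state->nelems; i++) {
  if (state->vec_counts[i]) {
    astate->dvalues[i] = DirectFunctionCall2(numeric_div,
        state->vec_sums[i],
        DirectFunctionCall1(int8_numeric, Int64GetDatum(state->vec_counts[i])));
    astate->dnulls[i] = false;
  }
  /* otherwise dnulls[i] stays true from initialisation, so the element is NULL */
}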

— m@

[1] 
https://github.com/SolarNetwork/aggs_for_vecs/blob/9e742cdc32a113268fd3c1f928c8ac724acec9f5/vec_stat_accum.c#L47-L49
 


[2] 
https://github.com/SolarNetwork/aggs_for_vecs/blob/ab5bea1039f865c26b899f99de866f172d125dc2/vec_agg_mean.c#L23
 


[3] 
https://github.com/SolarNetwork/aggs_for_vecs/blob/ab5bea1039f865c26b899f99de866f172d125dc2/util.c#L94-L96
 




Properly handling aggregate in nested function call

2021-12-13 Thread Matt Magoffin
I am working on a C aggregate function that returns a numeric[] result. If I 
execute the aggregate and return the results directly in my SQL, I get the 
expected results. For example:

SELECT vec_agg_mean(nums) FROM measurements;
NOTICE:  avg 0 = 1.2300
NOTICE:  avg 1 = 1.9700
NOTICE:  avg 2 = 3.7000
      vec_agg_mean      
------------------------
 {1.2300,1.9700,3.7000}
(1 row)

The NOTICE logs are there to help me verify the computed result in the 
aggregate final function, and they essentially do

DatumGetCString(DirectFunctionCall1(numeric_out, datum))
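
(roughly emitted like the following sketch, where i and elem stand in for the
element index and its Datum):

ereport(NOTICE, (errmsg("avg %d = %s", i,
    DatumGetCString(DirectFunctionCall1(numeric_out, elem)))));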

However if I nest the aggregate inside something, such as unnest(), I get what 
appear to be just memory addresses:

SELECT unnest(vec_agg_mean(nums)) FROM measurements;
NOTICE:  avg 0 = 1.2300
NOTICE:  avg 1 = 1.9700
NOTICE:  avg 2 = 3.7000
     unnest     
----------------
 94674302945040
 94674302945052
 94674302945064
(3 rows)

You can see the NOTICE logs are still the same. Passing the aggregate result to 
any other function seems to expose the problem, for example:

SELECT ARRAY[vec_agg_mean(nums)]::numeric[] FROM measurements;
NOTICE:  avg 0 = 1.2300
NOTICE:  avg 1 = 1.9700
NOTICE:  avg 2 = 3.7000
                      array                      
-------------------------------------------------
 {{94674302928624,94674302928636,94674302928648}}
(1 row)

Any ideas what I’m doing wrong here? The source is available here:

https://github.com/SolarNetwork/aggs_for_vecs/blob/9e742cdc32a113268fd3c1f928c8ac724acec9f5/vec_agg_mean.c
 

Cheers,
Matt Magoffin

Re: Handling memory contexts in aggregate function invoking other built-in aggregate functions

2021-12-04 Thread Matt Magoffin
On 5/12/2021, at 9:04 AM, Tom Lane  wrote:
> So that probably means that you weren't careful about allocating your
> own state data in the long-lived context (agg_context), and so it
> got freed between calls.

It turns out I wasn’t careful about setting isnull on the passed-in state 
argument. After I fixed that [1], everything appears to be running smoothly!
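
(For reference, a rough sketch of the kind of first-call check involved in the
transition function, reusing the aggContext/old variables from the snippets
earlier in the thread; the struct name is hypothetical and the actual fix in the
linked commit may differ in detail:)

VecAggNumStatsState *state;  /* hypothetical transition-state struct */

if (PG_ARGISNULL(0)) {
  /* first call for this group: allocate the state in the aggregate context */
  old = MemoryContextSwitchTo(aggContext);
  state = palloc0(sizeof(VecAggNumStatsState));
  MemoryContextSwitchTo(old);
} else {
  state = (VecAggNumStatsState *) PG_GETARG_POINTER(0);
}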

— m@

[1] 
https://github.com/SolarNetwork/aggs_for_vecs/commit/a734ab211f9660f8a5a3d13870f089b46df56e7c

Re: Handling memory contexts in aggregate function invoking other built-in aggregate functions

2021-12-04 Thread Matt Magoffin
On 5/12/2021, at 5:16 AM, Tom Lane  wrote:

> Calling numeric_avg_accum in the agg_context is unnecessary, and possibly
> counterproductive (it might leak memory in that context, since like all
> other aggregates it assumes it's called in a short-lived context).


OK, thanks for that, I’ll remove the context switch before calling 
numeric_avg_accum and test more. 

> Are you testing in an --enable-cassert build?  If not, do that;
> it might make the cause of the crashes more apparent, thanks to
> CLOBBER_FREED_MEMORY and other debug support.

I did build with --enable-cassert, and I did see the state argument pointer 
passed to numeric_avg_accum as 0x7f7f7f7f7f, so now I understand why that 
happened, thanks to the explanation of what that pattern means in the Developer FAQ.

So given you didn’t say I shouldn’t be invoking these aggregate functions this 
way, does that mean that, in theory, there isn’t anything inappropriate about 
the approach as far as you know?

Cheers,
Matt



Handling memory contexts in aggregate function invoking other built-in aggregate functions

2021-12-03 Thread Matt Magoffin
I am trying to see if I can write a C aggregate function that operates on 
numeric[] array columns, and essentially performs aggregates on the array 
elements as if they were individual columns, resulting in an output numeric[] 
array with the aggregate element values. I do realise I can use unnest() and 
then the built-in aggregates like this:

SELECT array_agg(v ORDER BY i) FROM (
  SELECT i, sum(v) AS v
  FROM (VALUES
      (ARRAY[1,2,3]::numeric[]),
      (ARRAY[2,3,4]::numeric[])
    ) n(nums), unnest(n.nums) WITH ORDINALITY AS p(v,i)
  GROUP BY i
) g;

 array_agg 
-----------
 {3,5,7}

What I am doing is based on the aggs_for_vecs [1] extension so my goal is to 
have SQL like this:

SELECT vec_to_sum(nums) FROM (VALUES
    (ARRAY[1,2,3]::numeric[]),
    (ARRAY[2,3,4]::numeric[])
  ) n(nums);

I have the extension working with numerics by doing numeric calculations with 
the help of the functions defined in numeric.h (numeric_add_opt_error() in this 
case) but for other aggregates like average or var_samp I was hoping to piggy 
back off all the support in numeric.c. The primary motivation for doing this as 
a C extension is because it has proved to execute much faster than the 
equivalent unnest() SQL.

So far, I have been working on average support via the vec_to_mean() aggregate, 
and my aggregate's [2] transition function sets up a FunctionCallInfo for the 
numeric_avg_accum() [3] function and then loops over the input array elements, 
calling numeric_avg_accum() and saving its result state object in my 
aggregate’s state. Before looping, I switch the memory context to the 
aggregate’s context, i.e. there is stuff like

MemoryContext aggContext;
MemoryContext old;
AggCheckCallContext(fcinfo, &aggContext);
old = MemoryContextSwitchTo(aggContext);
for (i = 0; i < arrayLength; i++) {
  // invoke numeric_avg_accum() for each array element, store result in my state
}
MemoryContextSwitchTo(old);
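
(A rough sketch of the kind of per-element call setup being described, assuming
PostgreSQL 12+ fmgr conventions; stateDatums, stateIsNull and elemDatum are
placeholder names rather than the extension's actual fields. Passing the outer
fcinfo->context along lets numeric_avg_accum see the aggregate call context and
allocate its own state where it wants to:)

FmgrInfo accumFlinfo;
LOCAL_FCINFO(accumFcinfo, 2);

/* one-time setup: resolve the built-in transition function */
fmgr_info(fmgr_internal_function("numeric_avg_accum"), &accumFlinfo);
InitFunctionCallInfoData(*accumFcinfo, &accumFlinfo, 2,
                         InvalidOid, fcinfo->context, NULL);

/* per array element: pass the running state and the numeric element,
   and keep the returned Datum as the new state for that position */
accumFcinfo->args[0].value = stateDatums[i];
accumFcinfo->args[0].isnull = stateIsNull[i];
accumFcinfo->args[1].value = elemDatum;
accumFcinfo->args[1].isnull = false;
accumFcinfo->isnull = false;
stateDatums[i] = FunctionCallInvoke(accumFcinfo);
stateIsNull[i] = accumFcinfo->isnull;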

In my aggregate function's final function I set up a FunctionCallInfo for the 
numeric_avg() function and loop over all the numeric_avg_accum() state objects 
[4], saving their results as the final array returned from the vec_to_mean() 
aggregate.

Overall, the approach seems like it should work, however I don’t believe I’m 
handling the memory contexts correctly. The function works sometimes, but 
crashes or returns incorrect results sometimes. I tried to debug what’s going 
on, but I am very new to working on a C extension for Postgres so all I surmise 
so far is that some of my saved state objects appear to be released before they 
should be. That finally brings me to my questions:

1. Is it even reasonable for me to try to do this approach?
2. Is there anything I should be doing differently with memory contexts, like 
creating a new one(s) for the calls to numeric_avg_accum()?
3. Is there something else I’m doing wrong?

Thank you very much for your time. I hope I was clear, as I mentioned this is 
all quite new to me so my assumptions/approach/terminology might be off.

Cheers, 
Matt Magoffin


[1] https://github.com/pjungwir/aggs_for_vecs

[2] 
https://github.com/SolarNetwork/aggs_for_vecs/blob/feature/numeric-stats-agg/vec_to_mean_numeric.c

[3] 
https://github.com/SolarNetwork/aggs_for_vecs/blob/7c2a5aad35a814dca6d9f5a35df1de6e4b98d149/vec_to_mean_numeric.c#L57-L58

[4] 
https://github.com/SolarNetwork/aggs_for_vecs/blob/7c2a5aad35a814dca6d9f5a35df1de6e4b98d149/vec_to_mean_numeric.c#L117-L126





Re: Duplicate key violation on upsert

2020-03-26 Thread Matt Magoffin

> On 27/03/2020, at 5:26 AM, Adrian Klaver  wrote:
> 
> Well morning and coffee helped some, but not enough to offer blinding 
> insight. Reviewing the function above, the TimescaleDB insert block function 
> and the overview of the TimescaleDB hypertable architecture leads me to 
> believe there is some sort of conflict between the solarnetwork functions and 
> the TimescaleDB hypertable actions. It is a wishy-washy answer as I do not 
> understand the TimescaleDB architecture well enough. You might want to reach 
> to the TimescaleDB community to see if they can offer any further insight.

Fair enough! Thank you for taking the time to look through this issue with me, 
I really appreciate it. I’ll reach out to the TimescaleDB folks and see what 
they think.

— m@

Re: Duplicate key violation on upsert

2020-03-25 Thread Matt Magoffin

> On 23/03/2020, at 1:10 PM, Adrian Klaver  wrote:
> 
> So the query is in the function solardatum.store_datum()?

> 
> If so what is it doing?

Yes. This function first performs the INSERT into the solardatum.da_datum table 
that we’re discussing here, and then it inserts into two other tables. If it 
helps, the actual SQL is available here:

https://github.com/SolarNetwork/solarnetwork-central/blob/4fa585929a5526187ade0e842c809837647c6a41/solarnet-db-setup/postgres/postgres-init-generic-datum-x-functions.sql#L203-L242

> And could you capture the values and pass them to a RAISE NOTICE?

It would take me some time to get that change deployed. If I was able to, what 
information do you think would be helpful here, e.g. that jdata_a is NULL or 
not, or something else?

The duplicate key violation occurs infrequently, and it does seem appropriate 
to drop the UNIQUE constraint on the da_datum_x_acc_idx index, given uniqueness is 
really only wanted on (node_id, ts, source_id). As long as I can confirm that 
query performance doesn’t decrease, I’d like to recreate the index without 
UNIQUE. Then I’m hoping this problem, whatever the cause, goes away.

— m@

Re: Duplicate key violation on upsert

2020-03-22 Thread Matt Magoffin

> On 23/03/2020, at 9:44 AM, Adrian Klaver  wrote:
> Is there a chance the BEFORE trigger functions are doing something that could 
> be leading to the error?
> 
> In the error log is there a line with the actual values that failed?

The error log does not show the literal values, no. Here is a literal example 
from the logs:

2020-03-20 19:51:11 NZDT [15165]: [6-1] ERROR:  duplicate key value violates 
unique constraint "_hyper_1_1931_chunk_da_datum_x_acc_idx"
2020-03-20 19:51:11 NZDT [15165]: [7-1] CONTEXT:  SQL statement "INSERT INTO 
solardatum.da_datum(ts, node_id, source_id, posted, jdata_i, jdata_a, jdata_s, 
jdata_t)
VALUES (ts_crea, node, src, ts_post, jdata_json->'i', 
jdata_json->'a', jdata_json->'s', 
solarcommon.json_array_to_text_array(jdata_json->'t'))
ON CONFLICT (node_id, ts, source_id) DO UPDATE
SET jdata_i = EXCLUDED.jdata_i,
jdata_a = EXCLUDED.jdata_a,
jdata_s = EXCLUDED.jdata_s,
jdata_t = EXCLUDED.jdata_t,
posted = EXCLUDED.posted
RETURNING (xmax = 0)"
PL/pgSQL function solardatum.store_datum(timestamp with time 
zone,bigint,text,timestamp with time zone,text,boolean) line 10 at SQL statement
2020-03-20 19:51:11 NZDT [15165]: [8-1] STATEMENT:  select * from 
solardatum.store_datum($1, $2, $3, $4, $5) as result

As for the BEFORE triggers, the solardatum.trigger_agg_stale_datum one does an 
INSERT into a different table and a SELECT from this same table. The 
_timescaledb_internal.insert_blocker one is part of the TimescaleDB extension 
which looks like it wouldn’t have an impact to this issue, but the source of 
that is

https://github.com/timescale/timescaledb/blob/91fe723d3aaaf88b53ebf8adc3e16a68ec45/src/hypertable.c#L1359
 


— m@

Re: Duplicate key violation on upsert

2020-03-21 Thread Matt Magoffin

> On 22/03/2020, at 8:11 AM, Adrian Klaver  wrote:
> 
>> I was thinking more about this:
>> "INSERT INTO solardatum.da_datum(ts, node_id, source_id, posted, jdata_i, 
>> jdata_a, jdata_s, jdata_t)
>> VALUES (…) ..."
>> from your OP. Namely whether it was:
>> VALUES (), (), (), ...
>> and if so there were values in the (),(),() that duplicated each other.
>> As to the second part of your response, ON CONFLICT does one of either 
>> INSERT or UPDATE. If:
>> 1) There is no conflict for ON CONFLICT (node_id, ts, source_id) then the 
>> INSERT proceeds.
>> 2) If there is a conflict then an UPDATE occurs using the SET values.
>> Now just me working through this:
>> da_datum_pkey   = (node_id, ts, source_id)
>> da_datum_x_acc_idx  = (node_id, source_id, ts DESC, jdata_a)
>> If 1) from above applies then da_datum_x_acc_idx will not be tripped as the 
>> only way that could happen is if the node_id, ts, source_id was the same as 
>> an existing row and that can't be true because the PK over the same values 
>> passed.
> 
> Well the below is complete rot. If you are UPDATEing then you are not 
> creating a duplicate row, just overwriting a value with itself.
> 
>> If 2) from above happened then you are trying to UPDATE a row with matching 
>> PK values(node_id, ts, source_id). Now it is entirely possible that since 
>> you are not testing for constraint violation on (node_id, source_id, ts 
>> DESC, jdata_a) that you be doing SET jdata_a = EXCLUDED.jdata_a, using a 
>> value that would trip da_datum_x_acc_idx

Sorry for the vagueness in my OP; I was trying to make it easier to read. The 
VALUES clause contains individual column values for a single row, so only one 
possible row is inserted/updated.

So what you’ve outlined is basically what I thought should be happening. 
Namely, there can be only one row that will be inserted/updated. I am wondering 
if I should re-create the da_datum_x_acc_idx index without UNIQUE? I had it as 
UNIQUE to optimise the type of queries that make use of that index… but I did a 
little bit of testing using a non-UNIQUE index and those queries appear to 
execute around the same time as with the UNIQUE index. I just wasn’t sure if 
that would just be masking some other problem in my setup.

— m@

Re: Duplicate key violation on upsert

2020-03-20 Thread Matt Magoffin

> On 21/03/2020, at 8:10 AM, Adrian Klaver  wrote:
> 
>> The _hyper_1_1931_chunk_da_datum_x_acc_idx index has the same definition as 
>> the da_datum_x_acc_idx above (it is defined on a child table). That is, they 
>> are both essentially:
>> UNIQUE, btree (node_id, source_id, ts DESC, jdata_a) WHERE jdata_a IS NOT 
>> NULL
>> The da_datum_pkey index is what the ON CONFLICT cause refers to, so 
>> (node_id, ts, source_id) is UNIQUE as well.
> 
> Hmm, wonder if you are getting bit by this?:
> 
> https://www.postgresql.org/docs/12/sql-insert.html#SQL-ON-CONFLICT 
> 
> 
> "INSERT with an ON CONFLICT DO UPDATE clause is a “deterministic” statement. 
> This means that the command will not be allowed to affect any single existing 
> row more than once; a cardinality violation error will be raised when this 
> situation arises. Rows proposed for insertion should not duplicate each other 
> in terms of attributes constrained by an arbiter index or constraint.”


I’m not sure I’m wrapping my head around this. The INSERT affects 1 row as the 
unique values (node_id, ts, source_id) are specified in the statement. Is it 
possible that da_datum_x_acc_idx is used as the arbiter index in this 
situation, rather than da_datum_pkey (that I intended), and you’re saying that 
the jdata_a column is getting updated twice, first in the INSERT and second in 
the DO UPDATE, triggering the duplicate key violation?

— m@



Re: Duplicate key violation on upsert

2020-03-20 Thread Matt Magoffin


> On 21/03/2020, at 4:00 AM, Adrian Klaver  wrote:
> 
> On 3/20/20 2:17 AM, Matt Magoffin wrote:
>> Hello,
>> Indexes:
>> "da_datum_pkey" UNIQUE, btree (node_id, ts, source_id) CLUSTER, 
>> tablespace "solarindex"
>> "da_datum_reverse_pkey" UNIQUE, btree (node_id, ts DESC, source_id), 
>> tablespace "solarindex"
>> "da_datum_x_acc_idx" UNIQUE, btree (node_id, source_id, ts DESC, 
>> jdata_a) WHERE jdata_a IS NOT NULL, tablespace "solarindex"
>> The error/query looks like:
>> ERROR: duplicate key value violates unique constraint 
>> "_hyper_1_1931_chunk_da_datum_x_acc_idx"
> 
> What is the above index UNIQUE over?
> 
> What is da_datum_x_acc_idx index below indexed over?

The _hyper_1_1931_chunk_da_datum_x_acc_idx index has the same definition as the 
da_datum_x_acc_idx above (it is defined on a child table). That is, they are 
both essentially:

UNIQUE, btree (node_id, source_id, ts DESC, jdata_a) WHERE jdata_a IS NOT NULL

The da_datum_pkey index is what the ON CONFLICT cause refers to, so (node_id, 
ts, source_id) is UNIQUE as well.

— m@



Duplicate key violation on upsert

2020-03-20 Thread Matt Magoffin
Hello,

I am experiencing a duplicate key violation in Postgres 9.6 on occasion for one 
particular query, and I’m wondering where I’m going wrong. My table looks like 
this:

  Table "solardatum.da_datum"
  Column   |   Type   | Collation | Nullable | Default 
---+--+---+--+-
 ts| timestamp with time zone |   | not null | 
 node_id   | bigint   |   | not null | 
 source_id | character varying(64)|   | not null | 
 posted| timestamp with time zone |   | not null | 
 jdata_i   | jsonb|   |  | 
 jdata_a   | jsonb|   |  | 
 jdata_s   | jsonb|   |  | 
 jdata_t   | text[]   |   |  | 
Indexes:
"da_datum_pkey" UNIQUE, btree (node_id, ts, source_id) CLUSTER, tablespace 
"solarindex"
"da_datum_reverse_pkey" UNIQUE, btree (node_id, ts DESC, source_id), 
tablespace "solarindex"
"da_datum_x_acc_idx" UNIQUE, btree (node_id, source_id, ts DESC, jdata_a) 
WHERE jdata_a IS NOT NULL, tablespace "solarindex"
Triggers:
aa_agg_stale_datum BEFORE INSERT OR DELETE OR UPDATE ON solardatum.da_datum 
FOR EACH ROW EXECUTE PROCEDURE solardatum.trigger_agg_stale_datum()
ts_insert_blocker BEFORE INSERT ON solardatum.da_datum FOR EACH ROW EXECUTE 
PROCEDURE _timescaledb_internal.insert_blocker()

The error/query looks like:

ERROR: duplicate key value violates unique constraint 
"_hyper_1_1931_chunk_da_datum_x_acc_idx"
  Where: SQL statement "INSERT INTO solardatum.da_datum(ts, node_id, source_id, 
posted, jdata_i, jdata_a, jdata_s, jdata_t)
VALUES (…)
ON CONFLICT (node_id, ts, source_id) DO UPDATE
SET jdata_i = EXCLUDED.jdata_i,
jdata_a = EXCLUDED.jdata_a,
jdata_s = EXCLUDED.jdata_s,
jdata_t = EXCLUDED.jdata_t,
posted = EXCLUDED.posted
RETURNING (xmax = 0)"

I am using the TimescaleDB extension so there are child tables inheriting from 
this main table and that’s why the reported index name differs from the 
definition shown above. I’m not sure if the extension is the problem, so I 
thought I’d start here to see whether I’ve configured something wrong or whether 
my expectation of how the upsert should work is wrong. My expectation was that 
the insert would basically never fail with a duplicate key violation.

The error always references the da_datum_x_acc_idx index, which is a partial 
index with jdata_a added as a covering column… that is, it’s only in the index 
so I can get some index-only results with that column. Is the partial index 
possibly an issue in this configuration?

Thanks for any insight,
Matt