Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Sun, Apr 13, 2014 at 2:23 PM, Tom Lane wrote:
> * I also left out the table documenting which aggregates have this
> optimization. That's not the kind of thing we ordinarily document,
> and it seems inevitable to me that such a table would be noteworthy
> mostly for wrong/incomplete/obsolete information in the future.

I tend to think that not documenting such things is an error. Sure, the documentation could become obsolete, but I don't see why it's particularly more likely with this table than anywhere else, and if it does happen, and people care, they'll submit patches to fix it.

More to the point, when we don't document things like this, it doesn't cause the knowledge not to be important to end-users; it just means that the knowledge lives on wiki pages, support fora, and in the minds of people who are "in the know" rather than being easily and generally accessible. I'd rather have documentation on this topic that was, say, 80% correct than have none at all.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
> -----Original Message-----
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> Sent: 14 April 2014 06:23
> To: Dean Rasheed
> Cc: Florian Pflug; David Rowley; Kevin Grittner; Josh Berkus; Greg Stark;
> PostgreSQL-development
> Subject: Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions
> (WIP)
>
> Dean Rasheed writes:
>> OK, I'm marking this ready for committer attention, on the
>> understanding that that doesn't include the invtrans_minmax patch.
>
> I've finished reviewing/revising/committing this submission.

That's great news!

Tom, thanks for taking the time to make all the modifications and commit this.

Dean, thank you for all your hard work reviewing the patch. The detail of the review was quite impressive.

I'm really happy to see this make it in for 9.4.

Regards

David Rowley
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr13, 2014, at 20:23 , Tom Lane wrote:
> Dean Rasheed writes:
>> OK, I'm marking this ready for committer attention, on the
>> understanding that that doesn't include the invtrans_minmax patch.
>
> I've finished reviewing/revising/committing this submission.

Cool! Thanks!

> Some notes:
>
> * I left out the EXPLAIN ANALYZE statistics, as I still feel that they're
> of little if any use to regular users, and neither very well defined nor
> well documented. We can revisit that later of course.
>
> * I also left out the table documenting which aggregates have this
> optimization. That's not the kind of thing we ordinarily document,
> and it seems inevitable to me that such a table would be noteworthy
> mostly for wrong/incomplete/obsolete information in the future.

I can live with leaving each of these out individually, but leaving them both out means there's now no way of determining how a particular sliding window aggregation will behave. That leaves our users a bit out in the cold IMHO. For example, with this patch you'd usually want to write SUM(numeric_value::numeric(n,m)) instead of SUM(numeric_value)::numeric(n,m) in a sliding window aggregation, and there should be *some* way for users to figure this out.

We have exhaustive tables of all the functions, operators and aggregates we support, and maintenance of those seems to work well for the most part. Why would a table of invertible aggregates be any different? If it's the non-locality that bothers you - I guess the same information could be added to the descriptions of the individual aggregates, instead of having one big table.

> * I rejected the invtrans_collecting sub-patch altogether. I didn't
> like anything about the idea of adding a poorly-documented field to
> ArrayBuildState and then teaching some random subset of the functions
> using that struct what to do with it. I think it would've been simpler,
> more reliable, and not that much less efficient to just do memmove's in
> shiftArrayResult. The string-aggregate changes were at least more
> localized, but they still seemed awfully messy. And there's a bigger
> issue: these aggregates have to do O(N) work per row for a frame of length
> N anyway, so it's not clear to me that there's any big-O gain to be had
> from making them into moving aggregates.

Yeah, those probably aren't super-useful. I initially added array_agg because with the nonstrict-forward/strict-reverse rule still in place, there wouldn't otherwise have been an in-core aggregate which exercises the non-strict case, which seemed like a bad idea. For the string case - I didn't expect that to turn out to be *quite* this messy when I started implementing it.

> Anyway, this is nice forward progress for 9.4, even if we get no further.

Yup! Thanks to everybody who made this happen!

best regards,
Florian Pflug
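[Editorial note: the sliding-window optimization discussed throughout this thread can be sketched outside PostgreSQL. The following Python toy model (illustrative only, not PostgreSQL code) shows why an inverse transition function helps: the forward transition adds the row entering the frame, the inverse transition removes the row leaving it, so each output row costs O(1) instead of re-aggregating the whole frame.]

```python
def moving_sums(values, frame):
    """Sum over a sliding window of the last `frame` values."""
    state = 0          # running total: the aggregate's transition state
    out = []
    for i, v in enumerate(values):
        state += v                      # forward transition (msfunc)
        if i >= frame:
            state -= values[i - frame]  # inverse transition (minvfunc)
        out.append(state)               # final function is the identity here
    return out

print(moving_sums([1, 2, 3, 4, 5], 3))  # [1, 3, 6, 9, 12]
```

Without the inverse transition, each window would have to be summed from scratch, which is the O(N)-per-row behaviour Tom describes for the collecting aggregates.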
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Dean Rasheed writes:
> OK, I'm marking this ready for committer attention, on the
> understanding that that doesn't include the invtrans_minmax patch.

I've finished reviewing/revising/committing this submission. Some notes:

* I left out the EXPLAIN ANALYZE statistics, as I still feel that they're of little if any use to regular users, and neither very well defined nor well documented. We can revisit that later of course.

* I also left out the table documenting which aggregates have this optimization. That's not the kind of thing we ordinarily document, and it seems inevitable to me that such a table would be noteworthy mostly for wrong/incomplete/obsolete information in the future.

* I rejected the invtrans_collecting sub-patch altogether. I didn't like anything about the idea of adding a poorly-documented field to ArrayBuildState and then teaching some random subset of the functions using that struct what to do with it. I think it would've been simpler, more reliable, and not that much less efficient to just do memmove's in shiftArrayResult. The string-aggregate changes were at least more localized, but they still seemed awfully messy. And there's a bigger issue: these aggregates have to do O(N) work per row for a frame of length N anyway, so it's not clear to me that there's any big-O gain to be had from making them into moving aggregates. I doubt people are going to be using these aggregates with very wide frames, just because they're going to be horribly expensive no matter what we do.

Anyway, this is nice forward progress for 9.4, even if we get no further.

regards, tom lane
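[Editorial note: Tom's big-O point about the collecting aggregates can be illustrated with a sketch (Python, purely for illustration). Even if the state could drop its oldest element cheaply, emitting an array result for every row still copies the whole frame, so a moving-aggregate array_agg stays O(frame) per row.]

```python
from collections import deque

def moving_array_agg(values, frame):
    state = deque()              # O(1) append / popleft
    out = []
    for i, v in enumerate(values):
        state.append(v)          # forward transition
        if i >= frame:
            state.popleft()      # the "inverse transition" is cheap...
        out.append(list(state))  # ...but the final function still copies
                                 # O(len(state)) elements for every row
    return out

print(moving_array_agg([1, 2, 3, 4], 2))
# [[1], [1, 2], [2, 3], [3, 4]]
```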
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr11, 2014, at 19:42 , Tom Lane wrote:
> Florian Pflug writes:
>> Yes, the idea had come up at some point during the review discussion. I
>> agree that it's only worthwhile if it works for state type internal - though
>> I think there ought to be a way to allow that. We could for example simply
>> decree that the initfunc's first parameter must be of type internal if its
>> return type is, and then just pass NULL for that parameter.
>
> I had thought about that, but it doesn't really work since it'd be
> violating the strictness spec of the function.

Only if we insist on passing SQL NULL, not if we passed a non-NULL value that happens to be (char*)0. Or we could simply require the initfunc to be non-strict in the case of state type internal.

best regards,
Florian Pflug
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Florian Pflug writes:
> Yes, the idea had come up at some point during the review discussion. I
> agree that it's only worthwhile if it works for state type internal - though
> I think there ought to be a way to allow that. We could for example simply
> decree that the initfunc's first parameter must be of type internal if its
> return type is, and then just pass NULL for that parameter.

I had thought about that, but it doesn't really work since it'd be violating the strictness spec of the function.

regards, tom lane
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Dean Rasheed writes:
> On 10 April 2014 19:54, Tom Lane wrote:
>> So if we go with that terminology, perhaps these names for the
>> new CREATE AGGREGATE parameters:
>>
>> initfunc                 applies to plain aggregation, mutually exclusive with initcond
>> msfunc (or just mfunc?)  forward transition for moving-agg mode
>> mifunc                   inverse transition for moving-agg mode
>> mstype                   state datatype for moving-agg mode
>> msspace                  space estimate for mstype
>> mfinalfunc               final function for moving-agg mode
>> minitfunc                "firsttrans" for moving-agg mode
>> minitcond                mutually exclusive with minitfunc

> Yeah, those work for me.

> I think I prefer "mfunc" to "msfunc", but perhaps that's just my
> natural aversion to the "ms" prefix :-)

Meh. We've got mstype, and I don't think leaving out the "s" there feels right.

> Also, perhaps "minvfunc" rather than "mifunc" because "i" by itself
> could mean "initial".

Good point. So with initfuncs out of the picture, we have new CREATE AGGREGATE parameter names

    msfunc      forward transition for moving-agg mode
    minvfunc    inverse transition for moving-agg mode
    mfinalfunc  final function for moving-agg mode
    mstype      state datatype for moving-agg mode
    msspace     space estimate for mstype
    minitcond   initial state value for moving-agg mode

and new pg_aggregate columns

    aggmtransfn    | regproc | not null
    aggminvtransfn | regproc | not null
    aggmfinalfn    | regproc | not null
    aggmtranstype  | oid     | not null
    aggmtransspace | integer | not null
    aggminitval    | text    |

It's a bit unfortunate that the catalog column names aren't quite on the same page as CREATE AGGREGATE, but it doesn't seem like a good idea to try to fix that now.

regards, tom lane
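[Editorial note: the roles of the settled-on parameter names can be sketched in Python (a toy model, not the C implementation) using a moving average: msfunc and minvfunc maintain the state, mfinalfunc produces each row's result, and minitcond seeds the state.]

```python
def msfunc(state, x):            # forward transition for moving-agg mode
    total, n = state
    return (total + x, n + 1)

def minvfunc(state, x):          # inverse transition for moving-agg mode
    total, n = state
    return (total - x, n - 1)

def mfinalfunc(state):           # final function for moving-agg mode
    total, n = state
    return total / n if n else None

minitcond = (0, 0)               # initial state value for moving-agg mode

def moving_avg(values, frame):
    state, out = minitcond, []
    for i, v in enumerate(values):
        state = msfunc(state, v)
        if i >= frame:
            state = minvfunc(state, values[i - frame])
        out.append(mfinalfunc(state))
    return out

print(moving_avg([2, 4, 6, 8], 2))  # [2.0, 3.0, 5.0, 7.0]
```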
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr11, 2014, at 19:04 , Tom Lane wrote:
> It strikes me that your second point
>
>> 2) It allows strict aggregates whose state type and input type aren't
>> binary coercible to return NULL if all inputs were NULL without any
>> special coding. As it stands today, this doesn't work without some
>> kind of counter in the state, because the final function otherwise
>> won't know if there were non-NULL inputs or not. Aggregates whose state
>> and input types are binary coercible get around that by setting the
>> initial value to NULL while *still* being strict, and relying on the
>> state replacement behaviour of the aggregate machinery.
>
> could be addressed by the initfunc idea, but I'm still not sufficiently
> excited by that one. Given the problem with internal-type transition
> values, I think this could only win if there are cases where it would let
> us use a regular SQL type instead of internal for the transition value;
> and I'm not sure that there are many/any such cases.

Yes, the idea had come up at some point during the review discussion. I agree that it's only worthwhile if it works for state type internal - though I think there ought to be a way to allow that. We could for example simply decree that the initfunc's first parameter must be of type internal if its return type is, and then just pass NULL for that parameter.

What I like about the initfunc idea is that it also naturally extends to ordered-set aggregates. I think it'd be very useful for some possible ordered-set aggregates to receive their direct arguments beforehand and not afterwards. But that all seems largely orthogonal to the invtrans patch.

best regards,
Florian Pflug
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Florian Pflug writes:
> On Apr11, 2014, at 17:09 , Tom Lane wrote:
>> Basically, this comes down to a design judgment call, and my judgment
>> is differing from yours. In the absence of opinions from others,
>> I'm going to do it my way.

> Ok. Are you going to do the necessary changes, or shall I? I'm happy to
> leave it to you, but if you lack the time I can try to find some.

Nah, I'm happy to do the work, since it's me insisting on changing it.

>> Tell me about the special case here again? Should we revisit the issue?

> ...
> I don't feel too strongly either way on this one - I initially implemented the
> exception because I noticed that the NULL handling of some aggregates was
> broken otherwise, and it seemed simpler to fix this in one place than going
> over all the aggregates separately. OTOH, when I wrote the docs, I noticed
> how hard it was to describe the behaviour accurately, which made me like it
> less and less. And Dean wasn't happy with it at all, so that finally settled
> it.

Yeah, if you can't document the design nicely, it's probably not a good idea :-(.

Thanks for the summary. It strikes me that your second point

> 2) It allows strict aggregates whose state type and input type aren't
> binary coercible to return NULL if all inputs were NULL without any
> special coding. As it stands today, this doesn't work without some
> kind of counter in the state, because the final function otherwise
> won't know if there were non-NULL inputs or not. Aggregates whose state
> and input types are binary coercible get around that by setting the
> initial value to NULL while *still* being strict, and relying on the
> state replacement behaviour of the aggregate machinery.

could be addressed by the initfunc idea, but I'm still not sufficiently excited by that one. Given the problem with internal-type transition values, I think this could only win if there are cases where it would let us use a regular SQL type instead of internal for the transition value; and I'm not sure that there are many/any such cases.

regards, tom lane
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr11, 2014, at 17:09 , Tom Lane wrote:
> Basically, this comes down to a design judgment call, and my judgment
> is differing from yours. In the absence of opinions from others,
> I'm going to do it my way.

Ok. Are you going to do the necessary changes, or shall I? I'm happy to leave it to you, but if you lack the time I can try to find some.

>> ... SUM(int4) wouldn't need
>> *any* extra state at all to be invertible, if it weren't for those pesky
>> issues surrounding NULL handling. In fact, an earlier version of the
>> invtrans patch *did* use int4_sum as SUM(int4)'s transfer functions! The
>> only reason this doesn't work nowadays is that Dean didn't like the
>> forward-nonstrict-but-inverse-strict special case that made this work.
>
> Tell me about the special case here again? Should we revisit the issue?

My original coding allowed the combination of a non-strict forward with a strict reverse transition function. The combination behaved like a strict aggregate regarding NULL handling - i.e., neither the forward nor the reverse transition function received NULL inputs. But if the initial state was NULL, the forward transition function *would* be called with that NULL state value upon seeing the first non-NULL input. In the window case, the aggregation machinery would also take care to reset the state to its initial value when it removed the last non-NULL input from the aggregation state (just like it does for strict aggregates today).

This had two advantages

1) First, it allows strict aggregates to use state type internal. As it stands now, aggregates with state type internal must implement their own NULL handling, even if they behave exactly like most standard aggregates do, namely ignore NULLs and return NULL only if there were no non-NULL inputs.

2) It allows strict aggregates whose state type and input type aren't binary coercible to return NULL if all inputs were NULL without any special coding. As it stands today, this doesn't work without some kind of counter in the state, because the final function otherwise won't know if there were non-NULL inputs or not. Aggregates whose state and input types are binary coercible get around that by setting the initial value to NULL while *still* being strict, and relying on the state replacement behaviour of the aggregate machinery.

It, however, also has a few disadvantages

3) It means that one needs to look at the inverse transition function's strictness setting even if that function is never used.

4) It feels a bit hacky.

(2) is what affects SUM(int4). The only reason it tracks the number of inputs is to be able to return NULL instead of 0 if no inputs remain.

One idea to fix (3) and (4) was *explicitly* flagging aggregates as NULL-handling or NULL-ignoring, and also as what one might call "weakly strict", i.e. as returning NULL exactly if there were no non-NULL inputs. There are various variations of that theme possible - one flag for each behaviour, or simply a single "common behaviour" flag. In the end, I decided not to pursue that, mostly because the aggregates affected by that issue turned out to be relatively easy to fix. For the ones with state type internal, I simply added a counter, and I made SUM(int4) use AVG's transition function.

I don't feel too strongly either way on this one - I initially implemented the exception because I noticed that the NULL handling of some aggregates was broken otherwise, and it seemed simpler to fix this in one place than going over all the aggregates separately. OTOH, when I wrote the docs, I noticed how hard it was to describe the behaviour accurately, which made me like it less and less. And Dean wasn't happy with it at all, so that finally settled it.

best regards,
Florian Pflug
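[Editorial note: point (2) above can be sketched in Python (illustrative only, not the actual C code). The counter of non-NULL inputs is exactly what lets the final function return NULL rather than 0 once the last non-NULL input has been removed from the frame - without it, an invertible SUM could not distinguish "all inputs NULL" from "inputs summing to zero".]

```python
def sum_sfunc(state, x):
    """Forward transition: ignore NULLs, count non-NULL inputs."""
    total, nonnulls = state
    if x is None:
        return state
    return (total + x, nonnulls + 1)

def sum_invfunc(state, x):
    """Inverse transition: undo one input leaving the frame."""
    total, nonnulls = state
    if x is None:
        return state
    return (total - x, nonnulls - 1)

def sum_final(state):
    total, nonnulls = state
    # without the counter, we could not tell "all inputs NULL"
    # (result NULL) apart from "inputs summed to zero" (result 0)
    return total if nonnulls else None

state = (0, 0)
for v in [None, 3, None]:
    state = sum_sfunc(state, v)
state = sum_invfunc(state, 3)  # the only non-NULL input leaves the frame
print(sum_final(state))        # None, not 0
```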
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Florian Pflug writes:
> Which is why I feel that having two separate sets of transition functions
> and state types solves the wrong problem. If we find a way to prevent
> transition functions from being called directly, we'll shave a few cycles
> off quite a few existing aggregates, invertible or not. If we find a way
> around the initfunc-for-internal-statetype problem you discovered, the
> implementation would get much simpler, since we could then make nearly
> all of them strict. And this would again shave off a few cycles - for lots
> of NULL inputs, the effect could even be large.

Since neither of those latter things seems likely to happen, I don't find this argument convincing at all. Even if they did happen, they would represent an incremental improvement that would be equally useful with either one or two sfuncs.

Basically, this comes down to a design judgment call, and my judgment is differing from yours. In the absence of opinions from others, I'm going to do it my way.

However, quite independently of how many sfuncs there are, I'm interested about this:

> ... SUM(int4) wouldn't need
> *any* extra state at all to be invertible, if it weren't for those pesky
> issues surrounding NULL handling. In fact, an earlier version of the
> invtrans patch *did* use int4_sum as SUM(int4)'s transfer functions! The
> only reason this doesn't work nowadays is that Dean didn't like the
> forward-nonstrict-but-inverse-strict special case that made this work.

Tell me about the special case here again? Should we revisit the issue?

regards, tom lane
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 10 April 2014 22:52, Tom Lane wrote:
> Dean Rasheed writes:
>> I was imagining that firsttrans would only be passed the first value
>> to be aggregated, not any previous state, and that it would be illegal
>> to specify both an initcond and a firsttrans function.
>
>> The forward transition function would only be called for values after
>> the first, by which point the state would be non-null, and so it could
>> be made strict in most cases. The same would apply to the invertible
>> transition functions, so they wouldn't have to do null counting, which
>> in turn would make their state types simpler.
>
> I put together a very fast proof-of-concept patch for this (attached).
> It has a valid execution path for an aggregate with initfunc, but I didn't
> bother writing the CREATE AGGREGATE support yet. I made sum(int4) work
> as you suggest, marking the transfn strict and ripping out int4_sum's
> internal support for null inputs. The result seems to be about a 4% or so
> improvement in the overall aggregation speed, for a simple "SELECT
> sum(int4col) FROM table" query. So from a performance standpoint this
> seems only marginally worth doing. The real problem is not that 4% isn't
> worth the trouble, it's that AFAICS the only built-in aggregates that
> can benefit are sum(int2) and sum(int4). So that looks like a rather
> narrow use-case.
>
> You had suggested upthread that we could use this idea to make the
> transition functions strict for aggregates using "internal" transition
> datatypes, but that does not work because the initfunc would violate
> the safety rule that a function returning internal must take at least
> one internal-type argument. That puts a pretty strong damper on the
> usefulness of the approach, given how many internal-transtype aggregates
> we have (and the moving-aggregate case is not going to be better, is it?)

Ah, that's disappointing. If it can't be made to work for aggregates with internal state types, then I think it loses most of its value, and I don't think it will be of much more use in the moving aggregate case either.

> So at this point I'm feeling unexcited about the initfunc idea.
> Unless it does something really good for the moving-aggregate case,
> I think we should drop it.

Agreed. Thanks for prototyping it though.

Regards,
Dean
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr11, 2014, at 01:30 , Tom Lane wrote:
> Florian Pflug writes:
>> As for evidence - have you looked at the patch I posted? I'd be very
>> interested to know if it removes the performance differences you saw.
>
> (1) You can't really prove the absence of a performance issue by showing
> that one specific aggregate doesn't have an issue.

I'm claiming that SUM(int4) is about as simple as it gets, so if the effect can be mitigated there, it can be mitigated everywhere. The more complex a forward-only transition function is, the less will an added if or two hurt.

> (2) These results
> (mine as well as yours) seem mighty platform-dependent, so the fact you
> don't see an issue doesn't mean someone else won't.

Yeah, though *any* change - mine, the one you propose, and any other - has the potential to hurt some platform due to weird interactions (say, cache thrashing).

> (3) A new
> FunctionCallInfoData field just for this? Surely not. There's got to be
> a distributed cost to that (although I notice you didn't bother
> initializing the field most places, which is cheating).

I think the field doesn't actually increase the size of the structure at all - at least not if the bool before it has size 1 and the short following it is 2-byte aligned. Or at least that was why I made it a char, and added it at the place I did. But I might be missing something... Also, InitFunctionCallInfoData *does* initialize the flags to zero. Though maybe not everybody uses that - I didn't check, this was just a quick hack.

> Now point 3 could be addressed by doing the signaling in some other way
> with the existing context field. But it's still the case that you're
> trying to band-aid a bad design. There's no good reason to make the sfunc
> do extra work to be invertible in contexts where we know, with certainty,
> that that work is useless.

This is the principal point where we disagree, I think. From my POV, the problem isn't invertibility here at all. Heck, SUM(int4) wouldn't need *any* extra state at all to be invertible, if it weren't for those pesky issues surrounding NULL handling. In fact, an earlier version of the invtrans patch *did* use int4_sum as SUM(int4)'s transfer functions! The only reason this doesn't work nowadays is that Dean didn't like the forward-nonstrict-but-inverse-strict special case that made this work.

The way I see it, the main problem is the drop in performance that comes from using a pass-by-ref state type. Which IMHO happens because this *already* is a heavily band-aided design - all the state validation and "if (AggCheckCallContext(,NULL))" stuff really works around the fact that transition functions *aren't* supposed to be user-called, yet they are, and must deal.

Which is why I feel that having two separate sets of transition functions and state types solves the wrong problem. If we find a way to prevent transition functions from being called directly, we'll shave a few cycles off quite a few existing aggregates, invertible or not. If we find a way around the initfunc-for-internal-statetype problem you discovered, the implementation would get much simpler, since we could then make nearly all of them strict. And this would again shave off a few cycles - for lots of NULL inputs, the effect could even be large.

Compared to that, the benefit of having a completely separate set of transition functions and state types for the invertible case is much less tangible, IMHO.

> Especially not when we know that even a few instructions of extra work
> can be performance-significant.

But where do we draw that line? nodeWindowAgg.c quite certainly wastes about as many cycles as int4_avg_accum does on various checks that are unnecessary unless in the non-sliding window case. Do we duplicate those functions too?

best regards,
Florian Pflug
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Florian Pflug writes:
> My argument is that it costs us more complexity to duplicate everything
> for the invertible case, *and* the result seems less flexible - not
> from the POV of aggregate implementations, but from the POV of future
> extensions.

[ shrug... ] You can argue against any feature whatsoever by claiming that it might possibly conflict with something we would wish to do sometime in the future. I think you need to have a much more concrete argument about specific issues this will cause in order to be convincing. For all we know about ROLLUP/CUBE implementation issues right now, doing this feature with separate implementations might make that easier, not harder. (I note that the crux of my complaint right now is that we're asking sfuncs to serve two masters --- how's it going to be better when they have to serve three or four?)

> As for evidence - have you looked at the patch I posted? I'd be very
> interested to know if it removes the performance differences you saw.

(1) You can't really prove the absence of a performance issue by showing that one specific aggregate doesn't have an issue. (2) These results (mine as well as yours) seem mighty platform-dependent, so the fact you don't see an issue doesn't mean someone else won't. (3) A new FunctionCallInfoData field just for this? Surely not. There's got to be a distributed cost to that (although I notice you didn't bother initializing the field most places, which is cheating).

Now point 3 could be addressed by doing the signaling in some other way with the existing context field. But it's still the case that you're trying to band-aid a bad design. There's no good reason to make the sfunc do extra work to be invertible in contexts where we know, with certainty, that that work is useless. Especially not when we know that even a few instructions of extra work can be performance-significant.

regards, tom lane
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr11, 2014, at 00:07 , Tom Lane wrote:
> Florian Pflug writes:
>> I still think you're getting ahead of yourselves here. The number of
>> aggregates which benefit from this is tiny: SUM(int2,int4) and maybe
>> BOOL_{AND,OR}. And in the SUM(int2,int4) case *only* on 64-bit archs -
>> for the others, the state type is already pass-by-ref.
>
> That argument is reasonable for the initfunc idea, but it doesn't apply
> otherwise.

Why not? AFAICS, the increase in cost comes from going from a by-value to a by-reference state type. Once you're using a by-reference type, you already pay the overhead of the additional dereferences, and for calling AggCheckCallContext() or some equivalent.

>> Another reason I'm so opposed to this is that inverse transition
>> functions might not be the last kind of transition functions we ever
>> add. For example, if we ever get ROLLUP/CUBE, we might want to have
>> a mergefunc which takes two aggregation states and combines them
>> into one. What do we do if we add those?
>
> Make more pg_aggregate columns. What exactly do you think is either
> easier or harder about such cases? Also, maybe I'm misremembering
> the spec, but ROLLUP/CUBE wouldn't apply to window functions would
> they? So if your argument is based on the assumption that these
> features need to combine, I'm not sure it's true.

Well, it was just an example - there might be other future extensions which *do* need to combine. And we've been known to go beyond the spec sometimes...

> Furthermore, I do not buy the argument that if we hack hard enough,
> we can make the performance cost of forcing the sfunc to do double duty
> negligible. In the first place, that argument is unsupported by much
> evidence, and in the second place, it will certainly cost us complexity
> to make the performance issue go away. Instead we can just design the
> problem out, for nothing that I see as a serious drawback.

My argument is that it costs us more complexity to duplicate everything for the invertible case, *and* the result seems less flexible - not from the POV of aggregate implementations, but from the POV of future extensions.

As for evidence - have you looked at the patch I posted? I'd be very interested to know if it removes the performance differences you saw.

best regards,
Florian Pflug
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Florian Pflug writes: > I still think you're getting ahead of yourselves here. The number of > aggregates which benefit from this is tiny: SUM(int2,int4) and maybe > BOOL_{AND,OR}. And in the SUM(int2,int4) case *only* on 64-bit archs - > for the others, the state type is already pass-by-ref. That argument is reasonable for the initfunc idea, but it doesn't apply otherwise. > Another reason I'm so opposed to this is that inverse transition > functions might not be the last kind of transition functions we ever > add. For example, if we ever get ROLLUP/CUBE, we might want to have > a mergefunc which takes two aggregation states and combines them > into one. What do we do if we add those? Make more pg_aggregate columns. What exactly do you think is either easier or harder about such cases? Also, maybe I'm misremembering the spec, but ROLLUP/CUBE wouldn't apply to window functions, would they? So if your argument is based on the assumption that these features need to combine, I'm not sure it's true. The bottom line for me is that it seems conceptually far cleaner to make the moving-aggregate support be independent than to insist that it share an stype and sfunc with the plain case. Furthermore, I do not buy the argument that if we hack hard enough, we can make the performance cost of forcing the sfunc to do double duty negligible. In the first place, that argument is unsupported by much evidence, and in the second place, it will certainly cost us complexity to make the performance issue go away. Instead we can just design the problem out, for nothing that I see as a serious drawback. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr10, 2014, at 21:34 , Dean Rasheed wrote: > On 10 April 2014 19:54, Tom Lane wrote: >> So if we go with that terminology, perhaps these names for the >> new CREATE AGGREGATE parameters: >> >> initfunc applies to plain aggregation, mutually exclusive with >> initcond >> msfunc (or just mfunc?) forward transition for moving-agg mode >> mifunc inverse transition for moving-agg mode >> mstype state datatype for moving-agg mode >> msspace space estimate for mstype >> mfinalfunc final function for moving-agg mode >> minitfunc "firsttrans" for moving-agg mode >> minitcond mutually exclusive with minitfunc > > I think I prefer "mfunc" to "msfunc", but perhaps that's just my > natural aversion to the "ms" prefix :-) > > Also, perhaps "minvfunc" rather than "mifunc" because "i" by itself > could mean "initial". I still think you're getting ahead of yourselves here. The number of aggregates which benefit from this is tiny: SUM(int2,int4) and maybe BOOL_{AND,OR}. And in the SUM(int2,int4) case *only* on 64-bit archs - for the others, the state type is already pass-by-ref. I don't think we should be adding that much additional machinery until it has been conclusively demonstrated that the performance regression cannot be fixed any other way. Which, quite frankly, I don't believe. Nothing in int4_avg_accum looks particularly expensive, and the things that *do* seem to cost something certainly can be made cheaper. c.f. the patch I just posted. Another reason I'm so opposed to this is that inverse transition functions might not be the last kind of transition functions we ever add. For example, if we ever get ROLLUP/CUBE, we might want to have a mergefunc which takes two aggregation states and combines them into one. What do we do if we add those? Add yet another set of "mergable" transition functions? What about the various combinations of invertible/non-invertible and mergable/non-mergable that could result? The opportunity cost seems pretty high here... 
best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Dean Rasheed writes: > I was imagining that firsttrans would only be passed the first value > to be aggregated, not any previous state, and that it would be illegal > to specify both an initcond and a firsttrans function. > The forward transition function would only be called for values after > the first, by which point the state would be non-null, and so it could > be made strict in most cases. The same would apply to the invertible > transition functions, so they wouldn't have to do null counting, which > in turn would make their state types simpler. I put together a very quick proof-of-concept patch for this (attached). It has a valid execution path for an aggregate with initfunc, but I didn't bother writing the CREATE AGGREGATE support yet. I made sum(int4) work as you suggest, marking the transfn strict and ripping out int4_sum's internal support for null inputs. The result seems to be about a 4% or so improvement in the overall aggregation speed, for a simple "SELECT sum(int4col) FROM table" query. So from a performance standpoint this seems only marginally worth doing. The real problem is not that 4% isn't worth the trouble, it's that AFAICS the only built-in aggregates that can benefit are sum(int2) and sum(int4). So that looks like a rather narrow use-case. You had suggested upthread that we could use this idea to make the transition functions strict for aggregates using "internal" transition datatypes, but that does not work because the initfunc would violate the safety rule that a function returning internal must take at least one internal-type argument. That puts a pretty strong damper on the usefulness of the approach, given how many internal-transtype aggregates we have (and the moving-aggregate case is not going to be better, is it?) So at this point I'm feeling unexcited about the initfunc idea. Unless it does something really good for the moving-aggregate case, I think we should drop it. 
regards, tom lane diff --git a/src/backend/catalog/pg_aggregate.c b/src/backend/catalog/pg_aggregate.c index fe6dc8a..1ca5c8f 100644 *** a/src/backend/catalog/pg_aggregate.c --- b/src/backend/catalog/pg_aggregate.c *** AggregateCreate(const char *aggName, *** 390,395 --- 390,396 values[Anum_pg_aggregate_aggfnoid - 1] = ObjectIdGetDatum(procOid); values[Anum_pg_aggregate_aggkind - 1] = CharGetDatum(aggKind); values[Anum_pg_aggregate_aggnumdirectargs - 1] = Int16GetDatum(numDirectArgs); + values[Anum_pg_aggregate_agginitfn - 1] = ObjectIdGetDatum(InvalidOid); values[Anum_pg_aggregate_aggtransfn - 1] = ObjectIdGetDatum(transfn); values[Anum_pg_aggregate_aggfinalfn - 1] = ObjectIdGetDatum(finalfn); values[Anum_pg_aggregate_aggsortop - 1] = ObjectIdGetDatum(sortop); diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c index 7e4bca5..2671c49 100644 *** a/src/backend/executor/nodeAgg.c --- b/src/backend/executor/nodeAgg.c *** typedef struct AggStatePerAggData *** 152,157 --- 152,158 int numTransInputs; /* Oids of transfer functions */ + Oid initfn_oid; /* may be InvalidOid */ Oid transfn_oid; Oid finalfn_oid; /* may be InvalidOid */ *** typedef struct AggStatePerAggData *** 160,165 --- 161,167 * corresponding oid is not InvalidOid. Note in particular that fn_strict * flags are kept here. */ + FmgrInfo initfn; FmgrInfo transfn; FmgrInfo finalfn; *** typedef struct AggHashEntryData *** 296,301 --- 298,306 static void initialize_aggregates(AggState *aggstate, AggStatePerAgg peragg, AggStatePerGroup pergroup); + static void initialize_transition_value(AggState *aggstate, + AggStatePerAgg peraggstate, + AggStatePerGroup pergroupstate); static void advance_transition_function(AggState *aggstate, AggStatePerAgg peraggstate, AggStatePerGroup pergroupstate); *** initialize_aggregates(AggState *aggstate *** 403,408 --- 408,498 } /* + * Initialize the transition value upon reaching the first non-NULL input(s). 
+ * + * We use this code when the initcond is NULL and the transfn is strict, + * which otherwise would mean the transition value can never become non-null. + * If an initfn has been provided, call it on the current input value(s); + * otherwise, take the current input value as the new transition value. + * (In the latter case, we already checked that this is okay datatype-wise.) + */ + static void + initialize_transition_value(AggState *aggstate, + AggStatePerAgg peraggstate, + AggStatePerGroup pergroupstate) + { + FunctionCallInfo tfcinfo = &peraggstate->transfn_fcinfo; + MemoryContext oldContext; + + if (OidIsValid(peraggstate->initfn_oid)) + { + FunctionCallInfoData fcinfo; + int numInitArgs; + Datum newVal; + + /* We run the transition functions in per-input-tuple memory context *
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr10, 2014, at 02:13 , Florian Pflug wrote: > On Apr9, 2014, at 23:17 , Florian Pflug wrote: >> On Apr9, 2014, at 21:35 , Tom Lane wrote: >>> A quick test says that avg(int4) >>> is about five percent slower than sum(int4), so that's the kind of hit >>> we'd be taking on non-windowed aggregations if we do it like this. >> >> That's rather surprising though. AFAICS, there isn't much difference >> between the two transfer functions int4_sum and int4_avg_accum at all. >> >> The differences come down to (+ denoting things which ought to make >> int4_avg_accum slower compared to int4_sum, - denoting the opposite) >> >> 1. +) int4_avg_accum calls AggCheckCallContext >> 2. -) int4_sum checks if the state is NULL (it never is for int4_avg_accum) >> 3. +) int4_avg_accum uses ARR_HASNULL, ARR_SIZE, ARR_OVERHEAD_NONULLS >> to verify that the state is a 2-element array without NULL entries >> 4. -) int4_sum checks if the input is NULL > > I've done a bit of profiling on this (using Instruments.app on OSX). One > thing I missed is that int4_avg_accum also calls pg_detoast_datum - that > call comes from the PG_GETARG_ARRAYTYPE_P macro. Doing that is a bit silly, > since we know that the datum cannot possibly be toasted I think (or if it > was, nodeAgg.c should do the de-toasting). > > The profile also attributes a rather large percent of the total runtime of > int4_avg_accum (around 30%) to the call of AggCheckCallContext(). Getting > rid of that would help quite a few transition functions, invertible or not. > That certainly seems doable - we'd need a way to mark functions as internal > support functions, and prevent user-initiated calls of such functions. > Transition functions marked that way could then safely scribble over their > state arguments without having to consult AggCheckCallContext() first. > > ... > > However, I still believe the best approach at this point is to just work > on making int4_avg_accum faster. 
I still see no principled reason why it > has to be noticeably slower - the only additional work it absolutely *has* > to perform is *one* 64-bit increment. I played with this a bit. Currently, int4_avg_accum invokes AggCheckCallContext every time, and also repeatedly checks whether the array has the required dimension - which, incidentally, is the only big difference between int4_avg_accum and int4_sum. To avoid that, I added a flags field to fcinfo which nodeAgg uses to tell transition functions whether they're called for the first time, or are being called with whatever state they themselves returned last time, i.e. the n-th time. If the n-th time flag is set, int4_avg_accum simply retrieves the state with PG_GETARG_DATUM() instead of PG_GETARG_ARRAYTYPE_P(), relying on the fact that it never returns toasted datums itself, and also doesn't bother to validate the array size, for the same reason. If the flag is not set, it uses PG_GETARG_ARRAYTYPE_COPY_P and does validate the array size. In theory, it could further distinguish between a 1st call in an aggregation context (where the copy is unnecessary), and a user-initiated call (i.e. outside an aggregation). But that seems unnecessary - one additional copy per aggregation group seems unlikely to be a problem. With this in place, instruction-level profiling using Apple's Instruments.app shows that int4_avg_accum takes about 1.5% of the total runtime of a simple aggregation, while int4_sum takes about 1.2%. A (very rough) patch is attached. I haven't been able to repeat Tom's measurement which shows a 5% difference in performance between the invertible and the non-invertible versions of SUM(int4), so I cannot say if this removes that. But the profiling I've done would certainly indicate it should. 
best regards, Florian Pflug invtrans_sumint4_opt2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 10 April 2014 19:54, Tom Lane wrote: > Dean Rasheed writes: >> On 10 April 2014 19:04, Tom Lane wrote: >>> What about names for the invertible-aggregate infrastructure? >>> I'm tempted to prefix "inv" to all the existing names, but then >>> "invsfunc" means the alternate forward function ... can we use >>> "invifunc" for the inverse transition function? Or maybe the >>> prefix should be just "i". > >> Hmm, I'm not a fan of any of those names. Perhaps "win" as a prefix to >> denote a sliding window? Or just "m" for "moving aggregate". > > Hmm ... "moving aggregate" is actually a pretty good name for this > whole feature -- better than "invertible aggregate" anyway. I can > feel a global-search-and-replace coming on. > > So if we go with that terminology, perhaps these names for the > new CREATE AGGREGATE parameters: > > initfunc applies to plain aggregation, mutually exclusive with initcond > msfunc (or just mfunc?) forward transition for moving-agg mode > mifunc inverse transition for moving-agg mode > mstype state datatype for moving-agg mode > msspace space estimate for mstype > mfinalfunc final function for moving-agg mode > minitfunc "firsttrans" for moving-agg mode > minitcond mutually exclusive with minitfunc > Yeah, those work for me. I think I prefer "mfunc" to "msfunc", but perhaps that's just my natural aversion to the "ms" prefix :-) Also, perhaps "minvfunc" rather than "mifunc" because "i" by itself could mean "initial". Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Dean Rasheed writes: > On 10 April 2014 19:04, Tom Lane wrote: >> What about names for the invertible-aggregate infrastructure? >> I'm tempted to prefix "inv" to all the existing names, but then >> "invsfunc" means the alternate forward function ... can we use >> "invifunc" for the inverse transition function? Or maybe the >> prefix should be just "i". > Hmm, I'm not a fan of any of those names. Perhaps "win" as a prefix to > denote a sliding window? Or just "m" for "moving aggregate". Hmm ... "moving aggregate" is actually a pretty good name for this whole feature -- better than "invertible aggregate" anyway. I can feel a global-search-and-replace coming on. So if we go with that terminology, perhaps these names for the new CREATE AGGREGATE parameters:

initfunc      applies to plain aggregation, mutually exclusive with initcond
msfunc        (or just mfunc?) forward transition for moving-agg mode
mifunc        inverse transition for moving-agg mode
mstype        state datatype for moving-agg mode
msspace       space estimate for mstype
mfinalfunc    final function for moving-agg mode
minitfunc     "firsttrans" for moving-agg mode
minitcond     mutually exclusive with minitfunc

That takes us up to 16 columns in pg_aggregate, but it's still not going to be a very voluminous catalog --- there's only 171 rows there today. So I'm not particularly concerned about space, and if there's a chance of squeezing out cycles, I think we should seize it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
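Under the naming proposed above, a user-defined moving aggregate might be declared something like this. This is a hypothetical sketch only - at this point in the thread none of these CREATE AGGREGATE options exist yet, and the support functions (msum_sfunc, msum_invfunc) are invented for illustration:

```sql
CREATE FUNCTION msum_sfunc(int8, int4) RETURNS int8
    AS 'SELECT $1 + $2' LANGUAGE SQL STRICT;
CREATE FUNCTION msum_invfunc(int8, int4) RETURNS int8
    AS 'SELECT $1 - $2' LANGUAGE SQL STRICT;

CREATE AGGREGATE msum(int4) (
    sfunc     = msum_sfunc,        -- plain aggregation path
    stype     = int8,
    initcond  = '0',
    msfunc    = msum_sfunc,        -- forward transition, moving-agg mode
    mifunc    = msum_invfunc,      -- inverse transition, moving-agg mode
    mstype    = int8,
    minitcond = '0'
);
```

Since sum's forward transition is the same in both modes here, only the inverse function is genuinely new; aggregates with a heavier moving-mode state would supply distinct msfunc/mstype.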
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 10 April 2014 19:04, Tom Lane wrote: > Dean Rasheed writes: >> On 10 April 2014 15:18, Tom Lane wrote: >>> This idea of a separate firsttrans function is interesting but perhaps >>> orthogonal to the current patch. Also, I don't quite understand how >>> it would work for aggregates with null initvalues; don't you end up >>> with exactly the same conflict about how you can't mark the transfn >>> strict? Or is the idea that firsttrans would *only* apply to aggregates >>> with null initvalue, and so you wouldn't even pass the previous state >>> value to it? > >> I was imagining that firsttrans would only be passed the first value >> to be aggregated, not any previous state, and that it would be illegal >> to specify both an initcond and a firsttrans function. > > Got it. So the existing behavior where we insert the first non-null > value could be seen as a degenerate case in which the firsttrans function > is an identity. > Right. >> The forward transition function would only be called for values after >> the first, by which point the state would be non-null, and so it could >> be made strict in most cases. The same would apply to the invertible >> transition functions, so they wouldn't have to do null counting, which >> in turn would make their state types simpler. > > Makes sense to me. We need names for these things though. I don't > think abbreviating to "ffunc" is a good idea because of the likelihood > of confusion with the finalfunc (indeed I see the CREATE AGGREGATE > ref page is already using "ffunc" as a short form for finalfunc). > Maybe "initfunc", which would parallel "initcond"? > Yes, I was already thinking that "initfunc" is actually a much better name for that. > What about names for the invertible-aggregate infrastructure? > I'm tempted to prefix "inv" to all the existing names, but then > "invsfunc" means the alternate forward function ... can we use > "invifunc" for the inverse transition function? Or maybe the > prefix should be just "i". 
> Hmm, I'm not a fan of any of those names. Perhaps "win" as a prefix to denote a sliding window? Or just "m" for "moving aggregate". Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Dean Rasheed writes: > On 10 April 2014 15:18, Tom Lane wrote: >> This idea of a separate firsttrans function is interesting but perhaps >> orthogonal to the current patch. Also, I don't quite understand how >> it would work for aggregates with null initvalues; don't you end up >> with exactly the same conflict about how you can't mark the transfn >> strict? Or is the idea that firsttrans would *only* apply to aggregates >> with null initvalue, and so you wouldn't even pass the previous state >> value to it? > I was imagining that firsttrans would only be passed the first value > to be aggregated, not any previous state, and that it would be illegal > to specify both an initcond and a firsttrans function. Got it. So the existing behavior where we insert the first non-null value could be seen as a degenerate case in which the firsttrans function is an identity. > The forward transition function would only be called for values after > the first, by which point the state would be non-null, and so it could > be made strict in most cases. The same would apply to the invertible > transition functions, so they wouldn't have to do null counting, which > in turn would make their state types simpler. Makes sense to me. We need names for these things though. I don't think abbreviating to "ffunc" is a good idea because of the likelihood of confusion with the finalfunc (indeed I see the CREATE AGGREGATE ref page is already using "ffunc" as a short form for finalfunc). Maybe "initfunc", which would parallel "initcond"? What about names for the invertible-aggregate infrastructure? I'm tempted to prefix "inv" to all the existing names, but then "invsfunc" means the alternate forward function ... can we use "invifunc" for the inverse transition function? Or maybe the prefix should be just "i". regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 10 April 2014 15:18, Tom Lane wrote: > Dean Rasheed writes: >> On 10 April 2014 01:13, Florian Pflug wrote: >>> However, I still believe the best approach at this point is to just work >>> on making int4_avg_accum faster. I still see no principled reason why it >>> has to be noticeably slower - the only additional work it absolutely *has* >>> to perform is *one* 64-bit increment. > >> In the best case that would make sum() not noticeably slower than >> avg(), whereas using a firsttrans/initialfunction would potentially >> make both of them faster than they currently are, and not just in >> window queries. > > I'm still of the opinion that we should separate the transfn for > invertible cases from the normal one, and allow for two separate > state types. One of the things that helps with is the strictness > consideration: you no longer have to have the same strictness > setting for the plain and invertible forward transfns. > Yes, I can imagine that there would be some aggregates for which the plain forwards transition function would be simpler than the invertible one, with a simpler state type. You'd still be left with quite a large number of existing aggregates having non-strict plain transition functions, in addition to a bunch of new non-strict invertible transition functions that had to do null counting. > This idea of a separate firsttrans function is interesting but perhaps > orthogonal to the current patch. Also, I don't quite understand how > it would work for aggregates with null initvalues; don't you end up > with exactly the same conflict about how you can't mark the transfn > strict? Or is the idea that firsttrans would *only* apply to aggregates > with null initvalue, and so you wouldn't even pass the previous state > value to it? > I was imagining that firsttrans would only be passed the first value to be aggregated, not any previous state, and that it would be illegal to specify both an initcond and a firsttrans function. 
The forward transition function would only be called for values after the first, by which point the state would be non-null, and so it could be made strict in most cases. The same would apply to the invertible transition functions, so they wouldn't have to do null counting, which in turn would make their state types simpler. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Dean Rasheed writes: > On 10 April 2014 01:13, Florian Pflug wrote: >> However, I still believe the best approach at this point is to just work >> on making int4_avg_accum faster. I still see no principled reason why it >> has to be noticeably slower - the only additional work it absolutely *has* >> to perform is *one* 64-bit increment. > In the best case that would make sum() not noticeably slower than > avg(), whereas using a firsttrans/initialfunction would potentially > make both of them faster than they currently are, and not just in > window queries. I'm still of the opinion that we should separate the transfn for invertible cases from the normal one, and allow for two separate state types. One of the things that helps with is the strictness consideration: you no longer have to have the same strictness setting for the plain and invertible forward transfns. This idea of a separate firsttrans function is interesting but perhaps orthogonal to the current patch. Also, I don't quite understand how it would work for aggregates with null initvalues; don't you end up with exactly the same conflict about how you can't mark the transfn strict? Or is the idea that firsttrans would *only* apply to aggregates with null initvalue, and so you wouldn't even pass the previous state value to it? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Thu, Apr 10, 2014 at 9:55 AM, Tom Lane wrote: > Florian Pflug writes: > > I was (and still am) not in favour of duplicating the whole quadruple of > > (state, initialvalue, transferfunction, finalfunction) because it seems > > excessive. In fact, I believed that doing this would probably be grounds > for > > outright rejection of the patch, on the base of catalog bloat. And your > > initial response to this suggestion seemed to confirm this. > > Well, I think it's much more likely that causing a performance penalty for > cases unrelated to window aggregates would lead to outright rejection :-(. > The majority of our users probably don't ever use window functions, but > for sure they've heard of SUM(). We can't penalize the non-window case. > > Expanding pg_aggregate from 10 columns (as per patch) to 14 (as per this > suggestion) is a little annoying but it doesn't sound like a show stopper. > It seems reasonable to assume that the extra initval would be NULL in most > cases, so it's probably a net addition of 12 bytes per row. > > I also wouldn't imagine that the overhead of storing that would be too great... And are there really any databases out there that have 1000's of custom aggregate functions? I'm actually quite glad to see someone agrees with me on this. I think it opens up quite a few extra optimisation opportunities with things like MAX and MIN... In these cases we could track the number of occurrences of the current max, resetting the count when a bigger value arrives. That way we could report the inverse transition as successful if maxcount is still above 0 after the removal of a max value... Similar to how I implemented the inverse transition for sum(numeric). In fact doing it this way would mean that inverse transitions for sum(numeric) would never fail and retry. I just thought we had gotten to a stage of not requiring this due to the overheads being so low... I was quite surprised to see the count tracking account for 5% for sum(int). 
What I don't quite understand yet is why we can't just create a new function for int inverse transitions instead of borrowing the inverse transition functions for avg...? Regards David Rowley
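The MAX idea sketched in the previous message - track how many input values equal the current maximum, so an inverse transition can succeed whenever the removed value isn't the last copy of the max - can be modeled like this. Names and state representation are illustrative only, not the patch's actual code:

```c
#include <assert.h>
#include <stdbool.h>

/* State: current max, how many inputs equal it, and whether any input
 * has been seen at all. */
typedef struct MaxState { int max; int maxcount; bool empty; } MaxState;

/* Forward transition: a new, larger value resets the count to 1. */
static void max_trans(MaxState *s, int v)
{
    if (s->empty || v > s->max) { s->max = v; s->maxcount = 1; s->empty = false; }
    else if (v == s->max) s->maxcount++;
}

/* Inverse transition: returns false when the last copy of the max is
 * removed, meaning the aggregate must be restarted over the window. */
static bool max_inv_trans(MaxState *s, int v)
{
    if (v < s->max) return true;                         /* state unchanged */
    if (s->maxcount > 1) { s->maxcount--; return true; } /* drop one copy */
    return false;                                        /* restart needed */
}

/* Build up {3, 7, 7, 2}, then remove 3, 2, and one 7 - all handled
 * without a restart; removing the final 7 correctly reports failure. */
static int max_demo(void)
{
    MaxState s = {0, 0, true};
    int vals[] = {3, 7, 7, 2};
    for (int i = 0; i < 4; i++) max_trans(&s, vals[i]);
    if (!max_inv_trans(&s, 3)) return -1;
    if (!max_inv_trans(&s, 2)) return -2;
    if (!max_inv_trans(&s, 7)) return -3;   /* maxcount drops to 1 */
    if (max_inv_trans(&s, 7))  return -4;   /* last max: must restart */
    return s.max;
}
```

As with the sum(numeric) approach mentioned above, the inverse transition is "best effort": it succeeds cheaply in the common case and falls back to recomputing the window only when the maximum itself leaves the frame.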
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Thu, Apr 10, 2014 at 9:55 AM, Tom Lane wrote: > > > I'm pretty sure David Rowley did some benchmarking. The results should be > > in this thread somewhere I think, but they currently evade me... Maybe > David > > can re-post, if he's following this... > > I saw benchmarks addressing window aggregation, but none looking for > side-effects on plain aggregation. > > These ones maybe? http://www.postgresql.org/message-id/CAApHDvr_oSpvM-XXz43eCMX8n0EfshJ=9j+rxvgqcy91yr-...@mail.gmail.com I think it was only around SUM(numeric), and you'll need to scroll down a bit to find it as it's mixed in with a load of window agg tests. At the time it was the only forward transition function that I had added any overhead to, so I think it was the only one I bothered to test at the time. However, I do remember benchmarking the bool_and and bool_or changes after I rewrote the way they worked and also found the same as you... no difference, I just can't find the post with my results... Though it sounds like Tom found the same as me so I don't think it's worth me looking any more for them. Regards David Rowley
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 10 April 2014 01:13, Florian Pflug wrote: > Also, the *only* reason that SUM(int2|int4) cannot use int8 as its > transition type is that it needs to return NULL, not 0, if zero rows > were aggregated. It might seem that it could just use int8 as state > with initial value NULL, but that only works if the transition functions > are non-strict (since the special case of state type == input type doesn't > apply here). And non-strict transition functions need to deal with > NULL inputs themselves, which means counting the number of non-NULL inputs.. > > That problem would go away though if we had support for an initialfunction, > which would receive the first input value and initialize the state. In > the case of SUM(), the initial function would be strict, and thus would be > called on the first non-NULL input value, which it'd convert to int8 and > return as the new state. > Ah snap! > However, I still believe the best approach at this point is to just work > on making int4_avg_accum faster. I still see no principled reason why it > has to be noticeably slower - the only additional work it absolutely *has* > to perform is *one* 64-bit increment. > In the best case that would make sum() not noticeably slower than avg(), whereas using a firsttrans/initialfunction would potentially make both of them faster than they currently are, and not just in window queries. Also, IMO a first/initial function leads to a cleaner separation of logic, allowing some of the transition functions to be simplified rather than becoming more complex. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 9 April 2014 22:55, Tom Lane wrote: > Florian Pflug writes: >> I was (and still am) not in favour of duplicating the whole quadruple of >> (state, initialvalue, transferfunction, finalfunction) because it seems >> excessive. In fact, I believed that doing this would probably be grounds for >> outright rejection of the patch, on the base of catalog bloat. And your >> initial response to this suggestion seemed to confirm this. > > Well, I think it's much more likely that causing a performance penalty for > cases unrelated to window aggregates would lead to outright rejection :-(. > The majority of our users probably don't ever use window functions, but > for sure they've heard of SUM(). We can't penalize the non-window case. > > Expanding pg_aggregate from 10 columns (as per patch) to 14 (as per this > suggestion) is a little annoying but it doesn't sound like a show stopper. > It seems reasonable to assume that the extra initval would be NULL in most > cases, so it's probably a net addition of 12 bytes per row. > >> On Apr9, 2014, at 20:20 , Tom Lane wrote: >>> The patch has in fact already done that to a couple of basic aggregates like >>> sum(int4). Has anyone bothered to test what side-effects that has on >>> non-windowed aggregation performance? > >> I'm pretty sure David Rowley did some benchmarking. The results should be >> in this thread somewhere I think, but they currently evade me... Maybe David >> can re-post, if he's following this... > > I saw benchmarks addressing window aggregation, but none looking for > side-effects on plain aggregation. > >> If we really go down that road (and I'm far from convinced), then maybe >> instead of having a bunch of additional fields, we could have separate >> entries in pg_aggregate for the two cases, with links between them. > > That seems like a complete mess; in particular it would break the primary > key for pg_aggregate (aggfnoid), and probably break every existing query > that looks at pg_aggregate. 
Some extra fields would not break such > expectations (in fact we've added fields to pg_aggregate in the past). > This may initially sound unrelated, but I think it might address some of these issues. Suppose we added a 'firsttrans' function that took a single argument (the first value to be aggregated) and was responsible for creating the initial state from that first value. This would apply to aggregates that ignore null values, but whose transition function cannot currently be declared strict (either because the state type is internal, or because it is not the same as the aggregate's argument type). I think quite a lot of the existing aggregates fall into this category, and this would allow their transition functions to be made strict and simplified --- no more testing if the state is null, and then building it, and no more testing if the argument is null and ignoring it. That might give a noticeable performance boost in the regular aggregate case, especially over data containing nulls. But in addition, it would help with writing inverse transition functions because if the transition functions could be made strict, they wouldn't need to do null-counting, which would mean that their state types would not need to be expanded. So for example sum(int4) could continue to have int8 as its state type, it could use int8(int4) as its firsttrans function, and int4_sum() could become strict and lose all its null-handling logic. Then int4_sum_inv() would be trivial to write - just doing the same in reverse. I'm not sure it helps for all aggregates, but there are certainly some where it would seem to simplify things. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
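Dean's 'firsttrans' function is a proposal, not an existing CREATE AGGREGATE option, but the intended division of labour can be sketched outside the backend (Python standing in for the C machinery; all names here are illustrative). The driver itself enforces strictness, `firsttrans` builds the state from the first non-NULL input, and the transition function never has to test for a NULL state or NULL input:

```python
def aggregate_strict(values, firsttrans, transfn):
    """Toy driver for the proposed scheme: NULL (None) inputs are skipped
    by the driver, firsttrans builds the initial state from the first
    non-NULL value, and transfn never sees a NULL state or input."""
    state = None
    for v in values:
        if v is None:
            continue                 # strictness handled by the driver
        if state is None:
            state = firsttrans(v)    # e.g. int8(int4): a widening cast
        else:
            state = transfn(state, v)
    return state                     # still None => the result is NULL

# sum(int4)-style aggregate keeping a plain int8-style state
firsttrans = lambda v: v             # widen the first value into the state
transfn = lambda s, v: s + v         # strict transition, no NULL checks

print(aggregate_strict([None, 3, None, 4], firsttrans, transfn))  # 7
print(aggregate_strict([None, None], firsttrans, transfn))        # None
```

Note how the all-NULL case falls out for free: the state is never created, so the result is NULL without the transition function counting anything.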
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr9, 2014, at 23:17 , Florian Pflug wrote: > On Apr9, 2014, at 21:35 , Tom Lane wrote: >> A quick test says that avg(int4) >> is about five percent slower than sum(int4), so that's the kind of hit >> we'd be taking on non-windowed aggregations if we do it like this. > > That's rather surprising though. AFAICS, there isn't much difference > between the two transfer functions int4_sum and int4_avg_accum at all. > > The differences come down to (+ denoting things which ought to make > int4_avg_accum slower compared to int4_sum, - denoting the opposite) > > 1. +) int4_avg_accum calls AggCheckCallContext > 2. -) int4_sum checks if the state is NULL (it never is for int4_avg_accum) > 3. +) int4_avg_accum uses ARR_HASNULL, ARR_SIZE, ARR_OVERHEAD_NONULLS > to verify that the state is a 2-element array without NULL entries > 4. -) int4_sum checks if the input is NULL I've done a bit of profiling on this (using Instruments.app on OSX). One thing I missed is that int4_avg_accum also calls pg_detoast_datum - that call comes from the PG_GETARG_ARRAYTYPE_P macro. Doing that is a bit silly, since we know that the datum cannot possibly be toasted I think (or if it was, nodeAgg.c should do the de-toasting). The profile also attributes a rather large percentage of the total runtime of int4_avg_accum (around 30%) to the call of AggCheckCallContext(). Getting rid of that would help quite a few transition functions, invertible or not. That certainly seems doable - we'd need a way to mark functions as internal support functions, and prevent user-initiated calls of such functions. Transition functions marked that way could then safely scribble over their state arguments without having to consult AggCheckCallContext() first. I've also compared the disassemblies of int4_sum and int4_avg_accum, and apart from these differences, they are pretty similar. 
Also, the *only* reason that SUM(int2|int4) cannot use int8 as its transition type is that it needs to return NULL, not 0, if zero rows were aggregated. It might seem that it could just use int8 as state with initial value NULL, but that only works if the transition functions are non-strict (since the special case of state type == input type doesn't apply here). And non-strict transition functions need to deal with NULL inputs themselves, which means counting the number of non-NULL inputs. That problem would go away though if we had support for an initial function, which would receive the first input value and initialize the state. In the case of SUM(), the initial function would be strict, and thus would be called on the first non-NULL input value, which it'd convert to int8 and return as the new state. However, I still believe the best approach at this point is to just work on making int4_avg_accum faster. I still see no principled reason why it has to be noticeably slower - the only additional work it absolutely *has* to perform is *one* 64-bit increment. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
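Florian's point about why an invertible, NULL-aware SUM needs an expanded state can be illustrated with a small sketch (Python stands in for the C transition functions; the function and class names are made up for illustration). A non-strict transition function must count non-NULL inputs itself, both so the final function can distinguish "no inputs" (NULL result) from "sum happens to be zero", and so the inverse transition can undo removals exactly:

```python
class SumState:
    """Expanded state for an invertible, NULL-aware SUM: the running
    total plus the number of non-NULL inputs currently aggregated."""
    def __init__(self):
        self.total = 0
        self.count = 0

def sum_trans(state, v):        # non-strict forward transition
    if v is not None:           # must skip NULL inputs itself
        state.total += v
        state.count += 1
    return state

def sum_inv(state, v):          # inverse transition: exact undo
    if v is not None:
        state.total -= v
        state.count -= 1
    return state

def sum_final(state):           # NULL if no non-NULL input remains
    return state.total if state.count > 0 else None

s = SumState()
for v in [1, None, 2]:
    sum_trans(s, v)
print(sum_final(s))             # 3
for v in [1, None, 2]:
    sum_inv(s, v)
print(sum_final(s))             # None: back to "zero rows aggregated"
```

With a strict transition plus an initial function, the `count` field (and the NULL checks inside the transition) would disappear, which is exactly the simplification being discussed.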
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Florian Pflug writes: > I was (and still am) not in favour of duplicating the whole quadruple of > (state, initialvalue, transferfunction, finalfunction) because it seems > excessive. In fact, I believed that doing this would probably be grounds for > outright rejection of the patch, on the base of catalog bloat. And your > initial response to this suggestion seemed to confirm this. Well, I think it's much more likely that causing a performance penalty for cases unrelated to window aggregates would lead to outright rejection :-(. The majority of our users probably don't ever use window functions, but for sure they've heard of SUM(). We can't penalize the non-window case. Expanding pg_aggregate from 10 columns (as per patch) to 14 (as per this suggestion) is a little annoying but it doesn't sound like a show stopper. It seems reasonable to assume that the extra initval would be NULL in most cases, so it's probably a net addition of 12 bytes per row. > On Apr9, 2014, at 20:20 , Tom Lane wrote: >> The patch has in fact already done that to a couple of basic aggregates like >> sum(int4). Has anyone bothered to test what side-effects that has on >> non-windowed aggregation performance? > I'm pretty sure David Rowley did some benchmarking. The results should be > in this thread somewhere I think, but they currently evade me... Maybe David > can re-post, if he's following this... I saw benchmarks addressing window aggregation, but none looking for side-effects on plain aggregation. > If we really go down that road (and I'm far from convinced), then maybe > instead of having a bunch of additional fields, we could have separate > entries in pg_aggregate for the two cases, with links between them. That seems like a complete mess; in particular it would break the primary key for pg_aggregate (aggfnoid), and probably break every existing query that looks at pg_aggregate. 
Some extra fields would not break such expectations (in fact we've added fields to pg_aggregate in the past). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr9, 2014, at 20:20 , Tom Lane wrote: > There was discussion upthread of providing > two separate forward transition functions, but Florian argued that that > would do nothing that you couldn't accomplish with a runtime check in > the forward function. I think that's nonsense though, because one of > the key points here is that an invertible aggregate may require a more > complex transition state data structure --- in particular, if you're > forced to go from a pass-by-value to a pass-by-reference data type, right > there you are going to take a big hit in aggregate performance, and there > is no way for the forward transition function to avoid it. To be precise, my point was that *only* having a separate non-invertible forward transition function is pointless, exactly because of the reason you gave - that won't allow a cheaper state representation for the non-invertible case. So all such a non-invertible forward transition function can do is to skip some bookkeeping that's required by the inverse transition function - and *that* can just as easily be accomplished by a runtime check. I was (and still am) not in favour of duplicating the whole quadruple of (state, initialvalue, transferfunction, finalfunction) because it seems excessive. In fact, I believed that doing this would probably be grounds for outright rejection of the patch, on the base of catalog bloat. And your initial response to this suggestion seemed to confirm this. > The patch has in fact already done that to a couple of basic aggregates like > sum(int4). Has anyone bothered to test what side-effects that has on > non-windowed aggregation performance? I'm pretty sure David Rowley did some benchmarking. The results should be in this thread somewhere I think, but they currently evade me... Maybe David can re-post, if he's following this... 
> I think what'd make sense is to have a separate forward function *and* > separate state datatype to be used when we want invertible aggregation > (hm, and I guess a separate final function too). So an aggregate > definition would include two entirely independent implementations. > If they chance to share sfunc and stype, fine, but they don't have to. > > This would mean we'd need some new names for the doppelgangers of the > CREATE AGGREGATE parameters sfunc, stype, sspace, finalfunc, and initcond > (but not sortop). I guess that'd open up a chance to use a more uniform > naming scheme, but I'm not too sure what would be good. If we really go down that road (and I'm far from convinced), then maybe instead of having a bunch of additional fields, we could have separate entries in pg_aggregate for the two cases, with links between them. The non-invertible aggregates would have something like "_noninv" or so appended to their name, and we'd automatically use them if we know we won't need to remove entries from the aggregation state. That would also allow the users to *force* the non-invertible aggregate to be used by simply saying "SUM_NONINV" instead of "SUM". Then all we'd need would be an additional OID field that links the invertible to the non-invertible definition. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr9, 2014, at 21:35 , Tom Lane wrote: > As a quick check, I compared aggregation performance in HEAD, non-assert > builds, with and without --disable-float8-byval on a 64-bit machine. > So this tests replacing a pass-by-val transition datatype with a > pass-by-ref one without any other changes. There's essentially no > difference in performance of sum(int4), AFAICT, but that's because > int4_sum goes out of its way to cheat and avoid palloc overhead. > I looked to the bit_and() aggregates to see what would happen to > an aggregate not thus optimized. As expected, int4 and int8 bit_and > are just about the same speed if int8 is pass by value ... but if it's > pass by ref, the int8 case is a good 60% slower. True, but that just means that aggregate transition functions really ought to update the state in-place, no? > So added palloc overhead, at least, is a no-go. I see that the patched > version of sum(int4) avoids that trap, but nonetheless it's replaced a > pretty cheap transition function with a less cheap function, namely the > function previously used for avg(int4). A quick test says that avg(int4) > is about five percent slower than sum(int4), so that's the kind of hit > we'd be taking on non-windowed aggregations if we do it like this. That's rather surprising though. AFAICS, there isn't much difference between the two transfer functions int4_sum and int4_avg_accum at all. The differences come down to (+ denoting things which ought to make int4_avg_accum slower compared to int4_sum, - denoting the opposite) 1. +) int4_avg_accum calls AggCheckCallContext 2. -) int4_sum checks if the state is NULL (it never is for int4_avg_accum) 3. +) int4_avg_accum uses ARR_HASNULL, ARR_SIZE, ARR_OVERHEAD_NONULLS to verify that the state is a 2-element array without NULL entries 4. -) int4_sum checks if the input is NULL The number of conditional branches should be about the same (and all are seldom taken). The validity checks on the state array, i.e. 
(3), should be rather cheap I think - not quite as cheap as PG_ARGISNULL maybe, but not so much more expensive either. That leaves the AggCheckCallContext call. If that call costs us 5%, maybe we can find a way to make it faster, or get rid of it entirely? Still, that seems a lot for a call to a not-very-complex function, though... I'll go and check the disassembly - maybe something in int4_avg_accum turns out to be more complex than is immediately obvious. I'll also try to create a call profile, unless you already have one from your test runs. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
I wrote: > ... an invertible aggregate may require a more > complex transition state data structure --- in particular, if you're > forced to go from a pass-by-value to a pass-by-reference data type, right > there you are going to take a big hit in aggregate performance, and there > is no way for the forward transition function to avoid it. The patch > has in fact already done that to a couple of basic aggregates like > sum(int4). Has anyone bothered to test what side-effects that has on > non-windowed aggregation performance? As a quick check, I compared aggregation performance in HEAD, non-assert builds, with and without --disable-float8-byval on a 64-bit machine. So this tests replacing a pass-by-val transition datatype with a pass-by-ref one without any other changes. There's essentially no difference in performance of sum(int4), AFAICT, but that's because int4_sum goes out of its way to cheat and avoid palloc overhead. I looked to the bit_and() aggregates to see what would happen to an aggregate not thus optimized. As expected, int4 and int8 bit_and are just about the same speed if int8 is pass by value ... but if it's pass by ref, the int8 case is a good 60% slower. So added palloc overhead, at least, is a no-go. I see that the patched version of sum(int4) avoids that trap, but nonetheless it's replaced a pretty cheap transition function with a less cheap function, namely the function previously used for avg(int4). A quick test says that avg(int4) is about five percent slower than sum(int4), so that's the kind of hit we'd be taking on non-windowed aggregations if we do it like this. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Dean Rasheed writes: > OK, I'm marking this ready for committer attention, on the > understanding that that doesn't include the invtrans_minmax patch. I've started to look at this patch set. After rereading the thread, I'm thinking that it's a mistake to just add the inverse transition function and require it to work with the standard forward transition function for the aggregate. There was discussion upthread of providing two separate forward transition functions, but Florian argued that that would do nothing that you couldn't accomplish with a runtime check in the forward function. I think that's nonsense though, because one of the key points here is that an invertible aggregate may require a more complex transition state data structure --- in particular, if you're forced to go from a pass-by-value to a pass-by-reference data type, right there you are going to take a big hit in aggregate performance, and there is no way for the forward transition function to avoid it. The patch has in fact already done that to a couple of basic aggregates like sum(int4). Has anyone bothered to test what side-effects that has on non-windowed aggregation performance? I think what'd make sense is to have a separate forward function *and* separate state datatype to be used when we want invertible aggregation (hm, and I guess a separate final function too). So an aggregate definition would include two entirely independent implementations. If they chance to share sfunc and stype, fine, but they don't have to. This would mean we'd need some new names for the doppelgangers of the CREATE AGGREGATE parameters sfunc, stype, sspace, finalfunc, and initcond (but not sortop). I guess that'd open up a chance to use a more uniform naming scheme, but I'm not too sure what would be good. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 8 April 2014 21:48, Florian Pflug wrote: > On Apr7, 2014, at 17:41 , Dean Rasheed wrote: >> I've just finished reading through all the other patches, and they all >> look OK to me. It's mostly straightforward stuff, so despite the size >> it's hopefully all committable once the base patch goes in. > > Hm, I'm starting to have second thoughts about the minmax patch. The > inverse transition functions for MIN and MAX have a non-trivial probability > of failure - they trigger a rescan whenever the value that is removed isn't > strictly smaller (respectively strictly larger) than the current maximum > (respectively minimum). Thus, whenever that happens, we both call the > inverse transition function *and* (since it fails) restart the aggregation. > > For windows based on ROWS, this isn't too bad - even if we fail every second > time, we still avoid half the rescans, which should be a net win if the > average window size is > 2. > > But for RANGE-based windows, more than one call of the inverse transition > function must succeed for us to save anything, since we must successfully > remove *all* peers to avoid one restart. This greatly increases the chance > that using the inverse transition function hurts rather than helps - the > situation is worse the larger the average number of peers is. > Argh, I hadn't really considered that case. I suppose any imperfect inverse transition function has the potential to make performance worse rather than better. But working out the likelihood of that isn't necessarily straightforward. It might be possible to include some sort of heuristic based on the known information --- the number of rows P in the peer group about to be removed vs the total number N of rows aggregated so far. 
If the data were fairly random, then a quick back-of-the-envelope calculation suggests that trying the inverse min/max functions would be worthwhile on average if P were less than around 0.4N, but of course different data distributions could make that much worse. Even a perfect inverse transition function isn't going to be much use if P > N/2 (e.g., imagine a case where the peer groups decrease in size exponentially), so perhaps we should be including such a check anyway. That's also assuming that the cost of the inverse transition function is about the same as the cost of the forward function, which is not necessarily the case. Perhaps imperfect inverse transition functions should be assigned a higher cost, and that should be factored into the decision as to whether they are likely to be worth trying. All that feels very speculative though, and I think it's too late to be considering that for 9.4, so yes, let's leave out the min/max aggregates for now. > I've factored the BOOL_AND,BOOL_OR stuff out into a separate patch > invtrans_bool - it previously was part of invtrans_minmax. Given the > performance risk involved, I think that we probably shouldn't commit > invtrans_minmax at this time. I should have brought this up earlier, but > the issue had slipped my mind :-( Sorry for that. > >> I think that you're probably right that optimising >> window_gettupleslot() to eliminate the O(n^2) behaviour can be left to >> a later patch --- the existing performance benefits of this patch are >> enough to justify its inclusion IMO. It would be nice to include the >> trivial optimisation to window_gettupleslot() that I posted upthread, >> since it gives such a big improvement for only a few lines of code >> changed. > > Agreed, but since it's independent from the rest of the base patch, > I kept it as a separate patch (invtrans_optimize) instead of merging it > into the base patch. It should probably be committed separately too. 
> It would be good to commit at least the "base", "arith" and "optimize" patches for 9.4. I think the "collecting" and "bool" patches are also committable, but I also suspect that those aggregates are less well used, so they could be considered lower priority. >> If you post a new patch set, I'll mark it as ready for committer attention. > > New patch set is attached. The only difference to the previous one is that > "Forward Transitions" and "Inverse Transitions" are now scaled with nloops, > and that it includes your window_gettupleslot patch under the name > invtrans_optimize. > OK, I'm marking this ready for committer attention, on the understanding that that doesn't include the invtrans_minmax patch. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
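The "imperfect inverse" for MIN/MAX discussed above can be sketched in miniature (a Python illustration, not the patch's C code): removing a value strictly below the current maximum is free, while removing anything else forces a rescan of the remaining window contents — which is exactly why data distributions with large peer groups can make the optimization a net loss.

```python
def sliding_max(values, window, stats):
    """Sliding-window MAX with an imperfect inverse transition.

    Removing a value strictly below the current maximum succeeds cheaply;
    otherwise the removed value may have *been* the maximum, so we must
    rescan the rows still in the window (the 'restart' case)."""
    out = []
    state = None                       # None = no rows aggregated yet
    for i, v in enumerate(values):
        if i >= window:                # oldest row leaves the frame
            removed = values[i - window]
            if state is not None and removed < state:
                stats["inverse_ok"] += 1    # cheap path: max unaffected
            else:
                stats["rescans"] += 1       # inversion failed: recompute
                rest = values[i - window + 1:i]
                state = max(rest) if rest else None
        state = v if state is None else max(state, v)
        out.append(state)
    return out

stats = {"inverse_ok": 0, "rescans": 0}
print(sliding_max([5, 1, 2, 3, 4, 0], 3, stats))  # [5, 5, 5, 3, 4, 4]
print(stats)                                      # 2 inversions ok, 1 rescan
```

On descending or heavily-tied input the rescan branch fires on nearly every step, so the inverse transition is attempted *and* the window is rescanned — the worst of both worlds that motivates leaving invtrans_minmax out.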
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr9, 2014, at 02:55 , David Rowley wrote: > On Wed, Apr 9, 2014 at 8:48 AM, Florian Pflug wrote: > > As explained above, invtrans_bool is a bit problematic, since it carries > a real risk of performance regressions. It's included for completeness' > sake, and should probably not be committed at this time. > > Did you mean to write invtrans_minmax? Otherwise you didn't explain about > your concerns with bool. Grmpf. Should have re-read that once more before sending :-( Yes, I meant invtrans_minmax is problematic! invtrans_bool is fine, the inverse transition function never fails for BOOL_AND and BOOL_OR. This is why I factored it out into a separate patch, to make it easy to not apply the MIN/MAX stuff, while still applying the BOOL stuff. Sorry for the confusion. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Wed, Apr 9, 2014 at 8:48 AM, Florian Pflug wrote: > > As explained above, invtrans_bool is a bit problematic, since it carries > a real risk of performance regressions. It's included for completeness' > sake, and should probably not be committed at this time. > > Did you mean to write invtrans_minmax? Otherwise you didn't explain about your concerns with bool. Regards David Rowley
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 7 April 2014 14:09, Florian Pflug wrote: > On Apr5, 2014, at 09:55 , Dean Rasheed wrote: >> On 5 April 2014 08:38, Dean Rasheed wrote: >>> [snip] releasing it in this state feels a little half-baked >>> to me. >>> >> >> I regret writing that almost as soon as I sent it. The last of those >> queries is now over 10 times faster than HEAD, which is certainly >> worthwhile. What bugs me is that there is so much more potential in >> this patch. > > Well, the perfect is the enemy of the good as they say. By all means, > let's get rid of the O(n^2) behaviour in 9.5, but let's get a basic > version into 9.4. > >> I think it's pretty close to being committable, but I fear that time >> is running out for 9.4. > > I'm not aware of any open issues in the basic patch, other than > scaling the reported calling statistics by nloops, which should be > trivial. I'll post an updated patch later today. > > I don't really expect all the add-on patches to make it into 9.4 - > they don't seem to have gotten much attention yet - but at least > the inverse transition functions for the basic arithmetic aggregates > should be doable I hope. > I've just finished reading through all the other patches, and they all look OK to me. It's mostly straightforward stuff, so despite the size it's hopefully all committable once the base patch goes in. I think that you're probably right that optimising window_gettupleslot() to eliminate the O(n^2) behaviour can be left to a later patch --- the existing performance benefits of this patch are enough to justify its inclusion IMO. It would be nice to include the trivial optimisation to window_gettupleslot() that I posted upthread, since it gives such a big improvement for only a few lines of code changed. If you post a new patch set, I'll mark it as ready for committer attention. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr5, 2014, at 09:55 , Dean Rasheed wrote: > On 5 April 2014 08:38, Dean Rasheed wrote: >> [snip] releasing it in this state feels a little half-baked >> to me. >> > > I regret writing that almost as soon as I sent it. The last of those > queries is now over 10 times faster than HEAD, which is certainly > worthwhile. What bugs me is that there is so much more potential in > this patch. Well, the perfect is the enemy of the good as they say. By all means, let's get rid of the O(n^2) behaviour in 9.5, but let's get a basic version into 9.4. > I think it's pretty close to being committable, but I fear that time > is running out for 9.4. I'm not aware of any open issues in the basic patch, other than scaling the reported calling statistics by nloops, which should be trivial. I'll post an updated patch later today. I don't really expect all the add-on patches to make it into 9.4 - they don't seem to have gotten much attention yet - but at least the inverse transition functions for the basic arithmetic aggregates should be doable I hope. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 5 April 2014 08:38, Dean Rasheed wrote: > [snip] releasing it in this state feels a little half-baked > to me. > I regret writing that almost as soon as I sent it. The last of those queries is now over 10 times faster than HEAD, which is certainly worthwhile. What bugs me is that there is so much more potential in this patch. I think it's pretty close to being committable, but I fear that time is running out for 9.4. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 4 April 2014 11:56, Florian Pflug wrote: > >> On 04.04.2014, at 09:40, Dean Rasheed wrote: >> >> I'm not sure how much additional work is required to sort this out, >> but to me it looks more realistic to target 9.5 than 9.4, so at this >> point I tend to think that the patch ought to be marked as returned >> with feedback. > Just doing the first optimisation I recommended (which I pessimistically referred to as a "micro-optimisation") actually gives up to a 4x performance boost relative to the current patch for the queries above:

SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 10 FOLLOWING) FROM generate_series(1,2) g(n); -- 20ms (was 33ms)
SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 100 FOLLOWING) FROM generate_series(1,2) g(n); -- 54ms (was 173ms)
SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 1000 FOLLOWING) FROM generate_series(1,2) g(n); -- 368ms (was 1467ms)
SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) FROM generate_series(1,2) g(n); -- 1866ms (was 7709ms)

See the attached patch, which applies on top of yours but is actually independent of it. IMO, this doesn't go far enough though. We should be aiming to eliminate the O(n^2) behaviour completely and have all the above queries take roughly the same amount of time. > I think the patch is worthwhile, even without this additional optimization. > In fact, if the optimization was part of the patch, there would probably be > calls to factor it out, on the ground that the patch is already rather large. > > I don't see what bumping the whole thing to 9.5 buys, compared to applying > what we have now, and optimizing in 9.5 further. 
> The problem with the current state of the patch is that we're going to a lot of effort to add this new inverse aggregate function, documenting its benefits and revealing via EXPLAIN how it can reduce by several orders of magnitude the number of transition function calls, but then not giving a commensurate performance boost unless the window is UNBOUNDED FOLLOWING. That's going to be awkward to explain to users, and so releasing it in this state feels a little half-baked to me. Regards, Dean

*** src/backend/executor/nodeWindowAgg.c.orig	2014-04-05 07:56:15.0 +0100
--- src/backend/executor/nodeWindowAgg.c	2014-04-05 08:03:32.0 +0100
*** window_gettupleslot(WindowObject winobj,
*** 2412,2425 ****
  		winobj->seekpos++;
  	}
  
! 	while (winobj->seekpos > pos)
  	{
! 		if (!tuplestore_gettupleslot(winstate->buffer, false, true, slot))
  			elog(ERROR, "unexpected end of tuplestore");
  		winobj->seekpos--;
  	}
  
! 	while (winobj->seekpos < pos)
  	{
  		if (!tuplestore_gettupleslot(winstate->buffer, true, true, slot))
  			elog(ERROR, "unexpected end of tuplestore");
--- 2412,2443 ----
  		winobj->seekpos++;
  	}
  
! 	/* Advance or rewind until we are within one tuple of the one we want */
! 	while (winobj->seekpos < pos-1)
! 	{
! 		if (!tuplestore_advance(winstate->buffer, true))
! 			elog(ERROR, "unexpected end of tuplestore");
! 		winobj->seekpos++;
! 	}
!
! 	while (winobj->seekpos > pos+1)
! 	{
! 		if (!tuplestore_advance(winstate->buffer, false))
  			elog(ERROR, "unexpected end of tuplestore");
  		winobj->seekpos--;
  	}
  
! 	/*
! 	 * Now we should be on the tuple immediately before or after the one we
! 	 * want, so just fetch forwards or backwards as appropriate.
! 	 */
! 	if (winobj->seekpos > pos)
! 	{
! 		if (!tuplestore_gettupleslot(winstate->buffer, false, true, slot))
! 			elog(ERROR, "unexpected end of tuplestore");
! 		winobj->seekpos--;
! 	}
! 	else
  	{
  		if (!tuplestore_gettupleslot(winstate->buffer, true, true, slot))
  			elog(ERROR, "unexpected end of tuplestore");

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
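The effect of Dean's window_gettupleslot() change above can be modelled with a toy buffer (an illustration only, not the real tuplestore API): moving the read pointer is cheap, and only the final positioning needs to materialize a tuple into a slot.

```python
class ToyBuffer:
    """Toy stand-in for a tuplestore read pointer: advance() only moves
    the position, fetch() also 'materializes' a tuple (the costly part)."""
    def __init__(self, tuples):
        self.tuples = tuples
        self.pos = -1
        self.fetches = 0            # count of expensive tuple copies

    def advance(self, forward):
        self.pos += 1 if forward else -1

    def fetch(self, forward):
        self.advance(forward)
        self.fetches += 1
        return self.tuples[self.pos]

def gettupleslot(buf, pos):
    """Seek with cheap advances until adjacent to pos, then do exactly one
    materializing fetch (mirroring the patched window_gettupleslot)."""
    while buf.pos < pos - 1:
        buf.advance(True)
    while buf.pos > pos + 1:
        buf.advance(False)
    if buf.pos == pos:              # step off so the fetch lands on pos
        buf.advance(False)
    return buf.fetch(buf.pos < pos)

buf = ToyBuffer(list(range(100)))
print(gettupleslot(buf, 50))        # 50
print(gettupleslot(buf, 10))        # 10
print(buf.fetches)                  # 2: one materializing fetch per seek
```

The unpatched code, by contrast, materializes a tuple for every intermediate position it passes over, which is where the big constant-factor win for long backward seeks comes from.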
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 2014-04-04 12:56:55 +0200, Florian Pflug wrote: > > > On 04.04.2014, at 09:40, Dean Rasheed wrote: > > > > I'm not sure how much additional work is required to sort this out, > > but to me it looks more realistic to target 9.5 than 9.4, so at this > > point I tend to think that the patch ought to be marked as returned > > with feedback. > > I think the patch is worthwhile, even without this additional > optimization. In fact, if the optimization was part of the patch, > there would probably be calls to factor it out, on the ground that the > patch is already rather large. > > I don't see what bumping the whole thing to 9.5 buys, compared to > applying what we have now, and optimizing in 9.5 further. From my POV applying this patch can't be considered a very high priority for 9.4x. It came *really* late to the game for a relatively complex patch. A significant portion of the development only happened *after* the start of the last commitfest. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
> On 04.04.2014, at 09:40, Dean Rasheed wrote: > > I'm not sure how much additional work is required to sort this out, > but to me it looks more realistic to target 9.5 than 9.4, so at this > point I tend to think that the patch ought to be marked as returned > with feedback. I think the patch is worthwhile, even without this additional optimization. In fact, if the optimization was part of the patch, there would probably be calls to factor it out, on the grounds that the patch is already rather large. I don't see what bumping the whole thing to 9.5 buys, compared to applying what we have now, and optimizing further in 9.5. best regards, Florian Pflug PS: Sorry for the broken mail I sent earlier - mis-touched on my phone ;-( -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
>> ), which seem reasonable. But > then I started testing performance, and I found cases where the > improvement is not nearly what I expected. > > The example cited at the start of this thread is indeed orders of > magnitude faster than HEAD: > > SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) > F > > > > > > > I'm not sure how much additional work is required to sort this out, > but to me it looks more realistic to target 9.5 than 9.4, so at this > point I tend to think that the patch ought to be marked as returned > with feedback. > > Thoughts? > > Regards, > Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 1 April 2014 20:58, Florian Pflug wrote: > On Apr1, 2014, at 10:08 , Dean Rasheed wrote: >> On 31 March 2014 01:58, Florian Pflug wrote: >>> Attached are updated patches that include the EXPLAIN changes mentioned >>> above and updated docs. >>> >> >> These patches need re-basing --- they no longer apply to HEAD. > > Rebased to head (554bb3beba27bf4a49edecc40f6c0f249974bc7c). I had to > re-assign OIDs in the dependent patches again (those which actually > add the various inverse transition functions). > Looking at the new explain output, that is not exactly what I was suggesting upthread. In particular, in cases where the WindowAgg node is executed more than once, I think that the reported transition counts should be the averages per-execution of the node. That way the number of transition function calls reported for the node is directly comparable with its "rows" value. Thus I think the output should be divided by nloops, which would be more consistent with the way other similar values are reported in explain output (cf. show_instrumentation_count). I started looking at the "arith" patch, and I got as far as looking at the changes to count(*) and count(val), which seem reasonable. But then I started testing performance, and I found cases where the improvement is not nearly what I expected. The example cited at the start of this thread is indeed orders of magnitude faster than HEAD: SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) FROM generate_series(1,2) g(n); This executes in 20ms on my box, vs 30sec on HEAD, which reflects the change from an O(n^2) to an O(n) algorithm.
However, if both ends of the frame move, the benefits are far less impressive: SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 10 FOLLOWING) FROM generate_series(1,2) g(n); -- 33ms SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 100 FOLLOWING) FROM generate_series(1,2) g(n); -- 173ms SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 1000 FOLLOWING) FROM generate_series(1,2) g(n); -- 1467ms SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) FROM generate_series(1,2) g(n); -- 7709ms This is still exhibiting the behaviour of an O(n^2) algorithm. The problem lies in window_gettupleslot(), which steps one row at a time from the last position to the new required position. So if both ends of the frame are moving, it needs to step forwards and backwards through the entire window for each row processed - hence the O(n^2) behaviour. Looking at window_gettupleslot, there is an obvious potential micro-optimisation that can be made if there are multiple rows to be skipped --- instead of calling tuplestore_gettupleslot() multiple times, call tuplestore_advance() multiple times, then call tuplestore_gettupleslot() once to fetch the final required tuple, thus avoiding a lot of unnecessary tuple copying. That of course won't eliminate the O(n^2) behaviour, but it will reduce the overall factor, and so is probably worth doing. However, to fix the fundamental O(n^2) problem, I think there needs to be separate tuplestore read pointers for the head and tail of the frame. There may also be a case for another read pointer for the current row too, and possibly one for general purpose user-triggered fetches. One approach might be to have up to a small number N (say 3 or 4) of read pointers, with window_gettupleslot() automatically choosing the one nearest to the requested row. Possibly there are better approaches. I think a little more investigation is needed.
I'm not sure how much additional work is required to sort this out, but to me it looks more realistic to target 9.5 than 9.4, so at this point I tend to think that the patch ought to be marked as returned with feedback. Thoughts? Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 31 March 2014 01:58, Florian Pflug wrote: > Attached are updated patches that include the EXPLAIN changes mentioned > above and updated docs. > These patches need re-basing --- they no longer apply to HEAD. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 27 March 2014 21:01, Florian Pflug wrote: > First, sorry guys for letting this slide - I was overwhelmed by other works, > and this kind of slipped my mind :-( > > On Mar27, 2014, at 09:04 , Dean Rasheed wrote: >> On 26 March 2014 19:43, David Rowley wrote: >>> On Thu, Mar 27, 2014 at 7:33 AM, Tom Lane wrote: David Rowley writes: > I've attached an updated invtrans_strictstrict_base patch which has the > feature removed. What is the state of play on this patch? Is the latest version what's in http://www.postgresql.org/message-id/64f96fd9-64d1-40b9-8861-e61820292...@phlo.org plus this sub-patch? Is everybody reasonably happy with it? I don't see it marked "ready for committer" in the CF app, but time is running out. >>> >>> As far as I know the only concern left was around the extra stats in the >>> explain output, which I removed in the patch I attached in the previous >>> email. >>> >> >> Agreed. That was my last concern regarding the base patch, and I agree >> that removing the new explain output is probably the best course of >> action, given that we haven't reached consensus as to what the most >> useful output would be. > > After re-reading the thread, I'd prefer to go with Dean's suggestion, i.e. > simply reporting the total number of invocations of the forward transition > functions, and the total number of invocations of the reverse transition > function, over reporting nothing. The labels of the two counts would simply > be "Forward Transitions" and "Reverse Transitions". > That should be "Inverse" not "Reverse" according to the terminology agreed upthread. Personally, I'm not a big fan of that terminology because "forward" and "inverse" aren't natural antonyms. But actually I think that it's "forward" that is the wrong word to use, because they actually both move (different ends of) the frame forwards. 
The only alternatives I can think of are "direct" and "inverse", which are natural antonyms, but I don't want to hold up this patch bikeshedding over this. OTOH this is not the first time on this thread that someone has slipped into calling them "forward" and "reverse" transitions. > But I don't want this issue to prevent us from getting this patch into 9.4, > so if there are objections to this, I'll rip out the EXPLAIN stuff all > together. > >>> The invtrans_strictstrict_base.patch in my previous email replaces the >>> invtrans_strictstrict_base_038070.patch in that Florian sent here >>> http://www.postgresql.org/message-id/64f96fd9-64d1-40b9-8861-e61820292...@phlo.org >>> all of the other patches are unchanged so it's save to use Florian's latest >>> ones >>> >>> Perhaps Dean can confirm that there's nothing else outstanding? >>> >> >> Florian mentioned upthread that the docs hadn't been updated to >> reflect the latest changes, so I think they need a little attention. > > I'll see to updating the docs, and will post a final patch within the next > few days. > > Dean, have you by chance looked at the other patches yet? > No, sorry. I too have been swamped by other work. I will try to look at them over the next few days. I don't anticipate that they will be as complex as the base patch, so I hope that this can be finished in time for 9.4. If there are any other reviewers with spare cycles, feel free to jump in too. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
First, sorry guys for letting this slide - I was overwhelmed by other works, and this kind of slipped my mind :-( On Mar27, 2014, at 09:04 , Dean Rasheed wrote: > On 26 March 2014 19:43, David Rowley wrote: >> On Thu, Mar 27, 2014 at 7:33 AM, Tom Lane wrote: >>> >>> David Rowley writes: I've attached an updated invtrans_strictstrict_base patch which has the feature removed. >>> >>> What is the state of play on this patch? Is the latest version what's in >>> >>> http://www.postgresql.org/message-id/64f96fd9-64d1-40b9-8861-e61820292...@phlo.org >>> plus this sub-patch? Is everybody reasonably happy with it? I don't >>> see it marked "ready for committer" in the CF app, but time is running >>> out. >>> >> >> As far as I know the only concern left was around the extra stats in the >> explain output, which I removed in the patch I attached in the previous >> email. >> > > Agreed. That was my last concern regarding the base patch, and I agree > that removing the new explain output is probably the best course of > action, given that we haven't reached consensus as to what the most > useful output would be. After re-reading the thread, I'd prefer to go with Dean's suggestion, i.e. simply reporting the total number of invocations of the forward transition functions, and the total number of invocations of the reverse transition function, over reporting nothing. The labels of the two counts would simply be "Forward Transitions" and "Reverse Transitions". But I don't want this issue to prevent us from getting this patch into 9.4, so if there are objections to this, I'll rip out the EXPLAIN stuff all together. 
>> The invtrans_strictstrict_base.patch in my previous email replaces the >> invtrans_strictstrict_base_038070.patch in that Florian sent here >> http://www.postgresql.org/message-id/64f96fd9-64d1-40b9-8861-e61820292...@phlo.org >> all of the other patches are unchanged so it's save to use Florian's latest >> ones >> >> Perhaps Dean can confirm that there's nothing else outstanding? >> > > Florian mentioned upthread that the docs hadn't been updated to > reflect the latest changes, so I think they need a little attention. I'll see to updating the docs, and will post a final patch within the next few days. Dean, have you by chance looked at the other patches yet? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 26 March 2014 19:43, David Rowley wrote: > On Thu, Mar 27, 2014 at 7:33 AM, Tom Lane wrote: >> >> David Rowley writes: >> > I've attached an updated invtrans_strictstrict_base patch which has the >> > feature removed. >> >> What is the state of play on this patch? Is the latest version what's in >> >> http://www.postgresql.org/message-id/64f96fd9-64d1-40b9-8861-e61820292...@phlo.org >> plus this sub-patch? Is everybody reasonably happy with it? I don't >> see it marked "ready for committer" in the CF app, but time is running >> out. >> > > As far as I know the only concern left was around the extra stats in the > explain output, which I removed in the patch I attached in the previous > email. > Agreed. That was my last concern regarding the base patch, and I agree that removing the new explain output is probably the best course of action, given that we haven't reached consensus as to what the most useful output would be. > The invtrans_strictstrict_base.patch in my previous email replaces the > invtrans_strictstrict_base_038070.patch in that Florian sent here > http://www.postgresql.org/message-id/64f96fd9-64d1-40b9-8861-e61820292...@phlo.org > all of the other patches are unchanged so it's save to use Florian's latest > ones > > Perhaps Dean can confirm that there's nothing else outstanding? > Florian mentioned upthread that the docs hadn't been updated to reflect the latest changes, so I think they need a little attention. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Thu, Mar 27, 2014 at 7:33 AM, Tom Lane wrote: > David Rowley writes: > > I've attached an updated invtrans_strictstrict_base patch which has the > > feature removed. > > What is the state of play on this patch? Is the latest version what's in > > http://www.postgresql.org/message-id/64f96fd9-64d1-40b9-8861-e61820292...@phlo.org > plus this sub-patch? Is everybody reasonably happy with it? I don't > see it marked "ready for committer" in the CF app, but time is running > out. > > As far as I know the only concern left was around the extra stats in the explain output, which I removed in the patch I attached in the previous email. The invtrans_strictstrict_base.patch in my previous email replaces the invtrans_strictstrict_base_038070.patch that Florian sent here http://www.postgresql.org/message-id/64F96FD9-64D1-40B9-8861-E6182029220B@phlo.org — all of the other patches are unchanged, so it's safe to use Florian's latest ones. Perhaps Dean can confirm that there's nothing else outstanding? > regards, tom lane >
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
David Rowley writes: > I've attached an updated invtrans_strictstrict_base patch which has the > feature removed. What is the state of play on this patch? Is the latest version what's in http://www.postgresql.org/message-id/64f96fd9-64d1-40b9-8861-e61820292...@phlo.org plus this sub-patch? Is everybody reasonably happy with it? I don't see it marked "ready for committer" in the CF app, but time is running out. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Mar5, 2014, at 23:49 , Tom Lane wrote: > Florian Pflug writes: >> On Mar5, 2014, at 18:37 , Tom Lane wrote: >>> My advice is to lose the EXPLAIN output entirely. If the authors of >>> the patch can't agree on what it means, what hope have everyday users >>> got of making sense of it? > >> The question isn't what the current output means, but whether it's a >> good metric to report or not. > > If you can't agree, then it isn't. Probably, yes, so let's find something that *is* a good metric. (BTW, it's not the authors who disagree here. It was me who put the EXPLAIN feature in, and Dean reviewed it and found it confusing. The original author, David, seems to have run out of time to work on this, and AFAIK hasn't weighed in on that particular feature at all.) >> If we don't report anything, then how would a user check whether a query >> is slow because of O(n^2) behaviour of a windowed aggregate, or because >> of some other reasons? > > [ shrug... ] They can see whether the Window plan node is where the time > is going. It's not apparent to me that the extra numbers you propose to > report will edify anybody. By the same line of reasoning, every metric except execution time is superfluous. Comparing execution times really is a horrible way to measure this - not only does it include all kinds of things that have nothing to do with the restart behaviour of aggregates, you'd also have to construct a base-line query first which is guaranteed not to restart. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Florian Pflug writes: > On Mar5, 2014, at 18:37 , Tom Lane wrote: >> My advice is to lose the EXPLAIN output entirely. If the authors of >> the patch can't agree on what it means, what hope have everyday users >> got of making sense of it? > The question isn't what the current output means, but whether it's a > good metric to report or not. If you can't agree, then it isn't. > If we don't report anything, then how would a user check whether a query > is slow because of O(n^2) behaviour of a windowed aggregate, or because > of some other reasons? [ shrug... ] They can see whether the Window plan node is where the time is going. It's not apparent to me that the extra numbers you propose to report will edify anybody. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Wed, Mar 5, 2014 at 10:49 PM, Tom Lane wrote: > > [ shrug... ] They can see whether the Window plan node is where the time > is going. It's not apparent to me that the extra numbers you propose to > report will edify anybody. Perhaps just saying "Incremental Window Function" versus "Iterated Window Function" or something like that would be sufficient? At least that way query tuning guidelines have a keyword they can say to watch out for. And someone trying to figure out *why* the time is being spent in this node has something they might notice a correlation with. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Mar5, 2014, at 18:27 , Dean Rasheed wrote: > On 5 March 2014 14:35, Florian Pflug wrote: >> When I added the EXPLAIN stuff, I initially simply reported the number >> of times nodeWindowAgg has to restart the aggregation. The problem with >> that approach is that not all restarts carry the same cost. If the frames >> either don't overlap at all or are identical, restarts don't cause any >> additional work. And for fixed-length frames (BETWEEN n PRECEEDING AND >> m FOLLOWING), the performance effects of restarts depends on m-n. >> >> Which is why I made it count the number of aggregated input rows instead. >> >> Having said that, I' not really 100% happy with the name >> "Transitions Per Row" for this - it was simply the best I could come up with >> that was reasonably short. And I'm certainly open to reporting the absolute >> count instead of a factor relative to ntuples. >> >> If we made it an absolute count, would calling this "Aggregated Rows" work >> for you? > > I'm not sure about naming, but I think my preference would be to > output the correct absolute counts for both the forward and inverse > transitions (i.e. multiply by the right combinations of numaggs and > numaggs_restart). The EXPLAIN output already tells you how many > aggregates there are, and how many rows there are, so you'd be able to > work out from that how much extra work it's doing. Hm, if we do that we might as well go all the way and simply report these numbers per aggregate, instead of once per window aggregation node. That'd provide the maximum amount of information, and be quite unambiguous. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Mar5, 2014, at 18:37 , Tom Lane wrote: > Dean Rasheed writes: >> I think we really need a larger consensus on this though, so I'd be >> interested to hear what others think. > > My advice is to lose the EXPLAIN output entirely. If the authors of > the patch can't agree on what it means, what hope have everyday users > got of making sense of it? The question isn't what the current output means, but whether it's a good metric to report or not. If we don't report anything, then how would a user check whether a query is slow because of O(n^2) behaviour of a windowed aggregate, or because of some other reason? If invertibility were a purely static property, then maybe we could get away with that, and say "check whether your aggregates are invertible or not". But since we have partially invertible aggregates, the performance characteristics depend on the input data, so we IMHO need some way for users to check what's actually happening. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Tom, I did not follow the thread very closely, so I need to look at what the ambiguity is there; however, I would love to see window rescans in explain analyze. I have great experience in tuning Oracle queries. There are features in PostgreSQL's explain analyze that I miss badly in Oracle: 'rows removed by filter' is my favourite one. Improving explain analyze is great for performance analysis. I would say the target audience for 'explain analyze' is not all users, but someone closer to 'performance engineers'. Those beings are used to triple-checking results and building/validating hypotheses, since all the counters tend to lie, so 'a bit misleading counter' is not a showstopper. I did not think of 'non-being-able-to-use-negative-transition-since-floats-do-not-commute' before. Thanks to this discussion I see what kind of dragons live here. I would vote (if I had any vote at all) for the inclusion of performance statistics in explain analyze (e.g. number of rescans and number of rows fed to the aggregate), provided the performance impact is tolerable in both regular and explain analyze mode. I wonder how Oracle handles negative transitions (does it?), however that is a different discussion. Regards, Vladimir Sitnikov
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Dean Rasheed writes: > I think we really need a larger consensus on this though, so I'd be > interested to hear what others think. My advice is to lose the EXPLAIN output entirely. If the authors of the patch can't agree on what it means, what hope have everyday users got of making sense of it? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 5 March 2014 14:35, Florian Pflug wrote: > On Mar4, 2014, at 21:09 , Dean Rasheed wrote: >> On 3 March 2014 23:00, Florian Pflug wrote: * In show_windowagg_info(), this calculation looks suspicious to me: double tperrow = winaggstate->aggfwdtrans / (inst->nloops * inst->ntuples); If the node is executed multiple times, aggfwdtrans will be reset in each loop, so the transitions per row figure will be under-estimated. ISTM that if you want to report on this, you'd need aggfwdtrans to be reset once per query, but I'm not sure exactly how to do that. ... Actually, I think it's misleading to only count forward transition function calls, because a call to the inverse transition function still represents a state transition, and is likely to be around the same cost. For a window of size 2, there would not be much advantage to using inverse transition functions, because it would be around 2 transitions per row either way. >>> >>> True. In fact, I pondered whether to avoid using the inverse transition >>> function for windows of 2 rows. In the end, I didn't because I felt that >>> it makes custom aggregates harder to test. >>> >>> On the question of whether to count inverse transition function calls - >>> the idea of the EXPLAIN VERBOSE ANALYZE output isn't really to show the >>> number of state transitions, but rather to show whether the aggregation >>> has O(n) or O(n^2) behaviour. The idea being that a value close to "1" >>> means "inverse transition function works as expected", and larger values >>> mean "not working so well". >>> >>> Regarding multiple evaluations - I think I based the behaviour on how >>> ntuples works, which also only reports the value of the last evaluation >>> I think. But maybe I'm confused about this. >>> >> >> No, it doesn't look like that's correct for multiple loops. Consider >> this example: >> >> ... 
>> >> It turns out that show_windowagg_info() is only called once at the >> end, with ntuples=100, nloops=4 and aggfwdtrans=100, so it's computing >> tperrow=100/(4*100)=0.25, which then gets truncated to 0.2. So to get >> 1, you'd have to use this formula: >> >>double tperrow = winaggstate->aggfwdtrans / inst->ntuples; > > Hm, so I *was* confused - seems I mixed up ntuples (which counts the > total number of tuples over all loops) with what we report as "rows" > (i.e. the average number of rows per loop). Thanks for clearing that up! > >> I'm still not convinced that's the most useful thing to report though. >> Personally, I'd prefer to just see the separate counts, e.g.: >> >> -> WindowAgg (cost=0.00..22.50 rows=1000 width=4) (actual >> time=0.019..0.094 rows=25 loops=4) >> Output: sum(i.i) OVER (?) >> Forward transitions: 25 Inverse transitions: 25 >> >> IMO that gives a clearer picture of what's going on. > > My problem with this is that if there are multiple aggregates, the > numbers don't really count the number of forward and reverse transfer > function invocations. What nodeWindowAgg.c really counts is the number > of *rows* it has to fetch from the tuple store to perform forward > aggregations - the relevant code snippet is > > if (numaggs_restart > 0) > winstate->aggfwdtrans += (winstate->aggregatedupto > - winstate->frameheadpos); > else > winstate->aggfwdtrans += (winstate->aggregatedupto > - aggregatedupto_nonrestarted); > > Now we could of course report these figures per aggregate instead of > only once per aggregation node. But as I said earlier, my guess is that > people usually won't be interested in the precise counts - most likely, > what you want to know is "how much do the restarts cost me" > The problem I have with the single "Transitions Per Row" figure, and the idea that a value close to 1.0 is supposed to be good, is that it's not really true. 
For example, with a window size of 2 and a "perfect" invertible aggregate, you'd get a value of 1.0, but with a non-invertible aggregate you'd get a value of 2.0, when actually it wouldn't do any better if it had an inverse. I think trying to represent this as a single number is too simplistic. > When I added the EXPLAIN stuff, I initially simply reported the number > of times nodeWindowAgg has to restart the aggregation. The problem with > that approach is that not all restarts carry the same cost. If the frames > either don't overlap at all or are identical, restarts don't cause any > additional work. And for fixed-length frames (BETWEEN n PRECEEDING AND > m FOLLOWING), the performance effects of restarts depends on m-n. > > Which is why I made it count the number of aggregated input rows instead. > > Having said that, I' not really 100% happy with the name > "Transitions Per Row" for this - it was simply the best I could come up with > that was reasonably short. And I'm certainly open to reporting the absolute >
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Mar4, 2014, at 21:09 , Dean Rasheed wrote: > On 3 March 2014 23:00, Florian Pflug wrote: >>> * In show_windowagg_info(), this calculation looks suspicious to me: >>> >>> double tperrow = winaggstate->aggfwdtrans / >>> (inst->nloops * inst->ntuples); >>> >>> If the node is executed multiple times, aggfwdtrans will be reset in >>> each loop, so the transitions per row figure will be under-estimated. >>> ISTM that if you want to report on this, you'd need aggfwdtrans to be >>> reset once per query, but I'm not sure exactly how to do that. >>> >>> ... >>> >>> Actually, I think it's misleading to only count forward transition >>> function calls, because a call to the inverse transition function >>> still represents a state transition, and is likely to be around the >>> same cost. For a window of size 2, there would not be much advantage >>> to using inverse transition functions, because it would be around 2 >>> transitions per row either way. >> >> True. In fact, I pondered whether to avoid using the inverse transition >> function for windows of 2 rows. In the end, I didn't because I felt that >> it makes custom aggregates harder to test. >> >> On the question of whether to count inverse transition function calls - >> the idea of the EXPLAIN VERBOSE ANALYZE output isn't really to show the >> number of state transitions, but rather to show whether the aggregation >> has O(n) or O(n^2) behaviour. The idea being that a value close to "1" >> means "inverse transition function works as expected", and larger values >> mean "not working so well". >> >> Regarding multiple evaluations - I think I based the behaviour on how >> ntuples works, which also only reports the value of the last evaluation >> I think. But maybe I'm confused about this. >> > > No, it doesn't look like that's correct for multiple loops. Consider > this example: > > ... 
> > It turns out that show_windowagg_info() is only called once at the
> > end, with ntuples=100, nloops=4 and aggfwdtrans=100, so it's computing
> > tperrow=100/(4*100)=0.25, which then gets truncated to 0.2. So to get
> > 1, you'd have to use this formula:
> >
> >    double tperrow = winaggstate->aggfwdtrans / inst->ntuples;

Hm, so I *was* confused - seems I mixed up ntuples (which counts the total
number of tuples over all loops) with what we report as "rows" (i.e. the
average number of rows per loop). Thanks for clearing that up!

> I'm still not convinced that's the most useful thing to report though.
> Personally, I'd prefer to just see the separate counts, e.g.:
>
>   -> WindowAgg (cost=0.00..22.50 rows=1000 width=4)
>                (actual time=0.019..0.094 rows=25 loops=4)
>        Output: sum(i.i) OVER (?)
>        Forward transitions: 25  Inverse transitions: 25
>
> IMO that gives a clearer picture of what's going on.

My problem with this is that if there are multiple aggregates, the numbers
don't really count the number of forward and reverse transfer function
invocations. What nodeWindowAgg.c really counts is the number of *rows* it
has to fetch from the tuple store to perform forward aggregations - the
relevant code snippet is

    if (numaggs_restart > 0)
        winstate->aggfwdtrans += (winstate->aggregatedupto -
                                  winstate->frameheadpos);
    else
        winstate->aggfwdtrans += (winstate->aggregatedupto -
                                  aggregatedupto_nonrestarted);

Now we could of course report these figures per aggregate instead of only
once per aggregation node. But as I said earlier, my guess is that people
usually won't be interested in the precise counts - most likely, what you
want to know is "how much do the restarts cost me?"

When I added the EXPLAIN stuff, I initially simply reported the number of
times nodeWindowAgg has to restart the aggregation. The problem with that
approach is that not all restarts carry the same cost. If the frames either
don't overlap at all or are identical, restarts don't cause any additional
work.
And for fixed-length frames (BETWEEN n PRECEDING AND m FOLLOWING), the
performance effects of restarts depend on m-n.

Which is why I made it count the number of aggregated input rows instead.

Having said that, I'm not really 100% happy with the name "Transitions Per
Row" for this - it was simply the best I could come up with that was
reasonably short. And I'm certainly open to reporting the absolute count
instead of a factor relative to ntuples. If we made it an absolute count,
would calling this "Aggregated Rows" work for you?

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
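The counter Florian describes - aggregated input rows rather than restarts - can be sketched with a toy Python model (not the patch's C code; all names here are invented for illustration). With an inverse transition function the figure stays near one row per output row; without one, it approaches the frame size:

```python
# Toy model of the "transitions per row" figure for a frame of the form
# ROWS BETWEEN p PRECEDING AND CURRENT ROW. aggregated_rows mimics the
# aggfwdtrans counter: rows fetched for forward aggregation.

def window_sum(values, preceding, invertible):
    results, aggregated_rows = [], 0
    state, head = 0, 0                    # running sum, current frame head
    for i in range(len(values)):
        new_head = max(0, i - preceding)
        if invertible:
            for j in range(head, new_head):
                state -= values[j]        # inverse transitions (not counted)
            state += values[i]            # one forward transition per row
            aggregated_rows += 1
        else:
            # no inverse: restart and re-aggregate the whole frame
            state = sum(values[new_head:i + 1])
            aggregated_rows += i + 1 - new_head
        head = new_head
        results.append(state)
    return results, aggregated_rows

vals = list(range(1, 101))
r1, n1 = window_sum(vals, 4, invertible=True)
r2, n2 = window_sum(vals, 4, invertible=False)
assert r1 == r2                           # same answers either way
print(n1 / len(vals))                     # 1.0 - inverse works as expected
print(n2 / len(vals))                     # 4.9 - close to the frame size
```

This also shows why the value is bounded below by 1.0 for this kind of frame: even a perfect inverse still performs one forward transition per new row.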
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 3 March 2014 23:00, Florian Pflug wrote: >> * In show_windowagg_info(), this calculation looks suspicious to me: >> >>double tperrow = winaggstate->aggfwdtrans / >>(inst->nloops * inst->ntuples); >> >> If the node is executed multiple times, aggfwdtrans will be reset in >> each loop, so the transitions per row figure will be under-estimated. >> ISTM that if you want to report on this, you'd need aggfwdtrans to be >> reset once per query, but I'm not sure exactly how to do that. >> >> ... >> >> Actually, I think it's misleading to only count forward transition >> function calls, because a call to the inverse transition function >> still represents a state transition, and is likely to be around the >> same cost. For a window of size 2, there would not be much advantage >> to using inverse transition functions, because it would be around 2 >> transitions per row either way. > > True. In fact, I pondered whether to avoid using the inverse transition > function for windows of 2 rows. In the end, I didn't because I felt that > it makes custom aggregates harder to test. > > On the question of whether to count inverse transition function calls - > the idea of the EXPLAIN VERBOSE ANALYZE output isn't really to show the > number of state transitions, but rather to show whether the aggregation > has O(n) or O(n^2) behaviour. The idea being that a value close to "1" > means "inverse transition function works as expected", and larger values > mean "not working so well". > > Regarding multiple evaluations - I think I based the behaviour on how > ntuples works, which also only reports the value of the last evaluation > I think. But maybe I'm confused about this. > No, it doesn't look like that's correct for multiple loops. 
Consider this example:

    explain (verbose, analyse)
      select * from (values (10), (20), (30), (40)) v(x),
        lateral (select sum(i) over (rows between 4 preceding and current row)
                 from generate_series(1, x) i) t;

                                   QUERY PLAN
    -------------------------------------------------------------------------
    Nested Loop (cost=0.00..170.06 rows=4000 width=12)
                (actual time=0.027..0.414 rows=100 loops=1)
      Output: "*VALUES*".column1, (sum(i.i) OVER (?))
      -> Values Scan on "*VALUES*" (cost=0.00..0.05 rows=4 width=4)
                                   (actual time=0.002..0.006 rows=4 loops=1)
           Output: "*VALUES*".column1
      -> WindowAgg (cost=0.00..22.50 rows=1000 width=4)
                   (actual time=0.019..0.094 rows=25 loops=4)
           Output: sum(i.i) OVER (?)
           Transitions Per Row: 0.2
           -> Function Scan on pg_catalog.generate_series i
                (cost=0.00..10.00 rows=1000 width=4)
                (actual time=0.010..0.015 rows=25 loops=4)
                Output: i.i
                Function Call: generate_series(1, "*VALUES*".column1)

It turns out that show_windowagg_info() is only called once at the end,
with ntuples=100, nloops=4 and aggfwdtrans=100, so it's computing
tperrow=100/(4*100)=0.25, which then gets truncated to 0.2. So to get 1,
you'd have to use this formula:

    double tperrow = winaggstate->aggfwdtrans / inst->ntuples;

I'm still not convinced that's the most useful thing to report though.
Personally, I'd prefer to just see the separate counts, e.g.:

    -> WindowAgg (cost=0.00..22.50 rows=1000 width=4)
                 (actual time=0.019..0.094 rows=25 loops=4)
         Output: sum(i.i) OVER (?)
         Forward transitions: 25  Inverse transitions: 25

IMO that gives a clearer picture of what's going on.

Thoughts?

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
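Plugging in the numbers from the plan above, the under-estimation is easy to check (a quick Python sanity check of the two formulas):

```python
# Numbers from the plan above: 4 loops of 25 rows each.
nloops, rows_per_loop = 4, 25
ntuples = nloops * rows_per_loop   # instrumentation totals tuples over all loops
aggfwdtrans = 100                  # one forward transition per input row overall

buggy = aggfwdtrans / (nloops * ntuples)   # divides by loops twice
fixed = aggfwdtrans / ntuples              # ntuples already spans all loops

print(buggy)   # 0.25 - displayed truncated as 0.2
print(fixed)   # 1.0
```

The key point is that `inst->ntuples` is already a total over all loops, so multiplying by `nloops` again double-counts the looping.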
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 25 February 2014 12:33, Florian Pflug wrote: > On Feb24, 2014, at 17:50 , Dean Rasheed wrote: >> On 20 February 2014 01:48, Florian Pflug wrote: >>> On Jan29, 2014, at 13:45 , Florian Pflug wrote: In fact, I'm currently leaning towards just forbidding non-strict forward transition function with strict inverses, and adding non-NULL counters to the aggregates that then require them. It's really only the SUM() aggregates that are affected by this, I think. >>> >>> I finally got around to doing that, and the results aren't too bad. The >>> attached patches required that the strictness settings of the forward and >>> reverse transition functions agree, and employ exactly the same >>> NULL-skipping >>> logic we always had. >>> >>> The only aggregates seriously affected by that change were SUM(int2) and >>> SUM(int4). >> >> I haven't looked at this in any detail yet, but that seems much neater >> to me. It seems perfectly sensible that the forward and inverse >> transition functions should have the same strictness settings, and >> enforcing that keeps the logic simple, as well as hopefully making it >> easier to document. > > Good to hear that you agree! I'll try to find some time to update the docs. > I finally got round to looking at this in more detail. Sorry for the delay. Here is my more detailed review of the base patch. Overall, I think that it is in reasonable shape, and as I said I think the approach of enforcing matching strictness settings on the forward and inverse transition functions is much simpler and neater. I have a few comments, some cosmetic, and a couple more substantive: * In a couple of places: errmsg("stricness of forward and reverse transition functions must match") - misspelling: "stricness". - "reverse" should be "inverse" to match the terminology used elsewhere. * Grammatical error in the comment for lookup_agg_function() - you should drop the word "both". 
* In show_windowagg_info(), this calculation looks suspicious to me:

    double tperrow = winaggstate->aggfwdtrans /
        (inst->nloops * inst->ntuples);

If the node is executed multiple times, aggfwdtrans will be reset in each
loop, so the transitions per row figure will be under-estimated. ISTM that
if you want to report on this, you'd need aggfwdtrans to be reset once per
query, but I'm not sure exactly how to do that.

Here's a test case:

    explain (verbose, analyse)
      select sum(i) over (rows between 4 preceding and current row)
        from generate_series(1, 10) i;

which outputs 10 rows with an average of 1 transition per row, but doing
the same window aggregate twice in a nested loop:

    explain (verbose, analyse)
      select * from (values (10), (10)) v(x),
        lateral (select sum(i) over (rows between 4 preceding and current row)
                 from generate_series(1, x) i) t;

outputs 20 rows, but only reports 0.5 transitions per row.

Actually, I think it's misleading to only count forward transition function
calls, because a call to the inverse transition function still represents a
state transition, and is likely to be around the same cost. For a window of
size 2, there would not be much advantage to using inverse transition
functions, because it would be around 2 transitions per row either way.

* The function comment for build_aggregate_fnexprs() needs to be updated to
reference the inverse transition function. I'd also be tempted to have it
allow invtransfnexpr to be a NULL pointer, if the inverse transition
function expression tree is not required. Then ExecInitAgg() could simply
pass NULL, instead of having the local variable invtransfnexpr with the
slightly cryptic comment "needed but never used".

* In struct WindowStatePerAggData, I think you should change the field
order to transfn_oid, invtransfn_oid and then finalfn_oid. It's only a
small thing, but that's the order those 3 functions are referred to
everywhere else.
* In struct WindowStatePerAggData, the comment for transValueCount should read "number of aggregated values". * If AggCheckCallContext() is called from a window function, and it asks for an aggcontext, it will fail because calledaggno will be -1. That can't currently happen for any of our built-in window functions, and I'm not sure if it's likely to happen in the future, but I think it would be better to defend against that possibility just in case. So I think it ought to return the shared context in that case, as the original code would have done. * In advance_windowaggregate(), this code if (peraggstate->transfn.fn_strict) { is against the project style, which is to have curly braces on new lines. But also, that test condition is the same as the preceding block, so the 2 blocks could just be merged. * I was wondering about the case of a forward transition function returning NULL in the presence of an inverse transition function. In this patch there are 3 pieces of code that test for that: 1). advance_windowaggregate() errors out if the forward t
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Feb24, 2014, at 17:50 , Dean Rasheed wrote: > On 20 February 2014 01:48, Florian Pflug wrote: >> On Jan29, 2014, at 13:45 , Florian Pflug wrote: >>> In fact, I'm >>> currently leaning towards just forbidding non-strict forward transition >>> function with strict inverses, and adding non-NULL counters to the >>> aggregates that then require them. It's really only the SUM() aggregates >>> that are affected by this, I think. >> >> I finally got around to doing that, and the results aren't too bad. The >> attached patches required that the strictness settings of the forward and >> reverse transition functions agree, and employ exactly the same NULL-skipping >> logic we always had. >> >> The only aggregates seriously affected by that change were SUM(int2) and >> SUM(int4). > > I haven't looked at this in any detail yet, but that seems much neater > to me. It seems perfectly sensible that the forward and inverse > transition functions should have the same strictness settings, and > enforcing that keeps the logic simple, as well as hopefully making it > easier to document. Good to hear that you agree! I'll try to find some time to update the docs. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 20 February 2014 01:48, Florian Pflug wrote:
> On Jan29, 2014, at 13:45 , Florian Pflug wrote:
>> In fact, I'm currently leaning towards just forbidding non-strict forward
>> transition function with strict inverses, and adding non-NULL counters to
>> the aggregates that then require them. It's really only the SUM()
>> aggregates that are affected by this, I think.
>
> I finally got around to doing that, and the results aren't too bad. The
> attached patches required that the strictness settings of the forward and
> reverse transition functions agree, and employ exactly the same
> NULL-skipping logic we always had.
>
> The only aggregates seriously affected by that change were SUM(int2) and
> SUM(int4).
>
> The SUM, AVG and STDDEV aggregates which use NumericAggState were already
> mostly prepared for this - all they required were a few adjustments to
> correctly handle the last non-NULL, non-NaN input being removed, and a few
> additional PG_ARGISNULL calls for the inverse transition functions since
> they're now non-strict. I've also modified them to unconditionally allocate
> the state at the first call, instead of upon seeing the first non-NULL
> value, but that isn't strictly required. But without that, the state can
> have three classes of values - SQL-NULL, NULL pointer and valid pointer,
> and that's just confusing...
>
> SUM(int2) and SUM(int4) now simply use the same transition functions as
> AVG(int2) and AVG(int4), which use an int8 array to track the sum of the
> inputs and the number of inputs, plus a new final function int2int4_sum().
> Previously, they used a single int8 as their state type.
>
> Since I was touching the code anyway, I removed some unnecessary inverse
> transition functions - namely, int8_avg_accum_inv and numeric_avg_accum_inv.
> These are completely identical to their non-avg cousins - the only
> difference between the corresponding forward transition functions is
> whether they request computation of sumX2 (i.e.
the sum of squares of the inputs) or not.
>
> I haven't yet updated the docs - I'll do that if and when there's consensus
> about whether this is the way to go or not.

I haven't looked at this in any detail yet, but that seems much neater to
me. It seems perfectly sensible that the forward and inverse transition
functions should have the same strictness settings, and enforcing that
keeps the logic simple, as well as hopefully making it easier to document.

It's a shame that more transition functions cannot be made strict, when
they actually ignore null values, but I think trying to solve that can be
regarded as outside the scope of this patch.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
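The SUM(int2)/SUM(int4) change described above - switching from a single int8 state to AVG's (sum, count) pair - can be sketched as a toy Python model (not the actual C code; the names mirror the catalog functions but the bodies are invented for illustration). The count is what lets the final function distinguish "no rows aggregated" (SQL NULL) from "rows summing to zero" once an inverse transition function can empty the state again:

```python
# None plays the role of SQL NULL here.

def int_sum_accum(state, value):
    """Forward transition: state is a (sum, count) pair, as in the AVG
    transition functions the patch now reuses."""
    s, n = state if state is not None else (0, 0)
    return (s + value, n + 1)

def int_sum_accum_inv(state, value):
    """Inverse transition: remove a value that left the window frame."""
    s, n = state
    return (s - value, n - 1)

def int2int4_sum(state):
    """Final function: NULL when nothing was aggregated. With a bare
    int8 state, an empty frame would be indistinguishable from sum 0."""
    if state is None or state[1] == 0:
        return None
    return state[0]

state = None
for v in (1, 2, 3):
    state = int_sum_accum(state, v)
assert int2int4_sum(state) == 6
for v in (1, 2, 3):                 # frame slides past all three rows
    state = int_sum_accum_inv(state, v)
assert int2int4_sum(state) is None  # empty frame -> SQL NULL again
```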
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan29, 2014, at 09:59 , Dean Rasheed wrote:
> On 28 January 2014 20:16, Florian Pflug wrote:
>> On Jan27, 2014, at 23:28 , Dean Rasheed wrote:
>>> This case is explicitly forbidden, both in CREATE AGGREGATE and in the
>>> executor. To me, that seems overly restrictive --- if transfn is
>>> strict, then you know for sure that no NULL values are added to the
>>> aggregate state, so surely it doesn't matter whether or not
>>> inv_transfn is strict. It will never be asked to remove NULL values.
>>>
>>> I think there are definite use-cases where a user might want to use a
>>> pre-existing function as the inverse transition function, so it seems
>>> harsh to force them to wrap it in a strict function in this case.
>>
>> I'm not sure that the likelihood of someone wanting to combine a strict
>> forward with a non-strict inverse function is very high, but neither
>>
>
> Me neither, but the checks to forbid it aren't adding anything, and I
> think it's best to make it as flexible as possible.

Ok.

>> Another idea would be to have an explicit nulls=ignore|process option
>> for CREATE AGGREGATE. If nulls=process and either of the transition
>> functions are strict, we could either error out, or simply do what
>> normal function calls do and pretend they return NULL for NULL inputs.
>> Not sure how the rule that forward transition functions may not return
>> NULL if there's an inverse transition function would fit in if we do
>> the latter, though.
>>
>> The question is - is it worth the effort to add that flag?
>
> One use-case I had in mind upthread was suppose you wanted to write a
> custom version of array_agg that only collected non-NULL values. With
> such a flag, that would be trivial, but with the current patch you'd
> have to (counter-intuitively) wrap the inverse transition function in a
> strict function.

I'd be more convinced by that if doing so was actually possible for
non-superusers.
But it isn't, because the aggregate uses "internal" as its state type and
it's thus entirely up to the user to not crash the database by mixing
transfer and final functions with incompatible state data. Plus, instead of
creating a custom aggregate, one can just use a FILTER clause to get rid of
the NULL values.

> Another use-case I can imagine is suppose you wanted a custom version
> of sum() that returned NULL if any of the input values were NULL. If
> you knew that most of your data was non-NULL, you might still wish to
> benefit from an inverse transition function. Right now the patch won't
> allow that because of the error in advance_windowaggregate(), but
> possibly that could be relaxed by forcing a restart in that case.

That's not really true - that patch only forbids that if you insist on
representing the state "I have seen a NULL input" with a NULL state value.
But if you instead just count the number of NULLs in your transition
functions, all you need to do is to have your final function return NULL
if that count is not zero.

> If I've understood correctly that should give similar to current
> performance if NULLs are present, and enhanced performance as the
> window moved over non-NULL data.

Exactly - and this makes defining a NULL-sensitive SUM() this way rather
silly - a simple counter has very nearly zero overhead, and avoids all
rescans.

> In that second case, it would also be nice if you could simply re-use
> the existing sum forward and inverse transition functions, with a
> different null-handling flag.

Even if we had a nulls=process|ignore flag, SUM's transition functions
would still need to take that use-case into account explicitly to make
this work - at the very least, the forward transition function would need
to return NULL if the input is NULL instead of just skipping it as it
does now.
But that would lead to completely unnecessary rescans, so what we actually
ought to do then is to make it track whether there have been NULL inputs
and make the finalfunc return NULL in this case. Which would border on
ironic, since the whole raison d'etre for this is to *avoid* spreading
NULL-counting logic around...

Plus, to make this actually useable, we'd have to document this, and tell
people how to define such a SUM aggregate. But I don't want to go there -
we really mustn't encourage people to mix-and-match built-in aggregates
with state type "internal", since whether they work or crash depends on
factors outside the control of the user.

And finally, you can get that behaviour quite easily by doing

    CASE WHEN bool_and(input IS NOT NULL) THEN whatever_agg(input) ELSE NULL END

> So I think an ignore-nulls flag would have real benefits, as well as
> being a cleaner design than relying on a strict inverse transition
> function.

The more I think about this, the less convinced I am. In fact, I'm
currently leaning towards just forbidding non-strict forward transition
function with strict inverses, and adding non-NULL counters to the
aggregates that then require them. It's really only the SUM() aggregates
that are affected by this, I think.
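Florian's NULL-counting alternative - tracking how many NULL inputs are currently in the frame instead of encoding "saw a NULL" as a NULL state - might look like this as a toy Python model (invented names, not PostgreSQL code):

```python
# A NULL-sensitive sum that still supports an inverse transition.
# State is (running_sum, nulls_in_frame); None plays the role of SQL NULL.

def trans(state, value):            # forward transition (non-strict)
    s, nulls = state
    return (s, nulls + 1) if value is None else (s + value, nulls)

def inv_trans(state, value):        # inverse transition (non-strict)
    s, nulls = state
    return (s, nulls - 1) if value is None else (s - value, nulls)

def final(state):
    s, nulls = state
    return None if nulls > 0 else s  # any NULL in the frame -> NULL result

state = (0, 0)
for v in (1, None, 3):
    state = trans(state, v)
assert final(state) is None          # a NULL input poisons the result...
state = inv_trans(state, None)       # ...but only until it leaves the frame
assert final(state) == 4
```

Because "I saw a NULL" lives in the counter rather than in a NULL state, the frame can keep sliding with O(1) work per row, with no restart needed when a NULL enters or leaves.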
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 28 January 2014 20:16, Florian Pflug wrote:
> On Jan27, 2014, at 23:28 , Dean Rasheed wrote:
>> strict transfn vs non-strict inv_transfn
>>
>>
>> This case is explicitly forbidden, both in CREATE AGGREGATE and in the
>> executor. To me, that seems overly restrictive --- if transfn is
>> strict, then you know for sure that no NULL values are added to the
>> aggregate state, so surely it doesn't matter whether or not
>> inv_transfn is strict. It will never be asked to remove NULL values.
>>
>> I think there are definite use-cases where a user might want to use a
>> pre-existing function as the inverse transition function, so it seems
>> harsh to force them to wrap it in a strict function in this case.
>>
>> AFAICS, the code in advance/retreat_windowaggregate should just work
>> if those checks are removed.
>
> True. It didn't use to in earlier versions of the patch because the advance
> and retreat functions looked at the strict settings directly, but now that
> there's an ignore_nulls flag in the executor node, the only requirement
> should be that ignore_nulls is set if either of the transition functions
> is strict.
>
> I'm not sure that the likelihood of someone wanting to combine a strict
> forward with a non-strict inverse function is very high, but neither
>

Me neither, but the checks to forbid it aren't adding anything, and I
think it's best to make it as flexible as possible.

>> non-strict transfn vs strict inv_transfn
>>
>>
>> At first this seems as though it must be an error --- the forwards
>> transition function allows NULL values into the aggregate state, and
>> the inverse transition function is strict, so it cannot remove them.
>>
>> But actually what the patch is doing in this case is treating the
>> forwards transition function as partially strict --- it won't be
>> passed NULL values (at least in a window context), but it may be
>> passed a NULL state in order to build the initial state when the first
>> non-NULL value arrives.
>
> Exactly.
>
>> It looks like this behaviour is intended to support aggregates that
>> ignore NULL values, but cannot actually be made to have a strict
>> transition function. I think typically this is because the aggregate's
>> initial state is NULL and its state type differs from the type of the
>> values being aggregated, and so can't be automatically created from
>> the first non-NULL value.
>
> Yes. I added this because the alternative would have been to count
> non-NULL values in most of the existing SUM() aggregates.
>
>> That all seems quite ugly though, because now you have a transition
>> function that is not strict, which is passed NULL values when used in
>> a normal aggregate context, and not passed NULL values when used in a
>> window context (whether sliding or not), except for the NULL state for
>> the first non-NULL value.
>
> Ugh, true. Clearly, nodeAgg.c needs to follow what nodeWindowAgg.c does
> and skip NULL inputs if the aggregate has a strict inverse transition
> function. The fact that we never actually *call* the inverse doesn't
> matter. Will fix that once we've decided on the various strictness issues.
>

Yuk!

>> I'm not sure if there is a better way to do it though. If we disallow
>> this case, these aggregates would have to use non-strict functions for
>> both the forward and inverse transition functions, which means they
>> would have to implement their own non-NULL value counting.
>> Alternatively, allowing strict transition functions for these
>> aggregates would require that we provide some other way to initialise
>> the state from the first non-NULL input, such as a new initfn.
>
> I'm not sure an initfn would really help. It would allow us to return
> to the initial requirement that the strict settings match - but you
> already deem the check that the forward transition function can only
> be strict if the inverse is also overly harsh, so would that really be
> an improvement? It'd also cost us an additional pg_proc entry per aggregate.
>
> Another idea would be to have an explicit nulls=ignore|process option
> for CREATE AGGREGATE. If nulls=process and either of the transition
> functions are strict, we could either error out, or simply do what
> normal function calls do and pretend they return NULL for NULL inputs.
> Not sure how the rule that forward transition functions may not return
> NULL if there's an inverse transition function would fit in if we do
> the latter, though.
>
> The question is - is it worth the effort to add that flag?
>

Yeah, I thought about a flag too. I think that could work quite nicely.
Basically where I'm coming from is trying to make this as flexible as
possible, without pre-judging the kinds of aggregates that users may want
to add.

One use-case I had in mind upthread was suppose you wanted to write a
custom version of array_agg that only collected non-NULL values. With such
a flag, that would be trivial
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan27, 2014, at 23:28 , Dean Rasheed wrote:
> strict transfn vs non-strict inv_transfn
>
>
> This case is explicitly forbidden, both in CREATE AGGREGATE and in the
> executor. To me, that seems overly restrictive --- if transfn is
> strict, then you know for sure that no NULL values are added to the
> aggregate state, so surely it doesn't matter whether or not
> inv_transfn is strict. It will never be asked to remove NULL values.
>
> I think there are definite use-cases where a user might want to use a
> pre-existing function as the inverse transition function, so it seems
> harsh to force them to wrap it in a strict function in this case.
>
> AFAICS, the code in advance/retreat_windowaggregate should just work
> if those checks are removed.

True. It didn't use to in earlier versions of the patch because the
advance and retreat functions looked at the strict settings directly, but
now that there's an ignore_nulls flag in the executor node, the only
requirement should be that ignore_nulls is set if either of the transition
functions is strict.

I'm not sure that the likelihood of someone wanting to combine a strict
forward with a non-strict inverse function is very high, but neither

> non-strict transfn vs strict inv_transfn
>
>
> At first this seems as though it must be an error --- the forwards
> transition function allows NULL values into the aggregate state, and
> the inverse transition function is strict, so it cannot remove them.
>
> But actually what the patch is doing in this case is treating the
> forwards transition function as partially strict --- it won't be
> passed NULL values (at least in a window context), but it may be
> passed a NULL state in order to build the initial state when the first
> non-NULL value arrives.

Exactly.

> It looks like this behaviour is intended to support aggregates that
> ignore NULL values, but cannot actually be made to have a strict
> transition function.
> I think typically this is because the aggregate's
> initial state is NULL and its state type differs from the type of the
> values being aggregated, and so can't be automatically created from
> the first non-NULL value.

Yes. I added this because the alternative would have been to count
non-NULL values in most of the existing SUM() aggregates.

> That all seems quite ugly though, because now you have a transition
> function that is not strict, which is passed NULL values when used in
> a normal aggregate context, and not passed NULL values when used in a
> window context (whether sliding or not), except for the NULL state for
> the first non-NULL value.

Ugh, true. Clearly, nodeAgg.c needs to follow what nodeWindowAgg.c does
and skip NULL inputs if the aggregate has a strict inverse transition
function. The fact that we never actually *call* the inverse doesn't
matter. Will fix that once we've decided on the various strictness issues.

> I'm not sure if there is a better way to do it though. If we disallow
> this case, these aggregates would have to use non-strict functions for
> both the forward and inverse transition functions, which means they
> would have to implement their own non-NULL value counting.
> Alternatively, allowing strict transition functions for these
> aggregates would require that we provide some other way to initialise
> the state from the first non-NULL input, such as a new initfn.

I'm not sure an initfn would really help. It would allow us to return to
the initial requirement that the strict settings match - but you already
deem the check that the forward transition function can only be strict if
the inverse is also overly harsh, so would that really be an improvement?
It'd also cost us an additional pg_proc entry per aggregate.

Another idea would be to have an explicit nulls=ignore|process option for
CREATE AGGREGATE.
If nulls=process and either of the transition functions are strict, we
could either error out, or simply do what normal function calls do and
pretend they return NULL for NULL inputs. Not sure how the rule that
forward transition functions may not return NULL if there's an inverse
transition function would fit in if we do the latter, though.

The question is - is it worth the effort to add that flag?

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
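The nulls=ignore|process idea can be modelled in a few lines of Python (a hypothetical sketch, not proposed code): with nulls=ignore the executor filters NULLs before the transition function ever runs, so strictness no longer has to double as a NULL-handling declaration:

```python
# Toy model of the proposed flag. None plays the role of SQL NULL.

def aggregate(values, transfn, init, ignore_nulls):
    state = init
    for v in values:
        if v is None and ignore_nulls:
            continue                  # nulls=ignore: NULL never reaches transfn
        state = transfn(state, v)     # nulls=process: transfn sees the NULL
    return state

count_nonnull = lambda s, v: s + (0 if v is None else 1)
count_all = lambda s, v: s + 1

data = [1, None, 2]
assert aggregate(data, count_nonnull, 0, ignore_nulls=True) == 2   # like COUNT(x)
assert aggregate(data, count_all, 0, ignore_nulls=False) == 3      # like COUNT(*)
```

In this model the ignore/process decision is an explicit aggregate property, rather than something inferred from whether the transition functions happen to be strict.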
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 25 January 2014 02:21, Florian Pflug wrote: > I've now split this up into > > invtrans_base: Basic machinery plus tests with SQL-language aggregates > invtrans_arith: COUNT(),SUM(),AVG(),STDDEV() and the like > invtrans_minmax: MIN(),MAX(),BOOL_AND(),BOOL_OR() > invtrans_collecting: ARRAY_AGG(), STRING_AGG() > invtrans_docs: Documentation > Thanks. That makes it a bit easier to review, and hopefully easier to commit. The thing that I'm currently trying to get my head round is the null/not null, strict/not strict handling in this patch, which I find a bit odd, particularly when transfn and inv_transfn have different strictness settings: strict transfn vs non-strict inv_transfn This case is explicitly forbidden, both in CREATE AGGREGATE and in the executor. To me, that seems overly restrictive --- if transfn is strict, then you know for sure that no NULL values are added to the aggregate state, so surely it doesn't matter whether or not inv_transfn is strict. It will never be asked to remove NULL values. I think there are definite use-cases where a user might want to use a pre-existing function as the inverse transition function, so it seems harsh to force them to wrap it in a strict function in this case. AFAICS, the code in advance/retreat_windowaggregate should just work if those checks are removed. non-strict transfn vs strict inv_transfn At first this seems as though it must be an error --- the forwards transition function allows NULL values into the aggregate state, and the inverse transition function is strict, so it cannot remove them. But actually what the patch is doing in this case is treating the forwards transition function as partially strict --- it won't be passed NULL values (at least in a window context), but it may be passed a NULL state in order to build the initial state when the first non-NULL value arrives. 
It looks like this behaviour is intended to support aggregates that ignore
NULL values, but cannot actually be made to have a strict transition
function. I think typically this is because the aggregate's initial state
is NULL and its state type differs from the type of the values being
aggregated, and so can't be automatically created from the first non-NULL
value.

That all seems quite ugly though, because now you have a transition
function that is not strict, which is passed NULL values when used in a
normal aggregate context, and not passed NULL values when used in a window
context (whether sliding or not), except for the NULL state for the first
non-NULL value.

I'm not sure if there is a better way to do it though. If we disallow this
case, these aggregates would have to use non-strict functions for both the
forward and inverse transition functions, which means they would have to
implement their own non-NULL value counting. Alternatively, allowing
strict transition functions for these aggregates would require that we
provide some other way to initialise the state from the first non-NULL
input, such as a new initfn.

Thoughts?

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan26, 2014, at 00:24 , David Rowley wrote: >> On Sat, Jan 25, 2014 at 3:21 PM, Florian Pflug wrote: >> On Jan24, 2014, at 08:47 , Dean Rasheed wrote: >> The invtrans_minmax patch doesn't contain any tests yet - David, could >> you provide some for these functions, and also for bool_and and bool_or? >> We don't need to test every datatype, but a few would be nice. > > I've added a few regression tests for min, max, bool_or and bool_and. > I've pushed these up to here: > > https://github.com/david-rowley/postgres/commits/invtrans_minmax

OK, I've pushed this to github. I haven't produced new patches yet - I think it's probably better to wait for a few things to pile up, to make this less of a moving target. But ultimately, this is up to Dean - Dean, I'll do whatever makes your life as a reviewer easier.

> I did wonder if you'd want to see uses of FILTER in there as I'm thinking > that should really be covered by the core patch and the tests here really > should just be checking the inverse transition functions for min and max.

I don't mind the FILTER - when this gets committed, the distinction between what was in which patch goes away anyway. What I did realize when merging this is that we currently don't test the case of a window which lies fully after the current row (i.e. BETWEEN n FOLLOWING AND m FOLLOWING). We do test BETWEEN n PRECEDING AND m PRECEDING, because the array_agg and string_agg tests do that. So maybe some of the MIN/MAX or arithmetic tests could exercise that case?

> As for the data types tested, I ended up just adding tests for int and text > for min and max. Let me know if you think that more should be tested.

That's sufficient for the regression tests, I think.

> As for bool_or and bool_and. I didn't think there was much extra that would > need to be tested after I added 1 test. It's pretty simple code and adding anything > extra seems like it would be testing something else.

Sounds fine to me.
best regards, Florian Pflug
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Sat, Jan 25, 2014 at 3:21 PM, Florian Pflug wrote: > On Jan24, 2014, at 08:47 , Dean Rasheed wrote: > The invtrans_minmax patch doesn't contain any tests yet - David, could > you provide some for these functions, and also for bool_and and bool_or? > We don't need to test every datatype, but a few would be nice.

I've added a few regression tests for min, max, bool_or and bool_and. I've pushed these up to here: https://github.com/david-rowley/postgres/commits/invtrans_minmax

I did wonder if you'd want to see uses of FILTER in there as I'm thinking that should really be covered by the core patch and the tests here really should just be checking the inverse transition functions for min and max.

As for the data types tested, I ended up just adding tests for int and text for min and max. Let me know if you think that more should be tested.

As for bool_or and bool_and, I didn't think there was much extra that would need to be tested after I added 1 test. It's pretty simple code and adding anything extra seems like it would be testing something else.

Regards David Rowley
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan25, 2014, at 09:50 , David Rowley wrote: > On Thu, Jan 23, 2014 at 1:57 PM, Florian Pflug wrote: >> On Jan23, 2014, at 01:17 , David Rowley wrote: >> > On Wed, Jan 22, 2014 at 12:46 AM, Florian Pflug wrote: >> >> If you want to play with >> >> this, I think the first step has to be to find a set of guarantees that >> >> SUM(float) is supposed to meet. Currently, SUM(float) guarantees that if >> >> the >> >> summands all have the same sign, the error is bound by C * N, where C is >> >> some >> >> (machine-specific?) constant (about 1e-15 or so), and N is the number of >> >> input >> >> rows. Or at least so I think from looking at SUMs over floats in >> >> descending >> >> order, which I guess is the worst case. You could, for example, try to >> >> see if >> >> you can find a invertibility conditions which guarantees the same, but >> >> allows >> >> C to be larger. That would put a bound on the number of digits lost by >> >> the new >> >> SUM(float) compared to the old one. >> > >> > I guess if the case is that normal (non-window) sum(float) results are >> > undefined >> > unless you add an order by to the aggregate then I guess there is no >> > possible >> > logic to put in for inverse transitions that will make them behave the >> > same as >> > an undefined behaviour. >> >> Actually, if sum(float) was really undefined, it'd be *very* easy to provide >> an >> inverse transition function - just make it a no-op. Heck, you could then even >> make the forward transition function a no-op, since the very definition of >> "undefined behaviour" is "result can be anything, including setting your >> house >> on fire". The point is, it's *not* undefined - it's just imprecise. And the >> imprecision can even be quantified, it just depends on the number of >> input rows (the equal-sign requirement is mostly there to make "imprecision" >> equivalent to "relative error"). > > My apologies, I meant to use the term nondeterministic rather than undefined. 
> There's really not any need that I can see to turn things silly here.

I wasn't trying to be absurd, I was trying to get a point across.

> My point was more that since sum(float) can give different results if it used > an index scan rather than a seq scan, trying to get the inverse transition to > match something that gives varying results sounds like a tricky task to take > on.

You don't have to match it digit-by-digit! In that I fully agree with Kevin - floats are *always* an approximation, and thus so is SUM(float). Summarization order is BTW not the only source of non-determinism for SUM(float) - the exact result can vary between architectures, and for some architectures between compilers. (Intel is one of these, due to their 80-bit extended precision format that gets truncated to 64-bit when stored to main memory). But in a large number of cases, they won't vary by *much*, which is *the* reason why SUM(float) is *not* totally useless. And reasoning about SUM(float) which ignores that by simply calling it "non-deterministic", "undefined" or whatever, without any quantification of the possible error, has thus zero chance of leading to interesting conclusions.

>> Secondly, you'd really need a second aggregate definition - usually, the >> non-invertible aggregate will get away with a much smaller state >> representation. >> Aggregates with state type "internal" could hack their way around that by >> simply not using the additional parts of their state structure, but e.g. >> plpgsql aggregates would still incur overhead due to repeated copying of >> a unnecessary large state value. If it's possible to represent the >> forward-only but not the invertible state state with a copy-by-value type, >> that overhead could easily be a large part of the total overhead you paid >> for having an inverse transition function.
> > I had imagined they keep the same state type and don't use any extra variables > that are defined for the forward transition function that is invertible.

Yeah, and the above explains that at least for non-C-language aggregates, passing around that unnecessarily large state may very well prove to be the source of a large part, if not almost all, of the overhead. So having just a separate forward transition function will buy you almost nothing in some cases.

I just tried this. I defined two aggregates mymax(int4) and myfastmax(int4), both with just a forward transition function, both SQL-language functions. mymax uses a composite type for the state containing an int4 field holding the current maximum, and a dummy int8 field. myfastmax uses a plain int4 for the state, holding the current maximum. Both forward transition functions essentially do

  case when current_max > v then current_max else v end

On my laptop, computing the maximum of 1e6 rows takes about 4.5 seconds with myfastmax and 7.8 seconds with mymax. If I make mymax's transition function increment the dummy field on every transition, the time increases from 7.8 to 8.
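The mymax/myfastmax experiment above can be sketched outside SQL; this is a hypothetical Python re-creation (the aggregate names and fields mirror the mail, and the per-call dict copy stands in for the state copying that non-C aggregates pay for):

```python
# Two "aggregates" computing MAX: one with a plain scalar state (like
# myfastmax's int4), one dragging a composite state with a dummy field
# (like mymax's row type). The dict is copied on every transition to
# mimic by-value copying of a needlessly large state.

def transition_fast(state, v):
    # scalar state: just keep the running maximum
    return v if state is None or v > state else state

def transition_slow(state, v):
    state = dict(state)                   # copy the whole composite state
    if v > state["current_max"]:
        state["current_max"] = v
    state["dummy"] += 1                   # touching the dummy widens the gap
    return state

values = list(range(1000))

fast = None
slow = {"current_max": values[0], "dummy": 0}
for v in values:
    fast = transition_fast(fast, v)
    slow = transition_slow(slow, v)

# Both agree on the answer; only the state size (and hence copying cost) differs.
assert fast == slow["current_max"] == 999
```

Timing either loop shows the same shape of result as the SQL experiment: the answer is identical, the cost of hauling the larger state around is not.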
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
David Rowley writes: > My point was more that since sum(float) can give different results if it > used an index scan rather than a seq scan, trying to get the inverse > transition to match something that gives varying results sounds like a > tricky task to take on.

This is just a variant of the same excuse we heard before. The question is not whether sum(float8) can give bad results; the question is whether we are going to break applications that are using it carefully (ie with an appropriate ORDER BY) and expecting to get good results.

>> Secondly, you'd really need a second aggregate definition - usually, the >> non-invertible aggregate will get away with a much smaller state >> representation.

Yeah. I think the idea of multiple transition functions in a single aggregate definition is pretty broken to start with, but the likelihood that they couldn't share aggregate state types puts it completely beyond sanity. We're not going to start inventing "stype2" etc.

regards, tom lane
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Thu, Jan 23, 2014 at 1:57 PM, Florian Pflug wrote: > On Jan23, 2014, at 01:17 , David Rowley wrote: > > On Wed, Jan 22, 2014 at 12:46 AM, Florian Pflug wrote: > >> If you want to play with > >> this, I think the first step has to be to find a set of guarantees that > >> SUM(float) is supposed to meet. Currently, SUM(float) guarantees that > if the > >> summands all have the same sign, the error is bound by C * N, where C > is some > >> (machine-specific?) constant (about 1e-15 or so), and N is the number > of input > >> rows. Or at least so I think from looking at SUMs over floats in > descending > >> order, which I guess is the worst case. You could, for example, try to > see if > >> you can find a invertibility conditions which guarantees the same, but > allows > >> C to be larger. That would put a bound on the number of digits lost by > the new > >> SUM(float) compared to the old one. > > > > I guess if the case is that normal (non-window) sum(float) results are > undefined > > unless you add an order by to the aggregate then I guess there is no > possible > > logic to put in for inverse transitions that will make them behave the > same as > > an undefined behaviour. > > Actually, if sum(float) was really undefined, it'd be *very* easy to > provide an > inverse transition function - just make it a no-op. Heck, you could then > even > make the forward transition function a no-op, since the very definition of > "undefined behaviour" is "result can be anything, including setting your > house > on fire". The point is, it's *not* undefined - it's just imprecise. And the > imprecision can even be quantified, it just depends on the number of > input rows (the equal-sign requirement is mostly there to make > "imprecision" > equivalent to "relative error"). > > My apologies, I meant to use the term nondeterministic rather than undefined. There's really not any need that I can see to turn things silly here. 
My point was more that since sum(float) can give different results if it used an index scan rather than a seq scan, trying to get the inverse transition to match something that gives varying results sounds like a tricky task to take on.

> >> > If it seems sound enough, then I may implement it in C to see how much
> >> > overhead it adds to forward aggregation for floating point types, but even
> >> > if it did add too much overhead to forward aggregation it might be worth
> >> > allowing aggregates to have 2 forward transition functions and if the 2nd
> >> > one exists then it could be used in windowing functions where the frame
> >> > does not have "unbounded following".
> >>
> >> I don't think adding yet another type of aggregation function is the
> >> solution here.
> >
> > Why not?
> >
> > I thought, if in the cases where we need to burden the forward transition
> > functions with extra code to make inverse transitions possible, then why
> > not have an extra transition function so that it does not slow down normal
> > aggregation?
> >
> > My testing of sum(numeric) last night does not show any slow down by that
> > extra dscale tracking code that was added, but if it did then I'd be thinking
> > seriously about 2 forward transition functions to get around the problem.
> > I don't understand what would be wrong with that. The core code could just
> > make use of the 2nd function if it existed and inverse transitions were
> > thought to be required. i.e in a window context and does not
> > have UNBOUNDED PRECEDING.
>
> First, every additional function increases the maintenance burden, and at
> some point the expected benefit simply isn't worth it. IMHO at least.

There's little point in arguing for this, as we've managed to narrow any forward transition overhead to background noise so far; my point was more that if there were no other way to maintain performance and have an inverse transition function, this might be a solution to that problem.
> Secondly, you'd really need a second aggregate definition - usually, the > non-invertible aggregate will get away with a much smaller state > representation. > Aggregates with state type "internal" could hack their way around that by > simply not using the additional parts of their state structure, but e.g. > plpgsql aggregates would still incur overhead due to repeated copying of > a unnecessary large state value. If it's possible to represent the > forward-only but not the invertible state state with a copy-by-value type, > that overhead could easily be a large part of the total overhead you paid > for having an inverse transition function. > > I had imagined they keep the same state type and don't use any extra variables that are defined for the forward transition function that is invertible. > And lastly, we could achieve much of the benefit of a second transition > function by simply telling the forward transition function whether to > expect inverse transition function calls. We already expose things like > the aggregation memory con
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan24, 2014, at 08:47 , Dean Rasheed wrote:
> I think it should probably be broken up. It might be overly ambitious
> to try to get all of this committed during this commitfest, and in any
> case, I suspect that the committer would probably choose to commit it
> in stages. Perhaps something like:
>
> Patch 1
> - Basic support for inverse transition functions, CREATE AGGREGATE
> support and doc updates. This should include test cases to validate
> that the underlying executor changes are correct, by defining custom
> aggregates such as sum(int) and array_agg() using inverse transition
> functions.
>
> Patch 2
> - Add built-in inverse transition functions for count, sum(int), and friends.
>
> Patch 3, 4...
> - Other related groups of built-in aggregates. By this point, it
> should be a fairly mechanical process.
>
> Splitting it up this way now should help to focus on getting patch 1
> correct, without being distracted by all the other aggregates that may
> or may not usefully be made to have inverse transition functions. I
> think the value of the feature has been proved, and it is good to see
> that it can be applied to so many aggregates, but let's not try to do
> it all at once.

Working on that now, will post individual patches later today.

best regards, Florian Pflug
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 22 January 2014 09:54, David Rowley wrote: > I've performed some more benchmarks on this patch tonight. The results and > full recreation scripts are attached along with the patch it was tested > against.

I noticed that the rate of changes to this patch has dropped off, which I took as a sign that it is ready for review, so I started doing that. My first thought was wow, this is a big patch!

I think it should probably be broken up. It might be overly ambitious to try to get all of this committed during this commitfest, and in any case, I suspect that the committer would probably choose to commit it in stages. Perhaps something like:

Patch 1 - Basic support for inverse transition functions, CREATE AGGREGATE support and doc updates. This should include test cases to validate that the underlying executor changes are correct, by defining custom aggregates such as sum(int) and array_agg() using inverse transition functions.

Patch 2 - Add built-in inverse transition functions for count, sum(int), and friends.

Patch 3, 4... - Other related groups of built-in aggregates. By this point, it should be a fairly mechanical process.

Splitting it up this way now should help to focus on getting patch 1 correct, without being distracted by all the other aggregates that may or may not usefully be made to have inverse transition functions. I think the value of the feature has been proved, and it is good to see that it can be applied to so many aggregates, but let's not try to do it all at once.

Regarding what would be in patch 1, I've only given it a cursory look, but I did immediately notice a problem with the nodeWindowAgg.c changes --- for aggregates with non-strict transition functions, something equivalent to the not null counting code is needed to make it work correctly with FILTER.
For example, the following query gives the wrong results with this patch:

SELECT array_agg(i) FILTER (WHERE i%3=0)
       OVER (ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
FROM generate_series(1,10) g(i);

Perhaps it should just always count the non-skipped values added after removal of filtered and null values. Then you might be able to simplify things by getting rid of ignore_nulls from WindowStatePerAggData and replacing notnullcount with a more general valueCount to track the number of values actually added to transValue.

I haven't read through nodeWindowAgg.c in any detail yet (I think it will take me a while to fully get my head round that code) but I think it needs some more test cases to verify that the various corner cases are covered.

Regards, Dean
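For reference, the results that query is supposed to produce can be worked out by hand; this small Python sketch (plain Python, not SQL, with an illustrative function name) collects each two-row frame and applies the filter, as the window aggregate should:

```python
# What array_agg(i) FILTER (WHERE i%3=0)
#      OVER (ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
# over generate_series(1,10) should return: per row, take the frame
# (previous row and current row) and keep only values passing the filter.

def filtered_array_agg(rows, preceding=1):
    out = []
    for i in range(len(rows)):
        frame = rows[max(0, i - preceding): i + 1]
        agg = [v for v in frame if v % 3 == 0]
        out.append(agg or None)   # an empty aggregation yields NULL
    return out

print(filtered_array_agg(list(range(1, 11))))
```

Only rows whose frame contains a multiple of 3 get a non-NULL array, which is exactly the case the non-strict transition function must track with a value counter.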
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan23, 2014, at 01:07 , David Rowley wrote: > On Tue, Jan 21, 2014 at 3:20 AM, Florian Pflug wrote: >> On Jan20, 2014, at 08:42 , David Rowley wrote: >> >> On Mon, Jan 20, 2014 at 2:45 PM, Florian Pflug wrote: >> >> * I've also renamed INVFUNC to INVSFUNC. That's a pretty invasive change, >> >> and >> >> it's the last commit, so if you object to that, then you can merge up to >> >> eafa72330f23f7c970019156fcc26b18dd55be27 instead of >> >> de3d9148be9732c4870b76af96c309eaf1d613d7. >> > >> > >> > Seems like sfunc really should be tfunc then we could have invtfunc. I'd >> > probably >> > understand this better if I knew what the 's' was for in sfunc. I've not >> > applied >> > this just yet. Do you have a reason why you think it's better? >> >> My issue with just "invfunc" is mainly that it's too generic - it doesn't >> tell >> you what it's supposed to be the inverse of. >> >> I've always assumed that 's' in 'sfunc' and 'stype' stands for 'state', and >> that >> the naming is inspired by control theory, where the function which acts on >> the >> state space is often called S. > > Ok, that makes more sense now and it seems like a reasonable idea. I'm not > not quite > sure yet as when someone said upthread that these "negative transition > functions" as > I was calling them at the time should really be called "inverse transition > functions", > I then posted that I was going to call the create aggregate option "invfunc" > which > nobody seemed to object to. I just don't want to go and change that now. It > is very > possible this will come up again when the committer is looking at the patch. > It would > be a waste if it ended up back at invfunc after we changed it to invsfunc. Since we already settled on "inverse transition function", I kinda doubt that calling the parameter invsfunc is going to meet a lot of resistance. But we can put that off a little longer still... I've pushed a few additional things to https://github.com/fgp/postgres/tree/invtrans. 
* I updated the CREATE AGGREGATE documentation, trying to include the description of the various modes of inverse transition functions into the paragraphs which already explained about STRICT for transition functions and such.

* I've also updated the list of window functions to include a list of those aggregates which potentially need to restart the computation, i.e. MIN/MAX and the like.

* I've changed nodeWindowAgg.c to use per-aggregate aggregation contexts for the invertible aggregates. Without that, the aggregate context is potentially never reset, because that previously required *all* the aggregates to restart at the same time. That would be OK if we were sure not to leak unbounded amounts of stuff stored in that context, but unfortunately we sometimes do. For example, whenever a strict, invertible aggregate ends up with only NULL inputs, we re-initialize the aggregation, which leaks the old state value. We could pfree() that of course, but that state value might reference other stuff that we don't know about and thus cannot free. Separating the aggregation contexts is the only solution I came up with, so I did that.

* I've also tweaked an if to flag aggregates as invertible only if the frame head can actually move, i.e. if the frame start clause is something other than UNBOUNDED PRECEDING. Since the choice of whether to use a private aggregation context is driven by that flag, that also makes the above apply only to aggregates where the inverse transition function is actually used.

I hope to find some time tomorrow or so to complete my pass through the documentation - what's still missing is an explanation of the EXPLAIN VERBOSE ANALYZE field and maybe some cleanup of xaggr.sgml. Do you have any additional things pending?

best regards, Florian Pflug
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan23, 2014, at 01:17 , David Rowley wrote: > On Wed, Jan 22, 2014 at 12:46 AM, Florian Pflug wrote: >> If you want to play with >> this, I think the first step has to be to find a set of guarantees that >> SUM(float) is supposed to meet. Currently, SUM(float) guarantees that if the >> summands all have the same sign, the error is bound by C * N, where C is some >> (machine-specific?) constant (about 1e-15 or so), and N is the number of >> input >> rows. Or at least so I think from looking at SUMs over floats in descending >> order, which I guess is the worst case. You could, for example, try to see if >> you can find a invertibility conditions which guarantees the same, but allows >> C to be larger. That would put a bound on the number of digits lost by the >> new >> SUM(float) compared to the old one. > > I guess if the case is that normal (non-window) sum(float) results are > undefined > unless you add an order by to the aggregate then I guess there is no possible > logic to put in for inverse transitions that will make them behave the same as > an undefined behaviour. Actually, if sum(float) was really undefined, it'd be *very* easy to provide an inverse transition function - just make it a no-op. Heck, you could then even make the forward transition function a no-op, since the very definition of "undefined behaviour" is "result can be anything, including setting your house on fire". The point is, it's *not* undefined - it's just imprecise. And the imprecision can even be quantified, it just depends on the number of input rows (the equal-sign requirement is mostly there to make "imprecision" equivalent to "relative error"). 
>> > If it seems sound enough, then I may implement it in C to see how much
>> > overhead it adds to forward aggregation for floating point types, but even
>> > if it did add too much overhead to forward aggregation it might be worth
>> > allowing aggregates to have 2 forward transition functions and if the 2nd
>> > one exists then it could be used in windowing functions where the frame
>> > does not have "unbounded following".
>>
>> I don't think adding yet another type of aggregation function is the
>> solution here.
>
> Why not?
>
> I thought, if in the cases where we need to burden the forward transition
> functions with extra code to make inverse transitions possible, then why
> not have an extra transition function so that it does not slow down normal
> aggregation?
>
> My testing of sum(numeric) last night does not show any slow down by that
> extra dscale tracking code that was added, but if it did then I'd be thinking
> seriously about 2 forward transition functions to get around the problem.
> I don't understand what would be wrong with that. The core code could just
> make use of the 2nd function if it existed and inverse transitions were
> thought to be required. i.e in a window context and does not
> have UNBOUNDED PRECEDING.

First, every additional function increases the maintenance burden, and at some point the expected benefit simply isn't worth it. IMHO at least.

Secondly, you'd really need a second aggregate definition - usually, the non-invertible aggregate will get away with a much smaller state representation. Aggregates with state type "internal" could hack their way around that by simply not using the additional parts of their state structure, but e.g. plpgsql aggregates would still incur overhead due to repeated copying of an unnecessarily large state value.
If it's possible to represent the forward-only but not the invertible state with a copy-by-value type, that overhead could easily be a large part of the total overhead you paid for having an inverse transition function.

And lastly, we could achieve much of the benefit of a second transition function by simply telling the forward transition function whether to expect inverse transition function calls. We already expose things like the aggregation memory context to C-language transition functions, so the basic mechanism is already there. In fact, I pondered whether to do this - but then figured I'd leave it to whoever needs that facility first. Though it wouldn't be much code - basically, setting a flag in the WindowAggState, and exporting a function to be used by aggregates to read it, similar to what AggCheckCallContext does.

best regards, Florian Pflug
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Wed, Jan 22, 2014 at 12:46 AM, Florian Pflug wrote: > On Jan21, 2014, at 10:53 , David Rowley wrote: > > It came to me that it might be possible to implement inverse transitions > > for floating point aggregates by just detecting if precision has been > > lost during forward transitions. > > > > I've written the test to do this as: > > > > IF state.value + value = state.value AND value <> 0 THEN > > newstate.precision_lost := true; > > newstate.value := state.value; > > ELSE > > newstate.precision_lost := false; > > newstate.value := state.value + value; > > END IF; > > > > > > The inverse transition function checks the precision_lost and if it's > true it > > returns NULL. The core code is now implemented (thanks to Florian) to > > re-aggregate when NULL is returned from the inverse transition function. > > That's not sufficient, I fear. You can lose all significant digits of the > value > and still have precision_lost = false afterwards. Try summing over 1e16, > 1.01. > "SELECT 1e16::float8 + 1.01::float8 = 1e16::float8" returns FALSE, yet > "SELECT 1e16::float8 + 1.01::float8 - 1e16::float8" returns "2" where > "1.01" > would have been correct. That's still too much precision loss. > > I'm quite certain that the general idea has merit, but the actual > invertibility condition is going to be more involved. If you want to play > with > this, I think the first step has to be to find a set of guarantees that > SUM(float) is supposed to meet. Currently, SUM(float) guarantees that if > the > summands all have the same sign, the error is bound by C * N, where C is > some > (machine-specific?) constant (about 1e-15 or so), and N is the number of > input > rows. Or at least so I think from looking at SUMs over floats in descending > order, which I guess is the worst case. You could, for example, try to see > if > you can find a invertibility conditions which guarantees the same, but > allows > C to be larger. 
That would put a bound on the number of digits lost by the > new > SUM(float) compared to the old one. > > I guess if the case is that normal (non-window) sum(float) results are undefined unless you add an order by to the aggregate then I guess there is no possible logic to put in for inverse transitions that will make them behave the same as an undefined behaviour. > I don't have high hopes for this getting int 9.4, though. > > Yeah I'm going to drop it. I need to focus on documents and performance testing. > > If it seems sound enough, then I may implement it in C to see how much > > overhead it adds to forward aggregation for floating point types, but > even > > if it did add too much overhead to forward aggregation it might be worth > > allowing aggregates to have 2 forward transition functions and if the 2nd > > one exists then it could be used in windowing functions where the frame > > does not have "unbounded following". > > I don't think adding yet another type of aggregation function is the > solution here. > > Why not? I thought, if in the cases where we need to burden the forward transition functions with extra code to make inverse transitions possible, then why not have an extra transition function so that it does not slow down normal aggregation? My testing of sum(numeric) last night does not show any slow down by that extra dscale tracking code that was added, but if it did then I'd be thinking seriously about 2 forward transition functions to get around the problem. I don't understand what would be wrong with that. The core code could just make use of the 2nd function if it existed and inverse transitions were thought to be required. i.e in a window context and does not have UNBOUNDED PRECEDING. Regards David Rowley
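Florian's 1e16/1.01 counterexample from earlier in the thread is easy to reproduce with Python floats (IEEE double, the same representation as float8); the point is that the proposed "state + value = state" detector fires no alarm even though nearly all of the small summand's digits are already gone:

```python
# The precision hazard with an invertible SUM(float): adding a small
# value to a huge state does change the state, so a naive
# "state + value = state" precision_lost test reports no loss -- yet
# inverting the addition later cannot recover the small value.

big, small = 1e16, 1.01

s = big + small          # forward transition
assert s != big          # the detector sees a change, so "no loss"...

recovered = s - big      # ...but the inverse transition
print(recovered)         # yields 2.0 where 1.01 would be correct
```

At 1e16 the spacing between adjacent doubles is 2, so 1.01 can only be added as 2; the detector would need to compare against the value actually absorbed, not merely check for equality.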
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Tue, Jan 21, 2014 at 3:20 AM, Florian Pflug wrote: > On Jan20, 2014, at 08:42 , David Rowley wrote: > >> On Mon, Jan 20, 2014 at 2:45 PM, Florian Pflug wrote: > >> * I've also renamed INVFUNC to INVSFUNC. That's a pretty invasive change, and > >> it's the last commit, so if you object to that, then you can merge up to > >> eafa72330f23f7c970019156fcc26b18dd55be27 instead of > >> de3d9148be9732c4870b76af96c309eaf1d613d7. > > > > Seems like sfunc really should be tfunc then we could have invtfunc. I'd probably > > understand this better if I knew what the 's' was for in sfunc. I've not applied > > this just yet. Do you have a reason why you think it's better? > > My issue with just "invfunc" is mainly that it's too generic - it doesn't tell > you what it's supposed to be the inverse of. > > I've always assumed that 's' in 'sfunc' and 'stype' stands for 'state', and that > the naming is inspired by control theory, where the function which acts on the > state space is often called S.

Ok, that makes more sense now and it seems like a reasonable idea. I'm not quite sure yet, as when someone said upthread that these "negative transition functions" as I was calling them at the time should really be called "inverse transition functions", I then posted that I was going to call the create aggregate option "invfunc" which nobody seemed to object to. I just don't want to go and change that now. It is very possible this will come up again when the committer is looking at the patch. It would be a waste if it ended up back at invfunc after we changed it to invsfunc.

Regards David Rowley
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan20, 2014, at 15:20 , Florian Pflug wrote:
> * In CREATE AGGREGATE, we should state the precise axioms we assume about
> forward and inverse transition functions. The last time I read the text there,
> it was a bit ambiguous about whether inverse transition functions assume
> commutativity, i.e. whether we assume that we can remove inputs other than the
> first one from the aggregation. Currently, all the inverse transition functions
> we have are, in fact, commutative, because all the forward transition functions
> are also. But we could e.g. add an inverse to array_agg and string_agg, and
> those would obviously need this ordering guarantee. I'd also like us to explain
> that the "inverse" in "inverse transition function" shouldn't be taken too
> literally. For forward transition function F, inverse transition function G,
> and inputs a,b,... we *don't* require that G(F(s,a),a) == s. What we do require
> is that if I is the initial state, then
>
> G(F(...F(F(I,a),b)...,z),a) == F(...F(I,b)...,z).
>
> (Well, actually we don't strictly require even that; the two states don't
> need to be identical, they just need to produce identical outputs when passed
> to the final function)
>
> * In CREATE AGGREGATE, we should also explain the NULL semantics you get
> with various combinations of strict and non-strict forward and inverse
> transition functions. I think a table listing all the combinations is in order
> there, with a column explaining the semantics you get. We should also clearly
> state that once you provide an inverse transition function, NULL isn't a valid
> state value anymore, except as an initial state, i.e. that the forward
> transition function must never return NULL in this case.

I gave that a shot, the results are at https://github.com/fgp/postgres/tree/invtrans

> * The window function page should explain the performance hazards when
> a moving frame head is involved. Ideally, we'd list which aggregates never
> have to restart, which sometimes might, and what you can do to avoid that.
> We can also tell people to use EXPLAIN VERBOSE ANALYZE to check whether
> restarts are occurring. This is where we'd tell people to cast their numeric
> operands to a type with defined scale to avoid restarts, and where we'd state
> that SUM(float) *does* always restart due to precision loss issues.
>
> BTW, something that we haven't addressed at all is how inverse transition
> functions interact with what is called "ordered-set aggregates". I haven't
> wrapped my head around these fully, I think, so I'm still not sure if there's
> anything to do there or not. Can those even be used as window functions?
> Should we forbid ordered-set aggregates from specifying an inverse transition
> function?

It seems that these aren't valid window functions anyway, so there's nothing to do for now, I think.

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
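The invertibility axiom above is easy to check mechanically for a candidate forward/inverse pair. A minimal sketch in Python (purely illustrative, not part of the patch; `check_invertibility` is an invented helper), with plain functions standing in for F and G:

```python
# Check the axiom: G(F(...F(F(I,a),b)...,z), a) == F(...F(I,b)...,z),
# i.e. removing the oldest input via the inverse G must give the same
# state as re-aggregating the remaining inputs from scratch.

def check_invertibility(F, G, I, inputs):
    state = I
    for x in inputs:
        state = F(state, x)
    removed = G(state, inputs[0])   # remove the oldest input

    rescanned = I                   # brute-force re-aggregation
    for x in inputs[1:]:
        rescanned = F(rescanned, x)
    return removed == rescanned

F = lambda s, x: s + x              # forward transition for SUM
G = lambda s, x: s - x              # inverse transition for SUM

print(check_invertibility(F, G, 0, [3, 1, 4, 1, 5]))   # exact for integers
print(check_invertibility(F, G, 0.0, [1e16, 1.01]))    # fails for doubles
```

Exact types satisfy the axiom unconditionally; for IEEE-754 doubles the check fails on the 1e16/1.01 pair discussed elsewhere in the thread, which is exactly why SUM(float) got no inverse.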
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan21, 2014, at 10:53 , David Rowley wrote:
> It came to me that it might be possible to implement inverse transitions
> for floating point aggregates by just detecting if precision has been
> lost during forward transitions.
>
> I've written the test to do this as:
>
> IF state.value + value = state.value AND value <> 0 THEN
>   newstate.precision_lost := true;
>   newstate.value := state.value;
> ELSE
>   newstate.precision_lost := false;
>   newstate.value := state.value + value;
> END IF;
>
> The inverse transition function checks precision_lost and if it's true it
> returns NULL. The core code is now implemented (thanks to Florian) to
> re-aggregate when NULL is returned from the inverse transition function.

That's not sufficient, I fear. You can lose all significant digits of the value and still have precision_lost = false afterwards. Try summing over 1e16, 1.01. "SELECT 1e16::float8 + 1.01::float8 = 1e16::float8" returns FALSE, yet "SELECT 1e16::float8 + 1.01::float8 - 1e16::float8" returns "2" where "1.01" would have been correct. That's still too much precision loss.

I'm quite certain that the general idea has merit, but the actual invertibility condition is going to be more involved. If you want to play with this, I think the first step has to be to find a set of guarantees that SUM(float) is supposed to meet. Currently, SUM(float) guarantees that if the summands all have the same sign, the error is bounded by C * N, where C is some (machine-specific?) constant (about 1e-15 or so), and N is the number of input rows. Or at least so I think from looking at SUMs over floats in descending order, which I guess is the worst case.

You could, for example, try to find an invertibility condition which guarantees the same, but allows C to be larger. That would put a bound on the number of digits lost by the new SUM(float) compared to the old one.

I don't have high hopes for this getting into 9.4, though.
> If it seems sound enough, then I may implement it in C to see how much
> overhead it adds to forward aggregation for floating point types, but even
> if it did add too much overhead to forward aggregation it might be worth
> allowing aggregates to have 2 forward transition functions and if the 2nd
> one exists then it could be used in windowing functions where the frame
> does not have "unbounded following".

I don't think adding yet another type of aggregation function is the solution here.

best regards,
Florian Pflug
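Florian's 1e16 example can be reproduced outside SQL, since Python floats are the same IEEE-754 doubles as float8 (a quick illustrative check, not from the thread):

```python
# At magnitude 1e16 adjacent doubles are spaced 2.0 apart, so adding 1.01
# does change the stored sum (the naive "did the sum change?" test passes),
# yet the amount actually added is 2, not 1.01.
a = 1e16
b = 1.01

assert a + b != a           # addition is visible, so precision_lost stays false...
print((a + b) - a)          # ...but subtracting a back yields 2.0, not 1.01
```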
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Sun, Dec 15, 2013 at 2:00 PM, Tom Lane wrote:
> Greg Stark writes:
>> On 14 Dec 2013 15:40, "Tom Lane" wrote:
>>> I think you *can't* cover them for the float types; roundoff error
>>> would mean you don't get the same answers as before.
>>
>> I was going to say the same thing. But then I started to wonder: What's
>> so special about the answers we used to give? They are also subject to
>> round off and the results are already quite questionable in those cases.
>
> Well, we can't easily do better than the old answers, and the new ones
> might be arbitrarily worse. Example: sum or average across single-row
> windows ought to be exact in any case, but it might be arbitrarily wrong
> with the negative-transition technique.
>
> More generally, this is supposed to be a performance enhancement only;
> it's not supposed to change the results.

It came to me that it might be possible to implement inverse transitions for floating point aggregates by just detecting if precision has been lost during forward transitions.

I've written the test to do this as:

IF state.value + value = state.value AND value <> 0 THEN
  newstate.precision_lost := true;
  newstate.value := state.value;
ELSE
  newstate.precision_lost := false;
  newstate.value := state.value + value;
END IF;

The inverse transition function checks precision_lost and if it's true it returns NULL. The core code is now implemented (thanks to Florian) to re-aggregate when NULL is returned from the inverse transition function.

I've attached an implementation of this with the transition functions written in plpgsql.
I don't really know for sure yet if it can handle all cases and give the exact same results as it would without inverse transitions, but it certainly fixes the error case which was presented.

Using the attached on HEAD of https://github.com/david-rowley/postgres/commits/invtrans

explain (analyze, verbose)
select mysum(v) over (order by i rows between current row and unbounded following)
from (values(1,1e20),(2,1)) b(i,v);

gives me the expected results of 1e20 and 1, instead of my original attempt which gave 1e20 and 0.

I guess the extra tracking on forward transition might mean this would not be practical to implement in C for sum(float), but I just wanted to run the idea through a few heads to see if anyone can present a case where it can still produce wrong results.

If it seems sound enough, then I may implement it in C to see how much overhead it adds to forward aggregation for floating point types, but even if it did add too much overhead to forward aggregation it might be worth allowing aggregates to have 2 forward transition functions, and if the 2nd one exists then it could be used in windowing functions where the frame does not have "unbounded following".

Any thoughts?
Regards
David Rowley

BEGIN WORK;

CREATE TYPE float_state AS (precision_lost bool, value float);

CREATE OR REPLACE FUNCTION float_sum(state float_state, value float)
RETURNS float_state AS $$
DECLARE
  newstate float_state;
BEGIN
  IF state IS NULL THEN
    IF value IS NULL THEN
      RETURN NULL;
    ELSE
      newstate.value := value;
      newstate.precision_lost := false;
      RETURN newstate;
    END IF;
  END IF;

  IF state.value + value = state.value AND value <> 0 THEN
    newstate.precision_lost := true;
    newstate.value := state.value;
  ELSE
    newstate.precision_lost := false;
    newstate.value := state.value + value;
  END IF;
  RETURN newstate;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE OR REPLACE FUNCTION float_sum_inv(state float_state, value float)
RETURNS float_state AS $$
DECLARE
  newstate float_state;
BEGIN
  IF state.precision_lost = true THEN
    RETURN NULL;
  ELSE
    newstate.value := state.value - value;
    RETURN newstate;
  END IF;
END;
$$ LANGUAGE plpgsql STRICT IMMUTABLE;

CREATE OR REPLACE FUNCTION float_sum_final(state float_state)
RETURNS float AS $$
BEGIN
  IF NOT (state IS NULL) THEN
    RETURN state.value;
  ELSE
    RETURN NULL;
  END IF;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE AGGREGATE mysum (float) (
  stype = float_state,
  sfunc = float_sum,
  invfunc = float_sum_inv,
  finalfunc = float_sum_final
);

select mysum(v) from (values(1,1e20),(2,1)) b(i,v);

-- forces re-aggregate due to precision loss
--explain (analyze, verbose) select mysum(v) over (order by i rows between current row and unbounded following) from (values(1,1e20),(2,1)) b(i,v);

-- does not force reaggregate.
--explain (analyze, verbose) select mysum(v) over (order by i rows between current row and unbounded following) from (values(1,1),(2,2),(3,3)) b(i,v);

rollback;
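For readers without a server handy, the interplay between an aggregate like the one above and the core code's rescan logic can be simulated. A rough Python sketch (the names `forward`, `inverse` and `window_sums` are invented for illustration) of the same precision_lost mechanism, over frames running from the current row to the end:

```python
def forward(state, value):
    # state is (precision_lost, sum); mirrors float_sum above
    if state is None:
        return (False, value)
    _, s = state
    if s + value == s and value != 0:
        return (True, s)            # adding value changed nothing
    return (False, s + value)

def inverse(state, value):
    lost, s = state
    if lost:
        return None                 # bail out: caller must re-aggregate
    return (False, s - value)

def window_sums(values):
    """Frames are 'rows between current row and unbounded following'."""
    out, state = [], None
    for i in range(len(values)):
        if state is not None and i > 0:
            state = inverse(state, values[i - 1])   # shrink the frame head
        if state is None:           # first frame, or the inverse bailed
            for v in values[i:]:
                state = forward(state, v)           # brute-force rescan
        out.append(state[1])
    return out

print(window_sums([1e20, 1]))       # the case from the thread: [1e+20, 1.0]
```

The second frame triggers a rescan (1e20 + 1 == 1e20 sets precision_lost), which is how the simulation arrives at 1 instead of 0.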
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan20, 2014, at 08:42 , David Rowley wrote:
>> On Mon, Jan 20, 2014 at 2:45 PM, Florian Pflug wrote:
>> * An assert that the frame end doesn't move backwards - I realized that
>> it is after all easy to do that, if it's done after the loop which adds
>> the new values, not before.
>
> I've applied this too, but I'm wondering why an elog for if the head moves
> back, but an assert if the tail moves back?

When I put the frame head check in, I was concerned that the code might crash or loop endlessly if aggregatedbase was ever larger than frameheadpos, so I made it elog(), just for safety. The frame end check, OTOH, is done at the very end, so it doesn't protect against much; it just documents that it's not supposed to happen. But yeah, it's a bit weird. Feel free to turn the frame end check into an elog() too if you want to.

>> * I've also renamed INVFUNC to INVSFUNC. That's a pretty invasive change, and
>> it's the last commit, so if you object to that, then you can merge up to
>> eafa72330f23f7c970019156fcc26b18dd55be27 instead of
>> de3d9148be9732c4870b76af96c309eaf1d613d7.
>
> Seems like sfunc really should be tfunc, then we could have invtfunc. I'd
> probably understand this better if I knew what the 's' was for in sfunc.
> I've not applied this just yet. Do you have a reason why you think it's better?

My issue with just "invfunc" is mainly that it's too generic - it doesn't tell you what it's supposed to be the inverse of.

I've always assumed that 's' in 'sfunc' and 'stype' stands for 'state', and that the naming is inspired by control theory, where the function which acts on the state space is often called S.

> Thanks, yeah those really do need a review. I've lost a bit of direction with
> them and I'm not quite sure just how much detail to go into with it. I'd like
> to explain a bit that users who need to use their numeric columns in window
> aggregates might want to think about having a defined scale to the numeric
> rather than an undefined scale, and explain that this is because the inverse
> transition function for numeric bails out if it loses the maximum seen dscale.
> Though it does seem generally a good idea to have a defined scale, but then I
> guess you've got to have a bit of knowledge about the numbers you're storing
> in that case. I'm not quite sure how to put that into words friendly enough
> for the documents just yet, or exactly where to put the words. So any ideas
> or patches you have around that would be great.

Here's what I imagine the docs should look like, very roughly:

* In CREATE AGGREGATE, we should state the precise axioms we assume about forward and inverse transition functions. The last time I read the text there, it was a bit ambiguous about whether inverse transition functions assume commutativity, i.e. whether we assume that we can remove inputs other than the first one from the aggregation. Currently, all the inverse transition functions we have are, in fact, commutative, because all the forward transition functions are also. But we could e.g. add an inverse to array_agg and string_agg, and those would obviously need this ordering guarantee. I'd also like us to explain that the "inverse" in "inverse transition function" shouldn't be taken too literally. For forward transition function F, inverse transition function G, and inputs a,b,... we *don't* require that G(F(s,a),a) == s. What we do require is that if I is the initial state, then

G(F(...F(F(I,a),b)...,z),a) == F(...F(I,b)...,z).

(Well, actually we don't strictly require even that; the two states don't need to be identical, they just need to produce identical outputs when passed to the final function)

* In CREATE AGGREGATE, we should also explain the NULL semantics you get with various combinations of strict and non-strict forward and inverse transition functions. I think a table listing all the combinations is in order there, with a column explaining the semantics you get. We should also clearly state that once you provide an inverse transition function, NULL isn't a valid state value anymore, except as an initial state, i.e. that the forward transition function must never return NULL in this case.

* The window function page should explain the performance hazards when a moving frame head is involved. Ideally, we'd list which aggregates never have to restart, which sometimes might, and what you can do to avoid that. We can also tell people to use EXPLAIN VERBOSE ANALYZE to check whether restarts are occurring. This is where we'd tell people to cast their numeric operands to a type with defined scale to avoid restarts, and where we'd state that SUM(float) *does* always restart due to precision loss issues.

BTW, something that we haven't addressed at all is how inverse transition functions interact with what is called "ordered-set aggregates". I haven't wrapped my head around these fully, I think, so I'm still not sure if there's anything to do there or not. Can those even be used as window functions? Should we forbid ordered-set aggregates from specifying an inverse transition function?
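The dscale bail-out mentioned above can be illustrated with Python's decimal module. A simplified sketch of the idea (the patch's C code in do_numeric_discard() differs in detail; the names here are invented): the state tracks the maximum display scale seen and how many inputs carry it, and the inverse gives up when the last such input leaves the frame, because the remaining maximum is then unknown.

```python
from decimal import Decimal

def dscale(d):
    # digits after the decimal point, like numeric's display scale
    return max(0, -d.as_tuple().exponent)

def forward(state, value):
    total, max_scale, count = state
    s = dscale(value)
    if s > max_scale:
        max_scale, count = s, 1
    elif s == max_scale:
        count += 1
    return (total + value, max_scale, count)

def inverse(state, value):
    total, max_scale, count = state
    if dscale(value) == max_scale:
        count -= 1
        if count == 0:
            return None         # lost the maximum dscale: force a rescan
    return (total - value, max_scale, count)

init = (Decimal(0), 0, 0)

# Mixed scales: removing the only 3-digit value forces a rescan.
st = forward(forward(init, Decimal("1.005")), Decimal("2.10"))
print(inverse(st, Decimal("1.005")))        # None

# With a fixed scale, e.g. numeric(10,2), every input shares the maximum
# dscale and the inverse always succeeds.
st = forward(forward(init, Decimal("1.00")), Decimal("2.10"))
print(inverse(st, Decimal("1.00")))
```

This is why casting to a numeric with a defined scale avoids restarts: the max-dscale counter only ever decrements to zero when the frame empties.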
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Mon, Jan 20, 2014 at 2:45 PM, Florian Pflug wrote:
> On Jan19, 2014, at 20:00 , David Rowley wrote:
>> I've applied that patch again and put in the sort operators.
>
> I've pushed a new version to https://github.com/fgp/postgres/tree/invtrans
> which includes
>
> * A bunch of missing declarations for *_inv functions

Thanks, I've applied that.

> * An assert that the frame end doesn't move backwards - I realized that
> it is after all easy to do that, if it's done after the loop which adds
> the new values, not before.

I've applied this too, but I'm wondering why an elog for if the head moves back, but an assert if the tail moves back?

> * EXPLAIN VERBOSE ANALYZE now shows the max. number of forward aggregate
> transitions per row and aggregate. It's a bit imprecise, because it doesn't
> track the count per aggregate, but it's still a good metric for how well
> the inverse transition functions work. If the number is close to one, you
> know that very few rescans are happening.

I've not looked at this yet and I don't think I'll have time tonight, but it sounds interesting. I guess it might be quite nice to have a way to see this; especially with the way the numeric stuff works, it might otherwise be pretty hard to know how many inverse transition "failures" there had been. Do you think it's also worth tracking the inverse transition failures too?

> * I've also renamed INVFUNC to INVSFUNC. That's a pretty invasive change, and
> it's the last commit, so if you object to that, then you can merge up to
> eafa72330f23f7c970019156fcc26b18dd55be27 instead of
> de3d9148be9732c4870b76af96c309eaf1d613d7.

Seems like sfunc really should be tfunc, then we could have invtfunc. I'd probably understand this better if I knew what the 's' was for in sfunc. I've not applied this just yet. Do you have a reason why you think it's better?

> A few more things I noticed, all minor stuff
>
> * do_numeric_discard()'s inverseTransValid flag is unnecessary. First, if the
> inverse transition function returns NULL once, we never call it again, so the
> flag won't have any practical effect. And second, assume we ever called the
> forward transition function after the inverse failed, and then retried the
> inverse. In the case of do_numeric_discard(), that actually *could* allow the
> inverse to suddenly succeed - if the call to the forward function increased
> the dscale beyond that of the element we tried to remove, removal would
> suddenly be possible again. We never do that, of course, and it seems
> unlikely we ever will. But it's still weird to have code which serves no
> other purpose than to pessimize a case which would otherwise just work fine.

hmm, yeah of course, you are right. I've removed this now.

> * The state == NULL checks in all the strict inverse transition functions
> are redundant.

ok, I've removed these and added comments to note that these functions should be declared strict.

> I haven't taken a close look at the documentation yet, I hope to be able to
> do that tomorrow. Otherwise, things look good as far as I'm concerned.

Thanks, yeah those really do need a review. I've lost a bit of direction with them and I'm not quite sure just how much detail to go into with it. I'd like to explain a bit that users who need to use their numeric columns in window aggregates might want to think about having a defined scale to the numeric rather than an undefined scale, and explain that this is because the inverse transition function for numeric bails out if it loses the maximum seen dscale. Though it does seem generally a good idea to have a defined scale, but then I guess you've got to have a bit of knowledge about the numbers you're storing in that case. I'm not quite sure how to put that into words friendly enough for the documents just yet, or exactly where to put the words. So any ideas or patches you have around that would be great.

Once again thanks for all your work on this.

Regards
David Rowley
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan19, 2014, at 20:00 , David Rowley wrote:
> I've applied that patch again and put in the sort operators.

I've pushed a new version to https://github.com/fgp/postgres/tree/invtrans which includes

* A bunch of missing declarations for *_inv functions

* An assert that the frame end doesn't move backwards - I realized that it is after all easy to do that, if it's done after the loop which adds the new values, not before.

* EXPLAIN VERBOSE ANALYZE now shows the max. number of forward aggregate transitions per row and aggregate. It's a bit imprecise, because it doesn't track the count per aggregate, but it's still a good metric for how well the inverse transition functions work. If the number is close to one, you know that very few rescans are happening.

* I've also renamed INVFUNC to INVSFUNC. That's a pretty invasive change, and it's the last commit, so if you object to that, then you can merge up to eafa72330f23f7c970019156fcc26b18dd55be27 instead of de3d9148be9732c4870b76af96c309eaf1d613d7.

A few more things I noticed, all minor stuff:

* do_numeric_discard()'s inverseTransValid flag is unnecessary. First, if the inverse transition function returns NULL once, we never call it again, so the flag won't have any practical effect. And second, assume we ever called the forward transition function after the inverse failed, and then retried the inverse. In the case of do_numeric_discard(), that actually *could* allow the inverse to suddenly succeed - if the call to the forward function increased the dscale beyond that of the element we tried to remove, removal would suddenly be possible again. We never do that, of course, and it seems unlikely we ever will. But it's still weird to have code which serves no other purpose than to pessimize a case which would otherwise just work fine.

* The state == NULL checks in all the strict inverse transition functions are redundant.

I haven't taken a close look at the documentation yet; I hope to be able to do that tomorrow. Otherwise, things look good as far as I'm concerned.

best regards,
Florian Pflug
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Mon, Jan 20, 2014 at 8:42 PM, David Rowley wrote: > On Mon, Jan 20, 2014 at 2:45 PM, Florian Pflug wrote: > >> * EXPLAIN VERBOSE ANALYZE now shows the max. number of forward aggregate >> transitions per row and aggregate. It's a bit imprecise, because it >> doesn't >> track the count per aggregate, but it's still a good metric for how well >> the inverse transition functions work. If the number is close to one, >> you >> know that very few rescans are happening. >> >> > I've not looked at this yet and I don't think I'll have time tonight, but > it sounds interesting. I guess it might be quite nice to have a way to see > this especially with the way the numeric stuff works, it might be actually > pretty hard to otherwise know how many inverse transition "failures" there > had been. Do you think it's also worth tracking the inverse transition > failures too? > > I've merged this patch but I attempted to get it into a bit more of a ready state by moving the code out into a helper function the same way as the other explain stuff is done. I've not touched explain before so do let me know if I've made it worse. https://github.com/david-rowley/postgres/commits/invtrans Regards David Rowley
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan19, 2014, at 20:00 , David Rowley wrote:
> I've applied that patch again and put in the sort operators.

I've pushed a new version to https://github.com/fgp/postgres/tree/invtrans

This branch includes the following changes:

* A bunch of missing declarations for *_inv functions

* An assert that the frame end doesn't move backwards - I realized that it is after all easy to do that, if it's done after the loop which adds the new values, not before.

* EXPLAIN VERBOSE ANALYZE now shows the max. number of forward aggregate transitions per row and aggregate. It's a bit imprecise, because it doesn't track the count per aggregate, but it's still a good metric for how well the inverse transition functions work. If the number is close to one, you know that very few rescans are happening.

* I've also renamed INVFUNC to INVSFUNC. That's a pretty invasive change, and it's the last commit, so if you object to that, then you can merge up to eafa72330f23f7c970019156fcc26b18dd55be27 instead of de3d9148be9732c4870b76af96c309eaf1d613d7.

A few more things I noticed, all minor stuff:

* do_numeric_discard()'s inverseTransValid flag is unnecessary. First, if the inverse transition function returns NULL once, we never call it again, so the flag won't have any practical effect. And second, assume we ever called the forward transition function after the inverse failed, and then retried the inverse. In the case of do_numeric_discard(), that actually *could* allow the inverse to suddenly succeed - if the call to the forward function increased the dscale beyond that of the element we tried to remove, removal would suddenly be possible again. We never do that, of course, and it seems unlikely we ever will. But it's still weird to have code which serves no other purpose than to pessimize a case which would otherwise just work fine.

* The state == NULL checks in all the strict inverse transition functions are redundant.

I haven't taken a close look at the documentation yet; I hope to be able to do that tomorrow. Otherwise, things look good as far as I'm concerned.

best regards,
Florian Pflug
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Mon, Jan 20, 2014 at 5:53 AM, Florian Pflug wrote: > On Jan19, 2014, at 05:27 , David Rowley wrote: > >> I just finished implementing the inverse transition functions for > bool_and > >> and bool_or, these aggregates had a sort operator which I assume would > have > >> allowed an index scan to be performed, but since I had to change the > first > >> argument of these aggregates to internal and that meant I had to get > rid of > >> the sort operator... > > Why does having transition type "internal" prevent you from specifying a > sort operator? The sort operator's argument types must match the *input* > type of the aggregate, not the transition type. > > Here's a pure SQL implementation of an optimized bool_and called myand_agg > that uses state type bigint[] and specifies a sort operator. > > create or replace function myboolagg_fwd(counts bigint[], value bool) > returns bigint[] as $$ > select array[ > counts[1] + case value when true then 0 else 1 end, > counts[2] + case value when true then 1 else 0 end > ] > $$ language sql strict immutable; > > create or replace function myboolagg_inv(counts bigint[], value bool) > returns bigint[] as $$ > select array[ > counts[1] - case value when true then 0 else 1 end, > counts[2] - case value when true then 1 else 0 end > ] > $$ language sql strict immutable; > > create or replace function myboolagg_and(counts bigint[]) > returns bool as $$ > select case counts[1] when 0 then true else false end > $$ language sql strict immutable; > > create aggregate myand_agg (bool) ( > stype = bigint[], > sfunc = myboolagg_fwd, > invfunc = myboolagg_inv, > finalfunc = myboolagg_and, > sortop = <, > initcond = '{0,0}' > ); > > With this, doing > > create table boolvals as > select i, random() < 0.5 as v from generate_series(1,1) i; > create index on boolvals(v); > > explain analyze select myand_agg(v) from boolvals; > > yields > > Result (cost=0.33..0.34 rows=1 width=0) (actual time=0.067..0.067 rows=1 > loops=1) >InitPlan 1 (returns 
$0) > -> Limit (cost=0.29..0.33 rows=1 width=1) (actual time=0.061..0.061 > rows=1 loops=1) >-> Index Only Scan using boolvals_v_idx on boolvals > (cost=0.29..474.41 rows=9950 width=1) (actual time=0.061..0.061 rows=1 > loops=1) > Index Cond: (v IS NOT NULL) > Heap Fetches: 1 > Total runtime: 0.100 ms > > which looks fine, no? > > hmm, yeah you're right. I guess I didn't quite think through what the sort comparison was comparing with, for some reason I had it in my head that it was the aggregate state and not another value in a btree index. I've applied that patch again and put in the sort operators. Thanks for looking at that. Regards David Rowley > best regards, > Florian Pflug > >
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan19, 2014, at 05:27 , David Rowley wrote:
> I just finished implementing the inverse transition functions for bool_and
> and bool_or, these aggregates had a sort operator which I assume would have
> allowed an index scan to be performed, but since I had to change the first
> argument of these aggregates to internal and that meant I had to get rid of
> the sort operator...

Why does having transition type "internal" prevent you from specifying a sort operator? The sort operator's argument types must match the *input* type of the aggregate, not the transition type.

Here's a pure SQL implementation of an optimized bool_and called myand_agg that uses state type bigint[] and specifies a sort operator.

create or replace function myboolagg_fwd(counts bigint[], value bool)
returns bigint[] as $$
  select array[
    counts[1] + case value when true then 0 else 1 end,
    counts[2] + case value when true then 1 else 0 end
  ]
$$ language sql strict immutable;

create or replace function myboolagg_inv(counts bigint[], value bool)
returns bigint[] as $$
  select array[
    counts[1] - case value when true then 0 else 1 end,
    counts[2] - case value when true then 1 else 0 end
  ]
$$ language sql strict immutable;

create or replace function myboolagg_and(counts bigint[])
returns bool as $$
  select case counts[1] when 0 then true else false end
$$ language sql strict immutable;

create aggregate myand_agg (bool) (
  stype = bigint[],
  sfunc = myboolagg_fwd,
  invfunc = myboolagg_inv,
  finalfunc = myboolagg_and,
  sortop = <,
  initcond = '{0,0}'
);

With this, doing

create table boolvals as
  select i, random() < 0.5 as v from generate_series(1,1) i;
create index on boolvals(v);

explain analyze select myand_agg(v) from boolvals;

yields

Result (cost=0.33..0.34 rows=1 width=0) (actual time=0.067..0.067 rows=1 loops=1)
  InitPlan 1 (returns $0)
    -> Limit (cost=0.29..0.33 rows=1 width=1) (actual time=0.061..0.061 rows=1 loops=1)
       -> Index Only Scan using boolvals_v_idx on boolvals (cost=0.29..474.41 rows=9950 width=1) (actual time=0.061..0.061 rows=1 loops=1)
          Index Cond: (v IS NOT NULL)
          Heap Fetches: 1
Total runtime: 0.100 ms

which looks fine, no?

best regards,
Florian Pflug
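The counting trick in myand_agg translates directly to any language. A small illustrative Python sketch of the same state shape, sliding a width-2 window over some booleans without ever rescanning:

```python
def fwd(counts, value):
    # counts = [number of false inputs, number of true inputs]
    f, t = counts
    return [f + (0 if value else 1), t + (1 if value else 0)]

def inv(counts, value):
    f, t = counts
    return [f - (0 if value else 1), t - (1 if value else 0)]

def final_and(counts):
    return counts[0] == 0      # AND is true iff no false input was seen

def final_or(counts):
    return counts[1] > 0       # OR is true iff some true input was seen

vals = [True, True, False, True]
state, out = [0, 0], []
for i, v in enumerate(vals):
    state = fwd(state, v)
    if i >= 2:
        state = inv(state, vals[i - 2])   # drop the value leaving the frame
    if i >= 1:
        out.append(final_and(state))      # one result per width-2 frame

print(out)   # [True, False, False]
```

Because the counts lose no information, the inverse here never has to bail out, unlike the precision-loss cases for floats.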
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Sat, Jan 18, 2014 at 11:34 AM, David Rowley wrote:
> The latest version of the patch is attached.

I've attached an updated version of the patch. I'm now using github to track the changes on the patch, so I've included the commit sha of the latest commit that this patch includes in the file name, but I've also included the date. Please see https://github.com/david-rowley/postgres/commits/invtrans for what's been changed.

Right now I don't think there is very much left to do. Perhaps the documentation needs some examples of creating inverse transition functions; I was not sure, so I left them out for now.

Regards
David Rowley

inverse_transition_functions_d00df99_2014-01-20.gz
Description: GNU Zip compressed data
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Sun, Jan 19, 2014 at 5:27 PM, David Rowley wrote:
>>> It's probably far more worth it for the bool and/or aggregates. We could
>>> just keep track of the values aggregated and the count of values as "true"
>>> and return true if those are the same in the case of "AND", then check the
>>> true count is > 0 in the case of "OR". I'd feel more strongly to go and do
>>> that if I'd actually ever used those aggregates for anything.
>>
>> That, OTOH, would be worthwhile I think. I'll go do that, though probably
>> not today. I hope to get to it sometime tomorrow.
>
> I've committed a patch to the github repo to do this.
> https://github.com/david-rowley/postgres/commit/121b0823753cedf33bb94f646df3176b77f28500
> but I'm not sure if we can keep it as I had to remove the sort op as I
> explained above.

I think I'm going to have to revert the patch which implements the inverse transition functions for bool_and and bool_or. I tested on an instance of 9.3.2 and the following queries use index scans.

create table booltest (b boolean not null);
insert into booltest (b) select false from generate_series(1,2) g(n);
insert into booltest (b) values(true);
create index booltest_b_idx ON booltest(b);
vacuum analyze booltest;

explain select bool_or(b) from booltest;
explain select bool_and(b) from booltest;

I'm guessing there is no way to have an internal state type on the aggregate and a sort operator on the aggregate. I wonder if it is worth creating naive inverse transition functions similar to max()'s and min()'s inverse transition functions. I guess on average they've got about a 50% chance of being used, and likely for some workloads it would be a win. What's your thoughts?

Regards
David Rowley
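A "naive" inverse of the kind mentioned above - like the ones MIN() and MAX() got in the patch - can be sketched in a few lines of Python (illustrative only; the real versions are C functions): it succeeds only when the removed value cannot have been the one determining the result, and bails out otherwise.

```python
def min_fwd(state, value):
    # state None means "no inputs yet"
    return value if state is None or value < state else state

def min_inv(state, value):
    # If the departing value equals (or, defensively, undercuts) the
    # current minimum, the new minimum is unknown: return None so the
    # core code re-aggregates the frame. Otherwise the state stands.
    if value <= state:
        return None
    return state

print(min_inv(3, 7))   # 7 was never the minimum, the state 3 survives
print(min_inv(3, 3))   # None: a rescan is required
```

Roughly the "50% chance of being used" David mentions: the inverse pays off exactly when the departing row was not the extreme one.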
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Sun, Jan 19, 2014 at 3:22 AM, Florian Pflug wrote: > > BTW, this made me realize that MIN and MAX currently have the same issue - > they'll rescan if the inputs are all equal. We could avoid that by doing > what > you did with dscale - track the number of times we've seen the maximum. I > wonder if that would be worth it - it would, unfortunately, require us to > use state type "internal" there too, and hence to add final functions for > all > the MIN/MAX aggregates. But that seems excessive. So for now, let's just > live with that. > > Yeah, it's an idea... I had actually talked a bit about it before when I first posted about inverse transition functions. http://www.postgresql.org/message-id/caaphdvqu+ygw0vbpbb+yxhrpg5vcy_kifyi8xmxfo8kyocz...@mail.gmail.com But I've only realised today that it might be a no-go. Let me explain... I just finished implementing the inverse transition functions for bool_and and bool_or. These aggregates had a sort operator, which I assume is what allowed an index scan to be performed, but since I had to change the first argument of these aggregates to internal, I had to get rid of the sort operator... So I'm not actually sure that we really should implement inverse transition functions for bool_and and bool_or because of this. Nevertheless I committed a patch to the github repo which implements them. I guess this sort operator problem completely writes off doing something similar for MAX and MIN, as that would mean no index scan would be possible for these aggregates! > If we really *do* want to optimize this case, we could > come to it from a completely different angle. Aggregates already > special-case > MIN and MAX to be able to use an index to evaluate SELECT MAX(c) FROM t. 
> If we provided a way for the transition function to call the sort operator > specified by SORTOP in CREATE AGGREGATE, one generic triple of forward and > inverse transition function and final function would work for all the > MIN and MAX aggregates. But that's material for a separate patch for 9.5 > > > Note that after this fix the results for my quick benchmark now look > like: > > > > create table num (num numeric(10,2) not null); > > insert into num (num) select * from generate_series(1,2); > > select sum(num) over(order by num rows between current row and unbounded > following) from num; -- 113 ms > > select sum(num / 10) over(order by num rows between current row and > unbounded following) from num; -- 156ms > > select sum(num / 1) over(order by num rows between current row and > unbounded following) from num; -- 144 ms > > > > So it seems to be much less prone to falling back to brute force forward > > transitions. > > It also seems the / 10 version must have had to previously do 1 brute > > force rescan but now it looks like it can do it in 1 scan. > > > >>> I'm not set on it, and I'm willing to try the lazy zeroing of the scale > >>> tracker array, but I'm just not quite sure what extra real use cases it > >>> covers that the one above does not. Perhaps my way won't perform > inverse > >>> transitions if the query did sum(numericCol / 10) or so. > >> > >> Dunno how many people SUM over numerics with different dscales. Easily > >> possible that it's few enough to not be worth fussing over. > > > > Going by Tom's comments in the post above this is possible just by > having an > > unconstrained numeric column, but I guess there's still a good chance > that > > even those unconstrained numbers have the same scale or at least the > scale > > will likely not vary wildly enough to make us have to perform brute force > > forward transitions for each row. > > Yeah, I'm convinced by now that your approach is the right trade-off there. 
> Those who do have values with wildly different dscales in their columns can > always add a cast to normalize them, if they experience a lot of restarts. > > So let's just add a sentence or two to the SUM(numeric) documentation about > this, and be done. > > I had a quick look and I couldn't decide the best place to write about specific details on inverse transition functions. The best place I could see was to add a note under the aggregates table here: http://www.postgresql.org/docs/devel/static/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE > > > > I did think of this when I wrote them. I thought that the removing-0 > case might > > be quite common and worth it, but the ~0 case seemed less > common; > > it just felt weird to do one without the other. > > To do more tracking on these it looks like we'd need to change those > aggregates > > to use a state type that is internal, and I think the extra tracking > would mean > > looping over an 8, 32 or 64 element array of int64's for each value. I > just don't > > think that would be a winner performance-wise, since the code that's > there is > > pretty much a handful of CPU cycles. > > Yeah, this is similar to the SUM(numeric) problem in that we *could* avoid >
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Jan18, 2014, at 06:15 , David Rowley wrote: >> On Sat, Jan 18, 2014 at 2:20 PM, Florian Pflug wrote: >> On Jan17, 2014, at 23:34 , David Rowley wrote: >>> The test turned out to become: >>> if (state->expectedScale == -1) >>> state->expectedScale = X.dscale; >>> else if (state->expectedScale != X.dscale) >>> state->expectedScale = -2; >>> >>> In do_numeric_accum, then when we do the inverse I just check if >>> expectedScale < 0 then return NULL. >> >> Ok, so this will rescan unless the dscales of all inputs match. >> I still think that's overly pessimistic - we've only got a problem when we >> removed the input with the largest dscale, no? So my suggestion would be >> >> >> >> I'd think that this avoids more restarts with about the same effort, >> but I haven't tried this though, so maybe I'm missing something. > > This is not quite right as it means if all the values are the same then > we reject inverse transitions since state->maxScale will always be equal > to X.dscale. > But you are right about the overly strict code I've put in; we should allow > values with less than the maximum dscale to be unaggregated without > complaint. To implement this I needed a maxScaleCounter variable too so we > only reject when the maxScaleCounter gets back to 0 again. Oops, sorry, yeah. Sounds sensible. BTW, this made me realize that MIN and MAX currently have the same issue - they'll rescan if the inputs are all equal. We could avoid that by doing what you did with dscale - track the number of times we've seen the maximum. I wonder if that would be worth it - it would, unfortunately, require us to use state type "internal" there too, and hence to add final functions for all the MIN/MAX aggregates. But that seems excessive. So for now, let's just live with that. If we really *do* want to optimize this case, we could come to it from a completely different angle. Aggregates already special-case MIN and MAX to be able to use an index to evaluate SELECT MAX(c) FROM t. 
If we provided a way for the transition function to call the sort operator specified by SORTOP in CREATE AGGREGATE, one generic triple of forward and inverse transition function and final function would work for all the MIN and MAX aggregates. But that's material for a separate patch for 9.5 > Note that after this fix the results for my quick benchmark now look like: > > create table num (num numeric(10,2) not null); > insert into num (num) select * from generate_series(1,2); > select sum(num) over(order by num rows between current row and unbounded > following) from num; -- 113 ms > select sum(num / 10) over(order by num rows between current row and unbounded > following) from num; -- 156ms > select sum(num / 1) over(order by num rows between current row and unbounded > following) from num; -- 144 ms > > So it seems to be much less prone to falling back to brute force forward > transitions. > It also seems the / 10 version must have had to previously do 1 brute > force rescan but now it looks like it can do it in 1 scan. > >>> I'm not set on it, and I'm willing to try the lazy zeroing of the scale >>> tracker array, but I'm just not quite sure what extra real use cases it >>> covers that the one above does not. Perhaps my way won't perform inverse >>> transitions if the query did sum(numericCol / 10) or so. >> >> Dunno how many people SUM over numerics with different dscales. Easily >> possible that it's few enough to not be worth fussing over. > > Going by Tom's comments in the post above this is possible just by having an > unconstrained numeric column, but I guess there's still a good chance that > even those unconstrained numbers have the same scale or at least the scale > will likely not vary wildly enough to make us have to perform brute force > forward transitions for each row. Yeah, I'm convinced by now that your approach is the right trade-off there. 
Those who do have values with wildly different dscales in their columns can always add a cast to normalize them, if they experience a lot of restarts. So let's just add a sentence or two to the SUM(numeric) documentation about this, and be done. >> * build_aggregate_fnexprs() should allow NULL to be passed for >> invtransfn_oid, >> I think. I don't quite like that we construct that even for plain >> aggregates, >> and avoiding it requires just an additional if. >> >> I'm not quite sure what you mean by this. It passes InvalidOid in normal >> aggregate calls (search for: "InvalidOid, /* invtrans is not needed here */") >> and only looks up the function in build_aggregate_fnexprs if >> (OidIsValid(invtransfn_oid)) is true. I'm not sure how this can be improved >> since that function is used for window aggregates and normal aggregates. I was thinking about checking for **invtransfnexpr = NULL, and not assigning if it is. But on second thought, you're right - the additional variable doesn't really hurt. So let's leave it as it is.
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Sat, Jan 18, 2014 at 6:15 PM, David Rowley wrote: > On Sat, Jan 18, 2014 at 2:20 PM, Florian Pflug wrote: > >> * Don't we need to check for volatile functions in the filter expression >> too? >> > >> > I did manual testing on this before and the volatility test for the > aggregate arguments seems to cover this. I didn't look into why but it just > did. I've not tested this again since your refactoring. I could test this > easily before when my numeric case was changing the results because of the > dscale problem; I noticed that if I did FILTER(WHERE random() > 0) that the > extra trailing zeros would disappear. > The problem now is that it's pretty hard to determine if an inverse > transition took place; the only way we can really tell is performance. I'll > see if I can invent a new test case for this by creating a user-defined > aggregate as you described. I'm thinking just append '+' to a string for > transitions and '-' to a string for inverse transitions, then just make > sure we only have a string of '+'s when doing something like filter(where > random() >= 0). > > > I've added a test case for this and it seems to work as expected: https://github.com/david-rowley/postgres/commit/43a5021e8f8ae1af272e7e21a842d1b0d5cbe577 Regards David Rowley
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Sat, Jan 18, 2014 at 2:20 PM, Florian Pflug wrote: > First, I've got the feeling that I should somehow update the commitfest app, > but I don't really know in which way. Should I put myself in as a reviewer, > or as a second author? Or neither? Suggestions welcome... > > Well, I guess you're both now, but it's a bit weird to be author and reviewer, so I've put your name against author too; hopefully Dean can review our combined results and we can review each other's work at the same time. > On Jan17, 2014, at 23:34 , David Rowley wrote: > > The test turned out to become: > > if (state->expectedScale == -1) > > state->expectedScale = X.dscale; > > else if (state->expectedScale != X.dscale) > > state->expectedScale = -2; > > > > In do_numeric_accum, then when we do the inverse I just check if > > expectedScale < 0 then return NULL. > > Ok, so this will rescan unless the dscales of all inputs match. > I still think that's overly pessimistic - we've only got a problem when we > removed the input with the largest dscale, no? So my suggestion would be > > state->maxDScale = MAX(X.dscale, state->maxDScale); > > in do_numeric_accum, and in the inverse > > if (state->maxDScale == X.dscale) > return PG_RETURN_NULL; > > I'd think that this avoids more restarts with about the same effort, > but I haven't tried this though, so maybe I'm missing something. > > This is not quite right as it means if all the values are the same then we reject inverse transitions since state->maxScale will always be equal to X.dscale. But you are right about the overly strict code I've put in; we should allow values with less than the maximum dscale to be unaggregated without complaint. To implement this I needed a maxScaleCounter variable too so we only reject when the maxScaleCounter gets back to 0 again. 
Note that after this fix the results for my quick benchmark now look like: create table num (num numeric(10,2) not null); insert into num (num) select * from generate_series(1,2); select sum(num) over(order by num rows between current row and unbounded following) from num; -- 113 ms select sum(num / 10) over(order by num rows between current row and unbounded following) from num; -- 156ms select sum(num / 1) over(order by num rows between current row and unbounded following) from num; -- 144 ms So it seems to be much less prone to falling back to brute force forward transitions. It also seems the / 10 version must have had to previously do 1 brute force rescan but now it looks like it can do it in 1 scan. > I'm not set on it, and I'm willing to try the lazy zeroing of the scale > > tracker array, but I'm just not quite sure what extra real use cases it > > covers that the one above does not. Perhaps my way won't perform inverse > > transitions if the query did sum(numericCol / 10) or so. > > Dunno how many people SUM over numerics with different dscales. Easily > possible that it's few enough to not be worth fussing over. > > Going by Tom's comments in the post above this is possible just by having an unconstrained numeric column, but I guess there's still a good chance that even those unconstrained numbers have the same scale or at least the scale will likely not vary wildly enough to make us have to perform brute force forward transitions for each row. 
> > create table num (num numeric(10,2) not null); > > insert into num (num) select * from generate_series(1,2); > > select sum(num) over(order by num rows between current row and unbounded > following) from num; -- 124ms > > select sum(num / 10) over(order by num rows between current row and > unbounded following) from num; -- 254ms > > select sum(num / 1) over(order by num rows between current row and > unbounded following) from num; -- 108156.917 ms > > > > The divide by 1 case is slow because of those weird 20 trailing zeros > > instead of 16 when dividing a numeric by 1, and that causes the inverse > > transition function to return NULL because the scale changes. > > > > I've not tested an unpatched version yet to see how that divide by 1 > query > > performs on that but I'll get to that soon. > > So without the patch, all three queries should perform similarly, i.e. take > about 10 seconds, right? If so, the improvement is fantastic! > > Well, it's actually 100 seconds, not 10. I tested the worst-case performance against an unpatched head and got 107 seconds instead of the 108. So I'm guessing that's pretty good, as the worst case is not really any worse and it is pretty hard to get to. I guess the results would have to all have a different scale, with the biggest scale on the first aggregated values... Reaching that worst case just seems impossible in a real-world workload. > > I'm thinking that the idea about detecting the numeric range with > floating > point types and performing an inverse transition provided the range has > not gone beyond some defined danger zone is not material for this > patch
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
First, I've got the feeling that I should somehow update the commitfest app, but I don't really know in which way. Should I put myself in as a reviewer, or as a second author? Or neither? Suggestions welcome... On Jan17, 2014, at 23:34 , David Rowley wrote: > The test turned out to become: > if (state->expectedScale == -1) > state->expectedScale = X.dscale; > else if (state->expectedScale != X.dscale) > state->expectedScale = -2; > > In do_numeric_accum, then when we do the inverse I just check if > expectedScale < 0 then return NULL. Ok, so this will rescan unless the dscales of all inputs match. I still think that's overly pessimistic - we've only got a problem when we removed the input with the largest dscale, no? So my suggestion would be state->maxDScale = MAX(X.dscale, state->maxDScale); in do_numeric_accum, and in the inverse if (state->maxDScale == X.dscale) return PG_RETURN_NULL; I'd think that this avoids more restarts with about the same effort, but I haven't tried this though, so maybe I'm missing something. > I'm not set on it, and I'm willing to try the lazy zeroing of the scale > tracker array, but I'm just not quite sure what extra real use cases it > covers that the one above does not. Perhaps my way won't perform inverse > transitions if the query did sum(numericCol / 10) or so. Dunno how many people SUM over numerics with different dscales. Easily possible that it's few enough to not be worth fussing over. 
> create table num (num numeric(10,2) not null); > insert into num (num) select * from generate_series(1,2); > select sum(num) over(order by num rows between current row and unbounded > following) from num; -- 124ms > select sum(num / 10) over(order by num rows between current row and unbounded > following) from num; -- 254ms > select sum(num / 1) over(order by num rows between current row and unbounded > following) from num; -- 108156.917 ms > > The divide by 1 case is slow because of those weird 20 trailing zeros > instead of 16 when dividing a numeric by 1, and that causes the inverse > transition function to return NULL because the scale changes. > > I've not tested an unpatched version yet to see how that divide by 1 query > performs on that but I'll get to that soon. So without the patch, all three queries should perform similarly, i.e. take about 10 seconds, right? If so, the improvement is fantastic! > I'm thinking that the idea about detecting the numeric range with floating > point types and performing an inverse transition provided the range has > not gone beyond some defined danger zone is not material for this patch... > I think it would not be a great deal of work to code, but the testing involved > would probably make this patch not possible for 9.4 Yeah, I never imagined that this would happen for 9.4. > The latest version of the patch is attached. OK, there are a few more comments * build_aggregate_fnexprs() should allow NULL to be passed for invtransfn_oid, I think. I don't quite like that we construct that even for plain aggregates, and avoiding it requires just an additional if. * Don't we need to check for volatile functions in the filter expression too? * As it stands, I don't think intXand_inv and intXor_inv are worth it, since the case where they return non-NULL is awfully slim (only for inputs containing all ones or all zeros, respectively). 
We should either track the number of zeros and ones per bit, which would allow us to always perform inverse transitions, or just drop them. * Quite a few of the inverse transition functions are marked strict, yet contain code to handle NULL inputs. You can just remove that code - the system makes sure that strict functions never receive NULL arguments. Affected are, AFAICS, numeric_accum_inv, numeric_avg_accum_inv, int2_accum_inv, int4_accum_inv, int8_accum_inv, int8_avg_accum_inv, int2_sum_inv, int4_sum_inv, int8_sum_inv. Not sure that list is exhaustive... * For any of the new inverse transition functions, I'd be inclined to just elog() if they're called directly and not as an aggregate. In particular those which check for that anyway, plus the *smaller_inv and *larger_inv ones. I don't see why anyone would ever want to call these directly - and if we elog() now, we can easily change these functions later, because no external code can depend on them. E.g., maybe someone wants to keep the top N elements in the MIN/MAX aggregates one day... * The number of new regression tests seems a bit excessive. I don't think there's really a policy on what to test and what not, but in this case I think it suffices if we test the basic machinery, plus the more complex functions. But e.g. most of the SUM and AVG aggregates use numeric_accum or numeric_avg_accum internally, and the wrapper code basically just does datatype conversion, so testing a few cases seems enough there. What I think we *should* test, but don't do curre
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
David Rowley writes: > I just thought that my idea was good enough and very cheap too, won't all > numerics that are actually stored in a column have the same scale anyway? No; unconstrained numeric columns (ie, if you just say "numeric") don't force their contents to any particular scale. It might be that we don't have to optimize for that case, since it's not in the SQL spec, but it is definitely supported by PG. regards, tom lane
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Sat, Jan 18, 2014 at 10:42 AM, David Rowley wrote: > >> You could do better than that - the numeric problem amounts to tracking >> the maximum >> scale AFAICS, so it'd be sufficient to restart if we remove a value whose >> scale equals >> the current maximum. But if we do SUM(numeric) at all, I think we should >> do so >> without requiring restarts - I still think the overhead of tracking the >> maximum scale >> within the aggregated values isn't that bad. If we zero the dscale >> counters lazily, >> the number of entries we have to zero is bounded by the maximum dscale we >> encounter. >> Since actually summing N digits is probably more expensive than zeroing >> them, and since >> we pay the zeroing overhead only once per aggregation but save >> potentially many >> summations, I'm pretty sure we come out ahead by quite some margin. >> >> > We'll work that out, I don't think it will take very long to code up your > idea either. > I just thought that my idea was good enough and very cheap too, won't all > numerics that are actually stored in a column have the same scale anyway? > Has it not only been a problem because we've been testing with random > numeric literals the whole time? > > The test turned out to become: > if (state->expectedScale == -1) > state->expectedScale = X.dscale; > else if (state->expectedScale != X.dscale) > state->expectedScale = -2; > > In do_numeric_accum, then when we do the inverse I just check if > expectedScale < 0 then return NULL. > > I'm not set on it, and I'm willing to try the lazy zeroing of the scale > tracker array, but I'm just not quite sure what extra real use cases it > covers that the one above does not. Perhaps my way won't perform inverse > transitions if the query did sum(numericCol / 10) or so. > > I'll be committing this to the github repo very soon, so we can hack away > at the scale tracking code once it's back in. > > Ok, we're back up to 86 aggregate function / type combinations with inverse transition functions. 
I've committed my latest work up to github and here's a fresh patch which I will need to do more tests on. The test (below) that used to fail a few versions ago is back in there and it's now passing. SELECT SUM(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) FROM (VALUES(1,1.01),(2,2),(3,3)) v(i,n); In this case it won't use inverse transitions because the forward transition function detects that the scale is not fixed. I tested throwing some numerics into a table and I'm pretty happy with the results. create table num (num numeric(10,2) not null); insert into num (num) select * from generate_series(1,2); select sum(num) over(order by num rows between current row and unbounded following) from num; -- 124ms select sum(num / 10) over(order by num rows between current row and unbounded following) from num; -- 254ms select sum(num / 1) over(order by num rows between current row and unbounded following) from num; -- 108156.917 ms The divide by 1 case is slow because of those weird 20 trailing zeros instead of 16 when dividing a numeric by 1, and that causes the inverse transition function to return NULL because the scale changes. I've not tested an unpatched version yet to see how that divide by 1 query performs on that but I'll get to that soon. I'm thinking that the idea about detecting the numeric range with floating point types and performing an inverse transition provided the range has not gone beyond some defined danger zone is not material for this patch... I think it would not be a great deal of work to code, but the testing involved would probably make this patch not possible for 9.4. The latest version of the patch is attached. Regards David Rowley inverse_transition_functions_v2.7.patch.gz Description: GNU Zip compressed data