Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Tue, Sep 17, 2013 at 2:27 PM, Andrew Gierth and...@tao11.riddles.org.uk wrote: Robert == Robert Haas robertmh...@gmail.com writes: Someone should do the same in WaitForBackgroundWorkerStartup so that building with -Werror works. Robert I don't get a warning there. Can you be more specific about Robert the problem? bgworker.c: In function 'WaitForBackgroundWorkerStartup': bgworker.c:866: warning: 'pid' may be used uninitialized in this function Does the attached patch fix it for you? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company bgworker-wait-fix.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Thu, Sep 19, 2013 at 12:52 PM, Andrew Gierth and...@tao11.riddles.org.uk wrote: It compiles without error and looks ok... Thanks for checking. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
Robert == Robert Haas robertmh...@gmail.com writes: bgworker.c: In function 'WaitForBackgroundWorkerStartup': bgworker.c:866: warning: 'pid' may be used uninitialized in this function Robert Does the attached patch fix it for you? It compiles without error and looks ok... -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Sat, Sep 14, 2013 at 7:23 AM, Andrew Gierth and...@tao11.riddles.org.uk wrote: Peter Please fix compiler warnings: Someone should do the same in WaitForBackgroundWorkerStartup so that building with -Werror works. I don't get a warning there. Can you be more specific about the problem? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
Robert == Robert Haas robertmh...@gmail.com writes: Someone should do the same in WaitForBackgroundWorkerStartup so that building with -Werror works. Robert I don't get a warning there. Can you be more specific about Robert the problem? bgworker.c: In function 'WaitForBackgroundWorkerStartup': bgworker.c:866: warning: 'pid' may be used uninitialized in this function gcc 4.2.2 / freebsd 8.2 -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
Peter == Peter Eisentraut pete...@gmx.net writes: Peter Please fix compiler warnings: Someone should do the same in WaitForBackgroundWorkerStartup so that building with -Werror works. New patch coming shortly. -- Andrew. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Fri, 2013-09-13 at 14:56 +0530, Atri Sharma wrote: This is our complete patch for implementation of WITHIN GROUP. Please fix compiler warnings: inversedistribution.c: In function ‘mode_final’: inversedistribution.c:276:11: warning: ‘mode_val’ may be used uninitialized in this function [-Wmaybe-uninitialized] inversedistribution.c:299:8: warning: ‘last_val’ may be used uninitialized in this function [-Wmaybe-uninitialized] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On 7/19/13 11:59 AM, Atri Sharma wrote: Hi all, This is our current work-in-progress patch for WITHIN GROUP. This patch needs to be rebased. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
Sent from my iPad On 04-Sep-2013, at 21:38, Peter Eisentraut pete...@gmx.net wrote: On 7/19/13 11:59 AM, Atri Sharma wrote: Hi all, This is our current work-in-progress patch for WITHIN GROUP. This patch needs to be rebased. This version of patch is quite old.We will be sending an updated patch before the start of September commitfest, with all the points you mentioned taken care of. Thanks for the points. Regards, Atri -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Fri, 2013-07-19 at 21:29 +0530, Atri Sharma wrote: Hi all, This is our current work-in-progress patch for WITHIN GROUP. Please fix these compiler warnings: parse_agg.c: In function ‘check_ungrouped_columns_walker’: parse_agg.c:848:3: warning: passing argument 1 of ‘check_ungrouped_columns_walker’ from incompatible pointer type [enabled by default] parse_agg.c:822:1: note: expected ‘struct Node *’ but argument is of type ‘struct List *’ parse_func.c: In function ‘make_fn_arguments’: parse_func.c:1540:9: warning: assignment from incompatible pointer type [enabled by default] parse_func.c:1547:15: warning: assignment from incompatible pointer type [enabled by default] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Wed, Jul 24, 2013 at 04:16:28AM +, Andrew Gierth wrote: Noah Misch said: Other aggregates based on this syntax might not desire such type unification. Then there would have to be some way to distinguish that. Maybe those could have -1 and the standard hypothetical set functions -2, with some flag in CREATE AGGREGATE to sort it out. Sure. But let's not make the parser presume that an aggordnargs=-1 aggregate always wants its any arguments handled in the manner of the standard hypothetical set functions. This has to happen in the parser because these are errors that should be caught before execution: rank(foo) within group (order by bar,baz) rank(integercol) within group (order by textcol) Would be nice to have, but not an overriding concern. And collations have to be resolved (pairwise) before sorting can happen: rank(textval COLLATE C) within group (order by foo) -- sorts in C rank(textval COLLATE C) within group (order by bar COLLATE en_US) -- error (basically, in rank(x) within group (order by y) where x and y are collatable, the collation rules apply exactly as though you were doing (x y), with all the implicit vs. explicit stuff included) This, though, makes direct parser support nigh inevitable. The issue to resolve here is whether and to what extent we should implement the SQL-standard hypothetical set functions as special cases of some more-generic concept. Here's the declaration you proposed for the rank() HSF: create aggregate rank(variadic any) within group (variadic any) ( This would be the first place to my knowledge where any doesn't mean unrestricted type acceptance at the parser level. If we need bespoke syntax declaring a function as an HSF with the entailed call behavior nuances, fine. Treating a declaration with this particular mix of any as a secret handshake requesting those nuances is not the way to go. STYPE = boolean, INITCOND = 'f', SORTOP = , FINALFUNC = rank_hypothetical_final ); -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Tue, Jul 23, 2013 at 01:21:52AM +, Andrew Gierth wrote: For hypothetical set functions we add a special case, aggordnargs=-1, for which both the aggregate and the finalfn must be defined as (variadic any) and parse analysis detects this case and unifies the types of the normal args with those of the ORDER BY args. Other aggregates based on this syntax might not desire such type unification. Having parse analysis do that distorts the character of an any argument. I think the proper place for such processing is the first call to a transition function. The transition functions could certainly call a new API exposed under src/backend/parser to do the heavy lifting. But let's not make the parser presume that an aggordnargs=-1 aggregate always wants its any arguments handled in the manner of the standard hypothetical set functions. The rest of the plan looks good so far. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
some Salesforce folks that would be me! It looks like I didn't quite communicate to Tom just what I was looking for as I do indeed want to have a variable number of any types, as: CREATE AGGREGATE FOO ( ANYELEMENT, more types, VARIADIC any) ( ... STYPE = ANYARRAY ...) so the corresponding transition function would be CREATE FUNCTION FOO_sfunc( ANYARRAY, ANYELEMENT, more types, VARIADIC any) RETURNS ANYARRAY and the final func is CREATE FUNCTION FOO_ffunc( ANYARRAY ) RETURNS ANYELEMENT The functions are in C, and I cheat and actually use the ANYARRAY transition variable as a struct just keeping the varlena length correct (thanks to Tom for that idea). Currently I just support a fixed number of any args but really need to have that be variable. So supporting VARIADIC any for user defined aggregates would be most useful. On Thu, Jul 18, 2013 at 7:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: Noah Misch n...@leadboat.com writes: (I don't know whether VARIADIC transition functions work today, but that would become an orthogonal project.) Coincidentally enough, some Salesforce folk were asking me about allowing VARIADIC aggregates just a few days ago. I experimented enough to find out that if you make an array-accepting transition function, and then force the aggregate's pg_proc entry to look like it's variadic (by manually setting provariadic and some other fields), then everything seems to Just Work: the parser and executor are both fine with it. So I think all that's needed here is to add some syntax support to CREATE AGGREGATE, and probably make some tweaks in pg_dump. I was planning to go work on that sometime soon. Having said that, though, what Andrew seemed to want was VARIADIC ANY, which is a *completely* different kettle of fish, since the actual parameters can't be converted to an array. I'm not sure if that's as easy to support. regards, tom lane
[HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
Andrew, Well, as you probably know, the spec is a whole pile of random special-case syntax and any similarities are probably more accidental than anything else. Hah, I didn't realize that our ordered aggregate syntax even *was* spec. A major difference is that in agg(x order by y), the values of y are not passed to the aggregate function - they serve no purpose other than controlling the order of the x values. Whereas in WITHIN GROUP, the values in the ORDER BY ... clause are in some sense the primary input to the aggregate, and the p argument is secondary and can't vary between rows of the group. Our implementation does heavily reuse the existing executor mechanics for ORDER BY in aggregates, and it also reuses a fair chunk of the parser code for it, but there are significant differences. Well, seems like it would work the same as agg_func(constx,coly,colz ORDER BY coly, colz) ... which means you could reuse a LOT of the internal plumbing. Or am I missing something? Also, what would a CREATE AGGREGATE and state function definition for custom WITHIN GROUP aggregates look like? Any time you want to calculate what the rank, dense_rank or cume_dist would be of a specific row within a group without actually adding the row to the group, this is how it's done. I don't have any practical examples to hand, but this beast seems to be implemented in at least Oracle and MSSQL so I guess it has uses. Well, I still can't imagine a practical use for it, at least based on RANK. I certainly have no objections if you have the code, though. I'll also point out that mode() requires ordered input as well, so add that to the set of functions we'll want to eventually support. One thing I find myself wanting with ordered aggregates is the ability to exclude NULLs. Thoughts? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
The problem is, given that the parser is looking at: foo(p1,p2,...) within group (order by q1,q2,...) how do we best represent the possible matching functions in pg_proc and pg_aggregate? Our partial solution so far does not allow polymorphism to work properly, so we need a better way; I'm hoping for some independent suggestions before I post my own ideas. Yeah, you'd need to extend VARIADIC somehow. That is, I should be able to define a function as: percentile_state ( pctl float, ordercols VARIADIC ANY ) returns VARIADIC ANY ... so that it can handle the sorting. Another way to look at it would be: percentile_state ( pctl float, orderedset ANONYMOUS ROW ) returns ANONYMOUS ROW as ... ... because really, what you're handing the state function is an anonymous row type constructed of the order by phrase. Of course, then we have to have some way to manipulate the anonymous row from within the function; at the very least, an equality operator. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Thu, Jul 18, 2013 at 10:33:15PM +, Andrew Gierth wrote: Josh Berkus wrote: Well, seems like it would work the same as agg_func(constx,coly,colz ORDER BY coly, colz) I'd try transforming WITHIN GROUP into the above during parse analysis. The default would be the transformation for hypothetical set functions: agg(x,y,z) WITHIN GROUP (ORDER BY a,b,c) - agg(x,y,z ORDER BY a,b,c) Add a CREATE AGGREGATE option, say SQL_INVERSE_DISTRIBUTION_FUNCTION = {true|false} or SQLIDF, that chooses the IDF transformation: agg(x) WITHIN GROUP (ORDER BY y) - agg(x, y ORDER BY y) Then there's perhaps no new core aggregation or function candidacy machinery. (I don't know whether VARIADIC transition functions work today, but that would become an orthogonal project.) Compare how we handle standard interval typmod syntax; only the parser and deparser know about it. Atri's description upthread sounded pretty similar to that. Also, what would a CREATE AGGREGATE and state function definition for custom WITHIN GROUP aggregates look like? Now this is exactly the part we haven't nailed down yet and want ideas for. PERCENTILE_DISC would be declared as (float8, anyelement) with that SQLIDF option. To diagnose nonsensical calls made through nonstandard syntax, it could dig into its AggState to verify that its second argument is equal() to its first ORDER BY expression. There would be a question of whether to accept the WITHIN GROUP syntax for any aggregate or just for those for which the standard indicates it. Then follows the question of when to deparse as WITHIN GROUP and when to deparse as the internal syntax. I'd lean toward accepting WITHIN GROUP for any aggregate but deparsing that way only SQLIDF aggregates and aggregates with names matching the standard hypothetical set function names. Or you could add a second CREATE AGGREGATE option requesting hypothetical-set-function deparse style. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
Noah Misch n...@leadboat.com writes: (I don't know whether VARIADIC transition functions work today, but that would become an orthogonal project.) Coincidentally enough, some Salesforce folk were asking me about allowing VARIADIC aggregates just a few days ago. I experimented enough to find out that if you make an array-accepting transition function, and then force the aggregate's pg_proc entry to look like it's variadic (by manually setting provariadic and some other fields), then everything seems to Just Work: the parser and executor are both fine with it. So I think all that's needed here is to add some syntax support to CREATE AGGREGATE, and probably make some tweaks in pg_dump. I was planning to go work on that sometime soon. Having said that, though, what Andrew seemed to want was VARIADIC ANY, which is a *completely* different kettle of fish, since the actual parameters can't be converted to an array. I'm not sure if that's as easy to support. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Thu, Jul 18, 2013 at 03:15:14AM +, Andrew Gierth wrote: The spec defines two types of aggregate function classed as ordered set function, as follows: 1. An inverse distribution function taking one argument (which must be a grouped column or otherwise constant within groups) plus a sorted group with exactly one column: =# SELECT (func(p) WITHIN GROUP (ORDER BY q)) from ... The motivating example for this (and the only ones in the spec) are percentile_cont and percentile_disc, to return a percentile result from a continuous or discrete distribution. (Thus percentile_cont(0.5) within group (order by x) is the spec's version of a median(x) function.) 2. A hypothetical set function taking N arguments of arbitrary types (a la VARIADIC any, rather than a fixed list) plus a sorted group with N columns of matching types: =# SELECT (func(p1,p2,...) WITHIN GROUP (ORDER BY q1,q2,...)) from ... (where typeof(p1)==typeof(q1) and so on, at least up to trivial conversions) The motivating example here is to be able to do rank(p1,p2,...) to return the rank that the specified values would have had if they were added to the group. As usual, we do not want to constrain ourselves to supporting only the specific cases in the spec, but would prefer a general solution. We (meaning myself and Atri) have an implementation that basically works, though it is not yet complete, but before taking it any further we need to resolve the design question of how to represent these two types of function in the system catalogs. The fact that there are in effect two parts to the parameter list, which are either independent (for inverse distribution funcs) or closely related (for hypothetical set functions), doesn't seem to point to an obvious way to represent this in pg_proc/pg_aggregate. I'm not yet satisfied with the method used in our implementation, What is that method? so we're throwing this open for suggestions. We will post the work-in-progress patch along with a description of its current implementation shortly. One of the major complications is that we ideally want to be able to do polymorphism based on the type of the sorted group, specifically in order to be able to do percentile_disc(float8) within group (order by anyelement) returning anyelement. (i.e. we should be able to get a discrete percentile from any type that is orderable.) The question here is how to resolve the return type both of the aggregate itself and of the finalfn. We've also had an expression of interest in extending this to allow percentile_disc(float8[]) and percentile_cont(float8[]) returning arrays; e.g. percentile_cont(array[0, 0.25, 0.5, 0.75, 1]) to return an array containing the bounds, median and quartiles in one go. This is an extension to the spec but it seems sufficiently obviously useful to be worth supporting. Comments? I'm really happy to see PostgreSQL come into its own when it comes to the analytics side of the house :) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Thu, Jul 18, 2013 at 10:02 AM, David Fetter da...@fetter.org wrote: On Thu, Jul 18, 2013 at 03:15:14AM +, Andrew Gierth wrote: The spec defines two types of aggregate function classed as ordered set function, as follows: 1. An inverse distribution function taking one argument (which must be a grouped column or otherwise constant within groups) plus a sorted group with exactly one column: =# SELECT (func(p) WITHIN GROUP (ORDER BY q)) from ... The motivating example for this (and the only ones in the spec) are percentile_cont and percentile_disc, to return a percentile result from a continuous or discrete distribution. (Thus percentile_cont(0.5) within group (order by x) is the spec's version of a median(x) function.) 2. A hypothetical set function taking N arguments of arbitrary types (a la VARIADIC any, rather than a fixed list) plus a sorted group with N columns of matching types: =# SELECT (func(p1,p2,...) WITHIN GROUP (ORDER BY q1,q2,...)) from ... (where typeof(p1)==typeof(q1) and so on, at least up to trivial conversions) The motivating example here is to be able to do rank(p1,p2,...) to return the rank that the specified values would have had if they were added to the group. As usual, we do not want to constrain ourselves to supporting only the specific cases in the spec, but would prefer a general solution. We (meaning myself and Atri) have an implementation that basically works, though it is not yet complete, but before taking it any further we need to resolve the design question of how to represent these two types of function in the system catalogs. The fact that there are in effect two parts to the parameter list, which are either independent (for inverse distribution funcs) or closely related (for hypothetical set functions), doesn't seem to point to an obvious way to represent this in pg_proc/pg_aggregate. I'm not yet satisfied with the method used in our implementation, What is that method? We currently represent ordered set functions with a new bool flag in pg_aggregate. The flag is set to true for ordered set functions(obviously) and false for all others. The currently implemented functions i.e. percentile_disc, percentile_cont and percentile_cont for intervals have their finalfns present in pg_aggregate. The aggregate functions take in two arguments, one for the percentile value and other for the input row set. So, percentile_cont's entry in pg_proc has float8 and float8 as its parameters and another entry of percentile_cont (with the interval version as the finalfn) has float8 and interval as its parameter types. As you can see, there isn't a way right now to resolve the return type of the aggregate for polymorphic cases. This is something we wish to resolve. Regards, Atri -- Regards, Atri l'apprenant -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers