On 2015-05-16 00:06:12 +0200, Andres Freund wrote:
> Andrew (and I) have been working on this since. Here's the updated and
> rebased patch.
>
> It misses a decent commit message and another beautification
> readthrough. I've spent the last hour going through the thing again and
> all I hit was a
On Thu, May 14, 2015 at 08:59:45AM +0200, Andres Freund wrote:
> On 2015-05-14 02:51:42 -0400, Noah Misch wrote:
> > Covering hash aggregation might entail a large preparatory refactoring
> > of nodeHash.c, but beyond development cost I can't malign that.
>
> You mean execGrouping.c? Afaics nodeHa
On 2015-05-14 09:16:10 +0100, Andrew Gierth wrote:
> Andres> A rough sketch of what I'm thinking of is:
>
> I'm not sure I'd do it quite like that.
It was meant as a sketch, so there's lots of things it's probably
missing ;)
> Rather, have a wrapper function get_outer_tuple that calls
> ExecPro
> "Andres" == Andres Freund writes:
Andres> My problem is that, unless I very much misunderstand something,
Andres> the current implementation can end up requiring roughly #sets *
Andres> #input of additional space for the "sidechannel tuplestore" in
Andres> some bad cases. That happens
On 2015-05-14 02:51:42 -0400, Noah Misch wrote:
> Covering hash aggregation might entail a large preparatory refactoring
> of nodeHash.c, but beyond development cost I can't malign that.
You mean execGrouping.c? Afaics nodeHash.c isn't involved, and it
doesn't look very interesting to make it so?
On Thu, May 14, 2015 at 08:38:07AM +0200, Andres Freund wrote:
> On 2015-05-14 02:32:04 -0400, Noah Misch wrote:
> > On Thu, May 14, 2015 at 07:50:31AM +0200, Andres Freund wrote:
> > > Andrew, is that a structure you could live with, or not?
> > >
> > > Others, what do you think?
> >
> > Andrew
On 2015-05-14 02:32:04 -0400, Noah Misch wrote:
> On Thu, May 14, 2015 at 07:50:31AM +0200, Andres Freund wrote:
> > Andrew, is that a structure you could live with, or not?
> >
> > Others, what do you think?
>
> Andrew and I discussed that very structure upthread:
> http://www.postgresql.org/me
On Thu, May 14, 2015 at 07:50:31AM +0200, Andres Freund wrote:
> I still believe that the general approach of chaining vs. a union or CTE
> is correct due to the efficiency arguments upthread. My problem is
> that, unless I very much misunderstand something, the current
> implementation can end up
On 2015-05-13 22:51:15 +0200, Andres Freund wrote:
> I'm pretty sure by now that I dislike the introduction of GroupedVar,
> and not just tentatively. While I can see why you found its
> introduction to be nicer than fiddling with the result tuple, for me the
> disadvantages seem to outweigh the a
On 2015-05-12 05:36:19 +0200, Andres Freund wrote:
> > Another controversial item was the introduction of GroupedVar. The need
> > for this can be avoided by explicitly setting to NULL the relevant
> > columns of the representative group tuple when evaluating result rows,
> > but (a) I don't think
> "Andres" == Andres Freund writes:
Andres> Andrew, are you going to be working on any of these?
As discussed on IRC, current status is:
>>> * The increased complexity of grouping_planner. It'd imo be good if some
>>> of that could be refactored into a separate function. Specifically t
On 2015-05-12 20:40:49 +0200, Andres Freund wrote:
> On 2015-05-12 05:36:19 +0200, Andres Freund wrote:
> > What I dislike so far:
> > * Minor formatting things. Just going to fix and push the ones I
> > dislike.
> > * The Hopcroft-Karp stuff not being separate
> > * The increased complexity of g
On 2015-05-12 05:36:19 +0200, Andres Freund wrote:
> What I dislike so far:
> * Minor formatting things. Just going to fix and push the ones I
> dislike.
> * The Hopcroft-Karp stuff not being separate
> * The increased complexity of grouping_planner. It'd imo be good if some
> of that could be
>I think the problem is "just" that for each variable, in each grouping
>set - a very large number in a large cube - we're recursing through the
>whole ChainAggregate tree, as each Var just points to a var one level
>lower.
For small values of very large, that is. Had a little thinko there. Its s
On 2015-04-30 05:35:26 +0100, Andrew Gierth wrote:
> > "Andres" == Andres Freund writes:
>
> Andres> This is not a real review. I'm just scanning through the
> Andres> patch, without reading the thread, to understand if I see
> Andres> something "worthy" of controversy. While scanning I mi
On Thu, Apr 30, 2015 at 05:35:26AM +0100, Andrew Gierth wrote:
> > "Andres" == Andres Freund writes:
> >> + * TODO: AGG_HASHED doesn't support multiple grouping sets yet.
>
> Andres> Are you intending to resolve this before an eventual commit?
>
> Original plan was to tackle AGG_H
> "Andres" == Andres Freund writes:
Andres> This is not a real review. I'm just scanning through the
Andres> patch, without reading the thread, to understand if I see
Andres> something "worthy" of controversy. While scanning I might have
Andres> a couple observations or questions.
>> +
Hi,
This is not a real review. I'm just scanning through the patch, without
reading the thread, to understand if I see something "worthy" of
controversy. While scanning I might have a couple observations or
questions.
On 2015-03-13 15:46:15 +, Andrew Gierth wrote:
> + * A list of group
On Wed, Jan 21, 2015 at 6:02 AM, Andrew Gierth
wrote:
> Updated patch (mostly just conflict resolution):
>
> - fix explain code to track changes to deparse context handling
>
> - tiny expansion of some comments (clarify in nodeAgg header
>comment that aggcontexts are now EContexts rather th
On Fri, Jan 02, 2015 at 03:55:23PM -0600, Jim Nasby wrote:
> On 12/31/14, 3:05 PM, Noah Misch wrote:
> >On Wed, Dec 31, 2014 at 05:33:43PM +, Andrew Gierth wrote:
> >"Noah" == Noah Misch writes:
> >>>
> >>> Noah> Suppose one node orchestrated all sorting and aggregation.
> >>>
>
On 12/31/14, 3:05 PM, Noah Misch wrote:
On Wed, Dec 31, 2014 at 05:33:43PM +, Andrew Gierth wrote:
> >"Noah" == Noah Misch writes:
>
> Noah> Suppose one node orchestrated all sorting and aggregation.
>
>Well, that has the downside of making it into an opaque blob, without
>actually ga
On Wed, Dec 31, 2014 at 05:33:43PM +, Andrew Gierth wrote:
> > "Noah" == Noah Misch writes:
>
> Noah> Suppose one node orchestrated all sorting and aggregation.
>
> Well, that has the downside of making it into an opaque blob, without
> actually gaining much.
The opaque-blob criticism
On Wed, Dec 31, 2014 at 02:45:29PM +0530, Atri Sharma wrote:
> > Suppose one node orchestrated all sorting and aggregation. Call it a
> > MultiGroupAggregate for now. It wouldn't harness Sort nodes, because it
> > performs aggregation between tuplesort_puttupleslot() calls. Instead, it
> > would
> "Noah" == Noah Misch writes:
Noah> Suppose one node orchestrated all sorting and aggregation.
Well, that has the downside of making it into an opaque blob, without
actually gaining much.
Noah> Call it a MultiGroupAggregate for now. It wouldn't harness
Noah> Sort nodes, because it perf
ChainAggregate is
> a bit like a node having two parents, a Sort and a GroupAggregate.
> However,
> the graph edge between ChainAggregate and its GroupAggregate is a
> tuplestore
> instead of the usual, synchronous ExecProcNode().
>
Well, I dont buy the two parents theory. The Sort nodes are i
On Tue, Dec 23, 2014 at 02:29:58AM -0500, Noah Misch wrote:
> On Mon, Dec 22, 2014 at 10:46:16AM -0500, Tom Lane wrote:
> > I still find the ChainAggregate approach too ugly at a system structural
> > level to accept, regardless of Noah's argument about number of I/O cycles
> > consumed. We'll be
On Mon, Dec 22, 2014 at 6:57 PM, Andrew Gierth
wrote:
> In the case of cube(a,b,c,d), our code currently gives:
>
> b,d,a,c: (b,d,a,c),(b,d)
> a,b,d:(a,b,d),(a,b)
> d,a,c:(d,a,c),(d,a),(d)
> c,d: (c,d),(c)
> b,c,d:(b,c,d),(b,c),(b)
> a,c,b:(a,c,b),(a,c),(a),()
That's pretty
On Mon, Dec 22, 2014 at 10:46:16AM -0500, Tom Lane wrote:
> I still find the ChainAggregate approach too ugly at a system structural
> level to accept, regardless of Noah's argument about number of I/O cycles
> consumed. We'll be paying for that in complexity and bugs into the
> indefinite future,
> "Robert" == Robert Haas writes:
>> I would be interested in seeing more good examples of the size and
>> type of grouping sets used in typical queries.
Robert> From what I have seen, there is interest in being able to do
Robert> things like GROUP BY CUBE(a, b, c, d) and have that be
R
On Tuesday, December 23, 2014, Robert Haas wrote:
> On Mon, Dec 22, 2014 at 11:19 AM, Andrew Gierth
> > wrote:
> > Tom> The other reason that's a bad comparison is that I've not seen
> > Tom> many queries that use more than a couple of window frames,
> > Tom> whereas we have to expect that the
On Mon, Dec 22, 2014 at 11:19 AM, Andrew Gierth
wrote:
> Tom> The other reason that's a bad comparison is that I've not seen
> Tom> many queries that use more than a couple of window frames,
> Tom> whereas we have to expect that the number of grouping sets in
> Tom> typical queries will be sig
> "Tom" == Tom Lane writes:
[Noah]
>> I caution against using window function performance as the
>> template for GROUPING SETS performance goals. The benefit of
>> GROUPING SETS compared to its UNION ALL functional equivalent is
>> 15% syntactic pleasantness, 85% performance opportuniti
Noah Misch writes:
> On Sat, Dec 13, 2014 at 04:37:48AM +, Andrew Gierth wrote:
> "Tom" == Tom Lane writes:
>> Tom> That seems pretty grotty from a performance+memory consumption
>> Tom> standpoint. At peak memory usage, each one of the Sort nodes
>> Tom> will contain every input row,
>> Ha
On Sat, Dec 13, 2014 at 04:37:48AM +, Andrew Gierth wrote:
> > "Tom" == Tom Lane writes:
>
> >> I'd already explained in more detail way back when we posted the
> >> patch. But to reiterate: the ChainAggregate nodes pass through
> >> their input data unchanged, but on group boundaries
On Mon, Dec 15, 2014 at 12:28 PM, Andrew Gierth
wrote:
>> "Michael" == Michael Paquier writes:
>
> Michael> Based on those comments, I am marking this patch as
> Michael> "Returned with Feedback" on the CF app for 2014-10. Andrew,
> Michael> feel free to move this entry to CF 2014-12 if yo
> "Michael" == Michael Paquier writes:
Michael> Based on those comments, I am marking this patch as
Michael> "Returned with Feedback" on the CF app for 2014-10. Andrew,
Michael> feel free to move this entry to CF 2014-12 if you are
Michael> planning to continue working on it so as it woul
On Thu, Dec 11, 2014 at 3:36 AM, Tom Lane wrote:
> I don't really have any comments on the algorithms yet, having spent too
> much time trying to figure out underdocumented data structures to get to
> the algorithms. However, noting the addition of list_intersection_int()
> made me wonder whether
> "Tom" == Tom Lane writes:
With the high-priority questions out of the way, time to tackle the
rest:
Tom> My single biggest complaint is about the introduction of struct
Tom> GroupedVar. If we stick with that, we're going to have to teach
Tom> an extremely large number of places that kn
> "Tom" == Tom Lane writes:
>> I'd already explained in more detail way back when we posted the
>> patch. But to reiterate: the ChainAggregate nodes pass through
>> their input data unchanged, but on group boundaries they write
>> aggregated result rows to a tuplestore shared by the whole
Andrew Gierth writes:
> "Tom" == Tom Lane writes:
> Tom> That seems pretty messy, especially given your further comments
> Tom> that these plan nodes are interconnected and know about each
> Tom> other (though you failed to say exactly how).
> I'd already explained in more detail way back whe
> "Tom" == Tom Lane writes:
>> What that code does is produce plans that look like this:
>> GroupAggregate
>> -> Sort
>>-> ChainAggregate
>> -> Sort
>> -> ChainAggregate
>> in much the same way that WindowAgg nodes are generated.
Tom> That seems pretty messy, esp
Andrew Gierth writes:
> "Tom" == Tom Lane writes:
> Tom> I've not spent any real effort looking at gsp2.patch yet, but it
> Tom> seems even worse off comment-wise: if there's any explanation in
> Tom> there at all of what a "chained aggregate" is, I didn't find it.
> (Maybe "stacked" would ha
> "Tom" == Tom Lane writes:
More comment on this later, but I want to highlight these specific
points since we need clear answers here to avoid wasting unnecessary
time and effort:
Tom> I've not spent any real effort looking at gsp2.patch yet, but it
Tom> seems even worse off comment-wise:
Andrew Gierth writes:
> And here is that recut patch set.
I started looking over this patch, but eventually decided that it needs
more work to be committable than I'm prepared to put in right now.
My single biggest complaint is about the introduction of struct
GroupedVar. If we stick with that,
On Sat, Sep 27, 2014 at 06:37:38AM +0100, Andrew Gierth wrote:
> > "Andrew" == Andrew Gierth writes:
>
> Andrew> I was holding off on posting a recut patch with the latest
> Andrew> EXPLAIN formatting changes (which are basically cosmetic)
> Andrew> until it became clear whether RLS was li
> "Heikki" == Heikki Linnakangas writes:
Heikki> There's been a lot of discussion and I haven't followed it in
Heikki> detail. Andrew, there were some open questions, but have you
Heikki> gotten enough feedback so that you know what to do next?
I was holding off on posting a recut patch w
There's been a lot of discussion and I haven't followed it in detail.
Andrew, there were some open questions, but have you gotten enough
feedback so that you know what to do next? I'm trying to get this
commitfest to an end, and this is still in "Needs Review" state...
- Heikki
--
Sent via
> "Josh" == Josh Berkus writes:
Josh> (b) If we're going to discuss ripping out YAML format, please
Josh> let's do that as a *separate* patch and discussion,
+infinity
>> Grouping Sets:
>> - ["two","four"]
>> - ["two"]
>> - []
>>
>> Would that be better? (It's not consistent
On 09/19/2014 08:52 AM, Andres Freund wrote:
>> Until someone decides to dike it out, I think we are obligated to make
>> > it produce something resembling correct output.
> I vote for ripping it out. There really isn't any justification for it
> and it broke more than once.
(a) I personally use i
On 19/09/14 17:52, Andres Freund wrote:
On 2014-09-19 16:35:52 +0100, Andrew Gierth wrote:
Marti> But is anyone actually using YAML output format, or was it
Marti> implemented simply "because we can"?
Until someone decides to dike it out, I think we are obligated to make
it produce somethin
> "Andrew" == Andrew Gierth writes:
Andrew> You're telling me. Also, feeding it to an online yaml-to-json
Andrew> converter gives the result as [["two","four"],["two"],null]
Andrew> which is not quite the same as the json version. An
Andrew> alternative would be:
Oh, another YAML alterna
On 2014-09-19 16:35:52 +0100, Andrew Gierth wrote:
> Marti> But is anyone actually using YAML output format, or was it
> Marti> implemented simply "because we can"?
>
> Until someone decides to dike it out, I think we are obligated to make
> it produce something resembling correct output.
I vot
> "Marti" == Marti Raudsepp writes:
>> (yaml format)
>> Grouping Sets:
>> - - "two"
>> - "four"
>> - - "two"
>> -
Marti> Now this is weird.
You're telling me. Also, feeding it to an online yaml-to-json
converter gives the result as [["two","four"],["two"],null] which is
not
On Fri, Sep 19, 2014 at 4:45 AM, Andrew Gierth
wrote:
> GroupAggregate (cost=1122.39..1197.48 rows=9 width=8)
>Group Key: two, four
>Group Key: two
>Group Key: ()
> "Grouping Sets": [
> ["two", "four"],
> ["two"],
> []
+1 looks good to me.
> (yaml for
> "Marti" == Marti Raudsepp writes:
Marti> Since you were asking for feedback on the EXPLAIN output on
Marti> IRC, I'd weigh in and say that having the groups on separate
Marti> lines would be significantly more readable.
I revisited the explain output a bit and have come up with these
(s
On 09/17/2014 03:02 PM, Marti Raudsepp wrote:
> So instead of:
> GroupAggregate
>Output: four, ten, hundred, count(*)
>Grouping Sets: (onek.four, onek.ten, onek.hundred), (onek.four,
> onek.ten), (onek.four), ()
>
> Perhaps print:
>Grouping Sets: (onek.four, onek.ten, onek.hundred)
>
On Fri, Sep 12, 2014 at 9:41 PM, Andrew Gierth
wrote:
> gsp1.patch - phase 1 code patch (full syntax, limited functionality)
> gsp2.patch - phase 2 code patch (adds full functionality using the
> new chained aggregate mechanism)
I gave these a try by convertin
> "Tomas" == Tomas Vondra writes:
Tomas> If we can get rid of the excessive ChainAggregate, that's
Tomas> certainly enough for now.
I found an algorithm that should provably give the minimal number of sorts
(I was afraid that problem would turn out to be NP-hard, but not so - it's
solvable
On 7.9.2014 18:52, Andrew Gierth wrote:
>> "Tomas" == Tomas Vondra writes:
>
> Tomas> Maybe preventing this completely (i.e. raising an ERROR with
> Tomas> "duplicate columns in CUBE/ROLLUP/... clauses") would be
> Tomas> appropriate. Does the standard says anything about this?
>
> The spe
> "Tomas" == Tomas Vondra writes:
>> As for computing it all twice, there's currently no attempt to
>> optimize multiple identical grouping sets into multiple
>> projections of a single grouping set result. CUBE(a,b,c,a) has
>> twice as many grouping sets as CUBE(a,b,c) does, even though
On 7.9.2014 15:11, Andrew Gierth wrote:
>> "Tomas" == Tomas Vondra writes:
>
> >> It's not one sort per grouping set, it's the minimal number of
> >> sorts needed to express the result as a union of ROLLUP
> >> clauses. The planner code will (I believe) always find the
> >> smallest numbe
> "Tomas" == Tomas Vondra writes:
>> It's not one sort per grouping set, it's the minimal number of
>> sorts needed to express the result as a union of ROLLUP
>> clauses. The planner code will (I believe) always find the
>> smallest number of sorts needed.
Tomas> You're probably right.
On 6.9.2014 23:34, Andrew Gierth wrote:
>> "Tomas" == Tomas Vondra writes:
>
> Tomas> I have significant doubts about the whole design,
> Tomas> though. Especially the decision not to use HashAggregate,
>
> There is no "decision not to use HashAggregate". There is simply no
> support for H
> "Tomas" == Tomas Vondra writes:
Tomas> I have significant doubts about the whole design,
Tomas> though. Especially the decision not to use HashAggregate,
There is no "decision not to use HashAggregate". There is simply no
support for HashAggregate yet.
Having it be able to work with Gro
On 31.8.2014 22:52, Andrew Gierth wrote:
> Recut patches:
>
> gsp1.patch - phase 1 code patch (full syntax, limited functionality)
> gsp2.patch - phase 2 code patch (adds full functionality using the
> new chained aggregate mechanism)
> gsp-doc.patch - doc
On Sunday, August 31, 2014, Andres Freund wrote:
> On 2014-08-31 21:09:59 +0530, Atri Sharma wrote:
> > On Sun, Aug 31, 2014 at 9:07 PM, Erik Rijkers > wrote:
> > > I have found that the "unrecognized node type" error is caused by:
>
> It's a warning, not an error, right?
>
> > > shared_preload_
On 2014-08-31 21:09:59 +0530, Atri Sharma wrote:
> On Sun, Aug 31, 2014 at 9:07 PM, Erik Rijkers wrote:
> > I have found that the "unrecognized node type" error is caused by:
It's a warning, not an error, right?
> > shared_preload_libraries = pg_stat_statements
> >
> > in postgresql.conf (as my
On Sun, Aug 31, 2014 at 9:07 PM, Erik Rijkers wrote:
> On Tue, August 26, 2014 14:24, Andrew Gierth wrote:
> >> "Erik" == Erik Rijkers writes:
> >
> > >> They apply cleanly for me at 2bde297 whether with git apply or
> > >> patch, except for the contrib one (which you don't need unless you
On Tue, August 26, 2014 14:24, Andrew Gierth wrote:
>> "Erik" == Erik Rijkers writes:
>
> >> They apply cleanly for me at 2bde297 whether with git apply or
> >> patch, except for the contrib one (which you don't need unless you
> >> want to run the contrib regression tests without applying
On Tue, August 26, 2014 14:24, Andrew Gierth wrote:
>> "Erik" == Erik Rijkers writes:
>
> >> They apply cleanly for me at 2bde297 whether with git apply or
> >> patch, except for the contrib one (which you don't need unless you
> >> want to run the contrib regression tests without applying
> "Erik" == Erik Rijkers writes:
>> They apply cleanly for me at 2bde297 whether with git apply or
>> patch, except for the contrib one (which you don't need unless you
>> want to run the contrib regression tests without applying the
>> gsp-u patch).
Erik> Ah, I had not realised that.
On Tue, August 26, 2014 11:13, Andrew Gierth wrote:
>> "Andrew" == Andrew Gierth writes:
>
>> "Erik" == Erik Rijkers writes:
>
> Erik> The patches did not apply anymore so I applied at 73eba19aebe0.
> Erik> There they applied OK, and make && make check was OK.
>
> Andrew> I'll look and
> "Andrew" == Andrew Gierth writes:
> "Erik" == Erik Rijkers writes:
Erik> The patches did not apply anymore so I applied at 73eba19aebe0.
Erik> There they applied OK, and make && make check was OK.
Andrew> I'll look and rebase if need be.
They apply cleanly for me at 2bde297 wheth
> "Erik" == Erik Rijkers writes:
Erik> The patches did not apply anymore so I applied at 73eba19aebe0.
Erik> There they applied OK, and make && make check was OK.
I'll look and rebase if need be.
--> WARNING: unrecognized node type: 347
Can't reproduce this - are you sure it's not a mi
On Mon, August 25, 2014 07:21, Andrew Gierth wrote:
> Here is the new version of our grouping sets patch. This version
> supersedes the previous post.
The patches did not apply anymore so I applied at 73eba19aebe0. There they
applied OK, and make && make check was OK.
drop table if exists it
2014-08-26 2:45 GMT+02:00 Andrew Gierth :
> > "Pavel" == Pavel Stehule writes:
>
> Pavel> Hi
> Pavel> I checked this patch, and it working very well
>
> Pavel> I found only two issue - I am not sure if it is issue
>
> Pavel> It duplicate rows
>
> Pavel> postgres=# explain select name, pl
> "Pavel" == Pavel Stehule writes:
Pavel> Hi
Pavel> I checked this patch, and it working very well
Pavel> I found only two issue - I am not sure if it is issue
Pavel> It duplicate rows
Pavel> postgres=# explain select name, place, sum(count), grouping(name),
Pavel> grouping(place) fr
Hi
I checked this patch, and it working very well
I found only two issue - I am not sure if it is issue
with data from https://wiki.postgresql.org/wiki/Grouping_Sets
postgres=# select name, place, sum(count), grouping(name), grouping(place)
from cars group by rollup(name, place);
name | pla
78 matches
Mail list logo