Re: [HACKERS] Parallel Aggregate costs don't consider combine/serial/deserial funcs

2016-04-13 Thread Robert Haas
On Tue, Apr 12, 2016 at 5:38 PM, David Rowley wrote: >>> One small point which I was a little unsure of in the attached is, >>> should the "if (aggref->aggdirectargs)" part of >>> count_agg_clauses_walker() be within the "if >>> (!context->combineStates)". I simply couldn't decide. We currently >>

Re: [HACKERS] Parallel Aggregate costs don't consider combine/serial/deserial funcs

2016-04-12 Thread David Rowley
On 13 April 2016 at 08:52, Robert Haas wrote: > On Sun, Apr 10, 2016 at 8:47 AM, David Rowley > wrote: >> I realised a few days ago that the parallel aggregate code does not >> cost for the combine, serialisation and deserialisation functions at >> all. > > Oops. > >> I've attached a patch which

Re: [HACKERS] Parallel Aggregate costs don't consider combine/serial/deserial funcs

2016-04-12 Thread Robert Haas
On Sun, Apr 10, 2016 at 8:47 AM, David Rowley wrote: > I realised a few days ago that the parallel aggregate code does not > cost for the combine, serialisation and deserialisation functions at > all. Oops. > I've attached a patch which fixes this. I've committed this patch. I wonder if it's g

Re: [HACKERS] Parallel Aggregate

2016-03-21 Thread James Sewell
Good news! On Tuesday, 22 March 2016, David Rowley wrote: > On 22 March 2016 at 02:35, Robert Haas > wrote: > > I have committed this after changing some of the comments. > > > > There might still be bugs ... but I don't see them. And the speedups > > look very impressive. > > > > Really nice

Re: [HACKERS] Parallel Aggregate

2016-03-21 Thread David Rowley
On 22 March 2016 at 02:35, Robert Haas wrote: > I have committed this after changing some of the comments. > > There might still be bugs ... but I don't see them. And the speedups > look very impressive. > > Really nice work, David. Thanks for that, and thank you for taking the time to carefully

Re: [HACKERS] Parallel Aggregate

2016-03-21 Thread Robert Haas
On Sun, Mar 20, 2016 at 7:30 PM, David Rowley wrote: > An updated patch is attached. This hopefully addresses your concerns > with the comment, and also the estimate_hashagg_tablesize() NULL > checking. I have committed this after changing some of the comments. There might still be bugs ... but

Re: [HACKERS] Parallel Aggregate

2016-03-21 Thread Robert Haas
On Sun, Mar 20, 2016 at 11:24 PM, Alvaro Herrera wrote: > David Rowley wrote: >> On 21 March 2016 at 15:48, Alvaro Herrera wrote: >> > David Rowley wrote: >> > >> >> I've rewritten the comment to become: >> >> >> >> /* >> >> * Providing that the estimated size of the hashtable does not exceed >>

Re: [HACKERS] Parallel Aggregate

2016-03-20 Thread David Rowley
On 21 March 2016 at 15:48, Alvaro Herrera wrote: > David Rowley wrote: > >> I've rewritten the comment to become: >> >> /* >> * Providing that the estimated size of the hashtable does not exceed >> * work_mem, we'll generate a HashAgg Path, although if we were unable >> * to sort above, then we'd

Re: [HACKERS] Parallel Aggregate

2016-03-20 Thread Alvaro Herrera
David Rowley wrote: > I've rewritten the comment to become: > > /* > * Providing that the estimated size of the hashtable does not exceed > * work_mem, we'll generate a HashAgg Path, although if we were unable > * to sort above, then we'd better generate a Path, so that we at least > * have one.

Re: [HACKERS] Parallel Aggregate

2016-03-20 Thread Tomas Vondra
Hi, On 03/21/2016 12:30 AM, David Rowley wrote: On 21 March 2016 at 09:47, Tomas Vondra wrote: ... I'm not sure changing the meaning of enable_hashagg like this is a good idea. It worked as a hard switch before, while with this change that would not be the case. Or more accurately - it would

Re: [HACKERS] Parallel Aggregate

2016-03-20 Thread David Rowley
On 21 March 2016 at 09:47, Tomas Vondra wrote: > On 03/20/2016 09:58 AM, David Rowley wrote: >> Thank you for the reviews. The only thing I can think to mention which >> I've not already is that I designed estimate_hashagg_tablesize() to be >> reusable in various places in planner.c, yet I've only

Re: [HACKERS] Parallel Aggregate

2016-03-20 Thread Tomas Vondra
Hi, On 03/20/2016 09:58 AM, David Rowley wrote: On 20 March 2016 at 03:19, Robert Haas wrote: On Fri, Mar 18, 2016 at 10:32 PM, David Rowley wrote: Updated patch is attached. I think this looks structurally correct now, and I think it's doing the right thing as far as parallelism is concer

Re: [HACKERS] Parallel Aggregate

2016-03-20 Thread Robert Haas
On Fri, Mar 18, 2016 at 9:16 AM, David Rowley wrote: > I've attached an updated patch. This looks substantially better than earlier versions, although I haven't studied every detail of it yet. + * Partial aggregation requires that each aggregate does not have a DISTINCT or + * ORDER BY clause, a

Re: [HACKERS] Parallel Aggregate

2016-03-20 Thread David Rowley
On 20 March 2016 at 03:19, Robert Haas wrote: > On Fri, Mar 18, 2016 at 10:32 PM, David Rowley > wrote: >> Updated patch is attached. > > I think this looks structurally correct now, and I think it's doing > the right thing as far as parallelism is concerned. I don't see any > obvious problems i

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread David Rowley
On 17 March 2016 at 01:19, Amit Kapila wrote: > Few assorted comments: > > 1. > /* > + * Determine if it's possible to perform aggregation in parallel using > + * multiple worker processes. We can permit this when there's at least one > + * partial_path in input_rel, but not if the query has gro

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread David Rowley
On 19 March 2016 at 06:15, Robert Haas wrote: > On Fri, Mar 18, 2016 at 9:16 AM, David Rowley > wrote: >> I've attached an updated patch. > > This looks substantially better than earlier versions, although I > haven't studied every detail of it yet. > > + * Partial aggregation requires that each

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread David Rowley
On 18 March 2016 at 20:25, Amit Kapila wrote: > Few more comments: > > 1. > > + if (parse->groupClause) > + path = (Path *) create_sort_path(root, > + grouped_rel, > + path, > + root->group_pathkeys, > + -1.0); > > For final path, why do you want to sort just for group by case? If there's no GROU

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread David Rowley
On 17 March 2016 at 18:05, David Rowley wrote: > Updated patch attached. Please disregard 0001-Allow-aggregation-to-happen-in-parallel_2016-03-17.patch. This contained a badly thought through last minute change to how the Gather path is generated and is broken. I played around with ways of gener

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread Robert Haas
On Wed, Mar 16, 2016 at 6:49 AM, David Rowley wrote: > On 16 March 2016 at 15:04, Robert Haas wrote: >> I don't think I'd be objecting if you made PartialAggref a real >> alternative to Aggref. But that's not what you've got here. A >> PartialAggref is just a wrapper around an underlying Aggref

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread Haribabu Kommi
On Thu, Mar 17, 2016 at 2:13 PM, James Sewell wrote: > > Hi again, > > This is probably me missing something, but is there a reason parallel > aggregate doesn't seem to ever create append nodes containing Index scans? > > SET random_page_cost TO 0.2; > SET max_parallel_degree TO 8; > > postgres=#

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread David Rowley
On 17 March 2016 at 00:57, Robert Haas wrote: > On Wed, Mar 16, 2016 at 6:49 AM, David Rowley > wrote: >> On 16 March 2016 at 15:04, Robert Haas wrote: >>> I don't think I'd be objecting if you made PartialAggref a real >>> alternative to Aggref. But that's not what you've got here. A >>> Part

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread Robert Haas
On Wed, Mar 16, 2016 at 5:05 PM, David Rowley wrote: >> Cool! Why not initialize aggpartialtype always? > > Because the follow-on patch sets that to either the serialtype or the > aggtranstype, depending on if serialisation is required. Serialisation > is required for parallel aggregate, but if w

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread Alvaro Herrera
I read this a bit, as an exercise to try to follow parallel query a bit. I think the most interesting thing I have to say is that the new error messages in ExecInitExpr do not conform to our style. Probably just downcase everything except Aggref and you're done, since they're can't-happen conditio

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread David Rowley
On 17 March 2016 at 03:47, Robert Haas wrote: > More review comments: > > /* > +* Likewise for any partial paths, although this case > is more simple as > +* we don't track the cheapest path. > +*/ > > I think in the process of gettin

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread David Rowley
On 17 March 2016 at 01:29, Robert Haas wrote: > On Wed, Mar 16, 2016 at 8:19 AM, Amit Kapila wrote: >> Isn't it better to call it as Parallel Aggregate instead of Partial >> Aggregate. Initialy, we have kept Partial for seqscan, but later on we >> changed to Parallel Seq Scan, so I am not able t

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread Robert Haas
On Fri, Mar 18, 2016 at 10:32 PM, David Rowley wrote: > Updated patch is attached. I think this looks structurally correct now, and I think it's doing the right thing as far as parallelism is concerned. I don't see any obvious problems in the rest of it, either, but I haven't thought hugely deep

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread Amit Kapila
On Thu, Mar 17, 2016 at 6:41 PM, David Rowley wrote: > > On 18 March 2016 at 01:22, Amit Kapila wrote: > > On Thu, Mar 17, 2016 at 10:35 AM, David Rowley > > wrote: > > Updated patch is attached. Thanks for the re-review. > Few more comments: 1. + if (parse->groupClause) + path = (Path *) cre

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread Amit Kapila
On Wed, Mar 16, 2016 at 4:19 PM, David Rowley wrote: > > On 16 March 2016 at 15:04, Robert Haas wrote: > > I don't think I'd be objecting if you made PartialAggref a real > > alternative to Aggref. But that's not what you've got here. A > > PartialAggref is just a wrapper around an underlying A

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread Robert Haas
On Wed, Mar 16, 2016 at 8:19 AM, Amit Kapila wrote: > Isn't it better to call it as Parallel Aggregate instead of Partial > Aggregate. Initialy, we have kept Partial for seqscan, but later on we > changed to Parallel Seq Scan, so I am not able to think why it is better to > call Partial incase of

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread Robert Haas
On Wed, Mar 16, 2016 at 7:57 AM, Robert Haas wrote: > On Wed, Mar 16, 2016 at 6:49 AM, David Rowley > wrote: >> On 16 March 2016 at 15:04, Robert Haas wrote: >>> I don't think I'd be objecting if you made PartialAggref a real >>> alternative to Aggref. But that's not what you've got here. A >>

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread Amit Kapila
On Thu, Mar 17, 2016 at 10:35 AM, David Rowley wrote: > > On 17 March 2016 at 01:19, Amit Kapila wrote: > > Few assorted comments: > > > > 2. > > AggPath * > > create_agg_path(PlannerInfo *root, > > @@ -2397,9 +2399,11 @@ create_agg_path(PlannerInfo *root, > > > > List *groupClause, > > List

Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread James Sewell
Hi again, This is probably me missing something, but is there a reason parallel aggregate doesn't seem to ever create append nodes containing Index scans? SET random_page_cost TO 0.2; SET max_parallel_degree TO 8; postgres=# explain SELECT sum(count_i) FROM base GROUP BY view_time_day;

Re: [HACKERS] Parallel Aggregate

2016-03-18 Thread David Rowley
On 18 March 2016 at 01:22, Amit Kapila wrote: > On Thu, Mar 17, 2016 at 10:35 AM, David Rowley > wrote: >> >> On 17 March 2016 at 01:19, Amit Kapila wrote: >> > Few assorted comments: >> > >> > 2. >> > AggPath * >> > create_agg_path(PlannerInfo *root, >> > @@ -2397,9 +2399,11 @@ create_agg_pat

Re: [HACKERS] Parallel Aggregate

2016-03-18 Thread David Rowley
On 19 March 2016 at 05:46, Robert Haas wrote: > On Wed, Mar 16, 2016 at 5:05 PM, David Rowley > wrote: >>> Cool! Why not initialize aggpartialtype always? >> >> Because the follow-on patch sets that to either the serialtype or the >> aggtranstype, depending on if serialisation is required. Seria

Re: [HACKERS] Parallel Aggregate

2016-03-18 Thread David Rowley
On 19 March 2016 at 09:53, Alvaro Herrera wrote: > I read this a bit, as an exercise to try to follow parallel query a bit. Thanks for taking a look at this. > I think the most interesting thing I have to say is that the new error > messages in ExecInitExpr do not conform to our style. Probably

Re: [HACKERS] Parallel Aggregate

2016-03-16 Thread David Rowley
On 16 March 2016 at 15:04, Robert Haas wrote: > I don't think I'd be objecting if you made PartialAggref a real > alternative to Aggref. But that's not what you've got here. A > PartialAggref is just a wrapper around an underlying Aggref that > changes the interpretation of it - and I think that

Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread Robert Haas
On Tue, Mar 15, 2016 at 9:23 PM, David Rowley wrote: >>> Should I update the patch to use the method you describe? >> >> Well, my feeling is that is going to make this a lot smaller and >> simpler, so I think so. But if you disagree strongly, let's discuss >> further. > > Not strongly. It means t

Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread David Rowley
On 16 March 2016 at 14:08, Robert Haas wrote: > On Tue, Mar 15, 2016 at 8:55 PM, David Rowley > wrote: >> On 16 March 2016 at 13:42, Robert Haas wrote: >>> On Tue, Mar 15, 2016 at 8:04 PM, David Rowley >>> wrote: On 16 March 2016 at 12:58, Robert Haas wrote: > ...and why would one be

Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread Robert Haas
On Tue, Mar 15, 2016 at 8:55 PM, David Rowley wrote: > On 16 March 2016 at 13:42, Robert Haas wrote: >> On Tue, Mar 15, 2016 at 8:04 PM, David Rowley >> wrote: >>> On 16 March 2016 at 12:58, Robert Haas wrote: ...and why would one be true and the other false? >>> >>> One would be the combi

Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread David Rowley
On 16 March 2016 at 13:42, Robert Haas wrote: > On Tue, Mar 15, 2016 at 8:04 PM, David Rowley > wrote: >> On 16 March 2016 at 12:58, Robert Haas wrote: >>> ...and why would one be true and the other false? >> >> One would be the combine aggregate (having aggpartial = false), and >> the one in th

Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread Robert Haas
On Tue, Mar 15, 2016 at 8:04 PM, David Rowley wrote: > On 16 March 2016 at 12:58, Robert Haas wrote: >> On Tue, Mar 15, 2016 at 6:55 PM, David Rowley >> wrote: >>> On 16 March 2016 at 11:00, Robert Haas wrote: I don't see why we would need to leave aggpartial out of the equals() check

Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread David Rowley
On 16 March 2016 at 12:58, Robert Haas wrote: > On Tue, Mar 15, 2016 at 6:55 PM, David Rowley > wrote: >> On 16 March 2016 at 11:00, Robert Haas wrote: >>> I don't see why we would need to leave aggpartial out of the equals() >>> check. I must be missing something. >> >> See fix_combine_agg_exp

Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread Robert Haas
On Tue, Mar 15, 2016 at 6:55 PM, David Rowley wrote: > On 16 March 2016 at 11:00, Robert Haas wrote: >> I don't see why we would need to leave aggpartial out of the equals() >> check. I must be missing something. > > See fix_combine_agg_expr_mutator() > > This piece of code: > > /* > * Aggrefs f

Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread David Rowley
On 16 March 2016 at 11:00, Robert Haas wrote: > I don't see why we would need to leave aggpartial out of the equals() > check. I must be missing something. See fix_combine_agg_expr_mutator() This piece of code: /* * Aggrefs for partial aggregates are wrapped up in a PartialAggref, * we need to

Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread Robert Haas
On Tue, Mar 15, 2016 at 5:50 PM, David Rowley wrote: >> I still think that's solving the problem the wrong way. Why can't >> exprType(), when applied to the Aggref, do something like this? >> >> { >> Aggref *aref = (Aggref *) expr; >> if (aref->aggpartial) >> return aref->aggtrans

Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread David Rowley
On 16 March 2016 at 09:23, Robert Haas wrote: > On Mon, Mar 14, 2016 at 7:56 PM, David Rowley > wrote: >> A comment does explain this, but perhaps it's not good enough, so I've >> rewritten it to become. >> >> /* >> * PartialAggref >> * >> * When partial aggregation is required in a plan, the

Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread Robert Haas
On Mon, Mar 14, 2016 at 7:56 PM, David Rowley wrote: >> More generally, why are we inventing PartialAggref >> instead of reusing Aggref? The code comments seem to contain no >> indication as to why we shouldn't need all the same details for >> PartialAggref that we do for Aggref, instead of only

Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread David Rowley
On 15 March 2016 at 13:48, David Rowley wrote: > An updated patch will follow soon. I've attached an updated patch which addresses some of Robert's and Tomas' concerns. I've not done anything about the exprCollation() yet, as I'm still unsure of what it should do. I just don't see why returning t

Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread David Rowley
On 15 March 2016 at 11:24, James Sewell wrote: > On Tuesday, 15 March 2016, Robert Haas wrote: >> >> > Does the cost of the aggregate function come into this calculation at >> > all? In PostGIS land, much smaller numbers of rows can generate loads >> > that would be effective to parallelize (work

Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread David Rowley
On 15 March 2016 at 11:39, Tomas Vondra wrote: > I've looked at this patch today. The patch seems quite solid, but I do have > a few minor comments (or perhaps questions, given that this is the first > time I looked at the patch). > > 1) exprCollation contains this bit: > -

Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread David Rowley
On 15 March 2016 at 08:53, Robert Haas wrote: > I haven't fully studied every line of this yet, but here are a few comments: > > + case T_PartialAggref: > + coll = InvalidOid; /* XXX is this correct? */ > + break; > > I doubt it. Thanks fo

Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread James Sewell
On Tue, Mar 15, 2016 at 9:32 AM, Robert Haas wrote: > > I kind of doubt this would work well, but somebody could write a patch > for it and try it out. OK I'll give this a go today and report back. Would the eventual plan be to use pg_proc.procost for the functions from each aggregate concerne

Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread Tomas Vondra
Hi, On 03/13/2016 10:44 PM, David Rowley wrote: On 12 March 2016 at 16:31, David Rowley wrote: I've attached an updated patch which is based on commit 7087166, things are really changing fast in the grouping path area at the moment, but hopefully the dust is starting to settle now. The attac

Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread Robert Haas
On Mon, Mar 14, 2016 at 6:24 PM, James Sewell wrote: > Any chance of getting a GUC (say min_parallel_degree) added to allow setting > the initial value of parallel_degree, then changing the small relation check > to also pass if parallel_degree > 1? > > That way you could set min_parallel_degree o

Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread James Sewell
On Tuesday, 15 March 2016, Robert Haas wrote: > > > Does the cost of the aggregate function come into this calculation at > > all? In PostGIS land, much smaller numbers of rows can generate loads > > that would be effective to parallelize (worker time much >> than > > startup cost). > > Unfortunat

Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread Robert Haas
On Mon, Mar 14, 2016 at 3:56 PM, Paul Ramsey wrote: > On Sun, Mar 13, 2016 at 7:31 PM, David Rowley > wrote: >> On 14 March 2016 at 14:52, James Sewell wrote: >>> One question - how is the upper limit of workers chosen? >> >> See create_parallel_paths() in allpaths.c. Basically the bigger the >>

Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread Paul Ramsey
On Sun, Mar 13, 2016 at 7:31 PM, David Rowley wrote: > On 14 March 2016 at 14:52, James Sewell wrote: >> One question - how is the upper limit of workers chosen? > > See create_parallel_paths() in allpaths.c. Basically the bigger the > relation (in pages) the more workers will be allocated, up un

Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread Robert Haas
On Sun, Mar 13, 2016 at 5:44 PM, David Rowley wrote: > On 12 March 2016 at 16:31, David Rowley wrote: >> I've attached an updated patch which is based on commit 7087166, >> things are really changing fast in the grouping path area at the >> moment, but hopefully the dust is starting to settle now

Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread David Rowley
On 14 March 2016 at 17:05, James Sewell wrote: > > Hi again, > > I've been playing around with inheritance combined with this patch. Currently > it looks like you are taking max(parallel_degree) from all the child tables > and using that for the number of workers. > > For large machines it makes

Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread James Sewell
On Mon, Mar 14, 2016 at 3:05 PM, David Rowley wrote: > > Things to try: > 1. alter table a add column ts_date date; update a set ts_date = > date_trunc('DAY',ts); vacuum full analyze ts; > 2. or, create index on a (date_trunc('DAY',ts)); analyze a; > 3. or for testing, set the work_mem higher. >

Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread James Sewell
Hi again, I've been playing around with inheritance combined with this patch. Currently it looks like you are taking max(parallel_degree) from all the child tables and using that for the number of workers. For large machines it makes much more sense to use sum(parallel_degree) - but I've just see

Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread David Rowley
On 14 March 2016 at 16:39, James Sewell wrote: > > I've been testing how this works with partitioning (which seems to be > strange, but I'll post separately about that) and something odd seems to be > going on now with the parallel triggering: > > postgres=# create table a as select * from base_

Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread James Sewell
Cool, I've been testing how this works with partitioning (which seems to be strange, but I'll post separately about that) and something odd seems to be going on now with the parallel triggering: postgres=# create table a as select * from base_p2015_11; SELECT 2000 postgres=# select * from a

Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread David Rowley
On 14 March 2016 at 14:52, James Sewell wrote: > One question - how is the upper limit of workers chosen? See create_parallel_paths() in allpaths.c. Basically the bigger the relation (in pages) the more workers will be allocated, up until max_parallel_degree. There is also a comment in that func

Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread James Sewell
Hi, Happy to test, really looking forward to seeing this stuff in core. The explain analyze is below: Finalize HashAggregate (cost=810142.42..810882.62 rows=59216 width=16) (actual time=2282.092..2282.202 rows=15 loops=1) Group Key: (date_trunc('DAY'::text, pageview_start_tstamp)) -> Gat

Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread David Rowley
On 14 March 2016 at 14:16, James Sewell wrote: > I've done some testing with one of my data sets in an 8VPU virtual > environment and this is looking really, really good. > > My test query is: > > SELECT pageview, sum(pageview_count) > FROM fact_agg_2015_12 > GROUP BY date_trunc('DAY'::text, page

Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread James Sewell
Hi, I've done some testing with one of my data sets in an 8VPU virtual environment and this is looking really, really good. My test query is: SELECT pageview, sum(pageview_count) FROM fact_agg_2015_12 GROUP BY date_trunc('DAY'::text, pageview); The query returns 15 rows. The fact_agg table is 5

Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread Haribabu Kommi
On Mon, Mar 14, 2016 at 8:44 AM, David Rowley wrote: > On 12 March 2016 at 16:31, David Rowley wrote: >> I've attached an updated patch which is based on commit 7087166, >> things are really changing fast in the grouping path area at the >> moment, but hopefully the dust is starting to settle now

Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread David Rowley
On 12 March 2016 at 16:31, David Rowley wrote: > I've attached an updated patch which is based on commit 7087166, > things are really changing fast in the grouping path area at the > moment, but hopefully the dust is starting to settle now. The attached patch fixes a harmless compiler warning abo

Re: [HACKERS] Parallel Aggregate

2016-03-11 Thread David Rowley
On 11 March 2016 at 03:39, David Rowley wrote: > A couple of things which I'm not 100% happy with. > > 1. make_partialgroup_input_target() doing lookups to the syscache. > Perhaps this job can be offloaded to a new function in a more suitable > location. Ideally the Aggref would already store the

Re: [HACKERS] Parallel Aggregate

2016-03-10 Thread David Rowley
On 8 March 2016 at 11:15, David Rowley wrote: > The setrefs.c parts of the patch remain completely broken. I've not > had time to look at this again yet, sorry. Ok, so again, apologies for previously sending such a broken patch. I've since managed to free up a bit of time to work on this, which n

Re: [HACKERS] Parallel Aggregate

2016-03-10 Thread Amit Langote
On Thu, Mar 10, 2016 at 10:55 PM, Robert Haas wrote: > On Thu, Mar 10, 2016 at 6:42 AM, David Rowley > wrote: >> The one reason that I asked about force_parallel_mode was that I >> assumed there was some buildfarm member running somewhere that >> switches this on and runs the regression tests. I

Re: [HACKERS] Parallel Aggregate

2016-03-10 Thread Robert Haas
On Thu, Mar 10, 2016 at 6:42 AM, David Rowley wrote: > Hmm, it appears I only looked as far as the enum declaration, which I > expected to have something. Perhaps I'm just not used to looking up > the manual for things relating to code. I don't mind adding some comments there, it just didn't seem

Re: [HACKERS] Parallel Aggregate

2016-03-10 Thread David Rowley
On 9 March 2016 at 04:06, Robert Haas wrote: > On Mon, Mar 7, 2016 at 5:15 PM, David Rowley > wrote: >> 3. Nothing in create_grouping_paths() looks at the force_parallel_mode >> GUC. I had a quick look at this GUC and was a bit surprised to see 3 >> possible states, but no explanation of what the

Re: [HACKERS] Parallel Aggregate

2016-03-09 Thread Haribabu Kommi
On Mon, Mar 7, 2016 at 4:39 PM, Tom Lane wrote: > Haribabu Kommi writes: >> 2. Temporary fix for float aggregate types in _equalAggref because of >> a change in aggtype to trans type, otherwise the parallel aggregation >> plan failure in set_plan_references. whenever the aggtype is not matched, >

Re: [HACKERS] Parallel Aggregate

2016-03-09 Thread Robert Haas
On Tue, Mar 8, 2016 at 4:26 PM, David Rowley wrote: >> The first one in the list will be the cheapest; why not just look at >> that? Sorted partial paths are interesting if some subsequent path >> construction step can make use of that sort ordering, but they're >> never interesting from the poin

Re: [HACKERS] Parallel Aggregate

2016-03-08 Thread David Rowley
On 9 March 2016 at 04:06, Robert Haas wrote: > On Mon, Mar 7, 2016 at 5:15 PM, David Rowley > wrote: >> My concerns are: >> 1. Since there's no cheapest_partial_path in RelOptInfo the code is >> currently considering every partial_path for parallel hash aggregate. >> With normal aggregation we on

Re: [HACKERS] Parallel Aggregate

2016-03-08 Thread Robert Haas
On Mon, Mar 7, 2016 at 5:15 PM, David Rowley wrote: > My concerns are: > 1. Since there's no cheapest_partial_path in RelOptInfo the code is > currently considering every partial_path for parallel hash aggregate. > With normal aggregation we only ever use the cheapest path, so this > may not be fu

Re: [HACKERS] Parallel Aggregate

2016-03-07 Thread David Rowley
On 5 March 2016 at 07:25, Robert Haas wrote: > On Thu, Mar 3, 2016 at 11:00 PM, David Rowley >> 3. The code never attempts to mix and match Grouping Agg and Hash Agg >> plans. e.g it could be an idea to perform Partial Hash Aggregate -> >> Gather -> Sort -> Finalize Group Aggregate, or hash as in

Re: [HACKERS] Parallel Aggregate

2016-03-06 Thread David Rowley
On 7 March 2016 at 18:19, Haribabu Kommi wrote: > Here I attached update patch with further changes, > 1. Explain plan changes for parallel grouping Perhaps someone might disagree with me, but I'm not all that sure I really get the need for that. With nodeAgg.c we're doing something fundamentally

Re: [HACKERS] Parallel Aggregate

2016-03-06 Thread Tom Lane
Haribabu Kommi writes: > 2. Temporary fix for float aggregate types in _equalAggref because of > a change in aggtype to trans type, otherwise the parallel aggregation > plan failure in set_plan_references. whenever the aggtype is not matched, > it verifies with the trans type also. That is a comp

Re: [HACKERS] Parallel Aggregate

2016-03-06 Thread Haribabu Kommi
On Sun, Mar 6, 2016 at 10:21 PM, Haribabu Kommi wrote: > > Pending: > 1. Explain plan needs to be corrected for parallel grouping similar like > parallel aggregate. Here I attached update patch with further changes, 1. Explain plan changes for parallel grouping 2. Temporary fix for float aggrega

Re: [HACKERS] Parallel Aggregate

2016-03-06 Thread Haribabu Kommi
On Fri, Mar 4, 2016 at 3:00 PM, David Rowley wrote: > On 17 February 2016 at 17:50, Haribabu Kommi wrote: >> Here I attached a draft patch based on previous discussions. It still needs >> better comments and optimization. > > Over in [1] Tom posted a large change to the grouping planner which > c

Re: [HACKERS] Parallel Aggregate

2016-03-04 Thread Robert Haas
On Thu, Mar 3, 2016 at 11:00 PM, David Rowley wrote: > On 17 February 2016 at 17:50, Haribabu Kommi wrote: >> Here I attached a draft patch based on previous discussions. It still needs >> better comments and optimization. > > Over in [1] Tom posted a large change to the grouping planner which >

Re: [HACKERS] Parallel Aggregate

2016-03-03 Thread David Rowley
On 17 February 2016 at 17:50, Haribabu Kommi wrote: > Here I attached a draft patch based on previous discussions. It still needs > better comments and optimization. Over in [1] Tom posted a large change to the grouping planner which causes large conflict with the parallel aggregation patch. I've

Re: [HACKERS] Parallel Aggregate

2016-02-16 Thread Haribabu Kommi
On Sat, Feb 13, 2016 at 3:51 PM, Robert Haas wrote: > On Sun, Feb 7, 2016 at 8:21 PM, Haribabu Kommi > wrote:future. >> Here I attached updated patch with the corrections. > > So, what about the main patch, for parallel aggregation itself? I'm > reluctant to spend too much time massaging combine

Re: [HACKERS] Parallel Aggregate

2016-02-12 Thread Robert Haas
On Sun, Feb 7, 2016 at 8:21 PM, Haribabu Kommi wrote:future. > Here I attached updated patch with the corrections. So, what about the main patch, for parallel aggregation itself? I'm reluctant to spend too much time massaging combine functions if we don't have the infrastructure to use them. Th

Re: [HACKERS] Parallel Aggregate

2016-02-08 Thread Haribabu Kommi
On Mon, Feb 8, 2016 at 9:01 AM, Robert Haas wrote: > On Thu, Jan 21, 2016 at 11:25 PM, Haribabu Kommi > wrote: >> [ new patch ] > > This patch contains a number of irrelevant hunks that really ought not > to be here and make the patch harder to understand, like this: > > -

Re: [HACKERS] Parallel Aggregate

2016-02-07 Thread Haribabu Kommi
On Mon, Feb 8, 2016 at 2:00 AM, Robert Haas wrote: > On Sun, Jan 24, 2016 at 7:56 PM, Haribabu Kommi > wrote: >> On Sat, Jan 23, 2016 at 12:59 PM, Haribabu Kommi >> wrote: >>> Here I attached updated patch with additional combine function for >>> two stage aggregates also. >> >> A wrong combine

Re: [HACKERS] Parallel Aggregate

2016-02-07 Thread Robert Haas
On Thu, Jan 21, 2016 at 11:25 PM, Haribabu Kommi wrote: > [ new patch ] This patch contains a number of irrelevant hunks that really ought not to be here and make the patch harder to understand, like this: -* Generate appropriate target list for scan/join subplan; may b

Re: [HACKERS] Parallel Aggregate

2016-02-07 Thread Robert Haas
On Sun, Jan 24, 2016 at 7:56 PM, Haribabu Kommi wrote: > On Sat, Jan 23, 2016 at 12:59 PM, Haribabu Kommi > wrote: >> Here I attached updated patch with additional combine function for >> two stage aggregates also. > > A wrong combine function was added in pg_aggregate.h in the earlier > patch th

Re: [HACKERS] Parallel Aggregate

2016-01-24 Thread Haribabu Kommi
On Sat, Jan 23, 2016 at 12:59 PM, Haribabu Kommi wrote: > > Here I attached updated patch with additional combine function for > two stage aggregates also. A wrong combine function was added in pg_aggregate.h in the earlier patch that leading to initdb problem. Corrected one is attached. Regards

Re: [HACKERS] Parallel Aggregate

2016-01-22 Thread Haribabu Kommi
On Fri, Jan 22, 2016 at 10:13 PM, David Rowley wrote: > On 22 January 2016 at 17:25, Haribabu Kommi wrote: >> Along with these changes, I added a float8 combine function to see >> how it works under parallel aggregate, it is working fine for float4, but >> giving small data mismatch with float8 d

Re: [HACKERS] Parallel Aggregate

2016-01-22 Thread David Rowley
On 22 January 2016 at 17:25, Haribabu Kommi wrote: > Along with these changes, I added a float8 combine function to see > how it works under parallel aggregate, it is working fine for float4, but > giving small data mismatch with float8 data type. > > postgres=# select avg(f3), avg(f4) from tbl; >

Re: [HACKERS] Parallel Aggregate

2016-01-21 Thread Haribabu Kommi
On Fri, Jan 22, 2016 at 7:44 AM, David Rowley wrote: > On 21 January 2016 at 18:26, Haribabu Kommi wrote: >> Here I attached update parallel aggregate patch on top of recent commits >> of combine aggregate and parallel join patch. It still lacks of cost >> comparison >> code to compare both para

Re: [HACKERS] Parallel Aggregate

2016-01-21 Thread David Rowley
On 21 January 2016 at 18:26, Haribabu Kommi wrote: > Here I attached update parallel aggregate patch on top of recent commits > of combine aggregate and parallel join patch. It still lacks of cost > comparison > code to compare both parallel and normal aggregates. Thanks for the updated patch.

Re: [HACKERS] Parallel Aggregate

2016-01-20 Thread Haribabu Kommi
On Thu, Dec 24, 2015 at 5:12 AM, Robert Haas wrote: > On Mon, Dec 21, 2015 at 6:38 PM, David Rowley > wrote: >> On 22 December 2015 at 04:16, Paul Ramsey wrote: >>> >>> Shouldn’t parallel aggregate come into play regardless of scan >>> selectivity? >> >> I'd say that the costing should take into

Re: [HACKERS] Parallel Aggregate

2015-12-23 Thread Robert Haas
On Mon, Dec 21, 2015 at 6:38 PM, David Rowley wrote: > On 22 December 2015 at 04:16, Paul Ramsey wrote: >> >> Shouldn’t parallel aggregate come into play regardless of scan >> selectivity? > > I'd say that the costing should take into account the estimated number of > groups. > > The more tuples

Re: [HACKERS] Parallel Aggregate

2015-12-21 Thread David Rowley
On 22 December 2015 at 04:16, Paul Ramsey wrote: > Shouldn’t parallel aggregate come into play regardless of scan selectivity? > I'd say that the costing should take into account the estimated number of groups. The more tuples that make it into each group, the more attractive parallel grouping

Re: [HACKERS] Parallel Aggregate

2015-12-21 Thread Haribabu Kommi
On Tue, Dec 22, 2015 at 2:16 AM, Paul Ramsey wrote: > Shouldn’t parallel aggregate come into play regardless of scan selectivity? > I know in PostGIS land there’s a lot of stuff like: > > SELECT ST_Union(geom) FROM t GROUP BY areacode; > > Basically, in the BI case, there’s often no filter at all.

  1   2   >