Re: [HACKERS] asynchronous and vectorized execution

2016-12-01 Thread Haribabu Kommi
On Mon, Oct 3, 2016 at 3:25 PM, Kyotaro HORIGUCHI < horiguchi.kyot...@lab.ntt.co.jp> wrote: > At Mon, 3 Oct 2016 13:14:23 +0900, Michael Paquier < > michael.paqu...@gmail.com> wrote in xn_g7uwnpqun_a_sewb...@mail.gmail.com> > > On Thu, Sep 29, 2016 at 5:50 PM, Kyotaro HORIGUCHI > > wrote: > > >

Re: [HACKERS] asynchronous and vectorized execution

2016-10-02 Thread Kyotaro HORIGUCHI
At Mon, 3 Oct 2016 13:14:23 +0900, Michael Paquier wrote in > On Thu, Sep 29, 2016 at 5:50 PM, Kyotaro HORIGUCHI > wrote: > > Sorry for no response, but, the answer is yes. We could be able > > to avoid the problem by managing execution state for every > > node. But it needs an additional flag

Re: [HACKERS] asynchronous and vectorized execution

2016-10-02 Thread Michael Paquier
On Thu, Sep 29, 2016 at 5:50 PM, Kyotaro HORIGUCHI wrote: > Sorry for no response, but, the answer is yes. We could be able > to avoid the problem by managing execution state for every > node. But it needs an additional flag in *State structs and > manipulating on the way shuttling up and down aro

Re: [HACKERS] asynchronous and vectorized execution

2016-09-29 Thread Kyotaro HORIGUCHI
Hello, thank you for the comment. At Fri, 23 Sep 2016 18:15:40 +0530, Amit Khandekar wrote in > On 13 September 2016 at 20:20, Robert Haas wrote: > > > On Mon, Aug 29, 2016 at 4:08 AM, Kyotaro HORIGUCHI > > wrote: > > > [ new patches ] > > > > +/* > > + * We assume th

Re: [HACKERS] asynchronous and vectorized execution

2016-09-23 Thread Robert Haas
On Fri, Sep 23, 2016 at 8:45 AM, Amit Khandekar wrote: > For e.g., in the above plan which you specified, suppose : > 1. Hash Join has called ExecProcNode() for the child foreign scan b, and so > is > waiting in ExecAsyncWaitForNode(foreign_scan_on_b). > 2. The event wait list already has foreign

Re: [HACKERS] asynchronous and vectorized execution

2016-09-23 Thread Amit Khandekar
On 13 September 2016 at 20:20, Robert Haas wrote: > On Mon, Aug 29, 2016 at 4:08 AM, Kyotaro HORIGUCHI > wrote: > > [ new patches ] > > +/* > + * We assume that few nodes are async-aware and async-unaware > + * nodes cannot be reverse-dispatched from lower no

Re: [HACKERS] asynchronous and vectorized execution

2016-09-13 Thread Robert Haas
On Tue, Aug 2, 2016 at 3:41 AM, Kyotaro HORIGUCHI wrote: > Thank you for the comment. > > At Mon, 1 Aug 2016 10:44:56 +0530, Amit Khandekar > wrote in >> On 21 July 2016 at 15:20, Kyotaro HORIGUCHI > > wrote: >> >> > >> > After some consideration, I found that ExecAsyncWaitForNode >> > cannot b

Re: [HACKERS] asynchronous and vectorized execution

2016-09-13 Thread Robert Haas
On Mon, Aug 29, 2016 at 4:08 AM, Kyotaro HORIGUCHI wrote: > [ new patches ] +/* + * We assume that few nodes are async-aware and async-unaware + * nodes cannot be reverse-dispatched from lower nodes that is + * async-aware. Firing of an async node

Re: [HACKERS] asynchronous and vectorized execution

2016-09-12 Thread Kyotaro HORIGUCHI
Hello, At Thu, 01 Sep 2016 16:12:31 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20160901.161231.110068639.horiguchi.kyot...@lab.ntt.co.jp> > There's performance degradation for non-asynchronous nodes, as > shown as 't0' below. > > The patch adds two "if-then" and one additional fun

Re: [HACKERS] asynchronous and vectorized execution

2016-09-01 Thread Kyotaro HORIGUCHI
This is random thoughts on this patch. At Tue, 30 Aug 2016 12:17:52 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20160830.121752.100817694.horiguchi.kyot...@lab.ntt.co.jp> > > As the result, the attached patchset is functionally the same > > with the last version but replace misused

Re: [HACKERS] asynchronous and vectorized execution

2016-08-29 Thread Kyotaro HORIGUCHI
No, it was wrong. At Mon, 29 Aug 2016 17:08:36 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20160829.170836.161449399.horiguchi.kyot...@lab.ntt.co.jp> > Hello, > > I considered applying the async infrastructure onto nodeGather, > but since parallel workers hardly make Gather (or th

Re: [HACKERS] asynchronous and vectorized execution

2016-08-02 Thread Kyotaro HORIGUCHI
Thank you for the comment. At Mon, 1 Aug 2016 10:44:56 +0530, Amit Khandekar wrote in > On 21 July 2016 at 15:20, Kyotaro HORIGUCHI > wrote: > > > > > After some consideration, I found that ExecAsyncWaitForNode > > cannot be reentrant because it means that the control goes into > > async-unaw

Re: [HACKERS] asynchronous and vectorized execution

2016-07-31 Thread Amit Khandekar
On 21 July 2016 at 15:20, Kyotaro HORIGUCHI wrote: > > After some consideration, I found that ExecAsyncWaitForNode > cannot be reentrant because it means that the control goes into > async-unaware nodes while having not-ready nodes, that is > inconsistent state. To inhibit such reentering, I allo

Re: [HACKERS] asynchronous and vectorized execution

2016-07-22 Thread Kyotaro HORIGUCHI
The previous patch set doesn't accept --enable-cassert. The attached additional one fixes it. It theoretically won't give degradation but I'll measure the performance change. At Thu, 21 Jul 2016 18:50:07 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20160721.185007.268388411.horiguch

Re: [HACKERS] asynchronous and vectorized execution

2016-07-11 Thread Kyotaro HORIGUCHI
I forgot to mention. At Tue, 12 Jul 2016 11:04:17 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20160712.110417.145469826.horiguchi.kyot...@lab.ntt.co.jp> > Cooled down then measured performance again. > > I show you the true result briefly for now. > > At Mon, 11 Jul 2016 19:07:22

Re: [HACKERS] asynchronous and vectorized execution

2016-07-11 Thread Kyotaro HORIGUCHI
Cooled down then measured performance again. I show you the true result briefly for now. At Mon, 11 Jul 2016 19:07:22 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20160711.190722.145849861.horiguchi.kyot...@lab.ntt.co.jp> > Anyway I need some time to cool down.. I recalled that I

Re: [HACKERS] asynchronous and vectorized execution

2016-07-11 Thread Kyotaro HORIGUCHI
At Mon, 11 Jul 2016 17:10:11 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20160711.171011.133133724.horiguchi.kyot...@lab.ntt.co.jp> > > Two things: > > > > 1. That's not the scenario I'm talking about. I'm concerned about > > making sure that query plans that don't use asynchronou

Re: [HACKERS] asynchronous and vectorized execution

2016-07-11 Thread Kyotaro HORIGUCHI
Hello, At Thu, 7 Jul 2016 13:59:54 -0400, Robert Haas wrote in > On Wed, Jul 6, 2016 at 3:29 AM, Kyotaro HORIGUCHI > wrote: > > This seems to be a good opportunity to show this patch. The > > attached patch set does async execution of foreignscan > > (postgres_fdw) on the Robert's first infrast

Re: [HACKERS] asynchronous and vectorized execution

2016-07-07 Thread Robert Haas
On Wed, Jul 6, 2016 at 3:29 AM, Kyotaro HORIGUCHI wrote: > This seems to be a good opportunity to show this patch. The > attached patch set does async execution of foreignscan > (postgres_fdw) on the Robert's first infrastructure, with some > modification. Cool. > ExecAsyncWaitForNode can get int

Re: [HACKERS] asynchronous and vectorized execution

2016-07-05 Thread Robert Haas
On Wed, Jun 29, 2016 at 11:00 AM, Amit Khandekar wrote: > We may also want to consider handling abstract events such as > "tuples-are-available-at-plan-node-X". > > One benefit is : we can combine this with batch processing. For e.g. in case > of an Append node containing foreign scans, its parent

Re: [HACKERS] asynchronous and vectorized execution

2016-06-29 Thread Amit Khandekar
We may also want to consider handling abstract events such as "tuples-are-available-at-plan-node-X". One benefit is : we can combine this with batch processing. For e.g. in case of an Append node containing foreign scans, its parent node may not want to process the Append node result until Append

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Robert Haas
On Wed, May 11, 2016 at 12:30 PM, Andres Freund wrote: > On 2016-05-11 12:27:55 -0400, Robert Haas wrote: >> On Wed, May 11, 2016 at 11:49 AM, Andres Freund wrote: >> > On 2016-05-11 10:12:26 -0400, Robert Haas wrote: >> >> > Hm. Do we really have to keep the page locked in the page-at-a-time >>

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Andres Freund
On 2016-05-11 12:27:55 -0400, Robert Haas wrote: > On Wed, May 11, 2016 at 11:49 AM, Andres Freund wrote: > > On 2016-05-11 10:12:26 -0400, Robert Haas wrote: > >> > Hm. Do we really have to keep the page locked in the page-at-a-time > >> > mode? Shouldn't the pin suffice? > >> > >> I think we nee

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Robert Haas
On Wed, May 11, 2016 at 11:49 AM, Andres Freund wrote: > On 2016-05-11 10:12:26 -0400, Robert Haas wrote: >> > I've to admit I'm not that convinced about the speedups in the !fdw >> > case. There seems to be a lot easier avenues for performance >> > improvements. >> >> What I'm talking about is a

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Andres Freund
On 2016-05-11 10:32:20 -0400, Robert Haas wrote: > On Tue, May 10, 2016 at 8:50 PM, Andres Freund wrote: > > That seems to suggest that we need to restructure how we get to calling > > fmgr functions, before worrying about the actual fmgr call. > > Any ideas on how to do that? ExecMakeFunctionRe

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Andres Freund
On 2016-05-11 10:12:26 -0400, Robert Haas wrote: > > I've to admit I'm not that convinced about the speedups in the !fdw > > case. There seems to be a lot easier avenues for performance > > improvements. > > What I'm talking about is a query like this: > > SELECT * FROM inheritance_tree_of_foreig

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Robert Haas
On Tue, May 10, 2016 at 8:50 PM, Andres Freund wrote: > That seems to suggest that we need to restructure how we get to calling > fmgr functions, before worrying about the actual fmgr call. Any ideas on how to do that? ExecMakeFunctionResultNoSets() isn't really doing a heck of a lot. Changing

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Konstantin Knizhnik
On 10.05.2016 20:26, Robert Haas wrote: At this moment (February) they have implemented translation of only a few PostgreSQL operators used by ExecQuals and do not support aggregates. They get about a 2x speedup on synthetic queries and a 25% increase on TPC-H Q1 (for Q1 most critical

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Robert Haas
On Wed, May 11, 2016 at 10:17 AM, Konstantin Knizhnik wrote: > Yes, I agree with you that complete rewriting of optimizer is huge project > with unpredictable influence on performance of some queries. > Changing things incrementally is good approach, but only if we are moving in > right direction.

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Robert Haas
On Tue, May 10, 2016 at 8:23 PM, Andres Freund wrote: >> c. Modify some nodes (perhaps start with nodeAgg.c) to allow them to >> process a batch TupleTableSlot. This will require some tight loop to >> aggregate the entire TupleTableSlot at once before returning. >> d. Add function in execAmi.c whi

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Konstantin Knizhnik
On 11.05.2016 17:00, Robert Haas wrote: On Tue, May 10, 2016 at 3:42 PM, Konstantin Knizhnik wrote: Doesn't this actually mean that we need to have normal job scheduler which is given queue of jobs and having some pool of threads will be able to organize efficient execution of queries? Optimi

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Robert Haas
On Tue, May 10, 2016 at 7:57 PM, Andres Freund wrote: >> 1. asynchronous execution, by which I mean the ability of a node to >> somehow say that it will generate a tuple eventually, but is not yet >> ready, so that the executor can go run some other part of the plan >> tree while it waits. [...].
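As a rough illustration of the quoted definition of asynchronous execution (a node reports "not ready yet" and the caller moves on to other work), here is a minimal sketch in C. It is not taken from any patch in this thread: ResultKind, ChildNode, child_poll and append_drain are hypothetical names, latency is simulated with a tick counter, and the loop simply re-polls where a real executor would presumably sleep on wait events.

```c
#include <stddef.h>

/* Hypothetical result codes: a child can hand back a tuple, report that
 * it has nothing yet, or report that it is exhausted. */
typedef enum { RESULT_TUPLE, RESULT_NOT_READY, RESULT_DONE } ResultKind;

/* Hypothetical child node with simulated latency between tuples. */
typedef struct {
    int ticks_until_ready; /* polls remaining before the next tuple */
    int tuples_left;       /* tuples this child will still produce */
    int latency;           /* polls to wait after each produced tuple */
} ChildNode;

/* Non-blocking poll: never waits, only reports the current state. */
ResultKind child_poll(ChildNode *c)
{
    if (c->tuples_left == 0)
        return RESULT_DONE;
    if (c->ticks_until_ready > 0) {
        c->ticks_until_ready--;
        return RESULT_NOT_READY;
    }
    c->tuples_left--;
    c->ticks_until_ready = c->latency;
    return RESULT_TUPLE;
}

/* Append-like parent: visits every child in turn and skips the ones
 * that are not ready instead of blocking on them. Returns the total
 * number of tuples collected from all children. */
int append_drain(ChildNode *children, size_t n)
{
    int total = 0;
    size_t done = 0;

    while (done < n) {
        done = 0; /* recount finished children on every pass */
        for (size_t i = 0; i < n; i++) {
            switch (child_poll(&children[i])) {
                case RESULT_TUPLE:     total++; break;
                case RESULT_NOT_READY: break; /* try the next child */
                case RESULT_DONE:      done++;  break;
            }
        }
    }
    return total;
}
```

With two children, one fast and one slow, append_drain keeps collecting from the fast child while the slow one is still reporting RESULT_NOT_READY, which is the overlap the definition is after.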

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Robert Haas
On Tue, May 10, 2016 at 3:42 PM, Konstantin Knizhnik wrote: > Doesn't this actually mean that we need to have normal job scheduler which > is given queue of jobs and having some pool of threads will be able to > organize efficient execution of queries? Optimizer can build pipeline > (graph) of tas

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Robert Haas
On Tue, May 10, 2016 at 4:57 PM, Jim Nasby wrote: > Even so, I would think that the simplification in the executor would be > worth it. If you need to add a new node there's dozens of places where you > might have to mess with these giant case statements. Dozens? I think the number is in the sing

Re: [HACKERS] asynchronous and vectorized execution

2016-05-11 Thread Ants Aasma
On Wed, May 11, 2016 at 3:52 AM, Andres Freund wrote: > On 2016-05-11 03:20:12 +0300, Ants Aasma wrote: >> On Tue, May 10, 2016 at 7:56 PM, Robert Haas wrote: >> > On Mon, May 9, 2016 at 8:34 PM, David Rowley >> > wrote: >> > I don't have any at the moment, but I'm not keen on hundreds of new >>

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Andres Freund
On 2016-05-11 03:20:12 +0300, Ants Aasma wrote: > On Tue, May 10, 2016 at 7:56 PM, Robert Haas wrote: > > On Mon, May 9, 2016 at 8:34 PM, David Rowley > > wrote: > > I don't have any at the moment, but I'm not keen on hundreds of new > > vector functions that can all have bugs or behavior differe

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Andres Freund
On 2016-05-10 12:56:17 -0400, Robert Haas wrote: > I suspect the number of queries that are being hurt by fmgr overhead > is really large, and I think it would be nice to attack that problem > more directly. It's a bit hard to discuss what's worthwhile in the > abstract, without performance number

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Andres Freund
On 2016-05-10 12:34:19 +1200, David Rowley wrote: > a. Modify ScanAPI to allow batch tuple fetching in predefined batch sizes. > b. Modify TupleTableSlot to allow > 1 tuple to be stored. Add flag to > indicate if the struct contains a single or a multiple tuples. > Multiple tuples may need to be de

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Ants Aasma
On Tue, May 10, 2016 at 7:56 PM, Robert Haas wrote: > On Mon, May 9, 2016 at 8:34 PM, David Rowley > wrote: > I don't have any at the moment, but I'm not keen on hundreds of new > vector functions that can all have bugs or behavior differences versus > the unvectorized versions of the same code.

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Andres Freund
Hi, On 2016-05-09 13:33:55 -0400, Robert Haas wrote: > I think there are several different areas > where we should consider major upgrades to our executor. It's too > slow and it doesn't do everything we want it to do. The main things > on my mind are: 3) We use a lot of very cache-inefficient

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Bert
hmm, the morsels paper looks really interesting at first sight. Let's see if we can get a poc working in PostgreSQL? :-) On Tue, May 10, 2016 at 9:42 PM, Konstantin Knizhnik < k.knizh...@postgrespro.ru> wrote: > On 05/10/2016 08:26 PM, Robert Haas wrote: > >> On Tue, May 10, 2016 at 3:00 AM, kons

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Jim Nasby
On 5/10/16 12:47 AM, Kouhei Kaigai wrote: > On 10 May 2016 at 13:38, Kouhei Kaigai wrote: > > My concern about ExecProcNode is, it is constructed with a large switch > > ... case statement. It involves tons of comparison operation at run-time. > > If we replace this switch ... case by function

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Konstantin Knizhnik
On 05/10/2016 08:26 PM, Robert Haas wrote: On Tue, May 10, 2016 at 3:00 AM, konstantin knizhnik wrote: What's wrong with it that worker is blocked? You can just have more workers (more than CPU cores) to let other of them continue to do useful work. Not really. The workers are all running the

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Robert Haas
On Tue, May 10, 2016 at 3:00 AM, konstantin knizhnik wrote: > What's wrong with it that worker is blocked? You can just have more workers > (more than CPU cores) to let other of them continue to do useful work. Not really. The workers are all running the same plan, so they'll all make the same d

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Robert Haas
On Mon, May 9, 2016 at 9:38 PM, Kouhei Kaigai wrote: > Is the parallel aware Append node sufficient to run multiple nodes > asynchronously? (Sorry, I couldn't have enough time to code the feature > even though we had discussion before.) It's tempting to think that parallel query and asynchronous

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Robert Haas
On Mon, May 9, 2016 at 8:34 PM, David Rowley wrote: > It's interesting that you mention this. We identified this as a pain > point during our work on column stores last year. Simply passing > single tuples around the executor is really unfriendly towards L1 > instruction cache, plus also the point

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Rajeev rastogi
On 09 May 2016 23:04, Robert Haas Wrote: >2. vectorized execution, by which I mean the ability of a node to return >tuples in batches rather than one by one. Andres has opined more than >once that repeated trips through ExecProcNode defeat the ability of the >CPU to do branch prediction correctly

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Kyotaro HORIGUCHI
Hello. At Mon, 9 May 2016 13:33:55 -0400, Robert Haas wrote in > Hi, > > I realize that we haven't gotten 9.6beta1 out the door yet, but I > think we can't really wait much longer to start having at least some > discussion of 9.7 topics, so I'm going to go ahead and put this one > out there.

Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread konstantin knizhnik
Hi, > 1. asynchronous execution, It seems to me that asynchronous execution can be considered as alternative to multithreading model (in case of PostgreSQL the roles of threads are played by workers). Async. operations are used to have smaller overhead but have scalability problems (because i

Re: [HACKERS] asynchronous and vectorized execution

2016-05-09 Thread Pavel Stehule
2016-05-10 8:05 GMT+02:00 David Rowley : > On 10 May 2016 at 16:34, Greg Stark wrote: > > > > On 9 May 2016 8:34 pm, "David Rowley" > wrote: > >> > >> This project does appear to require that we bloat the code with 100's > >> of vector versions of each function. I'm not quite sure if there's a >

Re: [HACKERS] asynchronous and vectorized execution

2016-05-09 Thread David Rowley
On 10 May 2016 at 16:34, Greg Stark wrote: > > On 9 May 2016 8:34 pm, "David Rowley" wrote: >> >> This project does appear to require that we bloat the code with 100's >> of vector versions of each function. I'm not quite sure if there's a >> better way to handle this. The problem is that the fmg

Re: [HACKERS] asynchronous and vectorized execution

2016-05-09 Thread Kouhei Kaigai
> -Original Message- > From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of David Rowley > Sent: Tuesday, May 10, 2016 2:01 PM > To: Kaigai Kouhei(海外 浩平) > Cc: Robert Haas; pgsql-hackers@postgresql.org > Subject: Re: [HACK

Re: [HACKERS] asynchronous and vectorized execution

2016-05-09 Thread David Rowley
On 10 May 2016 at 13:38, Kouhei Kaigai wrote: > My concern about ExecProcNode is, it is constructed with a large switch > ... case statement. It involves tons of comparison operation at run-time. > If we replace this switch ... case by function pointer, probably, it make > performance improvement.
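The two dispatch styles being compared in the quoted text can be sketched roughly as follows. This is a toy model, not PostgreSQL code: the three-entry NodeTag enum, the PlanState layout, and the exec_* functions are hypothetical stand-ins, and the sketch makes no claim about which style is faster in practice.

```c
#include <assert.h>

/* Hypothetical node tags; the real NodeTag enum is far larger. */
typedef enum { T_SEQSCAN, T_SORT, T_LIMIT, T_NODE_COUNT } NodeTag;

typedef struct PlanState PlanState;
typedef int (*ExecProcFn)(PlanState *node);

struct PlanState {
    NodeTag    tag;
    ExecProcFn exec; /* resolved once, when the node is initialized */
};

/* Stand-in per-node executors; each returns a distinct marker value. */
static int exec_seqscan(PlanState *node) { (void) node; return 1; }
static int exec_sort(PlanState *node)    { (void) node; return 2; }
static int exec_limit(PlanState *node)   { (void) node; return 3; }

/* Style 1: branch on the node tag on every call. */
int exec_proc_node_switch(PlanState *node)
{
    switch (node->tag) {
        case T_SEQSCAN: return exec_seqscan(node);
        case T_SORT:    return exec_sort(node);
        case T_LIMIT:   return exec_limit(node);
        default:        return -1;
    }
}

/* Style 2, setup: look the function up once and store the pointer. */
void init_node(PlanState *node, NodeTag tag)
{
    static const ExecProcFn table[T_NODE_COUNT] = {
        exec_seqscan, exec_sort, exec_limit
    };

    assert((int) tag >= 0 && (int) tag < T_NODE_COUNT);
    node->tag = tag;
    node->exec = table[tag];
}

/* Style 2, call: a single indirect call, no tag comparison. */
int exec_proc_node_pointer(PlanState *node)
{
    return node->exec(node);
}
```

Both entry points reach the same per-node function; the only difference is whether the choice is made on every call or once at initialization, which is the trade-off the message raises.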

Re: [HACKERS] asynchronous and vectorized execution

2016-05-09 Thread Greg Stark
On 9 May 2016 8:34 pm, "David Rowley" wrote: > > This project does appear to require that we bloat the code with 100's > of vector versions of each function. I'm not quite sure if there's a > better way to handle this. The problem is that the fmgr is pretty much > a barrier to SIMD operations, and

Re: [HACKERS] asynchronous and vectorized execution

2016-05-09 Thread Kouhei Kaigai
> -Original Message- > From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas > Sent: Tuesday, May 10, 2016 2:34 AM > To: pgsql-hackers@postgresql.org > Subject: [HACKERS] asynchronous and vectorized execution >

Re: [HACKERS] asynchronous and vectorized execution

2016-05-09 Thread David Rowley
On 10 May 2016 at 05:33, Robert Haas wrote: > 2. vectorized execution, by which I mean the ability of a node to > return tuples in batches rather than one by one. Andres has opined > more than once that repeated trips through ExecProcNode defeat the > ability of the CPU to do branch prediction co

Re: [HACKERS] asynchronous and vectorized execution

2016-05-09 Thread Simon Riggs
On 9 May 2016 at 19:33, Robert Haas wrote: > I believe there are other people thinking about these > topics as well, including Andres Freund, Kyotaro Horiguchi, and > probably some folks at 2ndQuadrant (but I don't know exactly who). > 1. asynchronous execution > Not looking at that. > 2.