Re: [HACKERS] [DESIGN] ParallelAppend

Ashutosh Bapat Tue, 28 Jul 2015 01:23:21 -0700

On Tue, Jul 28, 2015 at 12:59 PM, David Rowley <david.row...@2ndquadrant.com> wrote:


>
> On 27 July 2015 at 21:09, Kyotaro HORIGUCHI <
> horiguchi.kyot...@lab.ntt.co.jp> wrote:
>
>> Hello, can I ask some questions?
>>
>> I suppose we can take this as the analog of ParallelSeqScan.  I
>> cannot see much distinction between Append(ParallelSeqScan) and
>> ParallelAppend(SeqScan). What difference is there between them?
>>
>> If other nodes will have the same functionality, as you mention at
>> the end of this proposal, it might be better to implement some part
>> of this feature in the existing executor itself rather than as a
>> dedicated additional node, just as my asynchronous fdw execution
>> patch partially does. (Although it lacks the planner part and
>> bgworker launching.) If that is the case, it might be better to
>> modify ExecProcNode so that it supports both the in-process and
>> inter-bgworker cases through a single API.
>>
>> What do you think about this?
>>
>
> I have to say that I really like the thought of us having parallel enabled
> stuff in Postgres, but I also have to say that I don't think inventing all
> these special parallel node types is a good idea. If we think about
> everything that we can parallelise...
>
> Perhaps.... sort, hash join, seqscan, hash, bitmap heap scan, nested loop.
> I don't want to debate that, but perhaps there's more, perhaps less.
> Are we really going to duplicate all of the code and add in the parallel
> stuff as new node types?
>
> My other concern here is that I seldom hear people talk about the
> planner's architectural lack of ability to make a good choice about how
> many parallel workers to choose. Surely to properly calculate costs you
> need to know the exact number of parallel workers that will be available at
> execution time, but you need to know this at planning time!? I can't see
> how this works, apart from just being very conservative about parallel
> workers, which I think is really bad: many databases have busy times in
> the day and also quiet times, and the quiet times are generally when large
> batch jobs get done, which is when parallelism is likely most
> useful. Remember that queries are not always planned just before they're
> executed. We could have a PREPAREd query, or we could have better plan
> caching in the future, or if we build some intelligence into the planner to
> choose a good number of workers based on the current server load, then
> what's to say that the server will be under this load at exec time? If we
> plan during a quiet time, and exec in a busy time all hell may break loose.
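[Editorial illustration, not part of the original message.] The planning-time versus execution-time concern above can be made concrete with a toy cost model. This is only a sketch under stated assumptions: the formula, the function names (sketch_parallel_scan_cost, sketch_choose_plan) and the enum are hypothetical and are not PostgreSQL's planner code or its real cost equations.

```c
#include <assert.h>

/* Hypothetical plan choices; not PostgreSQL's real path types. */
typedef enum SketchPlanChoice {
    SK_PLAN_PARALLEL_SCAN,
    SK_PLAN_INDEX_SCAN
} SketchPlanChoice;

/*
 * Toy cost model (an assumption, not the planner's real formula):
 * the scan work is split evenly across the leader plus the workers,
 * and each worker adds a fixed setup cost.
 */
double
sketch_parallel_scan_cost(double serial_cost, double per_worker_setup, int workers)
{
    assert(workers >= 0);
    return per_worker_setup * workers + serial_cost / (workers + 1);
}

/* Pick the cheaper plan for the number of workers assumed at planning time. */
SketchPlanChoice
sketch_choose_plan(double serial_cost, double per_worker_setup,
                   double index_scan_cost, int assumed_workers)
{
    double parallel_cost =
        sketch_parallel_scan_cost(serial_cost, per_worker_setup, assumed_workers);

    return parallel_cost < index_scan_cost ? SK_PLAN_PARALLEL_SCAN
                                           : SK_PLAN_INDEX_SCAN;
}
```

With made-up inputs (serial cost 1000, per-worker setup 10, competing index scan cost 400), assuming 4 workers gives the parallel scan a cost of 240, so it is chosen. If that cached plan later runs with 0 workers, the same formula gives 1000, which is worse than the index scan the planner rejected.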
>
> I really do think that existing nodes should just be initialized in a
> parallel mode, and each node type can have a function to state if it
> supports parallelism or not.
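[Editorial illustration, not part of the original message.] One possible shape for "each node type can have a function to state if it supports parallelism" is sketched below. Everything here is hypothetical: the node tags, the struct, and the yes/no answers per node type are placeholders chosen for illustration, not PostgreSQL's NodeTag values or any existing executor API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node tags; these are NOT PostgreSQL's real NodeTag values. */
typedef enum SketchNodeTag {
    SK_SEQSCAN,
    SK_HASHJOIN,
    SK_SORT,
    SK_MATERIAL
} SketchNodeTag;

/* Minimal stand-in for a plan tree node with up to two children. */
typedef struct SketchPlanNode {
    SketchNodeTag tag;
    struct SketchPlanNode *left;
    struct SketchPlanNode *right;
} SketchPlanNode;

/* Per-node-type capability check; the answers are arbitrary placeholders. */
bool
sketch_node_supports_parallel(SketchNodeTag tag)
{
    switch (tag) {
        case SK_SEQSCAN:
        case SK_HASHJOIN:
        case SK_SORT:
            return true;
        case SK_MATERIAL:
            return false;
    }
    return false;
}

/* A subtree may be initialized in parallel mode only if every node in it agrees. */
bool
sketch_subtree_supports_parallel(const SketchPlanNode *node)
{
    if (node == NULL)
        return true;
    return sketch_node_supports_parallel(node->tag)
        && sketch_subtree_supports_parallel(node->left)
        && sketch_subtree_supports_parallel(node->right);
}
```

Under this sketch, the existing node types are reused and the decision to run a whole branch in parallel mode is a property of the subtree, rather than something encoded by inventing a separate parallel node type for each operation.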
>
> I'd really like to hear more opinions on the ideas I discussed here:
>
>
> http://www.postgresql.org/message-id/CAApHDvp2STf0=pqfpq+e7wa4qdympfm5qu_ytupe7r0jlnh...@mail.gmail.com
>
>
> This design makes use of the Funnel node that Amit has already made and
> allows more than 1 node to be executed in parallel at once.
>
> It appears that parallel enabling the executor node by node is
> fundamentally locked into just 1 node being executed in parallel, then
> perhaps a Funnel node gathering up the parallel worker buffers and
> streaming those back in serial mode. I believe that, by design, this does not
> permit a whole plan branch to execute in parallel, and I really feel
> like doing things this way is going to be very hard to undo and improve
> later. I might be too stupid to figure it out, but how would parallel hash
> join work if it can't gather tuples from the inner and outer nodes in
> parallel?
>
> Sorry for the rant, but I just feel like we're painting ourselves into a
> corner by parallel enabling the executor node by node.
> Apologies if I've completely misunderstood things.
>
>
+1, well articulated.
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
