Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

Thomas Munro Thu, 04 May 2017 19:58:05 -0700

On Fri, May 5, 2017 at 2:23 PM, David Rowley
<[email protected]> wrote:
> On 5 May 2017 at 13:37, Andres Freund <[email protected]> wrote:
>> On 2017-05-02 15:13:58 -0400, Robert Haas wrote:
>>> Multiple people (including David Rowley
>>> as well as folks here at EnterpriseDB) have demonstrated that for
>>> certain queries, we can actually use a lot more workers and everything
>>> works great.  The problem is that for other queries, using a lot of
>>> workers works terribly.  The planner doesn't know how to figure out
>>> which it'll be - and honestly, I don't either.
>>
>> Have those benchmarks, even in a very informal form, been shared /
>> collected / referenced centrally?  I'd be very interested to know where
>> the different contention points are. Possibilities:
>
> I posted mine on [1], although the post does not go into much detail
> about the contention points. I only really briefly mention it at the
> end.


Just for fun, check out pages 42 and 43 of Wei Hong's thesis.  He
worked on Berkeley POSTGRES parallel query and a spin-off called XPRS,
and they got linear seq scan scaling up to the number of spindles:

http://db.cs.berkeley.edu/papers/ERL-M93-28.pdf

I gather from flicking through the POSTGRES 4.2 sources and this
stuff about XPRS that they switched from a "launch N workers!" model
to a "generate tasks and schedule them" model somewhere between these
systems.  Chapters 2 and 3 cover the problem of avoiding excessive
parallelism that reduces performance, and of adjusting dynamically to
reach maximum throughput.  I suspect we're going that way too at some point, and it
would certainly fix some problems I ran into with Parallel Shared
Hash.
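
To make the "generate tasks and schedule them" idea concrete, here is a
minimal sketch.  It is not XPRS's scheduler and not anything in
PostgreSQL; run_tasks, make_scan_tasks and the block-range chunking are
hypothetical names invented for illustration.  The point is only that
the unit of work is decided independently of how many workers happen to
be running.

```python
import queue
import threading


def make_scan_tasks(total_blocks, chunk):
    """Split a scan of total_blocks into tasks that each cover one block range.

    Hypothetical helper: each task just returns the (start, end) range it
    would have scanned.
    """
    return [
        (lambda start=s: (start, min(start + chunk, total_blocks)))
        for s in range(0, total_blocks, chunk)
    ]


def run_tasks(tasks, n_workers):
    """Run every task once, with n_workers threads pulling from a shared queue."""
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to do, so this worker exits
            value = task()
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run_tasks(make_scan_tasks(1000, 100), 3) and
run_tasks(make_scan_tasks(1000, 100), 1) covers the same ten block
ranges; only the degree of parallelism differs, which is the knob a
scheduler could turn up or down without regenerating the work.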

XPRS's cost model included resource consumption, not just 'timerons'.
This is something I grappled with when trying to put a price tag on
Parallel Shared Hash plans where just one worker builds the hash table
while the others wait.  I removed that plan from the patch because it
became mostly redundant, but when it was there Postgres thought it was
the same cost as a plan where every worker hammers your system
building the same hash table, whereas XPRS would have considered such
a plan ludicrously expensive (depending on his 'w' term, see page 28,
which determines whether you care more about resource usage or
response time).
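
As a rough illustration of the pricing problem, here is a sketch of a
cost function that blends response time with resource usage through a
weight w.  The formula is an assumption made up for this example, not
the one from the thesis, and combined_cost and hash_build_plan_costs are
hypothetical names.  It also assumes that waiting workers consume
nothing, which is a simplification.

```python
def combined_cost(response_time, resource_usage, w):
    """Blend elapsed time with total resource usage.

    Assumed form, not the thesis's formula: w = 0 means only response
    time matters; larger w penalizes resource consumption more heavily.
    """
    return response_time + w * resource_usage


def hash_build_plan_costs(build_time, n_workers, w):
    """Return (one_builder_cost, everyone_builds_cost) for a hash build.

    Both plans take build_time to finish.  They differ only in how much
    total work is done across all workers.
    """
    # One worker builds the table while the others wait.
    one_builder = combined_cost(build_time, build_time, w)
    # Every worker builds its own copy of the same table.
    everyone_builds = combined_cost(build_time, n_workers * build_time, w)
    return one_builder, everyone_builds
```

With w = 0 the two plans cost the same, which matches the behaviour
described above where only elapsed time is priced.  With
build_time = 10, n_workers = 4 and w = 1, the one-builder plan comes out
at 20 and the everyone-builds plan at 50.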

-- 
Thomas Munro
http://www.enterprisedb.com


