Re: [HACKERS] Parallel Seq Scan

Kevin Grittner Tue, 07 Apr 2015 13:25:13 -0700

David Rowley <dgrowle...@gmail.com> wrote:

> If we attempt to do this parallel stuff at plan time, and we
> happen to plan at some quiet period, or perhaps worse, some
> application's start-up process happens to PREPARE a load of
> queries when the database is nice and quiet, then quite possibly
> we'll end up with some highly parallel queries. Then perhaps come
> the time these queries are actually executed the server is very
> busy... Things will fall apart quite quickly due to the masses of
> IPC and context switches that would be going on.
>
> I completely understand that this parallel query stuff is all
> quite new to us all and we're likely still trying to nail down
> the correct infrastructure for it to work well, so this is why
> I'm proposing that the planner should know nothing of parallel
> query, instead I think it should work more along the lines of:
>
> * Planner should be completely oblivious to what parallel query
>   is.
> * Before executor startup the plan is passed to a function which
>   decides if we should parallelise it, and does so if the plan
>   meets the correct requirements. This should likely have a very
>   fast exit path such as:
>     if root node's cost < parallel_query_cost_threshold
>       return; /* the query is not expensive enough to attempt
>                  to make parallel */
>
> The above check will allow us to have an almost zero overhead for
> small low cost queries.
>
> This function would likely also have some sort of logic in order
> to determine if the server has enough spare resource at the
> current point in time to allow queries to be parallelised


There is a lot to like about this suggestion.

I've seen enough performance crashes due to too many concurrent
processes (even when each connection can only use a single process)
to doubt that, for a plan which will be saved, it is possible to
know at planning time whether parallelization will be a nice win or
a devastating over-saturation of resources during some later
execution phase.
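
To make the shape of such an execution-time check concrete, here is
a rough sketch in C. Only parallel_query_cost_threshold comes from
the proposal quoted above; the structs, their fields, and
choose_worker_count() are hypothetical names invented for
illustration and do not exist in PostgreSQL.

```c
#include <stdbool.h>

/* Hypothetical summary of a finished plan; not a PostgreSQL struct. */
typedef struct PlanSummary
{
    double total_cost;          /* planner's cost estimate for the root node */
    int    max_useful_workers;  /* most workers the plan could make use of */
} PlanSummary;

/* Hypothetical snapshot of server load taken just before execution. */
typedef struct ServerLoad
{
    int active_backends;    /* processes currently running queries */
    int max_busy_backends;  /* at or above this, treat the server as busy */
    int free_worker_slots;  /* worker processes that could be started now */
} ServerLoad;

/*
 * Decide, at executor start-up, how many parallel workers to use.
 * Returns 0 when the plan should run serially.
 */
int
choose_worker_count(const PlanSummary *plan, const ServerLoad *load,
                    double parallel_query_cost_threshold)
{
    int workers;

    /* Fast exit path: cheap queries never pay for the load check. */
    if (plan->total_cost < parallel_query_cost_threshold)
        return 0;

    /* The server is already busy, so adding processes would hurt. */
    if (load->active_backends >= load->max_busy_backends)
        return 0;

    /* Grant no more workers than are both useful and available. */
    workers = plan->max_useful_workers;
    if (workers > load->free_worker_slots)
        workers = load->free_worker_slots;

    return workers > 0 ? workers : 0;
}
```

The point of the sketch is only that the load check runs when the
query is executed, so a plan prepared during a quiet period can
still run serially on a busy server.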

Another thing to consider is that this is not entirely unrelated to
the concept of admission control policies.  Perhaps this phase
could be a more general execution start-up admission control phase,
where parallel processing would be one adjustment that could be
considered.  Initially it might be the *only* consideration, but it
might be good to try to frame it in a way that allowed
implementation of other policies, too.
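
One way to frame that, purely as a sketch: the start-up phase walks
a list of policy callbacks, each of which may adjust the decision.
Every name below, and both numeric limits, are assumptions made up
for this example; nothing here is an existing PostgreSQL API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical facts known at execution start-up. */
typedef struct StartupContext
{
    double estimated_cost;  /* planner's cost estimate for the query */
    int    active_queries;  /* queries currently executing */
    int    idle_workers;    /* worker processes available right now */
} StartupContext;

/* Hypothetical outcome of the admission control phase. */
typedef struct StartupDecision
{
    int  workers;    /* parallel workers granted; 0 means run serially */
    bool must_wait;  /* true if the query should be queued, not started */
} StartupDecision;

/* Each policy inspects the context and may adjust the decision. */
typedef void (*AdmissionPolicy)(const StartupContext *ctx,
                                StartupDecision *decision);

/* Parallelism policy: the cost cutoff of 1000.0 is an assumed value. */
void
parallel_policy(const StartupContext *ctx, StartupDecision *decision)
{
    decision->workers =
        (ctx->estimated_cost >= 1000.0) ? ctx->idle_workers : 0;
}

/* Concurrency policy: the limit of 32 queries is an assumed value. */
void
concurrency_policy(const StartupContext *ctx, StartupDecision *decision)
{
    if (ctx->active_queries >= 32)
        decision->must_wait = true;
}

/* Apply the policies in order, starting from "serial, run now". */
StartupDecision
run_admission_control(const StartupContext *ctx,
                      const AdmissionPolicy *policies, size_t npolicies)
{
    StartupDecision decision = {0, false};

    for (size_t i = 0; i < npolicies; i++)
        policies[i](ctx, &decision);

    return decision;
}
```

Starting with a one-element list containing only parallel_policy
would match the "only consideration" case, and other policies could
be appended later without changing the caller.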

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
