subject:"Re\: \[HACKERS\] Parallel sec scan in plpgsql"

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-22 Thread Amit Kapila

On Thu, Sep 22, 2016 at 7:32 PM, Robert Haas  wrote:
> On Thu, Sep 22, 2016 at 8:36 AM, Amit Kapila  wrote:
>> I think for certain cases like into clause, the rows passed will be
>> equal to actual number of rows, otherwise it will generate error.  So
>> we can pass that information in executor layer.  Another kind of cases
>> which are worth considering are when from plpgsql we fetch limited
>> rows at-a-time, but we fetch till end like the case of
>> exec_stmt_return_query().
>
> Yes, I think that those cases can be considered.  Some careful code
> inspection will be needed to make sure the cases we want to enable are
> safe, and some testing will be needed to make sure they behave
> properly.  But it doesn't sound like an unsolvable problem.  I hope
> someone who isn't me will decide to work on it.  :-)
>

makes sense.  I think along with that we can also evaluate, if we can
enable parallel query from other pl languages. I think if we can
enable parallelism from all pl languages in 10.0, that will be a good
step forward.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-22 Thread Robert Haas

On Thu, Sep 22, 2016 at 8:36 AM, Amit Kapila  wrote:
> I think for certain cases like into clause, the rows passed will be
> equal to actual number of rows, otherwise it will generate error.  So
> we can pass that information in executor layer.  Another kind of cases
> which are worth considering are when from plpgsql we fetch limited
> rows at-a-time, but we fetch till end like the case of
> exec_stmt_return_query().

Yes, I think that those cases can be considered.  Some careful code
inspection will be needed to make sure the cases we want to enable are
safe, and some testing will be needed to make sure they behave
properly.  But it doesn't sound like an unsolvable problem.  I hope
someone who isn't me will decide to work on it.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-22 Thread Amit Kapila

On Tue, Sep 20, 2016 at 8:31 PM, Robert Haas  wrote:
> On Tue, Sep 20, 2016 at 9:24 AM, Amit Kapila  wrote:
>> I think here point is that for any case where there is count of rows
>> to be selected, we disable parallelism.  There are many genuine cases
>> like select count(*) into cnt ... which will run to completion, but as
>> plpgsql passes row count to be 1 or 2, it doesn't enter parallel mode.
>> There are couple other cases like that.  Do you see a reason for not
>> enabling parallelism for such cases?
>
> If we can somehow know that the rowcount which is passed is greater
> than or equal to the actual number of rows which will be generated,
> then it's fine to enable parallelism.
>

I think for certain cases like into clause, the rows passed will be
equal to actual number of rows, otherwise it will generate error.  So
we can pass that information in executor layer.  Another kind of cases
which are worth considering are when from plpgsql we fetch limited
rows at-a-time, but we fetch till end like the case of
exec_stmt_return_query().


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-20 Thread Robert Haas

On Tue, Sep 20, 2016 at 10:58 AM, Tom Lane  wrote:
> Maybe it would be better to fix the rule against workers
> invoking their own parallel queries.

That rule does have the advantage of preventing us from having one
user backend launch N^2 workers.  I don't think it would be that much
work to fix it, but the results might be pretty exciting.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-20 Thread Robert Haas

On Tue, Sep 20, 2016 at 9:24 AM, Amit Kapila  wrote:
> I think here point is that for any case where there is count of rows
> to be selected, we disable parallelism.  There are many genuine cases
> like select count(*) into cnt ... which will run to completion, but as
> plpgsql passes row count to be 1 or 2, it doesn't enter parallel mode.
> There are couple other cases like that.  Do you see a reason for not
> enabling parallelism for such cases?

If we can somehow know that the rowcount which is passed is greater
than or equal to the actual number of rows which will be generated,
then it's fine to enable parallelism.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-20 Thread Tom Lane

Amit Kapila  writes:
> On Mon, Sep 19, 2016 at 11:39 PM, Robert Haas  wrote:
>> However, I think the chances of supporting parallel query for queries
>> not executed to completion any time in the near future are very poor.

> I think here point is that for any case where there is count of rows
> to be selected, we disable parallelism.  There are many genuine cases
> like select count(*) into cnt ... which will run to completion, but as
> plpgsql passes row count to be 1 or 2, it doesn't enter parallel mode.
> There are couple other cases like that.  Do you see a reason for not
> enabling parallelism for such cases?

The other problem that would have to be confronted here is nesting,
ie it would only be OK for a plpgsql function to invoke a parallel
query if it wasn't itself being executed by a parallel worker ---
or maybe even if it's being executed by the leader process but there's
an active Gather somewhere else in the calling query's plan tree.
(Not sure about the implementation's properties for that case.)
We'd have to decide whether we want plancache to track both parallel
and nonparallel plans for plpgsql queries.  Do-able no doubt but
pretty ugly.  Maybe it would be better to fix the rule against workers
invoking their own parallel queries.

However that's probably moot unless the not-executing-to-completion
issue can be solved.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-20 Thread Amit Kapila

On Mon, Sep 19, 2016 at 11:39 PM, Robert Haas  wrote:
> On Sat, Sep 17, 2016 at 11:54 PM, Amit Kapila  wrote:
>> In general, I think we should support the cases as required (or
>> written) by you from plpgsql or sql functions.  We need more work to
>> support such cases. There are probably two ways of supporting such
>> cases, we can build some intelligence in plpgsql execution such that
>> it can recognise such queries and allow to use parallelism or we need
>> to think of enabling parallelism for cases where we don't run the plan
>> to completion.  Most of the use cases from plpgsql or sql function
>> fall into later category as they don't generally run the plan to
>> completion.
>
> I think there's certainly more work that could be done to teach
> PL/pgsql about cases where the query will run to completion.  I didn't
> work very hard to make sure we covered all of those; there are
> probably several different cases where parallelism could be safely
> enabled but currently isn't.  Also, I didn't do anything at all to
> update the other PLs, and that would be good, too.
>
> However, I think the chances of supporting parallel query for queries
> not executed to completion any time in the near future are very poor.
>

I think here point is that for any case where there is count of rows
to be selected, we disable parallelism.  There are many genuine cases
like select count(*) into cnt ... which will run to completion, but as
plpgsql passes row count to be 1 or 2, it doesn't enter parallel mode.
There are couple other cases like that.  Do you see a reason for not
enabling parallelism for such cases?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-20 Thread Alex Ignatov



On 18.09.2016 06:54, Amit Kapila wrote:

On Fri, Sep 16, 2016 at 8:48 PM, Alex Ignatov  wrote:


On 16.09.2016 16:50, Amit Kapila wrote:



Can you try by setting force_parallel_mode = off;?  I think it is
sending the whole function execution to worker due to
force_parallel_mode.




No changes:



Okay, it just skipped from my mind that we don't support parallel
queries for SQL statement execution (or statements executed via
exec_stmt_execsql) from plpgsql.  For detailed explanation of why that
is not feasible you can refer one of my earlier e-mails [1] on similar
topic.  I think if we can somehow get the results via Perform
statement, then it could be possible to use parallelism via plpgsql.

However, you can use it via SQL functions, an example is below:

set min_parallel_relation_size =0;
set parallel_tuple_cost=0;
set parallel_setup_cost=0;

Load 'auto_explain';
set auto_explain.log_min_duration = 0;
set auto_explain.log_analyze = true;
set auto_explain.log_nested_statements = true;

create table test_plpgsql(c1 int, c2 char(1000));
insert into test_plpgsql values(generate_series(1,10),'aaa');

create or replace function parallel_test_set_sql() returns
setof bigint as $$
select count(*) from test_plpgsql;
$$language sql PARALLEL SAFE STRICT STABLE;

Then execute function as: select * from parallel_test_set_sql();  You
can see below plan if auto_explain module is loaded.

Finalize Aggregate  (cost=14806.85..14806.86 rows=1 width=8) (actual tim
e=1094.966..1094.967 rows=1 loops=1)
  ->  Gather  (cost=14806.83..14806.84 rows=2 width=8) (actual time=472.
216..1094.943 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
->  Partial Aggregate  (cost=14806.83..14806.84 rows=1 width=8)
(actual time=177.867..177.868 rows=1 loops=3)
  ->  Parallel Seq Scan on test_plpgsql  (cost=0.00..14702.6
7 rows=41667 width=0) (actual time=0.384..142.565 rows=3 loops=3)
CONTEXT:  SQL function "parallel_test_set_sql" statement 1
LOG:  duration: 2965.040 ms  plan:
Query Text: select * from parallel_test_set_sql();
Function Scan on parallel_test_set_sql  (cost=0.25..10.25 rows=1000 widt
h=8) (actual time=2538.620..2776.955 rows=1 loops=1)


In general, I think we should support the cases as required (or
written) by you from plpgsql or sql functions.  We need more work to
support such cases. There are probably two ways of supporting such
cases, we can build some intelligence in plpgsql execution such that
it can recognise such queries and allow to use parallelism or we need
to think of enabling parallelism for cases where we don't run the plan
to completion.  Most of the use cases from plpgsql or sql function
fall into later category as they don't generally run the plan to
completion.


[1] - 
https://www.postgresql.org/message-id/CAA4eK1K8kaO_jRk42-o2rmhSRbKV-3mR%2BiNVcONLdbcSXW5TfQ%40mail.gmail.com



Thank you for you sugestion! That works.

But what  we can do with this function:
create or replace function parallel_test_sql(t int) returns setof bigint as
$$
   select count(*) from (select a,b,c,d,e,sum(bytes) from test where a>= $1 
group by a,b,c,d,e)t;
$$ language sql PARALLEL SAFE STRICT STABLE;

explain (analyze,buffers) select  * from  parallel_test_sql(2);

"Function Scan on parallel_test_sql  (cost=0.25..10.25 rows=1000 width=8) (actual 
time=2410.789..2410.790 rows=1 loops=1)"
"  Buffers: shared hit=63696"
"Planning time: 0.082 ms"
"Execution time: 2410.841 ms"

2016-09-20 14:09:04 MSK [13037]: [75-1] user=ipdr,db=ipdr,app=pgAdmin III - 
Query Tool,client=127.0.0.1 LOG:  duration: 2410.135 ms  plan:
Query Text:
   select count(*) from (select a,b,c,d,e,sum(bytes) from test where 
a>= $1 group by a,b,c,d,e)t;

Aggregate  (cost=230701.42..230701.43 rows=1 width=8)
  ->  HashAggregate  (cost=230363.59..230513.74 rows=15015 width=28)
Group Key: test.a, test.b, test.c, test.d, test.e
->  Seq Scan on test  (cost=0.00..188696.44 rows=372 
width=20)
  Filter: (a >= $1)


No parallelism again. Looks like that Filter: (a >= $1)  breaks parallelism



Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-19 Thread Robert Haas

On Sat, Sep 17, 2016 at 11:54 PM, Amit Kapila  wrote:
> In general, I think we should support the cases as required (or
> written) by you from plpgsql or sql functions.  We need more work to
> support such cases. There are probably two ways of supporting such
> cases, we can build some intelligence in plpgsql execution such that
> it can recognise such queries and allow to use parallelism or we need
> to think of enabling parallelism for cases where we don't run the plan
> to completion.  Most of the use cases from plpgsql or sql function
> fall into later category as they don't generally run the plan to
> completion.

I think there's certainly more work that could be done to teach
PL/pgsql about cases where the query will run to completion.  I didn't
work very hard to make sure we covered all of those; there are
probably several different cases where parallelism could be safely
enabled but currently isn't.  Also, I didn't do anything at all to
update the other PLs, and that would be good, too.

However, I think the chances of supporting parallel query for queries
not executed to completion any time in the near future are very poor.
Once the query is suspended, the user can do anything they like,
including stuff that's parallel-unsafe, and then we have no choice but
to error out, and that's not what we want to happen.  If we had
absolutely no parallel-unsafe operations - which would be quite a feat
- then we might be able to get there.  But even then you have the
problem that while the query is suspended, you are still nailing down
workers that are sitting around idle, waiting for the leader to resume
the query, and that's not very good either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-17 Thread Amit Kapila

On Fri, Sep 16, 2016 at 8:48 PM, Alex Ignatov  wrote:
>
> On 16.09.2016 16:50, Amit Kapila wrote:
>>
>>
>> Can you try by setting force_parallel_mode = off;?  I think it is
>> sending the whole function execution to worker due to
>> force_parallel_mode.
>>
>>
>
> No changes:
>

Okay, it just skipped from my mind that we don't support parallel
queries for SQL statement execution (or statements executed via
exec_stmt_execsql) from plpgsql.  For detailed explanation of why that
is not feasible you can refer one of my earlier e-mails [1] on similar
topic.  I think if we can somehow get the results via Perform
statement, then it could be possible to use parallelism via plpgsql.

However, you can use it via SQL functions, an example is below:

set min_parallel_relation_size =0;
set parallel_tuple_cost=0;
set parallel_setup_cost=0;

Load 'auto_explain';
set auto_explain.log_min_duration = 0;
set auto_explain.log_analyze = true;
set auto_explain.log_nested_statements = true;

create table test_plpgsql(c1 int, c2 char(1000));
insert into test_plpgsql values(generate_series(1,10),'aaa');

create or replace function parallel_test_set_sql() returns
setof bigint as $$
select count(*) from test_plpgsql;
$$language sql PARALLEL SAFE STRICT STABLE;

Then execute function as: select * from parallel_test_set_sql();  You
can see below plan if auto_explain module is loaded.

Finalize Aggregate  (cost=14806.85..14806.86 rows=1 width=8) (actual tim
e=1094.966..1094.967 rows=1 loops=1)
  ->  Gather  (cost=14806.83..14806.84 rows=2 width=8) (actual time=472.
216..1094.943 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
->  Partial Aggregate  (cost=14806.83..14806.84 rows=1 width=8)
(actual time=177.867..177.868 rows=1 loops=3)
  ->  Parallel Seq Scan on test_plpgsql  (cost=0.00..14702.6
7 rows=41667 width=0) (actual time=0.384..142.565 rows=3 loops=3)
CONTEXT:  SQL function "parallel_test_set_sql" statement 1
LOG:  duration: 2965.040 ms  plan:
Query Text: select * from parallel_test_set_sql();
Function Scan on parallel_test_set_sql  (cost=0.25..10.25 rows=1000 widt
h=8) (actual time=2538.620..2776.955 rows=1 loops=1)

In general, I think we should support the cases as required (or
written) by you from plpgsql or sql functions.  We need more work to
support such cases. There are probably two ways of supporting such
cases, we can build some intelligence in plpgsql execution such that
it can recognise such queries and allow to use parallelism or we need
to think of enabling parallelism for cases where we don't run the plan
to completion.  Most of the use cases from plpgsql or sql function
fall into later category as they don't generally run the plan to
completion.

[1] - 
https://www.postgresql.org/message-id/CAA4eK1K8kaO_jRk42-o2rmhSRbKV-3mR%2BiNVcONLdbcSXW5TfQ%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-16 Thread Alex Ignatov



On 16.09.2016 16:50, Amit Kapila wrote:

On Fri, Sep 16, 2016 at 6:57 PM, Alex Ignatov  wrote:

No it doesn't.
Paralleling neither sql function nor plpgsql:
Here is example :

ipdr=> show max_worker_processes ;
 max_worker_processes
--
 128
(1 row)
ipdr=> set max_parallel_workers_per_gather to 128;
SET
ipdr=> set force_parallel_mode=on;
SET
ipdr=> set min_parallel_relation_size =0;
SET
ipdr=> set parallel_tuple_cost=0;
SET



Can you try by setting force_parallel_mode = off;?  I think it is
sending the whole function execution to worker due to
force_parallel_mode.




No changes:

ipdr=> set max_parallel_workers_per_gather to 128;
SET
ipdr=> set min_parallel_relation_size =0;
SET
ipdr=> set parallel_tuple_cost=0;
SET
ipdr=> set force_parallel_mode = off;
SET
ipdr=> select name,setting from pg_settings where name 
in('max_parallel_workers_per_gather',
ipdr(>
'min_parallel_relation_size',
ipdr(>'parallel_tuple_cost',
ipdr(>
'force_parallel_mode');
  name   | setting
-+-
 force_parallel_mode | off
 max_parallel_workers_per_gather | 128
 min_parallel_relation_size  | 0
 parallel_tuple_cost | 0
(4 rows)

ipdr=> explain (analyze,buffers) select count(*) from (select 
a,b,c,d,e,sum(bytes) from test group by a,b,c,d,e)t;
   QUERY PLAN
-
 Aggregate  (cost=87702.33..87702.34 rows=1 width=8) (actual 
time=709.643..709.643 rows=1 loops=1)
   Buffers: shared hit=65015
   ->  Finalize HashAggregate  (cost=87364.49..87514.64 rows=15015 width=28) 
(actual time=706.382..708.456 rows=15015 loops=1)
 Group Key: test.a, test.b, test.c, test.d, test.e
 Buffers: shared hit=65015
 ->  Gather  (cost=85149.78..85299.93 rows=165165 width=20) (actual 
time=478.626..645.209 rows=180180 loops=1)
   Workers Planned: 11
   Workers Launched: 11
   Buffers: shared hit=65015
   ->  Partial HashAggregate  (cost=84149.78..84299.93 rows=15015 
width=20) (actual time=473.890..478.309 rows=15015 loops=12)
 Group Key: test.a, test.b, test.c, test.d, test.e
 Buffers: shared hit=63695
 ->  Parallel Seq Scan on test  (cost=0.00..72786.01 
rows=909101 width=20) (actual time=0.021..163.120 rows=83 loops=12)
   Buffers: shared hit=63695
 Planning time: 0.318 ms
 Execution time: 710.600 ms
(16 rows)


ipdr=> explain (analyze,buffers) select  parallel_test_plpgsql();
QUERY PLAN
--
 Result  (cost=0.00..0.26 rows=1 width=8) (actual time=4003.719..4003.720 
rows=1 loops=1)
   Buffers: shared hit=63869
 Planning time: 0.021 ms
 Execution time: 4003.769 ms
(4 rows)

auto_explain:
2016-09-16 18:02:29 MSK [29353]: [53-1] 
user=ipdr,db=ipdr,app=psql,client=[local] LOG:  duration: 4001.275 ms  plan:
Query Text: select count(*)  from (select a,b,c,d,e,sum(bytes) 
from test group by a,b,c,d,e)t
Aggregate  (cost=289035.43..289035.44 rows=1 width=8)
  ->  HashAggregate  (cost=288697.59..288847.74 rows=15015 width=28)
Group Key: test.a, test.b, test.c, test.d, test.e
->  Seq Scan on test  (cost=0.00..163696.15 rows=1115 
width=20)
2016-09-16 18:02:29 MSK [29353]: [54-1] user=ipdr,db=ipdr,app=psql,client=[local] 
CONTEXT:  SQL statement "select count(*)  from (select a,b,c,d,e,sum(bytes) 
from test group by a,b,c,d,e)t"
PL/pgSQL function parallel_test_plpgsql() line 5 at SQL statement


ipdr=> explain (analyze,buffers) select  parallel_test_plpgsql();
QUERY PLAN
--
 Result  (cost=0.00..0.26 rows=1 width=8) (actual time=4497.820..4497.822 
rows=1 loops=1)
   Buffers: shared hit=63695
 Planning time: 0.023 ms
 Execution time: 4497.872 ms
(4 rows)

auto_explain:
2016-09-16 18:03:23 MSK [29353]: [57-1] 
user=ipdr,db=ipdr,app=psql,client=[local] LOG:  duration: 4497.050 ms  plan:
Query Text: select count(*)  from (select a,b,c,d,e,sum(bytes) 
from test group by a,b,c,d,e)t
Aggregate  (cost=289035.43..289035.44 rows=1 width=8)
  ->  HashAggregate  (cost=288697.59..288847.74 rows=15015 width=28)
Group Key: test.a, test.b, test.c, test.d, test.e
->  Seq Scan on test  (cost=0.00..163696.15 rows=1115 
width=20)
2016-09-16 18:03:23 MSK [29353]: [58-1] u

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-16 Thread Amit Kapila

On Fri, Sep 16, 2016 at 6:57 PM, Alex Ignatov  wrote:
> No it doesn't.
> Paralleling neither sql function nor plpgsql:
> Here is example :
>
> ipdr=> show max_worker_processes ;
>  max_worker_processes
> --
>  128
> (1 row)
> ipdr=> set max_parallel_workers_per_gather to 128;
> SET
> ipdr=> set force_parallel_mode=on;
> SET
> ipdr=> set min_parallel_relation_size =0;
> SET
> ipdr=> set parallel_tuple_cost=0;
> SET
>

Can you try by setting force_parallel_mode = off;?  I think it is
sending the whole function execution to worker due to
force_parallel_mode.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-16 Thread Alex Ignatov


No it doesn't.
Paralleling neither sql function nor plpgsql:
Here is example :

ipdr=> show max_worker_processes ;
 max_worker_processes
--
 128
(1 row)
ipdr=> set max_parallel_workers_per_gather to 128;
SET
ipdr=> set force_parallel_mode=on;
SET
ipdr=> set min_parallel_relation_size =0;
SET
ipdr=> set parallel_tuple_cost=0;
SET

ipdr=> create table test as select (random ()*1000)::int % 3 as a,
ipdr-> (random ()*1000)::int % 5 as b,
ipdr-> (random ()*1000)::int % 7 as c,
ipdr-> (random ()*1000)::int % 11 as d,
ipdr-> (random ()*1000)::int % 13 as e,
ipdr-> (random ()*1000)::int % 17 as bytes
ipdr-> from generate_series(1,10*1000*1000);
SELECT 1000


ipdr=> create or replace function parallel_test_plpgsql() returns bigint as
ipdr-> $$
ipdr$> declare
ipdr$>cnt int:=0;
ipdr$> begin
ipdr$>select count(*) into cnt from (select a,b,c,d,e,sum(bytes) from test 
group by a,b,c,d,e)t;
ipdr$>return cnt;
ipdr$> end;
ipdr$> $$ language plpgsql PARALLEL SAFE  STRICT;
CREATE FUNCTION

ipdr=>
ipdr=> create or replace function parallel_test_sql() returns bigint as
ipdr-> $$
ipdr$>select count(*) from (select a,b,c,d,e,sum(bytes) from test group by 
a,b,c,d,e)t;
ipdr$> $$ language sql PARALLEL SAFE STRICT;
CREATE FUNCTION

ipdr=> analyze test;
ANALYZE
ipdr=> explain (analyze,buffers) select count(*) from (select 
a,b,c,d,e,sum(bytes) from test group by a,b,c,d,e)t;

   QUERY PLAN
-
 Aggregate  (cost=87702.33..87702.34 rows=1 width=8) (actual 
time=723.792..723.792 rows=1 loops=1)
   Buffers: shared hit=65015
   ->  Finalize HashAggregate  (cost=87364.49..87514.64 rows=15015 width=28) 
(actual time=720.496..722.589 rows=15015 loops=1)
 Group Key: test.a, test.b, test.c, test.d, test.e
 Buffers: shared hit=65015
 ->  Gather  (cost=85149.78..85299.93 rows=165165 width=20) (actual 
time=502.607..665.039 rows=180180 loops=1)
   Workers Planned: 11
   Workers Launched: 11
   Buffers: shared hit=65015
   ->  Partial HashAggregate  (cost=84149.78..84299.93 rows=15015 
width=20) (actual time=497.106..501.170 rows=15015 loops=12)
 Group Key: test.a, test.b, test.c, test.d, test.e
 Buffers: shared hit=63695
 ->  Parallel Seq Scan on test  (cost=0.00..72786.01 
rows=909101 width=20) (actual time=0.018..166.556 rows=83 loops=12)
   Buffers: shared hit=63695
 Planning time: 0.250 ms
 Execution time: 724.293 ms
(16 rows)

ipdr=> explain (analyze,buffers) select  parallel_test_plpgsql();
   QUERY PLAN

 Gather  (cost=1000.00..1000.26 rows=1 width=8) (actual time=4088.952..4088.956 
rows=1 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   Buffers: shared hit=64186
   ->  Result  (cost=0.00..0.26 rows=1 width=8) (actual time=4084.997..4084.999 
rows=1 loops=1)
 Buffers: shared hit=64149
 Planning time: 0.025 ms
 Execution time: 4100.026 ms
(9 rows)

Log from auto_explain:
2016-09-16 16:05:11 MSK [28209]: [1-1] user=,db=,app=,client= LOG:  duration: 
4082.517 ms  plan:
Query Text: select count(*)  from (select a,b,c,d,e,sum(bytes) 
from test group by a,b,c,d,e)t
Aggregate  (cost=289035.43..289035.44 rows=1 width=8)
  ->  HashAggregate  (cost=288697.59..288847.74 rows=15015 width=28)
Group Key: test.a, test.b, test.c, test.d, test.e
->  Seq Scan on test  (cost=0.00..163696.15 rows=1115 
width=20)


ipdr=> explain (analyze,buffers) select  parallel_test_sql();
   QUERY PLAN

 Gather  (cost=1000.00..1000.26 rows=1 width=8) (actual time=4256.830..4256.837 
rows=1 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   Buffers: shared hit=64132
   ->  Result  (cost=0.00..0.26 rows=1 width=8) (actual time=4252.401..4252.403 
rows=1 loops=1)
 Buffers: shared hit=64095
 Planning time: 0.151 ms
 Execution time: 4267.959 ms
(9 rows)

Log from auto_explain:
2016-09-16 16:22:03 MSK [731]: [1-1] user=,db=,app=,client= LOG:  duration: 
4249.851 ms  plan:
Query Text:
   select count(*) from (select a,b,c,d,e,sum(bytes) from test group by 
a,b,c,d,e)t;

Aggregate  (cost=289035.43..289035.44 rows=1 width=8)
  ->  HashAggregate  (cost=288697.59..288847.74 rows=150

Re: [HACKERS] Parallel sec scan in plpgsql

2016-09-15 Thread Ashutosh Bapat

On Thu, Sep 15, 2016 at 9:15 PM, Alex Ignatov  wrote:
> Hello!
> Does parallel secscan works in plpgsql?
>

Parallel seq scan is a query optimization that will work independent
of the source of the query - i.e whether it comes directly from a
client or a procedural language like plpgsql. So, I guess, answer to
your question is yes. If you are expecting something else, more
context will help.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

Re: [HACKERS] Parallel sec scan in plpgsql

14 matches

Site Navigation

Mail list logo

Footer information