Re: [GENERAL] Sequential parallel scan / agg

2016-11-14 Thread Tobias Bussmann
Hi,

On 2016-11-13 23:04:44 +, James Sewell said:
>   1.  INSERT ... SELECT doesn't work
>   2.  WITH a AS (SELECT ...) INSERT FROM a doesn't work
>   3.  SELECT .. INTO doesn't work
>   4.  SELECT with no insert does work as expected
>> 
>> CREATE TABLE blah AS SELECT ...  ?

Unfortunately, in these cases the current parallelism features cannot be used 
as all of these queries do write data. See in the doc [1] at "When Can Parallel 
Query Be Used?" 

> Even when it is in general possible for parallel query plans to be generated, 
> the planner will not generate them for a given query if any of the following 
> are true:
>   •   The query writes any data or locks any database rows. If a 
> query contains a data-modifying operation either at the top level or within a 
> CTE, no parallel plans for that query will be generated. This is a limitation 
> of the current implementation which could be lifted in a future release.

Best
Tobias

[1]: 
https://www.postgresql.org/docs/current/static/when-can-parallel-query-be-used.html



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Sequential parallel scan / agg

2016-11-13 Thread James Sewell
OK, looks like this question is answered by the wiki:

https://wiki.postgresql.org/wiki/Parallel_Query

Even when parallel query is enabled in general, the query planner will
> never generate a parallel plan if any of the following are true:
>
>- The query writes data. If a query contains a data-modifying
>operation either at the top level or within a CTE, no parallel plans for
>that query will be generated. This is a limitation of the current
>implementation which could be lifted in a future release.
>
>
Sad, but looks to hold from this testing!

James Sewell,
PostgreSQL Team Lead / Solutions Architect



Suite 112, Jones Bay Wharf, 26-32 Pirrama Road, Pyrmont NSW 2009
*P *(+61) 2 8099 9000 <(+61)%202%208099%209000>  *W* www.jirotech.com  *F *
(+61) 2 8099 9099 <(+61)%202%208099%209000>

On Mon, Nov 14, 2016 at 11:13 AM, James Sewell 
wrote:

> Hi,
>
> Yes, same result (non-parallel seq scan)
>
> Cheers,
>
> James Sewell,
> PostgreSQL Team Lead / Solutions Architect
>
>
>
> Suite 112, Jones Bay Wharf, 26-32 Pirrama Road, Pyrmont NSW 2009
> *P *(+61) 2 8099 9000 <(+61)%202%208099%209000>  *W* www.jirotech.com
> *F *(+61) 2 8099 9099 <(+61)%202%208099%209000>
>
> On Mon, Nov 14, 2016 at 10:43 AM, melvin6925  wrote:
>
>> Have you tried:
>> CREATE TABLE blah AS SELECT ...  ?
>>
>>
>>
>> Sent via the Samsung Galaxy S® 6, an AT 4G LTE smartphone
>>
>>  Original message 
>> From: James Sewell 
>> Date: 11/13/16 18:04 (GMT-05:00)
>> To: pgsql-general 
>> Subject: [GENERAL] Sequential parallel scan / agg
>>
>> Hi all,
>>
>> I have a customer who is making use of the new parallel features on a 64
>> core box - this is working really well for them. However one issue (which
>> I've since realised I've never seen an example of) is getting the data from
>> these queries into a table.
>>
>> I can't seem to create a plan which does this and still uses multiple
>> workers, probably best to just show by example.
>>
>> Below you can see:
>>
>>1. *INSERT ... SELECT *doesn't work
>>2. *WITH a AS (SELECT ...) INSERT FROM* a doesn't work
>>3. *SELECT .. INTO* doesn't work
>>4. *SELECT* with no insert does work as expected
>>
>>
>>
>> test=# explain
>> insert into jirotech.test
>> (select count(*)  FROM  meter_read mr
>>
>> where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
>> day)
>>
>> and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00' )
>> ;
>>
>>QUERY PLAN
>> 
>> 
>> 
>>  Insert on test  (cost=13336866.41..13336866.44 rows=1 width=438)
>>->  Subquery Scan on "*SELECT*"  (cost=13336866.41..13336866.44
>> rows=1 width=438)
>>  ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
>>->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
>>  ->  Seq Scan on meter_read mr  (cost=0.00..0.00
>> rows=1 width=0)
>>Filter: ((timestamp_rounded >= '2016-03-28
>> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
>> '2016-03-30 00:00:00'::timestamp without time zone))
>>  ->  Seq Scan on meter_read_2016_03_29 mr_1
>>  (cost=0.00..6201295.24 rows=186240748 width=0)
>>Filter: ((timestamp_rounded >= '2016-03-28
>> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
>> '2016-03-30 00:00:00'::timestamp without time zone))
>>  ->  Seq Scan on meter_read_2016_03_28 mr_2
>>  (cost=0.00..6204006.88 rows=186384966 width=0)
>>Filter: ((timestamp_rounded >= '2016-03-28
>> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
>> '2016-03-30 00:00:00'::timestamp without time zone))
>> (10 rows)
>>
>> test=# explain
>> with a as
>> (select count(*)  FROM  meter_read mr
>>
>> where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
>> day)
>>
>> and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00'  )
>> INSERT INTO jirotech.test SELECT * FROM A
>> ;
>>
>>
>>   QUERY PLAN
>> 
>> 
>> 
>>  Insert on test  (cost=13336866.42..13336866.45 rows=1 width=438)
>>CTE a
>>  ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
>>->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
>>  ->  Seq Scan on meter_read mr  (cost=0.00..0.00 rows=1
>> width=0)
>>Filter: ((timestamp_rounded >= '2016-03-28
>> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
>> '2016-03-30 

Re: [GENERAL] Sequential parallel scan / agg

2016-11-13 Thread James Sewell
Hi,

Yes, same result (non-parallel seq scan)

Cheers,

James Sewell,
PostgreSQL Team Lead / Solutions Architect



Suite 112, Jones Bay Wharf, 26-32 Pirrama Road, Pyrmont NSW 2009
*P *(+61) 2 8099 9000 <(+61)%202%208099%209000>  *W* www.jirotech.com  *F *
(+61) 2 8099 9099 <(+61)%202%208099%209000>

On Mon, Nov 14, 2016 at 10:43 AM, melvin6925  wrote:

> Have you tried:
> CREATE TABLE blah AS SELECT ...  ?
>
>
>
> Sent via the Samsung Galaxy S® 6, an AT 4G LTE smartphone
>
>  Original message 
> From: James Sewell 
> Date: 11/13/16 18:04 (GMT-05:00)
> To: pgsql-general 
> Subject: [GENERAL] Sequential parallel scan / agg
>
> Hi all,
>
> I have a customer who is making use of the new parallel features on a 64
> core box - this is working really well for them. However one issue (which
> I've since realised I've never seen an example of) is getting the data from
> these queries into a table.
>
> I can't seem to create a plan which does this and still uses multiple
> workers, probably best to just show by example.
>
> Below you can see:
>
>1. *INSERT ... SELECT *doesn't work
>2. *WITH a AS (SELECT ...) INSERT FROM* a doesn't work
>3. *SELECT .. INTO* doesn't work
>4. *SELECT* with no insert does work as expected
>
>
>
> test=# explain
> insert into jirotech.test
> (select count(*)  FROM  meter_read mr
>
> where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
> day)
>
> and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00' )
> ;
>
>QUERY PLAN
> 
> 
> 
>  Insert on test  (cost=13336866.41..13336866.44 rows=1 width=438)
>->  Subquery Scan on "*SELECT*"  (cost=13336866.41..13336866.44 rows=1
> width=438)
>  ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
>->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
>  ->  Seq Scan on meter_read mr  (cost=0.00..0.00
> rows=1 width=0)
>Filter: ((timestamp_rounded >= '2016-03-28
> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
> '2016-03-30 00:00:00'::timestamp without time zone))
>  ->  Seq Scan on meter_read_2016_03_29 mr_1
>  (cost=0.00..6201295.24 rows=186240748 width=0)
>Filter: ((timestamp_rounded >= '2016-03-28
> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
> '2016-03-30 00:00:00'::timestamp without time zone))
>  ->  Seq Scan on meter_read_2016_03_28 mr_2
>  (cost=0.00..6204006.88 rows=186384966 width=0)
>Filter: ((timestamp_rounded >= '2016-03-28
> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
> '2016-03-30 00:00:00'::timestamp without time zone))
> (10 rows)
>
> test=# explain
> with a as
> (select count(*)  FROM  meter_read mr
>
> where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
> day)
>
> and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00'  )
> INSERT INTO jirotech.test SELECT * FROM A
> ;
>
>
>   QUERY PLAN
> 
> 
> 
>  Insert on test  (cost=13336866.42..13336866.45 rows=1 width=438)
>CTE a
>  ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
>->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
>  ->  Seq Scan on meter_read mr  (cost=0.00..0.00 rows=1
> width=0)
>Filter: ((timestamp_rounded >= '2016-03-28
> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
> '2016-03-30 00:00:00'::timestamp without time zone))
>  ->  Seq Scan on meter_read_2016_03_29 mr_1
>  (cost=0.00..6201295.24 rows=186240748 width=0)
>Filter: ((timestamp_rounded >= '2016-03-28
> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
> '2016-03-30 00:00:00'::timestamp without time zone))
>  ->  Seq Scan on meter_read_2016_03_28 mr_2
>  (cost=0.00..6204006.88 rows=186384966 width=0)
>Filter: ((timestamp_rounded >= '2016-03-28
> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
> '2016-03-30 00:00:00'::timestamp without time zone))
>->  CTE Scan on a  (cost=0.00..0.03 rows=1 width=438)
> (11 rows)
>
> test=# explain
> select count(*) into jirotech.jamestest FROM  meter_read mr
>
> where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
> day)
>
> and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00'
> ;
>
>  QUERY PLAN
> 

Re: [GENERAL] Sequential parallel scan / agg

2016-11-13 Thread melvin6925
Have you tried:CREATE TABLE blah AS SELECT ...  ?


Sent via the Samsung Galaxy S® 6, an AT 4G LTE smartphone
 Original message From: James Sewell 
 Date: 11/13/16  18:04  (GMT-05:00) To: 
pgsql-general  Subject: [GENERAL] Sequential 
parallel scan / agg 
Hi all,
I have a customer who is making use of the new parallel features on a 64 core 
box - this is working really well for them. However one issue (which I've since 
realised I've never seen an example of) is getting the data from these queries 
into a table.
I can't seem to create a plan which does this and still uses multiple workers, 
probably best to just show by example.
Below you can see:INSERT ... SELECT doesn't workWITH a AS (SELECT ...) INSERT 
FROM a doesn't workSELECT .. INTO doesn't workSELECT with no insert does work 
as expected

test=# explaininsert into jirotech.test(select count(*)  FROM  meter_read mr
where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2' day)
and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00' );                       
                                                                  QUERY PLAN    
                    

 Insert on test  (cost=13336866.41..13336866.44 rows=1 width=438)   ->  
Subquery Scan on "*SELECT*"  (cost=13336866.41..13336866.44 rows=1 width=438)   
      ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)             
  ->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)                   
  ->  Seq Scan on meter_read mr  (cost=0.00..0.00 rows=1 width=0)               
            Filter: ((timestamp_rounded >= '2016-03-28 00:00:00'::timestamp 
without time zone) AND (timestamp_rounded < '2016-03-30 00:00:00'::timestamp 
without time zone))                     ->  Seq Scan on meter_read_2016_03_29 
mr_1  (cost=0.00..6201295.24 rows=186240748 width=0)                           
Filter: ((timestamp_rounded >= '2016-03-28 00:00:00'::timestamp without time 
zone) AND (timestamp_rounded < '2016-03-30 00:00:00'::timestamp without time 
zone))                     ->  Seq Scan on meter_read_2016_03_28 mr_2  
(cost=0.00..6204006.88 rows=186384966 width=0)                           
Filter: ((timestamp_rounded >= '2016-03-28 00:00:00'::timestamp without time 
zone) AND (timestamp_rounded < '2016-03-30 00:00:00'::timestamp without time 
zone))(10 rows)
test=# explainwith a as(select count(*)  FROM  meter_read mr
where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2' day)
and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00'  )INSERT INTO 
jirotech.test SELECT * FROM A;                                                  
                                    QUERY PLAN                          

 Insert on test  (cost=13336866.42..13336866.45 rows=1 width=438)   CTE a     
->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)           ->  
Append  (cost=0.00..12405302.12 rows=372625715 width=0)                 ->  Seq 
Scan on meter_read mr  (cost=0.00..0.00 rows=1 width=0)                       
Filter: ((timestamp_rounded >= '2016-03-28 00:00:00'::timestamp without time 
zone) AND (timestamp_rounded < '2016-03-30 00:00:00'::timestamp without time 
zone))                 ->  Seq Scan on meter_read_2016_03_29 mr_1  
(cost=0.00..6201295.24 rows=186240748 width=0)                       Filter: 
((timestamp_rounded >= '2016-03-28 00:00:00'::timestamp without time zone) AND 
(timestamp_rounded < '2016-03-30 00:00:00'::timestamp without time zone))       
          ->  Seq Scan on meter_read_2016_03_28 mr_2  (cost=0.00..6204006.88 
rows=186384966 width=0)                       Filter: ((timestamp_rounded >= 
'2016-03-28 00:00:00'::timestamp without time zone) AND (timestamp_rounded < 
'2016-03-30 00:00:00'::timestamp without time zone))   ->  CTE Scan on a  
(cost=0.00..0.03 rows=1 width=438)(11 rows)
test=# explainselect count(*) into jirotech.jamestest FROM  meter_read mr
where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2' day)
and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00';                         
                                                          QUERY PLAN            
                  

 Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)   ->  Append  
(cost=0.00..12405302.12 rows=372625715 width=0)         ->  Seq Scan on 
meter_read mr  (cost=0.00..0.00 rows=1 width=0)