Re: [HACKERS] Parallel Append implementation

2018-08-02 Thread Adrien NAYRAT

On 08/01/2018 03:14 PM, Robert Haas wrote:

Committed to master and v11.  Thanks for the review.


Thanks!



Re: [HACKERS] Parallel Append implementation

2018-08-01 Thread Robert Haas
On Mon, Jul 30, 2018 at 8:02 PM, Thomas Munro
 wrote:
> On Tue, Jul 31, 2018 at 5:05 AM, Robert Haas  wrote:
>> New version attached.
>
> Looks good to me.

Committed to master and v11.  Thanks for the review.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Parallel Append implementation

2018-07-30 Thread Thomas Munro
On Tue, Jul 31, 2018 at 5:05 AM, Robert Haas  wrote:
> New version attached.

Looks good to me.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Parallel Append implementation

2018-07-30 Thread Robert Haas
On Sun, Jul 29, 2018 at 5:49 PM, Thomas Munro
 wrote:
> On Thu, May 10, 2018 at 7:08 AM, Robert Haas  wrote:
>>  [parallel-append-doc-v2.patch]
>
> +plans just as they can in any other plan.  However, in a parallel plan,
> +it is also possible that the planner may choose to substitute a
> +Parallel Append node.
>
> Maybe drop "it is also possible that "?  It seems a bit unnecessary
> and sounds a bit odd followed by "may ", but maybe it's just me.

Changed.

> +Also, unlike a regular Append node, which can only 
> have
> +partial children when used within a parallel plan, Parallel
> +Append node can have both partial and non-partial child plans.
>
> Missing "a" before "Parallel".

Fixed.

> +Non-partial children will be scanned by only a single worker, since
>
> Are we using "worker" in a more general sense that possibly includes
> the leader?  Hmm, yes, other text on this page does that too.  Ho hum.

Tried to be more careful about this.

New version attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


parallel-append-doc-v3.patch
Description: Binary data


Re: [HACKERS] Parallel Append implementation

2018-07-29 Thread Thomas Munro
On Thu, May 10, 2018 at 7:08 AM, Robert Haas  wrote:
>  [parallel-append-doc-v2.patch]

+plans just as they can in any other plan.  However, in a parallel plan,
+it is also possible that the planner may choose to substitute a
+Parallel Append node.

Maybe drop "it is also possible that "?  It seems a bit unnecessary
and sounds a bit odd followed by "may ", but maybe it's just me.

+Also, unlike a regular Append node, which can only have
+partial children when used within a parallel plan, Parallel
+Append node can have both partial and non-partial child plans.

Missing "a" before "Parallel".

+Non-partial children will be scanned by only a single worker, since

Are we using "worker" in a more general sense that possibly includes
the leader?  Hmm, yes, other text on this page does that too.  Ho hum.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Parallel Append implementation

2018-05-09 Thread Robert Haas
On Tue, May 8, 2018 at 5:05 PM, Thomas Munro
 wrote:
> +scanning them more than once would preduce duplicate results.  Plans that
>
> s/preduce/produce/

Fixed, thanks.

> +Append or MergeAppend plan node.
> vs.
> +Append of regular Index Scan plans; each
>
> I think we should standardise on Foo Bar,
> FooBar or foo bar when
> discussing executor nodes on this page.

Well, EXPLAIN prints MergeAppend but Index Scan, and I think we should
follow that precedent here.

As for  vs. , I think the reason I ended up using
 in the section on scans was because I thought that
Parallel Seq Scan might be confusing (what's a
"seq"?), so I tried to fudge my way around that by referring to it as
an abstract idea rather than the exact EXPLAIN output.  You then
copied that style in the join section, and, well, like you say, now we
have a sort of hodgepodge of styles.  Maybe that's a problem for
another patch, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


parallel-append-doc-v2.patch
Description: Binary data


Re: [HACKERS] Parallel Append implementation

2018-05-08 Thread Thomas Munro
On Wed, May 9, 2018 at 1:15 AM, Robert Haas  wrote:
> On Tue, May 8, 2018 at 12:10 AM, Thomas Munro
>  wrote:
>> It's not a scan, it's not a join and it's not an aggregation so I
>> think it needs to be in a new section at the same level as those
>> others.  It's a different kind of thing.
>
> I'm a little skeptical about that idea because I'm not sure it's
> really in the same category as far as importance is concerned, but I
> don't have a better idea.  Here's a patch.  I'm worried this is too
> much technical jargon, but I don't know how to explain it any more
> simply.

+scanning them more than once would preduce duplicate results.  Plans that

s/preduce/produce/

+Append or MergeAppend plan node.
vs.
+Append of regular Index Scan plans; each

I think we should standardise on Foo Bar,
FooBar or foo bar when
discussing executor nodes on this page.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Parallel Append implementation

2018-05-08 Thread Robert Haas
On Tue, May 8, 2018 at 12:10 AM, Thomas Munro
 wrote:
> It's not a scan, it's not a join and it's not an aggregation so I
> think it needs to be in a new section at the same level as those
> others.  It's a different kind of thing.

I'm a little skeptical about that idea because I'm not sure it's
really in the same category as far as importance is concerned, but I
don't have a better idea.  Here's a patch.  I'm worried this is too
much technical jargon, but I don't know how to explain it any more
simply.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


parallel-append-doc.patch
Description: Binary data


Re: [HACKERS] Parallel Append implementation

2018-05-07 Thread Thomas Munro
On Tue, May 8, 2018 at 5:23 AM, Robert Haas  wrote:
> On Sat, Apr 7, 2018 at 10:21 AM, Adrien Nayrat
>  wrote:
>> I notice Parallel append is not listed on Parallel Plans documentation :
>> https://www.postgresql.org/docs/devel/static/parallel-plans.html
>
> I agree it might be nice to mention this somewhere on this page, but
> I'm not exactly sure where it would make logical sense to put it.

It's not a scan, it's not a join and it's not an aggregation so I
think it needs to be in a new section at the same level as those
others.  It's a different kind of thing.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Parallel Append implementation

2018-05-07 Thread Robert Haas
On Sat, Apr 7, 2018 at 10:21 AM, Adrien Nayrat
 wrote:
> I notice Parallel append is not listed on Parallel Plans documentation :
> https://www.postgresql.org/docs/devel/static/parallel-plans.html

I agree it might be nice to mention this somewhere on this page, but
I'm not exactly sure where it would make logical sense to put it.



-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Parallel Append implementation

2018-04-07 Thread Adrien Nayrat
Hello,

I notice Parallel append is not listed on Parallel Plans documentation :
https://www.postgresql.org/docs/devel/static/parallel-plans.html

If you agree I can add it to Open Items.

Thanks,

-- 
Adrien NAYRAT




signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Parallel Append implementation

2017-12-06 Thread Amit Khandekar
On 6 December 2017 at 04:01, Robert Haas  wrote:
> On Tue, Nov 28, 2017 at 6:02 AM, amul sul  wrote:
>> Here are the changes I did on v21 patch to handle crash reported by 
>> Rajkumar[1]:
>>
>> diff --git a/src/backend/executor/nodeAppend.c
>> b/src/backend/executor/nodeAppend.c
>> index e3b17cf0e2..e0ee918808 100644
>> --- a/src/backend/executor/nodeAppend.c
>> +++ b/src/backend/executor/nodeAppend.c
>> @@ -479,9 +479,12 @@ choose_next_subplan_for_worker(AppendState *node)
>> pstate->pa_next_plan = append->first_partial_plan;
>> else
>> pstate->pa_next_plan++;
>> -   if (pstate->pa_next_plan == node->as_whichplan)
>> +
>> +   if (pstate->pa_next_plan == node->as_whichplan ||
>> +   (pstate->pa_next_plan == append->first_partial_plan &&
>> +append->first_partial_plan >= node->as_nplans))
>> {
>> -   /* We've tried everything! */
>> +   /* We've tried everything or there were no partial plans */
>> pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
>> LWLockRelease(&pstate->pa_lock);
>> return false;
>
> I changed this around a little, added a test case, and committed this.

Thanks, Robert!

The crash that was reported on pgsql-committers is being discussed on
that list itself.

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] Parallel Append implementation

2017-12-05 Thread Robert Haas
On Tue, Nov 28, 2017 at 6:02 AM, amul sul  wrote:
> Here are the changes I did on v21 patch to handle crash reported by 
> Rajkumar[1]:
>
> diff --git a/src/backend/executor/nodeAppend.c
> b/src/backend/executor/nodeAppend.c
> index e3b17cf0e2..e0ee918808 100644
> --- a/src/backend/executor/nodeAppend.c
> +++ b/src/backend/executor/nodeAppend.c
> @@ -479,9 +479,12 @@ choose_next_subplan_for_worker(AppendState *node)
> pstate->pa_next_plan = append->first_partial_plan;
> else
> pstate->pa_next_plan++;
> -   if (pstate->pa_next_plan == node->as_whichplan)
> +
> +   if (pstate->pa_next_plan == node->as_whichplan ||
> +   (pstate->pa_next_plan == append->first_partial_plan &&
> +append->first_partial_plan >= node->as_nplans))
> {
> -   /* We've tried everything! */
> +   /* We've tried everything or there were no partial plans */
> pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
> LWLockRelease(&pstate->pa_lock);
> return false;

I changed this around a little, added a test case, and committed this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Parallel Append implementation

2017-11-29 Thread Michael Paquier
On Tue, Nov 28, 2017 at 8:02 PM, amul sul  wrote:
> Apart from this I have added few assert to keep eye on node->as_whichplan
> value in the attached patch, thanks.

This is still hot, moved to next CF.
-- 
Michael



Re: [HACKERS] Parallel Append implementation

2017-11-28 Thread amul sul
On Mon, Nov 27, 2017 at 10:21 PM, amul sul  wrote:
> Thanks a lot Rajkumar for this test. I am able to reproduce this crash by
> enabling  partition wise join.
>
> The reason for this crash is the same as
> the
> previous[1] i.e node->as_whichplan
> value.  This time append->first_partial_plan value looks suspicious. With
> the
> following change to the v21 patch, I am able to reproduce this crash as
> assert
> failure when enable_partition_wise_join = ON otherwise working fine.
>
> diff --git a/src/backend/executor/nodeAppend.c
> b/src/backend/executor/nodeAppend.c
> index e3b17cf0e2..4b337ac633 100644
> --- a/src/backend/executor/nodeAppend.c
> +++ b/src/backend/executor/nodeAppend.c
> @@ -458,6 +458,7 @@ choose_next_subplan_for_worker(AppendState *node)
>
> /* Backward scan is not supported by parallel-aware plans */
> Assert(ScanDirectionIsForward(node->ps.state->es_direction));
> +   Assert(append->first_partial_plan < node->as_nplans);
>
> LWLockAcquire(&pstate->pa_lock, LW_EXCLUSIVE);
>
>
> Will look into this more, tomorrow.
>
I haven't found the actual reason why there wasn't any partial plan
(i.e. the values of append->first_partial_plan and node->as_nplans are the same)
when partition-wise join is enabled.  I think in this case we could simply
return false from choose_next_subplan_for_worker() when there aren't any
partial plans and we are done with all the non-partial plans, although I may be
wrong because I am yet to understand this patch.
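
For context, here is a rough sketch (based on a reading of the patch, not
authoritative) of the assumed child-plan layout, and why
first_partial_plan >= as_nplans can be read as "there are no partial children":

#include <stdbool.h>

/*
 * Assumed layout of the Append node's children (simplified sketch, not
 * taken from the patch):
 *
 *   appendplans[0 .. first_partial_plan - 1]      non-partial children,
 *                                                 each run by one process only
 *   appendplans[first_partial_plan .. nplans - 1] partial children, which
 *                                                 several processes may share
 *
 * Under that layout, first_partial_plan == nplans means the partial range is
 * empty, which is exactly the case the additional check in the change below
 * guards against.
 */
static inline bool
append_has_partial_children(int first_partial_plan, int nplans)
{
    return first_partial_plan < nplans;
}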

Here are the changes I made on top of the v21 patch to handle the crash reported by Rajkumar[1]:

diff --git a/src/backend/executor/nodeAppend.c
b/src/backend/executor/nodeAppend.c
index e3b17cf0e2..e0ee918808 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -479,9 +479,12 @@ choose_next_subplan_for_worker(AppendState *node)
pstate->pa_next_plan = append->first_partial_plan;
else
pstate->pa_next_plan++;
-   if (pstate->pa_next_plan == node->as_whichplan)
+
+   if (pstate->pa_next_plan == node->as_whichplan ||
+   (pstate->pa_next_plan == append->first_partial_plan &&
+append->first_partial_plan >= node->as_nplans))
{
-   /* We've tried everything! */
+   /* We've tried everything or there were no partial plans */
pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
LWLockRelease(&pstate->pa_lock);
return false;

Apart from this, I have added a few asserts to keep an eye on the
node->as_whichplan value in the attached patch, thanks.
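
Purely as an illustration of the kind of check meant here (the actual asserts
are whatever the attached ParallelAppend_v22.patch contains), a range guard on
the chosen index might look like:

#include <assert.h>

/* Illustrative only: range guards of the kind discussed in this thread. */
static void
check_chosen_subplan(int as_whichplan, int first_partial_plan, int as_nplans)
{
    /* the chosen subplan index must stay within the appendplans array */
    assert(as_whichplan >= 0 && as_whichplan < as_nplans);
    /* first_partial_plan may equal as_nplans when there are no partial plans */
    assert(first_partial_plan >= 0 && first_partial_plan <= as_nplans);
}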

1] 
http://postgr.es/m/CAKcux6nyDxOyE4PA8O%3DQgF-ugZp_y1G2U%2Burmf76-%3Df2knDsWA%40mail.gmail.com

Regards,
Amul


ParallelAppend_v22.patch
Description: Binary data


Re: [HACKERS] Parallel Append implementation

2017-11-27 Thread amul sul
Thanks a lot, Rajkumar, for this test. I am able to reproduce this crash by
enabling partition-wise join.

The reason for this crash is the same as the previous one[1], i.e. the
node->as_whichplan value.  This time the append->first_partial_plan value
looks suspicious.  With the following change to the v21 patch, I am able to
reproduce this crash as an assert failure when enable_partition_wise_join =
ON; otherwise it works fine.

diff --git a/src/backend/executor/nodeAppend.c
b/src/backend/executor/nodeAppend.c
index e3b17cf0e2..4b337ac633 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -458,6 +458,7 @@ choose_next_subplan_for_worker(AppendState *node)

/* Backward scan is not supported by parallel-aware plans */
Assert(ScanDirectionIsForward(node->ps.state->es_direction));
+   Assert(append->first_partial_plan < node->as_nplans);

LWLockAcquire(&pstate->pa_lock, LW_EXCLUSIVE);


Will look into this more tomorrow.

1. http://postgr.es/m/CAAJ_b97kLNW8Z9nvc_JUUG5wVQUXvG=f37WsX8ALF0A=kah...@mail.gmail.com

Regards,
Amul


On Fri, Nov 24, 2017 at 5:00 PM, Rajkumar Raghuwanshi
 wrote:
> On Thu, Nov 23, 2017 at 2:22 PM, amul sul  wrote:
>> Look like it is the same crash what v20 claim to be fixed, indeed I
>> missed to add fix[1] in v20 patch, sorry about that. Attached updated
>> patch includes aforementioned fix.
>
> Hi,
>
> I have applied latest v21 patch, it got crashed when enabled
> partition-wise-join,
> same query is working fine with and without partition-wise-join
> enabled on PG-head.
> please take a look.
>
> SET enable_partition_wise_join TO true;
>
> CREATE TABLE pt1 (a int, b int, c text, d int) PARTITION BY LIST(c);
> CREATE TABLE pt1_p1 PARTITION OF pt1 FOR VALUES IN ('', '0001',
> '0002', '0003');
> CREATE TABLE pt1_p2 PARTITION OF pt1 FOR VALUES IN ('0004', '0005',
> '0006', '0007');
> CREATE TABLE pt1_p3 PARTITION OF pt1 FOR VALUES IN ('0008', '0009',
> '0010', '0011');
> INSERT INTO pt1 SELECT i % 20, i % 30, to_char(i % 12, 'FM'), i %
> 30 FROM generate_series(0, 9) i;
> ANALYZE pt1;
>
> CREATE TABLE pt2 (a int, b int, c text, d int) PARTITION BY LIST(c);
> CREATE TABLE pt2_p1 PARTITION OF pt2 FOR VALUES IN ('', '0001',
> '0002', '0003');
> CREATE TABLE pt2_p2 PARTITION OF pt2 FOR VALUES IN ('0004', '0005',
> '0006', '0007');
> CREATE TABLE pt2_p3 PARTITION OF pt2 FOR VALUES IN ('0008', '0009',
> '0010', '0011');
> INSERT INTO pt2 SELECT i % 20, i % 30, to_char(i % 12, 'FM'), i %
> 30 FROM generate_series(0, 9) i;
> ANALYZE pt2;
>
> EXPLAIN ANALYZE
> SELECT t1.c, sum(t2.a), COUNT(*) FROM pt1 t1 FULL JOIN pt2 t2 ON t1.c
> = t2.c GROUP BY t1.c ORDER BY 1, 2, 3;
> WARNING:  terminating connection because of crash of another server
process
> DETAIL:  The postmaster has commanded this server process to roll back
> the current transaction and exit, because another server process
> exited abnormally and possibly corrupted shared memory.
> HINT:  In a moment you should be able to reconnect to the database and
> repeat your command.
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
> stack-trace is given below.
>
> Core was generated by `postgres: parallel worker for PID 73935
>  '.
> Program terminated with signal 11, Segmentation fault.
> #0  0x006dc4b3 in ExecProcNode (node=0x7f7f7f7f7f7f7f7e) at
> ../../../src/include/executor/executor.h:238
> 238if (node->chgParam != NULL) /* something changed? */
> Missing separate debuginfos, use: debuginfo-install
> keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
> libcom_err-1.41.12-23.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
> openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0  0x006dc4b3 in ExecProcNode (node=0x7f7f7f7f7f7f7f7e) at
> ../../../src/include/executor/executor.h:238
> #1  0x006dc72e in ExecAppend (pstate=0x26cd6e0) at
nodeAppend.c:207
> #2  0x006d1e7c in ExecProcNodeInstr (node=0x26cd6e0) at
> execProcnode.c:446
> #3  0x006dcee5 in ExecProcNode (node=0x26cd6e0) at
> ../../../src/include/executor/executor.h:241
> #4  0x006dd38c in fetch_input_tuple (aggstate=0x26cd7f8) at
> nodeAgg.c:699
> #5  0x006e02eb in agg_fill_hash_table (aggstate=0x26cd7f8) at
> nodeAgg.c:2536
> #6  0x006dfb2b in ExecAgg (pstate=0x26cd7f8) at nodeAgg.c:2148
> #7  0x006d1e7c in ExecProcNodeInstr (node=0x26cd7f8) at
> execProcnode.c:446
> #8  0x006d1e4d in ExecProcNodeFirst (node=0x26cd7f8) at
> execProcnode.c:430
> #9  0x006c9439 in ExecProcNode (node=0x26cd7f8) at
> ../../../src/include/executor/executor.h:241
> #10 0x006cbd73 in ExecutePlan (estate=0x26ccda0,
> planstate=0x26cd7f8, use_parallel_mode=0 '\000', operation=CMD_SELECT,
> sendTuples=1 '\001', numberTuples=0,
> direction=Forwar

Re: [HACKERS] Parallel Append implementation

2017-11-24 Thread Rajkumar Raghuwanshi
On Thu, Nov 23, 2017 at 2:22 PM, amul sul  wrote:
> Look like it is the same crash what v20 claim to be fixed, indeed I
> missed to add fix[1] in v20 patch, sorry about that. Attached updated
> patch includes aforementioned fix.

Hi,

I have applied the latest v21 patch, and it crashed when
partition-wise join was enabled.
The same query works fine both with and without partition-wise join
enabled on PG head.
Please take a look.

SET enable_partition_wise_join TO true;

CREATE TABLE pt1 (a int, b int, c text, d int) PARTITION BY LIST(c);
CREATE TABLE pt1_p1 PARTITION OF pt1 FOR VALUES IN ('', '0001',
'0002', '0003');
CREATE TABLE pt1_p2 PARTITION OF pt1 FOR VALUES IN ('0004', '0005',
'0006', '0007');
CREATE TABLE pt1_p3 PARTITION OF pt1 FOR VALUES IN ('0008', '0009',
'0010', '0011');
INSERT INTO pt1 SELECT i % 20, i % 30, to_char(i % 12, 'FM'), i %
30 FROM generate_series(0, 9) i;
ANALYZE pt1;

CREATE TABLE pt2 (a int, b int, c text, d int) PARTITION BY LIST(c);
CREATE TABLE pt2_p1 PARTITION OF pt2 FOR VALUES IN ('', '0001',
'0002', '0003');
CREATE TABLE pt2_p2 PARTITION OF pt2 FOR VALUES IN ('0004', '0005',
'0006', '0007');
CREATE TABLE pt2_p3 PARTITION OF pt2 FOR VALUES IN ('0008', '0009',
'0010', '0011');
INSERT INTO pt2 SELECT i % 20, i % 30, to_char(i % 12, 'FM'), i %
30 FROM generate_series(0, 9) i;
ANALYZE pt2;

EXPLAIN ANALYZE
SELECT t1.c, sum(t2.a), COUNT(*) FROM pt1 t1 FULL JOIN pt2 t2 ON t1.c
= t2.c GROUP BY t1.c ORDER BY 1, 2, 3;
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

stack-trace is given below.

Core was generated by `postgres: parallel worker for PID 73935
 '.
Program terminated with signal 11, Segmentation fault.
#0  0x006dc4b3 in ExecProcNode (node=0x7f7f7f7f7f7f7f7e) at
../../../src/include/executor/executor.h:238
238if (node->chgParam != NULL) /* something changed? */
Missing separate debuginfos, use: debuginfo-install
keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
libcom_err-1.41.12-23.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x006dc4b3 in ExecProcNode (node=0x7f7f7f7f7f7f7f7e) at
../../../src/include/executor/executor.h:238
#1  0x006dc72e in ExecAppend (pstate=0x26cd6e0) at nodeAppend.c:207
#2  0x006d1e7c in ExecProcNodeInstr (node=0x26cd6e0) at
execProcnode.c:446
#3  0x006dcee5 in ExecProcNode (node=0x26cd6e0) at
../../../src/include/executor/executor.h:241
#4  0x006dd38c in fetch_input_tuple (aggstate=0x26cd7f8) at
nodeAgg.c:699
#5  0x006e02eb in agg_fill_hash_table (aggstate=0x26cd7f8) at
nodeAgg.c:2536
#6  0x006dfb2b in ExecAgg (pstate=0x26cd7f8) at nodeAgg.c:2148
#7  0x006d1e7c in ExecProcNodeInstr (node=0x26cd7f8) at
execProcnode.c:446
#8  0x006d1e4d in ExecProcNodeFirst (node=0x26cd7f8) at
execProcnode.c:430
#9  0x006c9439 in ExecProcNode (node=0x26cd7f8) at
../../../src/include/executor/executor.h:241
#10 0x006cbd73 in ExecutePlan (estate=0x26ccda0,
planstate=0x26cd7f8, use_parallel_mode=0 '\000', operation=CMD_SELECT,
sendTuples=1 '\001', numberTuples=0,
direction=ForwardScanDirection, dest=0x26b2ce0, execute_once=1
'\001') at execMain.c:1718
#11 0x006c9a12 in standard_ExecutorRun (queryDesc=0x26d7fa0,
direction=ForwardScanDirection, count=0, execute_once=1 '\001') at
execMain.c:361
#12 0x006c982e in ExecutorRun (queryDesc=0x26d7fa0,
direction=ForwardScanDirection, count=0, execute_once=1 '\001') at
execMain.c:304
#13 0x006d096c in ParallelQueryMain (seg=0x26322a8,
toc=0x7fda24d46000) at execParallel.c:1271
#14 0x0053272d in ParallelWorkerMain (main_arg=1203628635) at
parallel.c:1149
#15 0x007e8c99 in StartBackgroundWorker () at bgworker.c:841
#16 0x007fc029 in do_start_bgworker (rw=0x2656d00) at postmaster.c:5741
#17 0x007fc36b in maybe_start_bgworkers () at postmaster.c:5945
#18 0x007fb3fa in sigusr1_handler (postgres_signal_arg=10) at
postmaster.c:5134
#19 
#20 0x003dd26e1603 in __select_nocancel () at
../sysdeps/unix/syscall-template.S:82
#21 0x007f6bee in ServerLoop () at postmaster.c:1721
#22 0x007f63dd in PostmasterMain (argc=3, argv=0x2630180) at
postmaster.c:1365
#23 0x0072cb40 in main (argc=3, argv=0x2630180) at main.c:228

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation



Re: [HACKERS] Parallel Append implementation

2017-11-23 Thread amul sul
Looks like it is the same crash that v20 claimed to fix; indeed, I
missed adding the fix[1] in the v20 patch, sorry about that. The attached
updated patch includes the aforementioned fix.


1] 
http://postgr.es/m/CAAJ_b97kLNW8Z9nvc_JUUG5wVQUXvG=f37WsX8ALF0A=kah...@mail.gmail.com


Regards,
Amul

On Thu, Nov 23, 2017 at 1:50 PM, Rajkumar Raghuwanshi
 wrote:
> On Thu, Nov 23, 2017 at 9:45 AM, amul sul  wrote:
>>
>> Attaching updated version of "ParallelAppend_v19_rebased" includes this
>> fix.
>
>
> Hi,
>
> I have applied attached patch and got a crash with below query. please take
> a look.
>
> CREATE TABLE tbl (a int, b int, c text, d int) PARTITION BY LIST(c);
> CREATE TABLE tbl_p1 PARTITION OF tbl FOR VALUES IN ('', '0001', '0002',
> '0003');
> CREATE TABLE tbl_p2 PARTITION OF tbl FOR VALUES IN ('0004', '0005', '0006',
> '0007');
> CREATE TABLE tbl_p3 PARTITION OF tbl FOR VALUES IN ('0008', '0009', '0010',
> '0011');
> INSERT INTO tbl SELECT i % 20, i % 30, to_char(i % 12, 'FM'), i % 30
> FROM generate_series(0, 999) i;
> ANALYZE tbl;
>
> EXPLAIN ANALYZE SELECT c, sum(a), avg(b), COUNT(*) FROM tbl GROUP BY c
> HAVING avg(d) < 15 ORDER BY 1, 2, 3;
> WARNING:  terminating connection because of crash of another server process
> DETAIL:  The postmaster has commanded this server process to roll back the
> current transaction and exit, because another server process exited
> abnormally and possibly corrupted shared memory.
> HINT:  In a moment you should be able to reconnect to the database and
> repeat your command.
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
>
> stack-trace is given below.
>
> Reading symbols from /lib64/libnss_files.so.2...Reading symbols from
> /usr/lib/debug/lib64/libnss_files-2.12.so.debug...done.
> done.
> Loaded symbols for /lib64/libnss_files.so.2
> Core was generated by `postgres: parallel worker for PID 104999
> '.
> Program terminated with signal 11, Segmentation fault.
> #0  0x006dc4b3 in ExecProcNode (node=0x7f7f7f7f7f7f7f7e) at
> ../../../src/include/executor/executor.h:238
> 238if (node->chgParam != NULL) /* something changed? */
> Missing separate debuginfos, use: debuginfo-install
> keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
> libcom_err-1.41.12-23.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
> openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0  0x006dc4b3 in ExecProcNode (node=0x7f7f7f7f7f7f7f7e) at
> ../../../src/include/executor/executor.h:238
> #1  0x006dc72e in ExecAppend (pstate=0x1947ed0) at nodeAppend.c:207
> #2  0x006d1e7c in ExecProcNodeInstr (node=0x1947ed0) at
> execProcnode.c:446
> #3  0x006dcef1 in ExecProcNode (node=0x1947ed0) at
> ../../../src/include/executor/executor.h:241
> #4  0x006dd398 in fetch_input_tuple (aggstate=0x1947fe8) at
> nodeAgg.c:699
> #5  0x006e02f7 in agg_fill_hash_table (aggstate=0x1947fe8) at
> nodeAgg.c:2536
> #6  0x006dfb37 in ExecAgg (pstate=0x1947fe8) at nodeAgg.c:2148
> #7  0x006d1e7c in ExecProcNodeInstr (node=0x1947fe8) at
> execProcnode.c:446
> #8  0x006d1e4d in ExecProcNodeFirst (node=0x1947fe8) at
> execProcnode.c:430
> #9  0x006c9439 in ExecProcNode (node=0x1947fe8) at
> ../../../src/include/executor/executor.h:241
> #10 0x006cbd73 in ExecutePlan (estate=0x1947590,
> planstate=0x1947fe8, use_parallel_mode=0 '\000', operation=CMD_SELECT,
> sendTuples=1 '\001', numberTuples=0,
> direction=ForwardScanDirection, dest=0x192acb0, execute_once=1 '\001')
> at execMain.c:1718
> #11 0x006c9a12 in standard_ExecutorRun (queryDesc=0x194ffc0,
> direction=ForwardScanDirection, count=0, execute_once=1 '\001') at
> execMain.c:361
> #12 0x006c982e in ExecutorRun (queryDesc=0x194ffc0,
> direction=ForwardScanDirection, count=0, execute_once=1 '\001') at
> execMain.c:304
> #13 0x006d096c in ParallelQueryMain (seg=0x18aa2a8,
> toc=0x7f899a227000) at execParallel.c:1271
> #14 0x0053272d in ParallelWorkerMain (main_arg=1218206688) at
> parallel.c:1149
> #15 0x007e8ca5 in StartBackgroundWorker () at bgworker.c:841
> #16 0x007fc035 in do_start_bgworker (rw=0x18ced00) at
> postmaster.c:5741
> #17 0x007fc377 in maybe_start_bgworkers () at postmaster.c:5945
> #18 0x007fb406 in sigusr1_handler (postgres_signal_arg=10) at
> postmaster.c:5134
> #19 
> #20 0x003dd26e1603 in __select_nocancel () at
> ../sysdeps/unix/syscall-template.S:82
> #21 0x007f6bfa in ServerLoop () at postmaster.c:1721
> #22 0x007f63e9 in PostmasterMain (argc=3, argv=0x18a8180) at
> postmaster.c:1365
> #23 0x0072cb4c in main (argc=3, argv=0x18a8180) at main.c:228
> (gdb)
>
>
> Thanks & Regards,
> Rajkumar Raghuwanshi
> QMG, EnterpriseDB Corporation


ParallelAppend_v21.patch
Descriptio

Re: [HACKERS] Parallel Append implementation

2017-11-23 Thread Rajkumar Raghuwanshi
On Thu, Nov 23, 2017 at 9:45 AM, amul sul  wrote:

> Attaching updated version of "ParallelAppend_v19_rebased" includes this
> fix.
>

Hi,

I have applied the attached patch and got a crash with the below query. Please
take a look.

CREATE TABLE tbl (a int, b int, c text, d int) PARTITION BY LIST(c);
CREATE TABLE tbl_p1 PARTITION OF tbl FOR VALUES IN ('', '0001', '0002',
'0003');
CREATE TABLE tbl_p2 PARTITION OF tbl FOR VALUES IN ('0004', '0005', '0006',
'0007');
CREATE TABLE tbl_p3 PARTITION OF tbl FOR VALUES IN ('0008', '0009', '0010',
'0011');
INSERT INTO tbl SELECT i % 20, i % 30, to_char(i % 12, 'FM'), i % 30
FROM generate_series(0, 999) i;
ANALYZE tbl;

EXPLAIN ANALYZE SELECT c, sum(a), avg(b), COUNT(*) FROM tbl GROUP BY c
HAVING avg(d) < 15 ORDER BY 1, 2, 3;
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>


stack-trace is given below.

Reading symbols from /lib64/libnss_files.so.2...Reading symbols from
/usr/lib/debug/lib64/libnss_files-2.12.so.debug...done.
done.
Loaded symbols for /lib64/libnss_files.so.2
Core was generated by `postgres: parallel worker for PID
104999 '.
Program terminated with signal 11, Segmentation fault.
#0  0x006dc4b3 in ExecProcNode (node=0x7f7f7f7f7f7f7f7e) at
../../../src/include/executor/executor.h:238
238if (node->chgParam != NULL) /* something changed? */
Missing separate debuginfos, use: debuginfo-install
keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
libcom_err-1.41.12-23.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x006dc4b3 in ExecProcNode (node=0x7f7f7f7f7f7f7f7e) at
../../../src/include/executor/executor.h:238
#1  0x006dc72e in ExecAppend (pstate=0x1947ed0) at nodeAppend.c:207
#2  0x006d1e7c in ExecProcNodeInstr (node=0x1947ed0) at
execProcnode.c:446
#3  0x006dcef1 in ExecProcNode (node=0x1947ed0) at
../../../src/include/executor/executor.h:241
#4  0x006dd398 in fetch_input_tuple (aggstate=0x1947fe8) at
nodeAgg.c:699
#5  0x006e02f7 in agg_fill_hash_table (aggstate=0x1947fe8) at
nodeAgg.c:2536
#6  0x006dfb37 in ExecAgg (pstate=0x1947fe8) at nodeAgg.c:2148
#7  0x006d1e7c in ExecProcNodeInstr (node=0x1947fe8) at
execProcnode.c:446
#8  0x006d1e4d in ExecProcNodeFirst (node=0x1947fe8) at
execProcnode.c:430
#9  0x006c9439 in ExecProcNode (node=0x1947fe8) at
../../../src/include/executor/executor.h:241
#10 0x006cbd73 in ExecutePlan (estate=0x1947590,
planstate=0x1947fe8, use_parallel_mode=0 '\000', operation=CMD_SELECT,
sendTuples=1 '\001', numberTuples=0,
direction=ForwardScanDirection, dest=0x192acb0, execute_once=1 '\001')
at execMain.c:1718
#11 0x006c9a12 in standard_ExecutorRun (queryDesc=0x194ffc0,
direction=ForwardScanDirection, count=0, execute_once=1 '\001') at
execMain.c:361
#12 0x006c982e in ExecutorRun (queryDesc=0x194ffc0,
direction=ForwardScanDirection, count=0, execute_once=1 '\001') at
execMain.c:304
#13 0x006d096c in ParallelQueryMain (seg=0x18aa2a8,
toc=0x7f899a227000) at execParallel.c:1271
#14 0x0053272d in ParallelWorkerMain (main_arg=1218206688) at
parallel.c:1149
#15 0x007e8ca5 in StartBackgroundWorker () at bgworker.c:841
#16 0x007fc035 in do_start_bgworker (rw=0x18ced00) at
postmaster.c:5741
#17 0x007fc377 in maybe_start_bgworkers () at postmaster.c:5945
#18 0x007fb406 in sigusr1_handler (postgres_signal_arg=10) at
postmaster.c:5134
#19 
#20 0x003dd26e1603 in __select_nocancel () at
../sysdeps/unix/syscall-template.S:82
#21 0x007f6bfa in ServerLoop () at postmaster.c:1721
#22 0x007f63e9 in PostmasterMain (argc=3, argv=0x18a8180) at
postmaster.c:1365
#23 0x0072cb4c in main (argc=3, argv=0x18a8180) at main.c:228
(gdb)


Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation


Re: [HACKERS] Parallel Append implementation

2017-11-22 Thread amul sul
On Wed, Nov 22, 2017 at 1:44 AM, Robert Haas  wrote:
> On Tue, Nov 21, 2017 at 6:57 AM, amul sul  wrote:
>> By doing following change on the v19 patch does the fix for me:
>>
>> --- a/src/backend/executor/nodeAppend.c
>> +++ b/src/backend/executor/nodeAppend.c
>> @@ -489,11 +489,9 @@ choose_next_subplan_for_worker(AppendState *node)
>> }
>>
>> /* Pick the plan we found, and advance pa_next_plan one more time. */
>> -   node->as_whichplan = pstate->pa_next_plan;
>> +   node->as_whichplan = pstate->pa_next_plan++;
>> if (pstate->pa_next_plan == node->as_nplans)
>> pstate->pa_next_plan = append->first_partial_plan;
>> -   else
>> -   pstate->pa_next_plan++;
>>
>> /* If non-partial, immediately mark as finished. */
>> if (node->as_whichplan < append->first_partial_plan)
>>
>> Attaching patch does same changes to Amit's ParallelAppend_v19_rebased.patch.
>
> Yes, that looks like a correct fix.  Thanks.
>

Attaching an updated version of "ParallelAppend_v19_rebased" that includes this fix.

Regards,
Amul


ParallelAppend_v20.patch
Description: Binary data


Re: [HACKERS] Parallel Append implementation

2017-11-21 Thread Robert Haas
On Tue, Nov 21, 2017 at 6:57 AM, amul sul  wrote:
> By doing following change on the v19 patch does the fix for me:
>
> --- a/src/backend/executor/nodeAppend.c
> +++ b/src/backend/executor/nodeAppend.c
> @@ -489,11 +489,9 @@ choose_next_subplan_for_worker(AppendState *node)
> }
>
> /* Pick the plan we found, and advance pa_next_plan one more time. */
> -   node->as_whichplan = pstate->pa_next_plan;
> +   node->as_whichplan = pstate->pa_next_plan++;
> if (pstate->pa_next_plan == node->as_nplans)
> pstate->pa_next_plan = append->first_partial_plan;
> -   else
> -   pstate->pa_next_plan++;
>
> /* If non-partial, immediately mark as finished. */
> if (node->as_whichplan < append->first_partial_plan)
>
> Attaching patch does same changes to Amit's ParallelAppend_v19_rebased.patch.

Yes, that looks like a correct fix.  Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Parallel Append implementation

2017-11-21 Thread amul sul
On Tue, Nov 21, 2017 at 2:22 PM, Amit Khandekar  wrote:
> On 21 November 2017 at 12:44, Rafia Sabih  
> wrote:
>> On Mon, Nov 13, 2017 at 12:54 PM, Amit Khandekar  
>> wrote:
>>> Thanks a lot Robert for the patch. I will have a look. Quickly tried
>>> to test some aggregate queries with a partitioned pgbench_accounts
>>> table, and it is crashing. Will get back with the fix, and any other
>>> review comments.
>>>
>>> Thanks
>>> -Amit Khandekar
>>
>> I was trying to get the performance of this patch at commit id -
>> 11e264517dff7a911d9e6494de86049cab42cde3 and TPC-H scale factor 20
>> with the following parameter settings,
>> work_mem = 1 GB
>> shared_buffers = 10GB
>> effective_cache_size = 10GB
>> max_parallel_workers_per_gather = 4
>> enable_partitionwise_join = on
>>
>> and the details of the partitioning scheme is as follows,
>> tables partitioned = lineitem on l_orderkey and orders on o_orderkey
>> number of partitions in each table = 10
>>
>> As per the explain outputs PA was used in following queries- 1, 3, 4,
>> 5, 6, 7, 8, 10, 12, 14, 15, 18, and 21.
>> Unfortunately, at the time of executing any of these query, it is
>> crashing with the following information in  core dump of each of the
>> workers,
>>
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x10600984 in pg_atomic_read_u32_impl (ptr=0x3ec29294)
>> at ../../../../src/include/port/atomics/generic.h:48
>> 48 return ptr->value;
>>
>> In case this a different issue as you pointed upthread, you may want
>> to have a look at this as well.
>> Please let me know if you need any more information in this regard.
>
> Right, for me the crash had occurred with a similar stack, although
> the real crash happened in one of the workers. Attached is the script
> file
> pgbench_partitioned.sql to create a schema with which I had reproduced
> the crash.
>
> The query that crashed :
> select sum(aid), avg(aid) from pgbench_accounts;
>
> Set max_parallel_workers_per_gather to at least 5.
>
> Also attached is v19 patch rebased.
>

I've spent a little time debugging this crash. The crash happens in ExecAppend()
because a subnode in the node->appendplans array is referenced using an incorrect
(out-of-bounds) array index in the following code:

/*
 * figure out which subplan we are currently processing
 */
subnode = node->appendplans[node->as_whichplan];

This incorrect value gets assigned to node->as_whichplan in
choose_next_subplan_for_worker().
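
For illustration only, here is a simplified standalone sketch (not code from
the patch; the real function also has a search loop and locking) showing how
the v19 advance-then-wrap ordering can leave the shared cursor, and hence the
chosen subplan index, out of range:

#include <stdio.h>

/* Simplified stand-ins for the shared Parallel Append state. */
static int nplans = 3;              /* number of child plans */
static int first_partial_plan = 0;  /* index of the first partial child */
static int pa_next_plan = 0;        /* shared cursor advanced by each worker */

/* v19 ordering: the wrap check happens before the increment, so after the
 * last plan is handed out the cursor is left resting at nplans. */
static int pick_v19(void)
{
    int whichplan = pa_next_plan;
    if (pa_next_plan == nplans)
        pa_next_plan = first_partial_plan;
    else
        pa_next_plan++;
    return whichplan;
}

/* Fixed ordering: increment first, then wrap, so the cursor never rests on
 * an out-of-range value and whichplan always stays below nplans. */
static int pick_fixed(void)
{
    int whichplan = pa_next_plan++;
    if (pa_next_plan == nplans)
        pa_next_plan = first_partial_plan;
    return whichplan;
}

int main(void)
{
    /* With the v19 ordering the fourth pick returns 3 (== nplans), which
     * would index past the end of node->appendplans. */
    for (int i = 0; i < 4; i++)
        printf("v19 pick:   %d\n", pick_v19());

    pa_next_plan = 0;
    for (int i = 0; i < 4; i++)
        printf("fixed pick: %d\n", pick_fixed());
    return 0;
}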

The following change on top of the v19 patch fixes it for me:

--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -489,11 +489,9 @@ choose_next_subplan_for_worker(AppendState *node)
}

/* Pick the plan we found, and advance pa_next_plan one more time. */
-   node->as_whichplan = pstate->pa_next_plan;
+   node->as_whichplan = pstate->pa_next_plan++;
if (pstate->pa_next_plan == node->as_nplans)
pstate->pa_next_plan = append->first_partial_plan;
-   else
-   pstate->pa_next_plan++;

/* If non-partial, immediately mark as finished. */
if (node->as_whichplan < append->first_partial_plan)

The attached patch makes the same changes to Amit's ParallelAppend_v19_rebased.patch.

Regards,
Amul


fix_crash.patch
Description: Binary data


Re: [HACKERS] Parallel Append implementation

2017-11-21 Thread Amit Khandekar
On 21 November 2017 at 12:44, Rafia Sabih  wrote:
> On Mon, Nov 13, 2017 at 12:54 PM, Amit Khandekar  
> wrote:
>> Thanks a lot Robert for the patch. I will have a look. Quickly tried
>> to test some aggregate queries with a partitioned pgbench_accounts
>> table, and it is crashing. Will get back with the fix, and any other
>> review comments.
>>
>> Thanks
>> -Amit Khandekar
>
> I was trying to get the performance of this patch at commit id -
> 11e264517dff7a911d9e6494de86049cab42cde3 and TPC-H scale factor 20
> with the following parameter settings,
> work_mem = 1 GB
> shared_buffers = 10GB
> effective_cache_size = 10GB
> max_parallel_workers_per_gather = 4
> enable_partitionwise_join = on
>
> and the details of the partitioning scheme is as follows,
> tables partitioned = lineitem on l_orderkey and orders on o_orderkey
> number of partitions in each table = 10
>
> As per the explain outputs PA was used in following queries- 1, 3, 4,
> 5, 6, 7, 8, 10, 12, 14, 15, 18, and 21.
> Unfortunately, at the time of executing any of these query, it is
> crashing with the following information in  core dump of each of the
> workers,
>
> Program terminated with signal 11, Segmentation fault.
> #0  0x10600984 in pg_atomic_read_u32_impl (ptr=0x3ec29294)
> at ../../../../src/include/port/atomics/generic.h:48
> 48 return ptr->value;
>
> In case this a different issue as you pointed upthread, you may want
> to have a look at this as well.
> Please let me know if you need any more information in this regard.

Right, for me the crash had occurred with a similar stack, although
the real crash happened in one of the workers. Attached is the script
file
pgbench_partitioned.sql to create a schema with which I had reproduced
the crash.

The query that crashed:
select sum(aid), avg(aid) from pgbench_accounts;

Set max_parallel_workers_per_gather to at least 5.

Also attached is v19 patch rebased.

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company


pgbench_partitioned.sql
Description: Binary data


ParallelAppend_v19_rebased.patch
Description: Binary data


Re: [HACKERS] Parallel Append implementation

2017-11-20 Thread Rafia Sabih
On Mon, Nov 13, 2017 at 12:54 PM, Amit Khandekar  wrote:
> Thanks a lot Robert for the patch. I will have a look. Quickly tried
> to test some aggregate queries with a partitioned pgbench_accounts
> table, and it is crashing. Will get back with the fix, and any other
> review comments.
>
> Thanks
> -Amit Khandekar

I was trying to get the performance of this patch at commit id -
11e264517dff7a911d9e6494de86049cab42cde3 and TPC-H scale factor 20
with the following parameter settings,
work_mem = 1 GB
shared_buffers = 10GB
effective_cache_size = 10GB
max_parallel_workers_per_gather = 4
enable_partitionwise_join = on

and the details of the partitioning scheme are as follows:
tables partitioned = lineitem on l_orderkey and orders on o_orderkey
number of partitions in each table = 10

As per the explain outputs, PA was used in the following queries: 1, 3, 4,
5, 6, 7, 8, 10, 12, 14, 15, 18, and 21.
Unfortunately, while executing any of these queries, it crashes with the
following information in the core dump of each of the
workers:

Program terminated with signal 11, Segmentation fault.
#0  0x10600984 in pg_atomic_read_u32_impl (ptr=0x3ec29294)
at ../../../../src/include/port/atomics/generic.h:48
48 return ptr->value;

In case this is a different issue from the one you pointed out upthread, you may want
to have a look at this as well.
Please let me know if you need any more information in this regard.



-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/