Re: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)

2022-07-28 Thread Jacob Champion
On Thu, Jul 28, 2022 at 8:43 AM Julien Rouhaud  wrote:
> Could you send a rebased version?  In the meantime I will switch the entry to
> Waiting on Author.

By request in [1] I'm marking this Returned with Feedback for now.
Whenever you're ready, you can resurrect the patch entry by visiting

https://commitfest.postgresql.org/38/3143/

and changing the status to "Needs Review", and then changing the
status again to "Move to next CF". (Don't forget the second step;
hopefully we will have streamlined this in the near future!)

Thanks,
--Jacob

[1] 
https://www.postgresql.org/message-id/OS0PR01MB571696D623F35A09AB51903A94969%40OS0PR01MB5716.jpnprd01.prod.outlook.com




Re: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)

2022-01-14 Thread Julien Rouhaud
Hi,

On Thu, Sep 09, 2021 at 02:12:08AM +, houzj.f...@fujitsu.com wrote:
> 
> Attach new version patch set which remove the workaround patch.

This version of the patchset doesn't apply anymore:

http://cfbot.cputube.org/patch_36_3143.log
=== Applying patches on top of PostgreSQL commit ID 
a18b6d2dc288dfa6e7905ede1d4462edd6a8af47 ===
=== applying patch ./v19-0001-CREATE-ALTER-TABLE-PARALLEL-DML.patch
[...]
patching file src/backend/commands/tablecmds.c
Hunk #1 FAILED at 40.
Hunk #2 succeeded at 624 (offset 21 lines).
Hunk #3 succeeded at 670 (offset 21 lines).
Hunk #4 succeeded at 947 (offset 19 lines).
Hunk #5 succeeded at 991 (offset 19 lines).
Hunk #6 succeeded at 4256 (offset 40 lines).
Hunk #7 succeeded at 4807 (offset 40 lines).
Hunk #8 succeeded at 5217 (offset 40 lines).
Hunk #9 succeeded at 6193 (offset 42 lines).
Hunk #10 succeeded at 19278 (offset 465 lines).
1 out of 10 hunks FAILED -- saving rejects to file 
src/backend/commands/tablecmds.c.rej
[...]
patching file src/bin/pg_dump/pg_dump.c
Hunk #1 FAILED at 6253.
Hunk #2 FAILED at 6358.
Hunk #3 FAILED at 6450.
Hunk #4 FAILED at 6503.
Hunk #5 FAILED at 6556.
Hunk #6 FAILED at 6609.
Hunk #7 FAILED at 6660.
Hunk #8 FAILED at 6708.
Hunk #9 FAILED at 6756.
Hunk #10 FAILED at 6803.
Hunk #11 FAILED at 6872.
Hunk #12 FAILED at 6927.
Hunk #13 succeeded at 15524 (offset -1031 lines).
12 out of 13 hunks FAILED -- saving rejects to file 
src/bin/pg_dump/pg_dump.c.rej
[...]
patching file src/bin/psql/describe.c
Hunk #1 succeeded at 1479 (offset -177 lines).
Hunk #2 succeeded at 1493 (offset -177 lines).
Hunk #3 succeeded at 1631 (offset -241 lines).
Hunk #4 succeeded at 3374 (offset -277 lines).
Hunk #5 succeeded at 3731 (offset -310 lines).
Hunk #6 FAILED at 4109.
1 out of 6 hunks FAILED -- saving rejects to file src/bin/psql/describe.c.rej

Could you send a rebased version?  In the meantime I will switch the entry to
Waiting on Author.




RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)

2021-08-01 Thread houzj.f...@fujitsu.com
On August 2, 2021 2:04 PM Greg Nancarrow  wrote:
> On Mon, Aug 2, 2021 at 2:52 PM Amit Kapila  wrote:
> >
> > On Fri, Jul 30, 2021 at 6:53 PM houzj.f...@fujitsu.com
> >  wrote:
> > >
> > > On Friday, July 30, 2021 2:52 PM Greg Nancarrow 
> wrote:
> > > > On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila 
> wrote:
> > > > >
> > > > > > Besides, I think we need a new default value about parallel
> > > > > > dml safety. Maybe 'auto' or 'null'(different from
> > > > > > safe/restricted/unsafe). Because, user is likely to alter the
> > > > > > safety to the default value to get the automatic safety check,
> > > > > > a independent default value can make it more clear.
> > > > > >
> > > > >
> > > > > Hmm, but auto won't work for partitioned tables, right? If so,
> > > > > that might appear like an inconsistency to the user and we need
> > > > > to document the same. Let me summarize the discussion so far in
> > > > > this thread so that it is helpful to others.
> > > > >
> > > >
> > > > To avoid that inconsistency, UNSAFE could be the default for
> > > > partitioned tables (and we would disallow setting AUTO for these).
> > > > So then AUTO is the default for non-partitioned tables only.
> > >
> > > I think this approach is reasonable, +1.
> > >
> >
> > I see the need to change to default via Alter Table but I am not sure
> > if Auto is the most appropriate way to handle that. How about using
> > DEFAULT itself as we do in the case of REPLICA IDENTITY? So, if users
> > have to alter parallel safety value to default, they need to just say
> > Parallel DML DEFAULT. The default would mean automatic behavior for
> > non-partitioned relations and ignore parallelism for partitioned
> > tables.
> >
> 
> Hmm, I'm not so sure I'm sold on that.
> I personally think "DEFAULT" here is vague, and users then need to know what
> DEFAULT equates to, based on the type of table (partitioned or non-partitioned
> table).
> Also, then there's two ways to set the actual "default" DML parallel-safety 
> for
> partitioned tables: DEFAULT or UNSAFE.
> At least "AUTO" is a meaningful default option name for non-partitioned tables
> - "automatic" parallel-safety checking, and the fact that it isn't the 
> default (and
> can't be set) for partitioned tables highlights the difference in the way 
> being
> proposed to treat them (i.e. use automatic checking only for non-partitioned
> tables).
> I'd be interested to hear what others think.
> I think a viable alternative would be to record whether an explicit DML
> parallel-safety has been specified, and if not, apply default behavior (i.e. 
> by
> default use automatic checking for non-partitioned tables and treat 
> partitioned
> tables as UNSAFE). I'm just not sure whether this kind of distinction 
> (explicit vs
> implicit default) has been used before in Postgres options.

I think both approaches are fine, but using "DEFAULT" might has a disadvantage
that if we somehow support automatic safety check for partitioned table in the
future, then the meaning of "DEFAULT" for partitioned table will change from
UNSAFE to automatic check. It could also bring some burden on the user to
modify their sql script.

Best regards,
houzj


Re: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)

2021-08-01 Thread Greg Nancarrow
On Mon, Aug 2, 2021 at 2:52 PM Amit Kapila  wrote:
>
> On Fri, Jul 30, 2021 at 6:53 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Friday, July 30, 2021 2:52 PM Greg Nancarrow  wrote:
> > > On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila  
> > > wrote:
> > > >
> > > > > Besides, I think we need a new default value about parallel dml
> > > > > safety. Maybe 'auto' or 'null'(different from
> > > > > safe/restricted/unsafe). Because, user is likely to alter the safety
> > > > > to the default value to get the automatic safety check, a independent 
> > > > > default
> > > > > value can make it more clear.
> > > > >
> > > >
> > > > Hmm, but auto won't work for partitioned tables, right? If so, that
> > > > might appear like an inconsistency to the user and we need to document
> > > > the same. Let me summarize the discussion so far in this thread so
> > > > that it is helpful to others.
> > > >
> > >
> > > To avoid that inconsistency, UNSAFE could be the default for partitioned 
> > > tables
> > > (and we would disallow setting AUTO for these).
> > > So then AUTO is the default for non-partitioned tables only.
> >
> > I think this approach is reasonable, +1.
> >
>
> I see the need to change to default via Alter Table but I am not sure
> if Auto is the most appropriate way to handle that. How about using
> DEFAULT itself as we do in the case of REPLICA IDENTITY? So, if users
> have to alter parallel safety value to default, they need to just say
> Parallel DML DEFAULT. The default would mean automatic behavior for
> non-partitioned relations and ignore parallelism for partitioned
> tables.
>

Hmm, I'm not so sure I'm sold on that.
I personally think "DEFAULT" here is vague, and users then need to
know what DEFAULT equates to, based on the type of table (partitioned
or non-partitioned table).
Also, then there's two ways to set the actual "default" DML
parallel-safety for partitioned tables: DEFAULT or UNSAFE.
At least "AUTO" is a meaningful default option name for
non-partitioned tables - "automatic" parallel-safety checking, and the
fact that it isn't the default (and can't be set) for partitioned
tables highlights the difference in the way being proposed to treat
them (i.e. use automatic checking only for non-partitioned tables).
I'd be interested to hear what others think.
I think a viable alternative would be to record whether an explicit
DML parallel-safety has been specified, and if not, apply default
behavior (i.e. by default use automatic checking for non-partitioned
tables and treat partitioned tables as UNSAFE). I'm just not sure
whether this kind of distinction (explicit vs implicit default) has
been used before in Postgres options.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)

2021-08-01 Thread Amit Kapila
On Fri, Jul 30, 2021 at 6:53 PM houzj.f...@fujitsu.com
 wrote:
>
> On Friday, July 30, 2021 2:52 PM Greg Nancarrow  wrote:
> > On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila  wrote:
> > >
> > > > Besides, I think we need a new default value about parallel dml
> > > > safety. Maybe 'auto' or 'null'(different from
> > > > safe/restricted/unsafe). Because, user is likely to alter the safety
> > > > to the default value to get the automatic safety check, a independent 
> > > > default
> > > > value can make it more clear.
> > > >
> > >
> > > Hmm, but auto won't work for partitioned tables, right? If so, that
> > > might appear like an inconsistency to the user and we need to document
> > > the same. Let me summarize the discussion so far in this thread so
> > > that it is helpful to others.
> > >
> >
> > To avoid that inconsistency, UNSAFE could be the default for partitioned 
> > tables
> > (and we would disallow setting AUTO for these).
> > So then AUTO is the default for non-partitioned tables only.
>
> I think this approach is reasonable, +1.
>

I see the need to change to default via Alter Table but I am not sure
if Auto is the most appropriate way to handle that. How about using
DEFAULT itself as we do in the case of REPLICA IDENTITY? So, if users
have to alter parallel safety value to default, they need to just say
Parallel DML DEFAULT. The default would mean automatic behavior for
non-partitioned relations and ignore parallelism for partitioned
tables.

-- 
With Regards,
Amit Kapila.




RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)

2021-07-30 Thread houzj.f...@fujitsu.com
On Friday, July 30, 2021 2:52 PM Greg Nancarrow  wrote:
> On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila  wrote:
> >
> > > Besides, I think we need a new default value about parallel dml
> > > safety. Maybe 'auto' or 'null'(different from
> > > safe/restricted/unsafe). Because, user is likely to alter the safety
> > > to the default value to get the automatic safety check, a independent 
> > > default
> > > value can make it more clear.
> > >
> >
> > Hmm, but auto won't work for partitioned tables, right? If so, that
> > might appear like an inconsistency to the user and we need to document
> > the same. Let me summarize the discussion so far in this thread so
> > that it is helpful to others.
> >
> 
> To avoid that inconsistency, UNSAFE could be the default for partitioned 
> tables
> (and we would disallow setting AUTO for these).
> So then AUTO is the default for non-partitioned tables only.

I think this approach is reasonable, +1.

Best regards,
houzj 


Re: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)

2021-07-29 Thread Greg Nancarrow
On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila  wrote:
>
> > Besides, I think we need a new default value about parallel dml safety. 
> > Maybe
> > 'auto' or 'null'(different from safe/restricted/unsafe). Because, user is
> > likely to alter the safety to the default value to get the automatic safety
> > check, a independent default value can make it more clear.
> >
>
> Hmm, but auto won't work for partitioned tables, right? If so, that
> might appear like an inconsistency to the user and we need to document
> the same. Let me summarize the discussion so far in this thread so
> that it is helpful to others.
>

To avoid that inconsistency, UNSAFE could be the default for
partitioned tables (and we would disallow setting AUTO for these).
So then AUTO is the default for non-partitioned tables only.

Regards,
Greg Nancarrow
Fujitsu Australia