Re: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
On Thu, Jul 28, 2022 at 8:43 AM Julien Rouhaud wrote: > Could you send a rebased version? In the meantime I will switch the entry to > Waiting on Author. By request in [1] I'm marking this Returned with Feedback for now. Whenever you're ready, you can resurrect the patch entry by visiting https://commitfest.postgresql.org/38/3143/ and changing the status to "Needs Review", and then changing the status again to "Move to next CF". (Don't forget the second step; hopefully we will have streamlined this in the near future!) Thanks, --Jacob [1] https://www.postgresql.org/message-id/OS0PR01MB571696D623F35A09AB51903A94969%40OS0PR01MB5716.jpnprd01.prod.outlook.com
Re: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
Hi, On Thu, Sep 09, 2021 at 02:12:08AM +, houzj.f...@fujitsu.com wrote: > > Attach new version patch set which remove the workaround patch. This version of the patchset doesn't apply anymore: http://cfbot.cputube.org/patch_36_3143.log === Applying patches on top of PostgreSQL commit ID a18b6d2dc288dfa6e7905ede1d4462edd6a8af47 === === applying patch ./v19-0001-CREATE-ALTER-TABLE-PARALLEL-DML.patch [...] patching file src/backend/commands/tablecmds.c Hunk #1 FAILED at 40. Hunk #2 succeeded at 624 (offset 21 lines). Hunk #3 succeeded at 670 (offset 21 lines). Hunk #4 succeeded at 947 (offset 19 lines). Hunk #5 succeeded at 991 (offset 19 lines). Hunk #6 succeeded at 4256 (offset 40 lines). Hunk #7 succeeded at 4807 (offset 40 lines). Hunk #8 succeeded at 5217 (offset 40 lines). Hunk #9 succeeded at 6193 (offset 42 lines). Hunk #10 succeeded at 19278 (offset 465 lines). 1 out of 10 hunks FAILED -- saving rejects to file src/backend/commands/tablecmds.c.rej [...] patching file src/bin/pg_dump/pg_dump.c Hunk #1 FAILED at 6253. Hunk #2 FAILED at 6358. Hunk #3 FAILED at 6450. Hunk #4 FAILED at 6503. Hunk #5 FAILED at 6556. Hunk #6 FAILED at 6609. Hunk #7 FAILED at 6660. Hunk #8 FAILED at 6708. Hunk #9 FAILED at 6756. Hunk #10 FAILED at 6803. Hunk #11 FAILED at 6872. Hunk #12 FAILED at 6927. Hunk #13 succeeded at 15524 (offset -1031 lines). 12 out of 13 hunks FAILED -- saving rejects to file src/bin/pg_dump/pg_dump.c.rej [...] patching file src/bin/psql/describe.c Hunk #1 succeeded at 1479 (offset -177 lines). Hunk #2 succeeded at 1493 (offset -177 lines). Hunk #3 succeeded at 1631 (offset -241 lines). Hunk #4 succeeded at 3374 (offset -277 lines). Hunk #5 succeeded at 3731 (offset -310 lines). Hunk #6 FAILED at 4109. 1 out of 6 hunks FAILED -- saving rejects to file src/bin/psql/describe.c.rej Could you send a rebased version? In the meantime I will switch the entry to Waiting on Author.
RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
On August 2, 2021 2:04 PM Greg Nancarrow wrote: > On Mon, Aug 2, 2021 at 2:52 PM Amit Kapila wrote: > > > > On Fri, Jul 30, 2021 at 6:53 PM houzj.f...@fujitsu.com > > wrote: > > > > > > On Friday, July 30, 2021 2:52 PM Greg Nancarrow > wrote: > > > > On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila > wrote: > > > > > > > > > > > Besides, I think we need a new default value about parallel > > > > > > dml safety. Maybe 'auto' or 'null'(different from > > > > > > safe/restricted/unsafe). Because, user is likely to alter the > > > > > > safety to the default value to get the automatic safety check, > > > > > > a independent default value can make it more clear. > > > > > > > > > > > > > > > > Hmm, but auto won't work for partitioned tables, right? If so, > > > > > that might appear like an inconsistency to the user and we need > > > > > to document the same. Let me summarize the discussion so far in > > > > > this thread so that it is helpful to others. > > > > > > > > > > > > > To avoid that inconsistency, UNSAFE could be the default for > > > > partitioned tables (and we would disallow setting AUTO for these). > > > > So then AUTO is the default for non-partitioned tables only. > > > > > > I think this approach is reasonable, +1. > > > > > > > I see the need to change to default via Alter Table but I am not sure > > if Auto is the most appropriate way to handle that. How about using > > DEFAULT itself as we do in the case of REPLICA IDENTITY? So, if users > > have to alter parallel safety value to default, they need to just say > > Parallel DML DEFAULT. The default would mean automatic behavior for > > non-partitioned relations and ignore parallelism for partitioned > > tables. > > > > Hmm, I'm not so sure I'm sold on that. > I personally think "DEFAULT" here is vague, and users then need to know what > DEFAULT equates to, based on the type of table (partitioned or non-partitioned > table). > Also, then there's two ways to set the actual "default" DML parallel-safety > for > partitioned tables: DEFAULT or UNSAFE. > At least "AUTO" is a meaningful default option name for non-partitioned tables > - "automatic" parallel-safety checking, and the fact that it isn't the > default (and > can't be set) for partitioned tables highlights the difference in the way > being > proposed to treat them (i.e. use automatic checking only for non-partitioned > tables). > I'd be interested to hear what others think. > I think a viable alternative would be to record whether an explicit DML > parallel-safety has been specified, and if not, apply default behavior (i.e. > by > default use automatic checking for non-partitioned tables and treat > partitioned > tables as UNSAFE). I'm just not sure whether this kind of distinction > (explicit vs > implicit default) has been used before in Postgres options. I think both approaches are fine, but using "DEFAULT" might has a disadvantage that if we somehow support automatic safety check for partitioned table in the future, then the meaning of "DEFAULT" for partitioned table will change from UNSAFE to automatic check. It could also bring some burden on the user to modify their sql script. Best regards, houzj
Re: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
On Mon, Aug 2, 2021 at 2:52 PM Amit Kapila wrote: > > On Fri, Jul 30, 2021 at 6:53 PM houzj.f...@fujitsu.com > wrote: > > > > On Friday, July 30, 2021 2:52 PM Greg Nancarrow wrote: > > > On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila > > > wrote: > > > > > > > > > Besides, I think we need a new default value about parallel dml > > > > > safety. Maybe 'auto' or 'null'(different from > > > > > safe/restricted/unsafe). Because, user is likely to alter the safety > > > > > to the default value to get the automatic safety check, a independent > > > > > default > > > > > value can make it more clear. > > > > > > > > > > > > > Hmm, but auto won't work for partitioned tables, right? If so, that > > > > might appear like an inconsistency to the user and we need to document > > > > the same. Let me summarize the discussion so far in this thread so > > > > that it is helpful to others. > > > > > > > > > > To avoid that inconsistency, UNSAFE could be the default for partitioned > > > tables > > > (and we would disallow setting AUTO for these). > > > So then AUTO is the default for non-partitioned tables only. > > > > I think this approach is reasonable, +1. > > > > I see the need to change to default via Alter Table but I am not sure > if Auto is the most appropriate way to handle that. How about using > DEFAULT itself as we do in the case of REPLICA IDENTITY? So, if users > have to alter parallel safety value to default, they need to just say > Parallel DML DEFAULT. The default would mean automatic behavior for > non-partitioned relations and ignore parallelism for partitioned > tables. > Hmm, I'm not so sure I'm sold on that. I personally think "DEFAULT" here is vague, and users then need to know what DEFAULT equates to, based on the type of table (partitioned or non-partitioned table). Also, then there's two ways to set the actual "default" DML parallel-safety for partitioned tables: DEFAULT or UNSAFE. At least "AUTO" is a meaningful default option name for non-partitioned tables - "automatic" parallel-safety checking, and the fact that it isn't the default (and can't be set) for partitioned tables highlights the difference in the way being proposed to treat them (i.e. use automatic checking only for non-partitioned tables). I'd be interested to hear what others think. I think a viable alternative would be to record whether an explicit DML parallel-safety has been specified, and if not, apply default behavior (i.e. by default use automatic checking for non-partitioned tables and treat partitioned tables as UNSAFE). I'm just not sure whether this kind of distinction (explicit vs implicit default) has been used before in Postgres options. Regards, Greg Nancarrow Fujitsu Australia
Re: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
On Fri, Jul 30, 2021 at 6:53 PM houzj.f...@fujitsu.com wrote: > > On Friday, July 30, 2021 2:52 PM Greg Nancarrow wrote: > > On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila wrote: > > > > > > > Besides, I think we need a new default value about parallel dml > > > > safety. Maybe 'auto' or 'null'(different from > > > > safe/restricted/unsafe). Because, user is likely to alter the safety > > > > to the default value to get the automatic safety check, a independent > > > > default > > > > value can make it more clear. > > > > > > > > > > Hmm, but auto won't work for partitioned tables, right? If so, that > > > might appear like an inconsistency to the user and we need to document > > > the same. Let me summarize the discussion so far in this thread so > > > that it is helpful to others. > > > > > > > To avoid that inconsistency, UNSAFE could be the default for partitioned > > tables > > (and we would disallow setting AUTO for these). > > So then AUTO is the default for non-partitioned tables only. > > I think this approach is reasonable, +1. > I see the need to change to default via Alter Table but I am not sure if Auto is the most appropriate way to handle that. How about using DEFAULT itself as we do in the case of REPLICA IDENTITY? So, if users have to alter parallel safety value to default, they need to just say Parallel DML DEFAULT. The default would mean automatic behavior for non-partitioned relations and ignore parallelism for partitioned tables. -- With Regards, Amit Kapila.
RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
On Friday, July 30, 2021 2:52 PM Greg Nancarrow wrote: > On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila wrote: > > > > > Besides, I think we need a new default value about parallel dml > > > safety. Maybe 'auto' or 'null'(different from > > > safe/restricted/unsafe). Because, user is likely to alter the safety > > > to the default value to get the automatic safety check, a independent > > > default > > > value can make it more clear. > > > > > > > Hmm, but auto won't work for partitioned tables, right? If so, that > > might appear like an inconsistency to the user and we need to document > > the same. Let me summarize the discussion so far in this thread so > > that it is helpful to others. > > > > To avoid that inconsistency, UNSAFE could be the default for partitioned > tables > (and we would disallow setting AUTO for these). > So then AUTO is the default for non-partitioned tables only. I think this approach is reasonable, +1. Best regards, houzj
Re: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila wrote: > > > Besides, I think we need a new default value about parallel dml safety. > > Maybe > > 'auto' or 'null'(different from safe/restricted/unsafe). Because, user is > > likely to alter the safety to the default value to get the automatic safety > > check, a independent default value can make it more clear. > > > > Hmm, but auto won't work for partitioned tables, right? If so, that > might appear like an inconsistency to the user and we need to document > the same. Let me summarize the discussion so far in this thread so > that it is helpful to others. > To avoid that inconsistency, UNSAFE could be the default for partitioned tables (and we would disallow setting AUTO for these). So then AUTO is the default for non-partitioned tables only. Regards, Greg Nancarrow Fujitsu Australia