Hi,

It looks like the parsing of newly introduced "PARALLEL" option for
COPY FROM command has an issue(in the
0002-Framework-for-leader-worker-in-parallel-copy.patch),
Mentioning ....PARALLEL '4ar2eteid'); would pass with 4 workers since
atoi() is being used for converting string to integer which just
returns 4, ignoring other strings.

I used strtol(), added error checks and introduced the error "
improper use of argument to option "parallel"" for the above cases.

        parallel '4ar2eteid');
ERROR:  improper use of argument to option "parallel"
LINE 5:         parallel '1\');

Along with the updated patch
0002-Framework-for-leader-worker-in-parallel-copy.patch, also
attaching all the latest patches from [1].

[1] - 
https://www.postgresql.org/message-id/CALj2ACW94icER3WrWapon7JkcX8j0TGRue5ycWMTEvgA3X7fOg%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

On Tue, Jun 23, 2020 at 12:22 PM vignesh C <vignes...@gmail.com> wrote:
>
> On Tue, Jun 23, 2020 at 8:07 AM vignesh C <vignes...@gmail.com> wrote:
> > I have attached the patch for the same with the fixes.
>
> The patches were not applying on the head, attached the patches that can be 
> applied on head.
> I have added a commitfest entry[1] for this feature.
>
> [1] - https://commitfest.postgresql.org/28/2610/
>
>
> On Tue, Jun 23, 2020 at 8:07 AM vignesh C <vignes...@gmail.com> wrote:
>>
>> Thanks Ashutosh For your review, my comments are inline.
>> On Fri, Jun 19, 2020 at 5:41 PM Ashutosh Sharma <ashu.coe...@gmail.com> 
>> wrote:
>> >
>> > Hi,
>> >
>> > I just got some time to review the first patch in the list i.e. 
>> > 0001-Copy-code-readjustment-to-support-parallel-copy.patch. As the patch 
>> > name suggests, it is just trying to reshuffle the existing code for COPY 
>> > command here and there. There is no extra changes added in the patch as 
>> > such, but still I do have some review comments, please have a look:
>> >
>> > 1) Can you please add some comments atop the new function 
>> > PopulateAttributes() describing its functionality in detail. Further, this 
>> > new function contains the code from BeginCopy() to set attribute level 
>> > options used with COPY FROM such as FORCE_QUOTE, FORCE_NOT_NULL, 
>> > FORCE_NULL etc. in cstate and along with that it also copies the code from 
>> > BeginCopy() to set other infos such as client encoding type, encoding 
>> > conversion etc. Hence, I think it would be good to give it some better 
>> > name, basically something that matches with what actually it is doing.
>> >
>>
>> There is no new code added in this function, some part of code from
>> BeginCopy was made in to a new function as this part of code will also
>> be required for the parallel copy workers before the workers start the
>> actual copy operation. This code was made into a function to avoid
>> duplication. Changed the function name to PopulateGlobalsForCopyFrom &
>> added few comments.
>>
>> > 2) Again, the name for the new function CheckCopyFromValidity() doesn't 
>> > look good to me. From the function name it appears as if it does the 
>> > sanity check of the entire COPY FROM command, but actually it is just 
>> > doing the sanity check for the target relation specified with COPY FROM. 
>> > So, probably something like CheckTargetRelValidity would look more 
>> > sensible, I think? TBH, I am not good at naming the functions so you can 
>> > always ignore my suggestions about function and variable names :)
>> >
>>
>> Changed as suggested.
>> > 3) Any reason for not making CheckCopyFromValidity as a macro instead of a 
>> > new function. It is just doing the sanity check for the target relation.
>> >
>>
>> I felt there is reasonable number of lines in the function & it is not
>> in performance intensive path, so I preferred function over macro.
>> Your thoughts?
>>
>> > 4) Earlier in CopyReadLine() function while trying to clear the EOL marker 
>> > from cstate->line_buf.data (copied data), we were not checking if the line 
>> > read by CopyReadLineText() function is a header line or not, but I can see 
>> > that your patch checks that before clearing the EOL marker. Any reason for 
>> > this extra check?
>> >
>>
>> If you see the caller of CopyReadLine, i.e. NextCopyFromRawFields does
>> nothing for the header line, server basically calls CopyReadLine
>> again, it is a kind of small optimization. Anyway server is not going
>> to do anything with header line, I felt no need to clear EOL marker
>> for header lines.
>> /* on input just throw the header line away */
>> if (cstate->cur_lineno == 0 && cstate->header_line)
>> {
>> cstate->cur_lineno++;
>> if (CopyReadLine(cstate))
>> return false; /* done */
>> }
>>
>> cstate->cur_lineno++;
>>
>> /* Actually read the line into memory here */
>> done = CopyReadLine(cstate);
>> I think no need to make a fix for this. Your thoughts?
>>
>> > 5) I noticed the below spurious line removal in the patch.
>> >
>> > @@ -3839,7 +3953,6 @@ static bool
>> >  CopyReadLine(CopyState cstate)
>> >  {
>> >     bool        result;
>> > -
>> >
>>
>> Fixed.
>> I have attached the patch for the same with the fixes.
>> Thoughts?
>>
>> Regards,
>> Vignesh
>> EnterpriseDB: http://www.enterprisedb.com


Reply via email to