Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

Damir Belyalov Wed, 24 Aug 2022 09:54:42 -0700

>
> > +               /* Buffer was filled, commit subtransaction and prepare
> to replay */
> > +               ReleaseCurrentSubTransaction();
> What is actually being committed by this ReleaseCurrentSubTransaction()?
> It seems to me that just safecstate->replay_buffer is fulfilled before
> this commit.
>
All tuples are collected in replay_buffer, which is created in
''oldcontext'' of CopyFrom() (not context of a subtransaction). That's why
it didn't clean up when you used RollbackAndReleaseCurrentSubTransaction().
Subtransactions are needed for better safety. There is no error when
copying from a file to the replay_buffer, but an error may occur at the
next stage - when adding a tuple to the table. Also there may be other
errors that are not obvious at first glance.


I feel it might be better to have it as a parameter rather than fixed at
> 1000.
>
Yes, I thought about it too. So I created another version with the GUC
parameter - replay_buffer_size. Attached it. But I think users won't need
to change replay_buffer_size.
Also replay_buffer does the same thing as MultiInsert buffer does and
MultiInsert buffer defined by const = 1000.

NextCopyFrom() is in copyfromparse.c while safeNextCopyFrom() is in
> copyfrom.c.
> Since safeNextCopyFrom() is analog of NextCopyFrom() as commented, would
> it be natural to put them in the same file?
>
Sure, corrected it.


> > 188 +               safecstate->errors++;
> > 189 +               if (safecstate->errors <= 100)
> > 190 +                   ereport(WARNING,
> > 191 +                           (errcode(errdata->sqlerrcode),
> > 192 +                           errmsg("%s", errdata->context)));
>
> It would be better to state in the manual that errors exceeding 100 are
> not displayed.
> Or, it might be acceptable to output all errors that exceed 100.
>
It'll be too complicated to create a new parameter just for showing the
given number of errors. I think 100 is an optimal size.


> > +typedef struct SafeCopyFromState
> > +{
> > +#define            REPLAY_BUFFER_SIZE 1000
> > +   HeapTuple       replay_buffer[REPLAY_BUFFER_SIZE]; /* accumulates
> > tuples for replaying it after an error */
> > +   int             saved_tuples;               /* # of tuples in
> > replay_buffer */
> > +   int             replayed_tuples;            /* # of tuples was
> > replayed from buffer */
> > +   int             errors;                     /* total # of errors */
> > +   bool            replay_is_active;
> > +   bool            begin_subtransaction;
> > +   bool            processed_remaining_tuples; /* for case of
> > replaying last tuples */
> > +   bool            skip_row;
>
> It would be helpful to add comments about skip_row, etc.
>
Corrected it.


> WARNING:  FIND 0 ERRORS
> When there were no errors, this WARNING seems not necessary.
>
Corrected it.

Add to this patch processing other errors and constraints and tests for
them.
I had to create another function safeExecConstraints() only for processing
constraints, because ExecConstraints() is after NextCopyFrom() and is not
in PG_TRY. This thing a little bit complicated the code.
Maybe it is a good approach to create a new function SafeCopyFrom() and do
all ''safe copying'' in PG_TRY, but it will almost duplicate the CopyFrom()
code.
Or maybe create a function only for loop for(;;). But we have the same
thing with duplicating code and passing a lot of variables (which are
created at the beginning of CopyFrom()) to this function.

Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

Reply via email to