> > > + /* Buffer was filled, commit subtransaction and prepare > to replay */ > > + ReleaseCurrentSubTransaction(); > What is actually being committed by this ReleaseCurrentSubTransaction()? > It seems to me that just safecstate->replay_buffer is fulfilled before > this commit. > All tuples are collected in replay_buffer, which is created in ''oldcontext'' of CopyFrom() (not context of a subtransaction). That's why it didn't clean up when you used RollbackAndReleaseCurrentSubTransaction(). Subtransactions are needed for better safety. There is no error when copying from a file to the replay_buffer, but an error may occur at the next stage - when adding a tuple to the table. Also there may be other errors that are not obvious at first glance.
I feel it might be better to have it as a parameter rather than fixed at > 1000. > Yes, I thought about it too. So I created another version with the GUC parameter - replay_buffer_size. Attached it. But I think users won't need to change replay_buffer_size. Also replay_buffer does the same thing as MultiInsert buffer does and MultiInsert buffer defined by const = 1000. NextCopyFrom() is in copyfromparse.c while safeNextCopyFrom() is in > copyfrom.c. > Since safeNextCopyFrom() is analog of NextCopyFrom() as commented, would > it be natural to put them in the same file? > Sure, corrected it. > > 188 + safecstate->errors++; > > 189 + if (safecstate->errors <= 100) > > 190 + ereport(WARNING, > > 191 + (errcode(errdata->sqlerrcode), > > 192 + errmsg("%s", errdata->context))); > > It would be better to state in the manual that errors exceeding 100 are > not displayed. > Or, it might be acceptable to output all errors that exceed 100. > It'll be too complicated to create a new parameter just for showing the given number of errors. I think 100 is an optimal size. > > +typedef struct SafeCopyFromState > > +{ > > +#define REPLAY_BUFFER_SIZE 1000 > > + HeapTuple replay_buffer[REPLAY_BUFFER_SIZE]; /* accumulates > > tuples for replaying it after an error */ > > + int saved_tuples; /* # of tuples in > > replay_buffer */ > > + int replayed_tuples; /* # of tuples was > > replayed from buffer */ > > + int errors; /* total # of errors */ > > + bool replay_is_active; > > + bool begin_subtransaction; > > + bool processed_remaining_tuples; /* for case of > > replaying last tuples */ > > + bool skip_row; > > It would be helpful to add comments about skip_row, etc. > Corrected it. > WARNING: FIND 0 ERRORS > When there were no errors, this WARNING seems not necessary. > Corrected it. Add to this patch processing other errors and constraints and tests for them. I had to create another function safeExecConstraints() only for processing constraints, because ExecConstraints() is after NextCopyFrom() and is not in PG_TRY. This thing a little bit complicated the code. Maybe it is a good approach to create a new function SafeCopyFrom() and do all ''safe copying'' in PG_TRY, but it will almost duplicate the CopyFrom() code. Or maybe create a function only for loop for(;;). But we have the same thing with duplicating code and passing a lot of variables (which are created at the beginning of CopyFrom()) to this function.