On 2022-08-15 22:23, Damir Belyalov wrote:
I expected I could see 9999 rows after COPY, but just saw 999 rows.
Also I implemented your case and it worked correctly.
Thanks for the new patch!
Here are some comments on it.
+ if (safecstate->saved_tuples < REPLAY_BUFFER_SIZE)
+ {
+ valid_row = NextCopyFrom(cstate, econtext, values,
nulls);
+ if (valid_row)
+ {
+ /* Fill replay_buffer in oldcontext*/
+ MemoryContextSwitchTo(safecstate->oldcontext);
+
safecstate->replay_buffer[safecstate->saved_tuples++] =
heap_form_tuple(RelationGetDescr(cstate->rel), values, nulls);
+ /* Buffer was filled, commit subtransaction and prepare
to replay */
+ ReleaseCurrentSubTransaction();
What is actually being committed by this ReleaseCurrentSubTransaction()?
It seems to me that just safecstate->replay_buffer is fulfilled before
this commit.
As a test, I rewrote this ReleaseCurrentSubTransaction() to
RollbackAndReleaseCurrentSubTransaction() and COPYed over 1000 rows of
data, but same data were loaded.
+#define REPLAY_BUFFER_SIZE 1000
I feel it might be better to have it as a parameter rather than fixed at
1000.
+/*
+ * Analog of NextCopyFrom() but ignore rows with errors while copying.
+ */
+static bool
+safeNextCopyFrom(CopyFromState cstate, ExprContext *econtext, Datum
*values, bool *nulls)
NextCopyFrom() is in copyfromparse.c while safeNextCopyFrom() is in
copyfrom.c.
Since safeNextCopyFrom() is analog of NextCopyFrom() as commented, would
it be natural to put them in the same file?
188 + safecstate->errors++;
189 + if (safecstate->errors <= 100)
190 + ereport(WARNING,
191 + (errcode(errdata->sqlerrcode),
192 + errmsg("%s", errdata->context)));
It would be better to state in the manual that errors exceeding 100 are
not displayed.
Or, it might be acceptable to output all errors that exceed 100.
+typedef struct SafeCopyFromState
+{
+#define REPLAY_BUFFER_SIZE 1000
+ HeapTuple replay_buffer[REPLAY_BUFFER_SIZE]; /* accumulates
tuples for replaying it after an error */
+ int saved_tuples; /* # of tuples in
replay_buffer */
+ int replayed_tuples; /* # of tuples was
replayed from buffer */
+ int errors; /* total # of errors */
+ bool replay_is_active;
+ bool begin_subtransaction;
+ bool processed_remaining_tuples; /* for case of
replaying last tuples */
+ bool skip_row;
It would be helpful to add comments about skip_row, etc.
```
$ git apply ../patch/0003-COPY_IGNORE_ERRORS.patch
../patch/0003-COPY_IGNORE_ERRORS.patch:86: indent with spaces.
Datum *values, bool *nulls);
warning: 1 line adds whitespace errors.
```
There was a warning when applying the patch.
```
=# copy test from '/tmp/10000.data' with (ignore_errors);
WARNING: FIND 0 ERRORS
COPY 1003
```
When there were no errors, this WARNING seems not necessary.
--
Regards,
--
Atsushi Torikoshi
NTT DATA CORPORATION