On 2023-03-06 23:03, Daniel Gustafsson wrote:
On 28 Feb 2023, at 15:28, Damir Belyalov <dam.be...@gmail.com> wrote:

Tested patch on all cases: CIM_SINGLE, CIM_MULTI, CIM_MULTI_CONDITION. As expected it works.
Also added a description to copy.sgml and made a review on patch.
Thanks for your tests and improvements!

I added 'ignored_errors' integer parameter that should be output after the option is finished. All errors were added to the system logfile with full detailed context. Maybe it's better to log only error message.
Certainly.

FWIW, Greenplum has a similar construct (but which also logs the errors in the db) where data type errors are skipped as long as the number of errors don't exceed a reject limit. If the reject limit is reached then the COPY fails:

        LOG ERRORS [ SEGMENT REJECT LIMIT <count> [ ROWS | PERCENT ]]

IIRC the gist of this was to catch then the user copies the wrong input data or plain has a broken file. Rather than finding out after copying n rows which
are likely to be garbage the process can be restarted.

This version of the patch has a compiler error in the error message:

copyfrom.c: In function ‘CopyFrom’:
copyfrom.c:1008:29: error: format ‘%ld’ expects argument of type ‘long
int’, but argument 2 has type ‘uint64’ {aka ‘long long unsigned int’}
[-Werror=format=]
1008 | ereport(WARNING, errmsg("Errors: %ld", cstate->ignored_errors));
     |                          ^~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~
     |                                              |
     |                                              uint64 {aka long
long unsigned int}


On that note though, it seems to me that this error message leaves a bit to be
desired with regards to the level of detail.
+1.
I felt just logging "Error: %ld" would make people wonder the meaning of the %ld. Logging something like ""Error: %ld data type errors were found" might be clearer.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION


Reply via email to