On 2024-07-03 02:07, Fujii Masao wrote:
Thanks for your comments!

On 2024/01/26 18:49, torikoshia wrote:
Hi,

9e2d870 enabled the COPY command to skip soft errors, and I think we can add another option that specifies the maximum tolerable number of soft errors.

I remember this was discussed in [1], and feel it would be useful when loading 'dirty' data but there is a limit to how dirty it can be.

Attached a patch for this.

What do you think?

The patch no longer applies cleanly to HEAD. Could you update it?

I'm going to update it after discussing the option format as described below.


I think the REJECT_LIMIT feature is useful. Allowing it to be set as
either the absolute number of skipped rows or a percentage of the
total input rows is a good idea.

However, if we support REJECT_LIMIT, I'm not sure if the ON_ERROR
option is still necessary. REJECT_LIMIT seems to cover the same cases.
For instance, REJECT_LIMIT=infinity can act like ON_ERROR=ignore, and
REJECT_LIMIT=0 can act like ON_ERROR=stop.

I agree that it's possible to use only REJECT_LIMIT without ON_ERROR.
I also think it's easy to understand that REJECT_LIMIT=0 is ON_ERROR=stop. However, expressing REJECT_LIMIT='infinity' needs some definition like "setting REJECT_LIMIT to -1 means 'infinity'", doesn't it? If so, I think this might not be so intuitive.
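
To make the equivalence concrete, here is a toy Python model. It is only a sketch of the semantics being discussed, not the patch and not how COPY is implemented; copy_rows, parse and reject_limit are names made up for illustration, and ValueError stands in for a soft error.

```python
import math


def copy_rows(rows, parse, reject_limit=0):
    """Parse each row, skipping failures until more than reject_limit occur.

    Hypothetical model only: ValueError plays the role of a soft error.
    """
    loaded = []
    skipped = 0
    for row in rows:
        try:
            loaded.append(parse(row))
        except ValueError:
            skipped += 1
            # Once the number of skipped rows exceeds the limit, give up
            # and let the error propagate, as ON_ERROR=stop would.
            if skipped > reject_limit:
                raise
    return loaded, skipped


rows = ["1", "x", "2", "y"]

# An unbounded limit never aborts, which behaves like ON_ERROR=ignore.
loaded, skipped = copy_rows(rows, int, reject_limit=math.inf)

# A limit of 0 aborts on the first bad row, which behaves like ON_ERROR=stop.
try:
    copy_rows(rows, int, reject_limit=0)
except ValueError:
    pass
```

In this model "infinity" is easy to write because Python has math.inf. An integer-valued option has no such value, which is why some sentinel like -1 would have to be defined.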

Also, since it seems Snowflake and Redshift both have options equivalent to REJECT_LIMIT and ON_ERROR, having both of them in PostgreSQL COPY might not be surprising:

- Snowflake's ON_ERROR accepts "CONTINUE | SKIP_FILE | SKIP_FILE_num | 'SKIP_FILE_num%' | ABORT_STATEMENT"[1]
- Redshift has MAXERROR and IGNOREALLERRORS options[2]

BTW, after seeing that Snowflake makes SKIP_FILE_num one of the options of ON_ERROR, I'm wondering whether REJECT_LIMIT should be handled the same way.


[1] https://docs.snowflake.com/en/sql-reference/sql/copy-into-table#copy-options-copyoptions
[2] https://docs.aws.amazon.com/en_en/redshift/latest/dg/copy-parameters-data-load.html

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

