On Fri, Jan 26, 2024 at 10:44 PM jian he <jian.universal...@gmail.com> wrote:

I doubt the following part:
      If the <literal>log</literal> value is specified,
      <command>COPY</command> behaves the same as
<literal>ignore</literal>, exept that
      it logs information that should have resulted in errors to
PostgreSQL log at
      <literal>INFO</literal> level.

I think it does something like:
When an error happens, cstate->escontext->error_data->elevel will be ERROR
you manually change the cstate->escontext->error_data->elevel to LOG,
then you call ThrowErrorData.

but it's not related to `<literal>INFO</literal> level`?
my log_min_messages is default, warning.

Thanks!

Modified them to NOTICE in accordance with the following summary message:
NOTICE:  x row was skipped due to data type incompatibility


On 2024-01-27 00:43, David G. Johnston wrote:
On Thu, Jan 25, 2024 at 9:42 AM torikoshia
<torikos...@oss.nttdata.com> wrote:

Hi,

As described in 9e2d870119, COPY ON_EEOR is expected to have more
"error_action".
(Note that option name was changed by b725b7eec)

I'd like to have a new option "log", which skips soft errors and
logs
information that should have resulted in errors to PostgreSQL log.

Seems like an easy win but largely unhelpful in the typical case.  I
suppose ETL routines using this feature may be running on their
machine under root or "postgres" but in a system where they are not
this very useful information is inaccessible to them.  I suppose the
DBA could set up an extractor to send these specific log lines
elsewhere but that seems like enough hassle to disfavor this approach
and favor one that can place the soft error data and feedback into
user-specified tables in the same database.  Setting up temporary
tables or unlogged tables probably is going to be a more acceptable
methodology than trying to get to the log files.

David J.

I agree that not a few people would prefer to store error information in tables and there have already been suggestions[1].

OTOH not everyone thinks saving table information is the best idea[2].

I think it would be desirable for ON_ERROR to be in a form that allows the user to choose where to store error information from among some options, such as table, log and file.

"ON_ERROR log" would be useful at least in the case of 'running on their machine under root or "postgres"' as you pointed out.


[1] https://www.postgresql.org/message-id/CACJufxEkkqnozdnvNMGxVAA94KZaCPkYw_Cx4JKG9ueNaZma_A%40mail.gmail.com

[2] https://www.postgresql.org/message-id/20231109002600.fuihn34bjqqgm...@awork3.anarazel.de

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation
From 5f44cc7525641302842a3d67c14ebb09615bf67b Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikos...@oss.nttdata.com>
Date: Mon, 29 Jan 2024 12:02:32 +0900
Subject: [PATCH v2] Add new error_action "log" to ON_ERROR option

Currently ON_ERROR option only has "ignore" to skip malformed data and
there are no ways to know where and why COPY skipped them.

"log" skips malformed data as well as "ignore", but it logs information that
should have resulted in errors to PostgreSQL log.
---
 doc/src/sgml/ref/copy.sgml          |  9 +++++++--
 src/backend/commands/copy.c         |  4 +++-
 src/backend/commands/copyfrom.c     | 24 ++++++++++++++++++++----
 src/include/commands/copy.h         |  1 +
 src/test/regress/expected/copy2.out | 18 +++++++++++++-----
 src/test/regress/sql/copy2.sql      |  9 +++++++++
 6 files changed, 53 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..3d949f04a4 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -380,12 +380,17 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
      <para>
       Specifies which <replaceable class="parameter">
       error_action</replaceable> to perform when there is malformed data in the input.
-      Currently, only <literal>stop</literal> (default) and <literal>ignore</literal>
-      values are supported.
+      Currently, only <literal>stop</literal> (default), <literal>ignore</literal>
+      and <literal>log</literal> values are supported.
       If the <literal>stop</literal> value is specified,
       <command>COPY</command> stops operation at the first error.
       If the <literal>ignore</literal> value is specified,
       <command>COPY</command> skips malformed data and continues copying data.
+      If the <literal>log</literal> value is specified,
+      <command>COPY</command> behaves the same as <literal>ignore</literal>,
+      except that it logs information that should have resulted in errors to
+      <productname>PostgreSQL</productname> log at <literal>NOTICE</literal>
+      level.
       The option is allowed only in <command>COPY FROM</command>.
       Only <literal>stop</literal> value is allowed when
       using <literal>binary</literal> format.
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index cc0786c6f4..812ca63350 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -415,13 +415,15 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 		return COPY_ON_ERROR_STOP;
 
 	/*
-	 * Allow "stop", or "ignore" values.
+	 * Allow "stop", "ignore" or "log" values.
 	 */
 	sval = defGetString(def);
 	if (pg_strcasecmp(sval, "stop") == 0)
 		return COPY_ON_ERROR_STOP;
 	if (pg_strcasecmp(sval, "ignore") == 0)
 		return COPY_ON_ERROR_IGNORE;
+	if (pg_strcasecmp(sval, "log") == 0)
+		return COPY_ON_ERROR_LOG;
 
 	ereport(ERROR,
 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 1fe70b9133..f2438023c8 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1013,6 +1013,23 @@ CopyFrom(CopyFromState cstate)
 				 */
 				cstate->escontext->error_occurred = false;
 
+			else if (cstate->opts.on_error == COPY_ON_ERROR_LOG)
+			{
+				/* Adjust elevel so we don't jump out */
+				cstate->escontext->error_data->elevel = NOTICE;
+
+				/*
+				 * Despite the name, this won't raise an error since elevel is
+				 * NOTICE now.
+				 */
+				ThrowErrorData(cstate->escontext->error_data);
+
+				/* Initialize escontext in preparation for next soft error */
+				cstate->escontext->error_occurred = false;
+				cstate->escontext->details_wanted = true;
+				memset(cstate->escontext->error_data, 0, sizeof(ErrorData));
+			}
+
 			/* Report that this tuple was skipped by the ON_ERROR clause */
 			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
 										 ++skipped);
@@ -1462,12 +1479,11 @@ BeginCopyFrom(ParseState *pstate,
 		cstate->escontext->type = T_ErrorSaveContext;
 		cstate->escontext->error_occurred = false;
 
-		/*
-		 * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
-		 * options later
-		 */
+		/* Error Details are required except when "ignore" is specified */
 		if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
 			cstate->escontext->details_wanted = false;
+		else
+			cstate->escontext->details_wanted = true;
 	}
 	else
 		cstate->escontext = NULL;
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..c61ac2445f 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -38,6 +38,7 @@ typedef enum CopyOnErrorChoice
 {
 	COPY_ON_ERROR_STOP = 0,		/* immediately throw errors, default */
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
+	COPY_ON_ERROR_LOG,			/* save error to PostgreSQL log */
 } CopyOnErrorChoice;
 
 /*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 25c401ce34..6905d77de6 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -731,12 +731,20 @@ ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
 COPY check_ign_err FROM STDIN WITH (on_error ignore);
 NOTICE:  4 rows were skipped due to data type incompatibility
+COPY check_ign_err FROM STDIN WITH (on_error log);
+NOTICE:  invalid input syntax for type integer: "a"
+NOTICE:  value "8888888888" is out of range for type integer
+NOTICE:  invalid input syntax for type integer: "a"
+NOTICE:  invalid input syntax for type integer: ""
+NOTICE:  4 rows were skipped due to data type incompatibility
 SELECT * FROM check_ign_err;
- n |  m  | k 
----+-----+---
- 1 | {1} | 1
- 5 | {5} | 5
-(2 rows)
+ n  |  m   | k  
+----+------+----
+  1 | {1}  |  1
+  5 | {5}  |  5
+  6 | {6}  |  6
+ 10 | {10} | 10
+(4 rows)
 
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index b5e549e856..54e1bc7f91 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -516,6 +516,15 @@ a	{2}	2
 
 5	{5}	5
 \.
+
+COPY check_ign_err FROM STDIN WITH (on_error log);
+6	{6}	6
+a	{7}	7
+8	{8}	8888888888
+9	{a, 9}	9
+
+10	{10}	10
+\.
 SELECT * FROM check_ign_err;
 
 -- test datatype error that can't be handled as soft: should fail

base-commit: 08e6344fd6423210b339e92c069bb979ba4e7cd6
-- 
2.39.2

Reply via email to