Hi,

On Thu, Mar 26, 2026 at 6:36 PM Sutou Kouhei <[email protected]> wrote:
>
> Hi,
>
> In <cad21aoclxuhq0ubjdkxvcetjbcff13ru_7u-qrrsu+0ppuq...@mail.gmail.com>
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Thu, 18 Dec 2025 15:43:07 -0800,
>   Masahiko Sawada <[email protected]> wrote:
>
> > Looking at these results, it seems that 0001-from-binary cases and
> > 0006-to-binary cases are slower throughout the six results?
>
> Good point. I didn't notice them. But I feel that it's not
> related to the patch set. Because 0001 doesn't change COPY
> FROM related code. 0001 just changes COPY TO related
> code. And 0006 just adds tests. 0006 doesn't change
> implementations.
>
>
> BTW, how to proceed this proposal? It seems that we can't
> proceed this proposal without PostgreSQL committers'
> attentions but it seems that it's difficult.

Sorry for going quiet on this for a while -- I haven't had time to
work on it until now.

After more thought, I'd like to keep the custom-format changes to the
bare minimum and not disturb the existing built-in format processing.

In particular, I've dropped the earlier rework that split
CopyToStateData / CopyFromStateData to hide built-in-specific fields
from extensions. That was my own idea, but I no longer think it pays
off: the fields it hid (raw_buf, line_buf, the input buffers, etc.)
are only ever used by the built-in text/CSV/binary parsers, and a
custom format never touches them -- so visible or not, nothing depends
on them, while splitting the struct is invasive to the existing format
processing. Touching the Copy state structs is fine in itself; it's
the hiding that wasn't worth the cost.

Instead, each state struct just gets one opaque pointer for a custom
format to keep its own state, and the existing code paths are left
alone.
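
As a rough illustration of the opaque-pointer idea, here is a standalone
sketch; it is not code from the patches. DemoCopyToState stands in for
CopyToStateData, and DemoFormatState and the demo_* functions are
hypothetical names. A real handler would allocate with palloc() in a
suitable memory context instead of calloc(), and its End callback
returns void; the sketch returns the row count only to make the effect
visible.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for CopyToStateData: only the new opaque slot is modeled. */
typedef struct DemoCopyToState
{
	void	   *format_private; /* owned entirely by the custom format */
} DemoCopyToState;

/* Hypothetical per-format state kept behind the opaque pointer. */
typedef struct DemoFormatState
{
	uint64_t	rows_written;
} DemoFormatState;

/* Start: allocate the format's own state and hang it on the COPY state. */
static void
demo_copy_to_start(DemoCopyToState *cstate)
{
	DemoFormatState *st = calloc(1, sizeof(DemoFormatState));

	assert(st != NULL);
	cstate->format_private = st;
}

/* OneRow: the format reads and updates only its own state. */
static void
demo_copy_to_one_row(DemoCopyToState *cstate)
{
	DemoFormatState *st = cstate->format_private;

	st->rows_written++;
}

/* End: release the state; returns the row count for demonstration only. */
static uint64_t
demo_copy_to_end(DemoCopyToState *cstate)
{
	DemoFormatState *st = cstate->format_private;
	uint64_t	rows = st->rows_written;

	free(st);
	cstate->format_private = NULL;
	return rows;
}
```

Nothing in the sketch looks at any other field of the state struct,
which is the point: the built-in fields can stay where they are.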

Updated patches attached:

- 0001 moves CopyFromStateData and CopyToStateData to a new
copy_state.h, so extensions can implement their routines without
including the *_internal.h headers. It also drops file_fdw.c's
dependency on copyfrom_internal.h.
- 0002 introduces the registration API and the opaque per-format
pointer in both structs.
- 0003 adds a callback to validate the COPY options as a whole, called
after all options are processed.
- 0004 adds the regression tests.
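
To make the option flow concrete, here is a standalone model of what
0002 does in ProcessCopyOptions(); again a sketch with made-up names
(DemoOption, DemoFormat, demo_*), not the patch code. For brevity it
treats every option other than "format" as unknown to core, and it
returns the offending option name instead of calling ereport().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for DefElem: just a name/value pair. */
typedef struct DemoOption
{
	const char *name;
	const char *value;
} DemoOption;

/* Stand-in for ProcessOneOptionFn: true if the format claims the option. */
typedef bool (*DemoOptionFn) (const DemoOption *opt);

typedef struct DemoFormat
{
	const char *name;
	DemoOptionFn option_fn;		/* NULL: format takes no options */
} DemoFormat;

#define DEMO_MAX_FORMATS 8
#define DEMO_MAX_DEFERRED 16

static DemoFormat demo_formats[DEMO_MAX_FORMATS];
static int	demo_nformats = 0;

/* Models registration: reject duplicates, then append. */
static bool
demo_register_format(const char *name, DemoOptionFn option_fn)
{
	assert(name != NULL && name[0] != '\0');
	for (int i = 0; i < demo_nformats; i++)
		if (strcmp(demo_formats[i].name, name) == 0)
			return false;
	if (demo_nformats >= DEMO_MAX_FORMATS)
		return false;
	demo_formats[demo_nformats].name = name;
	demo_formats[demo_nformats].option_fn = option_fn;
	demo_nformats++;
	return true;
}

static const DemoFormat *
demo_lookup_format(const char *name)
{
	for (int i = 0; i < demo_nformats; i++)
		if (strcmp(demo_formats[i].name, name) == 0)
			return &demo_formats[i];
	return NULL;
}

/* Returns NULL if every option is accepted, else the first rejected name. */
static const char *
demo_process_options(const DemoOption *opts, int nopts)
{
	const DemoFormat *fmt = NULL;
	const DemoOption *deferred[DEMO_MAX_DEFERRED];
	int			ndeferred = 0;

	for (int i = 0; i < nopts; i++)
	{
		if (strcmp(opts[i].name, "format") == 0)
		{
			fmt = demo_lookup_format(opts[i].value);
			if (fmt == NULL)
				return opts[i].name;	/* unknown format */
		}
		else if (ndeferred < DEMO_MAX_DEFERRED)
			deferred[ndeferred++] = &opts[i];	/* decide after the loop */
		else
			return opts[i].name;
	}

	/* The format is known now, so the deferred options can be resolved. */
	for (int i = 0; i < ndeferred; i++)
	{
		if (fmt == NULL || fmt->option_fn == NULL ||
			!fmt->option_fn(deferred[i]))
			return deferred[i]->name;
	}
	return NULL;
}

/* Hypothetical format callback: claims only "max_attributes". */
static bool
demo_test_format_option(const DemoOption *opt)
{
	return strcmp(opt->name, "max_attributes") == 0;
}
```

Because unknown options are resolved only after the loop,
(max_attributes 5, format 'test_format') and
(format 'test_format', max_attributes 5) behave the same.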

I'd like to proceed in this direction barring objections. Feedback is
very welcome.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From 1684e2394e5557f94c17e39b94351a76e601e00d Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Mon, 22 Jun 2026 09:21:51 -0700
Subject: [PATCH v2 2/4] Allow extensions to register custom formats for COPY
 TO and COPY FROM.

Author:
Reviewed-by:
Discussion: https://postgr.es/m/
---
 src/backend/commands/Makefile     |   1 +
 src/backend/commands/copy.c       |  97 ++++++++++++++++++++--
 src/backend/commands/copyapi.c    | 131 ++++++++++++++++++++++++++++++
 src/backend/commands/copyfrom.c   |   4 +-
 src/backend/commands/copyto.c     |   4 +-
 src/backend/commands/meson.build  |   1 +
 src/include/commands/copy.h       |  19 +++++
 src/include/commands/copy_state.h |   6 ++
 src/include/commands/copyapi.h    |  37 +++++++++
 src/tools/pgindent/typedefs.list  |   1 +
 10 files changed, 290 insertions(+), 11 deletions(-)
 create mode 100644 src/backend/commands/copyapi.c

diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index 5b9d084977e..17b7aa08b55 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -23,6 +23,7 @@ OBJS = \
 	constraint.o \
 	conversioncmds.o \
 	copy.o \
+	copyapi.o \
 	copyfrom.o \
 	copyfromparse.o \
 	copyto.o \
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 003b70852bb..2fdba026ee0 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -23,6 +23,7 @@
 #include "access/xact.h"
 #include "catalog/pg_authid.h"
 #include "commands/copy.h"
+#include "commands/copyapi.h"
 #include "commands/defrem.h"
 #include "executor/executor.h"
 #include "mb/pg_wchar.h"
@@ -592,6 +593,14 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		force_array_specified = false;
 	ListCell   *option;
 
+	/*
+	 * Options not recognized by core are collected here and, once the format
+	 * is known, either handed to a custom format's option parser or rejected.
+	 */
+	List	   *deferred_options = NIL;
+	ProcessOneOptionFn custom_process_option_fn = NULL;
+	char	   *custom_format_name = NULL;
+
 	/* Support external use for option sanity checking */
 	if (opts_out == NULL)
 		opts_out = palloc0_object(CopyFormatOptions);
@@ -620,6 +629,13 @@ ProcessCopyOptions(ParseState *pstate,
 				opts_out->format = COPY_FORMAT_BINARY;
 			else if (strcmp(fmt, "json") == 0)
 				opts_out->format = COPY_FORMAT_JSON;
+			else if (GetCopyCustomFormatRoutines(fmt, &opts_out->to_routine,
+												 &opts_out->from_routine,
+												 &custom_process_option_fn))
+			{
+				opts_out->format = COPY_FORMAT_CUSTOM;
+				custom_format_name = fmt;
+			}
 			else
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -775,11 +791,54 @@ ProcessCopyOptions(ParseState *pstate,
 			opts_out->reject_limit = defGetCopyRejectLimitOption(defel);
 		}
 		else
+		{
+			/*
+			 * Not a core option.  Defer the check to after the loop as it may
+			 * belong to a custom format whose "format" option has not been
+			 * seen yet.
+			 */
+			deferred_options = lappend(deferred_options, defel);
+		}
+	}
+
+	/*
+	 * Now that the format and every option have been seen, resolve the
+	 * deferred options.
+	 */
+	if (deferred_options != NIL)
+	{
+		/*
+		 * For a custom format, they belong to the handler; for any built-in
+		 * (including the default) an unrecognized option is an error,
+		 * preserving the historical behavior relied on by external callers
+		 * such as file_fdw.
+		 */
+		if (opts_out->format != COPY_FORMAT_CUSTOM || custom_process_option_fn == NULL)
+		{
+			DefElem    *defel = linitial_node(DefElem, deferred_options);
+
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
 					 errmsg("option \"%s\" not recognized",
 							defel->defname),
 					 parser_errposition(pstate, defel->location)));
+		}
+
+		/*
+		 * Hand each option core did not recognize to the format's per-option
+		 * callback. Anything the format does not claim (or any option at all
+		 * if it has no callback) is an error, so an unrecognized option
+		 * always fails here.
+		 */
+		foreach_node(DefElem, opt, deferred_options)
+		{
+			if (!custom_process_option_fn(opts_out, is_from, opt))
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("COPY format \"%s\" does not accept option \"%s\"",
+								custom_format_name, opt->defname),
+						 parser_errposition(pstate, opt->location)));
+		}
 	}
 
 	/*
@@ -869,7 +928,7 @@ ProcessCopyOptions(ParseState *pstate,
 	 * future-proofing.  Likewise we disallow all digits though only octal
 	 * digits are actually dangerous.
 	 */
-	if (opts_out->format != COPY_FORMAT_CSV &&
+	if (CopyFormatBuiltins(opts_out->format) && opts_out->format != COPY_FORMAT_CSV &&
 		strchr("\\.abcdefghijklmnopqrstuvwxyz0123456789",
 			   opts_out->delim[0]) != NULL)
 		ereport(ERROR,
@@ -888,7 +947,8 @@ ProcessCopyOptions(ParseState *pstate,
 				: errmsg("cannot specify %s in JSON mode", "HEADER"));
 
 	/* Check quote */
-	if (opts_out->format != COPY_FORMAT_CSV && opts_out->quote != NULL)
+	if (CopyFormatBuiltins(opts_out->format) && opts_out->format != COPY_FORMAT_CSV &&
+		opts_out->quote != NULL)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 		/*- translator: %s is the name of a COPY option, e.g. ON_ERROR */
@@ -905,7 +965,8 @@ ProcessCopyOptions(ParseState *pstate,
 				 errmsg("COPY delimiter and quote must be different")));
 
 	/* Check escape */
-	if (opts_out->format != COPY_FORMAT_CSV && opts_out->escape != NULL)
+	if (CopyFormatBuiltins(opts_out->format) && opts_out->format != COPY_FORMAT_CSV &&
+		opts_out->escape != NULL)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 		/*- translator: %s is the name of a COPY option, e.g. ON_ERROR */
@@ -917,7 +978,8 @@ ProcessCopyOptions(ParseState *pstate,
 				 errmsg("COPY escape must be a single one-byte character")));
 
 	/* Check force_quote */
-	if (opts_out->format != COPY_FORMAT_CSV && (opts_out->force_quote || opts_out->force_quote_all))
+	if (CopyFormatBuiltins(opts_out->format) && opts_out->format != COPY_FORMAT_CSV &&
+		(opts_out->force_quote || opts_out->force_quote_all))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 		/*- translator: %s is the name of a COPY option, e.g. ON_ERROR */
@@ -931,8 +993,8 @@ ProcessCopyOptions(ParseState *pstate,
 						"COPY FROM")));
 
 	/* Check force_notnull */
-	if (opts_out->format != COPY_FORMAT_CSV && (opts_out->force_notnull != NIL ||
-												opts_out->force_notnull_all))
+	if (CopyFormatBuiltins(opts_out->format) && opts_out->format != COPY_FORMAT_CSV &&
+		(opts_out->force_notnull != NIL || opts_out->force_notnull_all))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 		/*- translator: %s is the name of a COPY option, e.g. ON_ERROR */
@@ -947,8 +1009,8 @@ ProcessCopyOptions(ParseState *pstate,
 						"COPY TO")));
 
 	/* Check force_null */
-	if (opts_out->format != COPY_FORMAT_CSV && (opts_out->force_null != NIL ||
-												opts_out->force_null_all))
+	if (CopyFormatBuiltins(opts_out->format) && opts_out->format != COPY_FORMAT_CSV &&
+		(opts_out->force_null != NIL || opts_out->force_null_all))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 		/*- translator: %s is the name of a COPY option, e.g. ON_ERROR */
@@ -995,7 +1057,8 @@ ProcessCopyOptions(ParseState *pstate,
 				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				errmsg("COPY %s is not supported for %s", "FORMAT JSON", "COPY FROM"));
 
-	if (opts_out->format != COPY_FORMAT_JSON && opts_out->force_array)
+	if (CopyFormatBuiltins(opts_out->format) && opts_out->format != COPY_FORMAT_JSON &&
+		opts_out->force_array)
 		ereport(ERROR,
 				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				errmsg("COPY %s can only be used with JSON mode", "FORCE_ARRAY"));
@@ -1048,6 +1111,22 @@ ProcessCopyOptions(ParseState *pstate,
 		 * ON_ERROR, third is the value of the COPY option, e.g. IGNORE */
 				 errmsg("COPY %s requires %s to be set to %s",
 						"REJECT_LIMIT", "ON_ERROR", "IGNORE")));
+
+	/* Check custom format routines */
+	if (opts_out->format == COPY_FORMAT_CUSTOM)
+	{
+		if (is_from && opts_out->from_routine == NULL)
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("COPY format \"%s\" cannot be used with COPY FROM",
+							custom_format_name)));
+
+		if (!is_from && opts_out->to_routine == NULL)
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("COPY format \"%s\" cannot be used with COPY TO",
+							custom_format_name)));
+	}
 }
 
 /*
diff --git a/src/backend/commands/copyapi.c b/src/backend/commands/copyapi.c
new file mode 100644
index 00000000000..168efbcf30b
--- /dev/null
+++ b/src/backend/commands/copyapi.c
@@ -0,0 +1,131 @@
+/*-------------------------------------------------------------------------
+ *
+ * copyapi.c
+ *	  Registry for pluggable COPY TO/FROM format handlers.
+ *
+ * The built-in formats (text, csv, binary, json) are dispatched directly by
+ * the COPY engine. Extensions can provide additional formats by registering
+ * a CopyToRoutine and/or CopyFromRoutine under a name from their _PG_init();
+ * ProcessCopyOptions() then resolves "COPY ... (FORMAT 'name')" against this
+ * registry.
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/backend/commands/copyapi.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "commands/copyapi.h"
+#include "utils/memutils.h"
+
+typedef struct CopyCustomFormatEntry
+{
+	const char *name;			/* constant string; never freed (see below) */
+	const CopyToRoutine *to_routine;
+	const CopyFromRoutine *from_routine;
+	ProcessOneOptionFn option_fn;
+} CopyCustomFormatEntry;
+
+static CopyCustomFormatEntry *CopyCustomFormatArray = NULL;
+static int	CopyCustomFormatsAssigned = 0;
+static int	CopyCustomFormatsAllocated = 0;
+
+/* Is 'name' one of the built-in format keywords? */
+static bool
+is_builtin_copy_format(const char *name)
+{
+	return (strcmp(name, "text") == 0 ||
+			strcmp(name, "csv") == 0 ||
+			strcmp(name, "binary") == 0 ||
+			strcmp(name, "json") == 0);
+}
+
+/*
+ * Register a custom COPY format. Intended to be called from an extension's
+ * _PG_init(). Either routine may be NULL if the format does not support that
+ * direction (but not both).
+ *
+ * 'option_fn' may also be NULL if the format takes no format-specific options.
+ *
+ * 'name' is assumed to be a constant string or allocated in storage that will
+ * never be freed; it is stored by reference.
+ */
+void
+RegisterCopyCustomFormat(const char *name, const CopyToRoutine *to,
+						 const CopyFromRoutine *from, ProcessOneOptionFn option_fn)
+{
+	Assert(name != NULL && name[0] != '\0');
+
+	/* Must support at least one direction */
+	Assert(to != NULL || from != NULL);
+
+	Assert(to == NULL ||
+		   (to->CopyToStart != NULL && to->CopyToOneRow != NULL &&
+			to->CopyToEnd != NULL));
+	Assert(from == NULL ||
+		   (from->CopyFromStart != NULL && from->CopyFromOneRow != NULL &&
+			from->CopyFromEnd != NULL));
+
+	/* Check if it's already used by built-in format names */
+	if (is_builtin_copy_format(name))
+		elog(ERROR, "COPY format \"%s\" is a built-in format name", name);
+
+	/* Reject a duplicate registration. */
+	for (int i = 0; i < CopyCustomFormatsAssigned; i++)
+	{
+		if (strcmp(CopyCustomFormatArray[i].name, name) == 0)
+			elog(ERROR, "COPY format \"%s\" is already registered", name);
+	}
+
+	/* Create the array on first use; it must outlive the current context. */
+	if (CopyCustomFormatArray == NULL)
+	{
+		CopyCustomFormatsAllocated = 16;
+		CopyCustomFormatArray = (CopyCustomFormatEntry *)
+			MemoryContextAlloc(TopMemoryContext,
+							   CopyCustomFormatsAllocated * sizeof(CopyCustomFormatEntry));
+	}
+
+	/* Expand if full. */
+	if (CopyCustomFormatsAssigned >= CopyCustomFormatsAllocated)
+	{
+		CopyCustomFormatsAllocated *= 2;
+		CopyCustomFormatArray = (CopyCustomFormatEntry *)
+			repalloc_array(CopyCustomFormatArray, CopyCustomFormatEntry, CopyCustomFormatsAllocated);
+	}
+
+	CopyCustomFormatArray[CopyCustomFormatsAssigned].name = name;
+	CopyCustomFormatArray[CopyCustomFormatsAssigned].to_routine = to;
+	CopyCustomFormatArray[CopyCustomFormatsAssigned].from_routine = from;
+	CopyCustomFormatArray[CopyCustomFormatsAssigned].option_fn = option_fn;
+	CopyCustomFormatsAssigned++;
+}
+
+/*
+ * Look up a previously registered custom format. Returns false if 'name' is
+ * not registered. Out-parameters may be NULL if not wanted.
+ */
+bool
+GetCopyCustomFormatRoutines(const char *name, const CopyToRoutine **to,
+							const CopyFromRoutine **from, ProcessOneOptionFn * option_fn)
+{
+	for (int i = 0; i < CopyCustomFormatsAssigned; i++)
+	{
+		if (strcmp(CopyCustomFormatArray[i].name, name) == 0)
+		{
+			if (to)
+				*to = CopyCustomFormatArray[i].to_routine;
+			if (from)
+				*from = CopyCustomFormatArray[i].from_routine;
+			if (option_fn)
+				*option_fn = CopyCustomFormatArray[i].option_fn;
+
+			return true;
+		}
+	}
+	return false;
+}
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 2c57b32f4de..69ec94c9ec1 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -158,7 +158,9 @@ static const CopyFromRoutine CopyFromRoutineBinary = {
 static const CopyFromRoutine *
 CopyFromGetRoutine(const CopyFormatOptions *opts)
 {
-	if (opts->format == COPY_FORMAT_CSV)
+	if (opts->format == COPY_FORMAT_CUSTOM)
+		return opts->from_routine;
+	else if (opts->format == COPY_FORMAT_CSV)
 		return &CopyFromRoutineCSV;
 	else if (opts->format == COPY_FORMAT_BINARY)
 		return &CopyFromRoutineBinary;
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index ef2038c9a5d..f897f23737f 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -130,7 +130,9 @@ static const CopyToRoutine CopyToRoutineBinary = {
 static const CopyToRoutine *
 CopyToGetRoutine(const CopyFormatOptions *opts)
 {
-	if (opts->format == COPY_FORMAT_CSV)
+	if (opts->format == COPY_FORMAT_CUSTOM)
+		return opts->to_routine;
+	else if (opts->format == COPY_FORMAT_CSV)
 		return &CopyToRoutineCSV;
 	else if (opts->format == COPY_FORMAT_BINARY)
 		return &CopyToRoutineBinary;
diff --git a/src/backend/commands/meson.build b/src/backend/commands/meson.build
index 9f258d566eb..d98273da67e 100644
--- a/src/backend/commands/meson.build
+++ b/src/backend/commands/meson.build
@@ -11,6 +11,7 @@ backend_sources += files(
   'constraint.c',
   'conversioncmds.c',
   'copy.c',
+  'copyapi.c',
   'copyfrom.c',
   'copyfromparse.c',
   'copyto.c',
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 5e710efff5b..9c40ca4ba09 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -58,7 +58,16 @@ typedef enum CopyFormat
 	COPY_FORMAT_BINARY,
 	COPY_FORMAT_CSV,
 	COPY_FORMAT_JSON,
+	COPY_FORMAT_CUSTOM,			/* format provided by an extension */
 } CopyFormat;
+#define CopyFormatBuiltins(format) ((format) != COPY_FORMAT_CUSTOM)
+
+/*
+ * Full definitions live in commands/copyapi.h, which includes this header;
+ * CopyFormatOptions only needs to hold pointers to the resolved routines.
+ */
+struct CopyToRoutine;
+struct CopyFromRoutine;
 
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
@@ -97,6 +106,16 @@ typedef struct CopyFormatOptions
 	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	int64		reject_limit;	/* maximum tolerable number of errors */
 	List	   *convert_select; /* list of column names (can be NIL) */
+
+	/*
+	 * Resolved handler for a custom format. The direction not in use may be
+	 * NULL. For built-in formats these are unused.
+	 */
+	const struct CopyToRoutine *to_routine;
+	const struct CopyFromRoutine *from_routine;
+
+	/* Custom format private option data */
+	void	   *format_private_opts;
 } CopyFormatOptions;
 
 /* These are defined in copy_state.h */
diff --git a/src/include/commands/copy_state.h b/src/include/commands/copy_state.h
index 52cbf5067eb..6c5defbf4ee 100644
--- a/src/include/commands/copy_state.h
+++ b/src/include/commands/copy_state.h
@@ -178,6 +178,9 @@ typedef struct CopyFromStateData
 #define RAW_BUF_BYTES(cstate) ((cstate)->raw_buf_len - (cstate)->raw_buf_index)
 
 	uint64		bytes_processed;	/* number of bytes processed so far */
+
+	/* Custom format private data to store the state */
+	void	   *format_private;
 } CopyFromStateData;
 
 /*
@@ -248,6 +251,9 @@ typedef struct CopyToStateData
 	FmgrInfo   *out_functions;	/* lookup info for output functions */
 	MemoryContext rowcontext;	/* per-row evaluation context */
 	uint64		bytes_processed;	/* number of bytes processed so far */
+
+	/* Custom format private data to store the state */
+	void	   *format_private;
 } CopyToStateData;
 
 #endif							/* COPY_STATE_H */
diff --git a/src/include/commands/copyapi.h b/src/include/commands/copyapi.h
index 398e7a78bb3..8eb5fe9c7dc 100644
--- a/src/include/commands/copyapi.h
+++ b/src/include/commands/copyapi.h
@@ -14,6 +14,7 @@
 #ifndef COPYAPI_H
 #define COPYAPI_H
 
+#include "commands/copy_state.h"
 #include "commands/copy.h"
 
 /*
@@ -102,4 +103,40 @@ typedef struct CopyFromRoutine
 	void		(*CopyFromEnd) (CopyFromState cstate);
 } CopyFromRoutine;
 
+/*
+ * Optional callback to process one format-specific COPY option. Invoked
+ * from ProcessCopyOptions() once per option that core did not recognize, after
+ * every core option has been parsed (so 'opts' is fully populated).
+ *
+ * Returns true if the option belongs to the format and is valid. Returns false
+ * if the option is not one the format recognizes, in which case core raises the
+ * "not accepted" error; thus an unrecognized option always errors, whether or
+ * not the format supplies this callback. For a recognized option with an invalid
+ * value, the callback should ereport() itself.
+ *
+ * The callback is not passed a ParseState, so errors it raises itself carry no
+ * parse location; only core's "not accepted" error reports one.
+ */
+typedef bool (*ProcessOneOptionFn) (CopyFormatOptions *opts, bool is_from,
+									DefElem *option);
+
+/*
+ * Register a COPY format under 'name', mapping it to its TO and/or FROM
+ * routines and optional per-option callback. Intended to be called
+ * from an extension's _PG_init(). Either routine may be NULL if the format
+ * does not support that direction (but not both). Errors if 'name' collides
+ * with a built-in format or one already registered.
+ */
+extern void RegisterCopyCustomFormat(const char *name, const CopyToRoutine *to,
+									 const CopyFromRoutine *from,
+									 ProcessOneOptionFn option_fn);
+
+/*
+ * Look up a previously registered custom format. Returns false if 'name' is
+ * not registered. Out-parameters may be NULL if not wanted.
+ */
+extern bool GetCopyCustomFormatRoutines(const char *name, const CopyToRoutine **to,
+										const CopyFromRoutine **from,
+										ProcessOneOptionFn * option_fn);
+
 #endif							/* COPYAPI_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 1969d467c1d..5263710e451 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -540,6 +540,7 @@ ConvProcInfo
 ConversionLocation
 ConvertRowtypeExpr
 CookedConstraint
+CopyCustomFormatEntry
 CopyDest
 CopyFormat
 CopyFormatOptions
-- 
2.54.0

From d31d8cef68d10cb1817446af9a1e492ce88808e9 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Mon, 22 Jun 2026 13:14:31 -0700
Subject: [PATCH v2 4/4] Add test module for COPY custom format.

Author:
Reviewed-by:
Discussion: https://postgr.es/m/
Backpatch-through:
---
 src/test/modules/Makefile                     |   1 +
 src/test/modules/meson.build                  |   1 +
 .../modules/test_copy_custom_format/Makefile  |  20 +++
 .../expected/test_copy_custom_format.out      | 105 +++++++++++
 .../test_copy_custom_format/meson.build       |  32 ++++
 .../sql/test_copy_custom_format.sql           |  32 ++++
 .../test_copy_custom_format.c                 | 169 ++++++++++++++++++
 src/tools/pgindent/typedefs.list              |   1 +
 8 files changed, 361 insertions(+)
 create mode 100644 src/test/modules/test_copy_custom_format/Makefile
 create mode 100644 src/test/modules/test_copy_custom_format/expected/test_copy_custom_format.out
 create mode 100644 src/test/modules/test_copy_custom_format/meson.build
 create mode 100644 src/test/modules/test_copy_custom_format/sql/test_copy_custom_format.sql
 create mode 100644 src/test/modules/test_copy_custom_format/test_copy_custom_format.c

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 0a74ab5c86f..6dcb66174f5 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -23,6 +23,7 @@ SUBDIRS = \
 		  test_cloexec \
 		  test_checksums \
 		  test_copy_callbacks \
+		  test_copy_custom_format \
 		  test_custom_rmgrs \
 		  test_custom_stats \
 		  test_custom_types \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 4bca42bb370..adfa413fe58 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -23,6 +23,7 @@ subdir('test_bloomfilter')
 subdir('test_cloexec')
 subdir('test_checksums')
 subdir('test_copy_callbacks')
+subdir('test_copy_custom_format')
 subdir('test_cplusplusext')
 subdir('test_custom_rmgrs')
 subdir('test_custom_stats')
diff --git a/src/test/modules/test_copy_custom_format/Makefile b/src/test/modules/test_copy_custom_format/Makefile
new file mode 100644
index 00000000000..68a2a04ff09
--- /dev/null
+++ b/src/test/modules/test_copy_custom_format/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/test_copy_custom_format/Makefile
+
+MODULE_big = test_copy_custom_format
+OBJS = \
+	$(WIN32RES) \
+	test_copy_custom_format.o
+PGFILEDESC = "test_copy_custom_format - test custom COPY FORMAT"
+
+REGRESS = test_copy_custom_format
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_copy_custom_format
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_copy_custom_format/expected/test_copy_custom_format.out b/src/test/modules/test_copy_custom_format/expected/test_copy_custom_format.out
new file mode 100644
index 00000000000..817ca3fa60f
--- /dev/null
+++ b/src/test/modules/test_copy_custom_format/expected/test_copy_custom_format.out
@@ -0,0 +1,105 @@
+LOAD 'test_copy_custom_format';
+CREATE TABLE copy_data (a smallint, b integer, c bigint);
+INSERT INTO copy_data VALUES (1,2,3),(12,34,56),(123,456,789);
+COPY copy_data TO stdout WITH (format 'test_format');          -- Start, OutFunc x3, OneRow x3, End
+NOTICE:  CopyToOutFunc: attribute: smallint
+NOTICE:  CopyToOutFunc: attribute: integer
+NOTICE:  CopyToOutFunc: attribute: bigint
+NOTICE:  CopyToStart: the number of attributes of table: 3, the number of attributes to output: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToEnd
+COPY copy_data FROM stdin WITH (format 'test_format');         -- InFunc x3, Start, OneRow, End
+NOTICE:  CopyFromInFunc: attribute: smallint
+NOTICE:  CopyFromInFunc: attribute: integer
+NOTICE:  CopyFromInFunc: attribute: bigint
+NOTICE:  CopyFromStart: the number of attributes of table: 3, the number of attributes to input: 3
+NOTICE:  CopyFromOneRow
+NOTICE:  CopyFromEnd
+COPY copy_data (a, b) TO stdout WITH (format 'test_format');   -- Start: natts 2
+NOTICE:  CopyToOutFunc: attribute: smallint
+NOTICE:  CopyToOutFunc: attribute: integer
+NOTICE:  CopyToStart: the number of attributes of table: 3, the number of attributes to output: 2
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToEnd
+COPY (SELECT a FROM copy_data) TO stdout WITH (format 'test_format'); -- Start: natts 1
+NOTICE:  CopyToOutFunc: attribute: smallint
+NOTICE:  CopyToStart: the number of attributes of table: 1, the number of attributes to output: 1
+NOTICE:  CopyToOneRow: the number of valid values: 1
+NOTICE:  CopyToOneRow: the number of valid values: 1
+NOTICE:  CopyToOneRow: the number of valid values: 1
+NOTICE:  CopyToEnd
+COPY copy_data TO stdout WITH (format 'nonexistent');          -- ERROR: not recognized
+ERROR:  COPY format "nonexistent" not recognized
+LINE 1: COPY copy_data TO stdout WITH (format 'nonexistent');
+                                       ^
+COPY copy_data TO stdout WITH (format 'text', format 'csv');   -- ERROR: conflicting
+ERROR:  conflicting or redundant options
+LINE 1: COPY copy_data TO stdout WITH (format 'text', format 'csv');
+                                                      ^
+COPY copy_data TO stdout WITH (format 'test_format', bogus 1); -- ERROR
+ERROR:  COPY format "test_format" does not accept option "bogus"
+LINE 1: ...PY copy_data TO stdout WITH (format 'test_format', bogus 1);
+                                                              ^
+COPY copy_data TO stdout WITH (format 'test_format', max_attributes 5); -- OK
+NOTICE:  CopyToOutFunc: attribute: smallint
+NOTICE:  CopyToOutFunc: attribute: integer
+NOTICE:  CopyToOutFunc: attribute: bigint
+NOTICE:  CopyToStart: the number of attributes of table: 3, the number of attributes to output: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToEnd
+COPY copy_data TO stdout WITH (format 'test_format', max_attributes 3); -- OK
+NOTICE:  CopyToOutFunc: attribute: smallint
+NOTICE:  CopyToOutFunc: attribute: integer
+NOTICE:  CopyToOutFunc: attribute: bigint
+NOTICE:  CopyToStart: the number of attributes of table: 3, the number of attributes to output: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToEnd
+COPY copy_data TO stdout WITH (format 'test_format', max_attributes 2); -- ERROR: 3 columns exceeds 2
+NOTICE:  CopyToOutFunc: attribute: smallint
+NOTICE:  CopyToOutFunc: attribute: integer
+NOTICE:  CopyToOutFunc: attribute: bigint
+ERROR:  relation has 3 columns, exceeds max_attributes 2
+COPY copy_data TO stdout WITH (format 'test_format', max_attributes 0);   -- ERROR: positive
+ERROR:  "max_attributes" must be a positive integer
+COPY copy_data TO stdout WITH (format 'test_format', max_attributes -1);  -- ERROR
+ERROR:  "max_attributes" must be a positive integer
+COPY copy_data TO stdout WITH (format 'test_format', max_attributes 'x'); -- ERROR: integer required
+ERROR:  max_attributes requires an integer value
+COPY copy_data FROM stdin WITH (format 'test_format', freeze true, disallow_freeze true); -- ERROR (validate)
+ERROR:  FREEZE cannot be used with "disallow_freeze" option
+COPY copy_data FROM stdin WITH (format 'test_format', disallow_freeze true); -- OK
+NOTICE:  CopyFromInFunc: attribute: smallint
+NOTICE:  CopyFromInFunc: attribute: integer
+NOTICE:  CopyFromInFunc: attribute: bigint
+NOTICE:  CopyFromStart: the number of attributes of table: 3, the number of attributes to input: 3
+NOTICE:  CopyFromOneRow
+NOTICE:  CopyFromEnd
+-- The built-in options are handled in the same way of built-in formats.
+COPY copy_data TO stdout WITH (format 'test_format', delimiter ',');
+NOTICE:  CopyToOutFunc: attribute: smallint
+NOTICE:  CopyToOutFunc: attribute: integer
+NOTICE:  CopyToOutFunc: attribute: bigint
+NOTICE:  CopyToStart: the number of attributes of table: 3, the number of attributes to output: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToEnd
+COPY copy_data TO stdout WITH (format 'test_format', quote '"');
+NOTICE:  CopyToOutFunc: attribute: smallint
+NOTICE:  CopyToOutFunc: attribute: integer
+NOTICE:  CopyToOutFunc: attribute: bigint
+NOTICE:  CopyToStart: the number of attributes of table: 3, the number of attributes to output: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToOneRow: the number of valid values: 3
+NOTICE:  CopyToEnd
+COPY copy_data TO stdout WITH (format 'test_format', freeze true);     -- ERROR: FREEZE with COPY TO
+ERROR:  COPY FREEZE cannot be used with COPY TO
diff --git a/src/test/modules/test_copy_custom_format/meson.build b/src/test/modules/test_copy_custom_format/meson.build
new file mode 100644
index 00000000000..a231ed57649
--- /dev/null
+++ b/src/test/modules/test_copy_custom_format/meson.build
@@ -0,0 +1,32 @@
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+test_copy_custom_format_sources = files(
+  'test_copy_custom_format.c',
+)
+
+if host_system == 'windows'
+  test_copy_custom_format_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'test_copy_custom_format',
+    '--FILEDESC', 'test_copy_custom_format - test custom COPY FORMAT',])
+endif
+
+test_copy_custom_format = shared_module('test_copy_custom_format',
+  test_copy_custom_format_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += test_copy_custom_format
+
+tests += {
+  'name': 'test_copy_custom_format',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'regress': {
+    'sql': [
+      'test_copy_custom_format',
+    ],
+    # Disabled because these tests require
+    # "shared_preload_libraries=test_copy_custom_format", which typical
+    # runningcheck users do not have (e.g. buildfarm clients).
+    'runningcheck': false,
+  },
+}
diff --git a/src/test/modules/test_copy_custom_format/sql/test_copy_custom_format.sql b/src/test/modules/test_copy_custom_format/sql/test_copy_custom_format.sql
new file mode 100644
index 00000000000..59f58fa55a2
--- /dev/null
+++ b/src/test/modules/test_copy_custom_format/sql/test_copy_custom_format.sql
@@ -0,0 +1,32 @@
+LOAD 'test_copy_custom_format';
+
+CREATE TABLE copy_data (a smallint, b integer, c bigint);
+INSERT INTO copy_data VALUES (1,2,3),(12,34,56),(123,456,789);
+
+COPY copy_data TO stdout WITH (format 'test_format');          -- Start, OutFunc x3, OneRow x3, End
+COPY copy_data FROM stdin WITH (format 'test_format');         -- InFunc x3, Start, OneRow, End
+\.
+
+COPY copy_data (a, b) TO stdout WITH (format 'test_format');   -- Start: natts 2
+COPY (SELECT a FROM copy_data) TO stdout WITH (format 'test_format'); -- Start: natts 1
+
+COPY copy_data TO stdout WITH (format 'nonexistent');          -- ERROR: not recognized
+COPY copy_data TO stdout WITH (format 'text', format 'csv');   -- ERROR: conflicting
+
+COPY copy_data TO stdout WITH (format 'test_format', bogus 1); -- ERROR
+
+COPY copy_data TO stdout WITH (format 'test_format', max_attributes 5); -- OK
+COPY copy_data TO stdout WITH (format 'test_format', max_attributes 3); -- OK
+COPY copy_data TO stdout WITH (format 'test_format', max_attributes 2); -- ERROR: 3 columns exceeds 2
+COPY copy_data TO stdout WITH (format 'test_format', max_attributes 0);   -- ERROR: positive
+COPY copy_data TO stdout WITH (format 'test_format', max_attributes -1);  -- ERROR
+COPY copy_data TO stdout WITH (format 'test_format', max_attributes 'x'); -- ERROR: integer required
+
+COPY copy_data FROM stdin WITH (format 'test_format', freeze true, disallow_freeze true); -- ERROR (validate)
+COPY copy_data FROM stdin WITH (format 'test_format', disallow_freeze true); -- OK
+\.
+
+-- The built-in options are handled in the same way of built-in formats.
+COPY copy_data TO stdout WITH (format 'test_format', delimiter ',');
+COPY copy_data TO stdout WITH (format 'test_format', quote '"');
+COPY copy_data TO stdout WITH (format 'test_format', freeze true);     -- ERROR: FREEZE with COPY TO
diff --git a/src/test/modules/test_copy_custom_format/test_copy_custom_format.c b/src/test/modules/test_copy_custom_format/test_copy_custom_format.c
new file mode 100644
index 00000000000..ca25832fcb0
--- /dev/null
+++ b/src/test/modules/test_copy_custom_format/test_copy_custom_format.c
@@ -0,0 +1,169 @@
+/*--------------------------------------------------------------------------
+ *
+ * test_copy_custom_format.c
+ *		Code for testing custom COPY format.
+ *
+ * Portions Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/test/modules/test_copy_custom_format/test_copy_custom_format.c
+ *
+ * -------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "commands/copy.h"
+#include "commands/copyapi.h"
+#include "commands/copy_state.h"
+#include "commands/defrem.h"
+#include "utils/builtins.h"
+
+PG_MODULE_MAGIC;
+
+typedef struct TestCopyOptions
+{
+	int			max_attributes;
+	bool		disallow_freeze;
+} TestCopyOptions;
+
+static bool
+TestCopyProcessOneOption(CopyFormatOptions *opts, bool is_from, DefElem *option)
+{
+	TestCopyOptions *t = (TestCopyOptions *) opts->format_private_opts;
+
+	if (t == NULL)
+	{
+		t = palloc0_object(TestCopyOptions);
+		opts->format_private_opts = (void *) t;
+	}
+
+	if (strcmp(option->defname, "max_attributes") == 0)
+	{
+		int			val = defGetInt32(option);
+
+		if (val < 1)
+			ereport(ERROR,
+					errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					errmsg("\"max_attributes\" must be a positive integer"));
+
+		t->max_attributes = val;
+		return true;
+	}
+	else if (strcmp(option->defname, "disallow_freeze") == 0)
+	{
+		t->disallow_freeze = defGetBoolean(option);
+		return true;
+	}
+
+	return false;
+}
+
+static void
+TestCopyValidateOptions(CopyFormatOptions *opts, bool is_from)
+{
+	TestCopyOptions *t = (TestCopyOptions *) opts->format_private_opts;
+
+	if (!t)
+		return;
+
+	if (t->disallow_freeze && opts->freeze)
+		ereport(ERROR,
+				errmsg("FREEZE cannot be used with \"disallow_freeze\" option"));
+}
+
+static void
+TestCopyFromInFunc(CopyFromState cstate, Oid atttypid, FmgrInfo *finfo, Oid *typioparam)
+{
+	ereport(NOTICE,
+			errmsg("CopyFromInFunc: attribute: %s", format_type_be(atttypid)));
+}
+
+static void
+check_max_attributes(CopyFormatOptions *opts, TupleDesc tupDesc)
+{
+	TestCopyOptions *t = (TestCopyOptions *) opts->format_private_opts;
+
+	if (t != NULL && t->max_attributes > 0 && tupDesc->natts > t->max_attributes)
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("relation has %d columns, exceeds max_attributes %d",
+					   tupDesc->natts, t->max_attributes));
+}
+
+static void
+TestCopyFromStart(CopyFromState cstate, TupleDesc tupDesc)
+{
+	check_max_attributes(&cstate->opts, tupDesc);
+
+	ereport(NOTICE,
+			errmsg("CopyFromStart: the number of attributes of table: %d, the number of attributes to input: %d",
+				   tupDesc->natts, list_length(cstate->attnumlist)));
+}
+
+static bool
+TestCopyFromOneRow(CopyFromState cstate, ExprContext *econtext, Datum *values, bool *nulls)
+{
+	ereport(NOTICE, errmsg("CopyFromOneRow"));
+
+	return false;
+}
+
+static void
+TestCopyFromEnd(CopyFromState cstate)
+{
+	ereport(NOTICE, errmsg("CopyFromEnd"));
+}
+
+static void
+TestCopyToOutFunc(CopyToState cstate, Oid atttypid, FmgrInfo *finfo)
+{
+	ereport(NOTICE, errmsg("CopyToOutFunc: attribute: %s", format_type_be(atttypid)));
+}
+
+static void
+TestCopyToStart(CopyToState cstate, TupleDesc tupDesc)
+{
+	check_max_attributes(&cstate->opts, tupDesc);
+
+	ereport(NOTICE,
+			errmsg("CopyToStart: the number of attributes of table: %d, the number of attributes to output: %d",
+				   tupDesc->natts, list_length(cstate->attnumlist)));
+}
+
+static void
+TestCopyToOneRow(CopyToState cstate, TupleTableSlot *slot)
+{
+	ereport(NOTICE, errmsg("CopyToOneRow: the number of valid values: %d", slot->tts_nvalid));
+}
+
+static void
+TestCopyToEnd(CopyToState cstate)
+{
+	ereport(NOTICE, errmsg("CopyToEnd"));
+}
+
+static const CopyToRoutine TestCopyToRoutine = {
+	.CopyToOutFunc = TestCopyToOutFunc,
+	.CopyToStart = TestCopyToStart,
+	.CopyToOneRow = TestCopyToOneRow,
+	.CopyToEnd = TestCopyToEnd,
+};
+
+
+static const CopyFromRoutine TestCopyFromRoutine = {
+	.CopyFromInFunc = TestCopyFromInFunc,
+	.CopyFromStart = TestCopyFromStart,
+	.CopyFromOneRow = TestCopyFromOneRow,
+	.CopyFromEnd = TestCopyFromEnd,
+};
+
+void
+_PG_init(void)
+{
+	RegisterCopyCustomFormat("test_format",
+							 &TestCopyToRoutine,
+							 &TestCopyFromRoutine,
+							 &TestCopyProcessOneOption,
+							 &TestCopyValidateOptions);
+}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 5263710e451..552669abc5f 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3179,6 +3179,7 @@ Tcl_Obj
 Tcl_Size
 Tcl_Time
 TempNamespaceStatus
+TestCopyOptions
 TestDSMRegistryHashEntry
 TestDSMRegistryStruct
 TestDecodingData
-- 
2.54.0
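
For anyone reading along without a PostgreSQL tree at hand, here is a rough
standalone sketch of the per-option callback contract that the test module
above exercises. It is not PostgreSQL code: ModelOption, ModelFormatOptions,
ModelPrivateOpts, ModelResult and both functions are hypothetical stand-ins.
The real callback returns bool and raises errors with ereport(); the sketch
uses an enum instead so that it runs on its own. It also assumes that the
core treats an option the format does not claim as an error, which is what
the "bogus 1" test case suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parsed option (DefElem in the real code). */
typedef struct ModelOption
{
	const char *name;
	int			value;
} ModelOption;

/* Hypothetical stand-in for CopyFormatOptions: only the opaque pointer. */
typedef struct ModelFormatOptions
{
	void	   *format_private_opts;	/* owned by the format */
} ModelFormatOptions;

/* Format-private option storage, shaped like TestCopyOptions. */
typedef struct ModelPrivateOpts
{
	int			max_attributes;
	bool		disallow_freeze;
} ModelPrivateOpts;

typedef enum ModelResult
{
	MODEL_OK,					/* option claimed and stored */
	MODEL_UNRECOGNIZED,			/* not one of this format's options */
	MODEL_INVALID_VALUE,		/* claimed, but the value is rejected */
} ModelResult;

/* Per-option callback: allocate private state lazily, then claim or decline. */
static ModelResult
model_process_one_option(ModelFormatOptions *opts, const ModelOption *option)
{
	ModelPrivateOpts *t = (ModelPrivateOpts *) opts->format_private_opts;

	if (t == NULL)
	{
		t = (ModelPrivateOpts *) calloc(1, sizeof(ModelPrivateOpts));
		assert(t != NULL);
		opts->format_private_opts = t;
	}

	if (strcmp(option->name, "max_attributes") == 0)
	{
		if (option->value < 1)
			return MODEL_INVALID_VALUE;
		t->max_attributes = option->value;
		return MODEL_OK;
	}
	if (strcmp(option->name, "disallow_freeze") == 0)
	{
		t->disallow_freeze = (option->value != 0);
		return MODEL_OK;
	}

	return MODEL_UNRECOGNIZED;
}

/* Caller side: hand each option to the format, stop at the first failure. */
static ModelResult
model_process_options(ModelFormatOptions *opts,
					  const ModelOption *options, int noptions)
{
	for (int i = 0; i < noptions; i++)
	{
		ModelResult r = model_process_one_option(opts, &options[i]);

		if (r != MODEL_OK)
			return r;
	}
	return MODEL_OK;
}
```

The point of the lazy allocation is that the core never needs to know the
layout of the private struct; it only carries the pointer around.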

From c0e1e5986d10125b41f3fde10cdb3ba931db00b3 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Mon, 22 Jun 2026 09:23:09 -0700
Subject: [PATCH v2 3/4] Add a hook for custom COPY format option validation.

Author:
Reviewed-by:
Discussion: https://postgr.es/m/
---
 src/backend/commands/copy.c    | 13 ++++++++++++-
 src/backend/commands/copyapi.c | 10 ++++++++--
 src/include/commands/copyapi.h | 15 +++++++++++++--
 3 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 2fdba026ee0..45908d0c1e5 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -599,6 +599,7 @@ ProcessCopyOptions(ParseState *pstate,
 	 */
 	List	   *deferred_options = NIL;
 	ProcessOneOptionFn custom_process_option_fn = NULL;
+	ValidateOptionsFn custom_validate_options_fn = NULL;
 	char	   *custom_format_name = NULL;
 
 	/* Support external use for option sanity checking */
@@ -631,7 +632,8 @@ ProcessCopyOptions(ParseState *pstate,
 				opts_out->format = COPY_FORMAT_JSON;
 			else if (GetCopyCustomFormatRoutines(fmt, &opts_out->to_routine,
 												 &opts_out->from_routine,
-												 &custom_process_option_fn))
+												 &custom_process_option_fn,
+												 &custom_validate_options_fn))
 			{
 				opts_out->format = COPY_FORMAT_CUSTOM;
 				custom_format_name = fmt;
@@ -1126,6 +1128,15 @@ ProcessCopyOptions(ParseState *pstate,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 					 errmsg("COPY format \"%s\" cannot be used with COPY TO",
 							custom_format_name)));
+
+		/*
+		 * Let the format validate its fully-parsed options as a whole.  This
+		 * runs even when no format-specific options were given, so a format
+		 * can reject incompatible core options or enforce cross-option
+		 * constraints.
+		 */
+		if (custom_validate_options_fn != NULL)
+			custom_validate_options_fn(opts_out, is_from);
 	}
 }
 
diff --git a/src/backend/commands/copyapi.c b/src/backend/commands/copyapi.c
index 168efbcf30b..0bd1cb71af2 100644
--- a/src/backend/commands/copyapi.c
+++ b/src/backend/commands/copyapi.c
@@ -28,6 +28,7 @@ typedef struct CopyCustomFormatEntry
 	const CopyToRoutine *to_routine;
 	const CopyFromRoutine *from_routine;
 	ProcessOneOptionFn option_fn;
+	ValidateOptionsFn validate_fn;
 } CopyCustomFormatEntry;
 
 static CopyCustomFormatEntry *CopyCustomFormatArray = NULL;
@@ -56,7 +57,8 @@ is_builtin_copy_format(const char *name)
  */
 void
 RegisterCopyCustomFormat(const char *name, const CopyToRoutine *to,
-						 const CopyFromRoutine *from, ProcessOneOptionFn option_fn)
+						 const CopyFromRoutine *from, ProcessOneOptionFn option_fn,
+						 ValidateOptionsFn validate_fn)
 {
 	Assert(name != NULL && name[0] != '\0');
 
@@ -102,6 +104,7 @@ RegisterCopyCustomFormat(const char *name, const CopyToRoutine *to,
 	CopyCustomFormatArray[CopyCustomFormatsAssigned].to_routine = to;
 	CopyCustomFormatArray[CopyCustomFormatsAssigned].from_routine = from;
 	CopyCustomFormatArray[CopyCustomFormatsAssigned].option_fn = option_fn;
+	CopyCustomFormatArray[CopyCustomFormatsAssigned].validate_fn = validate_fn;
 	CopyCustomFormatsAssigned++;
 }
 
@@ -111,7 +114,8 @@ RegisterCopyCustomFormat(const char *name, const CopyToRoutine *to,
  */
 bool
 GetCopyCustomFormatRoutines(const char *name, const CopyToRoutine **to,
-							const CopyFromRoutine **from, ProcessOneOptionFn * option_fn)
+							const CopyFromRoutine **from, ProcessOneOptionFn * option_fn,
+							ValidateOptionsFn * validate_fn)
 {
 	for (int i = 0; i < CopyCustomFormatsAssigned; i++)
 	{
@@ -123,6 +127,8 @@ GetCopyCustomFormatRoutines(const char *name, const CopyToRoutine **to,
 				*from = CopyCustomFormatArray[i].from_routine;
 			if (option_fn)
 				*option_fn = CopyCustomFormatArray[i].option_fn;
+			if (validate_fn)
+				*validate_fn = CopyCustomFormatArray[i].validate_fn;
 
 			return true;
 		}
diff --git a/src/include/commands/copyapi.h b/src/include/commands/copyapi.h
index 8eb5fe9c7dc..c47c89a858f 100644
--- a/src/include/commands/copyapi.h
+++ b/src/include/commands/copyapi.h
@@ -120,6 +120,15 @@ typedef struct CopyFromRoutine
 typedef bool (*ProcessOneOptionFn) (CopyFormatOptions *opts, bool is_from,
 									DefElem *option);
 
+/*
+ * Optional callback to validate a custom format's fully-parsed options as a
+ * whole. Invoked once from ProcessCopyOptions() after all options have been
+ * processed, so it can enforce cross-option constraints and reject
+ * incompatible core options. It runs even when no format-specific options were
+ * supplied. Reports problems with ereport().
+ */
+typedef void (*ValidateOptionsFn) (CopyFormatOptions *opts, bool is_from);
+
 /*
  * Register a COPY format under 'name', mapping it to its TO and/or FROM
  * routines and optional option/validation callbacks. Intended to be called
@@ -129,7 +138,8 @@ typedef bool (*ProcessOneOptionFn) (CopyFormatOptions *opts, bool is_from,
  */
 extern void RegisterCopyCustomFormat(const char *name, const CopyToRoutine *to,
 									 const CopyFromRoutine *from,
-									 ProcessOneOptionFn option_fn);
+									 ProcessOneOptionFn option_fn,
+									 ValidateOptionsFn validate_fn);
 
 /*
  * Look up a previously registered custom format. Returns false if 'name' is
@@ -137,6 +147,7 @@ extern void RegisterCopyCustomFormat(const char *name, const CopyToRoutine *to,
  */
 extern bool GetCopyCustomFormatRoutines(const char *name, const CopyToRoutine **to,
 										const CopyFromRoutine **from,
-										ProcessOneOptionFn * option_fn);
+										ProcessOneOptionFn * option_fn,
+										ValidateOptionsFn * validate_fn);
 
 #endif							/* COPYAPI_H */
-- 
2.54.0
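
To make the intent of the validation hook easier to see in isolation, here is
a small standalone model of the register / look up / validate flow. Again this
is only a sketch and not the code in the patch: DemoOptions, DemoValidateFn,
DemoFormatEntry and the demo_* functions are hypothetical names, the registry
is a fixed-size array for brevity, and the validator returns an error string
where the real callback would call ereport(). What it tries to show is that
the validator is optional and that it runs once after option parsing, whether
or not any format-specific option was given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced view of the fully-parsed options. */
typedef struct DemoOptions
{
	bool		freeze;				/* a core option */
	bool		disallow_freeze;	/* a format-specific option */
} DemoOptions;

/* Whole-option-set validator: NULL means OK, otherwise an error message. */
typedef const char *(*DemoValidateFn) (const DemoOptions *opts, bool is_from);

typedef struct DemoFormatEntry
{
	const char *name;
	DemoValidateFn validate_fn; /* may be NULL */
} DemoFormatEntry;

#define DEMO_MAX_FORMATS 8
static DemoFormatEntry demo_formats[DEMO_MAX_FORMATS];
static int	demo_nformats = 0;

/* Register a format; returns false when the (fixed-size) registry is full. */
static bool
demo_register_format(const char *name, DemoValidateFn validate_fn)
{
	assert(name != NULL && name[0] != '\0');

	if (demo_nformats >= DEMO_MAX_FORMATS)
		return false;

	demo_formats[demo_nformats].name = name;
	demo_formats[demo_nformats].validate_fn = validate_fn;
	demo_nformats++;
	return true;
}

/* Look up a format; the out-parameter is optional, as in the patch. */
static bool
demo_lookup_format(const char *name, DemoValidateFn *validate_fn)
{
	for (int i = 0; i < demo_nformats; i++)
	{
		if (strcmp(demo_formats[i].name, name) == 0)
		{
			if (validate_fn)
				*validate_fn = demo_formats[i].validate_fn;
			return true;
		}
	}
	return false;
}

/* Example validator: reject a core option that conflicts with our own. */
static const char *
demo_validate(const DemoOptions *opts, bool is_from)
{
	(void) is_from;

	if (opts->disallow_freeze && opts->freeze)
		return "FREEZE cannot be used with \"disallow_freeze\" option";
	return NULL;
}

/* Caller side: resolve the format, then run its validator if it has one. */
static const char *
demo_process_options(const char *format, const DemoOptions *opts, bool is_from)
{
	DemoValidateFn validate_fn = NULL;

	if (!demo_lookup_format(format, &validate_fn))
		return "format not recognized";
	if (validate_fn != NULL)
		return validate_fn(opts, is_from);
	return NULL;
}
```

A format that registers no validator simply skips the last step, so existing
registrations only have to pass NULL for the new argument.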

From 47aa927d65e483331f9b2010499b4c7f0bfb562c Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Mon, 22 Jun 2026 11:50:27 -0700
Subject: [PATCH v2 1/4] Move Copy[From|To]StateData to copy_state.h.

Author:
Reviewed-by:
Discussion: https://postgr.es/m/
Backpatch-through:
---
 contrib/file_fdw/file_fdw.c              |   2 +-
 src/backend/commands/copyfrom.c          |   5 +-
 src/backend/commands/copyfromparse.c     |  10 +-
 src/backend/commands/copyto.c            |  88 +-------
 src/include/commands/copy.h              |   2 +-
 src/include/commands/copy_state.h        | 253 +++++++++++++++++++++++
 src/include/commands/copyfrom_internal.h | 165 +--------------
 7 files changed, 273 insertions(+), 252 deletions(-)
 create mode 100644 src/include/commands/copy_state.h

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 33a37d832ce..d152d05b92e 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -22,7 +22,7 @@
 #include "catalog/pg_authid.h"
 #include "catalog/pg_foreign_table.h"
 #include "commands/copy.h"
-#include "commands/copyfrom_internal.h"
+#include "commands/copy_state.h"
 #include "commands/defrem.h"
 #include "commands/explain_format.h"
 #include "commands/explain_state.h"
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 0087585b2c4..2c57b32f4de 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -30,6 +30,7 @@
 #include "access/xact.h"
 #include "catalog/namespace.h"
 #include "commands/copyapi.h"
+#include "commands/copy_state.h"
 #include "commands/copyfrom_internal.h"
 #include "commands/progress.h"
 #include "commands/trigger.h"
@@ -1732,7 +1733,7 @@ BeginCopyFrom(ParseState *pstate,
 							pg_encoding_to_char(GetDatabaseEncoding()))));
 	}
 
-	cstate->copy_src = COPY_FILE;	/* default */
+	cstate->copy_src = COPY_SOURCE_FILE;	/* default */
 
 	cstate->whereClause = whereClause;
 
@@ -1861,7 +1862,7 @@ BeginCopyFrom(ParseState *pstate,
 	if (data_source_cb)
 	{
 		progress_vals[1] = PROGRESS_COPY_TYPE_CALLBACK;
-		cstate->copy_src = COPY_CALLBACK;
+		cstate->copy_src = COPY_SOURCE_CALLBACK;
 		cstate->data_source_cb = data_source_cb;
 	}
 	else if (pipe)
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 65fd5a0ab4f..0ff5db4b62d 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -184,7 +184,7 @@ ReceiveCopyBegin(CopyFromState cstate)
 	for (i = 0; i < natts; i++)
 		pq_sendint16(&buf, format); /* per-column formats */
 	pq_endmessage(&buf);
-	cstate->copy_src = COPY_FRONTEND;
+	cstate->copy_src = COPY_SOURCE_FRONTEND;
 	cstate->fe_msgbuf = makeStringInfo();
 	/* We *must* flush here to ensure FE knows it can send. */
 	pq_flush();
@@ -252,7 +252,7 @@ CopyGetData(CopyFromState cstate, void *databuf, int minread, int maxread)
 
 	switch (cstate->copy_src)
 	{
-		case COPY_FILE:
+		case COPY_SOURCE_FILE:
 			pgstat_report_wait_start(WAIT_EVENT_COPY_FROM_READ);
 			bytesread = fread(databuf, 1, maxread, cstate->copy_file);
 			pgstat_report_wait_end();
@@ -263,7 +263,7 @@ CopyGetData(CopyFromState cstate, void *databuf, int minread, int maxread)
 			if (bytesread == 0)
 				cstate->raw_reached_eof = true;
 			break;
-		case COPY_FRONTEND:
+		case COPY_SOURCE_FRONTEND:
 			while (maxread > 0 && bytesread < minread && !cstate->raw_reached_eof)
 			{
 				int			avail;
@@ -346,7 +346,7 @@ CopyGetData(CopyFromState cstate, void *databuf, int minread, int maxread)
 				bytesread += avail;
 			}
 			break;
-		case COPY_CALLBACK:
+		case COPY_SOURCE_CALLBACK:
 			bytesread = cstate->data_source_cb(databuf, minread, maxread);
 			break;
 	}
@@ -1259,7 +1259,7 @@ CopyReadLine(CopyFromState cstate, bool is_csv)
 		 * after \. up to the protocol end of copy data.  (XXX maybe better
 		 * not to treat \. as special?)
 		 */
-		if (cstate->copy_src == COPY_FRONTEND)
+		if (cstate->copy_src == COPY_SOURCE_FRONTEND)
 		{
 			int			inbytes;
 
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 6755bb698de..ef2038c9a5d 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -23,8 +23,8 @@
 #include "access/tupconvert.h"
 #include "catalog/pg_inherits.h"
 #include "commands/copyapi.h"
+#include "commands/copy_state.h"
 #include "commands/progress.h"
-#include "executor/execdesc.h"
 #include "executor/executor.h"
 #include "executor/tuptable.h"
 #include "funcapi.h"
@@ -42,76 +42,6 @@
 #include "utils/snapmgr.h"
 #include "utils/wait_event.h"
 
-/*
- * Represents the different dest cases we need to worry about at
- * the bottom level
- */
-typedef enum CopyDest
-{
-	COPY_FILE,					/* to file (or a piped program) */
-	COPY_FRONTEND,				/* to frontend */
-	COPY_CALLBACK,				/* to callback function */
-} CopyDest;
-
-/*
- * This struct contains all the state variables used throughout a COPY TO
- * operation.
- *
- * Multi-byte encodings: all supported client-side encodings encode multi-byte
- * characters by having the first byte's high bit set. Subsequent bytes of the
- * character can have the high bit not set. When scanning data in such an
- * encoding to look for a match to a single-byte (ie ASCII) character, we must
- * use the full pg_encoding_mblen() machinery to skip over multibyte
- * characters, else we might find a false match to a trailing byte. In
- * supported server encodings, there is no possibility of a false match, and
- * it's faster to make useless comparisons to trailing bytes than it is to
- * invoke pg_encoding_mblen() to skip over them. encoding_embeds_ascii is true
- * when we have to do it the hard way.
- */
-typedef struct CopyToStateData
-{
-	/* format-specific routines */
-	const CopyToRoutine *routine;
-
-	/* low-level state data */
-	CopyDest	copy_dest;		/* type of copy source/destination */
-	FILE	   *copy_file;		/* used if copy_dest == COPY_FILE */
-	StringInfo	fe_msgbuf;		/* used for all dests during COPY TO */
-
-	int			file_encoding;	/* file or remote side's character encoding */
-	bool		need_transcoding;	/* file encoding diff from server? */
-	bool		encoding_embeds_ascii;	/* ASCII can be non-first byte? */
-
-	/* parameters from the COPY command */
-	Relation	rel;			/* relation to copy to */
-	QueryDesc  *queryDesc;		/* executable query to copy from */
-	List	   *attnumlist;		/* integer list of attnums to copy */
-	char	   *filename;		/* filename, or NULL for STDOUT */
-	bool		is_program;		/* is 'filename' a program to popen? */
-	bool		json_row_delim_needed;	/* need delimiter before next row */
-	StringInfo	json_buf;		/* reusable buffer for JSON output,
-								 * initialized in BeginCopyTo */
-	TupleDesc	tupDesc;		/* Descriptor for JSON output; for a column
-								 * list this is a projected descriptor */
-	Datum	   *json_projvalues;	/* pre-allocated projection values, or
-									 * NULL */
-	bool	   *json_projnulls; /* pre-allocated projection nulls, or NULL */
-	copy_data_dest_cb data_dest_cb; /* function for writing data */
-
-	CopyFormatOptions opts;
-	Node	   *whereClause;	/* WHERE condition (or NULL) */
-	List	   *partitions;		/* OID list of partitions to copy data from */
-
-	/*
-	 * Working state
-	 */
-	MemoryContext copycontext;	/* per-copy execution context */
-
-	FmgrInfo   *out_functions;	/* lookup info for output functions */
-	MemoryContext rowcontext;	/* per-row evaluation context */
-	uint64		bytes_processed;	/* number of bytes processed so far */
-} CopyToStateData;
-
 /* DestReceiver for COPY (query) TO */
 typedef struct
 {
@@ -559,7 +489,7 @@ SendCopyBegin(CopyToState cstate)
 	}
 
 	pq_endmessage(&buf);
-	cstate->copy_dest = COPY_FRONTEND;
+	cstate->copy_dest = COPY_DEST_FRONTEND;
 }
 
 static void
@@ -606,7 +536,7 @@ CopySendEndOfRow(CopyToState cstate)
 
 	switch (cstate->copy_dest)
 	{
-		case COPY_FILE:
+		case COPY_DEST_FILE:
 			pgstat_report_wait_start(WAIT_EVENT_COPY_TO_WRITE);
 			if (fwrite(fe_msgbuf->data, fe_msgbuf->len, 1,
 					   cstate->copy_file) != 1 ||
@@ -642,11 +572,11 @@ CopySendEndOfRow(CopyToState cstate)
 			}
 			pgstat_report_wait_end();
 			break;
-		case COPY_FRONTEND:
+		case COPY_DEST_FRONTEND:
 			/* Dump the accumulated row as one CopyData message */
 			(void) pq_putmessage(PqMsg_CopyData, fe_msgbuf->data, fe_msgbuf->len);
 			break;
-		case COPY_CALLBACK:
+		case COPY_DEST_CALLBACK:
 			cstate->data_dest_cb(fe_msgbuf->data, fe_msgbuf->len);
 			break;
 	}
@@ -667,7 +597,7 @@ CopySendTextLikeEndOfRow(CopyToState cstate)
 {
 	switch (cstate->copy_dest)
 	{
-		case COPY_FILE:
+		case COPY_DEST_FILE:
 			/* Default line termination depends on platform */
 #ifndef WIN32
 			CopySendChar(cstate, '\n');
@@ -675,7 +605,7 @@ CopySendTextLikeEndOfRow(CopyToState cstate)
 			CopySendString(cstate, "\r\n");
 #endif
 			break;
-		case COPY_FRONTEND:
+		case COPY_DEST_FRONTEND:
 			/* The FE/BE protocol uses \n as newline for all platforms */
 			CopySendChar(cstate, '\n');
 			break;
@@ -1135,12 +1065,12 @@ BeginCopyTo(ParseState *pstate,
 	/* See Multibyte encoding comment above */
 	cstate->encoding_embeds_ascii = PG_ENCODING_IS_CLIENT_ONLY(cstate->file_encoding);
 
-	cstate->copy_dest = COPY_FILE;	/* default */
+	cstate->copy_dest = COPY_DEST_FILE; /* default */
 
 	if (data_dest_cb)
 	{
 		progress_vals[1] = PROGRESS_COPY_TYPE_CALLBACK;
-		cstate->copy_dest = COPY_CALLBACK;
+		cstate->copy_dest = COPY_DEST_CALLBACK;
 		cstate->data_dest_cb = data_dest_cb;
 	}
 	else if (pipe)
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index abecfe51098..5e710efff5b 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -99,7 +99,7 @@ typedef struct CopyFormatOptions
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
-/* These are private in commands/copy[from|to].c */
+/* These are defined in copy_state.h */
 typedef struct CopyFromStateData *CopyFromState;
 typedef struct CopyToStateData *CopyToState;
 
diff --git a/src/include/commands/copy_state.h b/src/include/commands/copy_state.h
new file mode 100644
index 00000000000..52cbf5067eb
--- /dev/null
+++ b/src/include/commands/copy_state.h
@@ -0,0 +1,253 @@
+/*-------------------------------------------------------------------------
+ *
+ * copy_state.h
+ *	  definitions for COPY TO/COPY FROM execution state.
+ *
+ * Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994-5, Regents of the University of California
+ *
+ * src/include/commands/copy_state.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef COPY_STATE_H
+#define COPY_STATE_H
+
+#include "commands/copy.h"
+#include "commands/trigger.h"
+#include "executor/execdesc.h"
+#include "nodes/miscnodes.h"
+
+/*
+ * Represents the different source cases we need to worry about at
+ * the bottom level
+ */
+typedef enum CopySource
+{
+	COPY_SOURCE_FILE,			/* from file (or a piped program) */
+	COPY_SOURCE_FRONTEND,		/* from frontend */
+	COPY_SOURCE_CALLBACK,		/* from callback function */
+} CopySource;
+
+/*
+ *	Represents the end-of-line terminator type of the input
+ */
+typedef enum EolType
+{
+	EOL_UNKNOWN,
+	EOL_NL,
+	EOL_CR,
+	EOL_CRNL,
+} EolType;
+
+/*
+ * This struct contains all the state variables used throughout a COPY FROM
+ * operation.
+ */
+typedef struct CopyFromStateData
+{
+	/* format routine */
+	const struct CopyFromRoutine *routine;
+
+	/* low-level state data */
+	CopySource	copy_src;		/* type of copy source */
+	FILE	   *copy_file;		/* used if copy_src == COPY_SOURCE_FILE */
+	StringInfo	fe_msgbuf;		/* used if copy_src == COPY_SOURCE_FRONTEND */
+
+	EolType		eol_type;		/* EOL type of input */
+	int			file_encoding;	/* file or remote side's character encoding */
+	bool		need_transcoding;	/* file encoding diff from server? */
+	Oid			conversion_proc;	/* encoding conversion function */
+
+	/* parameters from the COPY command */
+	Relation	rel;			/* relation to copy from */
+	List	   *attnumlist;		/* integer list of attnums to copy */
+	char	   *filename;		/* filename, or NULL for STDIN */
+	bool		is_program;		/* is 'filename' a program to popen? */
+	copy_data_source_cb data_source_cb; /* function for reading data */
+
+	CopyFormatOptions opts;
+	bool	   *convert_select_flags;	/* per-column CSV/TEXT CS flags */
+	Node	   *whereClause;	/* WHERE condition (or NULL) */
+
+	/* these are just for error messages, see CopyFromErrorCallback */
+	const char *cur_relname;	/* table name for error messages */
+	uint64		cur_lineno;		/* line number for error messages */
+	const char *cur_attname;	/* current att for error messages */
+	const char *cur_attval;		/* current att value for error messages */
+	bool		relname_only;	/* don't output line number, att, etc. */
+
+	/*
+	 * Working state
+	 */
+	MemoryContext copycontext;	/* per-copy execution context */
+
+	AttrNumber	num_defaults;	/* count of att that are missing and have
+								 * default value */
+	FmgrInfo   *in_functions;	/* array of input functions for each attr */
+	Oid		   *typioparams;	/* array of element types for in_functions */
+	ErrorSaveContext *escontext;	/* soft error trapped during in_functions
+									 * execution */
+	uint64		num_errors;		/* total number of rows which contained soft
+								 * errors */
+	int		   *defmap;			/* array of default att numbers related to
+								 * missing att */
+	ExprState **defexprs;		/* array of default att expressions for all
+								 * att */
+	bool	   *defaults;		/* if DEFAULT marker was found for
+								 * corresponding att */
+	bool		simd_enabled;	/* use SIMD to scan for special chars? */
+
+	/*
+	 * True if the corresponding attribute is a constrained domain. This
+	 * will be populated only when ON_ERROR is SET_NULL, otherwise NULL.
+	 */
+	bool	   *domain_with_constraint;
+
+	bool		volatile_defexprs;	/* is any of defexprs volatile? */
+	List	   *range_table;	/* single element list of RangeTblEntry */
+	List	   *rteperminfos;	/* single element list of RTEPermissionInfo */
+	ExprState  *qualexpr;
+
+	TransitionCaptureState *transition_capture;
+
+	/*
+	 * These variables are used to reduce overhead in COPY FROM.
+	 *
+	 * attribute_buf holds the separated, de-escaped text for each field of
+	 * the current line.  The CopyReadAttributes functions return arrays of
+	 * pointers into this buffer.  We avoid palloc/pfree overhead by re-using
+	 * the buffer on each cycle.
+	 *
+	 * In binary COPY FROM, attribute_buf holds the binary data for the
+	 * current field, but the usage is otherwise similar.
+	 */
+	StringInfoData attribute_buf;
+
+	/* field raw data pointers found by COPY FROM */
+
+	int			max_fields;
+	char	  **raw_fields;
+
+	/*
+	 * Similarly, line_buf holds the whole input line being processed. The
+	 * input cycle is first to read the whole line into line_buf, and then
+	 * extract the individual attribute fields into attribute_buf.  line_buf
+	 * is preserved unmodified so that we can display it in error messages if
+	 * appropriate.  (In binary mode, line_buf is not used.)
+	 */
+	StringInfoData line_buf;
+	bool		line_buf_valid; /* contains the row being processed? */
+
+	/*
+	 * input_buf holds input data, already converted to database encoding.
+	 *
+	 * In text mode, CopyReadLine parses this data sufficiently to locate line
+	 * boundaries, then transfers the data to line_buf. We guarantee that
+	 * there is a \0 at input_buf[input_buf_len] at all times.  (In binary
+	 * mode, input_buf is not used.)
+	 *
+	 * If encoding conversion is not required, input_buf is not a separate
+	 * buffer but points directly to raw_buf.  In that case, input_buf_len
+	 * tracks the number of bytes that have been verified as valid in the
+	 * database encoding, and raw_buf_len is the total number of bytes stored
+	 * in the buffer.
+	 */
+#define INPUT_BUF_SIZE 65536	/* we palloc INPUT_BUF_SIZE+1 bytes */
+	char	   *input_buf;
+	int			input_buf_index;	/* next byte to process */
+	int			input_buf_len;	/* total # of bytes stored */
+	bool		input_reached_eof;	/* true if we reached EOF */
+	bool		input_reached_error;	/* true if a conversion error happened */
+	/* Shorthand for number of unconsumed bytes available in input_buf */
+#define INPUT_BUF_BYTES(cstate) ((cstate)->input_buf_len - (cstate)->input_buf_index)
+
+	/*
+	 * raw_buf holds raw input data read from the data source (file or client
+	 * connection), not yet converted to the database encoding.  Like with
+	 * 'input_buf', we guarantee that there is a \0 at raw_buf[raw_buf_len].
+	 */
+#define RAW_BUF_SIZE 65536		/* we palloc RAW_BUF_SIZE+1 bytes */
+	char	   *raw_buf;
+	int			raw_buf_index;	/* next byte to process */
+	int			raw_buf_len;	/* total # of bytes stored */
+	bool		raw_reached_eof;	/* true if we reached EOF */
+
+	/* Shorthand for number of unconsumed bytes available in raw_buf */
+#define RAW_BUF_BYTES(cstate) ((cstate)->raw_buf_len - (cstate)->raw_buf_index)
+
+	uint64		bytes_processed;	/* number of bytes processed so far */
+} CopyFromStateData;
+
+/*
+ * Represents the different dest cases we need to worry about at
+ * the bottom level
+ */
+typedef enum CopyDest
+{
+	COPY_DEST_FILE,				/* to file (or a piped program) */
+	COPY_DEST_FRONTEND,			/* to frontend */
+	COPY_DEST_CALLBACK,			/* to callback function */
+} CopyDest;
+
+/*
+ * This struct contains all the state variables used throughout a COPY TO
+ * operation.
+ *
+ * Multi-byte encodings: all supported client-side encodings encode multi-byte
+ * characters by having the first byte's high bit set. Subsequent bytes of the
+ * character can have the high bit not set. When scanning data in such an
+ * encoding to look for a match to a single-byte (ie ASCII) character, we must
+ * use the full pg_encoding_mblen() machinery to skip over multibyte
+ * characters, else we might find a false match to a trailing byte. In
+ * supported server encodings, there is no possibility of a false match, and
+ * it's faster to make useless comparisons to trailing bytes than it is to
+ * invoke pg_encoding_mblen() to skip over them. encoding_embeds_ascii is true
+ * when we have to do it the hard way.
+ */
+typedef struct CopyToStateData
+{
+	/* format-specific routines */
+	const struct CopyToRoutine *routine;
+
+	/* low-level state data */
+	CopyDest	copy_dest;		/* type of copy destination */
+	FILE	   *copy_file;		/* used if copy_dest == COPY_DEST_FILE */
+	StringInfo	fe_msgbuf;		/* used for all dests during COPY TO */
+
+	int			file_encoding;	/* file or remote side's character encoding */
+	bool		need_transcoding;	/* file encoding diff from server? */
+	bool		encoding_embeds_ascii;	/* ASCII can be non-first byte? */
+
+	/* parameters from the COPY command */
+	Relation	rel;			/* relation to copy to */
+	QueryDesc  *queryDesc;		/* executable query to copy from */
+	List	   *attnumlist;		/* integer list of attnums to copy */
+	char	   *filename;		/* filename, or NULL for STDOUT */
+	bool		is_program;		/* is 'filename' a program to popen? */
+	bool		json_row_delim_needed;	/* need delimiter before next row */
+	StringInfo	json_buf;		/* reusable buffer for JSON output,
+								 * initialized in BeginCopyTo */
+	TupleDesc	tupDesc;		/* Descriptor for JSON output; for a column
+								 * list this is a projected descriptor */
+	Datum	   *json_projvalues;	/* pre-allocated projection values, or
+									 * NULL */
+	bool	   *json_projnulls; /* pre-allocated projection nulls, or NULL */
+	copy_data_dest_cb data_dest_cb; /* function for writing data */
+
+	CopyFormatOptions opts;
+	Node	   *whereClause;	/* WHERE condition (or NULL) */
+	List	   *partitions;		/* OID list of partitions to copy data from */
+
+	/*
+	 * Working state
+	 */
+	MemoryContext copycontext;	/* per-copy execution context */
+
+	FmgrInfo   *out_functions;	/* lookup info for output functions */
+	MemoryContext rowcontext;	/* per-row evaluation context */
+	uint64		bytes_processed;	/* number of bytes processed so far */
+} CopyToStateData;
+
+#endif							/* COPY_STATE_H */
diff --git a/src/include/commands/copyfrom_internal.h b/src/include/commands/copyfrom_internal.h
index 9d3e244ee55..f7afade9a39 100644
--- a/src/include/commands/copyfrom_internal.h
+++ b/src/include/commands/copyfrom_internal.h
@@ -14,31 +14,7 @@
 #ifndef COPYFROM_INTERNAL_H
 #define COPYFROM_INTERNAL_H
 
-#include "commands/copy.h"
-#include "commands/trigger.h"
-#include "nodes/miscnodes.h"
-
-/*
- * Represents the different source cases we need to worry about at
- * the bottom level
- */
-typedef enum CopySource
-{
-	COPY_FILE,					/* from file (or a piped program) */
-	COPY_FRONTEND,				/* from frontend */
-	COPY_CALLBACK,				/* from callback function */
-} CopySource;
-
-/*
- *	Represents the end-of-line terminator type of the input
- */
-typedef enum EolType
-{
-	EOL_UNKNOWN,
-	EOL_NL,
-	EOL_CR,
-	EOL_CRNL,
-} EolType;
+#include "commands/copy_state.h"
 
 /*
  * Represents the insert method to be used during COPY FROM.
@@ -52,145 +28,6 @@ typedef enum CopyInsertMethod
 								 * ExecForeignBatchInsert only if valid */
 } CopyInsertMethod;
 
-/*
- * This struct contains all the state variables used throughout a COPY FROM
- * operation.
- */
-typedef struct CopyFromStateData
-{
-	/* format routine */
-	const struct CopyFromRoutine *routine;
-
-	/* low-level state data */
-	CopySource	copy_src;		/* type of copy source */
-	FILE	   *copy_file;		/* used if copy_src == COPY_FILE */
-	StringInfo	fe_msgbuf;		/* used if copy_src == COPY_FRONTEND */
-
-	EolType		eol_type;		/* EOL type of input */
-	int			file_encoding;	/* file or remote side's character encoding */
-	bool		need_transcoding;	/* file encoding diff from server? */
-	Oid			conversion_proc;	/* encoding conversion function */
-
-	/* parameters from the COPY command */
-	Relation	rel;			/* relation to copy from */
-	List	   *attnumlist;		/* integer list of attnums to copy */
-	char	   *filename;		/* filename, or NULL for STDIN */
-	bool		is_program;		/* is 'filename' a program to popen? */
-	copy_data_source_cb data_source_cb; /* function for reading data */
-
-	CopyFormatOptions opts;
-	bool	   *convert_select_flags;	/* per-column CSV/TEXT CS flags */
-	Node	   *whereClause;	/* WHERE condition (or NULL) */
-
-	/* these are just for error messages, see CopyFromErrorCallback */
-	const char *cur_relname;	/* table name for error messages */
-	uint64		cur_lineno;		/* line number for error messages */
-	const char *cur_attname;	/* current att for error messages */
-	const char *cur_attval;		/* current att value for error messages */
-	bool		relname_only;	/* don't output line number, att, etc. */
-
-	/*
-	 * Working state
-	 */
-	MemoryContext copycontext;	/* per-copy execution context */
-
-	AttrNumber	num_defaults;	/* count of att that are missing and have
-								 * default value */
-	FmgrInfo   *in_functions;	/* array of input functions for each attrs */
-	Oid		   *typioparams;	/* array of element types for in_functions */
-	ErrorSaveContext *escontext;	/* soft error trapped during in_functions
-									 * execution */
-	uint64		num_errors;		/* total number of rows which contained soft
-								 * errors */
-	int		   *defmap;			/* array of default att numbers related to
-								 * missing att */
-	ExprState **defexprs;		/* array of default att expressions for all
-								 * att */
-	bool	   *defaults;		/* if DEFAULT marker was found for
-								 * corresponding att */
-	bool		simd_enabled;	/* use SIMD to scan for special chars? */
-
-	/*
-	 * True if the corresponding attribute's is a constrained domain. This
-	 * will be populated only when ON_ERROR is SET_NULL, otherwise NULL.
-	 */
-	bool	   *domain_with_constraint;
-
-	bool		volatile_defexprs;	/* is any of defexprs volatile? */
-	List	   *range_table;	/* single element list of RangeTblEntry */
-	List	   *rteperminfos;	/* single element list of RTEPermissionInfo */
-	ExprState  *qualexpr;
-
-	TransitionCaptureState *transition_capture;
-
-	/*
-	 * These variables are used to reduce overhead in COPY FROM.
-	 *
-	 * attribute_buf holds the separated, de-escaped text for each field of
-	 * the current line.  The CopyReadAttributes functions return arrays of
-	 * pointers into this buffer.  We avoid palloc/pfree overhead by re-using
-	 * the buffer on each cycle.
-	 *
-	 * In binary COPY FROM, attribute_buf holds the binary data for the
-	 * current field, but the usage is otherwise similar.
-	 */
-	StringInfoData attribute_buf;
-
-	/* field raw data pointers found by COPY FROM */
-
-	int			max_fields;
-	char	  **raw_fields;
-
-	/*
-	 * Similarly, line_buf holds the whole input line being processed. The
-	 * input cycle is first to read the whole line into line_buf, and then
-	 * extract the individual attribute fields into attribute_buf.  line_buf
-	 * is preserved unmodified so that we can display it in error messages if
-	 * appropriate.  (In binary mode, line_buf is not used.)
-	 */
-	StringInfoData line_buf;
-	bool		line_buf_valid; /* contains the row being processed? */
-
-	/*
-	 * input_buf holds input data, already converted to database encoding.
-	 *
-	 * In text mode, CopyReadLine parses this data sufficiently to locate line
-	 * boundaries, then transfers the data to line_buf. We guarantee that
-	 * there is a \0 at input_buf[input_buf_len] at all times.  (In binary
-	 * mode, input_buf is not used.)
-	 *
-	 * If encoding conversion is not required, input_buf is not a separate
-	 * buffer but points directly to raw_buf.  In that case, input_buf_len
-	 * tracks the number of bytes that have been verified as valid in the
-	 * database encoding, and raw_buf_len is the total number of bytes stored
-	 * in the buffer.
-	 */
-#define INPUT_BUF_SIZE 65536	/* we palloc INPUT_BUF_SIZE+1 bytes */
-	char	   *input_buf;
-	int			input_buf_index;	/* next byte to process */
-	int			input_buf_len;	/* total # of bytes stored */
-	bool		input_reached_eof;	/* true if we reached EOF */
-	bool		input_reached_error;	/* true if a conversion error happened */
-	/* Shorthand for number of unconsumed bytes available in input_buf */
-#define INPUT_BUF_BYTES(cstate) ((cstate)->input_buf_len - (cstate)->input_buf_index)
-
-	/*
-	 * raw_buf holds raw input data read from the data source (file or client
-	 * connection), not yet converted to the database encoding.  Like with
-	 * 'input_buf', we guarantee that there is a \0 at raw_buf[raw_buf_len].
-	 */
-#define RAW_BUF_SIZE 65536		/* we palloc RAW_BUF_SIZE+1 bytes */
-	char	   *raw_buf;
-	int			raw_buf_index;	/* next byte to process */
-	int			raw_buf_len;	/* total # of bytes stored */
-	bool		raw_reached_eof;	/* true if we reached EOF */
-
-	/* Shorthand for number of unconsumed bytes available in raw_buf */
-#define RAW_BUF_BYTES(cstate) ((cstate)->raw_buf_len - (cstate)->raw_buf_index)
-
-	uint64		bytes_processed;	/* number of bytes processed so far */
-} CopyFromStateData;
-
 extern void ReceiveCopyBegin(CopyFromState cstate);
 extern void ReceiveCopyBinaryHeader(CopyFromState cstate);
 
-- 
2.54.0
