Re: Schema variables - new implementation for Postgres 15

2021-12-18 Thread Erikjan Rijkers

On 19-12-2021 at 08:13, Pavel Stehule wrote:



On Sun, 19 Dec 2021 at 8:09, Erik Rijkers wrote:


  > [0001-schema-variables-20211219.patch]
  > [0002-schema-variables-20211219.patch]

Hi Pavel,

I get an error during test 'session_variables'.

(on the upside, my own little testsuite runs without error)

thanks,


Please, can you send me the regress diff?



I did attach it but if you did not receive it, see also cfbot, especially

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.156992


Erik





Re: Schema variables - new implementation for Postgres 15

2021-12-18 Thread Pavel Stehule
On Sun, 19 Dec 2021 at 8:13, Pavel Stehule wrote:

>
>
> On Sun, 19 Dec 2021 at 8:09, Erik Rijkers wrote:
>
>> On 19-12-2021 at 07:23, Pavel Stehule wrote:
>>
>> >
>> > I am sending new versions of patches.
>> >
>> > I hope I solved all Tomas's objections.
>> >
>> > 1. The schema variables were renamed to session variables
>> > 2. I fixed issues related to creating, dropping variables under
>> > subtransactions + regress tests
>> > 3. I fixed issues in pg_dump + regress tests
>> >
>>
>>  > [0001-schema-variables-20211219.patch]
>>  > [0002-schema-variables-20211219.patch]
>>
>> Hi Pavel,
>>
>> I get an error during test 'session_variables'.
>>
>> (on the upside, my own little testsuite runs without error)
>>
>> thanks,
>>
>
> Please, can you send me the regress diff?
>

I see the problem now: the test output contains the username, and that is wrong.

 Schema | Name | Type | Is nullable | Is mutable | Default | Owner | Transactional end action | Access privileges | Description
----------+------+---------+-------------+------------+---------+-------+--------------------------+-------------------+-------------
- svartest | var1 | numeric | t | t | | pavel | | pavel=SW/pavel +|
- | | | | | | | | var_test_role=SW/pavel |
+----------+------+---------+-------------+------------+---------+----------+--------------------------+---------------------------+-------------
+ svartest | var1 | numeric | t | t | | appveyor | | appveyor=SW/appveyor +|
+ | | | | | | | | var_test_role=SW/appveyor |
(1 row)
REVOKE ALL ON VARIABLE var1 FROM var_test_role;

I have to remove this test

Pavel

Regards
>
> Pavel
>
>
>
>> Erik Rijkers


Re: Schema variables - new implementation for Postgres 15

2021-12-18 Thread Pavel Stehule
On Sun, 19 Dec 2021 at 8:09, Erik Rijkers wrote:

> On 19-12-2021 at 07:23, Pavel Stehule wrote:
>
> >
> > I am sending new versions of patches.
> >
> > I hope I solved all Tomas's objections.
> >
> > 1. The schema variables were renamed to session variables
> > 2. I fixed issues related to creating, dropping variables under
> > subtransactions + regress tests
> > 3. I fixed issues in pg_dump + regress tests
> >
>
>  > [0001-schema-variables-20211219.patch]
>  > [0002-schema-variables-20211219.patch]
>
> Hi Pavel,
>
> I get an error during test 'session_variables'.
>
> (on the upside, my own little testsuite runs without error)
>
> thanks,
>

Please, can you send me the regress diff?

Regards

Pavel



> Erik Rijkers


Re: Schema variables - new implementation for Postgres 15

2021-12-18 Thread Erik Rijkers

On 19-12-2021 at 07:23, Pavel Stehule wrote:



I am sending new versions of patches.

I hope I solved all Tomas's objections.

1. The schema variables were renamed to session variables
2. I fixed issues related to creating, dropping variables under 
subtransactions + regress tests

3. I fixed issues in pg_dump + regress tests



> [0001-schema-variables-20211219.patch]
> [0002-schema-variables-20211219.patch]

Hi Pavel,

I get an error during test 'session_variables'.

(on the upside, my own little testsuite runs without error)

thanks,

Erik Rijkers









diff -U3 /home/aardvark/pg_stuff/pg_sandbox/pgsql.schema_variables/src/test/regress/expected/session_variables.out /home/aardvark/pg_stuff/pg_sandbox/pgsql.schema_variables/src/test/regress/results/session_variables.out
--- /home/aardvark/pg_stuff/pg_sandbox/pgsql.schema_variables/src/test/regress/expected/session_variables.out	2021-12-19 07:48:40.422821574 +0100
+++ /home/aardvark/pg_stuff/pg_sandbox/pgsql.schema_variables/src/test/regress/results/session_variables.out	2021-12-19 08:00:29.286820957 +0100
@@ -32,11 +32,11 @@
 SET ROLE TO DEFAULT;
 -- check output
 \dV+ var1
-                                                         List of variables
-  Schema  | Name |  Type   | Is nullable | Is mutable | Default | Owner | Transactional end action |   Access privileges    | Description 
----------+------+---------+-------------+------------+---------+-------+--------------------------+------------------------+-------------
- svartest | var1 | numeric | t           | t          |         | pavel |                          | pavel=SW/pavel        +| 
-          |      |         |             |            |         |       |                          | var_test_role=SW/pavel | 
+                                                            List of variables
+  Schema  | Name |  Type   | Is nullable | Is mutable | Default |  Owner   | Transactional end action |     Access privileges      | Description 
+----------+------+---------+-------------+------------+---------+----------+--------------------------+----------------------------+-------------
+ svartest | var1 | numeric | t           | t          |         | aardvark |                          | aardvark=SW/aardvark      +| 
+          |      |         |             |            |         |          |                          | var_test_role=SW/aardvark  | 
 (1 row)
 
 REVOKE ALL ON VARIABLE var1 FROM var_test_role;
@@ -745,10 +745,10 @@
 CREATE VARIABLE var1 AS int DEFAULT 100;
 COMMENT ON VARIABLE var1 IS 'some variable comment';
 \dV+ var1
-                                                   List of variables
-  Schema  | Name |  Type   | Is nullable | Is mutable | Default | Owner | Transactional end action | Access privileges |      Description      
----------+------+---------+-------------+------------+---------+-------+--------------------------+-------------------+-----------------------
- svartest | var1 | integer | t           | t          | 100     | pavel |                          |                   | some variable comment
+                                                      List of variables
+  Schema  | Name |  Type   | Is nullable | Is mutable | Default |  Owner   | Transactional end action | Access privileges |      Description      
+----------+------+---------+-------------+------------+---------+----------+--------------------------+-------------------+-----------------------
+ svartest | var1 | integer | t           | t          | 100     | aardvark |                          |                   | some variable comment
 (1 row)
 
 DROP VARIABLE var1;


Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

2021-12-18 Thread Pavel Stehule
Hi

On Sat, 18 Dec 2021 at 9:55, Damir Belyalov wrote:

> Hello.
>
> Wrote a patch implementing COPY with ignoring errors in rows using block
> subtransactions.
>

It is great that you are working on this patch. Unfortunately, I am afraid
this simple design is not optimal. Using a subtransaction for every row has
too much overhead. I think it should use a subtransaction per block of rows
(say, 1000 rows), and only when there is an exception should it replay the
inserts of that block with a subtransaction per row. You should check the
performance overhead.

Regards

Pavel



> Syntax: COPY [table] FROM [file/stdin] WITH IGNORE_ERRORS;
>
> Examples:
> CREATE TABLE check_ign_err (n int, m int, k int);
> COPY check_ign_err FROM STDIN WITH IGNORE_ERRORS;
> 1 1 1
> 2 2 2 2
> 3 3 3
> \.
> WARNING:  COPY check_ign_err, line 2: "2 2 2 2"
> SELECT * FROM check_ign_err;
>  n | m | k
> ---+---+---
>  1 | 1 | 1
>  3 | 3 | 3
> (2 rows)
>
> ##
>
> TRUNCATE check_ign_err;
> COPY check_ign_err FROM STDIN WITH IGNORE_ERRORS;
> 1 1 1
> 2 2
> 3 3 3
> \.
> WARNING:  COPY check_ign_err, line 2: "2 2"
> SELECT * FROM check_ign_err;
>  n | m | k
> ---+---+---
>  1 | 1 | 1
>  3 | 3 | 3
> (2 rows)
>
> ##
>
> TRUNCATE check_ign_err;
> COPY check_ign_err FROM STDIN WITH IGNORE_ERRORS;
> 1 1 1
> 2 a 2
> 3 3 3
> \.
> WARNING:  COPY check_ign_err, line 2, column m: "a"
> SELECT * FROM check_ign_err;
>  n | m | k
> ---+---+---
>  1 | 1 | 1
>  3 | 3 | 3
> (2 rows)
>
>
>
> Regards, Damir
>
> On Fri, 10 Dec 2021 at 21:48, Pavel Stehule wrote:
>
>>
>>
>> 2014-12-26 11:41 GMT+01:00 Pavel Stehule :
>>
>>>
>>>
>>> 2014-12-25 22:23 GMT+01:00 Alex Shulgin :
>>>
 Trent Shipley  writes:

 > On Friday 2007-12-14 16:22, Tom Lane wrote:
 >> Neil Conway  writes:
 >> > By modifying COPY: COPY IGNORE ERRORS or some such would instruct
 COPY
 >> > to drop (and log) rows that contain malformed data. That is, rows
 with
 >> > too many or too few columns, rows that result in constraint
 violations,
 >> > and rows containing columns where the data type's input function
 raises
 >> > an error. The last case is the only thing that would be a bit
 tricky to
 >> > implement, I think: you could use PG_TRY() around the
 InputFunctionCall,
 >> > but I guess you'd need a subtransaction to ensure that you reset
 your
 >> > state correctly after catching an error.
 >>
 >> Yeah.  It's the subtransaction per row that's daunting --- not only
 the
 >> cycles spent for that, but the ensuing limitation to 4G rows imported
 >> per COPY.
 >
 > You could extend the COPY FROM syntax with a COMMIT EVERY n clause.
 This
 > would help with the 4G subtransaction limit.  The cost to the ETL
 process is
 > that a simple rollback would not be guaranteed send the process back
 to it's
 > initial state.  There are easy ways to deal with the rollback issue
 though.
 >
 > A {NO} RETRY {USING algorithm} clause might be useful.   If the NO
 RETRY
 > option is selected then the COPY FROM can run without subtransactions
 and in
 > excess of the 4G per transaction limit.  NO RETRY should be the
 default since
 > it preserves the legacy behavior of COPY FROM.
 >
 > You could have an EXCEPTIONS TO {filename|STDERR} clause. I would not
 give the
 > option of sending exceptions to a table since they are presumably
 malformed,
 > otherwise they would not be exceptions.  (Users should re-process
 exception
 > files if they want an if good then table a else exception to table b
 ...)
 >
 > EXCEPTIONS TO and NO RETRY would be mutually exclusive.
 >
 >
 >> If we could somehow only do a subtransaction per failure, things
 would
 >> be much better, but I don't see how.

 Hello,

 Attached is a proof of concept patch for this TODO item.  There is no
 docs yet, I just wanted to know if approach is sane.

 The added syntax is like the following:

   COPY [table] FROM [file/program/stdin] EXCEPTIONS TO [file or stdout]

 The way it's done it is abusing Copy Both mode and from my limited
 testing, that seems to just work.  The error trapping itself is done
 using PG_TRY/PG_CATCH and can only catch formatting or before-insert
 trigger errors, no attempt is made to recover from a failed unique
 constraint, etc.

 Example in action:

 postgres=# \d test_copy2
   Table "public.test_copy2"
  Column |  Type   | Modifiers
 +-+---
  id | integer |
  val| integer |

 postgres=# copy test_copy2 from program 'seq 3' exceptions to stdout;
 1
 NOTICE:  missing data for column "val"
 CONTEXT:  COPY test_copy2, line 1: "1"
 2
 NOTICE:  missing data for column "val"
 CONTEXT:  

Re: sequences vs. synchronous replication

2021-12-18 Thread Amit Kapila
On Sat, Dec 18, 2021 at 7:24 AM Tomas Vondra
 wrote:
>
> while working on logical decoding of sequences, I ran into an issue with
> nextval() in a transaction that rolls back, described in [1]. But after
> thinking about it a bit more (and chatting with Petr Jelinek), I think
> this issue affects physical sync replication too.
>
> Imagine you have a primary <-> sync_replica cluster, and you do this:
>
>CREATE SEQUENCE s;
>
>-- shutdown the sync replica
>
>BEGIN;
>SELECT nextval('s') FROM generate_series(1,50);
>ROLLBACK;
>
>BEGIN;
>SELECT nextval('s');
>COMMIT;
>
> The natural expectation would be the COMMIT gets stuck, waiting for the
> sync replica (which is not running), right? But it does not.
>

How about if we always WAL log the first sequence change in a transaction?

-- 
With Regards,
Amit Kapila.




Re: [EXTERNAL] Re: should we document an example to set multiple libraries in shared_preload_libraries?

2021-12-18 Thread Maciek Sakrejda
On Fri, Dec 10, 2021 at 10:10 AM Godfrin, Philippe E
 wrote:
> I may have missed something in this stream, but is this a system controlled 
> by Patroni?

In my case, no: it's a pretty vanilla PGDG install on Ubuntu 20.04.
Thanks for the context, though.

Thanks,
Maciek




Re: should we document an example to set multiple libraries in shared_preload_libraries?

2021-12-18 Thread Maciek Sakrejda
On Thu, Dec 9, 2021 at 7:42 AM Bharath Rupireddy
 wrote:
> How about ALTER SYSTEM SET shared_preload_libraries command itself
> checking for availability of the specified libraries (after fetching
> library names from the parsed string value) with stat() and then
> report an error if any of the libraries doesn't exist? Currently, it
> just accepts if the specified value passes the parsing.

That certainly would have helped me. I guess it technically breaks the
theoretical use case of "first change the shared_preload_libraries
setting, then install those libraries, then restart Postgres," but I
don't see why anyone would do that in practice.

For what it's worth, I don't even feel strongly that this needs to be
an error—just that the current flow around this is error-prone due to
several sharp edges and could be improved. I would be happy with an
error, but a warning or other mechanism could work, too. I do think
better documentation is not enough: the differences between a working
setting value and a broken one are pretty subtle.
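
For reference, the kind of example under discussion might look like the
following (just a sketch using the existing list syntax and placeholder
library names, not a proposed doc patch):

ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements', 'auto_explain';

-- or, equivalently, the postgresql.conf spelling:
-- shared_preload_libraries = 'pg_stat_statements, auto_explain'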

Thanks,
Maciek




Getting rid of regression test input/ and output/ files

2021-12-18 Thread Tom Lane
I did some desultory investigation of the idea proposed at [1]
that we should refactor the regression test scripts to try to
reduce their interdependencies.  I soon realized that one of
the stumbling blocks to this is that we've tended to concentrate
data-loading COPY commands, as well as C function creation
commands, into a few files to reduce the notational cruft of
substituting path names and the like into the test scripts.
That is, we don't want to have even more scripts that have to be
translated from input/foo.source and output/foo.source into
runnable scripts and test results.

This led me to wonder why we couldn't get rid of that entire
mechanism in favor of some less-painful way of getting that
information into the scripts.  If we had the desired values in
psql variables, we could do what we need easily, for example
instead of

CREATE FUNCTION check_primary_key ()
RETURNS trigger
AS '@libdir@/refint@DLSUFFIX@'
LANGUAGE C;

something like

CREATE FUNCTION check_primary_key ()
RETURNS trigger
AS :'LIBDIR'
'/refint'
:'DLSUFFIX'
LANGUAGE C;

(The extra line breaks are needed to convince SQL that the
adjacent string literals should be concatenated.  We couldn't
have done this so easily before psql had the :'variable'
notation, but that came in in 9.0.)

I see two ways we could get the info from pg_regress into psql
variables:

1. Add "-v VARIABLE=VALUE" switches to the psql invocations.
This requires no new psql capability, but it does introduce
the problem of getting correct shell quoting of the values.
I think we'd need to either duplicate appendShellString in
pg_regress.c, or start linking both libpq and libpgfeutils.a
into pg_regress to be able to use appendShellString itself.
In the past we've not wanted to link libpq into pg_regress
(though I admit I've forgotten the argument for not doing so).

2. Export the values from pg_regress as environment variables,
and then add a way for the test scripts to read those variables.
I was a bit surprised to realize that we didn't have any way
to do that already --- psql has \setenv, so why did we never
invent \getenv?

On the whole I prefer #2, as it seems cleaner and it adds some
actually useful-to-end-users psql functionality.

Attached is a really incomplete, quick-n-dirty POC showing that
this can be made to work.  If there aren't objections or better
ideas, I'll see about fleshing this out.
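
For illustration, a converted test script might then look something like
this (a hypothetical sketch: \getenv is the command from the attached POC,
and PG_LIBDIR / PG_DLSUFFIX are made-up names for whatever pg_regress would
actually export):

\getenv libdir PG_LIBDIR
\getenv dlsuffix PG_DLSUFFIX

CREATE FUNCTION check_primary_key ()
RETURNS trigger
AS :'libdir'
'/refint'
:'dlsuffix'
LANGUAGE C;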

regards, tom lane

[1] 
https://www.postgresql.org/message-id/20211217182518.GA2529654%40rfd.leadboat.com

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index ccd7b48108..fb3bab9494 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -98,6 +98,8 @@ static backslashResult process_command_g_options(char *first_option,
  bool active_branch,
  const char *cmd);
 static backslashResult exec_command_gdesc(PsqlScanState scan_state, bool active_branch);
+static backslashResult exec_command_getenv(PsqlScanState scan_state, bool active_branch,
+		   const char *cmd);
 static backslashResult exec_command_gexec(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_gset(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_help(PsqlScanState scan_state, bool active_branch);
@@ -348,6 +350,8 @@ exec_command(const char *cmd,
 		status = exec_command_g(scan_state, active_branch, cmd);
 	else if (strcmp(cmd, "gdesc") == 0)
 		status = exec_command_gdesc(scan_state, active_branch);
+	else if (strcmp(cmd, "getenv") == 0)
+		status = exec_command_getenv(scan_state, active_branch, cmd);
 	else if (strcmp(cmd, "gexec") == 0)
 		status = exec_command_gexec(scan_state, active_branch);
 	else if (strcmp(cmd, "gset") == 0)
@@ -1481,6 +1485,43 @@ exec_command_gdesc(PsqlScanState scan_state, bool active_branch)
 	return status;
 }
 
+/*
+ * \getenv -- set variable from environment variable
+ */
+static backslashResult
+exec_command_getenv(PsqlScanState scan_state, bool active_branch,
+	const char *cmd)
+{
+	bool		success = true;
+
+	if (active_branch)
+	{
+		char	   *myvar = psql_scan_slash_option(scan_state,
+   OT_NORMAL, NULL, false);
+		char	   *envvar = psql_scan_slash_option(scan_state,
+	OT_NORMAL, NULL, false);
+
+		if (!myvar || !envvar)
+		{
+			pg_log_error("\\%s: missing required argument", cmd);
+			success = false;
+		}
+		else
+		{
+			char	   *envval = getenv(envvar);
+
+			if (envval && !SetVariable(pset.vars, myvar, envval))
+				success = false;
+		}
+		free(myvar);
+		free(envvar);
+	}
+	else
+		ignore_slash_options(scan_state);
+
+	return success ? PSQL_CMD_SKIP_LINE : PSQL_CMD_ERROR;
+}
+
 /*
  * \gexec -- send query and execute each field of result
  */
diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index 8acb516801..ad42d02a22 100644
--- a/src/test/regress/input/copy.source
+++ 

Re: [PATCH] psql: \dn+ to show size of each schema (and \dA+ for AMs)

2021-12-18 Thread Justin Pryzby
Remove bogus ACL check for AMs.
Rebased on cf0cab868.
Use ForkNumber rather than int.
Update comments and commit message.
Also move the Size column of \l and \dt
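
As a quick usage sketch (not part of the patch; just how the functions added
by the attached patch would presumably be called, with pg_size_pretty() being
the existing formatting helper):

SELECT pg_size_pretty(pg_namespace_size('public'));
SELECT pg_size_pretty(pg_am_size('heap'));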
From a1d33780e4c61a7bdb64ee6ab5c248e1175a3efd Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 13 Jul 2021 21:25:48 -0500
Subject: [PATCH v4 1/4] Add pg_am_size(), pg_namespace_size() ..

See also: 358a897fa, 528ac10c7
---
 src/backend/utils/adt/dbsize.c  | 130 
 src/include/catalog/pg_proc.dat |  19 +
 2 files changed, 149 insertions(+)

diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index d5a7fb13f3..908432ba28 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -13,18 +13,23 @@
 
 #include 
 
+#include "access/genam.h"
 #include "access/htup_details.h"
 #include "access/relation.h"
+#include "access/table.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
 #include "catalog/pg_authid.h"
 #include "catalog/pg_tablespace.h"
 #include "commands/dbcommands.h"
+#include "commands/defrem.h"
 #include "commands/tablespace.h"
 #include "miscadmin.h"
 #include "storage/fd.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/numeric.h"
 #include "utils/rel.h"
 #include "utils/relfilenodemap.h"
@@ -832,6 +837,131 @@ pg_size_bytes(PG_FUNCTION_ARGS)
 	PG_RETURN_INT64(result);
 }
 
+/*
+ * Return the sum of size of relations for which the given attribute of
+ * pg_class matches the specified OID value.
+ */
+static int64
+calculate_size_attvalue(int attnum, Oid attval)
+{
+	int64		totalsize = 0;
+	ScanKeyData 	skey;
+	Relation	pg_class;
+	SysScanDesc	scan;
+	HeapTuple	tuple;
+
+	ScanKeyInit(&skey, attnum,
+				BTEqualStrategyNumber, F_OIDEQ, attval);
+
+	pg_class = table_open(RelationRelationId, AccessShareLock);
+	scan = systable_beginscan(pg_class, InvalidOid, false, NULL, 1, &skey);
+	while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+	{
+		Relation rel;
+		Form_pg_class classtuple = (Form_pg_class) GETSTRUCT(tuple);
+
+		rel = try_relation_open(classtuple->oid, AccessShareLock);
+		if (!rel)
+			continue;
+
+		for (ForkNumber forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
+			totalsize += calculate_relation_size(&rel->rd_node, rel->rd_backend, forkNum);
+
+		relation_close(rel, AccessShareLock);
+	}
+
+	systable_endscan(scan);
+	table_close(pg_class, AccessShareLock);
+	return totalsize;
+}
+
+/* Compute the size of relations in a schema (namespace) */
+static int64
+calculate_namespace_size(Oid nspOid)
+{
+	/*
+	 * User must be a member of pg_read_all_stats or have CREATE privilege for
+	 * target namespace.
+	 */
+	if (!is_member_of_role(GetUserId(), ROLE_PG_READ_ALL_STATS))
+	{
+		AclResult aclresult;
+		aclresult = pg_namespace_aclcheck(nspOid, GetUserId(), ACL_CREATE);
+		if (aclresult != ACLCHECK_OK)
+			aclcheck_error(aclresult, OBJECT_SCHEMA,
+		   get_namespace_name(nspOid));
+	}
+
+	return calculate_size_attvalue(Anum_pg_class_relnamespace, nspOid);
+}
+
+Datum
+pg_namespace_size_oid(PG_FUNCTION_ARGS)
+{
+	int64		size;
+	Oid			nspOid = PG_GETARG_OID(0);
+
+	size = calculate_namespace_size(nspOid);
+
+	if (size < 0)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(size);
+}
+
+Datum
+pg_namespace_size_name(PG_FUNCTION_ARGS)
+{
+	int64		size;
+	Name		nspName = PG_GETARG_NAME(0);
+	Oid			nspOid = get_namespace_oid(NameStr(*nspName), false);
+
+	size = calculate_namespace_size(nspOid);
+
+	if (size < 0)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(size);
+}
+
+/* Compute the size of relations using the given access method */
+static int64
+calculate_am_size(Oid amOid)
+{
+	/* XXX acl_check? */
+
+	return calculate_size_attvalue(Anum_pg_class_relam, amOid);
+}
+
+Datum
+pg_am_size_oid(PG_FUNCTION_ARGS)
+{
+	int64		size;
+	Oid			amOid = PG_GETARG_OID(0);
+
+	size = calculate_am_size(amOid);
+
+	if (size < 0)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(size);
+}
+
+Datum
+pg_am_size_name(PG_FUNCTION_ARGS)
+{
+	int64		size;
+	Name		amName = PG_GETARG_NAME(0);
+	Oid			amOid = get_am_oid(NameStr(*amName), false);
+
+	size = calculate_am_size(amOid);
+
+	if (size < 0)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(size);
+}
+
 /*
  * Get the filenode of a relation
  *
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 4d992dc224..ff08b2ac01 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7230,6 +7230,25 @@
   descr => 'total disk space usage for the specified tablespace',
   proname => 'pg_tablespace_size', provolatile => 'v', prorettype => 'int8',
   proargtypes => 'name', prosrc => 'pg_tablespace_size_name' },
+
+{ oid => '9410',
+  descr => 'total disk space usage for the specified namespace',
+  proname => 'pg_namespace_size', provolatile => 'v', prorettype => 'int8',
+  proargtypes => 'oid', prosrc => 'pg_namespace_size_oid' },
+{ oid => '9411',
+  descr => 'total disk space usage for the specified 

Re: sequences vs. synchronous replication

2021-12-18 Thread Tom Lane
Tomas Vondra  writes:
> What do you mean by "not caching unused sequence numbers"? Reducing 
> SEQ_LOG_VALS to 1, i.e. WAL-logging every sequence increment?

Right.

> That'd work, but I wonder how significant the impact will be.

As I said, we've accepted worse in order to have stable replication
behavior.

regards, tom lane




Re: sequences vs. synchronous replication

2021-12-18 Thread Tomas Vondra




On 12/18/21 22:27, Tom Lane wrote:

Tomas Vondra  writes:

Here's a PoC demonstrating this idea. I'm not convinced it's the right
way to deal with this - it surely seems more like a duct tape fix than a
clean solution. But it does the trick.


I was imagining something a whole lot simpler, like "don't try to
cache unused sequence numbers when wal_level > minimal".  We've
accepted worse performance hits in that operating mode, and it'd
fix a number of user complaints we've seen about weird sequence
behavior on standbys.



What do you mean by "not caching unused sequence numbers"? Reducing 
SEQ_LOG_VALS to 1, i.e. WAL-logging every sequence increment?


That'd work, but I wonder how significant the impact will be. I'd bet 
it hurts the patch adding logical decoding of sequences quite a bit.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: sequences vs. synchronous replication

2021-12-18 Thread Tom Lane
Tomas Vondra  writes:
> Here's a PoC demonstrating this idea. I'm not convinced it's the right 
> way to deal with this - it surely seems more like a duct tape fix than a 
> clean solution. But it does the trick.

I was imagining something a whole lot simpler, like "don't try to
cache unused sequence numbers when wal_level > minimal".  We've
accepted worse performance hits in that operating mode, and it'd
fix a number of user complaints we've seen about weird sequence
behavior on standbys.

regards, tom lane




Re: Is my home $HOME or is it getpwent()->pw_dir ?

2021-12-18 Thread Chapman Flack
On 12/18/21 16:16, David G. Johnston wrote:
> psql docs saith:
> 
> "This utility, like most other PostgreSQL utilities, also uses the
> environment variables supported by libpq (see Section 34.15)."

I'm sure that's adequate as far as that goes. I just happened to miss it
when composing the longer email (and then I just thought "I bet there are
environment variables supported by libpq" and looked there).

Regards,
-Chap




Re: Is my home $HOME or is it getpwent()->pw_dir ?

2021-12-18 Thread David G. Johnston
On Sat, Dec 18, 2021 at 2:07 PM Chapman Flack  wrote:

> On 12/18/21 15:57, Chapman Flack wrote:
> > I see that I can set
> > a HISTFILE variable (or set PSQL_HISTORY in the environment),
> > and can set PSQLRC in the environment (but not as a variable),
> > and nothing can set the .pgpass location
>
> well, not in the psql docs, but in the environment variable section
> for libpq I do see a PGPASSFILE.
>
>
psql docs saith:

"This utility, like most other PostgreSQL utilities, also uses the
environment variables supported by libpq (see Section 34.15)."

David J.


Re: Is my home $HOME or is it getpwent()->pw_dir ?

2021-12-18 Thread Chapman Flack
On 12/18/21 15:57, Chapman Flack wrote:
> I see that I can set
> a HISTFILE variable (or set PSQL_HISTORY in the environment),
> and can set PSQLRC in the environment (but not as a variable),
> and nothing can set the .pgpass location

well, not in the psql docs, but in the environment variable section
for libpq I do see a PGPASSFILE.

-C




Is my home $HOME or is it getpwent()->pw_dir ?

2021-12-18 Thread Chapman Flack
Hi,

I sometimes do some testing as nobody, on a distro where
getpwent(nobody)->pw_dir is a directory that nobody can't write.
So I end up setting $HOME to a directory that, um, is writable.

When I start psql, strace shows $HOME being honored when looking
for .terminfo and .inputrc, and getpwent()->pw_dir being used
to look for .pgpass, .psqlrc, and .psql_history, which of course
aren't there.

I'm sure the .terminfo and .inputrc lookups are being done by library code.
In my experience, it seems traditionally unixy to let $HOME take precedence.

Maybe things that are pointedly cross-platform are more likely to rely
on the getpwent lookup. I run into the same issue with Java, which is
pointedly cross-platform.

But there, I can alias java to java -Duser.home="$HOME" and all is well.

Would a patch be acceptable for psql to allow such an option
on the command line? I assume that would be more acceptable than
just changing the default behavior.

And if so, would it be preferable to add a whole new option for it,
(--home ?) or, analogously to the way java works, just to add a
HOME variable so it can be set on the command line with -v ?

Or would a name like HOME pose too much risk that somebody is using
such a variable in psql scripts for unrelated purposes?

In a moment of hopefulness I tried \set and looked to see if such
a thing already exists, but I didn't see it. I see that I can set
a HISTFILE variable (or set PSQL_HISTORY in the environment),
and can set PSQLRC in the environment (but not as a variable),
and nothing can set the .pgpass location. One HOME variable could
take care of all three in one foop.

(Or could it? Perhaps .pgpass is handled in libpq at a layer unaware
of psql variables? But maybe the variable could have a modify event
that alerts libpq.)


Regards,
-Chap




Re: sequences vs. synchronous replication

2021-12-18 Thread Tomas Vondra



On 12/18/21 07:00, Tomas Vondra wrote:



On 12/18/21 05:52, Tom Lane wrote:

Tomas Vondra  writes:

The problem is exactly the same as in [1] - the aborted transaction
generated WAL, but RecordTransactionAbort() ignores that and does not
update LogwrtResult.Write, with the reasoning that aborted transactions
do not matter. But sequences violate that, because we only write WAL
once every 32 increments, so the following nextval() gets "committed"
without waiting for the replica (because it did not produce WAL).


Ugh.


I'm not sure this is a clear data corruption bug, but it surely walks
and quacks like one. My proposal is to fix this by tracking the LSN of 
the last WAL-logged sequence increment, and then check that LSN in 
RecordTransactionCommit() before calling XLogFlush().


(1) Does that work if the aborted increment was in a different
session?  I think it is okay but I'm tired enough to not be sure.



Good point - it doesn't :-( At least not by simply storing LSN in a 
global variable or something like that.


The second backend needs to know the LSN of the last WAL-logged sequence 
increment, but only the first backend knows that. So we'd need to share 
that between backends somehow. I doubt we want to track LSN for every 
individual sequence (because for clusters with many dbs / sequences that 
may be a lot).


Perhaps we could track just a fixed number of LSN values in shared memory 
(say, 1024), and update/read just the element determined by hash(oid). 
That is, the backend WAL-logging sequence with given oid would set the 
current LSN to array[hash(oid) % 1024], and backend doing nextval() 
would simply remember the LSN in that slot. Yes, if there are conflicts 
that'll flush more than needed.




Here's a PoC demonstrating this idea. I'm not convinced it's the right 
way to deal with this - it surely seems more like a duct tape fix than a 
clean solution. But it does the trick.


I wonder if storing this in shmem is good enough - we lose the LSN info 
on restart, but the checkpoint should trigger FPI which makes it OK.


A bigger question is whether sequences are the only thing affected by 
this. If you look at RecordTransactionCommit() then we skip flush/wait 
in two cases:


1) !wrote_xlog - if the xact did not produce WAL

2) !markXidCommitted - if the xact does not have a valid XID

Both apply to sequences, and the PoC patch tweaks them. But maybe there 
are other places where we don't generate WAL and/or assign XID in some 
cases, to save time?



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Companydiff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index e7b0bc804d..c6a0c07846 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -35,6 +35,7 @@
 #include "catalog/pg_enum.h"
 #include "catalog/storage.h"
 #include "commands/async.h"
+#include "commands/sequence.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "common/pg_prng.h"
@@ -1289,6 +1290,13 @@ RecordTransactionCommit(void)
 	bool		RelcacheInitFileInval = false;
 	bool		wrote_xlog;
 
+	/*
+	 * Force flush and synchronous replication even if no XID was assigned.
+	 * This is needed when depending on WAL produced by another transaction
+	 * (possibly in a different backend). Sequences need this, for example.
+	 */
+	bool		force_sync = false;
+
 	/*
 	 * Log pending invalidations for logical decoding of in-progress
 	 * transactions.  Normally for DDLs, we log this at each command end,
@@ -1299,6 +1307,24 @@ RecordTransactionCommit(void)
 	if (XLogLogicalInfoActive())
 		LogLogicalInvalidations();
 
+	/*
+	 * Check the LSN of increments for sequences we touched in this transaction.
+	 * If it's higher than XactLastRecEnd, we need to force flush/sync.
+	 */
+	{
+		/* Separate call, so that we can compare to XactLastRecEnd. */
+		XLogRecPtr tmpptr = SequenceGetLastLSN();
+
+		/*
+		 * If higher than XactLastRecEnd, make sure we flush even without
+		 * a XID assigned.
+		 */
+		force_sync = (tmpptr > XactLastRecEnd);
+
+		/* Override the XactLastRecEnd value. */
+		XactLastRecEnd = Max(XactLastRecEnd, tmpptr);
+	}
+
 	/* Get data needed for commit record */
 	nrels = smgrGetPendingDeletes(true, &rels);
 	nchildren = xactGetCommittedChildren(&children);
@@ -1446,7 +1472,7 @@ RecordTransactionCommit(void)
 	 * if all to-be-deleted tables are temporary though, since they are lost
 	 * anyway if we crash.)
 	 */
-	if ((wrote_xlog && markXidCommitted &&
+	if ((wrote_xlog && (markXidCommitted || force_sync) &&
 		 synchronous_commit > SYNCHRONOUS_COMMIT_OFF) ||
 		forceSyncCommit || nrels > 0)
 	{
@@ -1504,7 +1530,7 @@ RecordTransactionCommit(void)
 	 * Note that at this stage we have marked clog, but still show as running
 	 * in the procarray and continue to hold locks.
 	 */
-	if (wrote_xlog && markXidCommitted)
+	if (wrote_xlog && (markXidCommitted || force_sync))
 		SyncRepWaitForLSN(XactLastRecEnd, true);
 
 	

Re: Column Filtering in Logical Replication

2021-12-18 Thread Alvaro Herrera
On 2021-Dec-18, Tomas Vondra wrote:

> On 12/18/21 02:34, Alvaro Herrera wrote:
> > On 2021-Dec-17, Tomas Vondra wrote:

> > > > If the server has a *separate* security mechanism to hide the
> > > > columns (per-column privs), it is that feature that will protect
> > > > the data, not the logical-replication-feature to filter out
> > > > columns.
> > > 
> > > Right. Although I haven't thought about how logical decoding
> > > interacts with column privileges. I don't think logical decoding
> > > actually checks column privileges - I certainly don't recall any
> > > ACL checks in src/backend/replication ...
> > 
> > Well, in practice if you're confronted with a replica that's
> > controlled by a malicious user that can tweak its behavior, then
> > replica-side privilege checking won't do anything useful.
> 
> I don't follow. Surely the decoding happens on the primary node,
> right?  Which is where the ACL checks would happen, using the role the
> replication connection is opened with.

I think you do follow.  Yes, the decoding happens on the primary node,
and the security checks should occur in the primary node, because to do
otherwise is folly(*).  Which means that column filtering, being a
replica-side feature, is *not* a security feature.  I was mistaken about
it, is all.  If you want security, you need to use column-level
privileges, as you say.

(*) The checks *must* occur in the primary side, because the primary
does not control the code that runs in the replica side.  The primary
must treat the replica as running potentially hostile code.  Trying to
defend against that is not practical.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Adding CI to our tree

2021-12-18 Thread Andrew Dunstan


On 12/17/21 14:34, Andres Freund wrote:
> Hi,
>
> On 2021-12-17 09:36:05 -0500, Tom Lane wrote:
>> Andrew Dunstan  writes:
>>> Maye I have missed it, but why are we using ccache here? That seems a
>>> bit pointless in an ephemeral instance.
>> I believe Munro's cfbot tooling is able to save and re-use ccache
>> across successive instantiations of a build instance.  I've not
>> looked at this code, but if it can do that there'd be point to it.
> Yes, the ccache cache is persisted across runs (see the *_cache and
> upload_cache instructions). It makes quite a substantial difference. One
> reason the windows runs are a lot slower than the others is just that visual
> studio isn't yet supported by ccache, and that there don't seem to be good
> alternative tools.
>
> The ccache maintainers merged more of the msvc support last weekend! So I have
> quite a bit of hope of being able to use ccache there as well.
>

Ok. I have had to disable ccache for fairywren (msys2) because it caused
serious instability.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Refactoring of compression options in pg_basebackup

2021-12-18 Thread Michael Paquier
Hi all,
(Added Georgios in CC.)

When working on the support of LZ4 for pg_receivewal, walmethods.c has
gained an extra parameter for the compression method.  It gets used on
the DirectoryMethod instead of the compression level to decide which
type of compression is used.  One thing that I left out during this
previous work is that the TarMethod also gained knowledge of this
compression method, but we still use the compression level to check if
tars should be compressed or not.

This is wrong on multiple aspects.  First, this is not consistent with
the directory method, making walmethods.c harder to figure out.
Second, this is not extensible if we want to introduce more
compression methods in pg_basebackup, like LZ4.  This reflects on the
options used by pg_receivewal and pg_basebackup, that are not
inconsistent as well.

The attached patch refactors the code of pg_basebackup and the
TarMethod of walmethods.c to use the compression method where it
should, splitting entirely the logic related to the compression level.

This is one step toward the introduction of LZ4 in pg_basebackup, but
this refactoring is worth doing on its own, hence a separate thread to
deal with this problem first.  The options of pg_basebackup are
reworked to be consistent with pg_receivewal, as follows:
- --compress ranges now from 1 to 9, instead of 0 to 9.
- --compression-method={none,gzip} is added, the default is none, same
as HEAD.
- --gzip/-z has the same meaning as before, being just a synonym of
--compression-method=gzip with the default compression level of ZLIB
assigned if there is no --compress.

One more thing that I have noticed while hacking this stuff is that we
have no regression tests for gzip with pg_basebackup, so I have added
some that are skipped when not compiling the code with ZLIB.

Opinions?
--
Michael
From 6457eae3577cd0074f58797c4fa420c296d68b39 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Sat, 18 Dec 2021 20:27:30 +0900
Subject: [PATCH] Refactor options of pg_basebackup for compression level and
 methods

---
 src/bin/pg_basebackup/pg_basebackup.c| 79 +++-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 47 +++-
 src/bin/pg_basebackup/walmethods.c   | 47 +++-
 doc/src/sgml/ref/pg_basebackup.sgml  | 22 +-
 4 files changed, 155 insertions(+), 40 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 1739ac6382..01529aa036 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -113,6 +113,7 @@ static bool showprogress = false;
 static bool estimatesize = true;
 static int	verbose = 0;
 static int	compresslevel = 0;
+static WalCompressionMethod compression_method = COMPRESSION_NONE;
 static IncludeWal includewal = STREAM_WAL;
 static bool fastcheckpoint = false;
 static bool writerecoveryconf = false;
@@ -359,7 +360,9 @@ usage(void)
 	printf(_("  -X, --wal-method=none|fetch|stream\n"
 			 " include required WAL files with specified method\n"));
 	printf(_("  -z, --gzip compress tar output\n"));
-	printf(_("  -Z, --compress=0-9 compress tar output with given compression level\n"));
+	printf(_("  -Z, --compress=1-9 compress tar output with given compression level\n"));
+	printf(_("  --compression-method=METHOD\n"
+			 " method to compress data\n"));
 	printf(_("\nGeneral options:\n"));
 	printf(_("  -c, --checkpoint=fast|spread\n"
 			 " set fast or spread checkpointing\n"));
@@ -524,7 +527,7 @@ LogStreamerMain(logstreamer_param *param)
 	stream.do_sync);
 	else
 		stream.walmethod = CreateWalTarMethod(param->xlog,
-			  COMPRESSION_NONE, /* ignored */
+			  compression_method,
 			  compresslevel,
 			  stream.do_sync);
 
@@ -1034,19 +1037,22 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
 			archive_file = NULL;
 		}
 
+		if (compression_method == COMPRESSION_NONE)
+			streamer = bbstreamer_plain_writer_new(archive_filename,
+   archive_file);
 #ifdef HAVE_LIBZ
-		if (compresslevel != 0)
+		else if (compression_method == COMPRESSION_GZIP)
 		{
 			strlcat(archive_filename, ".gz", sizeof(archive_filename));
 			streamer = bbstreamer_gzip_writer_new(archive_filename,
   archive_file,
   compresslevel);
 		}
-		else
 #endif
-			streamer = bbstreamer_plain_writer_new(archive_filename,
-   archive_file);
-
+		else
+		{
+			Assert(false);		/* not reachable */
+		}
 
 		/*
 		 * If we need to parse the archive for whatever reason, then we'll
@@ -1765,6 +1771,7 @@ main(int argc, char **argv)
 		{"no-manifest", no_argument, NULL, 5},
 		{"manifest-force-encode", no_argument, NULL, 6},
 		{"manifest-checksums", required_argument, NULL, 7},
+		{"compression-method", required_argument, NULL, 8},
 		{NULL, 0, NULL, 0}
 	};
 	int			c;
@@ -1872,14 +1879,10 @@ 

Re: Addition of --no-sync to pg_upgrade for test speedup

2021-12-18 Thread Michael Paquier
On Fri, Dec 17, 2021 at 09:47:05AM -0500, Bruce Momjian wrote:
> On Fri, Dec 17, 2021 at 10:21:04AM +0100, Peter Eisentraut wrote:
>> I think that is reasonable.

Thanks.  I have applied that, as that really helped here.

>> Maybe we could have some global option, like some environment variable, that
>> enables the "sync" mode in all tests, so it's easy to test that once in a
>> while.  Not really a requirement for your patch, but an idea in case this is
>> a concern.
> 
> Yes, I think it would be good to see all the places we might want to
> pass the no-sync option.

The remaining places in src/bin/ that I can see are pg_resetwal, where
we would fsync() a WAL segment full of zeros, and pg_recvlogical
OutputFsync(), which does not point to much data, I guess.  The first
one may be worth it, but that's just 16MB we are talking about and
WriteEmptyXLOG() is not a code path taken currently by the tests.

We could introduce a new environment variable if one wishes to enforce
those flushes, say PG_TEST_SYNC, on top of patching any TAP test that
has a --no-sync to filter it out.
--
Michael




Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

2021-12-18 Thread Damir Belyalov
Hello.

Wrote a patch implementing COPY with ignoring errors in rows using block
subtransactions.

Syntax: COPY [table] FROM [file/stdin] WITH IGNORE_ERRORS;

Examples:
CREATE TABLE check_ign_err (n int, m int, k int);
COPY check_ign_err FROM STDIN WITH IGNORE_ERRORS;
1 1 1
2 2 2 2
3 3 3
\.
WARNING:  COPY check_ign_err, line 2: "2 2 2 2"
SELECT * FROM check_ign_err;
 n | m | k
---+---+---
 1 | 1 | 1
 3 | 3 | 3
(2 rows)

##

TRUNCATE check_ign_err;
COPY check_ign_err FROM STDIN WITH IGNORE_ERRORS;
1 1 1
2 2
3 3 3
\.
WARNING:  COPY check_ign_err, line 2: "2 2"
SELECT * FROM check_ign_err;
 n | m | k
---+---+---
 1 | 1 | 1
 3 | 3 | 3
(2 rows)

##

TRUNCATE check_ign_err;
COPY check_ign_err FROM STDIN WITH IGNORE_ERRORS;
1 1 1
2 a 2
3 3 3
\.
WARNING:  COPY check_ign_err, line 2, column m: "a"
SELECT * FROM check_ign_err;
 n | m | k
---+---+---
 1 | 1 | 1
 3 | 3 | 3
(2 rows)



Regards, Damir

On Fri, 10 Dec 2021 at 21:48, Pavel Stehule wrote:

>
>
> 2014-12-26 11:41 GMT+01:00 Pavel Stehule :
>
>>
>>
>> 2014-12-25 22:23 GMT+01:00 Alex Shulgin :
>>
>>> Trent Shipley  writes:
>>>
>>> > On Friday 2007-12-14 16:22, Tom Lane wrote:
>>> >> Neil Conway  writes:
>>> >> > By modifying COPY: COPY IGNORE ERRORS or some such would instruct
>>> COPY
>>> >> > to drop (and log) rows that contain malformed data. That is, rows
>>> with
>>> >> > too many or too few columns, rows that result in constraint
>>> violations,
>>> >> > and rows containing columns where the data type's input function
>>> raises
>>> >> > an error. The last case is the only thing that would be a bit
>>> tricky to
>>> >> > implement, I think: you could use PG_TRY() around the
>>> InputFunctionCall,
>>> >> > but I guess you'd need a subtransaction to ensure that you reset
>>> your
>>> >> > state correctly after catching an error.
>>> >>
>>> >> Yeah.  It's the subtransaction per row that's daunting --- not only
>>> the
>>> >> cycles spent for that, but the ensuing limitation to 4G rows imported
>>> >> per COPY.
>>> >
>>> > You could extend the COPY FROM syntax with a COMMIT EVERY n clause.
>>> This
>>> > would help with the 4G subtransaction limit.  The cost to the ETL
>>> process is
>>> > that a simple rollback would not be guaranteed send the process back
>>> to it's
>>> > initial state.  There are easy ways to deal with the rollback issue
>>> though.
>>> >
>>> > A {NO} RETRY {USING algorithm} clause might be useful.   If the NO
>>> RETRY
>>> > option is selected then the COPY FROM can run without subtransactions
>>> and in
>>> > excess of the 4G per transaction limit.  NO RETRY should be the
>>> default since
>>> > it preserves the legacy behavior of COPY FROM.
>>> >
>>> > You could have an EXCEPTIONS TO {filename|STDERR} clause. I would not
>>> give the
>>> > option of sending exceptions to a table since they are presumably
>>> malformed,
>>> > otherwise they would not be exceptions.  (Users should re-process
>>> exception
>>> > files if they want an if good then table a else exception to table b
>>> ...)
>>> >
>>> > EXCEPTIONS TO and NO RETRY would be mutually exclusive.
>>> >
>>> >
>>> >> If we could somehow only do a subtransaction per failure, things would
>>> >> be much better, but I don't see how.
>>>
>>> Hello,
>>>
>>> Attached is a proof of concept patch for this TODO item.  There is no
>>> docs yet, I just wanted to know if approach is sane.
>>>
>>> The added syntax is like the following:
>>>
>>>   COPY [table] FROM [file/program/stdin] EXCEPTIONS TO [file or stdout]
>>>
>>> The way it's done it is abusing Copy Both mode and from my limited
>>> testing, that seems to just work.  The error trapping itself is done
>>> using PG_TRY/PG_CATCH and can only catch formatting or before-insert
>>> trigger errors, no attempt is made to recover from a failed unique
>>> constraint, etc.
>>>
>>> Example in action:
>>>
>>> postgres=# \d test_copy2
>>>   Table "public.test_copy2"
>>>  Column |  Type   | Modifiers
>>> +-+---
>>>  id | integer |
>>>  val| integer |
>>>
>>> postgres=# copy test_copy2 from program 'seq 3' exceptions to stdout;
>>> 1
>>> NOTICE:  missing data for column "val"
>>> CONTEXT:  COPY test_copy2, line 1: "1"
>>> 2
>>> NOTICE:  missing data for column "val"
>>> CONTEXT:  COPY test_copy2, line 2: "2"
>>> 3
>>> NOTICE:  missing data for column "val"
>>> CONTEXT:  COPY test_copy2, line 3: "3"
>>> NOTICE:  total exceptions ignored: 3
>>>
>>> postgres=# \d test_copy1
>>>   Table "public.test_copy1"
>>>  Column |  Type   | Modifiers
>>> +-+---
>>>  id | integer | not null
>>>
>>> postgres=# set client_min_messages to warning;
>>> SET
>>> postgres=# copy test_copy1 from program 'ls /proc' exceptions to stdout;
>>> ...
>>> vmstat
>>> zoneinfo
>>> postgres=#
>>>
>>> Limited performance testing shows no significant difference between
>>> error-catching and plain code path.  For example, timing
>>>
>>>