Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-12-31 Thread Masahiko Sawada
On Sun, Dec 1, 2019 at 12:03 PM Michael Paquier  wrote:
>
> On Fri, Nov 01, 2019 at 09:38:37AM +0900, Moon, Insung wrote:
> > Of course, I may not have written the excellent quality code
> > correctly, so I will make an interim report if possible.
>
> The last patch has rotten, and does not apply anymore.  A rebase would
> be nice, so I am switching the patch as waiting on author, and moved
> it to next CF.
>

We have discussed off-list and in weekly voice meetings over several
months the purpose of this feature and the target for PG13, and we
concluded to step back and focus only on the internal key management
system for PG13. Transparent data encryption support is now targeted
for PG14 or later. The key management system is an important piece of
infrastructure for TDE, but it can work independently of TDE. The plan
for PG13 we discussed is to introduce an internal key management
system that has one encryption key for the whole database cluster and
provides an interface to obtain encryption keys that are managed
inside the PostgreSQL database, in order to integrate it with other
components such as pgcrypto.

The idea is to be able to encrypt and decrypt data without ever
knowing the actual key that was used to encrypt it. The attached patch
has two APIs to wrap and unwrap a secret using the encryption key
stored inside the database cluster. The user generates a secret key
locally, sends it to the PostgreSQL server to be wrapped by
pg_kmgr_wrap(), and saves the result somewhere. The user can then use
the saved, wrapped secret key to encrypt and decrypt user data with
something like:

INSERT INTO tbl VALUES (pg_encrypt('user data', pg_kmgr_unwrap('x')));

where 'x' is the result of the pg_kmgr_wrap function.
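For illustration, the wrap/unwrap flow can be sketched outside the server. This is a deliberately toy model: the function names mirror the patch's SQL functions, but the XOR keystream below merely stands in for the real AES-based wrapping, and the server-side key handling is simplified.

```python
import hashlib
import secrets

MASTER_KEY = secrets.token_bytes(32)  # held by the server; never leaves it


def _keystream(key: bytes, n: int) -> bytes:
    # Toy SHA-256 counter keystream; stands in for AES, not real crypto.
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def kmgr_wrap(secret: bytes) -> bytes:
    # Models pg_kmgr_wrap(): encrypt the user's secret under the master key.
    return bytes(a ^ b for a, b in zip(secret, _keystream(MASTER_KEY, len(secret))))


def kmgr_unwrap(wrapped: bytes) -> bytes:
    # Models pg_kmgr_unwrap(): XOR with the same keystream is its own inverse.
    return kmgr_wrap(wrapped)


user_secret = secrets.token_bytes(32)   # generated locally by the user
wrapped = kmgr_wrap(user_secret)        # saved client-side; useless without MASTER_KEY
assert kmgr_unwrap(wrapped) == user_secret
```

The point of the design is visible even in the toy: the client only ever stores the wrapped form, so leaking it does not leak the secret.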

I've attached the KMS patch. It requires the OpenSSL library. What the
patch does is simple and has not changed much from the previous
version that included both KMS and TDE: we generate one encryption
key, called the master key, for the whole database cluster at initdb
time. It is stored in pg_control and wrapped by a key encryption key
(KEK) derived from a user-provided passphrase. When the postmaster
starts up, it verifies the correctness of the passphrase provided by
the user, using an HMAC key that is also derived from the
user-provided passphrase. The server won't start if the passphrase is
incorrect. Once the passphrase is verified, the master key is loaded
into shared memory and becomes active.
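The startup verification described above can be sketched as follows. The actual key derivation function, parameters, and verifier layout in the patch may differ; PBKDF2 and the verifier message here are assumptions for illustration only.

```python
import hashlib
import hmac
import os


def derive_keys(passphrase: bytes, salt: bytes):
    """Derive a KEK and an HMAC key from the passphrase (sketch; the
    KDF and its parameters are illustrative, not the patch's)."""
    material = hashlib.pbkdf2_hmac("sha256", passphrase, salt, 100_000, dklen=64)
    return material[:32], material[32:]          # (KEK, HMAC key)


# At initdb time: derive keys and store a verifier alongside the control data.
salt = os.urandom(16)
kek, hmac_key = derive_keys(b"password", salt)
verifier = hmac.new(hmac_key, b"kmgr passphrase check", "sha256").digest()


def verify_passphrase(candidate: bytes) -> bool:
    """At postmaster startup: re-derive the HMAC key from the candidate
    passphrase and check it against the stored verifier; the server
    refuses to start on a mismatch."""
    _, cand_hmac_key = derive_keys(candidate, salt)
    cand = hmac.new(cand_hmac_key, b"kmgr passphrase check", "sha256").digest()
    return hmac.compare_digest(cand, verifier)
```

Note that the master key itself is never needed to reject a wrong passphrase; the HMAC verifier alone suffices.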

I added two options to initdb: --cluster-passphrase-command and -e,
which take a passphrase command and a cipher algorithm (currently only
aes-128 and aes-256), respectively. The internal KMS is enabled by
running initdb with those options, as follows:

$ initdb -D data --cluster-passphrase-command="echo 'password'" -e aes-256

I believe the internal KMS would be useful in several use cases, but
I'd like to have a discussion around the integration with pgcrypto,
because pgcrypto would be the first user of the KMS and can become
more powerful with it. I'll register this KMS patch in the next Commit
Fest.

I really appreciate the people (CC'd) who participated in the
off-list discussions and meetings for their many inputs, suggestions,
and code reviews.

Regards,


--
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/configure b/configure
index 3d9bd0bdf8..19f927cfb8 100755
--- a/configure
+++ b/configure
@@ -12111,7 +12111,7 @@ done
   # defines OPENSSL_VERSION_NUMBER to claim version 2.0.0, even though it
   # doesn't have these OpenSSL 1.1.0 functions. So check for individual
   # functions.
-  for ac_func in OPENSSL_init_ssl BIO_get_data BIO_meth_new ASN1_STRING_get0_data
+  for ac_func in OPENSSL_init_ssl OPENSSL_init_crypto BIO_get_data BIO_meth_new ASN1_STRING_get0_data
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/configure.in b/configure.in
index 34aea0b4ac..0a94ea37c5 100644
--- a/configure.in
+++ b/configure.in
@@ -1193,7 +1193,7 @@ if test "$with_openssl" = yes ; then
   # defines OPENSSL_VERSION_NUMBER to claim version 2.0.0, even though it
   # doesn't have these OpenSSL 1.1.0 functions. So check for individual
   # functions.
-  AC_CHECK_FUNCS([OPENSSL_init_ssl BIO_get_data BIO_meth_new ASN1_STRING_get0_data])
+  AC_CHECK_FUNCS([OPENSSL_init_ssl OPENSSL_init_crypto BIO_get_data BIO_meth_new ASN1_STRING_get0_data])
   # OpenSSL versions before 1.1.0 required setting callback functions, for
   # thread-safety. In 1.1.0, it's no longer required, and CRYPTO_lock()
   # function was removed.
diff --git a/src/backend/Makefile b/src/backend/Makefile
index b0d2be7ee0..64ce50474f 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -21,7 +21,7 @@ SUBDIRS = access bootstrap catalog parser commands executor foreign lib libpq \
 	main nodes optimizer partitioning port postmaster \
 	regex replication rewrite \
 	statistics storage tcop tsearch utils $(top_builddir)/src/timezone \
-	jit
+	jit crypto
 
 include $(srcdir)/co

Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-12-31 Thread Ibrar Ahmed
On Tue, Dec 31, 2019 at 1:05 PM Masahiko Sawada 
wrote:

> On Sun, Dec 1, 2019 at 12:03 PM Michael Paquier 
> wrote:
> >
> > On Fri, Nov 01, 2019 at 09:38:37AM +0900, Moon, Insung wrote:
> > > Of course, I may not have written the excellent quality code
> > > correctly, so I will make an interim report if possible.
> >
> > The last patch has rotten, and does not apply anymore.  A rebase would
> > be nice, so I am switching the patch as waiting on author, and moved
> > it to next CF.
> >
>
> We have discussed off-list and in weekly voice meetings over several
> months the purpose of this feature and the target for PG13, and we
> concluded to step back and focus only on the internal key management
> system for PG13. Transparent data encryption support is now targeted
> for PG14 or later. The key management system is an important piece of
> infrastructure for TDE, but it can work independently of TDE. The plan
> for PG13 we discussed is to introduce an internal key management
> system that has one encryption key for the whole database cluster and
> provides an interface to obtain encryption keys that are managed
> inside the PostgreSQL database, in order to integrate it with other
> components such as pgcrypto.
>
> The idea is to be able to encrypt and decrypt data without ever
> knowing the actual key that was used to encrypt it. The attached patch
> has two APIs to wrap and unwrap a secret using the encryption key
> stored inside the database cluster. The user generates a secret key
> locally, sends it to the PostgreSQL server to be wrapped by
> pg_kmgr_wrap(), and saves the result somewhere. The user can then use
> the saved, wrapped secret key to encrypt and decrypt user data with
> something like:
>
> INSERT INTO tbl VALUES (pg_encrypt('user data', pg_kmgr_unwrap('x')));
>
> where 'x' is the result of the pg_kmgr_wrap function.
>
> I've attached the KMS patch. It requires the OpenSSL library. What the
> patch does is simple and has not changed much from the previous
> version that included both KMS and TDE: we generate one encryption
> key, called the master key, for the whole database cluster at initdb
> time. It is stored in pg_control and wrapped by a key encryption key
> (KEK) derived from a user-provided passphrase. When the postmaster
> starts up, it verifies the correctness of the passphrase provided by
> the user, using an HMAC key that is also derived from the
> user-provided passphrase. The server won't start if the passphrase is
> incorrect. Once the passphrase is verified, the master key is loaded
> into shared memory and becomes active.
>
> I added two options to initdb: --cluster-passphrase-command and -e,
> which take a passphrase command and a cipher algorithm (currently only
> aes-128 and aes-256), respectively. The internal KMS is enabled by
> running initdb with those options, as follows:
>
> $ initdb -D data --cluster-passphrase-command="echo 'password'" -e aes-256
>
> I believe the internal KMS would be useful in several use cases, but
> I'd like to have a discussion around the integration with pgcrypto,
> because pgcrypto would be the first user of the KMS and can become
> more powerful with it. I'll register this KMS patch in the next Commit
> Fest.
>
It is already there: "KMS - Internal key management system"
(https://commitfest.postgresql.org/26/2196/).

> I really appreciate the people (CC'd) who participated in the
> off-list discussions and meetings for their many inputs, suggestions,
> and code reviews.
>
> Regards,
>
>
> --
> Masahiko Sawada  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


-- 
Ibrar Ahmed


Re: remove support for old Python versions

2019-12-31 Thread Peter Eisentraut

On 2019-12-09 11:37, Peter Eisentraut wrote:

Per discussion in [0], here is a patch set to remove support for Python
versions older than 2.6.


[0]:
https://www.postgresql.org/message-id/6d3b7b69-0970-4d40-671a-268c46e93...@2ndquadrant.com


It appears that the removal of old OpenSSL support is stalled or 
abandoned, so this patch set is on hold for now as well, as far as 
I'm concerned.  I have committed the change to the Python exception 
syntax in the documentation, though.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[PATCH] lazy relations delete

2019-12-31 Thread Maxim Orlov

Hi!

Here is the case.

Assume we have master-to-slave replication with shared_buffers set to 
2 GB on the master and 4 GB on the slave. All data is written to the 
master, while reads occur on the slave.


Now we decide to drop many tables, let's say 1000 or 10000, not in a 
single transaction but each table in a separate one. So, because of the 
"plain" shared_buffers memory, we have to walk all buffers in a loop 
for every relation, which leads to lag between master and slave.


In a real-world scenario such an issue leads not to minutes of lag, 
but hours. At the same time, PostgreSQL has a great routine to delete 
many relations in a single transaction.


So, to get rid of this kind of issue, an idea came up: what if we do 
not process every dropped relation right away, but instead store the 
dropped relations in an array and prevent the shared buffers that 
correspond to deleted relations from being flushed. Then, when the 
array reaches its maximum size, we walk all buffers only once to 
"free" the shared buffers corresponding to the deleted relations.
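As a rough model of this idea (the array size is taken from the patch; everything else here is illustrative, not the patch's C code):

```python
LAZY_DELETE_ARRAY_SIZE = 128


class LazyDropTracker:
    """Toy model of the lazy-delete scheme: dropped relations are
    collected in a fixed-size batch, and the buffer pool is walked
    only once per batch instead of once per relation."""

    def __init__(self, shared_buffers):
        self.buffers = shared_buffers   # index -> relation id (or None)
        self.deleted = []               # batch of dropped relations

    def drop_relation(self, rel):
        self.deleted.append(rel)
        if len(self.deleted) >= LAZY_DELETE_ARRAY_SIZE:
            self.flush()

    def is_deleted(self, rel):
        # The patch's FlushBuffer() analogously skips writing out
        # buffers that belong to a lazily dropped relation.
        return rel in self.deleted

    def flush(self):
        # One pass over all buffers invalidates every batched relation.
        dropped = set(self.deleted)
        for i, rel in enumerate(self.buffers):
            if rel in dropped:
                self.buffers[i] = None
        self.deleted.clear()
```

With a batch of 128, dropping 1000 tables costs about 8 full buffer-pool walks instead of 1000, which is where the speedup in the numbers below comes from.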


Here are some numbers from the tests I made.

Without patch:

1.

(master 2 GB) - drop 1000 tables took 6 sec

(slave 4 GB) - drop 1000 tables took 8 sec

2.

(master 4 GB) - drop 1000 tables took 10 sec

(slave 8 GB) - drop 1000 tables took 16 sec

3.

(master 10 GB) - drop 1000 tables took 22 sec

(slave 20 GB) - drop 1000 tables took 38 sec


With patch:

1.

(master 2 GB) - drop 1000 tables took 2 sec

(slave 4 GB) - drop 1000 tables took 2 sec

2.

(master 4 GB) - drop 1000 tables took 3 sec

(slave 8 GB) - drop 1000 tables took 3 sec

3.

(master 10 GB) - drop 1000 tables took 4 sec

(slave 20 GB) - drop 1000 tables took 4 sec

--
Max Orlov
E-mail: m.or...@postgrespro.ru



init.sh
Description: application/shellscript


run-001.sh
Description: application/shellscript


run-001-core.sh
Description: application/shellscript
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 1f10a97dc78..e0d1fb6a824 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -66,8 +66,6 @@
 #define BUF_WRITTEN0x01
 #define BUF_REUSABLE			0x02
 
-#define DROP_RELS_BSEARCH_THRESHOLD		20
-
 typedef struct PrivateRefCountEntry
 {
 	Buffer		buffer;
@@ -410,6 +408,72 @@ ForgetPrivateRefCountEntry(PrivateRefCountEntry *ref)
 	}
 }
 
+/*
+ * Lazy relations delete mechanism.
+ *
+ * On relation drop there is no need to walk shared buffers every time; just
+ * put the deleted RelFileNode into an array. Thus we store RelFileNodes in
+ * the RelNodeDeleted struct and delete them later, when the number of
+ * deleted relations exceeds LAZY_DELETE_ARRAY_SIZE.
+ */
+
+#define	LAZY_DELETE_ARRAY_SIZE 128
+
+/* Wrapper for array of lazy deleted RelFileNodes. */
+typedef struct RelNodeDeletedBuffer
+{
+	RelFileNode 	rnodes[LAZY_DELETE_ARRAY_SIZE];	/* Array of deleted */
+	int 			idx;			/* Current position */
+} RelNodeDeletedBuffer;
+
+static RelNodeDeletedBuffer *RelNodeDeleted = NULL;
+
+/*
+ * Initialize shared array of lazy deleted relations.
+ *
+ * This is called once during shared-memory initialization (either in the
+ * postmaster, or in a standalone backend).
+ */
+void InitRelNodeDeletedBuffer(void)
+{
+	bool 	found;
+
+	RelNodeDeleted = ShmemInitStruct("Lazy Deleted Relations",
+sizeof(RelNodeDeletedBuffer),
+&found);
+
+	if (!found)
+		MemSet(RelNodeDeleted, 0, sizeof(RelNodeDeletedBuffer));
+}
+
+/*
+ * Compute the size of shared memory for the buffer of lazy deleted relations.
+ */
+Size
+RelNodeDeletedBufferSize(void)
+{
+	Size		size = 0;
+
+	size = add_size(size, sizeof(RelNodeDeletedBuffer));
+
+	return size;
+}
+
+/*
+ * Check whether the relation is in the lazy deleted buffer (up to the n-th position).
+ */
+static inline bool
+IsRelFileNodeDeleted(RelFileNode rnode, int n)
+{
+	int i;
+
+	for (i = 0; i < n; i++)
+		if (RelFileNodeEquals(rnode, RelNodeDeleted->rnodes[i]))
+			return true;
+
+	return false;
+}
+
 /*
  * BufferIsPinned
  *		True iff the buffer is pinned (also checks for valid buffer number).
@@ -452,6 +516,7 @@ static BufferDesc *BufferAlloc(SMgrRelation smgr,
 			   BlockNumber blockNum,
 			   BufferAccessStrategy strategy,
 			   bool *foundPtr);
+static void InvalidateRelFileNodesBuffers(RelFileNode *nodes, int nnodes);
 static void FlushBuffer(BufferDesc *buf, SMgrRelation reln);
 static void AtProcExit_Buffers(int code, Datum arg);
 static void CheckForBufferLeaks(void);
@@ -2690,6 +2755,22 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
 	char	   *bufToWrite;
 	uint32		buf_state;
 
+	LWLockAcquire(LazyRelDeleteLock, LW_SHARED);
+
+	/*
+	 * If rnode is in lazy deleted buffer clear dirty and checkpoint flags.
+	 */
+	if (IsRelFileNodeDeleted(buf->tag.rnode, RelNodeDeleted->idx))
+	{
+		buf_state = LockBufHdr(buf);
+		buf_state &= ~(BM_DIRTY | BM_CHECKPOINT_NEEDED);
+		UnlockBufHdr(buf, buf_state);
+		LWLockRelease(LazyRelDeleteLock);
+		return;
+	}
+
+	LWLockRelease(LazyRelDeleteLock);
+
 	/*
 	 * Acquire the buf

color by default

2019-12-31 Thread Peter Eisentraut
With the attached patch, I propose to enable the colored output by 
default in PG13.


For those who don't like color output, I also add support for the 
environment variable NO_COLOR, which is an emerging standard for turning 
off color across different software packages (https://no-color.org/). 
Of course, you can also continue to use the PG_COLOR variable.


I have looked around at how other packages do automatic color 
detection.  It's usually a combination of mysterious termcap stuff and 
slightly less mysterious matching of the TERM variable against a list of 
known terminal types.  I figured we can skip the termcap stuff and still 
get really good coverage in practice, so that's what I did.
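The resulting decision procedure might be sketched like this. The precedence (PG_COLOR first, then NO_COLOR, then a TERM heuristic) follows the documentation added by the patch; the list of recognized terminal types here is purely illustrative and may not match the patch's.

```python
import os
import sys

# Illustrative prefixes; the terminal types the patch recognizes may differ.
COLOR_TERMS = ("xterm", "screen", "tmux", "linux", "vt100",
               "rxvt", "konsole", "ansi", "cygwin", "putty")


def should_use_color(env=None, stderr_is_tty=None):
    """Decide whether to colorize, per the documented procedure:
    PG_COLOR wins if set, then NO_COLOR, then the TERM heuristic."""
    if env is None:
        env = os.environ
    if stderr_is_tty is None:
        stderr_is_tty = sys.stderr.isatty()
    pg_color = env.get("PG_COLOR")
    if pg_color is not None:
        if pg_color == "always":
            return True
        # "auto" needs a terminal; any other value disables color.
        return pg_color == "auto" and stderr_is_tty
    if "NO_COLOR" in env:
        return False
    term = env.get("TERM", "")
    return stderr_is_tty and any(term.startswith(t) for t in COLOR_TERMS)
```

Writing the rules in this order makes the "first rule that matches wins" reading explicit.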


I have also added a documentation appendix to explain all of this. 
(Perhaps we should now remove the repetitive mention of the PG_COLOR 
variable in each man page, but I haven't done that in this patch.)


I'm aware of the pending patch to improve color support on Windows. 
I'll check that one out as well, but it appears to be orthogonal to this 
one.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 58ca7359728959ccc9d7e96d6fbeb324714e9317 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 31 Dec 2019 11:20:21 +0100
Subject: [PATCH] Use colors by default

Refine the system by which it is decided whether colored output is
used.  Besides the PG_COLOR environment variable previously in use, we
now also observe the NO_COLOR environment variable.  Furthermore, if
neither of these is set, we check whether the terminal supports
colors using the TERM environment variable and use colors
automatically if so.

Also add a documentation appendix that explains how this works and how
the individual colors can be configured.
---
 doc/src/sgml/color.sgml| 102 +
 doc/src/sgml/filelist.sgml |   1 +
 doc/src/sgml/postgres.sgml |   1 +
 src/common/logging.c   |  41 +++
 4 files changed, 145 insertions(+)
 create mode 100644 doc/src/sgml/color.sgml

diff --git a/doc/src/sgml/color.sgml b/doc/src/sgml/color.sgml
new file mode 100644
index 00..f1697230e5
--- /dev/null
+++ b/doc/src/sgml/color.sgml
@@ -0,0 +1,102 @@
+
+
+
+ Color Support
+
+ 
+  color
+ 
+
+ 
+  Most programs in the PostgreSQL package can produce colorized console
+  output.  This appendix describes how that is configured.
+ 
+
+ 
+  When Color is Used
+
+  
+   The decision whether to use colorized output is made according to the
+   following procedure:
+
+   
+
+ 
+  If the environment variable
+  PG_COLORPG_COLOR
+  is set:
+ 
+
+ 
+  
+   
+If the value is always, then color is used.
+   
+  
+
+  
+   
+If the value is auto and the standard error stream
+is associated with a terminal device, then color is used.
+   
+  
+
+  
+   
+Otherwise, color is not used.
+   
+  
+ 
+
+
+
+ 
+  If the environment variable NO_COLOR is set, then color
+  is not used.  See https://no-color.org/"/> for more
+  information about this.
+ 
+
+
+
+ 
+  If the environment variable TERM is set to a terminal
+  type that is recognized to support color and the standard error stream
+  is associated with a terminal device, then color is used.
+ 
+
+
+
+ 
+  Otherwise, color is not used.
+ 
+
+   
+  
+ 
+
+ 
+  Configuring the Colors
+
+  
+   The actual colors to be used are configured using the environment variable
+   PG_COLORSPG_COLORS
+   (note plural).  The value is a colon-separated list of
+   
key=value
+   pairs.  The keys specify what the color is to be used for.  The values are
+   SGR (Select Graphic Rendition) specifications, which are interpreted by the
+   terminal.
+  
+
+  
+   The default value is error=01;31:warning=01;35:locus=01.
+  
+
+  
+   
+This color specification format is also used by other software packages
+such as GCC, GNU
+coreutils, and GNU grep.
+   
+  
+ 
+
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 3da2365ea9..1043d0f7ab 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -170,6 +170,7 @@
 
 
 
+
 
 
 
diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml
index e59cba7997..1f7bd32878 100644
--- a/doc/src/sgml/postgres.sgml
+++ b/doc/src/sgml/postgres.sgml
@@ -278,6 +278,7 @@ Appendixes
   &docguide;
   &limits;
   &acronyms;
+  &color;
 
  
 
diff --git a/src/common/logging.c b/src/common/logging.c
index 895da7150e..9b0c67046f 100644
--- a/src/common/logging.c
+++ b/src/common/logging.c
@@ -32,6 +32,37 @@ static const char *sgr_locus = NULL;
 #define ANSI_ESCAPE_FMT "\x1b[%sm"
 #define ANSI_ESCAPE_RESET "\x1b[0m"
 
+
+#define str_starts_with(str, substr) (strncmp(str, substr, strlen(substr)) == 0)
+#define str_end

Re: [PATCH] Fix PostgreSQL 12.1 server build and install problems under MSYS2

2019-12-31 Thread Peter Eisentraut

On 2019-12-04 11:32, Guram Duka wrote:
The master branch gets an error at the configure stage and then while 
compiling, just like the 12.1 branch.


checking how to link an embedded Python application... configure:
error: could not find shared library for Python
You might have to rebuild your Python installation.  Refer to the
documentation for details.  Use --without-python to disable building
PL/Python.

I registered the patch.


As explained to you in the previous patch, you need to explain each 
change in detail and show the problems you are facing.  Other people 
have used this platform before and it works for them, so you need to 
explain what's different for you, especially considering the extent of 
the changes you are proposing.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pgbench - use pg logging capabilities

2019-12-31 Thread Peter Eisentraut

On 2019-12-24 11:17, Fabien COELHO wrote:

As suggested in "cce64a51", this patch makes pgbench use postgres logging
capabilities.

I tried to use fatal/error/warning/info/debug where appropriate.

Some printing to stderr remains for some pgbench-specific output.


The patch seems pretty straightforward, but this

+/*
+ * Convenient shortcuts
+ */
+#define fatal pg_log_fatal
+#define error pg_log_error
+#define warning pg_log_warning
+#define info pg_log_info
+#define debug pg_log_debug

seems counterproductive.  Let's just use the normal function names.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: color by default

2019-12-31 Thread Juan José Santamaría Flecha
On Tue, Dec 31, 2019 at 11:40 AM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

>
> I'm aware of the pending patch to improve color support on Windows.
> I'll check that one out as well, but it appears to be orthogonal to this
> one.
>
>
Actually, I think it would be better to rebase that patch on top of this
one, as the Windows function enable_vt_mode() incorporates the logic of
both isatty() and terminal_supports_color() by enabling CMD's support of
VT100 escape codes.

Regards,

Juan José Santamaría Flecha


Re: backup manifests

2019-12-31 Thread Tels

Moin,

sorry for the very late reply. There was a discussion about the specific 
format of the backup manifests, and maybe this was already discussed and 
I just overlooked it:


1) Why invent your own format, and not just use a machine-readable 
format that already exists? It doesn't have to be full-blown XML, or 
even JSON; something simple such as YAML would already be better. That 
way not everyone has to write their own parser. Or maybe it is already 
YAML and just the specific keywords were under discussion?


2) It would be very wise to add a version number to the format. That 
will make extending it later much easier and avoids the "we need to 
add X, but that breaks compatibility with all software out there" 
situations that often arise a few years down the line.


Best regards,

and a happy New Year 2020

Tels




Re: pgbench - use pg logging capabilities

2019-12-31 Thread Michael Paquier



On December 31, 2019 8:10:13 PM GMT+09:00, Peter Eisentraut 
 wrote:
> seems counterproductive.  Let's just use the normal function names.

+1.
-- 
Michael




Re: color by default

2019-12-31 Thread Tom Lane
Peter Eisentraut  writes:
> With the attached patch, I propose to enable the colored output by 
> default in PG13.

FWIW, I shall be setting NO_COLOR permanently if this gets committed.
I wonder how many people there are who actually *like* colored output?
I find it to be invariably less readable than plain B&W text.

I may well be in the minority, but I think some kind of straw poll
might be advisable, rather than doing this just because.

regards, tom lane




Re: color by default

2019-12-31 Thread Abel Abraham Camarillo Ojeda
On Tue, Dec 31, 2019 at 7:35 AM Tom Lane  wrote:

> Peter Eisentraut  writes:
> > With the attached patch, I propose to enable the colored output by
> > default in PG13.
>
> FWIW, I shall be setting NO_COLOR permanently if this gets committed.
> I wonder how many people there are who actually *like* colored output?
> I find it to be invariably less readable than plain B&W text.
>
> I may well be in the minority, but I think some kind of straw poll
> might be advisable, rather than doing this just because.
>

+1


> regards, tom lane
>
>
>


Re: color by default

2019-12-31 Thread Daniel Gustafsson
> On 31 Dec 2019, at 14:35, Tom Lane  wrote:
> 
> Peter Eisentraut  writes:
>> With the attached patch, I propose to enable the colored output by 
>> default in PG13.
> 
> FWIW, I shall be setting NO_COLOR permanently if this gets committed.

Me too.

> I may well be in the minority, but I think some kind of straw poll
> might be advisable, rather than doing this just because.

+1

cheers ./daniel




Re: color by default

2019-12-31 Thread Andreas Joseph Krogh

On Tuesday, 31 December 2019 at 14:35:39, Tom Lane  wrote:
Peter Eisentraut  writes:
 > With the attached patch, I propose to enable the colored output by
 > default in PG13.

 FWIW, I shall be setting NO_COLOR permanently if this gets committed.
 I wonder how many people there are who actually *like* colored output?
 I find it to be invariably less readable than plain B&W text.

 I may well be in the minority, but I think some kind of straw poll
 might be advisable, rather than doing this just because. 


It's easier to spot errors/warnings when they are colored/emphasized, imo. 
Much like colored output from grep/diff; we humans have color vision for a reason.



--
 Andreas Joseph Krogh 


Re: color by default

2019-12-31 Thread Alvaro Herrera
On 2019-Dec-31, Andreas Joseph Krogh wrote:

> It's easier to spot errors/warnings when they are colored/emphasized,
> imo. Much like colored output from grep/diff; we humans have color
> vision for a reason.

I do use color output (and find it useful), for that reason.

I'm not sure that the documentation addition properly describes the
logic to be used; if it does, I'm not sure that the logic is really what
we want.  Is the logic in the docs supposed to be "last rule that
matches wins" or "first rule that matches wins"?  I think that should be
explicit.  Do we want to have NO_COLOR override the TERM heuristics?
(I'm pretty sure we do.)  OTOH we also want PG_COLOR to override
NO_COLOR.

Per https://no-colors.org (thanks for the link) it seems pretty clear
that people who don't want colors should already be setting NO_COLOR,
and everyone would be happy.  It's not just PG programs that are
colorizing stuff.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: color by default

2019-12-31 Thread Isaac Morland
On Tue, 31 Dec 2019 at 10:18, Alvaro Herrera 
wrote:

> Per https://no-colors.org (thanks for the link) it seems pretty clear

https://no-color.org


Decade indication

2019-12-31 Thread Bruce Momjian
Does the next decade start on 2020-01-01 or 2021-01-01?  Postgres says
it starts on the former date:

SELECT EXTRACT(DECADE FROM '2019-01-01'::date);
 date_part
---
   201

SELECT EXTRACT(DECADE FROM '2020-01-01'::date);
 date_part
---
   202

but the _century_ starts on 2001-01-01, not 2000-01-01:

SELECT EXTRACT(CENTURY FROM '2000-01-01'::date);
 date_part
---
20

SELECT EXTRACT(CENTURY FROM '2001-01-01'::date);
 date_part
---
21

That seems inconsistent to me.  /pgtop/src/backend/utils/adt/timestamp.c
has this C comment:

 * what is a decade wrt dates? let us assume that decade 199
 * is 1990 thru 1999... decade 0 starts on year 1 BC, and -1
 * is 11 BC thru 2 BC...

FYI, these two URLs suggest the inconsistency is OK:

https://www.timeanddate.com/calendar/decade.html
https://en.wikipedia.org/wiki/Decade
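For positive years, the behavior shown above boils down to simple integer arithmetic (a sketch; per the timestamp.c comment, BC years are handled differently and are not modeled here):

```python
def pg_decade(year: int) -> int:
    # Matches the timestamp.c comment: decade 199 is 1990 thru 1999,
    # i.e. years grouped by their shared leading digits.
    return year // 10


def pg_century(year: int) -> int:
    # Centuries have no year 0: years 1..100 are century 1,
    # so 1901..2000 is the 20th century.
    return (year - 1) // 100 + 1


# The inconsistency Bruce points out, reproduced:
assert pg_decade(2019) == 201 and pg_decade(2020) == 202
assert pg_century(2000) == 20 and pg_century(2001) == 21
```

So DECADE follows the "odometer" grouping while CENTURY follows the strict no-year-zero construction, which is exactly the asymmetry in question.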

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: color by default

2019-12-31 Thread Jose Luis Tallon

On 31/12/19 14:35, Tom Lane wrote:

Peter Eisentraut  writes:

With the attached patch, I propose to enable the colored output by
default in PG13.

FWIW, I shall be setting NO_COLOR permanently if this gets committed.
I wonder how many people there are who actually *like* colored output?
I find it to be invariably less readable than plain B&W text.

I may well be in the minority, but I think some kind of straw poll
might be advisable, rather than doing this just because.


+1

...and Happy New Year!


    / J.L.






Re: backup manifests

2019-12-31 Thread David Fetter
On Tue, Dec 31, 2019 at 01:30:01PM +0100, Tels wrote:
> Moin,
> 
> sorry for the very late reply. There was a discussion about the specific
> format of the backup manifests, and maybe that was already discussed and I
> just overlooked it:
> 
> 1) Why invent your own format, and not just use a machine-readable format
> that already exists? It doesn't have to be full blown XML, or even JSON,
> something simple as YAML would already be better. That way not everyone has
> to write their own parser. Or maybe it is already YAML and just the
> different keywords where under discussion?

YAML is extremely fragile and error-prone. It's also a superset of
JSON, so I don't understand what you mean by "as simple as."

-1 from me on YAML

That said, I agree that there's no reason to come up with a bespoke
format and parser when JSON is already available in every PostgreSQL
installation.  Imposing a structure atop that includes a version
number, as you suggest, seems pretty straightforward, and should be
done.

Would it make sense to include some kind of capability description in
the format along with the version number?
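A minimal sketch of a versioned JSON manifest along these lines (all field names, paths, and the checksum placeholder are hypothetical, since the actual format was still under discussion in this thread):

```python
import json

# Hypothetical manifest structure with an explicit version number,
# so readers can detect formats newer than they understand.
manifest = {
    "manifest-version": 1,
    "files": [
        {"path": "base/1/1259", "size": 8192, "checksum": "sha256:..."},
    ],
}

encoded = json.dumps(manifest, indent=2)   # written by the backup tool
decoded = json.loads(encoded)              # parsed by any consumer

# A reader can refuse manifests newer than it understands.
if decoded["manifest-version"] > 1:
    raise ValueError("unsupported manifest version")
```

Every PostgreSQL installation already ships a JSON parser, so consumers get this parsing for free.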

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: Decade indication

2019-12-31 Thread Glyn Astill
Funnily enough, I was having a conversation with my wife about exactly this 
as I opened your email.

If the Wikipedia article is to be trusted, the following seems fitting:

  SELECT EXTRACT(ORDINAL DECADE FROM '2020-01-01'::date);
    date_part
    ---
          201

And the default:

SELECT EXTRACT(CARDINAL DECADE FROM '2020-01-01'::date);
    date_part
    ---
          202

On Tuesday, 31 December 2019, 16:36:02 GMT, Bruce Momjian  wrote:
 Does the next decade start on 2020-01-01 or 2021-01-01?  Postgres says
it starts on the former date:

    SELECT EXTRACT(DECADE FROM '2019-01-01'::date);
    date_part
    ---
          201
    
    SELECT EXTRACT(DECADE FROM '2020-01-01'::date);
    date_part
    ---
          202

but the _century_ starts on 2001-01-01, not 2000-01-01:

    SELECT EXTRACT(CENTURY FROM '2000-01-01'::date);
    date_part
    ---
            20
    
    SELECT EXTRACT(CENTURY FROM '2001-01-01'::date);
    date_part
    ---
            21

That seems inconsistent to me.  /pgtop/src/backend/utils/adt/timestamp.c
has this C comment:

    * what is a decade wrt dates? let us assume that decade 199
    * is 1990 thru 1999... decade 0 starts on year 1 BC, and -1
    * is 11 BC thru 2 BC...

FYI, these two URLs suggest the inconsistency is OK:

    https://www.timeanddate.com/calendar/decade.html
    https://en.wikipedia.org/wiki/Decade

-- 
  Bruce Momjian          http://momjian.us
  EnterpriseDB                            http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


  

Re: Decade indication

2019-12-31 Thread Andrew Dunstan
On Wed, Jan 1, 2020 at 3:05 AM Bruce Momjian  wrote:
>
> Does the next decade start on 2020-01-01 or 2021-01-01?  Postgres says
> it starts on the former date:
>
> SELECT EXTRACT(DECADE FROM '2019-01-01'::date);
>  date_part
> ---
>201
>
> SELECT EXTRACT(DECADE FROM '2020-01-01'::date);
>  date_part
> ---
>202
>
> but the _century_ starts on 2001-01-01, not 2000-01-01:
>
> SELECT EXTRACT(CENTURY FROM '2000-01-01'::date);
>  date_part
> ---
> 20
>
> SELECT EXTRACT(CENTURY FROM '2001-01-01'::date);
>  date_part
> ---
> 21
>
> That seems inconsistent to me.  /pgtop/src/backend/utils/adt/timestamp.c
> has this C comment:
>
>  * what is a decade wrt dates? let us assume that decade 199
>  * is 1990 thru 1999... decade 0 starts on year 1 BC, and -1
>  * is 11 BC thru 2 BC...
>
> FYI, these two URLs suggest the inconsistency is OK:
>
> https://www.timeanddate.com/calendar/decade.html
> https://en.wikipedia.org/wiki/Decade
>


https://en.wikipedia.org/wiki/Century says:

"Although a century can mean any arbitrary period of 100 years, there
are two viewpoints on the nature of standard centuries. One is based
on strict construction, while the other is based on popular
perspective (general usage).

According to the strict construction of the Gregorian calendar, the
1st century AD began with 1 AD and ended with 100 AD, with the same
pattern continuing onward. In this model, the n-th century
started/will start on the year (100 × n) − 99 and ends in 100 × n.
Because of this, a century will only include one year, the centennial
year, that starts with the century's number (e.g. 1900 was the last
year of the 19th century).[2]

In general usage, centuries are aligned with decades by grouping years
based on their shared digits. In this model, the n-th century
started/will start on the year (100 × n) − 100 and ends in (100 × n) −
1. For example, the 20th century is generally regarded as from 1900 to
1999, inclusive. This is sometimes known as the odometer effect. The
astronomical year numbering and ISO 8601 systems both contain a year
zero, so the first century begins with the year zero, rather than the
year one."


If I had to choose I'd go with the "general usage" rule above, but I
don't think we should change behaviour now.


cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Decade indication

2019-12-31 Thread Tom Lane
Andrew Dunstan  writes:
> On Wed, Jan 1, 2020 at 3:05 AM Bruce Momjian  wrote:
>> Does the next decade start on 2020-01-01 or 2021-01-01?  Postgres says
>> it starts on the former date:
>> ...
>> That seems inconsistent to me.  /pgtop/src/backend/utils/adt/timestamp.c
>> has this C comment:
>> 
>> * what is a decade wrt dates? let us assume that decade 199
>> * is 1990 thru 1999... decade 0 starts on year 1 BC, and -1
>> * is 11 BC thru 2 BC...

> If I had to choose I'd go with the "general usage" rule above, but I
> don't think we should change behaviour now.

Well, yeah, that.  The quoted comment dates to commit 46be0c18f of
2004-08-20, and a bit of excavation shows that it was just explaining
behavior that existed before, clear back to when Lockhart installed
all this functionality in 2001.

It's pretty darn difficult to justify changing behavior that's stood
for 18+ years, especially when the argument that it's wrong is subject
to debate.  Either users think it's correct, or nobody uses this
function.  In either case, nobody will thank us for changing it.

It's possible that we could add an alternate keyword for a different
decade (and/or century) definition, but I'd want to see some actual
field demand for that first.

regards, tom lane




Re: backup manifests

2019-12-31 Thread David Steele

On 12/31/19 10:43 AM, David Fetter wrote:
> On Tue, Dec 31, 2019 at 01:30:01PM +0100, Tels wrote:
>> Moin,
>>
>> sorry for the very late reply. There was a discussion about the specific
>> format of the backup manifests, and maybe that was already discussed and I
>> just overlooked it:
>>
>> 1) Why invent your own format, and not just use a machine-readable format
>> that already exists? It doesn't have to be full blown XML, or even JSON,
>> something simple as YAML would already be better. That way not everyone has
>> to write their own parser. Or maybe it is already YAML and just the
>> different keywords were under discussion?
>
> YAML is extremely fragile and error-prone. It's also a superset of
> JSON, so I don't understand what you mean by "as simple as."
>
> -1 from me on YAML

-1 from me as well.  YAML is easy to write but definitely non-trivial to
read.

> That said, I agree that there's no reason to come up with a bespoke
> format and parser when JSON is already available in every PostgreSQL
> installation.  Imposing a structure atop it that includes a version
> number, as you suggest, seems pretty straightforward, and should be
> done.

+1.  I continue to support a format that would be easily readable
without writing a lot of code.
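For illustration only, a versioned JSON manifest along those lines might
look like this (all field names here are hypothetical, not a proposed
spec):

```json
{
  "manifest-version": 1,
  "files": [
    {
      "path": "base/1/1259",
      "size": 8192,
      "checksum-algorithm": "sha256",
      "checksum": "0123abcd"
    }
  ]
}
```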


--
-David
da...@pgmasters.net




TRUNCATE on foreign tables

2019-12-31 Thread Kohei KaiGai
Hello,

Right now we don't support TRUNCATE on foreign tables.
That seems like an odd missing piece and an unnecessary restriction.
For example, if a partitioned table contains some foreign tables among its
leaves, users cannot use the TRUNCATE command to clean up the partitioned table.

The API design is probably not complicated. We add a new callback for truncate
to the FdwRoutine, and ExecuteTruncateGuts() calls it if the relation is a
foreign table. In the case of postgres_fdw, it would also issue a "TRUNCATE"
command on the remote side within the transaction block [*1].

[*1] But I hope oracle_fdw does not follow this implementation as-is. :-)

What are your thoughts?

I noticed this restriction while working on an Arrow_Fdw enhancement for
"writable" capability. Because Apache Arrow [*2] is a columnar file format,
it is not designed for UPDATE/DELETE, but it is capable of bulk INSERT.
A straightforward idea is to support only INSERT, and to clear data with
TRUNCATE.

[*2] Apache Arrow - https://arrow.apache.org/docs/format/Columnar.html

Best regards,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 




pg_restore crash when there is a failure before all child processes are created

2019-12-31 Thread vignesh C
Hi,

I found a crash in pg_restore that occurs when there is a failure before
all the child workers are created. The backtrace is given below:
#0  0x7f9c6d31e337 in raise () from /lib64/libc.so.6
#1  0x7f9c6d31fa28 in abort () from /lib64/libc.so.6
#2  0x7f9c6d317156 in __assert_fail_base () from /lib64/libc.so.6
#3  0x7f9c6d317202 in __assert_fail () from /lib64/libc.so.6
#4  0x00407c9e in WaitForTerminatingWorkers (pstate=0x14af7f0) at
parallel.c:515
#5  0x00407bf9 in ShutdownWorkersHard (pstate=0x14af7f0) at
parallel.c:451
#6  0x00407ae9 in archive_close_connection (code=1, arg=0x6315a0
) at parallel.c:368
#7  0x0041a7c7 in exit_nicely (code=1) at pg_backup_utils.c:99
#8  0x00408180 in ParallelBackupStart (AH=0x14972e0) at
parallel.c:967
#9  0x0040a3dd in RestoreArchive (AHX=0x14972e0) at
pg_backup_archiver.c:661
#10 0x00404125 in main (argc=6, argv=0x7ffd5146f308) at
pg_restore.c:443

The problem is as follows:

   - The variable pstate->numWorkers is initially set to the requested
   number of workers in ParallelBackupStart.
   - Then the workers are created one by one.
   - A failure occurs before all the processes have been created.
   - The parent then terminates the child processes and waits for all of
   them to terminate.
   - WaitForTerminatingWorkers checks whether every process has terminated
   by calling HasEveryWorkerTerminated.
   - HasEveryWorkerTerminated always returns false because it checks
   against numWorkers rather than the actual number of forked processes,
   and so we hit the assertion "Assert(j < pstate->numWorkers);".


The attached patch fixes this by setting pstate->numWorkers to the actual
worker count as each child process is created.

Thoughts?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
From 5cded66879e74f6351b44856eef2d66fab172e95 Mon Sep 17 00:00:00 2001
From: Vignesh C
Date: Wed, 1 Jan 2020 08:48:44 +0530
Subject: [PATCH] pg_restore crash when there is a failure before all parallel
 workers are created.

A failure can occur before all the parallel workers are created, but
pstate->numWorkers is set to the requested number of workers even before
forking succeeds. In the error path, while terminating and waiting for all
the child processes, we would wait based on that initial value instead of
the number of children actually forked. Fix this by setting
pstate->numWorkers to the number of child processes actually forked.
---
 src/bin/pg_dump/parallel.c | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index 24239fa..88d8ee2 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -907,22 +907,22 @@ ParallelBackupStart(ArchiveHandle *AH)
 {
 	ParallelState *pstate;
 	int			i;
+	int numWorkers = AH->public.numWorkers;
 
-	Assert(AH->public.numWorkers > 0);
+	Assert(numWorkers > 0);
 
 	pstate = (ParallelState *) pg_malloc(sizeof(ParallelState));
 
-	pstate->numWorkers = AH->public.numWorkers;
+	pstate->numWorkers = (numWorkers == 1) ? numWorkers : 0;
 	pstate->te = NULL;
 	pstate->parallelSlot = NULL;
 
-	if (AH->public.numWorkers == 1)
+	if (numWorkers == 1)
 		return pstate;
 
-	pstate->te = (TocEntry **)
-		pg_malloc0(pstate->numWorkers * sizeof(TocEntry *));
+	pstate->te = (TocEntry **) pg_malloc0(numWorkers * sizeof(TocEntry *));
 	pstate->parallelSlot = (ParallelSlot *)
-		pg_malloc0(pstate->numWorkers * sizeof(ParallelSlot));
+		pg_malloc0(numWorkers * sizeof(ParallelSlot));
 
 #ifdef WIN32
 	/* Make fmtId() and fmtQualifiedId() use thread-local storage */
@@ -950,7 +950,7 @@ ParallelBackupStart(ArchiveHandle *AH)
 	fflush(NULL);
 
 	/* Create desired number of workers */
-	for (i = 0; i < pstate->numWorkers; i++)
+	for (i = 0; i < numWorkers; i++)
 	{
 #ifdef WIN32
 		WorkerInfo *wi;
@@ -980,6 +980,11 @@ ParallelBackupStart(ArchiveHandle *AH)
 		slot->pipeRevRead = pipeMW[PIPE_READ];
 		slot->pipeRevWrite = pipeWM[PIPE_WRITE];
 
+		/*
+		 * The worker count must be incremented before forking, as workers
+		 * use numWorkers to iterate over and identify their slot.
+		 */
+		pstate->numWorkers++;
 #ifdef WIN32
 		/* Create transient structure to pass args to worker function */
 		wi = (WorkerInfo *) pg_malloc(sizeof(WorkerInfo));
@@ -1026,6 +1031,8 @@ ParallelBackupStart(ArchiveHandle *AH)
 		}
 		else if (pid < 0)
 		{
+			pstate->numWorkers--;
+
 			/* fork failed */
 			fatal("could not create worker process: %m");
 		}
-- 
1.8.3.1



Re: TRUNCATE on foreign tables

2019-12-31 Thread Kohei KaiGai
Hello,

The attached patch adds TRUNCATE support for foreign tables.

This patch adds an optional callback ExecForeignTruncate(Relation rel)
to FdwRoutine.
It is invoked during ExecuteTruncateGuts(), and the FDW driver then takes
over the work needed to complete the truncation of the foreign table.
Of course, the concept of "truncate" is not well defined for some FDW
drivers. In that case, the TRUNCATE command refuses to operate on those
foreign tables.

2019 is not finished everywhere on earth yet, so I believe it is OK to
add this patch to CF-2020:Jan.

Best regards,

2020年1月1日(水) 11:46 Kohei KaiGai :
>
> Hello,
>
> Right now we don't support TRUNCATE on foreign tables.
> That seems like an odd missing piece and an unnecessary restriction.
> For example, if a partitioned table contains some foreign tables among its
> leaves, users cannot use the TRUNCATE command to clean up the partitioned table.
>
> The API design is probably not complicated. We add a new callback for truncate
> to the FdwRoutine, and ExecuteTruncateGuts() calls it if the relation is a
> foreign table. In the case of postgres_fdw, it would also issue a "TRUNCATE"
> command on the remote side within the transaction block [*1].
>
> [*1] But I hope oracle_fdw does not follow this implementation as-is. :-)
>
> What are your thoughts?
>
> I noticed this restriction while working on an Arrow_Fdw enhancement for
> "writable" capability. Because Apache Arrow [*2] is a columnar file format,
> it is not designed for UPDATE/DELETE, but it is capable of bulk INSERT.
> A straightforward idea is to support only INSERT, and to clear data with
> TRUNCATE.
>
> Best regards,
> --
> HeteroDB, Inc / The PG-Strom Project
> KaiGai Kohei 



-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 


pgsql13-truncate-on-foreign-table.v1.patch
Description: Binary data