Re: Don't wake up to check trigger file if none is configured

2019-01-30 Thread Michael Paquier
On Wed, Dec 26, 2018 at 03:24:01PM +0900, Michael Paquier wrote:
> It seems to me that the comment on top of WaitLatch should be clearer
> about that, and that the current state leads to confusion.  Another
> thing coming to my mind is that I think it would be useful to make the
> timeout configurable so as instances can react more quickly in the
> case of a sudden death of the WAL receiver (or to check faster for a
> trigger file if the HA application is too lazy to send a signal to the
> standby host).
> 
> Attached is a patch to improve the comment for now.
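
(To illustrate the mechanism being discussed: the startup process sleeps on
its latch with a hard-coded timeout so that it can periodically re-check for
a trigger file.  The following is a rough, simplified sketch, not the actual
xlog.c code; names are simplified and the wait-event argument is left at 0.)

    for (;;)
    {
        if (CheckForStandbyTrigger())
            break;
        /* wake up at the fixed interval whose value is being discussed */
        (void) WaitLatch(MyLatch, WL_LATCH_SET | WL_TIMEOUT, 5000L, 0);
        ResetLatch(MyLatch);
    }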

So, does somebody have an objection if I apply the comment patch?  Per
the reasons above, the proposed patch is not correct, but the code can
be more descriptive.
--
Michael




Re: Cache lookup errors with functions manipulation object addresses

2019-01-30 Thread Michael Paquier
On Tue, Dec 25, 2018 at 01:34:38PM +0900, Michael Paquier wrote:
> With that all the regression tests show a consistent behavior for all
> object types when those are undefined.  Thoughts?

End-of-commit-fest rebase and moved to next CF.
--
Michael
From 6514d737cef100ea8146b1d6d10fa93e40c6ee9c Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 25 Dec 2018 12:21:24 +0900
Subject: [PATCH 1/3] Add flag to format_type_extended to enforce NULL-ness

If a scanned type is undefined, the type format routines have two behaviors
depending on whether FORMAT_TYPE_ALLOW_INVALID is passed by the caller:
- Generate an error
- Return an undefined type name, "???" or "-".

The current interface is unhelpful for callers that want to format a type
name properly but still make sure that the type is defined, as there could
be actual types matching the undefined strings.  To address that, add a new
flag called FORMAT_TYPE_FORCE_NULL which returns a NULL result instead of
"???" or "-", without generating an error.  This will be used by future
patches to improve the SQL interface for object addresses.
---
 src/backend/utils/adt/format_type.c | 20 
 src/include/utils/builtins.h|  1 +
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/format_type.c b/src/backend/utils/adt/format_type.c
index 84a1b3c72b..6fc8f55668 100644
--- a/src/backend/utils/adt/format_type.c
+++ b/src/backend/utils/adt/format_type.c
@@ -98,6 +98,9 @@ format_type(PG_FUNCTION_ARGS)
  * - FORMAT_TYPE_ALLOW_INVALID
  *			if the type OID is invalid or unknown, return ??? or such instead
  *			of failing
+ * - FORMAT_TYPE_FORCE_NULL
+ *			if the type OID is invalid or unknown, return NULL instead of ???
+ *			or such
  * - FORMAT_TYPE_FORCE_QUALIFY
  *			always schema-qualify type names, regardless of search_path
  *
@@ -116,13 +119,20 @@ format_type_extended(Oid type_oid, int32 typemod, bits16 flags)
 	char	   *buf;
 	bool		with_typemod;
 
-	if (type_oid == InvalidOid && (flags & FORMAT_TYPE_ALLOW_INVALID) != 0)
-		return pstrdup("-");
+	if (type_oid == InvalidOid)
+	{
+		if ((flags & FORMAT_TYPE_FORCE_NULL) != 0)
+			return NULL;
+		else if ((flags & FORMAT_TYPE_ALLOW_INVALID) != 0)
+			return pstrdup("-");
+	}
 
 	tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(type_oid));
 	if (!HeapTupleIsValid(tuple))
 	{
-		if ((flags & FORMAT_TYPE_ALLOW_INVALID) != 0)
+		if ((flags & FORMAT_TYPE_FORCE_NULL) != 0)
+			return NULL;
+		else if ((flags & FORMAT_TYPE_ALLOW_INVALID) != 0)
 			return pstrdup("???");
 		else
 			elog(ERROR, "cache lookup failed for type %u", type_oid);
@@ -145,7 +155,9 @@ format_type_extended(Oid type_oid, int32 typemod, bits16 flags)
 		tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(array_base_type));
 		if (!HeapTupleIsValid(tuple))
 		{
-			if ((flags & FORMAT_TYPE_ALLOW_INVALID) != 0)
+			if ((flags & FORMAT_TYPE_FORCE_NULL) != 0)
+return NULL;
+			else if ((flags & FORMAT_TYPE_ALLOW_INVALID) != 0)
 return pstrdup("???[]");
 			else
 elog(ERROR, "cache lookup failed for type %u", type_oid);
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 03d0ee2766..43e7ef471c 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -109,6 +109,7 @@ extern Datum numeric_float8_no_overflow(PG_FUNCTION_ARGS);
 #define FORMAT_TYPE_TYPEMOD_GIVEN	0x01	/* typemod defined by caller */
 #define FORMAT_TYPE_ALLOW_INVALID	0x02	/* allow invalid types */
 #define FORMAT_TYPE_FORCE_QUALIFY	0x04	/* force qualification of type */
+#define FORMAT_TYPE_FORCE_NULL		0x08	/* force NULL if undefined */
 extern char *format_type_extended(Oid type_oid, int32 typemod, bits16 flags);
 
 extern char *format_type_be(Oid type_oid);
-- 
2.20.1
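
To illustrate the intended caller behavior described in the commit message
above, here is a minimal sketch (hypothetical caller code, not part of the
patch; the values/nulls arrays are just placeholders):

    char   *typname;

    /* With FORMAT_TYPE_FORCE_NULL, an undefined type yields NULL, not an error */
    typname = format_type_extended(type_oid, -1, FORMAT_TYPE_FORCE_NULL);
    if (typname == NULL)
        nulls[0] = true;        /* expose the undefined type as SQL NULL */
    else
        values[0] = CStringGetTextDatum(typname);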

From 3eaf6833c367549b9309b9df7ef7d56f8e596cc3 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 25 Dec 2018 13:13:28 +0900
Subject: [PATCH 2/3] Refactor format procedure and operator APIs to be more
 modular

This introduces a new set of extended routines for procedure and operator
lookups, with a flags bitmask argument that can modify the default
behavior:
- Force schema qualification
- Force NULL as result instead of a numeric OID for an undefined
object.
---
 src/backend/utils/adt/regproc.c | 61 +++--
 src/include/utils/regproc.h | 10 ++
 2 files changed, 52 insertions(+), 19 deletions(-)

diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index 09cf0e1abc..f85b47821a 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@ -40,8 +40,6 @@
 #include "utils/regproc.h"
 #include "utils/varlena.h"
 
-static char *format_operator_internal(Oid operator_oid, bool force_qualify);
-static char *format_procedure_internal(Oid procedure_oid, bool force_qualify);
 static void parseNameAndArgTypes(const char *string, bool allowNone,
 	 List **names, int *nargs, Oid *argtypes);
 
@@ -322,24 +320,27 @@ 

Re: Add pg_partition_root to get top-most parent of a partition tree

2019-01-30 Thread Michael Paquier
On Sat, Dec 15, 2018 at 10:16:08AM +0900, Michael Paquier wrote:
> Changed in this sense.  Please find attached an updated patch.

Rebased as per the attached, and moved to next CF.
--
Michael
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4930ec17f6..86ff4e5c9e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -20274,6 +20274,17 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 their partitions, and so on.

   
+  
+   
+pg_partition_root
+pg_partition_root(regclass)
+   
+   regclass
+   
+Return the top-most parent of a partition tree to which the given
+relation belongs.
+   
+  
  
 

diff --git a/src/backend/utils/adt/partitionfuncs.c b/src/backend/utils/adt/partitionfuncs.c
index 5cdf4a4524..6d731493d9 100644
--- a/src/backend/utils/adt/partitionfuncs.c
+++ b/src/backend/utils/adt/partitionfuncs.c
@@ -25,6 +25,34 @@
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 
+/*
+ * Perform several checks on a relation on which is extracted some
+ * information related to its partition tree.  Returns false if the
+ * relation cannot be processed, in which case it is up to the caller
+ * to decide what to do, by either raising an error or doing something
+ * else.
+ */
+static bool
+check_rel_can_be_partition(Oid relid)
+{
+	char		relkind;
+
+	/* Check if relation exists */
+	if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
+		return false;
+
+	relkind = get_rel_relkind(relid);
+
+	/* Only allow relation types that can appear in partition trees. */
+	if (relkind != RELKIND_RELATION &&
+		relkind != RELKIND_FOREIGN_TABLE &&
+		relkind != RELKIND_INDEX &&
+		relkind != RELKIND_PARTITIONED_TABLE &&
+		relkind != RELKIND_PARTITIONED_INDEX)
+		return false;
+
+	return true;
+}
 
 /*
  * pg_partition_tree
@@ -39,19 +67,10 @@ pg_partition_tree(PG_FUNCTION_ARGS)
 {
 #define PG_PARTITION_TREE_COLS	4
 	Oid			rootrelid = PG_GETARG_OID(0);
-	char		relkind = get_rel_relkind(rootrelid);
 	FuncCallContext *funcctx;
 	ListCell  **next;
 
-	if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(rootrelid)))
-		PG_RETURN_NULL();
-
-	/* Return NULL for relation types that cannot appear in partition trees */
-	if (relkind != RELKIND_RELATION &&
-		relkind != RELKIND_FOREIGN_TABLE &&
-		relkind != RELKIND_INDEX &&
-		relkind != RELKIND_PARTITIONED_TABLE &&
-		relkind != RELKIND_PARTITIONED_INDEX)
+	if (!check_rel_can_be_partition(rootrelid))
 		PG_RETURN_NULL();
 
 	/* stuff done only on the first call of the function */
@@ -153,3 +172,39 @@ pg_partition_tree(PG_FUNCTION_ARGS)
 	/* done when there are no more elements left */
 	SRF_RETURN_DONE(funcctx);
 }
+
+/*
+ * pg_partition_root
+ *
+ * For the given relation part of a partition tree, return its top-most
+ * root parent.
+ */
+Datum
+pg_partition_root(PG_FUNCTION_ARGS)
+{
+	Oid		relid = PG_GETARG_OID(0);
+	Oid		rootrelid;
+	List   *ancestors;
+
+	if (!check_rel_can_be_partition(relid))
+		PG_RETURN_NULL();
+
+	/*
+	 * If the relation is not a partition (it may be the partition parent),
+	 * return itself as a result.
+	 */
+	if (!get_rel_relispartition(relid))
+		PG_RETURN_OID(relid);
+
+	/* Fetch the top-most parent */
+	ancestors = get_partition_ancestors(relid);
+	rootrelid = llast_oid(ancestors);
+	list_free(ancestors);
+
+	/*
+	 * "rootrelid" must contain a valid OID, given that the input relation
+	 * is a valid partition tree member as checked above.
+	 */
+	Assert(OidIsValid(rootrelid));
+	PG_RETURN_OID(rootrelid);
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 3ecc2e12c3..b1efd9c49c 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -10509,4 +10509,9 @@
   proargnames => '{rootrelid,relid,parentrelid,isleaf,level}',
   prosrc => 'pg_partition_tree' },
 
+# function to get the top-most partition root parent
+{ oid => '3424', descr => 'get top-most partition root parent',
+  proname => 'pg_partition_root', prorettype => 'regclass',
+  proargtypes => 'regclass', prosrc => 'pg_partition_root' },
+
 ]
diff --git a/src/test/regress/expected/partition_info.out b/src/test/regress/expected/partition_info.out
index 3e15e02f8d..a884df976f 100644
--- a/src/test/regress/expected/partition_info.out
+++ b/src/test/regress/expected/partition_info.out
@@ -12,6 +12,18 @@ SELECT * FROM pg_partition_tree(0);
| ||  
 (1 row)
 
+SELECT pg_partition_root(NULL);
+ pg_partition_root 
+---
+ 
+(1 row)
+
+SELECT pg_partition_root(0);
+ pg_partition_root 
+---
+ 
+(1 row)
+
 -- Test table partition trees
 CREATE TABLE ptif_test (a int, b int) PARTITION BY range (a);
 CREATE TABLE ptif_test0 PARTITION OF ptif_test
@@ -66,6 +78,20 @@ SELECT relid, parentrelid, level, isleaf
  ptif_test01 | ptif_test0  | 0 | t
 (1 row)
 
+-- List all members using pg_partition_root with leaf table 

Re: Adding a TAP test checking data consistency on standby with minRecoveryPoint

2019-01-30 Thread Michael Paquier
On Mon, Nov 12, 2018 at 01:08:38PM +0900, Michael Paquier wrote:
> Thanks!  I am stealing that stuff, and I have added an offline check by
> comparing the value of minRecoveryPoint in pg_controldata.  Again, if
> you revert c186ba13 and run the tests, both online and offline failures
> are showing up.
> 
> What do you think?

I think that this test has some value, and it did not get any reviews,
so I am moving it to the next CF.
--
Michael




Re: Unused parameters & co in code

2019-01-30 Thread Michael Paquier
On Wed, Jan 30, 2019 at 08:14:05AM -0300, Alvaro Herrera wrote:
> On 2019-Jan-30, Michael Paquier wrote:
>> ReindexPartitionedIndex => parentIdx
> 
> I'd not change this one, as we will surely want to do something useful
> eventually.  Not that it matters, but it'd be useless churn.

A lot of these things assume that the unused arguments may be useful
in the future.

>> _bt_relbuf => rel (Quite some cleanup here for the btree code)
> 
> If this is a worthwhile cleanup, let's see a patch.

This cleanup is disappointing, so it can be discarded.

I looked at all the references I have spotted yesterday, and most of
them are not really worth the trouble, still there are some which
could be cleaned up.  Attached is a set of patches which do some
cleanup here and there, I don't propose all of them for commit, some
of them are attached just for reference:
- 0001 cleans up port in SendAuthRequest.  This one is disappointing,
so I am fine to discard it.
- 0002 works on _bt_relbuf, whose callers don't actually benefit from
the cleanup as the relation worked on is always used for different
reasons, so it can be discarded.
- 0003 works on the code of GIN, which simplifies at least the code,
so it could be applied.  This removes more than changed.
- 0004 also cleans some code for clause parsing, with a negative line
output.
- 0005 is for pg_rewind, which is some stuff I introduced, so I'd like
to clean up my mess and the change is recent :)
- 0006 is for tablecmds.c, and something I would like to apply because
it reduces some confusion with some recursion arguments which are not
needed for constraint handling and inheritance.  Most of the complains
come from lockmode not being used but all the AtPrep and AtExec
routines are rather symmetric so I am not bothering about that.
--
Michael
From 2b86907b218798545e17e5f6cbc985f8368387b5 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 31 Jan 2019 14:20:33 +0900
Subject: [PATCH 1/6] Simplify SendAuthRequest

---
 src/backend/libpq/auth.c | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 20fe98fb82..f92f78e17a 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -43,8 +43,8 @@
  * Global authentication functions
  *
  */
-static void sendAuthRequest(Port *port, AuthRequest areq, const char *extradata,
-int extralen);
+static void sendAuthRequest(AuthRequest areq, const char *extradata,
+			int extralen);
 static void auth_failed(Port *port, int status, char *logdetail);
 static char *recv_password_packet(Port *port);
 
@@ -523,7 +523,7 @@ ClientAuthentication(Port *port)
 
 		case uaGSS:
 #ifdef ENABLE_GSS
-			sendAuthRequest(port, AUTH_REQ_GSS, NULL, 0);
+			sendAuthRequest(AUTH_REQ_GSS, NULL, 0);
 			status = pg_GSS_recvauth(port);
 #else
 			Assert(false);
@@ -532,7 +532,7 @@ ClientAuthentication(Port *port)
 
 		case uaSSPI:
 #ifdef ENABLE_SSPI
-			sendAuthRequest(port, AUTH_REQ_SSPI, NULL, 0);
+			sendAuthRequest(AUTH_REQ_SSPI, NULL, 0);
 			status = pg_SSPI_recvauth(port);
 #else
 			Assert(false);
@@ -603,7 +603,7 @@ ClientAuthentication(Port *port)
 		(*ClientAuthentication_hook) (port, status);
 
 	if (status == STATUS_OK)
-		sendAuthRequest(port, AUTH_REQ_OK, NULL, 0);
+		sendAuthRequest(AUTH_REQ_OK, NULL, 0);
 	else
 		auth_failed(port, status, logdetail);
 }
@@ -613,7 +613,7 @@ ClientAuthentication(Port *port)
  * Send an authentication request packet to the frontend.
  */
 static void
-sendAuthRequest(Port *port, AuthRequest areq, const char *extradata, int extralen)
+sendAuthRequest(AuthRequest areq, const char *extradata, int extralen)
 {
 	StringInfoData buf;
 
@@ -740,7 +740,7 @@ CheckPasswordAuth(Port *port, char **logdetail)
 	int			result;
 	char	   *shadow_pass;
 
-	sendAuthRequest(port, AUTH_REQ_PASSWORD, NULL, 0);
+	sendAuthRequest(AUTH_REQ_PASSWORD, NULL, 0);
 
 	passwd = recv_password_packet(port);
 	if (passwd == NULL)
@@ -841,7 +841,7 @@ CheckMD5Auth(Port *port, char *shadow_pass, char **logdetail)
 		return STATUS_ERROR;
 	}
 
-	sendAuthRequest(port, AUTH_REQ_MD5, md5Salt, 4);
+	sendAuthRequest(AUTH_REQ_MD5, md5Salt, 4);
 
 	passwd = recv_password_packet(port);
 	if (passwd == NULL)
@@ -895,7 +895,7 @@ CheckSCRAMAuth(Port *port, char *shadow_pass, char **logdetail)
 	/* Put another '\0' to mark that list is finished. */
 	appendStringInfoChar(_mechs, '\0');
 
-	sendAuthRequest(port, AUTH_REQ_SASL, sasl_mechs.data, sasl_mechs.len);
+	sendAuthRequest(AUTH_REQ_SASL, sasl_mechs.data, sasl_mechs.len);
 	pfree(sasl_mechs.data);
 
 	/*
@@ -1000,9 +1000,9 @@ CheckSCRAMAuth(Port *port, char *shadow_pass, char **logdetail)
 			elog(DEBUG4, "sending SASL challenge of length %u", outputlen);
 
 			if (result == SASL_EXCHANGE_SUCCESS)
-sendAuthRequest(port, AUTH_REQ_SASL_FIN, output, outputlen);
+sendAuthRequest(AUTH_REQ_SASL_FIN, output, 

Re: Index Skip Scan

2019-01-30 Thread Kyotaro HORIGUCHI
Hello.

At Wed, 30 Jan 2019 18:19:05 +0100, Dmitry Dolgov <9erthali...@gmail.com> wrote 
in 
> A bit of adjustment after nodes/relation -> nodes/pathnodes.

I had a look on this.

The name "index skip scan" is a different feature from the
feature with the name on other prodcuts, which means "index scan
with postfix key (of mainly of multi column key) that scans
ignoring the prefixing part" As Thomas suggested I'd suggest that
we call it "index hop scan". (I can accept Hopscotch, either:p)

Also, as mentioned upthread by Peter Geoghegan, this could easily
produce a worse plan due to underestimation.  So I also suggest
that this have a dynamic fallback mechanism.  From that perspective
it is not suitable as an AM API level feature.

If all leaf pages are in shared buffers and the average hopping
distance is less than about 4 pages (roughly the height of the
tree), the skip scan will lose.  If almost all leaf pages are
still on disk, we could win even with a 2-page step (skipping
over just one page).

=
As I was writing the above, I came to think that it would be
better to implement this as a pure executor optimization.

Specifically, let _bt_steppage count the ratio of skipped pages so
far; if the ratio exceeds some threshold (maybe around 3/4), it
enters hopscotching mode, where it uses an index descent to find
the next page (rather than traversing right links).  As mentioned
above, I think using a skip only when it reaches beyond the next
page is a good bet.  If the success ratio of skip scans drops
below some threshold (I'm not sure what yet), we should fall back
to traversing.
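
A very rough sketch of that idea (all field and threshold names here are
hypothetical; none of this exists in the btree code today):

    /* imaginary counters maintained while the scan runs */
    if (so->pages_seen > 0)
    {
        double  skip_ratio = (double) so->pages_skippable / so->pages_seen;

        if (!so->hopscotching && skip_ratio > 0.75)
            so->hopscotching = true;    /* re-descend from the root to hop */
        else if (so->hopscotching && skip_ratio < LOWER_THRESHOLD)
            so->hopscotching = false;   /* back to right-link traversal */
    }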

Any opinions?



Some comments on the patch below.

+ skip scan approach, which is based on the idea of
+ https://wiki.postgresql.org/wiki/Free_Space_Map_Problems;>
+ Loose index scan. Rather than scanning all equal values of a key,
+ as soon as a new value is found, it will search for a larger value on the

I'm not sure it is a good idea to put a pointer to such a rather
unstable source in the documentation.


This adds a new AM method, but it seems available only for ordered
indexes, specifically btree.  And it seems that the feature could be
implemented in btgettuple, since btskip apparently does the same
thing.  (I agree with Robert's point in [1].)

[1] 
https://www.postgresql.org/message-id/CA%2BTgmobb3uN0xDqTRu7f7WdjGRAXpSFxeAQnvNr%3DOK5_kC_SSg%40mail.gmail.com


Related to the above, it seems better for the path generation of
skip scans to be part of index scan path generation.  Whether to
skip or not is a matter for the index scan itself.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Too rigorous assert in reorderbuffer.c

2019-01-30 Thread Arseny Sher
Hi,

My colleague Alexander Lakhin has noticed an assertion failure in
reorderbuffer.c:1330. Here is a simple snippet reproducing it:

SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 
'test_decoding');

create table t(k int);
begin;
savepoint a;
alter table t alter column k type text;
rollback to savepoint a;
alter table t alter column k type bigint;
commit;

SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 
'include-xids', '0', 'skip-empty-xacts', '1');

It is indeed too rigorous, since the cmax of a tuple is not stable; it can
be rewritten if a subxact that tried to delete it later aborts (the same
holds for xmax).  The attached patch removes the assertion.  While here, I
also thought it worthwhile to add a test involving DDL in an aborted
subxact, as that case is interesting anyway and wasn't covered before.

From c5cd30f9e23c96390cafad82f832c918dfd3397f Mon Sep 17 00:00:00 2001
From: Arseny Sher 
Date: Wed, 30 Jan 2019 23:31:47 +0300
Subject: [PATCH] Remove assertion in reorderbuffer that cmax is stable.

cmax can be rewritten an arbitrary number of times if subxact(s) that tried
to delete some tuple version abort later.  Also add a small test involving
DDL in an aborted subxact, as this case is interesting on its own and wasn't
covered before.
---
 contrib/test_decoding/expected/ddl.out  | 20 
 contrib/test_decoding/sql/ddl.sql   | 14 ++
 src/backend/replication/logical/reorderbuffer.c | 10 ++
 3 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/contrib/test_decoding/expected/ddl.out b/contrib/test_decoding/expected/ddl.out
index b7c76469fc..69dd74676c 100644
--- a/contrib/test_decoding/expected/ddl.out
+++ b/contrib/test_decoding/expected/ddl.out
@@ -409,6 +409,26 @@ SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'inc
  COMMIT
 (6 rows)
 
+-- check that DDL in aborted subtransactions handled correctly
+CREATE TABLE tr_sub_ddl(data int);
+BEGIN;
+INSERT INTO tr_sub_ddl VALUES (42);
+SAVEPOINT a;
+ALTER TABLE tr_sub_ddl ALTER COLUMN data TYPE text;
+INSERT INTO tr_sub_ddl VALUES ('blah-blah');
+ROLLBACK TO SAVEPOINT a;
+ALTER TABLE tr_sub_ddl ALTER COLUMN data TYPE bigint;
+INSERT INTO tr_sub_ddl VALUES(43);
+COMMIT;
+SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
+   data
+---
+ BEGIN
+ table public.tr_sub_ddl: INSERT: data[integer]:42
+ table public.tr_sub_ddl: INSERT: data[bigint]:43
+ COMMIT
+(4 rows)
+
 /*
  * Check whether treating a table as a catalog table works somewhat
  */
diff --git a/contrib/test_decoding/sql/ddl.sql b/contrib/test_decoding/sql/ddl.sql
index c4b10a4cf9..f47b4ad716 100644
--- a/contrib/test_decoding/sql/ddl.sql
+++ b/contrib/test_decoding/sql/ddl.sql
@@ -236,6 +236,20 @@ COMMIT;
 
 SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
 
+-- check that DDL in aborted subtransactions handled correctly
+CREATE TABLE tr_sub_ddl(data int);
+BEGIN;
+INSERT INTO tr_sub_ddl VALUES (42);
+SAVEPOINT a;
+ALTER TABLE tr_sub_ddl ALTER COLUMN data TYPE text;
+INSERT INTO tr_sub_ddl VALUES ('blah-blah');
+ROLLBACK TO SAVEPOINT a;
+ALTER TABLE tr_sub_ddl ALTER COLUMN data TYPE bigint;
+INSERT INTO tr_sub_ddl VALUES(43);
+COMMIT;
+
+SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
+
 
 /*
  * Check whether treating a table as a catalog table works somewhat
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index a49e226967..0ace0cfb87 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -1327,13 +1327,15 @@ ReorderBufferBuildTupleCidHash(ReorderBuffer *rb, ReorderBufferTXN *txn)
 		else
 		{
 			Assert(ent->cmin == change->data.tuplecid.cmin);
-			Assert(ent->cmax == InvalidCommandId ||
-   ent->cmax == change->data.tuplecid.cmax);
 
 			/*
 			 * if the tuple got valid in this transaction and now got deleted
-			 * we already have a valid cmin stored. The cmax will be
-			 * InvalidCommandId though.
+			 * we already have a valid cmin stored. The cmax is still
+			 * InvalidCommandId though, so stamp it. Moreover, cmax might
+			 * be rewritten arbitrary number of times if subxacts who tried to
+			 * delete this version abort later. Pick up the latest since it is
+			 * (belonging to potentially committed subxact) the only one which
+			 * matters.
 			 */
 			ent->cmax = change->data.tuplecid.cmax;
 		}
-- 
2.11.0


--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: WIP: Avoid creation of the free space map for small tables

2019-01-30 Thread Amit Kapila
On Wed, Jan 30, 2019 at 10:41 PM Masahiko Sawada  wrote:
>
> On Wed, Jan 30, 2019 at 4:33 AM Amit Kapila  wrote:
> >
> > On Tue, Jan 29, 2019 at 8:12 PM John Naylor  
> > wrote:
> > >
> > > On Tue, Jan 29, 2019 at 11:56 AM Amit Kapila  
> > > wrote:
> > >
> > > > You can find this change in attached patch.  Then, I ran the make
> > > > check in src/bin/pgbench multiple times using test_conc_insert.sh.
> > > > You can vary the number of times the test should run, if you are not
> > > > able to reproduce it with this.
> > > >
> > > > The attached patch (clear_local_map_if_exists_1.patch) atop the main
> > > > patch fixes the issue for me.  Kindly verify the same.
> > >
> > > I got one failure in 50 runs. With the new patch, I didn't get any
> > > failures in 300 runs.
> > >
> >
> > Thanks for verification.  I have included it in the attached patch and
> > I have also modified the page.sql test to have enough number of pages
> > in relation so that FSM will get created irrespective of alignment
> > boundaries.  Masahiko San, can you verify if this now works for you?
> >
>
> Thank you for updating the patch!
>
> The modified page.sql test could fail if the block size is more than
> 8kB?

That's right, but I don't think the current regression tests work for
block sizes greater than 8KB anyway.  I have tried with 16 and 32 as the
block size, and there were a few failures on HEAD itself.

> We can ensure the number of pages is more than 4 by checking it
> and adding more data if there is not enough, but I'm really not sure
> we should care about the bigger-block-size cases.
>

Yeah, I am not sure either.  I think as this is an existing test, we
should not try to change it too much.  However, if both you and John
feel it is better to change, we can go with that.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: WIP: Avoid creation of the free space map for small tables

2019-01-30 Thread Amit Kapila
On Wed, Jan 30, 2019 at 8:11 PM John Naylor  wrote:
>
> On Wed, Jan 30, 2019 at 2:11 PM Amit Kapila  wrote:
> > This is much better than the earlier version of test and there is no
> > dependency on the vacuum.  However, I feel still there is some
> > dependency on how the rows will fit in a page and we have seen some
> > related failures due to alignment stuff.  By looking at the test, I
> > can't envision any such problem, but how about if we just write some
> > simple tests where we can check that the FSM won't be created for very
> > small number of records say one or two and then when we increase the
> > records FSM gets created, here if we want, we can even use vacuum to
> > ensure FSM gets created.  Once we are sure that the main patch passes
> > all the buildfarm tests, we can extend the test to something advanced
> > as you are proposing now.  I think that will reduce the chances of
> > failure, what do you think?
>
> That's probably a good idea to limit risk. I have just very basic tests
> now, and vacuum before every relation size check to make sure any FSM
> extension (whether desired or not) is invoked. Also, in my last patch
> I forgot to implement explicit checks of the block number instead of
> assuming how many rows will fit on a page. I've used a plpgsql code
> block to do this.
>

-- Extend table with enough blocks to exceed the FSM threshold
 -- FSM is created and extended to 3 blocks

The second comment line seems redundant to me, so I have removed that
and integrated it in the main patch.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


v20-0001-Avoid-creation-of-the-free-space-map-for-small-heap-.patch
Description: Binary data


Re: "could not reattach to shared memory" on buildfarm member dory

2019-01-30 Thread Noah Misch
On Tue, Jan 29, 2019 at 11:28:56AM -0500, Heath Lord wrote:
> On Thu, Jan 17, 2019 at 3:27 AM Noah Misch  wrote:
> > On Sun, Dec 02, 2018 at 09:35:06PM -0800, Noah Misch wrote:
> > > Could one of you having a dory login use
> > > https://live.sysinternals.com/Procmon.exe to capture process events
> > during
> > > backend startup?
> > > The ideal would be one capture where startup failed reattach
> > > and another where it succeeded, but having the successful run alone
> > would be a
> > > good start.  The procedure is roughly this:
> > >
> > > - Install PostgreSQL w/ debug symbols.
> > > - Start a postmaster.
> > > - procmon /nomonitor
> > > - procmon "Filter" menu -> Enable Advanced Output
> > > - Ctrl-l, add filter for "Process Name" is "postgres.exe"
> > > - Ctrl-e (starts collecting data)
> > > - psql (leave it running)
> > > - After ~60s, Ctrl-e again in procmon (stops collecting data)
> > > - File -> Save -> PML
> > > - File -> Save -> XML, include stack traces, resolve stack symbols
> > > - Compress the PML and XML files, and mail them here
> 
>    I apologize for not responding earlier, but we are attempting to capture
> the information you requested.  However, where would you like us to pull the
> install for PostgreSQL with the debug symbols in it?  We are aware of the
> BigSQL and EDB installers, but are unsure whether these contain the debug
> symbols.

Please use a locally-built PostgreSQL as similar as possible to what dory's
buildfarm runs produce, except building with debug symbols.  A
potentially-convenient way to achieve this would be to add CONFIG=>'Debug' to
build_env of your buildfarm configuration, then run the buildfarm client with
--keepall.  By the end of the run, $buildroot/$branch/inst will contain an
installation.  (I haven't actually tested that.)  You can copy that installation
to any convenient place (e.g. your home directory) and use that copy for the
remaining steps.

> Currently this machine has nothing on it other than the necessary
> dependencies to build postgres for the community.  If you could please give
> us some more information on what you would like us to install to gather
> this information to help debug the issue we are seeing we would really
> appreciate it.

At the moment, I think the only other thing you'll need is to download
https://live.sysinternals.com/Procmon.exe.

> Also, if there is any additional information on what we
> have installed on the server or how we are configured please just let us
> know and we will get you that as soon as possible.

Thanks.  I don't have specific questions about that at this time, but feel free
to share any lists or config dumps that seem relevant.



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2019-01-30 Thread David Rowley
On Sat, 26 Jan 2019 at 13:51, Tom Lane  wrote:
>
> David Rowley  writes:
> > On Thu, 3 Jan 2019 at 08:01, Tomas Vondra  
> > wrote:
> >> AFAICS the patch essentially does two things: (a) identifies Append
> >> paths with a single member and (b) ensures the Vars are properly mapped
> >> when the Append node is skipped when creating the plan.
> >> I agree doing (a) in setrefs.c is too late, as mentioned earlier. But
> >> perhaps we could do at least (b) to setrefs.c?
>
> > The problem I found with doing var translations in setrefs.c was that
> > the plan tree is traversed there breadth-first and in createplan.c we
> > build the plan depth-first.  The problem I didn't manage to overcome
> > with that is that when we replace a "ProxyPath" (actually an Append
> > path marked as is_proxy=true) it only alter targetlists, etc of nodes
> > above that in the plan tree. The nodes below it and their targetlists
> > don't care, as these don't fetch Vars from nodes above them.  If we do
> > the Var translation in setrefs then we've not yet looked at the nodes
> > that don't care and we've already done the nodes that do care. So the
> > tree traversal is backwards.
>
> I don't quite buy this argument, because you haven't given a reason
> why it doesn't apply with equal force to setrefs.c's elimination of
> no-op SubqueryScan nodes.  Yet that does work.

I assume you're talking about the code that's in
set_subqueryscan_references() inside the trivial_subqueryscan()
condition?

If so, then that seems pretty different from what I'm doing here
because trivial_subqueryscan() only ever returns true when the
Vars/Exprs from the subquery's target list match the scan's targetlist
exactly, in the same order.  It's not quite that simple with the
single-subpath Append/MergeAppend case since the Vars/Exprs won't be
the same since they're from different relations and possibly in some
different order.

> Stepping back for a minute: even if we get this to work, how often
> is it going to do anything useful?  It seems like most of the other
> development that's going on is trying to postpone pruning till later,
> so I wonder how often we'll really know at the beginning of planning
> that only one child will survive.

It'll do something useful every time the planner would otherwise
produce an Append or MergeAppend with a single subpath. Pruning down
to just 1 relation is pretty common, so it seems worth optimising for.
There's no other development that postpones pruning until later. If
you're talking about run-time pruning, then nothing is "postponed". We
still attempt to prune during planning; it's just that in some cases we
might not be able to prune until execution, so we may do both.
You make it sound like we're trying to do away with plan-time
pruning, but that's not happening; in fact, we're working pretty hard
to speed up the planner when plan-time pruning *can* be done. So far
there are no patches to speed up the planner for partitioned tables when
the planner can't prune any partitions, although it is something that
I've been giving some thought to.

I've also attached a rebased patch.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


v11-0001-Forgo-generating-single-subpath-Append-and-Merge.patch
Description: Binary data


Re: A few new options for vacuumdb

2019-01-30 Thread Michael Paquier
On Thu, Jan 31, 2019 at 02:28:05AM +, Bossart, Nathan wrote:
> It looks good to me.  The only thing I noticed is the use of
> relfrozenxid instead of relminmxid here:

Fixed and committed.  So we are finally done here, except for the
debate with the relation size.
--
Michael




Re: psql exit status with multiple -c or -f

2019-01-30 Thread Justin Pryzby
At Tue, 18 Dec 2018 10:24:39 -0600, Justin Pryzby  wrote:
> I think ON_ERROR_STOP would control whether the script stops, but it should
> fail the exit status should reflect any error in the last command.  The
> shell does that even without set -e.

Let me correct my own language:

| I think ON_ERROR_STOP would control whether the script stops, but 
| the exit status should reflect any error in the last command.  The
| shell does that even without set -e.

What I mean is that "the exit status of the shell reflects the status of the
last command".  That's why you'll see at the bottom of shell scripts an "exit 0"
which might seem to be implied.  Actually, falling off the end of the script is
the same as "exit", which is the same as "exit $?", which may be nonzero if the
preceding command was a conditional/test, like:

for ...; do
  if ...; then
    ...
  fi
done
# without this, the script will variously "fail" due to the conditional
# evaluating to false:
exit 0

So I'm pointing out a serious concern about psql returning 0 if the last
command failed, a case where it currently exits with status 1.

Justin



Re: psql exit status with multiple -c or -f

2019-01-30 Thread David G. Johnston
On Wed, Jan 30, 2019 at 6:38 PM Kyotaro HORIGUCHI
 wrote:
> I guess the reason is that psql is widely used with just a single
> -c command, and the fix actually breaks such cases. So it doesn't
> seem back-patchable, but the current behavior apparently contradicts
> the documentation, which seems perfectly reasonable.
>
> So I propose to fix the behavior for 12 and back-patch
> documentation fix.
>
> | Exit Status
> |
> | psql returns 0 to the shell if it finished normally, 1 if a fatal
> | error of its own occurs (e.g. out of memory, file not found), 2
> | if the connection to the server went bad and the session was not
> | interactive, and 3 if an error occurred in a script and the
> | variable ON_ERROR_STOP was set.
> + As the only exception, irrespective of the ON_ERROR_STOP setting,
> + psql returns 1 if the last executed command failed and it was
> + given by the -c option.
>

In HEAD, can we just turn ON_ERROR_STOP on by default when more than
one -c/-f is encountered, return 3, and call it a day?  Then, if the
user unsets ON_ERROR_STOP and does the same, have psql always return 0
unless a file specified with "-f" cannot be found (or some other
application error...).

If so can we maybe reconsider having ON_ERROR_STOP off by default generally...

I don't like saying -c "is the same as a line in a script" since -c
requires complete statements (and doesn't share transactions with
other -c lines).  Each -c simply provides psql with a single
auto-commit statement to execute - in the supplied order.  If we need
a word here "procedure" might be a good choice.

David J.



Re: Ordered Partitioned Table Scans

2019-01-30 Thread David Rowley
I've attached a rebased patch which fixes up the recent conflicts. No
other changes were made.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


v7-0001-Allow-Append-to-be-used-in-place-of-MergeAppend-f.patch
Description: Binary data


Re: Proposed refactoring of planner header files

2019-01-30 Thread Tom Lane
I wrote:
> ... I didn't work on tightening selfuncs.c's dependencies.
> While I don't have a big problem with considering selfuncs.c to be
> in bed with the planner, that's risky in that whatever dependencies
> selfuncs.c has may well apply to extensions' selectivity estimators too.
> What I'm thinking about doing there is trying to split selfuncs.c into
> two parts, one being infrastructure that can be tightly tied to the
> core planner (and, likely, get moved under backend/optimizer/) and the
> other being estimators that use a limited API and can serve as models
> for extension code.

I spent some time looking at this, wondering whether it'd be practical
to write selectivity-estimating code that hasn't #included pathnodes.h
(nee relation.h).  It seems not: even pretty high-level functions
such as eqjoinsel() need access to fields like RelOptInfo.rows and
SpecialJoinInfo.jointype.  Now, there are only a few such fields, so
conceivably we could provide accessor functions in optimizer.h for
the commonly-used fields and keep the struct types opaque.   I'm not
excited about that though; it's unlike how we do things elsewhere in
Postgres and the mere savings of one #include dependency doesn't seem
to justify it.  So I'm thinking that planner support functions that
want to do selectivity estimation will still end up pulling in
pathnodes.h via selfuncs.h, and we'll just live with that.
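
For clarity, the accessor-function approach dismissed above would have
looked roughly like this (hypothetical names, not something I'm proposing):

    /* optimizer.h: struct types stay opaque to estimators */
    struct RelOptInfo;
    struct SpecialJoinInfo;

    extern double rel_get_rows(struct RelOptInfo *rel);
    extern JoinType sjinfo_get_jointype(struct SpecialJoinInfo *sjinfo);

    /* eqjoinsel() and friends could then skip #include "nodes/pathnodes.h" */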

However ... there are three pretty clearly identifiable groups of
functions in selfuncs.c: operator-specific estimators, support
functions, and AM-specific indexscan cost estimation functions.
There's a case to be made for splitting them into three separate files.
One point is that selfuncs.c is just too large; at over 8K lines,
it's currently the 7th-largest hand-maintained C file in our tree.
Another point is that as things stand, there's a strong temptation
to bury useful functionality in static functions that can't be
gotten at by extension estimators.  Separating the support functions
might help keep us more honest on that point.  (Or not.)

I'm not sure whether those arguments are strong enough to justify
the added pain-in-the-neck factor from moving a bunch of code
around.  That complicates back-patching bug fixes and it makes it
harder to trace the git history of the code.  So I've got mixed
emotions about whether it's worth doing anything.

Thoughts?

regards, tom lane



Re: A few new options for vacuumdb

2019-01-30 Thread Bossart, Nathan
On 1/30/19, 6:04 PM, "Michael Paquier"  wrote:
> Something which was not correct in the patch is the compatibility of
> the query.  xid <> xid has been added in 9.6, so the new options will
> not be able to work with older versions.  The versions marked as
> compatible in the last patch came from the age-ing functions, but you
> added direct comparisons with relfrozenxid and relminmxid in the
> latest versions of the patch.  This implementation goes down a couple
> of released versions, which is useful enough in my opinion, so I would
> keep it as-is.

Agreed.  Thanks for catching this.

> I have added as well some markups around "PostgreSQL" in the docs, and
> extra casts for the integer/xid values of the query.  The test
> patterns are also simplified, and I added tests for incorrect values
> of --min-xid-age and --min-mxid-age.  Does that look correct to you?

It looks good to me.  The only thing I noticed is the use of
relfrozenxid instead of relminmxid here:

+   appendPQExpBuffer(_query,
+ " %s GREATEST(pg_catalog.mxid_age(c.relminmxid),"
+ " pg_catalog.mxid_age(t.relminmxid)) OPERATOR(pg_catalog.>=)"
+ " '%d'::pg_catalog.int4\n"
+ " AND c.relfrozenxid OPERATOR(pg_catalog.!=)"
+ " '0'::pg_catalog.xid\n",
+ has_where ? "AND" : "WHERE", vacopts->min_mxid_age);

However, that may still work as intended.

Nathan



Re: [PATCH] Pass COPT and PROFILE to CXXFLAGS as well

2019-01-30 Thread Michael Paquier
On Wed, Jan 30, 2019 at 08:18:31PM -0500, Tom Lane wrote:
> This looks a bit copy-and-paste-y to me, in particular no thought
> has been taken for the order of flags.  We found in configure that
> it's better to add user-specified CFLAGS at the *end*, even though
> injecting user-specified CPPFLAGS at the beginning is the right
> thing.  This is because in, eg, "-O2 -O0" the last flag wins.
> Presumably the same goes for CXXFLAGS.  I think it's right to
> put user LDFLAGS first, though.  (The argument for CPPFLAGS and
> LDFLAGS is that you want the user's -I and -L flags to go first.)

Ah yes, good point about CFLAGS and LDFLAGS.  It would be better to
add a comment about that and document the difference, aka "prepend" or
"append" the flag values.

CXXFLAGS applies to compiler options like -g -O2 which you would like
to enforce for the C++ compiler, so it seems to me that like CFLAGS
the custom values should be added at the end and not at the beginning,
no?
--
Michael




Re: A few new options for vacuumdb

2019-01-30 Thread Michael Paquier
On Wed, Jan 30, 2019 at 05:45:58PM +, Bossart, Nathan wrote:
> On 1/29/19, 4:47 PM, "Michael Paquier"  wrote:
>> Oh, OK.  This makes sense.  It would be nice to add a comment in the
>> patch and to document this calculation method in the docs of
>> vacuumdb.
> 
> Sure, this is added in v8.

Thanks, Nathan.

Something which was not correct in the patch is the compatibility of
the query.  xid <> xid has been added in 9.6, so the new options will
not be able to work with older versions.  The versions marked as
compatible in the last patch came from the age-ing functions, but you
added direct comparisons with relfrozenxid and relminmxid in the
latest versions of the patch.  This implementation goes down a couple
of released versions, which is useful enough in my opinion, so I would
keep it as-is.

I have added as well some markups around "PostgreSQL" in the docs, and
extra casts for the integer/xid values of the query.  The test
patterns are also simplified, and I added tests for incorrect values
of --min-xid-age and --min-mxid-age.  Does that look correct to you?

> Thanks.  Something else I noticed is that we do not retrieve foreign
> tables and partitioned tables for --analyze and --analyze-only.
> However, that has long been the case for parallel mode, and this issue
> should probably get its own thread.

Good point, this goes down a couple of releases, and statistics on
both may be useful to compile for a system-wide operation.  Spawning a
separate thread sounds appropriate to me.
--
Michael
diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index f304627802..41c7f3df79 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -172,6 +172,60 @@ PostgreSQL documentation
   
  
 
+ 
+  --min-mxid-age mxid_age
+  
+   
+Only execute the vacuum or analyze commands on tables with a multixact
+ID age of at least mxid_age.
+This setting is useful for prioritizing tables to process to prevent
+multixact ID wraparound (see
+).
+   
+   
+For the purposes of this option, the multixact ID age of a relation is
+the greatest of the ages of the main relation and its associated
+TOAST table, if one exists.  Since the commands
+issued by vacuumdb will also process the
+TOAST table for the relation if necessary, it does
+not need to be considered separately.
+   
+   
+
+ This option is only available for servers running
+ PostgreSQL 9.6 and later.
+
+   
+  
+ 
+
+ 
+  --min-xid-age xid_age
+  
+   
+Only execute the vacuum or analyze commands on tables with a
+transaction ID age of at least
+xid_age.  This setting
+is useful for prioritizing tables to process to prevent transaction
+ID wraparound (see ).
+   
+   
+For the purposes of this option, the transaction ID age of a relation
+is the greatest of the ages of the main relation and its associated
+TOAST table, if one exists.  Since the commands
+issued by vacuumdb will also process the
+TOAST table for the relation if necessary, it does
+not need to be considered separately.
+   
+   
+
+ This option is only available for servers running
+ PostgreSQL 9.6 and later.
+
+   
+  
+ 
+
  
   -q
   --quiet
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 5e87af2d51..7f3a9b14a9 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 38;
+use Test::More tests => 44;
 
 program_help_ok('vacuumdb');
 program_version_ok('vacuumdb');
@@ -95,3 +95,20 @@ $node->command_checks_all(
 	[qr/^.*vacuuming database "postgres"/],
 	[qr/^WARNING.*cannot vacuum non-tables or special system tables/s],
 	'vacuumdb with view');
+$node->command_fails(
+	[ 'vacuumdb', '--table', 'vactable', '--min-mxid-age', '0',
+	  'postgres'],
+	'vacuumdb --min-mxid-age with incorrect value');
+$node->command_fails(
+	[ 'vacuumdb', '--table', 'vactable', '--min-xid-age', '0',
+	  'postgres'],
+	'vacuumdb --min-xid-age with incorrect value');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--table', 'vactable', '--min-mxid-age', '2147483000',
+	  'postgres'],
+	qr/GREATEST.*relminmxid.*2147483000/,
+	'vacuumdb --table --min-mxid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
+	qr/GREATEST.*relfrozenxid.*2147483001/,
+	'vacuumdb --table --min-xid-age');
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 40ba8283a2..cb503569bb 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -43,6 +43,8 @@ typedef struct vacuumingOptions
 	bool		freeze;
 	bool		disable_page_skipping;
 	

Re: psql exit status with multiple -c or -f

2019-01-30 Thread Kyotaro HORIGUCHI
At Thu, 31 Jan 2019 10:37:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20190131.103728.153290385.horiguchi.kyot...@lab.ntt.co.jp>
> > I think ON_ERROR_STOP would control whether the script stops, but it should
> > fail the exit status should reflect any error in the last command.  The 
> > shell
> > does that even without set -e.
> 
> At least bash behaves exactly the same way as psql for me. (Just
> so there's no confusion, set -e works the opposite of what you think. I
> always use an explicit exit for that, though).
> 
> ===t.sh:
> hoge
> hage
> echo Ho-Ho-Ho
> ===
> $ bash t.sh ; echo $?
> test.sh: line 1: hoge: command not found
> test.sh: line 2: hage: command not found
> Ho-Ho-Ho
> 0
> 
> ===t.sh:
> set -e
> hoge
> hage
> echo Ho-Ho-Ho
> ===
> $ bash t.sh ; echo $?
> test.sh: line 2: hage: command not found
> 127

Sorry, that was not so good an example.

=== t.sh
set -e
ls -impossible
echo HoHoHoHo
===
$ bash ~/test.sh; echo $?
ls: invalid option -- 'e'
Try 'ls --help' for more information.
2

=== without set -e
$ bash ~/test.sh; echo $?
ls: invalid option -- 'e'
Try 'ls --help' for more information.
HoHoHoHo
0


# Wow. ls -impossibl works!

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: psql exit status with multiple -c or -f

2019-01-30 Thread Kyotaro HORIGUCHI
Hello.

At Tue, 18 Dec 2018 10:24:39 -0600, Justin Pryzby  wrote 
in <20181218162439.gb8...@telsasoft.com>
> On Tue, Dec 18, 2018 at 05:13:40PM +0900, Kyotaro HORIGUCHI wrote:
> > $  psql postgres -v ON_ERROR_STOP=0 -f ~/work/y.txt ; echo $?
> > $  psql postgres -v ON_ERROR_STOP=0 < ~/work/y.txt ; echo $?
> 
> > c) psql postgres -v ON_ERROR_STOP=0 -c foo -c 'select 1'; echo $?
> > d) psql postgres -v ON_ERROR_STOP=0 -c 'select 1' -c foo; echo $?
> > 
> > (c) returns 0 and (d) returns 1, but both return 1 when
> > ON_ERROR_STOP=1.
> 
> > The attached second patch lets (c) and (d) behave the same as (a)
> > and (b).
> 
> I don't think the behavior in the single command case should be changed:

I guess the reason is that psql is widely used with just a single
-c command, and the fix actually breaks such cases. So it doesn't
seem back-patchable, but the current behavior apparently contradicts
the documentation, which seems perfectly reasonable.

So I propose to fix the behavior for 12 and back-patch
documentation fix.

| Exit Status
| 
| psql returns 0 to the shell if it finished normally, 1 if a fatal
| error of its own occurs (e.g. out of memory, file not found), 2
| if the connection to the server went bad and the session was not
| interactive, and 3 if an error occurred in a script and the
| variable ON_ERROR_STOP was set.
+ As the only exception, irrespective of the ON_ERROR_STOP setting,
+ psql returns 1 if the last executed command failed and it was
+ given by the -c option.

# What a ...


> |[pryzbyj@database postgresql]$ ./src/bin/psql/psql ts -c FOO 2>/dev/null ; 
> echo $? 
> |1
> |[pryzbyj@database postgresql]$ patch -p1 
>  |patching file src/bin/psql/startup.c
> |[pryzbyj@database postgresql]$ make >/dev/null
> |[pryzbyj@database postgresql]$ ./src/bin/psql/psql ts -c FOO 2>/dev/null ; 
> echo $? 
> |0
> 
> Also, unpatched v11 psql returns status of last command
> |[pryzbyj@database postgresql]$ psql ts -xtc 'SELECT 1' -c FOO 2>/dev/null ; 
> echo $? 
> |?column? | 1
> |
> |1
> 
> patched psql returns 0:
> |[pryzbyj@database postgresql]$ ./src/bin/psql/psql ts -xtc 'SELECT 1' -c FOO 
> 2>/dev/null ; echo $? 
> |?column? | 1
> |
> |0
> 
> I'd prefer to see the exit status of the -f scripts (your cases a and b)
> changed to 3 if the last command failed.  I realized that's not what the
> documentation says so possibly not backpatchable.

It doesn't make sense for psql to exit with an error status when the
last command happens to fail while errors that occurred earlier were
ignored. That behavior is better achieved with ON_ERROR_STOP=1.

> |3 if an error occurred in a script and the variable ON_ERROR_STOP was set.
> 
> Think of:
> 
> $ cat /tmp/sql 
> begin;
> foo;
> select 1;
> 
> $ psql ts -f /tmp/sql ; echo ret=$?
> BEGIN
> psql:/tmp/sql:2: ERROR:  syntax error at or near "foo"
> LINE 1: foo;
> ^
> psql:/tmp/sql:3: ERROR:  current transaction is aborted, commands ignored 
> until end of transaction block
> ret=0

As you wrote below, ON_ERROR_STOP=1 would give the desired
behavior.

> I think ON_ERROR_STOP would control whether the script stops, but it should
> fail the exit status should reflect any error in the last command.  The shell
> does that even without set -e.

At least bash behaves exactly the same way as psql for me. (Just
so there's no confusion, set -e works the opposite of what you think. I
always use an explicit exit for that, though).

===t.sh:
hoge
hage
echo Ho-Ho-Ho
===
$ bash t.sh ; echo $?
test.sh: line 1: hoge: command not found
test.sh: line 2: hage: command not found
Ho-Ho-Ho
0

===t.sh:
set -e
hoge
hage
echo Ho-Ho-Ho
===
$ bash t.sh ; echo $?
test.sh: line 2: hage: command not found
127

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [PATCH] Pass COPT and PROFILE to CXXFLAGS as well

2019-01-30 Thread Tom Lane
Michael Paquier  writes:
> On Wed, Jan 30, 2019 at 02:41:01PM +0100, Christoph Berg wrote:
>> Do we still want some CXXOPT flag for the server build? I can write a
>> patch, but someone else would need to do the bikeshedding how to name
>> it, and which of the existing knobs would set CXXFLAGS along. I don't
>> think I need that feature.

> If we don't directly need it, let's not add it now but let's revisit
> the need if it proves necessary.

+1

> I propose to just commit the last
> patch you sent, and back-patch to ease integration with existing
> extensions.  Any objections?

A thought here:

 ifdef PG_CPPFLAGS
 override CPPFLAGS := $(PG_CPPFLAGS) $(CPPFLAGS)
 endif
+ifdef PG_CFLAGS
+override CFLAGS := $(PG_CFLAGS) $(CFLAGS)
+endif
+ifdef PG_CXXFLAGS
+override CXXFLAGS := $(PG_CXXFLAGS) $(CXXFLAGS)
+endif
+ifdef PG_LDFLAGS
+override LDFLAGS := $(PG_LDFLAGS) $(LDFLAGS)
+endif

This looks a bit copy-and-paste-y to me, in particular no thought
has been taken for the order of flags.  We found in configure that
it's better to add user-specified CFLAGS at the *end*, even though
injecting user-specified CPPFLAGS at the beginning is the right
thing.  This is because in, eg, "-O2 -O0" the last flag wins.
Presumably the same goes for CXXFLAGS.  I think it's right to
put user LDFLAGS first, though.  (The argument for CPPFLAGS and
LDFLAGS is that you want the user's -I and -L flags to go first.)

regards, tom lane



RE: pg_upgrade: Pass -j down to vacuumdb

2019-01-30 Thread Jamison, Kirk
On January 29, 2019 8:19 PM +, Jesper Pedersen wrote:
>On 1/29/19 12:08 AM, Michael Paquier wrote:
>> I would document the optional VACUUM_OPTS on the page of pg_upgrade.
>> If Peter thinks it is fine to not do so, that's fine for me as well.
>> 
..
>I added most of the documentation back, as requested by Kirk.

Actually, I find it useful to have this documented. However, the major
contributors have more experience, so it's alright with me not to include the
doc part if that is what they ultimately decide.

>> It seems to me that the latest patch sent is incorrect for multiple
>> reasons:
>> 1) You still enforce -j to use the number of jobs that the caller of 
>> pg_upgrade provides, and we agreed that both things are separate 
>> concepts upthread, no?  What has been suggested by Alvaro is to add a 
>> comment so as one can use VACUUM_OPTS with -j optionally, instead of 
>> suggesting a full-fledged vacuumdb command which depends on what 
>> pg_upgrade uses.  So there is no actual need for the if/else 
>> complication business.

> I think it is ok for the echo command to highlight to the user that running 
> --analyze-only using the same amount of jobs will give a faster result.

I missed that part.
IIUC, using the $VACUUMDB_OPTS variable would simplify and correct the usage of
jobs for vacuumdb (the pg_upgrade job count is not necessarily the right job
count for vacuumdb). Plus, it might simplify the script and reduce the number
of additional lines.
Tom Lane also suggested above making the script absorb the value from the
environment. Would that address your concern about getting a faster result?


Regards,
Kirk Jamison




Re: [PATCH] Pass COPT and PROFILE to CXXFLAGS as well

2019-01-30 Thread Michael Paquier
On Wed, Jan 30, 2019 at 02:41:01PM +0100, Christoph Berg wrote:
> Do we still want some CXXOPT flag for the server build? I can write a
> patch, but someone else would need to do the bikeshedding how to name
> it, and which of the existing knobs would set CXXFLAGS along. I don't
> think I need that feature.

If we don't directly need it, let's not add it now but let's revisit
the need if it proves necessary.  I propose to just commit the last
patch you sent, and back-patch to ease integration with existing
extensions.  Any objections?
--
Michael




Re: backslash-dot quoting in COPY CSV

2019-01-30 Thread Michael Paquier
On Wed, Jan 30, 2019 at 01:20:59PM -0500, Tom Lane wrote:
> Given how long we've had COPY CSV support, and the tiny number of
> complaints to date, I do not think it's something to panic over.
> Disallowing the functionality altogether is surely an overreaction.
>
> A documentation warning might be the appropriate response.  I don't
> see any plausible way for psql to actually fix the problem, short
> of a protocol change to allow the backend to tell it how the data
> stream is going to be parsed.

Yes, agreed.  I looked at this problem a couple of months (year(s)?)
ago and gave up on designing a clear portable solution after a couple
of hours over it, and that's quite a corner case.
--
Michael




Re: Unused parameters & co in code

2019-01-30 Thread Michael Paquier
On Wed, Jan 30, 2019 at 12:33:47PM +0100, Peter Eisentraut wrote:
> On 30/01/2019 08:33, Michael Paquier wrote:
>> I just got a run with CFLAGS with the following options:
>> -Wunused-function -Wunused-variable -Wunused-const-variable
> 
> These are part of -Wall.

The options selected already generate a lot of junk, so increasing the
output is not really a good idea.  What I wanted to find out is where we
could simplify the code around unused parameters.  As you mention, some
parameters are there for symmetry in the declarations, which makes sense in
some cases, but in other cases I think we may be able to reduce the logic
complexity, and this report gives hints about that.

>>  -Wno-unused-result
> 
> What is the purpose of this?

Not really useful actually as we don't mark anything with
warn_unused_result.

>> -Wunused-macros
> 
> I have looked into that in the past.  There are a few that were
> genuinely left behind accidentally, but most are there for completeness
> or symmetry and don't look like they should be removed.  Also you get
> junk from flex and perl and the like.  Needs to be addressed case by
> case.  I can dig up my old branch and make some proposals.

Thanks.  Maybe I missed some of them.  Some macros, like the xml one,
are here for documentation purposes, so removing such things does not
make sense either.
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] Allow anonymous rowtypes in function return column definition

2019-01-30 Thread Tom Lane
Elvis Pranskevichus  writes:
> On Wednesday, January 30, 2019 5:59:41 PM EST Tom Lane wrote:
>> I still found this pretty disjointed.  After a bit of thought I
>> propose the attached reformulation, which has the callers just tell
>> the routines which types to allow explicitly, with the justification
>> comments living at the call sites instead of within the routines.

> This is a much better formulation, thank you!

OK, pushed.

regards, tom lane



Re: Unused parameters & co in code

2019-01-30 Thread Michael Paquier
On Wed, Jan 30, 2019 at 09:41:04AM -0500, Tom Lane wrote:
> I'd definitely take this on a case-by-case basis.  In the planner
> functions you mentioned, for instance, I'd be pretty hesitant to remove
> the "root" parameter even if it happens not to be needed today.
> We'd probably just end up putting it back in the future, because almost
> everything in the planner needs that.  I'd only consider removing it in
> cases where there was a solid reason to require the function not to need
> it ever (as for instance what I just did to flatten_join_alias_vars).

Definitely agreed, this is case-by-case.  For the callbacks and hooks it
makes no sense; the planner ones and a couple of layers like the shared
memory size calculations are just there for symmetry, so most of the report
is really noise (I deliberately discarded anything related to bison and
generated code, of course).

> In cases where we can get simplifications of calling layers, and
> it doesn't seem likely that we'd have to undo it in future, then
> probably it's worth the trouble to change.

Some of these are in tablecmds.c, and those are visibly worth the trouble.
The ones in the btree, gin, and gist code also look like they could use
some cleanup at first glance, and those are places where we complain a lot
about the complexity of the code.
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] Allow anonymous rowtypes in function return column definition

2019-01-30 Thread Elvis Pranskevichus
On Wednesday, January 30, 2019 5:59:41 PM EST Tom Lane wrote:
> I still found this pretty disjointed.  After a bit of thought I
> propose the attached reformulation, which has the callers just tell
> the routines which types to allow explicitly, with the justification
> comments living at the call sites instead of within the routines.

This is a much better formulation, thank you!

  Elvis





Re: Delay locking partitions during query execution

2019-01-30 Thread David Rowley
On Tue, 29 Jan 2019 at 19:42, Amit Langote
 wrote:
> However, I tried the example as you described and the plan *doesn't*
> change due to concurrent update of reloptions with master (without the
> patch) either.

Well, I didn't think to try that.  I just assumed I had broken it.
It could well be related to lowering the lock levels for setting these
reloptions.  Once upon a time, they required an AEL, but that was changed
in e5550d5fec6, so prior to that commit you'd have been blocked from making
the concurrent change while the other session was in a transaction.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [PATCH] Allow anonymous rowtypes in function return column definition

2019-01-30 Thread Tom Lane
Elvis Pranskevichus  writes:
> On Sunday, January 13, 2019 4:57:48 PM EST Tom Lane wrote:
>> I feel a bit uncomfortable about defining the new flag to
>> CheckAttributeNamesTypes as "allow_anonymous_records"; that
>> seems pretty short-sighted and single-purpose, especially in
>> view of there being no obvious reason why it shouldn't accept
>> RECORDARRAY too.  Maybe with a clearer explanation of the
>> issues above, we could define that flag in a more on-point way.

> I renamed it to "in_coldeflist", which seems like a clearer indication 
> that we are validating that, and not a regular table definition.

I still found this pretty disjointed.  After a bit of thought I propose
the attached reformulation, which has the callers just tell the routines
which types to allow explicitly, with the justification comments living
at the call sites instead of within the routines.

One point that we hadn't really talked about is what happens when
CheckArrayTypes recurses.  The existing logic is that it just passes
down the same restrictions it was told at the top level, and I kept
that here.  Right now it hardly matters what we pass down, because
the base type of a domain or array can't be a pseudotype, and we
don't really expect to find one in a named composite type either.
The one case where it could matter is if someone tries to use
"pg_statistic" as a field's composite type: that would be allowed if
allow_system_table_mods and otherwise not.  But it's not really
hard to imagine future additions where it would matter and it'd
be important to restrict things as we recurse down.  I think this
formulation makes it easier to see what to do in such cases.

regards, tom lane

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 910f651..06d18a1 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -445,11 +445,14 @@ heap_create(const char *relname,
  *		this is used to make certain the tuple descriptor contains a
  *		valid set of attribute names and datatypes.  a problem simply
  *		generates ereport(ERROR) which aborts the current transaction.
+ *
+ *		relkind is the relkind of the relation to be created.
+ *		flags controls which datatypes are allowed, cf CheckAttributeType.
  * 
  */
 void
 CheckAttributeNamesTypes(TupleDesc tupdesc, char relkind,
-		 bool allow_system_table_mods)
+		 int flags)
 {
 	int			i;
 	int			j;
@@ -507,7 +510,7 @@ CheckAttributeNamesTypes(TupleDesc tupdesc, char relkind,
 		   TupleDescAttr(tupdesc, i)->atttypid,
 		   TupleDescAttr(tupdesc, i)->attcollation,
 		   NIL, /* assume we're creating a new rowtype */
-		   allow_system_table_mods);
+		   flags);
 	}
 }
 
@@ -524,13 +527,22 @@ CheckAttributeNamesTypes(TupleDesc tupdesc, char relkind,
  * containing_rowtypes.  When checking a to-be-created rowtype, it's
  * sufficient to pass NIL, because there could not be any recursive reference
  * to a not-yet-existing rowtype.
+ *
+ * flags is a bitmask controlling which datatypes we allow.  For the most
+ * part, pseudo-types are disallowed as attribute types, but there are some
+ * exceptions: ANYARRAYOID, RECORDOID, and RECORDARRAYOID can be allowed
+ * in some cases.  (This works because values of those type classes are
+ * self-identifying to some extent.  However, RECORDOID and RECORDARRAYOID
+ * are reliably identifiable only within a session, since the identity info
+ * may use a typmod that is only locally assigned.  The caller is expected
+ * to know whether these cases are safe.)
  * 
  */
 void
 CheckAttributeType(const char *attname,
    Oid atttypid, Oid attcollation,
    List *containing_rowtypes,
-   bool allow_system_table_mods)
+   int flags)
 {
 	char		att_typtype = get_typtype(atttypid);
 	Oid			att_typelem;
@@ -538,12 +550,18 @@ CheckAttributeType(const char *attname,
 	if (att_typtype == TYPTYPE_PSEUDO)
 	{
 		/*
-		 * Refuse any attempt to create a pseudo-type column, except for a
-		 * special hack for pg_statistic: allow ANYARRAY when modifying system
-		 * catalogs (this allows creating pg_statistic and cloning it during
-		 * VACUUM FULL)
+		 * We disallow pseudo-type columns, with the exception of ANYARRAY,
+		 * RECORD, and RECORD[] when the caller says that those are OK.
+		 *
+		 * We don't need to worry about recursive containment for RECORD and
+		 * RECORD[] because (a) no named composite type should be allowed to
+		 * contain those, and (b) two "anonymous" record types couldn't be
+		 * considered to be the same type, so infinite recursion isn't
+		 * possible.
 		 */
-		if (atttypid != ANYARRAYOID || !allow_system_table_mods)
+		if (!((atttypid == ANYARRAYOID && (flags & CHKATYPE_ANYARRAY)) ||
+			  (atttypid == RECORDOID && (flags & CHKATYPE_ANYRECORD)) ||
+			  (atttypid == RECORDARRAYOID && (flags & CHKATYPE_ANYRECORD
 			ereport(ERROR,
 	

Re: pg_dump multi VALUES INSERT

2019-01-30 Thread David Rowley
On Thu, 24 Jan 2019 at 04:45, Fabien COELHO  wrote:
> >> The feature is not tested anywhere. I still think that there should be a
> >> test on empty/small/larger-than-rows-per-insert tables, possibly added to
> >> existing TAP-tests.
> >
> > I was hoping to get away with not having to do that... mainly because
> > I've no idea how.
>
> Hmmm. That is another question! Maybe someone will help.

Here's another version, same as before but with tests this time.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pg_dump-rows-per-insert-option_v13.patch
Description: Binary data


Re: [PATCH] Log PostgreSQL version number on startup

2019-01-30 Thread Peter Eisentraut
On 29/01/2019 16:46, Christoph Berg wrote:
> Re: Peter Eisentraut 2019-01-16 
> <92bfdfdf-4164-aec5-4e32-c26e67821...@2ndquadrant.com>
>>> Why don't we start the logging collector before opening the sockets?
>>
>> Specifically, something like the attached.
>>
>> This keeps the dynamic module loading before the logging collector
>> start, so we see those error messages on stderr, but then the setting up
>> of the sockets would get logged.
> 
> This works nicely, so +1.
> 
> I'm attaching your patch as 0001 and my rebased one on top of it as
> 0002.

committed

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: speeding up planning with partitions

2019-01-30 Thread David Rowley
On Tue, 29 Jan 2019 at 22:32, Amit Langote
 wrote:
> Attached updated patches.

I think we could make the 0001 patch a bit smaller if we were to apply
the attached first.

Details in the commit message.

What do you think?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


0001-Move-building-of-child-base-quals-out-into-a-new-fun.patch
Description: Binary data


Re: backslash-dot quoting in COPY CSV

2019-01-30 Thread Tom Lane
Pavel Stehule  writes:
> st 30. 1. 2019 18:51 odesílatel Bruce Momjian  napsal:
>> I am wondering if we should just disallow CSV from STDIN, on security
>> grounds.  How big a problem would that be for people?  Would we have to
>> disable to STDOUT as well since it could not be restored?  Should we
>> issue some kind of security warning in such cases?  Should we document
>> this?

> it is pretty common pattern for etl, copy from stdin. I am thinking it can
> be big problem

Given how long we've had COPY CSV support, and the tiny number of
complaints to date, I do not think it's something to panic over.
Disallowing the functionality altogether is surely an overreaction.

I don't really see an argument for calling it a security problem,
given that pg_dump doesn't use CSV and it isn't the default for
anything else either.  Sure, you can imagine some bad actor hoping
to cause problems by putting crafted data into a table, but how
does that data end up in a script that's using COPY CSV FROM STDIN
(as opposed to copying out-of-line data)?  It's a bit far-fetched.

A documentation warning might be the appropriate response.  I don't
see any plausible way for psql to actually fix the problem, short
of a protocol change to allow the backend to tell it how the data
stream is going to be parsed.

regards, tom lane



Re: backslash-dot quoting in COPY CSV

2019-01-30 Thread Pavel Stehule
st 30. 1. 2019 18:51 odesílatel Bruce Momjian  napsal:

> On Wed, Jan 30, 2019 at 06:32:11PM +0100, Daniel Verite wrote:
> >   Bruce Momjian wrote:
> > > Well, these all kind of require a change to the COPY format, which
> > > hasn't changed in many years.
> >
> > Not for the first two. As an example of solution #2, it could look like
> this:
> >
> > =# \set INLINE_COPY_BOUNDARY ==JuQW3gc2mQjXuvmJ32TlOLhJ3F2Eh2LcsBup0oH7==
> > =# COPY table FROM STDIN CSV;
> > somevalue,"foo
> > \.
> > bar"
> > ==JuQW3gc2mQjXuvmJ32TlOLhJ3F2Eh2LcsBup0oH7==
> >
> > Instead of looking for \. on a line by itself, psql would look for the
> > boundary to know where the data ends.
> > The boundary is not transmitted to the server, it has no need to know
> > about it.
>
> Wow, that is an odd API, as you stated below.
>
> > > > - psql could be told somehow that the next piece of inline data is in
> > > > the CSV format, and then pass it through a CSV parser.
> > >
> > > That might be the cleanest solution, but how would we actually input
> > > multi-line data in CSV mode with \. alone on a line?
> >
> > With this solution, the content doesn't change at all.
> > The weird part would be the user interface, because the information
> > psql needs is not only "CSV", it's also the options DELIMITER, QUOTE,
> > ESCAPE and possibly ENCODING. Currently it doesn't know any of these,
> > they're passed to the server in an opaque, unparsed form within
> > the COPY command.
> >
> > Personally, the solution I find cleaner is the server side not having
> > any end-of-data marker for CSV. So backslash-dot would never be
> > special. psql could allow for a custom ending boundary for in-script
> > data, and users could set that to backslash-dot if they want, but that
> > would be their choice.
> > That would be clearly not backward compatible, and I believe it wouldn't
> > work with the v2 protocol, so I'm not sure it would have much chance of
> > approval.
>
> I had forgotten that the DELIMITER and QUOTE can be changed --- that
> kills the idea of adding a simple CSV parser into psql because we would
> have to parse the COPY SQL command as well.
>
> I am wondering if we should just disallow CSV from STDIN, on security
> grounds.  How big a problem would that be for people?  Would we have to
> disable to STDOUT as well since it could not be restored?  Should we
> issue some kind of security warning in such cases?  Should we document
> this?
>

it is pretty common pattern for etl, copy from stdin. I am thinking it can
be big problem



> In hindsight, I am not sure how we could have designed this more
> securly.  I guess we could have required some special text to start all
> CSV continuation lines that were not end-of-file, but that would have
> been very unportable, which is the goal of CSV.
>
> --
>   Bruce Momjian  http://momjian.us
>   EnterpriseDB http://enterprisedb.com
>
> + As you are, so once was I.  As I am, so you will be. +
> +  Ancient Roman grave inscription +
>
>


Re: backslash-dot quoting in COPY CSV

2019-01-30 Thread Bruce Momjian
On Wed, Jan 30, 2019 at 06:32:11PM +0100, Daniel Verite wrote:
>   Bruce Momjian wrote:
> > Well, these all kind of require a change to the COPY format, which
> > hasn't changed in many years.
> 
> Not for the first two. As an example of solution #2, it could look like this:
> 
> =# \set INLINE_COPY_BOUNDARY ==JuQW3gc2mQjXuvmJ32TlOLhJ3F2Eh2LcsBup0oH7==
> =# COPY table FROM STDIN CSV;
> somevalue,"foo
> \.
> bar"
> ==JuQW3gc2mQjXuvmJ32TlOLhJ3F2Eh2LcsBup0oH7==
> 
> Instead of looking for \. on a line by itself, psql would look for the
> boundary to know where the data ends.
> The boundary is not transmitted to the server, it has no need to know
> about it.

Wow, that is an odd API, as you stated below.

> > > - psql could be told somehow that the next piece of inline data is in
> > > the CSV format, and then pass it through a CSV parser.
> > 
> > That might be the cleanest solution, but how would we actually input
> > multi-line data in CSV mode with \. alone on a line?
> 
> With this solution, the content doesn't change at all.
> The weird part would be the user interface, because the information
> psql needs is not only "CSV", it's also the options DELIMITER, QUOTE,
> ESCAPE and possibly ENCODING. Currently it doesn't know any of these,
> they're passed to the server in an opaque, unparsed form within
> the COPY command.
> 
> Personally, the solution I find cleaner is the server side not having
> any end-of-data marker for CSV. So backslash-dot would never be
> special. psql could allow for a custom ending boundary for in-script
> data, and users could set that to backslash-dot if they want, but that
> would be their choice.
> That would be clearly not backward compatible, and I believe it wouldn't
> work with the v2 protocol, so I'm not sure it would have much chance of
> approval.

I had forgotten that the DELIMITER and QUOTE can be changed --- that
kills the idea of adding a simple CSV parser into psql because we would
have to parse the COPY SQL command as well.

I am wondering if we should just disallow CSV from STDIN, on security
grounds.  How big a problem would that be for people?  Would we have to
disable CSV to STDOUT as well, since its output could not be restored?
Should we issue some kind of security warning in such cases?  Should we
document this?

In hindsight, I am not sure how we could have designed this more securely.
I guess we could have required some special text to start all CSV
continuation lines that were not end-of-file, but that would have been
very unportable, and portability is the goal of CSV.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: speeding up planning with partitions

2019-01-30 Thread Amit Langote
On Wed, Jan 30, 2019 at 3:20 AM David Rowley
 wrote:
>
> On Tue, 29 Jan 2019 at 22:32, Amit Langote
>  wrote:
> >
> > On 2019/01/29 11:23, David Rowley wrote:
> > > 7. In set_inherit_target_rel_sizes() I still don't really like the way
> > > you're adjusting the EquivalenceClasses. Would it not be better to
> > > invent a function similar to adjust_appendrel_attrs(), or just use
> > > that function?
> >
> > OK, I added a function copy_eq_classes_for_child_root() that simply makes
> > a copy of eq_classes from the source root (deep-copying where applicable)
> > and returns the list.
>
> hmm, but you've added a case for SpecialJoinInfo, is there a good
> reason not just to do the translation by just adding an
> EquivalenceClass case to adjust_appendrel_attrs_mutator() then just
> get rid of the new "replace" flag in add_child_rel_equivalences()?
> That way you'd also remove the churn in the couple of places you've
> had to modify the existing calls to add_child_rel_equivalences().

Sounds like something worth trying to make work.  Will do, thanks for the idea.

Thanks,
Amit



Re: backslash-dot quoting in COPY CSV

2019-01-30 Thread Daniel Verite
Bruce Momjian wrote:

> > - the end of data could be expressed as a length (in number of lines
> > for instance) instead of an in-data marker.
> > 
> > - the end of data could be configurable, as in the MIME structure of
> > multipart mail messages, where a part is ended by a "boundary",
> > line, generally a long randomly generated string. This boundary
> > would have to be known to psql through setting a dedicated
> > variable or command.
> > 
> > - COPY as the SQL command could have the boundary option
> > for data fed through its STDIN. This could neutralize the
> > special role of backslash-dot in general, not just in quoted fields,
> > since the necessity to quote backslash-dot is a wart anyway.
> 
> Well, these all kind of require a change to the COPY format, which
> hasn't changed in many years.

Not for the first two. As an example of solution #2, it could look like this:

=# \set INLINE_COPY_BOUNDARY ==JuQW3gc2mQjXuvmJ32TlOLhJ3F2Eh2LcsBup0oH7==
=# COPY table FROM STDIN CSV;
somevalue,"foo
\.
bar"
==JuQW3gc2mQjXuvmJ32TlOLhJ3F2Eh2LcsBup0oH7==

Instead of looking for \. on a line by itself, psql would look for the
boundary to know where the data ends.
The boundary is not transmitted to the server, it has no need to know
about it.

> > - psql could be told somehow that the next piece of inline data is in
> > the CSV format, and then pass it through a CSV parser.
> 
> That might be the cleanest solution, but how would we actually input
> multi-line data in CSV mode with \. alone on a line?

With this solution, the content doesn't change at all.
The weird part would be the user interface, because the information
psql needs is not only "CSV", it's also the options DELIMITER, QUOTE,
ESCAPE and possibly ENCODING. Currently it doesn't know any of these,
they're passed to the server in an opaque, unparsed form within
the COPY command.

Personally, the solution I find cleaner is the server side not having
any end-of-data marker for CSV. So backslash-dot would never be
special. psql could allow for a custom ending boundary for in-script
data, and users could set that to backslash-dot if they want, but that
would be their choice.
That would be clearly not backward compatible, and I believe it wouldn't
work with the v2 protocol, so I'm not sure it would have much chance of
approval.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: strange error sequence on parallel btree creation

2019-01-30 Thread Peter Geoghegan
Hi Álvaro,

On Wed, Jan 30, 2019 at 5:40 AM Alvaro Herrera  wrote:
> Note that those come from the same create index: the one on process
> 46299 must evidently be a parallel worker.  It's weird that two
> processes report the index building error.  But even if it were correct,
> the CONTEXT line in the other process is not okay ... precisely because
> it's the parent.
>
> What I did was
>
> create table a as select * from generate_series(1, 100);
> insert into a select * from generate_series(1, 8000);
> create index on a (generate_series);

I can see why you'd find that slightly confusing. I'm not sure what
can be done about this scenario specifically, though. It seems to come
down to how the parallel infrastructure works, which is not something
that I gave much input on.

Fundamentally, the parallel infrastructure wants to propagate all
errors that it received before parallel workers were shut down. I
think that that's probably the right thing to do. I'm not sure what
you mean by "But even if it were correct, the CONTEXT line in the
other process is not okay ... precisely because it's the parent".
Perhaps you can go into more detail on that. The CONTEXT looks like it
would look regardless of this race.

In any case, I think that the chances of this happening in production
are pretty slim. The error messages each refer to specific, distinct
pairs of duplicates (duplicated values). It's probably necessary to
have an enormous number of duplicates for things to work out this way.
That's hardly typical.

-- 
Peter Geoghegan



Re: Index Skip Scan

2019-01-30 Thread Dmitry Dolgov
> On Sun, Jan 27, 2019 at 6:17 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> > On Sat, Jan 26, 2019 at 6:45 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> >
> > Rebased version after rd_amroutine was renamed.
>
> And one more to fix the documentation. Also I've noticed few TODOs in the 
> patch
> about the missing docs, and replaced them with a required explanation of the
> feature.

A bit of adjustment after nodes/relation -> nodes/pathnodes.


0001-Index-skip-scan-v8.patch
Description: Binary data


Re: WIP: Avoid creation of the free space map for small tables

2019-01-30 Thread Masahiko Sawada
On Wed, Jan 30, 2019 at 4:33 AM Amit Kapila  wrote:
>
> On Tue, Jan 29, 2019 at 8:12 PM John Naylor  
> wrote:
> >
> > On Tue, Jan 29, 2019 at 11:56 AM Amit Kapila  
> > wrote:
> >
> > > You can find this change in attached patch.  Then, I ran the make
> > > check in src/bin/pgbench multiple times using test_conc_insert.sh.
> > > You can vary the number of times the test should run, if you are not
> > > able to reproduce it with this.
> > >
> > > The attached patch (clear_local_map_if_exists_1.patch) atop the main
> > > patch fixes the issue for me.  Kindly verify the same.
> >
> > I got one failure in 50 runs. With the new patch, I didn't get any
> > failures in 300 runs.
> >
>
> Thanks for verification.  I have included it in the attached patch and
> I have also modified the page.sql test to have enough number of pages
> in relation so that FSM will get created irrespective of alignment
> boundaries.  Masahiko San, can you verify if this now works for you?
>

Thank you for updating the patch!

Could the modified page.sql test fail if the block size is more than 8kB?
We could ensure that the number of pages is more than 4 by checking it and
adding more data if there are not enough, but I'm really not sure we should
care about the bigger-block-size cases.  However, maybe it's good to check
the number of pages after the insertion, so that we can break the issue
down in case the test fails again.
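For example, a check along these lines would make a block-size-related
failure easier to diagnose (just a sketch; 'test_rel' stands in for
whatever table the page.sql test actually uses):

```sql
-- Sketch: report how many main-fork pages the test table ends up with
-- after the inserts, so nonstandard block sizes are easy to spot.
SELECT pg_relation_size('test_rel', 'main') /
       current_setting('block_size')::bigint AS main_pages;
```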

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] proposal: schema variables

2019-01-30 Thread Pavel Stehule
Hi

just rebase

Regards

Pavel


schema-variables-20190130.patch.gz
Description: application/gzip


Re: proposal: new polymorphic types - commontype and commontypearray

2019-01-30 Thread Pavel Stehule
st 30. 1. 2019 v 17:00 odesílatel Pavel Stehule 
napsal:

>
>
> po 28. 1. 2019 v 20:47 odesílatel Robert Haas 
> napsal:
>
>> On Fri, Jan 25, 2019 at 7:21 PM Tom Lane  wrote:
>> > Anyway I think the names need to be any-something.
>>
>> To me, that seems unnecessarily rigid.  Not a bad idea if we can come
>> up with something that is otherwise acceptable.  But all of your
>> suggestions sound worse than Pavel's proposal, so...
>>
>
> I implemented commontypenonarray, and commontyperange types. Now, a SQL
> functions are supported too.
>
> The naming is same - I had not a better idea. But it can be changed
> without any problems, if somebody come with some more acceptable.
>
> I don't think so the name is too important. The polymorphic types are
> important, interesting for extension's developers what is small group of
> Postgres users.
>
> And personally, I think so commontype and commontypearray are good enough
> for not native speakers like me. But I am opened any variant - I think so
> this functionality is interesting
> and partially coverage one gap in our implementation of polymorphic types.
>

maybe "supertype". It is one char shorter .. somewhere is term
"supperclass, ..."

In Czech language this term is short, "nadtyp", but probably it is not
acceptable :)



> Regards
>
> Pavel
>
>
>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>


Re: proposal: new polymorphic types - commontype and commontypearray

2019-01-30 Thread Pavel Stehule
po 28. 1. 2019 v 20:47 odesílatel Robert Haas 
napsal:

> On Fri, Jan 25, 2019 at 7:21 PM Tom Lane  wrote:
> > Anyway I think the names need to be any-something.
>
> To me, that seems unnecessarily rigid.  Not a bad idea if we can come
> up with something that is otherwise acceptable.  But all of your
> suggestions sound worse than Pavel's proposal, so...
>

I implemented the commontypenonarray and commontyperange types.  SQL
functions are now supported too.

The naming is the same - I did not have a better idea.  But it can be
changed without any problem if somebody comes up with something more
acceptable.

I don't think the name is too important.  The polymorphic types themselves
are what matter; they are interesting for extension developers, which is a
small group of Postgres users.

Personally, I think commontype and commontypearray are good enough for
non-native speakers like me.  But I am open to any variant - I think this
functionality is interesting and partially covers a gap in our
implementation of polymorphic types.

Regards

Pavel



> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index ed0ee584c9..f9d10204e3 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4761,6 +4761,14 @@ SELECT * FROM pg_attribute
 anyrange

 
+   
+commontype
+   
+
+   
+commontypearray
+   
+

 void

@@ -4870,6 +4878,34 @@ SELECT * FROM pg_attribute
 ).

 
+   
+commontype
+Indicates that a function accepts any data type. Values
+are converted to real common type.
+(see ).
+   
+
+   
+commontypearray
+Indicates that a function accepts any array data type. The
+elements of array are converted to common type of these values.
+(see ).
+   
+
+   
+commontypenonarray
+Indicates that a function accepts any non-array data type
+(see ).
+   
+
+   
+commontyperange
+Indicates that a function accepts any range data type
+(see  and
+). The subtype can be used for
+deduction of common type.
+   
+

 cstring
 Indicates that a function accepts or returns a null-terminated C string.
diff --git a/doc/src/sgml/extend.sgml b/doc/src/sgml/extend.sgml
index a6b77c1cfe..dece346c03 100644
--- a/doc/src/sgml/extend.sgml
+++ b/doc/src/sgml/extend.sgml
@@ -231,7 +231,7 @@
 
  Five pseudo-types of special interest are anyelement,
  anyarray, anynonarray, anyenum,
- and anyrange,
+ anyrange, commontype and commontypearray.
  which are collectively called polymorphic types.
  Any function declared using these types is said to be
  a polymorphic function.  A polymorphic function can
@@ -267,6 +267,15 @@
  be an enum type.
 
 
+
+ A second family of polymorphic types consists of commontype and
+ commontypearray. These types are similar to
+ anyelement and anyarray. Arguments declared
+ as anyelement require the same real type for all passed values. For
+ commontype arguments a common type is selected, and later
+ all these arguments are cast to this common type.
+
+
 
  Thus, when more than one argument position is declared with a polymorphic
  type, the net effect is that only certain combinations of actual argument
diff --git a/src/backend/catalog/pg_aggregate.c b/src/backend/catalog/pg_aggregate.c
index cc3806e85d..d5013ec12d 100644
--- a/src/backend/catalog/pg_aggregate.c
+++ b/src/backend/catalog/pg_aggregate.c
@@ -133,7 +133,7 @@ AggregateCreate(const char *aggName,
 	hasInternalArg = false;
 	for (i = 0; i < numArgs; i++)
 	{
-		if (IsPolymorphicType(aggArgTypes[i]))
+		if (IsPolymorphicTypeAny(aggArgTypes[i]))
 			hasPolyArg = true;
 		else if (aggArgTypes[i] == INTERNALOID)
 			hasInternalArg = true;
@@ -143,7 +143,7 @@ AggregateCreate(const char *aggName,
 	 * If transtype is polymorphic, must have polymorphic argument also; else
 	 * we will have no way to deduce the actual transtype.
 	 */
-	if (IsPolymorphicType(aggTransType) && !hasPolyArg)
+	if (IsPolymorphicTypeAny(aggTransType) && !hasPolyArg)
 		ereport(ERROR,
 (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
  errmsg("cannot determine transition data type"),
@@ -153,7 +153,7 @@ AggregateCreate(const char *aggName,
 	 * Likewise for moving-aggregate transtype, if any
 	 */
 	if (OidIsValid(aggmTransType) &&
-		IsPolymorphicType(aggmTransType) && !hasPolyArg)
+		IsPolymorphicTypeAny(aggmTransType) && !hasPolyArg)
 		ereport(ERROR,
 (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
  errmsg("cannot determine transition data type"),
@@ -489,7 +489,7 @@ AggregateCreate(const char *aggName,
 	 * that itself violates the rule against polymorphic result with no
 	 * polymorphic input.)
 	 */
-	if (IsPolymorphicType(finaltype) && !hasPolyArg)
+	if 

Fwd: Re: BUG #15589: Due to missing wal, restore ends prematurely and opens database for read/write

2019-01-30 Thread leif
Hi
I have reported a bug via the PostgreSQL bug report form, but haven't got
any response so far.  This might not be a bug, but rather a feature not
implemented yet.  I suggest making a small addition to StartupXLOG to solve
the issue.



git diff src/backend/access/transam/xlog.c
diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index 2ab7d804f0..d0e5bb3f84 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7277,6 +7277,19 @@ StartupXLOG(void)
 
 			case RECOVERY_TARGET_ACTION_PROMOTE:
 				break;
+			}
+		} else if (recoveryTarget == RECOVERY_TARGET_TIME)
+		{
+			/*
+			 * Stop point not reached but next WAL could not be read
+			 * Some explanation and warning should be logged
+			 */
+			switch (recoveryTargetAction)
+			{
+				case RECOVERY_TARGET_ACTION_PAUSE:
+					SetRecoveryPause(true);
+					recoveryPausesHere();
+					break;
 			}
 		}





The scenario I want to solve is:
Need to restore a backup to another server.
 Restore the pg_basebackup files
 Restore some WAL files
 Extract the pg_basebackup files
 Create recovery.conf with a point-in-time target
 Start PostgreSQL
 Recovery ends before the target time due to missing WAL files
 The database opens read/write

I think the database should have paused recovery; then I could restore
additional WAL files and restart PostgreSQL to continue the recovery.

With large databases and a lot of WAL files it is time-consuming to repeat
parts of the process.

Best regards
Leif Gunnar Erlandsen



Re: insensitive collations

2019-01-30 Thread Daniel Verite
Peter Eisentraut wrote:

> Another patch. 

+ks key), in order for such such collations to act in
a

s/such such/such/

+   
+The pattern matching operators of all three kinds do not support
+nondeterministic collations.  If required, apply a different collation
to
+the expression to work around this limitation.
+   

It's an important point of comparison between CI collations and
contrib/citext, since the latter diverts a bunch of functions/operators
to make them do case-insensitive pattern matching.
The citext documentation explains the rationale for using it versus text;
maybe it now needs to be expanded a bit with the pros and cons of choosing
citext versus nondeterministic collations.

The current patch doesn't alter a few string functions that could
potentially implement collation-aware string search, such as
replace(), strpos(), starts_with().
ISTM that we should not let these functions ignore the collation: they
ought to error out until we get their implementation to use the ICU
collation-aware string search.
FWIW I've been experimenting with usearch_openFromCollator() and
other usearch_* functions, and it looks doable to implement at least the
3 above functions based on that, even though the UTF16-ness of the API
does not favor us.
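For reference, a bare-bones sketch of that approach (standalone C against
ICU, not patch code; error handling and any PostgreSQL integration are
omitted, and the locale, strings, and function name are only illustrative).
Note that the match position comes back as a UTF-16 offset, which is part
of the impedance mismatch mentioned above:

```c
/* Sketch only: collation-aware substring search with ICU's usearch API. */
#include <stdio.h>
#include <unicode/ucol.h>
#include <unicode/usearch.h>
#include <unicode/ustring.h>

/*
 * Return the UTF-16 offset of "needle" in "haystack" under the given
 * collator, or -1 if not found.  (A real strpos() replacement would still
 * have to map this back to an offset in the server encoding.)
 */
static int32_t
collation_aware_strpos(UCollator *coll, const char *haystack, const char *needle)
{
	UChar		uhay[256];
	UChar		uneedle[256];
	int32_t		hay_len, needle_len, pos;
	UErrorCode	status = U_ZERO_ERROR;
	UStringSearch *search;

	/* Convert both UTF-8 strings to the UTF-16 form the API requires. */
	u_strFromUTF8(uhay, 256, &hay_len, haystack, -1, &status);
	u_strFromUTF8(uneedle, 256, &needle_len, needle, -1, &status);
	if (U_FAILURE(status))
		return -1;

	search = usearch_openFromCollator(uneedle, needle_len,
									  uhay, hay_len,
									  coll, NULL, &status);
	if (U_FAILURE(status))
		return -1;
	pos = usearch_first(search, &status);	/* USEARCH_DONE (-1) if absent */
	usearch_close(search);

	return U_SUCCESS(status) ? pos : -1;
}

int
main(void)
{
	UErrorCode	status = U_ZERO_ERROR;
	UCollator  *coll = ucol_open("und", &status);

	/* Primary strength: ignore case and accents, like a CI/AI collation. */
	ucol_setStrength(coll, UCOL_PRIMARY);
	printf("%d\n", (int) collation_aware_strpos(coll, "Grüße aus Berlin", "GRUSSE"));
	ucol_close(coll);
	return 0;
}
```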

ICU also provides regexp matching, but not collation-aware, since
character-based patterns don't play well with the concept of collation.
About a potential collation-aware LIKE, it looks hard to implement,
since the algorithm currently used in like_match.c seems purely
character-based. AFAICS there's no way to plug calls to usearch_*
functions into it, it would need a separate redesign from scratch.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: WIP: Avoid creation of the free space map for small tables

2019-01-30 Thread John Naylor
On Wed, Jan 30, 2019 at 2:11 PM Amit Kapila  wrote:
> This is much better than the earlier version of test and there is no
> dependency on the vacuum.  However, I feel still there is some
> dependency on how the rows will fit in a page and we have seen some
> related failures due to alignment stuff.  By looking at the test, I
> can't envision any such problem, but how about if we just write some
> simple tests where we can check that the FSM won't be created for very
> small number of records say one or two and then when we increase the
> records FSM gets created, here if we want, we can even use vacuum to
> ensure FSM gets created.  Once we are sure that the main patch passes
> all the buildfarm tests, we can extend the test to something advanced
> as you are proposing now.  I think that will reduce the chances of
> failure, what do you think?

That's probably a good idea to limit risk.  I have just very basic tests
now, and I vacuum before every relation size check to make sure any FSM
extension (whether desired or not) is invoked.  Also, in my last patch I
forgot to implement explicit checks of the block number instead of assuming
how many rows will fit on a page; I've used a plpgsql code block to do this.

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/test/regress/expected/fsm.out b/src/test/regress/expected/fsm.out
index df6b15b9d2..eb2c3842e2 100644
--- a/src/test/regress/expected/fsm.out
+++ b/src/test/regress/expected/fsm.out
@@ -2,46 +2,30 @@
 -- Free Space Map test
 --
 CREATE TABLE fsm_check_size (num int, str text);
--- Fill 3 blocks with as many large records as will fit
--- No FSM
-INSERT INTO fsm_check_size SELECT i, rpad('', 1024, 'a')
-FROM generate_series(1,7*3) i;
+-- With one block, there should be no FSM
+INSERT INTO fsm_check_size VALUES(1, 'a');
 VACUUM fsm_check_size;
 SELECT pg_relation_size('fsm_check_size', 'main') AS heap_size,
 pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
  heap_size | fsm_size 
 ---+--
- 24576 |0
-(1 row)
-
--- Clear some space on block 0
-DELETE FROM fsm_check_size WHERE num <= 5;
-VACUUM fsm_check_size;
--- Insert small record in block 2 to set the cached smgr targetBlock
-INSERT INTO fsm_check_size VALUES(99, 'b');
--- Insert large record and make sure it goes in block 0 rather than
--- causing the relation to extend
-INSERT INTO fsm_check_size VALUES (101, rpad('', 1024, 'a'));
-SELECT pg_relation_size('fsm_check_size', 'main') AS heap_size,
-pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
- heap_size | fsm_size 
+--
- 24576 |0
+  8192 |0
 (1 row)
 
 -- Extend table with enough blocks to exceed the FSM threshold
 -- FSM is created and extended to 3 blocks
-INSERT INTO fsm_check_size SELECT i, 'c' FROM generate_series(200,1200) i;
-VACUUM fsm_check_size;
-SELECT pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
- fsm_size 
---
-24576
-(1 row)
-
--- Truncate heap to 1 block
--- No change in FSM
-DELETE FROM fsm_check_size WHERE num > 7;
+DO $$
+DECLARE curtid tid;
+num int;
+BEGIN
+num = 11;
+  LOOP
+INSERT INTO fsm_check_size VALUES (num, 'b') RETURNING ctid INTO curtid;
+EXIT WHEN curtid >= tid '(4, 0)';
+num = num + 1;
+  END LOOP;
+END;
+$$;
 VACUUM fsm_check_size;
 SELECT pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
  fsm_size 
@@ -49,16 +33,6 @@ SELECT pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
 24576
 (1 row)
 
--- Truncate heap to 0 blocks
--- FSM now truncated to 2 blocks
-DELETE FROM fsm_check_size;
-VACUUM fsm_check_size;
-SELECT pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
- fsm_size 
---
-16384
-(1 row)
-
 -- Add long random string to extend TOAST table to 1 block
 INSERT INTO fsm_check_size
 VALUES(0, (SELECT string_agg(md5(chr(i)), '')
diff --git a/src/test/regress/sql/fsm.sql b/src/test/regress/sql/fsm.sql
index 07f505591a..b1debedea1 100644
--- a/src/test/regress/sql/fsm.sql
+++ b/src/test/regress/sql/fsm.sql
@@ -4,42 +4,28 @@
 
 CREATE TABLE fsm_check_size (num int, str text);
 
--- Fill 3 blocks with as many large records as will fit
--- No FSM
-INSERT INTO fsm_check_size SELECT i, rpad('', 1024, 'a')
-FROM generate_series(1,7*3) i;
-VACUUM fsm_check_size;
-SELECT pg_relation_size('fsm_check_size', 'main') AS heap_size,
-pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
+-- With one block, there should be no FSM
+INSERT INTO fsm_check_size VALUES(1, 'a');
 
--- Clear some space on block 0
-DELETE FROM fsm_check_size WHERE num <= 5;
 VACUUM fsm_check_size;
-
--- Insert small record in block 2 to set the cached smgr targetBlock
-INSERT INTO fsm_check_size VALUES(99, 'b');
-
--- Insert large record and make sure it goes in block 0 rather than
--- causing the relation to extend
-INSERT INTO fsm_check_size VALUES (101, rpad('', 1024, 'a'));
 SELECT pg_relation_size('fsm_check_size', 

Re: Unused parameters & co in code

2019-01-30 Thread Tom Lane
Michael Paquier  writes:
> ... while this generates a lot of noise for callbacks and depending on
> the environment used, this is also pointing to things which could be
> simplified:

I'd definitely take this on a case-by-case basis.  In the planner
functions you mentioned, for instance, I'd be pretty hesitant to remove
the "root" parameter even if it happens not to be needed today.
We'd probably just end up putting it back in the future, because almost
everything in the planner needs that.  I'd only consider removing it in
cases where there was a solid reason to require the function not to need
it ever (as for instance what I just did to flatten_join_alias_vars).

In cases where we can get simplifications of calling layers, and
it doesn't seem likely that we'd have to undo it in future, then
probably it's worth the trouble to change.

regards, tom lane



Re: Replication & recovery_min_apply_delay

2019-01-30 Thread Alvaro Herrera
Hi

On 2019-Jan-30, Konstantin Knizhnik wrote:

> One of our customers was faced with the following problem: he has
> setup  physical primary-slave replication but for some reasons
> specified very large (~12 hours) recovery_min_apply_delay.

We also came across this exact same problem some time ago.  It's pretty
nasty.  I wrote a quick TAP reproducer, attached (needed a quick patch
for PostgresNode itself too.)

I tried several failed strategies:
1. setting lastSourceFailed just before sleeping for apply delay, with
   the idea that for the next fetch we would try stream.  But this
   doesn't work because WaitForWalToBecomeAvailable is not executed.

2. split WaitForWalToBecomeAvailable in two pieces, so that we can call
   the first half in the restore loop.  But this causes 1s of wait
   between segments (error recovery) and we never actually catch up.

What I thought back then was the *real* solution, but didn't get around to
implementing, is the idea you describe: starting a walreceiver at an
earlier point.

> I wonder if it can be considered as acceptable solution of the problem or
> there can be some better approach?

I didn't find one.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 5d16db9a8308692f66b2836432fe84fbbec3e81f Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 17 Aug 2018 14:20:47 -0300
Subject: [PATCH] Support pg_basebackup -S in PostgresNode->backup()

---
 src/test/perl/PostgresNode.pm | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index d9aeb277d9..2442251683 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -488,6 +488,9 @@ Create a hot backup with B in subdirectory B of
 B<< $node->backup_dir >>, including the WAL. WAL files
 fetched at the end of the backup, not streamed.
 
+The keyword parameter replication_slot => 'myslot' can be used for the B<-S>
+argument to B.
+
 You'll have to configure a suitable B on the
 target server since it isn't done by default.
 
@@ -495,14 +498,17 @@ target server since it isn't done by default.
 
 sub backup
 {
-	my ($self, $backup_name) = @_;
+	my ($self, $backup_name, %params) = @_;
 	my $backup_path = $self->backup_dir . '/' . $backup_name;
 	my $port= $self->port;
 	my $name= $self->name;
 
+	my @cmd = ("pg_basebackup", "-D", $backup_path, "-p", $port, "--no-sync");
+	push @cmd, '-S', $params{replication_slot}
+	  if defined $params{replication_slot};
+
 	print "# Taking pg_basebackup $backup_name from node \"$name\"\n";
-	TestLib::system_or_bail('pg_basebackup', '-D', $backup_path, '-p', $port,
-		'--no-sync');
+	TestLib::system_or_bail(@cmd);
 	print "# Backup finished\n";
 }
 
-- 
2.11.0

# Test streaming with replication delay

use strict;
use warnings;

use PostgresNode;
use TestLib;
use IPC::Run;
use Test::More;

my $golf = get_new_node('golf');

my $foxtrot = get_new_node('foxtrot');
$foxtrot->init(allows_streaming => 1, has_archiving => 1);
$foxtrot->append_conf('postgresql.conf', 'log_connections=on');
$foxtrot->start();

$foxtrot->safe_psql('postgres', 'select pg_create_physical_replication_slot(\'golf\')');

$foxtrot->backup('backup', replication_slot => 'golf');

$golf->init_from_backup($foxtrot, 'backup', has_streaming => 1, has_restoring => 1);
$golf->append_conf('recovery.conf', 'recovery_min_apply_delay = 1min');
$golf->append_conf('recovery.conf', 'primary_slot_name = \'golf\'');

system("pgbench", "-is10", "-p", $foxtrot->port);

my ($stat, $slots);
note("insert lsn: ". $foxtrot->safe_psql('postgres', 'select pg_current_wal_insert_lsn()'));
note("repl slot restart: ". $foxtrot->safe_psql('postgres', 'select restart_lsn from pg_replication_slots'));

$slots = $foxtrot->safe_psql('postgres', 'select slot_name,slot_type from pg_replication_slots');
ok($slots eq "golf|physical", "replication slot looks good");

# 8 transactions should be enough to fill just over 3 segments

my $h = IPC::Run::start(['pgbench', '-R100', '-P1', '-T1', $foxtrot->connstr('postgres')]);

sleep 62;

note("insert lsn: ". $foxtrot->safe_psql('postgres', 'select pg_current_wal_insert_lsn()'));
note("repl slot restart: ". $foxtrot->safe_psql('postgres', 'select restart_lsn from pg_replication_slots'));

$golf->start;

while (1) {
	$stat = $foxtrot->safe_psql('postgres', 'select * from pg_stat_replication');
	$slots = $foxtrot->safe_psql('postgres', 'select * from pg_replication_slots');
	ok(1, "stat: $stat repl: $slots");

	if ($stat ne '') {
		done_testing();
		exit 0;
	}

	sleep(10);
}


Re: Synchronous replay take III

2019-01-30 Thread Michail Nikolaev
Hello,

Sorry, I missed this email.

>> In our case we have implemented some kind of "replication barrier"
functionality based on table with counters (one counter per application
backend in simple case).
>> Each application backend have dedicated connection to each replica. And
it selects its counter value few times (2-100) per second from each replica
in background process (depending on how often replication barrier is used).

> Interesting approach. Why don't you sample pg_last_wal_replay_lsn()
> on all the standbys instead, so you don't have to generate extra write
> traffic?

Replay LSN was the first approach I tried. I was sampling 'select
replay_lsn from pg_stat_replication' on the master to get the replay
position on the replicas.
However, for some unknown reason I was not able to get it to work: even
after replay_lsn was reached, the standby was unable to see the data.
I know it should not happen. I spent a few days on debugging... And… since
I was required to ping the replicas anyway (to check whether one has become
the master, to monitor ping, locks, connections, etc.), I decided to
introduce the table for now.

>> Once application have committed transaction it may want join replication
barrier before return new data to a user. So, it increments counter in the
table and waits until all replicas have replayed that value according to
background monitoring process. Of course timeout, replicas health checks
and few optimizations and circuit breakers are used.

> I'm interested in how you handle failure (taking too long to respond
> or to see the new counter value, connectivity failure etc).
> Specifically, if the writer decides to give up on a certain standby
> (timeout, circuit breaker etc), how should a client that is connected
> directly to that standby now or soon afterwards know that this standby
> has been 'dropped' from the replication barrier and it's now at risk
> of seeing stale data?

Each standby has some health flags attached to it. Health is "red" when:
* we can't connect to the replica, or all connections are in use
* the replica lag according to pg_last_xact_replay_timestamp is more than 3000ms
* the replica lag according to pg_last_xact_replay_timestamp was more than
3000ms some time ago (1ms)
* the replica is the new master now
* etc.

In the case of a replication barrier, we wait only for "green" replicas,
and for at most 5000ms. If we are still not able to see the new counter
value on some replicas, it is up to the client to decide how to handle it.
In our case it means the replica is lagging by more than 3000ms, so it is
"red" now, and the next client request will be dispatched to another
"green" replica. This is done by a special connection pool with a balancer
inside.
Not sure it is all 100% correct, but we could just proceed in our case.
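To make that concrete, a rough sketch of the checks involved (the table and
column names, backend id, and counter value below are only illustrative of
our setup, not anything in the patch):

```sql
-- Health check sampled on each replica: how far behind is replay?
SELECT now() - pg_last_xact_replay_timestamp() AS replay_lag;

-- Barrier: after commit, the backend bumps its counter on the master ...
UPDATE replication_barrier SET counter = counter + 1
 WHERE backend_id = 42 RETURNING counter;

-- ... and the background process polls each replica until it sees it.
SELECT counter >= 123 AS caught_up
  FROM replication_barrier WHERE backend_id = 42;
```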

> someone can successfully execute queries on a standby
> that hasn't applied a transaction that you know to be committed on the
> primary.

>> Nice thing here - constant number of connection involved. Even if lot of
threads joining replication barrier in the moment. Even if some replicas
are lagging.
>>
>> Because 2-5 seconds lag of some replica will lead to out of connections
issue in few milliseconds in case of implementation described in this
thread.

> Right, if a standby is lagging more than the allowed amount, in my
> patch the lease is cancelled and it will refuse to handle requests if
> the GUC is on, with a special new error code, and then it's up to the
> client to decide what to do. Probably find another node.
> In case of high loded

> Could you please elaborate? What could you do that would be better?
> If the answer is that you just want to know that you might be seeing
> stale data but for some reason you don't want to have to find a new
> node, the reader is welcome to turn synchronous_standby off and try
> again (giving up data freshness guarantees). Not sure when that would
> be useful though.

The main problem I see here is master connection usage. We have about
10,000 RPS on the master. So a small lag on some replica (we have six of
them) will lead to all master connections waiting for replay on the stale
replica until the timeout. That is an outage for the whole system, even if
the replica lagged for only 200-300ms (in the real world it can lag for
seconds on a regular basis).

If we set synchronous_replay_max_lag to 10-20ms, standbys will be cancelled
all the time.
In our case, we use a constant number of connections. In addition, client
requests wait for standby replay inside the application backend thread
without blocking a master connection. This is the main difference, I think.

> I've attached a small shell script that starts up a primary and N
> replicas with synchronous_replay configured, in the hope of
> encouraging you to try it out.

Thanks - will try and report.


Re: [PATCH] Pass COPT and PROFILE to CXXFLAGS as well

2019-01-30 Thread Christoph Berg
Re: Andres Freund 2019-01-30 <20190130015127.hciz36lpmu7pr...@alap3.anarazel.de>
> I'm confused - that doesn't allow to inject flags to all in-core built
> files? So how does that fix your problem fully? Say
> e.g. llvmjit_inline.cpp won't get the flag, and thus not be
> reproducible?

The original itch being scratched here is extensions written in C++,
which need -ffile-prefix-map (or -fdebug-prefix-map) injected into
CXXFLAGS on top of the flags encoded in pgxs.mk. I originally picked
COPT because that works for injecting flags into pgxs.mk as well, but
PG_*FLAGS is better.

For the server build itself, it shouldn't be necessary to inject flags
because they are passed directly to ./configure. Admittedly I haven't
looked at reproducibility of PG11 yet as clang is still missing the
prefix-map feature, but https://reviews.llvm.org/D49466 is making good
progress.

Do we still want some CXXOPT flag for the server build? I can write a
patch, but someone else would need to do the bikeshedding how to name
it, and which of the existing knobs would set CXXFLAGS along. I don't
think I need that feature.

Christoph
-- 
Senior Berater, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz



strange error sequence on parallel btree creation

2019-01-30 Thread Alvaro Herrera
Hi

While trying out the progress report mechanism for btrees, I noticed
this strange chain of errors:

2019-01-29 15:51:55.928 -03 [43789] ERROR:  no se pudo crear el índice único 
«a_generate_series_idx»
2019-01-29 15:51:55.928 -03 [43789] DETALLE:  La llave (generate_series)=(152) 
está duplicada.
2019-01-29 15:51:55.928 -03 [43789] SENTENCIA:  create unique index 
concurrently on a (generate_series);
2019-01-29 15:51:55.928 -03 [44634] ERROR:  no se pudo crear el índice único 
«a_generate_series_idx»
2019-01-29 15:51:55.928 -03 [44634] DETALLE:  La llave 
(generate_series)=(31339) está duplicada.
2019-01-29 15:51:55.928 -03 [44634] SENTENCIA:  create unique index 
concurrently on a (generate_series);
2019-01-29 15:51:55.985 -03 [43670] LOG:  background worker "parallel worker" 
(PID 44634) terminó con código de salida 1


Note that those come from the same create index: the one on process
46299 must evidently be a parallel worker.  It's weird that two
processes report the index building error.  But even if it were correct,
the CONTEXT line in the other process is not okay ... precisely because
it's the parent.

What I did was

create table a as select * from generate_series(1, 100);
insert into a select * from generate_series(1, 8000);
create index on a (generate_series);

The last command used the laptop disk, excessive use of which caused the
whole thing to stall for a few dozen seconds (I think it's because of the
encryption, but I'm not sure).  I then changed lc_messages to C, for
pasting here, and repeated with an external USB drive -- result: it fails
cleanly (only one ERROR).

-- 
Álvaro Herrera



Re: Unused parameters & co in code

2019-01-30 Thread Alvaro Herrera
On 2019-Jan-30, Michael Paquier wrote:

> ReindexPartitionedIndex => parentIdx

I'd not change this one, as we will surely want to do something useful
eventually.  Not that it matters, but it'd be useless churn.

> blackholeNoticeProcessor => arg

This one's signature is enforced by PQsetNoticeProcessor.

> _bt_relbuf => rel (Quite some cleanup here for the btree code)

If this is a worthwhile cleanup, let's see a patch.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Avoid creation of the free space map for small tables

2019-01-30 Thread Amit Kapila
On Wed, Jan 30, 2019 at 3:26 PM John Naylor  wrote:
>
> On Wed, Jan 30, 2019 at 4:33 AM Amit Kapila  wrote:
> > There are two more failures which we need to something about.
> > 1. Make fsm.sql independent of vacuum without much losing on coverage
> > of newly added code.  John, I guess you have an idea, see if you can
> > take care of it, otherwise, I will see what I can do for it.
>
> I've attached a patch that applies on top of v19 that uses Andrew
> Gierth's idea to use fillfactor to control free space. I've also
> removed tests that relied on truncation and weren't very useful to
> begin with.
>

This is much better than the earlier version of the test, and there is no
dependency on vacuum.  However, I feel there is still some dependency on
how the rows fit on a page, and we have seen some related failures due to
alignment.  Looking at the test, I can't envision any such problem, but how
about writing some simple tests for now: check that the FSM is not created
for a very small number of records, say one or two, and that it is created
once we increase the number of records (here, if we want, we can even use
vacuum to ensure the FSM gets created).  Once we are sure that the main
patch passes all the buildfarm tests, we can extend the test to something
more advanced, as you are proposing now.  I think that will reduce the
chances of failure, what do you think?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Control your disk usage in PG: Introduction to Disk Quota Extension

2019-01-30 Thread Hubert Zhang
Hi all,

Currently we add hooks in SMGR to detect which table is being modified
(i.e. whose disk size changes).
Inserting rows into an existing page with free space does not add a new
block, so it should not mark the table as an active table; smgrextend is a
good place to capture exactly this behavior.
We welcome suggestions on other, better hook positions!

Besides, supposing we use smgr as the hook position, I want to discuss the
API passed to the hook function.
Take smgrextend as an example. Its function signature is:
```
void smgrextend
(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum, char *buffer,
bool skipFsync);
```
So the hook API for smgrextend has two main options.
Hook API Option1
```
typedef void (*smgrextend_hook_type)
(RelFileNode smgr_rnode,ForkNumber forknum);
```
Hook API Option 2
```
typedef void (*smgrextend_hook_type)
(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,char *buffer,
bool skipFsync);
```

As for Option 1: since the diskquota extension only needs the relfilenode
information to detect active tables, Option 1 just passes the RelFileNode to
the hook function. It is clearer and carries the semantic meaning directly.

Option 2, on the other hand, passes the original parameters to the hook
function without any filtering. This is more general and lets different hook
implementations decide how to use these parameters.
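
To make the difference concrete, here is a rough sketch of what an
extension-side callback could look like with the Option 1 signature (the
hash table and function names are purely illustrative assumptions, not part
of the patch):

```
#include "postgres.h"

#include "common/relpath.h"
#include "storage/relfilenode.h"
#include "utils/hsearch.h"

/* extension-private hash of relfilenodes considered "active" (assumed) */
static HTAB *active_tables = NULL;

/* Matches the Option 1 signature: only the relfilenode and fork number. */
static void
diskquota_smgrextend_hook(RelFileNode smgr_rnode, ForkNumber forknum)
{
	bool		found;

	/* Only extensions of the main fork change the user-visible table size. */
	if (forknum != MAIN_FORKNUM || active_tables == NULL)
		return;

	/* Remember the relation; the bgworker recalculates only active tables. */
	(void) hash_search(active_tables, &smgr_rnode, HASH_ENTER, &found);
}
```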

Option 1 also needs some additional work to handle the smgrdounlinkall case,
whose input parameter is a list of SMgrRelations. We may need to palloc a
RelFileNode list and pfree it manually, as sketched after the signature
below.
The smgrdounlinkall function signature is:
```
smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo, char
*relstorages)
```
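
As a rough illustration of that extra work, the caller side could build a
temporary RelFileNode array before invoking a hypothetical Option-1-style
hook (the hook type, hook variable, and function name below are assumptions
made up for this sketch):

```
#include "postgres.h"

#include "storage/relfilenode.h"
#include "storage/smgr.h"

/* Hypothetical hook type and variable, analogous to the Option 1 hook. */
typedef void (*smgrdounlinkall_hook_type) (RelFileNode *rnodes, int nrels);
static smgrdounlinkall_hook_type smgrdounlinkall_hook = NULL;

/* Sketch of how smgrdounlinkall could report the affected relations. */
static void
report_unlinked_relations(SMgrRelation *rels, int nrels)
{
	RelFileNode *rnodes;
	int			i;

	if (smgrdounlinkall_hook == NULL || nrels == 0)
		return;

	/* Strip the SMgrRelation list down to a plain RelFileNode array. */
	rnodes = (RelFileNode *) palloc(nrels * sizeof(RelFileNode));
	for (i = 0; i < nrels; i++)
		rnodes[i] = rels[i]->smgr_rnode.node;

	(*smgrdounlinkall_hook) (rnodes, nrels);

	pfree(rnodes);
}
```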

Assuming we use smgr as the hook position, which hook API is better,
Option 1 or Option 2?
Or could we find some balanced API in between?

Again comments on other better hook positions are appreciated!

Thanks
Hubert


On Wed, Jan 30, 2019 at 10:26 AM Hubert Zhang  wrote:

> Hi Michael, Robert
> For your question about the hook position, I want to explain more about the
> background why we want to introduce these hooks.
> We wrote a diskquota extension 
> for Postgresql(which is inspired by Heikki's pg_quota
> ). Diskquota extension is used to
> control the disk usage in Postgresql in a fine-grained way, which means:
> 1. You could set disk quota limit at schema level or role level.
> 2. A background worker will gather the current disk usage for each
> schema/role in realtime.
> 3. A background worker will generate the blacklist for schema/role whose
> quota limit is exceeded.
> 4. New transactions that want to insert data into a schema/role on the
> blacklist will be cancelled.
>
> In step 2, gathering the current disk usage for each schema requires summing
> the disk size of all the tables in that schema. This is a time-consuming
> operation. We want to use hooks in SMGR to detect the Active Tables, and
> only recalculate the disk size of those Active Tables.
> For example, the smgrextend hook indicates that a new block was allocated
> and the table needs to be treated as an Active Table.
>
> Do you have some better hook positions to recommend for the above use
> case?
> Thanks in advance.
>
> Hubert
>
>
>
>
>
> On Tue, Jan 22, 2019 at 12:08 PM Hubert Zhang  wrote:
>
>> > For this particular purpose, I don't immediately see why you need a
>>> > hook in both places.  If ReadBuffer is called with P_NEW, aren't we
>>> > guaranteed to end up in smgrextend()?
>>> Yes, that's a bit awkward.
>>
>>
>>  Hi Michael, we revisited the ReadBuffer hook and removed it in the latest
>> patch.
>> The ReadBuffer hook was originally used to do enforcement (e.g. exceeding
>> the diskquota limit) while a query is loading data.
>> We plan to move the enforcement work for running queries to a separate
>> diskquota worker process, and let that worker process detect the backends
>> to be cancelled and send SIGINT to them.
>> So there is no need for the ReadBuffer hook anymore.
>>
>> Our patch currently only contains smgr related hooks to catch the file
>> change and get the Active Table list for diskquota extension.
>>
>> Thanks Hubert.
>>
>>
>> On Mon, Jan 7, 2019 at 6:56 PM Haozhou Wang  wrote:
>>
>>> Thanks very much for your comments.
>>>
>>> To the best of my knowledge, smgr is a layer that abstracts the storage
>>> operations. Therefore, it is a good place to control or collect information
>>> about storage operations without touching the physical storage layer.
>>> Moreover, smgr deals with the actual disk IO operations (not considering
>>> the OS cache) in postgres, so we do not need to worry about buffer
>>> management in postgres.
>>> That keeps the purpose of the hook pure: a hook for actual disk IO.
>>>
>>> Regards,
>>> Haozhou
>>>
>>> On Wed, Dec 26, 2018 at 1:56 PM Michael Paquier 
>>> wrote:
>>>
 On Wed, Nov 21, 2018 at 09:47:44AM -0500, Robert Haas wrote:
 > +1 for adding some hooks to support this kind of 

Replication & recovery_min_apply_delay

2019-01-30 Thread Konstantin Knizhnik

Hi hackers,

One of our customers was faced with the following problem:
he has set up physical primary-standby replication but, for some reason,
specified a very large (~12 hours) recovery_min_apply_delay. I do not know
the precise reasons for such a large gap between master and replica.
Everything works normally until the replica is restarted. Then it starts
to apply WAL, reaches the point where the record timestamp is less than 12
hours old, and ... suspends recovery. No WAL receiver is launched, so nobody
is fetching changes from the master.
This may cause the master's WAL space to overflow (if there is a replication
slot) and loss of data in case of a master crash.


It looks like the right behavior is to be able to launch the WAL receiver
before the replica reaches the end of WAL.

For example, we can launch it before going to sleep in recoveryApplyDelay().
We need to specify a start LSN for the WAL sender; I didn't find a better
solution than iterating over the local WAL until I reach the last valid
record.


I attach a small patch which implements this approach.
I wonder whether it can be considered an acceptable solution to the problem,
or whether there is some better approach?


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 2ab7d80..ef6433f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -802,6 +802,7 @@ static XLogSource readSource = 0;	/* XLOG_FROM_* code */
  */
 static XLogSource currentSource = 0;	/* XLOG_FROM_* code */
 static bool lastSourceFailed = false;
+static bool stopOnError = false;
 
 typedef struct XLogPageReadPrivate
 {
@@ -3971,6 +3972,49 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr RedoRecPtr, XLogRecPtr endptr)
 }
 
 /*
+ * Find latest WAL LSN
+ */
+static XLogRecPtr
+GetLastLSN(XLogRecPtr lsn)
+{
+	XLogReaderState *xlogreader;
+	char	   *errormsg;
+	XLogPageReadPrivate private;
+	MemSet(&private, 0, sizeof(XLogPageReadPrivate));
+
+	xlogreader = XLogReaderAllocate(wal_segment_size, &XLogPageRead, &private);
+
+	stopOnError = true;
+	while (XLogReadRecord(xlogreader, lsn, &errormsg) != NULL)
+	{
+		lsn = InvalidXLogRecPtr;
+	}
+	stopOnError = false;
+	lsn = xlogreader->EndRecPtr;
+	XLogReaderFree(xlogreader);
+
+	return lsn;
+}
+
+/*
+ * Launch WalReceiver starting from last LSN if not started yet.
+ */
+static void
+StartWalRcv(XLogRecPtr currLsn)
+{
+	if (!WalRcvStreaming() && PrimaryConnInfo && strcmp(PrimaryConnInfo, "") != 0)
+	{
+		XLogRecPtr lastLSN = GetLastLSN(currLsn);
+		if (lastLSN != InvalidXLogRecPtr)
+		{
+			curFileTLI = ThisTimeLineID;
+			RequestXLogStreaming(ThisTimeLineID, lastLSN, PrimaryConnInfo,
+ PrimarySlotName);
+		}
+	}
+}
+
+/*
  * Remove WAL files that are not part of the given timeline's history.
  *
  * This is called during recovery, whenever we switch to follow a new
@@ -6004,6 +6048,12 @@ recoveryApplyDelay(XLogReaderState *record)
 	if (secs <= 0 && microsecs <= 0)
 		return false;
 
+	/*
+	 * Start WAL receiver if not started yet, to avoid WALs overflow at primary node
+	 * or large gap between primary and replica when apply delay is specified.
+	 */
+	StartWalRcv(record->EndRecPtr);
+
 	while (true)
 	{
 		ResetLatch(&XLogCtl->recoveryWakeupLatch);
@@ -11821,6 +11871,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 		return false;
 
 	/*
+	 * If the WAL receiver was already started because of apply delay,
+	 * then restart it.
+	 */
+	if (WalRcvStreaming())
+		ShutdownWalRcv();
+
+	/*
 	 * If primary_conninfo is set, launch walreceiver to try
 	 * to stream the missing WAL.
 	 *
@@ -11990,6 +12047,9 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 if (readFile >= 0)
 	return true;	/* success! */
 
+if (stopOnError)
+	return false;
+
 /*
  * Nope, not found in archive or pg_wal.
  */


Re: Unused parameters & co in code

2019-01-30 Thread Peter Eisentraut
On 30/01/2019 08:33, Michael Paquier wrote:
> I just got a run with CFLAGS with the following options:
> -Wunused-function -Wunused-variable -Wunused-const-variable

These are part of -Wall.

>  -Wno-unused-result

What is the purpose of this?

> -Wunused-parameter

I think it's worth cleaning this up a bit, but it needs to be done by
hand, unless you want to add a large number of pg_attribute_unused()
decorations.
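
For what it's worth, a minimal sketch of what such a decoration looks like
(the function and parameter names here are invented purely for
illustration):

```
#include "postgres.h"

/*
 * pg_attribute_unused() (defined in c.h) marks a parameter that is kept
 * only for signature compatibility as intentionally unused, silencing
 * -Wunused-parameter for that spot.
 */
static void
example_callback(Datum main_arg, void *extra pg_attribute_unused())
{
	elog(LOG, "called with %d", DatumGetInt32(main_arg));
}
```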

> -Wunused-macros

I have looked into that in the past.  There are a few that were
genuinely left behind accidentally, but most are there for completeness
or symmetry and don't look like they should be removed.  Also you get
junk from flex and perl and the like.  Needs to be addressed case by
case.  I can dig up my old branch and make some proposals.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: Avoid creation of the free space map for small tables

2019-01-30 Thread John Naylor
On Wed, Jan 30, 2019 at 4:33 AM Amit Kapila  wrote:
> There are two more failures which we need to do something about.
> 1. Make fsm.sql independent of vacuum without losing much coverage
> of the newly added code.  John, I guess you have an idea, see if you can
> take care of it, otherwise, I will see what I can do for it.

I've attached a patch that applies on top of v19 that uses Andrew
Gierth's idea to use fillfactor to control free space. I've also
removed tests that relied on truncation and weren't very useful to
begin with.

--
John Naylor                   https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/test/regress/expected/fsm.out b/src/test/regress/expected/fsm.out
index df6b15b9d2..f92d454042 100644
--- a/src/test/regress/expected/fsm.out
+++ b/src/test/regress/expected/fsm.out
@@ -2,11 +2,10 @@
 -- Free Space Map test
 --
 CREATE TABLE fsm_check_size (num int, str text);
--- Fill 3 blocks with as many large records as will fit
--- No FSM
+-- Fill 3 blocks with one record each
+ALTER TABLE fsm_check_size SET (fillfactor=15);
 INSERT INTO fsm_check_size SELECT i, rpad('', 1024, 'a')
-FROM generate_series(1,7*3) i;
-VACUUM fsm_check_size;
+FROM generate_series(1,3) i;
 SELECT pg_relation_size('fsm_check_size', 'main') AS heap_size,
 pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
  heap_size | fsm_size 
@@ -14,11 +13,12 @@ pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
  24576 |0
 (1 row)
 
--- Clear some space on block 0
-DELETE FROM fsm_check_size WHERE num <= 5;
-VACUUM fsm_check_size;
--- Insert small record in block 2 to set the cached smgr targetBlock
-INSERT INTO fsm_check_size VALUES(99, 'b');
+-- Fill most of the last block
+ALTER TABLE fsm_check_size SET (fillfactor=100);
+INSERT INTO fsm_check_size SELECT i, rpad('', 1024, 'a')
+FROM generate_series(11,15) i;
+-- Make sure records can go into any block but the last one
+ALTER TABLE fsm_check_size SET (fillfactor=30);
 -- Insert large record and make sure it goes in block 0 rather than
 -- causing the relation to extend
 INSERT INTO fsm_check_size VALUES (101, rpad('', 1024, 'a'));
@@ -32,38 +32,16 @@ pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
 -- Extend table with enough blocks to exceed the FSM threshold
 -- FSM is created and extended to 3 blocks
 INSERT INTO fsm_check_size SELECT i, 'c' FROM generate_series(200,1200) i;
-VACUUM fsm_check_size;
-SELECT pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
- fsm_size 
---
-24576
-(1 row)
-
--- Truncate heap to 1 block
--- No change in FSM
-DELETE FROM fsm_check_size WHERE num > 7;
-VACUUM fsm_check_size;
 SELECT pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
  fsm_size 
 --
 24576
 (1 row)
 
--- Truncate heap to 0 blocks
--- FSM now truncated to 2 blocks
-DELETE FROM fsm_check_size;
-VACUUM fsm_check_size;
-SELECT pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
- fsm_size 
---
-16384
-(1 row)
-
 -- Add long random string to extend TOAST table to 1 block
 INSERT INTO fsm_check_size
 VALUES(0, (SELECT string_agg(md5(chr(i)), '')
 		   FROM generate_series(1,100) i));
-VACUUM fsm_check_size;
 SELECT pg_relation_size(reltoastrelid, 'main') AS toast_size,
 pg_relation_size(reltoastrelid, 'fsm') AS toast_fsm_size
 FROM pg_class WHERE relname = 'fsm_check_size';
diff --git a/src/test/regress/sql/fsm.sql b/src/test/regress/sql/fsm.sql
index 07f505591a..542bd6e203 100644
--- a/src/test/regress/sql/fsm.sql
+++ b/src/test/regress/sql/fsm.sql
@@ -4,20 +4,21 @@
 
 CREATE TABLE fsm_check_size (num int, str text);
 
--- Fill 3 blocks with as many large records as will fit
--- No FSM
+-- Fill 3 blocks with one record each
+ALTER TABLE fsm_check_size SET (fillfactor=15);
 INSERT INTO fsm_check_size SELECT i, rpad('', 1024, 'a')
-FROM generate_series(1,7*3) i;
-VACUUM fsm_check_size;
+FROM generate_series(1,3) i;
+
 SELECT pg_relation_size('fsm_check_size', 'main') AS heap_size,
 pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
 
--- Clear some space on block 0
-DELETE FROM fsm_check_size WHERE num <= 5;
-VACUUM fsm_check_size;
+-- Fill most of the last block
+ALTER TABLE fsm_check_size SET (fillfactor=100);
+INSERT INTO fsm_check_size SELECT i, rpad('', 1024, 'a')
+FROM generate_series(11,15) i;
 
--- Insert small record in block 2 to set the cached smgr targetBlock
-INSERT INTO fsm_check_size VALUES(99, 'b');
+-- Make sure records can go into any block but the last one
+ALTER TABLE fsm_check_size SET (fillfactor=30);
 
 -- Insert large record and make sure it goes in block 0 rather than
 -- causing the relation to extend
@@ -28,26 +29,12 @@ pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
 -- Extend table with enough blocks to exceed the FSM threshold
 -- FSM is created and extended to 3 blocks
 INSERT INTO fsm_check_size SELECT i, 'c' FROM generate_series(200,1200) i;
-VACUUM fsm_check_size;
-SELECT pg_relation_size('fsm_check_size', 

Re: jsonpath

2019-01-30 Thread Alexander Korotkov
On Wed, Jan 30, 2019 at 9:36 AM Andres Freund  wrote:
> On 2019-01-30 07:34:00 +0300, Alexander Korotkov wrote:
> > Thank you for your review!  Let me answer some points of your review
> > while others will be answered later.
> >
> > On Wed, Jan 30, 2019 at 5:28 AM Andres Freund  wrote:
> > > On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > > > +/*INPUT/OUTPUT*/
> > >
> > > Why do we need this much code to serialize data that we initially got in
> > > serialized manner? Couldn't we just keep the original around?
> >
> > As I get it, your concern relates to the fact that we have jsonpath in text
> > form (serialized) and convert it to the binary form (also serialized).
> > Yes, we should do so.  Otherwise, we would have to parse jsonpath for
> > each evaluation.  Basically, for the same reason we have binary
> > representation for jsonb.
>
> But why can't we keep the text around, instead of having all of that
> code for converting back?

Yeah, that's possible.  But right now, converting back to string is one of
the ways to test that jsonpath parsing is correct.  If we removed the
conversion back to string, that possibility would also be removed.  We would
also lose the ability to normalize jsonpath, which is probably not that
important.  Besides, keeping the original text around is a generally ugly
redundancy.  So, I can't say I like the idea of saving a few hundred lines
of code at this price.

> > > > +++ b/src/backend/utils/adt/jsonpath_exec.c
> > >
> > > Gotta say, I'm unenthusiastic about yet another execution engine in
some
> > > PG subsystem.
> >
> > Better ideas?  I can imagine we can evade introduction of jsonpath
> > datatype and turn jsonpath into executor nodes.  But I hardly can
> > imagine you would be more enthusiastic about that :)
>
> Not executor nodes, I think it could be possible to put it into the
> expression evaluation codepaths, but it's probably too different to fit
> well (would get you JIT compilation of the whole thing tho).

Consider the given example: we need to check some predicate for each JSON
item, where the predicate is given by one expression and the set of items is
produced by another expression.  In order to fit this into expression
evaluation, we would probably need some kind of lambda functions there.  It
seems unlikely to me that we want to implement that.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: Protect syscache from bloating with negative cache entries

2019-01-30 Thread Kyotaro HORIGUCHI
At Wed, 30 Jan 2019 05:06:30 +, "Ideriha, Takeshi" 
 wrote in 
<4E72940DA2BF16479384A86D54D0988A6F4156D4@G01JPEXMBKW04>
> >You don't have direct control over syscache memory usage. When you find a
> >query slowed by the default cache expiration, you can set
> >cache_memory_target to keep the entries for intermittent execution of a
> >query, or you can increase syscache_prune_min_age to allow the cache to
> >live for a longer time.
> 
> In the current ver8 patch there is a stats view representing the age class
> distribution.
> https://www.postgresql.org/message-id/20181019.173457.68080786.horiguchi.kyotaro%40lab.ntt.co.jp
> Does it help DBAs with tuning cache_prune_age and/or cache_prune_target?

Definitely.  At the moment a DBA can see nothing at all about cache usage.

> If the number of cache entries in the older age classes is large, are
> people supposed to lower prune_age and leave cache_prune_target unchanged?
> (I am a little bit confused.)

This feature just removes cache entries that have not been accessed
for a certain time.

If older entries occupy the major portion, it means that the
syscache is used effectively (in other words, most of the entries
are accessed frequently enough), and in that case I believe the
syscache doesn't put pressure on memory usage. If the total
memory usage still exceeds expectations in that case, reducing the
pruning age may reduce it, but not necessarily. An extremely short
pruning age will work, in exchange for performance degradation.

If newer entries occupy the major portion, it means that the
syscache may not be used effectively. The total amount of memory
usage will be limited by the pruning feature, so tuning won't be
needed.

In both cases, if pruning causes slowdown of intermittent large
queries, cache_memory_target will alleviate the slowdown.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [BDR] Question on compatibility of Postgres with Open JDK

2019-01-30 Thread Craig Ringer
On Wed, 30 Jan 2019 at 15:28, Pratheej Chandrika <
pratheej.chandr...@motorolasolutions.com> wrote:

> Craig,
> Thanks a lot for the prompt reply.
>
> From what you mentioned, I assume that other than the client-side JDBC there
> is no Java dependency on the server side. What I mean is I assume there is no
> JDK/Java dependency for the Postgres server to work. Request your input.
>

Correct.

Please see the PostgreSQL and PgJDBC manuals for more details.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise