Re: [HACKERS] Custom compression methods

2021-03-02 Thread Dilip Kumar
On Mon, Mar 1, 2021 at 8:53 PM Dilip Kumar  wrote:
>
> Now, I think the only pending thing is related to the expandedrecord:
> basically, currently, we have detoasted the compressed field only in
> the expanded_record_set_field_internal function.  I am still not
> completely sure whether for the built-in types we need to do something
> for expanded_record_set_tuple and expanded_record_set_field or not.  I
> mean, in these functions do we only expand the external to survive the
> COMMIT/ROLLBACK, or do we also expand it and send it to some target
> table like we do in expanded_record_set_field_internal?

I have done further analysis of the compressed field in the
expandedrecord. My observation is that expanded_record_set_field_internal
unconditionally passes true only when it is called from
ER_get_flat_size.  All the other functions (expanded_record_set_tuple
and expanded_record_set_fields) set expand_external to true only if
estate->atomic is not set.  And estate->atomic is set to false only if
we are executing an anonymous block from a transaction block (there
might be other ways to get estate->atomic as false).  But the point is
that the flattening in these two functions is conditional, which means
we cannot be using these expanded records to form some kind of row;
otherwise we could not have conditional flattening based on how the PL
block is being executed.  So I think this proves Robert's point that we
are expanding these only to survive the COMMIT/ROLLBACK inside the PL
block.  That means for the built-in types, decompression in
expanded_record_set_field_internal should be sufficient, and that was
already done in my latest version of the patch, v29-0001.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: archive_command / pg_stat_archiver & documentation

2021-03-02 Thread Benoit Lobréau
Thanks!

On Tue, Mar 2, 2021 at 04:10, Julien Rouhaud wrote:

> On Tue, Mar 2, 2021 at 9:29 AM Michael Paquier 
> wrote:
> >
> > On Mon, Mar 01, 2021 at 05:17:06PM +0800, Julien Rouhaud wrote:
> > > Maybe this can be better addressed than with a link in the
> > > documentation.  The final outcome is that it can be difficult to
> > > monitor the archiver state in such case.  That's orthogonal to this
> > > patch but maybe we can add a new "archiver_start" timestamptz column
> > > in pg_stat_archiver, so monitoring tools can detect a problem if it's
> > > too far away from pg_postmaster_start_time() for instance?
> >
> > There may be other solutions as well.  I have applied the doc patch
> > for now.
>
> Thanks!
>
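For illustration, the kind of monitoring query Julien describes might look
like this (the archiver_start column is hypothetical; pg_stat_archiver has
no such field today):

  SELECT archiver_start - pg_postmaster_start_time() AS archiver_startup_gap
  FROM pg_stat_archiver;

A monitoring tool could then alert when that gap grows suspiciously large.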


Re: [PATCH] Bug fix in initdb output

2021-03-02 Thread Nitin Jadhav
>
> FWIW, I don't think that it is a good idea to come back to this
> decision for *nix platforms, so I would let it as-is, and use relative
> paths if initdb is called using a relative path.


Whether the command is displayed with an absolute or a relative path
depends on how the user invoked initdb. I agree with the above comment to
keep this behaviour as-is: if initdb is called using a relative path, it
displays a relative path, otherwise an absolute path.


> However, if you can write something that
> makes the path printed compatible for a WIN32 terminal while keeping
> the behavior consistent with other platforms, people would have no
> objections to that.

I feel the patch attached above handles this scenario.

Thanks and Regards,
Nitin Jadhav

On Tue, Mar 2, 2021 at 6:53 AM Michael Paquier  wrote:

> On Tue, Mar 02, 2021 at 01:28:57AM +0100, Juan José Santamaría Flecha
> wrote:
> > For me it is a +1 for the change to absolute. Let's see if more people
> want
> > to weigh in on the matter.
>
> FWIW, I don't think that it is a good idea to come back to this
> decision for *nix platforms, so I would let it as-is, and use relative
> paths if initdb is called using a relative path.
>
> The number of people using a relative path for Windows initialization
> sounds rather limited to me.  However, if you can write something that
> makes the path printed compatible for a WIN32 terminal while keeping
> the behavior consistent with other platforms, people would have no
> objections to that.
> --
> Michael
>


Re: A reloption for partitioned tables - parallel_workers

2021-03-02 Thread Laurenz Albe
On Tue, 2021-03-02 at 11:23 +0900, Amit Langote wrote:
> +ALTER TABLE pagg_tab_ml SET (parallel_workers = 0);
> +EXPLAIN (COSTS OFF)
> +SELECT a FROM pagg_tab_ml WHERE b = 42;
> +                   QUERY PLAN
> +----------------------------------------------
> + Append
> +   ->  Seq Scan on pagg_tab_ml_p1 pagg_tab_ml_1
> + Filter: (b = 42)
> +   ->  Seq Scan on pagg_tab_ml_p2_s1 pagg_tab_ml_2
> + Filter: (b = 42)
> +   ->  Seq Scan on pagg_tab_ml_p2_s2 pagg_tab_ml_3
> + Filter: (b = 42)
> +(7 rows)
> 
> I got the same result with my implementation, but I am wondering if
> setting parallel_workers=0 on the parent table shouldn't really
> disable a regular (non-parallel-aware) Append running under Gather
> even if it does Parallel Append (parallel-aware)?  So in this test
> case, there should have been a Gather atop Append, with individual
> partitions scanned using Parallel Seq Scan where applicable.

I am not sure, but I tend to think that if you specify no
parallel workers, you want no parallel workers.

But I noticed the following:

  SET enable_partitionwise_aggregate = on;

  EXPLAIN (COSTS OFF)
  SELECT count(*) FROM pagg_tab_ml;
                              QUERY PLAN
  ------------------------------------------------------------------
   Finalize Aggregate
 ->  Gather
   Workers Planned: 4
   ->  Parallel Append
 ->  Partial Aggregate
   ->  Parallel Seq Scan on pagg_tab_ml_p1 pagg_tab_ml
 ->  Partial Aggregate
   ->  Parallel Seq Scan on pagg_tab_ml_p3_s1 pagg_tab_ml_3
 ->  Partial Aggregate
   ->  Parallel Seq Scan on pagg_tab_ml_p2_s1 pagg_tab_ml_1
 ->  Partial Aggregate
   ->  Parallel Seq Scan on pagg_tab_ml_p3_s2 pagg_tab_ml_4
 ->  Partial Aggregate
   ->  Parallel Seq Scan on pagg_tab_ml_p2_s2 pagg_tab_ml_2
  (14 rows)

The default number of parallel workers is taken, because the append is
on an upper relation, not the partitioned table itself.

One would wish that "parallel_workers" somehow percolated up, but I
have no idea how that should work.
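For what it's worth, a session-level workaround that does suppress the
parallel plan here is the GUC rather than the reloption, e.g.:

  SET max_parallel_workers_per_gather = 0;
  EXPLAIN (COSTS OFF) SELECT count(*) FROM pagg_tab_ml;

though that of course disables parallelism for the whole session, not just
for this one partitioned table.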

Yours,
Laurenz Albe





psql crash while executing core regression tests

2021-03-02 Thread walker
hi


During installation from source code, there are many crashes of psql while
executing the core regression tests. All the crashes are similar; the
backtrace info is as follows:


Core was generated by 'psql'.
Program terminated with signal 11, Segmentation fault.
# 0 0x0043f140 in slash_yylex()
(gdb) bt
#0 0x0043f140 in slash_yylex()
#1 0x004430fc in psql_scan_slash_command()
#2 0x0043f140 in HandleSlashCmds()
#3 0x0043f140 in MainLoop()
#4 0x0043f140 in main()


I did more comparative testing of this scenario, as follows:
1. install from local source code (xxx.tar.gz)
1) switch to the source tree directory and build there -- no crash
generated
2) create a build directory and build there -- no crash generated


2. install from git source code
1) switch to the source tree directory and build there -- no crash generated
2) create a build directory and build there -- many crashes generated, but
if a newer version of flex is installed, e.g. 2.6.4, the problem doesn't
exist. Any suggestions about this behavior?



NOTES:
the test commands are the same, as follows:
configure --enable-coverage --enable-tap-tests
make
make check


testing environment:
PostgreSQL: 13.2
redhat 7.4, 3.10.0-693.e17.x86_64
flex: 2.5.37


thanks
walker

Re: Add --tablespace option to reindexdb

2021-03-02 Thread Daniel Gustafsson
> On 2 Mar 2021, at 07:04, Michael Paquier  wrote:

> I have also removed the assertions based on the version number of the
> backend, based on Daniel's input sent upthread.
> 
> What do you think?

+1, no objections from me after a readthrough of this version.

--
Daniel Gustafsson   https://vmware.com/





Re: Race condition in recovery?

2021-03-02 Thread Dilip Kumar
On Sat, Jan 23, 2021 at 10:06 AM Dilip Kumar  wrote:
>
> But I am afraid that whether this adjustment (setting based on
> receiveTLI) is done based on some analysis or part of some bug fix.  I
> will try to find the history of this and maybe based on that we can
> make a better decision.

I have done further analysis of this. This initialization of the
expectedTLEs from receiveTLI instead of recoveryTargetTLI was done in
the commit below.

=
ee994272ca50f70b53074f0febaec97e28f83c4e
Author: Heikki Linnakangas   2013-01-03 14:11:58
Committer: Heikki Linnakangas   2013-01-03 14:11:58

Delay reading timeline history file until it's fetched from master.

Streaming replication can fetch any missing timeline history files from the
master, but recovery would read the timeline history file for the target
timeline before reading the checkpoint record, and before walreceiver has
had a chance to fetch it from the master. Delay reading it, and the sanity
checks involving timeline history, until after reading the checkpoint
record.

There is at least one scenario where this makes a difference: if you take
a base backup from a standby server right after a timeline switch, the
WAL segment containing the initial checkpoint record will begin with an
older timeline ID. Without the timeline history file, recovering that file
will fail as the older timeline ID is not recognized to be an ancestor of
the target timeline. If you try to recover from such a backup, using only
streaming replication to fetch the WAL, this patch is required for that to
work.
=

I did not understand one point in this commit message: "Without the
timeline history file, recovering that file will fail as the older
timeline ID is not recognized to be an ancestor of the target
timeline."  I mean, in which case will this be true?

Now the problem is that we have initialized expectedTLEs based on
the older timeline history file instead of recoveryTargetTLI, which
breaks the sanity of expectedTLEs.  In the function below
(rescanLatestTimeLine), if we find that the newest TLI is the same as
recoveryTargetTLI, we assume we don't need to do anything, but the
problem is that expectedTLEs is set to the wrong target and is never
updated unless the timeline changes again.  So I think we first need to
identify what the above commit is trying to achieve, and then see
whether we can do it in a better way without breaking the sanity of
expectedTLEs.

rescanLatestTimeLine(void)
{

newtarget = findNewestTimeLine(recoveryTargetTLI);
if (newtarget == recoveryTargetTLI)
{
/* No new timelines found */
return false;
}
...
newExpectedTLEs = readTimeLineHistory(newtarget);
...
expectedTLEs = newExpectedTLEs;
}

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Parallel Full Hash Join

2021-03-02 Thread Thomas Munro
On Fri, Feb 12, 2021 at 11:02 AM Melanie Plageman
 wrote:
> I just attached the diff.

Squashed into one patch for the cfbot to chew on, with a few minor
adjustments to a few comments.
From 87c74af25940b0fc85186b0defe6e21ea2324c28 Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Wed, 4 Nov 2020 14:25:33 -0800
Subject: [PATCH v5] Parallel Hash {Full,Right} Outer Join.

Previously, parallel full and right outer joins were not supported due
to a potential deadlock hazard posed by allowing workers to wait on a
barrier after barrier participants have started emitting tuples. More
details on the deadlock hazard can be found in the referenced
discussion.

For now, sidestep the problem by terminating parallelism for the
unmatched inner tuple scan. The last process to arrive at the barrier
prepares for the unmatched inner tuple scan in HJ_NEED_NEW_OUTER and
transitions to HJ_FILL_INNER, scanning the hash table and emitting
unmatched inner tuples.

To align parallel and serial hash join, change
ExecScanHashTableForUnmatched() to also scan HashMemoryChunks for the
unmatched tuple scan instead of accessing tuples through the hash table
buckets.

Author: Melanie Plageman 
Author: Thomas Munro 
Discussion: https://postgr.es/m/CA%2BhUKG%2BA6ftXPz4oe92%2Bx8Er%2BxpGZqto70-Q_ERwRaSyA%3DafNg%40mail.gmail.com
---
 src/backend/executor/nodeHash.c | 205 ++--
 src/backend/executor/nodeHashjoin.c |  58 +++
 src/backend/optimizer/path/joinpath.c   |  14 +-
 src/include/executor/hashjoin.h |  15 +-
 src/include/executor/nodeHash.h |   3 +
 src/test/regress/expected/join_hash.out |  56 ++-
 src/test/regress/sql/join_hash.sql  |  23 ++-
 7 files changed, 283 insertions(+), 91 deletions(-)

diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index c5f2d1d22b..6305688efd 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -510,6 +510,7 @@ ExecHashTableCreate(HashState *state, List *hashOperators, List *hashCollations,
 		hashtable->spaceAllowed * SKEW_HASH_MEM_PERCENT / 100;
 	hashtable->chunks = NULL;
 	hashtable->current_chunk = NULL;
+	hashtable->current_chunk_idx = 0;
 	hashtable->parallel_state = state->parallel_state;
 	hashtable->area = state->ps.state->es_query_dsa;
 	hashtable->batches = NULL;
@@ -2046,16 +2047,72 @@ void
 ExecPrepHashTableForUnmatched(HashJoinState *hjstate)
 {
 	/*--
-	 * During this scan we use the HashJoinState fields as follows:
+	 * During this scan we use the HashJoinTable fields as follows:
 	 *
-	 * hj_CurBucketNo: next regular bucket to scan
-	 * hj_CurSkewBucketNo: next skew bucket (an index into skewBucketNums)
-	 * hj_CurTuple: last tuple returned, or NULL to start next bucket
+	 * current_chunk: current HashMemoryChunk to scan
+	 * current_chunk_idx: index in current HashMemoryChunk
 	 *--
 	 */
+	HashJoinTable hashtable = hjstate->hj_HashTable;
+
 	hjstate->hj_CurBucketNo = 0;
 	hjstate->hj_CurSkewBucketNo = 0;
 	hjstate->hj_CurTuple = NULL;
+	hashtable->current_chunk = hashtable->chunks;
+	hashtable->current_chunk_idx = 0;
+}
+
+/*
+ * ExecParallelPrepHashTableForUnmatched
+ *		set up for a series of ExecParallelScanHashTableForUnmatched calls
+ *		return true if this worker is elected to do the unmatched inner scan
+ */
+bool
+ExecParallelPrepHashTableForUnmatched(HashJoinState *hjstate)
+{
+	/*--
+	 * During this scan we use the ParallelHashJoinBatchAccessor fields for the
+	 * current batch as follows:
+	 *
+	 * current_chunk: current HashMemoryChunk to scan
+	 * current_chunk_idx: index in current HashMemoryChunk
+	 *--
+	 */
+	HashJoinTable hashtable = hjstate->hj_HashTable;
+	int			curbatch = hashtable->curbatch;
+	ParallelHashJoinBatchAccessor *batch_accessor = &hashtable->batches[curbatch];
+	ParallelHashJoinBatch *batch = hashtable->batches[curbatch].shared;
+	bool		last = false;
+
+	hjstate->hj_CurBucketNo = 0;
+	hjstate->hj_CurSkewBucketNo = 0;
+	hjstate->hj_CurTuple = NULL;
+	if (curbatch < 0)
+		return false;
+	last = BarrierArriveAndDetachExceptLast(&batch->batch_barrier);
+	if (!last)
+	{
+		hashtable->batches[hashtable->curbatch].done = true;
+		/* Make sure any temporary files are closed. */
+		sts_end_parallel_scan(hashtable->batches[curbatch].inner_tuples);
+		sts_end_parallel_scan(hashtable->batches[curbatch].outer_tuples);
+
+		/*
+		 * Track the largest batch we've been attached to.  Though each
+		 * backend might see a different subset of batches, explain.c will
+		 * scan the results from all backends to find the largest value.
+		 */
+		hashtable->spacePeak =
+			Max(hashtable->spacePeak, batch->size + sizeof(dsa_pointer_atomic) * hashtable->nbuckets);
+		hashtable->curbatch = -1;
+	}
+	else
+	{
+		batch_accessor->shared_chunk = batch->chunks;
+		batch_accessor->current_chunk = dsa_get_address(hashtable->area, batch_accessor->shared_chunk);
+		batch_accessor->current_chunk_idx = 0;
+	}
+	return last;
 }
 
 /*
@@ 

Re: [PATCH] postgres-fdw: column option to override foreign types

2021-03-02 Thread Ashutosh Bapat
On Mon, Mar 1, 2021 at 12:59 PM Dian M Fay  wrote:
>
> Full use of a custom data type with postgres_fdw currently requires the
> type be maintained in both the local and remote databases. `CREATE
> FOREIGN TABLE` does not check declared types against the remote table,
> but declaring e.g. a remote enum to be local text works only partway, as
> seen here. A simple select query against alpha_items returns the enum
> values as text; however, *filtering* on the column yields an error.
>
> create database alpha;
> create database beta;
>
> \c alpha
>
> create type itemtype as enum ('one', 'two', 'three');
> create table items (
>   id serial not null primary key,
>   type itemtype not null
> );
> insert into items (type) values ('one'), ('one'), ('two');
>
> \c beta
>
> create extension postgres_fdw;
> create server alpha foreign data wrapper postgres_fdw options (dbname 
> 'alpha', host 'localhost', port '5432');
> create user mapping for postgres server alpha options (user 'postgres');
>
> create foreign table alpha_items (
>   id int,
>   type text
> ) server alpha options (table_name 'items');

postgres_fdw assumes that the locally declared type is semantically the
same as the remote type. Ideally the enum should also be declared locally
and used to declare the column's datatype. See how to handle UDTs in
postgres_fdw at
https://stackoverflow.com/questions/37734170/can-the-foreign-data-wrapper-fdw-postgres-handle-the-geometry-data-type-of-postg
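For illustration, a minimal sketch of that approach with the tables from the
quoted example (assuming the enum labels are kept identical on both sides,
and that the text-typed foreign table is dropped or renamed first):

  \c beta
  create type itemtype as enum ('one', 'two', 'three');  -- mirror of remote type
  create foreign table alpha_items (
    id int,
    type itemtype
  ) server alpha options (table_name 'items');
  select * from alpha_items where type = 'one';  -- filter now type-checks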

--
Best Wishes,
Ashutosh Bapat




Re: We should stop telling users to "vacuum that database in single-user mode"

2021-03-02 Thread Magnus Hagander
On Tue, Mar 2, 2021 at 7:52 AM David Rowley  wrote:
>
> On Tue, 2 Mar 2021 at 04:32, Hannu Krosing  wrote:
> >
> > It looks like we are unnecessarily instructing our users to vacuum their
> > databases in single-user mode when just vacuuming would be enough.
> >
> > We should fix the error message to be less misleading.
>
> It would be good to change the message as it's pretty outdated. Back
> in 8ad3965a1 (2005) when the message was added, SELECT and VACUUM
> would have called GetNewTransactionId().  That's no longer the case.
> We only do that when we actually need an XID.
>
> However, I wonder if it's worth going a few steps further to try and
> reduce the chances of that message being seen in the first place.
> Maybe it's worth considering ditching any (auto)vacuum cost limits for
> any table which is within X transaction from wrapping around.
> Likewise for "VACUUM;" when the database's datfrozenxid is getting
> dangerously high.
>
> Such "emergency" vacuums could be noted in the auto-vacuum log and
> NOTICEd or WARNING sent to the user during manual VACUUMs. Maybe the
> value of X could be xidStopLimit minus a hundred million or so.
>
> I have seen it happen that an instance has a vacuum_cost_limit set and
> someone did start the database in single-user mode, per the advice of
> the error message only to find that the VACUUM took a very long time
> due to the restrictive cost limit. I struggle to imagine why anyone
> wouldn't want the vacuum to run as quickly as possible in that
> situation.

Multiple instances running on the same hardware and only one of them
being in trouble?

But it would probably be worthwhile throwing a WARNING if vacuum is
run with cost delay enabled in single user mode -- so that the user is
at least aware of the choice (and can cancel and try again). Maybe
even a warning directly when starting up a single user session, to let
them know?
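And for anyone who does end up vacuuming in single-user mode in the
meantime, a sketch of the sequence I'd expect to be sensible
(vacuum_cost_delay is the setting at issue; 0 disables cost-based
throttling):

  SET vacuum_cost_delay = 0;
  VACUUM (FREEZE, VERBOSE);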


-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-02 Thread Amit Kapila
On Mon, Mar 1, 2021 at 9:08 AM Greg Nancarrow  wrote:
>
> Posting an updated set of patches that includes Amit Langote's patch
> to the partition tracking scheme...
>

Few comments:
==
1.
"Prior to entering parallel-mode for execution of INSERT with parallel SELECT,
a TransactionId is acquired and assigned to the current transaction state which
is then serialized in the parallel DSM for the parallel workers to use."

This point in the commit message seems a bit misleading to me because
IIUC we need to use the transaction id only in the master backend for
the purpose of this patch. And, we are doing this at an early stage
because we don't allow allocating it in parallel mode. I think here we
should mention in some way that this has the disadvantage that if the
underlying SELECT doesn't return any rows, this transaction id
allocation will not be of use. However, that shouldn't happen in
practice in many cases. But, if I am missing something and this is
required by the parallel workers, then we can ignore what I said.

2.
/*
+ * Prepare for entering parallel mode by assigning a
+ * FullTransactionId, to be included in the transaction state that is
+ * serialized in the parallel DSM.
+ */
+ (void) GetCurrentTransactionId();
+ }

Similar to the previous comment this also seems to indicate that we
require TransactionId for workers. If that is not the case then this
comment also needs to be modified.

3.
 static int common_prefix_cmp(const void *a, const void *b);

-
 /*

Spurious line removal.

4.
 * Assess whether it's feasible to use parallel mode for this query. We
  * can't do this in a standalone backend, or if the command will try to
- * modify any data, or if this is a cursor operation, or if GUCs are set
- * to values that don't permit parallelism, or if parallel-unsafe
- * functions are present in the query tree.
+ * modify any data using a CTE, or if this is a cursor operation, or if
+ * GUCs are set to values that don't permit parallelism, or if
+ * parallel-unsafe functions are present in the query tree.

This comment change is not required because this is quite similar to
what we do for CTAS. Your further comment changes in this context are
sufficient.

5.
+ (IsModifySupportedInParallelMode(parse->commandType) &&
+ is_parallel_possible_for_modify(parse))) &&

I think it would be better if we move the check
IsModifySupportedInParallelMode inside
is_parallel_possible_for_modify. Also, it might be better to name this
function as is_parallel_allowed_for_modify.

6.
@@ -260,6 +265,21 @@ set_plan_references(PlannerInfo *root, Plan *plan)
  */
  add_rtes_to_flat_rtable(root, false);

+ /*
+ * If modifying a partitioned table, add its parallel-safety-checked
+ * partitions too to glob->relationOids, to register them as plan
+ * dependencies. This is only really needed in the case of a parallel
+ * plan, so that if parallel-unsafe properties are subsequently defined
+ * on the partitions, the cached parallel plan will be invalidated and
+ * a non-parallel plan will be generated.
+ */
+ if (IsModifySupportedInParallelMode(root->parse->commandType))
+ {
+ if (glob->partitionOids != NIL && glob->parallelModeNeeded)
+ glob->relationOids =
+ list_concat(glob->relationOids, glob->partitionOids);
+ }
+

Isn't it possible to add these partitionOids in set_plan_refs with the
T_Gather(Merge) node handling? That looks like a more natural place to
me. If that is feasible, then we don't need the parallelModeNeeded
check, and maybe we don't even need to check
IsModifySupportedInParallelMode, but I am less sure of the second
check's requirement.

7.
+#include "access/table.h"
+#include "access/xact.h"
 #include "access/transam.h"
+#include "catalog/pg_class.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -24,6 +27,8 @@
 #include "optimizer/planmain.h"
 #include "optimizer/planner.h"
 #include "optimizer/tlist.h"
+#include "parser/parsetree.h"
+#include "partitioning/partdesc.h"

I think apart from xact.h, we don't need new additional includes.

8. I see that in function target_rel_max_parallel_hazard, we don't
release the lock on the target table after checking parallel-safety
but then in function target_rel_max_parallel_hazard_recurse, we do
release the lock on partition tables after checking their
parallel-safety. Can we add some comments to explain these two cases?

9. I noticed that the tests added for the first patch in
v18-0002-Parallel-SELECT-for-INSERT-INTO-.-SELECT-tests-and-doc take
even more time than select_parallel. I think we should see if we can
reduce the timing of this test. I haven't studied it in detail, but
maybe some INSERT executions can be avoided. In some cases like the one
below, just checking the plan should be sufficient. I think you can try
to investigate and see how much we can reduce it without affecting the
code coverage of the newly added code.

+--
+-- Parallel unsafe column default, sh
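For example, something along these lines (hypothetical table names, just to
sketch the idea) exercises the planner path without executing the INSERT:

  EXPLAIN (COSTS OFF)
  INSERT INTO parallel_ins_tbl SELECT * FROM source_tbl;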

Re: [Patch] ALTER SYSTEM READ ONLY

2021-03-02 Thread Dilip Kumar
On Fri, Feb 19, 2021 at 5:43 PM Amul Sul  wrote:
>
> In the attached version I have made the changes accordingly what Robert has
> summarised in his previous mail[1].
>
> In addition to that, I also move the code that updates the control file to
> XLogAcceptWrites() which will also get skipped when the system is read-only 
> (wal
> prohibited).  The system will be in the crash recovery, and that will
> change once we do the end-of-recovery checkpoint and the WAL writes operation
> which we were skipping from startup.  The benefit of keeping the system in
> recovery mode is that it fixes my concern[2] where other backends could 
> connect
> and write wal records while we were changing the system to read-write. Now, no
> other backends allow a wal write; UpdateFullPageWrites(), end-of-recovery
> checkpoint, and XLogReportParameters() operations will be performed in the 
> same
> sequence as it is in the startup while changing the system to read-write.

I was looking into the changes, especially the recovery-related
problem. I have a few questions:

1.
+static bool
+XLogAcceptWrites(bool needChkpt, bool bgwriterLaunched,
+ bool localPromoteIsTriggered, XLogReaderState *xlogreader,
+ bool archiveRecoveryRequested, TimeLineID endOfLogTLI,
+ XLogRecPtr endOfLog, TimeLineID thisTimeLineID)
+{
+boolpromoted = false;
+
+/*
.
+if (localPromoteIsTriggered)
 {
-checkPointLoc = ControlFile->checkPoint;
+XLogRecord *record;

...
+record = ReadCheckpointRecord(xlogreader,
+  ControlFile->checkPoint,
+  1, false);
 if (record != NULL)
 {
 promoted = true;
...
CreateEndOfRecoveryRecord();
}

Why do we need to move the promote-related code into XLogAcceptWrites()?
IMHO, this promote-related handling should be in StartupXLOG only.
That will look cleaner.

>
> 1] 
> http://postgr.es/m/CA+TgmoZ=cctbaxxmtyzogxegqzoz9smkbwrdpsacpjvfcgc...@mail.gmail.com
> 2] 
> http://postgr.es/m/caaj_b97xx-nqrym_uxzecph9asgomrordnhrg1n51fdcdwo...@mail.gmail.com

2.
I did not clearly understand your concern in point [2], because of
which you have to postpone RECOVERY_STATE_DONE until the system is set
back to read-write.  Can you explain this?


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-02 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 06:31, Tom Lane wrote:
>"Joel Jacobson"  writes:
>> Unless fixed, then the way I see it, I don't think we can use int4range[] 
>> for regexp_positions(),
>
>Yeah.  It's a cute idea, but the semantics aren't quite right.

I think there is a case for allowing the creation of empty ranges *with*
bounds information, e.g. '[6,6)'::int4range,
as well as the current only possibility of creating empty ranges *without*
bounds information, e.g. 'empty'::int4range.

I've had a look at how ranges are implemented,
and I think I've found a way to support this in a simple, non-invasive way.

I've outlined the idea in a patch, which I will send separately,
as it's a different feature, possibly useful for purposes other than 
regexp_positions().

/Joel

Re: buildfarm windows checks / tap tests on windows

2021-03-02 Thread Andrew Dunstan


On 3/1/21 3:07 PM, Andres Freund wrote:
> Hi,
>
> As part of trying to make the aio branch tests on cirrus CI pass with
> some tap tests I noticed that "src/tools/msvc/vcregress.pl recoverycheck"
> hangs. A long phase of remote debugging later I figured out that that's
> not a fault of the aio branch - it also doesn't pass on master (fixed in
> [1]).
>
> Which confused me - shouldn't the buildfarm have noticed? But it seems
> that all msvc animals either don't have tap tests enabled or they
> disable 'misc-checks' which in turn is what the buildfarm client uses to
> trigger all the 'recovery' checks.
>
> It seems we're not just skipping recovery, but also e.g. tap tests in
> contrib, all the tests in src/test/modules/...
>
> Andrew, what's the reason for that? Is it just that they hung at some
> point? Were too slow?



I don't think speed is the issue. I probably disabled misc-tests on
drongo and bowerbird (my two animals in question) because I got either
instability or errors I was unable to diagnose. I'll go back and take
another look to narrow this down. It's possible to disable individual tests.



>
>
> On that last point: Running the tap tests on windows appears to be
> *excruciatingly* slow. How does anybody develop on windows without a
> mechanism to actually run tests in parallel?


I think most people develop elsewhere and then adapt/test on Windows if
necessary.


>
> I think it'd be good if vcregress.pl at least respected PROVE_FLAGS from
> the environment - it can't currently be passed in for several of the
> vcregress.pl tests, and it does seem to make things to be at least
> somewhat less painful.



+1


>
>
> This makes it even clearer to me that we really need a builtin
> testrunner that runs tests efficiently *and* debuggable on windows.
>

"show me the code" :-)


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: 2019-03 CF now in progress

2021-03-02 Thread David Steele

Hi Justin,

On 3/1/21 6:19 PM, Justin Pryzby wrote:

Could I suggest to update the CF APP to allow:
| Target version: 15


I don't have permission to add target versions (or at least I can't find 
it in the admin interface) so hopefully Michael or Magnus can do it.


Regards,
--
-David
da...@pgmasters.net




Re: pg_upgrade version checking questions

2021-03-02 Thread Peter Eisentraut

On 23.02.21 17:14, Daniel Gustafsson wrote:

This exports validate_exec to reduce duplication, and implements a custom
find_other_exec-like function in pg_upgrade to check each binary for the
version number.  Keeping a local copy of validate_exec is easy to do if it's
deemed not worth it to export it.


This looks mostly okay to me.

The commit message says something about "to ensure the health of the 
target cluster", which doesn't make sense to me.  Maybe find a better 
wording.


The name find_exec() seems not very accurate.  It doesn't find
anything.  Maybe "check"?


I'm not sure why the new find_exec() adds EXE.  AFAIK, this is only 
required for stat(), and validate_exec() already does it.






[PATCH] Support empty ranges with bounds information

2021-03-02 Thread Joel Jacobson
Hi,

As discussed in the separate thread "[PATCH] regexp_positions ( string text, 
pattern text, flags text ) → setof int4range[]" [1]
it's currently not possible to create an empty range with bounds information.

This patch tries to improve the situation by keeping the bounds information,
and allow accessing it via lower() and upper().

No other semantics have been changed.
All tests pass without any changes.

All examples below give the same result before/after this patch:

SELECT int4range(6,6,'[)');
empty
SELECT isempty(int4range(6,6,'[)'));
TRUE
SELECT int4range(6,6,'[)') = int4range(7,7,'[)');
TRUE
SELECT 'empty'::int4range;
empty
SELECT lower('empty'::int4range);
NULL
SELECT upper('empty'::int4range);
NULL
SELECT isempty('empty'::int4range);
TRUE
SELECT 'empty'::int4range = 'empty'::int4range;
TRUE
SELECT int4range(6,6,'[)') = 'empty'::int4range;
TRUE

The only semantic change is that lower() and upper()
now return the lower and upper bounds
for empty ranges created with bounds:
 
SELECT lower(int4range(6,6,'[)'));
 6
SELECT upper(int4range(6,6,'[)'));
 6

Isaac Morland asked an interesting question in the other thread [1]:
>Doing this would introduce numerous questions which would have to be resolved.
>For example, where/when is the empty range resulting from an intersection 
>operation?

The result of intersection is with this patch unchanged,
the resulting empty range has no bounds information, e.g:

SELECT lower(int4range(10,10,'[)') * int4range(20,20,'[)'));
NULL

Patch explained below:

I've made use of the two previously not used null flags:

-#define RANGE_LB_NULL 0x20 /* lower bound is null (NOT USED) */
-#define RANGE_UB_NULL 0x40 /* upper bound is null (NOT USED) */
+#define RANGE_LB_NULL 0x20 /* lower bound is null */
+#define RANGE_UB_NULL 0x40 /* upper bound is null */

I've changed the RANGE_HAS_LBOUND and RANGE_HAS_UBOUND macros
to not look at RANGE_EMPTY:

-#define RANGE_HAS_LBOUND(flags) (!((flags) & (RANGE_EMPTY | \
-   RANGE_LB_NULL | \
-   RANGE_LB_INF)))
+#define RANGE_HAS_LBOUND(flags) (!((flags) & (RANGE_LB_NULL | RANGE_LB_INF)))

The definition for RANGE_HAS_UBOUND has been changed in the same way.

These NULL flags are now set to explicitly indicate that there is no bounds
information, when parsing a text string containing the "empty" literal in
range_parse(), or when the caller of make_range() passes empty=true:

- flags |= RANGE_EMPTY;
+ flags |= RANGE_EMPTY | RANGE_LB_NULL | RANGE_UB_NULL;

In the range_lower() and range_upper() functions,
the RANGE_HAS_...BOUND() macros are used,
instead of the old hard-coded expression, e.g.:

- if (empty || lower.infinite)
+ if (!RANGE_HAS_LBOUND(flags))

Finally, in range_recv() we must not mask out the NULL flags,
since they are now used:

flags &= (RANGE_EMPTY |
  RANGE_LB_INC |
  RANGE_LB_INF |
+ RANGE_LB_NULL |
  RANGE_UB_INC |
- RANGE_UB_INF);
+ RANGE_UB_INF |
+ RANGE_UB_NULL);

That's all of it.

I think this little change would make range types more intuitively useful
in practice.

/Joel

[1] 
https://www.postgresql.org/message-id/5eae8911-241a-4432-accc-80e6ffecedfa%40www.fastmail.com


empty-ranges-with-bounds-information.patch
Description: Binary data


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-02 Thread Isaac Morland
On Tue, 2 Mar 2021 at 00:52, Joel Jacobson  wrote:

> Ranges are treated as sets. As such equality is defined by membership.
>
> That being said, I agree that there may be situations in which it would be
> convenient to have empty ranges at specific locations. Doing this would
> introduce numerous questions which would have to be resolved. For example,
> where/when is the empty range resulting from an intersection operation?
>
>
> Hmm, I think I would assume the intersection of two non-overlapping ranges
> to be isempty()=TRUE,
> and for lower() and upper() to continue to return NULL.
>
> But I think a zero-length range created with actual bounds should
> return the lower() and upper() values during creation, instead of NULL.
>
> I tried to find some other programming environments with range types.
>
> The first one I found was Ada.
>

Interesting!

Array indices are a bit different than general ranges however.

One question I would have is whether empty ranges are all equal to each
other. If they are, you have an equality that isn’t really equality; if
they aren’t then you would have ranges that are unequal even though they
have exactly the same membership. Although I suppose this is already true
for some types where ends can be specified as open or closed but end up
with the same end element; many range types canonicalize to avoid this but
I don’t think they all do.
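For instance, the built-in discrete range types already canonicalize, so
differently-spelled bounds compare equal:

  SELECT int4range(1, 2, '[]') = int4range(1, 3, '[)');  -- true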

Returning to the RE result issue, I wonder how much it actually matters
where any empty matches are. Certainly the actual contents of the match
don’t matter; you don’t need to be able to index into the string to extract
the substring. The only scenario I can see where it could matter is if the
RE is using lookahead or look back to find occurrences before or after
something else. If we stipulate that the result array will be in order,
then you still don’t have the exact location of empty matches but you do at
least have where they are relative to non-empty matches.


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-02 Thread Joel Jacobson
Hi Isaac,

Many thanks for the comments.

On Tue, Mar 2, 2021, at 14:34, Isaac Morland wrote:
> One question I would have is whether empty ranges are all equal to each 
> other. If they are, you have an equality that isn’t really equality; if they 
> aren’t then you would have ranges that are unequal even though they have 
> exactly the same membership. Although I suppose this is already true for some 
> types where ends can be specified as open or closed but end up with the same 
> end element; many range types canonicalize to avoid this but I don’t think 
> they all do.

I thought about this problem too. I don't think there is a perfect solution.
Leaving things as they are is problematic too since it makes the range type 
useless for some use-cases.
I've sent a patch in a separate thread with the least invasive idea I could 
come up with.
 
> Returning to the RE result issue, I wonder how much it actually matters where 
> any empty matches are. Certainly the actual contents of the match don’t 
> matter; you don’t need to be able to index into the string to extract the 
> substring. The only scenario I can see where it could matter is if the RE is 
> using lookahead or look back to find occurrences before or after something 
> else.

Hmm, I think it would be ugly to have corner cases handled differently from
the rest.

> If we stipulate that the result array will be in order, then you still don’t 
> have the exact location of empty matches but you do at least have where they 
> are relative to non-empty matches.

This part I didn't fully understand. Can you please provide some example on 
this?

/Joel


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-02 Thread Isaac Morland
On Tue, 2 Mar 2021 at 08:58, Joel Jacobson  wrote:

> If we stipulate that the result array will be in order, then you still
> don’t have the exact location of empty matches but you do at least have
> where they are relative to non-empty matches.
>
>
> This part I didn't fully understand. Can you please provide some example
> on this?
>

Suppose the match results are:

[4,8)
[10,10)
[13,16)
[20,20)
[24,24)

Then this gets turned into:

[4,8)
empty
[13,16)
empty
empty

So you know that there are non-empty matches from 4-8 and 13-16, plus an
empty match between them and two empty matches at the end. Given that all
empty strings are identical, I think it's only in pretty rare circumstances
where you need to know exactly where the empty matches are; it would have
to be a matter of identifying empty matches immediately before or after a
specific pattern; in which case I suspect it would usually be just as easy
to match the pattern itself directly.

Does this help?


Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2021-03-02 Thread Pavel Borisov
>
> I completely agree that checking uniqueness requires looking at the heap,
> but I don't agree that every caller of bt_index_check on an index wants
> that particular check to be performed.  There are multiple ways in which an
> index might be corrupt, and Peter wrote the code to only check some of them
> by default, with options to expand the checks to other things.  This is why
> heapallindexed is optional.  If you don't want to pay the price of checking
> all entries in the heap against the btree, you don't have to.
>

I've got the idea and revised the patch accordingly. Thanks!
PFA v4 of the patch. I've added an optional argument to allow uniqueness
checks for unique indexes.
Also, I added a test variant to make the tests work on 32-bit systems.
Unfortunately, converting the regression test to TAP would be a pain for
me. I hope it can be used now as a 2-variant regression test for 32- and
64-bit systems.
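With the patch applied, usage would presumably look something like this
(index name hypothetical, and assuming the new optional argument is spelled
checkunique as in the patch):

  SELECT bt_index_check('tbl_pkey'::regclass, checkunique => true);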

Thank you for your consideration!

-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com 


v4-0001-Make-amcheck-checking-UNIQUE-constraint-for-btree.patch
Description: Binary data


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-02 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 15:05, Isaac Morland wrote:
> Suppose the match results are:
> 
> [4,8)
> [10,10)
> [13,16)
> [20,20)
> [24,24)
> 
> Then this gets turned into:
> 
> [4,8)
> empty
> [13,16)
> empty
> empty
> 
> So you know that there are non-empty matches from 4-8 and 13-16, plus an 
> empty match between them and two empty matches at the end. Given that all 
> empty strings are identical, I think it's only in pretty rare circumstances 
> where you need to know exactly where the empty matches are; it would have to 
> be a matter of identifying empty matches immediately before or after a 
> specific pattern; in which case I suspect it would usually be just as easy to 
> match the pattern itself directly.
> 
> Does this help?

Thanks, I see what you mean now.

I agree it's probably a corner-case,
but I think I would still prefer a complete solution by just returning setof 
two integer[] values,
instead of the cuter-but-only-partial solution of using the existing 
int4range[].

Even better would be if we could fix the range type so it could actually be 
used in this and other similar situations.

If so, then we could do:

SELECT r FROM regexp_positions('caaabaaabeee','(?<=b)a+','g') AS r;
     r
-----------
 {"[6,9)"}
(1 row)

SELECT r FROM regexp_positions('caaabaaabeee','(?<=b)','g') AS r;
    r
---------
 {empty}
 {empty}
(2 rows)

SELECT lower(r[1]), upper(r[1]) FROM
regexp_positions('caaabaaabeee','(?<=b)','g') AS r;
 lower | upper
-------+-------
     5 |     5
     9 |     9
(2 rows)

/Joel

Re: [Patch] ALTER SYSTEM READ ONLY

2021-03-02 Thread Amul Sul
On Tue, Mar 2, 2021 at 5:52 PM Dilip Kumar  wrote:
>
> On Fri, Feb 19, 2021 at 5:43 PM Amul Sul  wrote:
> >
> > In the attached version I have made the changes accordingly what Robert has
> > summarised in his previous mail[1].
> >
> > In addition to that, I also move the code that updates the control file to
> > XLogAcceptWrites() which will also get skipped when the system is read-only 
> > (wal
> > prohibited).  The system will be in the crash recovery, and that will
> > change once we do the end-of-recovery checkpoint and the WAL writes 
> > operation
> > which we were skipping from startup.  The benefit of keeping the system in
> > recovery mode is that it fixes my concern[2] where other backends could 
> > connect
> > and write wal records while we were changing the system to read-write. Now, 
> > no
> > other backends allow a wal write; UpdateFullPageWrites(), end-of-recovery
> > checkpoint, and XLogReportParameters() operations will be performed in the 
> > same
> > sequence as it is in the startup while changing the system to read-write.
>
> I was looking into the changes, especially the recovery-related
> problem. I have a few questions:
>
> 1.
> +static bool
> +XLogAcceptWrites(bool needChkpt, bool bgwriterLaunched,
> + bool localPromoteIsTriggered, XLogReaderState *xlogreader,
> + bool archiveRecoveryRequested, TimeLineID endOfLogTLI,
> + XLogRecPtr endOfLog, TimeLineID thisTimeLineID)
> +{
> +boolpromoted = false;
> +
> +/*
> .
> +if (localPromoteIsTriggered)
>  {
> -checkPointLoc = ControlFile->checkPoint;
> +XLogRecord *record;
>
> ...
> +record = ReadCheckpointRecord(xlogreader,
> +  ControlFile->checkPoint,
> +  1, false);
>  if (record != NULL)
>  {
>  promoted = true;
> ...
> CreateEndOfRecoveryRecord();
> }
>
> Why do we need to move the promote-related code into XLogAcceptWrites()?
> IMHO, this promote-related handling should be in StartupXLOG only.

XLogAcceptWrites() tries to club together all the WAL write operations that
happen at the end of StartupXLOG(). The only exception is the promotion
checkpoint.

> That will look cleaner.

I think it would be better to move the promotion checkpoint call inside
XLogAcceptWrites() as well, so that we can say XLogAcceptWrites(), as a
part of StartupXLOG(), does the required WAL writes. Thoughts?

>
> >
> > 1] 
> > http://postgr.es/m/CA+TgmoZ=cctbaxxmtyzogxegqzoz9smkbwrdpsacpjvfcgc...@mail.gmail.com
> > 2] 
> > http://postgr.es/m/caaj_b97xx-nqrym_uxzecph9asgomrordnhrg1n51fdcdwo...@mail.gmail.com
>
> 2.
> I did not clearly understand your concern in point [2], because of
> which you have to postpone RECOVERY_STATE_DONE until the system is set
> back to read-write.  Can you explain this?
>

Sure, for that let me explain how this transition to read-write occurs.
When a backend executes a function (i.e. pg_prohibit_wal(false)) to make
the system read-write, that system state change will be conveyed by the
Checkpointer process to all existing backends using a global barrier, and
while the Checkpointer is in the process of conveying this barrier, any
existing backends that have already absorbed the barrier can write new
records.

We don't want that to happen in cases where the previous recovery-end
checkpoint was skipped in startup. We want the Checkpointer to first convey
the barrier to all backends, but the backends shouldn't write WAL until the
Checkpointer writes the recovery-end-checkpoint record.

To keep these backends from writing WAL, I think we should keep the server
in crash recovery mode until UpdateFullPageWrites(), the end-of-recovery
checkpoint, and XLogReportParameters() are performed.
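For context, the read-write transition described above is the one a backend
triggers with the function added by this patch set:

  SELECT pg_prohibit_wal(false);  -- allow WAL writes again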

Regards,
Amul




Re: [PATCH] Bug fix in initdb output

2021-03-02 Thread Alvaro Herrera
On 2021-Mar-02, Nitin Jadhav wrote:

> > FWIW, I don't think that it is a good idea to come back to this
> > decision for *nix platforms, so I would let it as-is, and use
> > relative paths if initdb is called using a relative path.
> 
> The command to be displayed either in absolute path or relative path
> depends on the way the user is calling initdb. I agree with the above
> comment to keep this behaviour as-is, that is if the initdb is called
> using relative path, then it displays relative path otherwise absolute
> path.

Yeah.

> > However, if you can write something that makes the path printed
> > compatible for a WIN32 terminal while keeping the behavior
> > consistent with other platforms, people would have no objections to
> > that.
> 
> I feel the patch attached above handles this scenario.

Agreed.  I'll push the original patch then.  Thanks everybody for the
discussion.

-- 
Álvaro Herrera       39°49'30"S 73°17'W
"Because frankly, if knowing how to manage oneself required passing an
exam... who's the tough guy who would have a license?"  (Mafalda)




Re: psql crash while executing core regression tests

2021-03-02 Thread Hamid Akhtar
I use CentOS 7 with flex 2.5.37 quite extensively and have never come
across a psql crash. This seems more like an environment-related issue on
your system.



On Tue, Mar 2, 2021 at 1:53 PM walker  wrote:

> hi
>
> During installation from source code, there are many crashes of psql
> while executing the core regression tests. All the crashes are similar;
> the backtrace info is as follows:
>
> Core was generated by 'psql'.
> Program terminated with signal 11, Segmentation fault.
> # 0 0x0043f140 in slash_yylex()
> (gdb) bt
> #0 0x0043f140 in slash_yylex()
> #1 0x004430fc in psql_scan_slash_command()
> #2 0x0043f140 in HandleSlashCmds()
> #3 0x0043f140 in MainLoop()
> #4 0x0043f140 in main()
>
> I did more comparative testing of this scenario, as follows:
> 1. install from local source code (xxx.tar.gz)
> 1) switch to the source tree directory and build there -- no crash
> generated
> 2) create a build directory and build there -- no crash generated
>
> 2. install from git source code
> 1) switch to the source tree directory and build there -- no crash generated
> 2) create a build directory and build there -- many crashes generated,
> but if a newer version of flex is installed, e.g. 2.6.4, the problem
> doesn't exist. Any suggestions about this behavior?
>
> NOTES:
> the test commands are the same, as follows:
> configure --enable-coverage --enable-tap-tests
> make
> make check
>
> testing environment:
> PostgreSQL: 13.2
> redhat 7.4, 3.10.0-693.e17.x86_64
> flex: 2.5.37
>
> thanks
> walker
>


Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Tom Lane
"Joel Jacobson"  writes:
> As discussed in the separate thread "[PATCH] regexp_positions ( string text, 
> pattern text, flags text ) → setof int4range[]" [1]
> it's currently not possible to create an empty range with bounds information.

> This patch tries to improve the situation by keeping the bounds information,
> and allow accessing it via lower() and upper().

I think this is an actively bad idea.  We had a clean set-theoretic
definition of ranges as sets of points, and with this we would not.
We should not be whacking around the fundamental semantics of a
whole class of data types on the basis that it'd be cute to make
regexp_position return its result as int4range rather than int[].

If we did go forward with this, what would the implications be for
multiranges?

regards, tom lane




Re: Add --tablespace option to reindexdb

2021-03-02 Thread Mark Dilger



> On Mar 1, 2021, at 10:04 PM, Michael Paquier  wrote:
> 
> On Mon, Mar 01, 2021 at 10:12:54AM -0800, Mark Dilger wrote:
>> Your check verifies that reindexing a system table on a new
>> tablespace fails, but does not verify what the failure was.  I
>> wonder if you might want to make it more robust, something like:
> 
> I was not sure if that was worth adding or not, but no objections to
> do so.  So updated the patch to do that.
> 
> On Mon, Mar 01, 2021 at 09:47:57AM -0800, Mark Dilger wrote:
>> I recognize that you are borrowing some of that from
>> src/bin/pgbench/t/001_pgbench_with_server.pl, but I don't find
>> anything intuitive about the name "ets" and would rather not see
>> that repeated.  There is nothing in TestLib::perl2host to explain
>> this name choice, nor in pgbench's test, nor yours.
> 
> Okay, I have made the variable names more explicit.
> 
>> but I don't see what tests of the --concurrently option have to do
>> with this patch.  I'm not saying there is anything wrong with this
>> change, but it seems out of place.  Am I missing something?
> 
> While hacking on this feature I have just bumped into this separate
> issue, where the same test just gets repeated twice.  I have just
> applied an independent patch to take care of this problem separately,
> and backpatched it down to 12 where this got introduced.
> 
> On Mon, Mar 01, 2021 at 09:26:03AM -0800, Mark Dilger wrote:
>> I think the language "to reindex on" could lead users to think that
>> indexes already existent in the given tablespace will be reindexed.
>> In other words, the option might be misconstrued as a way to specify
>> all the indexes you want reindexed.  Whatever you do here, beware
>> that you are using similar language in the --help, so consider
>> changing that, too.
> 
> OK.  I have switched the docs to "Specifies the tablespace where
> indexes are rebuilt" and --help to "tablespace where indexes are
> rebuilt".
> 
> I have also removed the assertions based on the version number of the
> backend, based on Daniel's input sent upthread.
> 
> What do you think?

Looks good.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-02 Thread Mark Dilger



> On Mar 2, 2021, at 5:34 AM, Isaac Morland  wrote:
> 
> Returning to the RE result issue, I wonder how much it actually matters where 
> any empty matches are. Certainly the actual contents of the match don’t 
> matter; you don’t need to be able to index into the string to extract the 
> substring. The only scenario I can see where it could matter is if the RE is 
> using lookahead or look back to find occurrences before or after something 
> else. If we stipulate that the result array will be in order, then you still 
> don’t have the exact location of empty matches but you do at least have where 
> they are relative to non-empty matches.

I agree the contents of the match don't matter, because they are always empty.  
But the position matters.  You could intend to split a string in multiple 
places using lookaheads and lookbehinds to determine the split points.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] postgres-fdw: column option to override foreign types

2021-03-02 Thread Dian M Fay
On Tue Mar 2, 2021 at 6:50 AM EST, Ashutosh Bapat wrote:
> On Mon, Mar 1, 2021 at 12:59 PM Dian M Fay  wrote:
> >
> > Full use of a custom data type with postgres_fdw currently requires the
> > type be maintained in both the local and remote databases. `CREATE
> > FOREIGN TABLE` does not check declared types against the remote table,
> > but declaring e.g. a remote enum to be local text works only partway, as
> > seen here. A simple select query against alpha_items returns the enum
> > values as text; however, *filtering* on the column yields an error.
>
> postgres_fdw assumes that the locally declared type is semantically the
> same as the remote type. Ideally the enum should also be declared locally
> and used to declare the column's datatype. See how to handle UDTs in
> postgres_fdw at
> https://stackoverflow.com/questions/37734170/can-the-foreign-data-wrapper-fdw-postgres-handle-the-geometry-data-type-of-postg

I'm aware, and the reason for this change is that I think it's annoying
to declare and maintain the type on the local server for the sole
purpose of accommodating a read-only foreign table that effectively
treats it like text anyway. The real scenario that prompted it is a
tickets table with status, priority, category, etc. enums. We don't have
plans to modify them any time soon, but if we do it's got to be
coordinated and deployed across two databases, all so we can use what
might as well be a text column in a single WHERE clause. Since foreign
tables can be defined over subsets of columns, reordered, and names
changed, a little opt-in flexibility with types too doesn't seem
misplaced.

Note that currently, postgres_fdw will strip casts on the WHERE column:
`where type::text = 'one'` becomes `where ((type = 'one'::text))` (the
value is cast separately). Making it respect those is another option,
but I thought including it in column configuration would be less
surprising to users who aren't aware of the difference between the local
and remote tables.
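To make the current behavior concrete, this is the shape of what EXPLAIN
VERBOSE shows for the earlier example (the remote SQL line paraphrases the
cast-stripping described above):

  EXPLAIN (VERBOSE, COSTS OFF)
  SELECT * FROM alpha_items WHERE type::text = 'one';
  -- Remote SQL: SELECT id, type FROM public.items WHERE ((type = 'one'::text))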




Re: [PATCH] psql : Improve code for help option

2021-03-02 Thread Fujii Masao




On 2021/03/02 11:57, miyake_kouta wrote:

Hi.

I found some redundant code in psql/help.c, so I propose a patch to fix it.
In the current help.c, the variable "user" is set to the value of $PGUSER (or 
get_user_name).
However, $PGUSER is referenced again in the code that follows.
We can replace this part with "user", I think.


Good catch!


fprintf(output, _("  -U, --username=USERNAME  database user name (default: 
\"%s\")\n"), env);


We can simplify the code more and remove "env = user"
by just using "user" instead of "env" in the above?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: psql crash while executing core regression tests

2021-03-02 Thread Tom Lane
Hamid Akhtar  writes:
> I use CentOS 7 with flex 2.5.37 quite extensively and have never come
> across a psql crash. This seems more like an environment-related issue on
> your system.

Yeah ... also, there are more than a dozen buildfarm animals using 2.5.37,
and none of them have shown any sign of distress.  We have also got
animals using just about every other flex version under the sun, and they
all work.  So I'm inclined to guess that the apparent dependence on flex
version is a mirage, and the real reason why it worked or didn't work is
elsewhere.  We don't have enough info here to identify the problem, but
I'd suggest a couple of things:

* make sure you're starting from a completely clean tree ("git clean -dfx"
is your friend)

* avoid changing PATH or other important environment variables between
configure and build

* watch out for conflicts between different PG installations on the same
machine

On the last point, it can be a really bad idea to have any preinstalled
PG programs or libraries on a machine where you're trying to do PG
development.  It's too easy to pick up a library out of /usr/lib or a
header out of /usr/include when you wanted to use the version in your
build tree.

Whether any of this explains your problem remains to be seen, but that's
the kind of issue I'd suggest looking for.

regards, tom lane




Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Mark Dilger



> On Mar 2, 2021, at 5:20 AM, Joel Jacobson  wrote:
> 
> it's currently not possible to create an empty range with bounds information.
> 
> This patch tries to improve the situation by keeping the bounds information,
> and allow accessing it via lower() and upper().
> 
> No other semantics have been changed.
> All tests pass without any changes.

I recall this issue of empty ranges not keeping any bounds information being 
discussed back when range types were developed, and the design choice was 
intentional.  Searching the archives for that discussion, I don't find 
anything, probably because I'm not searching for the right keywords.  Anybody 
have a link to that discussion?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [Patch] ALTER SYSTEM READ ONLY

2021-03-02 Thread Dilip Kumar
On Tue, Mar 2, 2021 at 7:54 PM Amul Sul  wrote:
> XLogAcceptWrites() tried to combine all the WAL write operations that happen
> at the end of StartupXLOG(). The only exception is the promotion checkpoint.

Okay, I was expecting that XLogAcceptWrites would contain all the WAL
write-related operations, to be executed either at the end of
StartupXLOG() if the system is not read-only, or after the system is
set back to read-write.  But the promotion-related code is completely
irrelevant when it is executed from PerformPendingStartupOperations.
So I am not entirely sure whether we want to keep that code in
XLogAcceptWrites.

> > That will look cleaner.
>
> I think it would be better to move the promotion checkpoint call inside
> XLogAcceptWrites() as well, so that we can say XLogAcceptWrites() is the part
> of StartupXLOG() that does the required WAL writes. Thoughts?

Okay, so if we want to keep all the WAL writes inside XLogAcceptWrites,
including the promotion-related stuff, then +1 for moving this inside
XLogAcceptWrites as well.

> > >
> > > 1] 
> > > http://postgr.es/m/CA+TgmoZ=cctbaxxmtyzogxegqzoz9smkbwrdpsacpjvfcgc...@mail.gmail.com
> > > 2] 
> > > http://postgr.es/m/caaj_b97xx-nqrym_uxzecph9asgomrordnhrg1n51fdcdwo...@mail.gmail.com
> >
> > 2.
> > I did not clearly understand your concern in point [2], because of
> > which you have to postpone RECOVERY_STATE_DONE until the system is set
> > back to read-write.  Can you explain this?
> >
>
> Sure, for that let me explain how this transition to read-write occurs.  When a
> backend executes a function (i.e. pg_prohibit_wal(false)) to make the system
> read-write, that system state change will be conveyed by the Checkpointer
> process to all existing backends using a global barrier, and while the
> Checkpointer is in the process of conveying this barrier, any existing
> backends that have already absorbed the barrier can write new records.
>
> We don't want that to happen in cases where the previous recovery-end
> checkpoint is skipped in startup. We want the Checkpointer to convey the
> barrier to all backends first, but the backends shouldn't write WAL until
> the Checkpointer writes the recovery-end-checkpoint record.
>
> To keep these backends from writing WAL, I think we should keep the server
> in crash recovery mode until UpdateFullPageWrites(), the
> end-of-recovery checkpoint, and XLogReportParameters() are performed.

Thanks for the explanation.  Now I understand the problem; however, I
am not sure whether keeping the system in recovery is the best way to
solve this.  As of now I don't have anything better to suggest, and I
couldn't immediately think of any problem with this solution, but I
will think about this again.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




GiST comment improvement

2021-03-02 Thread Bruce Momjian
In talking to Teodor this week, I have written the attached C comment
patch which improves our description of GiST's NSN and GistBuildLSN
values.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

From defd16776c8016179a87c3d32ed359570c62727d Mon Sep 17 00:00:00 2001
From: Bruce Momjian 
Date: Tue, 2 Mar 2021 11:38:14 -0500
Subject: [PATCH] gist squash commit

---
 src/backend/access/gist/README |  1 +
 src/include/access/gist.h  | 15 +++
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README
index 8ca0cf78cc..25cab0047b 100644
--- a/src/backend/access/gist/README
+++ b/src/backend/access/gist/README
@@ -10,6 +10,7 @@ GiST stands for Generalized Search Tree. It was introduced in the seminal paper
 Jeffrey F. Naughton, Avi Pfeffer:
 
 http://www.sai.msu.su/~megera/postgres/gist/papers/gist.ps
+https://dsf.berkeley.edu/papers/sigmod97-gist.pdf
 
 and implemented by J. Hellerstein and P. Aoki in an early version of
 PostgreSQL (more details are available from The GiST Indexing Project
diff --git a/src/include/access/gist.h b/src/include/access/gist.h
index aa5f1763dd..4b06575d98 100644
--- a/src/include/access/gist.h
+++ b/src/include/access/gist.h
@@ -51,13 +51,20 @@
 #define F_HAS_GARBAGE		(1 << 4)	/* some tuples on the page are dead,
 		 * but not deleted yet */
 
-/* NSN - node sequence number, a special-purpose LSN */
+/*
+ * NSN (node sequence number) is a special-purpose LSN which is stored on each
+ * index page in GISTPageOpaqueData and updated only during page splits.  By
+ * recording the parent's LSN in GISTSearchItem.parentlsn, it is possible to
+ * detect concurrent child page splits by checking if parentlsn < child's NSN,
+ * and handle them properly.  The child page's LSN is insufficient for this
+ * purpose since it is updated for every page change.
+ */
 typedef XLogRecPtr GistNSN;
 
 /*
- * A bogus LSN / NSN value used during index build. Must be smaller than any
- * real or fake unlogged LSN, so that after an index build finishes, all the
- * splits are considered completed.
+ * A fake LSN / NSN value used during index builds. Must be smaller than any
+ * real or fake (unlogged) LSN generated after the index build completes so
+ * that all splits are considered complete.
  */
 #define GistBuildLSN	((XLogRecPtr) 1)
 
-- 
2.20.1



Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Pantelis Theodosiou
On Tue, Mar 2, 2021 at 3:28 PM Mark Dilger 
wrote:

>
>
> > On Mar 2, 2021, at 5:20 AM, Joel Jacobson  wrote:
> >
> > it's currently not possible to create an empty range with bounds
> information.
> >
> > This patch tries to improve the situation by keeping the bounds
> information,
> > and allow accessing it via lower() and upper().
> >
> > No other semantics have been changed.
> > All tests pass without any changes.
>
> I recall this issue of empty ranges not keeping any bounds information
> being discussed back when range types were developed, and the design choice
> was intentional.  Searching the archives for that discussion, I don't find
> anything, probably because I'm not searching for the right keywords.
> Anybody have a link to that discussion?
>
> —
> Mark Dilger
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
 Mark, perhaps you were referring to this discussion?
https://www.postgresql.org/message-id/4d5534d002250003a...@gw.wicourts.gov


Re: GROUP BY DISTINCT

2021-03-02 Thread Vik Fearing
On 3/2/21 4:06 PM, Georgios Kokolatos wrote:
> As a minor gripe, I would note the addition of list_int_cmp.
> The block
> 
> +   /* Sort each groupset individually */
> +   foreach(cell, result)
> +   list_sort(lfirst(cell), list_int_cmp);
> 
> Can follow suit from the rest of the code, and define a static
> cmp_list_int_asc(), as
> indeed the same patch does for cmp_list_len_contents_asc.
> This is a point on which I will not hold too strong an opinion.

I did it this way because list_int_cmp is a general purpose function for
int lists that can be reused elsewhere in the future.  Whereas
cmp_list_len_contents_asc is very specific to this case so I kept it local.

I'm happy to change it around if that's what consensus wants.

> Overall :+1: from me.

Thanks for looking at it!

> I will be bumping to 'Ready for Committer' unless objections.

In that case, here is another patch that fixes a typo in the docs
mentioned privately to me by Erik.  The typo (and a gratuitous rebase)
is the only change to what you just reviewed.
-- 
Vik Fearing
From e8c0042267abc2dedea8ecca2c6bfde7118d Mon Sep 17 00:00:00 2001
From: Vik Fearing 
Date: Sun, 21 Feb 2021 10:26:57 +0100
Subject: [PATCH] implement GROUP BY DISTINCT

---
 doc/src/sgml/queries.sgml  |  41 
 doc/src/sgml/ref/select.sgml   |   9 +-
 src/backend/catalog/sql_features.txt   |   2 +-
 src/backend/nodes/copyfuncs.c  |   2 +
 src/backend/nodes/equalfuncs.c |   2 +
 src/backend/nodes/list.c   |  16 +++
 src/backend/nodes/outfuncs.c   |   2 +
 src/backend/nodes/readfuncs.c  |   1 +
 src/backend/optimizer/plan/planner.c   |   2 +-
 src/backend/parser/analyze.c   |   1 +
 src/backend/parser/gram.y  |  22 ++--
 src/backend/parser/parse_agg.c |  58 ++-
 src/include/nodes/parsenodes.h |   2 +
 src/include/nodes/pg_list.h|   1 +
 src/include/parser/parse_agg.h |   2 +-
 src/test/regress/expected/groupingsets.out | 111 +
 src/test/regress/sql/groupingsets.sql  |  26 +
 17 files changed, 284 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index bc0b3cc9fe..ac88df4391 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -1372,6 +1372,47 @@ GROUP BY GROUPING SETS (
 

 
+   
+When specifying multiple grouping items together, the final set of grouping
+sets might contain duplicates. For example:
+
+GROUP BY ROLLUP (a, b), ROLLUP (a, c)
+
+is equivalent to
+
+GROUP BY GROUPING SETS (
+(a, b, c),
+(a, b),
+(a, b),
+(a, c),
+(a),
+(a),
+(a, c),
+(a),
+()
+)
+
+If these duplicates are undesirable, they can be removed using the
+DISTINCT clause directly on the GROUP BY.
+Therefore:
+
+GROUP BY DISTINCT ROLLUP (a, b), ROLLUP (a, c)
+
+is equivalent to
+
+GROUP BY GROUPING SETS (
+(a, b, c),
+(a, b),
+(a, c),
+(a),
+()
+)
+
+This is not the same as using SELECT DISTINCT because the output
+rows may still contain duplicates.  If any of the ungrouped columns contains NULL,
+it will be indistinguishable from the NULL used when that same column is grouped.
+   
+
   

 The construct (a, b) is normally recognized in expressions as
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index ab91105599..9c5cf50ef0 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -37,7 +37,7 @@ SELECT [ ALL | DISTINCT [ ON ( expressionexpression [ [ AS ] output_name ] [, ...] ]
 [ FROM from_item [, ...] ]
 [ WHERE condition ]
-[ GROUP BY grouping_element [, ...] ]
+[ GROUP BY [ ALL | DISTINCT ] grouping_element [, ...] ]
 [ HAVING condition ]
 [ WINDOW window_name AS ( window_definition ) [, ...] ]
 [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ]
@@ -778,7 +778,7 @@ WHERE condition

 The optional GROUP BY clause has the general form
 
-GROUP BY grouping_element [, ...]
+GROUP BY [ ALL | DISTINCT ] grouping_element [, ...]
 

 
@@ -802,7 +802,10 @@ GROUP BY grouping_element [, ...]
 independent grouping sets.  The effect of this is
 equivalent to constructing a UNION ALL between
 subqueries with the individual grouping sets as their
-GROUP BY clauses.  For further details on the handling
+GROUP BY clauses.  The optional DISTINCT
+clause removes duplicate sets before processing; it does not
+transform the UNION ALL into a UNION DISTINCT.
+For further details on the handling
 of grouping sets see .

 
diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt
index ab0895ce3c..65b5bfdd69 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -482,7 +4
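
For reference, a minimal runnable illustration of the feature (table and
columns invented):

    CREATE TABLE sales (a int, b int, c int, amount numeric);

    -- Deduplicates the grouping sets produced by the two rollups,
    -- matching the expansion shown in the documentation above.
    SELECT a, b, c, sum(amount)
    FROM sales
    GROUP BY DISTINCT ROLLUP (a, b), ROLLUP (a, c);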

Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Mark Dilger



> On Mar 2, 2021, at 8:51 AM, Pantelis Theodosiou  wrote:
> 
> 
> 
> On Tue, Mar 2, 2021 at 3:28 PM Mark Dilger  
> wrote:
> 
> 
> > On Mar 2, 2021, at 5:20 AM, Joel Jacobson  wrote:
> > 
> > it's currently not possible to create an empty range with bounds 
> > information.
> > 
> > This patch tries to improve the situation by keeping the bounds information,
> > and allow accessing it via lower() and upper().
> > 
> > No other semantics have been changed.
> > All tests pass without any changes.
> 
> I recall this issue of empty ranges not keeping any bounds information being 
> discussed back when range types were developed, and the design choice was 
> intentional.  Searching the archives for that discussion, I don't find 
> anything, probably because I'm not searching for the right keywords.  Anybody 
> have a link to that discussion?
> 
> —
> Mark Dilger
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
> 
> 
>  Mark, perhaps you were referring to this discussion?
> https://www.postgresql.org/message-id/4d5534d002250003a...@gw.wicourts.gov

Yes, I believe so.  Thank you for the link.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Table AM modifications to accept column projection lists

2021-03-02 Thread Jacob Champion
On Mon, 2021-03-01 at 23:13 +, Jacob Champion wrote:
> On Mon, 2021-03-01 at 16:59 -0600, Justin Pryzby wrote:
> > Why is this removed ?
> 
> Mm, both of those analyze.c changes seem suspect. Possibly a mismerge
> from the zedstore branch; let me double-check.
> 
> > its (possessive) not it's ("it is")
> 
> Thanks, I'll fix this at the same time.

Both fixed in v3; thanks for the catch!

--Jacob
From 484cdaf5050f4010127ae2de5624f39d7f12982d Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Tue, 2 Mar 2021 08:59:45 -0800
Subject: [PATCH v3] tableam: accept column projection list

Co-authored-by: Soumyadeep Chakraborty 
Co-authored-by: Melanie Plageman 
Co-authored-by: Ashwin Agrawal 
Co-authored-by: Jacob Champion 
---
 src/backend/access/heap/heapam_handler.c |   5 +-
 src/backend/access/nbtree/nbtsort.c  |   3 +-
 src/backend/access/table/tableam.c   |   5 +-
 src/backend/commands/trigger.c   |  19 +++-
 src/backend/executor/execMain.c  |   3 +-
 src/backend/executor/execPartition.c |   2 +
 src/backend/executor/execReplication.c   |   6 +-
 src/backend/executor/execScan.c  | 108 +
 src/backend/executor/nodeIndexscan.c |  10 ++
 src/backend/executor/nodeLockRows.c  |   7 +-
 src/backend/executor/nodeModifyTable.c   |  38 ++--
 src/backend/executor/nodeSeqscan.c   |  43 -
 src/backend/executor/nodeTidscan.c   |  11 ++-
 src/backend/nodes/copyfuncs.c|   2 +
 src/backend/nodes/equalfuncs.c   |   2 +
 src/backend/nodes/outfuncs.c |   2 +
 src/backend/nodes/readfuncs.c|   2 +
 src/backend/optimizer/path/allpaths.c|  85 +++-
 src/backend/optimizer/plan/planner.c |  19 
 src/backend/optimizer/prep/preptlist.c   |  13 ++-
 src/backend/optimizer/util/inherit.c |  37 ++-
 src/backend/parser/analyze.c |  40 ++--
 src/backend/parser/parse_relation.c  |  18 
 src/backend/partitioning/partbounds.c|  14 ++-
 src/backend/rewrite/rewriteHandler.c |   8 ++
 src/include/access/tableam.h | 118 +--
 src/include/executor/executor.h  |   6 ++
 src/include/nodes/execnodes.h|   1 +
 src/include/nodes/parsenodes.h   |  14 +++
 src/include/utils/rel.h  |  14 +++
 30 files changed, 611 insertions(+), 44 deletions(-)

diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index bd5faf0c1f..5d7bc5f6bd 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -180,7 +180,8 @@ static bool
 heapam_fetch_row_version(Relation relation,
 		 ItemPointer tid,
 		 Snapshot snapshot,
-		 TupleTableSlot *slot)
+		 TupleTableSlot *slot,
+		 Bitmapset *project_cols)
 {
 	BufferHeapTupleTableSlot *bslot = (BufferHeapTupleTableSlot *) slot;
 	Buffer		buffer;
@@ -348,7 +349,7 @@ static TM_Result
 heapam_tuple_lock(Relation relation, ItemPointer tid, Snapshot snapshot,
   TupleTableSlot *slot, CommandId cid, LockTupleMode mode,
   LockWaitPolicy wait_policy, uint8 flags,
-  TM_FailureData *tmfd)
+  TM_FailureData *tmfd, Bitmapset *project_cols)
 {
 	BufferHeapTupleTableSlot *bslot = (BufferHeapTupleTableSlot *) slot;
 	TM_Result	result;
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 2c4d7f6e25..57462eb9c7 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1966,7 +1966,8 @@ _bt_parallel_scan_and_sort(BTSpool *btspool, BTSpool *btspool2,
 	indexInfo = BuildIndexInfo(btspool->index);
 	indexInfo->ii_Concurrent = btshared->isconcurrent;
 	scan = table_beginscan_parallel(btspool->heap,
-	ParallelTableScanFromBTShared(btshared));
+	ParallelTableScanFromBTShared(btshared),
+	NULL);
 	reltuples = table_index_build_scan(btspool->heap, btspool->index, indexInfo,
 	   true, progress, _bt_build_callback,
 	   (void *) &buildstate, scan);
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index 5ea5bdd810..9bda57247a 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -172,7 +172,7 @@ table_parallelscan_initialize(Relation rel, ParallelTableScanDesc pscan,
 }
 
 TableScanDesc
-table_beginscan_parallel(Relation relation, ParallelTableScanDesc parallel_scan)
+table_beginscan_parallel(Relation relation, ParallelTableScanDesc parallel_scan, Bitmapset *proj)
 {
 	Snapshot	snapshot;
 	uint32		flags = SO_TYPE_SEQSCAN |
@@ -194,6 +194,9 @@ table_beginscan_parallel(Relation relation, ParallelTableScanDesc parallel_scan)
 		snapshot = SnapshotAny;
 	}
 
+	if (proj)
+		return relation->rd_tableam->scan_begin_with_column_projection(relation, snapshot, 0, NULL,
+			parallel_scan, flags, proj);
 	return relation->rd_tableam->scan_begin(relation, snapshot, 0, NULL,
 			pa

Re: SQL-standard function body

2021-03-02 Thread Peter Eisentraut

On 11.02.21 09:02, Peter Eisentraut wrote:

Here is an updated patch to get it building again.


Another updated patch to get things building again.  I've also fixed the 
last TODO I had in there in qualifying function arguments as necessary 
in ruleutils.c.


Updated patch to resolve merge conflict.  No changes in functionality.
From 8e61da555d6083f9ec0f90791b02082376fe010b Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 2 Mar 2021 18:08:15 +0100
Subject: [PATCH v8] SQL-standard function body

This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
statements for language SQL with a function body that conforms to the
SQL standard and is portable to other implementations.

Instead of the PostgreSQL-specific AS $$ string literal $$ syntax,
this allows writing out the SQL statements making up the body
unquoted, either as a single statement:

CREATE FUNCTION add(a integer, b integer) RETURNS integer
LANGUAGE SQL
RETURN a + b;

or as a block

CREATE PROCEDURE insert_data(a integer, b integer)
LANGUAGE SQL
BEGIN ATOMIC
  INSERT INTO tbl VALUES (a);
  INSERT INTO tbl VALUES (b);
END;

The function body is parsed at function definition time and stored as
expression nodes in a new pg_proc column prosqlbody.  So at run time,
no further parsing is required.

However, this form does not support polymorphic arguments, because
there is no more parse analysis done at call time.

Dependencies between the function and the objects it uses are fully
tracked.

A new RETURN statement is introduced.  This can only be used inside
function bodies.  Internally, it is treated much like a SELECT
statement.

psql needs some new intelligence to keep track of function body
boundaries so that it doesn't send off statements when it sees
semicolons that are inside a function body.

Also, per SQL standard, LANGUAGE SQL is the default, so it does not
need to be specified anymore.

Discussion: 
https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c0...@2ndquadrant.com
---
 doc/src/sgml/catalogs.sgml|  10 +
 doc/src/sgml/ref/create_function.sgml | 126 +-
 doc/src/sgml/ref/create_procedure.sgml|  62 -
 src/backend/catalog/pg_aggregate.c|   1 +
 src/backend/catalog/pg_proc.c | 116 +
 src/backend/commands/aggregatecmds.c  |   2 +
 src/backend/commands/functioncmds.c   | 119 +++--
 src/backend/commands/typecmds.c   |   4 +
 src/backend/executor/functions.c  |  79 +++---
 src/backend/nodes/copyfuncs.c |  15 ++
 src/backend/nodes/equalfuncs.c|  13 +
 src/backend/nodes/outfuncs.c  |  12 +
 src/backend/nodes/readfuncs.c |   1 +
 src/backend/optimizer/util/clauses.c  | 126 +++---
 src/backend/parser/analyze.c  |  35 +++
 src/backend/parser/gram.y | 129 +++---
 src/backend/tcop/postgres.c   |   3 +-
 src/backend/utils/adt/ruleutils.c | 132 +-
 src/bin/pg_dump/pg_dump.c |  45 +++-
 src/bin/psql/describe.c   |  15 +-
 src/fe_utils/psqlscan.l   |  23 +-
 src/include/catalog/pg_proc.dat   |   4 +
 src/include/catalog/pg_proc.h |   6 +-
 src/include/commands/defrem.h |   2 +
 src/include/executor/functions.h  |  15 ++
 src/include/fe_utils/psqlscan_int.h   |   2 +
 src/include/nodes/nodes.h |   1 +
 src/include/nodes/parsenodes.h|  13 +
 src/include/parser/kwlist.h   |   2 +
 src/include/tcop/tcopprot.h   |   1 +
 src/interfaces/ecpg/preproc/ecpg.addons   |   6 +
 src/interfaces/ecpg/preproc/ecpg.trailer  |   4 +-
 .../regress/expected/create_function_3.out| 231 +-
 .../regress/expected/create_procedure.out |  58 +
 src/test/regress/sql/create_function_3.sql|  96 +++-
 src/test/regress/sql/create_procedure.sql |  26 ++
 36 files changed, 1321 insertions(+), 214 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index db29905e91..7b8e2e7227 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5980,6 +5980,16 @@ pg_proc Columns
   
  
 
+ 
+  
+   prosqlbody pg_node_tree
+  
+  
+   Pre-parsed SQL function body.  This will be used for language SQL
+   functions if the body is not specified as a string constant.
+   
+ 
+
  
   
proconfig text[]
diff --git a/doc/src/sgml/ref/create_function.sgml 
b/doc/src/sgml/ref/create_function.sgml
index 3c1eaea651..1b5b9420db 100644
--- a/doc/src/sgml/ref/create_function.sgml
+++ b/doc/src/sgml/ref/create_function.sgml
@@ -38,6 +38,7 @@
 | SET configuration_parameter 
{ TO value | = value | FROM CURRENT }

Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Pantelis Theodosiou
On Tue, Mar 2, 2021 at 4:57 PM Mark Dilger 
wrote:

>
>
> > On Mar 2, 2021, at 8:51 AM, Pantelis Theodosiou 
> wrote:
> >
> >
> >
> > On Tue, Mar 2, 2021 at 3:28 PM Mark Dilger 
> wrote:
> >
> >
> > > On Mar 2, 2021, at 5:20 AM, Joel Jacobson  wrote:
> > >
> > > it's currently not possible to create an empty range with bounds
> information.
> > >
> > > This patch tries to improve the situation by keeping the bounds
> information,
> > > and allow accessing it via lower() and upper().
> > >
> > > No other semantics have been changed.
> > > All tests pass without any changes.
> >
> > I recall this issue of empty ranges not keeping any bounds information
> being discussed back when range types were developed, and the design choice
> was intentional.  Searching the archives for that discussion, I don't find
> anything, probably because I'm not searching for the right keywords.
> Anybody have a link to that discussion?
> >
> > —
> > Mark Dilger
> > EnterpriseDB: http://www.enterprisedb.com
> > The Enterprise PostgreSQL Company
> >
> >
> >  Mark, perhaps you were referring to this discussion?
> >
> https://www.postgresql.org/message-id/4d5534d002250003a...@gw.wicourts.gov
>
> Yes, I believe so.  Thank you for the link.
>

You're welcome. Also see this message, where I found the link; it has an
overview of the different views at the time (and more links):

https://www.postgresql.org/message-id/1299865026.3474.58.camel%40jdavis

On Fri, 2011-03-11 at 08:37 -0500, Bruce Momjian wrote:
> > Where are we on this?
>
> The options are:
>
> 1. Rip out empty ranges. Several people have been skeptical of their
> usefulness, but I don't recall anyone directly saying that they should
> be removed. Robert Haas made the point that range types aren't closed
> under UNION:
> http://archives.postgresql.org/pgsql-hackers/2011-02/msg01045.php
> So the additional nice mathematical properties provided by empty ranges
> are not as important (because it wouldn't be perfect anyway).
>
> 2. Change the semantics. Erik Rijkers suggested that we define all
> operators for empty ranges, perhaps using NULL semantics:
> http://archives.postgresql.org/pgsql-hackers/2011-02/msg00942.php
> And Kevin Grittner suggested that there could be discrete ranges of zero
> length yet a defined starting point:
> http://archives.postgresql.org/pgsql-hackers/2011-02/msg01042.php
>
> 3. Leave empty ranges with the existing "empty set" semantics. Nathan
> Boley made a good point here:
> http://archives.postgresql.org/pgsql-hackers/2011-02/msg01108.php
>
> Right now it's #3, and I lean pretty strongly toward keeping it. Without
> #3, people will get confused when fairly simple operations fail in a
> data-dependent way (at runtime). With #3, people will run into problems
> only in situations where it is fairly dubious to have an empty range
> anyway (and therefore likely a real error), such as finding ranges "left
> of" an empty range. Otherwise, I'd prefer #1 to #2. I think #2 is a bad
> path to take, and we'll end up with a lot of unintuitive and error-prone
> operators.
>
> Regards,
> Jeff Davis


Re: GiST comment improvement

2021-03-02 Thread Andrey Borodin



> On 2 March 2021, at 21:40, Bruce Momjian wrote:
> 
> In talking to Teodor this week, I have written the attached C comment
> patch which improves our description of GiST's NSN and GistBuildLSN
> values.

I'd suggest also adding an NSN acronym description to the Concurrency section
of src/backend/access/gist/README, and maybe even to doc/src/sgml/acronyms.sgml,
just for cross-referencing.

Thanks!

Best regards, Andrey Borodin.



PROXY protocol support

2021-03-02 Thread Magnus Hagander
PFA a simple patch that implements support for the PROXY protocol.

This is a common and very lightweight protocol in proxies and load
balancers (haproxy is one common example, as are for instance the AWS
cloud load balancers).  Basically, this protocol prefixes the normal
connection with a header specifying what the original host was,
allowing the server to unwrap that and get the correct client address
instead of just the proxy IP address. It is a one-way protocol in that
there is no response from the server; it's purely a prefix of the IP
information.
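
For illustration, the human-readable v1 form of such a header is a single
line like the one below, sent before the ordinary PostgreSQL startup packet
(an example of the concept only; which variant of the protocol the patch
parses is not shown in this mail):

    PROXY TCP4 192.0.2.10 198.51.100.5 56324 5432\r\n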

Using this when PostgreSQL is behind a proxy allows us to keep using
pg_hba.conf rules based on the original ip address, as well as track
the original address in log messages and pg_stat_activity etc.

The implementation adds a parameter named proxy_servers which lists
the ips or ip+cidr mask to be trusted. Since a proxy can decide what
the origin is, and this is used for security decisions, it's very
important to not just trust any server, only those that are
intentionally used. By default, no servers are listed, and thus the
protocol is disabled.
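
For example, in postgresql.conf (addresses invented):

    proxy_servers = '10.0.0.0/24, 192.0.2.7'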

When specified, and the connection on the normal port has the proxy
prefix on it, and the connection comes in from one of the addresses
listed as valid proxy servers, we will replace the actual IP address
of the client with the one specified in the proxy packet.

Currently there is no information about the proxy server in the
pg_stat_activity view, it's only available as a log message. But maybe
it should go in pg_stat_activity as well? Or in a separate
pg_stat_proxy view?

(In passing, I note that pq_discardbytes was declared in pqcomm.h, yet
was static in pqcomm.c -- it is now made non-static.)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/
diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index b420486a0a..d4f6fad5b0 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -353,6 +353,15 @@ hostnogssenc  database  user
 
+  
+   If  is enabled and the
+   connection is made through a proxy server using the PROXY
+   protocol, the actual IP address of the client will be used
+   for matching. If a connection is made through a proxy server
+   not using the PROXY protocol, the IP address of the
+   proxy server will be used.
+  
+
   
These fields do not apply to local records.
   
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b5718fc136..fc7de25378 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -682,6 +682,30 @@ include_dir 'conf.d'
   
  
 
+ 
+  proxy_servers (string)
+  
+   proxy_servers configuration parameter
+  
+  
+  
+   
+A comma separated list of one or more host names or cidr specifications
+of proxy servers to trust. If a connection using the PROXY protocol is made
+from one of these IP addresses, PostgreSQL will
+read the client IP address from the PROXY header and consider that the
+address of the client, instead of listing all connections as coming from
+the proxy server.
+   
+   
+If a proxy connection is made from an IP address not covered by this
+list, the connection will be rejected. By default no proxy is trusted
+and all proxy connections will be rejected.  This parameter can only
+be set at server start.
+   
+  
+ 
+
  
   max_connections (integer)
   
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 27a298f110..9163761cc2 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -53,6 +53,7 @@
  *		pq_getmessage	- get a message with length word from connection
  *		pq_getbyte		- get next byte from connection
  *		pq_peekbyte		- peek at next byte from connection
+ *		pq_peekbytes	- peek at a known number of bytes from connection
  *		pq_putbytes		- send bytes to connection (not flushed until pq_flush)
  *		pq_flush		- flush pending output
  *		pq_flush_if_writable - flush pending output if writable without blocking
@@ -1039,6 +1040,27 @@ pq_peekbyte(void)
 	return (unsigned char) PqRecvBuffer[PqRecvPointer];
 }
 
+
+/* --------------------------------
+ *		pq_peekbytes		- peek at a known number of bytes from connection.
+ *			  Note! Does NOT wait for more data to arrive.
+ *
+ *		returns 0 if OK, EOF if trouble
+ * --------------------------------
+ */
+int
+pq_peekbytes(char *s, size_t len)
+{
+	Assert(PqCommReadingMsg);
+
+	if (PqRecvLength - PqRecvPointer < len)
+		return EOF;
+
+	memcpy(s, PqRecvBuffer + PqRecvPointer, len);
+
+	return 0;
+}
+
 /* --------------------------------
  *		pq_getbyte_if_available - get a single byte from connection,
  *			if available
@@ -1135,7 +1157,7 @@ pq_getbytes(char *s, size_t len)
  *		returns 0 if OK, EOF if trouble
  * --

Re: 2019-03 CF now in progress

2021-03-02 Thread Magnus Hagander
On Tue, Mar 2, 2021 at 1:53 PM David Steele  wrote:
>
> Hi Justin,
>
> On 3/1/21 6:19 PM, Justin Pryzby wrote:
> > Could I suggest to update the CF APP to allow:
> > | Target version: 15
>
> I don't have permission to add target versions (or at least I can't find
> it in the admin interface) so hopefully Michael or Magnus can do it.

Done.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: DETAIL for wrong scram password

2021-03-02 Thread Jacob Champion
On Sat, 2021-02-27 at 17:02 -0500, Jeff Janes wrote:
> Note that in one case you do get the "does not match" line.  That is
> if the user has a scram password assigned and the hba specifies
> plain-text 'password' as the method.  So if the absence of the DETAIL
> is intentional, it is not internally consistent.

Agreed.

> The attached patch fixes the issue.  I don't know if this is the
> correct location to be installing the message,
> maybe verify_client_proof should be doing it instead.

Hmm, I agree that the current location doesn't seem quite right. If
someone adds a new way to exit the SCRAM loop on failure, they'd
probably need to partially unwind this change to avoid printing the
wrong detail message. But in my opinion, verify_client_proof()
shouldn't be concerned with logging details...

What would you think about adding the additional detail right after
verify_client_proof() fails? I.e.

> if (!verify_client_proof(state) || state->doomed)
> {
>   /*  */
>   result = SASL_EXCHANGE_FAILURE;
>   break;
> }

That way the mismatched-password detail is linked directly to the
mismatched-password logic.
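
That is, roughly (a sketch only; the logdetail plumbing, the message wording,
and the state->username field are assumptions based on how similar code paths
usually do this, not quoted from the patch):

    if (!verify_client_proof(state) || state->doomed)
    {
        /* set the mismatched-password DETAIL right where it is detected */
        *logdetail = psprintf(_("Password does not match for user \"%s\"."),
                              state->username);
        result = SASL_EXCHANGE_FAILURE;
        break;
    }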

Other notes:
- Did you have any thoughts around adding a regression test, since this
would be an easy thing to break in the future?
- I was wondering about timing attacks against the psprintf() call to
construct the logdetail string, but it looks like the other authn code
paths aren't worried about that.

--Jacob


Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 15:42, Tom Lane wrote:
> "Joel Jacobson"  writes:
> > As discussed in the separate thread "[PATCH] regexp_positions ( string 
> > text, pattern text, flags text ) → setof int4range[]" [1]
> > it's currently not possible to create an empty range with bounds 
> > information.
> 
> > This patch tries to improve the situation by keeping the bounds information,
> > and allow accessing it via lower() and upper().
> 
> I think this is an actively bad idea.  We had a clean set-theoretic
> definition of ranges as sets of points, and with this we would not.
> We should not be whacking around the fundamental semantics of a
> whole class of data types on the basis that it'd be cute to make
> regexp_position return its result as int4range rather than int[].

I think there are *lots* of other use-cases where the current semantics of 
range types are very problematic.

The regexp_positions() patch just demonstrates one concrete example
on when real-life zero-length ranges can definitively have positions,
regardless of what the mathematicians thinks.
(I use the term "position" here because when the lower and upper bounds are
equal we're talking about a position: it has zero length.)

I think there is a risk that lots of users coming from other programming
environments will misunderstand how ranges work, start implementing something
using them, only to later have to rewrite all their code using ranges after
eventually encountering the zero-length corner case, for which there is no
workaround (except not using ranges).

Imagine e.g. a Rust user who has learned how ranges work in Rust,
thinks the program below is valid, and expects it to output
"start 3 end 3 is_empty true".

fn main() {
let r = std::ops::Range { start: 3, end: 3 };

println!(
"start {} end {} is_empty {}",
r.start,
r.end,
r.is_empty()
);
}

I think it would be a challenge to explain PostgreSQL's range semantics
to this user: why you get NULL when trying to
access the start and end values of this range.

I feel this is a perfect example of where theory and practice have long since
parted ways, and the theory is only cute until you face the ugly reality.

That said, subtle changes are of course potentially dangerous,
and since I'm not a heavy range type user myself,
I can't judge how many rely on the current null semantics for
lower() and upper().

Argh! I wish we had access to a large set of real-life, real-time statistics
on PostgreSQL SQL queries and result sets from lots of different users around
the world, similar to the regex test corpus.
It's rather unfair that e.g. Amazon, with their Aurora, could in theory
collect such statistics on all their users, so their Aurora hackers could
answer questions like

"Do lower() and upper() ever return NULL in real life for ranges?"

Meanwhile, all we can do is rely on user reports and our imagination.
Maybe we should allow opting in to contributing statistics from production
servers, to help us better understand how PostgreSQL is used in real life?
I see lots of problems with data privacy, business secrets, etc., but perhaps
there is something that can be done.

Oh well. At least it was fun to learn about how ranges are implemented behind 
the scenes.

If we cannot do a subtle change, then I think we should consider an entirely
new range class, just like multiranges were added in v14. Maybe "negrange"
could be a good name?

> 
> If we did go forward with this, what would the implications be for
> multiranges?

None. It would only affect lower()/upper() for a single range created with 
bounds.

Before patch:

SELECT numrange(3,4) + numrange(5,5);         -- [3,4)
SELECT upper(numrange(3,4) + numrange(5,5));  -- 4
SELECT numrange(5,5);                         -- empty
SELECT upper(numrange(5,5));                  -- NULL

After patch:

SELECT numrange(3,4) + numrange(5,5);         -- [3,4)
SELECT upper(numrange(3,4) + numrange(5,5));  -- 4
SELECT numrange(5,5);                         -- empty
SELECT upper(numrange(5,5));                  -- 5

At the very least, I think we should in any case add test coverage of what 
lower()/upper() returns for empty ranges.

/Joel


Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Mark Dilger



> On Mar 2, 2021, at 9:52 AM, Joel Jacobson  wrote:
> 
> On Tue, Mar 2, 2021, at 15:42, Tom Lane wrote:
>> "Joel Jacobson"  writes:
>> > As discussed in the separate thread "[PATCH] regexp_positions ( string 
>> > text, pattern text, flags text ) → setof int4range[]" [1]
>> > it's currently not possible to create an empty range with bounds 
>> > information.
>> 
>> > This patch tries to improve the situation by keeping the bounds 
>> > information,
>> > and allow accessing it via lower() and upper().
>> 
>> I think this is an actively bad idea.  We had a clean set-theoretic
>> definition of ranges as sets of points, and with this we would not.
>> We should not be whacking around the fundamental semantics of a
>> whole class of data types on the basis that it'd be cute to make
>> regexp_position return its result as int4range rather than int[].
> 
> I think there are *lots* of other use-cases where the current semantics of 
> range types are very problematic.

I'm inclined to think that this conversation is ten years too late.  Range 
semantics are already relied upon in our code, but also in the code of others.  
It might be very hard to debug code that was correct when written but broken by 
this proposed change.  The problem is not just with lower() and upper(), but 
with equality testing (mentioned upthread), since code may rely on two 
different "positions" (your word) both being equal, and both sorting the same.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 17:51, Pantelis Theodosiou wrote:
> 
>  Marc, perhaps you were referring to this discussion?
> https://www.postgresql.org/message-id/4d5534d002250003a...@gw.wicourts.gov

Thanks for the link to the discussion.

I will read it with great interest and learn the arguments,
so I can explain them to future versions of myself
when they ask the same question ten years from now on this list.

/Joel


Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 19:01, Mark Dilger wrote:
> The problem is not just with lower() and upper(), but with equality testing 
> (mentioned upthread), since code may rely on two different "positions" (your 
> word) both being equal, and both sorting the same.

That's why the patch doesn't change equality.

/Joel

Re: GiST comment improvement

2021-03-02 Thread Bruce Momjian
On Tue, Mar  2, 2021 at 10:25:23PM +0500, Andrey Borodin wrote:
> 
> 
> > On 2 March 2021, at 21:40, Bruce Momjian wrote:
> > 
> > In talking to Teodor this week, I have written the attached C comment
> > patch which improves our description of GiST's NSN and GistBuildLSN
> > values.
> 
> I'd suggest also adding an NSN acronym description to the Concurrency section
> of src/backend/access/gist/README, and maybe even to doc/src/sgml/acronyms.sgml,
> just for cross-referencing.

Uh, NSN is only something that exists in the source code. Do we document
those cases?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Chapman Flack
On 03/02/21 13:01, Mark Dilger wrote:
> The problem is not just with lower() and upper(), but with equality testing
> (mentioned upthread), since code may rely on two different "positions"
> (your word) both being equal, and both sorting the same.

Could those concerns be addressed perhaps, not by adding an entirely new
just-like-a-range-but-remembers-position-when-zero-width type (which would
feel wartlike to me), but by tweaking ranges to /secretly/ remember the
position when zero width?

Secretly, in the sense that upper(), lower(), and the default sort
operator would keep their established behavior, but new functions like
upper_or_pos(), lower_or_pos() would return the non-NULL value even for
an empty range, and another sort operator could be provided for use
when one wants the ordering to reflect it?
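
For example (a sketch of the proposed behavior only; these functions do
not exist today):

    SELECT upper(numrange(5,5));         -- NULL, as today
    SELECT upper_or_pos(numrange(5,5));  -- 5, the remembered position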

Regards,
-Chap




Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Mark Dilger



> On Mar 2, 2021, at 10:08 AM, Joel Jacobson  wrote:
> 
> On Tue, Mar 2, 2021, at 19:01, Mark Dilger wrote:
>> The problem is not just with lower() and upper(), but with equality testing 
>> (mentioned upthread), since code may rely on two different "positions" (your 
>> word) both being equal, and both sorting the same.
> 
> That's why the patch doesn't change equality.

How does that work if I SELECT DISTINCT ON (nr) ... and then take upper(nr).  
It's just random which values I get?  

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: new heapcheck contrib module

2021-03-02 Thread Robert Haas
On Tue, Mar 2, 2021 at 12:10 PM Mark Dilger
 wrote:
> On further reflection, I decided to implement these changes and not worry 
> about the behavioral change.

Thanks.

> I skipped this part.  The initcmd argument is only handed to 
> ParallelSlotsGetIdle().  Doing as you suggest would not really be simpler, it 
> would just move that argument to ParallelSlotsSetup().  But I don't feel 
> strongly about it, so I can move this, too, if you like.
>
> I didn't do this either, and for the same reason.  It's just a parameter to 
> ParallelSlotsGetIdle(), so nothing is really gained by moving it to 
> ParallelSlotsSetup().

OK. I thought it was more natural to pass a bunch of arguments at
setup time rather than passing a bunch of arguments at get-idle time,
but I don't feel strongly enough about it to insist, and somebody else
can always change it later if they decide I had the right idea.

> Rather than having the slots user tweak the slot's ConnParams,
> ParallelSlotsGetIdle() takes a dbname argument and uses it as
> ConnParams->override_dbname.

OK, but you forgot to update the comments. ParallelSlotsGetIdle()
still talks about a cparams argument that it no longer has.

The usual idiom for sizing a memory allocation involving
FLEXIBLE_ARRAY_MEMBER is something like offsetof(ParallelSlotArray,
slots) + numslots * sizeof(ParallelSlot). Your version uses sizeof();
don't.
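
Spelled out, with the struct layout assumed from the names in the thread
(a sketch, not the actual declaration):

    typedef struct ParallelSlotArray
    {
        int          numslots;
        /* ... other bookkeeping fields ... */
        ParallelSlot slots[FLEXIBLE_ARRAY_MEMBER];
    } ParallelSlotArray;

    ParallelSlotArray *sa;

    /* Correct: offsetof() measures up to the start of the flexible array. */
    sa = palloc0(offsetof(ParallelSlotArray, slots) +
                 numslots * sizeof(ParallelSlot));

    /*
     * sizeof(ParallelSlotArray) would instead include any trailing padding
     * of the struct, so it can mis-size the allocation.
     */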

Other than that 0001 looks to me to be in pretty good shape now.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Tom Lane
Mark Dilger  writes:
> On Mar 2, 2021, at 10:08 AM, Joel Jacobson  wrote:
>> On Tue, Mar 2, 2021, at 19:01, Mark Dilger wrote:
>>> The problem is not just with lower() and upper(), but with equality testing 
>>> (mentioned upthread), since code may rely on two different "positions" 
>>> (your word) both being equal, and both sorting the same.

>> That's why the patch doesn't change equality.

> How does that work if I SELECT DISTINCT ON (nr) ... and then take upper(nr).  
> It's just random which values I get?  

More generally, values that are visibly different yet compare equal
are user-unfriendly in the extreme.  We do have cases like that
(IEEE-float minus zero, area-based compare of some geometric types
come to mind) but they are all warts, not things to be emulated.

regards, tom lane




Re: Table AM modifications to accept column projection lists

2021-03-02 Thread Zhihong Yu
Hi,

+   /* Make sure the the new slot is not dependent on the original tuple */

There is duplicate 'the'.

For neededColumnContextWalker(),

+   else if(var->varattno == 0) {

I think the if following the else is not needed - I assume var->varattno
wouldn't be negative.
Similar comment for extract_scan_columns().

+   while ((col = bms_next_member(parent_cols, col)) >= 0)
+   {
+   Var *var = (Var *) list_nth(translated_vars, col - 1);

If col is 0, do you still want to call list_nth()?

Cheers

On Tue, Mar 2, 2021 at 9:10 AM Jacob Champion  wrote:

> On Mon, 2021-03-01 at 23:13 +, Jacob Champion wrote:
> > On Mon, 2021-03-01 at 16:59 -0600, Justin Pryzby wrote:
> > > Why is this removed ?
> >
> > Mm, both of those analyze.c changes seem suspect. Possibly a mismerge
> > from the zedstore branch; let me double-check.
> >
> > > its (possessive) not it's ("it is")
> >
> > Thanks, I'll fix this at the same time.
>
> Both fixed in v3; thanks for the catch!
>
> --Jacob
>


Re: PROXY protocol support

2021-03-02 Thread Arthur Nascimento
Hi,

On Tue, 2 Mar 2021 at 14:43, Magnus Hagander  wrote:
> PFA a simple patch that implements support for the PROXY protocol.

Nice. I didn't know I needed this. But in hindsight, I would've used
it quite a few times in the past if I could have.

> The implementation adds a parameter named proxy_servers which lists
> the ips or ip+cidr mask to be trusted. Since a proxy can decide what
> the origin is, and this is used for security decisions, it's very
> important to not just trust any server, only those that are
> intentionally used. By default, no servers are listed, and thus the
> protocol is disabled.

Might make sense to add special cases for 'samehost' and 'samenet', as
in hba rules, as proxy servers are commonly on the same machine or
share one of the same internal networks.

Despite the security issues, I'm sure people will soon try and set
proxy_servers='*' or 'all' if they think this setting works as
listen_addresses or as pg_hba. But I don't think I'd make these use
cases easier.

Tureba - Arthur Nascimento




Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Mark Dilger



> On Mar 2, 2021, at 10:16 AM, Chapman Flack  wrote:
> 
> On 03/02/21 13:01, Mark Dilger wrote:
>> The problem is not just with lower() and upper(), but with equality testing
>> (mentioned upthread), since code may rely on two different "positions"
>> (your word) both being equal, and both sorting the same.
> 
> Could those concerns be addressed perhaps, not by adding an entirely new
> just-like-a-range-but-remembers-position-when-zero-width type (which would
> feel wartlike to me), but by tweaking ranges to /secretly/ remember the
> position when zero width?
> 
> Secretly, in the sense that upper(), lower(), and the default sort
> operator would keep their established behavior, but new functions like
> upper_or_pos(), lower_or_pos() would return the non-NULL value even for
> an empty range, and another sort operator could be provided for use
> when one wants the ordering to reflect it?

I vaguely recall that ten years ago I wanted zero-width range types to not 
collapse into an empty range.  I can't recall if I ever expressed that opinion 
-- I just remember thinking it would be nice, and for reasons similar to what 
Joel is arguing here.  But I can't see having 
compares-equal-but-not-truly-equal ranges as a good idea.  I think Tom is right 
about that.

I also think the regexp work that inspired this thread could return something 
other than a range, so the motivation for creating a frankenstein range 
implementation doesn't really exist.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: GROUP BY DISTINCT

2021-03-02 Thread Georgios Kokolatos
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

Hi,

this is a useful feature, thank you for implementing it. I gather that it
follows the standard; if so, there are definitely no objections from me.

The patch in version 2 applies cleanly and passes all the tests.
It contains documentation which seems correct to a non-native speaker.

As a minor gripe, I would note the addition of list_int_cmp.
The block

+   /* Sort each groupset individually */
+   foreach(cell, result)
+   list_sort(lfirst(cell), list_int_cmp);

Can follow suit from the rest of the code, and define a static
cmp_list_int_asc(), as
indeed the same patch does for cmp_list_len_contents_asc.
This is a point on which I will not hold too strong an opinion.

Overall :+1: from me.

I will be bumping to 'Ready for Committer' unless objections.

Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 17:51, Pantelis Theodosiou wrote:
>  Mark, perhaps you were referring to this discussion?
> https://www.postgresql.org/message-id/4d5534d002250003a...@gw.wicourts.gov

I've read through the "Re: Range Types: empty ranges" thread from 2011.

My comments:

Jeff Davis wrote:
>The cost, of course, is that not all operations are well-defined for
>empty ranges. I think those are mostly operators like those mentioned in
>the other thread: ">>" (strictly right of), "<<" (strictly left of), and
>"-|-" (adjacent); and perhaps "&>" and "&<". These are probably used a
>little less frequently, and should probably not be used in a context
>where empty ranges are permitted (if they are, it's likely a mistake and
>an error should be thrown).

Interesting. I realize all of these operators would actually be well defined 
for empty ranges *with* bounds information.

"Kevin Grittner" wrote:
>Perhaps it was a mistake to get so concrete rather than conceptual
>-- basically, it seems like it could be a useful concept for any
>planned or scheduled range with an indeterminate end point, which
>you want to "reserve" up front and record in progress until
>complete.  The alternative would be that such "ranges to be" have a
>parallel "planned start value" column of the same type as the range,
>to be used as the start of the range once it is not empty.  Or, as
>another way to put it, it seems potentially useful to me to have an
>empty range which is pinned to a location, in *addition* to the
>unpinned empty ranges such as would be needed to represent the
>intersection of two non-overlapping ranges.

I fully agree with this. Such "pinned to a location" empty range
is exactly what I needed for regexp_positions() and is what
the patch implements.

It seems that the consequences of allowing empty ranges with bounds information
weren't really discussed in detail in this thread. Nobody commented on Kevin's
idea, as far as I can see when reading the thread.

Instead, the discussion focused on the consequences of
allowing empty ranges without bounds information,
which apparently was finally accepted, since that's what we have now.

/Joel


Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 19:16, Chapman Flack wrote:
> On 03/02/21 13:01, Mark Dilger wrote:
> > The problem is not just with lower() and upper(), but with equality testing
> > (mentioned upthread), since code may rely on two different "positions"
> > (your word) both being equal, and both sorting the same.
> 
> Could those concerns be addressed perhaps, not by adding an entirely new
> just-like-a-range-but-remembers-position-when-zero-width type (which would
> feel wartlike to me), but by tweaking ranges to /secretly/ remember the
> position when zero width?

This is actually how it's implemented. The patch doesn't affect equality.
It just stores the lower/upper bounds, if available, upon creation.

> 
> Secretly, in the sense that upper(), lower(), and the default sort
> operator would keep their established behavior, but new functions like
> upper_or_pos(), lower_or_pos() would return the non-NULL value even for
> an empty range, and another sort operator could be provided for use
> when one wants the ordering to reflect it?

This is a great idea!

This would solve the potential problem of users relying
on upper()/lower() always returning null when the range is empty.

Such new functions could then be used by new users who have read the 
documentation
and understand how they work.
This would effectively mean there would be absolutely no semantic changes at 
all.

I will work on a new patch to try out this idea.

/Joel

Re: GROUP BY DISTINCT

2021-03-02 Thread Georgios



‐‐‐ Original Message ‐‐‐
On Tuesday, March 2, 2021 5:51 PM, Vik Fearing  wrote:

> On 3/2/21 4:06 PM, Georgios Kokolatos wrote:
>
> > As a minor gripe, I would note the addition of list_int_cmp.
> > The block
> >
> > - /* Sort each groupset individually */
> >
> >
> > - foreach(cell, result)
> >
> >
> > - list_sort(lfirst(cell), list_int_cmp);
> >
> >
> >
> > Can follow suit from the rest of the code, and define a static
> > cmp_list_int_asc(), as
> > indeed the same patch does for cmp_list_len_contents_asc.
> > This is a point on which I will not hold too strong an opinion.
>
> I did it this way because list_int_cmp is a general purpose function for
> int lists that can be reused elsewhere in the future. Whereas
> cmp_list_len_contents_asc is very specific to this case so I kept it local.

Of course. I got the intention and I have noted my opinion.
>
> I'm happy to change it around if that's what consensus wants.

As before, I will not hold too strong an opinion.

>
> > Overall :+1: from me.
>
> Thanks for looking at it!
>
> > I will be bumping to 'Ready for Committer' unless objections.
>
> In that case, here is another patch that fixes a typo in the docs
> mentioned privately to me by Erik. The typo (and a gratuitous rebase)
> is the only change to what you just reviewed.

Thank you. The typo was indistinguishable to me too.

My :+1: stands for version 3 of the patch. Updating status in the
commitfest to reflect that.

//Georgios -- https://www.vmware.com

>
> 
>
> Vik Fearing






Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 19:17, Mark Dilger wrote:
> > On Mar 2, 2021, at 10:08 AM, Joel Jacobson  wrote:
> > That's why the patch doesn't change equality.
> 
> How does that work if I SELECT DISTINCT ON (nr) ... and then take upper(nr).  
> It's just random which values I get?  

Yes, it's random. Since equality isn't changed, the sort operation cannot tell
the difference, and a user who isn't aware of upper()/lower() couldn't reveal
the differences either.

Demo:

CREATE TABLE t AS SELECT int4range(i,i+FLOOR(random()*2)::integer,'[)') AS nr 
FROM generate_series(1,10) AS i;

SELECT nr, lower(nr), upper(nr) FROM t ORDER BY 1;
   nr   | lower | upper
--------+-------+-------
 empty  |    10 |    10
 empty  |     4 |     4
 empty  |     6 |     6
 empty  |     7 |     7
 empty  |     1 |     1
 empty  |     3 |     3
 [2,3)  |     2 |     3
 [5,6)  |     5 |     6
 [8,9)  |     8 |     9
 [9,10) |     9 |    10
(10 rows)

SELECT DISTINCT ON (nr) nr, lower(nr), upper(nr) FROM t ORDER BY 1;
   nr   | lower | upper
--------+-------+-------
 empty  |    10 |    10
 [2,3)  |     2 |     3
 [5,6)  |     5 |     6
 [8,9)  |     8 |     9
 [9,10) |     9 |    10
(5 rows)

/Joel



Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Mark Dilger



> On Mar 2, 2021, at 11:34 AM, Joel Jacobson  wrote:
> 
> Yes. It's random, since equality isn't changed, the sort operation cannot 
> tell the difference, and nor could a user who isn't aware of upper() / 
> lower() could reveal differences.

This sounds unworkable even just in light of the original motivation for this 
whole thread.  If I use your proposed regexp_positions(string text, pattern 
text, flags text) function to parse a large number of "positions" from a 
document, store all those positions in a table, and do a join of those 
positions against something else, it's not going to work.  Positions will 
randomly vanish from the results of that join, which is going to be really 
surprising.  I'm sure there are other examples of Tom's general point about 
compares-equal-but-not-equal datatypes.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Mark Dilger



> On Mar 2, 2021, at 11:42 AM, Mark Dilger  wrote:
> 
> 
> 
>> On Mar 2, 2021, at 11:34 AM, Joel Jacobson  wrote:
>> 
>> Yes. It's random, since equality isn't changed, the sort operation cannot 
>> tell the difference, and nor could a user who isn't aware of upper() / 
>> lower() could reveal differences.
> 
> This sounds unworkable even just in light of the original motivation for this 
> whole thread.  If I use your proposed regexp_positions(string text, pattern 
> text, flags text) function to parse a large number of "positions" from a 
> document, store all those positions in a table, and do a join of those 
> positions against something else, it's not going to work.  Positions will 
> randomly vanish from the results of that join, which is going to be really 
> surprising.  I'm sure there are other examples of Tom's general point about 
> compares-equal-but-not-equal datatypes.

I didn't phrase that clearly enough.  I'm thinking about whether you include 
the bounds information in the hash function.  The current implementation of 
hash_range(PG_FUNCTION_ARGS) is going to hash the lower and upper bounds, since 
you didn't change it to do otherwise, so "equal" values won't always hash the 
same.  I haven't tested this out, but it seems you could get a different set of 
rows depending on whether the planner selects a hash join.
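
To make that hazard concrete, here is a minimal sketch (the temp tables are
made up; it assumes the bounds stay part of hash_range(), which is what the
patch as posted does, as demonstrated later in this thread):

CREATE TEMP TABLE a AS SELECT '[1,1)'::int4range AS nr;
CREATE TEMP TABLE b AS SELECT '[2,2)'::int4range AS nr;
SET enable_hashjoin = off;
SELECT * FROM a JOIN b USING (nr);  -- nestloop/merge: one row, empty = empty
SET enable_hashjoin = on; SET enable_nestloop = off; SET enable_mergejoin = off;
SELECT * FROM a JOIN b USING (nr);  -- hash join: zero rows, hashes differ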

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 20:20, Joel Jacobson wrote:
> On Tue, Mar 2, 2021, at 19:16, Chapman Flack wrote:
>> Secretly, in the sense that upper(), lower(), and the default sort
>> operator would keep their established behavior, but new functions like
>> upper_or_pos(), lower_or_pos() would return the non-NULL value even for
>> an empty range, and another sort operator could be provided for use
>> when one wants the ordering to reflect it?
> 
> I will work on a new patch to try out this idea.

Here is a patch implementing this idea.

lower() and upper() are now restored to their originals.

The new functions range_start() and range_end()
work just like lower() and upper(),
except that they also return bounds information for empty ranges,
if available; otherwise they return null.

This means there is no risk of affecting any current users of ranges.
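
A quick usage sketch (range_start()/range_end() exist only with this patch
applied, and this assumes the constructor retains the bounds, per the patch
description):

SELECT lower(int4range(3,3));        -- NULL, unchanged behavior
SELECT range_start(int4range(3,3));  -- 3 under the patch, bounds retained
SELECT range_end(int4range(3,3));    -- 3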

I think this is a good pragmatic solution to many real-world problems that
can be solved efficiently with ranges.

I've also added test coverage of lower() and upper() for null range values.

/Joel

empty-ranges-with-bounds-information-v2.patch
Description: Binary data


Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 20:57, Mark Dilger wrote:
> I didn't phrase that clearly enough.  I'm thinking about whether you include 
> the bounds information in the hash function.  The current implementation of 
> hash_range(PG_FUNCTION_ARGS) is going to hash the lower and upper bounds, 
> since you didn't change it to do otherwise, so "equal" values won't always 
> hash the same.  I haven't tested this out, but it seems you could get a 
> different set of rows depending on whether the planner selects a hash join.

I think this issue is solved by the
empty-ranges-with-bounds-information-v2.patch I just sent,
since with it there are no semantic changes at all: lower() and upper() work
as before.

/Joel

Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Chapman Flack
On 03/02/21 14:20, Joel Jacobson wrote:
> This would effectively mean there would be absolutely no semantic changes at 
> all.
> 
> I will work on a new patch to try out this idea.

I may have been assuming a degree of orthogonality in SQL that isn't
really there ... only in a few situations (like creating an index) do
you easily get to specify a non-default operator class to use when
comparing or hashing a value.

So perhaps a solution could be built on the same range machinery, but
requiring some new syntax in CREATE TYPE ... AS RANGE, something like
WITH POSITIONS. A new concrete range type created that way would not
be a whole different kind of a thing, and would share most machinery
with other range types, but would have the position-remembering
behavior. Given a range type created over int4 in that way, maybe
named int4prange, the regexp_positions function could return one of those.
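
A syntax sketch of the idea (WITH POSITIONS is purely hypothetical, and the
exact placement of the clause would need design work):

CREATE TYPE int4prange AS RANGE (
    SUBTYPE = int4
) WITH POSITIONS;  -- empty values of this type would remember their bounds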

Regards,
-Chap




Re: Optimising latch signals

2021-03-02 Thread Thomas Munro
On Mon, Mar 1, 2021 at 2:29 PM Thomas Munro  wrote:
> Time to watch the buildfarm to find out if my speculation about
> illumos is correct...

I just heard that damselfly's host has been decommissioned with no
immediate plan for a replacement.  That was the last of the
Solaris-family animals testing master.  It may be some time before I
find out if my assumptions broke something on that OS...




Re: [PATCH] Automatic HASH and LIST partition creation

2021-03-02 Thread Justin Pryzby
https://commitfest.postgresql.org/32/2694/

I don't know what committers will say, but I think that "ALTER TABLE" might be
the essential thing for this patch to support, not "CREATE".  (This is similar
to ALTER..SET STATISTICS, which is not allowed in CREATE.)

The reason is that ALTER is what's important for RANGE partitions, which need
to be created dynamically (for example, to support time-series data
continuously inserting data around 'now').  I assume it's sometimes also
important for LIST.  I think this patch should handle those cases better before
being committed, or else we risk implementing grammar and other user-facing
interfaces that fail to handle what's needed in the future (or that are
non-essential).
Even if dynamic creation isn't implemented yet, it seems important to at least
implement the foundation for setting the configuration to *allow* that in the
future, in a manner that's consistent with the initial implementation for
"static" partitions.

ALTER also supports other ideas I mentioned here:
https://www.postgresql.org/message-id/20200706145947.GX4107%40telsasoft.com

  - ALTER .. SET interval (for dynamic/deferred RANGE partitioning)
  - ALTER .. SET modulus, for HASH partitioning; in the initial implementation,
this would allow CREATING partitions, but wouldn't attempt to handle moving
data if an overlapping table already exists;
  - Could also set the table-name, maybe by format string;
  - Could set "retention interval" for range partitioning;
  - Could set if the partitions are themselves partitioned(??)

I think once you allow setting configuration parameters like this, then you
might have an ALTER command to "effect" them, which would create any static
tables required by the configuration.  Maybe that'd be automatic, but if
there's an "ALTER .. APPLY PARTITIONS" command (or whatever), maybe in the
future the command could also be used to "repartition" existing table data
into partitions with finer/coarser granularity (modulus, or daily vs monthly
range, etc).
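
Hypothetical grammar sketches for the above; none of this syntax exists today,
and the option names are made up:

ALTER TABLE measurement SET (partition_interval = '1 month');   -- dynamic RANGE
ALTER TABLE accounts    SET (partition_modulus = 16);           -- HASH
ALTER TABLE measurement SET (partition_retention = '6 months'); -- retention
ALTER TABLE measurement APPLY PARTITIONS;  -- create partitions implied by config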

-- 
Justin




Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 19:28, Tom Lane wrote:
> More generally, values that are visibly different yet compare equal
> are user-unfriendly in the extreme.  We do have cases like that
> (IEEE-float minus zero, area-based compare of some geometric types
> come to mind) but they are all warts, not things to be emulated.

I almost agree with you, but if forced to choose,
I would rather have two warty feet I can use than limp around on only one
wart-free foot.

/Joel

Re: new heapcheck contrib module

2021-03-02 Thread Robert Haas
On Tue, Mar 2, 2021 at 1:24 PM Robert Haas  wrote:
> Other than that 0001 looks to me to be in pretty good shape now.

Incidentally, we might want to move this to a new thread with a better
subject line, since the current subject line really doesn't describe
the uncommitted portion of the work. And create a new CF entry, too.

Moving onto 0002:

The index checking options should really be called btree index
checking options. I think I'd put the table options first, and the
btree options second. Other kinds of indexes could follow some day. I
would personally omit the short forms of --heapallindexed and
--parent-check; I think we'll run out of option names too quickly if
people add more kinds of checks.

Perhaps VerifyBtreeSlotHandler should emit a warning of some kind if
PQntuples(res) != 0.

+   /*
+* Test that this function works, but for now we're
not using the list
+* 'relations' that it builds.
+*/
+   conn = connectDatabase(&cparams, progname, opts.echo,
false, true);

This comment appears to have nothing to do with the code, since
connectDatabase() does not build a list of 'relations'.

amcheck_sql seems to include paranoia, but do we need that if we're
using a secure search path? Similarly for other SQL queries, e.g. in
prepare_table_command.

It might not be strictly necessary for the static functions in
pg_amcheck.c to use_three completelyDifferent NamingConventions for
its static functions.

should_processing_continue() is one semicolon over budget.

The initializer for opts puts a comma even after the last member
initializer. Is that going to be portable to all compilers?

+   for (failed = false, cell = opts.include.head; cell; cell = cell->next)

I think failed has to be false here, because it gets initialized at
the top of the function. If we need to reinitialize it for some
reason, I would prefer you do that on the previous line, separate from
the for loop stuff.

+   char   *dbrgx;  /* Database regexp parsed from pattern, or
+* NULL */
+   char   *nsprgx; /* Schema regexp parsed from pattern, or NULL */
+   char   *relrgx; /* Relation regexp parsed from pattern, or
+* NULL */
+   booltblonly;/* true if relrgx should only match tables */
+   boolidxonly;/* true if relrgx should only match indexes */

Maybe: db_regex, nsp_regex, rel_regex, table_only, index_only?

Just because it seems theoretically possible that someone will see
nsprgx and not immediately understand what it's supposed to mean, even
if they know that nsp is a common abbreviation for namespace in
PostgreSQL code, and even if they also know what a regular expression
is.

Your four messages about there being nothing to check seem like they
could be consolidated down to one: "nothing to check for pattern
\"%s\"".

I would favor changing things so that once argument parsing is
complete, we switch to reporting all errors that way. So in other
words here, and everything that follows:

+   fprintf(stderr, "%s: no databases to check\n", progname);

+* ParallelSlots based event loop follows.

"Main event loop."

To me it would read slightly better to change each reference to
"relations list" to "list of relations", but perhaps that is too
nitpicky.

I think the two instances of goto finish could be avoided with not
much work. At most a few things need to happen only if !failed, and
maybe not even that, if you just said "break;" instead.

+ * Note: Heap relation corruption is returned by verify_heapam() without the
+ * use of raising errors, but running verify_heapam() on a corrupted table may

How about "Heap relation corruption() is reported by verify_heapam()
via the result set, rather than an ERROR, ..."

It seems mighty inefficient to have a whole bunch of consecutive calls
to remove_relation_file() or corrupt_first_page() when every such call
stops and restarts the database. I would guess these tests will run
noticeably faster if you don't do that. Either the functions need to
take a list of arguments, or the stop/start needs to be pulled up and
done in the caller.
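
A sketch of that second option (the helper signature here is hypothetical;
the real tests may pass different arguments):

$node->stop;
corrupt_first_page($node, $_) for qw(s1.t1 s1.t2 s1.i1);
$node->start;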

corrupt_first_page() could use a comment explaining what exactly we're
overwriting, and in particular noting that we don't want to just
clobber the LSN, but rather something where we can detect a wrong
value.

There's a long list of calls to command_checks_all() in 003_check.pl
that don't actually check anything but that the command failed, but
run it with a bunch of different options. I don't understand the value
of that, and suggest reducing the number of cases tested. If you want,
you can have tests elsewhere that focus -- perhaps by using verbose
mode -- on checking that the right tables are being checked.

This is not yet a full review of everything in this patch -- I haven't
sorted through all of the tests yet, or all of the new query
construction.

Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Mark Dilger



> On Mar 2, 2021, at 12:04 PM, Joel Jacobson  wrote:
> 
> On Tue, Mar 2, 2021, at 20:57, Mark Dilger wrote:
>> I didn't phrase that clearly enough.  I'm thinking about whether you include 
>> the bounds information in the hash function.  The current implementation of 
>> hash_range(PG_FUNCTION_ARGS) is going to hash the lower and upper bounds, 
>> since you didn't change it to do otherwise, so "equal" values won't always 
>> hash the same.  I haven't tested this out, but it seems you could get a 
>> different set of rows depending on whether the planner selects a hash join.
> 
> I think this issue is solved by the
> empty-ranges-with-bounds-information-v2.patch I just sent,
> since with it there are no semantic changes at all: lower() and upper()
> work as before.

There are semantic differences, because hash_range() doesn't call lower() and 
upper(), it uses RANGE_HAS_LBOUND and RANGE_HAS_UBOUND, the behavior of which 
you have changed.  I created a regression test and expected results and checked 
after applying your patch, and your patch breaks the hash function behavior.  
Notice that before your patch, all three ranges hashed to the same value, but 
not so after:


@@ -1,18 +1,18 @@
 select hash_range('[a,a)'::textrange);
  hash_range
 ------------
-  484847245
+ -590102690
 (1 row)

 select hash_range('[b,b)'::textrange);
  hash_range
 ------------
-  484847245
+  281562732
 (1 row)

 select hash_range('[c,c)'::textrange);
- hash_range 
-------------
-  484847245
+ hash_range  
+-------------
+ -1887445565
 (1 row)


You might change how hash_range() works to get all "equal" values to hash the 
same value, but that just gets back to the problem that non-equal things appear 
to be equal.  I guess that's your two-warty-feet preference, but not everyone 
is going to be in agreement on that.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Joel Jacobson


On Tue, Mar 2, 2021, at 21:40, Mark Dilger wrote:
> 
> 
> > On Mar 2, 2021, at 12:04 PM, Joel Jacobson  wrote:
> > 
> > On Tue, Mar 2, 2021, at 20:57, Mark Dilger wrote:
> >> I didn't phrase that clearly enough.  I'm thinking about whether you 
> >> include the bounds information in the hash function.  The current 
> >> implementation of hash_range(PG_FUNCTION_ARGS) is going to hash the lower 
> >> and upper bounds, since you didn't change it to do otherwise, so "equal" 
> >> values won't always hash the same.  I haven't tested this out, but it 
> >> seems you could get a different set of rows depending on whether the 
> >> planner selects a hash join.
> > 
> > I think this issue is solved by the
> > empty-ranges-with-bounds-information-v2.patch I just sent,
> > since with it there are no semantic changes at all: lower() and upper()
> > work as before.
> 
> There are semantic differences, because hash_range() doesn't call lower() and 
> upper(), it uses RANGE_HAS_LBOUND and RANGE_HAS_UBOUND, the behavior of which 
> you have changed.  I created a regression test and expected results and 
> checked after applying your patch, and your patch breaks the hash function 
> behavior.  Notice that before your patch, all three ranges hashed to the same 
> value, but not so after:
> 
> 
> @@ -1,18 +1,18 @@
>  select hash_range('[a,a)'::textrange);
>   hash_range
>  ------------
> -  484847245
> + -590102690
>  (1 row)
> 
>  select hash_range('[b,b)'::textrange);
>   hash_range
>  ------------
> -  484847245
> +  281562732
>  (1 row)
> 
>  select hash_range('[c,c)'::textrange);
> - hash_range 
> -------------
> -  484847245
> + hash_range  
> +-------------
> + -1887445565
>  (1 row)
> 
> 
> You might change how hash_range() works to get all "equal" values to hash the 
> same value, but that just gets back to the problem that non-equal things 
> appear to be equal.  I guess that's your two-warty-feet preference, but not 
> everyone is going to be in agreement on that.

Yikes. Here be dragons. I think I want my wart free foot back please.

Many thanks for explaining. I think I’ll abandon this patch. I guess 
implementing an entirely new range type could be an acceptable solution, but 
that’s too big of a project for me to manage on my own. If any more experienced 
hackers are interested in such a project, I would love to help if I can.

> 
> —
> Mark Dilger
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
> 
> 
> 
> 

Kind regards,

Joel


Re: buildfarm windows checks / tap tests on windows

2021-03-02 Thread Andres Freund
Hi,

On 2021-03-02 07:48:28 -0500, Andrew Dunstan wrote:
> I don't think speed is the issue. I probably disabled misc-tests on
> drongo and bowerbird (my two animals in question) because I got  either
> instability or errors I was unable to diagnose. I'll go back and take
> another look to narrow this down. It's possible to disable individual tests.

Yea, there was one test that hung in 021_row_visibility.pl which was a
weird perl hang - with weird symptoms. But that should be fixed now.

There's another failure in recoverycheck that I at first thought was the
fault of the aio branch. But I also see it on master - but only on one
of the two machines I use to test. Pretty odd.

t/003_recovery_targets.pl  7/9
#   Failed test 'multiple conflicting settings'
#   at t/003_recovery_targets.pl line 151.

#   Failed test 'recovery end before target reached is a fatal error'
#   at t/003_recovery_targets.pl line 177.
t/003_recovery_targets.pl  9/9 # Looks like you failed 2 tests of 9.
t/003_recovery_targets.pl  Dubious, test returned 2 (wstat 512, 
0x200)
Failed 2/9 subtests

I think it's pretty dangerous if we have a substantial number of tests
that aren't run on windows - I think a lot of us just assume that the
BF would catch windows specific problems...


> > On that last point: Running the tap tests on windows appears to be
> > *excruciatingly* slow. How does anybody develop on windows without a
> > mechanism to actually run tests in parallel?
> 
> 
> I think most people develop elsewhere and then adapt/test on Windows if
> necessary.

Yea, but even that overstretches my patience by a good bit. I can't deal
with a serial check-world on linux either - I think that's the biggest
differentiator. It's a lot less painful to deal with slow-ish tests if
they do all their slowness concurrently. But that's basically impossible
with vcregress.pl, and the bf etc can't easily do it either until we
have a decent way to see the correct logfiles & output for individual
tests...


> > I think it'd be good if vcregress.pl at least respected PROVE_FLAGS from
> > the environment - it can't currently be passed in for several of the
> > vcregress.pl tests, and it does seem to make things to be at least
> > somewhat less painful.
> +1

K, will send a patch for that in a bit.


> > This makes it even clearer to me that we really need a builtin
> > testrunner that runs tests efficiently *and* debuggable on windows.
> >
> 
> "show me the code" :-)

The biggest obstacle on that front is perl. I started to write one, but
hit several perl issues within an hour. I think I might write one in
python, that'd be a lot less painful.


One windows build question I have is why the msvc infrastructure doesn't
accept msys perl in places like this:
guid  => $^O eq "MSWin32" ? Win32::GuidGen() : 'FAKE',
If I change them to accept msys perl then the build ends up working.

The reason it'd be nice to accept msys perl is that git for windows
bundles that - and that's already installed on most CI projects...

Greetings,

Andres Freund




Re: [PATCH] Support empty ranges with bounds information

2021-03-02 Thread Mark Dilger



> On Mar 2, 2021, at 12:51 PM, Joel Jacobson  wrote:
> 
> Yikes. Here be dragons. I think I want my wart free foot back please.
> 
> Many thanks for explaining. I think I’ll abandon this patch. I guess 
> implementing an entirely new range type could be an acceptable solution, but 
> that’s too big of a project for me to manage on my own. If any more 
> experienced hackers are interested in such a project, I would love to help if 
> I can.

Part of what was strange about arguing against your patch is that I kind of 
wanted the feature to work that way back when it originally got written.  (Not 
to say that it *should* have worked that way, just that part of me wanted it.)


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: We should stop telling users to "vacuum that database in single-user mode"

2021-03-02 Thread David Rowley
On Wed, 3 Mar 2021 at 01:12, Magnus Hagander  wrote:
>
> On Tue, Mar 2, 2021 at 7:52 AM David Rowley  wrote:
> > I have seen it happen that an instance has a vacuum_cost_limit set and
> > someone did start the database in single-user mode, per the advice of
> > the error message only to find that the VACUUM took a very long time
> > due to the restrictive cost limit. I struggle to imagine why anyone
> > wouldn't want the vacuum to run as quickly as possible in that
> > situation.
>
> Multiple instances running on the same hardware and only one of them
> being in trouble?

You might be right. I'm not saying it's a great idea but thought it
was worth considering.

We could turn to POLA and ask: what would you be more surprised at? 1)
Your database suddenly using more I/O than it had been previously, or 2)
your database no longer accepting DML.

David




Re: A qsort template

2021-03-02 Thread Daniel Gustafsson
> On 18 Feb 2021, at 04:09, Thomas Munro  wrote:

> In another thread[1], I proposed $SUBJECT, but then we found a better
> solution to that thread's specific problem.  The general idea is still
> good though: it's possible to (1) replace several existing copies of
> our qsort algorithm with one, and (2) make new specialised versions a
> bit more easily than the existing Perl generator allows.  So, I'm back
> with a rebased stack of patches.  I'll leave specific cases for new
> worthwhile specialisations for separate proposals; I've heard about
> several.

Just to play around with this while reviewing I made a qsort_strcmp, like in
the attached, and tested it using a ~9M word [0] randomly shuffled wordlist.
While that input is too small to make any meaningful difference in runtime (it
shaved a hair off, though that might well be within the error margin), there
was no regression either.  More importantly, it was really simple and quick to make a
tailored qsort which is the intention with the patch.  While still being a bit
of magic, moving from the Perl generator makes this slightly less magic IMO so
+1 on this approach.

A tiny nitpick on the patch itself:

+ *   - ST_COMPARE(a, b) - a simple comparison expression
+ *   - ST_COMPARE(a, b, arg) - variant that takes an extra argument
Indentation.

All tests pass, and the documentation in sort_template.h is enough to go
on, but I would prefer to see a comment in port/qsort.c referring back to
sort_template.h for documentation.

--
Daniel Gustafsson   https://vmware.com/

[0] https://github.com/dwyl/english-words/ shuffled 20 times over



qsort_tmpl_strcmp-patch
Description: Binary data


Re: pg_upgrade version checking questions

2021-03-02 Thread Daniel Gustafsson
> On 2 Mar 2021, at 14:20, Peter Eisentraut  
> wrote:
> 
> On 23.02.21 17:14, Daniel Gustafsson wrote:
>> This exports validate_exec to reduce duplication, and implements a custom
>> find_other_exec-like function in pg_upgrade to check each binary for the
>> version number.  Keeping a local copy of validate_exec is easy to do if it's
>> deemed not worth it to export it.
> 
> This looks mostly okay to me.

Thanks for reviewing!

> The commit message says something about "to ensure the health of the target 
> cluster", which doesn't make sense to me.  Maybe find a better wording.

Reworded in the attached updated version.

> The name find_exec() seems not very accurate.  It doesn't find anything.  
> Maybe "check"?

I'm not wild about check_exec(), but every other name I could think of was
drastically worse so I went with check_exec.

> I'm not sure why the new find_exec() adds EXE.  AFAIK, this is only required 
> for stat(), and validate_exec() already does it.

Good point, fixed.

--
Daniel Gustafsson   https://vmware.com/



v6-0001-Check-version-of-target-cluster-binaries.patch
Description: Binary data


Re: Allow matching whole DN from a client certificate

2021-03-02 Thread Jacob Champion
On Fri, 2021-02-26 at 15:40 -0500, Andrew Dunstan wrote:
> I think the thing that's principally outstanding w.r.t. this patch is
> what format we should use to extract the DN.

That and the warning label for sharp edges.

> Should we use RFC2253,
> which reverses the field order, as has been suggested upthread and is in
> the latest patch? I'm slightly worried that it might be a POLA
> violation.

All I can provide is the hindsight from httpd. [1] is the thread that
gave rise to its LegacyDNStringFormat.

Since RFC 2253 isn't a canonical encoding scheme, and we've already
established that different TLS implementations do things slightly
differently even when providing RFC-compliant output, maybe it doesn't
matter in the end: to get true compatibility, we need to implement a DN
matching scheme rather than checking string equality. But using RFC2253
for version 1 of the feature at least means that the *simplest* cases
are the same across backends, since I doubt the NSS implementation is
going to try to recreate OpenSSL's custom format.

--Jacob

[1] 
https://lists.apache.org/thread.html/2055b56985c69e7a6977151bf9817a0f982a4ad3b78a6a1984977fd0%401289507617%40%3Cusers.httpd.apache.org%3E


Re: buildfarm windows checks / tap tests on windows

2021-03-02 Thread Andrew Dunstan


On 3/2/21 7:48 AM, Andrew Dunstan wrote:
> On 3/1/21 3:07 PM, Andres Freund wrote:
>> Hi,
>>
>> As part of trying to make the aio branch tests on cirrus CI pass with
>> some tap tests I noticed that "src/tools/msvc/vcregress.pl recoverycheck"
>> hangs. A long phase of remote debugging later I figured out that that's
>> not a fault of the aio branch - it also doesn't pass on master (fixed in
>> [1]).
>>
>> Which confused me - shouldn't the buildfarm have noticed? But it seems
>> that all msvc animals either don't have tap tests enabled or they
>> disable 'misc-checks' which in turn is what the buildfarm client uses to
>> trigger all the 'recovery' checks.
>>
>> It seems we're not just skipping recovery, but also e.g. tap tests in
>> contrib, all the tests in src/test/modules/...
>>
>> Andrew, what's the reason for that? Is it just that they hung at some
>> point? Were too slow?
>
>
> I don't think speed is the issue. I probably disabled misc-tests on
> drongo and bowerbird (my two animals in question) because I got  either
> instability or errors I was unable to diagnose. I'll go back and take
> another look to narrow this down. It's possible to disable individual tests.



Well, I thought it was, but it wasn't. Fixed now, see



I've deployed this on drongo, which will now run almost all the TAP
tests. The exception is the recovery tests, because
021_row_visibility.pl crashes so badly it brings down the whole
buildfarm client.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: WIP: BRIN multi-range indexes

2021-03-02 Thread Alvaro Herrera
On 2021-Mar-03, Tomas Vondra wrote:

> 1) The 0001 patch allows passing of all scan keys to BRIN opclasses,
> which is needed for the minmax-multi to work. But it also modifies the
> existing opclasses (minmax and inclusion) to do this - but for those
> opclasses it does not make much difference, I think. Initially this was
> done because the patch did this for all opclasses, but then we added the
> detection based on number of parameters. So I wonder if we should just
> remove this, to make the patch a bit smaller. It'll also test the other
> code path (for opclasses without the last parameter).

I think it makes sense to just do them all in one pass.  I think trying
to keep the old way working just because it's how it was working
previously does not have much benefit.  I don't think we care about the
*patch* being small as much as the resulting *code* being as simple as
possible (but no simpler).


> 2) This needs bsearch_arg, but we only have that defined/used in
> extended_statistics_internal.h - it seems silly to include that here, as
> this has nothing to do with extended stats, so I simply added another
> prototype (which gets linked correctly). But, I suppose a better way
> would be to define our "portable" variant pg_bsearch_arg, next to
> pg_qsort etc.

Yeah, I think it makes sense to move bsearch_arg to a place where it's
more generally accessible (src/port/bsearch_arg.c I suppose), and make
both places use that.

-- 
Álvaro Herrera   Valdivia, Chile




Re: WIP: BRIN multi-range indexes

2021-03-02 Thread Tomas Vondra
On 3/3/21 12:44 AM, Alvaro Herrera wrote:
> On 2021-Mar-03, Tomas Vondra wrote:
> 
>> 1) The 0001 patch allows passing of all scan keys to BRIN opclasses,
>> which is needed for the minmax-multi to work. But it also modifies the
>> existing opclasses (minmax and inclusion) to do this - but for those
>> opclasses it does not make much difference, I think. Initially this was
>> done because the patch did this for all opclasses, but then we added the
>> detection based on number of parameters. So I wonder if we should just
>> remove this, to make the patch a bit smaller. It'll also test the other
>> code path (for opclasses without the last parameter).
> 
> I think it makes sense to just do them all in one pass.  I think trying
> to keep the old way working just because it's how it was working
> previously does not have much benefit.  I don't think we care about the
> *patch* being small as much as the resulting *code* being as simple as
> possible (but no simpler).
> 

That's kinda my point - I agree the size of the patch is not the primary
concern, but it makes the minmax/inclusion code a bit more complicated
(because they now have to loop over the keys), with very little benefit
(there might be some speedup, but IMO it's rather negligible).

Alternatively we could simply remove the code supporting the old API
with "consistent" functions without the additional parameter. But the
idea was to seamlessly support existing opclasses / not breaking them
unnecessarily (I know we don't guarantee that in major upgrades, but as
they may not benefit from this, why break them?). It'd simplify the code
in brin.c a little bit, but the opclasses a bit more complex.

> 
>> 2) This needs bsearch_arg, but we only have that defined/used in
>> extended_statistics_internal.h - it seems silly to include that here, as
>> this has nothing to do with extended stats, so I simply added another
>> prototype (which gets linked correctly). But, I suppose a better way
>> would be to define our "portable" variant pg_bsearch_arg, next to
>> pg_qsort etc.
> 
> Yeah, I think it makes sense to move bsearch_arg to a place where it's
> more generally accesible (src/port/bsearch_arg.c I suppose), and make
> both places use that.
> 

OK, will do.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Confusing behavior of psql's \e

2021-03-02 Thread Jacob Champion
On Wed, 2020-12-16 at 10:45 +0100, Laurenz Albe wrote:
> I consider this a bug fix, but one that shouldn't be backpatched.
> Re-executing the previous query if the editor is quit is
> annoying at least and dangerous at worst.

I like that this patch also clears the query buffer in the error case.
(For example, if I save the file but then decide I want to cancel
execution, the only choice is to issue an abortive :cq from Vim. The
current master-branch behavior is to just dump me back onto a
continuation prompt, and I have to manually \r the buffer. With this
patch, I'm returned to an initial prompt with a clear buffer. Very
nice.)

Some unexpected behavior I saw when testing this patch: occasionally I
would perform a bare \e, save the temporary file, and quit, only to
find that nothing was executed. What's happening is, I'm saving the
file too quickly, and the timestamp check (which has only second
precision) is failing! This isn't a problem in practice for the
explicit-filename case, because you probably didn't create the file
within the last second, but the temporary files are always zero seconds
old by definition. I could see this tripping up some people.
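
A standalone sketch of the failure mode (not psql's actual code; it only
shows that a second-granularity st_mtime comparison can miss an edit made
within the same second):

#include <stdio.h>
#include <sys/stat.h>

int
main(void)
{
	struct stat before, after;
	FILE	   *f = fopen("/tmp/psql_edit_demo", "w");

	if (!f)
		return 1;
	fputs("SELECT 1;\n", f);
	fclose(f);

	stat("/tmp/psql_edit_demo", &before);

	/* the "edit": happens within the same wall-clock second */
	f = fopen("/tmp/psql_edit_demo", "a");
	if (!f)
		return 1;
	fputs("-- edited\n", f);
	fclose(f);

	stat("/tmp/psql_edit_demo", &after);
	if (after.st_mtime == before.st_mtime)
		printf("edit is invisible to a second-granularity check\n");
	return 0;
}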

--Jacob


Re: WIP: BRIN multi-range indexes

2021-03-02 Thread Alvaro Herrera
On 2021-Mar-03, Tomas Vondra wrote:

> That's kinda my point - I agree the size of the patch is not the primary
> concern, but it makes the minmax/inclusion code a bit more complicated
> (because they now have to loop over the keys), with very little benefit
> (there might be some speedup, but IMO it's rather negligible).

Yeah, OK.

> Alternatively we could simply remove the code supporting the old API
> with "consistent" functions without the additional parameter. But the
> idea was to seamlessly support existing opclasses / not breaking them
> unnecessarily (I know we don't guarantee that in major upgrades, but as
> they may not benefit from this, why break them?). It'd simplify the code
> in brin.c a little bit, but the opclasses a bit more complex.

Well, I doubt any opclass-support functions exist outside of core.
Or am I just outdated and we do know of some?

-- 
Álvaro Herrera   Valdivia, Chile
"Escucha y olvidarás; ve y recordarás; haz y entenderás" (Confucio)




Re: PROXY protocol support

2021-03-02 Thread Jacob Champion
On Tue, 2021-03-02 at 18:43 +0100, Magnus Hagander wrote:
> PFA a simple patch that implements support for the PROXY protocol.

I'm not all the way through the patch yet, but this part jumped out at
me:

> + if (memcmp(proxyheader.sig, 
> "\x0d\x0a\x0d\x0a\x00\x0d\x0a\x51\x55\x49\x54\x0a", sizeof(proxyheader.sig)) 
> != 0)
> + {
> + /*
> +  * Data is there but it wasn't a proxy header. Also fall 
> through to
> +  * normal processing
> +  */
> + pq_endmsgread();
> + return STATUS_OK;

From my reading, the spec explicitly disallows this sort of fallback
behavior:

> The receiver MUST be configured to only receive the protocol described in this
> specification and MUST not try to guess whether the protocol header is present
> or not. This means that the protocol explicitly prevents port sharing between
> public and private access.

You might say, "if we already trust the proxy server, why should we
care?" but I think the point is that you want to catch
misconfigurations where the middlebox is forwarding bare TCP without
adding a PROXY header of its own, which will "work" for innocent
clients but in reality is a ticking timebomb. If you've decided to
trust an intermediary to use PROXY connections, then you must _only_
accept PROXY connections from that intermediary. Does that seem like a
reasonable interpretation?
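
For illustration, a standalone sketch of the strict behavior (the signature
constant comes from the PROXY v2 spec; everything else here is made up and is
not the patch's code):

#include <stdio.h>
#include <string.h>

static const unsigned char proxy_v2_sig[12] =
	"\x0d\x0a\x0d\x0a\x00\x0d\x0a\x51\x55\x49\x54\x0a";

/* Returns 0 only if buf starts with a PROXY v2 signature. */
static int
check_proxy_header(const unsigned char *buf, size_t len)
{
	if (len < sizeof(proxy_v2_sig) ||
		memcmp(buf, proxy_v2_sig, sizeof(proxy_v2_sig)) != 0)
		return -1;				/* reject; no fallback to normal processing */
	return 0;
}

int
main(void)
{
	const unsigned char bare_tcp[] = "GET / HTTP/1.0\r\n";

	printf("%s\n", check_proxy_header(bare_tcp, sizeof(bare_tcp)) == 0 ?
		   "PROXY header present" : "reject connection");
	return 0;
}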

--Jacob


Re: buildfarm windows checks / tap tests on windows

2021-03-02 Thread Andres Freund
Hi,

On 2021-03-02 17:04:16 -0500, Andrew Dunstan wrote:
> Well, I though it was, but it wasn't. Fixed now, see
> 
> 
> 
> I've deployed this on drongo, which will now run almost all the TAP
> tests.

Thanks!


> The exception is the recovery tests, because
> 021_row_visibility.pl crashes so badly it brings down the whole
> buildfarm client.

It still does, even after

commit 1e6e40447115ca7b4749d7d117b81b016ee5e2c2 (upstream/master, master)
Author: Andres Freund 
Date:   2021-03-01 09:52:15 -0800

Fix recovery test hang in 021_row_visibility.pl on windows.

? I didn't see failures after that?

Greetings,

Andres Freund




Re: WIP: BRIN multi-range indexes

2021-03-02 Thread Tomas Vondra



On 3/3/21 1:17 AM, Alvaro Herrera wrote:
> On 2021-Mar-03, Tomas Vondra wrote:
> 
>> That's kinda my point - I agree the size of the patch is not the
>> primary concern, but it makes the minmax/inclusion code a bit more
>> complicated (because they now have to loop over the keys), with
>> very little benefit (there might be some speedup, but IMO it's
>> rather negligible).
> 
> Yeah, OK.
> 
>> Alternatively we could simply remove the code supporting the old
>> API with "consistent" functions without the additional parameter.
>> But the idea was to seamlessly support existing opclasses / not
>> breaking them unnecessarily (I know we don't guarantee that in
>> major upgrades, but as they may not benefit from this, why break
>> them?). It'd simplify the code in brin.c a little bit, but the
>> opclasses a bit more complex.
> 
> Well, I doubt any opclass-support functions exist outside of core. Or
> am I just outdated and we do know of some?
> 

I'm not aware of any. This was proposed by Nikita Glukhov:

https://www.postgresql.org/message-id/49cb668f-d6f9-3493-681d-7d40b715ef64%40postgrespro.ru

Alexander Korotkov seemed to agree with Nikita, but I don't recall any
references to actual existing opclasses.

Then again - the amount of code to support two signatures is fairly
small. If we decide to get rid of it, then fine - but the new complexity
in minmax/inclusion likely exceeds that.

Which is why I was thinking about still supporting both signatures, but
reverting the changes in brin_minmax and brin_inclusion.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: buildfarm windows checks / tap tests on windows

2021-03-02 Thread Michael Paquier
On Tue, Mar 02, 2021 at 04:54:57PM -0800, Andres Freund wrote:
> It still does, even after
> 
> commit 1e6e40447115ca7b4749d7d117b81b016ee5e2c2 (upstream/master, master)
> Author: Andres Freund 
> Date:   2021-03-01 09:52:15 -0800
> 
> Fix recovery test hang in 021_row_visibility.pl on windows.
> 
> ? I didn't see failures after that?

Yes.  Testing this morning on top of 5b2f2af, it fails for me with a
"Terminating on signal SIGBREAK".

Having a support for PROVE_TESTS would be nice in src/tools/msvc/,
wrapping any ENV{PROVE_TESTS} value within an extra glob() before
passing that down to the prove command.
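
Something like this, perhaps (a sketch only; the variable names and the
surrounding vcregress.pl plumbing are assumptions):

my @prove_tests;
if ($ENV{PROVE_TESTS})
{
	@prove_tests = map { glob($_) } split(/\s+/, $ENV{PROVE_TESTS});
}
else
{
	@prove_tests = glob("t/*.pl");
}
# then: system("prove", @flags, @prove_tests);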
--
Michael


signature.asc
Description: PGP signature


Re: Add --tablespace option to reindexdb

2021-03-02 Thread Michael Paquier
On Tue, Mar 02, 2021 at 10:01:43AM +0100, Daniel Gustafsson wrote:
> +1, no objections from me after a readthrough of this version.

Thanks, Daniel and Mark.  I have applied v2 from upthread, then.
--
Michael


signature.asc
Description: PGP signature


PITR promote bug: Checkpointer writes to older timeline

2021-03-02 Thread Soumyadeep Chakraborty
Hello hackers,

We came across an issue where the checkpointer writes to the older
timeline while a promotion is ongoing after reaching the recovery point
in a PITR, when there are prepared transactions before the recovery
point. We came across this issue first in REL_12_STABLE and saw that it
also exists in devel.

Setup:
PFA a minimal reproduction script repro.txt.

After running the script, we notice that the checkpointer has written
the end-of-recovery shutdown checkpoint in the previous timeline (tli =
1), i.e. it wrote into the WAL segment 00010003 instead
of writing to the WAL segment 00020003, causing it to
overwrite WAL records past the recovery point (please see attached diff
output file waldiff.diff) in 00010003.

Also, performing a subsequent shutdown on the recovered server may cause
the next shutdown checkpoint record to be written, again, to the
previous timeline, i.e. to 00010003. A subsequent server
start will fail as the startup process will be unable to find the
checkpoint in the latest timeline (00020003) and we will
get:

...
LOG:  invalid record length at 0/3016FB8: wanted 24, got 0
LOG:  invalid primary checkpoint record
PANIC:  could not locate a valid checkpoint record
...

RCA:

When there are prepared transactions in an older timeline, in the
checkpointer, a call to CheckPointTwoPhase() and subsequently to
XlogReadTwoPhaseData() and subsequently to read_local_xlog_page() leads
to the following line:

read_upto = GetXLogReplayRecPtr(&ThisTimeLineID);

GetXLogReplayRecPtr() will change ThisTimeLineID to 1, in order to read
the two phase WAL records in the older timeline. This variable will
remain unchanged and the checkpointer ends up writing the checkpoint
record into the older WAL segment (when XLogBeginInsert() is called
within CreateCheckPoint(), the value is still 1). The value is not
synchronized as even if RecoveryInProgress() is called,
xlogctl->SharedRecoveryState is not RECOVERY_STATE_DONE
(SharedRecoveryInProgress = true in older versions) as the startup
process waits for the checkpointer inside RequestCheckpoint() (since
recovery_target_action='promote' involves a non-fast promotion). Thus,
InitXLOGAccess() is not called and the value of ThisTimeLineID is not
updated before the checkpoint record write.

Since 1148e22a82e, GetXLogReplayRecPtr() is called with ThisTimeLineID
instead of a local variable, within read_local_xlog_page().

PFA a small patch that fixes the problem by explicitly calling
InitXLOGAccess() in CheckPointTwoPhase(), after the two phase state data
is read, in order to update ThisTimeLineID to the latest timeline. It is
okay to call InitXLOGAccess() as it is lightweight and would mostly be
a no-op.

Regards,
Soumyadeep, Kevin and Jimmy
VMWare
#! /bin/sh

# Primary setup

/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
echo "archive_mode = on
archive_command = 'cp %p /tmp/archive/%f'
max_prepared_transactions = 10" >> /usr/local/pgsql/data/postgresql.conf

/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l /tmp/logfile_p start

# Secondary setup

/usr/local/pgsql/bin/pg_basebackup -D /usr/local/pgsql/data_s -X stream && 
touch /usr/local/pgsql/data_s/recovery.signal
echo "port = 6432
restore_command = 'cp /tmp/archive/%f %p'
recovery_target_name = 'rp'
recovery_target_action = 'promote'
max_prepared_transactions = 20" >> /usr/local/pgsql/data_s/postgresql.conf

# Workload
echo "CREATE TABLE foo(i int);
BEGIN;
INSERT INTO foo VALUES(1);
PREPARE TRANSACTION 'fooinsert';
SELECT pg_create_restore_point('rp');
INSERT INTO foo VALUES(2);
SELECT pg_switch_wal();
" > /tmp/workload.sql
/usr/local/pgsql/bin/psql postgres -f /tmp/workload.sql

/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data_s -l /tmp/logfile_s start

sleep 5

/usr/local/pgsql/bin/pg_waldump /tmp/archive/00010003 > /tmp/archive_00010003
/usr/local/pgsql/bin/pg_waldump /usr/local/pgsql/data_s/pg_wal/00010003 > /tmp/standby_00010003

diff -u /tmp/archive_00010003 /tmp/standby_00010003 > /tmp/waldiff.diff
cat /tmp/waldiff.diff
From dbf5a9f6899bedf28b482fc03a4a2b0571e92e9b Mon Sep 17 00:00:00 2001
From: Soumyadeep Chakraborty 
Date: Tue, 2 Mar 2021 17:41:20 -0800
Subject: [PATCH 1/1] Prevent checkpointer from writing to older timeline

Co-authored-by: Kevin Yeap 
---
 src/backend/access/transam/twophase.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 80d2d20d6cc..081aee6217e 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1740,6 +1740,16 @@ CheckPointTwoPhase(XLogRecPtr redo_horizon)
 	}
 	LWLockRelease(TwoPhaseStateLock);
 
+	if (serialized_xacts > 0)
+	{
+		/*
+		 * In order to ensure that we write the checkpoint record to the latest
+		 * timeline, refresh ThisTimeLineID.
+		 */
+		InitXLOGAccess();
+	}

Re: Removing vacuum_cleanup_index_scale_factor

2021-03-02 Thread Peter Geoghegan
On Mon, Mar 1, 2021 at 10:33 PM Masahiko Sawada  wrote:
> The original design that made VACUUM set
> pg_class.reltuples/pg_class.relpages in indexes (from 15+ years ago)
> assumed that it was cheap to handle statistics in passing. Even if we
> have btvacuumcleanup() not do an index scan at all, this is 100%
> allowed by the amvacuumcleanup contract described in the
> documentation:
>
> "It is OK to return NULL if the index was not changed at all during
> the VACUUM operation, but otherwise correct stats should be returned."
>
> The above description was added by commit e57345975cf in 2006 and
> hasn't changed for now.

The intention here is not to revise the amvacuumcleanup() contract --
the contract already allows us to do what we want inside nbtree. We
want to teach btvacuumcleanup() to not do any real work, at least
outside of rare cases where we have known deleted pages that must
still be placed in the FSM for recycling -- btvacuumcleanup() would
generally just return NULL when there was no btbulkdelete() call
during the same VACUUM operation (the main thing that prevents this
today is vacuum_cleanup_index_scale_factor). More generally, we would
like to change the *general* expectations that we make of index AMs in
places like vacuumlazy.c and analyze.c. But we're worried about
dependencies that aren't formalized anywhere but still matter -- code
may have evolved to assume that index AMs behaved a certain way in
common and important cases (probably also in code like vacuumlazy.c).
That's what we want to avoid breaking.

Masahiko has already given an example of such a problem: currently,
VACUUM ANALYZE simply assumes that its VACUUM will call each indexes'
amvacuumcleanup() routine in all cases, and will have each call set
pg_class.reltuples and pg_class.relpages in respect of each index.
ANALYZE therefore avoids overwriting indexes' pg_class stats inside
do_analyze_rel() (at the point where it calls vac_update_relstats()
for each index). That approach is already wrong with hash indexes, but
under this new directly for btvacuumcleanup(), it would become wrong
in just the same way with nbtree indexes (if left unaddressed).

Clearly "the letter of the law" and "the spirit of the law" must both
be considered. We want to talk about the latter on this thread.
Concretely, here are specific questions (perhaps Tom can weigh in on
these as the principal designer of the relevant interfaces):

1. Any objections to the idea of teaching VACUUM ANALYZE to
distinguish between the cases where VACUUM ran and performed "real
index vacuuming", to make it more intelligent about overwriting
pg_class stats for indexes?

I define "real index vacuuming" as calling any indexes ambulkdelete() routine.

2. Does anybody anticipate any other issues? Possibly an issue that
resembles this existing known VACUUM ANALYZE issue?

Thanks
-- 
Peter Geoghegan




Re: Libpq support to connect to standby server as priority

2021-03-02 Thread Tom Lane
Greg Nancarrow  writes:
> I've marked this as "Ready for Committer".

I've pushed this after whacking it around a fair amount.  A lot of
that was cosmetic, but one thing that wasn't is that I got rid of the
proposed "which_primary_host" variable.  I thought the logic around
that was way too messy and probably buggy.  Even if it worked exactly
as intended, I'm dubious that the design intention was good.  I think
it makes more sense just to go through the whole server list again
without the restriction to standby servers.  In particular, that will
give saner results if the servers' status isn't holding still.

regards, tom lane




Re: buildfarm windows checks / tap tests on windows

2021-03-02 Thread Andrew Dunstan


On 3/2/21 8:27 PM, Michael Paquier wrote:
> On Tue, Mar 02, 2021 at 04:54:57PM -0800, Andres Freund wrote:
>> It still does, even after
>>
>> commit 1e6e40447115ca7b4749d7d117b81b016ee5e2c2 (upstream/master, master)
>> Author: Andres Freund 
>> Date:   2021-03-01 09:52:15 -0800
>>
>> Fix recovery test hang in 021_row_visibility.pl on windows.
>>
>> ? I didn't see failures after that?
> Yes.  Testing this morning on top of 5b2f2af, it fails for me with a
> "Terminating on signal SIGBREAK".
>
> Having a support for PROVE_TESTS would be nice in src/tools/msvc/,
> wrapping any ENV{PROVE_TESTS} value within an extra glob() before
> passing that down to the prove command.



Yes, I saw something similar this morning, which would have been after that
commit.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: New IndexAM API controlling index vacuum strategies

2021-03-02 Thread Masahiko Sawada
On Tue, Mar 2, 2021 at 12:00 PM Peter Geoghegan  wrote:
>
> On Sun, Feb 21, 2021 at 10:28 PM Masahiko Sawada  
> wrote:
> > Sorry for the late response.
>
> Me too!

No problem, thank you for your comment!

>
> > 1. Whereas skipping index vacuum and heap vacuum is a very attractive
> > improvement, if we skip that by default I wonder if we need a way to
> > disable it. Vacuum plays a role in cleaning and diagnosing tables in
> > practice. So in a case where the table is bad state and the user wants
> > to clean all heap pages, it would be good to have a way to disable
> > this skipping behavior. One solution would be that index_cleanup
> > option has three different behaviors: on, auto (or smart), and off. We
> > enable this skipping behavior by default in ‘auto’ mode, but
> > specifying "INDEX_CLEANUP true” means to enforce index vacuum and
> > therefore disabling it.
>
> Sounds reasonable to me. Maybe users should express the skipping
> behavior that they desire in terms of the *proportion* of all heap
> blocks with LP_DEAD line pointers that we're willing to have while
> still skipping index vacuuming + lazy_vacuum_heap() heap scan. In
> other words, it can be a "scale" type GUC/param (though based on heap
> blocks *at the end* of the first heap scan, not tuples at the point
> the av launcher considers launching AV workers).

A scale type parameter seems good to me, but I wonder how users could
tune that parameter. We already have tuple-based parameters such as
autovacuum_vacuum_scale_factor/threshold, and I think that users
basically don't pay attention to how many blocks their table updates
end up touching.

And I'm concerned that my above idea could confuse users since what we
want to control is both heap vacuum and index vacuum but it looks like
controlling only index vacuum.

The third idea is a VACUUM command option like DISABLE_PAGE_SKIPPING
to disable such skipping behavior. I imagine that a
user-controllable option to enforce both heap vacuum and index vacuum
would also be required in the future when we have the vacuum strategy
feature (e.g., incremental vacuum).
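
Roughly, the options being floated would look like this (all hypothetical;
today INDEX_CLEANUP only accepts on/off and DISABLE_INDEX_SKIPPING does not
exist):

VACUUM (INDEX_CLEANUP auto) foo;       -- default: may skip index vacuuming
VACUUM (INDEX_CLEANUP on) foo;         -- force index vacuuming
VACUUM (DISABLE_INDEX_SKIPPING) foo;   -- explicit override, the third idea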

>
> > @@ -1299,6 +1303,7 @@ lazy_scan_heap(Relation onerel, VacuumParams
> > *params, LVRelStats *vacrelstats,
> > {
> > lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
> > all_visible = false;
> > +   has_dead_tuples = true;
> > continue;
> > }
> >
> > I added the above change in the patch to count the number of heap
> > pages having at least one LP_DEAD line pointer. But it's weird to me
> > that we have never set has_dead_tuples true when we found an LP_DEAD
> > line pointer.
>
> I think that you're right. However, in practice it isn't harmful
> because has_dead_tuples is only used when "all_visible = true", and
> only to detect corruption (which should never happen). I think that it
> should be fixed as part of this work, though.

Agreed.

>
> Lots of stuff in this area is kind of weird already. Sometimes this is
> directly exposed to users, even. This came up recently, when I was
> working on VACUUM VERBOSE stuff. (This touched the precise piece of
> code you've patched in the quoted diff snippet, so perhaps you know
> some of the story I will tell you now already.)
>
> I recently noticed that VACUUM VERBOSE can report a very low
> tups_vacuumed/"removable heap tuples" when run against tables where
> most pruning is opportunistic pruning rather than VACUUM pruning
> (which is very common), provided there are no HOT updates (which is
> common but not very common). This can be very confusing, because
> VACUUM VERBOSE will report a "tuples_deleted" for the heap relation
> that is far far less than the "tuples_removed" it reports for indexes
> on the same table -- even though both fields have values that are
> technically accurate (e.g., not very affected by concurrent activity
> during VACUUM, nothing like that).
>
> This came to my attention when I was running BenchmarkSQL for the
> 64-bit XID deleted pages patch. One of the BenchmarkSQL tables showed
> this (though only one -- the table whose UPDATEs are not HOT safe, which
> is unique among BenchmarkSQL/TPC-C tables). I pushed a commit with comment
> changes [1] to make that aspect of VACUUM VERBOSE a little less
> confusing. (I was actually running a quick-and-dirty hack that made
> log_autovacuum show VACUUM VERBOSE index stuff -- I would probably
> have missed the weird difference between heap tups_vacuumed and index
> tuples_removed without this custom log_autovacuum hack.)

That's true. I didn't know that.

>
> Just to be clear (I think you agree already): we should base any
> triggering logic for skipping index vacuuming/lazy_vacuum_heap() on
> logic that does not care *when* heap pages first contained LP_DEAD
> line pointers (could be that they were counted in tups_vacuumed due to
> being pruned during this VACUUM operation, could be from an earlier
> opportunistic pruning, etc).

Agreed. We should base it on logic that doesn't care when the LP_DEAD
line pointers first appeared.

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2021-03-02 Thread Peter Geoghegan
On Mon, Mar 1, 2021 at 11:22 AM Mark Dilger
 wrote:
> If bt_index_check() and bt_index_parent_check() are to have this 
> functionality, shouldn't there be an option controlling it much as the option 
> (heapallindexed boolean) controls checking whether all entries in the heap 
> are indexed in the btree?  It seems inconsistent to have an option to avoid 
> checking the heap for that, but not for this.

I agree. Actually, it should probably use the same snapshot as the
heapallindexed=true case. So either only perform unique constraint
verification when that option is used, or invent a new option that
will still share the snapshot used by heapallindexed=true (when the
options are combined).

> The regression test you provided is not portable.  I am getting lots of 
> errors due to differing output of the form "page lsn=0/4DAD7E0".  You might 
> turn this into a TAP test and use a regular expression to check the output.

I would test this using a custom opclass that does simple fault
injection. For example, an opclass that indexes integers, but can be
configured to dynamically make 0 values equal or unequal to each
other. That's more representative of real-world problems.

You "break the warranty" by updating pg_index, even compared to
updating other system catalogs. In particular, you break the
"indcheckxmin wait -- wait for xmin to be old before using index"
stuff in get_relation_info(). So it seems worse than updating
pg_attribute, for example (which is something that the tests do
already).

-- 
Peter Geoghegan




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2021-03-02 Thread Peter Geoghegan
On Mon, Feb 8, 2021 at 2:46 AM Pavel Borisov  wrote:
> Caveat: if the first entry on the next index page has a key equal to the key 
> on a previous page AND all heap tid's corresponding to this entry are 
> invisible, currently cross-page check can not detect unique constraint 
> violation between previous index page entry and 2nd, 3d and next current 
> index page entries. In this case, there would be a message that recommends 
> doing VACUUM to remove the invisible entries from the index and repeat the 
> check. (Generally, it is recommended to do vacuum before the check, but for 
> the testing purpose I'd recommend turning it off to check the detection of 
> visible-invisible-visible duplicates scenarios)

It's rather unlikely that equal values in a unique index will end up
on different leaf pages. It's really rare, in fact. This following
comment block from nbtinsert.c (which appears right before we call
_bt_check_unique()) explains why this is:

* It might be necessary to check a page to the right in _bt_check_unique,
* though that should be very rare.  In practice the first page the value ...

You're going to have to "couple" buffer locks in the style of
_bt_check_unique() (as well as keeping a buffer lock on "the first
leaf page a duplicate might be on" throughout) if you need the test to
work reliably. But why bother with that? The tool doesn't have to be
100% perfect at detecting corruption (nothing can be), and it's rather
unlikely that it will matter for this test. A simple test that doesn't
handle cross-page duplicates is still going to be very effective.

I don't think that it's acceptable for your new check to raise a
WARNING instead of an ERROR. I especially don't like that the new
unique checking functionality merely warns that the index *might* be
corrupt. False positives are always unacceptable within amcheck, and I
think that this is a false positive.

-- 
Peter Geoghegan




Buildfarm failure in crake

2021-03-02 Thread vignesh C
Hi,

I noticed a buildfarm failure on crake; it fails with the
following error:
Mar 02 21:22:56 ./src/test/recovery/t/001_stream_rep.pl: Variable
declared in conditional statement at line 88, column 2.  Declare
variables outside of the condition.
([Variables::ProhibitConditionalDeclarations] Severity: 5)

Since the assignment involves a conditional statement, the variable
declaration and the assignment need to be split. I have attached a
patch that fixes the problem.
Thoughts?

Regards,
Vignesh
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 07a9912..cf5211d 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -84,8 +84,9 @@ sub test_target_session_attrs
 	my $node2_host = $node2->host;
 	my $node2_port = $node2->port;
 	my $node2_name = $node2->name;
+	my $target_name;
 
-	my $target_name = $target_node->name if (defined $target_node);
+	$target_name = $target_node->name if (defined $target_node);
 
 	# Build connection string for connection attempt.
 	my $connstr = "host=$node1_host,$node2_host ";


Re: [PATCH] psql : Improve code for help option

2021-03-02 Thread miyake_kouta

2021-03-03 00:09, Fujii Masao wrote:

We can simplify the code more and remove "env = user"
by just using "user" instead of "env" in the above?


Thank you for your comment.
I updated my patch and replaced "env" with "user".
Please check.

Regards.
--
Kota Miyake
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index e44120bf76..edb17ed5dc 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -133,11 +133,7 @@ usage(unsigned short int pager)
 	env = getenv("PGPORT");
 	fprintf(output, _("  -p, --port=PORT  database server port (default: \"%s\")\n"),
 			env ? env : DEF_PGPORT_STR);
-	/* Display default user */
-	env = getenv("PGUSER");
-	if (!env)
-		env = user;
-	fprintf(output, _("  -U, --username=USERNAME  database user name (default: \"%s\")\n"), env);
+	fprintf(output, _("  -U, --username=USERNAME  database user name (default: \"%s\")\n"), user);
 	fprintf(output, _("  -w, --no-passwordnever prompt for password\n"));
 	fprintf(output, _("  -W, --password   force password prompt (should happen automatically)\n"));
 


Re: Identify missing publications from publisher while create/alter subscription.

2021-03-02 Thread Euler Taveira
On Wed, Feb 3, 2021, at 2:13 AM, Bharath Rupireddy wrote:
> On Mon, Jan 25, 2021 at 10:32 PM vignesh C  wrote:
> 
> > If a publication which does not exist is specified during create 
> > subscription, then we should throw an error similar to step 2 behavior. 
> > Similarly if a publication which does not exist is specified during alter 
> > subscription, then we should throw an error similar to step 4 behavior. If 
> > publication is dropped after subscription is created, this should be 
> > removed when an alter subscription subname refresh publication is performed 
> > similar to step 4.
> > Thoughts?
Postgres implemented a decoupled replication mechanism, which means that
both parties can perform "uncoordinated" actions. As you have shown, a CREATE
SUBSCRIPTION that references a non-existent publication is one of them. I
think that being permissive in some situations can prevent you from painting
yourself into a corner. Even if you try to be strict on the subscriber side,
the other side (publisher) can impose additional complexity on you.

You are arguing that the initial step, CREATE SUBSCRIPTION, should have a
strict mechanism. That is a fair point. However, imposing this strictness on
the other SUBSCRIPTION commands should be carefully thought through. If we go
that route, I suggest keeping the current behavior available as an option. The
reasons are: (i) it is backward-compatible, (ii) it allows us some known
flexibility (non-existent publications), and (iii) it would probably allow us
to fix a scenario created by the strict mode. This strict mode could be
implemented via a new parameter validate_publication (ENOCAFFEINE; feel free
to propose a better name) that checks whether the publication is available
when you execute CREATE SUBSCRIPTION. A similar parameter could be used in
ALTER SUBSCRIPTION ... SET PUBLICATION and ALTER SUBSCRIPTION ... REFRESH
PUBLICATION.
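
To make the idea concrete, a rough sketch of how that strict mode could
look (validate_publication is only the placeholder name suggested above,
not an existing option):

    -- Fail instead of silently accepting a missing publication.
    CREATE SUBSCRIPTION mysub
        CONNECTION 'host=pub1 dbname=postgres'
        PUBLICATION mypub
        WITH (validate_publication = true);

    -- The same switch applied when changing the publication list.
    ALTER SUBSCRIPTION mysub SET PUBLICATION mypub
        WITH (validate_publication = true);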

> To not mandate any new behaviour, I would suggest to have a new option
> for ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH
> (remove_dropped_publications = false). The new option
> remove_dropped_publications will have a default value false, when set
> to true it will check if the publications that are present in the
> subscription list are actually existing on the publisher or not, if
> not remove them from the list. And also in the documentation we need
> to clearly mention the consequence of this new option setting to true,
> that is, the dropped publications if created again will need to be
> added to the subscription list again.
REFRESH PUBLICATION is not the right command to remove publications. There is a 
command for it: ALTER SUBSCRIPTION ... SET PUBLICATION.
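
For clarity, the existing way to prune a dropped publication from a
subscription (the publication name is illustrative):

    -- Re-list only the publications that should remain.
    ALTER SUBSCRIPTION mysub SET PUBLICATION surviving_pub;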

   
The other alternative is to document that non-existent publication names can be
present in the subscription catalog and are ignored while executing
SUBSCRIPTION commands. You could also propose a NOTICE/WARNING that informs the
user that the SUBSCRIPTION command references a non-existent publication.
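
A hypothetical session showing what that diagnostic might look like (this
WARNING does not exist today; the wording is invented for illustration):

    CREATE SUBSCRIPTION mysub
        CONNECTION 'host=pub1 dbname=postgres'
        PUBLICATION no_such_pub;
    -- WARNING:  publication "no_such_pub" does not exist on the publisher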


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Track replica origin progress for Rollback Prepared

2021-03-02 Thread Amit Kapila
On Wed, Jan 13, 2021 at 2:03 PM Ajin Cherian  wrote:
>
> On Wed, Jan 13, 2021 at 6:28 PM Ajin Cherian  wrote:
>
> > I added the below log:
> Small correction, I sent the wrong code change for logging, this was
> the correct one:
>

Okay, thanks. I have rebased the patch and updated comments as
suggested by Michael. Additionally, I ran pgindent as well.

Michael, I want to push this patch soon unless you have any concerns.
This is required for correctness if any plugin uses 2PC and origins to
track the progress. Now that we have exposed the two_phase option via
a plugin, it would be good to get this in as well.
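
For anyone following along, the tracked progress can be inspected like
this (the origin name is illustrative):

    -- Returns the remote LSN recorded for the origin; with this fix it
    -- is also advanced by ROLLBACK PREPARED, not just by commits.
    SELECT pg_replication_origin_progress('my_origin', true);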

-- 
With Regards,
Amit Kapila.


v2-0001-Track-replication-origin-progress-for-rollbacks.patch
Description: Binary data


Re: New IndexAM API controlling index vacuum strategies

2021-03-02 Thread Peter Geoghegan
On Tue, Mar 2, 2021 at 6:44 PM Masahiko Sawada  wrote:
> A scale type parameter seems good to me but I wonder if how users can
> tune that parameter. We already have tuple-based parameters such as
> autovacuum_vacuum_scale_factor/threshold and I think that users
> basically don't pay attention to that table updates result in how many
> blocks.

Fair. The scale thing was just a random suggestion, nothing to take
too seriously.

> The third idea is a VACUUM command option like DISABLE_PAGE_SKIPPING
> to disable such skipping behavior. I imagine that the
> user-controllable-option to enforce both heap vacuum and index vacuum
> would be required also in the future when we have the vacuum strategy
> feature (i.g., incremental vacuum).

Yeah, I'm worried about conflicting requirements here -- this patch
and the next patch (that pushes the same ideas further) might have
different requirements.

I think that this patch will mostly be useful in cases where there are
very few LP_DEAD-containing heap pages, but consistently more than
zero. So it's probably not easy to tune.

What we might want is an on/off switch. But why? DISABLE_PAGE_SKIPPING
was added because the freeze map work in 9.6 was considered high risk
at the time, and we needed to have a tool to manage that risk. But
this patch doesn't seem nearly as tricky. No?
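
For context, the existing knobs in this area (both are real options; the
table name is a placeholder):

    VACUUM (DISABLE_PAGE_SKIPPING) mytable;  -- don't trust the visibility map
    VACUUM (INDEX_CLEANUP FALSE) mytable;    -- skip index vacuuming entirely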

> > Lots of stuff in this area is kind of weird already. Sometimes this is
> > directly exposed to users, even. This came up recently, when I was
> > working on VACUUM VERBOSE stuff.

> That's true. I didn't know that.

It occurs to me that "tups_vacuumed vs. total LP_DEAD Items in heap
after VACUUM finishes" is similar to "pages_newly_deleted vs.
pages_deleted" for indexes. An easy mistake to make!

> > https://www.postgresql.org/message-id/flat/20130108024957.GA4751%40tornado.leadboat.com
> >
> > Of course, this effort to eliminate the "tupgone =
> > true"/XLOG_HEAP2_CLEANUP_INFO special case didn't go anywhere at the
> > time.
>
> I'll look at that thread.

I'm not sure if it's super valuable to look at the thread. But it is
reassuring to see that Noah shared the intuition that the "tupgone =
true" case was kind of bad, even back in 2013. It's one part of my
"mental map" of VACUUM.

-- 
Peter Geoghegan



