Re: Include access method in listTables output

2020-08-19 Thread Justin Pryzby
On Mon, Aug 17, 2020 at 11:10:05PM +0530, vignesh C wrote:
> On Sat, Aug 1, 2020 at 8:12 AM vignesh C  wrote:
> > > >
> > > > +-- access method column should not be displayed for sequences
> > > > +\ds+
> > > >
> > > > -  List of relations
> > > >
> > > >
> > > > -   Schema | Name | Type | Owner | Persistence | Size | Description
> > > > ++--+--+---+-+--+-
> > > > +(0 rows)
> > > >
> > > > We can include one test for view.
> 
> I felt adding one test for a view is good and added it.
> Attached a new patch for the same.
> 
> I felt the patch is in shape for a committer to have a look at it.

The patch tester shows it's failing xmllint; could you send a fixed patch?

/usr/bin/xmllint --path . --noout --valid postgres.sgml
ref/psql-ref.sgml:1189: parser error : Opening and ending tag mismatch: link 
line 1187 and para

http://cfbot.cputube.org/georgios-kokolatos.html

-- 
Justin




v13: show extended stats target in \d

2020-08-30 Thread Justin Pryzby
The stats target can be set since commit d06215d03, but wasn't shown by psql.
 ALTER STATISTICS .. SET STATISTICS n.

Normal (1-D) stats targets are shown in \d+ table.
Stats objects are shown in \d (no plus).
Arguably, this should be shown only in "verbose" mode (\d+).
From 60a4033a04fbaafbb01c2bb2d73bb2fbe4d1da75 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 30 Aug 2020 23:38:17 -0500
Subject: [PATCH 1/1] Show stats target of extended statistics

This can be set since d06215d03, but wasn't shown by psql.
Normal (1-D) stats targets are shown in \d+ table.
---
 src/bin/psql/describe.c |  8 +++-
 src/test/regress/expected/stats_ext.out | 18 ++
 src/test/regress/sql/stats_ext.sql  |  2 ++
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index d81f1575bf..73befa36ee 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2683,10 +2683,12 @@ describeOneTableDetails(const char *schemaname,
 			  "a.attnum = s.attnum AND NOT attisdropped)) AS columns,\n"
 			  "  'd' = any(stxkind) AS ndist_enabled,\n"
 			  "  'f' = any(stxkind) AS deps_enabled,\n"
-			  "  'm' = any(stxkind) AS mcv_enabled\n"
+			  "  'm' = any(stxkind) AS mcv_enabled,\n"
+			  "  %s"
 			  "FROM pg_catalog.pg_statistic_ext stat "
 			  "WHERE stxrelid = '%s'\n"
 			  "ORDER BY 1;",
+			  (pset.sversion >= 13 ? "stxstattarget\n" : "-1\n"),
 			  oid);
 
 			result = PSQLexec(buf.data);
@@ -2732,6 +2734,10 @@ describeOneTableDetails(const char *schemaname,
 	  PQgetvalue(result, i, 4),
 	  PQgetvalue(result, i, 1));
 
+	if (strcmp(PQgetvalue(result, i, 8), "-1") != 0)
+		appendPQExpBuffer(&buf, " STATISTICS %s",
+	  PQgetvalue(result, i, 8));
+
 	printTableAddFooter(&cont, buf.data);
 }
 			}
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 0ae779a3b9..4dea48a2e8 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -102,6 +102,15 @@ WARNING:  statistics object "public.ab1_a_b_stats" could not be computed for rel
 ALTER TABLE ab1 ALTER a SET STATISTICS -1;
 -- setting statistics target 0 skips the statistics, without printing any message, so check catalog
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS 0;
+\d+ ab1
+Table "public.ab1"
+ Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
++-+---+--+-+-+--+-
+ a  | integer |   |  | | plain   |  | 
+ b  | integer |   |  | | plain   |  | 
+Statistics objects:
+"public"."ab1_a_b_stats" (ndistinct, dependencies, mcv) ON a, b FROM ab1 STATISTICS 0
+
 ANALYZE ab1;
 SELECT stxname, stxdndistinct, stxddependencies, stxdmcv
   FROM pg_statistic_ext s, pg_statistic_ext_data d
@@ -113,6 +122,15 @@ SELECT stxname, stxdndistinct, stxddependencies, stxdmcv
 (1 row)
 
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS -1;
+\d+ ab1
+Table "public.ab1"
+ Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
++-+---+--+-+-+--+-
+ a  | integer |   |  | | plain   |  | 
+ b  | integer |   |  | | plain   |  | 
+Statistics objects:
+"public"."ab1_a_b_stats" (ndistinct, dependencies, mcv) ON a, b FROM ab1
+
 -- partial analyze doesn't build stats either
 ANALYZE ab1 (a);
 WARNING:  statistics object "public.ab1_a_b_stats" could not be computed for relation "public.ab1"
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 2834a902a7..d1d49948b4 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -72,12 +72,14 @@ ANALYZE ab1;
 ALTER TABLE ab1 ALTER a SET STATISTICS -1;
 -- setting statistics target 0 skips the statistics, without printing any message, so check catalog
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS 0;
+\d+ ab1
 ANALYZE ab1;
 SELECT stxname, stxdndistinct, stxddependencies, stxdmcv
   FROM pg_statistic_ext s, pg_statistic_ext_data d
  WHERE s.stxname = 'ab1_a_b_stats'
AND d.stxoid = s.oid;
 ALTER STATISTICS ab1_a_b_stats SET STATISTICS -1;
+\d+ ab1
 -- partial analyze doesn't build stats either
 ANALYZE ab1 (a);
 ANALYZE ab1;
-- 
2.17.0



Re: list of extended statistics on psql

2020-08-30 Thread Justin Pryzby
On Thu, Aug 27, 2020 at 07:53:23PM -0400, Alvaro Herrera wrote:
> +1 for the general idea, and +1 for \dX being the syntax to use
> 
> IMO the per-type columns should show both the type being enabled as
> well as it being built.
> 
> (How many more stat types do we expect -- Tomas?  I wonder if having one
> column per type is going to scale in the long run.)
> 
> Also, the stat obj name column should be first, followed by a single
> column listing both table and columns that it applies to.  Keep in mind
> that in the future we might want to add stats that cross multiple tables
> -- that's why the CREATE syntax is the way it is.  So we should give
> room for that in psql's display too.

There's also a plan for CREATE STATISTICS to support expression statistics, with
the statistics functionality of an expression index, but without the cost of
index-update on UPDATE/DELETE.  That's Tomas' patch here:
https://commitfest.postgresql.org/29/2421/

I think that would compute ndistinct and MCV, same as indexes, but not
dependencies.  To me, it's better if there's a single column showing
the "kinds" of statistics to be generated (stxkind), rather than a column for
each.

I'm not sure why the length of the stats lists cast as text is useful to show.
We don't have a slash-dee command to show the number of MCV or histogram
entries in traditional, 1-D stats in pg_statistic, right?  I think anybody
wanting that would learn to SELECT FROM pg_statistic*.  Also, the length of
the text output isn't very meaningful.  If this is json, maybe you'd do
something like this:
|SELECT a.stxdndistinct, COUNT(b)
|  FROM pg_statistic_ext_data a, json_each(a.stxdndistinct::json) AS b
| GROUP BY 1;

I guess stxdmcv isn't json, but it seems especially meaningless to show
length() of its ::text, since we don't even "deserialize" the object to begin
with.

BTW, I've just started a new thread about displaying in psql \d the stats
target of extended stats objects.

-- 
Justin




Re: v13: show extended stats target in \d

2020-08-31 Thread Justin Pryzby
On Mon, Aug 31, 2020 at 07:47:35AM +, gkokola...@pm.me wrote:
> ‐‐‐ Original Message ‐‐‐
> On Monday, 31 August 2020 08:00, Justin Pryzby  wrote:
> 
> > The stats target can be set since commit d06215d03, but wasn't shown by 
> > psql.
> > ALTER STATISTICS .. SET STATISTICS n.
> >
> > Normal (1-D) stats targets are shown in \d+ table.
> > Stats objects are shown in \d (no plus).
> > Arguably, this should be shown only in "verbose" mode (\d+).
> 
> This seems rather useful. May I suggest you add it to the commitfest?

I added it at
https://commitfest.postgresql.org/29/2712/

+   appendPQExpBuffer(&buf, " STATISTICS %s",

Maybe it should have a comma, like ", STATISTICS %s"?
Similar to these:

appendPQExpBuffer(&buf, ", ON TABLE %s",
|Triggers:
|trg1 AFTER INSERT ON trigpart3 FOR EACH ROW EXECUTE FUNCTION 
trigger_nothing(), ON TABLE trigpart

and:
appendPQExpBufferStr(&buf, ", PARTITIONED");
|part_ee_ff FOR VALUES IN ('ee', 'ff'), PARTITIONED,

Also, now I wonder if CREATE STATISTICS should support some syntax to set the
initial target.  Like:

|CREATE STATISTICS abstats ON a,b FROM child.abc_202006 WITH(STATISTICS 111);

-- 
Justin




Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-09-01 Thread Justin Pryzby
This patch seems to be missing a call to RelationAssumeNewRelfilenode() in
reindex_index().

That may be related to the cause of the crashes I pointed out earlier this
year.

Alexey's v4 patch changed RelationSetNewRelfilenode() to accept a tablespace
parameter, but Michael seemed to object to that.  However, that seems cleaner
and ~30 lines shorter.

Michael, would you comment on that?  The v4 patch and your comments are here:
https://www.postgresql.org/message-id/attachment/105574/v4-0001-Allow-REINDEX-and-REINDEX-CONCURRENTLY-to-change-tablespace.patch
https://www.postgresql.org/message-id/20191127035416.GG5435%40paquier.xyz

> --- a/src/backend/catalog/index.c
> +++ b/src/backend/catalog/index.c
> @@ -3480,6 +3518,47 @@ reindex_index(Oid indexId, bool 
> skip_constraint_checks, char persistence,
>*/
>   CheckTableNotInUse(iRel, "REINDEX INDEX");
>  
> + if (tablespaceOid == MyDatabaseTableSpace)
> + tablespaceOid = InvalidOid;
> +
> + /*
> +  * Set the new tablespace for the relation.  Do that only in the
> +  * case where the reindex caller wishes to enforce a new tablespace.
> +  */
> + if (set_tablespace &&
> + tablespaceOid != iRel->rd_rel->reltablespace)
> + {
> + Relationpg_class;
> + Form_pg_class   rd_rel;
> + HeapTuple   tuple;
> +
> + /* First get a modifiable copy of the relation's pg_class row */
> + pg_class = table_open(RelationRelationId, RowExclusiveLock);
> +
> + tuple = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(indexId));
> + if (!HeapTupleIsValid(tuple))
> + elog(ERROR, "cache lookup failed for relation %u", 
> indexId);
> + rd_rel = (Form_pg_class) GETSTRUCT(tuple);
> +
> + /*
> +  * Mark the relation as ready to be dropped at transaction 
> commit,
> +  * before making visible the new tablespace change so as this 
> won't
> +  * miss things.
> +  */
> + RelationDropStorage(iRel);
> +
> + /* Update the pg_class row */
> + rd_rel->reltablespace = tablespaceOid;
> + CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
> +
> + heap_freetuple(tuple);
> +
> + table_close(pg_class, RowExclusiveLock);
> +
> + /* Make sure the reltablespace change is visible */
> + CommandCounterIncrement();





Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-09-01 Thread Justin Pryzby
On Tue, Sep 01, 2020 at 11:40:18AM -0400, Alvaro Herrera wrote:
> On 2020-Aug-11, Justin Pryzby wrote:
> > On Tue, Aug 11, 2020 at 02:39:45PM +0900, Michael Paquier wrote:
> 
> > > The grammar that has been committed was the one that for the most
> > > support, so we need to live with that.  I wonder if we should simplify
> > > ReindexStmt and move the "concurrent" flag to be under "options", but
> > > that may not be worth the time spent on as long as we don't have
> > > CONCURRENTLY part of the parenthesized grammar.
> > 
> > I think it's kind of a good idea, since the next patch does exactly that
> > (parenthesize (CONCURRENTLY)).
> > 
> > I included that as a new 0002, but it doesn't save anything though, so maybe
> > it's not a win.
> 
> The advantage of using a parenthesized option list is that you can add
> *further* options without making the new keywords reserved.  Of course,
> we already reserve CONCURRENTLY and VERBOSE pretty severely, so there's
> no change.  If you wanted REINDEX FLUFFY then it wouldn't work without
> making that at least type_func_name_keyword I think; but REINDEX
> (FLUFFY) would work just fine.  And of course the new feature at hand
> can be implemented.

The question isn't whether to use a parenthesized option list.  I realized that
long ago (even though Alexey didn't initially like it).  Check 0002, which gets
rid of "bool concurrent" in favour of stmt->options&REINDEXOPT_CONCURRENT.

-- 
Justin




Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-09-01 Thread Justin Pryzby
On Wed, Sep 02, 2020 at 10:00:12AM +0900, Michael Paquier wrote:
> On Tue, Sep 01, 2020 at 11:48:30AM -0400, Alvaro Herrera wrote:
> > On 2020-Sep-01, Justin Pryzby wrote:
> >> The question isn't whether to use a parenthesized option list.  I realized 
> >> that
> >> long ago (even though Alexey didn't initially like it).  Check 0002, which 
> >> gets
> >> rid of "bool concurrent" in favour of stmt->options&REINDEXOPT_CONCURRENT.
> > 
> > Ah!  I see, sorry for the noise.  Well, respectfully, having a separate
> > boolean to store one option when you already have a bitmask for options
> > is silly.
> 
> Yeah, I am all for removing "concurrent" from ReindexStmt, but I don't
> think that the proposed 0002 is that, because it is based on the
> assumption that we'd want more than just boolean-based options in
> those statements, and this case is not justified yet except if it
> becomes possible to enforce tablespaces.  At this stage, I think that
> it is more sensible to just update gram.y and add a
> REINDEXOPT_CONCURRENTLY.  I also think that it would also make sense
> to pass down "options" within ReindexIndexCallbackState() (for example
> imagine a SKIP_LOCKED for REINDEX).

Uh, this whole thread is about implementing REINDEX (TABLESPACE foo), and the
preliminary patch 0001 is to keep separate the tablespace parts of that
content.  0002 is a minor tangent which I assume would be squished into 0001
which cleans up historic cruft, using new params in favour of historic options.

I think my change is probably incomplete, and ReindexStmt node should not have
an int options.  parse_reindex_params() would parse it into local int *options
and char **tablespacename params.

-- 
Justin




Re: REINDEX SCHEMA/DATABASE/SYSTEM weak with dropped relations

2020-09-01 Thread Justin Pryzby
> diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
> index 47d4c07306..23840bb8e6 100644
> --- a/src/include/nodes/parsenodes.h
> +++ b/src/include/nodes/parsenodes.h
> @@ -3352,6 +3352,7 @@ typedef struct ConstraintsSetStmt
>  /* Reindex options */
>  #define REINDEXOPT_VERBOSE (1 << 0) /* print progress info */
>  #define REINDEXOPT_REPORT_PROGRESS (1 << 1) /* report pgstat progress */
> +#define REINDEXOPT_MISSING_OK (2 << 1)   /* skip missing relations */

I think you probably intended to write: 1<<2

Even though it's the same, someone is likely to be confused if they try to use
3<<1 vs 1<<3.

I noticed while resolving merge conflict.

-- 
Justin




Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-09-01 Thread Justin Pryzby
On Tue, Sep 01, 2020 at 09:24:10PM -0500, Justin Pryzby wrote:
> On Wed, Sep 02, 2020 at 10:00:12AM +0900, Michael Paquier wrote:
> > On Tue, Sep 01, 2020 at 11:48:30AM -0400, Alvaro Herrera wrote:
> > > On 2020-Sep-01, Justin Pryzby wrote:
> > >> The question isn't whether to use a parenthesized option list.  I 
> > >> realized that
> > >> long ago (even though Alexey didn't initially like it).  Check 0002, 
> > >> which gets
> > >> rid of "bool concurrent" in favour of 
> > >> stmt->options&REINDEXOPT_CONCURRENT.
> > > 
> > > Ah!  I see, sorry for the noise.  Well, respectfully, having a separate
> > > boolean to store one option when you already have a bitmask for options
> > > is silly.
> > 
> > Yeah, I am all for removing "concurrent" from ReindexStmt, but I don't
> > think that the proposed 0002 is that, because it is based on the
> > assumption that we'd want more than just boolean-based options in
> > those statements, and this case is not justified yet except if it
> > becomes possible to enforce tablespaces.  At this stage, I think that
> > it is more sensible to just update gram.y and add a
> > REINDEXOPT_CONCURRENTLY.  I also think that it would also make sense
> > to pass down "options" within ReindexIndexCallbackState() (for example
> > imagine a SKIP_LOCKED for REINDEX).
> 
> Uh, this whole thread is about implementing REINDEX (TABLESPACE foo), and the
> preliminary patch 0001 is to keep separate the tablespace parts of that
> content.  0002 is a minor tangent which I assume would be squished into 0001
> which cleans up historic cruft, using new params in favour of historic 
> options.
> 
> I think my change is probably incomplete, and ReindexStmt node should not have
> an int options.  parse_reindex_params() would parse it into local int *options
> and char **tablespacename params.

Done in the attached, which is also rebased on 1d6541666.

And added RelationAssumeNewRelfilenode() as I mentioned - but I'm hoping to
hear from Michael about any reason not to call RelationSetNewRelfilenode()
instead of directly calling the things it would itself call.

-- 
Justin
From 9d881a6aa401cebef0a2ce781d34a2a5ea8ded60 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 27 Mar 2020 17:50:46 -0500
Subject: [PATCH v23 1/5] Change REINDEX/CLUSTER to accept an option list..

..like EXPLAIN (..), VACUUM (..), and ANALYZE (..).

Change docs in the style of VACUUM.  See also: 52dcfda48778d16683c64ca4372299a099a15b96
---
 doc/src/sgml/ref/cluster.sgml| 28 ++---
 doc/src/sgml/ref/reindex.sgml| 43 +---
 src/backend/commands/cluster.c   | 23 -
 src/backend/commands/indexcmds.c | 41 --
 src/backend/nodes/copyfuncs.c|  2 ++
 src/backend/nodes/equalfuncs.c   |  2 ++
 src/backend/parser/gram.y| 35 +++---
 src/backend/tcop/utility.c   | 36 +++---
 src/bin/psql/tab-complete.c  | 23 +
 src/include/commands/cluster.h   |  3 ++-
 src/include/commands/defrem.h|  7 +++---
 src/include/nodes/parsenodes.h   |  2 ++
 12 files changed, 179 insertions(+), 66 deletions(-)

diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 4da60d8d56..110fb3083e 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -21,8 +21,13 @@ PostgreSQL documentation
 
  
 
-CLUSTER [VERBOSE] table_name [ USING index_name ]
-CLUSTER [VERBOSE]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ] table_name [ USING index_name ]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ]
+
+where option can be one of:
+
+VERBOSE [ boolean ]
+
 
  
 
@@ -81,6 +86,16 @@ CLUSTER [VERBOSE]
   Parameters
 
   
+   
+VERBOSE
+
+ 
+  Prints a progress report as each table is clustered.
+
+ 
+
+   
+

 table_name
 
@@ -100,10 +115,15 @@ CLUSTER [VERBOSE]

 

-VERBOSE
+boolean
 
  
-  Prints a progress report as each table is clustered.
+  Specifies whether the selected option should be turned on or off.
+  You can write TRUE, ON, or
+  1 to enable the option, and FALSE,
+  OFF, or 0 to disable it.  The
+  boolean value can also
+  be omitted, in which case TRUE is assumed.
  
 

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index c16f223e4e..a32e192a87 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -25,7 +25,7 @@ REINDEX [ ( option [, ...] ) ] { IN
 
 where option can be one of:
 
-VERBOSE
+VERBOSE [ boolean ]
 
  
 
@@ -141,19 +141,6 @@ REINDEX [ ( option [, ...] ) ] { IN
 

 
-   
-name
-
-   

Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-09-02 Thread Justin Pryzby
On Thu, Sep 03, 2020 at 12:00:17AM +0300, Alexey Kondratov wrote:
> On 2020-09-02 07:56, Justin Pryzby wrote:
> > 
> > Done in the attached, which is also rebased on 1d6541666.
> > 
> > And added RelationAssumeNewRelfilenode() as I mentioned - but I'm hoping
> > to
> > hear from Michael about any reason not to call
> > RelationSetNewRelfilenode()
> > instead of directly calling the things it would itself call.
> 
> The latest patch set immediately got a conflict with Michael's fixup
> 01767533, so I've rebased it first of all.

On my side, I've also rearranged function parameters to make the diff more
readable.  And squished your changes into the respective patches.

Michael started a new thread about retiring ReindexStmt->concurrent, which I
guess will cause more conflicts (although I don't see why we wouldn't implement
a generic List grammar now rather than only after a preliminary patch).

-- 
Justin
From 704d230463deca5303b0eae6c55e0e197c6fa473 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 27 Mar 2020 17:50:46 -0500
Subject: [PATCH v25 1/6] Change REINDEX/CLUSTER to accept an option list..

..like EXPLAIN (..), VACUUM (..), and ANALYZE (..).

Change docs in the style of VACUUM.  See also: 52dcfda48778d16683c64ca4372299a099a15b96
---
 doc/src/sgml/ref/cluster.sgml| 27 +---
 doc/src/sgml/ref/reindex.sgml| 43 +---
 src/backend/commands/cluster.c   | 23 -
 src/backend/commands/indexcmds.c | 41 --
 src/backend/nodes/copyfuncs.c|  2 ++
 src/backend/nodes/equalfuncs.c   |  2 ++
 src/backend/parser/gram.y| 33 +---
 src/backend/tcop/utility.c   | 36 +++---
 src/bin/psql/tab-complete.c  | 22 
 src/include/commands/cluster.h   |  3 ++-
 src/include/commands/defrem.h|  7 +++---
 src/include/nodes/parsenodes.h   |  2 ++
 12 files changed, 175 insertions(+), 66 deletions(-)

diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 4da60d8d56..e6ebce27e6 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -21,8 +21,13 @@ PostgreSQL documentation
 
  
 
-CLUSTER [VERBOSE] table_name [ USING index_name ]
-CLUSTER [VERBOSE]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ] table_name [ USING index_name ]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ]
+
+where option can be one of:
+
+VERBOSE [ boolean ]
+
 
  
 
@@ -81,6 +86,15 @@ CLUSTER [VERBOSE]
   Parameters
 
   
+   
+VERBOSE
+
+ 
+  Prints a progress report as each table is clustered.
+ 
+
+   
+

 table_name
 
@@ -100,10 +114,15 @@ CLUSTER [VERBOSE]

 

-VERBOSE
+boolean
 
  
-  Prints a progress report as each table is clustered.
+  Specifies whether the selected option should be turned on or off.
+  You can write TRUE, ON, or
+  1 to enable the option, and FALSE,
+  OFF, or 0 to disable it.  The
+  boolean value can also
+  be omitted, in which case TRUE is assumed.
  
 

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index c16f223e4e..a32e192a87 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -25,7 +25,7 @@ REINDEX [ ( option [, ...] ) ] { IN
 
 where option can be one of:
 
-VERBOSE
+VERBOSE [ boolean ]
 
  
 
@@ -141,19 +141,6 @@ REINDEX [ ( option [, ...] ) ] { IN
 

 
-   
-name
-
- 
-  The name of the specific index, table, or database to be
-  reindexed.  Index and table names can be schema-qualified.
-  Presently, REINDEX DATABASE and REINDEX SYSTEM
-  can only reindex the current database, so their parameter must match
-  the current database's name.
- 
-
-   
-

 CONCURRENTLY
 
@@ -181,6 +168,34 @@ REINDEX [ ( option [, ...] ) ] { IN
  
 

+
+   
+name
+
+ 
+  The name of the specific index, table, or database to be
+  reindexed.  Index and table names can be schema-qualified.
+  Presently, REINDEX DATABASE and REINDEX SYSTEM
+  can only reindex the current database, so their parameter must match
+  the current database's name.
+ 
+
+   
+
+   
+boolean
+
+ 
+  Specifies whether the selected option should be turned on or off.
+  You can write TRUE, ON, or
+  1 to enable the option, and FALSE,
+  OFF, or 0 to disable it.  The
+  boolean value can also
+  be omitted, in which case TRUE is assumed.
+ 
+
+   
+
   
  
 
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 0d647e912c..a42dfe98f4 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -35,6 +35,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/toasting.h"

Re: proposal: possibility to read dumped table's name from file

2020-09-03 Thread Justin Pryzby
On Sun, Jul 05, 2020 at 10:08:09PM +0200, Pavel Stehule wrote:
> On Wed, Jul 1, 2020 at 11:24 PM, Justin Pryzby  wrote:
> 
> > On Thu, Jun 11, 2020 at 09:36:18AM +0200, Pavel Stehule wrote:
> > > On Wed, Jun 10, 2020 at 12:30 AM, Justin Pryzby  wrote:
> > Also, your getline is dynamically re-allocating lines of arbitrary length.
> > Possibly that's not needed.  We'll typically read "+t schema.relname",
> > which is 132 chars.  Maybe it's sufficient to do
> > char buf[1024];
> > if (fgets(buf, sizeof(buf), fp) == NULL || strchr(buf, '\n') == NULL)
> > error();
> > ret = pstrdup(buf);
> 
63 bytes is the maximum effective identifier size, but it is not the maximum
size of identifiers. A 1024-byte buffer would very probably be enough for
everyone, but I do not want to introduce a new magic limit, especially when
the dynamic implementation is not too hard.

Maybe you'd want to use a StrInfo like recent patches (8f8154a50).

> Table names can be very long - sometimes table names are stored in external
> storage at full length, and it would not be practical to require truncating
> them in the filter file.
> 
> For this case it is very efficient, because the resized (enlarged) buffer is
> reused for the following rows, so reallocs should be rare. So when I have to
> choose between two implementations of similar complexity, I prefer the more
> dynamic code without hardcoded limits. The dynamic approach has no overhead.

-- 
Justin




Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-09-03 Thread Justin Pryzby
On Wed, Sep 02, 2020 at 06:07:06PM -0500, Justin Pryzby wrote:
> On my side, I've also rearranged function parameters to make the diff more
> readable.  And squished your changes into the respective patches.

This resolves a breakage I failed to notice from a last-minute edit,
squishes two commits, and rebases on Michael's commit removing
ReindexStmt->concurrent.

-- 
Justin
From 7e04caad0d010b5fd3eeca8d9bd436e89b657e4a Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 27 Mar 2020 17:50:46 -0500
Subject: [PATCH v26 1/5] Change REINDEX/CLUSTER to accept an option list..

..like EXPLAIN (..), VACUUM (..), and ANALYZE (..).

Change docs in the style of VACUUM.  See also: 52dcfda48778d16683c64ca4372299a099a15b96
---
 doc/src/sgml/ref/cluster.sgml  | 27 ++---
 doc/src/sgml/ref/reindex.sgml  | 43 ++-
 src/backend/commands/cluster.c | 28 --
 src/backend/nodes/copyfuncs.c  |  4 +--
 src/backend/nodes/equalfuncs.c |  4 +--
 src/backend/parser/gram.y  | 54 +++---
 src/backend/tcop/utility.c | 42 ++
 src/bin/psql/tab-complete.c| 22 ++
 src/include/commands/cluster.h |  3 +-
 src/include/commands/defrem.h  |  2 +-
 src/include/nodes/parsenodes.h |  4 +--
 11 files changed, 170 insertions(+), 63 deletions(-)

diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 4da60d8d56..e6ebce27e6 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -21,8 +21,13 @@ PostgreSQL documentation
 
  
 
-CLUSTER [VERBOSE] table_name [ USING index_name ]
-CLUSTER [VERBOSE]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ] table_name [ USING index_name ]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ]
+
+where option can be one of:
+
+VERBOSE [ boolean ]
+
 
  
 
@@ -81,6 +86,15 @@ CLUSTER [VERBOSE]
   Parameters
 
   
+   
+VERBOSE
+
+ 
+  Prints a progress report as each table is clustered.
+ 
+
+   
+

 table_name
 
@@ -100,10 +114,15 @@ CLUSTER [VERBOSE]

 

-VERBOSE
+boolean
 
  
-  Prints a progress report as each table is clustered.
+  Specifies whether the selected option should be turned on or off.
+  You can write TRUE, ON, or
+  1 to enable the option, and FALSE,
+  OFF, or 0 to disable it.  The
+  boolean value can also
+  be omitted, in which case TRUE is assumed.
  
 

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index c16f223e4e..a32e192a87 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -25,7 +25,7 @@ REINDEX [ ( option [, ...] ) ] { IN
 
 where option can be one of:
 
-VERBOSE
+VERBOSE [ boolean ]
 
  
 
@@ -141,19 +141,6 @@ REINDEX [ ( option [, ...] ) ] { IN
 

 
-   
-name
-
- 
-  The name of the specific index, table, or database to be
-  reindexed.  Index and table names can be schema-qualified.
-  Presently, REINDEX DATABASE and REINDEX SYSTEM
-  can only reindex the current database, so their parameter must match
-  the current database's name.
- 
-
-   
-

 CONCURRENTLY
 
@@ -181,6 +168,34 @@ REINDEX [ ( option [, ...] ) ] { IN
  
 

+
+   
+name
+
+ 
+  The name of the specific index, table, or database to be
+  reindexed.  Index and table names can be schema-qualified.
+  Presently, REINDEX DATABASE and REINDEX SYSTEM
+  can only reindex the current database, so their parameter must match
+  the current database's name.
+ 
+
+   
+
+   
+boolean
+
+ 
+  Specifies whether the selected option should be turned on or off.
+  You can write TRUE, ON, or
+  1 to enable the option, and FALSE,
+  OFF, or 0 to disable it.  The
+  boolean value can also
+  be omitted, in which case TRUE is assumed.
+ 
+
+   
+
   
  
 
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 0d647e912c..4a3678831d 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -35,6 +35,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/defrem.h"
 #include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
@@ -102,8 +103,29 @@ static List *get_tables_to_cluster(MemoryContext cluster_context);
  *---
  */
 void
-cluster(ClusterStmt *stmt, bool isTopLevel)
+cluster(ParseState *pstate, ClusterStmt *stmt, bool isTopLevel)
 {
+	ListCell	*lc;
+	int			options = 0;
+
+	/* Parse list of generic parameters not handled by the parser */
+	foreach(lc, stmt->params)
+	{
+		DefElem	*opt = (DefElem *) lfirst(lc);
+
+		if (strcmp(opt->

Re: v13: show extended stats target in \d

2020-09-05 Thread Justin Pryzby
On Tue, Sep 01, 2020 at 05:08:25PM -0400, Alvaro Herrera wrote:
> +1 on fixing this, since the ability to change stats target is new in
> pg13.
> 
> On 2020-Aug-31, Justin Pryzby wrote:
> 
> > Maybe it should have a comma, like ", STATISTICS %s"?
> 
> It does need some separator.  Maybe a comma is sufficient, but I'm not
> sure: that will fail when we add cross-relation stats, because the
> FROM clause will have more relations and possibly have commas too.

Good thought.

You said it'll "fail", but I guess you mean it could be confusing to look at.

I didn't actually know that "multiple tables" were planned for MV stats.
Another consideration is expression stats (as compared with expression
indexes).  I don't see how that helps guide this decision at all, though.

I think trying to parse \d output is discouraged and a bad idea, but it's
obviously not ok if the output is ambiguous.

But it's not ambiguous, since STATISTICS is capitalized and unquoted.
Arguably, "nn" could be construed as looking like a table alias, which doesn't
make sense here, and integer aliases would also need to be quoted (and seem
unlikely and not useful).

... FROM t1, t2, STATISTICS nn

I think this is 1) unambiguous; 2) clear; 3) consistent with other output and
with the "ALTER STATISTICS SET STATISTICS NN" command.  It could say "SET" but
I don't think it's useful to include; 4) seems to reasonably anticipate future
support for expressions and multiple tables.

I'm happy if someone suggests something better, but I don't know what that
would be.  Unless we hurried up and finished this for v13, and included
stxstattarget.
https://commitfest.postgresql.org/29/2692/
https://www.postgresql.org/message-id/flat/c0939aba-3b12-b596-dd08-913dda4b40f0%40nttcom.co.jp_1#32680b2fe0ab473c58a33eb0f9f00d42

Maybe it's not relevant, but I found these don't have a separator.
appendPQExpBufferStr(&buf, " DEFERRABLE");
appendPQExpBufferStr(&buf, " INITIALLY DEFERRED");
appendPQExpBufferStr(&buf, " INVALID");
appendPQExpBufferStr(&buf, " PRIMARY KEY,");
appendPQExpBufferStr(&buf, " REPLICA IDENTITY");
appendPQExpBufferStr(&buf, " UNIQUE,");
appendPQExpBufferStr(&buf, " UNIQUE CONSTRAINT,");
appendPQExpBufferStr(&buf, " CLUSTER");
appendPQExpBufferStr(&buf, " AS RESTRICTIVE");
appendPQExpBuffer(&buf, "\n  TO %s",

These look like the only similar things with more than a single separator:
appendPQExpBuffer(&buf, "\n  USING (%s)",
appendPQExpBuffer(&buf, "\n  WITH CHECK (%s)",
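As a sketch of how the proposed suffix would compose with the rest of the \d
line (hypothetical helper names in Python, not psql's actual describe.c code):

```python
def format_stats_line(name, kinds, columns, tables, target=-1):
    # Assemble the proposed \d statistics-object line.  A target of -1
    # means "default", and the suffix is omitted, mirroring how
    # stxstattarget stores "not set".  Illustrative helper only.
    line = '"%s" (%s) ON %s FROM %s' % (
        name, ", ".join(kinds), ", ".join(columns), ", ".join(tables))
    if target >= 0:
        # ", STATISTICS nn" stays unambiguous: an unquoted, capitalized
        # keyword cannot be mistaken for a relation in the FROM list.
        line += ", STATISTICS %d" % target
    return line

print(format_stats_line("ab1_a_b_stats", ["ndistinct", "dependencies", "mcv"],
                        ["a", "b"], ["t1", "t2"], target=100))
# "ab1_a_b_stats" (ndistinct, dependencies, mcv) ON a, b FROM t1, t2, STATISTICS 100
```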

-- 
Justin




Re: Multivariate MCV list vs. statistics target

2020-09-05 Thread Justin Pryzby
I think the docs are inconsistent with the commit message and the code
(d06215d03), and should be corrected, something like so:

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index b135c89005..cd10a6a6fc 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7302,7 +7302,8 @@ SCRAM-SHA-256$:&l
of statistics accumulated for this statistics object by
.
A zero value indicates that no statistics should be collected.
-   A negative value says to use the system default statistics target.
+   A negative value says to use the maximum of the statistics targets of
+   the referenced columns, if set, or the system default statistics target.
Positive values of stxstattarget
determine the target number of most common values
to collect.
diff --git a/doc/src/sgml/ref/alter_statistics.sgml 
b/doc/src/sgml/ref/alter_statistics.sgml
index be4c3f1f05..f2e8a93166 100644
--- a/doc/src/sgml/ref/alter_statistics.sgml
+++ b/doc/src/sgml/ref/alter_statistics.sgml
@@ -101,7 +101,8 @@ ALTER STATISTICS name SET STATISTIC
 The statistic-gathering target for this statistics object for 
subsequent
  operations.
 The target can be set in the range 0 to 1; alternatively, set it
-to -1 to revert to using the system default statistics
+to -1 to revert to using the maximum statistics target of the
+referenced columns, if set, or the system default statistics
 target ().
 For more information on the use of statistics by the
 PostgreSQL query planner, refer to
-- 
2.17.0
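The resolution rule the corrected docs describe can be sketched as follows
(illustrative Python only; the server's actual logic lives in the
extended-statistics ANALYZE code paths):

```python
def resolve_ext_stats_target(stxstattarget, column_targets, default_target):
    # stxstattarget >= 0: use it directly (0 disables collection).
    if stxstattarget >= 0:
        return stxstattarget
    # Negative: take the maximum of the per-column targets that were
    # explicitly set (attstattarget >= 0); fall back to the system
    # default (default_statistics_target) if none is set.
    explicit = [t for t in column_targets if t >= 0]
    return max(explicit) if explicit else default_target

assert resolve_ext_stats_target(500, [100, -1], 100) == 500
assert resolve_ext_stats_target(-1, [250, -1], 100) == 250
assert resolve_ext_stats_target(-1, [-1, -1], 100) == 100
```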





Re: v13: show extended stats target in \d

2020-09-06 Thread Justin Pryzby
On Sun, Sep 06, 2020 at 01:06:05PM -0700, Peter Geoghegan wrote:
> On Tue, Sep 1, 2020 at 2:08 PM Alvaro Herrera  
> wrote:
> > It does need some separator.  Maybe a comma is sufficient, but I'm not
> > sure: that will fail when we add cross-relation stats, because the
> > FROM clause will have more relations and possibly have commas too.
> 
> How about a line break? That seems like a simple solution that takes
> all the competing concerns into account.
> 
> The fact that that will stand out isn't necessarily a bad thing. I
> think it's a good thing.

Like this ?

postgres=# \d t  
 Table "public.t"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 a  | integer |   |  |
 b  | integer |   |  |
Statistics objects:
"public"."t" (ndistinct, dependencies, mcv) ON a, b FROM t

Are there any other examples of similarly related information spread across
lines ?

I find that to be too verbose ; I guess it could be shown only in \d+, which is
true of column stats.

It's weird that the quoting rules are different for the stats object vs the
columns and the table.  The schema qualification is also divergent.  Also,
"public.t" form is technically wrong (the objects should be *separately*
quoted), but seems to be used all over.  If the table or schema has a dot, it's
ambiguous what this means: Table "public.public.t".
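A minimal sketch of per-identifier quoting (simplified from the rules
quote_ident follows; not the real implementation, which also checks keywords)
shows why quoting the joined string as one unit is ambiguous:

```python
import re

def quote_ident(name):
    # Quote unless the name is a plain lower-case identifier; double any
    # embedded quote characters.  Simplified sketch of the quoting rules.
    if re.fullmatch(r'[a-z_][a-z0-9_]*', name):
        return name
    return '"%s"' % name.replace('"', '""')

def qualified(schema, rel):
    return '%s.%s' % (quote_ident(schema), quote_ident(rel))

# Quoting each part separately keeps these two objects distinct:
assert qualified('public', 'public.t') == 'public."public.t"'
assert qualified('public.public', 't') == '"public.public".t'
# ...whereas a header like: Table "public.public.t" reads the same for both.
```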

-- 
Justin




v13: CLUSTER segv with wal_level=minimal and parallel index creation

2020-09-06 Thread Justin Pryzby
Following a bulk load, a CLUSTER command run by a maintenance script crashed.
This is currently reproducible on that instance, so please suggest if I can
provide more info.

< 2020-09-06 15:44:16.369 MDT  >LOG:  background worker "parallel worker" (PID 
2576) was terminated by signal 6: Aborted
< 2020-09-06 15:44:16.369 MDT  >DETAIL:  Failed process was running: CLUSTER 
pg_attribute USING pg_attribute_relid_attnam_index

The crash happens during:
ts=# REINDEX INDEX pg_attribute_relid_attnum_index;
..but not:
ts=# REINDEX INDEX pg_attribute_relid_attnam_index ;

 pg_catalog | pg_attribute_relid_attnam_index | index | postgres | pg_attribute 
| permanent   | 31 MB | 
 pg_catalog | pg_attribute_relid_attnum_index | index | postgres | pg_attribute 
| permanent   | 35 MB | 

I suspect
|commit c6b92041d Skip WAL for new relfilenodes, under wal_level=minimal.

In fact, I set wal_level=minimal for the bulk load.  Note also:
 override | data_checksums  | on
 configuration file   | checkpoint_timeout  | 60
 configuration file   | maintenance_work_mem| 1048576
 configuration file   | max_wal_senders | 0
 configuration file   | wal_compression | on
 configuration file   | wal_level   | minimal
 configuration file   | fsync   | off
 configuration file   | full_page_writes| off
 default  | server_version  | 13beta3

(gdb) bt
#0  0x7ffad387 in raise () from /lib64/libc.so.6
#1  0x7ffaea78 in abort () from /lib64/libc.so.6
#2  0x00921da5 in ExceptionalCondition 
(conditionName=conditionName@entry=0xad4078 "relcache_verdict == 
RelFileNodeSkippingWAL(relation->rd_node)", errorType=errorType@entry=0x977f49 
"FailedAssertion", 
fileName=fileName@entry=0xad3068 "relcache.c", 
lineNumber=lineNumber@entry=2976) at assert.c:67
#3  0x0091a08b in AssertPendingSyncConsistency 
(relation=0x7ff99c2a70b8) at relcache.c:2976
#4  AssertPendingSyncs_RelationCache () at relcache.c:3036
#5  0x0058e591 in smgrDoPendingSyncs (isCommit=isCommit@entry=true, 
isParallelWorker=isParallelWorker@entry=true) at storage.c:685
#6  0x0053b1a4 in CommitTransaction () at xact.c:2118
#7  0x0053b826 in EndParallelWorkerTransaction () at xact.c:5300
#8  0x0052fcf7 in ParallelWorkerMain (main_arg=) at 
parallel.c:1479
#9  0x0076047a in StartBackgroundWorker () at bgworker.c:813
#10 0x0076d88d in do_start_bgworker (rw=0x23ac110) at postmaster.c:5865
#11 maybe_start_bgworkers () at postmaster.c:6091
#12 0x0076e43e in sigusr1_handler (postgres_signal_arg=) 
at postmaster.c:5260
#13 
#14 0x7ff999a6c983 in __select_nocancel () from /lib64/libc.so.6
#15 0x004887bc in ServerLoop () at postmaster.c:1691
#16 0x0076fb45 in PostmasterMain (argc=argc@entry=3, 
argv=argv@entry=0x237d280) at postmaster.c:1400
#17 0x0048a83d in main (argc=3, argv=0x237d280) at main.c:210

(gdb) bt f
...
#4  AssertPendingSyncs_RelationCache () at relcache.c:3036
status = {hashp = 0x23cba50, curBucket = 449, curEntry = 0x0}
locallock = 
rels = 0x23ff018
maxrels = 
nrels = 0
idhentry = 
i = 
#5  0x0058e591 in smgrDoPendingSyncs (isCommit=isCommit@entry=true, 
isParallelWorker=isParallelWorker@entry=true) at storage.c:685
pending = 
nrels = 0
maxrels = 0
srels = 0x0
scan = {hashp = 0x23edf60, curBucket = 9633000, curEntry = 0xe01600 
}
pendingsync = 
#6  0x0053b1a4 in CommitTransaction () at xact.c:2118
s = 0xe01600 
latestXid = 
is_parallel_worker = true
__func__ = "CommitTransaction"





Re: v13: CLUSTER segv with wal_level=minimal and parallel index creation

2020-09-06 Thread Justin Pryzby
This is easily reproduced, at least on pg_attribute

[pryzbyj@localhost ~]$ /usr/pgsql-13/bin/initdb -D pgsql13.dat
[pryzbyj@localhost ~]$ /usr/pgsql-13/bin/postgres -D pgsql13.dat -c 
logging_collector=off -c port=5678 -c unix_socket_directories=/tmp -c 
wal-level=minimal -c max_wal_senders=0&
[pryzbyj@localhost ~]$ psql -h /tmp -p 5678 postgres -c "SET 
min_parallel_table_scan_size=0" -c "SET client_min_messages=debug" -c "REINDEX 
INDEX pg_attribute_relid_attnum_index"
SET
SET
DEBUG:  building index "pg_attribute_relid_attnum_index" on table 
"pg_attribute" with request for 1 parallel worker
TRAP: FailedAssertion("relcache_verdict == 
RelFileNodeSkippingWAL(relation->rd_node)", File: "relcache.c", Line: 2976)
postgres: parallel worker for PID 9637 (ExceptionalCondition+0x66)[0x921d86]
postgres: parallel worker for PID 9637 
(AssertPendingSyncs_RelationCache+0x1db)[0x91a08b]
postgres: parallel worker for PID 9637 (smgrDoPendingSyncs+0x71)[0x58e591]
postgres: parallel worker for PID 9637 [0x53b1a4]
postgres: parallel worker for PID 9637 
(EndParallelWorkerTransaction+0x16)[0x53b826]
postgres: parallel worker for PID 9637 (ParallelWorkerMain+0x437)[0x52fcf7]
postgres: parallel worker for PID 9637 (StartBackgroundWorker+0x25a)[0x76047a]
postgres: parallel worker for PID 9637 [0x76d88d]
postgres: parallel worker for PID 9637 [0x76e43e]
/lib64/libpthread.so.0(+0xf630)[0x7f46d4b47630]
/lib64/libc.so.6(__select+0x13)[0x7f46d26ba983]
postgres: parallel worker for PID 9637 [0x4887bc]
postgres: parallel worker for PID 9637 (PostmasterMain+0x1165)[0x76fb45]
postgres: parallel worker for PID 9637 (main+0x70d)[0x48a83d]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f46d25e7555]

-- 
Justin




Re: PG 13 release notes, first draft

2020-09-07 Thread Justin Pryzby
> I think this is still going to be lost/misunderstood/confuse some people.
vacuumdb already supports -j.  This allows it to run vacuum(parallel N).  So
maybe say "...to process indexes in parallel".

-- 
Justin
>From 3f45433193d8fd3e2110df23e56a2ce31d232243 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 6 Sep 2020 22:10:11 -0500
Subject: [PATCH] fixes release notes

---
 doc/src/sgml/release-13.sgml | 38 +---
 1 file changed, 18 insertions(+), 20 deletions(-)

diff --git a/doc/src/sgml/release-13.sgml b/doc/src/sgml/release-13.sgml
index e27b044408..f4cc65178a 100644
--- a/doc/src/sgml/release-13.sgml
+++ b/doc/src/sgml/release-13.sgml
@@ -731,8 +731,8 @@ Author: Noah Misch 
 -->
 

-Allow skipping of WAL for full table writes if WAL writes to be skipped during a transaction
+which creates or rewrites a relation if  is minimal (Kyotaro
 Horiguchi)

@@ -753,8 +753,8 @@ Author: Peter Eisentraut 
 -->
 

-Enable Unix-domain sockets
-support on Windows (Peter Eisentraut)
+Enable support for Unix-domain sockets
+on Windows (Peter Eisentraut)

   
 
@@ -765,7 +765,7 @@ Author: Fujii Masao 
 -->
 

-Improve the performance when replaying DROP DATABASE
 commands when many tablespaces are in use (Fujii Masao)

@@ -888,7 +888,7 @@ Author: Tomas Vondra 
 -->
 

-Allow a sample of statements to be logged (Adrien Nayrat)
+Allow logging a sample of statements (Adrien Nayrat)

 

@@ -1000,7 +1000,7 @@ Author: Michael Paquier 
 

 Add leader_pid to  to report parallel worker ownership
+linkend="pg-stat-activity-view"/> to report parallel worker's leader process
 (Julien Rouhaud)

   
@@ -1285,7 +1285,7 @@ Author: Alvaro Herrera 
 -->
 
   
-   Allow WAL receivers use a temporary replication slot
+   Allow WAL receivers to use a temporary replication slot
if a permanent one is not specified (Peter Eisentraut, Sergei Kornilov)
   
 
@@ -1369,8 +1369,8 @@ Author: Fujii Masao 
 -->
 
   
-   Allow WAL recovery to continue even if invalid
-   pages are referenced (Fujii Masao)
+   Allow recovery to continue even if invalid
+   pages are referenced by WAL (Fujii Masao)
   
 
   
@@ -1689,15 +1689,14 @@ Author: Andrew Dunstan 
 
   
Add alternate version of jsonb_setI()
+   linkend="functions-json-processing-table">jsonb_set()
with special NULL handling (Andrew Dunstan)
   
 
   
-   The new function, jsonb_set_lax(), allows null
-   new values to either set the specified key to JSON
-   null, delete the key, raise exception, or ignore the operation.
-   IS 'return_target' CLEAR?
+   The new function, jsonb_set_lax(), allows new
+   null values to either set the specified key to JSON
+   null, delete the key, raise an exception, or ignore the operation.
   
  
 
@@ -1863,8 +1862,8 @@ Author: Tom Lane 
   
Add function min_scale()
-   that returns the number of digits to the right the decimal point
-   that is required to represent the numeric value with full precision
+   that returns the number of digits to the right of the decimal point
+   that are required to represent the numeric value with full precision
(Pavel Stehule)
   
  
@@ -1913,8 +1912,7 @@ Author: Thomas Munro 
   
 
   
-   The old function names were kept for backward compatibility.  DO WE
-   HAVE NEW NAMES?
+   The old functions were kept for backward compatibility.
   
  
 
@@ -2833,7 +2831,7 @@ Author: Tom Lane 
   
 
   
-   This improves performance for queries that access many object.
+   This improves performance for queries that access many objects.
The internal List API has also been improved.
   
  
-- 
2.17.0



Re: 回复:how to create index concurrently on partitioned table

2020-09-07 Thread Justin Pryzby
Thanks for completing and pushing the REINDEX patch and others.

Here's a rebasified + fixed version of the others.

On Tue, Sep 01, 2020 at 02:51:58PM +0900, Michael Paquier wrote:
> The REINDEX patch is progressing its way, so I have looked a bit at
> the part for CIC.
> 
> Visibly, the case of multiple partition layers is not handled
> correctly.  Here is a sequence that gets broken:
..
> This fails as follows:
> ERROR: XX000: unrecognized node type: 2139062143
> LOCATION:  copyObjectImpl, copyfuncs.c:5718

Because copyObject needed to be called within a longlived context.

Also, my previous revision failed to implement your suggestion to first build
catalog entries with INVALID indexes and to then reindex them.  Fixed.

-- 
Justin
>From 17a14e5ad128024b05ae288a5096148b3f114c98 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 6 Jun 2020 17:42:23 -0500
Subject: [PATCH v6 1/2] Allow CREATE INDEX CONCURRENTLY on partitioned table

Note, this effectively reverts 050098b14, so take care to not reintroduce the
bug it fixed.
---
 doc/src/sgml/ref/create_index.sgml |   9 --
 src/backend/commands/indexcmds.c   | 118 -
 src/test/regress/expected/indexing.out |  60 -
 src/test/regress/sql/indexing.sql  |  18 +++-
 4 files changed, 165 insertions(+), 40 deletions(-)

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 33aa64e81d..c780dc9547 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -657,15 +657,6 @@ Indexes:
 cannot.

 
-   
-Concurrent builds for indexes on partitioned tables are currently not
-supported.  However, you may concurrently build the index on each
-partition individually and then finally create the partitioned index
-non-concurrently in order to reduce the time where writes to the
-partitioned table will be locked out.  In this case, building the
-partitioned index is a metadata only operation.
-   
-
   
  
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index f1b5f87e6a..61d0c4914c 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -665,17 +665,6 @@ DefineIndex(Oid relationId,
 	partitioned = rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE;
 	if (partitioned)
 	{
-		/*
-		 * Note: we check 'stmt->concurrent' rather than 'concurrent', so that
-		 * the error is thrown also for temporary tables.  Seems better to be
-		 * consistent, even though we could do it on temporary table because
-		 * we're not actually doing it concurrently.
-		 */
-		if (stmt->concurrent)
-			ereport(ERROR,
-	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-	 errmsg("cannot create index on partitioned table \"%s\" concurrently",
-			RelationGetRelationName(rel))));
 		if (stmt->excludeOpNames)
 			ereport(ERROR,
 	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1110,6 +1099,11 @@ DefineIndex(Oid relationId,
 		if (pd->nparts != 0)
 			flags |= INDEX_CREATE_INVALID;
 	}
+	else if (concurrent && OidIsValid(parentIndexId))
+	{
+		/* If concurrent, initial build of index partitions as "invalid" */
+		flags |= INDEX_CREATE_INVALID;
+	}
 
 	if (stmt->deferrable)
 		constr_flags |= INDEX_CONSTR_CREATE_DEFERRABLE;
@@ -1162,6 +1156,14 @@ DefineIndex(Oid relationId,
 		 */
 		if (!stmt->relation || stmt->relation->inh)
 		{
+			/*
+			 * Need to close the relation before recursing into children, so
+			 * copy needed data into a longlived context.
+			 */
+
+			MemoryContext	ind_context = AllocSetContextCreate(PortalContext, "CREATE INDEX",
+	ALLOCSET_DEFAULT_SIZES);
+			MemoryContext	oldcontext = MemoryContextSwitchTo(ind_context);
 			PartitionDesc partdesc = RelationGetPartitionDesc(rel);
 			int			nparts = partdesc->nparts;
 			Oid		   *part_oids = palloc(sizeof(Oid) * nparts);
@@ -1173,8 +1175,10 @@ DefineIndex(Oid relationId,
 		 nparts);
 
 			memcpy(part_oids, partdesc->oids, sizeof(Oid) * nparts);
+			parentDesc = CreateTupleDescCopy(RelationGetDescr(rel));
+			table_close(rel, NoLock);
+			MemoryContextSwitchTo(oldcontext);
 
-			parentDesc = RelationGetDescr(rel);
 			opfamOids = palloc(sizeof(Oid) * numberOfKeyAttributes);
 			for (i = 0; i < numberOfKeyAttributes; i++)
 opfamOids[i] = get_opclass_family(classObjectId[i]);
@@ -1217,10 +1221,12 @@ DefineIndex(Oid relationId,
 	continue;
 }
 
+oldcontext = MemoryContextSwitchTo(ind_context);
 childidxs = RelationGetIndexList(childrel);
 attmap =
 	build_attrmap_by_name(RelationGetDescr(childrel),
 		  parentDesc);
+MemoryContextSwitchTo(oldcontext);
 
 foreach(cell, childidxs)
 {
@@ -1291,10 +1297,14 @@ DefineIndex(Oid relationId,
  */
 if (!found)
 {
-	IndexStmt  *childStmt = copyObject(stmt);
+	In

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2020-09-08 Thread Justin Pryzby
On Sat, Jul 18, 2020 at 03:15:32PM -0500, Justin Pryzby wrote:
> Still waiting for feedback from a committer.

This patch has been waiting for input from a committer on the approach I've
taken with the patches since March 10, so I'm planning to set it to "Ready" - at
least ready for senior review.

-- 
Justin




Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-09-08 Thread Justin Pryzby
On Thu, Sep 03, 2020 at 09:43:51PM -0500, Justin Pryzby wrote:
> And rebased on Michael's commit removing ReindexStmt->concurrent.

Rebased on a6642b3ae: Add support for partitioned tables and indexes in REINDEX

So now this includes the new functionality and test for reindexing a
partitioned table onto a new tablespace.  That part could use some additional
review.

I guess this patch series will also conflict with the CLUSTER part of this
other one.  Once its CLUSTER patch is committed, this patch should be updated
to test clustering a partitioned table to a new tablespace.
https://commitfest.postgresql.org/29/2584/
REINDEX/CIC/CLUSTER of partitioned tables

-- 
Justin
>From 992e0121925c74d5c5a4e5b132cddb3d6b31da86 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 27 Mar 2020 17:50:46 -0500
Subject: [PATCH v27 1/5] Change REINDEX/CLUSTER to accept an option list..

..like EXPLAIN (..), VACUUM (..), and ANALYZE (..).

Change docs in the style of VACUUM.  See also: 52dcfda48778d16683c64ca4372299a099a15b96
---
 doc/src/sgml/ref/cluster.sgml  | 27 ++---
 doc/src/sgml/ref/reindex.sgml  | 43 ++-
 src/backend/commands/cluster.c | 28 --
 src/backend/nodes/copyfuncs.c  |  4 +--
 src/backend/nodes/equalfuncs.c |  4 +--
 src/backend/parser/gram.y  | 54 +++---
 src/backend/tcop/utility.c | 38 ++--
 src/bin/psql/tab-complete.c| 22 ++
 src/include/commands/cluster.h |  3 +-
 src/include/nodes/parsenodes.h |  4 +--
 10 files changed, 167 insertions(+), 60 deletions(-)

diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 4da60d8d56..e6ebce27e6 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -21,8 +21,13 @@ PostgreSQL documentation
 
  
 
-CLUSTER [VERBOSE] table_name [ USING index_name ]
-CLUSTER [VERBOSE]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ] table_name [ USING index_name ]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ]
+
+where option can be one of:
+
+VERBOSE [ boolean ]
+
 
  
 
@@ -81,6 +86,15 @@ CLUSTER [VERBOSE]
   Parameters
 
   
+   
+VERBOSE
+
+ 
+  Prints a progress report as each table is clustered.
+ 
+
+   
+

 table_name
 
@@ -100,10 +114,15 @@ CLUSTER [VERBOSE]

 

-VERBOSE
+boolean
 
  
-  Prints a progress report as each table is clustered.
+  Specifies whether the selected option should be turned on or off.
+  You can write TRUE, ON, or
+  1 to enable the option, and FALSE,
+  OFF, or 0 to disable it.  The
+  boolean value can also
+  be omitted, in which case TRUE is assumed.
  
 

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 33af4ae02a..593b38a7ee 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -25,7 +25,7 @@ REINDEX [ ( option [, ...] ) ] { IN
 
 where option can be one of:
 
-VERBOSE
+VERBOSE [ boolean ]
 
  
 
@@ -145,19 +145,6 @@ REINDEX [ ( option [, ...] ) ] { IN
 

 
-   
-name
-
- 
-  The name of the specific index, table, or database to be
-  reindexed.  Index and table names can be schema-qualified.
-  Presently, REINDEX DATABASE and REINDEX SYSTEM
-  can only reindex the current database, so their parameter must match
-  the current database's name.
- 
-
-   
-

 CONCURRENTLY
 
@@ -185,6 +172,34 @@ REINDEX [ ( option [, ...] ) ] { IN
  
 

+
+   
+name
+
+ 
+  The name of the specific index, table, or database to be
+  reindexed.  Index and table names can be schema-qualified.
+  Presently, REINDEX DATABASE and REINDEX SYSTEM
+  can only reindex the current database, so their parameter must match
+  the current database's name.
+ 
+
+   
+
+   
+boolean
+
+ 
+  Specifies whether the selected option should be turned on or off.
+  You can write TRUE, ON, or
+  1 to enable the option, and FALSE,
+  OFF, or 0 to disable it.  The
+  boolean value can also
+  be omitted, in which case TRUE is assumed.
+ 
+
+   
+
   
  
 
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 0d647e912c..4a3678831d 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -35,6 +35,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/defrem.h"
 #include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
@@ -102,8 +103,29 @@ static List *get_tables_to_cluster(MemoryContext cluster_context);
  *---
  */
 void
-cluster(ClusterStmt *stmt, bool isTopLevel)
+cluster(ParseState *pstate, Cl

Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-09-08 Thread Justin Pryzby
On Tue, Sep 08, 2020 at 09:02:38PM -0300, Alvaro Herrera wrote:
> On 2020-Sep-08, Justin Pryzby wrote:
> 
> > From 992e0121925c74d5c5a4e5b132cddb3d6b31da86 Mon Sep 17 00:00:00 2001
> > From: Justin Pryzby 
> > Date: Fri, 27 Mar 2020 17:50:46 -0500
> > Subject: [PATCH v27 1/5] Change REINDEX/CLUSTER to accept an option list..
> > 
> > ..like EXPLAIN (..), VACUUM (..), and ANALYZE (..).
> > 
> > Change docs in the style of VACUUM.  See also: 
> > 52dcfda48778d16683c64ca4372299a099a15b96
> 
> I don't understand why you change all options to DefElem instead of
> keeping the bitmask for those options that can use it.

That's originally how I did it, too.

Initially I added List *params, and Michael suggested retiring
ReindexStmt->concurrent.  I provided a patch to do so, initially by leaving int
options, and then removing it afterwards to "complete the thought" and get rid
of the remnants of the "old way" of doing it.  This is also how vacuum and
explain are done.
https://www.postgresql.org/message-id/20200902022410.GA20149%40telsasoft.com

-- 
Justin




Re: doc review for v13

2020-09-09 Thread Justin Pryzby
I've added a few more.

-- 
Justin
>From 137321a0d476f66b5e5f21c2f627c407330e50b1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 19:43:22 -0500
Subject: [PATCH v8 01/14] doc: btree deduplication

commit 0d861bbb702f8aa05c2a4e3f1650e7e8df8c8c27
Author: Peter Geoghegan 
---
 doc/src/sgml/btree.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml
index d03ee4d6fa..69c1ee0e97 100644
--- a/doc/src/sgml/btree.sgml
+++ b/doc/src/sgml/btree.sgml
@@ -642,7 +642,7 @@ options(relopts local_relopts *) returns
   
   
Deduplication works by periodically merging groups of duplicate
-   tuples together, forming a single posting list tuple for each
+   tuples together, forming a single posting list tuple for each
group.  The column key value(s) only appear once in this
representation.  This is followed by a sorted array of
TIDs that point to rows in the table.  This
-- 
2.17.0

>From fc883d317a895140771ce34564847bb9ca98b7e3 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 29 Mar 2020 21:22:43 -0500
Subject: [PATCH v8 02/14] doc: pg_stat_progress_basebackup

commit e65497df8f85ab9b9084c928ff69f384ea729b24
Author: Fujii Masao 
---
 doc/src/sgml/monitoring.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 304c49f07b..ea8780327f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6069,8 +6069,8 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
   waiting for checkpoint to finish
   
The WAL sender process is currently performing
-   pg_start_backup to set up for
-   taking a base backup, and waiting for backup start
+   pg_start_backup to prepare to
+   take a base backup, and waiting for the start-of-backup
checkpoint to finish.
   
  
-- 
2.17.0

>From 2d7c3ca43db77c910e8cda4b53fd6c6b09421b40 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 7 Apr 2020 19:56:56 -0500
Subject: [PATCH v8 03/14] doc: Allow users to limit storage reserved by
 replication slots

commit c6550776394e25c1620bc8258427c8f1d448080d
Author: Alvaro Herrera 
---
 doc/src/sgml/config.sgml | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7a7177c550..190157df0a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3900,9 +3900,9 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
 slots are allowed to retain in the pg_wal
 directory at checkpoint time.
 If max_slot_wal_keep_size is -1 (the default),
-replication slots retain unlimited amount of WAL files.  If
-restart_lsn of a replication slot gets behind more than that megabytes
-from the current LSN, the standby using the slot may no longer be able
+replication slots may retain an unlimited amount of WAL files.  Otherwise, if
+restart_lsn of a replication slot falls behind the current LSN by more
+than the given size, the standby using the slot may no longer be able
 to continue replication due to removal of required WAL files. You
 can see the WAL availability of replication slots
 in pg_replication_slots.
-- 
2.17.0

>From e06f8c6f048daa9829dc3bf0450caf5c099d9e4f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 6 Apr 2020 17:16:07 -0500
Subject: [PATCH v8 04/14] doc: s/evade/avoid/

---
 src/backend/access/gin/README | 2 +-
 src/backend/utils/adt/jsonpath_exec.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/gin/README b/src/backend/access/gin/README
index 125a82219b..41d4e1e8a0 100644
--- a/src/backend/access/gin/README
+++ b/src/backend/access/gin/README
@@ -413,7 +413,7 @@ leftmost leaf of the tree.
 Deletion algorithm keeps exclusive locks on left siblings of pages comprising
 currently investigated path.  Thus, if current page is to be removed, all
 required pages to remove both downlink and rightlink are already locked.  That
-evades potential right to left page locking order, which could deadlock with
+avoids potential right to left page locking order, which could deadlock with
 concurrent stepping right.
 
 A search concurrent to page deletion might already have read a pointer to the
diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c
index f146767bfc..2d2eb7d7a3 100644
--- a/src/backend/utils/adt/jsonpath_exec.c
+++ b/src/backend/utils/adt/jsonpath_exec.c
@@ -35,7 +35,7 @@
  * executeItemOptUnwrapTarget() function have 'unwrap' argument, which indicates
  * whether unwrapping of array is needed.  When unwrap == true, each of array
  * members is passed to executeItemOptUnwrapTarget() again but with unwrap == false
- * in order to

Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-09-09 Thread Justin Pryzby
On Wed, Sep 09, 2020 at 09:22:00PM +0900, Michael Paquier wrote:
> On Tue, Sep 08, 2020 at 07:17:58PM -0500, Justin Pryzby wrote:
> > Initially I added List *params, and Michael suggested to retire
> > ReindexStmt->concurrent.  I provided a patch to do so, initially by leaving 
> > int
> > options and then, after this, removing it to "complete the thought", and get
> > rid of the remnants of the "old way" of doing it.  This is also how vacuum 
> > and
> > explain are done.
> > https://www.postgresql.org/message-id/20200902022410.GA20149%40telsasoft.com
> 
> Defining a set of DefElem when parsing and then using the int
> "options" with bitmasks where necessary at the beginning of the
> execution looks like a good balance to me.  This way, you can extend
> the grammar to use things like (verbose = true), etc.

It doesn't need to be extended - defGetBoolean already handles that.  I don't
see what good can come from storing the information in two places in the same
structure.
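As a sketch of what defGetBoolean-style handling gives for free (illustrative
Python with a hypothetical flag name, not the backend code):

```python
VERBOSE = 1 << 0  # hypothetical flag bit, standing in for a CLUSTER option

def def_get_boolean(value):
    # Mirrors the defGetBoolean behavior discussed: a bare option name
    # means true; otherwise accept true/false, on/off, 1/0.
    if value is None:
        return True
    v = str(value).lower()
    if v in ('true', 'on', '1'):
        return True
    if v in ('false', 'off', '0'):
        return False
    raise ValueError("requires a Boolean value")

def parse_cluster_options(defelems):
    # Fold the parsed (name, value) option list into one int bitmask,
    # so execution-time code keeps using bit tests.
    options = 0
    for name, value in defelems:
        if name == 'verbose':
            if def_get_boolean(value):
                options |= VERBOSE
            else:
                options &= ~VERBOSE
        else:
            raise ValueError('unrecognized CLUSTER option "%s"' % name)
    return options

assert parse_cluster_options([('verbose', None)]) == VERBOSE
assert parse_cluster_options([('verbose', 'on')]) == VERBOSE
assert parse_cluster_options([('verbose', '0')]) == 0
```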

|postgres=# CLUSTER (VERBOSE on) pg_attribute USING 
pg_attribute_relid_attnum_index ;
|INFO:  clustering "pg_catalog.pg_attribute" using index scan on 
"pg_attribute_relid_attnum_index"
|INFO:  "pg_attribute": found 0 removable, 2968 nonremovable row versions in 55 
pages
|DETAIL:  0 dead row versions cannot be removed yet.
|CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
|CLUSTER

-- 
Justin




Re: [HACKERS] [PATCH] Generic type subscripting

2020-09-09 Thread Justin Pryzby
On Wed, Aug 05, 2020 at 04:04:22PM +0200, Dmitry Dolgov wrote:
> > On Sun, Aug 02, 2020 at 12:50:12PM +0200, Pavel Stehule wrote:
> > >
> > > > Maybe this could be salvaged by flushing 0005 in its current form and
> > > > having the jsonb subscript executor do something like "if the current
> > > > value-to-be-subscripted is a JSON array, then try to convert the textual
> > > > subscript value to an integer".  Not sure about what the error handling
> > > > rules ought to be like, though.
> > >
> > > I'm fine with the idea of separating 0005 patch and potentially prusuing
> > > it as an independent item. Just need to rebase 0006, since Pavel
> > > mentioned that it's a reasonable change he would like to see in the
> > > final result.
> >
> > +1
> 
> Here is what I had in mind. Worth noting that, as well as the original

This seems to already hit a merge conflict (8febfd185).
Would you re-rebase ?

-- 
Justin




Re: v13: show extended stats target in \d

2020-09-09 Thread Justin Pryzby
On Tue, Sep 01, 2020 at 12:35:19PM +, Georgios Kokolatos wrote:
> I will humbly disagree with the current review. I shall refrain from changing 
> the status though because I do not have very strong feeling about it.

Sorry, but this was in my spam folder, and I didn't see it until now.

> 
> However the patch contains:
> 
> - "  'm' = 
> any(stxkind) AS mcv_enabled\n"
> + "  'm' = 
> any(stxkind) AS mcv_enabled,\n"
> + "  %s"
>   "FROM 
> pg_catalog.pg_statistic_ext stat "
>   "WHERE stxrelid = 
> '%s'\n"
>   "ORDER BY 1;",
> + (pset.sversion >= 
> 13 ? "stxstattarget\n" : "-1\n"),
>   oid);
> 
> This seems to be breaking a bit the pattern in describeOneTableDetails().
> First, it is customary to write the whole query for the version in its own 
> block. I do find this pattern rather helpful and clean. So in my humble 
> opinion, the ternary expression should be replaced with a distinct if block 
> that would implement stxstattarget. Second, setting the value to -1 as an 
> indication of absence breaks the pattern a bit further. More on that bellow.

Hm, I did it like this using the "hastriggers" code as a template.  But you're
right that everywhere else does it differently.

> +   if (strcmp(PQgetvalue(result, i, 8), 
> "-1") != 0)
> +   appendPQExpBuffer(&buf, " 
> STATISTICS %s",
> + 
> PQgetvalue(result, i, 8));
> +
> 
> In the same function, one will find a bit of a different pattern for printing 
> the statistics, e.g.
>  if (strcmp(PQgetvalue(result, i, 7), "t") == 0) 
> I will again hold the opinion that if the query gets crafted a bit 
> differently, one can inspect if the table has `stxstattarget` and then, go 
> ahead and print the value.
> 
> Something in the lines of:
> if (strcmp(PQgetvalue(result, i, 8), "t") == 0)
> appendPQExpBuffer(&buf, " STATISTICS %s", 
> PQgetvalue(result, i, 9));

I think what you've written wouldn't give the desired behavior, since it would
show the stats target even when it's set to the default.  I don't see the point
of having additional, separate, version-specific boolean columns for 1) column
exists; 2) column is not default, in addition to 3) column value.  But I added
comment about what Peter and I agree is desirable, anyway.
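As a minimal sketch of that display rule (assuming -1 stands in both for "default target" and for servers without stxstattarget, as in the patch), the target is appended only when it is non-default, with no separate booleans:

```python
def format_stats_object(name, definition, stxstattarget):
    """Append ', STATISTICS n' only for a non-default (-1) target."""
    line = f'"{name}" {definition}'
    if stxstattarget != -1:
        line += f", STATISTICS {stxstattarget}"
    return line

print(format_stats_object("ab1_a_b_stats", "(ndistinct) ON a, b FROM ab1", -1))
print(format_stats_object("ab1_a_b_stats", "(ndistinct) ON a, b FROM ab1", 0))
```

The first call shows nothing extra; the second shows an explicitly-set target, even when it is 0.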

> Finally, the patch might be more complete if a note is added in the 
> documentation.
> Have you considered adding something in doc/src/sgml/ref/psql-ref.sgml? If 
> no, will you consider it? If yes, why did you discard it?

I don't think the details of psql output are currently documented.  This shows
nothing about column statistics target, nor about MV stats at all.
https://www.postgresql.org/docs/13/app-psql.html

As for the discussion about a separator, I think maybe a comma is enough.  I
doubt anyone is going to think that you can get a valid command by prefixing
this by "CREATE STATISTICS".  Actually, it didn't even occur to me it was valid
command without the stats target - after all, that's not true of indexes.

+"public"."ab1_a_b_stats" (ndistinct, dependencies, mcv) ON a, b FROM ab1, 
STATISTICS 0

This revision only shows the stats target in verbose mode (slash dee plus).

-- 
Justin
>From e5b351cc9b0518c9162a4b988b17ed920f09d952 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 30 Aug 2020 23:38:17 -0500
Subject: [PATCH v2] Show stats target of extended statistics

The stats target can be set since commit d06215d03, but wasn't shown by psql.
 ALTER STATISTICS .. SET STATISTICS n.
Traditional column stats targets are shown in \d+ table, so do the same for
extended/MV stats.
---
 src/bin/psql/describe.c | 14 --
 src/test/regress/expected/stats_ext.out | 18 ++
 src/test/regress/sql/stats_ext.sql  |  2 ++
 3 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 0861d74a6f..3c391fe686 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2683,8 +2683,13

Re: v13: show extended stats target in \d

2020-09-09 Thread Justin Pryzby
On Wed, Sep 09, 2020 at 07:22:30PM -0300, Alvaro Herrera wrote:
> On 2020-Sep-09, Justin Pryzby wrote:
> 
> > As for the discussion about a separator, I think maybe a comma is enough.  I
> > doubt anyone is going to think that you can get a valid command by prefixing
> > this by "CREATE STATISTICS".  Actually, it didn't even occur to me it was 
> > valid
> > command without the stats target - after all, that's not true of indexes.
> 
> By the way, I got into this because of your comment in
> https://postgr.es/m/20200901011429.gz5...@telsasoft.com on whether we
> needed CREATE STATISTICS to have a clause for this.

And I agree that it does not - the question was raised during development of
the feature, and nobody sees a need.  I asked since I didn't know and I didn't
want it to be missed for lack of asking.

-- 
Justin




Re: please update ps display for recovery checkpoint

2020-09-09 Thread Justin Pryzby
On Mon, Aug 31, 2020 at 03:52:44PM +0900, Michael Paquier wrote:
> On Thu, Aug 20, 2020 at 05:09:05PM +0900, Michael Paquier wrote:
> > That could be helpful.  Wouldn't it be better to use "end-of-recovery
> > checkpoint" instead?  That's the common wording in the code comments.
> > 
> > I don't see the point of patch 0002.  In the same paragraph, we
> > already know that this applies to any checkpoints.
> 
> Thinking more about this..  Could it be better to just add some calls
> to set_ps_display() directly in CreateCheckPoint()?  This way, both
> the checkpointer as well as the startup process at the end of recovery
> would benefit from the change.

What would you want the checkpointer's ps to say ?

Normally it just says:
postgres  3468  3151  0 Aug27 ?00:20:57 postgres: checkpointer  
  

Or do you mean do the same thing as now, but one layer lower, like:

@@ -8728,6 +8725,9 @@ CreateCheckPoint(int flags)
+   if (flags & CHECKPOINT_END_OF_RECOVERY)
+   set_ps_display("recovery checkpoint");

-- 
Justin




Re: Fix for parallel BTree initialization bug

2020-09-09 Thread Justin Pryzby
On Tue, Sep 08, 2020 at 06:25:03PM +, Jameson, Hunter 'James' wrote:
> Hi, I ran across a small (but annoying) bug in initializing parallel BTree 
> scans, which causes the parallel-scan state machine to get confused. The fix 
> is one line; the description is a bit longer—

What postgres version was this ?

> Before, function _bt_first() would exit immediately if the specified scan 
> keys could never be satisfied--without notifying other parallel workers, if 
> any, that the scan key was done. This moved that particular worker to a scan 
> key beyond what was in the shared parallel-query state, so that it would 
> later try to read in "InvalidBlockNumber", without recognizing it as a 
> special sentinel value.
> 
> The basic bug is that the BTree parallel query state machine assumes that a 
> worker process is working on a key <= the global key--a worker process can be 
> behind (i.e., hasn't finished its work on a previous key), but never ahead. 
> By allowing the first worker to move on to the next scan key, in this one 
> case, without notifying other workers, the global key ends up < the first 
> worker's local key.
> 
> Symptoms of the bug are: on R/O, we get an error saying we can't extend the 
> index relation, while on an R/W we just extend the index relation by 1 block.

What's the exact error ?  Are you able to provide a backtrace ?

> To reproduce, you need a query that:
> 
> 1. Executes parallel BTree index scan;
> 2. Has an IN-list of size > 1;

Do you mean you have an index on col1 and a query condition like: col1 IN 
(a,b,c...) ?

> 3. Has an additional index filter that makes it impossible to satisfy the
> first IN-list condition.

.. AND col1::text||'foo' = '';
I think you mean that the "impossible" condition makes it so that a btree
worker exits early.

> (We encountered such a query, and therefore the bug, on a production 
> instance.)

Could you send the "shape" of the query or its plan, obfuscated and redacted as
need be ?
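The state-machine invariant described in the report can be sketched as a toy model (not PostgreSQL code): shared parallel-scan state tracks the current IN-list element ("scan key"), and a worker may only be at or behind it. A worker that gives up on a key without announcing completion leaves its local position ahead of the shared one:

```python
class SharedScanState:
    def __init__(self, nkeys):
        self.cur_key = 0          # key the scan is globally working on
        self.nkeys = nkeys

    def key_done(self, worker_key):
        # Invariant: a worker may be behind, but never ahead.
        assert worker_key <= self.cur_key, "worker ran ahead of shared state"
        if worker_key == self.cur_key:
            self.cur_key += 1     # let all workers move to the next key

def worker_early_exit(shared, announce):
    key = shared.cur_key
    # Suppose this key can never be satisfied, so the worker bails out.
    if announce:
        shared.key_done(key)      # the fix: notify other workers first
    return key + 1                # the worker's next local key either way
```

With `announce=False` the worker's local key ends up ahead of the shared key, which is the violated assumption the report describes.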

-- 
Justin




Re: recovering from "found xmin ... from before relfrozenxid ..."

2020-09-10 Thread Justin Pryzby
On Thu, Sep 10, 2020 at 11:21:02AM -0400, Robert Haas wrote:
> On Fri, Aug 28, 2020 at 5:55 AM Ashutosh Sharma  wrote:
> > Please have a look into the attached patch for the changes and let me know 
> > for any other concerns. Thank you.
> 
> I have committed this version.

Thanks ; I marked it as such in CF app.
https://commitfest.postgresql.org/29/2700/

-- 
Justin




Re: PG 13 release notes, first draft

2020-09-10 Thread Justin Pryzby
On Mon, Sep 07, 2020 at 08:40:26AM -0500, Justin Pryzby wrote:

Rebasing onto 3965de54e718600a4703233936e56a3202caf73f, I'm left with:

diff --git a/doc/src/sgml/release-13.sgml b/doc/src/sgml/release-13.sgml
index 8fffc6fe0a..69d143e10c 100644
--- a/doc/src/sgml/release-13.sgml
+++ b/doc/src/sgml/release-13.sgml
@@ -131,7 +131,7 @@ Author: Thomas Munro 
  
 
 
-SELECT round(sum(OLDVALUE / n::float)) FROM 
generate_series(1, OLDVALUE) s(n);
+SELECT round(sum(OLDVALUE / n::float)) AS newvalue 
FROM generate_series(1, OLDVALUE) s(n);
 
 
 
@@ -776,8 +776,8 @@ Author: Noah Misch 
 -->
 

-Allow skipping of WAL for full table writes if WAL writes to be skipped during a transaction
+which creates or rewrites a relation if  is minimal (Kyotaro
 Horiguchi)

@@ -1007,7 +1007,7 @@ Author: Michael Paquier 
 

 Add leader_pid to  to report parallel worker ownership
+linkend="pg-stat-activity-view"/> to report parallel worker's leader 
process
 (Julien Rouhaud)

   
@@ -1262,8 +1262,8 @@ Author: Peter Eisentraut 
 -->
 

-Enable Unix-domain sockets
-support on Windows (Peter Eisentraut)
+Enable support for Unix-domain 
sockets
+on Windows (Peter Eisentraut)

   
 
@@ -1391,8 +1391,8 @@ Author: Fujii Masao 
 -->
 
   
-   Allow WAL recovery to continue even if invalid
-   pages are referenced (Fujii Masao)
+   Allow recovery to continue even if invalid
+   pages are referenced by WAL (Fujii Masao)
   
 
   




Re: Fix for parallel BTree initialization bug

2020-09-10 Thread Justin Pryzby
Against all odds, I was able to reproduce this.

begin;
CREATE TABLE t AS SELECT generate_series(1,99)i;
ALTER TABLE t SET (parallel_workers=2, autovacuum_enabled=off);
CREATE INDEX ON t(i);
commit;

SET parallel_leader_participation=off; SET min_parallel_table_scan_size=0; SET 
enable_bitmapscan=off; SET enable_indexonlyscan=off; SET enable_seqscan=off;
explain(analyze , verbose on) SELECT COUNT(1) FROM t a WHERE a.i>555 AND i IN ( 
333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359,360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486,487,488,489,490,491,492,493,494,495,496,497,498,499,500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600,601,602,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,619,620,621,622,623,624,625,626,627,628,629,630,631,632,633,634,635,636,637,638,639,640,641,642,643,644,645,646,647,648,649,650,651,652,653,654,655,656,657,658,659,660,661,662,663,664,665,666
 ) ORDER BY 1;

Which gives a plan like:
 Sort  (cost=5543.71..5543.72 rows=1 width=8)
   Sort Key: (count(1))
   ->  Finalize Aggregate  (cost=5543.69..5543.70 rows=1 width=8)
 ->  Gather  (cost=5543.48..5543.69 rows=2 width=8)
   Workers Planned: 2
   ->  Partial Aggregate  (cost=4543.48..4543.49 rows=1 width=8)
 ->  Parallel Index Scan using t_i_idx on t a  
(cost=0.42..4204.92 rows=135423 width=0)

I don't get an error, on read-only hot standby.  I do get inconsistent results,
including on primary server.

count | 222
count | 214

This appears to be a bug in commit 569174f1b btree: Support parallel index 
scans.

I've added your patch here:
https://commitfest.postgresql.org/30/2729/

In the course of reproducing this, I also added:
@@ -1972,2 +1975,3 @@ _bt_readnextpage(IndexScanDesc scan, BlockNumber blkno, 
ScanDirection dir)
rel = scan->indexRelation;
+   Assert(BlockNumberIsValid(blkno));

-- 
Justin




Re: 回复:how to create index concurrently on partitioned table

2020-09-11 Thread Justin Pryzby
On Tue, Sep 08, 2020 at 01:31:05PM +0900, Michael Paquier wrote:
> - CIC on partitioned relations.  (Should we also care about DROP INDEX
> CONCURRENTLY as well?)

Do you have any idea what you think that should look like for DROP INDEX
CONCURRENTLY ?

-- 
Justin




Re: 回复:how to create index concurrently on partitioned table

2020-09-14 Thread Justin Pryzby
On Sat, Sep 12, 2020 at 10:35:34AM +0900, Michael Paquier wrote:
> On Fri, Sep 11, 2020 at 07:13:01PM -0500, Justin Pryzby wrote:
> > On Tue, Sep 08, 2020 at 01:31:05PM +0900, Michael Paquier wrote:
> >> - CIC on partitioned relations.  (Should we also care about DROP INDEX
> >> CONCURRENTLY as well?)
> > 
> > Do you have any idea what you think that should look like for DROP INDEX
> > CONCURRENTLY ?
> 
> Making the maintenance of the partition tree consistent to the user is
> the critical part here, so my guess on this matter is:
> 1) Remove each index from the partition tree and mark the indexes as
> invalid in the same transaction.  This makes sure that after commit no
> indexes would get used for scans, and the partition dependency tree
> pis completely removed with the parent table.  That's close to what we
> do in index_concurrently_swap() except that we just want to remove the
> dependencies with the partitions, and not just swap them of course.
> 2) Switch each index to INDEX_DROP_SET_DEAD, one per transaction
> should be fine as that prevents inserts.
> 3) Finish the index drop.
> 
> Step 2) and 3) could be completely done for each index as part of
> index_drop().  The tricky part is to integrate 1) cleanly within the
> existing dependency machinery while still knowing about the list of
> partitions that can be removed.  I think that this part is not that
> straight-forward, but perhaps we could just make this logic part of
> RemoveRelations() when listing all the objects to remove.

Thanks.

I see three implementation ideas..

1. I think your way has an issue that the dependencies are lost.  If there's an
interruption, the user may be left with hundreds or thousands of detached
indexes to clean up.  This is strange since there's actually no detach command
for indexes (but they're implicitly "attached" when a matching parent index is
created).  A 2nd issue is that DETACH currently requires an exclusive lock (but
see Alvaro's WIP patch).

2. Maybe the easiest way is to mark all indexes invalid and then drop all
partitions (concurrently) and then the partitioned parent.  If interrupted,
this would leave a parent index marked "invalid", and some child tables with no
indexes.  I think this may be "ok".  The same thing is possible if a concurrent
build is interrupted, right ?

3. I have a patch which changes index_drop() to "expand" a partitioned index 
into
its list of children.  Each of these becomes a List:
| indexId, heapId, userIndexRelation, userHeapRelation, heaplocktag, heaprelid, 
indexrelid
The same process is followed as for a single index, but handling all partitions
at once in two transactions total.  Arguably, this is bad since that function
currently takes a single Oid but would now end up operating on a list of
indexes.

Anyway, for now this is rebased on 83158f74d.

-- 
Justin
>From 94f3578d6a6ae9d7e33f902dd86c2b7cf0aee3e0 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 6 Jun 2020 17:42:23 -0500
Subject: [PATCH v7 1/3] Allow CREATE INDEX CONCURRENTLY on partitioned table

Note, this effectively reverts 050098b14, so take care to not reintroduce the
bug it fixed.

XXX: does pgstat_progress_update_param() break other commands progress ?
---
 doc/src/sgml/ref/create_index.sgml |   9 --
 src/backend/commands/indexcmds.c   | 135 +
 src/test/regress/expected/indexing.out |  60 ++-
 src/test/regress/sql/indexing.sql  |  18 +++-
 4 files changed, 166 insertions(+), 56 deletions(-)

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 33aa64e81d..c780dc9547 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -657,15 +657,6 @@ Indexes:
 cannot.

 
-   
-Concurrent builds for indexes on partitioned tables are currently not
-supported.  However, you may concurrently build the index on each
-partition individually and then finally create the partitioned index
-non-concurrently in order to reduce the time where writes to the
-partitioned table will be locked out.  In this case, building the
-partitioned index is a metadata only operation.
-   
-
   
  
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index f1b5f87e6a..d417404211 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -91,6 +91,7 @@ static bool ReindexRelationConcurrently(Oid relationOid, int options);
 
 static void ReindexPartitions(Oid relid, int options, bool isTopLevel);
 static void ReindexMultipleInternal(List *relids, int options);
+static void reindex_invalid_child_indexes(Oid indexRelationId);
 static void reindex_error_callback(void *args);
 static void update_relispartition(Oid relationId, bool newval);
 static bool Compar

Re: Force update_process_title=on in crash recovery?

2020-09-15 Thread Justin Pryzby
On Tue, Sep 15, 2020 at 10:01:18AM -0400, Tom Lane wrote:
> Thomas Munro  writes:
> > Based on a couple of independent reports from users with no idea how
> > to judge the progress of a system recovering from a crash, Christoph
> > and I wondered if we should override update_process_title for the
> > "recovering ..." message, at least until connections are allowed.  We
> > already do that to set the initial titles.
> 
> > Crash recovery is a rare case where important information is reported
> > through the process title that isn't readily available anywhere else,
> > since you can't log in.  If you want to gauge  progress on a system
> > that happened to crash with update_process_title set to off, your best
> > hope is probably to trace the process or spy on the files it has open,
> > to see which WAL segment it's accessing, but that's not very nice.
> 
> Seems like a good argument, but you'd have to be careful about the
> final state when you stop overriding update_process_title --- it can't
> be left looking like it's still-in-progress on some random WAL file.
> (Compare my nearby gripes about walsenders being sloppy about their
> pg_stat_activity and process title presentations.)

Related:
https://commitfest.postgresql.org/29/2688/

I'm not sure I understood Michael's recent message, but I think it may refer to
promotion of a standby.

-- 
Justin




Re: Yet another fast GiST build

2020-09-17 Thread Justin Pryzby
On Thu, Sep 17, 2020 at 11:38:47AM +0300, Heikki Linnakangas wrote:
> On 15/09/2020 14:36, Heikki Linnakangas wrote:
> > Another patch version, fixed a few small bugs pointed out by assertion
> > failures in the regression tests.
> 
> Pushed. Thanks everyone!

+/* FIXME: bump this before pushing! */
 #define CATALOG_VERSION_NO 202009031





Re: Online checksums patch - once again

2020-09-18 Thread Justin Pryzby
+ * changed to "inprogress-off", the barrier for mvoving to "off" can be
moving

+ * When disabling checksums, data_checksums will be set of "inprogress-off"
set *to*

+   get_namespace_name(RelationGetNamespace(reln)), 
RelationGetRelationName(reln),

I think this palloc()s a new copy of the namespace every 100 blocks.
Better do it outside the loop.

+   {"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+   {"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},

enabling / disabling ?

+typedef enum ChecksumType
+{
+   DATA_CHECKSUMS_OFF = 0,
+   DATA_CHECKSUMS_ON,
+   DATA_CHECKSUMS_INPROGRESS_ON,
+   DATA_CHECKSUMS_INPROGRESS_OFF
+}  ChecksumType;

Should this be a bitmask, maybe
DATA_CHECKSUMS_WRITE = 1 
DATA_CHECKSUMS_VERIFY = 2,
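An illustrative mapping of the four states onto those two flags (the assignment of the "inprogress" states is my assumption about the intended semantics, not something from the patch): while enabling, pages are written with checksums but not yet verified; while disabling, verification is switched off first, so "verify" always implies "write".

```python
WRITE, VERIFY = 1, 2

STATE_FLAGS = {
    "off": 0,
    "inprogress-on": WRITE,            # write everywhere, can't verify yet
    "on": WRITE | VERIFY,
    "inprogress-off": WRITE,           # stop verifying before stopping writes
}

def checksums_written(state):
    return bool(STATE_FLAGS[state] & WRITE)

def checksums_verified(state):
    return bool(STATE_FLAGS[state] & VERIFY)
```

One nice property of the bitmask view is that the invariant "verify implies write" can be checked mechanically for every state.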

It occurred to me that you could rephrase this patch as "Make checksum state
per-relation rather than cluster granularity".  That's currently an
implementation detail, but could be exposed as a feature.  That could be a
preliminary 0001 patch.  Half the existing patch would be 0002 "Allow
online enabling checksums for a given relation/database/cluster".  You might
save some of the existing effort of synchronizing the cluster-wide checksum
state, since it doesn't need to be synchronized.  The "data_checksums" GUC
might be removed, or changed to an enum: on/off/per_relation.  Moving from
"per_relation" to "on" would be an optional metadata-only change, allowed only
when all rels in all dbs are checksumed.  I'm not sure if you'd even care about
temp tables, since "relhaschecksum" would be authoritative, rather than a
secondary bit only used during processing.

XLogHintBitIsNeeded() and DataChecksumsEnabled() would need to check
relhaschecksum, which tentatively seems possible.

I'm not sure if it's possible, but maybe pg_checksums would be able to skip
rels which had already been checksummed "online" (with an option to force
reprocessing).

Maybe some people would want (no) checksums on specific tables, and that could
eventually be implemented as 0003: "ALTER TABLE SET checksums=".  I'm thinking
of that being used immediately after an CREATE, but I suppose ON would cause
the backend to rewrite the table with checksums synchronously (not in the BGW),
either with AEL or by calling ProcessSingleRelationByOid().

-- 
Justin




Re: should INSERT SELECT use a BulkInsertState?

2020-09-19 Thread Justin Pryzby
On Sun, Jul 12, 2020 at 08:57:00PM -0500, Justin Pryzby wrote:
> On Thu, Jun 04, 2020 at 10:30:47AM -0700, Andres Freund wrote:
> > On 2020-05-08 02:25:45 -0500, Justin Pryzby wrote:
> > > Seems to me it should, at least conditionally.  At least if there's a 
> > > function
> > > scan or a relation or ..
> > 
> > Well, the problem is that this can cause very very significant
> > regressions. As in 10x slower or more. The ringbuffer can cause constant
> > XLogFlush() calls (due to the lsn interlock), and the eviction from
> > shared_buffers (regardless of actual available) will mean future vacuums
> > etc will be much slower.  I think this is likely to cause pretty
> > widespread regressions on upgrades.
> > 
> > Now, it sucks that we have this problem in the general facility that's
> > supposed to be used for this kind of bulk operation. But I don't really
> > see it realistic as expanding use of bulk insert strategies unless we
> > have some more fundamental fixes.
> 
> I made this conditional on BEGIN BULK/SET bulk, so I'll solicit comments on 
> that.

@cfbot: rebased
>From acfc6ef7b84a6753a49b7f4c9d5b77a0abbfd37c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 8 May 2020 02:17:32 -0500
Subject: [PATCH v3] Allow INSERT SELECT to use a BulkInsertState

---
 src/backend/executor/nodeModifyTable.c | 23 +--
 src/backend/parser/gram.y  |  7 ++-
 src/backend/tcop/utility.c |  4 
 src/backend/utils/misc/guc.c   | 12 +++-
 src/include/executor/nodeModifyTable.h |  2 ++
 src/include/nodes/execnodes.h  |  2 ++
 src/include/parser/kwlist.h|  1 +
 7 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 9812089161..464ad5e346 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -75,6 +75,8 @@ static ResultRelInfo *getTargetResultRelInfo(ModifyTableState *node);
 static void ExecSetupChildParentMapForSubplan(ModifyTableState *mtstate);
 static TupleConversionMap *tupconv_map_for_subplan(ModifyTableState *node,
    int whichplan);
+/* guc */
+bool insert_in_bulk = false;
 
 /*
  * Verify that the tuples to be produced by INSERT or UPDATE match the
@@ -578,7 +580,7 @@ ExecInsert(ModifyTableState *mtstate,
 			table_tuple_insert_speculative(resultRelationDesc, slot,
 		   estate->es_output_cid,
 		   0,
-		   NULL,
+		   NULL, /* Bulk insert not supported */
 		   specToken);
 
 			/* insert index entries for tuple */
@@ -617,7 +619,7 @@ ExecInsert(ModifyTableState *mtstate,
 			/* insert the tuple normally */
 			table_tuple_insert(resultRelationDesc, slot,
 			   estate->es_output_cid,
-			   0, NULL);
+			   0, mtstate->bistate);
 
 			/* insert index entries for tuple */
 			if (resultRelInfo->ri_NumIndices > 0)
@@ -2332,6 +2334,17 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
+	mtstate->bistate = NULL;
+	if (operation == CMD_INSERT &&
+			node->onConflictAction != ONCONFLICT_UPDATE &&
+			node->rootResultRelIndex < 0)
+	{
+		// Plan *p = linitial(node->plans);
+		Assert(nplans == 1);
+
+		if (insert_in_bulk)
+			mtstate->bistate = GetBulkInsertState();
+	}
 
 	/* set up epqstate with dummy subplan data for the moment */
 	EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam);
@@ -2776,6 +2789,12 @@ ExecEndModifyTable(ModifyTableState *node)
 		   resultRelInfo);
 	}
 
+	if (node->bistate)
+	{
+		FreeBulkInsertState(node->bistate);
+		table_finish_bulk_insert((getTargetResultRelInfo(node))->ri_RelationDesc, 0);
+	}
+
 	/*
 	 * Close all the partitioned tables, leaf partitions, and their indices
 	 * and release the slot used for tuple routing, if set.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 017940bdcd..0bc2108a2b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -631,7 +631,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	ASSERTION ASSIGNMENT ASYMMETRIC AT ATTACH ATTRIBUTE AUTHORIZATION
 
 	BACKWARD BEFORE BEGIN_P BETWEEN BIGINT BINARY BIT
-	BOOLEAN_P BOTH BY
+	BOOLEAN_P BOTH BY BULK
 
 	CACHE CALL CALLED CASCADE CASCADED CASE CAST CATALOG_P CHAIN CHAR_P
 	CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
@@ -9846,6 +9846,9 @@ transaction_mode_item:
 			| NOT DEFERRABLE
 	{ $$ = makeDefElem("transaction_deferrable",
 	   makeIntConst(false, @1), @1); }
+			| BULK
+	{ $$ = makeDefElem("bulk",
+	   makeIntCo

Re: please update ps display for recovery checkpoint

2020-09-19 Thread Justin Pryzby
On Thu, Sep 10, 2020 at 01:37:10PM +0900, Michael Paquier wrote:
> On Wed, Sep 09, 2020 at 09:00:50PM -0500, Justin Pryzby wrote:
> > What would you want the checkpointer's ps to say ?
> > 
> > Normally it just says:
> > postgres  3468  3151  0 Aug27 ?00:20:57 postgres: checkpointer  
> >   
> 
> Note that CreateCheckPoint() can also be called from the startup
> process if the bgwriter has not been launched once recovery finishes.
> 
> > Or do you mean do the same thing as now, but one layer lower, like:
> >
> > @@ -8728,6 +8725,9 @@ CreateCheckPoint(int flags)
> > +   if (flags & CHECKPOINT_END_OF_RECOVERY)
> > +   set_ps_display("recovery checkpoint");
> 
> For the use-case discussed here, that would be fine.  Now the
> difficult point is how much information we can actually display
> without bloating ps while still have something meaningful.  Showing
> all the information from LogCheckpointStart() would bloat the output a
> lot for example.  So, thinking about that, my take would be to have ps
> display the following at the beginning of CreateCheckpoint() and
> CreateRestartPoint():
> - restartpoint or checkpoint
> - shutdown
> - end-of-recovery
> 
> The output also needs to be cleared once the routines finish or if
> there is a skip, of course.

In my initial request, I *only* care about the startup process' recovery
checkpoint.  AFAIK, this exits when it's done, so there may be no need to
"revert" to the previous "ps".  However, one could argue that it's currently a
bug that the "recovering NNN" portion isn't updated after finishing the WAL
files.

StartupXLOG -> xlogreader -> XLogPageRead -> WaitForWALToBecomeAvailable -> 
XLogFileReadAnyTLI -> XLogFileRead
  -> CreateCheckPoint

Maybe it's a bad idea if the checkpointer is continuously changing its display.
I don't see the utility in it, since log_checkpoints does more than ps could
ever do.  I'm concerned that would break things for someone using something
like pgrep.
|$ ps -wwf `pgrep -f 'checkpointer *$'`
|UIDPID  PPID  C STIME TTY  STAT   TIME CMD
|postgres  9434  9418  0 Aug20 ?Ss   214:25 postgres: checkpointer   

|pryzbyj  23010 23007  0 10:33 ?00:00:00 postgres: checkpointer 
checkpoint

I think this one is by far the most common, but somewhat confusing, since it's
only one word.  This led me to put parentheses around it:

|pryzbyj  26810 26809 82 10:53 ?00:00:12 postgres: startup 
(end-of-recovery checkpoint)

Related: I have always thought that this message meant "recovery will complete
Real Soon", but I now understand it to mean "beginning the recovery checkpoint,
which is flagged CHECKPOINT_IMMEDIATE" (and may take a long time).

|2020-09-19 10:53:26.345 CDT [26810] LOG:  checkpoint starting: end-of-recovery 
immediate

-- 
Justin
>From d1d473beb4b59a93ceb7859f4776da47d9381aee Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 19 Sep 2020 10:12:53 -0500
Subject: [PATCH v2 1/4] Clear PS display after recovery

...so it doesn't continue to say 'recovering ...' during checkpoint
---
 src/backend/access/transam/xlog.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 61754312e2..0183cae9a9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7396,6 +7396,7 @@ StartupXLOG(void)
 			/*
 			 * end of main redo apply loop
 			 */
+			set_ps_display("");
 
 			if (reachedRecoveryTarget)
 			{
-- 
2.17.0

>From 0e0bbeb6501f5b1a7a5f0e4109dbf8a8152de249 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 8 Feb 2020 09:16:14 -0600
Subject: [PATCH v2 2/4] Update PS display following replay of last xlog..

..otherwise it shows "recovering " for the duration of the recovery
checkpoint.
---
 src/backend/access/transam/xlog.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0183cae9a9..3d8220ba78 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8892,6 +8892,13 @@ CreateCheckPoint(int flags)
 	if (log_checkpoints)
 		LogCheckpointStart(flags, false);
 
+	if (flags & CHECKPOINT_END_OF_RECOVERY)
+		set_ps_display("(end-of-recovery checkpoint)");
+	else if (flags & CHECKPOINT_IS_SHUTDOWN)
+		set_ps_display("(shutdown checkpoint)");
+	else
+		set_ps_display("(checkpoint)");
+
 	TRACE_POSTGRESQL_CHECKPOINT_START(flags);
 
 	/*
@@ -9106,6 +9113,7 @@ CreateCheckPoint(int flags)
 
 	/* Real work is done, but log and update stats before releasin

Re: Probable documentation errors or improvements

2020-09-19 Thread Justin Pryzby
On Thu, Sep 10, 2020 at 12:19:55PM -0700, Yaroslav wrote:
> Disclaimer: I'm not a native speaker, so not sure those are actually
> incorrect, and can't offer non-trivial suggestions.

https://www.postgresql.org/message-id/flat/CAKFQuwZh3k_CX2-%2BNcZ%3DFZss4bX6ASxDFEXJTY6u4wTH%2BG8%2BKA%40mail.gmail.com#9f9eba0cbbf9b57455503537575f5339

Most of these appear to be reasonable corrections or improvements.

Would you want to submit a patch against the master branch ?
It'll be easier for people to read when it's in a consistent format.
And less work to read it than to write it.

I suggest first handling the 10+ changes which are most important and in need
of fixing.  After a couple rounds, then see if what's left is worth patching.

-- 
Justin




Re: doc review for v13

2020-09-19 Thread Justin Pryzby
On Thu, Sep 10, 2020 at 03:58:31PM +0900, Michael Paquier wrote:
> On Wed, Sep 09, 2020 at 09:37:42AM -0500, Justin Pryzby wrote:
> > I've added a few more.
> 
> I have done an extra round of review on this patch series, and applied
> what looked obvious to me (basically the points already discussed
> upthread).  Some parts applied down to 9.6 for the docs.

Thanks.  Here's the remainder, with some new ones.

-- 
Justin
>From ff7662fb2e68257bcffd9f0281220b5d3ab9dfbc Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 19:43:22 -0500
Subject: [PATCH v9 01/15] doc: btree deduplication

commit 0d861bbb702f8aa05c2a4e3f1650e7e8df8c8c27
Author: Peter Geoghegan 
---
 doc/src/sgml/btree.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml
index 20cabe921f..bb395e6a85 100644
--- a/doc/src/sgml/btree.sgml
+++ b/doc/src/sgml/btree.sgml
@@ -642,7 +642,7 @@ options(relopts local_relopts *) returns
   
   
Deduplication works by periodically merging groups of duplicate
-   tuples together, forming a single posting list tuple for each
+   tuples together, forming a single posting list tuple for each
group.  The column key value(s) only appear once in this
representation.  This is followed by a sorted array of
TIDs that point to rows in the table.  This
-- 
2.17.0

>From 9be911f869805b2862154d0170fa89fbd6be9e10 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 29 Mar 2020 21:22:43 -0500
Subject: [PATCH v9 02/15] doc: pg_stat_progress_basebackup

commit e65497df8f85ab9b9084c928ff69f384ea729b24
Author: Fujii Masao 
---
 doc/src/sgml/monitoring.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 673a0e73e4..4e0193a967 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6089,8 +6089,8 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
   waiting for checkpoint to finish
   
The WAL sender process is currently performing
-   pg_start_backup to set up for
-   taking a base backup, and waiting for backup start
+   pg_start_backup to prepare to
+   take a base backup, and waiting for the start-of-backup
checkpoint to finish.
   
  
-- 
2.17.0

>From 6500196ffbcc735eccb7623e728788aa3f5494bc Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 7 Apr 2020 19:56:56 -0500
Subject: [PATCH v9 03/15] doc: Allow users to limit storage reserved by
 replication slots

commit c6550776394e25c1620bc8258427c8f1d448080d
Author: Alvaro Herrera 
---
 doc/src/sgml/config.sgml | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 2c75876e32..6c1c9157d8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3902,9 +3902,9 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
 slots are allowed to retain in the pg_wal
 directory at checkpoint time.
 If max_slot_wal_keep_size is -1 (the default),
-replication slots retain unlimited amount of WAL files.  If
-restart_lsn of a replication slot gets behind more than that megabytes
-from the current LSN, the standby using the slot may no longer be able
+replication slots may retain an unlimited amount of WAL files.  Otherwise, if
+restart_lsn of a replication slot falls behind the current LSN by more
+than the given size, the standby using the slot may no longer be able
 to continue replication due to removal of required WAL files. You
 can see the WAL availability of replication slots
 in pg_replication_slots.
-- 
2.17.0

>From bf31db4f44e2d379660492a68343a1a65e10cb60 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 6 Apr 2020 17:16:07 -0500
Subject: [PATCH v9 04/15] doc: s/evade/avoid/

---
 src/backend/access/gin/README | 2 +-
 src/backend/utils/adt/jsonpath_exec.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/gin/README b/src/backend/access/gin/README
index 125a82219b..41d4e1e8a0 100644
--- a/src/backend/access/gin/README
+++ b/src/backend/access/gin/README
@@ -413,7 +413,7 @@ leftmost leaf of the tree.
 Deletion algorithm keeps exclusive locks on left siblings of pages comprising
 currently investigated path.  Thus, if current page is to be removed, all
 required pages to remove both downlink and rightlink are already locked.  That
-evades potential right to left page locking order, which could deadlock with
+avoids potential right to left page locking order, which could deadlock with
 concurrent stepping right.
 
 A search concurrent to page deletion might already have read a pointer to the
diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpa

Re: Yet another fast GiST build

2020-09-20 Thread Justin Pryzby
On Sun, Sep 20, 2020 at 05:10:05PM -0400, Tom Lane wrote:
> I wrote:
> > It appears that hyrax (CLOBBER_CACHE_ALWAYS) is not very happy
> > with this:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2020-09-19%2021%3A27%3A23
> 
> I reproduced that and traced it to a missing RelationOpenSmgr call.
> Fixed now.

This also appears to break checksums.

postgres=#  CREATE TABLE pvactst (i INT, a INT[], p POINT) with 
(autovacuum_enabled = off);
postgres=#  CREATE INDEX gist_pvactst ON pvactst USING gist (p);
postgres=#  INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM 
generate_series(1,1000) i;
WARNING:  page verification failed, calculated checksum 34313 but expected 0
ERROR:  invalid page in block 0 of relation base/12859/16389

I was able to make this work like so:

@@ -449,6 +450,7 @@ gist_indexsortbuild(GISTBuildState *state)
 
/* Write out the root */
PageSetLSN(pagestate->page, GistBuildLSN);
+   PageSetChecksumInplace(pagestate->page, GIST_ROOT_BLKNO, 
state->indexrel->rd_smgr);
smgrwrite(state->indexrel->rd_smgr, MAIN_FORKNUM, GIST_ROOT_BLKNO,
  pagestate->page, true);
if (RelationNeedsWAL(state->indexrel))
@@ -555,6 +557,7 @@ gist_indexsortbuild_flush_ready_pages(GISTBuildState *state)
 
PageSetLSN(page, GistBuildLSN);
 
+   PageSetChecksumInplace(page, state->pages_written, 
state->indexrel->rd_smgr);
smgrextend(state->indexrel->rd_smgr,
   MAIN_FORKNUM,
   state->pages_written++,

-- 
Justin




Re: Probable documentation errors or improvements

2020-09-23 Thread Justin Pryzby
On Sat, Sep 19, 2020 at 12:55:58PM -0500, Justin Pryzby wrote:
> On Thu, Sep 10, 2020 at 12:19:55PM -0700, Yaroslav wrote:
> > Disclaimer: I'm not a native speaker, so not sure those are actually
> > incorrect, and can't offer non-trivial suggestions.
> 
> https://www.postgresql.org/message-id/flat/CAKFQuwZh3k_CX2-%2BNcZ%3DFZss4bX6ASxDFEXJTY6u4wTH%2BG8%2BKA%40mail.gmail.com#9f9eba0cbbf9b57455503537575f5339
> 
> Most of these appear to be reasonable corrections or improvements.
> 
> Would you want to submit a patch against the master branch ?
> It'll be easier for people to read when it's in a consistent format.
> And less work to read it than to write it.
> 
> I suggest to first handle the 10+ changes which are most important and in need
> of fixing.  After a couple rounds, then see if what's left is worth patching.

A couple of these were already fixed (querytree and "be appear").

I provide a patch for a few others.

Copying Teodor et al regarding tsquery.

wdiff to follow.

commit 6c5a4e6fde2841f6a2ad00f3cc4530e7fcf9a9f3
Author: Justin Pryzby 
Date:   Tue Sep 22 23:34:54 2020 -0500

fix tsquery example

broken since bb140506df605fab58f48926ee1db1f80bdafb59

diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 8af6ac01d3..7daf292468 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -1623,9 +1623,9 @@ occurrences to display in the result.',


SELECT to_tsquery('fat') <-> to_tsquery('cat | rat');
  ?column?
[-]{++}
 'fat' <-> {+(+} 'cat' |[-'fat' <->-] 'rat' {+)+}

  
 

commit 096a9097d58de8e2d21e42fab528b27a5213a699
Author: Justin Pryzby 
Date:   Tue Sep 22 23:13:00 2020 -0500

HAVING conditions cannot be repeated

diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index b93e4ca208..94889b66b4 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -38,7 +38,7 @@ SELECT [ ALL | DISTINCT [ ON ( expressionfrom_item [, ...] ]
[ WHERE condition ]
[ GROUP BY grouping_element [, 
...] ]
[ HAVING condition[-[, ...]-] ]
[ WINDOW window_name AS ( 
window_definition ) [, ...] ]
[ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ]
[ ORDER BY expression [ ASC | 
DESC | USING operator ] [ NULLS { 
FIRST | LAST } ] [, ...] ]
diff --git a/doc/src/sgml/ref/select_into.sgml 
b/doc/src/sgml/ref/select_into.sgml
index b1af52a4da..e82e416d60 100644
--- a/doc/src/sgml/ref/select_into.sgml
+++ b/doc/src/sgml/ref/select_into.sgml
@@ -28,7 +28,7 @@ SELECT [ ALL | DISTINCT [ ON ( expressionfrom_item [, ...] ]
[ WHERE condition ]
[ GROUP BY expression [, ...] ]
[ HAVING condition[-[, ...]-] ]
[ WINDOW window_name AS ( 
window_definition ) [, ...] ]
[ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ]
    [ ORDER BY expression [ ASC | 
DESC | USING operator ] [ NULLS { 
FIRST | LAST } ] [, ...] ]

commit 1e16102f906d9ead33ebdfc0b6f5d862e851131b
Author: Justin Pryzby 
Date:   Tue Sep 22 22:24:14 2020 -0500

style and consistency

diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 390c49eb6a..649020b7da 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -612,7 +612,7 @@ amgettuple (IndexScanDesc scan,
   will pass the caller's snapshot test.  On success, 
amgettuple
   must also set scan->xs_recheck to true or false.
   False means it is certain that the index entry matches the scan keys.
   [-true-]{+True+} means this is not certain, and the conditions represented 
by the
   scan keys must be rechecked against the heap tuple after fetching it.
   This provision supports lossy index operators.
   Note that rechecking will extend only to the scan conditions; a partial

commit 44c7efab0499644060c19868d4c431007e8cccaa
Author: Justin Pryzby 
Date:   Tue Sep 22 22:55:59 2020 -0500

distro agnostic

diff --git a/doc/src/sgml/sepgsql.sgml b/doc/src/sgml/sepgsql.sgml
index 9961569afc..15417bf19d 100644
--- a/doc/src/sgml/sepgsql.sgml
+++ b/doc/src/sgml/sepgsql.sgml
@@ -88,7 +88,7 @@ Policy from config file:targeted
  
   To build this module, include the option --with-selinux in
   your PostgreSQL configure command.  Be sure that the
   libselinux-devel [-RPM-]{+package+} is installed at 
build time.
  

  
@@ -177,7 +177,7 @@ $ for DBNAME in template0 template1 postgres; do
   Makefile on your system; the path shown below is only an example.
   (This Makefile is usually supplied by the
   selinux-policy-devel or
   selinux-policy [-RPM.)-]{+package.)+}
   Once built, install this policy package using the
   semodule command, which loads supplied policy packages
   into the kernel.  If the package is correctly installed,

commit 1584c0dd6a5183b915

pg_upgrade: fail early if a tablespace dir already exists for new cluster version

2020-09-24 Thread Justin Pryzby
This should be caught during --check, or earlier in the upgrade, rather than
only after dumping the schema.

Typically, the tablespace dir would be left behind by a previous failed upgrade
attempt, causing a cascade of failed upgrades.  I guess I may not be the only
one with a 3-year-old wrapper which has a hack to check for this.

|rm -fr pgsql12.dat pgsql13.dat
|/usr/pgsql-12/bin/initdb -D pgsql12.dat --no-sync
|/usr/pgsql-13/bin/initdb -D pgsql13.dat --no-sync
|/usr/pgsql-12/bin/postgres -D pgsql12.dat -c port=5678 -k /tmp
|mkdir tblspc tblspc/PG_13_202007203
|psql -h /tmp -p 5678 postgres -c "CREATE TABLESPACE one LOCATION 
'/home/pryzbyj/tblspc'"
|/usr/pgsql-13/bin/pg_upgrade -D pgsql13.dat -d pgsql12.dat -b /usr/pgsql-12/bin
|pg_upgrade_utility.log:
|CREATE TABLESPACE "one" OWNER "pryzbyj" LOCATION '/home/pryzbyj/tblspc';
|psql:pg_upgrade_dump_globals.sql:27: ERROR:  directory 
"/home/pryzbyj/tblspc/PG_13_202007201" already in use as a tablespace
>From 3df104610d73833da60bffcc54756a6ecb2f390f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 24 Sep 2020 19:49:40 -0500
Subject: [PATCH v1] pg_upgrade --check to avoid tablespace failure mode

---
 src/bin/pg_upgrade/check.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 2f7aa632c5..b07e63dab2 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -13,6 +13,7 @@
 #include "fe_utils/string_utils.h"
 #include "mb/pg_wchar.h"
 #include "pg_upgrade.h"
+#include "common/relpath.h"
 
 static void check_new_cluster_is_empty(void);
 static void check_databases_are_compatible(void);
@@ -27,6 +28,7 @@ static void check_for_tables_with_oids(ClusterInfo *cluster);
 static void check_for_reg_data_type_usage(ClusterInfo *cluster);
 static void check_for_jsonb_9_4_usage(ClusterInfo *cluster);
 static void check_for_pg_role_prefix(ClusterInfo *cluster);
+static void check_for_existing_tablespace_dirs(ClusterInfo *new_cluster);
 static char *get_canonical_locale_name(int category, const char *locale);
 
 
@@ -187,6 +189,8 @@ check_new_cluster(void)
 	check_is_install_user(&new_cluster);
 
 	check_for_prepared_transactions(&new_cluster);
+
+	check_for_existing_tablespace_dirs(&new_cluster);
 }
 
 
@@ -541,6 +545,34 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
 	check_ok();
 }
 
+/*
+ * Check that tablespace dirs for new cluster do not already exist.
+ * If they did, they'd cause an error while restoring global objects.
+ * This allows failing earlier rather than only after dumping schema.
+ */
+static void
+check_for_existing_tablespace_dirs(ClusterInfo *new_cluster)
+{
+	char	old_tablespace_dir[MAXPGPATH];
+
+	prep_status("Checking for pre-existing tablespace directories for new cluster version");
+
+// if (GET_MAJOR_VERSION(old_cluster.major_version) <= 804)
+	for (int tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
+	{
+		struct stat statbuf;
+		snprintf(old_tablespace_dir, MAXPGPATH, "%s%c%s",
+os_info.old_tablespaces[tblnum],
+PATH_SEPARATOR, TABLESPACE_VERSION_DIRECTORY);
+
+		canonicalize_path(old_tablespace_dir);
+
+		// XXX: lstat ?
+		if (stat(old_tablespace_dir, &statbuf) == 0)
+			pg_fatal("tablespace directory already exists for new cluster version: \"%s\"\n",
+	 old_tablespace_dir);
+	}
+}
 
 /*
  * create_script_for_old_cluster_deletion()
-- 
2.17.0



Re: Probable documentation errors or improvements

2020-09-25 Thread Justin Pryzby
Split one patch about text search, added another one (sequences), added some
info to commit messages, and added here.
https://commitfest.postgresql.org/30/2744/

-- 
Justin
>From adf050ac6cc7d0905fc1613dce1a04f76a892609 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 22 Sep 2020 21:40:17 -0500
Subject: [PATCH v2 01/11] extraneous comma

---
 doc/src/sgml/rules.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index bcf860b68b..d6e3463ac2 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -769,7 +769,7 @@ SELECT t1.a, t2.b, t1.ctid FROM t1, t2 WHERE t1.a = t2.a;
 
 
 
-The benefit of implementing views with the rule system is,
+The benefit of implementing views with the rule system is
 that the planner has all
 the information about which tables have to be scanned plus the
 relationships between these tables plus the restrictive
@@ -781,7 +781,7 @@ SELECT t1.a, t2.b, t1.ctid FROM t1, t2 WHERE t1.a = t2.a;
 the best path to execute the query, and the more information
 the planner has, the better this decision can be. And
 the rule system as implemented in PostgreSQL
-ensures, that this is all information available about the query
+ensures that this is all information available about the query
 up to that point.
 
 
-- 
2.17.0

>From d6e81a48596edcd255015767095e8985464051e1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 23 Sep 2020 13:09:11 -0500
Subject: [PATCH v2 02/11] semicolons after END in programlisting

---
 doc/src/sgml/func.sgml|  2 +-
 doc/src/sgml/plpgsql.sgml | 12 ++--
 doc/src/sgml/rules.sgml   |  2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 461b748d89..1e18f332a3 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -26580,7 +26580,7 @@ BEGIN
  obj.object_name,
  obj.object_identity;
 END LOOP;
-END
+END;
 $$;
 CREATE EVENT TRIGGER test_event_trigger_for_drops
ON sql_drop
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 815912666d..a71a8e7e48 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1143,7 +1143,7 @@ BEGIN
 SELECT users.userid INTO STRICT userid
 FROM users WHERE users.username = get_userid.username;
 RETURN userid;
-END
+END;
 $$ LANGUAGE plpgsql;
 
  On failure, this function might produce an error message such as
@@ -1816,7 +1816,7 @@ BEGIN
 RETURN NEXT r; -- return current row of SELECT
 END LOOP;
 RETURN;
-END
+END;
 $BODY$
 LANGUAGE plpgsql;
 
@@ -1844,7 +1844,7 @@ BEGIN
 END IF;
 
 RETURN;
- END
+ END;
 $BODY$
 LANGUAGE plpgsql;
 
@@ -1918,7 +1918,7 @@ DECLARE myvar int := 5;
 BEGIN
   CALL triple(myvar);
   RAISE NOTICE 'myvar = %', myvar;  -- prints 15
-END
+END;
 $$;
 
 
@@ -3521,7 +3521,7 @@ BEGIN
 ROLLBACK;
 END IF;
 END LOOP;
-END
+END;
 $$;
 
 CALL transaction_test1();
@@ -5175,7 +5175,7 @@ DECLARE
 f1 int;
 BEGIN
 RETURN f1;
-END
+END;
 $$ LANGUAGE plpgsql;
 WARNING:  variable "f1" shadows a previously defined variable
 LINE 3: f1 int;
diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index d6e3463ac2..e81addcfa9 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -2087,7 +2087,7 @@ CREATE FUNCTION tricky(text, text) RETURNS bool AS $$
 BEGIN
 RAISE NOTICE '% => %', $1, $2;
 RETURN true;
-END
+END;
 $$ LANGUAGE plpgsql COST 0.01;
 
 SELECT * FROM phone_number WHERE tricky(person, phone);
-- 
2.17.0

>From db23e239d021e018c3b9d6cb4d983f5bbcae8b85 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 22 Sep 2020 22:51:47 -0500
Subject: [PATCH v2 03/11] punctuation

---
 doc/src/sgml/catalogs.sgml | 2 +-
 doc/src/sgml/config.sgml   | 4 ++--
 doc/src/sgml/parallel.sgml | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index de9bacd34f..d1b8fc8010 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1525,7 +1525,7 @@
   
   
Role can log in. That is, this role can be given as the initial
-   session authorization identifier
+   session authorization identifier.
   
  
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8eabf93834..1c753ccb7e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -10084,8 +10084,8 @@ LOG:  CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1)
   
   

-If set, do not trace locks for tables below this OID. (use to avoid
-output on system tables)
+If set, do not trace locks for tables below this OID (used to avoid
+output on system tables).


 This parameter is only av

Re: pg_upgrade: fail early if a tablespace dir already exists for new cluster version

2020-09-25 Thread Justin Pryzby
Added at
https://commitfest.postgresql.org/30/2739/
>From 831246aa6d6837b2b0da7c96ad2f44ffd6cd3951 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 24 Sep 2020 19:49:40 -0500
Subject: [PATCH v2] pg_upgrade --check to avoid tablespace failure mode

---
 src/bin/pg_upgrade/check.c | 37 +
 1 file changed, 37 insertions(+)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 2f7aa632c5..154e3d249e 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -13,6 +13,7 @@
 #include "fe_utils/string_utils.h"
 #include "mb/pg_wchar.h"
 #include "pg_upgrade.h"
+#include "common/relpath.h"
 
 static void check_new_cluster_is_empty(void);
 static void check_databases_are_compatible(void);
@@ -27,6 +28,7 @@ static void check_for_tables_with_oids(ClusterInfo *cluster);
 static void check_for_reg_data_type_usage(ClusterInfo *cluster);
 static void check_for_jsonb_9_4_usage(ClusterInfo *cluster);
 static void check_for_pg_role_prefix(ClusterInfo *cluster);
+static void check_for_existing_tablespace_dirs(ClusterInfo *new_cluster);
 static char *get_canonical_locale_name(int category, const char *locale);
 
 
@@ -187,6 +189,8 @@ check_new_cluster(void)
 	check_is_install_user(&new_cluster);
 
 	check_for_prepared_transactions(&new_cluster);
+
+	check_for_existing_tablespace_dirs(&new_cluster);
 }
 
 
@@ -542,6 +546,39 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
 }
 
 
+/*
+ * Check that tablespace dirs do not already exist for new cluster version.
+ * If they did, it'd cause an error while restoring global objects.
+ * This allows failing earlier rather than only after dumping schema.
+ *
+ * Note, v8.4 has no tablespace_suffix, which is fine so long as the version
+ * being upgraded *to* has a suffix.
+ */
+static void
+check_for_existing_tablespace_dirs(ClusterInfo *new_cluster)
+{
+	char	old_tablespace_dir[MAXPGPATH];
+
+	prep_status("Checking for pre-existing tablespace directories");
+
+	for (int tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
+	{
+		struct stat statbuf;
+		snprintf(old_tablespace_dir, MAXPGPATH, "%s%s",
+os_info.old_tablespaces[tblnum],
+new_cluster->tablespace_suffix);
+
+		canonicalize_path(old_tablespace_dir);
+
+		// XXX: lstat ?
+		if (stat(old_tablespace_dir, &statbuf) == 0 || errno != ENOENT)
+			pg_fatal("tablespace directory already exists for new cluster version: \"%s\"\n",
+	 old_tablespace_dir);
+	}
+
+	check_ok();
+}
+
 /*
  * create_script_for_old_cluster_deletion()
  *
-- 
2.17.0



Re: 回复:how to create index concurrently on partitioned table

2020-09-26 Thread Justin Pryzby
On Thu, Sep 24, 2020 at 05:11:03PM +0900, Michael Paquier wrote:
> start.  So, the goal of the patch, as I would define it, is to give a
> way to CLUSTER to work on a partitioned table using a given
> partitioned index.  Hence, we would perform CLUSTER automatically from
> the top-most parent for any partitions that have an index on the same
> partition tree as the partitioned index defined in USING, switching
> indisclustered automatically depending on the index used.

I think that's right, except there's no need to look for a compatible
partitioned index: we just use the child index.

Also, if a partitioned index is clustered, when we clear indisclustered for
other indexes, should we also propagate that to their parent indexes, if any ?

> From what I can see, attempting to use a CLUSTER on a top-most
> partitioned table fails to work on child tables,

Oops - it looks like this patch never worked right, due to the RECHECK on
indisclustered.  I think maybe I returned to the CIC patch and never finished
with cluster.

> It would be good also to check if
> we have a partition index tree that maps partially with a partition
> table tree (aka no all table partitions have a partition index), where
> these don't get clustered because there is no index to work on.

This should not happen, since an incomplete partitioned index is "invalid".

> Isn't NoLock unsafe here, even knowing that an exclusive lock is taken on
> the parent?  It seems to me that at least schema changes should be
> prevented with a ShareLock in the first transaction building the list
> of elements to cluster.

Thanks for noticing.  I chose ShareUpdateExclusiveLock since it's
self-exclusive.

-- 
Justin
>From e4fe1e422dc2c0ee21ce9df15d5baf24e825bf9d Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 6 Jun 2020 17:42:23 -0500
Subject: [PATCH v8 1/3] Allow CREATE INDEX CONCURRENTLY on partitioned table

Note, this effectively reverts 050098b14, so take care to not reintroduce the
bug it fixed.

XXX: does pgstat_progress_update_param() break other commands progress ?
---
 doc/src/sgml/ref/create_index.sgml |   9 --
 src/backend/commands/indexcmds.c   | 135 +
 src/test/regress/expected/indexing.out |  60 ++-
 src/test/regress/sql/indexing.sql  |  18 +++-
 4 files changed, 166 insertions(+), 56 deletions(-)

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 33aa64e81d..c780dc9547 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -657,15 +657,6 @@ Indexes:
 cannot.

 
-   
-Concurrent builds for indexes on partitioned tables are currently not
-supported.  However, you may concurrently build the index on each
-partition individually and then finally create the partitioned index
-non-concurrently in order to reduce the time where writes to the
-partitioned table will be locked out.  In this case, building the
-partitioned index is a metadata only operation.
-   
-
   
  
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index f1b5f87e6a..d417404211 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -91,6 +91,7 @@ static bool ReindexRelationConcurrently(Oid relationOid, int options);
 
 static void ReindexPartitions(Oid relid, int options, bool isTopLevel);
 static void ReindexMultipleInternal(List *relids, int options);
+static void reindex_invalid_child_indexes(Oid indexRelationId);
 static void reindex_error_callback(void *args);
 static void update_relispartition(Oid relationId, bool newval);
 static bool CompareOpclassOptions(Datum *opts1, Datum *opts2, int natts);
@@ -665,17 +666,6 @@ DefineIndex(Oid relationId,
 	partitioned = rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE;
 	if (partitioned)
 	{
-		/*
-		 * Note: we check 'stmt->concurrent' rather than 'concurrent', so that
-		 * the error is thrown also for temporary tables.  Seems better to be
-		 * consistent, even though we could do it on temporary table because
-		 * we're not actually doing it concurrently.
-		 */
-		if (stmt->concurrent)
-			ereport(ERROR,
-	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-	 errmsg("cannot create index on partitioned table \"%s\" concurrently",
-			RelationGetRelationName(rel))));
 		if (stmt->excludeOpNames)
 			ereport(ERROR,
 	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1110,6 +1100,11 @@ DefineIndex(Oid relationId,
 		if (pd->nparts != 0)
 			flags |= INDEX_CREATE_INVALID;
 	}
+	else if (concurrent && OidIsValid(parentIndexId))
+	{
+		/* If concurrent, initially build index partitions as "invalid" */
+		flags |= INDEX_CREATE_INVALID;
+	}
 
 	if (stmt->deferrable)
 		constr_flags |= INDEX_CONSTR_CREATE_DEFERRABLE;
@@ -1162,6 +1157,14 @@ Defi

Re: More efficient RI checks - take 2

2020-09-26 Thread Justin Pryzby
On Fri, Jun 05, 2020 at 05:16:43PM +0200, Antonin Houska wrote:
> Antonin Houska  wrote:
> 
> > In general, the checks are significantly faster if there are many rows to
> > process, and a bit slower when we only need to check a single row.
> 
> Attached is a new version that uses the existing simple queries if there's
> only one row to check. SPI is used for both single-row and bulk checks - as
> discussed in this thread, it can perhaps be replaced with a different approach
> if appears to be beneficial, at least for the single-row checks.
> 
> I think using a separate query for the single-row check is more practicable
> than convincing the planner that the bulk-check query should only check a
> single row. So this patch version tries to show what it'd look like.

I'm interested in testing this patch, however there's a lot of internals to
digest.

Are there any documentation updates or regression tests to add ?  If FKs
support "bulk" validation, users should know when that applies, and be able to
check that it's working as intended.  Even if the test cases are overly verbose
or not stable, and not intended for commit, that would be a useful temporary
addition.

I think that calls=4 indicates this is using bulk validation.

postgres=# begin; explain(analyze, timing off, costs off, summary off, verbose) 
DELETE FROM t WHERE i<999; rollback;
BEGIN
  QUERY PLAN   
---
 Delete on public.t (actual rows=0 loops=1)
   ->  Index Scan using t_pkey on public.t (actual rows=998 loops=1)
 Output: ctid
 Index Cond: (t.i < 999)
 Trigger RI_ConstraintTrigger_a_16399 for constraint t_i_fkey: calls=4

I started thinking about this 1+ years ago wondering if a BRIN index could be
used for (bulk) FK validation.

So I would like to be able to see the *plan* for the query.

I was able to show the plan and see that BRIN can be used like so:
|SET auto_explain.log_nested_statements=on; SET client_min_messages=debug; SET 
auto_explain.log_min_duration=0;
Should the plan be visible in explain (not auto-explain) ?

BTW did you see this older thread ?
https://www.postgresql.org/message-id/flat/CA%2BU5nMLM1DaHBC6JXtUMfcG6f7FgV5mPSpufO7GRnbFKkF2f7g%40mail.gmail.com

-- 
Justin




Re: AppendStringInfoChar instead of appendStringInfoString

2020-09-27 Thread Justin Pryzby
On Fri, Sep 25, 2020 at 08:49:57AM +, Hou, Zhijie wrote:
> In (/src/backend/replication/backup_manifest.c)
> 
> I found the following code:
> 
>   appendStringInfoString(&buf, "\n");
>   appendStringInfoString(&buf, "\"");

Good point.  There's another one:

$ git grep -E 'appendStringInfoString.*".{,1}");'
src/backend/utils/adt/ruleutils.c:  appendStringInfoString(buf, "(");

I noticed you added a similar thread here.
https://commitfest.postgresql.org/30/

I think this one could be combined as a single patchset with the existing CF
entry for the other thread.

https://www.postgresql.org/message-id/flat/cb172cf4361e4c7ba7167429070979d4%40G08CNEXMBPEKD05.g08.fujitsu.local
https://www.postgresql.org/message-id/ce9a8564ccf1435eacf14bb7364ae9de%40G08CNEXMBPEKD05.g08.fujitsu.local

-- 
Justin




CREATE TABLE .. PARTITION OF fails to preserve tgenabled for inherited row triggers

2020-09-30 Thread Justin Pryzby
CREATE TABLE t(i int) PARTITION BY RANGE(i);
CREATE TABLE t1 PARTITION OF t FOR VALUES FROM (1) TO (10);
CREATE OR REPLACE FUNCTION tgf() RETURNS trigger LANGUAGE plpgsql AS $$ begin 
raise exception 'except'; end $$;
CREATE TRIGGER tg AFTER INSERT ON t FOR EACH ROW EXECUTE FUNCTION tgf();
ALTER TABLE t1 DISABLE TRIGGER tg;
INSERT INTO t VALUES(1); -- inserts when trigger is disabled: good
ALTER TABLE t DISABLE TRIGGER tg;
CREATE TABLE t2 PARTITION OF t FOR VALUES FROM (10) TO (20);

postgres=# SELECT tgrelid::regclass, tgenabled FROM pg_trigger WHERE 
tgrelid::regclass::text IN ('t1','t2');
 tgrelid | tgenabled 
-+---
 t1  | D
 t2  | O
(2 rows)

I consider this a bug, but CreateTrigStmt doesn't have any "enabled" member
(since it's impossible to CREATE TRIGGER .. DISABLED), so I'm not sure where
the fix should be.




terminate called after throwing an instance of 'std::bad_alloc'

2020-09-30 Thread Justin Pryzby
A VM crashed which is now running PG13.0 on centos7:

Sep 30 19:40:08 database7 abrt-hook-ccpp: Process 17905 (postgres) of user 26 
killed by SIGABRT - dumping core
Core was generated by `postgres: telsasoft ts 192.168.122.11(34608) SELECT'.

Unfortunately, the filesystem wasn't large enough and the corefile is
truncated.

The first badness in our logfiles looks like this ; this is the very head of
the logfile:

|[pryzbyj@database7 ~]$ sudo gzip -dc 
/var/log/postgresql/crash-postgresql-2020-09-30_194000.log.gz |head
|[sudo] password for pryzbyj: 
|terminate called after throwing an instance of 'std::bad_alloc'
|  what():  std::bad_alloc
|< 2020-09-30 19:40:09.653 ADT  >LOG:  checkpoint starting: time
|< 2020-09-30 19:40:17.002 ADT  >LOG:  checkpoint complete: wrote 74 buffers 
(0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=7.331 s, sync=0.006 
s, total=7.348 s; sync files=25, longest=0.002 s, average=0.000 s; distance=295 
kB, estimate=4183 kB
|< 2020-09-30 19:40:22.642 ADT  >LOG:  server process (PID 17905) was 
terminated by signal 6: Aborted
|< 2020-09-30 19:40:22.642 ADT  >DETAIL:  Failed process was running: --BEGIN 
SQL
|SELECT * FROM

I was able to grep the filesystem to find what looks like the preceding logfile
(which our script had immediately rotated in its attempt to be helpful).

|< 2020-09-30 19:39:09.096 ADT  >LOG:  checkpoint starting: time
|< 2020-09-30 19:39:12.640 ADT  >LOG:  checkpoint complete: wrote 35 buffers 
(0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=3.523 s, sync=0.006 
s, total=3.544 s; sync files=14, longest=0.002 s, average=0.000 s; distance=103 
kB, estimate=4615 kB

This seemed familiar, and I found it had happened 18 months ago on a different
server:

|terminate called after throwing an instance of 'std::bad_alloc'
|  what():  std::bad_alloc
|< 2019-02-01 15:36:54.434 CST  >LOG:  server process (PID 13557) was 
terminated by signal 6: Aborted
|< 2019-02-01 15:36:54.434 CST  >DETAIL:  Failed process was running:
|SELECT *, '' as n
|FROM a, b WHERE
|a.id = b.id AND
|b.c IS NOT NULL
|ORDER BY b.c LIMIT 9

I have a log of pg_settings, which shows that this server ran v11.1 until
2019-02-14, when it was upgraded to v11.2.  Since v11.2 included a fix for a
bug I reported involving JIT and wide tables, I probably saw this crash and
dismissed it, even though tables here named "a" and "b" have only ~30 columns
combined.  The query that crashed in 2019 is actually processing a small queue,
and runs every 5sec.

The query that crashed today is a "day" level query which runs every 15min, so
it ran 70some times today with no issue.

Our DBs use postgis, and today's crash JOINs to the table with geometry
columns, but does not use them at all.

But the 2019 query doesn't even include the geometry table.  I'm not sure if
these are even the same crash, but if they are, I think it's maybe a JIT issue
and not postgis (??)

I've had JIT disabled since 2019, due to no performance benefit for us, but
I've been re-enabling it during upgrades and transitions, and instead disabling
jit_tuple_deforming (since this performs badly for columns with high attnum).
So maybe this will recur before too long.




Re: please update ps display for recovery checkpoint

2020-10-02 Thread Justin Pryzby
On Fri, Oct 02, 2020 at 04:28:14PM +0900, Michael Paquier wrote:
> > Related: I have always thought that this message meant "recovery will 
> > complete
> > Real Soon", but I now understand it to mean "beginning the recovery 
> > checkpoint,
> > which is flagged CHECKPOINT_IMMEDIATE" (and may take a long time).
> 
> Yep.  And at the end of crash recovery seconds feel like minutes.
> 
> I agree that "checkpointer checkpoint" is not the best fit.  Using
> parenthesis would also be inconsistent with the other usages of this
> API in the backend code.

I think maybe I got the idea for parentheses from these:
src/backend/tcop/postgres.c:set_ps_display("idle in 
transaction (aborted)");
src/backend/postmaster/postmaster.c-if (port->remote_port[0] != '\0')
src/backend/postmaster/postmaster.c-appendStringInfo(&ps_data, 
"(%s)", port->remote_port);


>  What about adding "running" then?  This
> would give "checkpointer running end-of-recovery checkpoint".

What about one of these?
"checkpointer: running end-of-recovery checkpoint"
"checkpointer running: end-of-recovery checkpoint"

Thanks for looking.

-- 
Justin




Re: terminate called after throwing an instance of 'std::bad_alloc'

2020-10-03 Thread Justin Pryzby
On Wed, Sep 30, 2020 at 09:16:09PM -0500, Justin Pryzby wrote:
> Our DBs use postgis, and today's crash JOINs to the table with geometry
> columns, but does not use them at all.

I'm wrong here - it does return the geometry columns.

We didn't use JIT with v12, but I've been setting jit=on along with v13
upgrades, so I was trying to convince myself that there's a change in v13, and
not just our config..

I've seen repeated OOMs on a small number of reports.  It looks like JIT
optimization/inlining used a good deal of RAM in v12, and even more on v13.
Maybe that's what's expected; it looks like this scales with the number of
partitions and number of functions, so it could be arbitrarily large, and not
respectful of work_mem (below, at the 4MB default).

 CREATE TABLE t(a int) PARTITION BY RANGE(a);
 \pset pager off
 SELECT format('CREATE TABLE t%s PARTITION OF t FOR VALUES FROM (%s)TO(%s) WITH 
(parallel_workers=2)', i, 100*i, 100*i+100) FROM generate_series(0,666)i;
 \gexec
 INSERT INTO t SELECT i%1 FROM generate_series(1,9)i;
 CREATE INDEX ON t(left(a::text,1));
 VACUUM ANALYZE t;
 \o /dev/null \\ SET client_min_messages=debug; SET log_executor_stats=on; SET 
jit=on; SET jit_expressions=on; SET jit_above_cost=0; SET 
jit_optimize_above_cost=1; SET jit_inline_above_cost=1; explain( analyze) 
SELECT * FROM t WHERE substr(a::text,length(a::text)-6,6)='1' OR 
substr(a::text,length(a::text)-6,6)='2' OR 
substr(a::text,length(a::text)-6,6)='3' OR 
substr(a::text,length(a::text)-6,6)='4' OR 
substr(a::text,length(a::text)-6,6)='5' OR 
substr(a::text,length(a::text)-6,6)='6';

In v13:
With optimize_above=-1, inline_above=-1: 333472 kB max resident size
With optimize_above=1, inline_above=1: 654036 kB max resident size

In v12:
With optimize_above=-1, inline_above=-1: 330988 kB max resident size
With optimize_above=1, inline_above=1: 556844 kB max resident size

These tests are on the same centos7 server, which has:
llvm5.0-libs-5.0.1-7.el7.x86_64

This appears to be similar under centos8 with LLVM9.
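(For reference, the "max resident size" figures above are the ru_maxrss field
reported by getrusage(2), which is what log_executor_stats prints.  A minimal
Python sketch of reading the same counter, just to show what the metric is —
on Linux the value is in kilobytes:)

```python
import resource

# "max resident size" as logged by log_executor_stats comes from
# getrusage(2)'s ru_maxrss; it is a high-water mark, so it never decreases.
before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
buf = bytearray(50 * 1024 * 1024)  # touch ~50MB so the RSS high-water mark grows
after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(before, after)
```

Because it is a high-water mark, a process that frees memory will still report
the same ru_maxrss, which is why the per-query numbers above only ever grow.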

The OOM itself could be due to our JIT settings and maybe not a bug.  My
thinking is that when we hit OOM, that sometimes manifests itself as the C++
exception, which somehow trickles up into a SIGABRT in postgres.

I don't know if there's a way to capture useful debugging info on OOM, but the
live processes do seem to be leaking memory, or otherwise not reusing it.

For example, look at this huge backend, which is connected to the process
which has been experiencing OOM.  The backend is 4.2GB while idle, so I'd have
hoped for it to have released everything it could.  I'm aware that free()
usually cannot return small allocations to the kernel, but that should not
lead to continued growth over time.

  PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND 

  
18000 postgres  20   0 4975152   4.2g   4684 S   0.0 55.6  81:29.87 postgres: 
main main 10.124.252.14(58296) idle 


main=# SELECT * FROM pg_stat_activity WHERE pid=18000;
datid| 151510
datname  | main
pid  | 18000
leader_pid   | 
usesysid | 16714
usename  | main
application_name | run_thresholds --daemon
client_addr  | 10.124.252.14
client_hostname  | 
client_port  | 58296
backend_start| 2020-10-02 22:26:16.676936-06
xact_start   | 
query_start  | 2020-10-03 12:37:37.639971-06
state_change | 2020-10-03 12:37:37.640183-06
wait_event_type  | Client
wait_event   | ClientRead
state| idle
backend_xid  | 
backend_xmin | 
query| ROLLBACK
backend_type | client backend

This is mostly just in "heap":

$ sudo cat /proc/18000/maps |awk -Wnon-decimal -F '[- ]' '{a="0x"$1; b="0x"$2; 
print (b-a)/1024**2,$0}' |sort -gr |head
3428.65 010d6000-d757c000 rw-p  00:00 0 
 [heap]
512.004 7f3ec96e8000-7f3ee96e9000 rw-p  00:00 0 
384.004 7f3e9e6e1000-7f3eb66e2000 rw-p  00:00 0 
148 2ac0-2aaab400 rw-s  00:0e 44184914   
/anon_hugepage (deleted)
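(The awk one-liner just converts each mapping's hex address range into a size
in MB; the same computation in Python, applied to the [heap] line above:)

```python
# Same computation as the awk one-liner: size in MB of a /proc/<pid>/maps
# mapping, taken from its "start-end" hex address range (the first field).
def mapping_mb(maps_line):
    start, end = (int(a, 16) for a in maps_line.split()[0].split("-"))
    return (end - start) / 1024**2

heap = "010d6000-d757c000 rw-p 00000000 00:00 0 [heap]"
print(round(mapping_mb(heap), 2))  # → 3428.65
```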

The postgres memory contexts don't show anything nearly that large.
  MessageContext: 25236472 total in 20 blocks; 2990656 free (4 chunks); 
22245816 used
  CacheMemoryContext: 25892160 total in 42 blocks; 22770984 free (28285 
chunks); 3121176 used
Grand total: 70879896 bytes in 1466 blocks; 29518440 free (29097 chunks); 
41361456 used

I was able to make a small, apparent leak like so.  This applies to
postgis3.0/postgres12/LLVM5/centos7, and
postgis3.1alpha/postgres13/LLVM9/centos8.

Re: terminate called after throwing an instance of 'std::bad_alloc'

2020-10-03 Thread Justin Pryzby
On Sat, Oct 03, 2020 at 04:01:49PM -0700, Andres Freund wrote:
> The below turned out to be somewhat obsoleted by what you wrote below,
> which I unfortunately hadn't yet read - but I think it's still good
> information, so I'm not gonna delete it:

Right, I understand that RES RAM might include shared buffers and
demand-loaded file-backed and mmap stuff.  But on our servers, I tentatively
wouldn't expect it to be more than a handful of GB.  It might be true that
most of our report processes run in forked children, specifically to avoid
leaks, and the forked children would have their own, short-lived backend,
which would hide any postgres leak (in addition to having tried to use jit
since v11).

> > I was able to make a small, apparent leak like so.  This applies to
> > postgis3.0/postgres12/LLVM5/centos7, and
> > postgis3.1alpha/postgres13/LLVM9/centos8.
> >
> > Inlining is essential to see the leak, optimization is not.  Showing every 
> > 9th
> > query:
> >
> > $ for a in `seq 1 99`; do echo "SET jit_inline_above_cost=1; SET 
> > jit_optimize_above_cost=-1; SET jit_above_cost=0; SET jit_expressions=on; 
> > SET log_executor_stats=on; SET client_min_messages=debug; SET jit=on; 
> > explain analyze SELECT st_x(site_geo) FROM sites WHERE site_geo IS NOT 
> > NULL;"; done |psql ts 2>&1 |grep 'resident' |sed -n '1~9p'
> > !   46024 kB max resident size
> > !   46492 kB max resident size
> 
> Huh, that's certainly not good. Looking.

Reproduce like:
postgres=# CREATE EXTENSION postgis;
postgres=# CREATE TABLE geo(g geometry(point));
postgres=# INSERT INTO geo SELECT st_GeometryFromText('POINT(11 12)',0) FROM 
generate_series(1,9);
postgres=# SET jit_above_cost=0; SET jit_inline_above_cost=0; SET 
client_min_messages=debug; SET log_executor_stats=on; explain analyze SELECT 
st_x(g) FROM geo;




Re: Probable documentation errors or improvements

2020-10-05 Thread Justin Pryzby
On Fri, Sep 25, 2020 at 09:30:00AM -0500, Justin Pryzby wrote:
> Split one patch about text search, added another one (sequences), added some
> info to commit messages, and added here.
> https://commitfest.postgresql.org/30/2744/

Added an additional patch regarding spaces between function arguments.

-- 
Justin
>From 28c0407bcf044e1a9a8c542ae6a4304a54f3 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 22 Sep 2020 21:40:17 -0500
Subject: [PATCH v3 01/12] extraneous comma

---
 doc/src/sgml/rules.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index bcf860b68b..d6e3463ac2 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -769,7 +769,7 @@ SELECT t1.a, t2.b, t1.ctid FROM t1, t2 WHERE t1.a = t2.a;
 
 
 
-The benefit of implementing views with the rule system is,
+The benefit of implementing views with the rule system is
 that the planner has all
 the information about which tables have to be scanned plus the
 relationships between these tables plus the restrictive
@@ -781,7 +781,7 @@ SELECT t1.a, t2.b, t1.ctid FROM t1, t2 WHERE t1.a = t2.a;
 the best path to execute the query, and the more information
 the planner has, the better this decision can be. And
 the rule system as implemented in PostgreSQL
-ensures, that this is all information available about the query
+ensures that this is all information available about the query
 up to that point.
 
 
-- 
2.17.0

>From 4a181e8df650e2c93a2a3fded7bfd67081bb18dc Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 23 Sep 2020 13:09:11 -0500
Subject: [PATCH v3 02/12] semicolons after END in programlisting

---
 doc/src/sgml/func.sgml|  2 +-
 doc/src/sgml/plpgsql.sgml | 12 ++--
 doc/src/sgml/rules.sgml   |  2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e7cff980dd..8082fddf8c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -26597,7 +26597,7 @@ BEGIN
  obj.object_name,
  obj.object_identity;
 END LOOP;
-END
+END;
 $$;
 CREATE EVENT TRIGGER test_event_trigger_for_drops
ON sql_drop
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 74b6b25878..94a3b36458 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1181,7 +1181,7 @@ BEGIN
 SELECT users.userid INTO STRICT userid
 FROM users WHERE users.username = get_userid.username;
 RETURN userid;
-END
+END;
 $$ LANGUAGE plpgsql;
 
  On failure, this function might produce an error message such as
@@ -1854,7 +1854,7 @@ BEGIN
 RETURN NEXT r; -- return current row of SELECT
 END LOOP;
 RETURN;
-END
+END;
 $BODY$
 LANGUAGE plpgsql;
 
@@ -1882,7 +1882,7 @@ BEGIN
 END IF;
 
 RETURN;
- END
+ END;
 $BODY$
 LANGUAGE plpgsql;
 
@@ -1956,7 +1956,7 @@ DECLARE myvar int := 5;
 BEGIN
   CALL triple(myvar);
   RAISE NOTICE 'myvar = %', myvar;  -- prints 15
-END
+END;
 $$;
 
 
@@ -3559,7 +3559,7 @@ BEGIN
 ROLLBACK;
 END IF;
 END LOOP;
-END
+END;
 $$;
 
 CALL transaction_test1();
@@ -5213,7 +5213,7 @@ DECLARE
 f1 int;
 BEGIN
 RETURN f1;
-END
+END;
 $$ LANGUAGE plpgsql;
 WARNING:  variable "f1" shadows a previously defined variable
 LINE 3: f1 int;
diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index d6e3463ac2..e81addcfa9 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -2087,7 +2087,7 @@ CREATE FUNCTION tricky(text, text) RETURNS bool AS $$
 BEGIN
 RAISE NOTICE '% => %', $1, $2;
 RETURN true;
-END
+END;
 $$ LANGUAGE plpgsql COST 0.01;
 
 SELECT * FROM phone_number WHERE tricky(person, phone);
-- 
2.17.0

>From f3a1c02e297a501b116c079c8cbc41563522320b Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 22 Sep 2020 22:51:47 -0500
Subject: [PATCH v3 03/12] punctuation

---
 doc/src/sgml/catalogs.sgml | 2 +-
 doc/src/sgml/config.sgml   | 4 ++--
 doc/src/sgml/parallel.sgml | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 3927b1030d..5bd54cb218 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1525,7 +1525,7 @@
   
   
Role can log in. That is, this role can be given as the initial
-   session authorization identifier
+   session authorization identifier.
   
  
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ee914740cc..8b4a2e440f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -10084,8 +10084,8 @@ LOG:  CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1)
   
   

-If set, do not trace locks for tables below this OID. (use to avoid
-output on system tables)
+If set,

Re: 回复:how to create index concurrently on partitioned table

2020-10-05 Thread Justin Pryzby
On Mon, Oct 05, 2020 at 05:46:27PM +0900, Michael Paquier wrote:
> On Sat, Sep 26, 2020 at 02:56:55PM -0500, Justin Pryzby wrote:
> > Also, if a partitioned index is clustered, when we clear indisclustered for
> > other indexes, should we also propogate that to their parent indexes, if 
> > any ?
> 
> I am not sure what you mean here.  Each partition's cluster runs in
> its own individual transaction based on the patch you sent.  Are you
> suggesting to update indisclustered for the partitioned index of a
> partitioned table and all its parent partitioned in the same
> transaction, aka a transaction working on the partitioned table?

No, I mean that if a partition is no longer clustered on some index, then its
parent isn't clustered on that index's parent, either.

It means that we might do N catalog updates for a partition hierarchy that's N
levels deep.  Normally, N=2, and we'd clear indisclustered for the index as
well as its parent.  This is not essential, though.
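The walk-up I have in mind is simple — here's a toy model of it (pure Python,
not the catalog code; the index names and parent map are made up):

```python
# Toy model: clearing indisclustered up a partition hierarchy costs one
# catalog update per level.  Names here are hypothetical, not from postgres.
parent_index = {"t1_a_idx": "t_a_idx", "t_a_idx": None}
indisclustered = {"t1_a_idx": True, "t_a_idx": True}

def clear_clustered(idx):
    updates = 0
    while idx is not None:
        if indisclustered[idx]:
            indisclustered[idx] = False
            updates += 1
        idx = parent_index[idx]
    return updates

print(clear_clustered("t1_a_idx"))  # → 2 for the usual 2-level case
```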

> >> It would be good also to check if
> >> we have a partition index tree that maps partially with a partition
> >> table tree (aka no all table partitions have a partition index), where
> >> these don't get clustered because there is no index to work on.
> >
> > This should not happen, since an incomplete partitioned index is "invalid".
> 
> Indeed, I did not know this property.

I think that's something we can apply for CIC/DIC, too.
It's not essential to avoid leaving an "invalid" or partial index if
interrupted.  It's only essential that a partial, partitioned index is not
"valid".

For DROP IND CONCURRENTLY, I wrote:

On Mon, Sep 14, 2020 at 09:31:03AM -0500, Justin Pryzby wrote:
> 2. Maybe the easiest way is to mark all indexes invalid and then drop all
> partitions (concurrently) and then the partitioned parent.  If interrupted,
> this would leave a parent index marked "invalid", and some child tables with 
> no
> indexes.  I think this may be "ok".  The same thing is possible if a 
> concurrent
> build is interrupted, right ?


-- 
Justin




Re: 回复:how to create index concurrently on partitioned table

2020-10-05 Thread Justin Pryzby
On Tue, Oct 06, 2020 at 11:42:55AM +0900, Michael Paquier wrote:
> On Mon, Oct 05, 2020 at 03:08:32PM -0500, Justin Pryzby wrote:
> > It means that we might do N catalog updates for a partition hierarchy
> > that's N
> > levels deep.  Normally, N=2, and we'd clear indisclustered for the index as
> > well as its parent.  This is not essential, though.
> 
> Hmm.  I got to think more about this one, and being able to ensure a
> consistent state of indisclustered for partitioned tables and all
> their partitions across multiple transactions at all times would be a
> nice property, as we could easily track down if an operation has
> failed in-flight.  The problem here is that we are limited by
> indisclustered being a boolean, so we may set indisclustered one way
> or another on some partitions, or on some parent partitioned tables,
> but we are not sure if what's down is actually clustered for the
> leaves of a partitioned index, or not.  Or maybe we even have an
> inconsistent state, so this does not provide a good user experience.
> The consistency of a partition tree is a key thing, and we have such 
> guarantees with REINDEX (with or without concurrently), so what I'd
> like to think that what makes sense for indisclustered on a
> partitioned index is to have the following set of guarantees, and
> relying on indisvalid as being true iff an index partition tree is
> complete on a given table partition tree is really important:
> - If indisclustered is set to true for a partitioned index, then we
> have the guarantee that all its partition and partitioned indexes have
> indisclustered set to true, across all the layers down to the leaves.
> - If indisclustered is set to false for a partitioned index, then we
> have the guarantee that all its partition and partitioned indexes have
> indisclustered set to false.
> 
> If we happen to attach a new partition to a partitioned table of such
> a tree, I guess that it is then logic to have indisclustered set to
> the same value as the partitioned index when the partition inherits an
> index.  So, it seems to me that with the current catalogs we are
> limited by indisclustered as being a boolean to maintain some kind of
> consistency across transactions of CLUSTER, as we would use one
> transaction per leaf for the work.  In order to make things better
> here, we could switch indisclustered to a char, able to use three
> states:
> - 'e' or enabled, equivalent to indisclustered == true.  There should
> be only one index on a table with 'e' set at a given time.
> - 'd' or disabled, equivalent to indisclustered == false.
> - a new third state, for an equivalent of work-in-progress, let's call
> it 'w', or whatever.
> 
> Then, when we begin a CLUSTER on a partitioned table, I would imagine
> the following:
> - Switch all the indexes selected to 'w' in a first transaction, and
> do not reset the state of other indexes if there is one 'e'.  For
> CLUSTER without USING, we switch the existing 'e' to 'w' if there is
> such an index.  If there are no indexes select-able, issue an error.
> If we find an index with 'w', we have a candidate from a previous
> failed command, so this gets used.  I don't think that this design
> requires a new DDL either as we could reset all 'w' and 'e' to 'd' if
> using ALTER TABLE SET WITHOUT CLUSTER on a partitioned table.  For
> CLUSTER with USING, the partitioned index selected and its leaves are
> switched to 'w', similarly.
> - Then do the work for all the partitions, one partition per
> transaction, keeping 'w'.
> - In a last transaction, switch all the partitions from 'w' to 'e',
> resetting any existing 'e'.

Honestly, I think you're over-thinking and over-engineering indisclustered.

If "clusteredness" was something we offered to maintain across DML, I think
that might be important to provide stronger guarantees.  As it is now, I don't
think this patch is worth changing the catalog definition.

> ALTER TABLE CLUSTER ON should also be a supported operation, where 'e'
> gets switched for all the partitions in a single transaction.  Of
> course, this should not work for an invalid index.  Even with such a
> design, I got to wonder if there could be cases where a user does
> *not* want to cluster the leaves of a partition tree with the same
> index.  If that were to happen, the partition tree may need a redesign
> so making things work so as we force partitions to inherit what's
> wanted for the partitioned table makes the most sense to me.

I wondered the same thing: should clustering a parent in

Re: Adding CI to our tree

2022-03-10 Thread Justin Pryzby
On Thu, Mar 10, 2022 at 12:50:15PM -0800, Andres Freund wrote:
> > -  setup_cores_script: |
> > +  setup_os_script: |
> >  mkdir -m 770 /tmp/cores
> >  chown root:postgres /tmp/cores
> >  sysctl kern.corefile='/tmp/cores/%N.%P.core'
> > +#pkg install -y ...
> 
> Would you mind if I split this into setup_core_files_script and
> setup_additional_packages_script:?

That's fine.  FYI I'm also planning on using choco install --no-progress.
I could resend my latest patches shortly.

> > Subject: [PATCH 6/7] wip: cirrus/windows: add compiler_warnings_script
> > 
> > I'm not sure how to write this test in windows shell; it's also not easy to
> > write it in posix sh, since windows shell is somehow interpreting && and 
> > ||...
> 
> That comment isn't accurate anymore now that it's in an external script,
> right?

No, it is accurate.  What I mean is that it's also hard to write it as a
1-liner using posix sh, since the || (and &&) seemed to be interpreted by
cmd.exe and needed escaping - gross.

-- 
Justin




Re: Adding CI to our tree

2022-03-10 Thread Justin Pryzby
See attached, or at
https://github.com/justinpryzby/postgres/runs/5503079878
>From c631c3d9bdb8325aaaecc5dcdfac46eca7bd907a Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 17 Jan 2022 00:53:04 -0600
Subject: [PATCH 1/7] cirrus: include hints how to install OS packages..

This is useful for patches during development, but once a feature is merged,
new libraries should be added to the OS image files, rather than installed
during every CI run forever into the future.
---
 .cirrus.yml | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 40854046d66..6026a973dbb 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -73,10 +73,12 @@ task:
 chown -R postgres:postgres .
 mkdir -p ${CCACHE_DIR}
 chown -R postgres:postgres ${CCACHE_DIR}
-  setup_cores_script: |
+  setup_core_files_script: |
 mkdir -m 770 /tmp/cores
 chown root:postgres /tmp/cores
 sysctl kern.corefile='/tmp/cores/%N.%P.core'
+  setup_additional_packages_script: |
+#pkg install -y ...
 
   # NB: Intentionally build without --with-llvm. The freebsd image size is
   # already large enough to make VM startup slow, and even without llvm
@@ -180,10 +182,13 @@ task:
 chown -R postgres:postgres ${CCACHE_DIR}
 echo '* - memlock 134217728' > /etc/security/limits.d/postgres.conf
 su postgres -c "ulimit -l -H && ulimit -l -S"
-  setup_cores_script: |
+  setup_core_files_script: |
 mkdir -m 770 /tmp/cores
 chown root:postgres /tmp/cores
 sysctl kernel.core_pattern='/tmp/cores/%e-%s-%p.core'
+  setup_additional_packages_script: |
+#apt-get update
+#apt-get -y install ...
 
   configure_script: |
 su postgres <<-EOF
@@ -237,7 +242,7 @@ task:
 ulimit -a -H && ulimit -a -S
 export
 
-  setup_cores_script:
+  setup_core_files_script:
 - mkdir ${HOME}/cores
 - sudo sysctl kern.corefile="${HOME}/cores/core.%P"
 
@@ -254,7 +259,7 @@ task:
   # packages do not need to be downloaded.
   homebrew_cache:
 folder: $HOMEBREW_CACHE
-  homebrew_install_script: |
+  setup_additional_packages_script: |
 brew install \
   ccache \
   icu4c \
@@ -389,6 +394,9 @@ task:
 powershell -Command get-psdrive -psprovider filesystem
 set
 
+  setup_additional_packages_script: |
+REM choco install -y --no-progress ...
+
   configure_script:
 # copy errors out when using forward slashes
 - copy src\tools\ci\windows_build_config.pl src\tools\msvc\config.pl
@@ -484,6 +492,10 @@ task:
   ccache_cache:
 folder: $CCACHE_DIR
 
+  setup_additional_packages_script: |
+#apt-get update
+#apt-get -y install ...
+
   ###
   # Test that code can be built with gcc/clang without warnings
   ###
-- 
2.17.1

>From 9e5df76468c4ecc8f411f106c86d54c1f06f70f3 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 27 Feb 2022 15:17:50 -0600
Subject: [PATCH 2/7] cirrus: compile with -Og..

To improve performance of check-world and improve debugging, without
significantly slowing down builds (they're cached anyway).

This makes freebsd check-world run in 8.5 minutes rather than 15 minutes.
And on linux mitigates the run-time performance effect of --coverage.
---
 .cirrus.yml | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 6026a973dbb..5179353dc9d 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -107,7 +107,7 @@ task:
 \
 CC="ccache cc" \
 CXX="ccache c++" \
-CFLAGS="-O0 -ggdb"
+CFLAGS="-Og -ggdb"
 EOF
   build_script: su postgres -c "gmake -s -j${BUILD_JOBS} world-bin"
   upload_caches: ccache
@@ -201,8 +201,8 @@ task:
 CC="ccache gcc" \
 CXX="ccache g++" \
 CLANG="ccache clang" \
-CFLAGS="-O0 -ggdb" \
-CXXFLAGS="-O0 -ggdb"
+CFLAGS="-Og -ggdb" \
+CXXFLAGS="-Og -ggdb"
 EOF
   build_script: su postgres -c "make -s -j${BUILD_JOBS} world-bin"
   upload_caches: ccache
@@ -315,8 +315,8 @@ task:
   CC="ccache cc" \
   CXX="ccache c++" \
   CLANG="ccache ${brewpath}/llvm/bin/ccache" \
-  CFLAGS="-O0 -ggdb" \
-  CXXFLAGS="-O0 -ggdb" \
+  CFLAGS="-Og -ggdb" \
+  CXXFLAGS="-Og -ggdb" \
   \
   LLVM_CONFIG=${brewpath}/llvm/bin/llvm-config \
   PYTHON=python3
-- 
2.17.1

>From 92f64c297628ebb7026b6b61795444ea078720d2 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 20 Feb 2022 15:01:59 -0600
Subject: [PATCH 3/7] cirrus/windows: add compiler_warnings_script

I'm not sure how to write this test in windows shell; it's also not easy to
write it in posix sh, since windows shell is somehow interpreting && and ||...

ci-os-only: w

Re: refactoring basebackup.c

2022-03-10 Thread Justin Pryzby
I'm getting errors from pg_basebackup when using both -D- and 
--compress=server-*
The issue seems to go away if I use --no-manifest.

$ ./src/bin/pg_basebackup/pg_basebackup -h /tmp -Ft -D- --wal-method none 
--compress=server-gzip >/dev/null ; echo $?
pg_basebackup: error: tar member has empty name
1

$ ./src/bin/pg_basebackup/pg_basebackup -h /tmp -Ft -D- --wal-method none 
--compress=server-gzip >/dev/null ; echo $?
NOTICE:  WAL archiving is not enabled; you must ensure that all required WAL 
segments are copied through other means to complete the backup
pg_basebackup: error: COPY stream ended before last file was finished
1




Re: wal_compression=zstd

2022-03-10 Thread Justin Pryzby
On Fri, Mar 11, 2022 at 12:23:59PM +0900, Michael Paquier wrote:
> On Wed, Mar 09, 2022 at 07:14:11AM -0600, Justin Pryzby wrote:
> > Anyway there's no compelling reason to not use the default.  If we were to 
> > use
> > a non-default default, we'd have to choose between 1 and 2 (or some negative
> > compression level).  My thinking was that zstd-1 would give the 
> > lowest-hanging
> > fruits for zstd, while minimizing performance tradeoff, since WAL affects
> > interactivity.  But choosing between 1 and 2 seems like bikeshedding.
> 
> Yeah, I have looked again at the patch today, and I saw no reason to
> not apply it to give more options to the user as zstd or lz4 are both
> good in their own ways.  So done, with the default level used.

It's great news - thanks.

-- 
Justin




Re: refactoring basebackup.c

2022-03-11 Thread Justin Pryzby
On Fri, Mar 11, 2022 at 10:19:29AM -0500, Robert Haas wrote:
> So I think we should just refuse this command. Patch for that attached.

Sounds right.

Also, I think the magic 8 for .gz should actually be a 7.

I'm not sure why it tests for ".gz" but not ".tar.gz", which would help to make
them all less magic.
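To see the off-by-one concretely, counting the suffix lengths is enough (an
illustrative sketch, not code from the patch):

```python
# Illustrative sketch of the suffix checks in CreateBackupStreamer():
# the length constant must match the suffix actually being compared.
def has_suffix(name, suffix):
    # mirrors: archive_name_len > N && strcmp(name + len - N, suffix) == 0
    return len(name) > len(suffix) and name.endswith(suffix)

assert len(".tar.gz") == 7    # hence 7, not 8
assert len(".tar.lz4") == 8
assert len(".tar.zst") == 8

assert has_suffix("base.tar.gz", ".tar.gz")
assert not has_suffix("base.tar.gz", ".tar.lz4")
print("suffix lengths check out")
```

Comparing the full ".tar.xx" suffixes, as in the diff below, also means a bare
"foo.gz" that isn't a tar archive can no longer match.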

commit 1fb1e21ba7a500bb2b85ec3e65f59130fcdb4a7e
Author: Justin Pryzby 
Date:   Thu Mar 10 21:22:16 2022 -0600

pg_basebackup: make magic numbers less magic

The magic 8 for .gz should actually be a 7.

.tar.gz
1234567

.tar.lz4
.tar.zst
12345678

See d45099425, 751b8d23b, 7cf085f07.

diff --git a/src/bin/pg_basebackup/pg_basebackup.c 
b/src/bin/pg_basebackup/pg_basebackup.c
index 9f3ecc60fbe..8dd9721323d 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1223,17 +1223,17 @@ CreateBackupStreamer(char *archive_name, char 
*spclocation,
is_tar = (archive_name_len > 4 &&
  strcmp(archive_name + archive_name_len - 4, ".tar") 
== 0);
 
-   /* Is this a gzip archive? */
-   is_tar_gz = (archive_name_len > 8 &&
-strcmp(archive_name + archive_name_len - 3, 
".gz") == 0);
+   /* Is this a .tar.gz archive? */
+   is_tar_gz = (archive_name_len > 7 &&
+strcmp(archive_name + archive_name_len - 7, 
".tar.gz") == 0);
 
-   /* Is this a LZ4 archive? */
+   /* Is this a .tar.lz4 archive? */
is_tar_lz4 = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 4, 
".lz4") == 0);
+ strcmp(archive_name + archive_name_len - 8, 
".tar.lz4") == 0);
 
-   /* Is this a ZSTD archive? */
+   /* Is this a .tar.zst archive? */
is_tar_zstd = (archive_name_len > 8 &&
-  strcmp(archive_name + archive_name_len - 4, 
".zst") == 0);
+  strcmp(archive_name + archive_name_len - 8, 
".tar.zst") == 0);
 
/*
 * We have to parse the archive if (1) we're suppose to extract it, or 
if




Re: wal_compression=zstd

2022-03-11 Thread Justin Pryzby
While rebasing, I realized this should have bumped XLOG_PAGE_MAGIC.

Also, there's a dangling "and".

+The supported methods are pglz,
+lz4 (if PostgreSQL
+was compiled with --with-lz4) and
+zstd (if PostgreSQL
+was compiled with --with-zstd) and
+The default value is off.
+Only superusers can change this setting.

-- 
Justin




Re: Add parameter jit_warn_above_fraction

2022-03-12 Thread Justin Pryzby
On Mon, Mar 07, 2022 at 04:02:16PM +0100, Magnus Hagander wrote:
> On Mon, Mar 7, 2022 at 2:09 PM Justin Pryzby  wrote:
> > On Mon, Mar 07, 2022 at 01:10:32PM +0100, Magnus Hagander wrote:
> > > On Fri, Feb 25, 2022 at 5:23 PM Justin Pryzby  
> > > wrote:
> > > >
> > > > I think it should be a NOTICE (or less?)
> > >
> > > Hmm. I'm not so sure.
> > >
> > > Other similar parameters often use LOG, but the downside of that is
> > > that it won't be sent to the client.
> > >
> > > The problem with using NOTICE is that it won't go to the log by
> > > default. It needs to be at least warning to do that.
> >
> > Anyone who uses this parameter is already going to be changing GUCs, so it
> > doesn't need to work by default.  The people who are likely to enable this
> > already monitor their logs and have probably customized their logging
> > configuration.
> 
> Sure, but if you set log_min_messagse to NOTICE you are likely going
> to flood your logs beyond usefulness. And the whole idea of this
> parameter is you can keep it on during "normal times" to catch
> outliers.

I do set log_min_messages - not to NOTICE, but to INFO ;)

Would NOTICEs really flood most people's logs ?  From what ?  There aren't
that many, and most seem to be for DDL and utility commands.  We do, uh, more
of that than most people would say is reasonable.

I see a couple hundred of these per day:
| PostGIS: Unable to compute statistics for...
And during deployment, a few hundred more for IF NOT EXIST commands.
That's it.

I still think this GUC should not be WARNING.  If it's a warning, then *this*
will cause previously-quiet logs to be flooded - the opposite problem.  Which
is not desirable for a new guc.

https://www.postgresql.org/docs/current/runtime-config-logging.html

Severity  Usage                                                      syslog  eventlog
INFO      Provides information implicitly requested by the user,     INFO    INFORMATION
          e.g., output from VACUUM VERBOSE.
NOTICE    Provides information that might be helpful to users,       NOTICE  INFORMATION
          e.g., notice of truncation of long identifiers.
WARNING   Provides warnings of likely problems, e.g., COMMIT         NOTICE  WARNING
          outside a transaction block.

> > I don't understand - why doesn't it combine trivially with
> > log_min_duration_statement?  Are you saying that the default / pre-existing 
> > min
> > duration may not log all of the intended queries ?  I think that just means 
> > the
> > configuration needs to be changed.  The GUC needs to allow/help finding 
> > these
> > JIT issues, but going to require an admin's interaction in any case.  
> > Avoiding
> > false positives may be important for it to be useful at all.
> >
> > Hmm .. should it also combine with min_sample_rate ?  Maybe that addresses 
> > your
> > concern.
> 
> For one, what if I don't want to log any queries unless this is the
> case? I leave log_min_duration_statement at -1...

Then you will conclude that our logging is inadequate for your case.
You need to filter the lines which don't interest you.

> Or what if I want to log say only queries taking more than 60 seconds.
> Now I will miss all queries taking 30 seconds where 99.9% of the time
> is spent on JIT which I definitely *would* want.

If you're unwilling to deal with log entries which are uninteresting, then
clearly you'll need a separate GUC just for log_min_duration_jit_above_fraction
(and maybe another one for jit_above_duration_log_level).  Or implement generic
log filtering based on file/severity/message.

As I see it: most people won't set this GUC at all.  Those who do might enable
it with a high threshold value of log_min_duration_statement (probably a
pre-existing, configured value) to see what there is to see - the worst issues.

A small fraction of people will enable it with a low threshold value for
log_min_duration_statement - maybe only temporarily or per-session.  They'll be
willing to sift through the logs to look for interesting things, like queries
that take 10ms, 75% of which is in JIT, but run hundreds of times per second.

This feature intends to make it easier to identify queries affected by this
problem, but it doesn't need to be possible to do that without logging anything
else.

-- 
Justin




Re: support for MERGE

2022-03-12 Thread Justin Pryzby
On Sat, Jan 29, 2022 at 12:03:35AM -0600, Justin Pryzby wrote:
> Note that MergeWhenClause and MergeAction have the node definition in a
> different order than the header, which is a bit confusing.

The .h files still order these fields differently from each other, and the
node funcs (at least MergeAction) also have a different order than the
.h files.

> diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
> index 1617702d9d..c8e8254b16 100644
> --- a/src/include/nodes/parsenodes.h
> +++ b/src/include/nodes/parsenodes.h
> @@ -117,7 +117,7 @@ typedef struct Query
> +typedef struct MergeWhenClause
> +{
> + NodeTag type;
> + boolmatched;/* true=MATCHED, false=NOT 
> MATCHED */
> + CmdType commandType;/* INSERT/UPDATE/DELETE/DO NOTHING */
> + Node   *condition;  /* WHEN conditions (raw parser) */
> + List   *targetList; /* INSERT/UPDATE targetlist */
> + /* the following members are only useful for INSERT action */
> + List   *cols;   /* optional: names of the 
> target columns */
> + List   *values; /* VALUES to INSERT, or NULL */
> + OverridingKind override;/* OVERRIDING clause */
> +} MergeWhenClause;

> +/*
> + * WHEN [NOT] MATCHED THEN action info
> + */
> +typedef struct MergeAction
> +{
> + NodeTag type;
> + boolmatched;/* true=MATCHED, false=NOT 
> MATCHED */
> + OverridingKind override;/* OVERRIDING clause */
> + Node   *qual;   /* transformed WHEN conditions 
> */
> + CmdType commandType;/* INSERT/UPDATE/DELETE/DO NOTHING */
> + List   *targetList; /* the target list (of TargetEntry) */
> + List   *updateColnos;   /* target attribute numbers of an 
> UPDATE */
> +} MergeAction;

> diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
> index d4f8455a2b..234a701045 100644
> --- a/src/backend/nodes/copyfuncs.c
> +++ b/src/backend/nodes/copyfuncs.c
> +static MergeAction *
> +_copyMergeAction(const MergeAction *from)
> +{
> + MergeAction *newnode = makeNode(MergeAction);
> +
> + COPY_SCALAR_FIELD(matched);
> + COPY_SCALAR_FIELD(commandType);
> + COPY_SCALAR_FIELD(override);
> + COPY_NODE_FIELD(qual);
> + COPY_NODE_FIELD(targetList);
> + COPY_NODE_FIELD(updateColnos);

> +static MergeWhenClause *
> +_copyMergeWhenClause(const MergeWhenClause *from)
> +{
> + MergeWhenClause *newnode = makeNode(MergeWhenClause);
> +
> + COPY_SCALAR_FIELD(matched);
> + COPY_SCALAR_FIELD(commandType);
> + COPY_NODE_FIELD(condition);
> + COPY_NODE_FIELD(targetList);
> + COPY_NODE_FIELD(cols);
> + COPY_NODE_FIELD(values);
> + COPY_SCALAR_FIELD(override);
> + return newnode;
> +}

> diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
> index f1002afe7a..5e1ff02a55 100644
> --- a/src/backend/nodes/equalfuncs.c
> +++ b/src/backend/nodes/equalfuncs.c
> @@ -841,6 +841,20 @@ _equalOnConflictExpr(const OnConflictExpr *a, const 
> OnConflictExpr *b)
> +static bool
> +_equalMergeAction(const MergeAction *a, const MergeAction *b)
> +{
> + COMPARE_SCALAR_FIELD(matched);
> + COMPARE_SCALAR_FIELD(commandType);
> + COMPARE_SCALAR_FIELD(override);
> + COMPARE_NODE_FIELD(qual);
> + COMPARE_NODE_FIELD(targetList);
> + COMPARE_NODE_FIELD(updateColnos);

> +static bool
> +_equalMergeWhenClause(const MergeWhenClause *a, const MergeWhenClause *b)
> +{
> + COMPARE_SCALAR_FIELD(matched);
> + COMPARE_SCALAR_FIELD(commandType);
> + COMPARE_NODE_FIELD(condition);
> + COMPARE_NODE_FIELD(targetList);
> + COMPARE_NODE_FIELD(cols);
> + COMPARE_NODE_FIELD(values);
> + COMPARE_SCALAR_FIELD(override);
> +
> + return true;
> +}

> diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
> index 6bdad462c7..7549b27b39 100644
> --- a/src/backend/nodes/outfuncs.c
> +++ b/src/backend/nodes/outfuncs.c
> @@ -429,6 +429,21 @@ _outModifyTable(StringInfo str, const ModifyTable *node)
> +static void
> +_outMergeWhenClause(StringInfo str, const MergeWhenClause *node)
> +{
> + WRITE_NODE_TYPE("MERGEWHENCLAUSE");
> +
> + WRITE_BOOL_FIELD(matched);
> + WRITE_ENUM_FIELD(commandType, CmdType);
> + WRITE_NODE_FIELD(condition);
> + WRITE_NODE_FIELD(targetList);
> + WRITE_NODE_FIELD(cols);
> + WRITE_NODE_FIELD(values);
> + WRITE_ENUM_FIELD(override, OverridingKind);

> +static void
> +_outMergeActio

Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)

2022-03-12 Thread Justin Pryzby
On Tue, Jan 25, 2022 at 10:19:53AM +0530, Shruthi Gowda wrote:
> On Tue, Jan 25, 2022 at 1:14 AM Robert Haas  wrote:
> > On Sat, Jan 22, 2022 at 2:20 AM Shruthi Gowda  wrote:
> > > Agree. In the latest patch, the template0 and postgres OIDs are fixed
> > > to unused manually assigned OIDs 4 and 5 respectively. These OIDs are
> > > no more listed as unused OIDs.
> >
> > Thanks. Committed with a few more cosmetic changes.
> 
> Thanks, Robert for committing this patch.

If I'm not wrong, this can be closed.
https://commitfest.postgresql.org/37/3296/




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-12 Thread Justin Pryzby
On Sun, Mar 13, 2022 at 09:45:35AM +0900, Michael Paquier wrote:
> On Wed, Mar 09, 2022 at 10:50:45AM -0600, Justin Pryzby wrote:
> > Rebased over 9e9858389 (Michael may want to look at the tuplestore part?).
> 
> Are you referring to the contents of 0003 here that changes the
> semantics of pg_ls_dir_files() regarding its setup call?

Yes, as it has this:

-   SetSingleFuncCall(fcinfo, SRF_SINGLE_USE_EXPECTED);
...
-   SetSingleFuncCall(fcinfo, 0);
...
+   if (flags & LS_DIR_METADATA)
+       SetSingleFuncCall(fcinfo, 0);
+   else
+       SetSingleFuncCall(fcinfo, SRF_SINGLE_USE_EXPECTED);

-- 
Justin




Re: refactoring basebackup.c (zstd workers)

2022-03-14 Thread Justin Pryzby
On Mon, Mar 14, 2022 at 09:41:35PM +0530, Dipesh Pandit wrote:
> I tried to implement support for parallel ZSTD compression. The
> library provides an option (ZSTD_c_nbWorkers) to specify the
> number of compression workers. The number of parallel
> workers can be set as part of compression parameter and if this
> option is specified then the library performs parallel compression
> based on the specified number of workers.
> 
> User can specify the number of parallel worker as part of
> --compress option by appending an integer value after at sign (@).
> (-Z, --compress=[{client|server}-]{gzip|lz4|zstd}[:LEVEL][@WORKERS])

I suggest to use a syntax that's more general than that, maybe something like

:[level=]N,parallel=N,flag,flag,...

For example, someone may want to use zstd "long" mode or (when it's released)
rsyncable mode, or specify fine-grained compression parameters (strategy,
windowLog, hashLog, etc).

I hope the same syntax will be shared with wal_compression and pg_dump.
And libpq, if that patch progresses.

BTW, I think this may be better left for PG16.

-- 
Justin




Re: refactoring basebackup.c (zstd workers)

2022-03-14 Thread Justin Pryzby
On Mon, Mar 14, 2022 at 01:02:20PM -0400, Robert Haas wrote:
> On Mon, Mar 14, 2022 at 12:35 PM Justin Pryzby  wrote:
> > I suggest to use a syntax that's more general than that, maybe something 
> > like
> >
> > :[level=]N,parallel=N,flag,flag,...
> >
> > For example, someone may want to use zstd "long" mode or (when it's 
> > released)
> > rsyncable mode, or specify fine-grained compression parameters (strategy,
> > windowLog, hashLog, etc).
> 
> That's an interesting idea. I wonder what the replication protocol
> ought to look like in that case. Should we have a COMPRESSION_DETAIL
> argument that is just a string, and let the server parse it out? Or
> separate protocol-level options? It does feel reasonable to have both
> COMPRESSION_LEVEL and COMPRESSION_WORKERS as first-class options, but
> I don't know that we want COMPRESSION_HASHLOG true as part of our
> first-class grammar.

I was only referring to the user-facing grammar.

Internally, I was thinking they'd all be handled as first-class options, with
separate struct fields and separate replication protocol options.  If an option
isn't known, it'd be rejected on the client side, rather than causing an error
on the server.

Maybe there'd be an option parser for this in common/ (I think that might
require having a new data structure there too, maybe one for each compression
method, or maybe a union{} to handle them all).  Most of the ~100 lines to
support wal_compression='zstd:N' are to parse out the N.

-- 
Justin




Re: [PATCH] Fix various spelling errors

2022-03-14 Thread Justin Pryzby
On Mon, Mar 14, 2022 at 11:03:50PM +, Kekalainen, Otto wrote:
>  Hello!
> 
> I propose the attached patch to be applied on the 'master' branch of 
> PostgreSQL
> to fix various spelling errors.
> 
> Most fixes are in comments and have no effect on functionality. Some fixes are
> also in variable names but they should be safe to change, as the change is
> consistent in all occurrences of the variable.

LGTM - I found a few of these myself.
Attached now, in case it's useful to handle them together.
>From 7a64f847db6129b347aa5ce2560111639f77a97c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 24 Jan 2022 19:40:13 -0600
Subject: [PATCH 01/17] doc: than than

My mistake at 410aa248e5a883fde4832999cc9b23c7ace0f2ff
---
 doc/src/sgml/postgres-fdw.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 8ebf0dc3a05..88f564d7fd8 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -1003,8 +1003,8 @@ postgres=# SELECT postgres_fdw_disconnect_all();
   it's passed to and used as application_name
   in a foreign server, note that it will be truncated to less than
   NAMEDATALEN characters and anything other than
-  than printable ASCII characters will be replaced with question
-  marks (?).
+  printable ASCII characters will be replaced with question marks
+  (?).
   See  for details.
  
 
-- 
2.17.1

>From 4f6cb9631f01efda87d139ae74200c216a332842 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 24 Jan 2022 20:13:38 -0600
Subject: [PATCH 02/17] doc: duplicate words: THE since
 0aa8a01d04c8fe200b7a106878eebc3d0af9105c

Found like:
time find doc -name '*.sgml' |xargs sed -srn ':l; /\n\n/!{N; b l}; /(\<[[:alpha:]]{1,})\>\n[[:space:]]*\<\1\>/!d; s//>>&<stream_start_cb
 and stream_stop_cb callbacks. Once all the decoded
 changes are transmitted, the transaction can be committed using the
-the stream_commit_cb callback
+stream_commit_cb callback
 (or possibly aborted using the stream_abort_cb callback).
 If two-phase commits are supported, the transaction can be prepared using the
 stream_prepare_cb callback,
-- 
2.17.1

>From 3c105b2a9c9bcb34e166a9ec36c1a5b5ddcb9b1c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 18 Feb 2022 15:09:44 -0600
Subject: [PATCH 03/17] remove duplicate word

---
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 8cb8cfe045e..44c0a4439ad 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -22,7 +22,7 @@ my $node = PostgreSQL::Test::Cluster->new('main');
 
 # For nearly all pg_basebackup invocations some options should be specified,
 # to keep test times reasonable. Using @pg_basebackup_defs as the first
-# element of the array passed to to IPC::Run interpolate the array (as it is
+# element of the array passed to IPC::Run interpolate the array (as it is
 # not a reference to an array)...
 my @pg_basebackup_defs = ('pg_basebackup', '--no-sync', '-cfast');
 
-- 
2.17.1

>From ece9a790b956ecefa748dc748f88bfeb97dded7a Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 25 Jan 2022 10:47:55 -0600
Subject: [PATCH 04/17] doc: database SYSTEM since
 aa01051418f10afbdfa781b8dc109615ca785ff9

---
 doc/src/sgml/ref/create_database.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/create_database.sgml b/doc/src/sgml/ref/create_database.sgml
index f70d0c75b4d..3b390027ed3 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -231,7 +231,7 @@ CREATE DATABASE name

 
  The object identifier to be used for the new database. If this
- parameter is not specified, the database will choose a suitable
+ parameter is not specified, the database system will choose a suitable
  OID automatically. This parameter is primarily intended for internal
  use by pg_upgrade, and only
      pg_upgrade can specify a value less
-- 
2.17.1

>From f920059b4ea195a6f916147fe4cc0c82c79d875e Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 16 Feb 2022 21:07:53 -0600
Subject: [PATCH 05/17] doc: Remove 'synchronized' from --no-sync

The corresponding change was made for pgupgrade.sgml in commit 410aa248
---
 doc/src/sgml/ref/pg_rewind.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml
index 33e6bb64ad6..c11f671855f 100644
--- a/doc/src/sgml/ref/pg_rewind.sgml
+++ b/doc/src/sgml/ref/pg_rewind.sgml
@@ -210,7 +210,7 @@ Post

Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-14 Thread Justin Pryzby
On Mon, Mar 14, 2022 at 01:53:54PM +0900, Michael Paquier wrote:
> On Wed, Mar 09, 2022 at 10:50:45AM -0600, Justin Pryzby wrote:
> > I also changed pg_ls_dir_recurse() to handle concurrent removal of a dir, 
> > which
> > I noticed caused an infrequent failure on CI.  However I'm not including 
> > that
> > here, since the 2nd half of the patch set isn't ready due to lstat()
> > on
> > windows.
> 
> lstat() has been a subject of many issues over the years with our
> internal emulation and issues related to its concurrency, but we use
> it in various areas of the in-core code, so that does not sound like
> an issue to me.  It depends on what you want to do with it in
> genfile.c and which data you'd expect, in addition to the detection of
> junction points for WIN32, I guess.

To avoid cycles, a recursive function would need to know whether to recurse
into a directory or to output that something is isdir=false or type=link, and
avoid recursing into it. 

> pg_ls_dir_recurse(), but that's a WITH RECURSIVE, so we would not
> really need it, do we?

Tom disliked it when I had written it as a recursive CTE, so I rewrote it in C.
129225.1606166...@sss.pgh.pa.us

> Hmm.  I am not convinced that we need a new set of SQL functions as
> presented in 0003 (addition of meta-data in pg_ls_dir()), and
> extensions of 0004 (do the same but for pg_ls_tmpdir) and 0005 (same
> for the other pg_ls* functions).  The changes of pg_ls_dir_files()
> make in my opinion the code harder to follow as the resulting tuple 
> size depends on the type of flag used, and you can already retrieve
> the rest of the information with a join, probably LATERAL, on
> pg_stat_file(), no?  The same can be said about 0007, actually.

Yes, one can get the same information with a lateral join (as I said 2 years
ago).  But it's more helpful to provide a function, rather than leave that to
people to figure out - possibly incorrectly, or badly, like by parsing the
output of COPY FROM PROGRAM 'ls -l'.  The query to handle tablespaces is
particularly obscure:
20200310183037.ga29...@telsasoft.com
20201223191710.gr30...@telsasoft.com

One could argue that most of the pg_ls_* functions aren't needed (including
1922d7c6e), since the same things are possible with pg_ls_dir() and
pg_stat_file().
|1922d7c6e Add SQL functions to monitor the directory contents of replication 
slots

The original, minimal goal of this patch was to show shared tempdirs in
pg_ls_tmpdir() - rather than hiding them misleadingly as currently happens.
20200310183037.ga29...@telsasoft.com
20200313131232.go29...@telsasoft.com

I added the metadata function 2 years ago since it's silly to show metadata for
tmpdir but not other, arbitrary directories.
20200310183037.ga29...@telsasoft.com
20200313131232.go29...@telsasoft.com
20201223191710.gr30...@telsasoft.com

> The addition of isdir for any of the paths related to pg_logical/ and
> the replication slots has also a limited interest, because we know
> already those contents, and these are not directories as far as I
> recall.

Except when we don't, since extensions can do things that core doesn't, as
Fabien pointed out.
alpine.DEB.2.21.2001160927390.30419@pseudo

> In the whole set, improving the docs as of 0001 makes sense, but the
> change is incomplete.  Most of 0002 also makes sense and should be
> stable enough.  I am less enthusiastic about any of the other changes
> proposed and what we can gain from these parts.

It is frustrating to hear this feedback now, after the patch has gone through
multiple rewrites over 2 years - based on other positive feedback and review.
I went to the effort to ask, numerous times, whether to write the patch and how
its interfaces should look.  Now, I'm hearing that not only the implementation
but its goals are wrong.  What should I have done to avoid that ?

20200503024215.gj28...@telsasoft.com
20191227195918.gf12...@telsasoft.com
20200116003924.gj26...@telsasoft.com
20200908195126.gb18...@telsasoft.com




Re: Assert in pageinspect with NULL pages

2022-03-15 Thread Justin Pryzby
On Tue, Mar 15, 2022 at 06:32:44PM +0900, Michael Paquier wrote:

> + if (!IS_BRIN(indexRel))
> + ereport(ERROR,
> + (errcode(ERRCODE_WRONG_OBJECT_TYPE),
> +  errmsg("\"%s\" is not a %s index",
> + 
> RelationGetRelationName(indexRel), "BRIN")));

If it were me, I'd write this without the extra parens around (errcode()).

> +-- Suppress the DETAIL message, to allow the tests to work across various
> +-- default page sizes.

I think you mean "various non-default page sizes" or "various page sizes".

-- 
Justin




Re: pg14 psql broke \d datname.nspname.relname

2022-03-15 Thread Justin Pryzby
On Wed, Oct 13, 2021 at 11:54:26AM -0500, Justin Pryzby wrote:
> It seems unfortunate if names from log messages qualified with datname were 
> now
> rejected.  Like this one:
> 
> | automatic analyze of table "ts.child.cdrs_2021_10_12"...

Mark mentioned this "log message" use case in his proposed commit message, but
I wanted to mention what seems like a more important parallel:

postgres=# SELECT 'postgres.public.postgres_log'::regclass;
regclass | postgres_log

postgres=# SELECT 'not.postgres.public.postgres_log'::regclass;
ERROR:  improper relation name (too many dotted names): not.postgres.public.postgres_log
postgres=# SELECT 'not.public.postgres_log'::regclass;
ERROR:  cross-database references are not implemented: "not.public.postgres_log"

I think Mark used this as the model behavior for \d for this patch, which
sounds right.  Since the "two dot" case wasn't fixed in 14.1 nor 2, it seems
better to implement the ultimate, intended behavior now, rather than trying to
exactly match what old versions did.  I'm of the understanding that's what
Mark's patch does, so +1 from me.

I don't know how someone upgrading from an old version would know about the
change, though (rejecting junk prefixes rather than ignoring them).  *If* it
were important, it seems like it'd need to be added to the 14.0 release notes.




Re: document the need to analyze partitioned tables

2022-03-15 Thread Justin Pryzby
On Mon, Mar 14, 2022 at 05:23:54PM -0400, Robert Haas wrote:
> On Fri, Jan 21, 2022 at 1:31 PM Tomas Vondra  
> wrote:
> > [ new patch ]
> 
> This patch is originally by Justin. The latest version is by Tomas. I
> think the next step is for Justin to say whether he's OK with the
> latest version that Tomas posted. If he is, then I suggest that he
> also mark it Ready for Committer, and that Tomas commit it. If he's
> not, he should say what he wants changed and either post a new version
> himself or wait for Tomas to do that.

Yes, I think it can be Ready.  Done.

I amended some of Tomas' changes (see 0003, attached as txt).

@cfbot: the *.patch file is for your consumption, and the others are only there
to show my changes.

> I think the fact that this is classified as a "Bug Fix" in the CommitFest
> application is not particularly good. I would prefer to see it
> classified under "Documentation". I'm prepared to concede that
> documentation can have bugs as a general matter, but nobody's data is
> getting eaten because the documentation wasn't updated. In fact, this
> is the fourth patch from the "bug fix" section I've studied this
> afternoon, and, well, none of them have been back-patchable code
> defects.

In fact, I consider this to be back-patchable back to v10.  IMO it's an
omission that this isn't documented.  Not all bugs cause data to be eaten.  If
someone reads the existing documentation, they might conclude that their
partitioned tables don't need to be analyzed, and they would've been better
served by not reading the docs.

-- 
Justin
>From c237d91ec258ebbf24ebd3a38e139777582817f6 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 22 Jul 2021 16:06:18 -0500
Subject: [PATCH] documentation deficiencies for ANALYZE of partitioned tables

This is partially extracted from 1b5617eb844cd2470a334c1d2eec66cf9b39c41a,
which was reverted.
---
 doc/src/sgml/maintenance.sgml | 29 +
 doc/src/sgml/ref/analyze.sgml | 32 +---
 2 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 36f975b1e5b..34d72dba784 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -290,6 +290,15 @@
 to meaningful statistical changes.

 
+   
+Tuples changed in partitions and inheritance children do not trigger
+analyze on the parent table.  If the parent table is empty or rarely
+changed, it may never be processed by autovacuum, and the statistics for
+the inheritance tree as a whole won't be collected. It is necessary to
+run ANALYZE on the parent table manually in order to
+keep the statistics up to date.
+   
+

 As with vacuuming for space recovery, frequent updates of statistics
 are more useful for heavily-updated tables than for seldom-updated
@@ -347,6 +356,19 @@
  ANALYZE commands on those tables on a suitable schedule.
 

+
+   
+
+ The autovacuum daemon does not issue ANALYZE commands
+ for partitioned tables.  Inheritance parents will only be analyzed if the
+ parent itself is changed - changes to child tables do not trigger
+ autoanalyze on the parent table.  If your queries require statistics on
+ parent tables for proper planning, it is necessary to periodically run
+ a manual ANALYZE on those tables to keep the statistics
+ up to date.
+
+   
+
   
 
   
@@ -819,6 +841,13 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
 since the last ANALYZE.

 
+   
+Partitioned tables are not processed by autovacuum.  Statistics
+should be collected by running a manual ANALYZE when it is
+first populated, and again whenever the distribution of data in its
+partitions changes significantly.
+   
+

 Temporary tables cannot be accessed by autovacuum.  Therefore,
 appropriate vacuum and analyze operations should be performed via
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index c423aeeea5e..8268ba87f63 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -263,9 +263,35 @@ ANALYZE [ VERBOSE ] [ table_and_columns
 
   
-If any of the child tables are foreign tables whose foreign data wrappers
-do not support ANALYZE, those child tables are ignored while
-gathering inheritance statistics.
+For partitioned tables, ANALYZE gathers statistics by
+sampling rows from all partitions; in addition, it will recurse into each
+partition and update its statistics.  Each leaf partition is analyzed only
+once, even with multi-level partitioning.  No statistics are collected for
+only the parent table (without data from its partitions), because with
+partitioning it's guaranteed t

Re: refactoring basebackup.c (zstd negative compression)

2022-03-16 Thread Justin Pryzby
Should zstd's negative compression levels be supported here ?

Here's a POC patch which is enough to play with it.

$ src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp 
--no-sync --compress=zstd |wc -c
12305659
$ src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp 
--no-sync --compress=zstd:1 |wc -c
13827521
$ src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp 
--no-sync --compress=zstd:0 |wc -c
12304018
$ src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp 
--no-sync --compress=zstd:-1 |wc -c
16443893
$ src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp 
--no-sync --compress=zstd:-2 |wc -c
17349563
$ src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp 
--no-sync --compress=zstd:-4 |wc -c
19452631
$ src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp 
--no-sync --compress=zstd:-7 |wc -c
21871505

Also, with a partial regression DB, this crashes when writing to stdout.

$ src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp 
--no-sync --compress=lz4 |wc -c
pg_basebackup: bbstreamer_lz4.c:172: bbstreamer_lz4_compressor_content: 
Assertion `mystreamer->base.bbs_buffer.maxlen >= out_bound' failed.
24117248

#4  0xe8b4 in bbstreamer_lz4_compressor_content 
(streamer=0x555a5260, member=0x7fffc760, 
data=0x73068010 "{ \"PostgreSQL-Backup-Manifest-Version\": 
1,\n\"Files\": [\n{ \"Path\": \"backup_label\", \"Size\": 227, 
\"Last-Modified\": \"2022-03-16 02:29:11 GMT\", \"Checksum-Algorithm\": 
\"CRC32C\", \"Checksum\": \"46f69d99\" },\n{ \"Pa"..., len=401072, 
context=BBSTREAMER_MEMBER_CONTENTS) at bbstreamer_lz4.c:172
mystreamer = 0x555a5260
next_in = 0x73068010 "{ \"PostgreSQL-Backup-Manifest-Version\": 
1,\n\"Files\": [\n{ \"Path\": \"backup_label\", \"Size\": 227, 
\"Last-Modified\": \"2022-03-16 02:29:11 GMT\", \"Checksum-Algorithm\": 
\"CRC32C\", \"Checksum\": \"46f69d99\" },\n{ \"Pa"...
    ...

(gdb) p mystreamer->base.bbs_buffer.maxlen
$1 = 524288
(gdb) p (int) LZ4F_compressBound(len, &mystreamer->prefs)
$4 = 524300

This is with: liblz4-1:amd64 1.9.2-2ubuntu0.20.04.1
>From 9a330a3a1801352cef3b5912e31aba61760dac32 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 10 Mar 2022 20:16:19 -0600
Subject: [PATCH] pg_basebackup: support Zstd negative compression levels

"higher than maximum" is bogus

TODO: each compression methods should enforce its own levels
---
 src/backend/replication/basebackup_zstd.c |  8 ++--
 src/backend/replication/repl_scanner.l|  4 +++-
 src/bin/pg_basebackup/bbstreamer_zstd.c   |  7 ++-
 src/bin/pg_basebackup/pg_basebackup.c | 23 ---
 4 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/src/backend/replication/basebackup_zstd.c b/src/backend/replication/basebackup_zstd.c
index c0e2be6e27b..4464fcb30e1 100644
--- a/src/backend/replication/basebackup_zstd.c
+++ b/src/backend/replication/basebackup_zstd.c
@@ -71,7 +71,7 @@ bbsink_zstd_new(bbsink *next, int compresslevel)
 
 	Assert(next != NULL);
 
-	if (compresslevel < 0 || compresslevel > 22)
+	if (compresslevel < -7 || compresslevel > 22)
 		ereport(ERROR,
 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
  errmsg("zstd compression level %d is out of range",
@@ -96,13 +96,17 @@ bbsink_zstd_begin_backup(bbsink *sink)
 {
 	bbsink_zstd *mysink = (bbsink_zstd *) sink;
 	size_t		output_buffer_bound;
+	size_t		ret;
 
 	mysink->cctx = ZSTD_createCCtx();
 	if (!mysink->cctx)
 		elog(ERROR, "could not create zstd compression context");
 
-	ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_compressionLevel,
+	ret = ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_compressionLevel,
 		   mysink->compresslevel);
+	if (ZSTD_isError(ret))
+		elog(ERROR, "could not create zstd compression context: %s",
+ZSTD_getErrorName(ret));
 
 	/*
 	 * We need our own buffer, because we're going to pass different data to
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index 4b64c0d768b..05c4ef463a1 100644
--- a/src/backend/replication/repl_scanner.l
+++ b/src/backend/replication/repl_scanner.l
@@ -86,6 +86,7 @@ xdinside		[^"]+
 
 digit			[0-9]
 hexdigit		[0-9A-Fa-f]
+sign			("-"|"+")
 
 ident_start		[A-Za-z\200-\377_]
 ident_cont		[A-Za-z\200-\377_0-9\$]
@@ -127,9 +128,10 @@ NOEXPORT_SNAPSHOT	{ return K_NOEXPORT_SNAPSHOT; }
 USE_SNAPSHOT		{ return K_USE_SNAPSHOT; }
 WAIT{ return K_WAIT; }
 
+
 {space}+		{ /* do nothing */ }
 
-{digit}+		{
+{sign}?{digit}+		{
 	

Re: Teach pg_receivewal to use lz4 compression

2022-03-17 Thread Justin Pryzby
On Sat, Feb 12, 2022 at 12:52:40PM +0900, Michael Paquier wrote:
> On Fri, Feb 11, 2022 at 10:07:49AM -0500, Robert Haas wrote:
> > Over in 
> > http://postgr.es/m/CA+TgmoYUDEJga2qV_XbAZ=pgebaosgfmzz6ac4_srwom_+u...@mail.gmail.com
> > I was noticing that CreateWalTarMethod doesn't support LZ4
> > compression. It would be nice if it did. I thought maybe the patch on
> > this thread would fix that, but I think maybe it doesn't, because it
> > looks like that's touching the WalDirectoryMethod part of that file,
> > rather than the WalTarMethod part. Is that correct?
> 
> Correct.  pg_receivewal only cares about the directory method, so this
> thread was limited to this part.  Yes, it would be nice to extend
> fully the tar method of walmethods.c to support LZ4, but I was not
> sure what needed to be done, and I am still not sure based on what has
> just been done as of 751b8d23.
> 
> > And, on a related note, Michael, do you plan to get something
> > committed here? 
> 
> Apart from f79962d, ba5 and 50e1441, I don't think that there was
> something left to do for this thread.  Perhaps I am missing something?

I think this should use <lz4frame.h>:

+#include "lz4frame.h"

commit ba595d2322da095a1e6703171b3f1f2815cb
Author: Michael Paquier 
Date:   Fri Nov 5 11:33:25 2021 +0900

Add support for LZ4 compression in pg_receivewal

-- 
Justin




Re: refactoring basebackup.c (zstd workers)

2022-03-17 Thread Justin Pryzby
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 9178c779ba..00c593f1af 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2731,14 +2731,24 @@ The commands accepted in replication mode are:
+
+  For gzip the compression level should be an

gzip comma

+++ b/src/backend/replication/basebackup.c
@@ -18,6 +18,7 @@
 
 #include "access/xlog_internal.h"  /* for pg_start/stop_backup */
 #include "common/file_perm.h"
+#include "common/backup_compression.h"

alphabetical

-errmsg("unrecognized compression algorithm: \"%s\"",
+errmsg("unrecognized compression algorithm \"%s\"",

Most other places seem to say "compression method".  So I'd suggest to change
that here, and in doc/src/sgml/ref/pg_basebackup.sgml.

-   if (o_compression_level && !o_compression)
+   if (o_compression_detail && !o_compression)
        ereport(ERROR,
                (errcode(ERRCODE_SYNTAX_ERROR),
                 errmsg("compression level requires compression")));

s/level/detail/

 /*
+ * Basic parsing of a value specified for -Z/--compress.
+ *
+ * We're not concerned here with understanding exactly what behavior the
+ * user wants, but we do need to know whether the user is requesting client
+ * or server side compression or leaving it unspecified, and we need to
+ * separate the name of the compression algorithm from the detail string.
+ *
+ * For instance, if the user writes --compress client-lz4:6, we want to
+ * separate that into (a) client-side compression, (b) algorithm "lz4",
+ * and (c) detail "6". Note, however, that all the client/server prefix is
+ * optional, and so is the detail. The algorithm name is required, unless
+ * the whole string is an integer, in which case we assume "gzip" as the
+ * algorithm and use the integer as the detail.
..
  */
 static void
+parse_compress_options(char *option, char **algorithm, char **detail,
+  CompressionLocation *locationres)

It'd be great if this were re-usable for wal_compression, which I hope in pg16 
will
support at least level=N.  And eventually pg_dump.  But those clients shouldn't
accept a client/server prefix.  Maybe the way to handle that is for those tools
to check locationres and reject it if it was specified.

+ * We're not concerned with validation at this stage, so if the user writes
+ * --compress client-turkey:sandwhich, the requested algorithm is "turkey"
+ * and the detail string is "sandwhich". We'll sort out whether that's legal

sp: sandwich

+   WalCompressionMethod        wal_compress_method;

This is confusingly similar to src/include/access/xlog.h:WalCompression.
I think someone else mentioned this before ?

+ * A compression specification specifies the parameters that should be used
+ * when * performing compression with a specific algorithm. The simplest

star

+/*
+ * Get the human-readable name corresponding to a particular compression
+ * algorithm.
+ */
+char *
+get_bc_algorithm_name(bc_algorithm algorithm)

should be const ?

+   /* As a special case, the specification can be a bare integer. */
+   bare_level = strtol(specification, &bare_level_endp, 10);

Should this call expect_integer_value()?
See below.

+   result->parse_error =
+       pstrdup("found empty string where a compression option was expected");

Needs to be localized with _() ?
Also, document that it's pstrdup'd.

+/*
+ * Parse 'value' as an integer and return the result.
+ *
+ * If parsing fails, set result->parse_error to an appropriate message
+ * and return -1.
+ */
+static int
+expect_integer_value(char *keyword, char *value, bc_specification *result)

-1 isn't great, since it's also an integer, and also a valid compression level
for zstd (did you see my message about that?).  Maybe INT_MIN is ok.

+{
+   int ivalue;
+   char   *ivalue_endp;
+
+   ivalue = strtol(value, &ivalue_endp, 10);

Should this also set/check errno ?
And check if value != ivalue_endp ?
See strtol(3)

+char *
+validate_bc_specification(bc_specification *spec)
...
+   /*
+    * If a compression level was specified, check that the algorithm expects
+    * a compression level and that the level is within the legal range for
+    * the algorithm.

It would be nice if this could be shared with wal_compression and pg_dump.
We shouldn't need multiple places with structures giving the algorithms and
range of compression levels.

+   unsigned        options;        /* OR of BACKUP_COMPRESSION_OPTION constants */

Should be "unsigned int" or "bits32" ?

The server crashes if I send an unknown option - you should hit that in the
regression tests.

$ src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp 
--no-sync --no-manifest --com

Re: PROPOSAL: Support global and local disabling of indexes

2022-03-18 Thread Justin Pryzby
On Thu, Mar 17, 2022 at 11:16:24PM -0700, Paul Martinez wrote:
> I propose adding an ALTER INDEX command that can enable or disable an
> index on a global level:

See also this thread:
https://commitfest.postgresql.org/34/2937/
https://www.postgresql.org/message-id/flat/CANbhV-H35fJsKnLoZJuhkYqg2MO4XLjR57Qwf=3-xovg2+2...@mail.gmail.com

-- 
Justin




Re: Remove INT64_FORMAT in translatable strings

2022-03-18 Thread Justin Pryzby
On Fri, Mar 18, 2022 at 01:12:40PM -0400, Tom Lane wrote:
> Japin Li  writes:
> > we can rely on %lld/%llu and we decided to use them in translatable strings.
> 
> Seems like good cleanup, so pushed.  I think though that project style
> is to use "long long" or "unsigned long long", without the unnecessary
> "int" --- it certainly makes little sense to do it both ways in the
> same patch.

This seemed familiar - it's about the same thing I sent here, while fixing
ftello().

https://www.postgresql.org/message-id/flat/20210104025321.ga9...@telsasoft.com
0002-Fix-broken-error-message-on-unseekable-input.patch

-- 
Justin




Re: Add 64-bit XIDs into PostgreSQL 15

2022-03-19 Thread Justin Pryzby
On Fri, Mar 11, 2022 at 08:26:26PM +0300, Aleksander Alekseev wrote:
> Hi hackers,
> 
> > Here is a new version of the patchset. SLRU refactoring was moved to a
> > separate patch. Both v14-0003 (XID_FMT macro) and v14-0004 (SLRU
> > refactoring) can be delivered in PG15.
> 
> Here is a new version of the patchset. The changes compared to v14 are
> minimal. Most importantly, the GCC warning reported by cfbot was
> (hopefully) fixed. The patch order was also altered, v15-0001 and
> v15-0002 are targeting PG15 now, the rest are targeting PG16.

Do you know that you can test a branch on cirrus without using CF bot or
mailing the patch to the list ?  See src/tools/ci/README

-- 
Justin




Re: Commitfest manager for 2022-03

2022-03-19 Thread Justin Pryzby
On Sun, Feb 27, 2022 at 04:51:16PM +0800, Julien Rouhaud wrote:
> On Sat, Feb 26, 2022 at 06:37:21PM -0600, Justin Pryzby wrote:
> > Can I suggest to update the CF APP to allow:
> > | Target version: 16
> > 
> > I also suggest to update patches to indicate which are (not) being 
> > considered
> > for v15.
> 
> I don't really understand what that field is supposed to mean.

It can mean whatever we decide it's convenient for it to mean.  I suppose its
primary purpose may have been to indicate which backbranch a bugfix applies to
(?)

I find it useful to indicate whether or not a patch is intended to be
considered for the next release.  If a patch is targetting v15, it's more
interesting to review.

@Thomas Munro: I think it'd be useful if cfbot would also show 1) the "target
version" (if any); and, 2) whether a patch has a committer set.

-- 
Justin




Re: refactoring basebackup.c (zstd workers)

2022-03-20 Thread Justin Pryzby
On Sun, Mar 20, 2022 at 03:05:28PM -0400, Robert Haas wrote:
> On Thu, Mar 17, 2022 at 3:41 PM Justin Pryzby  wrote:
> > -errmsg("unrecognized compression algorithm: \"%s\"",
> > +errmsg("unrecognized compression algorithm \"%s\"",
> >
> > Most other places seem to say "compression method".  So I'd suggest to 
> > change
> > that here, and in doc/src/sgml/ref/pg_basebackup.sgml.
> 
> I'm not sure that's really better, and I don't think this patch is
> introducing an altogether novel usage. I think I would probably try to
> standardize on algorithm rather than method if I were standardizing
> the whole source tree, but I think we can leave that discussion for
> another time.

The user-facing docs are already standardized using "compression method", with
2 exceptions, of which one is contrib/ and the other is what I'm suggesting to
make consistent here.

$ git grep 'compression algorithm' doc
doc/src/sgml/pgcrypto.sgml:Which compression algorithm to use.  Only 
available if
doc/src/sgml/ref/pg_basebackup.sgml:compression algorithm is selected, 
or if server-side compression

> > +   result->parse_error =
> > +   pstrdup("found empty string where a compression option was expected");
> >
> > Needs to be localized with _() ?
> > Also, document that it's pstrdup'd.
> 
> Did the latter. The former would need to be fixed in a bunch of places
> and while I'm happy to accept an expert opinion on exactly what needs
> to be done here, I don't want to try to do it and do it wrong. Better
> to let someone with good knowledge of the subject matter patch it up
> later than do a crummy job now.

I believe it just needs _("foo")
See git grep '= _('

I mentioned another issue off-list:
pg_basebackup.c:2741:10: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
 2741 |   Assert(compressloc = COMPRESS_LOCATION_SERVER);
  |  ^~~
pg_basebackup.c:2741:3: note: in expansion of macro ‘Assert’
 2741 |   Assert(compressloc = COMPRESS_LOCATION_SERVER);

This crashes the server using your v2 patch:

src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp --no-sync --no-manifest --compress=server-zstd:level, |wc -c

I wonder whether the syntax should really use both ":" and ",".
Maybe ":" isn't needed at all.

This patch also needs to update the other user-facing docs.

typo: contain a an

-- 
Justin




Re: refactoring basebackup.c (zstd workers)

2022-03-21 Thread Justin Pryzby
On Sun, Mar 20, 2022 at 09:38:44PM -0400, Robert Haas wrote:
> > This patch also needs to update the other user-facing docs.
> 
> Which ones exactly?

I mean pg_basebackup -Z

-Z level
-Z [{client|server}-]method[:level]
--compress=level
--compress=[{client|server}-]method[:level]




Re: refactoring basebackup.c (zstd workers)

2022-03-21 Thread Justin Pryzby
On Mon, Mar 21, 2022 at 12:57:36PM -0400, Robert Haas wrote:
> > typo: contain a an
> I searched for the "contain a an" typo that you mentioned but was not able to
> find it. Can you give me a more specific pointer?

Here:

+ * during parsing, and will otherwise contain a an appropriate error message.

> I looked a little bit more at the compression method vs. compression
> algorithm thing. I agree that there is some inconsistency in
> terminology here, but I'm still not sure that we are well-served by
> trying to make it totally uniform, especially if we pick the word
> "method" as the standard rather than "algorithm". In my opinion,
> "method" is less specific than "algorithm". If someone asks me to
> choose a compression algorithm, I know that I should give an answer
> like "lz4" or "zstd". If they ask me to pick a compression method, I'm
> not quite sure whether they want that kind of answer or whether they
> want something more detailed, like "use lz4 with compression level 3
> and a 1MB block size". After all, that is (at least according to my
> understanding of how English works) a perfectly valid answer to the
> question "what method should I use to compress this data?" -- but not
> to the question "what algorithm should I use to compress this data?".
> The latter can ONLY be properly answered by saying something like
> "lz4". And I think that's really the root of my hesitation to make the
> kinds of changes you want here.

I think "algorithm" could be much more nuanced than "lz4", but I also think
we've spent more than enough time on it now :)

-- 
Justin




Re: New Object Access Type hooks

2022-03-22 Thread Justin Pryzby
If I'm not wrong, this is still causing issues at least on cfbot/windows, even
since f0206d99.

https://cirrus-ci.com/task/5266352712712192
https://cirrus-ci.com/task/5061218867085312
https://cirrus-ci.com/task/5663822005403648
https://cirrus-ci.com/task/5744257246953472

https://cirrus-ci.com/task/5744257246953472
[22:26:50.939] test test_oat_hooks   ... FAILED  173 ms

https://api.cirrus-ci.com/v1/artifact/task/5744257246953472/log/src/test/modules/test_oat_hooks/regression.diffs
diff -w -U3 c:/cirrus/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out c:/cirrus/src/test/modules/test_oat_hooks/results/test_oat_hooks.out
--- c:/cirrus/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out	2022-03-22 22:13:23.386895200 +
+++ c:/cirrus/src/test/modules/test_oat_hooks/results/test_oat_hooks.out	2022-03-22 22:26:51.104419600 +
@@ -15,12 +15,6 @@
 NOTICE:  in process utility: superuser finished CreateRoleStmt
 CREATE TABLE regress_test_table (t text);
 NOTICE:  in process utility: superuser attempting CreateStmt
-NOTICE:  in object access: superuser attempting namespace search (subId=0) [no report on violation, allowed]
-LINE 1: CREATE TABLE regress_test_table (t text);
- ^
-NOTICE:  in object access: superuser finished namespace search (subId=0) [no report on violation, allowed]
-LINE 1: CREATE TABLE regress_test_table (t text);
- ^
 NOTICE:  in object access: superuser attempting create (subId=0) [explicit]




Re: Probable CF bot degradation

2022-03-22 Thread Justin Pryzby
On Wed, Mar 23, 2022 at 12:44:09PM +1300, Thomas Munro wrote:
> On Mon, Mar 21, 2022 at 12:46 PM Thomas Munro  wrote:
> > On Mon, Mar 21, 2022 at 12:23 PM Thomas Munro  
> > wrote:
> > > On Mon, Mar 21, 2022 at 1:58 AM Matthias van de Meent 
> > >  wrote:
> > > > Additionally, are there plans to validate commits of the main branch
> > > > before using them as a base for CF entries, so that "bad" commits on
> > > > master won't impact CFbot results as easy?
> > >
> > > How do you see this working?
> >
> > [Now with more coffee on board]  Oh, right, I see, you're probably
> > thinking that we could look at
> > https://github.com/postgres/postgres/commits/master and take the most
> > recent passing commit as a base.  Hmm, interesting idea.
> 
> A nice case in point today: everything is breaking on Windows due to a
> commit in master, which could easily be avoided by looking back a
> certain distance for a passing commit from postgres/postgres to use as
> a base.  Let's me see if this is easy to fix...
> 
> https://www.postgresql.org/message-id/20220322231311.GK28503%40telsasoft.com

I suggest not to make it too sophisticated.  If something is broken, the CI
should show that rather than presenting a misleading conclusion.

Maybe you could keep track of how many consecutive, *new* failures there've
been (which were passing on the previous run for that task, for that patch) and
delay if it's more than (say) 5.  For bonus points, queue a rerun of all the
failed tasks once something passes.

If you create a page to show the queue, maybe it should show the history of
results, too.  And maybe there should be a history of results for each patch.

If you implement interactive buttons, maybe it could allow re-queueing some
recent failures (add to end of queue).

-- 
Justin




Re: Adding CI to our tree

2022-03-22 Thread Justin Pryzby
On Fri, Mar 18, 2022 at 03:45:03PM -0700, Andres Freund wrote:
> Pushed 0001, 0002. Only change I made was to add

Thanks - is there any reason not to do the MSVC compiler warnings one, too ?

I see that it'll warn about issues with at least 3 patches (including one of
yours).

-- 
Justin




Re: SQL/JSON: functions

2022-03-23 Thread Justin Pryzby
At least 0002-SQL-JSON-constructors-v64.patch has an issue with nodes,
per COPY_PARSE_PLAN_TREES.

+ERROR:  unrecognized node type: 157




Re: SQL/JSON: functions

2022-03-23 Thread Justin Pryzby
On Wed, Mar 23, 2022 at 03:49:17PM -0400, Andrew Dunstan wrote:
> 
> On 3/23/22 08:24, Justin Pryzby wrote:
> > At least 0002-SQL-JSON-constructors-v64.patch has an issue with nodes,
> > per COPY_PARSE_PLAN_TREES.
> >
> > +ERROR:  unrecognized node type: 157
> 
> I just tried to reproduce this and was unable to.  Ubuntu 20.04, gcc 9.4.0.
> 
> the build was done with CPPFLAGS=-DCOPY_PARSE_PLAN_TREES. To be on the
> safe side I took ccache out of the equation.
> 
> Can you give me any more details?

Sorry, the issue was actually with 

#define RAW_EXPRESSION_COVERAGE_TEST

make check is enough to see it:

@@ -6,181 +6,78 @@
 (1 row)
 
 SELECT JSON_OBJECT(RETURNING json);
- json_object 
--
- {}
-(1 row)
-
+ERROR:  unrecognized node type: 157
[...]




Re: Adding CI to our tree

2022-03-23 Thread Justin Pryzby
On Thu, Mar 24, 2022 at 09:52:39AM +1300, Thomas Munro wrote:
> On Thu, Mar 10, 2022 at 9:37 AM Justin Pryzby  wrote:
> > -Og
> 
> Adding this to CXXFLAGS caused a torrent of warnings from g++ about
> LLVM headers, which I also see on my local system for LLVM 11 and LLVM
> 14:

Yes, I mentioned seeing some other warnings here.
20220302205058.gj15...@telsasoft.com

I think warnings were cleaned up with -O0/1/2 but not with -Og.




Re: multithreaded zstd backup compression for client and server

2022-03-23 Thread Justin Pryzby
+    * We check for failure here because (1) older versions of the library
+    * do not support ZSTD_c_nbWorkers and (2) the library might want to
+    * reject an unreasonable values (though in practice it does not seem
+    * to do so).
+    */
+   ret = ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_nbWorkers,
+                                mysink->workers);
+   if (ZSTD_isError(ret))
+       ereport(ERROR,
+               errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+               errmsg("could not set compression worker count to %d: %s",
+                      mysink->workers, ZSTD_getErrorName(ret)));

Also because the library may not be compiled with threading.  A few days ago, I
tried to rebase the original "parallel workers" patch over the COMPRESS DETAIL
patch but then couldn't test it, even after trying various versions of the zstd
package and trying to compile it locally.  I'll try again soon...

I think you should also test the return value when setting the compression
level, not only because it's generally a good idea, but also because I
suggested supporting negative compression levels.  Those weren't allowed
before v1.3.4, and the valid range is only defined since 1.3.6
(ZSTD_minCLevel).  At some point, the range may have been -7..22 but now it's
-131072..22.

lib/compress/zstd_compress.c:int ZSTD_minCLevel(void) { return 
(int)-ZSTD_TARGETLENGTH_MAX; }
lib/zstd.h:#define ZSTD_TARGETLENGTH_MAX    ZSTD_BLOCKSIZE_MAX
lib/zstd.h:#define ZSTD_BLOCKSIZE_MAX     (1<<ZSTD_BLOCKSIZELOG_MAX)

>From 80f45cfbe13d6fc0f16e49b7ea76f1e50afb632c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 10 Mar 2022 20:16:19 -0600
Subject: [PATCH] pg_basebackup: support Zstd negative compression levels

"higher than maximum" is bogus

TODO: each compression methods should enforce its own levels
---
 src/backend/replication/basebackup_zstd.c | 7 +--
 src/bin/pg_basebackup/bbstreamer_zstd.c   | 5 -
 src/common/backup_compression.c   | 6 +-
 3 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/backend/replication/basebackup_zstd.c b/src/backend/replication/basebackup_zstd.c
index 4835aa70fca..74681ee3fe8 100644
--- a/src/backend/replication/basebackup_zstd.c
+++ b/src/backend/replication/basebackup_zstd.c
@@ -79,7 +79,7 @@ bbsink_zstd_new(bbsink *next, bc_specification *compress)
 	else
 	{
 		compresslevel = compress->level;
-		Assert(compresslevel >= 1 && compresslevel <= 22);
+		Assert(compresslevel >= -7 && compresslevel <= 22 && compresslevel != 0);
 	}
 
 	sink = palloc0(sizeof(bbsink_zstd));
@@ -108,8 +108,11 @@ bbsink_zstd_begin_backup(bbsink *sink)
 	if (!mysink->cctx)
 		elog(ERROR, "could not create zstd compression context");
 
-	ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_compressionLevel,
+	ret = ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_compressionLevel,
 		   mysink->compresslevel);
+	if (ZSTD_isError(ret))
+		elog(ERROR, "could not create zstd compression context: %s",
+ZSTD_getErrorName(ret));
 
 	/*
 	 * We check for failure here because (1) older versions of the library
diff --git a/src/bin/pg_basebackup/bbstreamer_zstd.c b/src/bin/pg_basebackup/bbstreamer_zstd.c
index e17dfb6bd54..640729003a4 100644
--- a/src/bin/pg_basebackup/bbstreamer_zstd.c
+++ b/src/bin/pg_basebackup/bbstreamer_zstd.c
@@ -85,8 +85,11 @@ bbstreamer_zstd_compressor_new(bbstreamer *next, bc_specification *compress)
 		pg_log_error("could not create zstd compression context");
 
 	/* Initialize stream compression preferences */
-	ZSTD_CCtx_setParameter(streamer->cctx, ZSTD_c_compressionLevel,
+	ret = ZSTD_CCtx_setParameter(streamer->cctx, ZSTD_c_compressionLevel,
 		   compress->level);
+	if (ZSTD_isError(ret))
+		pg_log_error("could not set compression level to: %d: %s",
+compress->level, ZSTD_getErrorName(ret));
 
 	/*
 	 * We check for failure here because (1) older versions of the library
diff --git a/src/common/backup_compression.c b/src/common/backup_compression.c
index 969e08cca20..c0eff30024c 100644
--- a/src/common/backup_compression.c
+++ b/src/common/backup_compression.c
@@ -260,13 +260,17 @@ validate_bc_specification(bc_specification *spec)
 		else if (spec->algorithm == BACKUP_COMPRESSION_LZ4)
 			max_level = 12;
 		else if (spec->algorithm == BACKUP_COMPRESSION_ZSTD)
+		{
 			max_level = 22;
+			/* The minimum level depends on the version.. */
+			min_level = -7;
+		}
 		else
 			return psprintf(_("compression algorithm \"%s\" does not accept a compression level"),
 			get_bc_algorithm_name(spec->algorithm));
 
 		if (spec->level < min_level || spec->level > max_level)
-			return psprintf(_("compression algorithm \"%s\" expec

Re: multithreaded zstd backup compression for client and server

2022-03-23 Thread Justin Pryzby
On Wed, Mar 23, 2022 at 04:34:04PM -0400, Robert Haas wrote:
> be, spawning threads inside the PostgreSQL backend. Short of cats and
> dogs living together, it's hard to think of anything more terrifying,
> because the PostgreSQL backend is very much not thread-safe. However,
> a lot of the things we usually worry about when people make noises
> about using threads in the backend don't apply here, because the
> threads are hidden away behind libzstd interfaces and can't execute
> any PostgreSQL code. Therefore, I think it might be safe to just ...
> turn this on. One reason I think that is that this whole approach was
> recommended to me by Andres ... but that's not to say that there
> couldn't be problems.  I worry a bit that the mere presence of threads
> could in some way mess things up, but I don't know what the mechanism
> for that would be, and I don't want to postpone shipping useful
> features based on nebulous fears.

Note that the PGDG .RPMs and .DEBs are already linked with pthread, via
libxml => liblzma.

$ ldd /usr/pgsql-14/bin/postgres |grep xm
libxml2.so.2 => /lib64/libxml2.so.2 (0x7faab984e000)
$ objdump -p /lib64/libxml2.so.2 |grep NEED
  NEEDED   libdl.so.2
  NEEDED   libz.so.1
  NEEDED   liblzma.so.5
  NEEDED   libm.so.6
  NEEDED   libc.so.6
  VERNEED  0x00019218
  VERNEEDNUM   0x0005
$ objdump -p /lib64/liblzma.so.5 |grep NEED
  NEEDED   libpthread.so.0



Did you try this on windows at all ?  It's probably no surprise that zstd
implements threading differently there.




Re: CREATE INDEX CONCURRENTLY on partitioned index

2022-03-25 Thread Justin Pryzby
On Fri, Mar 25, 2022 at 01:05:49AM -0400, Greg Stark wrote:
> This patch is marked "waiting on author" in the CF. However the most
> recent emails have patches and it's not clear to me what's left from
> previous reviews that might not be addressed yet. Should this patch be
> marked "Needs Review"?
> 
> Anastasia and Alexander are marked as reviewers. Are you still able to
> review it or are there still pending issues that need to be resolved
> from previous reviews?

I still haven't responded to Alexander's feedback, so I need to do that.
(Sorry).

However, since the patch attracted no attention for some 50 weeks last year,
now is a weird time to shift attention to it.  As such, I will move it to the
next CF.

https://www.postgresql.org/message-id/flat/20210226182019.gu20...@telsasoft.com#da169a0a518bf8121604437d9ab053b3

-- 
Justin




Re: Add LZ4 compression in pg_dump

2022-03-25 Thread Justin Pryzby
On Fri, Mar 25, 2022 at 01:20:47AM -0400, Greg Stark wrote:
> It seems development on this has stalled. If there's no further work
> happening I guess I'll mark the patch returned with feedback. Feel
> free to resubmit it to the next CF when there's progress.

Since it's a reasonably large patch (and one that I had myself started
before), it's only been some 20 days since the (minor) review comments, the
focus right now is on committing features rather than reviewing new patches,
the patch was new one month ago, and its 0002 isn't intended for pg15,
therefore I'm moving it to the next CF, where I hope to work with its authors
to progress it.

-- 
Justin




Re: pg_relation_size on partitioned table

2022-03-25 Thread Justin Pryzby
On Fri, Mar 25, 2022 at 08:52:40PM +0800, Japin Li wrote:
> When I try to get total size of partition tables though partitioned table
> name using pg_relation_size(), it always returns zero.  I can use the
> following SQL to get total size of partition tables, however, it is a bit
> complex.

This doesn't handle multiple levels of partitioning, as \dP+ already does.

Any new function should probably be usable by \dP+ (although it would also need
to support older server versions for another ~10 years).

> SELECT pg_size_pretty(sum(pg_relation_size(i.inhrelid)))
> FROM pg_class c JOIN pg_inherits i ON c.oid = i.inhparent
> WHERE relname = 'parent';

> Could we provide a function to get the total size of the partition table
> though the partitioned table name?  Maybe we can extend
> the pg_relation_size() to get the total size of partition tables through
> the partitioned table name.

Sometimes people would want the size of the table itself and not the size of
its partitions, so it's not good to change pg_relation_size().

OTOH, pg_total_relation_size() shows a table size including toast and indexes.
Toast is an implementation detail, which is intended to be hidden from
application developers.  And that's a goal for partitioning, too.  So maybe it
would make sense if it showed the size of the table, toast, indexes, *and*
partitions (but not legacy inheritance children).
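FWIW, the quoted query can be generalized to arbitrary nesting with a
recursive CTE over pg_inherits (a sketch, reusing the example table name
'parent'):

```sql
WITH RECURSIVE parts AS (
    SELECT oid FROM pg_class WHERE relname = 'parent'
    UNION ALL
    SELECT i.inhrelid
    FROM pg_inherits i JOIN parts p ON i.inhparent = p.oid
)
SELECT pg_size_pretty(sum(pg_relation_size(oid))) FROM parts;
```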

I know I'm not the only one who can't keep track of what all the existing
pg_*_size functions include, so adding more functions will also add some
additional confusion, unless, perhaps, it took arguments indicating what to
include, like pg_total_relation_size(partitions=>false, toast=>true,
indexes=>true, fork=>main).

-- 
Justin




Re: Add psql command to list constraints

2022-03-25 Thread Justin Pryzby
On Fri, Mar 25, 2022 at 03:11:47PM -0400, Robert Haas wrote:
> On Fri, Mar 25, 2022 at 12:28 AM Greg Stark  wrote:
> > Development of this seems to have stalled with the only review of this
> > patch expressing some skepticism about whether it's needed at all.
> 
> Now, there is some precedent for the idea of providing a command that
> lists everything globally. Specifically, we have a \dp command, also
> known as \z, to list privileges across all objects in the database.

> The other thing that we have that is somewhat similar to this is \dd,

\dX is similar, and I remember wondering whether it was really useful/needed.
The catalog tables are exposed and documented for a reason, and power-users
will learn to use them.

+0.5 to mark the patch as RWF or rejected.

-- 
Justin




Re: pgsql: Add 'basebackup_to_shell' contrib module.

2022-03-25 Thread Justin Pryzby
On Fri, Mar 25, 2022 at 02:27:07PM -0700, Andres Freund wrote:
> On 2022-03-25 13:52:11 -0400, Robert Haas wrote:
> > On Fri, Mar 25, 2022 at 12:36 PM Andres Freund  wrote:
> > > Create a CF entry for it, or enable CI on a github repo?
> >
> > I created a CF entry for it. Then I had to try to Google around to
> > find the URL from the cfbot, because it's not even linked from
> > commitfest.postgresql.org for some reason. #blamemagnus

I see it here (and in cfbot), although I'm not sure how you created a new
patch for the active CF, and not for the next CF.
https://commitfest.postgresql.org/37/

> > I don't think that the Windows CI is running the TAP tests for
> > contrib. At least, I can't find any indication of it in the output. So
> > it doesn't really help to assess how portable this test is, unless I'm
> > missing something.
> 
> Yea. It's really unfortunate how vcregress.pl makes it hard to run all
> tests. And we're kind of stuck finding a way forward. It's easy enough to work
> around for individual tests by just adding them to the test file (like Thomas
> did nearby), but clearly that doesn't scale. Andrew wasn't happy with
> additional vcregress commands. The fact that vcregress doesn't run tests in
> parallel makes things take forever. And so it goes on.

I have a patch to add alltaptests target to vcregress.  But I don't recall
hearing any objection to new targets until now.

https://github.com/justinpryzby/postgres/runs/5174877506




Re: Add LZ4 compression in pg_dump

2022-03-25 Thread Justin Pryzby
On Sat, Mar 26, 2022 at 02:57:50PM +0900, Michael Paquier wrote:
> I have a question about 0002, actually.  What has led you to the
> conclusion that this code is dead and could be removed?

See 0001 and the manpage.

+   'pg_dump: compression is not supported by tar archive format');

When I submitted a patch to support zstd, I spent awhile trying to make
compression work with tar, but it's a significant effort and better done
separately.




Re: Add LZ4 compression in pg_dump

2022-03-26 Thread Justin Pryzby
LZ4F_HEADER_SIZE_MAX isn't defined in old LZ4.

I ran into that on an Ubuntu LTS, so the affected library version isn't so old
that it can be ignored; it should be handled more gracefully.  The LZ4 code
should either have an explicit version check, or else shouldn't depend on that
feature (or should define a safe fallback value if the library header doesn't
define it).

https://packages.ubuntu.com/liblz4-1

0003: typo: of legacy => or legacy

There are a large number of ifdefs being added here - it'd be nice to minimize
that.  basebackup was organized to use separate files, which is one way.

$ git grep -c 'ifdef .*LZ4' src/bin/pg_dump/compress_io.c
src/bin/pg_dump/compress_io.c:19

In last year's CF entry, I had made a union within CompressorState.  LZ4
doesn't need z_streamp (and ztsd will need ZSTD_outBuffer, ZSTD_inBuffer,
ZSTD_CStream).

0002: I wonder if you're able to re-use any of the basebackup parsing stuff
from commit ffd53659c.  You're passing both the compression method *and* level.
I think there should be a structure which includes both.  In the future, that
can also handle additional options.  I hope to re-use these same things for
wal_compression=method:level.

You renamed this:

|-   COMPR_ALG_LIBZ
|-} CompressionAlgorithm;
|+   COMPRESSION_GZIP,
|+} CompressionMethod;

..But I don't think that's an improvement.  If you were to change it, it should
say something like PGDUMP_COMPRESS_ZLIB, since there are other compression
structs and typedefs.  zlib is not identical to gzip, which uses a different
header, so in WriteDataToArchive(), LIBZ is correct, and GZIP is incorrect.

The cf* changes in pg_backup_archiver could be split out into a separate
commit.  It's strictly a code simplification - not just preparation for more
compression algorithms.  The commit message should say "See also:
bf9aa490db24b2334b3595ee33653bf2fe39208c".

The changes in 0002 for cfopen_write seem insufficient:
|+   if (compressionMethod == COMPRESSION_NONE)
|+   fp = cfopen(path, mode, compressionMethod, 0);
|else
|{
| #ifdef HAVE_LIBZ
|char   *fname;
| 
|fname = psprintf("%s.gz", path);
|-   fp = cfopen(fname, mode, compression);
|+   fp = cfopen(fname, mode, compressionMethod, compressionLevel);
|free_keep_errno(fname);
| #else

The only difference between the LIBZ and uncompressed case is the file
extension, and it'll be the only difference with LZ4 too.  So I suggest to
first handle the file extension, and the rest of the code path is not
conditional on the compression method.  I don't think cfopen_write even needs
HAVE_LIBZ - can't you handle that in cfopen_internal() ?
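Roughly this shape, i.e. only the suffix is method-specific and the open path is shared (helper names hypothetical):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { M_NONE, M_GZIP, M_LZ4 } method_t;

/* The only per-method decision cfopen_write needs to make */
static const char *
compress_suffix(method_t m)
{
	switch (m)
	{
		case M_GZIP:
			return ".gz";
		case M_LZ4:
			return ".lz4";
		default:
			return "";
	}
}

/* Build the final path; the actual open/dispatch stays in one place */
static void
make_output_path(const char *path, method_t m, char *out, size_t outlen)
{
	snprintf(out, outlen, "%s%s", path, compress_suffix(m));
}
```
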

This patch rejects -Z0, which ought to be accepted:
./src/bin/pg_dump/pg_dump -h /tmp regression -Fc -Z0 |wc
pg_dump: error: can only specify -Z/--compress [LEVEL] when method is set

Your 0003 patch shouldn't reference LZ4:
+#ifndef HAVE_LIBLZ4
+   if (*compressionMethod == COMPRESSION_LZ4)
+   supports_compression = false;
+#endif
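i.e. a single availability check keyed off the configure macros would keep these scattered #ifndef blocks out of the option-handling code (sketch only, names hypothetical):

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { C_NONE, C_GZIP, C_LZ4 } c_method;

/* One place that knows which methods this build supports */
static bool
method_supported(c_method m)
{
	switch (m)
	{
		case C_NONE:
			return true;
		case C_GZIP:
#ifdef HAVE_LIBZ
			return true;
#else
			return false;
#endif
		case C_LZ4:
#ifdef USE_LZ4
			return true;
#else
			return false;
#endif
	}
	return false;
}
```
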

The 0004 patch renames zlibOutSize to outsize - I think the patch series should
be constructed such as to minimize the size of the method-specific patches.  I
say this anticipating also adding support for zstd.  The preliminary patches
should have all the boring stuff.  It would help for reviewing to keep the 
patches split up, or to enumerate all the boring things that are being renamed
(like change OutputContext to cfp, rename zlibOutSize, ...).

0004: The include should use <lz4.h> and not "lz4.h"

freebsd/cfbot is failing.

I suggested off-list to add an 0099 patch to change LZ4 to the default, to
exercise it more on CI.

-- 
Justin




Re: Add LZ4 compression in pg_dump

2022-03-26 Thread Justin Pryzby
On Sat, Mar 26, 2022 at 11:21:56AM -0500, Justin Pryzby wrote:
> You're passing both the compression method *and* level.  I think there should
> be a structure which includes both.  In the future, that can also handle
> additional options.

I'm not sure if there's anything worth saving, but I did that last year with
0003-Support-multiple-compression-algs-levels-opts.patch
I sent a rebased copy off-list.
https://www.postgresql.org/message-id/flat/20210104025321.ga9...@telsasoft.com#ca1b9f9d3552c87fa874731cad9d8391

|   fatal("not built with LZ4 support");
|   fatal("not built with lz4 support");

Please use consistent capitalization of "lz4" - then the compiler can optimize
away duplicate strings.

> 0004: The include should use <lz4.h> and not "lz4.h"

Also, use USE_LZ4 rather than HAVE_LIBLZ4, per 75eae0908.



