Re: Addition of --no-sync to pg_upgrade for test speedup

2021-12-21 Thread Michael Paquier
On Mon, Dec 20, 2021 at 10:46:13AM -0300, Alvaro Herrera wrote:
> On 2021-Dec-16, Michael Paquier wrote:
>> In pg_upgrade, we let the flush happen with initdb --sync-only, based
>> on the binary path of the new cluster, so I think that we are not
>> going to miss any test coverage by skipping that.
> 
> There was one patch of mine with breakage that only manifested in the
> pg_upgrade test *because* of its lack of no-sync.  I'm afraid that this
> change would hide certain problems.
> https://postgr.es/m/20210130023011.n545o54j65t4k...@alap3.anarazel.de

Hmm.  This talks about fsync=on being a factor in detecting a failure in
the backend.  Why would the fsync done with initdb --sync-only on the
target cluster once pg_upgrade is done change anything here?

> I'm not 100% comfortable with this.  What can we do to preserve *some*
> testing that include syncing?  Maybe some option that a few buildfarm
> animals use?

If you object to this part, I am fine with reverting the change in
test.sh until there is a better facility to enforce syncs across tests
in the buildfarm, though.  I can hack something to centralize all of
that, of course, but I am not sure when I'll be able to do so in the
short term.  Could I keep that in MSVC's vcregress.pl at least for the
time being?
--
Michael




Re: Unifying VACUUM VERBOSE and log_autovacuum_min_duration output

2021-12-21 Thread Masahiko Sawada
On Tue, Dec 21, 2021 at 2:39 AM Peter Geoghegan  wrote:
>
> On Mon, Nov 29, 2021 at 6:51 PM Peter Geoghegan  wrote:
> > Attached is a WIP patch doing this.
>
> This has bitrot, so I attach v2, mostly just to keep the CFTester
> status green. The only real change is one minor simplification to how
> we set everything up, inside heap_vacuum_rel().

I've looked at the patch and here are comments:

@@ -3076,16 +3021,12 @@ lazy_cleanup_one_index(Relation indrel,
                                           IndexBulkDeleteResult *istat,
                                           LVRelState *vacrel)
 {
        IndexVacuumInfo ivinfo;
-       PGRUsage        ru0;
        LVSavedErrInfo saved_err_info;

-       pg_rusage_init(&ru0);
-
        ivinfo.index = indrel;
        ivinfo.analyze_only = false;
        ivinfo.report_progress = false;
        ivinfo.estimated_count = estimated_count;
-       ivinfo.message_level = elevel;

I think we should set message_level.  Otherwise, an index AM would end up
with an invalid log level, although no index AM in core seems to use it.

---
-   /*
-* Update error traceback information.  This is the
last phase during
-* which we add context information to errors, so we
don't need to
-* revert to the previous phase.
-*/

Why is this comment removed? ISTM this comment is still valid.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Emit a warning if the extension's GUC is set incorrectly

2021-12-21 Thread Shinya Kato

On 2021-12-22 02:23, Tom Lane wrote:

Kyotaro Horiguchi  writes:
At Mon, 20 Dec 2021 21:05:23 +0900, Shinya Kato 
 wrote in

We should use EmitWarningsOnPlaceholders when we use
DefineCustomXXXVariable.
I don't think there is any room for debate.



Unfortunately, pltcl.c defines variables both in pltcl and pltclu but
the patch adds the warning only for pltclu.


Right.  The rest of 0001 looks fine, so pushed with that correction.

I concur that 0002 is unacceptable.  The result of it will be that
a freshly-loaded extension will fail to hook into the system as it
should, because it will fail to complete its initialization.
That will cause user frustration and bug reports.

(BTW, I wonder if EmitWarningsOnPlaceholders should be using LOG
rather than WARNING when it's running in the postmaster.  Use of
WARNING is moderately likely to result in nothing getting printed
at all.)

0003 looks to me like it was superseded by 75d22069e.  I do not
particularly like that patch though; it seems both inefficient
and ill-designed.  0003 is adding a check in an equally bizarre
place.  Why isn't add_placeholder_variable the place to deal
with this?  Also, once we know that a prefix is reserved,
why don't we throw an error not just a warning for attempts to
set some unknown variable under that prefix?  We don't allow
you to set unknown non-prefixed variables.
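
To illustrate the asymmetry (a rough sketch; the messages are paraphrased,
and "myext" stands for some hypothetical extension that has registered its
prefix):

    SET no_such_param = 'x';        -- fails: unrecognized configuration parameter
    SET myext.no_such_param = 'x';  -- today this only draws a warning once the
                                    -- prefix is reserved; arguably it should be
                                    -- an error as well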

regards, tom lane


Thank you for pushing!
Thank you all for the reviews; I think I will withdraw 0002 and 0003.

--
Regards,

--
Shinya Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: pg_upgrade should truncate/remove its logs before running

2021-12-21 Thread Michael Paquier
On Mon, Dec 20, 2021 at 09:39:26PM -0600, Justin Pryzby wrote:
> On Mon, Dec 20, 2021 at 08:21:51PM +0900, Michael Paquier wrote:
>> we could choose something simpler for the default, like
>> "pg_upgrade_log".  I don't have a good history in naming new things,
>> though :)
> 
> I specifically called it .d to make it obvious that it's a dir - nearly
> everything that ends in "log" is a file, so people are likely to run "rm" and
> "less" on it - including myself.

Okay.

>> .gitignore should be updated, I guess?
> 
> Are you suggesting to remove these ?
> -/pg_upgrade_internal.log
> -/loadable_libraries.txt

Yep, it looks so as these are part of the logs, the second one being a
failure state.

> -/reindex_hash.sql

But this one is not, no?

>> Besides, this patch has no documentation.
> 
> TBH I'm not even sure if the dir needs to be configurable ?

I'd think it is better to have some control on that.  Not sure what
the opinion of others is on this specific point, though.
--
Michael




Re: row filtering for logical replication

2021-12-21 Thread Peter Smith
On Tue, Dec 21, 2021 at 9:03 PM vignesh C  wrote:
>
...
> Few comments:
> 1) list_free(schemarelids) is called both inside the if and outside the if
> in the break case. Can we move it above the continue so that it does not get
> called in the break case:
> +           schemarelids = GetAllSchemaPublicationRelations(pub->oid,
> +                                                           pub->pubviaroot ?
> +                                                           PUBLICATION_PART_ROOT :
> +                                                           PUBLICATION_PART_LEAF);
> +           if (list_member_oid(schemarelids, entry->relid))
> +           {
> +               if (pub->pubactions.pubinsert)
> +                   no_filter[idx_ins] = true;
> +               if (pub->pubactions.pubupdate)
> +                   no_filter[idx_upd] = true;
> +               if (pub->pubactions.pubdelete)
> +                   no_filter[idx_del] = true;
> +
> +               list_free(schemarelids);
> +
> +               /* Quick exit loop if all pubactions have no row-filter. */
> +               if (no_filter[idx_ins] && no_filter[idx_upd] && no_filter[idx_del])
> +                   break;
> +
> +               continue;
> +           }
> +           list_free(schemarelids);
>

I think this review comment is mistaken. The break will break from the
loop, so the free is still needed. So I skipped this comment. If you
still think there is a problem please give a more detailed explanation
about it.

> 2) create_edata_for_relation also is using similar logic, can it also
> call this function to reduce duplicate code
> +static EState *
> +create_estate_for_relation(Relation rel)
> +{
> +   EState  *estate;
> +   RangeTblEntry   *rte;
> +
> +   estate = CreateExecutorState();
> +
> +   rte = makeNode(RangeTblEntry);
> +   rte->rtekind = RTE_RELATION;
> +   rte->relid = RelationGetRelid(rel);
> +   rte->relkind = rel->rd_rel->relkind;
> +   rte->rellockmode = AccessShareLock;
> +   ExecInitRangeTable(estate, list_make1(rte));
> +
> +   estate->es_output_cid = GetCurrentCommandId(false);
> +
> +   return estate;
> +}
>

Yes, that other code looks similar, but I am not sure it is worth
rearranging things for the sake of trying to make use of only 5 or 6
common LOC.  Anyway, I felt this review comment is not really related
to the RF patch; It seems more like a potential idea for a future
patch to use some common code *after* the RF code is committed. So I
skipped this comment.


> 3) In one place select is in lower case, it can be changed to upper case
> +       resetStringInfo(&cmd);
> +       appendStringInfo(&cmd,
> +                        "SELECT DISTINCT pg_get_expr(prqual, prrelid) "
> +                        "  FROM pg_publication p "
> +                        "  INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) "
> +                        "WHERE pr.prrelid = %u AND p.pubname IN ( %s ) "
> +                        "AND NOT (select bool_or(puballtables) "
> +                        "  FROM pg_publication "
> +                        "  WHERE pubname in ( %s )) "
> +                        "AND (SELECT count(1)=0 "
> +                        "  FROM pg_publication_namespace pn, pg_class c "
> +                        "  WHERE c.oid = %u AND c.relnamespace = pn.pnnspid)",
> +                        lrel->remoteid,
> +                        pub_names.data,
> +                        pub_names.data,
> +                        lrel->remoteid);
>

Fixed in v52-0001 [1]

> 5) Should we mention user should take care of deletion of row filter
> records after table sync is done.
> +   ALL TABLES or FOR ALL TABLES IN
> SCHEMA publication,
> +   then all other WHERE clauses (for the same
> publish operation)
> +   become redundant.
> +   If the subscriber is a PostgreSQL
> version before 15
> +   then any row filtering is ignored during the initial data
> synchronization phase.
>
Fixed in v52-0001 [1]

> 6) Should this be an Assert, since this will be taken care earlier by
> GetTransformedWhereClause->transformWhereClause->coerce_to_boolean:
> +   expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype,
> BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
> +
> +   if (expr == NULL)
> +   ereport(ERROR,
> +   (errcode(ERRCODE_CANNOT_COERCE),
> +  

Re: Confused comment about drop replica identity index

2021-12-21 Thread Michael Paquier
On Mon, Dec 20, 2021 at 11:57:32AM -0300, Euler Taveira wrote:
> On Mon, Dec 20, 2021, at 8:11 AM, Michael Paquier wrote:
>> That's mostly fine.  I have made some adjustments as per the
>> attached.
>
> Your patch looks good to me.

Thanks.  I have done this part for now.
--
Michael




Re: In-placre persistance change of a relation

2021-12-21 Thread Kyotaro Horiguchi
Hello, Jakub.

At Tue, 21 Dec 2021 13:07:28 +, Jakub Wartak  
wrote in 
> So what's suspicious is that 122880 -> 0 file size truncation. I've 
> investigated WAL and it seems to contain TRUNCATE records
> after logged FPI images, so when the crash recovery would kick in it probably 
> clears this table (while it shouldn't).

Darn..  It is too silly that I wrongly issued truncate records for the
target relation of the function (rel) instead of the relation on which
we're currently operating at that time (r).

> However if I perform CHECKPOINT just before crash the WAL stream contains 
> just RUNNING_XACTS and CHECKPOINT_ONLINE 
> redo records, this probably prevents truncating. I'm newbie here so please 
> take this theory with grain of salt, it can be 
> something completely different.

It is because the WAL records are inconsistent with the on-disk state.
If a crash happens after the SET LOGGED but before a checkpoint, recovery
ends up replaying the broken WAL records; once a checkpoint has been taken,
the on-disk state is already persisted and the broken records are not
replayed.

The following fix works.

--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -5478,7 +5478,7 @@ RelationChangePersistence(AlteredTableInfo *tab, char 
persistence,
xl_smgr_truncate xlrec;
 
xlrec.blkno = 0;
-   xlrec.rnode = rel->rd_node;
+   xlrec.rnode = r->rd_node;
xlrec.flags = SMGR_TRUNCATE_ALL;
 

I made another change in this version.  Previously only btree among all
index AMs was processed in the in-place manner.  In this version we do
that for all AMs except GiST.  Maybe if gistGetFakeLSN behaved the same
way for permanent and unlogged indexes, we could skip the index rebuild in
exchange for some extra WAL records emitted while it is unlogged.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 0cac0fade05322c1aa8b7ec020f8fe1f9e5fb50e Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 11 Nov 2020 21:51:11 +0900
Subject: [PATCH v11 1/2] In-place table persistence change

Even though ALTER TABLE SET LOGGED/UNLOGGED does not require data
rewriting, currently it runs a heap rewrite which causes a large amount of
file I/O.  This patch makes the command run without a heap rewrite.
In addition to that, SET LOGGED while wal_level > minimal emits WAL using
XLOG_FPI instead of a massive number of HEAP_INSERT records, which should
be smaller.

Also this allows for the cleanup of files left behind when the transaction
that created them crashes.
---
 src/backend/access/rmgrdesc/smgrdesc.c |  52 +++
 src/backend/access/transam/README  |   8 +
 src/backend/access/transam/xact.c  |   7 +
 src/backend/access/transam/xlog.c  |  17 +
 src/backend/catalog/storage.c  | 545 -
 src/backend/commands/tablecmds.c   | 265 ++--
 src/backend/replication/basebackup.c   |   3 +-
 src/backend/storage/buffer/bufmgr.c|  88 
 src/backend/storage/file/fd.c  |   4 +-
 src/backend/storage/file/reinit.c  | 344 +++-
 src/backend/storage/smgr/md.c  |  93 -
 src/backend/storage/smgr/smgr.c|  32 ++
 src/backend/storage/sync/sync.c|  20 +-
 src/bin/pg_rewind/parsexlog.c  |  24 ++
 src/common/relpath.c   |  47 ++-
 src/include/catalog/storage.h  |   3 +
 src/include/catalog/storage_xlog.h |  42 +-
 src/include/common/relpath.h   |   9 +-
 src/include/storage/bufmgr.h   |   2 +
 src/include/storage/fd.h   |   1 +
 src/include/storage/md.h   |   8 +-
 src/include/storage/reinit.h   |  10 +-
 src/include/storage/smgr.h |  17 +
 23 files changed, 1459 insertions(+), 182 deletions(-)

diff --git a/src/backend/access/rmgrdesc/smgrdesc.c b/src/backend/access/rmgrdesc/smgrdesc.c
index 773d57..d251f22207 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -40,6 +40,49 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
 		 xlrec->blkno, xlrec->flags);
 		pfree(path);
 	}
+	else if (info == XLOG_SMGR_UNLINK)
+	{
+		xl_smgr_unlink *xlrec = (xl_smgr_unlink *) rec;
+		char	   *path = relpathperm(xlrec->rnode, xlrec->forkNum);
+
+		appendStringInfoString(buf, path);
+		pfree(path);
+	}
+	else if (info == XLOG_SMGR_MARK)
+	{
+		xl_smgr_mark *xlrec = (xl_smgr_mark *) rec;
+		char	   *path = GetRelationPath(xlrec->rnode.dbNode,
+		   xlrec->rnode.spcNode,
+		   xlrec->rnode.relNode,
+		   InvalidBackendId,
+		   xlrec->forkNum, xlrec->mark);
+		char	   *action;
+
+		switch (xlrec->action)
+		{
+			case XLOG_SMGR_MARK_CREATE:
+				action = "CREATE";
+				break;
+			case XLOG_SMGR_MARK_UNLINK:
+				action = "DELETE";
+				break;
+			default:
+				action = "";
+				break;
+		}
+
+		appendStringInfo(buf, "%s %s", action, path);
+		pfree(path);
+	}
+	else if (info == 

Re: row filtering for logical replication

2021-12-21 Thread Amit Kapila
On Wed, Dec 22, 2021 at 9:23 AM vignesh C  wrote:
>
> On Tue, Dec 21, 2021 at 2:29 PM Peter Smith  wrote:
> >
> > Here is the v51* patch set:
> >
>
> I tweaked the query slightly based on Euler's changes, the explain
> analyze of the updated query based on Euler's suggestions, existing
> query and Euler's query is given below:
> 1) updated query based on Euler's suggestion:
> explain analyze SELECT DISTINCT pg_get_expr(prqual, prrelid)   FROM
> pg_publication p   INNER JOIN pg_publication_rel pr ON (p.oid =
> pr.prpubid) WHERE pr.prrelid = 16384 AND p.pubname IN ( 'pub1' )
>   AND NOT (select bool_or(puballtables)   FROM pg_publication
>  WHERE pubname in ( 'pub1' )) AND NOT EXISTS (SELECT 1   FROM
> pg_publication_namespace pn, pg_class c   WHERE c.oid = 16384 AND
> c.relnamespace = pn.pnnspid);
>QUERY
> PLAN
> 
> Unique  (cost=14.68..14.69 rows=1 width=32) (actual time=0.121..0.126
> rows=1 loops=1)
>InitPlan 1 (returns $0)
>  ->  Aggregate  (cost=1.96..1.97 rows=1 width=1) (actual
> time=0.025..0.026 rows=1 loops=1)
>->  Seq Scan on pg_publication  (cost=0.00..1.96 rows=1
> width=1) (actual time=0.016..0.017 rows=1 loops=1)
>  Filter: (pubname = 'pub1'::name)
>InitPlan 2 (returns $1)
>  ->  Nested Loop  (cost=0.27..8.30 rows=1 width=0) (actual
> time=0.002..0.003 rows=0 loops=1)
>Join Filter: (pn.pnnspid = c.relnamespace)
>->  Seq Scan on pg_publication_namespace pn
> (cost=0.00..0.00 rows=1 width=4) (actual time=0.001..0.002 rows=0
> loops=1)
>->  Index Scan using pg_class_oid_index on pg_class c
> (cost=0.27..8.29 rows=1 width=4) (never executed)
>  Index Cond: (oid = '16384'::oid)
>->  Sort  (cost=4.40..4.41 rows=1 width=32) (actual
> time=0.119..0.121 rows=1 loops=1)
>  Sort Key: (pg_get_expr(pr.prqual, pr.prrelid)) COLLATE "C"
>  Sort Method: quicksort  Memory: 25kB
>  ->  Result  (cost=0.00..4.39 rows=1 width=32) (actual
> time=0.094..0.098 rows=1 loops=1)
>One-Time Filter: ((NOT $0) AND (NOT $1))
>->  Nested Loop  (cost=0.00..4.39 rows=1 width=36)
> (actual time=0.013..0.015 rows=1 loops=1)
>  Join Filter: (p.oid = pr.prpubid)
>  ->  Seq Scan on pg_publication p
> (cost=0.00..1.96 rows=1 width=4) (actual time=0.004..0.005 rows=1
> loops=1)
>Filter: (pubname = 'pub1'::name)
>  ->  Seq Scan on pg_publication_rel pr
> (cost=0.00..2.41 rows=1 width=40) (actual time=0.005..0.005 rows=1
> loops=1)
>Filter: (prrelid = '16384'::oid)
> Planning Time: 1.014 ms
> Execution Time: 0.259 ms
> (24 rows)
>
> 2) Existing query:
> postgres=# explain analyze SELECT DISTINCT pg_get_expr(prqual,
> prrelid)   FROM pg_publication p
>INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) WHERE
> pr.prrelid = 16384 AND p.pubname IN ( 'pub1' )
>  AND NOT (select bool_or(puballtables)   FROM pg_publication
> WHERE pubname in ( 'pub1' ))
>  AND (SELECT count(1)=0   FROM pg_publication_namespace pn,
> pg_class c   WHERE c.oid = 16384 AND c.relnamespace = pn.pnnspid);
>QUERY
> PLAN
> -
> Unique  (cost=14.69..14.70 rows=1 width=32) (actual time=0.162..0.166
> rows=1 loops=1)
>InitPlan 1 (returns $0)
>  ->  Aggregate  (cost=1.96..1.97 rows=1 width=1) (actual
> time=0.023..0.025 rows=1 loops=1)
>->  Seq Scan on pg_publication  (cost=0.00..1.96 rows=1
> width=1) (actual time=0.014..0.016 rows=1 loops=1)
>  Filter: (pubname = 'pub1'::name)
>InitPlan 2 (returns $1)
>  ->  Aggregate  (cost=8.30..8.32 rows=1 width=1) (actual
> time=0.044..0.045 rows=1 loops=1)
>->  Nested Loop  (cost=0.27..8.30 rows=1 width=0) (actual
> time=0.028..0.029 rows=0 loops=1)
>  Join Filter: (pn.pnnspid = c.relnamespace)
>  ->  Seq Scan on pg_publication_namespace pn
> (cost=0.00..0.00 rows=1 width=4) (actual time=0.004..0.004 rows=0
> loops=1)
>  ->  Index Scan using pg_class_oid_index on pg_class c
>  (cost=0.27..8.29 rows=1 width=4) (never executed)
>Index Cond: (oid = '16384'::oid)
>->  Sort  (cost=4.40..4.41 rows=1 width=32) (actual
> time=0.159..0.161 rows=1 loops=1)
>  Sort Key: (pg_get_expr(pr.prqual, pr.prrelid)) COLLATE "C"
>  Sort Method: quicksort  Memory: 25kB
>  ->  Result  (cost=0.00..4.39 rows=1 width=32) (actual
> time=0.142..0.147 rows=1 loops=1)
>One-Time Filter: 

Re: Unifying VACUUM VERBOSE and log_autovacuum_min_duration output

2021-12-21 Thread Greg Stark
I haven't read the patch yet. But some thoughts based on the posted output:

1) At first I was quite skeptical about losing the progress reporting.
I've often found it quite useful. But looking at the examples I'm
convinced.

Or rather I think a better way to look at it is that the progress
output for the operator should be separated from the metrics logged.
As an operator what I want to see is some progress indicator
""starting table scan", "overflow at x% of table scanned, starting
index scan", "processing index 1" "index 2"... so I can have some idea
of how much longer the vacuum will take and see whether I need to
raise maintenance_work_mem and by how much. I don't need to see all
the metrics while it's running.

2) I don't much like the format. I want to be able to parse the output
with awk or mtail or even just grep for relevant lines. Things like
"index scan not needed" make it hard to parse since you don't know
what it will look like if they are needed. I would have expected
something like "index scans: 0" which is actually already there up
above. I'm not clear how this line is meant to be read. Is it just
explaining *why* the index scan was skipped? It would just be missing
entirely if it wasn't skipped?

Fwiw, having it be parsable is why I wouldn't want it to be multiple
ereports. That would mean it could get interleaved with other errors
from other backends. That would be a disaster.




Re: postgres_fdw: incomplete subabort cleanup of connections used in async execution

2021-12-21 Thread Etsuro Fujita
Hi Alexander,

On Wed, Dec 22, 2021 at 1:08 AM Alexander Pyhalov
 wrote:
> Etsuro Fujita wrote on 2021-12-19 13:25:
> > While working on [1], I noticed $SUBJECT: postgres_fdw resets the
> > per-connection states of connections, which store async requests sent
> > to remote servers in async_capable mode, during post-abort
> > (pgfdw_xact_callback()), but it fails to do so during post-subabort
> > (pgfdw_subxact_callback()).  This causes a crash when re-executing a
> > query that was aborted in a subtransaction:

> Looks good to me.

Great!  Thanks for reviewing!

Best regards,
Etsuro Fujita




Re: sequences vs. synchronous replication

2021-12-21 Thread Fujii Masao




On 2021/12/22 10:57, Tomas Vondra wrote:



On 12/19/21 04:03, Amit Kapila wrote:

On Sat, Dec 18, 2021 at 7:24 AM Tomas Vondra
 wrote:


while working on logical decoding of sequences, I ran into an issue with
nextval() in a transaction that rolls back, described in [1]. But after
thinking about it a bit more (and chatting with Petr Jelinek), I think
this issue affects physical sync replication too.

Imagine you have a primary <-> sync_replica cluster, and you do this:

    CREATE SEQUENCE s;

    -- shutdown the sync replica

    BEGIN;
    SELECT nextval('s') FROM generate_series(1,50);
    ROLLBACK;

    BEGIN;
    SELECT nextval('s');
    COMMIT;

The natural expectation would be the COMMIT gets stuck, waiting for the
sync replica (which is not running), right? But it does not.



How about if we always WAL log the first sequence change in a transaction?



I've been thinking about doing something like this, but I think it would not have any 
significant advantages compared to using "SEQ_LOG_VALS 0". It would still have 
the same performance hit for plain nextval() calls, and there's no measurable impact on 
simple workloads that already write WAL in transactions even with SEQ_LOG_VALS 0.


Just an idea: if wal_level > minimal, how about making nextval_internal() (1)
check whether WAL has been replicated to the sync standbys up to the page LSN of
the sequence, and (2) forcibly emit a WAL record if it has not been replicated yet?
A similar check is performed at the beginning of SyncRepWaitForLSN(), so probably
we can reuse that code.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: row filtering for logical replication

2021-12-21 Thread vignesh C
On Tue, Dec 21, 2021 at 2:29 PM Peter Smith  wrote:
>
> Here is the v51* patch set:
>

I tweaked the query slightly based on Euler's changes, the explain
analyze of the updated query based on Euler's suggestions, existing
query and Euler's query is given below:
1) updated query based on Euler's suggestion:
explain analyze SELECT DISTINCT pg_get_expr(prqual, prrelid)   FROM
pg_publication p   INNER JOIN pg_publication_rel pr ON (p.oid =
pr.prpubid) WHERE pr.prrelid = 16384 AND p.pubname IN ( 'pub1' )
  AND NOT (select bool_or(puballtables)   FROM pg_publication
 WHERE pubname in ( 'pub1' )) AND NOT EXISTS (SELECT 1   FROM
pg_publication_namespace pn, pg_class c   WHERE c.oid = 16384 AND
c.relnamespace = pn.pnnspid);
   QUERY
PLAN

Unique  (cost=14.68..14.69 rows=1 width=32) (actual time=0.121..0.126
rows=1 loops=1)
   InitPlan 1 (returns $0)
 ->  Aggregate  (cost=1.96..1.97 rows=1 width=1) (actual
time=0.025..0.026 rows=1 loops=1)
   ->  Seq Scan on pg_publication  (cost=0.00..1.96 rows=1
width=1) (actual time=0.016..0.017 rows=1 loops=1)
 Filter: (pubname = 'pub1'::name)
   InitPlan 2 (returns $1)
 ->  Nested Loop  (cost=0.27..8.30 rows=1 width=0) (actual
time=0.002..0.003 rows=0 loops=1)
   Join Filter: (pn.pnnspid = c.relnamespace)
   ->  Seq Scan on pg_publication_namespace pn
(cost=0.00..0.00 rows=1 width=4) (actual time=0.001..0.002 rows=0
loops=1)
   ->  Index Scan using pg_class_oid_index on pg_class c
(cost=0.27..8.29 rows=1 width=4) (never executed)
 Index Cond: (oid = '16384'::oid)
   ->  Sort  (cost=4.40..4.41 rows=1 width=32) (actual
time=0.119..0.121 rows=1 loops=1)
 Sort Key: (pg_get_expr(pr.prqual, pr.prrelid)) COLLATE "C"
 Sort Method: quicksort  Memory: 25kB
 ->  Result  (cost=0.00..4.39 rows=1 width=32) (actual
time=0.094..0.098 rows=1 loops=1)
   One-Time Filter: ((NOT $0) AND (NOT $1))
   ->  Nested Loop  (cost=0.00..4.39 rows=1 width=36)
(actual time=0.013..0.015 rows=1 loops=1)
 Join Filter: (p.oid = pr.prpubid)
 ->  Seq Scan on pg_publication p
(cost=0.00..1.96 rows=1 width=4) (actual time=0.004..0.005 rows=1
loops=1)
   Filter: (pubname = 'pub1'::name)
 ->  Seq Scan on pg_publication_rel pr
(cost=0.00..2.41 rows=1 width=40) (actual time=0.005..0.005 rows=1
loops=1)
   Filter: (prrelid = '16384'::oid)
Planning Time: 1.014 ms
Execution Time: 0.259 ms
(24 rows)

2) Existing query:
postgres=# explain analyze SELECT DISTINCT pg_get_expr(prqual,
prrelid)   FROM pg_publication p
   INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid) WHERE
pr.prrelid = 16384 AND p.pubname IN ( 'pub1' )
 AND NOT (select bool_or(puballtables)   FROM pg_publication
WHERE pubname in ( 'pub1' ))
 AND (SELECT count(1)=0   FROM pg_publication_namespace pn,
pg_class c   WHERE c.oid = 16384 AND c.relnamespace = pn.pnnspid);
   QUERY
PLAN
-
Unique  (cost=14.69..14.70 rows=1 width=32) (actual time=0.162..0.166
rows=1 loops=1)
   InitPlan 1 (returns $0)
 ->  Aggregate  (cost=1.96..1.97 rows=1 width=1) (actual
time=0.023..0.025 rows=1 loops=1)
   ->  Seq Scan on pg_publication  (cost=0.00..1.96 rows=1
width=1) (actual time=0.014..0.016 rows=1 loops=1)
 Filter: (pubname = 'pub1'::name)
   InitPlan 2 (returns $1)
 ->  Aggregate  (cost=8.30..8.32 rows=1 width=1) (actual
time=0.044..0.045 rows=1 loops=1)
   ->  Nested Loop  (cost=0.27..8.30 rows=1 width=0) (actual
time=0.028..0.029 rows=0 loops=1)
 Join Filter: (pn.pnnspid = c.relnamespace)
 ->  Seq Scan on pg_publication_namespace pn
(cost=0.00..0.00 rows=1 width=4) (actual time=0.004..0.004 rows=0
loops=1)
 ->  Index Scan using pg_class_oid_index on pg_class c
 (cost=0.27..8.29 rows=1 width=4) (never executed)
   Index Cond: (oid = '16384'::oid)
   ->  Sort  (cost=4.40..4.41 rows=1 width=32) (actual
time=0.159..0.161 rows=1 loops=1)
 Sort Key: (pg_get_expr(pr.prqual, pr.prrelid)) COLLATE "C"
 Sort Method: quicksort  Memory: 25kB
 ->  Result  (cost=0.00..4.39 rows=1 width=32) (actual
time=0.142..0.147 rows=1 loops=1)
   One-Time Filter: ((NOT $0) AND $1)
   ->  Nested Loop  (cost=0.00..4.39 rows=1 width=36)
(actual time=0.016..0.018 rows=1 loops=1)
Join Filter: (p.oid = pr.prpubid)
 ->  Seq Scan on pg_publication p
(cost=0.00..1.96 

Re: parallel vacuum comments

2021-12-21 Thread Masahiko Sawada
On Tue, Dec 21, 2021 at 10:24 PM Amit Kapila  wrote:
>
> On Tue, Dec 21, 2021 at 11:24 AM Masahiko Sawada  
> wrote:
> >
> > On Tue, Dec 21, 2021 at 2:04 PM Amit Kapila  wrote:
> > >
> >
> > Thank you for the comment. Agreed.
> >
> > I've attached updated version patches. Please review them.
> >
>
> These look mostly good to me. Please find attached the slightly edited
> version of the 0001 patch. I have modified comments, ran pgindent, and
> modify the commit message. I'll commit this tomorrow if you are fine
> with it.

Thank you for committing the first patch.

I've attached an updated version second patch. Please review it.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


v11-0001-Move-parallel-vacuum-code-to-vacuumparallel.c.patch
Description: Binary data


Re: do only critical work during single-user vacuum?

2021-12-21 Thread Masahiko Sawada
On Wed, Dec 22, 2021 at 6:56 AM Peter Geoghegan  wrote:
>
> On Tue, Dec 21, 2021 at 1:31 PM John Naylor
>  wrote:
> > On second thought, we don't really need another number here. We could
> > simply go by the existing failsafe parameter, and if the admin wants a
> > different value, it's already possible to specify
> > vacuum_failsafe_age/vacuum_multixact_failsafe_age in a session,
> > including in single-user mode. Perhaps a new boolean called
> > FAILSAFE_ONLY. If no table is specified, then when generating the list
> > of tables, include only those with relfrozenxid/relminmxid greater
> > than their failsafe thresholds.
>
> That's equivalent to the quick and dirty patch I wrote (assuming that
> the user actually uses this new FAILSAFE_ONLY thing).
>
> But if we're going to add a new option to the VACUUM command (or
> something of similar scope), then we might as well add a new behavior
> that is reasonably exact -- something that (say) only *starts* a
> VACUUM for those tables whose relfrozenxid age currently exceeds half
> the autovacuum_freeze_max_age for the table (usually taken from the
> GUC, sometimes taken from the reloption), which also forces the
> failsafe. And with similar handling for
> relminmxid/autovacuum_multixact_freeze_max_age.
>
> In other words, while triggering the failsafe is important, simply *not
> starting* VACUUM for relations where there is really no need for it is
> at least as important. We shouldn't even think about pruning or
> freezing with these tables. (ISTM that the only thing that might be a
> bit controversial about any of this is my definition of "safe", which
> seems like about the right trade-off to me.)

+1

>
> This new command/facility should probably not be a new flag to the
> VACUUM command, as such. Rather, I think that it should either be an
> SQL-callable function, or a dedicated top-level command (that doesn't
> accept any tables). The only reason to have this is for scenarios
> where the user is already in a tough spot with wraparound failure,
> like that client of yours. Nobody wants to force the failsafe for one
> specific table. It's not general purpose, at all, and shouldn't claim
> to be.

Even outside the situation where the database has to be run in
single-user mode to freeze tuples, I think there would be some use
cases where users want to run vacuum (in failsafe mode) on tables whose
relfrozenxid/relminmxid exceed the failsafe thresholds, before falling
into that situation.  I think it's common for users to monitor
relfrozenxid/relminmxid in some way and run vacuum on those tables
manually rather than relying on autovacuum.  The --min-xid-age and
--min-mxid-age options of the vacuumdb command are good examples.  So I
think this new command/facility might not necessarily need to be
specific to single-user mode.
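
For example, users often watch something along these lines and kick off
manual vacuums based on it:

    SELECT datname, age(datfrozenxid) AS xid_age, mxid_age(datminmxid) AS mxid_age
    FROM pg_database
    ORDER BY xid_age DESC;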

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: sequences vs. synchronous replication

2021-12-21 Thread Tomas Vondra




On 12/19/21 04:03, Amit Kapila wrote:

On Sat, Dec 18, 2021 at 7:24 AM Tomas Vondra
 wrote:


while working on logical decoding of sequences, I ran into an issue with
nextval() in a transaction that rolls back, described in [1]. But after
thinking about it a bit more (and chatting with Petr Jelinek), I think
this issue affects physical sync replication too.

Imagine you have a primary <-> sync_replica cluster, and you do this:

CREATE SEQUENCE s;

-- shutdown the sync replica

BEGIN;
SELECT nextval('s') FROM generate_series(1,50);
ROLLBACK;

BEGIN;
SELECT nextval('s');
COMMIT;

The natural expectation would be the COMMIT gets stuck, waiting for the
sync replica (which is not running), right? But it does not.



How about if we always WAL log the first sequence change in a transaction?



I've been thinking about doing something like this, but I think it would 
not have any significant advantages compared to using "SEQ_LOG_VALS 0". 
It would still have the same performance hit for plain nextval() calls, 
and there's no measurable impact on simple workloads that already write 
WAL in transactions even with SEQ_LOG_VALS 0.



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Checkpointer crashes with "PANIC: could not fsync file "pg_tblspc/.."

2021-12-21 Thread Dilip Kumar
On Wed, 22 Dec 2021 at 12:28 AM, Ashutosh Sharma 
wrote:

>
> Is it okay to share the same tablespace (in fact relfile) between the
> primary and standby server? Perhaps NO.
>

Oops, yeah absolutely they can never share the tablespace path.  So we can
ignore this issue.

—
Dilip
-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: CREATEROLE and role ownership hierarchies

2021-12-21 Thread Mark Dilger



> On Dec 21, 2021, at 5:11 PM, Shinya Kato  
> wrote:
> 
> I fixed the patches because they cannot be applied to HEAD.

Thank you.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements

2021-12-21 Thread Anton A. Melnikov




On 03.12.2021 19:55, Andrei Zubkov wrote:

On Fri, 2021-12-03 at 17:03 +0300, Andrei Zubkov wrote:

...

What if we'll create a new view for such resettable fields? It will
make description of views and reset functions in the docs much more
clear.



Hi, Andrey!

I completely agree that creating a separate view for these new fields is
the most correct thing to do.

As for variable names, the term global is already used for global 
statistics, in particular in the struct pgssGlobalStats.

The considered timestamps refer to the per-statement level,
as pointed out in the struct pgssEntry's comment. Therefore, I think
it's better to rename gstats_since to just stats_reset in the same way.
Also it might be better to use the term 'auxiliary' and follow the same
approach as for the existing similar variables.

So internally it might look something like this:

double  aux_min_time[PGSS_NUMKIND];
double  aux_max_time[PGSS_NUMKIND];
TimestampTz aux_stats_reset;

And at the view level:
  aux_min_plan_time float8,
  aux_max_plan_time float8,
  aux_min_exec_time float8,
  aux_max_exec_time float8,
  aux_stats_reset timestamp with time zone

Functions names might be pg_stat_statements_aux() and 
pg_stat_statements_aux_reset().


The top-level application may find out the period over which the aux extrema
were collected by determining which reset was closer, as follows:

data_collect_period = aux_stats_reset > stats_reset ?
now - aux_stats_reset : now - stats_reset;

and decide not to trust this data if the period was too small.
For this to work correctly, aux_stats_reset must be updated and the aux extrema
must be reset simultaneously with the update of stats_reset, so some
synchronization is needed to avoid a race with a possible external reset.
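
Just to illustrate the idea (all of the aux_* names and the
pg_stat_statements_aux() function are the names proposed above, not an
existing API, and stats_reset is assumed to be exposed alongside them):

    SELECT queryid,
           aux_min_exec_time,
           aux_max_exec_time,
           now() - greatest(aux_stats_reset, stats_reset) AS data_collect_period
    FROM pg_stat_statements_aux();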


I've tested the patch v4 and didn't find any evident problems.
Contrib installcheck said:
test pg_stat_statements   ... ok  163 ms
test oldextversions   ... ok   46 ms

With best regards,
--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




RE: parallel vacuum comments

2021-12-21 Thread houzj.f...@fujitsu.com
On Wed, Dec 22, 2021 8:38 AM Masahiko Sawada  wrote:
> On Tue, Dec 21, 2021 at 10:24 PM Amit Kapila 
> wrote:
> >
> > On Tue, Dec 21, 2021 at 11:24 AM Masahiko Sawada
>  wrote:
> > >
> > > On Tue, Dec 21, 2021 at 2:04 PM Amit Kapila 
> wrote:
> > > >
> > >
> > > Thank you for the comment. Agreed.
> > >
> > > I've attached updated version patches. Please review them.
> > >
> >
> > These look mostly good to me. Please find attached the slightly edited
> > version of the 0001 patch. I have modified comments, ran pgindent, and
> > modify the commit message. I'll commit this tomorrow if you are fine
> > with it.
> 
> It looks good to me.
+1, the patch passes check-world and looks good to me.

Best regards,
Hou zj



Re: Use extended statistics to estimate (Var op Var) clauses

2021-12-21 Thread Mark Dilger



> On Dec 21, 2021, at 4:28 PM, Mark Dilger  wrote:
> 
> Maybe there is some reason this is ok.

... and there is.  Sorry for the noise.  The planner appears to be smart enough 
to know that column "salary" is not being changed, and therefore NEW.salary and 
OLD.salary are equal.  If I test a different update statement that contains a 
new value for "salary", the added assertion is not triggered.

(I didn't quite realize what the clause's varnosyn field was telling me until 
after I hit "send".)
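
For example, with the rtest_emp rule from the regression tests (the second
statement here is just an illustrative variant, not one of the existing
tests):

    -- assertion fires: "salary" is untouched, so new.salary and old.salary
    -- in the rule qual end up as the same Var
    update rtest_emp set ename = 'wiecx' where ename = 'wiecc';

    -- assertion does not fire: "salary" is assigned, so the two sides of
    -- "new.salary != old.salary" are no longer identical
    update rtest_emp set ename = 'wiecx', salary = salary + 1 where ename = 'wiecc';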


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: parallel vacuum comments

2021-12-21 Thread Masahiko Sawada
On Tue, Dec 21, 2021 at 10:24 PM Amit Kapila  wrote:
>
> On Tue, Dec 21, 2021 at 11:24 AM Masahiko Sawada  
> wrote:
> >
> > On Tue, Dec 21, 2021 at 2:04 PM Amit Kapila  wrote:
> > >
> >
> > Thank you for the comment. Agreed.
> >
> > I've attached updated version patches. Please review them.
> >
>
> These look mostly good to me. Please find attached the slightly edited
> version of the 0001 patch. I have modified comments, ran pgindent, and
> modify the commit message. I'll commit this tomorrow if you are fine
> with it.

Thank you for the patch! It looks good to me.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Use extended statistics to estimate (Var op Var) clauses

2021-12-21 Thread Mark Dilger



> On Dec 12, 2021, at 6:21 PM, Tomas Vondra  
> wrote:
> 
> <0001-Improve-estimates-for-Var-op-Var-with-the-same-Var.patch>

+* It it's (variable = variable) with the same variable on both sides, it's

s/It it's/If it's/

0001 lacks regression coverage.

> <0002-simplification.patch>

Changing comments introduced by patch 0001 in patch 0002 just creates git churn:

-* estimate the selectivity as 1.0 (or 0.0 if it's negated).
+* estimate the selectivity as 1.0 (or 0.0 when it's negated).

and:

  * matching_restriction_variable
- * Examine the args of a restriction clause to see if it's of the
- * form (variable op variable) with the same variable on both sides.
+ * Check if the two arguments of a restriction clause refer to the same
+ * variable, i.e. if the condition is of the form (variable op variable).
+ * We can deduce selectivity for such (in)equality clauses.

0002 also lacks regression coverage.

> <0003-relax-the-restrictions.patch>

0003 also lacks regression coverage.

> <0004-main-patch.patch>

Ok.

> <0005-Don-t-treat-Var-op-Var-as-simple-clauses.patch>

0005 again lacks regression coverage.



There might be a problem in how selectivity thinks about comparison between 
identical columns from the NEW and OLD pseudotables.  To show this, add an 
Assert to see where matching_restriction_variables() might return true:

--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -4952,6 +4952,8 @@ matching_restriction_variables(PlannerInfo *root, List *args, int varRelid)
 		return false;
 
 	/* The two variables need to match */
+	Assert(!equal(left, right));
+
 	return equal(left, right);

This results in the regression tests failing on "update rtest_emp set ename = 
'wiecx' where ename = 'wiecc';".  It may seem counterintuitive that 
matching_restriction_variables() would return true for a where-clause with only 
one occurrence of variable "ename", until you read the rule defined in 
rules.sql:

  create rule rtest_emp_upd as on update to rtest_emp where new.salary != 
old.salary do
  insert into rtest_emplog values (new.ename, current_user,
  'honored', new.salary, old.salary);

I think what's really happening here is that "new.salary != old.salary" is 
being processed by matching_restriction_variables() and returning that the 
new.salary refers to the same thing that old.salary refers to.

Here is the full stack trace, for reference:

(lldb) bt
* thread #1, stop reason = signal SIGSTOP
frame #0: 0x7fff6d8fd33a libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x7fff6d9b9e60 libsystem_pthread.dylib`pthread_kill + 430
frame #2: 0x7fff6d884808 libsystem_c.dylib`abort + 120
frame #3: 0x0001048a6c31 
postgres`ExceptionalCondition(conditionName="!equal(left, right)", 
errorType="FailedAssertion", fileName="selfuncs.c", lineNumber=4955) at 
assert.c:69:2
frame #4: 0x00010481e733 
postgres`matching_restriction_variables(root=0x7fe65e02b2d0, 
args=0x7fe65e02bf38, varRelid=0) at selfuncs.c:4955:2
frame #5: 0x00010480f63c 
postgres`eqsel_internal(fcinfo=0x7ffeebb9aeb8, negate=true) at 
selfuncs.c:265:6
frame #6: 0x00010481040a postgres`neqsel(fcinfo=0x7ffeebb9aeb8) at 
selfuncs.c:565:2
frame #7: 0x0001048b420c 
postgres`FunctionCall4Coll(flinfo=0x7ffeebb9af38, collation=0, 
arg1=140627396440784, arg2=901, arg3=140627396443960, arg4=0) at fmgr.c:1212:11
frame #8: 0x0001048b50e6 postgres`OidFunctionCall4Coll(functionId=102, 
collation=0, arg1=140627396440784, arg2=901, arg3=140627396443960, arg4=0) at 
fmgr.c:1448:9
  * frame #9: 0x000104557e4d 
postgres`restriction_selectivity(root=0x7fe65e02b2d0, operatorid=901, 
args=0x7fe65e02bf38, inputcollid=0, varRelid=0) at plancat.c:1828:26
frame #10: 0x0001044d4c76 
postgres`clause_selectivity_ext(root=0x7fe65e02b2d0, 
clause=0x7fe65e02bfe8, varRelid=0, jointype=JOIN_INNER, 
sjinfo=0x, use_extended_stats=true) at clausesel.c:902:10
frame #11: 0x0001044d4186 
postgres`clauselist_selectivity_ext(root=0x7fe65e02b2d0, 
clauses=0x7fe65e02cdd0, varRelid=0, jointype=JOIN_INNER, 
sjinfo=0x, use_extended_stats=true) at clausesel.c:185:8
frame #12: 0x0001044d3f97 
postgres`clauselist_selectivity(root=0x7fe65e02b2d0, 
clauses=0x7fe65e02cdd0, varRelid=0, jointype=JOIN_INNER, 
sjinfo=0x) at clausesel.c:108:9
frame #13: 0x0001044de192 
postgres`set_baserel_size_estimates(root=0x7fe65e02b2d0, 
rel=0x7fe65e01e640) at costsize.c:4931:3
frame #14: 0x0001044d16be 
postgres`set_plain_rel_size(root=0x7fe65e02b2d0, rel=0x7fe65e01e640, 
rte=0x7fe65e02ac40) at allpaths.c:584:2
frame #15: 0x0001044d0a79 
postgres`set_rel_size(root=0x7fe65e02b2d0, rel=0x7fe65e01e640, rti=1, 
rte=0x7fe65e02ac40) at allpaths.c:413:6
frame #16: 

Re: row filtering for logical replication

2021-12-21 Thread Peter Smith
On Tue, Dec 21, 2021 at 5:58 AM Euler Taveira  wrote:
>
> On Mon, Dec 20, 2021, at 12:10 AM, houzj.f...@fujitsu.com wrote:
>
> Attach the V49 patch set, which addressed all the above comments on the 0002
> patch.
>
> I've been testing the latest versions of this patch set. I'm attaching a new
> patch set based on v49. The suggested fixes are in separate patches after the
> current one so it is easier to integrate them into the related patch. The
> majority of these changes explains some decision to improve readability IMO.
>
> row-filter vs. row filter. I'm not a native speaker but "row filter" is widely
> used in similar contexts so I suggest using it. (I didn't adjust the commit
> messages)

Fixed in v51* [1]. And I also updated the commit comments.

>
> An ancient patch use the term coerce but it was changed to cast. Coercion
> implies an implicit conversion [1]. If you look at a few lines above you will
> see that this expression expects an implicit conversion.

Fixed in v51* [1]

>
> I modified the query to obtain the row filter expressions to (i) add the 
> schema
> pg_catalog to some objects and (ii) use NOT EXISTS instead of subquery (it
> reads better IMO).

Not changed in v51, but IIUC this might be fixed soon if it is
confirmed to be better. [2 Amit]

> A detail message requires you to capitalize the first word of sentences and
> includes a period at the end.

Fixed in v51* [1]

>
> It seems all server messages and documentation use the terminology "WHERE
> clause". Let's adopt it instead of "row filter".

Fixed in v51* [1]

>
> I reviewed 0003. It uses TupleTableSlot instead of HeapTuple. I probably missed
> the explanation but it requires more changes (logicalrep_write_tuple and 3 new
> entries into RelationSyncEntry). I replaced this patch with a slightly
> different one (0005 in this patch set) that uses HeapTuple instead. I did
> only simple tests and it requires more tests. I noticed that this patch does not
> include a test to cover the case where TOASTed values are not included in the
> new tuple. We should probably add one.

Not changed in v51. See response from Ajin [3 Ajin].

>
> I agree with Amit that it is a good idea to merge 0001, 0002, and 0005. I 
> would
> probably merge 0004 because it is just isolated changes.

Fixed in v51* [1] per Amit's suggestion (so the 0004 is still separate)


--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPs%2BdACvefCZasRE%3DP%3DDtaNmQvM3kiGyKyBHANA0yGcTZw%40mail.gmail.com
[2 Amit] 
https://www.postgresql.org/message-id/CAA4eK1KwoA5k8v9z9e4ZPN_X%3D1GAmQmsWyauFwZpKiSHqy6eZA%40mail.gmail.com
[3 Ajin] 
https://www.postgresql.org/message-id/CAFPTHDbfpPNh3GLGjySS%3DAuRfbQPQFNvfiyG1GDQW2kz1yT7Og%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2021-12-21 Thread Peter Smith
On Mon, Dec 20, 2021 at 9:30 PM Amit Kapila  wrote:
>
> On Mon, Dec 20, 2021 at 8:41 AM houzj.f...@fujitsu.com
>  wrote:
> >
> > Thanks for the comments, I agree with all the comments.
> > Attach the V49 patch set, which addressed all the above comments on the 0002
> > patch.
> >
>
> Few comments/suggestions:
> ==
> 1.
> + Oid publish_as_relid = InvalidOid;
> +
> + /*
> + * For a partition, if pubviaroot is true, check if any of the
> + * ancestors are published. If so, note down the topmost ancestor
> + * that is published via this publication, the row filter
> + * expression on which will be used to filter the partition's
> + * changes. We could have got the topmost ancestor when collecting
> + * the publication oids, but that will make the code more
> + * complicated.
> + */
> + if (pubform->pubviaroot && relation->rd_rel->relispartition)
> + {
> + if (pubform->puballtables)
> + publish_as_relid = llast_oid(ancestors);
> + else
> + publish_as_relid = GetTopMostAncestorInPublication(pubform->oid,
> +ancestors);
> + }
> +
> + if (publish_as_relid == InvalidOid)
> + publish_as_relid = relid;
>
> I think you can initialize publish_as_relid as relid and then later
> override it if required. That will save the additional check of
> publish_as_relid.
>

Fixed in v51* [1]

> 2. I think your previous version code in GetRelationPublicationActions
> was better as now we have to call memcpy at two places.
>

Fixed in v51* [1]

> 3.
> +
> + if (list_member_oid(GetRelationPublications(ancestor),
> + puboid) ||
> + list_member_oid(GetSchemaPublications(get_rel_namespace(ancestor)),
> + puboid))
> + {
> + topmost_relid = ancestor;
> + }
>
> I think here we don't need to use braces ({}) as there is just a
> single statement in the condition.
>

Fixed in v51* [1]

> 4.
> +#define IDX_PUBACTION_n 3
> + ExprState*exprstate[IDX_PUBACTION_n]; /* ExprState array for row filter.
> +One per publication action. */
> ..
> ..
>
> I think we can have this define outside the structure. I don't like
> this define name, can we name it NUM_ROWFILTER_TYPES or something like
> that?
>

Partly fixed in v51* [1], I've changed the #define name but I did not
move it. The adjacent comment talks about these ExprState caches and
explains the reason why the number is 3. So if I move the #define then
half that comment would have to move with it. I thought it is better
to keep all the related parts grouped together with the one
explanatory comment, but if you still want the #define moved please
confirm and I can do it in a future version.

> I think we can now merge 0001, 0002, and 0005. We are still evaluating
> the performance for 0003, so it is better to keep it separate. We can
> take the decision to merge it once we are done with our evaluation.
>

Merged as suggested in v51* [1]

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPs%2BdACvefCZasRE%3DP%3DDtaNmQvM3kiGyKyBHANA0yGcTZw%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: do only critical work during single-user vacuum?

2021-12-21 Thread John Naylor
On Tue, Dec 21, 2021 at 5:56 PM Peter Geoghegan  wrote:
>
> On Tue, Dec 21, 2021 at 1:31 PM John Naylor
>  wrote:
> > On second thought, we don't really need another number here. We could
> > simply go by the existing failsafe parameter, and if the admin wants a
> > different value, it's already possible to specify
> > vacuum_failsafe_age/vacuum_multixact_failsafe_age in a session,
> > including in single-user mode. Perhaps a new boolean called
> > FAILSAFE_ONLY. If no table is specified, then when generating the list
> > of tables, include only those with relfrozenxid/relminmxid greater
> > than their failsafe thresholds.
>
> That's equivalent to the quick and dirty patch I wrote (assuming that
> the user actually uses this new FAILSAFE_ONLY thing).

Equivalent but optional.

> But if we're going to add a new option to the VACUUM command (or
> something of similar scope), then we might as well add a new behavior
> that is reasonably exact -- something that (say) only *starts* a
> VACUUM for those tables whose relfrozenxid age currently exceeds half
> the autovacuum_freeze_max_age for the table (usually taken from the
> GUC, sometimes taken from the reloption), which also forces the
> failsafe. And with similar handling for
> relminmxid/autovacuum_multixact_freeze_max_age.
>
> In other words, while triggering the failsafe is important, simply *not
> starting* VACUUM for relations where there is really no need for it is
> at least as important. We shouldn't even think about pruning or
> freezing with these tables.

Right, not starting where not necessary is crucial for getting out of
single-user mode as quickly as possible. We're in agreement there.

> (ISTM that the only thing that might be a
> bit controversial about any of this is my definition of "safe", which
> seems like about the right trade-off to me.)

It seems reasonable to want to start back up and not immediately have
anti-wraparound vacuums kick in. On the other hand, it's not good to
do work while unable to monitor progress, and while more WAL is piling
up. I'm not sure where the right trade off is.

> This new command/facility should probably not be a new flag to the
> VACUUM command, as such. Rather, I think that it should either be an
> SQL-callable function, or a dedicated top-level command (that doesn't
> accept any tables). The only reason to have this is for scenarios
> where the user is already in a tough spot with wraparound failure,
> like that client of yours. Nobody wants to force the failsafe for one
> specific table. It's not general purpose, at all, and shouldn't claim
> to be.

Makes sense, I'll have a think about what that would look like.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: A test for replay of regression tests

2021-12-21 Thread Thomas Munro
Hi,

Rebased and updated based on feedback.  Responses to multiple emails below:

On Thu, Dec 16, 2021 at 1:22 PM Michael Paquier  wrote:
> On Wed, Dec 15, 2021 at 05:50:45PM +0900, Michael Paquier wrote:
> > Hmm.  FWIW, I am working on doing similar for pg_upgrade to switch to
> > TAP there, and we share a lot in terms of running pg_regress on an
> > exising cluster.  Wouldn't it be better to move this definition to
> > src/Makefile.global.in rather than src/test/recovery/?
> >
> > My pg_regress command is actually very similar to yours, so I am
> > wondering if this would be better if somehow centralized, perhaps in
> > Cluster.pm.

Thanks for looking.  Right, it sounds like you'll have the same
problems I ran into.  I haven't updated this patch for that yet, as
I'm not sure exactly what you need and we could easily move it in a
later commit.  Does that seem reasonable?

> By the way, while I was sorting out my things, I have noticed that v4
> does not handle EXTRA_REGRESS_OPT.  Is that wanted?  You could just
> add that into your patch set and push the extra options to the
> pg_regress command:
> my $extra_opts_val = $ENV{EXTRA_REGRESS_OPT} || "";
> my @extra_opts = split(/\s+/, $extra_opts_val);

Seems like a good idea for consistency, but isn't that a variable
that's supposed to be expanded by a shell, not naively split on
whitespace?  Perhaps we should use the single-argument variant of
system(), so the whole escaped enchilada is passed to a shell?  Tried
like that in this version (though now I'm wondering what the correct
perl incantation is to shell-escape $outputdir and $dlpath...)

On Sat, Dec 11, 2021 at 1:17 PM Andres Freund  wrote:
> On 2021-12-10 12:58:01 +1300, Thomas Munro wrote:
> > +$node_primary->adjust_conf('postgresql.conf', 'max_connections', '25', 1);
>
> Possible that this will cause problem on some *BSD platform with a limited
> count of semaphores. But we can deal with that if / when it happens.

Right, those systems don't work out of the box for us already without
sysctl tweaks, so it doesn't matter if animals have to be adjusted
further.

> > @@ -590,16 +595,35 @@ create_tablespace_directories(const char *location, 
> > const Oid tablespaceoid)
> >   char   *linkloc;
> >   char   *location_with_version_dir;
> >   struct stat st;
> > + boolin_place;
> >
> >   linkloc = psprintf("pg_tblspc/%u", tablespaceoid);
> > - location_with_version_dir = psprintf("%s/%s", location,
> > +
> > + /*
> > +  * If we're asked to make an 'in place' tablespace, create the 
> > directory
> > +  * directly where the symlink would normally go.  This is a 
> > developer-only
> > +  * option for now, to facilitate regression testing.
> > +  */
> > + in_place =
> > + (allow_in_place_tablespaces || InRecovery) && 
> > strlen(location) == 0;
>
> Why is in_place set to true by InRecovery?

Well, the real condition is strlen(location) == 0, and the other part
is a sort of belt-and-braces check, but yeah, I should just remove
that part.  Done.

> ISTM that allow_in_place_tablespaces should be checked in CreateTableSpace(),
> and create_tablespace_directories() should just do whatever it's told?
> Otherwise it seems there's ample potential for confusion, e.g. because of the
> GUC differing between primary and replica, or between crash and a new start.

Agreed, that was the effect but the extra unnecessary check was a bit confusing.

> >   /*
> >* Attempt to coerce target directory to safe permissions.  If this 
> > fails,
> >* it doesn't exist or has the wrong owner.
> >*/
> > - if (chmod(location, pg_dir_create_mode) != 0)
> > + if (!in_place && chmod(location, pg_dir_create_mode) != 0)
> >   {
> >   if (errno == ENOENT)
> >   ereport(ERROR,
>
> Maybe add a coment saying that we don't need to chmod here because
> MakePGDirectory() takes care of that?

Done.

> > @@ -648,13 +672,13 @@ create_tablespace_directories(const char *location, 
> > const Oid tablespaceoid)
> >   /*
> >* In recovery, remove old symlink, in case it points to the wrong 
> > place.
> >*/
> > - if (InRecovery)
> > + if (!in_place && InRecovery)
> >   remove_tablespace_symlink(linkloc);
>
> I don't think it's right to check !in_place as you do here, given that it
> currently depends on a GUC setting that's possibly differs between WAL
> generation and replay time?

I have to, because otherwise we'll remove the directory we just
created at the top of the function.  It doesn't really depend on a GUC
(clearer after previous change).

> Perhaps also add a test that we catch "in-place" tablespace creation with
> allow_in_place_tablespaces = false? Although perhaps that's better done in the
> previous commit...

There was a test for that already, see this bit:

+-- empty tablespace locations are not usually allowed
+CREATE TABLESPACE regress_tblspace 

Re: do only critical work during single-user vacuum?

2021-12-21 Thread Peter Geoghegan
On Tue, Dec 21, 2021 at 1:31 PM John Naylor
 wrote:
> On second thought, we don't really need another number here. We could
> simply go by the existing failsafe parameter, and if the admin wants a
> different value, it's already possible to specify
> vacuum_failsafe_age/vacuum_multixact_failsafe_age in a session,
> including in single-user mode. Perhaps a new boolean called
> FAILSAFE_ONLY. If no table is specified, then when generating the list
> of tables, include only those with relfrozenxid/relminmxid greater
> than their failsafe thresholds.

That's equivalent to the quick and dirty patch I wrote (assuming that
the user actually uses this new FAILSAFE_ONLY thing).

But if we're going to add a new option to the VACUUM command (or
something of similar scope), then we might as well add a new behavior
that is reasonably exact -- something that (say) only *starts* a
VACUUM for those tables whose relfrozenxid age currently exceeds half
the autovacuum_freeze_max_age for the table (usually taken from the
GUC, sometimes taken from the reloption), which also forces the
failsafe. And with similar handling for
relminmxid/autovacuum_multixact_freeze_max_age.

In other words, while triggering the failsafe is important, simply *not
starting* VACUUM for relations where there is really no need for it is
at least as important. We shouldn't even think about pruning or
freezing with these tables. (ISTM that the only thing that might be a
bit controversial about any of this is my definition of "safe", which
seems like about the right trade-off to me.)
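
To illustrate the selection criterion, here is a minimal sketch of a query
listing the tables such a command would start VACUUM for (it ignores
per-table reloptions and the multixact side, and is not taken from any
patch):

    SELECT c.oid::regclass AS table_name,
           age(c.relfrozenxid) AS xid_age
    FROM pg_class c
    WHERE c.relkind IN ('r', 'm', 't')
      AND age(c.relfrozenxid) >
          current_setting('autovacuum_freeze_max_age')::bigint / 2
    ORDER BY age(c.relfrozenxid) DESC;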

This new command/facility should probably not be a new flag to the
VACUUM command, as such. Rather, I think that it should either be an
SQL-callable function, or a dedicated top-level command (that doesn't
accept any tables). The only reason to have this is for scenarios
where the user is already in a tough spot with wraparound failure,
like that client of yours. Nobody wants to force the failsafe for one
specific table. It's not general purpose, at all, and shouldn't claim
to be.


--
Peter Geoghegan




Re: daitch_mokotoff module

2021-12-21 Thread Dag Lem
Hello again,

It turns out that there actually exists an(other) implementation of
the Daitch-Mokotoff Soundex System which gets it right; the JOS
Soundex Calculator at https://www.jewishgen.org/jos/jossound.htm.
Other implementations I have been able to find, like the one in Apache
Commons Codec, used in e.g. Elasticsearch, are far from correct.

The source code for the JOS Soundex Calculator is not available, as
far as I can tell, however I have run the complete list of 98412 last
names at
https://raw.githubusercontent.com/philipperemy/name-dataset/master/names_dataset/v1/last_names.all.txt
through the calculator, in order to have a good basis for comparison.

This revealed a few shortcomings in my implementation. In particular I
had to go back to the drawing board in order to handle the dual nature
of "J" correctly. "J" can be either a vowel or a consonant in
Daitch-Mokotoff soundex, and this complicates encoding of the
*previous* character.

I have also done a more thorough review and refactoring of the code,
which should hopefully make things quite a bit more understandable to
others.

The changes are summarized as follows:

* Returns NULL for input without any encodable characters.
* Uses the same "unofficial" rules for "UE" as other implementations.
* Correctly considers "J" as either a vowel or a consonant.
* Corrected encoding for e.g. "HANNMANN".
* Code refactoring and comments for readability.
* Better examples in the documentation.

The implementation is now in correspondence with the JOS Soundex
Calculator for the 98412 last names mentioned above, with only the
following exceptions:

JOS: cedeño 43 53
PG:  cedeño 436000 536000
JOS: sadab(khura)  437000
PG:  sadab(khura)  437590

I hope this addition to the fuzzystrmatch extension module will prove
to be useful to others as well!

This is my very first code contribution to PostgreSQL, and I would be
grateful for any advice on how to proceed in order to get the patch
accepted.

Best regards

Dag Lem

diff --git a/contrib/fuzzystrmatch/Makefile b/contrib/fuzzystrmatch/Makefile
index 0704894f88..826e529e3e 100644
--- a/contrib/fuzzystrmatch/Makefile
+++ b/contrib/fuzzystrmatch/Makefile
@@ -3,11 +3,12 @@
 MODULE_big = fuzzystrmatch
 OBJS = \
 	$(WIN32RES) \
+	daitch_mokotoff.o \
 	dmetaphone.o \
 	fuzzystrmatch.o
 
 EXTENSION = fuzzystrmatch
-DATA = fuzzystrmatch--1.1.sql fuzzystrmatch--1.0--1.1.sql
+DATA = fuzzystrmatch--1.2.sql fuzzystrmatch--1.1--1.2.sql fuzzystrmatch--1.0--1.1.sql
 PGFILEDESC = "fuzzystrmatch - similarities and distance between strings"
 
 REGRESS = fuzzystrmatch
@@ -22,3 +23,8 @@ top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 include $(top_srcdir)/contrib/contrib-global.mk
 endif
+
+daitch_mokotoff.o: daitch_mokotoff.h
+
+daitch_mokotoff.h: daitch_mokotoff_header.pl
+	perl $< > $@
diff --git a/contrib/fuzzystrmatch/daitch_mokotoff.c b/contrib/fuzzystrmatch/daitch_mokotoff.c
new file mode 100644
index 00..1b7263c349
--- /dev/null
+++ b/contrib/fuzzystrmatch/daitch_mokotoff.c
@@ -0,0 +1,593 @@
+/*
+ * Daitch-Mokotoff Soundex
+ *
+ * Copyright (c) 2021 Finance Norway
+ * Author: Dag Lem 
+ *
+ * This implementation of the Daitch-Mokotoff Soundex System aims at high
+ * performance.
+ *
+ * - The processing of each phoneme is initiated by an O(1) table lookup.
+ * - For phonemes containing more than one character, a coding tree is traversed
+ *   to process the complete phoneme.
+ * - The (alternate) soundex codes are produced digit by digit in-place in
+ *   another tree structure.
+ *
+ *  References:
+ *
+ * https://www.avotaynu.com/soundex.htm
+ * https://www.jewishgen.org/InfoFiles/Soundex.html
+ * https://familypedia.fandom.com/wiki/Daitch-Mokotoff_Soundex
+ * https://stevemorse.org/census/soundex.html (dmlat.php, dmsoundex.php)
+ * https://github.com/apache/commons-codec/ (dmrules.txt, DaitchMokotoffSoundex.java)
+ * https://metacpan.org/pod/Text::Phonetic (DaitchMokotoff.pm)
+ *
+ * A few notes on other implementations:
+ *
+ * - All other known implementations have the same unofficial rules for "UE",
+ *   these are also adapted by this implementation (0, 1, NC).
+ * - The only other known implementation which is capable of generating all
+ *   correct soundex codes in all cases is the JOS Soundex Calculator at
+ *   https://www.jewishgen.org/jos/jossound.htm
+ * - "J" is considered (only) a vowel in dmlat.php
+ * - The official rules for "RS" are commented out in dmlat.php
+ * - Identical code digits for adjacent letters are not collapsed correctly in
+ *   dmsoundex.php when double digit codes are involved. E.g. "BESST" yields
+ *   744300 instead of 743000 as for "BEST".
+ * - "J" is considered (only) a consonant in DaitchMokotoffSoundex.java
+ * - "Y" is not considered a vowel in DaitchMokotoffSoundex.java
+ *
+ * Permission to use, copy, modify, and distribute this software and its
+ * documentation for any purpose, without fee, and without a written agreement
+ * is hereby granted, 

Re: do only critical work during single-user vacuum?

2021-12-21 Thread John Naylor
I wrote:

> Instead of a boolean, it seems like the new option should specify some
> age below which VACUUM will skip the table entirely, and above which
> will enter fail-safe mode. As mentioned earlier, the shutdown hint
> could spell out the exact command. With this design, it would specify
> the fail-safe default, or something else, to use with the option.

On second thought, we don't really need another number here. We could
simply go by the existing failsafe parameter, and if the admin wants a
different value, it's already possible to specify
vacuum_failsafe_age/vacuum_multixact_failsafe_age in a session,
including in single-user mode. Perhaps a new boolean called
FAILSAFE_ONLY. If no table is specified, then when generating the list
of tables, include only those with relfrozenxid/relminmxid greater
than their failsafe thresholds.
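
For example, a minimal sketch of overriding the failsafe thresholds in a
(hypothetical) single-user-mode session, using only GUCs that already
exist (the values are placeholders):

    SET vacuum_failsafe_age = 1000000000;
    SET vacuum_multixact_failsafe_age = 1000000000;
    VACUUM;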

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: Buildfarm support for older versions

2021-12-21 Thread Larry Rosenman

On 12/16/2021 3:23 pm, Andrew Dunstan wrote:

On 12/16/21 15:53, Larry Rosenman wrote:



I get:
ERROR for site owner:
Invalid domain for site key

on https://pgbuildfarm.org/cgi-bin/register-form.pl



try https://buildfarm.postgresql.org/cgi-bin/register-form.pl


cheers


andrew
I filled out that form on the 16th, and haven't gotten a new animal
assignment.  Is there a problem with my data?
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106




Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2021-12-21 Thread Mark Dilger



> On Dec 1, 2021, at 10:59 AM, Bossart, Nathan  wrote:
> 
> here is a v3

It took a while for me to get to this

@@ -1304,33 +1370,46 @@ HeapTupleSatisfiesVacuumHorizon(HeapTuple htup, Buffer 
buffer, TransactionId *de

if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))
{
+   if (unlikely(tuple->t_infomask & HEAP_XMAX_COMMITTED))
+   {
+   if (tuple->t_infomask & HEAP_XMAX_LOCK_ONLY)
+   ereport(ERROR,
+   (errcode(ERRCODE_DATA_CORRUPTED),
+errmsg_internal("found tuple with HEAP_XMAX_COMMITTED "
+"and HEAP_XMAX_LOCK_ONLY")));
+
+   /* pre-v9.3 lock-only bit pattern */
+   ereport(ERROR,
+   (errcode(ERRCODE_DATA_CORRUPTED),
+errmsg_internal("found tuple with HEAP_XMAX_COMMITTED and"
+"HEAP_XMAX_EXCL_LOCK set and "
+"HEAP_XMAX_IS_MULTI unset")));
+   }
+

I find this bit hard to understand.  Does the comment mean to suggest that the 
*upgrade* process should have eliminated all pre-v9.3 bit patterns, and 
therefore any such existing patterns are certainly corruption, or does it mean 
that data written by pre-v9.3 servers (and not subsequently updated) is defined 
as corrupt, or something else?

I am not complaining that the logic is wrong, just trying to wrap my head 
around what the comment means.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: minor gripe about lax reloptions parsing for views

2021-12-21 Thread Mark Dilger


Rebased patch attached:



v3-0001-Reject-storage-options-in-toast-namespace-in-view.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: Checkpointer crashes with "PANIC: could not fsync file "pg_tblspc/.."

2021-12-21 Thread Ashutosh Sharma
This is happening because in the case of the primary server, we let the
checkpointer process unlink the first segment of the rel file, but
that's not the case with the standby server. In case of standby, the
startup/recovery process unlinks the first segment of the rel file
immediately during WAL replay. Now, in this case as the tablespace path is
shared between the primary and standby servers, when the startup process
unlinks the first segment of the rel file, it gets unlinked/deleted for the
primary server as well. So, when we run the checkpoint on the primary
server the subsequent fsync fails with the error "No such file.." which
causes the database system to PANIC.

Please have a look at the code below in mdunlinkfork().

if (isRedo || forkNum != MAIN_FORKNUM ||
RelFileNodeBackendIsTemp(rnode))
{
if (!RelFileNodeBackendIsTemp(rnode))
{
/* Prevent other backends' fds from holding on to the disk
space */
ret = do_truncate(path);

/* Forget any pending sync requests for the first segment */
register_forget_request(rnode, forkNum, 0 /* first seg */ );
}
else
ret = 0;

/* Next unlink the file, unless it was already found to be missing
*/
if (ret == 0 || errno != ENOENT)
{
ret = unlink(path);
if (ret < 0 && errno != ENOENT)
ereport(WARNING,
(errcode_for_file_access(),
 errmsg("could not remove file \"%s\": %m", path)));
}
}
else
{
/* Prevent other backends' fds from holding on to the disk space */
ret = do_truncate(path);

/* Register request to unlink first segment later */
register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
}

==

Is it okay to share the same tablespace (in fact the relfile) between the
primary and standby servers? Perhaps NO.

--
With Regards,
Ashutosh Sharma.

On Tue, Dec 21, 2021 at 4:47 PM Dilip Kumar  wrote:

> While testing the below case with the hot standby setup (with the
> latest code), I have noticed that the checkpointer process crashed
> with the $subject error. As per my observation, we have registered the
> SYNC_REQUEST when inserting some tuple into the table, and later on
> ALTER SET TABLESPACE we have registered the SYNC_UNLINK_REQUEST, which
> looks fine so far, then I have noticed that only when the standby is
> connected the underlying table file w.r.t the old tablespace is
> already deleted.  Now, in AbsorbFsyncRequests we don't do anything for
> the SYNC_REQUEST even though we have SYNC_UNLINK_REQUEST for the same
> file, but since the underlying file is already deleted the
> checkpointer crashed while processing the SYNC_REQUEST.
>
> I have spent some time on this but could not figure out how the
> relfilenode file w.r.t. the old tablespace is getting deleted
> and if I disconnect the standby then it is not getting deleted, not
> sure how walsender is playing a role in deleting the file even before
> the checkpointer processes the unlink request.
>
> postgres[8905]=# create tablespace tab location
> '/home/dilipkumar/work/PG/install/bin/test';
> CREATE TABLESPACE
> postgres[8905]=# create tablespace tab1 location
> '/home/dilipkumar/work/PG/install/bin/test1';
> CREATE TABLESPACE
> postgres[8905]=# create database test tablespace tab;
> CREATE DATABASE
> postgres[8905]=# \c test
> You are now connected to database "test" as user "dilipkumar".
> test[8912]=# create table t( a int PRIMARY KEY,b text);
> CREATE TABLE
> test[8912]=# insert into t values (generate_series(1,10), 'aaa');
> INSERT 0 10
> test[8912]=# alter table t set tablespace tab1 ;
> ALTER TABLE
> test[8912]=# CHECKPOINT ;
> WARNING:  57P02: terminating connection because of crash of another
> server process
>
> log shows:
> PANIC:  could not fsync file
> "pg_tblspc/16384/PG_15_202112131/16386/16387": No such file or
> directory
>
> backtrace:
> #0  0x7f2f865ff387 in raise () from /lib64/libc.so.6
> #1  0x7f2f86600a78 in abort () from /lib64/libc.so.6
> #2  0x00b13da3 in errfinish (filename=0xcf283f "sync.c", ..
> #3  0x00978dc7 in ProcessSyncRequests () at sync.c:439
> #4  0x005949d2 in CheckPointGuts (checkPointRedo=67653624,
> flags=108) at xlog.c:9590
> #5  0x005942fe in CreateCheckPoint (flags=108) at xlog.c:9318
> #6  0x008a80b7 in CheckpointerMain () at checkpointer.c:444
>
> Note: This smaller test case is derived from one of the bigger
> scenarios raised by Neha Sharma [1]
>
> [1]
> https://www.postgresql.org/message-id/CANiYTQs0E8TcB11eU0C4eNN0tUd%3DSQqsqEtL1AVZP1%3DEnD-49A%40mail.gmail.com
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com
>
>
>


Re: pg14 psql broke \d datname.nspname.relname

2021-12-21 Thread Mark Dilger


Rebased patch attached:



v3-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: Emit a warning if the extension's GUC is set incorrectly

2021-12-21 Thread Tom Lane
I wrote:
> 0003 looks to me like it was superseded by 75d22069e.  I do not
> particularly like that patch though; it seems both inefficient
> and ill-designed.  0003 is adding a check in an equally bizarre
> place.  Why isn't add_placeholder_variable the place to deal
> with this?  Also, once we know that a prefix is reserved,
> why don't we throw an error not just a warning for attempts to
> set some unknown variable under that prefix?  We don't allow
> you to set unknown non-prefixed variables.

Concretely, I think we should do the attached, which removes much of
75d22069e and does the check at the point of placeholder creation.
(After looking closer, add_placeholder_variable's caller is the
function that is responsible for rejecting bad names, not
add_placeholder_variable itself.)

This also fixes an indisputable bug in 75d22069e, namely that it
did prefix comparisons incorrectly and would throw warnings for
cases it shouldn't.  Reserving "plpgsql" shouldn't have the effect
of reserving "pl" as well.

I'm tempted to propose that we also rename EmitWarningsOnPlaceholders
to something like MarkGUCPrefixReserved, to more clearly reflect
what it does now.  (We could provide the old name as a macro alias
to avoid breaking extensions needlessly.)
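
For illustration only, the compatibility alias could be as simple as this
sketch (using the proposed new name, which is not in the attached patch):

    #define EmitWarningsOnPlaceholders(className) MarkGUCPrefixReserved(className)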

regards, tom lane

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index bff949a40b..f345eceb5b 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -148,6 +148,8 @@ extern bool optimize_bounded_sort;
 
 static int	GUC_check_errcode_value;
 
+static List *reserved_class_prefix = NIL;
+
 /* global variables for check hook support */
 char	   *GUC_check_errmsg_string;
 char	   *GUC_check_errdetail_string;
@@ -235,8 +237,6 @@ static bool check_recovery_target_lsn(char **newval, void **extra, GucSource sou
 static void assign_recovery_target_lsn(const char *newval, void *extra);
 static bool check_primary_slot_name(char **newval, void **extra, GucSource source);
 static bool check_default_with_oids(bool *newval, void **extra, GucSource source);
-static void check_reserved_prefixes(const char *varName);
-static List *reserved_class_prefix = NIL;
 
 /* Private functions in guc-file.l that need to be called from guc.c */
 static ConfigVariable *ProcessConfigFileInternal(GucContext context,
@@ -5569,18 +5569,44 @@ find_option(const char *name, bool create_placeholders, bool skip_errors,
 		 * doesn't contain a separator, don't assume that it was meant to be a
 		 * placeholder.
 		 */
-		if (strchr(name, GUC_QUALIFIER_SEPARATOR) != NULL)
+		const char *sep = strchr(name, GUC_QUALIFIER_SEPARATOR);
+
+		if (sep != NULL)
 		{
-			if (valid_custom_variable_name(name))
-return add_placeholder_variable(name, elevel);
-			/* A special error message seems desirable here */
-			if (!skip_errors)
-ereport(elevel,
-		(errcode(ERRCODE_INVALID_NAME),
-		 errmsg("invalid configuration parameter name \"%s\"",
-name),
-		 errdetail("Custom parameter names must be two or more simple identifiers separated by dots.")));
-			return NULL;
+			size_t		classLen = sep - name;
+			ListCell   *lc;
+
+			/* The name must be syntactically acceptable ... */
+			if (!valid_custom_variable_name(name))
+			{
+if (!skip_errors)
+	ereport(elevel,
+			(errcode(ERRCODE_INVALID_NAME),
+			 errmsg("invalid configuration parameter name \"%s\"",
+	name),
+			 errdetail("Custom parameter names must be two or more simple identifiers separated by dots.")));
+return NULL;
+			}
+			/* ... and it must not match any previously-reserved prefix */
+			foreach(lc, reserved_class_prefix)
+			{
+const char *rcprefix = lfirst(lc);
+
+if (strlen(rcprefix) == classLen &&
+	strncmp(name, rcprefix, classLen) == 0)
+{
+	if (!skip_errors)
+		ereport(elevel,
+(errcode(ERRCODE_INVALID_NAME),
+ errmsg("invalid configuration parameter name \"%s\"",
+		name),
+ errdetail("\"%s\" is a reserved prefix.",
+		   rcprefix)));
+	return NULL;
+}
+			}
+			/* OK, create it */
+			return add_placeholder_variable(name, elevel);
 		}
 	}
 
@@ -8764,7 +8790,6 @@ ExecSetVariableStmt(VariableSetStmt *stmt, bool isTopLevel)
 	 (superuser() ? PGC_SUSET : PGC_USERSET),
 	 PGC_S_SESSION,
 	 action, true, 0, false);
-			check_reserved_prefixes(stmt->name);
 			break;
 		case VAR_SET_MULTI:
 
@@ -8850,8 +8875,6 @@ ExecSetVariableStmt(VariableSetStmt *stmt, bool isTopLevel)
 	 (superuser() ? PGC_SUSET : PGC_USERSET),
 	 PGC_S_SESSION,
 	 action, true, 0, false);
-
-			check_reserved_prefixes(stmt->name);
 			break;
 		case VAR_RESET_ALL:
 			ResetAllOptions();
@@ -9345,8 +9368,9 @@ EmitWarningsOnPlaceholders(const char *className)
 {
 	int			classLen = strlen(className);
 	int			i;
-	MemoryContext	oldcontext;
+	MemoryContext oldcontext;
 
+	/* Check for existing placeholders. */
 	for (i = 0; i < 

Re: do only critical work during single-user vacuum?

2021-12-21 Thread John Naylor
On Tue, Dec 21, 2021 at 3:56 AM Masahiko Sawada  wrote:
>
> On Tue, Dec 21, 2021 at 1:53 PM Peter Geoghegan  wrote:
> >
> > On Mon, Dec 20, 2021 at 8:40 PM Masahiko Sawada  
> > wrote:
> > > BTW a vacuum automatically enters failsafe mode under the situation
> > > where the user has to run a vacuum in the single-user mode, right?
> >
> > Only for the table that had the problem. Maybe there are no other
> > tables that a database level "VACUUM" will need to spend much time on,
> > or maybe there are, and they will make it take much much longer (it
> > all depends).
> >
> > The goal of the patch is to make sure that when we're in single user
> > mode, we'll consistently trigger the failsafe, for every VACUUM
> > against every table -- not just the table (or tables) whose
> > relfrozenxid is very old. That's still naive, but much less naive than
> > simply telling users to VACUUM the whole database in single user mode
> > while vacuuming indexes, etc.
>
> I understand the patch, thank you for the explanation!
>
> I remember Simon proposed a VACUUM command option[1], called
> FAST_FREEZE, to turn off index cleanup and heap truncation. Now that
> we have failsafe mechanism probably we can have a VACUUM command
> option to turn on failsafe mode instead.

I've been thinking a bit more about this, and I see two desirable
goals of anti-wraparound vacuum in single-user mode:

1. Get out of single-user mode as quickly as possible.

2. Minimize the catch-up work we have to do once we're out.

Currently, a naive vacuum does as much work as possible and leaves a
bunch of WAL streaming and archiving work for later, so that much is
easy to improve upon and we don't have to be terribly sophisticated.
Keeping in mind Andres' point that we don't want to force possibly
unwanted behavior just because we're in single-user mode, it makes
sense to have some kind of option that has the above two goals.
Instead of a boolean, it seems like the new option should specify some
age below which VACUUM will skip the table entirely, and above which
will enter fail-safe mode. As mentioned earlier, the shutdown hint
could spell out the exact command. With this design, it would specify
the fail-safe default, or something else, to use with the option. That
seems doable for v15 -- any thoughts on that approach?
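
Purely as a hypothetical sketch of what the hinted command might look like
(the option name and value here are invented for illustration, nothing is
settled):

    VACUUM (MIN_XID_AGE 1600000000);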

In standard operation, the above goals could be restated as "advance
xmin as quickly as possible" and "generate as little future
'work/debt' as possible, whether dirty pages or WAL". There are some
more sophisticated things we can do in this regard, but something like
the above could also be useful in normal operation. In fact, that
"normal" could be just after we restarted after doing the bare-minimum
in single-user mode, and want to continue freezing and keep things
under control.

--
John Naylor
EDB: http://www.enterprisedb.com




Re: Emit a warning if the extension's GUC is set incorrectly

2021-12-21 Thread Tom Lane
Kyotaro Horiguchi  writes:
> At Mon, 20 Dec 2021 21:05:23 +0900, Shinya Kato 
>  wrote in 
>> We should use EmitWarningsOnPlaceholders when we use
>> DefineCustomXXXVariable.
>> I don't think there is any room for debate.

> Unfortunately, pltcl.c defines variables both in pltcl and pltclu but
> the patch adds the warning only for pltclu.

Right.  The rest of 0001 looks fine, so pushed with that correction.

I concur that 0002 is unacceptable.  The result of it will be that
a freshly-loaded extension will fail to hook into the system as it
should, because it will fail to complete its initialization.
That will cause user frustration and bug reports.

(BTW, I wonder if EmitWarningsOnPlaceholders should be using LOG
rather than WARNING when it's running in the postmaster.  Use of
WARNING is moderately likely to result in nothing getting printed
at all.)

0003 looks to me like it was superseded by 75d22069e.  I do not
particularly like that patch though; it seems both inefficient
and ill-designed.  0003 is adding a check in an equally bizarre
place.  Why isn't add_placeholder_variable the place to deal
with this?  Also, once we know that a prefix is reserved,
why don't we throw an error not just a warning for attempts to
set some unknown variable under that prefix?  We don't allow
you to set unknown non-prefixed variables.

regards, tom lane




Re: postgres_fdw: incomplete subabort cleanup of connections used in async execution

2021-12-21 Thread Alexander Pyhalov

Etsuro Fujita wrote on 2021-12-19 13:25:

Hi,

While working on [1], I noticed $SUBJECT: postgres_fdw resets the
per-connection states of connections, which store async requests sent
to remote servers in async_capable mode, during post-abort
(pgfdw_xact_callback()), but it fails to do so during post-subabort
(pgfdw_subxact_callback()).  This causes a crash when re-executing a
query that was aborted in a subtransaction:



Hi.
Looks good to me.
--
Best regards,
Alexander Pyhalov,
Postgres Professional




RE: Optionally automatically disable logical replication subscriptions on error

2021-12-21 Thread osumi.takami...@fujitsu.com
On Thursday, December 16, 2021 9:51 PM I wrote:
> Attached the updated patch v14.
FYI, I've conducted a test of the disable_on_error flag using
pg_upgrade.  I prepared PG14 and HEAD with the disable_on_error patch applied.
Then, I set up a logical replication pair of publisher and subscriber on PG14
and executed pg_upgrade for both the publisher and the subscriber individually.

After the upgrade, on the subscriber, I've confirmed that disable_on_error is
false via both pg_subscription and \dRs+, as expected.

Best Regards,
Takamichi Osumi




Re: parallel vacuum comments

2021-12-21 Thread Amit Kapila
On Tue, Dec 21, 2021 at 11:24 AM Masahiko Sawada  wrote:
>
> On Tue, Dec 21, 2021 at 2:04 PM Amit Kapila  wrote:
> >
>
> Thank you for the comment. Agreed.
>
> I've attached updated version patches. Please review them.
>

These look mostly good to me. Please find attached the slightly edited
version of the 0001 patch. I have modified comments, ran pgindent, and
modified the commit message. I'll commit this tomorrow if you are fine
with it.

-- 
With Regards,
Amit Kapila.


v11-0001-Move-index-vacuum-routines-to-vacuum.c.patch
Description: Binary data


RE: In-placre persistance change of a relation

2021-12-21 Thread Jakub Wartak
Hi Kyotaro,

> I took a bit of a long detour, but the patch now passes make-world for me.

Good news, v10 passes all the tests for me (including TAP recover ones). 
There's major problem I think:

drop table t6;
create unlogged table t6 (id bigint, t text);
create sequence s1;
insert into t6 select nextval('s1'), repeat('A', 1000) from generate_series(1, 
100);
alter table t6 set logged;
select pg_sleep(1);
<--optional checkpoint, more on this later.
/usr/pgsql-15/bin/pg_ctl -D /var/lib/pgsql/15/data -l logfile -m immediate stop
/usr/pgsql-15/bin/pg_ctl -D /var/lib/pgsql/15/data -l logfile start
select count(*) from t6; -- shows 0 rows

But if I perform a checkpoint before the crash, the data is there...
apparently the missing steps done by the checkpointer seem to help.
If a checkpoint is not done, then some peeking reveals that upon
startup there is truncation (?!):

$ /usr/pgsql-15/bin/pg_ctl -D /var/lib/pgsql/15/data -l logfile -m immediate 
stop
$ find /var/lib/pgsql/15/data/ -name '73741*' -ls
112723206  120 -rw---   1 postgres postgres   122880 Dec 21 12:42 
/var/lib/pgsql/15/data/base/73740/73741
112723202   24 -rw---   1 postgres postgres24576 Dec 21 12:42 
/var/lib/pgsql/15/data/base/73740/73741_fsm
$ /usr/pgsql-15/bin/pg_ctl -D /var/lib/pgsql/15/data -l logfile start
waiting for server to start done
server started
$ find /var/lib/pgsql/15/data/ -name '73741*' -ls
1127232060 -rw---   1 postgres postgres0 Dec 21 12:42 
/var/lib/pgsql/15/data/base/73740/73741
112723202   16 -rw---   1 postgres postgres16384 Dec 21 12:42 
/var/lib/pgsql/15/data/base/73740/73741_fsm

So what's suspicious is that 122880 -> 0 file size truncation. I've
investigated the WAL and it seems to contain TRUNCATE records
after the logged FPI images, so when crash recovery kicks in it probably
clears this table (while it shouldn't).

However, if I perform a CHECKPOINT just before the crash, the WAL stream
contains just RUNNING_XACTS and CHECKPOINT_ONLINE redo records; this
probably prevents the truncation. I'm a newbie here, so please take this
theory with a grain of salt, it could be something completely different.

-J.





Re: Multi-Column List Partitioning

2021-12-21 Thread Amit Langote
On Tue, Dec 21, 2021 at 2:47 PM Ashutosh Sharma  wrote:
> On Mon, Dec 20, 2021 at 7:04 PM Amit Langote  wrote:
>> On Mon, Dec 13, 2021 at 11:37 PM Ashutosh Sharma  
>> wrote:
>> >
>> > Hi,
>> >
>> > Is this okay?
>> >
>> > postgres=# CREATE TABLE t1 (a int, b int) PARTITION BY LIST ( a, a, a );
>> > CREATE TABLE
>> >
>> > postgres=# CREATE TABLE t1_1 PARTITION OF t1 FOR VALUES IN ((1, 2, 3), (4, 
>> > 5, 6));
>> > CREATE TABLE
>> >
>> > postgres=# \d t1
>> >Partitioned table "public.t1"
>> >  Column |  Type   | Collation | Nullable | Default
>> > +-+---+--+-
>> >  a  | integer |   |  |
>> >  b  | integer |   |  |
>> > Partition key: LIST (a, a, a)
>> > Number of partitions: 1 (Use \d+ to list them.)
>>
>> I'd say it's not okay for a user to expect this to work sensibly, and
>> I don't think it would be worthwhile to write code to point that out
>> to the user if that is what you were implying.
>
> OK. As you wish.

Actually, we *do* have some code in check_new_partition_bound() to
point it out if an empty range is specified for a partition, something
that one (or a DDL script) may accidentally do:

/*
 * First check if the resulting range would be empty with
 * specified lower and upper bounds...
 */
cmpval = partition_rbound_cmp(key->partnatts,
  key->partsupfunc,
  key->partcollation,
  lower->datums, lower->kind,
  true, upper);
Assert(cmpval != 0);
if (cmpval > 0)
{
/* Point to problematic key in the lower datums list. */
PartitionRangeDatum *datum = list_nth(spec->lowerdatums,
  cmpval - 1);

ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
 errmsg("empty range bound specified for
partition \"%s\"",
relname),
 errdetail("Specified lower bound %s is
greater than or equal to upper bound %s.",

get_range_partbound_string(spec->lowerdatums),

get_range_partbound_string(spec->upperdatums)),
 parser_errposition(pstate, datum->location)));
}

So one may wonder why we don't catch and point out more such user
mistakes, like the one in your example.  It may not be hard to
implement a proof that the partition bound definition a user entered
results in a self-contradictory partition constraint using the
facilities given in predtest.c.  (The empty-range proof seemed simple
enough to implement as the above block of code.)  I don't however see
why we should do that for partition constraints if we don't do the
same for CHECK constraints; for example, the following definition,
while allowed, is not very useful:

create table foo (a int check (a = 1 and a = 2));
\d foo
Table "public.foo"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 a  | integer |   |  |
Check constraints:
"foo_a_check" CHECK (a = 1 AND a = 2)

Maybe partitioning should be looked at differently than the free-form
CHECK constraints, but I'm not so sure.  Or if others insist that it
may be worthwhile to improve the user experience in such cases, we
could do that as a separate patch than the patch to implement
multi-column list partitioning.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




RE: [PATCH] pg_stat_toast v0.3

2021-12-21 Thread kuroda.hay...@fujitsu.com
Dear Gunnar,

> The attached v0.3 includes
>   * columns "storagemethod" and "compressmethod" added as per Hayato
> Kuroda's suggestion

I prefer your implementation that refers to another system view.

> * gathering timing information

Here is a minor comment:
I'm not sure when we should start to measure time.
If you want to record the time spent compressing, toast_compress_datum()
should be sandwiched by INSTR_TIME_SET_CURRENT() calls to calculate the
time duration.  Currently time_spent is calculated in
pgstat_report_toast_activity(), but this has a systematic error.
If you want to record the time spent inserting/updating, however,
I think we should start measuring at the upper layer, maybe
heap_toast_insert_or_update().
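
As a rough sketch of the first variant (the variable names and the
surrounding code are assumptions, not the patch's actual code):

    instr_time  start_time,
                duration;
    Datum       cvalue;

    INSTR_TIME_SET_CURRENT(start_time);
    cvalue = toast_compress_datum(value, cmethod);
    INSTR_TIME_SET_CURRENT(duration);
    INSTR_TIME_SUBTRACT(duration, start_time);

    /* hand only the compression time to the stats code, e.g. in microseconds */
    time_spent = INSTR_TIME_GET_MICROSEC(duration);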

> Any hints on how to write meaningful tests would be much appreciated!
> I.e., will a test

I searched for hints in the PG sources, and I think the src/test/modules/
subdirectory can be used.
The following is an example:
https://github.com/postgres/postgres/tree/master/src/test/modules/commit_ts

I attached a patch to create a new test. Could you rewrite the SQL and the
output file?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



adding_new_test.patch
Description: adding_new_test.patch


Re: Schema variables - new implementation for Postgres 15

2021-12-21 Thread Pavel Stehule
út 21. 12. 2021 v 13:36 odesílatel Justin Pryzby 
napsal:

> On Tue, Dec 21, 2021 at 01:29:00PM +0100, Pavel Stehule wrote:
> > Hi
> >
> > út 21. 12. 2021 v 0:09 odesílatel Justin Pryzby 
> > napsal:
> >
> > > I don't understand what 0002 patch does relative to the 0001 patch.
> > > Is 0002 to change the error messages from "schema variables" to
> "session
> > > variables" , in a separate commit to show that the main patch doesn't
> > > change
> > > regression results ?  Could you add commit messages ?
> > >
> > > I mentioned before that there's a pre-existing use of the phrase
> "session
> > > variable", which you should change to something else:
> > >
> > > origin:doc/src/sgml/ref/set_role.sgml:   SET ROLE
> does
> > > not process session variables as specified by
> > > origin:doc/src/sgml/ref/set_role.sgml-   the role's  > > linkend="sql-alterrole">ALTER ROLE settings;
> > > this only happens during
> > > origin:doc/src/sgml/ref/set_role.sgml-   login.
> > >
> > > Maybe "session variable" should be added to the glossary.
> > >
> > > The new tests crash if debug_discard_caches=on.
> > >
> > > 2021-12-20 16:15:44.476 CST postmaster[7478] LOG:  server process (PID
> > > 7657) was terminated by signal 6: Aborted
> > > 2021-12-20 16:15:44.476 CST postmaster[7478] DETAIL:  Failed process
> was
> > > running: DISCARD VARIABLES;
> >
> > How do you inject this parameter to regress tests?
>
> You can run PGOPTIONS='-c debug_invalidate_caches=1' make check
>
> I used make installcheck against a running instance where I'd used
> ALTER SYSTEM SET debug_discard_caches=on.
>
> You can also manually run psql against the .sql file itself.
> ...which is a good idea since this causes the regression tests to take hours.
>
> Or just add SET debug_discard_caches=on to your .sql file.
>

ok thank you

I'll try it.


> --
> Justin
>


Re: Schema variables - new implementation for Postgres 15

2021-12-21 Thread Justin Pryzby
On Tue, Dec 21, 2021 at 01:29:00PM +0100, Pavel Stehule wrote:
> Hi
> 
> út 21. 12. 2021 v 0:09 odesílatel Justin Pryzby 
> napsal:
> 
> > I don't understand what 0002 patch does relative to the 0001 patch.
> > Is 0002 to change the error messages from "schema variables" to "session
> > variables" , in a separate commit to show that the main patch doesn't
> > change
> > regression results ?  Could you add commit messages ?
> >
> > I mentioned before that there's a pre-existing use of the phrase "session
> > variable", which you should change to something else:
> >
> > origin:doc/src/sgml/ref/set_role.sgml:   SET ROLE does
> > not process session variables as specified by
> > origin:doc/src/sgml/ref/set_role.sgml-   the role's  > linkend="sql-alterrole">ALTER ROLE settings;
> > this only happens during
> > origin:doc/src/sgml/ref/set_role.sgml-   login.
> >
> > Maybe "session variable" should be added to the glossary.
> >
> > The new tests crash if debug_discard_caches=on.
> >
> > 2021-12-20 16:15:44.476 CST postmaster[7478] LOG:  server process (PID
> > 7657) was terminated by signal 6: Aborted
> > 2021-12-20 16:15:44.476 CST postmaster[7478] DETAIL:  Failed process was
> > running: DISCARD VARIABLES;
> 
> How do you inject this parameter to regress tests?

You can run PGOPTIONS='-c debug_invalidate_caches=1' make check

I used make installcheck against a running instance where I'd used
ALTER SYSTEM SET debug_discard_caches=on.

You can also manually run psql against the .sql file itself.
...which is a good idea since this causes the regression tests to take hours.

Or just add SET debug_discard_caches=on to your .sql file.

-- 
Justin




Re: Schema variables - new implementation for Postgres 15

2021-12-21 Thread Pavel Stehule
Hi

út 21. 12. 2021 v 0:09 odesílatel Justin Pryzby 
napsal:

> I don't understand what 0002 patch does relative to the 0001 patch.
> Is 0002 to change the error messages from "schema variables" to "session
> variables" , in a separate commit to show that the main patch doesn't
> change
> regression results ?  Could you add commit messages ?
>
> I mentioned before that there's a pre-existing use of the phrase "session
> variable", which you should change to something else:
>
> origin:doc/src/sgml/ref/set_role.sgml:   SET ROLE does
> not process session variables as specified by
> origin:doc/src/sgml/ref/set_role.sgml-   the role's  linkend="sql-alterrole">ALTER ROLE settings;
> this only happens during
> origin:doc/src/sgml/ref/set_role.sgml-   login.
>
> Maybe "session variable" should be added to the glossary.
>
> The new tests crash if debug_discard_caches=on.
>
> 2021-12-20 16:15:44.476 CST postmaster[7478] LOG:  server process (PID
> 7657) was terminated by signal 6: Aborted
> 2021-12-20 16:15:44.476 CST postmaster[7478] DETAIL:  Failed process was
> running: DISCARD VARIABLES;
>

How do you inject this parameter to regress tests?

Regards

Pavel


> TRAP: FailedAssertion("sessionvars", File: "sessionvariable.c", Line: 270,
> PID: 7657)
>
> #2  0x564858a4f1a8 in ExceptionalCondition
> (conditionName=conditionName@entry=0x564858b8626d "sessionvars",
> errorType=errorType@entry=0x564858aa700b "FailedAssertion",
> fileName=fileName@entry=0x564858b86234 "sessionvariable.c",
> lineNumber=lineNumber@entry=270) at assert.c:69
> #3  0x56485874fec6 in sync_sessionvars_xact_callback (event= out>, arg=) at sessionvariable.c:270
> #4  sync_sessionvars_xact_callback (event=, arg= out>) at sessionvariable.c:253
> #5  0x56485868030a in CallXactCallbacks (event=XACT_EVENT_PRE_COMMIT)
> at xact.c:3644
> #6  CommitTransaction () at xact.c:2178
> #7  0x564858681975 in CommitTransactionCommand () at xact.c:3043
> #8  0x56485892b7a9 in finish_xact_command () at postgres.c:2722
> #9  0x56485892dc5b in finish_xact_command () at postgres.c:2720
> #10 exec_simple_query () at postgres.c:1240
> #11 0x56485892f70a in PostgresMain () at postgres.c:4498
> #12 0x56485889a479 in BackendRun (port=,
> port=) at postmaster.c:4594
> #13 BackendStartup (port=) at postmaster.c:4322
> #14 ServerLoop () at postmaster.c:1802
> #15 0x56485889b47c in PostmasterMain () at postmaster.c:1474
> #16 0x5648585c60c0 in main (argc=5, argv=0x564858e553f0) at main.c:198
>


Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2021-12-21 Thread Dilip Kumar
On Thu, Dec 16, 2021 at 9:26 PM Neha Sharma
 wrote:
>
> Hi,
>
> While testing the v8 patches in a hot-standby setup, it was observed the 
> master is crashing with the below error;
>
> 2021-12-16 19:32:47.757 +04 [101483] PANIC:  could not fsync file 
> "pg_tblspc/16385/PG_15_202112111/16386/16391": No such file or directory
> 2021-12-16 19:32:48.917 +04 [101482] LOG:  checkpointer process (PID 101483) 
> was terminated by signal 6: Aborted
>
> Parameters configured at master:
> wal_level = hot_standby
> max_wal_senders = 3
> hot_standby = on
> max_standby_streaming_delay= -1
> wal_consistency_checking='all'
> max_wal_size= 10GB
> checkpoint_timeout= 1d
> log_min_messages=debug1
>
> Test Case:
> create tablespace tab1 location 
> '/home/edb/PGsources/postgresql/inst/bin/test1';
> create tablespace tab location '/home/edb/PGsources/postgresql/inst/bin/test';
> create database test tablespace tab;
> \c test
> create table t( a int PRIMARY KEY,b text);
> CREATE OR REPLACE FUNCTION large_val() RETURNS TEXT LANGUAGE SQL AS 'select 
> array_agg(md5(g::text))::text from generate_series(1, 256) g';
> insert into t values (generate_series(1,10), large_val());
> alter table t set tablespace tab1 ;
> \c postgres
> create database test1 template test;
> \c test1
> alter table t set tablespace tab;
> \c postgres
> alter database test1 set tablespace tab1;
>
> --cancel the below command
> alter database test1 set tablespace pg_default; --press ctrl+c
> \c test1
> alter table t set tablespace tab1;
>
>
> Log file attached for reference.

Seems like this is an existing issue, and I am able to reproduce it on
PostgreSQL HEAD as well [1].

[1] 
https://www.postgresql.org/message-id/CAFiTN-szX%3DayO80EnSWonBu1YMZrpOr9V0R3BzHBSjMrMPAeMg%40mail.gmail.com

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Checkpointer crashes with "PANIC: could not fsync file "pg_tblspc/.."

2021-12-21 Thread Dilip Kumar
While testing the below case with the hot standby setup (with the
latest code), I have noticed that the checkpointer process crashed
with the $subject error. As per my observation, we have registered the
SYNC_REQUEST when inserting some tuple into the table, and later on
ALTER SET TABLESPACE we have registered the SYNC_UNLINK_REQUEST, which
looks fine so far, then I have noticed that only when the standby is
connected the underlying table file w.r.t the old tablespace is
already deleted.  Now, in AbsorbFsyncRequests we don't do anything for
the SYNC_REQUEST even though we have SYNC_UNLINK_REQUEST for the same
file, but since the underlying file is already deleted the
checkpointer crashed while processing the SYNC_REQUEST.

I have spent some time on this but could not figure out how the
relfilenode file w.r.t. the old tablespace is getting deleted
and if I disconnect the standby then it is not getting deleted, not
sure how walsender is playing a role in deleting the file even before
the checkpointer processes the unlink request.

postgres[8905]=# create tablespace tab location
'/home/dilipkumar/work/PG/install/bin/test';
CREATE TABLESPACE
postgres[8905]=# create tablespace tab1 location
'/home/dilipkumar/work/PG/install/bin/test1';
CREATE TABLESPACE
postgres[8905]=# create database test tablespace tab;
CREATE DATABASE
postgres[8905]=# \c test
You are now connected to database "test" as user "dilipkumar".
test[8912]=# create table t( a int PRIMARY KEY,b text);
CREATE TABLE
test[8912]=# insert into t values (generate_series(1,10), 'aaa');
INSERT 0 10
test[8912]=# alter table t set tablespace tab1 ;
ALTER TABLE
test[8912]=# CHECKPOINT ;
WARNING:  57P02: terminating connection because of crash of another
server process

log shows:
PANIC:  could not fsync file
"pg_tblspc/16384/PG_15_202112131/16386/16387": No such file or
directory

backtrace:
#0  0x7f2f865ff387 in raise () from /lib64/libc.so.6
#1  0x7f2f86600a78 in abort () from /lib64/libc.so.6
#2  0x00b13da3 in errfinish (filename=0xcf283f "sync.c", ..
#3  0x00978dc7 in ProcessSyncRequests () at sync.c:439
#4  0x005949d2 in CheckPointGuts (checkPointRedo=67653624,
flags=108) at xlog.c:9590
#5  0x005942fe in CreateCheckPoint (flags=108) at xlog.c:9318
#6  0x008a80b7 in CheckpointerMain () at checkpointer.c:444

Note: This smaller test case is derived from one of the bigger
scenarios raised by Neha Sharma [1]

[1]https://www.postgresql.org/message-id/CANiYTQs0E8TcB11eU0C4eNN0tUd%3DSQqsqEtL1AVZP1%3DEnD-49A%40mail.gmail.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: In-placre persistance change of a relation

2021-12-21 Thread Kyotaro Horiguchi
At Tue, 21 Dec 2021 17:13:21 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> Ugh! I completely forgot about TAP tests.. Thanks for the testing and
> sorry for the bugs.
> 
> This is a bit big change so I need a bit of time before posting the
> next version.

I took a bit of a long detour, but the patch now passes make-world for
me.

In this version:

- When relation persistence is changed from logged to unlogged, buffer
 persistence is flipped, then an init fork is created along with a mark
 file for the fork (RelationCreateInitFork). The mark file is removed
 at commit but is left alone after a crash before commit. At the next
 startup, ResetUnloggedRelationsInDbspaceDir() removes the init fork
 file if it finds the corresponding mark file.

- When relation persistence is changed from unlogged to logged, buffer
  persistence is flipped, then the existing init fork is marked to be
  dropped at commit (RelationDropInitFork). Finally the whole content
  is WAL-logged in a page-wise manner (RelationChangePersistence).

- The two operations above are repeatable within a transaction; commit
 makes the last operation persist and rollback makes all the operations
 abandoned (see the SQL sketch after this list).

- Storage files are created along with a "mark" file for the
 relfilenode. It behaves the same way as the above, except the mark
 file corresponds to the whole relfilenode.

- The at-commit operations this patch adds need to be WAL-logged, so
 they don't fit the pendingDeletes list, which is executed after commit. I
 added a new pending-work list, pendingCleanups, that is executed just
 after pendingSyncs.  (new in this version)
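
A minimal SQL illustration of the repeatable flips described above (the
table name is just an example):

    BEGIN;
    ALTER TABLE t SET UNLOGGED;
    ALTER TABLE t SET LOGGED;
    ALTER TABLE t SET UNLOGGED;
    COMMIT;    -- only the final state (unlogged) persists
    -- with ROLLBACK instead of COMMIT, all of the flips are abandoned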

 
regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From bc7e14b8af3c72e4ab99c964688d18ef4545f8b9 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 11 Nov 2020 21:51:11 +0900
Subject: [PATCH v10 1/2] In-place table persistence change

Even though ALTER TABLE SET LOGGED/UNLOGGED does not require data
rewriting, currently it runs heap rewrite which causes large amount of
file I/O.  This patch makes the command run without heap rewrite.
Addition to that, SET LOGGED while wal_level > minimal emits WAL using
XLOG_FPI instead of massive number of HEAP_INSERT's, which should be
smaller.

Also this allows for the cleanup of files left behind in the crash of
the transaction that created it.
---
 src/backend/access/rmgrdesc/smgrdesc.c |  52 +++
 src/backend/access/transam/README  |   8 +
 src/backend/access/transam/xact.c  |   7 +
 src/backend/access/transam/xlog.c  |  17 +
 src/backend/catalog/storage.c  | 545 -
 src/backend/commands/tablecmds.c   | 256 ++--
 src/backend/replication/basebackup.c   |   3 +-
 src/backend/storage/buffer/bufmgr.c|  88 
 src/backend/storage/file/fd.c  |   4 +-
 src/backend/storage/file/reinit.c  | 344 +++-
 src/backend/storage/smgr/md.c  |  93 -
 src/backend/storage/smgr/smgr.c|  32 ++
 src/backend/storage/sync/sync.c|  20 +-
 src/bin/pg_rewind/parsexlog.c  |  24 ++
 src/common/relpath.c   |  47 ++-
 src/include/catalog/storage.h  |   3 +
 src/include/catalog/storage_xlog.h |  42 +-
 src/include/common/relpath.h   |   9 +-
 src/include/storage/bufmgr.h   |   2 +
 src/include/storage/fd.h   |   1 +
 src/include/storage/md.h   |   8 +-
 src/include/storage/reinit.h   |  10 +-
 src/include/storage/smgr.h |  17 +
 23 files changed, 1450 insertions(+), 182 deletions(-)

diff --git a/src/backend/access/rmgrdesc/smgrdesc.c b/src/backend/access/rmgrdesc/smgrdesc.c
index 773d57..d251f22207 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -40,6 +40,49 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
 		 xlrec->blkno, xlrec->flags);
 		pfree(path);
 	}
+	else if (info == XLOG_SMGR_UNLINK)
+	{
+		xl_smgr_unlink *xlrec = (xl_smgr_unlink *) rec;
+		char	   *path = relpathperm(xlrec->rnode, xlrec->forkNum);
+
+		appendStringInfoString(buf, path);
+		pfree(path);
+	}
+	else if (info == XLOG_SMGR_MARK)
+	{
+		xl_smgr_mark *xlrec = (xl_smgr_mark *) rec;
+		char	   *path = GetRelationPath(xlrec->rnode.dbNode,
+		   xlrec->rnode.spcNode,
+		   xlrec->rnode.relNode,
+		   InvalidBackendId,
+		   xlrec->forkNum, xlrec->mark);
+		char	   *action;
+
+		switch (xlrec->action)
+		{
+			case XLOG_SMGR_MARK_CREATE:
+action = "CREATE";
+break;
+			case XLOG_SMGR_MARK_UNLINK:
+action = "DELETE";
+break;
+			default:
+action = "";
+break;
+		}
+
+		appendStringInfo(buf, "%s %s", action, path);
+		pfree(path);
+	}
+	else if (info == XLOG_SMGR_BUFPERSISTENCE)
+	{
+		xl_smgr_bufpersistence *xlrec = (xl_smgr_bufpersistence *) rec;
+		char	   *path = relpathperm(xlrec->rnode, MAIN_FORKNUM);
+
+		appendStringInfoString(buf, path);
+		appendStringInfo(buf, " persistence %d", 

Re: row filtering for logical replication

2021-12-21 Thread vignesh C
On Tue, Dec 21, 2021 at 2:29 PM Peter Smith  wrote:
>
> Here is the v51* patch set:
>
> Main changes from Euler's v50* are
> 1. Most of Euler's "fixes" patches are now merged back in
> 2. Patches are then merged per Amit's suggestion [Amit 20/12]
> 3. Some other review comments are addressed
>
> ~~
>
> Phase 1 - Merge the Euler fixes
> ===
>
> v51-0001 (main) <== v50-0001 (main 0001) + v50-0002 (fixes 0001)
> - OK, accepted and merged all the "fixes" back into the 0001
> - (fixed typos)
> - There is a slight disagreement with what the PG docs say about
> NULLs, but I will raise a separate comment on -hackers later
> (meanwhile, the current PG docs text is from the Euler patch)
>
> v51-0002 (validation) <== v50-0003 (validation 0002) + v50-0004 (fixes 0002)
> - OK, accepted and merges all these "fixes" back into the 0002
> - (fixed typo)
>
> v51-0003 (new/old) <== v50-0005 (new/old 0003)
> - REVERTED to the v49 version of this patch!
> - Please see Ajin's explanation why the v49 code was required [Ajin 21/12]
>
> v51-0004 (tab-complete + dump) <== v50-0006 (tab-complete + dump 0004)
> - No changes
>
> v51-0005 (for all tables) <== v50-0007 (for all tables 0005) +
> v50-0008 (fixes 0005)
> - OK, accepted and merged most of these "fixes" back into the 0005
> - (fixed typo/grammar)
> - Amit requested we not use Euler's new tablesync SQL just yet [Amit 21/12]
>
>
> Phase 2 - Merge main (Phase 1) patches per Amit suggestion
> 
>
> v51-0001 (main) <== v51-0001 (main) + v51-0002 (validation) + v51-0005
> (for all tables)
> - (also combined all the commit comments)
>
> v51-0002 (new/old) <== v51-0003 (new/old)
>
> v51-0005 (tab-complete + dump) <== v51-0004
>
>
> Review comments (details)
> =
>
> v51-0001 (main)
> - Addressed review comments from Amit. [Amit 20/12] #1,#2,#3,#4
>
> v51-0002 (new/old tuple)
> - Includes a patch from Greg (provided internally)
>
> v51-0003 (tab, dump)
> - No changes
>
> --
> [Amit 20/12] 
> https://www.postgresql.org/message-id/CAA4eK1JJgfEPKYvQAcwGa5jjuiUiQRQZ0Pgo-HF0KFHh-jyNQQ%40mail.gmail.com
> [Ajin 21/12] 
> https://www.postgresql.org/message-id/CAFPTHDbfpPNh3GLGjySS%3DAuRfbQPQFNvfiyG1GDQW2kz1yT7Og%40mail.gmail.com
> [Amit 21/12] 
> https://www.postgresql.org/message-id/CAA4eK1KwoA5k8v9z9e4ZPN_X%3D1GAmQmsWyauFwZpKiSHqy6eZA%40mail.gmail.com

A few comments:
1) list_free(schemarelids) is called inside the if and outside the if in
the break case. Can we move it above continue so that it does not get
called in the break case:
+   schemarelids =
GetAllSchemaPublicationRelations(pub->oid,
+
 pub->pubviaroot ?
+
 PUBLICATION_PART_ROOT
:
+

PUBLICATION_PART_LEAF);
+   if (list_member_oid(schemarelids, entry->relid))
+   {
+   if (pub->pubactions.pubinsert)
+   no_filter[idx_ins] = true;
+   if (pub->pubactions.pubupdate)
+   no_filter[idx_upd] = true;
+   if (pub->pubactions.pubdelete)
+   no_filter[idx_del] = true;
+
+   list_free(schemarelids);
+
+   /* Quick exit loop if all pubactions
have no row-filter. */
+   if (no_filter[idx_ins] &&
no_filter[idx_upd] && no_filter[idx_del])
+   break;
+
+   continue;
+   }
+   list_free(schemarelids);

2) create_edata_for_relation is also using similar logic; can it also
call this function to reduce duplicate code?
+static EState *
+create_estate_for_relation(Relation rel)
+{
+   EState  *estate;
+   RangeTblEntry   *rte;
+
+   estate = CreateExecutorState();
+
+   rte = makeNode(RangeTblEntry);
+   rte->rtekind = RTE_RELATION;
+   rte->relid = RelationGetRelid(rel);
+   rte->relkind = rel->rd_rel->relkind;
+   rte->rellockmode = AccessShareLock;
+   ExecInitRangeTable(estate, list_make1(rte));
+
+   estate->es_output_cid = GetCurrentCommandId(false);
+
+   return estate;
+}

3) In one place "select" is in lower case; it can be changed to upper case:
+   resetStringInfo();
+   appendStringInfo(,
+"SELECT DISTINCT
pg_get_expr(prqual, prrelid) "
+"  FROM pg_publication p "
+"  INNER JOIN
pg_publication_rel pr ON (p.oid = pr.prpubid) "
+"WHERE pr.prrelid
= %u AND p.pubname IN ( %s ) "
+"AND NOT (select

Re: Failed transaction statistics to measure the logical replication progress

2021-12-21 Thread Greg Nancarrow
On Mon, Dec 20, 2021 at 8:40 PM osumi.takami...@fujitsu.com
 wrote:
>
> Updated the patch to address your concern.
>

Some review comments on the v18 patches:

v18-0002

doc/src/sgml/monitoring.sgml
(1) tablesync worker stats?

Shouldn't the comment below only mention the apply worker? (since
we're no longer recording stats of the tablesync worker)

+   Number of transactions that failed to be applied by the table
+   sync worker or main apply worker in this subscription. This
+   counter is updated after confirming the error is not same as
+   the previous one.
+   

Also, it should say "... the error is not the same as the previous one."


src/backend/catalog/system_views.sql
(2) pgstat_report_subworker_xact_end()

Fix typo and some wording:

BEFORE:
+ *  This should be called before the call of process_syning_tables() not to
AFTER:
+ *  This should be called before the call of
process_syncing_tables(), so to not


src/backend/postmaster/pgstat.c
(3) pgstat_send_subworker_xact_stats()

BEFORE:
+ * Send a subworker transaction stats to the collector.
AFTER:
+ * Send a subworker's transaction stats to the collector.

(4)
Wouldn't it be best for:

+   if (!TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL))

to be:

+   if (last_report != 0 && !TimestampDifferenceExceeds(last_report,
now, PGSTAT_STAT_INTERVAL))

?

(5) pgstat_send_subworker_xact_stats()

I think that the comment:

+   * Clear out the statistics buffer, so it can be re-used.

should instead say:

+   * Clear out the supplied statistics.

because the current comment implies that repWorker points at the
MyLogicalRepWorker buffer (which it might, but it shouldn't be
relying on that).
Also, I think that the function header should mention something like:
"The supplied repWorker statistics are cleared upon return, to assist
re-use by the caller."


Regards,
Greg Nancarrow
Fujitsu Australia




Re: simplifying foreign key/RI checks

2021-12-21 Thread Zhihong Yu
On Mon, Dec 20, 2021 at 5:17 AM Amit Langote 
wrote:

> On Mon, Dec 20, 2021 at 6:19 PM Zhihong Yu  wrote:
> > On Sun, Dec 19, 2021 at 10:20 PM Amit Langote 
> wrote:
> >>
> >> On Mon, Dec 20, 2021 at 2:00 PM Corey Huinker 
> wrote:
> >> > Sorry for the delay. This patch no longer applies, it has some
> conflict with d6f96ed94e73052f99a2e545ed17a8b2fdc1fb8a
> >>
> >> Thanks Corey for the heads up.  Rebased with some cosmetic adjustments.
> >>
> > Hi,
> >
> > +   Assert(partidx < 0 || partidx < partdesc->nparts);
> > +   partoid = partdesc->oids[partidx];
> >
> > If partidx < 0, do we still need to fill out partoid and is_leaf ? It
> seems we can return early based on (should call table_close(rel) first):
> >
> > +   /* No partition found. */
> > +   if (partidx < 0)
> > +   return NULL;
>
> Good catch, thanks.  Patch updated.
>
> Hi,

+   int lockflags = 0;
+   TM_Result   test;
+
+   lockflags = TUPLE_LOCK_FLAG_LOCK_UPDATE_IN_PROGRESS;

The above assignment can be merged with the line where the variable
lockflags is declared.

+   GetUserIdAndSecContext(_userid, _sec_context);

save_userid -> saved_userid
save_sec_context -> saved_sec_context

+* the transaction-snapshot mode.  If we didn't push one already, do

didn't push -> haven't pushed

For ri_PerformCheck():

+   boolsource_is_pk = true;

It seems the value of source_is_pk doesn't change - the value true can be
plugged into ri_ExtractValues() calls directly.

Cheers


Re: In-placre persistance change of a relation

2021-12-21 Thread Kyotaro Horiguchi
Ugh! I completely forgot about TAP tests.. Thanks for the testing and
sorry for the bugs.

This is a bit big change so I need a bit of time before posting the
next version.


At Mon, 20 Dec 2021 13:38:35 +, Jakub Wartak  
wrote in 
> 1) check-worlds seems OK but make -C src/test/recovery check shows a couple 
> of failing tests here locally and in 
> https://cirrus-ci.com/task/4699985735319552?logs=test#L807 :
> t/009_twophase.pl  (Wstat: 256 Tests: 24 Failed: 1)
>   Failed test:  21
>   Non-zero exit status: 1

PREPARE TRANSACTION requires uncommitted file creation to be
committed. Concretely, we need to remove the "mark" files for
relation files created in the transaction during PREPARE TRANSACTION.

pendingSync is not a parallel mechanism with pendingDeletes so we
cannot move mark deletion to pendingSync.

In the end I decided to add a separate list, pendingCleanups, for pending
non-deletion tasks, separate from pendingDeletes, and to execute it
before inserting the commit record.  Not only the above but also all of
the following failures vanished with the change.

> t/014_unlogged_reinit.pl   (Wstat: 512 Tests: 12 Failed: 2)
>   Failed tests:  9-10
>   Non-zero exit status: 2
> t/018_wal_optimize.pl  (Wstat: 7424 Tests: 0 Failed: 0)
>   Non-zero exit status: 29
>   Parse errors: Bad plan.  You planned 38 tests but ran 0.
> t/022_crash_temp_files.pl  (Wstat: 7424 Tests: 6 Failed: 0)
>   Non-zero exit status: 29
>   Parse errors: Bad plan.  You planned 9 tests but ran 6.


> 018 made no sense, I've tried to take a quick look with wal_level=minimal why 
> it is failing , it is mystery to me as the sequence seems to be pretty basic 
> but the outcome is not:

I think this shares the same cause.

> ~> cat repro.sql
> create tablespace tbs1 location '/tbs1';
> CREATE TABLE moved (id int);
> INSERT INTO moved VALUES (1);
> BEGIN;
> ALTER TABLE moved SET TABLESPACE tbs1;
> CREATE TABLE originated (id int);
> INSERT INTO originated VALUES (1);
> CREATE UNIQUE INDEX ON originated(id) TABLESPACE tbs1;
> COMMIT;
..
> ERROR:  could not open file "base/32833/32839": No such file or directory


> z3=# \dt+
>   List of relations
>  Schema |Name| Type  |  Owner   | Persistence |  Size   | Description
> ++---+--+-+-+-
>  public | moved  | table | postgres | permanent   | 0 bytes |
>  public | originated | table | postgres | permanent   | 0 bytes |
> 
> This happens even without placing on tablespace at all {for originated table 
> , but no for moved on}, some major mishap is there (commit should guarantee 
> correctness) or I'm tired and having sloppy fingers.
> 
> 2) minor one testcase, still something is odd.
> 
> drop tablespace tbs1;
> create tablespace tbs1 location '/tbs1';
> CREATE UNLOGGED TABLE t4 (a int) tablespace tbs1;
> CREATE UNLOGGED TABLE t5 (a int) tablespace tbs1;
> CREATE UNLOGGED TABLE t6 (a int) tablespace tbs1;
> CREATE TABLE t7 (a int) tablespace tbs1;
> insert into t7 values (1);
> insert into t5 values (1);
> insert into t6 values (1);
> \dt+
>  List of relations
>  Schema | Name | Type  |  Owner   | Persistence |Size| Description
> +--+---+--+-++-
>  public | t4   | table | postgres | unlogged| 0 bytes|
>  public | t5   | table | postgres | unlogged| 8192 bytes |
>  public | t6   | table | postgres | unlogged| 8192 bytes |
>  public | t7   | table | postgres | permanent   | 8192 bytes |
> (4 rows)
> 
> ALTER TABLE ALL IN TABLESPACE tbs1 set logged; 
> ==> STILL WARNING:  unrecognized node type: 349
> \dt+
>  List of relations
>  Schema | Name | Type  |  Owner   | Persistence |Size| Description
> +--+---+--+-++-
>  public | t4   | table | postgres | permanent   | 0 bytes|
>  public | t5   | table | postgres | permanent   | 8192 bytes |
>  public | t6   | table | postgres | permanent   | 8192 bytes |
>  public | t7   | table | postgres | permanent   | 8192 bytes |
> 
> So it did rewrite however this warning seems to be unfixed. I've tested on 
> e2c52beecdea152ca680a22ef35c6a7da55aa30f.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center