Re: convert elog(LOG) calls to ereport

2020-12-02 Thread Noah Misch
On Wed, Dec 02, 2020 at 02:26:24PM +0100, Peter Eisentraut wrote:
> There are a number of elog(LOG) calls that appear to be user-facing, so they
> should be ereport()s.

> @@ -8591,25 +8604,46 @@ LogCheckpointEnd(bool restartpoint)
>   CheckpointStats.ckpt_sync_rels;
>   average_msecs = (long) ((average_sync_time + 999) / 1000);
>  
> - elog(LOG, "%s complete: wrote %d buffers (%.1f%%); "
> -  "%d WAL file(s) added, %d removed, %d recycled; "
> -  "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; "
> -  "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; "
> -  "distance=%d kB, estimate=%d kB",

+1 to this change.

> @@ -1763,7 +1771,8 @@ pq_setkeepalivesidle(int idle, Port *port)
>  #else
>   if (idle != 0)
>   {
> - elog(LOG, "setting the keepalive idle time is not supported");
> + ereport(LOG,
> + (errmsg("setting the keepalive idle time is not 
> supported")));

+1

> --- a/src/backend/optimizer/geqo/geqo_erx.c
> +++ b/src/backend/optimizer/geqo/geqo_erx.c
> @@ -420,7 +420,8 @@ edge_failure(PlannerInfo *root, Gene *gene, int index, Edge *edge_table, int num
>   }
>   }
>  
> - elog(LOG, "no edge found via random decision and total_edges == 
> 4");
> + ereport(LOG,
> + (errmsg("no edge found via random decision and 
> total_edges == 4")));

The user can't act upon this without reading the source code.  This and the
other messages proposed in this file are better as elog().

> @@ -343,7 +346,8 @@ PGSharedMemoryCreate(Size size,
>* care.
>*/
>   if (!CloseHandle(hmap))
> - elog(LOG, "could not close handle to shared memory: error code 
> %lu", GetLastError());
> + ereport(LOG,
> + (errmsg("could not close handle to shared 
> memory: error code %lu", GetLastError(;

The numerous messages proposed in src/backend/port/ files are can't-happen
events, and there's little a DBA can do without reading the source code.
They're better as elog().

> @@ -6108,8 +6111,9 @@ backend_read_statsfile(void)
>   /* Copy because timestamptz_to_str returns a static buffer */
>   filetime = pstrdup(timestamptz_to_str(file_ts));
>   mytime = pstrdup(timestamptz_to_str(cur_ts));
> - elog(LOG, "stats collector's time %s is later than backend local time %s",
> -  filetime, mytime);
> + ereport(LOG,
> + (errmsg("statistics collector's time %s is later than backend local time %s",
> + filetime, mytime)));

+1

> @@ -769,10 +769,11 @@ StartupReplicationOrigin(void)
>   replication_states[last_state].remote_lsn = disk_state.remote_lsn;
>   last_state++;
>  
> - elog(LOG, "recovered replication state of node %u to %X/%X",
> -  disk_state.roident,
> -  (uint32) (disk_state.remote_lsn >> 32),
> -  (uint32) disk_state.remote_lsn);
> + ereport(LOG,
> + (errmsg("recovered replication state of node %u to %X/%X",
> + disk_state.roident,
> + (uint32) (disk_state.remote_lsn >> 32),
> + (uint32) disk_state.remote_lsn)));

+1

> @@ -1914,7 +1914,8 @@ FileClose(File file)
>  
>   /* in any case do the unlink */
>   if (unlink(vfdP->fileName))
> - elog(LOG, "could not unlink file \"%s\": %m", 
> vfdP->fileName);
> + ereport(LOG,
> + (errmsg("could not unlink file \"%s\": 
> %m", vfdP->fileName)));

+1

>  
>   /* and last report the stat results */
>   if (stat_errno == 0)
> @@ -1922,7 +1923,8 @@ FileClose(File file)
>   else
>   {
>   errno = stat_errno;
> - elog(LOG, "could not stat file \"%s\": %m", 
> vfdP->fileName);
> + ereport(LOG,
> + (errmsg("could not stat file \"%s\": 
> %m", vfdP->fileName)));

+1


For the changes I didn't mention explicitly (most of them), I'm -0.5.  Many of
them "can't happen", use source code terms of art, and/or breach guideline
"Avoid mentioning called function names, either; instead say what the code was
trying to do" (https://www.postgresql.org/docs/devel/error-style-guide.html).




Re: Two fsync related performance issues?

2020-12-02 Thread Craig Ringer
On Wed, 2 Dec 2020 at 15:41, Michael Paquier  wrote:

> On Tue, Dec 01, 2020 at 07:39:30PM +0800, Craig Ringer wrote:
> > On Wed, 14 Oct 2020, 13:06 Michael Paquier,  wrote:
> >> Yeah, it is safer to assume that it is the responsability of the
> >> backup tool to ensure that because it could be possible that a host is
> >> unplugged just after taking a backup, and having Postgres do this work
> >> at the beginning of archive recovery would not help in most cases.
> >
> > Let's document that assumption in the docs for pg_basebackup and the file
> > system copy based replica creation docs. With a reference to initdb's
> > datadir sync option.
>
> Do you have any suggestion about that, in the shape of a patch perhaps?
>

I'll try to get to that, but I have some other docs patches outstanding
that I'd like to resolve first.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise


Re: [PATCH] Runtime control of CLOBBER_CACHE_ALWAYS

2020-12-02 Thread Craig Ringer
On Tue, 27 Oct 2020 at 16:34, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2020-09-25 09:40, Craig Ringer wrote:
> > While working on extensions I've often wanted to enable cache clobbering
> > for a targeted piece of code, without paying the price of running it for
> > all backends during postgres startup and any initial setup tasks that
> > are required.
> >
> > So here's a patch that, when CLOBBER_CACHE_ALWAYS or
> > CLOBBER_CACHE_RECURSIVE are defined, adds a GUC named
> > clobber_cache_depth . It defaults to 1 for CLOBBER_CACHE_ALWAYS or 3 for
> > CLOBBER_CACHE_RECURSIVE to match the existing compiled-in behaviour. But
> > with this change it's now possible to run Pg with clobber_cache_depth=0
> > then set it to 1 only for targeted tests.
>
> I think it would be great if the cache clobber code is always compiled
> in (but turned off) when building with assertions.  The would reduce the
> number of hoops to jump through to actually use this locally and reduce
> the number of surprises from the build farm.
>

I implemented something a bit like that for a patched postgres where it
became an additional configure option. It's easy enough to enable it
automatically for USING_CASSERT instead, and I think that's sensible, so
I've adapted the patch to do so.

As initially written the patch defined the clobber control GUC only if
cache clobber control is compiled in. But we don't do that for anything
else, so I've changed it to always define the GUC, which will be ignored
without effect if no cache clobber control is compiled in. I also renamed
the GUC to debug_clobber_cache_depth to match other debug GUCs.
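
As a rough sketch of the resulting behaviour (symbol names as used in this
thread; the actual patch wiring may differ), the compile-time #ifdef now only
gates whether the runtime check exists at all:

    #ifdef CLOBBER_CACHE_ENABLED
        /* Sketch only: no clobbering unless the GUC is raised above zero. */
        if (debug_clobber_cache_depth > 0)
            InvalidateSystemCaches();   /* recursive clobbering for depth > 1
                                         * is handled separately */
    #endif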

For backward compatibility, you can arrange it so that the built-in
> default of clobber_cache_depth is 1 or 3 if CLOBBER_CACHE_ALWAYS or
> CLOBBER_CACHE_RECURSIVE are defined.
>

It already does that - see diff hunk for src/include/utils/inval.h in prior
patch.

I've just changed it to default to 0 if neither are defined and moved it to
pg_config_manual.h .

I noticed that I missed handling of RECOVER_RELATION_BUILD_MEMORY in the
earlier patch - the relcache was eagerly freeing relation descriptor memory
without regard for the current clobber_cache_depth value. I've changed that
in the updated patch so that RelationBuildDesc only frees memory eagerly if
clobber_cache_depth is actually > 0 or if RECOVER_RELATION_BUILD_MEMORY has
been explicitly defined to 1. It also preserves the original behaviour of
not eagerly freeing memory even when cache clobber is enabled if
RECOVER_RELATION_BUILD_MEMORY is explicitly defined to 0.

Both CLOBBER_CACHE_ENABLED and RECOVER_RELATION_BUILD_MEMORY are now shown
in pg_config_manual.h .

A small entry has been added to the docs too, under developer options.

The changes add a little more noise than I'd prefer, but I didn't want to
drop support for the compile-time constants or introduce surprising
behaviour.

To try it out, apply the patch (git am), build with --enable-cassert, then
compare:

   make -C src/test/regress check

and

   PGOPTIONS="-c debug_clobber_cache_depth=1" \
   make -C src/test/regress check

The speed difference will be obvious if nothing else!

It's much more practical using make check-tests and a specific TESTS= list.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise
From fed26086ca2d5840ac288846b7b81b0d74af6949 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Tue, 22 Sep 2020 09:51:00 +0800
Subject: [PATCH v2] Replace CLOBBER_CACHE_ALWAYS with
 debug_clobber_cache_depth GUC

Forced cache invalidation (CLOBBER_CACHE_ALWAYS) has been impractical to use
for testing in PostgreSQL because it's so slow and because it's toggled on/off
only at build time. It is helpful when hunting bugs in any code that uses the
syscache/relcache because it causes cache invalidations to be injected whenever
it would be possible for an invalidation to occur, whether or not one was really
pending.

Address this by providing runtime control over cache clobber behaviour using
the new debug_clobber_cache_depth GUC. Support is not compiled in at all
unless assertions are enabled or CLOBBER_CACHE_ENABLED is explicitly defined at
compile time. It defaults to 0 if compiled in, so it has negligible effect
on assert build performance by default.

When support is compiled in, test code can now set debug_clobber_cache_depth=1
locally to a backend to test specific queries, functions, extensions, etc. Or
tests can toggle it globally for a specific test case while retaining normal
performance during test setup and teardown.

For backwards compatibility with existing test harnesses and scripts,
clobber_cache_depth defaults to 1 if CLOBBER_CACHE_ALWAYS is defined, and to 3
if CLOBBER_CACHE_RECURSIVE is defined.

CLOBBER_CACHE_ENABLED is now visible in pg_config_manual.h, as is the related
RECOVER_RELATION_BUILD_MEMORY setting for the relcache.
---
 doc/src/sgml/config.sgml| 34 +

Re: Huge memory consumption on partitioned table with FKs

2020-12-02 Thread Amit Langote
On Thu, Dec 3, 2020 at 2:29 PM Kyotaro Horiguchi
 wrote:
> At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote  
> wrote in
> > On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
> >  wrote:
> > > I don't think you're missing something. As I (tried to..) mentioned in
> > > another branch of this thread, you're right. On the other hand
> > > fk_attnums is actually used to generate the query, thus *issue*
> > > potentially affects the query.  Although that might be paranoic, that
> > > possibility is eliminated by checking the congruency(?) of fk_attnums
> > > (or other members).  We might even be able to share a riinfo among
> > > such children.
> >
> > Just to be sure I've not misunderstood you, let me mention why I think
> > the queries used by different partitions all turn out to be the same
> > despite attribute number differences among partitions.  The places
> > that uses fk_attnums when generating a query are these among others:
> >
> > Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
> > ...
> > Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
> > ...
> > quoteOneName(attname,
> >  RIAttName(fk_rel, riinfo->fk_attnums[i]));
> >
> > For the queries on the referencing side ("check" side),
> > type/collation/attribute name determined using the above are going to
> > be the same for all partitions in a given tree irrespective of the
> > attribute number, because they're logically the same column.  On the
>
> Yes, I know that, which is what I meant by "practically" or
> "actually", but it is not explicitly defined AFAICS.

Well, I think it's great that we don't have to worry *in this part of
the code* about partition's fk_attnums not being congruent with the
root parent's, because ensuring that is the responsibility of the
other parts of the system such as DDL.  If we have any problems in
this area, they should be dealt with by ensuring that there are no
bugs in those other parts.

> Thus that would no longer be an issue if we explicitly define that
> "When conparentid stores a valid value, each element of fk_attnums
> points to logically the same attribute as in the RI_ConstraintInfo for
> the parent constraint."  Or I'd be happy if we have such a comment
> there instead.

I saw a comment in Kuroda-san's v2 patch that is perhaps meant to
address this point, but the placement needs to be reconsidered:

@@ -366,6 +368,14 @@ RI_FKey_check(TriggerData *trigdata)
querysep = "WHERE";
for (int i = 0; i < riinfo->nkeys; i++)
{
+
+   /*
+   * We share the same plan among all relations in a partition
+   * hierarchy.  The plan is guaranteed to be compatible since all of
+   * the member relations are guaranteed to have the equivalent set
+   * of foreign keys in fk_attnums[].
+   */
+
Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);

A more appropriate place for this kind of comment would be where
fk_attnums is defined or in ri_BuildQueryKey() that is shared by
different RI query issuing functions.

> > referenced side ("restrict", "cascade", "set" side), as you already
> > mentioned, fk_attnums refers to the top parent table of the
> > referencing side, so no possibility of they being different in the
> > various referenced partitions' RI_ConstraintInfos.
>
> Right. (I'm not sure I have mentioned that here, though :p)

Maybe I misread but I think you did in your email dated Dec 1 where you said:

"After an off-list discussion, we confirmed that even in that case the
patch works as is because fk_attnum (or contuple.conkey) always stores
key attnums compatible to the topmost parent when conparent has a
valid value (assuming the current usage of fk_attnum), but I still
feel uneasy to rely on that unclear behavior."

> > On the topic of how we'd be able to share even the RI_ConstraintInfos
> > among partitions, that would indeed look a bit more elaborate than the
> > patch we have right now.
>
> Maybe just letting the hash entry for the child riinfo point to the
> parent riinfo if all members (other than constraint_id, of course)
> share exactly the same values.  No need to count references since
> we aren't going to remove riinfos.

Ah, something maybe worth trying.  Although the memory we'd save by
sharing the RI_ConstraintInfos would not add that much to the savings
we're having by sharing the plan, because it's the plans that are a
memory hog AFAIK.

> > > About your patch, it calculates the root constrid at the time an
> > > riinfo is created, but when the root-partition is further attached to
> > > another partitioned-table after the riinfo creation,
> > > constraint_root_id gets stale.  Of course that doesn't matter
> > > practically, though.
> >
> > Maybe we could also store the hash value of the root constraint OID as
> > rootHashValue and check for that one too in
> > InvalidateConstraintCacheCallBack().

Re: [HACKERS] logical decoding of two-phase transactions

2020-12-02 Thread Peter Smith
On Thu, Dec 3, 2020 at 5:34 PM Masahiko Sawada  wrote:
> While looking at the patch set I found that the tests in
> src/test/subscription don't work with this patch. I got the following
> error:
>
> 2020-12-03 15:18:12.666 JST [44771] tap_sub ERROR:  unrecognized
> pgoutput option: two_phase
> 2020-12-03 15:18:12.666 JST [44771] tap_sub CONTEXT:  slot "tap_sub",
> output plugin "pgoutput", in the startup callback
> 2020-12-03 15:18:12.666 JST [44771] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
> two_phase 'on', publication_names '"tap_pub","tap_pub_ins_only"')
>
> In v29-0009 patch "two_phase" option is added on the subscription side
> (i.e., libpqwalreceiver) but it seems not on the publisher side
> (pgoutput).
>

The v29-0009 patch is still a WIP for a new SUBSCRIPTION "two_phase"
option so it is not yet fully implemented. I did run the following prior
to upload but somehow did not see those failures yesterday:
cd src/test/subscription
make check

Anyway, as 0009 is the last of the set, please just git apply
--reverse that one if it is causing a problem.

Sorry for any inconvenience. I will add the missing functionality to
0009 as soon as I can.
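
For what it's worth, the publisher-side gap is in pgoutput's startup-parameter
parsing; a rough sketch of the kind of branch that would have to be added there
(the variable and struct field names below are assumptions, not code from the
0009 patch):

    else if (strcmp(defel->defname, "two_phase") == 0)
    {
        if (two_phase_option_given)
            ereport(ERROR,
                    (errcode(ERRCODE_SYNTAX_ERROR),
                     errmsg("conflicting or redundant options")));
        two_phase_option_given = true;

        data->two_phase = defGetBoolean(defel);
    }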

Kind Regards,
Peter Smith.
Fujitsu Australia




RE: [Patch] Optimize dropping of relation buffers using dlist

2020-12-02 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク 
> Apologies for the delay, but attached are the updated versions to simplify the
> patches.

Looks good to me.  Thanks to Horiguchi-san and Andres-san, the code became
more compact and easier to read.  I've marked this ready for committer.


To the committer:
I don't think it's necessary to refer to COMMIT/ROLLBACK PREPARED in the 
following part of the 0003 commit message.  They surely call 
DropRelFileNodesAllBuffers(), but COMMIT/ROLLBACK also call it.

the full scan threshold. This improves the DropRelationFiles()
performance when the TRUNCATE command truncated off any of the empty
pages at the end of relation, and when dropping relation buffers if a
commit/rollback transaction has been prepared in FinishPreparedTransaction().


Regards
Takayuki Tsunakawa





Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-12-02 Thread Michael Paquier
On Wed, Dec 02, 2020 at 10:30:08PM -0600, Justin Pryzby wrote:
> Good idea.  I think you mean like this.

Yes, something like that.  Thanks.

> +typedef struct ReindexParams {
> + bool concurrently;
> + bool verbose;
> + bool missingok;
> +
> + int options;/* bitmask of lowlevel REINDEXOPT_* */
> +} ReindexParams;
> +

By moving everything into indexcmds.c, keeping ReindexParams within it
makes sense to me.  Now, there is no need for the three booleans
because options stores the same information, no?

>  struct ReindexIndexCallbackState
>  {
> - int options;/* options from statement */
> + bool concurrently;
>   Oid locked_table_oid;   /* tracks previously locked table */
>  };

Here also, I think that we should just pass down the full
ReindexParams set.
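
For illustration, a minimal standalone sketch of the simplification being
suggested, with the booleans folded into the options bitmask (the flag and type
definitions below are stand-ins for the real backend headers, not the committed
code):

    /* Stand-ins for the definitions in the backend headers. */
    #define REINDEXOPT_VERBOSE      (1 << 0)
    #define REINDEXOPT_MISSING_OK   (1 << 1)
    #define REINDEXOPT_CONCURRENTLY (1 << 2)

    typedef struct ReindexParams
    {
        int     options;        /* bitmask of REINDEXOPT_* flags */
    } ReindexParams;

    /* Callers test bits instead of separate boolean fields, e.g.: */
    static inline bool
    reindex_is_concurrent(const ReindexParams *params)
    {
        return (params->options & REINDEXOPT_CONCURRENTLY) != 0;
    }
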
--
Michael


signature.asc
Description: PGP signature


Re: convert elog(LOG) calls to ereport

2020-12-02 Thread Michael Paquier
On Wed, Dec 02, 2020 at 11:04:45AM -0300, Alvaro Herrera wrote:
> Please take the opportunity to move the flag name out of the message in
> this one, also.  I do wonder if it'd be a good idea to move the syscall
> name itself out of the message, too; that would reduce the number of
> messages to translate 50x to just "%s(%s) failed: %m" instead of one
> message per distinct syscall.

+1.
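
A sketch of what that would look like at a call site (the syscall and option
names here are only examples):

    ereport(LOG,
    /* translator: first %s is a system call name, second is a socket option */
            (errmsg("%s(%s) failed: %m", "setsockopt", "TCP_KEEPIDLE")));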

+   else
+   ereport(LOG,
+   (errmsg("checkpoint starting:%s%s%s%s%s%s%s%s",

Would it be better to add a note for translators here, saying in short that
all those %s are options related to checkpoints/restartpoints?

The ones in geqo_erx.c look like debugging information, and the ones
in win32_shmem.c for segment creation are code paths only used by the
postmaster.
--
Michael


signature.asc
Description: PGP signature


Re: Commitfest 2020-11 is closed

2020-12-02 Thread Thomas Munro
On Thu, Dec 3, 2020 at 12:02 PM Josef Šimánek  wrote:
> On Wed, 2 Dec 2020 at 23:36, Andrew Dunstan  wrote:
> > On 12/2/20 5:13 PM, Thomas Munro wrote:
> > > I'm experimenting with Github's built in CI.  All other ideas welcome.
> >
> > I'd look very closely at gitlab.
>
> I was about to respond earlier with the same idea. Feel free to ping me
> regarding another CI. Also there is the possibility of moving to GitHub
> Actions (a free CI for open source). I have some experience with that as
> well.

I spent today getting something working on Github just to try it out,
and started a new thread[1] about that.  I've not tried Gitlab and
have no opinion about that; if someone has a working patch similar to
that, I'd definitely be interested to take a look.  I've looked at
some others.  For what cfbot is doing (namely: concentrating the work
of hundreds into one "account" via hundreds of branches), the spin-up
times and concurrency limits are a bit of a problem on many of them.
FWIW I think they're all wonderful for offering this service to open
source projects and I'm grateful that they do it!

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKG%2By_SHVQcU3CPokmJxuHp1niebCjq4XzZizf8SR9ZdQRQ%40mail.gmail.com




Re: Printing LSN made easy

2020-12-02 Thread Ashutosh Bapat
On Mon, Nov 30, 2020 at 8:07 PM Alvaro Herrera 
wrote:

> On 2020-Nov-30, Ashutosh Bapat wrote:
>
> > Peter Eisentraut explained that the translation system can not handle
> > strings broken by macros, e.g. errmsg("foo" MACRO "bar"), since it doesn't
> > know statically what the macro resolves to. But I am not familiar with our
> > translation system myself. But if we allow INT64_FORMAT, LSN_FORMAT should
> > be allowed. That makes life much easier. We do not need other functions at all.
> >
> > Peter E or somebody familiar with translations can provide more
> > information.
>
> We don't allow INT64_FORMAT in translatable strings, period.  (See
> commit 6a1cd8b9236d which undid some hacks we had to use to work around
> the lack of it, by allowing %llu to take its place.)  For the same
> reason, we cannot allow LSN_FORMAT or similar constructions in
> translatable strings either.


> If the solution to ugliness of LSN printing is going to require that we
> add a "%s" which prints a previously formatted string with LSN_FORMAT,
> it won't win us anything.
>

Thanks for the commit. The commit reverted code which uses a char array to
print int64. On the same lines, using a character array to print an LSN
won't be acceptable. I agree.

Maybe we should update the section "Message-Writing Guidelines" in
doc/src/sgml/nls.sgml. I think it's implied by "Do not construct sentences
at run-time, like ... ", but using a macro is not really constructing
sentences at run time. Hopefully, some day, we will fix the translation system
to handle strings with macros and make it easy to print PG-specific
objects in strings.


>
> As annoyed as I am about our LSN printing, I don't think this patch
> has the solutions we need.
>

I think we could at least accept the macros so that they can be used in
non-translatable strings, i.e. with sprintf, the appendStringInfo
variants, etc. The macros will also serve to document the string format of
LSN. Developers can then copy from the macros rather than copying from
"some other" (erroneous) usage in the code.

--
Best Wishes,
Ashutosh


Re: [HACKERS] logical decoding of two-phase transactions

2020-12-02 Thread Masahiko Sawada
On Wed, Dec 2, 2020 at 8:24 PM Peter Smith  wrote:
>
> I have rebased the v28 patch set (made necessary due to recent commit [1])
> [1] 
> https://github.com/postgres/postgres/commit/0926e96c493443644ba8e96b5d96d013a9ffaf64
>
> And at the same time I have added patch 0009 to this set - This is for
> the new SUBSCRIPTION option "two_phase" (0009 is still WIP but
> stable).
>
> PSA new patch set with version bumped to v29.

Thank you for updating the patch!

While looking at the patch set I found that the tests in
src/test/subscription don't work with this patch. I got the following
error:

2020-12-03 15:18:12.666 JST [44771] tap_sub ERROR:  unrecognized
pgoutput option: two_phase
2020-12-03 15:18:12.666 JST [44771] tap_sub CONTEXT:  slot "tap_sub",
output plugin "pgoutput", in the startup callback
2020-12-03 15:18:12.666 JST [44771] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
two_phase 'on', publication_names '"tap_pub","tap_pub_ins_only"')

In v29-0009 patch "two_phase" option is added on the subscription side
(i.e., libpqwalreceiver) but it seems not on the publisher side
(pgoutput).

Regards,

-- 
Masahiko Sawada
EnterpriseDB:  https://www.enterprisedb.com/




Github Actions (CI)

2020-12-02 Thread Thomas Munro
Hi hackers,

I'm looking for more horsepower for testing commitfest entries
automatically, and today I tried out $SUBJECT.  The attached is a
rudimentary first attempt, for show-and-tell.  If you have a Github
account, you just have to push it to a branch there and look at the
Actions tab on the web page for the results.  Does anyone else have
.github files and want to share, to see if we can combine efforts
here?

The reason for creating three separate "workflows" for Linux, Windows
and macOS rather than three separate "jobs" inside one workflow is so
that cfbot.cputube.org could potentially get separate pass/fail
results for each OS out of the API rather than one combined result.  I
rather like that feature of cfbot's results.  (I could be wrong about
needing to do that, this is the first time I've ever looked at this
stuff.)

The Windows test actually fails right now, exactly as reported by
Ranier[1].  It is a release build on a recent MSVC, so I guess that is
expected and off-topic for this thread.  But generally,
.github/workflows/ci-windows.yml is the weakest part of this.  It'd be
great to get a debug/assertion build, show backtraces when it crashes,
run more of the tests, etc etc, but I don't know nearly enough about
Windows to do that myself.  Another thing is that it uses Choco for
flex and bison; it'd be better to find those on the image, if
possible.  Also, for all 3 OSes, it's not currently attempting to
cache build results or anything like that.

I'm a bit sad that GH doesn't have FreeBSD build runners.  Those are
now popping up on other CIs, but I'm not sure if their free/open
source tiers have enough resources for cfbot.

[1] 
https://www.postgresql.org/message-id/flat/CAEudQArhn8bH836OB%2B3SboiaeEcgOtrJS58Bki4%3D5yeVqToxgw%40mail.gmail.com
From 424bcae7f1fcb1ada0e7046bfc5e0c5254c6f439 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Thu, 3 Dec 2020 10:58:16 +1300
Subject: [PATCH] Github CI WIP

---
 .github/workflows/ci-linux.yml | 43 ++
 .github/workflows/ci-macos.yml | 38 +++
 .github/workflows/ci-windows-buildsetup.pl | 35 ++
 .github/workflows/ci-windows-dumpregr.pl   | 22 +++
 .github/workflows/ci-windows.yml   | 32 
 5 files changed, 170 insertions(+)
 create mode 100644 .github/workflows/ci-linux.yml
 create mode 100644 .github/workflows/ci-macos.yml
 create mode 100644 .github/workflows/ci-windows-buildsetup.pl
 create mode 100644 .github/workflows/ci-windows-dumpregr.pl
 create mode 100644 .github/workflows/ci-windows.yml

diff --git a/.github/workflows/ci-linux.yml b/.github/workflows/ci-linux.yml
new file mode 100644
index 00..8ee32770cd
--- /dev/null
+++ b/.github/workflows/ci-linux.yml
@@ -0,0 +1,43 @@
+name: ci-linux
+on: [push]
+jobs:
+  ci-linux:
+runs-on: ubuntu-latest
+steps:
+- uses: actions/checkout@v2
+- name: Install packages
+  run: |
+sudo apt-get --yes update
+sudo apt-get --yes install gcc libreadline-dev flex bison make perl libipc-run-perl clang llvm-dev libperl-dev libpython-dev tcl-dev libldap2-dev libicu-dev docbook-xml docbook-xsl fop libxml2-utils xsltproc krb5-admin-server krb5-kdc krb5-user slapd ldap-utils libssl-dev pkg-config locales-all gdb
+  env:
+DEBIAN_FRONTEND: noninteractive
+- name: Configure
+  run: ./configure --enable-cassert --enable-debug --enable-tap-tests --with-tcl --with-python --with-perl --with-ldap --with-openssl --with-icu --with-llvm
+- name: Build
+  run: |
+echo "COPT=-Wall -Werror" > src/Makefile.custom
+make -s -j3
+- name: Check world
+  run: |
+echo '/tmp/%e-%s-%p.core' | sudo tee /proc/sys/kernel/core_pattern
+ulimit -c unlimited
+make -s check-world
+  env:
+PG_TEST_EXTRA: "ssl kerberos"
+#PG_TEST_EXTRA: "ssl ldap kerberos" why does slapd fail to start?
+- name: Look for clues
+  if: ${{ failure() }}
+  run: |
+# dump useful logs
+for F in ` find . -name initdb.log -o -name regression.diffs ` ; do
+  echo === $F ===
+  head -1000 $F
+done
+# look for core files and spit out backtraces
+for corefile in $(find /tmp/ -name '*.core' 2>/dev/null) ; do
+  binary=$(gdb -quiet -core $corefile -batch -ex 'info auxv' | grep AT_EXECFN | perl -pe "s/^.*\"(.*)\"\$/\$1/g")
+  echo dumping $corefile for $binary
+  gdb --batch --quiet -ex "thread apply all bt full" -ex "quit" $binary $corefile
+done
+- name: Documentation
+  run: make -s docs
diff --git a/.github/workflows/ci-macos.yml b/.github/workflows/ci-macos.yml
new file mode 100644
index 00..2c20a5a279
--- /dev/null
+++ b/.github/workflows/ci-macos.yml
@@ -0,0 +1,38 @@
+name: ci-macos
+on: [push]
+jobs:
+  ci-macos:
+runs-on: macos-latest
+steps:
+- uses: actions/checkout@v2
+- name: Install packages
+ 

Re: Printing LSN made easy

2020-12-02 Thread Ashutosh Bapat
On Mon, Nov 30, 2020 at 7:38 PM Li Japin  wrote:

> Hi,
>
> On Nov 30, 2020, at 9:06 PM, Ashutosh Bapat 
> wrote:
>
> On Fri, Nov 27, 2020 at 9:51 PM Li Japin  wrote:
>
>
> Hi,
>
> Here, we cannot use sizeof(buf) to get the buf size, because it is a
> pointer, so it is always
> 8 bytes on a 64-bit or 4 bytes on a 32-bit machine.
>
>
> For an array, the sizeof() returns the size of memory consumed by the
> array. See section "Application to arrays" at
> https://en.wikipedia.org/wiki/Sizeof.
>
>
> That’s true! However, in pg_lsn_out_buffer(), it converts to a pointer,
> not an array. See the following test:
>
>
Ah! Thanks for pointing that out. I have fixed this in my repository.
However, from Alvaro's reply it looks like the approach is not acceptable,
so I am not posting the fixed version here.
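
The sizeof() point is easy to demonstrate outside of PostgreSQL; in this
standalone snippet (unrelated to the patch itself) the same name gives two
different answers because an array parameter decays to a pointer:

    #include <stdio.h>

    static void
    takes_buffer(char buf[32])      /* parameter decays to "char *buf" */
    {
        printf("in callee: sizeof(buf) = %zu\n", sizeof(buf)); /* 8 on 64-bit */
    }

    int
    main(void)
    {
        char buf[32];

        printf("in caller: sizeof(buf) = %zu\n", sizeof(buf)); /* 32 */
        takes_buffer(buf);
        return 0;
    }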

--
Best Wishes,
Ashutosh


Re: should INSERT SELECT use a BulkInsertState?

2020-12-02 Thread Bharath Rupireddy
On Wed, Dec 2, 2020 at 10:24 PM Justin Pryzby  wrote:
>
> One loose end in this patch is how to check for volatile default expressions.
>
> copyfrom.c is a utility statement, so it can look at the parser's column list:
> COPY table(c1,c2)...
>
> However, for INSERT, in nodeModifyTable.c, it looks like parsing, rewriting,
> and planning are done, at which point I don't know if there's a good way to
> find that.  The default expressions will have been rewritten into the planned
> statement.
>
> We need the list of columns whose default is volatile, excluding columns for
> which a non-default value is specified.
>
> INSERT INTO table (c1,c2) VALUES (1,default);
>
> We'd want the list of any column in the table with a volatile default,
> excluding columns c1, but not-excluding explicit default columns c2 or any
> implicit default columns (c3, etc).
>
> Any idea ?
>

I think we should be doing all the necessary checks in the planner and
have a flag in the planned stmt to indicate whether to go with multi
insert or not. For the required checks, we can have a look at how the
existing COPY decides to go with either CIM_MULTI or CIM_SINGLE.

Now, as to how we can find out whether a given relation
has default expressions or volatile expressions, it is worth looking
at build_column_default() and contain_volatile_functions().
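
As a rough sketch of how those two could be combined (the function name and its
placement are assumptions, and the exclusion of columns with explicitly supplied
values is left to the caller):

    /* Sketch: does any column of "rel" have a volatile default expression? */
    static bool
    rel_has_volatile_default(Relation rel)
    {
        TupleDesc   tupdesc = RelationGetDescr(rel);

        for (int attnum = 1; attnum <= tupdesc->natts; attnum++)
        {
            Node   *defexpr = build_column_default(rel, attnum);

            if (defexpr != NULL && contain_volatile_functions(defexpr))
                return true;
        }

        return false;
    }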

I prefer to have the multi-insert deciding code for COPY and INSERT
SELECT in a single common function which can be reused. Though COPY
has some things like default expressions already at hand, unlike INSERT
SELECT, we can try to keep them under a common function, where for
COPY we can skip some code and for INSERT SELECT we can do extra work
to find default expressions.

Although unrelated, for parallel inserts in INSERT SELECT[1], in the
planner there are some checks to see if the parallelism is safe or
not. Check max_parallel_hazard_for_modify() in
v8-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.patch from
[1]. Along similar lines, we can also have the multi-insert deciding
code.

[1] 
https://www.postgresql.org/message-id/CAJcOf-fy3P%2BkDArvmbEtdQTxFMf7Rn2%3DV-sqCnMmKO3QKBsgPA%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Huge memory consumption on partitioned table with FKs

2020-12-02 Thread Kyotaro Horiguchi
At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote  wrote 
in 
> Hello,
> 
> On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
>  wrote:
> > At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote  
> > wrote in
> > > Hmm, I don't see what the problem is here, because it's not
> > > RI_ConstraintInfo that is being shared among partitions of the same
> > > parent, but the RI query (and the SPIPlanPtr for it) that is issued
> > > through SPI.  The query (and the plan) turns out to be the same no
> > > matter what partition's RI_ConstraintInfo is first used to generate
> > > it.  What am I missing?
> >
> > I don't think you're missing something. As I (tried to..) mentioned in
> > another branch of this thread, you're right. On the other hand
> > fk_attnums is actually used to generate the query, thus *issue*
> > potentially affects the query.  Although that might be paranoic, that
> > possibility is eliminated by checking the congruency(?) of fk_attnums
> > (or other members).  We might even be able to share a riinfo among
> > such children.
> 
> Just to be sure I've not misunderstood you, let me mention why I think
> the queries used by different partitions all turn out to be the same
> despite attribute number differences among partitions.  The places
> that uses fk_attnums when generating a query are these among others:
> 
> Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
> ...
> Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
> ...
> quoteOneName(attname,
>  RIAttName(fk_rel, riinfo->fk_attnums[i]));
> 
> For the queries on the referencing side ("check" side),
> type/collation/attribute name determined using the above are going to
> be the same for all partitions in a given tree irrespective of the
> attribute number, because they're logically the same column.  On the

Yes, I know that, which is what I meant by "practically" or
"actually", but it is not explicitly defined AFAICS.

Thus that would no longer be an issue if we explicitly define that
"When conparentid stores a valid value, each element of fk_attnums
points to logically the same attribute as in the RI_ConstraintInfo for
the parent constraint."  Or I'd be happy if we have such a comment
there instead.

> referenced side ("restrict", "cascade", "set" side), as you already
> mentioned, fk_attnums refers to the top parent table of the
> referencing side, so no possibility of they being different in the
> various referenced partitions' RI_ConstraintInfos.

Right. (I'm not sure I have mentioned that here, though :p)

> On the topic of how we'd be able to share even the RI_ConstraintInfos
> among partitions, that would indeed look a bit more elaborate than the
> patch we have right now.

Maybe just letting the hash entry for the child riinfo point to the
parent riinfo if all members (other than constraint_id, of course)
share exactly the same values.  No need to count references since
we aren't going to remove riinfos.

> > > BTW, one problem with Kuroda-san's patch is that it's using
> > > conparentid as the shared key, which can still lead to multiple
> > > queries being generated when using multi-level partitioning, because
> > > there would be many (intermediate) parent constraints in that case.
> > > We should really be using the "root" constraint OID as the key,
> > > because there would only be one root from which all constraints in a
> > > given partition hierarchy would've originated.  Actually, I had
> > > written a patch a few months back to do exactly that, a polished
> > > version of which I've attached with this email.  Please take a look.
> >
> > I don't like that the function self-recurses but
> > ri_GetParentConstrOid() actually climbs up to the root constraint, so
> > the patch covers the multilevel partitioning cases.
> 
> Sorry, I got too easily distracted by the name Kuroda-san chose for
> the field in RI_ConstraintInfo and the function.

I thought the same at the first look to the function.

> > About your patch, it calculates the root constrid at the time an
> > riinfo is created, but when the root-partition is further attached to
> > another partitioned-table after the riinfo creation,
> > constraint_root_id gets stale.  Of course that doesn't matter
> > practically, though.
> 
> Maybe we could also store the hash value of the root constraint OID as
> rootHashValue and check for that one too in
> InvalidateConstraintCacheCallBack().  That would take care of this
> unless I'm missing something.

Seems to be sound.
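
For what it's worth, that would amount to one extra comparison in the existing
test inside the callback (rootHashValue being the assumed new field), roughly:

    /* Sketch: invalidate if this constraint or its root constraint changed. */
    if (hashvalue == 0 ||
        riinfo->oidHashValue == hashvalue ||
        riinfo->rootHashValue == hashvalue)
        riinfo->valid = false;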

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Pavel Stehule
On Wed, 2 Dec 2020 at 21:02, Tom Lane  wrote:

> Dmitry Dolgov <9erthali...@gmail.com> writes:
> >> On Wed, Dec 02, 2020 at 12:58:51PM -0500, Tom Lane wrote:
> >> So ... one of the things that's been worrying me about this patch
> >> from day one is whether it would create a noticeable performance
> >> penalty for existing use-cases.  I did a small amount of experimentation
> >> about that with the v35 patchset, and it didn't take long at all to
> >> find that this:
> >> ...
> >> is about 15% slower with the patch than with HEAD.  I'm not sure
> >> what an acceptable penalty might be, but 15% is certainly not it.
>
> > I've tried to reproduce that, but get ~2-4% slowdown (with a pinned
> > backend, no turbo etc). Are there any special steps I've probably
> > missed?
>
> Hmm, no, I just built with --disable-cassert and otherwise my usual
> development options.
>
> I had experimented with some other variants of the test case,
> where the repeated statement is
>
> a[i] := i;-- about the same
> a[i] := a[i-1] + 1;   -- 7% slower
> a[i] := a[i-1] - a[i-2];  -- 15% slower
>
> so it seems clear that the penalty is on the array fetch not array
> assign side.  This isn't too surprising now that I think about it,
> because plpgsql's array assignment code is untouched by the patch
> (which is a large feature omission BTW: you still can't write
> jsonb['x'] := y;
>

The refactoring of the left part of the assignment statement in plpgsql
is probably harder work than this patch. But it should be the next
step.


> in plpgsql).
>
>
I tested the last patch on my FC33 Lenovo T520 (i7) and I don't see a 15%
slowdown either. On my machine there is a slowdown of about 1.5-3%. I used your
function arraytest.

Regards

Pavel




> regards, tom lane
>


pg_ctl.exe file deleted automatically

2020-12-02 Thread Joel Mariadasan (jomariad)
Hi,

We are using Windows 2019 server.

Sometimes we see the pg_ctl.exe file getting automatically deleted.
Due to this, while starting up the Postgres Windows service we get the
error:
"Error 2: The system cannot find the file specified"

Can you let us know the potential causes for this pg_ctl.exe file deletion?

Regards,
Joel


Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-12-02 Thread Justin Pryzby
On Thu, Dec 03, 2020 at 10:19:43AM +0900, Michael Paquier wrote:
> OK, this one is now committed.  As of this thread, I think that we are
> going to need to do a bit more once we add options that are not just
> booleans for both commands (the grammar rules do not need to be
> changed now):
> - Have a ReindexParams, similarly to VacuumParams except that we store
> the results of the parsing in a single place.  With the current HEAD,
> I did not see yet the point in doing so because we just need an
> integer that maps to a bitmask made of ReindexOption.
> - The part related to ReindexStmt in utility.c is getting more and
> more complicated, so we could move most of the execution into
> indexcmds.c with some sort of ExecReindex() doing the option parsing
> job, and go to the correct code path depending on the object type
> dealt with.

Good idea.  I think you mean like this.

I don't know where to put the struct.
I thought maybe the low-level, integer options should live in the struct, in
addition to the bools, but it's not important.

-- 
Justin
>From 520d1c54d6988d69768178b4abf03c5837654b9a Mon Sep 17 00:00:00 2001
From: Alexey Kondratov 
Date: Wed, 23 Sep 2020 18:21:16 +0300
Subject: [PATCH v31 3/5] Refactor and reuse set_rel_tablespace()

---
 src/backend/catalog/index.c  | 74 
 src/backend/commands/indexcmds.c | 35 ---
 src/include/catalog/index.h  |  2 +
 3 files changed, 49 insertions(+), 62 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 532c11e9dd..b317f556df 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3607,7 +3607,6 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 	Relation	iRel,
 heapRelation;
 	Oid			heapId;
-	Oid			oldTablespaceOid;
 	IndexInfo  *indexInfo;
 	volatile bool skipped_constraint = false;
 	PGRUsage	ru0;
@@ -3723,41 +3722,17 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 		tablespaceOid = InvalidOid;
 
 	/*
-	 * Set the new tablespace for the relation.  Do that only in the
-	 * case where the reindex caller wishes to enforce a new tablespace.
+	 * Set the new tablespace for the relation if requested.
 	 */
-	oldTablespaceOid = iRel->rd_rel->reltablespace;
 	if (set_tablespace &&
-		(tablespaceOid != oldTablespaceOid ||
-		(tablespaceOid == MyDatabaseTableSpace && OidIsValid(oldTablespaceOid
+		set_rel_tablespace(indexId, tablespaceOid))
 	{
-		Relation		pg_class;
-		Form_pg_class	rd_rel;
-		HeapTuple		tuple;
-
-		/* First get a modifiable copy of the relation's pg_class row */
-		pg_class = table_open(RelationRelationId, RowExclusiveLock);
-
-		tuple = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(indexId));
-		if (!HeapTupleIsValid(tuple))
-			elog(ERROR, "cache lookup failed for relation %u", indexId);
-		rd_rel = (Form_pg_class) GETSTRUCT(tuple);
-
 		/*
 		 * Mark the relation as ready to be dropped at transaction commit,
 		 * before making visible the new tablespace change so as this won't
 		 * miss things.
 		 */
 		RelationDropStorage(iRel);
-
-		/* Update the pg_class row */
-		rd_rel->reltablespace = tablespaceOid;
-		CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
-
-		heap_freetuple(tuple);
-
-		table_close(pg_class, RowExclusiveLock);
-
 		RelationAssumeNewRelfilenode(iRel);
 
 		/* Make sure the reltablespace change is visible */
@@ -4063,6 +4038,51 @@ reindex_relation(Oid relid, int flags, int options, Oid tablespaceOid)
 	return result;
 }
 
+/*
+ * set_rel_tablespace - modify relation tablespace in the pg_class entry.
+ *
+ * 'reloid' is an Oid of relation to be modified.
+ * 'tablespaceOid' is an Oid of new tablespace.
+ *
+ * Catalog modification is done only if tablespaceOid is different from
+ * the currently set.  Returned bool value is indicating whether any changes
+ * were made or not.
+ */
+bool
+set_rel_tablespace(Oid reloid, Oid tablespaceOid)
+{
+	Relation		pg_class;
+	HeapTuple		tuple;
+	Form_pg_class	rd_rel;
+	bool			changed = false;
+	Oid			oldTablespaceOid;
+
+	/* Get a modifiable copy of the relation's pg_class row. */
+	pg_class = table_open(RelationRelationId, RowExclusiveLock);
+
+	tuple = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(reloid));
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for relation %u", reloid);
+	rd_rel = (Form_pg_class) GETSTRUCT(tuple);
+
+	/* No work if no change in tablespace. */
+	oldTablespaceOid = rd_rel->reltablespace;
+	if (tablespaceOid != oldTablespaceOid ||
+		(tablespaceOid == MyDatabaseTableSpace && OidIsValid(oldTablespaceOid)))
+	{
+		/* Update the pg_class row. */
+		rd_rel->reltablespace = (tablespaceOid == MyDatabaseTableSpace) ?
+			InvalidOid : tablespaceOid;
+		CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
+
+		changed = true;
+	}
+
+	heap_freetuple(tuple);
+	table_close(pg_class, RowExclusiveLock);
+
+	return changed;
+}
 
 /* 

Re: Deprecate custom encoding conversions

2020-12-02 Thread Fujii Masao




On 2020/12/03 13:07, Tom Lane wrote:

Fujii Masao  writes:

On 2020/12/03 1:04, Heikki Linnakangas wrote:

We never use non-default conversions for anything. All code that performs 
encoding conversions only cares about the default ones.



Yes. I had to update pg_conversion.condefault directly so that we could
use the custom encoding conversion when I registered it. Directly updating
pg_conversion is of course not a good thing. So I was wondering
if we should have something like ALTER CONVERSION SET DEFAULT.


Perhaps, but what I'd think is more urgent is having some way for
a session to select a non-default conversion for client I/O.


+1

What about adding a new libpq parameter like encoding_conversion
(like client_encoding) so that a client can specify what conversion to use?
That setting is sent to the server while establishing the connection.
Probably we would need to verify that the setting of this new parameter
is consistent with that of client_encoding.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Deprecate custom encoding conversions

2020-12-02 Thread Tom Lane
Fujii Masao  writes:
> On 2020/12/03 1:04, Heikki Linnakangas wrote:
>> We never use non-default conversions for anything. All code that performs 
>> encoding conversions only cares about the default ones.

> Yes. I had to update pg_conversion.condefault directly so that we could
> use the custom encoding conversion when I registered it. Directly updating
> pg_conversion is of course not a good thing. So I was wondering
> if we should have something like ALTER CONVERSION SET DEFAULT.

Perhaps, but what I'd think is more urgent is having some way for
a session to select a non-default conversion for client I/O.

regards, tom lane




Re: Deprecate custom encoding conversions

2020-12-02 Thread Fujii Masao




On 2020/12/03 1:04, Heikki Linnakangas wrote:

Hi,

PostgreSQL allows writing custom encoding conversion functions between any 
character encodings, using the CREATE CONVERSION command. It's pretty flexible, 
you can define default and non-default conversions, and the conversions live in 
schemas so you can have multiple conversions installed in a system and you can 
switch between them by changing search_path.

However:

We never use non-default conversions for anything. All code that performs 
encoding conversions only cares about the default ones.


Yes. I had to update pg_conversion.condefault directly so that we could
use the custom encoding conversion when I registered it. Directly updating
pg_conversion is of course not a good thing. So I was wondering
if we should have something like ALTER CONVERSION SET DEFAULT.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Deprecate custom encoding conversions

2020-12-02 Thread Fujii Masao




On 2020/12/03 11:48, tsunakawa.ta...@fujitsu.com wrote:

From: Michael Paquier 

Tsunakawa-san, could you post a link to this article, if possible?  I am curious
about their problem and why they used CREATE CONVERSION as a way to
solve it.  That's fine even if it is in Japanese.


I just pulled info from my old memory in my previous mail.  Now the following 
links seem to be relevant.  (They should be different from what I saw in the 
past.)

https://ml.postgresql.jp/pipermail/pgsql-jp/2007-January/013103.html

https://teratail.com/questions/295988

IIRC, the open source Postgres extension EUDC also uses CREATE CONVERSION.


FWIW, about four years ago, for a real project, I wrote
an extension [1] that provides yet another encoding conversion
from UTF-8 to EUC_JP, to handle some external characters.
This extension uses CREATE CONVERSION.

[1]
https://github.com/MasaoFujii/pg_fallback_utf8_to_euc_jp

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: A new function to wait for the backend exit after termination

2020-12-02 Thread Bharath Rupireddy
Thanks for the review.

On Thu, Dec 3, 2020 at 7:24 AM Hou, Zhijie  wrote:
>
> 1.
> +
> +   ereport(WARNING,
> +   (errmsg("could not wait for the termination of the backend with PID %d within %ld milliseconds",
> +   pid, timeout)));
> +
>
> The code use %ld to print int64 type.
> How about use INT64_FORMAT, which looks more appropriate.
>

Changed it to use %lld and typecasting timeout to (long long int) as
suggested by Tom.
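
For reference, the resulting call then looks along these lines, matching how
other translatable messages in the tree print int64 values:

    ereport(WARNING,
            (errmsg("could not wait for the termination of the backend with PID %d within %lld milliseconds",
                    pid, (long long int) timeout)));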

>
> 2.
> +   if (timeout <= 0)
> +   {
> +   ereport(WARNING,
> +   (errmsg("timeout cannot be negative or zero: %ld", timeout)));
> +   PG_RETURN_BOOL(r);
> +   }
>
> The same as 1.
>

Changed.

>
> 3.
> +pg_terminate_backend_and_wait(PG_FUNCTION_ARGS)
> +{
> +   int pid = PG_GETARG_DATUM(0);
>
> +pg_wait_backend(PG_FUNCTION_ARGS)
> +{
> +   int pid = PG_GETARG_INT32(0);
>
> The code use different macro to get pid,
> How about use PG_GETARG_INT32(0) for each one.
>

Changed.

> I am sorry, I forgot a possible typo comment.
>
> +{ oid => '16386', descr => 'terminate a backend process and wait for it\'s exit or until timeout occurs'
>
> Does the following change looks better?
>
> Wait for it\'s exit => Wait for its exit
>

Changed.

>
> I changed the status to 'waiting on author'.
> The rest of the patch LGTM,
> I think it can be changed to Ready for Committer again, when this comment is
> confirmed.
>

Attaching v4 patch. Please have a look.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From e3f364e8a5a06b3e122b657e668146f220549acb Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Thu, 3 Dec 2020 09:09:13 +0530
Subject: [PATCH v4] pg_terminate_backend with wait, timeout and
 pg_wait_backend with timeout

This patch adds two new functions:
1. pg_terminate_backend(pid, wait, timeout) - terminate and
wait or timeout for a given backend.
2. pg_wait_backend(pid, timeout) - check the existence of the
backend with a given PID and wait or timeout until it goes away.
---
 doc/src/sgml/func.sgml|  28 +-
 src/backend/catalog/system_views.sql  |  10 ++
 src/backend/postmaster/pgstat.c   |   3 +
 src/backend/storage/ipc/signalfuncs.c | 137 ++
 src/include/catalog/pg_proc.dat   |   9 ++
 src/include/pgstat.h  |   3 +-
 6 files changed, 187 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index df29af6371..6b77b80336 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -23954,7 +23954,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
 
  pg_terminate_backend
 
-pg_terminate_backend ( pid integer )
+pg_terminate_backend ( pid integer, wait boolean, timeout bigint DEFAULT 100 )
 boolean


@@ -23962,7 +23962,31 @@ SELECT collation for ('foo' COLLATE "de_DE");
 specified process ID.  This is also allowed if the calling role
 is a member of the role whose backend is being terminated or the
 calling role has been granted pg_signal_backend,
-however only superusers can terminate superuser backends.
+however only superusers can terminate superuser backends. When no
+wait and timeout are
+provided, only SIGTERM is sent to the backend with the given process
+ID and false is returned immediately. But the
+backend may still exist. With wait specified as
+true, the function waits for the backend termination
+until the given timeout or the default 100
+milliseconds. On timeout false is returned.
+   
+  
+
+  
+   
+
+ pg_wait_backend
+
+pg_wait_backend ( pid integer, timeout bigint DEFAULT 100 )
+boolean
+   
+   
+Check the existence of the session whose backend process has the
+specified process ID. On success return true. This
+function waits until the given timeout or the
+default 100 milliseconds. On timeout false is
+returned.

   
  
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index b140c210bc..2d1907b4a8 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1246,6 +1246,16 @@ CREATE OR REPLACE FUNCTION
   RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS 'pg_promote'
   PARALLEL SAFE;
 
+CREATE OR REPLACE FUNCTION
+  pg_terminate_backend(pid integer, wait boolean, timeout int8 DEFAULT 100)
+  RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS 'pg_terminate_backend_and_wait'
+  PARALLEL UNSAFE;
+
+CREATE OR REPLACE FUNCTION
+  pg_wait_backend(pid integer, timeout int8 DEFAULT 100)
+  RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS 'pg_wait_backend'
+  PARALLEL UNSAFE;
+
 -- legacy definition for compatibility with 9.3
 

RE: Disable WAL logging to speed up data loading

2020-12-02 Thread tsunakawa.ta...@fujitsu.com
From: Osumi, Takamichi/大墨 昂道 
> I've made a new patch v05 that took in comments to filter out WALs more 
> strictly
> and addressed some minor fixes that were discussed within past few days.
> Also, I changed the documentations, considering those modifications.

The code looks good, and the performance seems to be nice, so I marked this 
ready for committer.


Regards
Takayuki Tsunakawa





RE: [Patch] Optimize dropping of relation buffers using dlist

2020-12-02 Thread k.jami...@fujitsu.com
On Thursday, November 26, 2020 4:19 PM, Horiguchi-san wrote:
> Hello, Kirk. Thank you for the new version.

Apologies for the delay, but attached are the updated versions to simplify the 
patches.
The changes reflected most of your comments/suggestions.

Summary of changes in the latest versions.
1. Updated the function description of DropRelFileNodeBuffers in 0003.
2. Updated the commit logs of 0003 and 0004.
3. FindAndDropRelFileNodeBuffers is now called for each relation fork,
  instead of for all involved forks.
4. Removed the unnecessary palloc() and subscripts like forks[][],
   firstDelBlock[], nforks, as advised by Horiguchi-san. The memory
   allocation for block[][] was also simplified.
   So 0004 became simpler and more readable.


> At Thu, 26 Nov 2020 03:04:10 +, "k.jami...@fujitsu.com"
>  wrote in
> > On Thursday, November 19, 2020 4:08 PM, Tsunakawa, Takayuki wrote:
> > > From: Andres Freund 
> > > > DropRelFileNodeBuffers() in recovery? The most common path is
> > > > DropRelationFiles()->smgrdounlinkall()->DropRelFileNodesAllBuffers
> > > > (), which 3/4 doesn't address and 4/4 doesn't mention.
> > > >
> > > > 4/4 seems to address DropRelationFiles(), but only talks about
> > > TRUNCATE?
> > >
> > > Yes.  DropRelationFiles() is used in the following two paths:
> > >
> > > [Replay of TRUNCATE during recovery]
> > > xact_redo_commit/abort() -> DropRelationFiles()  ->
> > > smgrdounlinkall() ->
> > > DropRelFileNodesAllBuffers()
> > >
> > > [COMMIT/ROLLBACK PREPARED]
> > > FinishPreparedTransaction() -> DropRelationFiles()  ->
> > > smgrdounlinkall()
> > > -> DropRelFileNodesAllBuffers()
> >
> > Yes. The concern is that it was not clear in the function descriptions
> > and commit logs what the optimizations for the
> > DropRelFileNodeBuffers() and DropRelFileNodesAllBuffers() are for. So
> > I revised only the function description of DropRelFileNodeBuffers() and the
> commit logs of the 0003-0004 patches. Please check if the brief descriptions
> would suffice.
> 
> I read the commit message of 3/4.  (Though this is not involved literally in the
> final commit.)
> 
> > While recovery, when WAL files of XLOG_SMGR_TRUNCATE from vacuum or
> > autovacuum are replayed, the buffers are dropped when the sizes of all
> > involved forks of a relation are already "cached". We can get
> 
> This sentence seems missing "dropped by (or using) what".
> 
> > a reliable size of nblocks for supplied relation's fork at that time,
> > and it's safe because DropRelFileNodeBuffers() relies on the behavior
> > that cached nblocks will not be invalidated by file extension during
> > recovery.  Otherwise, or if not in recovery, proceed to sequential
> > search of the whole buffer pool.
> 
> This sentence seems to involve confusion. It reads as if "we can rely on it
> because we're relying on it".  And "the cached value won't be invalidated"
> doesn't explain the reason precisely. The reason, I think, is that the cached
> value is guaranteed to be the maximum page we have in shared buffers, at least
> while in recovery, and that guarantee is held by not calling fseek once we
> have cached the value.

Fixed the commit log of 0003.

> > > > I also don't get why 4/4 would be a good idea on its own. It uses
> > > > BUF_DROP_FULL_SCAN_THRESHOLD to guard
> > > > FindAndDropRelFileNodeBuffers() on a per relation basis. But since
> > > > DropRelFileNodesAllBuffers() can be used for many relations at
> > > > once, this could end up doing BUF_DROP_FULL_SCAN_THRESHOLD - 1
> > > > lookups a lot of times, once for each of nnodes relations?
> > >
> > > So, the threshold value should be compared with the total number of
> > > blocks of all target relations, not each relation.  You seem to be right, got it.
> >
> > Fixed this in 0004 patch. Now we compare the total number of
> > buffers-to-be-invalidated for ALL relations to the
> > BUF_DROP_FULL_SCAN_THRESHOLD.
> 
> I didn't see the previous version, but the row of additional palloc/pfree's in
> this version looks uneasy.

Fixed this too.
 
>   int i,
> + j,
> + *nforks,
>   n = 0;
> 
> Perhaps I think we don't define variables of different types at once.
> (I'm not sure about defining multiple variables at once.)

Fixed this too.

> @@ -3110,7 +3125,10 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend
> *rnodes, int nnodes)
> 
>   DropRelFileNodeAllLocalBuffers(rnodes[i].node);
>   }
>   else
> + {
> + rels[n] = smgr_reln[i];
>   nodes[n++] = rnodes[i].node;
> + }
>   }
> 
> We don't need to remember nodes and rnodes here since rnodes[n] is
> rels[n]->smgr_rnode here.  Or we don't even need to store rels since we can
> scan the smgr_reln later again.
> 
> nodes is needed in the full-scan path but it is enough to collect it after 
> finding
> that we do full-scan.

I followed this advice.

RE: A new function to wait for the backend exit after termination

2020-12-02 Thread Hou, Zhijie
> 
> I changed the status to 'Waiting on Author'.
> The rest of the patch LGTM;
> I think it can be changed to Ready for Committer again once this comment
> is addressed.
> 

Sorry, I forgot to mention a possible typo.

+{ oid => '16386', descr => 'terminate a backend process and wait for it\'s 
exit or until timeout occurs'

Does the following change look better?

Wait for it\'s exit => Wait for its exit

Best regards,
houzj




Re: Huge memory consumption on partitioned table with FKs

2020-12-02 Thread Amit Langote
Hello,

On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
 wrote:
> At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote  
> wrote in
> > Hello,
> >
> > On Tue, Dec 1, 2020 at 9:40 AM Kyotaro Horiguchi
> >  wrote:
> > >
> > > At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera 
> > >  wrote in
> > > > On 2020-Nov-26, Kyotaro Horiguchi wrote:
> > > >
> > > > > This shares RI_ConstraintInfo cache by constraints that share the
> > > > > same parent constraints. But you forgot that the cache contains some
> > > > > members that can differ among partitions.
> > > > >
> > > > > Consider the case of attaching a partition that has experienced a
> > > > > column deletion.
> > > >
> > > > I think this can be solved easily in the patch, by having
> > > > ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
> > > > they are equal then use the parent's constaint_id, otherwise use the
> > > > child constraint.  That way, the cache entry is reused in the common
> > > > case where they are identical.
> > >
> > > *I* think it's the direction.  After an off-list discussion, we
> > >  confirmed that even in that case the patch works as is because
> > >  fk_attnum (or contuple.conkey) always stores key attnums compatible
> > >  to the topmost parent when conparent has a valid value (assuming the
> > >  current usage of fk_attnum), but I still feel uneasy to rely on that
> > >  unclear behavior.
> >
> > Hmm, I don't see what the problem is here, because it's not
> > RI_ConstraintInfo that is being shared among partitions of the same
> > parent, but the RI query (and the SPIPlanPtr for it) that is issued
> > through SPI.  The query (and the plan) turns out to be the same no
> > matter what partition's RI_ConstraintInfo is first used to generate
> > it.  What am I missing?
>
> I don't think you're missing something. As I (tried to..) mentioned in
> another branch of this thread, you're right. On the other hand,
> fk_attnums is actually used to generate the query, thus the *issue*
> potentially affects the query.  Although that might be paranoid, that
> possibility is eliminated by checking the congruency(?) of fk_attnums
> (or other members).  We might even be able to share a riinfo among
> such children.

Just to be sure I've not misunderstood you, let me mention why I think
the queries used by different partitions all turn out to be the same
despite attribute number differences among partitions.  The places
that uses fk_attnums when generating a query are these among others:

Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
...
Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
...
quoteOneName(attname,
 RIAttName(fk_rel, riinfo->fk_attnums[i]));

For the queries on the referencing side ("check" side),
type/collation/attribute name determined using the above are going to
be the same for all partitions in a given tree irrespective of the
attribute number, because they're logically the same column.  On the
referenced side ("restrict", "cascade", "set" side), as you already
mentioned, fk_attnums refers to the top parent table of the
referencing side, so no possibility of they being different in the
various referenced partitions' RI_ConstraintInfos.

On the topic of how we'd be able to share even the RI_ConstraintInfos
among partitions, that would indeed look a bit more elaborate than the
patch we have right now.

> > BTW, one problem with Kuroda-san's patch is that it's using
> > conparentid as the shared key, which can still lead to multiple
> > queries being generated when using multi-level partitioning, because
> > there would be many (intermediate) parent constraints in that case.
> > We should really be using the "root" constraint OID as the key,
> > because there would only be one root from which all constraints in a
> > given partition hierarchy would've originated.  Actually, I had
> > written a patch a few months back to do exactly that, a polished
> > version of which I've attached with this email.  Please take a look.
>
> I don't like that the function self-recurses but
> ri_GetParentConstrOid() actually climbs up to the root constraint, so
> the patch covers the multilevel partitioning cases.

Sorry, I got too easily distracted by the name Kuroda-san chose for
the field in RI_ConstraintInfo and the function.

> About your patch, it calculates the root constrid at the time an
> riinfo is created, but when the root-partition is further attached to
> another partitioned-table after the riinfo creation,
> constraint_root_id gets stale.  Of course that doesn't matter
> practically, though.

Maybe we could also store the hash value of the root constraint OID as
rootHashValue and check for that one too in
InvalidateConstraintCacheCallBack().  That would take care of this
unless I'm missing something.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




RE: Disable WAL logging to speed up data loading

2020-12-02 Thread osumi.takami...@fujitsu.com
Hello


I've made a new patch v05 that incorporates the comments
to filter out WAL records more strictly and addresses some minor fixes
that were discussed within the past few days.
I also updated the documentation to reflect those modifications.


Best,
Takamichi Osumi



disable_WAL_logging_v05.patch
Description: disable_WAL_logging_v05.patch


RE: Deprecate custom encoding conversions

2020-12-02 Thread tsunakawa.ta...@fujitsu.com
From: Michael Paquier 
> Tsunakawa-san, could you post a link to this article, if possible?  I am 
> curious
> about their problem and why they used CREATE CONVERSION as a way to
> solve it.  That's fine even if it is in Japanese.

I just pulled info from my old memory in my previous mail.  Now the following 
links seem to be relevant.  (They should be different from what I saw in the 
past.)

https://ml.postgresql.jp/pipermail/pgsql-jp/2007-January/013103.html

https://teratail.com/questions/295988

IIRC, the open source Postgres extension EUDC also uses CREATE CONVERSION.



Regards
Takayuki Tsunakawa





Re: autovac issue with large number of tables

2020-12-02 Thread Kasahara Tatsuhito
On Wed, Dec 2, 2020 at 7:11 PM Masahiko Sawada  wrote:
>
> On Wed, Dec 2, 2020 at 3:33 PM Fujii Masao  
> wrote:
> >
> >
> >
> > On 2020/12/02 12:53, Masahiko Sawada wrote:
> > > On Tue, Dec 1, 2020 at 5:31 PM Masahiko Sawada  
> > > wrote:
> > >>
> > >> On Tue, Dec 1, 2020 at 4:32 PM Fujii Masao  
> > >> wrote:
> > >>>
> > >>>
> > >>>
> > >>> On 2020/12/01 16:23, Masahiko Sawada wrote:
> >  On Tue, Dec 1, 2020 at 1:48 PM Kasahara Tatsuhito
> >   wrote:
> > >
> > > Hi,
> > >
> > > On Mon, Nov 30, 2020 at 8:59 PM Fujii Masao 
> > >  wrote:
> > >>
> > >>
> > >>
> > >> On 2020/11/30 10:43, Masahiko Sawada wrote:
> > >>> On Sun, Nov 29, 2020 at 10:34 PM Kasahara Tatsuhito
> > >>>  wrote:
> > 
> >  Hi, Thanks for you comments.
> > 
> >  On Fri, Nov 27, 2020 at 9:51 PM Fujii Masao 
> >   wrote:
> > >
> > >
> > >
> > > On 2020/11/27 18:38, Kasahara Tatsuhito wrote:
> > >> Hi,
> > >>
> > >> On Fri, Nov 27, 2020 at 1:43 AM Fujii Masao 
> > >>  wrote:
> > >>>
> > >>>
> > >>>
> > >>> On 2020/11/26 10:41, Kasahara Tatsuhito wrote:
> >  On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada 
> >   wrote:
> > >
> > > On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
> > >  wrote:
> > >>
> > >> Hi,
> > >>
> > >> On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada 
> > >>  wrote:
> > >>>
> > >>> On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
> > >>>  wrote:
> > 
> >  Hi,
> > 
> >  On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> >   wrote:
> > >> I wonder if we could have table_recheck_autovac do two 
> > >> probes of the stats
> > >> data.  First probe the existing stats data, and if it 
> > >> shows the table to
> > >> be already vacuumed, return immediately.  If not, *then* 
> > >> force a stats
> > >> re-read, and check a second time.
> > > Does the above mean that the second and subsequent 
> > > table_recheck_autovac()
> > > will be improved to first check using the previous 
> > > refreshed statistics?
> > > I think that certainly works.
> > >
> > > If that's correct, I'll try to create a patch for the PoC
> > 
> >  I still don't know how to reproduce Jim's troubles, but I 
> >  was able to reproduce
> >  what was probably a very similar problem.
> > 
> >  This problem seems to be more likely to occur in cases 
> >  where you have
> >  a large number of tables,
> >  i.e., a large amount of stats, and many small tables need 
> >  VACUUM at
> >  the same time.
> > 
> >  So I followed Tom's advice and created a patch for the PoC.
> >  This patch will enable a flag in the table_recheck_autovac 
> >  function to use
> >  the existing stats next time if VACUUM (or ANALYZE) has 
> >  already been done
> >  by another worker on the check after the stats have been 
> >  updated.
> >  If the tables continue to require VACUUM after the 
> >  refresh, then a refresh
> >  will be required instead of using the existing statistics.
> > 
> >  I did simple test with HEAD and HEAD + this PoC patch.
> >  The tests were conducted in two cases.
> >  (I changed few configurations. see attached scripts)
> > 
> >  1. Normal VACUUM case
> > - SET autovacuum = off
> > - CREATE tables with 100 rows
> > - DELETE 90 rows for each tables
> > - SET autovacuum = on and restart PostgreSQL
> > - Measure the time it takes for all tables to be 
> >  VACUUMed
> > 
> >  2. Anti wrap round VACUUM case
> > - CREATE brank tables
> > - SELECT all of these tables (for generate stats)
> > - SET autovacuum_freeze_max_age to low values and 
> >  restart PostgreSQL
> > - Consumes a lot of XIDs by using txid_curent()
> > - Measure the time it takes for all tables to be 
> >  VACUUMed

Re: Commitfest 2020-11 is closed

2020-12-02 Thread Chapman Flack
On 12/02/20 16:51, David Steele wrote:
> Depending on how you have Github organized migrating to travis-ci.com may be
> bit tricky because it requires full access to all private repositories in
> your account and orgs administrated by your account.

PL/Java just began using travis-ci.com this summer at the conclusion of
our GSoC project, and Thomas had been leery of the full-access-to-everything
requirement, but that turned out to be how it worked in the past.
The current approach involves installing a GitHub app into a particular
repository, and it did not come with excessive access requirements.

That being said, we got maybe three months of use out of it all told.
On 2 November, they announced a "new pricing model"[1],[2], and since
24 November it has not run PL/Java tests, logging a "does not have
enough credits" message[3] instead. So I rather hastily put a GitHub
Actions workflow together to plug the hole.

Apparently there is a way for OSS projects to ask nicely for an allocation
of some credits that might be available.

Regards,
-Chap


[1]
https://www.jeffgeerling.com/blog/2020/travis-cis-new-pricing-plan-threw-wrench-my-open-source-works
[2] https://blog.travis-ci.com/2020-11-02-travis-ci-new-billing
[3] https://travis-ci.com/github/tada/pljava/requests




Re: Deprecate custom encoding conversions

2020-12-02 Thread Tatsuo Ishii
> I recall a discussion in which someone explained that things are messy for
> Japanese because there's not really just one version of SJIS; there are
> several, because various groups extended the original standard in not-
> quite-compatible ways.  In turn this means that there's not really one
> well-defined conversion to/from Unicode

That's true.

> --- which is the reason why
> allowing custom conversions seemed like a good idea.  I don't know
> whether anyone over there is actually using custom conversions, but
> I'd be hesitant to just nuke the feature altogether.

By Googling I found an instance which is using CREATE CONVERSION
(Japanese article).

http://grep.blog49.fc2.com/blog-entry-87.html
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




RE: A new function to wait for the backend exit after termination

2020-12-02 Thread Hou, Zhijie
> > +   ereport(WARNING,
> > +   (errmsg("could not wait for the termination of the
> backend with PID %d within %ld milliseconds",
> > +   pid, timeout)));
> 
> > The code uses %ld to print an int64 value.
> > How about using INT64_FORMAT, which looks more appropriate?
> 
> This is a translatable message, so INT64_FORMAT is no good -- we need
> something that is the same across platforms.  The current project standard
> for this problem is to use "%lld" and explicitly cast the argument to long
> long int to match that.

Thank you for pointing that out,
and sorry for not thinking of it.

Yes, we can use %lld, (long long int) timeout.
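
For illustration, a minimal sketch of the adjusted message (the wording is
taken from the hunk quoted above):

    /* %lld plus an explicit cast keeps the format portable and translatable */
    ereport(WARNING,
            (errmsg("could not wait for the termination of the backend with PID %d within %lld milliseconds",
                    pid, (long long int) timeout)));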

Best regards,
houzj






Renaming cryptohashes.c to cryptohashfuncs.c

2020-12-02 Thread Michael Paquier
Hi all,

Now that the tree has a new set of files for cryptohash functions as
of src/common/cryptohash*.c, it is a bit weird to have another file
called cryptohashes.c for the set of SQL functions.

Any objections to rename that to cryptohashfuncs.c?  That would be
much more consistent with the surroundings (cleaning up anything
related to MD5 is on my TODO list).  Patch is attached.

Thoughts?
--
Michael
From d84ccefbf13f44c352d985dd87eadcc998cfb7b9 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 3 Dec 2020 11:02:39 +0900
Subject: [PATCH] Rename cryptohashes.c to cryptohashfuncs.c

---
 src/backend/utils/adt/Makefile  | 2 +-
 src/backend/utils/adt/{cryptohashes.c => cryptohashfuncs.c} | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)
 rename src/backend/utils/adt/{cryptohashes.c => cryptohashfuncs.c} (98%)

diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index b4d55e849b..f6ec7b64cd 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -22,7 +22,7 @@ OBJS = \
 	bool.o \
 	cash.o \
 	char.o \
-	cryptohashes.o \
+	cryptohashfuncs.o \
 	date.o \
 	datetime.o \
 	datum.o \
diff --git a/src/backend/utils/adt/cryptohashes.c b/src/backend/utils/adt/cryptohashfuncs.c
similarity index 98%
rename from src/backend/utils/adt/cryptohashes.c
rename to src/backend/utils/adt/cryptohashfuncs.c
index 5de294a7fd..47bc0b3482 100644
--- a/src/backend/utils/adt/cryptohashes.c
+++ b/src/backend/utils/adt/cryptohashfuncs.c
@@ -1,13 +1,13 @@
 /*-
  *
- * cryptohashes.c
+ * cryptohashfuncs.c
  *	  Cryptographic hash functions
  *
  * Portions Copyright (c) 2018-2020, PostgreSQL Global Development Group
  *
  *
  * IDENTIFICATION
- *	  src/backend/utils/adt/cryptohashes.c
+ *	  src/backend/utils/adt/cryptohashfuncs.c
  *
  *-
  */
-- 
2.29.2



signature.asc
Description: PGP signature


Re: A new function to wait for the backend exit after termination

2020-12-02 Thread Tom Lane
"Hou, Zhijie"  writes:
> + ereport(WARNING,
> + (errmsg("could not wait for the termination of the 
> backend with PID %d within %ld milliseconds",
> + pid, timeout)));

> The code uses %ld to print an int64 value.
> How about using INT64_FORMAT, which looks more appropriate?

This is a translatable message, so INT64_FORMAT is no good -- we need
something that is the same across platforms.  The current project standard
for this problem is to use "%lld" and explicitly cast the argument to long
long int to match that.

regards, tom lane




RE: A new function to wait for the backend exit after termination

2020-12-02 Thread Hou, Zhijie
Hi

I took a look at the patch, and here are some comments.

1.
+
+   ereport(WARNING,
+   (errmsg("could not wait for the termination of the 
backend with PID %d within %ld milliseconds",
+   pid, timeout)));
+

The code uses %ld to print an int64 value.
How about using INT64_FORMAT, which looks more appropriate?

2.
+   if (timeout <= 0)
+   {
+   ereport(WARNING,
+   (errmsg("timeout cannot be negative or zero: 
%ld", timeout)));
+   PG_RETURN_BOOL(r);
+   }

The same as 1.

3.
+pg_terminate_backend_and_wait(PG_FUNCTION_ARGS)
+{
+   int pid = PG_GETARG_DATUM(0);

+pg_wait_backend(PG_FUNCTION_ARGS)
+{
+   int pid = PG_GETARG_INT32(0);

The code uses different macros to get the pid.
How about using PG_GETARG_INT32(0) in both?
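
A minimal sketch of that suggestion (the function body is elided and the
argument layout is assumed from the quoted hunks):

    Datum
    pg_terminate_backend_and_wait(PG_FUNCTION_ARGS)
    {
        int         pid = PG_GETARG_INT32(0);   /* same macro in both functions */

        /* ... terminate the backend and wait for it to exit ... */
        PG_RETURN_BOOL(true);
    }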


I changed the status to 'Waiting on Author'.
The rest of the patch LGTM;
I think it can be changed to Ready for Committer again once this comment is
addressed.


Best regards,
houzj








Re: Deprecate custom encoding conversions

2020-12-02 Thread Michael Paquier
On Thu, Dec 03, 2020 at 12:54:56AM +, tsunakawa.ta...@fujitsu.com wrote:
> I can't answer deeper questions because I'm not familiar with
> character sets, but I saw some Japanese web articles that use CREATE
> CONVERSION to handle external characters.  So, I think we may as
> well retain it.  (OTOH, I'm wondering how other DBMSs support
> external characters and how Postgres should implement it.)

Tsunakawa-san, could you post a link to this article, if possible?  I
am curious about their problem and why they used CREATE CONVERSION as
a way to solve it.  That's fine even if it is in Japanese.
--
Michael


signature.asc
Description: PGP signature


Re: scram-sha-256 broken with FIPS and OpenSSL 1.0.2

2020-12-02 Thread Michael Paquier
On Wed, Dec 02, 2020 at 12:03:49PM +0900, Michael Paquier wrote:
> Thanks.  0001 has been applied and the buildfarm does not complain, so
> it looks like we are good (I'll take care of any issues, like the one
> Fujii-san has just reported).  Attached are new patches for 0002, the
> EVP switch.  One thing I noticed is that we need to free the backup
> manifest a bit earlier once we begin to use resource owner in
> basebackup.c as there is a specific step that may do a double-free.
> This would not happen when not using OpenSSL or on HEAD.  It would be
> easy to separate the resowner and cryptohash portions of the patch
> here, but both are tightly linked, so I'd prefer to keep them
> together.

Attached is a rebased version to take care of the conflicts introduced
by 91624c2f.
--
Michael
From a68819b3f843b4ce2883c35c008d5dd4fb47ee35 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 2 Dec 2020 11:51:43 +0900
Subject: [PATCH v8] Switch cryptohash_openssl.c to use EVP

Postgres is two decades late for this switch.
---
 src/include/utils/resowner_private.h  |   7 ++
 src/backend/replication/basebackup.c  |   8 +-
 src/backend/utils/resowner/resowner.c |  62 
 src/common/cryptohash_openssl.c   | 132 --
 src/tools/pgindent/typedefs.list  |   1 +
 5 files changed, 158 insertions(+), 52 deletions(-)

diff --git a/src/include/utils/resowner_private.h b/src/include/utils/resowner_private.h
index a781a7a2aa..c373788bc1 100644
--- a/src/include/utils/resowner_private.h
+++ b/src/include/utils/resowner_private.h
@@ -95,4 +95,11 @@ extern void ResourceOwnerRememberJIT(ResourceOwner owner,
 extern void ResourceOwnerForgetJIT(ResourceOwner owner,
    Datum handle);
 
+/* support for cryptohash context management */
+extern void ResourceOwnerEnlargeCryptoHash(ResourceOwner owner);
+extern void ResourceOwnerRememberCryptoHash(ResourceOwner owner,
+			Datum handle);
+extern void ResourceOwnerForgetCryptoHash(ResourceOwner owner,
+		  Datum handle);
+
 #endif			/* RESOWNER_PRIVATE_H */
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 22be7ca9d5..79b458c185 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -729,11 +729,17 @@ perform_base_backup(basebackup_options *opt)
  errmsg("checksum verification failure during base backup")));
 	}
 
+	/*
+	 * Make sure to free the manifest before the resource owners as
+	 * manifests use cryptohash contexts that may depend on resource
+	 * owners (like OpenSSL).
+	 */
+	FreeBackupManifest(&manifest);
+
 	/* clean up the resource owner we created */
 	WalSndResourceCleanup(true);
 
 	pgstat_progress_end_command();
-	FreeBackupManifest(&manifest);
 }
 
 /*
diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c
index 8bc2c4e9ea..546ad8d1c5 100644
--- a/src/backend/utils/resowner/resowner.c
+++ b/src/backend/utils/resowner/resowner.c
@@ -20,6 +20,7 @@
  */
 #include "postgres.h"
 
+#include "common/cryptohash.h"
 #include "common/hashfn.h"
 #include "jit/jit.h"
 #include "storage/bufmgr.h"
@@ -128,6 +129,7 @@ typedef struct ResourceOwnerData
 	ResourceArray filearr;		/* open temporary files */
 	ResourceArray dsmarr;		/* dynamic shmem segments */
 	ResourceArray jitarr;		/* JIT contexts */
+	ResourceArray cryptohasharr;	/* cryptohash contexts */
 
 	/* We can remember up to MAX_RESOWNER_LOCKS references to local locks. */
 	int			nlocks;			/* number of owned locks */
@@ -175,6 +177,7 @@ static void PrintTupleDescLeakWarning(TupleDesc tupdesc);
 static void PrintSnapshotLeakWarning(Snapshot snapshot);
 static void PrintFileLeakWarning(File file);
 static void PrintDSMLeakWarning(dsm_segment *seg);
+static void PrintCryptoHashLeakWarning(Datum handle);
 
 
 /*
@@ -444,6 +447,7 @@ ResourceOwnerCreate(ResourceOwner parent, const char *name)
 	ResourceArrayInit(&(owner->filearr), FileGetDatum(-1));
 	ResourceArrayInit(&(owner->dsmarr), PointerGetDatum(NULL));
 	ResourceArrayInit(&(owner->jitarr), PointerGetDatum(NULL));
+	ResourceArrayInit(&(owner->cryptohasharr), PointerGetDatum(NULL));
 
 	return owner;
 }
@@ -553,6 +557,17 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
 
 			jit_release_context(context);
 		}
+
+		/* Ditto for cryptohash contexts */
+		while (ResourceArrayGetAny(&(owner->cryptohasharr), &foundres))
+		{
+			pg_cryptohash_ctx *context =
+			(pg_cryptohash_ctx *) PointerGetDatum(foundres);
+
+			if (isCommit)
+PrintCryptoHashLeakWarning(foundres);
+			pg_cryptohash_free(context);
+		}
 	}
 	else if (phase == RESOURCE_RELEASE_LOCKS)
 	{
@@ -725,6 +740,7 @@ ResourceOwnerDelete(ResourceOwner owner)
 	Assert(owner->filearr.nitems == 0);
 	Assert(owner->dsmarr.nitems == 0);
 	Assert(owner->jitarr.nitems == 0);
+	Assert(owner->cryptohasharr.nitems == 0);
 	Assert(owner->nlocks == 0 || owner->

Re: Deprecate custom encoding conversions

2020-12-02 Thread Tom Lane
"tsunakawa.ta...@fujitsu.com"  writes:
> From: Heikki Linnakangas 
>> Any objections? Anyone using custom encoding conversions in production?

> I can't answer deeper questions because I'm not familiar with character sets, 
> but I saw some Japanese web articles that use CREATE CONVERSION to handle 
> external characters.  So, I think we may as well retain it.  (OTOH, I'm 
> wondering how other DBMSs support external characters and how Postgres
> should implement it.)

I recall a discussion in which someone explained that things are messy for
Japanese because there's not really just one version of SJIS; there are
several, because various groups extended the original standard in not-
quite-compatible ways.  In turn this means that there's not really one
well-defined conversion to/from Unicode --- which is the reason why
allowing custom conversions seemed like a good idea.  I don't know
whether anyone over there is actually using custom conversions, but
I'd be hesitant to just nuke the feature altogether.

Having said that, it doesn't seem like the conversion API is necessarily
any more set in stone than any of our other C-level APIs.  We break things
at the C API level whenever there's sufficient reason.

regards, tom lane




Re: Get memory contexts of an arbitrary backend process

2020-12-02 Thread Tom Lane
Fujii Masao  writes:
> I'm starting to study how this feature behaves. At first, when I executed
> the following query, the function never returned. ISTM that since
> the autovacuum launcher cannot respond to the request of memory
> contexts dump, the function keeps waiting infinitely. Is this a bug?
> Probably we should exclude non-backend processes from the target
> processes to dump? Sorry if this was already discussed.

>  SELECT pg_get_backend_memory_contexts(pid) FROM pg_stat_activity;

FWIW, I think this patch is fundamentally unsafe.  It's got a
lot of the same problems that I complained about w.r.t. the
nearby proposal to allow cross-backend stack trace dumping.
It does avoid the trap of thinking that it can do work in
a signal handler, but instead it supposes that it can do
work involving very high-level objects such as shared hash tables
in anyplace that might execute CHECK_FOR_INTERRUPTS.  That's
never going to be safe: the only real expectation the system
has is that CHECK_FOR_INTERRUPTS is called at places where our
state is sane enough that a transaction abort can clean up.
Trying to do things like taking LWLocks is going to lead to
deadlocks or worse.  We need not even get into the hard questions,
such as what happens when one process or the other exits
unexpectedly.

I also find the idea that this should be the same SQL function
as pg_get_backend_memory_contexts to be a seriously bad decision.
That means that it's not possible to GRANT the right to examine
only your own process's memory --- with this proposal, that means
granting the right to inspect every other process as well.

Beyond that, the fact that there's no way to restrict the capability
to just, say, other processes owned by the same user means that
it's not really safe to GRANT to non-superusers anyway.  Even with
such a restriction added, things are problematic, since for example
it would be possible to inquire into the workings of a
security-definer function executing in another process that
nominally is owned by your user.

Between the security and implementation issues here, I really
think we'd be best advised to just reject the concept, period.

regards, tom lane




Re: Huge memory consumption on partitioned table with FKs

2020-12-02 Thread Keisuke Kuroda
Hi Hackers,

>> I would embed all this knowledge in ri_BuildQueryKey though, without
>> adding the new function ri_GetParentConstOid.  I don't think that
>> function has meaningful abstraction value, and instead it would make what I
>> suggest more difficult.

> It seems to me reasonable.

Indeed, I tried to get the conparentid with ri_BuildQueryKey.
(v2_reduce_ri_SPI-plan-hash.patch)

> Somewhat of a detour, but in reviewing the patch for
> Statement-Level RI checks, Andres and I observed that SPI
> made for a large portion of the RI overhead.

If this can be resolved by reviewing the Statement-Level RI checks,
then I think it would be better.

> BTW, one problem with Kuroda-san's patch is that it's using
> conparentid as the shared key, which can still lead to multiple
> queries being generated when using multi-level partitioning, because
> there would be many (intermediate) parent constraints in that case.

As Horiguchi-san also mentioned,
in my patch ri_GetParentConstOid iterates on getting
the constraint OID until conparentid == InvalidOid.
This should allow it to get the "root constraint OID".

However, the names "conparentid" and "ri_GetParentConstOid"
are confusing. We should use names like "constraint_root_id",
as in Amit-san's patch.
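
To make the "root constraint OID" idea concrete, here is a hedged sketch of
such a climb over pg_constraint.conparentid using the syscache (the function
name is made up for illustration and error handling is kept minimal):

    static Oid
    get_root_constraint_oid(Oid constrOid)
    {
        for (;;)
        {
            HeapTuple   tuple;
            Oid         parentOid;

            tuple = SearchSysCache1(CONSTROID, ObjectIdGetDatum(constrOid));
            if (!HeapTupleIsValid(tuple))
                elog(ERROR, "cache lookup failed for constraint %u", constrOid);
            parentOid = ((Form_pg_constraint) GETSTRUCT(tuple))->conparentid;
            ReleaseSysCache(tuple);

            if (!OidIsValid(parentOid))
                break;          /* no parent, so constrOid is the root */
            constrOid = parentOid;
        }
        return constrOid;
    }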

Best Regards,
-- 
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3...@gmail.com


v2_reduce_ri_SPI-plan-hash.patch
Description: Binary data


Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-12-02 Thread Michael Paquier
On Tue, Dec 01, 2020 at 03:10:13PM +0900, Michael Paquier wrote:
> Well, my impression is that both of you kept exchanging patches,
> touching and reviewing each other's patch (note that Alexei has also
> sent a rebase of 0002 just yesterday), so I think that it is fair to
> say that both of you should be listed as authors and credited as such
> in the release notes for this one.

OK, this one is now committed.  As of this thread, I think that we are
going to need to do a bit more once we add options that are not just
booleans for both commands (the grammar rules do not need to be
changed now):
- Have a ReindexParams, similarly to VacuumParams except that we store
the results of the parsing in a single place.  With the current HEAD,
I did not see yet the point in doing so because we just need an
integer that maps to a bitmask made of ReindexOption.
- The part related to ReindexStmt in utility.c is getting more and
more complicated, so we could move most of the execution into
indexcmds.c with some sort of ExecReindex() doing the option parsing
job, and go to the correct code path depending on the object type
dealt with.
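
A rough sketch of what that could look like (the struct members and the
ExecReindex() signature below are assumptions for illustration, mirroring
VacuumParams, not finished code):

    /* collect parsed REINDEX options in one place, as VacuumParams does */
    typedef struct ReindexParams
    {
        bits32      options;        /* bitmask of ReindexOption flags */
        Oid         tablespaceOid;  /* example of a future non-boolean option */
    } ReindexParams;

    /* utility.c would only dispatch; parsing and execution move to indexcmds.c */
    extern void ExecReindex(ParseState *pstate, ReindexStmt *stmt, bool isTopLevel);
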
--
Michael


signature.asc
Description: PGP signature


Re: Huge memory consumption on partitioned table with FKs

2020-12-02 Thread Kyotaro Horiguchi
At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote  wrote 
in 
> Hello,
> 
> On Tue, Dec 1, 2020 at 9:40 AM Kyotaro Horiguchi
>  wrote:
> >
> > At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera 
> >  wrote in
> > > On 2020-Nov-26, Kyotaro Horiguchi wrote:
> > >
> > > > This shares RI_ConstraintInfo cache by constraints that share the
> > > > same parent constraints. But you forgot that the cache contains some
> > > > members that can differ among partitions.
> > > >
> > > > Consider the case of attaching a partition that has experienced a
> > > > column deletion.
> > >
> > > I think this can be solved easily in the patch, by having
> > > ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
> > > they are equal then use the parent's constraint_id, otherwise use the
> > > child constraint.  That way, the cache entry is reused in the common
> > > case where they are identical.
> >
> > *I* think it's the direction.  After an off-list discussion, we
> >  confirmed that even in that case the patch works as is because
> >  fk_attnum (or contuple.conkey) always stores key attnums compatible
> >  to the topmost parent when conparent has a valid value (assuming the
> >  current usage of fk_attnum), but I still feel uneasy to rely on that
> >  unclear behavior.
> 
> Hmm, I don't see what the problem is here, because it's not
> RI_ConstraintInfo that is being shared among partitions of the same
> parent, but the RI query (and the SPIPlanPtr for it) that is issued
> through SPI.  The query (and the plan) turns out to be the same no
> matter what partition's RI_ConstraintInfo is first used to generate
> it.  What am I missing?

I don't think you're missing something. As I (tried to..) mentioned in
another branch of this thread, you're right. On the other hand,
fk_attnums is actually used to generate the query, thus the *issue*
potentially affects the query.  Although that might be paranoid, that
possibility is eliminated by checking the congruency(?) of fk_attnums
(or other members).  We might even be able to share a riinfo among
such children.

> BTW, one problem with Kuroda-san's patch is that it's using
> conparentid as the shared key, which can still lead to multiple
> queries being generated when using multi-level partitioning, because
> there would be many (intermediate) parent constraints in that case.
> We should really be using the "root" constraint OID as the key,
> because there would only be one root from which all constraints in a
> given partition hierarchy would've originated.  Actually, I had
> written a patch a few months back to do exactly that, a polished
> version of which I've attached with this email.  Please take a look.

I don't like that the function self-recurses but
ri_GetParentConstrOid() actually climbs up to the root constraint, so
the patch covers the multilevel partitioning cases.

About your patch, it calculates the root constrid at the time an
riinfo is created, but when the root-partition is further attached to
another partitioned-table after the riinfo creation,
constraint_root_id gets stale.  Of course that doesn't matter
practically, though.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Add docs stub for recovery.conf

2020-12-02 Thread Bruce Momjian
On Wed, Dec  2, 2020 at 08:07:47PM -0500, Isaac Morland wrote:
> On Wed, 2 Dec 2020 at 19:33, David G. Johnston 
> wrote:
> 
> On Wed, Dec 2, 2020 at 5:26 PM Bruce Momjian  wrote:
> 
> I think the ideal solution is to create a section for all the rename
> cases and do all the redirects to that page.  The page would list the
> old and new name for each item, and would link to the section for each
> new item.
> 
> 
> 
> Nothing prevents us from doing that for simple renames.  For me, this
> situation is not a simple rename and the proposed solution is appropriate
> for what it is - changing the implementation details of an existing
> feature.  We can do both - though the simple rename page doesn't seem
> particularly appealing at first glance.
> 
> 
> I for one do not like following a bookmark or link and then being redirected 
> to
> a generic page that doesn't relate to the specific link I was following. What
> is being proposed here is not as bad as the usual, where all the old links
> simply turn into redirects to the homepage, but it's still disorienting. I
> would much rather each removed page be moved to an appendix (without renaming)
> and edited to briefly explain what happened to the page and provide links to
> the appropriate up-to-date page or pages.

Yes, that is pretty much the same thing I was suggesting, except that
each rename has its own _original_ URL link, which I think is also
acceptable.  My desire is for these items to all exist in one place, and
an appendix of them seems fine.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Get memory contexts of an arbitrary backend process

2020-12-02 Thread Fujii Masao




On 2020/11/16 19:58, torikoshia wrote:

On 2020-10-28 15:32, torikoshia wrote:

On 2020-10-23 13:46, Kyotaro Horiguchi wrote:



I think we might need to step-back to basic design of this feature
since this patch seems to have unhandled corner cases that are
difficult to find.


I've written out the basic design below and attached
corresponding patch.


I'm starting to study how this feature behaves. At first, when I executed
the following query, the function never returned. ISTM that since
the autovacuum launcher cannot respond to the request of memory
contexts dump, the function keeps waiting infinitely. Is this a bug?
Probably we should exclude non-backend processes from the target
processes to dump? Sorry if this was already discussed.

SELECT pg_get_backend_memory_contexts(pid) FROM pg_stat_activity;


Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Add docs stub for recovery.conf

2020-12-02 Thread Isaac Morland
On Wed, 2 Dec 2020 at 19:33, David G. Johnston 
wrote:

> On Wed, Dec 2, 2020 at 5:26 PM Bruce Momjian  wrote:
>
>> I think the ideal solution is to create a section for all the rename
>> cases and do all the redirects to that page.  The page would list the
>> old and new name for each item, and would link to the section for each
>> new item.
>>
>>
> Nothing prevents us from doing that for simple renames.  For me, this
> situation is not a simple rename and the proposed solution is appropriate
> for what it is - changing the implementation details of an existing
> feature.  We can do both - though the simple rename page doesn't seem
> particularly appealing at first glance.
>

I for one do not like following a bookmark or link and then being
redirected to a generic page that doesn't relate to the specific link I was
following. What is being proposed here is not as bad as the usual, where
all the old links simply turn into redirects to the homepage, but it's
still disorienting. I would much rather each removed page be moved to an
appendix (without renaming) and edited to briefly explain what happened to
the page and provide links to the appropriate up-to-date page or pages.


Re: pg_stat_statements oddity with track = all

2020-12-02 Thread Julien Rouhaud
On Wed, Dec 02, 2020 at 06:23:54AM -0800, Nikolay Samokhvalov wrote:
> On Tue, Dec 1, 2020 at 10:32 PM Julien Rouhaud  wrote:
> 
> > On Tue, Dec 01, 2020 at 10:08:06PM -0800, Nikolay Samokhvalov wrote:
> > > If all top-level records in pg_stat_statements have "true" in the new
> > > column (is_toplevel), how would this lead to the need to increase
> > > pg_stat_statements.max? The number of records would remain the same, as
> > > before extending pg_stat_statements.
> >
> > If the same query is getting executed both at top level and as a nested
> > statement, two entries will then be created.  That's probably unlikely for
> > things like RI trigger queries, but I don't know what to expect for client
> > application queries.
> >
> 
> Right, but this is how things already work. The extra field you've proposed
> won't increase the number of records so it shouldn't affect how users
> choose pg_stat_statements.max.

The extra field I've proposed would increase the number of records, as it needs
to be a part of the key.  The only alternative would be what Fufi-san
mentioned, i.e. to split plans and calls (for instance plans_toplevel,
plans_nested, calls_toplevel, calls_nested) and let users compute an
approximate value for toplevel counters.
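
For reference, a sketch of why the entry count can grow: the flag would become
part of the hash key itself (the existing fields are recalled from memory here,
so treat the exact layout as illustrative):

    typedef struct pgssHashKey
    {
        Oid         userid;         /* user OID */
        Oid         dbid;           /* database OID */
        uint64      queryid;        /* query identifier */
        bool        toplevel;       /* proposed: executed at top level? */
    } pgssHashKey;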




RE: Deprecate custom encoding conversions

2020-12-02 Thread tsunakawa.ta...@fujitsu.com
From: Heikki Linnakangas 
> I propose that we add a notice to the CREATE CONVERSION docs to say that
> it is deprecated, and remove it in a few years.
> 
> Any objections? Anyone using custom encoding conversions in production?

I can't answer deeper questions because I'm not familiar with character sets, 
but I saw some Japanese web articles that use CREATE CONVERSION to handle 
external characters.  So, I think we may as well retain it.  (OTOH, I'm 
wondering how other DBMSs support external characters and how Postgres should
implement it.)

Also, the SQL standard has CREATE CHARACTER SET and CREATE TRANSLATION.  I 
don't know how these are useful, but the mechanism of CREATE CONVERSION can be 
used to support them.


CREATE CHARACTER SET <character set name> [ AS ]
<character set source> [ <collate clause> ]

CREATE TRANSLATION <transliteration name> FOR <source character set specification>
TO <target character set specification> FROM <transliteration source>


Regards
Takayuki Tsunakawa



Re: Add docs stub for recovery.conf

2020-12-02 Thread David G. Johnston
On Wed, Dec 2, 2020 at 5:26 PM Bruce Momjian  wrote:

> I think the ideal solution is to create a section for all the rename
> cases and do all the redirects to that page.  The page would list the
> old and new name for each item, and would link to the section for each
> new item.
>
>
Nothing prevents us from doing that for simple renames.  For me, this
situation is not a simple rename and the proposed solution is appropriate
for what it is - changing the implementation details of an existing
feature.  We can do both - though the simple rename page doesn't seem
particularly appealing at first glance.

David J.


Re: Add docs stub for recovery.conf

2020-12-02 Thread Bruce Momjian
On Wed, Dec  2, 2020 at 05:57:01PM -0500, Stephen Frost wrote:
> * Bruce Momjian (br...@momjian.us) wrote:
> > We were not going to use just redirects --- we were going to create a
> > page that had all the renames listed, with links to the new names.
> 
> Maybe I'm the one who is confused here, but I thought there was
> objection to adding a new section/page which covers these topics (which
> is what Craig's original patch does)...?  If there isn't an objection to
> that then it seems like we should move forward with it.
> 
> If I'm following correctly, maybe there was some idea that we should
> have more things added to this section than just the recovery.conf bits,
> and perhaps we should, but that could certainly be done later.  To be
> clear though, I don't think we need to do this in all cases- the
> existing flow for pg_xlogdump -> pg_waldump works pretty well.  Maybe we
> add in a note here too if someone wants to but I don't think it's
> strictly necessary for the 'simple' rename cases.
> 
> I also feel like that could be done once the section gets added, if
> someone wants to.
> 
> Was there something else that I'm missing here in terms of what the
> concern is regarding Craig's patch..?

I think the ideal solution is to create a section for all the rename
cases and do all the redirects to that page.  The page would list the
old and new name for each item, and would link to the section for each
new item.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: SELECT INTO deprecation

2020-12-02 Thread Michael Paquier
On Wed, Dec 02, 2020 at 03:35:46PM -0500, Tom Lane wrote:
> Yeah, if we want to kill it let's just do so.  The negative language in
> the reference page has been there since (at least) 7.1, so people can
> hardly say they didn't see it coming.

+1.  I got to wonder about the impact when migrating applications
though.  SELECT INTO has a different meaning in Oracle, but SQL server
creates a new table like Postgres.
--
Michael


signature.asc
Description: PGP signature


Re: Commitfest 2020-11 is closed

2020-12-02 Thread Josef Šimánek
On Wed, Dec 2, 2020 at 11:36 PM Andrew Dunstan  wrote:
>
>
> On 12/2/20 5:13 PM, Thomas Munro wrote:
> >
> > I'm experimenting with Github's built in CI.  All other ideas welcome.
>
>
>
> I'd look very closely at gitlab.

I was about to respond with the same idea. Feel free to ping me
regarding another CI. There is also the possibility of moving to GitHub
Actions (free CI for open source). I have some experience with that as
well.

>
> cheers
>
>
> andrew
>
>
>




Re: Add docs stub for recovery.conf

2020-12-02 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Wed, Dec  2, 2020 at 02:47:13PM -0500, Stephen Frost wrote:
> > * David G. Johnston (david.g.johns...@gmail.com) wrote:
> > > On Mon, Nov 30, 2020 at 11:42 AM Bruce Momjian  wrote:
> > > > The downside is you end up with X-1 dummy sections just to allow for
> > > > references to old syntax, and you then have to find them all and remove
> > > > them when you implement the proper solution.  I have no intention of
> > > > applying such an X-1 fix.
> > >
> > > X = 2; seems like a strong objection for such a minor issue.  The status
> > > quo seems worse than that.
> > 
> > I've been thinking about this and I think I'm on Craig and David's side-
> > having something cleaner, and clearer, than just http redirects and such
> > would be good for these cases and I don't think we are going to end up
> > with so many of them that it ends up becoming an issue.
> 
> We were not going to use just redirects --- we were going to create a
> page that had all the renames listed, with links to the new names.

Maybe I'm the one who is confused here, but I thought there was
objection to adding a new section/page which covers these topics (which
is what Craig's original patch does)...?  If there isn't an objection to
that then it seems like we should move forward with it.

If I'm following correctly, maybe there was some idea that we should
have more things added to this section than just the recovery.conf bits,
and perhaps we should, but that could certainly be done later.  To be
clear though, I don't think we need to do this in all cases- the
existing flow for pg_xlogdump -> pg_waldump works pretty well.  Maybe we
add in a note here too if someone wants to but I don't think it's
strictly necessary for the 'simple' rename cases.

I also feel like that could be done once the section gets added, if
someone wants to.

Was there something else that I'm missing here in terms of what the
concern is regarding Craig's patch..?

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: pg_dump, ATTACH, and independently restorable child partitions

2020-12-02 Thread Justin Pryzby
On Wed, Nov 25, 2020 at 06:35:19PM -0500, Tom Lane wrote:
> Justin Pryzby  writes:
> > 1. Maybe pg_restore ExecuteSqlCommandBuf() should (always?) call
> > ExecuteSimpleCommands() instead of ExecuteSqlCommand().  It doesn't seem to
> > break anything (although that surprised me).
> 
> That certainly does break everything, which I imagine is the reason
> why the cfbot shows that this patch is failing the pg_upgrade tests.

Thanks for looking, I have tried this.

> What we'd need to do if we want this to work with direct-to-DB restore
> is to split off the ATTACH PARTITION command as a separate TOC entry.
> That doesn't seem amazingly difficult, and it would even offer the
> possibility that you could extract the partition standalone without
> having to ignore errors.  (You'd use -l/-L to select the CREATE TABLE,
> the data, etc, but not the ATTACH object.)


-- 
Justin
>From 0d0b013930199158306fd295eebbf36c4c4cb5b4 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 26 Nov 2020 22:02:07 -0600
Subject: [PATCH v4] pg_dump: output separate "object" for ALTER TABLE..ATTACH
 PARTITION

..allowing table partitions to be restored separately and independently, even
if there's an error attaching to parent table.
---
 src/bin/pg_dump/common.c   | 28 +
 src/bin/pg_dump/pg_dump.c  | 76 --
 src/bin/pg_dump/pg_dump.h  |  8 
 src/bin/pg_dump/pg_dump_sort.c | 48 +++--
 4 files changed, 118 insertions(+), 42 deletions(-)

diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 634ca86cfb..8450e22327 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -320,6 +320,34 @@ flagInhTables(Archive *fout, TableInfo *tblinfo, int numTables,
 			for (j = 0; j < numParents; j++)
 parents[j]->interesting = true;
 		}
+
+		if (tblinfo[i].dobj.dump && tblinfo[i].ispartition)
+		{
+			/* We don't even need to store this anywhere, but maybe there's
+			 * some utility in making a single, large allocation earlier ? */
+			TableAttachInfo *attachinfo = palloc(sizeof(*attachinfo));
+
+			attachinfo->dobj.objType = DO_TABLE_ATTACH;
+			attachinfo->dobj.catId.tableoid = 0;
+			attachinfo->dobj.catId.oid = 0;
+			AssignDumpId(&attachinfo->dobj);
+			attachinfo->dobj.name = pg_strdup(tblinfo[i].dobj.name);
+			attachinfo->dobj.namespace = tblinfo[i].dobj.namespace;
+			attachinfo->parentTbl = tblinfo[i].parents[0];
+			attachinfo->partitionTbl = &tblinfo[i];
+
+			/*
+			 * We must state the DO_TABLE_ATTACH object's dependencies
+			 * explicitly, since it will not match anything in pg_depend.
+			 *
+			 * Give it dependencies on both the partition table and the parent
+			 * table, so that it will not be executed till both of those
+			 * exist.  (There's no need to care what order those are created
+			 * in.)
+			 */
+			addObjectDependency(&attachinfo->dobj, tblinfo[i].dobj.dumpId);
+			addObjectDependency(&attachinfo->dobj, tblinfo[i].parents[0]->dobj.dumpId);
+		}
 	}
 }
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index dc1d41dd8d..f66182bc73 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -201,6 +201,7 @@ static void dumpAgg(Archive *fout, AggInfo *agginfo);
 static void dumpTrigger(Archive *fout, TriggerInfo *tginfo);
 static void dumpEventTrigger(Archive *fout, EventTriggerInfo *evtinfo);
 static void dumpTable(Archive *fout, TableInfo *tbinfo);
+static void dumpTableAttach(Archive *fout, TableAttachInfo *tbinfo);
 static void dumpTableSchema(Archive *fout, TableInfo *tbinfo);
 static void dumpAttrDef(Archive *fout, AttrDefInfo *adinfo);
 static void dumpSequence(Archive *fout, TableInfo *tbinfo);
@@ -10110,6 +10111,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_TABLE:
 			dumpTable(fout, (TableInfo *) dobj);
 			break;
+		case DO_TABLE_ATTACH:
+			dumpTableAttach(fout, (TableAttachInfo *) dobj);
+			break;
 		case DO_ATTRDEF:
 			dumpAttrDef(fout, (AttrDefInfo *) dobj);
 			break;
@@ -15573,6 +15577,55 @@ createDummyViewAsClause(Archive *fout, TableInfo *tbinfo)
 	return result;
 }
 
+/*
+ * dumpTableAttach
+ *	  write to fout the commands to attach a child partition
+ *
+ * For partitioned tables, emit the ATTACH PARTITION clause.  Note
+ * that we always want to create partitions this way instead of using
+ * CREATE TABLE .. PARTITION OF, mainly to preserve a possible column
+ * layout discrepancy with the parent, but also to ensure it gets the
+ * correct tablespace setting if it differs from the parent's.
+ */
+static void
+dumpTableAttach(Archive *fout, TableAttachInfo *attachinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	char	   *qualrelname;
+	PQExpBuffer q;
+
+	if (dopt->dataOnly)
+		return;
+
+	if (attachinfo->partitionTbl->dobj.dump & DUMP_COMPONENT_DEFINITION == 0)
+		return;
+
+	/* With partitions there can only be one parent */
+	if (attachinfo->partitionTbl->numParents != 1)
+		fatal("invalid number of parents %d for table \"%s\"",

Re: Commitfest 2020-11 is closed

2020-12-02 Thread Andrew Dunstan


On 12/2/20 5:13 PM, Thomas Munro wrote:
>
> I'm experimenting with Github's built in CI.  All other ideas welcome.



I'd look very closely at gitlab.


cheers


andrew





Re: Commitfest 2020-11 is closed

2020-12-02 Thread David Steele

On 12/2/20 5:13 PM, Thomas Munro wrote:

On Thu, Dec 3, 2020 at 10:51 AM David Steele  wrote:

On 12/2/20 4:15 PM, Thomas Munro wrote:

On Thu, Dec 3, 2020 at 10:00 AM Tom Lane  wrote:

This is actually a bit problematic, because now the cfbot is ignoring
those patches (or if it's not, I don't know where it's displaying the
results).  Please go ahead and move the remaining open patches, or
else re-open the CF if that's possible.


As of quite recently, Travis CI doesn't seem to like cfbot's rate of
build jobs.  Recently it's been taking a very long time to post
results for new patches and taking so long to get around to retesting
patches for bitrot that the my "delete old results after a week" logic
started wiping out some results while they are still the most recent,
leading to the blank spaces where the results are supposed to be.
D'oh. I'm looking into a couple of options.


pgBackRest test runs have gone from ~17 minutes to 3-6 hours over the
last two months.


Ouch.


Yeah.


Also keep in mind that travis-ci.org will be shut down at the end of the
month. Some users who have migrated to travis-ci.com have complained
that it is not any faster, but I have not tried myself (yet).


Oh.


Yeah.


Depending on how you have Github organized migrating to travis-ci.com
may be bit tricky because it requires full access to all private
repositories in your account and orgs administrated by your account.


I'm experimenting with Github's built in CI.  All other ideas welcome.


We're leaning towards Github actions ourselves. The only thing that 
makes us want to stay with Travis (at least for some tests) is the 
support for the ppc64le, arm64, and s390x architectures. s390x in 
particular since it is the only big-endian architecture we have access to.


Regards,
--
-David
da...@pgmasters.net




Re: Commitfest 2020-11 is closed

2020-12-02 Thread Thomas Munro
On Thu, Dec 3, 2020 at 10:51 AM David Steele  wrote:
> On 12/2/20 4:15 PM, Thomas Munro wrote:
> > On Thu, Dec 3, 2020 at 10:00 AM Tom Lane  wrote:
> >> This is actually a bit problematic, because now the cfbot is ignoring
> >> those patches (or if it's not, I don't know where it's displaying the
> >> results).  Please go ahead and move the remaining open patches, or
> >> else re-open the CF if that's possible.
> >
> > As of quite recently, Travis CI doesn't seem to like cfbot's rate of
> > build jobs.  Recently it's been taking a very long time to post
> > results for new patches and taking so long to get around to retesting
> > patches for bitrot that the my "delete old results after a week" logic
> > started wiping out some results while they are still the most recent,
> > leading to the blank spaces where the results are supposed to be.
> > D'oh. I'm looking into a couple of options.
>
> pgBackRest test runs have gone from ~17 minutes to 3-6 hours over the
> last two months.

Ouch.

> Also keep in mind that travis-ci.org will be shut down at the end of the
> month. Some users who have migrated to travis-ci.com have complained
> that it is not any faster, but I have not tried myself (yet).

Oh.

> Depending on how you have Github organized migrating to travis-ci.com
> may be bit tricky because it requires full access to all private
> repositories in your account and orgs administrated by your account.

I'm experimenting with Github's built in CI.  All other ideas welcome.




Re: Commitfest 2020-11 is closed

2020-12-02 Thread David Steele

On 12/2/20 4:15 PM, Thomas Munro wrote:

On Thu, Dec 3, 2020 at 10:00 AM Tom Lane  wrote:

This is actually a bit problematic, because now the cfbot is ignoring
those patches (or if it's not, I don't know where it's displaying the
results).  Please go ahead and move the remaining open patches, or
else re-open the CF if that's possible.


As of quite recently, Travis CI doesn't seem to like cfbot's rate of
build jobs.  Recently it's been taking a very long time to post
results for new patches and taking so long to get around to retesting
patches for bitrot that the my "delete old results after a week" logic
started wiping out some results while they are still the most recent,
leading to the blank spaces where the results are supposed to be.
D'oh. I'm looking into a couple of options.


pgBackRest test runs have gone from ~17 minutes to 3-6 hours over the 
last two months.


Also keep in mind that travis-ci.org will be shut down at the end of the 
month. Some users who have migrated to travis-ci.com have complained 
that it is not any faster, but I have not tried myself (yet).


Depending on how you have Github organized, migrating to travis-ci.com
may be a bit tricky because it requires full access to all private
repositories in your account and orgs administrated by your account.


Regards,
--
-David
da...@pgmasters.net




Proposed patch for key managment

2020-12-02 Thread Bruce Momjian
Attached is a patch for key management, which will eventually be part of
cluster file encryption (CFE), called TDE (Transparent Data Encryption)
by Oracle.  It is an update of Masahiko Sawada's patch from July 31:


https://www.postgresql.org/message-id/ca+fd4k6rjwnvztro3q2f5hsdd8hgyuc4cuy9u3e6ran4c6t...@mail.gmail.com

Sawada-san did all the hard work, and I just redirected the patch.  The
general outline of this CFE feature can be seen here:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption

The currently planned progression for this feature is to allow secure
retrieval of key encryption keys (KEK) outside of the database, then use
those to encrypt data keys that encrypt heap/index/tmpfile files.

This patch was changed from Masahiko Sawada's version by removing
references to non-cluster file encryption because having SQL-level keys
stored in this way was considered to have limited usefulness.  I have
also removed references to SQL-level KEK rotation since that is best done
as a command-line tool.

If most people approve of this general approach, and the design
decisions made, I would like to apply this in the next few weeks, but
this brings complications.  The syntax added by this commit might not
provide a useful feature until PG 15, so how do we hide it from users?
I was thinking of not applying the doc changes (or commenting them out)
and commenting out the --help output.

Once this patch is applied, I would like to use the /dev/tty file
descriptor passing feature for the ssl_passphrase_command parameter, so
the ssl passphrase can be prompted from the terminal.  (I am attaching
pass_fd.sh as a POC for how file descriptor handling works.)  I would
also then write the KEK rotation command-line tool.  After that, we can
start working on heap/index/tmpfile encryption using this patch as a
base.

Here is an example of the current patch in action, using a KEK like
'7CE7945EA2F7322536F103E4D7D91C21E52288089EF99D186B9A76D666EBCA66',
which is not echoed to the terminal:

$ initdb -R -c '/u/postgres/tmp/pass_fd.sh "Enter password:" %R'
The files belonging to this database system will be owned by user 
"postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.
Cluster file encryption is enabled.

fixing permissions on existing directory /u/pgsql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ...
--> Enter password:ok
performing post-bootstrap initialization ...
--> Enter password:ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

pg_ctl -D /u/pgsql/data -l logfile start

$ pg_ctl -R -l /u/pg/server.log start
waiting for server to start...
--> Enter password: done
server started

A non-matching passphrase looks like this:

$ pg_ctl -R -l /u/pg/server.log start
waiting for server to start...
--> Enter password: stopped waiting
pg_ctl: could not start server
Examine the log output.

$ tail -2 /u/pg/server.log
2020-12-02 16:32:34.045 EST [13056] FATAL:  cluster passphrase does not 
match expected passphrase
2020-12-02 16:32:34.047 EST [13056] LOG:  database system is shut down

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



key.diff.gz
Description: application/gzip


pass_fd.sh
Description: Bourne shell script


Re: libpq compression

2020-12-02 Thread Robert Haas
On Thu, Nov 26, 2020 at 8:15 AM Daniil Zakhlystov
 wrote:
> However, I don’t mean by this that we shouldn’t support switchable 
> compression. I propose that we can offer two compression modes: permanent 
> (which is implemented in the current state of the patch) and switchable 
> on-the-fly. Permanent compression allows us to deliver a robust solution that 
> is already present in some databases. Switchable compression allows us to 
> support more complex scenarios in cases when the frontend and backend really 
> need it and can afford development effort to implement it.

I feel that one thing that may be getting missed here is that my
suggestions were intended to make this simpler, not more complicated.
Like, in the design I proposed, switchable compression is not a
separate form of compression and doesn't require any special support.
Both sides are just allowed to set the compression method;
theoretically, they could set it more than once. Similarly, I don't
intend the possibility of using different compression algorithms in
the two directions as a request for an advanced feature so much as a
way of simplifying the protocol.

Like, in the protocol that you proposed previously, you've got a
four-phase handshake to set up compression. The startup packet carries
initial information from the client, and the server then sends
CompressionAck, and then the client sends SetCompressionMethod, and
then the server sends SetCompressionMethod. This system is fairly
complex, and it requires some form of interlocking. Once the client
has sent a SetCompressionMethod message, it cannot send any other
protocol message until it receives a SetCompressionMethod message back
from the server. Otherwise, it doesn't know whether the server
actually responded with SetCompressionMethod as well, or whether it
sent say ErrorResponse or NoticeResponse or something. In the former
case it needs to send compressed data going forward; in the latter
uncompressed; but it can't know which until it sees the server
message. And keep in mind that control isn't necessarily with libpq at
this point, because non-blocking mode could be in use. This is all
solvable, but the way I proposed it, you don't have that problem. You
never need to wait for a message from the other end before being able
to send a message yourself.

Similarly, allowing different compression methods in the two
directions may seem to make things more complicated, but I don't think
it really is. Arguably it's simpler. The instant the server gets the
startup packet, it can issue SetCompressionMethod. The instant the
client gets SupportedCompressionTypes, it can issue
SetCompressionMethod. So there's practically no hand-shaking at all.
You get a single protocol message and you immediately respond by
setting the compression method and then you just send compressed
messages after that. Perhaps the time at which you begin receiving
compressed data will be a little different than the time at which you
begin sending it, or perhaps compression will only ever be used in one
direction. But so what? The code really doesn't need to care. You just
need to keep track of the active compression mode in each direction,
and that's it.

And again, if you allow the compression method to be switched at any
time, you just have to know what to do when you get a
SetCompressionMethod. If you only allow it to be changed once, to set
it initially, then you have to ADD code to reject that message the
next time it's sent. If that ends up avoiding a significant amount of
complexity somewhere else then I don't have a big problem with it, but
if it doesn't, it's simpler to allow it whenever than to restrict it
to only once.

> 2. List of the compression algorithms which the frontend is able to 
> decompress in the order of preference.
> For example:
> “zlib:1,3,5;zstd:7,8;uncompressed” means that frontend is able to:
>  - decompress zlib with 1,3 or 5 compression levels
>  - decompress zstd with 7 or 8 compression levels
>  - “uncompressed” at the end means that the frontend agrees to receive 
> uncompressed messages. If there is no “uncompressed” compression algorithm 
> specified it means that the compression is required.

I think that there's no such thing as being able to decompress some
compression levels with the same algorithm but not others. The level
only controls behavior on the compression side. So, the client can
only send zlib data if the server can decompress it, but the server
need not advertise which levels it can decompress, because it's all or
nothing.

> Supported compression and decompression methods are configured using GUC 
> parameters:
>
> compress_algorithms = ‘...’ // default value is ‘uncompressed’
> decompress_algorithms = ‘...’ // default value is ‘uncompressed’

This raises an interesting question which I'm not quite sure about. It
doesn't seem controversial to assert that the client must be able to
advertise which algorithms it does and does not support, and likewise
for the s

Re: Commitfest 2020-11 is closed

2020-12-02 Thread Thomas Munro
On Thu, Dec 3, 2020 at 10:00 AM Tom Lane  wrote:
> This is actually a bit problematic, because now the cfbot is ignoring
> those patches (or if it's not, I don't know where it's displaying the
> results).  Please go ahead and move the remaining open patches, or
> else re-open the CF if that's possible.

As of quite recently, Travis CI doesn't seem to like cfbot's rate of
build jobs.  Recently it's been taking a very long time to post
results for new patches and taking so long to get around to retesting
patches for bitrot that my "delete old results after a week" logic
started wiping out some results while they are still the most recent,
leading to the blank spaces where the results are supposed to be.
D'oh. I'm looking into a couple of options.




Re: Commitfest 2020-11 is closed

2020-12-02 Thread Tom Lane
Anastasia Lubennikova  writes:
> Commitfest 2020-11 is officially closed now.
> Many thanks to everyone who participated by posting patches, reviewing 
> them, committing and sharing ideas in discussions!

Thanks for all the hard work!

> Today, me and Georgios will move the remaining items to the next CF or 
> return them with feedback. We're planning to leave Ready For Committer 
> till the end of the week, to make them more visible and let them get the 
> attention they deserve.

This is actually a bit problematic, because now the cfbot is ignoring
those patches (or if it's not, I don't know where it's displaying the
results).  Please go ahead and move the remaining open patches, or
else re-open the CF if that's possible.

regards, tom lane




Re: SELECT INTO deprecation

2020-12-02 Thread Tom Lane
Stephen Frost  writes:
> * Peter Eisentraut (peter.eisentr...@enterprisedb.com) wrote:
>> While reading about deprecating and removing various things in other
>> threads, I was wondering about how deprecated SELECT INTO is.  There are
>> various source code comments about this, but the SELECT INTO reference page
>> only contains soft language like "recommended".  I'm proposing the attached
>> patch to stick a more explicit deprecation notice right at the top.

> I don't see much value in this.

Yeah, if we want to kill it let's just do so.  The negative language in
the reference page has been there since (at least) 7.1, so people can
hardly say they didn't see it coming.
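
For readers following the thread, the form in question and its recommended
equivalent look roughly like this (table names are illustrative):

    SELECT * INTO new_table FROM old_table;             -- deprecated form
    CREATE TABLE new_table AS SELECT * FROM old_table;  -- recommended equivalent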

regards, tom lane




Re: Add docs stub for recovery.conf

2020-12-02 Thread Bruce Momjian
On Wed, Dec  2, 2020 at 02:47:13PM -0500, Stephen Frost wrote:
> Greetings,
> 
> * David G. Johnston (david.g.johns...@gmail.com) wrote:
> > On Mon, Nov 30, 2020 at 11:42 AM Bruce Momjian  wrote:
> > > The downside is you end up with X-1 dummy sections just to allow for
> > > references to old syntax, and you then have to find them all and remove
> > > them when you implement the proper solution.  I have no intention of
> > > applying such an X-1 fix.
> >
> > X = 2; seems like a strong objection for such a minor issue.  The status
> > quo seems worse than that.
> 
> I've been thinking about this and I think I'm on Craig and David's side-
> having something cleaner, and clearer, than just http redirects and such
> would be good for these cases and I don't think we are going to end up
> with so many of them that it ends up becoming an issue.

We were not going to use just redirects --- we were going to create a
page that had all the renames listed, with links to the new names.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Minor documentation error regarding streaming replication protocol

2020-12-02 Thread Bruce Momjian
On Tue, Dec  1, 2020 at 10:16:31AM -0800, Jeff Davis wrote:
> On Thu, 2020-10-08 at 23:17 -0400, Bruce Momjian wrote:
> > As I said before, it is more raw bytes, but
> > we
> > don't have an SQL data type for that.
> 
> Sorry to jump in to this thread late.
> 
> Can't we just set both "filename" and "content" to be BYTEA, but then
> set the format code to 1 (indicating binary)?
> 
> For clients that do look at the descriptor, they are more likely to get
> it right; and for clients that don't look at the descriptor, there will
> be no change.

Yes, we could, but I thought the format code was not something we set at
this level.  Looking at byteasend() it is true it just sends the bytes.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Add table access method as an option to pgbench

2020-12-02 Thread Fabien COELHO



Hello David,

Some feedback about v4.

It looks like the option is *silently* ignored when creating a partitioned 
table, which currently results in an error, and not passed to partitions, 
which would accept them. This is pretty weird.
The input check is added with an error message when both partitions and table 
access method are used.


Hmmm. If you take the approach of resetting the default, I do not think that you 
should have to test anything? AFAICT the access method is valid on 
partitions, although not on the partitioned table declaration. So I'd say 
that you could remove the check.


They should also trigger failures, eg 
"--table-access-method=no-such-table-am", to be added to the @errors list.
Not sure how to address this properly, since the table access method 
potentially can be *any* name.


I think it is pretty unlikely that someone would choose the name 
"no-such-table-am" when developing a new storage engine for Postgres 
inside Postgres, so that it would make this internal test fail.


There are numerous such names used in tests, eg "no-such-database", 
"no-such-command".


So I'd suggest to add such a test to check for the expected failure.


I do not understand why you duplicated all possible option entries.

A test with just the table access method looks redundant if the feature is made 
to work orthogonally to partitions, which should be the case.
Only one positive test case added using *heap* as the table access method to 
verify the initialization.


Yep, only "heap" if currently available for tables.

--
Fabien.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
>> On Wed, Dec 02, 2020 at 12:58:51PM -0500, Tom Lane wrote:
>> So ... one of the things that's been worrying me about this patch
>> from day one is whether it would create a noticeable performance
>> penalty for existing use-cases.  I did a small amount of experimentation
>> about that with the v35 patchset, and it didn't take long at all to
>> find that this:
>> ...
>> is about 15% slower with the patch than with HEAD.  I'm not sure
>> what an acceptable penalty might be, but 15% is certainly not it.

> I've tried to reproduce that, but get ~2-4% slowdown (with a pinned
> backend, no turbo etc). Are there any special steps I've probably
> missed?

Hmm, no, I just built with --disable-cassert and otherwise my usual
development options.

I had experimented with some other variants of the test case,
where the repeated statement is

a[i] := i;                -- about the same
a[i] := a[i-1] + 1;       -- 7% slower
a[i] := a[i-1] - a[i-2];  -- 15% slower

so it seems clear that the penalty is on the array fetch not array
assign side.  This isn't too surprising now that I think about it,
because plpgsql's array assignment code is untouched by the patch
(which is a large feature omission BTW: you still can't write
jsonb['x'] := y;
in plpgsql).
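
As a point of reference, a minimal sketch of the SQL-level subscripting the
patchset is aiming at, assuming the jsonb support from the later patches in
the series (names and values are illustrative):

    SELECT ('{"x": 1, "y": 2}'::jsonb)['x'];   -- fetch via subscripting
    UPDATE t SET jcol['x'] = '42';             -- assignment via subscripting

whereas the corresponding plpgsql assignment, jcol['x'] := '42', is what the
patch does not handle.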

regards, tom lane




Re: SELECT INTO deprecation

2020-12-02 Thread David Fetter
On Wed, Dec 02, 2020 at 12:58:36PM -0500, Stephen Frost wrote:
> Greetings,
> 
> * Peter Eisentraut (peter.eisentr...@enterprisedb.com) wrote:
> > While reading about deprecating and removing various things in
> > other threads, I was wondering about how deprecated SELECT INTO
> > is.  There are various source code comments about this, but the
> > SELECT INTO reference page only contains soft language like
> > "recommended".  I'm proposing the attached patch to stick a more
> > explicit deprecation notice right at the top.
> 
> I don't see much value in this.  Users already have 5 years to adapt
> their code to new major versions of PG and that strikes me as plenty
> enough time and is why we support multiple major versions of PG for
> so long.  Users who keep pace and update for each major version
> aren't likely to have issue making this change since they're already
> used to regularly updating their code for new major versions, while
> others are going to complain no matter when we remove it and will
> ignore any deprecation notices we put out there, so there isn't much
> point in them.

+1 for removing it entirely and including this prominently in the
release notes.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: Off-Schedule Minor Release Consideration?

2020-12-02 Thread Tom Lane
"David G. Johnston"  writes:
> Given that we have already heard about 3 separate minor release regressions
> in the most recent update I'm putting forth for discussion whether an
> off-schedule release should be done.  I probably wouldn't care as much if
> it wasn't for the fact that the same release fixes two CVEs.

The release team had a discussion about this.  We're leaning towards
not doing anything out of the ordinary, but are open to public input.
To recap a bit:

Actually there are four known regressions, because there are two
different CREATE TABLE LIKE bugs:
* CREATE TABLE LIKE gets confused if new table masks old one
  (see f7f83a55b)
* CREATE TABLE LIKE fails to handle a self-referential foreign key
  defined in the CREATE, if it depends on an index imported by LIKE
  (see 97390fe8a)
* psql \c with a connstring containing a password ignores the password
  (see 7e5e1bba0)
* LISTEN/NOTIFY can fail to empty its SLRU queue in v12 and earlier
  (see 9c83b54a9)

We felt that the first three of these are at best minor annoyances.

The LISTEN/NOTIFY issue is potentially more serious, in that it
could cause a service outage for a site that makes heavy use of
NOTIFY.  However, the race is between two listeners, so in the
common (I think) usage pattern of only one listener there's no
issue.  Even if you hit the race condition, you don't have a problem
until 8GB more NOTIFY traffic has been sent, which seems like a lot.
The complainants in that bug report indicated they'd overrun that
amount in a week or two, but that seems like unusually heavy usage.

The other aspect that seems to mitigate the seriousness of the
LISTEN/NOTIFY issue is that a server restart is enough to clear
the queue.  Given that you'd also need a server restart to install
a hypothetical off-cycle release, it's easy to see that there's
no benefit except to sites that anticipate logging more than 16GB
of NOTIFY traffic (and thus needing two or more restarts) before
February.

So we are leaning to the position that the severity of these issues
doesn't justify the work involved in putting out an off-cycle release.
But if there are people out there with high-NOTIFY-traffic servers
who would be impacted, please speak up!

(BTW, the calendar is also a bit unfriendly to doing an off-cycle
release.  It's probably already too late to plan a Dec 10 release,
and by the week after that a lot of people will be in holiday mode.
If you start thinking about January, it's not clear why bother,
since we'll have minor releases in February anyway.)

regards, tom lane




Re: [PATCH] SET search_path += octopus

2020-12-02 Thread Bruce Momjian
On Tue, Dec  1, 2020 at 08:19:34PM +0530, Abhijit Menon-Sen wrote:
> I'm certainly not in favour of introducing multiple operators to express
> the various list operations, like += for prepend and =+ for append. (By
> the standard of assembling such things from spare parts, the right thing
> to do would be to add M4 support to postgresql.conf, but I'm not much of
> a fan of that idea either.)

Yeah, when M4 is your next option, you are in trouble.  ;-)

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Add docs stub for recovery.conf

2020-12-02 Thread Stephen Frost
Greetings,

* David G. Johnston (david.g.johns...@gmail.com) wrote:
> On Mon, Nov 30, 2020 at 11:42 AM Bruce Momjian  wrote:
> > The downside is you end up with X-1 dummy sections just to allow for
> > references to old syntax, and you then have to find them all and remove
> > them when you implement the proper solution.  I have no intention of
> > applying such an X-1 fix.
>
> X = 2; seems like a strong objection for such a minor issue.  The status
> quo seems worse than that.

I've been thinking about this and I think I'm on Craig and David's side-
having something cleaner, and clearer, than just http redirects and such
would be good for these cases and I don't think we are going to end up
with so many of them that it ends up becoming an issue.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Log message for GSS connection is missing once connection authorization is successful.

2020-12-02 Thread Stephen Frost
Greetings,

* Stephen Frost (sfr...@snowman.net) wrote:
> * vignesh C (vignes...@gmail.com) wrote:
> > Thanks for testing this, I had missed testing this. The expression
> > matching was not correct. Attached v6 patch which includes the fix for
> > this.
> 
> This generally looks pretty good to me.  I did reword the commit message
> a bit, run pgindent, and added the appropriate log message for the last
> test (was there a reason you didn't include that..?).  In general, this
> looks pretty good to commit to me.
> 
> I'll look at it again over the weekend or early next week and unless
> there's objections, I'll push it.

And committed.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Dmitry Dolgov
> On Wed, Dec 02, 2020 at 01:20:10PM -0600, Justin Pryzby wrote:
> On Wed, Dec 02, 2020 at 08:18:08PM +0100, Dmitry Dolgov wrote:
> > > On Wed, Dec 02, 2020 at 12:58:51PM -0500, Tom Lane wrote:
> > > So ... one of the things that's been worrying me about this patch
> > > from day one is whether it would create a noticeable performance
> > > penalty for existing use-cases.  I did a small amount of experimentation
> > > about that with the v35 patchset, and it didn't take long at all to
> > > find that this:
> > > --- cut ---
> > >
> > > is about 15% slower with the patch than with HEAD.  I'm not sure
> > > what an acceptable penalty might be, but 15% is certainly not it.
> > >
> > > I'm also not quite sure where the cost is going.  It looks like
> > > 0001+0002 aren't doing much to the executor except introducing
> > > one level of subroutine call, which doesn't seem like it'd account
> > > for that.
> >
> > I've tried to reproduce that, but get ~2-4% slowdown (with a pinned
> > backend, no turbo etc). Are there any special steps I've probably
> > missed? At the same time, I remember I had conducted this sort of test
> > before, when you and others raised the performance degradation question
> > and the main part of the patch was already more or less stable. From
> > what I remember the numbers back then were also rather small.
>
> Are you comparing with casserts (and therefore MEMORY_CONTEXT_CHECKING) 
> disabled?

Yep, they're disabled.




Re: proposal: unescape_text function

2020-12-02 Thread Bruce Momjian
On Wed, Dec  2, 2020 at 07:30:39PM +0100, Pavel Stehule wrote:
> postgres=# select
>  'Arabic     : ' || unistr( '\0627\0644\0639\0631\0628\064A\0629' )      || '
>   Chinese    : ' || unistr( '\4E2D\6587' )                               || '
>   English    : ' || unistr( 'English' )                                  || '
>   French     : ' || unistr( 'Fran\00E7ais' )                             || '
>   German     : ' || unistr( 'Deutsch' )                                  || '
>   Greek      : ' || unistr( '\0395\03BB\03BB\03B7\03BD\03B9\03BA\03AC' ) || '
>   Hebrew     : ' || unistr( '\05E2\05D1\05E8\05D9\05EA' )                || '
>   Japanese   : ' || unistr( '\65E5\672C\8A9E' )                          || '
>   Korean     : ' || unistr( '\D55C\AD6D\C5B4' )                          || '
>   Portuguese : ' || unistr( 'Portugu\00EAs' )                            || '
>   Russian    : ' || unistr( '\0420\0443\0441\0441\043A\0438\0439' )      || '
>   Spanish    : ' || unistr( 'Espa\00F1ol' )                              || '
>   Thai       : ' || unistr( '\0E44\0E17\0E22' )
>   as unicode_test_string;
> ┌──┐
> │   unicode_test_string    │
> ╞══╡
> │ Arabic     : العربية    ↵│
> │   Chinese    : 中文     ↵│
> │   English    : English  ↵│
> │   French     : Français ↵│
> │   German     : Deutsch  ↵│
> │   Greek      : Ελληνικά ↵│
> │   Hebrew     : עברית    ↵│
> │   Japanese   : 日本語   ↵│
> │   Korean     : 한국어   ↵│
> │   Portuguese : Português↵│
> │   Russian    : Русский  ↵│
> │   Spanish    : Español  ↵│
> │   Thai       : ไทย       │
> └──┘

Offlist, this table output is super-cool!

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Justin Pryzby
On Wed, Dec 02, 2020 at 08:18:08PM +0100, Dmitry Dolgov wrote:
> > On Wed, Dec 02, 2020 at 12:58:51PM -0500, Tom Lane wrote:
> > So ... one of the things that's been worrying me about this patch
> > from day one is whether it would create a noticeable performance
> > penalty for existing use-cases.  I did a small amount of experimentation
> > about that with the v35 patchset, and it didn't take long at all to
> > find that this:
> > --- cut ---
> >
> > is about 15% slower with the patch than with HEAD.  I'm not sure
> > what an acceptable penalty might be, but 15% is certainly not it.
> >
> > I'm also not quite sure where the cost is going.  It looks like
> > 0001+0002 aren't doing much to the executor except introducing
> > one level of subroutine call, which doesn't seem like it'd account
> > for that.
> 
> I've tried to reproduce that, but get ~2-4% slowdown (with a pinned
> backend, no turbo etc). Are there any special steps I've probably
> missed? At the same time, I remember I had conducted this sort of test
> before, when you and others raised the performance degradation question
> and the main part of the patch was already more or less stable. From
> what I remember the numbers back then were also rather small.

Are you comparing with casserts (and therefore MEMORY_CONTEXT_CHECKING) disabled?

-- 
Justin




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Dmitry Dolgov
> On Wed, Dec 02, 2020 at 11:52:54AM -0500, Tom Lane wrote:
> Dmitry Dolgov <9erthali...@gmail.com> writes:
> >> On Mon, Nov 30, 2020 at 02:26:19PM +0100, Dmitry Dolgov wrote:
> >>> On Mon, Nov 30, 2020 at 04:12:29PM +0300, Alexander Korotkov wrote:
> >>> The idea of an opaque field in SubscriptingRef structure is more
> >>> attractive to me.  Could you please implement it?
>
> >> Sure, doesn't seem to be that much work.
>
> I just happened to notice this bit.  This idea is a complete nonstarter.
> You cannot have an "opaque" field in a parsetree node, because then the
> backend/nodes code has no idea what to do with it for
> copy/compare/outfuncs/readfuncs.  The patch seems to be of the opinion
> that "do nothing" is adequate, which it completely isn't.
>
> Perhaps this is a good juncture at which to remind people that parse
> tree nodes are read-only so far as the executor is concerned, so
> storing something there only at execution time won't work either.

Oh, right, stupid of me. Then I'll just stick with the original
Alexander's suggestion.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Dmitry Dolgov
> On Wed, Dec 02, 2020 at 12:58:51PM -0500, Tom Lane wrote:
> So ... one of the things that's been worrying me about this patch
> from day one is whether it would create a noticeable performance
> penalty for existing use-cases.  I did a small amount of experimentation
> about that with the v35 patchset, and it didn't take long at all to
> find that this:
>
> --- cut ---
> create or replace function arraytest(n int) returns void as
> $$
> declare
>   a int[];
> begin
>   a := array[1, 1];
>   for i in 3..n loop
> a[i] := a[i-1] - a[i-2];
>   end loop;
> end;
> $$
> language plpgsql stable;
>
> \timing on
>
> select arraytest(1000);
> --- cut ---
>
> is about 15% slower with the patch than with HEAD.  I'm not sure
> what an acceptable penalty might be, but 15% is certainly not it.
>
> I'm also not quite sure where the cost is going.  It looks like
> 0001+0002 aren't doing much to the executor except introducing
> one level of subroutine call, which doesn't seem like it'd account
> for that.

I've tried to reproduce that, but get ~2-4% slowdown (with a pinned
backend, no turbo etc). Are there any special steps I've probably
missed? At the same time, I remember I had conducted this sort of test
before, when you and others raised the performance degradation question
and the main part of the patch was already more or less stable. From
what I remember the numbers back then were also rather small.




Re: proposal: unescape_text function

2020-12-02 Thread Pavel Stehule
st 2. 12. 2020 v 11:37 odesílatel Pavel Stehule 
napsal:

>
>
> st 2. 12. 2020 v 9:23 odesílatel Peter Eisentraut <
> peter.eisentr...@enterprisedb.com> napsal:
>
>> On 2020-11-30 22:15, Pavel Stehule wrote:
>> > I would like some supporting documentation on this.  So far we only
>> > have
>> > one stackoverflow question, and then this implementation, and they
>> are
>> > not even the same format.  My worry is that if there is not precise
>> > specification, then people are going to want to add things in the
>> > future, and there will be no way to analyze such requests in a
>> > principled way.
>> >
>> >
>> > I checked this and it is "prefix backslash-u hex" used by Java,
>> > JavaScript  or RTF -
>> > https://billposer.org/Software/ListOfRepresentations.html
>>
>> Heh.  The fact that there is a table of two dozen possible
>> representations kind of proves my point that we should be deliberate in
>> picking one.
>>
>> I do see Oracle unistr() on that list, which appears to be very similar
>> to what you are trying to do here.  Maybe look into aligning with that.
>>
>
> unistr is a primitive form of proposed function.  But it can be used as a
> base. The format is compatible with our  "4.1.2.3. String Constants with
> Unicode Escapes".
>
> What do you think about the following proposal?
>
> 1. unistr(text) .. compatible with Postgres unicode escapes - it is
> enhanced against Oracle, because Oracle's unistr doesn't support 6 digits
> unicodes.
>
> 2. there can be optional parameter "prefix" with default "\". But with
> "\u" it can be compatible with Java or Python.
>
> What do you think about it?
>

I thought about it a little bit more, and the prefix specification doesn't make
much sense (even more so if we implement this functionality as a function
"unistr"). I removed the optional argument and renamed the function to
"unistr". The functionality is the same. Now it supports the Oracle convention,
Java and Python (for Python U) and \+XX. These formats were
already supported. The compatibility with Oracle is nice.

postgres=# select
 'Arabic : ' || unistr( '\0627\0644\0639\0631\0628\064A\0629' )  ||
'
  Chinese: ' || unistr( '\4E2D\6587' )   ||
'
  English: ' || unistr( 'English' )  ||
'
  French : ' || unistr( 'Fran\00E7ais' ) ||
'
  German : ' || unistr( 'Deutsch' )  ||
'
  Greek  : ' || unistr( '\0395\03BB\03BB\03B7\03BD\03B9\03BA\03AC' ) ||
'
  Hebrew : ' || unistr( '\05E2\05D1\05E8\05D9\05EA' )||
'
  Japanese   : ' || unistr( '\65E5\672C\8A9E' )  ||
'
  Korean : ' || unistr( '\D55C\AD6D\C5B4' )  ||
'
  Portuguese : ' || unistr( 'Portugu\00EAs' )||
'
  Russian: ' || unistr( '\0420\0443\0441\0441\043A\0438\0439' )  ||
'
  Spanish: ' || unistr( 'Espa\00F1ol' )  ||
'
  Thai   : ' || unistr( '\0E44\0E17\0E22' )
  as unicode_test_string;
┌──┐
│   unicode_test_string│
╞══╡
│ Arabic : العربية↵│
│   Chinese: 中文 ↵│
│   English: English  ↵│
│   French : Français ↵│
│   German : Deutsch  ↵│
│   Greek  : Ελληνικά ↵│
│   Hebrew : עברית↵│
│   Japanese   : 日本語   ↵│
│   Korean : 한국어   ↵│
│   Portuguese : Português↵│
│   Russian: Русский  ↵│
│   Spanish: Español  ↵│
│   Thai   : ไทย   │
└──┘
(1 row)


postgres=# SELECT UNISTR('Odpov\u011Bdn\u00E1 osoba');
┌─┐
│ unistr  │
╞═╡
│ Odpovědná osoba │
└─┘
(1 row)

New patch attached

Regards

Pavel






> Pavel
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index df29af6371..6ad8136523 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3553,6 +3553,34 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ unistr
+
+unistr ( string text )
+text
+   
+   
+Evaluate escaped unicode chars (4 or 6 digits) without prefix or
+ with prefix u (4 digits) or with prefix
+U (8 digits) to chars or with prefix
++ (6 digits).
+   
+   
+unistr('\0441\043B\043E\043D')
+слон
+   
+   
+unistr('d\0061t\+61')
+data
+   
+   
+unistr('d\u0061t\U0061')
+data
+   
+  
+
  
 

diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index be86eb37fe..cbddb61396 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -278,30 +278,6 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
 	return cur_token;
 }
 
-/* convert hex digit (caller should have verified that) to value */
-static unsigned int
-hexval(unsigned char c)
-

Re: Deprecate custom encoding conversions

2020-12-02 Thread Tom Lane
Heikki Linnakangas  writes:
> On 02/12/2020 18:18, Tom Lane wrote:
>> Heikki Linnakangas  writes:
>>> I propose that we add a notice to the CREATE CONVERSION docs to say that
>>> it is deprecated, and remove it in a few years.

>> While I agree that it's probably not that useful, what would we gain
>> by removing it?  If you intend to remove those catalogs, what lookup
>> mechanism would replace them?  We can't exactly drop the encoding
>> conversion functionality.

> I'm thinking of replacing the catalog with a hard-coded 2D array of 
> function pointers. Or something along those lines.

I like the current structure in which the encoding functions are in
separate .so's, so that you don't load the ones you don't need.
It's not real clear how to preserve that if we hard-code things.

> So I'm looking for refactoring the conversion routines to be more 
> convenient for the callers. But the current function signature is 
> exposed at the SQL level, thanks to CREATE CONVERSION.

I'd be the first to agree that the current API for conversion functions is
not terribly well-designed.  But what if we just change it?  That can't be
any worse of a compatibility hit than removing CREATE CONVERSION
altogether.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Tom Lane
So ... one of the things that's been worrying me about this patch
from day one is whether it would create a noticeable performance
penalty for existing use-cases.  I did a small amount of experimentation
about that with the v35 patchset, and it didn't take long at all to
find that this:

--- cut ---
create or replace function arraytest(n int) returns void as
$$
declare
  a int[];
begin
  a := array[1, 1];
  for i in 3..n loop
a[i] := a[i-1] - a[i-2];
  end loop;
end;
$$
language plpgsql stable;

\timing on

select arraytest(1000);
--- cut ---

is about 15% slower with the patch than with HEAD.  I'm not sure
what an acceptable penalty might be, but 15% is certainly not it.

I'm also not quite sure where the cost is going.  It looks like
0001+0002 aren't doing much to the executor except introducing
one level of subroutine call, which doesn't seem like it'd account
for that.

I don't think this can be considered RFC until the performance
issue is addressed.

regards, tom lane




Re: SELECT INTO deprecation

2020-12-02 Thread Stephen Frost
Greetings,

* Peter Eisentraut (peter.eisentr...@enterprisedb.com) wrote:
> While reading about deprecating and removing various things in other
> threads, I was wondering about how deprecated SELECT INTO is.  There are
> various source code comments about this, but the SELECT INTO reference page
> only contains soft language like "recommended".  I'm proposing the attached
> patch to stick a more explicit deprecation notice right at the top.

I don't see much value in this.  Users already have 5 years to adapt
their code to new major versions of PG and that strikes me as plenty
enough time and is why we support multiple major versions of PG for so
long.  Users who keep pace and update for each major version aren't
likely to have issue making this change since they're already used to
regularly updating their code for new major versions, while others are
going to complain no matter when we remove it and will ignore any
deprecation notices we put out there, so there isn't much point in them.

> I also found some gratuitous uses of SELECT INTO in various tests and
> documentation (not ecpg or plpgsql of course).  Here is a patch to adjust
> those to CREATE TABLE AS.

If we aren't actually removing SELECT INTO then I don't know that it
makes sense to just stop testing it.

> I don't have a specific plan for removing top-level SELECT INTO altogether,
> but there is a nontrivial amount of code for handling it, so there would be
> some gain if it could be removed eventually.

We should either remove it, or remove the comments that it's deprecated,
not try to make it more deprecated or try to somehow increase the
recommendation to not use it.

I'd recommend we remove it and make note that it's been removed in v14
in the back branches at the same time, which will give those who
actually pay attention and care that much more time before v14 comes out
to update their code (not that I'm all that worried, as they'll also see
it in the beta release notes too).

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: should INSERT SELECT use a BulkInsertState?

2020-12-02 Thread Justin Pryzby
One loose end in this patch is how to check for volatile default expressions.

COPY is a utility statement, so copyfrom.c can look at the parser's column list:
COPY table(c1,c2)...

However, for INSERT, in nodeModifyTable.c, it looks like parsing, rewriting,
and planning are done, at which point I don't know if there's a good way to
find that.  The default expressions will have been rewritten into the planned
statement.

We need the list of columns whose default is volatile, excluding columns for
which a non-default value is specified.

INSERT INTO table (c1,c2) VALUES (1,default);

We'd want the list of any column in the table with a volatile default,
excluding column c1, but not excluding the explicit-default column c2 or any
implicit-default columns (c3, etc.).
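
To make that concrete, here is a minimal sketch of the scenario (table and
column names are illustrative, not from the patch):

    CREATE TABLE t (
        c1 int,
        c2 timestamptz DEFAULT clock_timestamp(),   -- volatile default
        c3 double precision DEFAULT random()        -- volatile default
    );

    INSERT INTO t (c1, c2) VALUES (1, DEFAULT);

    -- c1 is excluded (non-default value given); c2 (explicit DEFAULT) and
    -- c3 (implicit default) both need their volatile defaults evaluated.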

Any idea ?

-- 
Justin




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
>> On Mon, Nov 30, 2020 at 02:26:19PM +0100, Dmitry Dolgov wrote:
>>> On Mon, Nov 30, 2020 at 04:12:29PM +0300, Alexander Korotkov wrote:
>>> The idea of an opaque field in SubscriptingRef structure is more
>>> attractive to me.  Could you please implement it?

>> Sure, doesn't seem to be that much work.

I just happened to notice this bit.  This idea is a complete nonstarter.
You cannot have an "opaque" field in a parsetree node, because then the
backend/nodes code has no idea what to do with it for
copy/compare/outfuncs/readfuncs.  The patch seems to be of the opinion
that "do nothing" is adequate, which it completely isn't.

Perhaps this is a good juncture at which to remind people that parse
tree nodes are read-only so far as the executor is concerned, so
storing something there only at execution time won't work either.

regards, tom lane




Re: Additional improvements to extended statistics

2020-12-02 Thread Tomas Vondra
On 12/2/20 4:51 PM, Dean Rasheed wrote:
> On Sun, 29 Nov 2020 at 21:02, Tomas Vondra
>  wrote:
>>
>> I wonder how much of the comment before clauselist_selectivity should
>> move to clauselist_selectivity_ext - it does talk about range clauses
>> and so on, but clauselist_selectivity does not really deal with that.
>> But maybe that's just an implementation detail and it's better to keep
>> the comment the way it is.
> 
> I think it's better to keep it the way it is, so that the entirety of
> what clauselist_selectivity() does (via clauselist_selectivity_ext())
> can be read in one place, but I have added separate comments for the
> new "_ext" functions to explain how they differ. That matches similar
> examples elsewhere.
> 

+1

> 
>> I noticed this outdated comment:
>>
>>   /* Always compute the selectivity using clause_selectivity */
>>   s2 = clause_selectivity_ext(root, clause, varRelid, jointype, sjinfo,
> 
> Updated.
> 
> 
>> Also, the comment at clauselist_selectivity_or seems to not follow the
>> usual pattern, which I think is
>>
>> /*
>>  * function name
>>  *  short one-sentence description
>>  *
>>  * ... longer description ...
>>  */
> 
> Hmm, it seems OK to me. The first part is basically copied from
> clauselist_selectivity(). The "longer description" doesn't really have
> much more to say because it's much simpler than
> clauselist_selectivity(), but it seems neater to keep the two roughly
> in sync.
> 

I see. In that case it's OK, I guess.

> 
> I've been hacking on this a bit more and attached is an updated
> (hopefully final) version with some other comment improvements and
> also a couple of other tweaks:
> 
> The previous version had duplicated code blocks that implemented the
> same MCV-correction algorithm using simple_sel, mcv_sel, base_sel,
> other_sel and total_sel, which was quite messy. So I refactored that
> into a new function mcv_combine_selectivities(). About half the
> comments from statext_mcv_clauselist_selectivity() then move over to
> mcv_combine_selectivities(). I also updated the comments for
> mcv_clauselist_selectivity() and mcv_clause_selectivity_or() to
> explain how their outputs are expected to be used by
> mcv_combine_selectivities(). That hopefully makes for a clean
> separation of concerns, and makes it easier to tweak the way MCV stats
> are applied on top of simple stats, if someone thinks of a better
> approach in the future.
> 
> In the previous version, for an ORed list of clauses, the MCV
> correction was only applied to the overlaps between clauses. That's OK
> as long as each clause only refers to a single column, since the
> per-column statistics ought to be the best way to estimate each
> individual clause in that case. However, if the individual clauses
> refer to more than one column, I think the MCV correction should be
> applied to each individual clause as well as to the overlaps. That
> turns out to be pretty straightforward, since we're already doing all
> the hard work of computing the match bitmap for each clause. The sort
> of queries I had in mind were things like this:
> 
>   WHERE (a = 1 AND b = 1) OR (a = 2 AND b = 2)
> 
> I added a new test case along those lines and the new estimates are
> much better than they are without this patch, but not for the reason I
> thought --- the old code consistently over-estimated queries like that
> because it actually applied the MCV correction twice (once while
> processing each AND list, via clause_selectivity(), called from
> clauselist_selectivity_simple(), and once for the top-level OR clause,
> contained in a single-element implicitly-ANDed list). The way the new
> code is structured avoids any kind of double application of extended
> stats, producing a much better estimate, which is good.
> 

Nice.

> However, the new code doesn't apply the extended stats directly using
> clauselist_selectivity_or() for this kind of query because there are
> no RestrictInfos for the nested AND clauses, so
> find_single_rel_for_clauses() (and similarly
> statext_is_compatible_clause()) regards those clauses as not
> compatible with extended stats. So what ends up happening is that
> extended stats are used only when we descend down to the two AND
> clauses, and their results are combined using the original "s1 + s2 -
> s1 * s2" formula. That actually works OK in this case, because there
> is no overlap between the two AND clauses, but it wouldn't work so
> well if there was.
> 
> I'm pretty sure that can be fixed by teaching
> find_single_rel_for_clauses() and statext_is_compatible_clause() to
> handle BoolExpr clauses, looking for RestrictInfos underneath them,
> but I think that should be left for a follow-in patch. I have left a
> regression test in place, whose estimates ought to be improved by such
> a fix.
> 

Yeah, I agree with leaving this for a separate patch. We can't do
everything at the same time.

> The upshot of all that is that the new code that applies the MCV
> correction to the individ

Re: proposal: unescape_text function

2020-12-02 Thread Chapman Flack
On 12/02/20 09:55, Chapman Flack wrote:
> In Perl, there is a useful extension to regexp substitution where
> you specify the replacement not as a string or even a string with &
> and \1 \2 ... magic, but as essentially a lambda that is passed the
> match and returns a computed replacement. That makes conversions of
> the sort discussed here generally trivial to implement.

Python, I should have added, allows that also. Java too, since release 9.

Regards,
-Chap




Re: Deprecate custom encoding conversions

2020-12-02 Thread Heikki Linnakangas

On 02/12/2020 18:18, Tom Lane wrote:

Heikki Linnakangas  writes:

I propose that we add a notice to the CREATE CONVERSION docs to say that
it is deprecated, and remove it in a few years.


While I agree that it's probably not that useful, what would we gain
by removing it?  If you intend to remove those catalogs, what lookup
mechanism would replace them?  We can't exactly drop the encoding
conversion functionality.


I'm thinking of replacing the catalog with a hard-coded 2D array of 
function pointers. Or something along those lines.


I had this idea while looking at the encoding conversions performed in 
COPY. The current conversion functions return a null-terminated, 
palloc'd string, which is a bit awkward for the callers. The caller 
needs to call strlen() on the result, and it would be nice to reuse the 
same buffer for all the conversions. And I've got a hunch that it'd be 
faster to convert the whole 64 kB raw input buffer in one go, rather 
than convert each line separately, but the current function signature 
doesn't make that easy either.


So I'm looking for refactoring the conversion routines to be more 
convenient for the callers. But the current function signature is 
exposed at the SQL level, thanks to CREATE CONVERSION.


- Heikki




Re: Deprecate custom encoding conversions

2020-12-02 Thread Tom Lane
Heikki Linnakangas  writes:
> I propose that we add a notice to the CREATE CONVERSION docs to say that 
> it is deprecated, and remove it in a few years.

While I agree that it's probably not that useful, what would we gain
by removing it?  If you intend to remove those catalogs, what lookup
mechanism would replace them?  We can't exactly drop the encoding
conversion functionality.

regards, tom lane




Re: SELECT INTO deprecation

2020-12-02 Thread Pavel Stehule
st 2. 12. 2020 v 12:55 odesílatel Peter Eisentraut <
peter.eisentr...@enterprisedb.com> napsal:

> While reading about deprecating and removing various things in other
> threads, I was wondering about how deprecated SELECT INTO is.  There are
> various source code comments about this, but the SELECT INTO reference
> page only contains soft language like "recommended".  I'm proposing the
> attached patch to stick a more explicit deprecation notice right at the
> top.
>
> I also found some gratuitous uses of SELECT INTO in various tests and
> documentation (not ecpg or plpgsql of course).  Here is a patch to
> adjust those to CREATE TABLE AS.
>
> I don't have a specific plan for removing top-level SELECT INTO
> altogether, but there is a nontrivial amount of code for handling it, so
> there would be some gain if it could be removed eventually.
>

+1

Pavel


Deprecate custom encoding conversions

2020-12-02 Thread Heikki Linnakangas

Hi,

PostgreSQL allows writing custom encoding conversion functions between 
any character encodings, using the CREATE CONVERSION command. It's 
pretty flexible, you can define default and non-default conversions, and 
the conversions live in schemas so you can have multiple conversions 
installed in a system and you can switch between them by changing 
search_path.


However:

We never use non-default conversions for anything. All code that 
performs encoding conversions only cares about the default ones.


I think this flexibility is kind of ridiculous anyway. If a built-in 
conversion routine doesn't handle some characters correctly, surely we 
should fix the built-in function, rather than provide a mechanism for 
having your own conversion functions. If you truly need to perform the 
conversions differently than the built-in routines do, you can always 
perform the conversion in the client instead.


Note that we don't support adding completely new custom encodings, all 
this is just for conversions between the built-in encodings that we have.


I propose that we add a notice to the CREATE CONVERSION docs to say that 
it is deprecated, and remove it in a few years.


Any objections? Anyone using custom encoding conversions in production?

- Heikki




Re: Additional improvements to extended statistics

2020-12-02 Thread Dean Rasheed
On Sun, 29 Nov 2020 at 21:02, Tomas Vondra
 wrote:
>
> I wonder how much of the comment before clauselist_selectivity should
> move to clauselist_selectivity_ext - it does talk about range clauses
> and so on, but clauselist_selectivity does not really deal with that.
> But maybe that's just an implementation detail and it's better to keep
> the comment the way it is.

I think it's better to keep it the way it is, so that the entirety of
what clauselist_selectivity() does (via clauselist_selectivity_ext())
can be read in one place, but I have added separate comments for the
new "_ext" functions to explain how they differ. That matches similar
examples elsewhere.


> I noticed this outdated comment:
>
>   /* Always compute the selectivity using clause_selectivity */
>   s2 = clause_selectivity_ext(root, clause, varRelid, jointype, sjinfo,

Updated.


> Also, the comment at clauselist_selectivity_or seems to not follow the
> usual pattern, which I think is
>
> /*
>  * function name
>  *  short one-sentence description
>  *
>  * ... longer description ...
>  */

Hmm, it seems OK to me. The first part is basically copied from
clauselist_selectivity(). The "longer description" doesn't really have
much more to say because it's much simpler than
clauselist_selectivity(), but it seems neater to keep the two roughly
in sync.


I've been hacking on this a bit more and attached is an updated
(hopefully final) version with some other comment improvements and
also a couple of other tweaks:

The previous version had duplicated code blocks that implemented the
same MCV-correction algorithm using simple_sel, mcv_sel, base_sel,
other_sel and total_sel, which was quite messy. So I refactored that
into a new function mcv_combine_selectivities(). About half the
comments from statext_mcv_clauselist_selectivity() then move over to
mcv_combine_selectivities(). I also updated the comments for
mcv_clauselist_selectivity() and mcv_clause_selectivity_or() to
explain how their outputs are expected to be used by
mcv_combine_selectivities(). That hopefully makes for a clean
separation of concerns, and makes it easier to tweak the way MCV stats
are applied on top of simple stats, if someone thinks of a better
approach in the future.

In the previous version, for an ORed list of clauses, the MCV
correction was only applied to the overlaps between clauses. That's OK
as long as each clause only refers to a single column, since the
per-column statistics ought to be the best way to estimate each
individual clause in that case. However, if the individual clauses
refer to more than one column, I think the MCV correction should be
applied to each individual clause as well as to the overlaps. That
turns out to be pretty straightforward, since we're already doing all
the hard work of computing the match bitmap for each clause. The sort
of queries I had in mind were things like this:

  WHERE (a = 1 AND b = 1) OR (a = 2 AND b = 2)
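
For concreteness, a minimal sketch of the kind of setup such a test might use
(table, data, and statistics object are illustrative, not the actual
regression test):

    CREATE TABLE t (a int, b int);
    INSERT INTO t SELECT i % 10, i % 10 FROM generate_series(1, 10000) g(i);
    CREATE STATISTICS t_a_b_mcv (mcv) ON a, b FROM t;
    ANALYZE t;
    EXPLAIN SELECT * FROM t WHERE (a = 1 AND b = 1) OR (a = 2 AND b = 2);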

I added a new test case along those lines and the new estimates are
much better than they are without this patch, but not for the reason I
thought --- the old code consistently over-estimated queries like that
because it actually applied the MCV correction twice (once while
processing each AND list, via clause_selectivity(), called from
clauselist_selectivity_simple(), and once for the top-level OR clause,
contained in a single-element implicitly-ANDed list). The way the new
code is structured avoids any kind of double application of extended
stats, producing a much better estimate, which is good.

However, the new code doesn't apply the extended stats directly using
clauselist_selectivity_or() for this kind of query because there are
no RestrictInfos for the nested AND clauses, so
find_single_rel_for_clauses() (and similarly
statext_is_compatible_clause()) regards those clauses as not
compatible with extended stats. So what ends up happening is that
extended stats are used only when we descend down to the two AND
clauses, and their results are combined using the original "s1 + s2 -
s1 * s2" formula. That actually works OK in this case, because there
is no overlap between the two AND clauses, but it wouldn't work so
well if there was.

I'm pretty sure that can be fixed by teaching
find_single_rel_for_clauses() and statext_is_compatible_clause() to
handle BoolExpr clauses, looking for RestrictInfos underneath them,
but I think that should be left for a follow-in patch. I have left a
regression test in place, whose estimates ought to be improved by such
a fix.

The upshot of all that is that the new code that applies the MCV
correction to the individual clauses in an ORed list doesn't help with
queries like the one above at the moment, and it's not obvious whether
it is currently reachable, but I think it's worth leaving in because
it seems more principled, and makes that code more future-proof. I
also think it's neater because now the signature of
mcv_clause_selectivity_or() is more natural --- it's primary return
va

Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Dmitry Dolgov
> On Mon, Nov 30, 2020 at 02:26:19PM +0100, Dmitry Dolgov wrote:
> > On Mon, Nov 30, 2020 at 04:12:29PM +0300, Alexander Korotkov wrote:
> >
> > > > My first question is whether we're
> > > > able to handle different subscript types differently.  For instance,
> > > > one day we could handle jsonpath subscripts for jsonb.  And for sure,
> > > > jsonpath subscripts are expected to be handled differently from text
> > > > subscripts.  I see we can distinguish types during in prepare and
> > > > validate functions.  But it seems there is no type information in
> > > > fetch and assign functions.  Should we add something like this to the
> > > > SubscriptingRefState for future usage?
> > > >
> > > > Datum uppertypeoid[MAX_SUBSCRIPT_DEPTH];
> > > > Datum lowertypeoid[MAX_SUBSCRIPT_DEPTH];
> > >
> > > Yes, makes sense. My original idea was that it could be done within the
> > > jsonpath support patch itself, but at the same time providing these
> > > fields into SubscriptingRefState will help other potential extensions.
> > >
> > > Having said that, maybe it would be even better to introduce a field
> > > with an opaque structure for both SubscriptingRefState and
> > > SubscriptingRef, where every implementation of custom subscripting can
> > > store any necessary information? In case of jsonpath it could keep type
> > > information acquired in prepare function, which would be then passed via
> > > SubscriptingRefState down to the fetch/assign.
> >
> > The idea of an opaque field in SubscriptingRef structure is more
> > attractive to me.  Could you please implement it?
>
> Sure, doesn't seem to be that much work.

The attached implementation should be enough I guess, as fetch/assign
are executed in a child memory context of the one where prepare does its
work.
>From 6f7d9589cc827dfb3b1d82a3e0e629b482e4c77a Mon Sep 17 00:00:00 2001
From: erthalion <9erthali...@gmail.com>
Date: Thu, 31 Jan 2019 22:37:19 +0100
Subject: [PATCH v35 1/5] Base implementation of subscripting mechanism

Introduce all the required machinery for generalizing subscripting
operation for a different data types:

* subscripting handler procedure, to set up a relation between data type
and corresponding subscripting logic.

* subscripting routines, that help generalize a subscripting logic,
since it involves few stages, namely preparation (e.g. to figure out
required types), validation (to check the input and return meaningful
error message), fetch (executed when we extract a value using
subscripting), assign (executed when we update a data type with a new
value using subscripting). Without this it would be neccessary to
introduce more new fields to pg_type, which would be too invasive.

All ArrayRef-related logic was removed and lands as a separate
subscripting implementation in the following patch of this series. The
rest of the code was rearranged to, e.g., store the type of the assigned
value for an assign operation.

Reviewed-by: Tom Lane, Arthur Zakirov
---
 .../pg_stat_statements/pg_stat_statements.c   |   1 +
 src/backend/catalog/heap.c|   6 +-
 src/backend/catalog/pg_type.c |   7 +-
 src/backend/commands/typecmds.c   |  77 +-
 src/backend/executor/execExpr.c   |  26 +---
 src/backend/executor/execExprInterp.c | 124 +++
 src/backend/nodes/copyfuncs.c |   2 +
 src/backend/nodes/equalfuncs.c|   2 +
 src/backend/nodes/outfuncs.c  |   2 +
 src/backend/nodes/readfuncs.c |   2 +
 src/backend/parser/parse_expr.c   |  54 ---
 src/backend/parser/parse_node.c   | 141 --
 src/backend/parser/parse_target.c |  88 +--
 src/backend/utils/adt/ruleutils.c |  21 +--
 src/backend/utils/cache/lsyscache.c   |  23 +++
 src/include/c.h   |   2 +
 src/include/catalog/pg_type.h |   9 +-
 src/include/executor/execExpr.h   |  15 +-
 src/include/nodes/primnodes.h |   8 +
 src/include/nodes/subscripting.h  |  42 ++
 src/include/parser/parse_node.h   |   6 +-
 src/include/utils/lsyscache.h |   1 +
 22 files changed, 333 insertions(+), 326 deletions(-)
 create mode 100644 src/include/nodes/subscripting.h

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 1eac9edaee..31ba120fb2 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2800,6 +2800,7 @@ JumbleExpr(pgssJumbleState *jstate, Node *node)
 JumbleExpr(jstate, (Node *) sbsref->reflowerindexpr);
 JumbleExpr(jstate, (Node *) sbsref->refexpr);
 JumbleExpr(jstate, (Node *) sbsref->refassgnexpr);
+APP_JUMB(sbsref->refnestedfunc);
 			}
 			break;
 		case T_FuncExpr:
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c

Re: proposal: unescape_text function

2020-12-02 Thread Chapman Flack
On 12/02/20 05:37, Pavel Stehule wrote:
> 2. there can be an optional parameter "prefix" with default "\". But with "\u"
> it can be compatible with Java or Python.

Java's unicode escape form is one of those early ones that lack
a six-digit form, and where any character outside of the basic multilingual
plane has to be represented by two four-digit escapes in a row, encoding
the two surrogates that would make up the character's representation
in UTF-16.

Obviously that's an existing form that's out there, so it's not a bad
thing to have some kind of support for it, but it's not a great
representation to encourage people to use.

Python, by contrast, has both \u and \U where you would use
the latter to represent a non-BMP character directly. So the Java and
Python schemes should be considered distinct.

In Perl, there is a useful extension to regexp substitution where
you specify the replacement not as a string or even a string with &
and \1 \2 ... magic, but as essentially a lambda that is passed the
match and returns a computed replacement. That makes conversions of
the sort discussed here generally trivial to implement. Would it be
worth considering adding something of general utility like that, and
then there could be a small library of pure SQL functions (or a wiki
page or GitHub gist) covering a bunch of the two dozen representations
on that page linked above?

Regards,
-Chap




Re: Corner-case bug in pg_rewind

2020-12-02 Thread Ian Barwick

On 02/12/2020 20:13, Heikki Linnakangas wrote:

On 01/12/2020 16:52, Pavel Borisov wrote:

    Status update for a commitfest entry.

    The patch is Waiting on Author for some time. As this is a bug fix,
    I am
    moving it to the next CF.
    Ian, are you planning to continue working on it?

As a reviewer, I consider the patch useful and good overall. The comments I
left were purely cosmetic. It's a pity to me that this bugfix was delayed for
such a small reason and has gone stale as a result. It would be nice to
complete this fix in the next CF.


Yeah, we really should fix this..

On 16/11/2020 04:49, Ian Lawrence Barwick wrote:

Also, I think Heikki's notion could be fulfilled.


I spent a bit of time looking at that suggestion but couldn't actually
verify it was an issue which needed fixing.


Attached are two patches. The first patch is your original patch, unmodified
(except for a cosmetic rename of the test file). The second patch builds on
that, demonstrating and fixing the issue I mentioned. It took me a while to
create a repro for it, since it's easily masked by incidental full-page writes
or by the fact that rows created by XIDs that are not marked as committed on
the other timeline are invisible, but I succeeded at last.


Aha, many thanks. I wasn't entirely sure what I was looking for there and
recently haven't had the time or energy to dig any further.


Regards

Ian Barwick

--
Ian Barwick   https://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




macOS SIP, next try

2020-12-02 Thread Peter Eisentraut

Previous discussions on macOS SIP:

https://www.postgresql.org/message-id/flat/561E73AB.8060800%40gmx.net
https://www.postgresql.org/message-id/flat/CA%2BTgmoYGi5oR8Rvb2-SY1_WEjd76H5gCqSukxFQ66qR7MewWAQ%40mail.gmail.com
https://www.postgresql.org/message-id/flat/6a4d6124-41f0-756b-0811-c5c5def7ef4b%402ndquadrant.com

Tests that use a temporary installation (i.e., make check) use a 
platform-specific environment variable to make programs and libraries 
find shared libraries in the temporary installation rather than in the 
actual configured installation target directory.  On macOS, this is 
DYLD_LIBRARY_PATH.  Ever since System Integrity Protection (SIP) 
arrived, this hasn't worked anymore, because DYLD_LIBRARY_PATH gets 
disabled by the system (in somewhat obscure ways).  The workaround was 
to either disable SIP or to do make install before make check.


The solution here is to use install_name_tool to edit the shared library 
path in the binaries (programs and libraries) after installation into 
the temporary prefix.


The solution is, I think, correct in principle, but the implementation 
is hackish, since it hardcodes the names and versions of the interesting 
shared libraries in an unprincipled way.  More refinement is needed 
here.  Ideas welcome.


In order to allow all that, we need to link everything with the option 
-headerpad_max_install_names so that there is enough room in the 
binaries to put in the temporary path in place of the original one. 
(The temporary path is strictly longer than the original one.)  This 
bloats the binaries, so it might only be appropriate for development 
builds and should perhaps be hidden behind some option.  Ideas also 
welcome here.


The attached patch works for me.  You can see that it disables setting 
DYLD_LIBRARY_PATH, but the tests still work.  Verification on other 
system variations welcome, of course.


Comments welcome.  I think this direction is promising, but some details 
need to be addressed.
From 28f2f9fc2234e3d993027fb014f1c85f7b3db6be Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 2 Dec 2020 14:49:53 +0100
Subject: [PATCH] WIP: Fix temp-install tests to work with macOS SIP

Tests that use a temporary installation (i.e., make check) use a
platform-specific environment variable to make programs and libraries
find shared libraries in the temporary installation rather than in the
actual configured installation target directory.  On macOS, this is
DYLD_LIBRARY_PATH.  Ever since System Integrity Protection (SIP)
arrived, this hasn't worked anymore, because DYLD_LIBRARY_PATH gets
disabled by the system (in somewhat obscure ways).  The workaround was
to either disable SIP or to do make install before make check.

The solution here is to use install_name_tool to edit the shared
library path in the binaries (programs and libraries) after
installation into the temporary prefix.

XXX The solution is, I think, correct in principle, but the
implementation is hackish, since it hardcodes the names and versions
of the interesting shared libraries in an unprincipled way.

In order to allow that, we need to link everything with the option
-headerpad_max_install_names so that there is enough room in the
binaries to put in the temporary path in place of the original
one.  (The temporary path is strictly longer than the original one.)
This bloats the binaries, so it might only be appropriate for
development builds and should perhaps be hidden behind some option.
---
 src/Makefile.global.in|  1 +
 src/interfaces/ecpg/test/Makefile |  1 +
 src/makefiles/Makefile.darwin | 13 -
 src/test/isolation/Makefile   |  3 +++
 4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 7ca1e9aac5..a2349c814b 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -409,6 +409,7 @@ ifeq ($(MAKELEVEL),0)
$(MKDIR_P) '$(abs_top_builddir)'/tmp_install/log
$(MAKE) -C '$(top_builddir)' DESTDIR='$(abs_top_builddir)'/tmp_install 
install >'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1
$(MAKE) -j1 $(if $(CHECKPREP_TOP),-C $(CHECKPREP_TOP),) checkprep 
>>'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1
+   $(call 
temp_install_postprocess,'$(abs_top_builddir)/tmp_install/$(bindir)'/* 
'$(abs_top_builddir)/tmp_install/$(libdir)'/*.so 
'$(abs_top_builddir)/tmp_install/$(libdir)'/*.dylib)
 endif
 endif
 endif
diff --git a/src/interfaces/ecpg/test/Makefile 
b/src/interfaces/ecpg/test/Makefile
index be53b7b94d..468e3076ad 100644
--- a/src/interfaces/ecpg/test/Makefile
+++ b/src/interfaces/ecpg/test/Makefile
@@ -80,6 +80,7 @@ endif
 REGRESS_OPTS = --dbname=ecpg1_regression,ecpg2_regression 
--create-role=regress_ecpg_user1,regress_ecpg_user2 $(EXTRA_REGRESS_OPTS)
 
 check: all
+   $(call temp_install_postprocess,$(patsubst %.pgc,%,$(wildcard 
compat_informix/*.pgc compat_oracle/*.pgc connect/*.pgc pgtypeslib/*.pgc 
preproc/*.pgc sql/*.pg

Re: pg_stat_statements oddity with track = all

2020-12-02 Thread Nikolay Samokhvalov
On Tue, Dec 1, 2020 at 10:32 PM Julien Rouhaud  wrote:

> On Tue, Dec 01, 2020 at 10:08:06PM -0800, Nikolay Samokhvalov wrote:
> > If all top-level records in pg_stat_statements have "true" in the new
> > column (is_toplevel), how would this lead to the need to increase
> > pg_stat_statements.max? The number of records would remain the same, as
> > before extending pg_stat_statements.
>
> If the same query is getting executed both at top level and as a nested
> statement, two entries will then be created.  That's probably unlikely for
> things like RI trigger queries, but I don't know what to expect for client
> application queries.
>

Right, but this is how things already work. The extra field you've proposed
won't increase the number of records so it shouldn't affect how users
choose pg_stat_statements.max.


Re: pg_stat_statements oddity with track = all

2020-12-02 Thread Sergei Kornilov
Hello

> - add a parent_statement_id column that would be NULL for top level queries

That will generate too many entries... every FK for each different
delete/insert, for example.
But it would be very useful for databases with a lot of stored procedures, to
find where a query is called from. Maybe a new mode, track = tree? Use NULL to
indicate a top-level query (same as with track = tree) and some constant for
any nested queries when track = all.
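Something like adding a new value to the existing track enum, I suppose (a
sketch only; PGSS_TRACK_TREE is hypothetical and the other names are from
memory, not checked against pg_stat_statements.c):

#include "utils/guc.h"

typedef enum pgssTrackLevel
{
    PGSS_TRACK_NONE,            /* track no statements */
    PGSS_TRACK_TOP,             /* only top-level statements */
    PGSS_TRACK_ALL,             /* all statements, including nested ones */
    PGSS_TRACK_TREE             /* hypothetical: nested too, plus parent ids */
} pgssTrackLevel;

static const struct config_enum_entry track_options[] =
{
    {"none", PGSS_TRACK_NONE, false},
    {"top", PGSS_TRACK_TOP, false},
    {"all", PGSS_TRACK_ALL, false},
    {"tree", PGSS_TRACK_TREE, false},   /* hypothetical new mode */
    {NULL, 0, false}
};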

Also, will a top-level statement currently account for the buffer usage of
underlying statements?

regards, Sergei




Re: Autovacuum on partitioned table (autoanalyze)

2020-12-02 Thread yuzuko
Hello Alvaro,

Thank you for your comments.

>
> > On second thought about the reason for the "toprel_oid": it is perhaps
> > to avoid "wrongly" propagated values to ancestors after a manual
> > ANALYZE on a partitioned table.  But the same happens after an
> > autoanalyze iteration if some of the ancestors of a leaf relation are
> > analyzed before the leaf relation in an autoanalyze iteration.  That
> > can trigger unnecessary analyzes of some of the ancestors.
>
> I'm not sure I understand this point.  I think we should only trigger
> this on analyzes of *leaf* partitions, not intermediate partitioned
> relations.  That way you would never get these unnecessary analyzes.
> Am I missing something?
>
> (So with my proposal in the previous email, we would send the list of
> ancestor relations after analyzing a leaf partition.  Whenever we
> analyze a non-leaf, then the list of ancestors is sent as an empty
> list.)
>
The problem Horiguchi-san mentioned is as follows:

create table p1 (i int) partition by range(i);
create table p1_1 partition of p1 for values from (0) to (500)
partition by range(i);
create table p1_1_1 partition of p1_1 for values from (0) to (300);
insert into p1 select generate_series(0,299);

-- After auto analyze (first time)
postgres=# select relname, n_mod_since_analyze, last_autoanalyze from
pg_stat_all_tables where relname in ('p1','p1_1','p1_1_1');
 relname | n_mod_since_analyze |       last_autoanalyze
---------+---------------------+-------------------------------
 p1      |                 300 |
 p1_1    |                 300 |
 p1_1_1  |                   0 | 2020-12-02 22:24:18.753574+09
(3 rows)

-- Insert more rows
postgres=# insert into p1 select generate_series(0,199);
postgres=# select relname, n_mod_since_analyze, last_autoanalyze from
pg_stat_all_tables where relname in ('p1','p1_1','p1_1_1');
 relname | n_mod_since_analyze |       last_autoanalyze
---------+---------------------+-------------------------------
 p1      |                 300 |
 p1_1    |                 300 |
 p1_1_1  |                 200 | 2020-12-02 22:24:18.753574+09
(3 rows)

-- After auto analyze (second time)
postgres=# select relname, n_mod_since_analyze, last_autoanalyze from
pg_stat_all_tables where relname in ('p1','p1_1','p1_1_1');
 relname | n_mod_since_analyze |       last_autoanalyze
---------+---------------------+-------------------------------
 p1      |                   0 | 2020-12-02 22:25:18.857248+09
 p1_1    |                 200 | 2020-12-02 22:25:18.661932+09
 p1_1_1  |                   0 | 2020-12-02 22:25:18.792078+09

After the 2nd auto analyze, all relations' n_mod_since_analyze should be 0,
but p1_1's is not.  This is because p1_1 was analyzed before p1_1_1.
So p1_1 will be analyzed again in the 3rd auto analyze.
That is, propagating changes_since_analyze to *all ancestors* may cause
unnecessary analyzes even if we do this only when analyzing leaf partitions.
So I think we should track ancestors which are not analyzed in the current
iteration as Horiguchi-san proposed.

Regarding your idea:
> typedef struct PgStat_MsgAnalyze
> {
>     PgStat_MsgHdr   m_hdr;
>     Oid             m_databaseid;
>     Oid             m_tableoid;
>     bool            m_autovacuum;
>     bool            m_resetcounter;
>     TimestampTz     m_analyzetime;
>     PgStat_Counter  m_live_tuples;
>     PgStat_Counter  m_dead_tuples;
>     int             m_nancestors;
>     Oid             m_ancestors[PGSTAT_NUM_ANCESTORENTRIES];
> } PgStat_MsgAnalyze;

I'm not sure, but how about storing only the ancestors that aren't analyzed
in the current iteration in m_ancestors[PGSTAT_NUM_ANCESTORENTRIES]?


> > > > Regarding the counters in pg_stat_all_tables: maybe some of these 
> > > > should be
> > > > null rather than zero ?  Or else you should make an 0001 patch to fully
> > > > implement this view, with all relevant counters, not just 
> > > > n_mod_since_analyze,
> > > > last_*analyze, and *analyze_count.  These are specifically misleading:
> > > >
> > > > last_vacuum |
> > > > last_autovacuum |
> > > > n_ins_since_vacuum  | 0
> > > > vacuum_count| 0
> > > > autovacuum_count| 0
> > > >
> > > I haven't modified this part yet, but did you mean that we should set
> > > the vacuum-related counters to null because partitioned tables are not
> > > vacuumed?
> >
> > Perhaps because partitioned tables *cannot* be vacuumed.  I'm not sure
> > what the best way is here.  Showing null seems reasonable, but I'm not
> > sure that doesn't break anything.
>
> I agree that showing NULLs for the vacuum columns is better.  Perhaps
> the most reasonable way to do this is use -1 as an indicator that NULL
> ought to be returned from pg_stat_get_vacuum_count() et al, and add a
> boolean in PgStat_TableCounts next to t_truncated, maybe "t_nullvacuum"
> that says to store -1 instead of 0 in pgstat_recv_tabstat.
>
Thank you for the advice.  I'll fix it based on this idea.
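For example, I suppose the accessor side could end up looking roughly like
this (only a sketch of the idea, not the actual change):

#include "postgres.h"
#include "fmgr.h"
#include "pgstat.h"

/* modelled on the existing pg_stat_get_vacuum_count() in pgstatfuncs.c */
Datum
pg_stat_get_vacuum_count(PG_FUNCTION_ARGS)
{
    Oid         relid = PG_GETARG_OID(0);
    PgStat_StatTabEntry *tabentry;
    int64       result;

    if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
        result = 0;
    else
        result = (int64) tabentry->vacuum_count;

    /*
     * A stored value of -1 would be the "cannot be vacuumed" marker written
     * by pgstat_recv_tabstat for partitioned tables; report it as NULL.
     */
    if (result < 0)
        PG_RETURN_NULL();

    PG_RETURN_INT64(result);
}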

-- 
Best regards,
Yuzuko Hosoya
NTT Open Source Software 

Re: convert elog(LOG) calls to ereport

2020-12-02 Thread Alvaro Herrera
On 2020-Dec-02, Peter Eisentraut wrote:

> There are a number of elog(LOG) calls that appear to be user-facing, so they
> should be ereport()s.  This patch changes them.  There are more elog(LOG)
> calls remaining, but they all appear to be some kind of debugging support.
> Also, I changed a few elog(FATAL)s that were nearby, but I didn't
> specifically look for them.

> -		elog(LOG, "WSAIoctl(SIO_KEEPALIVE_VALS) failed: %ui",
> -			 WSAGetLastError());
> +		ereport(LOG,
> +				(errmsg("WSAIoctl(SIO_KEEPALIVE_VALS) failed: %ui",
> +						WSAGetLastError())));

Please take the opportunity to move the flag name out of the message in
this one, also.  I do wonder if it'd be a good idea to move the syscall
name itself out of the message, too; that would reduce the number of
messages to translate 50x to just "%s(%s) failed: %m" instead of one
message per distinct syscall.
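Roughly something like this, I mean (the exact placeholders are just for
illustration; this particular call couldn't use %m anyway, since
WSAGetLastError() doesn't set errno):

    /* illustrative only -- one translatable message shared by all syscalls */
    ereport(LOG,
            (errmsg("%s(%s) failed: error code %d",
                    "WSAIoctl", "SIO_KEEPALIVE_VALS",
                    WSAGetLastError())));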

Should fd.c messages do errcode_for_file_access() like elsewhere?

Overall, it looks good to me.

Thanks





Re: wrong link in acronyms.sgml

2020-12-02 Thread Alvaro Herrera
On 2020-Dec-02, Erik Rijkers wrote:

> Hi
> 
> I just noticed that in
> 
>   https://www.postgresql.org/docs/13/acronyms.html
> (i.e., doc/src/sgml/acronyms.sgml)
> 
> there is under lemma 'HOT' a link with URL:
> 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/heap/README.HOT;hb=HEAD
> 
> Is this deliberate? Surely pointing 13-docs into HEAD-docs is wrong?

Yeah, I noticed this while working on the glossary.

This link works:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/heap/README.HOT;hb=REL_13_STABLE
but the problem with this one is that we'll have to update it once per
release.

> Otherwise it may be better to remove the link altogether and just have the
> acronym lemma: 'HOT' and explanation 'Heap-Only Tuple'.

Yeah, I don't think pointing to the source-code README file is a great
resource.  I think starting with 13 we should add an entry in the
glossary for Heap-Only Tuple, and make the acronym entry point to the
glossary.  In fact, we could do this with several other acronyms too,
such as DDL, DML, GEQO, LSN, MVCC, SQL, TID, WAL.  The links to external
resources can then be put in the glossary definition, if needed.





Re: A new function to wait for the backend exit after termination

2020-12-02 Thread Muhammad Usama
On Wed, Dec 2, 2020 at 1:30 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> On Mon, Nov 30, 2020 at 8:10 PM Muhammad Usama  wrote:
> >
> > The following review has been posted through the commitfest application:
> > make installcheck-world:  tested, passed
> > Implements feature:   tested, passed
> > Spec compliant:   tested, passed
> > Documentation:not tested
> >
> > I have tested the patch against current master branch
> (commit:6742e14959a3033d946ab3d67f5ce4c99367d332)
> > Both functions work without a problem and as expected.
> >
>
> Thanks!
>
> >
> > Just a tiny comment/suggestion.
> > Specifying a negative timeout in pg_terminate_backend rightly throws an
> > error. I am not sure whether it would be a right or wrong approach, but I
> > guess we can ignore a negative timeout in the pg_terminate_backend
> > function when wait (the second argument) is false.
> >
> > e.g.  pg_terminate_backend(12320, false, -1); -- ignore the -1 timeout
> > since wait is false
> >
>
> IMO, that's not a good idea. I see it this way: for any function, first
> the input args have to be validated; only if they are okay does the use
> of those args and the main functionality follow. I can also see
> pg_promote(), which first does the input timeout validation, throwing an
> error if it is <= 0.
>
> We can retain the existing behaviour.
>

Agreed!

Thanks
Best regards
Muhammad Usama


>
> >
> > The new status of this patch is: Ready for Committer
> >
>
> Thanks!
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com
>

