Re: Fix typo in verify_heapam.c

2021-04-02 Thread Fujii Masao




On 2021/04/02 15:02, Masahiko Sawada wrote:

Hi all,

I found typos in verify_heapam.c.

s/comitted/committed/

Please find the attached patch.


Thanks for the report and patch! Pushed.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




RE: Fix pg_checksums progress report

2021-04-02 Thread Shinya11.Kato
>-Original Message-
>From: Fujii Masao 
>Sent: Friday, April 2, 2021 2:39 PM
>To: shinya11.k...@nttdata.com; pgsql-hack...@postgresql.org
>Subject: Re: Fix pg_checksums progress report
>
>
>
>On 2021/04/02 14:23, shinya11.k...@nttdata.com wrote:
>> Hi,
>>
>> I found a problem in pg_checksums.c.
>>
>> The total_size is calculated by scanning the directory.
>> The current_size is calculated by scanning the files, but the current_size
>> does not include the size of NewPages.
>>
>> This may cause the pg_checksums progress report to never reach 100%.
>> I have attached a patch that fixes this.
>
>Thanks for the report and patch!
>
>I could reproduce this issue and confirmed that the patch fixes it.
>
>Regarding the patch, I think it's better to add a comment about why
>current_size needs to be counted including new pages.

Thanks for your review.
I added a comment to the patch, and attached the new patch.

Regards,
Shinya Kato



fix_pg_checksums_progress_report_v2.patch
Description: fix_pg_checksums_progress_report_v2.patch


Re: making update/delete of inheritance trees scale better

2021-04-02 Thread Amit Langote
On Wed, Mar 31, 2021 at 9:54 PM Amit Langote  wrote:
> I reran some of the performance tests I did earlier (I've attached the
> modified test running script for reference):

For the archives' sake: having noticed a mistake in my benchmarking script, I
repeated the tests. Apparently, all pgbench runs had been performed with
40-column tables, not 10, 20, and 40 columns as shown in the results.
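
For reference, the table setup implied by the script would look roughly
like this (a sketch only; the actual script is attached upthread, and the
hash partitioning scheme and exact column layout are assumptions):

CREATE TABLE test_table (a int, b int /* ... up to 40 columns */)
    PARTITION BY HASH (a);
-- one partition per modulus/remainder pair, e.g. for 64 partitions:
CREATE TABLE test_table_p0 PARTITION OF test_table
    FOR VALUES WITH (MODULUS 64, REMAINDER 0);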

> pgbench -n -T60 -M{simple|prepared} -f nojoin.sql
>
> nojoin.sql:
>
> \set a random(1, 100)
> update test_table t set b = :a where a = :a;
>
> ...and here are the tps figures:
>
> -Msimple
>
> nparts  10cols  20cols  40cols
>
> master:
> 64      10112   9878    10920
> 128     9662    10691   10604
> 256     9642    9691    10626
> 1024    8589    9675    9521
>
> patched:
> 64      13493   13463   13313
> 128     13305   13447   12705
> 256     13190   13161   12954
> 1024    11791   11408   11786
>
> No variation across the various column counts, but the patched version
> improves the tps in each case by quite a bit.

-Msimple

pre-86dc90056:
nparts  10cols  20cols  40cols

64      11345   10650   10327
128     11014   11005   10069
256     10759   10827   10133
1024    9518    10314   8418

post-86dc90056:
nparts  10cols  20cols  40cols

64      13829   13677   13207
128     13521   12843   12418
256     13071   13006   12926
1024    12351   12036   11739

My previous assertion that the tps does not vary across different column
counts seems to hold in this case, that is, in -Msimple mode.

> -Mprepared (plan_cache_mode=force_generic_plan)
>
> master:
> 64      2286    2285    2266
> 128     1163    1127    1091
> 256     531     519     544
> 1024    77      71      69
>
> patched:
> 64      6522    6612    6275
> 128     3568    3625    3372
> 256     1847    1710    1823
> 1024    433     427     386
>
> Again, no variation across columns counts.

-Mprepared

pre-86dc90056:
nparts  10cols  20cols  40cols

64      3059    2851    2154
128     1675    1366    1100
256     685     658     544
1024    126     85      76

post-86dc90056:
nparts  10cols  20cols  40cols

64      7665    6966    6444
128     4211    3968    3389
256     2205    2020    1783
1024    545     499     389

In the -Mprepared case however, it does vary, both before and after
86dc90056.  For the post-86dc90056 case, I suspect that's because
ExecBuildUpdateProjection(), whose complexity is O(number-of-columns),
is performed for *all* partitions in ExecInitModifyTable().  In the
-Msimple case, it would always be performed for only one partition, so it
doesn't make that much of a difference to the ExecInitModifyTable() time.

>  tps drops as partition
> count increases both before and after applying the patches, although
> patched performs way better, which is mainly attributable to the
> ability of UPDATE to now utilize runtime pruning (actually of the
> Append under ModifyTable).  The drop as partition count increases can
> be attributed to the fact that with a generic plan, there are a bunch
> of steps that must be done across all partitions, such as
> AcquireExecutorLocks(), ExecCheckRTPerms(), per-result-rel
> initializations in ExecInitModifyTable(), etc., even with the patches applied.
> As mentioned upthread, [1] can help with the last bit.

Here are the numbers after applying that patch:

nparts  10cols  20cols  40cols

64      17185   17064   16625
128     12261   11648   11968
256     7662    7564    7439
1024    2252    2185    2101

With the patch, ExecBuildUpdateProjection() will be called only once
irrespective of the number of partitions, almost like the -Msimple
case, so the tps across column counts does not vary by much.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: invalid data in file backup_label problem on windows

2021-04-02 Thread Michael Paquier
On Thu, Apr 01, 2021 at 09:56:02AM +0900, Michael Paquier wrote:
> Okay, that sounds good to me.

Applied and backpatched then as 6fb66c26 & co.
--
Michael


signature.asc
Description: PGP signature


Re: Fix typo in verify_heapam.c

2021-04-02 Thread Masahiko Sawada
On Fri, Apr 2, 2021 at 4:28 PM Fujii Masao  wrote:
>
>
>
> On 2021/04/02 15:02, Masahiko Sawada wrote:
> > Hi all,
> >
> > I found typos in verify_heapam.c.
> >
> > s/comitted/committed/
> >
> > Please find the attached patch.
>
> Thanks for the report and patch! Pushed.

Thank you!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Fix pg_checksums progress report

2021-04-02 Thread Michael Paquier
On Fri, Apr 02, 2021 at 07:30:32AM +0000, shinya11.k...@nttdata.com wrote:
> I added a comment to the patch, and attached the new patch.

Hmm.  This looks to come from 280e5f14 that introduced the progress
reports so this would need a backpatch down to 12.  I have not looked
in details and have not looked at the patch yet, though.  Fujii-san,
are you planning to take care of that?  That was my stuff originally,
so I am fine to look at it.  But not now, on a Friday afternoon :)
--
Michael


signature.asc
Description: PGP signature


why pg_walfile_name() cannot be executed during recovery?

2021-04-02 Thread SATYANARAYANA NARLAPURAM
Hello Hackers,

Why can't pg_walfile_name() be executed during recovery? What is the best
way for me to get the current timeline and/or the WAL file currently being
replayed on the standby using a Postgres query? I know I can get it via the
process title, but I don't want to go that route.

Thanks,
Satya


Re: Data type correction in pgstat_report_replslot function parameters

2021-04-02 Thread Fujii Masao




On 2021/04/02 11:20, vignesh C wrote:

On Thu, Apr 1, 2021 at 11:18 PM Fujii Masao  wrote:




On 2021/04/02 2:18, Jeevan Ladhe wrote:



On Thu, Apr 1, 2021 at 10:20 PM vignesh C wrote:

 Hi,

 While I was reviewing replication slot statistics code, I found one
 issue in the data type used for pgstat_report_replslot function
 parameters. We pass int64 variables to the function but the function
 prototype uses int type. I felt the function parameters should be
 int64. Attached patch fixes the same.


Isn't it better to use PgStat_Counter instead of int64?



Thanks for your comment, the updated patch contains the changes for it.


Thanks for updating the patch! Pushed.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




badly calculated width of emoji in psql

2021-04-02 Thread Pavel Stehule
Hi

I checked a query from the
https://www.depesz.com/2021/04/01/waiting-for-postgresql-14-add-unistr-function/
article.

postgres=# SELECT U&'\+01F603';
┌──┐
│ ?column? │
╞══╡
│ 😃│
└──┘
(1 row)


The result is not correct: the emoji has a display width of 2 characters,
but psql calculates it as just one.
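
A quick way to see the mismatch (standard functions only): the value is a
single character, but a terminal renders the glyph two columns wide, while
psql sizes the border for one:

SELECT U&'\+01F603' AS emoji, length(U&'\+01F603') AS chars;
-- length() reports 1 character; the emoji still occupies 2 screen columns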

Regards

Pavel


Re: Fix pg_checksums progress report

2021-04-02 Thread Fujii Masao




On 2021/04/02 16:47, Michael Paquier wrote:

On Fri, Apr 02, 2021 at 07:30:32AM +0000, shinya11.k...@nttdata.com wrote:

I added a comment to the patch, and attached the new patch.


Thanks for updating the patch!

+   /*
+    * The current_size is calculated before checking if header is a
+    * new page, because total_size includes the size of new pages.
+    */
+   current_size += r;

I'd like to comment more. What about the following?

---
Since the file size is counted as total_size for progress status information, 
the sizes of all pages including new ones in the file should be counted as 
current_size. Otherwise the progress reporting calculated using those counters 
may not reach 100%.
---



Hmm.  This looks to come from 280e5f14 that introduced the progress
reports so this would need a backpatch down to 12.


Yes.



I have not looked
in details and have not looked at the patch yet, though.  Fujii-san,
are you planning to take care of that?


Yes, I will. Thanks for the consideration!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Add Nullif case for eval_const_expressions_mutator

2021-04-02 Thread Peter Eisentraut

On 30.03.21 11:20, Peter Eisentraut wrote:

On 24.03.21 11:52, houzj.f...@fujitsu.com wrote:

+    if (!has_nonconst_input)
+    return ece_evaluate_expr(expr);

That's not okay without a further check to see if the comparison function
used by the node is immutable.  Compare ScalarArrayOpExpr, for instance.


Thanks for pointing it out.
Attaching new patch with this change.


This patch looks okay to me and addresses all the feedback that was 
given.  If there are no more comments, I'll commit it in a few days.


done




RE: Fix pg_checksums progress report

2021-04-02 Thread Shinya11.Kato
>-Original Message-
>From: Fujii Masao 
>Sent: Friday, April 2, 2021 6:03 PM
>To: Michael Paquier ; shinya11.k...@nttdata.com
>Cc: pgsql-hack...@postgresql.org
>Subject: Re: Fix pg_checksums progress report
>
>
>
>On 2021/04/02 16:47, Michael Paquier wrote:
>> On Fri, Apr 02, 2021 at 07:30:32AM +0000, shinya11.k...@nttdata.com wrote:
>>> I added a comment to the patch, and attached the new patch.
>
>Thanks for updating the patch!
>
>+  /*
>+   * The current_size is calculated before checking if header is a
>+   * new page, because total_size includes the size of new pages.
>+   */
>+  current_size += r;
>
>I'd like to comment more. What about the following?
>
>---
>Since the file size is counted as total_size for progress status information,
>the sizes of all pages including new ones in the file should be counted as
>current_size. Otherwise the progress reporting calculated using those counters
>may not reach 100%.
>---

Thanks for your review!
I updated the patch, and attached it.

Regards,
Shinya Kato


fix_pg_checksums_progress_report_v3.patch
Description: fix_pg_checksums_progress_report_v3.patch


Re: badly calculated width of emoji in psql

2021-04-02 Thread Laurenz Albe
On Fri, 2021-04-02 at 10:45 +0200, Pavel Stehule wrote:
> I checked a query from the
> https://www.depesz.com/2021/04/01/waiting-for-postgresql-14-add-unistr-function/
>  article.
> 
> postgres=# SELECT U&'\+01F603';
> ┌──┐
> │ ?column? │
> ╞══╡
> │ 😃│
> └──┘
> (1 row)
> 
> 
> The result is not correct: the emoji has a display width of 2 characters,
> but psql calculates it as just one.

How about this:

diff --git a/src/common/wchar.c b/src/common/wchar.c
index 6e7d731e02..e2d0d9691c 100644
--- a/src/common/wchar.c
+++ b/src/common/wchar.c
@@ -673,7 +673,8 @@ ucs_wcwidth(pg_wchar ucs)
  (ucs >= 0xfe30 && ucs <= 0xfe6f) ||   /* CJK Compatibility Forms */
  (ucs >= 0xff00 && ucs <= 0xff5f) ||   /* Fullwidth Forms */
  (ucs >= 0xffe0 && ucs <= 0xffe6) ||
- (ucs >= 0x20000 && ucs <= 0x2fffd)));
+ (ucs >= 0x20000 && ucs <= 0x2fffd) ||
+ (ucs >= 0x1f300 && ucs <= 0x1faff))); /* symbols and emojis */
 }
 
 /*

This is guesswork based on the unicode entries that look like symbols.

Yours,
Laurenz Albe





Re: badly calculated width of emoji in psql

2021-04-02 Thread Pavel Stehule
On Fri, Apr 2, 2021 at 11:37 AM Laurenz Albe
wrote:

> On Fri, 2021-04-02 at 10:45 +0200, Pavel Stehule wrote:
> > I checked a query from the
> https://www.depesz.com/2021/04/01/waiting-for-postgresql-14-add-unistr-function/
> article.
> >
> > postgres=# SELECT U&'\+01F603';
> > ┌──┐
> > │ ?column? │
> > ╞══╡
> > │ 😃│
> > └──┘
> > (1 row)
> >
> >
> > The result is not correct: the emoji has a display width of 2 characters,
> > but psql calculates it as just one.
>
> How about this:
>
> diff --git a/src/common/wchar.c b/src/common/wchar.c
> index 6e7d731e02..e2d0d9691c 100644
> --- a/src/common/wchar.c
> +++ b/src/common/wchar.c
> @@ -673,7 +673,8 @@ ucs_wcwidth(pg_wchar ucs)
>   (ucs >= 0xfe30 && ucs <= 0xfe6f) ||   /* CJK Compatibility Forms */
>   (ucs >= 0xff00 && ucs <= 0xff5f) ||   /* Fullwidth Forms */
>   (ucs >= 0xffe0 && ucs <= 0xffe6) ||
> - (ucs >= 0x20000 && ucs <= 0x2fffd)));
> + (ucs >= 0x20000 && ucs <= 0x2fffd) ||
> + (ucs >= 0x1f300 && ucs <= 0x1faff))); /* symbols and emojis */
>  }
>
>  /*
>
> This is guesswork based on the unicode entries that look like symbols.
>

It helps: with this patch, the formatting is correct.

Pavel

>
> Yours,
> Laurenz Albe
>
>


Re: Fix pg_checksums progress report

2021-04-02 Thread Michael Paquier
On Fri, Apr 02, 2021 at 06:03:21PM +0900, Fujii Masao wrote:
> On 2021/04/02 16:47, Michael Paquier wrote:
>> I have not looked
>> in details and have not looked at the patch yet, though.  Fujii-san,
>> are you planning to take care of that?
> 
> Yes, I will. Thanks for the consideration!

OK, thanks!
--
Michael


signature.asc
Description: PGP signature


Re: Refactoring HMAC in the core code

2021-04-02 Thread Michael Paquier
On Mon, Feb 15, 2021 at 08:25:27PM +0900, Michael Paquier wrote:
> Again a new rebase, giving v5:
> - Fixed the APIs to return -1 if the caller gives NULL in input, to be
> consistent with cryptohash.
> - Added a length argument to pg_hmac_final(), with sanity checks.

So, this patch has been around for a couple of weeks now, and I would
like to get this part done in 14 to close the loop with the parts of
the code that had better rely on what the crypto libs provide.  The main
advantage of this change is for SCRAM, which can then use the crypto
library's HMAC implementation instead of its own whenever possible.

Any objections?
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2021-04-02 Thread Fujii Masao




On 2021/04/02 2:22, Fujii Masao wrote:

Thanks a lot! Barring any objection, I will commit this version.


Pushed. Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: why pg_walfile_name() cannot be executed during recovery?

2021-04-02 Thread Robert Haas
On Fri, Apr 2, 2021 at 4:23 AM SATYANARAYANA NARLAPURAM
 wrote:
> Why can't pg_walfile_name() be executed during recovery?

I believe the issue is that the backend executing the function might
not have an accurate idea about which TLI to use. But I don't
understand why we can't find some solution to that problem.

> What is the best way for me to get the current timeline and/or the WAL file
> currently being replayed on the standby using a Postgres query? I know I can
> get it via the process title, but I don't want to go that route.

pg_stat_wal_receiver has LSN and TLI information, but probably won't
help except when WAL receiver is actually active.
pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn() will give the
LSN at any point during recovery, but not the TLI. We might have some
gaps in this area...
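
For the archives, a minimal sketch of what those existing facilities give
you when run on a standby (none of them reports the WAL file name directly):

SELECT * FROM pg_stat_wal_receiver;
-- LSN and TLI details, but only while the WAL receiver is active

SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();
-- LSNs at any point during recovery, but no timeline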

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: SQL-standard function body

2021-04-02 Thread Peter Eisentraut

On 31.03.21 12:12, Julien Rouhaud wrote:

On Tue, Mar 23, 2021 at 11:28:55PM -0500, Jaime Casanova wrote:

On Fri, Mar 19, 2021 at 8:49 AM Peter Eisentraut
 wrote:


Right.  Here is a new patch with that fix added and a small conflict
resolved.


Great.

It seems print_function_sqlbody() is not protected against receiving
a function that doesn't have a SQL-standard body in
src/backend/utils/adt/ruleutils.c:3292; instead it has an assert
that gets hit with something like this:

CREATE FUNCTION foo() RETURNS int LANGUAGE SQL AS $$ SELECT 1 $$;
SELECT pg_get_function_sqlbody('foo'::regproc);


fixed


It would also be good to add a regression test checking that we can't define a
function with both a prosrc and a prosqlbody.


done
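
For the archives, such a conflicting definition would look roughly like
this (a sketch; the exact error wording is an assumption):

CREATE FUNCTION two() RETURNS int LANGUAGE SQL
    AS $$ SELECT 2 $$
    RETURN 2;
-- should fail, e.g. ERROR:  duplicate function body specified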


@@ -76,6 +77,7 @@ ProcedureCreate(const char *procedureName,
 Oid languageValidator,
 const char *prosrc,
 const char *probin,
+   Node *prosqlbody,
 char prokind,
 bool security_definer,
 bool isLeakProof,
@@ -119,8 +121,6 @@ ProcedureCreate(const char *procedureName,
 /*
  * sanity checks
  */
-   Assert(PointerIsValid(prosrc));
-
 parameterCount = parameterTypes->dim1;


Shouldn't we still assert that we either have a valid prosrc or a valid
prosqlbody?


fixed

New patch attached.
From 7292b049b9017e02ffb8bbb830d91f7700e14a76 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 2 Apr 2021 14:23:14 +0200
Subject: [PATCH v11] SQL-standard function body

This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
statements for language SQL with a function body that conforms to the
SQL standard and is portable to other implementations.

Instead of the PostgreSQL-specific AS $$ string literal $$ syntax,
this allows writing out the SQL statements making up the body
unquoted, either as a single statement:

CREATE FUNCTION add(a integer, b integer) RETURNS integer
LANGUAGE SQL
RETURN a + b;

or as a block

CREATE PROCEDURE insert_data(a integer, b integer)
LANGUAGE SQL
BEGIN ATOMIC
  INSERT INTO tbl VALUES (a);
  INSERT INTO tbl VALUES (b);
END;

The function body is parsed at function definition time and stored as
expression nodes in a new pg_proc column prosqlbody.  So at run time,
no further parsing is required.

However, this form does not support polymorphic arguments, because
there is no more parse analysis done at call time.

Dependencies between the function and the objects it uses are fully
tracked.

A new RETURN statement is introduced.  This can only be used inside
function bodies.  Internally, it is treated much like a SELECT
statement.

psql needs some new intelligence to keep track of function body
boundaries so that it doesn't send off statements when it sees
semicolons that are inside a function body.

Also, per SQL standard, LANGUAGE SQL is the default, so it does not
need to be specified anymore.

Discussion: 
https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c0...@2ndquadrant.com
---
 doc/src/sgml/catalogs.sgml|  10 +
 doc/src/sgml/ref/create_function.sgml | 126 -
 doc/src/sgml/ref/create_procedure.sgml|  62 -
 src/backend/catalog/pg_aggregate.c|   1 +
 src/backend/catalog/pg_proc.c | 116 +---
 src/backend/commands/aggregatecmds.c  |   2 +
 src/backend/commands/functioncmds.c   | 130 +++--
 src/backend/commands/typecmds.c   |   4 +
 src/backend/executor/functions.c  |  79 +++---
 src/backend/nodes/copyfuncs.c |  15 +
 src/backend/nodes/equalfuncs.c|  13 +
 src/backend/nodes/outfuncs.c  |  12 +
 src/backend/nodes/readfuncs.c |   1 +
 src/backend/optimizer/util/clauses.c  | 126 ++---
 src/backend/parser/analyze.c  |  35 +++
 src/backend/parser/gram.y | 129 +++--
 src/backend/tcop/postgres.c   |   3 +-
 src/backend/utils/adt/ruleutils.c | 140 +-
 src/bin/pg_dump/pg_dump.c |  45 ++-
 src/bin/psql/describe.c   |  15 +-
 src/fe_utils/psqlscan.l   |  24 +-
 src/include/catalog/pg_proc.dat   |   4 +
 src/include/catalog/pg_proc.h |   6 +-
 src/include/commands/defrem.h |   2 +
 src/include/executor/functions.h  |  15 +
 src/include/fe_utils/psqlscan_int.h   |   2 +
 src/include/nodes/nodes.h |   1 +
 src/include/nodes/parsenodes.h|  13 +
 src/include/parser/kwlist.h   |   2 +
 src/include/tcop/tcopprot.h   |   1 +
 src/interfaces/ecpg/preproc/ecpg.addons   |   6 +
 src/interfaces/ecpg/preproc/ecpg.trailer  |   4 +-
 .../regress/expected/create_function_3.out| 260 +-
 .../regress/expect

Re: libpq debug log

2021-04-02 Thread Alvaro Herrera
On 2021-Apr-02, Kyotaro Horiguchi wrote:

> At Fri, 02 Apr 2021 14:40:09 +0900 (JST), Kyotaro Horiguchi 
>  wrote in 

> > So, the cheapest measure for regression test would be just making the
> 
> So, the cheapest measure for regression test would be just *masking* the
> 
> > length field, while regress is true...

Yeah.

> tired?

Same here.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end." (2nd Commandment for C programmers)




Re: simplifying foreign key/RI checks

2021-04-02 Thread Amit Langote
On Sat, Mar 20, 2021 at 10:21 PM Amit Langote  wrote:
> Updated patches attached.  Sorry about the delay.

Rebased over the recent DETACH PARTITION CONCURRENTLY work.
Apparently, ri_ReferencedKeyExists() was using the wrong snapshot.

-- 
Amit Langote
EDB: http://www.enterprisedb.com


v8-0001-Export-get_partition_for_tuple.patch
Description: Binary data


v8-0002-Avoid-using-SPI-for-some-RI-checks.patch
Description: Binary data


Re: policies with security definer option for allowing inline optimization

2021-04-02 Thread Isaac Morland
On Fri, 2 Apr 2021 at 01:44, Dan Lynch  wrote:

> RLS policies quals/checks are optimized inline, and so I generally avoid
> writing a separate procedure so the optimizer can do its thing.
>
> However, if you need a security definer to avoid recursive RLS if you're
> doing a more complex query say, on a join table, anyone wish there was a
> flag on the policy itself to specify that `WITH CHECK` or `USING`
> expression could be run via security definer?
>
> The main reason for this is to avoid writing a separate security definer
> function so you can benefit from the optimizer.
>
> Is this possible? Would this be worth a feature request to postgres core?
>

If we're going to do this we should do the same for triggers as well.

It's easy to imagine a situation in which RLS policies need to refer to
information which should not be accessible to the role using the table, and
similarly it's easy to imagine a situation in which a trigger needs to
write to another table which should not be accessible to the role using the
table which has the trigger.
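
For context, the workaround under discussion looks roughly like this (a
sketch with invented table, column, and function names):

-- The membership lookup runs as the function owner, sidestepping
-- recursive RLS on the join table.  current_user is evaluated in the
-- caller's context and passed in as an argument, since inside a
-- SECURITY DEFINER function current_user would be the function owner.
CREATE FUNCTION is_project_member(project int, who name) RETURNS boolean
    LANGUAGE sql STABLE SECURITY DEFINER
AS $$
    SELECT EXISTS (SELECT 1 FROM project_members
                   WHERE project_id = project AND member = who);
$$;

CREATE POLICY member_select ON documents
    FOR SELECT USING (is_project_member(project_id, current_user));

The flag being asked about would let the USING expression itself run as the
table or policy owner, so this separate function (which the planner cannot
inline, being SECURITY DEFINER) becomes unnecessary.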


Re: a misbehavior of partition row movement (?)

2021-04-02 Thread Amit Langote
On Thu, Apr 1, 2021 at 10:56 AM Masahiko Sawada  wrote:
> On Tue, Mar 23, 2021 at 6:27 PM Amit Langote  wrote:
> > Actually, I found a big hole in my assumptions around deferrable
> > foreign constraints, invalidating the approach I took in 0002 to use a
> > query-lifetime tuplestore to record root parent tuples.  I'm trying to
> > find a way to make the tuplestore transaction-lifetime so that the
> > patch still works.
> >
> > In the meantime, I'm attaching an updated set with 0001 changed per
> > your comments.
>
> 0001 patch conflicts with 71f4c8c6f74. Could you please rebase the patchset?

Thanks for the heads up.

I still don't have a working patch to address the above mentioned
shortcoming of the previous approach, but here is a rebased version in
the meantime.


--
Amit Langote
EDB: http://www.enterprisedb.com


v7-0001-Create-foreign-key-triggers-in-partitioned-tables.patch
Description: Binary data


v7-0002-Enforce-foreign-key-correctly-during-cross-partit.patch
Description: Binary data


Re: policies with security definer option for allowing inline optimization

2021-04-02 Thread Stephen Frost
Greetings,

* Isaac Morland (isaac.morl...@gmail.com) wrote:
> On Fri, 2 Apr 2021 at 01:44, Dan Lynch  wrote:
> > RLS policies quals/checks are optimized inline, and so I generally avoid
> > writing a separate procedure so the optimizer can do its thing.
> >
> > However, if you need a security definer function to avoid recursive RLS
> > when doing a more complex query, say on a join table, does anyone wish
> > there were a flag on the policy itself to specify that the `WITH CHECK`
> > or `USING` expression could be run via security definer?
> >
> > The main reason for this is to avoid writing a separate security definer
> > function so you can benefit from the optimizer.
> >
> > Is this possible? Would this be worth a feature request to postgres core?
> 
> If we're going to do this we should do the same for triggers as well.

... and views.

> It's easy to imagine a situation in which RLS policies need to refer to
> information which should not be accessible to the role using the table, and
> similarly it's easy to imagine a situation in which a trigger needs to
> write to another table which should not be accessible to the role using the
> table which has the trigger.

I'm generally +1 on adding the ability for the DBA to choose which user
various things run as.  There's definitely use-cases for both in my
experience.  Also would be great to add the ability to have policies on
views too which would probably help address some of these cases.

Thanks,

Stephen


signature.asc
Description: PGP signature


Why reset pgstat during recovery

2021-04-02 Thread 蔡梦娟(玊于)

Hi, all

I want to know why the pgstat_reset_all function is called during the
recovery process. Under what circumstances would the statistics data be
invalid after recovery?
 
Thanks & Best Regards




Re: policies with security definer option for allowing inline optimization

2021-04-02 Thread Chapman Flack
On 04/02/21 09:09, Isaac Morland wrote:
> If we're going to do this we should do the same for triggers as well.
> 
> ... it's easy to imagine a situation in which a trigger needs to
> write to another table which should not be accessible to the role using the
> table which has the trigger.

Triggers seem to be an area of long-standing weirdness[1].

Regards,
-Chap


[1]
https://www.postgresql.org/message-id/b1be2d05-b9fd-b9db-ea7f-38253e4e4bab%40anastigmatix.net




Re: libpq debug log

2021-04-02 Thread Alvaro Herrera
On 2021-Apr-02, Alvaro Herrera wrote:

> On 2021-Apr-02, Kyotaro Horiguchi wrote:
> 
> > At Fri, 02 Apr 2021 14:40:09 +0900 (JST), Kyotaro Horiguchi 
> >  wrote in 
> 
> > > So, the cheapest measure for regression test would be just making the
> > 
> > So, the cheapest measure for regression test would be just *masking* the
> > 
> > > length field, while regress is true...
> 
> Yeah.

As in the attached patch.

-- 
Álvaro Herrera   Valdivia, Chile
"Investigación es lo que hago cuando no sé lo que estoy haciendo"
(Wernher von Braun)
>From 387a15ec80cfcdd2279b109215b9f658bc972748 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 2 Apr 2021 10:51:42 -0300
Subject: [PATCH] Suppress length of Notice/Error in PQtrace regress mode

---
 src/interfaces/libpq/fe-trace.c  | 9 -
 .../modules/libpq_pipeline/traces/pipeline_abort.trace   | 8 
 src/test/modules/libpq_pipeline/traces/transaction.trace | 6 +++---
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/src/interfaces/libpq/fe-trace.c b/src/interfaces/libpq/fe-trace.c
index 9a4595f5c8..298713c602 100644
--- a/src/interfaces/libpq/fe-trace.c
+++ b/src/interfaces/libpq/fe-trace.c
@@ -540,7 +540,14 @@ pqTraceOutputMessage(PGconn *conn, const char *message, bool toServer)
 	length = (int) pg_ntoh32(length);
 	logCursor += 4;
 
-	fprintf(conn->Pfdebug, "%s\t%d\t", prefix, length);
+	/*
+	 * In regress mode, suppress the length of ErrorResponse and
+	 * NoticeResponse -- they vary depending on compiler.
+	 */
+	if (regress && !toServer && (id == 'E' || id == 'N'))
+		fprintf(conn->Pfdebug, "%s\tNN\t", prefix);
+	else
+		fprintf(conn->Pfdebug, "%s\t%d\t", prefix, length);
 
 	switch (id)
 	{
diff --git a/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace b/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace
index b061435785..254e485997 100644
--- a/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace
+++ b/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace
@@ -1,5 +1,5 @@
 F	42	Query	 "DROP TABLE IF EXISTS pq_pipeline_demo"
-B	123	NoticeResponse	 S "NOTICE" V "NOTICE" C "00000" M "table "pq_pipeline_demo" does not exist, skipping" F "" L "" R "" \x00
+B	NN	NoticeResponse	 S "NOTICE" V "NOTICE" C "00000" M "table "pq_pipeline_demo" does not exist, skipping" F "" L "" R "" \x00
 B	15	CommandComplete	 "DROP TABLE"
 B	5	ReadyForQuery	 I
 F	99	Query	 "CREATE UNLOGGED TABLE pq_pipeline_demo(id serial primary key, itemno integer,int8filler int8);"
@@ -27,7 +27,7 @@ B	4	ParseComplete
 B	4	BindComplete
 B	4	NoData
 B	15	CommandComplete	 "INSERT 0 1"
-B	217	ErrorResponse	 S "ERROR" V "ERROR" C "42883" M "function no_such_function(integer) does not exist" H "No function matches the given name and argument types. You might need to add explicit type casts." P "8" F "" L "" R "" \x00
+B	NN	ErrorResponse	 S "ERROR" V "ERROR" C "42883" M "function no_such_function(integer) does not exist" H "No function matches the given name and argument types. You might need to add explicit type casts." P "8" F "" L "" R "" \x00
 B	5	ReadyForQuery	 I
 B	4	ParseComplete
 B	4	BindComplete
@@ -39,7 +39,7 @@ F	12	Bind	 "" "" 0 0 0
 F	6	Describe	 P ""
 F	9	Execute	 "" 0
 F	4	Sync
-B	123	ErrorResponse	 S "ERROR" V "ERROR" C "42601" M "cannot insert multiple commands into a prepared statement" F "" L "" R "" \x00
+B	NN	ErrorResponse	 S "ERROR" V "ERROR" C "42601" M "cannot insert multiple commands into a prepared statement" F "" L "" R "" \x00
 B	5	ReadyForQuery	 I
 F	54	Parse	 "" "SELECT 1.0/g FROM generate_series(3, -1, -1) g" 0
 F	12	Bind	 "" "" 0 0 0
@@ -52,7 +52,7 @@ B	33	RowDescription	 1 "?column?"  0  65535 -1 0
B	32	DataRow	 1 22 '0.33333333333333333333'
B	32	DataRow	 1 22 '0.50000000000000000000'
B	32	DataRow	 1 22 '1.00000000000000000000'
-B	70	ErrorResponse	 S "ERROR" V "ERROR" C "22012" M "division by zero" F "" L "" R "" \x00
+B	NN	ErrorResponse	 S "ERROR" V "ERROR" C "22012" M "division by zero" F "" L "" R "" \x00
 B	5	ReadyForQuery	 I
 F	40	Query	 "SELECT itemno FROM pq_pipeline_demo"
 B	31	RowDescription	 1 "itemno"  2  4 -1 0
diff --git a/src/test/modules/libpq_pipeline/traces/transaction.trace b/src/test/modules/libpq_pipeline/traces/transaction.trace
index 0267a534fa..1dcc2373c0 100644
--- a/src/test/modules/libpq_pipeline/traces/transaction.trace
+++ b/src/test/modules/libpq_pipeline/traces/transaction.trace
@@ -1,5 +1,5 @@
 F	79	Query	 "DROP TABLE IF EXISTS pq_pipeline_tst;CREATE TABLE pq_pipeline_tst (id int)"
-B	122	NoticeResponse	 S "NOTICE" V "NOTICE" C "00000" M "table "pq_pipeline_tst" does not exist, skipping" F "" L "" R "" \x00
+B	NN	NoticeResponse	 S "NOTICE" V "NOTICE" C "00000" M "table "pq_pipeline_tst" does not exist, skipping" F "" L "" R "" \x00
 B	15	CommandComplete	 "DROP TABLE"
 B	17	CommandComplete	 "CREATE TABLE"
 B	5	ReadyForQue

Re: policies with security definer option for allowing inline optimization

2021-04-02 Thread Isaac Morland
On Fri, 2 Apr 2021 at 09:30, Stephen Frost  wrote:

> Greetings,
>
> * Isaac Morland (isaac.morl...@gmail.com) wrote:
> > On Fri, 2 Apr 2021 at 01:44, Dan Lynch  wrote:
> > > RLS policies quals/checks are optimized inline, and so I generally avoid
> > > writing a separate procedure so the optimizer can do its thing.
> > >
> > > However, if you need a security definer function to avoid recursive RLS
> > > when doing a more complex query, say on a join table, does anyone wish
> > > there were a flag on the policy itself to specify that the `WITH CHECK`
> > > or `USING` expression could be run via security definer?
> > >
> > > The main reason for this is to avoid writing a separate security definer
> > > function so you can benefit from the optimizer.
> > >
> > > Is this possible? Would this be worth a feature request to postgres core?
> >
> > If we're going to do this we should do the same for triggers as well.
>
> ... and views.
>

Views already run security definer, allowing them to be used for some of
the same information-hiding purposes as RLS. But I just found something
strange: current_user/_role returns the user's role, not the view owner's
role:

postgres=# create table tt as select 5;
SELECT 1
postgres=# create view tv as select *, current_user from tt;
CREATE VIEW
postgres=# table tt;
 ?column? 
----------
        5
(1 row)

postgres=# table tv;
 ?column? | current_user 
----------+--------------
        5 | postgres
(1 row)

postgres=# set role to t1;
SET
postgres=> table tt;
ERROR:  permission denied for table tt
postgres=> table tv;
ERROR:  permission denied for view tv
postgres=> set role to postgres;
SET
postgres=# grant select on tv to public;
GRANT
postgres=# set role to t1;
SET
postgres=> table tt;
ERROR:  permission denied for table tt
postgres=> table tv;
 ?column? | current_user 
----------+--------------
        5 | t1
(1 row)

postgres=>

Note that even though current_user is t1 "inside" the view, it is still
able to see the contents of table tt. Shouldn't current_user/_role return
the view owner in this situation? By contrast security definer functions
work properly:

postgres=# create function get_current_user_sd () returns name security
definer language sql as $$ select current_user $$;
CREATE FUNCTION
postgres=# select get_current_user_sd ();
 get_current_user_sd 
---------------------
 postgres
(1 row)

postgres=# set role t1;
SET
postgres=> select get_current_user_sd ();
 get_current_user_sd 
---------------------
 postgres
(1 row)

postgres=>


Re: policies with security definer option for allowing inline optimization

2021-04-02 Thread Isaac Morland
On Fri, 2 Apr 2021 at 09:44, Chapman Flack  wrote:

> On 04/02/21 09:09, Isaac Morland wrote:
> > If we're going to do this we should do the same for triggers as well.
> >
> > ... it's easy to imagine a situation in which a trigger needs to
> > write to another table which should not be accessible to the role using
> the
> > table which has the trigger.
>
> Triggers seem to be an area of long-standing weirdness[1].
>

Thanks for that reference. That has convinced me that I was wrong in a
previous discussion to say that triggers should run as the table owner:
instead, they should run as the trigger owner (implying that triggers
should have owners). Of course at this point the change could only be made
as an option in order to avoid a backward compatibility break.

[1]
>
> https://www.postgresql.org/message-id/b1be2d05-b9fd-b9db-ea7f-38253e4e4bab%40anastigmatix.net
>


Re: libpq debug log

2021-04-02 Thread Tom Lane
Alvaro Herrera  writes:
> As in the attached patch.

+1, but the comment could be more specific.  Maybe like "In regress mode,
suppress the length of ErrorResponse and NoticeResponse messages --- the F
(file name) field, in particular, can vary in length depending on compiler
used."

Actually though ... I recall now that elog.c tries to standardize the F
output by stripping the path part:

    /* keep only base name, useful especially for vpath builds */
    slash = strrchr(filename, '/');
    if (slash)
        filename = slash + 1;

I bet what is happening on drongo is that the compiler has generated a
__FILE__ value that contains backslashes not slashes, and this code
doesn't know how to shorten those.  So maybe instead of lobotomizing
this test, we should fix that.

    slash = strrchr(filename, '/');
+   if (!slash)
+       slash = strrchr(filename, '\\');
    if (slash)
        filename = slash + 1;

regards, tom lane




Re: Refactoring HMAC in the core code

2021-04-02 Thread Bruce Momjian
On Fri, Apr  2, 2021 at 07:04:18PM +0900, Michael Paquier wrote:
> On Mon, Feb 15, 2021 at 08:25:27PM +0900, Michael Paquier wrote:
> > Again a new rebase, giving v5:
> > - Fixed the APIs to return -1 if the caller gives NULL in input, to be
> > consistent with cryptohash.
> > - Added a length argument to pg_hmac_final(), with sanity checks.
> 
> So, this patch has been around for a couple of weeks now, and I would
> like to get this part done in 14 to close the loop with the parts of
> the code that had better rely on what the crypto libs provide.  The main
> advantage of this change is for SCRAM, which can then use the crypto
> library's HMAC implementation instead of its own whenever possible.
> 
> Any objections?

Works for me.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: policies with security definer option for allowing inline optimization

2021-04-02 Thread Joe Conway

On 4/2/21 9:57 AM, Isaac Morland wrote:
Views already run security definer, allowing them to be used for some of the 
same information-hiding purposes as RLS. But I just found something strange: 
current_user/_role returns the user's role, not the view owner's role:



postgres=# set role to t1;
SET
postgres=> table tt;
ERROR:  permission denied for table tt
postgres=> table tv;
 ?column? | current_user 
----------+--------------
        5 | t1
(1 row)

postgres=>

Note that even though current_user is t1 "inside" the view, it is still able to 
see the contents of table tt. Shouldn't current_user/_role return the view owner 
in this situation? By contrast security definer functions work properly:


That is because while VIEWs are effectively SECURITY DEFINER for table access, 
functions running as part of the view are still SECURITY INVOKER if they were 
defined that way. And "current_user" is essentially just a special grammatical 
interface to a SECURITY INVOKER function:


postgres=# \df+ current_user
List of functions
-[ RECORD 1 ]-------+------------------
Schema              | pg_catalog
Name                | current_user
Result data type    | name
Argument data types |
Type                | func
Volatility          | stable
Parallel            | safe
Owner               | postgres
Security            | invoker
Access privileges   |
Language            | internal
Source code         | current_user
Description         | current user name

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: Process initialization labyrinth

2021-04-02 Thread Mike Palmiotto
On Thu, Apr 1, 2021 at 8:22 PM Andres Freund  wrote:
> 
> I have now worked around this by generous application of ugly, but I
> think we really need to do something about this mazy mess. There's just
> an insane amount of duplication, and it's too complicated to remember
> more than a few minutes.
>
> I really would like to not see things like
>
> /*
>  * Create a per-backend PGPROC struct in shared memory, except in the
>  * EXEC_BACKEND case where this was done in SubPostmasterMain. We must do
>  * this before we can use LWLocks (and in the EXEC_BACKEND case we already
>  * had to do some stuff with LWLocks).
>  */
> #ifdef EXEC_BACKEND
> if (!IsUnderPostmaster)
> InitProcess();
> #else
> InitProcess();
> #endif
>
> Similarly, codeflow like bootstrap.c being involved in bog standard
> stuff like starting up wal writer etc, is just pointlessly
> confusing. Note that bootstrap itself does *not* go through
> AuxiliaryProcessMain(), and thus has yet another set of initialization
> needs.

I can't really speak to the initial points directly relating to
pgstat/shmem, but for the overall maze-like nature of the startup
code: is there any chance the startup centralization patchset would be
of any help here?
https://www.postgresql.org/message-id/flat/CAMN686FE0OdZKp9YPO=htC6LnA6aW4r-+jq=3q5raofqgw8...@mail.gmail.com

I know you are at least vaguely aware of the efforts there, as you
reviewed the patchset. Figured I should at least bring it up in case
it seemed helpful, or an effort you'd like to re-invigorate.

Thanks,

--
Mike Palmiotto
https://crunchydata.com




Re: documentation fix for SET ROLE

2021-04-02 Thread Laurenz Albe
On Mon, 2021-03-15 at 17:09 +, Bossart, Nathan wrote:
> On 3/15/21, 7:06 AM, "Laurenz Albe"  wrote:
> > On Fri, 2021-03-12 at 21:41 +, Bossart, Nathan wrote:
> > > On 3/12/21, 11:14 AM, "Joe Conway"  wrote:
> > > > Looking back at the commit history it seems to me that this only works
> > > > accidentally. Perhaps it would be best to fix RESET ROLE and be done 
> > > > with it.
> > > 
> > > That seems reasonable to me.
> > 
> > +1 from me too.
> 
> Here's my latest attempt.  I think it's important to state that it
> sets the role to the current session user identifier unless there is a
> connection-time setting.  If there is no connection-time setting, it
> will reset the role to the current session user, which might be
> different if you've run SET SESSION AUTHORIZATION.
> 
> diff --git a/doc/src/sgml/ref/set_role.sgml b/doc/src/sgml/ref/set_role.sgml
> index 739f2c5cdf..f02babf3af 100644
> --- a/doc/src/sgml/ref/set_role.sgml
> +++ b/doc/src/sgml/ref/set_role.sgml
> @@ -53,9 +53,16 @@ RESET ROLE
>
> 
>
> -   The NONE and RESET forms reset the current
> -   user identifier to be the current session user identifier.
> -   These forms can be executed by any user.
> +   SET ROLE NONE sets the current user identifier to the
> +   current session user identifier, as returned by
> +   session_user.  RESET ROLE sets the
> +   current user identifier to the connection-time setting specified by the
> +   command-line options,
> +   ALTER ROLE, or
> +   ALTER DATABASE,
> +   if any such settings exist.  Otherwise, RESET ROLE sets
> +   the current user identifier to the current session user identifier.  These
> +   forms can be executed by any user.
>
>   

Actually, SET ROLE NONE is defined by the SQL standard:

  18.3 <set role statement>

  [...]

  If NONE is specified, then
  Case:
  i) If there is no current user identifier, then an exception condition is 
raised:
 invalid role specification.
  ii) Otherwise, the current role name is removed.

This is reflected in a comment in src/backend/commands/variable.c:

  /*
   * SET ROLE
   *
   * The SQL spec requires "SET ROLE NONE" to unset the role, so we hardwire
   * a translation of "none" to InvalidOid.  Otherwise this is much like
   * SET SESSION AUTHORIZATION.
   */

On the other hand, RESET (according to src/backend/utils/misc/README)
does something different:

  Prior values of configuration variables must be remembered in order to deal
  with several special cases: RESET (a/k/a SET TO DEFAULT)

So I think it is intentional that RESET ROLE does something different from
SET ROLE NONE, and we should not change that.
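
To illustrate the difference (a hypothetical session with invented role
names, assuming "ALTER ROLE alice SET ROLE bob" was run beforehand and
that we connect as alice):

SELECT current_user;   -- bob: the connection-time setting applies
SET ROLE NONE;
SELECT current_user;   -- alice: the role is removed, per the SQL standard
RESET ROLE;
SELECT current_user;   -- bob again: back to the connection-time setting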

So I think that documenting this is the way to go.  I'll mark it as
"ready for committer".

Yours,
Laurenz Albe





Re: policies with security definer option for allowing inline optimization

2021-04-02 Thread Stephen Frost
Greetings,

* Joe Conway (m...@joeconway.com) wrote:
> On 4/2/21 9:57 AM, Isaac Morland wrote:
> >Views already run security definer, allowing them to be used for some of
> >the same information-hiding purposes as RLS. But I just found something
> >strange: current_user/_role returns the user's role, not the view owner's
> >role:
> 
> >postgres=# set role to t1;
> >SET
> >postgres=> table tt;
> >ERROR:  permission denied for table tt
> >postgres=> table tv;
> >  ?column? | current_user
> >--+--
> >         5 | t1
> >(1 row)
> >
> >postgres=>
> >
> >Note that even though current_user is t1 "inside" the view, it is still
> >able to see the contents of table tt. Shouldn't current_user/_role return
> >the view owner in this situation? By contrast security definer functions
> >work properly:
> 
> That is because while VIEWs are effectively SECURITY DEFINER for table
> access, functions running as part of the view are still SECURITY INVOKER if
> they were defined that way. And "current_user" is essentially just a special
> grammatical interface to a SECURITY INVOKER function:

Right- and what I was really getting at is that it'd sometimes be nice
to have the view run as 'security invoker' for table access.  In
general, it seems like it'd be useful to be able to control each piece
and define if it's to be security invoker or security definer.  We're
able to do that for functions, but not other parts of the system.
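
A purely hypothetical sketch of what that could look like for views (this
option does not exist; it is only to illustrate the idea):

-- hypothetical: table access through tv would be checked as the
-- querying user rather than as the view owner
CREATE VIEW tv WITH (security_invoker = true) AS SELECT * FROM tt;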

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: A reloption for partitioned tables - parallel_workers

2021-04-02 Thread Laurenz Albe
On Wed, 2021-03-24 at 14:14 +1300, David Rowley wrote:
> On Fri, 19 Mar 2021 at 02:07, Amit Langote  wrote:
> > Attached a new version rebased over c8f78b616, with the grouping
> > relation partitioning enhancements as a separate patch 0001.  Sorry
> > about the delay.
> 
> I had a quick look at this and wondered if the partitioned table's
> parallel workers shouldn't be limited to the sum of the parallel
> workers of the Append's subpaths?
> 
> It seems a bit weird to me that the following case requests 4 workers:
> 
> # create table lp (a int) partition by list(a);
> # create table lp1 partition of lp for values in(1);
> # insert into lp select 1 from generate_series(1,10000000) x;
> # alter table lp1 set (parallel_workers = 2);
> # alter table lp set (parallel_workers = 4);
> # set max_parallel_workers_per_gather = 8;
> # explain select count(*) from lp;
> QUERY PLAN
> ---
>  Finalize Aggregate  (cost=97331.63..97331.64 rows=1 width=8)
>->  Gather  (cost=97331.21..97331.62 rows=4 width=8)
>  Workers Planned: 4
>  ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8)
>->  Parallel Seq Scan on lp1 lp  (cost=0.00..85914.57
> rows=4166657 width=0)
> (5 rows)
> 
> I can see a good argument that there should only be 2 workers here.

Good point, I agree.

> If someone sets the partitioned table's parallel_workers high so that
> they get a large number of workers when no partitions are pruned
> during planning, do they really want the same number of workers in
> queries where a large number of partitions are pruned?
> 
> This problem gets a bit more complex in generic plans where the
> planner can't prune anything but run-time pruning prunes many
> partitions. I'm not so sure what to do about that, but the problem
> does exist today to a lesser extent with the current method of
> determining the append parallel workers.

Also a good point.  That would require changing the actual number of
parallel workers at execution time, but that is tricky.
If we go with your suggestion above, we'd have to disambiguate if
the number of workers is set because a partition is large enough
to warrant a parallel scan (then it shouldn't be reduced if the executor
prunes partitions) or if it is because of the number of partitions
(then it should be reduced).

Currently, we don't reduce parallelism if the executor prunes
partitions, so this could be seen as an independent problem.

I don't know if Seamus is still working on that; if not, we might
mark it as "returned with feedback".

Perhaps Amit's patch 0001 should go in independently.

I'll mark the patch as "waiting for author".

Yours,
Laurenz Albe





Re: libpq debug log

2021-04-02 Thread Tom Lane
I wrote:
> I bet what is happening on drongo is that the compiler has generated a
> __FILE__ value that contains backslashes not slashes, and this code
> doesn't know how to shorten those.  So maybe instead of lobotomizing
> this test, we should fix that.

Did that, but we'll have to wait a few hours to see if it makes
drongo happy.

regards, tom lane




Re: policies with security definer option for allowing inline optimization

2021-04-02 Thread Joe Conway

On 4/2/21 10:23 AM, Stephen Frost wrote:

Greetings,

* Joe Conway (m...@joeconway.com) wrote:

On 4/2/21 9:57 AM, Isaac Morland wrote:
>Views already run security definer, allowing them to be used for some of
>the same information-hiding purposes as RLS. But I just found something
>strange: current_user/_role returns the user's role, not the view owner's
>role:

>postgres=# set role to t1;
>SET
>postgres=> table tt;
>ERROR:  permission denied for table tt
>postgres=> table tv;
> ?column? | current_user 
>----------+--------------
>        5 | t1
>(1 row)
>
>postgres=>
>
>Note that even though current_user is t1 "inside" the view, it is still
>able to see the contents of table tt. Shouldn't current_user/_role return
>the view owner in this situation? By contrast security definer functions
>work properly:

That is because while VIEWs are effectively SECURITY DEFINER for table
access, functions running as part of the view are still SECURITY INVOKER if
they were defined that way. And "current_user" is essentially just a special
grammatical interface to a SECURITY INVOKER function:


Right- and what I was really getting at is that it'd sometimes be nice
to have the view run as 'security invoker' for table access.  In
general, it seems like it'd be useful to be able to control each piece
and define if it's to be security invoker or security definer.  We're
able to do that for functions, but not other parts of the system.


+1

Agreed -- I have opined similarly in the past

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: Stronger safeguard for archive recovery not to miss data

2021-04-02 Thread Laurenz Albe
On Thu, 2021-04-01 at 17:25 +0900, Fujii Masao wrote:
> Thanks for updating the patch!
> 
> +errhint("Use a backup taken after setting wal_level to higher than minimal "
> +"or recover to the point in time before wal_level becomes minimal even though it causes data loss")));
> 
> ISTM that "or recover to the point in time before wal_level was changed
>   to minimal even though it may cause data loss" sounds better. Thought?

I would reduce it to

"Either use a later backup, or recover to a point in time before \"wal_level\" 
was set to \"minimal\"."

I'd say that we can leave it to the intelligence of the reader to
deduce that recovering to an earlier time means more data loss.

Yours,
Laurenz Albe





Re: libpq debug log

2021-04-02 Thread Alvaro Herrera
On 2021-Apr-02, Tom Lane wrote:

> I wrote:
> > I bet what is happening on drongo is that the compiler has generated a
> > __FILE__ value that contains backslashes not slashes, and this code
> > doesn't know how to shorten those.  So maybe instead of lobotomizing
> > this test, we should fix that.
> 
> Did that, but we'll have to wait a few hours to see if it makes
> drongo happy.

Thanks, I really hope it does!

-- 
Álvaro Herrera   Valdivia, Chile




Re: simplifying foreign key/RI checks

2021-04-02 Thread Zhihong Yu
Hi,

+   skip = !ExecLockTableTuple(erm->relation, &tid, markSlot,
+                              estate->es_snapshot, estate->es_output_cid,
+                              lockmode, erm->waitPolicy, &epq_needed);
+   if (skip)

It seems the variable skip is only used above. The variable is not needed;
the if statement can directly check the return value.

+ * Locks tuple with given TID with given lockmode following given wait

"given" appears three times in the above sentence. Maybe the following is a
bit easier to read:

Locks tuple with the specified TID, lockmode following given wait policy

+ * Checks whether a tuple containing the same unique key as extracted from the
+ * tuple provided in 'slot' exists in 'pk_rel'.

I think 'same' is not needed here since the remaining part of the sentence
has adequately identified the key.

+   if (leaf_pk_rel == NULL)
+       goto done;

It would be better to avoid goto by including the cleanup statements in the
if block and return.

+   if (index_getnext_slot(scan, ForwardScanDirection, outslot))
+       found = true;
+
+   /* Found tuple, try to lock it in key share mode. */
+   if (found)

Since found is only assigned in one place, the two if statements can be
combined into one.

Cheers

On Fri, Apr 2, 2021 at 5:46 AM Amit Langote  wrote:

> On Sat, Mar 20, 2021 at 10:21 PM Amit Langote 
> wrote:
> > Updated patches attached.  Sorry about the delay.
>
> Rebased over the recent DETACH PARTITION CONCURRENTLY work.
> Apparently, ri_ReferencedKeyExists() was using the wrong snapshot.
>
> --
> Amit Langote
> EDB: http://www.enterprisedb.com
>


Re: libpq debug log

2021-04-02 Thread Tom Lane
I wrote:
>> I bet what is happening on drongo is that the compiler has generated a
>> __FILE__ value that contains backslashes not slashes, and this code
>> doesn't know how to shorten those.  So maybe instead of lobotomizing
>> this test, we should fix that.

> Did that, but we'll have to wait a few hours to see if it makes
> drongo happy.

On third thought, maybe we should push your patch too.  Although I think
53aafdb9f is going to fix the platform-specific aspect of this, we are
still going to risk some implementation dependence of the libpq_pipeline
results:

* Every so often, the number of digits in the reported line numbers
will change (999 -> 1001 or the like), due to changes in unrelated
code.

* Occasionally we refactor things so that the "same" error is reported
from a different file.

It's hard to judge whether that will happen often enough to be an
annoying maintenance problem, but there's definitely a hazard.
Not sure if we should proactively lobotomize the test, or wait to
see if we get annoyed.

In any case I'd like to wait till after drongo's next run, so
we can confirm whether or not the backslashes-in-__FILE__ hypothesis
is correct.  If it is, then 53aafdb9f is a good fix on its own
merits, independently of libpq_pipeline.

regards, tom lane




Re: simplifying foreign key/RI checks

2021-04-02 Thread Alvaro Herrera
On 2021-Apr-02, Amit Langote wrote:

> On Sat, Mar 20, 2021 at 10:21 PM Amit Langote  wrote:
> > Updated patches attached.  Sorry about the delay.
> 
> Rebased over the recent DETACH PARTITION CONCURRENTLY work.
> Apparently, ri_ReferencedKeyExists() was using the wrong snapshot.

Hmm, I wonder if that stuff should be using a PartitionDirectory?  (I
didn't actually understand what your code is doing, so please forgive if
this is a silly question.)

-- 
Álvaro Herrera39°49'30"S 73°17'W
"After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was
amazing when I first started using it at 7.2, and I'm continually astounded by
learning new features and techniques made available by the continuing work of
the development team."
Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php




Re: libpq debug log

2021-04-02 Thread Alvaro Herrera
On 2021-Apr-02, Tom Lane wrote:

> On third thought, maybe we should push your patch too.  Although I think
> 53aafdb9f is going to fix the platform-specific aspect of this, we are
> still going to risk some implementation dependence of the libpq_pipeline
> results:
> 
> * Every so often, the number of digits in the reported line numbers
> will change (999 -> 1001 or the like), due to changes in unrelated
> code.

Bah -- of course.

> * Occasionally we refactor things so that the "same" error is reported
> from a different file.

True.  (This was the reason for masking F and R, but I didn't realize
> that it'd have an effect on the message length).

> It's hard to judge whether that will happen often enough to be an
> annoying maintenance problem, but there's definitely a hazard.
> Not sure if we should proactively lobotomize the test, or wait to
> see if we get annoyed.

I suspect we're going to see enough bf failures if we don't suppress it,
because the problem is going to only show up with TAP testing enabled
and the src/test/modules tests, which perhaps not all committers run.

> In any case I'd like to wait till after drongo's next run, so
> we can confirm whether or not the backslashes-in-__FILE__ hypothesis
> is correct.  If it is, then 53aafdb9f is a good fix on its own
> merits, independently of libpq_pipeline.

Wilco.

-- 
Álvaro Herrera   Valdivia, Chile
Essentially, you're proposing Kevlar shoes as a solution for the problem
that you want to walk around carrying a loaded gun aimed at your foot.
(Tom Lane)




Re: Fix pg_checksums progress report

2021-04-02 Thread Fujii Masao




On 2021/04/02 18:19, shinya11.k...@nttdata.com wrote:

Thanks for your review!
I updated the patch, and attached it.


Thanks for updating the patch! Pushed.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Asynchronous and "direct" IO support for PostgreSQL.

2021-04-02 Thread Dmitry Dolgov
Sorry for another late reply; I finally found some time to formulate a
couple of thoughts.

> On Thu, Feb 25, 2021 at 09:22:43AM +0100, Dmitry Dolgov wrote:
> > On Wed, Feb 24, 2021 at 01:45:10PM -0800, Andres Freund wrote:
> >
> > > I'm curious if it makes sense
> > > to explore possibility to have these sort of "backpressure", e.g. if
> > > number of inflight requests is too large calculate inflight_limit a bit
> > > lower than possible (to avoid hard performance deterioration when the db
> > > is trying to do too much IO, and rather do it smooth).
> >
> > What I do think is needed and feasible (there's a bunch of TODOs in the
> > code about it already) is to be better at only utilizing deeper queues
> > when lower queues don't suffice. So we e.g. don't read ahead more than a
> > few blocks for a scan where the query is spending most of the time
> > "elsewhere".
> >
> > There's definitely also some need for a bit better global, instead of
> > per-backend, control over the number of IOs in flight. That's not too
> > hard to implement - the hardest probably is to avoid it becoming a
> > scalability issue.
> >
> > I think the area with the most need for improvement is figuring out how
> > we determine the queue depths for different things using IO. Don't
> > really want to end up with 30 parameters influencing what queue depth to
> > use for (vacuum, index builds, sequential scans, index scans, bitmap
> > heap scans, ...) - but they benefit from a deeper queue will differ
> > between places.

Talking about parameters: from what I understand, the actual number of queues
(e.g. io_uring) created is specified by PGAIO_NUM_CONTEXTS - shouldn't it be
configurable? Maybe in fact there need not be that many knobs after all - if
the model assumes the storage has:

* Some number of hardware queues, then the number of queues the AIO
  implementation needs to use depends on it. For example, lowering the number
  of contexts between different benchmark runs, I could see that some of the
  hardware queues were significantly underutilized. Potentially there could
  also be such a thing as too many contexts.

* Certain bandwidth, then the submit batch size (io_max_concurrency or
  PGAIO_SUBMIT_BATCH_SIZE) depends on it. This will allow distinguishing
  attached storage with high bandwidth and high latency from local storage.

From what I see, max_aio_in_flight is used as a queue depth for contexts, which
is workload dependent and not easy to figure out, as you mentioned. To avoid
having 30 different parameters, maybe it's more feasible to introduce "shallow"
and "deep" queues, where the particular depths could be derived from the depth
of the hardware queues. The question of which activity should use which queue
is not easy, but if I get it right from queuing theory (assuming IO producers
are stationary processes and fixed IO latency from the storage), it depends on
the IO arrival distribution in each particular case, and this in turn could be
roughly estimated for each type of activity. One can expect different IO
arrival distributions for, e.g., a normal point-query backend and a checkpoint
or vacuum process, no matter what the other conditions are (collecting those
for a few benchmark runs indeed gives pretty distinct distributions).
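
As a strawman for the shallow/deep split, both depths could be derived from a
single measured device property; a minimal sketch (all names here are
hypothetical, not from the patch set):

typedef enum pgaio_depth_class
{
	PGAIO_DEPTH_SHALLOW,		/* point queries, ordinary index scans */
	PGAIO_DEPTH_DEEP			/* vacuum, checkpointer, sequential scans */
} pgaio_depth_class;

static int
pgaio_queue_depth(pgaio_depth_class dclass, int hw_queue_depth)
{
	/* deep consumers may fill the hardware queue; shallow ones get a slice */
	int			depth = (dclass == PGAIO_DEPTH_DEEP) ?
		hw_queue_depth : hw_queue_depth / 8;

	return (depth < 1) ? 1 : depth;
}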

If I understand correctly, the contexts defined by PGAIO_NUM_CONTEXTS are the
main workhorse, right? I'm asking because there is also something called
local_ring, but it seems no IOs are submitted into those. Assuming that
contexts are the main way of submitting IO, it would also be interesting to
explore contexts isolated for different purposes. I haven't finished my
changes here enough to give any results, but at least some tests with fio show
different latencies when two io_urings are processing mixed reads/writes vs
isolated reads or writes. On a side note, at the end of the day there are so
many queues - application queue, io_uring, mq software queue, hardware queue -
that I'm really curious whether they would amplify tail latencies.

Another thing I've noticed is that the AIO implementation is much more
significantly affected by side IO activity than the synchronous one. E.g. the
AIO version's tps drops from tens of thousands to a couple of hundred just
because some kworker started to flush dirty buffers (especially with writeback
throttling disabled), while the synchronous version doesn't suffer that much.
Not sure what to make of it. Btw, overall I've managed to get better numbers
from the AIO implementation on IO-bound test cases with a local NVMe device,
but non-IO-bound ones were mostly a bit slower - is that expected, or am I
missing something?

Interesting thing to note is that io_uring implementation apparently relaxed
requirements for polling operations, now one needs to have only CAP_SYS_NICE
capability, not CAP_SYS_ADMIN. I guess theoretically there are no issues using
it within the current design?
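
For anyone who wants to experiment with that: a minimal liburing sketch of
creating a polled ring (assumed to work unprivileged with CAP_SYS_NICE on
sufficiently new kernels, per the relaxation mentioned above):

#include <liburing.h>

static int
make_sqpoll_ring(struct io_uring *ring)
{
	struct io_uring_params p = {0};

	p.flags = IORING_SETUP_SQPOLL;
	p.sq_thread_idle = 1000;	/* ms before the kernel poller goes to sleep */

	/* 64 entries; returns -EPERM if the needed capability is missing */
	return io_uring_queue_init_params(64, ring, &p);
}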




SP-GiST confusion: indexed column's type vs. index column type

2021-04-02 Thread Tom Lane
While reviewing Pavel Borisov's patch to enable INCLUDE columns in
SP-GiST, I found some things that seem like pre-existing bugs.
These only accidentally fail to cause any problems in the existing
SP-GiST opclasses:

1. The attType passed to an opclass's config method is documented as

Oid attType;/* Data type to be indexed */

Now, I would read that as meaning the type of the underlying heap
column; the documentation and code about when a "compress" method
is required certainly seem to think so too.  What is actually being
passed, though, is the data type of the index column, that is the
"opckeytype" of the index opclass.  We've failed to notice this because
(1) for most of the core SP-GiST opclasses, the two types are the same,
and (2) none of the core opclasses bother to examine attType anyway.

2. When performing an index-only scan on an SP-GiST index, what we
pass back as the tuple descriptor of the output tuples is just the
index relation's own tupdesc, cf spgbeginscan:

/* Set up indexTupDesc and xs_hitupdesc in case it's an index-only scan */
so->indexTupDesc = scan->xs_hitupdesc = RelationGetDescr(rel);

Again, what this is going to report is the opckeytype, not the
reconstructed heap column datatype.  That's just flat out wrong.
We've failed to notice because the only core opclass for which
those types are different is poly_ops, which does not support
reconstructing the polygons for index-only scan.

We need to do something about this because the INCLUDE patch needs
the relation descriptor of an SP-GiST index to match the reality
of what is stored in the leaf tuples.  Right now, as far as I can tell,
there isn't really any necessary connection between the atttype
claimed by the relation descriptor and the leaf type that's physically
stored.  They're accidentally the same in all existing opclasses,
but only accidentally.

I propose changing things so that

(A) attType really is the input (heap) data type.

(B) We enforce that leafType agrees with the opclass opckeytype,
ensuring the index tupdesc can be used for leaf tuples.

(C) The tupdesc passed back for an index-only scan reports the
input (heap) data type.

This might be too much change for the back branches.  Given the
lack of complaints to date, I think we can just fix it in HEAD.

regards, tom lane




Re: SP-GiST confusion: indexed column's type vs. index column type

2021-04-02 Thread Peter Geoghegan
On Fri, Apr 2, 2021 at 9:37 AM Tom Lane  wrote:
> I propose changing things so that
>
> (A) attType really is the input (heap) data type.
>
> (B) We enforce that leafType agrees with the opclass opckeytype,
> ensuring the index tupdesc can be used for leaf tuples.
>
> (C) The tupdesc passed back for an index-only scan reports the
> input (heap) data type.
>
> This might be too much change for the back branches.  Given the
> lack of complaints to date, I think we can just fix it in HEAD.

+1 to fixing it on HEAD only.

-- 
Peter Geoghegan




Re: Process initialization labyrinth

2021-04-02 Thread Andres Freund
Hi,

On 2021-04-02 10:18:20 -0400, Mike Palmiotto wrote:
> I can't really speak to the initial points directly relating to
> pgstat/shmem, but for the overall maze-like nature of the startup
> code: is there any chance the startup centralization patchset would be
> of any help here?
> https://www.postgresql.org/message-id/flat/CAMN686FE0OdZKp9YPO=htC6LnA6aW4r-+jq=3q5raofqgw8...@mail.gmail.com

I think parts of it could help, at least. It doesn't really do much
about centralizing / de-mazing the actual initialization of sub-systems,
but it'd make it a bit easier to do so.

Greetings,

Andres Freund




Re: Force lookahead in COPY FROM parsing

2021-04-02 Thread John Naylor
On Thu, Apr 1, 2021 at 4:47 PM Heikki Linnakangas  wrote:
> Ok, I wouldn't expect to see much difference in that test, it gets
> drowned in all the other parsing overhead. I tested this now with this:
>
> copy (select repeat('x', 1) from generate_series(1, 10)) to
> '/tmp/copydata-x.txt'
> create table blackhole_tab (a text);
>
> copy blackhole_tab from '/tmp/copydata-x.txt' where false;
>
> I took the min of 5 runs of that COPY FROM statement:
>
> master:
> 4107 ms
>
> v3-0001-Simplify-COPY-FROM-parsing-by-forcing-lookahead.patch:
> 3172 ms
>
> I was actually surprised it was so effective on that test, I expected a
> small but noticeable gain. But I'll take it.

Nice! With this test on my laptop I only get 7-8% increase, but that's much
better than what I saw before.

I have nothing further, so it's RFC. The patch is pretty simple compared to
the earlier ones, but is it worth running the fuzzer again as added insurance?

As an aside, I noticed the URL near the top of copyfromparse.c that
explains a detail of macros has moved from

http://www.cit.gu.edu.au/~anthony/info/C/C.macros

to

https://antofthy.gitlab.io/info/C/C_macros.txt

--
John Naylor
EDB: http://www.enterprisedb.com


Re: shared-memory based stats collector

2021-04-02 Thread Andres Freund
Hi,

On 2021-04-02 15:34:54 +0900, Kyotaro Horiguchi wrote:
> At Thu, 1 Apr 2021 19:44:25 -0700, Andres Freund  wrote 
> in 
> > Hi,
> > 
> > I spent quite a bit more time working on the patch. There's are large
> > changes:
> > 
> > - postmaster/pgstats.c (which is an incorrect location now that it's not
> >   a subprocess anymore) is split into utils/activity/pgstat.c and
> >   utils/activity/pgstat_kind.c. I don't love the _kind name, but I
> >   couldn't come up with anything better.
> 
> The place was not changed in order to keep the footprint smaller.  I agree
> that the old place is not appropriate.  pgstat_kind...  How about changing
> pgstat.c to pgstat_core.c and pgstat_kind.c to pgstat.c?

I don't really like that split over what I chose.


> > - A stat's entry's lwlock, refcount, .. are moved into the dshash
> >   entry. There is no need for them to be separate anymore. Also allows
> >   to avoid doing some dsa lookups while holding dshash locks.
> >
> > - The dshash entries are not deleted until the refcount has reached
> >   0. That's an important building block to avoid constantly re-creating
> >   stats when flushing pending stats for a dropped object.
> 
> Does that mean the entries for a dropped object are actually dropped by
> the backend that has flushed stats of the dropped object, at exit?
> Sounds nice.

It's marked as dropped after the commit of the transaction that dropped
the object. The memory is freed when, subsequently, the last
reference goes away.

> > I know of one nontrivial issue that can lead to dropped stats being
> > revived:
> > 
> > Within a transaction, a functions can be called even when another
> > transaction that dropped that function has already committed. I added a
> > spec test reproducing the issue:
> > 
> > # FIXME: this shows the bug that stats will be revived, because the
> > # shared stats in s2 is only referenced *after* the DROP FUNCTION
> > # committed. That's only possible because there is no locking (and
> > # thus no stats invalidation) around function calls.
> > permutation
> >   "s1_track_funcs_all" "s2_track_funcs_none"
> >   "s1_func_call" "s2_begin" "s2_func_call" "s1_func_drop" 
> > "s2_track_funcs_all" "s2_func_call" "s2_commit" "s2_ff" "s1_func_stats" 
> > "s2_func_stats"
> > 
> > I think the best fix here would be to associate an xid with the dropped
> > stats object, and only delete the dshash entry once there's no alive
> > transaction with a horizon from before that xid...
> 
> I'm not sure how we do that avoiding a full scan on dshash..

I think it's quite possible. Have a linked list of "to be dropped stats"
or such. A bit annoying because of needing to deal with dsa pointers,
but not too hard either.
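
Roughly, each pending drop could live in a small DSA-allocated node along
these lines (hypothetical layout, not from the patch):

typedef struct PgStatPendingDrop
{
	dsa_pointer next;			/* next entry in the pending-drop list */
	TransactionId drop_xid;		/* xid of the transaction that dropped it */
	dsa_pointer entry;			/* the shared stats entry awaiting removal */
} PgStatPendingDrop;

A periodic sweep could then free every node whose drop_xid precedes the
oldest active xmin, since no remaining transaction can still revive it.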


> > > > - the replication slot stuff isn't quite right in my branch
> > >
> > > Ah, yeah. As I mentioned above I think it should be in the unified
> > > stats and should have a special means of shortcut.  And the global
> > > stats also should be the same.
> > 
> > The problem is that I use indexes for addressing, but they can
> > change between restarts. I think we can fix that fairly easily, by
> > mapping names to indices once, pgstat_restore_stats().  At the point we
> > call pgstat_restore_stats() StartupReplicationSlots() already was
> > executed, so we can just inquire at that point...
> 
> Does that mean the saved replslot stats are keyed by their names?

I was thinking we'd key them by name only at startup, where their
indices are not known.

Greetings,

Andres Freund




Re: Parallel Full Hash Join

2021-04-02 Thread Melanie Plageman
On Fri, Mar 5, 2021 at 8:31 PM Thomas Munro  wrote:
>
> On Tue, Mar 2, 2021 at 11:27 PM Thomas Munro  wrote:
> > On Fri, Feb 12, 2021 at 11:02 AM Melanie Plageman
> >  wrote:
> > > I just attached the diff.
> >
> > Squashed into one patch for the cfbot to chew on, with a few minor
> > adjustments to a few comments.
>
> I did some more minor tidying of comments and naming.  It's been on my
> to-do-list to update some phase names after commit 3048898e, and while
> doing that I couldn't resist the opportunity to change DONE to FREE,
> which somehow hurts my brain less, and makes much more obvious sense
> after the bugfix in CF #3031 that splits DONE into two separate
> phases.  It also pairs obviously with ALLOCATE.  I include a copy of
> that bugix here too as 0001, because I'll likely commit that first, so
> I rebased the stack of patches that way.  0002 includes the renaming I
> propose (master only).  Then 0003 is Melanie's patch, using the name
> SCAN for the new match bit scan phase.  I've attached an updated
> version of my "phase diagram" finger painting, to show how it looks
> with these three patches.  "scan*" is new.

Feedback on
v6-0002-Improve-the-naming-of-Parallel-Hash-Join-phases.patch

I like renaming DONE to FREE and ALLOCATE to REALLOCATE in the grow
barriers. FREE only makes sense for the Build barrier if you keep the
added PHJ_BUILD_RUN phase, though; I assume you would change this patch
if you decided not to add the new build barrier phase.

I like the addition of the asterisks to indicate a phase is executed by
a single arbitrary process. I was thinking, shall we add one of these to
HJ_FILL_INNER since it is only done by one process in parallel right and
full hash join? Maybe that's confusing because serial hash join uses
that state machine too, though. Maybe **? Maybe we should invent a
complicated symbolic language :)

One tiny, random, unimportant thing: The function prototype for
ExecParallelHashJoinPartitionOuter() calls its parameter "node" and, in
the definition, it is called "hjstate". This feels like a good patch to
throw in that tiny random change to make the name the same.

static void ExecParallelHashJoinPartitionOuter(HashJoinState *node);

static void
ExecParallelHashJoinPartitionOuter(HashJoinState *hjstate)




Re: Process initialization labyrinth

2021-04-02 Thread Mike Palmiotto
On Fri, Apr 2, 2021 at 1:02 PM Andres Freund  wrote:
>
> Hi,
>
> On 2021-04-02 10:18:20 -0400, Mike Palmiotto wrote:
> > I can't really speak to the initial points directly relating to
> > pgstat/shmem, but for the overall maze-like nature of the startup
> > code: is there any chance the startup centralization patchset would be
> > of any help here?
> > https://www.postgresql.org/message-id/flat/CAMN686FE0OdZKp9YPO=htC6LnA6aW4r-+jq=3q5raofqgw8...@mail.gmail.com
>
> I think parts of it could help, at least. It doesn't really do much
> about centralizing / de-mazing the actual initialization of sub-systems,
> but it'd make it a bit easier to do so.

The patchset needs some work, no doubt. If you think it'd be useful, I
can spend some of my free time addressing any gaps in the design. I'd
hate to see that code go to waste, as I think it may have been a
reasonable first step.

Also not opposed to you taking the patchset and running with it if you prefer.

Thanks,
-- 
Mike Palmiotto
https://crunchydata.com




Re: documentation fix for SET ROLE

2021-04-02 Thread Joe Conway

On 4/2/21 10:21 AM, Laurenz Albe wrote:

On Mon, 2021-03-15 at 17:09 +, Bossart, Nathan wrote:

On 3/15/21, 7:06 AM, "Laurenz Albe"  wrote:
> On Fri, 2021-03-12 at 21:41 +, Bossart, Nathan wrote:
> > On 3/12/21, 11:14 AM, "Joe Conway"  wrote:
> > > Looking back at the commit history it seems to me that this only works
> > > accidentally. Perhaps it would be best to fix RESET ROLE and be done with 
it.
> > 
> > That seems reasonable to me.
> 
> +1 from me too.


Here's my latest attempt.  I think it's important to state that it
sets the role to the current session user identifier unless there is a
connection-time setting.  If there is no connection-time setting, it
will reset the role to the current session user, which might be
different if you've run SET SESSION AUTHORIZATION.

diff --git a/doc/src/sgml/ref/set_role.sgml b/doc/src/sgml/ref/set_role.sgml
index 739f2c5cdf..f02babf3af 100644
--- a/doc/src/sgml/ref/set_role.sgml
+++ b/doc/src/sgml/ref/set_role.sgml
@@ -53,9 +53,16 @@ RESET ROLE
   

   
-   The NONE and RESET forms reset the current
-   user identifier to be the current session user identifier.
-   These forms can be executed by any user.
+   SET ROLE NONE sets the current user identifier to the
+   current session user identifier, as returned by
+   session_user.  RESET ROLE sets the
+   current user identifier to the connection-time setting specified by the
+   command-line options,
+   ALTER ROLE, or
+   ALTER DATABASE,
+   if any such settings exist.  Otherwise, RESET ROLE sets
+   the current user identifier to the current session user identifier.  These
+   forms can be executed by any user.
   
  


Actually, SET ROLE NONE is defined by the SQL standard:

   18.3 <set role statement>

   [...]

   If NONE is specified, then
   Case:
   i) If there is no current user identifier, then an exception condition is 
raised:
  invalid role specification.
   ii) Otherwise, the current role name is removed.

This is reflected in a comment in src/backend/commands/variable.c:

   /*
* SET ROLE
*
* The SQL spec requires "SET ROLE NONE" to unset the role, so we hardwire
* a translation of "none" to InvalidOid.  Otherwise this is much like
* SET SESSION AUTHORIZATION.
*/

On the other hand, RESET (according to src/backend/utils/misc/README)
does something different:

   Prior values of configuration variables must be remembered in order to deal
   with several special cases: RESET (a/k/a SET TO DEFAULT)

So I think it is intentional that RESET ROLE does something else than
SET ROLE NONE, and we should not change that.

So I think that documenting this is the way to go.  I'll mark it as
"ready for committer".


pushed

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: Rename of triggers for partitioned tables

2021-04-02 Thread Arne Roland
Hi,


Attached is a patch with some cleanup and a couple of test cases. The lookup is 
now done with the parentid, and the renaming affects detached partitions as well.


The test cases are not perfectly concise, but they only add about 0.4 % to the 
total runtime of the regression tests on my machine, so I didn't bother too 
much. They only consist of basic SQL tests.


Further feedback appreciated! Thank you!


Regards

Arne



From: Arne Roland 
Sent: Thursday, April 1, 2021 4:38:59 PM
To: Alvaro Herrera
Cc: David Steele; Pg Hackers
Subject: Re: Rename of triggers for partitioned tables

Alvaro Herrera wrote:
> I think the child cannot not have a corresponding trigger.  If you try
> to drop the trigger on child, it should say that it cannot be dropped
> because the trigger on parent requires it.  So I don't think there's any
> case where altering name of trigger in parent would raise an user-facing
> error.  If you can't find the trigger in child, that's a case of catalog
> corruption [...]

Ah, I didn't know that. That changes my reasoning about it. I only knew that 
alter table ... detach partition ... doesn't detach the child trigger from 
the partitioned trigger, but attach partition seems to add one. Maybe we 
should be able to make sure that there is a one-to-one correspondence between 
child triggers and child partitions.

Currently, upon detaching, we keep the trigger within the partitioned trigger of 
the parent relation. (So the trigger remains in place and can only be dropped 
with the parent one.) Once we try to attach the partition again, any partitioned 
triggers are simply removed.

One idea would be to detach partitioned triggers there in a similar manner to 
how we detach partitioned indexes. That could give the above guarantee. (At 
least if one were willing to write a migration for that.) But then we couldn't 
tell anymore where the trigger stems from, which would hence break our 
attach/detach workflow.

I was annoyed by the inability to drop partitioned triggers from detached 
partitions, but it seems I just got a workaround. Ugh.

This is a bit of a sidetrack discussion, but it is related.

> Consider this.  You have table p and partition p1.  Add trigger t to p,
> and do
> ALTER TRIGGER t ON p1 RENAME TO q;

> What I'm saying is that if you later do
> ALTER TRIGGER t ON ONLY p RENAME TO r;
> then the trigger on parent is changed, and the trigger on child stays q.
> If you later do
> ALTER TRIGGER r ON p RENAME TO s;
> then triggers on both tables end up with name s.

If we get into the mindset of expecting one trigger on the child, we have the 
following edge case:

- The trigger is renamed. You seem to favor the silent rename in that case over 
raising a notice (or even an error). I am not convinced a notice wouldn't be 
better in that case.
- The trigger is outside of the partitioned table. That means the trigger 
might still be in the inheritance tree, but likely isn't. Should we rename it 
anyway, or skip that? Should we raise a notice about what we are doing here? I 
think that would be helpful to the end user.

> Hmm, ok, maybe this is sufficient, I didn't actually test it.  I think
> you do need to add a few test cases to ensure everything is sane.  Make
> sure to verify what happens if you have multi-level partitioning
> (grandparent, parent, child) and you ALTER the one in the middle.  Also
> things like if parent has two children and one child has multiple
> children.

The test I had somewhere upthread already tested that.

I am not sure how to add a test for pg_dump though. Can you point me to the 
location in pg_regress for pg_dump?

> Also, please make sure that it works correctly with INHERITS (legacy
> inheritance).  We probably don't want to cause recursion in that case.

That works, but I will add a test to make sure it stays that way. Thank you for 
your input!

Regards
Arne
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 7383d5994e..96cefef7c6 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -1326,7 +1326,6 @@ get_trigger_oid(Oid relid, const char *trigname, bool missing_ok)
 
 	tgscan = systable_beginscan(tgrel, TriggerRelidNameIndexId, true,
 NULL, 2, skey);
-
 	tup = systable_getnext(tgscan);
 
 	if (!HeapTupleIsValid(tup))
@@ -1348,12 +1347,8 @@ get_trigger_oid(Oid relid, const char *trigname, bool missing_ok)
 	return oid;
 }
 
-/*
- * Perform permissions and integrity checks before acquiring a relation lock.
- */
 static void
-RangeVarCallbackForRenameTrigger(const RangeVar *rv, Oid relid, Oid oldrelid,
- void *arg)
+callbackForRenameTrigger(char *relname, Oid relid)
 {
 	HeapTuple	tuple;
 	Form_pg_class form;
@@ -1370,25 +1365,36 @@ RangeVarCallbackForRenameTrigger(const RangeVar *rv, Oid relid, Oid oldrelid,
 		ereport(ERROR,
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
  errmsg("\"%s\" is not a table, view, or foreign table",
-		r

Re: documentation fix for SET ROLE

2021-04-02 Thread Bossart, Nathan
On 4/2/21, 10:54 AM, "Joe Conway"  wrote:
> On 4/2/21 10:21 AM, Laurenz Albe wrote:
>> So I think that documenting this is the way to go.  I'll mark it as
>> "ready for committer".
>
> pushed

Thanks!

Nathan



Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Bruce Momjian
On Thu, Apr  1, 2021 at 09:46:58PM -0700, Bryn Llewellyn wrote:
> Or am I misunderstanding something?
> 
> Try this. The result of each “select” is shown as the trailing comment on the
> same line. I added whitespace by hand to line up the fields.
> 
> select interval '-1.7 years';                           -- -1 years -8 mons
> 
> select interval '29.4 months';                          --  2 years  5 mons 12 days
> 
> select interval '-1.7 years 29.4 months';               --           8 mons 12 days  << wrong
> select interval '29.4 months -1.7 years';               --           9 mons 12 days
> 
> select interval '-1.7 years' + interval '29.4 months';  --           9 mons 12 days
> select interval '29.4 months' + interval '-1.7 years';  --           9 mons 12 days
> 
> As I reason it, the last four “select” statements are all semantically the
> same. They’re just different syntaxes to add the two intervals that the first
> two “select” statements use separately. There’s one odd man out. And I reason
> this one to be wrong. Is there a flaw in my reasoning?

[Thread moved to hackers.]

Looking at your report, I thought I could easily find the cause, but it
wasn't obvious.  What is happening is that when you cast a float to an
integer in C, it rounds toward zero, meaning that 8.4 rounds to 8 and
-8.4 rounds to -8.  The big problem here is that -8.4 is rounding in a
positive direction, while 8.4 rounds in a negative direction.  See this:

int(int(-8.4) + 29)
21
int(int(29) + -8.4)
20

When you do '-1.7 years' first, it becomes -8, and then adding 29 yields
21.  In the other order, it is 29 - 8.4, which yields 20.6, which
becomes 20.  I honestly had never studied this interaction, though you
would think I would have seen it before.  One interesting issue is that
it only happens when the truncations happen to values with different
signs --- if they are both positive or negative, it is fine.

The best fix, I think, is to use rint()/round() to round to the closest
integer, not toward zero.  The attached patch does this in a few places,
but the code needs more research if we are happy with this approach,
since there are probably other cases.  Using rint() does help to produce
more accurate results, though the regression tests show no change from
this patch.
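
A tiny standalone illustration of both the asymmetry and the rint() behavior
(plain C, independent of the PostgreSQL code):

#include <math.h>
#include <stdio.h>

int
main(void)
{
	/* truncation toward zero: the order of accumulation matters */
	printf("%d\n", (int) ((int) -8.4 + 29));	/* -8 + 29    -> 21 */
	printf("%d\n", (int) (29 + -8.4));			/* (int) 20.6 -> 20 */

	/* rint() rounds to nearest, giving the same result in both orders */
	printf("%.0f\n", rint(-8.4) + 29);			/* -8 + 29    -> 21 */
	printf("%.0f\n", rint(29 + -8.4));			/* rint(20.6) -> 21 */
	return 0;
}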

> Further… there’s a notable asymmetry. The fractional part of “1.7 years” is 
> 8.4
> months. But the fractional part of the months value doesn’t spread further 
> down
> into days. However, the fractional part of “29.4 months” (12 days) _does_
> spread further down into days. What’s the rationale for this asymmetry?

Yes, looking at the code, it seems we only spill down to one unit, not
more.  I think we need to have a discussion if we want to change that. 
I think the idea was that if you specify a non-whole number, you
probably want to spill down one level, but don't want it spilling all
the way to milliseconds, which is certainly possible.
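
For instance, with one level of spill-down: '29.4 months' keeps 29 whole
months and converts the fractional 0.4 month into 0.4 * 30 = 12 days (hence
"2 years 5 mons 12 days" above), while '1.7 years' keeps 1 year and turns the
fractional 0.7 year into 0.7 * 12 = 8.4 months, of which only the whole 8
months survive - the residual 0.4 month is discarded rather than spilled
further into days.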

> I can’t see that my observations here can be explained by the difference
> between calendar time and clock time. Here I’m just working with non-metric
> units like feet and inches. One year is just defined as 12 months. And one
> month is just defined as 30 days. All that stuff about adding a month to
> 3-Feb-2020 taking you to 3-Mar-2020 (same for leap years and non-leap years),
> and that other stuff about adding one day to 23:00 on the day before the
> “spring forward” moment taking you to 23:00 on the next day (i.e. when
> intervals are added to timestamps) is downstream of simply adding two
> intervals.

Ah, seems you have done some research.  ;-)

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 889077f55c..a92e19471d 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3307,28 +3307,28 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 	case DTK_YEAR:
 		tm->tm_year += val;
 		if (fval != 0)
-			tm->tm_mon += fval * MONTHS_PER_YEAR;
+			tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
 		tmask = DTK_M(YEAR);
 		break;
 
 	case DTK_DECADE:
 		tm->tm_year += val * 10;
 		if (fval != 0)
-			tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+			tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
 		tmask = DTK_M(DECADE);
 		break;
 
 	case DTK_CENTURY:
 		tm->tm_year += val * 100;
 		if (fval != 0)
-			tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+			tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
 		tmask = DTK_M(CENTURY);
 		break;
 
 	case DTK_MILLENNIUM:
 		tm->tm_year += val * 1000;
 		if (fval != 0)
-			tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+			tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
 		

Re: invalid data in file backup_label problem on windows

2021-04-02 Thread David Steele

On 4/2/21 3:43 AM, Michael Paquier wrote:

On Thu, Apr 01, 2021 at 09:56:02AM +0900, Michael Paquier wrote:

Okay, that sounds good to me.


Applied and backpatched then as 6fb66c26 & co.


Thank you!

--
-David
da...@pgmasters.net




Re: Proposal: Save user's original authenticated identity for logging

2021-04-02 Thread Jacob Champion
On Fri, 2021-04-02 at 13:45 +0900, Michael Paquier wrote:
> Attached is what I have come up with as the first building piece,
> which is basically a combination of 0001 and 0002, except that I
> modified things so that the number of arguments remains minimal for all
> the routines.  This avoids the manipulation of the list of parameters
> passed down to PostgresNode::psql. The arguments for the optional
> query, the expected stdout and stderr are part of the parameter set
> (0001 was not doing that).

I made a few changes, highlighted in the since-v18 diff:

> + # The result is assumed to match "true", or "t", here.
> + $node->connect_ok($connstr, $test_name, sql => $query,
> +   expected_stdout => qr/t/);

I've anchored this as qr/^t$/ so we don't accidentally match a stray
"t" in some larger string.

> - is($res, 0, $test_name);
> - like($stdoutres, $expected, $test_name);
> - is($stderrres, "", $test_name);
> + my ($stdoutres, $stderrres);
> +
> + $node->connect_ok($connstr, $test_name, $query, $expected);

$query and $expected need to be given as named parameters. We also lost
the stderr check from the previous version of the test, so I added
expected_stderr to connect_ok().

> @@ -446,14 +446,14 @@ TODO:
>   # correct client cert in encrypted PEM with empty password
>   $node->connect_fails(
>   "$common_connstr user=ssltestuser sslcert=ssl/client.crt 
> sslkey=ssl/client-encrypted-pem_tmp.key sslpassword=''",
> - qr!\Qprivate key file "ssl/client-encrypted-pem_tmp.key": 
> processing error\E!,
> + expected_stderr => qr!\Qprivate key file 
> "ssl/client-encrypted-pem_tmp.key": processing error\E!,
>   "certificate authorization fails with correct client cert and 
> empty password in encrypted PEM format"
>   );

These tests don't run yet inside the TODO block, but I've put the
expected_stderr parameter at the end of the list for them.

> For the main patch, this will need to be
> extended with two more parameters in each routine: log_like and
> log_unlike to match for the log patterns, handled as arrays of
> regexes.  That's what 0003 is basically doing already.

Rebased on top of your patch as v19, attached. (v17 disappeared into
the ether somewhere, I think. :D)

Now that it's easy to add log_like to existing tests, I fleshed out the
LDAP tests with a few more cases. They don't add code coverage, but
they pin the desired behavior for a few more types of LDAP auth.

--Jacob
commit d80742fbe2124687591e76dc069d7cab6a2cecc8
Author: Jacob Champion 
Date:   Fri Apr 2 10:31:40 2021 -0700

squash! Plug more TAP test suites with new PostgresNode's new routines

diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl
index 31fdc49b86..2395664145 100644
--- a/src/test/kerberos/t/001_auth.pl
+++ b/src/test/kerberos/t/001_auth.pl
@@ -20,7 +20,7 @@ use Time::HiRes qw(usleep);
 
 if ($ENV{with_gssapi} eq 'yes')
 {
-   plan tests => 30;
+   plan tests => 34;
 }
 else
 {
@@ -192,7 +192,7 @@ sub test_access
{
# The result is assumed to match "true", or "t", here.
$node->connect_ok($connstr, $test_name, sql => $query,
- expected_stdout => qr/t/);
+ expected_stdout => qr/^t$/);
}
else
{
@@ -223,9 +223,9 @@ sub test_query
my $connstr = $node->connstr('postgres') .
" user=$role host=$host hostaddr=$hostaddr $gssencmode";
 
-   my ($stdoutres, $stderrres);
-
-   $node->connect_ok($connstr, $test_name, $query, $expected);
+   $node->connect_ok($connstr, $test_name, sql => $query,
+   expected_stdout => $expected,
+   expected_stderr => qr/^$/);
return;
 }
 
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index ba1407e761..f29c877c79 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -1876,6 +1876,10 @@ instead of the default.
 
 If this regular expression is set, matches it with the output generated.
 
+=item expected_stderr => B
+
+If this regular expression is set, matches it with the stderr generated.
+
 =back
 
 =cut
@@ -1906,7 +1910,11 @@ sub connect_ok
 
if (defined($params{expected_stdout}))
{
-   like($stdout, $params{expected_stdout}, "$test_name: matches");
+   like($stdout, $params{expected_stdout}, "$test_name: stdout matches");
+   }
+   if (defined($params{expected_stderr}))
+   {
+   like($stderr, $params{expected_stderr}, "$test_name: stderr matches");
}
 }
 
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 21beef01aa..6df558fca7 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -446,15 +446,15 @@ TODO:
# correct client cert in encrypted PEM with empty 

Re: Parallel Full Hash Join

2021-04-02 Thread Zhihong Yu
Hi,
For v6-0003-Parallel-Hash-Full-Right-Outer-Join.patch

+* current_chunk_idx: index in current HashMemoryChunk

The above comment seems to be a better fit
for ExecScanHashTableForUnmatched() instead
of ExecParallelPrepHashTableForUnmatched().
I wonder where current_chunk_idx should belong (considering the above
comment and what the code does).

+   while (hashtable->current_chunk_idx < hashtable->current_chunk->used)
...
+   next = hashtable->current_chunk->next.unshared;
+   hashtable->current_chunk = next;
+   hashtable->current_chunk_idx = 0;

Each time we advance to the next chunk, current_chunk_idx is reset. It
seems current_chunk_idx could be placed inside the chunk itself.
Maybe the consideration is that, with the current formation, we save space
by putting the current_chunk_idx field at a higher level.
If that is the case, a comment should be added.

Cheers

On Fri, Mar 5, 2021 at 5:31 PM Thomas Munro  wrote:

> On Tue, Mar 2, 2021 at 11:27 PM Thomas Munro 
> wrote:
> > On Fri, Feb 12, 2021 at 11:02 AM Melanie Plageman
> >  wrote:
> > > I just attached the diff.
> >
> > Squashed into one patch for the cfbot to chew on, with a few minor
> > adjustments to a few comments.
>
> I did some more minor tidying of comments and naming.  It's been on my
> to-do-list to update some phase names after commit 3048898e, and while
> doing that I couldn't resist the opportunity to change DONE to FREE,
> which somehow hurts my brain less, and makes much more obvious sense
> after the bugfix in CF #3031 that splits DONE into two separate
> phases.  It also pairs obviously with ALLOCATE.  I include a copy of
> that bugix here too as 0001, because I'll likely commit that first, so
> I rebased the stack of patches that way.  0002 includes the renaming I
> propose (master only).  Then 0003 is Melanie's patch, using the name
> SCAN for the new match bit scan phase.  I've attached an updated
> version of my "phase diagram" finger painting, to show how it looks
> with these three patches.  "scan*" is new.
>


Making wait events a bit more efficient

2021-04-02 Thread Andres Freund
Hi,

This grew out of my patch to split the wait event code out of
pgstat.[ch], which in turn grew out of the shared memory stats patch
series.


pgstat_report_wait_start() and pgstat_report_wait_end() currently check
pgstat_track_activities before assigning to MyProc->wait_event_info.
Given the small cost of the assignment, and that pgstat_track_activities
is almost always used, I'm doubtful that that's the right tradeoff.

Normally I would say that branch prediction will take care of this cost
- but because pgstat_report_wait_{start,end} are inlined, that has to
happen in each of the calling locations.

The code works out to be something like the following (this is from
basebackup_read_file, the simplest caller I could quickly find, I
removed interspersed code from it):

267 if (!pgstat_track_activities || !proc)
   0x00430e4d <+13>:    cmpb   $0x1,0x4882e1(%rip)        # 0x8b9135 <pgstat_track_activities>

265 volatile PGPROC *proc = MyProc;
   0x00430e54 <+20>:    mov    0x48c52d(%rip),%rax        # 0x8bd388 <MyProc>

266
267 if (!pgstat_track_activities || !proc)
   0x00430e5b <+27>:    jne    0x430e6c <basebackup_read_file+44>
   0x00430e5d <+29>:    test   %rax,%rax
   0x00430e60 <+32>:    je     0x430e6c <basebackup_read_file+44>

268 return;
269
270 /*
271  * Since this is a four-byte field which is always read and written as
272  * four-bytes, updates are atomic.
273  */
274 proc->wait_event_info = wait_event_info;
   0x00430e62 <+34>:    movl   $0xa00,0x2c8(%rax)

/home/andres/src/postgresql/src/backend/replication/basebackup.c:
2014    rc = pg_pread(fd, buf, nbytes, offset);
   0x00430e6c <+44>:    call   0xc4790

stripping the source:
   0x00430e4d <+13>:    cmpb   $0x1,0x4882e1(%rip)        # 0x8b9135 <pgstat_track_activities>
   0x00430e54 <+20>:    mov    0x48c52d(%rip),%rax        # 0x8bd388 <MyProc>
   0x00430e5b <+27>:    jne    0x430e6c <basebackup_read_file+44>
   0x00430e5d <+29>:    test   %rax,%rax
   0x00430e60 <+32>:    je     0x430e6c <basebackup_read_file+44>
   0x00430e62 <+34>:    movl   $0xa00,0x2c8(%rax)
   0x00430e6c <+44>:    call   0xc4790


just removing the pgstat_track_activities check turns that into

   0x00430d8d <+13>:    mov    0x48c5f4(%rip),%rax        # 0x8bd388 <MyProc>
   0x00430d94 <+20>:    test   %rax,%rax
   0x00430d97 <+23>:    je     0x430da3 <basebackup_read_file+35>
   0x00430d99 <+25>:    movl   $0xa00,0x2c8(%rax)
   0x00430da3 <+35>:    call   0xc4790

which does seem (a bit) nicer.

However, we can improve this further, entirely eliminating branches, by
introducing something like "my_wait_event_info" that initially just
points to a local variable and is switched to shared once MyProc is
assigned.

Obviously incorrect, for comparison: Just removing the MyProc != NULL
check yields:
   0x00430bcd <+13>:    mov    0x48c7b4(%rip),%rax        # 0x8bd388 <MyProc>
   0x00430bd4 <+20>:    movl   $0xa00,0x2c8(%rax)
   0x00430bde <+30>:    call   0xc47d0

using a uint32 *my_wait_event_info yields:
   0x00430b4d <+13>:    mov    0x47615c(%rip),%rax        # 0x8a6cb0 <my_wait_event_info>
   0x00430b54 <+20>:    movl   $0xa00,(%rax)
   0x00430b5a <+26>:    call   0xc47d0

Note how the lack of offset addressing in the my_wait_event_info version
makes the instruction smaller (call is at 26 instead of 30).


Now, perhaps all of this isn't worth optimizing, most of the things done
within pgstat_report_wait_start()/end() are expensive-ish. And forward
branches are statically predicted to be not taken on several
platforms. I have seen these instructions show up in profiles in
workloads with contended lwlocks, at least...

There's also a small win in code size:
    text    data     bss      dec     hex  filename
 8932095  192160  204656  9328911  8e590f  src/backend/postgres
 8928544  192160  204656  9325360  8e4b30  src/backend/postgres_my_wait_event_info


If we went for the my_wait_event_info approach there is one further
advantage, after my change to move the wait event code into a separate
file: wait_event.h does not need to include proc.h anymore, which seems
architecturally nice for things like fd.c.


Attached is a patch series doing this.
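
In outline, the scheme is roughly this (simplified; see the patches for the
real code, PostgreSQL-style types assumed):

static uint32 local_my_wait_event_info;
uint32	   *my_wait_event_info = &local_my_wait_event_info;

static inline void
pgstat_report_wait_start(uint32 wait_event_info)
{
	/* branch-free: the pointer always targets valid storage */
	*(volatile uint32 *) my_wait_event_info = wait_event_info;
}

/* called once connected to shared memory, e.g. from InitProcess(), as
 * pgstat_set_wait_event_storage(&MyProc->wait_event_info) */
void
pgstat_set_wait_event_storage(uint32 *wait_event_info)
{
	my_wait_event_info = wait_event_info;
}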


I'm inclined to separately change the comment format for
wait_event.[ch]; there's no reason to stick with the current style:

/* --
 * pgstat_report_wait_start() -
 *
 *  Called from places where server process needs to wait.  This is called
 *  to report wait event information.  The wait information is stored
 *  as 4-bytes where first byte represents the wait event class (type of
 *  wait, for different types of wait, refer WaitClass) and the next
 *  3-bytes represent the actual wait event.  Currently 2-bytes are used
 *  for wait event which is sufficient for current usage, 1-byte is
 *  reserved for future usage.
 *
 * NB: this *must* be able to survive being

Re: Support ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION ... syntax

2021-04-02 Thread Peter Eisentraut

On 23.03.21 16:08, Japin Li wrote:

I check the duplicates for newpublist in merge_publications(). The code is
copied from publicationListToArray().

I do not check for all duplicates because it will make the code more complex.
For example:

ALTER SUBSCRIPTION mysub ADD PUBLICATION mypub2, mypub2, mypub2;

If we record the duplicate publication names in list A, when we find a
duplication in newpublist, we should check whether the publication is
in list A or not, to make the error message make sense (do not have
duplicate publication names in error message).


The code you have in merge_publications() to report all existing 
publications is pretty messy and is not properly internationalized.  I 
think what you are trying to do there is excessive.  Compare this 
similar case:


create table t1 (a int, b int);
alter table t1 add column a int, add column b int;
ERROR:  42701: column "a" of relation "t1" already exists

I think you can make both this and the duplicate checking much simpler 
if you just report the first conflict.
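
Something along these lines would do (a hedged sketch with assumed list
names, not the actual patch):

ListCell   *lc;

foreach(lc, newpublist)
{
	char	   *pubname = strVal(lfirst(lc));

	if (list_member(oldpublist, lfirst(lc)))
		ereport(ERROR,
				(errcode(ERRCODE_DUPLICATE_OBJECT),
				 errmsg("publication \"%s\" is already in subscription",
						pubname)));
}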


I think this patch is about ready to commit, but please provide a final 
version in good time.


(Also, please combine your patches into a single patch.)




Re: CLUSTER on partitioned index

2021-04-02 Thread Justin Pryzby
@cfbot: rebased
From 686cd8fc644f1f86d81d7748b66feddd634c4dc8 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 26 Nov 2020 14:37:08 -0600
Subject: [PATCH v10 1/8] pg_dump: make CLUSTER ON a separate dump object..

..since it needs to be restored after any child indexes are restored *and
attached*.  The order needs to be:

1) restore child and parent index (order doesn't matter);
2) attach child index;
3) set cluster on child and parent index (order doesn't matter);
---
 src/bin/pg_dump/pg_dump.c  | 86 ++
 src/bin/pg_dump/pg_dump.h  |  8 
 src/bin/pg_dump/pg_dump_sort.c |  8 
 3 files changed, 82 insertions(+), 20 deletions(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 25717ce0e6..9a3044fd8c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -215,6 +215,7 @@ static void dumpSequence(Archive *fout, const TableInfo *tbinfo);
 static void dumpSequenceData(Archive *fout, const TableDataInfo *tdinfo);
 static void dumpIndex(Archive *fout, const IndxInfo *indxinfo);
 static void dumpIndexAttach(Archive *fout, const IndexAttachInfo *attachinfo);
+static void dumpIndexClusterOn(Archive *fout, const IndexClusterInfo *clusterinfo);
 static void dumpStatisticsExt(Archive *fout, const StatsExtInfo *statsextinfo);
 static void dumpConstraint(Archive *fout, const ConstraintInfo *coninfo);
 static void dumpTableConstraintComment(Archive *fout, const ConstraintInfo *coninfo);
@@ -7232,6 +7233,11 @@ getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
 i_inddependcollversions;
 	int			ntups;
 
+	int	ncluster = 0;
+	IndexClusterInfo *clusterinfo;
+	clusterinfo = (IndexClusterInfo *)
+		pg_malloc0(numTables * sizeof(IndexClusterInfo));
+
 	for (i = 0; i < numTables; i++)
 	{
 		TableInfo  *tbinfo = &tblinfo[i];
@@ -7611,6 +7617,27 @@ getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
 /* Plain secondary index */
 indxinfo[j].indexconstraint = 0;
 			}
+
+			/* Record each table's CLUSTERed index, if any */
+			if (indxinfo[j].indisclustered)
+			{
+IndxInfo   *index = &indxinfo[j];
+IndexClusterInfo *cluster = &clusterinfo[ncluster];
+
+cluster->dobj.objType = DO_INDEX_CLUSTER_ON;
+cluster->dobj.catId.tableoid = 0;
+cluster->dobj.catId.oid = 0;
+AssignDumpId(&cluster->dobj);
+cluster->dobj.name = pg_strdup(index->dobj.name);
+cluster->dobj.namespace = index->indextable->dobj.namespace;
+cluster->index = index;
+cluster->indextable = &tblinfo[i];
+
+/* The CLUSTER ON depends on its index.. */
+addObjectDependency(&cluster->dobj, index->dobj.dumpId);
+
+ncluster++;
+			}
 		}
 
 		PQclear(res);
@@ -10472,6 +10499,9 @@ dumpDumpableObject(Archive *fout, const DumpableObject *dobj)
 		case DO_SUBSCRIPTION:
 			dumpSubscription(fout, (const SubscriptionInfo *) dobj);
 			break;
+		case DO_INDEX_CLUSTER_ON:
+			dumpIndexClusterOn(fout, (IndexClusterInfo *) dobj);
+			break;
 		case DO_PRE_DATA_BOUNDARY:
 		case DO_POST_DATA_BOUNDARY:
 			/* never dumped, nothing to do */
@@ -16721,6 +16751,41 @@ getAttrName(int attrnum, const TableInfo *tblInfo)
 	return NULL;/* keep compiler quiet */
 }
 
+/*
+ * dumpIndexClusterOn
+ *	  record that the index is clustered.
+ */
+static void
+dumpIndexClusterOn(Archive *fout, const IndexClusterInfo *clusterinfo)
+{
+	DumpOptions *dopt = fout->dopt;
+	TableInfo	*tbinfo = clusterinfo->indextable;
+	char		*qindxname;
+	PQExpBuffer	q;
+
+	if (dopt->dataOnly)
+		return;
+
+	q = createPQExpBuffer();
+	qindxname = pg_strdup(fmtId(clusterinfo->dobj.name));
+
+	/* index name is not qualified in this syntax */
+	appendPQExpBuffer(q, "\nALTER TABLE %s CLUSTER ON %s;\n",
+	  fmtQualifiedDumpable(tbinfo), qindxname);
+
+	if (clusterinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+		ArchiveEntry(fout, clusterinfo->dobj.catId, clusterinfo->dobj.dumpId,
+	 ARCHIVE_OPTS(.tag = clusterinfo->dobj.name,
+  .namespace = tbinfo->dobj.namespace->dobj.name,
+  .owner = tbinfo->rolname,
+  .description = "INDEX CLUSTER ON",
+  .section = SECTION_POST_DATA,
+  .createStmt = q->data));
+
+	destroyPQExpBuffer(q);
+	free(qindxname);
+}
+
 /*
  * dumpIndex
  *	  write out to fout a user-defined index
@@ -16775,16 +16840,6 @@ dumpIndex(Archive *fout, const IndxInfo *indxinfo)
 		 * similar code in dumpConstraint!
 		 */
 
-		/* If the index is clustered, we need to record that. */
-		if (indxinfo->indisclustered)
-		{
-			appendPQExpBuffer(q, "\nALTER TABLE %s CLUSTER",
-			  fmtQualifiedDumpable(tbinfo));
-			/* index name is not qualified in this syntax */
-			appendPQExpBuffer(q, " ON %s;\n",
-			  qindxname);
-		}
-
 		/*
 		 * If the index has any statistics on some of its columns, generate
 		 * the associated ALTER INDEX queries.
@@ -17111,16 +17166,6 @@ dumpConstraint(Archive *fout, const ConstraintInfo *coninfo)
 		 * similar code in dumpIndex!
 		 */
 
-		/* If the index i

Re: Making wait events a bit more efficient

2021-04-02 Thread Zhihong Yu
Hi,

+extern PGDLLIMPORT uint32 *my_wait_event_info;

It seems volatile should be added to the above declaration. Since later:

+   *(volatile uint32 *) my_wait_event_info = wait_event_info;

Cheers

On Fri, Apr 2, 2021 at 12:45 PM Andres Freund  wrote:

> Hi,
>
> This grew out of my patch to split the wait event code out of
> pgstat.[ch], which in turn grew out of the shared memory stats patch
> series.
>
>
> pgstat_report_wait_start() and pgstat_report_wait_end() currently check
> pgstat_track_activities before assigning to MyProc->wait_event_info.
> Given the small cost of the assignment, and that pgstat_track_activities
> is almost always used, I'm doubtful that that's the right tradeoff.
>
> Normally I would say that branch prediction will take care of this cost
> - but because pgstat_report_wait_{start,end} are inlined, that has to
> happen in each of the calling locations.
>
> The code works out to be something like the following (this is from
> basebackup_read_file, the simplest caller I could quickly find, I
> removed interspersed code from it):
>
> 267 if (!pgstat_track_activities || !proc)
>    0x00430e4d <+13>:    cmpb   $0x1,0x4882e1(%rip)    # 0x8b9135 <pgstat_track_activities>
>
> 265 volatile PGPROC *proc = MyProc;
>    0x00430e54 <+20>:    mov    0x48c52d(%rip),%rax    # 0x8bd388 <MyProc>
>
> 266
> 267 if (!pgstat_track_activities || !proc)
>    0x00430e5b <+27>:    jne    0x430e6c <basebackup_read_file+44>
>    0x00430e5d <+29>:    test   %rax,%rax
>    0x00430e60 <+32>:    je     0x430e6c <basebackup_read_file+44>
>
> 268 return;
> 269
> 270 /*
> 271  * Since this is a four-byte field which is always read and written as
> 272  * four-bytes, updates are atomic.
> 273  */
> 274 proc->wait_event_info = wait_event_info;
>    0x00430e62 <+34>:    movl   $0xa00,0x2c8(%rax)
>
> /home/andres/src/postgresql/src/backend/replication/basebackup.c:
> 2014    rc = pg_pread(fd, buf, nbytes, offset);
>    0x00430e6c <+44>:    call   0xc4790
>
> stripping the source:
>    0x00430e4d <+13>:    cmpb   $0x1,0x4882e1(%rip)    # 0x8b9135 <pgstat_track_activities>
>    0x00430e54 <+20>:    mov    0x48c52d(%rip),%rax    # 0x8bd388 <MyProc>
>    0x00430e5b <+27>:    jne    0x430e6c <basebackup_read_file+44>
>    0x00430e5d <+29>:    test   %rax,%rax
>    0x00430e60 <+32>:    je     0x430e6c <basebackup_read_file+44>
>    0x00430e62 <+34>:    movl   $0xa00,0x2c8(%rax)
>    0x00430e6c <+44>:    call   0xc4790
>
>
> just removing the pgstat_track_activities check turns that into
>
>    0x00430d8d <+13>:    mov    0x48c5f4(%rip),%rax    # 0x8bd388 <MyProc>
>    0x00430d94 <+20>:    test   %rax,%rax
>    0x00430d97 <+23>:    je     0x430da3 <basebackup_read_file+35>
>    0x00430d99 <+25>:    movl   $0xa00,0x2c8(%rax)
>    0x00430da3 <+35>:    call   0xc4790
>
> which does seem (a bit) nicer.
>
> However, we can improve this further, entirely eliminating branches, by
> introducing something like "my_wait_event_info" that initially just
> points to a local variable and is switched to shared once MyProc is
> assigned.
>
> Obviously incorrect, for comparison: Just removing the MyProc != NULL
> check yields:
>0x00430bcd <+13>:mov0x48c7b4(%rip),%rax#
> 0x8bd388 
>0x00430bd4 <+20>:movl   $0xa00,0x2c8(%rax)
>0x00430bde <+30>:call   0xc47d0 
>
> using a uint32 *my_wait_event_info yields:
>0x00430b4d <+13>:mov0x47615c(%rip),%rax#
> 0x8a6cb0 
>0x00430b54 <+20>:movl   $0xa00,(%rax)
>0x00430b5a <+26>:call   0xc47d0 
>
> Note how the lack of offset addressing in the my_wait_event_info version
> makes the instruction smaller (call is at 26 instead of 30).
>
>
> Now, perhaps all of this isn't worth optimizing, most of the things done
> within pgstat_report_wait_start()/end() are expensive-ish. And forward
> branches are statically predicted to be not taken on several
> platforms. I have seen these instructions show up in profiles in
> workloads with contended lwlocks, at least...
>
> There's also a small win in code size:
>     text    data     bss      dec     hex  filename
>  8932095  192160  204656  9328911  8e590f  src/backend/postgres
>  8928544  192160  204656  9325360  8e4b30  src/backend/postgres_my_wait_event_info
>
>
> If we went for the my_wait_event_info approach there is one further
> advantage, after my change to move the wait event code into a separate
> file: wait_event.h does not need to include proc.h anymore, which seems
> architecturally nice for things like fd.c.
>
>
> Attached is patch series doing this.
>
>
> I'm inclined to separately change the comment format for
> wait_event.[ch]; there's no reason to stick with the current style:
>
> /* --
>  * pgstat_report_wait_start() -
>  *
>  *  Called from places where server process needs to wait.  This is

Re: Making wait events a bit more efficient

2021-04-02 Thread Andres Freund
Hi,

On 2021-04-02 13:06:35 -0700, Zhihong Yu wrote:
> +extern PGDLLIMPORT uint32 *my_wait_event_info;
> 
> It seems volatile should be added to the above declaration. Since later:
> 
> +   *(volatile uint32 *) my_wait_event_info = wait_event_info;

Why? We really just want to make the store volatile, nothing else. I
think it's much better to annotate that we want individual stores to
happen regardless of compiler optimizations, rather than all
interactions with a variable.
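
To spell out the difference in plain C terms (uint32 as in the tree):

static void
volatile_scope(uint32 *dst, uint32 x)
{
	volatile uint32 *p = dst;

	*p = x;						/* every access through p is volatile */

	*(volatile uint32 *) dst = x;	/* only this single store is volatile;
									 * other accesses to *dst remain fully
									 * optimizable */
}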

Greetings,

Andres Freund




Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Zhihong Yu
Bruce:
Thanks for tackling this issue.

The patch looks good to me.
When you have time, can you include the places which were not covered by
the current diff?

Cheers

On Fri, Apr 2, 2021 at 11:06 AM Bruce Momjian  wrote:

> On Thu, Apr  1, 2021 at 09:46:58PM -0700, Bryn Llewellyn wrote:
> > Or am I misunderstanding something?
> >
> > Try this. The result of each “select” is shown as the trailing comment
> on the
> > same line. I added whitespace by hand to line up the fields.
> >
> > select interval '-1.7 years';                           -- -1 years -8 mons
> >
> > select interval '29.4 months';                          --  2 years  5 mons 12 days
> >
> > select interval '-1.7 years 29.4 months';               --           8 mons 12 days  << wrong
> > select interval '29.4 months -1.7 years';               --           9 mons 12 days
> >
> > select interval '-1.7 years' + interval '29.4 months';  --           9 mons 12 days
> > select interval '29.4 months' + interval '-1.7 years';  --           9 mons 12 days
> >
> > As I reason it, the last four “select” statements are all semantically the
> > same. They’re just different syntaxes to add the two intervals that the first
> > two “select” statements use separately. There’s one odd man out. And I reason
> > this one to be wrong. Is there a flaw in my reasoning?
>
> [Thread moved to hackers.]
>
> Looking at your report, I thought I could easily find the cause, but it
> wasn't obvious.  What is happening is that when you cast a float to an
> integer in C, it rounds toward zero, meaning that 8.4 rounds to 8 and
> -8.4 rounds to -8.  The big problem here is that -8.4 is rounding in a
> positive direction, while 8.4 rounds in a negative direction.  See this:
>
> int(int(-8.4) + 29)
> 21
> int(int(29) + -8.4)
> 20
>
> When you do '-1.7 years' first, it becomes -8, and then adding 29 yields
> 21.  In the other order, it is 29 - 8.4, which yields 20.6, which
> becomes 20.  I honestly had never studied this interaction, though you
> would think I would have seen it before.  One interesting issue is that
> it only happens when the truncations happen to values with different
> signs --- if they are both positive or negative, it is fine.
>
> The best fix, I think, is to use rint()/round() to round to the closest
> integer, not toward zero.  The attached patch does this in a few places,
> but the code needs more research if we are happy with this approach,
> since there are probably other cases.  Using rint() does help to produce
> more accurate results, though the regression tests show no change from
> this patch.
>
> > Further… there’s a notable asymmetry. The fractional part of “1.7 years”
> is 8.4
> > months. But the fractional part of the months value doesn’t spread
> further down
> > into days. However, the fractional part of “29.4 months” (12 days) _does_
> > spread further down into days. What’s the rationale for this asymmetry?
>
> Yes, looking at the code, it seems we only spill down to one unit, not
> more.  I think we need to have a discussion if we want to change that.
> I think the idea was that if you specify a non-whole number, you
> probably want to spill down one level, but don't want it spilling all
> the way to milliseconds, which is certainly possible.
>
> > I can’t see that my observations here can be explained by the difference
> > between calendar time and clock time. Here I’m just working with
> non-metric
> > units like feet and inches. One year is just defined as 12 months. And
> one
> > month is just defined as 30 days. All that stuff about adding a month to
> > 3-Feb-2020 taking you to 3-Mar-2020 (same for leap years an non-leap
> years) ,
> > and that other stuff about adding one day to 23:00 on the day before the
> > “spring forward” moment taking you to 23:00 on the next day (i.w. when
> > intervals are added to timestamps) is downstream of simply adding two
> > intervals.
>
> Ah, seems you have done some research.  ;-)
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   If only the physical world exists, free will is an illusion.
>
>


Re: Add client connection check during the execution of the query

2021-04-02 Thread Thomas Munro
On Fri, Apr 2, 2021 at 6:18 PM tsunakawa.ta...@fujitsu.com
 wrote:
> The patch looks committable to me.

I checked for performance impact compared to master with pgbench -S,
and didn't see any problem.  I thought more about how to write a
decent race-free test but struggled with the lack of a good way to
control multiple connections from TAP tests and gave up for now.  I
previously tried to write something to help with that, but abandoned
it because it was terrible[1].  It seems a bit strange to me that psql
has to be involved at all, and we don't have a simple way to connect
one or more sockets and speak the protocol from perl.

Pushed!  Thanks to all who contributed.

[1] 
https://www.postgresql.org/message-id/CA%2BhUKG%2BFkUuDv-bcBns%3DZ_O-V9QGW0nWZNHOkEPxHZWjegRXvw%40mail.gmail.com,
see v2-0006-Add-TAP-test-for-snapshot-too-old.patch




Re: Making wait events a bit more efficient

2021-04-02 Thread Zhihong Yu
Hi,
Maybe I am not familiar with your patch.

I don't see where my_wait_event_info is read (there is no getter method in
the patch).

In that case, it is fine to omit volatile in the declaration.

Cheers

On Fri, Apr 2, 2021 at 1:10 PM Andres Freund  wrote:

> Hi,
>
> On 2021-04-02 13:06:35 -0700, Zhihong Yu wrote:
> > +extern PGDLLIMPORT uint32 *my_wait_event_info;
> >
> > It seems volatile should be added to the above declaration. Since later:
> >
> > +   *(volatile uint32 *) my_wait_event_info = wait_event_info;
>
> Why? We really just want to make the store volatile, nothing else. I
> think it's much better to annotate that we want individual stores to
> happen regardless of compiler optimizations, rather than all
> interactions with a variable.
>
> Greetings,
>
> Andres Freund
>


Re: Making wait events a bit more efficient

2021-04-02 Thread Andres Freund
Hi,

On 2021-04-02 13:42:42 -0700, Zhihong Yu wrote:
> I don't see where my_wait_event_info is read (there is no getter method in
> the patch).

There are no reads via my_wait_event_info. Once connected to shared
memory, InitProcess() calls pgstat_set_wait_event_storage() to point it
to &MyProc->wait_event_info.

Greetings,

Andres Freund




[PATCH] Implement motd for PostgreSQL

2021-04-02 Thread Joel Jacobson
Dear fellow hackers,

This patch is one day late; my apologies for missing the deadline this year.

PostgreSQL has long been suffering from the lack of a proper UNIX-style
motd (message of the day).

DBAs have no way of conveying important information to users,
having to rely on external protocols, such as HTTPS and "websites", to provide
such information.

By adding a motd configuration parameter, the DBA can set this to a text string,
which will be automatically presented to the user as a NOTICE when logging on 
to the server.

While at it, fix escape_single_quotes_ascii() to properly escape newlines,
so that newlines can be used in ALTER SYSTEM values.
This makes sense, since parsing \n in config values works just fine.

To demonstrate the usefulness of this feature,
I've set up an open public PostgreSQL server at "pit.org",
to which anyone can connect without a password.

You need to know the username though,
which will hopefully cause problems for bots.

$ psql -U brad -h pit.org motd
NOTICE:
     __  ___
  /)/  /
( / ___)
  (/ o)  ( o)   )
   _  (_  ))  /
   /_/)_/
  /  //|  |\
  v |  | v
__/

This was accomplished by setting the "motd" parameter,
which requires superuser privileges:

$ psql motd
psql (14devel)
Type "help" for help.

motd=# ALTER SYSTEM SET motd TO E'\u001B[94m'
'\n     __  ___  '
'\n  /)/  \/   \ '
'\n ( / ___\)'
'\n  \(/ o)  ( o)   )'
'\n   \_  (_  )   \ )  / '
'\n \  /\_/\)_/  '
'\n  \/  //|  |\\'
'\n  v |  | v'
'\n\__/  '
'\u001b[0m';
ALTER SYSTEM
motd=# SELECT pg_reload_conf();
pg_reload_conf

t
(1 row)

motd=# \q

Ascii elephant in example by Michael Paquier [1], with ANSI colors added by me.

[1] 
https://www.postgresql.org/message-id/CAB7nPqRaacwcaANOYY3Hxd3T0No5RdZXyqM5HB6fta%2BCoDLOEg%40mail.gmail.com

Happy Easter!

/Joel

0001-quote-newlines.patch
Description: Binary data


0002-motd.patch
Description: Binary data


Re: [PATCH] Implement motd for PostgreSQL

2021-04-02 Thread Chapman Flack
On 04/02/21 16:46, Joel Jacobson wrote:

>  __  ___
>   /)/  /
> ( / ___)
>   (/ o)  ( o)   )
>_  (_  ))  /
>/_/)_/
>   /  //|  |\
>   v |  | v
> __/


Slonik's backslashes are falling off. Eww.

Regards,
-Chap




Re: Have I found an interval arithmetic bug?

2021-04-02 Thread John W Higgins
On Fri, Apr 2, 2021 at 11:05 AM Bruce Momjian  wrote:

> On Thu, Apr  1, 2021 at 09:46:58PM -0700, Bryn Llewellyn wrote:
> > Or am I misunderstanding something?
> >
> > Try this. The result of each “select” is shown as the trailing comment
> on the
> > same line. I added whitespace by hand to line up the fields.
> >
> > select interval '-1.7 years';  -- -1 years -8
> mons
> >
> > select interval '29.4 months'; --  2 years  5
> mons 12
> > days
> >
> > select interval '-1.7 years 29.4 months';  --   8
> mons 12
> > days << wrong
> > select interval '29.4 months -1.7 years';  --   9
> mons 12
> > days
> >
> > select interval '-1.7 years' + interval '29.4 months'; --   9
> mons 12
> > days
> > select interval '29.4 months' + interval '-1.7 years'; --   9
> mons 12
> > days
> >
>

While maybe there is an argument to fixing the negative/positive rounding
issue - there is no way this gets solved without breaking the current
implementation

select interval '0.3 years' + interval '0.4 years' - interval '0.7 years' +
interval '0.1 years' should not equal 0 but it certainly does.

Unless we take the concept of 0.3 years = 3 months and move to something
along the lines of

1 year = 360 days
1 month = 30 days

so therefore

0.3 years = 360 days * 0.3 = 108 days = 3 months 18 days
0.4 years = 360 days * 0.4 = 144 days = 4 months 24 days
0.7 years = 360 days * 0.7 = 252 days = 8 months 12 days
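
As a quick plausibility check of that decomposition, in plain SQL
arithmetic (not current PostgreSQL interval behavior):

    SELECT (360*0.3)::int AS a,  -- 108 days = 3 months 18 days
           (360*0.4)::int AS b,  -- 144 days = 4 months 24 days
           (360*0.7)::int AS c;  -- 252 days = 8 months 12 days
    -- 108 + 144 - 252 = 0, so the sum really would cancel under this scheme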

Then, and only if we don't go to any more than tenths of a year, does the
math work. Probably this should resolve down to seconds and then work
backwards - but unless we're looking at breaking the entire way it
currently resolves things - I don't think this is of much value.

Doing math on intervals is like doing math on rounded numbers - there is
always going to be a pile of issues because the level of precision just is
not good enough.

John


Re: Bug? pg_identify_object_as_address() et al doesn't work with pg_enum.oid

2021-04-02 Thread Andrew Dunstan


On 3/30/21 3:08 PM, Joel Jacobson wrote:
> Hi,
>
> Some catalog oid values originate from other catalogs,
> such as pg_aggregate.aggfnoid -> pg_proc.oid
> or pg_attribute.attrelid -> pg_class.oid.
>
> For such oid values, the foreign catalog is the regclass
> which should be passed as the first argument to
> all the functions taking (classid oid, objid oid, objsubid integer)
> as input, i.e. pg_describe_object(), pg_identify_object() and
> pg_identify_object_as_address().
>
> All oids values in all catalogs,
> can be used with these functions,
> as long as the correct regclass is passed as the first argument,
> *except* pg_enum.oid.
>
> (This is not a problem for pg_enum.enumtypid,
> its regclass is 'pg_type' and works fine.)
>
> I would have expected the regclass to be 'pg_enum'::regclass,
> since there is no foreign key on pg_enum.oid.
>
> In a way, pg_enum is similar to pg_attribute,
>    pg_enum.enumtypid -> pg_type.oid
> reminds me of
>    pg_attribute.attrelid -> pg_class.oid
>
> But pg_enum has its own oid column as primary key,
> whereas in pg_attribute we only have a multi-column primary
> key (attrelid, attnum).
>
> Is this a bug? I.e. should we add support to deal with pg_enum.oid?
>
> Or is this by design?
> If so, wouldn't it be good to mention this corner-case
> somewhere in the documentation for pg_identify_object_as_address() et al?
> That is, to explain these functions works for almost all oid values,
> except pg_enum.oid.
>
>


I think the short answer is it's not a bug. In theory we could provide
support for


  pg_describe_object('pg_enum'::regclass, myenum, 0)


but what would it return? There is no sane description other than the
enum's label, which you can get far more simply.


Maybe this small break on orthogonality should be noted, if enough
people care, but I doubt we should do anything else.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PATCH] Implement motd for PostgreSQL

2021-04-02 Thread Marko Tiikkaja
 Hi Joel

On Fri, Apr 2, 2021 at 11:47 PM Joel Jacobson  wrote:

> PostgreSQL has long been suffering from the lack of a proper UNIX-style
> motd (message of the day).
>

First of all, thanks for your work on this!  I think this is an important
feature to have, but I would love to see a way to have a set of strings
from which you choose a random one to display.  That way you could brighten
your day with random positive messages.
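
One way to prototype that today would be a newline-separated list kept in
a custom GUC; the name app.motd_list is made up for this sketch:

    SELECT msgs[1 + floor(random() * cardinality(msgs))::int]
    FROM (SELECT string_to_array(
                     current_setting('app.motd_list', true), E'\n') AS msgs) s;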


-marko


Re: policies with security definer option for allowing inline optimization

2021-04-02 Thread Dan Lynch
My goal is to use RLS for everything, including SELECTs, so it's super
important to consider every performance tweak possible. Appreciate any
insights or comments. I'm also hoping to document this better for
application developers who want to use postgres and RLS.

Does anyone know details of, or where to find more information about the
implications of the optimizer on the quals/checks for the policies being
functions vs inline?

It seems that the solution today may be that we have to write functions
with SECURITY DEFINER. I also saw Joe's LinkedIn share regarding an
article using inline functions in the qual/checks to solve a policy size
issue, but I'm also wondering about the performance implications of
inline expressions vs functions:

Imagine you need to do a JOIN to check an owned table against an acl table

(group_id = ANY (ARRAY (
    SELECT acl.entity_id
    FROM org_memberships_acl acl
    JOIN app_groups obj ON acl.entity_id = obj.owner_id
    WHERE acl.actor_id = current_user_id()
)))


you could wrap that query into a function (so we can apply SECURITY DEFINER
to the tables involved to avoid nested RLS lookups)

(group_id = ANY (ARRAY (
get_group_ids_of_current_user()
)))
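
For what it's worth, a minimal sketch of such a wrapper, assuming the
table and function names from the example above and bigint keys (both
are guesses, not the actual schema):

    CREATE FUNCTION get_group_ids_of_current_user()
    RETURNS bigint[]
    LANGUAGE sql STABLE SECURITY DEFINER
    AS $$
        -- note: returns NULL (filtering all rows) when there are no matches
        SELECT array_agg(acl.entity_id)
        FROM org_memberships_acl acl
        JOIN app_groups obj ON acl.entity_id = obj.owner_id
        WHERE acl.actor_id = current_user_id();
    $$;

With an array-returning wrapper, the qual can drop the ARRAY(...) wrapper
entirely:

    (group_id = ANY (get_group_ids_of_current_user()))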

Does anyone here know how the optimizer would handle this? If the
get_group_ids_of_current_user() function is marked STABLE, would the
optimizer evaluate it once rather than for every row in a SELECT that
returns multiple rows? And does it make a difference whether the
function is SQL or PL/pgSQL?

Am I splitting hairs here, and maybe this is a trivial nuance that
shouldn't really matter for performance? If it's true that inline functions
would perform better, then this thread, and potentially a feature
request, seems pretty important.


*Other important RLS Performance Optimizations*

I also want to share my research from online so it's documented somewhere.
I would love to get more information to formally document these
optimizations. Here are the two articles I've found to be useful for how to
structure RLS policies performantly:

https://cazzer.medium.com/designing-the-most-performant-row-level-security-strategy-in-postgres-a06084f31945


https://medium.com/@ethanresnick/there-are-a-few-faster-ways-that-i-know-of-to-handle-the-third-case-with-rls-9d22eaa890e5



The 2nd article was particularly useful (it was written in response to the
first article), highlighting an important detail that should probably be
more explicit for folks writing policies, especially SELECT policies.
Essentially it boils down to not passing properties from the rows into the
functions used to check security, but instead inverting the logic:
returning the identifiers as an array and checking whether the row's owned
key matches one of those identifiers.

For example,

a GOOD qual/check expr

owner_id = ANY ( function_that_gets_current_users_organization_ids() )

a BAD qual/check expr

can_user_access_object(owner_id)

The main benefit of the first expr is that if
function_that_gets_current_users_organization_ids is STABLE, the optimizer
can run this once for all rows, and thus for SELECTs should actually run
fast. The 2nd expr takes as an argument the column, which would have to run
for every single row making SELECTs run very slow depending on the function.

This actually is pretty intuitive once you look at it. Reversing the logic
and returning IDs makes sense when you imagine what PG has to do in order
to check rows, I suppose there are limitations depending on the cardinality
of the IDs returned and postgres's ability to check some_id = ANY (array)
for large arrays.
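
Putting the "GOOD" shape above into an actual policy might look like this
(table and function names are placeholders, and the function is assumed
to return an array and be marked STABLE):

    CREATE POLICY org_member_select ON some_owned_table
        FOR SELECT
        USING (owner_id = ANY (function_that_gets_current_users_organization_ids()));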

Dan Lynch
(734) 657-4483


On Fri, Apr 2, 2021 at 7:47 AM Joe Conway  wrote:

> On 4/2/21 10:23 AM, Stephen Frost wrote:
> > Greetings,
> >
> > * Joe Conway (m...@joeconway.com) wrote:
> >> On 4/2/21 9:57 AM, Isaac Morland wrote:
> >> >Views already run security definer, allowing them to be used for some
> of
> >> >the same information-hiding purposes as RLS. But I just found something
> >> >strange: current_user/_role returns the user's role, not the view
> owner's
> >> >role:
> >>
> >> >postgres=# set role to t1;
> >> >SET
> >> >postgres=> table tt;
> >> >ERROR:  permission denied for table tt
> >> >postgres=> table tv;
> >> >  ?column? | current_user
> >> >--+--
> >> > 5 | t1
> >> >(1 row)
> >> >
> >> >postgres=>
> >> >
> >> >Note that even though current_user is t1 "inside" the view, it is still
> >> >able to see the contents of table tt. Shouldn't current_user/_role
> return
> >> >the view owner in this situation? By contrast security definer
> functions
> >> >work properly:
> >>
> >> That is because while VIEWs are effectivel

Re: [PATCH] Implement motd for PostgreSQL

2021-04-02 Thread Bruce Momjian
On Fri, Apr  2, 2021 at 10:46:16PM +0200, Joel Jacobson wrote:
> Dear fellow hackers,
> 
> This patch is one day late, my apologies for missing the deadline this year.

Uh, the deadline for the last commitfest was March 1, 2021, not April
1.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Using COPY FREEZE in pgbench

2021-04-02 Thread Peter Geoghegan
On Sun, Mar 21, 2021 at 5:23 PM Tatsuo Ishii  wrote:
> Anyway, this time total pgbench time is reduced by 14% over all
> here. I hope people agree that the patch is worth the gain.

Most of the time when I run pgbench I use my own shell script, which does this:

PGOPTIONS='-c vacuum_freeze_min_age=0 -c wal_compression=off' pgbench
-i -s $SCALE

Have you considered this case? In other words, have you considered the
benefits of this patch for users that currently deliberately force
freezing by VACUUM, just because it matters to their benchmark?

(BTW you might be surprised how much wal_compression=off matters here.)
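
One way to check how much of pgbench_accounts actually ends up all-frozen
after loading (a sketch; it needs the pg_visibility extension):

    CREATE EXTENSION IF NOT EXISTS pg_visibility;
    SELECT count(*) AS pages,
           count(*) FILTER (WHERE all_frozen) AS frozen_pages
    FROM pg_visibility_map('pgbench_accounts');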

-- 
Peter Geoghegan




Re: libpq debug log

2021-04-02 Thread Alvaro Herrera
On 2021-Apr-02, Tom Lane wrote:

> Alvaro Herrera  writes:
> > As in the attached patch.
> 
> +1, but the comment could be more specific.  Maybe like "In regress mode,
> suppress the length of ErrorResponse and NoticeResponse messages --- the F
> (file name) field, in particular, can vary in length depending on compiler
> used."

With your commit this is no longer true, so how about the attached?

-- 
Álvaro Herrera   Valdivia, Chile
From 80af8f687ea3665a045d9005e0ec1fc53e1a393a Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 2 Apr 2021 10:51:42 -0300
Subject: [PATCH v2] Suppress length of Notice/Error in PQtrace regress mode

A (relatively minor) annoyance of ErrorResponse/NoticeResponse messages
as printed by PQtrace() is that their length might vary when we move
error messages from one file to another, one function to another, or
even when the line number of their location changes its number of digits.

To avoid having to adjust expected files for some tests, make the
regress mode of PQtrace() suppress the length word of those messages.

Discussion: https://postgr.es/m/20210402023010.GA13563@alvherre.pgsql
Reviewed-by: Tom Lane 
---
 src/interfaces/libpq/fe-trace.c   | 11 ++-
 .../libpq_pipeline/traces/pipeline_abort.trace|  8 
 .../modules/libpq_pipeline/traces/transaction.trace   |  6 +++---
 3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/src/interfaces/libpq/fe-trace.c b/src/interfaces/libpq/fe-trace.c
index 9a4595f5c8..ca24869e91 100644
--- a/src/interfaces/libpq/fe-trace.c
+++ b/src/interfaces/libpq/fe-trace.c
@@ -540,7 +540,16 @@ pqTraceOutputMessage(PGconn *conn, const char *message, bool toServer)
 	length = (int) pg_ntoh32(length);
 	logCursor += 4;
 
-	fprintf(conn->Pfdebug, "%s\t%d\t", prefix, length);
+	/*
+	 * In regress mode, suppress the length of ErrorResponse and
+	 * NoticeResponse.  The F (file name), L (line number) and R (routine
+	 * name) fields can change as server code is modified, and if their
+	 * lengths differ from the originals, that would break tests.
+	 */
+	if (regress && !toServer && (id == 'E' || id == 'N'))
+		fprintf(conn->Pfdebug, "%s\tNN\t", prefix);
+	else
+		fprintf(conn->Pfdebug, "%s\t%d\t", prefix, length);
 
 	switch (id)
 	{
diff --git a/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace b/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace
index b061435785..254e485997 100644
--- a/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace
+++ b/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace
@@ -1,5 +1,5 @@
 F	42	Query	 "DROP TABLE IF EXISTS pq_pipeline_demo"
-B	123	NoticeResponse	 S "NOTICE" V "NOTICE" C "0" M "table "pq_pipeline_demo" does not exist, skipping" F "" L "" R "" \x00
+B	NN	NoticeResponse	 S "NOTICE" V "NOTICE" C "0" M "table "pq_pipeline_demo" does not exist, skipping" F "" L "" R "" \x00
 B	15	CommandComplete	 "DROP TABLE"
 B	5	ReadyForQuery	 I
 F	99	Query	 "CREATE UNLOGGED TABLE pq_pipeline_demo(id serial primary key, itemno integer,int8filler int8);"
@@ -27,7 +27,7 @@ B	4	ParseComplete
 B	4	BindComplete
 B	4	NoData
 B	15	CommandComplete	 "INSERT 0 1"
-B	217	ErrorResponse	 S "ERROR" V "ERROR" C "42883" M "function no_such_function(integer) does not exist" H "No function matches the given name and argument types. You might need to add explicit type casts." P "8" F "" L "" R "" \x00
+B	NN	ErrorResponse	 S "ERROR" V "ERROR" C "42883" M "function no_such_function(integer) does not exist" H "No function matches the given name and argument types. You might need to add explicit type casts." P "8" F "" L "" R "" \x00
 B	5	ReadyForQuery	 I
 B	4	ParseComplete
 B	4	BindComplete
@@ -39,7 +39,7 @@ F	12	Bind	 "" "" 0 0 0
 F	6	Describe	 P ""
 F	9	Execute	 "" 0
 F	4	Sync
-B	123	ErrorResponse	 S "ERROR" V "ERROR" C "42601" M "cannot insert multiple commands into a prepared statement" F "" L "" R "" \x00
+B	NN	ErrorResponse	 S "ERROR" V "ERROR" C "42601" M "cannot insert multiple commands into a prepared statement" F "" L "" R "" \x00
 B	5	ReadyForQuery	 I
 F	54	Parse	 "" "SELECT 1.0/g FROM generate_series(3, -1, -1) g" 0
 F	12	Bind	 "" "" 0 0 0
@@ -52,7 +52,7 @@ B	33	RowDescription	 1 "?column?"  0  65535 -1 0
 B	32	DataRow	 1 22 '0.'
 B	32	DataRow	 1 22 '0.5000'
 B	32	DataRow	 1 22 '1.'
-B	70	ErrorResponse	 S "ERROR" V "ERROR" C "22012" M "division by zero" F "" L "" R "" \x00
+B	NN	ErrorResponse	 S "ERROR" V "ERROR" C "22012" M "division by zero" F "" L "" R "" \x00
 B	5	ReadyForQuery	 I
 F	40	Query	 "SELECT itemno FROM pq_pipeline_demo"
 B	31	RowDescription	 1 "itemno"  2  4 -1 0
diff --git a/src/test/modules/libpq_pipeline/traces/transaction.trace b/src/test/modules/libpq_pipeline/traces/transaction.trace
index 0267a534fa..1dcc2373c0 100644
--- a/src/test/modules/libpq_pipeline/traces/tra

Re: libpq debug log

2021-04-02 Thread Tom Lane
Alvaro Herrera  writes:
> On 2021-Apr-02, Tom Lane wrote:
>> +1, but the comment could be more specific.  Maybe like "In regress mode,
>> suppress the length of ErrorResponse and NoticeResponse messages --- the F
>> (file name) field, in particular, can vary in length depending on compiler
>> used."

> With your commit this is no longer true, so how about the attached?

Right.  Your text looks fine.

(I see drongo is now green, so it looks like its compiler does indeed emit
backslash-separated full paths.)

regards, tom lane




Re: SP-GiST confusion: indexed column's type vs. index column type

2021-04-02 Thread Tom Lane
Peter Geoghegan  writes:
> On Fri, Apr 2, 2021 at 9:37 AM Tom Lane  wrote:
>> I propose changing things so that
>> (A) attType really is the input (heap) data type.
>> (B) We enforce that leafType agrees with the opclass opckeytype,
>> ensuring the index tupdesc can be used for leaf tuples.
>> (C) The tupdesc passed back for an index-only scan reports the
>> input (heap) data type.
>> 
>> This might be too much change for the back branches.  Given the
>> lack of complaints to date, I think we can just fix it in HEAD.

> +1 to fixing it on HEAD only.

Here's a draft patch for that, in case anyone wants to look it
over.

The confusion went even deeper than I thought, as some of the code
mistakenly thought that reconstructed "leafValue" values were of the
leaf data type rather than the input attribute type.  (Which is not
too surprising, given that that's such a misleading name, but the
docs are clear and correct on the point.)

Also, both the code and docs thought that the "reconstructedValue"
datums that are passed down the tree during a search should be of
the leaf data type.  This seems to me to be arrant nonsense.
As an example, if you made an opclass that indexes 1-D arrays
by labeling each inner node with successive array elements,
right down to the leaf which is the last array element, it will
absolutely not work for the reconstructedValues to be of the
leaf type --- they have to be of the array type.  (As I said
in commit 1ebdec8c0, this'd be a fairly poorly-chosen opclass
design, but it seems like it ought to physically work.)

Given the amount of confusion here, I don't have a lot of confidence
that an opclass that wants to reconstruct values while having
leafType different from input type will work even with this patch.
I'm strongly tempted to make a src/test/modules module that
implements exactly the silly design given above, just so we have
some coverage for this scenario.

regards, tom lane

diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index ea88ae45e5..ba66deb003 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -330,19 +330,25 @@ typedef struct spgConfigOut
  
 
  
-  leafType is typically the same as
-  attType.  For the reasons of backward
-  compatibility, method config can
-  leave leafType uninitialized; that would
-  give the same effect as setting leafType equal
-  to attType.  When attType
-  and leafType are different, then optional
+  leafType must match the index storage type
+  defined by the operator class's opckeytype
+  catalog entry.  For reasons of backward compatibility,
+  method config can
+  leave leafType uninitialized (zero);
+  that is interpreted as meaning opckeytype.
+  (Recall that opckeytype can in turn be zero,
+  implying the storage type is the same as the operator class's input
+  type.)  In what follows, leafType should be
+  understood as referring to the fully resolved leaf data type.
+ 
+
+ 
+  When attType
+  and leafType are different, the optional
   method compress must be provided.
   Method compress is responsible
   for transformation of datums to be indexed from attType
   to leafType.
-  Note: both consistent functions will get scankeys
-  unchanged, without transformation using compress.
  
  
 
@@ -677,8 +683,7 @@ typedef struct spgInnerConsistentOut
reconstructedValue is the value reconstructed for the
parent tuple; it is (Datum) 0 at the root level or if the
inner_consistent function did not provide a value at the
-   parent level. reconstructedValue is always of
-   spgConfigOut.leafType type.
+   parent level.
traversalValue is a pointer to any traverse data
passed down from the previous call of inner_consistent
on the parent index tuple, or NULL at the root level.
@@ -713,9 +718,14 @@ typedef struct spgInnerConsistentOut
necessarily so, so an array is used.)
If value reconstruction is needed, set
reconstructedValues to an array of the values
-   of spgConfigOut.leafType type
reconstructed for each child node to be visited; otherwise, leave
reconstructedValues as NULL.
+   The reconstructed values are assumed to be of the indexed column's
+   type, that is spgConfigIn.attType.
+   (However, since the core system will do nothing with them except
+   possibly copy them, it is sufficient for them to have the
+   same typlen and typbyval
+   properties as attType.)
If ordered search is performed, set distances
to an array of distance values according to orderbys
array (nodes with lowest distances will be processed first).  Leave it
@@ -797,8 +807,7 @@ typedef struct spgLeafConsistentOut
reconstructedValue is the value reconstructed for the
parent tuple; it is (Datum) 0 at the root level or 

Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Bruce Momjian
On Fri, Apr  2, 2021 at 01:05:42PM -0700, Bryn Llewellyn wrote:
> I should come clean about the larger context. I work for Yugabyte, Inc. We 
> have
> a distributed SQL database that uses the Version 11.2 PostgreSQL C code for 
> SQL
> processing “as is”.
> 
> https://blog.yugabyte.com/
> distributed-postgresql-on-a-google-spanner-architecture-query-layer/
> 
> The founders decided to document YugabyteDB’s SQL functionality explicitly
> rather than just to point to the published PostgreSQL doc. (There are some DDL
> differences that reflect the storage layer differences.) I’m presently
> documenting date-time functionality. This is why I’m so focused on
> understanding the semantics exactly and on understanding the requirements that
> the functionality was designed to meet. I’m struggling with interval
> functionality. I read this:

[Sorry, also moved this to hackers.  I might normally do all the
discussion on general, with patches, and then move it to hackers, but
our PG 14 deadline is next week, so it is best to move it now in hopes
it can be addressed in PG 14.]

Sure, seems like a good idea.

> https://www.postgresql.org/docs/current/datatype-datetime.html#
> DATATYPE-INTERVAL-INPUT
> 
> « …field values can have fractional parts; for example '1.5 week' or
> '01:02:03.45'. Such input is converted to the appropriate number of months,
> days, and seconds for storage. When this would result in a fractional number 
> of
> months or days, the fraction is added to the lower-order fields using the
> conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5
> month' becomes 1 month and 15 days. Only seconds will ever be shown as
> fractional on output. »
> 
> Notice that the doc says that spill-down goes all the way to seconds and not
> just one unit. This simple test is consistent with the doc (output follows the
> dash-dash comment):
> 
> select ('6.54321 months'::interval)::text as i; --  6 mons 16 days 07:06:40.32
> 
> You see similar spill-down with this:
> 
> select ('876.54321 days'::interval)::text as i; -- 876 days 13:02:13.344
> 
> And so on down through the remaining smaller units. It’s only this test that
> doesn’t spill down one unit:
> 
> select ('6.54321 years'::interval)::text as i; --  6 years 6 mons
> 
> This does suggest a straight bug rather than a case for committee debate about
> what might have been intended. What do you think, Bruce?

So, that gets into more detail.  When I said "spill down one unit", I
was not talking about _visible_ units, but rather the three internal
units used by Postgres:


https://www.postgresql.org/docs/13/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
Internally interval values are stored as months, days, and seconds.

However, while that explains why years don't spill beyond months, it
doesn't explain why months would spill beyond days.  This certainly
seems inconsistent.

I have modified the patch to prevent partial months from creating
partial hours/minutes/seconds, so the output is now at least consistent
based on the three units:

SELECT ('6.54321 years'::interval)::text as i;
   i

 6 years 7 mons

SELECT ('6.54321 months'::interval)::text as i;
   i

 6 mons 16 days

SELECT ('876.54321 days'::interval)::text as i;
   i
---
 876 days 13:02:13.344

Partial years keeps it in months, partial months takes it to days, and
partial days take it to hours/minutes/seconds.  This seems like an
improvement.

This also changes the regression test output, I think for the better:

 SELECT INTERVAL '1.5 weeks';
  i
 --
- 10 days 12:00:00
+ 10 days

The new output is less precise, but probably closer to what the user
wanted.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 889077f55c..d5b3705145 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -526,7 +526,6 @@ AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
 	extra_days = (int) frac;
 	tm->tm_mday += extra_days;
 	frac -= extra_days;
-	AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
 }
 
 /* Fetch a fractional-second value with suitable error checking */
@@ -3307,28 +3306,28 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 	case DTK_YEAR:
 		tm->tm_year += val;
 		if (fval != 0)
-			tm->tm_mon += fval * MONTHS_PER_YEAR;
+			tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
 		tmask = DTK_M(YEAR);
 		break;
 
 	case DTK_DECADE:
 		tm->tm_year += val * 1

Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Bruce Momjian
On Fri, Apr  2, 2021 at 01:27:33PM -0700, Zhihong Yu wrote:
> Bruce:
> Thanks for tackling this issue.
> 
> The patch looks good to me.
> When you have time, can you include the places which were not covered by the
> current diff ?

I have just posted a new version of the patch which I think covers all
the right areas.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Any objections to implementing LogicalDecodeMessageCB for pgoutput?

2021-04-02 Thread Euler Taveira
On Thu, Apr 1, 2021, at 7:19 AM, Amit Kapila wrote:
> Your ideas/suggestions look good to me. Don't we need to provide a
> read function corresponding to logicalrep_write_message? We have it
> for other write functions. Can you please combine all of your changes
> into one patch?
Thanks for taking a look at this patch. I didn't consider a
logicalrep_read_message function because the protocol doesn't support it yet.

/*
* Logical replication does not use generic logical messages yet.
* Although, it could be used by other applications that use this
* output plugin.
*/

Someone that is inspecting the code in the future could possibly check this
discussion to understand why this function isn't available.

This new patch set version has 2 patches because there are 2 separate
changes: the parse_output_parameters() refactor and logical decoding message
support.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From d17cda24259bbaa019bedd1175960375c2905112 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Mon, 16 Nov 2020 18:49:57 -0300
Subject: [PATCH v3 1/2] Refactor function parse_output_parameters

Instead of using multiple parameters in the parse_output_parameters function
signature, use the struct PGOutputData that encapsulates all pgoutput
options. It is the right approach to take in terms of maintainability.
---
 src/backend/replication/pgoutput/pgoutput.c | 24 -
 src/include/replication/pgoutput.h  |  1 +
 2 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 1b993fb032..6146c5acdb 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -156,9 +156,7 @@ _PG_output_plugin_init(OutputPluginCallbacks *cb)
 }
 
 static void
-parse_output_parameters(List *options, uint32 *protocol_version,
-		List **publication_names, bool *binary,
-		bool *enable_streaming)
+parse_output_parameters(List *options, PGOutputData *data)
 {
 	ListCell   *lc;
 	bool		protocol_version_given = false;
@@ -166,7 +164,8 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 	bool		binary_option_given = false;
 	bool		streaming_given = false;
 
-	*binary = false;
+	data->binary = false;
+	data->streaming = false;
 
 	foreach(lc, options)
 	{
@@ -196,7 +195,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 		 errmsg("proto_version \"%s\" out of range",
 strVal(defel->arg;
 
-			*protocol_version = (uint32) parsed;
+			data->protocol_version = (uint32) parsed;
 		}
 		else if (strcmp(defel->defname, "publication_names") == 0)
 		{
@@ -207,7 +206,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 			publication_names_given = true;
 
 			if (!SplitIdentifierString(strVal(defel->arg), ',',
-	   publication_names))
+	   &data->publication_names))
 ereport(ERROR,
 		(errcode(ERRCODE_INVALID_NAME),
 		 errmsg("invalid publication_names syntax")));
@@ -220,7 +219,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 		 errmsg("conflicting or redundant options")));
 			binary_option_given = true;
 
-			*binary = defGetBoolean(defel);
+			data->binary = defGetBoolean(defel);
 		}
 		else if (strcmp(defel->defname, "streaming") == 0)
 		{
@@ -230,7 +229,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 		 errmsg("conflicting or redundant options")));
 			streaming_given = true;
 
-			*enable_streaming = defGetBoolean(defel);
+			data->streaming = defGetBoolean(defel);
 		}
 		else
 			elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
@@ -244,7 +243,6 @@ static void
 pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
  bool is_init)
 {
-	bool		enable_streaming = false;
 	PGOutputData *data = palloc0(sizeof(PGOutputData));
 
 	/* Create our memory context for private allocations. */
@@ -265,11 +263,7 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
 	if (!is_init)
 	{
 		/* Parse the params and ERROR if we see any we don't recognize */
-		parse_output_parameters(ctx->output_plugin_options,
-&data->protocol_version,
-&data->publication_names,
-&data->binary,
-&enable_streaming);
+		parse_output_parameters(ctx->output_plugin_options, data);
 
 		/* Check if we support requested protocol */
 		if (data->protocol_version > LOGICALREP_PROTO_MAX_VERSION_NUM)
@@ -295,7 +289,7 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
 		 * we only allow it with sufficient version of the protocol, and when
 		 * the output plugin supports it.
 		 */
-		if (!enable_streaming)
+		if (!data->streaming)
 			ctx->streaming = false;
 		else if (data->protocol_version < LOGICALREP_PROTO_STREAM_VERSION_NUM)
 			ereport(ERROR,
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
inde

Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Bruce Momjian
On Fri, Apr  2, 2021 at 02:00:03PM -0700, John W Higgins wrote:
> On Fri, Apr 2, 2021 at 11:05 AM Bruce Momjian  wrote:
> While maybe there is an argument to fixing the negative/positive rounding 
> issue
> - there is no way this gets solved without breaking the current implementation
> 
> select interval '0.3 years' + interval '0.4 years' - interval '0.7 years' +
> interval '0.1 years' should not equal 0 but it certainly does.

My new code returns 0.2 months for this, not zero:

SELECT  interval '0.3 years' + interval '0.4 years' -
interval '0.7 years' + interval '0.1 years';
 ?column?
--
 2 mons

which is also wrong since:

SELECT interval '0.1 years';
 interval
--
 1 mon

> Unless we take the concept of 0.3 years = 3 months and move to something along
> the lines of 
> 
> 1 year = 360 days
> 1 month = 30 days 
> 
> so therefore 
> 
> 0.3 years = 360 days * 0.3 = 108 days = 3 months 18 days 
> 0.4 years = 360 days * 0.4 = 144 days = 4 months 24 days
> 0.7 years = 360 days * 0.7 = 252 days = 8 months 12 days
> 
> Then, and only if we don't go to any more than tenths of a year, does the math
> work. Probably this should resolve down to seconds and then work backwards -
> but unless we're looking at breaking the entire way it currently resolves
> things - I don't think this is of much value.
> 
> Doing math on intervals is like doing math on rounded numbers - there is 
> always
> going to be a pile of issues because the level of precision just is not good
> enough.

I think the big question is what units do people want with fractional
values.  I have posted a follow-up email that spills only for one unit,
which I think is the best approach.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Using COPY FREEZE in pgbench

2021-04-02 Thread Tatsuo Ishii
> Most of the time when I run pgbench I use my own shell script, which does 
> this:
> 
> PGOPTIONS='-c vacuum_freeze_min_age=0 -c wal_compression=off' pgbench
> -i -s $SCALE
> 
> Have you considered this case? In other words, have you considered the
> benefits of this patch for users that currently deliberately force
> freezing by VACUUM, just because it matters to their benchmark?

I am not sure how many people use these kinds of options while running
pgbench -i, but we could add yet another switch to fall back to a
non-FREEZE COPY if you want.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Zhihong Yu
Hi,
bq. My new code returns 0.2 months for this, not zero

Can you clarify (the output below was 2 mons, not 0.2)?

Thanks

On Fri, Apr 2, 2021 at 4:58 PM Bruce Momjian  wrote:

> On Fri, Apr  2, 2021 at 02:00:03PM -0700, John W Higgins wrote:
> > On Fri, Apr 2, 2021 at 11:05 AM Bruce Momjian  wrote:
> > While maybe there is an argument to fixing the negative/positive
> rounding issue
> > - there is no way this gets solved without breaking the current
> implementation
> >
> > select interval '0.3 years' + interval '0.4 years' - interval '0.7
> years' +
> > interval '0.1 years' should not equal 0 but it certainly does.
>
> My new code returns 0.2 months for this, not zero:
>
> SELECT  interval '0.3 years' + interval '0.4 years' -
> interval '0.7 years' + interval '0.1 years';
>  ?column?
> --
>  2 mons
>
> which is also wrong since:
>
> SELECT interval '0.1 years';
>  interval
> --
>  1 mon
>
> > Unless we take the concept of 0.3 years = 3 months and move to something
> along
> > the lines of
> >
> > 1 year = 360 days
> > 1 month = 30 days
> >
> > so therefore
> >
> > 0.3 years = 360 days * 0.3 = 108 days = 3 months 18 days
> > 0.4 years = 360 days * 0.4 = 144 days = 4 months 24 days
> > 0.7 years = 360 days * 0.7 = 252 days = 8 months 12 days
> >
> > Then, and only if we don't go to any more than tenths of a year, does
> the math
> > work. Probably this should resolve down to seconds and then work
> backwards -
> > but unless we're looking at breaking the entire way it currently resolves
> > things - I don't think this is of much value.
> >
> > Doing math on intervals is like doing math on rounded numbers - there is
> always
> > going to be a pile of issues because the level of precision just is not
> good
> > enough.
>
> I think the big question is what units do people want with fractional
> values.  I have posted a follow-up email that spills only for one unit,
> which I think is the best approach.
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   If only the physical world exists, free will is an illusion.
>
>
>
>


Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Bruce Momjian
On Fri, Apr  2, 2021 at 07:47:32PM -0400, Bruce Momjian wrote:
> I have modified the patch to prevent partial months from creating
> partial hours/minutes/seconds, so the output is now at least consistent
> based on the three units:
> 
>   SELECT ('6.54321 years'::interval)::text as i;
>  i
>   
>6 years 7 mons
>   
>   SELECT ('6.54321 months'::interval)::text as i;
>  i
>   
>6 mons 16 days
>   
>   SELECT ('876.54321 days'::interval)::text as i;
>  i
>   ---
>876 days 13:02:13.344
> 
> Partial years keeps it in months, partial months takes it to days, and
> partial days take it to hours/minutes/seconds.  This seems like an
> improvement.
> 
> This also changes the regression test output, I think for the better:
> 
>SELECT INTERVAL '1.5 weeks';
> i
>--
>   - 10 days 12:00:00
>   + 10 days
> 
> The new output is less precise, but probably closer to what the user
> wanted.

Thinking some more about this, the connection between months and days is
very inaccurate, 30 days/month, but the connection between days and
hours/minutes/seconds is pretty accurate, except for leap days. 
Therefore, returning "10 days 12:00:00" is in many ways better, but
returning hours/minutes/seconds for fractional months is very arbitrary
and suggests an accuracy that doesn't exist.  However, I am afraid that
trying to enforce that distinction in the Postgres behavior would appear
very arbitrary, so what I did above is probably the best I can do.
Here is another example of what we have:

SELECT INTERVAL '1.5 years';
   interval
---
 1 year 6 mons

SELECT INTERVAL '1.5 months';
   interval
---
 1 mon 15 days

SELECT INTERVAL '1.5 weeks';
 interval
--
 10 days

SELECT INTERVAL '1.5 days';
interval

 1 day 12:00:00

SELECT INTERVAL '1.5 hours';
 interval
--
 01:30:00

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Bryn Llewellyn
br...@momjian.us wrote:

> I have just posted a new version of the patch which I think covers all the 
> right areas.

I found the relevant email from you to pgsql-hackers here:

https://www.postgresql.org/message-id/20210402234732.GA29125%40momjian.us

You said:

> I have modified the patch to prevent partial months from creating partial 
> hours/minutes/seconds… Partial years keeps it in months, partial months takes 
> it to days, and partial days take it to hours/minutes/seconds. This seems 
> like an improvement.

I have written some PL/pgSQL code that faithfully emulates the behavior that I 
see in my present vanilla PostgreSQL Version 13.2 system in a wide range of 
tests. This is the key part:

  m1           int     not null := trunc(p.mo);
  m_remainder  numeric not null := p.mo - m1::numeric;
  m            int     not null := trunc(p.yy*12) + m1;

  d_real       numeric not null := p.dd + m_remainder*30.0;
  d            int     not null := floor(d_real);
  d_remainder  numeric not null := d_real - d::numeric;

  s            numeric not null := d_remainder*24.0*60.0*60.0 +
                                   p.hh*60.0*60.0 +
                                   p.mi*60.0 +
                                   p.ss;
begin
  return (m, d, s)::modeled_interval_t;
end;
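
For anyone who wants to run that fragment, here is one self-contained way
to wrap it; the composite types and the function signature are guesses
made for illustration, not the actual definitions:

    create type modeled_interval_t as (mm int, dd int, ss numeric);

    create type interval_parameterization_t as
        (yy numeric, mo numeric, dd numeric,
         hh numeric, mi numeric, ss numeric);

    create function modeled_interval(p interval_parameterization_t)
      returns modeled_interval_t
      language plpgsql
    as $$
    declare
      m1           int     not null := trunc(p.mo);
      m_remainder  numeric not null := p.mo - m1::numeric;
      m            int     not null := trunc(p.yy*12) + m1;
      d_real       numeric not null := p.dd + m_remainder*30.0;
      d            int     not null := floor(d_real);
      d_remainder  numeric not null := d_real - d::numeric;
      s            numeric not null := d_remainder*24.0*60.0*60.0 +
                                       p.hh*60.0*60.0 + p.mi*60.0 + p.ss;
    begin
      return (m, d, s)::modeled_interval_t;
    end;
    $$;

    -- e.g. '6.54321 months' yields (6, 16, 25600.32),
    -- matching "6 mons 16 days 07:06:40.32"
    select modeled_interval((0, 6.54321, 0, 0, 0, 0)::interval_parameterization_t);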

These quantities:

p.yy, p.mo, p.dd, p.hh, p.mi, and p.ss

are the user’s parameterization. All are real numbers. Because non-integral 
values for years, months, days, hours, and minutes are allowed when you specify 
a value using the ::interval typecast, my reference doc must state the rules. I 
would have struggled to express these rules in prose—especially given the use 
both of trunc() and floor(). I would have struggled more to explain what 
requirements these rules meet.

I gather that by the time YugabyteDB has adopted your patch, my PL/pgSQL will 
no longer be a correct emulation. So I’ll re-write it then.

I intend to advise users always to constrain the values, when they specify an 
interval value explicitly, so that the years, months, days, hours, and minutes
are integers. This is, after all, the discipline that the make_interval() 
built-in imposes. So I might recommend using only that.
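
For reference, make_interval() takes integer arguments for every field
except seconds, so it enforces that discipline by its very signature:

    SELECT make_interval(years => 1, months => 6, days => 10);
    -- 1 year 6 mons 10 days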











Re: TRUNCATE on foreign table

2021-04-02 Thread Kazutaka Onishi
All,

Thank you for the discussion.
I've updated the patch (v6->v7) according to the conclusion.

I'll show the modified points:
1. Comments for ExecuteTruncate()
2. Replacing extra value in frels_extra with integer to label.
3. Skipping XLOG_HEAP_TRUNCATE on foreign tables

Regards,

2021年4月2日(金) 11:44 Fujii Masao :
>
>
>
> On 2021/04/02 9:37, Kohei KaiGai wrote:
> > It is fair enough for me to reverse the order of actual truncation.
> >
> > How about the updated comments below?
> >
> >  This is a multi-relation truncate.  We first open and grab exclusive
> >  lock on all relations involved, checking permissions (local database
> >  ACLs even if relations are foreign-tables) and otherwise verifying
> >  that the relation is OK for truncation. In CASCADE mode, ...(snip)...
> >  Finally all the relations are truncated and reindexed. If any foreign-
> >  tables are involved, its callback shall be invoked prior to the 
> > truncation
> >  of regular tables.
>
> LGTM.
>
>
> >> BTW, the latest patch doesn't seem to be applied cleanly to the master
> >> because of commit 27e1f14563. Could you rebase it?
> >>
> > Onishi-san, go ahead. :-)
>
> +1
>
> Regards,
>
> --
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION


pgsql14-truncate-on-foreign-table.v7.patch
Description: Binary data


Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Bruce Momjian
On Fri, Apr  2, 2021 at 05:50:59PM -0700, Bryn Llewellyn wrote:
> are the user’s parameterization. All are real numbers. Because non-integral
> values for years, months, days, hours, and minutes are allowed when you 
> specify
> a value using the ::interval typecast, my reference doc must state the rules. 
> I
> would have struggled to express these rules in prose—especially given the use
> both of trunc() and floor(). I would have struggled more to explain what
> requirements these rules meet.

The fundamental issue is that while months, days, and seconds are
consistent in their own units, when you have to cross from one unit to
another, it is by definition imprecise, since the interval is not tied
to a specific date, with its own days-of-the-month and leap days and
daylight savings time changes.  It feels like it is going to be
imprecise no matter what we do.

Adding to this is the fact that interval values are stored in C 'struct
tm' defined in libc's ctime(), where months are integers, so carrying
around non-integer month values until we get a final result would add a
lot of complexity, and added complexity in a system that is by definition
imprecise doesn't seem worth it.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Using COPY FREEZE in pgbench

2021-04-02 Thread Peter Geoghegan
On Fri, Apr 2, 2021 at 4:58 PM Tatsuo Ishii  wrote:
> I am not sure how many people use this kind of options while running
> pgbench -i but we could add yet another switch to fall back to none
> FREEZE COPY if you want.

I was unclear. What I meant was that your patch isn't just useful
because it speeds up "pgbench -i" for everybody. It's also useful
because having all of the tuples already frozen after bulk loading
seems like a good benchmarking practice, at least most of the time.

The patch changes the initial state of the database with "pgbench -i",
I think. But that's good.

-- 
Peter Geoghegan




Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Zhihong Yu
Hi,
I got a local build with second patch where:

yugabyte=# SELECT  interval '0.3 years' + interval '0.4 years' -
interval '0.7 years';
 ?column?
--
 1 mon

I think the outcome is a bit unintuitive (I would expect result close to 0).

Cheers

On Fri, Apr 2, 2021 at 5:07 PM Zhihong Yu  wrote:

> Hi,
> bq. My new code returns 0.2 months for this, not zero
>
> Can you clarify (the output below was 2 mons, not 0.2)?
>
> Thanks
>
> On Fri, Apr 2, 2021 at 4:58 PM Bruce Momjian  wrote:
>
>> On Fri, Apr  2, 2021 at 02:00:03PM -0700, John W Higgins wrote:
>> > On Fri, Apr 2, 2021 at 11:05 AM Bruce Momjian  wrote:
>> > While maybe there is an argument to fixing the negative/positive
>> rounding issue
>> > - there is no way this gets solved without breaking the current
>> implementation
>> >
>> > select interval '0.3 years' + interval '0.4 years' - interval '0.7
>> years' +
>> > interval '0.1 years' should not equal 0 but it certainly does.
>>
>> My new code returns 0.2 months for this, not zero:
>>
>> SELECT  interval '0.3 years' + interval '0.4 years' -
>> interval '0.7 years' + interval '0.1 years';
>>  ?column?
>> --
>>  2 mons
>>
>> which is also wrong since:
>>
>> SELECT interval '0.1 years';
>>  interval
>> --
>>  1 mon
>>
>> > Unless we take the concept of 0.3 years = 3 months and move to
>> something along
>> > the lines of
>> >
>> > 1 year = 360 days
>> > 1 month = 30 days
>> >
>> > so therefore
>> >
>> > 0.3 years = 360 days * 0.3 = 108 days = 3 months 18 days
>> > 0.4 years = 360 days * 0.4 = 144 days = 4 months 24 days
>> > 0.7 years = 360 days * 0.7 = 252 days = 8 months 12 days
>> >
>> > Then, and only if we don't go to any more than tenths of a year, does
>> the math
>> > work. Probably this should resolve down to seconds and then work
>> backwards -
>> > but unless we're looking at breaking the entire way it currently
>> resolves
>> > things - I don't think this is of much value.
>> >
>> > Doing math on intervals is like doing math on rounded numbers - there
>> is always
>> > going to be a pile of issues because the level of precision just is not
>> good
>> > enough.
>>
>> I think the big question is what units do people want with fractional
>> values.  I have posted a follow-up email that spills only for one unit,
>> which I think is the best approach.
>>
>> --
>>   Bruce Momjian  https://momjian.us
>>   EDB  https://enterprisedb.com
>>
>>   If only the physical world exists, free will is an illusion.
>>
>>
>>
>>


Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Isaac Morland
On Fri, 2 Apr 2021 at 21:08, Zhihong Yu  wrote:

> Hi,
> I got a local build with second patch where:
>
> yugabyte=# SELECT  interval '0.3 years' + interval '0.4 years' -
> interval '0.7 years';
>  ?column?
> --
>  1 mon
>
> I think the outcome is a bit unintuitive (I would expect result close to
> 0).
>

That's not fundamentally different from this:

odyssey=> select 12 * 3/10 + 12 * 4/10 - 12 * 7/10;
 ?column?
--
   -1
(1 row)

odyssey=>

And actually the result is pretty close to 0. I mean it’s less than 0.1
year.

I wonder if it might have been better if only integers had been accepted
for the components? If you want 0.3 years write 0.3 * '1 year'::interval.
But changing it now would be a pretty significant backwards compatibility
break.

There's really no avoiding counterintuitive behaviour though. Look at this:

odyssey=> select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7
* '1 year'::interval;
 ?column?
--
 -1 mons +30 days
(1 row)

odyssey=> select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7
* '1 year'::interval = '0';
 ?column?
--
 t
(1 row)

odyssey=>

In other words, doing the “same” calculation but with multiplying 1 year
intervals by floats to get the values to add, you end up with an interval
that while not identical to 0 does compare equal to 0. So very close to 0;
in fact, as close to 0 as you can get without actually being identically 0.
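
The comparison succeeds because interval comparison first flattens months
to 30 days and days to 24 hours; justify_interval() makes the same
normalization visible:

    SELECT '-1 mons +30 days'::interval = '0'::interval;  -- t
    SELECT justify_interval('-1 mons +30 days');          -- 00:00:00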


Re: Using COPY FREEZE in pgbench

2021-04-02 Thread Tatsuo Ishii
> I was unclear. What I meant was that your patch isn't just useful
> because it speeds up "pgbench -i" for everybody. It's also useful
> because having all of the tuples already frozen after bulk loading
> seems like a good benchmarking practice, at least most of the time.
> 
> The patch changes the initial state of the database with "pgbench -i",
> I think. But that's good.

Oh, ok. Thanks for the explanation!
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: Data type correction in pgstat_report_replslot function parameters

2021-04-02 Thread vignesh C
On Fri, Apr 2, 2021 at 1:59 PM Fujii Masao  wrote:
>
>
>
> On 2021/04/02 11:20, vignesh C wrote:
> > On Thu, Apr 1, 2021 at 11:18 PM Fujii Masao  
> > wrote:
> >>
> >>
> >>
> >> On 2021/04/02 2:18, Jeevan Ladhe wrote:
> >>>
> >>>
> >>> On Thu, Apr 1, 2021 at 10:20 PM vignesh C  >>> > wrote:
> >>>
> >>>  Hi,
> >>>
> >>>  While I was reviewing replication slot statistics code, I found one
> >>>  issue in the data type used for pgstat_report_replslot function
> >>>  parameters. We pass int64 variables to the function but the function
> >>>  prototype uses int type. I I felt the function parameters should be
> >>>  int64. Attached patch fixes the same.
> >>
> >> Isn't it better to use PgStat_Counter instead of int64?
> >>
> >
> > Thanks for your comment, the updated patch contains the changes for it.
>
> Thanks for updating the patch! Pushed.

Thanks for pushing.

Regards,
Vignesh




Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Zhihong Yu
Hi,
The mix of interval and comparison with float is not easy to interpret. See
the following (I got 0.0833 since the result for interval '0.3 years' +
interval '0.4 years' - ... query was 1 month and 1/12 ~= 0.0833).

yugabyte=# select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7
* '1 year'::interval = '0.0833 year'::interval;
 ?column?
--
 f

As long as Bruce's patch makes improvements over the current behavior, I
think that's fine.

Cheers

On Fri, Apr 2, 2021 at 6:24 PM Isaac Morland 
wrote:

> On Fri, 2 Apr 2021 at 21:08, Zhihong Yu  wrote:
>
>> Hi,
>> I got a local build with second patch where:
>>
>> yugabyte=# SELECT  interval '0.3 years' + interval '0.4 years' -
>> interval '0.7 years';
>>  ?column?
>> --
>>  1 mon
>>
>> I think the outcome is a bit unintuitive (I would expect result close to
>> 0).
>>
>
> That's not fundamentally different from this:
>
> odyssey=> select 12 * 3/10 + 12 * 4/10 - 12 * 7/10;
>  ?column?
> --
>-1
> (1 row)
>
> odyssey=>
>
> And actually the result is pretty close to 0. I mean it’s less than 0.1
> year.
>
> I wonder if it might have been better if only integers had been accepted
> for the components? If you want 0.3 years write 0.3 * '1 year'::interval.
> But changing it now would be a pretty significant backwards compatibility
> break.
>
> There's really no avoiding counterintuitive behaviour though. Look at this:
>
> odyssey=> select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7
> * '1 year'::interval;
>  ?column?
> --
>  -1 mons +30 days
> (1 row)
>
> odyssey=> select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7
> * '1 year'::interval = '0';
>  ?column?
> --
>  t
> (1 row)
>
> odyssey=>
>
> In other words, doing the “same” calculation but with multiplying 1 year
> intervals by floats to get the values to add, you end up with an interval
> that while not identical to 0 does compare equal to 0. So very close to 0;
> in fact, as close to 0 as you can get without actually being identically 0.
>


Re: SP-GiST confusion: indexed column's type vs. index column type

2021-04-02 Thread Tom Lane
I wrote:
> Also, both the code and docs thought that the "reconstructedValue"
> datums that are passed down the tree during a search should be of
> the leaf data type.  This seems to me to be arrant nonsense.
> As an example, if you made an opclass that indexes 1-D arrays
> by labeling each inner node with successive array elements,
> right down to the leaf which is the last array element, it will
> absolutely not work for the reconstructedValues to be of the
> leaf type --- they have to be of the array type.  (As I said
> in commit 1ebdec8c0, this'd be a fairly poorly-chosen opclass
> design, but it seems like it ought to physically work.)

So after trying to build an opclass that does that, I have a clearer
understanding of why opclasses that'd break the existing code are
so thin on the ground.  You can't do the above, because the opclass
cannot force the AM to add inner nodes that it doesn't want to.
For example, the first few index entries will simply be dumped into
the root page as undifferentiated leaf tuples.  This means that,
if you'd like to be able to return reconstructed index entries, the
leaf data type *must* be able to hold all the data that is in an
input value.  In principle you could represent it in some other
format, but the path of least resistance is definitely to make the
leaf type the same as the input.
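
As an aside, the storage-versus-input distinction is visible from SQL,
where opckeytype = 0 means "same as the input type":

    SELECT opcname, opcintype::regtype, opckeytype::regtype
    FROM pg_opclass
    WHERE opcmethod = (SELECT oid FROM pg_am WHERE amname = 'spgist');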

I still want to make an opclass in which those types are different,
if only for testing purposes, but I'm having a hard time coming up
with a plan that's not totally lame.  Best idea I can think of is
to wrap the input in a bytea, which just begs the question "why
would you do that?".  Anybody have a less lame thought?

regards, tom lane




Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Bruce Momjian
On Fri, Apr  2, 2021 at 06:11:08PM -0700, Zhihong Yu wrote:
> Hi,
> I got a local build with second patch where:
> 
> yugabyte=# SELECT  interval '0.3 years' + interval '0.4 years' -
>                 interval '0.7 years';
>  ?column?
> --
>  1 mon
> 
> I think the outcome is a bit unintuitive (I would expect result close to 0).

Uh, the current code returns:

SELECT  interval '0.3 years' + interval '0.4 years' - interval '0.7 
years';
 ?column?
--
 -1 mon

and with the patch it is:

SELECT  interval '0.3 years' + interval '0.4 years' - interval '0.7 
years';
 ?column?
--
 1 mon

What it isn't, is zero months, which is the obviously ideal answer.


-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Have I found an interval arithmetic bug?

2021-04-02 Thread Bruce Momjian
On Fri, Apr  2, 2021 at 07:06:08PM -0700, Zhihong Yu wrote:
> Hi,
> The mix of interval and comparison with float is not easy to interpret. See 
> the
> following (I got 0.0833 since the result for interval '0.3 years' + interval
> '0.4 years' - ... query was 1 month and 1/12 ~= 0.0833).
> 
> yugabyte=# select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7 *
> '1 year'::interval = '0.0833 year'::interval;
>  ?column?
> --
>  f
> 
> As long as Bruce's patch makes improvements over the current behavior, I think
> that's fine.

I wish I could figure out how to improve it any further.  What is odd is
that I have never seen this reported as a problem before.  I plan to
apply this early next week for PG 14.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: [PATCH] Implement motd for PostgreSQL

2021-04-02 Thread Michael Paquier
On Fri, Apr 02, 2021 at 10:46:16PM +0200, Joel Jacobson wrote:
> Ascii elephant in example by Michael Paquier [1], with ANSI colors added by 
> me.
> 
> [1] 
> https://www.postgresql.org/message-id/CAB7nPqRaacwcaANOYY3Hxd3T0No5RdZXyqM5HB6fta%2BCoDLOEg%40mail.gmail.com

The credit here goes to Charles Clavadetscher, not me.
--
Michael


signature.asc
Description: PGP signature

