Re: [Proposal] Global temporary tables

2020-02-09 Thread Konstantin Knizhnik



On 07.02.2020 21:37, Pavel Stehule wrote:


What about when session 2 has an active transaction? Then, to be correct, you
should wait with index creation until the end of that transaction.



Session1:
postgres=# create unique index on gtt(x);
CREATE INDEX

Session2:
postgres=# explain select * from gtt where x=1;
ERROR:  could not create unique index "gtt_x_idx"
DETAIL:  Key (x)=(1) is duplicated.


This is a little bit unexpected behaviour (probably nobody expects any 
SELECT to fail with the error "could not create index"). I understand the 
reason and context exactly, but this side effect is something I am afraid of.


The more I think about creating indexes for GTT on demand, the more 
contradictions I see.

So it looks like there are only two safe alternatives:
1. Allow DDL for a GTT (including index creation) only if there are no 
other sessions using this GTT ("using" here means that the session has 
inserted data into the GTT). Things can be even more complicated if we take 
into account inter-table dependencies (like foreign key constraints).

2. Create indexes for GTT locally.

2) seems very contradictory (global table metadata, but private 
indexes) and hard to implement, because in this case we would have to 
maintain some private copy of the index catalog to keep information about 
the private indexes.


1) is currently implemented by Wenjing. Frankly speaking, I still find 
such a limitation too restrictive and inconvenient for users. From my 
point of view the Oracle developers have implemented a better compromise. But 
if I am the only person voting for such a solution, then let's stop this 
discussion.
But in any case I think that calling ambuild to construct an index for an 
empty table is a better solution than hand-implementing initialization for 
every index type (which still would not solve the problem with custom 
indexes).


Re: [Proposal] Global temporary tables

2020-02-09 Thread Pavel Stehule
ne 9. 2. 2020 v 13:05 odesílatel Konstantin Knizhnik <
k.knizh...@postgrespro.ru> napsal:

>
>
> On 07.02.2020 21:37, Pavel Stehule wrote:
>
>
> What when session 2 has active transaction? Then to be correct, you should
> to wait with index creation to end of transaction.
>
>
>> Session1:
>> postgres=# create unique index on gtt(x);
>> CREATE INDEX
>>
>> Sessin2:
>> postgres=# explain select * from gtt where x=1;
>> ERROR:  could not create unique index "gtt_x_idx"
>> DETAIL:  Key (x)=(1) is duplicated.
>>
>
> This is little bit unexpected behave (probably nobody expect so any SELECT
> fail with error "could not create index" - I understand exactly to reason
> and context, but this side effect is something what I afraid.
>
>
> The more I thinking creation of indexes for GTT on-demand, the more
> contractions I see.
> So looks like there are only two safe alternatives:
> 1. Allow DDL for GTT (including index creation) only if there are no other
> sessions using this GTT ("using" means that no data was inserted in GTT by
> this session). Things can be even more complicated if we take in account
> inter-table dependencies (like foreign key constraint).
> 2. Create indexes for GTT locally.
>
> 2) seems to be very contradictory (global table metadata, but private
> indexes) and hard to implement because in this case we have to maintain
> some private copy of index catalog to keep information about private
> indexes.
>
> 1) is currently implemented by Wenjing. Frankly speaking I still find such
> limitation too restrictive and inconvenient for users. From my point of
> view Oracle developers have implemented better compromise. But if I am the
> only person voting for such solution, then let's stop this discussion.
>

Thank you. I respect your opinion.


> But in any case I think that calling ambuild to construct index for empty
> table is better solution than implementation of all indexes (and still not
> solving the problem with custom indexes).
>

I know nothing about this area - I expect that you and Wenjing will find a
good solution.

We have to start with something that is simple, usable, and, if possible,
fits well into Postgres's architecture.

Regarding @1: when all tables are empty for other sessions, I don't see any
problem. For practical reasons, I think the requirement that the table not be
used in other sessions is too strict. It would be nice if creating an index
were not blocked; if the index is created too late, it could simply be marked
invalid for the sessions already using the table (again, this can be done in
the future).

I am sure this is not the end of all days - there is space for future
enhancements and for testing other variants. I can imagine several variations
with different advantages and disadvantages. To begin with, I prefer a design
whose concept is closer to current Postgres.

Regards

Pavel


logical copy_replication_slot issues

2020-02-09 Thread Arseny Sher
Hi,

While jumping around partially decoded xacts questions [1], I've read
through the copy replication slots code (9f06d79ef) and found a couple
of issues.

1) It seems quite reckless to me to dive into
DecodingContextFindStartpoint without an actual WAL reservation (the donor
slot's restart_lsn is used, but it is not acquired). Why risk erroring
out with a WAL-removal error if the function advances the new slot's position
to the updated donor's one in the end anyway?

2) In the end, the restart_lsn of the new slot is set to the updated donor's
one. However, the confirmed_flush field is not updated. This is just wrong
-- we could start decoding too early and stream a partially decoded
transaction.

I'd probably avoid doing DecodingContextFindStartpoint at all. Its only
purpose is to assemble a consistent snapshot (and establish the corresponding
 pair), but the donor slot must have
already done that and we could use it as well. Was this considered?
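
(For context, the user-facing entry point for the code discussed here is
the slot-copy SQL function from 9f06d79ef; typical usage, with made-up slot
names, looks like this:)

    SELECT pg_copy_logical_replication_slot('donor_slot', 'new_slot');
    -- optional extra arguments choose a temporary slot and/or a different
    -- output plugin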


[1] 
https://www.postgresql.org/message-id/flat/AB5978B2-1772-4FEE-A245-74C91704ECB0%40amazon.com

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Just for fun: Postgres 20?

2020-02-09 Thread Jose Luis Tallon

Hackers,

    Musing on some other date-related things, I stumbled upon the thought 
that naming the upcoming release PostgreSQL 20 might be preferable to 
the current/expected "PostgreSQL 13".



Cons:

 * Discontinuity in versions. 12 -> 20.  Now that we have the precedent 
of 9.6 -> 10 (for very good reasons, I think), this is probably a minor 
issue... Mostly the inconvenience of having to add tests for the skipped 
versions, I believe.


    Any others that I don't know about?

Pros:

 * Simplified supportability assessment:  PostgreSQL 20, released in 
2020, would be supported until the release of PostgreSQL 25 (late 2025 
if release cadence is kept as today). Simple and straightforward.


 * We avoid users skipping the release altogether due to superstition 
or analogous reasons ---this might be a major issue in some cultures---. 
Postgres 13 would certainly be skipped in production in some 
environments that I know about o_0



Nothing really important, I guess. I think of it as a thought experiment 
mostly, but it might spark some ultimately useful debate.



Thanks,

    / J.L.






Re: Just for fun: Postgres 20?

2020-02-09 Thread Vik Fearing
On 09/02/2020 19:28, Jose Luis Tallon wrote:
>  * Simplified supportability assessment:  PostgreSQL 20, released in
> 2020, would be supported until the release of PostgreSQL 25 (late 2025
> if release cadence is kept as today). Simple and straightforward.

How would you handle multiple releases in the same calendar year (such
as 9.5 and 9.6 were)?

>  * We avoid users skipping the release altogether due to superstition or
> analogous reasons ---might be a major issue in some cultures---.
> Postgres 13 would be certainly skipped in production in some
> environments that I know about o_0

That's not our problem.
-- 
Vik Fearing




Re: Just for fun: Postgres 20?

2020-02-09 Thread Tom Lane
Jose Luis Tallon  writes:
>      Musing some other date-related things I stumbled upon the thought 
> that naming the upcoming release PostgreSQL 20 might be preferrable to 
> the current/expected "PostgreSQL 13".

Sorry, but it's not April 1st yet.

regards, tom lane




Re: improve transparency of bitmap-only heap scans

2020-02-09 Thread Alexey Bashtanov




I kinda suspect one of the ressons why this got so little attention is
that it was never added to any CF.


Thanks Tomas, I've created a CF entry 
https://commitfest.postgresql.org/27/2443/


Best, Alex





Re: Add %x to PROMPT1 and PROMPT2

2020-02-09 Thread Vik Fearing
On 06/02/2020 03:56, Vik Fearing wrote:
> On 06/02/2020 03:38, Michael Paquier wrote:
>> On Wed, Feb 05, 2020 at 10:21:11AM -0500, Tom Lane wrote:
>>> Robert Haas  writes:
 I'm not really against this change but, given how long it's been the
 way that it is, I think we shouldn't make it without more plus votes.
 If we've actually got a broad consensus on it, sure, but I don't think
 4 votes is a broad consensus.
>>>
>>> Fair point.  I'm still abstaining, but maybe this should be proposed
>>> on pgsql-general to try to get more opinions?
>>
>> Indeed, still I am not sure what kind of number is enough to define a
>> large consensus.  Vik, could you start a new thread on -general?
> 
> https://www.postgresql.org/message-id/flat/3d8e809b-fc26-87c5-55ac-616a98d2b0be%40postgresfriends.org

The vote here was 6 yeas and 1 nay (85.7%) with 2 abstentions.

The poll on pgsql-general has gone stale with 19 yeas and 2 nays (90.5%).

My poll on Twitter has ended with 33 yeas and 4 nays (89.2%).
https://twitter.com/pg_xocolatl/status/1225258876527874048

There is a little bit of overlap within those three groups but among the
minuscule percentage of our users that responded, the result is
overwhelmingly in favor of this change.
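
For anyone unfamiliar with the escape being voted on: %x is empty outside a
transaction block, '*' inside one, and '!' in a failed one, so the proposed
defaults amount to roughly this in ~/.psqlrc (a sketch of the intended
effect, not the patch itself):

    \set PROMPT1 '%/%R%x%# '
    \set PROMPT2 '%/%R%x%# '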
-- 
Vik Fearing




RE: Just for fun: Postgres 20?

2020-02-09 Thread tsunakawa.ta...@fujitsu.com
From: Jose Luis Tallon 
>      Musing some other date-related things I stumbled upon the thought
> that naming the upcoming release PostgreSQL 20 might be preferrable to
> the current/expected "PostgreSQL 13".

+1
Users can easily know how old/new the release is that they are using.


Regards
Takayuki Tsunakawa



Re: MSVC installs too much stuff?

2020-02-09 Thread Andrew Dunstan
On Fri, Jan 31, 2020 at 4:35 PM Craig Ringer  wrote:
>
> On Fri, 31 Jan 2020 at 13:27, Michael Paquier  wrote:
> >
> > On Fri, Jan 31, 2020 at 12:47:29PM +1030, Andrew Dunstan wrote:
> > > When I was working on the test_json stuff yesterday, I noticed that
> > > there are some unexpected (by me at least) things installed when we do
> > > an MSVC build:
> > >
> > > $ ls -l bin| egrep 'regress|isolation'
> > > -rwxr-xr-x 1 pgrunner None   72192 Jan 30 07:51 isolationtester.exe
> > > -rwxr-xr-x 1 pgrunner None  112640 Jan 30 07:51 pg_isolation_regress.exe
> > > -rwxr-xr-x 1 pgrunner None  112128 Jan 30 07:51 pg_regress.exe
> > > -rwxr-xr-x 1 pgrunner None  112640 Jan 30 07:51 pg_regress_ecpg.exe
>
> These tools should be installed. They are useful, important in fact,
> for testing extensions.
>
> In *nix builds we install them to
> $PREFIX/lib/postgresql/pgxs/src/test/regress/pg_regress etc.
>
> On Windows we don't have PGXS. It probably doesn't make sense to
> install them to the pgxs dir. So putting them in bin is pretty
> reasonable.

Oh, Ha! Forget I spoke.

cheers

andrew


-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




RE: Internal key management system

2020-02-09 Thread tsunakawa.ta...@fujitsu.com
From: Andres Freund 
> Perhaps this has already been discussed (I only briefly looked): I'd
> strongly advise against having any new infrastrure depend on
> pgcrypto. Its code quality imo is well below our standards and contains
> serious red flags like very outdated copies of cryptography algorithm
> implementations.  I think we should consider deprecating and removing
> it, not expanding its use.  It certainly shouldn't be involved in any
> potential disk encryption system at a later stage.

+1


Regards
Takayuki Tsunakawa





Re: Invisible PROMPT2

2020-02-09 Thread Thomas Munro
On Mon, Dec 23, 2019 at 5:43 AM Maxence Ahlouche
 wrote:
> On Wed, 27 Nov 2019 at 17:09, Tom Lane  wrote:
>> Good idea, but I think you need to account for "visible" (ie, if the
>> newline is inside RL_PROMPT_START_IGNORE, it shouldn't change the width).
>> It might be best to add logic inside the existing "if (visible)" instead
>> of making a new top-level case.
>
> Right, I assumed that it was safe given that only terminal control characters 
> were invisible.
> Since the title of the terminal window can be changed as well via control 
> characters, it's probably better not to make that assumption.
>
> I updated the patch accordingly.

Pushed.  Thanks!
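
(With the width calculation fixed, the invisible-PROMPT2 setup that
motivated this work can be had in ~/.psqlrc; a sketch, assuming the %w
escape this thread provides, which pads continuation lines with blanks of
the same width as PROMPT1:)

    \set PROMPT2 '%w'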




Re: Add %x to PROMPT1 and PROMPT2

2020-02-09 Thread Michael Paquier
On Mon, Feb 10, 2020 at 12:16:44AM +0100, Vik Fearing wrote:
> There is a little bit of overlap within those three groups but among the
> minuscule percentage of our users that responded, the result is
> overwhelmingly in favor of this change.

Thanks Vik for handling that.  So, it seems to me that we have a
conclusion here.  Any last words?
--
Michael




Re: WIP: expression evaluation improvements

2020-02-09 Thread Soumyadeep Chakraborty
Hi Andres,
> I've comitted a (somewhat evolved) version of this patch. I think it
> really improves the code!
Awesome! Thanks for taking it forward!

> I do wonder about adding a variadic wrapper like the one introduced here
> more widely, seems like it could simplify a number of places. If we then
> redirected all function calls through a common wrapper, for LLVMBuildCall,
> that also validated parameter count (and perhaps types), I think it'd be
> easier to develop...
+1. I was wondering whether such validations should be Asserts instead of
ERRORs.

Regards,

Soumyadeep Chakraborty
Senior Software Engineer
Pivotal Greenplum
Palo Alto


On Thu, Feb 6, 2020 at 10:35 PM Andres Freund  wrote:

> Hi,
>
> On 2019-10-28 23:58:11 -0700, Soumyadeep Chakraborty wrote:
> > > Sorry for not replying to that earlier.  I'm not quite sure it's
> > > actually worthwhile doing so - did you try to measure any memory / cpu
> > > savings?
> >
> > No problem, thanks for the reply! Unfortunately, I did not do anything
> > significant in terms of mem/cpu measurements. However, I have noticed
> > non-trivial differences between optimized and unoptimized .bc files
> > that were dumped from time to time.
>
> Could you expand on what you mean here? Are you saying that you got
> significantly better optimization results by doing function optimization
> early on?  That'd be surprising imo?
>
> Greetings,
>
> Andres Freund
>


Re: WIP: expression evaluation improvements

2020-02-09 Thread Soumyadeep Chakraborty
Hi Andres,

> Could you expand on what you mean here? Are you saying that you got
> significantly better optimization results by doing function optimization
> early on?  That'd be surprising imo?

Sorry for the ambiguity, I meant that I had observed differences in the
sizes
of the bitcode files dumped.

These are the size differences that I observed (for TPCH Q1):
Without my patch:
-rw---   1 pivotal  staff   278K Feb  9 11:59 1021.0.bc
-rw---   1 pivotal  staff   249K Feb  9 11:59 1374.0.bc
-rw---   1 pivotal  staff   249K Feb  9 11:59 1375.0.bc
With my patch:
-rw---   1 pivotal  staff   245K Feb  9 11:43 88514.0.bc
-rw---   1 pivotal  staff   245K Feb  9 11:43 88515.0.bc
-rw---   1 pivotal  staff   270K Feb  9 11:43 79323.0.bc

This means that the sizes of the module when execution encountered:

if (jit_dump_bitcode)
{
char *filename;

filename = psprintf("%u.%zu.bc",
MyProcPid,
context->module_generation);
LLVMWriteBitcodeToFile(context->module, filename);
pfree(filename);
}

were smaller with my patch applied. This means there is less memory
pressure between when the functions were built and when
llvm_compile_module() is called. I don't know if the difference is
practically
significant.
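
For reference, these dumps can be reproduced with the jit_dump_bitcode
developer GUC; a minimal session (assuming a --with-llvm build, with the
cost setting lowered only to force JIT compilation) would be:

    SET jit = on;
    SET jit_above_cost = 0;
    SET jit_dump_bitcode = on;
    -- run the query of interest; <pid>.<module_generation>.bc files
    -- then appear in the data directory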

Soumyadeep


Re: Implementing Incremental View Maintenance

2020-02-09 Thread Yugo NAGATA
On Sat, 8 Feb 2020 11:15:45 +0900
nuko yokohama  wrote:

> Hi.
> 
> UNION query problem.(server crash)
> 
> When creating an INCREMENTAL MATERIALIZED VIEW,
> the server process crashes if you specify a query with a UNION.

Thank you for your report. As you noticed, set operations including
UNION are currently unsupported, although this is not checked at
definition time and is not documented either. We are now thoroughly
investigating unsupported queries, and will add checks and
documentation for them.

Regards,
Yugo Nagata

> 
> (commit id = 23151be7be8d8f8f9c35c2d0e4e5353aedf2b31e)
> 
> execute log.
> 
> ```
> [ec2-user@ip-10-0-1-10 ivm]$ psql testdb -e -f union_query_crash.sql
> DROP TABLE IF EXISTS table_x CASCADE;
> psql:union_query_crash.sql:6: NOTICE:  drop cascades to view xy_union_v
> DROP TABLE
> DROP TABLE IF EXISTS table_y CASCADE;
> DROP TABLE
> CREATE TABLE table_x (id int, data numeric);
> CREATE TABLE
> CREATE TABLE table_y (id int, data numeric);
> CREATE TABLE
> INSERT INTO table_x VALUES (generate_series(1, 3), random()::numeric);
> INSERT 0 3
> INSERT INTO table_y VALUES (generate_series(1, 3), random()::numeric);
> INSERT 0 3
> SELECT * FROM table_x;
>  id |data
> +
>   1 |  0.950724735058774
>   2 | 0.0222670808201144
>   3 |  0.391258547114841
> (3 rows)
> 
> SELECT * FROM table_y;
>  id |data
> +
>   1 |  0.991717347778337
>   2 | 0.0528458947672874
>   3 |  0.965044982911163
> (3 rows)
> 
> CREATE VIEW xy_union_v AS
> SELECT 'table_x' AS name, * FROM table_x
> UNION
> SELECT 'table_y' AS name, * FROM table_y
> ;
> CREATE VIEW
> TABLE xy_union_v;
>   name   | id |data
> -++
>  table_y |  2 | 0.0528458947672874
>  table_x |  2 | 0.0222670808201144
>  table_y |  3 |  0.965044982911163
>  table_x |  1 |  0.950724735058774
>  table_x |  3 |  0.391258547114841
>  table_y |  1 |  0.991717347778337
> (6 rows)
> 
> CREATE INCREMENTAL MATERIALIZED VIEW xy_imv AS
> SELECT 'table_x' AS name, * FROM table_x
> UNION
> SELECT 'table_y' AS name, * FROM table_y
> ;
> psql:union_query_crash.sql:28: server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> psql:union_query_crash.sql:28: fatal: connection to server was lost
> ```
> 
> On Thu, 27 Dec 2018 at 21:57, Yugo Nagata wrote:
> 
> > Hi,
> >
> > I would like to implement Incremental View Maintenance (IVM) on
> > PostgreSQL.
> > IVM is a technique to maintain materialized views which computes and
> > applies
> > only the incremental changes to the materialized views rather than
> > recomputate the contents as the current REFRESH command does.
> >
> > I had a presentation on our PoC implementation of IVM at PGConf.eu 2018
> > [1].
> > Our implementation uses row OIDs to compute deltas for materialized
> > views.
> > The basic idea is that if we have information about which rows in base
> > tables
> > a

Re: Identifying user-created objects

2020-02-09 Thread Masahiko Sawada
On Thu, 6 Feb 2020 at 17:18, Masahiko Sawada
 wrote:
>
> On Thu, 6 Feb 2020 at 16:53, Amit Langote  wrote:
> >
> > On Thu, Feb 6, 2020 at 4:31 PM Michael Paquier  wrote:
> > > On Thu, Feb 06, 2020 at 04:25:47PM +0900, Amit Langote wrote:
> > > > About the implementation, how about defining a static inline function,
> > > > say is_user_object(), next to FirstNormalObjectId's definition and
> > > > make pg_is_user_object() call it?  There are a few placed in the
> > > > backend code that perform the same computation as pg_is_user_object(),
> > > > which could be changed to use is_user_object() instead.
> > >
> > > FWIW, if we bother adding SQL functions for that, my first impression
> > > was to have three functions, each one of them returning:
> > > - FirstNormalObjectId
> > > - FirstGenbkiObjectId
> > > - FirstNormalObjectId
> >
> > Did you miss FirstBootstrapObjectId by any chance?
> >
> > I see the following ranges as defined in transam.h.
> >
> > 1-(FirstGenbkiObjectId - 1): manually assigned OIDs
> > FirstGenbkiObjectId-(FirstBootstrapObjectId - 1): genbki.pl assigned OIDs
> > FirstBootstrapObjectId-(FirstNormalObjectId - 1): initdb requested
> > FirstNormalObjectId or greater: user-defined objects
> >
> > Sawada-san's proposal covers #4.  Do we need an SQL function for the
> > first three?  IOW, would the distinction between OIDs belonging to the
> > first three ranges be of interest to anyone except core PG hackers?
>
> Yeah I thought of these three values but I'm also not sure it's worth for 
> users.
>
> If we have these functions returning the values respectively, when we
> want to check if an oid is assigned during initdb we will end up with
> doing something like 'WHERE oid >= pg_first_bootstrap_oid() and oid <
> pg_first_normal_oid()', which is not intuitive, I think. Users have to
> remember the order of these values.
>

Attached the updated version patch that includes regression tests. And
I have registered this to the next commit fest.

Regards,

-- 
Masahiko Sawada    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


identify_user_object_v2.patch
Description: Binary data


subplan resets wrong hashtable

2020-02-09 Thread Justin Pryzby
I believe the 2nd hunk should reset node->hashnulls, rather than reset
->hashtable a 2nd time:

@@ -505,7 +505,10 @@ buildSubPlanHash(SubPlanState *node, ExprContext *econtext)
if (nbuckets < 1)
nbuckets = 1;
 
-   node->hashtable = BuildTupleHashTable(node->parent,
+   if (node->hashtable)
+   ResetTupleHashTable(node->hashtable);
+   else
+   node->hashtable = BuildTupleHashTableExt(node->parent,

 node->descRight,

 ncols,

 node->keyColIdx,
...

@@ -527,7 +531,11 @@ buildSubPlanHash(SubPlanState *node, ExprContext *econtext)
if (nbuckets < 1)
nbuckets = 1;
}
-   node->hashnulls = BuildTupleHashTable(node->parent,
+
+   if (node->hashnulls)
+   ResetTupleHashTable(node->hashtable);
+   else
+   node->hashnulls = BuildTupleHashTableExt(node->parent,

 node->descRight,

 ncols,

 node->keyColIdx,
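
In other words, the second hunk presumably needs to reset and rebuild the
null-keys table, i.e. something like this (just restating the above, not a
tested patch):

+   if (node->hashnulls)
+       ResetTupleHashTable(node->hashnulls);
+   else
+       node->hashnulls = BuildTupleHashTableExt(node->parent,
...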

Added here:

commit 356687bd825e5ca7230d43c1bffe7a59ad2e77bd
Author: Andres Freund 
Date:   Sat Feb 9 00:35:57 2019 -0800

Reset, not recreate, execGrouping.c style hashtables.




Re: Implementing Incremental View Maintenance

2020-02-09 Thread nuko yokohama
Hi.

I understood that UNION is unsupported.

I will also refer to the implementation of check_ivm_restriction_walker() in
"./src/backend/commands/createas.c" to see if there are any other queries that
may be problematic.

On Mon, 10 Feb 2020 at 10:38, Yugo NAGATA wrote:

> On Sat, 8 Feb 2020 11:15:45 +0900
> nuko yokohama  wrote:
>
> > Hi.
> >
> > UNION query problem.(server crash)
> >
> > When creating an INCREMENTAL MATERIALIZED VIEW,
> > the server process crashes if you specify a query with a UNION.
>
> Thank you for your report. As you noticed set operations including
> UNION is concurrently unsupported, although this is not checked at
> definition time and not documented either. Now we are thoroughly
> investigating unsupported queries, and will add checks and
> documentations for them.
>
> Regards,
> Yugo Nagata
>
> >
> > (commit id = 23151be7be8d8f8f9c35c2d0e4e5353aedf2b31e)
> >
> > execute log.
> >
> > ```
> > [ec2-user@ip-10-0-1-10 ivm]$ psql testdb -e -f union_query_crash.sql
> > DROP TABLE IF EXISTS table_x CASCADE;
> > psql:union_query_crash.sql:6: NOTICE:  drop cascades to view xy_union_v
> > DROP TABLE
> > DROP TABLE IF EXISTS table_y CASCADE;
> > DROP TABLE
> > CREATE TABLE table_x (id int, data numeric);
> > CREATE TABLE
> > CREATE TABLE table_y (id int, data numeric);
> > CREATE TABLE
> > INSERT INTO table_x VALUES (generate_series(1, 3), random()::numeric);
> > INSERT 0 3
> > INSERT INTO table_y VALUES (generate_series(1, 3), random()::numeric);
> > INSERT 0 3
> > SELECT * FROM table_x;
> >  id |data
> > +
> >   1 |  0.950724735058774
> >   2 | 0.0222670808201144
> >   3 |  0.391258547114841
> > (3 rows)
> >
> > SELECT * FROM table_y;
> >  id |data
> > +
> >   1 |  0.991717347778337
> >   2 | 0.0528458947672874
> >   3 |  0.965044982911163
> > (3 rows)
> >
> > CREATE VIEW xy_union_v AS
> > SELECT 'table_x' AS name, * FROM table_x
> > UNION
> > SELECT 'table_y' AS name, * FROM table_y
> > ;
> > CREATE VIEW
> > TABLE xy_union_v;
> >   name   | id |data
> > -++
> >  table_y |  2 | 0.0528458947672874
> >  table_x |  2 | 0.0222670808201144
> >  table_y |  3 |  0.965044982911163
> >  table_x |  1 |  0.950724735058774
> >  table_x |  3 |  0.391258547114841
> >  table_y |  1 |  0.991717347778337
> > (6 rows)
> >
> > CREATE INCREMENTAL MATERIALIZED VIEW xy_imv AS
> > SELECT 'table_x' AS name, * FROM table_x
> > UNION
> > SELECT 'table_y' AS name, * FROM table_y
> > ;
> > psql:union_query_crash.sql:28: server closed the connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request.
> > psql:union_query_crash.sql:28: fatal: connection to server was lost
> > ```
> >
> > On Thu, 27 Dec 2018 at 21:57, Yugo Nagata wrote:
> >
> > > Hi,
> > >
> > > I would like to implement 

Re: Parallel grouping sets

2020-02-09 Thread Pengzhou Tang
Thanks for reviewing those patches.

Ha, I believe you meant to say a "normal aggregate", because what's
> performed above gather is no longer "grouping sets", right?
>
> The group key idea is clever in that it helps "discriminate" tuples by
> their grouping set id. I haven't completely thought this through, but my
> hunch is that this leaves some money on the table, for example, won't it
> also lead to more expensive (and unnecessary) sorting and hashing? The
> groupings with a few partials are now sharing the same tuplesort with
> the groupings with a lot of groups even though we only want to tell
> grouping 1 *apart from* grouping 10, not neccessarily that grouping 1
> needs to come before grouping 10. That's why I like the multiplexed pipe
> / "dispatched by grouping set id" idea: we only pay for sorting (or
> hashing) within each grouping. That said, I'm open to the criticism that
> keeping multiple tuplesort and agg hash tabes running is expensive in
> itself, memory-wise ...
>
> Cheers,
> Jesse


That's something we need to test, thanks. Meanwhile, for the approach that
uses a "normal aggregate" with a grouping set id, one concern is that it
cannot use mixed hashing, which means that if the grouping sets contain both
non-hashable and non-sortable sets, it will fall back to a one-phase
aggregate.
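
(For readers following along: the plan shapes being compared here apply to
ordinary grouping-sets aggregates such as the illustrative query below,
with made-up names; the question is how the per-grouping partial results
are combined above the Gather node.)

    SELECT a, b, count(*)
    FROM t
    GROUP BY GROUPING SETS ((a), (b), ());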


Re: Identifying user-created objects

2020-02-09 Thread Amit Langote
Sawada-san,

On Mon, Feb 10, 2020 at 12:25 PM Masahiko Sawada
 wrote:
> > > > On Thu, Feb 06, 2020 at 04:25:47PM +0900, Amit Langote wrote:
> > > > > About the implementation, how about defining a static inline function,
> > > > > say is_user_object(), next to FirstNormalObjectId's definition and
> > > > > make pg_is_user_object() call it?  There are a few placed in the
> > > > > backend code that perform the same computation as pg_is_user_object(),
> > > > > which could be changed to use is_user_object() instead.
>
> Attached the updated version patch that includes regression tests. And
> I have registered this to the next commit fest.

Thank you.

Any thoughts on the above suggestion?

Regards,
Amit




Re: Postgres 32 bits client compilation fail. Win32 bits client is supported?

2020-02-09 Thread Craig Ringer
On Sun, 9 Feb 2020 at 15:35, Amit Kapila  wrote:
>
> On Sat, Feb 8, 2020 at 8:05 AM Ranier Vilela  wrote:
> >
> > Hi,
> > I am migrating my applications that use postgres client from msvc 2010 
> > (32bits) to msvc 2019 (32 bits).
> > Compilation using msvc 2019 (64 bits), works very well.
> > But the build using msvc 2019 (32 bit) is not working.
> > The 32-bit Platform variable is set to x86, resulting in the first error.
> >
> > "Project" C: \ dll \ postgres \ pgsql.sln "on node 1 (default targets).
> > C: \ dll \ postgres \ pgsql.sln.metaproj: error MSB4126: The specified 
> > solution configuration "Release | x86" is invalid. Plea
> > if specify a valid solution configuration using the Configuration and 
> > Platform properties (e.g. MSBuild.exe Solution.sl
> > n / p: Configuration = Debug / p: Platform = "Any CPU") or leave those 
> > properties blank to use the default solution configurati
> > on. [C: \ dll \ postgres \ pgsql.sln]
> > Done Building Project "C: \ dll \ postgres \ pgsql.sln" (default targets) - 
> > FAILED. "
> >
> > This is because the Sub DeterminePlatform function of the Solution.pm 
> > program uses the following expression:
> > "my $ output =` cl /? 2> & 1`; "
> > The result of msvc 2019 (32bits) is:
> > "Microsoft (R) C / C ++ Optimizing Compiler Version 19.24.28315 for x86"
> >
> > By setting the Platform variable manually to WIn32, the compilation process 
> > continues until it stops at:
> > "Generating configuration headers ..."
> >
> > Where the second error occurs:
> > unused defines: HAVE_STRUCT_CMSGCRED
> > USE_NAMED_POSI ... etc ...
> > ALIGNOF_DOUBLE USE_DEV_URANDOM at C: \ dll \ postgres \ src \ tools \ msvc 
> > / Mkvcbuild.pm line 842.
> >
>
> Try by removing src/include/pg_config.
>
> > Question:
> > Will Postgres continue to support 32-bit client?
> >
>
> I am not aware of any discussion related to stopping the support of
> 32-bit client.

Buildfarm member whelk reports for 32-bit Windows.

https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=whelk&br=HEAD

https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=whelk&br=REL_12_STABLE

It says it uses Microsoft Visual C++ 2010.

I don't see any other members building for 32-bit. But it should work
and as you say, nothing's been discussed about removing it.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise




Re: subplan resets wrong hashtable

2020-02-09 Thread Andres Freund
Hi,

On 2020-02-09 21:25:47 -0600, Justin Pryzby wrote:
> I believe the 2nd hunk should reset node->hashnulls, rather than reset
> ->hashtable a 2nd time:
> 
> @@ -505,7 +505,10 @@ buildSubPlanHash(SubPlanState *node, ExprContext 
> *econtext)
> if (nbuckets < 1)
> nbuckets = 1;
>  
> -   node->hashtable = BuildTupleHashTable(node->parent,
> +   if (node->hashtable)
> +   ResetTupleHashTable(node->hashtable);
> +   else
> +   node->hashtable = BuildTupleHashTableExt(node->parent,
>   
>node->descRight,
>   
>ncols,
>   
>node->keyColIdx,
> ...
> 
> @@ -527,7 +531,11 @@ buildSubPlanHash(SubPlanState *node, ExprContext 
> *econtext)
> if (nbuckets < 1)
> nbuckets = 1;
> }
> -   node->hashnulls = BuildTupleHashTable(node->parent,
> +
> +   if (node->hashnulls)
> +   ResetTupleHashTable(node->hashtable);
> +   else
> +   node->hashnulls = BuildTupleHashTableExt(node->parent,
>   
>node->descRight,
>   
>ncols,
>   
>node->keyColIdx,

Ugh, that indeed looks wrong. Did you check whether it can actively
cause wrong query results? If so, did you determine that theoretically, or
did you get a query returning wrong results?

- Andres




Re: subplan resets wrong hashtable

2020-02-09 Thread Justin Pryzby
On Sun, Feb 09, 2020 at 08:01:26PM -0800, Andres Freund wrote:
> Ugh, that indeed looks wrong. Did you check whether it can actively
> cause wrong query results? If so, did you do theoretically, or got to a
> query returning wrong results?

No, I only noticed while reading code.

I tried briefly to find a plan that looked like what I thought might be broken,
but haven't found anything close.

Justin




Re: Identifying user-created objects

2020-02-09 Thread Masahiko Sawada
On Mon, 10 Feb 2020 at 12:54, Amit Langote  wrote:
>
> Sawada-san,
>
> On Mon, Feb 10, 2020 at 12:25 PM Masahiko Sawada
>  wrote:
> > > > > On Thu, Feb 06, 2020 at 04:25:47PM +0900, Amit Langote wrote:
> > > > > > About the implementation, how about defining a static inline 
> > > > > > function,
> > > > > > say is_user_object(), next to FirstNormalObjectId's definition and
> > > > > > make pg_is_user_object() call it?  There are a few placed in the
> > > > > > backend code that perform the same computation as 
> > > > > > pg_is_user_object(),
> > > > > > which could be changed to use is_user_object() instead.
> >
> > Attached the updated version patch that includes regression tests. And
> > I have registered this to the next commit fest.
>
> Thank you.
>
> Any thoughts on the above suggestion?

Oops, I had overlooked it. I agree with you.

How about having it as a macro like:

#define ObjectIdIsUserObject(oid) ((Oid)(oid) >= FirstNormalObjectId)

Regards,

-- 
Masahiko Sawada    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: typos in comments and user docs

2020-02-09 Thread Amit Kapila
On Fri, Feb 7, 2020 at 9:47 AM Justin Pryzby  wrote:
>
> On Fri, Feb 07, 2020 at 09:26:04AM +0530, Amit Kapila wrote:
> > On Fri, Feb 7, 2020 at 8:41 AM Justin Pryzby  wrote:
> > >
> > > On Fri, Feb 07, 2020 at 08:33:40AM +0530, Amit Kapila wrote:
> > > > On Thu, Feb 6, 2020 at 7:26 PM Justin Pryzby  
> > > > wrote:
> > > > >
> > > > > On Thu, Feb 06, 2020 at 04:43:18PM +0530, Amit Kapila wrote:
> > > > > > On Thu, Feb 6, 2020 at 10:45 AM Michael Paquier 
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Thu, Feb 06, 2020 at 08:47:14AM +0530, Amit Kapila wrote:
> > > > > > > > Your changes look fine to me on the first read.  I will push 
> > > > > > > > this to
> > > > > > > > HEAD unless there are any objections.   If we want them in
> > > > > > > > back-branches, we might want to probably segregate the changes 
> > > > > > > > based
> > > > > > > > on the branch until those apply.
> > > > > > >
> > > > > > > +1.  It would be nice to back-patch the user-visible changes in 
> > > > > > > the
> > > > > > > docs.
> > > > > > >
> > > > > >
> > > > > > Fair enough, Justin, is it possible for you to segregate the changes
> > > > > > that can be backpatched?
> > > > >
> > > > > Looks like the whole patch can be applied to master and v12 [0].
> > > >
> >
> > I tried your patch master and it failed to apply.
> > (Stripping trailing CRs from patch; use --binary to disable.)
> > patching file doc/src/sgml/bloom.sgml
> > (Stripping trailing CRs from patch; use --binary to disable.)
> > patching file doc/src/sgml/config.sgml
> > Hunk #1 FAILED at 4318.
> > 1 out of 1 hunk FAILED -- saving rejects to file 
> > doc/src/sgml/config.sgml.rej
>
> I think you applied the first patch, which I corrected here.
> https://www.postgresql.org/message-id/20200206135640.GG403%40telsasoft.com
>
> Just rechecked it works for master and v12.
>

Okay, thanks.  I have pushed user-facing changes (bloom.sgml and
alter_table.sgml) to back branches till they apply and rest of the
changes in just HEAD.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Identifying user-created objects

2020-02-09 Thread Amit Langote
On Mon, Feb 10, 2020 at 1:06 PM Masahiko Sawada
 wrote:
> On Mon, 10 Feb 2020 at 12:54, Amit Langote  wrote:
> >
> > Sawada-san,
> >
> > On Mon, Feb 10, 2020 at 12:25 PM Masahiko Sawada
> >  wrote:
> > > > > > On Thu, Feb 06, 2020 at 04:25:47PM +0900, Amit Langote wrote:
> > > > > > > About the implementation, how about defining a static inline 
> > > > > > > function,
> > > > > > > say is_user_object(), next to FirstNormalObjectId's definition and
> > > > > > > make pg_is_user_object() call it?  There are a few placed in the
> > > > > > > backend code that perform the same computation as 
> > > > > > > pg_is_user_object(),
> > > > > > > which could be changed to use is_user_object() instead.
> > >
> > > Attached the updated version patch that includes regression tests. And
> > > I have registered this to the next commit fest.
> >
> > Thank you.
> >
> > Any thoughts on the above suggestion?
>
> Oops, I had overlooked it. I agree with you.
>
> How about having it as a macro like:
>
> #define ObjectIdIsUserObject(oid) ((Oid)(oid) >= FirstNormalObjectId)

I'm fine with a macro.

Thanks,
Amit




Re: subplan resets wrong hashtable

2020-02-09 Thread Justin Pryzby
On Sun, Feb 09, 2020 at 08:01:26PM -0800, Andres Freund wrote:
> Ugh, that indeed looks wrong. Did you check whether it can actively
> cause wrong query results? If so, did you do theoretically, or got to a
> query returning wrong results?

Actually .. I can "theoretically" prove that there are no wrong results from
that patch... since in that file it has no effect, the tested variables being
zeroed a few lines earlier:

 @@ -499,51 +499,60 @@ buildSubPlanHash(SubPlanState *node, ExprContext 
*econtext)
*node->hashtable = NULL;
*node->hashnulls = NULL;
 node->havehashrows = false;
 node->havenullrows = false;
  
 nbuckets = (long) Min(planstate->plan->plan_rows, (double) LONG_MAX);
 if (nbuckets < 1)
 nbuckets = 1;
  
 -   node->hashtable = BuildTupleHashTable(node->parent,
 -  
   node->descRight,
 -  
   ncols,
 -  
   node->keyColIdx,
 -  
   node->tab_eq_funcoids,
 -  
   node->tab_hash_funcs,
 -  
   nbuckets,
 -  
   0,
 -  
   node->hashtablecxt,
 -  
   node->hashtempcxt,
 -  
   false);
*+   if (node->hashtable)
 +   ResetTupleHashTable(node->hashtable);
 +   else
 +   node->hashtable = BuildTupleHashTableExt(node->parent,
 
 ...
*+   if (node->hashnulls)
 +   ResetTupleHashTable(node->hashtable);
 +   else
 +   node->hashnulls = BuildTupleHashTableExt(node->parent,
 +  
  node->descRight,






Re: Is custom MemoryContext prohibited?

2020-02-09 Thread Craig Ringer
On Thu, 6 Feb 2020 at 11:09, Andres Freund  wrote:

> I wasn't advocating for making plannodes.h etc frontend usable. I think
> that's a fairly different discussion than making enum NodeTag,
> pg_list.h, memutils.h available.  I don't see them having access to the
> numerical value of node tag for backend structs as something actually
> problematic (I'm pretty sure you can do that today already if you really
> want to - but why would you?).
>
> I don't buy that having a separate magic number for various types that
> we may want to use both frontend and backend is better than largely just
> having one set of such magic type identifiers.

Simply using MemoryContext as the NodeTag seems very sensible based on
the above discussion.

But rather than adding a const char * name to point to some constant
for the implementation name as was proposed earlier, I think the
existing pointer MemoryContextData->methods is sufficient to identify
the context type. We could add a NameData field to
MemoryContextMethods that the initializer sets to the implementation
name for convenience.

It's trivial to see when debugging with a   p ctx->methods->name   .
We keep the MemoryContextData size down and we lose nothing. Though
gdb is smart enough to annotate a pointer to the symbol
AllocSetMethods as such when it sees it in a debuginfo build there's
no harm in having a single static string in the const-data segment per
memory context type.

I'd also like to add a

   bool (*instanceof)(MemoryContext context, MemoryContextMethods context_type);

to MemoryContextMethods . Then replace all use of   IsA(ctx,
AllocSetContext)   etc with a test like:

#define Instanceof_AllocSetContext(ctx) \
    (ctx->methods == AllocSetMethods || ctx->methods->instanceof(ctx, AllocSetMethods))

In other words, we ask the target object what it is.

This would make it possible to create wrapper implementations of
existing contexts that do extra memory accounting or other limited
sorts of extensions. The short-circuit testing for the known concrete
AllocSetMethods should make it pretty much identical in performance
terms, which is of course rather important.

The OO-alike naming is not a coincidence.

I can't help but notice that we're facing some of the same issues
faced by early OO patterns. Not too surprising given that Pg uses a
lot of pseudo-OO - some level of structural inheritance and
behavioural inheritance, but no encapsulation, no messaging model, no
code-to-data binding. I'm no OO purist, I don't care much so long as
it works and is consistent.

In OO terms what we seem to be facing is difficulty with extending
existing object types into new subtypes without modifying all the code
that knows how to work with the parent types. MemoryContext is one
example of this, Node is another. The underlying issue is similar.

Being able to do this is something I'm much more interested in being
able to do for plan and parse nodes etc than for MemoryContext tbh,
but the same principles apply.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise




Re: Does recovery write to backup_label ?

2020-02-09 Thread David Steele




On 2/7/20 8:06 PM, Fujii Masao wrote:

On Sat, Feb 8, 2020 at 5:05 AM Chapman Flack  wrote:


On 2/7/20 2:55 PM, Andres Freund wrote:


If the file needs to have 0600 permissions, should there be
a note in the nonexclusive-mode backup docs to say so?


I'm not convinced that that's useful. The default is that everything
needs to be writable by postgres. The exceptions should be noted if
anything, not the default.


+1


+1.

In theory it would be more secure to only allow rename, but since 
Postgres can overwrite any other file in the cluster I don't see much 
value in making an exception for this file.



Could this arguably be a special case, as most things in the datadir
are put there by postgres, but the backup_label is now to be put there
(and not even 'there' there, but added as a final step only to a
'backup-copy-of-there' there) by the poor schmuck who reads the
non-exclusive backup docs as saying "retrieve this content from
pg_stop_backup() and preserve in a file named backup_label" and can't
think of any obvious reason to put write permission on a file
that preserves immutable history in a backup?


I have no strong objection to add more note about permissions
of the files that the users put in the data directory. If we really
do that, it'd be better to note about not only backup_label
but also other files like tablespace_map, recovery.signal,
promotion trigger file, etc.


More documentation seems like a good idea, especially since 
non-exclusive backup requires the app to choose/set permissions.
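
For example, a restore script following the non-exclusive backup procedure
might write the saved labelfile content along these lines (paths and
variable names are made up):

    # run as the server's OS user so the file is owned by it; 0600 is
    # enough, since recovery only needs to rename it to backup_label.old
    umask 077
    printf '%s\n' "$LABELFILE_CONTENT" > "$PGDATA/backup_label"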


Regards,
--
-David
da...@pgmasters.net




Re: Implementing Incremental View Maintenance

2020-02-09 Thread Yugo NAGATA
Hi,

Attached is the latest patch (v13) to add support for Incremental
View Maintenance (IVM). Differences from the previous patch (v12)
include:

* Allow to maintain IMMVs containing user defined types

 Previously, IMMVs (Incrementally Maintainable Materialized Views)
 containing user defined types could not be maintained and an error
 was raised because such columns were compared using pg_catalog.=
 during tuple matching.  To fix this, use the column type's default
 equality operator instead of forcing the use of the built-in operator.

 Pointed out by nuko-san.
 
https://www.postgresql.org/message-id/CAF3Gu1YL7HWF0Veor3t8sQD%2BJnvozHe6WdUw0YsMqJGFezVhpg%40mail.gmail.com

* Improve an error message for unsupported aggregate functions

 Currently only built-in aggregate functions are supported, so
 aggregates on user-defined types cause an error at view definition
 time. However, the message was inappropriate, like:

  ERROR:  aggregate function max is not supported

 even though built-in max is supported. Therefore, this is improved
 to include its argument types as follows:

  ERROR:  aggregate function min(xxx) is not supported
  HINT:  IVM supports only built-in aggregate functions.

 Pointed out by nuko-san.
 
https://www.postgresql.org/message-id/CAF3Gu1bP0eiv%3DCqV%3D%2BxATdcmLypjjudLz_wdJgnRNULpiX9GrA%40mail.gmail.com

* Doc: fix the description of subquery support

 IVM supports regular EXISTS clauses, not only correlated subqueries.
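
As a minimal illustration of the immediate maintenance described below
(using the proposed CREATE INCREMENTAL MATERIALIZED VIEW syntax; table and
column names are made up):

    CREATE TABLE t (id int, v numeric);
    CREATE INCREMENTAL MATERIALIZED VIEW t_stats AS
        SELECT count(*) AS cnt, sum(v) AS total FROM t;
    INSERT INTO t VALUES (1, 10);  -- t_stats is updated by AFTER triggers
                                   -- in the same transaction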

Regards,
Yugo Nagata

On Fri, 20 Dec 2019 14:02:32 +0900
Yugo Nagata  wrote:

> IVM is a way to make materialized views up-to-date in which only
> incremental changes are computed and applied on views rather than
> recomputing the contents from scratch as REFRESH MATERIALIZED VIEW
> does. IVM can update materialized views more efficiently
> than recomputation when only small part of the view need updates.
> 
> There are two approaches with regard to timing of view maintenance:
> immediate and deferred. In immediate maintenance, views are updated in
> the same transaction where its base table is modified. In deferred
> maintenance, views are updated after the transaction is committed,
> for example, when the view is accessed, as a response to user command
> like REFRESH, or periodically in background, and so on. 
> 
> This patch implements a kind of immediate maintenance, in which
> materialized views are updated immediately in AFTER triggers when a
> base table is modified.
> 
> This supports views using:
>  - inner and outer joins including self-join
>  - some built-in aggregate functions (count, sum, agv, min, max)
>  - a part of subqueries
>-- simple subqueries in FROM clause
>-- EXISTS subqueries in WHERE clause
>  - DISTINCT and views with tuple duplicates


Regareds,
Yugo Nagata

-- 
Yugo NAGATA 


IVM_patches_v13.tar.gz
Description: application/gzip


Re: logical copy_replication_slot issues

2020-02-09 Thread Masahiko Sawada
On Mon, 10 Feb 2020 at 01:29, Arseny Sher  wrote:
>
> Hi,
>
> While jumping around partically decoded xacts questions [1], I've read
> through the copy replication slots code (9f06d79ef) and found a couple
> of issues.
>
> 1) It seems quite reckless to me to dive into
> DecodingContextFindStartpoint without actual WAL reservation (donors
> slot restart_lsn is used, but it is not acquired). Why risking erroring
> out with WAL removal error if the function advances new slot position to
> updated donors one in the end anyway?

Good catch. It's possible that DecodingContextFindStartpoint could
fail when the restart_lsn of the source slot has advanced and the
required WAL has been removed.

>
> 2) In the end, restart_lsn of new slot is set to updated donors
> one. However, confirmed_flush field is not updated. This is just wrong
> -- we could start decoding too early and stream partially decoded
> transaction.

I think you are right.

>
> I'd probably avoid doing DecodingContextFindStartpoint at all. Its only
> purpose is to assemble consistent snapshot (and establish corresponding
>  pair), but donor slot must have
> already done that and we could use it as well. Was this considered?

Skipping doing DecodingContextFindStartpoint while creating a new
destination logical slot seems sensible to me.

I've attached the draft patch fixing this issue but I'll continue
investigating it more deeply.

Regards,

-- 
Masahiko Sawada    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


fix_copy_slot.patch
Description: Binary data


Re: Identifying user-created objects

2020-02-09 Thread Michael Paquier
On Mon, Feb 10, 2020 at 01:16:30PM +0900, Amit Langote wrote:
> On Mon, Feb 10, 2020 at 1:06 PM Masahiko Sawada
>  wrote:
>> How about having it as a macro like:
>>
>> #define ObjectIdIsUserObject(oid) ((Oid)(oid) >= FirstNormalObjectId)
> 
> I'm fine with a macro.

I am not sure that it is worth having one extra abstraction layer for
that.
--
Michael




Re: In PG12, query with float calculations is slower than PG11

2020-02-09 Thread Amit Langote
On Sat, Feb 8, 2020 at 3:13 AM Andres Freund  wrote:
> On 2020-02-07 17:17:21 +0900, Amit Langote wrote:
> > I did some tests using two relatively recent compilers: gcc 8 and
> > clang-7 and here are the results:
>
> Hm, these very much look like they've been done in an unoptimized build?
>
> > 40.62%  postgres  postgres   [.] ExecInterpExpr
> >  9.74%  postgres  postgres   [.] float8_accum
> >  6.12%  postgres  libc-2.17.so   [.] __isinf
> >  5.96%  postgres  postgres   [.] float8mul
> >  5.33%  postgres  postgres   [.] dsqrt
> >  3.90%  postgres  postgres   [.] ftod
> >  3.53%  postgres  postgres   [.] Float8GetDatum
> >  2.34%  postgres  postgres   [.] DatumGetFloat8
> >  2.15%  postgres  postgres   [.] AggCheckCallContext
> >  2.03%  postgres  postgres   [.] slot_deform_tuple
> >  1.95%  postgres  libm-2.17.so   [.] __sqrt
> >  1.19%  postgres  postgres   [.] check_float8_array
>
> > HEAD
> >
> > latency average = 549.071 ms
> >
> > 31.74%  postgres  postgres   [.] ExecInterpExpr
> > 11.02%  postgres  libc-2.17.so   [.] __isinf
> > 10.58%  postgres  postgres   [.] float8_accum
> >  4.84%  postgres  postgres   [.] check_float8_val
> >  4.66%  postgres  postgres   [.] dsqrt
> >  3.91%  postgres  postgres   [.] float8mul
> >  3.56%  postgres  postgres   [.] ftod
> >  3.26%  postgres  postgres   [.] Float8GetDatum
> >  2.91%  postgres  postgres   [.] float8_mul
> >  2.30%  postgres  postgres   [.] DatumGetFloat8
> >  2.19%  postgres  postgres   [.] slot_deform_heap_tuple
> >  1.81%  postgres  postgres   [.] AggCheckCallContext
> >  1.31%  postgres  libm-2.17.so   [.] __sqrt
> >  1.25%  postgres  postgres   [.] check_float8_array
>
> Because DatumGetFloat8, Float8GetDatum, etc aren't functions that
> normally stay separate.

Okay, fair.
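
(The -O3 numbers below come from a rebuild along these lines; the prefix
and -j value are illustrative:)

    ./configure --prefix=$HOME/pg-o3 CFLAGS=-O3
    make -j8 && make install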

Here are numbers after compiling with -O3:

gcc 8
=

HEAD

latency average = 350.187 ms

34.67%  postgres  postgres   [.] ExecInterpExpr
20.94%  postgres  libc-2.17.so   [.] __isinf
10.74%  postgres  postgres   [.] float8_accum
 8.22%  postgres  postgres   [.] dsqrt
 6.63%  postgres  postgres   [.] float8mul
 3.45%  postgres  postgres   [.] ftod
 2.32%  postgres  postgres   [.] tts_buffer_heap_getsomeattrs

HEAD + reverse-if-condition patch

latency average = 346.710 ms

34.48%  postgres  postgres   [.] ExecInterpExpr
21.00%  postgres  libc-2.17.so   [.] __isinf
12.26%  postgres  postgres   [.] float8_accum
 8.31%  postgres  postgres   [.] dsqrt
 6.32%  postgres  postgres   [.] float8mul
 3.23%  postgres  postgres   [.] ftod
 2.25%  postgres  postgres   [.] tts_buffer_heap_getsomeattrs

HEAD + revert-to-macro patch

latency average = 297.493 ms

39.25%  postgres  postgres   [.] ExecInterpExpr
14.44%  postgres  postgres   [.] float8_accum
11.02%  postgres  libc-2.17.so   [.] __isinf
 8.21%  postgres  postgres   [.] dsqrt
 5.55%  postgres  postgres   [.] float8mul
 4.15%  postgres  postgres   [.] ftod
 2.78%  postgres  postgres   [.] tts_buffer_heap_getsomeattrs

PostgreSQL 11.6

latency average = 290.301 ms

42.78%  postgres  postgres[.] ExecInterpExpr
12.27%  postgres  postgres[.] float8_accum
12.12%  postgres  libc-2.17.so[.] __isinf
 8.96%  postgres  postgres[.] dsqrt
 5.77%  postgres  postgres[.] float8mul
 3.94%  postgres  postgres[.] ftod
 2.61%  postgres  postgres[.] AggCheckCallContext


clang-7
===

HEAD

latency average = 246.278 ms

44.47%  postgres  postgres   [.] ExecInterpExpr
14.56%  postgres  postgres   [.] float8_accum
 7.25%  postgres  postgres   [.] float8mul
 7.22%  postgres  postgres   [.] dsqrt
 5.40%  postgres  postgres   [.] ftod
 4.09%  postgres  postgres   [.] tts_buffer_heap_getsomeattrs
 2.20%  postgres  postgres   [.] check_float8_val

HEAD + reverse-if-condition patch

latency average = 240.212 ms

45.49%  postgres  postgres   [.] ExecInterpExpr
13.69%  postgres  postgres   [.] float8_accum
 8.32%  postgres  postgres   [.] dsqrt
 5.28%  postgres  postgres   [.] ftod
 5.19%  postgres  postgres   [.] float8mul
 3.68%  postgres  postgres   [.] tts_buffer_heap_getsomeattrs
 2.90%  postgres  postgres   [.] float8_mul

HEAD + revert-to-macro patch

latency average = 240.620 ms

44.04%  postgres  postgres   [.] ExecInterpExpr
13.72%  postgres 

Re: Identifying user-created objects

2020-02-09 Thread Masahiko Sawada
On Mon, 10 Feb 2020 at 14:09, Michael Paquier  wrote:
>
> On Mon, Feb 10, 2020 at 01:16:30PM +0900, Amit Langote wrote:
> > On Mon, Feb 10, 2020 at 1:06 PM Masahiko Sawada
> >  wrote:
> >> How about having it as a macro like:
> >>
> >> #define ObjectIdIsUserObject(oid) ((Oid)(oid) >= FirstNormalObjectId)
> >
> > I'm fine with a macro.
>
> I am not sure that it is worth having one extra abstraction layer for
> that.

Hmm, I'm not going to insist on that, but I thought that it could
somewhat improve readability at places that already compare an
OID to FirstNormalObjectId, as Amit mentioned:

src/backend/catalog/pg_publication.c:   relid >= FirstNormalObjectId;
src/backend/utils/adt/json.c:   if (typoid >= FirstNormalObjectId)
src/backend/utils/adt/jsonb.c:  if (typoid >= FirstNormalObjectId)
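
For example, a call site could then read like this (just a sketch, assuming
the macro is added next to FirstNormalObjectId in access/transam.h):

    /* in access/transam.h, next to the existing OID range constants */
    #define ObjectIdIsUserObject(oid)  ((Oid) (oid) >= FirstNormalObjectId)

    /* e.g. in json.c, instead of: if (typoid >= FirstNormalObjectId) */
    if (ObjectIdIsUserObject(typoid))
        /* ... handle user-defined type ... */ ;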

Regards,

-- 
Masahiko Sawada    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: logical decoding : exceeded maxAllocatedDescs for .spill files

2020-02-09 Thread Kuntal Ghosh
Hello,

On Sat, Feb 8, 2020 at 1:18 AM Andres Freund  wrote:
>
> Hi,
>
> On 2020-02-07 20:02:01 +0100, Tomas Vondra wrote:
> > On Fri, Feb 07, 2020 at 10:33:48AM -0800, Andres Freund wrote:
> > > Hi,
> > >
> > > On 2020-02-04 10:15:01 +0530, Kuntal Ghosh wrote:
> > > > And, the issue got reproduced with the same error:
> > > > WARNING:  problem in Generation Tuples: number of free chunks 0 in
> > > > block 0x7fe9e9e74010 exceeds 1018 allocated
> > >
> > > That seems like a problem in generation.c - because this should be
> > > unreachable, I think?
>
> > That's rather strange. How could we print this message? The code looks
> > like this
> >
> >   if (block->nfree >= block->nchunks)
> >     elog(WARNING, "problem in Generation %s: number of free chunks %d in block %p exceeds %d allocated",
> >          name, block->nfree, block, block->nchunks);
> >
> > so this says 0 >= 1018. Or am I missing something?
>
> Indeed, it's pretty weird. I can't reproduce it either. Kuntal, which
> exact git version did you repro this on? What precise settings /
> platform?
>

I've used the following steps:

Platform:
OS: Ubuntu 64-bit 18.04.2
VMWare Fusion Version 8.5.10 on my MacOS 10.14.6
16GB RAM with 8 core processors

git checkout -b pg11 remotes/origin/REL_11_STABLE
git reset --hard a4ccc1cef5a04cc (Generational memory allocator)
git apply Set-alloc_len-as-MaxHeapTupleSize.patch

Then, I performed the same test; the test.sql file is attached.

With this test, I wanted to check how the generational memory
allocator patch solves the issue. As you also reported earlier in the
thread, each of the 4096*10 in-memory tuples doesn't allocate
MaxHeapTupleSize with this patch, but only something like ~30 bytes.
IMHO, this is the change that makes the difference. If we instead
allocate MaxHeapTupleSize for all the tuples, we'll hit the same
out-of-memory issue.
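
A paraphrased sketch of the allocation site in question (illustrative
only; not a quote of reorderbuffer.c and not the attached patch, the
#ifdef name is invented and the real function does more bookkeeping):

ReorderBufferTupleBuf *
ReorderBufferGetTupleBuf(ReorderBuffer *rb, Size tuple_len)
{
    ReorderBufferTupleBuf *tuple;
    Size        alloc_len;

#ifdef FORCE_MAX_TUPLE_ALLOC        /* the experiment described above */
    alloc_len = MaxHeapTupleSize;
#else                               /* pay only for what the tuple needs */
    alloc_len = tuple_len + SizeofHeapTupleHeader;
#endif

    tuple = (ReorderBufferTupleBuf *)
        MemoryContextAlloc(rb->tup_context,
                           sizeof(ReorderBufferTupleBuf) + alloc_len);

    return tuple;
}

With the per-tuple sizing, 4096*10 small tuples stay cheap; forcing
MaxHeapTupleSize back makes each of them cost the worst case, which is
what reproduces the out-of-memory behaviour.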

I haven't looked into the issue in the generational tuple context yet.
It's possible that the changes I've made in the attached patch don't
make sense and break the logic of the generational memory allocator.
:-)

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


Set-alloc_len-as-MaxHeapTupleSize.patch
Description: Binary data


test.sql
Description: Binary data


Re: Identifying user-created objects

2020-02-09 Thread Amit Langote
On Mon, Feb 10, 2020 at 2:23 PM Masahiko Sawada
 wrote:
> On Mon, 10 Feb 2020 at 14:09, Michael Paquier  wrote:
> >
> > On Mon, Feb 10, 2020 at 01:16:30PM +0900, Amit Langote wrote:
> > > On Mon, Feb 10, 2020 at 1:06 PM Masahiko Sawada
> > >  wrote:
> > >> How about having it as a macro like:
> > >>
> > >> #define ObjectIdIsUserObject(oid) ((Oid)(oid) >= FirstNormalObjectId)
> > >
> > > I'm fine with a macro.
> >
> > I am not sure that it is worth having one extra abstraction layer for
> > that.
>
> Hmm, I'm not going to insist on that, but I thought that it could
> somewhat improve readability at the places that already compare an
> OID to FirstNormalObjectId, as Amit mentioned:
>
> src/backend/catalog/pg_publication.c:   relid >= FirstNormalObjectId;
> src/backend/utils/adt/json.c:   if (typoid >= FirstNormalObjectId)
> src/backend/utils/adt/jsonb.c:  if (typoid >= FirstNormalObjectId)

Agreed that ObjectIdIsUserObject(oid) is easier to read than oid >=
FirstNormalObjectId.  I would not have bothered, for example, if it had
been something like oid >= FirstUserObjectId to begin with.

Thanks,
Amit




Re: Internal key management system

2020-02-09 Thread Masahiko Sawada
On Wed, 5 Feb 2020 at 22:28, Sehrope Sarkuni  wrote:
>
> On Sat, Feb 1, 2020 at 7:02 PM Masahiko Sawada 
>  wrote:
> > On Sun, 2 Feb 2020 at 00:37, Sehrope Sarkuni  wrote:
> > >
> > > On Fri, Jan 31, 2020 at 1:21 AM Masahiko Sawada
> > >  wrote:
> > > > On Thu, 30 Jan 2020 at 20:36, Sehrope Sarkuni  
> > > > wrote:
> > > > > That
> > > > > would allow the internal usage to have a fixed output length of
> > > > > LEN(IV) + LEN(HMAC) + LEN(DATA) = 16 + 32 + 64 = 112 bytes.
> > > >
> > > > Probably you meant LEN(DATA) is 32? DATA will be an encryption key for
> > > > AES256 (master key) internally generated.
> > >
> > > No it should be 64-bytes. That way we can have separate 32-byte
> > > encryption key (for AES256) and 32-byte MAC key (for HMAC-SHA256).
> > >
> > > While it's common to reuse the same 32-byte key for both AES256 and an
> > > HMAC-SHA256 and there aren't any known issues with doing so, when
> > > designing something from scratch it's more secure to use entirely
> > > separate keys.
> >
> > The HMAC key you mentioned above is not the same as the HMAC key
> > derived from the user-provided passphrase, right? That is, each
> > individual key needs to have its own IV and HMAC key. Given that the
> > HMAC key used for HMAC(IV || ENCRYPT(KEY, IV, DATA)) is the latter key
> > (derived from the passphrase), what will the former key be used for?
>
> It's not derived from the passphrase, it's unlocked by the passphrase (along 
> with the master encryption key). The server will have 64-bytes of random 
> data, saved encrypted in pg_control, which can be treated as two separate 
> 32-byte keys, let's call them master_encryption_key and master_mac_key. The 
> 64-bytes is unlocked by decrypting it with the user passphrase at startup 
> (which itself would be split into a pair of encryption and MAC keys to do the 
> unlocking).
>
> The wrap and unwrap operations would use both keys:
>
> wrap(plain_text, encryption_key, mac_key) {
>     // Generate random IV:
>     iv = pg_strong_random(16);
>     // Encrypt:
>     cipher_text = encrypt_aes256_cbc(encryption_key, iv, plain_text);
>     // Compute MAC on all inputs:
>     mac = hmac_sha256(mac_key, encryption_key || iv || cipher_text);
>     // Concat user facing pieces together
>     wrapped = mac || iv || cipher_text;
>     return wrapped;
> }
>
> unwrap(wrapped, encryption_key, mac_key) {
>     // Split wrapped into its pieces:
>     actual_mac = wrapped.slice(0, 32);
>     iv = wrapped.slice(0 + 32, 16);
>     cipher_text = wrapped.slice(0 + 32 + 16);
>     // Compute MAC on all inputs:
>     expected_mac = hmac_sha256(mac_key, encryption_key || iv || cipher_text);
>     // Compare MAC vs value in wrapped:
>     if (expected_mac != actual_mac) { return Error("MAC does not match"); }
>     // MAC matches so decrypt:
>     plain_text = decrypt_aes256_cbc(encryption_key, iv, cipher_text);
>     return plain_text;
> }
>
> Every input to the encryption operation, including the encryption key, must
> be included in the HMAC calculation. If you use the same key for both
> encryption and MAC, that's not required, as the key is already part of the
> MAC process. Using separate keys requires explicitly adding the encryption
> key to the MAC input to ensure that it is the correct key prior to
> decryption in the unwrap operation. Any additional parts of the wrapped
> output (ex: a "version" byte for the algos or padding choices) should also be
> included.
>
> The wrap / unwrap above would be used with the encryption and mac keys 
> derived from the user passphrase to unlock the master_encryption_key and 
> master_mac_key from pg_control. Then those would be used by the higher level 
> functions:
>
> pg_kmgr_wrap(plain_text) {
> return wrap(plain_text, master_encryption_key, master_mac_key);
> }
>
> pg_kmgr_unwrap(wrapped) {
> return unwrap(wrapped, master_encryption_key, master_mac_key);
> }

Thank you for explaining the details. I had missed something.

The attached updated patch incorporates all the comments I got so far. The changes are:

* Renamed data_encryption_cipher to key_management_cipher
* Renamed pg_kmgr_wrap and pg_kmgr_unwrap to pg_wrap_key and pg_unwrap_key
* Changed wrap and unwrap procedure based on the comments
* Removed the restriction requiring the input key to be a multiple of
16 bytes.
* Created a context dedicated to wrap and unwrap data

Documentation and regression tests are still missing.

Regarding key rotation, we currently allow online key rotation by
running pg_rotate_encryption_key after changing
cluster_passphrase_command and reloading the configuration. But if the
server crashes during key rotation, it might still require the old
passphrase even though the passphrase command in postgresql.conf has
already been changed. We need to deal with that, but I'm not sure of
the best approach. Possibly a new frontend tool that changes the key
offline would be a safe approach.

Regards,


--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
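
To make the wrap() recipe quoted above concrete, here is a minimal C
sketch of the MAC step that covers the encryption key, assuming OpenSSL
1.1+'s incremental HMAC interface; the constant names and lengths are
illustrative, not those of the posted patch:

#include <stddef.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>

#define ENC_KEY_LEN     32      /* AES-256 encryption key */
#define MAC_KEY_LEN     32      /* HMAC-SHA256 key */
#define IV_LEN          16
#define MAC_LEN         32      /* SHA-256 output */

/*
 * Compute HMAC-SHA256(mac_key, enc_key || iv || cipher_text), i.e. the
 * MAC covers the encryption key as well as the IV and the ciphertext.
 */
static int
compute_wrap_mac(const unsigned char *mac_key,
                 const unsigned char *enc_key,
                 const unsigned char *iv,
                 const unsigned char *cipher_text, size_t cipher_len,
                 unsigned char *mac_out)
{
    HMAC_CTX   *ctx = HMAC_CTX_new();
    unsigned int len = 0;
    int         ok;

    if (ctx == NULL)
        return 0;

    ok = HMAC_Init_ex(ctx, mac_key, MAC_KEY_LEN, EVP_sha256(), NULL) &&
        HMAC_Update(ctx, enc_key, ENC_KEY_LEN) &&
        HMAC_Update(ctx, iv, IV_LEN) &&
        HMAC_Update(ctx, cipher_text, cipher_len) &&
        HMAC_Final(ctx, mac_out, &len);

    HMAC_CTX_free(ctx);
    return ok && len == MAC_LEN;
}

The same helper would be used on both the wrap and unwrap paths, with
unwrap comparing the recomputed MAC against the stored one (ideally with
a constant-time comparison) before attempting decryption.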

Re: Postgres 32 bits client compilation fail. Win32 bits client is supported?

2020-02-09 Thread Michael Paquier
On Mon, Feb 10, 2020 at 11:55:09AM +0800, Craig Ringer wrote:
> I don't see any other members building for 32-bit. But it should work
> and as you say, nothing's been discussed about removing it.

Yes, it works normally AFAIK and there is no reason to remove this
support either.  My guess is that the repository was not cleaned up
properly when attempting the 32b compilation after a 64b compilation
was completed.
--
Michael


signature.asc
Description: PGP signature


RE: Complete data erasure

2020-02-09 Thread asaba.takan...@fujitsu.com
Hello Stephen,

From: Stephen Frost 
> I disagree- it's a feature that's been asked for multiple times and does
> have value in some situations.

I'm rethinking the need for this feature, although I think that it improves
security.
You said that this feature has value in some situations.
Could you tell me more about those situations?

Regards,

--
Takanori Asaba




Re: WAL usage calculation patch

2020-02-09 Thread Craig Ringer
On Wed, 5 Feb 2020 at 21:36, Kirill Bychik  wrote:
>
> Hello pgsql-hackers,
>
> Submitting a patch that would enable gathering of per-statement WAL
> generation statistics, similar to how it is done for buffer usage.
> Collected is the number of records added to WAL and number of WAL
> bytes written.
>
> The collected data was found valuable for analyzing update-heavy loads
> where WAL generation is the bottleneck.
>
> The usage data is collected at a low level, after compression is done on
> the WAL record. The data is then exposed via pg_stat_statements and could
> also be used in EXPLAIN ANALYZE if needed. The instrumentation is similar
> to the one used for buffer stats. I didn't dare to unify both usage metric
> sets into a single struct, nor rework the way both are passed to parallel
> workers.
>
> Performance impact is (supposed to be) very low, essentially adding two
> int operations and a memory access on WAL record insert, plus additional
> effort to allocate a shmem chunk for parallel workers. Parallel workers'
> shmem usage is increased to fit in a struct of two longs.
>
> The patch is separated into two parts: core changes and pg_stat_statements
> additions. Essentially, the extension has its schema updated to allow two
> more fields, and the docs are updated to reflect the change. The patch is
> prepared against the master branch.
>
> Please provide your comments and/or code findings.

I like the concept; I'm a big fan of anything that affordably improves
visibility into Pg's I/O and activity.

To date I've been relying on tools like systemtap to do this sort of
thing. But that's a bit specialised, and Pg currently lacks useful
instrumentation for it, so it can be a pain to match up activity by
parallel workers and that sort of thing. (I aim to find time to submit
a patch for that.)

I haven't yet reviewed the patch.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise
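
For a concrete picture of the buffer-usage-style instrumentation being
described, here is a minimal sketch; the struct and function names are
assumptions made for this example, not necessarily what the submitted
patch uses:

#include "postgres.h"

typedef struct WalUsage
{
    long        wal_records;    /* number of WAL records inserted */
    uint64      wal_bytes;      /* total size of those records, post-compression */
} WalUsage;

static WalUsage pgWalUsage;

/* Bumped once per inserted WAL record, after compression has been applied. */
static inline void
CountWalRecord(uint32 rec_len)
{
    pgWalUsage.wal_records++;
    pgWalUsage.wal_bytes += rec_len;
}

/* pg_stat_statements-style accounting: diff of snapshots around a statement. */
static inline WalUsage
WalUsageDiff(const WalUsage *after, const WalUsage *before)
{
    WalUsage    diff;

    diff.wal_records = after->wal_records - before->wal_records;
    diff.wal_bytes = after->wal_bytes - before->wal_bytes;
    return diff;
}

This mirrors how pgBufferUsage is handled: snapshot before the statement,
snapshot after, and store the difference in the extension's hash entry.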




Re: In PG12, query with float calculations is slower than PG11

2020-02-09 Thread Amit Langote
On Fri, Feb 7, 2020 at 11:43 PM Emre Hasegeli  wrote:
> > > The patch looks unduly invasive to me, but I think that it might be
> > > right that we should go back to a macro-based implementation, because
> > > otherwise we don't have a good way to be certain that the function
> > > parameter won't get evaluated first.
> >
> > I'd first like to see some actual evidence of this being a problem,
> > rather than just the order of the checks.
>
> There seems to be enough evidence of this being the problem.  We are
> better off going back to the macro-based implementation.  I polished
> Keisuke Kuroda's patch, added a comment about the performance issue,
> removed the check_float*_val() functions completely, and added
> unlikely() as Tom Lane suggested.  It is attached.  I confirmed with
> different compilers that the macro and unlikely() make this noticeably
> faster.

Thanks for updating the patch.

Should we update the same macro in contrib/btree_gist/btree_utils_num.h too?

Regards,
Amit
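
As context for the discussion above, here is a sketch of the macro-based
check being proposed (illustrative only, not the attached patch).  Because
the inf_is_valid argument at many call sites is itself an expression
containing isinf() calls on the inputs, a macro with && short-circuiting
only evaluates that expression when the result really is infinity,
something a plain function call cannot guarantee:

#define CHECKFLOATVAL(val, inf_is_valid, zero_is_valid) \
do { \
    if (unlikely(isinf(val) && !(inf_is_valid))) \
        ereport(ERROR, \
                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), \
                 errmsg("value out of range: overflow"))); \
    if (unlikely((val) == 0.0 && !(zero_is_valid))) \
        ereport(ERROR, \
                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), \
                 errmsg("value out of range: underflow"))); \
} while (0)

A typical caller passes the laziness-sensitive expression directly, e.g.
CHECKFLOATVAL(result, isinf(arg1) || isinf(arg2), true), much as
float8pl() does.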




pg_basebackup -F plain -R overwrites postgresql.auto.conf

2020-02-09 Thread Fujii Masao

Hi,

I found that pg_basebackup -F plain -R *overwrites* postgresql.auto.conf
taken from the primary server with the new primary_conninfo setting,
while pg_basebackup -F tar -R just *appends* the setting to the file. I
think this is a bug and pg_basebackup -F plain -R should *append* the
setting as well. Thoughts?

I've attached a patch to fix the bug. The patch should be back-patched to
v12.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
diff --git a/src/fe_utils/recovery_gen.c b/src/fe_utils/recovery_gen.c
index facfb6b1f6..46ca20e20b 100644
--- a/src/fe_utils/recovery_gen.c
+++ b/src/fe_utils/recovery_gen.c
@@ -128,7 +128,7 @@ WriteRecoveryConfig(PGconn *pgconn, char *target_dir, PQExpBuffer contents)
 	snprintf(filename, MAXPGPATH, "%s/%s", target_dir,
 			 use_recovery_conf ? "recovery.conf" : "postgresql.auto.conf");
 
-   cf = fopen(filename, use_recovery_conf ? "a" : "w");
+   cf = fopen(filename, use_recovery_conf ? "w" : "a");
if (cf == NULL)
{
pg_log_error("could not open file \"%s\": %m", filename);