Re: Feature: temporary materialized views

2019-03-11 Thread Andreas Karlsson

On 3/8/19 2:38 AM, Michael Paquier wrote:

On Thu, Mar 07, 2019 at 10:45:04AM +0200, David Steele wrote:

I think a new patch is required here so I have marked this Waiting on
Author.  cfbot is certainly not happy and anyone trying to review is going
to have a hard time trying to determine what to review.


I would recommend marking this patch as returned with feedback, as we
already know that we need to rethink a bit harder the way relations
are created in CTAS, not to mention that the case of EXPLAIN CTAS IF
NOT EXISTS is not correctly handled.  This requires more than the three
weeks of work remaining until the end of this CF, so v12 is not a
sane target.


Agreed. Even if I could find the time to write a patch for this, there is 
no way it would make it into v12.


Andreas




Re: Feature: temporary materialized views

2019-03-14 Thread Andreas Karlsson
On 3/14/19 9:13 AM, Mitar wrote:

I just want to make sure if I understand correctly. But my initial

proposal/patch is currently waiting first for all patches for the
refactoring to happen, which are done by amazing Andreas? This sounds
good to me and I see a lot of progress/work has been done and I am OK
with waiting. Please ping me explicitly if there will be anything I am
expected to do at any point in time.

And just to make sure, these current patches are doing just
refactoring but are not also introducing temporary materialized views
yet? Or is that also done in patches made by Andreas?


Yeah, your patch is sadly stuck behind the refactoring, and the 
refactoring proved to be harder than I initially thought. The code paths 
for executing CREATE MATERIALIZED VIEW are so different from each other 
that it is hard to find a good common interface.


So there is unfortunately little you can do here other than wait for me 
or someone else to do the refactoring as I cannot see your patch getting 
accepted without keeping the existing restrictions on side effects for 
CREATE MATERIALIZED VIEW.


Andreas



Re: GIN indexes on an = ANY(array) clause

2019-03-14 Thread Andreas Karlsson

On 3/13/19 5:38 PM, Tom Lane wrote:

regression=# select amopopr::regoperator from pg_amop where amopfamily = 2745;
        amopopr
-----------------------
 &&(anyarray,anyarray)
 @>(anyarray,anyarray)
 <@(anyarray,anyarray)
 =(anyarray,anyarray)
(4 rows)

and none of those are obviously related to the =(int4,int4) operator that
is in the ScalarArrayOp.  The only way to get from point A to point B is
to know very specifically that =(anyarray,anyarray) is related to any
scalar-type btree equality operator, which is not the kind of thing the
GIN AM ought to know either.


In the discussions of the patch for foreign keys referencing arrays[1], 
some people proposed adding a new operator, <<@(anyelement,anyarray), to 
avoid having to construct left-hand-side arrays. Would that help here, or 
does it still have the same issues?
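For context, here is a hypothetical sketch (not from the thread) of the rewrite users can already do by hand today, which such an operator would make unnecessary:

```sql
-- GIN cannot use the scalar qual, because the ScalarArrayOp is built
-- on =(int4,int4), which is not in the opclass:
--   SELECT * FROM t WHERE x = ANY (ARRAY[1, 2, 3]);
-- Rewriting it as array containment matches <@(anyarray,anyarray),
-- which is in the opclass, at the cost of building a left-hand-side array:
CREATE TABLE t (x int4);
CREATE INDEX t_x_gin ON t USING gin ((ARRAY[x]));
SELECT * FROM t WHERE ARRAY[x] <@ ARRAY[1, 2, 3];
-- A <<@(anyelement,anyarray) operator would let one write
-- "x <<@ ARRAY[1, 2, 3]" directly, without constructing ARRAY[x].
```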


1. 
https://www.postgresql.org/message-id/CAJvoCut7zELHnBSC8HrM6p-R6q-NiBN1STKhqnK5fPE-9%3DGq3g%40mail.gmail.com


Andreas



Re: PostgreSQL pollutes the file system

2019-03-20 Thread Andreas Karlsson

On 3/19/19 11:19 AM, Fred .Flintstone wrote:

PostgreSQL pollutes the file system with lots of binaries, and it is
not obvious that they belong to PostgreSQL.

Such as "/usr/bin/createdb", etc.

It would be better if these files were renamed to be prefixed with
pg_, such as pg_createdb.
Or, even better, postgresql-createdb, which could then be reachable
through a "postgresql" wrapper script.


Hi,

This topic has been discussed before, e.g. in 2008 in 
https://www.postgresql.org/message-id/47EA5CC0.8040102%40sun.com, and 
also more recently, but I cannot find the later thread in the archives 
right now.


I am personally in favor of renaming e.g. createdb to pg_createdb, since 
it is not obvious that createdb belongs to PostgreSQL when reading a 
script or looking in /usr/bin, but we would need some kind of 
deprecation cycle here or we would suddenly break tons of people's scripts.


And as for the git-like solution with a wrapper script, that seems to be 
the modern way to do things, but it would be an even larger breakage and I 
am not convinced the advantage would be worth it, especially since our 
executables are not as closely related and consistent as, for example, git's.


Andreas



Re: PostgreSQL pollutes the file system

2019-03-20 Thread Andreas Karlsson

On 3/20/19 8:19 PM, Andres Freund wrote:

On 2019-03-20 15:15:02 -0400, Jonathan S. Katz wrote:

If we are evaluating this whole symlink / renaming thing, there could be
arguments for a "pgsql" alias to psql (or vice versa), but I don't think
"pg_sql" makes any sense and could be fairly confusing.


I don't care much about createdb etc, but I'm *strongly* against
renaming psql and/or adding symlinks. That's like 95% of all
interactions people have with postgres binaries; making that more
confusing would be an entirely unnecessary self-own.


+1 "psql" as a tool for connecting to PostgreSQL is so well established 
that renaming it would just confuse everyone.


Andreas



Re: PostgreSQL pollutes the file system

2019-03-21 Thread Andreas Karlsson

On 3/21/19 7:07 AM, Chris Travers wrote:
1.  createuser/dropuser are things that I don't consider good ways of 
creating users anyway.  I think we should just consider removing these 
binaries.  The SQL queries are better, more functional, and can be 
rolled back as a part of a larger transaction.


Those binaries are pretty convenient to use in scripts since they handle 
SQL escaping for you, but probably not convenient enough that we would 
have added createuser today.


Compare

createuser "$USER"

vs

echo 'CREATE ROLE :"user" LOGIN' | psql postgres -v "user=$USER"

Andreas



Re: PostgreSQL pollutes the file system

2019-03-27 Thread Andreas Karlsson

On 3/27/19 2:51 PM, Tomas Vondra wrote:

I think the consensus in this thread (and the previous ancient ones) is
that it's not worth it. It's one thing to introduce new commands with the
pg_ prefix, and it's a completely different thing to rename existing ones.
That has inherent costs, and as Tom pointed out the burden would fall on
people using PostgreSQL (and that's rather undesirable).

I personally don't see why having commands without pg_ prefix would be
an issue. Especially when placed in a separate directory, which eliminates
the possibility of conflict with other commands.


I buy that it may not be worth breaking tens of thousands of scripts to 
fix this, but I disagree that it is not an issue. Most Linux 
distributions install PostgreSQL's executables into a directory which is 
in the default $PATH (/usr/bin in the case of Debian). And even if they 
were installed into a separate directory, there would still be a conflict 
as soon as that directory is added to $PATH.


And I think that it is also relatively easy to confuse adduser and 
createuser when reading a script. Nothing about the name createuser 
indicates that it will create a role in an SQL database.


Andreas




Re: PostgreSQL pollutes the file system

2019-03-27 Thread Andreas Karlsson

On 3/27/19 3:26 PM, Tomas Vondra wrote:

That is true, of course. But are there actual examples of such conflicts
in practice? I mean, are there tools/packages that provide commands with
a conflicting name? I'm not aware of any, and as was pointed before, we'd
have ~20 years of history on any new ones.


That is a fair argument. Since we squatted those names back in the 
mid-90s I think the risk of collision is low.


Andreas




Re: DWIM mode for psql

2019-03-31 Thread Andreas Karlsson
On 3/31/19 10:52 PM, Thomas Munro wrote:

Building on the excellent work begun by commit e529cd4ffa60, I would

like to propose a do-what-I-mean mode for psql.  Please find a POC
patch attached.  It works like this:

postgres=# select datnaam from pg_database where ooid = 12917;
ERROR:  column "datnaam" does not exist
LINE 1: select datnaam from pg_database where ooid = 12917;
               ^
HINT:  Perhaps you meant to reference the column "pg_database.datname".
postgres=# YES
 datname
----------
 postgres
(1 row)


I think it is potentially confusing that YES and NO do not look like 
other psql commands. Let's pick something more in line with existing 
commands, like \y and \n.


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2019-01-16 Thread Andreas Karlsson
On 1/16/19 9:27 AM, Michael Paquier wrote:

Regarding the grammar, we tend for the last couple of years to avoid

complicating the main grammar and move on to parenthesized grammars
(see VACUUM, ANALYZE, EXPLAIN, etc).  So in the same vein I think that
it would make sense to only support CONCURRENTLY within parentheses
and just plug that in alongside the VERBOSE option.


Personally I do not care, but there have been a lot of voices for 
keeping REINDEX CONCURRENTLY consistent with CREATE INDEX CONCURRENTLY 
and DROP INDEX CONCURRENTLY.



Does somebody mind if I jump into the ship after so long?  I was the
original author of the monster after all...


Fine by me. Peter?

Andreas




Re: Early WIP/PoC for inlining CTEs

2019-01-17 Thread Andreas Karlsson

On 1/11/19 8:10 PM, Robert Haas wrote:

WITH cte_name [[NOT] MATERIALIZED] AS (query) main_query...


Hm, when would one want "NOT MATERIALIZED"? I am not sure I see the 
usefulness of forcing inlining, unless we by default do not inline 
when a CTE is referenced multiple times.


Do you imagine it working something like the below?

1. Default

# Not inlined

- Referenced multiple times
- Includes FOR UPDATE|SHARE
- Includes volatile functions
- Recursive
- DML

# Inlined

- Simple case (no side effects, referenced once)

2. MATERIALIZED

# Not inlined

- Everything

# Inlined

- (none)

3. NOT MATERIALIZED

# Not inlined

- Recursive
- DML

# Inlined

- Everything else
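In concrete terms, and using the keyword placement from Robert's quoted proposal (which may still change), the three modes would look something like this sketch:

```sql
-- Default: the planner applies the rules above (sketch only, syntax not final)
WITH w AS (SELECT ...) SELECT ... FROM w;

-- MATERIALIZED: always evaluated separately, never inlined
WITH w MATERIALIZED AS (SELECT ...) SELECT ... FROM w;

-- NOT MATERIALIZED: inlined unless recursive or DML
WITH w NOT MATERIALIZED AS (SELECT ...) SELECT ... FROM w;
```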

Andreas




Re: Feature: temporary materialized views

2019-01-17 Thread Andreas Karlsson

On 1/11/19 8:47 PM, Mitar wrote:

In create_ctas_internal() why do you copy the relation even when you do
not modify it?


I was modelling this after code in view.c [1]. I can move the copy into the "if".


Makes sense.


Is it really ok to just remove SECURITY_RESTRICTED_OPERATION from
ExecCreateTableAs()? I feel it is there for a good reason and that we
preferably want to reduce the duration of SECURITY_RESTRICTED_OPERATION
to only include when we actually execute the query.


The comment there said that this is not really necessary for security:

"This is not necessary for security, but this keeps the behavior
similar to REFRESH MATERIALIZED VIEW.  Otherwise, one could create a
materialized view not possible to refresh."

Based on my experimentation, this is required to be able to use
temporary materialized views, but it does mean one has to pay
attention to where one can refresh from. For example, you cannot refresh
from outside of the current session, because the temporary object is not
available there. I have not seen any other example where a refresh would
not be possible.

This is why I felt comfortable removing this. Also, no test failed
after removing this.


Hm, I am still not convinced just removing it is a good idea. Sure, it 
is not a security issue but usability is also important. The question is 
how much this worsens usability and how much extra work it would be to 
keep the restriction.


Btw, if we are going to remove SECURITY_RESTRICTED_OPERATION we should 
remove more code. There is no reason to save and reset the bitmask if we 
do not alter it.


Andreas



Re: Feature: temporary materialized views

2019-01-17 Thread Andreas Karlsson

On 1/17/19 4:57 PM, Tom Lane wrote:

Andreas Karlsson  writes:

On 1/11/19 8:47 PM, Mitar wrote:

Is it really ok to just remove SECURITY_RESTRICTED_OPERATION from
ExecCreateTableAs()?



The comment there said that this is not really necessary for security:
"This is not necessary for security, but this keeps the behavior
similar to REFRESH MATERIALIZED VIEW.  Otherwise, one could create a
materialized view not possible to refresh."



Hm, I am still not convinced just removing it is a good idea. Sure, it
is not a security issue but usability is also important.


Indeed.  I don't buy the argument that this should work differently
for temp views.  The fact that they're only accessible in the current
session is no excuse for that: security considerations still matter,
because you can have different privilege contexts within a single
session (consider SECURITY DEFINER functions etc).

What is the stumbling block to just leaving that alone?


I think the issue Mitar ran into is that the temporary materialized view 
is created in the rStartup callback of the receiver, which happens after 
SECURITY_RESTRICTED_OPERATION is set in ExecCreateTableAs(), so the 
creation of the view itself is denied.


From a cursory glance it looks like it would be possible to move the 
setting of SECURITY_RESTRICTED_OPERATION to inside the rStartup 
callback, though the code for resetting the security context 
might get a bit ugly. Do you see any flaws with that solution?


Andreas



Re: Feature: temporary materialized views

2019-01-17 Thread Andreas Karlsson

On 1/11/19 8:47 PM, Mitar wrote:

Thanks for doing the review!


I did some functional testing today and everything seems to work as 
expected other than that the tab completion for psql seems to be missing.


Andreas




Re: Feature: temporary materialized views

2019-01-18 Thread Andreas Karlsson
On 1/18/19 2:53 AM, Mitar wrote:

On Thu, Jan 17, 2019 at 2:40 PM Andreas Karlsson  wrote:

I did some functional testing today and everything seems to work as
expected other than that the tab completion for psql seems to be missing.


Thanks. I can add those as soon as I figure how. :-)


These rules are usually pretty easy to add. Just take a look in 
src/bin/psql/tab-complete.c to see how it is usually done.



So what are the next steps here besides tab autocompletion? Is it OK to
remove that security check? If I understand correctly, there is some
general refactoring of the code Tom is proposing, but I am not sure if I
am able to do that/understand that.


No, I do not think it is OK to remove the check without a compelling 
argument for why the usability we gain from this check is not worth it. 
Additionally, I agree with Tom that the way the code is currently written 
is confusing, so this refactoring would most likely be a win even without 
your patch.


I might take a stab at refactoring this myself this weekend. Hopefully 
it is not too involved.


Andreas



Re: Feature: temporary materialized views

2019-01-20 Thread Andreas Karlsson

On 1/17/19 8:31 PM, Tom Lane wrote:

Creating the view object inside the rStartup callback is itself pretty
much of a kluge; you'd expect that to happen earlier.  I think the
reason it was done that way was it was easier to find out the view's
column set there, but I'm sure we can find another way --- doing the
object creation more like a regular view does it is the obvious approach.


Here is a stab at refactoring this so the object creation does not 
happen in a callback. I am not that fond of the new API, but given how 
different all the various callers of CreateIntoRelDestReceiver() are, I 
had no better idea.


The idea behind the patch is to always create the empty 
table/materialized view before executing the query, doing so in a single, 
more unified code path, and then later populate it unless NO DATA was 
specified. I feel this makes the code easier to follow.


This patch introduces a small behavioral change, as can be seen from the 
diff in the test suite: since the object creation is moved earlier, the 
CTAS query's snapshot will, for example, see the newly created table. 
The new behavior seems more correct to me, but maybe I am missing some 
unintended consequences.
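A hypothetical example (not taken from the patch's test suite) of the kind of visibility difference this causes:

```sql
-- Because the target relation is now created before the query is executed,
-- a CTAS query that inspects the catalogs can see its own target table:
CREATE TABLE selfref AS
  SELECT count(*) AS n FROM pg_class WHERE relname = 'selfref';
-- With the old code the table did not yet exist when the query's snapshot
-- was taken, so n was 0; with the patch the new (still empty) table is
-- already visible in pg_class, so n would be 1.
```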


Andreas
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 5947996d67..6ee96628cf 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -55,16 +55,15 @@ typedef struct
 {
 	DestReceiver pub;			/* publicly-known function pointers */
 	IntoClause *into;			/* target relation specification */
+	ObjectAddress reladdr;		/* address of rel, for intorel_startup */
 	/* These fields are filled by intorel_startup: */
 	Relation	rel;			/* relation to write to */
-	ObjectAddress reladdr;		/* address of rel, for ExecCreateTableAs */
 	CommandId	output_cid;		/* cmin to insert in output tuples */
 	int			hi_options;		/* heap_insert performance options */
 	BulkInsertState bistate;	/* bulk insert state */
 } DR_intorel;
 
-/* utility functions for CTAS definition creation */
-static ObjectAddress create_ctas_internal(List *attrList, IntoClause *into);
+/* utility function for CTAS definition creation */
 static ObjectAddress create_ctas_nodata(List *tlist, IntoClause *into);
 
 /* DestReceiver routines for collecting data */
@@ -75,16 +74,18 @@ static void intorel_destroy(DestReceiver *self);
 
 
 /*
- * create_ctas_internal
+ * create_ctas_nodata
  *
- * Internal utility used for the creation of the definition of a relation
- * created via CREATE TABLE AS or a materialized view.  Caller needs to
- * provide a list of attributes (ColumnDef nodes).
+ * Create CTAS or materialized view without the data, starting from the
+ * targetlist of the SELECT or view definition.
  */
 static ObjectAddress
-create_ctas_internal(List *attrList, IntoClause *into)
+create_ctas_nodata(List *tlist, IntoClause *into)
 {
-	CreateStmt *create = makeNode(CreateStmt);
+	List	   *attrList;
+	ListCell   *t,
+			   *lc;
+	CreateStmt *create;
 	bool		is_matview;
 	char		relkind;
 	Datum		toast_options;
@@ -96,71 +97,6 @@ create_ctas_internal(List *attrList, IntoClause *into)
 	relkind = is_matview ? RELKIND_MATVIEW : RELKIND_RELATION;
 
 	/*
-	 * Create the target relation by faking up a CREATE TABLE parsetree and
-	 * passing it to DefineRelation.
-	 */
-	create->relation = into->rel;
-	create->tableElts = attrList;
-	create->inhRelations = NIL;
-	create->ofTypename = NULL;
-	create->constraints = NIL;
-	create->options = into->options;
-	create->oncommit = into->onCommit;
-	create->tablespacename = into->tableSpaceName;
-	create->if_not_exists = false;
-
-	/*
-	 * Create the relation.  (This will error out if there's an existing view,
-	 * so we don't need more code to complain if "replace" is false.)
-	 */
-	intoRelationAddr = DefineRelation(create, relkind, InvalidOid, NULL, NULL);
-
-	/*
-	 * If necessary, create a TOAST table for the target table.  Note that
-	 * NewRelationCreateToastTable ends with CommandCounterIncrement(), so
-	 * that the TOAST table will be visible for insertion.
-	 */
-	CommandCounterIncrement();
-
-	/* parse and validate reloptions for the toast table */
-	toast_options = transformRelOptions((Datum) 0,
-		create->options,
-		"toast",
-		validnsps,
-		true, false);
-
-	(void) heap_reloptions(RELKIND_TOASTVALUE, toast_options, true);
-
-	NewRelationCreateToastTable(intoRelationAddr.objectId, toast_options);
-
-	/* Create the "view" part of a materialized view. */
-	if (is_matview)
-	{
-		/* StoreViewQuery scribbles on tree, so make a copy */
-		Query	   *query = (Query *) copyObject(into->viewQuery);
-
-		StoreViewQuery(intoRelationAddr.objectId, query, false);
-		CommandCounterIncrement();
-	}
-
-	return intoRelationAddr;
-}
-
-
-/*
- * create_ctas_nodata
- *
- * Create CTAS or materialized view when WITH NO DATA is used, starting from
- * the targetlist of the SELECT or view definition.
- */
-static ObjectAddress
-create_ctas_nodata(List *tlist, I

Re: Feature: temporary materialized views

2019-01-20 Thread Andreas Karlsson

On 1/18/19 8:32 PM, Mitar wrote:

On Fri, Jan 18, 2019 at 7:18 AM Andreas Karlsson  wrote:

These rules are usually pretty easy to add. Just take a look in
src/bin/psql/tab-complete.c to see how it is usually done.


Thanks. I have added the auto-complete and attached a new patch.


Hm, I do not think we should complete UNLOGGED MATERIALIZED VIEW even 
though it is valid syntax. If you try to create one you will just get an 
error. I am leaning towards removing the existing completion for this, 
because I do not see the point of completing to useless but technically 
valid syntax.


This is the one I think we should probably remove:

else if (TailMatches("CREATE", "UNLOGGED"))
COMPLETE_WITH("TABLE", "MATERIALIZED VIEW");
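For reference, completing to that syntax only leads the user to an error (the message wording below is from memory, so it may differ slightly):

```sql
-- Valid to parse, but rejected at execution time:
CREATE UNLOGGED MATERIALIZED VIEW mv AS SELECT 1;
-- ERROR:  materialized views cannot be UNLOGGED
```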


I might take a stab at refactoring this myself this weekend. Hopefully
it is not too involved.


That would be great! I can afterwards update the patch accordingly.


I have submitted a first shot at this. Let's see what others think of my 
patch.


Andreas



Re: Early WIP/PoC for inlining CTEs

2019-01-21 Thread Andreas Karlsson

On 1/18/19 9:34 PM, Robert Haas wrote:

On Thu, Jan 17, 2019 at 10:48 AM Andreas Karlsson  wrote:

On 1/11/19 8:10 PM, Robert Haas wrote:

WITH cte_name [[NOT] MATERIALIZED] AS (query) main_query...


Hm, when would one want "NOT MATERIALIZED"? I am not sure I see the
usefulness of forcing inlining other than if we by default do not inline
when a CTE is referenced multiple times.


When the planner materializes it, but the performance of the resulting
plan therefore sucks, I suppose.

I don't feel super-strongly about this, and Tom is right that there
may be cases where materialization is just not practical due to
implementation restrictions.  But it's not crazy to imagine that
inlining a multiply-referenced CTE might create opportunities for
optimization at each of those places, perhaps not the same ones in
each case, whereas materializing it results in doing extra work.


I see.

I have a minor bikesheddish question about the syntax.

You proposed:

WITH cte_name [[NOT] MATERIALIZED] AS (query) main_query

While Andrew proposed:

WITH cte_name AS [[NOT] MATERIALIZED] (query) main_query

Do people have any preference between these two?

Andreas



Re: Feature: temporary materialized views

2019-01-21 Thread Andreas Karlsson

On 1/21/19 3:31 AM, Andreas Karlsson wrote:
Here is a stab at refactoring this so the object creation does not 
happen in a callback.


Rebased my patch on top of Andres's pluggable storage patches. Plus some 
minor style changes.


Andreas
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 2bc8f928ea..7ef3794e08 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -55,16 +55,15 @@ typedef struct
 {
 	DestReceiver pub;			/* publicly-known function pointers */
 	IntoClause *into;			/* target relation specification */
+	ObjectAddress reladdr;		/* address of rel, for intorel_startup */
 	/* These fields are filled by intorel_startup: */
 	Relation	rel;			/* relation to write to */
-	ObjectAddress reladdr;		/* address of rel, for ExecCreateTableAs */
 	CommandId	output_cid;		/* cmin to insert in output tuples */
 	int			hi_options;		/* heap_insert performance options */
 	BulkInsertState bistate;	/* bulk insert state */
 } DR_intorel;
 
-/* utility functions for CTAS definition creation */
-static ObjectAddress create_ctas_internal(List *attrList, IntoClause *into);
+/* utility function for CTAS definition creation */
 static ObjectAddress create_ctas_nodata(List *tlist, IntoClause *into);
 
 /* DestReceiver routines for collecting data */
@@ -75,16 +74,18 @@ static void intorel_destroy(DestReceiver *self);
 
 
 /*
- * create_ctas_internal
+ * create_ctas_nodata
  *
- * Internal utility used for the creation of the definition of a relation
- * created via CREATE TABLE AS or a materialized view.  Caller needs to
- * provide a list of attributes (ColumnDef nodes).
+ * Create CTAS or materialized view without the data, starting from the
+ * targetlist of the SELECT or view definition.
  */
 static ObjectAddress
-create_ctas_internal(List *attrList, IntoClause *into)
+create_ctas_nodata(List *tlist, IntoClause *into)
 {
-	CreateStmt *create = makeNode(CreateStmt);
+	List	   *attrList;
+	ListCell   *t,
+			   *lc;
+	CreateStmt *create;
 	bool		is_matview;
 	char		relkind;
 	Datum		toast_options;
@@ -96,71 +97,6 @@ create_ctas_internal(List *attrList, IntoClause *into)
 	relkind = is_matview ? RELKIND_MATVIEW : RELKIND_RELATION;
 
 	/*
-	 * Create the target relation by faking up a CREATE TABLE parsetree and
-	 * passing it to DefineRelation.
-	 */
-	create->relation = into->rel;
-	create->tableElts = attrList;
-	create->inhRelations = NIL;
-	create->ofTypename = NULL;
-	create->constraints = NIL;
-	create->options = into->options;
-	create->oncommit = into->onCommit;
-	create->tablespacename = into->tableSpaceName;
-	create->if_not_exists = false;
-
-	/*
-	 * Create the relation.  (This will error out if there's an existing view,
-	 * so we don't need more code to complain if "replace" is false.)
-	 */
-	intoRelationAddr = DefineRelation(create, relkind, InvalidOid, NULL, NULL);
-
-	/*
-	 * If necessary, create a TOAST table for the target table.  Note that
-	 * NewRelationCreateToastTable ends with CommandCounterIncrement(), so
-	 * that the TOAST table will be visible for insertion.
-	 */
-	CommandCounterIncrement();
-
-	/* parse and validate reloptions for the toast table */
-	toast_options = transformRelOptions((Datum) 0,
-		create->options,
-		"toast",
-		validnsps,
-		true, false);
-
-	(void) heap_reloptions(RELKIND_TOASTVALUE, toast_options, true);
-
-	NewRelationCreateToastTable(intoRelationAddr.objectId, toast_options);
-
-	/* Create the "view" part of a materialized view. */
-	if (is_matview)
-	{
-		/* StoreViewQuery scribbles on tree, so make a copy */
-		Query	   *query = (Query *) copyObject(into->viewQuery);
-
-		StoreViewQuery(intoRelationAddr.objectId, query, false);
-		CommandCounterIncrement();
-	}
-
-	return intoRelationAddr;
-}
-
-
-/*
- * create_ctas_nodata
- *
- * Create CTAS or materialized view when WITH NO DATA is used, starting from
- * the targetlist of the SELECT or view definition.
- */
-static ObjectAddress
-create_ctas_nodata(List *tlist, IntoClause *into)
-{
-	List	   *attrList;
-	ListCell   *t,
-			   *lc;
-
-	/*
 	 * Build list of ColumnDefs from non-junk elements of the tlist.  If a
 	 * column name list was specified in CREATE TABLE AS, override the column
 	 * names in the query.  (Too few column names are OK, too many are not.)
@@ -213,8 +149,56 @@ create_ctas_nodata(List *tlist, IntoClause *into)
 (errcode(ERRCODE_SYNTAX_ERROR),
  errmsg("too many column names were specified")));
 
-	/* Create the relation definition using the ColumnDef list */
-	return create_ctas_internal(attrList, into);
+	/*
+	 * Create the target relation by faking up a CREATE TABLE parsetree and
+	 * passing it to DefineRelation.
+	 */
+	create = makeNode(CreateStmt);
+	create->relation = into->rel;
+	create->tableElts = attrList;
+	create->inhRelations = NIL;
+	create->ofT

Re: Early WIP/PoC for inlining CTEs

2019-01-21 Thread Andreas Karlsson

On 1/10/19 2:28 AM, Andreas Karlsson wrote:
Here is a new version of the patch with added tests, improved comments, 
some minor code cleanup and most importantly slightly changed logic for 
when we should inline.


Added ctematerialized to JumbleExpr() in pg_stat_statements at Andrew 
Gierth's suggestion. I think that is the correct thing to do since it 
can have a major impact on performance.


Andreas
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index f177ebaa2c..6d456f6bce 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2925,6 +2925,7 @@ JumbleExpr(pgssJumbleState *jstate, Node *node)
 
 /* we store the string name because RTE_CTE RTEs need it */
 APP_JUMB_STRING(cte->ctename);
+APP_JUMB(cte->ctematerialized);
 JumbleQuery(jstate, castNode(Query, cte->ctequery));
 			}
 			break;
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bb92d9d37a..8c26dd1f26 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -1888,7 +1888,7 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
 
 -- join in CTE
 EXPLAIN (VERBOSE, COSTS OFF)
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
  QUERY PLAN  
 -
  Limit
@@ -1905,7 +1905,7 @@ WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2
Output: t.c1_1, t.c2_1, t.c1_3
 (12 rows)
 
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
  c1_1 | c2_1 
 --+--
   101 |  101
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index f438165650..56602a164c 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -493,8 +493,8 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE;
 -- join in CTE
 EXPLAIN (VERBOSE, COSTS OFF)
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
 -- ctid with whole-row reference
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 88bc189646..92f180c5cf 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -2199,6 +2199,8 @@ SELECT n FROM t LIMIT 100;
   
 
   
+   TODO: Update this for inlining and MATERIALIZED.
+
A useful property of WITH queries is that they are evaluated
only once per execution of the parent query, even if they are referred to
more than once by the parent query or sibling WITH queries.
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 4db8142afa..92ede4fc6c 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -72,7 +72,7 @@ SELECT [ ALL | DISTINCT [ ON ( expressionand with_query is:
 
-with_query_name [ ( column_name [, ...] ) ] AS ( select | values | insert | update | delete )
+with_query_name [ ( column_name [, ...] ) ] AS [ MATERIALIZED ] ( select | values | insert | update | delete )
 
 TABLE [ ONLY ] table_name [ * ]
 
@@ -273,6 +273,8 @@ TABLE [ ONLY ] table_name [ * ]

 

+TODO: Up

Re: Early WIP/PoC for inlining CTEs

2019-01-27 Thread Andreas Karlsson

On 1/27/19 4:21 PM, Tom Lane wrote:

Andrew Gierth  writes:

I'm not sure we should nail down the rule that the absence of NOT
MATERIALIZED will mean a multiply-referenced CTE is evaluated once. One
would hope that in the future the planner might be taught to inline or
not in that case depending on cost. I think it makes more sense to say
that we never inline if MATERIALIZED is specified, that we always inline
if NOT MATERIALIZED is specified, and that if neither is specified the
planner will choose (but perhaps note that currently it always chooses
only based on refcount).


I have no objection to documenting it like that; I just don't want us
to go off into the weeds trying to actually implement something smarter
for v12.


+1

Andreas



Re: Early WIP/PoC for inlining CTEs

2019-01-27 Thread Andreas Karlsson
On 1/26/19 11:55 PM, Tom Lane wrote:

Hearing no immediate pushback on that proposal, I went ahead and made

a version of the patch that does it like that, as attached.  I also took
a stab at documenting it fully.


Thanks! This version of the patch looks solid, including the 
documentation. The only remaining question I see is the one Andrew 
raised about if we should change the language to allow for future 
versions of PostgreSQL to add costing for when the same CTE is 
referenced multiple times.


Andreas



Re: Covering GiST indexes

2019-01-27 Thread Andreas Karlsson

On 11/26/18 5:56 PM, Andrey Borodin wrote:

Here's rebased version. Thanks!


Here is my review.

= General

The feature seems useful and an obvious extension of covering B-trees, 
and potentially useful together with exclusion constraints.


The patch still applies (with git apply -3), compiles and passes the 
test suite.


From some testing it seems like it works as expected.
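For reference, a smoke test along these lines is what I ran (table and 
column names are my own, not from the patch's regression tests):

```sql
-- Covering GiST: the INCLUDE columns ride along in the leaf tuples
CREATE TABLE pts (id int, pos box, label text);
INSERT INTO pts
SELECT g, box(point(g, g), point(g + 1, g + 1)), 'row ' || g
FROM generate_series(1, 1000) g;
CREATE INDEX pts_pos_idx ON pts USING gist (pos) INCLUDE (id, label);
VACUUM ANALYZE pts;
-- With the patch this can be an index-only scan, returning id and
-- label straight from the index:
EXPLAIN (COSTS OFF)
SELECT id, label FROM pts WHERE pos && box(point(10, 10), point(20, 20));
```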

= Code

* There are some minor style issues, like missing spaces around 
|| (in gistcanreturn()) and around ? and : (in gistFormTuple()).


* I do not see any need for adding the new parameter to gistFormTuple. 
As far as I can tell isTruncated is always equal to !isleaf.


* The comment below from gistFormTuple() does not look correct. No 
allocation is taking place.


/*
 * Allocate each included attribute.
 */

* Why do you set a supportCollation for the included columns? As far as 
I can tell the collation is only used by the various GiST support 
functions. Also in theory we should be able to skip initializing these 
array entries, but it is probably for the best that we do.


* I think you should check for the attno first in gistcanreturn() to 
take advantage of the short circuiting.


* I am no fan of the tupdesc vs truncTupdesc separation and think that 
it is a potential hazard, but I do not have any better suggestion right now.


* There is no test case for exclusion constraints, and I feel since that 
is one of the more important uses we should probably have at least one 
such test case.


= Minor comments

* I think that "the" should have been kept in the text below.

-Currently, only the B-tree index access method supports this 
feature.
+Currently, B-tree and GiST index access methods supports this 
feature.


* I am not a native speaker but I think it should be "use the INCLUDE 
clause" in the diff below, and I think it also should be "without any 
GiST operator class".


+   A GiST index can be covering, i.e. use INCLUDE 
clause.
+   Included columns can have data types without GiST operator class. 
Included

+   attributes will be stored uncompressed.

* I think you should write something like "Included attributes always 
support index-only scans." rather than "Included attributes can be 
returned." in the comment for gistcanreturn().


* Why the random noise in the diff below? I think using "(c3) INCLUDE 
(c4)" for both gist and rtree results in a cleaner patch.


 CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
 CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
-CREATE INDEX on tbl USING gist(c3) INCLUDE (c4);
+CREATE INDEX on tbl USING gist(c4) INCLUDE (c3);
 CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
 CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
 CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
-CREATE INDEX on tbl USING rtree(c1, c2) INCLUDE (c3, c4);
+CREATE INDEX on tbl USING rtree(c4) INCLUDE (c3, c1);
 CREATE INDEX on tbl USING btree(c1, c2) INCLUDE (c3, c4);



Re: Covering GiST indexes

2019-01-28 Thread Andreas Karlsson

On 1/28/19 7:26 PM, Andrey Borodin wrote:

* I am no fan of the tupdesc vs truncTupdesc separation and think that it is a 
potential hazard, but I do not have any better suggestion right now.

B-tree copies the tupdesc every time it truncates a tuple. We need tuple 
truncation a little more often: when we are doing a page split, we have to form 
all page tuples truncated.
Initially I optimized only this case, but this led to a prepared tupledesc 
for truncated tuples.


* There is no test case for exclusion constraints, and I feel since that is one 
of the more important uses we should probably have at least one such test case.


Actually, I did not understand this test case. Can you elaborate more on this? 
How should included attributes participate in an exclusion index? What for?


I mean include a table like below among the tests. I feel like this is a 
main use case for INCLUDE.


CREATE TABLE t2 (
  x int4range,
  y int,

  EXCLUDE USING gist (x WITH &&) INCLUDE (y)
);
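Something like the following would then exercise both the constraint 
itself and the index-only scan over the included column (my sketch of 
the intended behavior, not output from the patch):

```sql
INSERT INTO t2 VALUES ('[1,10)', 1);   -- ok
INSERT INTO t2 VALUES ('[5,15)', 2);   -- should fail: overlaps [1,10)
INSERT INTO t2 VALUES ('[10,20)', 3);  -- ok, the ranges only touch
-- y should be fetchable from the supporting GiST index directly:
SELECT y FROM t2 WHERE x && '[2,3)'::int4range;
```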

* Why the random noise in the diff below? I think using "(c3) INCLUDE (c4)" for 
both gist and rtree results in a cleaner patch.

I've used columns with and without opclass in INCLUDE. This led to these 
seemingly random changes.


I mean the diff would be smaller as the below. It also may make sense to 
make both lines "(c3) INCLUDE (c1, c4)".


 CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
 CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
 CREATE INDEX on tbl USING gist(c3) INCLUDE (c4);
 CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
 CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
 CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
-CREATE INDEX on tbl USING rtree(c1, c2) INCLUDE (c3, c4);
+CREATE INDEX on tbl USING rtree(c3) INCLUDE (c4);
 CREATE INDEX on tbl USING btree(c1, c2) INCLUDE (c3, c4);

Andreas



Re: Early WIP/PoC for inlining CTEs

2019-01-28 Thread Andreas Karlsson

On 1/28/19 10:54 PM, Peter Eisentraut wrote:

Or put it at the end?

 WITH ctename AS ( query ) MATERIALIZED


Hm, seems like that would be easy to miss for long queries.

Andreas




Re: Covering GiST indexes

2019-01-29 Thread Andreas Karlsson
Thanks for the new version of the patch. Based on my knowledge of PG 
this is starting to look good, and I have only three small comments below.


I am not 100% a fan of truncTupdesc, but as long as it is well commented 
I think that it is fine.


= Review

* I think it is worth writing a short comment when you create 
truncTupdesc about why this is done.


* Very minor thing: the diff below is pointless churn on a line not 
touched by the patch.


-values, isnull, true /* size is currently bogus 
*/ );
+values, isnull, true /* size is currently bogus 
*/);


* Another very minor thing: The diff below from gistFormTuple() should 
probably be consistent about brackets.


+   if (isnull[i])
+   compatt[i] = (Datum) 0;
+   else
+   {
+   compatt[i] = attdata[i];
+   }

Andreas



Re: Covering GiST indexes

2019-01-29 Thread Andreas Karlsson

On 29/01/2019 18.00, Andreas Karlsson wrote:
Thanks for the new version of the patch. Based on my knowledge of PG 
this is starting to look good, and I have only three small comments below.


Sorry, I missed rtree in the tests. Can you fix the below to match the 
gist example?


 CREATE INDEX on tbl USING rtree(c1, c2) INCLUDE (c3, c4);
 NOTICE:  substituting access method "gist" for obsolete method "rtree"
-ERROR:  access method "gist" does not support included columns
+ERROR:  data type integer has no default operator class for access 
method "gist"
+HINT:  You must specify an operator class for the index or define a 
default operator class for the data type.

+CREATE INDEX on tbl USING rtree(c4) INCLUDE (c1, c4);

Andreas



Re: Covering GiST indexes

2019-01-29 Thread Andreas Karlsson

On 29/01/2019 18.45, Andrey Borodin wrote:

Also, I've unified gist and r-tree syntax tests for INCLUDE.


Why not just keep it simple? The purpose is not to test the GiST 
indexes, for that we have index_including_gist.sql.


I propose the following.

/*
 * 7. Check various AMs. All but btree and gist must fail.
 */
CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
CREATE INDEX on tbl USING gist(c3) INCLUDE (c1, c4);
CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
CREATE INDEX on tbl USING rtree(c3) INCLUDE (c1, c4);
DROP TABLE tbl;

and

/*
 * 7. Check various AMs. All but btree and gist must fail.
 */
CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
ERROR:  access method "brin" does not support included columns
CREATE INDEX on tbl USING gist(c3) INCLUDE (c1, c4);
CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
ERROR:  access method "spgist" does not support included columns
CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
ERROR:  access method "gin" does not support included columns
CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
ERROR:  access method "hash" does not support included columns
CREATE INDEX on tbl USING rtree(c3) INCLUDE (c1, c4);
NOTICE:  substituting access method "gist" for obsolete method "rtree"
DROP TABLE tbl;



Re: Covering GiST indexes

2019-01-29 Thread Andreas Karlsson

On 29/01/2019 19.58, Andrey Borodin wrote:

PFA patch without redundant checks.


I hope it is ok that I fixed a typo and some wording in the comment, 
plus re-added the btree query which I accidentally removed in my last mail.


I think the current state of the patch is fine, assuming the solution 
with truncTupdesc is deemed to be good enough by the committer, so I 
will mark this as ready for committer.


Thanks for the patch and the speedy fixes!

Andreas
>From f1a85940d01e208a6502e689616a8806146c9f3d Mon Sep 17 00:00:00 2001
From: Andreas Karlsson 
Date: Wed, 30 Jan 2019 01:59:14 +0100
Subject: [PATCH] Covering GiST v10

This patch allows adding INCLUDE columns to a GiST index.
These columns do not participate in the GiST tree build and
are not used for choosing subtrees for inserts or splits.
But they can be fetched during an IndexOnlyScan.
---
 doc/src/sgml/ref/create_index.sgml |   4 +-
 doc/src/sgml/textsearch.sgml   |   6 +
 src/backend/access/gist/gist.c |  33 +++-
 src/backend/access/gist/gistget.c  |   4 +-
 src/backend/access/gist/gistscan.c |  12 +-
 src/backend/access/gist/gistsplit.c|  12 +-
 src/backend/access/gist/gistutil.c |  42 --
 src/include/access/gist_private.h  |   2 +
 src/test/regress/expected/amutils.out  |   4 +-
 src/test/regress/expected/index_including.out  |   8 +-
 src/test/regress/expected/index_including_gist.out | 166 +
 src/test/regress/parallel_schedule |   2 +-
 src/test/regress/serial_schedule   |   1 +
 src/test/regress/sql/index_including.sql   |   6 +-
 src/test/regress/sql/index_including_gist.sql  |  90 +++
 15 files changed, 358 insertions(+), 34 deletions(-)
 create mode 100644 src/test/regress/expected/index_including_gist.out
 create mode 100644 src/test/regress/sql/index_including_gist.sql

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index ad619cdcfe..f8657c6ae2 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -181,8 +181,8 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 

-Currently, only the B-tree index access method supports this feature.
-In B-tree indexes, the values of columns listed in the
+Currently, the B-tree and the GiST index access methods support this
+feature. In B-tree indexes, the values of columns listed in the
 INCLUDE clause are included in leaf tuples which
 correspond to heap tuples, but are not included in upper-level
 index entries used for tree navigation.
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index ecebade767..e4c75ebf6f 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3675,6 +3675,12 @@ SELECT plainto_tsquery('supernovae stars');
   
 
   
+   A GiST index can be covering, i.e. use the INCLUDE
+   clause. Included columns can have data types without any GiST operator
+   class. Included attributes will be stored uncompressed.
+  
+
+  
Lossiness causes performance degradation due to unnecessary fetches of table
records that turn out to be false matches.  Since random access to table
records is slow, this limits the usefulness of GiST indexes.  The
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index b75b3a8dac..2bbf7a7f49 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -75,7 +75,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = true;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
-	amroutine->amcaninclude = false;
+	amroutine->amcaninclude = true;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
@@ -1382,8 +1382,8 @@ gistSplit(Relation r,
 		IndexTupleSize(itup[0]), GiSTPageSize,
 		RelationGetRelationName(r;
 
-	memset(v.spl_lisnull, true, sizeof(bool) * giststate->tupdesc->natts);
-	memset(v.spl_risnull, true, sizeof(bool) * giststate->tupdesc->natts);
+	memset(v.spl_lisnull, true, sizeof(bool) * giststate->truncTupdesc->natts);
+	memset(v.spl_risnull, true, sizeof(bool) * giststate->truncTupdesc->natts);
 	gistSplitByKey(r, page, itup, len, giststate, &v, 0);
 
 	/* form left and right vector */
@@ -1462,8 +1462,19 @@ initGISTstate(Relation index)
 	giststate->scanCxt = scanCxt;
 	giststate->tempCxt = scanCxt;	/* caller must change this if needed */
 	giststate->tupdesc = index->rd_att;
+	/*
+	 * The truncated tupdesc does not contain the INCLUDE attributes.
+	 *
+	 * It is used to form tuples during tuple adjustment and page split.
+	 * B-tree creates shortened tuple descriptor for every truncated tuple,
+	 * because it is doing this less often: it does not

Re: Feature: temporary materialized views

2019-02-04 Thread Andreas Karlsson

On 2/4/19 7:09 AM, Michael Paquier wrote:

On Tue, Jan 22, 2019 at 03:10:17AM +0100, Andreas Karlsson wrote:

On 1/21/19 3:31 AM, Andreas Karlsson wrote:

Here is a a stab at refactoring this so the object creation does not
happen in a callback.


Rebased my patch on top of Andres's pluggable storage patches. Plus some
minor style changes.


Taking a note to look at this refactoring bit, which is different from
the temp matview part.  Moved to next CF for now.


Should I submit it as a separate CF entry or is it easiest if my 
refactoring and Mitar's feature are reviewed together?


Andreas



Re: Feature: temporary materialized views

2019-02-05 Thread Andreas Karlsson
On 2/5/19 12:36 PM, Michael Paquier wrote:
> - skipData is visibly always false.

> We may want to keep skipData to have an assertion at the beginning of
> inforel_startup for sanity purposes though.
This is not true in this version of the patch. The following two cases 
would crash if we add such an assertion:


EXPLAIN ANALYZE CREATE TABLE foo AS SELECT 1 WITH NO DATA;

and

PREPARE s AS SELECT 1;
CREATE TABLE bar AS EXECUTE s WITH NO DATA;

since they both still run the setup and tear down steps of the executor.

I guess that I could fix that for the second case as soon as I 
understand how much of the portal stuff can be skipped in 
ExecuteQuery(). But I am not sure what we should do with EXPLAIN ANALYZE 
... NO DATA. It feels like a contraction to me. Should we just raise an 
error? Or should we try to preserve the current behavior where you see 
something like the below?


QUERY PLAN
---
 Result  (cost=0.00..0.01 rows=1 width=4) (never executed)
 Planning Time: 0.040 ms
 Execution Time: 0.002 ms
(3 rows)

> 2) DefineIntoRelForDestReceiver is just a wrapper for
> create_ctas_nodata, so we had better just merge both of them and
> expose directly the routine creating the relation definition, so the
> new interface is a bit awkward.
Agreed, the API is awkward as it is now, but it was the least awkward 
one I managed to design. But I think if we fix the issue above then it 
might be possible to create a less awkward API.


> 3) The part about the regression diff is well...  Expected...  We may
> want a comment about that.  We could consider as well adding a
> regression test inspired from REINDEX SCHEMA to show that the CTAS is
> created before the data is actually filled in.
Yeah, that sounds like a good idea.

Andreas



Re: Feature: temporary materialized views

2019-02-05 Thread Andreas Karlsson

On 2/5/19 6:56 PM, Andreas Karlsson wrote:
On 2/5/19 12:36 PM, Michael Paquier wrote:
> - skipData is visibly always false.

 > We may want to keep skipData to have an assertion at the beginning of
 > inforel_startup for sanity purposes though.
This is not true in this version of the patch. The following two cases 
would crash if we add such an assertion:


EXPLAIN ANALYZE CREATE TABLE foo AS SELECT 1 WITH NO DATA;

and

PREPARE s AS SELECT 1;
CREATE TABLE bar AS EXECUTE s WITH NO DATA;

since they both still run the setup and tear down steps of the executor.

I guess that I could fix that for the second case as soon as I 
understand how much of the portal stuff can be skipped in 
ExecuteQuery(). But I am not sure what we should do with EXPLAIN ANALYZE 
... NO DATA. It feels like a contradiction to me. Should we just raise an 
error? Or should we try to preserve the current behavior where you see 
something like the below?


In general I do not like how EXPLAIN CREATE TABLE AS uses a separate 
code path from CREATE TABLE AS, because we get weird but mostly harmless 
edge cases like the below and that I do not think that EXPLAIN ANALYZE 
CREATE MATERIALIZED VIEW sets the security context properly.


I am not sure if any of this is worth fixing, but it certainly 
contributed to why I thought that it was hard to design a good API.


postgres=# EXPLAIN ANALYZE CREATE TABLE IF NOT EXISTS bar AS SELECT 1;
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=1)
 Planning Time: 0.030 ms
 Execution Time: 12.245 ms
(3 rows)

Time: 18.223 ms
postgres=# EXPLAIN ANALYZE CREATE TABLE IF NOT EXISTS bar AS SELECT 1;
ERROR:  relation "bar" already exists
Time: 2.129 ms

Andreas



Add missing CREATE TABLE IF NOT EXISTS table_name AS EXECUTE query;

2019-02-05 Thread Andreas Karlsson

Hi,

The first example below works while the second one is a syntax error 
despite that the documentation 
(https://www.postgresql.org/docs/11/sql-createtableas.html) makes it 
seem like both should be valid. I do not see any reason to not support 
CTAS with both IF NOT EXISTS and EXECUTE at the same time so i am 
guessing that this was an oversight.


I have attached a patch which fixes this. What do you think? Should I 
add a new test case for this or is the change simple enough to not 
require any new test?


# CREATE TABLE IF NOT EXISTS a AS SELECT 1;
SELECT 1

# PREPARE q AS SELECT 1;
PREPARE
Time: 0.209 ms
foo=# CREATE TABLE IF NOT EXISTS a AS EXECUTE q;
ERROR:  syntax error at or near "EXECUTE"
LINE 1: CREATE TABLE IF NOT EXISTS a AS EXECUTE q;
^
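With the patch applied, the intent is that the EXECUTE form behaves just 
like the plain form, including the IF NOT EXISTS short-circuit (this is 
my expected behavior, not captured output from a patched build):

```sql
PREPARE q AS SELECT 1;
CREATE TABLE IF NOT EXISTS a AS EXECUTE q;  -- creates a
CREATE TABLE IF NOT EXISTS a AS EXECUTE q;  -- NOTICE: relation "a" already exists, skipping
```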

Andreas


ctas-exec-ifne-v1.sql
Description: application/sql


Re: Add missing CREATE TABLE IF NOT EXISTS table_name AS EXECUTE query;

2019-02-06 Thread Andreas Karlsson

On 2/6/19 12:49 PM, Michael Paquier wrote:

A test case in select_into.sql would be nice.  Should we back-patch
that?  It seems to me that this qualifies as a bug fix, and I don't
see a reason to not support it knowing that we have the infrastructure
for that.


I added a test in create_table.sql where the test for the other form of 
CTAS IF NOT EXISTS is.


I have no idea if something like this should be back patched. I will add 
it to the commit fest either way so it is not lost.


Andreas


ctas-exec-ifne-v2.sql
Description: application/sql


Re: Feature: temporary materialized views

2019-02-06 Thread Andreas Karlsson

On 2/6/19 10:18 AM, Michael Paquier wrote:

Attached is a patch to do that and close the gap.  With that, we will
be able to check for inconsistencies better when working on the
follow-up patches.  What do you think?


I approve. It was when testing this stuff that I found the IF NOT EXISTS 
issue.


Andreas



Re: libpq compression

2019-02-10 Thread Andreas Karlsson
I will preface this by saying that I am not a security guy and that I also 
do not know how the Zstd compression works, so take any of what I say with 
a grain of salt.


On 2/8/19 8:14 AM, Andres Freund wrote:

I think compression is pretty useful, and I'm not convinced that the
threat model underlying the attacks on SSL really apply to postgres.

I think that is only because it is usually harder to intercept traffic 
between the application server and the database than between the web 
browser and the web server.


Imagine the following query which uses the session ID from the cookie to 
check if the logged in user has access to a file.


SELECT may_download_file(session_id => $1, path => $2);

When the query with its parameters is compressed the compressed size 
will depend on the similarity between the session ID and the requested 
path (assuming Zstd works similar to DEFLATE), so by tricking the web 
browser into making requests with specifically crafted paths while 
monitoring the traffic between the web server and the database the 
compressed request size can be use to hone in the session ID and steal 
people's login sessions, just like the CRIME attack[1].


So while compression is a very useful feature I am worried that it also 
opens application developers to a new set of security vulnerabilities 
which they previously were protected from when compression was removed 
from SSL.


1. https://en.wikipedia.org/wiki/CRIME

Andreas



Re: libpq compression

2019-02-12 Thread Andreas Karlsson

On 2/11/19 5:28 PM, Peter Eisentraut wrote:

One mitigation is to not write code like that, that is, don't put secret
parameters and user-supplied content into the same to-be-compressed
chunk, or at least don't let the end user run that code at will in a
tight loop.

The difference in CRIME is that the attacker supplied the code.  You'd
trick the user to go to http://evil.com/ (via spam), and that site
automatically runs JavaScript code in the user's browser that contacts
https://bank.com/, which will then automatically send along any secret
cookies the user had previously saved from bank.com.  The evil
JavaScript code can then stuff the requests to bank.com with arbitrary
bytes and run the requests in a tight loop, only subject to rate
controls at bank.com.


Right, CRIME is worse since it cannot be mitigated by the application 
developer. But even so I do not think that my query is that odd. I do 
not think that it is obvious to most application developer that putting 
user supplied data close to sensitive data is potentially dangerous.


Will this attack ever be useful in practice? No idea, but I think we 
should be aware of what risks we open our end users to.


Andreas



Re: proposal: pg_restore --convert-to-text

2019-02-14 Thread Andreas Karlsson

On 14/02/2019 01.31, Euler Taveira wrote:

Em qua, 13 de fev de 2019 às 19:56, Andrew Gierth
 escreveu:

I propose we add a new option: --convert-to-text or some such name, and
then make pg_restore throw a usage error if neither -d nor the new
option is given.


However, I agree that pg_restore to stdout if -d wasn't specified is
not a good default. The current behavior is the same as "-f -"
(however, pg_restore doesn't allow - meaning stdout). Isn't it the
case to error out if -d or -f wasn't specified? If we go to this road,
-f option should allow - (stdout) as parameter.


Agreed, "-f -" would be acceptable. I use pg_restore to stdout a lot, 
but almost always manually, and would hate to have to remember and type 
"--convert-to-text".


Andreas



Re: Early WIP/PoC for inlining CTEs

2019-02-14 Thread Andreas Karlsson

On 14/02/2019 16.11, Tom Lane wrote:

... so, have we beaten this topic to death yet?  Can we make a decision?

Personally, I'd be happy with either of the last two patch versions
I posted (that is, either AS [[NOT] MATERIALIZED] or
AS [MATERIALIZE [ON|OFF]] syntax).  But we gotta pick something.


I am happy with either of those.

Andreas



Re: Early WIP/PoC for inlining CTEs

2019-02-16 Thread Andreas Karlsson

On 16/02/2019 22.14, Tom Lane wrote:

I was expecting more controversy ... pushed that way.


Thanks! And thanks to everyone else who worked on this patch!

Andreas
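For anyone reaching this thread later: with the commit in place, the 
choice is made per CTE, roughly like this (the table name is made up):

```sql
-- Keep the old fence: evaluate the CTE once and scan the result.
WITH w AS MATERIALIZED (SELECT * FROM orders WHERE paid)
SELECT * FROM w WHERE id = 42;

-- Force inlining even though the CTE is referenced twice.
WITH w AS NOT MATERIALIZED (SELECT * FROM orders)
SELECT * FROM w a JOIN w b USING (id);
```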



Re: Add missing CREATE TABLE IF NOT EXISTS table_name AS EXECUTE query;

2019-02-17 Thread Andreas Karlsson

On 15/02/2019 09.14, Michael Paquier wrote:

On Mon, Feb 11, 2019 at 09:53:59PM +0900, Michael Paquier wrote:

Let's wait a bit more than the beginning of this week.  I forgot about
this week's minor release, and it is too late to do something about
this report now, so we will have to wait unt

OK, committed down to 9.5.


Thanks!


Another thing I have noticed is the following, which is kind of funky
(just rinse and repeat once):
=# EXPLAIN ANALYZE CREATE TABLE IF NOT EXISTS ac AS SELECT 1;
ERROR:  42P07: relation "ac" already exists
LOCATION:  heap_create_with_catalog, heap.c:

The issue here is that we check for IF NOT EXISTS at the high level of
ExecCreateTableAs, however EXPLAIN takes the lower path of
create_ctas_internal() which enforces if_not_exists to false when
building the CreateStmt object to create the relation.  This brings
out a more interesting issue: how should an EXPLAIN behave in this
case?  It has nothing to output as the relation already exists.


Yeah, I noticed the same thing myself while refactoring the CTAS code, but 
I guess the output could be like the current output for "EXPLAIN ANALYZE 
 WITH NO DATA;", i.e. a top plan with "(never executed)" (see 
below for an example).


# EXPLAIN ANALYZE CREATE TABLE ac AS SELECT 1 WITH NO DATA;
QUERY PLAN
---
 Result  (cost=0.00..0.01 rows=1 width=4) (never executed)
 Planning Time: 0.125 ms
 Execution Time: 17.046 ms
(3 rows)

The main thing which bothers me right now about my refactoring is how 
different the code paths for CTAS and EXPLAIN ANALYZE CTAS are, which 
leads to weirdness like this. I wonder if we cannot make them share more 
code e.g. by having ExplainOneUtility() call into some function in 
createas.c.


Maybe I should give it a shot and then start a new thread for the 
refactoring once I have looked more into this.


Andreas



Re: Journal based VACUUM FULL

2019-02-21 Thread Andreas Karlsson

On 2/21/19 12:16 AM, Ryan David Sheasby wrote:
I was reading on VACUUM and VACUUM FULL and saw that the current 
implementation of VACUUM FULL writes to an entirely new file and then 
switches to it, as opposed to rewriting the current file in-place. I 
assume the reason for this is safety in the event the database shuts 
down in the middle of the vacuum, the most you will lose is the progress 
of the vacuum and have to start from scratch but otherwise the database 
will retain its integrity and not become corrupted. This seems to be an 
effective but somewhat rudimentary system that could be improved. Most 
notably, the rewriting of almost the entire database takes up basically 
double the storage during the duration of the rewrite which can become 
expensive or even simply inconvenient in IaaS(and probably other) 
installations where the drive sizes are scaled on demand. Even if the 
VACUUM FULL doesn't need to run often, having to reallocate drive space 
for an effective duplication is not ideal. My suggestion is a journal 
based in-place rewrite of the database files.


Hi,

VACUUM FULL used to modify the table in-place in PostgreSQL 8.4 and 
earlier, but that solution was slow and often caused plenty of index 
bloat while moving the rows around in the table, which is why PostgreSQL 
9.0 switched it to rewriting the whole table and its indexes.


I have not heard many requests for bringing back the old behavior, but 
I could easily have missed them. Either way I do not think there would 
be much demand for an in-place VACUUM FULL unless the index bloat 
problem is also solved.


Additionally I do not think that the project would want a whole new kind 
of infrastructure just to solve this very narrow case. PostgreSQL 
already has its own journal (the write-ahead log) which is used to 
ensure crash safety, and I think any proposed solution for this would 
need to use the WAL.


Andreas



Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-08 Thread Andreas Karlsson
On 04/08/2018 05:27 AM, Craig Ringer wrote:

More below, but here's an idea #5: decide InnoDB has the right idea, and
go to using a single massive blob file, or a few giant blobs.


FYI: MySQL has by default one file per table these days. The old 
approach with one massive file was a maintenance headache, so they changed 
the default some releases ago.


https://dev.mysql.com/doc/refman/8.0/en/innodb-multiple-tablespaces.html

Andreas



Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andreas Karlsson

On 04/09/2018 02:16 PM, Craig Ringer wrote:
I'd like a middle ground where the kernel lets us register our interest 
and tells us if it lost something, without us having to keep eight 
million FDs open for some long period. "Tell us about anything that 
happens under pgdata/" or an inotify-style per-directory-registration 
option. I'd even say that's ideal.


Could there be a risk of a race condition here where fsync incorrectly 
returns success before we get the notification that something went wrong?


Andreas



Re: unused_oids script is broken with bsd sed

2018-04-25 Thread Andreas Karlsson

On 04/25/2018 04:59 PM, David Fetter wrote:

While we're improving things, there's not really a reason this program
should need to be run from any particular spot. It can detect where it
is and act on that information.


+1

I would appreciate this change, and if Stas does not feel like adding 
it to his patch I can write a patch for this myself.


Andreas




Re: unused_oids script is broken with bsd sed

2018-04-26 Thread Andreas Karlsson
On 04/26/2018 01:37 PM, John Naylor wrote:

Patch 0002 is my attempt at this. Not sure how portable this is. It's

also clunkier than before in that you need to tell it where to find
Catalog.pm, since it seems Perl won't compile unless "use lib" points
to a string and not an expression. Suggestions welcome. I also edited
the boilerplate comments.


What about using FindBin (http://perldoc.perl.org/FindBin.html)?

Andreas



Re: MAP syntax for arrays

2018-05-08 Thread Andreas Karlsson

On 05/08/2018 02:49 PM, Ildar Musin wrote:

The main point of this patch was about convenience; the performance
thing came out later just as a side effect :) Many users are familiar
with "map/reduce/filter" concept that many languages (not only
functional ones) utilized. And my though was that it would be great to
have those in postgres too.


Map is a feature I have quite often wished to have, but I am not sure it 
is quite useful enough to be worth adding our own proprietary syntax. It 
would be a pain if the SQL committee started using MAP for something.


Andreas



Re: Remove Deprecated Exclusive Backup Mode

2018-11-27 Thread Andreas Karlsson

On 11/27/18 3:46 PM, Stephen Frost wrote:

I'm concerned, seriously, that people don't have anywhere near the
concern about the recovery side of things as they do about the backup
side of things and that's really concerning.


I agree with your larger point, but in this case the two breakages do 
not seem equal. As far as I gather the removal of recovery.conf will in 
practice result in a longer downtime when your recovery scripts breaks 
but not any data loss. While if we remove the exclusive backup mode then 
some people's backup scripts will break and if they do not properly 
monitor their backups they risk data loss.


I think we should use more caution when data loss is at stake rather 
than "just" downtime. So personally I am in favor of updating the manual 
with warnings (right now it does not even say if exclusive or 
non-exclusive is the default) and adding a deprecation warning when 
people use the exclusive mode.


Best regards,
Andreas



Re: Remove Deprecated Exclusive Backup Mode

2018-11-27 Thread Andreas Karlsson

On 11/27/18 4:11 PM, Stephen Frost wrote:

I agree with your larger point, but in this case the two breakages do not
seem equal. As far as I can gather, the removal of recovery.conf will in practice
result in longer downtime when your recovery scripts break, but not in any
data loss. If we remove the exclusive backup mode, on the other hand, some people's
backup scripts will break, and if they do not properly monitor their backups
they risk data loss.


Let's please try to remember that this is across a major version upgrade
and is removing a broken capability that's already been deprecated for a
couple years.


I know that and you know that, but even our own manual uses the exclusive 
mode in some of the examples, e.g. "pg_start_backup('label_goes_here')"[1].


Your point that people who do not monitor their backups won't 
notice warnings is fair, but even so I think we should give people more 
time to migrate when even our own documentation isn't always clear about 
the exclusive mode being deprecated.


1. 
https://www.postgresql.org/docs/11/functions-admin.html#FUNCTIONS-ADMIN-BACKUP


Andreas



Re: Introducing SNI in TLS handshake for SSL connections

2018-12-11 Thread Andreas Karlsson
On 12/11/18 3:52 PM, Pablo Iranzo Gómez wrote:

I came to this old thread while trying to figure out on how to setup
postgres replication behind OpenShift/Kubernetes behind a route (which 
only forwards 80 or 443 traffic), but could work if SNI is supported on 
the client using it.


I haven't found any further follow-up on this, but based on the number 
of posts and questions on many sites on accessing postgres on 
OpenShift/Kubernetes it could be something good to have supported.


Any further information or plans?


I am pretty sure nobody is working on this.

It seems like it would be easy to implement (basically just call 
SSL_set_tlsext_host_name() with the right hostname) with the only issue 
being that we may need to add a new connection string parameter[1] 
because I doubt all users would want SNI enabled by default since 
PostgreSQL itself cannot do anything useful with the hostname, only some 
kind of TLS proxy can. Hopefully there won't be much bike shedding about 
the new connection parameter. :)
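
For what it's worth, the client-side mechanism is the same one other TLS 
stacks expose. Here is a sketch in Python (not libpq) that sets the SNI 
hostname on a TLS object without making a network connection, analogous to 
what an SSL_set_tlsext_host_name() call would do inside libpq; the 
hostname is of course just an example:

```python
import ssl

# Build a TLS client object over in-memory BIOs; no socket is needed
# just to demonstrate where the SNI hostname gets attached.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

tls = ctx.wrap_bio(ssl.MemoryBIO(), ssl.MemoryBIO(),
                   server_hostname="db.example.com")

# server_hostname is what the client sends in the TLS SNI extension
# during the handshake; a TLS-aware proxy can route on it.
print(tls.server_hostname)
```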


Feel free to write a patch if you have the time and submit it to the 
next commitfest[2] for review.


Notes:

1. List of current options: 
https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS

2. https://wiki.postgresql.org/wiki/CommitFest

Andreas



Re: Reorganize collation lookup time and place

2018-12-12 Thread Andreas Karlsson

On 12/10/18 4:12 PM, Peter Eisentraut wrote:

Attached is a patch that reorganizes how and where collations are looked up.

Until now, a fmgr C function that makes use of collation information
(for example varstr_cmp(), str_toupper()) gets passed the collation OID,
looks up the collation with pg_newlocale_from_collation(), and then does
something with it.

With this change, the lookup is moved earlier, typically during executor
initialization.  The fmgr functions receive the locale pointer that is
the result of that lookup.


Sounds like a great plan. I too feel that having the lookups there 
makes more logical sense.


But when taking a look at your patch I got a segfault when running 
initdb. See the stack trace below. My compiler is "gcc (Debian 8.2.0-9) 
8.2.0" and the locale when running initdb is glibc's "en_US.utf8".


#0  __GI___strcoll_l (s1=0x55d64555d998 "({CONST :consttype 1009 
:consttypmod -1 :constcollid 100 :constlen -1 :constbyval false 
:constisnull false :location 160 :constvalue 16 [ 64 0 0 0 0 0 0 0 0 0 0 
0 25 0 0 0 ]})", s2=0x55d64555ddb0 "({CONST :consttype 1009 :consttypmod 
-1 :constcollid 100 :constlen -1 :constbyval false :constisnull false 
:location 161 :constvalue 16 [ 64 0 0 0 0 0 0 0 0 0 0 0 25 0 0 0 ]})", 
l=0x0) at strcoll_l.c:259
#1  0x55d644c4d607 in varstrfastcmp_locale (x=94378777346072, 
y=94378777347120, ssup=0x7ffea6a64990) at varlena.c:2185
#2  0x55d644958dad in ApplySortComparator (ssup=0x7ffea6a64990, 
isNull2=false, datum2=<optimized out>, isNull1=false, 
datum1=<optimized out>) at ../../../src/include/utils/sortsupport.h:224
#3  compare_scalars (a=<optimized out>, b=<optimized out>, 
arg=0x7ffea6a64980) at analyze.c:2784
#4  0x55d644cba494 in qsort_arg (a=a@entry=0x55d6456ee5f0, 
n=n@entry=14, es=es@entry=16, cmp=cmp@entry=0x55d644958d82 
<compare_scalars>, arg=arg@entry=0x7ffea6a64980) at qsort_arg.c:140
#5  0x55d64495b833 in compute_scalar_stats (stats=0x55d6456c6c88, 
fetchfunc=0x55d6449597f1 , samplerows=2904, 
totalrows=2904) at analyze.c:2360
#6  0x55d64495cca3 in do_analyze_rel 
(onerel=onerel@entry=0x7f3424bdbd50, options=options@entry=2, 
params=params@entry=0x7ffea6a64d90, va_cols=va_cols@entry=0x0, 
acquirefunc=0x55d64495901b , relpages=77, 
inh=false, in_outer_xact=false, elevel=13) at analyze.c:527
#7  0x55d64495d2ec in analyze_rel (relid=<optimized out>, 
relation=0x0, options=options@entry=2, 
params=params@entry=0x7ffea6a64d90, va_cols=0x0, 
in_outer_xact=<optimized out>, bstrategy=0x55d6455413a8) at analyze.c:258
#8  0x55d6449d61b1 in vacuum (options=2, relations=0x55d645541520, 
params=params@entry=0x7ffea6a64d90, bstrategy=<optimized out>, 
bstrategy@entry=0x0, isTopLevel=<optimized out>) at vacuum.c:357
#9  0x55d6449d644c in ExecVacuum 
(vacstmt=vacstmt@entry=0x55d645442e50, isTopLevel=isTopLevel@entry=true) 
at vacuum.c:141
#10 0x55d644b5ec9d in standard_ProcessUtility (pstmt=0x55d6454431a0, 
queryString=0x55d645442468 "ANALYZE;\n", 
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, 
dest=0x55d644f50b00 , completionTag=0x7ffea6a650d0 "") at 
utility.c:670
#11 0x55d644b5f1cc in ProcessUtility 
(pstmt=pstmt@entry=0x55d6454431a0, queryString=<optimized out>, 
context=<optimized out>, params=<optimized out>, 
queryEnv=<optimized out>, dest=dest@entry=0x55d644f50b00 , 
completionTag=0x7ffea6a650d0 "") at utility.c:360
#12 0x55d644b5b582 in PortalRunUtility 
(portal=portal@entry=0x55d645434428, pstmt=pstmt@entry=0x55d6454431a0, 
isTopLevel=isTopLevel@entry=true, 
setHoldSnapshot=setHoldSnapshot@entry=false, 
dest=dest@entry=0x55d644f50b00 , 
completionTag=completionTag@entry=0x7ffea6a650d0 "") at pquery.c:1175
#13 0x55d644b5c1fd in PortalRunMulti 
(portal=portal@entry=0x55d645434428, isTopLevel=isTopLevel@entry=true, 
setHoldSnapshot=setHoldSnapshot@entry=false, 
dest=dest@entry=0x55d644f50b00 , 
altdest=altdest@entry=0x55d644f50b00 , 
completionTag=completionTag@entry=0x7ffea6a650d0 "") at pquery.c:1321
#14 0x55d644b5cfbb in PortalRun (portal=portal@entry=0x55d645434428, 
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, 
run_once=run_once@entry=true, dest=dest@entry=0x55d644f50b00 
, altdest=altdest@entry=0x55d644f50b00 , 
completionTag=0x7ffea6a650d0 "") at pquery.c:796
#15 0x55d644b591b8 in exec_simple_query 
(query_string=query_string@entry=0x55d645442468 "ANALYZE;\n") at 
postgres.c:1215
#16 0x55d644b5b0dd in PostgresMain (argc=<optimized out>, 
argv=<optimized out>, dbname=<optimized out>, 
username=<optimized out>) at postgres.c:4256

#17 0x55d644a3abf9 in main (argc=10, argv=0x55d6453c3490) at main.c:224





Re: Introducing SNI in TLS handshake for SSL connections

2018-12-12 Thread Andreas Karlsson

On 12/11/18 3:52 PM, Pablo Iranzo Gómez wrote:
I came to this old thread while trying to figure out on how to setup 
postgres replication behind OpenShift/Kubernetes behind a route (which 
only forwards 80 or 443 traffic), but could work if SNI is supported on 
the client using it.


Hm ... while hacking at a patch for this I gave your specific problem 
some more thought.


I am not familiar with OpenShift or Kubernetes but I want you to be 
aware that whatever proxy you are going to use will still need to be 
aware of, at least, a subset of the PostgreSQL protocol, since, similar 
to SMTP's STARTTLS command, the PostgreSQL client will start out using 
the plain text PostgreSQL protocol and then request that the server switch 
over to SSL[1]. So it would be necessary to add support for this to 
whatever proxy you intend to use.
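
The cleartext preamble such a proxy would have to understand is tiny. As a 
sketch (Python, just to show the bytes on the wire — not libpq code), the 
SSLRequest message the client sends before any TLS handshake is a fixed 
8-byte packet:

```python
import struct

# The PostgreSQL SSLRequest message: int32 length (8) followed by the
# magic request code 80877103 (0x04d2162f), sent in cleartext. A
# protocol-aware proxy must recognize these 8 bytes, and the server's
# single-byte 'S'/'N' reply, before any TLS traffic appears.
SSL_REQUEST_CODE = 80877103
msg = struct.pack("!ii", 8, SSL_REQUEST_CODE)
print(msg.hex())  # 0000000804d2162f
```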


Do you know if adding such custom protocol support is easy to do to the 
proxies you refer to? And do you have any links to documentation for 
these solutions?


Notes

1. https://www.postgresql.org/docs/11/protocol-flow.html#id-1.10.5.7.11

Andreas



Re: Introducing SNI in TLS handshake for SSL connections

2018-12-13 Thread Andreas Karlsson

On 12/13/18 7:43 AM, Pablo Iranzo Gómez wrote:

haproxy is what is used behind, the idea is that haproxy by default when
enabled via a 'route' does allow http or https protocol ONLY, BUT
(https://docs.openshift.com/container-platform/3.9/architecture/networking/routes.html), 


routers do support TLS with SNI.

As PSQL by default tries TLS and fallbacks to plain psql protocol the
idea behind is that we tell OpenShift route to be 'Secure' and
'passtrough', in this way, when PSQL does speak to '443' port in the
route that goes to the 'pod' running postgres using TLS and SNI, the
connection goes thru without any special protocol change.


Sadly that confirms what I feared. Adding SNI to the PostgreSQL protocol 
won't help with solving your use case because the PostgreSQL protocol has 
its own handshake which happens before the SSL handshake, so the session 
will not look like SSL to HA Proxy.


Just like HA Proxy does not support STARTTLS for IMAP[1] I do not think 
that it will ever support SSL for the PostgreSQL protocol, SNI or not.


To solve your use case I recommend using something like stunnel, which 
does support SNI, to wrap the unencrypted PostgreSQL protocol in SSL.
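
As a sketch of what that could look like (hostnames and ports are made up, 
and the exact option names should be checked against the stunnel 
documentation), a client-mode stunnel in front of libpq could wrap the 
plain PostgreSQL protocol in TLS and set the SNI hostname:

```
; Client side: libpq connects to 127.0.0.1:6432 with sslmode=disable;
; stunnel wraps the whole connection in TLS and sends the SNI hostname,
; so an SNI-routing proxy sees an ordinary TLS session from byte one.
[postgres]
client = yes
accept = 127.0.0.1:6432
connect = proxy.example.com:443
sni = pg.example.com
```

A matching server-mode stunnel (or the proxy itself) then terminates the 
TLS and forwards the decrypted stream to the PostgreSQL port.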


This all makes me very skeptical about whether there is any realistic use 
case for adding SNI support to libpq, and just adding SNI support feels like 
adding a trap for people who do not know that they should rather use 
stunnel than PostgreSQL's built-in SSL support if they want to route on 
SNI with HA Proxy.


But I will attach my small patch for this, which I am now opposed to, 
anyway, so the code exists if a use case turns up in the future (or if it 
turns out my reasoning above is incorrect).


Notes

1. 
https://www.haproxy.com/documentation/haproxy/deployment-guides/exchange-2010/imap4/


Andreas
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d2e5b08541e..528757f775d 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1460,6 +1460,23 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   
  
 
+ 
+  sslsni
+  
+   
+If set to 1, the host name is sent to the server using SSL's
+SNI (Server Name Indication) extension.  If set
+to 0, no SNI extension will be sent.  The default is
+0.  This parameter is ignored if a connection without SSL is made.
+   
+
+   
+The PostgreSQL server ignores the SNI extension,
+but it can be used by SSL-aware proxy software.
+   
+  
+ 
+
  
   sslcert
   
@@ -7373,6 +7390,16 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
  
 
 
+
+ 
+  
+   PGSSLSNI
+  
  PGSSLSNI behaves the same as the sslsni connection parameter.
+ 
+
+
 
  
   
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index bc456fec0c2..4587e5ebb5a 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -278,6 +278,10 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
 		"SSL-Compression", "", 1,
 	offsetof(struct pg_conn, sslcompression)},
 
+	{"sslsni", "PGSSLSNI", "0", NULL,
+		"SSL-SNI", "", 1,
+	offsetof(struct pg_conn, sslsni)},
+
 	{"sslcert", "PGSSLCERT", NULL, NULL,
 		"SSL-Client-Cert", "", 64,
 	offsetof(struct pg_conn, sslcert)},
@@ -3690,6 +3694,8 @@ freePGconn(PGconn *conn)
 		free(conn->sslcrl);
 	if (conn->sslcompression)
 		free(conn->sslcompression);
+	if (conn->sslsni)
+		free(conn->sslsni);
 	if (conn->requirepeer)
 		free(conn->requirepeer);
 	if (conn->connip)
diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index beca3492e8d..fdae2eac74f 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -781,6 +781,7 @@ initialize_SSL(PGconn *conn)
 	char		homedir[MAXPGPATH];
 	char		fnbuf[MAXPGPATH];
 	char		sebuf[PG_STRERROR_R_BUFLEN];
+	char	   *host;
 	bool		have_homedir;
 	bool		have_cert;
 	bool		have_rootcert;
@@ -1183,6 +1184,11 @@ initialize_SSL(PGconn *conn)
 #endif
 #endif
 
+	host = conn->connhost[conn->whichhost].host;
+
+	if (conn->sslsni && conn->sslsni[0] == '1' && host)
+		SSL_set_tlsext_host_name(conn->ssl, host);
+
 	return 0;
 }
 
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 66fd317b949..9f69fbdf5fc 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -353,6 +353,7 @@ struct pg_conn
 	 * retransmits */
 	char	   *sslmode;		/* SSL mode (require,prefer,allow,disable) */
 	char	   *sslcompression; /* SSL compression (0 or 1) */
+	char	   *sslsni;			/* SSL SNI extension (0 or 1) */
 	char	   *sslkey;			/* client key filename */
 	char	   *sslcert;		/* client certificate filename */
 	char	   *sslrootcert;	/* root certificate filename */


Re: Early WIP/PoC for inlining CTEs

2018-12-31 Thread Andreas Karlsson

On 11/16/18 10:15 PM, Tom Lane wrote:
> I took a little bit of a look through this.  Some thoughts:

Thanks for the review! I have decided to pick up this patch and work on 
it since nothing has happened in a while. Here is a new version with 
most of the feedback fixed.



* This is not the idiomatic way to declare an expression tree walker:

+static bool inline_cte_walker(Node *node, void *ctxp)
+{
+   struct inline_cte_walker_ctx *ctx = ctxp;

* I have no faith in the idea that we can skip doing a copyObject on the
inlined subquery, except maybe in the case where we know there's exactly
one reference.

* In "here's where we do the actual substitution", if we're going to
scribble on the RTE rather than make a new one, we must take pains
to zero out the RTE_CTE-specific fields so that the RTE looks the
same as if it had been a RTE_SUBQUERY all along; cf db1071d4e.

* The lack of comments about what conditions we inline under
(at subselect.c:1318) is distressing.  I'm not particularly
in love with the way that inline_cte_walker is commented, either.
And dare I mention that this falsifies the intro comment for
SS_process_ctes?

* Speaking of the comments, I'm not convinced that view rewrite is
a good comparison point; I think this is more like subquery pullup.


I believe I have fixed these except for the comment on the conditions 
for when we inline.


Andrew Gierth: Why did you choose to not inline on FOR UPDATE but inline 
volatile functions? I feel that this might be inconsistent since in both 
cases the query in the CTE can change behavior if the planner pushes a 
WHERE clause into the subquery, but maybe I am missing something.



* I wonder whether we should make range_table_walker more friendly
to the needs of this patch.  The fact that it doesn't work for this usage
suggests that it might not work for others, too.  I could see replacing
the QTW_EXAMINE_RTES flag with QTW_EXAMINE_RTES_BEFORE and
QTW_EXAMINE_RTES_AFTER so that callers could say which order of operations
they want (ie, visit RTE itself before or after its substructure).  Then
we could get rid of the double traversal of the RTE list.


I did as suggested and the code is now much cleaner, but I feel the whole 
RTE walking business would become cleaner if we could change from having 
the range table walker yield both RTEs and the contents of the RTEs to 
having it just yield the RTEs, and then the walker callback can call 
expression_tree_walker() with the RTE so RTEs are treated like any other 
node in the tree.


I might look into how big an impact such a change would have and if it is 
worth the churn.



* I think a large fraction of the regression test cases that this
changes are actually broken by the patch, in the sense that we meant
to test the behavior with a CTE and now we're not getting that.
So we'd need to add MATERIALIZED in many more places than this has
done.  Somebody else (Stephen?) would need to opine on whether that's
true for the CTEs in rowsecurity.sql, but it's definitely true for
the one in rowtypes.sql, where the point is to test what happens
with a whole-row Var.


Agreed, fixed.


* Which will mean we need some new test cases showing that this patch does
anything.  It'd be a good idea to add a test case showing that this gets
things right for conflicting CTE names at different levels, eg

explain verbose
with x as (select 1 as y)
select * from (with x as (select 2 as y) select * from x) ss;


Added this test case, but more are needed. Any suggestion for which file 
these tests belong in (right now I just added them to subselect.sql)? Or 
should I add a new file called cte.sql?



* ruleutils.c needs adjustments for the new syntax, if we keep that.


Thanks, fixed!


* And of course the documentation needs much more work than this.


Yeah, I was waiting for there to be more agreement on when CTEs should 
be inlined, but maybe I should start writing anyway.


Andreas

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bb92d9d37a..8c26dd1f26 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -1888,7 +1888,7 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
 
 -- join in CTE
 EXPLAIN (VERBOSE, COSTS OFF)
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
  QUERY PLAN  
 -
  Limit
@@ -1905,7 +1905,7 @@ WITH t (c1_1, c1_3, c2_1) AS 

Re: Early WIP/PoC for inlining CTEs

2019-01-01 Thread Andreas Karlsson

On 1/1/19 3:18 AM, Andrew Gierth wrote:

I had a comment around here which seems to have been lost:

  * Secondly, views (and explicit subqueries) currently have
  * different behaviour w.r.t. SELECT FOR UPDATE than CTEs do. A
  * FOR UPDATE clause is treated as extending into views and
  * subqueries, but not into CTEs. We preserve this distinction
  * by not trying to push rowmarks into the new subquery.

This comment seems to me to be worth preserving (unless this behavior is
changed). What I'm referring to is the following, which is unchanged by
the patch:

create table t1 as select 123 as a;
create view v1 as select * from t1;
select * from t1 for update;  -- locks row in t1
select * from t1 for update of t1;  -- locks row in t1
select * from v1 for update;  -- locks row in t1
select * from v1 for update of v1;  -- locks row in t1
select * from (select * from t1) s1 for update;  -- locks row in t1
select * from (select * from t1) s1 for update of s1;  -- locks row in t1
with c1 as (select * from t1)
   select * from c1 for update;  -- does NOT lock anything at all
with c1 as (select * from t1)
   select * from c1 for update of c1;  -- parse-time error

(Obviously, inlining decisions should not change what gets locked;
the behavior here should not be changed unless it is changed for both
inlined and non-inlined CTEs.)


I see, I misread the comment. I will re-add it, possibly with some word 
smithing. Thanks!


Andreas




Re: Early WIP/PoC for inlining CTEs

2019-01-01 Thread Andreas Karlsson

On 1/1/19 1:42 AM, Andrew Gierth wrote:

"Andreas" == Andreas Karlsson  writes:


  Andreas> I believe I have fixed these except for the comment on the
  Andreas> conditions for when we inline.

  Andreas> Andrew Gierth: Why did you chose to not inline on FOR UPDATE
  Andreas> but inline volatile functions? I feel that this might be
  Andreas> inconsistent since in both cases the query in the CTE can
  Andreas> change behavior if the planner pushes a WHERE clause into the
  Andreas> subquery, but maybe I am missing something.

I chose not to inline FOR UPDATE because it was an obvious compatibility
break, potentially changing the set of locked rows, and it was an easy
condition to test.

I did not test for volatile functions simply because this was a very
early stage of the project (which wasn't my project, I was just
assisting someone else). I left the comment "this likely needs some
additional checks" there for a reason.


Thanks, that makes sense! I will need to ponder some on if the behavior 
change when predicates are pushed into a subquery with volatile 
functions is ok. I am leaning towards no, because otherwise inlining 
CTEs would affect more than query performance.


Andreas



Re: insensitive collations

2019-01-09 Thread Andreas Karlsson

On 12/28/18 9:55 AM, Peter Eisentraut wrote:

Here is an updated patch.

I have updated the naming to "deterministic", as discussed.


Maybe this is orthogonal and best handled elsewhere, but have you given 
any thought to Unicode normalization forms[1] when working with string 
equality? I feel there are three sane ways to do Unicode string equality:


1) Binary equality
2) Binary equality after normalizing the unicode
3) Collation equality

Would there be any point in adding Unicode normalization support into 
the collation system, or is this best handled, for example, with a function 
run on INSERT, or with something else entirely?


Right now PostgreSQL does not have any support for normalization forms as 
far as I know.
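
To illustrate the difference between options 1 and 2, here is a small 
Python sketch (outside PostgreSQL, since the server currently has no 
normalization support): the precomposed and decomposed spellings of "é" 
are canonically equivalent but binary-unequal until normalized.

```python
import unicodedata

nfc = "\u00e9"        # "é" as one precomposed code point (NFC form)
nfd = "e\u0301"       # "e" followed by a combining acute accent (NFD form)

# Option 1, binary equality: the two spellings are different code
# point sequences, so a plain strcmp()-style comparison says unequal.
print(nfc == nfd)                                # False

# Option 2, equality after normalization: both normalize to the same
# sequence, so the strings compare as equal.
print(unicodedata.normalize("NFC", nfd) == nfc)  # True
print(unicodedata.normalize("NFD", nfc) == nfd)  # True
```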


1. http://unicode.org/reports/tr15/

Andreas



Re: Early WIP/PoC for inlining CTEs

2019-01-09 Thread Andreas Karlsson
Here is a new version of the patch with added tests, improved comments, 
some minor code cleanup and most importantly slightly changed logic for 
when we should inline.


The current strategy is to always inline unless the CTE is recursive or 
it has side effects, i.e. it is a DML query, it contains a call to a 
volatile function, or it contains FOR SHARE/FOR UPDATE. I feel this is a 
conservative approach which prevents behavioral changes (other than 
performance).


Current open issues:

1. Currently the patch will inline CTEs even when there are multiple 
references to them. Whether this is a win or not depends on the specific 
query, so I am not sure what we should do here. One thing worth noting is 
that our current documentation talks a lot about how CTEs are only 
evaluated once.


"A useful property of WITH queries is that they are evaluated
only once per execution of the parent query, even if they are referred to
more than once by the parent query or sibling WITH queries.
Thus, expensive calculations that are needed in multiple places can be
placed within a WITH query to avoid redundant work. Another
possible application is to prevent unwanted multiple evaluations of
functions with side-effects."

What do you think?

2. Feedback on the new syntax. I am personally fine with the current 
syntax, but it was just something I quickly hacked together to move 
the patch forward and which also solved my personal use cases.


3. Are we happy with how I modified query_tree_walker()? I feel the code 
would be clearer if we could change the tree walker to treat the RTE as 
the parent node of the subquery instead of a sibling, but this seems 
like potentially a quite invasive change.


4. I need to update the user documentation.

Andreas
commit 60aab4af243cc93d504b99c99010593c02a5705c
Author: Andreas Karlsson 
Date:   Mon Sep 17 03:42:40 2018 +0200

CTE inlining

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bb92d9d37a..8c26dd1f26 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -1888,7 +1888,7 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
 
 -- join in CTE
 EXPLAIN (VERBOSE, COSTS OFF)
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
  QUERY PLAN  
 -
  Limit
@@ -1905,7 +1905,7 @@ WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2
Output: t.c1_1, t.c2_1, t.c1_3
 (12 rows)
 
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
  c1_1 | c2_1 
 --+--
   101 |  101
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index f438165650..56602a164c 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -493,8 +493,8 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE;
 -- join in CTE
 EXPLAIN (VERBOSE, COSTS OFF)
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
 -- ctid with whole-row reference
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 88bc189646..92f180c5cf 100644
--- a/doc/src/sgml/qu

Re: Feature: temporary materialized views

2019-01-11 Thread Andreas Karlsson
On 12/28/18 8:48 AM, Mitar wrote:

One more version of the patch with more deterministic tests.


Here is a quick initial review. I will do more testing later.

It applies, builds, and passes the tests.

The feature seems useful and also improves consistency: if we have 
temporary tables and temporary views there should logically also be 
temporary materialized views.


As for you leaving out ON COMMIT I feel that it is ok since of the 
existing options only really DROP makes any sense (you cannot truncate 
materialized views) and since temporary views do not have any ON COMMIT 
support.


= Comments on the code

The changes to the code are small and look mostly correct.

In create_ctas_internal() why do you copy the relation even when you do 
not modify it?


Is it really ok to just remove SECURITY_RESTRICTED_OPERATION from 
ExecCreateTableAs()? I feel it is there for a good reason and that we 
should preferably reduce the duration of SECURITY_RESTRICTED_OPERATION 
to cover only the time when we actually execute the query.


Andreas



Re: insensitive collations

2019-01-14 Thread Andreas Karlsson

On 1/10/19 8:44 AM, Peter Eisentraut wrote:

On 09/01/2019 19:49, Andreas Karlsson wrote:

Maybe this is orthogonal and best handled elsewhere but have you when
working with string equality given unicode normalization forms[1] any
thought?


Nondeterministic collations do address this by allowing canonically
equivalent code point sequences to compare as equal.  You still need a
collation implementation that actually does compare them as equal; ICU
does this, glibc does not AFAICT.


Ah, right! You could use -ks-identic[1] for this.


Would there be any point in adding unicode normalization support into
the collation system or is this best handle for example with a function
run on INSERT or with something else entirely?


I think there might be value in a feature that normalizes strings as
they enter the database, as a component of the encoding conversion
infrastructure.  But that would be a separate feature.


Agreed. And if we ever implement this we could theoretically optimize 
the equality of -ks-identic to do a strcmp() rather than having to 
collate anything.


I think it could also be useful to just add functions which can 
normalize strings, which was in a proposal to the SQL standard which was 
not accepted.[2]


Notes

1. http://www.unicode.org/reports/tr35/tr35-collation.html#Setting_Options
2. https://dev.mysql.com/worklog/task/?id=2048

Andreas




Re: Zedstore - compressed in-core columnar storage

2019-04-11 Thread Andreas Karlsson

On 4/11/19 10:46 AM, Konstantin Knizhnik wrote:


These are my results of compressing pgbench data using different compressors:

Configuration          Size (Gb)   Time (sec)
no compression         15.31       92
zlib (default level)   2.37        284
zlib (best speed)      2.43        191
postgres internal lz   3.89        214
lz4                    4.12        95
snappy                 5.18        99
lzfse                  2.80        1099
(apple)                2.80        1099
                       1.69        125



You see that zstd provides an almost 2 times better compression ratio 
at almost the same speed.



What is "(apple) 2.80 1099"? Was that intended to be zstd?

Andreas



Re: PostgreSQL pollutes the file system

2019-04-12 Thread Andreas Karlsson

On 4/12/19 5:14 PM, Chris Travers wrote:
1. naming things is surprisingly hard.  How sure are we that we can do 
this right?  Can we come up with a correct name for initdb?  Maybe 
pg_createcluster?


The Debian packagers already use pg_createcluster for their script which 
wraps initdb, and while pg_initdb is a bit misleading (it creates a 
cluster rather than a database) it is not that bad.


Andreas




Re: Early WIP/PoC for inlining CTEs

2018-10-01 Thread Andreas Karlsson

On 07/27/2018 10:10 AM, David Fetter wrote:

On Fri, Jul 27, 2018 at 02:55:26PM +1200, Thomas Munro wrote:

On Thu, Jul 26, 2018 at 7:14 AM, David Fetter  wrote:

Please find attached the next version, which passes 'make check'.


... but not 'make check-world' (contrib/postgres_fdw's EXPLAIN is different).


Please find attached a patch that does.

It doesn't always pass make installcheck-world, but I need to sleep
rather than investigate that at the moment.


I took a quick look at this patch and I have a couple of comments.

1) Is it really safe, for backwards compatibility reasons, to inline 
when there are volatile functions? I imagine that it is possible that 
there are people who rely on volatile functions inside a CTE always 
being run.


Imagine this case:

WITH cte AS (
  SELECT x, volatile_f(x) FROM tab ORDER BY x
)
SELECT * FROM cte LIMIT 10;

It could change behavior if volatile_f() has side effects and we inline 
the CTE. Is backwards compatibility for cases like this worth 
preserving? People can get the old behavior by adding OFFSET 0 or 
MATERIALIZED, but existing code would break.


2) The code in inline_cte_walker() is quite finicky. It looks correct to 
me but it is hard to follow and some simple bug could easily be hiding 
in there. I wonder if this code could be rewritten in some way to make 
it easier to follow.


3) Can you recall what the failing test was because I have so far not 
managed to reproduce it?


Andreas



Re: Early WIP/PoC for inlining CTEs

2018-10-02 Thread Andreas Karlsson

Hi,

Here is an updated patch which adds some simple syntax for adding the 
optimization barrier. For example:


WITH x AS MATERIALIZED (
   SELECT 1
)
SELECT * FROM x;

Andreas
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 21a2ef5ad3a..017b73e0a29 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -1888,7 +1888,7 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
 
 -- join in CTE
 EXPLAIN (VERBOSE, COSTS OFF)
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
  QUERY PLAN  
 -
  Limit
@@ -1905,7 +1905,7 @@ WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2
Output: t.c1_1, t.c2_1, t.c1_3
 (12 rows)
 
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
  c1_1 | c2_1 
 --+--
   101 |  101
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 88c4cb4783f..3032aed34f9 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -501,8 +501,8 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
 SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR SHARE;
 -- join in CTE
 EXPLAIN (VERBOSE, COSTS OFF)
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
-WITH t (c1_1, c1_3, c2_1) AS (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
+WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) SELECT c1_1, c2_1 FROM t ORDER BY c1_3, c1_1 OFFSET 100 LIMIT 10;
 -- ctid with whole-row reference
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 4db8142afaa..43f1b0e3f15 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -72,7 +72,7 @@ SELECT [ ALL | DISTINCT [ ON ( expressionand with_query is:
 
-with_query_name [ ( column_name [, ...] ) ] AS ( select | values | insert | update | delete )
+with_query_name [ ( column_name [, ...] ) ] AS [ MATERIALIZED ] ( select | values | insert | update | delete )
 
 TABLE [ ONLY ] table_name [ * ]
 
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 925cb8f3800..84435acb83e 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2542,6 +2542,7 @@ _copyCommonTableExpr(const CommonTableExpr *from)
 	COPY_NODE_FIELD(ctequery);
 	COPY_LOCATION_FIELD(location);
 	COPY_SCALAR_FIELD(cterecursive);
+	COPY_SCALAR_FIELD(ctematerialized);
 	COPY_SCALAR_FIELD(cterefcount);
 	COPY_NODE_FIELD(ctecolnames);
 	COPY_NODE_FIELD(ctecoltypes);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 3bb91c95958..65875d90201 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2794,6 +2794,7 @@ _equalCommonTableExpr(const CommonTableExpr *a, const CommonTableExpr *b)
 	COMPARE_NODE_FIELD(ctequery);
 	COMPARE_LOCATION_FIELD(location);
 	COMPARE_SCALAR_FIELD(cterecursive);
+	COMPARE_SCALAR_FIELD(ctematerialized);
 	COMPARE_SCALAR_FIELD(cterefcount);
 	COMPARE_NODE_FIELD(ctecolnames);
 	COMPARE_NODE_FIELD(ctecoltypes);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 22dbae15d3b..56bfdcc5d10 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -3094,6 +3094,7 @@ _outCommonTableExpr(StringInfo str, const CommonTableExpr *node)
 	WRITE_NODE_FIELD(ctequery);
 	WRITE_LOCATION_

Re: [RFC] Removing "magic" oids

2018-10-03 Thread Andreas Karlsson
On 09/30/2018 05:48 AM, Andres Freund wrote:

I think we should drop WITH OIDs support.


+1


2) We need to be able to manually insert into catalog tables. Both
initdb and emergency surgery.  My current hack is doing so directly
in nodeModifyTable.c but that's beyond ugly.  I think we should add
an explicit DEFAULT clause to those columns with something like
nextoid('tablename', 'name_of_index') or such.


Adding a function to get the next OID sounds like a good solution for 
both our catalog and legacy applications. The only potential 
disadvantage that I see is that this function becomes something we need 
to maintain if we ever decide to move from OIDs to sequences.
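For illustration, such a DEFAULT clause might look like the sketch below. Note that nextoid() does not exist; the name and arguments are hypothetical, taken from the proposal quoted above:

```sql
-- Hypothetical: nextoid(tablename, indexname) would hand out the next
-- free OID for the table, checked against the given unique index.
CREATE TABLE my_catalog (
  oid  oid NOT NULL DEFAULT nextoid('my_catalog', 'my_catalog_oid_index'),
  name text
);
```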



3) The quickest way to deal with the bootstrap code was to just assign
all oids for oid carrying tables that don't yet have any assigned.
That doesn't generally seem terrible, although it's certainly badly
implemented right now.  That'd mean we'd have three ranges of oids
probably, unless we somehow have the bootstrap code advance the
in-database oid counter to a right state before we start with
post-bootstrap work.  I like the idea of all bootstrap time oids
being determined by genbki.pl (I think that'll allow to remove some
code too).


Agreed, having genbki.pl determine all oids in the bootstrap data sounds 
ideal.


Andreas




Re: Early WIP/PoC for inlining CTEs

2018-10-04 Thread Andreas Karlsson

On 10/03/2018 05:57 PM, David Fetter wrote:

Is there any meaningful distinction between "inlining," by which I
mean converting to a subquery, and predicate pushdown, which
would happen at least for a first cut, at the rewrite stage?


Sorry, but I do not think I understand your question. The ability to 
push down predicates is just one of the potential benefits from inlining.


Andreas



Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Andreas Karlsson

On 10/25/2018 03:53 PM, Chapman Flack wrote:

On 10/25/18 10:39 AM, Tom Lane wrote:

I think getting out from under libxml2's idiosyncrasies and security
lapses would be great, but is there a plausible alternative out there?


Depends on whether anything in [1] sounds plausible.


The libraries we depend on should really either be available in the 
package repositories of the major Linux distributions or be something we 
can put in our own repository and maintain without too much pain. So 
using Saxon/C does not seem like a realistic option.


Andreas



Re: date_trunc() in a specific time zone

2018-10-29 Thread Andreas Karlsson

On 10/29/2018 04:18 PM, Vik Fearing wrote:

A use case that I see quite a lot of is needing to do reports and other
calculations on data per day/hour/etc but in the user's time zone.  The
way to do that is fairly trivial, but it's not obvious what it does so
reading queries becomes just a little bit more difficult.


Hm, I am not sure if I see any major win from writing

date_trunc('day', timestamptz '2001-02-16 20:38:40+00', 'Australia/Sydney')

instead of

date_trunc('day', timestamptz '2001-02-16 20:38:40+00' AT TIME ZONE 
'Australia/Sydney')


. Especially since you will still have to use the second form for other 
time-related functions like date(). Maybe there is a slight win in that 
new users who read the manual will be reminded that they need to care 
about time zones, but I also see value in teaching users how to use "AT 
TIME ZONE".


Andreas



Re: date_trunc() in a specific time zone

2018-10-29 Thread Andreas Karlsson

On 10/29/2018 04:36 PM, Tom Lane wrote:

Andreas Karlsson  writes:

Hm, I am not sure if I see any major win from writing
date_trunc('day', timestamptz '2001-02-16 20:38:40+00', 'Australia/Sydney')
instead of
date_trunc('day', timestamptz '2001-02-16 20:38:40+00' AT TIME ZONE
'Australia/Sydney')


The latter would give you timestamp without time zone, whereas I think
what Vik wants is timestamp with time zone.  Yeah, you could then convert
it back with a second application of AT TIME ZONE 'Australia/Sydney',
but that's both inefficient and mighty confusing.


Sloppy reading on my part, thanks for pointing it out.
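For the archives, the round trip Tom describes can be written with the existing functions like this (truncate in Sydney local time, then convert the local midnight back to timestamptz):

```sql
-- The first AT TIME ZONE yields a local timestamp in Sydney time,
-- date_trunc() chops it to midnight, and the second AT TIME ZONE
-- converts that local midnight back to timestamp with time zone.
SELECT date_trunc('day',
                  timestamptz '2001-02-16 20:38:40+00'
                    AT TIME ZONE 'Australia/Sydney')
         AT TIME ZONE 'Australia/Sydney';
```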

Andreas



Re: [RFC] Removing "magic" oids

2018-11-21 Thread Andreas Karlsson

On 11/21/18 1:07 AM, Andres Freund wrote:

Let's see what I broke :/


There is a small typo in the old release notes.

Andreas
diff --git a/doc/src/sgml/release-9.1.sgml b/doc/src/sgml/release-9.1.sgml
index 3407d8ad739..a522e7e0225 100644
--- a/doc/src/sgml/release-9.1.sgml
+++ b/doc/src/sgml/release-9.1.sgml
@@ -3654,7 +3654,7 @@ Branch: REL9_0_STABLE [9d6af7367] 2015-08-15 11:02:34 -0400
 
  
   Prevent foreign tables from being created with OIDS
-  when default_with_oids" is true
+  when default_with_oids is true
   (Etsuro Fujita)
  
 
diff --git a/doc/src/sgml/release-9.2.sgml b/doc/src/sgml/release-9.2.sgml
index 1e5f4e04c33..e012f7fb6ca 100644
--- a/doc/src/sgml/release-9.2.sgml
+++ b/doc/src/sgml/release-9.2.sgml
@@ -5650,7 +5650,7 @@ Branch: REL9_2_STABLE [6b700301c] 2015-02-17 16:03:00 +0100
 
  
   Prevent foreign tables from being created with OIDS
-  when default_with_oids" is true
+  when default_with_oids is true
   (Etsuro Fujita)
  
 
diff --git a/doc/src/sgml/release-9.3.sgml b/doc/src/sgml/release-9.3.sgml
index 23868d65d8e..3575cb5d76d 100644
--- a/doc/src/sgml/release-9.3.sgml
+++ b/doc/src/sgml/release-9.3.sgml
@@ -9237,7 +9237,7 @@ Branch: REL9_1_STABLE [dd1a5b09b] 2014-06-24 13:30:41 +0300
 
  
   Prevent foreign tables from being created with OIDS
-  when default_with_oids" is true
+  when default_with_oids is true
   (Etsuro Fujita)
  
 


Re: Need of maintaining unsupported release notes in HEAD?

2018-11-21 Thread Andreas Karlsson
On 11/21/18 2:51 AM, Haribabu Kommi wrote:

Attached patch removes all the older release notes sgml files from HEAD.
And also going forward, how about removing one major version file 
whenever that is unsupported from HEAD?


comments?


I have found these old notes useful multiple times in the past, 
especially when dealing with no longer supported versions of PostgreSQL 
or just to quickly check when some feature was first introduced. 
Removing them from current master would make them a bit harder to 
navigate, for what is in my opinion a tiny gain in reduced maintenance 
burden.


Andreas



Re: Early WIP/PoC for inlining CTEs

2018-07-26 Thread Andreas Karlsson

On 07/25/2018 06:08 PM, Andrew Gierth wrote:

WITH ctename AS [[NOT] MATERIALIZED] (query)


I think "NOT MATERIALIZED" would be a bit misleading since the planner 
may choose to materialize anyway, so I suggest skipping that part of the 
syntax unless there is a really strong reason for having it.


Andreas



Re: \describe*

2018-01-26 Thread Andreas Karlsson
On 01/26/2018 03:49 PM, David Fetter wrote:

They are indeed terse and cryptic, and what's worse, they're not

available to clients other than psql, so I propose that we do what at
least MySQL, Oracle, and DB2 do and implement DESCRIBE as its own
command.

Especially handy would be a variant DESCRIBE CREATE, which would do
what it says on the label in a copy-and-paste-able form, but that's
not strictly necessary for the first cut.


I am not a fan of this since I like how easy it is to explain to beginners 
that all backslash commands are processed by the client while everything 
else is handled by the server. Yes, "help" is an exception, but nobody 
really needs to know about that command.


As for the actually proposal I do not care strongly either way. The \d 
commands are a bit cryptic and unfriendly to the occasional user, but I 
am not sure that having two ways to do it would be better.


Andreas



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2018-01-30 Thread Andreas Karlsson

On 01/21/2018 10:36 PM, Mark Rofail wrote:

== The @>> operator

A previous version of your patch added the "anyelement <<@ anyarray"
operator to avoid having to build arrays, but that part was reverted
due to a bug.

I am not expert on the gin code, but as far as I can tell it would
be relatively simple to fix that bug. Just allocate an array of
Datums of length one where you put the element you are searching for
(or maybe a copy of it).

The @>> operator is now restored and functioning correctly; all issues 
with contrib libraries have been resolved


Hi,

I looked some at your anyarray @>> anyelement code and sadly it does not 
look like the index code could work. The issue I see is that 
ginqueryarrayextract() needs to make a copy of the search key but to do 
so it needs to know the type of anyelement (to know if it needs to 
detoast, etc). But there is as far as I can tell no way to check the 
type of anyelement in this context.


Am I missing something?

Andreas



Re: [HACKERS] GnuTLS support

2018-01-30 Thread Andreas Karlsson

On 01/26/2018 03:54 AM, Peter Eisentraut wrote:

On 1/25/18 20:10, Michael Paquier wrote:

Peter, could you change ssl_version() and ssl_cipher() in sslinfo at the
same time please? I think that those should use the generic backend-side
APIs as well. sslinfo depends heavily on OpenSSL, OK, but if possible
getting this code more generic will help users of sslinfo to get
something partially working with other SSL implementations natively.


sslinfo is currently entirely dependent on OpenSSL, so I don't think
it's useful to throw in one or two isolated API changes.

I'm thinking maybe we should get rid of sslinfo and fold everything into
pg_stat_ssl.


I think sslinfo should either use the pg_tls_get_* functions or be 
removed. I do not like having an OpenSSL specific extension. One issue 
though is that pg_tls_get_* truncates strings to a given length while 
sslinfo allocates a copy and is therefore only limited by the maximum 
size of text, but this may not be an issue in practice.


Andreas



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2018-01-30 Thread Andreas Karlsson

On 01/26/2018 03:28 AM, Stephen Frost wrote:

I'm a big fan of this patch but it doesn't appear to have made any
progress in quite a while.  Is there any chance we can get an updated
patch and perhaps get another review before the end of this CF...?


Sorry, as you may have guessed I do not have the time right now to get 
this updated during this commitfest.



Refactoring this to have an internal representation between
ProcessUtility() and DefineIndex doesn't sound too terrible and if it
means the ability to reuse that, seems like it'd be awful nice to do
so..


I too like the concept, but have not had the time to look into it.

Andreas



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2018-02-03 Thread Andreas Karlsson

On 02/01/2018 04:17 PM, Mark Rofail wrote:
since it's a polymorphic function it only passes if the `anyarray` is of 
the same type as the `anyelement`, so we are sure they are the same type. 
Can't we get the type from the anyarray? The type is already stored in 
`arr_type`.


In theory, yes, but I saw no obvious way to get it without major changes 
to the code. Unless I am missing something, of course.


If I am right my recommendation for getting this patch in is to 
initially skip the new operators and go back to the version of the patch 
which uses @>.


Andreas



Re: JIT compiling with LLVM v9.1

2018-02-03 Thread Andreas Karlsson

On 02/02/2018 10:48 AM, Pierre Ducroquet wrote:

I have successfully built the JIT branch against LLVM 4.0.1 on Debian testing.
This is not enough for Debian stable (LLVM 3.9 is the latest available there),
but it's a first step.
I've split the patch in four files. The first three fix the build issues, the
last one fixes a runtime issue.
I think they are small enough to not be a burden for you in your developments.
But if you don't want to carry these ifdefs right now, I maintain them in a
branch on a personal git and rebase as frequently as I can.


I tested these patches and while the code built for me and passed the 
test suite on Debian testing I have a weird bug where the very first 
query fails to JIT while the rest work as they should. I think I need to 
dig into LLVM's codebase to see what it is, but can you reproduce this 
bug at your machine?


Code to reproduce:

SET jit_expressions = true;
SET jit_above_cost = 0;
SELECT 1;
SELECT 1;

Output:

postgres=# SELECT 1;
ERROR:  failed to jit module
postgres=# SELECT 1;
 ?column?
--
1
(1 row)

Config:

Version: Your patches applied on top of 
302b7a284d30fb0e00eb5f0163aa933d4d9bea10

OS: Debian testing
llvm/clang: 4.0.1-8

Andreas



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2018-02-05 Thread Andreas Karlsson
The patch looks good to me now other than some smaller issues, mostly in 
the documentation. If you fix these I will set it as ready for 
committer, and let a committer chime in on the casting logic which we 
both find a bit ugly.


== Comments

The only somewhat bigger issue I see is the error messages. Can they be 
improved?


For example:

CREATE TABLE t1 (a int, b int, PrIMARY KEY (a, b));
CREATE TABLE t2 (a int, bs int8[], FOREIGN KEY (a, EACH ELEMENT OF bs) 
REFERENCES t1 (a, b));

INSERT INTO t2 VALUES (0, ARRAY[1, 2]);

Results in:

ERROR:  insert or update on table "t2" violates foreign key constraint 
"t2_a_fkey"

DETAIL:  Key (a, bs)=(0, {1,2}) is not present in table "t1".

Would it be possible to make the DETAIL something like the below 
instead? Do you think my suggested error message is clear?


I am imagining something like the below as a patch. Does it look sane? The 
test cases need to be updated at least.


diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 402bde19d4..9dc7c9812c 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -3041,6 +3041,10 @@ ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 			appendStringInfoString(&key_names, ", ");
 			appendStringInfoString(&key_values, ", ");
 		}
+
+		if (riinfo->fk_reftypes[idx] == FKCONSTR_REF_EACH_ELEMENT)
+			appendStringInfoString(&key_names, "EACH ELEMENT OF ");
+
 		appendStringInfoString(&key_names, name);
 		appendStringInfoString(&key_values, val);
 	}


DETAIL:  Key (a, EACH ELEMENT OF bs)=(0, {1,2}) is not present in table 
"t1".


-  REFERENCES reftable [ ( refcolumn ) ] [ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ]
+  [EACH ELEMENT OF] REFERENCES reftable [ ( refcolumn ) ] [ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ]


I think this documentation in doc/src/sgml/ref/create_table.sgml should 
be removed since it is no longer correct.


+ 
+  Foreign Key Arrays are an extension
+  of PostgreSQL and are not included in the SQL standard.
+ 

This paragraph and some of the others you added to 
doc/src/sgml/ref/create_table.sgml are strangely line wrapped.


+   
+ON DELETE CASCADE
+
+ 
+  Same as standard foreign key constraints. Deletes the entire 
array.

+ 
+
+   
+  

I thought this was no longer supported.

+   however, this syntax will cast ftest1 to int4 upon RI checks, thus defeating the
+  purpose of the index.

There is a minor indentation error on these lines.

+
+   
+ Array ELEMENT foreign keys and the ELEMENT
+ REFERENCES clause are a 
PostgreSQL

+ extension.
+   

We do not have any ELEMENT REFERENCES clause.

-   ereport(ERROR,
-   (errcode(ERRCODE_DATATYPE_MISMATCH),
-errmsg("foreign key constraint \"%s\" "
-   "cannot be implemented",
-   fkconstraint->conname),
-errdetail("Key columns \"%s\" and \"%s\" "
-  "are of incompatible types: %s and %s.",
-  strVal(list_nth(fkconstraint->fk_attrs, i)),
-  strVal(list_nth(fkconstraint->pk_attrs, i)),
-  format_type_be(fktype),
-  format_type_be(pktype))));
+   ereport(ERROR,
+   (errcode(ERRCODE_DATATYPE_MISMATCH),
+errmsg("foreign key constraint \"%s\" "
+   "cannot be implemented",
+   fkconstraint->conname),
+errdetail("Key columns \"%s\" and \"%s\" "
+  "are of incompatible types: %s and %s.",
+  strVal(list_nth(fkconstraint->fk_attrs, i)),
+  strVal(list_nth(fkconstraint->pk_attrs, i)),
+  format_type_be(fktype),
+  format_type_be(pktype))));

It seems like you accidentally changed the indentation here.

Andreas



Warning when building man pages

2018-02-05 Thread Andreas Karlsson

Hi,

I get a warning when building the man pages on Debian testing. The 
warning is in the documentation for CREATE POLICY and as caused by the 
three footnotes in the sql-createpolicy-summary table.


I do not understand our documentation pipeline well enough to know what 
the proper fix would be.


Error message:

Element sup in namespace '' encountered in a, but no template matches.
Element sup in namespace '' encountered in a, but no template matches.
Element sup in namespace '' encountered in a, but no template matches.
Note: Writing man7/CREATE_POLICY.7

Package versions:

docbook  4.5-6
docbook-dsssl1.79-9.1
docbook-xml  4.5-8
docbook-xsl  1.79.1+dfsg-2
openjade 1.4devel1-21.3+b1
opensp   1.5.2-13+b1
xsltproc 1.1.29-5

Andreas



Re: JIT compiling with LLVM v9.1

2018-02-05 Thread Andreas Karlsson

OK that fixed the issue, but you have a typo in your patch set.

diff --git a/src/backend/lib/llvmjit_inline.cpp b/src/backend/lib/llvmjit_inline.cpp
index a785261bea..51f38e10d2 100644
--- a/src/backend/lib/llvmjit_inline.cpp
+++ b/src/backend/lib/llvmjit_inline.cpp
@@ -37,7 +37,7 @@ extern "C"
 #include 
 #include 
 #include 
-#if LLVM_MAJOR_VERSION > 3
+#if LLVM_VERSION_MAJOR > 3
 #include 
 #else
 #include "llvm/Bitcode/ReaderWriter.h"

Also I get some warnings. Not sure if they are from your patches or from 
Andres's.


llvmjit_error.cpp:118:1: warning: unused function 
'fatal_llvm_new_handler' [-Wunused-function]

fatal_llvm_new_handler(void *user_data,
^
1 warning generated.
llvmjit_inline.cpp:114:6: warning: no previous prototype for function 
'operator!' [-Wmissing-prototypes]

bool operator!(const llvm::ValueInfo &vi) {
 ^
1 warning generated.
psqlscanslash.l: In function ‘psql_scan_slash_option’:
psqlscanslash.l:550:8: warning: variable ‘lexresult’ set but not used 
[-Wunused-but-set-variable]

  int   final_state;
^

Andreas

On 02/05/2018 11:39 AM, Pierre Ducroquet wrote:

On Sunday, February 4, 2018 12:45:50 AM CET Andreas Karlsson wrote:

On 02/02/2018 10:48 AM, Pierre Ducroquet wrote:

I have successfully built the JIT branch against LLVM 4.0.1 on Debian
testing. This is not enough for Debian stable (LLVM 3.9 is the latest
available there), but it's a first step.
I've split the patch in four files. The first three fix the build issues,
the last one fixes a runtime issue.
I think they are small enough to not be a burden for you in your
developments. But if you don't want to carry these ifdefs right now, I
maintain them in a branch on a personal git and rebase as frequently as I
can.


I tested these patches and while the code built for me and passed the
test suite on Debian testing I have a weird bug where the very first
query fails to JIT while the rest work as they should. I think I need to
dig into LLVM's codebase to see what it is, but can you reproduce this
bug at your machine?

Code to reproduce:

SET jit_expressions = true;
SET jit_above_cost = 0;
SELECT 1;
SELECT 1;

Output:

postgres=# SELECT 1;
ERROR:  failed to jit module
postgres=# SELECT 1;
   ?column?
--
  1
(1 row)

Config:

Version: Your patches applied on top of
302b7a284d30fb0e00eb5f0163aa933d4d9bea10
OS: Debian testing
llvm/clang: 4.0.1-8

Andreas



I have fixed the patches, I was wrong on 'guessing' the migration of the API
for one function.
I have rebuilt the whole patch set. It is still based on 302b7a284d and has
been tested with both LLVM 3.9 and 4.0 on Debian testing.

Thanks for your feedback !





Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2018-02-06 Thread Andreas Karlsson

On 02/06/2018 11:15 AM, Mark Rofail wrote:

A new patch including all the fixes is ready.

Can you give the docs another look? I re-wrapped, re-indented, and 
changed all `Foreign Key Arrays` to `Array Element Foreign Keys` for 
consistency.


Looks good to me, so I have set it to ready for committer. I still feel 
the type casting logic is a bit ugly but I am not sure if it can be 
improved much.


Andreas



Re: JIT compiling with LLVM v10.0

2018-02-08 Thread Andreas Karlsson

On 02/07/2018 03:54 PM, Andres Freund wrote:

I've pushed v10.0. The big (and pretty painful to make) change is that
now all the LLVM specific code lives in src/backend/jit/llvm, which is
built as a shared library which is loaded on demand.


It does not seem to be possible to build without LLVM anymore.

Error:

In file included from planner.c:32:0:
../../../../src/include/jit/llvmjit.h:13:10: fatal error: 
llvm-c/Types.h: No such file or directory

 #include <llvm-c/Types.h>
  ^~~~

Options:

./configure --prefix=/home/andreas/dev/postgresql-inst 
--enable-tap-tests --enable-cassert --enable-debug


I also noticed the following typo:

diff --git a/configure.in b/configure.in
index b035966c0a..b89c4a138a 100644
--- a/configure.in
+++ b/configure.in
@@ -499,7 +499,7 @@ fi
 if test "$enable_coverage" = yes; then
   if test "$GCC" = yes; then
 CFLAGS="$CFLAGS -fprofile-arcs -ftest-coverage"
-CFLAGS="$CXXFLAGS -fprofile-arcs -ftest-coverage"
+CXXFLAGS="$CXXFLAGS -fprofile-arcs -ftest-coverage"
   else
 AC_MSG_ERROR([--enable-coverage is supported only when using GCC])
   fi

Andreas



Re: JIT compiling with LLVM v9.1

2018-02-15 Thread Andreas Karlsson

On 02/05/2018 10:44 PM, Pierre Ducroquet wrote:

psqlscanslash.l: In function ‘psql_scan_slash_option’:
psqlscanslash.l:550:8: warning: variable ‘lexresult’ set but not used
[-Wunused-but-set-variable]
int   final_state;
  ^


I'm not sure Andres's patches have anything to do with psql, it's surprising.


I managed to track down the bug and apparently when building with 
--with-llvm the -DNDEBUG option is added to CPPFLAGS, but I am not 
entirely sure what the code in config/llvm.m4 is trying to do in the 
first place.


The two issues I see with what the code does are:

1) Why does config/llvm.m4 modify CPPFLAGS? That affects the building of 
the binaries too which may be done with gcc like in my case. Shouldn't 
it use a LLVM_CPPFLAGS or something?


2) When I build with --with-cassert I expect the assertions to be there, 
both in the binaries and the bitcode. Is that just a bug or is there any 
thought behind this?


Below is the diff in src/Makefile.global between when I run configure 
with --with-llvm or not.


diff src/Makefile.global-nollvm  src/Makefile.global-llvm
78c78
< configure_args =  '--prefix=/home/andreas/dev/postgresql-inst' 
'--enable-tap-tests' '--enable-cassert' '--enable-debug'

---
> configure_args =  '--prefix=/home/andreas/dev/postgresql-inst' 
'--enable-tap-tests' '--enable-cassert' '--enable-debug' '--with-llvm'

190c190
< with_llvm  = no
---
> with_llvm  = yes
227,229c227,229
< LLVM_CONFIG =
< LLVM_BINPATH =
< CLANG =
---
> LLVM_CONFIG = /usr/bin/llvm-config
> LLVM_BINPATH = /usr/lib/llvm-4.0/bin
> CLANG = /usr/bin/clang
238c238
< CPPFLAGS =  -D_GNU_SOURCE
---
> CPPFLAGS = -D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS 
-D__STDC_CONSTANT_MACROS -D_GNU_SOURCE -DNDEBUG 
-I/usr/lib/llvm-4.0/include  -D_GNU_SOURCE

261c261
< LLVM_CXXFLAGS =
---
> LLVM_CXXFLAGS =  -std=c++0x -std=c++11 -fno-exceptions
283c283
< LLVM_LIBS=
---
> LLVM_LIBS= -lLLVM-4.0
297c297
< LDFLAGS +=   -Wl,--as-needed
---
> LDFLAGS +=  -L/usr/lib/llvm-4.0/lib  -Wl,--as-needed




Re: JIT compiling with LLVM v9.1

2018-02-15 Thread Andreas Karlsson

On 02/15/2018 06:23 PM, Andres Freund wrote:

2) When I build with --with-cassert I expect the assertions to be there,
both in the binaries and the bitcode. Is that just a bug or is there any
thought behind this?


Not sure what you mean by that. NDEBUG and cassert are independent
mechanisms, no?


Yeah, I think I just managed to confuse myself there.

The actual issue is that --with-llvm changes whether NDEBUG is set, 
which is quite surprising. I would not expect assertions to be disabled 
in the frontend code just because I compiled PostgreSQL with LLVM.


Andreas



Re: [HACKERS] GnuTLS support

2017-11-21 Thread Andreas Karlsson

On 11/20/2017 02:56 AM, Michael Paquier wrote:

Sorry about that. Something more specific needs to happen here as well
for channel binding support with SCRAM. CheckSCRAMAuth() now assumes
that the channel binding mechanism SCRAM-SHA-256-PLUS can be published
to the client if SSL is used, because OpenSSL is the only
implementation available. Does gnutls include an API to allow an
application to fetch the bytes from the TLS finished message? I can
see some references by glancing at the tarball of gnutls 3.6.1, but
you would need something similar to OpenSSL's SSL_get_peer_finished().
If that cannot be achieved I think that it will be necessary to tweak
auth.c to not publish the -PLUS mechanism if for example the result of
be_tls_get_peer_finished() is NULL. No need for a new SSL-specific
API. At the end it would prove to be more portable to do so for all
the SSL implementations, MacOS stuff does not document an API to get
the TLS finish message bytes.


There is a function called gnutls_session_channel_binding() which seems 
to do something very similar to SSL_get*_finished() which has been in 
GnuTLS since 2.12.


https://www.gnutls.org/manual/gnutls.html#Channel-Bindings


  --with-ssl=(openssl|gnutls)


WIth potential patches coming to use macos' SSL implementation or
Windows channel, there should really be only one implementation
available at compile time. That's more simple as a first step as well.
So +1 for the --with-ssl switch.


Agreed.

Andreas



Re: [HACKERS] GnuTLS support

2017-11-26 Thread Andreas Karlsson

On 11/20/2017 02:56 AM, Michael Paquier wrote:

On Mon, Nov 20, 2017 at 9:42 AM, Tomas Vondra
 wrote:

If I get it right we ignore gnutls and use openssl (as it's the first
checked in #ifdefs). Shouldn't we enforce in configure that only one TLS
implementation is enabled? Either by some elaborate check, or by
switching to something like

  --with-ssl=(openssl|gnutls)


WIth potential patches coming to use macos' SSL implementation or
Windows channel, there should really be only one implementation
available at compile time. That's more simple as a first step as well.
So +1 for the --with-ssl switch.


I have now implemented this in the attached patch (plus added support 
for channel binding and rebased it), but I ran into one issue which I 
have not yet solved. The script for the Windows version takes the 
--with-openssl= switch, so that cannot just be translated to a 
single --with-ssl switch. Should we have both --with-openssl and 
--with-gnutls, or --with-ssl=(openssl|gnutls) plus --with-ssl-path=? 
I also do not know the Windows build code very well (or really at all).


Andreas
diff --git a/configure b/configure
index 6c4d743b35..76470af626 100755
--- a/configure
+++ b/configure
@@ -707,7 +707,7 @@ UUID_EXTRA_OBJS
 with_uuid
 with_systemd
 with_selinux
-with_openssl
+with_ssl
 krb_srvtab
 with_python
 with_perl
@@ -834,6 +834,7 @@ with_pam
 with_bsd_auth
 with_ldap
 with_bonjour
+with_ssl
 with_openssl
 with_selinux
 with_systemd
@@ -1528,7 +1529,8 @@ Optional Packages:
   --with-bsd-auth build with BSD Authentication support
   --with-ldap build with LDAP support
   --with-bonjour  build with Bonjour support
-  --with-openssl  build with OpenSSL support
+  --with-ssl=LIB  build with TLS support from LIB (openssl,gnutls)
+  --with-openssl  obsolete spelling of --with-ssl=openssl
   --with-selinux  build with SELinux support
   --with-systemd  build with systemd support
   --without-readline  do not use GNU Readline nor BSD Libedit for editing
@@ -5961,10 +5963,34 @@ $as_echo "$with_bonjour" >&6; }
 
 
 #
-# OpenSSL
+# TLS library
 #
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with OpenSSL support" >&5
-$as_echo_n "checking whether to build with OpenSSL support... " >&6; }
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking which SSL library to build with" >&5
+$as_echo_n "checking which SSL library to build with... " >&6; }
+
+
+
+# Check whether --with-ssl was given.
+if test "${with_ssl+set}" = set; then :
+  withval=$with_ssl;
+  case $withval in
+yes)
+  as_fn_error $? "argument required for --with-ssl option" "$LINENO" 5
+  ;;
+no)
+  as_fn_error $? "argument required for --with-ssl option" "$LINENO" 5
+  ;;
+*)
+
+  ;;
+  esac
+
+fi
+
+
+if test x"$with_ssl" = x"" ; then
+  with_ssl=no
+fi
 
 
 
@@ -5973,9 +5999,7 @@ if test "${with_openssl+set}" = set; then :
   withval=$with_openssl;
   case $withval in
 yes)
-
-$as_echo "#define USE_OPENSSL 1" >>confdefs.h
-
+  :
   ;;
 no)
   :
@@ -5991,8 +6015,23 @@ else
 fi
 
 
-{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_openssl" >&5
-$as_echo "$with_openssl" >&6; }
+if test "$with_ssl" = openssl ; then
+  with_ssl=openssl
+fi
+
+if test "$with_ssl" = openssl ; then
+
+$as_echo "#define USE_OPENSSL 1" >>confdefs.h
+
+elif test "$with_ssl" = gnutls ; then
+
+$as_echo "#define USE_GNUTLS 1" >>confdefs.h
+
+elif test "$with_ssl" != no ; then
+  as_fn_error $? "--with-ssl must specify one of openssl or gnutls" "$LINENO" 5
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_ssl" >&5
+$as_echo "$with_ssl" >&6; }
 
 
 #
@@ -9911,7 +9950,7 @@ fi
   fi
 fi
 
-if test "$with_openssl" = yes ; then
+if test "$with_ssl" = openssl ; then
 if test "$PORTNAME" != "win32"; then
  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for CRYPTO_new_ex_data in -lcrypto" >&5
 $as_echo_n "checking for CRYPTO_new_ex_data in -lcrypto... " >&6; }
@@ -10169,6 +10208,94 @@ done
 
 fi
 
+if test "$with_ssl" = gnutls ; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5
+$as_echo_n "checking for library containing gnutls_init... " >&6; }
+if ${ac_cv_search_gnutls_init+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_func_search_save_LIBS=$LIBS
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+/* Override any GCC internal prototype to avoid an error.
+   Use char because int might match the return type of a GCC
+   builtin and then its argument prototype would still apply.  */
+#ifdef __cplusplus
+extern "C"
+#endif
+char gnutls_init ();
+int
+main ()
+{
+return gnutls_init ();
+  ;
+  return 0;
+}
+_ACEOF
+for ac_lib in '' gnutls; do
+  if test -z "$ac_lib"; then
+ac_res="none required"
+  else
+ac_res=-l$ac_lib
+LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
+  fi
+  if ac_fn_c_try_link "$LINENO"; then :
+  ac_cv_search_gnutls_init=$

Typo in config_default.pl comment

2017-11-26 Thread Andreas Karlsson

Hi,

There is a --with-tls in the comments in config_default.pl which should 
be --with-tcl.


Andreas

diff --git a/src/tools/msvc/config_default.pl b/src/tools/msvc/config_default.pl
index 4d69dc2a2e..d7a9fc5039 100644
--- a/src/tools/msvc/config_default.pl
+++ b/src/tools/msvc/config_default.pl
@@ -18,7 +18,7 @@ our $config = {
 	icu   => undef,# --with-icu=
 	nls   => undef,# --enable-nls=
 	tap_tests => undef,# --enable-tap-tests
-	tcl   => undef,# --with-tls=
+	tcl   => undef,# --with-tcl=
 	perl  => undef,# --with-perl
 	python=> undef,# --with-python=
 	openssl   => undef,# --with-openssl=


Re: [HACKERS] GnuTLS support

2017-11-26 Thread Andreas Karlsson

On 11/27/2017 02:20 AM, Michael Paquier wrote:

On Mon, Nov 27, 2017 at 10:05 AM, Andreas Karlsson  wrote:

The script for the windows version takes the
--with-openssl= switch so that cannot just be translated to a single
--with-ssl switch. Should we have both --with-openssl and --with-gnutls or
--with-ssl=(openssl|gnutls) and --with-ssl-path=? I also do not know
the Windows build code very well (or really at all).


By default --with-openssl does not take a path with ./configure. When
using gnutls, do the name of the libraries and the binaries generated
change compared to openssl? If yes, and I guess that it is the case,
you will need to be able to make the difference between gnutls and
openssl anyway as the set of perl scripts in src/tools/msvc need to
make the difference with deliverables at name-level. I would be
personally fine with just listing gnutls in the list of options and
comment it as --with-ssl=, and change the openssl comment to
match that.


Hm, after reading more of our MSVC code it seems like building with MSVC 
does not really use switches; people instead have to create a config.pl. Is 
that correct? The MSVC scripts also seem to only support uuid-ossp, 
which they just call $config->{uuid}.


If so we could either have:

$config->{ssl} = "openssl";
$config->{sslpath} = "/path/to/openssl";

or

$config->{ssl} = "openssl";
$config->{openssl} = "/path/to/openssl";

or

$config->{openssl} = "/path/to/openssl";
vs
$config->{gnutls} = "/path/to/gnutls";

Andreas



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-11-26 Thread Andreas Karlsson

On 11/13/2017 12:32 PM, Mark Rofail wrote:

== The @>> operator
I would argue that allocating an array of datums and building an
array would have the same complexity


I am not sure what you mean here. Just because something has the
same complexity does not mean there can't be major performance
differences.

I have spent a lot of time working on this operator and would like to 
benefit from it. How should I go about this? Start a new patch?


I am still not sure what you mean here. Feel free to add @>> to this 
patch if you like. You may want to submit it as two patch files (but 
please keep them as the same commitfest entry) but I leave that decision 
all up to you.

So the two main issues we remain to resolve are MATCH FULL and the 
RI_Initial_Check() query refactoring. The problem is that I am not one 
of the original authors and have not touched this part of the code.
I understand the problem but it will take some time for me to understand 
how to resolve everything.


Cool, feel free to ask if you need some assistance. I want this patch.

Andreas



Re: [HACKERS] GnuTLS support

2017-11-28 Thread Andreas Karlsson

On 11/28/2017 04:19 PM, Peter Eisentraut wrote:

On 11/19/17 20:56, Michael Paquier wrote:

  --with-ssl=(openssl|gnutls)


I'm not sure whether this is a great improvement.  Why upset existing
build and packaging scripts?  The usual options style is
--with-nameoflib.  We can have separate options and error if conflicting
combinations are specified.


We already have a precedent in --with-uuid=LIB which has the backwards 
compatibility alias --with-ossp-uuid, so I think adding --with-ssl=LIB 
while keeping --with-openssl as an alias is consistent with that. I also 
think the code and the interface ended up pretty clean when I added 
--with-ssl in my latest patch.


The only issue I see with --with-ssl is Window's config.pl which might 
end up a bit ugly to support '$config->{openssl} = "path";' as a legacy 
alias.


Andreas



Re: Commit fest 2017-11

2017-11-30 Thread Andreas Karlsson
On 11/30/2017 05:51 AM, Michael Paquier wrote:

One thing that could be 
improved in my
opinion is that patch authors should try more to move a patch to a
following commit fest once the end gets close... This would leverage
slightly the load of work for the CFM.


Thanks for the suggestion. I had no idea that I could do that.

Andreas



Re: Postgres with pthread

2017-12-06 Thread Andreas Karlsson

On 12/06/2017 06:08 PM, Andres Freund wrote:

I think the biggest problem with doing this for real is that it's a huge
project, and that it'll take a long time.


An additional issue is that this could break a lot of extensions, and in 
a way that is not apparent at compile time. This means we may need to 
break all extensions to force extension authors to check whether they are 
thread safe.


I do not like making life hard for our extension community, but if the 
gains are big enough it might be worth it.



Thanks for working on this!


Seconded.

Andreas



Re: Feature: temporary materialized views

2019-03-11 Thread Andreas Karlsson

On 3/8/19 2:38 AM, Michael Paquier wrote:

On Thu, Mar 07, 2019 at 10:45:04AM +0200, David Steele wrote:

I think a new patch is required here so I have marked this Waiting on
Author.  cfbot is certainly not happy and anyone trying to review is going
to have a hard time trying to determine what to review.


I would recommend to mark this patch as returned with feedback as we
already know that we need to rethink a bit harder the way relations
are created in CTAS, not to mention that the case of EXPLAIN CTAS IF
NOT EXISTS is not correctly handled.  This requires more than the three weeks of
work which is what remains until the end of this CF, so v12 is not a
sane target.


Agreed. Even if I could find the time to write a patch for this there is 
no way it would make it into v12.


Andreas




Re: Feature: temporary materialized views

2019-03-14 Thread Andreas Karlsson
On 3/14/19 9:13 AM, Mitar wrote:

I just want to make sure if I 
understand correctly. But my initial
proposal/patch is currently waiting first for all patches for the
refactoring to happen, which are done by amazing Andreas? This sounds
good to me and I see a lot of progress/work has been done and I am OK
with waiting. Please ping me explicitly if there will be anything I am
expected to do at any point in time.

And just to make sure, these current patches are doing just
refactoring but are not also introducing temporary materialized views
yet? Or is that also done in patches made by Andreas?


Yeah, your patch is sadly stuck behind the refactoring, and the 
refactoring proved to be harder to do than I initially thought. The 
different code paths for executing CREATE MATERIALIZED VIEW are so 
different that it is hard to find a good common interface.


So there is unfortunately little you can do here other than wait for me 
or someone else to do the refactoring as I cannot see your patch getting 
accepted without keeping the existing restrictions on side effects for 
CREATE MATERIALIZED VIEW.


Andreas



Re: GIN indexes on an = ANY(array) clause

2019-03-14 Thread Andreas Karlsson

On 3/13/19 5:38 PM, Tom Lane wrote:

regression=# select amopopr::regoperator from pg_amop where amopfamily = 2745;
 amopopr
---
  &&(anyarray,anyarray)
  @>(anyarray,anyarray)
  <@(anyarray,anyarray)
  =(anyarray,anyarray)
(4 rows)

and none of those are obviously related to the =(int4,int4) operator that
is in the ScalarArrayOp.  The only way to get from point A to point B is
to know very specifically that =(anyarray,anyarray) is related to any
scalar-type btree equality operator, which is not the kind of thing the
GIN AM ought to know either.


In the discussions for the patch for foreign keys from arrays[1] some 
people proposed adding a new operator, <<@(anyelement,anyarray), to avoid 
having to construct left hand side arrays. Would that help here or does 
it still have the same issues?


1. 
https://www.postgresql.org/message-id/CAJvoCut7zELHnBSC8HrM6p-R6q-NiBN1STKhqnK5fPE-9%3DGq3g%40mail.gmail.com


Andreas



Re: PostgreSQL pollutes the file system

2019-03-20 Thread Andreas Karlsson

On 3/19/19 11:19 AM, Fred .Flintstone wrote:

PostgreSQL pollutes the file system with lots of binaries that it is
not obvious that they belong to PostgreSQL.

Such as "/usr/bin/createdb", etc.

It would be better if these files were renamed to be prefixed with
pg_, such as pg_createdb.
Or even better, postgresql-createdb, then reachable through a
"postgresql" wrapper script.


Hi,

This topic has been discussed before e.g. in 2008 in 
https://www.postgresql.org/message-id/47EA5CC0.8040102%40sun.com and 
also more recently but I cannot find it in the archives right now.


I am personally in favor of renaming e.g. createdb to pg_createdb, since 
it is not obvious that createdb belongs to PostgreSQL when reading a 
script or looking in /usr/bin, but we would need some kind of 
deprecation cycle here or we would suddenly break tons of people's scripts.


And as for the git-like solution with a wrapper script, that seems to be 
the modern way to do things but would be an even larger breakage and I 
am not convinced the advantage would be worth it, especially since our 
executables are not as closely related and consistent as for example git's.


Andreas



Re: PostgreSQL pollutes the file system

2019-03-20 Thread Andreas Karlsson

On 3/20/19 8:19 PM, Andres Freund wrote:

On 2019-03-20 15:15:02 -0400, Jonathan S. Katz wrote:

If we are evaluating this whole symlink / renaming thing, there could be
arguments for a "pgsql" alias to psql (or vice versa), but I don't think
"pg_sql" makes any sense and could be fairly confusing.


I don't care much about createdb etc, but I'm *strongly* against
renaming psql and/or adding symlinks. That's like 95% of all
interactions people have with postgres binaries, making that more
confusing would be an entirely unnecessary self-own.


+1 "psql" as a tool for connecting to PostgreSQL is so well established 
that renaming it would just confuse everyone.


Andreas



Re: PostgreSQL pollutes the file system

2019-03-21 Thread Andreas Karlsson

On 3/21/19 7:07 AM, Chris Travers wrote:
1.  createuser/dropuser are things that I don't consider good ways of 
creating users anyway.  I think we should just consider removing these 
binaries.  The SQL queries are better, more functional, and can be 
rolled back as a part of a larger transaction.


Those binaries are pretty convenient to use in scripts since they handle 
SQL escaping for you, but probably not convenient enough that we would 
have added createuser today.


Compare

createuser "$USER"

vs

echo 'CREATE ROLE :"user" LOGIN' | psql postgres -v "user=$USER"

Andreas



Re: PostgreSQL pollutes the file system

2019-03-27 Thread Andreas Karlsson

On 3/27/19 2:51 PM, Tomas Vondra wrote:

I think the consensus in this thread (and the previous ancient ones) is
that it's not worth it. It's one thing to introduce new commands with the
pg_ prefix, and it's a completely different thing to rename existing ones.
That has inherent costs, and as Tom pointed out the burden would fall on
people using PostgreSQL (and that's rather undesirable).

I personally don't see why having commands without pg_ prefix would be
an issue. Especially when placed in a separate directory, which eliminates
the possibility of conflict with other commands.


I buy that it may not be worth breaking tens of thousands of scripts to 
fix this, but I disagree about it not being an issue. Most Linux 
distributions add PostgreSQL's executables into a directory which is in 
the default $PATH (/usr/bin in the case of Debian). And even if it would 
be installed into a separate directory there would still be a conflict 
as soon as that directory is added to $PATH.


And I think that it is also relatively easy to confuse adduser and 
createuser when reading a script. Nothing about the name createuser 
indicates that it will create a role in an SQL database.


Andreas



