Re: [HACKERS] UPDATE of partition key
On 9 November 2017 at 09:27, Thomas Munro wrote: > On Wed, Nov 8, 2017 at 5:57 PM, Amit Khandekar wrote: >> On 8 November 2017 at 07:55, Thomas Munro >> wrote: >>> On Tue, Nov 7, 2017 at 8:03 AM, Robert Haas wrote: The changes to trigger.c still make me super-nervous. Hey Thomas Munro, any chance you could review that part? > > At first, it seemed quite strange to me that row triggers and > statement triggers fire different events for the same modification. > Row triggers see DELETE + INSERT (necessarily because different > tables are involved), but this fact is hidden from the target table's > statement triggers. > > The alternative would be for all triggers to see consistent events and > transitions. Instead of having your special case code in ExecInsert > and ExecDelete that creates the two halves of a 'synthetic' UPDATE for > the transition tables, you'd just let the existing ExecInsert and > ExecDelete code do its thing, and you'd need a flag to record that you > should also fire INSERT/DELETE after statement triggers if any rows > moved. Yeah, I had also thought about that, but felt that change would be too invasive; for example, letting ExecARInsertTriggers() do the transition capture even when transition_capture->tcs_update_new_table is set. I was also thinking of having a separate function to *only* add the transition table rows, and calling that in ExecInsert instead of ExecARUpdateTriggers(). But I realized that the existing ExecARUpdateTriggers() looks like a better, more robust interface with all its checks. It's just that calling ExecARUpdateTriggers() sounds like we are also firing a trigger; we are not firing any trigger or saving any event, we are just adding the transition row. > > After sleeping on this question, I am coming around to the view that > the way you have it is right. The distinction isn't really between > row triggers and statement triggers, it's between triggers at > different levels in the hierarchy.
It just so happens that we > currently only fire target table statement triggers and leaf table row > triggers. Yes. And rows are there only in leaf partitions, so we have to simulate as though the target table has these rows. Like you mentioned, the user has to get the impression of a normal table, so we have to do something extra to capture the rows. > Future development ideas that seem consistent with your choice: > > 1. If we ever allow row triggers with transition tables on child > tables, then I think *their* transition tables should certainly see > the deletes and inserts, otherwise OLD TABLE and NEW TABLE would be > inconsistent with the OLD and NEW variables in a single trigger > invocation. (These were prohibited mainly due to lack of time and > (AFAIK) limited usefulness; I think they would probably need > their own separate tuplestores, or possibly some kind of filtering.) As we know, for row triggers on leaf partitions, we treat them as normal tables, so a trigger written on a leaf partition sees only the local changes. The trigger is unaware whether the insert is part of an UPDATE row movement. Similarly, the transition table referenced by that row trigger function should see only the NEW table, not the OLD table. > > 2. If we ever allow row triggers on partitioned tables (ie that fire > when its children are modified), then I think their UPDATE trigger > should probably fire when a row moves between any two (grand-)*child > tables, just as you have it for target table statement triggers. Yes, I agree. > It doesn't matter that the view from parent tables' triggers is > inconsistent with the view from leaf table triggers: it's a feature > that we 'hide' partitioning from the user to the extent we can so that > you can treat the partitioned table just like a table. > > Any other views?
I think that because there is no provision for a row trigger on a partitioned table, users who want a common trigger on a partition subtree have no choice but to create the same trigger individually on the leaf partitions. And that's the reason we cannot handle update row movement with triggers without anomalies. Thanks -Amit Khandekar -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] path toward faster partition pruning
On 10 November 2017 at 16:30, Kyotaro HORIGUCHI wrote: > In 0002, bms_add_range has a bit naive-looking loop > > + while (wordnum <= uwordnum) > + { > + bitmapword mask = (bitmapword) ~0; > + > + /* If working on the lower word, zero out bits below 'lower'. > */ > + if (wordnum == lwordnum) > + { > + int lbitnum = BITNUM(lower); > + mask >>= lbitnum; > + mask <<= lbitnum; > + } > + > + /* Likewise, if working on the upper word, zero bits above > 'upper' */ > + if (wordnum == uwordnum) > + { > + int ushiftbits = BITS_PER_BITMAPWORD - (BITNUM(upper) > + 1); > + mask <<= ushiftbits; > + mask >>= ushiftbits; > + } > + > + a->words[wordnum++] |= mask; > + } > > Without some aggressive optimization, the loop takes most of the > time to check-and-jump for nothing especially with many > partitions and somewhat unintuitive. > > The following uses a bit tricky bitmap operation but > is straightforward as a whole. > > = > /* fill the bits upper from BITNUM(lower) (0-based) of the first word */ > a->workds[wordnum++] += ~(bitmapword)((1 << BITNUM(lower)) - 1); > > /* fill up intermediate words */ > while (wordnum < uwordnum) >a->words[wordnum++] = ~(bitmapword) 0; > > /* fill up to BITNUM(upper) bit (0-based) of the last word */ > a->workds[wordnum++] |= > (~(bitmapword) 0) >> (BITS_PER_BITMAPWORD - (BITNUM(upper) - 1)); > = No objections here for making bms_add_range() perform better, but this is not going to work when lwordnum == uwordnum. You'd need to special case that. I didn't think it was worth the trouble, but maybe it is... I assume the += should be |=. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
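For what it's worth, the lwordnum == uwordnum overlap can be handled by computing both boundary masks and ANDing them, which keeps the per-word fill branch-free for the intermediate words. A standalone sketch, not the actual bitmapset.c code; bitmapword is assumed to be 32 bits here purely for illustration:

```c
#include <stdint.h>

typedef uint32_t bitmapword;            /* illustration only; real width varies */
#define BITS_PER_BITMAPWORD 32
#define WORDNUM(x) ((x) / BITS_PER_BITMAPWORD)
#define BITNUM(x)  ((x) % BITS_PER_BITMAPWORD)

/*
 * Mask for the bits of word 'wordnum' that fall inside [lower, upper].
 * When lower and upper land in the same word, both boundary masks
 * apply and must be ANDed together -- exactly the special case the
 * proposed three-step fast path would otherwise miss.
 */
static bitmapword
range_mask(int wordnum, int lower, int upper)
{
	bitmapword	mask = ~(bitmapword) 0;

	if (wordnum == WORDNUM(lower))		/* zero out bits below 'lower' */
		mask &= (~(bitmapword) 0) << BITNUM(lower);
	if (wordnum == WORDNUM(upper))		/* zero out bits above 'upper' */
		mask &= (~(bitmapword) 0) >> (BITS_PER_BITMAPWORD - BITNUM(upper) - 1);
	return mask;
}
```

Intermediate words (strictly between the boundary words) still get ~0 assigned directly, so the fast path's middle loop is unaffected.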
[HACKERS] pg audit requirements
Hi

I am sending some notes based on experience with pgAudit. pgAudit provides the basic functionality and usually it is good enough, but not for some applications in financial services. The requirements:

1. Structured output - the attached query alone is not good enough; column name, table name, schema, database, and role should be separate fields.
2. A separate log (log file) with guaranteed writes - fsync after every line means a significant performance issue, but fsync every 1 sec (or a defined interval) is acceptable.
3. Security issues - insufficient access rights to a database object should be processed and logged in the audit log too.

Regards Pavel
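To make requirement 2 concrete, here is a minimal sketch (my own illustration, not pgAudit code; the AuditLog struct and audit_write() names are invented for the example) of a writer that appends lines to a dedicated log file but issues fsync() at most once per configured interval:

```c
#include <stdio.h>
#include <time.h>
#include <unistd.h>

typedef struct AuditLog
{
	FILE   *fp;				/* dedicated audit log file */
	time_t	last_sync;		/* time of the last fsync */
	double	sync_interval;	/* seconds between guaranteed writes */
} AuditLog;

/*
 * Append one audit line.  The line always reaches the kernel (fflush),
 * but fsync runs only when the interval has elapsed, trading a bounded
 * window of potential loss for per-line performance.
 * Returns 1 if this call fsynced, 0 otherwise.
 */
static int
audit_write(AuditLog *log, const char *line)
{
	time_t	now = time(NULL);

	fputs(line, log->fp);
	fputc('\n', log->fp);
	fflush(log->fp);

	if (difftime(now, log->last_sync) >= log->sync_interval)
	{
		fsync(fileno(log->fp));
		log->last_sync = now;
		return 1;
	}
	return 0;
}
```

With sync_interval = 1.0 this matches the "fsync every 1 sec" compromise; setting it to 0 degenerates to fsync-per-line.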
Re: [HACKERS] proposal: psql command \graw
2017-11-10 8:12 GMT+01:00 Fabien COELHO : > > ISTM that you can remove "force_column_header" and just set "tuple_only" >>> to what you need, that is you do not need to change anything in function >>> "print_unaligned_text". >>> >> >> Last point is not possible - I do not want to break the original tuples-only >> mode. >> > > Hmmm... I do not understand. I can see only one use of force_column_header > in the function: > > - if (!opt_tuples_only) > + if (!opt_tuples_only || opt_force_column_header) > > So I would basically suggest to do: > > my_popt.topt.tuples_only = !pset.g_raw_header; > > in the driver. Looking at the detailed code in that function, probably you > need to set start_table to on when headers are needed and stop_table to off > for the raw mode anyway? > > Maybe I'm missing something, but it looks that it could be made to work > without adding another boolean. > Tuples-only mode cannot simply be disabled, because then other parts would print the row count: postgres=# \pset format unaligned Output format is unaligned. postgres=# select 10 as a, 20 as b; a|b 10|20 (1 row) > -- > Fabien. >
Re: [HACKERS] proposal: psql command \graw
ISTM that you can remove "force_column_header" and just set "tuple_only" to what you need, that is you do not need to change anything in function "print_unaligned_text". Last point is not possible - I would not to break original tuple only mode. Hmmm... I do not understand. I can see only one use of force_column_header in the function: - if (!opt_tuples_only) + if (!opt_tuples_only || opt_force_column_header) So I would basically suggest to do: my_popt.topt.tuples_only = !pset.g_raw_header; in the driver. Looking at the detailed code in that function, probably you need to set start_table to on when headers are needed and stop_table to off for the raw mode anyway? Maybe I'm missing something, but it looks that it could be made to work without adding another boolean. -- Fabien.
Re: [HACKERS] path toward faster partition pruning
Hello, At Fri, 10 Nov 2017 14:44:55 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20171110.144455.117208639.horiguchi.kyot...@lab.ntt.co.jp> > > Those two conditions are not orthogonal. Maybe something like > > the following seems more understandable. > > > > > if (!constfalse) > > > { > > > /* No constraints on the keys, so, return *all* partitions. */ > > > if (nkeys == 0) > > > return bms_add_range(result, 0, partdesc->nparts - 1); > > > > > > result = get_partitions_for_keys(relation, &keys); > > > } So, the condition (!constfalse && nkeys == 0) cannot return there. I'm badly confused by the variable name. I couldn't find another reasonable structure using the current classify_p_b_keys(), but could you add a comment like the following as an example? + /* + * This function processes expressions other than OR expressions and returns + * the excluded OR expressions in or_clauses + */ > nkeys = classify_partition_bounding_keys(relation, clauses, >&keys, &constfalse, >&or_clauses); > /* >* Only look up in the partition descriptor if the query provides >* constraints on the keys at all. >*/ > if (!constfalse) > { > if (nkeys > 0) > result = get_partitions_for_keys(relation, &keys); > else -+ /* No constraints on the keys, so, all partitions are passed. */ > result = bms_add_range(result, 0, partdesc->nparts - 1); > } > + /* +* We have a partition set for clauses not returned in or_clauses +* here. Conjoin the result of each OR clause. +*/ > foreach(lc, or_clauses) > { > BoolExpr *or = (BoolExpr *) lfirst(lc); > ListCell *lc1; > Bitmapset *or_partset = NULL; > + Assert(or_clause(or)); regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] proposal: psql command \graw
2017-11-09 21:12 GMT+01:00 Pavel Stehule : > > > 2017-11-09 21:03 GMT+01:00 Fabien COELHO : > >> >> Hello Pavel, >> >> I hope so I fixed all mentioned issues. >>> >> >> Patch applies with a warning: >> >> > git apply ~/psql-graw-2.patch >> /home/fabien/psql-graw-2.patch:192: new blank line at EOF. >> + >> warning: 1 line adds whitespace errors. >> >> Otherwise it compiles. "make check" ok. doc gen ok. >> >> Two spurious empty lines are added before StoreQueryTuple. >> >> Doc: "If + is appended to the command name, a column >> names are displayed." >> >> I suggest instead: "When + is appended, column names >> are also displayed." >> >> ISTM that you can remove "force_column_header" and just set "tuple_only" >> to what you need, that is you do not need to change anything in function >> "print_unaligned_text". >> > > Last point is not possible - I would not to break original tuple only > mode. > > updated patch > Pavel > >> >> -- >> Fabien. >> > > diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index e520cdf3ba..457a59eeab 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -2020,6 +2020,19 @@ CREATE INDEX + +\graw[+] [ filename ] +\graw[+] [ |command ] + + +\graw is equivalent to \g, but +forces unaligned output mode for this query. When + +is appended, column names are also displayed. 
+ + + + + \gset [ prefix ] diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 8cc4de3878..b3461291eb 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -332,7 +332,8 @@ exec_command(const char *cmd, status = exec_command_errverbose(scan_state, active_branch); else if (strcmp(cmd, "f") == 0) status = exec_command_f(scan_state, active_branch); - else if (strcmp(cmd, "g") == 0 || strcmp(cmd, "gx") == 0) + else if (strcmp(cmd, "g") == 0 || strcmp(cmd, "gx") == 0 || + strcmp(cmd, "graw") == 0 || strcmp(cmd, "graw+") == 0) status = exec_command_g(scan_state, active_branch, cmd); else if (strcmp(cmd, "gdesc") == 0) status = exec_command_gdesc(scan_state, active_branch); @@ -1232,6 +1233,7 @@ exec_command_f(PsqlScanState scan_state, bool active_branch) /* * \g [filename] -- send query, optionally with output to file/pipe + * \graw [filename] -- same as \g with raw format * \gx [filename] -- same as \g, with expanded mode forced */ static backslashResult @@ -1254,6 +1256,10 @@ exec_command_g(PsqlScanState scan_state, bool active_branch, const char *cmd) free(fname); if (strcmp(cmd, "gx") == 0) pset.g_expanded = true; + else if (strcmp(cmd, "graw") == 0) + pset.g_raw = true; + else if (strcmp(cmd, "graw+") == 0) + pset.g_raw_header = true; status = PSQL_CMD_SEND; } else diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index 7a91a44b2b..9f7ef51dfb 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -865,6 +865,14 @@ PrintQueryTuples(const PGresult *results) if (pset.g_expanded) my_popt.topt.expanded = 1; + /* one-shot raw output requested by \raw and \graw+ */ + else if (pset.g_raw || pset.g_raw_header) + { + my_popt.topt.format = PRINT_UNALIGNED; + my_popt.topt.tuples_only = true; + my_popt.topt.force_column_header = pset.g_raw_header; + } + /* write output to \g argument, if any */ if (pset.gfname) { @@ -1517,6 +1525,10 @@ sendquery_cleanup: /* reset \gx's expanded-mode flag */ pset.g_expanded = false; + /* reset 
\graw flags */ + pset.g_raw = false; + pset.g_raw_header = false; + /* reset \gset trigger */ if (pset.gset_prefix) { diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c index a926c40b9b..e573711434 100644 --- a/src/bin/psql/help.c +++ b/src/bin/psql/help.c @@ -167,7 +167,7 @@ slashUsage(unsigned short int pager) * Use "psql --help=commands | wc" to count correctly. It's okay to count * the USE_READLINE line even in builds without that. */ - output = PageOutput(125, pager ? &(pset.popt.topt) : NULL); + output = PageOutput(126, pager ? &(pset.popt.topt) : NULL); fprintf(output, _("General\n")); fprintf(output, _(" \\copyright show PostgreSQL usage and distribution terms\n")); @@ -176,6 +176,7 @@ slashUsage(unsigned short int pager) fprintf(output, _(" \\g [FILE] or ; execute query (and send results to file or |pipe)\n")); fprintf(output, _(" \\gdesc describe result of query, without executing it\n")); fprintf(output, _(" \\gexec execute query, then execute each value in its result\n")); + fprintf(output, _(" \\graw[+] [FILE]as \\g, but forces unaligned raw output mode\n")); fprintf(output, _(" \\gset [PREFIX] execute query and store results in psql variables\n")); fprintf(output, _(" \\gx [FILE] as \\g, but forces expanded output mode\n")); fprintf(output, _(" \\q quit psql\
Re: [HACKERS] path toward faster partition pruning
Ooops! The following comment is wrong. Please ignore it. At Fri, 10 Nov 2017 14:38:11 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20171110.143811.97616847.horiguchi.kyot...@lab.ntt.co.jp> > Those two conditions are not orthogonal. Maybe something like > following seems more understantable. > > > if (!constfalse) > > { > > /* No constraints on the keys, so, return *all* partitions. */ > > if (nkeys == 0) > > return bms_add_range(result, 0, partdesc->nparts - 1); > > > > result = get_partitions_for_keys(relation, &keys); > > } > > I'm not sure what is meant to be (formally) done here, but it > seems to me that OrExpr is assumed to be only at the top level of > the caluses. So the following (just an example, but meaningful > expression in this shpape must exists.) expression is perhaps > wrongly processed here. > > CREATE TABLE p (a int) PARITION BY (a); > CREATE TABLE c1 PARTITION OF p FOR VALUES FROM (0) TO (10); > CREATE TABLE c2 PARTITION OF p FOR VALUES FROM (10) TO (20); > > SELECT * FROM p WHERE a = 15 AND (a = 15 OR a = 5); > > get_partitions_for_keys() returns both c1 and c2 and still > or_clauses here holds (a = 15 OR a = 5) and the function tries to > *add* partitions for a = 15 and a = 5 separetely. This is working fine. Sorry for the bogus comment. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] [POC] hash partitioning
On Fri, Nov 10, 2017 at 4:41 AM, Robert Haas wrote: > On Wed, Nov 1, 2017 at 6:16 AM, amul sul wrote: >> Fixed in the 0003 patch. > > I have committed this patch set with the attached adjustments. > Thanks a lot for your support & a ton of thanks to all reviewers. Regards, Amul
Re: [HACKERS] path toward faster partition pruning
Hello, this is the second part of the review. At Fri, 10 Nov 2017 12:30:00 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20171110.123000.151902771.horiguchi.kyot...@lab.ntt.co.jp> > In 0002, bms_add_range has a bit naive-looking loop > In 0003, In 0004, The name get_partitions_from_clauses_guts(): it seems to me that we usually use _internal for the recursive part of some function. (I have the same comment as David about the comment for get_partition_from_clause()) About the same function: Couldn't we get out in the fast path when clauses == NIL? +/* No constraints on the keys, so, return *all* partitions. */ +result = bms_add_range(result, 0, partdesc->nparts - 1); This allows us to return immediately here. And just above this, + if (nkeys > 0 && !constfalse) + result = get_partitions_for_keys(relation, &keys); + else if (!constfalse) Those two conditions are not orthogonal. Maybe something like the following seems more understandable. > if (!constfalse) > { > /* No constraints on the keys, so, return *all* partitions. */ > if (nkeys == 0) > return bms_add_range(result, 0, partdesc->nparts - 1); > > result = get_partitions_for_keys(relation, &keys); > } I'm not sure what is meant to be (formally) done here, but it seems to me that OrExpr is assumed to be only at the top level of the clauses. So the following expression (just an example, but a meaningful expression in this shape must exist) is perhaps wrongly processed here. CREATE TABLE p (a int) PARTITION BY (a); CREATE TABLE c1 PARTITION OF p FOR VALUES FROM (0) TO (10); CREATE TABLE c2 PARTITION OF p FOR VALUES FROM (10) TO (20); SELECT * FROM p WHERE a = 15 AND (a = 15 OR a = 5); get_partitions_for_keys() returns both c1 and c2 and still or_clauses here holds (a = 15 OR a = 5) and the function tries to *add* partitions for a = 15 and a = 5 separately. I'd like to pause here.
regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] [POC] Faster processing at Gather node
On Fri, Nov 10, 2017 at 8:36 AM, Robert Haas wrote: > On Thu, Nov 9, 2017 at 9:31 PM, Amit Kapila wrote: >> Have you set force_parallel_mode=regress; before running the >> statement? > > Yes, I tried that first. > >> If so, then why you need to tune other parallel query >> related parameters? > > Because I couldn't get it to break the other way, I then tried this. > > Instead of asking me what I did, can you tell me what I need to do? > Maybe a self-contained reproducible test case including exactly what > goes wrong on your end? > I think we are missing something very basic, because you should see the failure by executing that statement with force_parallel_mode=regress even in a freshly created database. I guess the missing point is that I am using an assertions-enabled build and probably you are not (if this is the reason, it should have struck me the first time). Anyway, I am writing the steps to reproduce the issue. 1. initdb 2. start server 3. connect using psql 4. set force_parallel_mode=regress; 5. Create Table as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r'; -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] path toward faster partition pruning
Hello, At Fri, 10 Nov 2017 09:34:57 +0900, Amit Langote wrote in <5fcb1a9f-b4ad-119d-14c7-282c30c7f...@lab.ntt.co.jp> > Hi Amul. > > On 2017/11/09 20:05, amul sul wrote: > > On Mon, Nov 6, 2017 at 3:31 PM, Amit Langote > > wrote: > >> On 2017/11/06 14:32, David Rowley wrote: > >>> On 6 November 2017 at 17:30, Amit Langote wrote: > On 2017/11/03 13:32, David Rowley wrote: > > On 31 October 2017 at 21:43, Amit Langote wrote: > > [] > >> > >> Attached updated set of patches, including the fix to make the new pruning > >> code handle Boolean partitioning. > >> > > > > I am getting following warning on mac os x: > > Thanks for the review. > > > partition.c:1800:24: warning: comparison of constant -1 with > > expression of type 'NullTestType' > > (aka 'enum NullTestType') is always false > > [-Wtautological-constant-out-of-range-compare] > > if (keynullness[i] == -1) > > ~~ ^ ~~ > > partition.c:1932:25: warning: comparison of constant -1 with > > expression of type 'NullTestType' > > (aka 'enum NullTestType') is always false > > [-Wtautological-constant-out-of-range-compare] > > if (keynullness[i] == -1) > > ~~ ^ ~~ > > 2 warnings generated. > > > > > > Comment for 0004 patch: > > 270 + /* -1 represents an invalid value of NullTestType. */ > > 271 + memset(keynullness, -1, PARTITION_MAX_KEYS * sizeof(NullTestType)); > > > > I think we should not use memset to set a value other than 0 or true/false. > > This will work for -1 on the system where values are stored in the 2's > > complement but I am afraid of other architecture. > > OK, I will remove all instances of comparing and setting variables of type > NullTestType to a value of -1. In 0002, bms_add_range has a bit naive-looking loop + while (wordnum <= uwordnum) + { + bitmapword mask = (bitmapword) ~0; + + /* If working on the lower word, zero out bits below 'lower'. 
*/ + if (wordnum == lwordnum) + { + int lbitnum = BITNUM(lower); + mask >>= lbitnum; + mask <<= lbitnum; + } + + /* Likewise, if working on the upper word, zero bits above 'upper' */ + if (wordnum == uwordnum) + { + int ushiftbits = BITS_PER_BITMAPWORD - (BITNUM(upper) + 1); + mask <<= ushiftbits; + mask >>= ushiftbits; + } + + a->words[wordnum++] |= mask; + } Without some aggressive optimization, the loop takes most of the time to check-and-jump for nothing especially with many partitions and somewhat unintuitive. The following uses a bit tricky bitmap operation but is straightforward as a whole. = /* fill the bits upper from BITNUM(lower) (0-based) of the first word */ a->workds[wordnum++] += ~(bitmapword)((1 << BITNUM(lower)) - 1); /* fill up intermediate words */ while (wordnum < uwordnum) a->words[wordnum++] = ~(bitmapword) 0; /* fill up to BITNUM(upper) bit (0-based) of the last word */ a->workds[wordnum++] |= (~(bitmapword) 0) >> (BITS_PER_BITMAPWORD - (BITNUM(upper) - 1)); = In 0003, +match_clauses_to_partkey(RelOptInfo *rel, ... + if (rinfo->pseudoconstant && +(IsA(clause, Const) && + Const *) clause)->constisnull) || + !DatumGetBool(((Const *) clause)->constvalue + { +*constfalse = true; +continue; If we call this function in both conjunction and disjunction context (the latter is only in recursive case). constfalse == true means no need of any clauses for the former case. Since (I think) just a list of RestrictInfo is expected to be treated as a conjunction (it's quite doubious, though..), we might be better call this for each subnodes of a disjunction. Or, like match_clauses_to_index, we might be better provide match_clause_to_partkey(rel, rinfo, contains_const), which returns NULL if constfalse. (I'm not self-confident on this..) + /* + * If no commutator exists, cannot flip the qual's args, + * so give up. 
+ */ + if (!OidIsValid(expr_op)) +continue; I suppose it's better to leftop and rightop as is rather than flipping over so that var is placed left-side. Does that make things so complex? + * It the operator happens to be '<>', which is never listed If? +if (!op_in_opfamily(expr_op, partopfamily)) +{ + Oidnegator = get_negator(expr_op); + + if (!OidIsValid(negator) || +!op_in_opfamily(negator, partopfamily)) +continue; classify_partition_bounding_keys() checks the same thing by checking whether the negator's strategy is BTEquealStrategyNumber. (I'm not sure the
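On the memset point raised above: memset with -1 writes 0xFF into every byte, which only reads back as -1 on two's-complement machines, and comparing an enum variable against -1 is what draws the out-of-range warning. A sketch of a warning-free, portable alternative (the NULLNESS_UNKNOWN sentinel and init_keynullness() helper are my invention for illustration, not the patch's code):

```c
#include <stddef.h>

typedef enum NullTestType
{
	IS_NULL,
	IS_NOT_NULL,
	NULLNESS_UNKNOWN			/* explicit sentinel; avoids abusing -1 */
} NullTestType;

#define PARTITION_MAX_KEYS 32

/*
 * Initialize the per-key nullness array with a plain loop.  Unlike
 * memset(keynullness, -1, ...), this neither assumes two's complement
 * nor requires comparing an enum against an out-of-range constant later.
 */
static void
init_keynullness(NullTestType *keynullness, size_t n)
{
	for (size_t i = 0; i < n; i++)
		keynullness[i] = NULLNESS_UNKNOWN;
}
```

The later checks then compare against NULLNESS_UNKNOWN instead of -1, which is also self-documenting.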
Re: [HACKERS] [POC] Faster processing at Gather node
On Thu, Nov 9, 2017 at 9:31 PM, Amit Kapila wrote: > Have you set force_parallel_mode=regress; before running the > statement? Yes, I tried that first. > If so, then why you need to tune other parallel query > related parameters? Because I couldn't get it to break the other way, I then tried this. Instead of asking me what I did, can you tell me what I need to do? Maybe a self-contained reproducible test case including exactly what goes wrong on your end? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
On Fri, Nov 10, 2017 at 10:00 AM, Tom Lane wrote: > Stephen Frost writes: >> I'm guessing no, which essentially means that *we* consider access to >> lo_import/lo_export to be equivalent to superuser and therefore we're >> not going to implement anything to try and prevent the user who has >> access to those functions from becoming superuser. If we aren't willing >> to do that, then how can we really say that there's some difference >> between access to these functions and being a superuser? > > We seem to be talking past each other. Yes, if a user has malicious > intentions, it's possible to parlay lo_export into obtaining a superuser > login (I'm less sure that that's necessarily true for lo_import). > That does NOT make it "equivalent", except perhaps in the view of someone > who is only considering blocking malevolent actors. It does not mean that > there's no value in preventing a task that needs to run lo_export from > being able to accidentally destroy any data in the database. There's a > range of situations where you are concerned about accidents and errors, > not malicious intent; but your argument ignores those use-cases. That will not sound like much of a surprise since I spawned the original thread, but like Robert I understand that getting rid of these superuser checks is a goal we are trying to reach, to allow admins more flexibility in handing out permissions on a subset of objects. Forcing an admin to give full superuser rights to a user who only needs to import and export large objects is the wrong concept. -- Michael
Re: [HACKERS] [POC] Faster processing at Gather node
On Fri, Nov 10, 2017 at 12:05 AM, Robert Haas wrote: > On Thu, Nov 9, 2017 at 12:08 AM, Amit Kapila wrote: >> This change looks suspicious to me. I think here we can't use the >> tupDesc constructed from targetlist. One problem, I could see is that >> the check for hasOid setting in tlist_matches_tupdesc won't give the >> correct answer. In case of the scan, we use the tuple descriptor >> stored in relation descriptor which will allow us to take the right >> decision in tlist_matches_tupdesc. If you try the statement CREATE >> TABLE as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r'; in >> force_parallel_mode=regress, then you can reproduce the problem I am >> trying to highlight. > > I tried this, but nothing seemed to be obviously broken. Then I > realized that the CREATE TABLE command wasn't using parallelism, so I > retried with parallel_setup_cost = 0, parallel_tuple_cost = 0, and > min_parallel_table_scan_size = 0. That got it to use parallel query, > but I still don't see anything broken. Can you clarify further? > Have you set force_parallel_mode=regress; before running the statement? If so, then why do you need to tune the other parallel query related parameters? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Runtime Partition Pruning
On Thu, Nov 9, 2017 at 9:01 PM, Robert Haas wrote: > On Thu, Nov 9, 2017 at 6:18 AM, Beena Emerson wrote: >> The code still chooses the custom plan instead of the generic plan for >> the prepared statements. I am working on it. > > I don't think it's really the job of this patch to do anything about > that problem. > +1. I think if we really want to do something about plan choice when partitions are involved that should be done as a separate patch. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 9, 2017 at 4:53 PM, Andres Freund wrote: > Primarily because it's not an anti-corruption tool. I'd be surprised if > there weren't ways to corrupt the page using these corruptions that > aren't detected by it. It's very hard to assess the risk of missing something that's actually detectable with total confidence, but I think that the check is actually very thorough. > But even if it were, I don't think there's > enough information to do so in the general case. You very well can end > up with pages where subsequent hot pruning has removed a good bit of the > direct evidence of this bug. Sure, but maybe those are cases that can't get any worse anyway. So the question of avoiding making it worse doesn't arise. > But I'm not really sure why the error detection capabilities matter > much for the principal point I raised, which is how much work we need to > do to not further worsen the corruption. You're right. Just trying to put the risk in context, and to understand the extent of the concern that you have. -- Peter Geoghegan
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Stephen Frost writes: > I'm guessing no, which essentially means that *we* consider access to > lo_import/lo_export to be equivalent to superuser and therefore we're > not going to implement anything to try and prevent the user who has > access to those functions from becoming superuser. If we aren't willing > to do that, then how can we really say that there's some difference > between access to these functions and being a superuser? We seem to be talking past each other. Yes, if a user has malicious intentions, it's possible to parlay lo_export into obtaining a superuser login (I'm less sure that that's necessarily true for lo_import). That does NOT make it "equivalent", except perhaps in the view of someone who is only considering blocking malevolent actors. It does not mean that there's no value in preventing a task that needs to run lo_export from being able to accidentally destroy any data in the database. There's a range of situations where you are concerned about accidents and errors, not malicious intent; but your argument ignores those use-cases. To put it more plainly: your argument is much like saying that a person who knows a sudo password might as well do everything they ever do as root. regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-11-09 16:45:07 -0800, Peter Geoghegan wrote:
> On Thu, Nov 9, 2017 at 4:17 PM, Andres Freund wrote:
> >> Actually, on second thought, I take that back -- I don't think that
> >> REINDEXing will even finish once a HOT chain is broken by the bug.
> >> IndexBuildHeapScan() actually does quite a good job of making sure
> >> that HOT chains are sane, which is how the enhanced amcheck notices
> >> the bug here in practice.
> >
> > I think that's too optimistic.
>
> Why? Because the "find the TID of the root" logic in
> IndexBuildHeapScan()/heap_get_root_tuples() won't reliably find the
> actual root (it might be some other HOT chain root following TID
> recycling by VACUUM)?

Primarily because it's not an anti-corruption tool. I'd be surprised if there weren't ways to corrupt the page using these corruptions that aren't detected by it. But even if it were, I don't think there's enough information to do so in the general case. You very well can end up with pages where subsequent hot pruning has removed a good bit of the direct evidence of this bug.

But I'm not really sure why the error detection capabilities matter much for the principal point I raised, which is how much work we need to do to not further worsen the corruption.

Greetings, Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix bloom WAL tap test
On Thu, Nov 9, 2017 at 7:51 PM, Alexander Korotkov wrote: > On Wed, Nov 8, 2017 at 5:46 AM, Masahiko Sawada > wrote: >> > So I think >> > that you should instead do something like that: >> > >> > --- a/contrib/bloom/Makefile >> > +++ b/contrib/bloom/Makefile >> > @@ -20,5 +20,12 @@ include $(top_builddir)/src/Makefile.global >> > include $(top_srcdir)/contrib/contrib-global.mk >> > endif >> > >> > +installcheck: wal-installcheck >> > + >> > +check: wal-check >> > + >> > +wal-installcheck: >> > + $(prove_installcheck) >> > + >> > wal-check: temp-install >> > $(prove_check) >> > >> > This works for me as I would expect it should. >> >> Looks good to me too. > > OK, then so be it :) Thanks for the new version. This one, as well as the switch to psql_safe in https://www.postgresql.org/message-id/CAPpHfduxgEYF_0BTs-mxGC4=w5sw8rnUbq9BSTp1Wq7=nwr...@mail.gmail.com are good for a committer lookup IMO. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 9, 2017 at 4:17 PM, Andres Freund wrote: >> Actually, on second thought, I take that back -- I don't think that >> REINDEXing will even finish once a HOT chain is broken by the bug. >> IndexBuildHeapScan() actually does quite a good job of making sure >> that HOT chains are sane, which is how the enhanced amcheck notices >> the bug here in practice. > > I think that's too optimistic. Why? Because the "find the TID of the root" logic in IndexBuildHeapScan()/heap_get_root_tuples() won't reliably find the actual root (it might be some other HOT chain root following TID recycling by VACUUM)? Assuming that's what you meant: I would have thought that the xmin/xmax matching within heap_get_root_tuples() makes the sanity checking fairly reliable in practice. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] User defined data types in Logical Replication
Hi,

We are getting the below error while trying to use logical replication with user-defined data types in a C program (when calling the elog function).

ERROR: XX000: cache lookup failed for type X  # X is the remote type's OID

It occurs in the slot_store_error_callback function in worker.c, when remotetypoid does not exist in the local pg_type. I have tried to write a patch (attached). I think it is not friendly to change the typename to the OID's one, but I could not find an easy way to get the typename from the OID on the remote host.

---
Thanks and best regards,
Dang Minh Huong
NEC Solution Innovators, Ltd.
http://www.nec-solutioninnovators.co.jp/en/

logicalrep_typmap.patch
Description: logicalrep_typmap.patch

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] path toward faster partition pruning
Hi Amul.

On 2017/11/09 20:05, amul sul wrote:
> On Mon, Nov 6, 2017 at 3:31 PM, Amit Langote wrote:
>> On 2017/11/06 14:32, David Rowley wrote:
>>> On 6 November 2017 at 17:30, Amit Langote wrote:
>>>> On 2017/11/03 13:32, David Rowley wrote:
>>>>> On 31 October 2017 at 21:43, Amit Langote wrote:
>>>>> []
>>
>> Attached updated set of patches, including the fix to make the new pruning
>> code handle Boolean partitioning.
>
> I am getting the following warnings on Mac OS X:

Thanks for the review.

> partition.c:1800:24: warning: comparison of constant -1 with expression of
> type 'NullTestType' (aka 'enum NullTestType') is always false
> [-Wtautological-constant-out-of-range-compare]
>         if (keynullness[i] == -1)
>             ~~~~~~~~~~~~~~ ^  ~~
> partition.c:1932:25: warning: comparison of constant -1 with expression of
> type 'NullTestType' (aka 'enum NullTestType') is always false
> [-Wtautological-constant-out-of-range-compare]
>         if (keynullness[i] == -1)
>             ~~~~~~~~~~~~~~ ^  ~~
> 2 warnings generated.
>
> Comment for 0004 patch:
> 270 +   /* -1 represents an invalid value of NullTestType. */
> 271 +   memset(keynullness, -1, PARTITION_MAX_KEYS * sizeof(NullTestType));
>
> I think we should not use memset to set a value other than 0 or true/false.
> This will work for -1 on systems where values are stored in two's
> complement, but I am not sure about other architectures.

OK, I will remove all instances of comparing and setting variables of type NullTestType to a value of -1.

Thanks, Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 9, 2017 at 4:17 PM, Andres Freund wrote: >> I don't follow you here. Why would REINDEXing make the rows that >> should be dead disappear again, even for a short period of time? > > It's not the REINDEX that makes them reappear. Of course. I was just trying to make sense of what you said. > It's the second > vacuum. The reindex part was about $user trying to fix the problem... > As you need two vacuums with appropriate cutoffs to hit the "rows > revive" problem, that'll often in practice not happen immediately. This explanation clears things up, though. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-11-09 16:02:17 -0800, Peter Geoghegan wrote: > > What I'm currently wondering about is how much we need to harden > > postgres against such existing corruption. If e.g. the hot chains are > > broken somebody might have reindexed thinking the problem is fixed - but > > if they then later vacuum everything goes to shit again, with dead rows > > reappearing. > > I don't follow you here. Why would REINDEXing make the rows that > should be dead disappear again, even for a short period of time? It's not the REINDEX that makes them reappear. It's the second vacuum. The reindex part was about $user trying to fix the problem... As you need two vacuums with appropriate cutoffs to hit the "rows revive" problem, that'll often in practice not happen immediately. > Actually, on second thought, I take that back -- I don't think that > REINDEXing will even finish once a HOT chain is broken by the bug. > IndexBuildHeapScan() actually does quite a good job of making sure > that HOT chains are sane, which is how the enhanced amcheck notices > the bug here in practice. I think that's too optimistic. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Thu, Nov 9, 2017 at 2:56 PM, Stephen Frost wrote: > > Further, I agree entirely that we > > shouldn't be deciding that certain capabilities are never allowed to be > > given to a user- but that's why superuser *exists* and why it's possible > > to give superuser access to multiple users. The question here is if > > it's actually sensible for us to make certain actions be grantable to > > non-superusers which allow that role to become, or to essentially be, a > > superuser. What that does, just as it does in the table case, from my > > viewpoint at least, is make it *look* to admins like they're grant'ing > > something less than superuser when, really, they're actually giving the > > user superuser-level access. That violates the POLA because the admin > > didn't write 'ALTER USER joe WITH SUPERUSER', they just wrote 'GRANT > > EXECUTE ON lo_import() TO joe;'. > > This is exactly the kind of thing that I'm talking about. Forcing an > administrator to hand out full superuser access instead of being able > to give just lo_import() makes life difficult for users for no good > reason. The existence of a theoretically-exploitable security > vulnerability is not tantamount to really having access, especially on > a system with a secured logging facility. This is where I disagree. The administrator *is* giving the user superuser-level access, they're just doing it in a different way from explicitly saying 'ALTER USER .. WITH SUPERUSER' and that's exactly the issue that I have. This isn't a theoretical exploit- the user with lo_import/lo_export access is able to read and write any file on the filesystem which the PG OS has access to, including things like pg_hba.conf in most configurations, the file backing pg_authid, the OS user's .bashrc, the Kerberos principal, the CA public key, the SSL keys, tables owned by other users that the in-database role doesn't have access to, et al. 
Further, will we consider these *actual*, non-theoretical, methods to gain superuser access, or to bypass the database authorization system, to be security issues or holes to plug? I'm guessing no, which essentially means that *we* consider access to lo_import/lo_export to be equivalent to superuser and therefore we're not going to implement anything to try and prevent the user who has access to those functions from becoming superuser. If we aren't willing to do that, then how can we really say that there's some difference between access to these functions and being a superuser?

Thanks!

Stephen

signature.asc
Description: Digital signature
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 9, 2017 at 2:24 PM, Andres Freund wrote: > Attached is a version of the already existing regression test that both > reproduces the broken hot chain (and thus failing index lookups) and > then also the tuple reviving. I don't see any need for letting this run > with arbitrary permutations. I thought that the use of every possible permutation was excessive, myself. It left us with an isolation test that didn't precisely describe the behavior that is tested. What you came up with seems far, far better, especially because of the comments you included. The mail message-id references seem to add a lot, too. > What I'm currently wondering about is how much we need to harden > postgres against such existing corruption. If e.g. the hot chains are > broken somebody might have reindexed thinking the problem is fixed - but > if they then later vacuum everything goes to shit again, with dead rows > reappearing. I don't follow you here. Why would REINDEXing make the rows that should be dead disappear again, even for a short period of time? It might do so for index scans, I suppose, but not for sequential scans. Are you concerned about a risk of somebody not noticing that sequential scans are still broken? Actually, on second thought, I take that back -- I don't think that REINDEXing will even finish once a HOT chain is broken by the bug. IndexBuildHeapScan() actually does quite a good job of making sure that HOT chains are sane, which is how the enhanced amcheck notices the bug here in practice. (Before this bug was discovered, I would have expected amcheck to catch problems like it slightly later, during the Bloom filter probe for that HOT chain...but, in fact, it never gets there with corruption from this bug in practice, AFAIK.) -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup --progress output for batch execution
On Fri, Sep 29, 2017 at 4:00 PM, Martin Marques < martin.marq...@2ndquadrant.com> wrote:
> Hi,
>
> Some time ago I had to work on a system where I was cloning a standby
> using pg_basebackup, that didn't have screen or tmux. For that reason I
> redirected the output to a file and ran it with nohup.
>
> I normally (always actually ;) ) run pg_basebackup with --progress and
> --verbose so I can follow how much has been done. When done on a tty you
> get a nice progress bar with the percentage that has been cloned.
>
> The problem came with the execution and redirection of the output, as
> the --progress option will write a *very* long line!
>
> Back then I thought of writing a patch (actually someone suggested I do
> so) to add a --batch-mode option which would change the behavior
> pg_basebackup has when printing the output messages.

While separate lines in the output file are better than one very long line, it still doesn't seem so useful. If you aren't watching it in real time, then you really need to have a timestamp on each line so that you can interpret the output. The lines are about one second apart, but I don't know how robust that timing is under all conditions.

I think I agree with Arthur that I'd rather have the decision made by inspecting whether output is going to a tty, rather than by adding another command line option. But maybe that is not detected robustly enough across all platforms and circumstances?

Cheers, Jeff
Re: [HACKERS] libpq connection strings: control over the cipher suites?
On 11/09/2017 03:17 PM, Michael Paquier wrote: > On Fri, Nov 10, 2017 at 2:53 AM, Joe Conway wrote: >> On 11/09/2017 03:27 AM, Graham Leggett wrote: >>> Is there a parameter or mechanism for setting the required ssl cipher list >>> from the client side? >> >> I don't believe so. That is controlled by ssl_ciphers, which requires a >> restart in order to change. >> >> https://www.postgresql.org/docs/10/static/runtime-config-connection.html#GUC-SSL-CIPHERS > > Since commit de41869 present in v10, SSL parameters can be reloaded. Oh, cool, I must have missed that -- thanks! Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Michael Paquier writes: > On Fri, Nov 10, 2017 at 7:05 AM, Tom Lane wrote: >> I did miss the need to fix the docs, and am happy to put in some strong >> wording about the security hazards of these functions while fixing the >> docs. But I do not think that leaving them with hardwired superuser >> checks is an improvement over being able to control them with GRANT. > Sorry about that. lobj.sgml indeed mentions superusers. Do you need a patch? No, I can write it. But I'm going to wait to see where this debate settles before expending effort on the docs. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq connection strings: control over the cipher suites?
On Fri, Nov 10, 2017 at 2:53 AM, Joe Conway wrote:
> On 11/09/2017 03:27 AM, Graham Leggett wrote:
>> Is there a parameter or mechanism for setting the required ssl cipher list
>> from the client side?
>
> I don't believe so. That is controlled by ssl_ciphers, which requires a
> restart in order to change.
>
> https://www.postgresql.org/docs/10/static/runtime-config-connection.html#GUC-SSL-CIPHERS

Since commit de41869, present in v10, SSL parameters can be reloaded. On the libpq side there is only an API to look at the ciphers set by the server, via PQsslAttribute().

--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
On Fri, Nov 10, 2017 at 7:05 AM, Tom Lane wrote: > I did miss the need to fix the docs, and am happy to put in some strong > wording about the security hazards of these functions while fixing the > docs. But I do not think that leaving them with hardwired superuser > checks is an improvement over being able to control them with GRANT. Sorry about that. lobj.sgml indeed mentions superusers. Do you need a patch? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] hash partitioning
On Wed, Nov 1, 2017 at 6:16 AM, amul sul wrote: > Fixed in the 0003 patch. I have committed this patch set with the attached adjustments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company hash-adjustments.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] A hook for session start
On Fri, Nov 10, 2017 at 2:32 AM, Fabrízio de Royes Mello wrote: > On Thu, Nov 9, 2017 at 12:09 AM, Michael Paquier > wrote: >> +++ b/src/test/modules/test_session_hooks/session_hooks.conf >> @@ -0,0 +1 @@ >> +shared_preload_libraries = 'test_session_hooks' >> Don't you think that this should be a GUC? My previous comment >> outlined that. I won't fight hard on that point in any case, don't >> worry. I just want to make things clear :) > > Ooops... my fault... fixed! > > Thanks again!! +/* GUC variables */ +static char *session_hook_username = NULL; This causes the module to crash if test_session_hooks.username is not set. I would recommend just assigning a default value, say "postgres". -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-11-04 06:15:00 -0700, Andres Freund wrote:
> The reason for that is that I hadn't yet quite figured out how the bug I
> described in the commit message (and the previously committed testcase)
> would cause that. I was planning to diagnose / experiment more with this
> and write an email if I couldn't come up with an explanation. The
> committed test does *not* actually trigger that.
>
> The reason I couldn't quite figure out how the problem triggers is that
> [ long explanation ]

Attached is a version of the already existing regression test that both reproduces the broken hot chain (and thus failing index lookups) and then also the tuple reviving. I don't see any need for letting this run with arbitrary permutations.

Thanks to whoever allowed isolationtester permutations to go over multiple lines and allow comments. I was wondering about adding that as a feature just to discover it's already there ;)

What I'm currently wondering about is how much we need to harden postgres against such existing corruption. If e.g. the hot chains are broken somebody might have reindexed thinking the problem is fixed - but if they then later vacuum everything goes to shit again, with dead rows reappearing. There's no way we can fix hot chains after the fact, but preventing dead rows from reappearing seems important.

A minimal version of that is fairly easy - we slap a bunch of if (!TransactionIdDidCommit()) elog(ERROR) checks at various code paths. But that'll often trigger clog access errors when the problem occurred - if we want to do better we need to pass down relfrozenxid/relminmxid to a few functions. I'm inclined to do so, but it'll make the patch larger...

Comments?

Greetings, Andres Freund

# Test for interactions of tuple freezing with dead, as well as recently-dead
# tuples using multixacts via FOR KEY SHARE.
setup
{
  DROP TABLE IF EXISTS tab_freeze;
  CREATE TABLE tab_freeze (
    id int PRIMARY KEY,
    name char(3),
    x int);
  INSERT INTO tab_freeze VALUES (1, '111', 0);
  INSERT INTO tab_freeze VALUES (3, '333', 0);
}

teardown
{
  DROP TABLE tab_freeze;
}

session "s1"
step "s1_begin"     { BEGIN; }
step "s1_update"    { UPDATE tab_freeze SET x = x + 1 WHERE id = 3; }
step "s1_commit"    { COMMIT; }
step "s1_vacuum"    { VACUUM FREEZE tab_freeze; }
step "s1_selectone" {
    BEGIN;
    SET LOCAL enable_seqscan = false;
    SET LOCAL enable_bitmapscan = false;
    SELECT * FROM tab_freeze WHERE id = 3;
    COMMIT;
}
step "s1_selectall" { SELECT * FROM tab_freeze ORDER BY name, id; }

session "s2"
step "s2_begin"     { BEGIN; }
step "s2_key_share" { SELECT id FROM tab_freeze WHERE id = 3 FOR KEY SHARE; }
step "s2_commit"    { COMMIT; }
step "s2_vacuum"    { VACUUM FREEZE tab_freeze; }

session "s3"
step "s3_begin"     { BEGIN; }
step "s3_key_share" { SELECT id FROM tab_freeze WHERE id = 3 FOR KEY SHARE; }
step "s3_commit"    { COMMIT; }
step "s3_vacuum"    { VACUUM FREEZE tab_freeze; }

# This permutation verifies that a previous bug
# https://postgr.es/m/e5711e62-8fdf-4dca-a888-c200bf6b5...@amazon.com
# https://postgr.es/m/20171102112019.33wb7g5wp4zpj...@alap3.anarazel.de
# is not reintroduced. We used to make wrong pruning / freezing
# decisions for multixacts, which could lead to a) broken hot chains b)
# dead rows being revived.
permutation "s1_begin" "s2_begin" "s3_begin"    # start transactions
   "s1_update" "s2_key_share" "s3_key_share"    # have xmax be a multi with an updater, updater being oldest xid
   "s1_update"      # create additional row version that has multis
   "s1_commit" "s2_commit" # commit both updater and share locker
   "s2_vacuum"      # due to bug in freezing logic, we used to *not* prune updated row, and then froze it
   "s1_selectone"   # if hot chain is broken, the row can't be found via index scan
   "s3_commit"      # commit remaining open xact
   "s2_vacuum"      # pruning / freezing in broken hot chains would unset xmax, reviving rows
   "s1_selectall"   # show borkedness

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect option to forgo buffer locking?
On 2017-11-09 17:14:11 -0500, Tom Lane wrote: > If we do this, I'd suggest exposing it as a separate SQL function > get_raw_page_unlocked() rather than as an option to get_raw_page(). > > The reasoning is that if we ever allow these functions to be controlled > via GRANT instead of hardwired superuser checks (cf nearby debate about > lo_import/lo_export), one might reasonably consider the unlocked form as > more risky than the locked form, and hence not wish to have to give out > privileges to both at once. Good idea! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect option to forgo buffer locking?
Robert Haas writes: > On Thu, Nov 9, 2017 at 12:58 PM, Andres Freund wrote: >> You can already pass arbitrary byteas to heap_page_items(), so I don't >> see how this'd change anything exposure-wise? Or are you thinking that >> users would continually do this with actual page contents and would be >> more likely to hit edge cases than if using pg_read_binary_file() or >> such (which obviously sees an out of date page)? > More the latter. It's not really an increase in terms of security > exposure, but it might lead to more trouble in practice. If we do this, I'd suggest exposing it as a separate SQL function get_raw_page_unlocked() rather than as an option to get_raw_page(). The reasoning is that if we ever allow these functions to be controlled via GRANT instead of hardwired superuser checks (cf nearby debate about lo_import/lo_export), one might reasonably consider the unlocked form as more risky than the locked form, and hence not wish to have to give out privileges to both at once. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Robert Haas writes: > On Thu, Nov 9, 2017 at 2:56 PM, Stephen Frost wrote: >> Further, I agree entirely that we >> shouldn't be deciding that certain capabilities are never allowed to be >> given to a user- but that's why superuser *exists* and why it's possible >> to give superuser access to multiple users. The question here is if >> it's actually sensible for us to make certain actions be grantable to >> non-superusers which allow that role to become, or to essentially be, a >> superuser. What that does, just as it does in the table case, from my >> viewpoint at least, is make it *look* to admins like they're grant'ing >> something less than superuser when, really, they're actually giving the >> user superuser-level access. That violates the POLA because the admin >> didn't write 'ALTER USER joe WITH SUPERUSER', they just wrote 'GRANT >> EXECUTE ON lo_import() TO joe;'. > This is exactly the kind of thing that I'm talking about. Forcing an > administrator to hand out full superuser access instead of being able > to give just lo_import() makes life difficult for users for no good > reason. The existence of a theoretically-exploitable security > vulnerability is not tantamount to really having access, especially on > a system with a secured logging facility. Exactly. I think that Stephen's argument depends on what a black-hat privilege recipient could theoretically do, but fails to consider what's useful for white-hat users. One of the standard rules for careful admins is to do as little as possible as root/superuser. If you have a situation where it's necessary to use, say, lo_import as part of a routine data import task, then the only alternative previously was to do that task as superuser. That is not an improvement, by any stretch of the imagination, over granting lo_import privileges to some otherwise-vanilla role that can run the data import task. 
We've previously discussed workarounds such as creating SECURITY DEFINER wrapper functions, but I don't think that approach does much to change the terms of the debate: it'd still be better if the wrapper function itself didn't need full superuser. I did miss the need to fix the docs, and am happy to put in some strong wording about the security hazards of these functions while fixing the docs. But I do not think that leaving them with hardwired superuser checks is an improvement over being able to control them with GRANT. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] different content of pg_depend after pg_upgrade
Hi

We have a check query based on some operations on the pg_depend table. This query gave different results depending on whether the database was migrated with pg_dump or with pg_upgrade. I found that the query itself was broken, but I also found an interesting thing. The count is 1 for any objid:

select distinct count(distinct classid), objid from pg_depend group by objid;

when the system was loaded from a dump; but when we used pg_upgrade, the previous rule no longer held. Is this expected behavior?

Regards

Pavel
[HACKERS] OpenSSL - PostgreSQL
Hi All,

I am using PostgreSQL version *9.5.7* on Red Hat Enterprise Linux *7.2*.

*OpenSSL version:* OpenSSL 1.0.1e-fips 11 Feb 2013.

I have a requirement to enable SSL in my environment with specific cipher suites; we want to exclude weak cipher suites from the OpenSSL default list. We have a list of cipher suites which are authorized for use in my environment, so client applications use one of the authorized cipher suites when configuring their application servers.

Is it required to install a different version of OpenSSL instead of the default OpenSSL on Linux? How can PostgreSQL be configured to allow specific cipher suites from different client applications?

Thanks, Chiru
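For reference, the server-side knob this question is after is the ssl_ciphers setting in postgresql.conf, which takes an OpenSSL cipher-list string; no separate OpenSSL build is needed just to narrow the allowed suites. A minimal sketch (the suite names below are purely illustrative, not a recommendation; verify any list against your OpenSSL build with `openssl ciphers -v '<list>'`):

```
# postgresql.conf -- restrict the server to an authorized cipher list.
# On 9.5 changing this requires a server restart.
ssl = on
# The 9.5 default is 'HIGH:MEDIUM:+3DES:!aNULL'; replace it with the
# authorized suites, e.g. (illustrative names only):
ssl_ciphers = 'ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256'
```

Clients then negotiate within whatever the server permits; as noted elsewhere in this digest, libpq itself has no client-side cipher-list parameter.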
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
On Thu, Nov 9, 2017 at 2:56 PM, Stephen Frost wrote: > I agree that it may not be obvious which cases lead to a relatively easy > way to obtain superuser and which don't, and that's actually why I'd > much rather it be something that we're considering and looking out for > because, frankly, we're in a much better position generally to realize > those cases than our users are. I disagree. It's flattering to imagine that PostgreSQL developers, as a class, are smarter than PostgreSQL users, but it doesn't match my observations. > Further, I agree entirely that we > shouldn't be deciding that certain capabilities are never allowed to be > given to a user- but that's why superuser *exists* and why it's possible > to give superuser access to multiple users. The question here is if > it's actually sensible for us to make certain actions be grantable to > non-superusers which allow that role to become, or to essentially be, a > superuser. What that does, just as it does in the table case, from my > viewpoint at least, is make it *look* to admins like they're grant'ing > something less than superuser when, really, they're actually giving the > user superuser-level access. That violates the POLA because the admin > didn't write 'ALTER USER joe WITH SUPERUSER', they just wrote 'GRANT > EXECUTE ON lo_import() TO joe;'. This is exactly the kind of thing that I'm talking about. Forcing an administrator to hand out full superuser access instead of being able to give just lo_import() makes life difficult for users for no good reason. The existence of a theoretically-exploitable security vulnerability is not tantamount to really having access, especially on a system with a secured logging facility. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: psql command \graw
2017-11-09 21:03 GMT+01:00 Fabien COELHO :
> Hello Pavel,
>
>> I hope I fixed all the mentioned issues.
>
> Patch applies with a warning:
>
> git apply ~/psql-graw-2.patch
> /home/fabien/psql-graw-2.patch:192: new blank line at EOF.
> +
> warning: 1 line adds whitespace errors.
>
> Otherwise it compiles. "make check" ok. doc gen ok.
>
> Two spurious empty lines are added before StoreQueryTuple.
>
> Doc: "If + is appended to the command name, a column names are displayed."
>
> I suggest instead: "When + is appended, column names are also displayed."
>
> ISTM that you can remove "force_column_header" and just set "tuple_only"
> to what you need, that is you do not need to change anything in function
> "print_unaligned_text".

The last point is not possible - I do not want to break the original tuple-only mode.

Pavel

> --
> Fabien.
[HACKERS] Inlining functions with "expensive" parameters
All,

As we try and make PostGIS more "parallel sensitive" we have been adding
costs to our functions, so that their relative CPU cost is more accurately
reflected in parallel plans. This has resulted in an odd side effect: some
of our "wrapper" functions stop giving index scans in plans [1]. This is a
problem!

An example of a "wrapper" is ST_Intersects(geom1, geom2). It combines an
index operation (geom1 && geom2) with an exact spatial test
(_ST_Intersects(geom1, geom2)). This is primarily for user convenience,
and has worked well for us for a decade and more. Having this construct
stop working is definitely a problem.

As we add costs to our functions, the odds increase that one of the
parameters to a wrapper might be a costed function. It's not uncommon to
see:

  ST_Intersects(geom, ST_SetSRID('POLYGON(...)', 4326))

It's fair to say that we really do depend on our wrappers getting inlined
basically all the time. They are simple functions; they do nothing other
than 'SELECT func1() AND func2() AND arg1 && arg2'. However, once costs
are added to the parameters, the inlining can be turned off relatively
quickly. Here's a PgSQL native example:

-- Create data table and index. Analyze.
DROP TABLE IF EXISTS boxen;
CREATE TABLE boxen AS
SELECT row_number() OVER() As gid,
       box(point(x, y), point(x+1, y+1)) AS b,
       x, y
FROM generate_series(-100,100) As y,
     generate_series(-100,100) As x;
CREATE INDEX idx_b_geom_gist ON boxen USING gist(b);
ANALYZE boxen;

-- An inlined function.
-- When set 'STRICT' it breaks index access.
-- However 'IMMUTABLE' doesn't seem to bother it.
CREATE OR REPLACE FUNCTION good_box(box, box)
RETURNS boolean
AS 'SELECT $1 OPERATOR(&&) $2 AND length(lseg(point($1),point($2))) < 3'
LANGUAGE 'sql';

-- Start with a low cost circle()
ALTER FUNCTION circle(point, double precision) COST 1;

-- [A] Query plan hits index
EXPLAIN SELECT gid FROM boxen
WHERE good_box(boxen.b, box(circle(point(20.5, 20.5), 2)));

-- [B] Query plan hits index
EXPLAIN SELECT gid
FROM boxen,
     (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c
WHERE good_box(boxen.b, box(circle(point(c.x, c.y), 2)));

-- Increase cost of circle
ALTER FUNCTION circle(point, double precision) COST 100;

-- [B] Query plan does not hit index!
EXPLAIN SELECT gid
FROM boxen,
     (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c
WHERE good_box(boxen.b, box(circle(point(c.x, c.y), 2)));

The inlining is getting tossed out on a test of how expensive the function
parameters are [2]. As a result, we lose what is really the correct plan,
and get a sequence scan instead of an index scan. The test of parameter
cost seems quite old (15+ years) and perhaps didn't anticipate highly
variable individual function costs (or maybe it did).

As it stands though, PostGIS is currently stuck choosing between having
costs on our functions or having our inlined wrappers, because we cannot
have both at the same time.

Any thoughts? Thanks!

P.

[1] https://trac.osgeo.org/postgis/ticket/3675#comment:18
[2] https://github.com/postgres/postgres/blob/ae20b23a9e7029f31ee902da08a464d968319f56/src/backend/optimizer/util/clauses.c#L4581-L4584
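The parameter-cost check at [2] can be modeled roughly as follows. This is a simplified Python paraphrase for illustration only, not the planner's actual C code: the constant mirrors PostgreSQL's default cpu_operator_cost and the 10x threshold visible at that link, but the real check also considers things like volatility and strictness.

```python
# Rough model of the SQL-function inlining heuristic at [2]: a parameter
# referenced more than once blocks inlining when evaluating it costs more
# than 10 * cpu_operator_cost, since inlining would duplicate that cost
# at every use site. Simplified paraphrase, not real planner code.

CPU_OPERATOR_COST = 0.0025  # PostgreSQL's default cpu_operator_cost


def inlining_allowed(param_cost: float, use_count: int) -> bool:
    """Would the wrapper be inlined, given a parameter's evaluation cost
    and the number of times the body references it?"""
    if use_count <= 1:
        # Evaluated at most once either way: no duplication, so inline.
        return True
    # Used multiple times: refuse if the parameter is at all expensive.
    return param_cost <= 10 * CPU_OPERATOR_COST


# good_box() references each argument twice ($1/$2 appear in both the &&
# test and the lseg distance test), so a COST 100 argument kills inlining:
assert inlining_allowed(1 * CPU_OPERATOR_COST, 2)        # COST 1: inlined
assert not inlining_allowed(100 * CPU_OPERATOR_COST, 2)  # COST 100: not
```

This matches the behavior in example [B] above: raising circle()'s cost from 1 to 100 pushes the argument past the threshold and the wrapper stops being inlined.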
Re: [HACKERS] proposal: psql command \graw
Hello Pavel,

>> I hope I fixed all mentioned issues.

Patch applies with a warning:

  git apply ~/psql-graw-2.patch
  /home/fabien/psql-graw-2.patch:192: new blank line at EOF.
  +
  warning: 1 line adds whitespace errors.

Otherwise it compiles. "make check" ok. doc gen ok.

Two spurious empty lines are added before StoreQueryTuple.

Doc: "If + is appended to the command name, a column names are displayed."

I suggest instead: "When + is appended, column names are also displayed."

ISTM that you can remove "force_column_header" and just set "tuple_only"
to what you need, that is you do not need to change anything in function
"print_unaligned_text".

-- 
Fabien.
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Nov 9, 2017 at 1:52 PM, Stephen Frost wrote:
>> This is not unlike the discussions we've had in the past around allowing
>> non-owners of a table to modify properties of a table, where the
>> argument has been successfully and clearly made that if you make the
>> ability to change the table a GRANT'able right, then that can be used to
>> become the role which owns the table without much difficulty, and so
>> there isn't really a point in making that right independently
>> GRANT'able. We have some of that explicitly GRANT'able today due to
>> requirements of the spec, but that's outside of our control.
>
> I don't think it's quite the same thing. I wouldn't go out of my way
> to invent grantable table rights that just let you usurp the table
> owner's permissions, but I still think it's better to have a uniform
> rule that functions we ship don't contain hard-coded superuser checks.

Just to be clear, we should certainly be thinking about more than just
functions but also about things like publications and subscriptions and
other bits of the system. I haven't done a detailed analysis quite yet,
but I'm reasonably confident that a number of things in this last release
cycle have resulted in quite a few additional explicit superuser checks,
which I do think is an issue to be addressed.

> One problem is that which functions allow an escalation to superuser
> that is easy enough or reliable enough may not actually be a very
> bright line in all cases. But more than that, I think it's a
> legitimate decision to grant to a user a right that COULD lead to a
> superuser escalation and trust them not to use it that way, or prevent
> them from using it that way by some means not known to the database
> system (e.g. route all queries through pgbouncer and send them to a
> teletype; if a breach is detected, go to the teletype room, read the
> paper contained therein, and decide who to fire/imprison/execute at
> gunpoint). It shouldn't be our job to decide that granting a certain
> right is NEVER ok.

I agree that it may not be obvious which cases lead to a relatively easy
way to obtain superuser and which don't, and that's actually why I'd much
rather it be something that we're considering and looking out for
because, frankly, we're in a much better position generally to realize
those cases than our users are.

Further, I agree entirely that we shouldn't be deciding that certain
capabilities are never allowed to be given to a user- but that's why
superuser *exists* and why it's possible to give superuser access to
multiple users. The question here is if it's actually sensible for us to
make certain actions be grantable to non-superusers which allow that role
to become, or to essentially be, a superuser. What that does, just as it
does in the table case, from my viewpoint at least, is make it *look* to
admins like they're grant'ing something less than superuser when, really,
they're actually giving the user superuser-level access. That violates
the POLA because the admin didn't write 'ALTER USER joe WITH SUPERUSER',
they just wrote 'GRANT EXECUTE ON lo_import() TO joe;'.

We can certainly argue about whether the admin should have just known
that lo_import()/lo_export() or similar functions were equivalent to
grant'ing a user superuser access, but it's not entirely clear to me that
it's actually advantageous to go out of our way to provide multiple ways
for superuser-level access to be grant'ed out to users. Let's provide one
very clear way to do that and strive to avoid building in others, either
intentionally or unintentionally.

Thanks!

Stephen
Re: [HACKERS] [PATCH] Improve geometric types
>> This is also affecting the lseg ## box operator.
>
> Mmm.. It returns (1.5, 1.5) with the 0004 patch. It is surely a
> point on the second operand but I'm not sure it's right that the
> operator returns a specific point for two parallel segments.

I am changing it to return NULL when they are parallel.

> I'd like to put comments on 0001 and 0004 only now:
>
> - Adding [LR]DELIM_L looks good but they should be described in
>   the comment just above.

I will mention it in the new version.

> - I'm not sure the change of box_construct is good but currently
>   all callers fit the new interface (giving two points, not
>   four coordinates).

I tried to make things more consistent. The other constructors take
points.

> - close_lseg seems to be forgetting the case where the two given
>   segments are crossing (and parallel).

I am re-implementing it to cover those cases.

> - make_bound_box operates directly on poly->boundbox. I'm
>   afraid that such coding hinders compilers from using registers.

I am changing it back.

> This is a bit apart from this patch, but it would be better if we
> could eliminate internal calls using DirectFunctionCall.

We don't seem to be able to fix all the issues without doing that. I will
incorporate the change.

Thank you for your review. I will address your other email before posting
new versions.
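To make the "return NULL for parallel segments" point concrete, here is a toy Python model. This is purely illustrative, not the patch's C code, and the non-parallel branch is reduced to the line-crossing case: the point is simply that the operator yields no result (None, modeling SQL NULL) when the segments' direction vectors are parallel, because no single closest point is well defined.

```python
# Toy model of the behavior discussed above for the closest-point
# operator: return None (modeling SQL NULL) when the two segments are
# parallel, instead of picking an arbitrary point on the second operand.
# Illustrative only; not the patch's C implementation.

def is_parallel(s1, s2):
    """Segments are ((x1, y1), (x2, y2)); parallel iff the cross
    product of their direction vectors is zero."""
    (a, b), (c, d) = s1, s2
    dx1, dy1 = b[0] - a[0], b[1] - a[1]
    dx2, dy2 = d[0] - c[0], d[1] - c[1]
    return dx1 * dy2 - dy1 * dx2 == 0

def closest_point(s1, s2):
    """Point of s2 closest to s1, or None when that is ill-defined."""
    if is_parallel(s1, s2):
        return None  # no unique answer for two parallel segments
    # Non-parallel case: intersect the supporting lines (sufficient for
    # this sketch when the segments actually cross).
    (a, b), (c, d) = s1, s2
    dx1, dy1 = b[0] - a[0], b[1] - a[1]
    dx2, dy2 = d[0] - c[0], d[1] - c[1]
    denom = dx1 * dy2 - dy1 * dx2
    t = ((c[0] - a[0]) * dy2 - (c[1] - a[1]) * dx2) / denom
    return (a[0] + t * dx1, a[1] + t * dy1)

# Two parallel segments: no specific point is returned.
assert closest_point(((0, 0), (1, 1)), ((0, 1), (1, 2))) is None
```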
Re: [HACKERS] why not parallel seq scan for slow functions
On Thu, Nov 9, 2017 at 3:47 AM, Amit Kapila wrote:
> I think I understood your concern after some offlist discussion and it
> is primarily due to the inheritance related check which can skip the
> generation of gather paths when it shouldn't. So what might fit
> better here is a straight check on the number of base rels such that
> allow generating gather path in set_rel_pathlist, if there are
> multiple baserels involved. I have used all_baserels which I think
> will work better for this purpose.

Yes, that looks a lot more likely to be correct. Let's see what Tom
thinks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
On Thu, Nov 9, 2017 at 1:52 PM, Stephen Frost wrote:
>> I disagree that that is, or should be, a guiding principle.
>
> It was what I was using as the basis of the work which I did in this
> area and, at least at that time, there seemed to be little issue with
> that.

Well, there actually kind of was. It came up here:

http://postgr.es/m/ca+tgmoy6ne5n4jc5awxser2g2gdgr4omf7edcexamvpf_jq...@mail.gmail.com

I mis-remembered who was on which side of the debate, though, hence the
comment about employers. But never mind about that, since I was wrong.
Sorry for not checking that more carefully before.

> This is not unlike the discussions we've had in the past around allowing
> non-owners of a table to modify properties of a table, where the
> argument has been successfully and clearly made that if you make the
> ability to change the table a GRANT'able right, then that can be used to
> become the role which owns the table without much difficulty, and so
> there isn't really a point in making that right independently
> GRANT'able. We have some of that explicitly GRANT'able today due to
> requirements of the spec, but that's outside of our control.

I don't think it's quite the same thing. I wouldn't go out of my way
to invent grantable table rights that just let you usurp the table
owner's permissions, but I still think it's better to have a uniform
rule that functions we ship don't contain hard-coded superuser checks.

One problem is that which functions allow an escalation to superuser
that is easy enough or reliable enough may not actually be a very bright
line in all cases. But more than that, I think it's a legitimate
decision to grant to a user a right that COULD lead to a superuser
escalation and trust them not to use it that way, or prevent them from
using it that way by some means not known to the database system (e.g.
route all queries through pgbouncer and send them to a teletype; if a
breach is detected, go to the teletype room, read the paper contained
therein, and decide who to fire/imprison/execute at gunpoint). It
shouldn't be our job to decide that granting a certain right is NEVER ok.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Nov 9, 2017 at 1:16 PM, Stephen Frost wrote:
>> While we have been working to reduce the number of superuser() checks in
>> the backend in favor of having the ability to GRANT explicit rights, one
>> of the guiding principles has always been that capabilities which can
>> be used to gain superuser rights with little effort should remain
>> restricted to the superuser, which is why the lo_import/lo_export hadn't
>> been under consideration for superuser-check removal in the analysis I
>> provided previously.
>
> I disagree that that is, or should be, a guiding principle.

It was what I was using as the basis of the work which I did in this
area and, at least at that time, there seemed to be little issue with
that.

> I'm not
> sure that anyone other than you and your fellow employees at Crunchy
> has ever agreed that this is some kind of principle. You make it
> sound like there's a consensus about this, but I think there isn't.

It's unclear to me why you're bringing up employers in this discussion,
particularly since Tom is the one who just moved things in the direction
that you're evidently advocating for. Clearly there isn't consensus if
you and others disagree. Things certainly can change over time as well,
but if we're going to make a change here then we should make it
willfully, plainly, and consistently.

> I think our guiding principle should be to get rid of ALL of the
> hard-coded superuser checks and let people GRANT what they want. If
> they grant a permission that results in somebody escalating to
> superuser, they get to keep both pieces. Such risks might be worth
> documenting, but we shouldn't obstruct people from doing it.

I would certainly like to see the many additional hard-coded superuser
checks introduced into PG10 removed and replaced with more fine-grained
GRANT-based privileges (either as additional GRANT rights or perhaps
through the default roles system).

What I dislike about allowing GRANTs of a privilege which allows someone
to be superuser is that it makes it *look* like you're only GRANT'ing
some subset of reasonable rights when, in reality, you're actually
GRANT'ing a great deal more.

This is not unlike the discussions we've had in the past around allowing
non-owners of a table to modify properties of a table, where the
argument has been successfully and clearly made that if you make the
ability to change the table a GRANT'able right, then that can be used to
become the role which owns the table without much difficulty, and so
there isn't really a point in making that right independently
GRANT'able. We have some of that explicitly GRANT'able today due to
requirements of the spec, but that's outside of our control.

> In the same way, Linux will not prevent you from making a binary
> setuid regardless of what the binary does. If you make a binary
> setuid root that lets someone hack root, that's your fault, not the
> operating system.

This is actually one of the things that SELinux is intended to address,
so I don't agree that it's quite this cut-and-dry.

Thanks!

Stephen
Re: [HACKERS] Aggregates push-down to partitions
Hi Konstantin!

09.11.17 20:14, Konstantin Knizhnik wrote:

> It is still far from the ideal plan, because each worker is working with
> all partitions, instead of splitting partitions between workers and
> calculating partial aggregates for each partition. But if we add an FDW
> as a child of the parent table, then parallel scan can not be used and
> we get the worst possible plan:
>
> postgres=# create foreign table derived_fdw() inherits(base)
>            server pg_fdw options (table_name 'derived1');
> CREATE FOREIGN TABLE
> postgres=# explain select sum(x) from base;
>                                  QUERY PLAN
> --------------------------------------------------------------------------
>  Aggregate  (cost=34055.07..34055.08 rows=1 width=8)
>    ->  Append  (cost=0.00..29047.75 rows=2002926 width=4)
>          ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=4)
>          ->  Seq Scan on derived1  (cost=0.00..14425.00 rows=100 width=4)
>          ->  Seq Scan on derived2  (cost=0.00..14425.00 rows=100 width=4)
>          ->  Foreign Scan on derived_fdw  (cost=100.00..197.75 rows=2925 width=4)
> (6 rows)
>
> So we sequentially pull all data to this node and compute aggregates
> locally. The ideal plan would calculate partial aggregates in parallel
> at all nodes and then combine the partial results. It requires two
> changes:
>
> 1. Replace Aggregate->Append with
>    Finalize_Aggregate->Append->Partial_Aggregate
> 2. Concurrent execution of Append.
>
> This can be done in two different ways: we can try to use the existing
> parallel-worker infrastructure and replace Append with Gather. That
> seems to be the best approach for local partitioning. In the case of
> remote (FDW) partitions, it is enough to split the starting of execution
> (PQsendQuery in postgres_fdw) from the getting of results, so it
> requires some changes in the FDW protocol.
>
> I wonder if somebody has already investigated this problem or is working
> in this direction. Maybe there are already some patches proposed? I have
> searched the hackers archive, but didn't find anything relevant... Are
> there any suggestions about the best approach to implement this feature?

Maybe the problem you describe is solved in this thread [1] through the
introduction of a Parallel Append node?

1. https://www.postgresql.org/message-id/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1%2BS%2BvRuUQ%40mail.gmail.com

-- 
Regards,
Maksim Milyutin
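The Finalize_Aggregate -> Append -> Partial_Aggregate shape described above can be sketched as follows. This is a toy Python model of the data flow only, not PostgreSQL code; the partition names simply echo the example plan.

```python
# Toy model of the two-phase aggregate plan discussed above
# (Finalize_Aggregate -> Append -> Partial_Aggregate): each partition
# computes its partial state independently, possibly in a parallel
# worker or on a remote FDW node, and only the small partial states
# travel to the coordinating node for the finalize step.

partitions = {
    "derived1":    [1, 2, 3],
    "derived2":    [10, 20, 30],
    "derived_fdw": [100, 200],  # imagine these rows live on a remote server
}

# Partial aggregate, per partition: the parallelizable part. For an FDW
# child this is where the query would be dispatched without waiting.
partial_sums = {name: sum(rows) for name, rows in partitions.items()}

# Finalize: combine the partial states on the coordinating node.
total = sum(partial_sums.values())

# Same answer as pulling every row to one node, but only one value per
# partition crossed the Append boundary.
assert total == sum(sum(rows) for rows in partitions.values())
```

For sum() the partial and combine functions are both addition; aggregates like avg() instead carry a (sum, count) pair as their partial state.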
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
On Thu, Nov 9, 2017 at 1:16 PM, Stephen Frost wrote:
> While we have been working to reduce the number of superuser() checks in
> the backend in favor of having the ability to GRANT explicit rights, one
> of the guiding principles has always been that capabilities which can
> be used to gain superuser rights with little effort should remain
> restricted to the superuser, which is why the lo_import/lo_export hadn't
> been under consideration for superuser-check removal in the analysis I
> provided previously.

I disagree that that is, or should be, a guiding principle. I'm not
sure that anyone other than you and your fellow employees at Crunchy
has ever agreed that this is some kind of principle. You make it
sound like there's a consensus about this, but I think there isn't.

I think our guiding principle should be to get rid of ALL of the
hard-coded superuser checks and let people GRANT what they want. If
they grant a permission that results in somebody escalating to
superuser, they get to keep both pieces. Such risks might be worth
documenting, but we shouldn't obstruct people from doing it.

In the same way, Linux will not prevent you from making a binary
setuid regardless of what the binary does. If you make a binary
setuid root that lets someone hack root, that's your fault, not the
operating system's.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [POC] Faster processing at Gather node
On Thu, Nov 9, 2017 at 12:08 AM, Amit Kapila wrote:
> This change looks suspicious to me. I think here we can't use the
> tupDesc constructed from the targetlist. One problem I can see is that
> the check for the hasOid setting in tlist_matches_tupdesc won't give
> the correct answer. In the case of a scan, we use the tuple descriptor
> stored in the relation descriptor, which allows us to take the right
> decision in tlist_matches_tupdesc. If you try the statement CREATE
> TABLE as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r'; with
> force_parallel_mode=regress, then you can reproduce the problem I am
> trying to highlight.

I tried this, but nothing seemed to be obviously broken. Then I
realized that the CREATE TABLE command wasn't using parallelism, so I
retried with parallel_setup_cost = 0, parallel_tuple_cost = 0, and
min_parallel_table_scan_size = 0. That got it to use parallel query,
but I still don't see anything broken. Can you clarify further?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] proposal: psql command \graw
Hi

2017-08-24 5:50 GMT+02:00 Fabien COELHO:

> Hello Pavel,
>
>> I have added the patch to the next commitfest.
>
> Patch applies, compiles, works.
>
> I'm okay with the names graw/graw+, and for having such short-hands.
>
> Missing break in switch, even if last item and useless, because other
> items do it... Also should be added at its place in alphabetical order?
>
> "column_header" is somehow redundant with "tuples_only". Use the
> existing one instead of adding a new one?
>
> More generally, ISTM that the same effect could be achieved without
> adding a new print function, but by setting more options (separator,
> ...) and calling an existing print function. If so, I think it would
> reduce the code size.
>
> Missing help entry.
>
> Missing non regression tests.
>
> Missing documentation.

I hope I fixed all mentioned issues.

Regards

Pavel

> --
> Fabien.

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index e520cdf3ba..9e7030f247 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2020,6 +2020,19 @@ CREATE INDEX
+
+\graw[+] [ filename ]
+\graw[+] [ |command ]
+
+
+\graw is equivalent to \g, but
+forces unaligned output mode for this query.  If +
+is appended to the command name, a column names are displayed.
+
+
+
+
 \gset [ prefix ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 8cc4de3878..b3461291eb 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -332,7 +332,8 @@ exec_command(const char *cmd,
 		status = exec_command_errverbose(scan_state, active_branch);
 	else if (strcmp(cmd, "f") == 0)
 		status = exec_command_f(scan_state, active_branch);
-	else if (strcmp(cmd, "g") == 0 || strcmp(cmd, "gx") == 0)
+	else if (strcmp(cmd, "g") == 0 || strcmp(cmd, "gx") == 0 ||
+			 strcmp(cmd, "graw") == 0 || strcmp(cmd, "graw+") == 0)
 		status = exec_command_g(scan_state, active_branch, cmd);
 	else if (strcmp(cmd, "gdesc") == 0)
 		status = exec_command_gdesc(scan_state, active_branch);
@@ -1232,6 +1233,7 @@ exec_command_f(PsqlScanState scan_state, bool active_branch)
 
 /*
  * \g [filename] -- send query, optionally with output to file/pipe
+ * \graw [filename] -- same as \g with raw format
  * \gx [filename] -- same as \g, with expanded mode forced
  */
 static backslashResult
@@ -1254,6 +1256,10 @@ exec_command_g(PsqlScanState scan_state, bool active_branch, const char *cmd)
 			free(fname);
 		if (strcmp(cmd, "gx") == 0)
 			pset.g_expanded = true;
+		else if (strcmp(cmd, "graw") == 0)
+			pset.g_raw = true;
+		else if (strcmp(cmd, "graw+") == 0)
+			pset.g_raw_header = true;
 		status = PSQL_CMD_SEND;
 	}
 	else
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 7a91a44b2b..aeec302eae 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -865,6 +865,14 @@ PrintQueryTuples(const PGresult *results)
 	if (pset.g_expanded)
 		my_popt.topt.expanded = 1;
+	/* one-shot raw output requested by \raw and \graw+ */
+	else if (pset.g_raw || pset.g_raw_header)
+	{
+		my_popt.topt.format = PRINT_UNALIGNED;
+		my_popt.topt.tuples_only = true;
+		my_popt.topt.force_column_header = pset.g_raw_header;
+	}
+
 	/* write output to \g argument, if any */
 	if (pset.gfname)
 	{
@@ -893,6 +901,8 @@ PrintQueryTuples(const PGresult *results)
 	}
 
+
+
 /*
  * StoreQueryTuple: assuming query result is OK, save data into variables
  *
@@ -1517,6 +1527,10 @@ sendquery_cleanup:
 	/* reset \gx's expanded-mode flag */
 	pset.g_expanded = false;
 
+	/* reset \graw flags */
+	pset.g_raw = false;
+	pset.g_raw_header = false;
+
 	/* reset \gset trigger */
 	if (pset.gset_prefix)
 	{
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index a926c40b9b..e573711434 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -167,7 +167,7 @@ slashUsage(unsigned short int pager)
 	 * Use "psql --help=commands | wc" to count correctly. It's okay to count
 	 * the USE_READLINE line even in builds without that.
 	 */
-	output = PageOutput(125, pager ? &(pset.popt.topt) : NULL);
+	output = PageOutput(126, pager ? &(pset.popt.topt) : NULL);
 
 	fprintf(output, _("General\n"));
 	fprintf(output, _("  \\copyright             show PostgreSQL usage and distribution terms\n"));
@@ -176,6 +176,7 @@ slashUsage(unsigned short int pager)
 	fprintf(output, _("  \\g [FILE] or ;         execute query (and send results to file or |pipe)\n"));
 	fprintf(output, _("  \\gdesc                 describe result of query, without executing it\n"));
 	fprintf(output, _("  \\gexec                 execute query, then execute each value in its result\n"));
+	fprintf(output, _("  \\graw[+] [FILE]        as \\g, but forces unaligned raw output mode\n"));
 	fprintf(output, _("  \\gset [PREFIX]         execute query and store results in psql variables\n"));
 	fprintf(output, _("  \\gx [FILE]             as \\g, but f
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Tom, Michael,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Michael Paquier writes:
>> On Thu, Nov 9, 2017 at 6:05 AM, Tom Lane wrote:
>>> Another idea would be to invent a new external flag bit "INV_WRITE_ONLY",
>>> so that people who wanted true write-only could get it, without breaking
>>> backwards-compatible behavior. But I'm inclined to wait for some field
>>> demand to show up before adding even that little bit of complication.
>>
>> Demand that may never show up, and the current behavior on HEAD does
>> not receive any complaints either. I am keeping the patch simple for
>> now. That's less aspirin needed for everybody.
>
> Looks good to me, pushed.

While we have been working to reduce the number of superuser() checks in
the backend in favor of having the ability to GRANT explicit rights, one
of the guiding principles has always been that capabilities which can be
used to gain superuser rights with little effort should remain restricted
to the superuser, which is why lo_import/lo_export hadn't been under
consideration for superuser-check removal in the analysis I provided
previously. As such, I'm not particularly thrilled to see those checks
simply removed.

If we are going to say that it's acceptable to allow non-superusers
arbitrary access to files on the filesystem as the OS user, then we would
also be removing similar checks from adminpack, something I've also been
against quite clearly in past discussions.

This also didn't update the documentation regarding these functions,
which clearly states that superuser is required. If we are going to allow
non-superusers access to these functions, we certainly need to remove the
claim stating that we don't do that, and further we must make it
abundantly clear that these functions are extremely dangerous and could
be trivially used to make someone who has access to them into a
superuser.

I continue to feel that this is something worthy of serious consideration
and discussion, and not something to be done because we happen to be
modifying the code in this area. I'm tempted to suggest we should have
another role attribute or some other additional check when it comes to
functions like these.

The right way to allow users to be able to pull in data off the
filesystem, imv, would be by providing a way to define specific locations
on the filesystem which users can be granted access to import data from,
as I once wrote a patch to do. That's certainly quite tricky to get
correct, but it's much better than allowing non-superusers arbitrary
access, which is what this does and what users may start using if we
continue to remove these restrictions without providing a better option.

Thanks!

Stephen
Re: [HACKERS] Parallel Append implementation
On Sat, Oct 28, 2017 at 5:50 AM, Robert Haas wrote:
> No, because the Append node is *NOT* getting copied into shared
> memory. I have pushed a comment update to the existing functions; you
> can use the same comment for this patch.

I spent the last several days working on this patch, which had a number
of problems, both cosmetic and functional. I think the attached is in
better shape now, but it could certainly use some more review and
testing since I only just finished modifying it, and I modified it
pretty heavily. Changes:

- I fixed the "morerows" entries in the documentation. If you had built
  the documentation the way you had it and loaded it up in a web
  browser, you would have seen that the way you had it was not correct.

- I moved T_AppendState to a different position in the switch inside
  ExecParallelReInitializeDSM, so as to keep that switch in the same
  order as all of the other switch statements in that file.

- I rewrote the comment for pa_finished. It previously began with
  "workers currently executing the subplan", which is not an accurate
  description. I suspect this was a holdover from a previous version of
  the patch in which this was an array of integers rather than an array
  of type bool. I also fixed the comment in ExecAppendEstimate and
  added, removed, or rewrote various other comments as well.

- I renamed PA_INVALID_PLAN to INVALID_SUBPLAN_INDEX, which I think is
  more clear and allows for the possibility that this sentinel value
  might someday be used for non-parallel-aware Append plans.

- I largely rewrote the code for picking the next subplan. A
  superficial problem with the way that you had it is that you had
  renamed exec_append_initialize_next to exec_append_seq_next but not
  updated the header comment to match. Also, the logic was spread out
  all over the file. There are three cases: not parallel aware, leader,
  worker. You had the code for the first case at the top of the file
  and the other two cases at the bottom of the file, and used multiple
  "if" statements to pick the right one in each case. I replaced all
  that with a function pointer stored in the AppendState, moved the code
  so it's all together, and rewrote it in a way that I find easier to
  understand. I also changed the naming convention.

- I renamed pappend_len to pstate_len and ParallelAppendDescData to
  ParallelAppendState. I think the use of the word "descriptor" is a
  carryover from the concept of a scan descriptor. There's nothing
  really wrong with inventing the concept of an "append descriptor",
  but it seems more clear to just refer to shared state.

- I fixed ExecAppendReInitializeDSM not to reset node->as_whichplan.
  Per commit 41b0dd987d44089dc48e9c70024277e253b396b7, that's wrong;
  instead, local state should be reset in ExecReScanAppend. I installed
  what I believe to be the correct logic in that function instead.

- I fixed list_qsort() so that it copies the type of the old list into
  the new list. Otherwise, sorting a list of type T_IntList or
  T_OidList would turn it into just plain T_List, which is wrong.

- I removed get_append_num_workers and integrated the logic into the
  callers. This function was coded quite strangely: it assigned the
  return value of fls() to a double and then eventually rounded the
  result back to an integer. But fls() returns an integer, so this
  doesn't make much sense. On a related note, I made it use
  fls(# of subpaths) instead of fls(# of subpaths) + 1. Adding 1
  doesn't make sense to me here because it leads to a decision to use 2
  workers for a single, non-partial subpath. I suspect both of these
  mistakes stem from thinking that fls() returns the base-2 logarithm,
  but in fact it doesn't, quite: log2(1) = 0.0 but fls(1) = 1.

- In the process of making the changes described in the previous point,
  I added a couple of assertions, one of which promptly failed. It
  turns out the reason is that your patch didn't update
  accumulate_append_subpaths(), which can result in flattening
  non-partial paths from a Parallel Append into a parent Append's list
  of partial paths, which is bad. The easiest way to fix that would be
  to just teach accumulate_append_subpaths() not to flatten a Parallel
  Append into a parent Append or MergeAppend node, but it seemed to me
  that there was a fair amount of duplication between
  accumulate_partialappend_subpath() and accumulate_append_subpaths(),
  so what I did instead is folded all of the necessary logic directly
  into accumulate_append_subpaths(). This approach also avoids some
  assumptions that your code made, such as the assumption that we will
  never have a partial MergeAppend path.

- I changed create_append_path() so that it uses the parallel_aware
  argument as the only determinant of whether the output path is
  flagged as parallel-aware. Your version also considered whether
  parallel_workers > 0, but I think that's not a good idea; the caller
  should pass the correct value for parallel_aware rather than relying
  on this function to fix it. Possibly you indirectly
Re: [HACKERS] pageinspect option to forgo buffer locking?
On Thu, Nov 9, 2017 at 12:58 PM, Andres Freund wrote: > You can already pass arbitrary byteas to heap_page_items(), so I don't > see how this'd change anything exposure-wise? Or are you thinking that > users would continually do this with actual page contents and would be > more likely to hit edge cases than if using pg_read_binary_file() or > such (which obviously sees an out of date page)? More the latter. It's not really an increase in terms of security exposure, but it might lead to more trouble in practice. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect option to forgo buffer locking?
On 2017-11-09 12:55:30 -0500, Robert Haas wrote: > On Thu, Nov 9, 2017 at 12:49 PM, Andres Freund wrote: > > Occasionally, when debugging issues, I find it quite useful to be able > > to do a heap_page_items() on a page that's actually locked exclusively > > concurrently. E.g. investigating the recent multixact vacuuming issues, > > it was very useful to attach a debugger to one backend to step through > > freezing, and display the page in another session. > > > > Currently the locking in get_raw_page_internal() prevents that. If it's > > an option defaulting to off, I don't see why we couldn't allow that to > > skip locking the page's contents. Obviously you can get corrupted > > contents that way, but we already allow to pass arbitrary stuff to > > heap_page_items(). Since pinning wouldn't be changed, there's no danger > > of the page being moved out from under us. > > heap_page_items() is, if I remember correctly, not necessarily going > to tolerate malformed input very well - I think that's why we restrict > all of these functions to superusers. So using it in this way would > seem to increase the risk of a server crash or other horrible > misbehavior. Of course if we're just dev-testing that doesn't really > matter. You can already pass arbitrary byteas to heap_page_items(), so I don't see how this'd change anything exposure-wise? Or are you thinking that users would continually do this with actual page contents and would be more likely to hit edge cases than if using pg_read_binary_file() or such (which obviously sees an out of date page)? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect option to forgo buffer locking?
On Thu, Nov 9, 2017 at 12:49 PM, Andres Freund wrote: > Occasionally, when debugging issues, I find it quite useful to be able > to do a heap_page_items() on a page that's actually locked exclusively > concurrently. E.g. investigating the recent multixact vacuuming issues, > it was very useful to attach a debugger to one backend to step through > freezing, and display the page in another session. > > Currently the locking in get_raw_page_internal() prevents that. If it's > an option defaulting to off, I don't see why we couldn't allow that to > skip locking the page's contents. Obviously you can get corrupted > contents that way, but we already allow to pass arbitrary stuff to > heap_page_items(). Since pinning wouldn't be changed, there's no danger > of the page being moved out from under us. heap_page_items() is, if I remember correctly, not necessarily going to tolerate malformed input very well - I think that's why we restrict all of these functions to superusers. So using it in this way would seem to increase the risk of a server crash or other horrible misbehavior. Of course if we're just dev-testing that doesn't really matter. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Michael Paquier writes: > On Thu, Nov 9, 2017 at 6:05 AM, Tom Lane wrote: >> Another idea would be to invent a new external flag bit "INV_WRITE_ONLY", >> so that people who wanted true write-only could get it, without breaking >> backwards-compatible behavior. But I'm inclined to wait for some field >> demand to show up before adding even that little bit of complication. > Demand that may never show up, and the current behavior on HEAD does > not receive any complains either. I am keeping the patch simple for > now. That's less aspirin needed for everybody. Looks good to me, pushed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq connection strings: control over the cipher suites?
On 11/09/2017 03:27 AM, Graham Leggett wrote:
> Is there a parameter or mechanism for setting the required ssl cipher list
> from the client side?

I don't believe so. That is controlled by ssl_ciphers, which requires a restart in order to change.

https://www.postgresql.org/docs/10/static/runtime-config-connection.html#GUC-SSL-CIPHERS

select name,setting,context from pg_settings where name like '%ssl%';
           name            |         setting          |  context
---------------------------+--------------------------+------------
 ssl                       | off                      | postmaster
 ssl_ca_file               |                          | postmaster
 ssl_cert_file             | server.crt               | postmaster
 ssl_ciphers               | HIGH:MEDIUM:+3DES:!aNULL | postmaster
 ssl_crl_file              |                          | postmaster
 ssl_ecdh_curve            | prime256v1               | postmaster
 ssl_key_file              | server.key               | postmaster
 ssl_prefer_server_ciphers | on                       | postmaster
(8 rows)

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
Re: [HACKERS] pageinspect option to forgo buffer locking?
On Thu, Nov 9, 2017 at 9:49 AM, Andres Freund wrote: > Currently the locking in get_raw_page_internal() prevents that. If it's > an option defaulting to off, I don't see why we couldn't allow that to > skip locking the page's contents. Obviously you can get corrupted > contents that way, but we already allow to pass arbitrary stuff to > heap_page_items(). Since pinning wouldn't be changed, there's no danger > of the page being moved out from under us. +1. I've done things like this before myself. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pageinspect option to forgo buffer locking?
Hi, Occasionally, when debugging issues, I find it quite useful to be able to do a heap_page_items() on a page that's actually locked exclusively concurrently. E.g. investigating the recent multixact vacuuming issues, it was very useful to attach a debugger to one backend to step through freezing, and display the page in another session. Currently the locking in get_raw_page_internal() prevents that. If it's an option defaulting to off, I don't see why we couldn't allow that to skip locking the page's contents. Obviously you can get corrupted contents that way, but we already allow to pass arbitrary stuff to heap_page_items(). Since pinning wouldn't be changed, there's no danger of the page being moved out from under us. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] A hook for session start
On Thu, Nov 9, 2017 at 12:09 AM, Michael Paquier wrote: > > On Thu, Nov 9, 2017 at 2:42 AM, Fabrízio de Royes Mello > wrote: > > On Wed, Nov 8, 2017 at 12:47 AM, Michael Paquier < michael.paqu...@gmail.com> > > wrote: > >> - Let's restrict the logging to a role name instead of a database > >> name, and let's parametrize it with a setting in the temporary > >> configuration file. Let's not bother about multiple role support with > >> a list, for the sake of tests and simplicity only defining one role > >> looks fine to me. Comments in the code should be clear about the > >> dependency. > > > > Makes sense and simplify the test code. Fixed. > > + if (!strcmp(username, "regress_sess_hook_usr2")) > + { > + const char *dbname; > [...] > +++ b/src/test/modules/test_session_hooks/session_hooks.conf > @@ -0,0 +1 @@ > +shared_preload_libraries = 'test_session_hooks' > Don't you think that this should be a GUC? My previous comment > outlined that. I won't fight hard on that point in any case, don't > worry. I just want to make things clear :) > Ooops... my fault... fixed! Thanks again!! 
-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 05c5c19..d3156ad 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -169,6 +169,9 @@ static ProcSignalReason RecoveryConflictReason;
 static MemoryContext row_description_context = NULL;
 static StringInfoData row_description_buf;
 
+/* Hook for plugins to get control at start of session */
+session_start_hook_type session_start_hook = NULL;
+
 /*
  * decls for routines only used in this file
  */
@@ -3857,6 +3860,9 @@ PostgresMain(int argc, char *argv[],
 	if (!IsUnderPostmaster)
 		PgStartTime = GetCurrentTimestamp();
 
+	if (session_start_hook)
+		(*session_start_hook) ();
+
 	/*
 	 * POSTGRES main processing loop begins here
 	 */
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 20f1d27..16ec376 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -76,6 +76,8 @@ static bool ThereIsAtLeastOneRole(void);
 static void process_startup_options(Port *port, bool am_superuser);
 static void process_settings(Oid databaseid, Oid roleid);
 
+/* Hook for plugins to get control at end of session */
+session_end_hook_type session_end_hook = NULL;
 
 /*** InitPostgres support ***/
@@ -1154,6 +1156,10 @@ ShutdownPostgres(int code, Datum arg)
 	 * them explicitly.
 	 */
 	LockReleaseAll(USER_LOCKMETHOD, true);
+
+	/* Hook at session end */
+	if (session_end_hook)
+		(*session_end_hook) ();
 }
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index f8c535c..9f05bfb 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -35,6 +35,13 @@ extern PGDLLIMPORT const char *debug_query_string;
 extern int	max_stack_depth;
 extern int	PostAuthDelay;
 
+/* Hook for plugins to get control at start and end of session */
+typedef void (*session_start_hook_type) (void);
+typedef void (*session_end_hook_type) (void);
+
+extern PGDLLIMPORT session_start_hook_type session_start_hook;
+extern PGDLLIMPORT session_end_hook_type session_end_hook;
+
 /* GUC-configurable parameters */
 
 typedef enum
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index b7ed0af..a3c8c1e 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -11,6 +11,7 @@ SUBDIRS = \
 	snapshot_too_old \
 	test_ddl_deparse \
 	test_extensions \
+	test_session_hooks \
 	test_parser \
 	test_pg_dump \
 	test_rbtree \
diff --git a/src/test/modules/test_session_hooks/.gitignore b/src/test/modules/test_session_hooks/.gitignore
new file mode 100644
index 000..5dcb3ff
--- /dev/null
+++ b/src/test/modules/test_session_hooks/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/test_session_hooks/Makefile b/src/test/modules/test_session_hooks/Makefile
new file mode 100644
index 000..c5c3860
--- /dev/null
+++ b/src/test/modules/test_session_hooks/Makefile
@@ -0,0 +1,21 @@
+# src/test/modules/test_session_hooks/Makefile
+
+MODULES = test_session_hooks
+PGFILEDESC = "test_session_hooks - Test session hooks with an extension"
+
+EXTENSION = test_session_hooks
+DATA = test_session_hooks--1.0.sql
+
+REGRESS = test_session_hooks
+REGRESS_OPTS = --temp-config=$(top_srcdir)/src/test/modules/test_session_hooks/session_hooks.conf
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/
[HACKERS] Aggregates push-down to partitions
There is a huge thread concerning pushing-down aggregates to FDW:

https://www.postgresql.org/message-id/flat/CAFjFpRcnueviDpngJ3QSVvj7oyukr9NkSiCspqd4N%2BdCEdvYvg%40mail.gmail.com#cafjfprcnuevidpngj3qsvvj7oyukr9nksicspqd4n+dcedv...@mail.gmail.com

but as far as I understand nothing is done for efficient calculation of aggregates for a partitioned table. In the case of local partitions it is somehow compensated by the parallel query plan:

postgres=# create table base(x integer);
CREATE TABLE
postgres=# create table derived1() inherits (base);
CREATE TABLE
postgres=# create table derived2() inherits (base);
CREATE TABLE
postgres=# insert into derived1 values (generate_series(1,100));
INSERT 0 100
postgres=# insert into derived2 values (generate_series(1,100));
INSERT 0 100
postgres=# explain select sum(x) from base;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=12176.63..12176.64 rows=1 width=8)
   ->  Gather  (cost=12176.59..12176.61 rows=8 width=8)
         Workers Planned: 8
         ->  Partial Aggregate  (cost=12175.59..12175.60 rows=1 width=8)
               ->  Append  (cost=0.00..11510.47 rows=266048 width=4)
                     ->  Parallel Seq Scan on base  (cost=0.00..0.00 rows=1 width=4)
                     ->  Parallel Seq Scan on derived1  (cost=0.00..5675.00 rows=125000 width=4)
                     ->  Parallel Seq Scan on derived2  (cost=0.00..5835.47 rows=141047 width=4)
(8 rows)

It is still far from the ideal plan, because each worker is working with all partitions, instead of splitting partitions between workers and calculating partial aggregates for each partition.
But if we add an FDW as a child of the parent table, then parallel scan can not be used and we get the worst possible plan:

postgres=# create foreign table derived_fdw() inherits(base) server pg_fdw options (table_name 'derived1');
CREATE FOREIGN TABLE
postgres=# explain select sum(x) from base;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=34055.07..34055.08 rows=1 width=8)
   ->  Append  (cost=0.00..29047.75 rows=2002926 width=4)
         ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=4)
         ->  Seq Scan on derived1  (cost=0.00..14425.00 rows=100 width=4)
         ->  Seq Scan on derived2  (cost=0.00..14425.00 rows=100 width=4)
         ->  Foreign Scan on derived_fdw  (cost=100.00..197.75 rows=2925 width=4)
(6 rows)

So we sequentially pull all data to this node and compute the aggregates locally. The ideal plan would calculate partial aggregates in parallel at all nodes and then combine the partial results. It requires two changes:

1. Replace Aggregate->Append with Finalize_Aggregate->Append->Partial_Aggregate
2. Concurrent execution of Append.

It can also be done in two different ways: we can try to use the existing parallel-worker infrastructure and replace Append with Gather. That seems to be the best approach for local partitioning. In the case of remote (FDW) partitions, it is enough to separate starting the execution (PQsendQuery in postgres_fdw) from fetching the results, so it requires some changes in the FDW protocol.

I wonder if somebody has already investigated this problem or is working in this direction. Maybe there are already some patches proposed? I have searched the hackers archive, but didn't find anything relevant... Are there any suggestions about the best approach to implement this feature?

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] (spelling) Ensure header of postgresql.auto.conf is consistent
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= writes: > Em qui, 9 de nov de 2017 às 06:15, Feike Steenbergen < > feikesteenber...@gmail.com> escreveu: >> Attached a patch that ensures the header of postgresql.auto.conf is >> consistent, whether created by initdb or recreated when ALTER SYSTEM >> is issued. > Interesting... IMHO this typo should be backpatched to 9.4 when ALTER > SYSTEM was introduced. Agreed, and done. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A weird bit in pg_upgrade/exec.c
Alvaro Herrera writes:
> I think odd coding this was introduced recently because of the
> pg_resetxlog -> pg_resetwal renaming.

Dunno about that, but certainly somebody fat-fingered a refactoring there. The 9.6 code looks quite different but doesn't seem to be doing anything silly.

regards, tom lane
Re: [HACKERS] A weird bit in pg_upgrade/exec.c
a.akent...@postgrespro.ru writes: > I've came across a weird bit in pg_upgrade/exec.c > We have a function check_bin_dir() which goes like this (old_cluster and > new_cluster are global variables): > void check_bin_dir(ClusterInfo *cluster) > { > ... > get_bin_version(&old_cluster); > get_bin_version(&new_cluster); > ... > } > This function has two calls: > check_bin_dir(&old_cluster); > check_bin_dir(&new_cluster); > I'd like to substitute these last two lines with this: > get_bin_version(cluster); Yeah, the way it is now seems outright broken. It will try to do get_bin_version on the new cluster before having done validate_exec there, violating its own comment. I think we should change this as a bug fix, independently of whatever else you had in mind to do here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A weird bit in pg_upgrade/exec.c
a.akent...@postgrespro.ru wrote: > This function has two calls: > check_bin_dir(&old_cluster); > check_bin_dir(&new_cluster); > > I'd like to substitute these last two lines with this: > get_bin_version(cluster); Odd indeed. One would think that if a cluster variable is passed as parameter, the global vars should not be used. +1 for fixing it, and your proposal sounds as good as any. > Doing it would simplify the patch I'm writing, but I'm worried I might break > something that's been there for a long time and has been working fine. I think odd coding this was introduced recently because of the pg_resetxlog -> pg_resetwal renaming. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pg V10: Patch for bug in bonjour support
Thomas Munro writes: > On Thu, Nov 9, 2017 at 6:27 PM, Tom Lane wrote: >> Is there really much interest in Bonjour support on non-macOS platforms? >> I hadn't heard that anybody but Apple was invested in it. > Not from me. My only interest here was to pipe up because I knew that > what was originally proposed would have broken stuff on macOS, and > after that piqued curiosity. I won't mind at all if you revert the > commit to prevent confusion. If the intersection of FreeBSD, > PostgreSQL and Bonjour users is a non-empty set, [s]he might at least > find this archived discussion useful... Not hearing anyone else speaking up for this, I'll go revert the configure change, and instead put in a comment pointing out that Avahi support would require a lot more than an extra -l switch. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Robert Haas writes: > No, that's not right. Now that you mention it, I realize that tuple > locks can definitely cause deadlocks. Example: Yeah. Foreign-key-related tuple locks are another rich source of examples. > ... So I don't > think we can remove speculative insertion locks from the deadlock > detector either. That scares me too. I think that relation extension can safely be transferred to some lower-level mechanism, because what has to be done while holding the lock is circumscribed and below the level of database operations (which might need other locks). These other ideas seem a lot riskier. (But see recent conversation where I discouraged Alvaro from holding extension locks across BRIN summarization activity. We'll need to look and make sure that nobody else has had creative ideas like that.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Runtime Partition Pruning
On Thu, Nov 9, 2017 at 6:18 AM, Beena Emerson wrote:
> The code still chooses the custom plan instead of the generic plan for
> the prepared statements. I am working on it.

I don't think it's really the job of this patch to do anything about that problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
On Wed, Nov 8, 2017 at 9:40 AM, Masahiko Sawada wrote:
> Speaking of the acquiring these four lock types and heavy weight lock,
> there obviously is a call path to acquire any of four lock types while
> holding a heavy weight lock. In reverse, there also is a call path
> that we acquire a heavy weight lock while holding any of four lock
> types. The call path I found is that in heap_delete we acquire a tuple
> lock and call XactLockTableWait or MultiXactIdWait which eventually
> could acquire LOCKTAG_TRANSACTION in order to wait for the concurrent
> transactions finish. But IIUC since these functions acquire the lock
> for the concurrent transaction's transaction id, deadlocks doesn't
> happen.

No, that's not right. Now that you mention it, I realize that tuple locks can definitely cause deadlocks. Example:

setup:

rhaas=# create table foo (a int, b text);
CREATE TABLE
rhaas=# create table bar (a int, b text);
CREATE TABLE
rhaas=# insert into foo values (1, 'hoge');
INSERT 0 1

session 1:

rhaas=# begin;
BEGIN
rhaas=# update foo set b = 'hogehoge' where a = 1;
UPDATE 1

session 2:

rhaas=# begin;
BEGIN
rhaas=# update foo set b = 'quux' where a = 1;

session 3:

rhaas=# begin;
BEGIN
rhaas=# lock bar;
LOCK TABLE
rhaas=# update foo set b = 'blarfle' where a = 1;

back to session 1:

rhaas=# select * from bar;
ERROR:  deadlock detected
LINE 1: select * from bar;
                      ^
DETAIL:  Process 88868 waits for AccessShareLock on relation 16391 of database 16384; blocked by process 88845.
Process 88845 waits for ExclusiveLock on tuple (0,1) of relation 16385 of database 16384; blocked by process 88840.
Process 88840 waits for ShareLock on transaction 1193; blocked by process 88868.
HINT:  See server log for query details.

So what I said before was wrong: we definitely cannot exclude tuple locks from deadlock detection.
However, we might be able to handle the problem in another way: introduce a separate, parallel-query specific mechanism to avoid having two participants try to update and/or delete the same tuple at the same time - e.g. advertise the BufferTag + offset within the page in DSM, and if somebody else already has that same combination advertised, wait until they no longer do. That shouldn't ever deadlock, because the other worker shouldn't be able to find itself waiting for us while it's busy updating a tuple. After some further study, speculative insertion locks look problematic too. I'm worried about the code path ExecInsert() [taking speculative insertion locking] -> heap_insert -> heap_prepare_insert -> toast_insert_or_update -> toast_save_datum -> heap_open(rel->rd_rel->reltoastrelid, RowExclusiveLock). That sure looks like we can end up waiting for a relation lock while holding a speculative insertion lock, which seems to mean that speculative insertion locks are subject to at least theoretical deadlock hazards as well. Note that even if we were guaranteed to be holding the lock on the toast relation already at this point, it wouldn't fix the problem, because we might still have to build or refresh a relcache entry at this point, which could end up scanning (and thus locking) system catalogs. Any syscache lookup can theoretically take a lock, even though most of the time it doesn't, and thus taking a lock that has been removed from the deadlock detector (or, say, an lwlock) and then performing a syscache lookup with it held is not OK. So I don't think we can remove speculative insertion locks from the deadlock detector either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] A weird bit in pg_upgrade/exec.c
Hello!

I've come across a weird bit in pg_upgrade/exec.c. We have a function check_bin_dir() which goes like this (old_cluster and new_cluster are global variables):

void
check_bin_dir(ClusterInfo *cluster)
{
    ...
    get_bin_version(&old_cluster);
    get_bin_version(&new_cluster);
    ...
}

This function has two calls:

check_bin_dir(&old_cluster);
check_bin_dir(&new_cluster);

I'd like to substitute these last two lines with this:

get_bin_version(cluster);

Doing it would simplify the patch I'm writing, but I'm worried I might break something that's been there for a long time and has been working fine. Is there maybe a reason for it to be this way that I don't happen to understand?

Also, there's the exact same situation with the get_bin_version() function in the same file.
[HACKERS] Reorder header files in alphabetical order
Hi, Attached is a patch to reorder header files in joinrels.c and pathnode.c in alphabetical order, removing unnecessary ones. Best regards, Etsuro Fujita *** a/src/backend/optimizer/path/joinrels.c --- b/src/backend/optimizer/path/joinrels.c *** *** 16,30 #include "miscadmin.h" #include "catalog/partition.h" - #include "nodes/relation.h" #include "optimizer/clauses.h" #include "optimizer/joininfo.h" #include "optimizer/pathnode.h" #include "optimizer/paths.h" #include "optimizer/prep.h" - #include "optimizer/cost.h" - #include "utils/memutils.h" #include "utils/lsyscache.h" static void make_rels_by_clause_joins(PlannerInfo *root, --- 16,28 #include "miscadmin.h" #include "catalog/partition.h" #include "optimizer/clauses.h" #include "optimizer/joininfo.h" #include "optimizer/pathnode.h" #include "optimizer/paths.h" #include "optimizer/prep.h" #include "utils/lsyscache.h" + #include "utils/memutils.h" static void make_rels_by_clause_joins(PlannerInfo *root, *** a/src/backend/optimizer/util/pathnode.c --- b/src/backend/optimizer/util/pathnode.c *** *** 17,24 #include #include "miscadmin.h" ! #include "nodes/nodeFuncs.h" #include "nodes/extensible.h" #include "optimizer/clauses.h" #include "optimizer/cost.h" #include "optimizer/pathnode.h" --- 17,25 #include #include "miscadmin.h" ! #include "foreign/fdwapi.h" #include "nodes/extensible.h" + #include "nodes/nodeFuncs.h" #include "optimizer/clauses.h" #include "optimizer/cost.h" #include "optimizer/pathnode.h" *** *** 29,35 #include "optimizer/tlist.h" #include "optimizer/var.h" #include "parser/parsetree.h" - #include "foreign/fdwapi.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/selfuncs.h" --- 30,35 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup --progress output for batch execution
Hello,

On Sun, Oct 01, 2017 at 04:49:17PM -0300, Martin Marques wrote:
> Updated patch with documentation of the new option.

I have checked the patch. The patch applies and compiles correctly without any errors. Tests passed. The documentation doesn't have errors either.

I have a little suggestion. Maybe insert the new line without any additional parameters? We can check that stderr is not a terminal using isatty(). The code could become:

    if (!isatty(fileno(stderr)))
        fprintf(stderr, "\n");
    else
        fprintf(stderr, "\r");

Also it could be good to not insert a new line after progress:

    if (showprogress)
    {
        progress_report(PQntuples(res), NULL, true);
        /* if (!batchmode) */
        /* or */
        if (isatty(fileno(stderr)))
            fprintf(stderr, "\n"); /* Need to move to next line */
    }

Otherwise we have an extra empty line:

pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/1C28 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_19635"
     0/183345 kB (0%), 0/1 tablespace (data_repl/backup_label )
183355/183355 kB (100%), 0/1 tablespace (data_repl/global/pg_control)
183355/183355 kB (100%), 1/1 tablespace

pg_basebackup: write-ahead log end point: 0/1CF8
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Re: [HACKERS] Jsonb transform for pl/python
The following review has been posted through the commitfest application:

make installcheck-world:  tested, failed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Hello Anthony,

Great job! I decided to take a closer look on your patch. Here are some defects I discovered.

> +    Additional extensions are available that implement transforms for
> +    the jsonb type for the language PL/Python. The
> +    extensions for PL/Perl are called

1. The part regarding PL/Perl is obviously from another patch.

2. jsonb_plpython2u and jsonb_plpythonu are marked as relocatable, while jsonb_plpython3u is not. Is it a mistake? Anyway if an extension is relocatable there should be a test that checks this.

3. Not all json types are test-covered. Tests for 'true' :: jsonb, '3.14' :: jsonb and 'null' :: jsonb are missing.

4. jsonb_plpython.c:133 - "Iterate trhrough Jsonb object." Typo, it should be "through" or probably even "over".

5. It looks like you've implemented transform in two directions Python <-> JSONB, however I see tests only for the Python <- JSONB case.

6. Tests passed on Python 2.7.14 but failed on 3.6.2:

> CREATE EXTENSION jsonb_plpython3u CASCADE;
> + ERROR:  could not access file "$libdir/jsonb_plpython3": No such file or directory

module_pathname in jsonb_plpython3u.control should be $libdir/jsonb_plpython3u, not $libdir/jsonb_plpython3.

Tested on Arch Linux x64, GCC 7.2.0.

The new status of this patch is: Waiting on Author
[HACKERS] libpq connection strings: control over the cipher suites?
Hi all, According to the docs at https://www.postgresql.org/docs/9.5/static/libpq-connect.html#LIBPQ-CONNSTRING there are various parameters that control ssl from the client side, including providing the ssl certs, keys, etc. Is there a parameter or mechanism for setting the required ssl cipher list from the client side? Regards, Graham
Re: [HACKERS] Runtime Partition Pruning
Hello all, Here is the updated patch, rebased over v10 of Amit Langote's "path toward faster partition pruning" patch [1]. It modifies the PartScanKeyInfo struct to hold expressions which are then evaluated by the executor to fetch the correct partitions. The code still chooses the custom plan instead of the generic plan for prepared statements; I am working on it. The following output is after adding a hack in the code forcing selection of the generic plan.

postgres=# EXPLAIN EXECUTE prstmt_select(7);
                           QUERY PLAN
----------------------------------------------------------------
 Append (cost=0.00..1732.25 rows=2 width=8)
   ->  Seq Scan on tprt_1 (cost=0.00..847.00 rows=16667 width=8)
         Filter: ($1 > col1)
   ->  Seq Scan on tprt_2 (cost=0.00..847.00 rows=16667 width=8)
         Filter: ($1 > col1)
(5 rows)

postgres=# EXPLAIN EXECUTE prstmt_select(20);
                           QUERY PLAN
----------------------------------------------------------------
 Append (cost=0.00..1732.25 rows=3 width=8)
   ->  Seq Scan on tprt_1 (cost=0.00..847.00 rows=16667 width=8)
         Filter: ($1 > col1)
   ->  Seq Scan on tprt_2 (cost=0.00..847.00 rows=16667 width=8)
         Filter: ($1 > col1)
   ->  Seq Scan on tprt_3 (cost=0.00..38.25 rows=753 width=8)
         Filter: ($1 > col1)
(7 rows)

[1] https://www.postgresql.org/message-id/b8094e71-2c73-ed8e-d8c3-53f232c8c049%40lab.ntt.co.jp

Tested on commit: 9b9cb3c4534d717c1c95758670198ebbf8a20af2

-- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

0001-Implement-runtime-partiton-pruning.patch Description: Binary data
Re: [HACKERS] path toward faster partition pruning
On Mon, Nov 6, 2017 at 3:31 PM, Amit Langote wrote:
> On 2017/11/06 14:32, David Rowley wrote:
>> On 6 November 2017 at 17:30, Amit Langote wrote:
>>> On 2017/11/03 13:32, David Rowley wrote: On 31 October 2017 at 21:43, Amit Langote wrote: []
>
> Attached updated set of patches, including the fix to make the new pruning
> code handle Boolean partitioning.

I am getting the following warnings on Mac OS X:

partition.c:1800:24: warning: comparison of constant -1 with expression of type 'NullTestType' (aka 'enum NullTestType') is always false [-Wtautological-constant-out-of-range-compare]
 if (keynullness[i] == -1)
partition.c:1932:25: warning: comparison of constant -1 with expression of type 'NullTestType' (aka 'enum NullTestType') is always false [-Wtautological-constant-out-of-range-compare]
 if (keynullness[i] == -1)
2 warnings generated.

Comment for the 0004 patch:

270 + /* -1 represents an invalid value of NullTestType. */
271 + memset(keynullness, -1, PARTITION_MAX_KEYS * sizeof(NullTestType));

I think we should not use memset to set a value other than 0 or true/false. This will work for -1 on systems where values are stored in two's complement, but I am afraid of other architectures.

Regards, Amul
Re: [HACKERS] Fix bloom WAL tap test
On Wed, Nov 8, 2017 at 5:46 AM, Masahiko Sawada wrote:
> > So I think that you should instead do something like that:
> >
> > --- a/contrib/bloom/Makefile
> > +++ b/contrib/bloom/Makefile
> > @@ -20,5 +20,12 @@ include $(top_builddir)/src/Makefile.global
> >  include $(top_srcdir)/contrib/contrib-global.mk
> >  endif
> >
> > +installcheck: wal-installcheck
> > +
> > +check: wal-check
> > +
> > +wal-installcheck:
> > +	$(prove_installcheck)
> > +
> >  wal-check: temp-install
> >  	$(prove_check)
> >
> > This works for me as I would expect it should.
>
> Looks good to me too.

OK, then so be it :)

-- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

wal-check-on-bloom-check-3.patch Description: Binary data
Re: [HACKERS] (spelling) Ensure header of postgresql.auto.conf is consistent
On Thu, Nov 9, 2017 at 6:25 PM, Fabrízio de Royes Mello wrote:
> Interesting... IMHO this typo should be backpatched to 9.4 when ALTER SYSTEM
> was introduced.

Yeah, that's harmless. -- Michael
Re: [HACKERS] (spelling) Ensure header of postgresql.auto.conf is consistent
On Thu, 9 Nov 2017 at 06:15, Feike Steenbergen <feikesteenber...@gmail.com> wrote:
> Attached a patch that ensures the header of postgresql.auto.conf is
> consistent, whether created by initdb or recreated when ALTER SYSTEM
> is issued.
>
> The tiny difference caused some false-positives on our configuration
> management identifying changes, which was enough of an itch for me to
> scratch.

Interesting... IMHO this typo should be backpatched to 9.4, when ALTER SYSTEM was introduced.

Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL >> Timbira: http://www.timbira.com.br >> Blog: http://fabriziomello.github.io >> Linkedin: http://br.linkedin.com/in/fabriziomello >> Twitter: http://twitter.com/fabriziomello >> Github: http://github.com/fabriziomello
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)
Hi 2017-11-06 14:00 GMT+01:00 Kyotaro HORIGUCHI < horiguchi.kyot...@lab.ntt.co.jp>: > Thank you for the new patch. > > - The latest patch is missing xpath_parser.h at least since > ns-3. That of the first (not-numbered) version was still > usable. > > - c29c578 conflicts on doc/src/sgml/func.sgml > > > At Sun, 15 Oct 2017 12:06:11 +0200, Pavel Stehule > wrote in b9efon...@mail.gmail.com> > > 2017-10-02 12:22 GMT+02:00 Kyotaro HORIGUCHI < > > horiguchi.kyot...@lab.ntt.co.jp>: > > > > > Hi, thanks for the new patch. > > > > > > # The patch is missing xpath_parser.h. That of the first patch was > usable. > > > > > > At Thu, 28 Sep 2017 07:59:41 +0200, Pavel Stehule < > pavel.steh...@gmail.com> > > > wrote in > > mail.gmail.com> > > > > Hi > > > > > > > > now xpath and xpath_exists supports default namespace too > > > > > > At Wed, 27 Sep 2017 22:41:52 +0200, Pavel Stehule < > pavel.steh...@gmail.com> > > > wrote in > > gmail.com> > > > > > 1. Uniformity among simliar features > > > > > > > > > > As mentioned in the proposal, but it is lack of uniformity that > > > > > the xpath transformer is applied only to xmltable and not for > > > > > other xpath related functions. > > > > > > > > > > > > > I have to fix the XPath function. The SQL/XML function Xmlexists > doesn't > > > > support namespaces/ > > > > > > Sorry, I forgot to care about that. (And the definition of > > > namespace array is of course fabricated by me). I'd like to leave > > > this to committers. Anyway it is working but the syntax (or > > > whether it is acceptable) is still arguable. > > > > > > SELECT xpath('/a/text()', 'http://example.com";> > > > test', > > > ARRAY[ARRAY['', 'http://example.com']]); > > > | xpath > > > | > > > | {test} > > > | (1 row) > > > > > > > > > The internal name is properly rejected, but the current internal > > > name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We > > > are preserving some short names and reject them as > > > user-defined. 
Doesn't just 'pgsqlxml' work? > > > > > > > > > Default namespace correctly become to be applied on bare > > > attribute names. > > > > > > > updated doc, > > > > fixed all variants of expected result test file > > > > > > Sorry for one by one comment but I found another misbehavior. > > > > > > create table t1 (id int, doc xml); > > > insert into t1 > > >values > > >(5, 'http://x.y";>50 > > rows>'); > > > select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x), > > > '/x:rows/x:row' passing t1.doc columns data int PATH > > > 'child::x:a[1][attribute::hoge="haha"]') as x; > > > | data > > > | -- > > > |50 > > > > > > but the following fails. > > > > > > select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), > > > '/rows/row' passing t1.doc columns data int PATH > > > 'child::a[1][attribute::hoge="haha"]') as x; > > > | data > > > | -- > > > | > > > | (1 row) > > > > > > Perhaps child::a is not prefixed by the transformation. > > > > > > > the problem was in unwanted attribute modification. The parser didn't > > detect "attribute::hoge" as attribute. Updated parser does it. I reduce > > duplicated code there more. > > It worked as expected. But the comparison of "attribute" is > missing t1.length = 9 so the following expression wrongly passes. > > child::a[1][attributeabcdefg::hoge="haha" > > It is confusing that is_qual_name becomes true when t2 is not a > "qual name", and the way it treats a double-colon is hard to > understand. > > It essentially does inserting the default namespace before > unqualified non-attribute name. I believe we can easily > look-ahead to detect a double colon and it would make things > simpler. Could you consider something like the attached patch? > (applies on top of ns-4 patch.) > > > > XPath might be complex enough so that it's worth switching to > > > yacc/lex based transformer that is formally verifiable and won't > > > need a bunch of cryptic tests that finally cannot prove the > > > completeness. 
synchronous_standy_names is far simpler than XPath > > > but using yacc/lex parser. > > > > > > > > > Anyway the following is nitpicking of the current xpath_parser.c. > > > > > > - NODENAME_FIRSTCHAR allows '-' as the first char but it is > > > excluded from NameStartChar (https://www.w3.org/TR/REC- > > > xml/#NT-NameStartChar) > > > I think characters with high-bit set is okay. > > > Also IS_NODENAME_CHAR should be changed. > > > > > > > fixed > > > > > > > - NODENAME_FIRSTCHAR and IS_NODENAME_CHAR is in the same category > > > but have different naming schemes. Can these are named in the same > way? > > > > > > > fixed > > > > > > > - The current transoformer seems to using up to one token stack > > > depth. Maybe the stack is needless. (pushed token is always > > > popped just after) > > > > > > > fixed > > Thank you. > > I found another (and should be the last, so sorry..) functional > defect in this. This doesn't add default namespace if the ta
Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages
Hi Dilip,

v6 patch:

42 + /*
43 +  * Estimate number of hashtable entries we can have within maxbytes. This
44 +  * estimates the hash cost as sizeof(PagetableEntry).
45 +  */
46 + nbuckets = maxbytes /
47 +     (sizeof(PagetableEntry) + sizeof(Pointer) + sizeof(Pointer));

It took me a little while to understand this calculation. You have moved this code from tbm_create(), but I think you should move the following comment as well:

tidbitmap.c:
276 /*
277  * Estimate number of hashtable entries we can have within maxbytes. This
278  * estimates the hash cost as sizeof(PagetableEntry), which is good enough
279  * for our purpose. Also count an extra Pointer per entry for the arrays
280  * created during iteration readout.
281  */

Regards, Amul
Re: [HACKERS] why not parallel seq scan for slow functions
On Wed, Nov 8, 2017 at 6:48 PM, Robert Haas wrote:
> On Wed, Nov 8, 2017 at 7:26 AM, Amit Kapila wrote:
>> We do want to generate it later when there isn't inheritance involved,
>> but only if there is a single rel involved (simple_rel_array_size
>> <=2). The rule is something like this, we will generate the gather
>> paths at this stage only if there are more than two rels involved and
>> there isn't inheritance involved.
>
> Why is that the correct rule?
>
> Sorry if I'm being dense here. I would have thought we'd want to skip
> it for the topmost scan/join rel regardless of the presence or absence
> of inheritance.

I think I understood your concern after some offlist discussion: it is primarily due to the inheritance-related check, which can skip the generation of gather paths when it shouldn't. So what might fit better here is a straight check on the number of base rels, which allows generating a gather path in set_rel_pathlist if there are multiple baserels involved. I have used all_baserels, which I think will work better for this purpose.

-- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com

parallel_paths_include_tlist_cost_v6.patch Description: Binary data
Re: [HACKERS] Restricting maximum keep segments by repslots
Oops! The previous patch forgets the default case and crashes.

At Wed, 08 Nov 2017 13:14:31 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20171108.131431.170534842.horiguchi.kyot...@lab.ntt.co.jp>
> > I don't think 'distance' is a good metric - that's going to continually
> > change. Why not store the LSN that's available and provide a function
> > that computes this? Or just rely on the lsn - lsn operator?
>
> It seems reasonable. The 'secured minimum LSN' is common among
> all slots so showing it in the view may look a bit stupid, but I
> don't find another suitable place for it. distance = 0 meant the
> state that the slot is living but insecured in the previous patch,
> and that information is lost by changing 'distance' to
> 'min_secure_lsn'.
>
> Thus I changed the 'live' column to 'status' and show that status
> in text representation.
>
> status: secured | insecured | broken
>
> So this looks like the following (max_slot_wal_keep_size = 8MB,
> which is a half of the default segment size)
>
> -- slots whose required WAL is surely available
> select restart_lsn, status, min_secure_lsn, pg_current_wal_lsn() from pg_replication_slots;
>  restart_lsn | status  | min_recure_lsn | pg_current_wal_lsn
> -------------+---------+----------------+--------------------
>  0/1A60      | secured | 0/1A00         | 0/1B42BC78
>
> -- slots whose required WAL is still available but insecured
>  restart_lsn | status    | min_recure_lsn | pg_current_wal_lsn
> -------------+-----------+----------------+--------------------
>  0/1A60      | insecured | 0/1C00         | 0/1D76C948
>
> -- slots whose required WAL is lost
> # We should have seen the log 'Some replication slots have lost...'
>
>  restart_lsn | status | min_recure_lsn | pg_current_wal_lsn
> -------------+--------+----------------+--------------------
>  0/1A60      | broken | 0/1C00         | 0/1D76C9F0
>
> I noticed that I had abandoned the segment fragment of
> max_slot_wal_keep_size in the calculations in the routines. The
> current patch honors the fragment part of max_slot_wal_keep_size.

I changed IsLsnStillAvailable to return meaningful values regardless of whether max_slot_wal_keep_size is set or not.
# I had been forgetting to count the version for the latest several
# patches. I give the version '4' - as the next of the last
# numbered patch.

-- Kyotaro Horiguchi NTT Open Source Software Center

>From 109f056e257aba70dddc8d466767ed0a1db371e2 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi
Date: Tue, 28 Feb 2017 11:39:48 +0900
Subject: [PATCH 1/2] Add WAL releaf vent for replication slots

Adds a capability to limit the number of segments kept by replication
slots by a GUC variable.
---
 src/backend/access/transam/xlog.c             | 39 +++
 src/backend/utils/misc/guc.c                  | 11
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/access/xlog.h                     |  1 +
 4 files changed, 52 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dd028a1..cfdae39 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -105,6 +105,7 @@
 int wal_level = WAL_LEVEL_MINIMAL;
 int CommitDelay = 0;	/* precommit delay in microseconds */
 int CommitSiblings = 5;	/* # concurrent xacts needed to sleep */
 int wal_retrieve_retry_interval = 5000;
+int max_slot_wal_keep_size_mb = 0;

 #ifdef WAL_DEBUG
 bool XLOG_DEBUG = false;
@@ -9432,9 +9433,47 @@ KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
 	if (max_replication_slots > 0 && keep != InvalidXLogRecPtr)
 	{
 		XLogSegNo	slotSegNo;
+		int			slotlimitsegs;
+		uint64		recptroff;
+		uint64		slotlimitbytes;
+		uint64		slotlimitfragment;
+
+		recptroff = XLogSegmentOffset(recptr, wal_segment_size);
+		slotlimitbytes = 1024 * 1024 * max_slot_wal_keep_size_mb;
+		slotlimitfragment = XLogSegmentOffset(slotlimitbytes,
+											  wal_segment_size);
+
+		/* calculate segments to keep by max_slot_wal_keep_size_mb */
+		slotlimitsegs = ConvertToXSegs(max_slot_wal_keep_size_mb,
+									   wal_segment_size);
+		/* honor the fragment */
+		if (recptroff < slotlimitfragment)
+			slotlimitsegs++;

 		XLByteToSeg(keep, slotSegNo, wal_segment_size);

+		/*
+		 * ignore slots if too many wal segments are kept.
+		 * max_slot_wal_keep_size is just accumulated on wal_keep_segments.
+		 */
+		if (max_slot_wal_keep_size_mb > 0 && slotSegNo + slotlimitsegs < segno)
+		{
+			segno = segno - slotlimitsegs; /* must be positive */
+
+			/*
+			 * warn only if the checkpoint flushes the required segment.
+			 * we assume here that *logSegNo is calculated keep location.
+			 */
+			if (slotSegNo < *logSegNo)
+				ereport(WARNING,
+						(errmsg("restart LSN of replication slots is ignored by checkpoint"),
+						 errdetail("Some replication slots have lost required WAL segments to continue by up to %ld segments.",
+								   (segno < *logSegNo ? segno : *logSegNo) - slotS
Re: [HACKERS] [PATCH] Improve geometric types
Hello,

> I'd like to put comments on 0001 and 0004 only now: ...

I don't have a comment on 0002. About 0003:

> @@ -4487,21 +4486,21 @@ circle_in(PG_FUNCTION_ARGS)
> ...
> circle->radius = single_decode(s, &s, "circle", str);
> - if (circle->radius < 0)
> + if (float8_lt(circle->radius, 0.0))
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),

float8_lt and its family of functions are provided to unify the sorting order including NaN. NaN is not rejected by the use of float8_lt in this case, but that is not what the function is expected to be used for. If we wanted to check whether the value is positive, it would unexpectedly throw an exception. (I suppose that NaNs should be silently ignored rather than stopping a query by throwing an exception.) In addition, I don't think it proper to apply EPSILON(!) there. The comparison should be strict regardless of whether EPSILON is considered or not.

Similarly, in circle_overlap for example, float8_le is used to compare the distance and the summed radius. NaN causes a problem in another place.

> PG_RETURN_BOOL(FPle(point_dt(&circle1->center, &circle2->center),
> float8_pl(circle1->radius, circle2->radius)));

If the distance is NaN and the summed radius is not, the function returns true. I think that a reasonable behavior is that an object containing NaN cannot make any meaningful relationship with other objects, as floating-point numbers themselves behave that way. (Any comparison other than != with NaN always returns false.) Just using another series of comparison functions that return false for NaN-containing comparisons is not a solution, since the meaning of the returned false differs by context, just the same as the first problem above. For example, in the fictitious functions below,

| bool circle_overlaps()
|   ret = FPle(distance, radius_sum);

this gives correct results, but

| bool circle_not_overlaps()
|   ret = FPgt(distance, radius_sum);

this gives a wrong result for NaN-containing objects.
Perhaps it is enough to explicitly define behaviors for NaN before comparison. circle_overlap():

> distance = point_dt();
> radius_sum = float8_pl(...);
>
> /* NaN-containing objects don't overlap any other objects */
> if (isnan(distance) || isnan(radius_sum))
>     PG_RETURN_BOOL(false);
>
> /* NaN ordering of FPle() doesn't get into mischief here */
> PG_RETURN_BOOL(FPle(distance, radius_sum));

(End of the comment on 0003)

regards, -- Kyotaro Horiguchi NTT Open Source Software Center
[HACKERS] (spelling) Ensure header of postgresql.auto.conf is consistent
Attached a patch that ensures the header of postgresql.auto.conf is consistent, whether created by initdb or recreated when ALTER SYSTEM is issued. The tiny difference caused some false-positives on our configuration management identifying changes, which was enough of an itch for me to scratch. regards, Feike Steenbergen 0001-Make-header-of-postgresql.conf.auto-consistent.patch Description: Binary data