Re: [HACKERS] Declarative partitioning - another take
> > With this patch, the mapping is created *only once* during
> > RelationBuildPartitionDesc() to assign canonical indexes to individual
> > list values. The partition OID array will also be rearranged such that
> > using the new (canonical) index instead of the old
> > catalog-scan-order-based index will retrieve the correct partition for
> > that value.
> >
> > By the way, I fixed one thinko in your patch as follows:
> >
> > -result->oids[i] = oids[mapping[i]];
> > +result->oids[mapping[i]] = oids[i];

While I cannot spot any problem with this logic, when I make that change
and run the partition_join testcase in my patch, it fails because the
wrong partitions are matched for a partition-wise join of list partitions.
In that patch, the RelOptInfos of the partitions are saved in the
RelOptInfo of the parent by matching their OIDs, in the same order as the
corresponding OIDs. Partition-wise join simply joins the RelOptInfos at
the same positions from both parent RelOptInfos. I cannot spot an error in
this logic either.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
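To see why the direction of the copy matters, here is a minimal standalone sketch (the helper name and array sizes are illustrative, not the actual patch code). `mapping[i]` gives the canonical index for the partition found at catalog-scan position i; the corrected line copies each OID *to* its mapped slot rather than *from* it.

```c
#include <assert.h>

#define NPARTS 3

/*
 * Rearrange an OID array into canonical order.  mapping[i] is the
 * canonical index of the partition at catalog-scan position i.
 */
static void
canonicalize_oids(const unsigned oids[NPARTS], const int mapping[NPARTS],
                  unsigned result[NPARTS])
{
    for (int i = 0; i < NPARTS; i++)
        result[mapping[i]] = oids[i];   /* the fixed direction */
}
```

With `oids = {1001, 1002, 1003}` and `mapping = {2, 0, 1}`, the partition scanned first lands in canonical slot 2; reversing the subscripts (the original thinko) would instead scatter OIDs by reading from the mapped position.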
Re: [HACKERS] Fix checkpoint skip logic on idle systems by tracking LSN progress
On Thu, May 19, 2016 at 6:57 AM, Michael Paquier wrote:
> I am adding that to the commit fest of September.

And a lot of activity has happened here since. Attached are refreshed
patches based on da6c4f6. v11 still applies correctly, but it's always
better to avoid hunks when applying them.
--
Michael

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 289d240..0fd2e2b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8448,8 +8448,12 @@ CreateCheckPoint(int flags)
 	if ((flags & (CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_END_OF_RECOVERY |
 				  CHECKPOINT_FORCE)) == 0)
 	{
+		elog(LOG, "Not a forced or shutdown checkpoint: progress_lsn %X/%X, ckpt %X/%X",
+			 (uint32) (progress_lsn >> 32), (uint32) progress_lsn,
+			 (uint32) (ControlFile->checkPoint >> 32), (uint32) ControlFile->checkPoint);
 		if (progress_lsn == ControlFile->checkPoint)
 		{
+			elog(LOG, "Checkpoint is skipped");
 			WALInsertLockRelease();
 			LWLockRelease(CheckpointLock);
 			END_CRIT_SECTION();
@@ -8616,7 +8620,11 @@ CreateCheckPoint(int flags)
 	 * recovery we don't need to write running xact data.
 	 */
 	if (!shutdown && XLogStandbyInfoActive())
-		LogStandbySnapshot();
+	{
+		XLogRecPtr	lsn = LogStandbySnapshot();
+		elog(LOG, "snapshot taken by checkpoint %X/%X",
+			 (uint32) (lsn >> 32), (uint32) lsn);
+	}
 	START_CRIT_SECTION();
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 3a791eb..7637a1d 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -331,7 +331,9 @@ BackgroundWriterMain(void)
 			GetLastCheckpointRecPtr() < current_progress_lsn &&
 			last_progress_lsn < current_progress_lsn)
 		{
-			(void) LogStandbySnapshot();
+			XLogRecPtr	lsn = LogStandbySnapshot();
+			elog(LOG, "snapshot taken by bgwriter %X/%X",
+				 (uint32) (lsn >> 32), (uint32) lsn);
 			last_snapshot_ts = now;
 			last_progress_lsn = current_progress_lsn;
 		}
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index b019bc1..ac40731 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2507,7 +2507,7 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 							heaptup->t_len - SizeofHeapTupleHeader);
 		/* filtering by origin on a row level is much more efficient */
-		XLogIncludeOrigin();
+		XLogSetFlags(XLOG_INCLUDE_ORIGIN);
 		recptr = XLogInsert(RM_HEAP_ID, info);
@@ -2846,7 +2846,7 @@ heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
 		XLogRegisterBufData(0, tupledata, totaldatalen);
 		/* filtering by origin on a row level is much more efficient */
-		XLogIncludeOrigin();
+		XLogSetFlags(XLOG_INCLUDE_ORIGIN);
 		recptr = XLogInsert(RM_HEAP2_ID, info);
@@ -3308,7 +3308,7 @@ l1:
 	}
 	/* filtering by origin on a row level is much more efficient */
-	XLogIncludeOrigin();
+	XLogSetFlags(XLOG_INCLUDE_ORIGIN);
 	recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_DELETE);
@@ -6035,7 +6035,7 @@ heap_finish_speculative(Relation relation, HeapTuple tuple)
 	XLogBeginInsert();
 	/* We want the same filtering on this as on a plain insert */
-	XLogIncludeOrigin();
+	XLogSetFlags(XLOG_INCLUDE_ORIGIN);
 	XLogRegisterData((char *) &xlrec, SizeOfHeapConfirm);
 	XLogRegisterBuffer(0, buffer, REGBUF_STANDARD);
@@ -7703,7 +7703,7 @@ log_heap_update(Relation reln, Buffer oldbuf,
 	}
 	/* filtering by origin on a row level is much more efficient */
-	XLogIncludeOrigin();
+	XLogSetFlags(XLOG_INCLUDE_ORIGIN);
 	recptr = XLogInsert(RM_HEAP_ID, info);
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index e11b229..9130816 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5232,7 +5232,7 @@ XactLogCommitRecord(TimestampTz commit_time,
 		XLogRegisterData((char *) (&xl_origin), sizeof(xl_xact_origin));
 	/* we allow filtering by xacts */
-	XLogIncludeOrigin();
+	XLogSetFlags(XLOG_INCLUDE_ORIGIN);
 	return XLogInsert(RM_XACT_ID, info);
 }
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c1b9a97..289d240 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -442,11 +442,30 @@ typedef struct XLogwrtResult
  * the WAL record is just copied to the page and the lock is released. But
  * to avoid the deadlock-scenario explained above, the indicator is always
  * updated before sleeping while holding an insertion lock.
+ *
+ * The progressAt values indicate the insertion progress used to determine
+ * WAL insertion activity since a previous checkpoint, which is aimed at
+ * finding out if a checkpoint should be skipped or not or if standby
+ * activity should be logged. Progress position is basically updated
+ * for all types of records, for the time being only snapshot logging
+ * is out of this scope to properly skip their logging o
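The elog() calls added by the patch all print LSNs with the conventional %X/%X split. As a standalone illustration (the typedef here is a stand-in for PostgreSQL's own), a 64-bit LSN is rendered as its high and low 32-bit halves:

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <assert.h>

typedef uint64_t XLogRecPtr;    /* stand-in for the PostgreSQL typedef */

/* Render an LSN in the %X/%X style used by the elog() calls above. */
static void
format_lsn(XLogRecPtr lsn, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%X/%X",
             (uint32_t) (lsn >> 32),    /* high half */
             (uint32_t) lsn);           /* low half */
}
```

So an LSN of 0x123456789 prints as "1/23456789", matching what the debug messages in the patch would emit.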
Re: [HACKERS] Requesting some information about the small portion of source code of postgreSQL
On 27 September 2016 at 14:13, Srinivas Karthik V wrote:
> Dear PostgresSQL Hackers,
> I am working in optimizer module of postgreSQL 9.4.1.

Why would you target a severely obsolete patch release?

> Also I would like to know for what targetlist stands for.

Can't really comment on the rest, but the targetlist is what gets
projected. It is the query or subquery output. It also contains a bunch of
internal-use values that get discarded and are not returned to the user;
these are labeled "resjunk".

e.g. in

    SELECT a, b, f(b,c) / 100, 'x' FROM some_table;

the targetlist has four entries: two simple column references to "a" and
"b"; an expression for f(b,c)/100 containing the column references for "b"
and "c" and various surrounding expression nodes; and finally a literal
entry for 'x'.

You'll find turning on the various debug print options for parse trees,
rewriter output and plan trees helpful in understanding what's going on.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Remove superuser() checks from pgstattuple
On Tue, Sep 27, 2016 at 4:43 AM, Stephen Frost wrote:
> * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
>> This is now being obsoleted by the later idea of allowing base installs
>> from a chain of upgrade scripts. But if your upgrade scripts contain
>> ALTER TABLE commands, you will probably still want to write base install
>> scripts that do a fresh CREATE TABLE instead.
>
> I've updated the patch to remove the new base version script and to rely
> on the changes made by Tom to install the 1.4 version and then upgrade
> to 1.5.
>
> Based on my testing, it appears to all work correctly.

Same conclusion from here.

> Updated (much smaller) patch attached.

I had a look at it, and it is doing the work it claims to do. So I am
marking it as "Ready for Committer".
--
Michael
[HACKERS] Requesting some information about the small portion of source code of postgreSQL
Dear PostgresSQL Hackers,

I am working in the optimizer module of postgreSQL 9.4.1. I am trying to
return a subplan for a query instead of the full plan. For this I need to
return an intermediate plan (or path) from the DP lattice (i.e. a
RelOptInfo from standard_join_search() in allpaths.c) instead of the full
optimal plan (which comes from the last level of root->join_rel_level).

While doing so I am getting the error "variable not found in subplan
target lists" in the function fix_join_expr_mutator() in setrefs.c. It
would be great if you could give me some idea about why this error is
happening. Please give me more information about this portion of code.
Also, I would like to know what the targetlist stands for.

Thanks,
Srinivas Karthik
Re: [HACKERS] patch: function xmltable
2016-09-27 5:53 GMT+02:00 Craig Ringer:
> On 24 September 2016 at 14:01, Pavel Stehule wrote:
>
> >> Did some docs copy-editing and integrated some examples. Explained how
> >> nested elements work, that multiple top level elements is an error,
> >> etc. Explained the time-of-evaluation stuff. Pointed out that you can
> >> refer to prior output columns in PATH and DEFAULT, since that's weird
> >> and unusual compared to normal SQL. Documented handling of multiple
> >> node matches, including the surprising results of somepath/text() on
> >> xy. Documented handling of nested
> >> elements. Documented that xmltable works only on XML documents, not
> >> fragments/forests.
> >
> > I don't understand to this sentence: "It is possible for a PATH expression
> > to reference output columns that appear before it in the column-list, so
> > paths may be dynamically constructed based on other parts of the XML
> > document:"
>
> This was based on a misunderstanding of something you said earlier. I
> thought the idea was to allow this to work:
>
>     SELECT * FROM xmltable('/x' PASSING
>     'avalue' COLUMNS elemName text,
>     extractedValue text PATH elemName);
>
> ... but it doesn't:
>
>     SELECT * FROM xmltable('/x' PASSING
>     'avalue' COLUMNS elemName text,
>     extractedValue text PATH elemName);
>     ERROR: column "elemname" does not exist
>     LINE 1: ...' COLUMNS elemName text, extractedValue text PATH elemName);
>
> ... so please delete that text. I thought I'd tested it but the state
> of my tests dir says I just got distracted by another task at the
> wrong time.

deleted

Regards

Pavel

> --
> Craig Ringer http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services

xmltable-11.patch.gz
Description: GNU Zip compressed data
Re: [HACKERS] Declarative partitioning - another take
On 2016/09/22 19:10, Ashutosh Bapat wrote:
> On Thu, Sep 22, 2016 at 1:02 PM, Ashutosh Bapat wrote:
>> For list partitions, the ListInfo stores the index maps for values,
>> i.e. the index of the partition to which the value belongs. Those
>> indexes are the same as the indexes in the partition OIDs array and come
>> from the catalogs. In case a user creates two partitioned tables with
>> exactly the same lists for partitions but specifies them in a different
>> order, the OIDs are stored in the order specified. This means that the
>> index arrays for these tables come out different. equal_list_info()
>> works around that by creating an array of mappings and checks whether
>> that mapping is consistent for all values. This means we will create
>> the mapping as many times as equal_list_info() is called, which is
>> expected to be more than the number of times
>> RelationBuildPartitionDescriptor() is called. Instead, if we
>> "canonicalise" the indexes so that they come out exactly the same for
>> similarly partitioned tables, we build the mapping only once and
>> arrange the OIDs accordingly.
>>
>> Here's a patch to do that. I have run make check with this and it didn't
>> show any failure. Please consider this to be included in your next set
>> of patches.

Thanks. It seems like this will save quite a few cycles for future users
of equal_list_info(). I will incorporate it into my patch.

With this patch, the mapping is created *only once* during
RelationBuildPartitionDesc() to assign canonical indexes to individual
list values. The partition OID array will also be rearranged such that
using the new (canonical) index instead of the old
catalog-scan-order-based index will retrieve the correct partition for
that value.

By the way, I fixed one thinko in your patch as follows:

-result->oids[i] = oids[mapping[i]];
+result->oids[mapping[i]] = oids[i];

That is, copy an OID at a given position in the original array to a
*mapped position* in the result array (instead of the other way around).

> The patch has an if condition as a statement by itself:
>
> +if (l1->null_index != l2->null_index);
> return false;
>
> There shouldn't be a ';' at the end. It looks like in the tests you have
> added, the function always bails out before it reaches this statement.

There is no user of equal_list_info() such that the above bug would have
caused a regression test failure. It could perhaps cause a segfault (due
to a dangling pointer into the partition descriptor's listinfo after the
above function unintentionally returns false to equalPartitionDescs(),
which is in turn called by RelationClearRelation()).

Thanks,
Amit
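For readers less familiar with this C pitfall, here is a toy reproduction of the stray-semicolon bug quoted above (the function names are hypothetical, not the actual patch code). The ';' terminates the if with an empty statement, so the return executes unconditionally:

```c
#include <assert.h>
#include <stdbool.h>

/* Buggy form: the ';' ends the if, so "return false" always runs. */
static bool
null_index_equal_buggy(int idx1, int idx2)
{
    if (idx1 != idx2);      /* BUG: empty statement */
        return false;       /* executes unconditionally */
    return true;            /* unreachable */
}

/* Fixed form: the return is actually guarded by the condition. */
static bool
null_index_equal_fixed(int idx1, int idx2)
{
    if (idx1 != idx2)
        return false;
    return true;
}
```

This matches the observed behavior: the buggy function reports "not equal" even for identical inputs, which in equal_list_info() would make equalPartitionDescs() conclude the descriptors differ.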
Re: [HACKERS] Push down more full joins in postgres_fdw
On Tue, Sep 27, 2016 at 8:48 AM, Etsuro Fujita wrote:
> On 2016/09/26 20:20, Ashutosh Bapat wrote:
>> On Mon, Sep 26, 2016 at 4:06 PM, Etsuro Fujita wrote:
>>> On 2016/09/26 18:06, Ashutosh Bapat wrote:
>>>> On Mon, Sep 26, 2016 at 1:05 PM, Etsuro Fujita wrote:
>>>>> ISTM that the use of the same RTI for subqueries in multi-levels in
>>>>> a remote SQL makes the SQL a bit difficult to read. How about using
>>>>> the position of the join rel in join_rel_list (more precisely, the
>>>>> position plus list_length(root->parse->rtable)) instead?
>>>>
>>>> We switch to a hash table to maintain the join RelOptInfos when the
>>>> number of joins grows larger, where the position won't make much
>>>> sense.
>>>
>>> That's right, but we still store the joinrel into join_rel_list after
>>> creating that hash table.
>>
>> As the list grows, it will take longer to locate the RelOptInfo of the
>> given join. Doing that for creating an alias seems an overkill.
>
> The join rel is appended to the end of the list, so I was thinking to get
> the position info by list_length during postgresGetForeignJoinPaths.

That's true only when the paths are being added to a newly created
joinrel. But that's not always true. We may add paths with a different
join order to an existing joinrel, in which case list_length would not
give its position. Am I missing something?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Re: [HACKERS] pageinspect: Hash index support
Peter Eisentraut wrote:
> On 9/26/16 1:39 PM, Jesper Pedersen wrote:
>>> - hash_metap result fields spares and mapp should be arrays of integer.
>>
>> B-tree and BRIN use a 'text' field as output, so left as is.
>
> These fields are specific to hash, so the precedent doesn't necessarily
> apply.
>
>>> - The data field could be of type bytea.
>>
>> Left as is, for the same reasons as 'spares' and 'mapp'.
>
> Comments from others here? Why not use bytea instead of text?

The BRIN pageinspect functions aren't a particularly well thought-out
precedent IMO. There was practically no review of that part of the BRIN
patch, and I just threw it together in the way that seemed to make the
most sense at the time. If there's some reason why some other way is
better for hash indexes, that should be followed rather than mimicking
BRIN.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] patch: function xmltable
On 24 September 2016 at 14:01, Pavel Stehule wrote:
>> Did some docs copy-editing and integrated some examples. Explained how
>> nested elements work, that multiple top level elements is an error,
>> etc. Explained the time-of-evaluation stuff. Pointed out that you can
>> refer to prior output columns in PATH and DEFAULT, since that's weird
>> and unusual compared to normal SQL. Documented handling of multiple
>> node matches, including the surprising results of somepath/text() on
>> xy. Documented handling of nested
>> elements. Documented that xmltable works only on XML documents, not
>> fragments/forests.
>
> I don't understand to this sentence: "It is possible for a PATH expression
> to reference output columns that appear before it in the column-list, so
> paths may be dynamically constructed based on other parts of the XML
> document:"

This was based on a misunderstanding of something you said earlier. I
thought the idea was to allow this to work:

    SELECT * FROM xmltable('/x' PASSING
    'avalue' COLUMNS elemName text,
    extractedValue text PATH elemName);

... but it doesn't:

    SELECT * FROM xmltable('/x' PASSING
    'avalue' COLUMNS elemName text,
    extractedValue text PATH elemName);
    ERROR: column "elemname" does not exist
    LINE 1: ...' COLUMNS elemName text, extractedValue text PATH elemName);

... so please delete that text. I thought I'd tested it but the state
of my tests dir says I just got distracted by another task at the
wrong time.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] patch: function xmltable
2016-09-27 3:34 GMT+02:00 Craig Ringer:
> On 24 September 2016 at 14:01, Pavel Stehule wrote:
>
> >> Did some docs copy-editing and integrated some examples. Explained how
> >> nested elements work, that multiple top level elements is an error,
> >> etc. Explained the time-of-evaluation stuff. Pointed out that you can
> >> refer to prior output columns in PATH and DEFAULT, since that's weird
> >> and unusual compared to normal SQL. Documented handling of multiple
> >> node matches, including the surprising results of somepath/text() on
> >> xy. Documented handling of nested
> >> elements. Documented that xmltable works only on XML documents, not
> >> fragments/forests.
> >
> > I don't understand to this sentence: "It is possible for a PATH expression
> > to reference output columns that appear before it in the column-list, so
> > paths may be dynamically constructed based on other parts of the XML
> > document:"
>
> >> The docs and tests don't seem to cover XML entities. What's the
> >> behaviour there? Core XML only defines one entity, but if a schema
> >> defines more how are they processed? The tests need to cover the
> >> predefined entities " & ' < and > at least.
> >
> > I don't understand, what you are propose here. ?? Please, can you send
> > some examples.
>
> Per below - handling of DTD declarations, and the builtin
> entity tests I already added tests for.
>
> >> It doesn't seem to cope with internal DTDs at all (libxml2 limitation?):
> >>
> >> SELECT * FROM xmltable('/' PASSING $XML$
> >> standalone="yes" ?>
> >> ]>
> >> Hello &pg;.
> >> $XML$ COLUMNS foo text);
> >>
> >> + ERROR: invalid XML content
> >> + LINE 1: SELECT * FROM xmltable('/' PASSING $XML$ ...
> >> +    ^
> >> + DETAIL: line 2: StartTag: invalid element name
> >> +    ^
> >> + line 3: StartTag: invalid element name
> >> +    ^
> >> + line 4: StartTag: invalid element name
> >> +    ^
> >> + line 6: Entity 'pg' not defined
> >> + Hello &pg;.
> >> +    ^
>
> > It is rejected before XMLTABLE function call
> >
> > postgres=# select $XML$
> > postgres$# ]>
> > postgres$# Hello &pg;.
> > postgres$# $XML$::xml;
> > ERROR: invalid XML content
> > LINE 1: select $XML$
> >    ^
> > DETAIL: line 2: StartTag: invalid element name
> [snip]
> > It is disabled by default in libxml2. I found a function
> > xmlSubstituteEntitiesDefault http://www.xmlsoft.org/entities.html
> > http://www.xmlsoft.org/html/libxml-parser.html#xmlSubstituteEntitiesDefault
> >
> > The default behave should be common for all PostgreSQL's libxml2 based
> > function - and then it is different topic - maybe part for PostgreSQL ToDo?
> > But I don't remember any user requests related to this issue.
>
> OK, so it's not xmltable specific. Fine by me.
>
> Somebody who cares can deal with it. There's clearly nobody breaking
> down the walls wanting the feature.
>
> > I removed this tests - it is not related to XMLTABLE function, but to
> > generic XML processing/validation.
>
> Good plan.
>
> >> In +ExecEvalTableExpr, shouldn't you be using PG_ENSURE_ERROR_CLEANUP
> >> instead of a PG_TRY() / PG_CATCH() block?
> >
> > If I understand to doc, the PG_ENSURE_ERROR_CLEANUP should be used, when
> > you want to catch FATAL errors (and when you want to clean shared memory).
> > XMLTABLE doesn't use shared memory, and doesn't need to catch fatal errors.
>
> Ok, makes sense.
>
> >> I don't have the libxml knowledge or remaining brain to usefully
> >> evaluate the xpath and xml specifics in xpath.c today. It does strike
> >> me that the new xpath parser should probably live in its own file,
> >> though.
>
> > moved
>
> Thanks.
>
> > new version is attached
>
> Great.
>
> I'm marking this ready for committer at this point.

Thank you very much

Regards

Pavel

> I think the XML parser likely needs a more close reading, so I'll ping
> Peter E to see if he'll have a chance to check that bit out. But by
> and large I think the issues have been ironed out - in terms of
> functionality, structure and clarity I think it's looking solid.
>
> --
> Craig Ringer http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/09/26 20:20, Ashutosh Bapat wrote:
> On Mon, Sep 26, 2016 at 4:06 PM, Etsuro Fujita wrote:
>> On 2016/09/26 18:06, Ashutosh Bapat wrote:
>>> On Mon, Sep 26, 2016 at 1:05 PM, Etsuro Fujita wrote:
>>>> ISTM that the use of the same RTI for subqueries in multi-levels in a
>>>> remote SQL makes the SQL a bit difficult to read. How about using the
>>>> position of the join rel in join_rel_list (more precisely, the
>>>> position plus list_length(root->parse->rtable)) instead?
>>>
>>> We switch to hash table to maintain the join RelOptInfos when the
>>> number of joins grows larger, where the position won't make much sense.
>>
>> That's right, but we still store the joinrel into join_rel_list after
>> creating that hash table.
>
> As the list grows, it will take longer to locate the RelOptInfo of the
> given join. Doing that for creating an alias seems an overkill.

The join rel is appended to the end of the list, so I was thinking to get
the position info by list_length during postgresGetForeignJoinPaths.

Best regards,
Etsuro Fujita
Re: [HACKERS] pageinspect: Hash index support
On 9/26/16 1:39 PM, Jesper Pedersen wrote:
> Left as is, since BuildTupleFromCStrings() vs. xyzGetDatum() are equally
> readable in this case. But, I can change the patch if needed.

The point is that to use BuildTupleFromCStrings() you need to convert
numbers to strings, and then they are converted back. This is not a
typical way to write row-returning functions.

>> - hash_metap result fields spares and mapp should be arrays of integer.
>
> B-tree and BRIN use a 'text' field as output, so left as is.

These fields are specific to hash, so the precedent doesn't necessarily
apply.

>> - The data field could be of type bytea.
>
> Left as is, for the same reasons as 'spares' and 'mapp'.

Comments from others here? Why not use bytea instead of text?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)
On Tue, Sep 27, 2016 at 11:16 AM, Peter Eisentraut wrote:
> On 9/23/16 11:15 AM, Michael Paquier wrote:
>>> 0002:
>>>
>>> durable_rename needs close(fd) in non-error path (compare backend code).
>>
>> Oops, leak.
>
> Why did you remove the errno save and restore?

That's this bit in durable_rename, patch 0002:

+    if (fsync(fd) != 0)
+    {
+        fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
+                progname, newfile, strerror(errno));
+        close(fd);
+        return -1;
+    }
+    close(fd);

I thought that as long as the error string is shown to the user, it does
not matter much whether errno is still saved or not. None of the callers
of durable_rename() on the frontend side check strerror(errno) afterwards.
Do you think it matters? Changing that back is trivial. Sorry for not
mentioning directly in the thread that I modified that when dropping the
last patch set.
--
Michael
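The save/restore pattern under discussion looks like this as a standalone sketch (the function is a hypothetical stand-in, not the actual pg_basebackup code): close() may clobber errno, so the original fsync() failure code is captured before cleanup and restored for the caller.

```c
#include <errno.h>
#include <unistd.h>
#include <assert.h>

/* Report fsync failure to the caller with errno preserved. */
static int
fsync_then_close(int fd)
{
    if (fsync(fd) != 0)
    {
        int save_errno = errno; /* close() below may overwrite errno */

        close(fd);
        errno = save_errno;     /* callers see the fsync failure code */
        return -1;
    }
    return close(fd);
}
```

Whether the extra bookkeeping is needed depends, as the mail says, on whether any frontend caller inspects errno after the failure is already printed.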
Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol
On Mon, Sep 26, 2016 at 9:22 PM, David Steele wrote:
> On 9/26/16 4:54 AM, Heikki Linnakangas wrote:
>> Hmm. The server could send a SCRAM challenge first, and if the client
>> gives an incorrect response, or the username doesn't exist, or the
>> user's password is actually MD5-encrypted, the server could then send an
>> MD5 challenge. It would add one round-trip to the authentication of MD5
>> passwords, but that seems acceptable.

I don't think that this applies just to md5 or scram. Could we for example
use a connection parameter, like expected_auth_methods, to do that? We
include that in the startup packet if the caller has defined it, then the
backend checks for matching entries in pg_hba.conf using the username,
database and the expected auth method if specified.
--
Michael
Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)
On 9/23/16 11:15 AM, Michael Paquier wrote:
>> 0002:
>>
>> durable_rename needs close(fd) in non-error path (compare backend code).
>
> Oops, leak.

Why did you remove the errno save and restore?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] PATCH: two slab-like memory allocators
Hi,

Attached is v2 of the patch, updated based on the review. That means:

- Better comment explaining how free chunks are tracked in the Slab
  context.
- Removed the unused SlabPointerIsValid macro.
- Modified the comment before SlabChunkData, explaining how it relates to
  StandardChunkHeader.
- Replaced the two Assert() calls with elog().
- Implemented SlabCheck(). I've ended up with quite a few checks there,
  checking pointers between the context, block and chunks, checks due to
  MEMORY_CONTEXT_CHECKING etc. And of course, cross-checking the number
  of free chunks (bitmap, freelist vs. chunk header).
- I've also modified SlabContextCreate() to compute chunksPerBlock a bit
  more efficiently (use a simple formula instead of the loop, which might
  be a bit too expensive for large blocks / small chunks).

I haven't done any changes to GenSlab, but I do have a few notes:

Firstly, I've realized there's an issue when chunkSize gets too large -
once it exceeds blockSize, SlabContextCreate() fails as it's impossible
to place a single chunk into the block. In reorderbuffer, this may happen
when the tuples (allocated in tup_context) get larger than 8MB, as the
context uses SLAB_LARGE_BLOCK_SIZE (which is 8MB).

For Slab the elog(ERROR) is fine, as both parameters are controlled by
the developer directly, but GenSlab computes the chunkSize on the fly, so
we must not let it fail like that - that'd result in unpredictable
failures, which is not very nice.

I see two ways to fix this. We may increase the block size automatically
- e.g. instead of specifying chunkSize and blockSize when creating the
Slab, specify chunkSize and chunksPerBlock (and then choose the smallest
2^k block large enough). For example with chunkSize=96 and
chunksPerBlock=1000, we'd get 128kB blocks, as that's the closest 2^k
block larger than 96000 bytes.

But maybe there's a simpler solution - we may simply cap the chunkSize
(in GenSlab) to ALLOC_CHUNK_LIMIT. That's fine, because AllocSet handles
those requests in a special way - for example instead of tracking them in
a freelist, those chunks get freed immediately.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

0001-simple-slab-allocator-fixed-size-allocations.patch
Description: binary/octet-stream

0002-generational-slab-auto-tuning-allocator.patch
Description: binary/octet-stream
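The first option above can be sketched as follows (a hypothetical helper, not the actual patch code): given chunkSize and chunksPerBlock, pick the smallest power-of-two block size that can hold them, starting from one 4kB page.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest 2^k block (>= 4kB) that fits chunksPerBlock chunks. */
static size_t
choose_block_size(size_t chunkSize, size_t chunksPerBlock)
{
    size_t needed = chunkSize * chunksPerBlock;
    size_t blockSize = 4096;        /* start at one page */

    while (blockSize < needed)
        blockSize *= 2;
    return blockSize;
}
```

With chunkSize=96 and chunksPerBlock=1000 this yields 131072 bytes (128kB), matching the example in the mail: 96000 bytes needed, and 128kB is the closest power of two above it.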
Re: [HACKERS] Stopping logical replication protocol
On 26 September 2016 at 21:52, Vladimir Gordiychuk wrote:
>> You should rely on time in tests as little as possible. Some of the test
>> hosts are VERY slow. If possible something deterministic should be used.
>
> That's why this changes difficult to verify. Maybe in that case we should
> write benchmark but not integration test?
> In that case we can say, before this changes stopping logical replication
> gets N ms but after apply changes it gets NN ms where NN ms less than N ms.
> Is it available add this kind of benchmark to postgresql? I will be
> grateful for links.

It's for that reason that I added a message, printed only in verbose mode,
that pg_recvlogical emits when it's exiting after a client-initiated
CopyDone. You can use the TAP tests, print diag messages, etc. But we
generally want them to run fairly quickly, so big benchmark runs aren't
desirable.

You'll notice that I left the diag messages in to report the timing for
the results in your tests; I just changed the tests so they didn't depend
on very tight timing for success/failure anymore.

We don't currently have any automated benchmarking infrastructure.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Detect supported SET parameters when pg_restore is run
Hackers, At work we use several major versions of PostgreSQL, and developers use non-local clusters for developing and debugging. We do dump/restore of schemas/data via the custom/dir formats, and we have to keep client binaries for the 9.2, 9.4 and 9.5 versions on local workstations, because after pg_restore95 connects to 9.2 it fails when it sets run-time parameters via SET: vitaly@work 01:21:26 ~ $ pg_restore95 --host DEV_HOST_9_2 -d DBNAME --data-only -e -1 -Fc arhcivefile Password: pg_restore95: [archiver (db)] Error while INITIALIZING: pg_restore95: [archiver (db)] could not execute query: ERROR: unrecognized configuration parameter "lock_timeout" Command was: SET lock_timeout = 0; Of course, it can be worked around by avoiding "--single-transaction", but if there is an inconsistent schema (or stricter constraints), part of the schema/data has already been changed/inserted and a lot of errors are generated on the next pg_restore run. pg_dump has checks in the "setup_connection" function to detect what to send after the connection is established for dumping, but there are no such checks in _doSetFixedOutputState for restoring. With such checks it would be possible to use a single version of pg_dump96/pg_restore96 to dump/restore, for example, 9.2->9.2 as well as 9.4->9.4 and so on. The only trouble we have is in the "SET" block, and after some research I discovered it is possible not to send unsupported SET options to the database. Please find attached a simple patch. For restoring to stdout (or dumping to a plain SQL file) I kept the current behavior: all options in the SET block are written. Also I left "SET row_security = on;" when "enable_row_security" is set, so that restoring to a server version that does not support it fails. -- Best regards, Vitaly Burovoy detect_supported_set_parameters_for_pgrestore.001.patch Description: Binary data
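The idea of the proposal - emit a SET only when the target server's version is known to support that parameter - can be sketched as follows. This is a hedged Python illustration, not the C patch itself; the function name is invented, though the version cutoffs shown (lock_timeout appearing in 9.3, row_security in 9.5) match when those GUCs were introduced:

```python
# Minimum server version in which each run-time parameter first appeared,
# in the PQserverVersion encoding (major * 10000 + minor * 100).
MIN_VERSION = {
    "statement_timeout": 70300,
    "lock_timeout": 90300,        # introduced in 9.3
    "row_security": 90500,        # introduced in 9.5
}

def set_statements(params, remote_version):
    """Return only the SET statements the target server will accept.

    remote_version is None when writing to stdout / a plain SQL file,
    in which case every option is kept (the current behavior)."""
    out = []
    for name, value in params:
        if remote_version is None or MIN_VERSION.get(name, 0) <= remote_version:
            out.append(f"SET {name} = {value};")
    return out

# A 9.2 server (90200) must not receive "SET lock_timeout = 0;":
stmts = set_statements([("statement_timeout", 0), ("lock_timeout", 0)], 90200)
assert stmts == ["SET statement_timeout = 0;"]
```

With such a filter, the single-transaction restore in the report above would no longer abort on the very first SET.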
Re: [HACKERS] pgbench - allow to store select results into variables
On 2016/09/26 20:27, Fabien COELHO wrote: > > Hello Amit, > >>> I am marking the pgbench-into-5.patch [1] as "Ready for Committer" as I >>> have no further comments at the moment. >> >> Wait... Heikki's latest commit now requires this patch to be rebased. > > Indeed. Here is the rebased version, which still get through my various > tests. Thanks, Fabien. It seems to work here too. Regards, Amit
Re: [HACKERS] patch: function xmltable
On 24 September 2016 at 14:01, Pavel Stehule wrote: >> Did some docs copy-editing and integrated some examples. Explained how >> nested elements work, that multiple top level elements is an error, >> etc. Explained the time-of-evaluation stuff. Pointed out that you can >> refer to prior output columns in PATH and DEFAULT, since that's weird >> and unusual compared to normal SQL. Documented handling of multiple >> node matches, including the surprising results of somepath/text() on >> xy. Documented handling of nested >> elements. Documented that xmltable works only on XML documents, not >> fragments/forests. > > > I don't understand to this sentence: "It is possible for a PATH expression > to reference output columns that appear before it in the column-list, so > paths may be dynamically constructed based on other parts of the XML > document:" >> The docs and tests don't seem to cover XML entities. What's the >> behaviour there? Core XML only defines one entity, but if a schema >> defines more how are they processed? The tests need to cover the >> predefined entities " & ' < and > at least. > > > I don't understand, what you are propose here. ?? Please, can you send some > examples. Per below - handling of DTD declarations, and the builtin entity tests I already added tests for. >> It doesn't seem to cope with internal DTDs at all (libxml2 limitation?): >> >> SELECT * FROM xmltable('/' PASSING $XML$> standalone="yes" ?> >> > >> >> ]> >> Hello &pg;. >> $XML$ COLUMNS foo text); >> >> + ERROR: invalid XML content >> + LINE 1: SELECT * FROM xmltable('/' PASSING $XML$> +^ >> + DETAIL: line 2: StartTag: invalid element name >> + > + ^ >> + line 3: StartTag: invalid element name >> + >> +^ >> + line 4: StartTag: invalid element name >> + >> +^ >> + line 6: Entity 'pg' not defined >> + Hello &pg;. >> +^ >> > > It is rejected before XMLTABLE function call > > postgres=# select $XML$ > postgres$# postgres$# > postgres$# > postgres$# ]> > postgres$# Hello &pg;. 
> postgres$# $XML$::xml; > ERROR: invalid XML content > LINE 1: select $XML$ >^ > DETAIL: line 2: StartTag: invalid element name > It is disabled by default in libxml2. I found a function > xmlSubstituteEntitiesDefault http://www.xmlsoft.org/entities.html > http://www.xmlsoft.org/html/libxml-parser.html#xmlSubstituteEntitiesDefault > > The default behave should be common for all PostgreSQL's libxml2 based > function - and then it is different topic - maybe part for PostgreSQL ToDo? > But I don't remember any user requests related to this issue. OK, so it's not xmltable specific. Fine by me. Somebody who cares can deal with it. There's clearly nobody breaking down the walls wanting the feature. > I removed this tests - it is not related to XMLTABLE function, but to > generic XML processing/validation. Good plan. >> In +ExecEvalTableExpr, shouldn't you be using PG_ENSURE_ERROR_CLEANUP >> instead of a PG_TRY() / PG_CATCH() block? > > > If I understand to doc, the PG_ENSURE_ERROR_CLEANUP should be used, when you > want to catch FATAL errors (and when you want to clean shared memory). > XMLTABLE doesn't use shared memory, and doesn't need to catch fatal errors. Ok, makes sense. >> I don't have the libxml knowledge or remaining brain to usefully >> evaluate the xpath and xml specifics in xpath.c today. It does strike >> me that the new xpath parser should probably live in its own file, >> though. > > moved Thanks. > new version is attached Great. I'm marking this ready for committer at this point. I think the XML parser likely needs a more close reading, so I'll ping Peter E to see if he'll have a chance to check that bit out. But by and large I think the issues have been ironed out - in terms of functionality, structure and clarity I think it's looking solid. 
-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
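The entity behavior discussed in the xmltable thread above is easy to reproduce outside PostgreSQL with any conforming XML parser: the predefined entities (&amp;amp;, &amp;lt;, &amp;gt;, &amp;quot;, &amp;apos;) are always substituted, while an undeclared entity such as &amp;pg; is a well-formedness error. A hedged demonstration using Python's expat-based xml.etree (not libxml2, but the predefined-entity rules come from the XML specification itself):

```python
import xml.etree.ElementTree as ET

# The five predefined entities are part of core XML and always work:
row = ET.fromstring("<row>&lt;b&gt; &amp; &quot;c&quot;</row>")
assert row.text == '<b> & "c"'

# An entity with no declaration is rejected at parse time,
# mirroring the "Entity 'pg' not defined" error from libxml2:
failed = False
try:
    ET.fromstring("<row>Hello &pg;.</row>")
except ET.ParseError:
    failed = True
assert failed
```

This matches the conclusion above: rejection of undeclared entities happens during generic XML parsing/validation, before anything xmltable-specific runs.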
Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)
On 21 September 2016 at 22:16, Robert Haas wrote: > On Wed, Sep 21, 2016 at 3:40 AM, Craig Ringer wrote: >> The only non-horrible one of those is IMO to just let the caller see >> an error if they lose the race. It's a function that's intended for >> use when you're dealing with indeterminate transaction state after a >> server or application error anyway, so it's part of an error path >> already. So I say we just document the behaviour. > > I am not keen on that idea. The errors we're likely to be exposing > are going to look like low-level internal failures, which might scare > some DBA. Even if they don't, I think it's playing with fire. The > system is designed on the assumption that nobody will try to look up > an XID that's too old, and if we start to violate that assumption I > think we're undermining the design integrity of the system in a way > we'll likely come to regret. To put that more plainly, when the code > is written with the assumption that X will never happen, it's usually > a bad idea to casually add code that does X. Fair point. > [snip] > > It might not be too hard to add a second copy of oldestXid in shared > memory that is updated before truncation rather than afterward... but > yeah, like you, I'm not finding this nearly as straightforward as > might have been hoped. Yeah. I suspect that'll be the way to go, to add another copy that's updated before clog truncation. It just seems ... unclean. Like it shouldn't be necessary, something else isn't right. But it's probably the lowest pain option. I'm going to take a step back on this and see if I can spot an alternative. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
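The "second copy of oldestXid updated before truncation" idea can be modeled in a few lines. In this toy (invented names, not the real clog code), readers check a bound that is always advanced *before* pages are removed, so any xid that passes the check still has its page; reversing the two steps would open a window where a lookup hits a missing page and surfaces a scary low-level error:

```python
class ToyClog:
    """Toy commit log: a page per xid, truncated from the old end."""

    def __init__(self, n):
        self.pages = {xid: "committed" for xid in range(n)}
        self.oldest_xid = 0              # bound advanced BEFORE truncation

    def truncate(self, new_oldest):
        self.oldest_xid = new_oldest     # step 1: publish the new bound
        for xid in list(self.pages):     # step 2: actually remove old pages
            if xid < new_oldest:
                del self.pages[xid]

    def status(self, xid):
        if xid < self.oldest_xid:
            raise ValueError("transaction is too old")  # clean, expected error
        return self.pages[xid]           # page guaranteed to still exist


clog = ToyClog(10)
clog.truncate(5)
assert clog.status(7) == "committed"
failed = False
try:
    clog.status(3)                       # truncated away: clean error, not a crash
except ValueError:
    failed = True
assert failed
```

A concurrent reader that samples the bound between step 1 and step 2 gets the clean "too old" error rather than a missing-page failure, which is the invariant the extra shared-memory copy would provide.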
Re: [HACKERS] [COMMITTERS] pgsql: pg_ctl: Detect current standby state from pg_control
On Tue, Sep 27, 2016 at 9:45 AM, Peter Eisentraut wrote: > On 9/26/16 7:56 PM, Peter Eisentraut wrote: >> On 9/26/16 8:53 AM, Tom Lane wrote: >>> I think that it's 100% pointless for get_control_dbstate >>> to be worried about transient CRC failures. If writes to pg_control >>> aren't atomic then we have problems enormously larger than whether >>> "pg_ctl promote" throws an error or not. >> >> The new code was designed to handle that, but if we think it's not an >> issue, then we can simplify it. I'm on it. > > How about this? + if (crc_ok_p) + *crc_ok_p = false; + else + { + pfree(ControlFile); + return NULL; + } Seems overcomplicated to me. How about returning the control file all the time and let the caller pfree the result? You could then use crc_ok in pg_ctl.c's get_control_dbstate() to do the decision-making. Also, if the sleep() loop is removed for promote -w, we assume that we'd fail immediately in case of a corrupted control file. Could it be possible that after a couple of seconds the backend gets it right and overwrites a correct control file on top of a corrupted one? I am just curious about such possibilities, still I am +1 for removing the pg_usleep loop and failing right away as you propose. If we do the renaming of get_controlfile(), it may be also a good idea to do the same for get_configdata() in config_info.c for consistency. That's not really critical IMHO though. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
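The suggested interface - always return the parsed file together with a crc_ok flag, and let the caller decide what to do - looks roughly like this as a hedged Python sketch. The file layout here (payload followed by a trailing CRC-32) is invented for illustration and is not pg_control's actual layout:

```python
import os
import tempfile
import zlib

def write_toy_controlfile(path, payload):
    crc = zlib.crc32(payload).to_bytes(4, "little")
    with open(path, "wb") as f:
        f.write(payload + crc)

def read_toy_controlfile(path):
    """Always return (payload, crc_ok); never raise on a bad checksum."""
    with open(path, "rb") as f:
        raw = f.read()
    payload, stored = raw[:-4], raw[-4:]
    crc_ok = zlib.crc32(payload).to_bytes(4, "little") == stored
    return payload, crc_ok


path = os.path.join(tempfile.mkdtemp(), "toy_control")
write_toy_controlfile(path, b"DB_IN_PRODUCTION")
data, ok = read_toy_controlfile(path)
assert ok and data == b"DB_IN_PRODUCTION"

# Corrupt one byte: the caller still gets the data, plus crc_ok == False,
# and can choose to fail immediately instead of sleeping and retrying.
with open(path, "r+b") as f:
    f.write(b"X")
data, ok = read_toy_controlfile(path)
assert not ok
```

Pushing the decision to the caller is what lets pg_ctl's get_control_dbstate() fail right away on a corrupted file instead of looping, as proposed above.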
[HACKERS] Re: [COMMITTERS] pgsql: pg_ctl: Detect current standby state from pg_control
On 9/26/16 7:56 PM, Peter Eisentraut wrote: > On 9/26/16 8:53 AM, Tom Lane wrote: >> I think that it's 100% pointless for get_control_dbstate >> to be worried about transient CRC failures. If writes to pg_control >> aren't atomic then we have problems enormously larger than whether >> "pg_ctl promote" throws an error or not. > > The new code was designed to handle that, but if we think it's not an > issue, then we can simplify it. I'm on it. How about this? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 0001-Fix-CRC-check-handling-in-get_controlfile.patch Description: invalid/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal_segment size vs max_wal_size
On Mon, Sep 26, 2016 at 9:30 PM, Kuntal Ghosh wrote: > On Mon, Sep 26, 2016 at 5:04 PM, Amit Kapila wrote: >> >> IIRC, there is already a patch to update the minRecoveryPoint >> correctly, can you check if that solves the problem for you? >> >> [1] - >> https://www.postgresql.org/message-id/20160609.215558.118976703.horiguchi.kyotaro%40lab.ntt.co.jp >> > +1. I've tested after applying the patch. This clearly solves the problem. Even though many things have been discussed on this thread, Horiguchi-san's first patch is still the best approach found after several lookups and attempts at messing with the recovery code. -- Michael
Re: [HACKERS] Speedup twophase transactions
On Thu, Sep 22, 2016 at 12:30 AM, Stas Kelvich wrote: > I’m not giving up yet, i’ll write them) I still have in mind several other > patches to 2pc handling in > postgres during this release cycle — logical decoding and partitioned hash > instead of > TwoPhaseState list. > > My bet that relative speed of that patches will depend on used filesystem. > Like it was with the > first patch in that mail thread it is totally possible sometimes to hit > filesystem limits on file > creation speed. Otherwise both approaches should be more or less equal, i > suppose. OK. I am marking this patch as returned with feedback then. Looking forward to seeing the next investigations. At least this review has taught us a thing or two. -- Michael
Re: [HACKERS] VACUUM's ancillary tasks
On Tue, Sep 27, 2016 at 2:33 AM, Tom Lane wrote: > Thomas Munro writes: >> I noticed that ATExecAlterColumnType does this: >> * Drop any pg_statistic entry for the column, since it's now wrong type > >> What if there is no rewrite, because the type is binary compatible >> (ATColumnChangeRequiresRewrite returns false)? Then I think your patch >> won't attract an autovacuum daemon to ANALYZE the table and produce new >> statistics, because it won't touch changes_since_analyze. I wonder if we >> should simply keep the stats if no rewrite is required. Aren't the >> statistical properties of binary-compatible types also compatible? > > Not necessarily: the type semantics might be different --- in fact, > probably are different, else why would there be distinct types in the > first place? It would be unwise to keep the old stats IMO. > > If you need a concrete example, consider OID vs int4. They're > binary-compatible, but since int4 is signed while OID is unsigned, > stats for one would be wrong for the other. This is the same reason > why ALTER COLUMN TYPE has to rebuild indexes even in binary-compatible > cases. Ah, right. Then I think this patch should somehow bump changes_since_analyze in the no-rewrite case if it's going to do it in the rewrite case. It would be surprising and weird if altering a column's type *sometimes* resulted in new statistics being automatically generated to replace those that were dropped, depending on the technical detail of whether a rewrite was necessary. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
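Tom's OID-vs-int4 example is easy to demonstrate: the same 32-bit pattern sorts differently depending on signedness, so histogram bounds computed under one type's ordering are wrong for the other. A small illustration (plain Python, just modeling the two interpretations):

```python
def as_signed32(u):
    """Reinterpret an unsigned 32-bit value as a signed int4."""
    return u - (1 << 32) if u >= (1 << 31) else u

oids = [10, 3_000_000_000, 20]   # valid OIDs; the middle one has the sign bit set

unsigned_order = sorted(oids)                     # OID comparison semantics
signed_order = sorted(oids, key=as_signed32)      # int4 comparison semantics

assert unsigned_order == [10, 20, 3_000_000_000]
assert signed_order == [3_000_000_000, 10, 20]    # 3000000000 reads as -1294967296
```

Any histogram built under the unsigned ordering would mislead the planner once the column is int4, which is why the stats are dropped on ALTER COLUMN TYPE even in binary-compatible cases - and why bumping changes_since_analyze in the no-rewrite path, as suggested above, would make the replacement stats arrive consistently.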
Re: [HACKERS] Add support for restrictive RLS policies
Stephen Frost wrote: > * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: > > Stephen Frost wrote: > > > + > > > + If the policy is a "permissive" or "restrictive" policy. > > > Permissive > > > + policies are the default and always add visibillity- they ar "OR"d > > > + together to allow the user access to all rows through any of the > > > + permissive policies they have access to. Alternativly, a policy > > > can > > > + instead by "restrictive", meaning that the policy will be "AND"d > > > + with other restrictive policies and with the result of all of the > > > + permissive policies on the table. > > > + > > > + > > > + Stephen, > > I dislike the "AND"d and "OR"d spelling of those terms. Currently they > > only appear in comments within rowsecurity.c (of your authorship too, I > > imagine). I think it'd be better to find actual words for those > > actions. > > I'm certainly open to suggestions, should you, or anyone else, have > them. I'll try and come up with something else, but that really is what > we're doing- literally using either AND or OR to join the expressions > together. I understand, but the words "and" and "or" are not verbs. I don't know what are good verbs to use for this but I suppose there must be verbs related to "conjunction" and "disjunction" ("to conjoin" and "to disjoin" appear in the Merriam-Webster dictionary but I don't think they represent the operation very well). Maybe some native speaker would have a better suggestion. > > I don't understand this part. Didn't you say previously that we already > > had this capability in 9.5 and you were only exposing it over SQL? If > > that is true, how come you need to add a new column to this catalog? > > The capability exists in 9.5 through hooks which are available to > extensions, see the test_rls_hooks extension. Those hooks are called > every time and therefore don't require the information about restrictive > policies to be tracked in pg_policy, and so they weren't. 
Since this is > adding support for users to define restrictive policies, we need to save > those policies and therefore we need to distinguish which policies are > restrictive and which are permissive, hence the need to modify pg_policy > to track that information. Ah, okay. I thought you meant that it was already possible to create a policy to behave this way, just not using SQL-based DDL. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
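The combining rule under discussion - permissive policies joined disjunctively, restrictive policies joined conjunctively against that result - can be written out as a toy row filter. This is a hedged Python model of the described semantics, not the PostgreSQL implementation:

```python
def row_visible(row, permissive, restrictive):
    """A row is visible iff ANY permissive policy passes
    AND ALL restrictive policies pass."""
    if not any(p(row) for p in permissive):
        return False
    return all(r(row) for r in restrictive)


# Two permissive policies (either one grants access) and one restrictive one:
permissive = [lambda r: r["owner"] == "alice", lambda r: r["public"]]
restrictive = [lambda r: not r["archived"]]

assert row_visible({"owner": "alice", "public": False, "archived": False},
                   permissive, restrictive)       # via the first permissive policy
assert not row_visible({"owner": "bob", "public": False, "archived": False},
                       permissive, restrictive)   # no permissive policy passes
assert not row_visible({"owner": "alice", "public": True, "archived": True},
                       permissive, restrictive)   # the restrictive policy vetoes
```

The model also shows why the two kinds must be distinguished in pg_policy: the same predicate widens access as a permissive policy but narrows it as a restrictive one.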
Re: [HACKERS] Cache Hash Index meta page.
On Tue, Sep 13, 2016 at 12:55 PM, Mithun Cy wrote: > On Thu, Sep 8, 2016 at 11:21 PM, Jesper Pedersen < > jesper.peder...@redhat.com> wrote: >> >> > For the archives, this patch conflicts with the WAL patch [1]. >> >> > [1] https://www.postgresql.org/message-id/CAA4eK1JS%2BSiRSQBzEFp nsSmxZKingrRH7WNyWULJeEJSj1-%3D0w%40mail.gmail.com >> > > Updated the patch it applies over Amit's concurrent hash index[1] and > Amit's wal for hash index patch[2] together. > I think that this needs to be updated again for v8 of the concurrent hash index patch and v5 of the WAL patch. Thanks, Jeff
Re: [HACKERS] pageinspect: Hash index support
Jeff Janes wrote: > On Tue, Sep 20, 2016 at 12:19 AM, Michael Paquier > wrote: > > > > > Note: the patch checks if a superuser is calling the new functions, > > which is a good thing. > > > > If we only have the bytea functions and the user needs to supply the raw > pages themselves, rather than having the function go get the raw page for > you, is there any reason to restrict the interpretation function to super > users? I guess if we don't trust the C coded functions not to dereference > bogus data in harmful ways? Yeah, it'd likely be simple to manufacture a fake page that causes the server to misbehave resulting in a security leak or at least DoS. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect: Hash index support
On Tue, Sep 20, 2016 at 12:19 AM, Michael Paquier wrote: > > Note: the patch checks if a superuser is calling the new functions, > which is a good thing. > If we only have the bytea functions and the user needs to supply the raw pages themselves, rather than having the function go get the raw page for you, is there any reason to restrict the interpretation function to super users? I guess if we don't trust the C coded functions not to dereference bogus data in harmful ways? Cheers, Jeff
Re: [HACKERS] pgbench more operators & functions
Hello Jeevan, I did the review of your patch and here are my views on your patch. Thanks for this detailed review and debugging! Documentation: [...] it be a good idea to have a table of operators similar to that of functions. We need not have several columns here like description, example etc., but a short table just categorizing the operators would be sufficient. Ok, done. Further testing and review: === 1. Postgres treats '^' as exponentiation rather than XOR, and '#' as XOR. Personally, I think it can cause confusion, so it will be better if we can stick to the behavior of Postgres mathematical operators. Ok. I agree to avoid '^'. 2. I could not see any tests for bitwise operators in the functions.sql file that you have attached. Indeed. Included in attached version. 3. Precedence: [...] Hmm. I got them all wrong, shame on me! I've followed C rules in the updated version. 5. Sorry, I was not able to understand the "should exist" comment in following snippet. +"xor" { return XOR_OP; } /* should exist */ +"^^" { return XOR_OP; } /* should exist */ There is no "logical exclusive or" operator in C nor in SQL. I do not see why not, so I put one... 7. You may want to reword following comment: [...] there -> here Ok, fixed twice. 8. [...] if (coerceToBool(&vargs[0])) *retval = vargs[1]; else *retval = vargs[2]; Ok. Attached is an updated patch & test script. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 285608d..c958c1c 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -821,9 +821,8 @@ pgbench options dbname The expression may contain integer constants such as 5432, double constants such as 3.14159, references to variables :variablename, - unary operators (+, -) and binary operators - (+, -, *, /, - %) with their usual precedence and associativity, + operators + with their usual precedence and associativity, function calls, and parentheses. 
@@ -909,6 +908,84 @@ pgbench options dbname + + Built-In Operators + + + The arithmetic, bitwise, comparison and logical operators listed in +are built into pgbench + and may be used in expressions appearing in + \set. + + + + pgbench Operators + + + + Operator Category + Result Type + List of Operators + + + + + unary arithmetic + integer or double + +, - + + + binary arithmetic + integer or double + +, -, *, / + + + binary arithmetic + integer only + % + + + unary bitwise + integer + ~ + + + binary bitwise + integer + &, |, # + + + shifts + integer + <<, >> + + + comparison + boolean (0/1) + + =/==, <>/!=, + <, <=, >, >= + + + + unary logical + boolean (0/1) + not/! + + + binary logical + boolean (0/1) + + and/&&, + or/||, + xor/^^, + + + + + + + Built-In Functions @@ -955,6 +1032,13 @@ pgbench options dbname 5432.0 + exp(x) + double + exponential + exp(1.0) + 2.718281828459045 + + greatest(a [, ... ] ) double if any a is double, else integer largest value among arguments @@ -962,6 +1046,13 @@ pgbench options dbname 5 + if(c,e1,e2) + same as e* + if c is not zero then e1 else e2 + if(0,1.0,2.0) + 2.0 + + int(x) integer cast to int @@ -976,6 +1067,13 @@ pgbench options dbname 2.1 + ln(x) + double + natural logarithm + ln(2.718281828459045) + 1.0 + + pi() double value of the constant PI diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y index 0cc665b..f8dbbaf 100644 --- a/src/bin/pgbench/exprparse.y +++ b/src/bin/pgbench/exprparse.y @@ -52,11 +52,21 @@ static PgBenchExpr *make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList * %type VARIABLE FUNCTION %token INTEGER_CONST DOUBLE_CONST VARIABLE FUNCTION - -/* Precedence: lowest to highest */ +%token AND_OP OR_OP XOR_OP NE_OP LE_OP GE_OP LS_OP RS_OP + +/* Precedence: lowest to highest, taken from C */ +%left OR_OP +%left XOR_OP +%left AND_OP +%left '|' +%left '#' +%left '&' +%left '=' NE_OP +%left '<' '>' LE_OP GE_OP +%left LS_OP RS_OP %left '+' '-' %left '*' '/' '%' -%right UMINUS +%right 
UNARY %% @@ -68,14 +78,32 @@ elist: { $$ = NULL; } ; expr: '(' expr ')' { $$ = $2; } - | '+' expr %prec UMINUS { $$ = $2; } - | '-' expr %prec UMINUS { $$ = make_op(yyscanner, "-", + | '+' expr %prec UNARY { $$ = $2; } + | '-' expr %prec UNAR
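For reference, the semantics the pgbench patch above gives the new logical operators and if() can be stated compactly. This is a hedged Python rendering of the behavior described in the diff, not the C evaluator; pgbench-style booleans are modeled as 0/1 integers:

```python
def pg_not(a):
    return int(a == 0)

def pg_and(a, b):
    return int(a != 0 and b != 0)

def pg_or(a, b):
    return int(a != 0 or b != 0)

def pg_xor(a, b):
    return int((a != 0) != (b != 0))     # true iff exactly one side is true

def pg_if(c, e1, e2):
    return e1 if c != 0 else e2          # coerceToBool applied to the condition


assert pg_xor(1, 0) == 1 and pg_xor(1, 1) == 0
assert pg_if(0, 1.0, 2.0) == 2.0         # matches the documented example
# Bitwise # is plain integer XOR, distinct from the logical xor/^^ above:
assert (6 ^ 3) == 5                      # 110 # 011 = 101
```

Keeping the C precedence table (bitwise below comparisons, shifts above arithmetic, as the %left declarations in the diff lay out lowest-to-highest) means a pgbench expression parses the same way the equivalent C expression would.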
Re: [HACKERS] Add support for restrictive RLS policies
Alvaro, * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: > Stephen Frost wrote: > > Stephen, the typo "awseome" in the tests is a bit distracting. Can you > please fix it? Will fix. > > Updated patch changes to using IDENT and an strcmp() (similar to > > AlterOptRoleElem and vacuum_option_elem) to check the results at parse-time, > > and then throwing a more specific error if an unexpected value is found > > (instead of just 'syntax error'). This avoids adding any keywords. > > Thanks. No problem. > > diff --git a/doc/src/sgml/ref/create_policy.sgml > > b/doc/src/sgml/ref/create_policy.sgml > > index 89d2787..d930052 100644 > > --- a/doc/src/sgml/ref/create_policy.sgml > > +++ b/doc/src/sgml/ref/create_policy.sgml > > @@ -22,6 +22,7 @@ PostgreSQL documentation > > > > > > CREATE POLICY name ON > > table_name > > +[ AS ( PERMISSIVE | RESTRICTIVE ) ] > > I think you should use braces here, not parens: > > [ AS { PERMISSIVE | RESTRICTIVE } ] Will fix. > > > > +permissive > > + > > + > > + If the policy is a "permissive" or "restrictive" policy. Permissive > > + policies are the default and always add visibillity- they ar "OR"d > > + together to allow the user access to all rows through any of the > > + permissive policies they have access to. Alternativly, a policy can > > + instead by "restrictive", meaning that the policy will be "AND"d > > + with other restrictive policies and with the result of all of the > > + permissive policies on the table. > > + > > + > > + > > I don't think this paragraph is right -- you should call out each of the > values PERMISSIVE and RESTRICTIVE (in upper case) instead. Also note > typos "Alternativly" and "visibillity". Will fix, along with the 'ar' typo. > I dislike the "AND"d and "OR"d spelling of those terms. Currently they > only appear in comments within rowsecurity.c (of your authorship too, I > imagine). I think it'd be better to find actual words for those > actions. 
I'm certainly open to suggestions, should you, or anyone else, have them. I'll try and come up with something else, but that really is what we're doing- literally using either AND or OR to join the expressions together. > > diff --git a/src/include/catalog/pg_policy.h > > b/src/include/catalog/pg_policy.h > > index d73e9c2..30dc367 100644 > > --- a/src/include/catalog/pg_policy.h > > +++ b/src/include/catalog/pg_policy.h > > @@ -23,6 +23,7 @@ CATALOG(pg_policy,3256) > > NameDatapolname;/* Policy name. */ > > Oid polrelid; /* Oid of the relation > > with policy. */ > > charpolcmd; /* One of ACL_*_CHR, or '*' for > > all */ > > + boolpolpermissive; /* restrictive or permissive policy */ > > > > #ifdef CATALOG_VARLEN > > Oid polroles[1];/* Roles associated with > > policy, not-NULL */ > > @@ -42,12 +43,13 @@ typedef FormData_pg_policy *Form_pg_policy; > > * compiler constants for pg_policy > > * > > */ > > -#define Natts_pg_policy6 > > -#define Anum_pg_policy_polname 1 > > -#define Anum_pg_policy_polrelid2 > > -#define Anum_pg_policy_polcmd 3 > > -#define Anum_pg_policy_polroles4 > > -#define Anum_pg_policy_polqual 5 > > -#define Anum_pg_policy_polwithcheck 6 > > +#define Natts_pg_policy6 > > +#define Anum_pg_policy_polname 1 > > +#define Anum_pg_policy_polrelid2 > > +#define Anum_pg_policy_polcmd 3 > > +#define Anum_pg_policy_polpermissive 4 > > +#define Anum_pg_policy_polroles5 > > +#define Anum_pg_policy_polqual 6 > > +#define Anum_pg_policy_polwithcheck7 > > I don't understand this part. Didn't you say previously that we already > had this capability in 9.5 and you were only exposing it over SQL? If > that is true, how come you need to add a new column to this catalog? The capability exists in 9.5 through hooks which are available to extensions, see the test_rls_hooks extension. Those hooks are called every time and therefore don't require the information about restrictive policies to be tracked in pg_policy, and so they weren't. 
Since this is adding support for users to define restrictive policies, we need to save those policies and therefore we need to distinguish which policies are restrictive and which are permissive, hence the need to modify pg_policy to track that information. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Add support for restrictive RLS policies
Stephen Frost wrote: Stephen, the typo "awseome" in the tests is a bit distracting. Can you please fix it? > Updated patch changes to using IDENT and an strcmp() (similar to > AlterOptRoleElem and vacuum_option_elem) to check the results at parse-time, > and then throwing a more specific error if an unexpected value is found > (instead of just 'syntax error'). This avoids adding any keywords. Thanks. > diff --git a/doc/src/sgml/ref/create_policy.sgml > b/doc/src/sgml/ref/create_policy.sgml > index 89d2787..d930052 100644 > --- a/doc/src/sgml/ref/create_policy.sgml > +++ b/doc/src/sgml/ref/create_policy.sgml > @@ -22,6 +22,7 @@ PostgreSQL documentation > > > CREATE POLICY name ON > table_name > +[ AS ( PERMISSIVE | RESTRICTIVE ) ] I think you should use braces here, not parens: [ AS { PERMISSIVE | RESTRICTIVE } ] > > +permissive > + > + > + If the policy is a "permissive" or "restrictive" policy. Permissive > + policies are the default and always add visibillity- they ar "OR"d > + together to allow the user access to all rows through any of the > + permissive policies they have access to. Alternativly, a policy can > + instead by "restrictive", meaning that the policy will be "AND"d > + with other restrictive policies and with the result of all of the > + permissive policies on the table. > + > + > + I don't think this paragraph is right -- you should call out each of the values PERMISSIVE and RESTRICTIVE (in upper case) instead. Also note typos "Alternativly" and "visibillity". I dislike the "AND"d and "OR"d spelling of those terms. Currently they only appear in comments within rowsecurity.c (of your authorship too, I imagine). I think it'd be better to find actual words for those actions. > diff --git a/src/include/catalog/pg_policy.h b/src/include/catalog/pg_policy.h > index d73e9c2..30dc367 100644 > --- a/src/include/catalog/pg_policy.h > +++ b/src/include/catalog/pg_policy.h > @@ -23,6 +23,7 @@ CATALOG(pg_policy,3256) > NameDatapolname;/* Policy name. 
*/ > Oid polrelid; /* Oid of the relation > with policy. */ > charpolcmd; /* One of ACL_*_CHR, or '*' for > all */ > + boolpolpermissive; /* restrictive or permissive policy */ > > #ifdef CATALOG_VARLEN > Oid polroles[1];/* Roles associated with > policy, not-NULL */ > @@ -42,12 +43,13 @@ typedef FormData_pg_policy *Form_pg_policy; > * compiler constants for pg_policy > * > */ > -#define Natts_pg_policy 6 > -#define Anum_pg_policy_polname 1 > -#define Anum_pg_policy_polrelid 2 > -#define Anum_pg_policy_polcmd3 > -#define Anum_pg_policy_polroles 4 > -#define Anum_pg_policy_polqual 5 > -#define Anum_pg_policy_polwithcheck 6 > +#define Natts_pg_policy 6 > +#define Anum_pg_policy_polname 1 > +#define Anum_pg_policy_polrelid 2 > +#define Anum_pg_policy_polcmd3 > +#define Anum_pg_policy_polpermissive 4 > +#define Anum_pg_policy_polroles 5 > +#define Anum_pg_policy_polqual 6 > +#define Anum_pg_policy_polwithcheck 7 I don't understand this part. Didn't you say previously that we already had this capability in 9.5 and you were only exposing it over SQL? If that is true, how come you need to add a new column to this catalog? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Remove superuser() checks from pgstattuple
Peter, all, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > This is now being obsoleted by the later idea of allowing base installs > from a chain of upgrade scripts. But if your upgrade scripts contain > ALTER TABLE commands, you will probably still want to write base install > scripts that do a fresh CREATE TABLE instead. I've updated the patch to remove the new base version script and to rely on the changes made by Tom to install the 1.4 version and then upgrade to 1.5. Based on my testing, it appears to all work correctly. Updated (much smaller) patch attached. Thanks! Stephen From 508a4db0111607d8d59eef4a8e172f8a7aa8fdfa Mon Sep 17 00:00:00 2001 From: Stephen Frost Date: Tue, 23 Aug 2016 17:13:09 -0400 Subject: [PATCH] Remove superuser checks in pgstattuple Now that we track initial privileges on extension objects and changes to those permissions, we can drop the superuser() checks from the various functions which are part of the pgstattuple extension. Since a pg_upgrade will preserve the version of the extension which existed prior to the upgrade, we can't simply modify the existing functions but instead need to create new functions which remove the checks and update the SQL-level functions to use the new functions (and to REVOKE EXECUTE rights on those functions from PUBLIC). Approach suggested by Noah. 
--- contrib/pgstattuple/Makefile | 7 +- contrib/pgstattuple/pgstatapprox.c| 35 ++-- contrib/pgstattuple/pgstatindex.c | 108 +++-- contrib/pgstattuple/pgstattuple--1.4--1.5.sql | 111 ++ contrib/pgstattuple/pgstattuple.c | 36 + contrib/pgstattuple/pgstattuple.control | 2 +- 6 files changed, 284 insertions(+), 15 deletions(-) create mode 100644 contrib/pgstattuple/pgstattuple--1.4--1.5.sql diff --git a/contrib/pgstattuple/Makefile b/contrib/pgstattuple/Makefile index e732680..294077d 100644 --- a/contrib/pgstattuple/Makefile +++ b/contrib/pgstattuple/Makefile @@ -4,9 +4,10 @@ MODULE_big = pgstattuple OBJS = pgstattuple.o pgstatindex.o pgstatapprox.o $(WIN32RES) EXTENSION = pgstattuple -DATA = pgstattuple--1.4.sql pgstattuple--1.3--1.4.sql \ - pgstattuple--1.2--1.3.sql pgstattuple--1.1--1.2.sql \ - pgstattuple--1.0--1.1.sql pgstattuple--unpackaged--1.0.sql +DATA = pgstattuple--1.4.sql pgstattuple--1.4--1.5.sql \ + pgstattuple--1.3--1.4.sql pgstattuple--1.2--1.3.sql \ + pgstattuple--1.1--1.2.sql pgstattuple--1.0--1.1.sql \ + pgstattuple--unpackaged--1.0.sql PGFILEDESC = "pgstattuple - tuple-level statistics" REGRESS = pgstattuple diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c index a49ff54..11bae14 100644 --- a/contrib/pgstattuple/pgstatapprox.c +++ b/contrib/pgstattuple/pgstatapprox.c @@ -29,6 +29,9 @@ #include "commands/vacuum.h" PG_FUNCTION_INFO_V1(pgstattuple_approx); +PG_FUNCTION_INFO_V1(pgstattuple_approx_v1_5); + +Datum pgstattuple_approx_internal(Oid relid, FunctionCallInfo fcinfo); typedef struct output_type { @@ -209,6 +212,33 @@ Datum pgstattuple_approx(PG_FUNCTION_ARGS) { Oid relid = PG_GETARG_OID(0); + + if (!superuser()) + ereport(ERROR, +(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + (errmsg("must be superuser to use pgstattuple functions"; + + PG_RETURN_DATUM(pgstattuple_approx_internal(relid, fcinfo)); +} + +/* + * As of pgstattuple version 1.5, we no longer need to check if the user + * is a superuser because we 
REVOKE EXECUTE on the function from PUBLIC. + * Users can then grant access to it based on their policies. + * + * Otherwise identical to pgstattuple_approx (above). + */ +Datum +pgstattuple_approx_v1_5(PG_FUNCTION_ARGS) +{ + Oid relid = PG_GETARG_OID(0); + + PG_RETURN_DATUM(pgstattuple_approx_internal(relid, fcinfo)); +} + +Datum +pgstattuple_approx_internal(Oid relid, FunctionCallInfo fcinfo) +{ Relation rel; output_type stat = {0}; TupleDesc tupdesc; @@ -217,11 +247,6 @@ pgstattuple_approx(PG_FUNCTION_ARGS) HeapTuple ret; int i = 0; - if (!superuser()) - ereport(ERROR, -(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), - (errmsg("must be superuser to use pgstattuple functions"; - if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) elog(ERROR, "return type must be a row type"); diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c index 6084589..753e52d 100644 --- a/contrib/pgstattuple/pgstatindex.c +++ b/contrib/pgstattuple/pgstatindex.c @@ -54,6 +54,14 @@ PG_FUNCTION_INFO_V1(pg_relpages); PG_FUNCTION_INFO_V1(pg_relpagesbyid); PG_FUNCTION_INFO_V1(pgstatginindex); +PG_FUNCTION_INFO_V1(pgstatindex_v1_5); +PG_FUNCTION_INFO_V1(pgstatindexbyid_v1_5); +PG_FUNCTION_INFO_V1(pg_relpages_v1_5); +PG_FUNCTION_INFO_V1(pg_relpagesbyid_v1_5); +PG_FUNCTION_INFO_V1(pgstatginindex_v1_5); + +Datum pgstatginindex_internal(Oid relid, FunctionCallInfo fcinfo); + #define IS_INDEX(r) ((r)->rd_rel-
Re: [HACKERS] proposal: psql \setfileref
2016-09-26 21:47 GMT+02:00 Ryan Murphy : > > >> please, can you check attached patch? It worked in my laptop. >> >> Regards >> >> Pavel >> >> > Yes, that one applied for me without any problems. > Great, Thank you Pavel > > Thanks, > Ryan >
Re: [HACKERS] proposal: psql \setfileref
> > please, can you check attached patch? It worked in my laptop. > > Regards > > Pavel > > Yes, that one applied for me without any problems. Thanks, Ryan
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
On Mon, Sep 26, 2016 at 6:58 PM, Robert Haas wrote: >> That requires some kind of mutual exclusion mechanism, like an LWLock. > > No, it doesn't. Shared memory queues are single-reader, single-writer. The point is that there is a natural dependency when merging is performed eagerly within the leader. One thing needs to be in lockstep with the others. That's all. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
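The reason a single-reader, single-writer queue needs no lock is that the writer is the only side that ever advances the tail, and the reader the only side that ever advances the head, so no index is concurrently written by both. Below is a schematic, purely illustrative Python model of that invariant (PostgreSQL's shm_mq lives in shared memory and relies on memory barriers, which this sketch does not model):

```python
class SpscQueue:
    """Bounded ring buffer: one writer advances `tail`, one reader
    advances `head`.  Because no index is written from both sides,
    a single-producer/single-consumer queue needs no mutual exclusion."""

    def __init__(self, capacity):
        # one slot is left empty so a full buffer is distinguishable from an empty one
        self.buf = [None] * (capacity + 1)
        self.head = 0   # only the reader writes this
        self.tail = 0   # only the writer writes this

    def put(self, item):
        nxt = (self.tail + 1) % len(self.buf)
        if nxt == self.head:        # full: the writer would have to wait
            return False
        self.buf[self.tail] = item
        self.tail = nxt             # publish only after the slot is filled
        return True

    def get(self):
        if self.head == self.tail:  # empty
            return None
        item = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        return item
```

In real shared memory the index updates also need ordering guarantees (barriers); this sketch only models the control flow that makes the no-lock property possible.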
Re: [HACKERS] [GENERAL] inconsistent behaviour of set-returning functions in sub-query with random()
Tom van Tilburg writes: > I'm often using the WHERE clause random() > 0.5 to pick a random subset of > my data. Now I noticed that when using a set-returning function in a > sub-query, I either get the whole set or none (meaning that the WHERE > random() > 0.5 clause is interpreted *before* the set is being generated). > e.g.: > > SELECT num FROM ( > SELECT unnest(Array[1,2,3,4,5,6,7,8,9,10]) num) AS foo WHERE random() > > 0.5; Hmm, I think this is an optimizer bug. There are two legitimate behaviors here: SELECT * FROM unnest(ARRAY[1,2,3,4,5,6,7,8,9,10]) WHERE random() > 0.5; should (and does) re-evaluate the WHERE for every row output by unnest(). SELECT unnest(ARRAY[1,2,3,4,5,6,7,8,9,10]) WHERE random() > 0.5; should evaluate WHERE only once, since that happens before expansion of the set-returning function in the targetlist. (If you're an Oracle user and you imagine this query as having an implicit "FROM dual", the WHERE should be evaluated for the single row coming out of the FROM clause.) In the case you've got here, given the placement of the WHERE in the outer query, you'd certainly expect it to be evaluated for each row coming out of the inner query. But the optimizer is deciding it can push the WHERE clause down to become a WHERE of the sub-select. That is legitimate in a lot of cases, but not when there are SRF(s) in the sub-select's targetlist, because that pushes the WHERE to occur before the SRF(s), analogously to the change between the two queries I wrote. I'm a bit hesitant to change this in existing releases. Given the lack of previous complaints, it seems more likely to break queries that were behaving as-expected than to make people happy. But we could change it in v10 and up, especially since some other corner-case changes in SRF-in-tlist behavior are afoot. 
In the meantime, you could force it to work as you wish by inserting the all-purpose optimization fence "OFFSET 0" in the sub-select:

=# SELECT num FROM (
     SELECT unnest(Array[1,2,3,4,5,6,7,8,9,10]) num OFFSET 0) AS foo
   WHERE random() > 0.5;
 num
-----
   1
   4
   7
   9
(4 rows)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
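The all-or-nothing behavior falls out of where the qual is evaluated relative to the SRF expansion. A schematic Python model of the two legitimate behaviors described above (purely illustrative, not how the executor is implemented):

```python
def srf_in_from(arr, cond):
    """Model of: SELECT * FROM unnest(arr) WHERE cond
    The qual is re-evaluated for every row the SRF emits."""
    return [x for x in arr if cond()]

def srf_in_targetlist(arr, cond):
    """Model of: SELECT unnest(arr) WHERE cond
    The qual is evaluated once, for the single implicit input row,
    before the SRF in the targetlist expands, so the result is the
    whole set or nothing."""
    return list(arr) if cond() else []
```

With a condition like `lambda: random.random() > 0.5`, the first form returns a random subset while the second returns all ten elements or none, which is exactly the symptom reported.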
Re: [HACKERS] Add support for restrictive RLS policies
Jeevan, all, * Jeevan Chalke (jeevan.cha...@enterprisedb.com) wrote: > On Mon, Sep 12, 2016 at 7:27 AM, Stephen Frost wrote: > > * Robert Haas (robertmh...@gmail.com) wrote: > > > On Thu, Sep 8, 2016 at 5:21 PM, Tom Lane wrote: > > > > Stephen Frost writes: > > > >> * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: > > > >>> Can't you keep those words as Sconst or something (DefElems?) until > > the > > > >>> execution phase, so that they don't need to be keywords at all? > > > > > > > >> Seems like we could do that, though I'm not convinced that it really > > > >> gains us all that much. These are only unreserved keywords, of > > course, > > > >> so they don't impact users the way reserved keywords (of any kind) > > can. > > > >> While there may be some places where we use a string to represent a > > set > > > >> of defined options, I don't believe that's typical > > > > > > > > -1 for having to write them as string literals; but I think what Alvaro > > > > really means is to arrange for the words to just be identifiers in the > > > > grammar, which you strcmp against at execution. See for example > > > > reloption_list. (Whether you use DefElem as the internal > > representation > > > > is a minor detail, though it might help for making the parsetree > > > > copyObject-friendly.) > > > > > > > > vacuum_option_elem shows another way to avoid making a word into a > > > > keyword, although to me that one is more of an antipattern; it'd be > > better > > > > to leave the strcmp to execution, since there's so much other code that > > > > does things that way. > > > > > > There are other cases like that, too, e.g. AlterOptRoleElem; I don't > > > think it's a bad pattern. Regardless of the specifics, I do think > > > that it would be better not to bloat the keyword table with things > > > that don't really need to be keywords. 
> > > > The AlterOptRoleElem case is, I believe, what Tom was just suggesting as > > an antipattern, since the strcmp() is being done at parse time instead > > of at execution time. > > > > If we are concerned about having too many unreserved keywords, then I > > agree that AlterOptRoleElem is a good candidate to look at for reducing > > the number we have, as it appears to contain 3 keywords which are not > > used anywhere else (and 1 other which is only used in one other place). > > > > I do think that using IDENT for the various role attributes does make > > sense, to be clear, as there are quite a few of them, they change > > depending on major version, and those keywords are very unlikely to ever > > have utilization elsewhere. > > > > For this case, there's just 2 keywords which seem like they may be used > > again (perhaps for ALTER or DROP POLICY, or default policies if we grow > > those in the future), and we're unlikely to grow more in the future for > > that particular case (as we only have two binary boolean operators and > > that seems unlikely to change), though, should that happen, we could > > certainly revisit this. > > > > To me, adding two new keywords for two new options does not look good as it > will bloat the keywords list. Per my understanding we should add a keyword if and > only if we have no option other than adding it, maybe to avoid grammar conflicts. > > I am also inclined to think that using an identifier will be a good choice > here. > However I would prefer to do the string comparison right in the grammar > itself, so that if we have a wrong option as input there, then we will not > proceed further with it. We are anyway going to throw an error later, then > why not at an early stage? The updated patch changes to using IDENT and a strcmp() (similar to AlterOptRoleElem and vacuum_option_elem) to check the results at parse-time, and then throwing a more specific error if an unexpected value is found (instead of just 'syntax error'). 
This avoids adding any keywords. Thanks! Stephen From 11471c7921271e3c03078f3d31148dd4afd9d6e0 Mon Sep 17 00:00:00 2001 From: Stephen Frost Date: Thu, 1 Sep 2016 02:11:30 -0400 Subject: [PATCH] Add support for restrictive RLS policies We have had support for restrictive RLS policies since 9.5, but they were only available through extensions which use the appropriate hooks. This adds support into the grammar, catalog, psql and pg_dump for restrictive RLS policies, thus reducing the cases where an extension is necessary. --- doc/src/sgml/ref/create_policy.sgml | 16 +++ src/backend/commands/policy.c | 9 ++ src/backend/nodes/copyfuncs.c | 1 + src/backend/nodes/equalfuncs.c | 1 + src/backend/parser/gram.y | 34 - src/backend/rewrite/rowsecurity.c | 7 +- src/bin/pg_dump/pg_dump.c | 38 -- src/bin/pg_dump/pg_dump.h | 1 + src/bin/pg_dump/t/002_pg_dump.pl | 39 +- src/bin/psql/describe.c | 109 src/bin/psql/tab-complete.c | 29 - src/include/catalog/pg_
Re: [HACKERS] pageinspect: Hash index support
On Mon, Sep 26, 2016 at 10:39 AM, Jesper Pedersen < jesper.peder...@redhat.com> wrote: > Hi, > > On 09/23/2016 12:10 AM, Peter Eisentraut wrote: > >> >> > - As of very recently, we don't need to move pageinspect--1.5.sql to >> pageinspect--1.6.sql anymore. Just add pageinspect--1.5--1.6.sql. >> >> > We need to rename the pageinspect--1.5.sql file and add the new methods, > right ? Or ? You just need to add pageinspect--1.5--1.6.sql. With recent changes to the extension infrastructure, when you create the extension as version 1.6, the infrastructure automatically figures out that it should install 1.5 and then upgrade to 1.6. Or that is my understanding, anyway. Cheers, Jeff
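The resolution Jeff describes amounts to a shortest-path search over the available script files: pick a base install script, then chain `old--new` upgrade scripts until the requested version is reached. A rough Python sketch of that idea (the helper name is hypothetical; the real logic lives in PostgreSQL's extension.c and differs in detail):

```python
from collections import deque

def resolve_install_path(scripts, target):
    """scripts: filenames like 'pageinspect--1.5.sql' (base install)
    or 'pageinspect--1.5--1.6.sql' (upgrade script).  Returns the list
    of scripts to run to reach `target`, or None if unreachable."""
    bases, upgrades = [], {}
    for s in scripts:
        parts = s[:-len('.sql')].split('--')
        if len(parts) == 2:                      # name--ver.sql
            bases.append((parts[1], s))
        elif len(parts) == 3:                    # name--old--new.sql
            upgrades.setdefault(parts[1], []).append((parts[2], s))

    # breadth-first search from every base version toward the target,
    # so the first chain found is a shortest one
    queue = deque((ver, [script]) for ver, script in bases)
    while queue:
        ver, path = queue.popleft()
        if ver == target:
            return path
        for nxt, script in upgrades.get(ver, []):
            queue.append((nxt, path + [script]))
    return None
```

For pageinspect this resolves `CREATE EXTENSION pageinspect` at version 1.6 to installing `pageinspect--1.5.sql` and then running `pageinspect--1.5--1.6.sql`, matching the behavior described above.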
Re: [HACKERS] Add support for restrictive RLS policies
Jeevan, * Jeevan Chalke (jeevan.cha...@enterprisedb.com) wrote: > I have started reviewing this patch and here are couple of points I have > observed so far: > > 1. Patch applies cleanly > 2. make / make install / initdb all good. > 3. make check (regression) FAILED. (Attached diff file for reference). I've re-based my patch on top of current head and still don't see the failures which you are getting during the regression tests. Is it possible you were doing the tests without a full rebuild of the source tree..? Can you provide details of your build/test environment and the full regression before and after output? Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] proposal: psql \setfileref
Hi 2016-09-26 14:57 GMT+02:00 Ryan Murphy : > Hi Pavel, > > I just tried to apply your patch psql-setfileref-initial.patch (using git > apply) to the newest revision of postgres at the time (da6c4f6ca88) and it > failed to patch startup.c. Thinking that the patch was for some previous > revision, I then checked out d062245b5, which was from 2016-08-31, and > tried applying the patch from there. I had the same problem: > > $ git apply psql-setfileref-initial.patch > error: patch failed: src/bin/psql/startup.c:106 > error: src/bin/psql/startup.c: patch does not apply > > However, when I applied the changes to startup.c manually and removed them > from the patch, the rest of the patch applied without a problem. I don't > know if I may have done something wrong in trying to apply the patch, but > you may want to double check if you need to regenerate your patch from the > latest revision so it will apply smoothly for reviewers. > please, can you check attached patch? It worked in my laptop. Regards Pavel > > In the meantime, I haven't had a chance to try out the fileref feature yet > but I'll give it a go when I get a chance! > > Thanks! 
> Ryan > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index a9a2fdb..af38ff9 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -1363,6 +1363,32 @@ exec_command(const char *cmd, free(envval); } + /* \setfileref - set variable by reference on file */ + else if (strcmp(cmd, "setfileref") == 0) + { + char *name = psql_scan_slash_option(scan_state, + OT_NORMAL, NULL, false); + + char *ref = psql_scan_slash_option(scan_state, + OT_NORMAL, NULL, false); + + success = false; + + if (!name || !ref) + { + psql_error("\\%s: missing required argument\n", cmd); + success = false; + } + else + { + if (!SetFileRef(pset.vars, name, ref)) + { +psql_error("\\%s: error while setting variable\n", cmd); +success = false; + } + } + } + /* \sf -- show a function's source code */ else if (strcmp(cmd, "sf") == 0 || strcmp(cmd, "sf+") == 0) { diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index a7789df..b160228 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -25,6 +25,7 @@ #include "settings.h" #include "command.h" #include "copy.h" +#include "catalog/pg_type.h" #include "crosstabview.h" #include "fe_utils/mbprint.h" @@ -33,7 +34,6 @@ static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); - /* * openQueryOutputFile --- attempt to open a query output file * @@ -109,6 +109,120 @@ setQFout(const char *fname) return true; } +void +psql_reset_query_params(void) +{ + int i; + + for (i = 0; i < pset.nparams; i++) + if (pset.params[i] != NULL) + { + PQfreemem(pset.params[i]); + pset.params[i] = NULL; + } + + pset.nparams = 0; +} + +/* + * Load a content of the file_ref related file to query params buffer. 
+ * When escaping is requested, then the text content is expected. + * Without escaping the bytea content is expected and related bytea + * escaping is processed. + */ +static char * +get_file_ref_content(const char *value, bool escape, bool as_ident) +{ + PQExpBufferData buffer; + FILE *fd = NULL; + char *fname; + char *escaped_value; + charline[1024]; + size_tsize; + + fname = pstrdup(value); + + expand_tilde(&fname); + if (!fname) + { + psql_error("missing valid path to file\n"); + return NULL; + } + + canonicalize_path(fname); + + fd = fopen(fname, PG_BINARY_R); + if (!fd) + { + psql_error("%s: %s\n", fname, strerror(errno)); + PQfreemem(fname); + return NULL; + } + + /* can append another parameter */ + if (pset.nparams >= MAX_BINARY_PARAMS) + { + psql_error("too much binary parameters"); + PQfreemem(fname); + return NULL; + } + + if (!pset.db) + { + psql_error("cannot escape without active connection\n"); + PQfreemem(fname); + return NULL; + } + + initPQExpBuffer(&buffer); + + while ((size = fread(line, 1, sizeof(line), fd)) > 0) + appendBinaryPQExpBuffer(&buffer, line, size); + + if (ferror(fd)) + { + psql_error("%s: %s\n", fname, strerror(errno)); + PQfreemem(fname); + termPQExpBuffer(&buffer); + return NULL; + } + + if (escape) + { + if (as_ident) + escaped_value = +PQescapeIdentifier(pset.db, buffer.data, buffer.len); + else + escaped_value = +PQescapeLiteral(pset.db, buffer.data, buffer.len); + pset.paramTypes[pset.nparams] = UNKNOWNOID; + } + else + { + escaped_value = (char *) +PQescapeByteaConn(pset.db, + (const unsigned char *) buffer.data, buffer.len, &size); + pset.paramTypes[pset.nparams] = BYTEAOID; + } + + /* fname, buffer is not necessary longer */ + PQfreemem(fna
Re: [HACKERS] Showing parallel status in \df+
* Robert Haas (robertmh...@gmail.com) wrote: > On Mon, Sep 26, 2016 at 10:48 AM, Stephen Frost wrote: > > I agree that "do nothing" isn't a good option. I'm not terribly > > thrilled with just putting the source code at the bottom of the \df+ > > output either, though it's at least slightly less ridiculous than trying > > to put the source code into a column in a table. > > Several people have spoken against that option, I feel like we're getting wrapped around the axle as it regards who is perceived to be voting for what. Based on my review of this thread, we seem to have one person (Peter) who dislikes removing prosrc from \df+, though he caveated his comments with being concerned that it was happening after beta2 (which is no longer the case, of course), see: f16571cc-bf6f-53a1-6809-f09f48f0a...@2ndquadrant.com. Subsequently, in 5aacd611-94b7-3b98-de8e-cae34e18c...@2ndquadrant.com, he seems to suggest that he might support it if \sf was changed to support wildcards and multiple results. Michael had voiced concern that removing prosrc from \df+ would make debugging more difficult, but subsequent discussion indicated that he agreed with removing prosrc for non-internal/C functions (which was later described as 'Internal name'), see: cab7npqtikt-e7e5cx6wm89m9fr0-za+bkb1k4_xvfgpcr7d...@mail.gmail.com Rushabh Lathia had an issue with keeping 'source code' for internal/C language functions, but not for other languages, see: cagpqqf2jz3q+bvkxraqxe+ztuzhrayfkp+syc74nc+fu5pn...@mail.gmail.com Pavel didn't feel having the footer be used for the source code was appropriate, see: cafj8prbh-m7cmez2jt49fkgh8qwfglxxbzfdbbi3gtybbxd...@mail.gmail.com I don't particularly care for it either, primarily because \sf could be improved upon, as suggested by Peter, to avoid the need to have the same information displayed by both \df+ and \sf. > but I think it's > actually pretty clearly an improvement over the status quo. 
If you > don't want to see all of that output, fine; look at the first > screenful and then quit your pager (which probably means "press q"). I agree that it's an improvement over the current \df+ output, but I'm not convinced that it's better than improving \sf to do that and then removing prosrc from \df+ and adding 'Internal name', and I'm not sure that there are actual dissenting votes for that as the end-goal. Peter's comments seem to be brought up as rejecting the removal of prosrc from \df+, full-stop, but that's not how I read his actual comments on this thread. > If you do want to see all of the output, you'll appreciate not having > it indented by 60 or 80 columns any more. There's really no > circumstance under which it's worse than what we're doing today. That doesn't mean, at least to me, that we should forgo considering better alternatives. > It's fairly likely that there's no option here that will please > everyone completely, but that's not a reason to reject a patch that is > clearly better than what we've got now. We often reject patches which only improve a bit on the status quo because we wish for a better overall solution, particularly when we're talking about user interfaces that we don't want to change between every release. That's part of the reason that we have a role system today; Tom, correctly, pointed out that we don't just want a system where a given object can have multiple owners (one of my first proposals to the lists) but wished to have role-based access controls, which would provide that capability and more. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Allowing GIN array_ops to work on anyarray
Enrique Meneses writes: > Great, given it does not apply to this patch, then all the other tests > passed and the change looks good. Pushed, thanks for the review! regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python adding support for multi-dimensional arrays
> > This crashes with arrays with non-default lower bounds: > > postgres=# SELECT * FROM test_type_conversion_array_int4('[2:4]={1,2,3}'); > INFO: ([1, 2, ], ) > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > > Attached patch fixes this bug, and adds a test for it. > > I'd like to see some updates to the docs for this. The manual doesn't > currently say anything about multi-dimensional arrays in pl/python, but it > should've mentioned that they're not supported. Now that it is supported, > should mention that, and explain briefly that a multi-dimensional array is > mapped to a python list of lists. > > If the code passes I'll fix the docs > It seems we don't have any mention in the docs about arrays with > non-default lower-bounds ATM. That's not this patch's fault, but it would > be good to point out that the lower bounds are discarded when an array is > passed to python. > > I find the loop in PLyList_FromArray() quite difficult to understand. Are > the comments there mixing up the "inner" and "outer" dimensions? I wonder > if that would be easier to read, if it was written in a recursive-style, > rather than iterative with stacks for the dimensions. > > Yes, it is fairly convoluted. Dave Cramer da...@postgresintl.com www.postgresintl.com
Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
On 09/26/2016 07:16 PM, Tomas Vondra wrote:

The averages (over the 10 runs, 5 minutes each) look like this:

3.2.80                1      8     16     32     64    128    192
granular-locking   1567  12146  26341  44188  43263  49590  15042
no-content-lock    1567  12180  25549  43787  43675  51800  16831
group-update       1550  12018  26121  44451  42734  51455  15504
master             1566  12057  25457  42299  42513  42562  10462

4.5.5                 1      8     16     32     64    128    192
granular-locking   3018  19031  27394  29222  32032  34249  36191
no-content-lock    2988  18871  27384  29260  32120  34456  36216
group-update       2960  18848  26870  29025  32078  34259  35900
master             2984  18917  26430  29065  32119  33924  35897

That is:

(1) The 3.2.80 performs a bit better than before, particularly for 128 and 256 clients - I'm not sure if it's thanks to the reboots or so.

(2) 4.5.5 performs measurably worse for >= 32 clients (by ~30%). That's a pretty significant regression, on a fairly common workload.

FWIW, now that I think about this, the regression is roughly in line with my findings presented in my recent blog post: http://blog.2ndquadrant.com/postgresql-vs-kernel-versions/

Those numbers were collected on a much smaller machine (2/4 cores only), which might be why the difference observed on the 32-core machine is much more significant.

regards

--
Tomas Vondra    http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
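As a quick sanity check on the "~30%" figure, the arithmetic can be done directly from the granular-locking rows of the two tables, at the client counts where 3.2.80 is ahead (32, 64 and 128 clients):

```python
# tps for the granular-locking patch at 32, 64 and 128 clients,
# taken from the kernel 3.2.80 and 4.5.5 tables in this message
kernel_3_2_80 = [44188, 43263, 49590]
kernel_4_5_5  = [29222, 32032, 34249]

# relative slowdown of 4.5.5 versus 3.2.80, in percent
drops = [100.0 * (old - new) / old
         for old, new in zip(kernel_3_2_80, kernel_4_5_5)]
# roughly 33.9%, 26.0% and 30.9%, i.e. on the order of 30%
```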
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
On Sat, Sep 24, 2016 at 9:07 AM, Peter Geoghegan wrote: > On Thu, Sep 22, 2016 at 8:57 PM, Robert Haas wrote: >> On Thu, Sep 22, 2016 at 3:51 PM, Heikki Linnakangas wrote: >>> It'd be good if you could overlap the final merges in the workers with the >>> merge in the leader. ISTM it would be quite straightforward to replace the >>> final tape of each worker with a shared memory queue, so that the leader >>> could start merging and returning tuples as soon as it gets the first tuple >>> from each worker. Instead of having to wait for all the workers to complete >>> first. >> >> If you do that, make sure to have the leader read multiple tuples at a >> time from each worker whenever possible. It makes a huge difference >> to performance. See bc7fcab5e36b9597857fa7e3fa6d9ba54aaea167. > > That requires some kind of mutual exclusion mechanism, like an LWLock. No, it doesn't. Shared memory queues are single-reader, single-writer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] pgpassfile connection option
On Thu, Sep 22, 2016 at 11:34 AM, Julian Markwort wrote: > I haven't really thought about this as I had been asked to make this work as > an additional option to the connection parameters... > Now that I've looked at it - there is really only the benefit of saving the > step of setting the PGPASSFILE environment variable. > However, there might be cases in which setting an environment variable might > not be the easiest option. So, there are some security concerns here in my mind. If a program running under a particular user ID accepts a connection string from a source that isn't fully trusted, the user has to accept the risk that their .pgpass file will be used for authentication to whatever database the program might try to connect. However, they don't have to accept the possibility that arbitrary local files readable by the user ID will be used for authentication and/or disclosed; this patch would force them to accept that risk. That doesn't seem particularly good. If an adversary has enough control over my account that they can set environment variables, it's game over: they win. But if I merely accept connection strings from them, I shouldn't have to worry about anything worse than that I might be induced to connect to the wrong thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect: Hash index support
On 09/23/2016 01:56 AM, Amit Kapila wrote: While looking at patch, I noticed below code which seems somewhat problematic: + stat->max_avail = BLCKSZ - (BLCKSZ - phdr->pd_special + SizeOfPageHeaderData); + + /* page type (flags) */ + if (opaque->hasho_flag & LH_META_PAGE) + stat->type = 'm'; + else if (opaque->hasho_flag & LH_OVERFLOW_PAGE) + stat->type = 'v'; + else if (opaque->hasho_flag & LH_BUCKET_PAGE) + stat->type = 'b'; + else if (opaque->hasho_flag & LH_BITMAP_PAGE) + stat->type = 'i'; In the above code, it appears that you are trying to calculate max_avail space for all pages in same way. Don't you need to calculate it differently for bitmap page or meta page as they don't share the same format as other type of pages? Correct, however the max_avail calculation was removed in v6, since we don't display average item size anymore. Thanks for the feedback ! Best regards, Jesper -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect: Hash index support
Hi, On 09/23/2016 12:10 AM, Peter Eisentraut wrote: On 9/21/16 9:30 AM, Jesper Pedersen wrote: Attached is v5, which add basic page verification. There are still some things that I found that appear to follow the old style (btree) conventions instead the new style (brin, gin) conventions. - Rename hash_metap to hash_metapage_info. Done. - Don't use BuildTupleFromCStrings(). (There is nothing inherently wrong with it, but why convert numbers to strings and back again when it can be done more directly.) Left as is, since BuildTupleFromCStrings() vs. xyzGetDatum() are equally readable in this case. But, I can change the patch if needed. - Documentation for hash_page_stats still has blkno argument. Fixed. - In hash_metap, the magic field could be type bytea, so that it displays in hex. Or text like the brin functions. Changed to 'text'. Some other comments: - hash_metap result fields spares and mapp should be arrays of integer. B-tree and BRIN uses a 'text' field as output, so left as is. (Incidentally, the comment in hash.h talks about bitmaps[] but I think it means hashm_mapp[].) - As of very recently, we don't need to move pageinspect--1.5.sql to pageinspect--1.6.sql anymore. Just add pageinspect--1.5--1.6.sql. We need to rename the pageinspect--1.5.sql file and add the new methods, right ? Or ? - In the documentation for hash_page_stats(), the "+" sign is misaligned. Fixed. - In hash_page_items, the fields itemlen, nulls, vars are always 16, false, false. So perhaps there is no need for them. Similarly, the hash_page_stats in hash_page_stats is always 16. Fixed, by removing them. We can add them back later if needed. - The data field could be of type bytea. Left as is, for same reasons as 'spares' and 'mapp'. - Add a pointer in the documentation to the relevant header file. Done. Bonus: - Add subsections in the documentation (general functions, B-tree functions, etc.) Done. - Add tests. Thanks for the review ! 
Best regards, Jesper >From 24c3ff3cb48ca2076d589eda9027a0abe73b5aad Mon Sep 17 00:00:00 2001 From: jesperpedersen Date: Fri, 5 Aug 2016 10:16:32 -0400 Subject: [PATCH] pageinspect: Hash index support --- contrib/pageinspect/Makefile | 10 +- contrib/pageinspect/hashfuncs.c | 400 ++ contrib/pageinspect/pageinspect--1.5--1.6.sql | 59 contrib/pageinspect/pageinspect--1.5.sql | 279 -- contrib/pageinspect/pageinspect--1.6.sql | 346 ++ contrib/pageinspect/pageinspect.control | 2 +- doc/src/sgml/pageinspect.sgml | 151 +- 7 files changed, 952 insertions(+), 295 deletions(-) create mode 100644 contrib/pageinspect/hashfuncs.c create mode 100644 contrib/pageinspect/pageinspect--1.5--1.6.sql delete mode 100644 contrib/pageinspect/pageinspect--1.5.sql create mode 100644 contrib/pageinspect/pageinspect--1.6.sql diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile index a98237e..c73d728 100644 --- a/contrib/pageinspect/Makefile +++ b/contrib/pageinspect/Makefile @@ -2,13 +2,13 @@ MODULE_big = pageinspect OBJS = rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \ - brinfuncs.o ginfuncs.o $(WIN32RES) + brinfuncs.o ginfuncs.o hashfuncs.o $(WIN32RES) EXTENSION = pageinspect -DATA = pageinspect--1.5.sql pageinspect--1.4--1.5.sql \ - pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \ - pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \ - pageinspect--unpackaged--1.0.sql +DATA = pageinspect--1.6.sql pageinspect--1.5--1.6.sql \ + pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \ + pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \ + pageinspect--1.0--1.1.sql pageinspect--unpackaged--1.0.sql PGFILEDESC = "pageinspect - functions to inspect contents of database pages" ifdef USE_PGXS diff --git a/contrib/pageinspect/hashfuncs.c b/contrib/pageinspect/hashfuncs.c new file mode 100644 index 000..96e72a9 --- /dev/null +++ b/contrib/pageinspect/hashfuncs.c @@ -0,0 +1,400 @@ +/* + * hashfuncs.c + * Functions to investigate the content of HASH indexes + * + * Copyright 
(c) 2016, PostgreSQL Global Development Group + * + * IDENTIFICATION + * contrib/pageinspect/hashfuncs.c + */ + +#include "postgres.h" + +#include "access/hash.h" +#include "funcapi.h" +#include "miscadmin.h" + +PG_FUNCTION_INFO_V1(hash_metapage_info); +PG_FUNCTION_INFO_V1(hash_page_items); +PG_FUNCTION_INFO_V1(hash_page_stats); + +/* + * structure for single hash page statistics + * + */ +typedef struct HashPageStat +{ + uint32 live_items; + uint32 dead_items; + uint32 page_size; + uint32 free_size; + char type; + + /* opaque data */ + BlockNumber hasho_prevblkno; + BlockNumber hasho_nextblkno; + Bucket hasho_bucket; + uint16 hasho_flag; + uint16 hasho_page_id; +} HashPageStat; + +
Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
On Fri, Sep 23, 2016 at 9:20 AM, Amit Kapila wrote: > On Fri, Sep 23, 2016 at 6:50 AM, Robert Haas wrote: >> On Thu, Sep 22, 2016 at 7:44 PM, Tomas Vondra >> wrote: >>> I don't dare to suggest rejecting the patch, but I don't see how we could >>> commit any of the patches at this point. So perhaps "returned with feedback" >>> and resubmitting in the next CF (along with analysis of improved workloads) >>> would be appropriate. >> >> I think it would be useful to have some kind of theoretical analysis >> of how much time we're spending waiting for various locks. So, for >> example, suppose we one run of these tests with various client counts >> - say, 1, 8, 16, 32, 64, 96, 128, 192, 256 - and we run "select >> wait_event from pg_stat_activity" once per second throughout the test. >> Then we see how many times we get each wait event, including NULL (no >> wait event). Now, from this, we can compute the approximate >> percentage of time we're spending waiting on CLogControlLock and every >> other lock, too, as well as the percentage of time we're not waiting >> for lock. That, it seems to me, would give us a pretty clear idea >> what the maximum benefit we could hope for from reducing contention on >> any given lock might be. >> > As mentioned earlier, such an activity makes sense, however today, > again reading this thread, I noticed that Dilip has already posted > some analysis of lock contention upthread [1]. It is clear that patch > has reduced LWLock contention from ~28% to ~4% (where the major > contributor was TransactionIdSetPageStatus which has reduced from ~53% > to ~3%). Isn't it inline with what you are looking for? Hmm, yes. But it's a little hard to interpret what that means; I think the test I proposed in the quoted material above would provide clearer data. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
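For illustration, the sampling analysis Robert describes reduces to counting samples per wait event and dividing by the total number of samples. A minimal sketch in C (the event names and sample data here are hypothetical, not measured):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Given once-per-second samples of pg_stat_activity.wait_event
 * (NULL meaning "not waiting"), compute the fraction of samples
 * spent waiting on a given event.
 */
static double
wait_event_fraction(const char *samples[], int nsamples, const char *event)
{
    int hits = 0;
    int i;

    for (i = 0; i < nsamples; i++)
        if (samples[i] != NULL && strcmp(samples[i], event) == 0)
            hits++;

    return (double) hits / nsamples;
}
```

Summing these fractions per lock gives the upper bound on the benefit of reducing contention on that lock, which is the point of the proposed test.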
Re: [HACKERS] Showing parallel status in \df+
On Mon, Sep 26, 2016 at 10:48 AM, Stephen Frost wrote: > I agree that "do nothing" isn't a good option. I'm not terribly > thrilled with just putting the source code at the bottom of the \df+ > output either, though it's at least slightly less ridiculous than trying > to put the source code into a column in a table. Several people have spoken against that option, but I think it's actually pretty clearly an improvement over the status quo. If you don't want to see all of that output, fine; look at the first screenful and then quit your pager (which probably means "press q"). If you do want to see all of the output, you'll appreciate not having it indented by 60 or 80 columns any more. There's really no circumstance under which it's worse than what we're doing today. It's fairly likely that there's no option here that will please everyone completely, but that's not a reason to reject a patch that is clearly better than what we've got now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] proposal: psql \setfileref
2016-09-26 14:57 GMT+02:00 Ryan Murphy : > Hi Pavel, > > I just tried to apply your patch psql-setfileref-initial.patch (using git > apply) to the newest revision of postgres at the time (da6c4f6ca88) and it > failed to patch startup.c. Thinking that the patch was for some previous > revision, I then checked out d062245b5, which was from 2016-08-31, and > tried applying the patch from there. I had the same problem: > > $ git apply psql-setfileref-initial.patch > error: patch failed: src/bin/psql/startup.c:106 > error: src/bin/psql/startup.c: patch does not apply > > However, when I applied the changes to startup.c manually and removed them > from the patch, the rest of the patch applied without a problem. I don't > know if I may have done something wrong in trying to apply the patch, but > you may want to double check if you need to regenerate your patch from the > latest revision so it will apply smoothly for reviewers. > Thank you for info. I'll do it immediately. Regards Pavel > > In the meantime, I haven't had a chance to try out the fileref feature yet > but I'll give it a go when I get a chance! > > Thanks! > Ryan > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
Re: [HACKERS] temporary table vs array performance
2016-09-26 17:39 GMT+02:00 dby...@163.com : > test: > create type h3 as (id int,name char(10)); > > CREATE or replace FUNCTION proc17() > RETURNS SETOF h3 AS $$ > DECLARE > v_rec h3; > BEGIN > create temp table abc(id int,name varchar) on commit drop; > insert into abc select 1,'lw'; > insert into abc select 2,'lw2'; > for v_rec in > select * from abc loop > return next v_rec; > end loop; > END; > $$ > LANGUAGE plpgsql; > > > CREATE or replace FUNCTION proc16() > RETURNS SETOF h3 AS $$ > DECLARE > id_array int[]; > name_arr varchar[]; > v_rec h3; > BEGIN > id_array =array[1,2]; > name_arr=array['lw','lw2']; > for v_rec in > select unnest(id_array) ,unnest(name_arr) loop > return next v_rec; > end loop; > END; > $$ > LANGUAGE plpgsql; > postgres=# select * from proc17(); > id |name > + > 1 | lw > 2 | lw2 > (2 rows) > > Time: 68.372 ms > postgres=# select * from proc16(); > id |name > + > 1 | lw > 2 | lw2 > (2 rows) > > Time: 1.357 ms > > temp talbe result: > [postgres@pg95 test_sql]$ pgbench -M prepared -n -r -c > 2 -j 2 -T 10 -f temporary_test_1.sql > transaction type: Custom query > scaling factor: 1 > query mode: prepared > number of clients: 2 > number of threads: 2 > duration: 10 s > number of transactions actually processed: 5173 > latency average: 3.866 ms > tps = 517.229191 (including connections establishing) > tps = 517.367956 (excluding connections establishing) > statement latencies in milliseconds: > 3.863798 select * from proc17(); > > array result: > [postgres@pg95 test_sql]$ pgbench -M prepared -n -r -c > 2 -j 2 -T 10 -f arrary_test_1.sql > transaction type: Custom query > scaling factor: 1 > query mode: prepared > number of clients: 2 > number of threads: 2 > duration: 10 s > number of transactions actually processed: 149381 > latency average: 0.134 ms > tps = 14936.875176 (including connections establishing) > tps = 14940.234960 (excluding connections establishing) > statement latencies in milliseconds: > 0.132983 select * from proc16(); > > Array 
is not convenient to use in function, whether > there are other methods can be replaced temp table in function > > Temporary tables are pretty expensive, for several reasons, and they are horrible when you use a fresh table for only two rows, more so if you recreate it every transaction. The more common pattern is to create the table once and delete from it repeatedly. Better not to use temp tables unless it is necessary; that is one reason why PostgreSQL supports arrays. Partially, PostgreSQL arrays are an analogy to T-SQL memory tables. Regards Pavel > > -- > dby...@163.com >
Re: [HACKERS] Allowing GIN array_ops to work on anyarray
Great. Given that it does not apply to this patch, all the other tests passed and the change looks good. Thank you, Enrique On Mon, Sep 26, 2016 at 6:27 AM Tom Lane wrote: > Enrique Meneses writes: > > I was not sure what "Spec compliant means"... so I did not select as > tested or passed. What should I do to validate that this change is "Spec > compliant"? > > It's irrelevant to this patch, AFAICS. The SQL standard doesn't discuss > indexes at all, much less legislate on which operator classes ought to > exist for GIN indexes. > > regards, tom lane >
Re: [HACKERS] temporary table vs array performance
It's considered bad form to post to multiple lists. Please pick the most relevant one - in this case I'd suggest -general. On Mon, Sep 26, 2016 at 8:39 AM, dby...@163.com wrote: > > Array is not convenient to use in function, whether > there are other methods can be replaced temp table in function > > I have no difficulty using arrays in functions. As for "other methods" - you can use CTE (WITH) to create a truly local table - updating the catalogs by using a temp table is indeed quite expensive. WITH vals AS ( VALUES (1, 'lw'), (2, 'lw2') ) SELECT * FROM vals; David J.
[HACKERS] temporary table vs array performance
test: create type h3 as (id int,name char(10)); CREATE or replace FUNCTION proc17() RETURNS SETOF h3 AS $$ DECLARE v_rec h3; BEGIN create temp table abc(id int,name varchar) on commit drop; insert into abc select 1,'lw'; insert into abc select 2,'lw2'; for v_rec in select * from abc loop return next v_rec; end loop; END; $$ LANGUAGE plpgsql; CREATE or replace FUNCTION proc16() RETURNS SETOF h3 AS $$ DECLARE id_array int[]; name_arr varchar[]; v_rec h3; BEGIN id_array =array[1,2]; name_arr=array['lw','lw2']; for v_rec in select unnest(id_array) ,unnest(name_arr) loop return next v_rec; end loop; END; $$ LANGUAGE plpgsql; postgres=# select * from proc17(); id |name + 1 | lw 2 | lw2 (2 rows) Time: 68.372 ms postgres=# select * from proc16(); id |name + 1 | lw 2 | lw2 (2 rows) Time: 1.357 ms temp table result: [postgres@pg95 test_sql]$ pgbench -M prepared -n -r -c 2 -j 2 -T 10 -f temporary_test_1.sql transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 2 number of threads: 2 duration: 10 s number of transactions actually processed: 5173 latency average: 3.866 ms tps = 517.229191 (including connections establishing) tps = 517.367956 (excluding connections establishing) statement latencies in milliseconds: 3.863798 select * from proc17(); array result: [postgres@pg95 test_sql]$ pgbench -M prepared -n -r -c 2 -j 2 -T 10 -f arrary_test_1.sql transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 2 number of threads: 2 duration: 10 s number of transactions actually processed: 149381 latency average: 0.134 ms tps = 14936.875176 (including connections establishing) tps = 14940.234960 (excluding connections establishing) statement latencies in milliseconds: 0.132983 select * from proc16(); Arrays are not convenient to use in functions; are there other methods that could replace temp tables in functions? dby...@163.com
Re: [HACKERS] WIP: Covering + unique indexes.
24.09.2016 15:36, Amit Kapila: On Wed, Sep 21, 2016 at 6:51 PM, Anastasia Lubennikova wrote: 20.09.2016 08:21, Amit Kapila: On Tue, Sep 6, 2016 at 10:18 PM, Anastasia Lubennikova wrote: 28.08.2016 09:13, Amit Kapila: The problem seems really tricky, but the answer is simple. We store included columns unordered. It was mentioned somewhere in this thread. Is there any fundamental problem in storing them in ordered way? I mean to say, you need to anyway store all the column values on leaf page, so why can't we find the exact location for the complete key. Basically use truncated key to reach to leaf level and then use the complete key to find the exact location to store the key. I might be missing something here, but if we can store them in ordered fashion, we can use them even for queries containing ORDER BY (where ORDER BY contains included columns). I'd say that the reason for not using included columns in any operations which require comparisons, is that they don't have opclass. Let's go back to the example of points. This data type doesn't have any opclass for B-tree, because of fundamental reasons. And we can not apply _bt_compare() and others to this attribute, so we don't include it to scan key. create table t (i int, i2 int, p point); create index idx1 on t (i) including (i2); create index idx2 on t (i) including (p); create index idx3 on t (i) including (i2, p); create index idx4 on t (i) including (p, i2); You can keep tuples ordered in idx1, but not for idx2, partially ordered for idx3, but not for idx4. At the very beginning of this thread [1], I suggested to use opclass, where possible. Exactly the same idea, you're thinking about. But after short discussion, we came to conclusion that it would require many additional checks and will be too complicated, at least for the initial patch. Let me give you an example: create table t (i int, p point); create index on t (i) including (p);
Should we insert (0, '(0,2)') before (0, '(1,1)') or after? We have no idea what is the "correct order" for this attribute. So the answer is "it doesn't matter". When searching in index, we know that only key attrs are ordered, so only them can be used in scankey. Other columns are filtered after retrieving data. explain select i,p from t where i =0 and p <@ circle '((0,0),2)'; QUERY PLAN --- Index Only Scan using idx on t (cost=0.14..4.20 rows=1 width=20) Index Cond: (i = 0) Filter: (p <@ '<(0,0),2>'::circle) I think here reason for using Filter is that because we don't keep included columns in scan keys, can't we think of having them in scan keys, but use only key columns in scan key to reach till leaf level and then use complete scan key at leaf level. What should I add to README (or to documentation), to make it more understandable? May be add the data representation like only leaf pages contains all the columns and how the scan works. I think you can see if you can extend "Notes About Data Representation" and or "Other Things That Are Handy to Know" sections in existing README. Ok, I'll write it in a few days. [1] https://www.postgresql.org/message-id/55f84df4.5030...@postgrespro.ru -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
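The lookup strategy described above (locate matching tuples by the ordered key columns, then filter on the unordered included columns) can be sketched in miniature. This is an illustrative in-memory model, not the actual nbtree code:

```c
#include <assert.h>
#include <stddef.h>

typedef struct
{
    int key;   /* indexed (ordered) attribute, like "i" above */
    int inc;   /* included (unordered) attribute, like "p" above */
} IndexTuple;

/* example predicate, standing in for "p <@ circle" above */
static int
inc_is_even(int x)
{
    return x % 2 == 0;
}

/*
 * Find the first tuple matching 'key' whose included column satisfies
 * 'pred'.  The array is sorted by 'key' only, so we locate the run of
 * equal keys by comparison, then filter on the included column; the
 * included values within that run are in arbitrary order.
 */
static const IndexTuple *
find_with_filter(const IndexTuple *tuples, size_t n, int key,
                 int (*pred) (int))
{
    size_t lo = 0, hi = n;

    /* binary search for the first tuple with tuples[i].key >= key */
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (tuples[mid].key < key)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* scan the run of equal keys, filtering on the included column */
    for (; lo < n && tuples[lo].key == key; lo++)
        if (pred(tuples[lo].inc))
            return &tuples[lo];

    return NULL;
}
```

This mirrors the EXPLAIN output quoted above: "Index Cond" uses only the key column, while the included column is applied as a "Filter" after the tuple is fetched.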
Re: [HACKERS] Showing parallel status in \df+
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Pavel Stehule writes: > > 2016-09-23 7:22 GMT+02:00 Rushabh Lathia : > >> On Thu, Sep 22, 2016 at 10:04 PM, Tom Lane wrote: > >>> If it's unreadable in \df+, how would \df++ make that any better? > > >> Eventhough source code as part of \df+ is bit annoying (specifically > >> for PL functions), I noticed the argument in this thread that it's > >> useful information for some of. So \df++ is just alternate option for > >> the those who want the source code. > > > ++ is little bit obscure. So better to remove src everywhere. > > Well, that was suggested upthread (which is where the idea of relying > on \sf came from) and Peter objected on the quite reasonable grounds > that people expect \df+ to provide this info and won't know to go > use \sf instead. So I'm afraid that suggestion is going nowhere. For my 2c, I disagree that "just because it's always been there and that's where people know to go look" is a reason to not remove it. Moving src out of \df+ will mean that people looking for it will need to use \? to see where it went (or use \ef, which is what I'd argue most already do today..), but I hardly see that as a huge issue and the improvement in readability of \df+ is well worth that cost. > I think the options that have a chance of happening are to rearrange > \df+ output more or less as in my patch, or to do nothing. I'm not very > happy about "do nothing", but that seems to be where we're ending up. I agree that "do nothing" isn't a good option. I'm not terribly thrilled with just putting the source code at the bottom of the \df+ output either, though it's at least slightly less ridiculous than trying to put the source code into a column in a table. 
If we really are worried that people who know how to use \df+ and how to write plpgsql (or other PL) code can't figure out how to view the src with \sf or \ef, then we could include at the bottom of the \df+ output a hint which essentially says "use \sf to view function source". Alternatively, and I kind of hate suggesting this, but it's not like most people don't already have a .psqlrc to deal with our silly defaults, we could add a variable to control if src is included in \df+ or not. Thanks! Stephen
Re: [HACKERS] Showing parallel status in \df+
Pavel Stehule writes: > 2016-09-23 7:22 GMT+02:00 Rushabh Lathia : >> On Thu, Sep 22, 2016 at 10:04 PM, Tom Lane wrote: >>> If it's unreadable in \df+, how would \df++ make that any better? >> Eventhough source code as part of \df+ is bit annoying (specifically >> for PL functions), I noticed the argument in this thread that it's >> useful information for some of. So \df++ is just alternate option for >> the those who want the source code. > ++ is little bit obscure. So better to remove src everywhere. Well, that was suggested upthread (which is where the idea of relying on \sf came from) and Peter objected on the quite reasonable grounds that people expect \df+ to provide this info and won't know to go use \sf instead. So I'm afraid that suggestion is going nowhere. I think the options that have a chance of happening are to rearrange \df+ output more or less as in my patch, or to do nothing. I'm not very happy about "do nothing", but that seems to be where we're ending up. regards, tom lane
Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?
"Tsunakawa, Takayuki" writes: >> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas >> I think that we shouldn't start changing things based on guesses about what >> the problem is, even if they're fairly smart guesses. The thing to do would >> be to construct a test rig, crash the server repeatedly, and add debugging >> instrumentation to figure out where the time is actually going. > We have tried to reproduce the problem in the past several days with much > more stress on our environment than on the customer's one -- 1,000 tables > aiming for a dozens of times larger stats file and repeated reconnection > requests from hundreds of clients -- but we could not succeed. >> I do think your theory about the stats collector might be worth pursuing. >> It seems that the stats collector only responds to SIGQUIT, ignoring SIGTERM. >> Making it do a clean shutdown on SIGTERM and a fast exit on SIGQUIT seems >> possibly worthwhile. > Thank you for giving confidence for proceeding. And I also believe that > postmaster should close the listening ports earlier. Regardless of whether > this problem will be solved (I'm not confident these will solve it), I think it'd > be better to fix these two points so that postmaster doesn't take longer than > necessary. I think I'll create a patch after giving it a bit more thought. FWIW, I'm pretty much -1 on messing with the timing of the socket close actions. I broke that once within recent memory, so maybe I'm gun-shy, but I think that the odds of unpleasant side effects greatly outweigh any likely benefit there. Allowing SIGQUIT to prompt fast shutdown of the stats collector seems sane, though. Try to make sure it doesn't leave partly-written stats files behind. regards, tom lane
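One way to guarantee that no partly-written stats file survives a fast exit is the write-temp-then-rename pattern. A minimal sketch, with illustrative file names rather than the real pgstat paths:

```c
#include <stdio.h>
#include <string.h>

/*
 * Write contents to a temp file, then rename() it into place.
 * rename() within one filesystem is atomic on POSIX, so dying
 * partway through the write leaves at most a stale temp file,
 * never a truncated final file.
 */
static int
write_stats_file(const char *final_path, const char *tmp_path,
                 const char *contents)
{
    FILE   *fp = fopen(tmp_path, "w");

    if (fp == NULL)
        return -1;
    if (fputs(contents, fp) == EOF)
    {
        fclose(fp);
        remove(tmp_path);
        return -1;
    }
    if (fclose(fp) == EOF)
    {
        remove(tmp_path);
        return -1;
    }
    return rename(tmp_path, final_path);
}
```

A SIGQUIT fast exit between the fopen and the rename then leaves only the temp file behind, which the next writer simply overwrites.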
Re: [HACKERS] Small race in pg_xlogdump --follow
Magnus Hagander writes: > Oh, right, at the very last loop. I've never seen it need more than 1 loop > so I didn't manage to hit that codepath :) But yeah, saving errno and > restoring it on the other side of pg_usleep is probably a good idea. Right. On a larger scale --- I'm too uncaffeinated to figure out whether this is the code path taken for the initially user-supplied file name. But it would be unfriendly if when you fat-finger the command line arguments, it takes 5 seconds for pg_xlogdump to tell you so. Maybe the looping behavior should only happen on non-first file access. regards, tom lane
Re: [HACKERS] Stopping logical replication protocol
>You should rely on timing in tests as little as possible. Some of the test hosts are VERY slow. If possible something deterministic should be used. That's why these changes are difficult to verify. Maybe in that case we should write a benchmark rather than an integration test? Then we could say: before these changes, stopping logical replication takes N ms, and after applying them it takes NN ms, where NN is less than N. Is it possible to add this kind of benchmark to PostgreSQL? I would be grateful for links. 2016-09-26 5:32 GMT+03:00 Craig Ringer : > On 26 Sep. 2016 02:16, "Vladimir Gordiychuk" wrote: > > > > >It looks like you didn't import my updated patches, so I've > rebased your new patches on top of them. > > Yes, i forgot it, sorry. Thanks for you fixes. > > > > >I did't see you explain why this was removed: > > > > -/* fast path */ > > -/* Try to flush pending output to the client */ > > -if (pq_flush_if_writable() != 0) > > -WalSndShutdown(); > > - > > -if (!pq_is_send_pending()) > > -return; > > > > I remove it, because during decode long transaction code, we always > exit from function by fast path and not receive messages from client. Now > we always include in 'for' loop and executes > > /* Check for input from the client */ ProcessRepliesIfAny(); > > Makes sense. > I > > >Some of the changes to pg_recvlogical look to be copied from > > >receivelog.c, most notably the functions ProcessKeepalive and > > >ProcessXLogData . > > > > During refactoring huge function I also notice It, but decide not include additional changes in already huge patch. I only split method that do everything to few small functions. > > Good decision. This improves things already and makes future refactoring easier. > > >I was evaluating the tests and I don't think they actually demonstrate > > >that the patch works. There's nothing done to check that > > >pg_recvlogical exited because of client-initiated CopyDone.
While > > >looking into that I found that it also never actually hits > > >ProcessCopyDone(...) because libpq handles the CopyDone reply from the > > >server its self and treats it as end-of-stream. So the loop in > > >StreamLogicalLog will just end and ProcessCopyDone() is dead code. > > > > The main idea was about fast stop logical replication as it available, > because if stop replication gets more them 1 seconds it takes more pain. > > You should rely on time I tests as little as possible. Some of the test > hosts are VERY slow. If possible something deterministic should be used. > > > Increase this timeout to 3 or 5 second hide problems that this PR try > fix, that's why I think this lines should be restore to state from previous > patch. > > Per above I don't agree. > > > > > ``` > > ok($spendTime <= 5, # allow extra time for slow machines > > "pg_recvlogical exited promptly on signal when decoding"); > > > > ok((time() - $cancelTime) <= 3, # allow extra time for slow machines > > "pg_recvlogical exited promptly on sigint when idle" > > ); > > > > ``` > > > > Also I do not understand why we do > > > > $proc->pump while $proc->pumpable; > > > > after send signal to process, why we can not just wait stop replication > slot? > > Because verbose output can produce a lot of writes. If we don't pump the > buffer pg_recvlogical could block writing on its socket. Also it lets us > capture stderr which we need to observe that pg_recvlogical actually ended > copy on user command rather than just receiving all the input available. >
Re: [HACKERS] Small race in pg_xlogdump --follow
On Mon, Sep 26, 2016 at 3:44 PM, Tom Lane wrote: > Magnus Hagander writes: > > Attached patch puts a retry loop around opening the file that retries > for 5 > > seconds (which is excessive, but should be safe) in case the file is > > missing (and still fails out for other error messages of course). > > > Comments? > > The patch assumes that pg_usleep won't change errno, an assumption > I have little faith in. > > Oh, right, at the very last loop. I've never seen it need more than 1 loop so I didn't manage to hit that codepath :) But yeah, saving errno and restoring it on the other side of pg_usleep is probably a good idea. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ diff --git a/src/bin/pg_xlogdump/pg_xlogdump.c b/src/bin/pg_xlogdump/pg_xlogdump.c index 02575eb..6947c0c 100644 --- a/src/bin/pg_xlogdump/pg_xlogdump.c +++ b/src/bin/pg_xlogdump/pg_xlogdump.c @@ -249,6 +249,7 @@ XLogDumpXLogRead(const char *directory, TimeLineID timeline_id, if (sendFile < 0 || !XLByteInSeg(recptr, sendSegNo)) { char fname[MAXFNAMELEN]; + int tries; /* Switch to another logfile segment */ if (sendFile >= 0) @@ -258,7 +259,24 @@ XLogDumpXLogRead(const char *directory, TimeLineID timeline_id, XLogFileName(fname, timeline_id, sendSegNo); - sendFile = fuzzy_open_file(directory, fname); + for (tries = 0; tries < 10; tries++) + { +sendFile = fuzzy_open_file(directory, fname); +if (sendFile >= 0) + break; +if (errno == ENOENT) +{ + int save_errno = errno; + + /* File not there yet, try again */ + pg_usleep(500 * 1000); + + errno = save_errno; + continue; +} +/* Any other error, fall through and fail */ +break; + } if (sendFile < 0) fatal_error("could not find file \"%s\": %s", -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench more operators & functions
On Sat, Jul 9, 2016 at 12:14 PM, Fabien COELHO wrote: > >> Here is a simple patch which adds a bunch of operators (bitwise: & | ^ ~, >> comparisons: =/== <>/!= < <= > >=, logical: and/&& or/|| xor/^^ not/!) and >> functions (exp ln if) to pgbench. I've tried to be pg's SQL compatible where >> appropriate. >> >> Also attached is a simple test script. > > > Here is a sightly updated version. > Hi Fabien, I did the review of your patch and here are my views on your patch. Purpose of the patch: = This patch introduces extensive list of new operators and functions that can be used in expressions in pgbench transaction scripts(given with -f option). Here is the list of operators and functions introduced by this patch: New operators: -- bitwise: <<, >>, &, |, ^/#, ~ comparisons: =/==, <>/!=, <, <=, >, >= logical: and/&&, or/||, xor/^^, not, ! New functions: -- exp, ln, if I see there had been a discussion about timing for submission of patch, but not much about what the patch is doing, so I believe the purpose of patch is quite acceptable. Currently there are limited number of operators available in pgbench. So, I think these new operators definitely make a value addition towards custom scripts. Documentation: = I could build the documentation without any errors. New operators and functions are well categorized and added in proper sections of existing pgbench documentation. I have a small suggestion here: Earlier we had only few number of operators, so it was OK to have the operators embedded in the description of '\set' command itself, but now as we have much more number of operators will it be a good idea to have a table of operators similar to that of functions. We need not have several columns here like description, example etc., but a short table just categorizing the operators would be sufficient. Initial Run: I was able to apply patch with 'patch -p1'. The testcase file(functions.sql) given along the patch gives an expected output. 
Further testing and review: === 1. Postgres treats '^' as exponentiation rather than XOR, and '#' as XOR. Personally, I think it can cause confusion, so it will be better if we can stick to the behavior of Postgres mathematical operators. 2. I could not see any tests for bitwise operators in the functions.sql file that you have attached. 3. Precedence: a. Unary operators have more precedence than binary operators. So, UNOT and UINV should have precedence next to UMINUS. I tried a couple of test expressions in C vs. your patch (pgbench):

expression   result_in_C   result_in_pgbench
(~14-14+2)   -27           -3
(!14-14+2)   -12           0

b. Similarly, shift operators should take precedence over the other bitwise operators:

expression   result_in_C   result_in_pgbench
(4|1<<1)     6             10
(4^5&3)      5             1

c. Also, comparison would take precedence over the bitwise operators (&, |, ^) but not the shift operators:

expression   result_in_C   result_in_pgbench
(2&1<3)      1             0

In backend/parser/gram.y, I see that unary operators are given higher precedence than other operators, but it too does not have explicit precedence defined for bitwise operators. I tried to check Postgres documentation for operator precedence, but in 'Lexical Structure'[1] the documentation does not mention anything about bitwise operators. Also, SQL standards 99 does not talk much about operator precedence. I may be wrong while trying to understand the precedence you defined here and you might have picked this per some standard, but do you have any reference which you considered for this? 4. If we are going to stick to current precedence, I think it will be good idea to document it. 5. Sorry, I was not able to understand the "should exist" comment in following snippet. +"xor" { return XOR_OP; } /* should exist */ +"^^" { return XOR_OP; } /* should exist */ 7. You may want to reword following comment: + else /* cannot get there */ To + else /* cannot get here */ 8. + case PGBENCH_IF: + /* should it do a lazy evaluation of the branch?
*/ + Assert(nargs == 3); + *retval = coerceToBool(&vargs[0]) ? vargs[1] : vargs[2]; I believe the ternary operator does the lazy evaluation internally, but to be sure you may consider rewriting this as follows: if (coerceToBool(&vargs[0])) *retval = vargs[1]; else *retval = vargs[2]; Conclusion: === I have tested the patch and each of the operators is implemented correctly. The only concern I have is precedence, otherwise the patch seems to be doing what it is supposed to do. [1]https://www.postgresql.org/docs/9.6/static/sql-syntax-lexical.html Regards, Jeevan Ladhe.
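The C-side results in the precedence tables above can be spot-checked directly. In C, unary ~ and ! bind tightest, then shifts, then relational <, then & before ^ before |:

```c
#include <assert.h>

/*
 * Spot-checks of the C results quoted in the review's precedence
 * tables.
 */
static void
check_c_precedence(void)
{
    assert((~14 - 14 + 2) == -27);   /* ~14 == -15 */
    assert((!14 - 14 + 2) == -12);   /* !14 == 0 */
    assert((4 | 1 << 1) == 6);       /* << binds tighter than | */
    assert((4 ^ 5 & 3) == 5);        /* &  binds tighter than ^ */
}
```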
Re: [HACKERS] Small race in pg_xlogdump --follow
Magnus Hagander writes: > Attached patch puts a retry loop around opening the file that retries for 5 > seconds (which is excessive, but should be safe) in case the file is > missing (and still fails out for other error messages of course). > Comments? The patch assumes that pg_usleep won't change errno, an assumption I have little faith in. regards, tom lane
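The defensive pattern for Tom's concern is to save errno before sleeping and restore it afterwards. A minimal standalone sketch, using nanosleep in place of pg_usleep:

```c
#include <errno.h>
#include <time.h>

/*
 * Sleep for the given number of milliseconds without disturbing the
 * caller's errno, which a retry loop may still need after a failed
 * open().  The sleep call itself may set errno (e.g. EINTR).
 */
static void
sleep_ms_preserving_errno(long ms)
{
    int save_errno = errno;
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);

    errno = save_errno;
}
```

This is the shape the updated pg_xlogdump patch adopts: save errno before pg_usleep, restore it before falling through to the error report.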
[HACKERS] Small race in pg_xlogdump --follow
When using pg_xlogdump in --follow mode, there is what I believe is a small race condition when files are switched. If pg_xlogdump detects then end of one file (by looking at the record) it will immediately try to open the following file. If that file has not yet been created, it will fail with an error message saying it cannot open the file. But there's a race condition where the server has not had time to put the file in place. Attached patch puts a retry loop around opening the file that retries for 5 seconds (which is excessive, but should be safe) in case the file is missing (and still fails out for other error messages of course). Comments? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ diff --git a/src/bin/pg_xlogdump/pg_xlogdump.c b/src/bin/pg_xlogdump/pg_xlogdump.c index 02575eb..a0e0a0c 100644 --- a/src/bin/pg_xlogdump/pg_xlogdump.c +++ b/src/bin/pg_xlogdump/pg_xlogdump.c @@ -249,6 +249,7 @@ XLogDumpXLogRead(const char *directory, TimeLineID timeline_id, if (sendFile < 0 || !XLByteInSeg(recptr, sendSegNo)) { char fname[MAXFNAMELEN]; + int tries; /* Switch to another logfile segment */ if (sendFile >= 0) @@ -258,7 +259,20 @@ XLogDumpXLogRead(const char *directory, TimeLineID timeline_id, XLogFileName(fname, timeline_id, sendSegNo); - sendFile = fuzzy_open_file(directory, fname); + for (tries = 0; tries < 10; tries++) + { +sendFile = fuzzy_open_file(directory, fname); +if (sendFile >= 0) + break; +if (errno == ENOENT) +{ + /* File not there yet, try again */ + pg_usleep(500*1000); + continue; +} +/* Any other error, fall through and fail */ +break; + } if (sendFile < 0) fatal_error("could not find file \"%s\": %s", -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VACUUM's ancillary tasks
Thomas Munro writes: > I noticed that ATExecAlterColumnType does this: > * Drop any pg_statistic entry for the column, since it's now wrong type > What if there is no rewrite, because the type is binary compatible > (ATColumnChangeRequiresRewrite returns false)? Then I think your patch > won't attract an autovacuum daemon to ANALYZE the table and produce new > statistics, because it won't touch changes_since_analyze. I wonder if we > should simply keep the stats if no rewrite is required. Aren't the > statistical properties of binary-compatible types also compatible? Not necessarily: the type semantics might be different --- in fact, probably are different, else why would there be distinct types in the first place? It would be unwise to keep the old stats IMO. If you need a concrete example, consider OID vs int4. They're binary-compatible, but since int4 is signed while OID is unsigned, stats for one would be wrong for the other. This is the same reason why ALTER COLUMN TYPE has to rebuild indexes even in binary-compatible cases. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
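The point can be checked mechanically: the same 32-bit bit pattern compares differently as unsigned OID vs signed int4, so histogram bounds computed under one ordering are invalid under the other. A quick illustration (standalone C, not PostgreSQL code):

```c
#include <stdint.h>

/* Compare two 32-bit values under OID (unsigned) semantics. */
static int
cmp_as_oid(uint32_t a, uint32_t b)
{
	return (a > b) - (a < b);
}

/* Compare the same bit patterns under int4 (signed) semantics. */
static int
cmp_as_int4(uint32_t a, uint32_t b)
{
	int32_t		sa = (int32_t) a;	/* reinterpret the bits as signed */
	int32_t		sb = (int32_t) b;

	return (sa > sb) - (sa < sb);
}
```

For the bit pattern 0x80000000: as an OID it is 2147483648 and sorts above 1, but as an int4 it is -2147483648 and sorts below 1 — so stats (histogram bounds, min/max) gathered for one type are simply wrong for the other.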
Re: [HACKERS] Allowing GIN array_ops to work on anyarray
Enrique Meneses writes: > I was not sure what "Spec compliant means"... so I did not select as tested > or passed. What should I do to validate that this change is "Spec compliant"? It's irrelevant to this patch, AFAICS. The SQL standard doesn't discuss indexes at all, much less legislate on which operator classes ought to exist for GIN indexes. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing GIN array_ops to work on anyarray
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: tested, passed

I built and installed (make world / make install-world) github branch REL9_6_STABLE and applied the patch (patch -p1 < patch/gin-true-anyarray-opclass-1.patch). I then upgraded my development database (postgres 9.5) using pg_upgrade. This database has one table with a UUID array gin index. The database was upgraded correctly to postgresql 9.6 and I was able to successfully connect to it from a web application which uses the database. There were no conflicts, so I expect others to be able to upgrade without issues. I then dropped the database and re-created it, making sure that I no longer used my prior operator class definition. I re-started my web application and confirmed it works. This verifies the feature works as designed. The following is no longer required:

CREATE OPERATOR CLASS _uuid_ops DEFAULT
  FOR TYPE _uuid USING gin AS
  OPERATOR 1 && (anyarray, anyarray),
  OPERATOR 2 @> (anyarray, anyarray),
  OPERATOR 3 <@ (anyarray, anyarray),
  OPERATOR 4 = (anyarray, anyarray),
  FUNCTION 1 uuid_cmp(uuid, uuid),
  FUNCTION 2 ginarrayextract(anyarray, internal, internal),
  FUNCTION 3 ginqueryarrayextract(anyarray, internal, smallint, internal, internal, internal, internal),
  FUNCTION 4 ginarrayconsistent(internal, smallint, anyarray, integer, internal, internal, internal, internal),
  STORAGE uuid;

I also ran "make installcheck-world" and all the tests passed. The new status of this patch is: Waiting on Author -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing GIN array_ops to work on anyarray
I was not sure what "Spec compliant" means... so I did not select it as tested or passed. What should I do to validate that this change is "Spec compliant"? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: psql \setfileref
Hi Pavel, I just tried to apply your patch psql-setfileref-initial.patch (using git apply) to the newest revision of postgres at the time (da6c4f6ca88) and it failed to patch startup.c. Thinking that the patch was for some previous revision, I then checked out d062245b5, which was from 2016-08-31, and tried applying the patch from there. I had the same problem: $ git apply psql-setfileref-initial.patch error: patch failed: src/bin/psql/startup.c:106 error: src/bin/psql/startup.c: patch does not apply However, when I applied the changes to startup.c manually and removed them from the patch, the rest of the patch applied without a problem. I don't know if I may have done something wrong in trying to apply the patch, but you may want to double check if you need to regenerate your patch from the latest revision so it will apply smoothly for reviewers. In the meantime, I haven't had a chance to try out the fileref feature yet but I'll give it a go when I get a chance! Thanks! Ryan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server
This patch will need some changes to conversion_error_callback(). That function reports an error in case there was an error converting the result obtained from the foreign server into an internal datum, e.g. when the string returned by the foreign server is not acceptable to the local input function for the expected datatype. In such cases, the input function will throw an error and conversion_error_callback() will provide appropriate context for that error. postgres_fdw.sql has tests to check that the proper context is reported:

-- ===
-- conversion error
-- ===
ALTER FOREIGN TABLE ft1 ALTER COLUMN c8 TYPE int;
SELECT * FROM ft1 WHERE c1 = 1; -- ERROR
SELECT ft1.c1, ft2.c2, ft1.c8 FROM ft1, ft2 WHERE ft1.c1 = ft2.c1 AND ft1.c1 = 1; -- ERROR
SELECT ft1.c1, ft2.c2, ft1 FROM ft1, ft2 WHERE ft1.c1 = ft2.c1 AND ft1.c1 = 1; -- ERROR
ALTER FOREIGN TABLE ft1 ALTER COLUMN c8 TYPE user_enum;

Right now we report the column name in the error context. This needs to change for aggregate pushdown, which is pushing down expressions. Right now, in case the aggregate on the foreign server produces a string unacceptable locally, it would crash at

4982         Assert(IsA(var, Var));
4983
4984         rte = rt_fetch(var->varno, estate->es_range_table);

since it's not a Var node as expected. We need to fix the error context to provide meaningful information, or at least not crash. This has been discussed briefly in [1].

[1] https://www.postgresql.org/message-id/flat/CAFjFpRdcC7Ykb1SkARBYikx9ubKniBiAgHqMD9e47TxzY2EYFw%40mail.gmail.com#cafjfprdcc7ykb1skarbyikx9ubknibiaghqmd9e47txzy2e...@mail.gmail.com

On Fri, Sep 16, 2016 at 7:15 PM, Jeevan Chalke wrote: > Hi, > > On Mon, Sep 12, 2016 at 5:17 PM, Jeevan Chalke > wrote: >> >> Thanks Ashutosh for the detailed review comments. >> >> I am working on it and will post updated patch once I fix all your >> concerns. >> >> > > Attached new patch fixing the review comments. > > Here are few comments on the review points: > > 1.
Renamed deparseFromClause() to deparseFromExpr() and > deparseAggOrderBy() to appendAggOrderBy() > > 2. Done > > 3. classifyConditions() assumes list expressions of type RestrictInfo. But > HAVING clause expressions (and JOIN conditions) are plain expressions. Do > you mean we should modify the classifyConditions()? If yes, then I think it > should be done as a separate (cleanup) patch. > > 4, 5. Both done. > > 6. Per my understanding, I think checking for just aggregate function is > enough as we are interested in whole aggregate result. Meanwhile I will > check whether we need to find and check shippability of transition, > combination and finalization functions or not. > > 7, 8, 9, 10, 11, 12. All done. > > 13. Fixed. However instead of adding new function made those changes inline. > Adding support for deparsing List expressions separating list by comma can > be > considered as cleanup patch as it will touch other code area not specific to > aggregate push down. > > 14, 15, 16, 17. All done. > > 18. I think re-factoring estimate_path_cost_size() should be done separately > as a cleanup patch too. > > 19, 20, 21, 22, 23, 24, 25, 26, 27, 28. All done. > > 29. I have tried passing input rel's relids to fetch_upper_rel() call in > create_grouping_paths(). It solved this patch's issue, but ended up with > few regression failures which is mostly plan changes. I am not sure whether > we get good plan now or not as I have not analyzed them. > So for this patch, I am setting relids in add_foreign_grouping_path() > itself. > However as suggested, used bms_copy(). I still kept the FIXME over there as > I think it should have done by the core itself. > > 30, 31, 32, 33. All done. > > Let me know your views. 
> > Thanks > -- > Jeevan B Chalke > Principal Software Engineer, Product Development > EnterpriseDB Corporation > The Enterprise PostgreSQL Company > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal_segment size vs max_wal_size
On Mon, Sep 26, 2016 at 5:04 PM, Amit Kapila wrote: > > IIRC, there is already a patch to update the minRecoveryPoint > correctly, can you check if that solves the problem for you? > > [1] - > https://www.postgresql.org/message-id/20160609.215558.118976703.horiguchi.kyotaro%40lab.ntt.co.jp > +1. I've tested after applying the patch. This clearly solves the problem. -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol
On 9/26/16 4:54 AM, Heikki Linnakangas wrote: > On 09/26/2016 09:02 AM, Michael Paquier wrote: >> On Mon, Sep 26, 2016 at 2:15 AM, David Steele >> wrote: >>> However, it doesn't look like they can be used in conjunction since the >>> pg_hba.conf entry must specify either m5 or scram (though the database >>> can easily contain a mixture). This would probably make a migration >>> very unpleasant. >> >> Yep, it uses a given auth-method once user and database match. This is >> partially related to the problem to support multiple password >> verifiers per users, which was submitted last CF but got rejected >> because of a lack of interest, and removed to simplify this patch. You >> need as well to think about other things like password and protocol >> aging. But well, it is a problem that we don't have to tackle with >> this patch... >> >>> Is there any chance of a mixed mode that will allow new passwords to be >>> set as scram while still honoring the old md5 passwords? Or does that >>> cause too many complications with the protocol? >> >> Hm. That looks complicated to me. This sounds to me like a retry logic >> if for multiple authentication methods, and a different feature. What >> you'd be looking for here is a connection parameter to specify a list >> of protocols and try them all, no? > > It would be possible to have a "md5-or-scram" authentication method in > pg_hba.conf, such that the server would look up the pg_authid row of the > user when it receives startup message, and send an MD5 or SCRAM > challenge depending on which one the user's password is encrypted with. > It has one drawback though: it allows an unauthenticated user to probe > if there is a role with a given name in the system, because if a user > doesn't exist, we'd have to still send an MD5 or SCRAM challenge, or a > "user does not exist" error without a challenge. 
If we send a SCRAM > challenge for a non-existent user, and the attacker knows that most > users still have an MD5 password, that reveals that the username most > likely doesn't exist. > > Hmm. The server could send a SCRAM challenge first, and if the client > gives an incorrect response, or the username doesn't exist, or the > user's password is actually MD5-encrypted, the server could then send an > MD5 challenge. It would add one round-trip to the authentication of MD5 > passwords, but that seems acceptable. > > We can do this as a follow-up patch though. Let's try to keep this patch > series small. Fair enough. I'm not even 100% sure we should do it, but wanted to raise it as a possible issue. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
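The information leak Heikki describes can be sketched in a few lines. Suppose a hypothetical "md5-or-scram" method chose the challenge type from the stored verifier (the helper name and the verifier-prefix check here are illustrative, not the actual catalog representation or server code):

```c
#include <string.h>
#include <stddef.h>

/* Sketch: choose a challenge type for an "md5-or-scram" HBA line.
 * 'verifier' is the role's stored password verifier, or NULL if no
 * such role exists. */
static const char *
pick_challenge(const char *verifier)
{
	if (verifier == NULL)
	{
		/* Must still answer something for unknown roles.  If the
		 * server answers SCRAM while most real roles are known to be
		 * on MD5, the challenge type itself hints the role is absent. */
		return "scram";
	}
	return strncmp(verifier, "SCRAM-SHA-256$", 14) == 0 ? "scram" : "md5";
}
```

This is why the follow-up idea above — always send a SCRAM challenge first and fall back to MD5 after a failed exchange — closes the probe at the cost of one extra round-trip for MD5 users: the first challenge no longer depends on whether the role exists.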
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Mon, Sep 26, 2016 at 5:25 PM, Masahiko Sawada wrote: > On Mon, Sep 26, 2016 at 7:28 PM, Ashutosh Bapat > wrote: >> My original patch added code to manage the files for 2 phase >> transactions opened by the local server on the remote servers. This >> code was mostly inspired from the code in twophase.c which manages the >> file for prepared transactions. The logic to manage 2PC files has >> changed since [1] and has been optimized. One of the things I wanted >> to do is see, if those optimizations are applicable here as well. Have >> you considered that? >> >> > > Yeah, we're considering it. > After these changes are committed, we will post the patch incorporated > these changes. > > But what we need to do first is the discussion in order to get consensus. > Since current design of this patch is to transparently execute DCL of > 2PC on foreign server, this code changes lot of code and is > complicated. Can you please elaborate? I am not able to understand what DCL is involved here. According to [1], examples of DCL are the GRANT and REVOKE commands. > Another approach I have is to push down DCL to only foreign servers > that support 2PC protocol, which is similar to DML push down. > This approach would be more simpler than current idea and is easy to > use by distributed transaction manager. Again, can you please elaborate how that would be different from the current approach and how it simplifies the code. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add support for restrictive RLS policies
Hi, I have started reviewing this patch and here are a couple of points I have observed so far:

1. Patch applies cleanly.
2. make / make install / initdb all good.
3. make check (regression) FAILED. (Attached diff file for reference.)

Please have a look at the failures. Meanwhile I will go ahead and review the code changes. Thanks -- Jeevan B Chalke Principal Software Engineer, Product Development EnterpriseDB Corporation The Enterprise PostgreSQL Company regression.diffs Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Mon, Sep 26, 2016 at 7:28 PM, Ashutosh Bapat wrote: > My original patch added code to manage the files for 2 phase > transactions opened by the local server on the remote servers. This > code was mostly inspired from the code in twophase.c which manages the > file for prepared transactions. The logic to manage 2PC files has > changed since [1] and has been optimized. One of the things I wanted > to do is see, if those optimizations are applicable here as well. Have > you considered that? > > Yeah, we're considering it. After these changes are committed, we will post the patch incorporating these changes. But what we need to do first is discuss the design in order to get consensus. Since the current design of this patch is to transparently execute the DCL of 2PC on the foreign server, it changes a lot of code and is complicated. Another approach I have is to push down the DCL to only those foreign servers that support the 2PC protocol, which is similar to DML push down. This approach would be simpler than the current idea and easy to use by a distributed transaction manager. I think that it would be a good place to start. I'd like to discuss what the best approach is for transactions involving foreign servers. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
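For reference, the overall flow being discussed — prepare on every foreign server first, and only commit the prepared transactions once every prepare succeeded — looks roughly like this. The types and helper names below are hypothetical stand-ins, not the patch's actual API:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical foreign-server handle recording what happened to it. */
typedef struct FakeServer
{
	bool		fail_prepare;	/* simulate a failing PREPARE TRANSACTION */
	int			prepared;
	int			committed;
	int			rolled_back;
} FakeServer;

static bool
server_prepare(FakeServer *s)
{
	if (s->fail_prepare)
		return false;			/* PREPARE TRANSACTION 'gid' failed */
	s->prepared = 1;
	return true;
}

static bool
two_phase_commit(FakeServer *servers, size_t n)
{
	size_t		i, done;

	/* Phase 1: PREPARE TRANSACTION on every server. */
	for (done = 0; done < n; done++)
		if (!server_prepare(&servers[done]))
			break;

	if (done < n)
	{
		/* Some prepare failed: ROLLBACK PREPARED what did succeed. */
		for (i = 0; i < done; i++)
			servers[i].rolled_back = 1;
		return false;
	}

	/* Phase 2: COMMIT PREPARED everywhere. */
	for (i = 0; i < n; i++)
		servers[i].committed = 1;
	return true;
}

/* Exercise both the all-good and the prepare-failure scenario;
 * returns 0 if both behaved as expected. */
static int
demo(void)
{
	FakeServer	ok[2] = {{false, 0, 0, 0}, {false, 0, 0, 0}};
	FakeServer	bad[2] = {{false, 0, 0, 0}, {true, 0, 0, 0}};

	if (!two_phase_commit(ok, 2) || !ok[0].committed || !ok[1].committed)
		return 1;
	if (two_phase_commit(bad, 2))
		return 2;
	if (!bad[0].rolled_back || bad[1].committed)
		return 3;
	return 0;
}
```

The "leftover prepared transaction" problem raised elsewhere in this thread is exactly the gap this sketch glosses over: if the coordinator crashes between the two phases, somebody (e.g. a resolver process) must later commit or roll back the transactions already prepared on the remote side.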
Re: [HACKERS] Add support for restrictive RLS policies
On Mon, Sep 12, 2016 at 7:27 AM, Stephen Frost wrote: > Robert, > > * Robert Haas (robertmh...@gmail.com) wrote: > > On Thu, Sep 8, 2016 at 5:21 PM, Tom Lane wrote: > > > Stephen Frost writes: > > >> * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: > > >>> Can't you keep those words as Sconst or something (DefElems?) until > the > > >>> execution phase, so that they don't need to be keywords at all? > > > > > >> Seems like we could do that, though I'm not convinced that it really > > >> gains us all that much. These are only unreserved keywords, of > course, > > >> so they don't impact users the way reserved keywords (of any kind) > can. > > >> While there may be some places where we use a string to represent a > set > > >> of defined options, I don't believe that's typical > > > > > > -1 for having to write them as string literals; but I think what Alvaro > > > really means is to arrange for the words to just be identifiers in the > > > grammar, which you strcmp against at execution. See for example > > > reloption_list. (Whether you use DefElem as the internal > representation > > > is a minor detail, though it might help for making the parsetree > > > copyObject-friendly.) > > > > > > vacuum_option_elem shows another way to avoid making a word into a > > > keyword, although to me that one is more of an antipattern; it'd be > better > > > to leave the strcmp to execution, since there's so much other code that > > > does things that way. > > > > There are other cases like that, too, e.g. AlterOptRoleElem; I don't > > think it's a bad pattern. Regardless of the specifics, I do think > > that it would be better not to bloat the keyword table with things > > that don't really need to be keywords. > > The AlterOptRoleElem case is, I believe, what Tom was just suggesting as > an antipattern, since the strcmp() is being done at parse time instead > of at execution time. 
> > If we are concerned about having too many unreserved keywords, then I > agree that AlterOptRoleElem is a good candidate to look at for reducing > the number we have, as it appears to contain 3 keywords which are not > used anywhere else (and 1 other which is only used in one other place). > > I do think that using IDENT for the various role attributes does make > sense, to be clear, as there are quite a few of them, they change > depending on major version, and those keywords are very unlikely to ever > have utilization elsewhere. > > For this case, there's just 2 keywords which seem like they may be used > again (perhaps for ALTER or DROP POLICY, or default policies if we grow > those in the future), and we're unlikely to grow more in the future for > that particular case (as we only have two binary boolean operators and > that seems unlikely to change), though, should that happen, we could > certainly revisit this. > To me, adding two new keywords for two new options does not look good, as it will bloat the keyword list. Per my understanding, we should add a keyword only when we have no alternative, e.g. to avoid grammar conflicts. I am also inclined to think that using an identifier is a good choice here. However, I would prefer to do the string comparison right in the grammar itself, so that if we get a wrong option as input there, we do not proceed further with it. We are going to throw an error later anyway, so why not at an early stage. Thanks > > Thanks! > > Stephen > -- Jeevan B Chalke Principal Software Engineer, Product Development EnterpriseDB Corporation The Enterprise PostgreSQL Company
Re: [HACKERS] wal_segment size vs max_wal_size
On Mon, Sep 26, 2016 at 4:00 PM, Kuntal Ghosh wrote: > On Wed, Sep 21, 2016 at 8:33 PM, Peter Eisentraut > wrote: >> There is apparently some misbehavior if max_wal_size is less than 5 * >> wal_segment_size. >> > >> This should probably be made friendlier in some way. But it also shows >> that bigger WAL segment sizes are apparently not well-chartered >> territory lately. >> > Well, there can be multiple solutions to this problem. > 1. If somebody intends to increase wal segment size, he should > increase max_wal_size accordingly. > 2. In recovery test, we can add some delay before taking backup so > that the pending logs in the buffer > gets flushed. (Not a good solution) > 3. In CreateRestartPoint() method, we can force a XLogFlush to update > minRecoveryPoint. > IIRC, there is already a patch to update the minRecoveryPoint correctly, can you check if that solves the problem for you? [1] - https://www.postgresql.org/message-id/20160609.215558.118976703.horiguchi.kyotaro%40lab.ntt.co.jp -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench - allow to store select results into variables
Hello Amit, I am marking the pgbench-into-5.patch [1] as "Ready for Committer" as I have no further comments at the moment. Wait... Heikki's latest commit now requires this patch to be rebased. Indeed. Here is the rebased version, which still get through my various tests. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 285608d..0a474e1 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -809,6 +809,30 @@ pgbench options dbname + + + \into var1 [var2 ...] + + + + + Stores the first fields of the resulting row from the current or preceding + SQL command into these variables. + The queries must yield exactly one row and the number of provided + variables must be less than the total number of columns of the results. + This meta command does not end the current SQL command. + + + + Example: + +SELECT abalance \into abalance + FROM pgbench_accounts WHERE aid=5432; + + + + + \set varname expression diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 1fb4ae4..9c13a18 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -373,11 +373,14 @@ static const char *QUERYMODE[] = {"simple", "extended", "prepared"}; typedef struct { - char *line; /* text of command line */ + char *line; /* first line for short display */ + char *lines; /* full multi-line text of command */ int command_num; /* unique index of this Command struct */ int type; /* command type (SQL_COMMAND or META_COMMAND) */ int argc; /* number of command words */ char *argv[MAX_ARGS]; /* command word list */ + int compound; /* last compound command (number of \;) */ + char ***intovars; /* per-compound command \into variables */ PgBenchExpr *expr; /* parsed expression, if needed */ SimpleStats stats; /* time spent in this command */ } Command; @@ -1221,6 +1224,110 @@ getQueryParams(CState *st, const Command *command, const char **params) params[i] = getVariable(st, command->argv[i + 1]); } +/* read all 
responses from backend */ +static bool +read_response(CState *st, char ***intovars) +{ + PGresult *res; + int compound = -1; + + while ((res = PQgetResult(st->con)) != NULL) + { + compound += 1; + + switch (PQresultStatus(res)) + { + case PGRES_COMMAND_OK: /* non-SELECT commands */ + case PGRES_EMPTY_QUERY: /* may be used for testing no-op overhead */ + if (intovars[compound] != NULL) + { +fprintf(stderr, + "client %d file %d command %d compound %d: " + "\\into expects a row\n", + st->id, st->use_file, st->command, compound); +st->ecnt++; +return false; + } + break; /* OK */ + + case PGRES_TUPLES_OK: + if (intovars[compound] != NULL) + { +/* store result into variables */ +int ntuples = PQntuples(res), + nfields = PQnfields(res), + f = 0; + +if (ntuples != 1) +{ + fprintf(stderr, + "client %d file %d command %d compound %d: " + "expecting one row, got %d\n", + st->id, st->use_file, st->command, compound, ntuples); + st->ecnt++; + PQclear(res); + discard_response(st); + return false; +} + +while (intovars[compound][f] != NULL && f < nfields) +{ + /* store result as a string */ + if (!putVariable(st, "into", intovars[compound][f], + PQgetvalue(res, 0, f))) + { + /* internal error, should it rather abort? 
*/ + fprintf(stderr, +"client %d file %d command %d compound %d: " +"error storing into var %s\n", +st->id, st->use_file, st->command, compound, +intovars[compound][f]); + st->ecnt++; + PQclear(res); + discard_response(st); + return false; + } + + f++; +} + +if (intovars[compound][f] != NULL) +{ + fprintf(stderr, + "client %d file %d command %d compound %d: missing results" + " to fill into variable %s\n", + st->id, st->use_file, st->command, compound, + intovars[compound][f]); + st->ecnt++; + return false; +} + } + break; /* OK */ + + default: + /* everything else is unexpected, so probably an error */ + fprintf(stderr, "client %d file %d aborted in command %d compound %d: %s", + st->id, st->use_file, st->command, compound, + PQerrorMessage(st->con)); + st->ecnt++; + PQclear(res); + discard_response(st); + return false; + } + + PQclear(res); + } + + if (compound == -1) + { + fprintf(stderr, "client %d command %d: no results\n", st->id, st->command); + st->ecnt++; + return false; + } + + return true; +} + /* get a value as an int, tell if there is a problem */ static bool coerceToInt(PgBenchValue *pval, int64 *ival) @@ -1953,7 +2060,6 @@ evaluateSleep(CState *st, int argc, char **argv, int *usecs) static void doCustom(TState *thread, CState *st, StatsData *agg) { - PGresul
Re: [HACKERS] Push down more full joins in postgres_fdw
On Mon, Sep 26, 2016 at 4:06 PM, Etsuro Fujita wrote: > On 2016/09/26 18:06, Ashutosh Bapat wrote: >> >> On Mon, Sep 26, 2016 at 1:05 PM, Etsuro Fujita >> wrote: > > >>> ISTM that the use of the same RTI for subqueries in multi-levels in a >>> remote >>> SQL makes the SQL a bit difficult to read. How about using the position >>> of >>> the join rel in join_rel_list, (more precisely, the position plus >>> list_length(root->parse->rtable)), instead? > > >> We switch to hash table to maintain the join RelOptInfos when the >> number of joins grows larger, where the position won't make much >> sense. > > > That's right, but we still store the joinrel into join_rel_list after > creating that hash table. That hash table is just for fast lookup. See > build_join_rel. Sorry, I wasn't clear in my reply. As the list grows, it will take longer to locate the RelOptInfo of a given join, since finding its position requires a linear scan of join_rel_list. Doing that just to create an alias seems like overkill. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/09/26 18:06, Ashutosh Bapat wrote: On Mon, Sep 26, 2016 at 1:05 PM, Etsuro Fujita wrote: ISTM that the use of the same RTI for subqueries in multi-levels in a remote SQL makes the SQL a bit difficult to read. How about using the position of the join rel in join_rel_list, (more precisely, the position plus list_length(root->parse->rtable)), instead? We switch to hash table to maintain the join RelOptInfos when the number of joins grows larger, where the position won't make much sense. That's right, but we still store the joinrel into join_rel_list after creating that hash table. That hash table is just for fast lookup. See build_join_rel. We might differentiate between a base relation alias and a subquery alias by using different prefixes like "r" and "s" respectively. Hmm. My concern about that would be the recursive use of "s" with the same RTI as subquery aliases for different-level subqueries in a single remote SQL. For example, if those subqueries include a base rel with RTI=1, then "s1" would be used again and again within that SQL. That would be logically correct, but seems confusing to me. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal_segment size vs max_wal_size
On Wed, Sep 21, 2016 at 8:33 PM, Peter Eisentraut wrote: > There is apparently some misbehavior if max_wal_size is less than 5 * > wal_segment_size. > > For example, if you build with --with-wal-segsize=64, then the recovery > test fails unless you set max_wal_size to at least 320MB in > PostgresNode.pm. The issue is that pg_basebackup fails with: > In recovery tests, max_wal_size is set to 128MB. Now, when you build with --with-wal-segsize=64, max_wal_size (in segments) is calculated as follows: max_wal_size = (128 * 1024 * 1024) / (64 * 1024 * 1024) = 2, and CheckPointSegments is calculated as follows: CheckPointSegments = 2 / (2 + 0.5) = 0.8, rounded up to 1. (The default is 3.) Hence, checkpoints occur very frequently on the master. > pg_basebackup: could not get transaction log end position from server: > ERROR: could not find any WAL files This error occurs when the recovery test tries to take a backup from the standby using the above settings. pg_basebackup scans pg_xlog and includes all WAL files in the range between 'startptr' and 'endptr', regardless of the timeline the file is stamped with. 'startptr' is initialized to ControlFile->checkPointCopy.redo and 'endptr' is initialized to ControlFile->minRecoveryPoint. Now, whenever we redo a CHECKPOINT_ONLINE record, we update checkPointCopy.redo, and whenever we flush logs, we update minRecoveryPoint. In this case, we are having frequent checkpoints at the master, which in turn update checkPointCopy.redo on the standby frequently. Sometimes it even goes ahead of minRecoveryPoint. At this point, if you call pg_basebackup, it will throw the aforesaid error. > This should probably be made friendlier in some way. But it also shows > that bigger WAL segment sizes are apparently not well-chartered > territory lately. > Well, there can be multiple solutions to this problem. 1. If somebody intends to increase the WAL segment size, he should increase max_wal_size accordingly. 2.
In the recovery test, we can add some delay before taking the backup so that the pending logs in the buffer get flushed. (Not a good solution.) 3. In the CreateRestartPoint() method, we can force an XLogFlush to update minRecoveryPoint. Thoughts? -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
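The arithmetic in the message above can be modeled in a few lines: max_wal_size is kept as a number of WAL segments, and CheckPointSegments is derived from it using checkpoint_completion_target, with a floor of 1. This is a sketch following the formula quoted in this thread, not the server's actual code:

```c
/* Model of CheckPointSegments = max_wal_size / (2 + target), min 1,
 * where max_wal_size is first converted from MB to whole segments. */
static int
checkpoint_segments(int max_wal_size_mb, int wal_segment_mb,
					double checkpoint_completion_target)
{
	int			max_wal_size_segs = max_wal_size_mb / wal_segment_mb;
	int			segs = (int) (max_wal_size_segs /
							  (2.0 + checkpoint_completion_target));

	return segs < 1 ? 1 : segs;
}
```

With the default 16 MB segments, 128 MB of max_wal_size is 8 segments and CheckPointSegments comes out at 3; with --with-wal-segsize=64 the same 128 MB is only 2 segments, and 2 / 2.5 = 0.8 is clamped to 1 — which is why checkpoints fire so often in the failing recovery test.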
Re: [HACKERS] Transactions involving multiple postgres foreign servers
My original patch added code to manage the files for 2 phase transactions opened by the local server on the remote servers. This code was mostly inspired from the code in twophase.c which manages the file for prepared transactions. The logic to manage 2PC files has changed since [1] and has been optimized. One of the things I wanted to do is see, if those optimizations are applicable here as well. Have you considered that? [1]. https://www.postgresql.org/message-id/74355FCF-AADC-4E51-850B-47AF59E0B215%40postgrespro.ru On Fri, Aug 26, 2016 at 11:43 AM, Ashutosh Bapat wrote: > > > On Fri, Aug 26, 2016 at 11:37 AM, Masahiko Sawada > wrote: >> >> On Fri, Aug 26, 2016 at 3:03 PM, Ashutosh Bapat >> wrote: >> > >> > >> > On Fri, Aug 26, 2016 at 11:22 AM, Masahiko Sawada >> > >> > wrote: >> >> >> >> On Fri, Aug 26, 2016 at 1:32 PM, Vinayak Pokale >> >> wrote: >> >> > Hi All, >> >> > >> >> > Ashutosh proposed the feature 2PC for FDW for achieving atomic >> >> > commits >> >> > across multiple foreign servers. >> >> > If a transaction make changes to more than two foreign servers the >> >> > current >> >> > implementation in postgres_fdw doesn't make sure that either all of >> >> > them >> >> > commit or all of them rollback their changes. >> >> > >> >> > We (Masahiko Sawada and me) reopen this thread and trying to >> >> > contribute >> >> > in >> >> > it. >> >> > >> >> > 2PC for FDW >> >> > >> >> > The patch provides support for atomic commit for transactions >> >> > involving >> >> > foreign servers. when the transaction makes changes to foreign >> >> > servers, >> >> > either all the changes to all the foreign servers commit or rollback. >> >> > >> >> > The new patch 2PC for FDW include the following things: >> >> > 1. The patch 0001 introduces a generic feature. All kinds of FDW that >> >> > support 2PC such as oracle_fdw, mysql_fdw, postgres_fdw etc. can >> >> > involve >> >> > in >> >> > the transaction. 
>> >> > >> >> > Currently we can push some conditions down to shard nodes; especially >> >> > in >> >> > 9.6, >> >> > the direct-modify feature has >> >> > been introduced. But such a transaction modifying data on a shard node >> >> > is >> >> > not >> >> > guaranteed to execute atomically. >> >> > Using the 0002 patch, that modification is executed with 2PC. It means that we >> >> > can almost >> >> > provide a sharding solution using >> >> > multiple PostgreSQL servers (one parent node and several shard nodes). >> >> > >> >> > For multi-master, we definitely need a transaction manager, but the >> >> > transaction >> >> > manager can probably use this 2PC for FDW feature to manage >> >> > distributed >> >> > transactions. >> >> > >> >> > 2. The 0002 patch makes it possible for postgres_fdw to use 2PC. >> >> > >> >> > The 0002 patch makes postgres_fdw use the below APIs. These APIs are >> >> > generic >> >> > features which can be used by all kinds of FDWs. >> >> > >> >> > a. Execute PREPARE TRANSACTION and COMMIT/ABORT PREPARED instead >> >> > of >> >> > COMMIT/ABORT on foreign servers which support 2PC. >> >> > b. Manage information of foreign prepared transactions for the resolver >> >> > >> >> > Masahiko Sawada will post the patch. >> >> > >> >> > >> >> >> > >> > Thanks Vinayak and Sawada-san for taking this forward and basing your >> > work >> > on my patch. >> > >> >> >> >> Still a lot of work to do, but attached are the latest patches. >> >> These are based on the patch Ashutosh posted before; I revised it and >> >> divided it into two patches. >> >> Compared with the original patch, the pg_fdw_xact_resolver patch and >> >> documentation are missing. >> > >> > >> > I am not able to understand the last statement. >> >> Sorry to confuse you. >> >> > Do you mean to say that your patches do not have pg_fdw_xact_resolver() >> > and >> > documentation that my patches had? >> >> Yes. >> I'm confirming them against what your patches had. > > > Thanks for the clarification.
I had added pg_fdw_xact_resolver() to resolve > any transactions which can not be resolved immediately after they were > prepared. There was a comment from Kevin (IIRC) that leaving transactions > unresolved on the foreign server keeps the resources locked on those > servers. That's not a very good situation. And nobody but the initiating > server can resolve those. That functionality is important to make it a > complete 2PC solution. So, please consider it to be included in your first > set of patches. > > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
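The two-phase commit sequence discussed above (PREPARE TRANSACTION on every participant, then COMMIT PREPARED, with a resolver retrying anything left behind) can be sketched as a stand-alone simulation. This is illustrative only — the struct and function names are hypothetical, not postgres_fdw's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-server handle; in postgres_fdw this would be a
 * connection plus the prepared transaction identifier (GID). */
typedef struct ForeignServer
{
    bool prepared;      /* PREPARE TRANSACTION succeeded */
    bool committed;     /* COMMIT PREPARED succeeded */
    bool fail_prepare;  /* simulate a failure during phase one */
} ForeignServer;

/* Phase one: issue PREPARE TRANSACTION on every participant.
 * If any server fails to prepare, roll back the ones already prepared. */
static bool
prepare_all(ForeignServer *servers, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (servers[i].fail_prepare)
        {
            /* ROLLBACK PREPARED on everything prepared so far */
            for (size_t j = 0; j < i; j++)
                servers[j].prepared = false;
            return false;
        }
        servers[i].prepared = true;
    }
    return true;
}

/* Phase two: issue COMMIT PREPARED on every participant.  A server that
 * cannot be reached at this point must be retried later by a resolver
 * process such as the pg_fdw_xact_resolver mentioned in the thread. */
static void
commit_all(ForeignServer *servers, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (servers[i].prepared)
            servers[i].committed = true;
}
```

The key property Kevin's comment alludes to: once phase one succeeds everywhere, only the initiating server knows the outcome, so unresolved prepared transactions (and their locks) persist on the remote side until a resolver finishes phase two.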
Re: [HACKERS] Push down more full joins in postgres_fdw
On Mon, Sep 26, 2016 at 1:05 PM, Etsuro Fujita wrote: > On 2016/09/15 15:29, Ashutosh Bapat wrote: >> >> On Wed, Sep 14, 2016 at 8:52 PM, Robert Haas >> wrote: > > >>> I'm not sure why it wouldn't work >>> to just use the lowest RTI involved in the join, though; the others >>> won't appear anywhere else at that query level. > > >> So +1 for >> using the smallest RTI to indicate a subquery. > > > +1 for the general idea. > > ISTM that the use of the same RTI for subqueries in multi-levels in a remote > SQL makes the SQL a bit difficult to read. How about using the position of > the join rel in join_rel_list, (more precisely, the position plus > list_length(root->parse->rtable)), instead? > We switch to a hash table to maintain the join RelOptInfos when the number of joins grows larger, where the position won't make much sense. We might differentiate between a base relation alias and a subquery alias by using different prefixes like "r" and "s" respectively. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
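Picking the smallest RTI involved in a join, as proposed above, amounts to finding the lowest member of the join's relids set. A minimal stand-alone sketch, using a plain bit mask in place of PostgreSQL's Bitmapset and a hypothetical function name (real code would use bms_next_member(relids, -1)):

```c
#include <assert.h>

/* Return the lowest member (1-based RTI) of a relids set represented
 * as a bit mask where bit k corresponds to RTI k.  Returns 0 for an
 * empty set. */
static int
lowest_rti(unsigned int relids)
{
    for (int rti = 1; rti < 32; rti++)
        if (relids & (1u << rti))
            return rti;
    return 0;
}
```

Because the other RTIs of the join do not appear anywhere else at that query level (Robert's point), the lowest one is unambiguous as a subquery alias number.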
Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server
Thanks, Jeevan, for working on the comments. > > 3. classifyConditions() assumes list expressions of type RestrictInfo. But > HAVING clause expressions (and JOIN conditions) are plain expressions. Do > you mean we should modify the classifyConditions()? If yes, then I think it > should be done as a separate (cleanup) patch. Ok. Yes, we should also handle bare conditions in classifyConditions(). I am fine doing it as a separate patch. > > 6. Per my understanding, I think checking for just aggregate function is > enough as we are interested in whole aggregate result. Meanwhile I will > check whether we need to find and check shippability of transition, > combination and finalization functions or not. Ok. I agree with this analysis. > 13. Fixed. However, instead of adding a new function, I made those changes inline. > Adding support for deparsing List expressions as a comma-separated list can > be > considered a cleanup patch, as it will touch other code areas not specific to > aggregate push down. Ok. > > 18. I think re-factoring estimate_path_cost_size() should be done separately > as a cleanup patch too. Ok. > > 29. I have tried passing input rel's relids to fetch_upper_rel() call in > create_grouping_paths(). It solved this patch's issue, but ended up with > a few regression failures, which are mostly plan changes. I am not sure whether > we get a good plan now or not as I have not analyzed them. > So for this patch, I am setting relids in add_foreign_grouping_path() > itself. > However, as suggested, I used bms_copy(). I still kept the FIXME over there as > I think it should have been done by the core itself. I don't think add_foreign_grouping_path() is the right place to change a structure managed by the core, especially when we are half-way through adding paths. An FDW should not meddle with an upper RelOptInfo.
Since create_foreignscan_plan() picks those up from RelOptInfo and both of those are part of core, we need a place in core to set the RelOptInfo::relids for an upper relation OR we have to stop using fs_relids for upper relation foreign scans. Here are some more comments on the patch, mostly focused on the tests. 1. If we dig down each usage of deparse_expr_cxt::scanrel, it boils down to checking whether a given Var comes from the given scan relation (deparseVar()) or checking whether a given Var needs relation qualification while deparsing (again deparseVar()). I think both of those purposes can be served by looking at scanrel::relids, multiple relids implying relation qualification. So, instead of having a separate member scan_rel, we should instead fetch the relids from deparse_expr_cxt::foreignrel. I am assuming that the proposed change to set relids in upper relations is acceptable. If it's not, we should pass scanrel->relids through the context instead of scanrel itself. 2. SortGroupClause is a parser node, so we can name appendSortGroupClause() as deparseSortGroupClause(), to be consistent with the naming convention. If we do that change, maybe it's better to pass SortGroupClause as is and handle other members of that structure as well. 3. The following block may be reworded +/* + * aggorder too present into args so no need to check its + * shippability explicitly. However, if any expression has + * USING clause with sort operator, we need to make sure the + * shippability of that operator. + */ as "For aggorder elements, check whether the sort operator, if specified, is shippable or not." and mention aggorder expression along with aggdirectargs in the comment before this one. 4. The following line is correct as long as there is only one upper relation. +context.scanrel = (rel->reloptkind == RELOPT_UPPER_REL) ? fpinfo->outerrel : rel; But as we push down more and more upper operations, there will be a chain of upper relations on top of the scan relation.
So, it's better to assert that the scanrel is a join or a base relation to be future proof. 5. In deparseConst(), showtype is compared with hardcoded values. The callers of this function pass hardcoded values. Instead, we should define an enum and use it. I understand that this logic has been borrowed from get_const_expr(), which also uses hardcoded values. Any reason why not to adopt a better style here? In fact, it only uses two states for showtype, 0 and "> 0". Why don't we use a boolean then OR why isn't the third state in get_const_expr() applicable here? 6. "will" or "should" is missing from the following sentence. "Plain var nodes either be same as some GROUP BY expression or part of some GROUP BY expression. 7. The changes in block else { /* * If we have sortgroupref set, then it means that we have an * ORDER BY entry pointing to this expression. Since we are * not pushing ORDER BY with GROUP BY, clear it. */ if (sgref) gr
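Point 1 above — that both uses of deparse_expr_cxt::scanrel reduce to inspecting its relids — can be illustrated with a tiny stand-alone sketch. The bit mask stands in for PostgreSQL's Bitmapset, and the function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Count members of a relids set represented as a bit mask. */
static int
relids_count(unsigned int relids)
{
    int n = 0;
    while (relids)
    {
        relids &= relids - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* A deparsed Var needs a relation qualifier (e.g. "r1.a" rather than
 * just "a") only when the scan covers more than one base relation,
 * i.e. when the relids set has more than one member. */
static bool
var_needs_qualification(unsigned int scanrel_relids)
{
    return relids_count(scanrel_relids) > 1;
}
```

This is why passing just the relids through the context (instead of the whole scan relation) would be sufficient, as the review suggests.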
Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol
On 09/26/2016 09:02 AM, Michael Paquier wrote: On Mon, Sep 26, 2016 at 2:15 AM, David Steele wrote: However, it doesn't look like they can be used in conjunction since the pg_hba.conf entry must specify either md5 or scram (though the database can easily contain a mixture). This would probably make a migration very unpleasant. Yep, it uses a given auth-method once user and database match. This is partially related to the problem of supporting multiple password verifiers per user, which was submitted last CF but got rejected because of a lack of interest, and removed to simplify this patch. You need as well to think about other things like password and protocol aging. But well, it is a problem that we don't have to tackle with this patch... Is there any chance of a mixed mode that will allow new passwords to be set as scram while still honoring the old md5 passwords? Or does that cause too many complications with the protocol? Hm. That looks complicated to me. This sounds to me like retry logic for multiple authentication methods, and a different feature. What you'd be looking for here is a connection parameter to specify a list of protocols and try them all, no? It would be possible to have an "md5-or-scram" authentication method in pg_hba.conf, such that the server would look up the pg_authid row of the user when it receives the startup message, and send an MD5 or SCRAM challenge depending on which one the user's password is encrypted with. It has one drawback though: it allows an unauthenticated user to probe if there is a role with a given name in the system, because if a user doesn't exist, we'd have to still send an MD5 or SCRAM challenge, or a "user does not exist" error without a challenge. If we send a SCRAM challenge for a non-existent user, and the attacker knows that most users still have an MD5 password, that reveals that the username most likely doesn't exist. Hmm.
The server could send a SCRAM challenge first, and if the client gives an incorrect response, or the username doesn't exist, or the user's password is actually MD5-encrypted, the server could then send an MD5 challenge. It would add one round-trip to the authentication of MD5 passwords, but that seems acceptable. We can do this as a follow-up patch though. Let's try to keep this patch series small. - Heikki
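Heikki's SCRAM-first-then-MD5 fallback can be sketched as a stand-alone decision function. This is an illustrative simulation of the control flow only — the enum values and function are hypothetical, not PostgreSQL's actual authentication code:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { PW_MD5, PW_SCRAM } VerifierType;
typedef enum { AUTH_OK, AUTH_FAILED } AuthResult;

/* Simulate the "md5-or-scram" flow: always send a SCRAM challenge
 * first, so an unauthenticated attacker cannot tell from the challenge
 * type whether the role exists or which verifier it has.  Fall back to
 * an MD5 challenge (one extra round trip) when the SCRAM exchange
 * fails and the stored verifier is MD5. */
static AuthResult
authenticate(bool user_exists, VerifierType stored,
             bool scram_response_ok, bool md5_response_ok)
{
    /* Phase 1: SCRAM challenge, sent unconditionally. */
    if (user_exists && stored == PW_SCRAM && scram_response_ok)
        return AUTH_OK;

    /* Phase 2: MD5 challenge, also sent unconditionally on failure,
     * so a non-existent user is indistinguishable from a bad password. */
    if (user_exists && stored == PW_MD5 && md5_response_ok)
        return AUTH_OK;

    return AUTH_FAILED;
}
```

The point of sending both challenges unconditionally is that the failure path looks identical for "wrong password" and "no such role", closing the probing channel described earlier in the thread.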
Re: [HACKERS] pgbench - allow to store select results into variables
On 2016/09/26 16:12, Amit Langote wrote: > I am marking the pgbench-into-5.patch [1] as "Ready for Committer" as I > have no further comments at the moment. Wait... Heikki's latest commit now requires this patch to be rebased. commit 12788ae49e1933f463bc59a6efe46c4a01701b76 Author: Heikki Linnakangas Date: Mon Sep 26 10:56:02 2016 +0300 Refactor script execution state machine in pgbench. So, will change the status to "Waiting on Author". Thanks, Amit
Re: [HACKERS] WAL logging problem in 9.4.3?
Hello, I return to this before my other things :) Though I haven't played with the patch yet... At Fri, 29 Jul 2016 16:54:42 +0900, Michael Paquier wrote in > > Well, not that soon at the end, but I am back on that... I have not > > completely reviewed all the code yet, and the case of index relation > > referring to a relation optimized with truncate is still broken, but > > for now here is a rebased patch if people are interested. I am going > > to get as well a TAP test out of my pocket to ease testing. > > The patch I sent yesterday was based on an incorrect version. Attached > is a slightly-modified version of the last one I found here > (https://www.postgresql.org/message-id/56b342f5.1050...@iki.fi), which > is rebased on HEAD at ed0b228. I have also converted the test case > script of upthread into a TAP test in src/test/recovery that covers 3 > cases and I included that in the patch: > 1) CREATE + INSERT + COPY => crash > 2) CREATE + trigger + COPY => crash > 3) CREATE + TRUNCATE + COPY => incorrect number of rows. > The first two tests make the system crash, the third one reports an > incorrect number of rows. At first glance, managing sync_above and truncate_to is workable for these cases, but seems too complicated for the problem being resolved. This provides smgr with a capability to manage pending page syncs. But the postpone-page-syncs-or-not issue rather seems to be a matter of the users of that, who are responsible for WAL issuing. Anyway heap_register_sync doesn't use any secret of smgr. So I think this approach binds smgr with Relation too tightly. By this patch, many RelationNeedsWALs, which just accesses a local struct, are replaced with HeapNeedsWAL, which eventually accesses a hash added by this patch. Especially in log_heap_update, it is called for every update of a single tuple (on a relation that needs WAL).
Though I don't know how it actually impacts the performance, it seems to me that we can live with truncated_to and sync_above in RelationData and BufferNeedsWAL(rel, buf) instead of HeapNeedsWAL(rel, buf). Anyway, at most one entry per relation seems to exist in the hash at any time. What do you think? regards, -- Kyotaro Horiguchi NTT Open Source Software Center
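Horiguchi's suggestion — keep sync_above in RelationData and test each buffer against it instead of consulting a hash — might look like the following minimal stand-alone sketch. The stub types and the function are hypothetical illustrations of the idea, not the patch's actual code:

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Stub for the fields the proposal would add to RelationData. */
typedef struct RelationData
{
    bool rel_needs_wal;       /* stand-in for RelationNeedsWAL() */
    BlockNumber sync_above;   /* blocks >= this are fsync'd at commit */
} RelationData;

/* Hypothetical BufferNeedsWAL(rel, blkno): WAL can be skipped for
 * blocks at or above sync_above, because that part of the file will be
 * synced to disk directly when the transaction commits, so crash
 * recovery never needs to replay it. */
static bool
buffer_needs_wal(const RelationData *rel, BlockNumber blkno)
{
    if (!rel->rel_needs_wal)
        return false;
    if (rel->sync_above == InvalidBlockNumber)
        return true;                  /* no pending sync registered */
    return blkno < rel->sync_above;   /* below the synced range: log it */
}
```

The attraction of this shape is exactly what the email argues: it is a cheap field test on a struct already in hand, so hot paths like log_heap_update avoid a hash lookup per tuple.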
Re: [HACKERS] pgbench - minor fix for meta command only scripts
On 09/24/2016 12:45 PM, Fabien COELHO wrote: Although I cannot be absolutely sure that the refactoring does not introduce any new bug, I'm convinced that it will be much easier to find them :-) :-) Attached are some small changes to your version: I have added the sleep_until fix. I have fixed a bug introduced in the patch by changing && to || in the (min_sec > 0 && maxsock != -1) condition which was inducing errors with multi-threads & clients... I have factored out several error messages in "commandFailed", in place of the "metaCommandFailed", and added the script number as well in the error messages. All messages are now specific to the failed command. I have added two states to the machine: - CSTATE_CHOOSE_SCRIPT which simplifies threadRun, there is now one call to chooseScript instead of two before. - CSTATE_END_COMMAND which manages is_latencies and proceeding to the next command, thus merging the three instances of updating the stats that were in the first version. The latter state means that processing query results is included in the per-statement latency, which is an improvement because before I was getting some transaction latency significantly larger than the apparent sum of the per-statement latencies, which did not make much sense... Ok. I agree that makes more sense. I have added & updated a few comments. Thanks! Committed. There are some places where the break could be a pass through instead, not sure how desirable it is, I'm fine with break. I left them as "break". Pass-throughs are error-prone, and make it more difficult to read, IMHO. The compiler will optimize it into a pass-through anyway, if possible and worthwhile, so there should be no performance difference. - Heikki
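The shape of the refactored state machine — explicit transitions with `break` rather than fall-throughs, and stats updated in a single END_COMMAND state — can be sketched like this. The state names mirror the ones mentioned above; everything else is an illustrative simplification, not pgbench's real code:

```c
#include <assert.h>

typedef enum
{
    CSTATE_CHOOSE_SCRIPT,   /* pick the next script to run */
    CSTATE_START_COMMAND,   /* issue a command, wait for result */
    CSTATE_END_COMMAND,     /* single place for per-statement stats */
    CSTATE_FINISHED
} ClientState;

/* Compute the next state.  Each case returns explicitly (the moral
 * equivalent of "break"), so a reader never has to reason about which
 * cases silently fall into the next one. */
static ClientState
advance(ClientState st, int command, int ncommands)
{
    switch (st)
    {
        case CSTATE_CHOOSE_SCRIPT:
            return CSTATE_START_COMMAND;
        case CSTATE_START_COMMAND:
            /* result processing happens before this transition, so it
             * is counted inside the per-statement latency */
            return CSTATE_END_COMMAND;
        case CSTATE_END_COMMAND:
            /* explicit transition instead of a fall-through */
            return (command + 1 < ncommands) ? CSTATE_START_COMMAND
                                             : CSTATE_FINISHED;
        default:
            return CSTATE_FINISHED;
    }
}
```

As Heikki notes, the compiler collapses trivial transitions anyway, so the explicit form costs nothing at runtime while keeping every transition visible.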
Re: [HACKERS] Declarative partitioning - another take
Hi Ashutosh, On 2016/09/22 14:42, Ashutosh Bapat wrote: > Hi Amit, > The following sequence of DDLs gets an error > -- > -- multi-leveled partitions > -- > CREATE TABLE prt1_l (a int, b int, c varchar) PARTITION BY RANGE(a); > CREATE TABLE prt1_l_p1 PARTITION OF prt1_l FOR VALUES START (0) END > (250) PARTITION BY RANGE (b); > CREATE TABLE prt1_l_p1_p1 PARTITION OF prt1_l_p1 FOR VALUES START (0) END > (100); > CREATE TABLE prt1_l_p1_p2 PARTITION OF prt1_l_p1 FOR VALUES START > (100) END (250); > CREATE TABLE prt1_l_p2 PARTITION OF prt1_l FOR VALUES START (250) END > (500) PARTITION BY RANGE (c); > CREATE TABLE prt1_l_p2_p1 PARTITION OF prt1_l_p2 FOR VALUES START > ('0250') END ('0400'); > CREATE TABLE prt1_l_p2_p2 PARTITION OF prt1_l_p2 FOR VALUES START > ('0400') END ('0500'); > CREATE TABLE prt1_l_p3 PARTITION OF prt1_l FOR VALUES START (500) END > (600) PARTITION BY RANGE ((b + a)); > ERROR: cannot use column or expression from ancestor partition key > > The last statement is trying to create subpartitions by range (b + a), > which contains a partition key from the ancestor partition key but is not > exactly the same as it. In fact it contains some extra columns other > than the ancestor partition key columns. Why do we want to prohibit > such cases? Per discussion [1], I am going to remove this ill-considered restriction. Thanks, Amit [1] https://www.postgresql.org/message-id/CA%2BHiwqEXAU_m%2BV%3Db-VGmsDNjoqc-Z_9KQdyPuOGbiQGzNObmVg%40mail.gmail.com
Re: [HACKERS] Odd system-column handling in postgres_fdw join pushdown patch
On 2016/09/21 0:40, Robert Haas wrote: On Fri, Jul 1, 2016 at 3:10 AM, Etsuro Fujita wrote: On 2016/04/14 4:57, Robert Haas wrote: 1. For a regular FDW scan, zero the xmin, xmax, and cid of the tuple before returning it from postgres_fdw, so that we don't expose the datum-tuple fields. I can't see any reason this isn't safe, but I might be missing something. I noticed odd behavior of this in EvalPlanQual. Consider: -- session 1 postgres=# create extension postgres_fdw; CREATE EXTENSION postgres=# create server fs foreign data wrapper postgres_fdw options (dbname 'postgres'); CREATE SERVER postgres=# create user mapping for public server fs; CREATE USER MAPPING postgres=# create table t1 (a int, b int); CREATE TABLE postgres=# create table t2 (a int, b int); CREATE TABLE postgres=# insert into t1 values (1, 1); INSERT 0 1 postgres=# insert into t2 values (1, 1); INSERT 0 1 postgres=# create foreign table ft1 (a int, b int) server fs options (table_name 't1'); CREATE FOREIGN TABLE postgres=# select xmin, xmax, cmin, * from ft1; xmin | xmax | cmin | a | b --+--+--+---+--- 0 |0 |0 | 1 | 1 (1 row) -- session 2 postgres=# begin; BEGIN postgres=# update t2 set a = a; UPDATE 1 -- session 1 postgres=# select ft1.xmin, ft1.xmax, ft1.cmin, ft1.* from ft1, t2 for update; -- session 2 postgres=# commit; COMMIT -- session 1 (will show the following) xmin |xmax| cmin | a | b --++---+---+--- 128 | 4294967295 | 16398 | 1 | 1 (1 row) The values of xmin, xmax, and cmin are not 0! The reason for that is that we don't zero these values in a test tuple stored by EvalPlanQualFetchRowMarks for EvalPlanQual re-evaluations. That cleanup applies to the file_fdw foreign table case as well, so I think xmin, xmax, and cid in tuples from such tables should be set to 0, too. And ISTM that it's better that the core (ie, ForeignNext) supports doing that, rather than each FDW does that work. That would also minimize the overhead because ForeignNext does that if needed. Please find attached a patch. 
If you want this to be considered, you'll need to rebase it and submit it to the next CommitFest. Will do. Best regards, Etsuro Fujita
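The fix proposed above — have ForeignNext zero xmin, xmax, and cid centrally for every tuple an FDW returns, including the copies EvalPlanQualFetchRowMarks stores for recheck — could be sketched as follows. The struct is a stub; PostgreSQL's real code manipulates HeapTupleHeaderData through macros like HeapTupleHeaderSetXmin, so this is only an illustration of the intent:

```c
#include <assert.h>

typedef unsigned int TransactionId;
typedef unsigned int CommandId;

/* Stub for the system-column fields of a heap tuple header. */
typedef struct TupleHeader
{
    TransactionId xmin;
    TransactionId xmax;
    CommandId     cmin_cmax;  /* cmin/cmax share storage in PostgreSQL */
} TupleHeader;

/* What a centralized fix in ForeignNext would do for every tuple
 * coming from an FDW, so the datum-tuple fields are never exposed --
 * including the test tuples used for EvalPlanQual re-evaluation, which
 * the session transcript above shows were being missed. */
static void
zero_system_columns(TupleHeader *t)
{
    t->xmin = 0;
    t->xmax = 0;
    t->cmin_cmax = 0;
}
```

Doing this in core rather than in each FDW also covers cases like file_fdw automatically, as the email argues.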
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/09/15 15:29, Ashutosh Bapat wrote: On Wed, Sep 14, 2016 at 8:52 PM, Robert Haas wrote: I'm not sure why it wouldn't work to just use the lowest RTI involved in the join, though; the others won't appear anywhere else at that query level. So +1 for using the smallest RTI to indicate a subquery. +1 for the general idea. ISTM that the use of the same RTI for subqueries in multi-levels in a remote SQL makes the SQL a bit difficult to read. How about using the position of the join rel in join_rel_list, (more precisely, the position plus list_length(root->parse->rtable)), instead? Best regards, Etsuro Fujita
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/09/13 14:17, Ashutosh Bapat wrote: But another concern about the approach of generating a subselect alias for a path, if needed, at the path creation time would be that the path might be rejected by add_path, which would result in totally wasting cycles for generating the subselect alias for the path. A path may get rejected but the relation is not rejected. The alias applies to a relation and its targetlist, which will remain the same for all paths created for that relation, especially when it's a base relation or join. So, AFAIU that work never gets wasted. I think there is no guarantee that add_path won't reject foreign join paths. The possibility would be very low, though. However minimal the overhead might be, we will deparse the query every time we create a path involving that kind of relation, i.e. for different pathkeys, different parameterization and different joins in which that relation participates. This number can be significant. We want to avoid this overhead if we can. Exactly. As I said above, I think the overhead would be negligible compared to the overhead in explaining each remote query for costing or the overhead in sending the final remote query for execution, though. It won't remain minimal as the number of paths created increases, increasing the number of times a query is deparsed. We deparse the query every time we cost a path for a relation with use_remote_estimates true. As we try to push down more and more stuff, we will create more paths and deparse the query more times. Also, that makes the interface better. Right now, in your patch, you have changed the order of deparsing in the existing code, so that the aliases are registered while deparsing the FROM clause and before any Var nodes are deparsed. If we create aliases at the time of path creation, only once in GetForeignJoinPaths or GetForeignPaths as appropriate, that would require less code churn and would save some CPU cycles as well. Agreed. Will fix.
Best regards, Etsuro Fujita
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On 2016/09/07 10:54, vinayak wrote: Thanks for the clarification. I had added pg_fdw_xact_resolver() to resolve any transactions which cannot be resolved immediately after they were prepared. There was a comment from Kevin (IIRC) that leaving transactions unresolved on the foreign server keeps the resources locked on those servers. That's not a very good situation. And nobody but the initiating server can resolve those. That functionality is important to make it a complete 2PC solution. So, please consider it to be included in your first set of patches. The attached patch includes pg_fdw_xact_resolver and the documentation. Regards, Vinayak Pokale NTT Open Source Software Center 0001-Support-transaction-with-foreign-servers.patch
Re: [HACKERS] pgbench - allow to store select results into variables
Hi Fabien, I am marking the pgbench-into-5.patch [1] as "Ready for Committer" as I have no further comments at the moment. Thanks, Amit [1] https://www.postgresql.org/message-id/alpine.DEB.2.20.1609130730380.10870%40lancre