Re: [HACKERS] [PATCH] Generic type subscripting

2017-03-30 Thread Arthur Zakirov
2017-03-31 5:32 GMT+03:00 Dmitry Dolgov <9erthali...@gmail.com>:
> On 30 March 2017 at 19:36, Arthur Zakirov  wrote:
>>
>> The last point is about the tutorial. As Tom pointed out, it is not
>> useful when the tutorial doesn't execute. That happens because there
>> is no "custom" type in subscripting.sql.
>
> I'm confused. Maybe I'm missing something, but there is a "custom"
> type in this file:

Sorry for the confusion. I should have been more careful. I mixed up a
NOTICE message with an error message, and I hadn't noticed the CREATE
TYPE command.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] increasing the default WAL segment size

2017-03-30 Thread Beena Emerson
Hello,

On Fri, Mar 31, 2017 at 11:20 AM, Kuntal Ghosh 
wrote:

> On Fri, Mar 31, 2017 at 10:40 AM, Beena Emerson 
> wrote:
> > On 30 Mar 2017 15:10, "Kuntal Ghosh"  wrote:
>
> > I do not think a generalised function is a good idea. Besides, I feel the
> > respective approaches are best kept in the modules that use them, also
> > because the internal code is not easily accessible by the utilities.
> >
> Ah, I wonder what the reason might be. Anyway, I'll leave that decision
> to the committer. I'm moving the status to Ready for Committer.
>
>
Thank you.


-- 

Beena Emerson

EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] increasing the default WAL segment size

2017-03-30 Thread Kuntal Ghosh
On Fri, Mar 31, 2017 at 10:40 AM, Beena Emerson  wrote:
> On 30 Mar 2017 15:10, "Kuntal Ghosh"  wrote:

>> 03-modify-tools.patch - Makes XLogSegSize into a variable, currently set
>> as
>> XLOG_SEG_SIZE and modifies the tools to fetch the size instead of using
>> inbuilt value.
> Several functions are declared and defined in different tools to fetch
> the WAL segment size.
> In pg_standby.c,
> RetrieveXLogSegSize() - /* Set XLogSegSize from the WAL file header */
>
> In pg_basebackup/streamutil.c,
>  RetrieveXLogSegSize(PGconn *conn) - /* set XLogSegSize using
> SHOW wal_segment_size */
>
> In pg_waldump.c,
> ReadXLogFromDir(char *archive_loc)
> RetrieveXLogSegSize(char *archive_path) /* Scan through the archive
> location to set XLogSegsize from the first WAL file */
>
> IMHO, it's better to define a single function in xlog.c that, based on
> the given strategy, retrieves the XLogSegSize on behalf of the
> different modules. I suggested the same in my first review and I'll
> still vote for it. For example, in xlog.c, you could define something
> like:
> bool RetrieveXLogSegSize(RetrieveStrategy rs, void *ptr)
>
> Now, based on the RetrieveStrategy (say Conn, File, or Dir), you can
> cast the void pointer to the appropriate type. So, when a new tool
> needs to retrieve XLogSegSize, it can just call this function instead
> of defining a new RetrieveXLogSegSize function.
>
> It's just a suggestion from my side. Is there anything I'm missing
> that would prevent the aforesaid approach from working?
> Apart from that, I've nothing to add here.
>
>
>
> I do not think a generalised function is a good idea. Besides, I feel the
> respective approaches are best kept in the modules that use them, also
> because the internal code is not easily accessible by the utilities.
>
Ah, I wonder what the reason might be. Anyway, I'll leave that decision
to the committer. I'm moving the status to Ready for Committer.

I've only tested the patch on my 64-bit Linux system. It needs some
testing in other environments.
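
To illustrate what I have in mind, here is a rough sketch of the
dispatch function (the enum values and per-strategy helpers are
made-up names for illustration, not actual code):

typedef enum RetrieveStrategy
{
	RETRIEVE_FROM_CONN,		/* SHOW wal_segment_size over a connection */
	RETRIEVE_FROM_FILE,		/* read XLogSegSize from a WAL file header */
	RETRIEVE_FROM_DIR		/* scan a directory for the first WAL file */
} RetrieveStrategy;

bool
RetrieveXLogSegSize(RetrieveStrategy rs, void *ptr)
{
	switch (rs)
	{
		case RETRIEVE_FROM_CONN:
			return RetrieveXLogSegSizeFromConn((PGconn *) ptr);
		case RETRIEVE_FROM_FILE:
			return RetrieveXLogSegSizeFromFile((const char *) ptr);
		case RETRIEVE_FROM_DIR:
			return RetrieveXLogSegSizeFromDir((const char *) ptr);
	}
	return false;				/* unreachable */
}

Each tool would then pass its own context (PGconn *, file path, or
directory path) through the void pointer.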


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-30 Thread Michael Paquier
On Mon, Mar 13, 2017 at 11:11 AM, Andreas Karlsson  wrote:
> On 03/02/2017 03:10 AM, Michael Paquier wrote:
>> There is a lot of mumbo-jumbo in the patch to create the exact same
>> index definition as the original one being reindexed, and that's a
>> huge maintenance burden for the future. You can blame me for that in
>> the current patch. I am wondering if it would not just be better to
>> generate a CREATE INDEX query string and then use the SPI to create
>> the index, and also do the following extensions at SQL level:
>> - Add a sort of WITH NO DATA clause where the index is created, so the
>> index is created empty, and is marked invalid and not ready.
>> - Extend pg_get_indexdef_string() with an optional parameter to
>> enforce the index name to something else, most likely it should be
>> extended with the WITH NO DATA/INVALID clause, which should just be a
>> storage parameter by the way.
>> With something like that, all the REINDEX CONCURRENTLY code path needs
>> to be careful about is choosing an index name that avoids any
>> conflicts.
>
>
> Hm, I am not sure how much that would help, since a lot of the
> mumbo-jumbo is by necessity in the step where we move the constraints
> over from the old index to the new one.

Well, the idea is really to get rid of that, as there are already
facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
TABLE when rewriting a relation. It is not really attractive to have a
third method in the backend code to do the same kind of thing, one
that is even harder to maintain than the other two.
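
To be concrete, the shape I have in mind is roughly the following
sketch (the query text, index name and error handling are purely
illustrative):

	/* build a CREATE INDEX statement and run it through the SPI */
	StringInfoData querybuf;

	initStringInfo(&querybuf);
	appendStringInfo(&querybuf, "CREATE INDEX %s ON %s (%s)",
					 "tab_col_idx_new", "tab", "col");

	SPI_connect();
	if (SPI_execute(querybuf.data, false, 0) != SPI_OK_UTILITY)
		elog(ERROR, "could not create index: %s", querybuf.data);
	SPI_finish();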
-- 
Michael




Re: [HACKERS] Logical decoding on standby

2017-03-30 Thread Craig Ringer
On 31 March 2017 at 12:49, Craig Ringer  wrote:
> On 31 March 2017 at 01:16, Andres Freund  wrote:

>> The comment and code don't quite square to me - it's far from obvious
>> that LogStandbySnapshot does something like that. I'd even say it's a
>> bad idea to have it do that.
>
> So you prefer the prior approach with separate xl_catalog_xmin advance 
> records?

Alternatively, we can record the creation timeline on slots, so we know
if there's been a promotion. It wouldn't make sense to do this if that
were the only use of timelines on slots. But I'm aware you'd rather
keep slots timeline-agnostic, and I tend to agree.

Anyway, per your advice, I will split out the validation step.

(I'd like replication origins to be able to track time alongside lsn,
and to pass the timeline of each LSN to output plugin callbacks so we
can detect if a physical promotion results in us backtracking down a
fork in history, but that doesn't affect slots.)

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] New CORRESPONDING clause design

2017-03-30 Thread Pavel Stehule
Hi

2017-03-30 21:55 GMT+02:00 Pavel Stehule :

>
>
> 2017-03-30 21:43 GMT+02:00 Tom Lane :
>
>> Pavel Stehule  writes:
>> > Is following use case defined in standard?
>>
>> > postgres=# SELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3
>> >    UNION ALL CORRESPONDING BY(a,b) SELECT 4 AS b, 0 AS x4, 3 AS a,
>> >      0 AS x6, -1 AS x6
>> >    UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS a, -100 AS aa;
>> > ┌───┐
>> > │ a │
>> > ╞═══╡
>> > │ 1 │
>> > │ 3 │
>> > │ 6 │
>> > └───┘
>> > (3 rows)
>>
>> > It depends on the order of evaluation:
>>
>> > if we do (T1 U T2) U T3 ---> then the result is correct,
>> > but if we do T1 U (T2 U T3) ---> then it should fail
>>
>> UNION ALL should associate left-to-right, just like most other binary
>> operators, so this looks fine to me.  Did you check that you get an
>> error if you put in parens to force the other order?
>>
>
> yes - it fails
>
>  postgres=# SELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3
> UNION ALL CORRESPONDING BY(a,b) (SELECT 4 AS b, 0 AS x4, 3 AS a, 0 AS x6,
> -1 AS x6 UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS a, -100 AS aa);
> ERROR:  column name "b" can not be used in CORRESPONDING BY list
> LINE 1: ...b, 0 AS x3, -1 AS x3 UNION ALL CORRESPONDING BY(a,b) (SELECT...
>  ^
> HINT:  UNION queries with a CORRESPONDING BY clause must contain column
> names from both tables.
> Time: 1,135 ms
>
>
I corrected my earlier, mistaken comment.

I have no other objections; I'll mark this patch as Ready for Committer.

Regards

Pavel


> Regards
>
> Pavel
>
>>
>> regards, tom lane
>>
>
>
diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 30792f45f1..2d60718ff1 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -1601,6 +1601,9 @@ SELECT DISTINCT ON (expression , EXCEPT
   
   
+   CORRESPONDING
+  
+  
set union
   
   
@@ -1617,9 +1620,9 @@ SELECT DISTINCT ON (expression , 
-query1 UNION ALL query2
-query1 INTERSECT ALL query2
-query1 EXCEPT ALL query2
+query1 UNION ALL CORRESPONDING BY query2
+query1 INTERSECT ALL CORRESPONDING BY query2
+query1 EXCEPT ALL CORRESPONDING BY query2
 
query1 and
query2 are queries that can use any of
@@ -1659,14 +1662,31 @@ SELECT DISTINCT ON (expression , 
 
   
-   In order to calculate the union, intersection, or difference of two
-   queries, the two queries must be union compatible,
-   which means that they return the same number of columns and
-   the corresponding columns have compatible data types, as
-   described in .
+   EXCEPT returns all rows that are in the result of
+   query1 but not in the result of
+   query2.  (This is sometimes called the
+   difference between two queries.)  Again, duplicates
+   are eliminated unless EXCEPT ALL is used.
   
- 
 
+  
+   CORRESPONDING returns all columns that are in both 
+   query1 and query2 with the same name.
+  
+
+  
+   CORRESPONDING BY returns all columns in the column list 
+   that are also in both query1 and 
+   query2 with the same name. The names in column list
+   must be unique.
+  
+
+  
+   The names of columns in result when CORRESPONDING or
+   CORRESPONDING BY clause is used must be unique in
+   query1 and query2.
+  
+ 
 
  
   Sorting Rows
diff --git a/doc/src/sgml/sql.sgml b/doc/src/sgml/sql.sgml
index 57396d7c24..f98c22e696 100644
--- a/doc/src/sgml/sql.sgml
+++ b/doc/src/sgml/sql.sgml
@@ -859,7 +859,7 @@ SELECT [ ALL | DISTINCT [ ON ( expressioncondition ]
 [ GROUP BY expression [, ...] ]
 [ HAVING condition [, ...] ]
-[ { UNION | INTERSECT | EXCEPT } [ ALL ] select ]
+[ { UNION | INTERSECT | EXCEPT } [ ALL ] [ CORRESPONDING [ BY ( expression ) ] ] select ]
 [ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
 [ LIMIT { count | ALL } ]
 [ OFFSET start ]
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 1c88d601bd..11e0590eec 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2995,6 +2995,7 @@ _copySelectStmt(const SelectStmt *from)
 	COPY_NODE_FIELD(withClause);
 	COPY_SCALAR_FIELD(op);
 	COPY_SCALAR_FIELD(all);
+	COPY_NODE_FIELD(correspondingClause);
 	COPY_NODE_FIELD(larg);
 	COPY_NODE_FIELD(rarg);
 
@@ -3010,6 +3011,8 @@ _copySetOperationStmt(const SetOperationStmt *from)
 	COPY_SCALAR_FIELD(all);
 	COPY_NODE_FIELD(larg);
 	COPY_NODE_FIELD(rarg);
+	COPY_NODE_FIELD(correspondingColumns);
+	COPY_SCALAR_FIELD(hasCorrespondingBy);
 	COPY_NODE_FIELD(colTypes);
 	COPY_NODE_FIELD(colTypmods);
 	COPY_NODE_FIELD(colCollations);
@@ -4588,6 +4591,8 @@ _copyValue(const Value *from)
  (int) from->type);
 			break;
 	}
+	COPY_LOCATION_FIELD(location);
+
 	return newnode;
 }
 
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 5941b7a2bf..dd6598d85b 100644
--- 

Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-03-30 Thread Jan Michálek
2017-03-30 21:53 GMT+02:00 Pavel Stehule :

>
>
> 2017-03-29 20:11 GMT+02:00 Jan Michálek :
>
>>
>>
>> 2017-03-27 19:41 GMT+02:00 Jan Michálek :
>>
>>>
>>>
>>> 2017-03-23 17:26 GMT+01:00 Pierre Ducroquet :
>>>
 The following review has been posted through the commitfest application:
 make installcheck-world:  tested, passed
 Implements feature:   tested, passed
 Spec compliant:   tested, passed
 Documentation:tested, passed

 Hi

 This is my first review (Magnus said in his presentation at PGDay Paris
 that volunteers should just come and help, so here I am), so please
 notify me of any mistakes I make when using the review tools...

 The feature seems to work as expected, but I don't claim to be a
 markdown and rst expert.
 Some minor issues with the code itself :
 - some indentation issues (documentation and code itself with mix
 between space based and tab based indentation) and a few trailing spaces in
 code

>>>
>>> corrected
>>>
>>>
 - typographic issues in the documentation :
   - "The html, asciidoc, latex, latex-longtable, troff-ms, and markdown
 and rst formats" ==> duplicated and

>>>
>>> corrected
>>>
   - "Sets the output format to one of unaligned, aligned, wrapped,
 html, asciidoc, latex (uses tabular), latex-longtable, rst, markdown, or
 troff-ms." ==> extra comma at the end of the list
 - the comment " dont add line after last row, because line is added
 after every row" is misleading, it should warn that it's only for rst
 - there is a block of commented out code left
 - in the print_aligned_vertical function, there is a mix between
 "cont->opt->format == PRINT_RST" and "format == _rst" and I don't see
 any obvious reason for that

>>> corrected
>>>
 - the documentation doesn't mention (but ok, it's kind of obvious) that
 the linestyle option will not work with rst and markdown


>>> In this patch the indentation issues are corrected (I hope; I had the
>>> correct changes in vimrc). Please look at whether it is OK (I mean the
>>> indents) and at some minor errors. And it should work on current
>>> master (probably).
>>>
>>
>> Added \x option for markdown
>> In markdown, multiline cells now work (newline replaced by <br/>)
>> regression tests passed
>>
>
> \pset format rst
> \x
> select 10
> crashes with a segfault
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x7f77673a866c in vfprintf () from /lib64/libc.so.6
> (gdb) bt
> #0  0x7f77673a866c in vfprintf () from /lib64/libc.so.6
> #1  0x7f77673b1574 in fprintf () from /lib64/libc.so.6
> #2  0x00437bc5 in print_aligned_vertical (cont=0x7fffade43da0,
>     fout=<optimized out>, is_pager=<optimized out>) at print.c:1755
> #3  0x0043a70d in printTable (cont=cont@entry=0x7fffade43da0,
>     fout=<optimized out>, fout@entry=0x7f77677255e0 <_IO_2_1_stdout_>,
>     is_pager=<optimized out>, is_pager@entry=0 '\000',
>     flog=flog@entry=0x0) at print.c:3466
> #4  0x0043c37f in printQuery (result=result@entry=0x9c4b60,
>     opt=opt@entry=0x7fffade43f00,
>     fout=0x7f77677255e0 <_IO_2_1_stdout_>, is_pager=is_pager@entry=0 '\000',
>     flog=0x0) at print.c:3551
> #5  0x0040da6d in PrintQueryTuples (results=0x9c4b60) at common.c:808
> #6  PrintQueryResults (results=0x9c4b60) at common.c:1140
> #7  SendQuery (query=0x9c1700 "select 10;") at common.c:1317
> #8  0x0041c3d4 in MainLoop (source=0x7f77677248a0 <_IO_2_1_stdin_>) at mainloop.c:319
> #9  0x00405d5d in main (argc=<optimized out>, argv=<optimized out>) at startup.c:396
>
> Regards
>

On the source from Monday it works (the last commit on master I have
is from 27.3 14:30). Or maybe I didn't generate the diff well, or it's
some git issue.

psql (10devel, server 9.6.1)
Type "help" for help.

jelen=# \x
Expanded display is on.
jelen=# \pset format rst
Output format is rst.
jelen=# select 10;
+--------------+----+
| **RECORD 1**      |
+--------------+----+
| **?column?** | 10 |
+--------------+----+

jelen=#



>
> Pavel
>
>
>>
>>
>> Jan
>>
>>
>>
>>>
>>> Have nice day
>>>
>>> Jan
>>>
>>>
 Thanks !

 The new status of this patch is: Waiting on Author


>>>
>>>
>>>
>>> --
>>> Jelen
>>> Starší čeledín datovýho chlíva
>>>
>>
>>
>>
>> --
>> Jelen
>> Starší čeledín datovýho chlíva
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>>
>


-- 
Jelen
Starší čeledín datovýho chlíva


Re: [HACKERS] [POC] A better way to expand hash indexes.

2017-03-30 Thread Mithun Cy
Thanks, I have tried to address all of the comments.

On Fri, Mar 31, 2017 at 8:10 AM, Robert Haas  wrote:
> On Thu, Mar 30, 2017 at 2:36 PM, Mithun Cy  wrote:
>> Thanks Robert, I have tried to address the comments as given below.
>
> Thanks.
>
> Since this changes the on-disk format of hash indexes in an
> incompatible way, I think it should bump HASH_VERSION.  (Hopefully
> that doesn't preclude REINDEX?)  pg_upgrade should probably also be
> patched to issue a warning if upgrading from a version < 10 to a
> version >= 10 whenever hash indexes are present; I thought we had
> similar cases already, but I don't see them at the moment.  Maybe we
> can get Bruce or someone to give us some advice on exactly what should
> be done here.

As of now, increasing the version asks us to REINDEX (metapage access
verifies whether we are on the right version):
postgres=# set enable_seqscan= off;
SET
postgres=# select * from t1 where i = 10;
ERROR:  index "hash2" has wrong hash version
HINT:  Please REINDEX it.
postgres=# insert into t1 values(10);
ERROR:  index "hash2" has wrong hash version
HINT:  Please REINDEX it.

postgres=# REINDEX INDEX hash2;
REINDEX
postgres=# select * from t1 where i = 10;
 i
----
 10
(1 row)

The last time we changed this version, from 1 to 2
(4adc2f72a4ccd6e55e594aca837f09130a6af62b), I see from the logs that
no upgrade-specific changes were made.

Hi Bruce, can you please advise us what should be done here.

> In a couple of places, you say that a splitpoint group has 4 slots
> rather than 4 phases.

--Fixed

> I think that in _hash_get_totalbuckets(), you should use blah &
> HASH_SPLITPOINT_PHASE_MASK rather than blah %
> HASH_SPLITPOINT_PHASES_PER_GRP for consistency with _hash_spareindex
> and, perhaps, speed.  Similarly, instead of blah /
> HASH_SPLITPOINT_PHASES_PER_GRP, use blah >>
> HASH_SPLITPOINT_PHASE_BITS.

--Fixed
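
(To spell out the equivalence, with the constant definitions assumed
from the patch; these hold for any unsigned x because the number of
phases per group is a power of two:)

	/*
	 * #define HASH_SPLITPOINT_PHASE_BITS      2
	 * #define HASH_SPLITPOINT_PHASES_PER_GRP  (1 << HASH_SPLITPOINT_PHASE_BITS)
	 * #define HASH_SPLITPOINT_PHASE_MASK      (HASH_SPLITPOINT_PHASES_PER_GRP - 1)
	 *
	 * x % HASH_SPLITPOINT_PHASES_PER_GRP == (x & HASH_SPLITPOINT_PHASE_MASK)
	 * x / HASH_SPLITPOINT_PHASES_PER_GRP == (x >> HASH_SPLITPOINT_PHASE_BITS)
	 */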

>
> buckets_toadd is punctuated oddly.  buckets_to_add?  Instead of
> hand-calculating this, how about calculating it as
> _hash_get_totalbuckets(spare_ndx) - _hash_get_totalbuckets(spare_ndx -
> 1)?

I think this should do that, considering new_bucket is nothing but the
1-based max_buckets.
buckets_to_add = _hash_get_totalbuckets(spare_ndx) - new_bucket;

That makes me do away with

+#define HASH_SPLITPOINT_PHASE_TO_SPLITPOINT_GRP(sp_phase) \
+ (((sp_phase) < HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE) ? \
+ (sp_phase) : \
+ ((((sp_phase) - HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE) >> \
+   HASH_SPLITPOINT_PHASE_BITS) + \
+  HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE))

as this is now used in only one place, _hash_get_totalbuckets.
I also think the comment below can be removed now, as we have removed
the code related to multi-phase allocation there.

+ * The number of buckets in the new splitpoint group is equal to the
+ * total number already in existence. But we do not allocate them at
+ * once. Each splitpoint group will have 4 phases, we distribute the
+ * buckets equally among them. So we allocate only one fourth of total
+ * buckets in new splitpoint group at a time to consume one phase after
+ * another.

> +uint32  splitpoint_group = 0;
>
> Don't need the  = 0 here; the next reference to this variable is an
> unconditional initialization.

--Fixed, with new code splitpoint_group is not needed.

>
> + */
> +
> +splitpoint_group = 
> HASH_SPLITPOINT_PHASE_TO_SPLITPOINT_GRP(spare_ndx);
>
> I would delete the blank line.

--Fixed.

>
> - * should start from ((2 ^ i) + metap->hashm_spares[i - 1] + 1).
> + * should start from
> + * (_hash_get_totalbuckets(i) + metap->hashm_spares[i - 1] + 1).
>
> Won't survive pgindent.

--Fixed as pgindent has suggested.

>
> - * The number of buckets in the new splitpoint is equal to the total
> - * number already in existence, i.e. new_bucket.  Currently this maps
> - * one-to-one to blocks required, but someday we may need a more
> - * complicated calculation here.  We treat allocation of buckets as a
> - * separate WAL-logged action.  Even if we fail after this operation,
> - * won't leak bucket pages; rather, the next split will consume this
> - * space. In any case, even without failure we don't use all the 
> space
> - * in one split operation.
> + * The number of buckets in the new splitpoint group is equal to the
> + * total number already in existence. But we do not allocate them at
> + * once. Each splitpoint group will have 4 slots, we distribute the
> + * buckets equally among them. So we allocate only one fourth of 
> total
> + * buckets in new splitpoint group at a time to consume one phase 
> after
> + * another. We treat allocation of buckets as a separate WAL-logged
> + * action. Even if we fail after this operation, won't leak bucket
> + * pages; rather, the next split will consume this 

Re: [HACKERS] [patch] reorder tablespaces in basebackup tar stream for backup_label

2017-03-30 Thread Kyotaro HORIGUCHI
At Fri, 31 Mar 2017 13:37:38 +0900, Michael Paquier  
wrote in 

Re: [HACKERS] increasing the default WAL segment size

2017-03-30 Thread Beena Emerson
Hello,

Thanks for testing my patch.

On 30 Mar 2017 15:10, "Kuntal Ghosh"  wrote:

On Tue, Mar 28, 2017 at 1:06 AM, Beena Emerson 
wrote:
> On Sat, Mar 25, 2017 at 10:32 PM, Peter Eisentraut
>  wrote:
>>
>> At this point, I suggest splitting this patch up into several
>> potentially less controversial pieces.
>>
>> One big piece is that we currently don't support segment sizes larger
>> than 64 GB, for various internal arithmetic reasons.  Your patch appears
>> to address that.  So I suggest isolating that.  Assuming it works
>> correctly, I think there would be no great concern about it.
>>
>> The next piece would be making the various tools aware of varying
>> segment sizes without having to rely on a built-in value.
>>
>> The third piece would then be the rest that allows you to set the size
>> at initdb
>>
>> If we take these in order, we would make it easier to test various sizes
>> and see if there are any more unforeseen issues when changing sizes.  It
>> would also make it easier to do performance testing so we can address
>> the original question of what the default size should be.
>
>
> PFA the patches divided into 3 parts:
Thanks for splitting the patches.
01-add-XLogSegmentOffset-macro.patch is same as before and it looks good.

> 02-increase-max-wal-segsize.patch - Increases the wal-segsize and changes
> the internal representation of max_wal_size and min_wal_size to mb.
looks good.

> 03-modify-tools.patch - Makes XLogSegSize into a variable, currently set
as
> XLOG_SEG_SIZE and modifies the tools to fetch the size instead of using
> inbuilt value.
Several functions are declared and defined in different tools to fetch
the WAL segment size.
In pg_standby.c,
RetrieveXLogSegSize() - /* Set XLogSegSize from the WAL file header */

In pg_basebackup/streamutil.c,
 RetrieveXLogSegSize(PGconn *conn) - /* set XLogSegSize using
SHOW wal_segment_size */

In pg_waldump.c,
ReadXLogFromDir(char *archive_loc)
RetrieveXLogSegSize(char *archive_path) /* Scan through the archive
location to set XLogSegsize from the first WAL file */

IMHO, it's better to define a single function in xlog.c that, based on
the given strategy, retrieves the XLogSegSize on behalf of the
different modules. I suggested the same in my first review and I'll
still vote for it. For example, in xlog.c, you could define something
like:
bool RetrieveXLogSegSize(RetrieveStrategy rs, void *ptr)

Now, based on the RetrieveStrategy (say Conn, File, or Dir), you can
cast the void pointer to the appropriate type. So, when a new tool
needs to retrieve XLogSegSize, it can just call this function instead
of defining a new RetrieveXLogSegSize function.

It's just a suggestion from my side. Is there anything I'm missing
that would prevent the aforesaid approach from working?
Apart from that, I've nothing to add here.



I do not think a generalised function is a good idea. Besides, I feel the
respective approaches are best kept in the modules that use them, also
because the internal code is not easily accessible by the utilities.


> 04-initdb-walsegsize.patch - Adds the initdb option to set wal-segsize and
> make related changes. Update pg_test_fsync to use DEFAULT_XLOG_SEG_SIZE
> instead of XLOG_SEG_SIZE
looks good.

>>
>> One concern I have is that your patch does not contain any tests.  There
>> should probably be lots of tests.
>
>
> 05-initdb_tests.patch adds TAP tests to initialize a cluster with
> different wal_segment_size values and then check the config values.
> What other tests do you have in mind? Checking the various tools?
Nothing from me to add here.

I've nothing to add here for the attached set of patches. On top of
these, David's patch can be used for preserving LSNs in the file
naming for all segment sizes.


Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?

2017-03-30 Thread Kyotaro HORIGUCHI
Thank you for having a look at this.

# I removed -bugs in CC:.

At Fri, 31 Mar 2017 13:40:00 +1100, Venkata B Nagothi  wrote 
in 
> On Fri, Mar 17, 2017 at 6:48 PM, Kyotaro HORIGUCHI <
> horiguchi.kyot...@lab.ntt.co.jp> wrote:
> 
> > Hello,
> >
> > At Mon, 13 Mar 2017 11:06:00 +1100, Venkata B Nagothi wrote in
> > > On Tue, Jan 17, 2017 at 9:36 PM, Kyotaro HORIGUCHI <
> > > horiguchi.kyot...@lab.ntt.co.jp> wrote:
> > > > I managed to reproduce this. A little tweak as the first patch
> > > > lets the standby to suicide as soon as walreceiver sees a
> > > > contrecord at the beginning of a segment.
> > > >
> > > > - M(aster): createdb as a master with wal_keep_segments = 0
> > > > (default), min_log_messages = debug2
> > > > - M: Create a physical repslot.
> > > > - S(tandby): Setup a standby database.
> > > > - S: Edit recovery.conf to use the replication slot above then
> > > >  start it.
> > > > - S: touch /tmp/hoge
> > > > - M: Run pgbench ...
> > > > - S: After a while, the standby stops.
> > > >   > LOG:   STOP THE SERVER
> > > >
> > > > - M: Stop pgbench.
> > > > - M: Do 'checkpoint;' twice.
> > > > - S: rm /tmp/hoge
> > > > - S: Fails to catch up with the following error.
> > > >
> > > >   > FATAL:  could not receive data from WAL stream: ERROR:  requested
> > WAL
> > > > segment 0001002B has already been removed
> > > >
> > > >
> > > I have been testing / reviewing the latest patch
> > > "0001-Fix-a-bug-of-physical-replication-slot.patch" and i think, i might
> > > need some more clarification on this.
> > >
> > > Before applying the patch, I tried re-producing the above error -
> > >
> > > - I had master->standby in streaming replication
> > > - Took the backup of master
> > >- with a low max_wal_size and wal_keep_segments = 0
> > > - Configured standby with recovery.conf
> > > - Created replication slot on master
> > > - Configured the replication slot on standby and started the standby
> >
> > I suppose the "configure" means primary_slot_name in recovery.conf.
> >
> > > - I got the below error
> > >
> > >>> 2017-03-10 11:58:15.704 AEDT [478] LOG:  invalid record length at
> > > 0/F2000140: wanted 24, got 0
> > >>> 2017-03-10 11:58:15.706 AEDT [481] LOG:  started streaming WAL from
> > > primary at 0/F200 on timeline 1
> > >>> 2017-03-10 11:58:15.706 AEDT [481] FATAL:  could not receive data
> > > from WAL stream: ERROR:  requested WAL segment 000100F2
> > has
> > > already been removed
> >
> > Maybe you created the master slot in non-reserve (default) mode
> > and paused for some minutes after making the backup and before
> > starting the standby. In that case the master slot doesn't keep
> > WAL segments unless the standby connects, so a couple of
> > checkpoints can blow away the first segment required by the
> > standby. This is quite reasonable behavior. The following steps
> > make this more certain.
> >
> > > - Took the backup of master
> > >- with a low max_wal_size = 2 and wal_keep_segments = 0
> > > - Configured standby with recovery.conf
> > > - Created replication slot on master
> > + - SELECT pg_switch_wal(); on master twice.
> > + - checkpoint; on master twice.
> > > - Configured the replication slot on standby and started the standby
> >
> > Creating the slot with the following command will save it.
> >
> > =# select pg_create_physical_replication_slot('s1', true);
> >
> 
> I did a test again by applying the patch, and I am not sure if the
> patch is doing the right thing.

This is a bit difficult to make happen.

> Here is test case -
> 
> - I ran pgbench
> - I took the backup of the master first
> 
> - Below are the WALs on master after the stop backup -
> 
> postgres=# select pg_stop_backup();
> 
> NOTICE:  WAL archiving is not enabled; you must ensure that all required
> WAL segments are copied through other means to complete the backup
>  pg_stop_backup
> 
>  0/8C000130
> (1 row)
> 
> postgres=# \q
> [dba@buildhost data]$ ls -ltrh pgdata-10dev-prsb-1/pg_wal/
> total 65M
> drwx--. 2 dba dba 4.0K Mar 31 09:36 archive_status
> -rw---. 1 dba dba  16M Mar 31 11:09 0001008E
> -rw---. 1 dba dba  16M Mar 31 11:17 0001008F
> -rw---. 1 dba dba  16M Mar 31 11:18 0001008C
> -rw---. 1 dba dba  16M Mar 31 11:18 0001008D
> 
> - After the backup, i created the physical replication slot
> 
> 
> postgres=# select pg_create_physical_replication_slot('repslot',true);
> 
>  pg_create_physical_replication_slot
> -
>  (repslot,0/8D28)
> (1 row)

At this point, segment 8C is not protected from removal. It's too
late if the first record in 8D is continued from 8C.

> postgres=# select 

Re: [HACKERS] [PATCH] Reduce src/test/recovery verbosity

2017-03-30 Thread Craig Ringer
On 31 March 2017 at 04:29, Stephen Frost  wrote:

> Unless people wish to object, I'll use Michael's patch to remove
> --verbose from the top level tomorrow.

Sounds good.

Maybe add

To get more detailed output from tests, set PROVE_FLAGS='--verbose' or
examine the detailed test output in tmp_check/logs.

to src/test/perl/README too?
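
For example, something like:

    make -C src/test/recovery check PROVE_FLAGS='--verbose'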

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-30 Thread Thomas Munro
Hi hackers,

Thanks very much to Rafia for testing, and to Andres for his copious
review feedback.  Here's a new version.  Changes:

1.  Keep all the backing files that are part of a BufFileSet in
subdirectories, as suggested by Andres.  Now, instead of that
unpopular logic for scanning ranges of possible file paths to delete,
we can just blow away whole directories that group sets of related
files.

2.  Don't expose the 'participant' and 'partition' concepts, which
Andres didn't like much, in the BufFile API.  There is a new concept
'stripe' which client code of BufFileSet can use to specify the
participant number in a more general way without saying so: it's
really just a way to spread files over tablespaces.  I'm not sure if
tablespaces are really used that way much, but it seemed like Peter
wasn't going to be too happy with a proposal that didn't do
*something* to respect the existing temp_tablespaces GUC behaviour
(and he'd be right).  But I didn't think it would make any kind of
sense at all to stripe by 1GB segments as private BufFiles do when
writing from multiple processes, as I have argued elsewhere, hence
this scheme.

The 'qunique' function used here (basically poor man's std::unique) is
one I proposed earlier, with the name suggested by Tom Lane:

See 
https://www.postgresql.org/message-id/flat/CAEepm%3D2vmFTNpAmwbGGD2WaryM6T3hSDVKQPfUwjdD_5XY6vAA%40mail.gmail.com
.
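
For reference, the idea is roughly the following (a sketch from
memory, not necessarily the exact code proposed in that thread):
collapse adjacent duplicates in a pre-sorted array, in place, and
return the new element count.

static size_t
qunique(void *array, size_t elements, size_t width,
		int (*compare) (const void *, const void *))
{
	char	   *bytes = (char *) array;
	size_t		i,
				j;

	if (elements <= 1)
		return elements;

	for (i = 1, j = 0; i < elements; i++)
	{
		/* keep element i only if it differs from the last kept one */
		if (compare(bytes + i * width, bytes + j * width) != 0)
		{
			j++;
			if (j != i)
				memcpy(bytes + j * width, bytes + i * width, width);
		}
	}
	return j + 1;
}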

3.  Merged the single-batch and multi-batch patches into one.  Earlier
I had the idea that it was easier to review them in layers, since I
hoped people might catch a glimpse of the central simplicity without
being hit by a wall of multi-batch logic, but since Andres is
reviewing and disagrees, I give you 0010-hj-parallel-v11.patch, which
weighs in at 32 files changed, 2278 insertions(+), 250 deletions(-).

4.  Moved the DSM handling to the very end of resowner.c's cleanup.
Peter pointed out that it would otherwise happen before fd.c Files are
closed.  He was concerned about a different aspect of that which I'm
not sure I fully understand, but at the very least it seemed to
represent a significant problem for this design on Windows.  I
discussed this briefly with Robert off-list and he told me that there
is probably no good reason for the ordering that we have, and what's
more, there may be good arguments even outside this case for DSM
segments being cleaned up as late as possible, now that they contain
shared control information and not just tuple data as once had been
imagined.  I can't think of any reason why this would not be safe.
Can you?

5.  The empty inner relation optimisation implemented.

Some smaller changes and miles of feedback inline below:

On Mon, Mar 27, 2017 at 11:03 AM, Thomas Munro
 wrote:
> On Mon, Mar 27, 2017 at 9:41 AM, Andres Freund  wrote:
>> SharedBufFile allows temporary files to be created by one backend and
>> then exported for read-only access by other backends, with clean-up
>> managed by reference counting associated with a DSM segment.  This includes
>> changes to fd.c and buffile.c to support new kinds of temporary file.
>>
>>
>> diff --git a/src/backend/storage/file/buffile.c 
>> b/src/backend/storage/file/buffile.c
>> index 4ca0ea4..a509c05 100644
>> --- a/src/backend/storage/file/buffile.c
>> +++ b/src/backend/storage/file/buffile.c
>>
>> I think the new facilities should be explained in the file's header.
>
> Will do.

Done.

>> @@ -68,9 +71,10 @@ struct BufFile
>>  * avoid making redundant FileSeek calls.
>>  */
>>
>> -   bool        isTemp;         /* can only add files if this is TRUE */
>> +   bool        isSegmented;    /* can only add files if this is TRUE */
>>
>> That's a bit of a weird and uncommented upon change.
>
> I was trying to cut down on the number of places we use the word
> 'temporary' to activate various different behaviours.  In this case,
> the only thing it controls is whether the BufFile is backed by one
> single fd.c File or many segments, so I figured it should be renamed.
>
> As Peter and you have pointed out, there may be a case for removing it
> altogether.

Done in 0007-hj-remove-buf-file-is-temp-v11.patch.

>> @@ -79,6 +83,8 @@ struct BufFile
>>  */
>> ResourceOwner resowner;
>>
>> +   BufFileTag  tag;            /* for discoverability between backends */
>>
>> Not perfectly happy with the name tag here, the name is a bit too
>> similar to BufferTag - something quite different.
>
> Yeah, will rename.

Done.  That existed only because I had sharedbuffile.c which needed
special access to buffile.c via those weird 'tag' interfaces.  In the
new version that isn't required, and a new struct BufFileSet is
provided by buffile.c/h.

>> +static void
>> +make_tagged_path(char *tempdirpath, char *tempfilepath,
>> +const BufFileTag *tag, int segment)
>> +{
>> +   if (tag->tablespace == DEFAULTTABLESPACE_OID ||
>> +   

Re: [HACKERS] postgres_fdw IMPORT SCHEMA and partitioned tables

2017-03-30 Thread Amit Langote
On 2017/03/31 13:46, Michael Paquier wrote:
> On Fri, Mar 31, 2017 at 1:37 PM, Amit Langote
>  wrote:
>> +   
>> +For partitioned tables, partitions are automatically excluded from the
>> +schema data imported. Only the definition of partitioned tables is included
>> +to give access to the full data set of all partitions present remotely.
>> +   
>> +
>>
>> Only the definitions of "root" partitioned tables, because when using
>> multi-level partitioning, there would be partitioned tables that won't
>> be included (because relispartition = true).
>>
>> If you agree, then this code comment too could use the same terminology:
>>
>> + * Ignore table data for partitions and only include the parent
>> + * definitions to allow access to the complete remote data set
>> + * locally in the schema imported.
>> + *
> 
> That makes sense. Updated this way as I have my hands on it now.

Thanks, no more comments from my side.

Regards,
Amit






Re: [HACKERS] Logical decoding on standby

2017-03-30 Thread Craig Ringer
On 31 March 2017 at 01:16, Andres Freund  wrote:
>> @@ -9633,6 +9643,12 @@ xlog_redo(XLogReaderState *record)
>>   SetTransactionIdLimit(checkPoint.oldestXid, 
>> checkPoint.oldestXidDB);
>>
>>   /*
>> +  * There can be no concurrent writers to oldestCatalogXmin 
>> during
>> +  * recovery, so no need to take ProcArrayLock.
>> +  */
>> + ShmemVariableCache->oldestCatalogXmin = 
>> checkPoint.oldestCatalogXmin;
>
> s/writers/writes/?

I meant writers, i.e. nothing else can be writing to it. But writes works too.


>> @@ -276,8 +279,21 @@ CreateInitDecodingContext(char *plugin,
>>
>>   ReplicationSlotsComputeRequiredXmin(true);
>>
>> + /*
>> +  * If this is the first slot created on the master we won't have a
>> +  * persistent record of the oldest safe xid for historic snapshots yet.
>> +  * Force one to be recorded so that when we go to replay from this 
>> slot we
>> +  * know it's safe.
>> +  */
>> + force_standby_snapshot =
>> + !TransactionIdIsValid(ShmemVariableCache->oldestCatalogXmin);
>
> s/first slot/first logical slot/. Also, the reference to master doesn't
> seem right.

I'm not sure what you mean about the reference to master not seeming
right.

If oldestCatalogXmin is 0 we'll ERROR when trying to start decoding
from the new slot, so we need to make sure it gets advanced once we've
decided on our starting catalog_xmin.

>>   LWLockRelease(ProcArrayLock);
>>
>> + /* Update ShmemVariableCache->oldestCatalogXmin */
>> + if (force_standby_snapshot)
>> + LogStandbySnapshot();
>
> The comment and code don't quite square to me - it's far from obvious
> that LogStandbySnapshot does something like that. I'd even say it's a
> bad idea to have it do that.

So you prefer the prior approach with separate xl_catalog_xmin advance records?

I don't have much preference; I liked the small code reduction of
Simon's proposed approach, but it ended up being a bit awkward in
terms of ordering and locking. I don't think catalog_xmin tracking is
really closely related to the standby snapshot stuff, and it feels a
bit like a tacked-on afterthought where it is now.

>>   /*
>>* tell the snapshot builder to only assemble snapshot once reaching 
>> the
>>* running_xact's record with the respective xmin.
>> @@ -376,6 +392,8 @@ CreateDecodingContext(XLogRecPtr start_lsn,
>>   start_lsn = slot->data.confirmed_flush;
>>   }
>>
>> + EnsureActiveLogicalSlotValid();
>
> This seems like it should be in a separate patch, and seperately
> reviewed. It's code that's currently unreachable (and thus untestable).

It is reached and does run; those checks run whenever a non-initial
decoding context is created on master or replica.

The failure case is reachable if a replica has hot_standby_feedback
off, or if it's not using a physical slot and loses its connection. If
promoted, any slot existing on that replica (from a file-system-level
copy made when the replica was created) will fail. I agree it's
contrived, since we can't create and maintain slots on replicas, which
is the main point of it.


>> +/*
>> + * Test to see if the active logical slot is usable.
>> + */
>> +static void
>> +EnsureActiveLogicalSlotValid(void)
>> +{
>> + TransactionId shmem_catalog_xmin;
>> +
>> + Assert(MyReplicationSlot != NULL);
>> +
>> + /*
>> +  * A logical slot can become unusable if we're doing logical decoding 
>> on a
>> +  * standby or using a slot created before we were promoted from standby
>> +  * to master.
>
> Neither of those is currently possible...

Right. Because it's foundations for decoding on standby.

>> If the master advanced its global catalog_xmin past the
>> +  * threshold we need it could've removed catalog tuple versions that
>> +  * we'll require to start decoding at our restart_lsn.
>> +  *
>> +  * We need a barrier so that if we decode in recovery on a standby we
>> +  * don't allow new decoding sessions to start after redo has advanced
>> +  * the threshold.
>> +  */
>> + if (RecoveryInProgress())
>> + pg_memory_barrier();
>
> I don't think this is a meaningful locking protocol.  It's a bad idea to
> use lock-free programming without need, especially when the concurrency
> protocol isn't well defined.

Yeah. The intended interaction is with recovery conflicts on standby,
which don't look likely to land in this release due to patch
split/cleanup etc. (Not a complaint.)

Better to just take a brief shared ProcArrayLock.
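
I.e. something along these lines (just a sketch, using the existing
LWLock API):

	TransactionId shmem_catalog_xmin;

	LWLockAcquire(ProcArrayLock, LW_SHARED);
	shmem_catalog_xmin = ShmemVariableCache->oldestCatalogXmin;
	LWLockRelease(ProcArrayLock);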

>> diff --git a/src/backend/replication/walsender.c 
>> b/src/backend/replication/walsender.c
>> index cfc3fba..cdc5f95 100644
>> --- a/src/backend/replication/walsender.c
>> +++ b/src/backend/replication/walsender.c
>> @@ -1658,6 +1658,11 @@ PhysicalConfirmReceivedLocation(XLogRecPtr lsn)
>>* be energy wasted - the worst lost information can do here is give us
>> 

Re: [HACKERS] postgres_fdw IMPORT SCHEMA and partitioned tables

2017-03-30 Thread Michael Paquier
On Fri, Mar 31, 2017 at 1:37 PM, Amit Langote
 wrote:
> +   
> +For partitioned tables, partitions are automatically excluded from the
> +schema data imported. Only the definition of partitioned tables is included
> +to give access to the full data set of all partitions present remotely.
> +   
> +
>
> Only the definitions of "root" partitioned tables, because when using
> multi-level partitioning, there would be partitioned tables that won't
> be included (because relispartition = true).
>
> If you agree, then this code comment too could use the same terminology:
>
> + * Ignore table data for partitions and only include the parent
> + * definitions to allow access to the complete remote data set
> + * locally in the schema imported.
> + *

That makes sense. Updated this way as I have my hands on it now.
-- 
Michael


pgfdw-partitions-v2.patch
Description: Binary data



Re: [HACKERS] strange parallel query behavior after OOM crashes

2017-03-30 Thread Kuntal Ghosh
On Fri, Mar 31, 2017 at 5:43 AM, Neha Khatri  wrote:
>
> On Fri, Mar 31, 2017 at 8:29 AM, Kuntal Ghosh 
> wrote:
>>
>> On Fri, Mar 31, 2017 at 2:05 AM, Kuntal Ghosh
>>  wrote:
>> >
>> > 1. Put an Assert(0) in ParallelQueryMain(), start server and execute
>> > any parallel query.
>> >  In LaunchParallelWorkers, you can see
>> >nworkers = n nworkers_launched = n (n>0)
>> > But, all the workers will crash because of the assert statement.
>> > 2. the server restarts automatically, initialize
>> > BackgroundWorkerData->parallel_register_count and
>> > BackgroundWorkerData->parallel_terminate_count in the shared memory.
>> > After that, it calls ForgetBackgroundWorker and it increments
>> > parallel_terminate_count. In LaunchParallelWorkers, we have the
>> > following condition:
>> > if ((BackgroundWorkerData->parallel_register_count -
>> >  BackgroundWorkerData->parallel_terminate_count) >=
>> > max_parallel_workers)
>> > DO NOT launch any parallel worker.
>> > Hence, nworkers = n nworkers_launched = 0.
>> parallel_register_count and parallel_terminate_count are both
>> unsigned integers. So, whenever the difference is logically negative,
>> it wraps around to a well-defined unsigned value that is certainly
>> much larger than max_parallel_workers. Hence, no workers will be
>> launched. I've attached a patch to fix this.
>
>
> The current explanation of active number of parallel workers is:
>
>  * The active
>  * number of parallel workers is the number of registered workers minus the
>  * terminated ones.
>
> In situations like the one you mentioned above, this formula can give
> a negative number for active parallel workers. However, a negative
> number of active parallel workers does not make any sense.
Agreed.

> I feel it would be better to explain in the code in what situations
> the formula can generate a negative result and what that means.
I think we need to find a fix so that it never generates a negative
result. The last patch I submitted handles the negative value
correctly, but surely that's not enough.
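
For example, something along these lines (a sketch of the guard, not
the exact patch):

	uint32		active_workers;

	/*
	 * After a crash-restart, parallel_register_count is reset while
	 * ForgetBackgroundWorker may still bump parallel_terminate_count,
	 * so guard against unsigned wraparound instead of trusting the
	 * plain subtraction.
	 */
	if (BackgroundWorkerData->parallel_register_count >=
		BackgroundWorkerData->parallel_terminate_count)
		active_workers = BackgroundWorkerData->parallel_register_count -
			BackgroundWorkerData->parallel_terminate_count;
	else
		active_workers = 0;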



-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] postgres_fdw IMPORT SCHEMA and partitioned tables

2017-03-30 Thread Amit Langote
On 2017/03/31 13:23, Michael Paquier wrote:
> On Wed, Mar 29, 2017 at 12:30 PM, Michael Paquier
>  wrote:
>> Users like things that are friendly, and we are most likely going to
>> piss them off when using postgres_fdw if they need to manually list
>> each parent table in the IMPORT FOREIGN SCHEMA command.
>>
>>> However, if we're going to do something about this, I think it should
>>> be done soon.  Otherwise, I'm going to advocate for reclassifying this
>>> issue from "open item" to "possible area for future development".
>>
>> I was just waiting for the end of the CF before sending in a patch,
>> allocating now some time to look at some patches pending for reviews.
> 
> And here is the promised patch to address this open item.

Looks good to me, except maybe:

+
+   
+For partitioned tables, partitions are automatically excluded from the
+schema data imported. Only the definition of partitioned tables is included
+to give access to the full data set of all partitions present remotely.
+   
+

Only the definitions of "root" partitioned tables, because when using
multi-level partitioning, there would be partitioned tables that won't
be included (because relispartition = true).
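
(For illustration, only remote relations matching something like this
hypothetical catalog query would be candidates for import:)

SELECT c.relname
FROM pg_class c
WHERE c.relkind = 'p'          -- partitioned tables
  AND NOT c.relispartition;    -- but not those that are themselves partitions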

If you agree, then this code comment too could use the same terminology:

+ * Ignore table data for partitions and only include the parent
+ * definitions to allow access to the complete remote data set
+ * locally in the schema imported.
+ *

Thanks,
Amit






Re: [HACKERS] [patch] reorder tablespaces in basebackup tar stream for backup_label

2017-03-30 Thread Michael Paquier
On Fri, Mar 31, 2017 at 12:06 AM, Fujii Masao  wrote:
> I don't think that using psql to run the BASE_BACKUP command is a good
> approach.

That's basically what pg_basebackup is made for, and it makes sure
that some sanity checks and measures happen.

> Instead, IMO you basically should implement the client program
> which can handle BASE_BACKUP properly, or extend pg_basebackup
> so that you can do what you want to do.

In my first reviews of the patch, I completely forgot the fact that
BASE_BACKUP does send the start LSN of the backup in the first result
set, so the proposed patch is actually rather useless because the data
you are looking for is already at hand. If more data would be
interesting to have, like the start timestamp, we could just extend
the first result set a bit, as Fujii-san suggests. Let's drop this
patch and move on.
-- 
Michael




Re: [HACKERS] postgres_fdw IMPORT SCHEMA and partitioned tables

2017-03-30 Thread Michael Paquier
On Wed, Mar 29, 2017 at 12:30 PM, Michael Paquier
 wrote:
> Users like things that are friendly, and we are most likely going to
> piss them off when using postgres_fdw if they need to manually list
> each parent table in the IMPORT FOREIGN SCHEMA command.
>
>> However, if we're going to do something about this, I think it should
>> be done soon.  Otherwise, I'm going to advocate for reclassifying this
>> issue from "open item" to "possible area for future development".
>
> I was just waiting for the end of the CF before sending in a patch,
> allocating now some time to look at some patches pending for reviews.

And here is the promised patch to address this open item.
-- 
Michael


pgfdw-partitions.patch
Description: Binary data



Re: [HACKERS] Declarative partitioning vs. information_schema

2017-03-30 Thread Amit Langote
On 2017/01/26 3:19, Robert Haas wrote:
> On Wed, Jan 25, 2017 at 1:04 PM, Peter Eisentraut
>  wrote:
>> On 1/18/17 2:32 PM, Robert Haas wrote:
>>> Unless we can find something official, I suppose we should just
>>> display BASE TABLE in that case as we do in other cases.  I wonder if
>>> the schema needs some broader revision; for example, are there
>>> information_schema elements intended to show information about
>>> partitions?
>>
>> Is it intentional that we show the partitions by default in \d,
>> pg_tables, information_schema.tables?  Or should we treat those as
>> somewhat-hidden details?
> 
> I'm not really sure what the right thing to do is there.  I was hoping
> you had an opinion.

I guess this is an open item then.  I think Greg Stark brought this up too
on the original partitioning thread [1].

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CAM-w4HOZ5fPS7GoCTTrW42q01%2BwPrOWFCnr9H0iDyVTZP2H1CA%40mail.gmail.com






Re: [HACKERS] Something broken around FDW connection close

2017-03-30 Thread Kyotaro HORIGUCHI
At Fri, 31 Mar 2017 12:32:39 +0900, Etsuro Fujita  
wrote in 
> On 2017/03/31 8:28, David Rowley wrote:
..
> > create server test_server foreign data wrapper postgres_fdw options
> > (host 'localhost', port '5432', dbname 'postgres');
> > create foreign table ft_t (a int,b int) server test_server;
...
> > select count(*) from pg_stat_Activity; -- > 6
> > analyze ft_t;
> > ERROR:  could not connect to server "test_server"
> > DETAIL:  FATAL:  sorry, too many clients already
> > CONTEXT:  Remote SQL command: DECLARE c1 CURSOR FOR SELECT a, b FROM
> > public.ft_t
> > Remote SQL command: SELECT a, b FROM public.ft_t
> > Remote SQL command: SELECT a, b FROM public.ft_t
> > Remote SQL command: SELECT a, b FROM public.ft_t
> > (lots of these)
...
> IIUC, I think the cause would be that, since the foreign table ft_t is
> considered to still be foreign on the foreign server, which is
> actually the same server, postgres_fdw recursively repeats the
> loopback access to ft_t.  (So the same thing would happen for
> something like: select * from ft_t.)  If that analysis is right, ISTM
> that it's the user's fault.

Agreed; this behavior is mentioned here:

https://www.postgresql.org/docs/9.6/static/postgres-fdw.html#AEN182920

| table_name
| 
| This option, which can be specified for a foreign table, gives
| the table name to use for the foreign table on the remote
| server. If this option is omitted, the foreign table's name is
| used.
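
So for an intentional loopback setup, the foreign table should point
at a distinct remote relation, e.g. (names are only illustrative):

CREATE TABLE t (a int, b int);
CREATE FOREIGN TABLE ft_t (a int, b int)
  SERVER test_server
  OPTIONS (table_name 't');   -- ft_t no longer refers to itself remotely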

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] Logical decoding on standby

2017-03-30 Thread Craig Ringer
On 31 March 2017 at 01:16, Andres Freund  wrote:
> On 2017-03-29 08:01:34 +0800, Craig Ringer wrote:
>> On 28 March 2017 at 23:22, Andres Freund  wrote:
>>
>> >> --- a/doc/src/sgml/protocol.sgml
>> >> +++ b/doc/src/sgml/protocol.sgml
>> >> @@ -2034,6 +2034,8 @@ The commands accepted in walsender mode are:
>> >>   
>> >>Drops a replication slot, freeing any reserved server-side 
>> >> resources. If
>> >>the slot is currently in use by an active connection, this command 
>> >> fails.
>> >> +  If the slot is a logical slot that was created in a database other 
>> >> than
>> >> +  the database the walsender is connected to, this command fails.
>> >>   
>> >>   
>> >>
>> >
>> > Shouldn't the docs in the drop database section about this?
>>
>> DROP DATABASE doesn't really discuss all the resources it drops, but
>> I'm happy to add mention of replication slots handling.
>
> I don't think that's really comparable, because the other things aren't
> global objects, which replication slots are.

Fine by me.

Patch fix-slot-drop-docs.patch, upthread, adds the passage

+
+  
+   Active logical
+   replication slots count as connections and will prevent a
+   database from being dropped. Inactive slots will be automatically
+   dropped when the database is dropped.
+  

to the notes section of the DROP DATABASE docs; that should do the
trick. I'm not convinced it's worth going into the exceedingly
unlikely race with concurrent slot drop, and we don't seem to in other
places in the docs, like the race you mentioned with connecting to a
db as it's being dropped.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] few fts functions for jsonb

2017-03-30 Thread Dmitry Dolgov
On 31 March 2017 at 00:01, Andrew Dunstan 
wrote:
>
> I have just noticed as I was writing/testing the non-existent docs for
> this patch that it doesn't supply variants of to_tsvector that take a
> regconfig as the first argument. Is there a reason for that? Why
> should the json(b) versions be different from the text versions?

No, there is no reason, I just missed that. Here is a new version of the
patch (only the functions part)
to add those variants.
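
Assuming the matching catalog entries on top of these C functions,
usage would look something like:

SELECT jsonb_to_tsvector('english'::regconfig,
                         '{"a": "the quick brown fox"}'::jsonb);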
diff --git a/src/backend/tsearch/to_tsany.c b/src/backend/tsearch/to_tsany.c
index 6e5de8f..f19383e 100644
--- a/src/backend/tsearch/to_tsany.c
+++ b/src/backend/tsearch/to_tsany.c
@@ -16,6 +16,7 @@
 #include "tsearch/ts_cache.h"
 #include "tsearch/ts_utils.h"
 #include "utils/builtins.h"
+#include "utils/jsonapi.h"
 
 
 typedef struct MorphOpaque
@@ -24,6 +25,14 @@ typedef struct MorphOpaque
 	int			qoperator;		/* query operator */
 } MorphOpaque;
 
+typedef struct TSVectorBuildState
+{
+	ParsedText	*prs;
+	TSVector	result;
+	Oid			cfgId;
+} TSVectorBuildState;
+
+static void add_to_tsvector(void *state, char *elem_value, int elem_len);
 
 Datum
 get_current_ts_config(PG_FUNCTION_ARGS)
@@ -256,6 +265,135 @@ to_tsvector(PG_FUNCTION_ARGS)
 		PointerGetDatum(in)));
 }
 
+Datum
+jsonb_to_tsvector_byid(PG_FUNCTION_ARGS)
+{
+	Oid	cfgId = PG_GETARG_OID(0);
+	Jsonb	   *jb = PG_GETARG_JSONB(1);
+	TSVectorBuildState	state;
+	ParsedText			*prs = (ParsedText *) palloc(sizeof(ParsedText));
+
+	prs->words = NULL;
+	state.result = NULL;
+	state.cfgId = cfgId;
+	state.prs = prs;
+
+	iterate_jsonb_string_values(jb, &state, (JsonIterateStringValuesAction) add_to_tsvector);
+
+	PG_FREE_IF_COPY(jb, 1);
+
+	if (state.result == NULL)
+	{
+		/* There weren't any string elements in jsonb,
+		 * so we need to return an empty vector */
+
+		if (prs->words != NULL)
+			pfree(prs->words);
+
+		state.result = palloc(CALCDATASIZE(0, 0));
+		SET_VARSIZE(state.result, CALCDATASIZE(0, 0));
+		state.result->size = 0;
+	}
+
+	PG_RETURN_TSVECTOR(state.result);
+}
+
+Datum
+jsonb_to_tsvector(PG_FUNCTION_ARGS)
+{
+	Jsonb	*jb = PG_GETARG_JSONB(0);
+	Oid		cfgId;
+
+	cfgId = getTSCurrentConfig(true);
+	PG_RETURN_DATUM(DirectFunctionCall2(jsonb_to_tsvector_byid,
+		ObjectIdGetDatum(cfgId),
+		JsonbGetDatum(jb)));
+}
+
+Datum
+json_to_tsvector_byid(PG_FUNCTION_ARGS)
+{
+	Oid	cfgId = PG_GETARG_OID(0);
+	text	   *json = PG_GETARG_TEXT_P(1);
+	TSVectorBuildState	state;
+	ParsedText			*prs = (ParsedText *) palloc(sizeof(ParsedText));
+
+	prs->words = NULL;
+	state.result = NULL;
+	state.cfgId = cfgId;
+	state.prs = prs;
+
+	iterate_json_string_values(json, &state, (JsonIterateStringValuesAction) add_to_tsvector);
+
+	PG_FREE_IF_COPY(json, 1);
+	if (state.result == NULL)
+	{
+		/* There weren't any string elements in json,
+		 * so we need to return an empty vector */
+
+		if (prs->words != NULL)
+			pfree(prs->words);
+
+		state.result = palloc(CALCDATASIZE(0, 0));
+		SET_VARSIZE(state.result, CALCDATASIZE(0, 0));
+		state.result->size = 0;
+	}
+
+	PG_RETURN_TSVECTOR(state.result);
+}
+
+Datum
+json_to_tsvector(PG_FUNCTION_ARGS)
+{
+	text	*json = PG_GETARG_TEXT_P(0);
+	Oid		cfgId;
+
+	cfgId = getTSCurrentConfig(true);
+	PG_RETURN_DATUM(DirectFunctionCall2(json_to_tsvector_byid,
+		ObjectIdGetDatum(cfgId),
+		PointerGetDatum(json)));
+}
+
+/*
+ * Extend current TSVector from _state with a new one,
+ * build over a json(b) element.
+ */
+static void
+add_to_tsvector(void *_state, char *elem_value, int elem_len)
+{
+	TSVectorBuildState *state = (TSVectorBuildState *) _state;
+	ParsedText	*prs = state->prs;
+	TSVector	item_vector;
+	int			i;
+
+	prs->lenwords = elem_len / 6;
+	if (prs->lenwords == 0)
+		prs->lenwords = 2;
+
+	prs->words = (ParsedWord *) palloc(sizeof(ParsedWord) * prs->lenwords);
+	prs->curwords = 0;
+	prs->pos = 0;
+
+	parsetext(state->cfgId, prs, elem_value, elem_len);
+
+	if (prs->curwords)
+	{
+		if (state->result != NULL)
+		{
+			for (i = 0; i < prs->curwords; i++)
+				prs->words[i].pos.pos = prs->words[i].pos.pos + TS_JUMP;
+
+			item_vector = make_tsvector(prs);
+
+			state->result = (TSVector) DirectFunctionCall2(tsvector_concat,
+	TSVectorGetDatum(state->result),
+	PointerGetDatum(item_vector));
+		}
+		else
+			state->result = make_tsvector(prs);
+	}
+}
+
 /*
  * to_tsquery
  */
diff --git a/src/backend/tsearch/wparser.c b/src/backend/tsearch/wparser.c
index 8ca1c62..6e4e445 100644
--- a/src/backend/tsearch/wparser.c
+++ b/src/backend/tsearch/wparser.c
@@ -20,6 +20,7 @@
 #include "tsearch/ts_cache.h"
 #include "tsearch/ts_utils.h"
 #include "utils/builtins.h"
+#include "utils/jsonapi.h"
 #include "utils/varlena.h"
 
 
@@ -31,6 +32,19 @@ typedef struct
 	LexDescr   *list;
 } TSTokenTypeStorage;
 
+/* state for ts_headline_json_* */
+typedef struct HeadlineJsonState
+{
+	HeadlineParsedText *prs;
+	TSConfigCacheEntry *cfg;
+	TSParserCacheEntry *prsobj;
+	TSQuery			query;
+	List

Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-30 Thread Rafia Sabih
On Tue, Mar 28, 2017 at 11:11 AM, Rafia Sabih
 wrote:
> On Mon, Mar 27, 2017 at 12:20 PM, Thomas Munro
>  wrote:
>>
>> On Sun, Mar 26, 2017 at 3:56 PM, Thomas Munro
>>  wrote:
>> > But... what you said above must be a problem for Windows.  I believe
>> > it doesn't allow files to be unlinked if they are open, and I see that
>> > DSM segments are cleaned up in resowner's phase ==
>> > RESOURCE_RELEASE_BEFORE_LOCKS and files are closed in phase ==
>> > RESOURCE_RELEASE_AFTER_LOCKS.
>>
>> I thought this last point about Windows might be fatal to my design,
>> but it seems that Windows since at least version 2000 has support for
>> Unixoid unlinkability via the special flag FILE_SHARE_DELETE.
>
> On testing v10 of this patch over commit
> b54aad8e34bd6299093e965c50f4a23da96d7cc3 and applying the tweak
> mentioned in [1], for TPC-H queries I found the results quite
> encouraging,
>
> Experimental setup:
> TPC-H scale factor - 20
> work_mem = 1GB
> shared_buffers = 10GB
> effective_cache_size = 10GB
> random_page_cost = seq_page_cost = 0.1
> max_parallel_workers_per_gather = 4
>
> Performance numbers:
> (Time in seconds)
> Query |  Head | Patch |
> ------+-------+-------+
> Q3    |    73 |    37 |
> Q5    |    56 |    31 |
> Q7    |    40 |    30 |
> Q8    |     8 |     8 |
> Q9    |    85 |    42 |
> Q10   |    86 |    46 |
> Q14   |    11 |     6 |
> Q16   |    32 |    11 |
> Q21   |    53 |    56 |
>
> Please find the attached file for the explain analyse output of these
> queries on head as well as patch.
> Would be working on analysing the performance of this patch on 300 scale 
> factor.
>
> [1] 
> https://www.postgresql.org/message-id/flat/CAEepm%3D270ze2hVxWkJw-5eKzc3AB4C9KpH3L2kih75R5pdSogg%40mail.gmail.com
> --

Before moving to a higher scale I tried playing around with work_mem
effects on this patch and came across the following results.
All settings are kept as before, with the exception of work_mem, which
is set to 64MB.

Most of the queries showed similar performance except a few; details
are as follows
(all times are given in ms)
Query |   Head |  Patch
------+--------+--------
 Q8   |   8720 |   8839
 Q18  | 370710 | 384347
 Q21  |  53270 |  65189

Clearly, the regression in Q8 and Q18 is minor, but that in Q21 is
significant. Just to confirm, I have applied the tweak mentioned in
[1] as before.
For the explain analyse output of Q21 on head and with patch, please
check the attached file.

 [1] 
https://www.postgresql.org/message-id/flat/CAEepm%3D270ze2hVxWkJw-5eKzc3AB4C9KpH3L2kih75R5pdSogg%40mail.gmail.com

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


q21_reg_ph.tar.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Something broken around FDW connection close

2017-03-30 Thread Etsuro Fujita

On 2017/03/31 8:28, David Rowley wrote:

create table t (a int, b int);
insert into t select x/100,x/100 from generate_series(1,10) x;
create extension if not exists postgres_fdw;
create server test_server foreign data wrapper postgres_fdw options
(host 'localhost', port '5432', dbname 'postgres');
create foreign table ft_t (a int,b int) server test_server;
select 'create user mapping for current_user server test_server
options(user ''' || current_user || ''');';
\gexec
select count(*) from pg_stat_Activity; -- > 6
analyze ft_t;
ERROR:  could not connect to server "test_server"
DETAIL:  FATAL:  sorry, too many clients already
CONTEXT:  Remote SQL command: DECLARE c1 CURSOR FOR SELECT a, b FROM
public.ft_t
Remote SQL command: SELECT a, b FROM public.ft_t
Remote SQL command: SELECT a, b FROM public.ft_t
Remote SQL command: SELECT a, b FROM public.ft_t
(lots of these)

select count(*) from pg_stat_Activity; --> 105

I've not had a moment to check into what's going on.


IIUC, I think the cause would be that since the foreign table ft_t is 
considered to be still foreign on the foreign server, which is actually 
the same server, postgres_fdw recursively repeats the loopback access to 
ft_t.  (So, the same thing would happen for something like: select * 
from ft_t.)  If the analysis is right, ISTM that it's the user's fault.


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Variable substitution in psql backtick expansion

2017-03-30 Thread Michael Paquier
On Fri, Mar 31, 2017 at 2:33 AM, Tom Lane  wrote:
> Awhile back in the discussion about the \if feature for psql,
> I'd pointed out that you shouldn't really need very much in
> the way of boolean-expression evaluation smarts, because you
> ought to be able to use a backtick shell escape:
>
> \if `expr :foo \> :bar`
> \echo :foo is greater than :bar
> \endif
>
> Now that the basic feature is in, I went to play around with this,
> and was disappointed to find out that it doesn't work.  The reason
> is not far to seek: we do not do variable substitution within the
> text between backticks.  psqlscanslash.l already foresaw that some
> day we'd want to do that:
>
> /*
>  * backticked text: copy everything until next backquote, then 
> evaluate.
>  *
>  * XXX Possible future behavioral change: substitute for :VARIABLE?
>  */
>
> I think today is that day, because it's going to make a material
> difference to the usability of this feature.
>
> I propose extending backtick processing so that
>
> 1. :VARIABLE is replaced by the contents of the variable
>
> 2. :'VARIABLE' is replaced by the contents of the variable,
> single-quoted according to Unix shell conventions.  (So the
> processing would be a bit different from what it is for the
> same notation in SQL contexts.)
>
> This doesn't look like it would take very much new code to do.
>
> Thoughts?

In short, +1.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] release notes for v10

2017-03-30 Thread Bruce Momjian
On Thu, Mar 30, 2017 at 06:23:04PM -0400, Robert Haas wrote:
> Hi,
> 
> One of the things we'll need to do to get 10beta1 out the door is have
> a set of release notes.  I spoke with Bruce Momjian today at PGCONF.US
> and he told me that he has set aside time during April to write them
> and will begin working on it once we reach feature freeze.  I think
> this is good, because last year Bruce didn't have time and we had to
> scramble to find somebody, with Tom Lane luckily stepping up to the
> plate.
> 
> Bruce, please confirm that I've got the message right.

Yes, confirmed, the release notes will be ready by the end of April.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] identity columns

2017-03-30 Thread Vitaly Burovoy
On 3/29/17, Peter Eisentraut  wrote:
> On 3/24/17 05:29, Vitaly Burovoy wrote:
>> It would be great if "DROP IDENTITY IF EXISTS" is in the current patch
>> since we don't have any disagreements about "DROP IDENTITY" behavior
>> and easiness of implementation of "IF EXISTS" there.
>
> Here is an updated patch that adds DROP IDENTITY IF EXISTS.
>
> Unfortunately, implementing ADD IF NOT EXISTS is much harder, so I
> haven't done that here.
>
> Additionally, this patch fixes a few minor issues that you had pointed
> out, and merges with the new expression evaluation system in the executor.
>
> I have also CC'ed you on a separate patch to improve the behavior when
> changing a sequence's data type.

Thank you a lot. I'll have a deep look by the Sunday evening.

Why do you still want to leave "ADD IF NOT EXISTS" instead of using
"SET IF NOT EXISTS"?
If someone wants to follow the standard, he can simply not use "IF
NOT EXISTS" at all. Without it, an error should be raised.
But then we would not need to introduce new grammar for a column
property which is singular and can't be added more than once.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sequence data type

2017-03-30 Thread Vitaly Burovoy
On 3/30/17, Vitaly Burovoy  wrote:
> On 3/29/17, Vitaly Burovoy  wrote:
>> On 3/29/17, Michael Paquier  wrote:
>>> On Thu, Mar 30, 2017 at 11:18 AM, Vitaly Burovoy
>>>  wrote:
 I think min_value and max_value should not be set to "1" or "-1" but
 to real min/max of the type by default.
>>>
>>> This is the default behavior for ages, since e8647c45 to be exact. So
>>> you would change 20 years of history?
>>
>> ... is it a wrong way to keep historical minimum as "1" by
>> default: it is not a minimum of any of supported type.
>
> I've read the standard about "minvalue", "maxvalue" and "start".
> OK, I was wrong. Since "start" should be equal to "minvalue" unless
> defined explicitly, the only bug left from my first email here is
> resetting "minvalue" back to 1 when the data type changes and the value
> matches the bound of the old type (the last case there).
>
> P.S.: the same thing with "maxvalue" when "increment" is negative.

It seemed not very hard to fix.
Please find attached a patch to be applied on top of yours.

I've added more tests to cover different cases of changing bounds when
data type is changed.

-- 
Best regards,
Vitaly Burovoy


0002-Fix-overriding-max-min-value-of-a-sequence-to-1-on-A.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?

2017-03-30 Thread Venkata B Nagothi
On Fri, Mar 17, 2017 at 6:48 PM, Kyotaro HORIGUCHI <
horiguchi.kyot...@lab.ntt.co.jp> wrote:

> Hello,
>
> At Mon, 13 Mar 2017 11:06:00 +1100, Venkata B Nagothi 
> wrote in  gmail.com>
> > On Tue, Jan 17, 2017 at 9:36 PM, Kyotaro HORIGUCHI <
> > horiguchi.kyot...@lab.ntt.co.jp> wrote:
> > > I managed to reproduce this. A little tweak as the first patch
> > > lets the standby to suicide as soon as walreceiver sees a
> > > contrecord at the beginning of a segment.
> > >
> > > - M(aster): createdb as a master with wal_keep_segments = 0
> > > (default), min_log_messages = debug2
> > > - M: Create a physical repslot.
> > > - S(tandby): Setup a standby database.
> > > - S: Edit recovery.conf to use the replication slot above then
> > >  start it.
> > > - S: touch /tmp/hoge
> > > - M: Run pgbench ...
> > > - S: After a while, the standby stops.
> > >   > LOG:   STOP THE SERVER
> > >
> > > - M: Stop pgbench.
> > > - M: Do 'checkpoint;' twice.
> > > - S: rm /tmp/hoge
> > > - S: Fails to catch up with the following error.
> > >
> > >   > FATAL:  could not receive data from WAL stream: ERROR:  requested
> WAL
> > > segment 0001002B has already been removed
> > >
> > >
> > I have been testing / reviewing the latest patch
> > "0001-Fix-a-bug-of-physical-replication-slot.patch" and i think, i might
> > need some more clarification on this.
> >
> > Before applying the patch, I tried re-producing the above error -
> >
> > - I had master->standby in streaming replication
> > - Took the backup of master
> >- with a low max_wal_size and wal_keep_segments = 0
> > - Configured standby with recovery.conf
> > - Created replication slot on master
> > - Configured the replication slot on standby and started the standby
>
> I suppose the "configure" means primary_slot_name in recovery.conf.
>
> > - I got the below error
> >
> >>> 2017-03-10 11:58:15.704 AEDT [478] LOG:  invalid record length at
> > 0/F2000140: wanted 24, got 0
> >>> 2017-03-10 11:58:15.706 AEDT [481] LOG:  started streaming WAL from
> > primary at 0/F200 on timeline 1
> >>> 2017-03-10 11:58:15.706 AEDT [481] FATAL:  could not receive data
> > from WAL stream: ERROR:  requested WAL segment 000100F2
> has
> > already been removed
>
> Maybe you created the master slot with non-reserve (default) mode
> and put a pause of some minutes after making the backup and before
> starting the standby. In that case the master slot doesn't keep
> WAL segments unless the standby connects, so a couple of
> checkpoints can blow away the first segment required by the
> standby. This is quite reasonable behavior. The following steps
> make this more sure.
>
> > - Took the backup of master
> >- with a low max_wal_size = 2 and wal_keep_segments = 0
> > - Configured standby with recovery.conf
> > - Created replication slot on master
> + - SELECT pg_switch_wal(); on master twice.
> + - checkpoint; on master twice.
> > - Configured the replication slot on standby and started the standby
>
> Creating the slot with the following command will save it.
>
> =# select pg_create_physical_replication_slot('s1', true);
>

I did the test again by applying the patch, and I am not sure if the patch
is doing the right thing.

Here is the test case -

- I ran pgbench
- I took the backup of the master first

- Below are the WALs on master after the stop backup -

postgres=# select pg_stop_backup();

NOTICE:  WAL archiving is not enabled; you must ensure that all required
WAL segments are copied through other means to complete the backup
 pg_stop_backup
----------------
 0/8C000130
(1 row)

postgres=# \q
[dba@buildhost data]$ ls -ltrh pgdata-10dev-prsb-1/pg_wal/
total 65M
drwx--. 2 dba dba 4.0K Mar 31 09:36 archive_status
-rw---. 1 dba dba  16M Mar 31 11:09 0001008E
-rw---. 1 dba dba  16M Mar 31 11:17 0001008F
-rw---. 1 dba dba  16M Mar 31 11:18 0001008C
-rw---. 1 dba dba  16M Mar 31 11:18 0001008D

- After the backup, i created the physical replication slot


postgres=# select pg_create_physical_replication_slot('repslot',true);

 pg_create_physical_replication_slot
--------------------------------------
 (repslot,0/8D28)
(1 row)

postgres=# select pg_walfile_name('0/8D28');

 pg_walfile_name
--------------------------
 0001008D
(1 row)

Here, when you start the standby, it would ask for the file
0001008C, which is the first file needed by the standby,
and since I applied your patch, I am assuming that the file
0001008C should also be retained without being removed -
correct?

- I started the standby and the below error occurs

>> 2017-03-31 11:26:01.288 AEDT [17475] LOG:  invalid record length at
0/8C000108: wanted 24, got 0
>> 2017-03-31 11:26:01.291 AEDT [17486] LOG:  started streaming 

Re: [HACKERS] [POC] A better way to expand hash indexes.

2017-03-30 Thread Robert Haas
On Thu, Mar 30, 2017 at 2:36 PM, Mithun Cy  wrote:
> Thanks Robert, I have tried to fix the comments given as below.

Thanks.

Since this changes the on-disk format of hash indexes in an
incompatible way, I think it should bump HASH_VERSION.  (Hopefully
that doesn't preclude REINDEX?)  pg_upgrade should probably also be
patched to issue a warning if upgrading from a version < 10 to a
version >= 10 whenever hash indexes are present; I thought we had
similar cases already, but I don't see them at the moment.  Maybe we
can get Bruce or someone to give us some advice on exactly what should
be done here.

In a couple of places, you say that a splitpoint group has 4 slots
rather than 4 phases.

I think that in _hash_get_totalbuckets(), you should use blah &
HASH_SPLITPOINT_PHASE_MASK rather than blah %
HASH_SPLITPOINT_PHASES_PER_GRP for consistency with _hash_spareindex
and, perhaps, speed.  Similarly, instead of blah /
HASH_SPLITPOINT_PHASES_PER_GRP, use blah >>
HASH_SPLITPOINT_PHASE_BITS.
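
To illustrate the equivalence concretely - this is just a standalone
sketch I put together, not code from the patch, and it assumes a
splitpoint group has 4 phases so that the mask is 3 and the shift is 2:

#include <stdio.h>
#include <stdint.h>

/* assumed values, mirroring a 4-phase splitpoint group */
#define HASH_SPLITPOINT_PHASES_PER_GRP	4
#define HASH_SPLITPOINT_PHASE_BITS		2
#define HASH_SPLITPOINT_PHASE_MASK		(HASH_SPLITPOINT_PHASES_PER_GRP - 1)

int main(void)
{
	uint32_t	spare_ndx;

	for (spare_ndx = 0; spare_ndx < 16; spare_ndx++)
	{
		/* for a power-of-two divisor, & and >> match % and / exactly */
		printf("%2u: %%4=%u &3=%u  /4=%u >>2=%u\n", spare_ndx,
			   spare_ndx % HASH_SPLITPOINT_PHASES_PER_GRP,
			   spare_ndx & HASH_SPLITPOINT_PHASE_MASK,
			   spare_ndx / HASH_SPLITPOINT_PHASES_PER_GRP,
			   spare_ndx >> HASH_SPLITPOINT_PHASE_BITS);
	}
	return 0;
}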

buckets_toadd is punctuated oddly.  buckets_to_add?  Instead of
hand-calculating this, how about calculating it as
_hash_get_totalbuckets(spare_ndx) - _hash_get_totalbuckets(spare_ndx -
1)?  That way you reuse the existing logic instead of writing a
slightly different thing in a new place and maybe making a mistake.
If you're going to calculate it, use & and >> rather than % and /, as
above, and drop the parentheses around new_bucket -- this isn't a
macro definition.

+uint32  splitpoint_group = 0;

Don't need the  = 0 here; the next reference to this variable is an
unconditional initialization.

+ */
+
+splitpoint_group = HASH_SPLITPOINT_PHASE_TO_SPLITPOINT_GRP(spare_ndx);

I would delete the blank line.

- * should start from ((2 ^ i) + metap->hashm_spares[i - 1] + 1).
+ * should start from
+ * (_hash_get_totalbuckets(i) + metap->hashm_spares[i - 1] + 1).

Won't survive pgindent.

- * The number of buckets in the new splitpoint is equal to the total
- * number already in existence, i.e. new_bucket.  Currently this maps
- * one-to-one to blocks required, but someday we may need a more
- * complicated calculation here.  We treat allocation of buckets as a
- * separate WAL-logged action.  Even if we fail after this operation,
- * won't leak bucket pages; rather, the next split will consume this
- * space. In any case, even without failure we don't use all the space
- * in one split operation.
+ * The number of buckets in the new splitpoint group is equal to the
+ * total number already in existence. But we do not allocate them at
+ * once. Each splitpoint group will have 4 slots, we distribute the
+ * buckets equally among them. So we allocate only one fourth of total
+ * buckets in new splitpoint group at a time to consume one phase after
+ * another. We treat allocation of buckets as a separate WAL-logged
+ * action. Even if we fail after this operation, won't leak bucket
+ * pages; rather, the next split will consume this space. In any case,
+ * even without failure we don't use all the space in one split
+ * operation.

I think here you should break this into two paragraphs -- start a new
paragraph with the sentence that begins "We treat..."

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sequence data type

2017-03-30 Thread Vitaly Burovoy
On 3/30/17, Vitaly Burovoy  wrote:
> On 3/29/17, Vitaly Burovoy  wrote:
>> On 3/29/17, Michael Paquier  wrote:
>>> On Thu, Mar 30, 2017 at 11:18 AM, Vitaly Burovoy
>>>  wrote:
 I think min_value and max_value should not be set to "1" or "-1" but
 to real min/max of the type by default.
>>>
>>> This is the default behavior for ages, since e8647c45 to be exact. So
>>> you would change 20 years of history?
>>
>> ... is it a wrong way to keep historical minimum as "1" by
>> default: it is not a minimum of any of supported type.
>
> I've read the standard about "minvalue", "maxvalue" and "start".
> OK, I was wrong. Since "start" should be equal to "minvalue" unless
> defined explicitly, the only bug left from my first email here is
> resetting "minvalue" back to 1 when the data type changes and the value
> matches the bound of the old type (the last case there).
>
> P.S.: the same thing with "maxvalue" when "increment" is negative.

It seemed not very hard to fix.
Please find the attached patch.

I've added more tests to cover different cases of changing bounds when
data type is changed.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting huge pages on Windows

2017-03-30 Thread Tsunakawa, Takayuki
From: Amit Kapila [mailto:amit.kapil...@gmail.com]
> The latest patch looks good to me apart from one Debug message, so I have
> marked it as Ready For Committer.

Thank you so much!


> + ereport(DEBUG1,
> + (errmsg("disabling huge pages")));
> 
> I think this should be similar to what we display in sysv_shmem.c as below:
> 
> elog(DEBUG1, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m",
> allocsize);

I understood you suggested this to make the reason clear for disabling huge 
pages.  OK, done as follows.

+   elog(DEBUG1, "CreateFileMapping(%llu) with SEC_LARGE_PAGES failed "
+"due to insufficient large pages, huge pages disabled",
+size);

I hope this will be committed soon.


Regards
Takayuki Tsunakawa



win_large_pages_v10.patch
Description: win_large_pages_v10.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Generic type subscripting

2017-03-30 Thread Dmitry Dolgov
On 30 March 2017 at 19:36, Arthur Zakirov  wrote:
>
> The last point is about the tutorial. As Tom pointed it is not useful
when the tutorial doesn't execute. It happens because there is not "custom"
type in subscripting.sql.

I'm confused. Maybe I'm missing something, but there is "custom" type in
this file:

```
-- subscripting.sql

CREATE TYPE custom (
   internallength = 8,
   input = custom_in,
   output = custom_out,
   subscripting = custom_subscript_parse
);
```

```
> \i subscripting.sql
psql:subscripting.sql:39: NOTICE:  42704: type "custom" is not yet defined
DETAIL:  Creating a shell type definition.
LOCATION:  compute_return_type, functioncmds.c:141
CREATE FUNCTION
Time: 4.257 ms
psql:subscripting.sql:47: NOTICE:  42809: argument type custom is only a
shell
LOCATION:  interpret_function_parameter_list, functioncmds.c:245
CREATE FUNCTION
Time: 37.038 ms
CREATE FUNCTION
Time: 13.891 ms
CREATE FUNCTION
Time: 0.946 ms
CREATE FUNCTION
Time: 1.161 ms
CREATE TYPE
Time: 1.336 ms
CREATE TABLE
Time: 2.129 ms
INSERT 0 1
Time: 2.501 ms
 data
------
    2
(1 row)

Time: 0.960 ms
UPDATE 1
Time: 0.887 ms
```

So the only problem I see is the notification about "type 'custom' is not
yet defined", but it's the same for the "complex" tutorial

```
> \i complex.sql
psql:complex.sql:39: NOTICE:  42704: type "complex" is not yet defined
DETAIL:  Creating a shell type definition.
LOCATION:  compute_return_type, functioncmds.c:141
CREATE FUNCTION
Time: 1.741 ms
psql:complex.sql:47: NOTICE:  42809: argument type complex is only a shell
LOCATION:  interpret_function_parameter_list, functioncmds.c:245
CREATE FUNCTION
Time: 0.977 ms
psql:complex.sql:55: NOTICE:  42809: return type complex is only a shell
LOCATION:  compute_return_type, functioncmds.c:105
CREATE FUNCTION
Time: 0.975 ms
psql:complex.sql:63: NOTICE:  42809: argument type complex is only a shell
LOCATION:  interpret_function_parameter_list, functioncmds.c:245
CREATE FUNCTION
Time: 0.893 ms
CREATE TYPE
Time: 0.992 ms
...
```

Can you clarify this point?


[HACKERS] tuple-routing and constraint violation error message, revisited

2017-03-30 Thread Amit Langote
Hi,

Last message regarding this was by Robert on the original partitioning thread:

https://www.postgresql.org/message-id/CA%2BTgmoZjGzSM5WwnyapFaw3GxnDLWh7pm8Xiz8_QWQnUQy%3DSCA%40mail.gmail.com

Summary is: We decided in f1b4c771ea7 [1] to pass the original slot
(one containing the tuple formatted per the root partitioned table's
tupdesc) to ExecConstraints(), but that breaks certain cases.  Imagine
what would happen if a BR insert trigger changed the tuple - the original
slot would not contain those changes. So, it seems better to convert (if
necessary) the tuple formatted per partition tupdesc after tuple-routing
back to the root table's format and use the converted tuple to make the
val_desc shown in the message if an error occurs.

Attached is a rebased version of the patch that I had originally proposed
(the summary above is the commit message).  Robert thought it would be
fine to show the row formatted per partition rowtype, but that it would
look better if we could show the column names as well (remember that
we're trying to account for possible differences in the ordering of
columns between the root table and leaf partitions to which tuples are
routed.)

Added this to PostgreSQL 10 open items list.

Thanks,
Amit

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f1b4c77
>From fb3e65de8018b867f755fe145fe4759be5a0fb54 Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 31 Mar 2017 11:00:43 +0900
Subject: [PATCH] Fix reporting of violation in ExecConstraints, again

We decided in f1b4c771ea74f42447dccaed42ffcdcccf3aa694 to pass
the original slot (one containing the tuple formatted per the root
partitioned table's tupdesc) to ExecConstraints(), but that breaks
certain cases.  Imagine what would happen if a BR trigger changed the
tuple - the original slot would not contain those changes.
So, it seems better to convert (if necessary) the tuple formatted
per partition tupdesc after tuple-routing back to the root table's
format and use the converted tuple to make val_desc shown in the
message if an error occurs.
---
 src/backend/commands/copy.c            |  6 ++--
 src/backend/executor/execMain.c        | 53 +-
 src/backend/executor/execReplication.c |  4 +--
 src/backend/executor/nodeModifyTable.c |  7 ++---
 src/include/executor/executor.h        |  3 +-
 src/test/regress/expected/insert.out   | 21 --
 src/test/regress/sql/insert.sql        | 21 --
 7 files changed, 90 insertions(+), 25 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 0158eda591..b537f84278 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2505,8 +2505,7 @@ CopyFrom(CopyState cstate)
 
 	for (;;)
 	{
-		TupleTableSlot *slot,
-   *oldslot;
+		TupleTableSlot *slot;
 		bool		skip_tuple;
 		Oid			loaded_oid = InvalidOid;
 
@@ -2548,7 +2547,6 @@ CopyFrom(CopyState cstate)
 		ExecStoreTuple(tuple, slot, InvalidBuffer, false);
 
 		/* Determine the partition to heap_insert the tuple into */
-		oldslot = slot;
 		if (cstate->partition_dispatch_info)
 		{
 			int			leaf_part_index;
@@ -2650,7 +2648,7 @@ CopyFrom(CopyState cstate)
 /* Check the constraints of the tuple */
 if (cstate->rel->rd_att->constr ||
 	resultRelInfo->ri_PartitionCheck)
-	ExecConstraints(resultRelInfo, slot, oldslot, estate);
+	ExecConstraints(resultRelInfo, slot, estate);
 
 if (useHeapMultiInsert)
 {
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index f2995f2e7b..0f92bd49db 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1825,8 +1825,7 @@ ExecPartitionCheck(ResultRelInfo *resultRelInfo, TupleTableSlot *slot,
  */
 void
 ExecConstraints(ResultRelInfo *resultRelInfo,
-TupleTableSlot *slot, TupleTableSlot *orig_slot,
-EState *estate)
+TupleTableSlot *slot, EState *estate)
 {
 	Relation	rel = resultRelInfo->ri_RelationDesc;
 	TupleDesc	tupdesc = RelationGetDescr(rel);
@@ -1849,23 +1848,37 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
 			{
 char	   *val_desc;
 Relation	orig_rel = rel;
-TupleDesc	orig_tupdesc = tupdesc;
+TupleDesc	orig_tupdesc = RelationGetDescr(rel);
 
 /*
- * choose the correct relation to build val_desc from the
- * tuple contained in orig_slot
+ * If the tuple has been routed, it's been converted to the
+ * partition's rowtype, which might differ from the root
+ * table's.  We must convert it back to the root table's
+ * rowtype so that the val_desc shown in the error message matches the
+ * input tuple.
  */
 if (resultRelInfo->ri_PartitionRoot)
 {
+	HeapTuple	tuple = ExecFetchSlotTuple(slot);
+	TupleConversionMap	*map;
+
 	rel = resultRelInfo->ri_PartitionRoot;
 	tupdesc = RelationGetDescr(rel);
+	/* a reverse map */
+	map = convert_tuples_by_name(orig_tupdesc, tupdesc,
+gettext_noop("could not convert row type"));

Re: [HACKERS] Guidelines for GSoC student proposals / Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-03-30 Thread Mengxing Liu

> > I agree that we can make skip list a general data structure.  But
> > can we use the fixed-level skip list as a Plan B? Or a quick attempt
> > before the general data structure ?
> > Because I am not familiar with shared memory structure and tricks
> > used in it, and I cannot estimate how much time it would take.
> 
> It's not really too bad for fixed allocation shared memory, and I
> can help with that.  If I thought it would save much I could see
> doing a prototype without generalization, but you would still have
> most of the same shared memory issues, since the structure *must*
> live in shared memory.
> 

Thank you. If there is no other problem, I will submit the proposal. 
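
For concreteness, here is a rough sketch of the fixed-level layout I
have in mind - this is only my assumption of how it could look, not
code from any existing patch: with a compile-time level count every
node has the same size, so the whole structure can live in one
fixed-size shared memory allocation and links can be array indexes
instead of raw pointers.

#include <stdint.h>

#define RWCONFLICT_SKIPLIST_LEVELS 4	/* assumed fixed height */

typedef struct RWConflictSkipNode
{
	uint32_t	key;			/* ordering key, e.g. a transaction number */
	int32_t		next[RWCONFLICT_SKIPLIST_LEVELS];	/* node indexes, -1 = end */
} RWConflictSkipNode;

typedef struct RWConflictSkipList
{
	int32_t		head[RWCONFLICT_SKIPLIST_LEVELS];
	int32_t		free_head;		/* freelist of unused nodes */
	RWConflictSkipNode nodes[1];	/* pool sized once at startup */
} RWConflictSkipList;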

--
Mengxing Liu

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-30 Thread Petr Jelinek
On 30/03/17 07:25, Tom Lane wrote:
> I noticed this failure report:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dangomushi=2017-03-29%2019%3A45%3A27
> 
> in which we find
> 
> *** 
> /home/buildfarm/data/buildroot/HEAD/pgsql.build/src/test/regress/expected/updatable_views.out
>  Thu Mar 30 04:45:43 2017
> --- 
> /home/buildfarm/data/buildroot/HEAD/pgsql.build/src/test/regress/results/updatable_views.out
>   Thu Mar 30 05:32:37 2017
> ***
> *** 349,354 ****
> --- 349,358 ----
>   DROP VIEW ro_view10, ro_view12, ro_view18;
>   DROP SEQUENCE seq CASCADE;
>   NOTICE:  drop cascades to view ro_view19
> + ERROR:  deadlock detected
> + DETAIL:  Process 7576 waits for AccessShareLock on relation 1259 of 
> database 16384; blocked by process 7577.
> + Process 7577 waits for ShareRowExclusiveLock on relation 6102 of database 
> 16384; blocked by process 7576.
> + HINT:  See server log for query details.
>   -- simple updatable view
>   CREATE TABLE base_tbl (a int PRIMARY KEY, b text DEFAULT 'Unspecified');
>   INSERT INTO base_tbl SELECT i, 'Row ' || i FROM generate_series(-2, 2) g(i);
> 
> and the referenced bit of log is
> 
> [58dc19dd.1d98:175] ERROR:  deadlock detected
> [58dc19dd.1d98:176] DETAIL:  Process 7576 waits for AccessShareLock on 
> relation 1259 of database 16384; blocked by process 7577.
>   Process 7577 waits for ShareRowExclusiveLock on relation 6102 of 
> database 16384; blocked by process 7576.
>   Process 7576: DROP SEQUENCE seq CASCADE;
>   Process 7577: VACUUM FULL pg_class;
> [58dc19dd.1d98:177] HINT:  See server log for query details.
> [58dc19dd.1d98:178] STATEMENT:  DROP SEQUENCE seq CASCADE;
> 
> Of course, 1259 is pg_class and 6102 is pg_subscription_rel.
> 
> I await with interest an explanation of what "VACUUM FULL pg_class" is
> doing trying to acquire ShareRowExclusiveLock on pg_subscription_rel, not
> to mention why a DROP SEQUENCE is holding some fairly strong lock on that
> relation.  *Especially* in a situation where no subscriptions exist ---
> but even if any did, this seems unacceptable on its face.  Access to core
> catalogs like pg_class cannot depend on random other stuff.
> 

Hmm, the DROP SEQUENCE is a result of not having dependency info for
relations/subscriptions, I think. I was told during review that it's
needless bloat of the dependency catalog. I guess we should revisit that.
It's also likely that RemoveSubscriptionRel will work fine with a lower
lock level.

I have no idea why VACUUM FULL of pg_class would touch the
pg_subscription_rel though, I'll have to dig into that.

I can see that locking for example pg_trigger or pg_depend in
ShareRowExclusiveLock will also block VACUUM FULL on pg_class (and other
related tables like pg_attribute). So maybe we just need to be careful
about not taking such a strong lock...

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange parallel query behavior after OOM crashes

2017-03-30 Thread Neha Khatri
On Fri, Mar 31, 2017 at 8:29 AM, Kuntal Ghosh 
 wrote:

> On Fri, Mar 31, 2017 at 2:05 AM, Kuntal Ghosh
>  wrote:
> >
> > 1. Put an Assert(0) in ParallelQueryMain(), start server and execute
> > any parallel query.
> >  In LaunchParallelWorkers, you can see
> >nworkers = n nworkers_launched = n (n>0)
> > But, all the workers will crash because of the assert statement.
> > 2. the server restarts automatically, initialize
> > BackgroundWorkerData->parallel_register_count and
> > BackgroundWorkerData->parallel_terminate_count in the shared memory.
> > After that, it calls ForgetBackgroundWorker and it increments
> > parallel_terminate_count. In LaunchParallelWorkers, we have the
> > following condition:
> > if ((BackgroundWorkerData->parallel_register_count -
> >  BackgroundWorkerData->parallel_terminate_count) >=
> > max_parallel_workers)
> > DO NOT launch any parallel worker.
> > Hence, nworkers = n nworkers_launched = 0.
> parallel_register_count and parallel_terminate_count, both are
> unsigned integer. So, whenever the difference is negative, it'll be a
> well-defined unsigned integer and certainly much larger than
> max_parallel_workers. Hence, no workers will be launched. I've
> attached a patch to fix this.


The current explanation of active number of parallel workers is:

 * The active
 * number of parallel workers is the number of registered workers minus the
 * terminated ones.

In situations like you mentioned above, this formula can give a negative
number for active parallel workers. However, a negative number of active
parallel workers does not make any sense.

I feel it would be better to explain in the code in what situations the
formula can generate a negative result and what that means.

Regards,
Neha


[HACKERS] Something broken around FDW connection close

2017-03-30 Thread David Rowley
create table t (a int, b int);
insert into t select x/100,x/100 from generate_series(1,10) x;
create extension if not exists postgres_fdw;
create server test_server foreign data wrapper postgres_fdw options (host
'localhost', port '5432', dbname 'postgres');
create foreign table ft_t (a int,b int) server test_server;
select 'create user mapping for current_user server test_server
options(user ''' || current_user || ''');';
\gexec
select count(*) from pg_stat_Activity; -- > 6
analyze ft_t;
ERROR:  could not connect to server "test_server"
DETAIL:  FATAL:  sorry, too many clients already
CONTEXT:  Remote SQL command: DECLARE c1 CURSOR FOR SELECT a, b FROM
public.ft_t
Remote SQL command: SELECT a, b FROM public.ft_t
Remote SQL command: SELECT a, b FROM public.ft_t
Remote SQL command: SELECT a, b FROM public.ft_t
(lots of these)

select count(*) from pg_stat_Activity; --> 105

I've not had a moment to check into what's going on.
Adding to open items...

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


[HACKERS] release notes for v10

2017-03-30 Thread Robert Haas
Hi,

One of the things we'll need to do to get 10beta1 out the door is have
a set of release notes.  I spoke with Bruce Momjian today at PGCONF.US
and he told me that he has set aside time during April to write them
and will begin working on it once we reach feature freeze.  I think
this is good, because last year Bruce didn't have time and we had to
scramble to find somebody, with Tom Lane luckily stepping up to the
plate.

Bruce, please confirm that I've got the message right.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] End of CommitFest + Feature Freeze Delayed To April 7th

2017-03-30 Thread Robert Haas
Hi,

We had a brief stand-up meeting today at PGCONF.US to discuss the
proposal to delay the end of the CommitFest and feature freeze.  Those
present included 9 members of the pgsql-release team (including Peter
Eisentraut and myself as members of the RMT) and David Steele as the
current CommitFest manager.  Others present (in alphabetical order by
surname) were Joe Conway, Andrew Dunstan, Andres Freund, Stephen
Frost, Bruce Momjian, Dave Page, Simon Riggs. The unanimous sentiment
of the group was that feature freeze should be postponed by one week,
and that the end of the CommitFest should be postponed to the same
date.  On-list, on the thread entitled "Schedule and Release
Management Team for PG10", this sentiment was also endorsed by Magnus
Hagander, Alvaro Herrera, Petr Jelinek, and Tom Lane.  That's surely a
sufficient consensus to move forward with changing the date.

Accordingly, the feature freeze date is now April 7th, and no new
features for v10 should be committed after that date by anyone.

Robert Haas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] few fts functions for jsonb

2017-03-30 Thread Andrew Dunstan
On 29 March 2017 at 16:19, Dmitry Dolgov <9erthali...@gmail.com> wrote:
>> On 29 March 2017 at 18:28, Andrew Dunstan 
>> wrote:
>>
>> These patches seem fundamentally OK. But I'm still not happy with the
>> naming etc.
>
> I've changed names for all functions and action definitions, moved out the
> changes in header file to `jsonapi.h` and removed `is_jsonb_data` macro. So
> it
> should be better now.

I have just noticed as I was writing/testing the non-existent docs for
this patch that it doesn't supply variants of to_tsvector that take a
regconfig as the first argument. Is there a reason for that? Why
should the json(b) versions be different from the text versions?

cheers

andrew

-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication existing data copy

2017-03-30 Thread Erik Rijkers


(At the moment using these patches for tests:)
 0001-Reserve-global-xmin-for-create-slot-snasphot-export.patch +
 0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch+
 0003-Prevent-snapshot-builder-xmin-from-going-backwards.patch  +
 0004-Fix-xl_running_xacts-usage-in-snapshot-builder.patch  +
 0005-Skip-unnecessary-snapshot-builds.patch+
 and now (Tuesday 30) added :
 0001-Fix-remote-position-tracking-in-logical-replication.patch



I think what you have seen is because of this:
https://www.postgresql.org/message-id/flat/b235fa69-147a-5e09-f8f3-3f780a1ab...@2ndquadrant.com#b235fa69-147a-5e09-f8f3-3f780a1ab...@2ndquadrant.com



You were right: with that 6th patch (and wal_sender_timeout back at its
default 60s) there are no errors either (I tested on all 3
test-machines).


I must have missed that last patch when you posted it.  Anyway all seems 
fine now; I hope the above patches can all be committed soon.


thanks,

Erik Rijkers



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange parallel query behavior after OOM crashes

2017-03-30 Thread Kuntal Ghosh
On Fri, Mar 31, 2017 at 2:05 AM, Kuntal Ghosh
 wrote:
>
> 1. Put an Assert(0) in ParallelQueryMain(), start server and execute
> any parallel query.
>  In LaunchParallelWorkers, you can see
>nworkers = n nworkers_launched = n (n>0)
> But, all the workers will crash because of the assert statement.
> 2. the server restarts automatically, initialize
> BackgroundWorkerData->parallel_register_count and
> BackgroundWorkerData->parallel_terminate_count in the shared memory.
> After that, it calls ForgetBackgroundWorker and it increments
> parallel_terminate_count. In LaunchParallelWorkers, we have the
> following condition:
> if ((BackgroundWorkerData->parallel_register_count -
>  BackgroundWorkerData->parallel_terminate_count) >=
> max_parallel_workers)
> DO NOT launch any parallel worker.
> Hence, nworkers = n nworkers_launched = 0.
parallel_register_count and parallel_terminate_count, both are
unsigned integer. So, whenever the difference is negative, it'll be a
well-defined unsigned integer and certainly much larger than
max_parallel_workers. Hence, no workers will be launched. I've
attached a patch to fix this.
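
To see the failure mode concretely, here is a minimal standalone
illustration of the wraparound (plain C mimicking the two counters, not
the actual bgworker code):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* state after a crash-restart: both counters are reset to zero in
	 * shared memory, then ForgetBackgroundWorker bumps terminate_count */
	uint32_t	parallel_register_count = 0;
	uint32_t	parallel_terminate_count = 1;
	uint32_t	max_parallel_workers = 8;

	/* unsigned subtraction wraps around instead of going negative */
	uint32_t	active = parallel_register_count - parallel_terminate_count;

	printf("active = %u\n", active);	/* 4294967295, not -1 */
	if (active >= max_parallel_workers)
		printf("=> no parallel workers will ever be launched\n");
	return 0;
}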



-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


fix-workers-launched.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Anastasia Lubennikova

On 30.03.2017 19:49, Robert Haas wrote:
> On Thu, Mar 30, 2017 at 11:26 AM, Teodor Sigaev  wrote:
>> I had a look at the patch and played with it; it seems to look fine. I
>> split it into two patches: core changes (+bloom index fix) and btree
>> itself. All docs are left in the first patch - I'm too lazy to rewrite
>> documentation which is changed in the second patch.
>> Any objection from reviewers to push both patches?
>
> Has this really had enough review and testing?  The last time it was
> pushed, it didn't go too well.  And laziness is not a very good excuse
> for not dividing up patches properly.


Well,
I don't know how we can estimate the quality of the review or testing.
The patch was reviewed by many people.
Here are those who marked themselves as reviewers on this and previous
commitfests: Stephen Frost (sfrost), Andrew Dunstan (adunstan),
Aleksander Alekseev (a.alekseev), Amit Kapila (amitkapila), Andrey
Borodin (x4m), Peter Geoghegan (pgeoghegan), David Rowley (davidrowley).

For me it looks serious enough. These people, as well as many others,
shared their thoughts on this topic and pointed out various mistakes.
I fixed all the issues as soon as I could. And I'm not going to
disappear when it is committed. Personally, I always thought that
we have Alpha and Beta releases for integration testing.


Speaking of the feature itself, it has been included in our fork of
PostgreSQL 9.6 since that was released.
And as far as I know, there have been no complaints from users. That
makes me believe that there are no critical bugs there.

There may, however, be conflicts with some other features of v10.0.


> It seems highly surprising to me that CheckIndexCompatible() only gets
> a one line change in this patch.  That seems unlikely to be correct.

What makes you think so? CheckIndexCompatible() only cares about
possible opclass changes.
For covering indexes, opclasses are only applicable to indnkeyatts. And
that is exactly what was changed in this patch.

Do you think it needs some other changes?


> Has anybody done some testing of this patch with the WAL consistency
> checker?  Like, create some tables with indexes that have INCLUDE
> columns, set up a standby, enable consistency checking, pound the
> master, and see if the standby bails?

Good point. I missed this feature; I wish someone had mentioned this
issue a bit earlier.

And as Alexander's test shows, there is some problem with my patch, indeed.
I'll fix it and send an updated patch.


> Has anybody tested this patch with amcheck?  Does it break amcheck?

Yes, it breaks amcheck. Amcheck should be patched in order to work with
covering indexes.

We've discussed it with Peter before and I even wrote a small patch.
I'll attach it in the following message.


> A few minor comments:
>
> -foreach(lc, constraint->keys)
> +else foreach(lc, constraint->keys)
>
> That doesn't look like a reasonable way of formatting the code.
>
> +/* Here is some code duplication. But we do need it. */
>
> That is not a very informative comment.
>
> +* NOTE It is not crutial for reliability in present,
>
> Spelling, punctuation.

Will be fixed as well.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical decoding on standby

2017-03-30 Thread Andres Freund
On 2017-03-30 19:40:08 +0100, Simon Riggs wrote:
> On 30 March 2017 at 18:16, Andres Freund  wrote:
> 
> >>  /*
> >>   * Each page of XLOG file has a header like this:
> >>   */
> >> -#define XLOG_PAGE_MAGIC 0xD097   /* can be used as WAL version 
> >> indicator */
> >> +#define XLOG_PAGE_MAGIC 0xD100   /* can be used as WAL version 
> >> indicator */
> >
> > We normally only advance this by one, it's not tied to the poistgres 
> > version.
> 
> That was my addition. I rounded it up cos this is release 10. No biggie.

We'll probably upgrade that more than once again this release...


> (Poistgres? Is that the Manhattan spelling?)

Tiredness spelling ;)


> We've been redesigning the mechanisms for 2 years now, so it seems
> unlikely that further redesign can be required.

I don't think that's true *at all* - the mechanism was previously
fundamentally different.

The whole topic has largely seen activity shortly before the code
freeze, both last time round and now.  I don't think it's surprising
that it thus doesn't end up being ready.


> If it is required,
> this patch is fairly low touch and change is possible later,
> incremental development etc. As regards overhead, this adds a small
> amount of time to a background process executed every 10 secs,
> generates no new WAL records.
> 
> So I don't see any reason not to commit this feature, after the minor
> corrections.

It doesn't have any benefit on its own, and the locking model doesn't seem
fully there.  I don't see much reason to get this in before the release.


- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] REFERENCES privilege should not be symmetric (was Re: [GENERAL] Postgres Permissions Article)

2017-03-30 Thread Tom Lane
Paul Jungwirth  writes:
>> Also I don't understand why you wrote “You need the permission on both
>> tables”: Only the owner of a table can add constraints to it

> Ah, this piece was really helpful for me in making it click. Thanks so 
> much! I added a couple new paragraphs to my post with a link back to 
> this thread. I feel like it all makes sense now! :-)

> FYI "You need this permission on both tables" is what the docs say 
> (https://www.postgresql.org/docs/9.6/static/sql-grant.html):

>>> To create a foreign key constraint, it is necessary to have this 
>>> privilege on both the referencing and referenced columns.

> Maybe it would be worth clarifying there that you need to *own* the 
> referencing table, and you need REFERENCES on the referenced table?

Hmm ... interesting.  A bit of excavating in tablecmds.c shows that
in order to do ADD FOREIGN KEY, you need to be owner of the table
the constraint is being attached to (checked by ATSimplePermissions,
which is used for AT_AddConstraint by ATPrepCmd), *and* you need
REFERENCES on both tables, or at least on the columns involved in
the proposed FK constraint (checked by checkFkeyPermissions, which
is invoked against each of the tables by ATAddForeignKeyConstraint).

So yeah, this seems a little bit redundant.  In principle, a table owner
could revoke her own REFERENCES permissions on the table and thereby
disable creation of FKs leading out of it, but it'd be pretty unusual
to want to do so.

Moreover, this definition seems neither intuitive (REFERENCES doesn't
seem like it should be symmetric) nor compliant with the SQL standard.
In SQL:2011 section 11.8 &lt;referential constraint definition&gt; I read

Access Rules
1) The applicable privileges for the owner of T shall include REFERENCES
for each referenced column.

(T is the referenced table.)  I see nothing suggesting that the command
requires REFERENCES privilege on the referencing table.  Now this is a
little garbled, because surely they meant the owner of the referencing
table (who is issuing the command) not the owner of the referenced table,
but I think the intent is clear enough.

In short, it seems like this statement in the docs is correctly describing
our code's behavior, but said behavior is wrong and should be changed.
I'd propose fixing it like that in HEAD; I'm not sure if the back branches
should also be changed.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange parallel query behavior after OOM crashes

2017-03-30 Thread Kuntal Ghosh
On Fri, Mar 31, 2017 at 12:32 AM, Thomas Munro
 wrote:
> On Fri, Mar 31, 2017 at 7:38 AM, Tomas Vondra
>  wrote:
>> Hi,
>>
>> While doing some benchmarking, I've ran into a fairly strange issue with OOM
>> breaking LaunchParallelWorkers() after the restart. What I see happening is
>> this:
>>
>> 1) a query is executed, and at the end of LaunchParallelWorkers we get
>>
>> nworkers=8 nworkers_launched=8
>>
>> 2) the query does a Hash Aggregate, but ends up eating much more memory due
>> to n_distinct underestimate (see [1] from 2015 for details), and gets killed
>> by OOM
>>
>> 3) the server restarts, the query is executed again, but this time we get in
>> LaunchParallelWorkers
>>
>> nworkers=8 nworkers_launched=0
>>
>> There's nothing else running on the server, and there definitely should be
>> free parallel workers.
>>
>> 4) The query gets killed again, and on the next execution we get
>>
>> nworkers=8 nworkers_launched=8
>>
>> again, although not always. I wonder whether the exact impact depends on OOM
>> killing the leader or worker, for example.
>
> I don't know what's going on but I think I have seen this once or
> twice myself while hacking on test code that crashed.  I wonder if the
> DSM_CREATE_NULL_IF_MAXSEGMENTS case could be being triggered because
> the DSM control is somehow confused?
>
I think I've run into the same problem while working on parallelizing
plans containing InitPlans. You can reproduce that scenario by the
following steps:

1. Put an Assert(0) in ParallelQueryMain(), start server and execute
any parallel query.
 In LaunchParallelWorkers, you can see
   nworkers = n nworkers_launched = n (n>0)
But, all the workers will crash because of the assert statement.
2. the server restarts automatically, initialize
BackgroundWorkerData->parallel_register_count and
BackgroundWorkerData->parallel_terminate_count in the shared memory.
After that, it calls ForgetBackgroundWorker and it increments
parallel_terminate_count. In LaunchParallelWorkers, we have the
following condition:
if ((BackgroundWorkerData->parallel_register_count -
 BackgroundWorkerData->parallel_terminate_count) >=
max_parallel_workers)
DO NOT launch any parallel worker.
Hence, nworkers = n nworkers_launched = 0.

I thought the parallel worker was crashing because of my stupid
mistake, so this was supposed to happen. Sorry for that.

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Reduce src/test/recovery verbosity

2017-03-30 Thread Stephen Frost
Tom, all,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > I'd be fine with removing --verbose globally, as your patch does, but
> > there was some argument that we then would have long 'quiet' periods.
> > I haven't had a chance to go test if that's really the case yet though.
> 
> I've been running it like this lately:
> 
> make -s check-world -j4 PROVE_FLAGS='-j4 --quiet --nocolor --nocount'
> 
> and it is still pretty noisy ;-).  The only place where it sits for
> more than a couple of seconds without printing anything is at the very
> start, which I believe to be the initial "make temp-install" step,
> which would be unaffected by prove verbosity anyway.
> 
> So I'd be +1 for just removing --verbose by default.  Anybody who really
> wants it can put it back via PROVE_FLAGS.

Yeah, I'm feeling the same way on this, and it'd reduce the size of the
logs on the buildfarm too, which I believe is a good thing.

Unless people wish to object, I'll use Michael's patch to remove
--verbose from the top level tomorrow.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [COMMITTERS] pgsql: Default monitoring roles

2017-03-30 Thread Tom Lane
I wrote:
> Simon Riggs  writes:
>> Weird. make check-world just skipped that directory. I guess for Dave also.

> If you just did check-world, it probably didn't build contrib modules that
> don't have tests, because the "check" target wouldn't do anything without
> tests to run.

I wondered why this is --- wouldn't "check" depend on "all", so that
things get built, even if there's no tests to run?

The answer is, no it doesn't.  "check" depends on "temp-install", and
then temp-install indirectly depends on "all".  So when you say
"make check-world", the reason that any code gets built at all is that
we're trying to install all the executables into the temporary
installation.  However, the basic temp-install target only installs
(and therefore only forces building of) the core code.  In a contrib
subdirectory, what is supposed to happen is that this line in
pgxs.mk causes that contrib subdirectory to also get built/installed:

temp-install: EXTRA_INSTALL+=$(subdir)

But if you look around a bit further, you discover that that line is
inside an "ifdef REGRESS" block --- therefore, unless the module
defines at least one REGRESS-style test, it's not built by "make check".

I thought that the attached patch would be a narrow fix for this,
just moving that dependency out of the "ifdef REGRESS" block.
However, it seems to send "make" into some infinite loop, at
least with make 3.81 which I'm using here.  Not sure why.

I also experimented with just changing "check: temp-install"
in Makefile.global to be "check: all temp-install", and that
seemed to work, but since I don't understand why it wasn't
like that already, I'm a bit afraid of that solution.

regards, tom lane

diff --git a/src/makefiles/pgxs.mk b/src/makefiles/pgxs.mk
index c27004e..c156806 100644
*** a/src/makefiles/pgxs.mk
--- b/src/makefiles/pgxs.mk
*************** check:
*** 282,291 ****
  else
  check: submake $(REGRESS_PREP)
  	$(pg_regress_check) $(REGRESS_OPTS) $(REGRESS)
  
  temp-install: EXTRA_INSTALL+=$(subdir)
  endif
- endif # REGRESS
  
  
  # STANDARD RULES
--- 282,293 ----
  else
  check: submake $(REGRESS_PREP)
  	$(pg_regress_check) $(REGRESS_OPTS) $(REGRESS)
+ endif
+ endif # REGRESS
  
+ ifndef PGXS
  temp-install: EXTRA_INSTALL+=$(subdir)
  endif
  
  
  # STANDARD RULES

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New CORRESPONDING clause design

2017-03-30 Thread Pavel Stehule
2017-03-30 21:43 GMT+02:00 Tom Lane :

> Pavel Stehule  writes:
> > Is the following use case defined in the standard?
>
> > postgres=# SELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3
> >UNION ALL CORRESPONDING BY(a,b) SELECT 4 AS b, 0 AS x4, 3 AS
> a,
> > 0 AS x6, -1 AS x6
> >UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS a, -100 AS aa;
> > ┌───┐
> > │ a │
> > ╞═══╡
> > │ 1 │
> > │ 3 │
> > │ 6 │
> > └───┘
> > (3 rows)
>
> > It depends on the order of implementation:
>
> > if we do (T1 U T2) U T3 ---> then the result is correct,
> > but if we do T1 U (T2 U T3) ---> then it should fail
>
> UNION ALL should associate left-to-right, just like most other binary
> operators, so this looks fine to me.  Did you check that you get an
> error if you put in parens to force the other order?
>

yes - it fails

 postgres=# SELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3
UNION ALL CORRESPONDING BY(a,b) (SELECT 4 AS b, 0 AS x4, 3 AS a, 0 AS x6,
-1 AS x6 UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS a, -100 AS aa);
ERROR:  column name "b" can not be used in CORRESPONDING BY list
LINE 1: ...b, 0 AS x3, -1 AS x3 UNION ALL CORRESPONDING BY(a,b) (SELECT...
 ^
HINT:  UNION queries with a CORRESPONDING BY clause must contain column
names from both tables.
Time: 1,135 ms

Regards

Pavel

>
> regards, tom lane
>


Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-03-30 Thread Pavel Stehule
2017-03-29 20:11 GMT+02:00 Jan Michálek :

>
>
> 2017-03-27 19:41 GMT+02:00 Jan Michálek :
>
>>
>>
>> 2017-03-23 17:26 GMT+01:00 Pierre Ducroquet :
>>
>>> The following review has been posted through the commitfest application:
>>> make installcheck-world:  tested, passed
>>> Implements feature:   tested, passed
>>> Spec compliant:   tested, passed
>>> Documentation:tested, passed
>>>
>>> Hi
>>>
>>> This is my first review (Magnus said in his presentation at PGDay Paris
>>> that volunteers should just come and help, so here I am), so please notify
>>> me of any mistakes I make when using the review tools...
>>>
>>> The feature seems to work as expected, but I don't claim to be a
>>> markdown and rst expert.
>>> Some minor issues with the code itself :
>>> - some indentation issues (documentation and code itself with mix
>>> between space based and tab based indentation) and a few trailing spaces in
>>> code
>>>
>>
>> corrected
>>
>>
>>> - typographic issues in the documentation :
>>>   - "The html, asciidoc, latex, latex-longtable, troff-ms, and markdown
>>> and rst formats" ==> duplicated and
>>>
>>
>> corrected
>>
>>>   - "Sets the output format to one of unaligned, aligned, wrapped, html,
>>> asciidoc, latex (uses tabular), latex-longtable, rst, markdown, or
>>> troff-ms." ==> extra comma at the end of the list
>>> - the comment " dont add line after last row, because line is added
>>> after every row" is misleading, it should warn that it's only for rst
>>> - there is a block of commented out code left
>>> - in the print_aligned_vertical function, there is a mix between
>>> "cont->opt->format == PRINT_RST" and "format == _rst" and I don't see
>>> any obvious reason for that
>>>
>> corrected
>>
>>> - the documentation doesn't mention (but ok, it's kind of obvious) that
>>> the linestyle option will not work with rst and markdown
>>>
>>>
>> In this patch the indentation issues are corrected (I hope - I had the
>> correct changes in vimrc). Please look at whether it is OK (I mean the
>> indentation) and at some minor errors. And it should work on current
>> master (probably).
>>
>
> Added \x option for markdown.
> In markdown, multiline cells now work (each newline in a cell is replaced
> by a line-break tag).
> Regression tests passed.
>

\pset format rst
\x
select 10
crashes with a segfault:

Program received signal SIGSEGV, Segmentation fault.
0x7f77673a866c in vfprintf () from /lib64/libc.so.6
(gdb) bt
#0  0x7f77673a866c in vfprintf () from /lib64/libc.so.6
#1  0x7f77673b1574 in fprintf () from /lib64/libc.so.6
#2  0x00437bc5 in print_aligned_vertical (cont=0x7fffade43da0,
fout=,
is_pager=) at print.c:1755
#3  0x0043a70d in printTable (cont=cont@entry=0x7fffade43da0,
fout=,
fout@entry=0x7f77677255e0 <_IO_2_1_stdout_>, is_pager=,
is_pager@entry=0 '\000',
flog=flog@entry=0x0) at print.c:3466
#4  0x0043c37f in printQuery (result=result@entry=0x9c4b60,
opt=opt@entry=0x7fffade43f00,
fout=0x7f77677255e0 <_IO_2_1_stdout_>, is_pager=is_pager@entry=0
'\000', flog=0x0) at print.c:3551
#5  0x0040da6d in PrintQueryTuples (results=0x9c4b60) at
common.c:808
#6  PrintQueryResults (results=0x9c4b60) at common.c:1140
#7  SendQuery (query=0x9c1700 "select 10;") at common.c:1317
#8  0x0041c3d4 in MainLoop (source=0x7f77677248a0 <_IO_2_1_stdin_>)
at mainloop.c:319
#9  0x00405d5d in main (argc=, argv=)
at startup.c:396
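
A crash inside vfprintf called from fprintf is the classic sign that the
format string itself was NULL, presumably some per-format decoration
string that the new rst code never initializes. A minimal standalone
sketch of that assumed failure mode, with the kind of guard print.c would
need (the names here are made up):

#include <stdio.h>

int main(void)
{
    /* stand-in for a per-format string a new output format leaves unset;
     * fprintf(stdout, fmt) with fmt == NULL crashes inside vfprintf,
     * matching the backtrace above */
    const char *fmt = NULL;

    if (fmt == NULL)            /* guard of the kind print.c would need */
        fmt = "";
    fprintf(stdout, "%s", fmt);
    return 0;
}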

Regards

Pavel


>
>
> Jan
>
>
>
>>
>> Have nice day
>>
>> Jan
>>
>>
>>> Thanks !
>>>
>>> The new status of this patch is: Waiting on Author
>>>
>>> --
>>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-hackers
>>>
>>
>>
>>
>> --
>> Jelen
>> Starší čeledín datovýho chlíva
>>
>
>
>
> --
> Jelen
> Starší čeledín datovýho chlíva
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


Re: [HACKERS] New CORRESPONDING clause design

2017-03-30 Thread Tom Lane
Pavel Stehule  writes:
> Is the following use case defined in the standard?

> postgres=# SELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3
>UNION ALL CORRESPONDING BY(a,b) SELECT 4 AS b, 0 AS x4, 3 AS a,
> 0 AS x6, -1 AS x6
>UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS a, -100 AS aa;
> ┌───┐
> │ a │
> ╞═══╡
> │ 1 │
> │ 3 │
> │ 6 │
> └───┘
> (3 rows)

> It depends on the order of implementation:

> if we do (T1 U T2) U T3 ---> then the result is correct,
> but if we do T1 U (T2 U T3) ---> then it should fail

UNION ALL should associate left-to-right, just like most other binary
operators, so this looks fine to me.  Did you check that you get an
error if you put in parens to force the other order?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Monitoring roles patch

2017-03-30 Thread Dave Page
On Thu, Mar 30, 2017 at 2:24 PM, Simon Riggs  wrote:
> On 30 March 2017 at 18:29, Simon Riggs  wrote:
>
>> Moving to commit this over the next hour. Last chance...
>
> Done. Great work Dave, thanks everybody.

Thanks Simon.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Aleksander Alekseev
Hi Robert,

> Has anybody done some testing of this patch with the WAL consistency
> checker?  Like, create some tables with indexes that have INCLUDE
> columns, set up a standby, enable consistency checking, pound the
> master, and see if the standby bails?

I've decided to run such a test. It looks like there is a bug indeed.

Steps to reproduce:

0. Apply a patch.
1. Build PostgreSQL using quick-build.sh [1]
2. Install master and replica using install.sh [2]
3. Download test.sql [3]
4. Run: `cat test.sql | psql`
5. In replica's logfile:

```
FATAL:  inconsistent page found, rel 1663/16384/16396, forknum 0, blkno 1
```

> Has anybody tested this patch with amcheck?  Does it break amcheck?

Amcheck doesn't complain.

[1] https://github.com/afiskon/pgscripts/blob/master/quick-build.sh
[2] https://github.com/afiskon/pgscripts/blob/master/install.sh
[3] http://afiskon.ru/s/88/93c544e6cf_test.sql

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: [HACKERS] New CORRESPONDING clause design

2017-03-30 Thread Pavel Stehule
Hi

2017-03-30 13:11 GMT+02:00 Surafel Temesgen :

> hi
>
> Thank you very much for your help .
> here is the patch fix that issue as you suggest
>

The crash is fixed

I did a rebase + a few more regression tests.

Is the following use case defined in the standard?

postgres=# SELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3
   UNION ALL CORRESPONDING BY(a,b) SELECT 4 AS b, 0 AS x4, 3 AS a,
0 AS x6, -1 AS x6
   UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS a, -100 AS aa;
┌───┐
│ a │
╞═══╡
│ 1 │
│ 3 │
│ 6 │
└───┘
(3 rows)

It depends on the order of implementation:

if we do (T1 U T2) U T3 ---> then the result is correct,
but if we do T1 U (T2 U T3) ---> then it should fail

I am not sure whether this result is expected (correct). I would rather
expect a syntax error, because the CORRESPONDING BY list is not satisfied.


>
> Regards
>
> Surafel
>
>
> On Tue, Mar 28, 2017 at 5:44 PM, Pavel Stehule 
> wrote:
>
>>
>>
>> 2017-03-28 14:18 GMT+02:00 Pavel Stehule :
>>
>>>
>>>
>>> 2017-03-28 13:58 GMT+02:00 Surafel Temesgen :
>>>
 can you help with fixing it, Pavel?

>>>
>>> There must be some new preanalyze stage - you have to know the result
>>> columns before you start the analyze phase
>>>
>>
>> maybe some recheck after the analyze stage to remove invalid columns
>> would be good enough.
>>
>> Regards
>>
>> Pavel
>>
>>
>>>
>>> Regards
>>>
>>> Pavel
>>>

 On Mon, Mar 27, 2017 at 11:48 AM, Pavel Stehule <
 pavel.steh...@gmail.com> wrote:

> Hi
>
> fresh update - I enhanced the Value node with a location field, as per Tom's proposal.
>
> A few more regression tests.
>
> But I found a significant issue that needs a bigger fix - Surafel,
> please, can you fix it?
>
> It crash on
>
> SELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3
> UNION ALL CORRESPONDING SELECT 4 AS b, 0 AS x4, 3 AS a, 0 AS x6, -1 AS
> x6
> UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS b, -100 AS x9;
>
> I'll mark this patch as waiting on author
>
> Regards
>
> Pavel
>
>
>

>>>
>>
>
diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 30792f45f1..2d60718ff1 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -1601,6 +1601,9 @@ SELECT DISTINCT ON (expression , EXCEPT
   
   
+   CORRESPONDING
+  
+  
set union
   
   
@@ -1617,9 +1620,9 @@ SELECT DISTINCT ON (expression , 
-query1 UNION ALL query2
-query1 INTERSECT ALL query2
-query1 EXCEPT ALL query2
+query1 UNION ALL CORRESPONDING BY query2
+query1 INTERSECT ALL CORRESPONDING BY query2
+query1 EXCEPT ALL CORRESPONDING BY query2
 
query1 and
query2 are queries that can use any of
@@ -1659,14 +1662,31 @@ SELECT DISTINCT ON (expression , 
 
   
-   In order to calculate the union, intersection, or difference of two
-   queries, the two queries must be union compatible,
-   which means that they return the same number of columns and
-   the corresponding columns have compatible data types, as
-   described in .
+   EXCEPT returns all rows that are in the result of
+   query1 but not in the result of
+   query2.  (This is sometimes called the
+   difference between two queries.)  Again, duplicates
+   are eliminated unless EXCEPT ALL is used.
   
- 
 
+  
+   CORRESPONDING returns all columns that are in both 
+   query1 and query2 with the same name.
+  
+
+  
+   CORRESPONDING BY returns all columns in the column list 
+   that are also in both query1 and 
+   query2 with the same name. The names in column list
+   must be unique.
+  
+
+  
+   The names of columns in result when CORRESPONDING or
+   CORRESPONDING BY clause is used must be unique in
+   query1 and query2.
+  
+ 
 
  
   Sorting Rows
diff --git a/doc/src/sgml/sql.sgml b/doc/src/sgml/sql.sgml
index 57396d7c24..f98c22e696 100644
--- a/doc/src/sgml/sql.sgml
+++ b/doc/src/sgml/sql.sgml
@@ -859,7 +859,7 @@ SELECT [ ALL | DISTINCT [ ON ( expressioncondition ]
 [ GROUP BY expression [, ...] ]
 [ HAVING condition [, ...] ]
-[ { UNION | INTERSECT | EXCEPT } [ ALL ] select ]
+[ { UNION | INTERSECT | EXCEPT } [ ALL ] [ CORRESPONDING [ BY ( expression ) ] ] select ]
 [ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
 [ LIMIT { count | ALL } ]
 [ OFFSET start ]
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 1c88d601bd..11e0590eec 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2995,6 +2995,7 @@ _copySelectStmt(const SelectStmt *from)
 	COPY_NODE_FIELD(withClause);
 	COPY_SCALAR_FIELD(op);
 	COPY_SCALAR_FIELD(all);
+	COPY_NODE_FIELD(correspondingClause);
 	COPY_NODE_FIELD(larg);
 	COPY_NODE_FIELD(rarg);
 
@@ -3010,6 +3011,8 @@ _copySetOperationStmt(const SetOperationStmt *from)
 	COPY_SCALAR_FIELD(all);
 	COPY_NODE_FIELD(larg);
 	COPY_NODE_FIELD(rarg);
+	

Re: [HACKERS] [COMMITTERS] pgsql: Default monitoring roles

2017-03-30 Thread Tom Lane
Simon Riggs  writes:
> On 30 March 2017 at 19:31, Erik Rijkers  wrote:
>> The buildfarm is showing red (the same errors that I get...):
>> pgrowlocks.c: In function ‘pgrowlocks’:
>> pgrowlocks.c:105:65: error: expected ‘)’ before ‘;’ token
>> is_member_of_role(GetUserId(), DEFAULT_ROLE_STAT_SCAN_TABLES);

> Weird. make check-world just skipped that directory. I guess for Dave also.

If you just did check-world, it probably didn't build contrib modules that
don't have tests, because the "check" target wouldn't do anything without
tests to run.

AFAICS, there isn't any top-level make target that will build all of
contrib except "make world", which also builds the docs and therefore
isn't very fast.  I wonder if we should create some more convenient
testing target that builds all code but not the docs.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Andres Freund
On 2017-03-30 18:26:05 +0300, Teodor Sigaev wrote:
> Any objection from reviewers to push both patches?


> diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
> index f2eda67..59029b9 100644
> --- a/contrib/bloom/blutils.c
> +++ b/contrib/bloom/blutils.c
> @@ -120,6 +120,7 @@ blhandler(PG_FUNCTION_ARGS)
>   amroutine->amclusterable = false;
>   amroutine->ampredlocks = false;
>   amroutine->amcanparallel = false;
> + amroutine->amcaninclude = false;

That name doesn't strike me as very descriptive.


> +  INCLUDE
> +  
> +   
> +An optional INCLUDE clause allows a list of columns to be
> +specified which will be included in the non-key portion of the index.
> +Columns which are part of this clause cannot also exist in the
> +key columns portion of the index, and vice versa. The
> +INCLUDE columns exist solely to allow more queries to 
> benefit
> +from index-only scans by including certain columns in 
> the
> +index, the value of which would otherwise have to be obtained by 
> reading
> +the table's heap. Having these columns in the INCLUDE 
> clause
> +in some cases allows PostgreSQL to skip the heap read
> +completely. This also allows UNIQUE indexes to be 
> defined on
> +one set of columns, which can include another set of columns in the
> +   INCLUDE clause, on which the uniqueness is not enforced.
> +It's the same with other constraints (PRIMARY KEY and EXCLUDE). This 
> can
> +also can be used for non-unique indexes as any columns which are not 
> required
> +for the searching or ordering of records can be used in the
> +INCLUDE clause, which can slightly reduce the size of 
> the index.
> +Currently, only the B-tree access method supports this feature.
> +Expressions as included columns are not supported since they cannot 
> be used
> +in index-only scans.
> +   
> +  
> + 

This could use some polishing.


> +/*
> + * Reform index tuple. Truncate nonkey (INCLUDE) attributes.
> + */
> +IndexTuple
> +index_truncate_tuple(Relation idxrel, IndexTuple olditup)
> +{
> + TupleDesc   itupdesc = RelationGetDescr(idxrel);
> + Datum   values[INDEX_MAX_KEYS];
> + boolisnull[INDEX_MAX_KEYS];
> + IndexTuple  newitup;
> + int indnatts = IndexRelationGetNumberOfAttributes(idxrel);
> + int indnkeyatts = IndexRelationGetNumberOfKeyAttributes(idxrel);
> +
> + Assert(indnatts <= INDEX_MAX_KEYS);
> + Assert(indnkeyatts > 0);
> + Assert(indnkeyatts < indnatts);
> +
> + index_deform_tuple(olditup, itupdesc, values, isnull);
> +
> + /* form new tuple that will contain only key attributes */
> + itupdesc->natts = indnkeyatts;
> + newitup = index_form_tuple(itupdesc, values, isnull);
> + newitup->t_tid = olditup->t_tid;
> +
> + itupdesc->natts = indnatts;

Uh, isn't this a *seriously* bad idea?  If index_form_tuple errors out,
this'll corrupt the tuple descriptor.
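
To illustrate the hazard outside PostgreSQL: if the error path longjmps
(as elog(ERROR) effectively does) past the line that restores natts, the
shared descriptor stays truncated. A standalone sketch with made-up names:

#include <setjmp.h>
#include <stdio.h>

static jmp_buf err_ctx;

struct desc { int natts; };
static struct desc d = { 5 };       /* stands in for the shared tupledesc */

static void might_error(int fail)
{
    if (fail)
        longjmp(err_ctx, 1);        /* stand-in for elog(ERROR) */
}

static void truncate_tuple(int keyatts, int fail)
{
    int saved = d.natts;

    d.natts = keyatts;              /* temporary mutation of shared state */
    might_error(fail);              /* an error jumps straight out ... */
    d.natts = saved;                /* ... so this restore never runs */
}

int main(void)
{
    if (setjmp(err_ctx) == 0)
        truncate_tuple(2, 1);

    /* prints 2, not 5: the descriptor is left corrupted after the error */
    printf("natts after error: %d\n", d.natts);
    return 0;
}

Wrapping the mutation in PG_TRY/PG_CATCH, or building a separate truncated
TupleDesc instead, would avoid leaving the cached descriptor in that state.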


Maybe also rename the function to index_build_key_tuple()?

>   * Construct a string describing the contents of an index entry, in the
>   * form "(key_name, ...)=(key_value, ...)".  This is currently used
> - * for building unique-constraint and exclusion-constraint error messages.
> + * for building unique-constraint and exclusion-constraint error messages,
> + * so only key columns of index are checked and printed.

s/index/the index/


> @@ -368,7 +370,7 @@ systable_beginscan(Relation heapRelation,
>   {
>   int j;
>  
> - for (j = 0; j < irel->rd_index->indnatts; j++)
> + for (j = 0; j < 
> IndexRelationGetNumberOfAttributes(irel); j++)

>   {
>   if (key[i].sk_attno == 
> irel->rd_index->indkey.values[j])
>   {
> @@ -376,7 +378,7 @@ systable_beginscan(Relation heapRelation,
>   break;
>   }
>   }
> - if (j == irel->rd_index->indnatts)
> + if (j == IndexRelationGetNumberOfAttributes(irel))
>   elog(ERROR, "column is not in index");
>   }

Not that it matters overly much, but why are we doing this for all
attributes, rather than just key attributes?


> --- a/src/backend/bootstrap/bootstrap.c
> +++ b/src/backend/bootstrap/bootstrap.c
> @@ -600,7 +600,7 @@ boot_openrel(char *relname)
>relname, (int) ATTRIBUTE_FIXED_PART_SIZE);
>  
>   boot_reldesc = heap_openrv(makeRangeVar(NULL, relname, -1), NoLock);
> - numattr = boot_reldesc->rd_rel->relnatts;
> + numattr = RelationGetNumberOfAttributes(boot_reldesc);
>   for (i = 0; i < numattr; i++)
>   {
>   if (attrtypes[i] == NULL)

That seems a bit unrelated.


> @@ 

Re: [HACKERS] strange parallel query behavior after OOM crashes

2017-03-30 Thread Thomas Munro
On Fri, Mar 31, 2017 at 7:38 AM, Tomas Vondra
 wrote:
> Hi,
>
> While doing some benchmarking, I've run into a fairly strange issue with OOM
> breaking LaunchParallelWorkers() after the restart. What I see happening is
> this:
>
> 1) a query is executed, and at the end of LaunchParallelWorkers we get
>
> nworkers=8 nworkers_launched=8
>
> 2) the query does a Hash Aggregate, but ends up eating much more memory due
> to n_distinct underestimate (see [1] from 2015 for details), and gets killed
> by OOM
>
> 3) the server restarts, the query is executed again, but this time we get in
> LaunchParallelWorkers
>
> nworkers=8 nworkers_launched=0
>
> There's nothing else running on the server, and there definitely should be
> free parallel workers.
>
> 4) The query gets killed again, and on the next execution we get
>
> nworkers=8 nworkers_launched=8
>
> again, although not always. I wonder whether the exact impact depends on OOM
> killing the leader or worker, for example.

I don't know what's going on but I think I have seen this once or
twice myself while hacking on test code that crashed.  I wonder if the
DSM_CREATE_NULL_IF_MAXSEGMENTS case could be being triggered because
the DSM control is somehow confused?

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical decoding on standby

2017-03-30 Thread Simon Riggs
On 30 March 2017 at 18:16, Andres Freund  wrote:

>>  /*
>>   * Each page of XLOG file has a header like this:
>>   */
>> -#define XLOG_PAGE_MAGIC 0xD097   /* can be used as WAL version 
>> indicator */
>> +#define XLOG_PAGE_MAGIC 0xD100   /* can be used as WAL version 
>> indicator */
>
> We normally only advance this by one, it's not tied to the poistgres version.

That was my addition. I rounded it up cos this is release 10. No biggie.

(Poistgres? Is that the Manhattan spelling?)

> I'm sorry, but to me this patch isn't ready.  I'm also doubtful that it
> makes a whole lot of sense on its own, without having finished the
> design for decoding on a standby - it seems quite likely that we'll have
> to redesign the mechanisms in here a bit for that.  For 10 this seems to
> do nothing but add overhead?

I'm sure we can reword the comments.

We've been redesigning the mechanisms for 2 years now, so it seems
unlikely that further redesign can be required. If it is required,
this patch is fairly low touch and change is possible later,
incremental development etc. As regards overhead, this adds a small
amount of time to a background process executed every 10 secs, and
generates no new WAL records.

So I don't see any reason not to commit this feature, after the minor
corrections.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] strange parallel query behavior after OOM crashes

2017-03-30 Thread Tomas Vondra

Hi,

While doing some benchmarking, I've run into a fairly strange issue with 
OOM breaking LaunchParallelWorkers() after the restart. What I see 
happening is this:


1) a query is executed, and at the end of LaunchParallelWorkers we get

nworkers=8 nworkers_launched=8

2) the query does a Hash Aggregate, but ends up eating much more memory 
due to n_distinct underestimate (see [1] from 2015 for details), and 
gets killed by OOM


3) the server restarts, the query is executed again, but this time we 
get in LaunchParallelWorkers


nworkers=8 nworkers_launched=0

There's nothing else running on the server, and there definitely should 
be free parallel workers.


4) The query gets killed again, and on the next execution we get

nworkers=8 nworkers_launched=8

again, although not always. I wonder whether the exact impact depends on 
OOM killing the leader or worker, for example.


regards


[1] 
https://www.postgresql.org/message-id/flat/CAFWGqnsxryEevA5A_CqT3dExmTaT44mBpNTy8TWVsSVDS71QMg%40mail.gmail.com


--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [POC] A better way to expand hash indexes.

2017-03-30 Thread Mithun Cy
Thanks Robert, I have tried to fix the comments given as below.

On Thu, Mar 30, 2017 at 9:19 PM, Robert Haas  wrote:
> I think that the macros in hash.h need some more work:
>
> - Pretty much any time you use the argument of a macro, you need to
> parenthesize it in the macro definition to avoid surprises if the
> macro is called using an expression.  That isn't done consistently
> here.

--I have tried to fix same in the latest patch.

> - The macros make extensive use of magic numbers like 1, 2, and 3.  I
> suggest something like:
>
> #define SPLITPOINT_PHASE_BITS 2
> #define SPLITPOINT_PHASES_PER_GROUP (1 << SPLITPOINT_PHASE_BITS)
>
> And then use SPLITPOINT_PHASE_BITS any place where you're currently
> saying 2.  The reference to 3 is really SPLITPOINT_PHASE_BITS + 1.

-- Taken modified same in the latest patch.

> - Many of these macros are only used in one place.  Maybe just move
> the computation to that place and get rid of the macro.  For example,
> _hash_spareindex() could be written like this:
>
> if (splitpoint_group < SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE)
> return splitpoint_group;
>
> /* account for single-phase groups */
> splitpoint = SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE;
>
> /* account for completed groups */
> splitpoint += (splitpoint_group -
> SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE) << SPLITPOINT_PHASE_BITS;
>
> /* account for phases within current group */
> splitpoint += (bucket_num >> (SPLITPOINT_PHASE_BITS + 1)) &
> SPLITPOINT_PHASE_MASK;
>
> return splitpoint;
>
> That eliminates the only use of two complicated macros and is in my
> opinion more clear than what you've currently got.

-- Taken, also rewrote _hash_get_totalbuckets in similar lines.

With that, we will end up with only 2 macros which have some computing code
+/* defines the max number of splitpoint phases a hash index can have */
+#define HASH_MAX_SPLITPOINT_GROUP 32
+#define HASH_MAX_SPLITPOINTS \
+ (((HASH_MAX_SPLITPOINT_GROUP - HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE) * \
+  HASH_SPLITPOINT_PHASES_PER_GRP) + \
+ HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE)
+
+/* given a splitpoint phase get its group */
+#define HASH_SPLITPOINT_PHASE_TO_SPLITPOINT_GRP(sp_phase) \
+ (((sp_phase) < HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE) ? \
+ (sp_phase) : \
+ ((((sp_phase) - HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE) >> \
+   HASH_SPLITPOINT_PHASE_BITS) + \
+  HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE))
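
As a quick sanity check of the second macro's arithmetic, here is the same
computation as a standalone function, assuming the patch's values of 10
single-phase groups and 2 phase bits:

#include <stdio.h>

#define GRPS_WITH_ONE_PHASE 10  /* HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE */
#define PHASE_BITS          2   /* HASH_SPLITPOINT_PHASE_BITS */

static int phase_to_group(int sp_phase)
{
    if (sp_phase < GRPS_WITH_ONE_PHASE)
        return sp_phase;
    return ((sp_phase - GRPS_WITH_ONE_PHASE) >> PHASE_BITS)
        + GRPS_WITH_ONE_PHASE;
}

int main(void)
{
    /* phases 0..9 are their own groups; after that, 4 phases per group */
    printf("%d %d %d %d\n",
           phase_to_group(9),       /* 9  */
           phase_to_group(10),      /* 10 */
           phase_to_group(13),      /* 10 */
           phase_to_group(14));     /* 11 */
    return 0;
}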

> - Some of these macros lack clear comments explaining their purpose.

-- I have written some comments to explain the use of the macros.

> - Some of them don't include HASH anywhere in the name, which is
> essential for a header that may easily be included by non-hash index
> code.

-- Fixed, all MACROS are prefixed with HASH

> - The names don't all follow a consistent format.  Maybe that's too
> much to hope for at some level, but I think they could be more
> consistent than they are.

-- Fixed, apart from old HASH_MAX_SPLITPOINTS rest all have a prefix
HASH_SPLITPOINT.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


yet_another_expand_hashbucket_efficiently_12.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Monitoring roles patch

2017-03-30 Thread Simon Riggs
On 30 March 2017 at 18:29, Simon Riggs  wrote:

> Moving to commit this over the next hour. Last chance...

Done. Great work Dave, thanks everybody.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-30 Thread Daniel Verite
Vaishnavi Prabakaran wrote:

> Hmm. With batch mode, after sending a COPY command to the server (and the
> server has started processing the query and gone into COPY state), the
> client does not immediately read the result, but keeps sending other
> queries to the server. By this time, the server has already encountered
> the error scenario (receiving a different message during COPY state) and
> sent error messages.

IOW, the test intentionally violates the protocol and then all goes wonky
because of that.
That's why I was wondering upthread what it's supposed to test.
I mean, regression tests are meant to warn against a desirable behavior
being unknowingly changed by new code into an undesirable behavior.
Here we have the undesirable behavior to start with.
What kind of regression could we fear from that?

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Logical replication support for initial data copy

2017-03-30 Thread Fujii Masao
On Thu, Mar 23, 2017 at 9:59 PM, Peter Eisentraut  wrote:
> Logical replication support for initial data copy

+ case T_SQLCmd:
+ if (MyDatabaseId == InvalidOid)
+ ereport(ERROR,
+ (errmsg("not connected to database")));

This error message doesn't seem to follow the error message style in the docs.
Also, it seems a bit unclear to me. So what about replacing it with
something like the following?

ERROR:  must connect to database to execute command \"%s\"

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-30 Thread Bruce Momjian
On Tue, Mar 21, 2017 at 04:04:58PM -0400, Bruce Momjian wrote:
> On Tue, Mar 21, 2017 at 04:56:16PM -0300, Alvaro Herrera wrote:
> > Bruce Momjian wrote:
> > > On Tue, Mar 21, 2017 at 04:43:58PM -0300, Alvaro Herrera wrote:
> > > > Bruce Momjian wrote:
> > > > 
> > > > > I don't think it makes sense to try and save bits and add complexity
> > > > > when we have no idea if we will ever use them,
> > > > 
> > > > If we find ourselves in dire need of additional bits, there is a known
> > > > mechanism to get back 2 bits from old-style VACUUM FULL.  I assume that
> > > > the reason nobody has bothered to write the code for that is that
> > > > there's no *that* much interest.
> > > 
> > > We have no way of tracking if users still have pages that used the bits
> > > via pg_upgrade before they were removed.
> > 
> > Yes, that's exactly the code that needs to be written.
> 
> Yes, but once it is written it will take years before those bits can be
> used on most installations.

Actually, the two old-style VACUUM FULL bits could be reused if one of
the WARM bits is guaranteed to be set whenever they are checked.  The WARM
bits will all be zero on pre-9.0 pages.  The check would therefore have to
test the old-style VACUUM FULL bit together with a WARM bit being set.
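
For illustration, the check described above in a standalone sketch; the
flag values here are hypothetical, not the real t_infomask bits:

#include <stdint.h>
#include <stdio.h>

/* hypothetical bit positions, for illustration only */
#define OLD_VACFULL_BIT 0x0008
#define WARM_BIT        0x0100

/* the reused bit is trusted only when a WARM bit is also set, since
 * pre-9.0 pages have all WARM bits zero */
static int reused_bit_set(uint16_t infomask)
{
    return (infomask & WARM_BIT) != 0 && (infomask & OLD_VACFULL_BIT) != 0;
}

int main(void)
{
    printf("%d\n", reused_bit_set(OLD_VACFULL_BIT));            /* 0: pre-9.0 page */
    printf("%d\n", reused_bit_set(OLD_VACFULL_BIT | WARM_BIT)); /* 1: trusted */
    return 0;
}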

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Generic type subscripting

2017-03-30 Thread Arthur Zakirov

On 29.03.2017 20:14, Arthur Zakirov wrote:


I wanted to implement subscripting for the ltree or hstore extensions.
Subscripting for ltree looks more interesting, especially with slicing.
But I haven't done it yet. I hope that I will implement it tomorrow.



ltree
-

I've implemented fetching ltree elements using subscripting, but haven't 
implemented assigning ltree elements yet. I'll send a second patch after 
implementing assignment.


Now you can execute the following query:

SELECT ('Top.Science.Astronomy.Astrophysics'::ltree)[1:2];
ltree
-
 Top.Science

Comments


But I've noticed about some points.

In array_subscript_parse(), uninitialized values of the "typesource" 
and "typeneeded" variables are passed to coerce_to_target_type().



+   typesource = exprType(assignExpr);
+   typesource = is_slice ? sbsref->refcontainertype : 
sbsref->refelemtype;


Here is the bug: the second variable should be "typeneeded". Moreover, these 
assignments should be moved up before the first coerce_to_target_type() call.
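
Presumably the intended hunk, hoisted accordingly and with the second
assignment targeting "typeneeded", is:

    typesource = exprType(assignExpr);
    typeneeded = is_slice ? sbsref->refcontainertype : sbsref->refelemtype;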



+   foreach(l, sbsref->reflowerindexpr)
+   {
+   List *expr_ai = (List *) lfirst(l);
+   A_Indices *ai = (A_Indices *) lfirst(list_tail(expr_ai));
+
+   subexpr = (Node *) lfirst(list_head(expr_ai));


This code looks like magic. This happens because an A_Indices node is 
appended to lowerIndexpr:



-   subexpr = NULL;
+   lowerIndexpr = lappend(lowerIndexpr, 
list_make2(subexpr, ai));
}


And this A_Indices is used, only when slicing is not involved, to make a 
constant 1. Maybe there is another way?


Also, it would be better if "refevalfunc" and "refnestedfunc" held 
pointers to functions rather than values of type Oid. Currently you need 
to create "..._fetch" and "..._assign" functions in the catalog, and in 
the "..._parse" function you need to get their Oids using the 
to_regproc() function.


Can we use the IndexAmRoutine approach, where only pointers to the 
necessary functions are used? You can see an example in the blhandler() 
function in blutils.c.


The last point is about the tutorial. As Tom pointed out, it is not useful 
when the tutorial doesn't execute. That happens because there is no 
"custom" type in subscripting.sql. It also contradicts the README of the 
tutorials.


--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Variable substitution in psql backtick expansion

2017-03-30 Thread Tom Lane
Awhile back in the discussion about the \if feature for psql,
I'd pointed out that you shouldn't really need very much in
the way of boolean-expression evaluation smarts, because you
ought to be able to use a backtick shell escape:

\if `expr :foo \> :bar`
\echo :foo is greater than :bar
\endif

Now that the basic feature is in, I went to play around with this,
and was disappointed to find out that it doesn't work.  The reason
is not far to seek: we do not do variable substitution within the
text between backticks.  psqlscanslash.l already foresaw that some
day we'd want to do that:

/*
 * backticked text: copy everything until next backquote, then evaluate.
 *
 * XXX Possible future behavioral change: substitute for :VARIABLE?
 */

I think today is that day, because it's going to make a material
difference to the usability of this feature.

I propose extending backtick processing so that

1. :VARIABLE is replaced by the contents of the variable

2. :'VARIABLE' is replaced by the contents of the variable,
single-quoted according to Unix shell conventions.  (So the
processing would be a bit different from what it is for the
same notation in SQL contexts.)

This doesn't look like it would take very much new code to do.
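
For point 2, here is a standalone sketch of Unix-shell single-quoting (not
psql's actual code): wrap the value in '...' and turn each embedded single
quote into the four-character sequence '\'' :

#include <stdio.h>

static void shell_quote(const char *s, FILE *out)
{
    fputc('\'', out);
    for (; *s; s++)
    {
        if (*s == '\'')
            fputs("'\\''", out);    /* close, escaped quote, reopen */
        else
            fputc(*s, out);
    }
    fputc('\'', out);
}

int main(void)
{
    shell_quote("it's 42", stdout); /* prints 'it'\''s 42' */
    fputc('\n', stdout);
    return 0;
}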

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Monitoring roles patch

2017-03-30 Thread Simon Riggs
On 29 March 2017 at 21:42, Dave Page  wrote:
> On Wed, Mar 29, 2017 at 2:51 PM, Stephen Frost  wrote:
>>
>> Dave's currently hacking on a new patch based on our discussion, so I'd
>> suggest waiting another hour or so anyway until he's done.
>>
>> Might be a bit longer as he's trying to do it in a hallway at
>> PGConf.US...
>
> Thanks Stephen.
>
> Here's an updated patch, and description of the changes. Simon,
> Stephen and Robert have looked at the description and are all happy
> with it \o/. Thank you to them for taking the time out of the
> conference to go through it with me.

Moving to commit this over the next hour. Last chance...

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical decoding on standby

2017-03-30 Thread Andres Freund
On 2017-03-29 08:01:34 +0800, Craig Ringer wrote:
> On 28 March 2017 at 23:22, Andres Freund  wrote:
> 
> >> --- a/doc/src/sgml/protocol.sgml
> >> +++ b/doc/src/sgml/protocol.sgml
> >> @@ -2034,6 +2034,8 @@ The commands accepted in walsender mode are:
> >>   
> >>Drops a replication slot, freeing any reserved server-side 
> >> resources. If
> >>the slot is currently in use by an active connection, this command 
> >> fails.
> >> +  If the slot is a logical slot that was created in a database other 
> >> than
> >> +  the database the walsender is connected to, this command fails.
> >>   
> >>   
> >>
> >
> > Shouldn't the docs in the drop database section about this?
> 
> DROP DATABASE doesn't really discuss all the resources it drops, but
> I'm happy to add mention of replication slots handling.

I don't think that's really comparable, because the other things aren't
global objects, which replication slots are.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical decoding on standby

2017-03-30 Thread Andres Freund
> @@ -9633,6 +9643,12 @@ xlog_redo(XLogReaderState *record)
>   SetTransactionIdLimit(checkPoint.oldestXid, 
> checkPoint.oldestXidDB);
>  
>   /*
> +  * There can be no concurrent writers to oldestCatalogXmin 
> during
> +  * recovery, so no need to take ProcArrayLock.
> +  */
> + ShmemVariableCache->oldestCatalogXmin = 
> checkPoint.oldestCatalogXmin;

s/writers/writes/?

> @@ -9731,6 +9748,15 @@ xlog_redo(XLogReaderState *record)
> 
> checkPoint.oldestXid))
>   SetTransactionIdLimit(checkPoint.oldestXid,
> 
> checkPoint.oldestXidDB);
> +
> + /*
> +  * There can be no concurrent writers to oldestCatalogXmin 
> during
> +  * recovery, so no need to take ProcArrayLock.
> +  */
> + if (TransactionIdPrecedes(ShmemVariableCache->oldestCatalogXmin,
> + 
> checkPoint.oldestCatalogXmin)
> + ShmemVariableCache->oldestCatalogXmin = 
> checkPoint.oldestCatalogXmin;

dito.



> @@ -276,8 +279,21 @@ CreateInitDecodingContext(char *plugin,
>  
>   ReplicationSlotsComputeRequiredXmin(true);
>  
> + /*
> +  * If this is the first slot created on the master we won't have a
> +  * persistent record of the oldest safe xid for historic snapshots yet.
> +  * Force one to be recorded so that when we go to replay from this slot 
> we
> +  * know it's safe.
> +  */
> + force_standby_snapshot =
> + !TransactionIdIsValid(ShmemVariableCache->oldestCatalogXmin);

s/first slot/first logical slot/. Also, the reference to master doesn't
seem right.


>   LWLockRelease(ProcArrayLock);
>  
> + /* Update ShmemVariableCache->oldestCatalogXmin */
> + if (force_standby_snapshot)
> + LogStandbySnapshot();

The comment and code don't quite square to me - it's far from obvious
that LogStandbySnapshot does something like that. I'd even say it's a
bad idea to have it do that.


>   /*
>* tell the snapshot builder to only assemble snapshot once reaching the
>* running_xact's record with the respective xmin.
> @@ -376,6 +392,8 @@ CreateDecodingContext(XLogRecPtr start_lsn,
>   start_lsn = slot->data.confirmed_flush;
>   }
>  
> + EnsureActiveLogicalSlotValid();

This seems like it should be in a separate patch, and seperately
reviewed. It's code that's currently unreachable (and thus untestable).


> +/*
> + * Test to see if the active logical slot is usable.
> + */
> +static void
> +EnsureActiveLogicalSlotValid(void)
> +{
> + TransactionId shmem_catalog_xmin;
> +
> + Assert(MyReplicationSlot != NULL);
> +
> + /*
> +  * A logical slot can become unusable if we're doing logical decoding 
> on a
> +  * standby or using a slot created before we were promoted from standby
> +  * to master.

Neither of those is currently possible...


> If the master advanced its global catalog_xmin past the
> +  * threshold we need it could've removed catalog tuple versions that
> +  * we'll require to start decoding at our restart_lsn.
> +  *
> +  * We need a barrier so that if we decode in recovery on a standby we
> +  * don't allow new decoding sessions to start after redo has advanced
> +  * the threshold.
> +  */
> + if (RecoveryInProgress())
> + pg_memory_barrier();

I don't think this is a meaningful locking protocol.  It's a bad idea to
use lock-free programming without need, especially when the concurrency
protocol isn't well defined.  With what other barrier does this pair
with?  What prevents the data being out of date by the time we actually
do the check?
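
For reference, the pairing rule in a generic C11 sketch (nothing
PostgreSQL-specific): the reader's acquire load is only meaningful because
it pairs with the writer's release store; a lone barrier on one side leaves
the other side free to reorder.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static int data;
static atomic_int ready;

static void *writer(void *arg)
{
    (void) arg;
    data = 42;                      /* publish the payload */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

int main(void)
{
    pthread_t t;

    pthread_create(&t, NULL, writer, NULL);
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;                           /* spin until the flag is published */
    printf("%d\n", data);           /* guaranteed to print 42 */
    pthread_join(t, NULL);
    return 0;
}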


> diff --git a/src/backend/replication/walsender.c 
> b/src/backend/replication/walsender.c
> index cfc3fba..cdc5f95 100644
> --- a/src/backend/replication/walsender.c
> +++ b/src/backend/replication/walsender.c
> @@ -1658,6 +1658,11 @@ PhysicalConfirmReceivedLocation(XLogRecPtr lsn)
>* be energy wasted - the worst lost information can do here is give us
>* wrong information in a statistics view - we'll just potentially be 
> more
>* conservative in removing files.
> +  *
> +  * We don't have to do any effective_xmin / effective_catalog_xmin 
> testing
> +  * here either, like for LogicalConfirmReceivedLocation. If we received
> +  * the xmin and catalog_xmin from downstream replication slots we know 
> they
> +  * were already confirmed there,
>*/
>  }

This comment reads as if LogicalConfirmReceivedLocation had
justification for not touching / checking catalog_xmin. But it does.



>   /*
> +  * Update our knowledge of the oldest xid we can safely create historic
> +  * snapshots for.
> +  *
> +  

Re: [HACKERS] Use \if ... \elif ... \else ... \endif to simplify alternative in psql scripting.

2017-03-30 Thread Tom Lane
Andres Freund  writes:
> On 2017-03-30 16:59:39 +, Tom Lane wrote:
>> Support \if ... \elif ... \else ... \endif in psql scripting.

> Could we use this to simplify maintenance of some of the larger
> alternative output files?

Worth thinking about, for sure.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Use \if ... \elif ... \else ... \endif to simplify alternative in psql scripting.

2017-03-30 Thread Andres Freund
On 2017-03-30 16:59:39 +, Tom Lane wrote:
> Support \if ... \elif ... \else ... \endif in psql scripting.

Could we use this to simplify maintenance of some of the larger
alternative output files?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Schedule and Release Management Team for PG10

2017-03-30 Thread Joe Conway
On 03/30/2017 10:59 AM, Magnus Hagander wrote:
> On Thu, Mar 30, 2017 at 4:40 PM, Tom Lane  > wrote:
> 
> Robert Haas >
> writes:
> > On Thu, Mar 30, 2017 at 9:41 AM, Bruce Momjian  > wrote:
> >> I propose we go for a week delay in closing the commit fest, and we
> >> decide right now.  Ideally I like to to see delay in one-week 
> increments
> >> _and_ announce that a week before each deadline.
> 
> > Summary of opinions on this thread:
> 
> > - Tom thinks the RMT should consider extending by a day or two.
> 
> FWIW, while I'm sure we had a reason (back at the Ottawa 2016 meeting)
> for limiting the final 'fest to just a month, I'm quite sure we did
> not allow for most of the senior hackers being at a conference during
> the last week of that time.  So I now concur with Bruce's suggestion
> that a week's delay would be suitable.
> 
> 
> +1.

+1


-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-30 Thread Tom Lane
Fabien COELHO  writes:
>> If it actually is impossible to give pgbench equivalent behavior, we'll 
>> just have to live with the discrepancy,

> Yep.

>> but ISTM it could probably be made to work.

> Even if it could somehow, I do not see it as a useful feature for pgbench. 

Perhaps not.

> I also lack a good use case for psql for this feature.

It doesn't seem very outlandish to me: the alternative would be to repeat
all of a query that might span dozens of lines, in order to change one or
two lines.  That's not very readable or maintainable.

I'm prepared to believe that extremely long queries aren't ever going
to be common in pgbench scripts, so that there's not much need to support
the equivalent behavior in pgbench.  So maybe it's fine.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] delta relations in AFTER triggers

2017-03-30 Thread Kevin Grittner
On Thu, Mar 23, 2017 at 11:36 PM, Thomas Munro
 wrote:

> One more thought: should this be allowed?
>
> postgres=# create table mytab (i int) partition by list (i);
> CREATE TABLE
> postgres=# create table mytab1 partition of mytab for values in (42);
> CREATE TABLE
> postgres=# create function my_trigger_function() returns trigger as $$
> begin end; $$ language plpgsql;
> CREATE FUNCTION
> postgres=# create trigger my_trigger after update on mytab referencing
> old table as my_old for each statement execute procedure
> my_trigger_function();
> CREATE TRIGGER

> Perhaps the moral equivalent should be possible for statement triggers
> with transition tables, and that already works with your patch as far
> as I know.  So I think your patch probably just needs to reject them
> on partitioned tables.

> [patch provided]

Yeah, that looks good.  Included in next patch version.


On Sun, Mar 26, 2017 at 6:39 AM, Thomas Munro
 wrote:

> BTW I had to make the following change to your v12 because of commit b8d7f053:

Yeah, I ran into that, too, and used exactly the same fix.


On Sun, Mar 26, 2017 at 6:39 AM, Thomas Munro
 wrote:
> On Fri, Mar 24, 2017 at 1:14 PM, Thomas Munro

>> When PlanCacheRelCallback runs, I don't think it understands that
>> these named tuplestore RangeTblEntry objects are dependent on the
>> subject table.  Could that be fixed like this?
>>
>> @@ -2571,6 +2582,9 @@ extract_query_dependencies_walker(Node *node,
>> PlannerInfo *context)
>> if (rte->rtekind == RTE_RELATION)
>> context->glob->relationOids =
>>
>> lappend_oid(context->glob->relationOids, rte->relid);
>> +   else if (rte->rtekind == RTE_NAMEDTUPLESTORE)
>> +   context->glob->relationOids =
>> +
>> lappend_oid(context->glob->relationOids, [subject table's OID]);
>
> I'm not sure if this is the right approach and it may have style
> issues, but it does fix the crashing in the ALTER TABLE case I
> reported: see attached patch which applies on top of your v12.

I had been working along similar lines, but had not gotten it
working.  Merged your version and mine, taking the best of both.
:-)

Thanks for the reviews and the fixes!

New version attached.  It needs some of these problem cases added to
the testing, and a mention in the docs that only C and plpgsql
triggers can use the feature so far.  I'll add those tomorrow.

--
Kevin Grittner


transition-v13.diff.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Robert Haas
On Thu, Mar 30, 2017 at 11:26 AM, Teodor Sigaev  wrote:
> I had a look at the patch and played with it; it seems to look fine. I split
> it into two patches: core changes (+bloom index fix) and btree itself. All
> docs are left in the first patch - I'm too lazy to rewrite documentation which
> is changed in the second patch.
> Any objection from reviewers to push both patches?

Has this really had enough review and testing?  The last time it was
pushed, it didn't go too well.  And laziness is not a very good excuse
for not dividing up patches properly.

It seems highly surprising to me that CheckIndexCompatible() only gets
a one-line change in this patch.  That seems unlikely to be correct.

Has anybody done some testing of this patch with the WAL consistency
checker?  Like, create some tables with indexes that have INCLUDE
columns, set up a standby, enable consistency checking, pound the
master, and see if the standby bails?

Has anybody tested this patch with amcheck?  Does it break amcheck?

A few minor comments:

-foreach(lc, constraint->keys)
+else foreach(lc, constraint->keys)

That doesn't look like a reasonable way of formatting the code.

+/* Here is some code duplication. But we do need it. */

That is not a very informative comment.

+* NOTE It is not crutial for reliability in present,

Spelling, punctuation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Allow vacuums to report oldestxmin

2017-03-30 Thread Fujii Masao
On Wed, Mar 29, 2017 at 3:31 PM, Masahiko Sawada  wrote:
> On Wed, Mar 29, 2017 at 1:32 AM, Fujii Masao  wrote:
>> On Tue, Mar 28, 2017 at 1:06 AM, Masahiko Sawada  
>> wrote:
>>> On Sun, Mar 26, 2017 at 2:26 AM, Masahiko Sawada  
>>> wrote:
 On Sun, Mar 26, 2017 at 1:37 AM, Fujii Masao  wrote:
> On Mon, Mar 6, 2017 at 9:37 PM, Masahiko Sawada  
> wrote:
>> On Fri, Mar 3, 2017 at 10:50 PM, Simon Riggs  
>> wrote:
>>> Allow vacuums to report oldestxmin
>>>
>>> Allow VACUUM and Autovacuum to report the oldestxmin value they
>>> used while cleaning tables, helping to make better sense out of
>>> the other statistics we report in various cases.
>>>
>>> Branch
>>> --
>>> master
>>>
>>> Details
>>> ---
>>> http://git.postgresql.org/pg/commitdiff/9eb344faf54a849898d9be012ddfa8204cfeb57c
>>>
>>> Modified Files
>>> --
>>> src/backend/commands/vacuumlazy.c | 9 +
>>> 1 file changed, 5 insertions(+), 4 deletions(-)
>>>
>>>
>>
>> Should we change the example in vacuum.sgml file as well? Attached patch.
>
> "tuples" in the above should be "row versions"?
> We should review not only this line but also all the lines in the example
> of VERBOSE output, I think.

 Right. These verbose log messages are out of date. I ran
 VACUUM(VERBOSE, ANALYZE) with the same scenario as the current example, as
 far as possible. The attached patch updates the verbose log messages.


>>>
>>> Surprisingly, the change "tuples" -> "row versions" in vacuumlazy.c was
>>> introduced by commit feb4f44d296b88b7f0723f4a4f3945a371276e0b in 2003.
>>
>> This is evidence that no one cares about the details of the VACUUM VERBOSE
>> output example. So I'm tempted to simplify the example (please see the
>> attached patch) instead of continually updating it.
>
> Yes. I agree.

Pushed. I back-patched to all supported versions according to
Alvaro's comment upthread.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-30 Thread Fabien COELHO


Hello Tom,


pgbench (well, at least if I succeed in getting boolean expressions and
setting variables, which is just a maybe), but this kind of if in the
middle of expression does not make much sense for a pgbench script where
"if" must be evaluated at execution time, not parse time.


Well, it's not really clear to me why that would be true.


For example, how can you PREPARE a possibly combinatorial thing?

SELECT
  \if ... XX \else YY \endif
FROM
  \if ... ZZ \else WW \endif
WHERE
  \if ... AA \else BB \endif
 ;

Or the kind of operation:

  \if ...
SELECT *
  \else
DELETE
  \endif
  FROM table WHERE condition;

Even the structure can be changed somehow:

  SELECT
\if ...
  1 ;
  SELECT 2
\endif
  ;

If it actually is impossible to give pgbench equivalent behavior, we'll 
just have to live with the discrepancy,


Yep.


but ISTM it could probably be made to work.


Even if it could somehow, I do not see it as a useful feature for pgbench. 
I also lack a good use case for psql for this feature.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Aleksander Alekseev
Hi Teodor,

> I had a look at the patch and played with it; it seems to look fine. I split
> it into two patches: core changes (+bloom index fix) and btree itself. All
> docs are left in the first patch - I'm too lazy to rewrite documentation which
> is changed in the second patch.
> Any objection from reviewers to push both patches?

These patches look OK. Definitely no objections from me.

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: [HACKERS] PDF build is broken

2017-03-30 Thread Devrim Gündüz

Hi Peter,

On Sun, 2017-03-26 at 15:05 -0400, Peter Eisentraut wrote:
> Fixed. 

(Sorry for the late response): Thanks, it builds fine.

>  But I also suggest that you try out the FOP based builds,
> because the jadetex-based builds will probably go away soon.

Can you please let me know how to do that? Are there any docs somewhere?

Regards,
-- 
Devrim Gündüz
EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR


signature.asc
Description: This is a digitally signed message part


[HACKERS] Re: [pgsql-students] GSoC 2017 Proposal for "Explicitly support predicate locks in index access methods besides btree"

2017-03-30 Thread Kevin Grittner
On Tue, Mar 28, 2017 at 12:36 PM, Shubham Barai
 wrote:

> * Hash Index

> Currently, I am trying to understand how bucket splitting works.
> Should we acquire a predicate lock on every page of both the old and the new
> bucket in case there is a predicate lock on a page of the old bucket?

For page level locking in hash indexes, I think it might be fine to
lock the first page of a bucket.

Also, in case you missed it, here are some guidelines I suggested.
There weren't any comments, which means they probably didn't offend
anyone else too badly.  They're just my opinion, but you might want
to consider them:

Each GSoC student proposal should be a PDF file of 6 to 8 pages.  In
the end, Google will publish these documents on a web page, so the
student should make each proposal something which they will be happy
to have future potential employers review.

Some ideas for desirable content:

  - A resume or CV of the student, including any prior GSoC work
  - Their reasons for wanting to participate
  - What else they have planned for the summer, and what their time
commitment to the GSoC work will be
  - A clear statement that there will be no intellectual property
problems with the work they will be doing -- that the PostgreSQL
community will be able to use their work without encumbrances
(e.g., there should be no agreements related to prior or
ongoing work which might assign the rights to the work they do
to someone else)
  - A description of what they will do, and how
  - Milestones with dates
  - What they consider to be the test that they have successfully
completed the project

-- 
Kevin Grittner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-30 Thread Robert Haas
On Thu, Mar 30, 2017 at 11:41 AM, Andres Freund  wrote:
> On 2017-03-30 16:43:41 +0530, Pavan Deolasee wrote:
>> Looks like OID conflict to me.. Please try rebased set.
>
> Pavan, Alvaro, everyone: I know you guys are working very hard on this,
> but I think at this point it's too late to commit this for v10.  This is
> patch that's affecting the on-disk format, in quite subtle
> ways.  Committing this just at the end of the development cyle / shortly
> before feature freeze, seems too dangerous to me.
>
> Let's commit this just at the beginning of the cycle, so we have time to
> shake out the bugs.

+1, although I think it should also have substantially more review first.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: proposal - psql: possibility to specify sort for describe commands, when size is printed

2017-03-30 Thread Pavel Stehule
>
>

 This proposal was here already - maybe two years ago. The psql command
 parser doesn't allow any complex syntax; moreover, more parameters in one
 psql command are hard to remember and hard to read.

>>>
>>> Could you please provide a link to this discussion?  Probably working
>>> with multiple parameters in psql commands requires some rework, but that's
>>> definitely doable.
>>>
>>
>> http://grokbase.com/t/postgresql/pgsql-hackers/137nt5p6s0/
>> proposal-psql-show-longest-tables/oldest
>> https://www.postgresql.org/message-id/AANLkTikyaeJ0XdKDzxSvq
>> pe8karrtiuqjqhwnj8ec...@mail.gmail.com
>>
>
> I took a look at these threads, but I didn't find the place where the
> difficulties of adding extra arguments to psql commands are pointed out.
> Could you please point me to the particular messages about it?
>

I am sorry - maybe my memory doesn't serve me well

Pavel


>
> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>


Re: [HACKERS] [POC] A better way to expand hash indexes.

2017-03-30 Thread Robert Haas
On Wed, Mar 29, 2017 at 8:03 AM, Mithun Cy  wrote:
> Thanks, Amit for a detailed review.

I think that the macros in hash.h need some more work:

- Pretty much any time you use the argument of a macro, you need to
parenthesize it in the macro definition to avoid surprises if the
macro is called using an expression.  That isn't done consistently
here (see the sketch after this list).

- The macros make extensive use of magic numbers like 1, 2, and 3.  I
suggest something like:

#define SPLITPOINT_PHASE_BITS 2
#define SPLITPOINT_PHASES_PER_GROUP (1 << SPLITPOINT_PHASE_BITS)

And then use SPLITPOINT_PHASE_BITS any place where you're currently
saying 2.  The reference to 3 is really SPLITPOINT_PHASE_BITS + 1.

- Many of these macros are only used in one place.  Maybe just move
the computation to that place and get rid of the macro.  For example,
_hash_spareindex() could be written like this:

if (splitpoint_group < SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE)
return splitpoint_group;

/* account for single-phase groups */
splitpoint = SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE;

/* account for completed groups */
splitpoint += (splitpoint_group -
SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE) << SPLITPOINT_PHASE_BITS;

/* account for phases within current group */
splitpoint += (bucket_num >> (SPLITPOINT_PHASE_BITS + 1)) &
SPLITPOINT_PHASE_MASK;

return splitpoint;

That eliminates the only use of two complicated macros and is in my
opinion more clear than what you've currently got.

- Some of these macros lack clear comments explaining their purpose.

- Some of them don't include HASH anywhere in the name, which is
essential for a header that may easily be included by non-hash index
code.

- The names don't all follow a consistent format.  Maybe that's too
much to hope for at some level, but I think they could be more
consistent than they are.
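
To illustrate the parenthesization point above, a minimal sketch (the
UNSAFE/SAFE names and the expansion example are mine, not from the patch):

/* risky: the argument is used unparenthesized */
#define SPLITPOINT_GROUP_UNSAFE(bucket) bucket / SPLITPOINT_PHASES_PER_GROUP

/* safe: both the argument and the whole body are parenthesized */
#define SPLITPOINT_GROUP_SAFE(bucket) ((bucket) / SPLITPOINT_PHASES_PER_GROUP)

/*
 * With SPLITPOINT_PHASES_PER_GROUP = 4, SPLITPOINT_GROUP_UNSAFE(i + 4)
 * expands to i + 4 / 4, i.e. i + 1, whereas SPLITPOINT_GROUP_SAFE(i + 4)
 * yields (i + 4) / 4 as intended.
 */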

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Re: [pgsql-students] GSoC 2017 Proposal for "Explicitly support predicate locks in index access methods besides btree"

2017-03-30 Thread Kevin Grittner
On Tue, Mar 28, 2017 at 12:36 PM, Shubham Barai
 wrote:

> My name is Shubham Barai and I am a final year student at Maharashtra
> Institute of Technology, Pune, India. I am very interested in contributing
> to PostgreSQL this year through a GSoC project.

Welcome!  Sorry I didn't spot this post earlier -- I'm behind on
reading the community lists and this didn't trigger any of the phrases
that pop things out to my attention right away.

> I am particularly interested in working on the project "Explicitly support
> predicate locks in index access methods besides btree". I have gone through
> some research papers which were recommended on
> https://wiki.postgresql.org/wiki/GSoC_2017 to understand the concept of
> Serializable Snapshot Isolation used in PostgreSQL. I have also browsed
> through the codebase to get some idea of different access methods for gin,
> gist, and hash indexes. I want to discuss my proposal to get some feedback
> before I write my final proposal. Sorry, I am discussing my proposal a
> little late. I was really busy with my academics.

Understandable, but please be careful to get your final proposal in by
the deadline.  Deadlines in GSoC are not flexible.

> Currently, only B+-trees support page-level predicate locking. For other
> indexes, a relation-level lock is acquired, which can lead to unnecessary
> serialization failures due to rw dependencies caused by any insert into the
> index. So, the main task of this project is to support page-level predicate
> locking for the remaining indexes.

Right.

> [calls out several places where specific calls to predicate locking functions
> are needed]

> There may be a lot of other places where we need to insert function calls
> for predicate locking that I haven't included yet. I didn't go into details
> of every index AM.

That will be about half the work of the project.  It is fine to
identify examples for your proposal, to show that you know what to
look for, but tracking down every last location can be completed after
the proposal is accepted.  The other half of the work will be testing
and responding to issues others might raise.

> Can anyone help me find the existing tests for b-tree?

I think this should be it:

kgrittn@kevin-desktop:~/pg/master$ find src/backend/access/nbtree -type f -name '*.c' | grep -v '/tmp_check/' | grep -v '/Debug/' | xargs egrep -n 'PredicateLock|SerializableConflict'
src/backend/access/nbtree/nbtinsert.c:201: CheckForSerializableConflictIn(rel, NULL, buf);
src/backend/access/nbtree/nbtinsert.c:402: CheckForSerializableConflictIn(rel, NULL, buf);
src/backend/access/nbtree/nbtinsert.c:784: PredicateLockPageSplit(rel,
src/backend/access/nbtree/nbtsearch.c:1040: PredicateLockRelation(rel, scan->xs_snapshot);
src/backend/access/nbtree/nbtsearch.c:1052: PredicateLockPage(rel, BufferGetBlockNumber(buf),
src/backend/access/nbtree/nbtsearch.c:1483: PredicateLockPage(rel, blkno, scan->xs_snapshot);
src/backend/access/nbtree/nbtsearch.c:1578: PredicateLockPage(rel, BufferGetBlockNumber(so->currPos.buf), scan->xs_snapshot);
src/backend/access/nbtree/nbtsearch.c:1869: PredicateLockRelation(rel, scan->xs_snapshot);
src/backend/access/nbtree/nbtsearch.c:1874: PredicateLockPage(rel, BufferGetBlockNumber(buf), scan->xs_snapshot);
src/backend/access/nbtree/nbtpage.c:1410: PredicateLockPageCombine(rel, leafblkno, leafrightsib);

-- 
Kevin Grittner




Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-30 Thread Tom Lane
Fabien COELHO  writes:
>> [...] Aside from cosmetic changes, I've made it behave reasonably for 
>> cases where \if is used on portions of a query, for instance

> A small issue I see is that I was planning to add such an if syntax to 
> pgbench (well, at least if I succeed in getting boolean expressions and 
> setting variables, which is just a maybe), but this kind of if in the 
> middle of an expression does not make much sense for a pgbench script, where 
> "if" must be evaluated at execution time, not parse time.

Well, it's not really clear to me why that would be true.  If it actually
is impossible to give pgbench equivalent behavior, we'll just have to live
with the discrepancy, but ISTM it could probably be made to work.

regards, tom lane




Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-30 Thread Andres Freund
Hi,

On 2017-03-30 16:43:41 +0530, Pavan Deolasee wrote:
> Looks like an OID conflict to me... Please try the rebased set.

Pavan, Alvaro, everyone: I know you guys are working very hard on this,
but I think at this point it's too late to commit this for v10.  This is
a patch that's affecting the on-disk format, in quite subtle
ways.  Committing this just at the end of the development cycle / shortly
before feature freeze seems too dangerous to me.

Let's commit this just at the beginning of the cycle, so we have time to
shake out the bugs.

- Andres




Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Teodor Sigaev
I had a look at the patch and played with it; it seems fine. I split it into
two patches: core changes (plus the bloom index fix) and btree itself. All docs
are left in the first patch - I'm too lazy to rewrite documentation that is
changed in the second patch.

Any objections from reviewers to pushing both patches?


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


0001-covering-core.patch


0002-covering-btree.patch



Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Teodor Sigaev

-   IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
+   IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P INCLUDE

I think your syntax would read no worse, possibly even better, if you
just used the existing INCLUDING keyword.
There was a discussion in this thread about naming, and both databases that
support covering indexes use the INCLUDE keyword.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] Guidelines for GSoC student proposals / Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-03-30 Thread Kevin Grittner
On Wed, Mar 29, 2017 at 11:17 PM, Mengxing Liu
 wrote:
> Thanks, I've updated the proposal. Just one issue:
> I agree that we can make the skip list a general data structure.  But
> can we use the fixed-level skip list as a Plan B? Or as a quick attempt
> before the general data structure?
> Because I am not familiar with shared memory structures and the tricks
> used in them, and I cannot estimate how much time it would take.

It's not really too bad for fixed allocation shared memory, and I
can help with that.  If I thought it would save much I could see
doing a prototype without generalization, but you would still have
most of the same shared memory issues, since the structure *must*
live in shared memory.
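
To make the shared-memory constraints concrete, a hedged sketch of what a
fixed-height node might look like (the names, the maximum height, and the
use of offsets are illustrative assumptions, not a design):

#include <stdint.h>

#define SKIPLIST_MAX_LEVELS 8   /* height fixed at compile time */

/*
 * In a fixed-size shared memory arena there is no per-node malloc, and
 * links are stored as array offsets rather than raw pointers so they
 * stay valid regardless of where the segment is mapped.
 */
typedef struct SkipListNode
{
    uint64_t    key;        /* e.g., a commit sequence number */
    int         nlevels;    /* levels in use, 1..SKIPLIST_MAX_LEVELS */
    uint32_t    next[SKIPLIST_MAX_LEVELS];  /* offsets into the arena */
} SkipListNode;

All nodes would come from an array preallocated at startup, which is what
fixed-allocation shared memory implies.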

--
Kevin Grittner




Re: [HACKERS] Re: proposal - psql: possibility to specify sort for describe commands, when size is printed

2017-03-30 Thread Alexander Korotkov
On Tue, Mar 28, 2017 at 10:12 AM, Pavel Stehule 
wrote:

> 2017-03-27 13:59 GMT+02:00 Alexander Korotkov :
>
>> On Fri, Mar 10, 2017 at 6:06 PM, Pavel Stehule 
>> wrote:
>>
>>> 2017-03-10 16:00 GMT+01:00 Alexander Korotkov:
>>>
 On Fri, Mar 10, 2017 at 5:16 PM, Stephen Frost 
 wrote:

> * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> > On 2/24/17 16:32, Pavel Stehule wrote:
> > > set EXTENDED_DESCRIBE_SORT size_desc
> > > \dt+
> > > \l+
> > > \di+
> > >
> > > Possible variants: schema_table, table_schema, size_desc, size_asc
> >
> > I can see this being useful, but I think it needs to be organized a
> > little better.
> >
> > Sort key and sort direction should be separate settings.
> >
> > I'm not sure why we need to have separate settings to sort by schema
> > name and table name.  But if we do, then we should support that for all
> > object types.  I think maybe that's something we shouldn't get into
> > right now.
> >
> > So I would have one setting for sort key = {name|size} and one for
> > sort direction = {asc|desc}.
>
> Perhaps I'm trying to be overly cute here, but why not let the user
> simply provide a bit of SQL to be put at the end of the query?
>
> That is, something like:
>
> \pset EXTENDED_DESCRIBE_ORDER_LIMIT 'ORDER BY 5 DESC LIMIT 10'
>

 I think that's a question of usability.  After all, one can manually
 type the corresponding SQL instead of \d* commands.  However, it's quite
 cumbersome to do this every time.
 I find it quite useful to be able to switch between different sortings
 quickly.  For instance, after seeing tables sorted by name, a user can
 ask for them sorted by size descending, then by size ascending,
 etc...
 Therefore, I find a user-defined SQL clause cumbersome.  Even a psql
 variable itself seems cumbersome to me.
 I would propose to add sorting as a second optional argument to \d*
 commands.  Any thoughts?

>>>
>>> This proposal was here already - maybe two years ago. The psql command
>>> parser doesn't allow any complex syntax; moreover, multiple parameters in one
>>> psql command are hard to remember and hard to read.
>>>
>>
>> Could you please provide a link to this discussion?  Probably working
>> with multiple parameters in psql commands requires some rework, but that's
>> definitely doable.
>>
>
> http://grokbase.com/t/postgresql/pgsql-hackers/137nt5p6s0/proposal-psql-show-longest-tables/oldest
> https://www.postgresql.org/message-id/AANLkTikyaeJ0XdKDzxSvqPE8kaRRTiuqjqhwnj8ec...@mail.gmail.com
>

I took a look at these threads, but I didn't find the place where the
difficulties of adding extra arguments to psql commands are pointed out.
Could you please point me to the particular messages about it?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Reduce src/test/recovery verbosity

2017-03-30 Thread Tom Lane
Stephen Frost  writes:
> I'd be fine with removing --verbose globally, as your patch does, but
> there was some argument that we then would have long 'quiet' periods.
> I haven't had a chance to go test if that's really the case yet though.

I've been running it like this lately:

make -s check-world -j4 PROVE_FLAGS='-j4 --quiet --nocolor --nocount'

and it is still pretty noisy ;-).  The only place where it sits for
more than a couple of seconds without printing anything is at the very
start, which I believe to be the initial "make temp-install" step,
which would be unaffected by prove verbosity anyway.

So I'd be +1 for just removing --verbose by default.  Anybody who really
wants it can put it back via PROVE_FLAGS.

regards, tom lane




Re: [HACKERS] [patch] reorder tablespaces in basebackup tar stream for backup_label

2017-03-30 Thread Fujii Masao
On Wed, Mar 29, 2017 at 5:36 PM, Kyotaro HORIGUCHI
 wrote:
> Hello,
>
> At Wed, 29 Mar 2017 09:23:42 +0200, Michael Banck  
> wrote in <149077.18436.14.ca...@credativ.de>
>> Hi,
>>
>> On Wednesday, 29.03.2017 at 15:22 +0900, Michael Paquier wrote:
>> > On Wed, Mar 29, 2017 at 3:56 AM, Fujii Masao  wrote:
>> > > If you need other information besides START WAL LOCATION at the beginning
>> > > of the base backup, and it is very useful for many third-party programs,
>> > > you can add it to that first result set. If you do this, you can retrieve
>> > > it at the beginning even when WAL files are included in the backup.
>> >
>> > You mean in the result tuple of pg_start_backup(), right? Why not.
>>
>> The replication protocol chapter says: "When the backup is started, the
>> server will first send two ordinary result sets, followed by one or more
>> CopyResponse results. The first ordinary result set contains the
>> starting position of the backup, in a single row with two columns."
>>
>> However, I don't think it is very obvious to users (or at least it is
>> not to me) how to get at this from psql, if you want to script it.  If I

I don't think that using psql to run the BASE_BACKUP command is a good
approach. Instead, IMO you should implement a client program
which can handle BASE_BACKUP properly, or extend pg_basebackup
so that you can do what you want.
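
For what it's worth, a minimal sketch of such a client using libpq might
look like this (the conninfo string and the label are placeholders, and
error handling is pared down):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    /* BASE_BACKUP is a walsender command, so open a replication connection */
    PGconn     *conn = PQconnectdb("host=localhost replication=true");
    PGresult   *res;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    /* several result sets follow, so use the asynchronous API */
    if (!PQsendQuery(conn, "BASE_BACKUP LABEL 'demo'"))
        return 1;

    /* the first ordinary result set: start position, one row, two columns */
    res = PQgetResult(conn);
    if (PQresultStatus(res) == PGRES_TUPLES_OK)
        printf("start: %s, timeline: %s\n",
               PQgetvalue(res, 0, 0), PQgetvalue(res, 0, 1));
    PQclear(res);

    /*
     * The tablespace result set and the CopyResponse tar stream(s) come
     * next; a real client would consume them here, as pg_basebackup does.
     */
    PQfinish(conn);
    return 0;
}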

Regards,

-- 
Fujii Masao




Re: [HACKERS] [POC] A better way to expand hash indexes.

2017-03-30 Thread Mithun Cy
On Thu, Mar 30, 2017 at 7:23 PM, Robert Haas  wrote:
> I think approach B is incorrect.  Suppose we have 1536 buckets and
> hash values 2048, 2049, 4096, 4097, 6144, 6145, 8192, and 8193.  If I
> understand correctly, each of these values should be mapped either to
> bucket 0 or to bucket 1, and the goal of the sort is to put all of the
> bucket 0 tuples before all of the bucket 1 tuples, so that we get
> physical locality when inserting.  With approach A, the sort keys will
> match the bucket numbers -- we'll be sorting the list 0, 1, 0, 1, 0,
> 1, 0, 1 -- and we will end up doing all of the inserts to bucket 0
> before any of the inserts to bucket 1.  With approach B, we'll be
> sorting 512, 513, 1024, 1025, 0, 1, 512, 513 and will end up
> alternating inserts to bucket 0 with inserts to bucket 1.

Oops, sorry - yes, the two denominators are different (one used at insert
time and another used for the sort keys), so we end up with different bucket
numbers. I think in patch B I should actually have taken the next power of
two above 1536 as the denominator and computed the mod value; if the mod
value is >= 1536, halve the denominator and retake the mod to get the bucket
within 1536 - which is effectively what patch A does. Approach B is a
blunder; I apologize for that mistake. I think patch A should be considered.
If adding the members to struct Tuplesortstate is a concern, I will rewrite
patch B as said above.
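
For reference, the fold that patch A preserves is the usual hash-to-bucket
mapping (a standalone rendering of the _hash_hashkey2bucket() logic; the
free-standing form and type names are illustrative):

#include <stdint.h>

/*
 * Mask with the larger power-of-two-minus-one; if that lands past the
 * last allocated bucket, fold back with the smaller mask.  With 1536
 * buckets (maxbucket = 1535), highmask = 2047 and lowmask = 1023, so
 * hash values 2048, 4096, 6144, and 8192 all map to bucket 0, and
 * 2049, 4097, 6145, and 8193 all map to bucket 1.
 */
static uint32_t
hashkey2bucket(uint32_t hashkey, uint32_t maxbucket,
               uint32_t highmask, uint32_t lowmask)
{
    uint32_t    bucket = hashkey & highmask;

    if (bucket > maxbucket)
        bucket &= lowmask;

    return bucket;
}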

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Schedule and Release Management Team for PG10

2017-03-30 Thread Alvaro Herrera
Robert Haas wrote:

> - Alvaro proposes allowing more time next time, but not to change the
> dates for this time.

FWIW I didn't realize that the NY conference was ongoing, so count me
in favor of postponing the end of the current CF.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Schedule and Release Management Team for PG10

2017-03-30 Thread Magnus Hagander
On Thu, Mar 30, 2017 at 4:40 PM, Tom Lane  wrote:

> Robert Haas  writes:
> > On Thu, Mar 30, 2017 at 9:41 AM, Bruce Momjian  wrote:
> >> I propose we go for a week delay in closing the commit fest, and we
> >> decide right now.  Ideally I'd like to see delay in one-week increments
> >> _and_ announce that a week before each deadline.
>
> > Summary of opinions on this thread:
>
> > - Tom thinks the RMT should consider extending by a day or two.
>
> FWIW, while I'm sure we had a reason (back at the Ottawa 2016 meeting)
> for limiting the final 'fest to just a month, I'm quite sure we did
> not allow for most of the senior hackers being at a conference during
> the last week of that time.  So I now concur with Bruce's suggestion
> that a week's delay would be suitable.
>

+1.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Schedule and Release Management Team for PG10

2017-03-30 Thread Petr Jelinek
On 30/03/17 16:54, Alvaro Herrera wrote:
> Robert Haas wrote:
> 
>> - Alvaro proposes allowing more time next time, but not to change the
>> dates for this time.
> 
> FWIW I didn't realize that the NY conference was ongoing, so count me
> in favor of postponing the end of the current CF.
> 

+1, the conference makes it a bit inconvenient to have the freeze on exactly the 31st.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Schedule and Release Management Team for PG10

2017-03-30 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Robert Haas  writes:
> > On Thu, Mar 30, 2017 at 9:41 AM, Bruce Momjian  wrote:
> >> I propose we go for a week delay in closing the commit fest, and we
> >> decide right now.  Ideally I'd like to see delay in one-week increments
> >> _and_ announce that a week before each deadline.
> 
> > Summary of opinions on this thread:
> 
> > - Tom thinks the RMT should consider extending by a day or two.
> 
> FWIW, while I'm sure we had a reason (back at the Ottawa 2016 meeting)
> for limiting the final 'fest to just a month, I'm quite sure we did
> not allow for most of the senior hackers being at a conference during
> the last week of that time.  So I now concur with Bruce's suggestion
> that a week's delay would be suitable.

Agreed, same here.

Thanks!

Stephen



