[HACKERS] make default TABLESPACE belong to target table.
Dear pgsql community,

I've been using Postgres for a long time. Recently I've been sharding tables over a bunch of pgsql instances, using multiple tablespaces (one per disk) to utilize all the I/O bandwidth. Things have gone pretty well; however, there is a troublesome problem when adding auxiliary structures, such as indexes, to sharded tables. These objects default to the database's tablespace, and it's difficult to shard them by hand.

I'd like to implement this small feature --- making a table's auxiliary structures store their data in the target table's tablespace by default. I've done a thorough search over the mailing list and found nothing relevant, though I may have missed some corners :-)

What do you think?

Regards,
Amos

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
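A minimal sketch of the problem being described (table, index, and tablespace names are invented for illustration):

```sql
-- The table lands on the intended tablespace...
CREATE TABLE shard_17 (id bigint, payload text) TABLESPACE disk3;

-- ...but without the proposed change, this index silently goes to the
-- database's default tablespace, not disk3:
CREATE INDEX ON shard_17 (id);

-- Today the workaround is to repeat the tablespace by hand on every
-- auxiliary object:
CREATE INDEX ON shard_17 (id) TABLESPACE disk3;
```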
Re: [HACKERS] Parallel execution and prepared statements
There has been a previous discussion about this topic, including an attempted fix by Amit Kapila: http://www.postgresql.org/message-id/flat/CAA4eK1L=tHmmHDK_KW_ja1_dusJxJF+SGQHi=aps4mdnpk7...@mail.gmail.com/
Re: [HACKERS] Forbid use of LF and CR characters in database and role names
On Fri, Nov 25, 2016 at 4:02 PM, Ideriha, Takeshi wrote:
> I applied your fixed patch and the new one, and confirmed that the patched
> source passed the tests successfully. I also checked manually that the error
> messages were emitted correctly when CR/LF are included in the dbname,
> rolename, or data_directory.
>
> AFAICT, this patch satisfies the concept discussed before, so I’ve switched
> this patch to “Ready for Committer”.

Thanks for the review, Ideriha-san. (See you next week perhaps?)

--
Michael
[HACKERS] confusing checkpoint_flush_after / bgwriter_flush_after
Hi,

while doing some benchmarking, I've once again been confused by the default settings for checkpoint_flush_after and bgwriter_flush_after. The sample config says this:

#checkpoint_flush_after = 0     # 0 disables,
                                # default is 256kB on linux, 0 otherwise
#bgwriter_flush_after = 0       # 0 disables,
                                # default is 512kB on linux, 0 otherwise

I find this pretty confusing, because for all other GUCs in the file, the commented-out value is the default one. In this case that would mean "0", disabling the flushing. But in practice we use platform-dependent defaults - 256/512kB on Linux, 0 otherwise. There are other GUCs whose default is platform-specific, but none of them suggests "disabled" is the default state.

While the 9.6 cat is out of the bag, I think we can fix this quite easily - use "-1" to specify that the default value should be used, and use that in the sample file. This won't break any user configuration. If that's considered unacceptable, perhaps we should at least improve the comments so as to make this clearer.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
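To illustrate the proposal, here is a sketch of how the sample-file entry could look with the suggested "-1" sentinel (this is the suggestion above, not committed behaviour):

```
# Current sample (commented-out value disagrees with the real default):
#checkpoint_flush_after = 0     # 0 disables,
                                # default is 256kB on linux, 0 otherwise

# Proposed shape: -1 means "use the platform default", so the
# commented-out value is once again the effective default:
#checkpoint_flush_after = -1    # -1 uses the platform default
                                # (256kB on Linux, 0 elsewhere); 0 disables
```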
Re: [HACKERS] Forbid use of LF and CR characters in database and role names
> > [Summary]
> > 1. apply patch and make world
> >    -> failed because was mistakenly coded.
> >
> > 2. correct this mistake and make check-world
> >    -> got 1 failed test: "'pg_dumpall with \n\r in database name'"
> >    because the test script cannot createdb "foo\n\rbar"
>
> The attached version addresses those problems. I have replaced the test in
> src/bin/pg_dump/t/ by tests in src/bin/scripts/t/ to check that role names
> and database names with CR or LF fail to be created. I have as well added a
> test for initdb when the data directory has an incorrect character in 0002.

Thanks for your modification. I applied your fixed patch and the new one, and confirmed that the patched source passed the tests successfully. I also checked manually that the error messages were emitted correctly when CR/LF are included in the dbname, rolename, or data_directory.

AFAICT, this patch satisfies the concept discussed before, so I’ve switched this patch to “Ready for Committer”.

Regards,
Ideriha, Takeshi
Re: [HACKERS] patch: function xmltable
2016-11-25 7:44 GMT+01:00 Pavel Stehule :
> 2016-11-25 3:31 GMT+01:00 Alvaro Herrera :
>> Michael Paquier wrote:
>>> Nit: I did not look at the patch in detail,
>>> but I find the size of the latest version sent, 167kB, scary, as it
>>> complicates review and increases the likelihood of bugs.
>>
>> Here's the stat. Note that removing the functionality as discussed
>> would remove all of xpath_parser.c, but I think the rest of it remains
>> pretty much unchanged. So it's clearly a large patch, but there are
>> large docs and tests too, not just code.
>
> Yes, a lot of it is regression tests (the expected output exists in three
> variants) - and the XMLTABLE function is not trivial. Regression tests are
> about 50% of the patch.
>
> A lot of the code is mechanical - node-related. The really complex part is
> only in xml.c. There is only one longer function - the complexity comes from
> mapping the libxml2 result to a PostgreSQL result (while catching exceptions
> so that libxml2 resources are released).
>
> All the changes are well isolated - there is less risk of breaking
> something else.
Re: [HACKERS] patch: function xmltable
2016-11-25 3:31 GMT+01:00 Alvaro Herrera :
> Michael Paquier wrote:
>> Nit: I did not look at the patch in detail,
>> but I find the size of the latest version sent, 167kB, scary, as it
>> complicates review and increases the likelihood of bugs.
>
> Here's the stat. Note that removing the functionality as discussed
> would remove all of xpath_parser.c, but I think the rest of it remains
> pretty much unchanged. So it's clearly a large patch, but there are
> large docs and tests too, not just code.

Yes, a lot of it is regression tests (the expected output exists in three variants) - and the XMLTABLE function is not trivial.

A lot of the code is mechanical - node-related. The really complex part is only in xml.c. There is only one longer function - the complexity comes from mapping the libxml2 result to a PostgreSQL result (while catching exceptions so that libxml2 resources are released).

All the changes are well isolated - there is less risk of breaking something else.
Re: [HACKERS] Broken SSL tests in master
On 11/25/2016 07:19 AM, Tsunakawa, Takayuki wrote:
> Specifying multiple hosts is a new feature to be introduced in v10, so
> that's here: https://www.postgresql.org/docs/devel/static/libpq-connect.html

Thanks, I had missed that patch. If we add support for multiple hosts I think we should also allow specifying the corresponding multiple hostaddrs.

Another thought about this code: should we not check whether it is a Unix socket first, before splitting the host? While I doubt that it is common to have a Unix socket in a directory with a comma in the path, it is a bit annoying that we no longer support this.

Andreas
Re: [HACKERS] Broken SSL tests in master
From: pgsql-hackers-ow...@postgresql.org
> sense to add support for multiple hostaddrs. For consistency's sake if
> nothing else.

Yes, consistency and performance. The purpose of hostaddr is to speed up connection establishment by eliminating the DNS lookup, isn't it? Then some users will want to specify multiple IP addresses for hostaddr and omit host.

> By the way, are comma-separated hosts documented somewhere? It is not
> included in
> https://www.postgresql.org/docs/9.6/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS.

Specifying multiple hosts is a new feature to be introduced in v10, so that's here: https://www.postgresql.org/docs/devel/static/libpq-connect.html

Regards
Takayuki Tsunakawa
Re: [HACKERS] Random PGDLLIMPORTing
Craig Ringer writes:
> PGDLLIMPORT is free, so the question should be "is there a reason not
> to add it here?".

TBH, my basic complaint about it is that I do not like Microsoft's tool chain assuming that it's entitled to demand that people sprinkle Microsoft-specific droppings throughout what would otherwise be platform-independent source code.

However, Victor Wagner's observation upthread is quite troubling:
>> It worth checking actual variable definitions, not just declarations.
>> I've found recently, that at least in MSVC build system, only
>> initialized variables are included into postgres.def file, and so are
>> actually exported from the backend binary.

If this is correct (don't ask me, I don't do Windows) then the issue is not whether "PGDLLIMPORT is free". This puts two separate source-code demands on variables that we want to make available to extensions, neither of which is practically checkable on non-Windows platforms.

I think that basically it's going to be on the heads of people who want to work on Windows to make sure that things work on that platform. That is the contract that every other platform under the sun understands, but it seems like Windows people think it's on the rest of us to make their platform work. I'm done with that.

regards, tom lane
Re: [HACKERS] IF (NOT) EXISTS in psql-completion
Hello,

At Fri, 25 Nov 2016 06:51:43 +0100, Pavel Stehule wrote:
> I am sure about the benefit of all the patches - but it is a lot of changes
> at one time, and it is not necessary at this moment.
>
> Patches 0004 and 0005 make some bigger conceptual changes, and the work can
> be separated.

The patches are collections of sporadic changes on the same basis. I agree that the result is too big to look at. (And the code itself is confusing.) Please wait a while for separated patches.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [HACKERS] Broken SSL tests in master
On Fri, Nov 25, 2016 at 10:41 AM, Tsunakawa, Takayuki <tsunakawa.ta...@jp.fujitsu.com> wrote:
> I agree that pg_conn_host should have hostaddr in addition to host, and
> PQhost() should return host when host is specified, with or without
> hostaddr specified.

typedef struct pg_conn_host
{
    char       *host;           /* host name or address, or socket path */
    pg_conn_host_type type;     /* type of host */
    char       *port;           /* port number for this host; if not NULL,
                                 * overrides the PGconn's pgport */
    char       *password;       /* password for this host, read from the
                                 * password file.  only set if the PGconn's
                                 * pgpass field is NULL. */
    struct addrinfo *addrlist;  /* list of possible backend addresses */
} pg_conn_host;

typedef enum pg_conn_host_type
{
    CHT_HOST_NAME,
    CHT_HOST_ADDRESS,
    CHT_UNIX_SOCKET
} pg_conn_host_type;

The host parameter stores both hostname and hostaddr, and we already have the "type" member to distinguish them. I think we should not be using PQhost() directly in verify_peer_name_matches_certificate_name (the same holds for GSS and SSPI). Instead, proceed only if "conn->connhost[conn->whichhost]" is a CHT_HOST_NAME. Also, the old PQhost() did not produce CHT_HOST_ADDRESS as its output, so we might need to revert to the old behaviour.

> However, I wonder whether the hostaddr parameter should also accept
> multiple IP addresses. Currently, it accepts only one address as follows.
> I asked Robert and Mithun about this, but I forgot about that.

As far as I know, only pghost is allowed to hold multiple hosts; pghostaddr takes only one numeric address.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] UNDO and in-place update
2016-11-25 1:44 GMT+01:00 Robert Haas :
> On Thu, Nov 24, 2016 at 6:20 PM, Pavel Stehule wrote:
>>> I think that the whole emphasis on whether and to what degree this is
>>> like Oracle is somewhat misplaced. I would look at it a different
>>> way. We've talked many times over the years about how PostgreSQL is
>>> optimized for aborts. Everybody that I've heard comment on that issue
>>> thinks that is a bad thing.
>>
>> Again, this depends on usage - when you have the possibility to run
>> VACUUM, then this strategy is better.
>>
>> Fast aborts are one pretty good feature for stable usage.
>>
>> Isn't it possible to use an UNDO log (or something similar) for VACUUM?
>> ROLLBACK should be fast, but VACUUM can do more work?
>
> I think that in this design we wouldn't use VACUUM at all. However,
> if what you are saying is that we should try to make aborts
> near-instantaneous by pushing UNDO actions into the background, I
> agree entirely. InnoDB already does that, IIUC.

OK, it can be another process - one that can be more aggressive and less limited than vacuum.

Regards
Pavel
Re: [HACKERS] Broken SSL tests in master
On 11/25/2016 06:11 AM, Tsunakawa, Takayuki wrote:
> However, I wonder whether the hostaddr parameter should also accept
> multiple IP addresses.

Yeah, I too thought about whether we should fix that. I feel like it would make sense to add support for multiple hostaddrs, for consistency's sake if nothing else.

By the way, are comma-separated hosts documented somewhere? It is not included in https://www.postgresql.org/docs/9.6/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS.

Andreas
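For reference, the comma-separated host syntax under discussion could look like this (an illustrative sketch of the v10 feature with invented host names; as noted in the thread, hostaddr still accepts only a single numeric address):

```
psql "host=db1.example.com,db2.example.com dbname=app"
```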
Re: [HACKERS] IF (NOT) EXISTS in psql-completion
2016-11-25 2:24 GMT+01:00 Kyotaro HORIGUCHI :
> Hello,
>
> Thank you for looking at this long-and-bothersome patch.
>
> At Wed, 23 Nov 2016 07:12:00 +0100, Pavel Stehule wrote:
>> Hi
>>
>> 2016-11-15 12:26 GMT+01:00 Kyotaro HORIGUCHI:
>>> Hello, I rebased this patch on the current master.
>>>
>>> At Mon, 31 Oct 2016 10:15:48 +0900 (Tokyo Standard Time), Kyotaro
>>> HORIGUCHI wrote:
>>>> Anyway, I fixed the space issues and addressed the
>>>> COMPLETE_WITH_QUERY()'s almost-unused-parameter problem. And
>>>> tried to write README files.
>>>
>>> tab-complete.c has gotten some improvements since then:
>>>
>>> 577f0bdd2b8904cbdfde6c98f4bda6fd93a05ffc psql: Tab completion for
>>> renaming enum values.
>>> 927d7bb6b120a2ca09a164898f887eb850b7a329 Improve tab completion for
>>> CREATE TRIGGER.
>>> 1d15d0db50a5f39ab69c1fe60f2d5dcc7e2ddb9c psql: Tab-complete LOCK [TABLE]
>>> ... IN {ACCESS|ROW|SHARE}.
>>>
>>> The attached patchset is rebased on the master including these
>>> patches.
>>
>> I checked patches 0001, 0002, and 0003.
>>
>> There are no problems with patching or compiling; it works as expected.
>>
>> These patches can be committed separately - they are long, but the code
>> is almost mechanical.
>
> Thanks.
>
> You're right. I hadn't considered the relations among them.

I am sure about the benefit of all the patches - but it is a lot of changes at one time, and it is not necessary at this moment.

Patches 0004 and 0005 make some bigger conceptual changes, and that work can be separated.

Regards
Pavel

> 0001 (if-else refactoring) does not do anything functionally. It is
> required by 0004 (word-shift-and-removal) and 0005 (if-not-exists).
>
> 0002 (keywords case improvement) is almost independent of all
> other patches in this patch set.
> And it brings an obvious improvement.
>
> 0003 (addition to 0002) moves embedded keywords out of defined
> queries. Functionally it could be united with 0002, but it is separated
> for understandability.
>
> 0004 (word-shift-and-removal) is the quite arguable one. This
> introduces an ability to modify (or destroy) the previous_words
> array. This reduces almost-redundant matching predicates such as
>
>> if (TailMatches3("CREATE|UNIQUE", "INDEX", MatchAny) ||
>>     TailMatches4("CREATE|UNIQUE", "INDEX", "CONCURRENTLY", MatchAny))
>
> into
>
>> if (Matches3("CREATE", "INDEX", MatchAnyExcept("ON")))
>
> by removing "CONCURRENTLY". This obviously simplifies the predicates
> literally, but the code then implies a history of modification. The
> implied history might be worse than the previous shape, especially for
> simple cases like this one. For a complex case like CREATE TRIGGER, it
> seems worse than the original shape... I'll consider this a bit more.
> Maybe the match-and-collapse template should be written as a predicate.
>
> 0005 (if-not-exists). I have to admit that this is an arguable
> feature...
>
> 0006 is the tender point :( but.
>
>> The README is perfect
>
> Thank you, I'm relieved to hear that.
>
> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
Re: [HACKERS] Broken SSL tests in master
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Andreas Karlsson
> On 11/24/2016 10:38 PM, Andreas Karlsson wrote:
>> To me it feels like the proper fix would be to make PQhost() return
>> the value of the host parameter rather than the hostaddr (maybe add a
>> new field in the pg_conn_host struct). But that would be a behaviour
>> change which might break someone's application. Thoughts?
>
> I have attached a proof of concept patch for this. Remaining work is
> investigating all the callers of PQhost() and seeing if any of them are
> negatively affected by this patch, and cleaning up the code some.

I agree that pg_conn_host should have hostaddr in addition to host, and that PQhost() should return host when host is specified, with or without hostaddr specified. However, I wonder whether the hostaddr parameter should also accept multiple IP addresses. Currently, it accepts only one address, as follows. I asked Robert and Mithun about this, but I forgot about that.

static bool
connectOptions2(PGconn *conn)
{
    /*
     * Allocate memory for details about each host to which we might possibly
     * try to connect.  If pghostaddr is set, we're only going to try to
     * connect to that one particular address.  If it's not, we'll use pghost,
     * which may contain multiple, comma-separated names.
     */

Regards
Takayuki Tsunakawa
Re: [HACKERS] [sqlsmith] Failed assertion in parallel worker in ExecInitSubPlan
Amit Kapila writes:
> On Fri, Nov 25, 2016 at 3:26 AM, Andreas Seltenreich wrote:
>> just caught another InitPlan below Gather with the recent patches in
>> (master as of 4cc6a3f). Recipe below.
> I think this problem exists since commit
> 110a6dbdebebac9401b43a8fc223e6ec43cd4d10 where we have allowed
> subqueries to be pushed to parallel workers.

The impression I got in poking at this for a few minutes, before going off to stuff myself with turkey, was that we were allowing a SubqueryScanPath to mark itself as parallel-safe even though the resulting plan node would have an InitPlan attached. So my thought about fixing it was along the lines of if-subroot-contains-initplans-then-dont-let-SubqueryScanPath-be-parallel-safe. What you propose here seems like it would shut off parallel query in more cases than that. But I'm still full of turkey and may be missing something.

There's another issue here, which is that the InitPlan is derived from an outer join qual whose outer join seems to have been deleted entirely by analyzejoins.c. Up to now it was possible to avert our eyes from the fact that join removal is lazy about getting rid of unused InitPlans, but if they're going to disable parallelization of the surrounding query, maybe we need to work harder on that.

Another question worth asking is whether it was okay for the subquery to decide to use a Gather. Are we OK with multiple Gathers per plan tree, independently of the points above?

regards, tom lane
Re: [HACKERS] Declarative partitioning - another take
> I assume you meant "...right after the column name"?
>
> I will modify the grammar to allow that way then, so that the following
> will work:
>
> create table p1 partition of p (
>     a primary key
> ) for values in (1);

That seems non-intuitive as well. The way it's written, it looks like "a primary key" is associated with p rather than p1.

Is there any column constraint that cannot be expressed as a table constraint? If not, then we can think of dropping the column-constraint syntax altogether and letting the user specify column constraints through the table-constraint syntax. Or we could drop constraints altogether from the "CREATE TABLE .. PARTITION OF" syntax and let the user handle them through ALTER TABLE commands; in a later version, we can introduce constraint syntax in that DDL.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
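The two spellings discussed in this thread could be sketched as follows (using the thread's own example names; neither form was settled at this point):

```sql
-- Column-constraint form, where "a primary key" visually reads as if
-- it attached to the parent p rather than to the partition p1:
CREATE TABLE p1 PARTITION OF p (
    a PRIMARY KEY
) FOR VALUES IN (1);

-- Table-constraint form, which avoids that ambiguity:
CREATE TABLE p1 PARTITION OF p (
    PRIMARY KEY (a)
) FOR VALUES IN (1);
```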
Re: [HACKERS] [sqlsmith] Failed assertion in parallel worker in ExecInitSubPlan
On Fri, Nov 25, 2016 at 3:26 AM, Andreas Seltenreich wrote:
> Hi,
>
> just caught another InitPlan below Gather with the recent patches in
> (master as of 4cc6a3f). Recipe below.

I think this problem exists since commit 110a6dbdebebac9401b43a8fc223e6ec43cd4d10, where we allowed subqueries to be pushed to parallel workers. I think we should consider a rel (where rtekind is RTE_SUBQUERY) to be parallel-safe if the subquery is also parallel-safe. The attached patch does that and fixes the reported problem for me.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

allow_safe_subquery_parallel_worker_v1.patch
Re: [HACKERS] Broken SSL tests in master
On Fri, Nov 25, 2016 at 8:15 AM, Andreas Karlsson wrote:
> On 11/24/2016 10:38 PM, Andreas Karlsson wrote:
>> To me it feels like the proper fix would be to make PQhost() return the
>> value of the host parameter rather than the hostaddr (maybe add a new
>> field in the pg_conn_host struct). But that would be a behaviour change
>> which might break someone's application. Thoughts?

Thanks for digging up the root cause. I can see the problem with HEAD as well.

> I have attached a proof of concept patch for this. Remaining work is
> investigating all the callers of PQhost() and seeing if any of them are
> negatively affected by this patch, and cleaning up the code some.

This needs some thought, but first I need to understand the whereabouts of the refactoring work in 9a1d0af4 as well as the structures that 274bb2b3 has introduced. From what I can see you are duplicating some of the logic parsing the pghost string when pghostaddr is set, so my first guess is that you are breaking some of the intentions behind this code by patching it this way... I am adding in CC Robert, Mithun and Tsunakawa-san, who worked on that.

On my side, I'll need some time to study and understand this new code; that's necessary before looking at your patch in detail, and there could be a more elegant solution. And we had better address this issue before looking more into the SSL reload patch.

--
Michael
Re: [HACKERS] Declarative partitioning - another take
Amit Langote wrote:
> On 2016/11/25 4:36, Alvaro Herrera wrote:
>> I think CREATE TABLE OF is pretty much a corner case. I agree that
>> allowing the constraint right after the constraint name is more
>> intuitive.
>
> I assume you meant "...right after the column name"?

Eh, right.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
[HACKERS] Typo in comment
Hi,

Here is a tiny patch to fix a typo in execParallel.c.

--
Thomas Munro
http://www.enterprisedb.com

typo.patch
Re: [HACKERS] Declarative partitioning - another take
On Thu, Nov 24, 2016 at 6:13 AM, Amit Langote wrote:
> On 2016/11/23 4:50, Robert Haas wrote:
>> On Tue, Nov 22, 2016 at 4:15 AM, Amit Langote wrote:
>>>> The easiest thing to do might be to just enforce that all of the
>>>> partition key columns have to be not-null when the range-partitioned
>>>> table is defined, and reject any attempt to DROP NOT NULL on them
>>>> later. That's probably better than shoehorning it into the table
>>>> constraint.
>>>
>>> Agreed that forcing range partitioning columns to be NOT NULL during
>>> table creation would be a better approach. But then we would have to
>>> reject using expressions in the range partition key, right?
>>
>> Why?
>
> I was thinking of it like how primary key columns cannot contain
> expressions; the reason for which, I assume, is because while we can
> ensure that a column contains only non-null values by defining a
> constraint on the column, there is no way to force expressions to be
> non-null.
>
> In any case, I have implemented in the latest patch that when creating a
> range partitioned table, its partition key columns are automatically set
> to be NOT NULL. Although, it leaves out the columns that are referenced
> in expressions. So even after doing so, we need to check, after computing
> the range partition key of an input row, that none of the partition keys
> is null, because expressions can still return null.

Right. And ensuring that those columns were NOT NULL would be wrong, as it wouldn't guarantee a non-null result anyway.

> Also, it does nothing to help the undesirable situation that one can
> insert a row with a null partition key (expression) into any of the range
> partitions if targeted directly, because of how ExecQual() handles
> nullable constraint expressions (treats a null value as satisfying the
> partition constraint).

That's going to have to be fixed somehow. How bad would it be if we passed ExecQual's third argument as false for partition constraints?
Or else you could generate the actual constraint as expr IS NOT NULL AND expr >= lb AND expr < ub.

> An alternative possibly worth considering might be to somehow handle the
> null range partition keys within the logic to compare against range bound
> datums. It looks like other databases will map the rows containing nulls
> to the unbounded partition. One database allows specifying NULLS
> FIRST/LAST and maps a row containing a null key to the partition with
> -infinity as the lower bound or +infinity as the upper bound, respectively,
> with NULLS LAST the default behavior.

It seems more future-proof not to allow NULLs at all for now, and figure out what if anything we want to do about that later. I mean, with the syntax we've got here, anything else is basically deciding whether NULL is the lowest value or the highest value. It would be convenient for my employer if we made the same decision that Oracle did here, but it doesn't really seem like the PostgreSQL way - or to put that another way, it's really ugly and unprincipled. So I recommend we decide for now that a partitioning column can't be null and a partitioning expression can't evaluate to NULL. If it does, ERROR.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
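The "spell out the null check" suggestion above could look roughly like this for a range partition over an expression (a sketch with illustrative bounds; the actual generated constraint is up to the patch):

```sql
-- For a hypothetical partition FOR VALUES FROM (10) TO (20) on (expr),
-- the implied constraint with an explicit null check would be:
--
--     CHECK (expr IS NOT NULL AND expr >= 10 AND expr < 20)
--
-- so a NULL key makes the whole expression false rather than NULL,
-- and ExecQual() no longer treats it as satisfying the constraint.
```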
Re: [HACKERS] patch: function xmltable
Michael Paquier wrote:
> Nit: I did not look at the patch in detail,
> but I find the size of the latest version sent, 167kB, scary, as it
> complicates review and increases the likelihood of bugs.

Here's the stat. Note that removing the functionality as discussed would remove all of xpath_parser.c, but I think the rest of it remains pretty much unchanged. So it's clearly a large patch, but there are large docs and tests too, not just code.

 doc/src/sgml/func.sgml               | 376 ++---
 src/backend/executor/execQual.c      | 335 +++
 src/backend/executor/execTuples.c    |  42 +++
 src/backend/nodes/copyfuncs.c        |  66 ++++
 src/backend/nodes/equalfuncs.c       |  51 +++
 src/backend/nodes/nodeFuncs.c        | 100 ++
 src/backend/nodes/outfuncs.c         |  51 +++
 src/backend/nodes/readfuncs.c        |  42 +++
 src/backend/optimizer/util/clauses.c |  33 ++
 src/backend/parser/gram.y            | 181 ++-
 src/backend/parser/parse_coerce.c    |  33 +-
 src/backend/parser/parse_expr.c      | 182 +++
 src/backend/parser/parse_target.c    |   7 +
 src/backend/utils/adt/Makefile       |   2 +-
 src/backend/utils/adt/ruleutils.c    | 100 ++
 src/backend/utils/adt/xml.c          | 610 +++
 src/backend/utils/adt/xpath_parser.c | 337 +++
 src/backend/utils/fmgr/funcapi.c     |  13 +
 src/include/executor/executor.h      |   1 +
 src/include/executor/tableexpr.h     |  69 ++++
 src/include/funcapi.h                |   1 -
 src/include/nodes/execnodes.h        |  31 ++
 src/include/nodes/nodes.h            |   4 +
 src/include/nodes/parsenodes.h       |  21 ++
 src/include/nodes/primnodes.h        |  40 +++
 src/include/parser/kwlist.h          |   3 +
 src/include/parser/parse_coerce.h    |   4 +
 src/include/utils/xml.h              |   2 +
 src/include/utils/xpath_parser.h     |  24 ++
 src/test/regress/expected/xml.out    | 415 ++++
 src/test/regress/expected/xml_1.out  | 323 +++
 src/test/regress/expected/xml_2.out  | 414 ++++
 src/test/regress/sql/xml.sql         | 170 ++
 33 files changed, 4019 insertions(+), 64 deletions(-)

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support
On Wed, Nov 23, 2016 at 12:53 PM, Tom Lane wrote: > Haribabu Kommi writes: > > On Wed, Nov 23, 2016 at 1:42 AM, Tom Lane wrote: > >> The precedent of int4/int8/float4/float8 is that SQL data types should > >> be named after their length in bytes. So I'd be inclined to call this > >> "macaddr8" not "macaddr64". That would suggest taking the simple > >> approach of always storing values in the 8-byte format, rather than > >> dealing with the complexities of having two formats internally, two > >> display formats, etc. > > > Do you prefer the automatic conversion from 6 byte to 8 byte MAC address, > > This way it takes extra space with new datatype. Is it fine to with new > > datatype? > > Well, I think the space savings would be pretty illusory. If we use a > varlena datatype, then old-style MAC addresses would take 7 bytes and > new-style would take 9. That's not much of an improvement over always > taking 8 bytes. What's worse, if the next field has an alignment > requirement more than char, is that it's really 8 bytes and 12 bytes (or > 16!), making this strictly worse than a fixed-length-8-bytes approach. > > As against that, if we use a varlena type then we'd have some protection > against the day when the MAC people realize they were still being > short-sighted and go up to 10 or 12 or 16 bytes. But even if that happens > while Postgres is still in use, I'm not sure that we'd choose to make use > of the varlena aspect rather than invent a third datatype to go with that > new version of the standard. Per the discussion in this thread, varlena > storage in itself doesn't do very much for the client-side compatibility > issues. Making a new datatype with a new, well-defined I/O behavior > ensures that applications don't get blindsided by a new behavior they're > not ready for. > > In short, I'm leaning to having just a fixed-length-8-byte implementation. 
> This may seem like learning nothing from our last go-round, but the
> advantages of varlena are very far in the hypothetical future, and the
> disadvantages are immediate.
>
> Also, if we define macaddr as "always 6 bytes" and macaddr8 as "always 8
> bytes", then there's a very simple way for users to widen an old-style
> address to 8 bytes or convert it back to the 6-byte format: just cast
> to the other datatype.  If the new macaddr type can store both widths
> then you need to invent at least one additional function to provide
> those behaviors.

Thanks for your feedback. Attached is an updated patch with the new datatype "macaddr8", with a fixed length of 8 bytes. If you input a 6-byte MAC address, it is automatically converted into an 8-byte MAC address by inserting the reserved bytes, and is stored as an 8-byte address. When sent to the client it is always an 8-byte MAC address. Currently, casting is supported from macaddr to macaddr8, but not the other way, because not all 8-byte MAC addresses can be converted into 6-byte addresses. Test and doc changes are also added. Comments?

Regards,
Hari Babu
Fujitsu Australia

mac_eui64_support_2.patch
Description: Binary data
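The 6-byte-to-8-byte widening described above can be sketched outside the server as well. This is an illustrative C function (not the patch's actual code): the two reserved bytes 0xFF and 0xFE are inserted between the 3-byte OUI and the 3-byte NIC-specific part, which is how an EUI-48 address is embedded in EUI-64, and why the reverse conversion is only possible for addresses that carry those two bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Widen a 6-byte MAC address (EUI-48) to the 8-byte EUI-64 form by
 * inserting the reserved bytes 0xFF and 0xFE between the 3-byte OUI
 * and the 3-byte NIC-specific part.
 */
static void
mac6_to_mac8(const uint8_t in[6], uint8_t out[8])
{
	memcpy(out, in, 3);			/* OUI */
	out[3] = 0xFF;				/* reserved */
	out[4] = 0xFE;				/* reserved */
	memcpy(out + 5, in + 3, 3);	/* NIC-specific part */
}
```

Going the other way, a sketch would have to check that bytes 3 and 4 are exactly FF:FE before stripping them, mirroring why the cast from macaddr8 back to macaddr cannot be unconditional.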
Re: [HACKERS] DISTINCT with btree skip scan
On Wed, Nov 23, 2016 at 4:19 PM, Thomas Munro wrote: > On Wed, Oct 12, 2016 at 4:19 PM, Thomas Munro > wrote: >> Here it is, now split into two parts: one patch >> to add an amskip operation, and another to consider using it to >> implement DISTINCT when it already has an index only scan path on >> an index that supports skipping. > > Those patches add an amskip operation and then teach the planner and > executor how to do this, given a table foo (a, b) with an index on (a) > or (a, b): > > SELECT DISTINCT a FROM foo > > Index Only Scan for distinct values of (a) using foo_pkey on foo Cool! > In just the right circumstances that is vastly faster than scanning > every tuple and uniquifying. I was thinking that the next step in > this experiment might be to introduce a kind of plan that can handle > queries where the index's leading column is not in the WHERE clause, > building on the above plan, and behaving conceptually a little bit > like a Nested Loop, like so: > > SELECT * FROM foo WHERE b = 42 > > Index Skip Scan > -> Index Only Scan for distinct values of (a) using foo_pkey on foo > -> Index Only Scan using foo_pkey on foo > Index Cond: ((a = outer.a) AND (b = 42)) Blech, that seems really ugly and probably inefficient. > I suppose you could instead have a single node that knows how to > perform the whole scan itself so that you don't finish up searching > for each distinct value in both the inner and outer subplan, but it > occurred to me that building the plan out of Lego bricks like this > might generalise better to more complicated queries. I suppose it > might also parallelise nicely with a Parallel Index Only Scan as the > outer plan. I think putting all the smarts in a single node is likely to be better, faster, and cleaner. This patch has gotten favorable comments from several people, but somebody really needs to REVIEW it. Oh, well, since I'm here... 
+if (ScanDirectionIsForward(dir))
+{
+    so->currPos.moreLeft = false;
+    so->currPos.moreRight = true;
+}
+else
+{
+    so->currPos.moreLeft = true;
+    so->currPos.moreRight = false;
+}

The lack of comments makes it hard for me to understand what the motivation for this is, but I bet it's wrong. Suppose there's a cursor involved here and the user tries to back up. Instead of having a separate amskip operation, maybe there should be a flag attached to a scan indicating whether it should return only distinct results. Otherwise, you're allowing for the possibility that the same scan might sometimes skip and other times not skip, but then it seems hard for the scan to survive cursor operations. Anyway, why would that be useful?

+if (ScanDirectionIsForward(dir))
+    offnum = OffsetNumberPrev(offnum);
+if (!_bt_readpage(scan, dir, offnum))
+{
+    if (!_bt_steppage(scan, dir))
+    {
+        return false;
+    }
+}

As the TODO:TM comment at the top of the function sort of implies without directly coming out and saying so, this is ugly and suggests that you've picked the wrong API. If the idea I proposed above isn't appealing, I still think we need to get rid of this by some other means.

+/*
+ * TODO:TM For now, we use _bt_search to search from the root; in theory
+ * we should be able to do a local traversal ie from the current page, but
+ * I don't know if it would actually be better in general.
+ */

I think that this depends a lot on the number of duplicates. If we end up merely fast-forwarding within the same page to find the next unique value, re-traversing from the root sucks. However, if the next distinct key *isn't* on the same page, then traversing from the root is a good bet. A typical btree is only a few levels deep (like 3) so once you don't find what you're looking for on the same page it's probably fairly sound to re-traverse from the root. 
Of course you lose at least a little bit in the case where you would have found the next distinct key within a page or two but in other cases you win big. I wonder if that would be a suitable heuristic: first check the current page, if all keys there are equal then retraverse from the root. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
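The heuristic proposed here — fast-forward within the current page first, and only re-descend from the root when every remaining key on the page is equal — can be sketched against a sorted array standing in for the btree leaf level. This is purely illustrative (made-up names, a fixed "page" size, not nbtree code):

```c
#include <assert.h>
#include <stddef.h>

/* Toy "page" size: keys are grouped into fixed chunks of this many. */
#define PAGE_SIZE 4

/*
 * Return the index of the first key greater than keys[cur] in a sorted
 * array of nkeys elements, or nkeys if none exists.  First scan the
 * remainder of the current page; if every key there equals keys[cur],
 * fall back to a binary search over the whole array, mimicking a
 * re-traversal from the btree root.
 */
static size_t
skip_to_next_distinct(const int *keys, size_t nkeys, size_t cur)
{
	size_t		page_end = (cur / PAGE_SIZE + 1) * PAGE_SIZE;

	if (page_end > nkeys)
		page_end = nkeys;

	/* Cheap case: the next distinct key lives on the same page. */
	for (size_t i = cur + 1; i < page_end; i++)
		if (keys[i] != keys[cur])
			return i;

	/* Expensive case: "re-descend from the root" via binary search. */
	size_t		lo = page_end;
	size_t		hi = nkeys;

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (keys[mid] <= keys[cur])
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}
```

With few duplicates the loop finds the answer on the current page in a step or two; with long runs of duplicates the binary search pays off, which is exactly the trade-off discussed above.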
Re: [HACKERS] IF (NOT) EXISTS in psql-completion
Hello, Thank you for looking at this long and bothersome patch. At Wed, 23 Nov 2016 07:12:00 +0100, Pavel Stehule wrote in > Hi > > 2016-11-15 12:26 GMT+01:00 Kyotaro HORIGUCHI jp>: > > > Hello, I rebased this patch on the current master. > > > > At Mon, 31 Oct 2016 10:15:48 +0900 (Tokyo Standard Time), Kyotaro > > HORIGUCHI wrote in < > > 20161031.101548.162143279.horiguchi.kyot...@lab.ntt.co.jp> > > > Anyway, I fixed space issues and addressed the > > > COMPLETE_WITH_QUERY()'s almost unused parameter problem. And > > > tried to write a README files. > > > > tab-complete.c has gotten some improvements after this time. > > > > 577f0bdd2b8904cbdfde6c98f4bda6fd93a05ffc psql: Tab completion for > > renaming enum values. > > 927d7bb6b120a2ca09a164898f887eb850b7a329 Improve tab completion for > > CREATE TRIGGER. > > 1d15d0db50a5f39ab69c1fe60f2d5dcc7e2ddb9c psql: Tab-complete LOCK [TABLE] > > ... IN {ACCESS|ROW|SHARE}. > > > > The attached patchset is rebased on the master including these > > patches. > > I checked the 0001, 0002, 0003 patches > > There are no problems with patching or compiling; it works as > expected > > These patches can be committed separately - they are long, but the code is > almost mechanical Thanks. You're right. I hadn't considered the relations among them. 0001 (if-else refactoring) does not change anything functionally. It is required by 0004 (word-shift-and-removal) and 0005 (if-not-exists). 0002 (keywords case improvement) is almost independent of all the other patches in this patch set, and it brings an obvious improvement. 0003 (an addition to 0002) moves embedded keywords out of the defined queries. Functionally it could be united with 0002, but it is separated for understandability. 0004 (word-shift-and-removal) is quite an arguable one. It introduces the ability to modify (or destroy) the previous_words array. 
This reduces almost redundant matching predicates such as

> if (TailMatches3("CREATE|UNIQUE", "INDEX", MatchAny) ||
>     TailMatches4("CREATE|UNIQUE", "INDEX", "CONCURRENTLY", MatchAny))

into

> if (Matches3("CREATE", "INDEX", MatchAnyExcept("ON")))

by removing "CONCURRENTLY". This obviously simplifies the predicates literally, but then the code implies a history of modification. The implied history might be worse than the previous shape, especially for simple cases like this. For a complex case like CREATE TRIGGER, it seems worse than the original shape... I'll consider this a bit more. Maybe the match-and-collapse template should be written as a predicate.

0005 (if-not-exists): I have to admit that this is an arguable feature...

0006 is the tender point :( but...

> The README is perfect

Thank you, I'm relieved to hear that.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
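The effect of the word removal in 0004 can be illustrated with a toy matcher. All names here are hypothetical (this is not tab-complete.c): dropping an optional word such as CONCURRENTLY from the collected words lets a single tail pattern cover both spellings.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Match the last three collected words against a pattern; a toy
 * stand-in for psql's TailMatches macros (no alternation or MatchAny).
 */
static bool
tail_matches3(const char **words, int nwords,
			  const char *w1, const char *w2, const char *w3)
{
	if (nwords < 3)
		return false;
	return strcmp(words[nwords - 3], w1) == 0 &&
		strcmp(words[nwords - 2], w2) == 0 &&
		strcmp(words[nwords - 1], w3) == 0;
}

/*
 * Remove one occurrence of an optional word, shifting the rest left;
 * this is the "word shift and removal" idea applied to previous_words.
 * Returns the new word count.
 */
static int
remove_word(const char **words, int nwords, const char *victim)
{
	for (int i = 0; i < nwords; i++)
	{
		if (strcmp(words[i], victim) == 0)
		{
			memmove(&words[i], &words[i + 1],
					(nwords - i - 1) * sizeof(*words));
			return nwords - 1;
		}
	}
	return nwords;
}
```

As the review notes, the price of this simplification is that later predicates see the modified word list, not what the user actually typed.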
Re: [HACKERS] [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId
On 25 November 2016 at 02:44, Alvaro Herrera wrote: > Craig Ringer wrote: > >> Updated to correct the other expected file, since there's an alternate. > > FWIW I don't know what you did here, but you did not patch the > alternate expected file. Damn. Attached the first patch a second time is what I did. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] 9.6 TAP tests and extensions
On 25 November 2016 at 02:47, Alvaro Herrera wrote: > Craig Ringer wrote: >> On 27 October 2016 at 00:42, Robert Haas wrote: >> > On Wed, Oct 26, 2016 at 7:17 AM, Andres Freund wrote: >> >> On 2016-09-23 16:04:32 -0400, Tom Lane wrote: >> >>> Looking back over the thread, I see that you also proposed installing >> >>> isolationtester and pg_isolation_regress for the benefit of extensions. >> >>> I'm very much less excited about that idea. It'd be substantially more >> >>> dead weight in typical installations, and I'm not sure that it'd be >> >>> useful >> >>> to common extensions, and I'm not eager to treat isolationtester's API >> >>> and behavior as something we need to hold stable for extension use. >> >> >> >> FWIW, I'd be quite happy if it were installed. Running isolationtester >> >> when compiling extensions against distribution postgres packages would >> >> be quite useful. >> > >> > +1. >> >> Patch attached. > > Hmm but this only installs isolationtester itself ... don't you need > pg_isolation_regress too? Yeah, as Andres pointed out offlist after I posted this. Meant to follow up but got side-tracked. It needs PGXS support to be properly useful. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Declarative partitioning - another take
On 2016/11/25 4:36, Alvaro Herrera wrote: > Amit Langote wrote: >> On 2016/11/24 15:10, Ashutosh Bapat wrote: >>> On Thu, Nov 24, 2016 at 11:34 AM, Amit Langote wrote: > You have to specify column constraints using the keywords WITH OPTIONS, like below: create table p1 partition of p ( a with options primary key ) for values in (1); >>> >>> Oh, sorry for not noticing it. You are right. Why do we need "with >>> option" there? Shouldn't user be able to specify just "a primary key"; >>> it's not really an "option", it's a constraint. >> >> I just adopted the existing syntax for specifying column/table constraints >> of a table created with CREATE TABLE OF type_name. > > I think CREATE TABLE OF is pretty much a corner case. I agree that > allowing the constraint right after the constraint name is more > intuitive. I assume you meant "...right after the column name"? I will modify the grammar to allow that way then, so that the following will work: create table p1 partition of p ( a primary key ) for values in (1); Thanks, Amit
Re: [HACKERS] Random PGDLLIMPORTing
On 25 November 2016 at 07:36, Michael Paquier wrote: > On Thu, Nov 24, 2016 at 11:01 PM, Magnus Hagander wrote: >> On Thu, Nov 24, 2016 at 2:58 PM, Craig Ringer wrote: >> My guess is that PGDLLIMPORT has been added explicitly when somebody needed >> it for something, without any actual thought. I can't say I see any reason >> not to export the other ones as well -- more that maybe there are even more >> that are needed? > > Yes, more or less. The reasoning behind at least the PostmasterPID bit is > that: > https://www.postgresql.org/message-id/CAB7nPqS_=14krcdch6nyrsw8c58j1cwxv6qcvs7yp-6tthq...@mail.gmail.com > That resulted in cac83219. > > The other ones could perhaps be marked with PGDLLIMPORT. Now usually > we need a use-case behind this change, and there are not that many > hackers on Windows doing much plugin development. Exactly, that's why nobody's been shouting yet. The use case is exactly the same as the use case for these vars existing. IsBackgroundWorker in particular makes no sense to have at all if it isn't exported, and the only reason nobody's complaining is that nobody's building cassert binaries under Windows or has tried to write anything that has behaviour that cares if it's in a bgworker or not. PGDLLIMPORT is free, so the question should be "is there a reason not to add it here?". -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] patch: function xmltable
On Fri, Nov 25, 2016 at 3:31 AM, Pavel Stehule wrote: > 2016-11-24 18:51 GMT+01:00 Tom Lane : >> contrib/xml2 has always relied on libxslt for xpath functionality. >> Can we do that here instead of writing, debugging, and documenting >> a pile of new code? > > I am sorry - I don't see it. There is nothing complex manipulation with > XPath expressions. You are missing the point here, which is to make the implementation footprint as light as possible, especially if the added functionality is already present in a dependency that Postgres can be linked to. OK, libxslt can only be linked with contrib/xml2/ now, but it would be at least worth looking at how much the current patch can be simplified for things like transformTableExpr or XmlTableGetValue by relying on some existing routines. Nit: I did not look at the patch in details, but I find the size of the latest version sent, 167kB, scary as it complicates review and increases the likeliness of bugs. -- Michael
Re: [HACKERS] UNDO and in-place update
On Thu, Nov 24, 2016 at 6:20 PM, Pavel Stehule wrote: >> I think that the whole emphasis on whether and to what degree this is >> like Oracle is somewhat misplaced. I would look at it a different >> way. We've talked many times over the years about how PostgreSQL is >> optimized for aborts. Everybody that I've heard comment on that issue >> thinks that is a bad thing. > > again, this depends on usage - when you have the possibility to run VACUUM, > this strategy is better. > > Fast aborts are one pretty good feature for stable usage. > > Isn't it possible to use an UNDO log (or something similar) for VACUUM? > ROLLBACK should be fast, but VACUUM can do more work? I think that in this design we wouldn't use VACUUM at all. However, if what you are saying is that we should try to make aborts near-instantaneous by pushing UNDO actions into the background, I agree entirely. InnoDB already does that, IIUC. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Random PGDLLIMPORTing
On Thu, Nov 24, 2016 at 11:01 PM, Magnus Hagander wrote: > On Thu, Nov 24, 2016 at 2:58 PM, Craig Ringer wrote: > My guess is that PGDLLIMPORT has been added explicitly when somebody needed > it for something, without any actual thought. I can't say I see any reason > not to export the other ones as well -- more that maybe there are even more > that are needed? Yes, more or less. The reasoning behind at least the PostmasterPID bit is that: https://www.postgresql.org/message-id/CAB7nPqS_=14krcdch6nyrsw8c58j1cwxv6qcvs7yp-6tthq...@mail.gmail.com That resulted in cac83219. The other ones could perhaps be marked with PGDLLIMPORT. Now usually we need a use-case behind this change, and there are not that many hackers on Windows doing much plugin development. -- Michael
Re: [HACKERS] UNDO and in-place update
> I think that the whole emphasis on whether and to what degree this is > like Oracle is somewhat misplaced. I would look at it a different > way. We've talked many times over the years about how PostgreSQL is > optimized for aborts. Everybody that I've heard comment on that issue > thinks that is a bad thing. again, this depends on usage - when you have the possibility to run VACUUM, this strategy is better. Fast aborts are one pretty good feature for stable usage. Isn't it possible to use an UNDO log (or something similar) for VACUUM? ROLLBACK should be fast, but VACUUM can do more work? > I am proposing a design that is optimized > for commits; that is, if the transaction commits, none of the pages it > modified need to be dirtied again afterwards at any point. I think > that's an extremely important property and it's one that we are very > far from having today. It necessarily implies that you cannot store > the old row versions in the heap, because if you do, then you are > going to have to dirty the pages again to get rid of them (unless you > prefer to just leak the space forever). Now there is plenty of room > for argument about whether the specific design I proposed is going to > be any good, and I think that would be quite an interesting discussion > to have. But I think if you say, well, you know, the fact that we may > rewrite the same page 5 or 6 times after a modification to set hint > bits (a few at a time) and HOT prune and set all-visible and freeze > isn't really any big deal, you must live in a world where the write > bandwidth of the I/O channel is a lot less of a problem than it is in > mine. And we've been around and around on all of that stuff and > people have come up with various ideas to improve the situation - some > of which have been implemented - but many of those ideas involve > unpleasant trade-offs and so the core problems remain. 
If we don't do > something more fundamental, we're still going to be up against the > same basic issues ten years from now. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company
Re: [HACKERS] Broken SSL tests in master
On 11/24/2016 10:38 PM, Andreas Karlsson wrote:
> To me it feels like the proper fix would be to make PQhost() return the
> value of the host parameter rather than the hostaddr (maybe add a new
> field in the pg_conn_host struct). But that would be a behaviour change
> which might break someone's application. Thoughts?

I have attached a proof of concept patch for this. Remaining work is investigating all the callers of PQhost() to see if any of them are negatively affected by this patch, and cleaning up the code some.

Andreas

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 3e9c45b..39c11eb 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -798,8 +798,34 @@ connectOptions2(PGconn *conn)
 	 */
 	if (conn->pghostaddr != NULL && conn->pghostaddr[0] != '\0')
 	{
-		conn->connhost[0].host = strdup(conn->pghostaddr);
-		if (conn->connhost[0].host == NULL)
+		if (conn->pghost != NULL && conn->pghost[0] != '\0')
+		{
+			char	   *e = conn->pghost;
+
+			/*
+			 * Search for the end of the first hostname; a comma or
+			 * end-of-string acts as a terminator.
+			 */
+			while (*e != '\0' && *e != ',')
+				++e;
+
+			/* Copy the hostname whose bounds we just identified. */
+			conn->connhost[0].host =
+				(char *) malloc(sizeof(char) * (e - conn->pghost + 1));
+			if (conn->connhost[0].host == NULL)
+				goto oom_error;
+			memcpy(conn->connhost[0].host, conn->pghost, e - conn->pghost);
+			conn->connhost[0].host[e - conn->pghost] = '\0';
+		}
+		else
+		{
+			conn->connhost[0].host = strdup(conn->pghostaddr);
+			if (conn->connhost[0].host == NULL)
+				goto oom_error;
+		}
+
+		conn->connhost[0].hostaddr = strdup(conn->pghostaddr);
+		if (conn->connhost[0].hostaddr == NULL)
 			goto oom_error;
 		conn->connhost[0].type = CHT_HOST_ADDRESS;
 	}
@@ -827,6 +853,10 @@ connectOptions2(PGconn *conn)
 		memcpy(conn->connhost[i].host, s, e - s);
 		conn->connhost[i].host[e - s] = '\0';
 
+		conn->connhost[i].hostaddr = strdup(conn->connhost[i].host);
+		if (conn->connhost[i].hostaddr == NULL)
+			goto oom_error;
+
 		/* Identify the type of host. */
 		conn->connhost[i].type = CHT_HOST_NAME;
 #ifdef HAVE_UNIX_SOCKETS
@@ -845,12 +875,14 @@ connectOptions2(PGconn *conn)
 	{
 #ifdef HAVE_UNIX_SOCKETS
 		conn->connhost[0].host = strdup(DEFAULT_PGSOCKET_DIR);
+		conn->connhost[0].hostaddr = strdup(DEFAULT_PGSOCKET_DIR);
 		conn->connhost[0].type = CHT_UNIX_SOCKET;
 #else
 		conn->connhost[0].host = strdup(DefaultHost);
+		conn->connhost[0].hostaddr = strdup(DefaultHost);
 		conn->connhost[0].type = CHT_HOST_NAME;
 #endif
-		if (conn->connhost[0].host == NULL)
+		if (conn->connhost[0].host == NULL || conn->connhost[0].hostaddr == NULL)
 			goto oom_error;
 	}
@@ -1547,7 +1579,7 @@ connectDBStart(PGconn *conn)
 	for (i = 0; i < conn->nconnhost; ++i)
 	{
 		pg_conn_host *ch = &conn->connhost[i];
-		char	   *node = ch->host;
+		char	   *node = ch->hostaddr;
 		struct addrinfo hint;
 		int			thisport;
@@ -3034,6 +3066,8 @@ freePGconn(PGconn *conn)
 	{
 		if (conn->connhost[i].host != NULL)
 			free(conn->connhost[i].host);
+		if (conn->connhost[i].hostaddr != NULL)
+			free(conn->connhost[i].hostaddr);
 		if (conn->connhost[i].port != NULL)
 			free(conn->connhost[i].port);
 		if (conn->connhost[i].password != NULL)
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 854ec89..19e3a5e 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -306,7 +306,10 @@ typedef enum pg_conn_host_type
  */
 typedef struct pg_conn_host
 {
-	char	   *host;			/* host name or address, or socket path */
+	char	   *host;			/* host name or address, or socket path,
+								 * used for validating the hostname */
+	char	   *hostaddr;		/* host name or address, or socket path,
+								 * used when actually connecting */
 	pg_conn_host_type type;		/* type of host */
 	char	   *port;			/* port number for this host; if not NULL,
 								 * overrrides the PGConn's pgport */
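The division of labor the patch introduces — the symbolic host name for certificate validation, hostaddr for the actual connection — can be summarized with a small illustrative helper. This is not libpq code; the struct and function names are made up:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy model of the patch's rule: if hostaddr is given, connect to it,
 * but validate the server certificate against the host name when that
 * is also given; otherwise fall back to the single value supplied.
 */
struct conn_host
{
	const char *host;		/* name used for certificate validation */
	const char *hostaddr;	/* address used for the actual connection */
};

static struct conn_host
resolve_conn_host(const char *host, const char *hostaddr)
{
	struct conn_host ch;

	if (hostaddr != NULL && hostaddr[0] != '\0')
	{
		ch.hostaddr = hostaddr;
		/* Prefer the symbolic name for validation when available. */
		ch.host = (host != NULL && host[0] != '\0') ? host : hostaddr;
	}
	else
	{
		ch.host = host;
		ch.hostaddr = host;		/* connect by resolving the name */
	}
	return ch;
}
```

With host=common-name.pg-ssltest.test and hostaddr=127.0.0.1 (the failing test case above), this sketch connects to 127.0.0.1 while still checking the certificate against the common name, which is the behaviour the test suite expects.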
Re: [HACKERS] Functions Immutable but not parallel safe?
On Thu, Nov 24, 2016 at 5:29 AM, David Rowley wrote:
> There's 11 functions which are marked immutable, but are marked as
> parallel unsafe.
>
> postgres=# select proname from pg_proc where provolatile = 'i' and
> proparallel = 'u';
>            proname
> -----------------------------
>  _pg_expandarray
>  _pg_keysequal
>  _pg_truetypid
>  _pg_truetypmod
>  _pg_char_max_length
>  _pg_char_octet_length
>  _pg_numeric_precision
>  _pg_numeric_precision_radix
>  _pg_numeric_scale
>  _pg_datetime_precision
>  _pg_interval_type
> (11 rows)
>
> I'm finding it hard to imagine a reason why these might be unsafe, but
> failed. I do notice they're all only used in information_schema.
>
> Could it just perhaps be that these just missed the verification
> process the other functions went through to determine their parallel
> safety?

Yes, I think that's it.  I went through pg_proc.h, but never looked at information_schema.sql.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] UNDO and in-place update
On Thu, Nov 24, 2016 at 11:06 AM, Greg Stark wrote: > Fwiw, Oracle does not use the undo log for snapshot fetches. It's used > only for transaction rollback and recovery. > > For snapshot isolation Oracle has yet a *third* copy of the data in a > space called the "rollback segment(s)". When you update a row in a > block you save the whole block in the rollback segment. When you try > to access a block you check if the CSN -- which is basically > equivalent to our LSN -- is newer than your snapshot and if it is you > fetch the old version of the block from the rollback. My understanding is that this isn't correct. I think the rollback segments are what they call the thing that stores UNDO. See e.g. http://ss64.com/ora/syntax-redo.html Having said that, I'm by no means an Oracle expert. > Essentially their MVCC is done on a per-block level rather than a > per-row level and they keep only the newest version of the block in > the table, the rest are in the rollback segment. For what it's worth > I think our approach is cleaner and more flexible. They had a lot of > trouble with their approach over the years and it works well only > because they invested an enormous amount of development in it and also > because people throw a lot of hardware at it too. People do throw a lot of hardware at Oracle, but I don't think I'd accept the contention that Oracle generally gets less performance out of the same amount of hardware than PostgreSQL. There are certainly some things we often do faster -- including inserts, deletes, and transaction aborts -- but their implementation of updates seems to be very fast. Obviously, a lot depends on configuration and workload, so it's hard to make general statements, but I think it's more correct to imagine that people throw a lot of hardware at Oracle because that's where their big, critical databases are than to imagine that it's because Oracle is a resource hog. 
> I think the main use case we have trouble with is actually the "update > every row in the table" type of update which requires we write to > every block, plus a second copy of every block, plus write full pages > of both copies, then later set hint bits dirtying pages again and > generating more full pages writes, then later come along and vacuum > which requires two more writes of every block, etc. If we had a > solution for the special case of an update that replaces every row in > a page that I think would complement HOT nicely and go a long way > towards fixing our issues. This case is certainly a problem, but I don't believe it's anywhere close to being our only problem. My experience is that our system works fairly well when there are only a few indexes. However, with heavily indexed tables, all of your updates become non-HOT and then bad things happen. So you can either leave out indexes that you need for good query performance and then you're hosed, or you can add those indexes and then you're hosed anyway because of the bloat and write amplification associated with non-HOT updates. Also, write-heavy workloads that perform OK as long as there are no long-lived snapshots often enter a death spiral when you run reporting queries on the same system. Finer-grained snapshot tracking would help with that problem, but it doesn't solve it: now a single long-lived snapshot can "only" cause the database to double in size instead of increasing without bound, a scant consolation. Nor does it change the fundamental calculus that once a table gets bloated, it's very painful to de-bloat it. The appeal of a design that supports in-place update is that you don't bloat the table in the first place. You still have the bloat, of course, but it's off in a separate data structure that is engineered for efficient deletion. I think that the whole emphasis on whether and to what degree this is like Oracle is somewhat misplaced. I would look at it a different way. 
We've talked many times over the years about how PostgreSQL is optimized for aborts. Everybody that I've heard comment on that issue thinks that is a bad thing. I am proposing a design that is optimized for commits; that is, if the transaction commits, none of the pages it modified need to be dirtied again afterwards at any point. I think that's an extremely important property and it's one that we are very far from having today. It necessarily implies that you cannot store the old row versions in the heap, because if you do, then you are going to have to dirty the pages again to get rid of them (unless you prefer to just leak the space forever). Now there is plenty of room for argument about whether the specific design I proposed is going to be any good, and I think that would be quite an interesting discussion to have. But I think if you say, well, you know, the fact that we may rewrite the same page 5 or 6 times after a modification to set hint bits (a few at a time) and HOT prune and set all-visible and freeze isn't really any big deal, you must live in a world where the write bandwidth of the I/O channel is a lot less of a problem than it is in mine. And we've been around and around on all of that stuff and people have come up with various ideas to improve the situation - some of which have been implemented - but many of those ideas involve unpleasant trade-offs and so the core problems remain. If we don't do something more fundamental, we're still going to be up against the same basic issues ten years from now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
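A commit-optimized design of the sort described here can be caricatured in a few lines: updates happen in place, the old value goes to an undo log, commit simply discards the log, and abort replays it in reverse. This is a toy in-memory sketch under invented names, not the proposed implementation (no WAL, no concurrency, no snapshots):

```c
#include <assert.h>
#include <stddef.h>

#define UNDO_MAX 64

/* One undo record: enough to restore a single in-place update. */
struct undo_rec
{
	int		   *where;		/* heap location that was overwritten */
	int			old_value;	/* value to restore on abort */
};

static struct undo_rec undo_log[UNDO_MAX];
static size_t undo_len = 0;

/* Update in place, remembering the old value in the undo log. */
static void
update_in_place(int *where, int new_value)
{
	undo_log[undo_len].where = where;
	undo_log[undo_len].old_value = *where;
	undo_len++;
	*where = new_value;
}

/* Commit: the modified pages need no further writes; forget the undo. */
static void
commit(void)
{
	undo_len = 0;
}

/* Abort: replay undo records newest-first to restore old versions. */
static void
rollback(void)
{
	while (undo_len > 0)
	{
		undo_len--;
		*undo_log[undo_len].where = undo_log[undo_len].old_value;
	}
}
```

The point of the sketch is the asymmetry: commit is free with respect to the data pages (no hint bits, no pruning, no later vacuum pass), while abort pays by replaying the log — the inverse of the "optimized for aborts" behaviour criticized above.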
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
I propose to rename allow_long to huge_ok. "Huge" is the terminology used by palloc anyway. I'd keep makeLongStringInfo() and initLongStringInfo() though as interface, because using Huge instead of Long there looks strange. Not wedded to that, though (particularly as it's a bit inconsistent). I'm not terribly sure about enlargeStringInfo(). I think I would propose that it takes Size instead of int. That has rather more fallout than I'd like, but if we don't do that, then I'm not sure that appendStringInfo continues to make sense -- considering that it uses the return value from appendStringInfoVA (which I'm redefining as returning Size) to pass to enlargeStringInfo. I'm not really sure how to ensure that the values passed fit both in an int and Size ... which they would, given that all the callers use not-huge stringinfos anyway. But it'd be better if the compiler knew. -- Álvaro Herrera                https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
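The semantics under discussion — a buffer whose capacity stays below INT_MAX unless it was explicitly created as "huge" — can be sketched as a growth policy. Illustrative names only; this is not the stringinfo.c code:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy growth policy: double the allocation until 'needed' fits, but
 * cap ordinary buffers at INT_MAX; only buffers created with huge_ok
 * may grow past that.  Returns the new capacity, or 0 on refusal.
 */
static size_t
enlarge_capacity(size_t current, size_t needed, bool huge_ok)
{
	size_t		limit = huge_ok ? SIZE_MAX : (size_t) INT_MAX;

	if (needed > limit)
		return 0;				/* request exceeds the allowed range */

	if (current == 0)
		current = 1;

	while (current < needed)
	{
		if (current > limit / 2)
			return limit;		/* doubling again would pass the cap */
		current *= 2;
	}
	return current;
}
```

Keeping the cap in one place like this is also where the int-vs-Size concern above shows up: callers that sized requests in int can never trip the huge path, but the compiler cannot see that unless the parameter types say so.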
Re: [HACKERS] [sqlsmith] Failed assertion in parallel worker in ExecInitSubPlan
Hi, just caught another InitPlan below Gather with the recent patches in (master as of 4cc6a3f). Recipe below. regards, andreas

set max_parallel_workers_per_gather = 2;
set min_parallel_relation_size = 0;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;

explain select 1 from public.quad_point_tbl as ref_0,
  lateral (select ref_0.p as c3, sample_0.d as c5
             from public.nv_child_2010 as sample_0
             left join public.mvtest_tvv as ref_1
               on ('x' < (select contype from pg_catalog.pg_constraint limit 1))
           limit 82) as subq_0;

                                    QUERY PLAN
----------------------------------------------------------------------------------
 Gather  (cost=0.19..13727.52 rows=902246 width=4)
   Workers Planned: 2
   ->  Nested Loop  (cost=0.19..13727.52 rows=902246 width=4)
         ->  Parallel Seq Scan on quad_point_tbl ref_0  (cost=0.00..105.85 rows=4585 width=16)
         ->  Limit  (cost=0.19..1.33 rows=82 width=20)
               InitPlan 1 (returns $0)
                 ->  Limit  (cost=0.00..0.19 rows=1 width=1)
                       ->  Gather  (cost=0.00..10.22 rows=54 width=1)
                             Workers Planned: 2
                             ->  Parallel Seq Scan on pg_constraint  (cost=0.00..10.22 rows=22 width=1)
               ->  Seq Scan on nv_child_2010 sample_0  (cost=0.00..35.50 rows=2550 width=20)

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Broken SSL tests in master
Hi, The SSL test suite (src/test/ssl) is broken in master since commit 9a1d0af4ad2cbd419115b453d811c141b80d872b, which is Robert's refactoring of getting the server hostname for GSS, SSPI, and SSL in libpq. The error we get in the test suite: # Running: psql -X -A -t -c SELECT 'connected with user=ssltestuser dbname=trustdb sslcert=invalid hostaddr=127.0.0.1 host=common-name.pg-ssltest.test sslrootcert=ssl/root+server_ca.crt sslmode=verify-full' -d user=ssltestuser dbname=trustdb sslcert=invalid hostaddr=127.0.0.1 host=common-name.pg-ssltest.test sslrootcert=ssl/root+server_ca.crt sslmode=verify-full psql: server certificate for "common-name.pg-ssltest.test" does not match host name "127.0.0.1" As you can see, after the patch libpq now looks at hostaddr rather than host when validating the server certificate, because that is what is stored in the first (and only) entry of conn->connhost, and therefore what PQhost() returns. To me it feels like the proper fix would be to make PQhost() return the value of the host parameter rather than the hostaddr (maybe by adding a new field to the pg_conn_host struct). But that would be a behaviour change which might break someone's application. Thoughts? Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Daniel Verite wrote: > Here's an updated patch. Compared to the previous version: > > - removed CopyStartSend (per comment #1 in review) > > - renamed flag to allow_long (comment #2) > > - resetStringInfo no longer resets the flag (comment #3). > > - allowLongStringInfo() is removed (comment #3 and #5), > makeLongStringInfo() and initLongStringInfo() are provided > instead, as alternatives to makeStringInfo() and initStringInfo(). > > - StringInfoData.len is back to int type, 2GB max. > (addresses comment #4 incidentally). > This is safer because many routines peeking > into StringInfoData use int for offsets into the buffer, > for instance most of the stuff in backend/libpq/pqformat.c > Although these routines are not supposed to be called on long > buffers, this assertion was not enforced in the code, and > with a 64-bit length effectively over 2GB, they would misbehave > pretty badly. Hmm, this looks a lot better, I think. One change we overlooked, and I just noticed, is that we need to change appendStringInfoVA() to return size_t rather than int. This comment at the bottom of it gives that away: /* * Return pvsnprintf's estimate of the space needed. (Although this is * given as a size_t, we know it will fit in int because it's not more * than MaxAllocSize.) */ return (int) nprinted; The parenthetical comment is no longer true. This means we need to update all its callers to match. I suppose it won't be a problem for most of them, since they are not using long stringinfos anyway, but it seems better to keep everything consistent. I hope that callers that do not adjust the type of the declared variable will get a compiler warning. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Declarative partitioning - another take
Amit Langote wrote: > On 2016/11/24 15:10, Ashutosh Bapat wrote: > > On Thu, Nov 24, 2016 at 11:34 AM, Amit Langote wrote: > >> You have to specify column constraints using the keywords WITH OPTIONS, > >> like below: > >> > >> create table p1 partition of p ( > >> a with options primary key > >> ) for values in (1); > > > > Oh, sorry for not noticing it. You are right. Why do we need "with > > option" there? Shouldn't user be able to specify just "a primary key"; > > it's not really an "option", it's a constraint. > > I just adopted the existing syntax for specifying column/table constraints > of a table created with CREATE TABLE OF type_name. I think CREATE TABLE OF is pretty much a corner case. I agree that allowing the constraint right after the column name is more intuitive. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
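To put the two syntax variants side by side (table names are made up; the shorthand form is what is being argued for in this thread, not accepted syntax at the time of writing):

```sql
create table p (a int) partition by list (a);

-- Syntax under the patch as posted: the column constraint
-- requires the WITH OPTIONS keywords.
create table p1 partition of p (
    a with options primary key
) for values in (1);

-- Proposed shorthand: constraint directly after the column name,
-- as in ordinary CREATE TABLE.
-- create table p2 partition of p (
--     a primary key
-- ) for values in (2);
```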
Re: [HACKERS] 9.6 TAP tests and extensions
Craig Ringer wrote: > On 27 October 2016 at 00:42, Robert Haas wrote: > > On Wed, Oct 26, 2016 at 7:17 AM, Andres Freund wrote: > >> On 2016-09-23 16:04:32 -0400, Tom Lane wrote: > >>> Looking back over the thread, I see that you also proposed installing > >>> isolationtester and pg_isolation_regress for the benefit of extensions. > >>> I'm very much less excited about that idea. It'd be substantially more > >>> dead weight in typical installations, and I'm not sure that it'd be useful > >>> to common extensions, and I'm not eager to treat isolationtester's API > >>> and behavior as something we need to hold stable for extension use. > >> > >> FWIW, I'd be quite happy if it were installed. Running isolationtester > >> when compiling extensions against distribution postgres packages would > >> be quite useful. > > > > +1. > > Patch attached. Hmm but this only installs isolationtester itself ... don't you need pg_isolation_regress too? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId
Craig Ringer wrote: > Updated to correct the other expected file, since there's an alternate. FWIW I don't know what you did here, but you did not patch the alternate expected file. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId
Tom Lane wrote: > Alvaro Herrera writes: > > I considered the argument here for a bit and I think Craig is right -- > > FWIW, I agree. We shouldn't require every call site to special-case this, > and we definitely don't want it to require special cases in SQL code. > > (And I'm for back-patching, too.) Pushed. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: function xmltable
2016-11-24 18:51 GMT+01:00 Tom Lane : > Alvaro Herrera writes: > > Pavel Stehule wrote: > >> 2016-11-24 0:13 GMT+01:00 Alvaro Herrera : > >>> Oh my, I just noticed we have a new xpath preprocessor in this patch > >>> too. Where did this code come from -- did you write it all from > >>> scratch? > > >> I wrote it from scratch - libxml2 has not any API for iteration over XPath > >> expression (different than iteration over XPath expression result), and > >> what I have info, there will not be any new API in libxml2. > > > Okay, I agree that the default namespace stuff looks worthwhile in the > > long run. But I don't have enough time to review the xpath parser stuff > > in the current commitfest, and I think it needs at the very least a lot > > of additional code commentary. > > contrib/xml2 has always relied on libxslt for xpath functionality. > Can we do that here instead of writing, debugging, and documenting > a pile of new code? > I am sorry - I don't see it. There is no complex manipulation of XPath expressions there. Regards Pavel > > regards, tom lane >
Re: [HACKERS] patch: function xmltable
Alvaro Herrera writes: > Pavel Stehule wrote: >> 2016-11-24 0:13 GMT+01:00 Alvaro Herrera : >>> Oh my, I just noticed we have a new xpath preprocessor in this patch >>> too. Where did this code come from -- did you write it all from >>> scratch? >> I wrote it from scratch - libxml2 has not any API for iteration over XPath >> expression (different than iteration over XPath expression result), and >> what I have info, there will not be any new API in libxml2. > Okay, I agree that the default namespace stuff looks worthwhile in the > long run. But I don't have enough time to review the xpath parser stuff > in the current commitfest, and I think it needs at the very least a lot > of additional code commentary. contrib/xml2 has always relied on libxslt for xpath functionality. Can we do that here instead of writing, debugging, and documenting a pile of new code? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: function xmltable
Pavel Stehule wrote: > can me send your last work? Sure, it's in the archives -- https://www.postgresql.org/message-id/20161123233130.oqf7jl6czehy5fiw@alvherre.pgsql -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: function xmltable
2016-11-24 18:29 GMT+01:00 Alvaro Herrera : > Pavel Stehule wrote: > > Hi > > > > 2016-11-24 0:13 GMT+01:00 Alvaro Herrera : > > > > > Oh my, I just noticed we have a new xpath preprocessor in this patch > > > too. Where did this code come from -- did you write it all from > > > scratch? > > > > I wrote it from scratch - libxml2 has not any API for iteration over XPath > > expression (different than iteration over XPath expression result), and > > what I have info, there will not be any new API in libxml2. > > Okay, I agree that the default namespace stuff looks worthwhile in the > long run. But I don't have enough time to review the xpath parser stuff > in the current commitfest, and I think it needs at the very least a lot > of additional code commentary. > > However I think the rest of it can reasonably go in -- I mean the SQL > parse of it, analysis, executor. Let me propose this: you split the > patch, leaving the xpath_parser.c stuff out and XMLNAMESPACES DEFAULT, > and we introduce just the TableExpr stuff plus the XMLTABLE function. I > can commit that part in the current commitfest, and we leave the > xpath_parser plus associated features for the upcoming commitfest. > Deal? > OK. Can you send me your last work? Regards Pavel > > -- > Álvaro Herrera https://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >
Re: [HACKERS] patch: function xmltable
Pavel Stehule wrote: > Hi > > 2016-11-24 0:13 GMT+01:00 Alvaro Herrera : > > > Oh my, I just noticed we have a new xpath preprocessor in this patch > > too. Where did this code come from -- did you write it all from > > scratch? > > I wrote it from scratch - libxml2 has not any API for iteration over XPath > expression (different than iteration over XPath expression result), and > what I have info, there will not be any new API in libxml2. Okay, I agree that the default namespace stuff looks worthwhile in the long run. But I don't have enough time to review the xpath parser stuff in the current commitfest, and I think it needs at the very least a lot of additional code commentary. However I think the rest of it can reasonably go in -- I mean the SQL parse of it, analysis, executor. Let me propose this: you split the patch, leaving the xpath_parser.c stuff out and XMLNAMESPACES DEFAULT, and we introduce just the TableExpr stuff plus the XMLTABLE function. I can commit that part in the current commitfest, and we leave the xpath_parser plus associated features for the upcoming commitfest. Deal? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UNDO and in-place update
On Thu, Nov 24, 2016 at 2:32 AM, Tsunakawa, Takayuki wrote: > IMHO, overall, there should be pros and cons of the current approach and the > new UNDO one (like Oracle?), depending on the workload. Under update-heavy > workload, the UNDO method may be better. OTOH, under the mostly-INSERT > workload (like data warehouse?), the current method will be better because it > writes no log for UNDO. The foreground operation will complete more quickly, because it won't have to write UNDO. On the other hand, you'll have to set hint bits later, as well as freeze, which may be more expensive than writing UNDO by the time all is said and done. Whether it's better to pay a foreground tax immediately or to defer the work until later depends on things like whether you have quiet times during which you can catch up on the deferred work ... but the number of users who have gotten unpleasant surprises due to autovacuum kicking in during a busy period is not small. > Furthermore, it may be best to be able to switch the method for each table > and/or tablespace. For example, in pgbench, the history table uses the current > method, > and other tables use the UNDO method. Is it time to introduce a pluggable > storage system? IMHO, it's past time for that. > Because PostgreSQL is a follower in the UNDO approach, I think it will be > better to study other DBMSs well (Oracle and MySQL?). That includes not only > their manuals, but also whitepapers and books. Especially, I expect good > books to give deep knowledge on performance tuning and troubleshooting, from > which we will be able to know the cons that Oracle's materials don't state. I agree up to a point. I think we need to design our own system as well as we can, not just copy what others have done. For example, the design I sketched will work with all of PostgreSQL's existing index types. 
You need to modify each AM in order to support in-place updates when a column indexed by that AM has been modified, and that's probably highly desirable, but it's not a hard requirement. I believe that's a better approach for us than insisting that we have to do it in exactly the same way as some other system. Now, that doesn't mean we shouldn't learn from what works well and poorly in other systems, but I think our goal here should be to chart the best way forward given PostgreSQL's existing architecture and its existing strengths and weaknesses, rather than to make it exactly like Oracle or MySQL or anything else. Few people on this mailing list would say that either of those systems is categorically better than PostgreSQL; most, I suspect, would disagree somewhat vigorously. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UNDO and in-place update
> FWIW, while this is basically true, the idea of repurposing UNDO to be > usable for MVCC is definitely an Oracleism. Mohan's ARIES paper says > nothing about MVCC. > For snapshot isolation Oracle has yet a *third* copy of the data in a > space called the "rollback segment(s)". UNDO and rollback segment are the same thing in Oracle. In older versions it was just called "rollback segment" I think it started with Oracle 10g that they called it UNDO. > Fwiw, Oracle does not use the undo log for snapshot fetches. > It's used only for transaction rollback and recovery. As UNDO and "rollback" are the same they do use the UNDO information for MVCC: http://docs.oracle.com/database/121/CNCPT/consist.htm#GUID-00A3688F-1219-423C-A5ED-4B8F25BEEAFB__BABFDBAJ -- View this message in context: http://postgresql.nabble.com/UNDO-and-in-place-update-tp5931575p5931844.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId
Alvaro Herrera writes: > I considered the argument here for a bit and I think Craig is right -- FWIW, I agree. We shouldn't require every call site to special-case this, and we definitely don't want it to require special cases in SQL code. (And I'm for back-patching, too.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UNDO and in-place update
On Thu, Nov 24, 2016 at 04:06:14PM +, Greg Stark wrote: > For snapshot isolation Oracle has yet a *third* copy of the data in a > space called the "rollback segment(s)". When you update a row in a > block you save the whole block in the rollback segment. When you try > to access a block you check if the CSN -- which is basically > equivalent to our LSN -- is newer than your snapshot and if it is you > fetch the old version of the block from the rollback. I assume they can do this with good performance because, unlike UNDO, they don't need to fsync the pages kept for snapshot visibility --- if the server crashes, you don't need them. > Essentially their MVCC is done on a per-block level rather than a > per-row level and they keep only the newest version of the block in > the table, the rest are in the rollback segment. For what it's worth > I think our approach is cleaner and more flexible. They had a lot of > trouble with their approach over the years and it works well only > because they invested an enormous amount of development in it and also > because people throw a lot of hardware at it too. > > I think the main use case we have trouble with is actually the "update > every row in the table" type of update which requires we write to > every block, plus a second copy of every block, plus write full pages > of both copies, then later set hint bits dirtying pages again and > generating more full pages writes, then later come along and vacuum > which requires two more writes of every block, etc. If we had a > solution for the special case of an update that replaces every row in > a page that I think would complement HOT nicely and go a long way > towards fixing our issues. Any ideas how we could optimize the update-all workload? Create a new heap file and indexes? Our previous worst-case workload was a single row that was updated repeatedly, but HOT fixed that for non-indexed rows, and WARM will improve it for indexed rows. 
One of the big takeaways when writing my MVCC talk is that no matter how much we optimize UPDATE, we still need cleanup for deletes and aborted transactions. I am hopeful we can continue reducing the number of times we write a page for maintenance purposes. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UNDO and in-place update
On Wed, Nov 23, 2016 at 5:18 PM, Thomas Munro wrote: > On Wed, Nov 23, 2016 at 6:01 PM, Peter Geoghegan wrote: >> * Our behavior with many duplicates in secondary indexes is pretty bad >> in general, I suspect. > > From the pie-in-the-sky department: I believe there are > snapshot-based systems that don't ever have more than one entry for a > logical row in a btree. Instead, they do UNDO-log based snapshot > reconstruction for btrees too, generalising what is being discussed > for heap tables in this thread. That means that whenever you descend > a btree, at each page you have to be prepared to go and look in the > UNDO log if necessary on a page-by-page basis. Alternatively, some > systems use a shadow table rather than an UNDO log for the old > entries, but still privilege queries that need the latest version by > keeping only those in the btree. I don't know the details but suspect > that both of those approaches might require CSN (= LSN?) based > snapshots, so that you can make page-level visibility determinations > while traversing the 'current' btree based on a page header. That > would reduce both kinds of btree bloat: the primary kind of being > entries for dead tuples that still exist in the heap, and the > secondary kind being the resultant permanent btree fragmentation that > remains even after vacuum removes them. Presumably once you have all > that, you can allow index-only-scans without a visibility map. Also, > I suspect that such a "3D time travelling" btree would be a > requirement for index ordered tables in a snapshot-based RDBMS. I don't really understand how a system of this type copes with page splits. Suppose there are entries 1000, 2000, ..., 1e6 in the btree. I now start two overlapping transactions, one inserting all the positive odd values < 1e6 and the other inserting all the positive even values < 1e6 that are not already present. The insertions are interleaved with each other. 
After they get done inserting, what does the 3D time traveling btree look like at this point? The problem I'm getting at here is that the heap, unlike an index, is unordered. So when I've got all values 1..1e6, they've got to be in a certain order. Necessarily, the values from the in-flight transactions are interleaved with each other and with the old data. If they're not, I can no longer support efficient searches by the in-flight transactions, plus I have an unbounded amount of cleanup work to do at commit time. But if I *do* have all of those values interleaved in the proper order, then an abort leaves fragmented free space. I can go and kill all of the inserted tuples during UNDO of an aborted transactions, but because page splits have happened, the index tuples I need to kill may not be on the same pages where I inserted them, so I might need to just search from the top of the tree, so it's expensive. And even if I do it anyway, the pages are left half-full. The point is that the splits are the joint result of the two concurrent transactions taken together - you can't blame individual splits on individual transactions. Another way to put this is - if you could make all of the index operations appear to happen instantaneously at commit time, then I see how we could make what you're talking about work. And if you never needed to split pages, then you could do that in the same way that you can do it for the heap. But that's not realistic. Even for an UNDO-based heap, we have a mild form of this problem. Suppose we have a full page and delete a tuple, creating free space. Now another transaction inserts a tuple, using up that free space. Well, if the first transaction aborts and the second one commits, we've got trouble. So we can't allow that. We have to keep track of the amount of space that could potentially be gobbled up by UNDO actions and leave enough space on the page to guarantee that such actions will succeed. 
If that means that some new tuple has to be placed on a different page instead, so be it. (This could create some minor bloat, but I think it's not really that bad.) For an index, we can't manage so easily, because we don't have the same kind of flexibility about where to put inserted tuples. Do you see some trick here that I'm missing? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UNDO and in-place update
On 23 November 2016 at 04:28, Peter Geoghegan wrote: > On Tue, Nov 22, 2016 at 7:01 PM, Robert Haas wrote: >> This basic DO-UNDO-REDO protocol has been well-understood for >> decades. > > FWIW, while this is basically true, the idea of repurposing UNDO to be > usable for MVCC is definitely an Oracleism. Mohan's ARIES paper says > nothing about MVCC. Fwiw, Oracle does not use the undo log for snapshot fetches. It's used only for transaction rollback and recovery. For snapshot isolation Oracle has yet a *third* copy of the data in a space called the "rollback segment(s)". When you update a row in a block you save the whole block in the rollback segment. When you try to access a block you check if the CSN -- which is basically equivalent to our LSN -- is newer than your snapshot and if it is you fetch the old version of the block from the rollback. Essentially their MVCC is done on a per-block level rather than a per-row level and they keep only the newest version of the block in the table, the rest are in the rollback segment. For what it's worth I think our approach is cleaner and more flexible. They had a lot of trouble with their approach over the years and it works well only because they invested an enormous amount of development in it and also because people throw a lot of hardware at it too. I think the main use case we have trouble with is actually the "update every row in the table" type of update which requires we write to every block, plus a second copy of every block, plus write full pages of both copies, then later set hint bits dirtying pages again and generating more full pages writes, then later come along and vacuum which requires two more writes of every block, etc. If we had a solution for the special case of an update that replaces every row in a page that I think would complement HOT nicely and go a long way towards fixing our issues. 
Incidentally the "Interested transaction list" is for locking rows for updates and it's basically similar to what we've discussed before of having a "most frequent xmin" in the header and then a bit indicating the xmin is missing from the row header. Except in their case they don't need it for the actual xmin/xmax because their visibility is done per-block, only the transient lock state -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Creating a DSA area to provide work space for parallel execution
I have one more question: in V1 we were calling dsa_detach in ExecParallelCleanup and in ParallelQueryMain, but it's removed in v2. Any specific reason? Does this need to be used differently? ExecParallelCleanup(ParallelExecutorInfo *pei) { + if (pei->area != NULL) + { + dsa_detach(pei->area); + pei->area = NULL; + } After this change, I am getting a DSM segment leak warning. I am calling dsa_allocate and dsa_free. On Thu, Nov 24, 2016 at 8:09 PM, Dilip Kumar wrote: > On Wed, Nov 23, 2016 at 5:42 PM, Thomas Munro > wrote: >> ... or we could allow DSA areas to be constructed inside existing >> shmem, as in the attached patch which requires dsa_create_in_place, >> from the patch at >> https://www.postgresql.org/message-id/CAEepm%3D0-pbokaQdCXhtYn%3Dw64MmdJz4hYW7qcSU235ar276x7w%40mail.gmail.com >> . > Seems like there is a problem in this patch. > > In below code, pei->area is not yet allocated at this point, so > estate->es_query_area will always be NULL? > > + estate->es_query_area = pei->area; > + > /* Create a parallel context. */ > pcxt = CreateParallelContext(ParallelQueryMain, nworkers); > pei->pcxt = pcxt; > @@ -413,6 +423,10 @@ ExecInitParallelPlan(PlanState *planstate, EState > *estate, int nworkers) > shm_toc_estimate_keys(&pcxt->estimator, 1); > } > > + /* Estimate space for DSA area. */ > + shm_toc_estimate_chunk(&pcxt->estimator, PARALLEL_AREA_SIZE); > + shm_toc_estimate_keys(&pcxt->estimator, 1); > + > /* Everyone's had a chance to ask for space, so now create the DSM. */ > InitializeParallelDSM(pcxt); > > @@ -466,6 +480,14 @@ ExecInitParallelPlan(PlanState *planstate, EState > *estate, int nworkers) > pei->instrumentation = instrumentation; > } > > + /* Create a DSA area that can be used by the leader and all workers. 
*/ > + area_space = shm_toc_allocate(pcxt->toc, PARALLEL_AREA_SIZE); > + shm_toc_insert(pcxt->toc, PARALLEL_KEY_AREA, area_space); > + pei->area = dsa_create_in_place(area_space, PARALLEL_AREA_SIZE, > + LWTRANCHE_PARALLEL_EXEC_AREA, > + "parallel query memory area"); > > > -- > Regards, > Dilip Kumar > EnterpriseDB: http://www.enterprisedb.com -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Creating a DSA area to provide work space for parallel execution
On Wed, Nov 23, 2016 at 5:42 PM, Thomas Munro wrote: > ... or we could allow DSA areas to be constructed inside existing > shmem, as in the attached patch which requires dsa_create_in_place, > from the patch at > https://www.postgresql.org/message-id/CAEepm%3D0-pbokaQdCXhtYn%3Dw64MmdJz4hYW7qcSU235ar276x7w%40mail.gmail.com > . Seems like there is a problem in this patch. In below code, pei->area is not yet allocated at this point, so estate->es_query_area will always be NULL? + estate->es_query_area = pei->area; + /* Create a parallel context. */ pcxt = CreateParallelContext(ParallelQueryMain, nworkers); pei->pcxt = pcxt; @@ -413,6 +423,10 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers) shm_toc_estimate_keys(&pcxt->estimator, 1); } + /* Estimate space for DSA area. */ + shm_toc_estimate_chunk(&pcxt->estimator, PARALLEL_AREA_SIZE); + shm_toc_estimate_keys(&pcxt->estimator, 1); + /* Everyone's had a chance to ask for space, so now create the DSM. */ InitializeParallelDSM(pcxt); @@ -466,6 +480,14 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers) pei->instrumentation = instrumentation; } + /* Create a DSA area that can be used by the leader and all workers. */ + area_space = shm_toc_allocate(pcxt->toc, PARALLEL_AREA_SIZE); + shm_toc_insert(pcxt->toc, PARALLEL_KEY_AREA, area_space); + pei->area = dsa_create_in_place(area_space, PARALLEL_AREA_SIZE, + LWTRANCHE_PARALLEL_EXEC_AREA, + "parallel query memory area"); -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Random PGDLLIMPORTing
On Thu, 24 Nov 2016 15:01:33 +0100 Magnus Hagander wrote:
> On Thu, Nov 24, 2016 at 2:58 PM, Craig Ringer wrote:
> > Hi all
> >
> > Noticed this while reading something unrelated:
> >
> > extern PGDLLIMPORT pid_t PostmasterPid;
> > extern bool IsPostmasterEnvironment;
> > extern PGDLLIMPORT bool IsUnderPostmaster;
> > extern bool IsBackgroundWorker;
> > extern PGDLLIMPORT bool IsBinaryUpgrade;
> >
> > I don't see any sane reason for some of those to be PGDLLIMPORT but
> > not others. In particular it's pretty silly for IsBackgroundWorker
> > not to be PGDLLIMPORT.
> >
> > Too trivial to be worth making an actual patch, it'd be more work to
> > apply it than edit directly.
>
> My guess is that PGDLLIMPORT has been added explicitly when somebody
> needed it for something, without any actual thought. I can't say I
> see any reason not to export the other ones as well -- more that
> maybe there are even more that are needed?

It is worth checking the actual variable definitions, not just the declarations. I found recently that, at least in the MSVC build system, only initialized variables are included in the postgres.def file and so are actually exported from the backend binary. But in this case both IsPostmasterEnvironment and IsBackgroundWorker are initialized, so they are probably left unmarked only for historical reasons.
Re: [HACKERS] Physical append-only tables
On Thu, Nov 24, 2016 at 10:13:30AM +0100, Magnus Hagander wrote:
> On Thu, Nov 24, 2016 at 2:26 AM, Bruce Momjian wrote:
> > On Mon, Nov 14, 2016 at 08:43:12PM +, Greg Stark wrote:
> > That said, I don't think the "maintain clustering a bit better using
> > BRIN" is a bad idea. It's just the bit about turning a table
> > append-only to deal with update-once data that I think is overreach.
>
> > What if we used BRIN to find heap pages where the new row was in the
> > page BRIN min/max range, and the heap page had free space? Only if that
> > fails do we put it somewhere else in the heap.
>
> That would certainly be useful. You'd have to figure out what to do in the case
> of multiple conflicting BRIN indexes (which you shouldn't have in the first
> place, but that won't keep people from having them), but other than that it
> would be quite good, I think.

This idea is only possible because the BRIN index is so small and easy to scan, i.e. this wouldn't work for a btree index.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
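Bruce's heuristic can be sketched outside the server. In this toy model (the PageRange struct is a hypothetical stand-in for a BRIN page-range summary plus free-space data, not real PostgreSQL code), an insert first looks for a range whose min/max already covers the new key and which still has room, and only falls back to "anywhere else" when that fails:

```c
#include <assert.h>

typedef struct {
    int min, max;       /* BRIN-style summary of keys in this page range */
    int free_space;     /* free space remaining on the range's pages */
} PageRange;

/*
 * Pick a target page range for a new key: prefer a range whose summary
 * already covers the key and that has room, preserving clustering.
 * Returns the range index, or -1 meaning "place it anywhere else"
 * (i.e. fall back to the normal free-space-map path).
 */
int choose_range(const PageRange *ranges, int nranges, int key, int row_size)
{
    for (int i = 0; i < nranges; i++)
        if (key >= ranges[i].min && key <= ranges[i].max &&
            ranges[i].free_space >= row_size)
            return i;
    return -1;
}
```

A real implementation would scan the BRIN summary (cheap, as Bruce notes, because it is tiny) rather than a linear array, and would still need to re-check free space on the actual heap pages.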
Re: [HACKERS] UNDO and in-place update
On Wed, Nov 23, 2016 at 11:35:38PM -0800, Peter Geoghegan wrote:
> On Wed, Nov 23, 2016 at 11:32 PM, Tsunakawa, Takayuki wrote:
> > IMHO, overall, there should be pros and cons of the current approach and
> > the new UNDO one (like Oracle?), depending on the workload. Under an
> > update-heavy workload, the UNDO method may be better. OTOH, under a
> > mostly-INSERT workload (like data warehouse?), the current method will be
> > better because it writes no log for UNDO.
>
> I believe that you are correct about that.

This is probably similar to our use of double-buffering, using shared_buffers and the kernel cache --- in some cases the double-buffering is better (variable workloads), while in other cases a single cache is best. We have had trouble figuring out if we need to support both single and double caching methods, and when to recommend one over the other. Seems UNDO would have the same complexity.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 11/24/2016 02:49 PM, Andreas Karlsson wrote:
> Thanks for finding this. I will look at this more once I get home, but
> the tests do not fail on my computer. I wonder what I do differently.
> What versions of Perl and OpenSSL do you run, and how did you run the
> tests when they failed? I ran the tests by running "make check" inside
> "src/test/ssl".

Never mind, I had not fetched the latest master. Once I had done so, these tests were broken both in my rebased branch and in the master branch without any modifications. I guess I will have to bisect this once I get home.

Info for myself or anyone else who feels like bisecting this: the last known good commit is 67dc4ccbb2e1c27da823eced66d9217a5652cbb0.

Andreas
Re: [HACKERS] Random PGDLLIMPORTing
On Thu, Nov 24, 2016 at 2:58 PM, Craig Ringer wrote:
> Hi all
>
> Noticed this while reading something unrelated:
>
> extern PGDLLIMPORT pid_t PostmasterPid;
> extern bool IsPostmasterEnvironment;
> extern PGDLLIMPORT bool IsUnderPostmaster;
> extern bool IsBackgroundWorker;
> extern PGDLLIMPORT bool IsBinaryUpgrade;
>
> I don't see any sane reason for some of those to be PGDLLIMPORT but
> not others. In particular it's pretty silly for IsBackgroundWorker not
> to be PGDLLIMPORT.
>
> Too trivial to be worth making an actual patch, it'd be more work to
> apply it than edit directly.

My guess is that PGDLLIMPORT has been added explicitly when somebody needed it for something, without any actual thought. I can't say I see any reason not to export the other ones as well -- more likely the question is whether there are even more that are needed?

-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
[HACKERS] Random PGDLLIMPORTing
Hi all

Noticed this while reading something unrelated:

extern PGDLLIMPORT pid_t PostmasterPid;
extern bool IsPostmasterEnvironment;
extern PGDLLIMPORT bool IsUnderPostmaster;
extern bool IsBackgroundWorker;
extern PGDLLIMPORT bool IsBinaryUpgrade;

I don't see any sane reason for some of those to be PGDLLIMPORT but not others. In particular it's pretty silly for IsBackgroundWorker not to be PGDLLIMPORT.

Too trivial to be worth making an actual patch, it'd be more work to apply it than edit directly.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 11/24/2016 08:46 AM, Michael Paquier wrote:
> On Sat, Nov 12, 2016 at 3:42 AM, Andreas Karlsson wrote:
> > On 11/11/2016 07:40 PM, Andreas Karlsson wrote:
> > Here is a new version of the patch with the only differences:
> > 1) The SSL tests have been changed to use reload rather than restart
>
> Did you check if the tests pass? I am getting a couple of failures like this one:
>
> psql: server certificate for "common-name.pg-ssltest.test" does not match host name "127.0.0.1"
> not ok 11 - sslrootcert=ssl/root+server_ca.crt sslmode=verify-full
>
> Attached are the logs of the run I did, and the same behavior shows for
> macOS and Linux. The shape of the tests looks correct to me after review.
> Still, seeing failing tests with sslmode=verify-full is a problem that
> needs to be addressed. This may be pointing to incorrect CA load
> handling, though I could not spot a problem when going through the code.

Thanks for finding this. I will look at this more once I get home, but the tests do not fail on my computer. I wonder what I do differently. What versions of Perl and OpenSSL do you run, and how did you run the tests when they failed? I ran the tests by running "make check" inside "src/test/ssl".

Andreas
Re: [HACKERS] [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId
I considered the argument here for a bit and I think Craig is right -- FrozenXid eventually makes it to a tuple's xmin, where it becomes a burden to the caller, making our interface bug-prone -- sure, you can special-case it, but you don't until it first happens ... and it may not until you're deep into production.

Even the code comment is confused: "error if the given Xid doesn't normally commit". But surely FrozenXid *does* commit, in the sense that it appears in committed tuples' xmin.

We already have a good mechanism for replying to the query with "this value is too old for us to have its commit TS", which is a false return value. We should use that.

I think not backpatching is worse, because then users have to be aware that they need to handle the FrozenXid case specially, but only on 9.5/9.6 ...

I think the reason it took this long to pop up is that it has taken this long to get to replication systems for which this issue matters.

-- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
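The "false means too old" convention Álvaro proposes can be sketched as follows. This is a simplified stand-in for the real commit-timestamp lookup (the SLRU read is faked with a constant), though the XID constants do mirror PostgreSQL's BootstrapTransactionId = 1, FrozenTransactionId = 2, and FirstNormalTransactionId = 3:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define BootstrapTransactionId   ((TransactionId) 1)
#define FrozenTransactionId      ((TransactionId) 2)
#define FirstNormalTransactionId ((TransactionId) 3)

/*
 * Sketch of the proposed behavior: for permanent XIDs (bootstrap/frozen),
 * which can appear in a tuple's xmin after freezing, report "no timestamp
 * available" (false) instead of raising an error -- the same answer callers
 * already get for XIDs older than the retained commit-timestamp range.
 */
bool get_commit_ts(TransactionId xid, TransactionId oldest_retained,
                   int64_t *ts_out)
{
    if (xid < FirstNormalTransactionId)
        return false;       /* frozen/bootstrap: too old to have one */
    if (xid < oldest_retained)
        return false;       /* truncated away: same "too old" answer */
    *ts_out = 42;           /* stand-in for the real SLRU lookup */
    return true;
}
```

The point of the design is that callers need only one code path: a false return uniformly means "no timestamp for this XID", whether it was frozen or simply aged out.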
Re: [HACKERS] postgres 9.3 postgres_fdw ::LOG: could not receive data from client: Connection reset by peer
The local server log has the line and the remote table log is empty (it is configured for minimum warning, and when I produce one it appears in the log OK).

And I have new details - it happens on some additional environments, but not constantly. Some hours it happens every time, then it just stops appearing:

postgres@hostname:~$ grep user 9.3/main/pg_log/postgresql-2016-11-22_00.log | cat -n
 1 2016-11-22 06:00:02 UTC 127.0.0.1 user::LOG: could not receive data from client: Connection reset by peer
 2 2016-11-22 06:00:03 UTC 127.0.0.1 user::LOG: could not receive data from client: Connection reset by peer
 3 2016-11-22 06:00:03 UTC 127.0.0.1 user::LOG: could not receive data from client: Connection reset by peer
 4 2016-11-22 10:06:08 UTC 127.0.0.1 user::LOG: could not receive data from client: Connection reset by peer
 5 2016-11-22 10:06:08 UTC 127.0.0.1 user ::LOG: could not receive data from client: Connection reset by peer
 6 2016-11-22 10:06:08 UTC 127.0.0.1 user::LOG: could not receive data from client: Connection reset by peer
 7 2016-11-22 11:25:27 UTC 127.0.0.1 user::LOG: could not receive data from client: Connection reset by peer
 8 2016-11-22 11:25:27 UTC 127.0.0.1 user::LOG: could not receive data from client: Connection reset by peer
 9 2016-11-22 11:25:27 UTC 127.0.0.1 user::LOG: could not receive data from client: Connection reset by peer

This is the log from the local server, where the user runs the same bunch of three statements every minute. But when it starts happening, if I psql to the db, the same statement does NOT produce this string in the log.

Here is the reason you need not follow this post any more:

postgres=# select version();
version
--
PostgreSQL 9.3.9 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.7.2-5) 4.7.2, 64-bit
(1 row)

I will just upgrade it. And this is the only machine I see this weird behavior on. Sorry guys, and thank you for your time.

2016-11-23 18:16 GMT+00:00 Jeff Janes:
> On Mon, Nov 21, 2016 at 6:32 AM, Vladimir Svedov wrote:
>> Hi, I have this question. I looked for help on http://dba.stackexchange.com/ with no success. Maybe you can answer?.. Thank you in advance.
>>
>> "FOREIGN_TABLE" created with postgres_fdw. LOCAL_TABLE is just a local table...
>>
>> Symptoms:
>> 1. I run in psql the query SELECT * from FOREIGN_TABLE. No log generated.
>> 2. I run in bash psql -c "SELECT * from LOCAL_TABLE". No log generated.
>> 3. I run in bash psql -c "SELECT * from FOREIGN_TABLE". ::LOG: could not receive data from client: Connection reset by peer is generated in the postgres log.
>
> Which server log file is this generated in, the local or the foreign? Whichever it is, is there an entry in the logfile for the other server which seems to match up to this one? That may have more useful details.
>
> Cheers,
> Jeff
Re: [HACKERS] Patch to implement pg_current_logfile() function
On Wed, 23 Nov 2016 23:08:18 -0600 "Karl O. Pinc" wrote:
> On Wed, 23 Nov 2016 03:21:05 -0600 "Karl O. Pinc" wrote:
> > On Sat, 19 Nov 2016 12:58:47 +0100 Gilles Darold wrote:
> > > ... attached v14 of the patch.
>
> > patch_pg_current_logfile-v14.diff.retry_current_logfiles-part3
> > Re-try the write of current_logfiles should it fail because the
> > system is too busy.
>
> patch_pg_current_logfile-v14.diff.doc_linux_default
>
> Applies on top of patch_pg_current_logfile-v14.diff.pg_ctl_and_more_docs
>
> Mentions, in the body of the docs, defaults and their impact on
> current_logfiles and pg_current_logfiles. It seems appropriate
> to mention this in the main documentation and in the overall context
> of logging.

Attached is version 2 of this patch. Adds an index entry.

patch_pg_current_logfile-v14.diff.doc_linux_default-v2

Karl

Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein

patch_pg_current_logfile-v14.diff.doc_linux_default-v2 Description: Binary data
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/22 18:28, Ashutosh Bapat wrote:
> The comments should explain why the assertion is true.
> +/* Shouldn't be NIL */
> +Assert(tlist != NIL);

I noticed that I was wrong; in the assertion, the tlist can be empty. An example of such a case is:

SELECT 1 FROM (SELECT c1 FROM ft4 WHERE c1 between 50 and 60) t1 FULL JOIN (SELECT c1 FROM ft5 WHERE c1 between 50 and 60) t2 ON (TRUE);

In this example no columns of the relations ft4 and ft5 are needed for the join or the final output, so both of the tlists for the relations created in deparseRangeTblRef would be empty. Attached is an updated version fixing this bug. Changes are:

* I removed make_outerrel_subquery/make_innerrel_subquery from fpinfo and added a new member "is_subquery_rel" to fpinfo, as a flag on whether to deparse the relation as a subquery.
* I modified deparseRangeTblRef, deparseSelectSql, and is_subquery_var to look at is_subquery_rel, not make_outerrel_subquery/make_innerrel_subquery.
* I modified appendSubselectAlias to handle the case where ncols = 0 properly.
* I renamed subquery_rels in fpinfo to lower_subquery_rels, to distinguish it clearly from is_subquery_rel.
* I added regression tests for that.

The rest of the patch is the same as the previous version, but for:

+/* Should be same length */
+Assert(list_length(tlist) == list_length(foreignrel->reltarget->exprs));

I added comments explaining the reason.

> The name get_subselect_alias_id() seems to suggest that the function
> returns an identifier for a subselect alias, which isn't correct. It
> returns the alias identifiers for deparsing a Var node. But I guess we
> have left this to the committer's judgement.

I agree that the name isn't good, so I changed it to get_relation_column_alias_ids(). Let me know if that seems better. I also revised code and comments a bit, just for consistency.

I will update another patch for PHVs on top of the attached one.
Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SS_REL_ALIAS_PREFIX "s" + #define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 167,172 static void appendConditions(List *exprs, deparse_expr_cxt *context); --- 169,181 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); + static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, + RelOptInfo *foreignrel, List **params_list); + static void appendSubselectAlias(StringInfo buf, int relno, int ncols); + static void get_relation_column_alias_ids(Var *node, RelOptInfo *foreignrel, + int *relno, int *colno); + static bool is_subquery_var(Var *node, RelOptInfo *foreignrel, int *relno, + int *colno); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, *** *** 990,1001 deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context) */ appendStringInfoString(buf, "SELECT "); if (foreignrel->reloptkind == RELOPT_JOINREL || ! foreignrel->reloptkind == RELOPT_UPPER_REL) ! { ! /* For a join relation use the input tlist */ deparseExplicitTargetList(tlist, retrieved_attrs, context); - } else { /* --- 999,1013 */ appendStringInfoString(buf, "SELECT "); + /* + * For a join relation or an upper relation, use deparseExplicitTargetList. + * Likewise, for a relation that is being deparsed as a subquery, use that + * function. Otherwise, use deparseTargetList. 
+ */ if (foreignrel->reloptkind == RELOPT_JOINREL || ! foreignrel->reloptkind == RELOPT_UPPER_REL || ! fpinfo->is_subquery_rel) deparseExplicitTargetList(tlist, retrieved_attrs, context); else { /* *** *** 1155,1165 deparseLockingClause(deparse_expr_cxt *context) --- 1167,1185 StringInfo buf = context->buf; PlannerInfo *root = context->root; RelOptInfo *rel = context->scanrel; + PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) rel->fdw_private; int relid = -1; while ((relid = bms_next_member(rel->relids, relid)) >= 0) { /* + * Ignore relation if it appears in a lower subquery. Locking clause + * for such a relation, if needed, is included in the subquery. + */ + if (bms_is_member(relid, fpinfo->lower_subquery_rels)) + continue; + + /* * Add FOR UPDATE/SHARE if appropr
Re: [HACKERS] Patch: Implement failover on libpq connect level.
On Wed, Nov 23, 2016 at 10:19 PM, Catalin Iacob wrote:
> On Tue, Nov 22, 2016 at 8:38 AM, Tsunakawa, Takayuki < tsunakawa.ta...@jp.fujitsu.com> wrote:
> > If you want to connect to a server where the transaction is read-only, then shouldn't the connection parameter be something like "target_session_attrs=readonly"? That represents exactly what the code does.
>
> FWIW I find this to be a reasonable compromise. To keep the analogy with the current patch it would be more something like "target_session_attrs=read_write|any".

I have taken this suggestion and renamed target_server_type to target_session_attrs, with two possible values: "read-write" and "any". Maybe we could expand it to "readonly" and "prefer-readonly" in a next patch proposal. Attaching the patch for the same.

-- Thanks and Regards Mithun C Y EnterpriseDB: http://www.enterprisedb.com

failover-to-new-writable-session-05.patch Description: Binary data
Re: [HACKERS] Gather Merge
On Wed, Nov 16, 2016 at 3:10 PM, Rushabh Lathia wrote: > > > On Mon, Nov 14, 2016 at 3:51 PM, Thomas Munro < > thomas.mu...@enterprisedb.com> wrote: > >> On Sat, Nov 12, 2016 at 1:56 AM, Rushabh Lathia >> wrote: >> > On Fri, Nov 4, 2016 at 8:30 AM, Thomas Munro < >> thomas.mu...@enterprisedb.com> >> > wrote: >> >> + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development >> Group >> >> + * Portions Copyright (c) 1994, Regents of the University of >> California >> >> >> >> Shouldn't this say just "(c) 2016, PostgreSQL Global Development >> >> Group"? >> > >> > Fixed. >> >> The year also needs updating to 2016 in nodeGatherMerge.h. >> > > Oops sorry, fixed now. > > >> >> >> + /* Per-tuple heap maintenance cost */ >> >> + run_cost += path->path.rows * comparison_cost * 2.0 * logN; >> >> >> >> Why multiply by two? The comment above this code says "about log2(N) >> >> comparisons to delete the top heap entry and another log2(N) >> >> comparisons to insert its successor". In fact gather_merge_getnext >> >> calls binaryheap_replace_first, which replaces the top element without >> >> any comparisons at all and then performs a sift-down in log2(N) >> >> comparisons to find its new position. There is no per-tuple "delete" >> >> involved. We "replace" the top element with the value it already had, >> >> just to trigger the sift-down, because we know that our comparator >> >> function might have a new opinion of the sort order of this element. >> >> Very clever! The comment and the 2.0 factor in cost_gather_merge seem >> >> to be wrong though -- or am I misreading the code? >> >> >> > See cost_merge_append. >> >> That just got tweaked in commit 34ca0905. >> > > Fixed. > > >> >> > Looking at the plan I realize that this is happening because wrong >> costing >> > for Gather Merge. Here in the plan we can see the row estimated by >> > Gather Merge is wrong. 
This is because earlier patch GM was considering >> > rows = subpath->rows, which is not true as subpath is partial path. So >> > we need to multiple it with number of worker. Attached patch also fixed >> > this issues. I also run the TPC-H benchmark with the patch and results >> > are same as earlier. >> >> In create_grouping_paths: >> + double total_groups = gmpath->rows * >> gmpath->parallel_workers; >> >> This hides a variable of the same name in the enclosing scope. Maybe >> confusing? >> >> In some other places like create_ordered_paths: >> + double total_groups = path->rows * path->parallel_workers; >> >> Though it probably made sense to use this variable name in >> create_grouping_paths, wouldn't total_rows be better here? >> >> > Initially I just copied from the other places. I agree with you that > create_ordered_paths - total_rows make more sense. > > >> It feels weird to be working back to a total row count estimate from >> the partial one by simply multiplying by path->parallel_workers. >> Gather Merge will underestimate the total rows when parallel_workers < >> 4, if using partial row estimates ultimately from cost_seqscan which >> assume some leader contribution. I don't have a better idea though. >> Reversing cost_seqscan's logic certainly doesn't seem right. I don't >> know how to make them agree on the leader's contribution AND give >> principled answers, since there seems to be some kind of cyclic >> dependency in the costing logic (cost_seqscan really needs to be given >> a leader contribution estimate from its superpath which knows whether >> it will allow the leader to pull tuples greedily/fairly or not, but >> that superpath hasn't been created yet; cost_gather_merge needs the >> row count from its subpath). Or maybe I'm just confused. >> >> > Yes, I agree with you. But we can't really do changes into cost_seqscan. 
> Another option I can think of is to just calculate the rows for gather merge
> by using the reverse of the formula used in cost_seqscan. So
> we can completely remove the rows argument from
> create_gather_merge_path(),
> and then inside create_gather_merge_path() calculate the total_rows
> using the same formula used in cost_seqscan. This is working
> fine - but I'm not quite sure about the approach, so I attached that part
> of the changes as a separate patch. Any suggestions?

In an offline discussion with Thomas, I realized that this won't work. It will work only if the subplan is a seqscan, so this logic won't be enough to estimate the rows. I guess, as Thomas said earlier, this is not a problem with the GatherMerge implementation as such - so we will keep this separate.

Apart from this, my colleague Neha Sharma reported a server crash with the patch. It was hitting the Assert below in create_gather_merge_path():

Assert(pathkeys);

Basically, when the query is something like "select * from foo where a = 1 order by a;", the query has a sort clause but the planner won't generate a sort key, because the equality clause is on the same column. Fix is about making sure of pathkeys
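Thomas's costing point upthread — that gather merge replaces the heap top in place rather than deleting and reinserting — can be demonstrated with a toy min-heap. This is a sketch, not the PostgreSQL binaryheap code: replace_first overwrites the root and restores the heap with a single top-to-bottom sift-down pass, so there is no separate "delete the top entry" pass to charge for.

```c
#include <assert.h>

/* Minimal min-heap over ints, counting comparisons during sift-down. */
typedef struct { int *a; int n; int cmps; } Heap;

static void sift_down(Heap *h, int i)
{
    for (;;)
    {
        int l = 2 * i + 1, r = l + 1, s = i;
        if (l < h->n) { h->cmps++; if (h->a[l] < h->a[s]) s = l; }
        if (r < h->n) { h->cmps++; if (h->a[r] < h->a[s]) s = r; }
        if (s == i) return;
        int tmp = h->a[i]; h->a[i] = h->a[s]; h->a[s] = tmp;
        i = s;
    }
}

/*
 * Replace the top element in place and restore heap order with one
 * sift-down: a single pass over at most log2(n) levels, with no
 * separate "delete" followed by "insert".
 */
void heap_replace_first(Heap *h, int v)
{
    h->a[0] = v;
    sift_down(h, 0);
}
```

This is the pattern binaryheap_replace_first enables: the leader pulls the next tuple from the same worker's stream, overwrites the root slot, and sifts once, which is why the 2.0 factor in the cost model looked suspect.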
Re: [HACKERS] Declarative partitioning - another take
On 2016/11/23 4:50, Robert Haas wrote: > On Tue, Nov 22, 2016 at 4:15 AM, Amit Langote > wrote: >>> The easiest thing to do might be to just enforce that all of the >>> partition key columns have to be not-null when the range-partitioned >>> table is defined, and reject any attempt to DROP NOT NULL on them >>> later. That's probably better that shoehorning it into the table >>> constraint. >> >> Agreed that forcing range partitioning columns to be NOT NULL during table >> creation would be a better approach. But then we would have to reject >> using expressions in the range partition key, right? > > Why? I was thinking of it like how primary key columns cannot contain expressions; the reason for which, I assume, is because while we can ensure that a column contains only non-null values by defining a constraint on the column, there is no way to force expressions to be non-null. In any case, I have implemented in the latest patch that when creating a range partitioned table, its partition key columns are automatically set to be NOT NULL. Although, it leaves out the columns that are referenced in expressions. So even after doing so, we need to check after computing the range partition key of a input row, that none of the partitions keys is null, because expressions can still return null. Also, it does nothing to help the undesirable situation that one can insert a row with a null partition key (expression) into any of the range partitions if targeted directly, because of how ExecQual() handles nullable constraint expressions (treats null value as satisfying the partition constraint). An alternative possibly worth considering might be to somehow handle the null range partition keys within the logic to compare against range bound datums. It looks like other databases will map the rows containing nulls to the unbounded partition. 
One database allows specifying NULLS FIRST/LAST and maps a row containing null key to the partition with -infinity as the lower bound or +infinity as the upper bound, respectively with NULLS LAST the default behavior. In our case, if we allowed a similar way of defining a range partitioned table: create table p (a int, b int) partition by range nulls first (a); create table p0 partition of p for values from (unbounded) to (1); create table p1 partition of p for values from (1) to (10); create table p2 partition of p for values from (10) to (unbounded); Row (null, 1) will be mapped to p0. If we didn't have p0, we would report the "partition not found" error. In case of a multi-column key: create table p (a int, b int) partition by range (a, b); create table p0 partition of p for values from (1, unbounded) to (1, 1); create table p1 partition of p for values from (1, 1) to (1, 10); create table p2 partition of p for values from (1, 10) to (1, unbounded); Row (1, null) will be mapped to p2 (default nulls last behavior). But I guess we still end up without a solution for the problem that a row with null partition key (expression) could be inserted into any of the range partitions if targeted directly. Thoughts? >> As a result, one change became necessary: to how we flag individual range >> bound datum as infinite or not. Previously, it was a regular Boolean >> value (either infinite or not) and to distinguish +infinity from >> -infinity, we looked at whether the bound is lower or upper (the lower >> flag). Now, instead, the variable holding the status of individual range >> bound datum is set to a ternary value: RANGE_DATUM_FINITE (0), >> RANGE_DATUM_NEG_INF (1), and RANGE_DATUM_POS_INF (2), which still fits in >> a bool. > > You better not be using a bool to represent a ternary value! Use an > enum for that -- or if in the system catalogs, a char. OK, created an enum called RangeDatumContent. 
In the system catalog, we still store the boolean value; it is only after we read it into the relcache structure that we use one of these enum values. I'm worried though that using enum would consume more memory (we need to store nparts * partnattrs instances of the enum). Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
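The NULLS FIRST/LAST mapping described above boils down to how a null partition key compares against range bounds: treat it as sorting just above -infinity (NULLS FIRST) or just below +infinity (NULLS LAST). A toy comparator (hypothetical types, not the patch's code) reproduces the p0/p1/p2 example:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { BOUND_NEG_INF, BOUND_FINITE, BOUND_POS_INF } BoundKind;
typedef struct { BoundKind kind; int value; } Bound;
typedef struct { bool isnull; int value; } Key;

/* Compare a (possibly null) key with a range bound; returns <0, 0, or >0.
 * NULL sorts just above -infinity (NULLS FIRST) or just below +infinity
 * (NULLS LAST), so it lands in the unbounded partition on that side. */
int key_cmp_bound(Key k, Bound b, bool nulls_first)
{
    if (k.isnull)
    {
        if (nulls_first)
            return (b.kind == BOUND_NEG_INF) ? 1 : -1;
        return (b.kind == BOUND_POS_INF) ? -1 : 1;
    }
    if (b.kind == BOUND_NEG_INF) return 1;
    if (b.kind == BOUND_POS_INF) return -1;
    return (k.value > b.value) - (k.value < b.value);
}

/* A partition [lo, hi) accepts the key if lo <= key < hi. */
bool in_partition(Key k, Bound lo, Bound hi, bool nulls_first)
{
    return key_cmp_bound(k, lo, nulls_first) >= 0 &&
           key_cmp_bound(k, hi, nulls_first) < 0;
}
```

Under this scheme a null key always has a home partition whenever an unbounded partition exists on the relevant side; without one, the comparator yields "no match", which corresponds to the "partition not found" error described above.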
Re: [HACKERS] DISTINCT with btree skip scan
On 23 November 2016 at 21:19, Thomas Munro wrote:
> Worth pursuing? Does amskip suck? Does anyone have better ideas,
> either for how to do the low level skip or the higher level Index Skip
> Scan, or perhaps a completely different way of looking at this?

I have no helpful suggestions for how to achieve it, but can I add a voice of encouragement: there have been a good few occasions in the past year (we moved from another db to PG) where the lack of skip scans has bitten us. In our case it was using MIN() and GROUP BY, where the grouping column was the first element in the compound index and the aggregate column was the second: in the Other DB that sort of query was extremely quick, not so much here.

I was also idly pondering (in one of those moments of conceited self-delusion where I thought I might actually have enough spare time to try to work on it myself) whether it would be possible to implement that sort of skip with two indexes (so MIN(a) GROUP BY b with separate indexes on (a) and (b) rather than a single index (a,b)); I never got much further than idle musings though.

Geoff
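For what it's worth, the skipping trick itself is easy to model outside an index. Over a sorted sequence standing in for a btree's leading key column, each distinct value can be found with a binary-search jump past the current run of duplicates — the essence of the proposed amskip/Index Skip Scan. This is only a sketch; a real implementation would re-descend the btree from the root rather than binary-search an array:

```c
#include <assert.h>

/* First index in sorted a[0..n) with a[i] > key (an upper_bound search). */
static int upper_bound(const int *a, int n, int key)
{
    int lo = 0, hi = n;
    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] <= key) lo = mid + 1; else hi = mid;
    }
    return lo;
}

/*
 * "Index skip scan" over a sorted column: emit each distinct value,
 * skipping runs of duplicates with O(log n) jumps instead of visiting
 * every entry. Returns the number of distinct values written to out[].
 */
int distinct_skip_scan(const int *a, int n, int *out)
{
    int count = 0, pos = 0;
    while (pos < n)
    {
        out[count++] = a[pos];
        pos = upper_bound(a, n, a[pos]);  /* jump past the duplicate run */
    }
    return count;
}
```

With d distinct values among n entries this does O(d log n) work instead of O(n), which is exactly why DISTINCT and MIN()/GROUP BY on a leading index column benefit so much.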
[HACKERS] Functions Immutable but not parallel safe?
There are 11 functions which are marked immutable but also marked parallel unsafe.

postgres=# select proname from pg_proc where provolatile = 'i' and proparallel = 'u';
proname
-
_pg_expandarray
_pg_keysequal
_pg_truetypid
_pg_truetypmod
_pg_char_max_length
_pg_char_octet_length
_pg_numeric_precision
_pg_numeric_precision_radix
_pg_numeric_scale
_pg_datetime_precision
_pg_interval_type
(11 rows)

I'm finding it hard to imagine a reason why these might be unsafe, but have failed. I do notice they're all only used in information_schema. Could it just be that these missed the verification process the other functions went through to determine their parallel safety?

-- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] PassDownLimitBound for ForeignScan/CustomScan [take-2]
On Mon, Nov 21, 2016 at 1:59 PM, Kouhei Kaigai wrote:
> Hello,
>
> The attached patch is a revised version of pass-down LIMIT to FDW/CSP.
>
> Below are the updates from the last version.
>
> 'ps_numTuples' of PlanState was declared as uint64, instead of long,
> to avoid problems on 32-bit machines when a large LIMIT clause is
> supplied.
>
> 'ps_numTuples' is re-interpreted; 0 means that its upper node wants
> to fetch all the tuples. It allows us to eliminate a boring initialization
> in the ExecInit handler for each executor node.
>
> Even though it was not suggested, estimate_path_cost_size() of postgres_fdw
> adjusts the number of rows if the foreign-path is located on top of
> the base relations and the LIMIT clause takes a constant value.
> It will produce a more adequate plan, as follows:
>
> * WITHOUT this patch
>
> postgres=# explain verbose select * from t_a, t_b where t_a.id = t_b.id
> and t_a.x < t_b.x LIMIT 100;
>                                  QUERY PLAN
> ---------------------------------------------------------------------------
>  Limit  (cost=261.17..274.43 rows=100 width=88)
>    Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y
>    ->  Hash Join  (cost=261.17..581.50 rows=2416 width=88)
>          Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y
>          Hash Cond: (t_a.id = t_b.id)
>          Join Filter: (t_a.x < t_b.x)
>          ->  Foreign Scan on public.t_a  (cost=100.00..146.12 rows=1204 width=44)
>                Output: t_a.id, t_a.x, t_a.y
>                Remote SQL: SELECT id, x, y FROM public.t
>          ->  Hash  (cost=146.12..146.12 rows=1204 width=44)
>                Output: t_b.id, t_b.x, t_b.y
>                ->  Foreign Scan on public.t_b  (cost=100.00..146.12 rows=1204 width=44)
>                      Output: t_b.id, t_b.x, t_b.y
>                      Remote SQL: SELECT id, x, y FROM public.t
> (14 rows)
>
> * WITH this patch
>
> postgres=# explain verbose select * from t_a, t_b where t_a.id = t_b.id
> and t_a.x < t_b.x LIMIT 100;
>                                  QUERY PLAN
> ---------------------------------------------------------------------------
>  Limit  (cost=100.00..146.58 rows=100 width=88)
>    Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y
>    ->  Foreign Scan  (cost=100.00..146.58 rows=100 width=88)
>          Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y
>          Relations: (public.t_a) INNER JOIN (public.t_b)
>          Remote SQL: SELECT r1.id, r1.x, r1.y, r2.id, r2.x, r2.y FROM
> (public.t r1 INNER JOIN public.t r2 ON (((r1.x < r2.x)) AND ((r1.id =
> r2.id
> (6 rows)

That's nice.

> On the other hand, I noticed it is not safe to attach the LIMIT clause at
> the planner stage because root->limit_tuples is declared as double.
> Even if the LIMIT clause takes a constant value, it is potentially larger
> than 2^53, which is the limit we can represent accurately with the
> float64 data type, but a LIMIT clause allows up to 2^63-1.
> So, postgres_fdw now attaches the LIMIT clause to the remote query at
> execution time only.

I think it's OK. Here are a few comments on the latest patch:

1. make/make check is fine; however, I am getting a regression failure in
the postgres_fdw contrib module (attached regression.diff). Please
investigate and fix.

2.
+ *
+ * MEMO: root->limit_tuples is not attached when query contains
+ * grouping-clause or aggregate functions. So, we don's adjust
+ * rows even if LIMIT is supplied.

Can you please explain why you are not doing this for grouping clauses or
aggregate functions?

3. Typo: don's => don't

Rest of the changes look good to me. Thanks

> Thanks,
>
> PG-Strom Project / NEC OSS Promotion Center
> KaiGai Kohei
>
> > -----Original Message-----
> > From: Robert Haas [mailto:robertmh...@gmail.com]
> > Sent: Thursday, November 10, 2016 3:08 AM
> > To: Kaigai Kouhei(海外 浩平)
> > Cc: pgsql-hackers@postgresql.org; Jeevan Chalke
> > ; Etsuro Fujita
> > ; Andres Freund
> > Subject: ##freemail## Re: PassDownLimitBound for ForeignScan/CustomScan
> > [take-2]
> >
> > On Mon, Oct 31, 2016 at 10:20 AM, Kouhei Kaigai
> > wrote:
> > > As an example, I enhanced postgres_fdw to understand the ps_numTuples
> > > if it is set. If and when remote ORDER BY is pushed down, the latest
> > > code tries to sort the entire remote table because it does not know
> > > how many rows to be returned. Thus, it took larger execution time.
> > > On the other hand, the patched one runs the remote query with a LIMIT
> > > clause according to the ps_numTuples, which is informed by the Limit
> > > node on top of the ForeignScan node.
> >
> > So there are two cases here. If the user says LIMIT 12, we could in
> > theory know that at planner time and optimize accordingly. If the user
> > says LIMIT twelve(), however, we will need to wait until execution time
> > unless twelve() happens to be capable of being simplified to a
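The distinction being drawn, a constant versus a computed LIMIT, looks like this (twelve() is the hypothetical user function from Robert's example):

```sql
-- A constant bound is visible at plan time, so the planner could adjust
-- the foreign path's cost and row estimate accordingly
SELECT * FROM t_a JOIN t_b USING (id) LIMIT 100;

-- A non-constant bound is only known at execution time, so the remote
-- LIMIT can only be attached then, via ps_numTuples
SELECT * FROM t_a JOIN t_b USING (id) LIMIT twelve();
```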
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/24 18:20, Ashutosh Bapat wrote: I wrote: You missed the point; the foreignrel->reltarget->exprs doesn't contain any PHVs, so the tlist created by build_tlist_to_deparse will be guaranteed to be one-to-one with the foreignrel->reltarget->exprs. You wrote: It's guaranteed now, but cannot be forever. This means anybody who supports PHVs tomorrow will have to "remember" to update this code as well. If s/he misses it, a bug will be introduced. That's the avenue for bugs I am talking about. I wrote: It *can* be guaranteed. See another patch for supporting evaluating PHVs remotely. We can't base assumptions in this patch on something in the other patch, esp. when that's not even reviewed once. PHV is just one case; subqueries involving aggregates is another, and there are others. Let's discuss this together with the patch for PHVs. That was one of the reasons why I had merged the two patches into one. I'd like to leave enhancements such as subqueries involving aggregates for future work, though. But that's not really the point. The point is add_to_flat_tlist(pull_var_clause(rel->reltarget->exprs)) cannot be proved to be the same as rel->reltarget->exprs in general. Really? As mentioned in the comments for RelOptInfo in relation.h shown below, the rel->reltarget->exprs for a base/join relation only contains Vars and PHVs, so the two things would be proved to be one-to-one. (Note: we don't need to care about the appendrel-child-relation case because appendrel child relations wouldn't appear in the main join tree.)

 * reltarget - Default Path output tlist for this rel; normally contains
 *             Var and PlaceHolderVar nodes for the values we need to
 *             output from this relation.
 *             List is in no particular order, but all rels of an
 *             appendrel set must use corresponding orders.
 *             NOTE: in an appendrel child relation, may contain
 *             arbitrary expressions pulled up from a subquery!
Best regards, Etsuro Fujita
Re: [HACKERS] Push down more full joins in postgres_fdw
> > build_tlist_to_deparse() calls pull_var_clause() before creating the tlist, whereas the code that searches does not do that. Code-wise those are not the same things. > > >>> You missed the point; the foreignrel->reltarget->exprs doesn't contain >>> any >>> PHVs, so the tlist created by build_tlist_to_deparse will be guaranteed to >>> be >>> one-to-one with the foreignrel->reltarget->exprs. > > >> It's guaranteed now, but cannot be forever. This means anybody who >> supports PHVs tomorrow will have to "remember" to update this code as >> well. If s/he misses it, a bug will be introduced. That's the avenue >> for bugs I am talking about. > > > It *can* be guaranteed. See another patch for supporting evaluating PHVs > remotely. We can't base assumptions in this patch on something in the other patch, esp. when that's not even reviewed once. PHV is just one case; subqueries involving aggregates is another, and there are others. But that's not really the point. The point is add_to_flat_tlist(pull_var_clause(rel->reltarget->exprs)) cannot be proved to be the same as rel->reltarget->exprs in general. So, we should not base our code on an assumption that cannot be proved. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Physical append-only tables
On Thu, Nov 24, 2016 at 2:26 AM, Bruce Momjian wrote: > On Mon, Nov 14, 2016 at 08:43:12PM +0000, Greg Stark wrote: > > That said, I don't think the "maintain clustering a bit better using > > BRIN" is a bad idea. It's just the bit about turning a table > > append-only to deal with update-once data that I think is overreach. > > What if we used BRIN to find heap pages where the new row was in the > page BRIN min/max range, and the heap page had free space. Only if that > fails do we put it somewhere else in the heap. > That would certainly be useful. You'd have to figure out what to do in the case of multiple conflicting BRIN indexes (which you shouldn't have in the first place, but that won't keep people from having them), but other than that it would be quite good, I think. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
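For concreteness, a minimal sketch of the setup this idea targets (table and index names are illustrative): a BRIN index keeps only a per-block-range min/max summary, and the proposal is to consult those summaries at insert/update time to prefer a heap page whose range already covers the new value:

```sql
-- An append-mostly table whose physical order correlates with the key
CREATE TABLE events (
    created_at timestamptz NOT NULL,
    payload    text
);

-- BRIN stores one min/max summary per range of 32 heap pages, so the
-- index stays tiny even on very large tables
CREATE INDEX events_created_brin ON events
    USING brin (created_at) WITH (pages_per_range = 32);

-- Today, an UPDATE may relocate a row to any page with free space,
-- widening the min/max ranges and degrading the index's selectivity;
-- Bruce's suggestion would steer the new tuple to a page whose BRIN
-- summary already covers its created_at value
UPDATE events SET payload = 'x' WHERE created_at = '2016-11-24';
```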
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/24 17:39, Ashutosh Bapat wrote: On Thu, Nov 24, 2016 at 1:27 PM, Etsuro Fujita wrote: On 2016/11/24 16:46, Ashutosh Bapat wrote: table will be misleading as subquery can represent a join and the corresponding alias would represent the join. Relation is a better term. But the documentation uses the word "table" for a join relation. See 7.2.1.2. Table and Column Aliases: https://www.postgresql.org/docs/9.6/static/queries-table-expressions.html#QUERIES-TABLE-ALIASES The definition there says "A temporary name can be given to tables and complex table references to be used for references to the derived table in the rest of the query. This is called a table alias.", please note the usage "complex table references". Within the code, we do not refer to a relation as a table. So, the comment in this code should not refer to a relation as a table either. OK, will keep using the term "relation". I wrote: I don't think so; in the current version, we essentially deparse from and search into the same object, i.e., foreignrel->reltarget->exprs, since the tlist created by build_tlist_to_deparse is built from the foreignrel->reltarget->exprs. Also, since it is just created for deparsing the relation as a subquery in deparseRangeTblRef and isn't stored into fpinfo or anywhere else, we would need no concern about creating such avenues. IMO I think adding comments would be enough for this. build_tlist_to_deparse() calls pull_var_clause() before creating the tlist, whereas the code that searches does not do that. Code-wise those are not the same things. You missed the point; the foreignrel->reltarget->exprs doesn't contain any PHVs, so the tlist created by build_tlist_to_deparse will be guaranteed to be one-to-one with the foreignrel->reltarget->exprs. It's guaranteed now, but cannot be forever. This means anybody who supports PHVs tomorrow will have to "remember" to update this code as well. If s/he misses it, a bug will be introduced. That's the avenue for bugs I am talking about.
It *can* be guaranteed. See another patch for supporting evaluating PHVs remotely. Best regards, Etsuro Fujita
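For readers following the thread, the deparsing under discussion wraps each side of a pushed-down join in a subquery with generated aliases. Roughly (an illustrative shape only, not actual postgres_fdw output; ft1/ft2 and the remote table names are hypothetical):

```sql
-- Local query over two foreign tables
SELECT ft1.a, ft2.b
FROM ft1 FULL JOIN ft2 ON (ft1.id = ft2.id);

-- Remote SQL of roughly this shape: each input relation becomes a
-- subquery with a generated relation alias (sN) and column aliases (cN);
-- build_tlist_to_deparse decides which columns each subquery must emit,
-- which is why its one-to-one correspondence with reltarget->exprs matters
SELECT s1.c1, s2.c2
FROM (SELECT a c1, id c2 FROM public.t1) s1
  FULL JOIN (SELECT b c1, id c2 FROM public.t2) s2
    ON (s1.c2 = s2.c2);
```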
Re: [HACKERS] Push down more full joins in postgres_fdw
On Thu, Nov 24, 2016 at 1:27 PM, Etsuro Fujita wrote: > On 2016/11/24 16:46, Ashutosh Bapat wrote: Sorry. I think the current version is better than the previous one. The term "subselect alias" is confusing in the previous version. In the current version, "Get the relation and column alias for a given Var node," we need to add the word "identifiers", like "Get the relation and column identifiers for a given Var node". > > > I wrote: >>> >>> OK, but one thing I'm concerned about is the term "relation alias" seems >>> a >>> bit confusing because we already used the term for the alias of the form >>> "rN". To avoid that, how about saying "table alias", not "relation >>> alias"? >>> (in which case, the comment would be something like "Get the table and >>> column identifiers for a given Var node".) > > >> table will be misleading as subquery can represent a join and >> the corresponding alias would represent the join. Relation is a better term. > > > But the documentation uses the word "table" for a join relation. See > 7.2.1.2. Table and Column Aliases: > https://www.postgresql.org/docs/9.6/static/queries-table-expressions.html#QUERIES-TABLE-ALIASES The definition there says "A temporary name can be given to tables and complex table references to be used for references to the derived table in the rest of the query. This is called a table alias.", please note the usage "complex table references". Within the code, we do not refer to a relation as a table. So, the comment in this code should not refer to a relation as a table either. > >>> I don't think so; in the current version, we essentially deparse from and >>> search into the same object, i.e., foreignrel->reltarget->exprs, since the >>> tlist created by build_tlist_to_deparse is built from the >>> foreignrel->reltarget->exprs. Also, since it is just created for >>> deparsing >>> the relation as a subquery in deparseRangeTblRef and isn't stored into >>> fpinfo or anywhere else, we would need no concern about creating such >>> avenues.
IMO I think adding comments would be enough for this. > > >> build_tlist_to_deparse() calls pull_var_clause() before creating the >> tlist, whereas the code that searches does not do that. Code-wise >> those are not the same things. > > > You missed the point; the foreignrel->reltarget->exprs doesn't contain any > PHVs, so the tlist created by build_tlist_to_deparse will be guaranteed to be > one-to-one with the foreignrel->reltarget->exprs. It's guaranteed now, but cannot be forever. This means anybody who supports PHVs tomorrow will have to "remember" to update this code as well. If s/he misses it, a bug will be introduced. That's the avenue for bugs I am talking about. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/24 16:46, Ashutosh Bapat wrote: Sorry. I think the current version is better than the previous one. The term "subselect alias" is confusing in the previous version. In the current version, "Get the relation and column alias for a given Var node," we need to add the word "identifiers", like "Get the relation and column identifiers for a given Var node". I wrote: OK, but one thing I'm concerned about is the term "relation alias" seems a bit confusing because we already used the term for the alias of the form "rN". To avoid that, how about saying "table alias", not "relation alias"? (in which case, the comment would be something like "Get the table and column identifiers for a given Var node".) table will be misleading as subquery can represent a join and the corresponding alias would represent the join. Relation is a better term. But the documentation uses the word "table" for a join relation. See 7.2.1.2. Table and Column Aliases: https://www.postgresql.org/docs/9.6/static/queries-table-expressions.html#QUERIES-TABLE-ALIASES I don't think so; in the current version, we essentially deparse from and search into the same object, i.e., foreignrel->reltarget->exprs, since the tlist created by build_tlist_to_deparse is built from the foreignrel->reltarget->exprs. Also, since it is just created for deparsing the relation as a subquery in deparseRangeTblRef and isn't stored into fpinfo or anywhere else, we would need no concern about creating such avenues. IMO I think adding comments would be enough for this. build_tlist_to_deparse() calls pull_var_clause() before creating the tlist, whereas the code that searches does not do that. Code-wise those are not the same things. You missed the point; the foreignrel->reltarget->exprs doesn't contain any PHVs, so the tlist created by build_tlist_to_deparse will be guaranteed to be one-to-one with the foreignrel->reltarget->exprs.
Best regards, Etsuro Fujita