Re: [HACKERS] pg_dump dump catalog ACLs

2016-04-25 Thread Noah Misch
On Mon, Apr 25, 2016 at 12:39:09AM -0400, Stephen Frost wrote:
> * Noah Misch (n...@leadboat.com) wrote:
> > On Fri, Apr 22, 2016 at 12:31:41PM -0400, Stephen Frost wrote:
> > > After looking through the code a bit, I realized that there are a lot of
> > > object types which don't have ACLs at all but which exist in pg_catalog
> > > and were being analyzed because the bitmask for pg_catalog included ACLs
> > > and therefore was non-zero.
> > > 
> > > Clearing that bit for object types which don't have ACLs improved the
> > > performance for empty databases quite a bit (from about 3s to a bit
> > > under 1s on my laptop).  That's a 42-line patch, with comment lines
> > > being half of that, which I'll push once I've looked into the other
> > > concerns which were brought up on this thread.
> > 
> > That's good news.
> 
> Attached patch-set includes this change in patch #2.

Timings for the 100-database pg_dumpall:

HEAD:   131s
HEAD+patch:  33s
9.5:  8.6s

Nice improvement for such a simple patch.
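
For reference, the shape of the change being described is roughly the
following (an illustrative sketch only: DUMP_COMPONENT_ACL is the component
flag named later in this thread, while the surrounding names here are
hypothetical):

/*
 * Sketch: when computing the default dump mask for an object in
 * pg_catalog, drop the ACL component for object types that can never
 * carry ACLs, so the mask can become zero and the object is skipped.
 */
if (!object_type_has_acls)
    dobj->dump &= ~DUMP_COMPONENT_ACL;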

> Patch #1 is the fix for the incorrect WHERE clause.

I confirmed that this fixed dumping of the function ACL in my test case.

> For languages, I believe that means that we simply need to modify the
> selectDumpableProcLang() function to use the same default we use for the
> 'pg_catalog' namespace- DUMP_COMPONENT_ACL, instead of
> DUMP_COMPONENT_NONE.

Makes sense.

> What's not clear to me is what, if any, issue there is with namespaces.

As far as I know, none.  The current behavior doesn't match the commit
message, but I think the current behavior is better.

> Certainly, in my testing at least, if you do:
> 
> REVOKE CREATE ON SCHEMA public FROM public;
> 
> Then you get the appropriate commands from pg_dump to implement the
> resulting ACLs on the public schema.  If you change the permissions back
> to match what is there at initdb-time (or you just don't change them),
> then there aren't any GRANT or REVOKE commands from pg_dump, but that's
> correct, isn't it?

Good question.  I think it's fine and possibly even optimal.  One can imagine
other designs that remember whether any GRANT or REVOKE has happened since
initdb, but I see no definite reason to prefer that alternative.

> In the attached patch-set, patch #3 includes support for
> 
> src/bin/pg_dump: make check
> 
> implemented using the TAP testing system.  There are a total of 360
> tests, generally covering:

I like the structure of this test suite.

> +# Start with 1 because of non-existant database test below
> +# Test connecting to a non-existant database

Spelling.




Re: [HACKERS] Protocol buffer support for Postgres

2016-04-25 Thread Craig Ringer
On 26 April 2016 at 14:06, 陈天舟  wrote:

> I am interested in adding Protocol Buffer support for Postgres. Protocol
> Buffer occupies less space than JSON. More importantly, it has a schema
> and is forward/backward compatible. All this makes it a very good format
> for persistence.
>
> Here are two rough ideas I have right now:
>
> Approach 1:
>
> Create a datatype "PROTOBUF" similar to "JSON" and implement all the
> similar support (e.g. indexing) as is done for the "JSON" type.
> (1) Since each protocol buffer column requires a schema, I am not sure
> where the best place is to store that schema info. Should it be in a
> CONSTRAINT (though I am not able to find docs describing any custom
> constraint), or in the COMMENT, or somewhere else?
>

I can't really imagine how you'd do that without adding a new catalog like
we have for enum members. A typmod isn't sufficient since you need a whole
lot more than an integer, and typmods aren't tracked throughout the server
that well.

That'll make it hard to do it with an extension.
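
To make the analogy concrete, a row in such a catalog might carry something
like this (purely illustrative C; no such catalog exists, and all names here
are hypothetical):

/*
 * Purely illustrative: a per-type entry mapping a protobuf datatype to
 * its schema, by analogy with how pg_enum maps an enum type to its
 * members.  Not an actual PostgreSQL catalog.
 */
typedef struct ProtobufSchemaEntry
{
    Oid         pbstypid;       /* OID of the protobuf type instance */
    char       *pbsdescriptor;  /* serialized .proto descriptor bytes */
} ProtobufSchemaEntry;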

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


[HACKERS] Protocol buffer support for Postgres

2016-04-25 Thread 陈天舟
I am interested in adding Protocol Buffer support for Postgres. Protocol
Buffer occupies less space than JSON. More importantly, it has a schema
and is forward/backward compatible. All this makes it a very good format
for persistence.

Here are two rough ideas I have right now:

Approach 1:

Create a datatype "PROTOBUF" similar to "JSON" and implement all the
similar support (e.g. indexing) as is done for the "JSON" type.
(1) Since each protocol buffer column requires a schema, I am not sure
where the best place is to store that schema info. Should it be in a
CONSTRAINT (though I am not able to find docs describing any custom
constraint), or in the COMMENT, or somewhere else?
(2) The input/output will be an already-serialized protocol buffer. The
protocol buffer schema will be used to validate the column value when
storing/retrieving the data. It may be possible to skip the check upon
retrieval; instead, the check can be done any time the column schema is
changed. (A rough sketch of such a recv function follows below.)
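
As a rough illustration of the recv side of this approach (a sketch under
assumptions: protobuf_validate() is a hypothetical helper that checks the
wire bytes against the stored schema; the on-disk form is a plain varlena):

#include "postgres.h"
#include "fmgr.h"
#include "libpq/pqformat.h"

/* hypothetical helper: checks wire bytes against the stored schema */
extern bool protobuf_validate(bytea *value);

PG_FUNCTION_INFO_V1(protobuf_recv);

Datum
protobuf_recv(PG_FUNCTION_ARGS)
{
    StringInfo  buf = (StringInfo) PG_GETARG_POINTER(0);
    int         nbytes = buf->len - buf->cursor;
    bytea      *result;

    /* copy the serialized protobuf wire bytes into a varlena */
    result = (bytea *) palloc(nbytes + VARHDRSZ);
    SET_VARSIZE(result, nbytes + VARHDRSZ);
    pq_copymsgbytes(buf, VARDATA(result), nbytes);

    /* validate against the column's schema, wherever it is stored */
    if (!protobuf_validate(result))
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
                 errmsg("value does not match protobuf schema")));

    PG_RETURN_BYTEA_P(result);
}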

Approach 2:

Still create a datatype "PROTOBUF", but internally use the "JSON" type.
So PROTOBUF is just a wrapper type doing the conversion and schema
validation on the fly. The benefit of this is that it can leverage the
existing JSON support. The downside is that it can only support the
features common to both Protocol Buffer and JSON; e.g., Protocol Buffer
annotations are a quite useful feature that is not available in JSON.

I would like to hear some thoughts about having Protocol Buffer support
in Postgres, as well as about the approach for tackling it.

Thanks
Tianzhou


Re: [HACKERS] Bogus cleanup code in PostgresNode.pm

2016-04-25 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Apr 25, 2016 at 11:51 PM, Tom Lane  wrote:
>> I believe we can fix this by forcing postmaster shutdown in an END
>> routine instead of a DESTROY routine, and hence propose the attached
>> patch, which does things in the right order for me.  I'm a pretty
>> poor Perl programmer, so I'd appreciate somebody vetting this.

> Another, perhaps more solid approach, would be to put the DESTROY method
> in charge of removing PGDATA and extend TestLib::tempdir with an
> argument to be able to switch to CLEANUP => 0 at will. Then we use
> this argument for PGDATA after sending SIGQUIT.

Bearing in mind that I'm not a Perl expert --- this bothers me because
of what I read about the order of global destructor calls being
unspecified.  See http://perldoc.perl.org/perlobj.html#Destructors
specifically:

  When the last reference to an object goes away, the object is
  destroyed. If you only have one reference to an object stored in a
  lexical scalar, the object is destroyed when that scalar goes out of
  scope. If you store the object in a package global, that object may not
  go out of scope until the program exits.

(the last sentence being the one that applies here) and

  The order in which objects are destroyed during the global destruction
  before the program exits is unpredictable.

I do not think it's a good idea to have destructors with
externally-visible effects happening in an undefined order.  The present
bug is exactly because those things are happening in the "wrong" order.
Doubling down on using the DESTROY method won't make that better.

Now, whether using END is really an improvement is a separate question.
I have the impression that END calls happen in a better-defined order,
but I'm not a perl monk.

regards, tom lane




[HACKERS] Parallel SAFE information missing in CREATE OR REPLACE FUNCTION definition

2016-04-25 Thread Ashutosh Sharma
Hi,

In PostgreSQL 9.6, if we create a function with a PARALLEL clause and try
displaying its definition using pg_get_functiondef(), we see that the
PARALLEL keyword used during function creation is missing.

Below are the steps to reproduce:

postgres=# CREATE FUNCTION add(integer, integer) RETURNS integer
postgres-# AS 'select $1 + $2;'
postgres-# LANGUAGE SQL
postgres-# PARALLEL SAFE
postgres-# RETURNS NULL ON NULL INPUT;
CREATE FUNCTION

postgres=# CREATE OR REPLACE FUNCTION increment(i integer) RETURNS integer
AS $$
postgres$# BEGIN
postgres$# RETURN i + 1;
postgres$# END;
postgres$# $$ LANGUAGE plpgsql PARALLEL SAFE;
CREATE FUNCTION

postgres=# CREATE FUNCTION square_root(double precision) RETURNS double
precision
AS 'dsqrt'
LANGUAGE internal
PARALLEL SAFE;
CREATE FUNCTION

postgres=# \df
                               List of functions
 Schema |    Name     | Result data type | Argument data types |  Type
--------+-------------+------------------+---------------------+--------
 public | add         | integer          | integer, integer    | normal
 public | increment   | integer          | i integer           | normal
 public | square_root | double precision | double precision    | normal
(3 rows)

postgres=# SELECT  pg_get_functiondef('add'::regproc);
                    pg_get_functiondef
----------------------------------------------------------
 CREATE OR REPLACE FUNCTION public.add(integer, integer)+
  RETURNS integer   +
  LANGUAGE sql  +
  STRICT+
 AS $function$select $1 + $2;$function$ +

(1 row)

postgres=# SELECT  pg_get_functiondef('increment'::regproc);
                    pg_get_functiondef
---------------------------------------------------------
 CREATE OR REPLACE FUNCTION public.increment(i integer)+
  RETURNS integer  +
  LANGUAGE plpgsql +
 AS $function$ +
 BEGIN +
 RETURN i + 1; +
 END;  +
 $function$+

(1 row)

postgres=# SELECT  pg_get_functiondef('square_root'::regproc);
                    pg_get_functiondef
------------------------------------------------------------------
 CREATE OR REPLACE FUNCTION public.square_root(double precision)+
  RETURNS double precision  +
  LANGUAGE internal +
 AS $function$dsqrt$function$   +

(1 row)


RCA: The proparallel information for a function is not considered while
preparing its definition inside pg_get_functiondef().

Solution: Add a check for the proparallel flag inside pg_get_functiondef()
and, based on the value of the proparallel flag, store the parallel
{safe | unsafe | restricted} info in the buffer that holds the function
definition. PFA patch to fix the issue.

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index e38895d..10499c9 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -2044,6 +2044,19 @@ pg_get_functiondef(PG_FUNCTION_ARGS)
 		case PROVOLATILE_VOLATILE:
 			break;
 	}
+
+	switch (proc->proparallel)
+	{
+		case PROPARALLEL_SAFE:
+			appendStringInfoString(&buf, " PARALLEL SAFE");
+			break;
+		case PROPARALLEL_RESTRICTED:
+			appendStringInfoString(&buf, " PARALLEL RESTRICTED");
+			break;
+		case PROPARALLEL_UNSAFE:
+			break;
+	}
+
 	if (proc->proisstrict)
 		appendStringInfoString(&buf, " STRICT");
 	if (proc->prosecdef)



Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-04-25 Thread Amit Kapila
On Tue, Apr 26, 2016 at 9:15 AM, Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote:

> Hello, attached is the new version v8.
>
> At Tue, 26 Apr 2016 11:02:25 +0900 (Tokyo Standard Time), Kyotaro
> HORIGUCHI  wrote in
> <20160426.110225.35506931.horiguchi.kyot...@lab.ntt.co.jp>
> > At Sat, 23 Apr 2016 10:12:03 -0400, Tom Lane  wrote in
> > <476.1461420...@sss.pgh.pa.us>
> > > Amit Kapila  writes:
> > > > The main point for this improvement is that the handling for the GUC
> > > > s_s_names is not similar to what we do for other somewhat similar
> > > > GUCs, which causes inefficiency in a non-hot code path (less-used code).
> > >
> > > This is not about efficiency, this is about correctness.  The proposed
> > > v7 patch is flat out not acceptable, not now and not for 9.7 either,
> > > because it introduces a GUC assign hook that can easily fail (eg, through
> > > out-of-memory for the copy step).  Assign hook functions need to be
> > > incapable of failure.


It seems to me that a similar problem exists for
assign_pgstat_temp_directory(), as it can also lead to an "out of memory"
error.  However, in general I understand your concern, and I think we
should avoid any such failure in assign functions.


> > > I do not see any good reason why this one cannot
> > > satisfy that requirement, either.  It just needs to make use of the
> > > "extra" mechanism to pass back an already-suitably-long-lived result from
> > > check_synchronous_standby_names.  See check_timezone_abbreviations/
> > > assign_timezone_abbreviations for a model to follow.
> >
> > I had already looked at that before v7 and had the same idea in mind,
> > but packing into a blob requires something other than a List to hold
> > the name list (it should just be an array), which in turn requires
> > many changes wherever the list is accessed. But the result is hopeless
> > as you mentioned :(
> >
> > > You are going to
> > > need to find a way to package the parse result into a single malloc'd
> > > blob, though, because that's as much as guc.c can keep track of for an
> > > "extra" value.
> >
> > Ok, I'll post the v8 with the blob solution soon.
>
> Hmm. It was way easier than I thought. The attached v8 patch does the
> following:
>
> - Changed SyncRepConfigData from a struct using a linked list to a
>   blob. Since the former struct is useful in parsing, it is still
>   used and converted into the latter form in check_s_s_names.
>
> - Make assign_s_s_names do nothing other than just assign
>   SyncRepConfig.
>
> - Change SyncRepGetSyncStandbys to read the latter form of
>   configuration.
>
> - SyncRepFreeConfig is removed since it is no longer needed.
>
>
+ /* Convert SyncRepConfig into the packed struct fit to guc extra */
+ pconf = (SyncRepConfigData *)
+     malloc(SizeOfSyncRepConfig(
+                list_length(syncrep_parse_result->members)));

I think there should be a check for malloc failure in the above code.
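
A minimal sketch of such a check, using GUC_check_errdetail (the standard
way for a GUC check hook to report a failure back to guc.c):

pconf = (SyncRepConfigData *)
    malloc(SizeOfSyncRepConfig(list_length(syncrep_parse_result->members)));
if (pconf == NULL)
{
    GUC_check_errdetail("out of memory");
    return false;
}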


+ /* No further need for syncrep_parse_result */
+ syncrep_parse_result = NULL;

Isn't this a memory leak?  Shouldn't we free the corresponding memory as
well?
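
For example (a sketch, assuming the intermediate List-based parse result
lives in the current memory context):

/* done with the intermediate parse result; release it */
list_free_deep(syncrep_parse_result->members);
pfree(syncrep_parse_result);
syncrep_parse_result = NULL;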


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Verifying embedded oids in *recv is a bad idea

2016-04-25 Thread Kyotaro HORIGUCHI
Hello,

At Mon, 25 Apr 2016 17:17:13 -0700, Andres Freund  wrote in 
<20160426001713.hbqdiwvf4mkzk...@alap3.anarazel.de>
> Hi,
> 
> for performance reasons it's a good idea to use the binary protocol. But
> doing so between two postgres installations is made unnecessarily hard
> by the choice of embedding and verifying oids in composite and array
> types.  When using extensions, even commonly used ones like hstore, the
> datatype oids will often not be the same between systems, even when
> using exactly the same version on the same OS.
> 
> Specifically I'm talking about
> 
> Datum
> array_recv(PG_FUNCTION_ARGS)
> {
> ...
>   element_type = pq_getmsgint(buf, sizeof(Oid));
>   if (element_type != spec_element_type)
>   {
>   /* XXX Can we allow taking the input element type in any cases? */
>   ereport(ERROR,
>   (errcode(ERRCODE_DATATYPE_MISMATCH),
>errmsg("wrong element type")));
>   }
> ...
> }
> 
> and
> 
> Datum
> record_recv(PG_FUNCTION_ARGS)
> {
> ...
>   /* Process each column */
>   for (i = 0; i < ncolumns; i++)
> ...
>   /* Verify column datatype */
>   coltypoid = pq_getmsgint(buf, sizeof(Oid));
>   if (coltypoid != column_type)
>   ereport(ERROR,
>   (errcode(ERRCODE_DATATYPE_MISMATCH),
>errmsg("wrong data type: %u, expected %u",
>   coltypoid, column_type)));
> ...
> }
> 
> 
> given that we're giving up quite some speed and adding complexity to
> make send/recv functions portable, this seems like a shame.
> 
> I'm failing to see what these checks are buying us? I mean the text
> representation of a composite doesn't include type information about
> contained columns either, I don't see why the binary version has to?
> 
> I'm basically thinking that we should remove the checks on the receiving
> side, but leave the sending of oids in place for backward compat.

FWIW, I think that typsend/typreceive are intended for use by
COPY FROM/TO, and the doc says the following.

http://www.postgresql.org/docs/devel/static/sql-copy.html

> Binary Format
> 
> The binary format option causes all data to be stored/read as
> binary format rather than as text. It is somewhat faster than the
> text and CSV formats, but a binary-format file is less portable
> across machine architectures and PostgreSQL versions. Also, the
> binary format is very data type specific; for example it will not
> work to output binary data from a smallint column and read it
> into an integer column, even though that would work fine in text
> format.

I haven't thought much about the usefulness of this restriction, but
receiving into an array of a different type easily leads to a crash.

# However, a false match of type oids would also do so.

I suppose that we should reconsider this point if we remove the checks.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] Bogus cleanup code in PostgresNode.pm

2016-04-25 Thread Michael Paquier
On Mon, Apr 25, 2016 at 11:51 PM, Tom Lane  wrote:
> I noticed that even when they are successful, buildfarm members bowerbird
> and jacana tend to spew a lot of messages like this in their bin-check
> steps:
>
> Can't remove directory 
> /home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf/pgdata/global:
>  Directory not empty at /usr/lib/perl5/5.8/File/Temp.pm line 898
> Can't remove directory 
> /home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf/pgdata/pg_xlog:
>  Directory not empty at /usr/lib/perl5/5.8/File/Temp.pm line 898
> Can't remove directory 
> /home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf/pgdata:
>  Permission denied at /usr/lib/perl5/5.8/File/Temp.pm line 898
> Can't remove directory 
> /home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf:
>  Directory not empty at /usr/lib/perl5/5.8/File/Temp.pm line 898
> ### Signalling QUIT to 9156 for node "main"
> # Running: pg_ctl kill QUIT 9156
>
> What is happening here is that the test script is not bothering to do an
> explicit $node->stop operation, and if it doesn't, the automatic cleanup
> steps happen in the wrong order: the File::Temp destructor for the temp
> data directory runs before PostgresNode.pm's DESTROY function, which is
> what's issuing the "pg_ctl kill" command.  On Unix that's just messy,
> but on Windows it fails because you can't delete a process's working
> directory.  I am not sure whether this is guaranteed wrong or just
> sometimes wrong; the Perl docs I can find say that destructors are run in
> unspecified order once interpreter shutdown begins.  But by adding some
> debug printout I was able to verify on my own machine that the data
> directory was already gone when DESTROY runs.

The docs say regarding File::Temp that the object is removed once the
object goes out of scope in the parent:
http://search.cpan.org/~dagolden/File-Temp-0.2304/lib/File/Temp.pm
So basically this means that by the time we enter PostgresNode's DESTROY,
the temporary folder has already gone out of scope and been removed?

DESTROY is run once per object, END is a global destructor, and END is
called at the very end of execution. And actually, one reason for using
a DESTROY block instead of END is given by Alvaro here:
http://www.postgresql.org/message-id/20151201231121.GI2763@alvherre.pgsql
"
- I changed start/stop/restart so that they keep track of the postmaster
  PID; also added a DESTROY sub to PostgresNode that sends SIGQUIT.
  This means that when the test finishes, the server gets an immediate
  stop signal.  We were getting a lot of errors in the server log about
  failing to write to the stats file otherwise, until the node noticed
  that the datadir was gone.
"

> I believe we can fix this by forcing postmaster shutdown in an END
> routine instead of a DESTROY routine, and hence propose the attached
> patch, which does things in the right order for me.  I'm a pretty
> poor Perl programmer, so I'd appreciate somebody vetting this.

Another, perhaps more solid approach, would be to put the DESTROY method
in charge of removing PGDATA and extend TestLib::tempdir with an
argument to be able to switch to CLEANUP => 0 at will. Then we use
this argument for PGDATA after sending SIGQUIT.
-- 
Michael




Re: [HACKERS] postgres_fdw : Not able to update foreign table referring to a local table's view when use_remote_estimate = true

2016-04-25 Thread Robert Haas
On Mon, Apr 25, 2016 at 2:54 AM, Ashutosh Bapat
 wrote:
> On Fri, Apr 22, 2016 at 6:22 PM, Robert Haas  wrote:
>> On Fri, Apr 22, 2016 at 8:44 AM, Rajkumar Raghuwanshi
>>  wrote:
>> > I observed below in postgres_fdw.
>> >
>> > Observation: Update a foreign table which is referring to a local
>> > table's
>> > view (with use_remote_estimate = true) getting failed with below error.
>> > ERROR:  column "ctid" does not exist
> > > CONTEXT:  Remote SQL command: EXPLAIN SELECT c1, ctid
> > > FROM public.lt_view FOR UPDATE
>> >
>> > create extension postgres_fdw;
>> > create server link_server foreign data wrapper postgres_fdw options
>> > (host
>> > 'localhost',dbname 'postgres', port '5447');
>> > create user mapping for public server link_server;
>> >
>> > create table lt (c1 integer, c2 integer);
>> > insert into lt values (1,null);
>> > create view lt_view as select * from lt;
>> > create foreign table ft (c1 integer,c2 integer) server link_server
>> > options
>> > (table_name 'lt_view');
>> >
>> > --alter server with use_remote_estimate 'false'
>> > alter server link_server options (add use_remote_estimate 'false');
> > > --update foreign table referring to local view -- able to update
>> > update ft set c2 = c1;
>> > UPDATE 1
>> >
>> > --alter server with use_remote_estimate 'true'
>> > alter server link_server options (SET use_remote_estimate 'true');
> > > --update foreign table referring to local view -- fail, throwing error
>> > update ft set c2 = c1;
>> >
>> > psql:/home/edb/Desktop/edb_work/Postgres_Fdw/dml_pushdown_35882/observation_view.sql:24:
>> > ERROR:  column "ctid" does not exist
>> > CONTEXT:  Remote SQL command: EXPLAIN SELECT c1, ctid FROM
>> > public.lt_view
>> > FOR UPDATE
>>
>> Hmm, interesting.  Offhand, I don't really see how to make that case
>> work: postgres_fdw's UPDATE support supposes that the remote relation
>> has CTIDs.  If it doesn't, we're out of luck.  The "direct update"
>> mode might work if we can get that far, but here we're bombing out
>> during the planning phase, so we never have a chance to try it.
>>
>> I wouldn't say this is a bug, exactly; more like an unsupported case.
>> It would be nice to make it work, though, if someone can figure out
>> how.
>
>
> Thinking loudly:
>
> This error is hard to interpret for a user who doesn't know about ctid.
> Until we find a solution, we can at least fail gracefully with an error
> something like "DMLs are not supported on foreign tables referring to
> views/non-tables on the foreign server". While creating the foreign table,
> a user could specify whether the object being referred to is updatable
> (writable?) or not; IMPORT FOREIGN SCHEMA could set that status by looking
> at the object's pg_class entry. The effort required may not be worth it
> given that this case is highly unlikely. Maybe we should just update the
> documentation, saying that a user may encounter such an error if s/he
> attempts to update/delete such a foreign table.

I would be disinclined to add code specifically to catch this, since
the only report we have so far is from someone whose job is to find
corner cases that break.  Which he did, and good for him, but like you
say, that doesn't mean it's a big real-world problem.  Adding a
sentence to the documentation stating that pointing a foreign table at
a view is fine for SELECT, but that UPDATE and DELETE may not work,
seems like enough.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Suspicious behaviour on applying XLOG_HEAP2_VISIBLE.

2016-04-25 Thread Andres Freund
On 2016-04-26 12:39:37 +0900, Michael Paquier wrote:
> Thinking about the logging of smgr invalidations, this is quite
> interesting. But what would we actually gain in doing that? Do you
> foresee any advantages in doing so? Looking at the code, the only case
> where this would be useful now is vm_extend.

Well, it'd make vm_extend actually correct, which replacing the
invalidation with a relcache one would not. Relcache invalidations are
transactional, whereas smgr ones are not (intentionally so!). I don't
think it's currently a big problem, but it does make me rather wary.
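
For context, the two alternatives being weighed for vm_extend() look like
this (the first call mirrors the existing one in visibilitymap.c; the second
is the replacement under discussion, not a committed change):

/* current, nontransactional invalidation in vm_extend() */
CacheInvalidateSmgr(rel->rd_smgr->smgr_rnode);

/*
 * discussed replacement: transactional, processed at end-of-xact, and
 * implying an smgr invalidation via RelationClearRelation()
 */
CacheInvalidateRelcache(rel);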

> >> As the invalidation patch is aimed at being backpatched, this may be
> >> something to do as well in back-branches.
> >
> > I'm a bit split here. I think forcing processing of invalidations at
> > moments they've previously never been processed is a bit risky for the
> > back branches. But on the other hand relcache invalidations are only
> > processed at end-of-xact, which isn't really correct for the code at
> > hand :/
> 
> Oh, OK. So you mean that this patch, with its new record type, is not
> aimed at the back-branches but only at HEAD.

No, I think we've got to do this in all branches. I was just wondering
about how to fix vm_extend(). Which I do think we've got to fix, even in
the back-branches.

- Andres




Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-04-25 Thread Kyotaro HORIGUCHI
Hello, attached is the new version v8.

At Tue, 26 Apr 2016 11:02:25 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20160426.110225.35506931.horiguchi.kyot...@lab.ntt.co.jp>
> At Sat, 23 Apr 2016 10:12:03 -0400, Tom Lane  wrote in 
> <476.1461420...@sss.pgh.pa.us>
> > Amit Kapila  writes:
> > > The main point for this improvement is that the handling for the GUC
> > > s_s_names is not similar to what we do for other somewhat similar GUCs,
> > > which causes inefficiency in a non-hot code path (less-used code).
> > 
> > This is not about efficiency, this is about correctness.  The proposed
> > v7 patch is flat out not acceptable, not now and not for 9.7 either,
> > because it introduces a GUC assign hook that can easily fail (eg, through
> > out-of-memory for the copy step).  Assign hook functions need to be
> > incapable of failure.  I do not see any good reason why this one cannot
> > satisfy that requirement, either.  It just needs to make use of the
> > "extra" mechanism to pass back an already-suitably-long-lived result from
> > check_synchronous_standby_names.  See check_timezone_abbreviations/
> > assign_timezone_abbreviations for a model to follow. 
> 
> I had already looked at that before v7 and had the same idea in mind,
> but packing into a blob requires something other than a List to hold
> the name list (it should just be an array), which in turn requires
> many changes wherever the list is accessed. But the result is hopeless
> as you mentioned :(
> 
> > You are going to
> > need to find a way to package the parse result into a single malloc'd
> > blob, though, because that's as much as guc.c can keep track of for an
> > "extra" value.
> 
> Ok, I'll post the v8 with the blob solution soon.

Hmm. It was way easier than I thought. The attached v8 patch does the
following:

- Changed SyncRepConfigData from a struct using a linked list to a
  blob. Since the former struct is useful in parsing, it is still
  used and converted into the latter form in check_s_s_names.

- Make assign_s_s_names do nothing other than just assign
  SyncRepConfig.

- Change SyncRepGetSyncStandbys to read the latter form of
  configuration.

- SyncRepFreeConfig is removed since it is no longer needed.

It passes both make check and recovery/make check.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index 3c9142e..376fe51 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -361,11 +361,6 @@ SyncRepInitConfig(void)
 {
 	int			priority;
 
-	/* Update the config data of synchronous replication */
-	SyncRepFreeConfig(SyncRepConfig);
-	SyncRepConfig = NULL;
-	SyncRepUpdateConfig();
-
 	/*
 	 * Determine if we are a potential sync standby and remember the result
 	 * for handling replies from standby.
@@ -575,7 +570,7 @@ SyncRepGetSyncStandbys(bool *am_sync)
 	if (am_sync != NULL)
 		*am_sync = false;
 
-	lowest_priority = list_length(SyncRepConfig->members);
+	lowest_priority = SyncRepConfig->nmembers;
 	next_highest_priority = lowest_priority + 1;
 
 	/*
@@ -730,9 +725,7 @@ SyncRepGetSyncStandbys(bool *am_sync)
 static int
 SyncRepGetStandbyPriority(void)
 {
-	List	   *members;
-	ListCell   *l;
-	int			priority = 0;
+	int			priority;
 	bool		found = false;
 
 	/*
@@ -745,12 +738,9 @@ SyncRepGetStandbyPriority(void)
 	if (!SyncStandbysDefined())
 		return 0;
 
-	members = SyncRepConfig->members;
-	foreach(l, members)
+	for (priority = 1 ; priority <= SyncRepConfig->nmembers ; priority++)
 	{
-		char	   *standby_name = (char *) lfirst(l);
-
-		priority++;
+		char  *standby_name = SyncRepConfig->members[priority - 1];
 
 		if (pg_strcasecmp(standby_name, application_name) == 0 ||
 			pg_strcasecmp(standby_name, "*") == 0)
@@ -867,50 +857,6 @@ SyncRepUpdateSyncStandbysDefined(void)
 	}
 }
 
-/*
- * Parse synchronous_standby_names and update the config data
- * of synchronous standbys.
- */
-void
-SyncRepUpdateConfig(void)
-{
-	int	parse_rc;
-
-	if (!SyncStandbysDefined())
-		return;
-
-	/*
-	 * check_synchronous_standby_names() verifies the setting value of
-	 * synchronous_standby_names before this function is called. So
-	 * syncrep_yyparse() must not cause an error here.
-	 */
-	syncrep_scanner_init(SyncRepStandbyNames);
-	parse_rc = syncrep_yyparse();
-	syncrep_scanner_finish();
-
-	if (parse_rc != 0)
-		ereport(ERROR,
-(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg_internal("synchronous_standby_names parser returned %d",
- parse_rc)));
-
-	SyncRepConfig = syncrep_parse_result;
-	syncrep_parse_result = NULL;
-}
-
-/*
- * Free a previously-allocated config data of synchronous replication.
- */
-void
-SyncRepFreeConfig(SyncRepConfigData *config)
-{
-	if (!config)
-		return;
-
-	list_free_deep(config->members);
-	pfree(config);
-}
-
 #ifdef USE_ASSERT_CHECKING
 static bool
 SyncRepQueueIsOrderedByLSN(int mode)
@@ -956,9 +902,16 @@ bool
 check_synchronous_standby_nam

Re: [HACKERS] Suspicious behaviour on applying XLOG_HEAP2_VISIBLE.

2016-04-25 Thread Michael Paquier
On Tue, Apr 26, 2016 at 12:39 PM, Michael Paquier
 wrote:
> On Tue, Apr 26, 2016 at 11:47 AM, Andres Freund  wrote:
>> Thanks for looking into this.
>>
>> On 2016-04-26 11:43:06 +0900, Michael Paquier wrote:
>>> On Sun, Apr 24, 2016 at 11:51 AM, Andres Freund  wrote:
>>> > ISTM we should additionally replace the CacheInvalidateSmgr() with a
>>> > CacheInvalidateRelcache() and document that that implies an smgr
>>> > invalidation.  Alternatively we could log smgr (and relmapper)
>>> > invalidations as well, but that's not quite non-invasive either; but
>>> > might be a good long-term idea to keep things simpler.
>>> >
>>> > Comments?
>>>
>>> Yeah, this looks like a good idea at the end.
>>
>> You mean the bit about making smgr invalidations logged?
>
> Sorry for the lack of precision in my words. I was referring to your
> idea of replacing CacheInvalidateSmgr() with CacheInvalidateRelcache(),
> which sounds like a good option, as this would make the invalidation be
> logged at commit. Thinking about the logging of smgr invalidations,
> this is quite interesting. But what would we actually gain in doing
> that? Do you foresee any advantages in doing so? Looking at the code,
> the only case where this would be useful now is vm_extend.
>
>>> As the invalidation patch is aimed at being backpatched, this may be
>>> something to do as well in back-branches.
>>
>> I'm a bit split here. I think forcing processing of invalidations at
>> moments they've previously never been processed is a bit risky for the
>> back branches. But on the other hand relcache invalidations are only
>> processed at end-of-xact, which isn't really correct for the code at
>> hand :/
>
> Oh, OK. So you mean that this patch, with its new record type, is not
> aimed at the back-branches but only at HEAD. To be honest, I thought
> that we had better address this issue in the back-branches with a
> patch close to what you are proposing (which is more or less what
> Simon has actually sent), and keep the change of CacheInvalidateSmgr()
> in visibilitymap.c HEAD-only.

Ah, and actually, while I am at it, there is this bit from your previous patch:
+   else if (msg->id == SHAREDINVALRELMAP_ID)
+   appendStringInfoString(buf, " relmap");
+   else if (msg->id == SHAREDINVALRELMAP_ID)
+   appendStringInfo(buf, " relmap db %u", msg->rm.dbId);
You surely want to check for !OidIsValid(msg->rm.dbId) in the first one.
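
That is, the first branch presumably wants to be the shared, all-databases
case, something like (a sketch):

else if (msg->id == SHAREDINVALRELMAP_ID && !OidIsValid(msg->rm.dbId))
    appendStringInfoString(buf, " relmap shared");
else if (msg->id == SHAREDINVALRELMAP_ID)
    appendStringInfo(buf, " relmap db %u", msg->rm.dbId);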
-- 
Michael




Re: [HACKERS] Suspicious behaviour on applying XLOG_HEAP2_VISIBLE.

2016-04-25 Thread Michael Paquier
On Tue, Apr 26, 2016 at 11:47 AM, Andres Freund  wrote:
> Thanks for looking into this.
>
> On 2016-04-26 11:43:06 +0900, Michael Paquier wrote:
>> On Sun, Apr 24, 2016 at 11:51 AM, Andres Freund  wrote:
>> > ISTM we should additionally replace the CacheInvalidateSmgr() with a
>> > CacheInvalidateRelcache() and document that that implies an smgr
>> > invalidation.  Alternatively we could log smgr (and relmapper)
>> > invalidations as well, but that's not quite non-invasive either; but
>> > might be a good long-term idea to keep things simpler.
>> >
>> > Comments?
>>
>> Yeah, this looks like a good idea at the end.
>
> You mean the bit about making smgr invalidations logged?

Sorry for the lack of precision in my words. I was referring to your
idea of replacing CacheInvalidateSmgr() with CacheInvalidateRelcache(),
which sounds like a good option, as this would make the invalidation be
logged at commit. Thinking about the logging of smgr invalidations,
this is quite interesting. But what would we actually gain in doing
that? Do you foresee any advantages in doing so? Looking at the code,
the only case where this would be useful now is vm_extend.

>> As the invalidation patch is aimed at being backpatched, this may be
>> something to do as well in back-branches.
>
> I'm a bit split here. I think forcing processing of invalidations at
> moments they've previously never been processed is a bit risky for the
> back branches. But on the other hand relcache invalidations are only
> processed at end-of-xact, which isn't really correct for the code at
> hand :/

Oh, OK. So you mean that this patch, with its new record type, is not
aimed at the back-branches but only at HEAD. To be honest, I thought
that we had better address this issue in the back-branches with a
patch close to what you are proposing (which is more or less what
Simon has actually sent), and keep the change of CacheInvalidateSmgr()
in visibilitymap.c HEAD-only.
-- 
Michael




Re: [HACKERS] [COMMITTERS] pgsql: Add trigonometric functions that work in degrees.

2016-04-25 Thread Tom Lane
I wrote:
> Evidently, asin() is returning an 80-bit result, and that's not
> getting rounded to double width before we divide by asin_0_5.

I found that I could reproduce this type of assembly-code behavior
on dromedary (gcc version 4.2.1) by using -mfpmath=387.  That box
doesn't show the visible regression-test failure, but that must be
down to its version of asin() not producing the same low-order bits
that yours does.  It's clear from the assembly output that the code
*would* misbehave given an appropriate asin() library function.

With that version of gcc, just casting the function output to double
changes nothing.  The local-variable solution likewise.  I do get
a useful fix if I declare the asin_x local variable volatile:

.loc 1 1873 0
fstpl   (%esp)
call_asin
+   fstpl   -16(%ebp)
+ LVL107:
+   fldl-16(%ebp)
fdivl   _asin_0_5-"L019$pb"(%ebx)
fmuls   LC21-"L019$pb"(%ebx)

Interestingly, declaring asin_x as global is not enough to fix it,
because it stores into the global but doesn't fetch back:

.loc 1 1873 0
fstpl   (%esp)
call_asin
+   movlL_asin_x$non_lazy_ptr-"L019$pb"(%ebx), %eax
+   fstl(%eax)
fdivl   _asin_0_5-"L019$pb"(%ebx)
fmuls   LC21-"L019$pb"(%ebx)

Declaring asin_x as a volatile global does fix it:

.loc 1 1873 0
fstpl   (%esp)
call_asin
+   movlL_asin_x$non_lazy_ptr-"L019$pb"(%ebx), %eax
+   fstpl   (%eax)
+   fldl(%eax)
fdivl   _asin_0_5-"L019$pb"(%ebx)
fmuls   LC21-"L019$pb"(%ebx)

but that seems like just being uglier without much redeeming value.

In short, these tests suggest that we need a coding pattern like
this:

volatile float8 asin_x = asin(x);
return (asin_x / asin_0_5) * 30.0;

We could drop the "volatile" by adopting -ffloat-store, but that
doesn't seem like a better answer, because -ffloat-store would
pessimize a lot of code elsewhere.  (It causes a whole lot of
changes in float.c, for sure.)

regards, tom lane




Re: [HACKERS] Suspicious behaviour on applying XLOG_HEAP2_VISIBLE.

2016-04-25 Thread Andres Freund
Hi,

Thanks for looking into this.

On 2016-04-26 11:43:06 +0900, Michael Paquier wrote:
> On Sun, Apr 24, 2016 at 11:51 AM, Andres Freund  wrote:
> > ISTM we should additionally replace the CacheInvalidateSmgr() with a
> > CacheInvalidateRelcache() and document that that implies an smgr
> > invalidation.  Alternatively we could log smgr (and relmapper)
> > invalidations as well, but that's not quite non-invasive either; but
> > might be a good long-term idea to keep things simpler.
> >
> > Comments?
> 
> Yeah, this looks like a good idea at the end.

You mean the bit about making smgr invalidations logged?

> As the invalidation patch is aimed at being backpatched, this may be
> something to do as well in back-branches.

I'm a bit split here. I think forcing processing of invalidations at
moments they've previously never been processed is a bit risky for the
back branches. But on the other hand relcache invalidations are only
processed at end-of-xact, which isn't really correct for the code at
hand :/

Greetings,

Andres Freund




Re: [HACKERS] Suspicious behaviour on applying XLOG_HEAP2_VISIBLE.

2016-04-25 Thread Michael Paquier
On Sun, Apr 24, 2016 at 11:51 AM, Andres Freund  wrote:
> Here's a patch doing so.  Note that, after putting the record into RM_XACT_ID
> first, I've decided to make it a RM_STANDBY_ID type record. I think it's
> likely that this is going to be needed beyond transaction commits, and
> it generally seems to fit better into standby.c; even if it's a bit
> awkward that commit records contain similar data. To avoid duplicating
> the *desc.c code, I've exposed standby_desc_invalidations() to also be
> used by xactdesc.c.

I have been eyeballing this patch since yesterday and did some tests,
and it's really neat. Having the invalidation stuff in standby.c makes
quite a lot of sense as well.

+* wanting to emit more WAL recoreds).
s/recoreds/records/

> The reason this all ends up working nonetheless is that the
> heap_inplace_update()s in vacuum triggers a CacheInvalidateHeapTuple()
> which queues a relcache invalidation, which in turn does a
> RelationCloseSmgr() in RelationClearRelation(). Thereby effectively
> doing a transactional CacheInvalidateSmgr().  But that seems more than
> fragile.

You have spent quite some time on that. As things stand that's indeed
a bit fragile.

> ISTM we should additionally replace the CacheInvalidateSmgr() with a
> CacheInvalidateRelcache() and document that that implies an smgr
> invalidation.  Alternatively we could log smgr (and relmapper)
> invalidations as well, but that's not quite non-invasive either; but
> might be a good long-term idea to keep things simpler.
>
> Comments?

Yeah, this looks like a good idea at the end. As the invalidation
patch is aimed at being backpatched, this may be something to do as
well in back-branches.
-- 
Michael




Re: [HACKERS] Can we improve this error message?

2016-04-25 Thread Tom Lane
Andreas Karlsson  writes:
> On 04/17/2016 09:28 PM, Bill Moran wrote:
>> What I believe is happening, is that the pg connection libs
>> first try to connect via ssl and get a password failed error,
>> then fallback to trying to connect without ssl, and get a "no
>> pg_hba.conf entry" error. The problem is that the second error
>> masks the first one, hiding the real cause of the connection
>> failure, and causing a lot of confusion.

> I got both of the messages when I tried this with psql. What did you do
> to get only the second message?

Maybe Bill tried it with a rather old libpq?  This rings a bell
as being something we fixed awhile back.

regards, tom lane




[HACKERS] Removing faulty hyperLogLog merge function

2016-04-25 Thread Peter Geoghegan
The function mergeHyperLogLog() is faulty [1]. It has no current
callers, though. I propose that we rip it out, as in the attached
patch.

[1] 
http://www.postgresql.org/message-id/CAM3SWZT-i6R9JU5YXa8MJUou2_r3LfGJZpQ9tYa1BYxfkj0=c...@mail.gmail.com
-- 
Peter Geoghegan
From 3b1e6db08412bdadd3ad0d5a62a87c71a83e4e04 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Mon, 25 Apr 2016 19:24:43 -0700
Subject: [PATCH] Remove faulty hyperLogLog merge function

The approach taken is not valid with the "sparse" HLL states that
Postgres supports.  Fortunately, there were no callers of the faulty
function, so the impact of this bug was quite limited.

Author: Peter Geoghegan
Discussion: CAM3SWZT-i6R9JU5YXa8MJUou2_r3LfGJZpQ9tYa1BYxfkj0=c...@mail.gmail.com
Backpatch: 9.5, where the hyperLogLog infrastructure was introduced
---
 src/backend/lib/hyperloglog.c | 22 --
 src/include/lib/hyperloglog.h |  1 -
 2 files changed, 23 deletions(-)

diff --git a/src/backend/lib/hyperloglog.c b/src/backend/lib/hyperloglog.c
index fa7f05a..6d246ce 100644
--- a/src/backend/lib/hyperloglog.c
+++ b/src/backend/lib/hyperloglog.c
@@ -221,28 +221,6 @@ estimateHyperLogLog(hyperLogLogState *cState)
 }
 
 /*
- * Merges the estimate from one HyperLogLog state to another, returning the
- * estimate of their union.
- *
- * The number of registers in each must match.
- */
-void
-mergeHyperLogLog(hyperLogLogState *cState, const hyperLogLogState *oState)
-{
-	int			r;
-
-	if (cState->nRegisters != oState->nRegisters)
-		elog(ERROR, "number of registers mismatch: %zu != %zu",
-			 cState->nRegisters, oState->nRegisters);
-
-	for (r = 0; r < cState->nRegisters; ++r)
-	{
-		cState->hashesArr[r] = Max(cState->hashesArr[r], oState->hashesArr[r]);
-	}
-}
-
-
-/*
  * Worker for addHyperLogLog().
  *
  * Calculates the position of the first set bit in first b bits of x argument
diff --git a/src/include/lib/hyperloglog.h b/src/include/lib/hyperloglog.h
index b999b30..ee88f8f 100644
--- a/src/include/lib/hyperloglog.h
+++ b/src/include/lib/hyperloglog.h
@@ -63,7 +63,6 @@ extern void initHyperLogLog(hyperLogLogState *cState, uint8 bwidth);
 extern void initHyperLogLogError(hyperLogLogState *cState, double error);
 extern void addHyperLogLog(hyperLogLogState *cState, uint32 hash);
 extern double estimateHyperLogLog(hyperLogLogState *cState);
-extern void mergeHyperLogLog(hyperLogLogState *cState, const hyperLogLogState *oState);
 extern void freeHyperLogLog(hyperLogLogState *cState);
 
 #endif   /* HYPERLOGLOG_H */
-- 
2.7.4




Re: [HACKERS] how to measure pglogical replication lag

2016-04-25 Thread Craig Ringer
On 26 April 2016 at 03:08, Guo, Yun  wrote:

> Is there a function or view that could return a measurable replication lag
> of pglogical?
>

It's the same as normal streaming replication. Use pg_stat_replication
and/or pg_replication_slots to compare the LSN received by the client to
the current server xlog insert position.

Like streaming replication, there's currently no way to find out from the
client side how far behind the client is. I think we should actually change
that by sending an occasional message to the client with the server's
current xlog insert position, though, so the client knows how far behind it
is...

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Fix for OpenSSL error queue bug

2016-04-25 Thread Peter Geoghegan
On Mon, Apr 25, 2016 at 6:44 PM, Michael Paquier
 wrote:
>> I'm not sure if project policy around backpatching (that commit
>> messages and so on should match exactly) has anything to say about the
>> case where backpatching follows several weeks after commit to the
>> master branch. In the absence of any clear direction on that, I've
>> created commits that look like what Peter E might have pushed in early
>> April, had he decided to do that backpatch leg-work up front.
>
> It seems to me that we definitely want to get this stuff backpatched
> at the end. So +1 for this move.

Right. This issue has a long history of causing users significant
(though often intermittent) problems. As an example, consider this
problem report from a co-worker of mine that dates back to 2012:

https://bitbucket.org/ged/ruby-pg/issues/142/async_exec-over-ssl-connection-can-fail-on

There are numerous problem reports like this that are easily found
using Google. I think that there are probably a variety of unpleasant
interactions and symptoms. Crashes are one rarer symptom, seen in
certain scenarios only (crashes are not described in the link above).

-- 
Peter Geoghegan




Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-04-25 Thread Kyotaro HORIGUCHI
At Sat, 23 Apr 2016 10:12:03 -0400, Tom Lane  wrote in 
<476.1461420...@sss.pgh.pa.us>
> Amit Kapila  writes:
> > The main point for this improvement is that the handling for the GUC
> > s_s_names is not similar to what we do for other somewhat similar GUCs,
> > which causes inefficiency in a non-hot code path (less-used code).
> 
> This is not about efficiency, this is about correctness.  The proposed
> v7 patch is flat out not acceptable, not now and not for 9.7 either,
> because it introduces a GUC assign hook that can easily fail (eg, through
> out-of-memory for the copy step).  Assign hook functions need to be
> incapable of failure.  I do not see any good reason why this one cannot
> satisfy that requirement, either.  It just needs to make use of the
> "extra" mechanism to pass back an already-suitably-long-lived result from
> check_synchronous_standby_names.  See check_timezone_abbreviations/
> assign_timezone_abbreviations for a model to follow. 

I had already looked at that before v7 and had the same idea in mind,
but packing into a blob requires something other than a List to hold
the name list (it should just be an array), which in turn requires
many changes wherever the list is accessed. But the result is hopeless
as you mentioned :(

> You are going to
> need to find a way to package the parse result into a single malloc'd
> blob, though, because that's as much as guc.c can keep track of for an
> "extra" value.

Ok, I'll post the v8 with the blob solution soon.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] Fix of doc for synchronous_standby_names.

2016-04-25 Thread Kyotaro HORIGUCHI
Hi,


At Fri, 22 Apr 2016 17:27:07 +0900, Amit Langote 
 wrote in <5719e05b.4030...@lab.ntt.co.jp>
> 
> Horiguchi-san,
> 
> On 2016/04/22 14:21, Kyotaro HORIGUCHI wrote:
> > I came to think that both of you are misunderstanding how
> > synchronous standbys are chosen, so I'd like to clarify the
> > behavior.
> 
> I certainly had a different (and/or wrong) idea in mind about how this
> works.  Thanks a lot for clarifying.  I'm still a little confused, so
> please see below.

Sure thing.

> > Kyotaro HORIGUCHI  wrote:
>  But this particular sentence seems to be talking
>  about what's the case for any given slot.
> >>>
> >>> Right, that's my reading also.
> > 
> > In SyncRepInitConfig, every walsender sets sync_standby_priority
> > by itself. The priority value is the index of its
> > application_name in the s_s_names list (1-based).
> > 
> > When a walsender receives feedback from the walreceiver, it may
> > release backends waiting for a certain LSN to be secured.
> > 
> > First, SyncRepGetSyncStandbys collects active standbys. Then it
> > loops from the highest priority value to pick up all of the
> > active standbys for each priority value until all of the seats
> > are occupied. Then SyncRepOldestSyncRepPtr calculates the oldest
> > LSNs only among the standbys SyncRepGetSyncStandbys
> > returned. Finally, it releases backends using the LSNs.
> > 
> > In short, every 'slot' in s_s_names can correspond to two or
> > more *synchronous* standbys.
> > 
> > The resulting behavior is as the following.
> > 
> >> I don't quite understand what the 'sync slot' means. If it
> >> means a name in a replication set description, that is, 'nameN'
> >> in the following setting of s_s_names:
> >>
> >> '2(name1, name2, name3)'
> >>
> >> There may be two or more duplicates even in the
> >> single-sync-age. But only one synchronous standby was allowed, so
> >> any 'sync slot' may have at least one matching synchronous
> >> standby in the single-sync-age. This is what I see in the
> >> sentence. Is this wrong?
> >>
> >> Now, we can have multiple synchronous standbys so, for example,
> >> if there are three standbys with the name 'name1', two of them are
> >> chosen as synchronous. This is a new behavior in the multi-sync-age,
> >> and syncrep.c has been changed so as to do so.
> >>
> >> As a supplement, consider the following case.
> >>
> >> '5(name1, name2, name3)'
> >>
> >> and the following standbys
> >>
> >> (name1, name1, name2, name2, name3, name3)
> > 
> > This was a bad example,
> > 
> > (name1, name1, name2, name2, name2, name2, name3)
> > 
> > For this case, the following are chosen as synchronous standbys.
> > 
> > (name1, name1, name2, name2, name2)
> > 
> > Three of the four name2s are chosen, but which name2s are chosen is
> > an implementation matter.
> > 
> >> # However, 5 for three names causes a warning..
> 
> OK, I see.  I tried to understand it (by carrying it out) using the
> following example.
> 
> synchronous_standby_names: '3 (aa, aa, bb, bb, cc)'
> 
> Individual names get assigned the following priorities based on their
> position in the list:
> 
> aa: 1, bb: 3, cc: 5

Yes, precisely it is (aa:1, aa:2, bb:3, bb:4, cc:5), but 2 and 4
won't have a matching walsender.

> Then standbys connect one-by-one, say the following (host and
> application_name):
> 
> host1: aa
> host2: aa
> host3: bb
> host4: bb
> host5: cc
> 
> syncrep.c will assign the following priorities (also shown is their sync status)
> 
> host1: aa: 1 (sync)
> host2: aa: 1 (sync)
> host3: bb: 3 (sync)
> host4: bb: 3 (potential)
> host5: cc: 5 (potential)

Which of host3 and host4 is to be sync is indeterminate; the one with
the smaller index among the walsender slots will be.

> As also evident in pg_stat_replication:
> 
> SELECT application_name, sync_priority, sync_state FROM
> pg_stat_replication ORDER BY 2;
>  application_name | sync_priority | sync_state
> ------------------+---------------+------------
>  aa   | 1 | sync
>  aa   | 1 | sync
>  bb   | 3 | sync
>  bb   | 3 | potential
>  cc   | 5 | potential
> (5 rows)

I haven't confirmed that but it will be so.

> Hm, I see (though I wondered if one of the 'aa's should have been assigned
> priority 2 and likewise for 'bb').

Honestly speaking, it is just faith in users, or a harmless bug that I
have overlooked :p  But the duplication is not what the documentation
is talking about. We can inhibit such duplication if we want, but
perhaps we don't want to.

> Following is the documentation paragraph under scrutiny:
> 
> """
> The name of a standby server for this purpose is the
> application_name setting of the standby, as set in the
> primary_conninfo of the standby's WAL receiver.  There is
> no mechanism to enforce uniqueness. In case of duplicates one of the
> matching standbys will be considered as higher priority, though
> exactly which one is indeterminate.
> """
> 
> Consider the name 'aa', it turns out that there are i

Re: [HACKERS] Fix for OpenSSL error queue bug

2016-04-25 Thread Michael Paquier
On Tue, Apr 26, 2016 at 9:37 AM, Peter Geoghegan  wrote:
> Only the 9.5 backpatch was a simple, conflict-free "git cherry-pick".
> Most of the effort here involved producing a clean 9.4 patch. This was
> largely mechanical, if a little tricky. In release branches for
> releases that preceded 9.4, there were a few further merge conflicts
> as I worked backwards through the branches, but those were trivial.

Looking again at this thread, the general agreement was to clear the
error stack before calling any SSL routine. Those patches are doing
so, and they look to be in good shape to me. Note: there is
SSL_do_handshake() on back-branches for SSL renegotiation, but we
don't need to bother clearing the error queue there, as any error
occurring in those cases just stops the session, and we've never
bothered calling ERR_get_error there to get more details about the
errors.
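
The agreed-upon pattern, for reference (a sketch of the backend read path;
port->ssl is the connection's SSL handle):

/*
 * Clear any stale entries in OpenSSL's per-thread error queue before
 * calling into OpenSSL, so errors left over from other users of the
 * library are not misattributed to this call.
 */
ERR_clear_error();
errno = 0;
n = SSL_read(port->ssl, ptr, len);
if (n <= 0)
    err = SSL_get_error(port->ssl, n);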

> I'm not sure if project policy around backpatching (that commit
> messages and so on should match exactly) has anything to say about the
> case where backpatching follows several weeks after commit to the
> master branch. In the absence of any clear direction on that, I've
> created commits that look like what Peter E might have pushed in early
> April, had he decided to do that backpatch leg-work up front.

It seems to me that we definitely want to get this stuff backpatched
at the end. So +1 for this move.
-- 
Michael




Re: [HACKERS] Can we improve this error message?

2016-04-25 Thread Andreas Karlsson

On 04/17/2016 09:28 PM, Bill Moran wrote:

> If you have a single line in the pg_hba.conf:
>
> hostssl all all 0.0.0.0/0 md5
>
> Attempting to log in with an incorrect password results in an
> error message about there not being a pg_hba.conf entry for the
> user.
>
> Reading carefully, the error message states that there's no
> pg_hba.conf for the user with **ssl off**.
>
> What I believe is happening, is that the pg connection libs
> first try to connect via ssl and get a password failed error,
> then fallback to trying to connect without ssl, and get a "no
> pg_hba.conf entry" error. The problem is that the second error
> masks the first one, hiding the real cause of the connection
> failure, and causing a lot of confusion.
>
> If we could keep both errors and report them both, I feel like
> it would be an improvement to our client library behavior.


I got both the messages when I tried this with psql. What did you do 
when you only got the second message?


Output:

psql: FATAL:  password authentication failed for user "andreas"
FATAL:  no pg_hba.conf entry for host "127.0.0.1", user "andreas", 
database "postgres", SSL off


Andreas



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Add trigonometric functions that work in degrees.

2016-04-25 Thread Peter Eisentraut
On 04/21/2016 08:18 PM, Tom Lane wrote:
> Hm.  This seems to prove that we're not getting exactly 1.0 from
> (asin(x) / asin_0_5) with x = 0.5, but I'm having a hard time guessing
> why that might be so when all the other cases work.
> 
> Could you send along the assembler code generated by the compiler (-S
> output) for float.c?  Maybe that would shed some light.  Probably the
> -O0 version would be easier to read.

Attached is a smaller test program that prints 29.9964 (same
as failing test result) as well as its assembler code.

#include <stdio.h>
#include <float.h>
#include <math.h>

static double
sub()
{
	double asin_0_5 = asin(0.5);
	double x = 0.5;
	return (asin(x) / asin_0_5) * 30.0;
}

int
main()
{
	printf("%.*g\n", DBL_DIG + 3, sub());

	return 0;
}
        .file   "test.c"
        .text
        .type   sub, @function
sub:
.LFB0:
        .cfi_startproc
        pushl   %ebp
        .cfi_def_cfa_offset 8
        .cfi_offset 5, -8
        movl    %esp, %ebp
        .cfi_def_cfa_register 5
        subl    $24, %esp
        fldl    .LC0
        fstpl   -16(%ebp)
        fldl    .LC1
        fstpl   -24(%ebp)
        subl    $8, %esp
        pushl   -20(%ebp)
        pushl   -24(%ebp)
        call    asin
        addl    $16, %esp
        fdivl   -16(%ebp)
        fldl    .LC2
        fmulp   %st, %st(1)
        leave
        .cfi_restore 5
        .cfi_def_cfa 4, 4
        ret
        .cfi_endproc
.LFE0:
        .size   sub, .-sub
        .section        .rodata
.LC4:
        .string "%.*g\n"
        .text
        .globl  main
        .type   main, @function
main:
.LFB1:
        .cfi_startproc
        leal    4(%esp), %ecx
        .cfi_def_cfa 1, 0
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        .cfi_escape 0x10,0x5,0x2,0x75,0
        movl    %esp, %ebp
        pushl   %ecx
        .cfi_escape 0xf,0x3,0x75,0x7c,0x6
        subl    $4, %esp
        call    sub
        leal    -8(%esp), %esp
        fstpl   (%esp)
        pushl   $18
        pushl   $.LC4
        call    printf
        addl    $16, %esp
        movl    $0, %eax
        movl    -4(%ebp), %ecx
        .cfi_def_cfa 1, 0
        leave
        .cfi_restore 5
        leal    -4(%ecx), %esp
        .cfi_def_cfa 4, 4
        ret
        .cfi_endproc
.LFE1:
        .size   main, .-main
        .section        .rodata
        .align 8
.LC0:
        .long   942502758
        .long   1071694162
        .align 8
.LC1:
        .long   0
        .long   1071644672
        .align 8
.LC2:
        .long   0
        .long   1077805056
        .ident  "GCC: (Debian 5.3.1-13) 5.3.1 20160323"
        .section        .note.GNU-stack,"",@progbits

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for OpenSSL error queue bug

2016-04-25 Thread Peter Geoghegan
On Fri, Apr 8, 2016 at 11:43 AM, Peter Geoghegan  wrote:
> I'll do so soon. I was waiting on Peter E to take me up on the offer.

Attached is a series of patches for each supported release branch. As
discussed, I would like to target every released version of Postgres
with this bugfix. This is causing users real pain at the moment.

Only the 9.5 backpatch was a simple, conflict-free "git cherry-pick".
Most of the effort here involved producing a clean 9.4 patch. This was
largely mechanical, if a little tricky. In release branches for
releases that preceded 9.4, there were a few further merge conflicts
as I worked backwards through the branches, but those were trivial.

I'm not sure if project policy around backpatching (that commit
messages and so on should match exactly) has anything to say about the
case where backpatching follows several weeks after commit to the
master branch. In the absence of any clear direction on that, I've
created commits that look like what Peter E might have pushed in early
April, had he decided to do that backpatch leg-work up front.

-- 
Peter Geoghegan


openssl-patches.tar.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Verifying embedded oids in *recv is a bad idea

2016-04-25 Thread Andres Freund
Hi,

for performance reasons it's a good idea to use the binary protocol. But
doing so between two postgres installations is made unnecessarily hard
by the choice of embedding and verifying oids in composite and array
types.  When using extensions, even commonly used ones like hstore, the
datatype oids will often not be the same between systems, even when
using exactly the same version on the same OS.

Specifically I'm talking about

Datum
array_recv(PG_FUNCTION_ARGS)
{
    ...
    element_type = pq_getmsgint(buf, sizeof(Oid));
    if (element_type != spec_element_type)
    {
        /* XXX Can we allow taking the input element type in any cases? */
        ereport(ERROR,
                (errcode(ERRCODE_DATATYPE_MISMATCH),
                 errmsg("wrong element type")));
    }
    ...
}

and

Datum
record_recv(PG_FUNCTION_ARGS)
{
    ...
    /* Process each column */
    for (i = 0; i < ncolumns; i++)
    {
        ...
        /* Verify column datatype */
        coltypoid = pq_getmsgint(buf, sizeof(Oid));
        if (coltypoid != column_type)
            ereport(ERROR,
                    (errcode(ERRCODE_DATATYPE_MISMATCH),
                     errmsg("wrong data type: %u, expected %u",
                            coltypoid, column_type)));
        ...
    }
}


Given that we're giving up quite some speed and adding complexity to
make send/recv functions portable, this seems like a shame.

I'm failing to see what these checks are buying us. The text
representation of a composite doesn't include type information about
the contained columns either; I don't see why the binary version has
to.

I'm basically thinking that we should remove the checks on the
receiving side, but leave the sending of oids in place for backward
compatibility.
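
To make that concrete, here is a minimal sketch of the receiving side
after such a change (the helper name is made up for illustration; the
point is consuming the wire OID without checking it):

#include "postgres.h"
#include "libpq/pqformat.h"

/*
 * Sketch: read the embedded element-type OID so the message parse
 * position stays correct, but no longer insist that it match the
 * receiver's OID for the declared type.
 */
static Oid
consume_wire_element_type(StringInfo buf, Oid spec_element_type)
{
    (void) pq_getmsgint(buf, sizeof(Oid));      /* read and discard */
    return spec_element_type;                   /* trust declared type */
}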

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_activity crashes

2016-04-25 Thread Kyotaro HORIGUCHI
Hello, thank you for understanding.

At Mon, 25 Apr 2016 10:26:49 -0400, Robert Haas wrote:
> On Thu, Apr 21, 2016 at 9:47 PM, Kyotaro HORIGUCHI
>  wrote:
> > A lock was already held in BackendPidGetProc(). Is it also
> > needless? If so, we should use BackendPidGetProcWithLock() instead
> > (the name seems a bit confusing, though).
> 
> Oh, that's really sad.  No, that lock is definitely needed.  We should
> probably try to figure out some day if there is a way to make this
> completely lockless, but that'll have to be 9.7 material or later.
> :-(

Agreed.

> > What I did in the patch was just extending the lock duration
> > until reading the pointer proc. I didn't add any additional
> > lock.
> 
> Sorry, I didn't realize that.  Good point.
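
For the archives, a minimal sketch of that approach (assuming 9.6's
BackendPidGetProcWithLock() and PGPROC's wait_event_info field; this is
an illustration, not the committed code):

    uint32      wait_event_info = 0;
    PGPROC     *proc;

    /* hold the lock across both the lookup and the field read */
    LWLockAcquire(ProcArrayLock, LW_SHARED);
    proc = BackendPidGetProcWithLock(pid);
    if (proc != NULL)
        wait_event_info = proc->wait_event_info;
    LWLockRelease(ProcArrayLock);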

I'm happy that you understand me :)

> >> >  The
> >> > only thing we need to do is to prevent the value from being read
> >> > twice, and we already have precedent for how to prevent that in
> >> > freelist.c.
> >
> > However, I don't have objections for the patch applied.
> 
> OK, let's leave it like that for now, then.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suspicious behaviour on applying XLOG_HEAP2_VISIBLE.

2016-04-25 Thread Michael Paquier
On Tue, Apr 26, 2016 at 1:13 AM, Andres Freund  wrote:
> I think my approach of a separate record is going to be easier to
> backpatch - the new commit record format you took advantage of is
> new.

Sorry for the late reply.

After reading both patches yesterday, I found your approach with
standby.c to be neater.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add jsonb_compact(...) for whitespace-free jsonb to text

2016-04-25 Thread Sehrope Sarkuni
On Sun, Apr 24, 2016 at 9:27 PM, Stephen Frost  wrote:

> * Andrew Dunstan (and...@dunslane.net) wrote:
> > On 04/24/2016 06:02 PM, Sehrope Sarkuni wrote:
> > >AFAIK, there's also no guarantee on the specific order of the
> > >resulting properties in the text representation either. I would
> > >suppose it's fixed for a given jsonb value within a database major
> > >version but across major versions it could change (if the
> > >underlying representation changes).
> >
> > The order is fixed and very unlikely to change, as it was chosen
> > quite deliberately to help ensure efficient processing. Any change
> > in on-disk representation of data types is something we work very
> > hard to avoid, as it makes it impossible to run pg_upgrade.
>
> We do, from time-to-time, change on-disk formats in a
> backwards-compatible way though.  In any case, it's my understanding
> that we don't *guarantee* any ordering currently and therefore we should
> discourage users from depending on it.  If we *are* going to guarantee
> ordering, then we should document what that ordering is.
>

Yes, that's the idea, namely to have a fixed text format that will not
change across releases. If the on-disk representation already supports
that, then this could just be a doc change (assuming there's agreement
that it's a good idea and said guarantee will be maintained).

Separately, I think the compact (i.e. whitespace-free) output is
useful on its own. It saves up to two bytes per key/value pair (one
after the colon and one after the comma), so the more keys you have,
the greater the savings.

Here's a (contrived) example to show the size difference when
serializing information_schema.columns. The row_to_json(...) function
returns whitespace-free output (as json, not jsonb), so it's a proxy
for json_compact(...). It comes out 7.5% smaller than the default
jsonb text format:

app=> SELECT
  MAX((SELECT COUNT(*) FROM json_object_keys(x))) AS num_keys,
  AVG(length(x::text)) AS json_text,
  AVG(length(x::jsonb::text)) AS jsonb_text,
  AVG(length(x::text)) / AVG(length(x::jsonb::text)) AS ratio
FROM (SELECT row_to_json(z.*) AS x
  FROM information_schema.columns z) t;
 num_keys |       json_text       |      jsonb_text       |         ratio
----------+-----------------------+-----------------------+------------------------
       44 | 1077.0748522652659225 | 1164.0748522652659225 | 0.92526253803121012857
(1 row)

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/


Re: [HACKERS] Is there a way around function search_path killing SQL function inlining?

2016-04-25 Thread Stephen Frost
* Bruce Momjian (br...@momjian.us) wrote:
> On Thu, Mar 10, 2016 at 11:48:41AM -0500, Tom Lane wrote:
> > Robert Haas  writes:
> > > Hmm.  The meaning of funcs.inline depends on the search_path, not just
> > > during dump restoration but all the time.  So anything uses it under a
> > > different search_path setting than the normal one will have this kind
> > > of problem; not just dump/restore.
> > 
> > Yeah, I see no reason to claim that this is a dump/restore-specific
> > problem.
> > 
> > > I don't have a very good idea what to do about that.
> > 
> > The safe way to write SQL-language functions to be search-path-proof
> > is to schema-qualify the names in them, or to add a "SET search_path"
> > clause to the function definition.
> > 
> > The problem with the latter approach is that it defeats inlining.
> > I thought for a minute that maybe we could teach the planner to do
> > inlining anyway by parsing the function body with the adjusted
> > search_path, but that doesn't really preserve the same semantics
> > (a SET would change the environment for called functions too).
> > 
> > So for now, the recommendation has to be "write functions you want
> > to inline with schema qualifications".  If you're worried about
> > preserving relocatability of an extension containing such functions,
> > the @extschema@ feature might help.
> 
> Is this a TODO item?

For my 2c, yes, but what we actually need is a way to reference
functions in *other* extensions, which might wish to be relocatable.

Even if they don't wish to be relocatable, we don't have any way to say
"replace this string with the schema name of this extension".  In other
words, we need something like:

@extschema{'postgis'}@

Which could then be used by extensions that depend on the PostGIS
extension to fully-qualify their function calls.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Is there a way around function search_path killing SQL function inlining?

2016-04-25 Thread Bruce Momjian
On Thu, Mar 10, 2016 at 11:48:41AM -0500, Tom Lane wrote:
> Robert Haas  writes:
> > Hmm.  The meaning of funcs.inline depends on the search_path, not just
> > during dump restoration but all the time.  So anything uses it under a
> > different search_path setting than the normal one will have this kind
> > of problem; not just dump/restore.
> 
> Yeah, I see no reason to claim that this is a dump/restore-specific
> problem.
> 
> > I don't have a very good idea what to do about that.
> 
> The safe way to write SQL-language functions to be search-path-proof
> is to schema-qualify the names in them, or to add a "SET search_path"
> clause to the function definition.
> 
> The problem with the latter approach is that it defeats inlining.
> I thought for a minute that maybe we could teach the planner to do
> inlining anyway by parsing the function body with the adjusted
> search_path, but that doesn't really preserve the same semantics
> (a SET would change the environment for called functions too).
> 
> So for now, the recommendation has to be "write functions you want
> to inline with schema qualifications".  If you're worried about
> preserving relocatability of an extension containing such functions,
> the @extschema@ feature might help.

Is this a TODO item?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Add trigonometric functions that work in degrees.

2016-04-25 Thread Tom Lane
Peter Eisentraut  writes:
> On 04/25/2016 03:09 PM, Tom Lane wrote:
>> I'm going to go ahead and push this, because it seems clearly more robust
>> than what we have.  But I'd appreciate a report on whether it fixes your
>> issue.

> 6b1a213bbd6599228b2b67f7552ff7cc378797bf did not fix it.

OK ... but it's still a good change, because it removes the assumption
that the compiler won't inline init_degree_constants().

> Attached is the assembler output (-O0) from float.c as of that commit.

Ah.  Here's the problem: line 1873 is

return (asin(x) / asin_0_5) * 30.0;

and that compiles into

        subl    $8, %esp
        pushl   -12(%ebp)       ... push x
        pushl   -16(%ebp)
        call    asin            ... call asin()
        addl    $16, %esp
        fldl    asin_0_5        ... divide by asin_0_5
        fdivrp  %st, %st(1)
        fldt    .LC46           ... multiply by 30.0
        fmulp   %st, %st(1)
        fstpl   -24(%ebp)       ... round to 64 bits
        fldl    -24(%ebp)

Evidently, asin() is returning an 80-bit result, and that's not
getting rounded to double width before we divide by asin_0_5.

The least ugly change that might fix this is to explicitly cast the
result of asin to double:

return ((float8) asin(x) / asin_0_5) * 30.0;

However, I'm not certain that that would do anything; the compiler
might feel that the result of asin() is already double.  The next
messier answer is to explicitly store the function result in a local
variable:

    {
        float8  asin_x = asin(x);

        return (asin_x / asin_0_5) * 30.0;
    }

A sufficiently cavalier compiler might choose to optimize that away,
too.  A look at gcc's manual suggests that we might need to use
the -ffloat-store option to guarantee it will work; which is ugly
and I'd prefer not to turn that on globally anyway.  If it comes to
that, probably the better answer is to turn asin_x into a global
variable, similarly to what we just did with the constants.
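
That last resort might look like this (a sketch; the variable name is
invented, following the naming of the existing constants):

/* File scope, intentionally not static: the compiler cannot prove
 * that no other module (e.g. a loadable extension) changes it. */
float8      degree_c_asin_x = 0;

    /* ... and in dasind(): */
    degree_c_asin_x = asin(x);
    return (degree_c_asin_x / asin_0_5) * 30.0;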

Can you try the above variants of line 1873 and see if either of them
fixes the issue for you?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SET ROLE and reserved roles

2016-04-25 Thread Stephen Frost
Robert, all,

[... comments elsewhere made me realize I hadn't actually sent this when
I thought I had, my apologies on that ...]

* Robert Haas (robertmh...@gmail.com) wrote:
> Great.  But there's no particular use case served by a lot of things
> which are natural outgrowths of the rest of the system which we permit
> anyway because it's too awkward otherwise - like zero-column tables.

Based on our discussion at PGConf.US and the comments up-thread from
Tom, I'll work up a patch to remove those checks around SET ROLE and
friends which were trying to prevent default roles from possibly being
made to own objects.

Should the checks, which have been included since nearly the start of
this version of the patch, to prevent users from GRANT'ing other rights
to the default roles remain?  Or should those also be removed?  I
*think* pg_dump/pg_upgrade would be fine with rights being added, and if
we aren't preventing ownership of objects then we aren't going to be
able to remove such roles in any case.

Of course, with these default roles, users can't REVOKE the rights
which are granted to them, as that granting happens in C code, outside
of the GRANT system.

Working up a patch to remove these checks should be pretty quick
(iirc, I've actually got an independent patch around from when I added
them; I just need to find it and then go through the committed patches
to make sure I take care of everything), but I would like to make sure
that we're now all on the same page and that *all* of these checks
should be removed, making default roles exactly like "regular" roles,
except that they're created at initdb time and have "special" rights
provided by C-level code checks.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Peter Geoghegan
On Mon, Apr 25, 2016 at 1:45 PM, Robert Haas  wrote:
>> I think the "degree" terminology is fundamentally tainted by the question
>> of whether or not it counts the leader, and that we will have bugs (or
>> indeed may have them today) caused by getting that wrong.
>
> This theory does not seem very plausible to me.  I don't really see
> how that could happen, although perhaps I'm blinded by being too close
> to the feature.  Also, you haven't presented entirely convincing
> evidence that other people are all united in the way they view this
> and that that way is different than PostgreSQL; and I've submitted
> some contrary evidence.

SQL Server definitely disables parallel query when max degree of
parallelism = 1. The default, 0, is "auto". I think that a DoP of 1 in
Oracle also disables parallelism.

> Even if Oracle for example does do it
> differently than what I've done here, slavishly following Oracle has
> never been a prerequisite for regarding a PostgreSQL feature as
> well-designed.  I think it is far more likely that going and
> offsetting the value of parallel_degree by 1 everywhere, as Peter has
> proposed, is going to introduce subtle bugs.

That was one approach that I mentioned as a theoretically sound way of
maintaining the "degree" terminology. This was several months back.
I'm not arguing for that.

I'm also not arguing for slavishly following Oracle or SQL Server.
Rather, my position is that as long as we're going to use their
terminology, we should also offer something roughly compatible with
their behavior. I think that not using their terminology is also a
reasonable solution.

I'm not sure about anyone else, but my complaint is entirely about the
baggage that the term "degree of parallelism" happens to have, and how
effectively that has been managed.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Breakage with VACUUM ANALYSE + partitions

2016-04-25 Thread Andres Freund
On 2016-04-25 16:29:36 -0400, Robert Haas wrote:
> On Mon, Apr 25, 2016 at 2:05 PM, Andres Freund  wrote:
> > Well, I posted a patch. I'd have applied it too (after addressing your
> > comments obviously), except that there's some interdependencies with the
> > nmsgs > 0 thread (some of my tests fail spuriously without that
> > fixed). Early last week I waited for a patch on that thread, but when
> > that didn't materialize by Friday I switched to work on that [1].  With
> > both fixes applied I can't reproduce any problems anymore.
> 
> OK.  What are the interdependencies?  You've said that a few times but
> I am not clear on what the nature of those interdependencies is.

I added checks to smgr/md.c that verify that the smgr state is
consistent with the on-file state. Without the two issues in [1] fixed,
these tests fail in a standby, while running regression tests.  Adding
those tests made me notice a bug in an unreleased version of the patch,
so it seems they're worthwhile to run.


> > About the delay: Sure, it'd be nicer if I'd addressed this
> > immediately. But priority-wise it's an issue that's damned hard to hit,
> > and where the worst known consequence is having to reconnect; that's not
> > earth shattering. So spending time to work on the, imo more severe
> > performance regressions, seemed to be more important; maybe I was wrong
> > in prioritizing things that way.
> 
> Do you mean performance regression related to this patch, or to other
> patches like the atomic buffer pin/unpin stuff?

Other patches, I can't see this having measurable impact.


> >> We have a patch, and that's good, and I have reviewed it and
> >> Thom has tested it, and that's good, too.  But it is not clear whether
> >> you feel confident to commit it or when you might be planning to do
> >> that, so I asked.  Given that this is the open item of longest tenure
> >> and that we're hoping to ship beta soon, why is that out of line?
> >
> > Well, if you'd asked it that way, then I'd not responded the way I have.
> 
> Fair enough, but keep in mind that a few more mildly-phrased emails
> from Noah didn't actually get us to the point where this is fixed, or
> even to a clear explanation of when it might be fixed or why it wasn't
> getting fixed sooner.

Hrmpf. I'd initially missed the first email in the onslaught; the
second email I responded to, and posted a patch in the timeframe I'd
promised. I didn't foresee, after we'd initially dismissed a
correlation, that there'd be a connection to the other patch.


> >> That commit introduced a new way to write to blocks that might have in
> >> the meantime been removed, and it failed to make that safe.
> >
> > There's no writing of any blocks involved - the problem is just about
> > opening segments which might or might not exist.
> 
> As I understand it, that commit introduced a backend-private queue;
> when it fills, we issue flush requests for particular blocks.  By the
> time the flush requests are issued, the blocks might have been
> truncated away.  So it's not writing blocks in the sense of write(),
> but persuading the OS to write those blocks to stable storage.

Right.


> >> And in fact I think it's a regression that can be
> >> expected to be a significant operational problem for people if not
> >> fixed, because the circumstances in which it can happen are not very
> >> obscure.  You just need to hold some pending flush requests in your
> >> backend-local queue while some other process truncates the relation,
> >> and boom.
> >
> > I found it quite hard to come up with scenarios to reproduce it. Entire
> > segments have to be dropped, the writes to the to-be-dropped segment
> > have to result in fully dead rows, only few further writes outside the
> > dropped can happen, invalidations may not fix the problem up.  But
> > obviously it should be fixed.
> 
> It's probably a bit tricky to produce a workload where the
> truncated-away block is in some backend's private queue, because the
> queues are very small and it's not exactly clear what sort of workload
> will produce regular relation truncations, and you need both of those
> things to hit this. However, I think it's pretty optimistic to think
> that there won't be people in that situation, partly because we had a
> customer find a bug in an EDB proprietary feature whose profile is
> very similar to this feature's less than 2 years ago.

I'm obviously not arguing that we shouldn't fix this.


> >> I assume you will be pretty darned unhappy if we end up at #2, so I am
> >> trying to figure out if we can achieve #1.  OK?
> >
> > Ok.
> 
> I'm still not clear on the answer to this.  I think the answer is that
> you think that this patch is OK to commit with some editing, but the
> situation with the nmsgs > 0 patch is not so clear yet because it
> hasn't had any review yet.  And they depend on each other somehow.  Am
> I close?

You are. I'd prefer pushing this after fixes for the two invalidation
issues, becau

Re: [HACKERS] pgsql: Add putenv support for msvcrt from Visual Studio 2013

2016-04-25 Thread Christian Ullrich

* Christian Ullrich wrote:


* Andrew Dunstan wrote:



What if both are present? Is a release build prevented from loading a
debug dll and vice versa?


Debug and release are simply two separate CRTs. If your process contains
a module that needs the one, and another that needs the other, you will
have both loaded at once.


pgwin32_putenv() is the gift that keeps giving.


According to its comment, it is not required that all modules
exchanging calls with postgres.exe be built with the same CRT version
(100, 110, 120, etc.). Is it?


If not, the logic that remembers negative results from GetModuleHandle() 
(i.e. gives up forever on each possible CRT once it does not find it) is 
wrong even without considering the debug/release split. If we load a 
compiled extension built with a CRT we have not seen yet, _after_ the 
first call to pgwin32_putenv(), that module's CRT's view of its 
environment will be frozen because we will never attempt to update it.


If that code is in there because it has some noticeable performance 
advantage, the negative results could probably be reset in SQL LOAD, 
rather than just not remembering them anymore.
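
Something along these lines, perhaps (a sketch of the retry idea with
made-up names, not the actual pgwin32_putenv() code):

#include <windows.h>

typedef struct
{
    const char *modulename;     /* e.g. "msvcr120.dll" */
    HMODULE     hmodule;        /* NULL until found */
} CrtEntry;

static void
putenv_into_crt(CrtEntry *crt, const char *envval)
{
    typedef int (_cdecl *PUTENVPROC) (const char *);
    PUTENVPROC  putenvFunc;

    /* retry the lookup every call instead of giving up forever */
    if (crt->hmodule == NULL)
        crt->hmodule = GetModuleHandle(crt->modulename);

    if (crt->hmodule == NULL)
        return;                 /* this CRT still isn't loaded */

    putenvFunc = (PUTENVPROC) GetProcAddress(crt->hmodule, "_putenv");
    if (putenvFunc != NULL)
        putenvFunc(envval);
}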



This comment is also incomplete then:

/*
 * Module loaded, but we did not find the function last time.
 * We're not going to find it this time either...
 */

This else branch is also taken if the module handle is set to 
INVALID_HANDLE_VALUE because the module was not found in a previous call.



If it can happen that a CRT DLL is unloaded before the process exits, 
and we cached the module handle while it was loaded, and later 
pgwin32_putenv() is called, that won't end well for the process. This 
might be a bit far-fetched; I have to see if I can actually make it happen.


One situation I can think of where this could occur is if an extension 
loaded with LOAD creates a COM in-proc server from a DLL built with yet 
another CRT, and when that object is released, either FreeLibrary() 
(transitively) or CoFreeUnusedLibraries() (directly) boots that CRT (if 
they do; it's possible that a CRT, once loaded, stays loaded.)



Finally: A nonzero handle returned from GetModuleHandle() is not 
something that needs to be CloseHandle()d. It is not actually a handle, 
but a pointer to the base (load) address of the module, although the 
documentation for GetModuleHandle() is careful not to admit that.


The value it is compared against to see whether we have seen the module 
before should be NULL, not 0.



It's getting a bit late for me today, but I will do the necessary 
experimentation and try to come up with a POC patch to fix whatever of 
the above I can actually prove to be real. Should anyone know for sure 
that I'm completely off track on something, better yet, everything, 
please let me know.


I should finish thinking before posting, then I would not have to reply 
to myself so often.


--
Christian



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Add trigonometric functions that work in degrees.

2016-04-25 Thread Peter Eisentraut
On 04/25/2016 03:09 PM, Tom Lane wrote:
> I'm going to go ahead and push this, because it seems clearly more robust
> than what we have.  But I'd appreciate a report on whether it fixes your
> issue.

6b1a213bbd6599228b2b67f7552ff7cc378797bf did not fix it.

Attached is the assembler output (-O0) from float.c as of that commit.


float.s.gz
Description: application/gzip

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Robert Haas
On Mon, Apr 25, 2016 at 4:24 PM, Tom Lane  wrote:
> Robert Haas  writes:
>>> What about calling it something even simpler, such as "max_parallelism"?
>>> This avoids such cargo cult, and there's no implication that it's
>>> per-query.
>
>> So what would we call the "parallel_degree" member of the Path data
>> structure, and the "parallel_degree" reloption?  I don't think
>> renaming either of those to "parallelism" is going to be an
>> improvement.
>
> I think we should rename all of these to something based on the concept of
> "number of worker processes", and adjust the code if necessary to match.

"worker processes" are a more general term than "parallel workers".  I
think if you conflate those two things, which I've tried quite hard to
keep separate in code and comments, there will be no end of confusion.
All parallel workers are background workers, aka worker processes, but
the reverse is false.

Also, when you think about the parallel degree of a path or plan, it
is the number of workers for which that path or plan has been costed,
which may or may not match the number we actually get.  I think of the
parallel_degree as the DESIRED number of workers for a path or plan
more than the actual number.  I think if we call it "number of
parallel workers" or something like that we had better think about
whether we're implying something other than that.  Perhaps the meaning
of the existing term "parallel degree" isn't as fully explicated as
would be desirable, but changing it to something else that definitely
doesn't mean what this was intended to mean won't be better.

> I think the "degree" terminology is fundamentally tainted by the question
> of whether or not it counts the leader, and that we will have bugs (or
> indeed may have them today) caused by getting that wrong.

This theory does not seem very plausible to me.  I don't really see
how that could happen, although perhaps I'm blinded by being too close
to the feature.  Also, you haven't presented entirely convincing
evidence that other people are all united in the way they view this
and that that way is different than PostgreSQL; and I've submitted
some contrary evidence.  Even if Oracle for example does do it
differently than what I've done here, slavishly following Oracle has
never been a prerequisite for regarding a PostgreSQL feature as
well-designed.  I think it is far more likely that going and
offsetting the value of parallel_degree by 1 everywhere, as Peter has
proposed, is going to introduce subtle bugs.

BTW, I am told by some of my colleagues that hinting /* PARALLEL 0 */
in Oracle hints no-parallelism, and that hinting /* PARALLEL 1 */
hints the use of one worker in addition to the main process.  I
haven't tried to verify this myself.

> Your arguments
> for not changing it seem to me not to address that point; you've merely
> focused on the question of whether we have the replacement terminology
> right.  If we don't, let's make it so, but the current situation is not
> good.

I don't agree that the current situation isn't good, but I'm willing
to change it anyway if we can come up with something that's actually
better.  In the absence of that, it makes no sense to change anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Peter Geoghegan
On Mon, Apr 25, 2016 at 1:24 PM, Tom Lane  wrote:
> I think we should rename all of these to something based on the concept of
> "number of worker processes", and adjust the code if necessary to match.
> I think the "degree" terminology is fundamentally tainted by the question
> of whether or not it counts the leader, and that we will have bugs (or
> indeed may have them today) caused by getting that wrong.

FWIW, my concern was always limited to that. I don't actually mind if
we use the "degree" terminology, as long as our usage is consistent
with that of other major systems. Since the GUC's behavior isn't going
to change now, the terminology should change. I'm fine with that. I'm
not particularly concerned with the specifics of some new terminology,
as long as it's consistent with the idea of auxiliary worker processes
that cooperate with a leader process.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Breakage with VACUUM ANALYSE + partitions

2016-04-25 Thread Robert Haas
On Mon, Apr 25, 2016 at 2:05 PM, Andres Freund  wrote:
> Well, I posted a patch. I'd have applied it too (after addressing your
> comments obviously), except that there's some interdependencies with the
> nmsgs > 0 thread (some of my tests fail spuriously without that
> fixed). Early last week I waited for a patch on that thread, but when
> that didn't materialize by Friday I switched to work on that [1].  With
> both fixes applied I can't reproduce any problems anymore.

OK.  What are the interdependencies?  You've said that a few times but
I am not clear on what the nature of those interdependencies is.

> About the delay: Sure, it'd be nicer if I'd addressed this
> immediately. But priority-wise it's an issue that's damned hard to hit,
> and where the worst known consequence is having to reconnect; that's not
> earth shattering. So spending time to work on the, imo more severe
> performance regressions, seemed to be more important; maybe I was wrong
> in prioritizing things that way.

Do you mean performance regression related to this patch, or to other
patches like the atomic buffer pin/unpin stuff?

>> We have a patch, and that's good, and I have reviewed it and
>> Thom has tested it, and that's good, too.  But it is not clear whether
>> you feel confident to commit it or when you might be planning to do
>> that, so I asked.  Given that this is the open item of longest tenure
>> and that we're hoping to ship beta soon, why is that out of line?
>
> Well, if you'd asked it that way, then I'd not responded the way I have.

Fair enough, but keep in mind that a few more mildly-phrased emails
from Noah didn't actually get us to the point where this is fixed, or
even to a clear explanation of when it might be fixed or why it wasn't
getting fixed sooner.

>> That commit introduced a new way to write to blocks that might have in
>> the meantime been removed, and it failed to make that safe.
>
> There's no writing of any blocks involved - the problem is just about
> opening segments which might or might not exist.

As I understand it, that commit introduced a backend-private queue;
when it fills, we issue flush requests for particular blocks.  By the
time the flush requests are issued, the blocks might have been
truncated away.  So it's not writing blocks in the sense of write(),
but persuading the OS to write those blocks to stable storage.
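
For illustration, a tolerant flush lookup might take roughly this
shape (a sketch borrowing md.c's _mdfd_getseg() and the
EXTENSION_REALLY_RETURN_NULL mode from the patch under discussion; the
exact committed form may differ, including the FileWriteback()
signature):

static void
flush_queued_range(SMgrRelation reln, ForkNumber forknum,
                   BlockNumber blocknum, BlockNumber nblocks)
{
    /* look the segment up without creating it ... */
    MdfdVec    *seg = _mdfd_getseg(reln, forknum, blocknum, false,
                                   EXTENSION_REALLY_RETURN_NULL);

    /* ... and silently skip if it was truncated away in the meantime */
    if (seg == NULL)
        return;

    /* hand the range to the OS for writeback */
    FileWriteback(seg->mdfd_vfd,
                  (off_t) (blocknum % ((BlockNumber) RELSEG_SIZE)) * BLCKSZ,
                  (off_t) nblocks * BLCKSZ);
}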

>> And in fact I think it's a regression that can be
>> expected to be a significant operational problem for people if not
>> fixed, because the circumstances in which it can happen are not very
>> obscure.  You just need to hold some pending flush requests in your
>> backend-local queue while some other process truncates the relation,
>> and boom.
>
> I found it quite hard to come up with scenarios to reproduce it. Entire
> segments have to be dropped, the writes to the to-be-dropped segment
> have to result in fully dead rows, only few further writes outside the
> dropped can happen, invalidations may not fix the problem up.  But
> obviously it should be fixed.

It's probably a bit tricky to produce a workload where the
truncated-away block is in some backend's private queue, because the
queues are very small and it's not exactly clear what sort of workload
will produce regular relation truncations, and you need both of those
things to hit this. However, I think it's pretty optimistic to think
that there won't be people in that situation, partly because we had a
customer find a bug in an EDB proprietary feature whose profile is
very similar to this feature's less than 2 years ago.

>> I assume you will be pretty darned unhappy if we end up at #2, so I am
>> trying to figure out if we can achieve #1.  OK?
>
> Ok.

I'm still not clear on the answer to this.  I think the answer is that
you think that this patch is OK to commit with some editing, but the
situation with the nmsgs > 0 patch is not so clear yet because it
hasn't had any review yet.  And they depend on each other somehow.  Am
I close?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Tom Lane
Robert Haas  writes:
>> What about calling it something even simpler, such as "max_parallelism"?
>> This avoids such cargo cult, and there's no implication that it's
>> per-query.

> So what would we call the "parallel_degree" member of the Path data
> structure, and the "parallel_degree" reloption?  I don't think
> renaming either of those to "parallelism" is going to be an
> improvement.

I think we should rename all of these to something based on the concept of
"number of worker processes", and adjust the code if necessary to match.
I think the "degree" terminology is fundamentally tainted by the question
of whether or not it counts the leader, and that we will have bugs (or
indeed may have them today) caused by getting that wrong.  Your arguments
for not changing it seem to me not to address that point; you've merely
focused on the question of whether we have the replacement terminology
right.  If we don't, let's make it so, but the current situation is not
good.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Gavin Flower

On 26/04/16 06:38, Joshua D. Drake wrote:

On 04/25/2016 09:04 AM, Robert Haas wrote:
On Mon, Apr 25, 2016 at 11:27 AM, Joshua D. Drake 
 wrote:

max_parallel_nodes


I hope you are trolling me.


Actually I wasn't. It is usually a little more obvious when I troll, 
subtlety is not exactly my strong suit ;)



[...]

I've been reading the pg mailing lists for over ten years...

Just when I've noticed tempers flaring, and a flame war starting - it is 
already over!  In essence, I've never seen any genuine trolling or 
flamewars - people are just too friendly & cooperative.  Though I have 
seen strong disagreements, people are very careful to keep the tone from 
getting too heated, yet still manage to show their passion.



Cheers,
Gavin




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Robert Haas
On Mon, Apr 25, 2016 at 2:58 PM, Alvaro Herrera
 wrote:
> Tom Lane wrote:
>> Magnus Hagander  writes:
>> > On Sun, Apr 24, 2016 at 8:23 PM, Tom Lane  wrote:
>> >> FWIW, I agree with Bruce that using "degree" here is a poor choice.
>> >> It's an unnecessary dependence on technical terminology that many people
>> >> will not be familiar with.
>>
>> > FWIW, SQL Server calls it "degree of parallelism" as well (
>> > https://technet.microsoft.com/en-us/library/ms188611(v=sql.105).aspx). And
>> > their configuration option is "max degree of parallelism":
>> > https://technet.microsoft.com/en-us/library/ms181007(v=sql.105).aspx.
>>
>> Yes, but both they and Oracle appear to consider "degree" to mean the
>> total number of processors used, not the number of secondary jobs in
>> addition to the main one.  The only thing worse than employing obscure
>> technical terminology is employing it incorrectly:
>
> What about calling it something even simpler, such as "max_parallelism"?
> This avoids such cargo cult, and there's no implication that it's
> per-query.

So what would we call the "parallel_degree" member of the Path data
structure, and the "parallel_degree" reloption?  I don't think
renaming either of those to "parallelism" is going to be an
improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed change to make cancellations safe

2016-04-25 Thread Tom Lane
Shay Rojansky  writes:
>> We really do need "cancel up to" semantics for reliable behavior.
>> Consider the case where the client has sent the query (or thinks it has)
>> but the server hasn't received it yet.  If the cancel request can arrive
>> at the server before the query fully arrives, and we don't have "cancel
>> all messages up through N" semantics, the cancel will not do what the
>> client expects it to.

> Keep in mind that in the case of a cancellation arriving really too early,
> i.e. before any messages have been received by the server, it will be
> totally ignored since at the time of reception there's nothing for the
> server to cancel yet.

Right, that's how it works today ...

> This may seem a bit exotic, although if you really
> want to provide air-tight cancellation semantics you could have the server
> track unfulfilled cancellation requests. In other words, if the server
> receives "cancel up to X" and is now processing X-5, the cancellation
> request is kept in memory until X has been duly cancelled.

Exactly my point.  If we're trying to make cancel semantics less squishy,
I think we need to do that; errors in this direction are just as bad as
the late-cancel-arrival case.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed change to make cancellations safe

2016-04-25 Thread Shay Rojansky
>
> We really do need "cancel up to" semantics for reliable behavior.
> Consider the case where the client has sent the query (or thinks it has)
> but the server hasn't received it yet.  If the cancel request can arrive
> at the server before the query fully arrives, and we don't have "cancel
> all messages up through N" semantics, the cancel will not do what the
> client expects it to.
>

Keep in mind that in the case of a cancellation arriving really too early,
i.e. before any messages have been received by the server, it will be
totally ignored since at the time of reception there's nothing for the
server to cancel yet. This may seem a bit exotic, although if you really
want to provide air-tight cancellation semantics you could have the server
track unfulfilled cancellation requests. In other words, if the server
receives "cancel up to X" and is now processing X-5, the cancellation
request is kept in memory until X has been duly cancelled.
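
A minimal sketch of that bookkeeping (the message sequence numbers and
all names here are hypothetical; no such protocol field exists today):

#include <stdbool.h>
#include <stdint.h>

typedef struct CancelTracker
{
    uint32      cancel_up_to;   /* highest message number covered by any
                                 * cancel request received so far */
} CancelTracker;

/* Remember a "cancel up to X" request, even if X hasn't arrived yet. */
static void
remember_cancel(CancelTracker *ct, uint32 up_to)
{
    if (up_to > ct->cancel_up_to)
        ct->cancel_up_to = up_to;
}

/* Checked as each message starts executing: X-5 is cancelled if X was. */
static bool
message_cancelled(const CancelTracker *ct, uint32 msg_no)
{
    return msg_no <= ct->cancel_up_to;
}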


Re: [HACKERS] pgsql: Add putenv support for msvcrt from Visual Studio 2013

2016-04-25 Thread Christian Ullrich

* Andrew Dunstan wrote:


On 04/25/2016 09:27 AM, Christian Ullrich wrote:

* Magnus Hagander wrote:


On Sun, Apr 24, 2016 at 9:56 PM, Christian Ullrich
 wrote:



Just noticed something. This DLL detection by name has never worked in
debug builds where the DLL names end in "d". Is that important?



That's an interesting point.  I guess our release builds are never with
debugging info - but could it make the buildfarm "wrong"?



What if both are present? Is a release build prevented from loading a
debug dll and vice versa?


Debug and release are simply two separate CRTs. If your process contains 
a module that needs the one, and another that needs the other, you will 
have both loaded at once.


I had hoped they might share state, but they don't, at least as far as 
putenv()/getenv() are concerned.



Alternatively, can we detect at compile time if we are a debug build and
if so add the suffix?


No; same reason as above.

--
Christian



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Add putenv support for msvcrt from Visual Studio 2013

2016-04-25 Thread Andrew Dunstan



On 04/25/2016 09:27 AM, Christian Ullrich wrote:

* Magnus Hagander wrote:

On Sun, Apr 24, 2016 at 9:56 PM, Christian Ullrich 


wrote:


* Magnus Hagander wrote:

Add putenv support for msvcrt from Visual Studio 2013
http://git.postgresql.org/pg/commitdiff/9f633b404cb3be6139f8dfdea00538489ffef9ab 




Just noticed something. This DLL detection by name has never worked in
debug builds where the DLL names end in "d". Is that important?



That's an interesting point.  I guess our release builds are never with
debugging info - but could it make the buildfarm "wrong"?

Fixing it should probably be as easy as trying each dll with the
specified name and also with a "d" as a suffix?


I think so, yes.

Personally, I would have expected that at least the debug/release DLLs 
of a single CRT version would somehow share their environment, but I 
tried it and they don't.




What if both are present? Is a release build prevented from loading a 
debug dll and vice versa?


Alternatively, can we detect at compile time if we are a debug build and 
if so add the suffix?


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Add trigonometric functions that work in degrees.

2016-04-25 Thread Tom Lane
I wrote:
> If that is the answer, then the next question is how we can put more
> roadblocks in the way of compile-time evaluation of asin(0.5).  Dean
> suggested that creative use of "volatile" might do it, and I agree
> that that sounds like a promising thing to pursue.

It occurred to me that we don't actually need "volatile".  What we need
is a variable that the compiler is not allowed to assume is a compile-time
constant, and a plain global variable is sufficient for that.  In the
attached patch, we no longer need an assumption that init_degree_constants
doesn't get inlined; we only need to assume that the compiler can't prove
the variables degree_c_thirty etc to be immutable.  Which it cannot, even
if it does global optimization across the whole PG executable, because it
has to consider that loadable extensions might change them.

I'm going to go ahead and push this, because it seems clearly more robust
than what we have.  But I'd appreciate a report on whether it fixes your
issue.

regards, tom lane

diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
index c7c0b58..a95ebe5 100644
*** a/src/backend/utils/adt/float.c
--- b/src/backend/utils/adt/float.c
*** static float8 atan_1_0 = 0;
*** 77,91 
  static float8 tan_45 = 0;
  static float8 cot_45 = 0;
  
  /* Local function prototypes */
  static int	float4_cmp_internal(float4 a, float4 b);
  static int	float8_cmp_internal(float8 a, float8 b);
  static double sind_q1(double x);
  static double cosd_q1(double x);
! 
! /* This is INTENTIONALLY NOT STATIC.  Don't "fix" it. */
! void init_degree_constants(float8 thirty, float8 forty_five, float8 sixty,
! 	  float8 one_half, float8 one);
  
  #ifndef HAVE_CBRT
  /*
--- 77,100 
  static float8 tan_45 = 0;
  static float8 cot_45 = 0;
  
+ /*
+  * These are intentionally not static; don't "fix" them.  They will never
+  * be referenced by other files, much less changed; but we don't want the
+  * compiler to know that, else it might try to precompute expressions
+  * involving them.  See comments for init_degree_constants().
+  */
+ float8		degree_c_thirty = 30.0;
+ float8		degree_c_forty_five = 45.0;
+ float8		degree_c_sixty = 60.0;
+ float8		degree_c_one_half = 0.5;
+ float8		degree_c_one = 1.0;
+ 
  /* Local function prototypes */
  static int	float4_cmp_internal(float4 a, float4 b);
  static int	float8_cmp_internal(float8 a, float8 b);
  static double sind_q1(double x);
  static double cosd_q1(double x);
! static void init_degree_constants(void);
  
  #ifndef HAVE_CBRT
  /*
*** dtan(PG_FUNCTION_ARGS)
*** 1814,1848 
   * compilers out there that will precompute expressions such as sin(constant)
   * using a sin() function different from what will be used at runtime.  If we
   * want exact results, we must ensure that none of the scaling constants used
!  * in the degree-based trig functions are computed that way.
!  *
!  * The whole approach fails if init_degree_constants() gets inlined into the
!  * call sites, since then constant-folding can happen anyway.  Currently it
!  * seems sufficient to declare it non-static to prevent that.  We have no
!  * expectation that other files will call this, but don't tell gcc that.
   *
   * Other hazards we are trying to forestall with this kluge include the
   * possibility that compilers will rearrange the expressions, or compute
   * some intermediate results in registers wider than a standard double.
   */
! void
! init_degree_constants(float8 thirty, float8 forty_five, float8 sixty,
! 	  float8 one_half, float8 one)
  {
! 	sin_30 = sin(thirty * RADIANS_PER_DEGREE);
! 	one_minus_cos_60 = 1.0 - cos(sixty * RADIANS_PER_DEGREE);
! 	asin_0_5 = asin(one_half);
! 	acos_0_5 = acos(one_half);
! 	atan_1_0 = atan(one);
! 	tan_45 = sind_q1(forty_five) / cosd_q1(forty_five);
! 	cot_45 = cosd_q1(forty_five) / sind_q1(forty_five);
  	degree_consts_set = true;
  }
  
  #define INIT_DEGREE_CONSTANTS() \
  do { \
  	if (!degree_consts_set) \
! 		init_degree_constants(30.0, 45.0, 60.0, 0.5, 1.0); \
  } while(0)
  
  
--- 1823,1853 
   * compilers out there that will precompute expressions such as sin(constant)
   * using a sin() function different from what will be used at runtime.  If we
   * want exact results, we must ensure that none of the scaling constants used
!  * in the degree-based trig functions are computed that way.  To do so, we
!  * compute them from the variables degree_c_thirty etc, which are also really
!  * constants, but the compiler cannot assume that.
   *
   * Other hazards we are trying to forestall with this kluge include the
   * possibility that compilers will rearrange the expressions, or compute
   * some intermediate results in registers wider than a standard double.
   */
! static void
! init_degree_constants(void)
  {
! 	sin_30 = sin(degree_c_thirty * RADIANS_PER_DEGREE);
! 	one_minus_cos_60 = 1.0 - cos(degree_c_sixty * RADIANS_PER_DEGREE);
! 	asin_0_5 = as

Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Alvaro Herrera
Tom Lane wrote:
> Magnus Hagander  writes:
> > On Sun, Apr 24, 2016 at 8:23 PM, Tom Lane  wrote:
> >> FWIW, I agree with Bruce that using "degree" here is a poor choice.
> >> It's an unnecessary dependence on technical terminology that many people
> >> will not be familiar with.
> 
> > FWIW, SQL Server calls it "degree of parallelism" as well (
> > https://technet.microsoft.com/en-us/library/ms188611(v=sql.105).aspx). And
> > their configuration option is "max degree of parallelism":
> > https://technet.microsoft.com/en-us/library/ms181007(v=sql.105).aspx.
> 
> Yes, but both they and Oracle appear to consider "degree" to mean the
> total number of processors used, not the number of secondary jobs in
> addition to the main one.  The only thing worse than employing obscure
> technical terminology is employing it incorrectly:

What about calling it something even simpler, such as "max_parallelism"?
This avoids such cargo cult, and there's no implication that it's
per-query.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] how to measure pglogical replication lag

2016-04-25 Thread Guo, Yun
Is there a function or view that could return a measurable replication
lag for pglogical?

Thanks
Yun


Re: [HACKERS] xlc atomics

2016-04-25 Thread Andres Freund
On 2016-04-23 21:54:07 -0400, Noah Misch wrote:
> I missed a second synchronization bug in generic-xlc.h, but the pgbench test
> suite caught it promptly after commit 008608b.

Nice catch.


> The bug is that the pg_atomic_compare_exchange_*() specifications
> grant "full barrier semantics", but generic-xlc.h provided only the
> semantics of an acquire barrier.

I find the docs at
http://www-01.ibm.com/support/knowledgecenter/SSGH3R_13.1.2/com.ibm.xlcpp131.aix.doc/compiler_ref/bifs_sync_atomic.html
to be awfully silent about that matter. I guess you just looked at
the assembler code?  It's nice that one can figure out stuff like that
from an architecture manual, but it's sad that the docs for the
intrinsics are silent about it.


I do wonder if I shouldn't have let xlc fall by the wayside and made
it use the fallback implementation. Given its popularity (or rather
lack thereof), that'd probably have been a better choice.  But that
ship has sailed.


Except that I didn't verify the rs6000_pre_atomic_barrier() and
__fetch_and_add() internals regarding the emitted sync/isync
instructions, the patch looks good.  We've so far not referred to
"sequential consistency", but given its rise in popularity, I don't
think it hurts.

Strictly speaking generic-xlc.h shouldn't contain inline asm, but
given that xlc is only there for POWER...
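
For reference, the shape of a full-barrier CAS on POWER as I
understand the fix (a sketch assuming xlc's __compare_and_swap() and
__isync() builtins plus inline asm for the leading sync; not
necessarily the committed form):

#include <stdbool.h>

static inline bool
cas_full_barrier(volatile int *ptr, int *expected, int newval)
{
    bool    ret;

    /* full fence: lwsync would give acquire/release semantics only */
    __asm__ __volatile__ ("  sync \n" ::: "memory");

    ret = __compare_and_swap(ptr, expected, newval);

    /* per xlc's docs: insert __isync at the start of critical sections */
    __isync();

    return ret;
}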

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Breakage with VACUUM ANALYSE + partitions

2016-04-25 Thread Fabien COELHO


Hello Andres,


Fixes it for me too.


Yep, for me too.

I'm not at ease with the part of the API the code is dealing with, so
I do not feel like a competent reviewer. I agree with the "more
comments" suggestion from Robert.


I have just a small naming point:

  /* ereport if segment not present, create in recovery */
  EXTENSION_FAIL,
  /* return NULL if not present, create in recovery */
  EXTENSION_RETURN_NULL,
  /* return NULL if not present */
  EXTENSION_REALLY_RETURN_NULL,
  /* create new segments as needed */
  EXTENSION_CREATE

The comments seem pretty clear, but the naming of these options is
more behavioral than functional somehow (or the reverse?); the
RETURN_NULL and REALLY_RETURN_NULL names in particular seem pretty
contrived to me.


There is one piece of context: whether we are in recovery. And there
are three possible behaviors: error, ignore, or create if the segment
does not exist.


In recovery it is always "create": if the segment is asked for, it
must be made available; maybe it does not exist because of the
crash...


If I understand correctly, with flushes kept pending a long time
before being processed, there is a new state, "ignore whatever": we do
not want to create a segment just to flush non-existent data.


So I would suggest it would be better to name the options closer to
the comments above, something like the following (spelled out as an
enum sketch after the list):


normal / in recovery:
  EXTENSION_ERROR_OR_IR_CREATE
  EXTENSION_IGNORE_OR_IR_CREATE
  EXTENSION_IGNORE
  EXTENSION_CREATE
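
Spelled out as the enum (a sketch; I am assuming md.c's existing
typedef name, and the comments are carried over from the current
options):

typedef enum
{
    EXTENSION_ERROR_OR_IR_CREATE,   /* ereport if not present; create in recovery */
    EXTENSION_IGNORE_OR_IR_CREATE,  /* return NULL if not present; create in recovery */
    EXTENSION_IGNORE,               /* return NULL if not present */
    EXTENSION_CREATE                /* create new segments as needed */
} ExtensionBehavior;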

Now, maybe it is too late to try to find a better name for these
options :-)


--
Fabien.




Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Joshua D. Drake

On 04/25/2016 09:04 AM, Robert Haas wrote:

On Mon, Apr 25, 2016 at 11:27 AM, Joshua D. Drake  
wrote:

max_parallel_nodes


I hope you are trolling me.


Actually, I wasn't. It is usually a little more obvious when I troll;
subtlety is not exactly my strong suit ;)


The only reason I suggested that name was because of this comment:

"I think Robert said it is per-executor node, not per session, similar 
to work_mem." (from Bruce)


If it doesn't make sense, I apologize for the noise.

Sincerely,

JD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.




Re: [HACKERS] Suspicious behaviour on applying XLOG_HEAP2_VISIBLE.

2016-04-25 Thread Andres Freund
Dean, Robert,

AFAICS the problem described below was introduced in b4e07417. Do you
have a different/better proposal than
s/CacheInvalidateSmgr/CacheInvalidateRelcache/? Doing that doesn't feel
quite right either, because it only makes the file extension visible at
end-of-xact - which is mostly harmless, but still.

On 2016-04-23 19:51:17 -0700, Andres Freund wrote:
> It fixes the problem at hand, but essentially it's just luck that the
> patch is sufficient. The first layer of the issue is that queued
> invalidation messages aren't sent; but for vm_extend() there's another
> side to it. Namely vm_extend() does
> 
>   /*
>* Send a shared-inval message to force other backends to close any smgr
>* references they may have for this rel, which we are about to change.
>* This is a useful optimization because it means that backends don't 
> have
>* to keep checking for creation or extension of the file, which happens
>* infrequently.
>*/
>   CacheInvalidateSmgr(rel->rd_smgr->smgr_rnode);
> 
> but CacheInvalidateSmgr is non-transactional, as its comment explains:
>  *
>  * Note: because these messages are nontransactional, they won't be captured
>  * in commit/abort WAL entries.  Instead, calls to CacheInvalidateSmgr()
>  * should happen in low-level smgr.c routines, which are executed while
>  * replaying WAL as well as when creating it.
>  *
> 
> as far as I can see vm_extend() is the only current caller forgetting
> that rule. I don't think it's safe to use CacheInvalidateSmgr() outside
> smgr.c.
> 
> The reason this all ends up working nonetheless is that the
> heap_inplace_update()s in vacuum trigger a CacheInvalidateHeapTuple()
> which queues a relcache invalidation, which in turn does a
> RelationCloseSmgr() in RelationClearRelation(). Thereby effectively
> doing a transactional CacheInvalidateSmgr().  But that seems more than
> fragile.
> 
> ISTM we should additionally replace the CacheInvalidateSmgr() with a
> CacheInvalidateRelcache() and document that that implies an smgr
> invalidation.  Alternatively we could log smgr (and relmapper)
> invalidations as well, but that's not quite non-invasive either; but
> might be a good long-term idea to keep things simpler.
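
Concretely, the proposed replacement in vm_extend() would be something
like this (a sketch of the idea, not a committed fix):

/* was: nontransactional, unsafe outside smgr.c */
/* CacheInvalidateSmgr(rel->rd_smgr->smgr_rnode); */

/*
 * Queue a transactional relcache invalidation instead.  Processing it
 * closes the rel's smgr references via RelationClearRelation(), so it
 * implies an smgr invalidation - but only at end of transaction, which
 * is the "mostly harmless" wart mentioned above.
 */
CacheInvalidateRelcache(rel);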

- Andres




[HACKERS] Apparent race condition in standby promotion

2016-04-25 Thread Tom Lane
I'm looking at buildfarm member tern's recent failure:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2016-04-25%2001%3A08%3A08

The perl script thinks it's started the postmaster and sent a promote
signal, then it's waited 90 seconds for the postmaster to come out of
standby:

### Starting node "standby"
# Running: pg_ctl -w -D 
/home/nm/farm/gcc32/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/data_standby_fTzy/pgdata
 -l 
/home/nm/farm/gcc32/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/log/001_basic_standby.log
 start
waiting for server to start done
server started
# Postmaster PID for node "standby" is 14025120
# Running: psql -q --no-psqlrc -d port=59882 host=/tmp/hr4cDmbIX0 
dbname=postgres -c INSERT INTO tbl1 values ('in master, before promotion')
# Running: psql -q --no-psqlrc -d port=59882 host=/tmp/hr4cDmbIX0 
dbname=postgres -c INSERT INTO trunc_tbl values ('in master, before promotion')
# Running: psql -q --no-psqlrc -d port=59882 host=/tmp/hr4cDmbIX0 
dbname=postgres -c INSERT INTO tail_tbl SELECT g, 'in master, before promotion: 
' || g FROM generate_series(1, 1) g
# Running: psql -q --no-psqlrc -d port=59882 host=/tmp/hr4cDmbIX0 
dbname=postgres -c CHECKPOINT
### Promoting node "standby"
# Running: pg_ctl -D 
/home/nm/farm/gcc32/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/data_standby_fTzy/pgdata
 -l 
/home/nm/farm/gcc32/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/log/001_basic_standby.log
 promote
server promoting
# 
Timed out while waiting for promotion of standby at RewindTest.pm line 166.

But look at what's in the standby's log:

LOG:  database system was interrupted; last known up at 2016-04-25 01:58:39 UTC
LOG:  entering standby mode
LOG:  redo starts at 0/228
LOG:  consistent recovery state reached at 0/2000C00
LOG:  database system is ready to accept read only connections
LOG:  started streaming WAL from primary at 0/300 on timeline 1
LOG:  statement: SELECT NOT pg_is_in_recovery()
... 88 occurrences of that removed ...
LOG:  statement: SELECT NOT pg_is_in_recovery()
LOG:  received promote request
FATAL:  terminating walreceiver process due to administrator command

The standby server didn't notice the promote request until AFTER
the perl script had probed 90 times, once a second, for promotion
to finish.  What's up with that?

One theory is that the postmaster handles promotion requests improperly.
Its only consideration of the problem is in sigusr1_handler:

if (CheckPromoteSignal() && StartupPID != 0 &&
(pmState == PM_STARTUP || pmState == PM_RECOVERY ||
 pmState == PM_HOT_STANDBY || pmState == PM_WAIT_READONLY))
{
/* Tell startup process to finish recovery */
signal_child(StartupPID, SIGUSR2);
}

This means that if pg_ctl sends a SIGUSR1 after creating a promotion
trigger file, and the system state is not such that the signal can be
forwarded on to the startup process at that very instant, the signal is
simply forgotten.  We will reconsider sending it to the startup process
only when the postmaster next gets a SIGUSR1, which might not be for a
long time.  Now, I do not immediately see a way for this to happen: the
postmaster should create the startup process before it first enables
interrupts, and that list of states seems to cover every possibility.
But it seems fragile as heck.  Maybe we need to check for promotion in
PostmasterStateMachine or related places, not in (or not just in) the
signal handler.
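
One way to make that less fragile would be to factor the forwarding logic
into a helper and call it from the state-machine code as well, so a
request arriving at an awkward moment gets retried instead of forgotten.
A sketch (hypothetical; the helper name and call sites are assumptions):

/*
 * Sketch: forward a pending promote request to the startup process.
 * Called from sigusr1_handler() and also from PostmasterStateMachine(),
 * so a request received in an unexpected state is not lost.
 */
static void
MaybeForwardPromoteSignal(void)
{
    if (CheckPromoteSignal() && StartupPID != 0 &&
        (pmState == PM_STARTUP || pmState == PM_RECOVERY ||
         pmState == PM_HOT_STANDBY || pmState == PM_WAIT_READONLY))
    {
        /* Tell startup process to finish recovery */
        signal_child(StartupPID, SIGUSR2);
    }
}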

The other theory is that the startup process received the SIGUSR2
but failed to act on it for a long time.  It checks for that only
in CheckForStandbyTrigger(), and the calls to that function are
in assorted rather unprincipled places, which makes it hard to
convince oneself that such a call will always happen soon.
This example sure looks like no such call happened.

We've seen previous examples of the same thing, eg
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2016-02-09%2019%3A16%3A09
Noah raised the test's timeout from 30 seconds to 90 in response
to that failure, but it looks to me like this was a mis-diagnosis
of the problem.  What I suspect is happening is that if the signal
arrives at the startup process at the right time, probably around
the time it's opening up the streaming connection from the primary,
it doesn't get handled promptly.

I'm not planning to pursue this further, but someone who's more
familiar with the WAL-receiving logic in startup.c ought to.

regards, tom lane




Re: [HACKERS] [BUGS] Breakage with VACUUM ANALYSE + partitions

2016-04-25 Thread Andres Freund
Hi,

On 2016-04-25 12:38:36 -0400, Robert Haas wrote:
> I think that the point of my message is exactly what I said in my
> message.  This isn't really about the last couple of days.  The issue
> was reported on March 20th.  On March 31st, Noah asked you for a plan
> to get it fixed by April 7th.  You never replied.  On April 16th, the
> issue not having been fixed, he followed up.  You said that you would
> fix it next week.  That week is now over, and we're into the following
> week.

Well, I posted a patch. I'd have applied it too (after addressing your
comments, obviously), except that there are some interdependencies with
the nsmg > 0 thread (some of my tests fail spuriously without that
fixed). Early last week I waited for a patch on that thread, but when
that didn't materialize by Friday I switched to working on it myself [1].
With both fixes applied I can't reproduce any problems anymore.

About the delay: sure, it'd have been nicer if I'd addressed this
immediately. But priority-wise it's an issue that's damned hard to hit,
and where the worst known consequence is having to reconnect; that's not
earth-shattering. So spending time on the, IMO more severe, performance
regressions seemed more important; maybe I was wrong in prioritizing
things that way.


> We have a patch, and that's good, and I have reviewed it and
> Thom has tested it, and that's good, too.  But it is not clear whether
> you feel confident to commit it or when you might be planning to do
> that, so I asked.  Given that this is the open item of longest tenure
> and that we're hoping to ship beta soon, why is that out of line?

Well, if you'd asked it that way, then I'd not responded the way I have.


> We initially had a theory that the commit that caused this issue
> merely revealed an underlying problem that had existed before, but I
> no longer really think that's the case.

I do think there's some lingering problems (avoiding a FATAL by choosing
an index scan instead of a seqscan, the problem afaics can transiently
occur in HS, any corrupted index can trigger it ...) - but I agree it's
not really been a big problem so far.


> That commit introduced a new way to write to blocks that might have in
> the meantime been removed, and it failed to make that safe.

There's no writing of any blocks involved - the problem is just about
opening segments which might or might not exist.


> And in fact I think it's a regression that can be
> expected to be a significant operational problem for people if not
> fixed, because the circumstances in which it can happen are not very
> obscure.  You just need to hold some pending flush requests in your
> backend-local queue while some other process truncates the relation,
> and boom.

I found it quite hard to come up with scenarios to reproduce it. Entire
segments have to be dropped, the writes to the to-be-dropped segment
have to result in fully dead rows, only a few further writes outside the
dropped segment can happen, and invalidations may not fix the problem up.
But obviously it should be fixed.


> I assume you will be pretty darned unhappy if we end up at #2, so I am
> trying to figure out if we can achieve #1.  OK?

Ok.

[1] 
http://archives.postgresql.org/message-id/20160424025117.2cmf6ku4ldfcoo44%40alap3.anarazel.de


Andres




Re: [HACKERS] Proposed change to make cancellations safe

2016-04-25 Thread Alvaro Herrera
Shay Rojansky wrote:

> BTW it seems it's no longer possible for anyone to add things to the TODO
> (no editor privileges).

You need to ask for editor privs on pgsql-www.  Make sure to mention
your community user name.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] plpgsql - DECLARE - cannot to use %TYPE or %ROWTYPE for composite types

2016-04-25 Thread Bruce Momjian

Good summary.  Is there a TODO item here?

---

On Tue, Mar 15, 2016 at 08:17:07PM -0400, Tom Lane wrote:
> Pavel Stehule  writes:
> >> Robert Haas  writes:
> >>> That's not a dumb idea.  I think %TYPE is an Oracle-ism, and it
> >>> doesn't seem to have been their best-ever design decision.
> 
> > Using %TYPE has sense in PostgreSQL too.
> 
> It's certainly useful functionality; the question is whether this
> particular syntax is an appropriate base for extended features.
> 
> As I see it, what we're talking about here could be called type operators:
> given a type name or some other kind of SQL expression, produce the name
> of a related type.  The existing things of that sort are %TYPE and []
> (we don't really implement [] as a type operator, but a user could
> reasonably think of it as one).  This patch proposes to make %TYPE and []
> composable into a single operator, and then it proposes to add ELEMENT OF
> as a different operator; and these things are only implemented in plpgsql.
> 
> My concern is basically that I don't want to stop there.  I think we want
> more type operators in future, such as the rowtype-related operators
> I sketched upthread; and I think we will want these operators anywhere
> that you can write a type name.
> 
> Now, in the core grammar we have [] which can be attached to any type
> name, and we have %TYPE but it only works in very limited contexts.
> There's a fundamental problem with extending %TYPE to be used anywhere
> a type name can: consider
> 
>   select 'foo'::x%type from t;
> 
> It's ambiguous whether this is an invocation of %TYPE syntax or whether %
> is meant to be a regular operator and TYPE the name of a variable.  Now,
> we could remove that ambiguity by promoting TYPE to be a fully reserved
> word (it is unreserved today).  But that's not very palatable, and even
> if we did reserve TYPE, I think we'd still need a lexer kluge to convert
> %TYPE into a single token, else bison will have lookahead problems.
> That sort of kluge is ugly, costs performance, and tends to have
> unforeseen side-effects.
> 
> So my opinion is that rather than extending %TYPE, we need a new syntax
> that is capable of being used in more general contexts.
> 
> There's another problem with the proposal as given: it adds a prefix
> type operator (ELEMENT OF) where before we only had postfix ones.
> That means there's an ambiguity about which one binds tighter.  This is
> not a big deal right now, since there'd be little point in combining
> ELEMENT OF and [] in the same operation, but it's going to create a mess
> when we try to add additional type operators.  You're going to need to
> allow parentheses to control binding order.  I also find it unsightly
> that the prefix operator looks so little like the postfix operators
> syntactically, even though they do very similar sorts of things.
> 
> In short there basically isn't much to like about these syntax details.
> 
> I also do not like adding the feature to plpgsql first.  At best, that's
> going to be code we throw away when we implement the same functionality
> in the core's typename parser.  At worst, we'll have a permanent
> incompatibility because we find we can't make the core parser use exactly
> the same syntax.  (For example, it's possible we'd find out we have to
> make ELEMENT a fully-reserved word in order to use this ELEMENT OF syntax.
> Or maybe it's fine; but until we've tried to cram it into the Typename
> production, we won't know.  I'm a bit suspicious of expecting it to be
> fine, though, since AFAICS this patch breaks the ability to use "element"
> as a plain type name in a plpgsql variable declaration.  Handwritten
> parsing code like this tends to be full of such gotchas.)
> 
> In short, I think we should reject this implementation and instead try
> to implement the type operators we want in the core grammar's Typename
> production, from which plpgsql will pick it up automatically.  That is
> going to require some other syntax than this.  As I said, I'm not
> particularly pushing the function-like syntax I wrote upthread; but
> I want to see something that is capable of supporting all those features
> and can be extended later if we think of other type operators we want.
> 
>   regards, tom lane
> 
> 

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




Re: [HACKERS] Ordering in guc.c vs. config.sgml vs. postgresql.sample.conf

2016-04-25 Thread Peter Geoghegan
On Mon, Apr 25, 2016 at 9:57 AM, Robert Haas  wrote:
> For myself, I would rather have guc.c in the order that it's in.
> Related options tend to be next to each other, and being able to look
> up and down to see that they are all consistent has value for me.

+1

The GUC autovacuum_work_mem is beside other autovacuum GUCs, not other
RESOURCES_MEM GUCs. track_activity_query_size is beside GUCs that
relate to logging, and yet is also a RESOURCES_MEM GUC. So, neither of
these GUCs would be better placed beside the things that we think of
as RESOURCES_MEM GUCs, such as work_mem. In short, the existing
ordering isn't really so arbitrary.

-- 
Peter Geoghegan




Re: [HACKERS] EXPLAIN VERBOSE with parallel Aggregate

2016-04-25 Thread Robert Haas
On Fri, Apr 22, 2016 at 10:11 PM, David Rowley
 wrote:
> The most basic thing I can think of to rationalise my thinking for this is:
>
> # create table v (v varchar);
> # create view v_v as select lower(v) v from v;
> # explain verbose select upper(v) from v_v;
>  QUERY PLAN
> -
>  Seq Scan on public.v  (cost=0.00..30.40 rows=1360 width=32)
>Output: upper(lower((v.v)::text))
> (2 rows)
>
> It seems that you're proposing that the aggregate equivalent of this should
> be:
>
>  QUERY PLAN
> -
>  Seq Scan on public.v  (cost=0.00..30.40 rows=1360 width=32)
>Output: upper((v)::text)
> (2 rows)
>
> which to me seems wrong, as it's hiding the fact that lower() was called.
>
> My arguments don't seem to be holding much weight, so I'll back down,
> although it would still be interesting to hear what others have to say
> about this.

Yeah, I'd be happy to have more people chime in.  I think your example
is interesting, but it doesn't persuade me, because the rule has
always been that EXPLAIN shows the output *columns*, not the output
*rows*.  The disappearance of some rows doesn't change the list of
output columns.  For scans and joins this rule is easy to apply; for
aggregates, where many rows become one, less so.  Some of the existing
choices there are certainly arguable, like the fact that FILTER is
shown anywhere at all, which seems like an artifact to me.  But I
think that now is not the time to rethink those decisions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Ordering in guc.c vs. config.sgml vs. postgresql.sample.conf

2016-04-25 Thread Robert Haas
On Mon, Apr 25, 2016 at 6:29 AM, Magnus Hagander  wrote:
> guc.c might be better off just sticking to alphabetical order per group.
> (Which we also don't do today, of course, but it could be a better way to
> do it there.)

For myself, I would rather have guc.c in the order that it's in.
Related options tend to be next to each other, and being able to look
up and down to see that they are all consistent has value for me.  If
we put them in alphabetical order, that's likely to be less true.  And
realistically, anybody who is looking for a particular setting is just
going to ask their editor to find it for them, so there's not a lot of
value in alpha order that I can see.

However, if I lose this argument, I will not cry into my beer.  Just
throwing out my $0.02.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [BUGS] Breakage with VACUUM ANALYSE + partitions

2016-04-25 Thread Robert Haas
On Mon, Apr 25, 2016 at 11:56 AM, Andres Freund  wrote:
> On 2016-04-25 08:55:54 -0400, Robert Haas wrote:
>> Andres, this issue has now been open for more than a month, which is
>> frankly kind of ridiculous given the schedule we're trying to hit for
>> beta.  Do you think it's possible to commit something RSN without
>> compromising the quality of PostgreSQL 9.6?
>
> Frankly I'm getting a bit annoyed here too. I posted a patch Friday,
> directly after getting back from pgconf.us. Saturday I posted a patch
> for an issue which I didn't cause, but which caused issues when testing
> the patch in this thread. Sunday I just worked on some small patch - one
> you did want to see resolved too.  It's now 8.30 Monday morning.  What's
> the point of your message?

I think that the point of my message is exactly what I said in my
message.  This isn't really about the last couple of days.  The issue
was reported on March 20th.  On March 31st, Noah asked you for a plan
to get it fixed by April 7th.  You never replied.  On April 16th, the
issue not having been fixed, he followed up.  You said that you would
fix it next week.  That week is now over, and we're into the following
week.  We have a patch, and that's good, and I have reviewed it and
Thom has tested it, and that's good, too.  But it is not clear whether
you feel confident to commit it or when you might be planning to do
that, so I asked.  Given that this is the open item of longest tenure
and that we're hoping to ship beta soon, why is that out of line?

Fundamentally, the choices for each open item are (1) get it fixed
before beta, (2) revert the commit that caused it, (3) decide it's OK
to ship beta with that issue, or (4) slip beta.  We initially had a
theory that the commit that caused this issue merely revealed an
underlying problem that had existed before, but I no longer really
think that's the case.  That commit introduced a new way to write to
blocks that might have in the meantime been removed, and it failed to
make that safe.  That's not to say that md.c doesn't do some wonky
things, but the pre-existing code in the upper layers coped with the
wonkiness and your new code doesn't, so in effect it's a regression.
And in fact I think it's a regression that can be expected to be a
significant operational problem for people if not fixed, because the
circumstances in which it can happen are not very obscure.  You just
need to hold some pending flush requests in your backend-local queue
while some other process truncates the relation, and boom.  So I am
disinclined to option #3.  I also do not think that we should slip
beta for an issue that was reported this far in advance of the planned
beta date, so I am disinclined to option #4.  That leaves #1 and #2.
I assume you will be pretty darned unhappy if we end up at #2, so I am
trying to figure out if we can achieve #1.  OK?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposed change to make cancellations safe

2016-04-25 Thread Tom Lane
Shay Rojansky  writes:
>> Issuing bulk cancellations sounds like a bad plan.

> I'm not sure why, but at the very least it's a useful thing to have when
> batching several statements together and then wanting to cancel them all.

We really do need "cancel up to" semantics for reliable behavior.
Consider the case where the client has sent the query (or thinks it has)
but the server hasn't received it yet.  If the cancel request can arrive
at the server before the query fully arrives, and we don't have "cancel
all messages up through N" semantics, the cancel will not do what the
client expects it to.
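
To picture what that could look like on the wire, here is a hypothetical
sketch of a cancel-request packet extended with a message sequence number
(not an agreed format; the struct and field names are made up):

#include <stdint.h>

/*
 * Hypothetical "cancel up to" request.  Today's cancel packet carries
 * (PID, secret key); this adds the sequence number of the last frontend
 * message whose queries should be cancelled.
 */
typedef struct CancelUpToPacket
{
    uint32_t    packetLen;          /* length of this packet */
    uint32_t    cancelRequestCode;  /* new, distinct magic number */
    uint32_t    backendPID;
    uint32_t    cancelAuthCode;     /* secret key, as today */
    uint32_t    lastMessageSeq;     /* cancel messages 1..N, nothing later */
} CancelUpToPacket;

A request with a non-matching (PID, secret key) pair would be ignored
exactly as now, so the security model is unchanged.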

regards, tom lane




Re: [HACKERS] Proposed change to make cancellations safe

2016-04-25 Thread Shay Rojansky
>
> I definitely agree that simply tracking message sequence numbers on both
>> sides is better. It's also a powerful feature to be able to cancel all
>> messages "up to N" - I'm thinking of a scenario where, for example, many
>> simple queries are sent and the whole process needs to be cancelled.
>>
>
> For security, I think any non-matching cancellation would be ignored so
> that only someone with proper context could issue the cancel.
>

Isn't security taken care of by the secret key?


> Issuing bulk cancellations sounds like a bad plan.
>

I'm not sure why, but at the very least it's a useful thing to have when
batching several statements together and then wanting to cancel them all.


> Yes, this has been happening to some Npgsql users, it's not very frequent
>> but it does happen from time to time. I also bumped into this in some
>> automated testing scenarios. It's not the highest-value feature, but it is
>> an improvement to have if you plan on working on a new protocol version.
>>
>> Let me know if you'd like me to update the TODO.
>>
>
> If you've got an itch, expecting someone else to scratch it is less likely
> to succeed.
>

Sure. I'd consider sending in a patch, but as this is a protocol-changing
feature it seems like working on this before the team "officially" starts
working on a new protocol might be a waste of time. Once there's critical
mass for a new protocol and agreement that PostgreSQL is going for it I'd
be happy to work on it.

BTW it seems it's no longer possible for anyone to add things to the TODO
(no editor privileges).


Re: [HACKERS] Suspicious behaviour on applying XLOG_HEAP2_VISIBLE.

2016-04-25 Thread Andres Freund
On 2016-04-24 22:10:19 +0100, Simon Riggs wrote:
> On 18 April 2016 at 13:15, Simon Riggs  wrote:
> > (non-reply just because of travel)
> >
> > OK, I'll write up a patch today to fix, with a view to backpatching.
> >
> 
> Patch from Tuesday.  On various planes.

I posted a version of this at
http://archives.postgresql.org/message-id/20160424025117.2cmf6ku4ldfcoo44%40alap3.anarazel.de
(because it was blocking me from fixing a newer bug)

I think my approach of a separate record is going to be easier to
backpatch - the new commit record format you took advantage of is
new. If we go your way, this also needs pg_xlogdump/xact_desc() &
logical decoding integration.

Greetings,

Andres Freund




Re: [HACKERS] Updated backup APIs for non-exclusive backups

2016-04-25 Thread Kevin Grittner
On Mon, Apr 25, 2016 at 9:11 AM, Fujii Masao  wrote:

> PostgreSQL resource agent for Pacemaker removes backup_label in
> the case of failover to prevent the standby server from failing
> to recover from the crash.

Yikes!  I hope that the providers of Pacemaker document that they
are using a technique which can silently corrupt the cluster.

http://tbeitr.blogspot.com/2015/07/deleting-backuplabel-on-restore-will.html

> This is not so neat a solution, but they have lived with the
> problem for a long time so far.

It can work sometimes.  It can also produce corruption which will
silently return incorrect results from queries for a long time
before some other symptom of the corruption surfaces.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Robert Haas
On Mon, Apr 25, 2016 at 11:27 AM, Joshua D. Drake  
wrote:
> max_parallel_nodes

I hope you are trolling me.  It does not bound the maximum number of
parallel nodes, but rather the maximum number of workers per parallel
node.  In most cases, a query is only going to have one parallel node,
but sometimes it could have more than one.  In all of the actual cases
tested by me thus far, the workers for one node terminate before we
try to launch workers for another node, but it is theoretically
possible to have a query plan where this isn't the case.  It is
getting a bit irritating to me to people who seem not even to have
read the existing documentation for this GUC, let alone tried it out
and gotten familiar with what it does, are convinced they know how it
should be changed.

On the underlying issue, the max_parallel_degree naming is tied into
the parallel_degree reloption and the parallel_degree field that is
part of each Path.  If you want to rename any of those, you are going
to need to rename them all, I think.  So we could do
max_parallel_degree -> max_parallel_workers_per_executor_node and
parallel_degree -> target_number_of_parallel_workers, but that is
rather longwinded and it's very difficult to shorten those very much
without losing clarity.  I grant that there is some learning curve in
getting familiar with new terms of art, like "parallel degree", but in
the long run it's better to ask people to learn a few new terms than
to trying to repeat the whole description of what the thing is every
time you refer to it.  We've got existing terms of art like
"multixact" or even "wraparound" that are far, far more obscure than
this, and those are entirely PostgreSQL-specific things without a hint
of any parallel in other products.

The concept of parallel degree is not going away, and its reach
extends well beyond the GUC.  We can rename it to some other term, but
this one is pretty standard.  Trying to avoid having a term for it is
not going to work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [BUGS] Breakage with VACUUM ANALYSE + partitions

2016-04-25 Thread Andres Freund
On 2016-04-25 08:55:54 -0400, Robert Haas wrote:
> Andres, this issue has now been open for more than a month, which is
> frankly kind of ridiculous given the schedule we're trying to hit for
> beta.  Do you think it's possible to commit something RSN without
> compromising the quality of PostgreSQL 9.6?

Frankly I'm getting a bit annoyed here too. I posted a patch Friday,
directly after getting back from pgconf.us. Saturday I posted a patch
for an issue which I didn't cause, but which caused issues when testing
the patch in this thread. Sunday I just worked on some small patch - one
you did want to see resolved too.  It's now 8.30 Monday morning.  What's
the point of your message?

Andres




Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Joshua D. Drake

On 04/25/2016 07:08 AM, Bruce Momjian wrote:

On Sun, Apr 24, 2016 at 11:21:01AM +0530, Amit Kapila wrote:

On Sun, Apr 24, 2016 at 9:28 AM, Bruce Momjian  wrote:


Why is the parallelism variable called "max_parallel_degree"?  Is that a
descriptive name?  What does "degree" mean?


It denotes the amount of parallelism at the node level.

Sorry, it was Amit who said the parallelism is per-node, meaning calling
the variable "session" would not work.


max_parallel_nodes

jD







--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.




Re: [HACKERS] Why doesn't src/backend/port/win32/socket.c implement bind()?

2016-04-25 Thread Tom Lane
I wrote:
> Michael Paquier  writes:
>> Not worse, and still not enough... bowerbird complained again:
>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2016-04-25%2002%3A13%3A54

> That's a different symptom that seems unrelated:

> cannot remove directory for 
> C:\prog\bf\root\HEAD\pgsql.build\src\bin\scripts\tmp_check\data_main_21Nw\pgdata\global:
>  Directory not empty at C:/Perl64/lib/File/Temp.pm line 902.

Ah, scratch that, I was taking that as being the cause of the reported
failure but it's just noise, cf <31417.1461595...@sss.pgh.pa.us>.

You're right, we're still getting

# pg_ctl failed; logfile:
LOG:  could not bind IPv4 socket: Permission denied
HINT:  Is another postmaster already running on port 60208? If not, wait a few 
seconds and retry.
WARNING:  could not create listen socket for "127.0.0.1"
FATAL:  could not create any TCP/IP sockets
LOG:  database system is shut down
Bail out!  pg_ctl failed

So the connect() test is inadequate.  Let's try bind() with SO_REUSEADDR
and see whether that makes things better or worse.
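
In other words, probe the port roughly like this (a plain-C sketch of the
idea; the real check lives in the Perl test harness, and whether
SO_REUSEADDR behaves usefully here on Windows is exactly what we're about
to find out):

#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Check whether a TCP port on 127.0.0.1 looks free by bind()ing it. */
static bool
port_looks_free(int port)
{
    int         sock = socket(AF_INET, SOCK_STREAM, 0);
    int         one = 1;
    struct sockaddr_in addr;
    bool        ok;

    if (sock < 0)
        return false;
    /* mirror the postmaster, which also sets SO_REUSEADDR */
    setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons((unsigned short) port);
    ok = (bind(sock, (struct sockaddr *) &addr, sizeof(addr)) == 0);
    close(sock);
    return ok;
}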

regards, tom lane




[HACKERS] Bogus cleanup code in PostgresNode.pm

2016-04-25 Thread Tom Lane
I noticed that even when they are successful, buildfarm members bowerbird
and jacana tend to spew a lot of messages like this in their bin-check
steps:

Can't remove directory 
/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf/pgdata/global:
 Directory not empty at /usr/lib/perl5/5.8/File/Temp.pm line 898
Can't remove directory 
/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf/pgdata/pg_xlog:
 Directory not empty at /usr/lib/perl5/5.8/File/Temp.pm line 898
Can't remove directory 
/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf/pgdata:
 Permission denied at /usr/lib/perl5/5.8/File/Temp.pm line 898
Can't remove directory 
/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf:
 Directory not empty at /usr/lib/perl5/5.8/File/Temp.pm line 898
### Signalling QUIT to 9156 for node "main"
# Running: pg_ctl kill QUIT 9156

What is happening here is that the test script is not bothering to do an
explicit $node->stop operation, and if it doesn't, the automatic cleanup
steps happen in the wrong order: the File::Temp destructor for the temp
data directory runs before PostgresNode.pm's DESTROY function, which is
what's issuing the "pg_ctl kill" command.  On Unix that's just messy,
but on Windows it fails because you can't delete a process's working
directory.  I am not sure whether this is guaranteed wrong or just
sometimes wrong; the Perl docs I can find say that destructors are run in
unspecified order once interpreter shutdown begins.  But by adding some
debug printout I was able to verify on my own machine that the data
directory was already gone when DESTROY runs.

I believe we can fix this by forcing postmaster shutdown in an END
routine instead of a DESTROY routine, and hence propose the attached
patch, which does things in the right order for me.  I'm a pretty
poor Perl programmer, so I'd appreciate somebody vetting this.

regards, tom lane

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index cd2e974..120f3f2 100644
*** a/src/test/perl/PostgresNode.pm
--- b/src/test/perl/PostgresNode.pm
*** sub stop
*** 662,667 
--- 662,668 
  	my $pgdata = $self->data_dir;
  	my $name   = $self->name;
  	$mode = 'fast' unless defined $mode;
+ 	return unless defined $self->{_pid};
  	print "### Stopping node \"$name\" using mode $mode\n";
  	TestLib::system_log('pg_ctl', '-D', $pgdata, '-m', $mode, 'stop');
  	$self->{_pid} = undef;
*** sub get_new_node
*** 883,896 
  	return $node;
  }
  
! # Attempt automatic cleanup
! sub DESTROY
  {
! 	my $self = shift;
! 	my $name = $self->name;
! 	return unless defined $self->{_pid};
! 	print "### Signalling QUIT to $self->{_pid} for node \"$name\"\n";
! 	TestLib::system_log('pg_ctl', 'kill', 'QUIT', $self->{_pid});
  }
  
  =pod
--- 884,896 
  	return $node;
  }
  
! # Attempt automatic cleanup of all created nodes
! sub END
  {
! 	foreach my $node (@all_nodes)
! 	{
! 		$node->teardown_node;
! 	}
  }
  
  =pod



Re: [HACKERS] Updated backup APIs for non-exclusive backups

2016-04-25 Thread David Steele
On 4/24/16 11:49 AM, Stephen Frost wrote:
> * Magnus Hagander (mag...@hagander.net) wrote:
>> On Sun, Apr 24, 2016 at 5:37 AM, Bruce Momjian  wrote:
>>
>>> On Fri, Apr 22, 2016 at 11:53:46AM -0400, Robert Haas wrote:
>>>
 Frankly, I think that's right.  It is one thing to say that the new
 method is preferred - +1 for that.  But the old method is going to
 continue to be used by many people for a long time, and in some cases
 will be superior.  That's not something we can deprecate, unless I'm
 misunderstanding the situation.
>>>
>>> I agree with Robert.  On the one hand we are saying pg_stop_backup()
>>> doesn't work well in psql because you get those two file contents output
>>> that you have to write, and on the other hand we are saying we are going
>>> to deprecate the method that does work well in psql?  I must be missing
>>> something too, as that makes no sense.
>>
>> I don't agree. I don't see how "making a backup using psql" is more
>> important than "making a backup without potentially dangerous side effects".
>> But if others don't agree, could one of you at least provide an example of
>> how you'd like the docs to read in a way that doesn't deprecate the unsafe
>> way but still informs the user properly?
> 
> I'm with Magnus on this, primarily because I've come to understand just
> how dangerous the old backup method is.  That method *should* be
> deprecated and discouraged.  A backup method which means your database
> doesn't restart properly if the system crashes during the backup is
> *bad*.

+1

> Perhaps we can look at improving psql to make it easier to use it for
> the new backup method but, honestly, all these hackish scripts to do
> backups aren't doing a lot of things that a real backup solution needs
> to do.  Improving psql for this is a bit like new paint on the Titanic.

Personally I think we do the users a disservice by implying that backup
is as simple as calling two functions and copying the files.  Oh, and
don't forget to include WAL!

-- 
-David
da...@pgmasters.net





Re: [HACKERS] pg_stat_activity crashes

2016-04-25 Thread Robert Haas
On Thu, Apr 21, 2016 at 9:47 PM, Kyotaro HORIGUCHI
 wrote:
> A lock was already held in BackendPidGetProc(). Is it also
> needless? If so, we should use BackendPidGetProcWithLock() instead
> (the name seems a bit confusing, though).

Oh, that's really sad.  No, that lock is definitely needed.  We should
probably try to figure out some day if there is a way to make this
completely lockless, but that'll have to be 9.7 material or later.
:-(

> What I did in the patch was just extend the lock duration
> until after the proc pointer is read. I didn't add any additional
> lock.

Sorry, I didn't realize that.  Good point.

>> >  The
>> > only thing we need to do is to prevent the value from being read
>> > twice, and we already have precedent for how to prevent that in
>> > freelist.c.
>
> However, I don't have objections for the patch applied.

OK, let's leave it like that for now, then.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] pg_dump - tar format and compression

2016-04-25 Thread Stephen Frost
Greetings,

I'm working through the code coverage challenges for pg_dump and thought
I'd take on an easy one: pg_backup_tar.c.  Unfortunately, that turned
out to be not quite so simple as I had hoped.

As far as I can see, pg_dump's tar format doesn't support either writing
out or reading in compressed files; however, there's a bunch of

#ifdef HAVE_LIBZ

#else
#endif

code in pg_backup_tar.c.

Trying to create a compressed tar backup using:

pg_dump -Ft -Z5 -f test.tar.gz

Results in:

pg_dump: [tar archiver] compression is not supported by tar archive format

Running it without -Z5, then compressing the file and giving it to
pg_restore (even with -Ft) results in:

pg_restore -Ft test.tar.gz
pg_restore: [tar archiver] corrupt tar header found in�'W (expected 0, computed 
63963) file position 512

(without -Ft, you get:

pg_restore test.tar.gz
pg_restore: [archiver] input file does not appear to be a valid archive

).

So, am I missing something here?  The comments at the top of
pg_backup_tar.c indicate that it was copied from pg_backup_file.c, but
if we're not going to support compressed tar files, then we should
really rip that code out; otherwise it's just confusing.

If I'm not missing anything and this is really just dead code, then I'm
willing to look at ripping that code out once I'm through the current
project of providing decent test coverage to pg_dump, but this does seem
like it'd be a perfect project for a relative newcomer to hacking on PG.

Thanks!

Stephen




Re: [HACKERS] Updated backup APIs for non-exclusive backups

2016-04-25 Thread Magnus Hagander
On Mon, Apr 25, 2016 at 4:11 PM, Fujii Masao  wrote:

> On Thu, Apr 21, 2016 at 2:32 AM, Magnus Hagander 
> wrote:
> > On Wed, Apr 20, 2016 at 1:12 AM, Fujii Masao 
> wrote:
> >>
> >> On Sun, Apr 17, 2016 at 1:22 AM, Magnus Hagander 
> >> wrote:
> >> > On Wed, Apr 13, 2016 at 4:07 AM, Noah Misch 
> wrote:
> >> >>
> >> >> On Tue, Apr 12, 2016 at 10:08:23PM +0200, Magnus Hagander wrote:
> >> >> > On Tue, Apr 12, 2016 at 8:39 AM, Noah Misch 
> >> >> > wrote:
> >> >> > > On Mon, Apr 11, 2016 at 11:22:27AM +0200, Magnus Hagander wrote:
> > Yes, it's a bit annoying. But it's something you can handle. Unlike the
> > problems that exist with an exclusive backup which you *cannot* handle from
> > the backup script/commands.
>
> I know many systems which are currently running well in production and
> handling that problem of an exclusive backup. For example, they use
> Pacemaker/Corosync in a shared-disk HA configuration. The PostgreSQL
> resource agent for Pacemaker removes backup_label in the case of failover
> to prevent the standby server from failing to recover from the crash.
> This is not so neat a solution, but they have lived with the problem for
> a long time so far.
>
> I'm NOT against migrating their backup scripts so that the new method is
> used, in the end. On the other hand, I think that we don't need to warn and
> rush them to do that, at least until the new method is polished.
>

We have not put a timeframe on the deprecation. Given that, they can expect
to have multiple releases. And surely they don't use psql in the first
place, but an interface that gives them more control already, don't they?



> >> Another pain in a non-exclusive backup is having to execute both
> >> pg_start_backup() and pg_stop_backup() on the same connection.
> >
> >
> > Pretty sure that's the same one you just listed?
>
> Yes, the sources of those pains are the same.
>
> I wonder if it would be better to store the backup state information in
> shared memory, a catalog, a flat file, or something else instead of local
> memory, so that we can execute pg_stop_backup() in a different session from
> the one in which pg_start_backup() is called. Maybe I'm missing something
> basic, but why do those functions need to be called in the same session at
> all?
>

So that we can automatically terminate backup mode if the client disappears.


> Yes, if you insist on using psql - which is not a good tool for this -
> then
> > yes. But you could for example make it a psql script that does something
> > along
> > SELECT pg_start_backup();
> > \! rsync ...
> > SELECT pg_stop_backup()
> >
> > Or as said above, drive it through a pipe instead. Or use an appropriate
> > client library.
>
> So, what's the appropriate client library for the new backup method?
> It's better to document that.
>


Take your pick. *Any* client library will work fine. libpq, psycopg2, JDBC,
DBD::Pg, npgsql, doesn't matter.

psql is not a client *library*, it's a client.
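
For example, a minimal libpq sketch of driving a non-exclusive backup
(error handling elided; run_copy() is a hypothetical stand-in for whatever
actually copies the data directory):

#include <stdio.h>
#include <libpq-fe.h>

extern int run_copy(void);      /* hypothetical file-copy step */

int
main(void)
{
    PGconn     *conn = PQconnectdb("dbname=postgres");
    PGresult   *res;

    /* third argument false => non-exclusive backup */
    res = PQexec(conn, "SELECT pg_start_backup('label', false, false)");
    PQclear(res);

    run_copy();                 /* conn must stay open the whole time */

    /* returns (lsn, labelfile, spcmapfile); write the last two into
     * backup_label and tablespace_map inside the backup itself */
    res = PQexec(conn,
                 "SELECT labelfile, spcmapfile FROM pg_stop_backup(false)");
    printf("backup_label contents:\n%s\n", PQgetvalue(res, 0, 0));
    PQclear(res);
    PQfinish(conn);
    return 0;
}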



>> How should we write those two values to different files when
> >> we execute pg_stop_backup() via psql? Whole output of
> >> pg_stop_backup() should be written to a transient file and
> >> it should be filtered and written to two different files by
> >> using some Linux commands? This also seems to make the backup
> >> script more complicated.
> >
> >
> > Yes, you would have to do that. And yes, psql is not a good tool for
> this.
> > So the solution is to not force yourself to use a tool that doesn't work
> > well for this problem.
>
> So it's better to document a tool which can work well, or how to use psql
> for the new backup method (i.e., how to create the backup_label and
> tablespace_map files from the output of psql; of course it's also better
> to include an example).
>

We didn't have examples before, did we?

These APIs are directed at *tool developers*. Surely they can be trusted to
be able to write a file.

*End users* should not be using those APIs at all. They should be using
pg_basebackup, or one of the other tools. That's the "bigger documentation
rewrite" that was referenced in the thread, but nobody could commit to
getting it done before beta.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Bruce Momjian
On Sun, Apr 24, 2016 at 11:21:01AM +0530, Amit Kapila wrote:
> On Sun, Apr 24, 2016 at 9:28 AM, Bruce Momjian  wrote:
> >
> > Why is the parallelism variable called "max_parallel_degree"?  Is that a
> > descriptive name?  What does "degree" mean?
> 
> It denotes the amount of parallelism at the node level.

Sorry, it was Amit who said the parallelism is per-node, meaning calling
the variable "session" would not work.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




Re: [HACKERS] Updated backup APIs for non-exclusive backups

2016-04-25 Thread Fujii Masao
On Thu, Apr 21, 2016 at 2:32 AM, Magnus Hagander  wrote:
> On Wed, Apr 20, 2016 at 1:12 AM, Fujii Masao  wrote:
>>
>> On Sun, Apr 17, 2016 at 1:22 AM, Magnus Hagander 
>> wrote:
>> > On Wed, Apr 13, 2016 at 4:07 AM, Noah Misch  wrote:
>> >>
>> >> On Tue, Apr 12, 2016 at 10:08:23PM +0200, Magnus Hagander wrote:
>> >> > On Tue, Apr 12, 2016 at 8:39 AM, Noah Misch 
>> >> > wrote:
>> >> > > On Mon, Apr 11, 2016 at 11:22:27AM +0200, Magnus Hagander wrote:
>> >> > > > Well, if we *don't* do the rewrite before we release it, then we
>> >> > > > have to
>> >> > > > instead put information about the new version of the functions
>> >> > > > into
>> >> > > > the
>> >> > > old
>> >> > > > structure I think.
>> >> > > >
>> >> > > > So I think it's an open issue.
>> >> > >
>> >> > > Works for me...
>> >> > >
>> >> > > [This is a generic notification.]
>> >> > >
>> >> > > The above-described topic is currently a PostgreSQL 9.6 open item.
>> >> > > Magnus,
>> >> > > since you committed the patch believed to have created it, you own
>> >> > > this
>> >> > > open
>> >> > > item.  If that responsibility lies elsewhere, please let us know
>> >> > > whose
>> >> > > responsibility it is to fix this.  Since new open items may be
>> >> > > discovered
>> >> > > at
>> >> > > any time and I want to plan to have them all fixed well in advance
>> >> > > of
>> >> > > the
>> >> > > ship
>> >> > > date, I will appreciate your efforts toward speedy resolution.
>> >> > > Please
>> >> > > present, within 72 hours, a plan to fix the defect within seven
>> >> > > days
>> >> > > of
>> >> > > this
>> >> > > message.  Thanks.
>> >> > >
>> >> >
>> >> > I won't have time to do the bigger rewrite/reordeirng by then, but I
>> >> > can
>> >> > certainly commit to having the smaller updates done to cover the new
>> >> > functionality in less than a week. If nothing else, that'll be
>> >> > something
>> >> > for me to do on the flight over to pgconf.us.
>> >>
>> >> Thanks for that plan; it sounds good.
>> >
>> >
>> > Here's a suggested patch.
>> >
>> > There is some duplication between the non-exclusive and exclusive backup
>> > sections, but I wanted to make sure that each set of instructions can
>> > just
>> > be followed top-to-bottom.
>> >
>> > I've also removed some tips that aren't really necessary as part of the
>> > step-by-step instructions in order to keep things from exploding in
>> > size.
>> >
>> > Finally, I've changed references to "backup dump" to just be "backup",
>> > because it's confusing to call them something with dumps in when it's
>> > not
>> > pg_dump. Enough that I got partially confused myself while editing...
>> >
>> > Comments?
>>
>> +Low level base backups can be made in a non-exclusive or an exclusive
>> +way. The non-exclusive method is recommended and the exclusive one
>> will
>> +at some point be deprecated and removed.
>>
>> I don't object to adding a non-exclusive mode of low-level backup,
>> but I disagree with marking exclusive backups as deprecated, at least
>> until we can alleviate some of the pains that the non-exclusive mode
>> causes.
>
>
> Note that we have not marked them as deprecated. We're just giving warnings
> that they will be deprecated.
>
>>
>>
>> One example of the pain: in a non-exclusive backup, we need to keep
>> the idle connection which was used to execute pg_start_backup()
>> open until the end of the backup. Of course, a backup can take a very
>> long time, and in this case the idle connection also needs to remain
>> for that long. If it's accidentally terminated (e.g., precisely because
>> it is idle), the backup fails and needs to be taken again
>> from the beginning.
>
>
>
> Yes, it's a bit annoying. But it's something you can handle. Unlike the
> problems that exist with an exclusive backup which you *cannot* handle from
> the backup script/commands.

I know many systems which are currently running well in production and
handling that problem of an exclusive backup. For example, they use
Pacemaker/Corosync in a shared-disk HA configuration. The PostgreSQL
resource agent for Pacemaker removes backup_label in the case of failover
to prevent the standby server from failing to recover from the crash.
This is not so neat a solution, but they have lived with the problem for
a long time so far.

I'm NOT against migrating their backup scripts so that the new method is
used, in the end. On the other hand, I think that we don't need to warn and
rush them to do that, at least until the new method is polished.

>> Another pain in a non-exclusive backup is having to execute both
>> pg_start_backup() and pg_stop_backup() on the same connection.
>
>
> Pretty sure that's the same one you just listed?

Yes, the sources of those pains are the same.

I wonder if it would be better to store the backup state information in
shared memory, a catalog, a flat file, or something else instead of local
memory, so that we can execute pg_stop_backup() in a different session from
the one in which pg_start_backup() is called. Maybe I'm missing something
basic, but why do those functions need to be called in the same session at
all?

Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Bruce Momjian
On Mon, Apr 25, 2016 at 12:17:56PM +0200, Magnus Hagander wrote:
> 
> 
> On Mon, Apr 25, 2016 at 5:01 AM, Robert Haas  wrote:
> 
> On Sun, Apr 24, 2016 at 2:42 PM, Tom Lane  wrote:
> > Magnus Hagander  writes:
> >> On Sun, Apr 24, 2016 at 8:23 PM, Tom Lane  wrote:
> >>> FWIW, I agree with Bruce that using "degree" here is a poor choice.
> >>> It's an unnecessary dependence on technical terminology that many
> people
> >>> will not be familiar with.
> >
> >> FWIW, SQL Server calls it "degree of parallelism" as well (
> >> https://technet.microsoft.com/en-us/library/ms188611(v=sql.105).aspx).
> And
> >> their configuration option is "max degree of parallelism":
> >> https://technet.microsoft.com/en-us/library/ms181007(v=sql.105).aspx.
> >
> > Yes, but both they and Oracle appear to consider "degree" to mean the
> > total number of processors used, not the number of secondary jobs in
> > addition to the main one.  The only thing worse than employing obscure
> > technical terminology is employing it incorrectly: that way, you get to
> > confuse both the users who know what it means and those who don't.
> 
> This is not so clear-cut as you are making it out to be.  For example,
> see http://www.akadia.com/services/ora_parallel_processing.html - viz
> "The number of parallel slave processes associated with an operation
> is called its degree of parallelism", which is pretty close to what
> the parameter currently called max_parallel_degree actually does.
> 
> 
> 
> So maybe something like  session_parallel_degree, to add another color to the
> bikeshed?

I think Robert said it is per-executor node, not per session, similar to
work_mem.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




[HACKERS] Re: [COMMITTERS] pgsql: Convert the PGPROC->lwWaitLink list into a dlist instead of open

2016-04-25 Thread Robert Haas
On Thu, Dec 25, 2014 at 11:49 AM, Andres Freund  wrote:
> Convert the PGPROC->lwWaitLink list into a dlist instead of open coding it.
>
> Besides being shorter and much easier to read it changes the logic in
> LWLockRelease() to release all shared lockers when waking up any. This
> can yield some significant performance improvements - and the fairness
> isn't really much worse than before, as we always allowed new shared
> lockers to jump the queue.

Two different colleagues of mine have recently and independently of
each other discovered a major downside of this commit: it breaks the
ability to put an LWLock into a DSM segment, which was a principal
goal of ea9df812d8502fff74e7bc37d61bdc7d66d77a7f.  When that commit
went in, an LWLock only needed to store pointers to PGPROC structure,
which don't move even though the LWLock itself might.  But the dlist
implementation uses a *circular* doubly linked list, which means that
there are not only pointers to the PGPROC in play but also pointers
back to the LWLock itself, and that breaks when DSM segments don't map
in at the same address in all cooperating processes.

I don't know exactly what to do about this, but I think it's a
problem.  In one of the two cases, an LWLock wasn't really the right
data structure anyway; what was really needed was a condition
variable.  But in the other case, the thing that is needed is
precisely an LWLock.  Parallel sequential scan doesn't run afoul of
this restriction because we can get by with spinlocks, but as we
develop more complicated parallel operations (parallel index scan,
parallel bitmap index scan, parallel hash) I think we really need to
have other synchronization primitives available, including LWLocks.
They are a fundamental part of the PostgreSQL synchronization model,
and it's hard to write complex code without them.
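
To make the failure mode concrete, here is a minimal sketch of the address
dependency (illustrative types only, not PostgreSQL's actual ilist.h):

    typedef struct dlist_node
    {
        struct dlist_node *prev;
        struct dlist_node *next;
    } dlist_node;

    typedef struct lwlock_like
    {
        /* ... lock state would go here ... */
        dlist_node  waiters;        /* sentinel of a circular wait list */
    } lwlock_like;

    static void
    waitlist_init(lwlock_like *lock)
    {
        /*
         * An empty circular list is the sentinel pointing at itself, so
         * the lock's own absolute address gets written into the link
         * fields.  If the lock lives in a DSM segment mapped at base
         * address A in one process and B in another, pointers stored by
         * one process are garbage in the other.
         */
        lock->waiters.prev = &lock->waiters;
        lock->waiters.next = &lock->waiters;
    }

One address-independent direction, sketched here only as a possibility, would
be to link waiters by PGPROC array index (a plain integer) instead of by
pointer, since the PGPROC array sits at the same address in every process.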

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics

2016-04-25 Thread Amit Kapila
On Mon, Apr 25, 2016 at 6:04 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Sun, Apr 17, 2016 at 7:32 PM, Amit Kapila 
> wrote:
>
>> On Thu, Apr 14, 2016 at 8:05 AM, Andres Freund 
>> wrote:
>> >
>> > On 2016-04-14 07:59:07 +0530, Amit Kapila wrote:
>> > > What do you want to see by prewarming?
>> >
>> > Prewarming appears to greatly reduce the per-run variance on that
>> > machine, making it a lot easier to get meaningful results.
>> >
>>
>> I think you are referring to the tests done by Robert on a POWER8 machine,
>> but the performance results I have reported were on Intel x86.  In the last
>> two days, I have spent quite some effort on performance testing of this
>> patch with pre-warming, using the same query [1] as used by Robert in his
>> tests.  The tests are done such that we first start the server, pre-warm the
>> relations, run the read-only test, stop the server, and then repeat this for
>> the next test.
>>
>
> What did you include in a single run: a test of a single version (HEAD or
> Patch), or tests of both?
>

A single run includes a single version (either HEAD or Patch).


>
>
>> I have observed that the variance in run-to-run performance still occurs,
>> especially at higher client counts (128).  Below are results for a client
>> count of 128, both when the tests ran first with the patch and then with
>> HEAD, and vice versa.
>>
>> Test-1
>> --
>> client count - 128 (basically -c 128 -j 128)
>>
>> first tests ran with patch and then with HEAD
>>
>> Runs     HEAD (commit 70715e6a)     Patch
>> Run-1    156748                     174640
>> Run-2    151352                     150115
>> Run-3    177940                     165269
>>
>>
>> Test-2
>> --
>> client count - 128 (basically -c 128 -j 128)
>>
>> first tests ran with HEAD and then with patch
>>
>> Runs     HEAD (commit 70715e6a)     Patch
>> Run-1    173063                     151282
>> Run-2    173187                     140676
>> Run-3    177046                     166726
>>
>> I think this patch (padding pgxact) certainly is beneficial, as reported
>> earlier in this thread. At very high client counts some variation in
>> performance is observed with and without the patch, but I feel in general
>> it is a win.
>>
>
> So, what hardware did you use for these tests: POWER8 or x86?
>

x86


> How long was a single run?
>

5 minutes.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] snapshot too old, configured by time

2016-04-25 Thread Alexander Korotkov
On Sat, Apr 23, 2016 at 5:20 PM, Bruce Momjian  wrote:

> On Sat, Apr 23, 2016 at 12:48:08PM +0530, Amit Kapila wrote:
> > On Sat, Apr 23, 2016 at 8:34 AM, Bruce Momjian  wrote:
> > >
> > > I kind of agreed with Tom about just aborting transactions that held
> > > snapshots for too long, and liked the idea this could be set per
> > > session, but the idea that we abort only if a backend actually touches
> > > the old data is very nice.  I can see why the patch author worked hard
> > > to do that.
> > >
> > > How does/did Oracle handle this?
> > >
> >
> > IIRC, Oracle gives this error when the space in the undo tablespace (aka
> > rollback segment) is low.  When the rollback segment gets full, it
> > overwrites the changed data which might be required by some old snapshot,
> > and when that old snapshot's statement tries to access the data (which is
> > already overwritten), it gets a "snapshot too old" error.  Assuming there
> > is enough space in the rollback segment, Oracle seems to provide a way via
> > ALTER SYSTEM SET undo_retention = <value>.
> >
> > Now, if the above understanding of mine is correct, then I think the
> > current implementation done by Kevin is closer to what Oracle provides.
>
> But does the rollback only happen if the long-running Oracle transaction
> tries to _access_ specific data that was in the undo segment, or _any_
> data that potentially could have been in the undo segment?  If the
> later, it seems Kevin's approach is better because you would have to
> actually need to access old data that was there to be canceled, not just
> any data that could have been overwritten based on the xid.
>

I'm not sure that we should rely that much on Oracle's behavior.  It has a
very different MVCC model.  Thus we can't apply the same features one-by-one:
they would have different pros and cons for us.
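
For readers who haven't followed the patch: the heart of the "abort only on
access" behavior is a check made when a scan actually reads a page.  The
sketch below is an approximation from the thread's description; field and
function names are paraphrased, not the committed code:

    static void
    test_for_old_snapshot(Snapshot snapshot, Page page)
    {
        if (old_snapshot_threshold >= 0 &&          /* feature enabled */
            PageGetLSN(page) > snapshot->lsn &&     /* page changed later */
            snapshot->whenTaken < GetOldSnapshotThresholdTimestamp())
            ereport(ERROR,
                    (errcode(ERRCODE_SNAPSHOT_TOO_OLD),
                     errmsg("snapshot too old")));
    }

A long-lived snapshot is therefore harmless until it touches a page that was
actually pruned or rewritten past its horizon, which is exactly the property
Bruce calls out above.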

> Also, it seems we have similar behavior already in applying WAL on the
> standby --- we delay WAL replay when there is a long-running
> transaction.  Once the time expires, we apply the WAL.  Do we cancel the
> long-running transaction at that time, or wait for the long-running
> transaction to touch some WAL we just applied?  If the former, does
> Kevin's new code allow us to do the latter?


That makes sense to me.  If we could improve read-only queries on standbys
this way, Kevin's new code becomes much easier to justify.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] pgsql: Add putenv support for msvcrt from Visual Studio 2013

2016-04-25 Thread Christian Ullrich

* Magnus Hagander wrote:

> On Sun, Apr 24, 2016 at 9:56 PM, Christian Ullrich wrote:
>
>> * Magnus Hagander wrote:
>>
>>> Add putenv support for msvcrt from Visual Studio 2013
>>> http://git.postgresql.org/pg/commitdiff/9f633b404cb3be6139f8dfdea00538489ffef9ab
>>
>> Just noticed something. This DLL detection by name has never worked in
>> debug builds where the DLL names end in "d". Is that important?
>
> That's an interesting point.  I guess our release builds are never with
> debugging info - but could it make the buildfarm "wrong"?
>
> Fixing it should probably be as easy as trying each DLL with the specified
> name and also with a "d" suffix?

I think so, yes.

Personally, I would have expected that at least the debug/release DLLs 
of a single CRT version would somehow share their environment, but I 
tried it and they don't.
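
A minimal sketch of the probing Magnus suggests, given that finding: since a
debug CRT keeps an environment separate from its release twin, the variable
has to be pushed into every CRT DLL actually loaded.  The helper name and the
base-name list here are hypothetical:

    #include <windows.h>
    #include <stdio.h>

    typedef int (__cdecl *PUTENVPROC) (const char *);

    static void
    putenv_into_loaded_crts(const char *envval)
    {
        const char *bases[] = {"msvcrt", "msvcr120"};  /* one per CRT version */
        const char *suffixes[] = {"", "d"};            /* release, then debug */

        for (int b = 0; b < 2; b++)
            for (int s = 0; s < 2; s++)
            {
                char        dllname[64];
                HMODULE     hmod;
                PUTENVPROC  putenvFunc;

                snprintf(dllname, sizeof(dllname), "%s%s.dll",
                         bases[b], suffixes[s]);
                /* GetModuleHandleA does not load the DLL; it finds it only
                 * if something in this process has already loaded it. */
                hmod = GetModuleHandleA(dllname);
                if (hmod == NULL)
                    continue;
                putenvFunc = (PUTENVPROC) GetProcAddress(hmod, "_putenv");
                if (putenvFunc != NULL)
                    putenvFunc(envval);
            }
    }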


--
Christian







Re: [HACKERS] pgsql: Add putenv support for msvcrt from Visual Studio 2013

2016-04-25 Thread Magnus Hagander
On Sun, Apr 24, 2016 at 9:56 PM, Christian Ullrich 
wrote:

> * Magnus Hagander wrote:
>
> Add putenv support for msvcrt from Visual Studio 2013
>>
>> This was missed when VS 2013 support was added.
>>
>> Michael Paquier
>>
>> Branch
>> --
>> master
>>
>> Details
>> ---
>>
>> http://git.postgresql.org/pg/commitdiff/9f633b404cb3be6139f8dfdea00538489ffef9ab
>>
>
> Just noticed something. This DLL detection by name has never worked in
> debug builds where the DLL names end in "d". Is that important?


That's an interesting point.  I guess our release builds are never with
debugging info - but could it make the buildfarm "wrong"?

Fixing it should probably be as easy as trying each DLL with the specified
name and also with a "d" suffix?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] [BUGS] Breakage with VACUUM ANALYSE + partitions

2016-04-25 Thread Thom Brown
On 22 April 2016 at 18:07, Andres Freund  wrote:
> On 2016-04-14 11:09:29 +0530, Abhijit Menon-Sen wrote:
>> At 2016-04-12 09:00:57 -0400, robertmh...@gmail.com wrote:
>> >
>> > On Mon, Apr 11, 2016 at 1:17 PM, Andres Freund  wrote:
>> > >
>> > > 3) Actually handle the case of the last open segment not being
>> > >RELSEG_SIZE properly in _mdfd_getseg() - mdnblocks() does so.
>> >
>> > #3 seems like it's probably about 15 years overdue, so let's do that
>> > anyway.
>>
>> I don't have anything useful to contribute, I'm just trying to
>> understand this fix.
>>
>> _mdfd_getseg() looks like this:
>>
>>   targetseg = blkno / ((BlockNumber) RELSEG_SIZE);
>>   for (nextsegno = 1; nextsegno <= targetseg; nextsegno++)
>>   {
>>       Assert(nextsegno == v->mdfd_segno + 1);
>>
>>       if (v->mdfd_chain == NULL)
>>       {
>>           /*
>>            * Normally we will create new segments only if authorized by the
>>            * caller (i.e., we are doing mdextend()). […]
>>            */
>>           if (behavior == EXTENSION_CREATE || InRecovery)
>>           {
>>               if (_mdnblocks(reln, forknum, v) < RELSEG_SIZE)  /* mdextend */
>>                   v->mdfd_chain = _mdfd_openseg(reln, forknum,
>>                                                 nextsegno, O_CREAT);
>>           }
>>           else
>>           {
>>               /* We won't create segment if not existent */
>>               v->mdfd_chain = _mdfd_openseg(reln, forknum, nextsegno, 0);
>>           }
>>           if (v->mdfd_chain == NULL)
>>               /* return NULL or ERROR */
>>       }
>>       v = v->mdfd_chain;
>>   }
>>
>> Do I understand correctly that the case of the "last open segment"
>> (i.e., the one for which mdfd_chain == NULL) not being RELSEG_SIZE
>> (i.e., _mdnblocks(reln, forknum, v) < RELSEG_SIZE) should not call
>> _mdfd_openseg on nextsegno if behavior is not EXTENSION_CREATE or
>> InRecovery?
>>
>> And that "We won't create segment if not existent" should happen, but
>> doesn't only because the next segment file wasn't removed earlier, so
>> we have to add an extra check for that case?
>>
>> In other words, is something like the following what's needed here, or
>> is there more to it?
>
> It's a bit more complicated than that, but that's the principle. Thanks
> for the patch, and sorry for my late response. See my attached version
> for a more fleshed out version of this.
>
> Looking at the code around this I found:
> * _mdfd_getseg(), in contrast to mdnblocks(), doesn't check the segment
>   size: it neither refuses to continue past a too-small segment, nor
>   errors out on a too-big one
> * For (IMO pretty bad) reasons, in mdsync() we currently rely on
>   EXTENSION_RETURN_NULL to extend the file in recovery, and to set errno
>   to ENOENT. Brrr.
> * we currently FATAL if a segment is too big - I've copied that
>   behaviour, but why not just ERROR out?
>
> The attached patch basically adds the segment size checks to
> _mdfd_getseg(), and doesn't perform extension, even in recovery, if
> EXTENSION_REALLY_RETURN_NULL is passed.
>
> This fixes the issue for me, both in the originally reported variant,
> and in some more rudely setup version (i.e. rm'ing files).

Fixes it for me too.

Thom
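
For readers following along, the shape of the check Andres describes might
look roughly like the fragment below, continuing the _mdfd_getseg() snippet
quoted earlier in the thread.  This is a hedged sketch of the idea only, not
the attached patch:

    if (v->mdfd_chain == NULL)
    {
        BlockNumber nblocks = _mdnblocks(reln, forknum, v);

        if (nblocks > ((BlockNumber) RELSEG_SIZE))
            elog(FATAL, "segment too big");     /* pre-existing behaviour */

        /*
         * A short non-final segment means the rest of the relation is
         * missing; refuse to create or skip past it unless extension was
         * explicitly requested (or we're in recovery and the caller did
         * not pass EXTENSION_REALLY_RETURN_NULL).
         */
        if (nblocks < ((BlockNumber) RELSEG_SIZE) &&
            behavior != EXTENSION_CREATE &&
            !(InRecovery && behavior != EXTENSION_REALLY_RETURN_NULL))
            return NULL;        /* or ERROR, depending on behavior */
    }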




Re: [HACKERS] [BUGS] Breakage with VACUUM ANALYSE + partitions

2016-04-25 Thread Robert Haas
On Fri, Apr 22, 2016 at 5:26 PM, Robert Haas  wrote:
> On Fri, Apr 22, 2016 at 1:07 PM, Andres Freund  wrote:
>> The attached patch basically adds the segment size checks to
>> _mdfd_getseg(), and doesn't perform extension, even in recovery, if
>> EXTENSION_REALLY_RETURN_NULL is passed.
>>
>> This fixes the issue for me, both in the originally reported variant,
>> and in some more rudely setup version (i.e. rm'ing files).
>
> I think this looks broadly reasonable.  Some specific comments:
>
> +            /*
> +             * Normally we will create new segments only if authorized by
> +             * the caller (i.e., we are doing mdextend()).  But when doing
> +             * WAL recovery, create segments anyway; this allows cases
> +             * such as replaying WAL data that has a write into a
> +             * high-numbered segment of a relation that was later deleted.
> +             * We want to go ahead and create the segments so we can
> +             * finish out the replay.
>
> Add something like: "However, if the caller has specified
> EXTENSION_REALLY_RETURN_NULL, then extension is not desired even in
> recovery; we won't reach this point in that case."
>
> +            errno = ENOENT;        /* some callers check errno, yuck */
>
> I think a considerably more detailed comment would be warranted.
>
> +            else
> +            {
> +                ereport(ERROR,
> +                        (errcode_for_file_access(),
> +                         errmsg("could not open file \"%s\" (target block %u): previous segment is only %u blocks",
> +                                _mdfd_segpath(reln, forknum, nextsegno),
> +                                blkno, nblocks)));
> +            }
>
> The else and its ensuing level of indentation are unnecessary.

Andres, this issue has now been open for more than a month, which is
frankly kind of ridiculous given the schedule we're trying to hit for
beta.  Do you think it's possible to commit something RSN without
compromising the quality of PostgreSQL 9.6?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Confusing comment in pg_upgrade in regards to VACUUM FREEZE

2016-04-25 Thread Craig Ringer
On 18 April 2016 at 22:46, Alvaro Herrera  wrote:


> Nothing to do with that.  The VACUUM FREEZE is executed on the new
> database before migrating the old data over; it's there so that the
> existing data has no trace of any permanent "normal" Xids from the
> original counter.


Right. That makes sense, and explains why nobody's been screaming in horror
as pg_upgrade takes six weeks to slowly freeze their tables ;)

The new cluster has rows created by initdb etc. whose visibility information
only makes sense in the context of the new cluster's control state, clog, and
so on, and we're about to clobber that. So those rows must be frozen.

Thanks. That feels very obvious in retrospect.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics

2016-04-25 Thread Alexander Korotkov
On Sun, Apr 17, 2016 at 7:32 PM, Amit Kapila 
wrote:

> On Thu, Apr 14, 2016 at 8:05 AM, Andres Freund  wrote:
> >
> > On 2016-04-14 07:59:07 +0530, Amit Kapila wrote:
> > > What do you want to see by prewarming?
> >
> > Prewarming appears to greatly reduce the per-run variance on that
> > machine, making it a lot easier to get meaningful results.
> >
>
> I think you are referring to the tests done by Robert on a POWER8 machine,
> but the performance results I have reported were on Intel x86.  In the last
> two days, I have spent quite some effort on performance testing of this
> patch with pre-warming, using the same query [1] as used by Robert in his
> tests.  The tests are done such that we first start the server, pre-warm the
> relations, run the read-only test, stop the server, and then repeat this for
> the next test.
>

What did you include in a single run: a test of a single version (HEAD or
Patch), or tests of both?


> I have observed that the variance in run-to-run performance still occurs,
> especially at higher client counts (128).  Below are results for a client
> count of 128, both when the tests ran first with the patch and then with
> HEAD, and vice versa.
>
> Test-1
> --
> client count - 128 (basically -c 128 -j 128)
>
> first tests ran with patch and then with HEAD
>
> Runs     HEAD (commit 70715e6a)     Patch
> Run-1    156748                     174640
> Run-2    151352                     150115
> Run-3    177940                     165269
>
>
> Test-2
> --
> client count - 128 (basically -c 128 -j 128)
>
> first tests ran with HEAD and then with patch
>
> Runs     HEAD (commit 70715e6a)     Patch
> Run-1    173063                     151282
> Run-2    173187                     140676
> Run-3    177046                     166726
>
> I think this patch (padding pgxact) certainly is beneficial, as reported
> earlier in this thread. At very high client counts some variation in
> performance is observed with and without the patch, but I feel in general
> it is a win.
>

So, what hardware did you use for these tests: POWER8 or x86? How long was a
single run?
Per-run variation seems quite high.  It also seems that it depends on which
version runs first.  But that could be a coincidence.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Proposed change to make cancellations safe

2016-04-25 Thread Craig Ringer
On 24 April 2016 at 23:11, Tom Lane  wrote:


> Have you seen this to be a problem in practice, or is it just
> theoretical?  I do not recall many, if any, field complaints
> about the issue.
>

It's caused pain for me when working with JDBC in the past.

If libpq gets pipelined query support in the future, it'll become much more
noticeable. Right now libpq won't be too bothered.
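
To see the race from the client side, consider the standard libpq
cancellation calls (these are real libpq functions; the helper wrapping them
is just for illustration):

    #include <stdio.h>
    #include <libpq-fe.h>

    static void
    cancel_current_query(PGconn *conn)
    {
        PGcancel   *cancel = PQgetCancel(conn);
        char        errbuf[256];

        if (cancel == NULL)
            return;

        /*
         * The request names only the backend (via its cancel key), not a
         * particular query.  If the query we meant to cancel finishes and
         * another begins before the request arrives, the new query gets
         * cancelled instead; that is the hazard this thread is about.
         */
        if (!PQcancel(cancel, errbuf, sizeof(errbuf)))
            fprintf(stderr, "cancel request failed: %s\n", errbuf);
        PQfreeCancel(cancel);
    }

With a pipelined client the window grows, because several queries may be
queued behind the one the cancel was aimed at.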

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Updated backup APIs for non-exclusive backups

2016-04-25 Thread Craig Ringer
On 24 April 2016 at 23:49, Stephen Frost  wrote:


>
> Fixing that means using something more complicated than the old method
> and that's a bit of a pain in psql, but that doesn't mean we should tell
> people that the old method is an acceptable approach.
>

+1

Frankly, I don't find doing backups with psql this way interesting. I'm a
bit astonished that anyone's doing it, really. Most of the time, if I'm doing
anything more complicated than pg_basebackup, I've got more flexible tools
at hand than psql anyway.

\gset should be a workaround for anyone who really wants to do this in psql
for some reason.
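
For context, the non-exclusive flow under discussion looks roughly like this
from C; the SQL signatures match the new 9.6 API, while the surrounding
helper is only a sketch:

    #include <libpq-fe.h>

    static int
    run_nonexclusive_backup(PGconn *conn)
    {
        PGresult   *res;

        res = PQexec(conn, "SELECT pg_start_backup('label', false, false)");
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            PQclear(res);
            return -1;
        }
        PQclear(res);

        /* ... copy the data directory while this session stays open ... */

        res = PQexec(conn,
                     "SELECT labelfile, spcmapfile FROM pg_stop_backup(false)");
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            PQclear(res);
            return -1;
        }
        /* write PQgetvalue(res, 0, 0) into the backup as backup_label */
        PQclear(res);
        return 0;
    }

The awkward part for psql is exactly the middle: the session must stay open
across the file copy, and the label file returned at the end has to be
captured, which is where \gset comes in.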

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Geoff Winkless
On 25 April 2016 at 03:44, Robert Haas  wrote:
> On Sun, Apr 24, 2016 at 2:23 PM, Tom Lane  wrote:
>> FWIW, I agree with Bruce that using "degree" here is a poor choice.
>> It's an unnecessary dependence on technical terminology that many people
>> will not be familiar with.
>
> And many others will.  Some made-up term that is entirely
> PostgreSQL-specific is not going to be better.

Just to add my 2c, "degree" implies some sort of graded scale. So
setting it to (say) 10 would be "maximum", setting it to 0 would be
"none", and setting it to anything in between would be relative to the
maximum.

E.g. in Vol. 26 of the "Encyclopedia of Computer Science and Technology" (the
first compsci reference that appeared in a Google search) there are three
levels of granularity of degrees of parallelism.

https://books.google.co.uk/books?id=z4KECsT59NwC&pg=PA41&lpg=PA41&dq=degree+of+parallelism

Frankly it seems that the SQL crowd stole the computer science term
and applied it incorrectly.

Having a configuration option "_workers" makes much more sense to me.
It absolutely describes what it does without needing to refer to a
manual, and it removes ambiguity.

Geoff




Re: [HACKERS] Ordering in guc.c vs. config.sgml vs. postgresql.sample.conf

2016-04-25 Thread Alexander Korotkov
On Sun, Apr 24, 2016 at 9:58 PM, Andres Freund  wrote:

> While working on
>
> http://www.postgresql.org/message-id/cabuevezwma9y+bp4fi4fe4hmpfzmjozomulvtbhhpwtcujr...@mail.gmail.com
> I was once more taken aback by the total lack of consistency between the
> three files in $subject. Some of the inconsistency of guc.c vs. the rest
> comes from having separate lists for different datatypes - hard to avoid
> - but even disregarding that, there seems to be little to no
> consistency.
>
> How about we try to order them the same? That's obviously not a 9.6
> topic at this point, but it'd probably be good to do that early in 9.7.
>

+1

Also, what do you think about a validation script which checks consistency
among them?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-04-25 Thread Craig Ringer
On 24 March 2016 at 01:14, Daniel Verite  wrote:


>
> It provides a useful mitigation to dump/reload databases having
> rows in the 1GB-2GB range, but it works under these limitations:
>
> - no single field has a text representation exceeding 1GB.
> - no row as text exceeds 2GB (\copy from fails beyond that. AFAICS we
>   could push this to 4GB with limited changes to libpq, by
>   interpreting the Int32 field in the CopyData message as unsigned).


This seems like worthwhile mitigation for an issue multiple people have hit
in the wild, and more will.

Giving Pg more generally graceful handling for big individual datums is
going to be a bit of work, though. Support for wide-row, big-Datum COPY in
and out. Efficient lazy fetching of large TOASTed data by follow-up client
requests. Range fetching of large compressed TOASTed values (possibly at
the price of worse compression) without having to decompress the whole
thing up to the start of the desired range. Lots of fun.

At least we have lob / pg_largeobject.
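
As a sketch of that last point, the large-object API already supports
streaming a value far bigger than 1GB in chunks (loid is assumed to be an
existing large object's OID, and the call must run inside a transaction):

    #include <libpq-fe.h>
    #include <libpq/libpq-fs.h>     /* INV_READ */

    static long
    stream_large_object(PGconn *conn, Oid loid)
    {
        char        buf[8192];
        long        total = 0;
        int         n;
        int         fd = lo_open(conn, loid, INV_READ);

        if (fd < 0)
            return -1;
        while ((n = lo_read(conn, fd, buf, sizeof(buf))) > 0)
            total += n;             /* process buf[0..n-1] here */
        lo_close(conn, fd);
        return total;
    }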

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Ordering in guc.c vs. config.sgml vs. postgresql.sample.conf

2016-04-25 Thread Magnus Hagander
On Sun, Apr 24, 2016 at 8:58 PM, Andres Freund  wrote:

> Hi,
>
> While working on
>
> http://www.postgresql.org/message-id/cabuevezwma9y+bp4fi4fe4hmpfzmjozomulvtbhhpwtcujr...@mail.gmail.com
> I was once more taken aback by the total lack of consistency between the
> three files in $subject. Some of the inconsistency of guc.c vs. the rest
> comes from having separate lists for different datatypes - hard to avoid
> - but even disregarding that, there seems to be little to no
> consistency.
>
> How about we try to order them the same? That's obviously not a 9.6
> topic at this point, but it'd probably be good to do that early in 9.7.
>

Agreed, at least between the documentation and postgresql.conf.sample.
That's also the order users are likely to look at them in.

For guc.c, it might be better to just stick to alphabetical order per group.
(Which we also don't do today, of course, but it could be a better way to do
it there.)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] checkpoint_flush_after documentation inconsistency

2016-04-25 Thread Magnus Hagander
On Mon, Apr 25, 2016 at 6:57 AM, Fujii Masao  wrote:

> On Mon, Apr 25, 2016 at 4:27 AM, Andres Freund  wrote:
> > On 2016-04-21 11:20:38 -0700, Andres Freund wrote:
> >> On 2016-04-18 14:33:28 +0900, Fujii Masao wrote:
> >> > On Fri, Apr 15, 2016 at 6:56 PM, Magnus Hagander wrote:
> >> > > The documentation says that the default value is 128kB on Linux,
> >> > > but the code says it's 256kB.
> >> > >
> >> > > Not sure which one is correct, but the other one should be updated :)
> >> > > I'm guessing it's a copy/paste mistake in the docs, but since I'm not
> >> > > sure, I'm not just pushing a fix.
> >> >
> >> > I think you're right.
> >> >
> >> > I also found several other small problems regarding the XXX_flush_after
> >> > parameters.
> >>
> >> Thanks for finding these; once I'm back home/office from pgconf.nyc
> >> (tonight / tomorrow) I'll push a fix.
> >
> > Pushed a fix, could you check that you're ok with the result?
>
> The fix looks good to me. Thanks!
>
>
+1, looks good.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Rename max_parallel_degree?

2016-04-25 Thread Magnus Hagander
On Mon, Apr 25, 2016 at 5:01 AM, Robert Haas  wrote:

> On Sun, Apr 24, 2016 at 2:42 PM, Tom Lane  wrote:
> > Magnus Hagander  writes:
> >> On Sun, Apr 24, 2016 at 8:23 PM, Tom Lane  wrote:
> >>> FWIW, I agree with Bruce that using "degree" here is a poor choice.
> >>> It's an unnecessary dependence on technical terminology that many people
> >>> will not be familiar with.
> >
> >> FWIW, SQL Server calls it "degree of parallelism" as well
> >> (https://technet.microsoft.com/en-us/library/ms188611(v=sql.105).aspx). And
> >> their configuration option is "max degree of parallelism":
> >> https://technet.microsoft.com/en-us/library/ms181007(v=sql.105).aspx.
> >
> > Yes, but both they and Oracle appear to consider "degree" to mean the
> > total number of processors used, not the number of secondary jobs in
> > addition to the main one.  The only thing worse than employing obscure
> > technical terminology is employing it incorrectly: that way, you get to
> > confuse both the users who know what it means and those who don't.
>
> This is not so clear-cut as you are making it out to be.  For example,
> see http://www.akadia.com/services/ora_parallel_processing.html - viz
> "The number of parallel slave processes associated with an operation
> is called its degree of parallelism", which is pretty close to what
> the parameter currently called max_parallel_degree actually does.
>


So maybe something like session_parallel_degree, to add another color to
the bikeshed?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-04-25 Thread Michael Paquier
On Sat, Apr 23, 2016 at 9:58 AM, Bruce Momjian  wrote:
> On Thu, Mar  3, 2016 at 10:31:26AM +0900, Michael Paquier wrote:
>> On Thu, Mar 3, 2016 at 12:47 AM, Alvaro Herrera
>>  wrote:
>> > Well, the CopyData message has an Int32 field for the message length.
>> > I don't know the FE/BE protocol very well but I suppose each row
>> > corresponds to one CopyData message, or perhaps each column corresponds
>> > to one CopyData message.  In either case, it's not possible to go beyond
>> > 2GB without changing the protocol ...
>>
>> Based on what I know of this stuff (remnants of the libpq OOM work and
>> other things), one 'd' message means one row. fe-protocol3.c and
>> CopySendEndOfRow in the backend's copy.c confirm that as well. I am
>> indeed afraid that having extra logic to get chunks of data will
>> require extending the protocol with a new message type for this
>> purpose.
>
> Is there any documentation that needs updating based on this research?

Perhaps. In the docs, the two sections referring to the CopyData
messages are in protocol.sgml:
- this portion for the 'd' message itself: "CopyData (F & B)"
- and a more precise description under "COPY Operations".
We could state in one of them that the maximum size of a CopyData
message is 1GB. Thoughts?
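
For reference, the framing that imposes the limit; this struct describes the
wire format from the protocol docs and is not a declaration from the source
tree:

    #include <stdint.h>

    /*
     * CopyData ('d'): a one-byte type, then a signed Int32 length that
     * counts itself plus the payload.  One message carries one whole row
     * in COPY TO, so a row's text form is bounded by the signed length;
     * in practice the 1GB allocation limit bites first.
     */
    typedef struct CopyDataMsg
    {
        char        type;       /* 'd' */
        int32_t     length;     /* includes these 4 bytes, not the type */
        /* (length - 4) bytes of row data follow on the wire */
    } CopyDataMsg;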
-- 
Michael




Re: [HACKERS] VS 2015 support in src/tools/msvc

2016-04-25 Thread Michael Paquier
On Mon, Apr 25, 2016 at 4:29 AM, Christian Ullrich  wrote:
> Andrew wrote:
>> OK, here's my final version of the patch, which I will apply in 24 hours
>> or so unless there is an objection.

Thanks Andrew for the updated patch!

> This one doesn't matter, but just for perfection's sake:
>
> +#if (_MSC_VER >= 1900)
> +   uint32  cp;
> +   WCHAR   wctype[80];
> +
> +   memset(wctype, 0, 80 * sizeof(WCHAR));
> +   MultiByteToWideChar(CP_ACP, 0, ctype, -1, wctype, 80);
>
> The maximum length is documented as 85 characters, also:
>
> 'Note   Your application must use the constant [LOCALE_NAME_MAX_LENGTH] for
> the maximum locale name length, instead of hard-coding the value "85".'

Just an addition on top of the comments of Christian..

+#pragma warning(push)
+#pragma warning(disable : 4091)
 #include <dbghelp.h>
+#pragma warning(pop)

It seems to me that we had better protect those pragmas with _MSC_VER >= 1900.
The crash dump facility could be used by MinGW as well, no?
-- 
Michael

