Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-13 Thread Mithun Cy
On Mon, Nov 14, 2016 at 11:31 AM, Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:
> PGconn->target_server_type is not freed in freePGconn().

Thanks, will fix in new patch.

> Could you add PGTARGETSERVERTYPE environment variable?  Like other
> variables, it will ease testing, since users can change the behavior
> without altering the connection string here and there.

Okay, will add one.

> I think it would be better to expose the server state via ParameterStatus
> protocol message like standard_conforming_strings, instead of running
> "SELECT pg_is_in_recovery()".  We shouldn't want to add one round trip to
> check the server type (master, standby).  postmaster can return the server
> type based on its state (pmState); PM_RUN is master, and PM_HOT_STANDBY is
> standby.  In addition, as an impractical concern, DBA can revoke EXECUTE
> privilege on pg_is_in_recovery() from non-superusers.

If you are suggesting a change to the protocol messages, I think that would
not be backward compatible with older servers, and I doubt protocol changes
at that level would be accepted. With the connection status
CONNECTION_SETENV, used for protocol version 2.0 setup, we already send a
query like "select pg_catalog.pg_client_encoding()" for the same purpose.
So I think using "SELECT pg_is_in_recovery()" should be fine.
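For reference, a minimal illustration of the check as it runs today:

    SELECT pg_is_in_recovery();   -- returns 'f' on the primary, 't' on a hot standby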
-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-13 Thread Mithun Cy
On Fri, Nov 11, 2016 at 7:33 PM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:
> We tend to use the term "primary" instead of "master".

Thanks, I will use primary instead of master in my next patch.

>Will this work with logical replication?

I have not tested with logical replication. Currently we identify the
primary to connect to based on the result of "SELECT pg_is_in_recovery()",
so I think it works. Do you want me to test a particular setup?

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] Re: [PATCH] PostgresNode.pm enhancements, pg_lsn helper, and some more recovery tests

2016-11-13 Thread Craig Ringer
On 14 November 2016 at 15:01, Craig Ringer  wrote:

> These three are related enough, and all only touch the TAP framework,
> so I've bundled them in a series.

CF entry https://commitfest.postgresql.org/12/883/

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] [PATCH] PostgresNode.pm enhancements, pg_lsn helper, and some more recovery tests

2016-11-13 Thread Craig Ringer
Every test I write with the TAP framework for recovery seems to need
to wait for one node to catch up to another or examine replication
slot state. So: attached is a patch to add helper methods for these
tasks.

Also add a new pg_lsn.pm helper class to deal with PostgreSQL's
awkward representation of LSNs in perl. Made extra fun by our use of
Perl 5.8.8 with no new modules, so we can't rely on having 64-bit
integers. Provides sensible LSN comparisons. I'll be using this in
coming patches that have to make assertions about differences between
LSNs, and I think it's sensible to have anyway. Incorporates tests for
the class.

Finally, add some coverage of physical replication slots to recovery tests.

Backpatch to 9.6 desirable, since we seem to be doing that for TAP
infrastructure.

These three are related enough, and all only touch the TAP framework,
so I've bundled them in a series.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From f01790bb3f47f7fabf64328f8c9a01e8f3cf837c Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 9 Nov 2016 13:44:04 +0800
Subject: [PATCH 3/3] Expand streaming replication tests to cover hot standby
 feedback and physical replication slots

---
 src/test/recovery/t/001_stream_rep.pl | 105 +-
 1 file changed, 104 insertions(+), 1 deletion(-)

diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 5ce69bb..ef29892 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 4;
+use Test::More tests => 22;
 
 # Initialize master node
 my $node_master = get_new_node('master');
@@ -58,3 +58,106 @@ is($node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
 	3, 'read-only queries on standby 1');
 is($node_standby_2->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
 	3, 'read-only queries on standby 2');
+
+diag "switching to physical replication slot";
+# Switch to using a physical replication slot. We can do this without a new
+# backup since physical slots can go backwards if needed. Do so on both
+# standbys. Since we're going to be testing things that affect the slot state,
+# also increase the standby feedback interval to ensure timely updates.
+my ($slotname_1, $slotname_2) = ('standby_1', 'standby_2');
+$node_master->append_conf('postgresql.conf', "max_replication_slots = 4\n");
+$node_master->restart;
+is($node_master->psql('postgres', qq[SELECT pg_create_physical_replication_slot('$slotname_1');]), 0, 'physical slot created on master');
+$node_standby_1->append_conf('recovery.conf', "primary_slot_name = $slotname_1\n");
+$node_standby_1->append_conf('postgresql.conf', "wal_receiver_status_interval = 1\n");
+$node_standby_1->append_conf('postgresql.conf', "max_replication_slots = 4\n");
+$node_standby_1->restart;
+is($node_standby_1->psql('postgres', qq[SELECT pg_create_physical_replication_slot('$slotname_2');]), 0, 'physical slot created on intermediate replica');
+$node_standby_2->append_conf('recovery.conf', "primary_slot_name = $slotname_2\n");
+$node_standby_2->append_conf('postgresql.conf', "wal_receiver_status_interval = 1\n");
+$node_standby_2->restart;
+
+sub get_slot_xmins
+{
+	my ($node, $slotname) = @_;
+	my $slotinfo = $node->slot($slotname);
+	return ($slotinfo->{'xmin'}, $slotinfo->{'catalog_xmin'});
+}
+
+# There's no hot standby feedback and there are no logical slots on either peer
+# so xmin and catalog_xmin should be null on both slots.
+my ($xmin, $catalog_xmin) = get_slot_xmins($node_master, $slotname_1);
+is($xmin, '', 'non-cascaded slot xmin null with no hs_feedback');
+is($catalog_xmin, '', 'non-cascaded slot xmin null with no hs_feedback');
+
+($xmin, $catalog_xmin) = get_slot_xmins($node_standby_1, $slotname_2);
+is($xmin, '', 'cascaded slot xmin null with no hs_feedback');
+is($catalog_xmin, '', 'cascaded slot xmin null with no hs_feedback');
+
+# Replication still works?
+$node_master->safe_psql('postgres', 'CREATE TABLE replayed(val integer);');
+
+sub replay_check
+{
+	my $newval = $node_master->safe_psql('postgres', 'INSERT INTO replayed(val) SELECT coalesce(max(val),0) + 1 AS newval FROM replayed RETURNING val');
+	$node_master->wait_for_catchup($node_standby_1);
+	$node_standby_1->wait_for_catchup($node_standby_2);
+	$node_standby_1->safe_psql('postgres', qq[SELECT 1 FROM replayed WHERE val = $newval])
+		or die "standby_1 didn't replay master value $newval";
+	$node_standby_2->safe_psql('postgres', qq[SELECT 1 FROM replayed WHERE val = $newval])
+		or die "standby_2 didn't replay standby_1 value $newval";
+}
+
+replay_check();
+
+diag "enabling hot_standby_feedback";
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.

Re: [HACKERS] 9.6 TAP tests and extensions

2016-11-13 Thread Craig Ringer
On 27 October 2016 at 00:42, Robert Haas  wrote:
> On Wed, Oct 26, 2016 at 7:17 AM, Andres Freund  wrote:
>> On 2016-09-23 16:04:32 -0400, Tom Lane wrote:
>>> Looking back over the thread, I see that you also proposed installing
>>> isolationtester and pg_isolation_regress for the benefit of extensions.
>>> I'm very much less excited about that idea.  It'd be substantially more
>>> dead weight in typical installations, and I'm not sure that it'd be useful
>>> to common extensions, and I'm not eager to treat isolationtester's API
>>> and behavior as something we need to hold stable for extension use.
>>
>> FWIW, I'd be quite happy if it were installed. Running isolationtester
>> when compiling extensions against distribution postgres packages would
>> be quite useful.
>
> +1.

Patch attached.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 40735041753c48d282f3af58482992b8da00c96d Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Mon, 14 Nov 2016 14:51:09 +0800
Subject: [PATCH] Install isolationtester

Per discussion in 
---
 src/Makefile| 1 +
 src/test/isolation/Makefile | 9 +
 2 files changed, 10 insertions(+)

diff --git a/src/Makefile b/src/Makefile
index 977f80b..d4aa06b 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -27,6 +27,7 @@ SUBDIRS = \
 	pl \
 	makefiles \
 	test/regress \
+	test/isolation \
 	test/perl
 
 # There are too many interdependencies between the subdirectories, so
diff --git a/src/test/isolation/Makefile b/src/test/isolation/Makefile
index 3d272d5..613309e 100644
--- a/src/test/isolation/Makefile
+++ b/src/test/isolation/Makefile
@@ -66,3 +66,12 @@ installcheck-prepared-txns: all temp-install
 
 check-prepared-txns: all temp-install
 	./pg_isolation_regress --temp-instance=./tmp_check $(TEMP_CONF) $(EXTRA_REGRESS_OPTS) --inputdir=$(srcdir) --schedule=$(srcdir)/isolation_schedule prepared-transactions
+
+install: all installdirs
+	$(INSTALL_PROGRAM) isolationtester$(X) '$(DESTDIR)$(pgxsdir)/$(subdir)/isolationtester$(X)'
+
+installdirs:
+	$(MKDIR_P) '$(DESTDIR)$(pgxsdir)/$(subdir)'
+
+uninstall:
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/isolationtester$(X)'
-- 
2.5.5




Re: [HACKERS] [PATCH] Allow TAP tests to be run individually

2016-11-13 Thread Craig Ringer
On 11 November 2016 at 18:13, Michael Paquier  wrote:
> On Fri, Nov 11, 2016 at 6:10 PM, Craig Ringer  wrote:
>> Please backpatch to at least 9.6 since it's trivial and we seem to be
>> doing that for TAP. 9.5 and 9.4 would be nice too :)
>
> Yes please!

No immediate takers, so adding to CF.

I've taken the liberty of adding you as a reviewer based on your
response and the simplicity of the patch. If you get the chance to
test and verify it, please set it to Ready for Committer.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Declarative partitioning - another take

2016-11-13 Thread Amit Langote
On 2016/11/04 0:49, Robert Haas wrote:
> On Thu, Nov 3, 2016 at 7:46 AM,   wrote:
>> On 2016-10-28 07:53, Amit Langote wrote:
>>> @@ -6267,6 +6416,12 @@ ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel,
>>>  * Validity checks (permission checks wait till we have the column
>>>  * numbers)
>>>  */
>>> +   if (pkrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>>> +   ereport(ERROR,
>>> +   (errcode(ERRCODE_WRONG_OBJECT_TYPE),
>>> +errmsg("cannot reference relation \"%s\"",
>>> +   RelationGetRelationName(pkrel)),
>>> +errdetail("Referencing partitioned tables in foreign key constraints is not supported.")));
>>
>> Is there a plan for fixing this particular limitation?  It's a pretty
>> serious problem for users, and the suggested workaround (to create a
>> separate non-partitioned table which carries only the PK columns, which is
>> updated by triggers, and direct the FKs to it instead of to the partitioned
>> table) is not only a very ugly one, but also very slow.
> 
> If you have two compatibly partitioned tables, and the foreign key
> matches the partitioning keys, you could implement a foreign key
> between the two tables as a foreign key between each pair of matching
> partitions.  Otherwise, isn't the only way to handle this a global
> index?

I am assuming you don't mean a global index (on partitioned tables) as in
some new kind of monolithic physical structure that implements the
constraint across tables (partitions), right?  I'm thinking you mean a
collection of btree indexes on individual partitions, with the key of each
index matching the partition key of the parent, created internally as part
of creating the same index on the parent.  In fact, such indexes would be
created and maintained much like inherited attributes and constraints are.
That would require quite a bit of new infrastructure.  We did discuss the
possibility of such a feature being implemented on top of declarative
partitioning, but not in version 1 [1].
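For illustration, a rough sketch of the per-pair approach mentioned above,
assuming both tables are range-partitioned on customer_id and each
customers_pN partition has a unique key on it (all names hypothetical):

    ALTER TABLE orders_p1 ADD FOREIGN KEY (customer_id)
        REFERENCES customers_p1 (customer_id);
    ALTER TABLE orders_p2 ADD FOREIGN KEY (customer_id)
        REFERENCES customers_p2 (customer_id);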

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CA%2BTgmoZZMfcf16YaHuhP1Vk%3Dj8PDFeHCvfj%2BFJQd%2BeFhs%2B7P8A%40mail.gmail.com






Re: [HACKERS] Fix checkpoint skip logic on idle systems by tracking LSN progress

2016-11-13 Thread Michael Paquier
On Sat, Nov 12, 2016 at 9:01 PM, Andres Freund  wrote:
> On 2016-11-11 16:42:43 +0900, Michael Paquier wrote:
>
>> + * This takes also
>> + * advantage to avoid 8-byte torn reads on some platforms by using the
>> + * fact that each insert lock is located on the same cache line.
>
> Something residing on the same cache line doesn't provide that guarantee
> on all platforms.

OK. Let's remove it then.

>> + * XXX: There is still room for more improvements here, particularly
>> + * WAL operations related to unlogged relations (INIT_FORKNUM) should not
>> + * update the progress LSN as those relations are reset during crash
>> + * recovery so enforcing buffers of such relations to be flushed for
>> + * example in the case of a load only on unlogged relations is a waste
>> + * of disk write.
>
> Commit records still have to be written, everything else doesn't write
> WAL. So I'm doubtful this matters much?

Hm, okay. In most cases this may not matter... Let's rip that out.

>> @@ -997,6 +1022,24 @@ XLogInsertRecord(XLogRecData *rdata, XLogRecPtr fpw_lsn)
>>   inserted = true;
>>   }
>>
>> + /*
>> +  * Update the LSN progress positions. At least one WAL insertion lock
>> +  * is already taken appropriately before doing that, and it is simpler
>> +  * to do that here when the WAL record data and type are at hand.
>
> But we don't use the "WAL record data and type"?

Yes, at some point this patch did so...

>> + * GetProgressRecPtr -- Returns the newest WAL activity position, or in
>> + * other words any activity not referring to standby logging or segment
>> + * switches. Finding the last activity position is done by scanning each
>> + * WAL insertion lock by taking directly the light-weight lock associated
>> + * to it.
>> + */
>
> I'd prefer not to list the specific records here - that's just
> guaranteed to get out of date. Why not say something "any activity not
> requiring a checkpoint to be triggered" or such?

OK. Makes sense to minimize maintenance.

>> +  * If this isn't a shutdown or forced checkpoint, and if there has been no
>> +  * WAL activity, skip the checkpoint.  The idea here is to avoid inserting
>> +  * duplicate checkpoints when the system is idle. That wastes log space,
>> +  * and more importantly it exposes us to possible loss of both current and
>> +  * previous checkpoint records if the machine crashes just as we're writing
>> +  * the update.
>
> Shouldn't this mention archiving and also that we also ignore some forms
> of WAL activity?

I have reworded that as:
"If this isn't a shutdown or forced checkpoint, and if there has been
no WAL activity requiring a checkpoint, skip it."

>> -/* Should the in-progress insertion log the origin? */
>> -static bool include_origin = false;
>> +/* status flags of the in-progress insertion */
>> +static uint8 status_flags = 0;
>
> that seems a bit too generic a name. 'curinsert_flags'?

OK.

>>   /*
>> -  * only log if enough time has passed and some xlog record has
>> -  * been inserted.
>> +  * Only log if enough time has passed, that some WAL activity
>> +  * has happened since last checkpoint, and that some new WAL
>> +  * records have been inserted since the last time we came here.
>
> I think that sentence needs some polish.

Let's do this better:
/*
-* only log if enough time has passed and some xlog record has
-* been inserted.
+* Only log if all of the following conditions are satisfied since
+* the last time we came here:
+* - timeout has been reached.
+* - WAL activity has happened since last checkpoint.
+* - New WAL records have been inserted.
 */

>>*/
>>   if (now >= timeout &&
>> - last_snapshot_lsn != GetXLogInsertRecPtr())
>> + GetLastCheckpointRecPtr() < current_progress_lsn &&
>> + last_progress_lsn < current_progress_lsn)
>>   {
>
> Hm. I don't think it's correct to use GetLastCheckpointRecPtr() here?
> Don't we need to do the comparisons here (and when doing the checkpoint
> itself) with the REDO pointer of the last checkpoint?

Hm? The progress pointer points to the last inserted LSN, which is not the
position of the REDO pointer, but that of the checkpoint record. Comparing
against the REDO pointer would be a moot operation, because as the
checkpoint completes, the progress LSN is updated as well. Or do you mean
that the progress LSN should *not* be updated for a checkpoint record? It
seems to me that it should, but...

>> diff --git a/src/backend/postmaster/checkpointer.c 
>> b/src/backend/postmaster/checkpointer.c
>> 

Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-13 Thread Tsunakawa, Takayuki
Hi, Mithun

Before going deeper into the patch, let me give you some findings.

(1)
PGconn->target_server_type is not freed in freePGconn().


(2)
Could you add PGTARGETSERVERTYPE environment variable?  Like other variables, 
it will ease testing, since users can change the behavior without altering the 
connection string here and there.


(3)
I think it would be better to expose the server state via ParameterStatus 
protocol message like standard_conforming_strings, instead of running "SELECT 
pg_is_in_recovery()".  We shouldn't want to add one round trip to check the 
server type (master, standby).  postmaster can return the server type based on 
its state (pmState); PM_RUN is master, and PM_HOT_STANDBY is standby.  In 
addition, as an impractical concern, DBA can revoke EXECUTE privilege on 
pg_is_in_recovery() from non-superusers.

Regards
Takayuki Tsunakawa




Re: [HACKERS] Declarative partitioning - another take

2016-11-13 Thread Amit Langote

I forgot to quote your comments in the email I sent on Friday [1], with
new patches that do take care of the following comments.

On 2016/11/11 4:04, Robert Haas wrote:
> On Thu, Nov 10, 2016 at 7:40 AM, Amit Langote
>>
>> OK, "partition key" and "partitioning method" it is then.  Source code
>> comments, error messages, variables call the latter (partitioning)
>> "strategy" though which hopefully is fine.
> 
> Oh, I like "partitioning strategy".  Can we standardize on that?

Done.

>> OK, I have removed the syntactic ability to specify INCLUSIVE/EXCLUSIVE
>> with each of the range bounds.
>>
>> I haven't changed any code (such as comparison functions) that manipulates
>> instances of PartitionRangeBound which has a flag called inclusive.  I
>> didn't remove the flag, but is instead just set to (is_lower ? true :
>> false) when initializing from the parse node. Perhaps, there is some scope
>> for further simplifying that code, which you probably alluded to when you
>> proposed that we do this.
> 
> Yes, you need to rip out all of the logic that supports it.  Having
> the logic to support it but not the syntax is bad because then that
> code can't be properly tested.

Agreed and done.  Now the only thing that dictates the inclusivity of
individual range bounds during comparisons (with other range bounds or the
partition key of tuples) is whether the bound is a lower bound or not;
inclusive if it is, exclusive if not.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/8d7c35e3-1c85-33d0-4440-0a75bf9d31cd%40lab.ntt.co.jp






Re: [HACKERS] Fix checkpoint skip logic on idle systems by tracking LSN progress

2016-11-13 Thread Michael Paquier
On Mon, Nov 14, 2016 at 12:49 PM, Kyotaro HORIGUCHI
 wrote:
> At Sat, 12 Nov 2016 10:28:56 +0530, Amit Kapila  
> wrote in 
>> I think it is good to check the performance impact of this patch on
>> many core m/c.  Is it possible for you to once check with Alexander
>> Korotkov to see if he can provide you access to his powerful m/c which
>> has 70 cores (if I remember correctly)?

I heard about a number like that, and there is no reason not to do the
tests to be sure. With that many cores we are more likely to see the
limited number of XLOG insert slots pop up as a bottleneck, but that's just
an assumption without any numbers. Alexander (added in CC), could it be
possible to get access to this machine?

>> @@ -1035,6 +1038,7 @@ LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks)
>>   XLogBeginInsert();
>>   XLogRegisterData((char *) &xlrec, offsetof(xl_standby_locks, locks));
>>   XLogRegisterData((char *) locks, nlocks * sizeof(xl_standby_lock));
>> + XLogSetFlags(XLOG_NO_PROGRESS);
>>
>>
>> Is it right to set XLOG_NO_PROGRESS flag in LogAccessExclusiveLocks?
>> This function is called not only in LogStandbySnapshot(), but during
>> DDL operations as well where lockmode >= AccessExclusiveLock.
>
> This does not remove any record from WAL. So theoretically any kind of
> record can be marked NO_PROGRESS, as long as checkpoints are not
> unreasonably suppressed in practice. Any explicit database operation must
> be accompanied by at least a commit record, which triggers a checkpoint.
> NO_PROGRESSing there doesn't seem to me to harm database durability for
> this reason.
>
> The objective of this patch is skipping WAL on a completely idle system,
> and the NO_PROGRESSing is necessary for it to do its work. Of course we
> could distinguish exclusive locks with PROGRESS from those without, but
> that is unnecessary complexity.

The point that applies here is that logging the exclusive lock information
is necessary for *standby* recovery conflicts, not for the primary, which
is why it should not influence the checkpoint activity happening on the
primary. So marking this record with NO_PROGRESS is actually fine to me.

> But rethinking the above, the naming "XLOG_NO_PROGRESS" might be
> inappropriate. "XLOG_NO_CKPT_TRIGGER" or some saner name would be needed.

I have grown fond of NO_PROGRESS over time, to be honest, even if I don't
much like the negative meaning it carries... Perhaps something like
XLOG_SKIP_PROGRESS would hold more meaning.
-- 
Michael




Re: [HACKERS] Minor improvement to delete.sgml

2016-11-13 Thread Etsuro Fujita

On 2016/10/19 2:51, Robert Haas wrote:
> On Fri, Oct 14, 2016 at 12:05 AM, Etsuro Fujita wrote:
>> I think it's better to mention that an alias is needed for the target table
>> specified in the USING clause of a DELETE statement, to set up a self-join,
>> as the documentation on the from_list parameter of UPDATE does.  Please find
>> attached a patch.
>
> The statement you are proposing to add to the documentation isn't true.


Consider a counterexample of DELETE doing a self-join of a target table:

postgres=# create table t1 (c1 int);
CREATE TABLE
postgres=# insert into t1 values (1);
INSERT 0 1
postgres=# delete from t1 using t1 where t1.c1 = t1.c1;
ERROR:  table name "t1" specified more than once

Giving an alias to the target table t1 in the USING clause,

postgres=# delete from t1 using t1 r1 where t1.c1 = r1.c1;
DELETE 1

Am I missing something?

Sorry for the delay.

Best regards,
Etsuro Fujita






Re: [HACKERS] Fix checkpoint skip logic on idle systems by tracking LSN progress

2016-11-13 Thread Kyotaro HORIGUCHI
Hello, thank you for the comment.

At Sat, 12 Nov 2016 10:28:56 +0530, Amit Kapila  wrote 
in 
> On Tue, Nov 8, 2016 at 5:18 PM, Kyotaro HORIGUCHI
>  wrote:
> > Hello,
> >
> >> on something else than LW_EXCLUSIVE compared to now. To keep things
> >> more simple I' would still favor using WALInsertLocks for this patch,
> >> that looks more consistent, and also because there is no noticeable
> >> difference.
> >
> > Ok, the patch looks fine. So there's nothing for me but to accept
> > the current shape since the discussion about performance seems
> > not to be settled with out performance measurement with machines
> > with many cores.
> >
> 
> I think it is good to check the performance impact of this patch on
> many core m/c.  Is it possible for you to once check with Alexander
> Korotkov to see if he can provide you access to his powerful m/c which
> has 70 cores (if I remember correctly)?
> 
> 
> @@ -1035,6 +1038,7 @@ LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks)
>   XLogBeginInsert();
>   XLogRegisterData((char *) &xlrec, offsetof(xl_standby_locks, locks));
>   XLogRegisterData((char *) locks, nlocks * sizeof(xl_standby_lock));
> + XLogSetFlags(XLOG_NO_PROGRESS);
> 
> 
> Is it right to set XLOG_NO_PROGRESS flag in LogAccessExclusiveLocks?
> This function is called not only in LogStandbySnapshot(), but during
> DDL operations as well where lockmode >= AccessExclusiveLock.

This does not remove any record from WAL. So theoretically any kind of
record can be marked NO_PROGRESS, as long as checkpoints are not
unreasonably suppressed in practice. Any explicit database operation must
be accompanied by at least a commit record, which triggers a checkpoint.
NO_PROGRESSing there doesn't seem to me to harm database durability for
this reason.

The objective of this patch is skipping WAL on a completely idle system,
and the NO_PROGRESSing is necessary for it to do its work. Of course we
could distinguish exclusive locks with PROGRESS from those without, but
that is unnecessary complexity.


But rethinking the above, the naming "XLOG_NO_PROGRESS" might be
inappropriate. "XLOG_NO_CKPT_TRIGGER" or some saner name would be needed.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] sequences and pg_upgrade

2016-11-13 Thread Peter Eisentraut
On 11/2/16 2:34 AM, Michael Paquier wrote:
> I had a look at those fresh patches, and 0001 looks like a good thing.
> This makes the separation between sequences and table data dump
> cleaner. I ran some tests with pg_upgrade and 0002, and things are
> clear. And +1 for the way done in the patch, aka no options of pg_dump
> exposed to user, still keep the option tracking as a separate value.

Committed, thanks.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-13 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
> Great, committed.  There's still potentially more work to be done here,
> because my patch omits some features that were present in Victor's original
> submission, like setting the failover timeout, optionally randomizing the
> order of the hosts, and distinguishing between master and standby servers;
> Victor, or anyone, please feel free to submit separate patches for those
> things.

The attached patch fixes some bugs and adds a clarification to the docs.
Could you check and test the authentication stuff, as I don't have an
environment at hand?

(1) fe-connect.c
There was a memory leak.

(2) fe-secure_openssl.c, fe-auth.c
GSSAPI/SSPI/SSL authentication requires the target host name, but the code uses 
conn->pghost which contains a comma-separated list of host names.

(3) libpq.sgml
Added sentences to clarify connect_timeout when it is used with multiple hosts.

BTW, let me ask two questions.

Q1: Is there any reason why hostaddr doesn't accept multiple IP addresses?

Q2: pg_isready (and hence the PQping() family) reports success when one host
is alive and the other hosts are down.  Is this intended?  I think this
behavior is correct.
e.g. host1 is down and host2 is alive.
$ pg_isready -h host1,host2
host1,host2:5450 - accepting connections
$ echo $?
0

Regards
Takayuki Tsunakawa



libpq-failover-smallbugs.patch
Description: libpq-failover-smallbugs.patch



Re: [HACKERS] Physical append-only tables

2016-11-13 Thread Alvaro Herrera
Magnus Hagander wrote:

> But then consider the same table. Except rows are typically updated once or
> twice when they are new, and *then* go read only. And we also have a
> process that at some point deletes *some* old rows (but not all - in fact,
> only a small portion).
> 
> In this case, the next INSERT once VACUUM has run is likely to stick a
> "new" row somewhere very "far back" in the table, since there is now free
> space there. This more or less completely ruins the BRIN index usability,
> as the "old" blocks will now contain a single row from a "new" series.

Yeah.  When we initially discussed BRIN, there was a mention of allowing
a BRIN index to guide new tuple location -- something like
auto-clustering, if you will.  We haven't discussed the exact details
but I think something along those lines is worth considering.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PATCH: two slab-like memory allocators

2016-11-13 Thread Tomas Vondra

On 11/12/2016 08:12 PM, Andres Freund wrote:

> Hi,
>
> Subject: [PATCH 1/2] slab allocator
>
> diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
> index 6ad7e7d..520f295 100644
> --- a/src/backend/replication/logical/reorderbuffer.c
> +++ b/src/backend/replication/logical/reorderbuffer.c
>
> I'd rather have that in a separate followup commit...
>
> + * IDENTIFICATION
> + *   src/backend/utils/mmgr/slab.c
> + *
> + *
> + * The constant allocation size allows significant simplification and various
> + * optimizations that are not possible in AllocSet. Firstly, we can get rid
> + * of the doubling and carve the blocks into chunks of exactly the right size
> + * (plus alignment), not wasting memory.
>
> References to AllocSet aren't necessarily a good idea, they'll quite
> possibly get out of date. The argument can quite easily be made
> without referring to a concrete reference to behaviour elsewhere.

Yeah, that's probably true.

> + *
> + * At the context level, we use 'freelist' array to track blocks grouped by
> + * number of free chunks. For example freelist[0] is a list of completely full
> + * blocks, freelist[1] is a list of blocks with a single free chunk, etc.
>
> Hm. Those arrays are going to be quite large for small allocations w/
> big blocks (an imo sensible combination). Maybe it'd make more sense to
> model it as a linked list of blocks? Full blocks are at one end, empty
> ones at the other?

So there'd be one huge list of blocks, sorted by the number of empty
chunks? Hmm, that might work I guess.

I don't think the combination of large blocks with small allocations is
particularly sensible, though - what exactly would be the benefit of
such a combination? I would even consider enforcing some upper limit on
the number of chunks per block - say, 256, for example.

> + * About MEMORY_CONTEXT_CHECKING:
> + *
> + * Since we usually round request sizes up to the next power of 2, there
> + * is often some unused space immediately after a requested data area.
>
> I assume the "round up" stuff is copy-paste?

Yeah, sorry about that.

> + * Thus, if someone makes the common error of writing past what they've
> + * requested, the problem is likely to go unnoticed ... until the day when
> + * there *isn't* any wasted space, perhaps because of different memory
> + * ...
> + *
> + * See also the cooperating Valgrind client requests in mcxt.c.
>
> I think we need a preliminary patch moving a lot of this into something
> like mcxt_internal.h. Copying this comment, and a lot of the utility
> functions, into every memory context implementation is a bad pattern.

Yes, I planned to do that for the next version of the patch. Laziness.

> +typedef struct SlabBlockData *SlabBlock;   /* forward reference */
> +typedef struct SlabChunkData *SlabChunk;
>
> Can we please not continue hiding pointers behind typedefs? It's a bad
> pattern, and that it's fairly widely used isn't a good excuse to
> introduce further usages of it.

Why is it a bad pattern?

> +/*
> + * SlabContext is a specialized implementation of MemoryContext.
> + */
> +typedef struct SlabContext
> +{
> +   MemoryContextData header;   /* Standard memory-context fields */
> +   /* Allocation parameters for this context: */
> +   Size    chunkSize;      /* chunk size */
> +   Size    fullChunkSize;  /* chunk size including header and alignment */
> +   Size    blockSize;      /* block size */
> +   int     chunksPerBlock; /* number of chunks per block */
> +   int     minFreeChunks;  /* min number of free chunks in any block */
> +   int     nblocks;        /* number of blocks allocated */
> +   /* Info about storage allocated in this context: */
> +   SlabBlock   freelist[1];    /* free lists (block-level) */
>
> I assume this is a variable-length array? If so, that a) should be
> documented b) use FLEXIBLE_ARRAY_MEMBER as length - not doing so
> actually will cause compiler warnings and potential misoptimizations.

Will fix, thanks.

> +/*
> + * SlabBlockData
> + * Structure of a single block in SLAB allocator.
> + *
> + * slab: context owning this block
>
> What do we need this for?

You're right that the pointer to the owning context is unnecessary - there's
nothing like a "standard block header" and we already have the pointer in
the standard chunk header. But maybe keeping the pointer at least with
MEMORY_CONTEXT_CHECKING would be a good idea?

> + * prev, next: used for doubly-linked list of blocks in global freelist
>
> I'd prefer using an embedded list here (cf. ilist.h).

Makes sense.

> +/*
> + * SlabChunk
> + * The prefix of each piece of memory in an SlabBlock
> + *
> + * NB: this MUST match StandardChunkHeader as defined by utils/memutils.h.
> + * However it's possible to have fields in front of the StandardChunkHeader
> + * fields, which are used to add a pointer to the block.
> + */


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-13 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Ashutosh Bapat
> I have changed some comments around this block. See attached patch.
> Let me know if that looks good.

Thanks, it looks good.

Regards
Takayuki Tsunakawa




Re: [HACKERS] ExplainOneQuery_hook ignored for EXPLAIN EXECUTE

2016-11-13 Thread Tom Lane
Andres Freund  writes:
> On 2016-11-13 17:20:05 -0500, Tom Lane wrote:
>> Why do you care?  It's a pretty specialized hook.

> Citus currently uses it to output additional information for distributed
> queries. I suspect we'll instead, for now, have to intercept EXPLAIN as
> a whole then :( (moving to custom plans is unfortunately still a bit off).

Seems like you might want a hook further down then, perhaps at
ExplainPrintPlan.

regards, tom lane




Re: [HACKERS] ExplainOneQuery_hook ignored for EXPLAIN EXECUTE

2016-11-13 Thread Andres Freund
On 2016-11-13 17:20:05 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > I don't quite know what the hook in $subject was originally designed
> > for, but right now it has the problem that it's not invoked for EXPLAIN
> > EXECUTE.  That's because ExplainExecuteQuery directly calls
> > ExplainOnePlan instead of ExplainOneQuery_hook.
>
> > Unfortunately that's not entirely trivial to fix, because the hook
> > accepts a Query, not a PlannedStmt.
>
> AFAIR, the purpose of that hook was to let index advisor plugins have
> control of the planning step in an EXPLAIN, so that they could do things
> like injecting hypothetical indexes.  There isn't any easy way to do
> something similar in EXPLAIN EXECUTE because the plan comes out of the
> plancache.

> Why do you care?  It's a pretty specialized hook.

Citus currently uses it to output additional information for distributed
queries. I suspect we'll instead, for now, have to intercept EXPLAIN as
a whole then :( (moving to custom plans is unfortunately still a bit off).

Andres




Re: [HACKERS] ExplainOneQuery_hook ignored for EXPLAIN EXECUTE

2016-11-13 Thread Tom Lane
Andres Freund  writes:
> I don't quite know what the hook in $subject was originally designed
> for, but right now it has the problem that it's not invoked for EXPLAIN
> EXECUTE.  That's because ExplainExecuteQuery directly calls
> ExplainOnePlan instead of ExplainOneQuery_hook.

> Unfortunately that's not entirely trivial to fix, because the hook
> accepts a Query, not a PlannedStmt.

AFAIR, the purpose of that hook was to let index advisor plugins have
control of the planning step in an EXPLAIN, so that they could do things
like injecting hypothetical indexes.  There isn't any easy way to do
something similar in EXPLAIN EXECUTE because the plan comes out of the
plancache.

Why do you care?  It's a pretty specialized hook.

regards, tom lane




Re: [HACKERS] Logical Replication WIP

2016-11-13 Thread Steve Singer

On 10/31/2016 06:38 AM, Petr Jelinek wrote:
> On 31/10/16 00:52, Steve Singer wrote:
>
> There are some fundamental issues with initial sync that need to be
> discussed on list but this one is not known. I'll try to convert this
> to a test case (seems like a useful one) and fix it, thanks for the
> report. In the meantime I realized I broke the last patch in the series
> during rebase, so attached is the fixed version. It also contains the
> type info in the protocol.




Attached are some proposed documentation updates (to be applied on top of
your 20161031 patch set).


Also


  Publication


+  
+The tables are matched using fully qualified table name. Renaming of
+tables or schemas is not supported.
+  

Is renaming of tables any less supported than other DDL operations?
For example

alter table nokey2 rename to nokey3
select * FROM pg_publication_tables ;
 pubname | schemaname | tablename
-++---
 tpub| public | nokey3
(1 row)


If I then kill the postmaster on my subscriber and restart it, I get

2016-11-13 16:17:11.341 EST [29488] FATAL:  the logical replication 
target public.nokey3 not found
2016-11-13 16:17:11.342 EST [29272] LOG:  worker process: logical 
replication worker 41076 (PID 29488) exited with exit code 1
2016-11-13 16:17:16.350 EST [29496] LOG:  logical replication apply for 
subscription nokeysub started
2016-11-13 16:17:16.358 EST [29498] LOG:  logical replication sync for 
subscription nokeysub, table nokey2 started
2016-11-13 16:17:16.515 EST [29498] ERROR:  table public.nokey2 not 
found on publisher
2016-11-13 16:17:16.517 EST [29272] LOG:  worker process: logical 
replication worker 41076 sync 24688 (PID 29498) exited with exit code 1


but if I then rename the table on the subscriber everything seems to work.

(I suspect the need to kill+restart is a bug, I've seen other instances 
where a hard restart of the subscriber following changes to is required)



I am also having issues adding a table to a publication; it doesn't seem
to work.


P: create publication tpub for table a;
S: create subscription mysub connection 'host=localhost dbname=test 
port=5440' publication tpub;

P: insert into a(b) values ('1');
P: alter publication tpub add table b;
P: insert into b(b) values ('1');
P: insert into a(b) values ('2');


select * FROM pg_publication_tables ;
 pubname | schemaname | tablename
-++---
 tpub| public | a
 tpub| public | b


but


S:  select * FROM b;
 a | b
---+---
(0 rows)
S: select * FROM a;
 a | b
---+---
 5 | 1
 6 | 2
(2 rows)


diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
new file mode 100644
index 63af1a5..89723c3
*** a/doc/src/sgml/logical-replication.sgml
--- b/doc/src/sgml/logical-replication.sgml
***
*** 95,110 
  how the table is accessed. Each table can be added to multiple
  publications if needed.  Publications may currently only contain
  tables. Objects must be added explicitly, except when a publication
! is created for ALL TABLES. There is no default name for
! a publication which specifies all tables.


  Publications can choose to limit the changes they produce to show
  any combination of INSERT, UPDATE and
  DELETE in a similar way to the way triggers are fired by
! particular event types. Only tables with a REPLICA IDENTITY
! index can be added to a publication which replicates UPDATE
! and DELETE operation.


  The definition of a publication object will be included within
--- 95,110 
  how the table is accessed. Each table can be added to multiple
  publications if needed.  Publications may currently only contain
  tables. Objects must be added explicitly, except when a publication
! is created for ALL TABLES.


  Publications can choose to limit the changes they produce to show
  any combination of INSERT, UPDATE and
  DELETE in a similar way to the way triggers are fired by
! particular event types. If a table without a REPLICA IDENTITY
! index is added to a publication which replicates UPDATE or
! DELETE operations then subsequent UPDATE
! or DELETE operations will fail on the publisher.


  The definition of a publication object will be included within
***
*** 195,214 


  A conflict will produce an error and will stop the replication; it
! must be resolved manually by the user.


! The resolution can be done either by changing data on the subscriber
! so that it does not conflict with incoming change or by skipping the
! transaction that conflicts with the existing data. The transaction
! can be skipped by calling the
! 
! pg_replication_origin_advance() function
! with a node_name corresponding to the subscription name. The
! current position of origins can be seen in the
! 
! 

Re: [HACKERS] Improving RLS planning

2016-11-13 Thread Tom Lane
Stephen Frost  writes:
> * Dean Rasheed (dean.a.rash...@gmail.com) wrote:
>> +1 for that terminology and no renaming of fields.

> Agreed.

Here's an updated version of the RLS planning patch that gets rid of
the incorrect interaction with Query.hasRowSecurity and adjusts
terminology as agreed.

regards, tom lane



new-rls-planning-implementation-2.patch.gz
Description: new-rls-planning-implementation-2.patch.gz



Re: [HACKERS] Why PostgreSQL doesn't implement a semi sync replication?

2016-11-13 Thread 余森彬



On 2016/11/12 11:27 AM, Craig Ringer wrote:

> On 12 November 2016 at 02:12, Petr Jelinek wrote:
>> On 11/11/16 16:03, Francisco Olarte wrote:
>>> On Fri, Nov 11, 2016 at 4:40 AM, 余森彬 wrote:
>>>> As we know, the synchronous commit process is blocked while it waits
>>>> for the acknowledgement from the standby in PostgreSQL.  This is good
>>>> for data consistency between master and standby, and applications can
>>>> get important data from the standby.  But when the standby crashes or
>>>> the network goes wrong, the master can hang.  Is there a feature plan
>>>> for a semi-sync mode like MySQL InnoDB (set a timer, and become
>>>> asynchronous on timeout)?
>>>
>>> JMO, but it seems this basically means any process should be designed
>>> to cope with the possibility of not having replicated data after
>>> commit, so why bother with synchronous replication in the first
>>> place?
>>
>> It's often more acceptable to say "we lose data when 2 servers die (or
>> are in problems)" than "we lose data when 1 server dies", and it's also
>> more acceptable to say "we stop answering when we lose 2 servers" but
>> not "we stop answering when we lose 1 server"; semi-sync replication
>> works for a combination of these two.
>
> Yep. Also, monitoring. Sync with a short timeout means you can usually
> rely on sync rep, and if it times out and falls back to async your
> monitoring system can start screaming at you.
>
> I think k= replication will help quite a bit with this though.



In my opinion, the semantics of the synchronous commit design are to handle
sync replication and provide high data consistency and availability.
Meanwhile, a time limit would also be valuable within the synchronous commit
process, as it could protect things from getting worse when problems occur
between the master and the sync standbys.

Like the 'idle_in_transaction_session_timeout' option in PostgreSQL 9.6,
such a timeout would make the transaction commit fail under bad conditions
rather than hang until things get better.  This doesn't break the
synchronous commit semantics while making the behavior more predictable.





Re: [HACKERS] Floating point comparison inconsistencies of the geometric types

2016-11-13 Thread Emre Hasegeli
> We can remove the fuzz factor altogether but I think we also
> should provide a means usable to do similar things. At least "is
> a point on a line" might be useless for most cases without any
> fuzzing feature. (Nevertheless, it is a problem only when it is
> being used to do that:) If we don't find reasonable policy on
> fuzzing operations, it would be the proof that we shouldn't
> change the behavior.

It was my initial idea to keep the fuzzy comparison behaviour in some
places, but the more I got into it, the more I realised that it is almost
impossible to get this right.  Instead, I re-implemented some operators to
keep precision as much as possible.  The previous "is a point on a line"
operator would *never* give a true result without the fuzzy comparison.
The new implementation returns true when precision is not lost.  I think
this is behaviour that people who are working with floating points are
prepared to deal with.  By the way, the "is a point on a line" operator is
quite wrong with the fuzzy comparison at the moment [1].
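As a small illustration of the EPSILON-based fuzziness under discussion,
using the ~= ("same as") operator on points with arbitrary values:

    SELECT point(0.3, 0) ~= point(0.3000001, 0);
    -- currently returns true: the ~1e-7 difference is below the fixed
    -- EPSILON of 1e-6 used by the fuzzy comparison macros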

> The 0001 patch adds many FP comparison functions individually
> considering NaN. As the result the sort order logic involving NaN
> is scattered around into the functions, then, you implement
> generic comparison function using them. It seems inside-out to
> me. Defining ordering at one place, then comparison using it
> seems to be reasonable.

I agree that it would be simpler to use the comparison function for
implementing other operators.  I have done it other way around to make
them more optimised.  They are called very often.  I don't think
checking exit code of the comparison function would be optimised the
same way.  I could leave the comparison functions as they are, but
re-implemented them using the others to keep documentation of NaN
comparison in the single place.

> If the center somehow goes extremely near to the origin, it could
> result in a false error.
>
>> =# select @@ box'(-8e-324, -8e-324), (4.9e-324, 4.9e-324)';
>> ERROR:  value out of range: underflow
>
> I don't think this underflow is an error, and actually it is a
> change of the current behavior without a reasonable reason. More
> significant (and maybe unacceptable) side-effect is that it
> changes the behavior of ordinary operators. I don't think this is
> acceptable. More consideration is needed.
>
>> =# select ('-8e-324'::float8 + '4.9e-324'::float8) / 2.0;
>> ERROR:  value out of range: underflow

This is the current behaviour of the float datatype.  My patch doesn't
change that.  This problem would probably also apply to multiplying
very small values.  I agree that this is not the ideal behaviour,
though I am not sure if we should go in a different direction than
the float datatypes.

I think there is value in making geometric types compatible with the
float.  Users are going to mix them, anyway.  For example, users can
calculate the center of a box manually, and confuse when the built-in
operator behaves differently.

> In regard to fuzzy operations, libgeos seems to have several types of
> this kind of feature (I haven't looked closer into them). Anything other
> than reducing precision seems overkill or inapplicable for PostgreSQL
> builtins. As Jim said, can we replace the fixed-scale fuzz factor by
> precision reduction? Maybe with a GUC variable (I hear someone roaring..)
> to specify the amount, defaulting to fit the current assumption.

I am disinclined to try to implement something complicated for the
geometric types.  I think they are mostly useful for 2 purposes: uses
simple enough to not worth looking for better solutions, and
demonstrating our indexing capabilities.  The inconsistencies harm
both of those.

[1] 
https://www.postgresql.org/message-id/flat/CAE2gYzw_-z%3DV2kh8QqFjenu%3D8MJXzOP44wRW%3DAzzeamrmTT1%3DQ%40mail.gmail.com




Re: [HACKERS] Improvements in psql hooks for variables

2016-11-13 Thread Rahila Syed
>ECHO_HIDDEN differs from the generic boolean case because it also
>accepts "noexec" and pset.echo_hidden is an enum, not a boolean. When
>considering refactoring echo_hidden_hook() to call
>generic_boolean_hook() instead of ParseVariableBool() after
>having established that the value is not "noexec", I don't see
>any advantage in clarity or code size, so I'm not in favor of that change

I agree that if generic_boolean_hook() is used in its current form instead
of ParseVariableBool() inside echo_hidden_hook or on_error_rollback_hook,
the code will not change much.

I was suggesting a change along the lines of extending generic_boolean_hook
to include enum parsing (the part of the enum hooks which calls
ParseVariableBool) and integer parsing as well.

>Besides, I don't think it would go well with hooks. If we wanted one
>big function that knows all about parsing all built-in variables, we
>could just as well dispense with hooks, since their current purpose in
>psql is to achieve this parsing, but in a decentralized way.
>Or if we keep them, our various built-in variables would be
>essentially tied to the same one-big-hook-that-does-all, but isn't
>that an antipattern for hooks?

I was suggesting something along the lines of having common parsing logic
for each implementation of a hook. This is similar to what ParseVariableBool
does in the existing code: ParseVariableBool is called from different hooks
to parse the boolean value of a variable, and thus represents code common to
the various hook implementations.

>"hook" is changed by the patch from [pointer to function returning
>void], to [pointer to function returning bool]. The cast to void is
>not essential, it just indicates that we deliberately want to
>ignore the return value here. I expect some compilers might
>complain under a high level of warnings without this cast, although
>TBH if you ask me, I wouldn't know which compiler with which flags
>exactly.
Thank you for explanation.

Thank you,
Rahila Syed




On Fri, Nov 11, 2016 at 2:57 AM, Daniel Verite 
wrote:

> Rahila Syed wrote:
>
> > I have applied this patch on latest HEAD and have done basic testing
> which
> > works fine.
>
> Thanks for reviewing this patch!
>
> > >if (current->assign_hook)
> > >-   (*current->assign_hook) (current->value);
> > >-   return true;
> > >+   {
> > >+   confirmed = (*current->assign_hook) (value);
> > >+   }
> > >+   if (confirmed)
> >
> > Spurious brackets
>
> OK.
>
> > >static bool
> > >+generic_boolean_hook(const char *newval, const char* varname, bool
> *flag)
> > >+{
> >
> > Contrary to what name suggests this function does not seem to have other
> > implementations as in a hook.
> > Also this takes care of rejecting a syntactically wrong value only for
> > boolean variable hooks like autocommit_hook,
> > on_error_stop_hook. However, there are other variable hooks which call
> > ParseVariableBool.
> > For instance, echo_hidden_hook which is handled separately in the patch.
> > Thus there is some duplication of code between generic_boolean_hook and
> > echo_hidden_hook.
> > Similarly for on_error_rollback_hook.
>
> The purpose of generic_boolean_hook() is to handle the case of a
> boolean variable that only accepts ON or OFF, and has its pset.varname
> declared as bool. I thought of this case as "generic" because that's
> the base case and several variables need no more than that.
>
> ECHO_HIDDEN differs from the generic boolean case because it also
> accepts "noexec" and pset.echo_hidden is an enum, not a boolean. When
> considering refactoring echo_hidden_hook() to call
> generic_boolean_hook() instead of ParseVariableBool() after
> having established that the value is not "noexec", I don't see
> any advantage in clarity or code size, so I'm not in favor of that change.
>
> The same applies to on_error_rollback_hook(), which has to deal
> with a specific enum as well.
>
> > >-static void
> > >+static bool
> > > fetch_count_hook(const char *newval)
> > > {
> > >pset.fetch_count = ParseVariableNum(newval, -1, -1, false);
> > >+   return true;
> > > }
> >
> > Shouldn't invalid numeric string assignment for numeric variables be
> > handled too?
>
> Agreed. Assignments like "\set FETCH_COUNT bogus" don't provoke any
> user feedback currently, which is not ideal. I'll add this in a
> v3 of the patch tomorrow.
>
> > Instead of generic_boolean_hook cant we have something like follows which
> > like generic_boolean_hook can be called from specific variable assignment
> > hooks,
> >
> > static bool
> > ParseVariable(newval, VariableName, )
> > {
> > if (VariableName == ‘AUTOCOMMIT’ || ECHO_HIDDEN || other variable
> with
> > hooks which call ParseVariableBool )
> >  >  > ON_ERROR_ROLLBACK>
> > else if (VariableName == ‘FETCH_COUNT’)
> > ParseVariableNum();
> > }
>
> It's not possible because pset.var corresponds to different fields from
> struct _psqlSettings 

[HACKERS] ExplainOneQuery_hook ignored for EXPLAIN EXECUTE

2016-11-13 Thread Andres Freund
Hi,

I don't quite know what the hook in $subject was originally designed
for, but right now it has the problem that it's not invoked for EXPLAIN
EXECUTE.  That's because ExplainExecuteQuery directly calls
ExplainOnePlan instead of ExplainOneQuery_hook.

Unfortunately that's not entirely trivial to fix, because the hook
accepts a Query, not a PlannedStmt.

Does anybody have a good way to fix this?

Greetings,

Andres Freund




Re: [HACKERS] Physical append-only tables

2016-11-13 Thread Thomas Munro
On Mon, Nov 14, 2016 at 4:45 AM, Magnus Hagander  wrote:
> For a scenario like this, would it make sense to have an option that could
> be set on an individual table making it physical append only? Basically
> VACUUM would run as normal and clean up the old space when rows are deleted
> back in history, but when new space is needed for a row the system would
> never look at the old blocks, and only append to the end.

One thing I have idly wondered about: if you had an append-only mode
that guarantees that we never write new tuples anywhere but the end of
the heap, you could create a bsearch access method that uses zero
storage and simply checks that every key inserted is >= the previous
high key before allowing the insertion to proceed.  Then it could make
use of the guaranteed correlation with physical order to do scans
using a binary search of the heap.  Maybe that'd be useful for some
kinds of write-only time-series data that needs to be searched by time
range.  On the other hand, BRIN indexes are tiny, should be nearly as
good and are much less restrictive, so I haven't follow this thought
up.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Tackling JsonPath support

2016-11-13 Thread Pavel Stehule
Hi

2016-11-13 15:14 GMT+01:00 Christian Convey :

> Hi Pavel,
>
> Can I check a few assumptions about what you're suggesting for this task?
>
> * Our ultimate goal is to give Postgres an implementation of the functions
> "JSON_EXISTS", "JSON_VALUE", and "JSON_QUERY" which fully comply with the
> SQL standards.
>
> * The best representation of those standards is found here: [1].
>
> * When [1] mentions a "JSON path expression" or "JSON path language", it's
> referring to the query language described here: [2].
>
> * Even if other popular DBMS's deviate from [1], or other popular JSONPath
> implementations deviate from [2], we remain committed to a faithful
> implementation of [1].
>
> * It's okay for my first commit to implement just two things: (a) a
> PG-internal implementation of JsonPath, and (b) a user-visible
> implementation of "JSON_QUERY" based on (a).  Later commits could add
> implementations of "JSON_VALUE", "JSON_EXISTS", etc. in terms of (a).
>

My goal is the implementation of the JSON_TABLE function - this function
can be used instead of any of the other mentioned functions (and it is
really useful - transforming JSON to a table is a common task). SQL/JSON
is pretty new and too big to implement in one step, and nobody in the
PostgreSQL world knows it yet. Our SQL/XML needed more than 10 years and
is still not fully complete - and there we could build on the power and
features of libxml2 (we have nothing similar for JSON). But we now cover
almost everything that is needed daily from SQL/XML. For JSON_TABLE we
need only the basic features of JSONPath - the predicates are not
necessary in the first step.

http://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/json/src/tpc/db2z_bif_jsontable.html
The vendors use the name "SQL/JSON path expressions" for this query
language - so the important source is SQL/JSON (which can differ from the
original JSONPath, your second source).
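
To give an idea of the target, a JSON_TABLE call looks roughly like this
(the syntax follows the DB2 documentation above and the draft standard,
and the table, column and path names are invented, so take it only as an
illustration, not as a committed PostgreSQL syntax):

SELECT jt.*
  FROM orders,
       JSON_TABLE(orders.payload, '$.items[*]'
                  COLUMNS (sku text    PATH '$.sku',
                           qty integer PATH '$.quantity')) AS jt;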

Regards

Pavel






> Thanks,
> Christian
>
> [1] http://jtc1sc32.org/doc/N2501-2550/32N2528-WG3-
> Tutorial-Opening-Plenary.pdf
>
> [2] http://goessner.net/articles/JsonPath
>
>
> On Fri, Sep 16, 2016 at 2:28 AM, Pavel Stehule 
> wrote:
>
>> Hi
>>
>> 2016-09-15 18:05 GMT+02:00 Christian Convey :
>>
>>> On Mon, Sep 5, 2016 at 1:44 PM, Pavel Stehule 
>>> wrote:
>>> ...
>>>
>>> > I wrote XMLTABLE function, and I am thinking about JSON_TABLE
>>> function. But
>>> > there is one blocker - missing JsonPath support in our JSON
>>> implementation.
>>> >
>>> > So one idea - implement JsonPath support and related JSON query
>>> functions.
>>> > This can help with better standard conformance.
>>>
>>> Hi Pavel,
>>>
>>> Are you still looking for someone to add the JsonPath support to the
>>> JSON implementation?  And if so, how urgently are people waiting for
>>> it?
>>>
>>
>> yes - JsonPath support should be great. I hope so this or next commitfest
>> the XMLTABLE patch will be committed, and with JsonPath I can start to work
>> on JSON_TABLE function.
>>
>> But the JsonPath can be merged separately without dependency to
>> JSON_TABLE. There are more JSON searching functions, and these functions
>> should to support JsonPath be ANSI SQL compliant.
>>
>>
>>>
>>> I'd be happy to start working on a patch, but since I'm new to PG
>>> development, I'm probably not the fastest person to get it done.
>>>
>>
>> It is not problem. Probably you should to do this work without deep
>> knowledges about PostgreSQL internals. The work with data types (and
>> functions for data types) is well isolated from PostgreSQL engine.
>>
>> You can learn from current searching on JSON -
>> postgresql/src/backend/utils/adt/json.c
>>
>> And it is good start to be PostgreSQL's hacker - I started with
>> implementation of own data type and related functions.
>>
>> Regards
>>
>> Pavel
>>
>>
>>> Kind regards,
>>> Christian
>>>
>>
>>
>


Re: [HACKERS] proposal: psql \setfileref

2016-11-13 Thread Pavel Stehule
Hi

I am sending an initial implementation of psql content commands. This
design is a reaction to Tom's objections against psql file-reference
variables. It is cleaner, more explicit and more practical - an import can
be done in one step.

Now supported commands are:

r - read file without any modification
rq - read file, escape as literal and use outer quotes
rb - read binary file - transform to hex code string
rbq - read binary file, transform to hex code string and use outer quotes

Usage:

create table test(a xml);
insert into test values( {rq /home/pavel/.local/share/rhythmbox/rhythmdb.xml}
);
\set xxx {r /home/pavel/.local/share/rhythmbox/rhythmdb.xml}
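
The binary variants work the same way; for example (the bytea table and
the file name here are invented, just to show the rb/rbq form):

create table avatars(img bytea);
insert into avatars values( {rbq /home/pavel/avatar.png} );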

This patch is a demo of this design - one part is redundant - I'll clean
it up in the next iteration.

Regards

Pavel
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index a7789df..dbf3ffa 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -12,6 +12,7 @@
 #include <limits.h>
 #include <math.h>
 #include <signal.h>
+#include <sys/stat.h>
 #ifndef WIN32
 #include <unistd.h>			/* for write() */
 #else
@@ -168,6 +169,142 @@ psql_get_variable(const char *varname, bool escape, bool as_ident)
 	return result;
 }
 
+/*
+ * file-content-fetching callback for flex lexer.
+ */
+char *
+psql_get_file_content(const char *filename, bool escape, bool binary, bool quote)
+{
+	FILE	   *fd;
+	char	   *fname = pg_strdup(filename);
+	char	   *result = NULL;
+
+	expand_tilde(&fname);
+	canonicalize_path(fname);
+
+	fd = fopen(fname, PG_BINARY_R);
+	if (fd)
+	{
+		struct stat		fst;
+
+		if (fstat(fileno(fd), &fst) != -1)
+		{
+			if (S_ISREG(fst.st_mode))
+			{
+if (fst.st_size <= ((int64) 1024) * 1024 * 1024)
+{
+	size_t		size;
+	PQExpBufferData		raw_data;
+	char		buf[512];
+
+	initPQExpBuffer(&raw_data);
+
+	if (!escape && quote)
+		appendPQExpBufferChar(&raw_data, '\'');
+
+	while ((size = fread(buf, 1, sizeof(buf), fd)) > 0)
+		appendBinaryPQExpBuffer(&raw_data, buf, size);
+
+	if (!ferror(fd) && !(PQExpBufferDataBroken(raw_data)))
+	{
+		if (escape)
+		{
+			if (binary)
+			{
+unsigned char  *escaped_value;
+size_t			escaped_size;
+
+escaped_value = PQescapeByteaConn(pset.db,
+		(const unsigned char *) raw_data.data, raw_data.len,
+  &escaped_size);
+
+if (escaped_value != NULL)
+{
+	if (quote)
+	{
+		PQExpBufferData		finalbuf;
+
+		initPQExpBuffer(&finalbuf);
+		appendPQExpBufferChar(&finalbuf, '\'');
+		appendBinaryPQExpBuffer(&finalbuf,
+(const char *) escaped_value, escaped_size - 1);
+		appendPQExpBufferChar(&finalbuf, '\'');
+		PQfreemem(escaped_value);
+
+		result = finalbuf.data;
+	}
+	else
+		result = (char *) escaped_value;
+}
+else
+	psql_error("%s\n", PQerrorMessage(pset.db));
+			}
+			else
+			{
+/* escape text */
+if (quote)
+{
+	result = PQescapeLiteral(pset.db,
+raw_data.data, raw_data.len);
+	if (result == NULL)
+		psql_error("%s\n", PQerrorMessage(pset.db));
+}
+else
+{
+	int		error;
+
+	result = pg_malloc(raw_data.len * 2 + 1);
+	PQescapeStringConn(pset.db, result, raw_data.data, raw_data.len, &error);
+	if (error)
+	{
+		psql_error("%s\n", PQerrorMessage(pset.db));
+		PQfreemem(result);
+		result = NULL;
+	}
+}
+			}
+		}
+		else
+		{
+			/* returns raw data, without any transformations */
+			if (quote)
+appendPQExpBufferChar(&raw_data, '\'');
+
+			if (PQExpBufferDataBroken(raw_data))
+psql_error("out of memory\n");
+			else
+result = raw_data.data;
+		}
+	}
+	else
+	{
+		if (PQExpBufferDataBroken(raw_data))
+			psql_error("out of memory\n");
+		else
+			psql_error("%s: %s\n", fname, strerror(errno));
+	}
+
+	if (result != raw_data.data)
+		termPQExpBuffer(&raw_data);
+}
+else
+	psql_error("%s is too big (greater than 1GB)\n", fname);
+}
+			else
+psql_error("%s is not a regular file\n", fname);
+		}
+		else
+			psql_error("%s: %s\n", fname, strerror(errno));
+
+		fclose(fd);
+	}
+	else
+		psql_error("%s: %s\n", fname, strerror(errno));
+
+	PQfreemem(fname);
+
+	return result;
+}
 
 /*
  * Error reporting for scripts. Errors should look like
diff --git a/src/bin/psql/common.h b/src/bin/psql/common.h
index bdcb58f..6b8ae67 100644
--- a/src/bin/psql/common.h
+++ b/src/bin/psql/common.h
@@ -19,6 +19,7 @@ extern bool openQueryOutputFile(const char *fname, FILE **fout, bool *is_pipe);
 extern bool setQFout(const char *fname);
 
 extern char *psql_get_variable(const char *varname, bool escape, bool as_ident);
+extern char *psql_get_file_content(const char *filename, bool escape, bool binary, bool quote);
 
 extern void psql_error(const char *fmt,...) pg_attribute_printf(1, 2);
 
diff --git 

Re: [HACKERS] Contains and is contained by operators of inet datatypes

2016-11-13 Thread Andreas Karlsson

On 11/13/2016 01:21 PM, Emre Hasegeli wrote:

Thank you for the review.  New version is attached.


Nice, I am fine with this version of the patch. Setting it to ready for 
committer!


Andreas




Re: [HACKERS] Tackling JsonPath support

2016-11-13 Thread Pavel Stehule
2016-11-13 18:13 GMT+01:00 Tom Lane :

> Christian Convey  writes:
> > * Our ultimate goal is to give Postgres an implementation of the
> functions
> > "JSON_EXISTS", "JSON_VALUE", and "JSON_QUERY" which fully comply with the
> > SQL standards.
> > * The best representation of those standards is found here: [1].
> > [1]
> > http://jtc1sc32.org/doc/N2501-2550/32N2528-WG3-Tutorial-
> Opening-Plenary.pdf
>
> You're going to need to find a draft standard somewhere, as that
> presentation is too thin on details to support writing an actual
> implementation.  In particular, it's far from clear that this is
> true at all:
>
> > * When [1] mentions a "JSON path expression" or "JSON path language",
> it's
> > referring to the query language described here: [2].
> > [2] http://goessner.net/articles/JsonPath
>
> The one slide they have on the path language mentions a lax/strict syntax
> that I don't see either in the document you mention or in the Wikipedia
> XPath article it links to.  This does not give me a warm feeling.  The SQL
> committee is *fully* capable of inventing their own random path notation,
> especially when there's no ISO-blessed precedent to bind them.
>
> In general, the stuff I see in these WG3 slides strikes me as pretty
> horribly designed.  The committee is evidently still stuck on the idea
> that every feature they invent should have a bunch of new bespoke syntax
> for function calls, which is a direction we really don't want to go in
> because of the parser overhead and need for more fully-reserved keywords.
> For instance:
> WHERE JSON_EXISTS (T.J, 'strict $.where' FALSE ON ERROR)
> Really?  Who thought that was a better idea than a simple bool parameter?
>
> I have no objection to providing some functions that implement XPath-like
> tests for JSON, but I'm not sure that you ought to try to tie it to
> whatever the SQL committee is going to do, especially when they've not
> published a finished standard yet.  You may be chasing a moving target.
>
> As for whether JSONPath is the right spec to follow, I'm not sure.
> The article you mention is from 2007 and I don't see all that many
> other references in a Google search.  I found this Wikipedia page:
> https://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats
> which mentions half a dozen competitors, including "JSON Pointer"
> which has at least gotten as far as being an RFC standard:
> https://tools.ietf.org/html/rfc6901
> I'm not up enough on the JSON ecosystem to know which of these has the
> most traction, but I'm unconvinced that it's JSONPath.
>

We can use some other databases' implementations as references here.

I have to agree that the people on the SQL committee are not very
consistent - and sometimes create rather COBOL-ish syntax - but it is the
standard, and it is implemented by major vendors.

We don't need to implement the full API - not in the first step - the
important point is not to close the door on possible ANSI conformance. In
the first step we can take the best and most important parts of the
standard. It can be similar to our SQL/XML implementation - we implement
maybe 75%, and only XPath instead of XQuery, but I don't feel we lack
much. I see JSON_TABLE as a very useful function, and a good place to
start.

Regards

Pavel


> regards, tom lane
>


Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Tom Lane
Andres Freund  writes:
> On 2016-11-13 11:23:09 -0500, Tom Lane wrote:
>> We can't use CREATE FUNCTION as the representation in the .bki file,
>> because of the circularities involved (you can't fill pg_proc before
>> pg_type nor vice versa).  But I think Peter was suggesting that the
>> input to the bki-generator script could look like CREATE commands.
>> That's true, but I fear it would greatly increase the complexity
>> of the script for not much benefit.

> It'd also be very pg_proc specific, which isn't where I think this
> should go..

The presumption is that we have a CREATE command for every type of
object that we need to put into the system catalogs.  But yes, the
other problem with this approach is that you need to do a lot more
work per-catalog to build the converter script.  I'm not sure how
much of that could be imported from gram.y, but I'm afraid the
answer would be "not enough".

regards, tom lane




Re: [HACKERS] Tackling JsonPath support

2016-11-13 Thread Tom Lane
Christian Convey  writes:
> * Our ultimate goal is to give Postgres an implementation of the functions
> "JSON_EXISTS", "JSON_VALUE", and "JSON_QUERY" which fully comply with the
> SQL standards.
> * The best representation of those standards is found here: [1].
> [1]
> http://jtc1sc32.org/doc/N2501-2550/32N2528-WG3-Tutorial-Opening-Plenary.pdf

You're going to need to find a draft standard somewhere, as that
presentation is too thin on details to support writing an actual
implementation.  In particular, it's far from clear that this is
true at all:

> * When [1] mentions a "JSON path expression" or "JSON path language", it's
> referring to the query language described here: [2].
> [2] http://goessner.net/articles/JsonPath

The one slide they have on the path language mentions a lax/strict syntax
that I don't see either in the document you mention or in the Wikipedia
XPath article it links to.  This does not give me a warm feeling.  The SQL
committee is *fully* capable of inventing their own random path notation,
especially when there's no ISO-blessed precedent to bind them.

In general, the stuff I see in these WG3 slides strikes me as pretty
horribly designed.  The committee is evidently still stuck on the idea
that every feature they invent should have a bunch of new bespoke syntax
for function calls, which is a direction we really don't want to go in
because of the parser overhead and need for more fully-reserved keywords.
For instance:
WHERE JSON_EXISTS (T.J, 'strict $.where' FALSE ON ERROR)
Really?  Who thought that was a better idea than a simple bool parameter?
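
(For contrast, a hypothetical plain-function spelling of the same test,
say

WHERE json_exists(t.j, 'strict $.where', false)

would express the same thing without any new grammar or reserved words.)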

I have no objection to providing some functions that implement XPath-like
tests for JSON, but I'm not sure that you ought to try to tie it to
whatever the SQL committee is going to do, especially when they've not
published a finished standard yet.  You may be chasing a moving target.

As for whether JSONPath is the right spec to follow, I'm not sure.
The article you mention is from 2007 and I don't see all that many
other references in a Google search.  I found this Wikipedia page:
https://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats
which mentions half a dozen competitors, including "JSON Pointer"
which has at least gotten as far as being an RFC standard:
https://tools.ietf.org/html/rfc6901
I'm not up enough on the JSON ecosystem to know which of these has the
most traction, but I'm unconvinced that it's JSONPath.

regards, tom lane




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Andres Freund
On 2016-11-13 11:23:09 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-11-13 00:20:22 -0500, Peter Eisentraut wrote:
> >> Then we're not very far away from just using CREATE FUNCTION SQL commands.
>
> > Well, those do a lot of syscache lookups, which in turn do lookups for
> > functions...
>
> We can't use CREATE FUNCTION as the representation in the .bki file,
> because of the circularities involved (you can't fill pg_proc before
> pg_type nor vice versa).  But I think Peter was suggesting that the
> input to the bki-generator script could look like CREATE commands.
> That's true, but I fear it would greatly increase the complexity
> of the script for not much benefit.

It'd also be very pg_proc specific, which isn't where I think this
should go..

Greetings,

Andres Freund




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Andres Freund
On 2016-11-13 11:11:37 -0500, Tom Lane wrote:
> 1. Are we going to try to keep these things in the .h files, or split
> them out?  I'd like to get them out, as that eliminates both the need
> to keep the things looking like macro calls, and the need for the data
> within the macro call to be at least minimally parsable as C.

I vote for splitting them out.


> 2. Andrew's example above implies some sort of mapping between the
> keywords and the actual column names (or at least column positions).
> Where and how is that specified?

I don't know what Andrew was planning, but before I stopped I had a 1:1
mapping between column names and keywords. Catalog.pm parses the
pg_*.h headers and thus knows the table definition via the CATALOG()
stuff.


> 3. Also where are we going to provide the per-column default values?

That's a good question, I suspect we should move that knowledge to the
headers as well. Possibly using something like BKI_DEFAULT(...)?


> How does the converter script know which columns to convert to type oids,
> proc oids, etc?

I simply had that based on the underlying reg* type. I.e. if a column
was regtype the script would map it to type oids and so on.  That
required some type changes, which does have some compatibility concerns.


> Is it going to do any data validation beyond that, and if so on what basis?

Hm, not sure if we really need something.


> 4. What will we do about the #define's that some of the .h files provide
> for (some of) their object OIDs?  I assume that we want to move in the
> direction of autogenerating those macros a la fmgroids.h, but this needs
> a concrete spec as well.

I suspect at least type oids we'll continue to have to maintain
manually. A good number of things rely on the builtin type oids being
essentially stable.


> > If we can generalize this to other catalogs, then that will be good, but 
> > my inclination is to handle the elephant in the room (pg_proc.h) and 
> > worry about the gnats later.
> 
> I think we want to do them all.

+1


Greetings,

Andres Freund




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Tom Lane
Andres Freund  writes:
> On 2016-11-13 00:20:22 -0500, Peter Eisentraut wrote:
>> Then we're not very far away from just using CREATE FUNCTION SQL commands.

> Well, those do a lot of syscache lookups, which in turn do lookups for
> functions...

We can't use CREATE FUNCTION as the representation in the .bki file,
because of the circularities involved (you can't fill pg_proc before
pg_type nor vice versa).  But I think Peter was suggesting that the
input to the bki-generator script could look like CREATE commands.
That's true, but I fear it would greatly increase the complexity
of the script for not much benefit.  It does little for the question of
"how do you update the data when adding a new pg_proc column", for
instance.  And you'd still need some non-SQL warts, like how to specify
manually-assigned OIDs for types and functions.  (I'm not sure whether
we could get away with dropping fixed assignments of function OIDs,
but we absolutely can't do so for types.  Lots of client code knows
that text is oid 25, for example.)

regards, tom lane




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Tom Lane
Andrew Dunstan  writes:
> I'm not convinced the line prefix part is necessary, though. What I'm 
> thinking of is something like this:

> PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
>  rettype=bool argtypes="cstring" src=boolin );

We could go in that direction too, but the apparent flexibility to split
entries into multiple lines is an illusion, at least if you try to go
beyond a few lines; you'd end up with duplicated line sequences in
different entries and thus ambiguity for patch(1).  I don't have any
big objection to the above, but it's not obviously better either.

Some things we should try to resolve before settling definitively on
a data representation:

1. Are we going to try to keep these things in the .h files, or split
them out?  I'd like to get them out, as that eliminates both the need
to keep the things looking like macro calls, and the need for the data
within the macro call to be at least minimally parsable as C.

2. Andrew's example above implies some sort of mapping between the
keywords and the actual column names (or at least column positions).
Where and how is that specified?

3. Also where are we going to provide the per-column default values?
How does the converter script know which columns to convert to type oids,
proc oids, etc?  Is it going to do any data validation beyond that, and
if so on what basis?

4. What will we do about the #define's that some of the .h files provide
for (some of) their object OIDs?  I assume that we want to move in the
direction of autogenerating those macros a la fmgroids.h, but this needs
a concrete spec as well.  If we don't want this change to result in a big
hit to the source code, we're probably going to need to be able to specify
the macro names to generate in the data files.

5. One of the requirements that was mentioned in previous discussions
was to make it easier to add new columns to catalogs.  This format
does that only to the extent that you don't have to touch entries that
can use the default value for such a column.  Is that good enough, and
if not, what might we be able to do to make it better?


> I'd actually like to roll up the DESCR lines in pg_proc.h into this too, 
> they strike me as a bit of a wart. But I'm flexible on that.

+1, if we can come up with a better syntax.  This together with the
OID-macro issue suggests that there will be items in each data entry that
correspond to something other than columns of the target catalog.  But
that seems fine.

> If we can generalize this to other catalogs, then that will be good, but 
> my inclination is to handle the elephant in the room (pg_proc.h) and 
> worry about the gnats later.

I think we want to do them all.  pg_proc.h is actually one of the easier
catalogs to work on presently, IMO, because the only kind of
cross-references it has are type OIDs.  Things like pg_amop are a mess.
And I really don't want to be dealing with multiple notations for catalog
data.  Also I think this will be subject to Polya's paradox: designing a
general solution will be easier and cleaner than a hack that works only
for one catalog.

regards, tom lane




[HACKERS] Physical append-only tables

2016-11-13 Thread Magnus Hagander
Scenario:

Large table, *mostly* insert+select only, either a sequence or a timestamp
column set to current time. This is an ideal use case for a BRIN index.
Awesome.

But then consider the same table. Except rows are typically updated once or
twice when they are new, and *then* go read only. And we also have a
process that at some point deletes *some* old rows (but not all - in fact,
only a small portion).

In this case, the next INSERT once VACUUM has run is likely to stick a
"new" row somewhere very "far back" in the table, since there is now free
space there. This more or less completely ruins the BRIN index usability,
as the "old" blocks will now contain a single row from a "new" series.

For a scenario like this, would it make sense to have an option that could
be set on an individual table making it physically append-only? Basically
VACUUM would run as normal and clean up the old space when rows are deleted
back in history, but when new space is needed for a row the system would
never look at the old blocks, and only append to the end.
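
To make it concrete, the kind of setup I have in mind looks like this (the
"append_only" reloption in the last line is purely hypothetical - nothing
like it exists today, it is just the switch this proposal is asking for):

create table events (id bigserial, ts timestamptz default now(), payload jsonb);
create index events_ts_brin on events using brin (ts);
alter table events set (append_only = on);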

Yes, this wastes space in the database. But not as much as trying to track
the deleted rows somewhere else and do an anti-join or something like that.
And it would obviously not be on by default.

Eventually the lost space might grow to a point where re-CLUSTERing the
table might be worthwhile. But given the cost of that, I can see many
scenarios where people are just willing to pay that extra overhead on large
tables that are more or less never deleted from.

I've run into a number of cases recently where this would've made the BRIN
indexes on huge tables *much* more efficient. At least, it seems to me they
would :)

Or am I missing something that would make this not work?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Andrew Dunstan



On 11/13/2016 04:54 AM, Andres Freund wrote:

On 2016-11-12 12:30:45 -0500, Andrew Dunstan wrote:


On 11/11/2016 11:10 AM, Tom Lane wrote:

boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
boolin: prosrc=boolin provolatile=i proparallel=s




I have written a little perl script to turn the pg_proc DATA lines into
something like the format suggested. In order to keep the space used as
small as possible, I used a prefix based on the OID. See attached result.

Still plenty of work to go, e.g. grabbing the DESCR lines, and turning this
all back into DATA/DESCR lines, but I wanted to get this out there before
going much further.

The defaults I used are below (commented out keys are not defaulted, they
are just there for completeness).

In the referenced thread I'd started to work on something like this,
until other people also said they'd be working on it.  I chose a
different output format (plain Data::Dumper), but I'd added the parsing
of DATA/DESCR and such to genbki.

Note that I found that initdb performance is greatly increased *and*
legibility is improved, if types and such in the data files are
expanded, and converted to their oids when creating postgres.bki.



Yeah, I have the type name piece, it was close to trivial. I just read 
in pg_type.h and stored the names/oids in a hash.


Data::Dumper is too wasteful of space. The thing I like about Tom's 
format is that it's nice and concise.


I'm not convinced the line prefix part is necessary, though. What I'm 
thinking of is something like this:


PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
rettype=bool argtypes="cstring" src=boolin );

Teaching Catalog.pm how to parse that and turn the type names back into 
oids won't be difficult. I already have code for the prefix version, and 
this would be easier since there is an end marker.


I'd actually like to roll up the DESCR lines in pg_proc.h into this too, 
they strike me as a bit of a wart. But I'm flexible on that.


If we can generalize this to other catalogs, then that will be good, but 
my inclination is to handle the elephant in the room (pg_proc.h) and 
worry about the gnats later.




I basically made genbki/catalog.pm accept text whenever a column is of
type regtype/regprocedure/... To then make use of that I converted a bunch
of plain oid columns to their reg* equivalent. That's also nice
for just plain querying of the catalogs ;)

I don't think the code is going to be much use for you directly, but it
might be worthwhile to crib some stuff from the 0002 of the attached
patches (based on 74811c4050921959d54d42e2c15bb79f0e2c37f3).



Thanks, I will take a look.

cheers

andrew






Re: [HACKERS] Tackling JsonPath support

2016-11-13 Thread Christian Convey
Hi Pavel,

Can I check a few assumptions about what you're suggesting for this task?

* Our ultimate goal is to give Postgres an implementation of the functions
"JSON_EXISTS", "JSON_VALUE", and "JSON_QUERY" which fully comply with the
SQL standards.

* The best representation of those standards is found here: [1].

* When [1] mentions a "JSON path expression" or "JSON path language", it's
referring to the query language described here: [2].

* Even if other popular DBMS's deviate from [1], or other popular JSONPath
implementations deviate from [2], we remain committed to a faithful
implementation of [1].

* It's okay for my first commit to implement just two things: (a) a
PG-internal implementation of JsonPath, and (b) a user-visible
implementation of "JSON_QUERY" based on (a).  Later commits could add
implementations of "JSON_VALUE", "JSON_EXISTS", etc. in terms of (a).
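
(For concreteness, the sort of call I have in mind for (b) is roughly

SELECT JSON_QUERY(t.j, '$.track.segments') FROM t;

where the exact clause syntax is whatever the draft standard specifies,
and the table, column and path are invented for illustration.)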

Thanks,
Christian

[1]
http://jtc1sc32.org/doc/N2501-2550/32N2528-WG3-Tutorial-Opening-Plenary.pdf

[2] http://goessner.net/articles/JsonPath


On Fri, Sep 16, 2016 at 2:28 AM, Pavel Stehule 
wrote:

> Hi
>
> 2016-09-15 18:05 GMT+02:00 Christian Convey :
>
>> On Mon, Sep 5, 2016 at 1:44 PM, Pavel Stehule 
>> wrote:
>> ...
>>
>> > I wrote XMLTABLE function, and I am thinking about JSON_TABLE function.
>> But
>> > there is one blocker - missing JsonPath support in our JSON
>> implementation.
>> >
>> > So one idea - implement JsonPath support and related JSON query
>> functions.
>> > This can help with better standard conformance.
>>
>> Hi Pavel,
>>
>> Are you still looking for someone to add the JsonPath support to the
>> JSON implementation?  And if so, how urgently are people waiting for
>> it?
>>
>
> yes - JsonPath support should be great. I hope so this or next commitfest
> the XMLTABLE patch will be committed, and with JsonPath I can start to work
> on JSON_TABLE function.
>
> But the JsonPath can be merged separately without dependency to
> JSON_TABLE. There are more JSON searching functions, and these functions
> should to support JsonPath be ANSI SQL compliant.
>
>
>>
>> I'd be happy to start working on a patch, but since I'm new to PG
>> development, I'm probably not the fastest person to get it done.
>>
>
> It is not problem. Probably you should to do this work without deep
> knowledges about PostgreSQL internals. The work with data types (and
> functions for data types) is well isolated from PostgreSQL engine.
>
> You can learn from current searching on JSON -
> postgresql/src/backend/utils/adt/json.c
>
> And it is good start to be PostgreSQL's hacker - I started with
> implementation of own data type and related functions.
>
> Regards
>
> Pavel
>
>
>> Kind regards,
>> Christian
>>
>
>


Re: [HACKERS] Floating point comparison inconsistencies of the geometric types

2016-11-13 Thread Emre Hasegeli
> The reason for that is that you forgot to edit an alternative
> expect file, which matches the en_US locale.

Now I understand.  Thank you for the explanation.

> But the test doesn't test for collation, and the only difference
> between the alternative expect files comes from the difference of
> collation for the query. "the workaround" seems to be the right
> way to do it. I recommend rather to leave the workaround as it is
> and remove select_views_1.out from the "expected" directory.

I will do.




Re: [HACKERS] Contains and is contained by operators of inet datatypes

2016-11-13 Thread Emre Hasegeli
> - I am not convinced that your changes to the descriptions of the operators
> necessarily make things clearer. For example "is contained by and smaller
> network (subnet)" only mentions subnets and not IP-addresses.

I was trying to avoid confusion.  <@ is the "contained by" operator,
which also returns true when both sides are equal.  We shouldn't
continue calling <<@ "contained by" as well.  I removed the "(subnet)"
and "(supernet)" additions.  Can you think of any better wording?
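
For reference, the same distinction with the operators that already exist
today (not the new names from the patch) looks like this:

SELECT inet '192.168.1.5'    <<  inet '192.168.1.0/24';  -- true, strictly contained
SELECT inet '192.168.1.0/24' <<  inet '192.168.1.0/24';  -- false, the networks are equal
SELECT inet '192.168.1.0/24' <<= inet '192.168.1.0/24';  -- true, contained by or equal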

> - Maybe change "deprecated and will eventually be removed." to "deprecated
> and may be removed in a future release.". I prefer that latter wording but I
> am fine with either.

I copied that note from the Geometric Functions and Operators page.

> - Won't renaming the functions which implement risk breaking people's
> applications? While the new names are a bit nicer I am not sure it is worth
> doing.

You are right.  I reverted that part.

> - The changes to the code look generally good.

Thank you for the review.  New version is attached.


0001-inet-contain-op-v2.patch
Description: Binary data



Re: [HACKERS] Logical Replication WIP

2016-11-13 Thread Andres Freund
On 2016-11-13 00:40:12 -0500, Peter Eisentraut wrote:
> On 11/12/16 2:18 PM, Andres Freund wrote:
> >>>  I also wonder if we want an easier to
> >>> > > extend form of pubinsert/update/delete (say to add pubddl, 
> >>> > > pubtruncate,
> >>> > > pub ... without changing the schema).
> >>> > > 
> >> > 
> >> > So like, text array that's then parsed everywhere (I am not doing
> >> > bitmask/int definitely)?
> > Yes, that sounds good to me. Then convert it to individual booleans or a
> > bitmask when loading the publications into the in-memory form (which you
> > already do).
> 
> I'm not sure why that would be better.  Adding catalog columns in future
> versions is not a problem.

It can be extended from what core provides, for extended versions of
replication solutions, for one. I presume publications/subscriptions
aren't only going to be used by built-in code.

Andres




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Andres Freund
On 2016-11-13 00:20:22 -0500, Peter Eisentraut wrote:
> On 11/11/16 11:10 AM, Tom Lane wrote:
> > boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
> > boolin: prosrc=boolin provolatile=i proparallel=s
> 
> Then we're not very far away from just using CREATE FUNCTION SQL commands.

Well, those do a lot of syscache lookups, which in turn do lookups for
functions...

Andres




Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-13 Thread Andres Freund
Hi,

On 2016-11-08 18:18:01 -0500, Tom Lane wrote:
> I think this might be better addressed by adding something to backup.sgml
> pointing out that you'd better fsync or sync your backups before assuming
> that they can't be lost.

How does a normal user do that? I don't think there's any cross-platform
advice we can give, and even on *nix the answer basically is 'sync;
sync;' which is a pretty big hammer, and might be completely
unacceptable on a busy server.

Regards,

Andres

