Re: [HACKERS] [Proposal] Make the optimiser aware of partitions ordering

2017-04-05 Thread Ronan Dunklau
On Friday 24 March 2017 08:14:03 CEST Ashutosh Bapat wrote:
> On Mon, Mar 20, 2017 at 8:33 PM, Ronan Dunklau <ronan.dunk...@dalibo.com> 
wrote:
> > On Monday 20 March 2017 15:52:03 CET Robert Haas wrote:
> >> On Mon, Mar 20, 2017 at 6:31 AM, Ronan Dunklau <ronan.dunk...@dalibo.com>
> > 
> > wrote:
> >> > With range partitioning, we guarantee that each partition contains non-
> >> > overlapping values. Since we know the range allowed for each partition,
> >> > it
> >> > is possible to sort them according to the partition key (as is done
> >> > already for looking up partitions).
> >> > 
> >> > Thus, we can generate sorted Append plans instead of MergeAppend when
> >> > sorting occurs on the partition key.
> >> 
> >> Great idea.  This is too late for v10 at this point, but please add it
> >> to the next CommitFest so we don't forget about it.
> > 
> > I know it is too late, and thought that it was too early to add it to the
> > commitfest properly since so many design decisions should be discussed.
> > Thanks for the feedback, I added it.
> 
> Thanks for working on this. I was also thinking about the same.
> 
> I will try to get back to review/work with this for v11. Mean time, I
> am working on partition-wise joins [1]. In those patches, I have added
> a structure called PartitionScheme, which represents how a relation is
> partitioned. For base relations it's mostly copy of PartitionDesc and
> PartitionKey, but then it bubbles up across join, with each
> partitioned join getting relevant partitioning scheme. If you could
> base your design such that it uses PartitionScheme, it could be used
> for joins and probably when Jeevan's patch for partition-wise
> aggregate [2] comes along, it can be used with grouping.
> 
> [1]
> https://www.postgresql.org/message-id/CAFjFpRcMWwepj-Do1otxQ-GApGPSZ1FmH7YQ
> vQTwzQOGczq_sw%40mail.gmail.com [2]
> http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg308861.html

Thank you. I should probably wait until your patch is finalised before 
spending too much time on it, and I could probably also leverage the 
incremental sort patch if there is progress on it.




-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Proposal] Make the optimiser aware of partitions ordering

2017-03-20 Thread Ronan Dunklau
On Monday 20 March 2017 15:52:03 CET Robert Haas wrote:
> On Mon, Mar 20, 2017 at 6:31 AM, Ronan Dunklau <ronan.dunk...@dalibo.com> 
wrote:
> > With range partitioning, we guarantee that each partition contains non-
> > overlapping values. Since we know the range allowed for each partition, it
> > is possible to sort them according to the partition key (as is done
> > already for looking up partitions).
> > 
> > Thus, we can generate sorted Append plans instead of MergeAppend when
> > sorting occurs on the partition key.
> 
> Great idea.  This is too late for v10 at this point, but please add it
> to the next CommitFest so we don't forget about it.

I know it is too late, and thought that it was too early to add it to the 
commitfest properly since so many design decisions should be discussed. Thanks 
for the feedback, I added it.


-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [Proposal] Make the optimiser aware of partitions ordering

2017-03-20 Thread Ronan Dunklau
>  Seq Scan on child22


EXPLAIN (COSTS OFF) SELECT * FROM parentparent ORDER BY c1, c2;
                   QUERY PLAN                    
-------------------------------------------------
 Append
   Sort Key: parent1.c1, parent1.c2
   ->  Sort
         Sort Key: parent1.c1, parent1.c2
         ->  Append
               ->  Seq Scan on parent1
               ->  Seq Scan on child11
               ->  Seq Scan on child12
   ->  Append
         Sort Key: child21.c1, child21.c2
         ->  Sort
               Sort Key: child21.c1, child21.c2
               ->  Seq Scan on child21
         ->  Sort
               Sort Key: child22.c1, child22.c2
               ->  Seq Scan on child22



= Patch design =

First, we don't really know what we're doing :). But this is a proof of 
concept, and if there is interest in this patch, let us know how we should 
have done things and we're willing to rewrite it entirely if needed.

== Overview ==

In the optimiser, we generate PathKeys representing the two sort orders by 
which a partition can be ordered (ASC and DESC), only for "native" range-based 
partitioned tables, and store them in the RelOptInfo associated with every 
parent table, along with a list of OIDs corresponding to the partitions.

Then, when the time comes to generate append paths, we add another step which 
tries to match the query_pathkeys to those of the partitions, and generates 
another append node with the sorted paths for each partition, in the expected 
order, plus a pathkey on the append path itself to indicate it is already 
sorted.
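
As a concrete illustration of the intended behaviour, here is a minimal sketch 
(table and partition names are made up; the exact plan shape depends on the 
patch):

CREATE TABLE measurements (logdate date, value int) PARTITION BY RANGE (logdate);
CREATE TABLE measurements_q1 PARTITION OF measurements
    FOR VALUES FROM ('2017-01-01') TO ('2017-04-01');
CREATE TABLE measurements_q2 PARTITION OF measurements
    FOR VALUES FROM ('2017-04-01') TO ('2017-07-01');
-- The partition bounds cannot overlap, so scanning the partitions in bound
-- order and simply concatenating the results already satisfies this ORDER BY,
-- which is what lets a sorted Append replace a MergeAppend here.
EXPLAIN (COSTS OFF) SELECT * FROM measurements ORDER BY logdate;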

== Known Problems with the patch ==

Once again, keep in mind it's a proof-of-concept  

- It conflicts with the existing patch designed to avoid adding the parent 
partitions to the append node. As such, it will almost certainly need a full 
rewrite to adapt to that, since this patch handles it in a probably 
less-than-ideal fashion.
- As of now, the patch doesn't try to match the partition pathkey to 
mergejoinable pathkeys. 
- maybe a new node should be created altogether, instead of porting most of 
the MergeAppend struct fields to the basic Append ?
- the way we store the PathKeys and partition OIDs directly in RelOptInfo is 
probably wrong
- the major impact on existing queries is that, to handle sub-partitioning, 
RangeTblEntries representing intermediate append nodes are added (but only in 
the case of native partitioning; regular inheritance is not affected). See the 
updated prepunion.c for what that means. Those RTEs are then ignored when 
generating the real Append nodes.
- new regression tests have not been written yet, and existing ones haven't 
been "fixed" to take into account the different partition ordering: in case of 
sub partitioning, the order is now "depth-first" rather than "breadth-first" 
like it was earlier.
- no documentation has been added, although we don't really know where it 
should go
- the patch lacks a lot of comments
- the key displayed in EXPLAIN output comes from the first partition, not the 
parent.
- maybe a "real" SortPath should be generated, instead of generating Sort 
nodes 
out of nowhere when creating the plan. This has been done to conform to what 
MergeAppend already did.

= Questions =

Is there interest for such an optimization ?
Should we work from the patch proposed to remove parent tables already ?
Should there be a new Node for such a "SortedAppend" operation or is it fine 
keeping it with the Append node already ? Should that instead be an 
optimization of the MergeAppend Node ?
What is or is not acceptable with regards to storing things in RelOptInfo 
(among others) ?

More generally, any kind of feedback that will help us get on the right track 
will be appreciated.


-- 
Ronan Dunklau & Julien Rouhaud
http://dalibo.com - http://dalibo.org

commit 21e09fb1038b0fb48f32a9013bee64279af8dfd7
Author: Ronan Dunklau <ronan.dunk...@dalibo.com>
Date:   Tue Feb 28 09:00:44 2017 +0100

Consider sorting order of partitions for append nodes

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index c9b55ead3d..e566928f6c 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -81,6 +81,8 @@ static void show_sort_keys(SortState *sortstate, List *ancestors,
 			   ExplainState *es);
 static void show_merge_append_keys(MergeAppendState *mstate, List *ancestors,
 	   ExplainState *es);
+static void show_append_keys(AppendState *mstate, List *ancestors,
+	   ExplainState *es);
 static void show_agg_keys(AggState *astate, List *ancestors,
 			  ExplainState *es);
 static void show_grouping_sets(PlanState *planstate, Agg *agg,
@@ -1561,6 +1563,10 @@ ExplainNode(PlanState *planstate, List *ancestors,
 			show_sort_keys(castNode(SortState, planstate), ancestors, es);
 			show_sort_info(castNode(SortState, planstate), es);
 			break;
+		case

[HACKERS] Questions about MergeAppend

2017-03-02 Thread Ronan Dunklau
Hello,

Looking into the MergeAppendPath generation, I'm a bit surprised by several 
things:

  - When generating the mergeappendpath, we use a dummy path to take the sort 
cost into account for non-sorted subpaths. This is called with the base 
relation tuples instead of the subpath's estimated number of rows. This tends 
to overestimate the sorting cost drastically, since the base relation can be 
filtered, thus reducing the number of input tuples for the sorting routine 
(see the example after this list). Please find attached a trivial patch fixing 
this.
  - Why does it only generate a "fake" SortPath for sorting purposes, instead of 
adding it to the subpath, and postpone the creation of the Sort plan node until 
later ? This also adds a bit of complexity when fixing the Sort node's cost 
later for the EXPLAIN output.
  - Why do we only consider generating a MergeAppendPath for PathKeys for which 
a sorted Path exists in any of the child relations ? It seems to me there could 
be an advantage in using a MergeAppend of explicitly sorted relations over 
sorting an Append, in particular if every existing subpath can be sorted in 
work_mem.
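
To illustrate the first point, here is a minimal sketch (names and sizes are 
made up, and whether a MergeAppend is actually chosen depends on the costs):

CREATE TABLE events (id int, payload text);
CREATE TABLE events_a () INHERITS (events);
CREATE TABLE events_b () INHERITS (events);
INSERT INTO events_a SELECT g, 'x' FROM generate_series(1, 100000) g;
INSERT INTO events_b SELECT g, 'x' FROM generate_series(1, 100000) g;
CREATE INDEX ON events_a (id);   -- gives one child a pre-sorted path
ANALYZE events_a;
ANALYZE events_b;
-- Only a few rows of events_b survive the filter, yet the explicit sort added
-- under the MergeAppend was costed on the whole child table rather than on the
-- filtered row count (see the attached patch below).
EXPLAIN SELECT * FROM events WHERE id < 10 ORDER BY id;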


-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 324829690d..a99a3aeceb 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1313,7 +1313,7 @@ create_merge_append_path(PlannerInfo *root,
 	  root,
 	  pathkeys,
 	  subpath->total_cost,
-	  subpath->parent->tuples,
+	  subpath->rows,
 	  subpath->pathtarget->width,
 	  0.0,
 	  work_mem,

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New design for FK-based join selectivity estimation

2016-12-13 Thread ronan . dunklau
On Tuesday 13 December 2016 09:10:47 CET Adrien Nayrat wrote:
> Hi hackers,
> 
> The commit 100340e2dcd05d6505082a8fe343fb2ef2fa5b2a introduces an
> estimation error:
> [...]
> 
> Estimated row is 10x larger since 100340e2d
> 
> Regards,

Hello,

I think I understand what the problem is. In get_foreign_key_join_selectivity, 
we remove the restrictinfo clauses which match a foreign key. This is done so 
that the selectivity is not applied twice (once in the function itself, once 
when processing the restrictinfos).

The problem is, for semi and anti joins, we assume that we have nothing to do 
(costsize.c:4253):

else if (jointype == JOIN_SEMI || jointype == JOIN_ANTI)
{
    /*
     * For JOIN_SEMI and JOIN_ANTI, the selectivity is defined as the
     * fraction of LHS rows that have matches.  If the referenced
     * table is on the inner side, that means the selectivity is 1.0
     * (modulo nulls, which we're ignoring for now).  We already
     * covered the other case, so no work here.
     */
}

This results in assuming that the whole outerrel will match, no matter the 
selectivity of the innerrel.

If I understand it correctly and the above is right, I think we should ignore 
SEMI or ANTI joins altogether when considering FKs, and keep the corresponding 
restrictinfos for later processing since they are already special-cased later 
on. 
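
The original report's test case was trimmed above; a minimal way to reproduce 
the kind of misestimate being discussed could look like this (hypothetical 
tables):

CREATE TABLE parent_t (id int PRIMARY KEY);
CREATE TABLE child_t (parent_id int REFERENCES parent_t (id));
INSERT INTO parent_t SELECT g FROM generate_series(1, 10000) g;
INSERT INTO child_t SELECT (g % 100) + 1 FROM generate_series(1, 10000) g;
ANALYZE parent_t;
ANALYZE child_t;
-- Semi-join on the FK column: if the matching clause is dropped and then
-- treated as "nothing to do", the planner can end up assuming that every
-- parent_t row has a match, even though only 100 of them do.
EXPLAIN SELECT * FROM parent_t p
 WHERE EXISTS (SELECT 1 FROM child_t c WHERE c.parent_id = p.id);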

Regards,

--
Ronan Dunklau

  




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Possible regression regarding estimating relation width in FDWs

2016-05-20 Thread Ronan Dunklau
Hello,

While working on adapting the Multicorn FDW for 9.6, I noticed that there is a 
regression with regard to estimating the remote relation width.

This behavior can be exposed using postgres_fdw with 
"use_remote_estimate".

Test case:

CREATE EXTENSION postgres_fdw;
CREATE SERVER localhost FOREIGN DATA WRAPPER postgres_fdw;
CREATE USER MAPPING FOR CURRENT_USER SERVER localhost;
CREATE TABLE local_table (c1 text);
INSERT INTO local_table (c1) (SELECT repeat('test', 1));
ANALYZE local_table;
CREATE FOREIGN TABLE foreign_table (c1 text) SERVER localhost OPTIONS 
(table_name 'local_table', use_remote_estimate 'true');
EXPLAIN SELECT * FROM foreign_table;

Output, on current HEAD:

  QUERY PLAN  
--
 Foreign Scan on foreign_table  (cost=100.00..101.03 rows=1 width=32)


On 9.5:
  QUERY PLAN   
---
 Foreign Scan on foreign_table  (cost=100.00..101.03 rows=1 width=472)


While the FDW correctly sets the pathtarget width, it is then overridden at a 
later point. I'm not sure what happens exactly, but it seems that the relation 
path target is ignored completely, in planner.c:1695:

/*
 * Convert the query's result tlist into PathTarget format.
 *
 * Note: it's desirable to not do this till after query_planner(),
 * because the target width estimates can use per-Var width numbers
 * that were obtained within query_planner().
 */
final_target = create_pathtarget(root, tlist);

It says explicitly that it will be computed using per-Var width numbers.

I think the current_rel->cheapest_total_path->pathtarget should be taken into 
account, at least in the FDW case.

I'm not sure if the ability to estimate the whole relation width should be 
deprecated in favor of per-var width, or if it still should be supported 
(after all, the GetForeignRelSize callback is called AFTER having set a value 
computed from the individual attr_widths, in set_foreign_size). But in any 
case, at least postgres_fdw should be updated to support that.

Sorry if that was not clear; I'm at PGCon at the moment, so if anyone wants to 
discuss this in person I'm available.


-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with big lines ?

2015-03-31 Thread Ronan Dunklau
On Monday 30 March 2015 18:45:41, Jim Nasby wrote:
 On 3/30/15 5:46 AM, Ronan Dunklau wrote:
  Hello hackers,
  
  I've tried my luck on pgsql-bugs before, with no success, so I report
  these
  problem here.
  
  The documentation mentions the following limits for sizes:
  
  Maximum Field Size  1 GB
  Maximum Row Size    1.6 TB
  
  However, it seems like rows bigger than 1GB can't be COPYed out:
  
  ro=# create table test_text (c1 text, c2 text);
  CREATE TABLE
  ro=# insert into test_text (c1) VALUES (repeat('a', 536870912));
  INSERT 0 1
  ro=# update test_text set c2 = c1;
  UPDATE 1
  
  Then, trying to dump or copy that results in the following error:
  
  ro=# COPY test_text TO '/tmp/test';
  ERROR:  out of memory
  DÉTAIL : Cannot enlarge string buffer containing 536870913 bytes by
  536870912 more bytes.
  
  In fact, the same thing happens when using a simple SELECT:
  
  ro=# select * from test_text ;
  ERROR:  out of memory
  DÉTAIL : Cannot enlarge string buffer containing 536870922 bytes by
  536870912 more bytes.
  
  In the case of COPY, the server uses a StringInfo to output the row. The
  problem is, a StringInfo is capped to MAX_ALLOC_SIZE (1GB - 1), but a row
  should be able to hold much more than that.
 
 Yeah, shoving a whole row into one StringInfo is ultimately going to
 limit a row to 1G, which is a far cry from what the docs claim. There's
 also going to be problems with FE/BE communications, because things like
 pq_sendbyte all use StringInfo as a buffer too. So while Postgres can
 store a 1.6TB row, you're going to find a bunch of stuff that doesn't
 work past around 1GB.
 
  So, is this a bug ? Or is there a caveat I would have missed in the
  documentation ?
 
 I suppose that really depends on your point of view. The real question
 is whether we think it's worth fixing, or a good idea to change the
 behavior of StringInfo.
 

 StringInfo uses int's to store length, so it could possibly be changed,
 but then you'd just error out due to MaxAllocSize.
 
 Now perhaps those could both be relaxed, but certainly not to the extent
 that you can shove an entire 1.6TB row into an output buffer.

Another way to look at it would be to work in small chunks. For the first test 
case (rows bigger than 1GB), maybe the COPY command could be rewritten to work 
in chunks, flushing the output more often if needed.

For the conversion-related issues, I don't really see any other solution than 
extending StringInfo to allow for more than 1GB of data. On the other hand, 
those can easily be circumvented by using COPY ... WITH BINARY.

 
 The other issue is that there's a LOT of places in code that blindly
 copy detoasted data around, so while we technically support 1GB toasted
 values you're probably going to be quite unhappy with performance. I'm
 actually surprised you haven't already seen this with 500MB objects.
 
 So long story short, I'm not sure how worthwhile it would be to try and
 fix this. We probably should improve the docs though.
 

I think that having data that can't be output by pg_dump is quite surprising, 
and if this is not fixable, I agree that it should clearly be documented.

 Have you looked at using large objects for what you're doing? (Note that
 those have their own set of challenges and limitations.)

Yes I do. This particular customer of ours did not mind the performance 
penalty of using bytea objects as long as it was convenient to use. 

 
  We also hit a second issue, this time related to bytea encoding.
 
 There's probably several other places this type of thing could be a
 problem. I'm thinking of conversions in particular.

Yes, that's what the two other test cases I mentioned are about: any conversion 
leading to a size greater than 1GB results in an error, even implicit 
conversions like doubling backslashes in the output.

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


[HACKERS] pg_dump / copy bugs with big lines ?

2015-03-30 Thread Ronan Dunklau
Hello hackers,

I've tried my luck on pgsql-bugs before, with no success, so I report these 
problems here.

The documentation mentions the following limits for sizes:

Maximum Field Size  1 GB
Maximum Row Size    1.6 TB

However, it seems like rows bigger than 1GB can't be COPYed out:

ro=# create table test_text (c1 text, c2 text);
CREATE TABLE
ro=# insert into test_text (c1) VALUES (repeat('a', 536870912));
INSERT 0 1
ro=# update test_text set c2 = c1;
UPDATE 1

Then, trying to dump or copy that results in the following error:

ro=# COPY test_text TO '/tmp/test';
ERROR:  out of memory
DÉTAIL : Cannot enlarge string buffer containing 536870913 bytes by 536870912 
more bytes.

In fact, the same thing happens when using a simple SELECT:

ro=# select * from test_text ;
ERROR:  out of memory
DÉTAIL : Cannot enlarge string buffer containing 536870922 bytes by 536870912 
more bytes.

In the case of COPY, the server uses a StringInfo to output the row. The 
problem is, a StringInfo is capped to MAX_ALLOC_SIZE (1GB - 1), but a row 
should be able to hold much more than that.

So, is this a bug ? Or is there a caveat I would have missed in the 
documentation ?

We also hit a second issue, this time related to bytea encoding.

This test case is a bit more complicated, since I had to use an external 
(client) program to insert my data. It involves inserting a string that fit 
into 1GB when encoded in escape format, but is larger than that in hex, and 
another string which fits in 1GB using the hex format, but is larger than that 
in escape:

from psycopg2 import connect
from io import BytesIO

conn = connect(dbname="ro")
cur = conn.cursor()
fullcontent = BytesIO()

# Write a binary string that weighs less
# than 1 GB when escape encoded, but more than
# that if hex encoded
for i in range(200):
    fullcontent.write(b"aaa" * 100)
fullcontent.seek(0)
cur.copy_from(fullcontent, "test_bytea")

fullcontent.seek(0)
fullcontent.truncate()

# Write another binary string that weighs
# less than 1GB when hex encoded, but more than
# that if escape encoded
cur.execute("SET bytea_output = 'hex'")
fullcontent.write(b"\\\\x")
for i in range(300):
    fullcontent.write(b"00" * 100)
fullcontent.seek(0)
cur.copy_from(fullcontent, "test_bytea")

cur.execute("COMMIT;")
cur.close()

I couldn't find an invocation of pg_dump which would allow me to dump both 
lines:

ro@ronan_laptop /tmp % PGOPTIONS="-c bytea_output=escape" pg_dump -Fc > /dev/null
pg_dump: Dumping the contents of table "test_bytea" failed: PQgetResult() failed.
pg_dump: Error message from server: ERROR:  invalid memory alloc request size 
120001
pg_dump: The command was: COPY public.test_bytea (c1) TO stdout;
ro@ronan_laptop /tmp % PGOPTIONS="-c bytea_output=hex" pg_dump -Fc > /dev/null
pg_dump: Dumping the contents of table "test_bytea" failed: PQgetResult() failed.
pg_dump: Error message from server: ERROR:  invalid memory alloc request size 
120003
pg_dump: The command was: COPY public.test_bytea (c1) TO stdout;


Using a COPY with binary format works:

ro=# COPY test_bytea TO '/tmp/test' WITH BINARY;


There seems to be a third issue, with regards to escape encoding: the 
backslash character is escaped by adding another backslash. This means that a 
field whose size is just under 1GB using the escape encoding may not be able 
to be output once the backslashes are escaped.

For example, let's consider a string consisting of 3 '\' characters:


ro=# select length(c1) from test_bytea;
  length   
---
 3
(1 ligne)

ro=# select length(encode(c1, 'escape')) from test_bytea ;
  length   
---
 6
(1 ligne)

ro=# set bytea_output to escape;
SET
ro=# copy test_bytea to '/tmp/test.csv' ; 

ERROR:  out of memory
DÉTAIL : Cannot enlarge string buffer containing 1073741822 bytes by 1 more 
bytes.

I think pg_dump should not error out on any data which was valid upon 
insertion. It seems the fix would be non-trivial, since StringInfo structures 
are relying on a limit of MaxAllocSize. Or am I missing something ?

Thank you.

 
-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] Regarding pg_stat_statements

2015-03-13 Thread Ronan Dunklau
On Friday 13 March 2015 14:59:28, Sreerama Manoj wrote:
 Hi,
   As we know that pg_stat_statements will monitor the queries after
 normalizing the queries(Removes the values present in query). I want to
 know is there a way to store those normalized values because I want to
 check the type of data(values) ,range of data that is being hit to the
 database. I am using Postgres 9.4

Hello.

You may be interested in the pg_qualstats extension: 
https://github.com/dalibo/pg_qualstats

The purpose of the extension is to track values like pg_stat_statements does, 
but at the predicate level rather than the statement level. It stores 
normalized predicates as well as constants.

The documentation is here:

http://powa.readthedocs.org/en/latest/stats_extensions/pg_qualstats.html#pg-qualstats

It won't give you all normalized values though, only those present in 
predicates.
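
A minimal usage sketch, assuming the extension is installed and added to 
shared_preload_libraries (see the documentation above for the exact columns):

CREATE EXTENSION pg_qualstats;
-- run the workload, then inspect the collected predicates and their constants:
SELECT * FROM pg_qualstats;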

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


[HACKERS] Dumping database creation options and ACLs

2014-12-08 Thread Ronan Dunklau
Hello.

As of now, the only way to restore database options and ACLs is to use 
pg_dumpall without the globals options. The often recommended pg_dumpall -g + 
individual dumps of the target databases doesn't restore those.

Since pg_dump/pg_restore offer the ability to create the database, they should 
do so with the correct owner, options and database ACLs.
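
For example, properties like the following live on the database itself, and 
are what the usual "pg_dumpall -g + per-database pg_dump" workflow fails to 
restore (names are made up):

CREATE DATABASE appdb OWNER app_owner;
ALTER DATABASE appdb SET work_mem = '64MB';
REVOKE CONNECT ON DATABASE appdb FROM PUBLIC;
GRANT CONNECT ON DATABASE appdb TO app_role;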

There was some discussion about those issues a while ago (see 
http://www.postgresql.org/message-id/11646.1272814...@sss.pgh.pa.us for 
example). As I understand it, the best way to handle that would be to push 
these modifications in pg_dump, but it is unclear how it should be done with 
regards to restoring to a different database.

In the meantime, it would be great to add an option to pg_dumpall allowing to 
dump this information. We could add the db creation in the output of 
pg_dumpall -g,  and add a specific --createdb-only option (similar to --roles-
only and --tablespaces-only).

Would such a patch be welcome ?

PS: this email was originally sent to the pgsql-bugs mailing list


-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] foreign data wrapper option manipulation during Create foreign table time?

2014-10-30 Thread Ronan Dunklau
On Wednesday 29 October 2014 14:16:23, Demai Ni wrote:
 Robert and Ronan,
 
 many thanks for your response.
 
 I realized there is no clean way/api for it. maybe a hacking of ptree can
 do the trick.. :-)
 
 I will also take a look at IMPORT FOREIGN SCHEMA. However, for this
 requirement, I still need the user to input filename or filefolder, and I'd
 like to process the file(s) during create foreign table time, and save the
 processed result somewhere like ftoptions column in pg_foreign_table. may
 be some other way I can save the process result and make it assessable
 during query time?

You can pass options to IMPORT FOREIGN SCHEMA. So, you could maybe implement 
it so that the user can do that:

IMPORT FOREIGN SCHEMA public FROM SERVER file_server INTO local_schema OPTIONS  
(filename '/path/to/the/file');

Your FDW would then issue several CREATE FOREIGN TABLE statements, with all 
the necessary options.

Or even, to allow importing multiple files at once:

IMPORT FOREIGN SCHEMA public FROM SERVER file_server INTO local_schema OPTIONS  
(directory '/path/to/the/file_dir/');
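
Internally, the FDW's IMPORT FOREIGN SCHEMA implementation would then issue 
statements along these lines, one per file found (column list and option 
values are hypothetical):

CREATE FOREIGN TABLE local_schema.measurements (c1 int, c2 text)
    SERVER file_server
    OPTIONS (format 'text', filename '/path/to/the/file_dir/measurements.txt');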



 
 Demai
 
 On Wed, Oct 29, 2014 at 10:01 AM, Ronan Dunklau ronan.dunk...@dalibo.com
 
 wrote:
  On Wednesday 29 October 2014 12:49:12, Robert Haas wrote:
   On Tue, Oct 28, 2014 at 5:26 PM, Demai Ni nid...@gmail.com wrote:
I am looking for a couple pointers here about fdw, and how to change
  
  the
  
option values during CREATE table time.

I am using postgres-xc-1.2.1 right now. For example, it contains
  
  file_fdw,
  
whose create-table-stmt looks like:
CREATE FOREIGN TABLE t1()
SERVER file_server
OPTIONS(format 'text',filename 'testing.txt');

I would like to replace the 'testing.txt' with absolute path like
'/user/testor1/testing.txt', and make sure the new value is saved in
pg_foreign_table; the file_fdw_validator is used to validate the
  
  options,
  
but is there a way to replace the optionValue here? And get the new
  
  value
  
stored in pg_foreign_table?

Thanks

BTW, in my real use case, I am trying to interpret a hdfs file and
  
  would
  
need to save some hostname/port information in the option value, which
  
  not
  
necessary specified by user.
   
   I don't think there's going to be a clean way to do this.  The
   intention of the system is that the user-provided options are simply
   stored, not that the FDW author is going to use the options list to
   store their own bits and pieces.
  
  I would do that during the IMPORT FOREIGN SCHEMA statement. That way, the
  user
  doesn't have to specify those options: they would be generated at IMPORT
  time,
  and the user could change them later if really needed.
  
  --
  Ronan Dunklau
  http://dalibo.com - http://dalibo.org

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] foreign data wrapper option manipulation during Create foreign table time?

2014-10-29 Thread Ronan Dunklau
On Wednesday 29 October 2014 12:49:12, Robert Haas wrote:
 On Tue, Oct 28, 2014 at 5:26 PM, Demai Ni nid...@gmail.com wrote:
  I am looking for a couple pointers here about fdw, and how to change the
  option values during CREATE table time.
  
  I am using postgres-xc-1.2.1 right now. For example, it contains file_fdw,
  whose create-table-stmt looks like:
  CREATE FOREIGN TABLE t1()
  SERVER file_server
  OPTIONS(format 'text',filename 'testing.txt');
  
  I would like to replace the 'testing.txt' with absolute path like
  '/user/testor1/testing.txt', and make sure the new value is saved in
  pg_foreign_table; the file_fdw_validator is used to validate the options,
  but is there a way to replace the optionValue here? And get the new value
  stored in pg_foreign_table?
  
  Thanks
  
  BTW, in my real use case, I am trying to interpret a hdfs file and would
  need to save some hostname/port information in the option value, which not
  necessary specified by user.
 
 I don't think there's going to be a clean way to do this.  The
 intention of the system is that the user-provided options are simply
 stored, not that the FDW author is going to use the options list to
 store their own bits and pieces.

I would do that during the IMPORT FOREIGN SCHEMA statement. That way, the user 
doesn't have to specify those options: they would be generated at IMPORT time, 
and the user could change them later if really needed.

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)

2014-10-12 Thread Ronan Dunklau
On Sunday 12 October 2014 13:17:00, Andres Freund wrote:
 On 2014-10-12 23:13:27 +1300, David Rowley wrote:
  On Wed, Sep 24, 2014 at 8:02 AM, Christoph Berg c...@df7cb.de wrote:
   Re: Tom Lane 2014-09-23 15155.1411493...@sss.pgh.pa.us
   
Robert Haas robertmh...@gmail.com writes:
 On Sat, Sep 20, 2014 at 4:13 PM, Christoph Berg c...@df7cb.de wrote:
 Can we have EXPLAIN (timing off) in 9.4+ hide the Planning time
 line? That would even be backwards compatible with 9.x where it
 would
 be a no-op.
 
 I don't think that'll work because:
 /* check that timing is used with EXPLAIN ANALYZE */
 if (es.timing && !es.analyze)
     ereport(ERROR,
             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
              errmsg("EXPLAIN option TIMING requires ANALYZE")));

It looks to me like that would complain about EXPLAIN (TIMING ON),
not the case Christoph is suggesting.  What he proposes seems a bit
odd and non-orthogonal, but we could make the code do it if we wanted.
   
   I don't think this warrants a new flag, and TIMING OFF seems to be the
   right naming for it. (In fact it was the first I tried, and I was
   cursing quite a bit over the lack of configurability until I realized
   that COSTS OFF disabled the planning time display as well.) It might
   be a bit odd, but it's easy to remember.
  
  I'm pretty interested in seeing something change around here.
  The patch I'm working on at the moment (INNER JOIN removals) implements
  skipping of joins at execution time rather than planning time. Currently
  I'm working on the regression test for this and it's not all that easy due
  to the execution time appearing in the results.
  
  An explain analyze output from master with the patch can look something
  like:
  
  explain (analyze, costs off, timing off)
  select a.* from a inner join b on a.b_id = b.id inner join c on b.c_id =
  c.id;
  
  QUERY PLAN
  ---
   Hash Join (actual rows=1 loops=1)
     Hash Cond: (b.c_id = c.id)
     ->  Hash Join (actual rows=1 loops=1)
           Hash Cond: (a.b_id = b.id)
           ->  Seq Scan on a (actual rows=1 loops=1)
           ->  Hash (never executed)
                 ->  Seq Scan on b (never executed)
     ->  Hash (never executed)
           ->  Seq Scan on c (never executed)
   Execution time: 0.092 ms
  (10 rows)
 
 So you're now the third person reporting problems here. Let's remove
 'execution time' for COSTS off.
 
 I personally would even say that we should backpatch that to make
 backpatches involving regression tests less painful.

That wouldn't solve the first problem mentioned, which is that for some 
regression tests one may want to test the costs themselves, which is now 
impossible with the new planning-time output.

What would IMO make both cases work would be to eliminate ALL timing output 
with TIMING OFF, not only the timing on the individual nodes. As was mentioned 
before, it is a bit counter-intuitive to have COSTS OFF disable the planning 
time display, and not TIMING OFF.
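
To make the point concrete, with the current behaviour a statement like the 
following still prints the total timing lines even though per-node timing is 
disabled (a sketch; the exact output obviously varies):

EXPLAIN (ANALYZE, TIMING OFF) SELECT 1;
-- per-node timings are suppressed, but lines such as
--   Planning time: ... ms
--   Execution time: ... ms
-- still appear, which is what makes regression test output unstable.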

 
 Greetings,
 
 Andres Freund

-- 
Ronan Dunklau

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-07-10 Thread Ronan Dunklau
On Wednesday 9 July 2014 21:30:05, Tom Lane wrote:
 Michael Paquier michael.paqu...@gmail.com writes:
  I guess that this implementation is enough as a first shot, particularly
  regarding the complexity that default expression import would bring up for
  postgres_fdw (point raised during the review, but not really discussed
  afterwards).
 
 Yeah.  I'm actually more concerned about the lack of collation import,
 but that's unfixable unless we can figure out how to identify collations
 in a platform-independent way.
 
   regards, tom lane

Thank you for working on this patch. I'm not really fond of building strings 
in an FDW for the core to parse them again, but I don't see any other solution 
to the problem you mentioned.

Similarly to what we do for the schema, we should also check that the server 
in the parsed statement is the one we are importing from.


-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-07-07 Thread Ronan Dunklau
On Monday 7 July 2014 07:58:33, Robert Haas wrote:
 On Mon, May 26, 2014 at 12:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Albe Laurenz laurenz.a...@wien.gv.at writes:
  In addition to data type mapping questions (which David already raised)
  I have one problem when I think of the Oracle FDW:
  
  Oracle follows the SQL standard in folding table names to upper case.
  So this would normally result in a lot of PostgreSQL foreign tables
  with upper case names, which would be odd and unpleasant.
  
  I cannot see a way out of that, but I thought I should mention it.
  
  It seems like it would often be desirable for the Oracle FDW to smash
  all-upper-case names to all-lower-case while importing, so that no quoting
  is needed on either side.  I doubt though that this is so desirable that
  it should happen unconditionally.
  
  Between this and the type-mapping questions, it seems likely that
  we're going to need a way for IMPORT FOREIGN SCHEMA to accept
  user-supplied control options, which would in general be specific
  to the FDW being used.  (Another thing the SQL committee failed to
  think about.)
 
 Is this part of the SQL standard?  What is it defined to do about
 non-table objects?

The OPTIONS clause is not part of the SQL Standard.

Regarding non-table objects, the standard only talks about tables, and nothing 
else. 

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-07-01 Thread Ronan Dunklau
On Tuesday 1 July 2014 06:59:49, Albe Laurenz wrote:
 Michael Paquier wrote:
 
  After sleeping on it, I have put my hands on the postgres_fdw portion and
  came up with a largely
 simplified flow, resulting in the patch attached.
 
 
 [...]
 
 
  Ronan, what do you think of those patches? I have nothing more to add, and
  I think that they should be
 looked by a committer. Particularly the FDW
  API that is perhaps not the best fit, but let's see some extra opinions
  about that.

The remote_schema parameter can be used for SQL injection. Either we should go 
back to using parameters, or be extra careful. Since the remote schema is 
parsed as a name, it is limited to 64 characters which is not that useful for 
an SQL injection, but still.
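
As an illustration of the "extra careful" option, the remote catalog query 
could be built with explicit quoting; the query text and names below are 
hypothetical, not what the patch actually sends:

-- quote_literal() (or a bound parameter) keeps a hostile schema name inert:
SELECT format(
    'SELECT c.relname
       FROM pg_class c
       JOIN pg_namespace n ON n.oid = c.relnamespace
      WHERE c.relkind = ''r'' AND n.nspname = %s',
    quote_literal('import_source'));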

The new query as you wrote it returns the typname (it was cast to regtype 
before). This is not schema-qualified, and will fail when importing tables 
with columns of a type not in search_path.

The regression tests don't pass: a user name is hard-coded in the result of 
DROP USER MAPPING. Should we expect the tests to be run as postgres ?

 
 
 I looked the the API and ist documentation, and while I saw no problem with
 the API,
 I think that the documentation still needs some attention:
 
 It mentions a local_schema, which doesn't exist (any more?).
 It should be mentioned that the CreateForeignTableStmt's
 base.relation->schemaname should be set to NULL.
 Also, it would be nice to find a few words for options,
 maybe explaining a potential application.
 
 Yours,
 Laurenz Albe

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] Array of composite types returned from python

2014-07-01 Thread Ronan Dunklau
On Sunday 29 June 2014 16:54:03, Tom Lane wrote:
 Abhijit Menon-Sen a...@2ndquadrant.com writes:
  3) This is such a simple change with no new infrastructure code
  (PLyObject_ToComposite already exists). Can you think of a reason
  why this wasn't done until now? Was it a simple miss or purposefully
  excluded?
 
  This is not an authoritative answer: I think the infrastructure was
  originally missing, but was later added in #bc411f25 for OUT parameters.
  Perhaps it was overlooked at the time that the same code would suffice
  for this earlier-missing case. (I've Cc:ed Peter E. in case he has any
  comments.)
 
  I think the patch is ready for committer.

Sorry for being this late.

I've tested the patch, everything seems to work as expected, including complex
nesting of Composite and array types.

No documentation changes are needed, since the limitation wasn't even
mentioned before.

Regression tests are ok, and the patch seems simple enough. Formatting looks
OK too.


 I took a quick look at this; not really a review either, but I have
 a couple comments.

 1. While I think the patch does what it intends to, it's a bit distressing
 that it will invoke the information lookups in PLyObject_ToComposite over
 again for *each element* of the array.  We probably ought to quantify that
 overhead to see if it's bad enough that we need to do something about
 improving caching, as speculated in the comment in PLyObject_ToComposite.

I don't know how to do that without implementing the cache itself.


 2. I wonder whether the no-composites restriction in plpy.prepare
 (see lines 133ff in plpy_spi.c) could be removed as easily.

Hum, I tried that, but it's not that easy: lifting the restriction results in a
SEGFAULT when trying to pfree the parameters given to SPI_ExecutePlan (line
320 in plpy_spi.c).

Correct me if I'm wrong, but I think the problem is that HeapTupleGetDatum
returns the t_data field, whereas the heap_form_tuple allocation returns the
address of the HeapTuple itself. Thus, the datum itself has not been palloc'd.

Changing the HeapTupleGetDatum call to heap_copy_tuple_as_datum fixes this
issue, but I'm not sure this is the best way to do it.

The attached patch implements this.





   regards, tom lane

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

diff --git a/src/pl/plpython/expected/plpython_composite.out b/src/pl/plpython/expected/plpython_composite.out
index 61b3046..e7a7edb 100644
--- a/src/pl/plpython/expected/plpython_composite.out
+++ b/src/pl/plpython/expected/plpython_composite.out
@@ -270,9 +270,13 @@ yield {'tab': [['first', 1], ['second', 2]],
   {'first': 'fourth', 'second': 4}]}
 $$ LANGUAGE plpythonu;
 SELECT * FROM composite_types_table();
-ERROR:  PL/Python functions cannot return type table_record[]
-DETAIL:  PL/Python does not support conversion to arrays of row types.
-CONTEXT:  PL/Python function composite_types_table
+tab |typ
++
+ {(first,1),(second,2)} | {(third,3),(fourth,4)}
+ {(first,1),(second,2)} | {(third,3),(fourth,4)}
+ {(first,1),(second,2)} | {(third,3),(fourth,4)}
+(3 rows)
+
 -- check what happens if the output record descriptor changes
 CREATE FUNCTION return_record(t text) RETURNS record AS $$
 return {'t': t, 'val': 10}
diff --git a/src/pl/plpython/expected/plpython_spi.out b/src/pl/plpython/expected/plpython_spi.out
index 5175794..de547d2 100644
--- a/src/pl/plpython/expected/plpython_spi.out
+++ b/src/pl/plpython/expected/plpython_spi.out
@@ -376,6 +376,15 @@ plan = plpy.prepare(select fname, lname from users where fname like $1 || '%',
 [text])
 c = plpy.cursor(plan, [a, b])
 $$ LANGUAGE plpythonu;
+CREATE TYPE test_composite_type AS (
+  a1 int,
+  a2 varchar
+);
+CREATE OR REPLACE FUNCTION plan_composite_args() RETURNS test_composite_type AS $$
+plan = plpy.prepare(select $1::test_composite_type as c1, [test_composite_type])
+res = plpy.execute(plan, [{a1: 3, a2: label}])
+return res[0][c1]
+$$ LANGUAGE plpythonu;
 SELECT simple_cursor_test();
  simple_cursor_test
 
@@ -432,3 +441,9 @@ CONTEXT:  Traceback (most recent call last):
   PL/Python function cursor_plan_wrong_args, line 4, in module
 c = plpy.cursor(plan, [a, b])
 PL/Python function cursor_plan_wrong_args
+SELECT plan_composite_args();
+ plan_composite_args
+-
+ (3,label)
+(1 row)
+
diff --git a/src/pl/plpython/expected/plpython_types.out b/src/pl/plpython/expected/plpython_types.out
index b98318c..9576905 100644
--- a/src/pl/plpython/expected/plpython_types.out
+++ b/src/pl/plpython/expected/plpython_types.out
@@ -632,13 +632,12 @@ PL/Python function test_type_conversion_array_mixed2
 CREATE FUNCTION test_type_conversion_array_record() RETURNS type_record[] AS $$
 return [None]
 $$ LANGUAGE plpythonu;
-ERROR:  PL/Python functions cannot return type type_record[]
-DETAIL:  PL/Python does

Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-06-30 Thread Ronan Dunklau
On Monday 30 June 2014 16:43:38, Michael Paquier wrote:
 On Thu, Jun 19, 2014 at 11:00 PM, Michael Paquier michael.paqu...@gmail.com
  wrote:
   This seems to be related to re-using the table-name between
 
  invocations. The
 
   attached patch should fix point 2. As for point 1, I don't know the
 
  cause for
 
   it. Do you have a reproducible test case ?
  
   Sure. I'll try harder when looking in more details at the patch for
   postgres_fdw :)
 
  With v2, I have tried randomly some of the scenarios of v1 plus some
  extras, but was not able to reproduce it.
 
   I'll look into the patch for postgres_fdw later.
 
  And here are some comments about it, when applied on top of the other 2
  patches.
  1) Code compiles without warning, regression tests pass.
  2) In fetch_remote_tables, the palloc for the parameters should be
  done after the number of parameters is determined. In the case of
  IMPORT_ALL, some memory is wasted for nothing.
  3) Current code is not able to import default expressions for a table.
  A little bit of processing with pg_get_expr would be necessary:
  select pg_catalog.pg_get_expr(adbin, adrelid) from pg_attrdef;
  There are of course bonus cases like SERIAL columns coming immediately
  to my mind but it would be possible to determine if a given column is
  serial with pg_get_serial_sequence.
  This would be a good addition for the FDW IMO.
  4) The same applies of course to the constraint name: CREATE FOREIGN
  TABLE foobar (a int CONSTRAINT toto NOT NULL) for example.
  5) A bonus idea: enable default expression obtention and null/not null
  switch by default but add options to disable their respective
  obtention.
  6) Defining once PGFDW_IMPORTRESULT_NUMCOLS at the top of
  postgres_fdw.c without undefining would be perfectly fine.
  7) In postgresImportForeignSchema, the palloc calls and the
  definitions of the variables used to save the results should be done
  within the for loop.
  8) At quick glance, the logic of postgresImportForeignSchema looks
  awkward... I'll have a second look with a fresher mind later on this
  one.

 While having a second look at the core patch, I have found myself
 re-hacking it, fixing a couple of bugs and adding things that have been
 missing in the former implementation:
 - Deletions of unnecessary structures to simplify code and make it cleaner
 - Fixed a bug related to the management of local schema name. A FDW was
 free to set the schema name where local tables are created, this should not
 be the case.
 - Improved documentation, examples and other things, fixed doc padding for
 example
 - Added some missing stuff in equalfuncs.c and copyfuncs.c
 - Some other things.
 With that, core patch looks pretty nice actually, and I think that we
 should let a committer have a look at this part at least for this CF.

 Also, the postgres_fdw portion has been updated based on the previous core
 patch modified, using a version that Ronan sent me, which has addressed the
 remarks I sent before. This patch is still lacking documentation, some
 cleanup, and regression tests are broken, but it can be used to test the
 core feature. I unfortunately don't have much time today but I am sending
 this patch either way to let people play with IMPORT SCHEMA if I don't come
 back to it before.

The regression tests fail because of a typo in pg_type.h: BOOLARRAYOID should
be defined to 1000, not 1003 (which clashes against NAMEARRAYOID).

What do you think should be documented, and where ?


 Regards,

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

From 1cc2922e9087f2d852d2b1196314d7e35dce42a3 Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Mon, 30 Jun 2014 16:36:47 +0900
Subject: [PATCH 2/2] Add support of IMPORT SCHEMA for postgres_fdw

---
 contrib/postgres_fdw/expected/postgres_fdw.out |  62 ++
 contrib/postgres_fdw/postgres_fdw.c| 259 +
 contrib/postgres_fdw/sql/postgres_fdw.sql  |  30 +++
 src/include/catalog/pg_type.h  |   2 +
 4 files changed, 353 insertions(+)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 2e49ee3..07e6c11 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2834,3 +2834,65 @@ NOTICE:  NEW: (13,test triggered !)
  (0,27)
 (1 row)

+-- Test IMPORT FOREIGN SCHEMA statement
+CREATE schema import_destination;
+CREATE schema import_source;
+CREATE TABLE import_source.t1 (c1 int, c2 varchar NOT NULL);
+CREATE TABLE import_source.t2 (c1 int, c2 varchar);
+CREATE TABLE import_source.t3 (c1 int, c2 varchar);
+IMPORT FOREIGN SCHEMA import_source FROM SERVER loopback INTO import_destination;
+\det+ import_destination;
+List of foreign tables
+   Schema   | Table |  Server  |  FDW Options   | Description

Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-06-16 Thread Ronan Dunklau
On Monday 16 June 2014 11:32:38, Atri Sharma wrote:
 On Mon, Jun 16, 2014 at 11:28 AM, Michael Paquier michael.paqu...@gmail.com
  wrote:
  Just wondering: what about the case where the same data type is
  defined on both local and remote, but with *different* definitions? Is
  it the responsibility of the fdw to check for type incompatibilities?
 
 Logically, should be.
 

This is akin to what Stephen proposed, to allow IMPORT FOREIGN SCHEMA to also 
import types. 

The problem with checking if the type is the same is deciding where to stop. 
For composite types, sure, it should be easy. But what about built-in types ? 
Or types provided by an extension / a native library ? These could 
theoretically change from one release to another.

 Just wondering, cant we extend the above proposed function  typedef List
 *(*ImportForeignSchema_function) (ForeignServer *server,
 ImportForeignSchemaStmt * parsetree); be changed a bit to give exact type
 definitions from the remote side as well?

I toyed with this idea, but the more I think about it, the less I'm sure what 
the API should look like, should we ever decide to go beyond the standard and 
import more than tables. Should the proposed function's return value be changed 
to void, letting the FDW execute any DDL statement ? The advantage of 
returning a list of statements was to make it clear that tables should be 
imported, while letting the core enforce the INTO local_schema part of the 
clause.

I would prefer if the API is limited by design to import tables. This 
limitation can always be bypassed by executing arbitrary statements before 
returning the list of ImportForeignSchemaStmt*. 

For the postgres_fdw specific case, we could add two IMPORT options (since it 
looked like we had a consensus on this):

 - import_types
 - check_types

Import types would import simple, composite types, issuing the corresponding 
statements before returning to core.

Check types would compare the local and remote definition for composite types. 
For types installed by an extension, it would check that the local type has 
also been created by an extension of the same name, installed in the same 
schema, raising a warning if the local and remote version differ.
For built-in types, a warning would be raised if the local and remote versions 
of PostgreSQL differ.
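
With those options, an import could then be spelled like this (option names as 
proposed above, everything else made up):

IMPORT FOREIGN SCHEMA remote_schema FROM SERVER pgserver INTO local_schema
    OPTIONS (import_types 'true', check_types 'true');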

However, I don't know what we should be doing about types located in a 
different schema. For example, if the remote table s1.t1 has a column of 
composite type s2.typ1, should we import typ1 into s1 ? Into s2, optionally 
creating the non-existing schema ? Raise an error ?

Regards,

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-05-27 Thread Ronan Dunklau
 Additionally, I don't really see how that would be useful in a general case.
 With an in-core defined meaning of type transformation, any FDW that doesn't
 fit exactly into the model would have a hard time. For example, what happens
 if an FDW is only ever capable of returning text ?

 That's actually the case where it's most important to have the feature
 all the way down to the column level.

I'm not sure configuration about specific columns from specific tables would 
be that useful: if you already know the structure of the table, you can either 
create it yourself, or run an alter column statement just after creating it.

Alternatively, with the arbitrary options clause proposed by Tom and detailed 
below, one could use the LIMIT TO / EXCEPT clauses to fine-tune what options 
should apply.

 That's why I suggested doing it via CREATE/ALTER with JSONB or similar
 containing the details rather than inventing new SQL grammar, an
 option I included only to dismiss it.

Hum, adding a simple TYPE MAPPING is already inventing new SQL grammar, but 
it's less invasive.

 Between this and the type-mapping questions, it seems likely that
 we're going to need a way for IMPORT FOREIGN SCHEMA to accept
 user-supplied control options, which would in general be specific
 to the FDW being used.  (Another thing the SQL committee failed to
 think about.)

So, without extending the syntax too much, we could add options:

IMPORT FOREIGN SCHEMA remote_schema FROM SERVER server_name INTO local_schema
[ { LIMIT TO | EXCEPT } (table_name [, ...])]
[ OPTIONS (option_list) ]

This option list could then contain fdw-specific options, including type 
mapping.

Or we could add a specific clause:

IMPORT FOREIGN SCHEMA remote_schema FROM SERVER server_name INTO local_schema
[ { LIMIT TO | EXCEPT } (table_name [, ...])]
[ OPTIONS (option_list) ]
[ TYPE MAPPING some mapping_definition ]

With mapping_definition being either a tuple, or a well-defined jsonb format 
common across FDWs.
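
Under the first form, an FDW-specific mapping could be passed like this (the 
option name and JSON payload are purely illustrative):

IMPORT FOREIGN SCHEMA remote_schema FROM SERVER mysql_server INTO local_schema
    LIMIT TO (orders, customers)
    OPTIONS (type_mapping '{"datetime": "text", "int unsigned": "int8"}');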

A third solution, which I don't like but doesn't modify the SQL grammar, would 
be to encode the options in the remote_schema name.

Would one of those solutions be acceptable ?

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-05-27 Thread Ronan Dunklau
On Tuesday 27 May 2014 09:52:27, Stephen Frost wrote:
 * David Fetter (da...@fetter.org) wrote:
  In the easy case of PostgreSQL, you might also be able to establish
  whether the UDT in the already defined locally case above is
  identical to the one defined remotely, but I don't think it's possible
  even in principle for non-PostgreSQL remote systems, possibly not even
  for ones with non-identical architecture, PostgreSQL major version,
  etc.
 
 For my 2c, it'd be very handy if IMPORT had an option or similar to also
 copy over all (in the schema) or at least any depended-upon UDTs.  It'd
 really be nice if we had an ability to check the remote UDT vs. the
 local one at some point also, since otherwise you're going to get bitten
 at run-time when querying the foreign table.

The specification only talks about importing tables, not types, views or (why 
not ?) foreign tables.

If we want to extend the spec further, we could accept more than 
CreateForeignTableStmt as return values from the API:
 - CreateDomainStmt
 - CompositeTypeStmt
 - AlterTableCmd


The core code would be responsible for executing them, and making sure the 
destination schema is correctly positioned on all of those.

The postgres_fdw behaviour could then be controlled with options such as 
include_types (tri-valued enum accepting none, all, as_needed), 
relation_kinds (default to 'r', can support multiple kinds with a string 'rfv' 
for tables, foreign tables and views).
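
For instance (option values as described above, server and schema names made 
up):

IMPORT FOREIGN SCHEMA remote_schema FROM SERVER pgserver INTO local_schema
    OPTIONS (include_types 'as_needed', relation_kinds 'rv');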

I think we're drifting further away from the standard with that kind of stuff, 
but I'd be happy to implement it if that's the path we choose.

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-05-25 Thread Ronan Dunklau
On Sunday 25 May 2014 12:41:18, David Fetter wrote:
 On Fri, May 23, 2014 at 10:08:06PM +0200, Ronan Dunklau wrote:
  Hello,
  
  Since my last proposal didn't get any strong rebuttal, please find
  attached a more complete version of the IMPORT FOREIGN SCHEMA statement.
 
 Thanks!
 
 Please to send future patches to this thread so people can track them
 in their mail.

I'll do.

I didn't for the previous one because it was a few months ago, and no patch 
had been added to the commit fest.

 
  I tried to follow the SQL-MED specification as closely as possible.
  
  This adds discoverability to foreign servers. The structure of the
  statement as I understand it is simple enough:
  
  IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
  EXCEPT) table_list ] INTO local_schema.
  
  The import_foreign_schema patch adds the infrastructure, and a new FDW
  routine:
  
  typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
  ImportForeignSchemaStmt * parsetree);
  
  This routine must return a list of CreateForeignTableStmt mirroring
  whatever tables were found on the remote side, which will then be
  executed.
  
  The import_foreign_schema_postgres_fdw patch proposes an implementation of
  this API for postgres_fdw. It will import a foreign schema using the right
  types as well as nullable information.
 
 In the case of PostgreSQL, the right types are obvious until there's
 a user-defined one.  What do you plan to do in that case ?
 

The current implementation fetches the types as regtype, and when receiving a 
custom type, two things can happen:

 - the type is defined locally: everything will work as expected
 - the type is not defined locally: the conversion function will fail, and 
raise an error of the form: ERROR:  type "schema.typname" does not exist (see 
the sketch below)

Should I add that to the regression test suite ?
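
A minimal sketch of the second case (names are made up):

-- on the remote server:
CREATE TYPE inventory_item AS (name text, price numeric);
CREATE TABLE items (id int, item inventory_item);
-- on the local server, importing before creating an equivalent local type
-- fails with something like: ERROR:  type "public.inventory_item" does not exist
IMPORT FOREIGN SCHEMA public FROM SERVER remote_server INTO local_schema;
-- creating the same composite type locally first makes the import succeed.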

  Regarding documentation, I don't really know where it should have been
  put. If I missed something, let me know and I'll try to correct it.
 
 It's not exactly something you missed, but I need to bring it up
 anyway before we go too far.  The standard missed two crucial concepts
 when this part of it was written:
 
 1.  No single per-database-type universal type mapping can be correct.
 
 People will have differing goals for type mapping, and writing a whole
 new FDW for each of those goals is, to put it mildly, wasteful.  I
 will illustrate with a concrete and common example.
 
 MySQL's datetime type encourages usages which PostgreSQL's
 corresponding type, timestamptz, simply disallows, namely '0000-00-00
 00:00:00' as its idea of UNKNOWN or NULL.
 
 One way PostgreSQL's mapping could work is to map it to TEXT, which
 would preserve the strings exactly and be in some sense an identity
 map.  It would also make the type somewhat useless in its original
 intended form.
 
 Another one would map the type is to a composite data type
 mysql_datetime(tstz timestamptz, is_wacky boolean) which would
 capture, for example, ('2014-04-01 00:00:00+00', false) for the UTC
 start of April Fools' Day this year, and (NULL, true) for '0000-00-00
 00:00:00'.
 
 There are doubtless others, and there is no principled way to assign
 any one of them as universally correct.
 
 This brings me to the next crucial concept the standard missed:
 
 2.  The correct mapping may not be the identity, and furthermore, the
 inbound and outbound mappings might in general not be mere inversions
 of each other.
 
 MySQL (no aspersions intended) again provides a concrete example with
 its unsigned integer types.  Yes, it's possible to create a domain
 over INT8 which simulates UINT4, a domain over NUMERIC which simulates
 UINT8, etc., but again this process's correctness depends on
 circumstances.
 
 To address these problems, I propose the following:
 
 - We make type mappings settable at the level of:
 - FDW
 - Instance (a.k.a. cluster)
 - Database
 - Schema
 - Table
 - Column
 
 using the existing ALTER command and some way of spelling out how
 a remote type maps to a local type.
 
 This would consist of:
 - The remote type
 - The local type to which it maps
 - The inbound transformation (default identity)
 - The outbound transformation (default identity)
 
 At any given level, the remote type would need to be unique.  To
 communicate this to the system, we either invent new syntax, with
 all the hazards attendant thereto, or we could use JSON or similar
 serialization.
 
 ALTER FOREIGN TABLE foo
 ADD TYPE MAPPING
 FROM datetime
 TO TEXT
 WITH (
 INBOUND TRANSFORMATION IDENTITY,
 OUTBOUND TRANSFORMATION IDENTITY
 ) /* Ugh!!! */
 
 vs.
 
 ALTER FOREIGN TABLE foo ADD (mapping '{
 datetime: text,
 inbound: IDENTITY,
 outbound: IDENTITY
 }')
 
 Each FDW would have some set of default mappings and some way to
 override them as above

Re: [HACKERS] Triggers on foreign tables

2014-03-24 Thread Ronan Dunklau
On Sunday, 23 March 2014 at 02:44:26, Noah Misch wrote:
 On Tue, Mar 18, 2014 at 09:31:06AM +0100, Ronan Dunklau wrote:
  On Tuesday, 18 March 2014 at 03:54:19, Kouhei Kaigai wrote:
(1) To acquire the old tuple for UPDATE/DELETE operations, the patch
closely
   
   parallels our handling for INSTEAD OF triggers on views.  It
   
adds a wholerow resjunk attribute, from which it constructs a
HeapTuple
before calling a trigger function.  This loses the system columns, an
irrelevant
consideration for views but meaningful for foreign tables. 
postgres_fdw
maintains the ctid system column (t_self), but triggers will always
see
an invalid t_self for the old tuple.  One way to fix this is to pass
around
   
   the old tuple data as ROW(ctid, oid, xmin, cmin, xmax, cmax,
   
tableoid, wholerow).  That's fairly close to sufficient, but it
doesn't
cover t_ctid. Frankly, I would like to find a principled excuse to not
worry about making foreign table system columns accessible from
triggers.

 Supporting system columns dramatically affects the mechanism, and
 what

trigger is likely to care?  Unfortunately, that argument seems too
weak.
Does anyone have a cleaner idea for keeping track of the system column
values or a stronger argument for not bothering?
   
   Regarding to the first suggestion,
   I think, it is better not to care about system columns on foreign
   tables,
   because it fully depends on driver's implementation whether FDW fetches
   ctid from its data source, or not.
   Usually, system columns of foreign table, except for tableoid, are
   nonsense.
   
   Because of implementation reason, postgres_fdw fetches ctid of
   
   remote tables on UPDATE / DELETE, it is not a common nature for all FDW
   drivers. For example, we can assume an implementation that uses primary
   key
   of remote table to identify the record to be updated or deleted. In this
   case, local ctid does not have meaningful value.
   So, fundamentally, we cannot guarantee FDW driver returns meaningful
   ctid
   or other system columns.
  
  I agree, I think it is somewhat clunky to have postgres_fdw use a feature
  that is basically meaningless for other FDWs. Looking at some threads in
  this list, it confused many people.
 
 My own reasoning for accepting omission of system columns is more along the
 lines of Robert's argument.  Regardless, three folks voting to do so and
 none against suffices for me.  I documented the system columns limitation,
 made the returningList change I mentioned, and committed the patch.

Thank you, I'm glad the patch found its way to the repository !

 
  This is off-topic, but maybe we could devise an API allowing for local
  system attributes on foreign table. This would allow FDWs to carry
  attributes that weren't declared as part of the table definition. This
  could then be used for postgres_fdw ctid, as well as others foreign data
  wrappers equivalent of an implicit tuple id.
 
 We could, but I discourage it.  System columns are a legacy feature; I doubt
 we'd choose that design afresh today.  On the off-chance that you need the
 value of a remote system column, you can already declare an ordinary
 foreign table column for it.  I raised the issue because it's inconsistent
 for RETURNING to convey system columns while tg_trigtuple/tg_newtuple do
 not, not because acquiring system columns from foreign tables is notably
 useful.

The idea here was to allow an FDW author to add columns which are not part of 
the table definition, for example columns which are required to identify the 
tuple remotely. Without system columns, a postgres_fdw user would have to 
declare the ctid column on every table for a tuple to be identifiable. The 
proposal would allow postgres_fdw to automatically inject a hidden (system?) 
column on every table for this ctid.
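For reference, the existing workaround looks roughly like this (server, table 
and column names are made up; it also assumes postgres_fdw's per-column 
column_name option resolves to the remote system column, which is worth 
double-checking):

CREATE FOREIGN TABLE rem2 (
    f1 integer,
    f2 text,
    remote_ctid tid OPTIONS (column_name 'ctid')
) SERVER loopback OPTIONS (schema_name 'public', table_name 'rem2');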

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org



Re: [HACKERS] Triggers on foreign tables

2014-03-18 Thread Ronan Dunklau
On Tuesday, 18 March 2014 at 03:54:19, Kouhei Kaigai wrote:
  I hacked on this for awhile, but there remain two matters on which I'm
  uncertain about the right way forward.
  
  (1) To acquire the old tuple for UPDATE/DELETE operations, the patch
  closely
 parallels our handling for INSTEAD OF triggers on views.  It
  adds a wholerow resjunk attribute, from which it constructs a HeapTuple
  before calling a trigger function.  This loses the system columns, an
  irrelevant
  consideration for views but meaningful for foreign tables.  postgres_fdw
  maintains the ctid system column (t_self), but triggers will always see
  an invalid t_self for the old tuple.  One way to fix this is to pass
  around
 the old tuple data as ROW(ctid, oid, xmin, cmin, xmax, cmax,
  tableoid, wholerow).  That's fairly close to sufficient, but it doesn't
  cover t_ctid. Frankly, I would like to find a principled excuse to not
  worry about making foreign table system columns accessible from triggers.
   Supporting system columns dramatically affects the mechanism, and what
  trigger is likely to care?  Unfortunately, that argument seems too weak. 
  Does anyone have a cleaner idea for keeping track of the system column
  values or a stronger argument for not bothering?
  
 
 Regarding to the first suggestion,
 I think, it is better not to care about system columns on foreign tables,
 because it fully depends on driver's implementation whether FDW fetches
 ctid from its data source, or not.
 Usually, system columns of foreign table, except for tableoid, are
 nonsense.
 Because of implementation reason, postgres_fdw fetches ctid of
 remote tables on UPDATE / DELETE, it is not a common nature for all FDW
 drivers. For example, we can assume an implementation that uses primary key
 of remote table to identify the record to be updated or deleted. In this
 case, local ctid does not have meaningful value.
 So, fundamentally, we cannot guarantee FDW driver returns meaningful ctid
 or other system columns.
 

I agree, I think it is somewhat clunky to have postgres_fdw use a feature that 
is basically meaningless for other FDWs. Looking at some threads in this list, 
it confused many people.

This is off-topic, but maybe we could devise an API allowing for local system 
attributes on foreign table. This would allow FDWs to carry attributes that 
weren't declared as part of the table definition. This could then be used for 
postgres_fdw ctid, as well as others foreign data wrappers equivalent of an 
implicit tuple id.

 
  (2) When a foreign table has an AFTER ROW trigger, the FDW's
  ExecForeign{Insert,Update,Delete} callbacks must return a slot covering
  all columns.  Current FDW API documentation says things like this:
  
  
The data in the returned slot is used only if the INSERT query has a
RETURNING clause.  Hence, the FDW could choose to optimize away
returning
some or all columns depending on the contents of the RETURNING clause.
  
  
  Consistent with that, postgres_fdw inspects the returningLists of the
  ModifyTable node to decide which columns are necessary.  This patch has
  rewriteTargetListIU() add a resjunk wholerow var to the returningList of
  any query that will fire an AFTER ROW trigger on a foreign table.  That
  avoids the need to change the FDW API contract or any postgres_fdw code.
  I was pleased about that for a time, but on further review, I'm doubting
  that the benefit for writable FDWs justifies muddying the definition of
  returningList; until now, returningList has been free from resjunk TLEs.
  For example, callers of
  FetchStatementTargetList() may be unprepared to see an all-resjunk list,
  instead of NIL, for a data modification statement that returns nothing.
  
  If we do keep the patch's approach, I'm inclined to rename returningList.
  However, I more lean toward adding a separate flag to indicate the need
  to return a complete tuple regardless of the RETURNING list.  The
  benefits
  of overloading returningList are all short-term benefits.  We know that
  the FDW API is still converging, so changing it seems
  eventually-preferable
 to, and safer than, changing the name or meaning
  of returningList. Thoughts?
  
  On Thu, Mar 06, 2014 at 09:11:19AM +0100, Ronan Dunklau wrote:
  
   On Wednesday, 5 March 2014 at 22:36:44, Noah Misch wrote:
   
Agreed.  More specifically, I see only two scenarios for retrieving
tuples from the tuplestore.  Scenario one is that we need the next
tuple (or pair of tuples, depending on the TriggerEvent).  Scenario
two is that we need the tuple(s) most recently retrieved.  If that's
correct, I'm inclined to rearrange afterTriggerInvokeEvents() and
AfterTriggerExecute() to remember the tuple or pair of tuples most
recently retrieved.  They'll then never call tuplestore_advance()
just to reposition.  Do you see a problem with that?
  
  
  
   I don't see any problem with that. I don't know how this would be
   implemented

Re: [HACKERS] Triggers on foreign tables

2014-03-06 Thread Ronan Dunklau
On Wednesday, 5 March 2014 at 22:36:44, Noah Misch wrote:
 Agreed.  More specifically, I see only two scenarios for retrieving tuples
 from the tuplestore.  Scenario one is that we need the next tuple (or pair
 of tuples, depending on the TriggerEvent).  Scenario two is that we need
 the tuple(s) most recently retrieved.  If that's correct, I'm inclined to
 rearrange afterTriggerInvokeEvents() and AfterTriggerExecute() to remember
 the tuple or pair of tuples most recently retrieved.  They'll then never
 call tuplestore_advance() just to reposition.  Do you see a problem with
 that?

I don't see any problem with that. I don't know how this would be implemented, 
but it would make sense to avoid those scans, as long as a fresh copy is 
passed to the trigger: modifications to a tuple performed in an after trigger 
should not be visible to the next one.


 
 I was again somewhat tempted to remove ate_tupleindex, perhaps by defining
 the four flag bits this way:
 
 #define AFTER_TRIGGER_DONE        0x10000000
 #define AFTER_TRIGGER_IN_PROGRESS 0x20000000
 /* two bits describing the size of and tuple sources for this event */
 #define AFTER_TRIGGER_TUP_BITS    0xC0000000
 #define AFTER_TRIGGER_FDW_REUSE   0x00000000
 #define AFTER_TRIGGER_FDW_FETCH   0x40000000
 #define AFTER_TRIGGER_1CTID       0x80000000
 #define AFTER_TRIGGER_2CTID       0xC0000000
 
 AFTER_TRIGGER_FDW_FETCH and AFTER_TRIGGER_FDW_REUSE correspond to the
 aforementioned scenarios one and two, respectively.  I think, though, I'll
 rate this as needless micro-optimization and not bother; opinions welcome.
 (The savings is four bytes per foreign table trigger event.)

I was already happy with having a lower footprint for foreign table trigger 
events than for regular trigger events, but if we remove the need for seeking 
in the tuplestore entirely, it would make sense to get rid of the index.

 
 Thanks,
 nm

Thanks to you.

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org



Re: [HACKERS] jsonb and nested hstore

2014-03-06 Thread Ronan Dunklau
On Thursday, 6 March 2014 at 09:33:18, Andrew Dunstan wrote:
 On 03/06/2014 08:16 AM, Oleg Bartunov wrote:
  On Thu, Mar 6, 2014 at 12:43 PM, Peter Geoghegan p...@heroku.com wrote:
  On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote:
  That's possible to introduce GUC variable for i/o functions which will
  control old bug-to-bug behavior. IMHO, this is much better option that
  stopping hstore development or split hstore to two branches.
  
  A GUC that controls i/o functions is generally considered to be an
  unacceptable hack.
  
  In what sense are we really stopping hstore development if hstore2
  lives as jsonb? I have a hard time imagining someone dealing with the
  incompatibility that a user-facing hstore2 would introduce, while
  still preferring hstore syntax over json syntax given the choice.
  There are very rich facilities for manipulating json available in
  every programming language. The same is not true of hstore.
  
  Having looked at the issue today, I think that the amount of redundant
  code between a hstore2 in core as jsonb and hstore1 will be
  acceptable. The advantages of making a clean-break in having to
  support the legacy hstore disk format strengthen the case for doing so
  too.
  
  Heh, let's not to do an implusive decision about hstore2. I agree,
  that jsonb has
  a lot of facilities, but don't forget, that json(b) has to follow standard
  and in that sense it's more constrained than hstore, which we could
  further develop to support some interesting features, which will never be
  implemented in json(b).  Also,  it'd be a bit awkward after working on
  nested
  hstore and declaring it
  on several conferences (Engine Yard has sponsored part of our hstore
  work), suddenly
  break people expectation and say, that our work has moved to core to
  provide json
  some very cool features, good bye, hstore users :(   I'm afraid people
  will not understand us.
 
 Oleg,
 
 I hear you, and largely agree, as long as the compatibility issue is
 solved. If it's not, I think inventing a new hstore2 type is probably a
 lousy way to go.
 
 For good or ill, the world has pretty much settled on wanting to use
 json for lightweight treeish data. That's where we'll get the most
 impact. Virtually every programming language (including Perl) has good
 support for json.
 
 I'm not sure what the constraints of json that you might want to break
 are. Perhaps you'd like to specify.

I haven't followed the whole thread, but json is really restrictive on the 
supported types: a hierarchical hstore could maybe support more types as its 
values (timestamp comes to mind), which are not valid data types in the json 
spec.

 
 Whatever we do, rest assured your work won't go to waste.
 
 cheers
 
 andrew

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org



Re: [HACKERS] Triggers on foreign tables

2014-03-02 Thread Ronan Dunklau
Hello.

Did you have time to review the latest version of this patch ? Is there 
anything I can do to get this ready for commiter ?

Thank you for all the work performed so far.




On Tuesday, 4 February 2014 at 13:16:22, Ronan Dunklau wrote:
 On Monday, 3 February 2014 at 23:28:45, Noah Misch wrote:
  On Sun, Feb 02, 2014 at 11:53:51AM +0100, Ronan Dunklau wrote:
   On Thursday, 30 January 2014 at 14:05:08, Noah Misch wrote:
On Thu, Jan 23, 2014 at 03:17:35PM +0100, Ronan Dunklau wrote:
 What do you think about this approach ? Is there something I missed
 which
 would make it not sustainable ?

Seems basically reasonable.  I foresee multiple advantages from having
one
tuplestore per query level as opposed to one for the entire
transaction.
You would remove the performance trap of backing up the tuplestore by
rescanning. It permits reclaiming memory and disk space in
AfterTriggerEndQuery() rather than at end of transaction.  You could
remove
ate_ptr1 and ate_ptr2 from AfterTriggerEventDataFDW and just store the
flags word: depending on AFTER_TRIGGER_2CTIDS, grab either the next
one
or
the next two tuples from the tuplestore.  Using work_mem per
AfterTriggerBeginQuery() instead of per transaction is no problem. 
What
do
you think of that design change?
   
   I agree that this design is better, but I have some objections.
   
   We can remove ate_ptr2 and rely on the AFTER_TRIGGER_2CTIDS flag, but
   the
   rescanning and ate_ptr1 (renamed ate_tupleindex in the attached patch)
   can't go away.
   
   Consider for example the case of a foreign table with more than one
   AFTER
   UPDATE triggers. Unless we store the tuples once for each trigger, we
   will
   have to rescan the tuplestore.
  
  Will we?  Within a given query level, when do (non-deferred) triggers
  execute in an order other than the enqueue order?
 
 Let me explain what I had in mind.
 
 Looking at the code in AfterTriggerSaveEvent:
 
 - we build a template AfterTriggerEvent, and store the tuple(s)
 - for each suitable after trigger that matches the trigger type, as well as
 the WHEN condition if any, a copy of the previously built AfterTriggerEvent
 is queued
 
 Later, those events are fired in order.
 
 This means that more than one event can be fired for one tuple.
 
 Take this example:
 
 CREATE TRIGGER trig_row_after1
 AFTER UPDATE ON rem2
 FOR EACH ROW
 WHEN (NEW.f1 % 5 < 3)
 EXECUTE PROCEDURE trigger_func('TRIG1');
 
 CREATE TRIGGER trig_row_after2
 AFTER UPDATE ON rem2
 FOR EACH ROW
 WHEN (NEW.f1 % 5 < 4)
 EXECUTE PROCEDURE trigger_func('TRIG2');
 
 UPDATE rem2 set f2 = 'something';
 
 Assuming 5 rows with f1 as a serial, the fired AfterTriggerEvent's
 ate_tupleindex will be, in that order:
 
 0-0-2-2-4-8-8
 
 So, at least a backward seek is required for trig_row_after2 to be able to
 retrieve a tuple that was already consumed when firing trig_row_after1.
 
 On a side note, this made me realize that it is better to avoid storing a
 tuple entirely if there is no enabled trigger (the f1 = 4 case above). The
 attached patch does that, so the previous sequence becomes:
 
 0-0-2-2-4-6-6
 
 It also avoids initializing a tuplestore at all if it's not needed.
 
   To mitigate the effects of this behaviour, I added the option to perform
   a
   reverse_seek when the looked-up tuple is nearer from the current index
   than
   from the start.
  
  If there's still a need to seek within the tuplestore, that should get rid
  of the O(n^2) effect.  I'm hoping that per-query-level tuplestores will
  eliminate the need to seek entirely.
 
 I think the only case where seeking is still needed is when more than one
 after trigger has to be fired, since the abovementioned change removes the
 need to seek just to skip tuples.
 
If you do pursue that change, make sure the code still does the right
thing
when it drops queued entries during subxact abort.
   
   I don't really understand what should be done at that stage. Since
   triggers on foreign tables are not allowed to be deferred, everything
   should be cleaned up at the end of each query, right ? So, there
   shouldn't be any queued entries.
  
  I suspect that's right.  If you haven't looked over
  AfterTriggerEndSubXact(), please do so and ensure all its actions still
  make sense in the context of this new kind of trigger storage.
 
 You're right, I missed something here. When aborting a subxact, the
 tuplestores for queries below the subxact query depth should be cleaned, if
 any, because AfterTriggerEndQuery has not been called for the failing query.
 
 The attached patch fixes that.
 
 The attached patch checks this, and add documentation for this
 limitation.
 I'm not really sure about how to phrase that correctly in the error
 message
 and the documentation. One can store at most INT_MAX foreign tuples,
 which
 means that at most INT_MAX insert or delete

[HACKERS] Proposal: IMPORT FOREIGN SCHEMA statement.

2014-02-21 Thread Ronan Dunklau
Hello,

The SQL-MED specification defines the IMPORT FOREIGN SCHEMA statement.

This adds discoverability to foreign servers. The structure of the statement
as I understand it is simple enough:

IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
EXCEPT) table_list ] INTO local_schema.
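As a usage sketch of the proposed statement (server and schema names are made 
up):

IMPORT FOREIGN SCHEMA public
    FROM SERVER remote_pg
    LIMIT TO (customers, orders)
    INTO imported_schema;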

Is anyone working on this? I found a reference to this from 2010 in the
archive, stating that work should be focused on core functionality, but
nothing more recent.

This would be very useful for postgres_fdw and other RDBMS-backed fdws, but I
think even file_fdw could benefit from it if it was able to create a foreign
table for every csv-with-header file in a directory.

I can see a simple API working for that.  A new function would be added to the
fdw routine, which is responsible for crafting CreateForeignTableStmt. It
could have the following signature:

typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
ImportForeignSchemaStmt * parsetree);


I experimented with this idea, and came up with the attached two patches: one
for the core, and the other for actually implementing the API in postgres_fdw.

Maybe those can serve as a proof-of-concept for discussing the design?


--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 024a477..40a2540 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -250,7 +250,8 @@ check_ddl_tag(const char *tag)
 		pg_strcasecmp(tag, "REFRESH MATERIALIZED VIEW") == 0 ||
 		pg_strcasecmp(tag, "ALTER DEFAULT PRIVILEGES") == 0 ||
 		pg_strcasecmp(tag, "ALTER LARGE OBJECT") == 0 ||
-		pg_strcasecmp(tag, "DROP OWNED") == 0)
+		pg_strcasecmp(tag, "DROP OWNED") == 0 ||
+		pg_strcasecmp(tag, "IMPORT FOREIGN SCHEMA") == 0)
 		return EVENT_TRIGGER_COMMAND_TAG_OK;

 	/*
diff --git a/src/backend/commands/foreigncmds.c b/src/backend/commands/foreigncmds.c
index 7f007d7..719c674 100644
--- a/src/backend/commands/foreigncmds.c
+++ b/src/backend/commands/foreigncmds.c
@@ -27,7 +27,9 @@
 #include "catalog/pg_type.h"
 #include "catalog/pg_user_mapping.h"
 #include "commands/defrem.h"
+#include "commands/tablecmds.h"
 #include "foreign/foreign.h"
+#include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "parser/parse_func.h"
 #include "utils/acl.h"
@@ -1427,3 +1429,48 @@ CreateForeignTable(CreateForeignTableStmt *stmt, Oid relid)

 	heap_close(ftrel, RowExclusiveLock);
 }
+
+/*
+ * Import a foreign schema
+ */
+void
+ImportForeignSchema(ImportForeignSchemaStmt * stmt)
+{
+	Oid			ownerId;
+	ForeignDataWrapper *fdw;
+	ForeignServer *server;
+	FdwRoutine *fdw_routine;
+	AclResult	aclresult;
+	List	   *table_list = NULL;
+	ListCell   *lc;
+	char * local_schema = strdup(stmt->local_schema);
+
+	ownerId = GetUserId();
+	server = GetForeignServerByName(stmt->servername, false);
+	aclresult = pg_foreign_server_aclcheck(server->serverid, ownerId, ACL_USAGE);
+	if (aclresult != ACLCHECK_OK)
+		aclcheck_error(aclresult, ACL_KIND_FOREIGN_SERVER, server->servername);
+
+	/* Check the permissions on the schema */
+	LookupCreationNamespace(local_schema);
+
+	fdw = GetForeignDataWrapper(server->fdwid);
+	fdw_routine = GetFdwRoutine(fdw->fdwhandler);
+	if (fdw_routine->ImportForeignSchema == NULL)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_FDW_NO_SCHEMAS),
+				 errmsg("This FDW does not support schema importation")));
+	}
+	table_list = fdw_routine->ImportForeignSchema(server, stmt);
+	foreach(lc, table_list)
+	{
+		CreateForeignTableStmt *create_stmt = lfirst(lc);
+		Oid			relOid = DefineRelation((CreateStmt *) create_stmt,
+											RELKIND_FOREIGN_TABLE,
+											InvalidOid);
+		/* Override whatever the fdw set for the schema */
+		create_stmt->base.relation->schemaname = local_schema;
+		CreateForeignTable(create_stmt, relOid);
+	}
+}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 81169a4..80c18cc 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -235,8 +235,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 		DropAssertStmt DropTrigStmt DropRuleStmt DropCastStmt DropRoleStmt
 		DropUserStmt DropdbStmt DropTableSpaceStmt DropFdwStmt
 		DropForeignServerStmt DropUserMappingStmt ExplainStmt FetchStmt
-		GrantStmt GrantRoleStmt IndexStmt InsertStmt ListenStmt LoadStmt
-		LockStmt NotifyStmt ExplainableStmt PreparableStmt
+		GrantStmt GrantRoleStmt IndexStmt ImportForeignSchemaStmt InsertStmt
+		ListenStmt LoadStmt LockStmt NotifyStmt ExplainableStmt PreparableStmt
 		CreateFunctionStmt AlterFunctionStmt ReindexStmt RemoveAggrStmt
 		RemoveFuncStmt RemoveOperStmt RenameStmt RevokeStmt RevokeRoleStmt
 		RuleActionStmt RuleActionStmtOrEmpty RuleStmt
@@ -319,6 +319,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type <ival>	defacl_privilege_target
 %type <defelt>	DefACLOption
 %type <list>	DefACLOptionList
+%type <node>	OptImportForeignSchemaRestriction
+%type <ival>

Re: [HACKERS] Proposal: IMPORT FOREIGN SCHEMA statement.

2014-02-21 Thread Ronan Dunklau
 I havent had a look at the patch yet since I dont have a nice editor right
 now, but how do you handle inter operability between datatypes?
 Specifically, how do you handle those datatypes which have a different name
 from the PostgreSQL name for them and/or are stored in a different manner?

Do you mean in general, or for the postgres_fdw specifically ? 

In general, only valid column types should be accepted in the 
CreateForeignTableStmt. The CreateForeignTableStmt is passed through 
DefineRelation, which takes care of looking up the actual data types.

For the postgres_fdw POC implementation, this is done by parsing the 
attributes type from the query result with the regtype input functions. The 
attribute typmod is injected too.
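As a hedged illustration of that mechanism (this is not the actual POC code): 
the remote side ships type names as text, and the local side resolves them 
through regtype's input function, for example:

-- Resolves fine when the type exists locally:
SELECT 'timestamp with time zone'::regtype;
-- A remote-only user-defined type (name made up) fails at that point,
-- roughly as:
--   SELECT 'myschema.mytype'::regtype;
--   ERROR:  type "myschema.mytype" does not exist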

 
 Regards,
 
 Atri

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org



Re: [HACKERS] Proposal: IMPORT FOREIGN SCHEMA statement.

2014-02-21 Thread Ronan Dunklau
On Friday, 21 February 2014 at 16:45:20, Atri Sharma wrote:
 On Fri, Feb 21, 2014 at 4:39 PM, Ronan Dunklau 
ronan.dunk...@dalibo.comwrote:
   I havent had a look at the patch yet since I dont have a nice editor
  
  right
  
   now, but how do you handle inter operability between datatypes?
   Specifically, how do you handle those datatypes which have a different
  
  name
  
   from the PostgreSQL name for them and/or are stored in a different
  
  manner?
  
  Do you mean in general, or for the postgres_fdw specifically ?
  
  In general, only valid column types should be accepted in the
  CreateForeignTableStmt. The CreateForeignTableStmt is passed through
  DefineRelation, which takes care of looking up the actual data types.
  
  For the postgres_fdw POC implementation, this is done by parsing the
  attributes type from the query result with the regtype input functions.
  The
  attribute typmod is injected too.
 
 I actually meant in general. Thanks for the reply.
 
 So please help me understand here. How exactly does CreateForeignTableStmt
 help in type compatibility? 

I'm not sure I understand your concern. It doesn't help with type 
compatibility; it is still the responsibility of the FDW to convert local 
types to remote ones.

The CreateForeignTableStmt defines the column, with their types. It is 
executed locally to create a new foreign table according to a remote 
description of the table. The only difference with a user-written 
CreateForeignTable statement is that the structure is crafted by the FDW 
instead of the parser.

 A statement may be valid on a foreign server but may not be compatible.

Do you mean the CreateForeignTableStmt ? It has to be valid locally, or it 
won't be executed. It is the FDW responsibility to build this statement in 
such a way that it is valid locally.


 Regards,
 
 Atri

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org



Re: [HACKERS] Proposal: IMPORT FOREIGN SCHEMA statement.

2014-02-21 Thread Ronan Dunklau

 Who says that the OIDs are the same on the local and the remove postgres
 instance? For user-defined types, that's certainly not going to be true...

That's why the result is cast as regtype[], and parsed as such. The oid 
is not transmitted over the wire, but set by regtype_in. 

 
 Also, why do you aggregate the lists of columns, types and oids into arrays
 when querying them from the remote server? Just producing a query that
 returns one row per table column seems much simpler, both conceptually and
 implementation wise.

As I said, this is a proof of concept. It is not meant to be a fully 
functional, optimal implementation, but to serve as a basis for discussion of 
the API and the feature themselves.

The query could indeed be replaced by what you suggest, performing the 
grouping locally. I have absolutely no opinion on which implementation is 
better, this one seemed the most natural to me.
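For the record, a hedged sketch of what such a per-column query could look 
like (this is not the query used in the POC patch):

SELECT c.relname,
       a.attname,
       format_type(a.atttypid, a.atttypmod) AS coltype,
       a.attnotnull
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_attribute a ON a.attrelid = c.oid
WHERE n.nspname = 'remote_schema'
  AND c.relkind = 'r'
  AND a.attnum > 0
  AND NOT a.attisdropped
ORDER BY c.relname, a.attnum;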

 
 Finally, I think there are a few missing pieces. For example, you quite
 easily could copy over not-NULL flags, but you currently don't. Similarly,
 what about inheritance relationships between remote tables? There's a patch
 in the current CF, I believe, which adds support for inheritance to foreign
 tables, so all you'd have to do is to make the foreign table's inheritance
 structure match the remote table's.

Duly noted, we could probably import NOT NULL flags, and maybe even the 
table's inheritance structure. I'll look into that if the feature and the API 
are deemed worthy.

Thank you for the review.


-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org



Re: [HACKERS] [bug fix] pg_ctl stop times out when it should respond quickly

2014-01-27 Thread Ronan Dunklau
On Tuesday, 7 January 2014 at 17:05:03, Michael Paquier wrote:
 On Sun, Jan 5, 2014 at 3:49 PM, MauMau maumau...@gmail.com wrote:
  Could you confirm again and tell me what problem is happening?
 
 FWIW, I just quickly tested those two patches independently and got
 them correctly applied with patch -p1  $PATCH on master at edc4345.
 They compiled and passed as well make check.
 Regards,

Both patches apply cleanly, I'll focus on the pg_stop_fail_v3 patch.

Tests are running correctly.

The bug this patch is supposed to fix has been reproduced on current HEAD, and 
cannot be reproduced using the patch.

Previous concerns about using both get_pgpid and postmaster_is_alive are 
addressed.

There are no regression tests covering this bugfix, although I don't know if 
it would be practical to implement them.


-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org



[HACKERS] Triggers on foreign tables

2014-01-07 Thread Ronan Dunklau
Hello.

Since the last discussion about it 
(http://www.postgresql.org/message-id/cadyhksugp6ojb1pybtimop3s5fg_yokguto-7rbcltnvaj5...@mail.gmail.com),
 I
finally managed to implement row triggers for foreign tables.

The attached patch does not contain regression tests or documentation yet, a
revised patch will include those sometime during the week. I'll add it to the
commitfest then.

A simple test scenario using postgres_fdw is however included as an
attachment.

The attached patch implements triggers for foreign tables according to the
design discussed in the link above.

For statement-level triggers, nothing has changed since the last patch.
Statement-level triggers are just allowed in the command checking.

For row-level triggers, it works as follows:

- during rewrite phase, every attribute of the foreign table is duplicated as
a resjunk entry if a trigger is defined on the relation. These entries are
then used to store the old values for a tuple. There is room for improvement
here: we could check if any trigger is in fact a row-level trigger
- during execution phase, the before triggers are fired exactly like triggers
on regular tables, except that old tuples are not fetched using their ctid,
but rebuilt using the previously-stored resjunk attributes.
- for after triggers, the whole queuing mechanism is bypassed for foreign
tables. This is IMO acceptable, since foreign tables cannot have constraints
or constraint triggers, and thus have no need for deferrable execution. This
design avoids the need for storing and retrieving/identifying remote tuples
until the query or transaction end.
- the duplicated resjunk attributes are identified by being:
  - marked as resjunk in the targetlist
  - not being system or whole-row attributes (varno > 0)

There is still one small issue with the attached patch: modifications to the
tuple performed by the foreign data wrapper (via the returned TupleTableSlot
in ExecForeignUpdate and ExecForeignInsert hooks) are not visible to the AFTER
trigger. This could be fixed by merging the planslot containing the resjunk
columns with the returned slot before calling the trigger, but I'm not really
sure how to safely perform that. Any advice ?

Many thanks to Kohei Kaigai for taking the time to help with the design.

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

test_with_postgres_fdw.sql
Description: application/sql
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b9cd88d..a509595 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3150,13 +3150,16 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
 			/* No command-specific prep needed */
 			break;
 		case AT_EnableTrig:		/* ENABLE TRIGGER variants */
-		case AT_EnableAlwaysTrig:
-		case AT_EnableReplicaTrig:
 		case AT_EnableTrigAll:
 		case AT_EnableTrigUser:
 		case AT_DisableTrig:	/* DISABLE TRIGGER variants */
 		case AT_DisableTrigAll:
 		case AT_DisableTrigUser:
+			ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE);
+			pass = AT_PASS_MISC;
+			break;
+		case AT_EnableReplicaTrig:
+		case AT_EnableAlwaysTrig:
 		case AT_EnableRule:		/* ENABLE/DISABLE RULE variants */
 		case AT_EnableAlwaysRule:
 		case AT_EnableReplicaRule:
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 4eff184..12870eb 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -75,6 +75,14 @@ static HeapTuple GetTupleForTrigger(EState *estate,
    ItemPointer tid,
    LockTupleMode lockmode,
    TupleTableSlot **newSlot);
+
+static HeapTuple ExtractOldTuple(TupleTableSlot * mixedtupleslot,
+	ResultRelInfo * relinfo);
+static void ExecARForeignTrigger(EState * estate,
+TriggerData LocTriggerData,
+ResultRelInfo *relinfo,
+int triggerEvent, int triggerType);
+
 static bool TriggerEnabled(EState *estate, ResultRelInfo *relinfo,
 			   Trigger *trigger, TriggerEvent event,
 			   Bitmapset *modifiedCols,
@@ -184,12 +192,22 @@ CreateTrigger(CreateTrigStmt *stmt, const char *queryString,
 			RelationGetRelationName(rel)),
 	 errdetail("Views cannot have TRUNCATE triggers.")));
 	}
+	else if (rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+	{
+		if(stmt->isconstraint)
+			ereport(ERROR,
+	(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+	 errmsg("\"%s\" is a foreign table",
+			RelationGetRelationName(rel)),
+	 errdetail("Foreign Tables cannot have constraint triggers.")));
+	}
 	else
+	{
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("\"%s\" is not a table or view",
 						RelationGetRelationName(rel))));
-
+	}
 	if (!allowSystemTableMods && IsSystemRelation(rel))
 		ereport(ERROR,
 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
@@ -1062,10 +1080,11 @@ RemoveTriggerById(Oid trigOid)
 	rel = heap_open(relid, AccessExclusiveLock);

 	if (rel->rd_rel->relkind != RELKIND_RELATION &&
-		rel->rd_rel->relkind != RELKIND_VIEW)
+		rel->rd_rel->relkind

Re: [HACKERS] Triggers on foreign tables

2013-10-16 Thread Ronan Dunklau
On Tuesday, 15 October 2013 at 09:47:31, Robert Haas wrote:
 On Mon, Oct 14, 2013 at 5:24 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
  And, I also want some comments from committers, not only from mine.
  
  +1
  
  +1
 
 /me pokes head up.  I know I'm going to annoy people with this
 comment, but I feel like it's going to have to be made at some point
 by somebody, so here goes: I don't see the point of this feature.  If
 you want a trigger on a table, why not set it on the remote side?  A
 trigger on the foreign table won't be enforced consistently; it'll
 only work when the update is routed through the foreign table, not
 when people access the underlying table on the remote side through any
 other mechanism.  The number of useful things you can do this way
 seems fairly small.  Perhaps you could use a row-level trigger for
 RLS, to allow only certain rows on the foreign side to be updated, but
 even that seems like a slightly strange design: generally it'd be
 better to enforce the security as close to the target object as
 possible.
 
 There's another issue that concerns me here also: performance.  IIUC,
 an update of N tuples on the remote side currently requires N+1 server
 round-trips.  That is unspeakably awful, and we really oughta be
 looking for ways to make that number go down, by pushing the whole
 update to the remote side.  But obviously that won't be possible if
 there's a per-row trigger that has to be evaluated on the local side.
 Now, assuming somebody comes up with code that implements that
 optimization, we can just disable it when there are local-side
 triggers.  But, then you're back to having terrible performance.  So
 even if the use case for this seemed really broad, I tend to think
 performance concerns would sink most of the possible real-world uses.
 
 I could, of course, be all wet

Even if the row-level trigger functionality is controversial, in terms of 
provided features VS performance, the original patch I submitted enables 
statement-level triggers. 

Although my goal was to implement row-level triggers and I hit a wall pretty 
quickly, would it make sense to have statement-level triggers without their 
row counterparts ?
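For illustration, this is the kind of thing the statement-level part already 
allows on a foreign table (table and function names are made up):

CREATE TRIGGER remote_audit
    AFTER INSERT OR UPDATE OR DELETE ON my_foreign_table
    FOR EACH STATEMENT
    EXECUTE PROCEDURE log_foreign_statement();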

I also think that pushing the whole update to the remote side will not be 
possible in all cases, and like Kohei wrote, it may be an acceptable loss to 
not be able to push it when a trigger is present. If we want to push the whole 
update, we will have to cope with local joins and function evaluations in all 
cases. IMO, triggers are just a special case of these limiting factors.



Re: [HACKERS] Triggers on foreign tables

2013-10-16 Thread Ronan Dunklau

 Sorry, I might call it something like primary key, instead of 'tupleid'.
 Apart from whether we can uniquely identify a particular remote record with
 an attribute, what FDW needs here is something to identify a remote
 record. So, we were talking about same concept with different names.

Ah, that makes sense: I was understanding tupleid as a synonym for ctid.


  Does the incomplete tuple mean a tuple image but some of columns
  are replaced with NULL due to optimization, as postgres_fdw is doing,
  doesn't it?
  If so, a solution is to enforce FDW driver to fetch all the columns if
  its
  managing foreign table has row level triggers for update / delete.
  
  What I meant is that a DELETE statement, for example, does not build a
  scan- stage reltargetlist consisting of the whole set of attributes: the
  only attributes that are fetched are the ones built by
  addForeignUpdateTargets, and whatever additional attributes are involved
  in the DELETE statement (in the WHERE clause, for example).
 
 DELETE statement indeed does not (need to) construct a complete tuple
 image on scan stage, however, it does not prohibit FDW driver to keep the
 latest record being fetched from remote server.
 FDW driver easily knows whether relation has row-level trigger preliminary,
 so here is no matter to keep a complete image internally if needed.
 Or, it is perhaps available to add additional target-entries that is
 needed to construct a complete tuple using AddForeignUpdateTargets()
 callback.

I think that the additional target-entries should be added in core: doing it 
in each FDW driver would require additional work, and the code would be 
duplicated amongst all FDW drivers that wish to support triggers.


  I like this idea, but I don't really see all the implications of such a
  design.
  Where would this tuple be stored ?
  From what I understand, after-triggers are queued for later execution,
  using the old/new tupleid for later retrieval (in the
  AfterTriggerEventData struct). This would need a major change to work
  with foreign tables. Would that involve a special path for executing
  those triggers ASAP ?
 
 Ops, I oversight here. AfterTriggerSaveEvent() indeed saves only ctid of
 tuples in regular tables, because it can re-construct a complete tuple
 image from a particular ctid any time.
 On the other hand, it is not available on foreign table due to its nature.
 All we can do seems to me, probably, to save copy of before/after tuple
 image on local memory, even though it consumes much memory than
 regular tables.

Or, as suggested above, add a special code path for foreign tables which would 
execute the trigger as soon as possible instead of queuing it.  

 The feature we need here might become a bit clear little by little.
 A complete tuple image shall be fetched on the scan stage, if foreign
 table has row-level trigger. The planSlot may make sense if it contains
 (junk) attributes that is sufficient to re-construct an old tuple image.
 Target-list of DELETE statement contains a junk ctid attribute. Here
 is no reason why we cannot add junk attributes that reference each
 regular attribute, isn't it? Also, target-list of UPDATE statement
 contains new tuple image, however, it is available to add junk attributes
 that just reference old value, as a carrier of old values from scan stage
 to modify stage.
 I want core PostgreSQL to support a part of these jobs. For example,
 it may be able to add junk attribute to re-construct old tuple image on
 rewriteTargetListUD(), if target relation has row-level triggers. Also, it
 may be able to extract these old values from planSlot to construct
 an old tuple image on GetTupleForTrigger(), if relation is foreign table.

So, if I understand you, the target list would be built as follow:


 - core builds the target list, including every attribute
 - this target list is updated by the FDW to add any junk attributes it wishes 
to use 
 - in the case of an UPDATE, core duplicates the whole list of attributes 
(including the added junk attributes), as another set of junk attributes. 
Maybe we could duplicate only the updated attributes.
 


 Unfortunately, I have no special idea to handle old/new remote tuple
 on AfterTriggerSaveEvent(), except for keeping it as copy, but it is
 straightforward.

Maybe avoid it altogether in the case of a trigger on a remote foreign table ?

 
  And, I also want some comments from committers, not only from mine.
  
  +1
 
 +1
 
 Thanks,



Re: [HACKERS] Triggers on foreign tables

2013-10-13 Thread Ronan Dunklau
On Saturday, 12 October 2013 at 07:30:35, Kohei KaiGai wrote:
 2013/10/10 Ronan Dunklau rdunk...@gmail.com:

 Sorry, I'm uncertain the point above.
 Are you saying FDW driver may be able to handle well a case when
 a remote tuple to be updated is different from a remote tuple being
 fetched on the first stage? Or, am I misunderstanding?

I was not clear in the point above: what I meant was that, from my 
understanding, the FDW driver has no obligation to use the system-attribute 
tupleid to identify the remote tuple. IIRC, one of the early API proposals 
had the tupleid passed as an argument to the ExecForeign* hooks. 
Now, these hooks receive the whole planslot instead. This slot can store 
additional resjunks attributes (thanks to the AddForeignUpdateTargets)  
which shouldn't be altered during the execution and are carried on until 
modify stage. So, these additional attributes can be used as identifiers 
instead of the tupleid.

In fact, this is the approach I implemented for the multicorn fdw, and I 
believe it to be used in other FDWs as well (HadoopFDW does that, if I 
understand it correctly).

So, it doesn't make sense to me to assume that the tupleid will be filled as 
valid remote identifier in the context of triggers.

 
 From another standpoint, I'd like to cancel the above my suggestion,
 because after-row update or delete trigger tries to fetch an older image
 of tuple next to the actual update / delete jobs.
 Even if FDW is responsible to identify a remote tuple using tupleid,
 the identified tuple we can fetch is the newer image because FDW
 driver already issues a remote query to update or delete the target
 record.
 So, soon or later, FDW shall be responsible to keep a complete tuple
 image when foreign table has row-level triggers, until writer operation
 is finished.

+1

 Does the incomplete tuple mean a tuple image but some of columns
 are replaced with NULL due to optimization, as postgres_fdw is doing,
 doesn't it?
 If so, a solution is to enforce FDW driver to fetch all the columns if its
 managing foreign table has row level triggers for update / delete.

What I meant is that a DELETE statement, for example, does not build a scan-
stage reltargetlist consisting of the whole set of attributes: the only 
attributes that are fetched are the ones built by addForeignUpdateTargets, and 
whatever additional attributes are involved in the DELETE statement (in the 
WHERE clause, for example).

I apologize if this is not clear, since both my technical and English 
vocabulary are maybe not precise enough to get my point across.

  Or, perform a second round-trip to the foreign data store to fetch the
  whole tuple.
 
 It might be a solution for before-row trigger, but not for after-row
 trigger, unfortunately.

+1

 As I noted above, 2nd round trip is not a suitable design to support
 after-row trigger. Likely, GetTupleForTrigger() hook shall perform to
 return a cached older image being fetched on the first round of remote
 scan.
 So, I guess every FDW driver will have similar implementation to handle
 these the situation, thus it makes sense that PostgreSQL core has
 a feature to support this handling; that keeps a tuple being fetched last
 and returns older image for row-level triggers.
 
 How about your opinion?

I like this idea, but I don't really see all the implications of such a 
design.
Where would this tuple be stored ?
From what I understand, after-triggers are queued for later execution, using 
the old/new tupleid for later retrieval (in the AfterTriggerEventData struct). 
This would need a major change to work with foreign tables. Would that involve 
a special path for executing those triggers ASAP ?

 
 And, I also want some comments from committers, not only from mine.

+1

Thank you for taking the time to think this through.


--
Ronan Dunklau




Re: [HACKERS] Triggers on foreign tables

2013-10-10 Thread Ronan Dunklau
On Sunday, 6 October 2013 at 22:33:23, Kohei KaiGai wrote:
 2013/9/10 Ronan Dunklau rdunk...@gmail.com:
  For row-level triggers, it seems more complicated. From what I understand,
  OLD/NEW tuples are fetched from the heap using their ctid (except for
  BEFORE INSERT triggers). How could this be adapted for foreign tables ?
 
 It seems to me the origin of difficulty to support row-level trigger
 is that foreign table
 does not have a back-door to see the older tuple to be updated, unlike
 regular tables.
 The core-PostgreSQL knows on-disk format to store tuples within
 regular tables, thus,
 GetTupleForTrigger() can fetch a tuple being identified by tupleid.
 On the other hand, writable foreign table is also designed to identify
 a remote tuple
 with tupleid, as long as FDW module is responsible.

It is my understanding that the tupleid is one way for an FDW to identify a 
tuple, but the AddForeignUpdateTargets hook allows for arbitrary resjunk 
columns to be added. It is these columns that are then used to identify the 
tuple to update. I don't think the tupleid itself can be used to identify a 
tuple for the trigger. 

 So, a straightforward idea is adding a new FDW interface to get an
 older image of
 the tuple to be updated. It makes possible to implement something similar to
 GetTupleForTrigger() on foreign tables, even though it takes a
 secondary query to
 fetch a record from the remote server. Probably, it is an headache for us

 One thing we pay attention is, the tuple to be updated is already
 fetched from the
 remote server and the tuple being fetched latest is (always?) the
 target of update
 or delete. It is not a heavy job for ForeignScanState node to hold a
 copy of the latest
 tuple. Isn't it an available way to reference relevant
 ForeignScanState to get the older
 image?

It is however possible to have only an incomplete tuple.

The FDW may have only fetched the tupleid-equivalent, and  we would have to 
ensure that the reltargetlist contains every attribute that the trigger could 
need. Or, perform a second round-trip to the foreign data store to fetch the 
whole tuple.

 If my assumption is right, all we need to enhance are (1) add a store on
 ForeignScanState to hold the last tuple on the scan stage, (2) add
 GetForeignTupleForTrigger() to reference the store above on the relevant
 ForeignScanState.

I would add a 3) have a GetTupleForTrigger() hook that would get called with 
the stored tuple.

I personally don't like this approach: there are already (n+1) round trips to 
the foreign store (one during the scan phase, and one per tuple during the 
update/delete phase), we don't need another one per tuple for the triggers.



Re: [HACKERS] Extensions makefiles - coverage

2013-09-24 Thread Ronan Dunklau
On Sunday 22 September 2013 01:34:53 Peter Eisentraut wrote:
 On Thu, 2013-07-25 at 17:07 +0200, Ronan Dunklau wrote:
  I am using approximatively the layout that was proposed here:
  http://www.postgresql.org/message-id/51bb1b6e.2070...@dunslane.net
  It looks like everything is hard-coded to take the source and the
  gcda, gcno files in the base directory, but these files lay in a src
  directory with the proposed layout.
 
 The PostgreSQL build system isn't going to work very well if you build
 files outside of the current directory.  If you want to put your source
 files into a src/ subdirectory, then your top-level makefile should to a
 $(MAKE) -C src, and you need to have a second makefile in the src
 directory.  If you do that, then the existing coverage targets will work
 alright, I think.

The PGXS build system allows for the definition of an OBJS variable, which 
works fine with almost every other make target.

Maybe we need to take a step back, and think about what kind of extension 
layouts we want to support ?

At the time of this writing, the HOW TO on http://manager.pgxn.org/howto 
strongly encourages putting all C files in an src directory.

As a result, many extensions on pgxn use this layout. It would be great not to 
have to change them to measure code coverage.



Re: [HACKERS] Triggers on foreign tables

2013-09-16 Thread Ronan Dunklau
On Thursday 12 September 2013 12:10:01 Peter Eisentraut wrote:
 The documentation build fails:
 
 openjade:trigger.sgml:72:9:E: end tag for ACRONYM omitted, but OMITTAG
 NO was specified
 openjade:trigger.sgml:70:56: start tag was here

Thank you, I took the time to install a working doc-building environment. 
Please find attached a patch that corrects this.

diff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml
index e5ec738..08d133c 100644
--- a/doc/src/sgml/ref/create_trigger.sgml
+++ b/doc/src/sgml/ref/create_trigger.sgml
@@ -93,7 +93,7 @@ CREATE [ CONSTRAINT ] TRIGGER <replaceable class="PARAMETER">name</replaceable>
 
   <para>
    The following table summarizes which types of triggers may be used on
-   tables and views:
+   tables, views and foreign tables:
  </para>
 
   <informaltable id="supported-trigger-types">
@@ -111,7 +111,7 @@ CREATE [ CONSTRAINT ] TRIGGER <replaceable class="PARAMETER">name</replaceable>
   <entry align="center" morerows="1"><literal>BEFORE</></entry>
   <entry align="center"><command>INSERT</>/<command>UPDATE</>/<command>DELETE</></entry>
   <entry align="center">Tables</entry>
-  <entry align="center">Tables and views</entry>
+  <entry align="center">Tables, views and foreign tables</entry>
  </row>
  <row>
   <entry align="center"><command>TRUNCATE</></entry>
@@ -122,7 +122,7 @@ CREATE [ CONSTRAINT ] TRIGGER <replaceable class="PARAMETER">name</replaceable>
   <entry align="center" morerows="1"><literal>AFTER</></entry>
   <entry align="center"><command>INSERT</>/<command>UPDATE</>/<command>DELETE</></entry>
   <entry align="center">Tables</entry>
-  <entry align="center">Tables and views</entry>
+  <entry align="center">Tables, views and foreign tables</entry>
  </row>
  <row>
   <entry align="center"><command>TRUNCATE</></entry>
@@ -244,7 +244,7 @@ UPDATE OF <replaceable>column_name1</replaceable> [, <replaceable>column_name2</
 <term><replaceable class="parameter">table_name</replaceable></term>
 <listitem>
  <para>
-  The name (optionally schema-qualified) of the table or view the trigger
+  The name (optionally schema-qualified) of the table, view or foreign table the trigger
   is for.
  </para>
 </listitem>
diff --git a/doc/src/sgml/trigger.sgml b/doc/src/sgml/trigger.sgml
index f579340..37c0a35 100644
--- a/doc/src/sgml/trigger.sgml
+++ b/doc/src/sgml/trigger.sgml
@@ -33,7 +33,7 @@
 <para>
  A trigger is a specification that the database should automatically
  execute a particular function whenever a certain type of operation is
-performed.  Triggers can be attached to both tables and views.
+performed.  Triggers can be attached to tables, views and foreign tables.
   </para>
 
   <para>
@@ -64,6 +64,15 @@
 </para>
 
 <para>
+On foreign tables, trigger can be defined to execute either before or after any
+<command>INSERT</command>,
+<command>UPDATE</command> or
+<command>DELETE</command> operation, only once per <acronym>SQL</acronym> statement.
+This means that row-level triggers are not supported for foreign tables.
+   </para>
+
+
+   <para>
 The trigger function must be defined before the trigger itself can be
 created.  The trigger function must be declared as a
 function taking no arguments and returning type <literal>trigger</>.
@@ -96,6 +105,7 @@
 after may only be defined at statement level, while triggers that fire
 instead of an <command>INSERT</command>, <command>UPDATE</command>,
 or <command>DELETE</command> may only be defined at row level.
+On foreign tables, triggers can not be defined at row level.
   </para>
 
   <para>
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index adc74dd..68bfa9c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3141,13 +3141,16 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
 			pass = AT_PASS_MISC;
 			break;
 		case AT_EnableTrig:		/* ENABLE TRIGGER variants */
-		case AT_EnableAlwaysTrig:
-		case AT_EnableReplicaTrig:
 		case AT_EnableTrigAll:
 		case AT_EnableTrigUser:
 		case AT_DisableTrig:	/* DISABLE TRIGGER variants */
 		case AT_DisableTrigAll:
 		case AT_DisableTrigUser:
+ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE);
+pass = AT_PASS_MISC;
+break;
+		case AT_EnableReplicaTrig:
+		case AT_EnableAlwaysTrig:
 		case AT_EnableRule:		/* ENABLE/DISABLE RULE variants */
 		case AT_EnableAlwaysRule:
 		case AT_EnableReplicaRule:
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index d86e9ad..d84b61b 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -184,12 +184,23 @@ CreateTrigger(CreateTrigStmt *stmt, const char *queryString,
 			RelationGetRelationName(rel)),
 	 errdetail("Views cannot have TRUNCATE triggers.")));
 	}
-	else
+	else if (rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+	{
+		if(stmt->row){
+			ereport(ERROR,
+	(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+	 errmsg("\"%s\" is a foreign table",
+			

Re: [HACKERS] Triggers on foreign tables

2013-09-11 Thread Ronan Dunklau
On Wednesday 11 September 2013 06:27:24 Michael Paquier wrote:
 As your patch is targeting the implementation of a new feature, please
 consider adding this patch to the next commit fest that is going to
 begin in a couple of days:
 https://commitfest.postgresql.org/action/commitfest_view?id=19

I intended to do that in the next couple of days if I don't get any feedback. 
Thank you for the reminder, it's done.

signature.asc
Description: This is a digitally signed message part.


[HACKERS] Triggers on foreign tables

2013-09-10 Thread Ronan Dunklau
Hello.

I wanted to know what it would take to implement triggers on foreign tables. 
It seems that statement-level triggers can work provided they are allowed in 
the code.

Please find attached a simple POC patch that implement just that.

For row-level triggers, it seems more complicated. From what I understand, 
OLD/NEW tuples are fetched from the heap using their ctid (except for BEFORE 
INSERT triggers). How could this be adapted for foreign tables ?
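
For illustration, here is roughly what the patch is meant to allow (the
table and function names below are made up for the example):

CREATE FUNCTION log_remote_insert() RETURNS trigger AS $$
BEGIN
    -- For a statement-level trigger the return value is ignored.
    RAISE NOTICE 'statement-level trigger fired on remote_events';
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

-- Assuming remote_events is a foreign table on some FDW server, this
-- statement-level trigger would be accepted with the patch applied,
-- while a FOR EACH ROW trigger would still be rejected.
CREATE TRIGGER remote_events_stmt_trig
    AFTER INSERT ON remote_events
    FOR EACH STATEMENT
    EXECUTE PROCEDURE log_remote_insert();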


--
Ronan Dunklau
diff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml
index e5ec738..08d133c 100644
--- a/doc/src/sgml/ref/create_trigger.sgml
+++ b/doc/src/sgml/ref/create_trigger.sgml
@@ -93,7 +93,7 @@ CREATE [ CONSTRAINT ] TRIGGER replaceable class=PARAMETERname/replaceable
 
   para
The following table summarizes which types of triggers may be used on
-   tables and views:
+   tables, views and foreign tables:
   /para
 
   informaltable id=supported-trigger-types
@@ -111,7 +111,7 @@ CREATE [ CONSTRAINT ] TRIGGER replaceable class=PARAMETERname/replaceable
   entry align=center morerows=1literalBEFORE//entry
   entry align=centercommandINSERT//commandUPDATE//commandDELETE//entry
   entry align=centerTables/entry
-  entry align=centerTables and views/entry
+  entry align=centerTables, views and foreign tables/entry
  /row
  row
   entry align=centercommandTRUNCATE//entry
@@ -122,7 +122,7 @@ CREATE [ CONSTRAINT ] TRIGGER replaceable class=PARAMETERname/replaceable
   entry align=center morerows=1literalAFTER//entry
   entry align=centercommandINSERT//commandUPDATE//commandDELETE//entry
   entry align=centerTables/entry
-  entry align=centerTables and views/entry
+  entry align=centerTables, views and foreign tables/entry
  /row
  row
   entry align=centercommandTRUNCATE//entry
@@ -244,7 +244,7 @@ UPDATE OF replaceablecolumn_name1/replaceable [, replaceablecolumn_name2/
 termreplaceable class=parametertable_name/replaceable/term
 listitem
  para
-  The name (optionally schema-qualified) of the table or view the trigger
+  The name (optionally schema-qualified) of the table, view or foreign table the trigger
   is for.
  /para
 /listitem
diff --git a/doc/src/sgml/trigger.sgml b/doc/src/sgml/trigger.sgml
index f579340..37c0a35 100644
--- a/doc/src/sgml/trigger.sgml
+++ b/doc/src/sgml/trigger.sgml
@@ -33,7 +33,7 @@
para
 A trigger is a specification that the database should automatically
 execute a particular function whenever a certain type of operation is
-performed.  Triggers can be attached to both tables and views.
+performed.  Triggers can be attached to tables, views and foreign tables.
   /para
 
   para
@@ -64,6 +64,15 @@
/para
 
para
+On foreign tables, triggers can be defined to execute either before or after any
+commandINSERT/command,
+commandUPDATE/command or
+commandDELETE/command operation, only once per acronymSQL/acronym statement.
+This means that row-level triggers are not supported for foreign tables.
+   /para
+
+
+   para
 The trigger function must be defined before the trigger itself can be
 created.  The trigger function must be declared as a
 function taking no arguments and returning type literaltrigger/.
@@ -96,6 +105,7 @@
 after may only be defined at statement level, while triggers that fire
 instead of an commandINSERT/command, commandUPDATE/command,
 or commandDELETE/command may only be defined at row level.
+On foreign tables, triggers can not be defined at row level.
/para
 
para
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index adc74dd..68bfa9c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3141,13 +3141,16 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
 			pass = AT_PASS_MISC;
 			break;
 		case AT_EnableTrig:		/* ENABLE TRIGGER variants */
-		case AT_EnableAlwaysTrig:
-		case AT_EnableReplicaTrig:
 		case AT_EnableTrigAll:
 		case AT_EnableTrigUser:
 		case AT_DisableTrig:	/* DISABLE TRIGGER variants */
 		case AT_DisableTrigAll:
 		case AT_DisableTrigUser:
+ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE);
+pass = AT_PASS_MISC;
+break;
+		case AT_EnableReplicaTrig:
+		case AT_EnableAlwaysTrig:
 		case AT_EnableRule:		/* ENABLE/DISABLE RULE variants */
 		case AT_EnableAlwaysRule:
 		case AT_EnableReplicaRule:
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index d86e9ad..d84b61b 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -184,12 +184,23 @@ CreateTrigger(CreateTrigStmt *stmt, const char *queryString,
 			RelationGetRelationName(rel)),
 	 errdetail("Views cannot have TRUNCATE triggers.")));
 	}
-	else
+	else if (rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+	{
+		if(stmt->row){
+			ereport(ERROR,
+	

Re: [HACKERS] Extensions makefiles - coverage

2013-07-26 Thread Ronan Dunklau
Thank you for the tip, it's done.

2013/7/26 Robert Haas robertmh...@gmail.com:
 On Thu, Jul 25, 2013 at 11:07 AM, Ronan Dunklau rdunk...@gmail.com wrote:
 Hello.

 I was having trouble figuring how to use the coverage targets when
 using an extension.
 I am using approximatively the layout that was proposed here:
 http://www.postgresql.org/message-id/51bb1b6e.2070...@dunslane.net
 It looks like everything is hard-coded to take the source and the
 gcda, gcno files in the base directory, but these files lay in a src
 directory with the proposed layout.

 It may be better to base the .gcda file discovery on the OBJS
 variables when using PGXS.

 Please find attached a small patch that implements this. There is
 probably a better way than the redundant rm $(gcda_files) / rm *.gcda
 to cleanup the generated files.

 With the attached patch, the following targets seem to have the same
 behaviour as on the current HEAD, both on the whole tree and on
 individual contrib modules:

 - coverage-html
 - clean
 - coverage-clean
 - clean-coverage

 I noticed that make clean leaves gcda and gcov files on the current
 HEAD, and this is no different with the given patch.

 I also tested it against several pgxn extensions, and it seems to work fine.

 You can ensure that your patch doesn't get forgotten about by adding it here:

 https://commitfest.postgresql.org/action/commitfest_view/open

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Extensions makefiles - coverage

2013-07-25 Thread Ronan Dunklau
Hello.

I was having trouble figuring out how to use the coverage targets when
using an extension.
I am using approximately the layout that was proposed here:
http://www.postgresql.org/message-id/51bb1b6e.2070...@dunslane.net
It looks like everything is hard-coded to take the source and the
gcda, gcno files in the base directory, but these files live in a src
directory with the proposed layout.

It may be better to base the .gcda file discovery on the OBJS
variables when using PGXS.

Please find attached a small patch that implements this. There is
probably a better way than the redundant rm $(gcda_files) / rm *.gcda
to clean up the generated files.

With the attached patch, the following targets seem to have the same
behaviour as on the current HEAD, both on the whole tree and on
individual contrib modules:

- coverage-html
- clean
- coverage-clean
- clean-coverage

I noticed that make clean leaves gcda and gcov files on the current
HEAD, and this is no different with the given patch.

I also tested it against several pgxn extensions, and it seems to work fine.


coverage_pgxs.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions makefiles - coverage

2013-07-25 Thread Ronan Dunklau
Please ignore this comment:

 I noticed that make clean leaves gcda and gcov files on the current
 HEAD, and this is no different with the given patch.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix conversion for Decimal arguments in plpython functions

2013-06-26 Thread Ronan Dunklau
The v2 patch does not work for me: regression tests for plpython fail on
the plpython_types test; every numeric is converted to None.

It seems the global decimal ctor is not initialized.

Please find two patches, to be applied on top of the v2 patch: one
initializes the decimal ctor, the other uses cdecimal when possible.

Using the performance test from Steve, on my machine:

- with cdecimal installed: ~84ms
- without cdecimal installed (standard decimal module): ~511ms
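
For reference, a minimal check of the conversion could look like this
(assuming the v2 patch plus the fixes above are applied; the function
name is arbitrary):

CREATE FUNCTION numeric_type(n numeric) RETURNS text AS $$
    # With the patch, a numeric argument arrives as a Python Decimal
    # (decimal.Decimal, or cdecimal.Decimal when cdecimal is available).
    return "%s: %s" % (type(n).__name__, n)
$$ LANGUAGE plpythonu;

SELECT numeric_type(3.14159::numeric);
-- expected with the patch applied: Decimal: 3.14159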


2013/6/26 Szymon Guz mabew...@gmail.com

 On 26 June 2013 01:40, Steve Singer st...@ssinger.info wrote:

 On 06/25/2013 06:42 AM, Szymon Guz wrote:



 Hi,

 I've attached a new patch. I've fixed all the problems you've found,
 except for the efficiency problem, which has been described in previous
 email.

 thanks,
 Szymon


 This version of the patch addresses the issues I mentioned.  Thanks for
 looking into seeing if the performance issue is with our conversions to
 strings or inherit with the python decimal type.  I guess we (Postgresql)
 can't do much about it.   A runtime switch to use cdecimal if it is
 available is a good idea, but I agree with you that could be a different
 patch.

 One minor thing I noticed in this round,

 PLy_elog(ERROR, "could not import module 'decimal'");

 I think should have decimal in double-quotes.

 I think this patch is ready for a committer to look at it.

 Steve




 Hi Steve,
 thanks for the review.

 I was thinking about speeding up the Decimal conversion using the module
 you wrote about. What about trying to import it, if it fails, than trying
 to load decimal.Decimal? There will be no warning in logs, just additional
 information in documentation that it uses this module if it is available?

 thanks,
 Szymon



cnumeric.patch
Description: Binary data


null_ctor.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix conversion for Decimal arguments in plpython functions

2013-06-26 Thread Ronan Dunklau
It seems like you confused me with Steve :)

The patch applies cleanly, and the regression tests pass on python2 when
cdecimal is not installed. When it is, the type info returned for the
converted numeric value is cdecimal.Decimal instead of decimal.Decimal.

The regression tests' expected output has not been modified for python3,
and as such they fail on the type conversions.

I am a bit confused with the use of PyModule_GetDict: shouldn't
PyObject_GetAttrString be used directly instead? Moreover, the reference
count in the current implementation might be off: the reference count for
the decimal module is never decreased, while the reference count to the
module dict is, even though the docs say it returns a borrowed reference.

Please find a patch that fixes both issues.




2013/6/26 Szymon Guz mabew...@gmail.com

 Thanks Steve, that's exactly what I wanted to send when you sent your
 patches :)

 I need to figure out why that patch v2 worked for me, I think I made mess
 somewhere in my git repo and didn't create the patch properly. Sorry for
 that.

 Patch is attached, I've also added information about cdecimal to plpython
 documentation.

 I'm just wondering how to make integration tests to check when cdecimal is
 installed and when it is not.

 thanks,
 Szymon


 On 26 June 2013 10:12, Ronan Dunklau rdunk...@gmail.com wrote:

 The v2 patch does not work for me: regression tests for plpython fail on
 the plpython_types test: every numeric is converted to None.

 It seems the global decimal ctor is not initialized.

 Please find two patches, to be applied on top of the v2 patch: one
 initializes the decimal ctor, the other uses cdecimal when possible.

 Using the performance test from steve, on my machine:

 - with cdecimal installed: ~84ms
 - without cdecimal installed (standard decimal module): ~511ms


 2013/6/26 Szymon Guz mabew...@gmail.com

 On 26 June 2013 01:40, Steve Singer st...@ssinger.info wrote:

 On 06/25/2013 06:42 AM, Szymon Guz wrote:



 Hi,

 I've attached a new patch. I've fixed all the problems you've found,
 except for the efficiency problem, which has been described in previous
 email.

 thanks,
 Szymon


 This version of the patch addresses the issues I mentioned.  Thanks for
 looking into seeing if the performance issue is with our conversions to
 strings or inherit with the python decimal type.  I guess we (Postgresql)
 can't do much about it.   A runtime switch to use cdecimal if it is
 available is a good idea, but I agree with you that could be a different
 patch.

 One minor thing I noticed in this round,

  PLy_elog(ERROR, "could not import module 'decimal'");

 I think should have decimal in double-quotes.

 I think this patch is ready for a committer to look at it.

 Steve




 Hi Steve,
 thanks for the review.

 I was thinking about speeding up the Decimal conversion using the module
 you wrote about. What about trying to import it, if it fails, than trying
 to load decimal.Decimal? There will be no warning in logs, just additional
 information in documentation that it uses this module if it is available?

 thanks,
 Szymon






plpython_decimal_fix_regression.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix conversion for Decimal arguments in plpython functions

2013-06-25 Thread Ronan Dunklau
Concerning the efficiency problem, it should be noted that the latest 3.3
release of CPython introduces an accelerator for decimal data types, as a
C-module.  This module was previously available from the Python package
index at: https://pypi.python.org/pypi/cdecimal/2.2

It may be overkill to try to include such a dependency, but the performance
overhead from using decimal is really mitigated by this implementation.


2013/6/25 Szymon Guz mabew...@gmail.com

 On 25 June 2013 05:16, Steve Singer st...@ssinger.info wrote:

 On 05/28/2013 04:41 PM, Szymon Guz wrote:

 Hi,
 I've got a patch.

 This is for a plpython enhancement.

 There is an item at the TODO list http://wiki.postgresql.org/**
 wiki/Todo#Server-Side_**Languageshttp://wiki.postgresql.org/wiki/Todo#Server-Side_Languages
 Fix loss of information during conversion of numeric type to Python
 float

 This patch uses a decimal.Decimal type from Python standard library for
 the plpthon function numeric argument instead of float.

 Patch contains changes in code, documentation and tests.

 Most probably there is something wrong, as this is my first Postgres
 patch :)


 Thanks for contributing.

 This patch applies cleanly against master and compiles with warnings

 plpy_main.c: In function ‘PLy_init_interp’:
 plpy_main.c:157:2: warning: ISO C90 forbids mixed declarations and code
 [-Wdeclaration-after-**statement]
 plpy_main.c:161:2: warning: ISO C90 forbids mixed declarations and code
 [-Wdeclaration-after-**statement]

 You can avoid this by moving the declaration of decimal and decimal_dict
 to be at the top of the function where mainmod is declared.

 Also in this function you've introduced places where it returns with an
 error (the PLy_elog(ERROR...) calls before decrementing the reference to
 mainmod. I think you can decrement the mainmod reference after the call to
 SetItemString  before your changes that import the Decimal module.


 The patch works as expected, I am able to write python functions that
 take numerics as arguments and work with them.  I can adjust the decimal
 context precision inside of  my function.

 One concern I have is that this patch makes pl/python functions involving
 numerics more than 3 times as slow as before.


 create temp table b(a numeric);
 insert into b select generate_series(1,1);

 create or replace function x(a numeric,b numeric) returns numeric as $$
 if a==None:
   return b
 return a+b
 $$ language plpythonu;
 create aggregate sm(basetype=numeric, sfunc=x,stype=numeric);


 test=# select sm(a) from b;
 sm
 --
  50005000
 (1 row)

 Time: 565.650 ms

 versus before the patch this was taking in the range of 80ms.

 Would it be faster to call numeric_send instead of numeric_out and then
 convert the sequence of Int16's to a tuple of digits that can be passed
 into the Decimal constructor? I think this is worth trying and testing,


 Documentation
 =
 Your patched version of the docs say

   PostgreSQL typereal/type, typedouble/type, and
 typenumeric/type are converted to
Python typeDecimal/type. This type is imported
 fromliteraldecimal.Decimal/**literal.


 I don't think this is correct, as far as I can tell your not changing the
 behaviour for postgresql real and double types, they continue to use
 floating point.



 listitem
 para
PostgreSQL typereal/type and typedouble/typeare converted
 to
Python typefloat/type.
 /para
 /listitem

 listitem
 para
PostgreSQL typenumeric/type is converted to
Python typeDecimal/type. This type is imported from
 literaldecimal.Decimal/**literal.
 /para
 /listitem


 Hi,
 I've attached a new patch. I've fixed all the problems you've found,
 except for the efficiency problem, which has been described in previous
 email.

 thanks,
 Szymon


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




[HACKERS] Writable FDW: returning clauses.

2013-03-19 Thread Ronan Dunklau
Hello.

While implementing the new writable API for FDWs, I wondered whether there
are any obvious problems with the following behavior for handling RETURNING
clauses (for the DELETE case).

The goal is to fetch all required attributes during the initial scan, and
avoid fetching data on the delete operation itself (a minimal example query
is sketched after the list below).

- in the AddForeignUpdateTargets hook, add resjunk entries for the columns in 
the returning clause
- in the ExecForeignDelete hook, fill the returned slot with values taken from 
the planSlot.
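
For example, with a hypothetical foreign table remote_events, a query like:

DELETE FROM remote_events
 WHERE id < 100
 RETURNING id, payload;

would, with this approach, have id and payload fetched as resjunk columns
during the initial foreign scan, so that ExecForeignDelete can fill the
returned slot from the planSlot without a second round-trip to the remote
side.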


--
Ronan Dunklau

signature.asc
Description: This is a digitally signed message part.


[HACKERS] FDW: ForeignPlan and parameterized paths

2012-12-19 Thread Ronan Dunklau
Hello.

I've noticed that, when implementing a FDW, it is difficult to use a plan
whose best path is a parameterized path. This comes from the fact that the
parameterized clause is not easily available at plan time.

This is what I understood from how it works:

- The clauses coming from the best path restrictinfo are not available in the 
scan_clauses argument to the GetForeignPlan function.

- They are, however, directly available on the path, but at this point the 
clauses are of the form InnerVar OPERATOR OuterVar. The outer Var node is then 
replaced by a Param node, using the replace_nestloop_params function.

It could be useful to make the parameterized version of the clause (in the 
form InnerVar OPERATOR Param) available to the fdw at plan time.

Could this be possible ?
Maybe by replacing the clauses on the restrictinfo nodes from the path param 
info by the parameterized clauses, and then adding these to the scan clauses 
passed to GetForeignPlan ?

Regards,

--
Ronan Dunklau


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW: ForeignPlan and parameterized paths

2012-12-19 Thread Ronan Dunklau
 I intentionally did the nestloop_params substitution after calling
 GetForeignPlan not before.  It's not apparent to me why it would be
 useful to do it before, because the FDW is going to have no idea what
 those params represent.  (Note that they represent values coming from
 some other, probably local, relation; not from the foreign table.)

Even if the FDW has no idea what they represent, it can identify a
clause of the form Var Operator Param, which allows it to store the param
reference (paramid) for retrieving the param value at execution time.
If the chosen best path is a parameterized path that has been built by
the FDW, this allows it to push down the restriction.

If this isn't possible, the only way I found to use those clauses
would be at scan time.

Let's assume atable is a local relation and aftable is a foreign
table, and that the query looks like this:

select * from atable t1 inner join aftable t2 on t1.c1 = t2.c1


The FDW identifies the join clause on its column c1, and builds a
parameterized path on this column (maybe because this column is unique
and indexed on the remote side).

The planner chooses this path, building a nested loop that rescans the
foreign table with a parameter value reflecting the outer relation's
value (maybe because the local relation is much smaller than
the remote relation).

In that case, it seems to be of particular importance to have access
to the clause, so that the nested loop can work as intended: avoiding
a full seqscan on the remote side.

Or is there another way to achieve the same goal ?

Regards,

--
Ronan Dunklau


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-12-18 Thread Ronan Dunklau
Hello.

I've tried to implement this API for our Multicorn FDW, and I have a few
questions about the API.

First, I don't understand what the requirements are on the rowid
pseudo-column values.

In particular, this sentence from a previous mail makes it ambiguous to me:

 At the beginning, I constructed the rowid pseudo-column using
 INTERNALOID, but it made troubles when fetched tuples are
 materialized prior to table modification.
 So, the rowid argument of them are re-defined as const char *.

Does that mean that one should only store a cstring in the rowid
pseudo-column ? Or can one store an arbitrary pointer ? Currently, I'm
building a text value using cstring_to_text_with_len. Could there be a
problem with that ?

Secondly, how does one use a returning clause ?
I've tried to look at the postgres_fdw code, but it does not seem to handle
that.

For what it's worth, knowing that postgres_fdw is still in WIP status,
here are a couple of results from my experimentation with it (a rough
reproduction of the first two points is sketched after the list below):

- Insert into a foreign table mapped to a table with a before trigger,
using a returning clause, will return the values as they were before being
modified by the trigger.
- Same thing,  but if the trigger prevents insertion (returning NULL), the
would-have-been inserted row is still returned, although the insert
statement reports zero rows.
- Inserting into a table with default values does not work as intended,
since missing values are replaced by a null value in the remote statement.
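
For reference, the first two points can be reproduced with something like
the following sketch, assuming a loopback postgres_fdw server named
loopback (with a user mapping) is already set up; all table and function
names are made up:

CREATE TABLE remote_t (id int, val text DEFAULT 'filled by default');

CREATE FUNCTION lowercase_val() RETURNS trigger AS $$
BEGIN
    NEW.val := lower(NEW.val);
    RETURN NEW;   -- returning NULL here instead suppresses the insert
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER remote_t_before BEFORE INSERT ON remote_t
    FOR EACH ROW EXECUTE PROCEDURE lowercase_val();

CREATE FOREIGN TABLE local_t (id int, val text)
    SERVER loopback OPTIONS (table_name 'remote_t');

-- The RETURNING clause reports 'ABC', i.e. the value as it was before the
-- remote trigger rewrote it to 'abc'.
INSERT INTO local_t (id, val) VALUES (1, 'ABC') RETURNING id, val;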

What can be done to make the behaviour more consistent ?

I'm very excited about this feature, thank you for making this possible.

Regards,
--
Ronan Dunklau

2012/12/14 Albe Laurenz laurenz.a...@wien.gv.at

 Kohei KaiGai wrote:
  I came up with one more query that causes a problem:
 [...]
  This causes a deadlock, but one that is not detected;
  the query just keeps hanging.
 
  The UPDATE in the CTE has the rows locked, so the
  SELECT ... FOR UPDATE issued via the FDW connection will hang
  indefinitely.
 
  I wonder if that's just a pathological corner case that we shouldn't
  worry about.  Loopback connections for FDWs themselves might not
  be so rare, for example as a substitute for autonomous subtransactions.
 
  I guess it is not easily possible to detect such a situation or
  to do something reasonable about it.
 
  It is not avoidable problem due to the nature of distributed database
 system,
  not only loopback scenario.
 
  In my personal opinion, I'd like to wait for someone implements
 distributed
  lock/transaction manager on top of the background worker framework being
  recently committed, to intermediate lock request.
  However, it will take massive amount of efforts to existing
 lock/transaction
  management layer, not only enhancement of FDW APIs. It is obviously out
  of scope in this patch.
 
  So, I'd like to suggest authors of FDW that support writable features to
 put
  mention about possible deadlock scenario in their documentation.
  At least, above writable CTE example is a situation that two different
 sessions
  concurrently update the test relation, thus, one shall be blocked.

 Fair enough.

  I tried to overhaul the documentation, see the attached patch.
 
  There was one thing that I was not certain of:
  You say that for writable foreign tables, BeginForeignModify
  and EndForeignModify *must* be implemented.
  I thought that these were optional, and if you can do your work
  with just, say, ExecForeignDelete, you could do that.
 
  Yes, that's right. What I wrote was incorrect.
  If FDW driver does not require any state during modification of
  foreign tables, indeed, these are not mandatory handler.

 I have updated the documentation, that was all I changed in the
 attached patches.

  OK. I split the patch into two portion, part-1 is the APIs relevant
  patch, part-2 is relevant to postgres_fdw patch.

 Great.

 I'll mark the patch as ready for committer.

 Yours,
 Laurenz Albe


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




Re: [HACKERS] Arguments to foreign tables?

2012-11-06 Thread Ronan Dunklau
Maybe you could set some options on the foreign table before selecting from
it?
Another way you could achieve the same result would be to give some column
a special meaning (as is done in the twitter_fdw, for example).
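
For the first suggestion, the idea would be something like this, assuming
the FDW accepts a per-table option named query (the table and option names
are hypothetical):

-- Use ADD the first time the option is set, SET afterwards.
ALTER FOREIGN TABLE remote_search OPTIONS (ADD query 'postgres fdw');
SELECT * FROM remote_search;

ALTER FOREIGN TABLE remote_search OPTIONS (SET query 'another search');
SELECT * FROM remote_search;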

If you don't mind, do you have a specific use-case for this ?

--
Ronan Dunklau

2012/11/6 Jeff Davis pg...@j-davis.com

 On Sun, 2012-11-04 at 15:13 -0500, Tom Lane wrote:
  Jeff Davis pg...@j-davis.com writes:
   Is there any fundamental or philosophical reason why a foreign table
   can't accept arguments?
 
  That isn't a table; it's some sort of function.  Now that we have
  LATERAL, there is no good reason to contort SQL's syntax and semantics
  in the direction you suggest.

 Maybe I should rephrase this as a problem with SRFs: you don't get to
 define the init/exec/end executor functions, and you don't get access to
 the optimizer information.

 It seems like foreign tables are a better mechanism (except for the
 simple cases where you don't care about the details), and the only thing
 an SRF can do that a foreign table can't is accept arguments. So, I
 thought maybe it would make more sense to combine the mechanisms
 somehow.

 Take something as simple as generate_series: right now, it materializes
 the entire thing if it's in the FROM clause, but it wouldn't need to if
 it could use the foreign table mechanism.

 Regards,
 Jeff Davis



 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



[HACKERS] Query planning, nested loops and FDW.

2012-07-13 Thread Ronan Dunklau

Hello.

Thanks to the indications given by Tom Lane, I was able to make use of
parameterized paths in our multicorn extension.

For example, given a (local) table real_table and another (foreign)
table foreign_table with the same set of columns, if the foreign
table declares that a filter on col1 and col2 will, on average, return
100 rows (vs 1 million for an unparameterized path), PostgreSQL will
produce the following plan:

explain select real_table.*, foreign_table.*
from real_table inner join foreign_table
on real_table.col1 = foreign_table.col1 and
   real_table.col2 = foreign_table.col2;

   QUERY PLAN
-
-
 Nested Loop  (cost=10.00..19145165.27 rows=50558 width=64)
   -  Seq Scan on real_table  (cost=0.00..38.27 rows=2127 width=32)
   -  Foreign Scan on foreign_table  (cost=10.00..9000.00 rows=100
width=90)
 Filter: ((real_table.col1 = (col1)::text) AND
(real_table.col2 = (col2)::text))

This is exactly what I wanted to achieve.

But there is still room for improvement here, because the inner
foreign scan will be executed for each row, and not for each distinct
pair of (col1, col2).

Does this happen because:
 - there is no other way to proceed, or
 - the planner thinks it is less costly to execute the inner scan
multiple times than to do whatever it takes to execute it only once
for each distinct pair?

Best regards,

-- 
Ronan Dunklau


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG9.2 and FDW query planning.

2012-07-12 Thread Ronan Dunklau

On 05/07/2012 18:30, Tom Lane wrote:
 Ronan Dunklau rdunk...@gmail.com writes:
 Let's say I have an IMAP foreign data wrapper, and I write a
 query joining the table on itself using the In-Reply-To and
 Message-ID headers, is there anything I can do to avoid fetching
 all the mails from the remote server ?
 
 If I could somehow inform the planner that it can look up rows
 by message-id, thus avoiding the need to fetch everything from
 the remote server. Perhaps persuading the planner to use a
 nested-loop ?
 
 OK, so what you're saying is that the imap server can effectively 
 provide an index on message_id.  What you'd do is create a
 parameterized path that uses the tbl.message_id =
 other_tbl.in_reply_to join clause. If that's enough cheaper than a
 full scan, the planner would select it.

Thank you, I was able to build such paths from your indication.

The python FDW implementor can optionally give a list of tuples
consisting of (path key, expected_number_of_row). In the IMAP
example, that could be [('Message-ID', 1), ('From', 1000)].

From this information, if there is an equivalence class whose
restrictinfo uses one of those keys, we build a parameterized path,
with an associated cost of base_width * expected_number_of_row, in
addition to the generic, unparameterized path.

The planner can then select this path, and build plans looking like this:

postgres=# explain select m1.From,
   m1.To,
   m2.From,
   m2.To
from mails m1 inner join mails m2 on m2.Message-ID = m1.In-Reply-To
where m1.From = '%t...@example.com%';

QUERY PLAN
- 
 Nested Loop  (cost=10.00..60001000.00 rows=5 width=128)
   -  Foreign Scan on mails m1  (cost=0.00..3000.00 rows=10
width=300)
 Filter: ((From)::text = '%t...@example.com%'::text)
   -  Foreign Scan on mails m2  (cost=10.00..300.00 rows=1 width=300)
 Filter: ((Message-ID)::text = (m1.In-Reply-To)::text)


If I understand it correctly, after returning a ForeignScan (from
GetForeignPlan), the planner decides to use a nestloop, and in the
process of creating the nestloop plan, replaces Var nodes coming from
the outerel (here, m1.In-Reply-To) by Param nodes.

My current implementation already looks for (var = param) expressions
that it may handle during the plan phase and stores the association
between the var and the param_id.
At execution time, the needed parameter values are fetched (from the
ParamExecData array found in es_param_exec_vals) and passed to the
python foreign data wrapper.

The problem I have: how can I retrieve the generated params and keep
the association between the var and the param ?

Should I replace the (var = outervar) clauses by (var = param) myself
and store them in the fdw_exprs field of the foreign scan ?

 FWIW, I'm not sure that it's sane to try to expose this stuff to
 python yet.  It's complicated and still something of a moving
 target.  Once we've got a few more C-coded FDWs that can do this
 type of optimization, things might be stable enough that it'd be
 useful to try to provide a high-level API.

The current API (as mentioned above) would be more declarative than
anything, only offering a way to (maybe) build parameterized paths
without guaranteeing anything. Even if the internals change, I fail to
see how it can hurt to offer such a feature.


Regards,

-- 
Ronan Dunklau



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PG9.2 and FDW query planning.

2012-07-05 Thread Ronan Dunklau

Hello.

I'm in the process of porting our multicorn extension to pg9.2, and
I'd like to take advantage of the GetForeignPaths hook.

The multicorn extension allows the creation of a FDW in python using a
simple API, and I'd like to be able to provide FDW implementors a way
to influence the planner.

Let's say I have an IMAP foreign data wrapper, and I write a query
joining the table on itself using the In-Reply-To and Message-ID
headers, is there anything I can do to avoid fetching all the mails
from the remote server ?

If I could somehow inform the planner that it can look up rows by
message-id, that would avoid the need to fetch everything from the remote
server. Perhaps by persuading the planner to use a nested loop?

If I understand correctly, I should build one or more lists of
pathkeys, and add more foreign paths using those pathkeys.

There was a discussion about indexes on foreign tables back in March.
From what I understand of this discussion, someone proposed to
locally store information about indexes on the foreign tables, but I
did not find anything on how to build a path from scratch.

Thank you.

-- 
Ronan Dunklau


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREGIN TABLE LACUNA

2012-03-14 Thread Ronan Dunklau
On 14/03/2012 16:47, David Fetter wrote:
 On Wed, Mar 14, 2012 at 11:29:14AM -0400, Tom Lane wrote:
 David Fetter da...@fetter.org writes:
 On Wed, Mar 14, 2012 at 10:27:28AM -0400, Tom Lane wrote:
 Hm.  That opinion seems to me to connect to the recently-posted
 patch to make contrib/file_fdw enforce NOT NULL constraints.
 Should we instead have the position that constraints declared for
 foreign tables are statements that we can take on faith, and it's
 the user's fault if they are wrong?

 I think that's something FDWs need to be able to communicate to
 PostgreSQL.  For example, something talking to another PostgreSQL
 would (potentially, anyhow) have access to deep knowledge of the
 remote side, but file_fdw would only have best efforts even for
 clever things like statistics.

 I think we're talking at cross-purposes.  What you're saying seems
 to assume that it's the system's responsibility to do something
 about a constraint declared on a foreign table.  What I'm suggesting
 is that maybe it isn't.
 
 Actually, I'm suggesting that this behavior needs to be controlled,
 not system-wide, but per FDW, and eventually per server, table and
 column.

 A constraint, ordinarily, would be enforced against table *updates*,
 and then just assumed valid during reads.  In the case of a foreign
 table, we can't enforce constraints during updates because we don't
 have control of all updates.
 
 I think that the situation will become a bit more nuanced than that.
 A FDW could delegate constraints to the remote side, and in principle,
 the remote side could inform PostgreSQL of all types of changes (DML,
 DCL, DDL).
 
 Should we ignore declared constraints because they're not
 necessarily true?  Should we assume on faith that they're true?
 
 Neither.  We should instead have ways for FDWs to say which
 constraints are local-only, and which presumed correct on the remote
 side.  If they lie when asserting the latter, that's pilot error.
 

I don't understand what value that would bring. Do you propose that, if
a FDW declares a constraint as local-only, the planner should ignore
it, but when it is declared as remote, the planner could use this information?

Let me describe a simple use case we have in one of our web
applications, which would benefit from foreign keys on foreign tables.

The application has users, stored in a users table, who can upload
files. The files are stored on the server's filesystem, using one folder
per user, named after the user_id.

Ex:

/
  1/
myfile.png
  2/
myotherfile.png


This filesystem is accessed using the StructuredFS FDW, which maps a
file system tree to a set of columns corresponding to parts of the file
path: every file whose path matches the pattern results in a row. Using
the aforementioned structure, the foreign table would have a user_id
column and a filename column.

Now, the FDW itself cannot know that the foreign key will be enforced,
but as the application developer, I know that every directory will be
named after a user_id.

Allowing foreign keys on such a foreign table would let us describe
the model more precisely in PostgreSQL, and external tools could use
this knowledge too, even if PostgreSQL completely ignores them,
especially ORMs relying on foreign keys to determine join conditions
between tables.
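
To make it concrete, the declaration we would like to be able to write
looks something like this (the server and its options are hypothetical,
and the REFERENCES clause is exactly what is rejected today; the
constraint would be purely declarative, not enforced):

CREATE TABLE users (user_id int PRIMARY KEY, name text);

CREATE FOREIGN TABLE user_files (
    user_id  int REFERENCES users (user_id),
    filename text
) SERVER structured_fs
  OPTIONS (root_dir '/var/www/uploads', pattern '{user_id}/{filename}');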

On the other hand, should foreign keys referencing a foreign table be
allowed too ? From a foreign table to another, from a local table to a
foreign table ?

Regards,

-- 
Ronan Dunklau

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers