(2014/11/17 17:55), Ashutosh Bapat wrote:
Here are my review comments for patch fdw-inh-3.patch.
Thanks for the review!
Tests
-------
1. It seems like you have copied from testcase inherit.sql to
postgres_fdw testcase. That's a good thing, but it makes the test quite
long. May be we should have two tests in postgres_fdw contrib module,
one for simple cases, and other for inheritance. What do you say?
IMO, the test is not so time-consuming, so it doesn't seem worth
splitting it into two.
Documentation
--------------------
1. The change in ddl.sgml
- We will refer to the child tables as partitions, though they
- are in every way normal <productname>PostgreSQL</> tables.
+ We will refer to the child tables as partitions, though we assume
+ that they are normal <productname>PostgreSQL</> tables.
adds phrase "we assume that", which confuses the intention behind the
sentence. The original text is intended to highlight the equivalence
between "partition" and "normal table", where as the addition esp. the
word "assume" weakens that equivalence. Instead now we have to highlight
the equivalence between "partition" and "normal or foreign table". The
wording similar to "though they are either normal or foreign tables"
should be used there.
You are right, but I feel that there is something out of place in saying
that there (5.10.2. Implementing Partitioning) because the procedure
there has been written based on normal tables only. Put another way, if
we say that, I think we'd need to add more docs, describing the syntax
and/or the corresponding examples for foreign-table cases. But I'd like
to leave that for another patch. So, how about the wording "we assume
*here* that", instead of "we assume that", as I added the following
notice in the previous section (5.10.1. Overview)?
@@ -2650,7 +2669,10 @@ VALUES ('Albany', NULL, NULL, 'NY');
table of a single parent table. The parent table itself is normally
empty; it exists just to represent the entire data set. You should be
familiar with inheritance (see <xref linkend="ddl-inherit">) before
- attempting to set up partitioning.
+ attempting to set up partitioning. (The setup and management of
+ partitioned tables illustrated in this section assume that each
+ partition is a normal table. However, you can do that in a similar way
+ for cases where some or all partitions are foreign tables.)
2. The wording "some kind of optimization" gives vague picture. May be
it can be worded as "Since the constraints are assumed to be true, they
are used in constraint-based query optimization like constraint
exclusion for partitioned tables.".
+ Those constraints are used in some kind of query optimization such
+ as constraint exclusion for partitioned tables (see
+ <xref linkend="ddl-partitioning">).
Will follow your revision.
Code
-------
1. In the following change
+/*
* acquire_inherited_sample_rows -- acquire sample rows from
inheritance tree
*
* This has the same API as acquire_sample_rows, except that rows are
* collected from all inheritance children as well as the specified table.
- * We fail and return zero if there are no inheritance children.
+ * We fail and return zero if there are no inheritance children or
there are
+ * inheritance children that foreign tables.
The addition should be "there are inheritance children that *are all
*foreign tables. Note the addition "are all".
Sorry, I incorrectly wrote the comment. What I tried to write is "We
fail and return zero if there are no inheritance children or if we are
not in VAC_MODE_SINGLE case and inheritance tree contains at least one
foreign table.".
2. The function has_foreign() be better named has_foreign_child()? This
How about "has_foreign_table"?
function loops through all the tableoids passed even after it has found
a foreign table. Why can't we return true immediately after finding the
first foreign table, unless the side effects of heap_open() on all the
table are required. But I don't see that to be the case, since these
tables are locked already through a previous call to heap_open(). In the
Good catch! Will fix.
same function instead of argument name parentrelId may be we should use
name parent_oid, so that we use same notation for parent and child table
OIDs.
Will fix.
3. Regarding enum VacuumMode - it's being used only in case of
acquire_inherited_sample_rows() and that too, to check only a single
value of the three defined there. May be we should just infer that value
inside acquire_inherited_sample_rows() or pass a boolean true or false
from do_analyse_rel() based on the VacuumStmt. I do not see need for a
separate three value enum of which only one value is used and also to
pass it down from vacuum() by changing signatures of the minion functions.
I introduced that for possible future use. See the discussion in [1].
4. In postgresGetForeignPlan(), the code added by this patch is required
to handle the case when the row mark is placed on a parent table and
hence is required to be applied on the child table. We need a comment
explaining this. Otherwise, the three step process to get the row mark
information isn't clear for a reader.
Will add the comment.
5. In expand_inherited_rtentry() why do you need a separate variable
hasForeign? Instead of using that variable, you can actually set/reset
rte->hasForeign since existence of even a single foreign child would
mark that member as true. - After typing this, I got the answer when I
looked at the function code. Every child's RTE is initially a copy of
parent's RTE and hence hasForeign status would be inherited by every
child after the first foreign child. I think, this reasoning should be
added as comment before assignment to rte->hasForeign at the end of the
function.
As you mentioned, I think we could set rte->hasForeign directly during
the scan for the inheritance set, without the separate variable,
hasForeign. But ISTM that it'd improve code readability to set
rte->hasForeign using the separate variable at the end of the function
because rte->hasForeign has its meaning only when rte->inh is true and
because we know whether rte->inh is true, at the end of the function.
6. The tests in foreign_data.sql are pretty extensive. Thanks for that.
I think, we should also check the effect of each of the following
command using \d on appropriate tables.
+CREATE FOREIGN TABLE ft2 () INHERITS (pt1)
+ SERVER s0 OPTIONS (delimiter ',', quote '"', "be quoted" 'value');
+ALTER FOREIGN TABLE ft2 NO INHERIT pt1;
+DROP FOREIGN TABLE ft2;
+CREATE FOREIGN TABLE ft2 (
+ c1 integer NOT NULL,
+ c2 text,
+ c3 date
+) SERVER s0 OPTIONS (delimiter ',', quote '"', "be quoted" 'value');
+ALTER FOREIGN TABLE ft2 INHERIT pt1;
Will fix.
Apart from the above, I noticed that the patch doesn't consider to call
ExplainForeignModify during EXPLAIN for an inherited UPDATE/DELETE, as
shown below (note that there are no UPDATE remote queries displayed):
postgres=# explain verbose update parent set a = a * 2 where a = 5;
QUERY PLAN
-------------------------------------------------------------------------------------
Update on public.parent (cost=0.00..280.77 rows=25 width=10)
-> Seq Scan on public.parent (cost=0.00..0.00 rows=1 width=10)
Output: (parent.a * 2), parent.ctid
Filter: (parent.a = 5)
-> Foreign Scan on public.ft1 (cost=100.00..140.38 rows=12 width=10)
Output: (ft1.a * 2), ft1.ctid
Remote SQL: SELECT a, ctid FROM public.mytable_1 WHERE ((a =
5)) FOR UPDATE
-> Foreign Scan on public.ft2 (cost=100.00..140.38 rows=12 width=10)
Output: (ft2.a * 2), ft2.ctid
Remote SQL: SELECT a, ctid FROM public.mytable_2 WHERE ((a =
5)) FOR UPDATE
(10 rows)
So, I'd like to modify explain.c to show those queries like this:
postgres=# explain verbose update parent set a = a * 2 where a = 5;
QUERY PLAN
-------------------------------------------------------------------------------------
Update on public.parent (cost=0.00..280.77 rows=25 width=10)
-> Seq Scan on public.parent (cost=0.00..0.00 rows=1 width=10)
Output: (parent.a * 2), parent.ctid
Filter: (parent.a = 5)
Remote SQL: UPDATE public.mytable_1 SET a = $2 WHERE ctid = $1
-> Foreign Scan on public.ft1 (cost=100.00..140.38 rows=12 width=10)
Output: (ft1.a * 2), ft1.ctid
Remote SQL: SELECT a, ctid FROM public.mytable_1 WHERE ((a =
5)) FOR UPDATE
Remote SQL: UPDATE public.mytable_2 SET a = $2 WHERE ctid = $1
-> Foreign Scan on public.ft2 (cost=100.00..140.38 rows=12 width=10)
Output: (ft2.a * 2), ft2.ctid
Remote SQL: SELECT a, ctid FROM public.mytable_2 WHERE ((a =
5)) FOR UPDATE
(12 rows)
What do you say?
Sorry for the delay.
[1] http://www.postgresql.org/message-id/1316566782-sup-2...@alvh.no-ip.org
Best regards,
Etsuro Fujita
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers