Re: Refactoring pg_dump's getTables()

2021-10-19 Thread Tom Lane
Daniel Gustafsson  writes:
> On 17 Oct 2021, at 22:05, Alvaro Herrera  wrote:
>> Maybe I would group together the changes that all require the same version
>> test, rather than keeping the output columns in the same order.

> I agree with that, if we're doing all this we might as well go all the way for
> readability.

I had a go at doing that, but soon decided that it wasn't as great an
idea as it first seemed.  There are two problems:

1. It's not clear what to do with fields where there are three or more
variants, such as reloptions and checkoptions.

2. Any time we modify the behavior for a particular field, we'd
have to merge or un-merge it from the stanza for the
previously-most-recently-relevant version.  This seems like it'd
be a maintenance headache and make patch footprints bigger than they
needed to be.

So I ended up not doing very much merging.  I did make an effort
to group the fields in perhaps a slightly more logical order
than before.

regards, tom lane




Re: Refactoring pg_dump's getTables()

2021-10-19 Thread Tom Lane
Daniel Gustafsson  writes:
> On 17 Oct 2021, at 22:05, Alvaro Herrera  wrote:
>> Maybe I would group together the changes that all require the same version
>> test, rather than keeping the output columns in the same order.

> I agree with that, if we're doing all this we might as well go all the way for
> readability.

OK, I'll make it so.

regards, tom lane




Re: Refactoring pg_dump's getTables()

2021-10-19 Thread Daniel Gustafsson
> On 17 Oct 2021, at 22:05, Alvaro Herrera  wrote:
> 
> On 2021-Oct-16, Tom Lane wrote:
> 
>> Attached is a proposed patch that refactors getTables() along the
>> same lines as some previous work (eg 047329624, ed2c7f65b, daa9fe8a5)
>> to avoid having multiple partially-redundant copies of the SQL query.
>> This gets rid of nearly 300 lines of duplicative spaghetti code,
>> creates a uniform style for dealing with cross-version changes
>> (replacing at least three different methods currently being used
>> for that in this same stretch of code), and allows moving some
>> comments to be closer to the code they describe.
> 
> Yeah, this seems a lot better than the original coding.

+1

> Maybe I would group together the changes that all require the same version
> test, rather than keeping the output columns in the same order.


I agree with that, if we're doing all this we might as well go all the way for
readability.

--
Daniel Gustafsson   https://vmware.com/





Re: Refactoring pg_dump's getTables()

2021-10-17 Thread Tom Lane
Alvaro Herrera  writes:
> Yeah, this seems a lot better than the original coding.  Maybe I would
> group together the changes that all require the same version test,
> rather than keeping the output columns in the same order.  This reduces
> the number of branches.  Because the follow-on code uses column names
> rather than numbers, there is no reason to keep related columns
> together.  But it's a clear improvement even without that.

Yeah, I thought about rearranging the code order some more, but
desisted since it'd make the patch footprint a bit bigger (I'd
want to make all the related stanzas list things in a uniform
order).  But maybe we should just do that.

regards, tom lane




Re: Refactoring pg_dump's getTables()

2021-10-17 Thread Alvaro Herrera
On 2021-Oct-16, Tom Lane wrote:

> Attached is a proposed patch that refactors getTables() along the
> same lines as some previous work (eg 047329624, ed2c7f65b, daa9fe8a5)
> to avoid having multiple partially-redundant copies of the SQL query.
> This gets rid of nearly 300 lines of duplicative spaghetti code,
> creates a uniform style for dealing with cross-version changes
> (replacing at least three different methods currently being used
> for that in this same stretch of code), and allows moving some
> comments to be closer to the code they describe.

Yeah, this seems a lot better than the original coding.  Maybe I would
group together the changes that all require the same version test,
rather than keeping the output columns in the same order.  This reduces
the number of branches.  Because the follow-on code uses column names
rather than numbers, there is no reason to keep related columns
together.  But it's a clear improvement even without that.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Puedes vivir sólo una vez, pero si lo haces bien, una vez es suficiente"




Refactoring pg_dump's getTables()

2021-10-16 Thread Tom Lane
Attached is a proposed patch that refactors getTables() along the
same lines as some previous work (eg 047329624, ed2c7f65b, daa9fe8a5)
to avoid having multiple partially-redundant copies of the SQL query.
This gets rid of nearly 300 lines of duplicative spaghetti code,
creates a uniform style for dealing with cross-version changes
(replacing at least three different methods currently being used
for that in this same stretch of code), and allows moving some
comments to be closer to the code they describe.

There's a lot I still want to change here, but this part seems like it
should be fairly uncontroversial.  I've tested it against servers back
to 8.0 (which is what led me to trip over the bug fixed in 40dfac4fc).
Any objections to just pushing it?

regards, tom lane

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 6ec524f8e6..e35ff6e7fb 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -6274,65 +6274,152 @@ getTables(Archive *fout, int *numTables)
 	 * We include system catalogs, so that we can work if a user table is
 	 * defined to inherit from a system catalog (pretty weird, but...)
 	 *
-	 * We ignore relations that are not ordinary tables, sequences, views,
-	 * materialized views, composite types, or foreign tables.
-	 *
-	 * Composite-type table entries won't be dumped as such, but we have to
-	 * make a DumpableObject for them so that we can track dependencies of the
-	 * composite type (pg_depend entries for columns of the composite type
-	 * link to the pg_class entry not the pg_type entry).
-	 *
 	 * Note: in this phase we should collect only a minimal amount of
 	 * information about each table, basically just enough to decide if it is
 	 * interesting. We must fetch all tables in this phase because otherwise
 	 * we cannot correctly identify inherited columns, owned sequences, etc.
-	 *
-	 * We purposefully ignore toast OIDs for partitioned tables; the reason is
-	 * that versions 10 and 11 have them, but 12 does not, so emitting them
-	 * causes the upgrade to fail.
 	 */
 
+	appendPQExpBuffer(query,
+	  "SELECT c.tableoid, c.oid, c.relname, "
+	  "c.relnamespace, c.relkind, "
+	  "(%s c.relowner) AS rolname, "
+	  "c.relchecks, "
+	  "c.relhasindex, c.relhasrules, c.relpages, "
+	  "d.refobjid AS owning_tab, "
+	  "d.refobjsubid AS owning_col, "
+	  "tsp.spcname AS reltablespace, ",
+	  username_subquery);
+
+	if (fout->remoteVersion >= 80400)
+		appendPQExpBufferStr(query,
+			 "c.relhastriggers, ");
+	else
+		appendPQExpBufferStr(query,
+			 "(c.reltriggers <> 0) AS relhastriggers, ");
+
+	if (fout->remoteVersion >= 90500)
+		appendPQExpBufferStr(query,
+			 "c.relrowsecurity, c.relforcerowsecurity, ");
+	else
+		appendPQExpBufferStr(query,
+			 "false AS relrowsecurity, "
+			 "false AS relforcerowsecurity, ");
+
+	if (fout->remoteVersion >= 12)
+		appendPQExpBufferStr(query,
+			 "false AS relhasoids, ");
+	else
+		appendPQExpBufferStr(query,
+			 "c.relhasoids, ");
+
+	if (fout->remoteVersion >= 80200)
+		appendPQExpBufferStr(query,
+			 "c.relfrozenxid, tc.relfrozenxid AS tfrozenxid, ");
+	else
+		appendPQExpBufferStr(query,
+			 "0 AS relfrozenxid, 0 AS tfrozenxid, ");
+
+	if (fout->remoteVersion >= 90300)
+		appendPQExpBufferStr(query,
+			 "c.relminmxid, tc.relminmxid AS tminmxid, ");
+	else
+		appendPQExpBufferStr(query,
+			 "0 AS relminmxid, 0 AS tminmxid, ");
+
+	if (fout->remoteVersion >= 80200)
+		appendPQExpBufferStr(query,
+			 "tc.oid AS toid, ");
+	else
+		appendPQExpBufferStr(query,
+			 "0 AS toid, ");
+
+	if (fout->remoteVersion >= 90100)
+		appendPQExpBufferStr(query,
+			 "c.relpersistence, ");
+	else
+		appendPQExpBufferStr(query,
+			 "'p' AS relpersistence, ");
+
+	if (fout->remoteVersion >= 90300)
+		appendPQExpBufferStr(query,
+			 "c.relispopulated, ");
+	else
+		appendPQExpBufferStr(query,
+			 "'t' as relispopulated, ");
+
+	if (fout->remoteVersion >= 90400)
+		appendPQExpBufferStr(query,
+			 "c.relreplident, ");
+	else
+		appendPQExpBufferStr(query,
+			 "'d' AS relreplident, ");
+
+	if (fout->remoteVersion >= 90100)
+		appendPQExpBufferStr(query,
+			 "CASE WHEN c.relkind = " CppAsString2(RELKIND_FOREIGN_TABLE) " THEN "
+			 "(SELECT ftserver FROM pg_catalog.pg_foreign_table WHERE ftrelid = c.oid) "
+			 "ELSE 0 END AS foreignserver, ");
+	else
+		appendPQExpBufferStr(query,
+			 "0 AS foreignserver, ");
+
+	if (fout->remoteVersion >= 90300)
+		appendPQExpBufferStr(query,
+			 "array_remove(array_remove(c.reloptions,'check_option=local'),'check_option=cascaded') AS reloptions, "
+			 "CASE WHEN 'check_option=local' = ANY (c.reloptions) THEN 'LOCAL'::text "
+			 "WHEN 'check_option=cascaded' = ANY (c.reloptions) THEN 'CASCADED'::text ELSE NULL END AS checkoption, ");
+	else if (fout->remoteVersion >= 80200)
+		appendPQExpBufferStr(query,
+